Commit 2bcb14ff authored by Robert Bradshaw's avatar Robert Bradshaw

Merge docs repo

parents 8128cb65 19ca6b53
syntax: glob
*.pyc
*~
.*.swp
syntax: regexp
^build/
^_build/
# Makefile for Sphinx documentation
#
# You can set these variables from the command line.
SPHINXOPTS =
SPHINXBUILD = sphinx-build
PAPER =
# Internal variables.
PAPEROPT_a4 = -D latex_paper_size=a4
PAPEROPT_letter = -D latex_paper_size=letter
ALLSPHINXOPTS = -d build/doctrees $(PAPEROPT_$(PAPER)) $(SPHINXOPTS) .
.PHONY: help clean html web htmlhelp latex changes linkcheck
help:
@echo "Please use \`make <target>' where <target> is one of"
@echo " html to make standalone HTML files"
@echo " web to make files usable by Sphinx.web"
@echo " htmlhelp to make HTML files and a HTML help project"
@echo " latex to make LaTeX files, you can set PAPER=a4 or PAPER=letter"
@echo " changes to make an overview over all changed/added/deprecated items"
@echo " linkcheck to check all external links for integrity"
clean:
-rm -rf build/*
html:
mkdir -p build/html build/doctrees
$(SPHINXBUILD) -b html $(ALLSPHINXOPTS) build/html
@echo
@echo "Build finished. The HTML pages are in build/html."
web:
mkdir -p build/web build/doctrees
$(SPHINXBUILD) -b web $(ALLSPHINXOPTS) build/web
@echo
@echo "Build finished; now you can run"
@echo " python -m sphinx.web build/web"
@echo "to start the server."
htmlhelp:
mkdir -p build/htmlhelp build/doctrees
$(SPHINXBUILD) -b htmlhelp $(ALLSPHINXOPTS) build/htmlhelp
@echo
@echo "Build finished; now you can run HTML Help Workshop with the" \
".hhp project file in build/htmlhelp."
latex:
mkdir -p build/latex build/doctrees
$(SPHINXBUILD) -b latex $(ALLSPHINXOPTS) build/latex
@echo
@echo "Build finished; the LaTeX files are in build/latex."
@echo "Run \`make all-pdf' or \`make all-ps' in that directory to" \
"run these through (pdf)latex."
changes:
mkdir -p build/changes build/doctrees
$(SPHINXBUILD) -b changes $(ALLSPHINXOPTS) build/changes
@echo
@echo "The overview file is in build/changes."
linkcheck:
mkdir -p build/linkcheck build/doctrees
$(SPHINXBUILD) -b linkcheck $(ALLSPHINXOPTS) build/linkcheck
@echo
@echo "Link check complete; look for any errors in the above output " \
"or in build/linkcheck/output.txt."
Cython's entire documentation suite is currently being overhauled.
For the time being, I'll use this page to post notes.
The previous Cython documentation files are hosted at http://hg.cython.org/cython-docs
Notes
=======
1) Some css work should definately be done.
2) Use local 'top-of-page' contents rather than the sidebar, imo.
3) Provide a link from each (sub)section to the TOC of the page.
4) Fix cython highlighter for cdef blocks
{% extends "!layout.html" %}
{% block footer %}
{{ super() }}
<script type="text/javascript">
var gaJsHost = (("https:" == document.location.protocol) ? "https://ssl." : "http://www.");
document.write(unescape("%3Cscript src='" + gaJsHost + "google-analytics.com/ga.js' type='text/javascript'%3E%3C/script%3E"));
</script>
<script type="text/javascript">
try {
var pageTracker = _gat._getTracker("UA-6139100-3");
pageTracker._trackPageview();
} catch(err) {}</script>
{% endblock %}
# -*- coding: utf-8 -*-
#
# Cython documentation build configuration file, created by
# sphinx-quickstart on Fri Apr 25 12:49:32 2008.
#
# This file is execfile()d with the current directory set to its containing dir.
#
# The contents of this file are pickled, so don't put values in the namespace
# that aren't pickleable (module imports are okay, they're removed automatically).
#
# All configuration values have a default value; values that are commented out
# serve to show the default value.
import sys
# If your extensions are in another directory, add it here.
sys.path.append('sphinxext')
# Import support for ipython console session syntax highlighting (lives
# in the sphinxext directory defined above)
import ipython_console_highlighting
# General configuration
# ---------------------
# Use cython as the default syntax highlighting language, as python is a subset
# this does the right thing
highlight_language = 'cython'
# Add any Sphinx extension module names here, as strings. They can be extensions
# coming with Sphinx (named 'sphinx.ext.*') or your custom ones.
extensions = ['ipython_console_highlighting', 'cython_highlighting', 'sphinx.ext.pngmath', 'sphinx.ext.todo']
# Add any paths that contain templates here, relative to this directory.
templates_path = ['_templates']
# The suffix of source filenames.
source_suffix = '.rst'
# The master toctree document.
master_doc = 'index'
exclude_patterns = ['py*', 'build']
# General substitutions.
project = 'Cython'
copyright = '2011, Stefan Behnel, Robert Bradshaw, Dag Sverre Seljebotn, Greg Ewing, William Stein, Gabriel Gellner, et al.'
# The default replacements for |version| and |release|, also used in various
# other places throughout the built documents.
#
# The short X.Y version.
version = '0.15'
# The full version, including alpha/beta/rc tags.
release = '0.15pre'
# There are two options for replacing |today|: either, you set today to some
# non-false value, then it is used:
#today = ''
# Else, today_fmt is used as the format for a strftime call.
today_fmt = '%B %d, %Y'
# List of documents that shouldn't be included in the build.
#unused_docs = []
# If true, '()' will be appended to :func: etc. cross-reference text.
#add_function_parentheses = True
# If true, the current module name will be prepended to all description
# unit titles (such as .. function::).
#add_module_names = True
# If true, sectionauthor and moduleauthor directives will be shown in the
# output. They are ignored by default.
#show_authors = False
# The name of the Pygments (syntax highlighting) style to use.
pygments_style = 'sphinx'
# Options for HTML output
# -----------------------
# suffix for generated files
html_file_suffix = '.html'
# The style sheet to use for HTML and HTML Help pages. A file of that name
# must exist either in Sphinx' static/ path, or in one of the custom paths
# given in html_static_path.
html_style = 'default.css'
# Add any paths that contain custom static files (such as style sheets) here,
# relative to this directory. They are copied after the builtin static files,
# so a file named "default.css" will overwrite the builtin "default.css".
html_static_path = ['_static']
# If not '', a 'Last updated on:' timestamp is inserted at every page bottom,
# using the given strftime format.
html_last_updated_fmt = '%b %d, %Y'
# Include the Cython logo in the sidebar
html_logo = '_static/cython-logo-light.png'
# used a favicon!
html_favicon = '_static/favicon.ico'
# If true, SmartyPants will be used to convert quotes and dashes to
# typographically correct entities.
#html_use_smartypants = True
# Content template for the index page.
#html_index = ''
# Custom sidebar templates, maps document names to template names.
#html_sidebars = {}
# Additional templates that should be rendered to pages, maps page names to
# template names.
#html_additional_pages = {}
# If false, no module index is generated.
html_use_modindex = False
# Don't generate and index
html_use_index = False
# If true, the reST sources are included in the HTML build as _sources/<name>.
#html_copy_source = True
# Output file base name for HTML help builder.
htmlhelp_basename = 'Cythondoc'
# Options for LaTeX output
# ------------------------
# The paper size ('letter' or 'a4').
#latex_paper_size = 'letter'
# The font size ('10pt', '11pt' or '12pt').
#latex_font_size = '10pt'
# Grouping the document tree into LaTeX files. List of tuples
# (source start file, target name, title, author, document class [howto/manual]).
#_stdauthor = r'Greg Ewig\\ Gabriel Gellner, editor'
_stdauthor = r'Stefan Behnel, Robert Bradshaw, William Stein\\ Gary Furnish, Dag Seljebotn, Greg Ewing\\ Gabriel Gellner, editor'
latex_documents = [
('src/reference/index', 'reference.tex',
'Cython Reference Guide', _stdauthor, 'manual'),
('src/tutorial/index', 'tutorial.tex',
'Cython Tutorial', _stdauthor, 'manual')
]
# Additional stuff for the LaTeX preamble.
#latex_preamble = ''
# Documents to append as an appendix to all manuals.
#latex_appendices = []
# If false, no module index is generated.
#latex_use_modindex = True
# todo
todo_include_todos = True
def fib(n):
"""Print the Fibonacci series up to n."""
a, b = 0, 1
while b < n:
print b,
a, b = b, a + b
from distutils.core import setup
from distutils.extension import Extension
from Cython.Distutils import build_ext
setup(
cmdclass = {'build_ext': build_ext},
ext_modules = [Extension("fib", ["fib.pyx"])]
)
import math
def great_circle(lon1, lat1, lon2, lat2):
radius = 3956 # miles
x = math.pi/180.0
a = (90.0 - lat1)*x
b = (90.0 - lat2)*x
theta = (lon2 - lon1)*x
c = math.acos(math.cos(a)*math.cos(b) + math.sin(a)*math.sin(b)*math.cos(theta))
return radius*c
import math
def great_circle(double lon1, double lat1, double lon2, double lat2):
cdef double radius = 3956 # miles
cdef double x = math.pi/180.0
cdef double a, b, theta, c
a = (90.0 - lat1)*x
b = (90.0 - lat2)*x
theta = (lon2 - lon1)*x
c = math.acos(math.cos(a)*math.cos(b) + math.sin(a)*math.sin(b)*math.cos(theta))
return radius*c
import math
def great_circle(lon1, lat1, lon2, lat2):
radius = 3956 # miles
x = math.pi/180.0
a = (90.0 - lat1)*x
b = (90.0 - lat2)*x
theta = (lon2 - lon1)*x
c = math.acos(math.cos(a)*math.cos(b) + math.sin(a)*math.sin(b)*math.cos(theta))
return radius*c
def primes(kmax):
result = []
if kmax > 1000:
kmax = 1000
while k < kmax:
i = 0
while i < k and n % p[i] != 0:
i = i + 1
if i == k:
p[k] = n
k = k + 1
result.append(n)
n = n + 1
return result
def primes(int kmax):
cdef int n, k, i
cdef int p[1000]
result = []
if kmax > 1000:
kmax = 1000
k = 0
n = 2
while k < kmax:
i = 0
while i < k and n % p[i] != 0:
i = i + 1
if i == k:
p[k] = n
k = k + 1
result.append(n)
n = n + 1
return result
from distutils.core import setup
from distutils.extension import Extension
from Cython.Distutils import build_ext
setup(
cmdclass = {'build_ext': build_ext},
ext_modules = [Extension("primes", ["primes.pyx"])]
)
Welcome to Cython's Documentation
=================================
.. toctree::
:maxdepth: 2
src/quickstart/index
src/tutorial/index
src/userguide/index
src/reference/index
import re
from pygments.lexer import Lexer, RegexLexer, ExtendedRegexLexer, \
LexerContext, include, combined, do_insertions, bygroups, using
from pygments.token import Error, Text, \
Comment, Operator, Keyword, Name, String, Number, Generic, Punctuation
from pygments.util import get_bool_opt, get_list_opt, shebang_matches
from pygments import unistring as uni
from sphinx import highlighting
line_re = re.compile('.*?\n')
class CythonLexer(RegexLexer):
"""
For `Cython <http://cython.org>`_ source code.
"""
name = 'Cython'
aliases = ['cython', 'pyx']
filenames = ['*.pyx', '*.pxd', '*.pxi']
mimetypes = ['text/x-cython', 'application/x-cython']
tokens = {
'root': [
(r'\n', Text),
(r'^(\s*)("""(?:.|\n)*?""")', bygroups(Text, String.Doc)),
(r"^(\s*)('''(?:.|\n)*?''')", bygroups(Text, String.Doc)),
(r'[^\S\n]+', Text),
(r'#.*$', Comment),
(r'[]{}:(),;[]', Punctuation),
(r'\\\n', Text),
(r'\\', Text),
(r'(in|is|and|or|not)\b', Operator.Word),
(r'(<)([a-zA-Z0-9.?]+)(>)',
bygroups(Punctuation, Keyword.Type, Punctuation)),
(r'!=|==|<<|>>|[-~+/*%=<>&^|.?]', Operator),
(r'(from)(\d+)(<=)(\s+)(<)(\d+)(:)',
bygroups(Keyword, Number.Integer, Operator, Name, Operator,
Name, Punctuation)),
include('keywords'),
(r'(def|property)(\s+)', bygroups(Keyword, Text), 'funcname'),
(r'(cp?def)(\s+)', bygroups(Keyword, Text), 'cdef'),
(r'(class|struct)(\s+)', bygroups(Keyword, Text), 'classname'),
(r'(from)(\s+)', bygroups(Keyword, Text), 'fromimport'),
(r'(c?import)(\s+)', bygroups(Keyword, Text), 'import'),
include('builtins'),
include('backtick'),
('(?:[rR]|[uU][rR]|[rR][uU])"""', String, 'tdqs'),
("(?:[rR]|[uU][rR]|[rR][uU])'''", String, 'tsqs'),
('(?:[rR]|[uU][rR]|[rR][uU])"', String, 'dqs'),
("(?:[rR]|[uU][rR]|[rR][uU])'", String, 'sqs'),
('[uU]?"""', String, combined('stringescape', 'tdqs')),
("[uU]?'''", String, combined('stringescape', 'tsqs')),
('[uU]?"', String, combined('stringescape', 'dqs')),
("[uU]?'", String, combined('stringescape', 'sqs')),
include('name'),
include('numbers'),
],
'keywords': [
(r'(assert|break|by|continue|ctypedef|del|elif|else|except\??|exec|'
r'finally|for|gil|global|if|include|lambda|nogil|pass|print|raise|'
r'return|try|while|yield|as|with)\b', Keyword),
(r'(DEF|IF|ELIF|ELSE)\b', Comment.Preproc),
],
'builtins': [
(r'(?<!\.)(__import__|abs|all|any|apply|basestring|bin|bool|buffer|'
r'bytearray|bytes|callable|chr|classmethod|cmp|coerce|compile|'
r'complex|delattr|dict|dir|divmod|enumerate|eval|execfile|exit|'
r'file|filter|float|frozenset|getattr|globals|hasattr|hash|hex|id|'
r'input|int|intern|isinstance|issubclass|iter|len|list|locals|'
r'long|map|max|min|next|object|oct|open|ord|pow|property|range|'
r'raw_input|reduce|reload|repr|reversed|round|set|setattr|slice|'
r'sorted|staticmethod|str|sum|super|tuple|type|unichr|unicode|'
r'vars|xrange|zip)\b', Name.Builtin),
(r'(?<!\.)(self|None|Ellipsis|NotImplemented|False|True|NULL'
r')\b', Name.Builtin.Pseudo),
(r'(?<!\.)(ArithmeticError|AssertionError|AttributeError|'
r'BaseException|DeprecationWarning|EOFError|EnvironmentError|'
r'Exception|FloatingPointError|FutureWarning|GeneratorExit|IOError|'
r'ImportError|ImportWarning|IndentationError|IndexError|KeyError|'
r'KeyboardInterrupt|LookupError|MemoryError|NameError|'
r'NotImplemented|NotImplementedError|OSError|OverflowError|'
r'OverflowWarning|PendingDeprecationWarning|ReferenceError|'
r'RuntimeError|RuntimeWarning|StandardError|StopIteration|'
r'SyntaxError|SyntaxWarning|SystemError|SystemExit|TabError|'
r'TypeError|UnboundLocalError|UnicodeDecodeError|'
r'UnicodeEncodeError|UnicodeError|UnicodeTranslateError|'
r'UnicodeWarning|UserWarning|ValueError|Warning|ZeroDivisionError'
r')\b', Name.Exception),
],
'numbers': [
(r'(\d+\.?\d*|\d*\.\d+)([eE][+-]?[0-9]+)?', Number.Float),
(r'0\d+', Number.Oct),
(r'0[xX][a-fA-F0-9]+', Number.Hex),
(r'\d+L', Number.Integer.Long),
(r'\d+', Number.Integer)
],
'backtick': [
('`.*?`', String.Backtick),
],
'name': [
(r'@[a-zA-Z0-9_]+', Name.Decorator),
('[a-zA-Z_][a-zA-Z0-9_]*', Name),
],
'funcname': [
('[a-zA-Z_][a-zA-Z0-9_]*', Name.Function, '#pop')
],
'cdef': [
(r'(public|readonly|extern|api|inline)\b', Keyword.Reserved),
(r'(struct|enum|union|class)\b', Keyword),
(r'([a-zA-Z_][a-zA-Z0-9_]*)(\s*)(?=[(:#=]|$)',
bygroups(Name.Function, Text), '#pop'),
(r'([a-zA-Z_][a-zA-Z0-9_]*)(\s*)(,)',
bygroups(Name.Function, Text, Punctuation)),
(r'from\b', Keyword, '#pop'),
(r'as\b', Keyword),
(r':', Punctuation, '#pop'),
(r'(?=["\'])', Text, '#pop'),
(r'[a-zA-Z_][a-zA-Z0-9_]*', Keyword.Type),
(r'.', Text),
],
'classname': [
('[a-zA-Z_][a-zA-Z0-9_]*', Name.Class, '#pop')
],
'import': [
(r'(\s+)(as)(\s+)', bygroups(Text, Keyword, Text)),
(r'[a-zA-Z_][a-zA-Z0-9_.]*', Name.Namespace),
(r'(\s*)(,)(\s*)', bygroups(Text, Operator, Text)),
(r'', Text, '#pop') # all else: go back
],
'fromimport': [
(r'(\s+)(c?import)\b', bygroups(Text, Keyword), '#pop'),
(r'[a-zA-Z_.][a-zA-Z0-9_.]*', Name.Namespace),
# ``cdef foo from "header"``, or ``for foo from 0 < i < 10``
(r'', Text, '#pop'),
],
'stringescape': [
(r'\\([\\abfnrtv"\']|\n|N{.*?}|u[a-fA-F0-9]{4}|'
r'U[a-fA-F0-9]{8}|x[a-fA-F0-9]{2}|[0-7]{1,3})', String.Escape)
],
'strings': [
(r'%(\([a-zA-Z0-9]+\))?[-#0 +]*([0-9]+|[*])?(\.([0-9]+|[*]))?'
'[hlL]?[diouxXeEfFgGcrs%]', String.Interpol),
(r'[^\\\'"%\n]+', String),
# quotes, percents and backslashes must be parsed one at a time
(r'[\'"\\]', String),
# unhandled string formatting sign
(r'%', String)
# newlines are an error (use "nl" state)
],
'nl': [
(r'\n', String)
],
'dqs': [
(r'"', String, '#pop'),
(r'\\\\|\\"|\\\n', String.Escape), # included here again for raw strings
include('strings')
],
'sqs': [
(r"'", String, '#pop'),
(r"\\\\|\\'|\\\n", String.Escape), # included here again for raw strings
include('strings')
],
'tdqs': [
(r'"""', String, '#pop'),
include('strings'),
include('nl')
],
'tsqs': [
(r"'''", String, '#pop'),
include('strings'),
include('nl')
],
}
##TODO: fix this, as shebang lines don't make sense for cython.
def analyse_text(text):
return shebang_matches(text, r'pythonw?(2\.\d)?')
highlighting.lexers['cython'] = CythonLexer()
from pygments.lexer import Lexer, do_insertions
from pygments.lexers.agile import PythonConsoleLexer, PythonLexer, \
PythonTracebackLexer
from pygments.token import Comment, Generic
from sphinx import highlighting
import re
line_re = re.compile('.*?\n')
class IPythonConsoleLexer(Lexer):
"""
For IPython console output or doctests, such as:
Tracebacks are not currently supported.
.. sourcecode:: ipython
In [1]: a = 'foo'
In [2]: a
Out[2]: 'foo'
In [3]: print a
foo
In [4]: 1 / 0
"""
name = 'IPython console session'
aliases = ['ipython']
mimetypes = ['text/x-ipython-console']
input_prompt = re.compile("(In \[[0-9]+\]: )|( \.\.\.+:)")
output_prompt = re.compile("(Out\[[0-9]+\]: )|( \.\.\.+:)")
continue_prompt = re.compile(" \.\.\.+:")
tb_start = re.compile("\-+")
def get_tokens_unprocessed(self, text):
pylexer = PythonLexer(**self.options)
tblexer = PythonTracebackLexer(**self.options)
curcode = ''
insertions = []
for match in line_re.finditer(text):
line = match.group()
input_prompt = self.input_prompt.match(line)
continue_prompt = self.continue_prompt.match(line.rstrip())
output_prompt = self.output_prompt.match(line)
if line.startswith("#"):
insertions.append((len(curcode),
[(0, Comment, line)]))
elif input_prompt is not None:
insertions.append((len(curcode),
[(0, Generic.Prompt, input_prompt.group())]))
curcode += line[input_prompt.end():]
elif continue_prompt is not None:
insertions.append((len(curcode),
[(0, Generic.Prompt, continue_prompt.group())]))
curcode += line[continue_prompt.end():]
elif output_prompt is not None:
insertions.append((len(curcode),
[(0, Generic.Output, output_prompt.group())]))
curcode += line[output_prompt.end():]
else:
if curcode:
for item in do_insertions(insertions,
pylexer.get_tokens_unprocessed(curcode)):
yield item
curcode = ''
insertions = []
yield match.start(), Generic.Output, line
if curcode:
for item in do_insertions(insertions,
pylexer.get_tokens_unprocessed(curcode)):
yield item
highlighting.lexers['ipython'] = IPythonConsoleLexer()
Building Cython code
====================
Cython code must, unlike Python, be compiled. This happens in two stages:
- A ``.pyx`` file is compiled by Cython to a ``.c`` file, containing
the code of a Python extension module
- The ``.c`` file is compiled by a C compiler to
a ``.so`` file (or ``.pyd`` on Windows) which can be
``import``-ed directly into a Python session.
There are several ways to build Cython code:
- Write a distutils ``setup.py``.
- Use ``pyximport``, importing Cython ``.pyx`` files as if they
were ``.py`` files (using distutils to compile and build the background).
- Run the ``cython`` command-line utility manually to produce the ``.c`` file
from the ``.pyx`` file, then manually compiling the ``.c`` file into a shared
object library or ``.dll`` suitable for import from Python.
(This is mostly for debugging and experimentation.)
- Use the [Sage]_ notebook which allows Cython code inline.
Currently, distutils is the most common way Cython files are built and distributed. The other methods are described in more detail in the :ref:`compilation` section of the reference manual.
Building a Cython module using distutils
----------------------------------------
Imagine a simple "hello world" script in a file ``hello.pyx``::
def say_hello_to(name):
print("Hello %s!" % name)
The following could be a corresponding ``setup.py`` script::
from distutils.core import setup
from distutils.extension import Extension
from Cython.Distutils import build_ext
ext_modules = [Extension("hello", ["hello.pyx"])]
setup(
name = 'Hello world app',
cmdclass = {'build_ext': build_ext},
ext_modules = ext_modules
)
To build, run ``python setup.py build_ext --inplace``. Then simply
start a Python session and do ``from hello import say_hello_to`` and
use the imported function as you see fit.
.. figure:: sage.png
The Sage notebook allows transparently editing and compiling Cython
code simply by typing ``%cython`` at the top of a cell and evaluate
it. Variables and functions defined in a Cython cell imported into
the running session.
.. [Sage] W. Stein et al., Sage Mathematics Software, http://sagemath.org
Faster code via static typing
=============================
Cython is a Python compiler. This means that it can compile normal
Python code without changes (with a few obvious exceptions of some as-yet
unsupported language features). However, for performance critical
code, it is often helpful to add static type declarations, as they
will allow Cython to step out of the dynamic nature of the Python code
and generate simpler and faster C code - sometimes faster by orders of
magnitude.
It must be noted, however, that type declarations can make the source
code more verbose and thus less readable. It is therefore discouraged
to use them without good reason, such as where benchmarks prove
that they really make the code substantially faster in a performance
critical section. Typically a few types in the right spots go a long way.
All C types are available for type declarations: integer and floating
point types, complex numbers, structs, unions and pointer types.
Cython can automatically and correctly convert between the types on
assignment. This also includes Python's arbitrary size integer types,
where value overflows on conversion to a C type will raise a Python
``OverflowError`` at runtime. (It does not, however, check for overflow
when doing arithmetic.) The generated C code will handle the
platform dependent sizes of C types correctly and safely in this case.
Types are declared via the cdef keyword.
Typing Variables
----------------
Consider the following pure Python code::
def f(x):
return x**2-x
def integrate_f(a, b, N):
s = 0
dx = (b-a)/N
for i in range(N):
s += f(a+i*dx)
return s * dx
Simply compiling this in Cython merely gives a 35% speedup. This is
better than nothing, but adding some static types can make a much larger
difference.
With additional type declarations, this might look like::
def f(double x):
return x**2-x
def integrate_f(double a, double b, int N):
cdef int i
cdef double s, dx
s = 0
dx = (b-a)/N
for i in range(N):
s += f(a+i*dx)
return s * dx
Since the iterator variable ``i`` is typed with C semantics, the for-loop will be compiled
to pure C code. Typing ``a``, ``s`` and ``dx`` is important as they are involved
in arithmetic withing the for-loop; typing ``b`` and ``N`` makes less of a
difference, but in this case it is not much extra work to be
consistent and type the entire function.
This results in a 4 times speedup over the pure Python version.
Typing Functions
----------------
Python function calls can be expensive -- in Cython doubly so because
one might need to convert to and from Python objects to do the call.
In our example above, the argument is assumed to be a C double both inside f()
and in the call to it, yet a Python ``float`` object must be constructed around the
argument in order to pass it.
Therefore Cython provides a syntax for declaring a C-style function,
the cdef keyword::
cdef double f(double) except? -2:
return x**2-x
Some form of except-modifier should usually be added, otherwise Cython
will not be able to propagate exceptions raised in the function (or a
function it calls). The ``except? -2`` means that an error will be checked
for if ``-2`` is returned (though the ``?`` indicates that ``-2`` may also
be used as a valid return value).
Alternatively, the slower ``except *`` is always
safe. An except clause can be left out if the function returns a Python
object or if it is guaranteed that an exception will not be raised
within the function call.
A side-effect of cdef is that the function is no longer available from
Python-space, as Python wouldn't know how to call it. Using the
``cpdef`` keyword instead of cdef, a Python wrapper is also created,
so that the function is available both from Cython (fast, passing
typed values directly) and from Python (wrapping values in Python
objects).
Note also that it is no longer possible to change ``f`` at runtime.
Speedup: 150 times over pure Python.
Determining where to add types
------------------------------
Because static typing is often the key to large speed gains, beginners
often have a tendency to type everything in sight. This cuts down on both
readability and flexibility. On the other hand, it is easy to kill
performance by forgetting to type a critical loop variable. Two essential
tools to help with this task are profiling and annotation.
Profiling should be the first step of any optimization effort, and can
tell you where you are spending your time. Cython's annotation can then
tell you why your code is taking time.
Using the ``-a`` switch to the ``cython`` command line program (or
following a link from the Sage notebook) results in an HTML report
of Cython code interleaved with the generated C code. Lines are
colored according to the level of "typedness" -- white lines
translates to pure C without any Python API calls. This report
is invaluable when optimizing a function for speed.
.. figure:: htmlreport.png
from time import time
from math import sin
cdef double first_time = 0
def timeit(f, label):
global first_time
t = time()
f(1.0, 2.0, 10**7)
cdef double elapsed = time() - t
if first_time == 0:
first_time = elapsed
print label, elapsed, (100*elapsed/first_time), '% or', first_time/elapsed, 'x'
# Pure Python
py_funcs = {'sin': sin}
py_funcs.update(__builtins__.__dict__)
exec """
def f(x):
return x**2-x
def integrate_f(a, b, N):
s = 0
dx = (b-a)/N
for i in range(N):
s += f(a+i*dx)
return s * dx
""" in py_funcs
timeit(py_funcs['integrate_f'], "Python")
# Just compiled
def f0(x):
return x**2-x
def integrate_f0(a, b, N):
s = 0
dx = (b-a)/N
for i in range(N):
s += f0(a+i*dx)
return s * dx
timeit(integrate_f0, "Cython")
# Typed vars
def f1(double x):
return x**2-x
def integrate_f1(double a, double b, int N):
cdef int i
cdef double s, dx
s = 0
dx = (b-a)/N
for i in range(N):
s += f1(a+i*dx)
return s * dx
timeit(integrate_f1, "Typed vars")
# Typed func
cdef double f2(double x) except? -2:
return x**2-x
def integrate_f2(double a, double b, int N):
cdef int i
cdef double s, dx
s = 0
dx = (b-a)/N
for i in range(N):
s += f2(a+i*dx)
return s * dx
timeit(integrate_f2, "Typed func")
Getting Started
===============
.. toctree::
:maxdepth: 2
overview
install
build
cythonize
Installing Cython
=================
Many scientific Python distributions, such as the Enthought Python
Distribution [EPD]_, Python(x,y) [Pythonxy]_, and Sage [Sage]_, bundle
Cython and no setup is needed. Note however that if your distribution
ships a version of Cython which is too old you can still use the
instructions below to update Cython. Everything in this tutorial
should work with Cython 0.11.2 and newer, unless a footnote says
otherwise.
Unlike most Python software, Cython requires a C compiler to be
present on the system. The details of getting a C compiler varies
according to the system used:
- **Linux** The GNU C Compiler (gcc) is usually present, or easily
available through the package system. On Ubuntu or Debian, for
instance, the command ``sudo apt-get install build-essential`` will
fetch everything you need.
- **Mac OS X** To retrieve gcc, one option is to install Apple's
XCode, which can be retrieved from the Mac OS X's install DVDs or
from http://developer.apple.com.
- **Windows** A popular option is to use the open source MinGW (a
Windows distribution of gcc). See the appendix for instructions for
setting up MinGW manually. EPD and Python(x,y) bundle MinGW, but
some of the configuration steps in the appendix might still be
necessary. Another option is to use Microsoft's Visual C. One must
then use the same version which the installed Python was compiled
with.
.. dagss tried other forms of ReST lists and they didn't look nice
.. with rst2latex.
The newest Cython release can always be downloaded from
http://cython.org. Unpack the tarball or zip file, enter the
directory, and then run::
python setup.py install
If you have Python setuptools set up on your system, you should be
able to fetch Cython from PyPI and install it using::
easy_install cython
For Windows there is also an executable installer available for
download.
.. [EPD] http://www.enthought.com/products/epd.php
.. [Pythonxy] http://www.pythonxy.com/
.. [Sage] W. Stein et al., Sage Mathematics Software, http://sagemath.org
Cython - an overview
====================
[Cython]_ is a programming language based on Python, with extra syntax
allowing for optional static type declarations. It aims to become a superset
of the [Python]_ language which gives it high-level, object-oriented,
functional, and dynamic programming. The source code gets translated
into optimized C/C++ code and compiled as Python extension modules.
This allows for both very fast program execution and tight integration
with external C libraries, while keeping up the high programmer
productivity for which the Python language is well known.
The primary Python execution environment is commonly referred to as
CPython, as it is written in C. Other major implementations use Java
(Jython [Jython]_), C# (IronPython [IronPython]_) and Python itself
(PyPy [PyPy]_). Written in C, CPython has been conducive to wrapping
many external libraries that interface through the C language. It
has, however, remained non trivial to write the necessary glue code in
C, especially for programmers who are more fluent in a high-level
language like Python than in a close-to-the-metal language like C.
Originally based on the well-known Pyrex [Pyrex]_, the Cython project has
approached this problem by means of a source code compiler that
translates Python code to equivalent C code. This code is executed
within the CPython runtime environment, but at the speed of compiled C
and with the ability to call directly into C libraries.
At the same time, it keeps the original interface of the Python
source code, which makes it directly usable from Python code. These
two-fold characteristics enable Cython's two major use cases:
extending the CPython interpreter with fast binary modules, and
interfacing Python code with external C libraries.
While Cython can compile (most) regular Python code, the generated C
code usually gains major (and sometime impressive) speed improvements
from optional static type declarations for both Python and C types.
These allow Cython to assign C semantics to parts of the code, and to
translate them into very efficient C code. Type declarations can
therefore be used for two purposes: for moving code sections from
dynamic Python semantics into static-and-fast C semantics, but also
for directly manipulating types defined in external libraries. Cython
thus merges the two worlds into a very broadly applicable programming
language.
.. [Cython] G. Ewing, R. W. Bradshaw, S. Behnel, D. S. Seljebotn et al.,
The Cython compiler, http://cython.org.
.. [IronPython] Jim Hugunin et al., http://www.codeplex.com/IronPython.
.. [Jython] J. Huginin, B. Warsaw, F. Bock, et al.,
Jython: Python for the Java platform, http://www.jython.org/
.. [PyPy] The PyPy Group, PyPy: a Python implementation written in Python,
http://codespeak.net/pypy.
.. [Pyrex] G. Ewing, Pyrex: C-Extensions for Python,
http://www.cosc.canterbury.ac.nz/greg.ewing/python/Pyrex/
.. [Python] G. van Rossum et al., The Python programming language,
http://python.org.
# Makefile for Sphinx documentation
#
# You can set these variables from the command line.
SPHINXOPTS =
SPHINXBUILD = sphinx-build
PAPER =
# Internal variables.
PAPEROPT_a4 = -D latex_paper_size=a4
PAPEROPT_letter = -D latex_paper_size=letter
ALLSPHINXOPTS = -d build/doctrees $(PAPEROPT_$(PAPER)) $(SPHINXOPTS) .
.PHONY: help clean html web htmlhelp latex changes linkcheck
help:
@echo "Please use \`make <target>' where <target> is one of"
@echo " html to make standalone HTML files"
@echo " web to make files usable by Sphinx.web"
@echo " htmlhelp to make HTML files and a HTML help project"
@echo " latex to make LaTeX files, you can set PAPER=a4 or PAPER=letter"
@echo " changes to make an overview over all changed/added/deprecated items"
@echo " linkcheck to check all external links for integrity"
clean:
-rm -rf build/*
html:
mkdir -p build/html build/doctrees
$(SPHINXBUILD) -b html $(ALLSPHINXOPTS) build/html
@echo
@echo "Build finished. The HTML pages are in build/html."
web:
mkdir -p build/web build/doctrees
$(SPHINXBUILD) -b web $(ALLSPHINXOPTS) build/web
@echo
@echo "Build finished; now you can run"
@echo " python -m sphinx.web build/web"
@echo "to start the server."
htmlhelp:
mkdir -p build/htmlhelp build/doctrees
$(SPHINXBUILD) -b htmlhelp $(ALLSPHINXOPTS) build/htmlhelp
@echo
@echo "Build finished; now you can run HTML Help Workshop with the" \
".hhp project file in build/htmlhelp."
latex:
mkdir -p build/latex build/doctrees
$(SPHINXBUILD) -b latex $(ALLSPHINXOPTS) build/latex
@echo
@echo "Build finished; the LaTeX files are in build/latex."
@echo "Run \`make all-pdf' or \`make all-ps' in that directory to" \
"run these through (pdf)latex."
changes:
mkdir -p build/changes build/doctrees
$(SPHINXBUILD) -b changes $(ALLSPHINXOPTS) build/changes
@echo
@echo "The overview file is in build/changes."
linkcheck:
mkdir -p build/linkcheck build/doctrees
$(SPHINXBUILD) -b linkcheck $(ALLSPHINXOPTS) build/linkcheck
@echo
@echo "Link check complete; look for any errors in the above output " \
"or in build/linkcheck/output.txt."
.. highlight:: cython
.. _compilation:
***********
Compilation
***********
* Cython code, unlike Python, must be compiled.
* This happens in two stages:
* A ``.pyx`` file is compiles by Cython to a ``.c`` file.
* The ``.c`` file is compiled by a C comiler to a ``.so`` file (or a ``.pyd`` file on Windows)
* The following sub-sections describe several ways to build your extension modules.
.. note:: The ``-a`` option
* Using the Cython compiler with the ``-a`` option will produce a really nice HTML file of the Cython generated ``.c`` code.
* Double clicking on the highlighted sections will expand the code to reveal what Cython has actually generated for you.
* This is very useful for understanding, optimizing or debugging your module.
=====================
From the Command Line
=====================
* Run the Cython compiler command with your options and list of ``.pyx`` files to generate::
$ cython -a yourmod.pyx
* This creates a ``yourmod.c`` file. (and the -a switch produces a generated html file)
* Compiling your ``.c`` files will vary depending on your operating system.
* Python documentation for writing extension modules should have some details for your system.
* Here we give an example on a Linux system::
$ gcc -shared -pthread -fPIC -fwrapv -O2 -Wall -fno-strict-aliasing -I/usr/include/python2.5 -o yourmod.so yourmod.c
* ``gcc`` will need to have paths to your included header files and paths to libraries you need to link with.
* A ``yourmod.so`` file is now in the same directory.
* Your module, ``yourmod`` is available for you to import as you normally would.
=========
Distutils
=========
* Ensure Distutils is installed in your system.
* The following assumes a Cython file to be compiled called *hello.pyx*.
* Create a ``setup.py`` script::
from distutils.core import setup
from distutils.extension import Extension
from Cython.Distutils import build_ext
ext_modules = [Extension("hello", ["hello.pyx"])]
setup(
name = ’Hello world app’,
cmdclass = {’build_ext’: build_ext},
ext_modules = ext_modules
)
* Run the command ``python setup.py build_ext --inplace`` in your system's command shell.
* Your done.. import your new extension module into your python shell or script as normal.
=====
SCons
=====
to be completed...
=========
Pyximport
=========
* For generating Cython code right in your pure python modulce::
>>> import pyximport; pyximport.install()
>>> import helloworld
Hello World
* Use for simple Cython builds only.
* No extra C libraries.
* No special build setup needed.
* Also has experimental compilation support for normal Python modules.
* Allows you to automatically run Cython on every ``.pyx`` and ``.py`` module that Python imports.
* This includes the standard library and installed packages.
* In the case that Cython fails to compile a Python module, *pyximport* will fall back to loading the source modules instead.
* The ``.py`` import mechanism is installed like this::
>>> pyximport.install(pyimport = True)
.. note:: Authors
Paul Prescod, Stefan Behnal
====
Sage
====
The Sage notebook allows transparently editing and
compiling Cython code simply by typing %cython at
the top of a cell and evaluate it. Variables and func-
tions defined in a Cython cell imported into the run-
ning session.
.. todo:: Provide a link to Sage docs
Compiler Directives
===================
TODO. See http://wiki.cython.org/enhancements/compilerdirectives
\ No newline at end of file
.. highlight:: cython
.. _extension_types:
***************
Extension Types
***************
* Normal Python as well as extension type classes can be defined.
* Extension types:
* Are considered by Python as "built-in" types.
* Can be used to wrap arbitrary C-data structures, and provide a Python-like interface to them from Python.
* Attributes and methods can be called from Python or Cython code
* Are defined by the ``cdef class`` statement.
::
cdef class Shrubbery:
cdef int width, height
def __init__(self, w, h):
self.width = w
self.height = h
def describe(self):
print "This shrubbery is", self.width, \
"by", self.height, "cubits."
==========
Attributes
==========
* Are stored directly in the object's C struct.
* Are fixed at compile time.
* You can't add attributes to an extension type instance at run time like in normal Python.
* You can sub-class the extenstion type in Python to add attributes at run-time.
* There are two ways to access extension type attributes:
* By Python look-up.
* Python code's only method of access.
* By direct access to the C struct from Cython code.
* Cython code can use either method of access, though.
* By default, extension type attributes are:
* Only accessible by direct access.
* Not accessible from Python code.
* To make attributes accessible to Python, they must be declared ``public`` or ``readonly``::
cdef class Shrubbery:
cdef public int width, height
cdef readonly float depth
* The ``width`` and ``height`` attributes are readable and writable from Python code.
* The ``depth`` attribute is readable but not writable.
.. note::
.. note::
You can only expose simple C types, such as ints, floats, and strings, for Python access. You can also expose Python-valued attributes.
.. note::
The ``public`` and ``readonly`` options apply only to Python access, not direct access. All the attributes of an extension type are always readable and writable by C-level access.
=======
Methods
=======
* ``self`` is used in extension type methods just like it normally is in Python.
* See **Functions and Methods**; all of which applies here.
==========
Properties
==========
* Cython provides a special syntax::
cdef class Spam:
property cheese:
"A doc string can go here."
def __get__(self):
# This is called when the property is read.
...
def __set__(self, value):
# This is called when the property is written.
...
def __del__(self):
# This is called when the property is deleted.
* The ``__get__()``, ``__set__()``, and ``__del__()`` methods are all optional.
* If they are ommitted, An exception is raised when an access attempt is made.
* Below, is a full example that defines a property which can..
* Add to a list each time it is written to (``"__set__"``).
* Return the list when it is read (``"__get__"``).
* Empty the list when it is deleted (``"__del__"``).
::
# cheesy.pyx
cdef class CheeseShop:
cdef object cheeses
def __cinit__(self):
self.cheeses = []
property cheese:
def __get__(self):
return "We don't have: %s" % self.cheeses
def __set__(self, value):
self.cheeses.append(value)
def __del__(self):
del self.cheeses[:]
# Test input
from cheesy import CheeseShop
shop = CheeseShop()
print shop.cheese
shop.cheese = "camembert"
print shop.cheese
shop.cheese = "cheddar"
print shop.cheese
del shop.cheese
print shop.cheese
::
# Test output
We don't have: []
We don't have: ['camembert']
We don't have: ['camembert', 'cheddar']
We don't have: []
===============
Special Methods
===============
.. note::
#. The semantics of Cython's special methods are similar in principle to that of Python's.
#. There are substantial differences in some behavior.
#. Some Cython special methods have no Python counter-part.
* See the :ref:`special_methods_table` for the many that are available.
Declaration
===========
* Must be declared with ``def`` and cannot be declared with ``cdef``.
* Performance is not affected by the ``def`` declaration because of special calling conventions
Docstrings
==========
* Docstrings are not supported yet for some special method types.
* They can be included in the source, but may not appear in the corresponding ``__doc__`` attribute at run-time.
* This a Python library limitation because the ``PyTypeObject`` data structure is limited
Initialization: ``__cinit__()`` and ``__init__()``
==================================================
* Any arguments passed to the extension type's constructor, will be passed to both initialization methods.
* ``__cinit__()`` is where you should perform C-level initialization of the object
* This includes any allocation of C data structures.
* **Caution** is warranted as to what you do in this method.
* The object may not be fully valid Python object when it is called.
* Calling Python objects, including the extensions own methods, may be hazardous.
* By the time ``__cinit__()`` is called...
* Memory has been allocated for the object.
* All C-level attributes have been initialized to 0 or null.
* Python have been initialized to ``None``, but you can not rely on that for each occasion.
* This initialization method is guaranteed to be called exactly once.
* For Extensions types that inherit a base type:
* The ``__cinit__()`` method of the base type is automatically called before this one.
* The inherited ``__cinit__()`` method can not be called explicitly.
* Passing modified argument lists to the base type must be done through ``__init__()``.
* It may be wise to give the ``__cinit__()`` method both ``"*"`` and ``"**"`` arguments.
* Allows the method to accept or ignore additional arguments.
* Eliminates the need for a Python level sub-class, that changes the ``__init__()`` method's signature, to have to override both the ``__new__()`` and ``__init__()`` methods.
* If ``__cinit__()`` is declared to take no arguments except ``self``, it will ignore any extra arguments passed to the constructor without complaining about a signature mis-match
* ``__init__()`` is for higher-level initialization and is safer for Python access.
* By the time this method is called, the extension type is a fully valid Python object.
* All operations are safe.
* This method may sometimes be called more than once, or possibly not at all.
* Take this into consideration to make sure the design of your other methods are robust of this fact.
Finalization: ``__dealloc__()``
===============================
* This method is the counter-part to ``__cinit__()``.
* Any C-data that was explicitly allocated in the ``__cinit__()`` method should be freed here.
* Use caution in this method:
* The Python object to which this method belongs may not be completely intact at this point.
* Avoid invoking any Python operations that may touch the object.
* Don't call any of this object's methods.
* It's best to just deallocate C-data structures here.
* All Python attributes of your extension type object are deallocated by Cython after the ``__dealloc__()`` method returns.
Arithmetic Methods
==================
.. note:: Most of these methods behave differently than in Python
* There are not "reversed" versions of these methods... there is no __radd__() for instance.
* If the first operand cannot perform the operation, the same method of the second operand is called, with the operands in the same order.
* Do not rely on the first parameter of these methods, being ``"self"`` or the right type.
* The types of both operands should be tested before deciding what to do.
* Return ``NotImplemented`` for unhandled, mis-matched operand types.
* The previously mentioned points..
* Also apply to 'in-place' method ``__ipow__()``.
* Do not apply to other 'in-place' methods like ``__iadd__()``, in that these always take ``self`` as the first argument.
Rich Comparisons
================
.. note:: There are no separate methods for individual rich comparison operations.
* A single special method called ``__richcmp__()`` replaces all the individual rich compare, special method types.
* ``__richcmp__()`` takes an integer argument, indicating which operation is to be performed as shown in the table below.
+-----+-----+
| < | 0 |
+-----+-----+
| == | 2 |
+-----+-----+
| > | 4 |
+-----+-----+
| <= | 1 |
+-----+-----+
| != | 3 |
+-----+-----+
| >= | 5 |
+-----+-----+
The ``__next__()`` Method
=========================
* Extension types used to expose an iterator interface should define a ``__next__()`` method.
* **Do not** explicitly supply a ``next()`` method, because Python does that for you automatically.
===========
Subclassing
===========
* An extension type may inherit from a built-in type or another extension type::
cdef class Parrot:
...
cdef class Norwegian(Parrot):
...
* A complete definition of the base type must be available to Cython
* If the base type is a built-in type, it must have been previously declared as an ``extern`` extension type.
* ``cimport`` can be used to import the base type, if the extern declared base type is in a ``.pxd`` definition file.
* In Cython, multiple inheritance is not permitted.. singlular inheritance only
* Cython extenstion types can also be sub-classed in Python.
* Here multiple inhertance is permissible as is normal for Python.
* Even multiple extension types may be inherited, but C-layout of all the base classes must be compatible.
====================
Forward Declarations
====================
* Extension types can be "forward-declared".
* This is necessary when two extension types refer to each other::
cdef class Shrubbery # forward declaration
cdef class Shrubber:
cdef Shrubbery work_in_progress
cdef class Shrubbery:
cdef Shrubber creator
* An extension type that has a base-class, requires that both forward-declarations be specified::
cdef class A(B)
...
cdef class A(B):
# attributes and methods
========================
Extension Types and None
========================
* Parameters and C-variables declared as an Extension type, may take the value of ``None``.
* This is analogous to the way a C-pointer can take the value of ``NULL``.
.. note::
#. Exercise caution when using ``None``
#. Read this section carefully.
* There is no problem as long as you are performing Python operations on it.
* This is because full dynamic type checking is applied
* When accessing an extension type's C-attributes, **make sure** it is not ``None``.
* Cython does not check this for reasons of efficency.
* Be very aware of exposing Python functions that take extension types as arguments::
def widen_shrubbery(Shrubbery sh, extra_width): # This is
sh.width = sh.width + extra_width
* Users could **crash** the program by passing ``None`` for the ``sh`` parameter.
* This could be avoided by::
def widen_shrubbery(Shrubbery sh, extra_width):
if sh is None:
raise TypeError
sh.width = sh.width + extra_width
* Cython provides a more convenient way with a ``not None`` clause::
def widen_shrubbery(Shrubbery sh not None, extra_width):
sh.width = sh.width + extra_width
* Now this function automatically checks that ``sh`` is not ``None``, as well as that is the right type.
* ``not None`` can only be used in Python functions (declared with ``def`` **not** ``cdef``).
* For ``cdef`` functions, you will have to provide the check yourself.
* The ``self`` parameter of an extension type is guaranteed to **never** be ``None``.
* When comparing a value ``x`` with ``None``, and ``x`` is a Python object, note the following:
* ``x is None`` and ``x is not None`` are very efficient.
* They translate directly to C-pointer comparisons.
* ``x == None`` and ``x != None`` or ``if x: ...`` (a boolean condition), will invoke Python operations and will therefore be much slower.
================
Weak Referencing
================
* By default, weak references are not supported.
* It can be enabled by declaring a C attribute of the ``object`` type called ``__weakref__()``::
cdef class ExplodingAnimal:
"""This animal will self-destruct when it is
no longer strongly referenced."""
cdef object __weakref__
=========================
External and Public Types
=========================
Public
======
* When an extention type is declared ``public``, Cython will generate a C-header (".h") file.
* The header file will contain the declarations for it's **object-struct** and it's **type-object**.
* External C-code can now access the attributes of the extension type.
External
========
* An ``extern`` extension type allows you to gain access to the internals of:
* Python objects defined in the Python core.
* Non-Cython extension modules
* The following example lets you get at the C-level members of Python's built-in "complex" object::
cdef extern from "complexobject.h":
struct Py_complex:
double real
double imag
ctypedef class __builtin__.complex [object PyComplexObject]:
cdef Py_complex cval
# A function which uses the above type
def spam(complex c):
print "Real:", c.cval.real
print "Imag:", c.cval.imag
.. note:: Some important things in the example:
#. ``ctypedef`` has been used because because Python's header file has the struct decalared with::
ctypedef struct {
...
} PyComplexObject;
#. The module of where this type object can be found is specified along side the name of the extension type. See **Implicit Importing**.
#. When declaring an external extension type...
* Don't declare any methods, because they are Python method class the are not needed.
* Similiar to **structs** and **unions**, extension classes declared inside a ``cdef extern from`` block only need to declare the C members which you will actually need to access in your module.
Name Specification Clause
=========================
.. note:: Only available to **public** and **extern** extension types.
* Example::
[object object_struct_name, type type_object_name ]
* ``object_struct_name`` is the name to assume for the type's C-struct.
* ``type_object_name`` is the name to assume for the type's statically declared type-object.
* The object and type clauses can be written in any order.
* For ``cdef extern from`` declarations, This clause **is required**.
* The object clause is required because Cython must generate code that is compatible with the declarations in the header file.
* Otherwise the object clause is optional.
* For public extension types, both the object and type clauses **are required** for Cython to generate code that is compatible with external C-code.
================================
Type Names vs. Constructor Names
================================
* In a Cython module, the name of an extension type serves two distinct purposes:
#. When used in an expression, it refers to a "module-level" global variable holding the type's constructor (i.e. it's type-object)
#. It can also be used as a C-type name to declare a "type" for variables, arguments, and return values.
* Example::
cdef extern class MyModule.Spam:
...
* The name "Spam" serves both of these roles.
* Only "Spam" can be used as the type-name.
* The constructor can be referred to by other names.
* Upon an explicit import of "MyModule"...
* ``MyModule.Spam()`` could be used as the constructor call.
* ``MyModule.Spam`` could not be used as a type-name
* When an "as" clause is used, the name specified takes over both roles::
cdef extern class MyModule.Spam as Yummy:
...
* ``Yummy`` becomes both type-name and a name for the constructor.
* There other ways of course, to get hold of the constructor, but ``Yummy`` is the only usable type-name.
Reference Guide
===============
.. note::
.. todo::
Most of the **boldface** is to be changed to refs or other markup later.
Contents:
.. toctree::
:maxdepth: 2
compilation
language_basics
extension_types
interfacing_with_other_code
special_mention
limitations
directives
Indices and tables
------------------
.. toctree::
:maxdepth: 2
special_methods_table
* :ref:`genindex`
* :ref:`modindex`
* :ref:`search`
.. highlight:: cython
.. _interfacing_with_other_code:
***************************
Interfacing with Other Code
***************************
==
C
==
===
C++
===
=======
Fortran
=======
=====
Numpy
=====
.. highlight:: cython
.. _language_basics:
***************
Language Basics
***************
=================
Cython File Types
=================
There are three file types in cython:
* Implementation files carry a ``.pyx`` suffix
* Definition files carry a ``.pxd`` suffix
* Include files which carry a ``.pxi`` suffix
Implementation File
===================
What can it contain?
--------------------
* Basically anything Cythonic, but see below.
What can't it contain?
----------------------
* There are some restrictions when it comes to **extension types**, if the extension type is
already defined else where... **more on this later**
Definition File
===============
What can it contain?
--------------------
* Any kind of C type declaration.
* ``extern`` C function or variable decarations.
* Declarations for module implementations.
* The definition parts of **extension types**.
* All declarations of functions, etc., for an **external library**
What can't it contain?
----------------------
* Any non-extern C variable declaration.
* Implementations of C or Python functions.
* Python class definitions
* Python executable statements.
* Any declaration that is defined as **public** to make it accessible to other Cython modules.
* This is not necessary, as it is automatic.
* a **public** declaration is only needed to make it accessible to **external C code**.
What else?
----------
cimport
```````
* Use the **cimport** statement, as you would Python's import statement, to access these files
from other definition or implementation files.
* **cimport** does not need to be called in ``.pyx`` file for for ``.pxd`` file that has the
same name, as they are already in the same namespace.
* For cimport to find the stated definition file, the path to the file must be appended to the
``-I`` option of the **cython compile command**.
compilation order
`````````````````
* When a ``.pyx`` file is to be compiled, cython first checks to see if a corresponding ``.pxd`` file
exits and processes it first.
Include File
============
What can it contain?
--------------------
* Any Cythonic code really, because the entire file is textually embedded at the location
you prescribe.
How do I use it?
----------------
* Include the ``.pxi`` file with an ``include`` statement like: ``include "spamstuff.pxi``
* The ``include`` statement can appear anywhere in your cython file and at any indentation level
* The code in the ``.pxi`` file needs to be rooted at the "zero" indentation level.
* The included code can itself contain other ``include`` statements.
====================
Declaring Data Types
====================
As a dynamic language, Python encourages a programming style of considering classes and objects in terms of their methods and attributes, more than where they fit into the class hierarchy.
This can make Python a very relaxed and comfortable language for rapid development, but with a price - the 'red tape' of managing data types is dumped onto the interpreter. At run time, the interpreter does a lot of work searching namespaces, fetching attributes and parsing argument and keyword tuples. This run-time ‘late binding’ is a major cause of Python’s relative slowness compared to ‘early binding’ languages such as C++.
However with Cython it is possible to gain significant speed-ups through the use of ‘early binding’ programming techniques.
.. note:: Typing is not a necessity
Providing static typing to parameters and variables is convenience to speed up your code, but it is not a necessity. Optimize where and when needed.
The cdef Statement
==================
The ``cdef`` statement is used to make C level declarations for:
:Variables:
::
cdef int i, j, k
cdef float f, g[42], *h
:Structs:
::
cdef struct Grail:
int age
float volume
:Unions:
::
cdef union Food:
char *spam
float *eggs
:Enums:
::
cdef enum CheeseType:
cheddar, edam,
camembert
cdef enum CheeseState:
hard = 1
soft = 2
runny = 3
:Funtions:
::
cdef int eggs(unsigned long l, float f):
...
:Extenstion Types:
::
cdef class Spam:
...
.. note:: Constants
Constants can be defined by using an anonymous enum::
cdef enum:
tons_of_spam = 3
Grouping cdef Declarations
==========================
A series of declarations can grouped into a ``cdef`` block::
cdef:
struct Spam:
int tons
int i
float f
Spam *p
void f(Spam *s):
print s.tons, "Tons of spam"
.. note:: ctypedef statement
The ``ctypedef`` statement is provided for naming types::
ctypedef unsigned long ULong
ctypedef int *IntPtr
Parameters
==========
* Both C and Python **function** types can be declared to have parameters C data types.
* Use normal C declaration syntax::
def spam(int i, char *s):
...
cdef int eggs(unsigned long l, float f):
...
* As these parameters are passed into a Python declared function, they are magically **converted** to the specified C type value.
* This holds true for only numeric and string types
* If no type is specified for a parameter or a return value, it is assumed to be a Python object
* The following takes two Python objects as parameters and returns a Python object::
cdef spamobjs(x, y):
...
.. note:: --
This is different then C language behavior, where it is an int by default.
* Python object types have reference counting performed according to the standard Python C-API rules:
* Borrowed references are taken as parameters
* New references are returned
.. todo::
link or label here the one ref count caveat for numpy.
* The name ``object`` can be used to explicitly declare something as a Python Object.
* For sake of code clarity, it recomened to always use ``object`` explicitly in your code.
* This is also useful for cases where the name being declared would otherwise be taken for a type::
cdef foo(object int):
...
* As a return type::
cdef object foo(object int):
...
.. todo::
Do a see also here ..??
Optional Arguments
------------------
* Are supported for ``cdef`` and ``cpdef`` functions
* There differences though whether you declare them in a ``.pyx`` file or a ``.pxd`` file
* When in a ``.pyx`` file, the signature is the same as it is in Python itself::
cdef class A:
cdef foo(self):
print "A"
cdef class B(A)
cdef foo(self, x=None)
print "B", x
cdef class C(B):
cpdef foo(self, x=True, int k=3)
print "C", x, k
* When in a ``.pxd`` file, the signature is different like this example: ``cdef foo(x=*)``::
cdef class A:
cdef foo(self)
cdef class B(A)
cdef foo(self, x=*)
cdef class C(B):
cpdef foo(self, x=*, int k=*)
* The number of arguments may increase when subclassing, but the arg types and order must be the same.
* There may be a slight performance penalty when the optional arg is overridden with one that does not have default values.
Keyword-only Arguments
=======================
* As in Python 3, ``def`` functions can have keyword-only argurments listed after a ``"*"`` parameter and before a ``"**"`` parameter if any::
def f(a, b, *args, c, d = 42, e, **kwds):
...
* Shown above, the ``c``, ``d`` and ``e`` arguments can not be passed as positional arguments and must be passed as keyword arguments.
* Furthermore, ``c`` and ``e`` are required keyword arguments since they do not have a default value.
* If the parameter name after the ``"*"`` is omitted, the function will not accept any extra positional argumrents::
def g(a, b, *, c, d):
...
* Shown above, the signature takes exactly two positional parameters and has two required keyword parameters
Automatic Type Conversion
=========================
* For basic numeric and string types, in most situations, when a Python object is used in the context of a C value and vice versa.
* The following table summarises the conversion possibilities, assuming ``sizeof(int) == sizeof(long)``:
+----------------------------+--------------------+------------------+
| C types | From Python types | To Python types |
+============================+====================+==================+
| [unsigned] char | int, long | int |
+----------------------------+ | |
| [unsigned] short | | |
+----------------------------+ | |
| int, long | | |
+----------------------------+--------------------+------------------+
| unsigned int | int, long | long |
+----------------------------+ | |
| unsigned long | | |
+----------------------------+ | |
| [unsigned] long long | | |
+----------------------------+--------------------+------------------+
| float, double, long double | int, long, float | float |
+----------------------------+--------------------+------------------+
| char * | str/bytes | str/bytes [#]_ |
+----------------------------+--------------------+------------------+
| struct | | dict |
+----------------------------+--------------------+------------------+
.. note::
**Python String in a C Context**
* A Python string, passed to C context expecting a ``char*``, is only valid as long as the Python string exists.
* A reference to the Python string must be kept around for as long as the C string is needed.
* If this can't be guarenteed, then make a copy of the C string.
* Cython may produce an error message: ``Obtaining char* from a temporary Python value`` and will not resume compiling in situations like this::
cdef char *s
s = pystring1 + pystring2
* The reason is that concatenating to strings in Python produces a temporary variable.
* The variable is decrefed, and the Python string deallocated as soon as the statement has finished,
* Therefore the lvalue **``s``** is left dangling.
* The solution is to assign the result of the concatenation to a Python variable, and then obtain the ``char*`` from that::
cdef char *s
p = pystring1 + pystring2
s = p
.. note::
**It is up to you to be aware of this, and not to depend on Cython's error message, as it is not guarenteed to be generated for every situation.**
Type Casting
=============
* The syntax used in type casting are ``"<"`` and ``">"``
.. note::
The syntax is different from C convention
::
cdef char *p, float *q
p = <char*>q
* If one of the types is a python object for ``<type>x``, Cython will try and do a coersion.
.. note:: Cython will not stop a casting where there is no conversion, but it will emit a warning.
* If the address is what is wanted, cast to a ``void*`` first.
Type Checking
-------------
* A cast like ``<MyExtensionType>x`` will cast x to type ``MyExtensionType`` without type checking at all.
* To have a cast type checked, use the syntax like: ``<MyExtenstionType?>x``.
* In this case, Cython will throw an error if ``"x"`` is not a (subclass) of ``MyExtenstionClass``
* Automatic type checking for extension types can be obtained by whenever ``isinstance()`` is used as the second parameter
Python Objects
==============
==========================
Statements and Expressions
==========================
* For the most part, control structures and expressions follow Python syntax.
* When applied to Python objects, the semantics are the same unless otherwise noted.
* Most Python operators can be applied to C values with the obvious semantics.
* An expression with mixed Python and C values will have **conversions** performed automatically.
* Python operations are automatically checked for errors, with the appropriate action taken.
Differences Between Cython and C
================================
* Most notable are C constructs which have no direct equivalent in Python.
* An integer literal is treated as a C constant
* It will be truncated to whatever size your C compiler thinks appropriate.
* Cast to a Python object like this::
<object>10000000000000000000
* The ``"L"``, ``"LL"`` and the ``"U"`` suffixes have the same meaning as in C
* There is no ``->`` operator in Cython.. instead of ``p->x``, use ``p.x``.
* There is no ``*`` operator in Cython.. instead of ``*p``, use ``p[0]``.
* ``&`` is permissible and has the same semantics as in C.
* ``NULL`` is the null C pointer.
* Do NOT use 0.
* ``NULL`` is a reserved word in Cython
* Syntax for **Type casts** are ``<type>value``.
Scope Rules
===========
* All determination of scoping (local, module, built-in) in Cython is determined statically.
* As with Python, a variable assignment which is not declared explicitly is implicitly declared to be a Python variable residing in the scope where it was assigned.
.. note::
* Module-level scope behaves the same way as a Python local scope if you refer to the variable before assigning to it.
* Tricks, like the following will NOT work in Cython::
try:
x = True
except NameError:
True = 1
* The above example will not work because ``True`` will always be looked up in the module-level scope. Do the following instead::
import __builtin__
try:
True = __builtin__.True
except AttributeError:
True = 1
Built-in Constants
==================
Pre-defined Python built-in constants:
* None
* True
* False
Operator Precedence
===================
* Cython uses Python precedence order, not C
For-loops
==========
* ``range()`` is C optimized when the index value has been declared by ``cdef``::
cdef i
for i in range(n):
...
* The other form available in C is the for-from style
* The target expression must be a variable name.
* The name between the lower and upper bounds must be the same as the target name.
for i from 0 <= i < n:
...
* Or when using a step size::
for i from 0 <= i < n by s:
...
* To reverse the direction, reverse the conditional operation::
for i from 0 >= i > n:
...
* The ``break`` and ``continue`` are permissible.
* Can contain an else clause.
=====================
Functions and Methods
=====================
* There are three types of function declarations in Cython as the sub-sections show below.
* Only "Python" functions can be called outside a Cython module from *Python interpretted code*.
Callable from Python
=====================
* Are decalared with the ``def`` statement
* Are called with Python objects
* Return Python objects
* See **Parameters** for special consideration
Callable from C
================
* Are declared with the ``cdef`` statement.
* Are called with either Python objects or C values.
* Can return either Python objects or C values.
Callable from both Python and C
================================
* Are declared with the ``cpdef`` statement.
* Can be called from anywhere, because it uses a little Cython magic.
* Uses the faster C calling conventions when being called from other Cython code.
Overriding
==========
``cpdef`` functions can override ``cdef`` functions::
cdef class A:
cdef foo(self):
print "A"
cdef class B(A)
cdef foo(self, x=None)
print "B", x
cdef class C(B):
cpdef foo(self, x=True, int k=3)
print "C", x, k
Function Pointers
=================
* Functions declared in a ``struct`` are automatically converted to function pointers.
* see **using exceptions with function pointers**
Python Built-ins
================
The following are provided:
.. todo:: incomplete
+------------------------------+-------------+----------------------------+
| Function and arguments | Return type | Python/C API Equivalent |
+==============================+=============+============================+
| abs(obj) | object | PyNumber_Absolute |
+------------------------------+-------------+----------------------------+
| bool(obj) | object | Py_True, Py_False |
+------------------------------+-------------+----------------------------+
| chr(obj) | object | char |
+------------------------------+-------------+----------------------------+
| delattr(obj, name) | int | PyObject_DelAttr |
+------------------------------+-------------+----------------------------+
| dir(obj) | object | PyObject_Dir |
| getattr(obj, name) (Note 1) | | |
| getattr3(obj, name, default) | | |
+------------------------------+-------------+----------------------------+
| hasattr(obj, name) | int | PyObject_HasAttr |
+------------------------------+-------------+----------------------------+
| hash(obj) | int | PyObject_Hash |
+------------------------------+-------------+----------------------------+
| intern(obj) | object | PyObject_InternFromString |
+------------------------------+-------------+----------------------------+
| isinstance(obj, type) | int | PyObject_IsInstance |
+------------------------------+-------------+----------------------------+
| issubclass(obj, type) | int | PyObject_IsSubclass |
+------------------------------+-------------+----------------------------+
| iter(obj) | object | PyObject_GetIter |
+------------------------------+-------------+----------------------------+
| len(obj) | Py_ssize_t | PyObject_Length |
+------------------------------+-------------+----------------------------+
| pow(x, y, z) (Note 2) | object | PyNumber_Power |
+------------------------------+-------------+----------------------------+
| reload(obj) | object | PyImport_ReloadModule |
+------------------------------+-------------+----------------------------+
| repr(obj) | object | PyObject_Repr |
+------------------------------+-------------+----------------------------+
| setattr(obj, name) | void | PyObject_SetAttr |
+------------------------------+-------------+----------------------------+
============================
Error and Exception Handling
============================
* A plain ``cdef`` declared function, that does not return a Python object...
* Has no way of reporting a Python exception to it's caller.
* Will only print a warning message and the exception is ignored.
* Inorder to propagate exceptions like this to it's caller, you need to declare an exception value for it.
* There are three forms of declaring an exception for a C compiled program.
* First::
cdef int spam() except -1:
...
* In the example above, if an error occurs inside spam, it will immediately return with the value of ``-1``, causing an exception to be propagated to it's caller.
* Functions declared with an exception value, should explicitly prevent a return of that value.
* Second::
cdef int spam() except? -1:
...
* Used when a ``-1`` may possibly be returned and is not to be considered an error.
* The ``"?"`` tells Cython that ``-1`` only indicates a *possible* error.
* Now, each time ``-1`` is returned, Cython generates a call to ``PyErr_Occurrd`` to verify it is an actual error.
* Third::
cdef int spam() except *
* A call to ``PyErr_Occurred`` happens *every* time the function gets called.
.. note:: Returning ``void``
A need to propagate errors when returning ``void`` must use this version.
* Exception values can only be declared for functions returning an..
* integer
* enum
* float
* pointer type
* Must be a constant expression
.. note::
.. note:: Function pointers
* Require the same exception value specification as it's user has declared.
* Use cases here are when used as parameters and when assigned to a variable::
int (*grail)(int, char *) except -1
.. note:: Python Objects
* Declared exception values are **not** need.
* Remember that Cython assumes that a function function without a declared return value, returns a Python object.
* Exceptions on such functions are implicitly propagated by returning ``NULL``
.. note:: C++
* For exceptions from C++ compiled programs, see **Wrapping C++ Classes**
Checking return values for non-Cython functions..
=================================================
* Do not try to raise exceptions by returning the specified value.. Example::
cdef extern FILE *fopen(char *filename, char *mode) except NULL # WRONG!
* The except clause does not work that way.
* It's only purpose is to propagate Python exceptions that have already been raised by either...
* A Cython function
* A C function that calls Python/C API routines.
* To propagate an exception for these circumstances you need to raise it yourself::
cdef FILE *p
p = fopen("spam.txt", "r")
if p == NULL:
raise SpamError("Couldn't open the spam file")
=======================
Conditional Compilation
=======================
* The expressions in the following sub-sections must be valid compile-time expressions.
* They can evaluate to any Python value.
* The *truth* of the result is determined in the usual Python way.
Compile-Time Definitions
=========================
* Defined using the ``DEF`` statement::
DEF FavouriteFood = "spam"
DEF ArraySize = 42
DEF OtherArraySize = 2 * ArraySize + 17
* The right hand side must be a valid compile-time expression made up of either:
* Literal values
* Names defined by other ``DEF`` statements
* They can be combined using any of the Python expression syntax
* Cython provides the following pre-defined names
* Corresponding to the values returned by ``os.uname()``
* UNAME_SYSNAME
* UNAME_NODENAME
* UNAME_RELEASE
* UNAME_VERSION
* UNAME_MACHINE
* A name defined by ``DEF`` can appear anywhere an identifier can appear.
* Cython replaces the name with the literal value before compilation.
* The compile-time expression, in this case, must eveluate to a Python value of ``int``, ``long``, ``float``, or ``str``::
cdef int a1[ArraySize]
cdef int a2[OtherArraySize]
print "I like", FavouriteFood
Conditional Statements
=======================
* Similiar semantics of the C pre-processor
* The following statements can be used to conditinally include or exclude sections of code to compile.
* ``IF``
* ``ELIF``
* ``ELSE``
::
IF UNAME_SYSNAME == "Windows":
include "icky_definitions.pxi"
ELIF UNAME_SYSNAME == "Darwin":
include "nice_definitions.pxi"
ELIF UNAME_SYSNAME == "Linux":
include "penguin_definitions.pxi"
ELSE:
include "other_definitions.pxi"
* ``ELIF`` and ``ELSE`` are optional.
* ``IF`` can appear anywhere that a normal statement or declaration can appear
* It can contain any statements or declarations that would be valid in that context.
* This includes other ``IF`` and ``DEF`` statements
.. [#] The conversion is to/from str for Python 2.x, and bytes for Python 3.x.
.. highlight:: cython
.. _limitations:
***********
Limitations
***********
.. highlight:: cython
.. _special_mention:
***************
Special Mention
***************
.. _special_methods_table:
Special Methods Table
---------------------
This table lists all of the special methods together with their parameter and
return types. In the table below, a parameter name of self is used to indicate
that the parameter has the type that the method belongs to. Other parameters
with no type specified in the table are generic Python objects.
You don't have to declare your method as taking these parameter types. If you
declare different types, conversions will be performed as necessary.
General
^^^^^^^
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| Name | Parameters | Return type | Description |
+=======================+=======================================+=============+=====================================================+
| __cinit__ |self, ... | | Basic initialisation (no direct Python equivalent) |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __init__ |self, ... | | Further initialisation |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __dealloc__ |self | | Basic deallocation (no direct Python equivalent) |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __cmp__ |x, y | int | 3-way comparison |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __richcmp__ |x, y, int op | object | Rich comparison (no direct Python equivalent) |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __str__ |self | object | str(self) |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __repr__ |self | object | repr(self) |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __hash__ |self | int | Hash function |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __call__ |self, ... | object | self(...) |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __iter__ |self | object | Return iterator for sequence |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __getattr__ |self, name | object | Get attribute |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __setattr__ |self, name, val | | Set attribute |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __delattr__ |self, name | | Delete attribute |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
Arithmetic operators
^^^^^^^^^^^^^^^^^^^^
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| Name | Parameters | Return type | Description |
+=======================+=======================================+=============+=====================================================+
| __add__ | x, y | object | binary `+` operator |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __sub__ | x, y | object | binary `-` operator |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __mul__ | x, y | object | `*` operator |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __div__ | x, y | object | `/` operator for old-style division |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __floordiv__ | x, y | object | `//` operator |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __truediv__ | x, y | object | `/` operator for new-style division |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __mod__ | x, y | object | `%` operator |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __divmod__ | x, y | object | combined div and mod |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __pow__ | x, y, z | object | `**` operator or pow(x, y, z) |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __neg__ | self | object | unary `-` operator |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __pos__ | self | object | unary `+` operator |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __abs__ | self | object | absolute value |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __nonzero__ | self | int | convert to boolean |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __invert__ | self | object | `~` operator |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __lshift__ | x, y | object | `<<` operator |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __rshift__ | x, y | object | `>>` operator |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __and__ | x, y | object | `&` operator |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __or__ | x, y | object | `|` operator |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __xor__ | x, y | object | `^` operator |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
Numeric conversions
^^^^^^^^^^^^^^^^^^^
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| Name | Parameters | Return type | Description |
+=======================+=======================================+=============+=====================================================+
| __int__ | self | object | Convert to integer |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __long__ | self | object | Convert to long integer |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __float__ | self | object | Convert to float |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __oct__ | self | object | Convert to octal |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __hex__ | self | object | Convert to hexadecimal |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __index__ (2.5+ only) | self | object | Convert to sequence index |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
In-place arithmetic operators
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| Name | Parameters | Return type | Description |
+=======================+=======================================+=============+=====================================================+
| __iadd__ | self, x | object | `+=` operator |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __isub__ | self, x | object | `-=` operator |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __imul__ | self, x | object | `*=` operator |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __idiv__ | self, x | object | `/=` operator for old-style division |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __ifloordiv__ | self, x | object | `//=` operator |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __itruediv__ | self, x | object | `/=` operator for new-style division |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __imod__ | self, x | object | `%=` operator |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __ipow__ | x, y, z | object | `**=` operator |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __ilshift__ | self, x | object | `<<=` operator |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __irshift__ | self, x | object | `>>=` operator |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __iand__ | self, x | object | `&=` operator |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __ior__ | self, x | object | `|=` operator |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __ixor__ | self, x | object | `^=` operator |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
Sequences and mappings
^^^^^^^^^^^^^^^^^^^^^^
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| Name | Parameters | Return type | Description |
+=======================+=======================================+=============+=====================================================+
| __len__ | self int | | len(self) |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __getitem__ | self, x | object | self[x] |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __setitem__ | self, x, y | | self[x] = y |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __delitem__ | self, x | | del self[x] |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __getslice__ | self, Py_ssize_t i, Py_ssize_t j | object | self[i:j] |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __setslice__ | self, Py_ssize_t i, Py_ssize_t j, x | | self[i:j] = x |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __delslice__ | self, Py_ssize_t i, Py_ssize_t j | | del self[i:j] |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __contains__ | self, x | int | x in self |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
Iterators
^^^^^^^^^
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| Name | Parameters | Return type | Description |
+=======================+=======================================+=============+=====================================================+
| __next__ | self | object | Get next item (called next in Python) |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
Buffer interface
^^^^^^^^^^^^^^^^
.. note::
The buffer interface is intended for use by C code and is not directly
accessible from Python. It is described in the Python/C API Reference Manual
under sections 6.6 and 10.6.
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| Name | Parameters | Return type | Description |
+=======================+=======================================+=============+=====================================================+
| __getreadbuffer__ | self, int i, void `**p` | | |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __getwritebuffer__ | self, int i, void `**p` | | |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __getsegcount__ | self, int `*p` | | |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __getcharbuffer__ | self, int i, char `**p` | | |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
Descriptor objects
^^^^^^^^^^^^^^^^^^
.. note::
Descriptor objects are part of the support mechanism for new-style
Python classes. See the discussion of descriptors in the Python documentation.
See also PEP 252, "Making Types Look More Like Classes", and PEP 253,
"Subtyping Built-In Types".
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| Name | Parameters | Return type | Description |
+=======================+=======================================+=============+=====================================================+
| __get__ | self, instance, class | object | Get value of attribute |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __set__ | self, instance, value | | Set value of attribute |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __delete__ | self, instance | | Delete attribute |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
Appendix: Installing MinGW on Windows
=====================================
1. Download the MinGW installer from
http://www.mingw.org/wiki/HOWTO_Install_the_MinGW_GCC_Compiler_Suite.
(As of this
writing, the download link is a bit difficult to find; it's under
"About" in the menu on the left-hand side). You want the file
entitled "Automated MinGW Installer" (currently version 5.1.4).
2. Run it and install MinGW. Only the basic package is strictly
needed for Cython, although you might want to grab at least the
C++ compiler as well.
3. You need to set up Windows' "PATH" environment variable so that
includes e.g. "c:\\mingw\\bin" (if you installed MinGW to
"c:\\mingw"). The following web-page describes the procedure
in Windows XP (the Vista procedure is similar):
http://support.microsoft.com/kb/310519
4. Finally, tell Python to use MinGW as the default compiler
(otherwise it will try for Visual C). If Python is installed to
"c:\\Python26", create a file named
"c:\\Python26\\Lib\\distutils\\distutils.cfg" containing::
[build]
compiler = mingw32
The [WinInst]_ wiki page contains updated information about this
procedure. Any contributions towards making the Windows install
process smoother is welcomed; it is an unfortunate fact that none of
the regular Cython developers have convenient access to Windows.
.. [WinInst] http://wiki.cython.org/InstallingOnWindows
Caveats
=======
Since Cython mixes C and Python semantics, some things may be a bit
surprising or unintuitive. Work always goes on to make Cython more natural
for Python users, so this list may change in the future.
- ``10**-2 == 0``, instead of ``0.01`` like in Python.
- Given two typed ``int`` variables ``a`` and ``b``, ``a % b`` has the
same sign as the second argument (following Python semantics) rather then
having the same sign as the first (as in C). The C behavior can be
obtained, at some speed gain, by enabling the division directive.
(Versions prior to Cython 0.12. always followed C semantics.)
- Care is needed with unsigned types. ``cdef unsigned n = 10;
print(range(-n, n))`` will print an empty list, since ``-n`` wraps
around to a large positive integer prior to being passed to the
``range`` function.
- Python's ``float`` type actually wraps C ``double`` values, and
Python's ``int`` type wraps C ``long`` values.
Extension types (aka. cdef classes)
===================================
To support object-oriented programming, Cython supports writing normal
Python classes exactly as in Python::
class MathFunction(object):
def __init__(self, name, operator):
self.name = name
self.operator = operator
def __call__(self, *operands):
return self.operator(*operands)
Based on what Python calls a "built-in type", however, Cython supports
a second kind of class: *extension types*, sometimes referred to as
"cdef classes" due to the keywords used for their declaration. They
are somewhat restricted compared to Python classes, but are generally
more memory efficient and faster than generic Python classes. The
main difference is that they use a C struct to store their fields and methods
instead of a Python dict. This allows them to store arbitrary C types
in their fields without requiring a Python wrapper for them, and to
access fields and methods directly at the C level without passing
through a Python dictionary lookup.
Normal Python classes can inherit from cdef classes, but not the other
way around. Cython requires to know the complete inheritance
hierarchy in order to lay out their C structs, and restricts it to
single inheritance. Normal Python classes, on the other hand, can
inherit from any number of Python classes and extension types, both in
Cython code and pure Python code.
So far our integration example has not been very useful as it only
integrates a single hard-coded function. In order to remedy this,
without sacrificing speed, we will use a cdef class to represent a
function on floating point numbers::
cdef class Function:
cpdef double evaluate(self, double x) except *:
return 0
Like before, cpdef makes two versions of the method available; one
fast for use from Cython and one slower for use from Python. Then::
cdef class SinOfSquareFunction(Function):
cpdef double evaluate(self, double x) except *:
return sin(x**2)
Using this, we can now change our integration example::
def integrate(Function f, double a, double b, int N):
cdef int i
cdef double s, dx
if f is None:
raise ValueError("f cannot be None")
s = 0
dx = (b-a)/N
for i in range(N):
s += f.evaluate(a+i*dx)
return s * dx
print(integrate(SinOfSquareFunction(), 0, 1, 10000))
This is almost as fast as the previous code, however it is much more flexible
as the function to integrate can be changed. It is even possible to pass
in a new function defined in Python-space::
>>> import integrate
>>> class MyPolynomial(integrate.Function):
... def evaluate(self, x):
... return 2*x*x + 3*x - 10
...
>>> integrate(MyPolynomial(), 0, 1, 10000)
-7.8335833300000077
This is about 20 times slower, but still about 10 times faster than
the original Python-only integration code. This shows how large the
speed-ups can easily be when whole loops are moved from Python code
into a Cython module.
Some notes on our new implementation of ``evaluate``:
- The fast method dispatch here only works because ``evaluate`` was
declared in ``Function``. Had ``evaluate`` been introduced in
``SinOfSquareFunction``, the code would still work, but Cython
would have used the slower Python method dispatch mechanism
instead.
- In the same way, had the argument ``f`` not been typed, but only
been passed as a Python object, the slower Python dispatch would
be used.
- Since the argument is typed, we need to check whether it is
``None``. In Python, this would have resulted in an ``AttributeError``
when the ``evaluate`` method was looked up, but Cython would instead
try to access the (incompatible) internal structure of ``None`` as if
it were a ``Function``, leading to a crash or data corruption.
There is a *compiler directive* ``nonecheck`` which turns on checks
for this, at the cost of decreased speed. Here's how compiler directives
are used to dynamically switch on or off ``nonecheck``::
#cython: nonecheck=True
# ^^^ Turns on nonecheck globally
import cython
# Turn off nonecheck locally for the function
@cython.nonecheck(False)
def func():
cdef MyClass obj = None
try:
# Turn nonecheck on again for a block
with cython.nonecheck(True):
print obj.myfunc() # Raises exception
except AttributeError:
pass
print obj.myfunc() # Hope for a crash!
Attributes in cdef classes behave differently from attributes in regular classes:
- All attributes must be pre-declared at compile-time
- Attributes are by default only accessible from Cython (typed access)
- Properties can be declared to expose dynamic attributes to Python-space
::
cdef class WaveFunction(Function):
# Not available in Python-space:
cdef double offset
# Available in Python-space:
cdef public double freq
# Available in Python-space:
property period:
def __get__(self):
return 1.0 / self. freq
def __set__(self, value):
self. freq = 1.0 / value
<...>
Using C libraries
=================
Apart from writing fast code, one of the main use cases of Cython is
to call external C libraries from Python code. As Cython code
compiles down to C code itself, it is actually trivial to call C
functions directly in the code. The following gives a complete
example for using (and wrapping) an external C library in Cython code,
including appropriate error handling and considerations about
designing a suitable API for Python and Cython code.
Imagine you need an efficient way to store integer values in a FIFO
queue. Since memory really matters, and the values are actually
coming from C code, you cannot afford to create and store Python
``int`` objects in a list or deque. So you look out for a queue
implementation in C.
After some web search, you find the C-algorithms library [CAlg]_ and
decide to use its double ended queue implementation. To make the
handling easier, however, you decide to wrap it in a Python extension
type that can encapsulate all memory management.
The C API of the queue implementation, which is defined in the header
file ``libcalg/queue.h``, essentially looks like this::
/* file: queue.h */
typedef struct _Queue Queue;
typedef void *QueueValue;
Queue *queue_new(void);
void queue_free(Queue *queue);
int queue_push_head(Queue *queue, QueueValue data);
QueueValue queue_pop_head(Queue *queue);
QueueValue queue_peek_head(Queue *queue);
int queue_push_tail(Queue *queue, QueueValue data);
QueueValue queue_pop_tail(Queue *queue);
QueueValue queue_peek_tail(Queue *queue);
int queue_is_empty(Queue *queue);
To get started, the first step is to redefine the C API in a ``.pxd``
file, say, ``cqueue.pxd``::
# file: cqueue.pxd
cdef extern from "libcalg/queue.h":
ctypedef struct Queue:
pass
ctypedef void* QueueValue
Queue* queue_new()
void queue_free(Queue* queue)
int queue_push_head(Queue* queue, QueueValue data)
QueueValue queue_pop_head(Queue* queue)
QueueValue queue_peek_head(Queue* queue)
int queue_push_tail(Queue* queue, QueueValue data)
QueueValue queue_pop_tail(Queue* queue)
QueueValue queue_peek_tail(Queue* queue)
bint queue_is_empty(Queue* queue)
Note how these declarations are almost identical to the header file
declarations, so you can often just copy them over. However, you do
not need to provide *all* declarations as above, just those that you
use in your code or in other declarations, so that Cython gets to see
a sufficient and consistent subset of them. Then, consider adapting
them somewhat to make them more comfortable to work with in Cython.
One noteworthy difference to the header file that we use above is the
declaration of the ``Queue`` struct in the first line. ``Queue`` is
in this case used as an *opaque handle*; only the library that is
called knows what is really inside. Since no Cython code needs to
know the contents of the struct, we do not need to declare its
contents, so we simply provide an empty definition (as we do not want
to declare the ``_Queue`` type which is referenced in the C header)
[#]_.
.. [#] There's a subtle difference between ``cdef struct Queue: pass``
and ``ctypedef struct Queue: pass``. The former declares a
type which is referenced in C code as ``struct Queue``, while
the latter is referenced in C as ``Queue``. This is a C
language quirk that Cython is not able to hide. Most modern C
libraries use the ``ctypedef`` kind of struct.
Another exception is the last line. The integer return value of the
``queue_is_empty()`` function is actually a C boolean value, i.e. the
only interesting thing about it is whether it is non-zero or zero,
indicating if the queue is empty or not. This is best expressed by
Cython's ``bint`` type, which is a normal ``int`` type when used in C
but maps to Python's boolean values ``True`` and ``False`` when
converted to a Python object. This way of tightening declarations in
a ``.pxd`` file can often simplify the code that uses them.
It is good practice to define one ``.pxd`` file for each library that
you use, and sometimes even for each header file (or functional group)
if the API is large. That simplifies their reuse in other projects.
Sometimes, you may need to use C functions from the standard C
library, or want to call C-API functions from CPython directly. For
common needs like this, Cython ships with a set of standard ``.pxd``
files that provide these declarations in a readily usable way that is
adapted to their use in Cython. The main packages are ``cpython``,
``libc`` and ``libcpp``. The NumPy library also has a standard
``.pxd`` file ``numpy``, as it is often used in Cython code. See
Cython's ``Cython/Includes/`` source package for a complete list of
provided ``.pxd`` files.
After declaring our C library's API, we can start to design the Queue
class that should wrap the C queue. It will live in a file called
``queue.pyx``. [#]_
.. [#] Note that the name of the ``.pyx`` file must be different from
the ``cqueue.pxd`` file with declarations from the C library,
as both do not describe the same code. A ``.pxd`` file next to
a ``.pyx`` file with the same name defines exported
declarations for code in the ``.pyx`` file. As the
``cqueue.pxd`` file contains declarations of a regular C
library, there must not be a ``.pyx`` file with the same name
that Cython associates with it.
Here is a first start for the Queue class::
# file: queue.pyx
cimport cqueue
cdef class Queue:
cdef cqueue.Queue *_c_queue
def __cinit__(self):
self._c_queue = cqueue.queue_new()
Note that it says ``__cinit__`` rather than ``__init__``. While
``__init__`` is available as well, it is not guaranteed to be run (for
instance, one could create a subclass and forget to call the
ancestor's constructor). Because not initializing C pointers often
leads to hard crashes of the Python interpreter, Cython provides
``__cinit__`` which is *always* called immediately on construction,
before CPython even considers calling ``__init__``, and which
therefore is the right place to initialise ``cdef`` fields of the new
instance. However, as ``__cinit__`` is called during object
construction, ``self`` is not fully constructed yet, and one must
avoid doing anything with ``self`` but assigning to ``cdef`` fields.
Note also that the above method takes no parameters, although subtypes
may want to accept some. A no-arguments ``__cinit__()`` method is a
special case here that simply does not receive any parameters that
were passed to a constructor, so it does not prevent subclasses from
adding parameters. If parameters are used in the signature of
``__cinit__()``, they must match those of any declared ``__init__``
method of classes in the class hierarchy that are used to instantiate
the type.
Before we continue implementing the other methods, it is important to
understand that the above implementation is not safe. In case
anything goes wrong in the call to ``queue_new()``, this code will
simply swallow the error, so we will likely run into a crash later on.
According to the documentation of the ``queue_new()`` function, the
only reason why the above can fail is due to insufficient memory. In
that case, it will return ``NULL``, whereas it would normally return a
pointer to the new queue.
The Python way to get out of this is to raise a ``MemoryError`` [#]_.
We can thus change the init function as follows::
cimport cqueue
cdef class Queue:
cdef cqueue.Queue *_c_queue
def __cinit__(self):
self._c_queue = cqueue.queue_new()
if self._c_queue is NULL:
raise MemoryError()
.. [#] In the specific case of a ``MemoryError``, creating a new
exception instance in order to raise it may actually fail because
we are running out of memory. Luckily, CPython provides a C-API
function ``PyErr_NoMemory()`` that safely raises the right
exception for us. Since version 0.14.1, Cython automatically
substitutes this C-API call whenever you write ``raise
MemoryError`` or ``raise MemoryError()``. If you use an older
version, you have to cimport the C-API function from the standard
package ``cpython.exc`` and call it directly.
The next thing to do is to clean up when the Queue instance is no
longer used (i.e. all references to it have been deleted). To this
end, CPython provides a callback that Cython makes available as a
special method ``__dealloc__()``. In our case, all we have to do is
to free the C Queue, but only if we succeeded in initialising it in
the init method::
def __dealloc__(self):
if self._c_queue is not NULL:
cqueue.queue_free(self._c_queue)
At this point, we have a working Cython module that we can test. To
compile it, we need to configure a ``setup.py`` script for distutils.
Here is the most basic script for compiling a Cython module::
from distutils.core import setup
from distutils.extension import Extension
from Cython.Distutils import build_ext
setup(
cmdclass = {'build_ext': build_ext},
ext_modules = [Extension("queue", ["queue.pyx"])]
)
To build against the external C library, we must extend this script to
include the necessary setup. Assuming the library is installed in the
usual places (e.g. under ``/usr/lib`` and ``/usr/include`` on a
Unix-like system), we could simply change the extension setup from
::
ext_modules = [Extension("queue", ["queue.pyx"])]
to
::
ext_modules = [
Extension("queue", ["queue.pyx"],
libraries=["calg"])
]
If it is not installed in a 'normal' location, users can provide the
required parameters externally by passing appropriate C compiler
flags, such as::
CFLAGS="-I/usr/local/otherdir/calg/include" \
LDFLAGS="-L/usr/local/otherdir/calg/lib" \
python setup.py build_ext -i
Once we have compiled the module for the first time, we can now import
it and instantiate a new Queue::
$ export PYTHONPATH=.
$ python -c 'import queue.Queue as Q ; Q()'
However, this is all our Queue class can do so far, so let's make it
more usable.
Before implementing the public interface of this class, it is good
practice to look at what interfaces Python offers, e.g. in its
``list`` or ``collections.deque`` classes. Since we only need a FIFO
queue, it's enough to provide the methods ``append()``, ``peek()`` and
``pop()``, and additionally an ``extend()`` method to add multiple
values at once. Also, since we already know that all values will be
coming from C, it's best to provide only ``cdef`` methods for now, and
to give them a straight C interface.
In C, it is common for data structures to store data as a ``void*`` to
whatever data item type. Since we only want to store ``int`` values,
which usually fit into the size of a pointer type, we can avoid
additional memory allocations through a trick: we cast our ``int`` values
to ``void*`` and vice versa, and store the value directly as the
pointer value.
Here is a simple implementation for the ``append()`` method::
cdef append(self, int value):
cqueue.queue_push_tail(self._c_queue, <void*>value)
Again, the same error handling considerations as for the
``__cinit__()`` method apply, so that we end up with this
implementation instead::
cdef append(self, int value):
if not cqueue.queue_push_tail(self._c_queue,
<void*>value):
raise MemoryError()
Adding an ``extend()`` method should now be straight forward::
cdef extend(self, int* values, size_t count):
"""Append all ints to the queue.
"""
cdef size_t i
for i in range(count):
if not cqueue.queue_push_tail(
self._c_queue, <void*>values[i]):
raise MemoryError()
This becomes handy when reading values from a NumPy array, for
example.
So far, we can only add data to the queue. The next step is to write
the two methods to get the first element: ``peek()`` and ``pop()``,
which provide read-only and destructive read access respectively::
cdef int peek(self):
return <int>cqueue.queue_peek_head(self._c_queue)
cdef int pop(self):
return <int>cqueue.queue_pop_head(self._c_queue)
Simple enough. Now, what happens when the queue is empty? According
to the documentation, the functions return a ``NULL`` pointer, which
is typically not a valid value. Since we are simply casting to and
from ints, we cannot distinguish anymore if the return value was
``NULL`` because the queue was empty or because the value stored in
the queue was ``0``. However, in Cython code, we would expect the
first case to raise an exception, whereas the second case should
simply return ``0``. To deal with this, we need to special case this
value, and check if the queue really is empty or not::
cdef int peek(self) except? -1:
cdef int value = \
<int>cqueue.queue_peek_head(self._c_queue)
if value == 0:
# this may mean that the queue is empty, or
# that it happens to contain a 0 value
if cqueue.queue_is_empty(self._c_queue):
raise IndexError("Queue is empty")
return value
Note how we have effectively created a fast path through the method in
the hopefully common cases that the return value is not ``0``. Only
that specific case needs an additional check if the queue is empty.
The ``except? -1`` declaration in the method signature falls into the
same category. If the function was a Python function returning a
Python object value, CPython would simply return ``NULL`` internally
instead of a Python object to indicate an exception, which would
immediately be propagated by the surrounding code. The problem is
that the return type is ``int`` and any ``int`` value is a valid queue
item value, so there is no way to explicitly signal an error to the
calling code. In fact, without such a declaration, there is no
obvious way for Cython to know what to return on exceptions and for
calling code to even know that this method *may* exit with an
exception.
The only way calling code can deal with this situation is to call
``PyErr_Occurred()`` when returning from a function to check if an
exception was raised, and if so, propagate the exception. This
obviously has a performance penalty. Cython therefore allows you to
declare which value it should implicitly return in the case of an
exception, so that the surrounding code only needs to check for an
exception when receiving this exact value.
We chose to use ``-1`` as the exception return value as we expect it
to be an unlikely value to be put into the queue. The question mark
in the ``except? -1`` declaration indicates that the return value is
ambiguous (there *may* be a ``-1`` value in the queue, after all) and
that an additional exception check using ``PyErr_Occurred()`` is
needed in calling code. Without it, Cython code that calls this
method and receives the exception return value would silently (and
sometimes incorrectly) assume that an exception has been raised. In
any case, all other return values will be passed through almost
without a penalty, thus again creating a fast path for 'normal'
values.
Now that the ``peek()`` method is implemented, the ``pop()`` method
also needs adaptation. Since it removes a value from the queue,
however, it is not enough to test if the queue is empty *after* the
removal. Instead, we must test it on entry::
cdef int pop(self) except? -1:
if cqueue.queue_is_empty(self._c_queue):
raise IndexError("Queue is empty")
return <int>cqueue.queue_pop_head(self._c_queue)
The return value for exception propagation is declared exactly as for
``peek()``.
Lastly, we can provide the Queue with an emptiness indicator in the
normal Python way by implementing the ``__bool__()`` special method
(note that Python 2 calls this method ``__nonzero__``, whereas Cython
code can use either name)::
def __bool__(self):
return not cqueue.queue_is_empty(self._c_queue)
Note that this method returns either ``True`` or ``False`` as we
declared the return type of the ``queue_is_empty`` function as
``bint`` in ``cqueue.pxd``.
Now that the implementation is complete, you may want to write some
tests for it to make sure it works correctly. Especially doctests are
very nice for this purpose, as they provide some documentation at the
same time. To enable doctests, however, you need a Python API that
you can call. C methods are not visible from Python code, and thus
not callable from doctests.
A quick way to provide a Python API for the class is to change the
methods from ``cdef`` to ``cpdef``. This will let Cython generate two
entry points, one that is callable from normal Python code using the
Python call semantics and Python objects as arguments, and one that is
callable from C code with fast C semantics and without requiring
intermediate argument conversion from or to Python types.
The following listing shows the complete implementation that uses
``cpdef`` methods where possible::
cimport cqueue
cdef class Queue:
"""A queue class for C integer values.
>>> q = Queue()
>>> q.append(5)
>>> q.peek()
5
>>> q.pop()
5
"""
cdef cqueue.Queue *_c_queue
def __cinit__(self):
self._c_queue = cqueue.queue_new()
if self._c_queue is NULL:
raise MemoryError()
def __dealloc__(self):
if self._c_queue is not NULL:
cqueue.queue_free(self._c_queue)
cpdef append(self, int value):
if not cqueue.queue_push_tail(self._c_queue,
<void*>value):
raise MemoryError()
cdef extend(self, int* values, size_t count):
cdef size_t i
for i in xrange(count):
if not cqueue.queue_push_tail(
self._c_queue, <void*>values[i]):
raise MemoryError()
cpdef int peek(self) except? -1:
cdef int value = \
<int>cqueue.queue_peek_head(self._c_queue)
if value == 0:
# this may mean that the queue is empty,
# or that it happens to contain a 0 value
if cqueue.queue_is_empty(self._c_queue):
raise IndexError("Queue is empty")
return value
cdef int pop(self) except? -1:
if cqueue.queue_is_empty(self._c_queue):
raise IndexError("Queue is empty")
return <int>cqueue.queue_pop_head(self._c_queue)
def __bool__(self):
return not cqueue.queue_is_empty(self._c_queue)
The ``cpdef`` feature is obviously not available for the ``extend()``
method, as the method signature is incompatible with Python argument
types. However, if wanted, we can rename the C-ish ``extend()``
method to e.g. ``c_extend()``, and write a new ``extend()`` method
instead that accepts an arbitrary Python iterable::
cdef c_extend(self, int* values, size_t count):
cdef size_t i
for i in range(count):
if not cqueue.queue_push_tail(
self._c_queue, <void*>values[i]):
raise MemoryError()
cpdef extend(self, values):
for value in values:
self.append(value)
As a quick test with 10000 numbers on the author's machine indicates,
using this Queue from Cython code with C ``int`` values is about five
times as fast as using it from Cython code with Python object values,
almost eight times faster than using it from Python code in a Python
loop, and still more than twice as fast as using Python's highly
optimised ``collections.deque`` type from Cython code with Python
integers.
.. [CAlg] Simon Howard, C Algorithms library, http://c-algorithms.sourceforge.net/
{
'title': 'Cython Tutorial',
'paper_abstract': '''
Cython is a programming language based on Python with extra
syntax to provide static type declarations. This takes advantage of the
benefits of Python while allowing one to achieve the speed of C.
In this paper we describe the Cython language and show how it can
be used both to write optimized code and to interface with external
C libraries.
''',
'authors': [
{'first_names': 'Stefan',
'surname': 'Behnel',
'address': '',
'country': 'Germany',
'email_address': 'stefan\_ml@behnel.de',
'institution': ''},
{'first_names': 'Robert W.',
'surname': 'Bradshaw',
'address': '',
'country': 'USA',
'email_address': 'robertwb@math.washington.edu',
'institution': '''University of Washington\\footnote{
Department of Mathematics, University of Washington, Seattle, WA, USA
}'''},
{'first_names': 'Dag Sverre',
'surname': 'Seljebotn',
'address': '',
'country': 'Norway',
'email_address': 'dagss@student.matnat.uio.no',
# I need three institutions w/ full address... leave it
# all here until we get to editing stage
'institution': '''University of Oslo\\footnote{Institute of Theoretical Astrophysics,
University of Oslo, P.O. Box 1029 Blindern, N-0315 Oslo, Norway}\\footnote{Department
of Mathematics, University of Oslo, P.O. Box 1053 Blindern,
N-0316 Oslo, Norway}\\footnote{Centre of Mathematics for
Applications, University of Oslo, P.O. Box 1053 Blindern, N-0316
Oslo, Norway}'''}
],
}
Calling C functions
====================
This tutorial describes shortly what you need to know in order to call
C library functions from Cython code. For a longer and more
comprehensive tutorial about using external C libraries, wrapping them
and handling errors, see :doc:`clibraries`.
For simplicity, let's start with a function from the standard C
library. This does not add any dependencies to your code, and it has
the additional advantage that Cython already defines many such
functions for you. So you can just cimport and use them.
For example, let's say you need a low-level way to parse a number from
a ``char*`` value. You could use the ``atoi()`` function, as defined
by the ``stdlib.h`` header file. This can be done as follows::
from libc.stdlib cimport atoi
cdef parse_charptr_to_py_int(char* s):
assert s is not NULL, "byte string value is NULL"
return atoi(s) # note: atoi() has no error detection!
You can find a complete list of these standard cimport files in
Cython's source package ``Cython/Includes/``. It also has a complete
set of declarations for CPython's C-API. For example, to test at C
compilation time which CPython version your code is being compiled
with, you can do this::
from cpython.version cimport PY_VERSION_HEX
print PY_VERSION_HEX >= 0x030200F0 # Python version >= 3.2 final
Cython also provides declarations for the C math library::
from libc.math cimport sin
cdef double f(double x):
return sin(x*x)
However, this is a library that is not linked by default on some Unix-like
systems, such as Linux. In addition to cimporting the
declarations, you must configure your build system to link against the
shared library ``m``. For distutils, it is enough to add it to the
``libraries`` parameter of the ``Extension()`` setup::
from distutils.core import setup
from distutils.extension import Extension
from Cython.Distutils import build_ext
ext_modules=[
Extension("demo",
["demo.pyx"],
libraries=["m"]) # Unix-like specific
]
setup(
name = "Demos",
cmdclass = {"build_ext": build_ext},
ext_modules = ext_modules
)
If you want to access C code for which Cython does not provide a ready
to use declaration, you must declare them yourself. For example, the
above ``sin()`` function is defined as follows::
cdef extern from "math.h":
double sin(double)
This declares the ``sin()`` function in a way that makes it available
to Cython code and instructs Cython to generate C code that includes
the ``math.h`` header file. The C compiler will see the original
declaration in ``math.h`` at compile time, but Cython does not parse
"math.h" and requires a separate definition.
Just like the ``sin()`` function from the math library, it is possible
to declare and call into any C library as long as the module that
Cython generates is properly linked against the shared or static
library.
Tutorials
=========
.. toctree::
:maxdepth: 2
external
clibraries
cdef_classes
pxd_files
caveats
profiling_tutorial
numpy
strings
pure
readings
related_work
appendix
Using Cython with NumPy
=======================
Cython has support for fast access to NumPy arrays. To optimize code
using such arrays one must ``cimport`` the NumPy pxd file (which ships
with Cython), and declare any arrays as having the ``ndarray``
type. The data type and number of dimensions should be fixed at
compile-time and passed. For instance::
import numpy as np
cimport numpy as np
def myfunc(np.ndarray[np.float64_t, ndim=2] A):
<...>
``myfunc`` can now only be passed two-dimensional arrays containing
double precision floats, but array indexing operation is much, much faster,
making it suitable for numerical loops. Expect speed increases well
over 100 times over a pure Python loop; in some cases the speed
increase can be as high as 700 times or more. [Seljebotn09]_
contains detailed examples and benchmarks.
Fast array declarations can currently only be used with function
local variables and arguments to ``def``-style functions (not with
arguments to ``cpdef`` or ``cdef``, and neither with fields in cdef
classes or as global variables). These limitations are considered
known defects and we hope to remove them eventually. In most
circumstances it is possible to work around these limitations rather
easily and without a significant speed penalty, as all NumPy arrays
can also be passed as untyped objects.
Array indexing is only optimized if exactly as many indices are
provided as the number of array dimensions. Furthermore, all indices
must have a native integer type. Slices and NumPy "fancy indexing" is
not optimized. Examples::
def myfunc(np.ndarray[np.float64_t, ndim=1] A):
cdef Py_ssize_t i, j
for i in range(A.shape[0]):
print A[i, 0] # fast
j = 2*i
print A[i, j] # fast
k = 2*i
print A[i, k] # slow, k is not typed
print A[i][j] # slow
print A[i,:] # slow
``Py_ssize_t`` is a signed integer type provided by Python which
covers the same range of values as is supported as NumPy array
indices. It is the preferred type to use for loops over arrays.
Any Cython primitive type (float, complex float and integer types) can
be passed as the array data type. For each valid dtype in the ``numpy``
module (such as ``np.uint8``, ``np.complex128``) there is a
corresponding Cython compile-time definition in the cimport-ed NumPy
pxd file with a ``_t`` suffix [#]_. Cython structs are also allowed
and corresponds to NumPy record arrays. Examples::
cdef packed struct Point:
np.float64_t x, y
def f():
cdef np.ndarray[np.complex128_t, ndim=3] a = \
np.zeros((3,3,3), dtype=np.complex128)
cdef np.ndarray[Point] b = np.zeros(10,
dtype=np.dtype([('x', np.float64),
('y', np.float64)]))
<...>
Note that ``ndim`` defaults to 1. Also note that NumPy record arrays
are by default unaligned, meaning data is packed as tightly as
possible without considering the alignment preferences of the
CPU. Such unaligned record arrays corresponds to a Cython ``packed``
struct. If one uses an aligned dtype, by passing ``align=True`` to the
``dtype`` constructor, one must drop the ``packed`` keyword on the
struct definition.
Some data types are not yet supported, like boolean arrays and string
arrays. Also data types describing data which is not in the native
endian will likely never be supported. It is however possible to
access such arrays on a lower level by casting the arrays::
cdef np.ndarray[np.uint8, cast=True] boolarr = (x < y)
cdef np.ndarray[np.uint32, cast=True] values = \
np.arange(10, dtype='>i4')
Assuming one is on a little-endian system, the ``values`` array
can still access the raw bit content of the array (which must then
be reinterpreted to yield valid results on a little-endian system).
Finally, note that typed NumPy array variables in some respects behave
a little differently from untyped arrays. ``arr.shape`` is no longer a
tuple. ``arr.shape[0]`` is valid but to e.g. print the shape one must
do ``print (<object>arr).shape`` in order to "untype" the variable
first. The same is true for ``arr.data`` (which in typed mode is a C
data pointer).
There are many more options for optimizations to consider for Cython
and NumPy arrays. We again refer the interested reader to [Seljebotn09]_.
.. [#] In Cython 0.11.2, ``np.complex64_t`` and ``np.complex128_t``
does not work and one must write ``complex`` or
``double complex`` instead. This is fixed in 0.11.3. Cython
0.11.1 and earlier does not support complex numbers.
.. [Seljebotn09] D. S. Seljebotn, Fast numerical computations with Cython,
Proceedings of the 8th Python in Science Conference, 2009.
.. highlight:: cython
.. _profiling:
*********
Profiling
*********
This part describes the profiling abilities of Cython. If you are familiar
with profiling pure Python code, you can only read the first section
(:ref:`profiling_basics`). If you are not familiar with python profiling you
should also read the tutorial (:ref:`profiling_tutorial`) which takes you
through a complete example step by step.
.. _profiling_basics:
Cython Profiling Basics
=======================
Profiling in Cython is controlled by a compiler directive.
It can either be set either for an entire file or on a per function
via a Cython decorator.
Enable profiling for a complete source file
-------------------------------------------
Profiling is enable for a complete source file via a global directive to the
Cython compiler at the top of a file::
# cython: profile=True
Note that profiling gives a slight overhead to each function call therefore making
your program a little slower (or a lot, if you call some small functions very
often).
Once enabled, your Cython code will behave just like Python code when called
from the cProfile module. This means you can just profile your Cython code
together with your Python code using the same tools as for Python code alone.
Disabling profiling function wise
------------------------------------------
If your profiling is messed up because of the call overhead to some small
functions that you rather do not want to see in your profile - either because
you plan to inline them anyway or because you are sure that you can't make them
any faster - you can use a special decorator to disable profiling for one
function only::
cimport cython
@cython.profile(False)
def my_often_called_function():
pass
.. _profiling_tutorial:
Profiling Tutorial
==================
This will be a complete tutorial, start to finish, of profiling python code,
turning it into Cython code and keep profiling until it is fast enough.
As a toy example, we would like to evaluate the summation of the reciprocals of
squares up to a certain integer :math:`n` for evaluating :math:`\pi`. The
relation we want to use has been proven by Euler in 1735 and is known as the
`Basel problem <http://en.wikipedia.org/wiki/Basel_problem>`_.
.. math::
\pi^2 = 6 \sum_{k=1}^{\infty} \frac{1}{k^2} =
6 \lim_{k \to \infty} \big( \frac{1}{1^2} +
\frac{1}{2^2} + \dots + \frac{1}{k^2} \big) \approx
6 \big( \frac{1}{1^2} + \frac{1}{2^2} + \dots + \frac{1}{n^2} \big)
A simple python code for evaluating the truncated sum looks like this::
#!/usr/bin/env python
# encoding: utf-8
# filename: calc_pi.py
def recip_square(i):
return 1./i**2
def approx_pi(n=10000000):
val = 0.
for k in range(1,n+1):
val += recip_square(k)
return (6 * val)**.5
On my box, this needs approximately 4 seconds to run the function with the
default n. The higher we choose n, the better will be the approximation for
:math:`\pi`. An experienced python programmer will already see plenty of
places to optimize this code. But remember the golden rule of optimization:
Never optimize without having profiled. Let me repeat this: **Never** optimize
without having profiled your code. Your thoughts about which part of your
code takes too much time are wrong. At least, mine are always wrong. So let's
write a short script to profile our code::
#!/usr/bin/env python
# encoding: utf-8
# filename: profile.py
import pstats, cProfile
import calc_pi
cProfile.runctx("calc_pi.approx_pi()", globals(), locals(), "Profile.prof")
s = pstats.Stats("Profile.prof")
s.strip_dirs().sort_stats("time").print_stats()
Running this on my box gives the following output::
TODO: how to display this not as code but verbatimly?
Sat Nov 7 17:40:54 2009 Profile.prof
10000004 function calls in 6.211 CPU seconds
Ordered by: internal time
ncalls tottime percall cumtime percall filename:lineno(function)
1 3.243 3.243 6.211 6.211 calc_pi.py:7(approx_pi)
10000000 2.526 0.000 2.526 0.000 calc_pi.py:4(recip_square)
1 0.442 0.442 0.442 0.442 {range}
1 0.000 0.000 6.211 6.211 <string>:1(<module>)
1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}
This contains the information that the code runs in 6.2 CPU seconds. Note that
the code got slower by 2 seconds because it ran inside the cProfile module. The
table contains the real valuable information. You might want to check the
python `profiling documentation <http://docs.python.org/library/profile.html>`_
for the nitty gritty details. The most important columns here are totime (total
time spent in this function **not** counting functions that were called by this
function) and cumtime (total time spent in this function **also** counting the
functions called by this function). Looking at the tottime column, we see that
approximately half the time is spent in approx_pi and the other half is spent
in recip_square. Also half a second is spent in range ... of course we should
have used xrange for such a big iteration. And in fact, just changing range to
xrange makes the code run in 5.8 seconds.
We could optimize a lot in the pure python version, but since we are interested
in Cython, let's move forward and bring this module to Cython. We would do this
anyway at some time to get the loop run faster. Here is our first Cython version::
# encoding: utf-8
# cython: profile=True
# filename: calc_pi.pyx
def recip_square(int i):
return 1./i**2
def approx_pi(int n=10000000):
cdef double val = 0.
cdef int k
for k in xrange(1,n+1):
val += recip_square(k)
return (6 * val)**.5
Note the second line: We have to tell Cython that profiling should be enabled.
This makes the Cython code slightly slower, but without this we would not get
meaningful output from the cProfile module. The rest of the code is mostly
unchanged, I only typed some variables which will likely speed things up a bit.
We also need to modify our profiling script to import the Cython module directly.
Here is the complete version adding the import of the pyximport module::
#!/usr/bin/env python
# encoding: utf-8
# filename: profile.py
import pstats, cProfile
import pyximport
pyximport.install()
import calc_pi
cProfile.runctx("calc_pi.approx_pi()", globals(), locals(), "Profile.prof")
s = pstats.Stats("Profile.prof")
s.strip_dirs().sort_stats("time").print_stats()
We only added two lines, the rest stays completely the same. Alternatively, we could also
manually compile our code into an extension; we wouldn't need to change the
profile script then at all. The script now outputs the following::
Sat Nov 7 18:02:33 2009 Profile.prof
10000004 function calls in 4.406 CPU seconds
Ordered by: internal time
ncalls tottime percall cumtime percall filename:lineno(function)
1 3.305 3.305 4.406 4.406 calc_pi.pyx:7(approx_pi)
10000000 1.101 0.000 1.101 0.000 calc_pi.pyx:4(recip_square)
1 0.000 0.000 4.406 4.406 {calc_pi.approx_pi}
1 0.000 0.000 4.406 4.406 <string>:1(<module>)
1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}
We gained 1.8 seconds. Not too shabby. Comparing the output to the previous, we
see that recip_square function got faster while the approx_pi function has not
changed a lot. Let's concentrate on the recip_square function a bit more. First
note, that this function is not to be called from code outside of our module;
so it would be wise to turn it into a cdef to reduce call overhead. We should
also get rid of the power operator: it is turned into a pow(i,2) function call by
Cython, but we could instead just write i*i which could be faster. The
whole function is also a good candidate for inlining. Let's look at the
necessary changes for these ideas::
# encoding: utf-8
# cython: profile=True
# filename: calc_pi.pyx
cdef inline double recip_square(int i):
return 1./(i*i)
def approx_pi(int n=10000000):
cdef double val = 0.
cdef int k
for k in xrange(1,n+1):
val += recip_square(k)
return (6 * val)**.5
Now running the profile script yields::
Sat Nov 7 18:10:11 2009 Profile.prof
10000004 function calls in 2.622 CPU seconds
Ordered by: internal time
ncalls tottime percall cumtime percall filename:lineno(function)
1 1.782 1.782 2.622 2.622 calc_pi.pyx:7(approx_pi)
10000000 0.840 0.000 0.840 0.000 calc_pi.pyx:4(recip_square)
1 0.000 0.000 2.622 2.622 {calc_pi.approx_pi}
1 0.000 0.000 2.622 2.622 <string>:1(<module>)
1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}
That bought us another 1.8 seconds. Not the dramatic change we could have
expected. And why is recip_square still in this table; it is supposed to be
inlined, isn't it? The reason for this is that Cython still generates profiling code
even if the function call is eliminated. Let's tell it to not
profile recip_square any more; we couldn't get the function to be much faster anyway::
# encoding: utf-8
# cython: profile=True
# filename: calc_pi.pyx
cimport cython
@cython.profile(False)
cdef inline double recip_square(int i):
return 1./(i*i)
def approx_pi(int n=10000000):
cdef double val = 0.
cdef int k
for k in xrange(1,n+1):
val += recip_square(k)
return (6 * val)**.5
Running this shows an interesting result::
Sat Nov 7 18:15:02 2009 Profile.prof
4 function calls in 0.089 CPU seconds
Ordered by: internal time
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.089 0.089 0.089 0.089 calc_pi.pyx:10(approx_pi)
1 0.000 0.000 0.089 0.089 {calc_pi.approx_pi}
1 0.000 0.000 0.089 0.089 <string>:1(<module>)
1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}
First note the tremendous speed gain: this version only takes 1/50 of the time
of our first Cython version. Also note that recip_square has vanished from the
table like we wanted. But the most peculiar and import change is that
approx_pi also got much faster. This is a problem with all profiling: calling a
function in a profile run adds a certain overhead to the function call. This
overhead is **not** added to the time spent in the called function, but to the
time spent in the **calling** function. In this example, approx_pi didn't need 2.622
seconds in the last run; but it called recip_square 10000000 times, each time taking a
little to set up profiling for it. This adds up to the massive time loss of
around 2.6 seconds. Having disabled profiling for the often called function now
reveals realistic timings for approx_pi; we could continue optimizing it now if
needed.
This concludes this profiling tutorial. There is still some room for
improvement in this code. We could try to replace the power operator in
approx_pi with a call to sqrt from the C stdlib; but this is not necessarily
faster than calling pow(x,0.5).
Even so, the result we achieved here is quite satisfactory: we came up with a
solution that is much faster then our original python version while retaining
functionality and readability.
Pure Python Mode
================
Cython provides language constructs to let the same file be either interpreted
or compiled. This is accomplished by the same "magic" module ``cython`` that
directives use and which must be imported. This is available for both :file:`.py` and
:file:`.pyx` files.
This is accomplished via special functions and decorators and an (optional)
augmenting :file:`.pxd` file.
Magic Attributes
----------------
The currently supported attributes of the ``cython`` module are:
* ``declare`` declares a typed variable in the current scope, which can be used in
place of the :samp:`cdef type var [= value]` construct. This has two forms, the
first as an assignment (useful as it creates a declaration in
interpreted mode as well)::
x = cython.declare(cython.int) # cdef int x
y = cython.declare(cython.double, 0.57721) # cdef double y = 0.57721
and the second mode as a simple function call::
cython.declare(x=cython.int, y=cython.double) # cdef int x; cdef double y
* ``locals`` is a decorator that is used to specify the types of local variables
in the function body (including any or all of the argument types)::
@cython.locals(a=cython.double, b=cython.double, n=cython.p_double)
def foo(a, b, x, y):
...
* ``address`` is used in place of the ``&`` operator::
cython.declare(x=cython.int, x_ptr=cython.p_int)
x_ptr = cython.address(x)
* ``sizeof`` emulates the `sizeof` operator. It can take both types and
expressions.::
cython.declare(n=cython.longlong)
print cython.sizeof(cython.longlong), cython.sizeof(n)
* ``struct`` can be used to create struct types.::
MyStruct = cython.struct(x=cython.int, y=cython.int, data=cython.double)
a = cython.declare(MyStruct)
is equivalent to the code::
cdef struct MyStruct:
int x
int y
double data
cdef MyStruct a
* ``union`` creates union types with exactly the same syntax as ``struct``
* ``typedef`` creates a new type::
T = cython.typedef(cython.p_int) # ctypedef int* T
* ``compiled`` is a special variable which is set to ``True`` when the compiler
runs, and ``False`` in the interpreter. Thus the code::
if cython.compiled:
print "Yep, I'm compiled."
else:
print "Just a lowly interpreted script."
will behave differently depending on whether or not the code is loaded as a
compiled :file:`.so` file or a plain :file:`.py` file.
Augmenting .pxd
---------------
If a :file:`.pxd` file is found with the same name as a :file:`.py` file, it will be
searched for :keyword:`cdef` classes and :keyword:`cdef`/:keyword:`cpdef`
functions and methods. It will then convert the corresponding
classes/functions/methods in the :file:`.py` file to be of the correct type. Thus if
one had :file:`a.pxd`::
cdef class A:
cpdef foo(self, int i)
the file :file:`a.py`::
class A:
def foo(self, i):
print "Big" if i > 1000 else "Small"
would be interpreted as::
cdef class A:
cpdef foo(self, int i):
print "Big" if i > 1000 else "Small"
The special cython module can also be imported and used within the augmenting
:file:`.pxd` file. This makes it possible to add types to a pure python file without
changing the file itself. For example, the following python file
:file:`dostuff.py`::
def dostuff(n):
t = 0
for i in range(n):
t += i
return t
could be augmented with the following :file:`.pxd` file :file:`dostuff.pxd`::
import cython
@cython.locals(t = cython.int, i = cython.int)
cpdef int dostuff(int n)
Besides the ``cython.locals`` decorator, the :func:`cython.declare` function can also be
used to add types to global variables in the augmenting :file:`.pxd` file.
Note that normal Python (:keyword:`def`) functions cannot be declared in
:file:`.pxd` files, so it is currently impossible to override the types of
Python functions in :file:`.pxd` files if they use ``*args`` or ``**kwargs`` in their
signature, for instance.
Types
-----
There are numerous types built in to the cython module. One has all the
standard C types, namely ``char``, ``short``, ``int``, ``long``, ``longlong``
as well as their unsigned versions ``uchar``, ``ushort``, ``uint``, ``ulong``,
``ulonglong``. One also has ``bint`` and ``Py_ssize_t``. For each type, one
has pointer types ``p_int``, ``pp_int``, . . ., up to three levels deep in
interpreted mode, and infinitely deep in compiled mode. The Python types int,
long and bool are interpreted as C ``int``, ``long`` and ``bint``
respectively. Also, the python types ``list``, ``dict``, ``tuple``, . . . may
be used, as well as any user defined types.
Pointer types may be constructed with ``cython.pointer(cython.int)``, and
arrays as ``cython.int[10]``. A limited attempt is made to emulate these more
complex types, but only so much can be done from the Python language.
Decorators (not yet implemented)
--------------------------------
We have settled on ``@cython.cclass`` for the ``cdef class``
decorators, and ``@cython.cfunc`` and ``@cython.ccall`` for :keyword:`cdef` and
:keyword:`cpdef` functions (respectively).
http://codespeak.net/pipermail/cython-dev/2008-November/002925.html
pxd files
=========
In addition to the ``.pyx`` source files, Cython uses ``.pxd`` files
which work like C header files -- they contain Cython declarations
(and sometimes code sections) which are only meant for inclusion by
Cython modules. A ``pxd`` file is imported into a ``pyx`` module by
using the ``cimport`` keyword.
``pxd`` files have many use-cases:
1. They can be used for sharing external C declarations.
2. They can contain functions which are well suited for inlining by
the C compiler. Such functions should be marked ``inline``, example:
::
cdef inline int int_min(int a, int b):
return b if b < a else a
3. When accompanying an equally named ``pyx`` file, they
provide a Cython interface to the Cython module so that other
Cython modules can communicate with it using a more efficient
protocol than the Python one.
In our integration example, we might break it up into ``pxd`` files like this:
1. Add a ``cmath.pxd`` function which defines the C functions available from
the C ``math.h`` header file, like ``sin``. Then one would simply do
``from cmath cimport sin`` in ``integrate.pyx``.
2. Add a ``integrate.pxd`` so that other modules written in Cython
can define fast custom functions to integrate.
::
cdef class Function:
cpdef evaluate(self, double x)
cpdef integrate(Function f, double a,
double b, int N)
Note that if you have a cdef class with attributes, the attributes must
be declared in the class declaration ``pxd`` file (if you use one), not
the ``pyx`` file. The compiler will tell you about this.
cdef extern from "libcalg/queue.h":
ctypedef struct Queue:
pass
ctypedef void* QueueValue
Queue* queue_new()
void queue_free(Queue* queue)
int queue_push_head(Queue* queue, QueueValue data)
QueueValue queue_pop_head(Queue* queue)
QueueValue queue_peek_head(Queue* queue)
int queue_push_tail(Queue* queue, QueueValue data)
QueueValue queue_pop_tail(Queue* queue)
QueueValue queue_peek_tail(Queue* queue)
int queue_is_empty(Queue* queue)
cimport cqueue
cimport python_exc
cdef class Queue:
cdef cqueue.Queue* _c_queue
def __cinit__(self):
self._c_queue = cqueue.queue_new()
if self._c_queue is NULL:
python_exc.PyErr_NoMemory()
def __dealloc__(self):
if self._c_queue is not NULL:
cqueue.queue_free(self._c_queue)
cpdef int append(self, int value) except -1:
if not cqueue.queue_push_tail(self._c_queue, <void*>value):
python_exc.PyErr_NoMemory()
return 0
cdef int extend(self, int* values, Py_ssize_t count) except -1:
cdef Py_ssize_t i
for i in xrange(count):
if not cqueue.queue_push_tail(self._c_queue, <void*>values[i]):
python_exc.PyErr_NoMemory()
return 0
cpdef int peek(self) except? 0:
cdef int value = <int>cqueue.queue_peek_head(self._c_queue)
if value == 0:
# this may mean that the queue is empty, or that it
# happens to contain a 0 value
if cqueue.queue_is_empty(self._c_queue):
raise IndexError("Queue is empty")
return value
cpdef int pop(self) except? 0:
cdef int value = <int>cqueue.queue_pop_head(self._c_queue)
if value == 0:
# this may mean that the queue is empty, or that it
# happens to contain a 0 value
if cqueue.queue_is_empty(self._c_queue):
raise IndexError("Queue is empty")
return value
def __nonzero__(self):
return not cqueue.queue_is_empty(self._c_queue)
DEF repeat_count=10000
def test_cy():
cdef int i
cdef Queue q = Queue()
for i in xrange(repeat_count):
q.append(i)
for i in xrange(repeat_count):
q.peek()
while q:
q.pop()
def test_py():
cdef int i
q = Queue()
for i in xrange(repeat_count):
q.append(i)
for i in xrange(repeat_count):
q.peek()
while q:
q.pop()
from collections import deque
def test_deque():
cdef int i
q = deque()
for i in xrange(repeat_count):
q.appendleft(i)
for i in xrange(repeat_count):
q[-1]
while q:
q.pop()
repeat = range(repeat_count)
def test_py_exec():
q = Queue()
d = dict(q=q, repeat=repeat)
exec u"""\
for i in repeat:
q.append(9)
for i in repeat:
q.peek()
while q:
q.pop()
""" in d
Further reading
===============
The main documentation is located at http://docs.cython.org/. Some
recent features might not have documentation written yet, in such
cases some notes can usually be found in the form of a Cython
Enhancement Proposal (CEP) on http://wiki.cython.org/enhancements.
[Seljebotn09]_ contains more information about Cython and NumPy
arrays. If you intend to use Cython code in a multi-threaded setting,
it is essential to read up on Cython's features for managing the
Global Interpreter Lock (the GIL). The same paper contains an
explanation of the GIL, and the main documentation explains the Cython
features for managing it.
Finally, don't hesitate to ask questions (or post reports on
successes!) on the Cython users mailing list [UserList]_. The Cython
developer mailing list, [DevList]_, is also open to everybody, but
focusses on core development issues. Feel free to use it to report a
clear bug, to ask for guidance if you have time to spare to develop
Cython, or if you have suggestions for future development.
.. [DevList] Cython developer mailing list: http://mail.python.org/mailman/listinfo/cython-devel
.. [Seljebotn09] D. S. Seljebotn, Fast numerical computations with Cython,
Proceedings of the 8th Python in Science Conference, 2009.
.. [UserList] Cython users mailing list: http://groups.google.com/group/cython-users
Related work
============
Pyrex [Pyrex]_ is the compiler project that Cython was originally
based on. Many features and the major design decisions of the Cython
language were developed by Greg Ewing as part of that project. Today,
Cython supersedes the capabilities of Pyrex by providing a
substantially higher compatibility with Python code and Python
semantics, as well as superior optimisations and better integration
with scientific Python extensions like NumPy.
ctypes [ctypes]_ is a foreign function interface (FFI) for Python. It
provides C compatible data types, and allows calling functions in DLLs
or shared libraries. It can be used to wrap these libraries in pure
Python code. Compared to Cython, it has the major advantage of being
in the standard library and being usable directly from Python code,
without any additional dependencies. The major drawback is its
performance, which suffers from the Python call overhead as all
operations must pass through Python code first. Cython, being a
compiled language, can avoid much of this overhead by moving more
functionality and long-running loops into fast C code.
SWIG [SWIG]_ is a wrapper code generator. It makes it very easy to
parse large API definitions in C/C++ header files, and to generate
straight forward wrapper code for a large set of programming
languages. As opposed to Cython, however, it is not a programming
language itself. Thin wrappers are easy to generate, but the more
functionality a wrapper needs to provide, the harder it gets to
implement it with SWIG. Cython, on the other hand, makes it very easy
to write very elaborate wrapper code specifically for the Python
language, and to make it as thin or thick as needed at any given
place. Also, there exists third party code for parsing C header files
and using it to generate Cython definitions and module skeletons.
ShedSkin [ShedSkin]_ is an experimental Python-to-C++ compiler. It
uses a very powerful whole-module type inference engine to generate a
C++ program from (restricted) Python source code. The main drawback
is that it has no support for calling the Python/C API for operations
it does not support natively, and supports very few of the standard
Python modules.
.. [ctypes] http://docs.python.org/library/ctypes.html.
.. there's also the original ctypes home page: http://python.net/crew/theller/ctypes/
.. [Pyrex] G. Ewing, Pyrex: C-Extensions for Python,
http://www.cosc.canterbury.ac.nz/greg.ewing/python/Pyrex/
.. [ShedSkin] M. Dufour, J. Coughlan, ShedSkin,
http://code.google.com/p/shedskin/
.. [SWIG] David M. Beazley et al.,
SWIG: An Easy to Use Tool for Integrating Scripting Languages with C and C++,
http://www.swig.org.
.. highlight:: cython
Unicode and passing strings
===========================
Similar to the string semantics in Python 3, Cython also strictly
separates byte strings and unicode strings. Above all, this means
that there is no automatic conversion between byte strings and unicode
strings (except for what Python 2 does in string operations). All
encoding and decoding must pass through an explicit encoding/decoding
step.
It is, however, very easy to pass byte strings between C code and Python.
When receiving a byte string from a C library, you can let Cython
convert it into a Python byte string by simply assigning it to a
Python variable::
cdef char* c_string = c_call_returning_a_c_string()
cdef bytes py_string = c_string
This creates a Python byte string object that holds a copy of the
original C string. It can be safely passed around in Python code, and
will be garbage collected when the last reference to it goes out of
scope. It is important to remember that null bytes in the string act
as terminator character, as generally known from C. The above will
therefore only work correctly for C strings that do not contain null
bytes.
Note that the creation of the Python bytes string can fail with an
exception, e.g. due to insufficient memory. If you need to ``free()``
the string after the conversion, you should wrap the assignment in a
try-finally construct::
cimport stdlib
cdef bytes py_string
cdef char* c_string = c_call_returning_a_c_string()
try:
py_string = c_string
finally:
stdlib.free(c_string)
To convert the byte string back into a C ``char*``, use the opposite
assignment::
cdef char* other_c_string = py_string
This is a very fast operation after which ``other_c_string`` points to
the byte string buffer of the Python string itself. It is tied to the
life time of the Python string. When the Python string is garbage
collected, the pointer becomes invalid. It is therefore important to
keep a reference to the Python string as long as the ``char*`` is in
use. Often enough, this only spans the call to a C function that
receives the pointer as parameter. Special care must be taken,
however, when the C function stores the pointer for later use. Apart
from keeping a Python reference to the string, no manual memory
management is required.
Decoding bytes to text
----------------------
The initially presented way of passing and receiving C strings is
sufficient if your code only deals with binary data in the strings.
When we deal with encoded text, however, it is best practice to decode
the C byte strings to Python Unicode strings on reception, and to
encode Python Unicode strings to C byte strings on the way out.
With a Python byte string object, you would normally just call the
``.decode()`` method to decode it into a Unicode string::
ustring = byte_string.decode('UTF-8')
Cython allows you to do the same for a C string, as long as it
contains no null bytes::
cdef char* some_c_string = c_call_returning_a_c_string()
ustring = some_c_string.decode('UTF-8')
However, this will not work for strings that contain null bytes, and
it is very inefficient for long strings, since Cython has to call
``strlen()`` on the C string first to find out the length by counting
the bytes up to the terminating null byte. In many cases, the user
code will know the length already, e.g. because a C function returned
it. In this case, it is much more efficient to tell Cython the exact
number of bytes by slicing the C string::
cdef char* c_string = NULL
cdef Py_ssize_t length = 0
# get pointer and length from a C function
get_a_c_string(&c_string, &length)
ustring = c_string[:length].decode('UTF-8')
The same can be used when the string contains null bytes, e.g. when it
uses an encoding like UCS-4, where each character is encoded in four
bytes.
It is common practice to wrap string conversions (and non-trivial type
conversions in general) in dedicated functions, as this needs to be
done in exactly the same way whenever receiving text from C. This
could look as follows::
cimport python_unicode
cimport stdlib
cdef unicode tounicode(char* s):
return s.decode('UTF-8', 'strict')
cdef unicode tounicode_with_length(
char* s, size_t length):
return s[:length].decode('UTF-8', 'strict')
cdef unicode tounicode_with_length_and_free(
char* s, size_t length):
try:
return s[:length].decode('UTF-8', 'strict')
finally:
stdlib.free(s)
Most likely, you will prefer shorter function names in your code based
on the kind of string being handled. Different types of content often
imply different ways of handling them on reception. To make the code
more readable and to anticipate future changes, it is good practice to
use separate conversion functions for different types of strings.
Encoding text to bytes
----------------------
The reverse way, converting a Python unicode string to a C ``char*``,
is pretty efficient by itself, assuming that what you actually want is
a memory managed byte string::
py_byte_string = py_unicode_string.encode('UTF-8')
cdef char* c_string = py_byte_string
As noted before, this takes the pointer to the byte buffer of the
Python byte string. Trying to do the same without keeping a reference
to the Python byte string will fail with a compile error::
# this will not compile !
cdef char* c_string = py_unicode_string.encode('UTF-8')
Here, the Cython compiler notices that the code takes a pointer to a
temporary string result that will be garbage collected after the
assignment. Later access to the invalidated pointer will read invalid
memory and likely result in a segfault. Cython will therefore refuse
to compile this code.
Source code encoding
--------------------
When string literals appear in the code, the source code encoding is
important. It determines the byte sequence that Cython will store in
the C code for bytes literals, and the Unicode code points that Cython
builds for unicode literals when parsing the byte encoded source file.
Following `PEP 263`_, Cython supports the explicit declaration of
source file encodings. For example, putting the following comment at
the top of an ``ISO-8859-15`` (Latin-9) encoded source file (into the
first or second line) is required to enable ``ISO-8859-15`` decoding
in the parser::
# -*- coding: ISO-8859-15 -*-
When no explicit encoding declaration is provided, the source code is
parsed as UTF-8 encoded text, as specified by `PEP 3120`_. `UTF-8`_
is a very common encoding that can represent the entire Unicode set of
characters and is compatible with plain ASCII encoded text that it
encodes efficiently. This makes it a very good choice for source code
files which usually consist mostly of ASCII characters.
.. _`PEP 263`: http://www.python.org/dev/peps/pep-0263/
.. _`PEP 3120`: http://www.python.org/dev/peps/pep-3120/
.. _`UTF-8`: http://en.wikipedia.org/wiki/UTF-8
As an example, putting the following line into a UTF-8 encoded source
file will print ``5``, as UTF-8 encodes the letter ``'ö'`` in the two
byte sequence ``'\xc3\xb6'``::
print( len(b'abcö') )
whereas the following ``ISO-8859-15`` encoded source file will print
``4``, as the encoding uses only 1 byte for this letter::
# -*- coding: ISO-8859-15 -*-
print( len(b'abcö') )
Note that the unicode literal ``u'abcö'`` is a correctly decoded four
character Unicode string in both cases, whereas the unprefixed Python
``str`` literal ``'abcö'`` will become a byte string in Python 2 (thus
having length 4 or 5 in the examples above), and a 4 character Unicode
string in Python 3. If you are not familiar with encodings, this may
not appear obvious at first read. See `CEP 108`_ for details.
As a rule of thumb, it is best to avoid unprefixed non-ASCII ``str``
literals and to use unicode string literals for all text. Cython also
supports the ``__future__`` import ``unicode_literals`` that instructs
the parser to read all unprefixed ``str`` literals in a source file as
unicode string literals, just like Python 3.
.. _`CEP 108`: http://wiki.cython.org/enhancements/stringliterals
Single bytes and characters
---------------------------
The Python C-API uses the normal C ``char`` type to represent a byte
value, but it has two special integer types for a Unicode code point
value, i.e. a single Unicode character: ``Py_UNICODE`` and
``Py_UCS4``. Since version 0.13, Cython supports the first natively,
support for ``Py_UCS4`` is new in Cython 0.15. ``Py_UNICODE`` is
either defined as an unsigned 2-byte or 4-byte integer, or as
``wchar_t``, depending on the platform. The exact type is a compile
time option in the build of the CPython interpreter and extension
modules inherit this definition at C compile time. The advantage of
``Py_UCS4`` is that it is guaranteed to be large enough for any
Unicode code point value, regardless of the platform. It is defined
as a 32bit unsigned int or long.
In Cython, the ``char`` type behaves differently from the
``Py_UNICODE`` and ``Py_UCS4`` types when coercing to Python objects.
Similar to the behaviour of the bytes type in Python 3, the ``char``
type coerces to a Python integer value by default, so that the
following prints 65 and not ``A``::
# -*- coding: ASCII -*-
cdef char char_val = 'A'
assert char_val == 65 # ASCII encoded byte value of 'A'
print( char_val )
If you want a Python bytes string instead, you have to request it
explicitly, and the following will print ``A`` (or ``b'A'`` in Python
3)::
print( <bytes>char_val )
The explicit coercion works for any C integer type. Values outside of
the range of a ``char`` or ``unsigned char`` will raise an
``OverflowError`` at runtime. Coercion will also happen automatically
when assigning to a typed variable, e.g.::
cdef bytes py_byte_string
py_byte_string = char_val
On the other hand, the ``Py_UNICODE`` and ``Py_UCS4`` types are rarely
used outside of the context of a Python unicode string, so their
default behaviour is to coerce to a Python unicode object. The
following will therefore print the character ``A``, as would the same
code with the ``Py_UNICODE`` type::
cdef Py_UCS4 uchar_val = u'A'
assert uchar_val == 65 # character point value of u'A'
print( uchar_val )
Again, explicit casting will allow users to override this behaviour.
The following will print 65::
cdef Py_UCS4 uchar_val = u'A'
print( <long>uchar_val )
Note that casting to a C ``long`` (or ``unsigned long``) will work
just fine, as the maximum code point value that a Unicode character
can have is 1114111 (``0x10FFFF``). On platforms with 32bit or more,
``int`` is just as good.
Narrow Unicode builds
----------------------
In narrow Unicode builds of CPython, i.e. builds where
``sys.maxunicode`` is 65535 (such as all Windows builds, as opposed to
1114111 in wide builds), it is still possible to use Unicode character
code points that do not fit into the 16 bit wide ``Py_UNICODE`` type.
For example, such a CPython build will accept the unicode literal
``u'\U00012345'``. However, the underlying system level encoding
leaks into Python space in this case, so that the length of this
literal becomes 2 instead of 1. This also shows when iterating over
it or when indexing into it. The visible substrings are ``u'\uD808'``
and ``u'\uDF45'`` in this example. They form a so-called surrogate
pair that represents the above character.
For more information on this topic, it is worth reading the `Wikipedia
article about the UTF-16 encoding`_.
.. _`Wikipedia article about the UTF-16 encoding`: http://en.wikipedia.org/wiki/UTF-16/UCS-2
The same properties apply to Cython code that gets compiled for a
narrow CPython runtime environment. In most cases, e.g. when
searching for a substring, this difference can be ignored as both the
text and the substring will contain the surrogates. So most Unicode
processing code will work correctly also on narrow builds. Encoding,
decoding and printing will work as expected, so that the above literal
turns into exactly the same byte sequence on both narrow and wide
Unicode platforms.
However, programmers should be aware that a single ``Py_UNICODE``
value (or single 'character' unicode string in CPython) may not be
enough to represent a complete Unicode character on narrow platforms.
For example, if an independent search for ``u'\uD808'`` and
``u'\uDF45'`` in a unicode string succeeds, this does not necessarily
mean that the character ``u'\U00012345`` is part of that string. It
may well be that two different characters are in the string that just
happen to share a code unit with the surrogate pair of the character
in question. Looking for substrings works correctly because the two
code units in the surrogate pair use distinct value ranges, so the
pair is always identifiable in a sequence of code points.
As of version 0.15, Cython has extended support for surrogate pairs so
that you can safely use an ``in`` test to search character values from
the full ``Py_UCS4`` range even on narrow platforms::
cdef Py_UCS4 uchar = 0x12345
print( uchar in some_unicode_string )
Similarly, it can coerce a one character string with a high Unicode
code point value to a Py_UCS4 value on both narrow and wide Unicode
platforms::
cdef Py_UCS4 uchar = u'\U00012345'
assert uchar == 0x12345
Iteration
---------
Cython 0.13 supports efficient iteration over ``char*``, bytes and
unicode strings, as long as the loop variable is appropriately typed.
So the following will generate the expected C code::
cdef char* c_string = ...
cdef char c
for c in c_string[:100]:
if c == 'A': ...
The same applies to bytes objects::
cdef bytes bytes_string = ...
cdef char c
for c in bytes_string:
if c == 'A': ...
For unicode objects, Cython will automatically infer the type of the
loop variable as ``Py_UCS4``::
cdef unicode ustring = ...
# NOTE: no typing required for 'uchar' !
for uchar in ustring:
if uchar == u'A': ...
The automatic type inference usually leads to much more efficient code
here. However, note that some unicode operations still require the
value to be a Python object, so Cython may end up generating redundant
conversion code for the loop variable value inside of the loop. If
this leads to a performance degradation for a specific piece of code,
you can either type the loop variable as a Python object explicitly,
or assign its value to a Python typed variable somewhere inside of the
loop to enforce one-time coercion before running Python operations on
it.
There are also optimisations for ``in`` tests, so that the following
code will run in plain C code, (actually using a switch statement)::
cdef Py_UCS4 uchar_val = get_a_unicode_character()
if uchar_val in u'abcABCxY':
...
Combined with the looping optimisation above, this can result in very
efficient character switching code, e.g. in unicode parsers.
.. highlight:: cython
.. _debugging:
**********************************
Debugging your Cython program
**********************************
Cython comes with an extension for the GNU Debugger that helps users debug
Cython code. To use this functionality, you will need to install gdb 7.2 or
higher, built with Python support (linked to Python 2.5 or higher).
The debugger supports debuggees with versions 2.6 and higher. For Python 3,
code should be built with Python 3 and the debugger should be run with
Python 2 (or at least it should be able to find the Python 2 Cython
installation).
The debugger will need debug information that the Cython compiler can export.
This can be achieved from within the setup
script by passing ``pyrex_gdb=True`` to your Cython Extenion class::
from Cython.Distutils import extension
ext = extension.Extension('source', 'source.pyx', pyrex_gdb=True)
setup(..., ext_modules=[ext)]
With this approach debug information can be enabled on a per-module basis.
Another (easier) way is to simply pass the ``--pyrex-gdb`` flag as a command
line argument::
python setup.py build_ext --pyrex-gdb
For development it's often easy to use the ``--inplace`` flag also, which makes
distutils build your project "in place", i.e., not in a separate `build`
directory.
When invoking Cython from the command line directly you can have it write
debug information using the ``--gdb`` flag::
cython --gdb myfile.pyx
.. note:: The debugger is newly part of Cython 0.14 and as such is still
experimental. CC markflorisson88@gmail.com in your TRAC tickets or
mailing list complaints.
Running the Debugger
=====================
.. highlight:: bash
To run the Cython debugger and have it import the debug information exported
by Cython, run ``cygdb`` in the build directory::
$ python setup.py build_ext --pyrex-gdb --inplace
$ cygdb
GNU gdb (GDB) 7.2
...
(gdb)
When using the Cython debugger, it's preferable that you build and run your code
with an interpreter that is compiled with debugging symbols (i.e. configured
with ``--with-pydebug`` or compiled with the ``-g`` CFLAG). If your Python is
installed and managed by your package manager you probably need to install debug
support separately, e.g. for ubuntu::
$ sudo apt-get install python-dbg
$ python-dbg setup.py build_ext --pyrex-gdb --inplace
Then you need to run your script with ``python-dbg`` also.
You can also pass additional arguments to gdb::
$ cygdb /path/to/build/directory/ GDBARGS
i.e.::
$ cygdb . --args python-dbg mainscript.py
To tell cygdb not to import any debug information, supply ``--`` as the first
argument::
$ cygdb --
Using the Debugger
===================
The Cython debugger comes with a set of commands that support breakpoints,
stack inspection, source code listing, stepping, stepping over, etc. Most
of these commands are analogous to their respective gdb command.
.. function:: cy break breakpoints...
Break in a Python, Cython or C function. First it will look for a Cython
function with that name, if cygdb doesn't know about a function (or method)
with that name, it will set a (pending) C breakpoint. The ``-p`` option can
be used to specify a Python breakpoint.
Breakpoints can be set for either the function or method name, or they can
be fully "qualified", which means that the entire "path" to a function is
given::
(gdb) cy break cython_function_or_method
(gdb) cy break packagename.modulename.cythonfunction
(gdb) cy break packagename.modulename.ClassName.cythonmethod
(gdb) cy break c_function
You can also break on Cython line numbers::
(gdb) cy break packagename.modulename:14
(gdb) cy break :14
Python breakpoints currently support names of the module (not the entire
package path) and the function or method::
(gdb) cy break -p pythonmodule.python_function_or_method
(gdb) cy break -p python_function_or_method
.. note:: Python breakpoints only work in Python builds where the Python frame
information can be read from the debugger. To ensure this, use a
Python debug build or a non-stripped build compiled with debug
support.
.. function:: cy step
Step through Python, Cython or C code. Python, Cython and C functions
called directly from Cython code are considered relevant and will be
stepped into.
.. function:: cy next
Step over Python, Cython or C code.
.. function:: cy run
Run the program. The default interpreter is the interpreter that was used
to build your extensions with, or the interpreter ``cygdb`` is run with
in case the "don't import debug information" option was in effect.
The interpreter can be overridden using gdb's ``file`` command.
.. function:: cy cont
Continue the program.
.. function:: cy up
cy down
Go up and down the stack to what is considered a relevant frame.
.. function:: cy finish
Execute until an upward relevant frame is met or something halts
execution.
.. function:: cy bt
cy backtrace
Print a traceback of all frames considered relevant. The ``-a`` option
makes it print the full traceback (all C frames).
.. function:: cy select
Select a stack frame by number as listed by ``cy backtrace``. This
command is introduced because ``cy backtrace`` prints a reversed stack
trace, so frame numbers differ from gdb's ``bt``.
.. function:: cy print varname
Print a local or global Cython, Python or C variable (depending on the
context). Variables may also be dereferenced::
(gdb) cy print x
x = 1
(gdb) cy print *x
*x = (PyObject) {
_ob_next = 0x93efd8,
_ob_prev = 0x93ef88,
ob_refcnt = 65,
ob_type = 0x83a3e0
}
.. function:: cy set cython_variable = value
Set a Cython variable on the Cython stack to value.
.. function:: cy list
List the source code surrounding the current line.
.. function:: cy locals
cy globals
Print all the local and global variables and their values.
.. function:: cy import FILE...
Import debug information from files given as arguments. The easiest way to
import debug information is to use the cygdb command line tool.
.. function:: cy exec code
Execute code in the current Python or Cython frame. This works like
Python's interactive interpreter.
For Python frames it uses the globals and locals from the Python frame,
for Cython frames it uses the dict of globals used on the Cython module
and a new dict filled with the local Cython variables.
.. note:: ``cy exec`` modifies state and executes code in the debuggee and is
therefore potentially dangerous.
Example::
(gdb) cy exec x + 1
2
(gdb) cy exec import sys; print sys.version_info
(2, 6, 5, 'final', 0)
(gdb) cy exec
>global foo
>
>foo = 'something'
>end
Convenience functions
=====================
The following functions are gdb functions, which means they can be used in a
gdb expression.
.. function:: cy_cname(varname)
Returns the C variable name of a Cython variable. For global
variables this may not be actually valid.
.. function:: cy_cvalue(varname)
Returns the value of a Cython variable.
.. function:: cy_eval(expression)
Evaluates Python code in the nearest Python or Cython frame and returns
the result of the expression as a gdb value. This gives a new reference
if successful, NULL on error.
.. function:: cy_lineno()
Returns the current line number in the selected Cython frame.
Example::
(gdb) print $cy_cname("x")
$1 = "__pyx_v_x"
(gdb) watch $cy_cvalue("x")
Hardware watchpoint 13: $cy_cvalue("x")
(gdb) cy set my_cython_variable = $cy_eval("{'spam': 'ham'}")
(gdb) print $cy_lineno()
$2 = 12
Configuring the Debugger
========================
A few aspects of the debugger are configurable with gdb parameters. For
instance, colors can be disabled, the terminal background color
and breakpoint autocompletion can be configured.
.. c:macro:: cy_complete_unqualified
Tells the Cython debugger whether ``cy break`` should also complete
plain function names, i.e. not prefixed by their module name.
E.g. if you have a function named ``spam``,
in module ``M``, it tells whether to only complete ``M.spam`` or also just
``spam``.
The default is true.
.. c:macro:: cy_colorize_code
Tells the debugger whether to colorize source code. The default is true.
.. c:macro:: cy_terminal_background_color
Tells the debugger about the terminal background color, which affects
source code coloring. The default is "dark", another valid option is
"light".
This is how these parameters can be used::
(gdb) set cy_complete_unqualified off
(gdb) set cy_terminal_background_color light
(gdb) show cy_colorize_code
.. highlight:: cython
.. _early-binding-for-speed:
**************************
Early Binding for Speed
**************************
As a dynamic language, Python encourages a programming style of considering
classes and objects in terms of their methods and attributes, more than where
they fit into the class hierarchy.
This can make Python a very relaxed and comfortable language for rapid
development, but with a price - the 'red tape' of managing data types is
dumped onto the interpreter. At run time, the interpreter does a lot of work
searching namespaces, fetching attributes and parsing argument and keyword
tuples. This run-time 'late binding' is a major cause of Python's relative
slowness compared to 'early binding' languages such as C++.
However with Cython it is possible to gain significant speed-ups through the
use of 'early binding' programming techniques.
For example, consider the following (silly) code example:
.. sourcecode:: cython
cdef class Rectangle:
cdef int x0, y0
cdef int x1, y1
def __init__(self, int x0, int y0, int x1, int y1):
self.x0 = x0; self.y0 = y0; self.x1 = x1; self.y1 = y1
def area(self):
area = (self.x1 - self.x0) * (self.y1 - self.y0)
if area < 0:
area = -area
return area
def rectArea(x0, y0, x1, y1):
rect = Rectangle(x0, y0, x1, y1)
return rect.area()
In the :func:`rectArea` method, the call to :meth:`rect.area` and the
:meth:`.area` method contain a lot of Python overhead.
However, in Cython, it is possible to eliminate a lot of this overhead in cases
where calls occur within Cython code. For example:
.. sourcecode:: cython
cdef class Rectangle:
cdef int x0, y0
cdef int x1, y1
def __init__(self, int x0, int y0, int x1, int y1):
self.x0 = x0; self.y0 = y0; self.x1 = x1; self.y1 = y1
cdef int _area(self):
cdef int area
area = (self.x1 - self.x0) * (self.y1 - self.y0)
if area < 0:
area = -area
return area
def area(self):
return self._area()
def rectArea(x0, y0, x1, y1):
cdef Rectangle rect
rect = Rectangle(x0, y0, x1, y1)
return rect._area()
Here, in the Rectangle extension class, we have defined two different area
calculation methods, the efficient :meth:`_area` C method, and the
Python-callable :meth:`area` method which serves as a thin wrapper around
:meth:`_area`. Note also in the function :func:`rectArea` how we 'early bind'
by declaring the local variable ``rect`` which is explicitly given the type
Rectangle. By using this declaration, instead of just dynamically assigning to
``rect``, we gain the ability to access the much more efficient C-callable
:meth:`_rect` method.
But Cython offers us more simplicity again, by allowing us to declare
dual-access methods - methods that can be efficiently called at C level, but
can also be accessed from pure Python code at the cost of the Python access
overheads. Consider this code:
.. sourcecode:: cython
cdef class Rectangle:
cdef int x0, y0
cdef int x1, y1
def __init__(self, int x0, int y0, int x1, int y1):
self.x0 = x0; self.y0 = y0; self.x1 = x1; self.y1 = y1
cpdef int area(self):
cdef int area
area = (self.x1 - self.x0) * (self.y1 - self.y0)
if area < 0:
area = -area
return area
def rectArea(x0, y0, x1, y1):
cdef Rectangle rect
rect = Rectangle(x0, y0, x1, y1)
return rect.area()
.. note::
in earlier versions of Cython, the :keyword:`cpdef` keyword is
:keyword:`rdef` - but has the same effect).
Here, we just have a single area method, declared as :keyword:`cpdef` to make it
efficiently callable as a C function, but still accessible from pure Python
(or late-binding Cython) code.
If within Cython code, we have a variable already 'early-bound' (ie, declared
explicitly as type Rectangle, (or cast to type Rectangle), then invoking its
area method will use the efficient C code path and skip the Python overhead.
But if in Pyrex or regular Python code we have a regular object variable
storing a Rectangle object, then invoking the area method will require:
* an attribute lookup for the area method
* packing a tuple for arguments and a dict for keywords (both empty in this case)
* using the Python API to call the method
and within the area method itself:
* parsing the tuple and keywords
* executing the calculation code
* converting the result to a python object and returning it
So within Cython, it is possible to achieve massive optimisations by
using strong typing in declaration and casting of variables. For tight loops
which use method calls, and where these methods are pure C, the difference can
be huge.
.. highlight:: cython
.. _extension-types:
******************
Extension Types
******************
Introduction
==============
As well as creating normal user-defined classes with the Python class
statement, Cython also lets you create new built-in Python types, known as
extension types. You define an extension type using the :keyword:`cdef` class
statement. Here's an example::
cdef class Shrubbery:
cdef int width, height
def __init__(self, w, h):
self.width = w
self.height = h
def describe(self):
print "This shrubbery is", self.width, \
"by", self.height, "cubits."
As you can see, a Cython extension type definition looks a lot like a Python
class definition. Within it, you use the def statement to define methods that
can be called from Python code. You can even define many of the special
methods such as :meth:`__init__` as you would in Python.
The main difference is that you can use the :keyword:`cdef` statement to define
attributes. The attributes may be Python objects (either generic or of a
particular extension type), or they may be of any C data type. So you can use
extension types to wrap arbitrary C data structures and provide a Python-like
interface to them.
Attributes
============
Attributes of an extension type are stored directly in the object's C struct.
The set of attributes is fixed at compile time; you can't add attributes to an
extension type instance at run time simply by assigning to them, as you could
with a Python class instance. (You can subclass the extension type in Python
and add attributes to instances of the subclass, however.)
There are two ways that attributes of an extension type can be accessed: by
Python attribute lookup, or by direct access to the C struct from Cython code.
Python code is only able to access attributes of an extension type by the
first method, but Cython code can use either method.
By default, extension type attributes are only accessible by direct access,
not Python access, which means that they are not accessible from Python code.
To make them accessible from Python code, you need to declare them as
:keyword:`public` or :keyword:`readonly`. For example,::
cdef class Shrubbery:
cdef public int width, height
cdef readonly float depth
makes the width and height attributes readable and writable from Python code,
and the depth attribute readable but not writable.
.. note::
You can only expose simple C types, such as ints, floats, and
strings, for Python access. You can also expose Python-valued attributes.
.. note::
Also the :keyword:`public` and :keyword:`readonly` options apply only to
Python access, not direct access. All the attributes of an extension type
are always readable and writable by C-level access.
Type declarations
===================
Before you can directly access the attributes of an extension type, the Cython
compiler must know that you have an instance of that type, and not just a
generic Python object. It knows this already in the case of the ``self``
parameter of the methods of that type, but in other cases you will have to use
a type declaration.
For example, in the following function,::
cdef widen_shrubbery(sh, extra_width): # BAD
sh.width = sh.width + extra_width
because the ``sh`` parameter hasn't been given a type, the width attribute
will be accessed by a Python attribute lookup. If the attribute has been
declared :keyword:`public` or :keyword:`readonly` then this will work, but it
will be very inefficient. If the attribute is private, it will not work at all
-- the code will compile, but an attribute error will be raised at run time.
The solution is to declare ``sh`` as being of type :class:`Shrubbery`, as
follows::
cdef widen_shrubbery(Shrubbery sh, extra_width):
sh.width = sh.width + extra_width
Now the Cython compiler knows that ``sh`` has a C attribute called
:attr:`width` and will generate code to access it directly and efficiently.
The same consideration applies to local variables, for example,::
cdef Shrubbery another_shrubbery(Shrubbery sh1):
cdef Shrubbery sh2
sh2 = Shrubbery()
sh2.width = sh1.width
sh2.height = sh1.height
return sh2
Type Testing and Casting
------------------------
Suppose I have a method :meth:`quest` which returns an object of type :class:`Shrubbery`.
To access it's width I could write::
cdef Shrubbery sh = quest()
print sh.width
which requires the use of a local variable and performs a type test on assignment.
If you *know* the return value of :meth:`quest` will be of type :class:`Shrubbery`
you can use a cast to write::
print (<Shrubbery>quest()).width
This may be dangerous if :meth:`quest()` is not actually a :class:`Shrubbery`, as it
will try to access width as a C struct member which may not exist. At the C level,
rather than raising an :class:`AttributeError`, either an nonsensical result will be
returned (interpreting whatever data is at at that address as an int) or a segfault
may result from trying to access invalid memory. Instead, one can write::
print (<Shrubbery?>quest()).width
which performs a type check (possibly raising a :class:`TypeError`) before making the
cast and allowing the code to proceed.
To explicitly test the type of an object, use the :meth:`isinstance` method. By default,
in Python, the :meth:`isinstance` method checks the :class:`__class__` attribute of the
first argument to determine if it is of the required type. However, this is potentially
unsafe as the :class:`__class__` attribute can be spoofed or changed, but the C structure
of an extension type must be correct to access its :keyword:`cdef` attributes and call its :keyword:`cdef` methods. Cython detects if the second argument is a known extension
type and does a type check instead, analogous to Pyrex's :meth:`typecheck`.
The old behavior is always available by passing a tuple as the second parameter::
print isinstance(sh, Shrubbery) # Check the type of sh
print isinstance(sh, (Shrubbery,)) # Check sh.__class__
Extension types and None
=========================
When you declare a parameter or C variable as being of an extension type,
Cython will allow it to take on the value ``None`` as well as values of its
declared type. This is analogous to the way a C pointer can take on the value
``NULL``, and you need to exercise the same caution because of it. There is no
problem as long as you are performing Python operations on it, because full
dynamic type checking will be applied. However, when you access C attributes
of an extension type (as in the widen_shrubbery function above), it's up to
you to make sure the reference you're using is not ``None`` -- in the
interests of efficiency, Cython does not check this.
You need to be particularly careful when exposing Python functions which take
extension types as arguments. If we wanted to make :func:`widen_shrubbery` a
Python function, for example, if we simply wrote::
def widen_shrubbery(Shrubbery sh, extra_width): # This is
sh.width = sh.width + extra_width # dangerous!
then users of our module could crash it by passing ``None`` for the ``sh``
parameter.
One way to fix this would be::
def widen_shrubbery(Shrubbery sh, extra_width):
if sh is None:
raise TypeError
sh.width = sh.width + extra_width
but since this is anticipated to be such a frequent requirement, Cython
provides a more convenient way. Parameters of a Python function declared as an
extension type can have a ``not None`` clause::
def widen_shrubbery(Shrubbery sh not None, extra_width):
sh.width = sh.width + extra_width
Now the function will automatically check that ``sh`` is ``not None`` along
with checking that it has the right type.
.. note::
``not None`` clause can only be used in Python functions (defined with
:keyword:`def`) and not C functions (defined with :keyword:`cdef`). If
you need to check whether a parameter to a C function is None, you will
need to do it yourself.
.. note::
Some more things:
* The self parameter of a method of an extension type is guaranteed never to
be ``None``.
* When comparing a value with ``None``, keep in mind that, if ``x`` is a Python
object, ``x is None`` and ``x is not None`` are very efficient because they
translate directly to C pointer comparisons, whereas ``x == None`` and
``x != None``, or simply using ``x`` as a boolean value (as in ``if x: ...``)
will invoke Python operations and therefore be much slower.
Special methods
================
Although the principles are similar, there are substantial differences between
many of the :meth:`__xxx__` special methods of extension types and their Python
counterparts. There is a :ref:`separate page <special-methods>` devoted to this subject, and you should
read it carefully before attempting to use any special methods in your
extension types.
Properties
============
There is a special syntax for defining properties in an extension class::
cdef class Spam:
property cheese:
"A doc string can go here."
def __get__(self):
# This is called when the property is read.
...
def __set__(self, value):
# This is called when the property is written.
...
def __del__(self):
# This is called when the property is deleted.
The :meth:`__get__`, :meth:`__set__` and :meth:`__del__` methods are all
optional; if they are omitted, an exception will be raised when the
corresponding operation is attempted.
Here's a complete example. It defines a property which adds to a list each
time it is written to, returns the list when it is read, and empties the list
when it is deleted.::
# cheesy.pyx
cdef class CheeseShop:
cdef object cheeses
def __cinit__(self):
self.cheeses = []
property cheese:
def __get__(self):
return "We don't have: %s" % self.cheeses
def __set__(self, value):
self.cheeses.append(value)
def __del__(self):
del self.cheeses[:]
# Test input
from cheesy import CheeseShop
shop = CheeseShop()
print shop.cheese
shop.cheese = "camembert"
print shop.cheese
shop.cheese = "cheddar"
print shop.cheese
del shop.cheese
print shop.cheese
.. sourcecode:: text
# Test output
We don't have: []
We don't have: ['camembert']
We don't have: ['camembert', 'cheddar']
We don't have: []
Subclassing
=============
An extension type may inherit from a built-in type or another extension type::
cdef class Parrot:
...
cdef class Norwegian(Parrot):
...
A complete definition of the base type must be available to Cython, so if the
base type is a built-in type, it must have been previously declared as an
extern extension type. If the base type is defined in another Cython module, it
must either be declared as an extern extension type or imported using the
:keyword:`cimport` statement.
An extension type can only have one base class (no multiple inheritance).
Cython extension types can also be subclassed in Python. A Python class can
inherit from multiple extension types provided that the usual Python rules for
multiple inheritance are followed (i.e. the C layouts of all the base classes
must be compatible).
Since Cython 0.13.1, there is a way to prevent extension types from
being subtyped in Python. This is done via the ``final`` directive,
usually set on an extension type using a decorator::
cimport cython
@cython.final
cdef class Parrot:
def done(self): pass
Trying to create a Python subclass from this type will raise a
:class:`TypeError` at runtime. Cython will also prevent subtyping a
final type inside of the same module, i.e. creating an extension type
that uses a final type as its base type will fail at compile time.
Note, however, that this restriction does not currently propagate to
other extension modules, so even final extension types can still be
subtyped at the C level by foreign code.
C methods
=========
Extension types can have C methods as well as Python methods. Like C
functions, C methods are declared using :keyword:`cdef` or :keyword:`cpdef` instead of
:keyword:`def`. C methods are "virtual", and may be overridden in derived
extension types.::
# pets.pyx
cdef class Parrot:
cdef void describe(self):
print "This parrot is resting."
cdef class Norwegian(Parrot):
cdef void describe(self):
Parrot.describe(self)
print "Lovely plumage!"
cdef Parrot p1, p2
p1 = Parrot()
p2 = Norwegian()
print "p1:"
p1.describe()
print "p2:"
p2.describe()
.. sourcecode:: text
# Output
p1:
This parrot is resting.
p2:
This parrot is resting.
Lovely plumage!
The above example also illustrates that a C method can call an inherited C
method using the usual Python technique, i.e.::
Parrot.describe(self)
Forward-declaring extension types
===================================
Extension types can be forward-declared, like :keyword:`struct` and
:keyword:`union` types. This will be necessary if you have two extension types
that need to refer to each other, e.g.::
cdef class Shrubbery # forward declaration
cdef class Shrubber:
cdef Shrubbery work_in_progress
cdef class Shrubbery:
cdef Shrubber creator
If you are forward-declaring an extension type that has a base class, you must
specify the base class in both the forward declaration and its subsequent
definition, for example,::
cdef class A(B)
...
cdef class A(B):
# attributes and methods
Making extension types weak-referenceable
==========================================
By default, extension types do not support having weak references made to
them. You can enable weak referencing by declaring a C attribute of type
object called :attr:`__weakref__`. For example,::
cdef class ExplodingAnimal:
"""This animal will self-destruct when it is
no longer strongly referenced."""
cdef object __weakref__
Public and external extension types
====================================
Extension types can be declared extern or public. An extern extension type
declaration makes an extension type defined in external C code available to a
Cython module. A public extension type declaration makes an extension type
defined in a Cython module available to external C code.
External extension types
------------------------
An extern extension type allows you to gain access to the internals of Python
objects defined in the Python core or in a non-Cython extension module.
.. note::
In previous versions of Pyrex, extern extension types were also used to
reference extension types defined in another Pyrex module. While you can still
do that, Cython provides a better mechanism for this. See
:ref:`sharing-declarations`.
Here is an example which will let you get at the C-level members of the
built-in complex object.::
cdef extern from "complexobject.h":
struct Py_complex:
double real
double imag
ctypedef class __builtin__.complex [object PyComplexObject]:
cdef Py_complex cval
# A function which uses the above type
def spam(complex c):
print "Real:", c.cval.real
print "Imag:", c.cval.imag
.. note::
Some important things:
1. In this example, :keyword:`ctypedef` class has been used. This is
because, in the Python header files, the ``PyComplexObject`` struct is
declared with:
.. sourcecode:: c
ctypedef struct {
...
} PyComplexObject;
2. As well as the name of the extension type, the module in which its type
object can be found is also specified. See the implicit importing section
below.
3. When declaring an external extension type, you don't declare any
methods. Declaration of methods is not required in order to call them,
because the calls are Python method calls. Also, as with
:keyword:`structs` and :keyword:`unions`, if your extension class
declaration is inside a :keyword:`cdef` extern from block, you only need to
declare those C members which you wish to access.
Name specification clause
-------------------------
The part of the class declaration in square brackets is a special feature only
available for extern or public extension types. The full form of this clause
is::
[object object_struct_name, type type_object_name ]
where ``object_struct_name`` is the name to assume for the type's C struct,
and type_object_name is the name to assume for the type's statically declared
type object. (The object and type clauses can be written in either order.)
If the extension type declaration is inside a :keyword:`cdef` extern from
block, the object clause is required, because Cython must be able to generate
code that is compatible with the declarations in the header file. Otherwise,
for extern extension types, the object clause is optional.
For public extension types, the object and type clauses are both required,
because Cython must be able to generate code that is compatible with external C
code.
Implicit importing
------------------
Cython requires you to include a module name in an extern extension class
declaration, for example,::
cdef extern class MyModule.Spam:
...
The type object will be implicitly imported from the specified module and
bound to the corresponding name in this module. In other words, in this
example an implicit::
from MyModule import Spam
statement will be executed at module load time.
The module name can be a dotted name to refer to a module inside a package
hierarchy, for example,::
cdef extern class My.Nested.Package.Spam:
...
You can also specify an alternative name under which to import the type using
an as clause, for example,::
cdef extern class My.Nested.Package.Spam as Yummy:
...
which corresponds to the implicit import statement::
from My.Nested.Package import Spam as Yummy
Type names vs. constructor names
--------------------------------
Inside a Cython module, the name of an extension type serves two distinct
purposes. When used in an expression, it refers to a module-level global
variable holding the type's constructor (i.e. its type-object). However, it
can also be used as a C type name to declare variables, arguments and return
values of that type.
When you declare::
cdef extern class MyModule.Spam:
...
the name Spam serves both these roles. There may be other names by which you
can refer to the constructor, but only Spam can be used as a type name. For
example, if you were to explicity import MyModule, you could use
``MyModule.Spam()`` to create a Spam instance, but you wouldn't be able to use
:class:`MyModule.Spam` as a type name.
When an as clause is used, the name specified in the as clause also takes over
both roles. So if you declare::
cdef extern class MyModule.Spam as Yummy:
...
then Yummy becomes both the type name and a name for the constructor. Again,
there are other ways that you could get hold of the constructor, but only
Yummy is usable as a type name.
Public extension types
======================
An extension type can be declared public, in which case a ``.h`` file is
generated containing declarations for its object struct and type object. By
including the ``.h`` file in external C code that you write, that code can
access the attributes of the extension type.
.. highlight:: cython
.. _external-C-code:
**********************************
Interfacing with External C Code
**********************************
One of the main uses of Cython is wrapping existing libraries of C code. This
is achieved by using external declarations to declare the C functions and
variables from the library that you want to use.
You can also use public declarations to make C functions and variables defined
in a Cython module available to external C code. The need for this is expected
to be less frequent, but you might want to do it, for example, if you are
`embedding Python`_ in another application as a scripting language. Just as a
Cython module can be used as a bridge to allow Python code to call C code, it
can also be used to allow C code to call Python code.
.. _embedding Python: http://www.freenet.org.nz/python/embeddingpyrex/
External declarations
=======================
By default, C functions and variables declared at the module level are local
to the module (i.e. they have the C static storage class). They can also be
declared extern to specify that they are defined elsewhere, for example::
cdef extern int spam_counter
cdef extern void order_spam(int tons)
Referencing C header files
---------------------------
When you use an extern definition on its own as in the examples above, Cython
includes a declaration for it in the generated C file. This can cause problems
if the declaration doesn't exactly match the declaration that will be seen by
other C code. If you're wrapping an existing C library, for example, it's
important that the generated C code is compiled with exactly the same
declarations as the rest of the library.
To achieve this, you can tell Cython that the declarations are to be found in a
C header file, like this::
cdef extern from "spam.h":
int spam_counter
void order_spam(int tons)
The ``cdef extern`` from clause does three things:
1. It directs Cython to place a ``#include`` statement for the named header file in
the generated C code.
2. It prevents Cython from generating any C code
for the declarations found in the associated block.
3. It treats all declarations within the block as though they started with
``cdef extern``.
It's important to understand that Cython does not itself read the C header
file, so you still need to provide Cython versions of any declarations from it
that you use. However, the Cython declarations don't always have to exactly
match the C ones, and in some cases they shouldn't or can't. In particular:
1. Don't use ``const``. Cython doesn't know anything about ``const``, so just
leave it out. Most of the time this shouldn't cause any problem, although
on rare occasions you might have to use a cast. You can also explicitly
declare something like::
ctypedef char* const_char_ptr "const char*"
though in most cases this will not be needed.
.. warning::
A problem with const could arise if you have something like::
cdef extern from "grail.h":
char *nun
where grail.h actually contains::
extern const char *nun;
and you do::
cdef void languissement(char *s):
#something that doesn't change s
...
languissement(nun)
which will cause the C compiler to complain. You can work around it by
casting away the constness::
languissement(<char *>nun)
2. Leave out any platform-specific extensions to C declarations such as
``__declspec()``.
3. If the header file declares a big struct and you only want to use a few
members, you only need to declare the members you're interested in. Leaving
the rest out doesn't do any harm, because the C compiler will use the full
definition from the header file.
In some cases, you might not need any of the struct's members, in which
case you can just put pass in the body of the struct declaration, e.g.::
cdef extern from "foo.h":
struct spam:
pass
.. note::
you can only do this inside a ``cdef extern from`` block; struct
declarations anywhere else must be non-empty.
4. If the header file uses ``typedef`` names such as :ctype:`word` to refer
to platform-dependent flavours of numeric types, you will need a
corresponding :keyword:`ctypedef` statement, but you don't need to match
the type exactly, just use something of the right general kind (int, float,
etc). For example,::
ctypedef int word
will work okay whatever the actual size of a :ctype:`word ` is (provided the header
file defines it correctly). Conversion to and from Python types, if any, will also
be used for this new type.
5. If the header file uses macros to define constants, translate them into a
dummy ``enum`` declaration.
6. If the header file defines a function using a macro, declare it as though
it were an ordinary function, with appropriate argument and result types.
7. For archaic reasons C uses the keyword :keyword:`void` to declare a function
taking no parameters. In Cython as in Python, simply declare such functions
as :meth:`foo()`.
A few more tricks and tips:
* If you want to include a C header because it's needed by another header, but
don't want to use any declarations from it, put pass in the extern-from
block::
cdef extern from "spam.h":
pass
* If you want to include some external declarations, but don't want to specify
a header file (because it's included by some other header that you've
already included) you can put ``*`` in place of the header file name::
cdef extern from *:
...
Styles of struct, union and enum declaration
----------------------------------------------
There are two main ways that structs, unions and enums can be declared in C
header files: using a tag name, or using a typedef. There are also some
variations based on various combinations of these.
It's important to make the Cython declarations match the style used in the
header file, so that Cython can emit the right sort of references to the type
in the code it generates. To make this possible, Cython provides two different
syntaxes for declaring a struct, union or enum type. The style introduced
above corresponds to the use of a tag name. To get the other style, you prefix
the declaration with :keyword:`ctypedef`, as illustrated below.
The following table shows the various possible styles that can be found in a
header file, and the corresponding Cython declaration that you should put in
the ``cdef extern`` from block. Struct declarations are used as an example; the
same applies equally to union and enum declarations.
+-------------------------+---------------------------------------------+-----------------------------------------------------------------------+
| C code | Possibilities for corresponding Cython Code | Comments |
+=========================+=============================================+=======================================================================+
| .. sourcecode:: c | :: | Cython will refer to the as ``struct Foo`` in the generated C code. |
| | | |
| struct Foo { | cdef struct Foo: | |
| ... | ... | |
| }; | | |
+-------------------------+---------------------------------------------+-----------------------------------------------------------------------+
| .. sourcecode:: c | :: | Cython will refer to the type simply as ``Foo`` in |
| | | the generated C code. |
| typedef struct { | ctypedef struct Foo: | |
| ... | ... | |
| } Foo; | | |
+-------------------------+---------------------------------------------+-----------------------------------------------------------------------+
| .. sourcecode:: c | :: | If the C header uses both a tag and a typedef with *different* |
| | | names, you can use either form of declaration in Cython |
| typedef struct foo { | cdef struct foo: | (although if you need to forward reference the type, |
| ... | ... | you'll have to use the first form). |
| } Foo; | ctypedef foo Foo #optional | |
| | | |
| | or:: | |
| | | |
| | ctypedef struct Foo: | |
| | ... | |
+-------------------------+---------------------------------------------+-----------------------------------------------------------------------+
| .. sourcecode:: c | :: | If the header uses the *same* name for the tag and typedef, you |
| | | won't be able to include a :keyword:`ctypedef` for it -- but then, |
| typedef struct Foo { | cdef struct Foo: | it's not necessary. |
| ... | ... | |
| } Foo; | | |
+-------------------------+---------------------------------------------+-----------------------------------------------------------------------+
Note that in all the cases below, you refer to the type in Cython code simply
as :ctype:`Foo`, not ``struct Foo``.
Accessing Python/C API routines
---------------------------------
One particular use of the ``cdef extern from`` statement is for gaining access to
routines in the Python/C API. For example,::
cdef extern from "Python.h":
object PyString_FromStringAndSize(char *s, Py_ssize_t len)
will allow you to create Python strings containing null bytes.
Special Types
--------------
Cython predefines the name ``Py_ssize_t`` for use with Python/C API routines. To
make your extensions compatible with 64-bit systems, you should always use
this type where it is specified in the documentation of Python/C API routines.
Windows Calling Conventions
----------------------------
The ``__stdcall`` and ``__cdecl`` calling convention specifiers can be used in
Cython, with the same syntax as used by C compilers on Windows, for example,::
cdef extern int __stdcall FrobnicateWindow(long handle)
cdef void (__stdcall *callback)(void *)
If ``__stdcall`` is used, the function is only considered compatible with
other ``__stdcall`` functions of the same signature.
Resolving naming conflicts - C name specifications
----------------------------------------------------
Each Cython module has a single module-level namespace for both Python and C
names. This can be inconvenient if you want to wrap some external C functions
and provide the Python user with Python functions of the same names.
Cython provides a couple of different ways of solving this problem. The
best way, especially if you have many C functions to wrap, is probably to put
the extern C function declarations into a different namespace using the
facilities described in the section on sharing declarations between Cython
modules.
The other way is to use a C name specification to give different Cython and C
names to the C function. Suppose, for example, that you want to wrap an
external function called :func:`eject_tomato`. If you declare it as::
cdef extern void c_eject_tomato "eject_tomato" (float speed)
then its name inside the Cython module will be ``c_eject_tomato``, whereas its name
in C will be ``eject_tomato``. You can then wrap it with::
def eject_tomato(speed):
c_eject_tomato(speed)
so that users of your module can refer to it as ``eject_tomato``.
Another use for this feature is referring to external names that happen to be
Cython keywords. For example, if you want to call an external function called
print, you can rename it to something else in your Cython module.
As well as functions, C names can be specified for variables, structs, unions,
enums, struct and union members, and enum values. For example,::
cdef extern int one "ein", two "zwei"
cdef extern float three "drei"
cdef struct spam "SPAM":
int i "eye"
cdef enum surprise "inquisition":
first "alpha"
second "beta" = 3
Using Cython Declarations from C
==================================
Cython provides two methods for making C declarations from a Cython module
available for use by external C code---public declarations and C API
declarations.
.. note::
You do not need to use either of these to make declarations from one
Cython module available to another Cython module – you should use the
:keyword:`cimport` statement for that. Sharing Declarations Between Cython Modules.
Public Declarations
---------------------
You can make C types, variables and functions defined in a Cython module
accessible to C code that is linked with the module, by declaring them with
the public keyword::
cdef public struct Bunny: # public type declaration
int vorpalness
cdef public int spam # public variable declaration
cdef public void grail(Bunny *): # public function declaration
...
If there are any public declarations in a Cython module, a header file called
:file:`modulename.h` file is generated containing equivalent C declarations for
inclusion in other C code.
Any C code wanting to make use of these declarations will need to be linked,
either statically or dynamically, with the extension module.
If the Cython module resides within a package, then the name of the ``.h``
file consists of the full dotted name of the module, e.g. a module called
:mod:`foo.spam` would have a header file called :file:`foo.spam.h`.
C API Declarations
-------------------
The other way of making declarations available to C code is to declare them
with the :keyword:`api` keyword. You can use this keyword with C functions and
extension types. A header file called :file:`modulename_api.h` is produced
containing declarations of the functions and extension types, and a function
called :func:`import_modulename`.
C code wanting to use these functions or extension types needs to include the
header and call the :func:`import_modulename` function. The other functions
can then be called and the extension types used as usual.
Any public C type or extension type declarations in the Cython module are also
made available when you include :file:`modulename_api.h`.::
# delorean.pyx
cdef public struct Vehicle:
int speed
float power
cdef api void activate(Vehicle *v):
if v.speed >= 88 and v.power >= 1.21:
print "Time travel achieved"
.. sourcecode:: c
# marty.c
#include "delorean_api.h"
Vehicle car;
int main(int argc, char *argv[]) {
import_delorean();
car.speed = atoi(argv[1]);
car.power = atof(argv[2]);
activate(&car);
}
.. note::
Any types defined in the Cython module that are used as argument or
return types of the exported functions will need to be declared public,
otherwise they won't be included in the generated header file, and you will
get errors when you try to compile a C file that uses the header.
Using the :keyword:`api` method does not require the C code using the
declarations to be linked with the extension module in any way, as the Python
import machinery is used to make the connection dynamically. However, only
functions can be accessed this way, not variables.
You can use both :keyword:`public` and :keyword:`api` on the same function to
make it available by both methods, e.g.::
cdef public api void belt_and_braces():
...
However, note that you should include either :file:`modulename.h` or
:file:`modulename_api.h` in a given C file, not both, otherwise you may get
conflicting dual definitions.
If the Cython module resides within a package, then:
* The name of the header file contains of the full dotted name of the module.
* The name of the importing function contains the full name with dots replaced
by double underscores.
E.g. a module called :mod:`foo.spam` would have an API header file called
:file:`foo.spam_api.h` and an importing function called
:func:`import_foo__spam`.
Multiple public and API declarations
--------------------------------------
You can declare a whole group of items as :keyword:`public` and/or
:keyword:`api` all at once by enclosing them in a :keyword:`cdef` block, for
example,::
cdef public api:
void order_spam(int tons)
char *get_lunch(float tomato_size)
This can be a useful thing to do in a ``.pxd`` file (see
:ref:`sharing-declarations`) to make the module's public interface
available by all three methods.
Acquiring and Releasing the GIL
---------------------------------
Cython provides facilities for releasing the Global Interpreter Lock (GIL)
before calling C code, and for acquiring the GIL in functions that are to be
called back from C code that is executed without the GIL.
Releasing the GIL
^^^^^^^^^^^^^^^^^
You can release the GIL around a section of code using the
:keyword:`with nogil` statement::
with nogil:
<code to be executed with the GIL released>
Code in the body of the statement must not manipulate Python objects in any
way, and must not call anything that manipulates Python objects without first
re-acquiring the GIL. Cython currently does not check this.
Acquiring the GIL
^^^^^^^^^^^^^^^^^
A C function that is to be used as a callback from C code that is executed
without the GIL needs to acquire the GIL before it can manipulate Python
objects. This can be done by specifying with :keyword:`gil` in the function
header::
cdef void my_callback(void *data) with gil:
...
Declaring a function as callable without the GIL
--------------------------------------------------
You can specify :keyword:`nogil` in a C function header or function type to
declare that it is safe to call without the GIL.::
cdef void my_gil_free_func(int spam) nogil:
...
If you are implementing such a function in Cython, it cannot have any Python
arguments, Python local variables, or Python return type, and cannot
manipulate Python objects in any way or call any function that does so without
acquiring the GIL first. Some of these restrictions are currently checked by
Cython, but not all. It is possible that more stringent checking will be
performed in the future.
Declaring a function with :keyword:`gil` also implicitly makes its signature
:keyword:`nogil`.
Cython Users Guide
======================
Contents:
.. toctree::
:maxdepth: 2
overview
tutorial
language_basics
extension_types
special_methods
sharing_declarations
external_C_code
source_files_and_compilation
wrapping_CPlusPlus
numpy_tutorial
limitations
pyrex_differences
early_binding_for_speed
debugging
Indices and tables
------------------
* :ref:`genindex`
* :ref:`modindex`
* :ref:`search`
.. toctree::
.. highlight:: cython
.. _language-basics:
*****************
Language Basics
*****************
C variable and type definitions
===============================
The :keyword:`cdef` statement is used to declare C variables, either local or
module-level::
cdef int i, j, k
cdef float f, g[42], *h
and C :keyword:`struct`, :keyword:`union` or :keyword:`enum` types::
cdef struct Grail:
int age
float volume
cdef union Food:
char *spam
float *eggs
cdef enum CheeseType:
cheddar, edam,
camembert
cdef enum CheeseState:
hard = 1
soft = 2
runny = 3
There is currently no special syntax for defining a constant, but you can use
an anonymous :keyword:`enum` declaration for this purpose, for example,::
cdef enum:
tons_of_spam = 3
.. note::
the words ``struct``, ``union`` and ``enum`` are used only when
defining a type, not when referring to it. For example, to declare a variable
pointing to a ``Grail`` you would write::
cdef Grail *gp
and not::
cdef struct Grail *gp # WRONG
There is also a ``ctypedef`` statement for giving names to types, e.g.::
ctypedef unsigned long ULong
ctypedef int *IntPtr
Grouping multiple C declarations
--------------------------------
If you have a series of declarations that all begin with :keyword:`cdef`, you
can group them into a :keyword:`cdef` block like this::
cdef:
struct Spam:
int tons
int i
float f
Spam *p
void f(Spam *s):
print s.tons, "Tons of spam"
Python functions vs. C functions
==================================
There are two kinds of function definition in Cython:
Python functions are defined using the def statement, as in Python. They take
Python objects as parameters and return Python objects.
C functions are defined using the new :keyword:`cdef` statement. They take
either Python objects or C values as parameters, and can return either Python
objects or C values.
Within a Cython module, Python functions and C functions can call each other
freely, but only Python functions can be called from outside the module by
interpreted Python code. So, any functions that you want to "export" from your
Cython module must be declared as Python functions using def.
There is also a hybrid function, called :keyword:`cpdef`. A :keyword:`cpdef`
can be called from anywhere, but uses the faster C calling conventions
when being called from other Cython code.
Parameters of either type of function can be declared to have C data types,
using normal C declaration syntax. For example,::
def spam(int i, char *s):
...
cdef int eggs(unsigned long l, float f):
...
When a parameter of a Python function is declared to have a C data type, it is
passed in as a Python object and automatically converted to a C value, if
possible. Automatic conversion is currently only possible for numeric types
and string types; attempting to use any other type for the parameter of a
Python function will result in a compile-time error.
C functions, on the other hand, can have parameters of any type, since they're
passed in directly using a normal C function call.
A more complete comparison of the pros and cons of these different method
types can be found at :ref:`early-binding-for-speed`.
Python objects as parameters and return values
----------------------------------------------
If no type is specified for a parameter or return value, it is assumed to be a
Python object. (Note that this is different from the C convention, where it
would default to int.) For example, the following defines a C function that
takes two Python objects as parameters and returns a Python object::
cdef spamobjs(x, y):
...
Reference counting for these objects is performed automatically according to
the standard Python/C API rules (i.e. borrowed references are taken as
parameters and a new reference is returned).
The name object can also be used to explicitly declare something as a Python
object. This can be useful if the name being declared would otherwise be taken
as the name of a type, for example,::
cdef ftang(object int):
...
declares a parameter called int which is a Python object. You can also use
object as the explicit return type of a function, e.g.::
cdef object ftang(object int):
...
In the interests of clarity, it is probably a good idea to always be explicit
about object parameters in C functions.
Error return values
-------------------
If you don't do anything special, a function declared with :keyword:`cdef` that
does not return a Python object has no way of reporting Python exceptions to
its caller. If an exception is detected in such a function, a warning message
is printed and the exception is ignored.
If you want a C function that does not return a Python object to be able to
propagate exceptions to its caller, you need to declare an exception value for
it. Here is an example::
cdef int spam() except -1:
...
With this declaration, whenever an exception occurs inside spam, it will
immediately return with the value ``-1``. Furthermore, whenever a call to spam
returns ``-1``, an exception will be assumed to have occurred and will be
propagated.
When you declare an exception value for a function, you should never
explicitly return that value. If all possible return values are legal and you
can't reserve one entirely for signalling errors, you can use an alternative
form of exception value declaration::
cdef int spam() except? -1:
...
The "?" indicates that the value ``-1`` only indicates a possible error. In this
case, Cython generates a call to :cfunc:`PyErr_Occurred` if the exception value is
returned, to make sure it really is an error.
There is also a third form of exception value declaration::
cdef int spam() except *:
...
This form causes Cython to generate a call to :cfunc:`PyErr_Occurred` after
every call to spam, regardless of what value it returns. If you have a
function returning void that needs to propagate errors, you will have to use
this form, since there isn't any return value to test.
Otherwise there is little use for this form.
An external C++ function that may raise an exception can be declared with::
cdef int spam() except +
See :ref:`wrapping-cplusplus` for more details.
Some things to note:
* Exception values can only declared for functions returning an integer, enum,
float or pointer type, and the value must be a constant expression.
Void functions can only use the ``except *`` form.
* The exception value specification is part of the signature of the function.
If you're passing a pointer to a function as a parameter or assigning it
to a variable, the declared type of the parameter or variable must have
the same exception value specification (or lack thereof). Here is an
example of a pointer-to-function declaration with an exception
value::
int (*grail)(int, char *) except -1
* You don't need to (and shouldn't) declare exception values for functions
which return Python objects. Remember that a function with no declared
return type implicitly returns a Python object. (Exceptions on such functions
are implicitly propagated by returning NULL.)
Checking return values of non-Cython functions
----------------------------------------------
It's important to understand that the except clause does not cause an error to
be raised when the specified value is returned. For example, you can't write
something like::
cdef extern FILE *fopen(char *filename, char *mode) except NULL # WRONG!
and expect an exception to be automatically raised if a call to :func:`fopen`
returns ``NULL``. The except clause doesn't work that way; its only purpose is
for propagating Python exceptions that have already been raised, either by a Cython
function or a C function that calls Python/C API routines. To get an exception
from a non-Python-aware function such as :func:`fopen`, you will have to check the
return value and raise it yourself, for example,::
cdef FILE *p
p = fopen("spam.txt", "r")
if p == NULL:
raise SpamError("Couldn't open the spam file")
Automatic type conversions
==========================
In most situations, automatic conversions will be performed for the basic
numeric and string types when a Python object is used in a context requiring a
C value, or vice versa. The following table summarises the conversion
possibilities.
+----------------------------+--------------------+------------------+
| C types | From Python types | To Python types |
+============================+====================+==================+
| [unsigned] char | int, long | int |
| [unsigned] short | | |
| int, long | | |
+----------------------------+--------------------+------------------+
| unsigned int | int, long | long |
| unsigned long | | |
| [unsigned] long long | | |
+----------------------------+--------------------+------------------+
| float, double, long double | int, long, float | float |
+----------------------------+--------------------+------------------+
| char * | str/bytes | str/bytes [#]_ |
+----------------------------+--------------------+------------------+
| struct | | dict |
+----------------------------+--------------------+------------------+
.. [#] The conversion is to/from str for Python 2.x, and bytes for Python 3.x.
Caveats when using a Python string in a C context
-------------------------------------------------
You need to be careful when using a Python string in a context expecting a
``char *``. In this situation, a pointer to the contents of the Python string is
used, which is only valid as long as the Python string exists. So you need to
make sure that a reference to the original Python string is held for as long
as the C string is needed. If you can't guarantee that the Python string will
live long enough, you will need to copy the C string.
Cython detects and prevents some mistakes of this kind. For instance, if you
attempt something like::
cdef char *s
s = pystring1 + pystring2
then Cython will produce the error message ``Obtaining char * from temporary
Python value``. The reason is that concatenating the two Python strings
produces a new Python string object that is referenced only by a temporary
internal variable that Cython generates. As soon as the statement has finished,
the temporary variable will be decrefed and the Python string deallocated,
leaving ``s`` dangling. Since this code could not possibly work, Cython refuses to
compile it.
The solution is to assign the result of the concatenation to a Python
variable, and then obtain the ``char *`` from that, i.e.::
cdef char *s
p = pystring1 + pystring2
s = p
It is then your responsibility to hold the reference p for as long as
necessary.
Keep in mind that the rules used to detect such errors are only heuristics.
Sometimes Cython will complain unnecessarily, and sometimes it will fail to
detect a problem that exists. Ultimately, you need to understand the issue and
be careful what you do.
Statements and expressions
==========================
Control structures and expressions follow Python syntax for the most part.
When applied to Python objects, they have the same semantics as in Python
(unless otherwise noted). Most of the Python operators can also be applied to
C values, with the obvious semantics.
If Python objects and C values are mixed in an expression, conversions are
performed automatically between Python objects and C numeric or string types.
Reference counts are maintained automatically for all Python objects, and all
Python operations are automatically checked for errors, with appropriate
action taken.
Differences between C and Cython expressions
--------------------------------------------
There are some differences in syntax and semantics between C expressions and
Cython expressions, particularly in the area of C constructs which have no
direct equivalent in Python.
* An integer literal is treated as a C constant, and will
be truncated to whatever size your C compiler thinks appropriate.
To get a Python integer (of arbitrary precision) cast immediately to
an object (e.g. ``<object>100000000000000000000``). The ``L``, ``LL``,
and ``U`` suffixes have the same meaning as in C.
* There is no ``->`` operator in Cython. Instead of ``p->x``, use ``p.x``
* There is no unary ``*`` operator in Cython. Instead of ``*p``, use ``p[0]``
* There is an ``&`` operator, with the same semantics as in C.
* The null C pointer is called ``NULL``, not ``0`` (and ``NULL`` is a reserved word).
* Type casts are written ``<type>value`` , for example::
cdef char *p, float *q
p = <char*>q
Scope rules
-----------
Cython determines whether a variable belongs to a local scope, the module
scope, or the built-in scope completely statically. As with Python, assigning
to a variable which is not otherwise declared implicitly declares it to be a
Python variable residing in the scope where it is assigned.
.. note::
A consequence of these rules is that the module-level scope behaves the
same way as a Python local scope if you refer to a variable before assigning
to it. In particular, tricks such as the following will not work in Cython::
try:
x = True
except NameError:
True = 1
because, due to the assignment, the True will always be looked up in the
module-level scope. You would have to do something like this instead::
import __builtin__
try:
True = __builtin__.True
except AttributeError:
True = 1
Built-in Functions
------------------
Cython compiles calls to the following built-in functions into direct calls to
the corresponding Python/C API routines, making them particularly fast.
+------------------------------+-------------+----------------------------+
| Function and arguments | Return type | Python/C API Equivalent |
+==============================+=============+============================+
| abs(obj) | object | PyNumber_Absolute |
+------------------------------+-------------+----------------------------+
| delattr(obj, name) | int | PyObject_DelAttr |
+------------------------------+-------------+----------------------------+
| dir(obj) | object | PyObject_Dir |
| getattr(obj, name) (Note 1) | | |
| getattr3(obj, name, default) | | |
+------------------------------+-------------+----------------------------+
| hasattr(obj, name) | int | PyObject_HasAttr |
+------------------------------+-------------+----------------------------+
| hash(obj) | int | PyObject_Hash |
+------------------------------+-------------+----------------------------+
| intern(obj) | object | PyObject_InternFromString |
+------------------------------+-------------+----------------------------+
| isinstance(obj, type) | int | PyObject_IsInstance |
+------------------------------+-------------+----------------------------+
| issubclass(obj, type) | int | PyObject_IsSubclass |
+------------------------------+-------------+----------------------------+
| iter(obj) | object | PyObject_GetIter |
+------------------------------+-------------+----------------------------+
| len(obj) | Py_ssize_t | PyObject_Length |
+------------------------------+-------------+----------------------------+
| pow(x, y, z) (Note 2) | object | PyNumber_Power |
+------------------------------+-------------+----------------------------+
| reload(obj) | object | PyImport_ReloadModule |
+------------------------------+-------------+----------------------------+
| repr(obj) | object | PyObject_Repr |
+------------------------------+-------------+----------------------------+
| setattr(obj, name) | void | PyObject_SetAttr |
+------------------------------+-------------+----------------------------+
Note 1: There are two different functions corresponding to the Python
:func:`getattr` depending on whether a third argument is used. In a Python
context, they both evaluate to the Python :func:`getattr` function.
Note 2: Only the three-argument form of :func:`pow` is supported. Use the
``**`` operator otherwise.
Only direct function calls using these names are optimised. If you do
something else with one of these names that assumes it's a Python object, such
as assign it to a Python variable, and later call it, the call will be made as
a Python function call.
Operator Precedence
-------------------
Keep in mind that there are some differences in operator precedence between
Python and C, and that Cython uses the Python precedences, not the C ones.
Integer for-loops
------------------
Cython recognises the usual Python for-in-range integer loop pattern::
for i in range(n):
...
If ``i`` is declared as a :keyword:`cdef` integer type, it will
optimise this into a pure C loop. This restriction is required as
otherwise the generated code wouldn't be correct due to potential
integer overflows on the target architecture. If you are worried that
the loop is not being converted correctly, use the annotate feature of
the cython commandline (``-a``) to easily see the generated C code.
See :ref:`automatic-range-conversion`
For backwards compatibility to Pyrex, Cython also supports another
form of for-loop::
for i from 0 <= i < n:
...
or::
for i from 0 <= i < n by s:
...
where ``s`` is some integer step size.
Some things to note about the for-from loop:
* The target expression must be a variable name.
* The name between the lower and upper bounds must be the same as the target
name.
* The direction of iteration is determined by the relations. If they are both
from the set {``<``, ``<=``} then it is upwards; if they are both from the set
{``>``, ``>=``} then it is downwards. (Any other combination is disallowed.)
Like other Python looping statements, break and continue may be used in the
body, and the loop may have an else clause.
The include statement
=====================
.. warning::
Historically the ``include`` statement was used for sharing declarations.
Use :ref:`sharing-declarations` instead.
A Cython source file can include material from other files using the include
statement, for example::
include "spamstuff.pxi"
The contents of the named file are textually included at that point. The
included file can contain any complete statements or declarations that are
valid in the context where the include statement appears, including other
include statements. The contents of the included file should begin at an
indentation level of zero, and will be treated as though they were indented to
the level of the include statement that is including the file.
.. note::
There are other mechanisms available for splitting Cython code into
separate parts that may be more appropriate in many cases. See
:ref:`sharing-declarations`.
Conditional Compilation
=======================
Some features are available for conditional compilation and compile-time
constants within a Cython source file.
Compile-Time Definitions
------------------------
A compile-time constant can be defined using the DEF statement::
DEF FavouriteFood = "spam"
DEF ArraySize = 42
DEF OtherArraySize = 2 * ArraySize + 17
The right-hand side of the ``DEF`` must be a valid compile-time expression.
Such expressions are made up of literal values and names defined using ``DEF``
statements, combined using any of the Python expression syntax.
The following compile-time names are predefined, corresponding to the values
returned by :func:`os.uname`.
UNAME_SYSNAME, UNAME_NODENAME, UNAME_RELEASE,
UNAME_VERSION, UNAME_MACHINE
The following selection of builtin constants and functions are also available:
None, True, False,
abs, bool, chr, cmp, complex, dict, divmod, enumerate,
float, hash, hex, int, len, list, long, map, max, min,
oct, ord, pow, range, reduce, repr, round, slice, str,
sum, tuple, xrange, zip
A name defined using ``DEF`` can be used anywhere an identifier can appear,
and it is replaced with its compile-time value as though it were written into
the source at that point as a literal. For this to work, the compile-time
expression must evaluate to a Python value of type ``int``, ``long``,
``float`` or ``str``.::
cdef int a1[ArraySize]
cdef int a2[OtherArraySize]
print "I like", FavouriteFood
Conditional Statements
----------------------
The ``IF`` statement can be used to conditionally include or exclude sections
of code at compile time. It works in a similar way to the ``#if`` preprocessor
directive in C.::
IF UNAME_SYSNAME == "Windows":
include "icky_definitions.pxi"
ELIF UNAME_SYSNAME == "Darwin":
include "nice_definitions.pxi"
ELIF UNAME_SYSNAME == "Linux":
include "penguin_definitions.pxi"
ELSE:
include "other_definitions.pxi"
The ``ELIF`` and ``ELSE`` clauses are optional. An ``IF`` statement can appear
anywhere that a normal statement or declaration can appear, and it can contain
any statements or declarations that would be valid in that context, including
``DEF`` statements and other ``IF`` statements.
The expressions in the ``IF`` and ``ELIF`` clauses must be valid compile-time
expressions as for the ``DEF`` statement, although they can evaluate to any
Python value, and the truth of the result is determined in the usual Python
way.
.. highlight:: cython
.. _cython-limitations:
*************
Limitations
*************
Unsupported Python Features
============================
One of our goals is to make Cython as compatible as possible with standard
Python. This page lists the things that work in Python but not in Cython.
As Cython matures, the items in this list should go away.
Generators and generator expressions
-------------------------------------
The yield keyword is not yet supported. This is work in progress.
Since Cython 0.13, some generator expressions are supported when they
can be transformed into inlined loops in combination with builtins,
e.g. ``sum(x*2 for x in seq)``. As of 0.14, the supported builtins
are ``list()``, ``set()``, ``dict()``, ``sum()``, ``any()``,
``all()``, ``sorted()``.
Other Current Limitations
==========================
* The :func:`globals` builtin returns the last Python callers globals, not the current function's locals. This behavior should not be relied upon, as it will probably change in the future.
* The :func:`locals` builtin can only be used if all local variables can be converted to Python objects, and returns a dict.
* Class and function definitions cannot be placed inside control structures.
Semantic differences between Python and Cython
----------------------------------------------
Behaviour of class scopes
^^^^^^^^^^^^^^^^^^^^^^^^^
In Python, referring to a method of a class inside the class definition, i.e.
while the class is being defined, yields a plain function object, but in
Cython it yields an unbound method [#]_. A consequence of this is that the
usual idiom for using the :func:`classmethod` and :func:`staticmethod` functions,
e.g.::
class Spam:
def method(cls):
...
method = classmethod(method)
will not work in Cython. This can be worked around by defining the function
outside the class, and then assigning the result of ``classmethod`` or
``staticmethod`` inside the class, i.e.::
def Spam_method(cls):
...
class Spam:
method = classmethod(Spam_method)
This will change in the near future.
.. rubric:: Footnotes
.. [#] The reason for the different behaviour of class scopes is that
Cython-defined Python functions are ``PyCFunction`` objects, not
``PyFunction`` objects, and are not recognised by the machinery that creates a
bound or unbound method when a function is extracted from a class. To get
around this, Cython wraps each method in an unbound method object itself
before storing it in the class's dictionary.
.. highlight:: cython
.. _numpy_tutorial:
**************************
Cython for NumPy users
**************************
This tutorial is aimed at NumPy users who have no experience with Cython at
all. If you have some knowledge of Cython you may want to skip to the
''Efficient indexing'' section which explains the new improvements made in
summer 2008.
The main scenario considered is NumPy end-use rather than NumPy/SciPy
development. The reason is that Cython is not (yet) able to support functions
that are generic with respect to datatype and the number of dimensions in a
high-level fashion. This restriction is much more severe for SciPy development
than more specific, "end-user" functions. See the last section for more
information on this.
The style of this tutorial will not fit everybody, so you can also consider:
* Robert Bradshaw's `slides on cython for SciPy2008
<http://wiki.sagemath.org/scipy08?action=AttachFile&do=get&target=scipy-cython.tgz>`_
(a higher-level and quicker introduction)
* Basic Cython documentation (see `Cython front page <http://cython.org>`_).
* ``[:enhancements/buffer:Spec for the efficient indexing]``
.. Note::
The fast array access documented below is a completely new feature, and
there may be bugs waiting to be discovered. It might be a good idea to do
a manual sanity check on the C code Cython generates before using this for
serious purposes, at least until some months have passed.
Cython at a glance
====================
Cython is a compiler which compiles Python-like code files to C code. Still,
''Cython is not a Python to C translator''. That is, it doesn't take your full
program and "turns it into C" -- rather, the result makes full use of the
Python runtime environment. A way of looking at it may be that your code is
still Python in that it runs within the Python runtime environment, but rather
than compiling to interpreted Python bytecode one compiles to native machine
code (but with the addition of extra syntax for easy embedding of faster
C-like code).
This has two important consequences:
* Speed. How much depends very much on the program involved though. Typical Python numerical programs would tend to gain very little as most time is spent in lower-level C that is used in a high-level fashion. However for-loop-style programs can gain many orders of magnitude, when typing information is added (and is so made possible as a realistic alternative).
* Easy calling into C code. One of Cython's purposes is to allow easy wrapping
of C libraries. When writing code in Cython you can call into C code as
easily as into Python code.
Some Python constructs are not yet supported, though making Cython compile all
Python code is a stated goal (among the more important omissions are inner
functions and generator functions).
Your Cython environment
========================
Using Cython consists of these steps:
1. Write a :file:`.pyx` source file
2. Run the Cython compiler to generate a C file
3. Run a C compiler to generate a compiled library
4. Run the Python interpreter and ask it to import the module
However there are several options to automate these steps:
1. The `SAGE <http://sagemath.org>`_ mathematics software system provides
excellent support for using Cython and NumPy from an interactive command
line (like IPython) or through a notebook interface (like
Maple/Mathematica). See `this documentation
<http://www.sagemath.org/doc/prog/node40.html>`_.
2. A version of `pyximport <http://www.prescod.net/pyximport/>`_ is shipped
with Cython, so that you can import pyx-files dynamically into Python and
have them compiled automatically (See :ref:`pyximport`).
3. Cython supports distutils so that you can very easily create build scripts
which automate the process, this is the preferred method for full programs.
4. Manual compilation (see below)
.. Note::
If using another interactive command line environment than SAGE, like
IPython or Python itself, it is important that you restart the process
when you recompile the module. It is not enough to issue an "import"
statement again.
Installation
=============
Unless you are used to some other automatic method:
`download Cython <http://cython.org/#download>`_ (0.9.8.1.1 or later), unpack it,
and run the usual ```python setup.py install``. This will install a
``cython`` executable on your system. It is also possible to use Cython from
the source directory without installing (simply launch :file:`cython.py` in the
root directory).
As of this writing SAGE comes with an older release of Cython than required
for this tutorial. So if using SAGE you should download the newest Cython and
then execute ::
$ cd path/to/cython-distro
$ path-to-sage/sage -python setup.py install
This will install the newest Cython into SAGE.
Manual compilation
====================
As it is always important to know what is going on, I'll describe the manual
method here. First Cython is run::
$ cython yourmod.pyx
This creates :file:`yourmod.c` which is the C source for a Python extension
module. A useful additional switch is ``-a`` which will generate a document
:file:`yourmod.html`) that shows which Cython code translates to which C code
line by line.
Then we compile the C file. This may vary according to your system, but the C
file should be built like Python was built. Python documentation for writing
extensions should have some details. On Linux this often means something
like::
$ gcc -shared -pthread -fPIC -fwrapv -O2 -Wall -fno-strict-aliasing -I/usr/include/python2.5 -o yourmod.so yourmod.c
``gcc`` should have access to the NumPy C header files so if they are not
installed at :file:`/usr/include/numpy` or similar you may need to pass another
option for those.
This creates :file:`yourmod.so` in the same directory, which is importable by
Python by using a normal ``import yourmod`` statement.
The first Cython program
==========================
The code below does 2D discrete convolution of an image with a filter (and I'm
sure you can do better!, let it serve for demonstration purposes). It is both
valid Python and valid Cython code. I'll refer to it as both
:file:`convolve_py.py` for the Python version and :file:`convolve1.pyx` for the
Cython version -- Cython uses ".pyx" as its file suffix.
.. code-block:: python
from __future__ import division
import numpy as np
def naive_convolve(f, g):
# f is an image and is indexed by (v, w)
# g is a filter kernel and is indexed by (s, t),
# it needs odd dimensions
# h is the output image and is indexed by (x, y),
# it is not cropped
if g.shape[0] % 2 != 1 or g.shape[1] % 2 != 1:
raise ValueError("Only odd dimensions on filter supported")
# smid and tmid are number of pixels between the center pixel
# and the edge, ie for a 5x5 filter they will be 2.
#
# The output size is calculated by adding smid, tmid to each
# side of the dimensions of the input image.
vmax = f.shape[0]
wmax = f.shape[1]
smax = g.shape[0]
tmax = g.shape[1]
smid = smax // 2
tmid = tmax // 2
xmax = vmax + 2*smid
ymax = wmax + 2*tmid
# Allocate result image.
h = np.zeros([xmax, ymax], dtype=f.dtype)
# Do convolution
for x in range(xmax):
for y in range(ymax):
# Calculate pixel value for h at (x,y). Sum one component
# for each pixel (s, t) of the filter g.
s_from = max(smid - x, -smid)
s_to = min((xmax - x) - smid, smid + 1)
t_from = max(tmid - y, -tmid)
t_to = min((ymax - y) - tmid, tmid + 1)
value = 0
for s in range(s_from, s_to):
for t in range(t_from, t_to):
v = x - smid + s
w = y - tmid + t
value += g[smid - s, tmid - t] * f[v, w]
h[x, y] = value
return h
This should be compiled to produce :file:`yourmod.so` (for Linux systems). We
run a Python session to test both the Python version (imported from
``.py``-file) and the compiled Cython module.
.. sourcecode:: ipython
In [1]: import numpy as np
In [2]: import convolve_py
In [3]: convolve_py.naive_convolve(np.array([[1, 1, 1]], dtype=np.int),
... np.array([[1],[2],[1]], dtype=np.int))
Out [3]:
array([[1, 1, 1],
[2, 2, 2],
[1, 1, 1]])
In [4]: import convolve1
In [4]: convolve1.naive_convolve(np.array([[1, 1, 1]], dtype=np.int),
... np.array([[1],[2],[1]], dtype=np.int))
Out [4]:
array([[1, 1, 1],
[2, 2, 2],
[1, 1, 1]])
In [11]: N = 100
In [12]: f = np.arange(N*N, dtype=np.int).reshape((N,N))
In [13]: g = np.arange(81, dtype=np.int).reshape((9, 9))
In [19]: %timeit -n2 -r3 convolve_py.naive_convolve(f, g)
2 loops, best of 3: 1.86 s per loop
In [20]: %timeit -n2 -r3 convolve1.naive_convolve(f, g)
2 loops, best of 3: 1.41 s per loop
There's not such a huge difference yet; because the C code still does exactly
what the Python interpreter does (meaning, for instance, that a new object is
allocated for each number used). Look at the generated html file and see what
is needed for even the simplest statements you get the point quickly. We need
to give Cython more information; we need to add types.
Adding types
=============
To add types we use custom Cython syntax, so we are now breaking Python source
compatibility. Here's :file:`convolve2.pyx`. *Read the comments!* ::
from __future__ import division
import numpy as np
# "cimport" is used to import special compile-time information
# about the numpy module (this is stored in a file numpy.pxd which is
# currently part of the Cython distribution).
cimport numpy as np
# We now need to fix a datatype for our arrays. I've used the variable
# DTYPE for this, which is assigned to the usual NumPy runtime
# type info object.
DTYPE = np.int
# "ctypedef" assigns a corresponding compile-time type to DTYPE_t. For
# every type in the numpy module there's a corresponding compile-time
# type with a _t-suffix.
ctypedef np.int_t DTYPE_t
# The builtin min and max functions works with Python objects, and are
# so very slow. So we create our own.
# - "cdef" declares a function which has much less overhead than a normal
# def function (but it is not Python-callable)
# - "inline" is passed on to the C compiler which may inline the functions
# - The C type "int" is chosen as return type and argument types
# - Cython allows some newer Python constructs like "a if x else b", but
# the resulting C file compiles with Python 2.3 through to Python 3.0 beta.
cdef inline int int_max(int a, int b): return a if a >= b else b
cdef inline int int_min(int a, int b): return a if a <= b else b
# "def" can type its arguments but not have a return type. The type of the
# arguments for a "def" function is checked at run-time when entering the
# function.
#
# The arrays f, g and h is typed as "np.ndarray" instances. The only effect
# this has is to a) insert checks that the function arguments really are
# NumPy arrays, and b) make some attribute access like f.shape[0] much
# more efficient. (In this example this doesn't matter though.)
def naive_convolve(np.ndarray f, np.ndarray g):
if g.shape[0] % 2 != 1 or g.shape[1] % 2 != 1:
raise ValueError("Only odd dimensions on filter supported")
assert f.dtype == DTYPE and g.dtype == DTYPE
# The "cdef" keyword is also used within functions to type variables. It
# can only be used at the top indendation level (there are non-trivial
# problems with allowing them in other places, though we'd love to see
# good and thought out proposals for it).
#
# For the indices, the "int" type is used. This corresponds to a C int,
# other C types (like "unsigned int") could have been used instead.
# Purists could use "Py_ssize_t" which is the proper Python type for
# array indices.
cdef int vmax = f.shape[0]
cdef int wmax = f.shape[1]
cdef int smax = g.shape[0]
cdef int tmax = g.shape[1]
cdef int smid = smax // 2
cdef int tmid = tmax // 2
cdef int xmax = vmax + 2*smid
cdef int ymax = wmax + 2*tmid
cdef np.ndarray h = np.zeros([xmax, ymax], dtype=DTYPE)
cdef int x, y, s, t, v, w
# It is very important to type ALL your variables. You do not get any
# warnings if not, only much slower code (they are implicitly typed as
# Python objects).
cdef int s_from, s_to, t_from, t_to
# For the value variable, we want to use the same data type as is
# stored in the array, so we use "DTYPE_t" as defined above.
# NB! An important side-effect of this is that if "value" overflows its
# datatype size, it will simply wrap around like in C, rather than raise
# an error like in Python.
cdef DTYPE_t value
for x in range(xmax):
for y in range(ymax):
s_from = int_max(smid - x, -smid)
s_to = int_min((xmax - x) - smid, smid + 1)
t_from = int_max(tmid - y, -tmid)
t_to = int_min((ymax - y) - tmid, tmid + 1)
value = 0
for s in range(s_from, s_to):
for t in range(t_from, t_to):
v = x - smid + s
w = y - tmid + t
value += g[smid - s, tmid - t] * f[v, w]
h[x, y] = value
return h
At this point, have a look at the generated C code for :file:`convolve1.pyx` and
:file:`convolve2.pyx`. Click on the lines to expand them and see corresponding C.
(Note that this code annotation is currently experimental and especially
"trailing" cleanup code for a block may stick to the last expression in the
block and make it look worse than it is -- use some common sense).
* .. literalinclude: convolve1.html
* .. literalinclude: convolve2.html
Especially have a look at the for loops: In :file:`convolve1.c`, these are ~20 lines
of C code to set up while in :file:`convolve2.c` a normal C for loop is used.
After building this and continuing my (very informal) benchmarks, I get:
.. sourcecode:: ipython
In [21]: import convolve2
In [22]: %timeit -n2 -r3 convolve2.naive_convolve(f, g)
2 loops, best of 3: 828 ms per loop
Efficient indexing
====================
There's still a bottleneck killing performance, and that is the array lookups
and assignments. The ``[]``-operator still uses full Python operations --
what we would like to do instead is to access the data buffer directly at C
speed.
What we need to do then is to type the contents of the :obj:`ndarray` objects.
We do this with a special "buffer" syntax which must be told the datatype
(first argument) and number of dimensions ("ndim" keyword-only argument, if
not provided then one-dimensional is assumed).
More information on this syntax [:enhancements/buffer:can be found here].
Showing the changes needed to produce :file:`convolve3.pyx` only::
...
def naive_convolve(np.ndarray[DTYPE_t, ndim=2] f, np.ndarray[DTYPE_t, ndim=2] g):
...
cdef np.ndarray[DTYPE_t, ndim=2] h = ...
Usage:
.. sourcecode:: ipython
In [18]: import convolve3
In [19]: %timeit -n3 -r100 convolve3.naive_convolve(f, g)
3 loops, best of 100: 11.6 ms per loop
Note the importance of this change.
*Gotcha*: This efficient indexing only affects certain index operations,
namely those with exactly ``ndim`` number of typed integer indices. So if
``v`` for instance isn't typed, then the lookup ``f[v, w]`` isn't
optimized. On the other hand this means that you can continue using Python
objects for sophisticated dynamic slicing etc. just as when the array is not
typed.
Tuning indexing further
========================
The array lookups are still slowed down by two factors:
1. Bounds checking is performed.
2. Negative indices are checked for and handled correctly. The code above is
explicitly coded so that it doesn't use negative indices, and it
(hopefully) always access within bounds. We can add a decorator to disable
bounds checking::
...
cimport cython
@cython.boundscheck(False) # turn of bounds-checking for entire function
def naive_convolve(np.ndarray[DTYPE_t, ndim=2] f, np.ndarray[DTYPE_t, ndim=2] g):
...
Now bounds checking is not performed (and, as a side-effect, if you ''do''
happen to access out of bounds you will in the best case crash your program
and in the worst case corrupt data). It is possible to switch bounds-checking
mode in many ways, see [:docs/compilerdirectives:compiler directives] for more
information.
Negative indices are dealt with by ensuring Cython that the indices will be
positive, by casting the variables to unsigned integer types (if you do have
negative values, then this casting will create a very large positive value
instead and you will attempt to access out-of-bounds values). Casting is done
with a special ``<>``-syntax. The code below is changed to use either
unsigned ints or casting as appropriate::
...
cdef int s, t # changed
cdef unsigned int x, y, v, w # changed
cdef int s_from, s_to, t_from, t_to
cdef DTYPE_t value
for x in range(xmax):
for y in range(ymax):
s_from = max(smid - x, -smid)
s_to = min((xmax - x) - smid, smid + 1)
t_from = max(tmid - y, -tmid)
t_to = min((ymax - y) - tmid, tmid + 1)
value = 0
for s in range(s_from, s_to):
for t in range(t_from, t_to):
v = <unsigned int>(x - smid + s) # changed
w = <unsigned int>(y - tmid + t) # changed
value += g[<unsigned int>(smid - s), <unsigned int>(tmid - t)] * f[v, w] # changed
h[x, y] = value
...
(In the next Cython release we will likely add a compiler directive or
argument to the ``np.ndarray[]``-type specifier to disable negative indexing
so that casting so much isn't necessary; feedback on this is welcome.)
The function call overhead now starts to play a role, so we compare the latter
two examples with larger N:
.. sourcecode:: ipython
In [11]: %timeit -n3 -r100 convolve4.naive_convolve(f, g)
3 loops, best of 100: 5.97 ms per loop
In [12]: N = 1000
In [13]: f = np.arange(N*N, dtype=np.int).reshape((N,N))
In [14]: g = np.arange(81, dtype=np.int).reshape((9, 9))
In [17]: %timeit -n1 -r10 convolve3.naive_convolve(f, g)
1 loops, best of 10: 1.16 s per loop
In [18]: %timeit -n1 -r10 convolve4.naive_convolve(f, g)
1 loops, best of 10: 597 ms per loop
(Also this is a mixed benchmark as the result array is allocated within the
function call.)
.. Warning::
Speed comes with some cost. Especially it can be dangerous to set typed
objects (like ``f``, ``g`` and ``h`` in our sample code) to :keyword:`None`.
Setting such objects to :keyword:`None` is entirely legal, but all you can do with them
is check whether they are None. All other use (attribute lookup or indexing)
can potentially segfault or corrupt data (rather than raising exceptions as
they would in Python).
The actual rules are a bit more complicated but the main message is clear: Do
not use typed objects without knowing that they are not set to None.
More generic code
==================
It would be possible to do::
def naive_convolve(object[DTYPE_t, ndim=2] f, ...):
i.e. use :obj:`object` rather than :obj:`np.ndarray`. Under Python 3.0 this
can allow your algorithm to work with any libraries supporting the buffer
interface; and support for e.g. the Python Imaging Library may easily be added
if someone is interested also under Python 2.x.
There is some speed penalty to this though (as one makes more assumptions
compile-time if the type is set to :obj:`np.ndarray`, specifically it is
assumed that the data is stored in pure strided more and not in indirect
mode).
[:enhancements/buffer:More information]
The future
============
These are some points to consider for further development. All points listed
here has gone through a lot of thinking and planning already; still they may
or may not happen depending on available developer time and resources for
Cython.
1. Support for efficient access to structs/records stored in arrays; currently
only primitive types are allowed.
2. Support for efficient access to complex floating point types in arrays. The
main obstacle here is getting support for efficient complex datatypes in
Cython.
3. Calling NumPy/SciPy functions currently has a Python call overhead; it
would be possible to take a short-cut from Cython directly to C. (This does
however require some isolated and incremental changes to those libraries;
mail the Cython mailing list for details).
4. Efficient code that is generic with respect to the number of dimensions.
This can probably be done today by calling the NumPy C multi-dimensional
iterator API directly; however it would be nice to have for-loops over
:func:`enumerate` and :func:`ndenumerate` on NumPy arrays create efficient
code.
5. A high-level construct for writing type-generic code, so that one can write
functions that work simultaneously with many datatypes. Note however that a
macro preprocessor language can help with doing this for now.
.. highlight:: cython
.. _overview:
********
Overview
********
About Cython
==============
Cython is a language that makes writing C extensions for the Python language
as easy as Python itself. Cython is based on the well-known `Pyrex
<http://www.cosc.canterbury.ac.nz/greg.ewing/python/Pyrex/>`_ language by Greg Ewing,
but supports more cutting edge functionality and optimizations [#]_.
The Cython language is very close to the Python language, but Cython
additionally supports calling C functions and declaring C types on variables
and class attributes. This allows the compiler to generate very efficient C
code from Cython code.
This makes Cython the ideal language for wrapping external C libraries,
and for fast C modules that speed up the execution of Python code.
Future Plans
============
Cython is not finished. Substantial tasks remaining. See
:ref:`cython-limitations` for a current list.
.. rubric:: Footnotes
.. [#] For differences with Pyrex see :ref:`pyrex-differences`.
I think this is a result of a recent change to Pyrex that
has been merged into Cython.
If a directory contains an :file:`__init__.py` or :file:`__init__.pyx` file,
it's now assumed to be a package directory. So, for example,
if you have a directory structure::
foo/
__init__.py
shrubbing.pxd
shrubbing.pyx
then the shrubbing module is assumed to belong to a package
called 'foo', and its fully qualified module name is
'foo.shrubbing'.
So when Pyrex wants to find out whether there is a `.pxd` file for shrubbing,
it looks for one corresponding to a module called `foo.shrubbing`. It
does this by searching the include path for a top-level package directory
called 'foo' containing a file called 'shrubbing.pxd'.
However, if foo is the current directory you're running
the compiler from, and you haven't added foo to the
include path using a -I option, then it won't be on
the include path, and the `.pxd` won't be found.
What to do about this depends on whether you really
intend the module to reside in a package.
If you intend shrubbing to be a top-level module, you
will have to move it somewhere else where there is
no :file:`__init__.*` file.
If you do intend it to reside in a package, then there
are two alternatives:
1. cd to the directory containing foo and compile
from there::
cd ..; cython foo/shrubbing.pyx
2. arrange for the directory containing foo to be
passed as a -I option, e.g.::
cython -I .. shrubbing.pyx
Arguably this behaviour is not very desirable, and I'll
see if I can do something about it.
.. highlight:: cython
.. _pyrex-differences:
**************************************
Differences between Cython and Pyrex
**************************************
.. warning::
Both Cython and Pyrex are moving targets. It has come to the point
that an explicit list of all the differences between the two
projects would be laborious to list and track, but hopefully
this high-level list gives an idea of the differences that
are present. It should be noted that both projects make an effort
at mutual compatibility, but Cython's goal is to be as close to
and complete as Python as reasonable.
Python 3.0 Support
==================
Cython creates ``.c`` files that can be built and used with both
Python 2.x and Python 3.x. In fact, compiling your module with
Cython may very well be the easiest way to port code to Python 3.0.
We are also working to make the compiler run in both Python 2.x and 3.0.
Many Python 3 constructs are already supported by Cython.
List/Set/Dict Comprehensions
----------------------------
Cython supports the different comprehensions defined by Python 3.0 for
lists, sets and dicts::
[expr(x) for x in A] # list
{expr(x) for x in A} # set
{key(x) : value(x) for x in A} # dict
Looping is optimized if ``A`` is a list, tuple or dict. You can use
the :keyword:`for` ... :keyword:`from` syntax, too, but it is
generally preferred to use the usual :keyword:`for` ... :keyword:`in`
``range(...)`` syntax with a C run variable (e.g. ``cdef int i``).
.. note:: see :ref:`automatic-range-conversion`
Note that Cython also supports set literals starting from Python 2.3.
Keyword-only arguments
----------------------
Python functions can have keyword-only arguments listed after the ``*``
parameter and before the ``**`` parameter if any, e.g.::
def f(a, b, *args, c, d = 42, e, **kwds):
...
Here ``c``, ``d`` and ``e`` cannot be passed as position arguments and must be
passed as keyword arguments. Furthermore, ``c`` and ``e`` are required keyword
arguments, since they do not have a default value.
If the parameter name after the ``*`` is omitted, the function will not accept any
extra positional arguments, e.g.::
def g(a, b, *, c, d):
...
takes exactly two positional parameters and has two required keyword parameters.
Conditional expressions "x if b else y" (python 2.5)
=====================================================
Conditional expressions as described in
http://www.python.org/dev/peps/pep-0308/::
X if C else Y
Only one of ``X`` and ``Y`` is evaluated, (depending on the value of C).
cdef inline
=============
Module level functions can now be declared inline, with the :keyword:`inline`
keyword passed on to the C compiler. These can be as fast as macros.::
cdef inline int something_fast(int a, int b):
return a*a + b
Note that class-level :keyword:`cdef` functions are handled via a virtual
function table, so the compiler won't be able to inline them in almost all
cases.
Assignment on declaration (e.g. "cdef int spam = 5")
======================================================
In Pyrex, one must write::
cdef int i, j, k
i = 2
j = 5
k = 7
Now, with cython, one can write::
cdef int i = 2, j = 5, k = 7
The expression on the right hand side can be arbitrarily complicated, e.g.::
cdef int n = python_call(foo(x,y), a + b + c) - 32
'by' expression in for loop (e.g. "for i from 0 <= i < 10 by 2")
==================================================================
::
for i from 0 <= i < 10 by 2:
print i
yields::
0
2
4
6
8
.. note:: see :ref:`automatic-range-conversion`
Boolean int type (e.g. it acts like a c int, but coerces to/from python as a boolean)
======================================================================================
In C, ints are used for truth values. In python, any object can be used as a
truth value (using the :meth:`__nonzero__` method, but the canonical choices
are the two boolean objects ``True`` and ``False``. The :keyword:`bint` of
"boolean int" object is compiled to a C int, but get coerced to and from
Cython as booleans. The return type of comparisons and several builtins is a
:ctype:`bint` as well. This allows one to avoid having to wrap things in
:func:`bool()`. For example, one can write::
def is_equal(x):
return x == y
which would return ``1`` or ``0`` in Pyrex, but returns ``True`` or ``False`` in
python. One can declare variables and return values for functions to be of the
:ctype:`bint` type. For example::
cdef int i = x
cdef bint b = x
The first conversion would happen via ``x.__int__()`` whereas the second would
happen via ``x.__nonzero__()``. (Actually, if ``x`` is the python object
``True`` or ``False`` then no method call is made.)
Executable class bodies
=======================
Including a working :func:`classmethod`::
cdef class Blah:
def some_method(self):
print self
some_method = classmethod(some_method)
a = 2*3
print "hi", a
cpdef functions
=================
Cython adds a third function type on top of the usual :keyword:`def` and
:keyword:`cdef`. If a function is declared :keyword:`cpdef` it can be called
from and overridden by both extension and normal python subclasses. You can
essentially think of a :keyword:`cpdef` method as a :keyword:`cdef` method +
some extras. (That's how it's implemented at least.) First, it creates a
:keyword:`def` method that does nothing but call the underlying
:keyword:`cdef` method (and does argument unpacking/coercion if needed). At
the top of the :keyword:`cdef` method a little bit of code is added to check
to see if it's overridden. Specifically, in pseudocode::
if type(self) has a __dict__:
foo = self.getattr('foo')
if foo is not wrapper_foo:
return foo(args)
[cdef method body]
To detect whether or not a type has a dictionary, it just checks the
tp_dictoffset slot, which is ``NULL`` (by default) for extension types, but
non- null for instance classes. If the dictionary exists, it does a single
attribute lookup and can tell (by comparing pointers) whether or not the
returned result is actually a new function. If, and only if, it is a new
function, then the arguments packed into a tuple and the method called. This
is all very fast. A flag is set so this lookup does not occur if one calls the
method on the class directly, e.g.::
cdef class A:
cpdef foo(self):
pass
x = A()
x.foo() # will check to see if overridden
A.foo(x) # will call A's implementation whether overridden or not
See :ref:`early-binding-for-speed` for explanation and usage tips.
.. _automatic-range-conversion:
Automatic range conversion
============================
This will convert statements of the form ``for i in range(...)`` to ``for i
from ...`` when ``i`` is any cdef'd integer type, and the direction (i.e. sign
of step) can be determined.
.. warning::
This may change the semantics if the range causes
assignment to ``i`` to overflow. Specifically, if this option is set, an error
will be raised before the loop is entered, whereas without this option the loop
will execute until a overflowing value is encountered. If this effects you
change ``Cython/Compiler/Options.py`` (eventually there will be a better
way to set this).
More friendly type casting
===========================
In Pyrex, if one types ``<int>x`` where ``x`` is a Python object, one will get
the memory address of ``x``. Likewise, if one types ``<object>i`` where ``i``
is a C int, one will get an "object" at location ``i`` in memory. This leads
to confusing results and segfaults.
In Cython ``<type>x`` will try and do a coercion (as would happen on assignment of
``x`` to a variable of type type) if exactly one of the types is a python object.
It does not stop one from casting where there is no conversion (though it will
emit a warning). If one really wants the address, cast to a ``void *`` first.
As in Pyrex ``<MyExtensionType>x`` will cast ``x`` to type :ctype:`MyExtensionType` without any
type checking. Cython supports the syntax ``<MyExtensionType?>`` to do the cast
with type checking (i.e. it will throw an error if ``x`` is not a (subclass of)
:ctype:`MyExtensionType`.
Optional arguments in cdef/cpdef functions
============================================
Cython now supports optional arguments for :keyword:`cdef` and
:keyword:`cpdef` functions.
The syntax in the ``.pyx`` file remains as in Python, but one declares such
functions in the ``.pxd`` file by writing ``cdef foo(x=*)``. The number of
arguments may increase on subclassing, but the argument types and order must
remain the same. There is a slight performance penalty in some cases when a
cdef/cpdef function without any optional is overridden with one that does have
default argument values.
For example, one can have the ``.pxd`` file::
cdef class A:
cdef foo(self)
cdef class B(A)
cdef foo(self, x=*)
cdef class C(B):
cpdef foo(self, x=*, int k=*)
with corresponding ``.pyx`` file::
cdef class A:
cdef foo(self):
print "A"
cdef class B(A)
cdef foo(self, x=None)
print "B", x
cdef class C(B):
cpdef foo(self, x=True, int k=3)
print "C", x, k
.. note::
this also demonstrates how :keyword:`cpdef` functions can override
:keyword:`cdef` functions.
Function pointers in structs
=============================
Functions declared in :keyword:`structs` are automatically converted to
function pointers for convenience.
C++ Exception handling
=========================
:keyword:`cdef` functions can now be declared as::
cdef int foo(...) except +
cdef int foo(...) except +TypeError
cdef int foo(...) except +python_error_raising_function
in which case a Python exception will be raised when a C++ error is caught.
See :ref:`wrapping-cplusplus` for more details.
Synonyms
=========
``cdef import from`` means the same thing as ``cdef extern from``
Source code encoding
======================
.. TODO: add the links to the relevent PEPs
Cython supports PEP 3120 and PEP 263, i.e. you can start your Cython source
file with an encoding comment and generally write your source code in UTF-8.
This impacts the encoding of byte strings and the conversion of unicode string
literals like ``u'abcd'`` to unicode objects.
Automatic ``typecheck``
========================
Rather than introducing a new keyword :keyword:`typecheck` as explained in the
`Pyrex docs
<http://www.cosc.canterbury.ac.nz/greg.ewing/python/Pyrex/version/Doc/Manual/special_methods.html>`_,
Cython emits a (non-spoofable and faster) typecheck whenever
:func:`isinstance` is used with an extension type as the second parameter.
From __future__ directives
==========================
Cython supports several from __future__ directives, namely ``unicode_literals`` and ``division``.
With statements are always enabled.
Pure Python mode
================
Cython has support for compiling ``.py`` files, and
accepting type annotations using decorators and other
valid Python syntax. This allows the same source to
be interpreted as straight Python, or compiled for
optimized results.
See http://wiki.cython.org/pure
for more details.
.. highlight:: cython
.. _sharing-declarations:
********************************************
Sharing Declarations Between Cython Modules
********************************************
This section describes a new set of facilities for making C declarations,
functions and extension types in one Cython module available for use in
another Cython module. These facilities are closely modelled on the Python
import mechanism, and can be thought of as a compile-time version of it.
Definition and Implementation files
====================================
A Cython module can be split into two parts: a definition file with a ``.pxd``
suffix, containing C declarations that are to be available to other Cython
modules, and an implementation file with a ``.pyx`` suffix, containing
everything else. When a module wants to use something declared in another
module's definition file, it imports it using the :keyword:`cimport`
statement.
A ``.pxd`` file that consists solely of extern declarations does not need
to correspond to an actual ``.pyx`` file or Python module. This can make it a
convenient place to put common declarations, for example declarations of
functions from an :ref:`external library <external-C-code>` that one wants to use in several modules.
What a Definition File contains
================================
A definition file can contain:
* Any kind of C type declaration.
* extern C function or variable declarations.
* Declarations of C functions defined in the module.
* The definition part of an extension type (see below).
It cannot contain any non-extern C variable declarations.
It cannot contain the implementations of any C or Python functions, or any
Python class definitions, or any executable statements. It is needed when one
wants to access :keyword:`cdef` attributes and methods, or to inherit from
:keyword:`cdef` classes defined in this module.
.. note::
You don't need to (and shouldn't) declare anything in a declaration file
public in order to make it available to other Cython modules; its mere
presence in a definition file does that. You only need a public
declaration if you want to make something available to external C code.
What an Implementation File contains
======================================
An implementation file can contain any kind of Cython statement, although there
are some restrictions on the implementation part of an extension type if the
corresponding definition file also defines that type (see below).
If one doesn't need to :keyword:`cimport` anything from this module, then this
is the only file one needs.
The cimport statement
=======================
The :keyword:`cimport` statement is used in a definition or
implementation file to gain access to names declared in another definition
file. Its syntax exactly parallels that of the normal Python import
statement::
cimport module [, module...]
from module cimport name [as name] [, name [as name] ...]
Here is an example. The file on the left is a definition file which exports a
C data type. The file on the right is an implementation file which imports and
uses it.
:file:`dishes.pxd`::
cdef enum otherstuff:
sausage, eggs, lettuce
cdef struct spamdish:
int oz_of_spam
otherstuff filler
:file:`restaurant.pyx`::
cimport dishes
from dishes cimport spamdish
cdef void prepare(spamdish *d):
d.oz_of_spam = 42
d.filler = dishes.sausage
def serve():
cdef spamdish d
prepare(&d)
print "%d oz spam, filler no. %d" % (d.oz_of_spam, d.otherstuff)
It is important to understand that the :keyword:`cimport` statement can only
be used to import C data types, C functions and variables, and extension
types. It cannot be used to import any Python objects, and (with one
exception) it doesn't imply any Python import at run time. If you want to
refer to any Python names from a module that you have cimported, you will have
to include a regular import statement for it as well.
The exception is that when you use :keyword:`cimport` to import an extension type, its
type object is imported at run time and made available by the name under which
you imported it. Using :keyword:`cimport` to import extension types is covered in more
detail below.
If a ``.pxd`` file changes, any modules that :keyword:`cimport` from it may need to be
recompiled.
Search paths for definition files
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
When you :keyword:`cimport` a module called ``modulename``, the Cython
compiler searches for a file called :file:`modulename.pxd` along the search
path for include files, as specified by ``-I`` command line options.
Also, whenever you compile a file :file:`modulename.pyx`, the corresponding
definition file :file:`modulename.pxd` is first searched for along the same
path, and if found, it is processed before processing the ``.pyx`` file.
Using cimport to resolve naming conflicts
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The :keyword:`cimport` mechanism provides a clean and simple way to solve the
problem of wrapping external C functions with Python functions of the same
name. All you need to do is put the extern C declarations into a ``.pxd`` file
for an imaginary module, and :keyword:`cimport` that module. You can then
refer to the C functions by qualifying them with the name of the module.
Here's an example:
:file:`c_lunch.pxd` ::
cdef extern from "lunch.h":
void eject_tomato(float)
:file:`lunch.pyx` ::
cimport c_lunch
def eject_tomato(float speed):
c_lunch.eject_tomato(speed)
You don't need any :file:`c_lunch.pyx` file, because the only things defined
in :file:`c_lunch.pxd` are extern C entities. There won't be any actual
``c_lunch`` module at run time, but that doesn't matter; the
:file:`c_lunch.pxd` file has done its job of providing an additional namespace
at compile time.
Sharing C Functions
===================
C functions defined at the top level of a module can be made available via
:keyword:`cimport` by putting headers for them in the ``.pxd`` file, for
example,:
:file:`volume.pxd`::
cdef float cube(float)
:file:`spammery.pyx`::
from volume cimport cube
def menu(description, size):
print description, ":", cube(size), \
"cubic metres of spam"
menu("Entree", 1)
menu("Main course", 3)
menu("Dessert", 2)
:file:`volume.pyx`::
cdef float cube(float x):
return x * x * x
.. note::
When a module exports a C function in this way, an object appears in the
module dictionary under the function's name. However, you can't make use of
this object from Python, nor can you use it from Cython using a normal import
statement; you have to use :keyword:`cimport`.
Sharing Extension Types
=======================
An extension type can be made available via :keyword:`cimport` by splitting
its definition into two parts, one in a definition file and the other in the
corresponding implementation file.
The definition part of the extension type can only declare C attributes and C
methods, not Python methods, and it must declare all of that type's C
attributes and C methods.
The implementation part must implement all of the C methods declared in the
definition part, and may not add any further C attributes. It may also define
Python methods.
Here is an example of a module which defines and exports an extension type,
and another module which uses it.::
# Shrubbing.pxd
cdef class Shrubbery:
cdef int width
cdef int length
# Shrubbing.pyx
cdef class Shrubbery:
def __cinit__(self, int w, int l):
self.width = w
self.length = l
def standard_shrubbery():
return Shrubbery(3, 7)
# Landscaping.pyx
cimport Shrubbing
import Shrubbing
cdef Shrubbing.Shrubbery sh
sh = Shrubbing.standard_shrubbery()
print "Shrubbery size is %d x %d" % (sh.width, sh.height)
Some things to note about this example:
* There is a :keyword:`cdef` class Shrubbery declaration in both
:file:`Shrubbing.pxd` and :file:`Shrubbing.pyx`. When the Shrubbing module
is compiled, these two declarations are combined into one.
* In Landscaping.pyx, the :keyword:`cimport` Shrubbing declaration allows us
to refer to the Shrubbery type as :class:`Shrubbing.Shrubbery`. But it
doesn't bind the name Shrubbing in Landscaping's module namespace at run
time, so to access :func:`Shrubbing.standard_shrubbery` we also need to
``import Shrubbing``.
.. highlight:: cython
.. _compilation:
****************************
Source Files and Compilation
****************************
Cython source file names consist of the name of the module followed by a
``.pyx`` extension, for example a module called primes would have a source
file named :file:`primes.pyx`.
Once you have written your ``.pyx`` file, there are a couple of ways of turning it
into an extension module. One way is to compile it manually with the Cython
compiler, e.g.:
.. sourcecode:: text
$ cython primes.pyx
This will produce a file called :file:`primes.c`, which then needs to be
compiled with the C compiler using whatever options are appropriate on your
platform for generating an extension module. For these options look at the
official Python documentation.
The other, and probably better, way is to use the :mod:`distutils` extension
provided with Cython. The benifit of this method is that it will give the
platform specific compilation options, acting like a stripped down autotools.
Basic setup.py
===============
The distutils extension provided with Cython allows you to pass ``.pyx`` files
directly to the ``Extension`` constructor in your setup file.
If you have a single Cython file that you want to turn into a compiled
extension, say with filename :file:`example.pyx` the associated :file:`setup.py`
would be::
from distutils.core import setup
from distutils.extension import Extension
from Cython.Distutils import build_ext
setup(
cmdclass = {'build_ext': build_ext},
ext_modules = [Extension("example", ["example.pyx"])]
)
To understand the :file:`setup.py` more fully look at the official
:mod:`distutils` documentation. To compile the extension for use in the
current directory use:
.. sourcecode:: text
$ python setup.py build_ext --inplace
Cython Files Depending on C Files
===================================
When you have come C files that have been wrapped with cython and you want to
compile them into your extension the basic :file:`setup.py` file to do this
would be::
from distutils.core import setup
from distutils.extension import Extension
from Cython.Distutils import build_ext
sourcefiles = ['example.pyx', 'helper.c', 'another_helper.c']
setup(
cmdclass = {'build_ext': build_ext},
ext_modules = [Extension("example", sourcefiles)]
)
Notice that the files have been given a name, this is not necessary, but it
makes the file easier to format if the list gets long.
The :class:`Extension` class takes many options, and a fuller explanation can
be found in the `distutils documentation`_. Some useful options to know about
are ``include_dirs``, ``libraries``, and ``library_dirs`` which specify where
to find the ``.h`` and library files when linking to external libraries.
.. _distutils documentation: http://docs.python.org/extending/building.html
Multiple Cython Files in a Package
===================================
TODO
Distributing Cython modules
============================
It is strongly recommended that you distribute the generated ``.c`` files as well
as your Cython sources, so that users can install your module without needing
to have Cython available.
It is also recommended that Cython compilation not be enabled by default in the
version you distribute. Even if the user has Cython installed, he probably
doesn't want to use it just to install your module. Also, the version he has
may not be the same one you used, and may not compile your sources correctly.
This simply means that the :file:`setup.py` file that you ship with will just
be a normal distutils file on the generated `.c` files, for the basic example
we would have instead::
from distutils.core import setup
from distutils.extension import Extension
setup(
ext_modules = [Extension("example", ["example.c"])]
)
.. _pyximport:
Pyximport
===========
.. TODO add some text about how this is Paul Prescods code. Also change the
tone to be more universal (i.e. remove all the I statements)
Cython is a compiler. Therefore it is natural that people tend to go
through an edit/compile/test cycle with Cython modules. But my personal
opinion is that one of the deep insights in Python's implementation is
that a language can be compiled (Python modules are compiled to ``.pyc``)
files and hide that compilation process from the end-user so that they
do not have to worry about it. Pyximport does this for Cython modules.
For instance if you write a Cython module called :file:`foo.pyx`, with
Pyximport you can import it in a regular Python module like this::
import pyximport; pyximport.install()
import foo
Doing so will result in the compilation of :file:`foo.pyx` (with appropriate
exceptions if it has an error in it).
If you would always like to import Cython files without building them
specially, you can also the first line above to your :file:`sitecustomize.py`.
That will install the hook every time you run Python. Then you can use
Cython modules just with simple import statements. I like to test my
Cython modules like this:
.. sourcecode:: text
$ python -c "import foo"
Dependency Handling
--------------------
In Pyximport 1.1 it is possible to declare that your module depends on
multiple files, (likely ``.h`` and ``.pxd`` files). If your Cython module is
named ``foo`` and thus has the filename :file:`foo.pyx` then you should make
another file in the same directory called :file:`foo.pyxdep`. The
:file:`modname.pyxdep` file can be a list of filenames or "globs" (like
``*.pxd`` or ``include/*.h``). Each filename or glob must be on a separate
line. Pyximport will check the file date for each of those files before
deciding whether to rebuild the module. In order to keep track of the
fact that the dependency has been handled, Pyximport updates the
modification time of your ".pyx" source file. Future versions may do
something more sophisticated like informing distutils of the
dependencies directly.
Limitations
------------
Pyximport does not give you any control over how your Cython file is
compiled. Usually the defaults are fine. You might run into problems if
you wanted to write your program in half-C, half-Cython and build them
into a single library. Pyximport 1.2 will probably do this.
Pyximport does not hide the Distutils/GCC warnings and errors generated
by the import process. Arguably this will give you better feedback if
something went wrong and why. And if nothing went wrong it will give you
the warm fuzzy that pyximport really did rebuild your module as it was
supposed to.
For further thought and discussion
------------------------------------
I don't think that Python's :func:`reload` will do anything for changed
``.so``'s on some (all?) platforms. It would require some (easy)
experimentation that I haven't gotten around to. But reload is rarely used in
applications outside of the Python interactive interpreter and certainly not
used much for C extension modules. Info about Windows
`<http://mail.python.org/pipermail/python-list/2001-July/053798.html>`_
``setup.py install`` does not modify :file:`sitecustomize.py` for you. Should it?
Modifying Python's "standard interpreter" behaviour may be more than
most people expect of a package they install..
Pyximport puts your ``.c`` file beside your ``.pyx`` file (analogous to
``.pyc`` beside ``.py``). But it puts the platform-specific binary in a
build directory as per normal for Distutils. If I could wave a magic
wand and get Cython or distutils or whoever to put the build directory I
might do it but not necessarily: having it at the top level is *VERY*
*HELPFUL* for debugging Cython problems.
.. _special-methods:
Special Methods of Extension Types
===================================
This page describes the special methods currently supported by Cython extension
types. A complete list of all the special methods appears in the table at the
bottom. Some of these methods behave differently from their Python
counterparts or have no direct Python counterparts, and require special
mention.
.. Note: Everything said on this page applies only to extension types, defined
with the :keyword:`cdef class` statement. It doesn't apply to classes defined with the
Python :keyword:`class` statement, where the normal Python rules apply.
Declaration
------------
Special methods of extension types must be declared with :keyword:`def`, not
:keyword:`cdef`. This does not impact their performance--Python uses different
calling conventions to invoke these special methods.
Docstrings
-----------
Currently, docstrings are not fully supported in some special methods of extension
types. You can place a docstring in the source to serve as a comment, but it
won't show up in the corresponding :attr:`__doc__` attribute at run time. (This
seems to be is a Python limitation -- there's nowhere in the `PyTypeObject`
data structure to put such docstrings.)
Initialisation methods: :meth:`__cinit__` and :meth:`__init__`
---------------------------------------------------------------
There are two methods concerned with initialising the object.
The :meth:`__cinit__` method is where you should perform basic C-level
initialisation of the object, including allocation of any C data structures
that your object will own. You need to be careful what you do in the
:meth:`__cinit__` method, because the object may not yet be fully valid Python
object when it is called. Therefore, you should be careful invoking any Python
operations which might touch the object; in particular, its methods.
By the time your :meth:`__cinit__` method is called, memory has been allocated for the
object and any C attributes it has have been initialised to 0 or null. (Any
Python attributes have also been initialised to None, but you probably
shouldn't rely on that.) Your :meth:`__cinit__` method is guaranteed to be called
exactly once.
If your extension type has a base type, the :meth:`__cinit__` method of the base type
is automatically called before your :meth:`__cinit__` method is called; you cannot
explicitly call the inherited :meth:`__cinit__` method. If you need to pass a modified
argument list to the base type, you will have to do the relevant part of the
initialisation in the :meth:`__init__` method instead (where the normal rules for
calling inherited methods apply).
Any initialisation which cannot safely be done in the :meth:`__cinit__` method should
be done in the :meth:`__init__` method. By the time :meth:`__init__` is called, the object is
a fully valid Python object and all operations are safe. Under some
circumstances it is possible for :meth:`__init__` to be called more than once or not
to be called at all, so your other methods should be designed to be robust in
such situations.
Any arguments passed to the constructor will be passed to both the
:meth:`__cinit__` method and the :meth:`__init__` method. If you anticipate
subclassing your extension type in Python, you may find it useful to give the
:meth:`__cinit__` method `*` and `**` arguments so that it can accept and
ignore extra arguments. Otherwise, any Python subclass which has an
:meth:`__init__` with a different signature will have to override
:meth:`__new__`[#] as well as :meth:`__init__`, which the writer of a Python
class wouldn't expect to have to do. Alternatively, as a convenience, if you declare
your :meth:`__cinit__`` method to take no arguments (other than self) it
will simply ignore any extra arguments passed to the constructor without
complaining about the signature mismatch.
.. Note: Older Cython files may use :meth:`__new__` rather than :meth:`__cinit__`. The two are synonyms.
The name change from :meth:`__new__` to :meth:`__cinit__` was to avoid
confusion with Python :meth:`__new__` (which is an entirely different
concept) and eventually the use of :meth:`__new__` in Cython will be
disallowed to pave the way for supporting Python-style :meth:`__new__`
.. [#] http://docs.python.org/reference/datamodel.html#object.__new__
Finalization method: :meth:`__dealloc__`
----------------------------------------
The counterpart to the :meth:`__cinit__` method is the :meth:`__dealloc__`
method, which should perform the inverse of the :meth:`__cinit__` method. Any
C data that you explicitly allocated (e.g. via malloc) in your
:meth:`__cinit__` method should be freed in your :meth:`__dealloc__` method.
You need to be careful what you do in a :meth:`__dealloc__` method. By the time your
:meth:`__dealloc__` method is called, the object may already have been partially
destroyed and may not be in a valid state as far as Python is concerned, so
you should avoid invoking any Python operations which might touch the object.
In particular, don't call any other methods of the object or do anything which
might cause the object to be resurrected. It's best if you stick to just
deallocating C data.
You don't need to worry about deallocating Python attributes of your object,
because that will be done for you by Cython after your :meth:`__dealloc__` method
returns.
.. Note: There is no :meth:`__del__` method for extension types.
Arithmetic methods
-------------------
Arithmetic operator methods, such as :meth:`__add__`, behave differently from their
Python counterparts. There are no separate "reversed" versions of these
methods (:meth:`__radd__`, etc.) Instead, if the first operand cannot perform the
operation, the same method of the second operand is called, with the operands
in the same order.
This means that you can't rely on the first parameter of these methods being
"self" or being the right type, and you should test the types of both operands
before deciding what to do. If you can't handle the combination of types you've
been given, you should return `NotImplemented`.
This also applies to the in-place arithmetic method :meth:`__ipow__`. It doesn't apply
to any of the other in-place methods (:meth:`__iadd__`, etc.) which always
take `self` as the first argument.
Rich comparisons
-----------------
There are no separate methods for the individual rich comparison operations
(:meth:`__eq__`, :meth:`__le__`, etc.) Instead there is a single method
:meth:`__richcmp__` which takes an integer indicating which operation is to be
performed, as follows:
+-----+-----+
| < | 0 |
+-----+-----+
| == | 2 |
+-----+-----+
| > | 4 |
+-----+-----+
| <= | 1 |
+-----+-----+
| != | 3 |
+-----+-----+
| >= | 5 |
+-----+-----+
The :meth:`__next__` method
----------------------------
Extension types wishing to implement the iterator interface should define a
method called :meth:`__next__`, not next. The Python system will automatically
supply a next method which calls your :meth:`__next__`. Do *NOT* explicitly
give your type a :meth:`next` method, or bad things could happen.
Special Method Table
---------------------
This table lists all of the special methods together with their parameter and
return types. In the table below, a parameter name of self is used to indicate
that the parameter has the type that the method belongs to. Other parameters
with no type specified in the table are generic Python objects.
You don't have to declare your method as taking these parameter types. If you
declare different types, conversions will be performed as necessary.
General
^^^^^^^
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| Name | Parameters | Return type | Description |
+=======================+=======================================+=============+=====================================================+
| __cinit__ |self, ... | | Basic initialisation (no direct Python equivalent) |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __init__ |self, ... | | Further initialisation |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __dealloc__ |self | | Basic deallocation (no direct Python equivalent) |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __cmp__ |x, y | int | 3-way comparison |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __richcmp__ |x, y, int op | object | Rich comparison (no direct Python equivalent) |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __str__ |self | object | str(self) |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __repr__ |self | object | repr(self) |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __hash__ |self | int | Hash function |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __call__ |self, ... | object | self(...) |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __iter__ |self | object | Return iterator for sequence |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __getattr__ |self, name | object | Get attribute |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __setattr__ |self, name, val | | Set attribute |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __delattr__ |self, name | | Delete attribute |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
Arithmetic operators
^^^^^^^^^^^^^^^^^^^^
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| Name | Parameters | Return type | Description |
+=======================+=======================================+=============+=====================================================+
| __add__ | x, y | object | binary `+` operator |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __sub__ | x, y | object | binary `-` operator |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __mul__ | x, y | object | `*` operator |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __div__ | x, y | object | `/` operator for old-style division |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __floordiv__ | x, y | object | `//` operator |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __truediv__ | x, y | object | `/` operator for new-style division |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __mod__ | x, y | object | `%` operator |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __divmod__ | x, y | object | combined div and mod |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __pow__ | x, y, z | object | `**` operator or pow(x, y, z) |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __neg__ | self | object | unary `-` operator |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __pos__ | self | object | unary `+` operator |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __abs__ | self | object | absolute value |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __nonzero__ | self | int | convert to boolean |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __invert__ | self | object | `~` operator |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __lshift__ | x, y | object | `<<` operator |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __rshift__ | x, y | object | `>>` operator |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __and__ | x, y | object | `&` operator |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __or__ | x, y | object | `|` operator |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __xor__ | x, y | object | `^` operator |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
Numeric conversions
^^^^^^^^^^^^^^^^^^^
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| Name | Parameters | Return type | Description |
+=======================+=======================================+=============+=====================================================+
| __int__ | self | object | Convert to integer |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __long__ | self | object | Convert to long integer |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __float__ | self | object | Convert to float |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __oct__ | self | object | Convert to octal |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __hex__ | self | object | Convert to hexadecimal |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __index__ (2.5+ only) | self | object | Convert to sequence index |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
In-place arithmetic operators
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| Name | Parameters | Return type | Description |
+=======================+=======================================+=============+=====================================================+
| __iadd__ | self, x | object | `+=` operator |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __isub__ | self, x | object | `-=` operator |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __imul__ | self, x | object | `*=` operator |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __idiv__ | self, x | object | `/=` operator for old-style division |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __ifloordiv__ | self, x | object | `//=` operator |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __itruediv__ | self, x | object | `/=` operator for new-style division |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __imod__ | self, x | object | `%=` operator |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __ipow__ | x, y, z | object | `**=` operator |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __ilshift__ | self, x | object | `<<=` operator |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __irshift__ | self, x | object | `>>=` operator |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __iand__ | self, x | object | `&=` operator |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __ior__ | self, x | object | `|=` operator |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __ixor__ | self, x | object | `^=` operator |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
Sequences and mappings
^^^^^^^^^^^^^^^^^^^^^^
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| Name | Parameters | Return type | Description |
+=======================+=======================================+=============+=====================================================+
| __len__ | self int | | len(self) |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __getitem__ | self, x | object | self[x] |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __setitem__ | self, x, y | | self[x] = y |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __delitem__ | self, x | | del self[x] |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __getslice__ | self, Py_ssize_t i, Py_ssize_t j | object | self[i:j] |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __setslice__ | self, Py_ssize_t i, Py_ssize_t j, x | | self[i:j] = x |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __delslice__ | self, Py_ssize_t i, Py_ssize_t j | | del self[i:j] |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __contains__ | self, x | int | x in self |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
Iterators
^^^^^^^^^
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| Name | Parameters | Return type | Description |
+=======================+=======================================+=============+=====================================================+
| __next__ | self | object | Get next item (called next in Python) |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
Buffer interface [PEP 3118] (no Python equivalents - see note 1)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| Name | Parameters | Return type | Description |
+=======================+=======================================+=============+=====================================================+
| __getbuffer__ | self, Py_buffer `*view`, int flags | | |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __releasebuffer__ | self, Py_buffer `*view` | | |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
Buffer interface [legacy] (no Python equivalents - see note 1)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| Name | Parameters | Return type | Description |
+=======================+=======================================+=============+=====================================================+
| __getreadbuffer__ | self, Py_ssize_t i, void `**p` | | |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __getwritebuffer__ | self, Py_ssize_t i, void `**p` | | |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __getsegcount__ | self, Py_ssize_t `*p` | | |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __getcharbuffer__ | self, Py_ssize_t i, char `**p` | | |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
Descriptor objects (see note 2)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| Name | Parameters | Return type | Description |
+=======================+=======================================+=============+=====================================================+
| __get__ | self, instance, class | object | Get value of attribute |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __set__ | self, instance, value | | Set value of attribute |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
| __delete__ | self, instance | | Delete attribute |
+-----------------------+---------------------------------------+-------------+-----------------------------------------------------+
.. note:: (1) The buffer interface was intended for use by C code and is not directly
accessible from Python. It is described in the Python/C API Reference Manual
of Python 2.x under sections 6.6 and 10.6. It was superseded by the new
PEP 3118 buffer protocol in Python 2.6 and is no longer available in Python 3.
.. note:: (2) Descriptor objects are part of the support mechanism for new-style
Python classes. See the discussion of descriptors in the Python documentation.
See also PEP 252, "Making Types Look More Like Classes", and PEP 253,
"Subtyping Built-In Types".
.. highlight:: cython
.. _tutorial:
*********
Tutorial
*********
The Basics of Cython
====================
The fundamental nature of Cython can be summed up as follows: Cython is Python
with C data types.
Cython is Python: Almost any piece of Python code is also valid Cython code.
(There are a few :ref:`cython-limitations`, but this approximation will
serve for now.) The Cython compiler will convert it into C code which makes
equivalent calls to the Python/C API.
But Cython is much more than that, because parameters and variables can be
declared to have C data types. Code which manipulates Python values and C
values can be freely intermixed, with conversions occurring automatically
wherever possible. Reference count maintenance and error checking of Python
operations is also automatic, and the full power of Python's exception
handling facilities, including the try-except and try-finally statements, is
available to you -- even in the midst of manipulating C data.
Cython Hello World
===================
As Cython can accept almost any valid python source file, one of the hardest
things in getting started is just figuring out how to compile your extension.
So lets start with the canonical python hello world::
print "Hello World"
So the first thing to do is rename the file to :file:`helloworld.pyx`. Now we
need to make the :file:`setup.py`, which is like a python Makefile (for more
information see :ref:`compilation`). Your :file:`setup.py` should look like::
from distutils.core import setup
from distutils.extension import Extension
from Cython.Distutils import build_ext
setup(
cmdclass = {'build_ext': build_ext},
ext_modules = [Extension("helloworld", ["helloworld.pyx"])]
)
To use this to build your Cython file use the commandline options:
.. sourcecode:: text
$ python setup.py build_ext --inplace
Which will leave a file in your local directory called :file:`helloworld.so` in unix
or :file:`helloworld.dll` in Windows. Now to use this file: start the python
interpreter and simply import it as if it was a regular python module::
>>> import helloworld
Hello World
Congratulations! You now know how to build a Cython extension. But So Far
this example doesn't really give a feeling why one would ever want to use Cython, so
lets create a more realistic example.
:mod:`pyximport`: Cython Compilation the Easy Way
==================================================
If your module doesn't require any extra C libraries or a special
build setup, then you can use the pyximport module by Paul Prescod and
Stefan Behnel to load .pyx files directly on import, without having to
write a :file:`setup.py` file. It is shipped and installed with
Cython and can be used like this::
>>> import pyximport; pyximport.install()
>>> import helloworld
Hello World
Since Cython 0.11, the :mod:`pyximport` module also has experimental
compilation support for normal Python modules. This allows you to
automatically run Cython on every .pyx and .py module that Python
imports, including the standard library and installed packages.
Cython will still fail to compile a lot of Python modules, in which
case the import mechanism will fall back to loading the Python source
modules instead. The .py import mechanism is installed like this::
>>> pyximport.install(pyimport = True)
Fibonacci Fun
==============
From the official Python tutorial a simple fibonacci function is defined as:
.. literalinclude:: ../examples/tutorial/fib1/fib.pyx
Now following the steps for the Hello World example we first rename the file
to have a `.pyx` extension, lets say :file:`fib.pyx`, then we create the
:file:`setup.py` file. Using the file created for the Hello World example, all
that you need to change is the name of the Cython filename, and the resulting
module name, doing this we have:
.. literalinclude:: ../examples/tutorial/fib1/setup.py
Build the extension with the same command used for the helloworld.pyx:
.. sourcecode:: text
$ python setup.py build_ext --inplace
And use the new extension with::
>>> import fib
>>> fib.fib(2000)
1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987 1597
Primes
=======
Here's a small example showing some of what can be done. It's a routine for
finding prime numbers. You tell it how many primes you want, and it returns
them as a Python list.
:file:`primes.pyx`:
.. literalinclude:: ../examples/tutorial/primes/primes.pyx
:linenos:
You'll see that it starts out just like a normal Python function definition,
except that the parameter ``kmax`` is declared to be of type ``int`` . This
means that the object passed will be converted to a C integer (or a
``TypeError.`` will be raised if it can't be).
Lines 2 and 3 use the ``cdef`` statement to define some local C variables.
Line 4 creates a Python list which will be used to return the result. You'll
notice that this is done exactly the same way it would be in Python. Because
the variable result hasn't been given a type, it is assumed to hold a Python
object.
Lines 7-9 set up for a loop which will test candidate numbers for primeness
until the required number of primes has been found. Lines 11-12, which try
dividing a candidate by all the primes found so far, are of particular
interest. Because no Python objects are referred to, the loop is translated
entirely into C code, and thus runs very fast.
When a prime is found, lines 14-15 add it to the p array for fast access by
the testing loop, and line 16 adds it to the result list. Again, you'll notice
that line 16 looks very much like a Python statement, and in fact it is, with
the twist that the C parameter ``n`` is automatically converted to a Python
object before being passed to the append method. Finally, at line 18, a normal
Python return statement returns the result list.
Compiling primes.pyx with the Cython compiler produces an extension module
which we can try out in the interactive interpreter as follows::
>>> import primes
>>> primes.primes(10)
[2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
See, it works! And if you're curious about how much work Cython has saved you,
take a look at the C code generated for this module.
Language Details
================
For more about the Cython language, see :ref:`language-basics`.
To dive right in to using Cython in a numerical computation context, see :ref:`numpy_tutorial`.
.. highlight:: cython
.. _wrapping-cplusplus:
********************************
Using C++ in Cython
********************************
Overview
=========
Cython v0.13 introduces native support for most of the C++ language. This means that the previous tricks that were used to wrap C++ classes (as described in http://wiki.cython.org/WrappingCPlusPlus_ForCython012AndLower) are no longer needed.
Wrapping C++ classes with Cython is now much more straightforward. This document describe in details the new way of wrapping C++ code.
What's new in Cython v0.13 about C++
---------------------------------------------------
For users of previous Cython versions, here is a brief overview of the main new features of Cython v0.13 regarding C++ support:
* C++ objects can now be dynamically allocated with ``new`` and ``del`` keywords.
* C++ objects can now be stack-allocated.
* C++ classes can be declared with the new keyword ``cppclass``.
* Templated classes are supported.
* Overloaded functions are supported.
* Overloading of C++ operators (such as operator+, operator[],...) is supported.
Procedure Overview
-------------------
The general procedure for wrapping a C++ file can now be described as follow:
* Specify C++ language in :file:`setup.py` script
* Create ``cdef extern from`` blocks with the optional namespace (if exists) and the namespace name as string
* Declare classes as ``cdef cppclass`` blocks
* Declare public attributes (variables, methods and constructors)
A simple Tutorial
==================
An example C++ API
-------------------
Here is a tiny C++ API which we will use as an example throughout this
document. Let's assume it will be in a header file called
:file:`Rectangle.h`:
.. sourcecode:: c++
namespace shapes {
class Rectangle {
public:
int x0, y0, x1, y1;
Rectangle(int x0, int y0, int x1, int y1);
~Rectangle();
int getLength();
int getHeight();
int getArea();
void move(int dx, int dy);
};
}
and the implementation in the file called :file:`Rectangle.cpp`:
.. sourcecode:: c++
#include "Rectangle.h"
Rectangle::Rectangle(int X0, int Y0, int X1, int Y1)
{
x0 = X0;
y0 = Y0;
x1 = X1;
y1 = Y1;
}
Rectangle::~Rectangle()
{
}
int Rectangle::getLength()
{
return (x1 - x0);
}
int Rectangle::getHeight()
{
return (y1 - y0);
}
int Rectangle::getArea()
{
return (x1 - x0) * (y1 - y0);
}
void Rectangle::move(int dx, int dy)
{
x0 += dx;
y0 += dy;
x1 += dx;
y1 += dy;
}
This is pretty dumb, but should suffice to demonstrate the steps involved.
Specify C++ language in setup.py
---------------------------------
In Cython :file:`setup.py` scripts, one normally instantiates an Extension
object. To make Cython generate and compile a C++ source, you just need
to add the keyword ``language="c++"`` to your Extension construction statement, as in::
ext = Extension(
"rectangle", # name of extension
["rectangle.pyx", "Rectangle.cpp"], # filename of our Cython source
language="c++", # this causes Cython to create C++ source
include_dirs=[...], # usual stuff
libraries=["stdc++", ...], # ditto
extra_link_args=[...], # if needed
cmdclass = {'build_ext': build_ext}
)
Cython will generate and compile the :file:`rectangle.cpp` file (from the
:file:`rectangle.pyx`), then it will compile :file:`Rectangle.cpp`
(implementation of the ``Rectangle`` class) and link both objects files
together into :file:`rectangle.so`, which you can then import in Python using
``import rectangle`` (if you forget to link the :file:`Rectangle.o`, you will
get missing symbols while importing the library in Python).
Alternatively, one can also use the ``cython`` command-line utility to generate a C++ ``.cpp`` file, and then compile it into a python extension. C++ mode for the ``cython`` command is turned on with the ``--cplus`` option.
Declaring a C++ class interface
--------------------------------
The procedure for wrapping a C++ class is quite similar to that for wrapping
normal C structs, with a couple of additions. Let's start here by creating the
basic ``cdef extern from`` block::
cdef extern from "Rectangle.h" namespace "shapes":
This will make the C++ class def for Rectangle available. Note the namespace declaration.
Declare class with cdef cppclass
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Now, let's add the Rectangle class to this extern from block - just copy the class name from Rectangle.h and adjust for Cython syntax, so now it becomes::
cdef extern from "Rectangle.h" namespace "shapes":
cdef cppclass Rectangle:
Add public attributes
^^^^^^^^^^^^^^^^^^^^^^
We now need to declare the attributes for use on Cython::
cdef extern from "Rectangle.h" namespace "shapes":
cdef cppclass Rectangle:
Rectangle(int, int, int, int)
int x0, y0, x1, y1
int getLength()
int getHeight()
int getArea()
void move(int, int)
Declare a var with the wrapped C++ class
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Now, we use cdef to declare a var of the class with the C++ ``new`` statement::
cdef Rectangle *rec = new Rectangle(1, 2, 3, 4)
cdef int recLength = rec.getLength()
...
del rec #delete heap allocated object
It's also possible to declare a stack allocated object, but it's necessary to have a "default" constructor::
cdef extern from "Foo.h":
cdef cppclass Foo:
Foo()
cdef Foo foo
Note that, like C++, if the class has only one constructor and it is a default one, it's not necessary to declare it.
Create Cython wrapper class
----------------------------
At this point, we have exposed into our pyx file's namespace the interface of the C++ Rectangle type. Now, we need to make
this accessible from external Python code (which is our whole point).
Common programming practice is to create a Cython extension type which
holds a C++ instance pointer as an attribute ``thisptr``, and create a bunch of
forwarding methods. So we can implement the Python extension type as::
cdef class PyRectangle:
cdef Rectangle *thisptr # hold a C++ instance which we're wrapping
def __cinit__(self, int x0, int y0, int x1, int y1):
self.thisptr = new Rectangle(x0, y0, x1, y1)
def __dealloc__(self):
del self.thisptr
def getLength(self):
return self.thisptr.getLength()
def getHeight(self):
return self.thisptr.getHeight()
def getArea(self):
return self.thisptr.getArea()
def move(self, dx, dy):
self.thisptr.move(dx, dy)
And there we have it. From a Python perspective, this extension type will look
and feel just like a natively defined Rectangle class. If you want to give
attribute access, you could just implement some properties::
property x0:
def __get__(self): return self.thisptr.x0
def __set__(self, x0): self.thisptr.x0 = x0
...
Advanced C++ features
======================
We describe here all the C++ features that were not discussed in the above tutorial.
Overloading
------------
Overloading is very simple. Just declare the method with different parameters and use any of them::
cdef extern from "Foo.h":
cdef cppclass Foo:
Foo(int)
Foo(bool)
Foo(int, bool)
Foo(int, int)
Overloading operators
----------------------
Cython uses C++ for overloading operators::
cdef extern from "foo.h":
cdef cppclass Foo:
Foo()
Foo* operator+(Foo*)
Foo* operator-(Foo)
int operator*(Foo*)
int operator/(int)
cdef Foo* foo = new Foo()
cdef int x
cdef Foo* foo2 = foo[0] + foo
foo2 = foo[0] - foo[0]
x = foo[0] * foo2
x = foo[0] / 1
cdef Foo f
foo = f + &f
foo2 = f - f
del foo, foo2
Nested class declarations
--------------------------
C++ allows nested class declaration. Class declarations can also be nested in Cython::
cdef extern from "<vector>" namespace "std":
cdef cppclass vector[T]:
cppclass iterator:
T operator*()
iterator operator++()
bint operator==(iterator)
bint operator!=(iterator)
vector()
void push_back(T&)
T& operator[](int)
T& at(int)
iterator begin()
iterator end()
cdef vector[int].iterator iter #iter is declared as being of type vector<int>::iterator
Note that the nested class is declared with a ``cppclass`` but without a ``cdef``.
C++ operators not compatible with Python syntax
------------------------------------------------
Cython try to keep a syntax as close as possible to standard Python. Because of this, certain C++ operators, like the preincrement ``++foo`` or the dereferencing operator ``*foo`` cannot be used with the same syntax as C++. Cython provides functions replacing these operators in a special module ``cython.operator``. The functions provided are:
* ``cython.operator.dereference`` for dereferencing. ``dereference(foo)`` will produce the C++ code ``*foo``
* ``cython.operator.preincrement`` for pre-incrementation. ``preincrement(foo)`` will produce the C++ code ``++foo``
* ...
These functions need to be cimported. Of course, one can use a ``from ... cimport ... as`` to have shorter and more readable functions. For example: ``from cython.operator cimport dereference as deref``.
Templates
----------
Cython uses a bracket syntax for templating. A simple example for wrapping C++ vector::
from cython.operator cimport dereference as deref, preincrement as inc #dereference and increment operators
cdef extern from "<vector>" namespace "std":
cdef cppclass vector[T]:
cppclass iterator:
T operator*()
iterator operator++()
bint operator==(iterator)
bint operator!=(iterator)
vector()
void push_back(T&)
T& operator[](int)
T& at(int)
iterator begin()
iterator end()
cdef vector[int] *v = new vector[int]()
cdef int i
for i in range(10):
v.push_back(i)
cdef vector[int].iterator it = v.begin()
while it != v.end():
print deref(it)
inc(it)
del v
Multiple template parameters can be defined as a list, such as [T, U, V] or [int, bool, char].
Standard library
-----------------
Most of the containers of the C++ Standard Library have been declared in pxd files located in ``/Cython/Includes/libcpp``. These containers are: deque, list, map, pair, queue, set, stack, vector.
For example::
from libcpp.vector cimport vector
cdef vector[int] vect
cdef int i
for i in range(10):
vect.push_back(i)
for i in range(10):
print vect[i]
The pxd files in ``/Cython/Includes/libcpp`` also work as good examples on how to declare C++ classes.
Exceptions
-----------
Cython cannot throw C++ exceptions, or catch them with a try-except statement,
but it is possible to declare a function as potentially raising an C++
exception and converting it into a Python exception. For example, ::
cdef extern from "some_file.h":
cdef int foo() except +
This will translate try and the C++ error into an appropriate Python exception
(currently an IndexError on std::out_of_range and a RuntimeError otherwise
(preserving the what() message). ::
cdef int bar() except +MemoryError
This will catch any C++ error and raise a Python MemoryError in its place.
(Any Python exception is valid here.) ::
cdef int raise_py_error()
cdef int something_dangerous() except +raise_py_error
If something_dangerous raises a C++ exception then raise_py_error will be
called, which allows one to do custom C++ to Python error "translations." If
raise_py_error does not actually raise an exception a RuntimeError will be
raised.
Caveats and Limitations
========================
Access to C-only functions
---------------------------
Whenever generating C++ code, Cython generates declarations of and calls
to functions assuming these functions are C++ (ie, not declared as extern "C"
{...} . This is ok if the C functions have C++ entry points, but if they're C
only, you will hit a roadblock. If you have a C++ Cython module needing
to make calls to pure-C functions, you will need to write a small C++ shim
module which:
* includes the needed C headers in an extern "C" block
* contains minimal forwarding functions in C++, each of which calls the
respective pure-C function
Inherited C++ methods
----------------------
If you have a class ``Foo`` with a child class ``Bar``, and ``Foo`` has a
method :meth:`fred`, then you'll have to cast to access this method from
``Bar`` objects.
For example::
cdef class MyClass:
Bar *b
...
def myfunc(self):
...
b.fred() # wrong, won't work
(<Foo *>(self.b)).fred() # should work, Cython now thinks it's a 'Foo'
It might take some experimenting by others (you?) to find the most elegant
ways of handling this issue.
Declaring/Using References
---------------------------
Question: How do you declare and call a function that takes a reference as an argument?
C++ left-values
----------------
C++ allows functions returning a reference to be left-values. This is currently not supported in Cython. ``cython.operator.dereference(foo)`` is also not considered a left-value.
Background
----------
[brain dump]
The "Old Cython Users Guide" is a derivative of the old Pyrex documentation. It underwent substantial editing by Peter Alexandar
to become the Reference Guide, which is oriented around bullet points
and lists rather than prose. This transition was incomplete.
At nearly the same time, Robert, Dag, and Stefan wrote a tutorial as
part of the SciPy proceedings. It was felt that the content there was
cleaner and more up to date than anything else, and this became the
basis for the "Getting Started" and "Tutorials" sections. However,
it simply doesn't have as much content as the old documentation used to.
Eventually, it seems all of the old users manual could be whittled
down into independent tutorial topics. Much discussion of what we'd
like to see is at
http://www.mail-archive.com/cython-dev@codespeak.net/msg06945.html
There is currently a huge amount of redundancy, but no one section has
it all.
Also, we should go through the wiki enhancement proposal list and make sure to transfer the (done) ones into the user manual.
.. highlight:: cython
.. _overview:
********
Welcome!
********
===============
What is Cython?
===============
Cython is a programming language based on Python
with extra syntax to provide static type declarations.
================
What Does It Do?
================
It takes advantage of the benefits of Python while allowing one to achieve the speed of C.
============================
How Exactly Does It Do That?
============================
The source code gets translated into optimized C/C++
code and compiled as Python extension modules.
This allows for both very fast program execution and tight
integration with external C libraries, while keeping
up the high *programmer productivity* for which the
Python language is well known.
=============
Tell Me More!
=============
The Python language is well known.
The primary Python execution environment is commonly referred to as CPython, as it is written in
C. Other major implementations use:
:Java: Jython [#Jython]_
:C#: IronPython [#IronPython]_)
:Python itself: PyPy [#PyPy]_
Written in C, CPython has been
conducive to wrapping many external libraries that interface through the C language. It has, however, remained non trivial to write the necessary glue code in
C, especially for programmers who are more fluent in a
high-level language like Python than in a do-it-yourself
language like C.
Originally based on the well-known Pyrex [#Pyrex]_, the
Cython project has approached this problem by means
of a source code compiler that translates Python code
to equivalent C code. This code is executed within the
CPython runtime environment, but at the speed of
compiled C and with the ability to call directly into C
libraries.
At the same time, it keeps the original interface of the Python source code, which makes it directly
usable from Python code. These two-fold characteristics enable Cython’s two major use cases:
#. Extending the CPython interpreter with fast binary modules, and
#. Interfacing Python code with external C libraries.
While Cython can compile (most) regular Python
code, the generated C code usually gains major (and
sometime impressive) speed improvements from optional static type declarations for both Python and
C types. These allow Cython to assign C semantics to
parts of the code, and to translate them into very efficient C code.
Type declarations can therefore be used
for two purposes:
#. For moving code sections from dynamic Python semantics into static-and-fast C semantics, but also for..
#. Directly manipulating types defined in external libraries. Cython thus merges the two worlds into a very broadly applicable programming language.
==================
Where Do I Get It?
==================
Well.. at `cython.org <http://cython.org>`_.. of course!
======================
How Do I Report a Bug?
======================
=================================
I Want To Make A Feature Request!
=================================
============================================
Is There a Mail List? How Do I Contact You?
============================================
.. rubric:: Footnotes
.. [#Jython] **Jython:** \J. Huginin, B. Warsaw, F. Bock, et al., Jython: Python for the Java platform, http://www.jython.org/
.. [#IronPython] **IronPython:** Jim Hugunin et al., http://www.codeplex.com/IronPython.
.. [#PyPy] **PyPy:** The PyPy Group, PyPy: a Python implementation written in Python, http://codespeak.net/pypy.
.. [#Pyrex] **Pyrex:** G. Ewing, Pyrex: C-Extensions for Python, http://www.cosc.canterbury.ac.nz/greg.ewing/python/Pyrex/
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment