Commit e2941f2b authored by gabrieldemarmiesse's avatar gabrieldemarmiesse

Finished 'the first cython program'.

parent adc1aa70
# cython: infer_types=True # cython: infer_types=True
import numpy as np import numpy as np
cimport cython
# "def" can type its arguments but not have a return type. The type of the
# arguments for a "def" function is checked at run-time when entering the
# function.
# We now need to fix a datatype for our arrays. I've used the variable
# DTYPE for this, which is assigned to the usual NumPy runtime
# type info object.
# The arrays f, g and h is typed as "np.ndarray" instances. The only effect
# this has is to a) insert checks that the function arguments really are
# NumPy arrays, and b) make some attribute access like f.shape[0] much
# more efficient. (In this example this doesn't matter though.)
ctypedef fused my_type: ctypedef fused my_type:
int int
...@@ -22,15 +11,7 @@ ctypedef fused my_type: ...@@ -22,15 +11,7 @@ ctypedef fused my_type:
cpdef naive_convolve_fused_types(my_type [:,:] f, my_type [:,:] g): cpdef naive_convolve_fused_types(my_type [:,:] f, my_type [:,:] g):
if g.shape[0] % 2 != 1 or g.shape[1] % 2 != 1: if g.shape[0] % 2 != 1 or g.shape[1] % 2 != 1:
raise ValueError("Only odd dimensions on filter supported") raise ValueError("Only odd dimensions on filter supported")
# The "cdef" keyword is also used within functions to type variables. It
# can only be used at the top indentation level (there are non-trivial
# problems with allowing them in other places, though we'd love to see
# good and thought out proposals for it).
#
# For the indices, the "int" type is used. This corresponds to a C int,
# other C types (like "unsigned int") could have been used instead.
# Purists could use "Py_ssize_t" which is the proper Python type for
# array indices.
vmax = f.shape[0] vmax = f.shape[0]
wmax = f.shape[1] wmax = f.shape[1]
smax = g.shape[0] smax = g.shape[0]
...@@ -50,11 +31,6 @@ cpdef naive_convolve_fused_types(my_type [:,:] f, my_type [:,:] g): ...@@ -50,11 +31,6 @@ cpdef naive_convolve_fused_types(my_type [:,:] f, my_type [:,:] g):
h_np = np.zeros([xmax, ymax], dtype=dtype) h_np = np.zeros([xmax, ymax], dtype=dtype)
cdef my_type [:,:] h = h_np cdef my_type [:,:] h = h_np
# For the value variable, we want to use the same data type as is
# stored in the array, so we use "DTYPE_t" as defined above.
# NB! An important side-effect of this is that if "value" overflows its
# datatype size, it will simply wrap around like in C, rather than raise
# an error like in Python.
cdef my_type value cdef my_type value
for x in range(xmax): for x in range(xmax):
for y in range(ymax): for y in range(ymax):
......
# cython: infer_types=True # cython: infer_types=True
import numpy as np import numpy as np
cimport cython
# "def" can type its arguments but not have a return type. The type of the
# arguments for a "def" function is checked at run-time when entering the
# function.
# We now need to fix a datatype for our arrays. I've used the variable
# DTYPE for this, which is assigned to the usual NumPy runtime
# type info object.
DTYPE = np.intc DTYPE = np.intc
# The arrays f, g and h is typed as "np.ndarray" instances. The only effect
# this has is to a) insert checks that the function arguments really are
# NumPy arrays, and b) make some attribute access like f.shape[0] much
# more efficient. (In this example this doesn't matter though.)
@cython.boundscheck(False) @cython.boundscheck(False)
@cython.wraparound(False) @cython.wraparound(False)
def naive_convolve_infer_types(int [:,::1] f, int [:,::1] g): def naive_convolve_infer_types(int [:,::1] f, int [:,::1] g):
if g.shape[0] % 2 != 1 or g.shape[1] % 2 != 1: if g.shape[0] % 2 != 1 or g.shape[1] % 2 != 1:
raise ValueError("Only odd dimensions on filter supported") raise ValueError("Only odd dimensions on filter supported")
# The "cdef" keyword is also used within functions to type variables. It
# can only be used at the top indentation level (there are non-trivial
# problems with allowing them in other places, though we'd love to see
# good and thought out proposals for it).
#
# For the indices, the "int" type is used. This corresponds to a C int,
# other C types (like "unsigned int") could have been used instead.
# Purists could use "Py_ssize_t" which is the proper Python type for
# array indices.
vmax = f.shape[0] vmax = f.shape[0]
wmax = f.shape[1] wmax = f.shape[1]
smax = g.shape[0] smax = g.shape[0]
...@@ -34,14 +17,10 @@ def naive_convolve_infer_types(int [:,::1] f, int [:,::1] g): ...@@ -34,14 +17,10 @@ def naive_convolve_infer_types(int [:,::1] f, int [:,::1] g):
tmid = tmax // 2 tmid = tmax // 2
xmax = vmax + 2*smid xmax = vmax + 2*smid
ymax = wmax + 2*tmid ymax = wmax + 2*tmid
h_np = np.zeros([xmax, ymax], dtype=DTYPE) h_np = np.zeros([xmax, ymax], dtype=DTYPE)
cdef int [:,::1] h = h_np cdef int [:,::1] h = h_np
# For the value variable, we want to use the same data type as is
# stored in the array, so we use "DTYPE_t" as defined above.
# NB! An important side-effect of this is that if "value" overflows its
# datatype size, it will simply wrap around like in C, rather than raise
# an error like in Python.
cdef int value cdef int value
for x in range(xmax): for x in range(xmax):
for y in range(ymax): for y in range(ymax):
......
import numpy as np import numpy as np
# "def" can type its arguments but not have a return type. The type of the
# arguments for a "def" function is checked at run-time when entering the
# function.
# We now need to fix a datatype for our arrays. I've used the variable
# DTYPE for this, which is assigned to the usual NumPy runtime
# type info object.
DTYPE = np.intc DTYPE = np.intc
# The arrays f, g and h is typed as "np.ndarray" instances. The only effect
# this has is to a) insert checks that the function arguments really are
# NumPy arrays, and b) make some attribute access like f.shape[0] much
# more efficient. (In this example this doesn't matter though.)
def naive_convolve_memview(int [:,:] f, int [:,:] g): def naive_convolve_memview(int [:,:] f, int [:,:] g):
if g.shape[0] % 2 != 1 or g.shape[1] % 2 != 1: if g.shape[0] % 2 != 1 or g.shape[1] % 2 != 1:
raise ValueError("Only odd dimensions on filter supported") raise ValueError("Only odd dimensions on filter supported")
# The "cdef" keyword is also used within functions to type variables. It
# can only be used at the top indentation level (there are non-trivial # We don't need to check for the type of NumPy array here because
# problems with allowing them in other places, though we'd love to see # a check is already performed when calling the function.
# good and thought out proposals for it). cdef int x, y, s, t, v, w, s_from, s_to, t_from, t_to
#
# For the indices, the "int" type is used. This corresponds to a C int,
# other C types (like "unsigned int") could have been used instead.
# Purists could use "Py_ssize_t" which is the proper Python type for
# array indices.
cdef int vmax = f.shape[0] cdef int vmax = f.shape[0]
cdef int wmax = f.shape[1] cdef int wmax = f.shape[1]
cdef int smax = g.shape[0] cdef int smax = g.shape[0]
...@@ -31,18 +17,10 @@ def naive_convolve_memview(int [:,:] f, int [:,:] g): ...@@ -31,18 +17,10 @@ def naive_convolve_memview(int [:,:] f, int [:,:] g):
cdef int tmid = tmax // 2 cdef int tmid = tmax // 2
cdef int xmax = vmax + 2*smid cdef int xmax = vmax + 2*smid
cdef int ymax = wmax + 2*tmid cdef int ymax = wmax + 2*tmid
h_np = np.zeros([xmax, ymax], dtype=DTYPE) h_np = np.zeros([xmax, ymax], dtype=DTYPE)
cdef int [:,:] h = h_np cdef int [:,:] h = h_np
cdef int x, y, s, t, v, w
# It is very important to type ALL your variables. You do not get any
# warnings if not, only much slower code (they are implicitly typed as
# Python objects).
cdef int s_from, s_to, t_from, t_to
# For the value variable, we want to use the same data type as is
# stored in the array, so we use "DTYPE_t" as defined above.
# NB! An important side-effect of this is that if "value" overflows its
# datatype size, it will simply wrap around like in C, rather than raise
# an error like in Python.
cdef int value cdef int value
for x in range(xmax): for x in range(xmax):
for y in range(ymax): for y in range(ymax):
......
...@@ -24,6 +24,8 @@ def naive_convolve_types(f, g): ...@@ -24,6 +24,8 @@ def naive_convolve_types(f, g):
# other C types (like "unsigned int") could have been used instead. # other C types (like "unsigned int") could have been used instead.
# Purists could use "Py_ssize_t" which is the proper Python type for # Purists could use "Py_ssize_t" which is the proper Python type for
# array indices. # array indices.
cdef int x, y, s, t, v, w, s_from, s_to, t_from, t_to
cdef int vmax = f.shape[0] cdef int vmax = f.shape[0]
cdef int wmax = f.shape[1] cdef int wmax = f.shape[1]
cdef int smax = g.shape[0] cdef int smax = g.shape[0]
...@@ -33,11 +35,9 @@ def naive_convolve_types(f, g): ...@@ -33,11 +35,9 @@ def naive_convolve_types(f, g):
cdef int xmax = vmax + 2*smid cdef int xmax = vmax + 2*smid
cdef int ymax = wmax + 2*tmid cdef int ymax = wmax + 2*tmid
h = np.zeros([xmax, ymax], dtype=DTYPE) h = np.zeros([xmax, ymax], dtype=DTYPE)
cdef int x, y, s, t, v, w
# It is very important to type ALL your variables. You do not get any # It is very important to type ALL your variables. You do not get any
# warnings if not, only much slower code (they are implicitly typed as # warnings if not, only much slower code (they are implicitly typed as
# Python objects). # Python objects).
cdef int s_from, s_to, t_from, t_to
# For the value variable, we want to use the same data type as is # For the value variable, we want to use the same data type as is
# stored in the array, so we use "DTYPE_t" as defined above. # stored in the array, so we use "DTYPE_t" as defined above.
# NB! An important side-effect of this is that if "value" overflows its # NB! An important side-effect of this is that if "value" overflows its
......
...@@ -133,13 +133,13 @@ The first Cython program ...@@ -133,13 +133,13 @@ The first Cython program
The code below does 2D discrete convolution of an image with a filter (and I'm The code below does 2D discrete convolution of an image with a filter (and I'm
sure you can do better!, let it serve for demonstration purposes). It is both sure you can do better!, let it serve for demonstration purposes). It is both
valid Python and valid Cython code. I'll refer to it as both valid Python and valid Cython code. I'll refer to it as both
:file:`convolve_py.py` for the Python version and :file:`convolve1.pyx` for the :file:`convolve_py.py` for the Python version and :file:`convolve_cy.pyx` for the
Cython version -- Cython uses ".pyx" as its file suffix. Cython version -- Cython uses ".pyx" as its file suffix.
.. literalinclude:: ../../examples/userguide/convolve_py.py .. literalinclude:: ../../examples/userguide/convolve_py.py
:linenos: :linenos:
This should be compiled to produce :file:`yourmod.so` (for Linux systems). We This should be compiled to produce :file:`convolve_cy.so` (for Linux systems). We
run a Python session to test both the Python version (imported from run a Python session to test both the Python version (imported from
``.py``-file) and the compiled Cython module. ``.py``-file) and the compiled Cython module.
...@@ -153,8 +153,8 @@ run a Python session to test both the Python version (imported from ...@@ -153,8 +153,8 @@ run a Python session to test both the Python version (imported from
array([[1, 1, 1], array([[1, 1, 1],
[2, 2, 2], [2, 2, 2],
[1, 1, 1]]) [1, 1, 1]])
In [4]: import convolve1 In [4]: import convolve_cy
In [4]: convolve1.naive_convolve(np.array([[1, 1, 1]], dtype=np.int), In [4]: convolve_cy.naive_convolve(np.array([[1, 1, 1]], dtype=np.int),
... np.array([[1],[2],[1]], dtype=np.int)) ... np.array([[1],[2],[1]], dtype=np.int))
Out [4]: Out [4]:
array([[1, 1, 1], array([[1, 1, 1],
...@@ -164,13 +164,17 @@ run a Python session to test both the Python version (imported from ...@@ -164,13 +164,17 @@ run a Python session to test both the Python version (imported from
In [12]: f = np.arange(N*N, dtype=np.int).reshape((N,N)) In [12]: f = np.arange(N*N, dtype=np.int).reshape((N,N))
In [13]: g = np.arange(81, dtype=np.int).reshape((9, 9)) In [13]: g = np.arange(81, dtype=np.int).reshape((9, 9))
In [19]: %timeit -n2 -r3 convolve_py.naive_convolve(f, g) In [19]: %timeit -n2 -r3 convolve_py.naive_convolve(f, g)
2 loops, best of 3: 1.86 s per loop 422 ms ± 2.06 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
In [20]: %timeit -n2 -r3 convolve1.naive_convolve(f, g) In [20]: %timeit -n2 -r3 convolve_cy.naive_convolve(f, g)
2 loops, best of 3: 1.41 s per loop 342 ms ± 1.39 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
There's not such a huge difference yet; because the C code still does exactly There's not such a huge difference yet; because the C code still does exactly
what the Python interpreter does (meaning, for instance, that a new object is what the Python interpreter does (meaning, for instance, that a new object is
allocated for each number used). Look at the generated html file and see what allocated for each number used). You can look at the Python interaction
and the generated C code by using `-a` when calling Cython from the command
line, `%%cython -a` when using a Jupyter Notebook, or by using
`cythonize('convolve_cy.pyx', annotate=True)` when using a `setup.py`.
Look at the generated html file and see what
is needed for even the simplest statements you get the point quickly. We need is needed for even the simplest statements you get the point quickly. We need
to give Cython more information; we need to add types. to give Cython more information; we need to add types.
...@@ -358,9 +362,16 @@ mode). ...@@ -358,9 +362,16 @@ mode).
Where to go from here? Where to go from here?
====================== ======================
* Since there is no Python interaction in the loops, it is possible with Cython
to release the GIL and use multiple cores easily. To learn how to do that,
you can see :ref:`using parallelism in Cython <parallel>`.
* If you want to learn how to make use of `BLAS <http://www.netlib.org/blas/>`_
or `LAPACK <http://www.netlib.org/lapack/>`_ with Cython, you can watch
`the presentation of Ian Henriksen at SciPy 2015
<https://www.youtube.com/watch?v=R4yB-8tB0J0&t=693s&ab_channel=Enthought>`_.
The future The future
============ ==========
These are some points to consider for further development. All points listed These are some points to consider for further development. All points listed
here has gone through a lot of thinking and planning already; still they may here has gone through a lot of thinking and planning already; still they may
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment