Commit 70c5f657 authored by Mark Florisson

Update OpenMP docs

parent ffbdf766
@@ -20,12 +20,14 @@ currently supports OpenMP, but later on more backends might be supported.
 Thread-locality and reductions are automatically inferred for variables.
-If you assign to a variable, it becomes lastprivate, meaning that the
+If you assign to a variable in a prange block, it becomes lastprivate, meaning that the
 variable will contain the value from the last iteration. If you use an
 inplace operator on a variable, it becomes a reduction, meaning that the
 values from the thread-local copies of the variable will be reduced with
 the operator and assigned to the original variable after the loop. The
 index variable is always lastprivate.
+Variables assigned to in a parallel with block will be private and unusable
+after the block, as there is no concept of a sequentially last value.
 The ``schedule`` is passed to OpenMP and can be one of the following:
@@ -88,24 +90,24 @@ currently supports OpenMP, but later on more backends might be supported.
 buffers used by a prange. A contained prange will be a worksharing loop
 that is not parallel, so any variable assigned to in the parallel section
 is also private to the prange. Variables that are private in the parallel
-block are unaltered after the parallel block.
+block are unavailable after the parallel block.
 Example with thread-local buffers::
     from cython.parallel import *
-    from cython.stdlib cimport abort
+    from libc.stdlib cimport abort, malloc, free
-    cdef Py_ssize_t i, n = 100
+    cdef Py_ssize_t idx, i, n = 100
     cdef int * local_buf
     cdef size_t size = 10
-    with nogil, parallel:
+    with nogil, parallel():
-        local_buf = malloc(sizeof(int) * size)
+        local_buf = <int *> malloc(sizeof(int) * size)
         if local_buf == NULL:
             abort()
         # populate our local buffer in a sequential loop
-        for i in range(size):
+        for idx in range(size):
             local_buf[i] = i * 2
         # share the work using the thread-local buffer(s)
@@ -135,7 +137,7 @@ enable OpenMP. For gcc this can be done as follows in a setup.py::
         "hello",
         ["hello.pyx"],
         extra_compile_args=['-fopenmp'],
-        libraries=['gomp'],
+        extra_link_args=['-fopenmp'],
     )
     setup(
@@ -158,7 +160,7 @@ particular order::
 from cython.parallel import prange
-def func(Py_ssize_t n):
+cdef int func(Py_ssize_t n):
     cdef Py_ssize_t i
     for i in prange(n, nogil=True):
@@ -173,6 +175,12 @@ particular order::
 In the example above it is undefined whether an exception shall be raised,
 whether it will simply break or whether it will return 2.
+Nested Parallelism
+==================
+Nested parallelism is currently disabled due to a bug in gcc 4.5 [#]_. However,
+you can freely call functions with parallel sections from a parallel section.
 .. rubric:: References
 .. [#] http://www.openmp.org/mp-documents/spec30.pdf
+.. [#] http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49897
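As a rough mental model of the variable-inference rules documented in this commit, the plain-Python sketch below emulates a `prange` loop split into contiguous chunks, one per worker. It is not Cython and involves no OpenMP; the helper name `simulate_prange`, its `num_threads` parameter, and the chunking scheme are assumptions made purely for illustration. Each worker keeps a private copy of the reduction variable, the copies are combined with the operator after the loop, and the lastprivate variable takes its value from the sequentially last iteration.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce
import operator


def simulate_prange(n, num_threads=4):
    """Emulate `for i in prange(n): total += i; last = i` (hypothetical helper)."""
    # Split range(n) into contiguous chunks, roughly one per worker.
    chunk = max(1, (n + num_threads - 1) // num_threads)
    bounds = [(start, min(start + chunk, n)) for start in range(0, n, chunk)]

    def worker(bound):
        start, stop = bound
        local_total = 0      # thread-local copy of the reduction variable
        local_last = None    # thread-local copy of the assigned variable
        for i in range(start, stop):
            local_total += i  # in-place operator: treated as a reduction
            local_last = i    # plain assignment: treated as lastprivate
        return local_total, local_last

    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        # pool.map returns results in chunk order, regardless of finish order.
        results = list(pool.map(worker, bounds))

    # Reduction: combine the thread-local copies with the operator.
    total = reduce(operator.add, (t for t, _ in results), 0)
    # Lastprivate: keep the value from the chunk holding the last iteration.
    last = results[-1][1] if results else None
    return total, last
```

Whatever the worker count, the pair returned should match what a sequential loop would leave in `total` and `last`, which is the guarantee the documentation describes for reductions and lastprivate variables.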
@@ -688,3 +688,20 @@ def test_parallel_with_gil_continue_unnested():
         sum += i
     print sum
+cdef int inner_parallel_section() nogil:
+    cdef int j, sum = 0
+    for j in prange(10):
+        sum += j
+    return sum
+def outer_parallel_section():
+    """
+    >>> outer_parallel_section()
+    450
+    """
+    cdef int i, sum = 0
+    for i in prange(10, nogil=True):
+        sum += inner_parallel_section()
+    return sum
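The expected doctest value in the new test can be checked with a sequential plain-Python equivalent of the two functions. This is only a sketch of the arithmetic, assuming both reductions behave like ordinary sums; it does not exercise `prange` or OpenMP, and the local variable name `total` is a substitution for the test's `sum`.

```python
def inner_parallel_section():
    # Sequential stand-in for the inner prange reduction: 0 + 1 + ... + 9.
    total = 0
    for j in range(10):
        total += j
    return total


def outer_parallel_section():
    # Each of the 10 outer iterations adds the inner result to the reduction.
    total = 0
    for i in range(10):
        total += inner_parallel_section()
    return total


print(outer_parallel_section())  # 450
```

The inner loop sums to 45, and ten outer iterations each contribute that value, giving 450.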