better explanation of OpenMP schedules

6e26cece · Stefan Behnel · 50f24198 · 6e26cece
Commit 6e26cece authored May 13, 2012 by Stefan Behnel

Showing with 45 additions and 31 deletions

docs/src/userguide/parallelism.rst +45 -31

--- a/docs/src/userguide/parallelism.rst
+++ b/docs/src/userguide/parallelism.rst
@@ -39,37 +39,51 @@ __ nogil_

    The ``schedule`` is passed to OpenMP and can be one of the following:

-    +-----------------+------------------------------------------------------+
-    | Schedule        | Description                                          |
-    +=================+======================================================+
-    |static           | The iteration space is divided into chunks that are  |
-    |                 | approximately equal in size, and at most one chunk   |
-    |                 | is distributed to each thread, if ``chunksize`` is   |
-    |                 | not given. If ``chunksize`` is specified, iterations |
-    |                 | are distributed cyclically in a static manner with a |
-    |                 | blocksize of ``chunksize``.                          |
-    +-----------------+------------------------------------------------------+
-    |dynamic          | The iterations are distributed to threads in the team|
-    |                 | as the threads request them, with a default chunk    |
-    |                 | size of 1.                                           |
-    +-----------------+------------------------------------------------------+
-    |guided           | The iterations are distributed to threads in the team|
-    |                 | as the threads request them. The size of each chunk  |
-    |                 | is proportional to the number of unassigned          |
-    |                 | iterations divided by the number of threads in the   |
-    |                 | team, decreasing to 1 (or ``chunksize`` if given).   |
-    +-----------------+------------------------------------------------------+
-    |runtime          | The schedule and chunk size are taken from the       |
-    |                 | runtime-scheduling-variable, which can be set through|
-    |                 | the ``omp_set_schedule`` function call, or the       |
-    |                 | ``OMP_SCHEDULE`` environment variable.               |
-    +-----------------+------------------------------------------------------+
-
-..    |auto             | The decision regarding scheduling is delegated to the|
-..    |                 | compiler and/or runtime system. The programmer gives |
-..    |                 | the implementation the freedom to choose any possible|
-..    |                 | mapping of iterations to threads in the team.        |
-..    +-----------------+------------------------------------------------------+
+    static:
+       If a chunksize is provided, iterations are distributed to all
+       threads ahead of time in blocks of the given chunksize.  If no
+       chunksize is given, the iteration space is divided into chunks that
+       are approximately equal in size, and at most one chunk is assigned
+       to each thread in advance.
+
+       This is most appropriate when the scheduling overhead matters and
+       the problem can be cut down into equally sized chunks that are
+       known to have approximately the same runtime.
+
+    dynamic:
+       The iterations are distributed to threads as they request them,
+       with a default chunk size of 1.
+
+       This is suitable when the runtime of each chunk differs and is not
+       known in advance, so that a larger number of smaller chunks is
+       needed to keep all threads busy.
+
+    guided:
+       As with dynamic scheduling, the iterations are distributed to
+       threads as they request them, but with decreasing chunk size.  The
+       size of each chunk is proportional to the number of unassigned
+       iterations divided by the number of participating threads,
+       decreasing to 1 (or the chunksize if provided).
+
+       This has an advantage over pure dynamic scheduling when it turns
+       out that the last chunks take more time than expected or are
+       otherwise badly scheduled, so that most threads sit idle while
+       the last chunks are being worked on by only a small number of
+       threads.
+
+    runtime:
+       The schedule and chunk size are taken from the runtime scheduling
+       variable, which can be set through the ``openmp.omp_set_schedule()``
+       function call, or the ``OMP_SCHEDULE`` environment variable.  Note
+       that this essentially disables any static compile time optimisations
+       of the scheduling code itself and may therefore perform slightly
+       worse than when the same scheduling policy is statically configured
+       at compile time.
+
+..  auto             The decision regarding scheduling is delegated to the
+..                   compiler and/or runtime system. The programmer gives
+..                   the implementation the freedom to choose any possible
+..                   mapping of iterations to threads in the team.

    The default schedule is implementation defined. For more information consult
    the OpenMP specification [#]_.
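
Note on the schedules described in this change: the difference between ``static`` and ``guided`` is easier to see with concrete numbers. The following is a minimal Python sketch that only models the chunking rules as worded in the new text. It does not call OpenMP, and the function names ``static_chunks`` and ``guided_chunk_sizes`` are hypothetical helpers invented for this illustration. Real OpenMP runtimes are free to round and order chunks differently, so treat the exact sizes as an assumption rather than as what any particular implementation does.

```python
from math import ceil


def static_chunks(n_iterations, n_threads, chunksize=None):
    """Model of 'static': return one list of (start, stop) ranges per thread.

    Assumption: without a chunksize, the iteration space is cut into at most
    one block per thread, with block sizes that are approximately equal.
    """
    if chunksize is None:
        # One block per thread; ceil() guarantees at most n_threads blocks.
        chunksize = max(1, ceil(n_iterations / n_threads))
    plan = [[] for _ in range(n_threads)]
    for block, start in enumerate(range(0, n_iterations, chunksize)):
        stop = min(start + chunksize, n_iterations)
        # Blocks are handed out round-robin, all decided ahead of time.
        plan[block % n_threads].append((start, stop))
    return plan


def guided_chunk_sizes(n_iterations, n_threads, chunksize=1):
    """Model of 'guided': return chunk sizes in the order they are requested.

    Assumption: each chunk is the number of unassigned iterations divided by
    the number of threads (rounded up), never smaller than chunksize except
    for the final leftover chunk.
    """
    sizes = []
    remaining = n_iterations
    while remaining > 0:
        size = max(ceil(remaining / n_threads), chunksize)
        size = min(size, remaining)  # the last chunk may be a short one
        sizes.append(size)
        remaining -= size
    return sizes


if __name__ == "__main__":
    print("static, no chunksize:", static_chunks(10, 4))
    print("static, chunksize=2: ", static_chunks(10, 2, chunksize=2))
    print("guided chunk sizes:  ", guided_chunk_sizes(100, 4))
```

In this model, ``static`` fixes the whole assignment before the loop starts, which is why it has the lowest scheduling overhead, while ``guided`` begins with large chunks and shrinks them so that the tail of the loop can still be balanced across threads. ``dynamic`` is not modelled separately here because its chunk sizes are constant and only the assignment order depends on thread timing.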