Commit cc353a0c authored by Raymond Hettinger, committed by Miss Islington (bot)

Various refinements to the NormalDist examples and recipes (GH-12272)

parent 491ef53c
...@@ -510,10 +510,9 @@ of applications in statistics. ...@@ -510,10 +510,9 @@ of applications in statistics.
.. classmethod:: NormalDist.from_samples(data) .. classmethod:: NormalDist.from_samples(data)
Class method that makes a normal distribution instance Makes a normal distribution instance computed from sample data. The
from sample data. The *data* can be any :term:`iterable` *data* can be any :term:`iterable` and should consist of values that
and should consist of values that can be converted to type can be converted to type :class:`float`.
:class:`float`.
If *data* does not contain at least two elements, raises If *data* does not contain at least two elements, raises
:exc:`StatisticsError` because it takes at least one point to estimate :exc:`StatisticsError` because it takes at least one point to estimate
...@@ -536,11 +535,10 @@ of applications in statistics. ...@@ -536,11 +535,10 @@ of applications in statistics.
the given value *x*. Mathematically, it is the ratio ``P(x <= X < the given value *x*. Mathematically, it is the ratio ``P(x <= X <
x+dx) / dx``. x+dx) / dx``.
Note the relative likelihood of *x* can be greater than `1.0`. The The relative likelihood is computed as the probability of a sample
probability for a specific point on a continuous distribution is `0.0`, occurring in a narrow range divided by the width of the range (hence
so the :func:`pdf` is used instead. It gives the probability of a the word "density"). Since the likelihood is relative to other points,
sample occurring in a narrow range around *x* and then dividing that its value can be greater than `1.0`.
probability by the width of the range (hence the word "density").
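The two claims in this paragraph (the density is a probability divided by a width, and its value can exceed `1.0`) can be checked numerically. The following is a minimal sketch, not part of the commit: the distribution with `sigma=0.1` and the width `dx` are hypothetical values chosen only for illustration.

```python
from statistics import NormalDist

# Hypothetical narrow distribution; a small sigma concentrates the mass near mu.
narrow = NormalDist(mu=0.0, sigma=0.1)

x = 0.0
dx = 1e-6  # assumed width of the "narrow range"; any small positive value works

# Probability of landing in [x, x + dx), divided by the width of that range.
ratio = (narrow.cdf(x + dx) - narrow.cdf(x)) / dx

# The density reported by pdf() at the same point (about 3.99 here).
density = narrow.pdf(x)

print(ratio, density)
```

Both numbers agree closely and both are well above `1.0`, while the probability of the range itself stays tiny.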
.. method:: NormalDist.cdf(x) .. method:: NormalDist.cdf(x)
...@@ -568,7 +566,8 @@ of applications in statistics. ...@@ -568,7 +566,8 @@ of applications in statistics.
>>> temperature_february * (9/5) + 32 # Fahrenheit >>> temperature_february * (9/5) + 32 # Fahrenheit
NormalDist(mu=41.0, sigma=4.5) NormalDist(mu=41.0, sigma=4.5)
Dividing a constant by an instance of :class:`NormalDist` is not supported. Dividing a constant by an instance of :class:`NormalDist` is not supported
because the result wouldn't be normally distributed.
Since normal distributions arise from additive effects of independent Since normal distributions arise from additive effects of independent
variables, it is possible to `add and subtract two independent normally variables, it is possible to `add and subtract two independent normally
...@@ -581,8 +580,10 @@ of applications in statistics. ...@@ -581,8 +580,10 @@ of applications in statistics.
>>> birth_weights = NormalDist.from_samples([2.5, 3.1, 2.1, 2.4, 2.7, 3.5]) >>> birth_weights = NormalDist.from_samples([2.5, 3.1, 2.1, 2.4, 2.7, 3.5])
>>> drug_effects = NormalDist(0.4, 0.15) >>> drug_effects = NormalDist(0.4, 0.15)
>>> combined = birth_weights + drug_effects >>> combined = birth_weights + drug_effects
>>> f'mean: {combined.mean :.1f} standard deviation: {combined.stdev :.1f}' >>> round(combined.mean, 1)
'mean: 3.1 standard deviation: 0.5' 3.1
>>> round(combined.stdev, 1)
0.5
.. versionadded:: 3.8 .. versionadded:: 3.8
...@@ -595,14 +596,15 @@ of applications in statistics. ...@@ -595,14 +596,15 @@ of applications in statistics.
For example, given `historical data for SAT exams For example, given `historical data for SAT exams
<https://blog.prepscholar.com/sat-standard-deviation>`_ showing that scores <https://blog.prepscholar.com/sat-standard-deviation>`_ showing that scores
are normally distributed with a mean of 1060 and a standard deviation of 192, are normally distributed with a mean of 1060 and a standard deviation of 192,
determine the percentage of students with scores between 1100 and 1200: determine the percentage of students with scores between 1100 and 1200, after
rounding to the nearest whole number:
.. doctest:: .. doctest::
>>> sat = NormalDist(1060, 195) >>> sat = NormalDist(1060, 195)
>>> fraction = sat.cdf(1200 + 0.5) - sat.cdf(1100 - 0.5) >>> fraction = sat.cdf(1200 + 0.5) - sat.cdf(1100 - 0.5)
>>> f'{fraction * 100 :.1f}% score between 1100 and 1200' >>> round(fraction * 100.0, 1)
'18.4% score between 1100 and 1200' 18.4
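The `+ 0.5` and `- 0.5` terms in the doctest act as a continuity correction: scores are reported as whole numbers, so a reported 1100 is taken to cover the interval from 1099.5 to 1100.5. A small sketch comparing the two calculations, using the same `NormalDist(1060, 195)` as the doctest; the variable names are illustrative only:

```python
from statistics import NormalDist

sat = NormalDist(1060, 195)

# With the correction: widen the interval by half a point on each side,
# because each reported whole-number score stands for a unit-wide bin.
with_correction = sat.cdf(1200 + 0.5) - sat.cdf(1100 - 0.5)

# Without the correction: treat the endpoints as exact continuous values.
without_correction = sat.cdf(1200) - sat.cdf(1100)

print(round(with_correction * 100.0, 1))     # 18.4, matching the doctest
print(round(without_correction * 100.0, 1))  # slightly smaller
```

The corrected interval is one point wider in total, so its probability is slightly larger.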
What percentage of men and women will have the same height in `two normally What percentage of men and women will have the same height in `two normally
distributed populations with known means and standard deviations distributed populations with known means and standard deviations
...@@ -616,18 +618,19 @@ distributed populations with known means and standard deviations ...@@ -616,18 +618,19 @@ distributed populations with known means and standard deviations
To estimate the distribution for a model that isn't easy to solve To estimate the distribution for a model that isn't easy to solve
analytically, :class:`NormalDist` can generate input samples for a `Monte analytically, :class:`NormalDist` can generate input samples for a `Monte
Carlo simulation <https://en.wikipedia.org/wiki/Monte_Carlo_method>`_ of the Carlo simulation <https://en.wikipedia.org/wiki/Monte_Carlo_method>`_:
model:
.. doctest:: .. doctest::
>>> def model(x, y, z):
... return (3*x + 7*x*y - 5*y) / (11 * z)
...
>>> n = 100_000 >>> n = 100_000
>>> X = NormalDist(350, 15).samples(n) >>> X = NormalDist(10, 2.5).samples(n)
>>> Y = NormalDist(47, 17).samples(n) >>> Y = NormalDist(15, 1.75).samples(n)
>>> Z = NormalDist(62, 6).samples(n) >>> Z = NormalDist(5, 1.25).samples(n)
>>> model_simulation = [x * y / z for x, y, z in zip(X, Y, Z)] >>> NormalDist.from_samples(map(model, X, Y, Z)) # doctest: +SKIP
>>> NormalDist.from_samples(model_simulation) # doctest: +SKIP NormalDist(mu=19.640137307085507, sigma=47.03273142191088)
NormalDist(mu=267.6516398754636, sigma=101.357284306067)
Normal distributions commonly arise in machine learning problems. Normal distributions commonly arise in machine learning problems.
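The Monte Carlo doctest above is marked `+SKIP` because its output changes from run to run. A reproducible variant can be sketched by passing the keyword-only `seed` argument of `NormalDist.samples()`. The `simulate` helper, the seed values, and the smaller sample count below are assumptions made for illustration and do not appear in the commit:

```python
from statistics import NormalDist


def model(x, y, z):
    # Same model as in the revised doctest.
    return (3*x + 7*x*y - 5*y) / (11 * z)


def simulate(n, seed):
    """Hypothetical helper: run the simulation with fixed seeds."""
    # A different seed per input keeps the three sample streams distinct.
    X = NormalDist(10, 2.5).samples(n, seed=seed)
    Y = NormalDist(15, 1.75).samples(n, seed=seed + 1)
    Z = NormalDist(5, 1.25).samples(n, seed=seed + 2)
    # Fit a normal distribution to the simulated model outputs.
    return NormalDist.from_samples(map(model, X, Y, Z))


first = simulate(10_000, seed=12345)
second = simulate(10_000, seed=12345)
print(first == second)  # same seeds give the same estimate
```

Because the model divides by `z`, occasional draws of `Z` near zero produce extreme outputs, so the fitted `sigma` can vary noticeably between seeds.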
......