Commit cc353a0c, authored Mar 10, 2019 by Raymond Hettinger
Committed by Miss Islington (bot), Mar 10, 2019
Various refinements to the NormalDist examples and recipes (GH-12272)
parent 491ef53c
Showing 1 changed file with 26 additions and 23 deletions
Doc/library/statistics.rst
...
...
@@ -510,10 +510,9 @@ of applications in statistics.
.. classmethod:: NormalDist.from_samples(data)
Class method that makes a normal distribution instance
from sample data. The *data* can be any :term:`iterable`
and should consist of values that can be converted to type
:class:`float`.
Makes a normal distribution instance computed from sample data. The
*data* can be any :term:`iterable` and should consist of values that
can be converted to type :class:`float`.
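As a minimal sketch of the behaviour described above (the sample values are made up for illustration), a generator of integers is accepted, since *data* may be any iterable of values convertible to ``float``:

```python
from statistics import NormalDist

# Hypothetical sample data; any iterable works, including a generator.
samples = (x for x in [2, 4, 6])
dist = NormalDist.from_samples(samples)

print(dist.mean)   # arithmetic mean of the samples
print(dist.stdev)  # sample standard deviation of the samples
```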
If *data* does not contain at least two elements, raises
:exc:`StatisticsError` because it takes at least one point to estimate
...
...
@@ -536,11 +535,10 @@ of applications in statistics.
the given value *x*. Mathematically, it is the ratio ``P(x <= X <
x+dx) / dx``.
Note the relative likelihood of *x* can be greater than `1.0`. The
probability for a specific point on a continuous distribution is `0.0`,
so the :func:`pdf` is used instead. It gives the probability of a
sample occurring in a narrow range around *x* and then dividing that
probability by the width of the range (hence the word "density").
The relative likelihood is computed as the probability of a sample
occurring in a narrow range divided by the width of the range (hence
the word "density"). Since the likelihood is relative to other points,
its value can be greater than `1.0`.
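The following sketch illustrates the "density" wording under assumed parameters (``sigma=0.1`` and the interval width ``dx`` are arbitrary choices for this example). It compares ``pdf(x)`` with the probability of a narrow range divided by its width, and shows that the density can exceed ``1.0``:

```python
from statistics import NormalDist

# A narrow distribution, chosen so that the density at the mean exceeds 1.0.
narrow = NormalDist(mu=0.0, sigma=0.1)
density_at_mean = narrow.pdf(0.0)

# Probability of landing in [x, x + dx), divided by the width dx.
dx = 1e-6  # arbitrary small width, picked for this sketch
approx_density = (narrow.cdf(0.0 + dx) - narrow.cdf(0.0)) / dx

print(density_at_mean > 1.0)
print(abs(approx_density - density_at_mean) < 1e-3)
```

The two values agree closely because the ratio approaches the density as ``dx`` shrinks.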
.. method:: NormalDist.cdf(x)
...
...
@@ -568,7 +566,8 @@ of applications in statistics.
>>> temperature_february * (9/5) + 32 # Fahrenheit
NormalDist(mu=41.0, sigma=4.5)
Dividing a constant by an instance of :class:`NormalDist` is not supported.
Dividing a constant by an instance of :class:`NormalDist` is not supported
because the result wouldn't be normally distributed.
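A short sketch of the asymmetry, using made-up parameters: dividing an instance by a constant rescales both ``mu`` and ``sigma``, while dividing a constant by an instance is expected to raise :exc:`TypeError` because no reflected division is defined:

```python
from statistics import NormalDist

temperature = NormalDist(5, 2.5)   # illustrative values only

# Instance divided by a constant: both mu and sigma are rescaled.
halved = temperature / 2

# Constant divided by an instance: not supported.
try:
    1 / temperature
except TypeError:
    reverse_division_supported = False
else:
    reverse_division_supported = True

print(halved)
print(reverse_division_supported)
```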
Since normal distributions arise from additive effects of independent
variables, it is possible to `add and subtract two independent normally
...
...
@@ -581,8 +580,10 @@ of applications in statistics.
>>> birth_weights = NormalDist.from_samples([2.5, 3.1, 2.1, 2.4, 2.7, 3.5])
>>> drug_effects = NormalDist(0.4, 0.15)
>>> combined = birth_weights + drug_effects
>>> f'mean: {combined.mean :.1f} standard deviation: {combined.stdev :.1f}'
'mean: 3.1 standard deviation: 0.5'
>>> round(combined.mean, 1)
3.1
>>> round(combined.stdev, 1)
0.5
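For independent normal variables, the means add and the variances add, so the combined standard deviation is the square root of the sum of squares. A minimal check of that rule, with parameters chosen only for illustration:

```python
from math import hypot, isclose
from statistics import NormalDist

# Illustrative parameters, not taken from the sample data above.
weights = NormalDist(3.0, 0.5)
effects = NormalDist(0.4, 0.15)
total = weights + effects

# Means add; standard deviations combine as sqrt(s1**2 + s2**2).
print(isclose(total.mean, 3.0 + 0.4))
print(isclose(total.stdev, hypot(0.5, 0.15)))
```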
.. versionadded:: 3.8
...
...
@@ -595,14 +596,15 @@ of applications in statistics.
For example, given `historical data for SAT exams
<https://blog.prepscholar.com/sat-standard-deviation>`_ showing that scores
are normally distributed with a mean of 1060 and a standard deviation of 195,
determine the percentage of students with scores between 1100 and 1200:
determine the percentage of students with scores between 1100 and 1200, after
rounding to the nearest whole number:
.. doctest::
>>> sat = NormalDist(1060, 195)
>>> fraction = sat.cdf(1200 + 0.5) - sat.cdf(1100 - 0.5)
>>> f'{fraction * 100 :.1f}% score between 1100 and 1200'
'18.4% score between 1100 and 1200'
>>> round(fraction * 100.0, 1)
18.4
What percentage of men and women will have the same height in `two normally
distributed populations with known means and standard deviations
...
...
@@ -616,18 +618,19 @@ distributed populations with known means and standard deviations
To estimate the distribution for a model that isn't easy to solve
analytically, :class:`NormalDist` can generate input samples for a `Monte
Carlo simulation <https://en.wikipedia.org/wiki/Monte_Carlo_method>`_ of the
model:
Carlo simulation <https://en.wikipedia.org/wiki/Monte_Carlo_method>`_:
.. doctest::
>>> def model(x, y, z):
... return (3*x + 7*x*y - 5*y) / (11 * z)
...
>>> n = 100_000
>>> X = NormalDist(350, 15).samples(n)
>>> Y = NormalDist(47, 17).samples(n)
>>> Z = NormalDist(62, 6).samples(n)
>>> model_simulation = [x * y / z for x, y, z in zip(X, Y, Z)]
>>> NormalDist.from_samples(model_simulation) # doctest: +SKIP
NormalDist(mu=267.6516398754636, sigma=101.357284306067)
>>> X = NormalDist(10, 2.5).samples(n)
>>> Y = NormalDist(15, 1.75).samples(n)
>>> Z = NormalDist(5, 1.25).samples(n)
>>> NormalDist.from_samples(map(model, X, Y, Z)) # doctest: +SKIP
NormalDist(mu=19.640137307085507, sigma=47.03273142191088)
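Because the samples are random, the output above varies from run to run, which is why the doctest is skipped. One way to make such a simulation repeatable is the keyword-only *seed* argument of :meth:`NormalDist.samples`. The sketch below assumes arbitrary seed values and a smaller ``n`` to keep it quick. It does not reproduce the figures shown above.

```python
from statistics import NormalDist

def model(x, y, z):
    return (3*x + 7*x*y - 5*y) / (11 * z)

def simulate(n=10_000):
    # Seeds 1, 2 and 3 are arbitrary; fixed seeds make each run repeatable.
    X = NormalDist(10, 2.5).samples(n, seed=1)
    Y = NormalDist(15, 1.75).samples(n, seed=2)
    Z = NormalDist(5, 1.25).samples(n, seed=3)
    return NormalDist.from_samples(map(model, X, Y, Z))

first_run = simulate()
second_run = simulate()
print(first_run == second_run)  # same seeds give the same estimate
```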
Normal distributions commonly arise in machine learning problems.
...
...