Commit 1f58f4fa
Authored Mar 06, 2019 by Raymond Hettinger
Committed by Miss Islington (bot), Mar 06, 2019
Refine statistics.NormalDist documentation and improve test coverage (GH-12208)
parent 318d537d
Showing 2 changed files with 26 additions and 29 deletions (+26 -29)
Doc/library/statistics.rst (+24 -28)
Lib/test/test_statistics.py (+2 -1)
Doc/library/statistics.rst
...
...
@@ -479,7 +479,7 @@ measurements as a single entity.
 Normal distributions arise from the `Central Limit Theorem
 <https://en.wikipedia.org/wiki/Central_limit_theorem>`_ and have a wide range
-of applications in statistics, including simulations and hypothesis testing.
+of applications in statistics.
 .. class:: NormalDist(mu=0.0, sigma=1.0)
...
...
@@ -492,19 +492,19 @@ of applications in statistics, including simulations and hypothesis testing.
 .. attribute:: mean
-A read-only property representing the `arithmetic mean
+A read-only property for the `arithmetic mean
 <https://en.wikipedia.org/wiki/Arithmetic_mean>`_ of a normal
 distribution.
 .. attribute:: stdev
-A read-only property representing the `standard deviation
+A read-only property for the `standard deviation
 <https://en.wikipedia.org/wiki/Standard_deviation>`_ of a normal
 distribution.
 .. attribute:: variance
-A read-only property representing the `variance
+A read-only property for the `variance
 <https://en.wikipedia.org/wiki/Variance>`_ of a normal
 distribution. Equal to the square of the standard deviation.
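Reviewer note: the three attributes documented in this hunk can be exercised with a short sketch. The parameters `mu=10` and `sigma=3` are illustrative values, not taken from the commit.

```python
from statistics import NormalDist

# Illustrative parameters, not taken from the commit.
dist = NormalDist(mu=10, sigma=3)

print(dist.mean)      # 10.0
print(dist.stdev)     # 3.0
print(dist.variance)  # 9.0, the square of stdev

# The properties are read-only, so assignment raises AttributeError.
try:
    dist.mean = 12
except AttributeError:
    read_only = True
else:
    read_only = False

print(read_only)  # True
```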
...
...
@@ -584,8 +584,8 @@ of applications in statistics, including simulations and hypothesis testing.
 Dividing a constant by an instance of :class:`NormalDist` is not supported.
 Since normal distributions arise from additive effects of independent
-variables, it is possible to `add and subtract two normally distributed random variables
+variables, it is possible to `add and subtract two independent normally distributed random variables
 <https://en.wikipedia.org/wiki/Sum_of_normally_distributed_random_variables>`_
 represented as instances of :class:`NormalDist`. For example:
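Reviewer note: the example that follows this sentence in the docs is elided from the diff. As a stand-in, here is a minimal sketch of adding and subtracting two independent normal variables. The names `x` and `y` and their parameters are hypothetical, chosen so the arithmetic is easy to check: means add or subtract, while variances add in both cases.

```python
from statistics import NormalDist

# Hypothetical independent variables; values chosen for easy arithmetic.
x = NormalDist(100, 3)
y = NormalDist(50, 4)

total = x + y       # means add; variances add, so sigma = sqrt(3**2 + 4**2)
difference = x - y  # means subtract; variances still add

print(total)       # NormalDist(mu=150.0, sigma=5.0)
print(difference)  # NormalDist(mu=50.0, sigma=5.0)
```

The word "independent" added by this commit matters: the variance rule used above only holds when the two variables are uncorrelated.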
...
...
@@ -607,15 +607,15 @@ of applications in statistics, including simulations and hypothesis testing.
 For example, given `historical data for SAT exams
 <https://blog.prepscholar.com/sat-standard-deviation>`_ showing that scores
-are normally distributed with a mean of 1060 and standard deviation of 192,
+are normally distributed with a mean of 1060 and a standard deviation of 192,
 determine the percentage of students with scores between 1100 and 1200:
 .. doctest::
 >>> sat = NormalDist(1060, 195)
->>> fraction = sat.cdf(1200) - sat.cdf(1100)
+>>> fraction = sat.cdf(1200 + 0.5) - sat.cdf(1100 - 0.5)
 >>> f'{fraction * 100 :.1f}% score between 1100 and 1200'
-'18.2% score between 1100 and 1200'
+'18.4% score between 1100 and 1200'
 What percentage of men and women will have the same height in `two normally
 distributed populations with known means and standard deviations
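Reviewer note: the sketch below runs the removed and added `cdf` lines side by side. It follows the doctest and uses `NormalDist(1060, 195)`, although the prose in this hunk states a standard deviation of 192. The commit does not explain the 0.5 offsets. They look like a continuity correction, presumably because scores are reported as whole numbers, but that reading is an assumption.

```python
from statistics import NormalDist

sat = NormalDist(1060, 195)  # parameters as written in the doctest

# Removed line: plain interval between the two scores.
plain = sat.cdf(1200) - sat.cdf(1100)

# Added line: interval widened by 0.5 at each end.
corrected = sat.cdf(1200 + 0.5) - sat.cdf(1100 - 0.5)

print(f'{plain * 100:.1f}% score between 1100 and 1200')      # 18.2% ...
print(f'{corrected * 100:.1f}% score between 1100 and 1200')  # 18.4% ...
```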
...
...
@@ -644,20 +644,12 @@ model:
 Normal distributions commonly arise in machine learning problems.
-Wikipedia has a `nice example with a Naive Bayesian Classifier
-<https://en.wikipedia.org/wiki/Naive_Bayes_classifier>`_. The challenge
-is to guess a person's gender from measurements of normally distributed
-features including height, weight, and foot size.
+Wikipedia has a `nice example of a Naive Bayesian Classifier
+<https://en.wikipedia.org/wiki/Naive_Bayes_classifier>`_. The challenge is to
+predict a person's gender from measurements of normally distributed features
+including height, weight, and foot size.
-The `prior probability <https://en.wikipedia.org/wiki/Prior_probability>`_ of
-being male or female is 50%:
-.. doctest::
->>> prior_male = 0.5
->>> prior_female = 0.5
-We also have a training dataset with measurements for eight people. These
+We're given a training dataset with measurements for eight people. The
 measurements are assumed to be normally distributed, so we summarize the data
 with :class:`NormalDist`:
...
...
@@ -670,8 +662,8 @@ with :class:`NormalDist`:
 >>> foot_size_male = NormalDist.from_samples([12, 11, 12, 10])
 >>> foot_size_female = NormalDist.from_samples([6, 8, 7, 9])
-We observe a new person whose feature measurements are known but whose gender
-is unknown:
+Next, we encounter a new person whose feature measurements are known but whose
+gender is unknown:
 .. doctest::
...
...
@@ -679,19 +671,23 @@ is unknown:
 >>> wt = 130  # weight
 >>> fs = 8  # foot size
-The posterior is the product of the prior times each likelihood of a
-feature measurement given the gender:
+Starting with a 50% `prior probability
+<https://en.wikipedia.org/wiki/Prior_probability>`_ of being male or female,
+we compute the posterior as the prior times the product of likelihoods for the
+feature measurements given the gender:
 .. doctest::
+>>> prior_male = 0.5
+>>> prior_female = 0.5
 >>> posterior_male = (prior_male * height_male.pdf(ht) *
 ... weight_male.pdf(wt) * foot_size_male.pdf(fs))
 >>> posterior_female = (prior_female * height_female.pdf(ht) *
 ... weight_female.pdf(wt) * foot_size_female.pdf(fs))
-The final prediction is awarded to the largest posterior -- this is known as
-the `maximum a posteriori
+The final prediction goes to the largest posterior. This is known as the
+`maximum a posteriori
 <https://en.wikipedia.org/wiki/Maximum_a_posteriori_estimation>`_ or MAP:
 .. doctest::
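Reviewer note: the final doctest is elided from the diff, so here is a self-contained sketch of the whole classifier as the reworded prose describes it. Only the foot-size samples, `wt`, and `fs` appear in the hunks above. The height and weight samples and the value of `ht` are assumptions taken from the linked Wikipedia example.

```python
from statistics import NormalDist

# Training data. Foot sizes appear in the diff; the height and weight
# samples are elided there and assumed from the linked Wikipedia example.
height_male = NormalDist.from_samples([6, 5.92, 5.58, 5.92])
height_female = NormalDist.from_samples([5, 5.5, 5.42, 5.75])
weight_male = NormalDist.from_samples([180, 190, 170, 165])
weight_female = NormalDist.from_samples([100, 150, 130, 150])
foot_size_male = NormalDist.from_samples([12, 11, 12, 10])
foot_size_female = NormalDist.from_samples([6, 8, 7, 9])

# The new person to classify.
ht = 6.0  # height (assumed; elided in the diff)
wt = 130  # weight
fs = 8    # foot size

# Posterior = prior * product of per-feature likelihoods.
# The shared normalizing constant is omitted because it does not
# change which posterior is larger.
prior_male = 0.5
prior_female = 0.5
posterior_male = (prior_male * height_male.pdf(ht) *
                  weight_male.pdf(wt) * foot_size_male.pdf(fs))
posterior_female = (prior_female * height_female.pdf(ht) *
                    weight_female.pdf(wt) * foot_size_female.pdf(fs))

# MAP decision: pick the class with the larger posterior.
prediction = 'male' if posterior_male > posterior_female else 'female'
print(prediction)  # female
```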
...
...
Lib/test/test_statistics.py
...
...
@@ -2123,6 +2123,7 @@ class TestNormalDist(unittest.TestCase):
             0.3605, 0.3589, 0.3572, 0.3555, 0.3538,
         ]):
             self.assertAlmostEqual(Z.pdf(x / 100.0), px, places=4)
+            self.assertAlmostEqual(Z.pdf(-x / 100.0), px, places=4)
         # Error case: variance is zero
         Y = NormalDist(100, 0)
         with self.assertRaises(statistics.StatisticsError):
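Reviewer note: the added assertion checks that the standard normal density is symmetric about zero. The sketch below shows both behaviors this hunk touches, outside of `unittest`. The sample points and the `pdf(90)` call on the zero-variance distribution are assumptions, since the body of the `with` block is elided from the diff.

```python
import math
from statistics import NormalDist, StatisticsError

Z = NormalDist()  # standard normal: mu=0.0, sigma=1.0

# Symmetry: the density at x matches the density at -x.
# The sample points are arbitrary illustrations.
symmetric = all(math.isclose(Z.pdf(x), Z.pdf(-x))
                for x in (0.0, 0.45, 1.0, 2.5))
print(symmetric)  # True

# Error case: with zero variance the density is undefined.
# Calling pdf() here is an assumption about the elided test body.
Y = NormalDist(100, 0)
try:
    Y.pdf(90)
except StatisticsError:
    raised = True
else:
    raised = False
print(raised)  # True
```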
...
...
@@ -2262,7 +2263,7 @@ class TestNormalDist(unittest.TestCase):
         self.assertEqual(X * y, NormalDist(1000, 150))  # __mul__
         self.assertEqual(y * X, NormalDist(1000, 150))  # __rmul__
         self.assertEqual(X / y, NormalDist(10, 1.5))  # __truediv__
-        with self.assertRaises(TypeError):
+        with self.assertRaises(TypeError):  # __rtruediv__
             y / X
     def test_equality(self):
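Reviewer note: the operator coverage labeled by these comments can be reproduced in a few lines. The definitions of `X` and `y` sit above the hunk and are not shown, so `NormalDist(100, 15)` and `10` are inferred from the expected results and should be treated as assumptions.

```python
from statistics import NormalDist

X = NormalDist(100, 15)  # assumed; inferred from the expected values
y = 10                   # assumed scaling constant

product = X * y        # __mul__: scales both mu and sigma
reflected = y * X      # __rmul__: same result with operands swapped
quotient = X / y       # __truediv__: divides both mu and sigma

print(product)    # NormalDist(mu=1000.0, sigma=150.0)
print(reflected)  # NormalDist(mu=1000.0, sigma=150.0)
print(quotient)   # NormalDist(mu=10.0, sigma=1.5)

# Dividing a constant by a NormalDist is not supported.
try:
    y / X
except TypeError:
    rtruediv_rejected = True
else:
    rtruediv_rejected = False
print(rtruediv_rejected)  # True
```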
...
...