Commit 5ed67937 authored by Jason Madden's avatar Jason Madden

Merge pull request #708 from gevent/cffi-semaphore

Use a manually locked pure-Python Semaphore implementation on PyPy
parents 9a82fc7e 41319891
......@@ -23,15 +23,16 @@ gevent/gevent.ares.c: gevent/ares.pyx gevent/*.pxd
$(CYTHON) -o gevent.ares.c gevent/ares.pyx
mv gevent.ares.* gevent/
gevent/gevent._semaphore.c: gevent/_semaphore.pyx gevent/_semaphore.pxd
# For PyPy, we need to have _semaphore named as a .pyx file so it doesn't
# get loaded in preference to the .so. But we want to keep the definitions
gevent/gevent._semaphore.c: gevent/_semaphore.py gevent/_semaphore.pxd
# On PyPy, if we wanted to use Cython to compile _semaphore.py, we'd
# need to have _semaphore named as a .pyx file so it doesn't get
# loaded in preference to the .so. (We want to keep the definitions
# separate in a .pxd file for ease of reading, and that only works
# with .py files.
cp gevent/_semaphore.pyx gevent/_semaphore.py
# with .py files, so we'd have to copy them back and forth.)
# cp gevent/_semaphore.pyx gevent/_semaphore.py
$(CYTHON) -o gevent._semaphore.c gevent/_semaphore.py
mv gevent._semaphore.* gevent/
rm gevent/_semaphore.py
# rm gevent/_semaphore.py
clean:
rm -f corecext.pyx gevent/corecext.pyx
......
......@@ -5,7 +5,5 @@ IF "%PYTHON_EXE%" == "python" (
)
cython -o gevent.ares.c gevent/ares.pyx
move gevent.ares.* gevent
move gevent\\_semaphore.pyx gevent\\_semaphore.py
cython -o gevent._semaphore.c gevent/_semaphore.py
move gevent._semaphore.* gevent
del gevent\\_semaphore.py
......@@ -12,6 +12,14 @@
shipped with. Some Linux distributions, including RedHat/CentOS and
Amazon have backported the changes to older versions. Reported in
:issue:`702`.
- PyPy: An interaction between Cython compiled code and the garbage
collector caused PyPy to crash when a previously-allocated Semaphore
was used in a ``__del__`` method, something done in the popular
libraries ``requests`` and ``urllib3``. Due to this and other Cython
related issues, the Semaphore class is no longer compiled by Cython.
This means that it is now traceable and not exactly as atomic as the
Cython version, though the overall semantics should remain the same.
Reported in :issue:`704` by Shaun Crampton.
1.1rc2 (Dec 11, 2015)
=====================
......
......@@ -37,7 +37,8 @@ PyPy Notes
PyPy has been tested on OS X and 64-bit Linux from version 2.6.1
through 4.0.0 and 4.0.1.
.. note:: PyPy is not supported on Windows.
.. note:: PyPy is not supported on Windows. (gevent's CFFI backend is not
available on Windows.)
- Version 2.6.1 or above is required for proper signal handling. Prior
to 2.6.1 and its inclusion of `cffi 1.3.0`_, signals could be
......@@ -51,24 +52,24 @@ through 4.0.0 and 4.0.1.
well or better on PyPy than on CPython at least on some platforms.
Things that are known or expected to be (relatively) slower under
PyPy include the :mod:`c-ares resolver <gevent.resolver_ares>` and
:class:`gevent.lock.Semaphore`. Whether or not these matter will
:class:`~gevent.lock.Semaphore`. Whether or not these matter will
depend on the workload of each application.
.. note:: Released versions of PyPy through at least 4.0.0 have `a
.. caution:: The ``c-ares`` resolver is considered highly experimental
under PyPy and is not recommended for production use.
Released versions of PyPy through at least 4.0.1 have `a
bug`_ that can cause a memory leak when subclassing
objects that are implemented in Cython, as are the two
things mentioned above. The `Semaphore` class is
subclassed to become :class:`gevent.thread.LockType`,
which in turn is used as the basis for
:class:`threading.Lock`. The `Semaphore` object is coded
carefully to avoid this leak, assuming it is deallocated
when not acquired (which should be the typical case). The
``c-ares`` package has not been audited for this issue.
objects that are implemented in Cython, as is the c-ares
resolver. In addition, thanks to reports like
:issue:`704`, we know that the PyPy garbage collector can
interact badly with Cython-compiled code, leading to
crashes. While the intended use of the ares resolver has
been loosely audited for these issues, no guarantees are made.
.. note:: PyPy 4.0.x on Linux is known to *rarely* (once per 24 hours)
encounter crashes when running heavily loaded, heavily
networked gevent programs. The exact cause is unknown and is
being tracked in :issue:`677`.
networked gevent programs (even without ``c-ares``). The
exact cause is unknown and is being tracked in :issue:`677`.
.. _cffi 1.3.0: https://bitbucket.org/cffi/cffi/src/ad3140a30a7b0ca912185ef500546a9fb5525ece/doc/source/whatsnew.rst?at=default
.. _1.2.0: https://cffi.readthedocs.org/en/latest/whatsnew.html#v1-2-0
......
......@@ -17,7 +17,13 @@ class Semaphore(object):
If not given, ``value`` defaults to 1.
This Semaphore's ``__exit__`` method does not call the trace function.
The semaphore is a context manager and can be used in ``with`` statements.
This Semaphore's ``__exit__`` method does not call the trace function
on CPython, but does under PyPy.
.. seealso:: :class:`BoundedSemaphore` for a safer version that prevents
some classes of bugs.
"""
def __init__(self, value=1):
......@@ -53,6 +59,9 @@ class Semaphore(object):
return self.counter <= 0
def release(self):
"""
Release the semaphore, notifying any waiters if needed.
"""
self.counter += 1
self._start_notify()
return self.counter
......@@ -112,6 +121,9 @@ class Semaphore(object):
*callback* will be called in the :class:`Hub <gevent.hub.Hub>`, so it must not use blocking gevent API.
*callback* will be passed one argument: this instance.
This method is normally called automatically by :meth:`acquire` and :meth:`wait`; most code
will not need to use it.
"""
if not callable(callback):
raise TypeError('Expected callable:', callback)
......@@ -125,7 +137,10 @@ class Semaphore(object):
"""
unlink(callback) -> None
Remove the callback set by :meth:`rawlink`
Remove the callback set by :meth:`rawlink`.
This method is normally called automatically by :meth:`acquire` and :meth:`wait`; most
code will not need to use it.
"""
try:
self._links.remove(callback)
......@@ -169,7 +184,7 @@ class Semaphore(object):
Wait until it is possible to acquire this semaphore, or until the optional
*timeout* elapses.
.. warning:: If this semaphore was initialized with a size of 0,
.. caution:: If this semaphore was initialized with a size of 0,
this method will block forever if no timeout is given.
:keyword float timeout: If given, specifies the maximum amount of seconds
......@@ -189,7 +204,7 @@ class Semaphore(object):
Acquire the semaphore.
.. warning:: If this semaphore was initialized with a size of 0,
.. caution:: If this semaphore was initialized with a size of 0,
this method will block forever (unless a timeout is given or blocking is
set to false).
......@@ -233,6 +248,8 @@ class Semaphore(object):
class BoundedSemaphore(Semaphore):
"""
BoundedSemaphore(value=1) -> BoundedSemaphore
A bounded semaphore checks to make sure its current value doesn't
exceed its initial value. If it does, :class:`ValueError` is
raised. In most situations semaphores are used to guard resources
......@@ -242,11 +259,12 @@ class BoundedSemaphore(Semaphore):
If not given, *value* defaults to 1.
"""
#: For monkey-patching, allow changing the class of error we raise
_OVER_RELEASE_ERROR = ValueError
def __init__(self, value=1):
Semaphore.__init__(self, value)
self._initial_value = value
def __init__(self, *args, **kwargs):
Semaphore.__init__(self, *args, **kwargs)
self._initial_value = self.counter
def release(self):
if self.counter >= self._initial_value:
......
......@@ -161,7 +161,7 @@ def wait_read(fileno, timeout=None, timeout_exc=_NONE):
For the meaning of the other parameters and possible exceptions,
see :func:`wait`.
.. seealso: :func:`cancel_wait`
.. seealso:: :func:`cancel_wait`
"""
io = get_hub().loop.io(fileno, 1)
return wait(io, timeout, timeout_exc)
......@@ -177,7 +177,7 @@ def wait_write(fileno, timeout=None, timeout_exc=_NONE, event=_NONE):
:keyword event: Ignored. Applications should not pass this parameter.
In the future, it may become an error.
.. seealso: :func:`cancel_wait`
.. seealso:: :func:`cancel_wait`
"""
io = get_hub().loop.io(fileno, 2)
return wait(io, timeout, timeout_exc)
......@@ -194,7 +194,7 @@ def wait_readwrite(fileno, timeout=None, timeout_exc=_NONE, event=_NONE):
:keyword event: Ignored. Applications should not pass this parameter.
In the future, it may become an error.
.. seealso: :func:`cancel_wait`
.. seealso:: :func:`cancel_wait`
"""
io = get_hub().loop.io(fileno, 3)
return wait(io, timeout, timeout_exc)
......
# Copyright (c) 2009-2012 Denis Bilenko. See LICENSE for details.
"""Locking primitives"""
from __future__ import absolute_import
from gevent.hub import getcurrent
from gevent.hub import getcurrent, PYPY
from gevent._semaphore import Semaphore, BoundedSemaphore
__all__ = ['Semaphore', 'DummySemaphore', 'BoundedSemaphore', 'RLock']
# On PyPy, we don't compile the Semaphore class with Cython. Under
# Cython, each individual method holds the GIL for its entire
# duration, ensuring that no other thread can interrupt us in an
# unsafe state (only when we _do_wait do we call back into Python and
# allow switching threads). Simulate that here through the use of a manual
# lock. (We use a separate lock for each semaphore to allow sys.settrace functions
# to use locks *other* than the one being traced.)
if PYPY:
# TODO: Need to use monkey.get_original?
from thread import allocate_lock as _allocate_lock
from thread import get_ident as _get_ident
_sem_lock = _allocate_lock()
class _OwnedLock(object):
def __init__(self):
self._owner = None
self._block = _allocate_lock()
self._locking = {}
self._count = 0
def untraceable(f):
# Don't allow re-entry to these functions in a single thread, as can
# happen if a sys.settrace is used
def wrapper(self):
me = _get_ident()
try:
count = self._locking[me]
except KeyError:
count = self._locking[me] = 1
else:
count = self._locking[me] = count + 1
if count:
return
try:
return f(self)
finally:
count = count - 1
if not count:
del self._locking[me]
else:
self._locking[me] = count
return wrapper
@untraceable
def acquire(self):
me = _get_ident()
if self._owner == me:
self._count += 1
return
self._owner = me
self._block.acquire()
self._count = 1
@untraceable
def release(self):
self._count = count = self._count - 1
if not count:
self._block.release()
self._owner = None
# acquire, wait, and release all acquire the lock on entry and release it
# on exit. acquire and wait can call _do_wait, which must release it on entry
# and re-acquire it for them on exit.
class _around(object):
before = None
after = None
def __enter__(self):
self.before()
def __exit__(self, t, v, tb):
self.after()
def _decorate(func, cmname):
# functools.wrap?
def wrapped(self, *args, **kwargs):
with getattr(self, cmname):
return func(self, *args, **kwargs)
return wrapped
Semaphore._py3k_acquire = Semaphore.acquire = _decorate(Semaphore.acquire, '_lock_locked')
Semaphore.release = _decorate(Semaphore.release, '_lock_locked')
Semaphore.wait = _decorate(Semaphore.wait, '_lock_locked')
Semaphore._do_wait = _decorate(Semaphore._do_wait, '_lock_unlocked')
_Sem_init = Semaphore.__init__
def __init__(self, *args, **kwargs):
l = self._lock_lock = _OwnedLock()
self._lock_locked = _around()
self._lock_locked.before = l.acquire
self._lock_locked.after = l.release
self._lock_unlocked = _around()
self._lock_unlocked.before = l.release
self._lock_unlocked.after = l.acquire
_Sem_init(self, *args, **kwargs)
Semaphore.__init__ = __init__
del _decorate
class DummySemaphore(object):
"""
DummySemaphore(value=None) -> DummySemaphore
A Semaphore initialized with "infinite" initial value. None of its
methods ever block.
......@@ -34,6 +141,13 @@ class DummySemaphore(object):
# determines whether it should lock around IO to the underlying
# file object.
def __init__(self, value=None):
"""
.. versionchanged:: 1.1rc3
Accept and ignore a *value* argument for compatibility with Semaphore.
"""
pass
def __str__(self):
return '<%s>' % self.__class__.__name__
......@@ -42,6 +156,7 @@ class DummySemaphore(object):
return False
def release(self):
"""Releasing a dummy semaphore does nothing."""
pass
def rawlink(self, callback):
......@@ -52,6 +167,7 @@ class DummySemaphore(object):
pass
def wait(self, timeout=None):
"""Waiting for a DummySemaphore returns immediately."""
pass
def acquire(self, blocking=True, timeout=None):
......
......@@ -27,8 +27,9 @@ class Resolver(object):
reports of it not properly honoring certain system configurations.
However, because it does not use threads, it may scale better.
.. note:: This module is considered experimental on PyPy, and
due to its implementation in cython, it may be slower.
.. caution:: This module is considered extremely experimental on PyPy, and
due to its implementation in cython, it may be slower. It may also lead to
interpreter crashes.
.. _c-ares: http://c-ares.haxx.se
"""
......
......@@ -32,6 +32,24 @@ class TestSemaphore(greentest.TestCase):
r = weakref.ref(s)
self.assertEqual(s, r())
def test_semaphore_in_class_with_del(self):
# Issue #704. This used to crash the process
# under PyPy through at least 4.0.1 if the Semaphore
# was implemented with Cython.
class X(object):
def __init__(self):
self.s = Semaphore()
def __del__(self):
self.s.acquire()
X()
import gc
gc.collect()
gc.collect()
test_semaphore_in_class_with_del.ignore_leakcheck = True
class TestLock(greentest.TestCase):
......
......@@ -2,7 +2,7 @@ from __future__ import print_function
import sys
import subprocess
import unittest
import gevent.thread
from gevent.thread import allocate_lock
script = """
from gevent import monkey
......@@ -48,23 +48,101 @@ sys.stdout.write("..finishing..")
class TestTrace(unittest.TestCase):
def test_untraceable_lock(self):
# Untraceable locks were part of the solution to https://bugs.python.org/issue1733757
# which details a deadlock that could happen if a trace function invoked
# threading.currentThread at shutdown time---the cleanup lock would be held
# by the VM, and calling currentThread would try to acquire it again. The interpreter
# changed in 2.6 to use the `with` statement (https://hg.python.org/cpython/rev/76f577a9ec03/),
# which apparently doesn't trace in quite the same way.
if hasattr(sys, 'gettrace'):
old = sys.gettrace()
else:
old = None
PYPY = hasattr(sys, 'pypy_version_info')
lst = []
try:
def trace(frame, ev, arg):
lst.append((frame.f_code.co_filename, frame.f_lineno, ev))
if not PYPY: # because we expect to trace on PyPy
print("TRACE: %s:%s %s" % lst[-1])
return trace
with gevent.thread.allocate_lock():
with allocate_lock():
sys.settrace(trace)
finally:
sys.settrace(old)
self.failUnless(lst == [], "trace not empty")
if not PYPY:
self.assertEqual(lst, [], "trace not empty")
else:
# Have an assert so that we know if we miscompile
self.assertTrue(len(lst) > 0, "should not compile on pypy")
def test_untraceable_lock_uses_different_lock(self):
if hasattr(sys, 'gettrace'):
old = sys.gettrace()
else:
old = None
PYPY = hasattr(sys, 'pypy_version_info')
lst = []
# we should be able to use unrelated locks from within the trace function
l = allocate_lock()
try:
def trace(frame, ev, arg):
with l:
lst.append((frame.f_code.co_filename, frame.f_lineno, ev))
if not PYPY: # because we expect to trace on PyPy
print("TRACE: %s:%s %s" % lst[-1])
return trace
l2 = allocate_lock()
sys.settrace(trace)
# Separate functions, not the C-implemented `with` so the trace
# function gets a crack at them
l2.acquire()
l2.release()
finally:
sys.settrace(old)
if not PYPY:
self.assertEqual(lst, [], "trace not empty")
else:
# Have an assert so that we know if we miscompile
self.assertTrue(len(lst) > 0, "should not compile on pypy")
def test_untraceable_lock_uses_same_lock(self):
from gevent.hub import LoopExit
if hasattr(sys, 'gettrace'):
old = sys.gettrace()
else:
old = None
PYPY = hasattr(sys, 'pypy_version_info')
lst = []
e = None
# we should not be able to use the same lock from within the trace function
# because it's over acquired but instead of deadlocking it raises an exception
l = allocate_lock()
try:
def trace(frame, ev, arg):
with l:
lst.append((frame.f_code.co_filename, frame.f_lineno, ev))
return trace
sys.settrace(trace)
# Separate functions, not the C-implemented `with` so the trace
# function gets a crack at them
l.acquire()
except LoopExit as ex:
e = ex
finally:
sys.settrace(old)
if not PYPY:
self.assertEqual(lst, [], "trace not empty")
else:
# Have an assert so that we know if we miscompile
self.assertTrue(len(lst) > 0, "should not compile on pypy")
self.assertTrue(isinstance(e, LoopExit))
def run_script(self, more_args=()):
args = [sys.executable, "-c", script]
......
......@@ -10,6 +10,7 @@ from os.path import join, abspath, basename, dirname
from subprocess import check_call
from glob import glob
PYPY = hasattr(sys, 'pypy_version_info')
WIN = sys.platform.startswith('win')
CFFI_WIN_BUILD_ANYWAY = os.environ.get("PYPY_WIN_BUILD_ANYWAY")
......@@ -31,6 +32,8 @@ if WIN:
# Make sure the env vars that make.cmd needs are set
if not os.environ.get('PYTHON_EXE'):
os.environ['PYTHON_EXE'] = 'pypy' if PYPY else 'python'
if not os.environ.get('PYEXE'):
os.environ['PYEXE'] = os.environ['PYTHON_EXE']
import distutils
......@@ -417,11 +420,15 @@ elif PYPY:
# and having only the bare minimum be in cython might help reduce that penalty.
# NOTE: You must use version 0.23.4 or later to avoid a memory leak.
# https://mail.python.org/pipermail/cython-devel/2015-October/004571.html
Extension(name="gevent._semaphore",
sources=["gevent/gevent._semaphore.c"]),
# However, that's all for naught on up to and including PyPy 4.0.1 which
# have some serious crashing bugs with GC interacting with cython,
# so this is disabled (would need to add gevent/gevent._semaphore.c back to
# the run_make line)
#Extension(name="gevent._semaphore",
# sources=["gevent/gevent._semaphore.c"]),
]
include_package_data = True
run_make = 'gevent/gevent._semaphore.c gevent/gevent.ares.c'
run_make = 'gevent/gevent.ares.c'
else:
ext_modules = [
CORE,
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment