Commit bc77eff8 authored by Nick Coghlan's avatar Nick Coghlan Committed by GitHub

bpo-33042: Fix pre-initialization sys module configuration (GH-6157)

- new test case for pre-initialization of sys.warnoptions and sys._xoptions
- restored ability to call these APIs prior to Py_Initialize
- updated the docs for the affected APIs to make it clear they can be
  called before Py_Initialize
- also enhanced the existing embedding test cases
  to check for expected settings in the sys module
parent d02ac25a
......@@ -31,6 +31,9 @@ The following functions can be safely called before Python is initialized:
* :c:func:`Py_SetProgramName`
* :c:func:`Py_SetPythonHome`
* :c:func:`Py_SetStandardStreamEncoding`
* :c:func:`PySys_AddWarnOption`
* :c:func:`PySys_AddXOption`
* :c:func:`PySys_ResetWarnOptions`
* Informative functions:
......
......@@ -205,16 +205,24 @@ accessible to C code. They all work with the current interpreter thread's
.. c:function:: void PySys_ResetWarnOptions()
Reset :data:`sys.warnoptions` to an empty list.
Reset :data:`sys.warnoptions` to an empty list. This function may be
called prior to :c:func:`Py_Initialize`.
.. c:function:: void PySys_AddWarnOption(const wchar_t *s)
Append *s* to :data:`sys.warnoptions`.
Append *s* to :data:`sys.warnoptions`. This function must be called prior
to :c:func:`Py_Initialize` in order to affect the warnings filter list.
.. c:function:: void PySys_AddWarnOptionUnicode(PyObject *unicode)
Append *unicode* to :data:`sys.warnoptions`.
Note: this function is not currently usable from outside the CPython
implementation, as it must be called prior to the implicit import of
:mod:`warnings` in :c:func:`Py_Initialize` to be effective, but can't be
called until enough of the runtime has been initialized to permit the
creation of Unicode objects.
.. c:function:: void PySys_SetPath(const wchar_t *path)
Set :data:`sys.path` to a list object of paths found in *path* which should
......@@ -260,7 +268,8 @@ accessible to C code. They all work with the current interpreter thread's
.. c:function:: void PySys_AddXOption(const wchar_t *s)
Parse *s* as a set of :option:`-X` options and add them to the current
options mapping as returned by :c:func:`PySys_GetXOptions`.
options mapping as returned by :c:func:`PySys_GetXOptions`. This function
may be called prior to :c:func:`Py_Initialize`.
.. versionadded:: 3.2
......
......@@ -951,6 +951,14 @@ Build and C API Changes
second argument is *NULL* and the :c:type:`wchar_t*` string contains null
characters. (Contributed by Serhiy Storchaka in :issue:`30708`.)
- Changes to the startup sequence and the management of dynamic memory
allocators mean that the long documented requirement to call
:c:func:`Py_Initialize` before calling most C API functions is now
relied on more heavily, and failing to abide by it may lead to segfaults in
embedding applications. See the :ref:`porting-to-python-37` section in this
document and the :ref:`pre-init-safe` section in the C API documentation
for more details.
Other CPython Implementation Changes
====================================
......@@ -1098,6 +1106,7 @@ API and Feature Removals
``asyncio._overlapped``. Replace ``from asyncio import selectors`` with
``import selectors`` for example.
.. _porting-to-python-37:
Porting to Python 3.7
=====================
......@@ -1282,14 +1291,24 @@ Other CPython implementation changes
------------------------------------
* In preparation for potential future changes to the public CPython runtime
initialization API (see :pep:`432` for details), CPython's internal startup
initialization API (see :pep:`432` for an initial, but somewhat outdated,
draft), CPython's internal startup
and configuration management logic has been significantly refactored. While
these updates are intended to be entirely transparent to both embedding
applications and users of the regular CPython CLI, they're being mentioned
here as the refactoring changes the internal order of various operations
during interpreter startup, and hence may uncover previously latent defects,
either in embedding applications, or in CPython itself.
(Contributed by Nick Coghlan and Eric Snow as part of :issue:`22257`.)
(Initially contributed by Nick Coghlan and Eric Snow as part of
:issue:`22257`, and further updated by Nick, Eric, and Victor Stinner in a
number of other issues). Some known details affected:
* :c:func:`PySys_AddWarnOptionUnicode` is not currently usable by embedding
applications due to the requirement to create a Unicode object prior to
calling `Py_Initialize`. Use :c:func:`PySys_AddWarnOption` instead.
* warnings filters added by an embedding application with
:c:func:`PySys_AddWarnOption` should now more consistently take precedence
over the default filters set by the interpreter
* Due to changes in the way the default warnings filters are configured,
setting :c:data:`Py_BytesWarningFlag` to a value greater than one is no longer
......
......@@ -51,7 +51,7 @@ class EmbeddingTests(unittest.TestCase):
if p.returncode != 0 and support.verbose:
print(f"--- {cmd} failed ---")
print(f"stdout:\n{out}")
print(f"stderr:\n{out}")
print(f"stderr:\n{err}")
print(f"------")
self.assertEqual(p.returncode, 0,
......@@ -83,7 +83,7 @@ class EmbeddingTests(unittest.TestCase):
for line in out.splitlines():
if line == "--- Pass {} ---".format(numloops):
self.assertEqual(len(current_run), 0)
if support.verbose:
if support.verbose > 1:
print(line)
numloops += 1
continue
......@@ -96,7 +96,7 @@ class EmbeddingTests(unittest.TestCase):
# Parse the line from the loop. The first line is the main
# interpreter and the 3 afterward are subinterpreters.
interp = Interp(*match.groups())
if support.verbose:
if support.verbose > 1:
print(interp)
self.assertTrue(interp.interp)
self.assertTrue(interp.tstate)
......@@ -190,12 +190,33 @@ class EmbeddingTests(unittest.TestCase):
def test_pre_initialization_api(self):
"""
Checks the few parts of the C-API that work before the runtine
Checks some key parts of the C-API that need to work before the runtine
is initialized (via Py_Initialize()).
"""
env = dict(os.environ, PYTHONPATH=os.pathsep.join(sys.path))
out, err = self.run_embedded_interpreter("pre_initialization_api", env=env)
self.assertEqual(out, '')
if sys.platform == "win32":
expected_path = self.test_exe
else:
expected_path = os.path.join(os.getcwd(), "spam")
expected_output = f"sys.executable: {expected_path}\n"
self.assertIn(expected_output, out)
self.assertEqual(err, '')
def test_pre_initialization_sys_options(self):
"""
Checks that sys.warnoptions and sys._xoptions can be set before the
runtime is initialized (otherwise they won't be effective).
"""
env = dict(PYTHONPATH=os.pathsep.join(sys.path))
out, err = self.run_embedded_interpreter(
"pre_initialization_sys_options", env=env)
expected_output = (
"sys.warnoptions: ['once', 'module', 'default']\n"
"sys._xoptions: {'not_an_option': '1', 'also_not_an_option': '2'}\n"
"warnings.filters[:3]: ['default', 'module', 'once']\n"
)
self.assertIn(expected_output, out)
self.assertEqual(err, '')
def test_bpo20891(self):
......
Embedding applications may once again call PySys_ResetWarnOptions,
PySys_AddWarnOption, and PySys_AddXOption prior to calling Py_Initialize.
\ No newline at end of file
......@@ -2,6 +2,7 @@
#include "pythread.h"
#include <inttypes.h>
#include <stdio.h>
#include <wchar.h>
/*********************************************************
* Embedded interpreter tests that need a custom exe
......@@ -130,23 +131,89 @@ static int test_forced_io_encoding(void)
* Test parts of the C-API that work before initialization
*********************************************************/
/* The pre-initialization tests tend to break by segfaulting, so explicitly
* flushed progress messages make the broken API easier to find when they fail.
*/
#define _Py_EMBED_PREINIT_CHECK(msg) \
do {printf(msg); fflush(stdout);} while (0);
static int test_pre_initialization_api(void)
{
/* Leading "./" ensures getpath.c can still find the standard library */
_Py_EMBED_PREINIT_CHECK("Checking Py_DecodeLocale\n");
wchar_t *program = Py_DecodeLocale("./spam", NULL);
if (program == NULL) {
fprintf(stderr, "Fatal error: cannot decode program name\n");
return 1;
}
_Py_EMBED_PREINIT_CHECK("Checking Py_SetProgramName\n");
Py_SetProgramName(program);
_Py_EMBED_PREINIT_CHECK("Initializing interpreter\n");
Py_Initialize();
_Py_EMBED_PREINIT_CHECK("Check sys module contents\n");
PyRun_SimpleString("import sys; "
"print('sys.executable:', sys.executable)");
_Py_EMBED_PREINIT_CHECK("Finalizing interpreter\n");
Py_Finalize();
_Py_EMBED_PREINIT_CHECK("Freeing memory allocated by Py_DecodeLocale\n");
PyMem_RawFree(program);
return 0;
}
/* bpo-33042: Ensure embedding apps can predefine sys module options */
static int test_pre_initialization_sys_options(void)
{
/* We allocate a couple of the option dynamically, and then delete
* them before calling Py_Initialize. This ensures the interpreter isn't
* relying on the caller to keep the passed in strings alive.
*/
wchar_t *static_warnoption = L"once";
wchar_t *static_xoption = L"also_not_an_option=2";
size_t warnoption_len = wcslen(static_warnoption);
size_t xoption_len = wcslen(static_xoption);
wchar_t *dynamic_once_warnoption = calloc(warnoption_len+1, sizeof(wchar_t));
wchar_t *dynamic_xoption = calloc(xoption_len+1, sizeof(wchar_t));
wcsncpy(dynamic_once_warnoption, static_warnoption, warnoption_len+1);
wcsncpy(dynamic_xoption, static_xoption, xoption_len+1);
_Py_EMBED_PREINIT_CHECK("Checking PySys_AddWarnOption\n");
PySys_AddWarnOption(L"default");
_Py_EMBED_PREINIT_CHECK("Checking PySys_ResetWarnOptions\n");
PySys_ResetWarnOptions();
_Py_EMBED_PREINIT_CHECK("Checking PySys_AddWarnOption linked list\n");
PySys_AddWarnOption(dynamic_once_warnoption);
PySys_AddWarnOption(L"module");
PySys_AddWarnOption(L"default");
_Py_EMBED_PREINIT_CHECK("Checking PySys_AddXOption\n");
PySys_AddXOption(L"not_an_option=1");
PySys_AddXOption(dynamic_xoption);
/* Delete the dynamic options early */
free(dynamic_once_warnoption);
dynamic_once_warnoption = NULL;
free(dynamic_xoption);
dynamic_xoption = NULL;
_Py_EMBED_PREINIT_CHECK("Initializing interpreter\n");
_testembed_Py_Initialize();
_Py_EMBED_PREINIT_CHECK("Check sys module contents\n");
PyRun_SimpleString("import sys; "
"print('sys.warnoptions:', sys.warnoptions); "
"print('sys._xoptions:', sys._xoptions); "
"warnings = sys.modules['warnings']; "
"latest_filters = [f[0] for f in warnings.filters[:3]]; "
"print('warnings.filters[:3]:', latest_filters)");
_Py_EMBED_PREINIT_CHECK("Finalizing interpreter\n");
Py_Finalize();
return 0;
}
/* bpo-20891: Avoid race condition when initialising the GIL */
static void bpo20891_thread(void *lockp)
{
PyThread_type_lock lock = *((PyThread_type_lock*)lockp);
......@@ -217,6 +284,7 @@ static struct TestCase TestCases[] = {
{ "forced_io_encoding", test_forced_io_encoding },
{ "repeated_init_and_subinterpreters", test_repeated_init_and_subinterpreters },
{ "pre_initialization_api", test_pre_initialization_api },
{ "pre_initialization_sys_options", test_pre_initialization_sys_options },
{ "bpo20891", test_bpo20891 },
{ NULL, NULL }
};
......@@ -232,13 +300,13 @@ int main(int argc, char *argv[])
/* No match found, or no test name provided, so display usage */
printf("Python " PY_VERSION " _testembed executable for embedded interpreter tests\n"
"Normally executed via 'EmbeddingTests' in Lib/test/test_capi.py\n\n"
"Normally executed via 'EmbeddingTests' in Lib/test/test_embed.py\n\n"
"Usage: %s TESTNAME\n\nAll available tests:\n", argv[0]);
for (struct TestCase *tc = TestCases; tc && tc->name; tc++) {
printf(" %s\n", tc->name);
}
/* Non-zero exit code will cause test_capi.py tests to fail.
/* Non-zero exit code will cause test_embed.py tests to fail.
This is intentional. */
return -1;
}
......@@ -1609,11 +1609,141 @@ list_builtin_module_names(void)
return list;
}
/* Pre-initialization support for sys.warnoptions and sys._xoptions
*
* Modern internal code paths:
* These APIs get called after _Py_InitializeCore and get to use the
* regular CPython list, dict, and unicode APIs.
*
* Legacy embedding code paths:
* The multi-phase initialization API isn't public yet, so embedding
* apps still need to be able configure sys.warnoptions and sys._xoptions
* before they call Py_Initialize. To support this, we stash copies of
* the supplied wchar * sequences in linked lists, and then migrate the
* contents of those lists to the sys module in _PyInitializeCore.
*
*/
struct _preinit_entry {
wchar_t *value;
struct _preinit_entry *next;
};
typedef struct _preinit_entry *_Py_PreInitEntry;
static _Py_PreInitEntry _preinit_warnoptions = NULL;
static _Py_PreInitEntry _preinit_xoptions = NULL;
static _Py_PreInitEntry
_alloc_preinit_entry(const wchar_t *value)
{
/* To get this to work, we have to initialize the runtime implicitly */
_PyRuntime_Initialize();
/* Force default allocator, so we can ensure that it also gets used to
* destroy the linked list in _clear_preinit_entries.
*/
PyMemAllocatorEx old_alloc;
_PyMem_SetDefaultAllocator(PYMEM_DOMAIN_RAW, &old_alloc);
_Py_PreInitEntry node = PyMem_RawCalloc(1, sizeof(*node));
if (node != NULL) {
node->value = _PyMem_RawWcsdup(value);
if (node->value == NULL) {
PyMem_RawFree(node);
node = NULL;
};
};
PyMem_SetAllocator(PYMEM_DOMAIN_RAW, &old_alloc);
return node;
};
static int
_append_preinit_entry(_Py_PreInitEntry *optionlist, const wchar_t *value)
{
_Py_PreInitEntry new_entry = _alloc_preinit_entry(value);
if (new_entry == NULL) {
return -1;
}
/* We maintain the linked list in this order so it's easy to play back
* the add commands in the same order later on in _Py_InitializeCore
*/
_Py_PreInitEntry last_entry = *optionlist;
if (last_entry == NULL) {
*optionlist = new_entry;
} else {
while (last_entry->next != NULL) {
last_entry = last_entry->next;
}
last_entry->next = new_entry;
}
return 0;
};
static void
_clear_preinit_entries(_Py_PreInitEntry *optionlist)
{
_Py_PreInitEntry current = *optionlist;
*optionlist = NULL;
/* Deallocate the nodes and their contents using the default allocator */
PyMemAllocatorEx old_alloc;
_PyMem_SetDefaultAllocator(PYMEM_DOMAIN_RAW, &old_alloc);
while (current != NULL) {
_Py_PreInitEntry next = current->next;
PyMem_RawFree(current->value);
PyMem_RawFree(current);
current = next;
}
PyMem_SetAllocator(PYMEM_DOMAIN_RAW, &old_alloc);
};
static void
_clear_all_preinit_options(void)
{
_clear_preinit_entries(&_preinit_warnoptions);
_clear_preinit_entries(&_preinit_xoptions);
}
static int
_PySys_ReadPreInitOptions(void)
{
/* Rerun the add commands with the actual sys module available */
PyThreadState *tstate = PyThreadState_GET();
if (tstate == NULL) {
/* Still don't have a thread state, so something is wrong! */
return -1;
}
_Py_PreInitEntry entry = _preinit_warnoptions;
while (entry != NULL) {
PySys_AddWarnOption(entry->value);
entry = entry->next;
}
entry = _preinit_xoptions;
while (entry != NULL) {
PySys_AddXOption(entry->value);
entry = entry->next;
}
_clear_all_preinit_options();
return 0;
};
static PyObject *
get_warnoptions(void)
{
PyObject *warnoptions = _PySys_GetObjectId(&PyId_warnoptions);
if (warnoptions == NULL || !PyList_Check(warnoptions)) {
/* PEP432 TODO: we can reach this if warnoptions is NULL in the main
* interpreter config. When that happens, we need to properly set
* the `warnoptions` reference in the main interpreter config as well.
*
* For Python 3.7, we shouldn't be able to get here due to the
* combination of how _PyMainInterpreter_ReadConfig and _PySys_EndInit
* work, but we expect 3.8+ to make the _PyMainInterpreter_ReadConfig
* call optional for embedding applications, thus making this
* reachable again.
*/
Py_XDECREF(warnoptions);
warnoptions = PyList_New(0);
if (warnoptions == NULL)
......@@ -1630,6 +1760,12 @@ get_warnoptions(void)
void
PySys_ResetWarnOptions(void)
{
PyThreadState *tstate = PyThreadState_GET();
if (tstate == NULL) {
_clear_preinit_entries(&_preinit_warnoptions);
return;
}
PyObject *warnoptions = _PySys_GetObjectId(&PyId_warnoptions);
if (warnoptions == NULL || !PyList_Check(warnoptions))
return;
......@@ -1658,6 +1794,11 @@ PySys_AddWarnOptionUnicode(PyObject *option)
void
PySys_AddWarnOption(const wchar_t *s)
{
PyThreadState *tstate = PyThreadState_GET();
if (tstate == NULL) {
_append_preinit_entry(&_preinit_warnoptions, s);
return;
}
PyObject *unicode;
unicode = PyUnicode_FromWideChar(s, -1);
if (unicode == NULL)
......@@ -1678,6 +1819,16 @@ get_xoptions(void)
{
PyObject *xoptions = _PySys_GetObjectId(&PyId__xoptions);
if (xoptions == NULL || !PyDict_Check(xoptions)) {
/* PEP432 TODO: we can reach this if xoptions is NULL in the main
* interpreter config. When that happens, we need to properly set
* the `xoptions` reference in the main interpreter config as well.
*
* For Python 3.7, we shouldn't be able to get here due to the
* combination of how _PyMainInterpreter_ReadConfig and _PySys_EndInit
* work, but we expect 3.8+ to make the _PyMainInterpreter_ReadConfig
* call optional for embedding applications, thus making this
* reachable again.
*/
Py_XDECREF(xoptions);
xoptions = PyDict_New();
if (xoptions == NULL)
......@@ -1730,6 +1881,11 @@ error:
void
PySys_AddXOption(const wchar_t *s)
{
PyThreadState *tstate = PyThreadState_GET();
if (tstate == NULL) {
_append_preinit_entry(&_preinit_xoptions, s);
return;
}
if (_PySys_AddXOptionWithError(s) < 0) {
/* No return value, therefore clear error state if possible */
if (_PyThreadState_UncheckedGet()) {
......@@ -2257,6 +2413,7 @@ _PySys_BeginInit(PyObject **sysmod)
}
*sysmod = m;
return _Py_INIT_OK();
type_init_failed:
......@@ -2333,6 +2490,11 @@ _PySys_EndInit(PyObject *sysdict, _PyMainInterpreterConfig *config)
if (get_xoptions() == NULL)
return -1;
/* Transfer any sys.warnoptions and sys._xoptions set directly
* by an embedding application from the linked list to the module. */
if (_PySys_ReadPreInitOptions() != 0)
return -1;
if (PyErr_Occurred())
return -1;
return 0;
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment