Commit b6d1e4fe authored by David Herrmann's avatar David Herrmann Committed by Juerg Haefliger

fork: record start_time late

This changes the fork(2) syscall to record the process start_time after
initializing the basic task structure but still before making the new
process visible to user-space.

Technically, we could record the start_time anytime during fork(2).  But
this might lead to scenarios where a start_time is recorded long before
a process becomes visible to user-space.  For instance, with
userfaultfd(2) and TLS, user-space can delay the execution of fork(2)
for an indefinite amount of time (and will, if this causes network
access, or similar).

By recording the start_time late, it much closer reflects the point in
time where the process becomes live and can be observed by other
processes.

Lastly, this makes it much harder for user-space to predict and control
the start_time they get assigned.  Previously, user-space could fork a
process and stall it in copy_thread_tls() before its pid is allocated,
but after its start_time is recorded.  This can be misused to later-on
cycle through PIDs and resume the stalled fork(2) yielding a process
that has the same pid and start_time as a process that existed before.
This can be used to circumvent security systems that identify processes
by their pid+start_time combination.

Even though user-space was always aware that start_time recording is
flaky (but several projects are known to still rely on start_time-based
identification), changing the start_time to be recorded late will help
mitigate existing attacks and make it much harder for user-space to
control the start_time a process gets assigned.
Reported-by: default avatarJann Horn <jannh@google.com>
Signed-off-by: default avatarTom Gundersen <teg@jklm.no>
Signed-off-by: default avatarDavid Herrmann <dh.herrmann@gmail.com>
Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>

CVE-2019-6133

(cherry picked from commit 7b558513)
Signed-off-by: default avatarTyler Hicks <tyhicks@canonical.com>
Acked-by: default avatarKhalid Elmously <khalid.elmously@canonical.com>
Acked-by: default avatarThadeu Lima de Souza Cascardo <cascardo@canonical.com>
Signed-off-by: default avatarKhalid Elmously <khalid.elmously@canonical.com>
parent cd399450
......@@ -1420,8 +1420,6 @@ static struct task_struct *copy_process(unsigned long clone_flags,
posix_cpu_timers_init(p);
p->start_time = ktime_get_ns();
p->real_start_time = ktime_get_boot_ns();
p->io_context = NULL;
p->audit_context = NULL;
cgroup_fork(p);
......@@ -1581,6 +1579,17 @@ static struct task_struct *copy_process(unsigned long clone_flags,
if (retval)
goto bad_fork_free_pid;
/*
* From this point on we must avoid any synchronous user-space
* communication until we take the tasklist-lock. In particular, we do
* not want user-space to be able to predict the process start-time by
* stalling fork(2) after we recorded the start_time but before it is
* visible to the system.
*/
p->start_time = ktime_get_ns();
p->real_start_time = ktime_get_boot_ns();
/*
* Make it visible to the rest of the system, but dont wake it up yet.
* Need tasklist lock for parent etc handling!
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment