Commit 0c942e8f authored by Austin Clements

runtime: avoid incorrect panic when a signal arrives during STW

Stop-the-world and freeze-the-world (used for unhandled panics) are
currently not safe to do at the same time. While a regular unhandled
panic can't happen concurrently with STW (if the P hasn't been
stopped, then the panic blocks the STW), a panic from a _SigThrow
signal can happen on an already-stopped P, racing with STW. When this
happens, freezetheworld sets sched.stopwait to 0x7fffffff and
stopTheWorldWithSema panics because sched.stopwait != 0.

Fix this by detecting when freeze-the-world happens before
stop-the-world has completely stopped the world, and freezing the STW
operation rather than panicking.
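
As a rough illustration of the check ordering described above, here is a
minimal user-level sketch. It is not the runtime code: the world type,
the oldCheck/newCheck methods, and the "ok"/"throw"/"halt" strings are
hypothetical names invented for this example, and the real runtime parks
the thread instead of returning a value.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// Same sentinel value the commit message mentions for sched.stopwait.
const freezeStopWait = 0x7fffffff

// world is a hypothetical stand-in for the scheduler state involved.
type world struct {
	stopwait int32  // number of Ps still to stop; 0 means all stopped
	freezing uint32 // non-zero once a freeze has been requested
}

// freeze models freezetheworld: publish the flag, then clobber stopwait.
func (w *world) freeze() {
	atomic.StoreUint32(&w.freezing, 1)
	atomic.StoreInt32(&w.stopwait, freezeStopWait)
}

// oldCheck models the pre-fix sanity check: any non-zero stopwait is fatal.
func (w *world) oldCheck() string {
	if atomic.LoadInt32(&w.stopwait) != 0 {
		return "throw"
	}
	return "ok"
}

// newCheck models the post-fix ordering: record the failure, but let a
// pending freeze take priority over reporting it.
func (w *world) newCheck() string {
	bad := atomic.LoadInt32(&w.stopwait) != 0
	if atomic.LoadUint32(&w.freezing) != 0 {
		return "halt" // another thread is panicking; stop here for good
	}
	if bad {
		return "throw"
	}
	return "ok"
}

func main() {
	w := &world{}
	fmt.Println(w.oldCheck(), w.newCheck()) // prints "ok ok"
	w.freeze()
	fmt.Println(w.oldCheck(), w.newCheck()) // prints "throw halt"
}
```

In this model a non-zero stopwait with no freeze in progress still
reports "throw", so genuine inconsistencies are not hidden.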

Fixes #17442.

Change-Id: I646a7341221dd6d33ea21d818c2f7218e2cb7e20
Reviewed-on: https://go-review.googlesource.com/34611
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>
parent 860c9c0b
@@ -632,10 +632,15 @@ func helpgc(nproc int32) {
 // sched.stopwait to in order to request that all Gs permanently stop.
 const freezeStopWait = 0x7fffffff
 
+// freezing is set to non-zero if the runtime is trying to freeze the
+// world.
+var freezing uint32
+
 // Similar to stopTheWorld but best-effort and can be called several times.
 // There is no reverse operation, used during crashing.
 // This function must not lock any mutexes.
 func freezetheworld() {
+	atomic.Store(&freezing, 1)
 	// stopwait and preemption requests can be lost
 	// due to races with concurrently executing threads,
 	// so try several times
@@ -1018,14 +1023,29 @@ func stopTheWorldWithSema() {
 			preemptall()
 		}
 	}
+
+	// sanity checks
+	bad := ""
 	if sched.stopwait != 0 {
-		throw("stopTheWorld: not stopped")
-	}
-	for i := 0; i < int(gomaxprocs); i++ {
-		p := allp[i]
-		if p.status != _Pgcstop {
-			throw("stopTheWorld: not stopped")
+		bad = "stopTheWorld: not stopped (stopwait != 0)"
+	} else {
+		for i := 0; i < int(gomaxprocs); i++ {
+			p := allp[i]
+			if p.status != _Pgcstop {
+				bad = "stopTheWorld: not stopped (status != _Pgcstop)"
+			}
 		}
 	}
+	if atomic.Load(&freezing) != 0 {
+		// Some other thread is panicking. This can cause the
+		// sanity checks above to fail if the panic happens in
+		// the signal handler on a stopped thread. Either way,
+		// we should halt this thread.
+		lock(&deadlock)
+		lock(&deadlock)
+	}
+	if bad != "" {
+		throw(bad)
+	}
 }
 