Commits · c99d7f7f852b525694c645e00b6c06729a6735a2 · Kirill Smelkov / go

05 Nov, 2015 18 commits

runtime: decentralize mark done and mark termination · c99d7f7f

Austin Clements authored Oct 26, 2015

This moves all of the mark 1 to mark 2 transition and mark termination
to the mark done transition function. This means these transitions are
now handled on the goroutine that detected mark completion. This also
means that the GC coordinator and the background completion barriers
are no longer used and various workarounds to yield to the coordinator
are no longer necessary. These will be removed in follow-up commits.

One consequence of this is that mark workers now need to be
preemptible when performing the mark done transition. This allows them
to stop the world and to perform the final clean-up steps of GC after
restarting the world. They are only made preemptible while performing
this transition, so if the worker findRunnableGCWorker would schedule
isn't available, we didn't want to schedule it anyway.

Fixes #11970.

Change-Id: I9203a2d6287eeff62d589ec02ad9cb1e29ddb837
Reviewed-on: https://go-review.googlesource.com/16391Reviewed-by: Rick Hudson <rlh@golang.org>
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>

c99d7f7f

runtime: account mark worker time before gcMarkDone · d986bf27

Austin Clements authored Oct 26, 2015

Currently gcMarkDone takes basically no time, so it's okay to account
the worker time after calling it. However, gcMarkDone is about to take
potentially *much* longer because it may perform all of mark
termination. Prepare for this by swapping the order so we account the
time before calling gcMarkDone.

Change-Id: I90c7df68192acfc4fd02a7254dae739dda4e2fcb
Reviewed-on: https://go-review.googlesource.com/16390Reviewed-by: Rick Hudson <rlh@golang.org>
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>

d986bf27

runtime: factor mark done transition · 171204b5

Austin Clements authored Oct 24, 2015

Currently the code for completion of mark 1/mark 2 is duplicated in
background workers and assists. Factor this in to a single function
that will serve as the transition function for concurrent mark.

Change-Id: I4d9f697a15da0d349db3b34d56f3a220dd41d41b
Reviewed-on: https://go-review.googlesource.com/16359Reviewed-by: Rick Hudson <rlh@golang.org>
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>

171204b5

runtime: eliminate mark completion in scheduler · 12e23f05

Austin Clements authored Oct 23, 2015

Currently, findRunnableGCWorker will perform mark completion if there
is no remaining work and no running workers. This used to be necessary
to resolve a race in the transition from mark 1 to mark 2 where we
would enter mark 2 with no mark work (and no dedicated workers), so no
workers would run, so no worker would signal mark completion.

However, we're about to make mark completion also perform the entire
follow-on process, which includes mark termination. We really don't
want to do that in the scheduler if it happens to detect completion.

Conveniently, this hack is no longer necessary because we always
enqueue root scanning work at the beginning of both mark 1 and mark 2,
so a mark worker will always run. Hence, we can simply eliminate it.

Change-Id: I3fc8f27c8da632f0fb732c9f6425e1f457f5652e
Reviewed-on: https://go-review.googlesource.com/16358Reviewed-by: Rick Hudson <rlh@golang.org>
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>

12e23f05

runtime: don't start idle mark workers when barriers are cleared · 20f276e2

Austin Clements authored Nov 03, 2015

Currently, we don't start dedicated or fractional mark workers unless
the mark 1 or mark 2 barriers have been cleared. One intended
consequence of this is that no background workers run between the
forEachP that disposes all gcWork caches and the beginning of mark 2.

However, we (unintentionally) did not apply this restriction to idle
mark workers. As a result, these can start in the interim between mark
1 completion and mark 2 starting. This explains why it was necessary
to reset the root marking jobs using carefully ordered atomic writes
when setting up mark 2. It also means that, even though we definitely
enqueue work before starting mark 2, it may be drained by the time we
reset the mark 2 barrier. If this happens, currently the only thing
preventing the runtime from deadlocking is that the scheduler itself
also checks for mark completion and will signal mark 2 completion.
Were it not for the odd behavior of idle workers, this check in the
scheduler would not be necessary.

Clean all of this up and prepare to remove this check in the scheduler
by applying the same restriction to starting idle mark workers.

Change-Id: Ic1b479e1591bd7773dc27b320ca399a215603b5a
Reviewed-on: https://go-review.googlesource.com/16631Reviewed-by: Rick Hudson <rlh@golang.org>
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>

20f276e2

runtime: decentralize sweep termination and mark transition · a51905fa

Austin Clements authored Oct 23, 2015

This moves all of GC initialization, sweep termination, and the
transition to concurrent marking in to the off->mark transition
function. This means it's now handled on the goroutine that detected
the state exit condition.

As a result, malloc no longer needs to Gosched() at the beginning of
the GC cycle to prevent over-allocation while the GC is starting up
because it will now *help* the GC to start up. The Gosched hack is
still necessary during GC shutdown (this is easy to test by enabling
gctrace and hitting Ctrl-S to block the gctrace output).

At this point, the GC coordinator still handles later phases. This
requires a small tweak to how we start the GC coordinator. Currently,
starting the GC coordinator is best-effort and may fail if the
coordinator is about to park from the previous cycle but hasn't yet.
We fix this by replacing the park/ready to wake up the coordinator
with a semaphore. This is temporary since the coordinator will be
going away in a few commits.

Updates #11970.

Change-Id: I2c6a11c91e72dfbc59c2d8e7c66146dee9a444fe
Reviewed-on: https://go-review.googlesource.com/16357Reviewed-by: Rick Hudson <rlh@golang.org>
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>

a51905fa

runtime: decentralize concurrent sweep termination · 9630c47e

Austin Clements authored Oct 23, 2015

This moves concurrent sweep termination from the coordinator to the
off->mark transition. This allows it to be performed by all Gs
attempting to start the GC.

Updates #11970.

Change-Id: I24428e8599a759398c2ef7ec996ba755a448f947
Reviewed-on: https://go-review.googlesource.com/16356Reviewed-by: Rick Hudson <rlh@golang.org>
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>

9630c47e

runtime: beginning of decentralized off->mark transition · f54bcedc

Austin Clements authored Oct 23, 2015

This begins the conversion of the centralized GC coordinator to a
decentralized state machine by introducing the internal API that
triggers the first state transition from _GCoff to _GCmark (or
_GCmarktermination).

This change introduces the transition lock, the off->mark transition
condition (which is very similar to shouldtriggergc()), and the
general structure of a state transition. Since we're doing this
conversion in stages, it then falls back to the GC coordinator to
actually execute the cycle. We'll start moving logic out of the GC
coordinator and in to transition functions next.

This fixes a minor bug in gcstoptheworld debug mode where passing the
heap trigger once could trigger multiple STW GCs.

Updates #11970.

Change-Id: I964087dd190a639eb5766398f8e1bbf8b352902f
Reviewed-on: https://go-review.googlesource.com/16355Reviewed-by: Rick Hudson <rlh@golang.org>
Run-TryBot: Austin Clements <austin@google.com>

f54bcedc

runtime: move concurrent mark setup off system stack · 38425962

Austin Clements authored Oct 24, 2015

For historical reasons we currently do a lot of the concurrent mark
setup on the system stack. In fact, at this point the one and only
thing that needs to happen on the system stack is the start-the-world.

Clean up this code by lifting everything other than the
start-the-world off the system stack.

The diff for this change looks large, but the only code change is to
narrow the systemstack call. Everything else is re-indentation.

Change-Id: I1e03b8afc759fad726f2397b05a17d183c2713ce
Reviewed-on: https://go-review.googlesource.com/16354Reviewed-by: Rick Hudson <rlh@golang.org>
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>

38425962

runtime: lift state variables from func gc to var work · 19596215

Austin Clements authored Oct 23, 2015

We're about to split func gc across several functions, so lift the
local variables it uses for tracking statistics and state across the
cycle into the global "work" variable.

Change-Id: Ie955f2f1758c7f5a5543ea1f3f33b222bc4b1d37
Reviewed-on: https://go-review.googlesource.com/16353Reviewed-by: Rick Hudson <rlh@golang.org>
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>

19596215

runtime: note a minor issue with GODEUG=gcstoptheworld · 16980189

Austin Clements authored Oct 23, 2015

Change-Id: I91cda8d88b0852cd0f868d33c594206bcca0c386
Reviewed-on: https://go-review.googlesource.com/16352Reviewed-by: Rick Hudson <rlh@golang.org>

16980189

cmd/go: change ar argument to rc · 321cf6f8

Ian Lance Taylor authored Nov 04, 2015

Put 'r' first because that is the command, and 'c' is the modifier.
Keep 'c' because it means to not warn when creating an archive.
Drop 'u' because it is unnecessary and fails on Arch Linux.

No test because this is only for gccgo (I tested it manually).

Fixes #12310.

Change-Id: Id740257fb1c347dfaa60f7d613af2897dae2c059
Reviewed-on: https://go-review.googlesource.com/16664
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Crawshaw <crawshaw@golang.org>

321cf6f8

runtime: optimize string comparison on amd64 · 967564be

Ilya Tocar authored Oct 29, 2015

Use AVX2 if possible.
Results below (haswell):

name old time/op new time/op delta
CompareStringEqual-6 8.77ns ± 0% 8.63ns ± 1% -1.58% (p=0.000 n=20+19)
CompareStringIdentical-6 5.02ns ± 0% 5.02ns ± 0% ~ (all samples are equal)
CompareStringSameLength-6 7.51ns ± 0% 7.51ns ± 0% ~ (all samples are equal)
CompareStringDifferentLength-6 1.56ns ± 0% 1.56ns ± 0% ~ (all samples are equal)
CompareStringBigUnaligned-6 124µs ± 1% 105µs ± 5% -14.99% (p=0.000 n=20+18)
CompareStringBig-6 112µs ± 1% 103µs ± 0% -7.87% (p=0.000 n=20+17)

name old speed new speed delta
CompareStringBigUnaligned-6 8.48GB/s ± 1% 9.98GB/s ± 5% +17.67% (p=0.000 n=20+18)
CompareStringBig-6 9.37GB/s ± 1% 10.17GB/s ± 0% +8.54% (p=0.000 n=20+17)

Change-Id: I1c949626dd2aaf9f633e3c888a9df71c82eed7e1
Reviewed-on: https://go-review.googlesource.com/16481Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Klaus Post <klauspost@gmail.com>

967564be

runtime: simplify buffered channels. · 8e496f1d

Keith Randall authored Mar 20, 2015

This change removes the retry mechanism we use for buffered channels.
Instead, any sender waking up a receiver or vice versa completes the
full protocol with its counterpart. This means the counterpart does
not need to relock the channel when it wakes up. (Currently
buffered channels need to relock on wakeup.)

For sends on a channel with waiting receivers, this change replaces
two copies (sender->queue, queue->receiver) with one (sender->receiver).
For receives on channels with a waiting sender, two copies are still required.

This change unifies to a large degree the algorithm for buffered
and unbuffered channels, simplifying the overall implementation.

Fixes #11506

benchmark old ns/op new ns/op delta
BenchmarkChanProdCons10 125 110 -12.00%
BenchmarkChanProdCons0 303 284 -6.27%
BenchmarkChanProdCons100 75.5 71.3 -5.56%
BenchmarkChanContended 6452 6125 -5.07%
BenchmarkChanNonblocking 11.5 11.0 -4.35%
BenchmarkChanCreation 149 143 -4.03%
BenchmarkChanSem 63.6 61.6 -3.14%
BenchmarkChanUncontended 6390 6212 -2.79%
BenchmarkChanSync 282 276 -2.13%
BenchmarkChanProdConsWork10 516 506 -1.94%
BenchmarkChanProdConsWork0 696 685 -1.58%
BenchmarkChanProdConsWork100 470 469 -0.21%
BenchmarkChanPopular 660427 660012 -0.06%

Change-Id: I164113a56432fbc7cace0786e49c5a6e6a708ea4
Reviewed-on: https://go-review.googlesource.com/9345
Run-TryBot: Keith Randall <khr@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>

8e496f1d

cmd/dist: remove vestigial -s flag · 7bb2a7d6

Matthew Dempsky authored Nov 04, 2015

Fixes #12002.

Change-Id: I7262f4520560ac158fc2ee3ce1d2f7a488d40354
Reviewed-on: https://go-review.googlesource.com/16666
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Minux Ma <minux@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Dave Cheney <dave@cheney.net>

7bb2a7d6

crypto/x509: add /etc/ssl/certs to certificate directories · 61ca7e5c

Shenghou Ma authored Nov 04, 2015

Fixes #12139.

Change-Id: Ied760ac37e2fc21ef951ae872136dc3bfd49bf9f
Reviewed-on: https://go-review.googlesource.com/16671Reviewed-by: Ian Lance Taylor <iant@golang.org>

61ca7e5c

cmd/go: put all generate variables in the environment · 94968155

Ian Lance Taylor authored Nov 02, 2015

Fixes #13124.

Change-Id: I8a824156c84016504d29dc2dd2d522149b189be8
Reviewed-on: https://go-review.googlesource.com/16537Reviewed-by: Russ Cox <rsc@golang.org>

94968155

internal/syscall/unix: eliminate non-trivial randomTrap initializer · b16c6991

Matthew Dempsky authored Nov 04, 2015

While here, enable getrandom on arm64 too (using the value found in
include/uapi/asm-generic/unistd.h, which seems to match up with other
GOARCH=arm64 syscall numbers).

Updates #10848.

Change-Id: I5ab36ccf6ee8d5cc6f0e1a61d09f0da7410288b9
Reviewed-on: https://go-review.googlesource.com/16662
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>

b16c6991

04 Nov, 2015 11 commits

runtime: make putfull start mark workers · dcd9e5bc

Austin Clements authored Oct 26, 2015

Currently we depend on the good graces and timing of the scheduler to
get opportunities to start dedicated mark workers. In the worst case,
it may take 10ms to get dedicated mark workers going at the beginning
of mark 1 and mark 2 or after the amount of available work has dropped
and gone back up.

Instead of waiting for the regular preemption logic to get around to
us, make putfull enlist a random P if we're not already running enough
dedicated workers. This should improve performance stability of the
garbage collector and is likely to improve the overall performance
somewhat.

No overall effect on the go1 benchmarks. It speeds up the garbage
benchmark by 12%, which more than counters the performance loss from
the previous commit.

name old time/op new time/op delta
XBenchGarbage-12 6.32ms ± 4% 5.58ms ± 2% -11.68% (p=0.000 n=20+16)

name old time/op new time/op delta
BinaryTree17-12 3.18s ± 5% 3.12s ± 4% -1.83% (p=0.021 n=20+20)
Fannkuch11-12 2.50s ± 2% 2.46s ± 2% -1.57% (p=0.000 n=18+19)
FmtFprintfEmpty-12 50.8ns ± 3% 50.4ns ± 3% ~ (p=0.184 n=20+20)
FmtFprintfString-12 167ns ± 2% 171ns ± 1% +2.46% (p=0.000 n=20+19)
FmtFprintfInt-12 161ns ± 2% 163ns ± 2% +1.81% (p=0.000 n=20+20)
FmtFprintfIntInt-12 269ns ± 1% 266ns ± 1% -0.81% (p=0.002 n=19+20)
FmtFprintfPrefixedInt-12 237ns ± 2% 231ns ± 2% -2.86% (p=0.000 n=20+20)
FmtFprintfFloat-12 313ns ± 2% 313ns ± 1% ~ (p=0.681 n=20+20)
FmtManyArgs-12 1.05µs ± 2% 1.03µs ± 1% -2.26% (p=0.000 n=20+20)
GobDecode-12 8.66ms ± 1% 8.67ms ± 1% ~ (p=0.380 n=19+20)
GobEncode-12 6.56ms ± 1% 6.56ms ± 2% ~ (p=0.607 n=19+20)
Gzip-12 317ms ± 1% 314ms ± 2% -1.10% (p=0.000 n=20+19)
Gunzip-12 42.1ms ± 1% 42.2ms ± 1% +0.27% (p=0.044 n=20+19)
HTTPClientServer-12 62.7µs ± 1% 62.0µs ± 1% -1.04% (p=0.000 n=19+18)
JSONEncode-12 16.7ms ± 1% 16.8ms ± 2% +0.59% (p=0.021 n=20+20)
JSONDecode-12 58.2ms ± 1% 61.4ms ± 2% +5.43% (p=0.000 n=18+19)
Mandelbrot200-12 3.84ms ± 1% 3.87ms ± 2% +0.79% (p=0.008 n=18+20)
GoParse-12 3.86ms ± 2% 3.76ms ± 2% -2.60% (p=0.000 n=20+20)
RegexpMatchEasy0_32-12 100ns ± 2% 100ns ± 1% -0.68% (p=0.005 n=18+15)
RegexpMatchEasy0_1K-12 332ns ± 1% 342ns ± 1% +3.16% (p=0.000 n=19+19)
RegexpMatchEasy1_32-12 82.9ns ± 3% 83.0ns ± 2% ~ (p=0.906 n=19+20)
RegexpMatchEasy1_1K-12 487ns ± 1% 494ns ± 1% +1.50% (p=0.000 n=17+20)
RegexpMatchMedium_32-12 131ns ± 2% 130ns ± 1% ~ (p=0.686 n=19+20)
RegexpMatchMedium_1K-12 39.6µs ± 1% 39.2µs ± 1% -1.09% (p=0.000 n=18+19)
RegexpMatchHard_32-12 2.04µs ± 1% 2.04µs ± 2% ~ (p=0.804 n=20+20)
RegexpMatchHard_1K-12 61.7µs ± 2% 61.3µs ± 2% ~ (p=0.052 n=18+20)
Revcomp-12 529ms ± 2% 533ms ± 1% +0.83% (p=0.003 n=20+19)
Template-12 70.7ms ± 2% 71.0ms ± 2% ~ (p=0.065 n=20+19)
TimeParse-12 351ns ± 2% 355ns ± 1% +1.25% (p=0.000 n=19+20)
TimeFormat-12 362ns ± 2% 373ns ± 1% +2.83% (p=0.000 n=18+20)
[Geo mean] 62.2µs 62.3µs +0.13%

name old speed new speed delta
GobDecode-12 88.6MB/s ± 1% 88.5MB/s ± 1% ~ (p=0.392 n=19+20)
GobEncode-12 117MB/s ± 1% 117MB/s ± 1% ~ (p=0.622 n=19+20)
Gzip-12 61.1MB/s ± 1% 61.8MB/s ± 2% +1.11% (p=0.000 n=20+19)
Gunzip-12 461MB/s ± 1% 460MB/s ± 1% -0.27% (p=0.044 n=20+19)
JSONEncode-12 116MB/s ± 1% 115MB/s ± 2% -0.58% (p=0.022 n=20+20)
JSONDecode-12 33.3MB/s ± 1% 31.6MB/s ± 2% -5.15% (p=0.000 n=18+19)
GoParse-12 15.0MB/s ± 2% 15.4MB/s ± 2% +2.66% (p=0.000 n=20+20)
RegexpMatchEasy0_32-12 317MB/s ± 2% 319MB/s ± 2% ~ (p=0.052 n=20+20)
RegexpMatchEasy0_1K-12 3.08GB/s ± 1% 2.99GB/s ± 1% -3.07% (p=0.000 n=19+19)
RegexpMatchEasy1_32-12 386MB/s ± 3% 386MB/s ± 2% ~ (p=0.939 n=19+20)
RegexpMatchEasy1_1K-12 2.10GB/s ± 1% 2.07GB/s ± 1% -1.46% (p=0.000 n=17+20)
RegexpMatchMedium_32-12 7.62MB/s ± 2% 7.64MB/s ± 1% ~ (p=0.702 n=19+20)
RegexpMatchMedium_1K-12 25.9MB/s ± 1% 26.1MB/s ± 2% +0.99% (p=0.000 n=18+20)
RegexpMatchHard_32-12 15.7MB/s ± 1% 15.7MB/s ± 2% ~ (p=0.723 n=20+20)
RegexpMatchHard_1K-12 16.6MB/s ± 2% 16.7MB/s ± 2% ~ (p=0.052 n=18+20)
Revcomp-12 481MB/s ± 2% 477MB/s ± 1% -0.83% (p=0.003 n=20+19)
Template-12 27.5MB/s ± 2% 27.3MB/s ± 2% ~ (p=0.062 n=20+19)
[Geo mean] 99.4MB/s 99.1MB/s -0.35%

Change-Id: I914d8cadded5a230509d118164a4c201601afc06
Reviewed-on: https://go-review.googlesource.com/16298Reviewed-by: Rick Hudson <rlh@golang.org>

dcd9e5bc

runtime: eliminate getfull barrier from concurrent mark · 62ba520b

Austin Clements authored Oct 26, 2015

Currently dedicated mark workers participate in the getfull barrier
during concurrent mark. However, the getfull barrier wasn't designed
for concurrent work and this causes no end of headaches.

In the concurrent setting, participants come and go. This makes mark
completion susceptible to live-lock: since dedicated workers are only
periodically polling for completion, it's possible for the program to
be in some transient worker each time one of the dedicated workers
wakes up to check if it can exit the getfull barrier. It also
complicates reasoning about the system because dedicated workers
participate directly in the getfull barrier, but transient workers
must instead use trygetfull because they have exit conditions that
aren't captured by getfull (e.g., fractional workers exit when
preempted). The complexity of implementing these exit conditions
contributed to #11677. Furthermore, the getfull barrier is inefficient
because we could be running user code instead of spinning on a P. In
effect, we're dedicating 25% of the CPU to marking even if that means
we have to spin to make that 25%. It also causes issues on Windows
because we can't actually sleep for 100µs (#8687).

Fix this by making dedicated workers no longer participate in the
getfull barrier. Instead, dedicated workers simply return to the
scheduler when they fail to get more work, regardless of what others
workers are doing, and the scheduler only starts new dedicated workers
if there's work available. Everything that needs to be handled by this
barrier is already handled by detection of mark completion.

This makes the system much more symmetric because all workers and
assists now use trygetfull during concurrent mark. It also loosens the
25% CPU target so that we can give some of that 25% back to user code
if there isn't enough work to keep the mark worker busy. And it
eliminates the problematic 100µs sleep on Windows during concurrent
mark (though not during mark termination).

The downside of this is that if we hit a bottleneck in the heap graph
that then expands back out, the system may shut down dedicated workers
and take a while to start them back up. We'll address this in the next
commit.

Updates #12041 and #8687.

No effect on the go1 benchmarks. This slows down the garbage benchmark
by 9%, but we'll more than make it up in the next commit.

name old time/op new time/op delta
XBenchGarbage-12 5.80ms ± 2% 6.32ms ± 4% +9.03% (p=0.000 n=20+20)

Change-Id: I65100a9ba005a8b5cf97940798918672ea9dd09b
Reviewed-on: https://go-review.googlesource.com/16297Reviewed-by: Rick Hudson <rlh@golang.org>

62ba520b

misc/ios: keep whole buffer in go_darwin_arm_exec · afab7716

David Crawshaw authored Nov 04, 2015

The existing go_darwin_arm_exec.go script does not work with Xcode 7,
not due to any significant changes, but just ordering and timing of
statements from lldb. Unfortunately the current design of
go_darwin_arm_exec.go makes it not obvious what gets stuck where, so
this moves from a moving buffer window to a complete buffer of the
lldb output.

The result is easier code to follow, and it works with Xcode 7.

Updates #12660.

Change-Id: I3b8b890b0bf4474119482e95d84e821a86d1eaed
Reviewed-on: https://go-review.googlesource.com/16634Reviewed-by: Michael Matloob <matloob@golang.org>

afab7716

net/http: register HTTP/2 before listening in ListenAndServe · e50a4c69

Brad Fitzpatrick authored Nov 03, 2015

Change-Id: Icf9b6802945051aa484fb9ebcce71704f5655474
Reviewed-on: https://go-review.googlesource.com/16630Reviewed-by: Andrew Gerrand <adg@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>

e50a4c69

syscall: allow nacl's fake network code to Listen twice on the same address · 8ee90fad

Brad Fitzpatrick authored Nov 04, 2015

Noticed from nacl trybot failures on new tests in
https://golang.org/cl/16630

Related earlier fix of mine to nacl's listen code:

  syscall: fix nacl listener to not accept connections once closed
  https://go-review.googlesource.com/15940

Perhaps a better fix (in the future?) would be to remove the listener
from the map at close, but that didn't seem entirely straightforward
last time I looked into it. It's not my code, but it seems that the
map entry continues to have a purpose even after Listener close. (?)

But given that this code is only really used for running tests and the
playground, this seems fine.

Change-Id: I43bfedc57c07f215f4d79c18f588d3650687a48f
Reviewed-on: https://go-review.googlesource.com/16650
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
Reviewed-by: Ian Lance Taylor <iant@golang.org>

8ee90fad

cmd/vet: use testenv · dc9ad586

David Crawshaw authored Nov 04, 2015

Fix for iOS builder.

Change-Id: I5b6c977b187446c848182a9294d5bed6b5f9f6e4
Reviewed-on: https://go-review.googlesource.com/16633
Run-TryBot: David Crawshaw <crawshaw@golang.org>
Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>

dc9ad586

crypto/md5: uniform Write func · 8c827c04

unknown authored Nov 04, 2015

Unification of implementation of existing md5.Write function
with other implementations (sha1, sha256, sha512).

Change-Id: I58ae02d165b17fc221953a5b4b986048b46c0508
Reviewed-on: https://go-review.googlesource.com/16621Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>

8c827c04

cmd/go: check if destination is a regular file · 76811213

Mohit Agarwal authored Nov 03, 2015

builder.copyFile ensures that the destination is an object file.  This
wouldn't be true if we are not writing to a regular file and the copy
fails.  Check if the destination is an object file only if we are
writing to a regular file.  While removing the file, ensure that it is a
regular file so that device files and such aren't removed when running
as a user with suggicient privileges.

Fixes #12407

Change-Id: Ie86ce9770fa59aa56fc486a5962287859b69db3d
Reviewed-on: https://go-review.googlesource.com/16585Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>

76811213

cmd/compile: add go:nowritebarrierrec annotation · 3a765430

Austin Clements authored Nov 02, 2015

This introduces a recursive variant of the go:nowritebarrier
annotation that prohibits write barriers not only in the annotated
function, but in all functions it calls, recursively. The error
message gives the shortest call stack from the annotated function to
the function containing the prohibited write barrier, including the
names of the functions and the line numbers of the calls.

To demonstrate the annotation, we apply it to gcmarkwb_m, the write
barrier itself.

This is a new annotation rather than a modification of the existing
go:nowritebarrier annotation because, for better or worse, there are
many go:nowritebarrier functions that do call functions with write
barriers. In most of these cases this is benign because the annotation
was conservative, but it prohibits simply coopting the existing
annotation.

Change-Id: I225ca483c8f699e8436373ed96349e80ca2c2479
Reviewed-on: https://go-review.googlesource.com/16554Reviewed-by: Keith Randall <khr@golang.org>

3a765430

cmd/cgo: add a missing newline in writeExports · 2780abd6

Ian Lance Taylor authored Nov 03, 2015

The code works without the newline, but it looks funny:

func _cgoexp_15afe6549f62_GoFn(a unsafe.Pointer, n int32) {	fn := GoFn

This adds a newline after the '{'.

Change-Id: I6c465abe16f47924426d1b22b91004b3a3586ebd
Reviewed-on: https://go-review.googlesource.com/16612
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Minux Ma <minux@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>

2780abd6

net/http: don't panic after request if Handler sets Request.Body to nil · b1050542

Brad Fitzpatrick authored Nov 03, 2015

The Server's server goroutine was panicing (but recovering) when
cleaning up after handling a request. It was pretty harmless (it just
closed that one connection and didn't kill the whole process) but it
was distracting.

Updates #13135

Change-Id: I2a0ce9e8b52c8d364e3f4ce245e05c6f8d62df14
Reviewed-on: https://go-review.googlesource.com/16572
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Andrew Gerrand <adg@golang.org>

b1050542

03 Nov, 2015 11 commits

cmd/compile: make sure instrumented call has type width · 9179c9cb

Ian Lance Taylor authored Oct 26, 2015

The width of the type of an external variable defined with a type
literal may not be set when the instrumentation pass is run.  There are
two cases in the standard library that fail without the call to dowidth:

../../../src/encoding/base32/base32.go:322: constant -1000000000 overflows uintptr
../../../src/encoding/base32/base32.go:329: constant -1000000000 overflows uintptr
../../../src/encoding/json/encode.go:385: constant -1000000000 overflows uintptr
../../../src/encoding/json/encode.go:387: constant -1000000000 overflows uintptr

Change-Id: I7c3334f7decdb7488595ffe4090cd262d7334283
Reviewed-on: https://go-review.googlesource.com/16331
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>

9179c9cb

misc/cgo/testsanitizers: fix test of whether compiler option works · 6326786c

Ian Lance Taylor authored Nov 03, 2015

On older versions of GCC we need to pass a file name before GCC will
report an unrecognized option.

Fixes #13065.

Change-Id: I7ed34c01a006966a446059025f7d10235c649072
Reviewed-on: https://go-review.googlesource.com/16589
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Crawshaw <crawshaw@golang.org>

6326786c

runtime: remove dead code · ee0305e0

Dmitry Vyukov authored Nov 03, 2015

runtime.free has long gone.

Change-Id: I058f69e6481b8fa008e1951c29724731a8a3d081
Reviewed-on: https://go-review.googlesource.com/16593Reviewed-by: Austin Clements <austin@google.com>
Run-TryBot: Austin Clements <austin@google.com>

ee0305e0

runtime: cache two workbufs to reduce contention · b6c0934a

Austin Clements authored Oct 16, 2015

Currently the gcWork abstraction caches a single work buffer. As a
result, if a worker is putting and getting pointers right at the
boundary of a work buffer, it can flap between work buffers and
(potentially significantly) increase contention on the global work
buffer lists.

This change modifies gcWork to instead cache two work buffers and
switch off between them. This introduces one buffers' worth of
hysteresis and eliminates the above performance worst case by
amortizing the cost of getting or putting a work buffer over at least
one buffers' worth of work.

In practice, it's difficult to trigger this worst case with reasonably
large work buffers. On the garbage benchmark, this reduces the max
writes/sec to the global work list from 32K to 25K and the median from
6K to 5K. However, if a workload were to trigger this worst case
behavior, it could significantly drive up this contention.

This has negligible effects on the go1 benchmarks and slightly speeds
up the garbage benchmark.

name old time/op new time/op delta
XBenchGarbage-12 5.90ms ± 3% 5.83ms ± 4% -1.18% (p=0.011 n=18+18)

name old time/op new time/op delta
BinaryTree17-12 3.22s ± 4% 3.17s ± 3% -1.57% (p=0.009 n=19+20)
Fannkuch11-12 2.44s ± 1% 2.53s ± 4% +3.78% (p=0.000 n=18+19)
FmtFprintfEmpty-12 50.2ns ± 2% 50.5ns ± 5% ~ (p=0.631 n=19+20)
FmtFprintfString-12 167ns ± 1% 166ns ± 1% ~ (p=0.141 n=20+20)
FmtFprintfInt-12 162ns ± 1% 159ns ± 1% -1.80% (p=0.000 n=20+20)
FmtFprintfIntInt-12 277ns ± 2% 263ns ± 1% -4.78% (p=0.000 n=20+18)
FmtFprintfPrefixedInt-12 240ns ± 1% 232ns ± 2% -3.25% (p=0.000 n=20+20)
FmtFprintfFloat-12 311ns ± 1% 315ns ± 2% +1.17% (p=0.000 n=20+20)
FmtManyArgs-12 1.05µs ± 2% 1.03µs ± 2% -1.72% (p=0.000 n=20+20)
GobDecode-12 8.65ms ± 1% 8.71ms ± 2% +0.68% (p=0.001 n=19+20)
GobEncode-12 6.51ms ± 1% 6.54ms ± 1% +0.42% (p=0.047 n=20+19)
Gzip-12 318ms ± 2% 315ms ± 2% -1.20% (p=0.000 n=19+19)
Gunzip-12 42.2ms ± 2% 42.1ms ± 1% ~ (p=0.667 n=20+19)
HTTPClientServer-12 62.5µs ± 1% 62.4µs ± 1% ~ (p=0.110 n=20+18)
JSONEncode-12 16.8ms ± 1% 16.8ms ± 2% ~ (p=0.569 n=19+20)
JSONDecode-12 60.8ms ± 2% 59.8ms ± 1% -1.69% (p=0.000 n=19+19)
Mandelbrot200-12 3.87ms ± 1% 3.85ms ± 0% -0.61% (p=0.001 n=20+17)
GoParse-12 3.76ms ± 2% 3.76ms ± 1% ~ (p=0.698 n=20+20)
RegexpMatchEasy0_32-12 100ns ± 2% 101ns ± 2% ~ (p=0.065 n=19+20)
RegexpMatchEasy0_1K-12 342ns ± 2% 333ns ± 1% -2.82% (p=0.000 n=20+19)
RegexpMatchEasy1_32-12 83.3ns ± 2% 83.2ns ± 2% ~ (p=0.692 n=20+19)
RegexpMatchEasy1_1K-12 498ns ± 2% 490ns ± 1% -1.52% (p=0.000 n=18+20)
RegexpMatchMedium_32-12 131ns ± 2% 131ns ± 2% ~ (p=0.464 n=20+18)
RegexpMatchMedium_1K-12 39.3µs ± 2% 39.6µs ± 1% +0.77% (p=0.000 n=18+19)
RegexpMatchHard_32-12 2.04µs ± 2% 2.06µs ± 1% +0.69% (p=0.009 n=19+20)
RegexpMatchHard_1K-12 61.4µs ± 2% 62.1µs ± 1% +1.21% (p=0.000 n=19+20)
Revcomp-12 534ms ± 1% 529ms ± 1% -0.97% (p=0.000 n=19+16)
Template-12 70.4ms ± 2% 70.0ms ± 1% ~ (p=0.070 n=19+19)
TimeParse-12 359ns ± 3% 344ns ± 1% -4.15% (p=0.000 n=19+19)
TimeFormat-12 357ns ± 1% 361ns ± 2% +1.05% (p=0.002 n=20+20)
[Geo mean] 62.4µs 62.0µs -0.56%

name old speed new speed delta
GobDecode-12 88.7MB/s ± 1% 88.1MB/s ± 2% -0.68% (p=0.001 n=19+20)
GobEncode-12 118MB/s ± 1% 117MB/s ± 1% -0.42% (p=0.046 n=20+19)
Gzip-12 60.9MB/s ± 2% 61.7MB/s ± 2% +1.21% (p=0.000 n=19+19)
Gunzip-12 460MB/s ± 2% 461MB/s ± 1% ~ (p=0.661 n=20+19)
JSONEncode-12 116MB/s ± 1% 115MB/s ± 2% ~ (p=0.555 n=19+20)
JSONDecode-12 31.9MB/s ± 2% 32.5MB/s ± 1% +1.72% (p=0.000 n=19+19)
GoParse-12 15.4MB/s ± 2% 15.4MB/s ± 1% ~ (p=0.653 n=20+20)
RegexpMatchEasy0_32-12 317MB/s ± 2% 315MB/s ± 2% ~ (p=0.141 n=19+20)
RegexpMatchEasy0_1K-12 2.99GB/s ± 2% 3.07GB/s ± 1% +2.86% (p=0.000 n=20+19)
RegexpMatchEasy1_32-12 384MB/s ± 2% 385MB/s ± 2% ~ (p=0.672 n=20+19)
RegexpMatchEasy1_1K-12 2.06GB/s ± 2% 2.09GB/s ± 1% +1.54% (p=0.000 n=18+20)
RegexpMatchMedium_32-12 7.62MB/s ± 2% 7.63MB/s ± 2% ~ (p=0.800 n=20+18)
RegexpMatchMedium_1K-12 26.0MB/s ± 1% 25.8MB/s ± 1% -0.77% (p=0.000 n=18+19)
RegexpMatchHard_32-12 15.7MB/s ± 2% 15.6MB/s ± 1% -0.69% (p=0.010 n=19+20)
RegexpMatchHard_1K-12 16.7MB/s ± 2% 16.5MB/s ± 1% -1.19% (p=0.000 n=19+20)
Revcomp-12 476MB/s ± 1% 481MB/s ± 1% +0.97% (p=0.000 n=19+16)
Template-12 27.6MB/s ± 2% 27.7MB/s ± 1% ~ (p=0.071 n=19+19)
[Geo mean] 99.1MB/s 99.3MB/s +0.27%

Change-Id: I68bcbf74ccb716cd5e844a554f67b679135105e6
Reviewed-on: https://go-review.googlesource.com/16042Reviewed-by: Rick Hudson <rlh@golang.org>
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>

b6c0934a

runtime: fix finalization and profiling of tiny allocations · bf606094

Dmitry Vyukov authored Nov 03, 2015

Handling of special records for tiny allocations has two problems:
1. Once we queue a finalizer we mark the object. As the result any
subsequent finalizers for the same object will not be queued
during this GC cycle. If we have 16 finalizers setup (the worst case),
finalization will take 16 GC cycles. This is what caused misbehave
of tinyfin.go. The actual flakiness was caused by the fact that fing
is asynchronous and don't always run before the check.
2. If a tiny block has both finalizer and profile specials,
it is possible that we both queue finalizer, preserve the object live
and free the profile record. As the result heap profile can be skewed.

Fix both issues by analyzing all special records for a single object at once.

Also, make tinyfin test stricter and remove reliance on real time.

Also, add a test for the problem 2. Currently heap profile missed about
a half of live memory.

Fixes #13100

Change-Id: I9ae4dc1c44893724138a4565ca5cae29f2e97544
Reviewed-on: https://go-review.googlesource.com/16591Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Dmitry Vyukov <dvyukov@google.com>

bf606094

strings: add asm version of Index() for short strings on amd64 · 95333aea

Ilya Tocar authored Oct 28, 2015

Currently we have special case for 1-byte strings,
This extends this to strings shorter than 32 bytes on amd64.
Results (broadwell):

name old time/op new time/op delta
IndexRune-4 57.4ns ± 0% 57.5ns ± 0% +0.10% (p=0.000 n=20+19)
IndexRuneFastPath-4 20.4ns ± 0% 20.4ns ± 0% ~ (all samples are equal)
Index-4 21.0ns ± 0% 21.8ns ± 0% +3.81% (p=0.000 n=20+20)
LastIndex-4 7.07ns ± 1% 6.98ns ± 0% -1.21% (p=0.000 n=20+16)
IndexByte-4 18.3ns ± 0% 18.3ns ± 0% ~ (all samples are equal)
IndexHard1-4 1.46ms ± 0% 0.39ms ± 0% -73.06% (p=0.000 n=16+16)
IndexHard2-4 1.46ms ± 0% 0.30ms ± 0% -79.55% (p=0.000 n=18+18)
IndexHard3-4 1.46ms ± 0% 0.66ms ± 0% -54.68% (p=0.000 n=19+19)
LastIndexHard1-4 1.46ms ± 0% 1.46ms ± 0% -0.01% (p=0.036 n=18+20)
LastIndexHard2-4 1.46ms ± 0% 1.46ms ± 0% ~ (p=0.588 n=19+19)
LastIndexHard3-4 1.46ms ± 0% 1.46ms ± 0% ~ (p=0.283 n=17+20)
IndexTorture-4 11.1µs ± 0% 11.1µs ± 0% +0.01% (p=0.000 n=18+17)

Change-Id: I892781549f558f698be4e41f9f568e3d0611efb5
Reviewed-on: https://go-review.googlesource.com/16430Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>

95333aea

runtime: enlarge GC work buffer size · 18705721

Austin Clements authored Oct 02, 2015

Currently the GC work buffers are only 256 bytes and hence can record
only 24 64-bit pointer. They were reduced from 4K in commits db7fd1c1
and a15818fe as a way to minimize the amount of work the per-P workbuf
caches could "hide" from the mark phase and carry in to the mark
termination phase. However, this approach wasn't very robust and we
later added a "mark 2" phase to address this problem head-on.

Because of mark 2, there's now no benefit to having very small work
buffers. But there are plenty of downsides: small work buffers
increase contention on the work lists, increase the frequency and
hence net overhead of acquiring and releasing work buffers, and
somewhat increase memory overhead of the GC.

This commit expands work buffers back to 4K (504 64-bit pointers).
This reduces the rate of writes to work.full in the garbage benchmark
from a peak of ~780,000 writes/sec to a peak of ~32,000 writes/sec.

This has negligible effect on the go1 benchmarks. It slightly slows
down the garbage benchmark.

name old time/op new time/op delta
XBenchGarbage-12 5.37ms ± 5% 5.60ms ± 2% +4.37% (p=0.000 n=20+20)

Change-Id: Ic9cc28e7a125d23d9faf4f5e690fb8aa9bcdfb28
Reviewed-on: https://go-review.googlesource.com/15893Reviewed-by: Rick Hudson <rlh@golang.org>
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>

18705721

runtime: make assists preemptible · 45652830

Austin Clements authored Oct 15, 2015

Currently, assists are non-preemptible, which means a heavily
assisting G can block other Gs from running. At the beginning of a GC
cycle, it can also delay scang, which will spin until the assist is
done. Since scanning is currently done sequentially, this can
seriously extend the length of the scan phase.

Fix this by making assists preemptible. Since the assist holds work
buffers and runs on the system stack, this must be done cooperatively:
we make gcDrainN return on preemption, and make the assist return from
the system stack and voluntarily Gosched.

This is prerequisite to enlarging the work buffers. Without this
change, the delays and spinning in scang increase significantly.

This has no effect on the go1 benchmarks.

name              old time/op  new time/op  delta
XBenchGarbage-12  5.72ms ± 4%  5.37ms ± 5%  -6.11%  (p=0.000 n=20+20)

Change-Id: I829e732a0f23b126da633516a1a9ec1a508fdbf1
Reviewed-on: https://go-review.googlesource.com/15894Reviewed-by: Rick Hudson <rlh@golang.org>
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>

45652830

runtime: replace assist sleep loop with park/ready · 15aa6bbd

Austin Clements authored Oct 14, 2015

GC assists must block until the assist can be satisfied (either
through stealing credit or doing work) or the GC cycle ends.
Currently, this is implemented as a retry loop with a 100 µs delay.
This obviously isn't ideal, as it wastes CPU and delays mutator
execution. It also has the somewhat peculiar downside that sleeping a
G requires allocation, and this requires working around recursive
allocation.

Replace this timed delay with a proper scheduling queue. When an
assist can't be satisfied immediately, it adds the allocating G to a
queue and parks it. Any time background scan credit is flushed, it
consults this queue, directly satisfies the debt of queued assists,
and wakes up satisfied assists before flushing any remaining credit to
the background credit pool.

No effect on the go1 benchmarks. Slightly speeds up the garbage
benchmark.

name              old time/op  new time/op  delta
XBenchGarbage-12  5.81ms ± 1%  5.72ms ± 4%  -1.65%  (p=0.011 n=20+20)

Updates #12041.

Change-Id: I8ee3b6274dd097b12b10a8030796a958a4b0e7b7
Reviewed-on: https://go-review.googlesource.com/15890Reviewed-by: Rick Hudson <rlh@golang.org>
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>

15aa6bbd

runtime: change p.runq from []*g to []guintptr · 0ca4488c

Austin Clements authored Nov 02, 2015

This eliminates many write barriers in the scheduler code that are
unnecessary and will interfere with upcoming changes where the garbage
collector will have to invoke run queue functions in contexts that
must not have write barriers.

Change-Id: I702d0ac99cfd00ffff406e7362917db6a43e7e55
Reviewed-on: https://go-review.googlesource.com/16556Reviewed-by: Russ Cox <rsc@golang.org>
Run-TryBot: Austin Clements <austin@google.com>

0ca4488c

misc/ios: fix an error when getenv encounters unset variable · 48155f54

Michael Matloob authored Nov 02, 2015

The error message should indicate the name of the unset variable,
rather than the value. The value will alwayse be empty.

Change-Id: I6f6c165074dfce857b6523703a890d205423cd28
Reviewed-on: https://go-review.googlesource.com/16555Reviewed-by: David Crawshaw <crawshaw@golang.org>

48155f54