- 27 Apr, 2015 15 commits
-
-
Dmitry Savintsev authored
Replaced code.google.com/p/re2/ with github.com/google/re2/ and updated the file names (re2-exhaustive.txt.bz2 not re2.txt.gz) as well as the re2 make command (make log). Change-Id: I15937b0b8a898d78d45366857ed86421c8d69960 Reviewed-on: https://go-review.googlesource.com/9372Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
-
Russ Cox authored
The master goroutine was returning before the child goroutine had done its final i < b.N (the one that fails and causes it to exit the loop) and then the benchmark harness was updating b.N, causing a read+write race on b.N. Change-Id: I2504270a0de30544736f6c32161337a25b505c3e Reviewed-on: https://go-review.googlesource.com/9368Reviewed-by: Austin Clements <austin@google.com>
-
Austin Clements authored
Change-Id: I061057414c722c5a0f03c709528afc8554114db6 Reviewed-on: https://go-review.googlesource.com/9367Reviewed-by: Rick Hudson <rlh@golang.org>
-
Josh Bleecher Snyder authored
This is a follow-up to CL 9269, as suggested by dvyukov. There is probably even more that can be done to speed up this shuffle. It will matter more once CL 7570 (fine-grained locking in select) is in and can be revisited then, with benchmarks. Change-Id: Ic13a27d11cedd1e1f007951214b3bb56b1644f02 Reviewed-on: https://go-review.googlesource.com/9393Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
-
Austin Clements authored
This avoids confusion with the main findrunnable in the scheduler. Change-Id: I8cf40657557a8610a2fe5a2f74598518256ca7f0 Reviewed-on: https://go-review.googlesource.com/9305Reviewed-by: Rick Hudson <rlh@golang.org>
-
Austin Clements authored
Currently, we use a full stop-the-world around enabling write barriers. This is to ensure that all Gs have enabled write barriers before any blackening occurs (either in gcBgMarkWorker() or in gcAssistAlloc()). However, there's no need to bring the whole world to a synchronous stop to ensure this. This change replaces the STW with a ragged barrier that ensures each P has individually observed that write barriers should be enabled before GC performs any blackening. Change-Id: If2f129a6a55bd8bdd4308067af2b739f3fb41955 Reviewed-on: https://go-review.googlesource.com/8207Reviewed-by: Russ Cox <rsc@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>
-
Austin Clements authored
This adds forEachP, which performs a general-purpose ragged global barrier. forEachP takes a callback and invokes it for every P at a GC safe point. Ps that are idle or in a syscall are considered to be at a continuous safe point. forEachP ensures that these Ps do not change state by forcing all syscall Ps into idle and holding the sched.lock. To ensure that Ps do not enter syscall or idle without running the safe-point function, this adds checks for a pending callback every place there is currently a gcwaiting check. We'll use forEachP to replace the STW around enabling the write barrier and to replace the current asynchronous per-M wbuf cache with a cooperatively managed per-P gcWork cache. Change-Id: Ie944f8ce1fead7c79bf271d2f42fcd61a41bb3cc Reviewed-on: https://go-review.googlesource.com/8206Reviewed-by: Russ Cox <rsc@golang.org> Reviewed-by: Rick Hudson <rlh@golang.org>
-
Josh Bleecher Snyder authored
This reverts commit a9e50a6b. Change-Id: I3c5e459f1030e36bc249910facdae12303a44151 Reviewed-on: https://go-review.googlesource.com/9394Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
-
Josh Bleecher Snyder authored
Instead of running: go test -short runtime -cpu=1 go test -short runtime -cpu=2 go test -short runtime -cpu=4 Run just: go test -short runtime -cpu=1,2,4 This is a return to the Go 1.4.2 behavior. We lose incremental display of progress and per-cpu timing information, but we don't have to recompile and relink the runtime test, which is slow. This cuts about 10s off all.bash. Updates #10571. Change-Id: I6e8c7149780d47439f8bcfa888e6efc84290c60a Reviewed-on: https://go-review.googlesource.com/9350Reviewed-by: Dave Cheney <dave@cheney.net> Reviewed-by: Ian Lance Taylor <iant@golang.org> Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
-
Josh Bleecher Snyder authored
Reduces allocs linking cmd/go and runtime.test by ~13%. No functional changes. The most easily addressed sources of allocations after this are expandpkg, rdstring, and symbuf string conversion. These can be reduced by interning strings, but that increases the overall memory footprint. Change-Id: Ifedefc9f2a0403bcc75460d6b139e8408374e058 Reviewed-on: https://go-review.googlesource.com/9391Reviewed-by: David Crawshaw <crawshaw@golang.org>
-
Roger Peppe authored
There is no need to escape newlines in char data - it makes the XML larger and harder to read. Change-Id: I1c1fcee1bdffc705c7428f89ca90af8085d6fb73 Reviewed-on: https://go-review.googlesource.com/9310Reviewed-by: Nigel Tao <nigeltao@golang.org> Reviewed-by: Russ Cox <rsc@golang.org>
-
Austin Clements authored
This fixes a bug where the runtime ready()s a goroutine while setting up a new M that's initially marked as spinning, causing the scheduler to later panic when it finds work in the run queue of a P associated with a spinning M. Specifically, the sequence of events that can lead to this is: 1) sysmon calls handoffp to hand off a P stolen from a syscall. 2) handoffp sees no pending work on the P, so it calls startm with spinning set. 3) startm calls newm, which in turn calls allocm to allocate a new M. 4) allocm "borrows" the P we're handing off in order to do allocation and performs this allocation. 5) This allocation may assist the garbage collector, and this assist may detect the end of concurrent mark and ready() the main GC goroutine to signal this. 6) This ready()ing puts the GC goroutine on the run queue of the borrowed P. 7) newm starts the OS thread, which runs mstart and subsequently mstart1, which marks the M spinning because startm was called with spinning set. 8) mstart1 enters the scheduler, which panics because there's work on the run queue, but the M is marked spinning. To fix this, before marking the M spinning in step 7, add a check to see if work was been added to the P's run queue. If this is the case, undo the spinning instead. Fixes #10573. Change-Id: I4670495ae00582144a55ce88c45ae71de597cfa5 Reviewed-on: https://go-review.googlesource.com/9332Reviewed-by: Russ Cox <rsc@golang.org> Run-TryBot: Austin Clements <austin@google.com>
-
Austin Clements authored
This adds a check that we never put a P on the idle list when it has work on its local run queue. Change-Id: Ifcfab750de60c335148a7f513d4eef17be03b6a7 Reviewed-on: https://go-review.googlesource.com/9324Reviewed-by: Rick Hudson <rlh@golang.org> Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
-
Josh Bleecher Snyder authored
This is the optimization made to math/rand in CL 21030043. Change-Id: I231b24fa77cac1fe74ba887db76313b5efaab3e8 Reviewed-on: https://go-review.googlesource.com/9269Reviewed-by: Minux Ma <minux@golang.org>
-
John Dethridge authored
Change-Id: I677a5ee273a4d285a8adff71ffcfeac34afc887f Reviewed-on: https://go-review.googlesource.com/9235Reviewed-by: Austin Clements <austin@google.com>
-
- 26 Apr, 2015 11 commits
-
-
Adam Langley authored
This change causes the GetCertificate callback to be called if Certificates is empty. Previously this configuration would result in an error. This allows people to have servers that depend entirely on dynamic certificate selection, even when the client doesn't send SNI. Fixes #9208. Change-Id: I2f5a5551215958b88b154c64a114590300dfc461 Reviewed-on: https://go-review.googlesource.com/8792Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Adam Langley <agl@golang.org>
-
Jonathan Rudenberg authored
The OCSP response is currently only exposed via a method on Conn, which makes it inaccessible when using wrappers like net/http. The ConnectionState structure is typically available even when using wrappers and contains many of the other handshake details, so this change exposes the stapled OCSP response in that structure. Change-Id: If8dab49292566912c615d816321b4353e711f71f Reviewed-on: https://go-review.googlesource.com/9361Reviewed-by: Adam Langley <agl@golang.org> Run-TryBot: Adam Langley <agl@golang.org>
-
David Leon Gil authored
At present, Unmarshal does not check that the point it unmarshals is actually *on* the curve. (It may be on the curve's twist.) This can, as Daniel Bernstein has pointed out at great length, lead to quite devastating attacks. And 3 out of the 4 curves supported by crypto/elliptic have twists with cofactor != 1; P-224, in particular, has a sufficiently large cofactor that it is likely that conventional dlog attacks might be useful. This closes #2445, filed by Watson Ladd. To explain why this was (partially) rejected before being accepted: In the general case, for curves with cofactor != 1, verifying subgroup membership is required. (This is expensive and hard-to-implement.) But, as recent discussion during the CFRG standardization process has brought out, small-subgroup attacks are much less damaging than a twist attack. Change-Id: I284042eb9954ff9b7cde80b8b693b1d468c7e1e8 Reviewed-on: https://go-review.googlesource.com/2421Reviewed-by: Adam Langley <agl@golang.org>
-
Paul van Brouwershaven authored
This implements a method for x509.CertificateRequest to prevent certain attacks and to allow a CA/RA to properly check the validity of the binding between an end entity and a key pair, to prove that it has possession of (i.e., is able to use) the private key corresponding to the public key for which a certificate is requested. RFC 2986 section 3 states: "A certification authority fulfills the request by authenticating the requesting entity and verifying the entity's signature, and, if the request is valid, constructing an X.509 certificate from the distinguished name and public key, the issuer name, and the certification authority's choice of serial number, validity period, and signature algorithm." Change-Id: I37795c3b1dfdfdd455d870e499b63885eb9bda4f Reviewed-on: https://go-review.googlesource.com/7371Reviewed-by: Adam Langley <agl@golang.org>
-
Jonathan Rudenberg authored
This change adds a new method to tls.Config, SetSessionTicketKeys, that changes the key used to encrypt session tickets while the server is running. Additional keys may be provided that will be used to maintain continuity while rotating keys. If a ticket encrypted with an old key is provided by the client, the server will resume the session and provide the client with a ticket encrypted using the new key. Fixes #9994 Change-Id: Idbc16b10ff39616109a51ed39a6fa208faad5b4e Reviewed-on: https://go-review.googlesource.com/9072Reviewed-by: Jonathan Rudenberg <jonathan@titanous.com> Reviewed-by: Adam Langley <agl@golang.org>
-
Håvard Haugen authored
The command "go tool pprof -top $GOROOT/bin/go /dev/null" now logs that profile is empty instead of panicking. Fixes #9207 Change-Id: I3d55c179277cb19ad52c8f24f1aca85db53ee08d Reviewed-on: https://go-review.googlesource.com/2571 Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
-
Jonathan Rudenberg authored
This change adds support for serving and receiving Signed Certificate Timestamps as described in RFC 6962. The server is now capable of serving SCTs listed in the Certificate structure. The client now asks for SCTs and, if any are received, they are exposed in the ConnectionState structure. Fixes #10201 Change-Id: Ib3adae98cb4f173bc85cec04d2bdd3aa0fec70bb Reviewed-on: https://go-review.googlesource.com/8988Reviewed-by: Adam Langley <agl@golang.org> Run-TryBot: Adam Langley <agl@golang.org> Reviewed-by: Jonathan Rudenberg <jonathan@titanous.com>
-
Justin Nuß authored
Currently parseRecord will always start with a nil slice and then resize the slice on append. For input with a fixed number of fields per record we can preallocate the slice to avoid having to resize the slice. This change implements this optimization by using FieldsPerRecord as capacity if it's > 0 and also adds a benchmark to better show the differences. benchmark old ns/op new ns/op delta BenchmarkRead 19741 17909 -9.28% benchmark old allocs new allocs delta BenchmarkRead 59 41 -30.51% benchmark old bytes new bytes delta BenchmarkRead 6276 5844 -6.88% Change-Id: I7c2abc9c80a23571369bcfcc99a8ffc474eae7ab Reviewed-on: https://go-review.googlesource.com/8880Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org> Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org> TryBot-Result: Gobot Gobot <gobot@golang.org>
-
David Crawshaw authored
Follows the linux signal forwarding semantics from http://golang.org/cl/8712, sharing the implementation of sigfwdgo. Forwarding for 386, arm, and arm64 will follow. Change-Id: I6bf30d563d19da39b6aec6900c7fe12d82ed4f62 Reviewed-on: https://go-review.googlesource.com/9302Reviewed-by: Ian Lance Taylor <iant@golang.org>
-
Michael Hudson-Doyle authored
Sorry about this. Fixes #10575 Change-Id: I2de23be68e7d822d182e5a0d6a00c607448d861e Reviewed-on: https://go-review.googlesource.com/9341Reviewed-by: Minux Ma <minux@golang.org>
-
Matt T. Proud authored
This commit is largely cosmetic in the sense that it is the remnants of a change proposal I had prepared for testing/quick, until I discovered that 3e9ed273 already implemented the feature I was looking for: quick.Value() for reflect.Kind Array. What you see is a merger and manual cleanup; the cosmetic cleanups are as follows: (1.) Keeping the TestCheckEqual and its associated input functions in the same order as type kinds defined in reflect.Kind. Since 3e9ed273 was committed, the test case began to diverge from the constant's ordering. (2.) The `Intptr` derivatives existed to exercise quick.Value with reflect.Kind's `Ptr` constant. All `Intptr` (unrelated to `uintptr`) in the test have been migrated to ensure the parallelism of the listings and to convey that `Intptr` is not special. (3.) Correct a misspelling (transposition) of "alias", whereby it is named as "Alais". Change-Id: I441450db16b8bb1272c52b0abcda3794dcd0599d Reviewed-on: https://go-review.googlesource.com/8804Reviewed-by: Russ Cox <rsc@golang.org>
-
- 25 Apr, 2015 1 commit
-
-
Michael Hudson-Doyle authored
The long comment block in obj6.go:progedit talked about the two code sequences for accessing g as "local exec" and "initial exec", but really they are both forms of local exec. This stuff is confusing enough without using the wrong words for things, so rewrite it to talk about 2-instruction and 1-instruction sequences. Unfortunately the confusion has made it into code, with the R_TLS_IE relocation now doing double duty as meaning actual initial exec when externally linking and boring old local exec when linking internally (half of this is my fault). So this stops using R_TLS_IE in the local exec case. There is a chance this might break plan9 or windows, but I don't think so. Next step is working out what the heck is going on on ARM... Change-Id: I09da4388210cf49dbc99fd25f5172bbe517cee57 Reviewed-on: https://go-review.googlesource.com/9273Reviewed-by: Ian Lance Taylor <iant@golang.org> Run-TryBot: Ian Lance Taylor <iant@golang.org>
-
- 24 Apr, 2015 13 commits
-
-
Rick Hudson authored
A previous change to mbitmap.go dropped a return on a path the seems not to be excersized. This was a mistake that this CL fixes. Change-Id: I715ee4ef08f5bf8d9f53cee84e8fb31a237e2d43 Reviewed-on: https://go-review.googlesource.com/9295Reviewed-by: Austin Clements <austin@google.com>
-
Michael Hudson-Doyle authored
I think this should fix the arm build. A proper fix involves making the handling of tlsg less fragile, I'll try that tomorrow. Update #10557 Change-Id: I9b1b666737fb40aebb6f284748509afa8483cce5 Reviewed-on: https://go-review.googlesource.com/9272Reviewed-by: Dave Cheney <dave@cheney.net> Run-TryBot: Dave Cheney <dave@cheney.net>
-
Austin Clements authored
Currently, each M has a cache of the most recently used *workbuf. This is used primarily by the write barrier so it doesn't have to access the global workbuf lists on every write barrier. It's also used by stack scanning because it's convenient. This cache is important for write barrier performance, but this particular approach has several downsides. It's faster than no cache, but far from optimal (as the benchmarks below show). It's complex: access to the cache is sprinkled through most of the workbuf list operations and it requires special care to transform into and back out of the gcWork cache that's actually used for scanning and marking. It requires atomic exchanges to take ownership of the cached workbuf and to return it to the M's cache even though it's almost always used by only the current M. Since it's per-M, flushing these caches is O(# of Ms), which may be high. And it has some significant subtleties: for example, in general the cache shouldn't be used after the harvestwbufs() in mark termination because it could hide work from mark termination, but stack scanning can happen after this and *will* use the cache (but it turns out this is okay because it will always be followed by a getfull(), which drains the cache). This change replaces this cache with a per-P gcWork object. This gcWork cache can be used directly by scanning and marking (as long as preemption is disabled, which is a general requirement of gcWork). Since it's per-P, it doesn't require synchronization, which simplifies things and means the only atomic operations in the write barrier are occasionally fetching new work buffers and setting a mark bit if the object isn't already marked. This cache can be flushed in O(# of Ps), which is generally small. It follows a simple flushing rule: the cache can be used during any phase, but during mark termination it must be flushed before allowing preemption. This also makes the dispose during mutator assist no longer necessary, which eliminates the vast majority of gcWork dispose calls and reduces contention on the global workbuf lists. And it's a lot faster on some benchmarks: benchmark old ns/op new ns/op delta BenchmarkBinaryTree17 11963668673 11206112763 -6.33% BenchmarkFannkuch11 2643217136 2649182499 +0.23% BenchmarkFmtFprintfEmpty 70.4 70.2 -0.28% BenchmarkFmtFprintfString 364 307 -15.66% BenchmarkFmtFprintfInt 317 282 -11.04% BenchmarkFmtFprintfIntInt 512 483 -5.66% BenchmarkFmtFprintfPrefixedInt 404 380 -5.94% BenchmarkFmtFprintfFloat 521 479 -8.06% BenchmarkFmtManyArgs 2164 1894 -12.48% BenchmarkGobDecode 30366146 22429593 -26.14% BenchmarkGobEncode 29867472 26663152 -10.73% BenchmarkGzip 391236616 396779490 +1.42% BenchmarkGunzip 96639491 96297024 -0.35% BenchmarkHTTPClientServer 100110 70763 -29.31% BenchmarkJSONEncode 51866051 52511382 +1.24% BenchmarkJSONDecode 103813138 86094963 -17.07% BenchmarkMandelbrot200 4121834 4120886 -0.02% BenchmarkGoParse 16472789 5879949 -64.31% BenchmarkRegexpMatchEasy0_32 140 140 +0.00% BenchmarkRegexpMatchEasy0_1K 394 394 +0.00% BenchmarkRegexpMatchEasy1_32 120 120 +0.00% BenchmarkRegexpMatchEasy1_1K 621 614 -1.13% BenchmarkRegexpMatchMedium_32 209 202 -3.35% BenchmarkRegexpMatchMedium_1K 54889 55175 +0.52% BenchmarkRegexpMatchHard_32 2682 2675 -0.26% BenchmarkRegexpMatchHard_1K 79383 79524 +0.18% BenchmarkRevcomp 584116718 584595320 +0.08% BenchmarkTemplate 125400565 109620196 -12.58% BenchmarkTimeParse 386 387 +0.26% BenchmarkTimeFormat 580 447 -22.93% (Best out of 10 runs. The delta of averages is similar.) This also puts us in a good position to flush these caches when nearing the end of concurrent marking, which will let us increase the size of the work buffers while still controlling mark termination pause time. Change-Id: I2dd94c8517a19297a98ec280203cccaa58792522 Reviewed-on: https://go-review.googlesource.com/9178 Run-TryBot: Austin Clements <austin@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Russ Cox <rsc@golang.org>
-
Austin Clements authored
When findRunnable considers running a fractional mark worker, it first checks if there's any work to be done; if there isn't there's no point in running the worker because it will just reschedule immediately. However, currently findRunnable just checks work.full and work.partial, whereas getfull can *also* draw work from m.currentwbuf. As a result, findRunnable may not start a worker even though there actually is work. This problem manifests itself in occasional failures of the test/init1.go test. This test is unusual because it performs a large amount of allocation without executing any write barriers, which means there's nothing to force the pointers in currentwbuf out to the work.partial/full lists where findRunnable can see them. This change fixes this problem by making findRunnable also check for a currentwbuf. This aligns findRunnable with trygetfull's notion of whether or not there's work. Change-Id: Ic76d22b7b5d040bc4f58a6b5975e9217650e66c4 Reviewed-on: https://go-review.googlesource.com/9299Reviewed-by: Russ Cox <rsc@golang.org>
-
Austin Clements authored
Currently, findRunnable only considers running a mark worker if there's work in the work queue. In principle, this can delay the start of the desired number of dedicated mark workers if there's no work pending. This is unlikely to occur in practice, since there should be work queued from the scan phase, but if it were to come up, a CPU hog mutator could slow down or delay garbage collection. This check makes sense for fractional mark workers, since they'll just return to the scheduler immediately if there's no work, but we want the scheduler to start all of the dedicated mark workers promptly, even if there's currently no queued work. Hence, this change moves the pending work check after the check for starting a dedicated worker. Change-Id: I52b851cc9e41f508a0955b3f905ca80f109ea101 Reviewed-on: https://go-review.googlesource.com/9298Reviewed-by: Rick Hudson <rlh@golang.org>
-
Austin Clements authored
bgMarkCount no longer exists. Change-Id: I3aa406fdccfca659814da311229afbae55af8304 Reviewed-on: https://go-review.googlesource.com/9297Reviewed-by: Rick Hudson <rlh@golang.org>
-
Hyang-Ah Hana Kim authored
Instead of comparing against the entire output that may include verbose warning messages, use the last line of the output and check it includes the expected success message (PASS). Change-Id: Iafd583ee5529a8aef5439b9f1f6ce0185e4b1331 Reviewed-on: https://go-review.googlesource.com/9304Reviewed-by: David Crawshaw <crawshaw@golang.org>
-
Rob Pike authored
Also rename and update mkdoc.sh to mkalldocs.sh Change-Id: Ief3673c22d45624e173fc65ee279cea324da03b5 Reviewed-on: https://go-review.googlesource.com/9226Reviewed-by: Russ Cox <rsc@golang.org>
-
Srdjan Petrovic authored
The motivation is that sysAlloc/Free() currently aren't safe to be called without a valid G, because arm's xadd64() uses locks that require a valid G. The solution here was proposed by Dmitry Vyukov: use xadduintptr() instead of xadd64(), until arm can support xadd64 on all of its architectures (not a trivial task for arm). Change-Id: I250252079357ea2e4360e1235958b1c22051498f Reviewed-on: https://go-review.googlesource.com/9002Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
-
Hyang-Ah Hana Kim authored
- main3.c tests main.main is exported when compiled for GOOS=android. - wait longer for main2.c (it's slow on android/arm) - rearranged test.bash Fixes #10070. Change-Id: I6e5a98d1c5fae776afa54ecb5da633b59b269316 Reviewed-on: https://go-review.googlesource.com/9296Reviewed-by: David Crawshaw <crawshaw@golang.org> Run-TryBot: Hyang-Ah Hana Kim <hyangah@gmail.com>
-
Michael Hudson-Doyle authored
This lets us avoid loading string constants via the GOT and (together with http://golang.org/cl/9102) results in the fannkuch benchmark having very similar register usage with -dynlink as without. Change-Id: Ic3892b399074982b76773c3e547cfbba5dabb6f9 Reviewed-on: https://go-review.googlesource.com/9103Reviewed-by: Ian Lance Taylor <iant@golang.org> Run-TryBot: Ian Lance Taylor <iant@golang.org>
-
Austin Clements authored
Currently, when allocation reaches the GC trigger, the runtime uses readyExecute to start the GC goroutine immediately rather than wait for the scheduler to get around to the GC goroutine while the mutator continues to grow the heap. Now that the scheduler runs the most recently readied goroutine when a goroutine yields its time slice, this rigmarole is no longer necessary. The runtime can simply ready the GC goroutine and yield from the readying goroutine. Change-Id: I3b4ebadd2a72a923b1389f7598f82973dd5c8710 Reviewed-on: https://go-review.googlesource.com/9292Reviewed-by: Rick Hudson <rlh@golang.org> Reviewed-by: Russ Cox <rsc@golang.org> Run-TryBot: Austin Clements <austin@google.com>
-
Austin Clements authored
Currently, the main GC goroutine sleeps on a note during concurrent mark and the first background mark worker or assist to finish marking use wakes up that note to let the main goroutine proceed into mark termination. Unfortunately, the latency of this wakeup can be quite high, since the GC goroutine will typically have lost its P while in the futex sleep, meaning it will be placed on the global run queue and will wait there until some P is kind enough to pick it up. This delay gives the mutator more time to allocate and create floating garbage, growing the heap unnecessarily. Worse, it's likely that background marking has stopped at this point (unless GOMAXPROCS>4), so anything that's allocated and published to the heap during this window will have to be scanned during mark termination while the world is stopped. This change replaces the note sleep/wakeup with a gopark/ready scheme. This keeps the wakeup inside the Go scheduler and lets the garbage collector take advantage of the new scheduler semantics that run the ready()d goroutine immediately when the ready()ing goroutine sleeps. For the json benchmark from x/benchmarks with GOMAXPROCS=4, this reduces the delay in waking up the GC goroutine and entering mark termination once concurrent marking is done from ~100ms to typically <100µs. Change-Id: Ib11f8b581b8914f2d68e0094f121e49bac3bb384 Reviewed-on: https://go-review.googlesource.com/9291Reviewed-by: Rick Hudson <rlh@golang.org> Reviewed-by: Russ Cox <rsc@golang.org>
-