- 13 Apr, 2016 1 commit
David Crawshaw authored
This CL introduces the typeOff type and a lookup method of the same name that can turn a typeOff offset into an *rtype.

In a typical Go binary (built with buildmode=exe, pie, c-archive, or c-shared), there is one moduledata and all typeOff values are offsets relative to firstmoduledata.types. This makes computing the pointer cheap in typical programs.

With buildmode=shared (and one day, buildmode=plugin) there are multiple modules whose relative offset is determined at runtime. We identify a type in the general case by the pair of the original *rtype that references it and its typeOff value. We determine the module from the original pointer, and then use the typeOff from there to compute the final *rtype.

To ensure there is only one *rtype representing each type, the runtime initializes a typemap for each module, using any identical type from an earlier module when resolving that offset. This means that types computed from an offset match the type mapped by the pointer dynamic relocations.

A series of followup CLs will replace other *rtype values with typeOff (and name/*string with nameOff). For types created at runtime by reflect, type offsets are treated as global IDs and reference into a reflect offset map kept by the runtime.

darwin/amd64:
    cmd/go: -57KB (0.6%)
    jujud:  -557KB (0.8%)

linux/amd64 PIE:
    cmd/go: -361KB (3.0%)
    jujud:  -3.5MB (4.2%)

For #6853.

Change-Id: Icf096fd884a0a0cb9f280f46f7a26c70a9006c96
Reviewed-on: https://go-review.googlesource.com/21285
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: David Crawshaw <crawshaw@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
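The one-*rtype-per-type invariant this CL preserves is observable from user code through package reflect: independently obtained reflect.Type values for the same type compare equal because they wrap the same *rtype. A minimal illustration (standard reflect API only, not runtime internals):

    package main

    import (
        "fmt"
        "reflect"
    )

    func main() {
        // Two independently derived Types for []int. Because the runtime
        // resolves both to the same *rtype, the interface comparison is true.
        a := reflect.TypeOf([]int(nil))
        b := reflect.TypeOf(make([]int, 0))
        fmt.Println(a == b) // true
    }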
-
- 10 Apr, 2016 1 commit
Emmanuel Odeke authored
Make execution panics implement Error, as mandated by https://golang.org/ref/spec#Run_time_panics, instead of panicking with plain strings.

Fixes #14965

Change-Id: I7827f898b9b9c08af541db922cc24fa0800ff18a
Reviewed-on: https://go-review.googlesource.com/21214
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
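The spec-mandated behavior is easy to check with recover: the recovered value of an execution panic satisfies runtime.Error. A small runnable check, using integer division by zero as one representative execution panic:

    package main

    import (
        "fmt"
        "runtime"
    )

    func main() {
        defer func() {
            // With this change the recovered value implements runtime.Error.
            if err, ok := recover().(runtime.Error); ok {
                fmt.Println("runtime.Error:", err) // runtime error: integer divide by zero
            }
        }()
        a, b := 1, 0
        fmt.Println(a / b) // panics at run time; b is not a constant
    }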
-
- 07 Apr, 2016 1 commit
Michael Hudson-Doyle authored
This keeps all Go processes from dying on startup on a system with >256 CPUs. I tested this by hacking osinit to set ncpu to 1000.

Updates #15131

Change-Id: I52e061a0de97be41d684dd8b748fa9087d6f1aef
Reviewed-on: https://go-review.googlesource.com/21599
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
-
- 05 Apr, 2016 2 commits
Dmitry Vyukov authored
Two GC-related functions, scang and casgstatus, wait in an active spin loop. Active spinning is never a good idea in user-space. Once we wait several times more than the expected wait time, something unexpected is happening (e.g. the thread we are waiting for is descheduled or handling a page fault) and we need to yield to the OS scheduler. Moreover, the expected wait time is very high for these functions: scang wait time can be tens of milliseconds, casgstatus can be hundreds of microseconds. It does not make sense to spin even for that time.

A go install -a std profile on a 4-core machine shows that 11% of time is spent in the active spin in scang:

    6.12%  compile  compile  [.] runtime.scang
    3.27%  compile  compile  [.] runtime.readgstatus
    1.72%  compile  compile  [.] runtime/internal/atomic.Load

The active spin also increases tail latency in the case of the slightest oversubscription: GC goroutines spend a whole quantum in the loop instead of executing user code.

Here is the scang wait time histogram during go install -a std:

    13707.0000 - 1815442.7667 [ 118]: ∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎...
    1815442.7667 - 3617178.5333 [ 9]: ∎∎∎∎∎∎∎∎∎
    3617178.5333 - 5418914.3000 [ 11]: ∎∎∎∎∎∎∎∎∎∎∎
    5418914.3000 - 7220650.0667 [ 5]: ∎∎∎∎∎
    7220650.0667 - 9022385.8333 [ 12]: ∎∎∎∎∎∎∎∎∎∎∎∎
    9022385.8333 - 10824121.6000 [ 13]: ∎∎∎∎∎∎∎∎∎∎∎∎∎
    10824121.6000 - 12625857.3667 [ 15]: ∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
    12625857.3667 - 14427593.1333 [ 18]: ∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
    14427593.1333 - 16229328.9000 [ 18]: ∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
    16229328.9000 - 18031064.6667 [ 32]: ∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
    18031064.6667 - 19832800.4333 [ 28]: ∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
    19832800.4333 - 21634536.2000 [ 6]: ∎∎∎∎∎∎
    21634536.2000 - 23436271.9667 [ 15]: ∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
    23436271.9667 - 25238007.7333 [ 11]: ∎∎∎∎∎∎∎∎∎∎∎
    25238007.7333 - 27039743.5000 [ 27]: ∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
    27039743.5000 - 28841479.2667 [ 20]: ∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
    28841479.2667 - 30643215.0333 [ 10]: ∎∎∎∎∎∎∎∎∎∎
    30643215.0333 - 32444950.8000 [ 7]: ∎∎∎∎∎∎∎
    32444950.8000 - 34246686.5667 [ 4]: ∎∎∎∎
    34246686.5667 - 36048422.3333 [ 4]: ∎∎∎∎
    36048422.3333 - 37850158.1000 [ 1]: ∎
    37850158.1000 - 39651893.8667 [ 5]: ∎∎∎∎∎
    39651893.8667 - 41453629.6333 [ 2]: ∎∎
    41453629.6333 - 43255365.4000 [ 2]: ∎∎
    43255365.4000 - 45057101.1667 [ 2]: ∎∎
    45057101.1667 - 46858836.9333 [ 1]: ∎
    46858836.9333 - 48660572.7000 [ 2]: ∎∎
    48660572.7000 - 50462308.4667 [ 3]: ∎∎∎
    50462308.4667 - 52264044.2333 [ 2]: ∎∎
    52264044.2333 - 54065780.0000 [ 2]: ∎∎

and the zoomed-in first part:

    13707.0000 - 19916.7667 [ 2]: ∎∎
    19916.7667 - 26126.5333 [ 2]: ∎∎
    26126.5333 - 32336.3000 [ 9]: ∎∎∎∎∎∎∎∎∎
    32336.3000 - 38546.0667 [ 8]: ∎∎∎∎∎∎∎∎
    38546.0667 - 44755.8333 [ 12]: ∎∎∎∎∎∎∎∎∎∎∎∎
    44755.8333 - 50965.6000 [ 10]: ∎∎∎∎∎∎∎∎∎∎
    50965.6000 - 57175.3667 [ 5]: ∎∎∎∎∎
    57175.3667 - 63385.1333 [ 6]: ∎∎∎∎∎∎
    63385.1333 - 69594.9000 [ 5]: ∎∎∎∎∎
    69594.9000 - 75804.6667 [ 6]: ∎∎∎∎∎∎
    75804.6667 - 82014.4333 [ 6]: ∎∎∎∎∎∎
    82014.4333 - 88224.2000 [ 4]: ∎∎∎∎
    88224.2000 - 94433.9667 [ 1]: ∎
    94433.9667 - 100643.7333 [ 1]: ∎
    100643.7333 - 106853.5000 [ 2]: ∎∎
    106853.5000 - 113063.2667 [ 0]:
    113063.2667 - 119273.0333 [ 2]: ∎∎
    119273.0333 - 125482.8000 [ 2]: ∎∎
    125482.8000 - 131692.5667 [ 1]: ∎
    131692.5667 - 137902.3333 [ 1]: ∎
    137902.3333 - 144112.1000 [ 0]:
    144112.1000 - 150321.8667 [ 2]: ∎∎
    150321.8667 - 156531.6333 [ 1]: ∎
    156531.6333 - 162741.4000 [ 1]: ∎
    162741.4000 - 168951.1667 [ 0]:
    168951.1667 - 175160.9333 [ 0]:
    175160.9333 - 181370.7000 [ 1]: ∎
    181370.7000 - 187580.4667 [ 1]: ∎
    187580.4667 - 193790.2333 [ 2]: ∎∎
    193790.2333 - 200000.0000 [ 0]:

Here is the casgstatus wait time histogram:

    631.0000 - 5276.6333 [ 3]: ∎∎∎
    5276.6333 - 9922.2667 [ 5]: ∎∎∎∎∎
    9922.2667 - 14567.9000 [ 2]: ∎∎
    14567.9000 - 19213.5333 [ 6]: ∎∎∎∎∎∎
    19213.5333 - 23859.1667 [ 5]: ∎∎∎∎∎
    23859.1667 - 28504.8000 [ 6]: ∎∎∎∎∎∎
    28504.8000 - 33150.4333 [ 6]: ∎∎∎∎∎∎
    33150.4333 - 37796.0667 [ 2]: ∎∎
    37796.0667 - 42441.7000 [ 1]: ∎
    42441.7000 - 47087.3333 [ 3]: ∎∎∎
    47087.3333 - 51732.9667 [ 0]:
    51732.9667 - 56378.6000 [ 1]: ∎
    56378.6000 - 61024.2333 [ 0]:
    61024.2333 - 65669.8667 [ 0]:
    65669.8667 - 70315.5000 [ 0]:
    70315.5000 - 74961.1333 [ 1]: ∎
    74961.1333 - 79606.7667 [ 0]:
    79606.7667 - 84252.4000 [ 0]:
    84252.4000 - 88898.0333 [ 0]:
    88898.0333 - 93543.6667 [ 0]:
    93543.6667 - 98189.3000 [ 0]:
    98189.3000 - 102834.9333 [ 0]:
    102834.9333 - 107480.5667 [ 1]: ∎
    107480.5667 - 112126.2000 [ 0]:
    112126.2000 - 116771.8333 [ 0]:
    116771.8333 - 121417.4667 [ 0]:
    121417.4667 - 126063.1000 [ 0]:
    126063.1000 - 130708.7333 [ 0]:
    130708.7333 - 135354.3667 [ 0]:
    135354.3667 - 140000.0000 [ 1]: ∎

Ideally we would eliminate the waiting by switching to an async state machine for GC, but for now just yield to the OS scheduler after a reasonable wait time. To choose the yielding parameters I've measured golang.org/x/benchmarks/http tail latencies with different yield delays and oversubscription levels.

With no oversubscription (to the degree possible):

scang yield delay = 1, casgstatus yield delay = 1
Latency-50 1.41ms ±15% 1.41ms ± 5% ~ (p=0.611 n=13+12)
Latency-95 5.21ms ± 2% 5.15ms ± 2% -1.15% (p=0.012 n=13+13)
Latency-99 7.16ms ± 2% 7.05ms ± 2% -1.54% (p=0.002 n=13+13)
Latency-999 10.7ms ± 9% 10.2ms ±10% -5.46% (p=0.004 n=12+13)

scang yield delay = 5000, casgstatus yield delay = 3000
Latency-50 1.41ms ±15% 1.41ms ± 8% ~ (p=0.511 n=13+13)
Latency-95 5.21ms ± 2% 5.14ms ± 2% -1.23% (p=0.006 n=13+13)
Latency-99 7.16ms ± 2% 7.02ms ± 2% -1.94% (p=0.000 n=13+13)
Latency-999 10.7ms ± 9% 10.1ms ± 8% -6.14% (p=0.000 n=12+13)

scang yield delay = 10000, casgstatus yield delay = 5000
Latency-50 1.41ms ±15% 1.45ms ± 6% ~ (p=0.724 n=13+13)
Latency-95 5.21ms ± 2% 5.18ms ± 1% ~ (p=0.287 n=13+13)
Latency-99 7.16ms ± 2% 7.05ms ± 2% -1.64% (p=0.002 n=13+13)
Latency-999 10.7ms ± 9% 10.0ms ± 5% -6.72% (p=0.000 n=12+13)

scang yield delay = 30000, casgstatus yield delay = 10000
Latency-50 1.41ms ±15% 1.51ms ± 7% +6.57% (p=0.002 n=13+13)
Latency-95 5.21ms ± 2% 5.21ms ± 2% ~ (p=0.960 n=13+13)
Latency-99 7.16ms ± 2% 7.06ms ± 2% -1.50% (p=0.012 n=13+13)
Latency-999 10.7ms ± 9% 10.0ms ± 6% -6.49% (p=0.000 n=12+13)

scang yield delay = 100000, casgstatus yield delay = 50000
Latency-50 1.41ms ±15% 1.53ms ± 6% +8.48% (p=0.000 n=13+12)
Latency-95 5.21ms ± 2% 5.23ms ± 2% ~ (p=0.287 n=13+13)
Latency-99 7.16ms ± 2% 7.08ms ± 2% -1.21% (p=0.004 n=13+13)
Latency-999 10.7ms ± 9% 9.9ms ± 3% -7.99% (p=0.000 n=12+12)

scang yield delay = 200000, casgstatus yield delay = 100000
Latency-50 1.41ms ±15% 1.47ms ± 5% ~ (p=0.072 n=13+13)
Latency-95 5.21ms ± 2% 5.17ms ± 2% ~ (p=0.091 n=13+13)
Latency-99 7.16ms ± 2% 7.02ms ± 2% -1.99% (p=0.000 n=13+13)
Latency-999 10.7ms ± 9% 9.9ms ± 5% -7.86% (p=0.000 n=12+13)

With slight oversubscription (another instance of the http benchmark was running in the background with reduced GOMAXPROCS):

scang yield delay = 1, casgstatus yield delay = 1
Latency-50 840µs ± 3% 804µs ± 3% -4.37% (p=0.000 n=15+18)
Latency-95 6.52ms ± 4% 6.03ms ± 4% -7.51% (p=0.000 n=18+18)
Latency-99 10.8ms ± 7% 10.0ms ± 4% -7.33% (p=0.000 n=18+14)
Latency-999 18.0ms ± 9% 16.8ms ± 7% -6.84% (p=0.000 n=18+18)

scang yield delay = 5000, casgstatus yield delay = 3000
Latency-50 840µs ± 3% 809µs ± 3% -3.71% (p=0.000 n=15+17)
Latency-95 6.52ms ± 4% 6.11ms ± 4% -6.29% (p=0.000 n=18+18)
Latency-99 10.8ms ± 7% 9.9ms ± 6% -7.55% (p=0.000 n=18+18)
Latency-999 18.0ms ± 9% 16.5ms ±11% -8.49% (p=0.000 n=18+18)

scang yield delay = 10000, casgstatus yield delay = 5000
Latency-50 840µs ± 3% 823µs ± 5% -2.06% (p=0.002 n=15+18)
Latency-95 6.52ms ± 4% 6.32ms ± 3% -3.05% (p=0.000 n=18+18)
Latency-99 10.8ms ± 7% 10.2ms ± 4% -5.22% (p=0.000 n=18+18)
Latency-999 18.0ms ± 9% 16.7ms ±10% -7.09% (p=0.000 n=18+18)

scang yield delay = 30000, casgstatus yield delay = 10000
Latency-50 840µs ± 3% 836µs ± 5% ~ (p=0.442 n=15+18)
Latency-95 6.52ms ± 4% 6.39ms ± 3% -2.00% (p=0.000 n=18+18)
Latency-99 10.8ms ± 7% 10.2ms ± 6% -5.15% (p=0.000 n=18+17)
Latency-999 18.0ms ± 9% 16.6ms ± 8% -7.48% (p=0.000 n=18+18)

scang yield delay = 100000, casgstatus yield delay = 50000
Latency-50 840µs ± 3% 836µs ± 6% ~ (p=0.401 n=15+18)
Latency-95 6.52ms ± 4% 6.40ms ± 4% -1.79% (p=0.010 n=18+18)
Latency-99 10.8ms ± 7% 10.2ms ± 5% -4.95% (p=0.000 n=18+18)
Latency-999 18.0ms ± 9% 16.5ms ±14% -8.17% (p=0.000 n=18+18)

scang yield delay = 200000, casgstatus yield delay = 100000
Latency-50 840µs ± 3% 828µs ± 2% -1.49% (p=0.001 n=15+17)
Latency-95 6.52ms ± 4% 6.38ms ± 4% -2.04% (p=0.001 n=18+18)
Latency-99 10.8ms ± 7% 10.2ms ± 4% -4.77% (p=0.000 n=18+18)
Latency-999 18.0ms ± 9% 16.9ms ± 9% -6.23% (p=0.000 n=18+18)

With significant oversubscription (the background http benchmark was running with full GOMAXPROCS):

scang yield delay = 1, casgstatus yield delay = 1
Latency-50 1.32ms ±12% 1.30ms ±13% ~ (p=0.454 n=14+14)
Latency-95 16.3ms ±10% 15.3ms ± 7% -6.29% (p=0.001 n=14+14)
Latency-99 29.4ms ±10% 27.9ms ± 5% -5.04% (p=0.001 n=14+12)
Latency-999 49.9ms ±19% 45.9ms ± 5% -8.00% (p=0.008 n=14+13)

scang yield delay = 5000, casgstatus yield delay = 3000
Latency-50 1.32ms ±12% 1.29ms ± 9% ~ (p=0.227 n=14+14)
Latency-95 16.3ms ±10% 15.4ms ± 5% -5.27% (p=0.002 n=14+14)
Latency-99 29.4ms ±10% 27.9ms ± 6% -5.16% (p=0.001 n=14+14)
Latency-999 49.9ms ±19% 46.8ms ± 8% -6.21% (p=0.050 n=14+14)

scang yield delay = 10000, casgstatus yield delay = 5000
Latency-50 1.32ms ±12% 1.35ms ± 9% ~ (p=0.401 n=14+14)
Latency-95 16.3ms ±10% 15.0ms ± 4% -7.67% (p=0.000 n=14+14)
Latency-99 29.4ms ±10% 27.4ms ± 5% -6.98% (p=0.000 n=14+14)
Latency-999 49.9ms ±19% 44.7ms ± 5% -10.56% (p=0.000 n=14+11)

scang yield delay = 30000, casgstatus yield delay = 10000
Latency-50 1.32ms ±12% 1.36ms ±10% ~ (p=0.246 n=14+14)
Latency-95 16.3ms ±10% 14.9ms ± 5% -8.31% (p=0.000 n=14+14)
Latency-99 29.4ms ±10% 27.4ms ± 7% -6.70% (p=0.000 n=14+14)
Latency-999 49.9ms ±19% 44.9ms ±15% -10.13% (p=0.003 n=14+14)

scang yield delay = 100000, casgstatus yield delay = 50000
Latency-50 1.32ms ±12% 1.41ms ± 9% +6.37% (p=0.008 n=14+13)
Latency-95 16.3ms ±10% 15.1ms ± 8% -7.45% (p=0.000 n=14+14)
Latency-99 29.4ms ±10% 27.5ms ±12% -6.67% (p=0.002 n=14+14)
Latency-999 49.9ms ±19% 45.9ms ±16% -8.06% (p=0.019 n=14+14)

scang yield delay = 200000, casgstatus yield delay = 100000
Latency-50 1.32ms ±12% 1.42ms ±10% +7.21% (p=0.003 n=14+14)
Latency-95 16.3ms ±10% 15.0ms ± 7% -7.59% (p=0.000 n=14+14)
Latency-99 29.4ms ±10% 27.3ms ± 8% -7.20% (p=0.000 n=14+14)
Latency-999 49.9ms ±19% 44.8ms ± 8% -10.21% (p=0.001 n=14+13)

All numbers are on 8 cores and with GOGC=10 (the http benchmark has a tiny heap, few goroutines and a low allocation rate, so by default GC barely affects tail latency).

10us/5us yield delays seem to provide a reasonable compromise and give a 5-10% tail latency reduction. That's what is used in this change.

go install -a std results on a 4-core machine:

name      old time/op  new time/op  delta
Time      8.39s ± 2%   7.94s ± 2%   -5.34% (p=0.000 n=47+49)
UserTime  24.6s ± 2%   22.9s ± 2%   -6.76% (p=0.000 n=49+49)
SysTime   1.77s ± 9%   1.89s ±11%   +7.00% (p=0.000 n=49+49)
CpuLoad   315ns ± 2%   313ns ± 1%   -0.59% (p=0.000 n=49+48)  # %CPU
MaxRSS    97.1ms ± 4%  97.5ms ± 9%  ~      (p=0.838 n=46+49)  # bytes

Update #14396
Update #14189

Change-Id: I3f4109bf8f7fd79b39c466576690a778232055a2
Reviewed-on: https://go-review.googlesource.com/21503
Run-TryBot: Dmitry Vyukov <dvyukov@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Rick Hudson <rlh@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
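The shape of the fix, translated to ordinary user-level Go (the runtime itself uses procyield/osyield and the tuned 10us/5us delays; the constants and helper names here are illustrative only):

    package main

    import (
        "runtime"
        "sync/atomic"
        "time"
    )

    // waitFor spins briefly for the common fast case, then backs off to the
    // OS scheduler instead of burning a whole quantum, mirroring the
    // spin-then-yield policy this change gives scang and casgstatus.
    func waitFor(flag *uint32) {
        const activeSpin = 30
        for i := 0; atomic.LoadUint32(flag) == 0; i++ {
            if i < activeSpin {
                runtime.Gosched() // cheap yield while the wait should be short
            } else {
                time.Sleep(10 * time.Microsecond) // yield to the OS scheduler
            }
        }
    }

    func main() {
        var flag uint32
        go func() {
            time.Sleep(time.Millisecond)
            atomic.StoreUint32(&flag, 1)
        }()
        waitFor(&flag)
    }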
-
Dmitry Vyukov authored
Usleep(100) in runqgrab negatively affects latency and throughput of parallel applications. We are sleeping instead of doing useful work. This effect is particularly visible on windows, where the minimal sleep duration is 1-15ms.

Reduce the sleep from 100us to 3us and use osyield on windows. Sync chan send/recv takes ~50ns, so 3us gives us ~50x overshoot.

benchmark old ns/op new ns/op delta
BenchmarkChanSync-12 216 217 +0.46%
BenchmarkChanSyncWork-12 27213 25816 -5.13%

CPU consumption goes up from 106% to 108% in the first case, and from 107% to 125% in the second case.

Test case from #14790 on windows:

BenchmarkDefaultResolution-8 4583372 29720 -99.35%
Benchmark1ms-8 992056 30701 -96.91%

The 99th latency percentile for HTTP request serving is improved by up to 15% (see http://golang.org/cl/20835 for details).

The following benchmarks are from the change that originally added this sleep (see https://golang.org/s/go15gomaxprocs):

name old time/op new time/op delta
Chain 22.6µs ± 2% 22.7µs ± 6% ~ (p=0.905 n=9+10)
ChainBuf 22.4µs ± 3% 22.5µs ± 4% ~ (p=0.780 n=9+10)
Chain-2 23.5µs ± 4% 24.9µs ± 1% +5.66% (p=0.000 n=10+9)
ChainBuf-2 23.7µs ± 1% 24.4µs ± 1% +3.31% (p=0.000 n=9+10)
Chain-4 24.2µs ± 2% 25.1µs ± 3% +3.70% (p=0.000 n=9+10)
ChainBuf-4 24.4µs ± 5% 25.0µs ± 2% +2.37% (p=0.023 n=10+10)
Powser 2.37s ± 1% 2.37s ± 1% ~ (p=0.423 n=8+9)
Powser-2 2.48s ± 2% 2.57s ± 2% +3.74% (p=0.000 n=10+9)
Powser-4 2.66s ± 1% 2.75s ± 1% +3.40% (p=0.000 n=10+10)
Sieve 13.3s ± 2% 13.3s ± 2% ~ (p=1.000 n=10+9)
Sieve-2 7.00s ± 2% 7.44s ±16% ~ (p=0.408 n=8+10)
Sieve-4 4.13s ±21% 3.85s ±22% ~ (p=0.113 n=9+9)

Fixes #14790

Change-Id: Ie7c6a1c4f9c8eb2f5d65ab127a3845386d6f8b5d
Reviewed-on: https://go-review.googlesource.com/20835
Reviewed-by: Austin Clements <austin@google.com>
-
- 01 Apr, 2016 1 commit
Ian Lance Taylor authored
Fixes #15061.

Change-Id: I71f69f398d1c5f3a884bbd044786f1a5600d0fae
Reviewed-on: https://go-review.googlesource.com/21398
Run-TryBot: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
-
- 29 Mar, 2016 1 commit
Michel Lespinasse authored
See #14874.

This change makes the runtime register all compiler-generated itabs (as obtained from the moduledata) during init.

Change-Id: I9969a0985b99b8bda820a631f7fe4c78f1174cdf
Reviewed-on: https://go-review.googlesource.com/20900
Reviewed-by: Keith Randall <khr@golang.org>
Run-TryBot: Michel Lespinasse <walken@google.com>
-
- 25 Mar, 2016 1 commit
Dmitry Vyukov authored
During random stealing we steal 4*GOMAXPROCS times from random procs. One would expect that most of the time we check all procs this way, but due to a low-quality PRNG we actually miss procs with frightening probability. Below are modelling experiment results for 1e6 tries:

GOMAXPROCS = 2 : missed 1 procs 7944 times
GOMAXPROCS = 3 : missed 1 procs 101620 times
GOMAXPROCS = 3 : missed 2 procs 3571 times
GOMAXPROCS = 4 : missed 1 procs 63916 times
GOMAXPROCS = 4 : missed 2 procs 61 times
GOMAXPROCS = 4 : missed 3 procs 16 times
GOMAXPROCS = 5 : missed 1 procs 133136 times
GOMAXPROCS = 5 : missed 2 procs 1025 times
GOMAXPROCS = 5 : missed 3 procs 101 times
GOMAXPROCS = 5 : missed 4 procs 15 times
GOMAXPROCS = 8 : missed 1 procs 151765 times
GOMAXPROCS = 8 : missed 2 procs 5057 times
GOMAXPROCS = 8 : missed 3 procs 1726 times
GOMAXPROCS = 8 : missed 4 procs 68 times
GOMAXPROCS = 12 : missed 1 procs 199081 times
GOMAXPROCS = 12 : missed 2 procs 27489 times
GOMAXPROCS = 12 : missed 3 procs 3113 times
GOMAXPROCS = 12 : missed 4 procs 233 times
GOMAXPROCS = 12 : missed 5 procs 9 times
GOMAXPROCS = 16 : missed 1 procs 237477 times
GOMAXPROCS = 16 : missed 2 procs 30037 times
GOMAXPROCS = 16 : missed 3 procs 9466 times
GOMAXPROCS = 16 : missed 4 procs 1334 times
GOMAXPROCS = 16 : missed 5 procs 192 times
GOMAXPROCS = 16 : missed 6 procs 5 times
GOMAXPROCS = 16 : missed 7 procs 1 times
GOMAXPROCS = 16 : missed 8 procs 1 times

A missed proc won't lead to underutilization because we check all procs again after dropping P. But it can lead to an unpleasant situation where we miss a proc, drop P, check all procs, discover work, acquire P, miss the proc again, and repeat.

Improve the stealing logic to cover all procs. Also don't enter spinning mode and try to steal when there is nobody around.

Change-Id: Ibb6b122cc7fb836991bad7d0639b77c807aab4c2
Reviewed-on: https://go-review.googlesource.com/20836
Reviewed-by: Rick Hudson <rlh@golang.org>
Run-TryBot: Dmitry Vyukov <dvyukov@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Austin Clements <austin@google.com>
Reviewed-by: Marvin Stenger <marvin.stenger94@gmail.com>
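The covering walk can be sketched outside the runtime: start at a random proc and step by an increment coprime to the proc count, which visits every proc exactly once, unlike independent random picks, which can miss some. A standalone sketch (illustrative names; the real change precomputes these strides):

    package main

    import "fmt"

    func gcd(a, b int) int {
        for b != 0 {
            a, b = b, a%b
        }
        return a
    }

    // stealOrder returns a permutation of 0..n-1: stepping from a random
    // start by a stride coprime to n covers all procs exactly once.
    func stealOrder(n, randStart, randInc int) []int {
        inc := randInc%n + 1
        for gcd(inc, n) != 1 {
            inc++
        }
        order := make([]int, 0, n)
        for i, pos := 0, randStart%n; i < n; i++ {
            order = append(order, pos)
            pos = (pos + inc) % n
        }
        return order
    }

    func main() {
        fmt.Println(stealOrder(8, 5, 2)) // every value 0..7 appears exactly once
    }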
-
- 16 Mar, 2016 2 commits
Austin Clements authored
gopark calls the unlock function after setting the G to _Gwaiting. This means it's generally unsafe to access the G's stack from the unlock function because the G may start running on another P. Once we start shrinking stacks concurrently, a stack shrink could also move the stack the moment after it enters _Gwaiting and before the unlock function is called.

Document this restriction and fix the two places where we currently violate it.

This is unlikely to be a problem in practice for these two places right now, but they're already skating on thin ice. For example, the following sequence could in principle cause corruption, deadlock, or a panic in the select code:

On M1/P1:
1. G1 selects on channels A and B.
2. selectgoImpl calls gopark.
3. gopark puts G1 in _Gwaiting.
4. gopark calls selparkcommit.
5. selparkcommit releases the lock on channel A.

On M2/P2:
6. G2 sends to channel A.
7. The send puts G1 in _Grunnable and puts it on P2's run queue.
8. The scheduler runs, selects G1, puts it in _Grunning, and resumes G1.
9. On G1, the sellock immediately following the gopark gets called.
10. sellock grows and moves the stack.

On M1/P1:
11. selparkcommit continues to scan the lock order for the next channel to unlock, but it's now reading from a freed (and possibly reused) stack.

This shouldn't happen in practice because step 10 isn't the first call to sellock, so the stack should already be big enough. However, once we start shrinking stacks concurrently, this reasoning won't work any more.

For #12967.

Change-Id: I3660c5be37e5be9f87433cb8141bdfdf37fadc4c
Reviewed-on: https://go-review.googlesource.com/20038
Reviewed-by: Rick Hudson <rlh@golang.org>
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
-
Austin Clements authored
Given a G, there's currently no way to find the channel it's blocking on. We'll need this information to fix a (probably theoretical) bug in select and to implement concurrent stack shrinking, so record the channel in the sudog.

For #12967.

Change-Id: If8fb63a140f1d07175818824d08c0ebeec2bdf66
Reviewed-on: https://go-review.googlesource.com/20035
Reviewed-by: Rick Hudson <rlh@golang.org>
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
-
- 13 Mar, 2016 1 commit
Emmanuel Odeke authored
Move the functions testSchedLocalQueueLocal and testSchedLocalQueueSteal from proc.go to export_test.go, the only site where they are used.

Fixes #14796

Change-Id: I16b6fa4a13835eab33f66a2c2e87a5f5c79b7bd3
Reviewed-on: https://go-review.googlesource.com/20640
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
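The export_test.go idiom used here: a file whose name ends in _test.go but that declares the real package, so it can re-export internals to the package's external tests without shipping them in normal builds. A generic sketch of the pattern with hypothetical names:

    // queue.go: the implementation file stays free of test hooks.
    package queue

    func runLocalQueueTest() bool { return true }

    // export_test.go: compiled only for tests, but still in package queue,
    // so it can see and re-export unexported identifiers.
    package queue

    var RunLocalQueueTest = runLocalQueueTest

    // queue_test.go: the external test package uses the exported hook.
    package queue_test

    import (
        "testing"

        "example.com/queue" // hypothetical module path
    )

    func TestLocalQueue(t *testing.T) {
        if !queue.RunLocalQueueTest() {
            t.Fatal("local queue test failed")
        }
    }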
-
- 02 Mar, 2016 1 commit
Brad Fitzpatrick authored
The tree's pretty inconsistent about single space vs double space after a period in documentation. Make it consistently a single space, per earlier decisions. This means contributors won't be confused by misleading precedence.

This CL doesn't use go/doc to parse. It only addresses // comments. It was generated with:

$ perl -i -npe 's,^(\s*// .+[a-z]\.) +([A-Z]),$1 $2,' $(git grep -l -E '^\s*//(.+\.) +([A-Z])')
$ go test go/doc -update

Change-Id: Iccdb99c37c797ef1f804a94b22ba5ee4b500c4f7
Reviewed-on: https://go-review.googlesource.com/20022
Reviewed-by: Rob Pike <r@golang.org>
Reviewed-by: Dave Day <djd@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
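For readers who don't parse perl, here is an equivalent of the rewrite rule using Go's regexp package (same pattern, applied per line; purely illustrative):

    package main

    import (
        "fmt"
        "regexp"
    )

    func main() {
        // Collapse the double space after a sentence-ending period inside a
        // "//" comment, which is what the perl one-liner does per line.
        re := regexp.MustCompile(`^(\s*// .+[a-z]\.) +([A-Z])`)
        line := "\t// Sort the keys.  Then iterate."
        fmt.Println(re.ReplaceAllString(line, "$1 $2"))
    }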
-
- 26 Feb, 2016 1 commit
Dmitry Vyukov authored
Currently dropg does not unwire locked g/m. This is an unnecessary distinction between locked and non-locked g/m: we always restart goroutines with execute, which re-wires g/m. First, the distinction produces a false sense that it is necessary. Second, it can confuse some sanity and cross checks. For example, if we check that g/m are unwired before we wire them in execute, the check will fail for locked g/m. I've hit this while doing some race detector changes: when we deschedule a goroutine and run scheduler code, m.curg is generally nil, but not for locked ms. Remove the distinction.

Change-Id: I3b87a28ff343baa1d564aab1f821b582a84dee07
Reviewed-on: https://go-review.googlesource.com/19950
Reviewed-by: Austin Clements <austin@google.com>
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
-
- 25 Feb, 2016 1 commit
Austin Clements authored
_Genqueue and _Gscanenqueue were introduced as part of the GC quiesce code. The quiesce code was removed by 197aa9e6, but these states and some associated code stuck around. Remove them.

Change-Id: I69df81881602d4a431556513dac2959668d27c20
Reviewed-on: https://go-review.googlesource.com/19638
Reviewed-by: Rick Hudson <rlh@golang.org>
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
-
- 24 Feb, 2016 1 commit
Martin Möhrmann authored
Change-Id: Icd06d99c42b8299fd931c7da821e1f418684d913
Reviewed-on: https://go-review.googlesource.com/19829
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
-
- 21 Feb, 2016 1 commit
Shenghou Ma authored
Change-Id: I6cb8ac7b59812e82111ab3b0f8303ab8194a5129
Reviewed-on: https://go-review.googlesource.com/19791
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
-
- 02 Feb, 2016 1 commit
Austin Clements authored
Currently it's possible for the scheduler to deadlock with the right confluence of locked Gs, assists, and scheduling of background mark workers. Broadly, this happens because handoffp is stricter than findrunnable, and if the only work for a P is GC work, handoffp will put the P into idle, rather than starting an M to execute that P. One way this can happen is as follows:

0. There is only one user G, which we'll call G 1. There is more than one P, but they're all idle except the one running G 1.
1. G 1 locks itself to an M using runtime.LockOSThread.
2. GC starts up and enters mark 1.
3. G 1 performs a GC assist, which completes mark 1 without being fully satisfied. Completing mark 1 causes all background mark workers to park. And since the assist isn't fully satisfied, it parks as well, waiting for a background mark worker to satisfy its remaining assist debt.
4. The assist park enters the scheduler. Since G 1 is locked to the M, the scheduler releases the P and calls handoffp to hand the P to another M.
5. handoffp checks the local and global run queues, which are empty, and sees that there are idle Ps, so rather than start an M, it puts the P into idle.

At this point, all of the Gs are waiting and all of the Ps are idle. In particular, none of the GC workers are running, so no mark work gets done and the assist on the main G is never satisfied, so the whole process soft locks up.

Fix this by making handoffp start an M if there is GC work. This reintroduces a key invariant: in any situation where findrunnable would return a G to run on a P, handoffp for that P will start an M to run work on that P.

Fixes #13645.

Tested by running 2,689 iterations of go tool dist test -no-rebuild runtime:cpu124 across 10 linux-amd64-noopt VMs with no failures. Without this change, the failure rate was somewhere around 1%.

Performance change is negligible.

name old time/op new time/op delta
XBenchGarbage-12 2.48ms ± 2% 2.48ms ± 1% -0.24% (p=0.000 n=92+93)

name old time/op new time/op delta
BinaryTree17-12 2.86s ± 2% 2.87s ± 2% ~ (p=0.667 n=19+20)
Fannkuch11-12 2.52s ± 1% 2.47s ± 1% -2.05% (p=0.000 n=18+20)
FmtFprintfEmpty-12 51.7ns ± 1% 51.5ns ± 3% ~ (p=0.931 n=16+20)
FmtFprintfString-12 170ns ± 1% 168ns ± 1% -0.65% (p=0.000 n=19+19)
FmtFprintfInt-12 160ns ± 0% 160ns ± 0% +0.18% (p=0.033 n=17+19)
FmtFprintfIntInt-12 265ns ± 1% 273ns ± 1% +2.98% (p=0.000 n=17+19)
FmtFprintfPrefixedInt-12 235ns ± 1% 239ns ± 1% +1.99% (p=0.000 n=16+19)
FmtFprintfFloat-12 315ns ± 0% 315ns ± 1% ~ (p=0.250 n=17+19)
FmtManyArgs-12 1.04µs ± 1% 1.05µs ± 0% +0.87% (p=0.000 n=17+19)
GobDecode-12 7.93ms ± 0% 7.85ms ± 1% -1.03% (p=0.000 n=16+18)
GobEncode-12 6.62ms ± 1% 6.58ms ± 1% -0.60% (p=0.000 n=18+19)
Gzip-12 322ms ± 1% 320ms ± 1% -0.46% (p=0.009 n=20+20)
Gunzip-12 42.5ms ± 1% 42.5ms ± 0% ~ (p=0.751 n=19+19)
HTTPClientServer-12 69.7µs ± 1% 70.0µs ± 2% ~ (p=0.056 n=19+19)
JSONEncode-12 16.9ms ± 1% 16.7ms ± 1% -1.13% (p=0.000 n=19+19)
JSONDecode-12 61.5ms ± 1% 61.3ms ± 1% -0.35% (p=0.001 n=20+17)
Mandelbrot200-12 3.94ms ± 0% 3.91ms ± 0% -0.67% (p=0.000 n=20+18)
GoParse-12 3.71ms ± 1% 3.70ms ± 1% ~ (p=0.244 n=17+19)
RegexpMatchEasy0_32-12 101ns ± 1% 102ns ± 2% +0.54% (p=0.037 n=19+20)
RegexpMatchEasy0_1K-12 349ns ± 0% 350ns ± 0% +0.33% (p=0.000 n=17+18)
RegexpMatchEasy1_32-12 84.5ns ± 2% 84.2ns ± 1% -0.43% (p=0.048 n=19+20)
RegexpMatchEasy1_1K-12 510ns ± 1% 513ns ± 2% +0.58% (p=0.002 n=18+20)
RegexpMatchMedium_32-12 132ns ± 1% 134ns ± 1% +0.95% (p=0.000 n=20+20)
RegexpMatchMedium_1K-12 40.1µs ± 1% 39.6µs ± 1% -1.39% (p=0.000 n=20+20)
RegexpMatchHard_32-12 2.08µs ± 0% 2.06µs ± 1% -0.95% (p=0.000 n=18+18)
RegexpMatchHard_1K-12 62.2µs ± 1% 61.9µs ± 1% -0.42% (p=0.001 n=19+20)
Revcomp-12 537ms ± 0% 536ms ± 0% ~ (p=0.076 n=20+20)
Template-12 71.3ms ± 1% 69.3ms ± 1% -2.75% (p=0.000 n=20+20)
TimeParse-12 361ns ± 0% 360ns ± 1% ~ (p=0.056 n=19+19)
TimeFormat-12 353ns ± 0% 352ns ± 0% -0.23% (p=0.000 n=17+18)
[Geo mean] 62.6µs 62.5µs -0.17%

Change-Id: I0fbbbe4d7d99653ba5600ffb4394fa03558bc4e9
Reviewed-on: https://go-review.googlesource.com/19107
Reviewed-by: Rick Hudson <rlh@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
-
- 27 Jan, 2016 2 commits
Austin Clements authored
Currently p.gcBgMarkWorker is a *g. Change it to a guintptr. This eliminates a write barrier during the subtle mark worker parking dance (which isn't known to be causing problems, but may).

Change-Id: Ibf12c05ac910820448059e69a68e5b882c993ed8
Reviewed-on: https://go-review.googlesource.com/18970
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Rick Hudson <rlh@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
-
Austin Clements authored
Currently mark workers attach to their designated Ps before parking, either during initialization or after performing a phase transition. However, in both of these cases, it's possible that the mark worker is running on a different P than the one it attaches to. This is a problem, because as soon as the worker attaches to a P, that P's scheduler can execute the worker. If the worker hasn't yet parked on the P it's actually running on, this means the worker G will be running in two places at once. The most visible consequence of this is that once the first instance of the worker does park, it will clear g.m and the second instance will crash shortly after when it tries to use g.m.

Fix this by moving the attach to the gopark callback. At this point, the G is genuinely stopped and the callback is running on the system stack, so it's safe for another P's scheduler to pick up the worker G.

Fixes #13363. Fixes #13978.

Change-Id: If2f7c4a4174f9511f6227e14a27c56fb842d1cc8
Reviewed-on: https://go-review.googlesource.com/18761
Reviewed-by: Rick Hudson <rlh@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
Run-TryBot: Austin Clements <austin@google.com>
-
- 14 Jan, 2016 1 commit
Ian Lance Taylor authored
This doesn't fix a bug, but may improve performance in programs that have many concurrent calls from C to Go. The old code made several system calls between lockextra and unlockextra. That could be happening while another thread is spinning to acquire lockextra. This changes the code to not make any system calls while holding the lock.

Change-Id: I50576478e478670c3d6429ad4e1b7d80f98a19d8
Reviewed-on: https://go-review.googlesource.com/18548
Reviewed-by: Russ Cox <rsc@golang.org>
-
- 13 Jan, 2016 1 commit
Russ Cox authored
[Repeat of CL 18343 with build fixes.]

Before, NumGoroutine counted system goroutines and Stack (usually) didn't show them, which was inconsistent and confusing. To resolve which way they should be consistent, it seems like

    package main

    import "runtime"

    func main() {
        println(runtime.NumGoroutine())
    }

should print 1 regardless of internal runtime details. Make it so.

Fixes #11706.

Change-Id: If26749fec06aa0ff84311f7941b88d140552e81d
Reviewed-on: https://go-review.googlesource.com/18432
Reviewed-by: Austin Clements <austin@google.com>
Run-TryBot: Russ Cox <rsc@golang.org>
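The new consistency can be checked directly: NumGoroutine and Stack now present the same user-level view, with system goroutines excluded from both:

    package main

    import (
        "fmt"
        "runtime"
    )

    func main() {
        fmt.Println(runtime.NumGoroutine()) // 1, despite runtime-internal goroutines
        buf := make([]byte, 1<<16)
        n := runtime.Stack(buf, true) // all=true likewise shows only the one goroutine
        fmt.Printf("%s", buf[:n])
    }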
-
- 11 Jan, 2016 1 commit
Ian Lance Taylor authored
Fixes #13881.

Change-Id: Idff77db381640184ddd2b65022133bb226168800
Reviewed-on: https://go-review.googlesource.com/18449
Reviewed-by: David Crawshaw <crawshaw@golang.org>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
-
- 09 Jan, 2016 1 commit
Ian Lance Taylor authored
The previous behaviour of installing the signal handlers in a separate thread meant that Go initialization raced with non-Go initialization if the non-Go initialization also wanted to install signal handlers. Make installing signal handlers synchronous so that the process-wide behavior is predictable.

Update #9896.

Change-Id: Ice24299877ec46f8518b072a381932d273096a32
Reviewed-on: https://go-review.googlesource.com/18150
Run-TryBot: Ian Lance Taylor <iant@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: David Crawshaw <crawshaw@golang.org>
-
- 08 Jan, 2016 2 commits
Russ Cox authored
This reverts commit c5bafc82.

Change-Id: Ie7030c978c6263b9e996d5aa0e490086796df26d
Reviewed-on: https://go-review.googlesource.com/18431
Reviewed-by: Russ Cox <rsc@golang.org>
-
Russ Cox authored
Before, NumGoroutine counted system goroutines and Stack (usually) didn't show them, which was inconsistent and confusing. To resolve which way they should be consistent, it seems like

    package main

    import "runtime"

    func main() {
        println(runtime.NumGoroutine())
    }

should print 1 regardless of internal runtime details. Make it so.

Fixes #11706.

Change-Id: I6bfe26a901de517728192cfb26a5568c4ef4fe47
Reviewed-on: https://go-review.googlesource.com/18343
Reviewed-by: Austin Clements <austin@google.com>
-
- 07 Jan, 2016 2 commits
Austin Clements authored
f90b48e0 intended to require the stack barrier lock in all cases of sigprof that walked the user stack, but got it wrong. In particular, if sp < gp.stack.lo || gp.stack.hi < sp, tracebackUser would be true, but we wouldn't acquire the stack lock. If it then turned out that we were in a cgo call, it would walk the stack without the lock.

In fact, the whole structure of stack locking in sigprof is somewhat wrong because it assumes the G to lock is gp.m.curg, but all three gentraceback calls start from potentially different Gs.

To fix this, we lower the gcTryLockStackBarriers calls much closer to the gentraceback calls. There are now three separate trylock calls, each clearly associated with a gentraceback, and the locked G clearly matches the G from which the gentraceback starts. This actually brings the sigprof logic closer to what it originally was before stack barrier locking.

This depends on "runtime: increase assumed stack size in externalthreadhandler" because it very slightly increases the stack used by sigprof; without that other commit, this is enough to blow the profiler thread's assumed stack size.

Fixes #12528 (hopefully for real this time!). For the 1.5 branch, though it will require some backporting. On the 1.5 branch, this will *not* require the "runtime: increase assumed stack size in externalthreadhandler" commit: there's no pcvalue cache, so the used stack is smaller.

Change-Id: Id2f6446ac276848f6fc158bee550cccd03186b83
Reviewed-on: https://go-review.googlesource.com/18328
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
-
Austin Clements authored
If a sigprof happens during a cgo call, we traceback from the entry point of the cgo call. However, if the SP is outside of the G's stack, we'll then ignore this traceback, even if it was successful, and overwrite it with just _ExternalCode.

Fix this by accepting any successful traceback, regardless of whether we got it from a cgo entry point or from regular Go code.

Fixes #13466.

Change-Id: I5da9684361fc5964f44985d74a8cdf02ffefd213
Reviewed-on: https://go-review.googlesource.com/18327
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
-
- 06 Jan, 2016 2 commits
Ian Lance Taylor authored
We were setting the signal mask of a new m to the signal mask of the m that created it. That failed when that m happened to be the one created by ensureSigM, which sets its signal mask to include only the signals being caught by os/signal.Notify.

Fixes #13164.
Update #9896.

Change-Id: I705c196fe9d11754e10bab9e9b2e7530ecdfa367
Reviewed-on: https://go-review.googlesource.com/18064
Reviewed-by: Russ Cox <rsc@golang.org>
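The path in question is easy to state in user terms: os/signal.Notify is what creates the dedicated signal-handling m (ensureSigM) with the narrowed mask that new Ms wrongly inherited. A minimal Unix-only program that exercises that path (it behaves correctly on fixed runtimes; observing the bug required a pre-fix Go):

    package main

    import (
        "fmt"
        "os"
        "os/signal"
        "runtime"
        "syscall"
    )

    func main() {
        c := make(chan os.Signal, 1)
        signal.Notify(c, syscall.SIGUSR1) // starts the signal-handling m (ensureSigM)
        // Force creation of new OS threads after ensureSigM is running.
        for i := 0; i < 4; i++ {
            go func() {
                runtime.LockOSThread()
                select {} // park forever, pinning this thread
            }()
        }
        syscall.Kill(syscall.Getpid(), syscall.SIGUSR1)
        fmt.Println(<-c) // user defined signal 1
    }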
-
Ian Lance Taylor authored
Avoids an msan error when runtime/cgo is explicitly rebuilt with -fsanitize=memory.

Fixes #13815.

Change-Id: I70308034011fb308b63585bcd40b0d1e62ec93ef
Reviewed-on: https://go-review.googlesource.com/18263
Reviewed-by: Russ Cox <rsc@golang.org>
-
- 18 Dec, 2015 1 commit
Austin Clements authored
Currently, if sigprof determines that the G is in user code (not cgo or libcall code), it will only traceback the G stack if it can acquire the stack barrier lock. However, it has no such restriction if the G is in cgo or libcall code. Because cgo calls count as syscalls, stack scanning and stack barrier installation can occur during a cgo call, which means sigprof could attempt to traceback a G in a cgo call while scanstack is installing stack barriers in that G's stack. As a result, the following sequence of events can cause the sigprof traceback to panic with "missed stack barrier":

1. M1: G1 performs a Cgo call (which, on Windows, is any system call, which could explain why this is easier to reproduce on Windows).
2. M1: The Cgo call puts G1 into _Gsyscall state.
3. M2: GC starts a scan of G1's stack. It puts G1 into _Gscansyscall and acquires the stack barrier lock.
4. M3: A profiling signal comes in. On Windows this is a global (though I don't think this matters), so the runtime stops M1 and calls sigprof for G1.
5. M3: sigprof fails to acquire the stack barrier lock (because the GC's stack scan holds it).
6. M3: sigprof observes that G1 is in a Cgo call, so it calls gentraceback on G1 with its Cgo transition point.
7. M3: gentraceback on G1 grabs the currently empty g.stkbar slice.
8. M2: GC finishes scanning G1's stack and installing stack barriers.
9. M3: gentraceback encounters one of the just-installed stack barriers and panics.

This commit fixes this by only allowing cgo tracebacks if sigprof can acquire the stack barrier lock, just like in the regular user traceback case.

For good measure, we put the same constraint on libcall tracebacks. This case is probably already safe because, unlike cgo calls, libcalls leave the G in _Grunning and prevent reaching a safe point, so scanstack cannot run during a libcall. However, this also means that sigprof will always acquire the stack barrier lock without contention, so there's no cost to adding this constraint to libcall tracebacks.

Fixes #12528. For 1.5.3 (will require some backporting).

Change-Id: Ia5a4b8e3d66b23b02ffcd54c6315c81055c0cec2
Reviewed-on: https://go-review.googlesource.com/18023
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Russ Cox <rsc@golang.org>
-
- 16 Dec, 2015 1 commit
Shenghou Ma authored
This provides a uniform way to retrieve the Go version from a Go binary, starting from Go 1.4 (see https://golang.org/cl/117040043).

Updates #13507.

Change-Id: Iaa2b14fca2d8c4d883d3824e2efc82b3e6fe2624
Reviewed-on: https://go-review.googlesource.com/17459
Reviewed-by: Keith Randall <khr@golang.org>
-
- 15 Dec, 2015 2 commits
Austin Clements authored
Currently, sysmon triggers a forced GC solely based on memstats.last_gc. However, memstats.last_gc isn't updated until mark termination, so once sysmon starts triggering forced GC, it will keep triggering them until GC finishes. The first of these actually starts a GC; the remainder, up to the last, print "GC forced" but gcStart returns immediately because gcphase != _GCoff; then the last may start another GC if the previous GC finishes (and sets last_gc) between sysmon triggering it and gcStart checking the GC phase.

Fix this by expanding the condition for starting a forced GC to also require that no GC is currently running. This, combined with the way forcegchelper blocks until the GC cycle is started, ensures sysmon only starts one GC when the time exceeds the forced GC threshold.

Fixes #13458.

Change-Id: Ie6cf841927f6085136be3f45259956cd5cf10d23
Reviewed-on: https://go-review.googlesource.com/17819
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
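The fix is essentially a strengthened predicate. A standalone sketch of its shape (illustrative names only; the runtime checks gcphase and memstats.last_gc rather than these stand-ins):

    package main

    import (
        "fmt"
        "time"
    )

    const forcegcperiod = 2 * time.Minute

    // shouldForceGC mirrors the corrected sysmon test: the period must have
    // elapsed AND no cycle may currently be running, so only one forced GC
    // fires per quiet period instead of one per sysmon wakeup.
    func shouldForceGC(lastGC time.Time, gcRunning bool, now time.Time) bool {
        return !gcRunning && now.Sub(lastGC) > forcegcperiod
    }

    func main() {
        last := time.Now().Add(-3 * time.Minute)
        fmt.Println(shouldForceGC(last, false, time.Now())) // true: fire once
        fmt.Println(shouldForceGC(last, true, time.Now()))  // false: cycle already running
    }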
-
Austin Clements authored
The addition of stack barrier locking to copystack subsumes the partial fix from commit bbd1a1c7 for SIGPROF during copystack. With the stack barrier locking, this commit simplifies the rule in sigprof to: the user stack can be traced only if sigprof can acquire the stack barrier lock.

Updates #12932, #13362.

Change-Id: I1c1f80015053d0ac7761e9e0c7437c2aba26663f
Reviewed-on: https://go-review.googlesource.com/17192
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
-
- 11 Dec, 2015 2 commits
Dmitry Vyukov authored
Currently we wake up new worker threads whenever we pass through the scheduler with nmspinning==0. This leads to lots of unnecessary thread wake-ups. Instead, let only spinning threads wake up new spinning threads.

For the following program:

    package main

    import "runtime"

    func main() {
        for i := 0; i < 1e7; i++ {
            runtime.Gosched()
        }
    }

Before:

    $ time ./test
    real    0m4.278s
    user    0m7.634s
    sys     0m1.423s

    $ strace -c ./test
    % time     seconds  usecs/call     calls    errors syscall
     99.93    9.314936           3   2685009     17536 futex

After:

    $ time ./test
    real    0m1.200s
    user    0m1.181s
    sys     0m0.024s

    $ strace -c ./test
    % time     seconds  usecs/call     calls    errors syscall
      3.11    0.000049          25         2           futex

Fixes #13527

Change-Id: Ia1f5bf8a896dcc25d8b04beb1f4317aa9ff16f74
Reviewed-on: https://go-review.googlesource.com/17540
Reviewed-by: Austin Clements <austin@google.com>
Run-TryBot: Dmitry Vyukov <dvyukov@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
-
Rahul Chaudhry authored
debug.schedtrace is an int32. Convert it to int64 before multiplying by the constant 1000000. Otherwise, schedtrace values greater than 2147 overflow int32, causing incorrect delays between traces.

Change-Id: I064e8d7b432c1e892a705ee1f31a2e8cdd2c3ea3
Reviewed-on: https://go-review.googlesource.com/17712
Reviewed-by: Austin Clements <austin@google.com>
Run-TryBot: Ian Lance Taylor <iant@golang.org>
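The overflow is easy to reproduce in isolation: GODEBUG=schedtrace=3000 asks for a trace every 3 seconds, and 3000 * 1000000 nanoseconds no longer fits in an int32:

    package main

    import "fmt"

    func main() {
        var schedtrace int32 = 3000         // milliseconds, e.g. GODEBUG=schedtrace=3000
        bad := schedtrace * 1000000         // int32 arithmetic wraps for values > 2147
        good := int64(schedtrace) * 1000000 // the fix: widen before multiplying
        fmt.Println(bad, good)              // -1294967296 3000000000
    }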
-
- 19 Nov, 2015 3 commits
Austin Clements authored
sysmon runs without a P. This means it can't interact with the garbage collector, so write barriers are not allowed in anything that sysmon does.

Fixes #10600.

Change-Id: I9de1283900dadee4f72e2ebfc8787123e382ae88
Reviewed-on: https://go-review.googlesource.com/17006
Run-TryBot: Austin Clements <austin@google.com>
Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
-
Austin Clements authored
allocm is a very unusual function: it is specifically designed to allocate in contexts where m.p is nil by temporarily taking over a P. Since allocm is used in many contexts where it would make sense to use nowritebarrierrec, this commit teaches the nowritebarrierrec analysis to stop at allocm.

Updates #10600.

Change-Id: I8499629461d4fe25712d861720dfe438df7ada9b
Reviewed-on: https://go-review.googlesource.com/17005
Reviewed-by: Russ Cox <rsc@golang.org>
-
Austin Clements authored
A sigprof during stack barrier insertion or removal can crash if it detects an inconsistency between the stkbar array and the stack itself. Currently we protect against this when scanning another G's stack using stackLock, but we don't protect against it when unwinding stack barriers for a recover or a memmove to the stack.

This commit cleans up and improves the stack locking code. It abstracts out the lock and unlock operations. It uses the lock consistently everywhere we perform stack operations, and pushes the lock/unlock down closer to where the stack barrier operations happen to make it more obvious what it's protecting. Finally, it modifies sigprof so that instead of spinning until it acquires the lock, it simply doesn't perform a traceback if it can't acquire it. This is necessary to prevent self-deadlock.

Updates #11863, which introduced stackLock to fix some of these issues, but didn't go far enough.

Updates #12528.

Change-Id: I9d1fa88ae3744d31ba91500c96c6988ce1a3a349
Reviewed-on: https://go-review.googlesource.com/17036
Reviewed-by: Russ Cox <rsc@golang.org>
Run-TryBot: Austin Clements <austin@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
-
- 18 Nov, 2015 1 commit
Russ Cox authored
Cgo-created threads transition between having associated Go g's and m's and not. A signal arriving during the transition could think it was safe and appropriate to run Go signal handlers when it was in fact not. Avoid the race by masking all signals during the transition.

Fixes #12277.

Change-Id: Ie9711bc1d098391d58362492197a7e0f5b497d14
Reviewed-on: https://go-review.googlesource.com/16915
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
-
- 12 Nov, 2015 1 commit
Michael Hudson-Doyle authored
The PowerPC ISA does not have a PC-relative load instruction, which poses obvious challenges when generating position-independent code. The way the ELFv2 ABI addresses this is to specify that r2 points to a per-"module" (shared library or executable) TOC pointer. Maintaining this pointer requires cooperation between codegen and the system linker:

* Non-leaf functions leave space on the stack at r1+24 to save the TOC pointer.
* A call to a function that *might* have to go via a PLT stub must be followed by a nop instruction that the system linker can replace with "ld r2, 24(r1)" to restore the TOC pointer (only when dynamically linking Go code).
* When calling a function via a function pointer, the address of the function must be in r12, and the first couple of instructions (the "global entry point") of the called function use this to derive the address of the TOC for the module it is in.
* When calling a function that is implemented in the same module, the system linker adjusts the call to skip over the instructions mentioned above (the "local entry point"), assuming that r2 is already correctly set.

So this changeset adds the global entry point instructions, sets the metadata so the system linker knows where the local entry point is, inserts code to save the TOC pointer at 24(r1), adds a nop after any call not known to be local, and copes with the odd non-local code transfer in the runtime (e.g. the stuff around jmpdefer). It does not actually compile PIC yet.

Change-Id: I7522e22bdfd2f891745a900c60254fe9e372c854
Reviewed-on: https://go-review.googlesource.com/15967
Reviewed-by: Russ Cox <rsc@golang.org>
-