1. 27 Apr, 2016 25 commits
    • crypto/md5: add s390x assembly implementation · 239fb76e
      Michael Munday authored
      Adapted from md5block_amd64.s.
      
      name                 old speed      new speed      delta
      Hash8Bytes           14.0MB/s ± 1%  39.9MB/s ± 0%  +185.52%   (p=0.000 n=9+10)
      Hash1K                176MB/s ± 1%   661MB/s ± 1%  +274.44%  (p=0.000 n=10+10)
      Hash8K                196MB/s ± 0%   742MB/s ± 1%  +278.35%   (p=0.000 n=10+9)
      Hash8BytesUnaligned  14.2MB/s ± 2%  39.8MB/s ± 0%  +180.06%  (p=0.000 n=10+10)
      Hash1KUnaligned       177MB/s ± 1%   651MB/s ± 0%  +267.38%  (p=0.000 n=10+10)
      Hash8KUnaligned       197MB/s ± 1%   731MB/s ± 1%  +271.73%  (p=0.000 n=10+10)
      
      Change-Id: I45ece98ee10f30fcd192b9c3d743ba61c248f36a
      Reviewed-on: https://go-review.googlesource.com/22505
      Reviewed-by: Bill O'Farrell <billotosyr@gmail.com>
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
    • cmd/compile: de-dup the gclocals symbols in compiler too · f4d38a87
      Michael Hudson-Doyle authored
      These symbols are de-duplicated in the linker, but the compiler generates
      quite a few duplicates too: 2425 of 13769 total symbols for runtime.a, for example.
      De-duplicating them in the compiler saves the linker a bit of work.
      
      Fixes #14983
      
      Change-Id: I5f18e5f9743563c795aad8f0a22d17a7ed147711
      Reviewed-on: https://go-review.googlesource.com/22293
      Reviewed-by: David Crawshaw <crawshaw@golang.org>
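A minimal sketch of content-based de-duplication, to illustrate the idea in the commit above. It assumes duplicates are recognized by hashing their contents; the `dedup` helper, the `gclocals.` name prefix, and the sample payloads are hypothetical and not taken from the compiler source.

```go
package main

import (
	"crypto/md5"
	"fmt"
)

// dedup keeps one entry per distinct payload, keyed by a name derived
// from the payload's contents. Identical payloads map to the same name,
// so the second and later copies are dropped.
func dedup(payloads [][]byte) map[string][]byte {
	unique := make(map[string][]byte)
	for _, p := range payloads {
		// Hypothetical naming scheme: prefix plus hex digest of the contents.
		name := fmt.Sprintf("gclocals.%x", md5.Sum(p))
		if _, seen := unique[name]; !seen {
			unique[name] = p
		}
	}
	return unique
}

func main() {
	bitmaps := [][]byte{{0x01, 0x02}, {0x03}, {0x01, 0x02}}
	fmt.Println(len(dedup(bitmaps))) // 2
}
```

Keying by content means the de-duplication needs no pairwise comparison: every producer that emits the same bytes arrives at the same name.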
    • cmd/compile/internal/gc: remove oconv(op, 0) calls · d3c79d32
      Dave Cheney authored
      Updates #15462
      
      Automatic refactor with sed -e.
      
      Replace every oconv(op, 0) string conversion with the raw op value,
      which fmt's %v verb can print directly.
      
      The remaining oconv(op, FmtSharp) will be replaced with op.GoString and
      %#v in the next CL.
      
      Change-Id: I5e2f7ee0bd35caa65c6dd6cb1a866b5e4519e641
      Reviewed-on: https://go-review.googlesource.com/22499
      Run-TryBot: Dave Cheney <dave@cheney.net>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
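The refactor above relies on fmt calling a value's String method for %v and its GoString method for %#v. A small sketch of that mechanism follows; the `Op` type, its constants, and the name table are hypothetical stand-ins, not the compiler's actual definitions.

```go
package main

import "fmt"

// Op is a hypothetical stand-in for an operation code type.
type Op uint8

const (
	OADD Op = iota
	OSUB
	OMUL
)

var opNames = [...]string{OADD: "ADD", OSUB: "SUB", OMUL: "MUL"}

// String makes Op satisfy fmt.Stringer, so %v and %s print the name.
func (o Op) String() string {
	if int(o) < len(opNames) {
		return opNames[o]
	}
	return fmt.Sprintf("Op(%d)", uint8(o))
}

// GoString makes Op satisfy fmt.GoStringer, so %#v prints this form.
func (o Op) GoString() string { return "O" + o.String() }

// describe formats the same value with three verbs.
func describe(o Op) string {
	return fmt.Sprintf("%v %s %#v", o, o, o)
}

func main() {
	fmt.Println(describe(OSUB)) // SUB SUB OSUB
}
```

With both methods defined, call sites can pass the raw value to a format string and no explicit conversion helper is needed.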
    • net: search domain from hostname if no search directives · cbd72318
      Dan Peterson authored
      Fixes #14897
      
      Change-Id: Iffe7462983a5623a37aa0dc6f74c8c70e10c3244
      Reviewed-on: https://go-review.googlesource.com/21464
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
      Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Matthew Dempsky <mdempsky@google.com>
    • syscall: fix uint64->int cast of control message header · 4edb40d4
      Damien Neil authored
      Change-Id: I28980b307d10730b122a4f833809bc400d6aff24
      Reviewed-on: https://go-review.googlesource.com/22525
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
      Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
    • misc/cgo/testcarchive: fix path of libgo.a for darwin/arm · 78bcdeb6
      Cherry Zhang authored
      After CL 22461, c-archive build on darwin/arm is by default compiled
      with -shared, so update the install path.
      
      Fix build.
      
      Change-Id: Ie93dbd226ed416b834da0234210f4b98bc0e3606
      Reviewed-on: https://go-review.googlesource.com/22507
      Reviewed-by: David Crawshaw <crawshaw@golang.org>
      Run-TryBot: David Crawshaw <crawshaw@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
    • runtime: don't rescan globals · b49b71ae
      Austin Clements authored
      Currently the runtime rescans globals during mark 2 and mark
      termination. This costs as much as 500µs/MB in STW time, which is
      enough to surpass the 10ms STW limit with only 20MB of globals.
      
      It's also basically unnecessary. The compiler already generates write
      barriers for global -> heap pointer updates and the regular write
      barrier doesn't check whether the slot is a global or in the heap.
      Some less common write barriers do cause problems.
      heapBitsBulkBarrier, which is used by typedmemmove and related
      functions, currently depends on having access to the pointer bitmap
      and as a result ignores writes to globals. Likewise, the
      reflect-related write barriers reflect_typedmemmovepartial and
      callwritebarrier ignore non-heap destinations; though it appears they
      can never be called with global pointers anyway.
      
      This commit makes heapBitsBulkBarrier issue write barriers for writes
      to global pointers using the data and BSS pointer bitmaps, removes the
      inheap checks from the reflection write barriers, and eliminates the
      rescans during mark 2 and mark termination. It also adds a test that
      writes to globals have write barriers.
      
      Programs with large data+BSS segments (with pointers) aren't common,
      but for programs that do have large data+BSS segments, this
      significantly reduces pause time:
      
      name \ 95%ile-time/markTerm              old         new  delta
      LargeBSS/bss:1GB/gomaxprocs:4  148200µs ± 6%  302µs ±52%  -99.80% (p=0.008 n=5+5)
      
      This very slightly improves the go1 benchmarks:
      
      name                      old time/op    new time/op    delta
      BinaryTree17-12              2.62s ± 3%     2.62s ± 4%    ~     (p=0.904 n=20+20)
      Fannkuch11-12                2.15s ± 1%     2.13s ± 0%  -1.29%  (p=0.000 n=18+20)
      FmtFprintfEmpty-12          48.3ns ± 2%    47.6ns ± 1%  -1.52%  (p=0.000 n=20+16)
      FmtFprintfString-12          152ns ± 0%     152ns ± 1%    ~     (p=0.725 n=18+18)
      FmtFprintfInt-12             150ns ± 1%     149ns ± 1%  -1.14%  (p=0.000 n=19+20)
      FmtFprintfIntInt-12          250ns ± 0%     244ns ± 1%  -2.12%  (p=0.000 n=20+18)
      FmtFprintfPrefixedInt-12     219ns ± 1%     217ns ± 1%  -1.20%  (p=0.000 n=19+20)
      FmtFprintfFloat-12           280ns ± 0%     281ns ± 1%  +0.47%  (p=0.000 n=19+19)
      FmtManyArgs-12               928ns ± 0%     923ns ± 1%  -0.53%  (p=0.000 n=19+18)
      GobDecode-12                7.21ms ± 1%    7.24ms ± 2%    ~     (p=0.091 n=19+19)
      GobEncode-12                6.07ms ± 1%    6.05ms ± 1%  -0.36%  (p=0.002 n=20+17)
      Gzip-12                      265ms ± 1%     265ms ± 1%    ~     (p=0.496 n=20+19)
      Gunzip-12                   39.6ms ± 1%    39.3ms ± 1%  -0.85%  (p=0.000 n=19+19)
      HTTPClientServer-12         74.0µs ± 2%    73.8µs ± 1%    ~     (p=0.569 n=20+19)
      JSONEncode-12               15.4ms ± 1%    15.3ms ± 1%  -0.25%  (p=0.049 n=17+17)
      JSONDecode-12               53.7ms ± 2%    53.0ms ± 1%  -1.29%  (p=0.000 n=18+17)
      Mandelbrot200-12            3.97ms ± 1%    3.97ms ± 0%    ~     (p=0.072 n=17+18)
      GoParse-12                  3.35ms ± 2%    3.36ms ± 1%  +0.51%  (p=0.005 n=18+20)
      RegexpMatchEasy0_32-12      72.7ns ± 2%    72.2ns ± 1%  -0.70%  (p=0.005 n=19+19)
      RegexpMatchEasy0_1K-12       246ns ± 1%     245ns ± 0%  -0.60%  (p=0.000 n=18+16)
      RegexpMatchEasy1_32-12      72.8ns ± 1%    72.5ns ± 1%  -0.37%  (p=0.011 n=18+18)
      RegexpMatchEasy1_1K-12       380ns ± 1%     385ns ± 1%  +1.34%  (p=0.000 n=20+19)
      RegexpMatchMedium_32-12      115ns ± 2%     115ns ± 1%  +0.44%  (p=0.047 n=20+20)
      RegexpMatchMedium_1K-12     35.4µs ± 1%    35.5µs ± 1%    ~     (p=0.079 n=18+19)
      RegexpMatchHard_32-12       1.83µs ± 0%    1.80µs ± 1%  -1.76%  (p=0.000 n=18+18)
      RegexpMatchHard_1K-12       55.1µs ± 0%    54.3µs ± 1%  -1.42%  (p=0.000 n=18+19)
      Revcomp-12                   386ms ± 1%     381ms ± 1%  -1.14%  (p=0.000 n=18+18)
      Template-12                 61.5ms ± 2%    61.5ms ± 2%    ~     (p=0.647 n=19+20)
      TimeParse-12                 338ns ± 0%     336ns ± 1%  -0.72%  (p=0.000 n=14+19)
      TimeFormat-12                350ns ± 0%     357ns ± 0%  +2.05%  (p=0.000 n=19+18)
      [Geo mean]                  55.3µs         55.0µs       -0.41%
      
      Change-Id: I57e8720385a1b991aeebd111b6874354308e2a6b
      Reviewed-on: https://go-review.googlesource.com/20829
      Run-TryBot: Austin Clements <austin@google.com>
      Reviewed-by: Rick Hudson <rlh@golang.org>
    • runtime: make {add,subtract}{b,1} nosplit · 30172f18
      Austin Clements authored
      These are used at the bottom level of various GC operations that must
      not be preempted. To be on the safe side, mark them all nosplit.
      
      Change-Id: I8f7360e79c9852bd044df71413b8581ad764380c
      Reviewed-on: https://go-review.googlesource.com/22504
      Run-TryBot: Austin Clements <austin@google.com>
      Reviewed-by: Rick Hudson <rlh@golang.org>
    • reflect: fix strings of SliceOf-created types · bddfc337
      David Crawshaw authored
      The new type was inheriting the tflagExtraStar from its prototype.
      
      Fixes #15467
      
      Change-Id: Ic22c2a55cee7580cb59228d52b97e1c0a1e60220
      Reviewed-on: https://go-review.googlesource.com/22501
      Reviewed-by: Ian Lance Taylor <iant@golang.org>
      Run-TryBot: Ian Lance Taylor <iant@golang.org>
    • reflect: unnamed interface types have no name · 217be5b3
      David Crawshaw authored
      Fixes #15468
      
      Change-Id: I8723171f87774a98d5e80e7832ebb96dd1fbea74
      Reviewed-on: https://go-review.googlesource.com/22524
      Reviewed-by: Ian Lance Taylor <iant@golang.org>
      Run-TryBot: David Crawshaw <crawshaw@golang.org>
    • cmd/compile: enable const division for arm64 · 74a9bad6
      Zhongwei Yao authored
      performance:
      benchmark                   old ns/op     new ns/op     delta
      BenchmarkDivconstI64-8      8.28          2.70          -67.39%
      BenchmarkDivconstU64-8      8.28          4.69          -43.36%
      BenchmarkDivconstI32-8      8.28          6.39          -22.83%
      BenchmarkDivconstU32-8      8.28          4.43          -46.50%
      BenchmarkDivconstI16-8      5.17          5.17          +0.00%
      BenchmarkDivconstU16-8      5.33          5.34          +0.19%
      BenchmarkDivconstI8-8       3.50          3.50          +0.00%
      BenchmarkDivconstU8-8       3.51          3.50          -0.28%
      
      Fixes #15382
      
      Change-Id: Ibce7b28f0586d593b33c4d4ecc5d5e7e7c905d13
      Reviewed-on: https://go-review.googlesource.com/22292
      Reviewed-by: Michael Munday <munday@ca.ibm.com>
      Reviewed-by: David Chase <drchase@google.com>
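For readers unfamiliar with division by a constant, the general technique replaces the divide instruction with a multiply by a precomputed reciprocal and a shift. The sketch below shows the well-known case of unsigned 32-bit division by 3; it illustrates the general idea only and is not the code the arm64 backend emits. The `div3` helper is hypothetical.

```go
package main

import "fmt"

// div3 computes x/3 without a divide instruction: multiply by
// ceil(2^33 / 3) = 0xAAAAAAAB in 64-bit arithmetic, then shift right by 33.
// The product always fits in 64 bits because both factors are below 2^32.
func div3(x uint32) uint32 {
	return uint32((uint64(x) * 0xAAAAAAAB) >> 33)
}

func main() {
	samples := []uint32{0, 1, 2, 3, 4, 299, 300, 1 << 31, 0xFFFFFFFE, 0xFFFFFFFF}
	for _, x := range samples {
		// Each pair printed on a line should be equal.
		fmt.Println(x, div3(x), x/3)
	}
}
```

A multiply and a shift are typically much cheaper than a hardware divide, which is consistent with the 32-bit and 64-bit benchmark improvements listed in the commit message.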
    • cmd/compile: switch to compact export format by default · 7538b1db
      Robert Griesemer authored
      builtin.go was auto-generated via go generate; all other
      changes were manual.
      
      The new format reduces the export data size by ~65% on average
      for the std library packages (and there is still quite a bit of
      room for improvement).
      
      The average time to write export data is reduced by (at least)
      62%, as measured in one run over the std lib; it is likely more.
      
      The average time to read import data is reduced by (at least)
      37%, as measured in one run over the std lib; it is likely more.
      There is also room to improve this time.
      
      The compiler transparently handles both packages using the old
      and the new format.
      
      Comparing the -S output of the go build for each package via
      the cmp.bash script (added) shows identical assembly code for
      all packages, but 6 files show file:line differences:
      
      The following files have differences because they use cgo
      and cgo uses different temp. directories for different builds.
      Harmless.
      
      	src/crypto/x509
      	src/net
      	src/os/user
      	src/runtime/cgo
      
      The following files have file:line differences that are not yet
      fully explained; however the differences exist w/ and w/o new export
      format (pre-existing condition). See issue #15453.
      
      	src/go/internal/gccgoimporter
      	src/go/internal/gcimporter
      
      In summary, switching to the new export format produces the same
      package files as before for all practical purposes.
      
      How can you tell which one you have (if you care): Open a package
      (.a) file in an editor. Textual export data starts with a $$ after
      the header and is more or less legible; binary export data starts
      with a $$B after the header and is mostly unreadable. A stand-alone
      decoder (for debugging) is in the works.
      
      In case of a problem, please first try reverting back to the old
      textual format to determine if the cause is the new export format:
      
      For a stand-alone compiler invocation:
      - go tool compile -newexport=0 <files>
      
      For a single package:
      - go build -gcflags="-newexport=0" <pkg>
      
      For make/all.bash:
      - (export GO_GCFLAGS="-newexport=0"; sh make.bash)
      
      Fixes #13241.
      
      Change-Id: I2588cb463be80af22446bf80c225e92ab79878b8
      Reviewed-on: https://go-review.googlesource.com/22123
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
      Run-TryBot: Robert Griesemer <gri@golang.org>
      Reviewed-by: Matthew Dempsky <mdempsky@google.com>
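The commit message above says textual export data starts with `$$` and binary export data with `$$B`. A rough sketch of that check on an in-memory byte slice follows. It assumes the marker begins a line; the `exportFormat` helper and the sample inputs are hypothetical, and a real package file has more structure than shown here.

```go
package main

import (
	"bytes"
	"fmt"
)

// exportFormat reports which export data marker appears first in data.
// Assumption: the marker starts a line, "$$B" for binary and "$$" for textual.
func exportFormat(data []byte) string {
	i := bytes.Index(data, []byte("\n$$"))
	if i < 0 {
		return "none"
	}
	if bytes.HasPrefix(data[i+1:], []byte("$$B")) {
		return "binary"
	}
	return "textual"
}

func main() {
	// Made-up inputs that only mimic the marker placement.
	fmt.Println(exportFormat([]byte("header\n$$B\n\x00\x01"))) // binary
	fmt.Println(exportFormat([]byte("header\n$$\npackage p"))) // textual
	fmt.Println(exportFormat([]byte("header only")))           // none
}
```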
    • regexp: add a harder regexp to the benchmarks · 70d95a48
      Michael Matloob authored
      This regexp has many parallel alternations.
      
      Change-Id: I8044f460aa7d18f20cb0452e9470557b87facd6d
      Reviewed-on: https://go-review.googlesource.com/22471
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
      Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
    • cmd/link: remove absolute address for c-archive on darwin/arm · 9629f55f
      Cherry Zhang authored
      Now it is possible to build a c-archive as PIC on darwin/arm (this is
      now the default). Then the system linker can link the binary using
      the archive as PIE.
      
      Fixes #12896.
      
      Change-Id: Iad84131572422190f5fa036e7d71910dc155f155
      Reviewed-on: https://go-review.googlesource.com/22461
      Reviewed-by: David Crawshaw <crawshaw@golang.org>
    • cmd/compile: don't write pos info for builtin packages · 86c93c98
      Robert Griesemer authored
      TestBuiltin will fail if run on Windows and builtin.go was generated
      on a non-Windows machine (or vice versa) because path names have
      different separators. Avoid the problem altogether by not writing pos
      info for builtin packages. It's not needed.
      
      Affects -newexport only.
      
      Change-Id: I8944f343452faebaea9a08b5fb62829bed77c148
      Reviewed-on: https://go-review.googlesource.com/22498
      Run-TryBot: Robert Griesemer <gri@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Matthew Dempsky <mdempsky@google.com>
    • cmd/compile: don't use line numbers from ONAME and named OLITERALs · a19e60b2
      Keith Randall authored
      The line numbers of ONAMEs are the location of their
      declaration, not their use.
      
      The line numbers of named OLITERALs are also the location
      of their declaration.
      
      Ignore both of these.  Instead, we will inherit the line number from
      the containing syntactic item.
      
      Fixes #14742
      Fixes #15430
      
      Change-Id: Ie43b5b9f6321cbf8cead56e37ccc9364d0702f2f
      Reviewed-on: https://go-review.googlesource.com/22479
      Reviewed-by: Robert Griesemer <gri@golang.org>
      Run-TryBot: Keith Randall <khr@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Matthew Dempsky <mdempsky@google.com>
    • cmd/asm: fix SIMD register name on arm64 · c9389a10
      Zhongwei Yao authored
      The current V-register range on arm64 is V32~V63. This patch changes
      it to V0~V31.
      
      Fixes #15465.
      
      Change-Id: I90dab42dea46825ec5d7a8321ec4f6550735feb8
      Reviewed-on: https://go-review.googlesource.com/22520
      Reviewed-by: Aram Hăvărneanu <aram@mgk.ro>
      Run-TryBot: Aram Hăvărneanu <aram@mgk.ro>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
    • runtime/race: improve TestNoRaceIOHttp test · 6dfba5c7
      Dmitry Vyukov authored
      TestNoRaceIOHttp does all kinds of bad things:
      1. Binds to a fixed port, so concurrent tests fail.
      2. Registers HTTP handler multiple times, so repeated tests fail.
      3. Relies on sleep to wait for listen.
      
      Fix all of that.
      
      Change-Id: I1210b7797ef5e92465b37dc407246d92a2a24fe8
      Reviewed-on: https://go-review.googlesource.com/19953
      Run-TryBot: Dmitry Vyukov <dvyukov@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
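The three problems listed above each have a common remedy in Go: listen on port 0 so the kernel picks a free port, register handlers on a private ServeMux instead of the global one, and create the listener before serving so no sleep is needed. The sketch below shows one way to combine them; the `startServer` and `fetch` helpers are hypothetical and this is not the code from the actual test.

```go
package main

import (
	"fmt"
	"io"
	"net"
	"net/http"
)

// startServer listens on an OS-assigned loopback port and serves a private mux.
// The listener is already accepting when this returns, so callers need no sleep.
func startServer() (addr string, stop func(), err error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0") // port 0: no fixed port to collide on
	if err != nil {
		return "", nil, err
	}
	mux := http.NewServeMux() // private mux: safe to call startServer repeatedly
	mux.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "ok")
	})
	go http.Serve(ln, mux) // returns once the listener is closed
	return ln.Addr().String(), func() { ln.Close() }, nil
}

// fetch performs a GET against the server root and returns the body.
func fetch(addr string) (string, error) {
	resp, err := http.Get("http://" + addr + "/")
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	return string(body), err
}

func main() {
	addr, stop, err := startServer()
	if err != nil {
		panic(err)
	}
	defer stop()
	body, err := fetch(addr)
	if err != nil {
		panic(err)
	}
	fmt.Println(body)
}
```

Registering on http.DefaultServeMux a second time with the same pattern panics, which is why a fresh mux per server matters for repeated runs.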
    • image/color: optimize RGBToYCbCr · 102cf2ae
      Martin Möhrmann authored
      Apply optimizations used to speed up YCbCrToRGB from
      https://go-review.googlesource.com/#/c/21910/
      to RGBToYCbCr.
      
      name             old time/op  new time/op  delta
      RGBToYCbCr/0-2   6.81ns ± 0%  5.96ns ± 0%  -12.48%  (p=0.000 n=38+50)
      RGBToYCbCr/Cb-2  7.68ns ± 0%  6.13ns ± 0%  -20.21%  (p=0.000 n=50+33)
      RGBToYCbCr/Cr-2  6.84ns ± 0%  6.04ns ± 0%  -11.70%  (p=0.000 n=39+42)
      
      Updates #15260
      
      Change-Id: If3ea5393ae371a955ddf18ab226aae20b48f9692
      Reviewed-on: https://go-review.googlesource.com/22411
      Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
      Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Ralph Corderoy <ralph@inputplus.co.uk>
    • cmd/compile/internal: unexport gc.Oconv · 8f2e780e
      Dave Cheney authored
      Updates #15462
      
      Semi-automatic change with gofmt -r and hand fixups for callers outside
      internal/gc.
      
      All the uses of gc.Oconv outside cmd/compile/internal/gc were for the
      Oconv(op, 0) form, which is already handled by the Op.String method.
      
      Replace the use of gc.Oconv(op, 0) with op itself, which will call
      Op.String via the %v or %s verb. Unexport Oconv.
      
      Change-Id: I84da2a2e4381b35f52efce427b2d6a3bccdf2526
      Reviewed-on: https://go-review.googlesource.com/22496
      Run-TryBot: Dave Cheney <dave@cheney.net>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
    • cmd/compile: fix opnames · 707aed03
      Josh Bleecher Snyder authored
      Change-Id: Ief4707747338912216a8509b1adbf655c8ffac56
      Reviewed-on: https://go-review.googlesource.com/22495
      Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
    • net/http: remove idle transport connections from Transport when server closes · 2e302182
      Brad Fitzpatrick authored
      Previously the Transport would cache idle connections for later
      reuse, but if a peer server disconnected (e.g. idle timeout), we
      would not proactively remove the *persistConn from the Transport's
      idle list, leading to a waste of memory (potentially forever).
      
      Instead, when the persistConn's readLoop terminates, remove it from
      the idle list, if present.
      
      This also adds the beginning of accounting for the total number of
      idle connections, which will be needed for Transport.MaxIdleConns
      later.
      
      Updates #15461
      
      Change-Id: Iab091f180f8dd1ee0d78f34b9705d68743b5557b
      Reviewed-on: https://go-review.googlesource.com/22492
      Reviewed-by: Andrew Gerrand <adg@golang.org>
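A simplified sketch of the bookkeeping described above: an idle list keyed by destination, a removal hook that a connection's read loop could call when it exits, and a running total of idle connections. The `conn` and `idlePool` types are hypothetical and far simpler than net/http's internals.

```go
package main

import (
	"fmt"
	"sync"
)

// conn is a hypothetical stand-in for a cached connection.
type conn struct{ key string }

// idlePool tracks idle connections per key plus a running total.
type idlePool struct {
	mu    sync.Mutex
	idle  map[string][]*conn
	total int
}

func newIdlePool() *idlePool {
	return &idlePool{idle: make(map[string][]*conn)}
}

// put caches c for later reuse.
func (p *idlePool) put(c *conn) {
	p.mu.Lock()
	defer p.mu.Unlock()
	p.idle[c.key] = append(p.idle[c.key], c)
	p.total++
}

// remove drops c from the idle list if present. A read loop would call
// this on exit so that a dead connection does not linger in the cache.
func (p *idlePool) remove(c *conn) bool {
	p.mu.Lock()
	defer p.mu.Unlock()
	conns := p.idle[c.key]
	for i, v := range conns {
		if v != c {
			continue
		}
		conns = append(conns[:i], conns[i+1:]...)
		if len(conns) == 0 {
			delete(p.idle, c.key)
		} else {
			p.idle[c.key] = conns
		}
		p.total--
		return true
	}
	return false
}

// count returns the total number of idle connections.
func (p *idlePool) count() int {
	p.mu.Lock()
	defer p.mu.Unlock()
	return p.total
}

func main() {
	p := newIdlePool()
	a, b := &conn{key: "example.com:80"}, &conn{key: "example.com:80"}
	p.put(a)
	p.put(b)
	fmt.Println(p.remove(a), p.count()) // true 1
	fmt.Println(p.remove(a), p.count()) // false 1
}
```

Keeping the total alongside the per-key lists is what makes a global cap on idle connections cheap to enforce later.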
    • context: fix doc typo · 87bca88c
      Brad Fitzpatrick authored
      Fixes #15449
      
      Change-Id: I8d84d076a05c56694b48f7b84f572b1a6524f522
      Reviewed-on: https://go-review.googlesource.com/22493
      Reviewed-by: Andrew Gerrand <adg@golang.org>
    • cmd/go: add Package.StaleReason for debugging with go list · 0b5fbf70
      Russ Cox authored
      It comes up every few months that we can't understand why
      the go command is rebuilding some package.
      Add diagnostics so that the go command can explain itself
      if asked.
      
      For #2775, #3506, #12074.
      
      Change-Id: I1c73b492589b49886bf31a8f9d05514adbd6ed70
      Reviewed-on: https://go-review.googlesource.com/22432
      Reviewed-by: Rob Pike <r@golang.org>
    • crypto/sha256: add s390x assembly implementation · 525ae3f8
      Michael Munday authored
      Renames block to blockGeneric so that it can be called when the
      assembly feature check fails. This means making block a var on
      platforms without an assembly implementation (similar to the sha1
      package).
      
      Also adds a test to check that the fallback path works correctly
      when the feature check fails.
      
      name        old speed      new speed       delta
      Hash8Bytes  6.42MB/s ± 1%  27.14MB/s ± 0%  +323.01%  (p=0.000 n=10+10)
      Hash1K      53.9MB/s ± 0%  511.1MB/s ± 0%  +847.57%   (p=0.000 n=10+9)
      Hash8K      57.1MB/s ± 1%  609.7MB/s ± 0%  +967.04%  (p=0.000 n=10+10)
      
      Change-Id: If962b2a5c9160b3a0b76ccee53b2fd809468ed3d
      Reviewed-on: https://go-review.googlesource.com/22460
      Run-TryBot: Michael Munday <munday@ca.ibm.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Bill O'Farrell <billotosyr@gmail.com>
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
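The fallback arrangement described above, where `block` is a variable that points at either the generic or the accelerated routine, can be sketched as follows. The toy checksum, the `hasAccel` flag, and `chooseBlock` are hypothetical; only the names `block` and `blockGeneric` come from the commit message.

```go
package main

import "fmt"

// hasAccel is a hypothetical stand-in for a hardware feature check.
var hasAccel = false

// blockGeneric is the portable implementation (a toy checksum here).
func blockGeneric(sum *uint32, p []byte) {
	for _, b := range p {
		*sum = *sum*31 + uint32(b)
	}
}

// blockAccel stands in for an assembly routine. It must produce the
// same result as blockGeneric; here it simply delegates.
func blockAccel(sum *uint32, p []byte) {
	blockGeneric(sum, p)
}

// chooseBlock picks the implementation based on the feature check.
func chooseBlock(accel bool) func(*uint32, []byte) {
	if accel {
		return blockAccel
	}
	return blockGeneric
}

// block is a variable rather than a function so it can be swapped.
var block = chooseBlock(hasAccel)

func main() {
	var a, b uint32
	block(&a, []byte("abc"))
	chooseBlock(true)(&b, []byte("abc"))
	fmt.Println(a == b) // true
}
```

Because both paths are reachable through the same variable, a test can force the fallback and compare its output with the accelerated path, which is the kind of check the commit adds.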
  2. 26 Apr, 2016 15 commits
    • runtime: make stack re-scan O(# dirty stacks) · 2a889b9d
      Austin Clements authored
      Currently the stack re-scan during mark termination is O(# stacks)
      because we enqueue a root marking job for every goroutine. It takes
      ~34ns to process this root marking job for a valid (clean) stack, so
      at around 300k goroutines we exceed the 10ms pause goal. A non-trivial
      portion of this time is spent simply taking the cache miss to check
      the gcscanvalid flag, so simply optimizing the path that handles clean
      stacks can only improve this so much.
      
      Fix this by keeping an explicit list of goroutines with dirty stacks
      that need to be rescanned. When a goroutine first transitions to
      running after a stack scan and marks its stack dirty, it adds itself
      to this list. We enqueue root marking jobs only for the goroutines in
      this list, so this improves stack re-scanning asymptotically by
      completely eliminating time spent on clean goroutines.
      
      This reduces mark termination time for 500k idle goroutines from 15ms
      to 238µs. Overall performance effect is negligible.
      
      name \ 95%ile-time/markTerm     old           new         delta
      IdleGs/gs:500000/gomaxprocs:12  15000µs ± 0%  238µs ± 5%  -98.41% (p=0.000 n=10+10)
      
      name              old time/op  new time/op  delta
      XBenchGarbage-12  2.30ms ± 3%  2.29ms ± 1%  -0.43%  (p=0.049 n=17+18)
      
      name                      old time/op    new time/op    delta
      BinaryTree17-12              2.57s ± 3%     2.59s ± 2%    ~     (p=0.141 n=19+20)
      Fannkuch11-12                2.09s ± 0%     2.10s ± 1%  +0.53%  (p=0.000 n=19+19)
      FmtFprintfEmpty-12          45.3ns ± 3%    45.2ns ± 2%    ~     (p=0.845 n=20+20)
      FmtFprintfString-12          129ns ± 0%     127ns ± 0%  -1.55%  (p=0.000 n=16+16)
      FmtFprintfInt-12             123ns ± 0%     119ns ± 1%  -3.24%  (p=0.000 n=19+19)
      FmtFprintfIntInt-12          195ns ± 1%     189ns ± 1%  -3.11%  (p=0.000 n=17+17)
      FmtFprintfPrefixedInt-12     193ns ± 1%     187ns ± 1%  -3.06%  (p=0.000 n=19+19)
      FmtFprintfFloat-12           254ns ± 0%     255ns ± 1%  +0.35%  (p=0.001 n=14+17)
      FmtManyArgs-12               781ns ± 0%     770ns ± 0%  -1.48%  (p=0.000 n=16+19)
      GobDecode-12                7.00ms ± 1%    6.98ms ± 1%    ~     (p=0.563 n=19+19)
      GobEncode-12                5.91ms ± 1%    5.92ms ± 0%    ~     (p=0.118 n=19+18)
      Gzip-12                      219ms ± 1%     215ms ± 1%  -1.81%  (p=0.000 n=18+18)
      Gunzip-12                   37.2ms ± 0%    37.4ms ± 0%  +0.45%  (p=0.000 n=17+19)
      HTTPClientServer-12         76.9µs ± 3%    77.5µs ± 2%  +0.81%  (p=0.030 n=20+19)
      JSONEncode-12               15.0ms ± 0%    14.8ms ± 1%  -0.88%  (p=0.001 n=15+19)
      JSONDecode-12               50.6ms ± 0%    53.2ms ± 2%  +5.07%  (p=0.000 n=17+19)
      Mandelbrot200-12            4.05ms ± 0%    4.05ms ± 1%    ~     (p=0.581 n=16+17)
      GoParse-12                  3.34ms ± 1%    3.30ms ± 1%  -1.21%  (p=0.000 n=15+20)
      RegexpMatchEasy0_32-12      69.6ns ± 1%    69.8ns ± 2%    ~     (p=0.566 n=19+19)
      RegexpMatchEasy0_1K-12       238ns ± 1%     236ns ± 0%  -0.91%  (p=0.000 n=17+13)
      RegexpMatchEasy1_32-12      69.8ns ± 1%    70.0ns ± 1%  +0.23%  (p=0.026 n=17+16)
      RegexpMatchEasy1_1K-12       371ns ± 1%     363ns ± 1%  -2.07%  (p=0.000 n=19+19)
      RegexpMatchMedium_32-12      107ns ± 2%     106ns ± 1%  -0.51%  (p=0.031 n=18+20)
      RegexpMatchMedium_1K-12     33.0µs ± 0%    32.9µs ± 0%  -0.30%  (p=0.004 n=16+16)
      RegexpMatchHard_32-12       1.70µs ± 0%    1.70µs ± 0%  +0.45%  (p=0.000 n=16+17)
      RegexpMatchHard_1K-12       51.1µs ± 2%    51.4µs ± 1%  +0.53%  (p=0.000 n=17+19)
      Revcomp-12                   378ms ± 1%     385ms ± 1%  +1.92%  (p=0.000 n=19+18)
      Template-12                 64.3ms ± 2%    65.0ms ± 2%  +1.09%  (p=0.001 n=19+19)
      TimeParse-12                 315ns ± 1%     317ns ± 2%    ~     (p=0.108 n=18+20)
      TimeFormat-12                360ns ± 1%     337ns ± 0%  -6.30%  (p=0.000 n=18+13)
      [Geo mean]                  51.8µs         51.6µs       -0.48%
      
      Change-Id: Icf8994671476840e3998236e15407a505d4c760c
      Reviewed-on: https://go-review.googlesource.com/20700
      Reviewed-by: Rick Hudson <rlh@golang.org>
      Run-TryBot: Austin Clements <austin@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
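The dirty-list idea above can be sketched in a few lines: a goroutine adds itself to a list the first time it runs after a scan, and the rescan visits only that list. The `G` and `rescanList` types below are hypothetical simplifications and omit all of the synchronization the runtime needs.

```go
package main

import "fmt"

// G is a hypothetical stand-in for a goroutine descriptor.
type G struct {
	id        int
	scanValid bool // true while the last stack scan is still valid
}

// rescanList holds only the goroutines whose stacks are dirty.
type rescanList struct {
	dirty []*G
}

// markRunning is called when g transitions to running. Only the first
// transition after a scan dirties the stack and enqueues g.
func (r *rescanList) markRunning(g *G) {
	if g.scanValid {
		g.scanValid = false
		r.dirty = append(r.dirty, g)
	}
}

// rescan processes only the dirty goroutines and reports how many it visited.
func (r *rescanList) rescan() int {
	n := len(r.dirty)
	for _, g := range r.dirty {
		g.scanValid = true // stack scanned again
	}
	r.dirty = r.dirty[:0]
	return n
}

func main() {
	gs := make([]*G, 500000)
	for i := range gs {
		gs[i] = &G{id: i, scanValid: true}
	}
	var r rescanList
	r.markRunning(gs[1])
	r.markRunning(gs[7])
	r.markRunning(gs[7])    // already dirty: not enqueued twice
	fmt.Println(r.rescan()) // 2
}
```

The work done by `rescan` depends on how many goroutines ran since the last scan, not on how many exist, which is the asymptotic improvement the commit describes.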
    • runtime: don't clear gcscanvalid in casfrom_Gscanstatus · 5b765ce3
      Austin Clements authored
      Currently we clear gcscanvalid in both casgstatus and
      casfrom_Gscanstatus if the new status is _Grunning. This is very
      important to do in casgstatus. However, this is potentially wrong in
      casfrom_Gscanstatus because in this case the caller doesn't own gp and
      hence the write is racy. Unlike the other _Gscan statuses, during
      _Gscanrunning, the G is still running. This does not indicate that
      it's transitioning into a running state. The scan simply hasn't
      happened yet, so it's neither valid nor invalid.
      
      Conveniently, this also means clearing gcscanvalid is unnecessary in
      this case because the G was already in _Grunning, so we can simply
      remove this code. What will happen instead is that the G will be
      preempted to scan itself, that scan will set gcscanvalid to true, and
      then the G will return to _Grunning via casgstatus, clearing
      gcscanvalid.
      
      This fix will become necessary shortly when we start keeping track of
      the set of G's with dirty stacks, since it will no longer be
      idempotent to simply set gcscanvalid to false.
      
      Change-Id: I688c82e6fbf00d5dbbbff49efa66acb99ee86785
      Reviewed-on: https://go-review.googlesource.com/20669
      Reviewed-by: Rick Hudson <rlh@golang.org>
      Run-TryBot: Austin Clements <austin@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
    • runtime: fix typos in comment about gcscanvalid · c707d838
      Austin Clements authored
      Change-Id: Id4ad7ebf88a21eba2bc5714b96570ed5cfaed757
      Reviewed-on: https://go-review.googlesource.com/22210
      Reviewed-by: Ian Lance Taylor <iant@golang.org>
      Run-TryBot: Austin Clements <austin@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
    • runtime: remove stack barriers during sweep · 9f263c14
      Austin Clements authored
      This adds a best-effort pass to remove stack barriers immediately
      after the end of mark termination. This isn't necessary for the Go
      runtime, but should help external tools that perform stack walks but
      aren't aware of Go's stack barriers such as GDB, perf, and VTune.
      (Though clearly they'll still have trouble unwinding stacks during
      mark.)
      
      Change-Id: I66600fae1f03ee36b5459d2b00dcc376269af18e
      Reviewed-on: https://go-review.googlesource.com/20668
      Reviewed-by: Rick Hudson <rlh@golang.org>
      Run-TryBot: Austin Clements <austin@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
    • runtime: remove stack barriers during concurrent mark · 269c969c
      Austin Clements authored
      Currently we remove stack barriers during STW mark termination, which
      has a non-trivial per-goroutine cost and means that we have to touch
      even clean stacks during mark termination. However, there's no problem
      with leaving them in during the sweep phase. They just have to be out
      by the time we install new stack barriers immediately prior to
      scanning the stack such as during the mark phase of the next GC cycle
      or during mark termination in a STW GC.
      
      Hence, move the gcRemoveStackBarriers from STW mark termination to
      just before we install new stack barriers during concurrent mark. This
      removes the cost from STW. Furthermore, this combined with concurrent
      stack shrinking means that the mark termination scan of a clean stack
      is a complete no-op, which will make it possible to skip clean stacks
      entirely during mark termination.
      
      This has the downside that anything outside of Go that tries to walk
      Go stacks will now be confused all the time instead of just some of
      the time. This includes tools like GDB, perf, and VTune. We'll improve
      the situation shortly.
      
      Change-Id: Ia40baad8f8c16aeefac05425e00b0cf478137097
      Reviewed-on: https://go-review.googlesource.com/20667
      Reviewed-by: Rick Hudson <rlh@golang.org>
      Run-TryBot: Austin Clements <austin@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
    • Austin Clements's avatar
      runtime: avoid span root marking entirely during mark termination · efb0c554
      Austin Clements authored
      Currently we enqueue span root mark jobs during both concurrent mark
      and mark termination, but we make the job a no-op during mark
      termination.
      
      This is silly. Instead of queueing them up just to not do them, don't
      queue them up in the first place.
      
      Change-Id: Ie1d36de884abfb17dd0db6f0449a2b7c997affab
      Reviewed-on: https://go-review.googlesource.com/20666
      Reviewed-by: Rick Hudson <rlh@golang.org>
      Run-TryBot: Austin Clements <austin@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      efb0c554
    • Austin Clements's avatar
      runtime: free dead G stacks concurrently · e8337491
      Austin Clements authored
      Currently we free cached stacks of dead Gs during STW stack root
      marking. We do this during STW because there's no way to take
      ownership of a particular dead G, so attempting to free a dead G's
      stack during concurrent stack root marking could race with reusing
      that G.
      
      However, we can do this concurrently if we take a completely different
      approach. One way to prevent reuse of a dead G is to remove it from
      the free G list. Hence, this adds a new fixed root marking task that
      simply removes all Gs from the list of dead Gs with cached stacks,
      frees their stacks, and then adds them to the list of dead Gs without
      cached stacks.
      
      This is also a necessary step toward rescanning only dirty stacks,
      since it eliminates another task from STW stack marking.
      
      Change-Id: Iefbad03078b284a2e7bf30fba397da4ca87fe095
      Reviewed-on: https://go-review.googlesource.com/20665
      Reviewed-by: Rick Hudson <rlh@golang.org>
      Run-TryBot: Austin Clements <austin@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      e8337491
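The approach described in the commit above (take ownership of dead Gs by removing them from the free list, free their stacks, then re-add them to the stackless list) can be sketched roughly as follows. This is a simplified model, not the runtime's code: the `g` and `freeLists` types, the `freeStackRoots` method, and the use of a `sync.Mutex` are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// g is a hypothetical stand-in for a goroutine descriptor.
type g struct {
	id    int
	stack []byte // nil once the cached stack has been freed
}

// freeLists is a hypothetical, simplified model of the free G lists.
type freeLists struct {
	mu      sync.Mutex
	stack   []*g // dead Gs that still hold a cached stack
	noStack []*g // dead Gs without a cached stack
}

// freeStackRoots models the root marking task: it unlinks every dead G
// that has a cached stack, frees the stacks, and files the Gs on the
// stackless list. It returns the number of stacks freed.
func (f *freeLists) freeStackRoots() int {
	// Unlinking the whole list under the lock gives this task exclusive
	// ownership: nothing else can pick these Gs up for reuse meanwhile.
	f.mu.Lock()
	owned := f.stack
	f.stack = nil
	f.mu.Unlock()

	// Free the stacks without holding the lock.
	for _, gp := range owned {
		gp.stack = nil
	}

	// Make the Gs available for reuse again, now without stacks.
	f.mu.Lock()
	f.noStack = append(f.noStack, owned...)
	f.mu.Unlock()
	return len(owned)
}

func main() {
	f := &freeLists{stack: []*g{
		{id: 1, stack: make([]byte, 2048)},
		{id: 2, stack: make([]byte, 2048)},
	}}
	freed := f.freeStackRoots()
	fmt.Println(freed, len(f.stack), len(f.noStack))
}
```

The point of the sketch is the ownership step: once a G is off the list, freeing its stack cannot race with reuse, so no stop-the-world is needed for this task.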
    • Austin Clements's avatar
      runtime: split gfree list into with-stacks and without-stacks · 1a2cf91f
      Austin Clements authored
      Currently all free Gs are added to one list. Split this into two
      lists: one for free Gs with cached stacks and one for Gs without
      cached stacks.
      
      This lets us preferentially allocate Gs that already have a stack, but
      more importantly, it sets us up to free cached G stacks concurrently.
      
      Change-Id: Idbe486f708997e1c9d166662995283f02d1eeb3c
      Reviewed-on: https://go-review.googlesource.com/20664
      Reviewed-by: Rick Hudson <rlh@golang.org>
      Run-TryBot: Austin Clements <austin@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      1a2cf91f
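A minimal sketch of the two-list idea from the commit above, assuming hypothetical names (`g`, `gfreeLists`, `put`, `get`) that do not come from the runtime source. It shows only the allocation preference: a free G that already has a stack is handed out before one that would need a new stack.

```go
package main

import "fmt"

// g is a hypothetical stand-in for a goroutine descriptor.
type g struct {
	id       int
	hasStack bool
}

// gfreeLists is a hypothetical model of the split free list.
type gfreeLists struct {
	withStack []*g // free Gs that still have a cached stack
	noStack   []*g // free Gs without a cached stack
}

// put files a free G on the list matching its stack state.
func (l *gfreeLists) put(gp *g) {
	if gp.hasStack {
		l.withStack = append(l.withStack, gp)
		return
	}
	l.noStack = append(l.noStack, gp)
}

// get prefers a G that already has a stack and falls back to a
// stackless G. It returns nil when both lists are empty.
func (l *gfreeLists) get() *g {
	if n := len(l.withStack); n > 0 {
		gp := l.withStack[n-1]
		l.withStack = l.withStack[:n-1]
		return gp
	}
	if n := len(l.noStack); n > 0 {
		gp := l.noStack[n-1]
		l.noStack = l.noStack[:n-1]
		return gp
	}
	return nil
}

func main() {
	var l gfreeLists
	l.put(&g{id: 1, hasStack: false})
	l.put(&g{id: 2, hasStack: true})
	first := l.get()
	fmt.Println(first.id, first.hasStack)
}
```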
    • Keith Randall's avatar
      cmd/compile: a rule's line number is at its -> · 3b0efa68
      Keith Randall authored
      Let's define the line number of a multiline rule as the line
      number on which the -> appears.  This helps make the rule
      cover analysis look a bit nicer.
      
      Change-Id: I4ac4c09f2240285976590ecfd416bc4c05e78946
      Reviewed-on: https://go-review.googlesource.com/22473
      Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
      3b0efa68
    • Matthew Dempsky's avatar
      cmd/compile: lazily initialize litbuf · 8d075bee
      Matthew Dempsky authored
      Instead of eagerly creating strings like "literal 2.01" for every
      lexed number in case we need to mention it in an error message, defer
      this work to (*parser).syntax_error.
      
      name      old allocs/op  new allocs/op  delta
      Template      482k ± 0%      482k ± 0%  -0.12%   (p=0.000 n=9+10)
      GoTypes      1.35M ± 0%     1.35M ± 0%  -0.04%  (p=0.015 n=10+10)
      Compiler     5.45M ± 0%     5.44M ± 0%  -0.12%    (p=0.000 n=9+8)
      
      Change-Id: I333b3c80e583864914412fb38f8c0b7f1d8c8821
      Reviewed-on: https://go-review.googlesource.com/22480
      Run-TryBot: Matthew Dempsky <mdempsky@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
      8d075bee
    • Robert Griesemer's avatar
      cmd/dist: sort entries in zcgo.go generated file for deterministic build · 19912e1d
      Robert Griesemer authored
      This simplifies comparison of object files across different builds
      by ensuring that the strings in the zcgo.go always appear in the
      same order.
      
      Change-Id: I3639ea4fd10e0d645b838d1bbb03cd33deca340e
      Reviewed-on: https://go-review.googlesource.com/22478
      Reviewed-by: Matthew Dempsky <mdempsky@google.com>
      19912e1d
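Go map iteration order is not fixed, so a generator that ranges over a map directly can emit the same entries in a different order on each run. A common fix, sketched below, is to collect and sort the keys before writing. The `renderEnabled` function and its output format are assumptions for illustration and are not the actual cmd/dist code.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// renderEnabled is a hypothetical generator helper. It emits one line
// per enabled key, always in sorted order, so repeated runs produce
// byte-identical output.
func renderEnabled(enabled map[string]bool) string {
	keys := make([]string, 0, len(enabled))
	for k, ok := range enabled {
		if ok {
			keys = append(keys, k)
		}
	}
	sort.Strings(keys) // fix the order before emitting anything

	var b strings.Builder
	for _, k := range keys {
		fmt.Fprintf(&b, "\t%q: true,\n", k)
	}
	return b.String()
}

func main() {
	fmt.Print(renderEnabled(map[string]bool{
		"linux/amd64":  true,
		"darwin/amd64": true,
		"plan9/386":    false,
	}))
}
```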
    • Egon Elbre's avatar
      unicode: improve SimpleFold performance for ascii · e607abbf
      Egon Elbre authored
      This change significantly speeds up case-insensitive regexp matching.
      
      benchmark                      old ns/op      new ns/op      delta
      BenchmarkMatchEasy0i_32-8      2690           1473           -45.24%
      BenchmarkMatchEasy0i_1K-8      80404          42269          -47.43%
      BenchmarkMatchEasy0i_32K-8     3272187        2076118        -36.55%
      BenchmarkMatchEasy0i_1M-8      104805990      66503805       -36.55%
      BenchmarkMatchEasy0i_32M-8     3360192200     2126121600     -36.73%
      
      benchmark                      old MB/s     new MB/s     speedup
      BenchmarkMatchEasy0i_32-8      11.90        21.72        1.83x
      BenchmarkMatchEasy0i_1K-8      12.74        24.23        1.90x
      BenchmarkMatchEasy0i_32K-8     10.01        15.78        1.58x
      BenchmarkMatchEasy0i_1M-8      10.00        15.77        1.58x
      BenchmarkMatchEasy0i_32M-8     9.99         15.78        1.58x
      
      Issue #13288
      
      Change-Id: I94af7bb29e75d60b4f6ee760124867ab271b9642
      Reviewed-on: https://go-review.googlesource.com/16943
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
      Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      e607abbf
    • Alan Donovan's avatar
      gc: use AbsFileLine for deterministic binary export data · 6e4a8615
      Alan Donovan authored
      This version of the file name honors the -trimpath flag,
      which strips off variable parts like $WORK or $PWD.
      The TestCgoConsistentResults test now passes.
      
      Change-Id: If93980b054f9b13582dd314f9d082c26eaac4f41
      Reviewed-on: https://go-review.googlesource.com/22444
      Reviewed-by: Robert Griesemer <gri@golang.org>
      6e4a8615
    • Robert Griesemer's avatar
      cmd/compile: don't discard inlineable but empty functions with binary export format · 17db07f9
      Robert Griesemer authored
      Change-Id: I0f016fa000f949d27847d645b4cdebe68a8abf20
      Reviewed-on: https://go-review.googlesource.com/22474
      Run-TryBot: Robert Griesemer <gri@golang.org>
      Reviewed-by: Matthew Dempsky <mdempsky@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      17db07f9
    • Michael Hudson-Doyle's avatar
      cmd/link: pass -no-pie (if supported) when creating a race-enabled executable. · 3a72d626
      Michael Hudson-Doyle authored
      Fixes #15443
      
      Change-Id: Ia3593104fc1a4255926ae5675c25390604b44b7b
      Reviewed-on: https://go-review.googlesource.com/22453
      Run-TryBot: Michael Hudson-Doyle <michael.hudson@canonical.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
      Reviewed-by: Ian Lance Taylor <iant@golang.org>
      3a72d626