1. 08 Sep, 2016 13 commits
    • cmd/compile: intrinsify Ctz, Bswap, and some atomics on ARM64 · 4354ffd3
      Cherry Zhang authored
      Change-Id: Ia5bf72b70e6f6522d6fb8cd050e78f862d37b5ae
      Reviewed-on: https://go-review.googlesource.com/27936
      Run-TryBot: Cherry Zhang <cherryyz@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Keith Randall <khr@golang.org>
    • net/http/httputil: remove custom hop-by-hop headers from response in ReverseProxy · daa7c607
      Sina Siadat authored
      Hop-by-hop headers (explicitly mentioned in RFC 2616) were already
      removed from the response. This removes the custom hop-by-hop
      headers listed in the "Connection" header of the response.
      
      Updates #16875
      
      Change-Id: I6b8f261d38b8d72040722f3ded29755ef0303427
      Reviewed-on: https://go-review.googlesource.com/28810
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
      Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
    • encoding/json: Use a lookup table for safe characters · ed8f2079
      Kevin Burke authored
      The previous check for characters inside of a JSON string that needed
      to be escaped performed seven different boolean comparisons before
      determining that an ASCII character did not need to be escaped. Most
      characters do not need to be escaped, so this check can be done in a
      more performant way.
      
      Use the same strategy as the unicode package for precomputing a range
      of characters that need to be escaped, then do a single lookup into a
      character array to determine whether the character needs escaping.
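
The table-lookup strategy can be sketched roughly as follows. The names needsEscape and firstUnsafe are assumptions made for this example, and the table is simplified to control characters, the double quote, and the backslash; the table in the actual change may cover more characters.

```go
package main

import "fmt"

// needsEscape[b] is true when ASCII byte b must be escaped inside a JSON
// string. Hypothetical, simplified table for illustration.
var needsEscape [128]bool

func init() {
	for b := 0; b < 0x20; b++ {
		needsEscape[b] = true // control characters
	}
	needsEscape['"'] = true
	needsEscape['\\'] = true
}

// firstUnsafe returns the index of the first byte of s that needs escaping,
// or -1 if there is none. Bytes >= 128 are skipped here for brevity.
func firstUnsafe(s string) int {
	for i := 0; i < len(s); i++ {
		// One array lookup replaces a chain of boolean comparisons.
		if b := s[i]; b < 128 && needsEscape[b] {
			return i
		}
	}
	return -1
}

func main() {
	fmt.Println(firstUnsafe("plain text"))
	fmt.Println(firstUnsafe(`say "hi"`))
}
```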
      
      On an AWS c4.large node:
      
      $ benchstat benchmarks/master-bench benchmarks/json-table-bench
      name                   old time/op    new time/op     delta
      CodeEncoder-2            19.0ms ± 0%     15.5ms ± 1%  -18.16%        (p=0.000 n=19+20)
      CodeMarshal-2            20.1ms ± 1%     16.8ms ± 2%  -16.35%        (p=0.000 n=20+21)
      CodeDecoder-2            49.3ms ± 1%     49.5ms ± 2%     ~           (p=0.498 n=16+20)
      DecoderStream-2           416ns ± 0%      416ns ± 1%     ~           (p=0.978 n=19+19)
      CodeUnmarshal-2          51.0ms ± 1%     50.9ms ± 1%     ~           (p=0.490 n=19+17)
      CodeUnmarshalReuse-2     48.5ms ± 2%     48.5ms ± 2%     ~           (p=0.989 n=20+19)
      UnmarshalString-2         541ns ± 1%      532ns ± 1%   -1.75%        (p=0.000 n=20+21)
      UnmarshalFloat64-2        485ns ± 1%      481ns ± 1%   -0.92%        (p=0.000 n=20+21)
      UnmarshalInt64-2          429ns ± 1%      427ns ± 1%   -0.49%        (p=0.000 n=19+20)
      Issue10335-2              631ns ± 1%      619ns ± 1%   -1.84%        (p=0.000 n=20+20)
      NumberIsValid-2          19.1ns ± 0%     19.1ns ± 0%     ~     (all samples are equal)
      NumberIsValidRegexp-2     689ns ± 1%      690ns ± 0%     ~           (p=0.150 n=20+20)
      SkipValue-2              14.0ms ± 0%     14.0ms ± 0%   -0.05%        (p=0.000 n=18+18)
      EncoderEncode-2           525ns ± 2%      512ns ± 1%   -2.33%        (p=0.000 n=20+18)
      
      name                   old speed      new speed       delta
      CodeEncoder-2           102MB/s ± 0%    125MB/s ± 1%  +22.20%        (p=0.000 n=19+20)
      CodeMarshal-2          96.6MB/s ± 1%  115.6MB/s ± 2%  +19.56%        (p=0.000 n=20+21)
      CodeDecoder-2          39.3MB/s ± 1%   39.2MB/s ± 2%     ~           (p=0.464 n=16+20)
      CodeUnmarshal-2        38.1MB/s ± 1%   38.1MB/s ± 1%     ~           (p=0.525 n=19+17)
      SkipValue-2             143MB/s ± 0%    143MB/s ± 0%   +0.05%        (p=0.000 n=18+18)
      
      I also took the data set reported in #5683 (browser
      telemetry data from Mozilla), added named structs for
      the data set, and turned it into a proper benchmark:
      https://github.com/kevinburke/jsonbench/blob/master/go/bench_test.go
      
      The results from that test are similarly encouraging. On a 64-bit
      Mac:
      
      $ benchstat benchmarks/master-benchmark benchmarks/json-table-benchmark
      name              old time/op    new time/op    delta
      CodeMarshal-4       1.19ms ± 2%    1.08ms ± 2%   -9.33%  (p=0.000 n=21+17)
      Unmarshal-4         3.09ms ± 3%    3.06ms ± 1%   -0.83%  (p=0.027 n=22+17)
      UnmarshalReuse-4    3.04ms ± 1%    3.04ms ± 1%     ~     (p=0.169 n=20+15)
      
      name              old speed      new speed      delta
      CodeMarshal-4     80.3MB/s ± 1%  88.5MB/s ± 1%  +10.29%  (p=0.000 n=21+17)
      Unmarshal-4       31.0MB/s ± 2%  31.2MB/s ± 1%   +0.83%  (p=0.025 n=22+17)
      
      On the c4.large:
      
      $ benchstat benchmarks/master-bench benchmarks/json-table-bench
      name              old time/op    new time/op    delta
      CodeMarshal-2       1.10ms ± 1%    0.98ms ± 1%  -10.12%  (p=0.000 n=20+54)
      Unmarshal-2         2.82ms ± 1%    2.79ms ± 0%   -1.09%  (p=0.000 n=20+51)
      UnmarshalReuse-2    2.80ms ± 0%    2.77ms ± 0%   -1.03%  (p=0.000 n=20+52)
      
      name              old speed      new speed      delta
      CodeMarshal-2     87.3MB/s ± 1%  97.1MB/s ± 1%  +11.27%  (p=0.000 n=20+54)
      Unmarshal-2       33.9MB/s ± 1%  34.2MB/s ± 0%   +1.10%  (p=0.000 n=20+51)
      
      For what it's worth, I tried other heuristics - short circuiting the
      conditional for common ASCII characters, for example:
      
      if (b >= 63 && b != 92) || (b >= 39 && b <= 59) || (rest of the conditional)
      
      This offered a speedup around 7-9%, not as large as the submitted
      change.
      
      Change-Id: Idcf88f7b93bfcd1164cdd6a585160b7e407a0d9b
      Reviewed-on: https://go-review.googlesource.com/24466
      Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
      Run-TryBot: Joe Tsai <thebrokentoaster@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
    • bytes: improve WriteRune performance · 2321895f
      Martin Möhrmann authored
      Remove the runeBytes buffer and write the utf8 encoding directly
      to the internal buf byte slice.
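
As an illustration of writing the encoding directly into the destination slice, here is a sketch using a hypothetical appendRune helper on a plain []byte (the real change operates on bytes.Buffer internals, which are not reproduced here):

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// appendRune reserves utf8.UTFMax bytes at the end of buf and encodes r
// straight into them, so no separate scratch array or copy is needed.
// Hypothetical helper for illustration.
func appendRune(buf []byte, r rune) []byte {
	if r >= 0 && r < utf8.RuneSelf {
		return append(buf, byte(r)) // single-byte fast path
	}
	n := len(buf)
	buf = append(buf, 0, 0, 0, 0) // reserve utf8.UTFMax bytes
	w := utf8.EncodeRune(buf[n:n+utf8.UTFMax], r)
	return buf[:n+w] // keep only the bytes actually written
}

func main() {
	b := appendRune([]byte("go "), '世')
	b = appendRune(b, '!')
	fmt.Println(string(b), len(b))
}
```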
      
      name         old time/op   new time/op   delta
      WriteRune-4   80.5µs ± 2%   57.1µs ± 2%  -29.06%  (p=0.000 n=20+20)
      
      name         old speed     new speed     delta
      WriteRune-4  153MB/s ± 2%  215MB/s ± 2%  +40.96%  (p=0.000 n=20+20)
      
      Change-Id: Ic15f6e2d6e56a3d15c74f56159e2eae020ba73ba
      Reviewed-on: https://go-review.googlesource.com/28816
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
      Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
    • runtime: simplify getargp · 07bcc165
      Josh Bleecher Snyder authored
      Change-Id: I9ed62e8a6d8b9204c18748efd7845adabf3460b9
      Reviewed-on: https://go-review.googlesource.com/28775
      Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
      Reviewed-by: Keith Randall <khr@golang.org>
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
    • cmd/compile/internal/ssa/gen: fix error message for wrong arg length · 70fd814f
      Cherry Zhang authored
      When arg length is wrong, op is not set, so it always prints
      "should have 0 args".
      
      Change-Id: If7bcb41d993919d0038d2a09e16188c79dfbd858
      Reviewed-on: https://go-review.googlesource.com/28831
      Reviewed-by: Keith Randall <khr@golang.org>
    • doc: fix typo in the release notes · bec84c72
      Michal Bohuslávek authored
      Change-Id: I003795d8dc2176532ee133740bf35e23a3aa3878
      Reviewed-on: https://go-review.googlesource.com/28811
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
    • runtime: remove maxstring · 252093f1
      Martin Möhrmann authored
      Before this CL the runtime prevented printing of overlong strings with the print
      function when the length of the string was determined to be corrupted.
      Corruption was checked by comparing the string size against the limit
      which was stored in maxstring.
      
      However, maxstring was not updated everywhere Go strings were created,
      e.g. for string constants during compile time. As a result, the check for maximum
      string length prevented the printing of some valid strings.
      
      The protection maxstring provided did not warrant the bookkeeping
      and global synchronization needed to keep maxstring updated to the
      correct limit everywhere.
      
      Fixes #16999
      
      Change-Id: I62cc2f4362f333f75b77f199ce1a71aac0ff7aeb
      Reviewed-on: https://go-review.googlesource.com/28813
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
      Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
    • io/ioutil: return better error when TempDir called with non-extant dir · fd975c6a
      Brad Fitzpatrick authored
      Fixes #14196
      
      Change-Id: Ife7950289ac6adbcfc4d0f2fce31f20bc2657858
      Reviewed-on: https://go-review.googlesource.com/28772
      Reviewed-by: Ian Lance Taylor <iant@golang.org>
      Run-TryBot: Ian Lance Taylor <iant@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
    • cmd/compile: remove unnecessary FuncType cloning · 3a59b562
      Matthew Dempsky authored
      Since FuncTypes are represented as structs rather than linking the
      parameter lists together, we no longer need to worry about duplicating
      the parameter lists.
      
      Change-Id: I3767aa3cd1cbeddfb80a6eef6b42290dc2ac14ae
      Reviewed-on: https://go-review.googlesource.com/28574
      Run-TryBot: Matthew Dempsky <mdempsky@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
    • net/http/httputil: copy header map if necessary in ReverseProxy · 24d8f3fa
      Sina Siadat authored
      We were already making a copy of the map before removing
      hop-by-hop headers. This commit does the same for proxied
      headers mentioned in the "Connection" header.
      
      A test is added to ensure request headers are not modified.
      
      Updates #16875
      
      Change-Id: I85329d212787958d5ad818915eb0538580a4653a
      Reviewed-on: https://go-review.googlesource.com/28493
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
      Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
    • go/format: add format.Node example · b6f44923
      Alberto Donizetti authored
      Updates #16360
      
      Change-Id: I5927cffa961cd85539a3ba9606b116c5996d1096
      Reviewed-on: https://go-review.googlesource.com/26696
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
      Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
    • io: add test that MultiReader zeros exhausted Readers · 614dfe9b
      Brad Fitzpatrick authored
      Updates #16983
      Updates #16996
      
      Change-Id: I76390766385b2668632c95e172b2d243d7f66651
      Reviewed-on: https://go-review.googlesource.com/28771
      Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Keith Randall <khr@golang.org>
  2. 07 Sep, 2016 13 commits
  3. 06 Sep, 2016 14 commits
    • bytes: make IndexRune faster · e10286ae
      Hiroshi Ioka authored
      Re-implement IndexRune using IndexByte and Index, which are well
      optimized, to gain performance.
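
A simplified sketch of that approach, using a hypothetical indexRune function: single-byte runes are delegated to bytes.IndexByte, and multi-byte runes are encoded once and searched for with bytes.Index. Unlike the real bytes.IndexRune, this sketch does not give utf8.RuneError any special treatment.

```go
package main

import (
	"bytes"
	"fmt"
	"unicode/utf8"
)

// indexRune returns the byte index of the first occurrence of r in s, or -1.
// Hypothetical, simplified version for illustration.
func indexRune(s []byte, r rune) int {
	if r >= 0 && r < utf8.RuneSelf {
		return bytes.IndexByte(s, byte(r)) // optimized single-byte search
	}
	if !utf8.ValidRune(r) {
		return -1 // simplification: invalid runes are never found
	}
	var enc [utf8.UTFMax]byte
	n := utf8.EncodeRune(enc[:], r)
	return bytes.Index(s, enc[:n]) // optimized substring search
}

func main() {
	fmt.Println(indexRune([]byte("chicken"), 'k'))
	fmt.Println(indexRune([]byte("héllo"), 'é'))
	fmt.Println(indexRune([]byte("héllo"), 'z'))
}
```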
      
      name                  old time/op   new time/op     delta
      IndexRune/10-4         53.2ns ± 1%     29.1ns ± 1%    -45.32%  (p=0.008 n=5+5)
      IndexRune/32-4          191ns ± 1%       27ns ± 1%    -85.75%  (p=0.008 n=5+5)
      IndexRune/4K-4         23.5µs ± 1%      1.0µs ± 1%    -95.77%  (p=0.008 n=5+5)
      IndexRune/4M-4         23.8ms ± 0%      1.0ms ± 2%    -95.90%  (p=0.008 n=5+5)
      IndexRune/64M-4         384ms ± 1%       15ms ± 1%    -95.98%  (p=0.008 n=5+5)
      IndexRuneASCII/10-4    61.5ns ± 0%     10.3ns ± 4%    -83.17%  (p=0.008 n=5+5)
      IndexRuneASCII/32-4     203ns ± 0%       11ns ± 5%    -94.68%  (p=0.008 n=5+5)
      IndexRuneASCII/4K-4    23.4µs ± 0%      0.3µs ± 2%    -98.60%  (p=0.008 n=5+5)
      IndexRuneASCII/4M-4    24.0ms ± 1%      0.3ms ± 1%    -98.60%  (p=0.008 n=5+5)
      IndexRuneASCII/64M-4    386ms ± 2%        6ms ± 1%    -98.57%  (p=0.008 n=5+5)
      
      name                  old speed     new speed       delta
      IndexRune/10-4        188MB/s ± 1%    344MB/s ± 1%    +82.91%  (p=0.008 n=5+5)
      IndexRune/32-4        167MB/s ± 0%   1175MB/s ± 1%   +603.52%  (p=0.008 n=5+5)
      IndexRune/4K-4        174MB/s ± 1%   4117MB/s ± 1%  +2262.71%  (p=0.008 n=5+5)
      IndexRune/4M-4        176MB/s ± 0%   4299MB/s ± 2%  +2340.46%  (p=0.008 n=5+5)
      IndexRune/64M-4       175MB/s ± 1%   4354MB/s ± 1%  +2388.57%  (p=0.008 n=5+5)
      IndexRuneASCII/10-4   163MB/s ± 0%    968MB/s ± 4%   +494.66%  (p=0.008 n=5+5)
      IndexRuneASCII/32-4   157MB/s ± 0%   2974MB/s ± 4%  +1788.59%  (p=0.008 n=5+5)
      IndexRuneASCII/4K-4   175MB/s ± 0%  12481MB/s ± 2%  +7027.71%  (p=0.008 n=5+5)
      IndexRuneASCII/4M-4   175MB/s ± 1%  12510MB/s ± 1%  +7061.15%  (p=0.008 n=5+5)
      IndexRuneASCII/64M-4  174MB/s ± 2%  12143MB/s ± 1%  +6881.70%  (p=0.008 n=5+5)
      
      Change-Id: I0632eadb83937c2a9daa7f0ce79df1dee64f992e
      Reviewed-on: https://go-review.googlesource.com/28537
      Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
    • runtime/debug: enable TestFreeOSMemory on all arches · 8259cf3c
      Austin Clements authored
      TestFreeOSMemory was disabled on many arches because of issue #9993.
      Since that's been fixed, enable the test everywhere.
      
      Change-Id: I298c38c3e04128d9c8a1f558980939d5699bea03
      Reviewed-on: https://go-review.googlesource.com/27403
      Run-TryBot: Austin Clements <austin@google.com>
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
      Reviewed-by: Minux Ma <minux@golang.org>
    • syscall: make Getpagesize return page size from runtime · 1b9499b0
      Austin Clements authored
      syscall.Getpagesize currently returns hard-coded page sizes on all
      architectures (some of which are probably always wrong, and some of
      which are definitely not always right). The runtime now has this
      information, queried from the OS during runtime init, so make
      syscall.Getpagesize return the page size that the runtime knows.
      
      Updates #10180.
      
      Change-Id: I4daa6fbc61a2193eb8fa9e7878960971205ac346
      Reviewed-on: https://go-review.googlesource.com/25051
      Run-TryBot: Austin Clements <austin@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
    • runtime: don't hard-code physical page size · 6dda7b2f
      Austin Clements authored
      Now that the runtime fetches the true physical page size from the OS,
      make the physical page size used by heap growth a variable instead of
      a constant. This isn't used in any performance-critical paths, so it
      shouldn't be an issue.
      
      sys.PhysPageSize is also renamed to sys.DefaultPhysPageSize to make it
      clear that it's not necessarily the true page size. There are no uses
      of this constant any more, but we'll keep it around for now.
      
      Updates #12480 and #10180.
      
      Change-Id: I6c23b9df860db309c38c8287a703c53817754f03
      Reviewed-on: https://go-review.googlesource.com/25022
      Run-TryBot: Austin Clements <austin@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Rick Hudson <rlh@golang.org>
    • runtime: fetch physical page size from the OS · 276a52de
      Austin Clements authored
      Currently the physical page size assumed by the runtime is hard-coded.
      On Linux the runtime at least fetches the OS page size during init and
      sanity checks against the hard-coded value, but they may still differ.
      On other OSes we wouldn't even notice.
      
      Add support on all OSes to fetch the actual OS physical page size
      during runtime init and lift the sanity check of PhysPageSize from the
      Linux init code to general malloc init. Currently this is the only use
      of the retrieved page size, but we'll add more shortly.
      
      Updates #12480 and #10180.
      
      Change-Id: I065f2834bc97c71d3208edc17fd990ec9058b6da
      Reviewed-on: https://go-review.googlesource.com/25050
      Run-TryBot: Austin Clements <austin@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Rick Hudson <rlh@golang.org>
    • runtime: assume 64kB physical pages on ARM · d7de8b6d
      Austin Clements authored
      Currently we assume the physical page size on ARM is 4kB. While this
      is usually true, the architecture also supports 16kB and 64kB physical
      pages, and Linux (and possibly other OSes) can be configured to use
      these larger page sizes.
      
      With Go 1.6, such a configuration could potentially run, but generally
      resulted in memory corruption or random panics. With current master,
      this configuration will cause the runtime to panic during init on
      Linux when it checks the true physical page size (and will still cause
      corruption or panics on other OSes).
      
      However, the assumed physical page size only has to be a multiple of
      the true physical page size, the scavenger can now deal with large
      physical page sizes, and the rest of the runtime can deal with a
      larger assumed physical page size than the true size. Hence, there's
      little disadvantage to conservatively setting the assumed physical
      page size to 64kB on ARM.
      
      This may result in some extra memory use, since we can only return
      memory at multiples of the assumed physical page size. However, it is
      a simple change that should make Go run on systems configured for
      larger page sizes. The following commits will make the runtime query
      the actual physical page size from the OS, but this is a simple step
      toward that.
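
The memory cost mentioned above comes from alignment: only whole pages of the assumed size can be handed back to the OS. The arithmetic can be sketched as follows, where releasable is a hypothetical helper and not runtime code:

```go
package main

import "fmt"

// releasable returns the sub-range of [start, end) made up of whole pages of
// the given size. pageSize must be a power of two. Hypothetical helper.
func releasable(start, end, pageSize uintptr) (lo, hi uintptr) {
	lo = (start + pageSize - 1) &^ (pageSize - 1) // round start up
	hi = end &^ (pageSize - 1)                    // round end down
	if hi <= lo {
		return 0, 0 // no whole page fits in the range
	}
	return lo, hi
}

func main() {
	// A 128 KB free range that starts 4 KB into the address space.
	start, end := uintptr(0x1000), uintptr(0x21000)
	for _, ps := range []uintptr{4 << 10, 64 << 10} {
		lo, hi := releasable(start, end, ps)
		fmt.Printf("page size %d KB: %d KB releasable\n", ps>>10, (hi-lo)>>10)
	}
}
```

With an assumed 4 KB page the whole 128 KB range can be released; with an assumed 64 KB page only the aligned 64 KB in the middle can.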
      
      Updates #12480.
      
      Change-Id: I851829595bc9e0c76235c847a7b5f62ad82b5302
      Reviewed-on: https://go-review.googlesource.com/25021
      Run-TryBot: Austin Clements <austin@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Minux Ma <minux@golang.org>
    • runtime: bound scanobject to ~100 µs · cf4f1d07
      Austin Clements authored
      Currently the time spent in scanobject is proportional to the size of
      the object being scanned. Since scanobject is non-preemptible, large
      objects can cause significant goroutine (and even whole application)
      delays through several means:
      
      1. If a GC assist picks up a large object, the allocating goroutine is
         blocked for the whole scan, even if that scan well exceeds that
         goroutine's debt.
      
      2. Since the scheduler does not run on the P performing a large object
         scan, goroutines in that P's run queue do not run unless they are
         stolen by another P (which can take some time). If there are a few
         large objects, all of the Ps may get tied up so the scheduler
         doesn't run anywhere.
      
      3. Even if a large object is scanned by a background worker and other
         Ps are still running the scheduler, the large object scan doesn't
         flush background credit until the whole scan is done. This can
         easily cause all allocations to block in assists, waiting for
         credit, causing an effective STW.
      
      Fix this by splitting large objects into 128 KB "oblets" and scanning
      at most one oblet at a time. Since we can scan 1–2 MB/ms, this equates
      to bounding scanobject at roughly 100 µs. This improves assist
      behavior both because assists can no longer get "unlucky" and be stuck
      scanning a large object, and because it causes the background worker
      to flush credit and unblock assists more frequently when scanning
      large objects. This also improves GC parallelism if the heap consists
      primarily of a small number of very large objects by letting multiple
      workers scan a large object in parallel.
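
The splitting itself is straightforward to picture. Below is a sketch under the assumption of a hypothetical oblet type and splitOblets function; the runtime's real work-queue handling is not shown. At the quoted scan rate of 1–2 MB/ms, one 128 KB chunk takes roughly 62.5–125 µs, which is where the ~100 µs bound comes from.

```go
package main

import "fmt"

// obletBytes is the 128 KB chunk size named in the commit message.
const obletBytes = 128 << 10

// oblet describes one chunk of a large object: its byte offset and size.
// Hypothetical type for illustration.
type oblet struct{ off, size uintptr }

// splitOblets divides an object of objSize bytes into chunks of at most
// obletBytes, each of which could be queued and scanned independently.
func splitOblets(objSize uintptr) []oblet {
	var out []oblet
	for off := uintptr(0); off < objSize; off += obletBytes {
		size := objSize - off
		if size > obletBytes {
			size = obletBytes
		}
		out = append(out, oblet{off, size})
	}
	return out
}

func main() {
	for _, o := range splitOblets(300 << 10) {
		fmt.Printf("scan bytes [%d, %d)\n", o.off, o.off+o.size)
	}
}
```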
      
      Fixes #10345. Fixes #16293.
      
      This substantially improves goroutine latency in the benchmark from
      issue #16293, which exercises several forms of very large objects:
      
      name                 old max-latency    new max-latency    delta
      SliceNoPointer-12           154µs ± 1%        155µs ±  2%     ~     (p=0.087 n=13+12)
      SlicePointer-12             314ms ± 1%       5.94ms ±138%  -98.11%  (p=0.000 n=19+20)
      SliceLivePointer-12        1148ms ± 0%       4.72ms ±167%  -99.59%  (p=0.000 n=19+20)
      MapNoPointer-12           72509µs ± 1%        408µs ±325%  -99.44%  (p=0.000 n=19+18)
      ChanPointer-12              313ms ± 0%       4.74ms ±140%  -98.49%  (p=0.000 n=18+20)
      ChanLivePointer-12         1147ms ± 0%       3.30ms ±149%  -99.71%  (p=0.000 n=19+20)
      
      name                 old P99.9-latency  new P99.9-latency  delta
      SliceNoPointer-12           113µs ±25%         107µs ±12%     ~     (p=0.153 n=20+18)
      SlicePointer-12          309450µs ± 0%         133µs ±23%  -99.96%  (p=0.000 n=20+20)
      SliceLivePointer-12         961ms ± 0%        1.35ms ±27%  -99.86%  (p=0.000 n=20+20)
      MapNoPointer-12            448µs ±288%         119µs ±18%  -73.34%  (p=0.000 n=18+20)
      ChanPointer-12           309450µs ± 0%         134µs ±23%  -99.96%  (p=0.000 n=20+19)
      ChanLivePointer-12          961ms ± 0%        1.35ms ±27%  -99.86%  (p=0.000 n=20+20)
      
      This has negligible effect on all metrics from the garbage, JSON, and
      HTTP x/benchmarks.
      
      It shows slight improvement on some of the go1 benchmarks,
      particularly Revcomp, which uses some multi-megabyte buffers:
      
      name                      old time/op    new time/op    delta
      BinaryTree17-12              2.46s ± 1%     2.47s ± 1%  +0.32%  (p=0.012 n=20+20)
      Fannkuch11-12                2.82s ± 0%     2.81s ± 0%  -0.61%  (p=0.000 n=17+20)
      FmtFprintfEmpty-12          50.8ns ± 5%    50.5ns ± 2%    ~     (p=0.197 n=17+19)
      FmtFprintfString-12          131ns ± 1%     132ns ± 0%  +0.57%  (p=0.000 n=20+16)
      FmtFprintfInt-12             117ns ± 0%     116ns ± 0%  -0.47%  (p=0.000 n=15+20)
      FmtFprintfIntInt-12          180ns ± 0%     179ns ± 1%  -0.78%  (p=0.000 n=16+20)
      FmtFprintfPrefixedInt-12     186ns ± 1%     185ns ± 1%  -0.55%  (p=0.000 n=19+20)
      FmtFprintfFloat-12           263ns ± 1%     271ns ± 0%  +2.84%  (p=0.000 n=18+20)
      FmtManyArgs-12               741ns ± 1%     742ns ± 1%    ~     (p=0.190 n=19+19)
      GobDecode-12                7.44ms ± 0%    7.35ms ± 1%  -1.21%  (p=0.000 n=20+20)
      GobEncode-12                6.22ms ± 1%    6.21ms ± 1%    ~     (p=0.336 n=20+19)
      Gzip-12                      220ms ± 1%     219ms ± 1%    ~     (p=0.130 n=19+19)
      Gunzip-12                   37.9ms ± 0%    37.9ms ± 1%    ~     (p=1.000 n=20+19)
      HTTPClientServer-12         82.5µs ± 3%    82.6µs ± 3%    ~     (p=0.776 n=20+19)
      JSONEncode-12               16.4ms ± 1%    16.5ms ± 2%  +0.49%  (p=0.003 n=18+19)
      JSONDecode-12               53.7ms ± 1%    54.1ms ± 1%  +0.71%  (p=0.000 n=19+18)
      Mandelbrot200-12            4.19ms ± 1%    4.20ms ± 1%    ~     (p=0.452 n=19+19)
      GoParse-12                  3.38ms ± 1%    3.37ms ± 1%    ~     (p=0.123 n=19+19)
      RegexpMatchEasy0_32-12      72.1ns ± 1%    71.8ns ± 1%    ~     (p=0.397 n=19+17)
      RegexpMatchEasy0_1K-12       242ns ± 0%     242ns ± 0%    ~     (p=0.168 n=17+20)
      RegexpMatchEasy1_32-12      72.1ns ± 1%    72.1ns ± 1%    ~     (p=0.538 n=18+19)
      RegexpMatchEasy1_1K-12       385ns ± 1%     384ns ± 1%    ~     (p=0.388 n=20+20)
      RegexpMatchMedium_32-12      112ns ± 1%     112ns ± 3%    ~     (p=0.539 n=20+20)
      RegexpMatchMedium_1K-12     34.4µs ± 2%    34.4µs ± 2%    ~     (p=0.628 n=18+18)
      RegexpMatchHard_32-12       1.80µs ± 1%    1.80µs ± 1%    ~     (p=0.522 n=18+19)
      RegexpMatchHard_1K-12       54.0µs ± 1%    54.1µs ± 1%    ~     (p=0.647 n=20+19)
      Revcomp-12                   387ms ± 1%     369ms ± 5%  -4.89%  (p=0.000 n=17+19)
      Template-12                 62.3ms ± 1%    62.0ms ± 0%  -0.48%  (p=0.002 n=20+17)
      TimeParse-12                 314ns ± 1%     314ns ± 0%    ~     (p=1.011 n=20+13)
      TimeFormat-12                358ns ± 0%     354ns ± 0%  -1.12%  (p=0.000 n=17+20)
      [Geo mean]                  53.5µs         53.3µs       -0.23%
      
      Change-Id: I2a0a179d1d6bf7875dd054b7693dd12d2a340132
      Reviewed-on: https://go-review.googlesource.com/23540
      Run-TryBot: Austin Clements <austin@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Rick Hudson <rlh@golang.org>
    • runtime: clean up more traces of the old mark bit · b275e55d
      Austin Clements authored
      Commit 59877bfa renamed bitMarked to bitScan, since the bitmap is no
      longer used for marking. However, there were several other references
      to this strewn about comments and in some other constant names. Fix
      these up, too.
      
      Change-Id: I4183d28c6b01977f1d75a99ad55b150f2211772d
      Reviewed-on: https://go-review.googlesource.com/28450
      Run-TryBot: Austin Clements <austin@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Rick Hudson <rlh@golang.org>
    • cmd/compile: remove nil check if followed by storezero on ARM64, MIPS64 · 4d5bb762
      Cherry Zhang authored
      Change-Id: Ib90c92056fa70b27feb734837794ef53e842c41a
      Reviewed-on: https://go-review.googlesource.com/28513
      Run-TryBot: Cherry Zhang <cherryyz@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: David Chase <drchase@google.com>
    • cmd/compile: remove ld/st-followed nil checks for PPC64 · 0e0ab203
      David Chase authored
      Enabled checks (except for DUFF-ops which aren't implemented yet).
      Added ppc64le to relevant test.
      
      Also updated register list to reflect no-longer-reserved-for-constants
      status (file was missed in that change).
      
      Updates #16010.
      
      Change-Id: I31b1aac19e14994f760f2ecd02edbeb1f78362e7
      Reviewed-on: https://go-review.googlesource.com/28548
      Run-TryBot: David Chase <drchase@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Cherry Zhang <cherryyz@google.com>
    • cmd/link: remove outdated cast and comment · b926bf83
      David Crawshaw authored
      This program is written in Go now.
      
      Change-Id: Ieec21a1bcac7c7a59e88cd1e1359977659de1757
      Reviewed-on: https://go-review.googlesource.com/28549
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
      Run-TryBot: David Crawshaw <crawshaw@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
    • regexp: reduce mallocs in Regexp.Find* and Regexp.ReplaceAll*. · bea39e63
      Aliaksandr Valialkin authored
      This improves Regexp.Find* and Regexp.ReplaceAll* speed:
      
      name                  old time/op    new time/op    delta
      Find-4                   345ns ± 1%     314ns ± 1%    -8.94%    (p=0.000 n=9+8)
      FindString-4             341ns ± 1%     308ns ± 0%    -9.85%   (p=0.000 n=10+9)
      FindSubmatch-4           440ns ± 1%     404ns ± 0%    -8.27%   (p=0.000 n=10+8)
      FindStringSubmatch-4     426ns ± 0%     387ns ± 0%    -9.07%   (p=0.000 n=10+9)
      ReplaceAll-4            1.75µs ± 1%    1.67µs ± 0%    -4.45%   (p=0.000 n=9+10)
      
      name                  old alloc/op   new alloc/op   delta
      Find-4                   16.0B ± 0%     0.0B ±NaN%  -100.00%  (p=0.000 n=10+10)
      FindString-4             16.0B ± 0%     0.0B ±NaN%  -100.00%  (p=0.000 n=10+10)
      FindSubmatch-4           80.0B ± 0%     48.0B ± 0%   -40.00%  (p=0.000 n=10+10)
      FindStringSubmatch-4     64.0B ± 0%     32.0B ± 0%   -50.00%  (p=0.000 n=10+10)
      ReplaceAll-4              152B ± 0%      104B ± 0%   -31.58%  (p=0.000 n=10+10)
      
      name                  old allocs/op  new allocs/op  delta
      Find-4                    1.00 ± 0%     0.00 ±NaN%  -100.00%  (p=0.000 n=10+10)
      FindString-4              1.00 ± 0%     0.00 ±NaN%  -100.00%  (p=0.000 n=10+10)
      FindSubmatch-4            2.00 ± 0%      1.00 ± 0%   -50.00%  (p=0.000 n=10+10)
      FindStringSubmatch-4      2.00 ± 0%      1.00 ± 0%   -50.00%  (p=0.000 n=10+10)
      ReplaceAll-4              8.00 ± 0%      5.00 ± 0%   -37.50%  (p=0.000 n=10+10)
      
      Fixes #15643
      
      Change-Id: I594fe51172373e2adb98d1d25c76ca2cde54ff48
      Reviewed-on: https://go-review.googlesource.com/23030
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
      Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
    • cmd/compile: generate table of main symbol types · 5923df1a
      David Crawshaw authored
      For each exported symbol in package main, add its name and type to
      go.plugin.tabs symbol. This is used by the runtime when loading a
      plugin to return a typed interface{} value.
      
      Change-Id: I23c39583e57180acb8f7a74d218dae4368614f46
      Reviewed-on: https://go-review.googlesource.com/27818
      Run-TryBot: David Crawshaw <crawshaw@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Ian Lance Taylor <iant@golang.org>
    • math: fix sqrt regression on AMD64 · 6e703ae7
      Ilya Tocar authored
      1.7 introduced a significant regression compared to 1.6:
      
      SqrtIndirect-4  2.32ns ± 0%  7.86ns ± 0%  +238.79%        (p=0.000 n=20+18)
      
      This is caused by sqrtsd preserving the upper part of the destination register,
      which introduces a dependency on the previous value of X0.
      In 1.6 the benchmark loop didn't use X0 immediately after the call:
      
      callq  *%rbx
      movsd  0x8(%rsp),%xmm2
      movsd  0x20(%rsp),%xmm1
      addsd  %xmm2,%xmm1
      mov    0x18(%rsp),%rax
      inc    %rax
      jmp    loop
      
      In 1.7, however, xmm0 is used just after the call:
      
      callq  *%rbx
      mov    0x10(%rsp),%rcx
      lea    0x1(%rcx),%rax
      movsd  0x8(%rsp),%xmm0
      movsd  0x18(%rsp),%xmm1
      
      I've verified that this is caused by the dependency by inserting
      XORPS X0,X0 at the beginning of math.Sqrt, which puts performance back at the 1.6 level.
      
      Splitting SQRTSD mem,reg into:
      MOVSD mem,reg
      SQRTSD reg,reg
      
      removes the dependency, because MOVSD (load version)
      doesn't need to preserve the upper part of the register,
      and the reg,reg operation is handled by the register renamer in the CPU.
      
      As a result of this change, the regression is gone:
      SqrtIndirect-4  7.86ns ± 0%  2.33ns ± 0%  -70.36%  (p=0.000 n=18+17)
      
      This also removes the old Sqrt benchmarks in favor of benchmarks measuring latency.
      Only SqrtIndirect is kept, to show the impact of this patch.
      
      Change-Id: Ic7eebe8866445adff5bc38192fa8d64c9a6b8872
      Reviewed-on: https://go-review.googlesource.com/28392
      Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
      Reviewed-by: Keith Randall <khr@golang.org>