1. 01 Mar, 2017 10 commits
    • cmd/link: write dwarf relocations · aada4903
      Alex Brainman authored
      For #10776.
      
      Change-Id: I11dd441d8e5d6316889ffa8418df8b58c57c677d
      Reviewed-on: https://go-review.googlesource.com/36982
      Reviewed-by: Ian Lance Taylor <iant@golang.org>
    • os: don't use waitid on Darwin · 15442178
      Ian Lance Taylor authored
      According to issue #19314 waitid on Darwin returns if the process is
      stopped, even though we specify WEXITED.
      
      Fixes #19314.
      
      Change-Id: I95faf196c11e43b7741efff79351bab45c811bc2
      Reviewed-on: https://go-review.googlesource.com/37610
      Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
    • cmd/compile/internal/ssa: remove unused PrintFunc variable · d945b286
      Dave Cheney authored
      Change-Id: I8c581eec77beacaddc0aac29e7d380a4d5ca8acc
      Reviewed-on: https://go-review.googlesource.com/37551
      Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
    • go/internal/srcimporter: parse files concurrently (fixes TODO) · c861a4c7
      Robert Griesemer authored
      Passes go test -race.
      
      Change-Id: I14b5b1b1a8ad1e43d60013823d71d78a6519581f
      Reviewed-on: https://go-review.googlesource.com/37588
      Run-TryBot: Robert Griesemer <gri@golang.org>
      Reviewed-by: Alan Donovan <adonovan@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
    • cmd/compile/internal/gc: separate builtin and real runtime packages · b6c600fc
      Matthew Dempsky authored
      The builtin runtime package definitions intentionally diverge from the
      actual runtime package's, but this only works as long as they never
      overlap.
      
      To make it easier to expand the builtin runtime package, this CL now
      loads their definitions into a logically separate "go.runtime"
      package.  By resetting the package's Prefix field to "runtime", any
      references to builtin definitions will still resolve against the real
      package runtime.
      
      Fixes #14482.
      
      Passes toolstash -cmp.
      
      Change-Id: I539c0994deaed4506a331f38c5b4d6bc8c95433f
      Reviewed-on: https://go-review.googlesource.com/37538
      Run-TryBot: Matthew Dempsky <mdempsky@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Robert Griesemer <gri@golang.org>
    • cmd/vet/all: remove pprof from the whitelist · 12b6c181
      Brad Fitzpatrick authored
      Updates #19322
      
      Change-Id: I610f40d874f499e52db3356a3da54538dac55242
      Reviewed-on: https://go-review.googlesource.com/37618
      Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
      Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
    • cmd/compile: recognize bit test patterns on amd64 · 21831355
      Josh Bleecher Snyder authored
      Updates #18943
      
      Change-Id: If3080d6133bb6d2710b57294da24c90251ab4e08
      Reviewed-on: https://go-review.googlesource.com/36329
      Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Keith Randall <khr@golang.org>
    • cmd/compile, cmd/asm: remove Link.Plists · ac7761e1
      Heschi Kreinick authored
      Link.Plists never contained more than one Plist, and sometimes none.
      Passing around the Plist being worked on is straightforward and makes
      the data flow easier to follow.
      
      Change-Id: I79cb30cb2bd3d319fdbb1dfa5d35b27fcb748e5c
      Reviewed-on: https://go-review.googlesource.com/37169
      Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Matthew Dempsky <mdempsky@google.com>
    • cmd/vendor/github.com/google/pprof: refresh from upstream · ac4a8652
      Raul Silvera authored
      Updating to commit e41fb7133e7ebb84ba6af2f6443032c728db26d3
      from github.com/google/pprof
      
      This fixes #19322
      
      Change-Id: Ia1c008a09f46ed19ef176046e38868eacb715682
      Reviewed-on: https://go-review.googlesource.com/37617
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
      Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
    • compress/flate: use math/bits.Reverse8/16 instead of local implementation · bca03206
      Robert Griesemer authored
      No measurable impact on performance (specifically, no degradation).
      Reverse is used in Huffman en/de-coding. For completeness, here are
      all the speed-related benchmark results:
      
      name                             old time/op    new time/op    delta
      Decode/Digits/Huffman/1e4-8         181µs ± 0%     178µs ± 1%   ~             (p=0.100 n=3+3)
      Decode/Digits/Huffman/1e5-8        1.60ms ± 3%    1.56ms ± 3%   ~             (p=0.400 n=3+3)
      Decode/Digits/Huffman/1e6-8        15.7ms ± 1%    15.3ms ± 3%   ~             (p=0.700 n=3+3)
      Decode/Digits/Speed/1e4-8           179µs ± 0%     180µs ± 0%   ~             (p=0.200 n=3+3)
      Decode/Digits/Speed/1e5-8          1.68ms ± 0%    1.66ms ± 3%   ~             (p=0.700 n=3+3)
      Decode/Digits/Speed/1e6-8          16.6ms ± 2%    16.6ms ± 5%   ~             (p=0.700 n=3+3)
      Decode/Digits/Default/1e4-8         179µs ± 1%     178µs ± 1%   ~             (p=0.700 n=3+3)
      Decode/Digits/Default/1e5-8        1.62ms ± 3%    1.62ms ± 4%   ~             (p=1.000 n=3+3)
      Decode/Digits/Default/1e6-8        16.0ms ± 2%    16.0ms ± 3%   ~             (p=1.000 n=3+3)
      Decode/Digits/Compression/1e4-8     179µs ± 1%     179µs ± 0%   ~             (p=0.200 n=3+3)
      Decode/Digits/Compression/1e5-8    1.62ms ± 2%    1.62ms ± 3%   ~             (p=1.000 n=3+3)
      Decode/Digits/Compression/1e6-8    16.1ms ± 3%    16.0ms ± 3%   ~             (p=1.000 n=3+3)
      Decode/Twain/Huffman/1e4-8          205µs ± 2%     207µs ± 1%   ~             (p=1.000 n=3+3)
      Decode/Twain/Huffman/1e5-8         1.77ms ± 2%    1.77ms ± 4%   ~             (p=0.700 n=3+3)
      Decode/Twain/Huffman/1e6-8         17.4ms ± 2%    17.4ms ± 3%   ~             (p=1.000 n=3+3)
      Decode/Twain/Speed/1e4-8            186µs ± 1%     186µs ± 1%   ~             (p=0.400 n=3+3)
      Decode/Twain/Speed/1e5-8           1.53ms ± 2%    1.52ms ± 0%   ~             (p=0.700 n=3+3)
      Decode/Twain/Speed/1e6-8           14.9ms ± 1%    14.8ms ± 1%   ~             (p=1.000 n=3+3)
      Decode/Twain/Default/1e4-8          176µs ± 1%     174µs ± 0%   ~             (p=0.200 n=3+3)
      Decode/Twain/Default/1e5-8         1.30ms ± 2%    1.31ms ± 1%   ~             (p=0.700 n=3+3)
      Decode/Twain/Default/1e6-8         12.6ms ± 3%    12.5ms ± 0%   ~             (p=0.700 n=3+3)
      Decode/Twain/Compression/1e4-8      177µs ± 0%     174µs ± 1%   ~             (p=0.100 n=3+3)
      Decode/Twain/Compression/1e5-8     1.30ms ± 1%    1.31ms ± 0%   ~             (p=0.700 n=3+3)
      Decode/Twain/Compression/1e6-8     12.5ms ± 1%    12.5ms ± 1%   ~             (p=1.000 n=3+3)
      Encode/Digits/Huffman/1e4-8        47.4µs ± 1%    46.5µs ± 0%   ~             (p=0.100 n=3+3)
      Encode/Digits/Huffman/1e5-8         453µs ± 2%     446µs ± 1%   ~             (p=0.700 n=3+3)
      Encode/Digits/Huffman/1e6-8        4.44ms ± 3%    4.39ms ± 0%   ~             (p=1.000 n=3+3)
      Encode/Digits/Speed/1e4-8           190µs ± 4%     185µs ± 0%   ~             (p=0.100 n=3+3)
      Encode/Digits/Speed/1e5-8          1.78ms ± 5%    1.75ms ± 1%   ~             (p=1.000 n=3+3)
      Encode/Digits/Speed/1e6-8          17.9ms ± 7%    17.3ms ± 1%   ~             (p=0.400 n=3+3)
      Encode/Digits/Default/1e4-8         366µs ± 1%     361µs ± 0%   ~             (p=0.200 n=3+3)
      Encode/Digits/Default/1e5-8        5.58ms ± 5%    5.44ms ± 1%   ~             (p=0.400 n=3+3)
      Encode/Digits/Default/1e6-8        59.0ms ± 3%    58.2ms ± 1%   ~             (p=0.700 n=3+3)
      Encode/Digits/Compression/1e4-8     369µs ± 3%     362µs ± 0%   ~             (p=0.100 n=3+3)
      Encode/Digits/Compression/1e5-8    5.50ms ± 2%    5.47ms ± 1%   ~             (p=1.000 n=3+3)
      Encode/Digits/Compression/1e6-8    59.4ms ± 2%    58.5ms ± 1%   ~             (p=0.400 n=3+3)
      Encode/Twain/Huffman/1e4-8         64.4µs ± 3%    64.7µs ± 1%   ~             (p=0.700 n=3+3)
      Encode/Twain/Huffman/1e5-8          526µs ± 1%     526µs ± 2%   ~             (p=1.000 n=3+3)
      Encode/Twain/Huffman/1e6-8         5.18ms ± 2%    5.17ms ± 1%   ~             (p=0.700 n=3+3)
      Encode/Twain/Speed/1e4-8            206µs ± 1%     204µs ± 0%   ~             (p=0.100 n=3+3)
      Encode/Twain/Speed/1e5-8           1.73ms ± 2%    1.70ms ± 0%   ~             (p=0.100 n=3+3)
      Encode/Twain/Speed/1e6-8           16.7ms ± 0%    16.7ms ± 1%   ~             (p=1.000 n=3+3)
      Encode/Twain/Default/1e4-8          423µs ± 3%     418µs ± 1%   ~             (p=1.000 n=3+3)
      Encode/Twain/Default/1e5-8         6.34ms ± 4%    6.23ms ± 0%   ~             (p=1.000 n=3+3)
      Encode/Twain/Default/1e6-8         68.0ms ± 3%    67.5ms ± 0%   ~             (p=0.700 n=3+3)
      Encode/Twain/Compression/1e4-8      435µs ± 3%     424µs ± 0%   ~             (p=0.700 n=3+3)
      Encode/Twain/Compression/1e5-8     7.01ms ± 1%    6.92ms ± 0%   ~             (p=0.100 n=3+3)
      Encode/Twain/Compression/1e6-8     77.1ms ± 4%    75.5ms ± 1%   ~             (p=0.400 n=3+3)
      
      name                             old speed      new speed      delta
      Decode/Digits/Huffman/1e4-8      55.2MB/s ± 0%  56.2MB/s ± 1%   ~             (p=0.100 n=3+3)
      Decode/Digits/Huffman/1e5-8      62.4MB/s ± 3%  64.1MB/s ± 3%   ~             (p=0.400 n=3+3)
      Decode/Digits/Huffman/1e6-8      63.8MB/s ± 1%  65.3MB/s ± 3%   ~             (p=0.700 n=3+3)
      Decode/Digits/Speed/1e4-8        55.8MB/s ± 0%  55.4MB/s ± 0%   ~             (p=0.200 n=3+3)
      Decode/Digits/Speed/1e5-8        59.6MB/s ± 0%  60.3MB/s ± 3%   ~             (p=0.700 n=3+3)
      Decode/Digits/Speed/1e6-8        60.1MB/s ± 2%  60.3MB/s ± 4%   ~             (p=0.700 n=3+3)
      Decode/Digits/Default/1e4-8      55.8MB/s ± 1%  56.1MB/s ± 1%   ~             (p=0.700 n=3+3)
      Decode/Digits/Default/1e5-8      61.8MB/s ± 3%  61.7MB/s ± 4%   ~             (p=1.000 n=3+3)
      Decode/Digits/Default/1e6-8      62.4MB/s ± 2%  62.4MB/s ± 3%   ~             (p=1.000 n=3+3)
      Decode/Digits/Compression/1e4-8  55.7MB/s ± 1%  56.0MB/s ± 0%   ~             (p=0.300 n=3+3)
      Decode/Digits/Compression/1e5-8  61.7MB/s ± 2%  61.9MB/s ± 3%   ~             (p=1.000 n=3+3)
      Decode/Digits/Compression/1e6-8  62.2MB/s ± 3%  62.6MB/s ± 3%   ~             (p=1.000 n=3+3)
      Decode/Twain/Huffman/1e4-8       48.8MB/s ± 2%  48.4MB/s ± 1%   ~             (p=1.000 n=3+3)
      Decode/Twain/Huffman/1e5-8       56.4MB/s ± 2%  56.6MB/s ± 4%   ~             (p=0.700 n=3+3)
      Decode/Twain/Huffman/1e6-8       57.6MB/s ± 2%  57.5MB/s ± 3%   ~             (p=1.000 n=3+3)
      Decode/Twain/Speed/1e4-8         53.7MB/s ± 1%  53.9MB/s ± 1%   ~             (p=0.400 n=3+3)
      Decode/Twain/Speed/1e5-8         65.5MB/s ± 2%  65.6MB/s ± 0%   ~             (p=0.700 n=3+3)
      Decode/Twain/Speed/1e6-8         66.9MB/s ± 1%  67.4MB/s ± 1%   ~             (p=1.000 n=3+3)
      Decode/Twain/Default/1e4-8       56.9MB/s ± 1%  57.3MB/s ± 0%   ~             (p=0.200 n=3+3)
      Decode/Twain/Default/1e5-8       77.2MB/s ± 2%  76.6MB/s ± 1%   ~             (p=0.700 n=3+3)
      Decode/Twain/Default/1e6-8       79.3MB/s ± 3%  80.0MB/s ± 0%   ~             (p=0.700 n=3+3)
      Decode/Twain/Compression/1e4-8   56.4MB/s ± 0%  57.5MB/s ± 1%   ~             (p=0.100 n=3+3)
      Decode/Twain/Compression/1e5-8   76.8MB/s ± 1%  76.5MB/s ± 0%   ~             (p=0.700 n=3+3)
      Decode/Twain/Compression/1e6-8   80.1MB/s ± 1%  79.8MB/s ± 1%   ~             (p=1.000 n=3+3)
      Encode/Digits/Huffman/1e4-8       211MB/s ± 1%   215MB/s ± 0%   ~             (p=0.100 n=3+3)
      Encode/Digits/Huffman/1e5-8       221MB/s ± 2%   224MB/s ± 1%   ~             (p=0.700 n=3+3)
      Encode/Digits/Huffman/1e6-8       225MB/s ± 3%   228MB/s ± 0%   ~             (p=1.000 n=3+3)
      Encode/Digits/Speed/1e4-8        52.8MB/s ± 4%  54.1MB/s ± 0%   ~             (p=0.100 n=3+3)
      Encode/Digits/Speed/1e5-8        56.2MB/s ± 5%  57.0MB/s ± 1%   ~             (p=1.000 n=3+3)
      Encode/Digits/Speed/1e6-8        56.0MB/s ± 6%  57.7MB/s ± 1%   ~             (p=0.400 n=3+3)
      Encode/Digits/Default/1e4-8      27.3MB/s ± 1%  27.7MB/s ± 0%   ~             (p=0.200 n=3+3)
      Encode/Digits/Default/1e5-8      17.9MB/s ± 4%  18.4MB/s ± 1%   ~             (p=0.400 n=3+3)
      Encode/Digits/Default/1e6-8      17.0MB/s ± 3%  17.2MB/s ± 1%   ~             (p=0.500 n=3+3)
      Encode/Digits/Compression/1e4-8  27.1MB/s ± 3%  27.6MB/s ± 0%   ~             (p=0.100 n=3+3)
      Encode/Digits/Compression/1e5-8  18.2MB/s ± 2%  18.3MB/s ± 1%   ~             (p=1.000 n=3+3)
      Encode/Digits/Compression/1e6-8  16.9MB/s ± 2%  17.1MB/s ± 1%   ~             (p=0.400 n=3+3)
      Encode/Twain/Huffman/1e4-8        155MB/s ± 3%   155MB/s ± 1%   ~             (p=0.700 n=3+3)
      Encode/Twain/Huffman/1e5-8        190MB/s ± 1%   190MB/s ± 2%   ~             (p=1.000 n=3+3)
      Encode/Twain/Huffman/1e6-8        193MB/s ± 2%   193MB/s ± 1%   ~             (p=0.700 n=3+3)
      Encode/Twain/Speed/1e4-8         48.5MB/s ± 1%  49.1MB/s ± 0%   ~             (p=0.100 n=3+3)
      Encode/Twain/Speed/1e5-8         57.7MB/s ± 2%  59.0MB/s ± 0%   ~             (p=0.100 n=3+3)
      Encode/Twain/Speed/1e6-8         59.7MB/s ± 0%  59.7MB/s ± 1%   ~             (p=1.000 n=3+3)
      Encode/Twain/Default/1e4-8       23.6MB/s ± 3%  23.9MB/s ± 1%   ~             (p=1.000 n=3+3)
      Encode/Twain/Default/1e5-8       15.8MB/s ± 4%  16.1MB/s ± 0%   ~             (p=1.000 n=3+3)
      Encode/Twain/Default/1e6-8       14.7MB/s ± 3%  14.8MB/s ± 0%   ~             (p=0.700 n=3+3)
      Encode/Twain/Compression/1e4-8   23.0MB/s ± 3%  23.6MB/s ± 0%   ~             (p=0.700 n=3+3)
      Encode/Twain/Compression/1e5-8   14.3MB/s ± 1%  14.5MB/s ± 0%   ~             (p=0.100 n=3+3)
      Encode/Twain/Compression/1e6-8   13.0MB/s ± 4%  13.2MB/s ± 1%   ~             (p=0.400 n=3+3)
      
      Measured on a "quiet" (no browser running) 2.3 GHz Intel Core i7, running macOS 10.12.3.
      
      See also #19279.
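
As a rough illustration of the swap this commit describes (not the package's actual code), the sketch below reverses the low n bits of a Huffman code in two ways: with a hand-written loop and with `bits.Reverse16`. The helper names `reverseLoop` and `reverseBits` are hypothetical.

```go
package main

import (
	"fmt"
	"math/bits"
)

// reverseLoop reverses the low n bits of code one bit at a time, the kind
// of helper a local implementation might contain.
func reverseLoop(code uint16, n uint) uint16 {
	var r uint16
	for i := uint(0); i < n; i++ {
		r = r<<1 | code&1
		code >>= 1
	}
	return r
}

// reverseBits does the same with math/bits: reverse all 16 bits, then shift
// the result down so that only the n significant bits remain.
func reverseBits(code uint16, n uint) uint16 {
	return bits.Reverse16(code) >> (16 - n)
}

func main() {
	fmt.Printf("%03b\n", reverseLoop(0b110, 3)) // prints 011
	fmt.Printf("%03b\n", reverseBits(0b110, 3)) // prints 011
}
```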
      
      Change-Id: Ice759eb34eb37442b543957447c264e0aadc1fa9
      Reviewed-on: https://go-review.googlesource.com/37460
      Run-TryBot: Robert Griesemer <gri@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
  2. 28 Feb, 2017 26 commits
    • math/bits: move left-over functionality from bits_impl.go to bits.go · 32b41c8d
      Robert Griesemer authored
      Removes an extra function call for TrailingZeros and thus may
      increase chances for inlining.
      
      Change-Id: Iefd8d4402dc89b64baf4e5c865eb3dadade623af
      Reviewed-on: https://go-review.googlesource.com/37613
      Run-TryBot: Robert Griesemer <gri@golang.org>
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
    • cmd/vet/all: disable cgo when running 'go install' · 09294ab7
      Josh Bleecher Snyder authored
      Change-Id: Iab1e84624c0288ebdd33fbe83bd60948b5d91fc4
      Reviewed-on: https://go-review.googlesource.com/37612
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
    • os/exec: remove duplicate environment variables in Cmd.Start · e73f4894
      Brad Fitzpatrick authored
      Nobody intends to have duplicates anyway because it's so undefined
      and everything handles it so poorly.
      
      Removing duplicates automatically simplifies code and makes existing
      code do what people already expect.
      
      Fixes #12868
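
A minimal sketch of deduplicating an environment list. The commit message does not say which duplicate wins; this sketch assumes that a later `KEY=value` replaces an earlier one in place. The function name `dedupEnv` is used here for illustration only.

```go
package main

import (
	"fmt"
	"strings"
)

// dedupEnv returns a copy of env in which each key appears once.
// Assumption: a later KEY=value overrides an earlier one and takes its slot.
func dedupEnv(env []string) []string {
	out := make([]string, 0, len(env))
	seen := make(map[string]int) // key -> index into out
	for _, kv := range env {
		eq := strings.IndexByte(kv, '=')
		if eq < 0 {
			out = append(out, kv) // no '=': keep the entry unchanged
			continue
		}
		key := kv[:eq]
		if i, dup := seen[key]; dup {
			out[i] = kv // overwrite the earlier value
			continue
		}
		seen[key] = len(out)
		out = append(out, kv)
	}
	return out
}

func main() {
	fmt.Println(dedupEnv([]string{"A=1", "B=2", "A=3"})) // prints [A=3 B=2]
}
```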
      
      Change-Id: I95eeba8c59ff94d0f018012a6f4e031aaabfd5d9
      Reviewed-on: https://go-review.googlesource.com/37586
      Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Ian Lance Taylor <iant@golang.org>
    • strings: fix handling of invalid UTF-8 sequences in Map · 3c023f75
      Martin Möhrmann authored
      The new Map implementation introduced in golang.org/cl/33201
      did not distinguish between decoding an invalid UTF-8 sequence
      and decoding the RuneError rune itself. It would therefore always
      advance by 3 bytes (the length of the encoded RuneError rune)
      instead of 1 byte for an invalid sequence. This CL adds a check
      that determines the correct number of bytes to advance to the
      next rune.
      
      Fixes #19330.
      
      Change-Id: I1e7f9333f3ef6068ffc64015bb0a9f32b0b7111d
      Reviewed-on: https://go-review.googlesource.com/37597
      Run-TryBot: Martin Möhrmann <moehrmann@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Joe Tsai <thebrokentoaster@gmail.com>
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
    • cmd/compile: simplify load+op rules · 0fe58bf6
      Keith Randall authored
      There's no need to use @block rules, as canMergeLoad makes sure that
      the load and op are already in the same block.
      With no @block needed, we also don't need to set the type explicitly.
      It can just be inherited from the op being rewritten.
      
      Noticed while working on #19284.
      
      Change-Id: Ied8bcc8058260118ff7e166093112e29107bcb7e
      Reviewed-on: https://go-review.googlesource.com/37585
      Run-TryBot: Keith Randall <khr@golang.org>
      Reviewed-by: Ilya Tocar <ilya.tocar@intel.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
    • go/types: implement SizesFor convenience function · 45055f21
      Robert Griesemer authored
      SizesFor returns a Sizes implementation for a supported architecture.
      Use functionality in srcimporter.
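
A short usage sketch of `types.SizesFor` as it exists in go/types today. The helper `intSize` is hypothetical, and returning -1 for an unknown architecture is a choice made for this example.

```go
package main

import (
	"fmt"
	"go/types"
)

// intSize reports the size in bytes of Go's int for the gc compiler on the
// given architecture, or -1 if SizesFor does not know the architecture.
func intSize(arch string) int64 {
	sizes := types.SizesFor("gc", arch)
	if sizes == nil {
		return -1
	}
	return sizes.Sizeof(types.Typ[types.Int])
}

func main() {
	for _, arch := range []string{"386", "amd64", "no-such-arch"} {
		fmt.Println(arch, intSize(arch))
	}
}
```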
      
      Change-Id: I197e641b419c678030dfaab5c5b8c569fd0410f3
      Reviewed-on: https://go-review.googlesource.com/37583
      Run-TryBot: Robert Griesemer <gri@golang.org>
      Reviewed-by: Alan Donovan <adonovan@google.com>
    • math/bits: faster LeadingZeros and Len functions · 83bc4a2f
      Robert Griesemer authored
      benchmark                     old ns/op     new ns/op     delta
      BenchmarkLeadingZeros-8       8.43          3.10          -63.23%
      BenchmarkLeadingZeros8-8      8.13          1.33          -83.64%
      BenchmarkLeadingZeros16-8     7.34          2.07          -71.80%
      BenchmarkLeadingZeros32-8     7.99          2.87          -64.08%
      BenchmarkLeadingZeros64-8     8.13          2.96          -63.59%
      
      Measured on 2.3 GHz Intel Core i7 running macOS 10.12.3.
      
      Change-Id: Id343531b408d42ac45f10c76f60e85bdb977f91e
      Reviewed-on: https://go-review.googlesource.com/37582
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
    • math/bits: faster TrailingZeroes8 · 9515cb51
      Robert Griesemer authored
      For sizes > 8, the existing code is faster.
      
      benchmark                     old ns/op     new ns/op     delta
      BenchmarkTrailingZeros8-8     1.95          1.29          -33.85%
      
      Measured on 2.3 GHz Intel Core i7 running macOS 10.12.3.
      
      Change-Id: I6f3a33ec633a2c544ec29693c141f2f99335c745
      Reviewed-on: https://go-review.googlesource.com/37581
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
    • math/bits: faster OnesCount using table lookups for sizes 8,16,32 · d7a659b1
      Robert Griesemer authored
      For uint64, the existing algorithm is faster.
      
      benchmark                  old ns/op     new ns/op     delta
      BenchmarkOnesCount8-8      1.95          0.97          -50.26%
      BenchmarkOnesCount16-8     2.54          1.39          -45.28%
      BenchmarkOnesCount32-8     2.61          1.96          -24.90%
      
      Measured on 2.3 GHz Intel Core i7 running macOS 10.12.3.
      
      Change-Id: I6cc42882fef3d24694720464039161e339a9ae99
      Reviewed-on: https://go-review.googlesource.com/37580
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
    • runtime: evacuate old map buckets more consistently · 064e44f2
      Josh Bleecher Snyder authored
      During map growth, buckets are evacuated in two ways.
      When a value is altered, its containing bucket is evacuated.
      Also, an evacuation mark is maintained and advanced every time.
      Prior to this CL, the evacuation mark was always incremented,
      even if the next bucket to be evacuated had already been evacuated.
      This CL changes evacuation mark advancement to skip previously
      evacuated buckets. This has the effect of making map evacuation both
      more aggressive and more consistent.
      
      Aggressive map evacuation is good. While the map is growing,
      map accesses must check two buckets, which may be far apart in memory.
      Map growth also delays garbage collection.
      And if map evacuation is not aggressive enough, there is a risk that
      a populate-once read-many map may be stuck permanently in map growth.
      This CL does not eliminate that possibility, but it shrinks the window.
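
The effect of skipping already evacuated buckets can be seen in a toy model. Everything below is hypothetical: `growingMap`, `growWork`, and `writesUntilDone` only track which old buckets are evacuated and do not mirror the runtime's data structures.

```go
package main

import "fmt"

// growingMap is a toy model of a map mid-growth. It records only which old
// buckets have been evacuated, not any keys or values.
type growingMap struct {
	evacuated []bool // evacuated[i]: old bucket i has been moved
	nevacuate int    // evacuation mark
}

// done reports whether every old bucket has been evacuated.
func (m *growingMap) done() bool {
	for _, e := range m.evacuated {
		if !e {
			return false
		}
	}
	return true
}

// growWork models one write that touches bucket b: evacuate b, then
// evacuate the bucket at the mark and advance the mark. With skip set,
// the mark first moves past buckets that are already evacuated.
func (m *growingMap) growWork(b int, skip bool) {
	m.evacuated[b] = true
	if skip {
		for m.nevacuate < len(m.evacuated) && m.evacuated[m.nevacuate] {
			m.nevacuate++
		}
	}
	if m.nevacuate < len(m.evacuated) {
		m.evacuated[m.nevacuate] = true
		m.nevacuate++
	}
}

// writesUntilDone returns how many writes from touched are needed before
// all n old buckets are evacuated, or -1 if the writes run out first.
func writesUntilDone(n int, touched []int, skip bool) int {
	m := &growingMap{evacuated: make([]bool, n)}
	for i, b := range touched {
		m.growWork(b, skip)
		if m.done() {
			return i + 1
		}
	}
	return -1
}

func main() {
	touched := []int{1, 3, 5, 7, 0, 0, 0, 0, 0, 0, 0, 0}
	fmt.Println("always increment:", writesUntilDone(8, touched, false))
	fmt.Println("skip evacuated:  ", writesUntilDone(8, touched, true))
}
```

In this model, the always-increment variant spends some writes advancing the mark over buckets that need no work, so it needs more writes to finish the same growth.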
      
      There is minimal impact on map benchmarks:
      
      name                         old time/op    new time/op    delta
      MapPop100-8                    12.4µs ±11%    12.4µs ± 7%    ~     (p=0.798 n=15+15)
      MapPop1000-8                    240µs ± 8%     235µs ± 8%    ~     (p=0.217 n=15+14)
      MapPop10000-8                  4.49ms ±10%    4.51ms ±15%    ~     (p=1.000 n=15+13)
      MegMap-8                       11.9ns ± 2%    11.8ns ± 0%  -1.01%  (p=0.000 n=15+11)
      MegOneMap-8                    9.30ns ± 1%    9.29ns ± 1%    ~     (p=0.955 n=14+14)
      MegEqMap-8                     31.9µs ± 5%    31.9µs ± 3%    ~     (p=0.935 n=15+15)
      MegEmptyMap-8                  2.41ns ± 2%    2.41ns ± 0%    ~     (p=0.594 n=12+14)
      SmallStrMap-8                  12.8ns ± 1%    12.7ns ± 1%    ~     (p=0.569 n=14+13)
      MapStringKeysEight_16-8        13.6ns ± 1%    13.7ns ± 2%    ~     (p=0.100 n=13+15)
      MapStringKeysEight_32-8        12.1ns ± 1%    12.1ns ± 2%    ~     (p=0.340 n=15+15)
      MapStringKeysEight_64-8        12.1ns ± 1%    12.1ns ± 2%    ~     (p=0.582 n=15+14)
      MapStringKeysEight_1M-8        12.0ns ± 1%    12.1ns ± 1%    ~     (p=0.267 n=15+14)
      IntMap-8                       7.96ns ± 1%    7.97ns ± 2%    ~     (p=0.991 n=15+13)
      RepeatedLookupStrMapKey32-8    15.8ns ± 2%    15.8ns ± 1%    ~     (p=0.393 n=15+14)
      RepeatedLookupStrMapKey1M-8    35.3µs ± 2%    35.3µs ± 1%    ~     (p=0.815 n=15+15)
      NewEmptyMap-8                  36.0ns ± 4%    36.4ns ± 7%    ~     (p=0.270 n=15+15)
      NewSmallMap-8                  85.5ns ± 1%    85.6ns ± 1%    ~     (p=0.674 n=14+15)
      MapIter-8                      89.9ns ± 6%    90.8ns ± 6%    ~     (p=0.467 n=15+15)
      MapIterEmpty-8                 10.0ns ±22%    10.0ns ±25%    ~     (p=0.846 n=15+15)
      SameLengthMap-8                4.18ns ± 1%    4.17ns ± 1%    ~     (p=0.653 n=15+14)
      BigKeyMap-8                    20.2ns ± 1%    20.1ns ± 1%  -0.82%  (p=0.002 n=15+15)
      BigValMap-8                    22.5ns ± 8%    22.3ns ± 6%    ~     (p=0.615 n=15+15)
      SmallKeyMap-8                  15.3ns ± 1%    15.3ns ± 1%    ~     (p=0.754 n=15+14)
      ComplexAlgMap-8                58.4ns ± 1%    58.7ns ± 1%  +0.52%  (p=0.000 n=14+15)
      
      There is a tiny but detectable difference in the compiler:
      
      name       old time/op      new time/op      delta
      Template        218ms ± 5%       219ms ± 4%    ~     (p=0.094 n=98+98)
      Unicode        93.6ms ± 5%      93.6ms ± 4%    ~     (p=0.910 n=94+95)
      GoTypes         596ms ± 5%       598ms ± 6%    ~     (p=0.533 n=98+100)
      Compiler        2.72s ± 3%       2.72s ± 4%    ~     (p=0.238 n=100+99)
      SSA             4.11s ± 3%       4.11s ± 3%    ~     (p=0.864 n=99+98)
      Flate           129ms ± 6%       129ms ± 4%    ~     (p=0.522 n=98+96)
      GoParser        151ms ± 4%       151ms ± 4%  -0.48%  (p=0.017 n=96+96)
      Reflect         379ms ± 3%       376ms ± 4%  -0.57%  (p=0.011 n=99+99)
      Tar             112ms ± 5%       112ms ± 6%    ~     (p=0.688 n=93+95)
      XML             214ms ± 4%       214ms ± 5%    ~     (p=0.968 n=100+99)
      StdCmd          16.2s ± 2%       16.2s ± 2%  -0.26%  (p=0.048 n=99+99)
      
      name       old user-ns/op   new user-ns/op   delta
      Template   252user-ms ± 4%  250user-ms ± 4%  -0.63%  (p=0.020 n=98+97)
      Unicode    113user-ms ± 7%  114user-ms ± 5%    ~     (p=0.057 n=97+94)
      GoTypes    776user-ms ± 5%  777user-ms ± 5%    ~     (p=0.375 n=97+96)
      Compiler   3.61user-s ± 3%  3.60user-s ± 3%    ~     (p=0.445 n=98+93)
      SSA        5.84user-s ± 6%  5.85user-s ± 5%    ~     (p=0.542 n=100+95)
      Flate      154user-ms ± 5%  154user-ms ± 5%    ~     (p=0.699 n=99+99)
      GoParser   184user-ms ± 6%  183user-ms ± 4%    ~     (p=0.557 n=98+95)
      Reflect    461user-ms ± 5%  462user-ms ± 4%    ~     (p=0.853 n=97+99)
      Tar        130user-ms ± 5%  129user-ms ± 6%    ~     (p=0.567 n=93+100)
      XML        257user-ms ± 6%  258user-ms ± 6%    ~     (p=0.205 n=99+100)
      
      Change-Id: Id92dd54a152904069aac415e6aaaab5c67f5f476
      Reviewed-on: https://go-review.googlesource.com/37011
      Reviewed-by: Keith Randall <khr@golang.org>
      Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
    • cmd/internal/obj: remove unused Getcallerpc function · 9bc67bb4
      Matthew Dempsky authored
      Change-Id: I0c7b677657326f318e906e109cbda0cfa78c4973
      Reviewed-on: https://go-review.googlesource.com/37537
      Run-TryBot: Matthew Dempsky <mdempsky@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Michael Hudson-Doyle <michael.hudson@canonical.com>
    • cmd/compile/ssa: more aggressive constant folding · 379567aa
      philhofer authored
      Add rewrite rules that canonicalize the location
      of constants in expressions, and fold constants
      that appear in operations that can be trivially
      reassociated.
      
      After this change, the compiler constant-folds
      expressions like "4 + x - 1" and "4 & x & 1"
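
The two expressions quoted above can be written out next to their folded forms to show that reassociation preserves the result, including under integer wraparound. The function names are hypothetical, and the sketch checks values only; it does not inspect what the compiler emits.

```go
package main

import "fmt"

// addUnfolded is the source form: two constants separated by a variable.
func addUnfolded(x int32) int32 { return 4 + x - 1 }

// addFolded is the reassociated form: (4 - 1) + x.
func addFolded(x int32) int32 { return x + 3 }

// andUnfolded is the source form of the bitwise example.
func andUnfolded(x uint32) uint32 { return 4 & x & 1 }

// andFolded is the reassociated form: (4 & 1) & x, and 4 & 1 is 0.
func andFolded(x uint32) uint32 { return x & 0 }

func main() {
	for _, x := range []int32{-5, 0, 7, 2147483647} {
		fmt.Println(addUnfolded(x) == addFolded(x))
	}
	for _, x := range []uint32{0, 1, 4, 5, 0xFFFFFFFF} {
		fmt.Println(andUnfolded(x) == andFolded(x))
	}
}
```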
      
      Benchmarks affected on darwin/amd64:
      
      name                     old time/op    new time/op    delta
      FmtFprintfInt-8            82.1ns ± 1%    81.7ns ± 1%  -0.46%  (p=0.023 n=8+9)
      FmtFprintfIntInt-8          122ns ± 2%     120ns ± 2%  -1.48%  (p=0.047 n=10+10)
      FmtManyArgs-8               493ns ± 0%     486ns ± 1%  -1.37%  (p=0.000 n=8+10)
      Gzip-8                      230ms ± 0%     229ms ± 1%  -0.46%  (p=0.001 n=10+9)
      HTTPClientServer-8         74.5µs ± 1%    73.7µs ± 1%  -1.11%  (p=0.000 n=10+10)
      JSONDecode-8               51.7ms ± 0%    51.9ms ± 1%  +0.42%  (p=0.017 n=10+9)
      RegexpMatchEasy0_32-8      82.6ns ± 1%    81.7ns ± 0%  -1.02%  (p=0.000 n=9+8)
      RegexpMatchMedium_32-8      121ns ± 1%     120ns ± 1%  -1.48%  (p=0.001 n=10+10)
      Revcomp-8                   426ms ± 1%     400ms ± 1%  -6.16%  (p=0.000 n=10+10)
      TimeFormat-8                330ns ± 1%     327ns ± 0%  -0.82%  (p=0.000 n=10+10)
      
      name                     old speed      new speed      delta
      Gzip-8                   84.4MB/s ± 0%  84.8MB/s ± 1%  +0.47%  (p=0.001 n=10+9)
      JSONDecode-8             37.6MB/s ± 0%  37.4MB/s ± 1%  -0.42%  (p=0.016 n=10+9)
      RegexpMatchEasy0_32-8     387MB/s ± 1%   392MB/s ± 0%  +1.06%  (p=0.000 n=9+8)
      RegexpMatchMedium_32-8   8.21MB/s ± 1%  8.34MB/s ± 1%  +1.58%  (p=0.000 n=10+9)
      Revcomp-8                 597MB/s ± 1%   636MB/s ± 1%  +6.57%  (p=0.000 n=10+10)
      
      Change-Id: Ie37ff91605b76a984a8400dfd1e34f50bf61c864
      Reviewed-on: https://go-review.googlesource.com/37290
      Reviewed-by: Keith Randall <khr@golang.org>
      Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
    • cmd/compile, runtime: specialize convT2x, don't alloc for zero vals · 504bc3ed
      Josh Bleecher Snyder authored
      Prior to this CL, all runtime conversions
      from a concrete value to an interface went
      through one of two runtime calls: convT2E or convT2I.
      However, in practice, basic types are very common.
      Specializing convT2x for those basic types allows
      for a more efficient implementation for those types.
      For basic scalars and strings, allocation and copying
      can use the same methods as normal code.
      For pointer-free types, allocation can occur without
      zeroing, and copying can take place without GC calls.
      For slices, copying is cheaper and simpler.
      
      This CL adds twelve runtime routines:
      
      convT2E16, convT2I16
      convT2E32, convT2I32
      convT2E64, convT2I64
      convT2Estring, convT2Istring
      convT2Eslice, convT2Islice
      convT2Enoptr, convT2Inoptr
      
      While compiling make.bash, 93% of all convT2x calls
      are now to one of these specialized convT2x routines.
      
      Within specialized convT2x routines, it is cheap to check
      for a zero value, in a way that it is not in general.
      When we detect a zero value there, we return a pointer
      to zeroVal, rather than allocating.
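
The zero-value shortcut can be modelled in ordinary Go. This is a toy sketch, not the runtime's implementation: `box64` and this `zeroVal` are hypothetical stand-ins for a specialized conversion routine and the shared zero buffer, and callers are assumed to treat the result as read-only.

```go
package main

import "fmt"

// zeroVal is a shared, read-only zero that every zero-valued box points to.
var zeroVal uint64

// box64 returns a pointer to a copy of v. For the zero value it returns the
// shared zeroVal instead of allocating a fresh word.
func box64(v uint64) *uint64 {
	if v == 0 {
		return &zeroVal
	}
	p := new(uint64)
	*p = v
	return p
}

func main() {
	fmt.Println(box64(0) == box64(0)) // prints true: both share zeroVal
	fmt.Println(box64(7) == box64(7)) // prints false: separate allocations
	fmt.Println(*box64(7))            // prints 7
}
```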
      
      name                         old time/op  new time/op  delta
      ConvT2Ezero/zero/16-8        17.9ns ± 2%   3.0ns ± 3%  -83.20%  (p=0.000 n=56+56)
      ConvT2Ezero/zero/32-8        17.8ns ± 2%   3.0ns ± 3%  -83.15%  (p=0.000 n=59+60)
      ConvT2Ezero/zero/64-8        20.1ns ± 1%   3.0ns ± 2%  -84.98%  (p=0.000 n=57+57)
      ConvT2Ezero/zero/str-8       32.6ns ± 1%   3.0ns ± 4%  -90.70%  (p=0.000 n=59+60)
      ConvT2Ezero/zero/slice-8     36.7ns ± 2%   3.0ns ± 2%  -91.78%  (p=0.000 n=59+59)
      ConvT2Ezero/zero/big-8       91.9ns ± 2%  85.9ns ± 2%   -6.52%  (p=0.000 n=57+57)
      ConvT2Ezero/nonzero/16-8     17.7ns ± 2%  12.7ns ± 3%  -28.38%  (p=0.000 n=55+60)
      ConvT2Ezero/nonzero/32-8     17.8ns ± 1%  12.7ns ± 1%  -28.44%  (p=0.000 n=54+57)
      ConvT2Ezero/nonzero/64-8     20.0ns ± 1%  15.0ns ± 1%  -24.90%  (p=0.000 n=56+58)
      ConvT2Ezero/nonzero/str-8    32.6ns ± 1%  25.7ns ± 1%  -21.17%  (p=0.000 n=58+55)
      ConvT2Ezero/nonzero/slice-8  36.8ns ± 2%  30.4ns ± 1%  -17.32%  (p=0.000 n=60+52)
      ConvT2Ezero/nonzero/big-8    92.1ns ± 2%  85.9ns ± 2%   -6.70%  (p=0.000 n=57+59)
      
      Benchmarks on a real program (the compiler):
      
      name       old time/op      new time/op      delta
      Template        227ms ± 5%       221ms ± 2%  -2.48%  (p=0.000 n=30+26)
      Unicode         102ms ± 5%       100ms ± 3%  -1.30%  (p=0.009 n=30+26)
      GoTypes         656ms ± 5%       659ms ± 4%    ~     (p=0.208 n=30+30)
      Compiler        2.82s ± 2%       2.82s ± 1%    ~     (p=0.614 n=29+27)
      Flate           128ms ± 2%       128ms ± 5%    ~     (p=0.783 n=27+28)
      GoParser        158ms ± 3%       158ms ± 3%    ~     (p=0.261 n=28+30)
      Reflect         408ms ± 7%       401ms ± 3%    ~     (p=0.075 n=30+30)
      Tar             123ms ± 6%       121ms ± 8%    ~     (p=0.287 n=29+30)
      XML             220ms ± 2%       220ms ± 4%    ~     (p=0.805 n=29+29)
      
      name       old user-ns/op   new user-ns/op   delta
      Template   281user-ms ± 4%  279user-ms ± 3%  -0.87%  (p=0.044 n=28+28)
      Unicode    142user-ms ± 4%  141user-ms ± 3%  -1.04%  (p=0.015 n=30+27)
      GoTypes    884user-ms ± 3%  886user-ms ± 2%    ~     (p=0.532 n=30+30)
      Compiler   3.94user-s ± 3%  3.92user-s ± 1%    ~     (p=0.185 n=30+28)
      Flate      165user-ms ± 2%  165user-ms ± 4%    ~     (p=0.780 n=27+29)
      GoParser   209user-ms ± 2%  208user-ms ± 3%    ~     (p=0.453 n=28+30)
      Reflect    533user-ms ± 6%  526user-ms ± 3%    ~     (p=0.057 n=30+30)
      Tar        156user-ms ± 6%  154user-ms ± 6%    ~     (p=0.133 n=29+30)
      XML        288user-ms ± 4%  288user-ms ± 4%    ~     (p=0.633 n=30+30)
      
      name       old alloc/op     new alloc/op     delta
      Template       41.0MB ± 0%      40.9MB ± 0%  -0.11%  (p=0.000 n=29+29)
      Unicode        32.6MB ± 0%      32.6MB ± 0%    ~     (p=0.572 n=29+30)
      GoTypes         122MB ± 0%       122MB ± 0%  -0.10%  (p=0.000 n=30+30)
      Compiler        482MB ± 0%       481MB ± 0%  -0.07%  (p=0.000 n=30+29)
      Flate          26.6MB ± 0%      26.6MB ± 0%    ~     (p=0.096 n=30+30)
      GoParser       32.7MB ± 0%      32.6MB ± 0%  -0.06%  (p=0.011 n=28+28)
      Reflect        84.2MB ± 0%      84.1MB ± 0%  -0.17%  (p=0.000 n=29+30)
      Tar            27.7MB ± 0%      27.7MB ± 0%  -0.05%  (p=0.032 n=27+28)
      XML            44.7MB ± 0%      44.7MB ± 0%    ~     (p=0.131 n=28+30)
      
      name       old allocs/op    new allocs/op    delta
      Template         373k ± 1%        370k ± 1%  -0.76%  (p=0.000 n=30+30)
      Unicode          325k ± 1%        325k ± 1%    ~     (p=0.383 n=29+30)
      GoTypes         1.16M ± 0%       1.15M ± 0%  -0.75%  (p=0.000 n=29+30)
      Compiler        4.15M ± 0%       4.13M ± 0%  -0.59%  (p=0.000 n=30+29)
      Flate            238k ± 1%        237k ± 1%  -0.62%  (p=0.000 n=30+30)
      GoParser         304k ± 1%        302k ± 1%  -0.64%  (p=0.000 n=30+28)
      Reflect         1.00M ± 0%       0.99M ± 0%  -1.10%  (p=0.000 n=29+30)
      Tar              245k ± 1%        244k ± 1%  -0.59%  (p=0.000 n=27+29)
      XML              391k ± 1%        389k ± 1%  -0.59%  (p=0.000 n=29+30)
      
      Change-Id: Id7f456d690567c2b0a96b0d6d64de8784b6e305f
      Reviewed-on: https://go-review.googlesource.com/36476
      Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Keith Randall <khr@golang.org>
      504bc3ed
    • Cherry Zhang's avatar
      cmd/compile: update signature of runtime.memclr* · f6fc0dd6
      Cherry Zhang authored
      runtime.memclr* functions have signatures
      
      func memclrNoHeapPointers(ptr unsafe.Pointer, n uintptr)
      func memclrHasPointers(ptr unsafe.Pointer, n uintptr)
      
      Update compiler's copy. Also teach gc/mkbuiltin.go to handle
      unsafe.Pointer. The import statement and its support are not
      strictly necessary; they just make it look like real Go code.
      
      Fixes #19185.
      
      Change-Id: I251d02571fde2716d4727e31e04d56ec04b6f22a
      Reviewed-on: https://go-review.googlesource.com/37257
      Run-TryBot: Cherry Zhang <cherryyz@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Austin Clements <austin@google.com>
      f6fc0dd6
    • Josh Bleecher Snyder's avatar
      cmd/vet/all: temporarily ignore vendored pprof · d3d2a67c
      Josh Bleecher Snyder authored
      Change-Id: I3d96b9803dbbd7184f96240bd7944af919ca1376
      Reviewed-on: https://go-review.googlesource.com/37579
      Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
      d3d2a67c
    • Josh Bleecher Snyder's avatar
      cmd/vet: allow shifts by amounts calculated using unsafe · d99d5f7c
      Josh Bleecher Snyder authored
      The real world code that inspired this fix,
      from runtime/pprof/map.go:
      
      	// Compute hash of (stk, tag).
      	h := uintptr(0)
      	for _, x := range stk {
      		h = h<<8 | (h >> (8 * (unsafe.Sizeof(h) - 1)))
      		h += uintptr(x) * 41
      	}
      	h = h<<8 | (h >> (8 * (unsafe.Sizeof(h) - 1)))
      	h += uintptr(tag) * 41
      
      Change-Id: I99a95b97cba73811faedb0b9a1b9b54e9a1784a3
      Reviewed-on: https://go-review.googlesource.com/37574
      Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
      d99d5f7c
    • Josh Bleecher Snyder's avatar
      cmd/vet/all: move suspicious shift whitelists to 64 bit · 016569f2
      Josh Bleecher Snyder authored
      This is an inconsequential consequence of updating
      math/big to use math/bits.
      
      Better would be to teach the vet shift test
      to size int/uint/uintptr to the platform in use,
      eliminating the whole category of "might be too small".
      Filed #19321 for that.
      
      Change-Id: I7e0b837bd329132d7a564468c18502dd2e724fc6
      Reviewed-on: https://go-review.googlesource.com/37576
      Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
      016569f2
    • Brad Fitzpatrick's avatar
      cmd/dist: make the vetall builder have test shards per os/arch · 31f9769c
      Brad Fitzpatrick authored
      This makes the vetall builder friendly to auto-sharding by the build
      coordinator.
      
      Change-Id: I0893f5051ec90e7a6adcb89904ba08cd2d590549
      Reviewed-on: https://go-review.googlesource.com/37572
      Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
      31f9769c
    • Josh Bleecher Snyder's avatar
      cmd/vet/all: exit with non-zero error code on failure · 8defd9f7
      Josh Bleecher Snyder authored
      Change-Id: I68e60b155c583fa47aa5ca13d591851009a4e571
      Reviewed-on: https://go-review.googlesource.com/37571
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
      8defd9f7
    • Michael Munday's avatar
      cmd/compile: emit fused multiply-{add,subtract} instructions on s390x · bd8a39b6
      Michael Munday authored
      Explicitly block fused multiply-add pattern matching when a cast is used
      after the multiplication, for example:
      
          - (a * b) + c        // can emit fused multiply-add
          - float64(a * b) + c // cannot emit fused multiply-add
      
      float{32,64} and complex{64,128} casts of matching types are now kept
      as OCONV operations rather than being replaced with OCONVNOP operations
      because they now imply a rounding operation (and therefore aren't a
      no-op anymore).
      
      Operations (for example, multiplication) on complex types may utilize
      fused multiply-add and -subtract instructions internally. There is no
      way to disable this behavior at the moment.
      
      Improves the performance of the floating point implementation of
      poly1305:
      
      name         old speed     new speed     delta
      64           246MB/s ± 0%  275MB/s ± 0%  +11.48%   (p=0.000 n=10+8)
      1K           312MB/s ± 0%  357MB/s ± 0%  +14.41%  (p=0.000 n=10+10)
      64Unaligned  246MB/s ± 0%  274MB/s ± 0%  +11.43%  (p=0.000 n=10+10)
      1KUnaligned  312MB/s ± 0%  357MB/s ± 0%  +14.39%   (p=0.000 n=10+8)
      
      Updates #17895.
      
      Change-Id: Ia771d275bb9150d1a598f8cc773444663de5ce16
      Reviewed-on: https://go-review.googlesource.com/36963
      Run-TryBot: Michael Munday <munday@ca.ibm.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Keith Randall <khr@golang.org>
      bd8a39b6
    • David du Colombier's avatar
      crypto/sha512: fix checkAVX2 · a38a2d02
      David du Colombier authored
      The checkAVX2 test doesn't appear to be correct,
      because it always returns the value of support_bmi2,
      even if the value of support_avx2 is false.
      
      Consequently, checkAVX2 always returns true, as long
      as BMI2 is supported, even if AVX2 is not supported.
      
      We change checkAVX2 to return false when support_avx2
      is false.
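      
      The real check is written in assembly; the logic error can be
      modeled in Go as follows. The function names and the boolean
      parameters are hypothetical stand-ins for support_avx2 and
      support_bmi2.

```go
package main

import "fmt"

// checkAVX2Old models the buggy logic: the result is whatever the
// BMI2 flag says, and the AVX2 flag is ignored.
func checkAVX2Old(avx2, bmi2 bool) bool {
	return bmi2
}

// checkAVX2New models the fix: report false as soon as AVX2 is
// missing, and only then consult BMI2.
func checkAVX2New(avx2, bmi2 bool) bool {
	if !avx2 {
		return false
	}
	return bmi2
}

func main() {
	// A CPU with BMI2 but without AVX2.
	fmt.Println(checkAVX2Old(false, true)) // true (wrong)
	fmt.Println(checkAVX2New(false, true)) // false
}
```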
      
      Fixes #19316.
      
      Change-Id: I2ec9dfaa09f4b54c4a03d60efef891b955d60578
      Reviewed-on: https://go-review.googlesource.com/37590
      Run-TryBot: David du Colombier <0intro@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Russ Cox <rsc@golang.org>
      a38a2d02
    • Martin Möhrmann's avatar
      cmd/compile: fix assignment order in string range loop · a8f07310
      Martin Möhrmann authored
      Fixes #18376.
      
      Change-Id: I4fe24f479311cd4cd1bdad9a966b681e50e3d500
      Reviewed-on: https://go-review.googlesource.com/35955
      Reviewed-by: Keith Randall <khr@golang.org>
      Run-TryBot: Martin Möhrmann <moehrmann@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      a8f07310
    • Carlo Alberto Ferraris's avatar
      bytes: make bytes.Buffer cache-friendly · 55310403
      Carlo Alberto Ferraris authored
      While benchmarking an internal tool, we found that (*Buffer).Reset() was
      surprisingly showing up in CPU profiles.
      
      This CL contains two related changes aimed at speeding up Reset():
      1. Create a fast path for Truncate(0) by moving the logic to Reset()
         (this makes Reset() a simple leaf func that gets inlined since it
         gets compiled to 3 MOVx instructions). Accordingly change calls in
         the rest of the Buffer methods to call Reset() instead of Truncate(0).
      2. Reorder the fields in the Buffer struct so that frequently accessed
         fields are packed together (buf, off, lastRead). This also makes them
         likely to be in the same cacheline.
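      
      A simplified model of the two changes, assuming a cut-down buffer
      type rather than the real bytes.Buffer. The type and constant names
      (buffer, readOp, opInvalid) are illustrative; only the three field
      names come from the description above.

```go
package main

import "fmt"

type readOp int8

const opInvalid readOp = 0

// buffer is a cut-down stand-in for bytes.Buffer with the frequently
// accessed fields declared next to each other.
type buffer struct {
	buf      []byte // contents are the bytes buf[off:len(buf)]
	off      int    // read position within buf
	lastRead readOp // last read operation, for unread support
}

// Reset is a small leaf function: three plain stores, no calls.
func (b *buffer) Reset() {
	b.buf = b.buf[:0]
	b.off = 0
	b.lastRead = opInvalid
}

// Truncate keeps the first n unread bytes and delegates the n == 0
// case to Reset.
func (b *buffer) Truncate(n int) {
	if n == 0 {
		b.Reset()
		return
	}
	b.lastRead = opInvalid
	if n < 0 || n > len(b.buf)-b.off {
		panic("buffer: truncation out of range")
	}
	b.buf = b.buf[:b.off+n]
}

// Len reports the number of unread bytes.
func (b *buffer) Len() int { return len(b.buf) - b.off }

func main() {
	b := &buffer{buf: []byte("hello world")}
	b.Truncate(5)
	fmt.Println(string(b.buf[b.off:]), b.Len()) // hello 5
	b.Reset()
	fmt.Println(b.Len(), cap(b.buf) > 0) // 0 true
}
```

      Reset keeps the underlying array, so later writes can reuse the
      storage without allocating.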
      
      Ideally Buffer{} would be cacheline-aligned, but I couldn't find a way
      to do this without changing the size of the bootstrap array, and that
      would cause some regressions because it makes duffcopy show up in CPU
      profiles where it wasn't showing up before.
      
      go1 benchmarks are not really affected, but some other benchmarks that
      exercise Buffer more show improvements:
      
      name                     old time/op    new time/op    delta
      BinaryTree17-4              2.46s ± 9%     2.43s ± 3%    ~     (p=0.982 n=14+14)
      Fannkuch11-4                2.98s ± 1%     2.90s ± 1%  -2.58%  (p=0.000 n=15+14)
      FmtFprintfEmpty-4          45.2ns ± 1%    45.2ns ± 1%    ~     (p=0.494 n=14+15)
      FmtFprintfString-4         76.8ns ± 1%    83.1ns ± 2%  +8.23%  (p=0.000 n=10+15)
      FmtFprintfInt-4            78.0ns ± 2%    74.6ns ± 1%  -4.46%  (p=0.000 n=15+15)
      FmtFprintfIntInt-4          113ns ± 1%     109ns ± 2%  -2.91%  (p=0.000 n=13+15)
      FmtFprintfPrefixedInt-4     152ns ± 2%     143ns ± 2%  -6.04%  (p=0.000 n=15+14)
      FmtFprintfFloat-4           224ns ± 1%     222ns ± 2%  -1.08%  (p=0.001 n=15+14)
      FmtManyArgs-4               464ns ± 2%     463ns ± 2%    ~     (p=0.303 n=14+15)
      GobDecode-4                6.25ms ± 2%    6.32ms ± 3%  +1.20%  (p=0.002 n=14+14)
      GobEncode-4                5.41ms ± 2%    5.41ms ± 2%    ~     (p=0.967 n=15+15)
      Gzip-4                      215ms ± 2%     218ms ± 2%  +1.35%  (p=0.002 n=15+15)
      Gunzip-4                   34.3ms ± 2%    34.2ms ± 2%    ~     (p=0.539 n=15+15)
      HTTPClientServer-4         76.4µs ± 2%    75.4µs ± 1%  -1.31%  (p=0.000 n=15+15)
      JSONEncode-4               14.7ms ± 2%    14.6ms ± 3%    ~     (p=0.094 n=14+14)
      JSONDecode-4               48.0ms ± 1%    48.5ms ± 1%  +0.92%  (p=0.001 n=14+12)
      Mandelbrot200-4            4.04ms ± 2%    4.06ms ± 1%    ~     (p=0.108 n=15+13)
      GoParse-4                  2.99ms ± 2%    3.00ms ± 1%    ~     (p=0.130 n=15+13)
      RegexpMatchEasy0_32-4      78.3ns ± 1%    79.5ns ± 1%  +1.51%  (p=0.000 n=15+14)
      RegexpMatchEasy0_1K-4       185ns ± 1%     186ns ± 1%  +0.76%  (p=0.005 n=15+15)
      RegexpMatchEasy1_32-4      79.0ns ± 2%    76.7ns ± 1%  -2.87%  (p=0.000 n=14+15)
      
      name                     old speed      new speed      delta
      GobDecode-4               123MB/s ± 2%   121MB/s ± 3%  -1.18%  (p=0.002 n=14+14)
      GobEncode-4               142MB/s ± 2%   142MB/s ± 1%    ~     (p=0.959 n=15+15)
      Gzip-4                   90.3MB/s ± 2%  89.1MB/s ± 2%  -1.34%  (p=0.002 n=15+15)
      Gunzip-4                  565MB/s ± 2%   567MB/s ± 2%    ~     (p=0.539 n=15+15)
      JSONEncode-4              132MB/s ± 2%   133MB/s ± 3%    ~     (p=0.091 n=14+14)
      JSONDecode-4             40.4MB/s ± 1%  40.0MB/s ± 1%  -0.92%  (p=0.001 n=14+12)
      GoParse-4                19.4MB/s ± 2%  19.3MB/s ± 1%    ~     (p=0.121 n=15+13)
      RegexpMatchEasy0_32-4     409MB/s ± 1%   403MB/s ± 1%  -1.47%  (p=0.000 n=15+14)
      RegexpMatchEasy0_1K-4    5.53GB/s ± 1%  5.49GB/s ± 1%  -0.86%  (p=0.002 n=15+15)
      RegexpMatchEasy1_32-4     405MB/s ± 2%   417MB/s ± 1%  +2.94%  (p=0.000 n=14+15)
      
      name                old time/op  new time/op  delta
      PoolsSingle1K-4     34.9ns ± 2%  30.4ns ± 4%  -12.80%  (p=0.000 n=15+15)
      PoolsSingle64K-4    36.9ns ± 1%  34.4ns ± 4%   -6.72%  (p=0.000 n=14+15)
      PoolsRandomSmall-4  34.8ns ± 3%  29.5ns ± 1%  -15.19%  (p=0.000 n=15+14)
      PoolsRandomLarge-4  38.6ns ± 1%  34.3ns ± 3%  -11.17%  (p=0.000 n=14+15)
      PoolSingle1K-4      26.1ns ± 1%  21.2ns ± 2%  -18.59%  (p=0.000 n=15+14)
      PoolSingle64K-4     26.7ns ± 2%  21.5ns ± 2%  -19.72%  (p=0.000 n=15+15)
      MakeSingle1K-4      24.2ns ± 2%  24.3ns ± 3%     ~     (p=0.132 n=13+15)
      MakeSingle64K-4     6.76µs ± 1%  6.96µs ± 5%   +2.94%  (p=0.002 n=13+13)
      MakeRandomSmall-4    531ns ± 4%   538ns ± 5%     ~     (p=0.066 n=14+15)
      MakeRandomLarge-4    152µs ± 0%   152µs ± 1%   -0.31%  (p=0.001 n=14+13)
      
      Change-Id: I86d7d9d2cac65335baf62214fbb35ba0fd8f9528
      Reviewed-on: https://go-review.googlesource.com/37416
      Run-TryBot: Ian Lance Taylor <iant@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Ian Lance Taylor <iant@golang.org>
      55310403
    • Josh Bleecher Snyder's avatar
      cmd/compile: fold (NegNN (ConstNN ...)) · 417f49a3
      Josh Bleecher Snyder authored
      Fix up and enable a few rules.
      They trigger a handful of times in std,
      despite the frontend handling.
      
      Change-Id: I83378c057cbbc95a4f2b58cd8c36aec0e9dc547f
      Reviewed-on: https://go-review.googlesource.com/37227
      Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Keith Randall <khr@golang.org>
      417f49a3
    • Michael Munday's avatar
      cmd/compile: fix merging of s390x conditional moves into branch conditions · 4dbcb53d
      Michael Munday authored
      A type conversion inserted between MOVD{LT,LE,GT,GE,EQ,NE} and CMPWconst
      by CL 36256 broke the rewrite rule designed to merge the two.
      This results in simple for loops (e.g. for i := 0; i < N; i++ {})
      emitting two comparisons instead of one, plus a conditional move.
      
      This CL explicitly types the input to CMPWconst so that the type conversion
      can be omitted. It also adds a test to check that conditional moves aren't
      emitted for loops with 'less than' conditions (i.e. i < N) on s390x.
      
      Fixes #19227.
      
      Change-Id: Ia39e806ed723791c3c755951aef23f957828ea3e
      Reviewed-on: https://go-review.googlesource.com/37334
      Reviewed-by: Keith Randall <khr@golang.org>
      4dbcb53d
    • Joe Tsai's avatar
      net/url: document the package better · 1b31c9ff
      Joe Tsai authored
      Changes made:
      * Adjust the documented form for a URL to make it more obvious what
      happens when the scheme is missing.
      * Remove references to Go1.5. We are far enough along
      that this distinction no longer matters.
      * Remove the "Opaque" example which provides a hacky and misleading
      use of the Opaque field. This workaround is no longer necessary
      since RawPath was added in Go1.5 and the obvious approach just works:
      	// The raw string "/%2f/" will be sent as expected.
      	req, _ := http.NewRequest("GET", "https://example.com/%2f/", nil)
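      
      How RawPath preserves the escaped form can be checked with a short
      sketch. The helper name parsePaths is hypothetical; url.Parse and
      the Path, RawPath and EscapedPath members are the standard net/url
      API.

```go
package main

import (
	"fmt"
	"net/url"
)

// parsePaths parses raw and returns the decoded path, the raw path
// hint, and the escaped path that would be sent on the wire.
func parsePaths(raw string) (path, rawPath, escaped string, err error) {
	u, err := url.Parse(raw)
	if err != nil {
		return "", "", "", err
	}
	return u.Path, u.RawPath, u.EscapedPath(), nil
}

func main() {
	path, rawPath, escaped, err := parsePaths("https://example.com/%2f/")
	if err != nil {
		panic(err)
	}
	fmt.Println(path)    // ///
	fmt.Println(rawPath) // /%2f/
	fmt.Println(escaped) // /%2f/
}
```

      Path holds the decoded form, so the distinction between "/" and
      "%2f" is lost there; RawPath keeps the original encoding and
      EscapedPath prefers it when it is a valid encoding of Path.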
      
      Fixes #18824
      
      Change-Id: Ie33d27222e06025ce8025f8a0f04b601aaee1513
      Reviewed-on: https://go-review.googlesource.com/36127
      Run-TryBot: Joe Tsai <thebrokentoaster@gmail.com>
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
      1b31c9ff
  3. 27 Feb, 2017 4 commits