1. 10 May, 2017 8 commits
    • cmd/compile: add generic rules to eliminate some unnecessary stores · 4fc498d8
      Michael Munday authored
      Eliminates stores of values that have just been loaded from the same
      location. Handles the common case where there are up to 3 intermediate
      stores to non-overlapping struct fields.
      
      For example the loads and stores of x.a, x.b and x.d in the following
      function are now removed:
      
      type T struct {
      	a, b, c, d int
      }
      
      func f(x *T) {
      	y := *x
      	y.c += 8
      	*x = y
      }
      
      Before this CL (s390x):
      
      TEXT    "".f(SB)
      	MOVD    "".x(R15), R5
      	MOVD    (R5), R1
      	MOVD    8(R5), R2
      	MOVD    16(R5), R0
      	MOVD    24(R5), R4
      	ADD     $8, R0, R3
      	STMG    R1, R4, (R5)
      	RET
      
      After this CL (s390x):
      
      TEXT	"".f(SB)
      	MOVD	"".x(R15), R1
      	MOVD	16(R1), R0
      	ADD	$8, R0, R0
      	MOVD	R0, 16(R1)
      	RET
      
      In total these rules are triggered ~5091 times during all.bash,
      which is broken down as:
      
      Intermediate stores | Triggered
      --------------------+----------
      0                   | 1434
      1                   | 2508
      2                   | 888
      3                   | 261
      --------------------+----------
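At the source level, the rewritten function behaves like one that touches only field c. The sketch below is an illustration rather than part of the CL: g is a hypothetical hand-written equivalent of f, and the program checks that both leave the struct in the same state.

```go
package main

import "fmt"

// T mirrors the struct from the commit message.
type T struct {
	a, b, c, d int
}

// f copies the whole struct, changes one field, and copies it back.
func f(x *T) {
	y := *x
	y.c += 8
	*x = y
}

// g is a hypothetical hand-written equivalent: only field c is
// loaded and stored.
func g(x *T) {
	x.c += 8
}

func main() {
	p := T{1, 2, 3, 4}
	q := p
	f(&p)
	g(&q)
	fmt.Println(p == q, p)
}
```

Whether the compiler emits identical code for f and g depends on the target and compiler version; the example only demonstrates that the two are semantically interchangeable.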
      
      Change-Id: Ia4721ae40146aceec1fdd3e65b0e9283770bfba5
      Reviewed-on: https://go-review.googlesource.com/38793
      Run-TryBot: Michael Munday <munday@ca.ibm.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Keith Randall <khr@golang.org>
    • cmd/compile/internal/ssa: fix generation of ppc64x rules · cb83924d
      Michael Munday authored
      The files PPC64.rules and rewritePPC64.go were out of sync due to
      conflicts between CL 41630 and CL 42145 (i.e. running 'go run *.go'
      in the gen directory resulted in unexpected changes).
      
      Change-Id: I1d409656b66afeab6cb9c6df9b3dcab7859caa75
      Reviewed-on: https://go-review.googlesource.com/43091
      Run-TryBot: Michael Munday <munday@ca.ibm.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Carlos Eduardo Seo <cseo@linux.vnet.ibm.com>
      Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
    • cmd/link: include DW_AT_producer in .debug_info · 41d0bbdc
      David Chase authored
      This can make life easier for Delve (and other debuggers),
      and can help them with bug reports.
      
      Sample producer field (from objdump):
      <48> DW_AT_producer : Go cmd/compile devel +8a59dbf41a Mon May 8 16:02:44 2017 -0400
      
      Change-Id: I0605843c959b53a60a25a3b870aa8755bf5d5b13
      Reviewed-on: https://go-review.googlesource.com/33588
      Run-TryBot: David Chase <drchase@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Ian Lance Taylor <iant@golang.org>
    • reflect: don't panic in ArrayOf if elem size is 0 · 9bced477
      Daniel Martí authored
      We do a division by the elem type size to check if the array size would
      be too large for the virtual address space. This is a silly check if the
      size is 0, but the problem is that it means a division by zero and a
      panic.
      
      Since arrays of empty structs are valid in a regular program, make them
      also work in reflect.
      
      Use a separate, explicit test with struct{}{} to make sure the test for
      a zero-sized type is not confused with the rest.
      
      Fixes #20313.
      
      Change-Id: I47b8b87e6541631280b79227bdea6a0f6035c9e0
      Reviewed-on: https://go-review.googlesource.com/43131
      Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
    • cmd/compile: ppc64x intrinsics for math/bits · 8304d107
      Lynn Boger authored
      This adds math/bits intrinsics for OnesCount, Len, TrailingZeros on
      ppc64x.
      
      benchmark                       old ns/op     new ns/op     delta
      BenchmarkLeadingZeros-16        4.26          1.71          -59.86%
      BenchmarkLeadingZeros16-16      3.04          1.83          -39.80%
      BenchmarkLeadingZeros32-16      3.31          1.82          -45.02%
      BenchmarkLeadingZeros64-16      3.69          1.71          -53.66%
      BenchmarkTrailingZeros-16       2.55          1.62          -36.47%
      BenchmarkTrailingZeros32-16     2.55          1.77          -30.59%
      BenchmarkTrailingZeros64-16     2.78          1.62          -41.73%
      BenchmarkOnesCount-16           3.19          0.93          -70.85%
      BenchmarkOnesCount32-16         2.55          1.18          -53.73%
      BenchmarkOnesCount64-16         3.22          0.93          -71.12%
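
For reference, the functions being intrinsified are ordinary math/bits calls. The sketch below only shows what they compute; the helper summarize is a name invented for the example, and whether a given call is lowered to a single instruction depends on the target and compiler version.

```go
package main

import (
	"fmt"
	"math/bits"
)

// summarize returns the population count, the bit length, and the
// number of trailing zero bits of x.
func summarize(x uint64) (ones, length, trailing int) {
	return bits.OnesCount64(x), bits.Len64(x), bits.TrailingZeros64(x)
}

func main() {
	// 0xF0 is 1111_0000 in binary.
	fmt.Println(summarize(0xF0))
}
```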
      
      Update #18616
      
      I also made a change to bits_test.go because when debugging some failures
      the output was not quite providing the right argument information.
      
      Change-Id: Ia58d31d1777cf4582a4505f85b11a1202ca07d3e
      Reviewed-on: https://go-review.googlesource.com/41630
      Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Carlos Eduardo Seo <cseo@linux.vnet.ibm.com>
      Reviewed-by: Keith Randall <khr@golang.org>
    • reflect: fix String of new array types · a4864094
      Daniel Martí authored
      When constructing a new type for an array type in ArrayOf, we don't
      reset tflag to 0. All the other methods in the package, such as SliceOf,
      do this already. This results in the new array type having weird issues
      when being printed, such as having tflagExtraStar set when it shouldn't.
      
      That flag removes the first char to get rid of '*', but when used
      incorrectly in this case it eats the '[' character leading to broken
      strings like "3]int".
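
A small check of the expected behaviour, assuming a Go release that includes this fix. The helper arrayName is hypothetical; it just wraps reflect.ArrayOf and Type.String.

```go
package main

import (
	"fmt"
	"reflect"
)

// arrayName returns the printed name of the array type [n]E, where E
// is the dynamic type of elem.
func arrayName(n int, elem interface{}) string {
	return reflect.ArrayOf(n, reflect.TypeOf(elem)).String()
}

func main() {
	// With the fix, the leading '[' is preserved.
	fmt.Println(arrayName(3, 0))
}
```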
      
      This was fixed in 56752eb2 for issue #16722, but ArrayOf was missed.
      
      Also give the XM test struct a non-zero size, since a zero-sized
      element leads to a division by zero panic in ArrayOf.
      
      Fixes #20311.
      
      Change-Id: I18f1027fdbe9f71767201e7424269c3ceeb23eb5
      Reviewed-on: https://go-review.googlesource.com/43130
      Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
      Reviewed-by: David Crawshaw <crawshaw@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
    • cmd/compile/internal/gc: rename signatlist to signatset · 266a3b66
      Marvin Stenger authored
      Also change type from map[*types.Type]bool to map[*types.Type]struct{}.
      This is basically a clean-up.
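
The map[K]struct{} form is the usual Go idiom for a set. The sketch below is illustrative only: it uses string keys as a stand-in for *types.Type, and typeSet, add, and has are names invented for the example.

```go
package main

import "fmt"

// typeSet is a set of names. The empty struct value carries no data,
// so the map records membership only.
type typeSet map[string]struct{}

// add inserts name into the set; adding twice is a no-op.
func (s typeSet) add(name string) { s[name] = struct{}{} }

// has reports whether name is in the set.
func (s typeSet) has(name string) bool {
	_, ok := s[name]
	return ok
}

func main() {
	s := typeSet{}
	s.add("uint8")
	s.add("uint8")
	fmt.Println(len(s), s.has("uint8"), s.has("int32"))
}
```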
      
      Change-Id: I167583eff0fa1070a7522647219476033b52b840
      Reviewed-on: https://go-review.googlesource.com/41859
      Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
      Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
    • cmd/compile: use a buffered channel for the function queue · 12137766
      Josh Bleecher Snyder authored
      Updates #20307
      
      With -c=2:
      
      name        old time/op       new time/op       delta
      Template          140ms ± 3%        139ms ± 4%  -1.06%  (p=0.003 n=50+50)
      Unicode          81.1ms ± 4%       81.9ms ± 4%  +0.96%  (p=0.006 n=50+49)
      GoTypes           375ms ± 3%        374ms ± 3%    ~     (p=0.094 n=48+48)
      Compiler          1.69s ± 2%        1.68s ± 2%  -0.41%  (p=0.004 n=49+48)
      SSA               3.05s ± 1%        3.05s ± 2%    ~     (p=0.953 n=47+49)
      Flate            86.3ms ± 2%       85.9ms ± 2%  -0.49%  (p=0.011 n=49+48)
      GoParser         99.5ms ± 3%       99.3ms ± 3%    ~     (p=0.394 n=48+49)
      Reflect           262ms ± 3%        261ms ± 3%    ~     (p=0.354 n=47+49)
      Tar              81.4ms ± 3%       79.7ms ± 4%  -1.98%  (p=0.000 n=47+50)
      XML               133ms ± 3%        133ms ± 3%    ~     (p=0.992 n=50+49)
      [Geo mean]        236ms             235ms       -0.36%
      
      name        old user-time/op  new user-time/op  delta
      Template          249ms ± 5%        242ms ± 7%  -2.61%  (p=0.000 n=48+50)
      Unicode           111ms ± 4%        111ms ± 6%    ~     (p=0.407 n=46+47)
      GoTypes           753ms ± 2%        748ms ± 3%  -0.65%  (p=0.010 n=48+50)
      Compiler          3.28s ± 2%        3.27s ± 2%  -0.40%  (p=0.026 n=49+47)
      SSA               7.03s ± 2%        7.01s ± 3%    ~     (p=0.154 n=45+50)
      Flate             154ms ± 3%        154ms ± 3%    ~     (p=0.306 n=49+49)
      GoParser          180ms ± 4%        179ms ± 4%    ~     (p=0.148 n=48+48)
      Reflect           427ms ± 2%        428ms ± 3%    ~     (p=0.502 n=46+49)
      Tar               142ms ± 5%        135ms ± 9%  -4.83%  (p=0.000 n=46+50)
      XML               247ms ± 3%        247ms ± 4%    ~     (p=0.921 n=49+49)
      [Geo mean]        426ms             422ms       -0.92%
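
The general pattern named in the title can be sketched as follows. This is not the compiler's code: compileAll, the string work items, and the choice of len(fns) as the buffer size are all assumptions made for the illustration. The point is that with a buffered queue the producer can enqueue work without waiting for a worker to be ready.

```go
package main

import (
	"fmt"
	"sync"
)

// compileAll hands each item to one of `workers` goroutines through a
// buffered channel and returns how many items were processed.
func compileAll(fns []string, workers int) int {
	// Buffer sized to hold every item (an assumption of this sketch),
	// so the sends below never block on a busy worker.
	queue := make(chan string, len(fns))

	var (
		wg   sync.WaitGroup
		mu   sync.Mutex
		done int
	)
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for range queue {
				// A real worker would compile the function here.
				mu.Lock()
				done++
				mu.Unlock()
			}
		}()
	}
	for _, fn := range fns {
		queue <- fn
	}
	close(queue)
	wg.Wait()
	return done
}

func main() {
	fmt.Println(compileAll([]string{"f", "g", "h"}, 2))
}
```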
      
      
      Change-Id: I4746234439ddb9a7e5840fc783b8857da6a4a680
      Reviewed-on: https://go-review.googlesource.com/43110
      Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Keith Randall <khr@golang.org>
  2. 09 May, 2017 20 commits
    • cmd/compile: allow OpVarXXX calls to be duplicated in writebarrier blocks · 94a017f3
      Josh Bleecher Snyder authored
      OpVarXXX Values don't generate instructions,
      so there's no reason not to duplicate them,
      and duplicating them generates better code
      (fewer branches).
      
      This requires changing the start/end accounting
      to correctly handle the case in which we have a run
      of Values beginning with an OpVarXXX, e.g.
      OpVarDef, OpZeroWB, OpMoveWB.
      In that case, the sequence of values should begin
      at the OpZeroWB, not the OpVarDef.
      
      This also lays the groundwork for experimenting
      with allowing duplication of some scalar stores.
      
      Shrinks function text sizes a tiny amount:
      
      name        old object-bytes  new object-bytes  delta
      Template           381k ± 0%         381k ± 0%  -0.01%  (p=0.008 n=5+5)
      Unicode            203k ± 0%         203k ± 0%  -0.04%  (p=0.008 n=5+5)
      GoTypes           1.17M ± 0%        1.17M ± 0%  -0.01%  (p=0.008 n=5+5)
      SSA               8.24M ± 0%        8.24M ± 0%  -0.00%  (p=0.008 n=5+5)
      Flate              230k ± 0%         230k ± 0%    ~     (all equal)
      GoParser           286k ± 0%         286k ± 0%    ~     (all equal)
      Reflect           1.00M ± 0%        1.00M ± 0%    ~     (all equal)
      Tar                189k ± 0%         189k ± 0%    ~     (all equal)
      XML                415k ± 0%         415k ± 0%  -0.01%  (p=0.008 n=5+5)
      
      Updates #19838
      
      Change-Id: Ic5ef30855919f1468066eba08ae5c4bd9a01db27
      Reviewed-on: https://go-review.googlesource.com/42011
      Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Cherry Zhang <cherryyz@google.com>
    • cmd/internal/obj, cmd/link: fix st_other field on PPC64 · 5331e7e9
      Ian Lance Taylor authored
      In PPC64 ELF files, the st_other field indicates the number of
      prologue instructions between the global and local entry points.
      We add the instructions in the compiler and assembler if -shared is used.
      We were assuming that the instructions were present when building a
      c-archive or PIE or doing dynamic linking, on the assumption that those
      are the cases where the go tool would be building with -shared.
      That assumption fails when using some other tool, such as Bazel,
      that does not necessarily use -shared in exactly the same way.
      
      This CL records in the object file whether a symbol was compiled
      with -shared (this will be the same for all symbols in a given compilation)
      and uses that information when setting the st_other field.
      
      Fixes #20290.
      
      Change-Id: Ib2b77e16aef38824871102e3c244fcf04a86c6ea
      Reviewed-on: https://go-review.googlesource.com/43051
      Run-TryBot: Ian Lance Taylor <iant@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Michael Hudson-Doyle <michael.hudson@canonical.com>
      Reviewed-by: Matthew Dempsky <mdempsky@google.com>
    • cmd/compile: ignore types when considering tuple select for CSE · 08dca4c6
      Todd Neal authored
      Fixes #20097
      
      Change-Id: I3c9626ccc8cd0c46a7081ea8650b2ff07a5d4fcd
      Reviewed-on: https://go-review.googlesource.com/41505
      Run-TryBot: Todd Neal <todd@tneal.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Keith Randall <khr@golang.org>
    • cmd/compile: change ssa.Type into *types.Type · 46b88c9f
      Josh Bleecher Snyder authored
      When package ssa was created, Type was in package gc.
      To avoid circular dependencies, we used an interface (ssa.Type)
      to represent type information in SSA.
      
      In the Go 1.9 cycle, gri extricated the Type type from package gc.
      As a result, we can now use it in package ssa.
      Now, instead of package types depending on package ssa,
      it is the other way.
      This is a more sensible dependency tree,
      and helps compiler performance a bit.
      
      Though this is a big CL, most of the changes are
      mechanical and uninteresting.
      
      Interesting bits:
      
      * Add new singleton globals to package types for the special
        SSA types Memory, Void, Invalid, Flags, and Int128.
      * Add two new Types, TSSA for the special types,
        and TTUPLE, for SSA tuple types.
        ssa.MakeTuple is now types.NewTuple.
      * Move type comparison result constants CMPlt, CMPeq, and CMPgt
        to package types.
      * We had picked the name "types" in our rules for the handy
        list of types provided by ssa.Config. That conflicted with
        the types package name, so change it to "typ".
      * Update the type comparison routine to handle tuples and special
        types inline.
      * Teach gc/fmt.go how to print special types.
      * We can now eliminate ElemTypes in favor of just Elem,
        and probably also some other duplicated Type methods
        designed to return ssa.Type instead of *types.Type.
      * The ssa tests were using their own dummy types,
        and they were not particularly careful about types in general.
        Of necessity, this CL switches them to use *types.Type;
        it does not make them more type-accurate.
        Unfortunately, using types.Type means initializing a bit
        of the types universe.
        This is prime for refactoring and improvement.
      
      This shrinks ssa.Value; it now fits in a smaller size class
      on 64 bit systems. This doesn't have a giant impact,
      though, since most Values are preallocated in a chunk.
      
      name        old alloc/op      new alloc/op      delta
      Template         37.9MB ± 0%       37.7MB ± 0%  -0.57%  (p=0.000 n=10+8)
      Unicode          28.9MB ± 0%       28.7MB ± 0%  -0.52%  (p=0.000 n=10+10)
      GoTypes           110MB ± 0%        109MB ± 0%  -0.88%  (p=0.000 n=10+10)
      Flate            24.7MB ± 0%       24.6MB ± 0%  -0.66%  (p=0.000 n=10+10)
      GoParser         31.1MB ± 0%       30.9MB ± 0%  -0.61%  (p=0.000 n=10+9)
      Reflect          73.9MB ± 0%       73.4MB ± 0%  -0.62%  (p=0.000 n=10+8)
      Tar              25.8MB ± 0%       25.6MB ± 0%  -0.77%  (p=0.000 n=9+10)
      XML              41.2MB ± 0%       40.9MB ± 0%  -0.80%  (p=0.000 n=10+10)
      [Geo mean]       40.5MB            40.3MB       -0.68%
      
      name        old allocs/op     new allocs/op     delta
      Template           385k ± 0%         386k ± 0%    ~     (p=0.356 n=10+9)
      Unicode            343k ± 1%         344k ± 0%    ~     (p=0.481 n=10+10)
      GoTypes           1.16M ± 0%        1.16M ± 0%  -0.16%  (p=0.004 n=10+10)
      Flate              238k ± 1%         238k ± 1%    ~     (p=0.853 n=10+10)
      GoParser           320k ± 0%         320k ± 0%    ~     (p=0.720 n=10+9)
      Reflect            957k ± 0%         957k ± 0%    ~     (p=0.460 n=10+8)
      Tar                252k ± 0%         252k ± 0%    ~     (p=0.133 n=9+10)
      XML                400k ± 0%         400k ± 0%    ~     (p=0.796 n=10+10)
      [Geo mean]         428k              428k       -0.01%
      
      
      Removing all the interface calls helps non-trivially with CPU, though.
      
      name        old time/op       new time/op       delta
      Template          178ms ± 4%        173ms ± 3%  -2.90%  (p=0.000 n=94+96)
      Unicode          85.0ms ± 4%       83.9ms ± 4%  -1.23%  (p=0.000 n=96+96)
      GoTypes           543ms ± 3%        528ms ± 3%  -2.73%  (p=0.000 n=98+96)
      Flate             116ms ± 3%        113ms ± 4%  -2.34%  (p=0.000 n=96+99)
      GoParser          144ms ± 3%        140ms ± 4%  -2.80%  (p=0.000 n=99+97)
      Reflect           344ms ± 3%        334ms ± 4%  -3.02%  (p=0.000 n=100+99)
      Tar               106ms ± 5%        103ms ± 4%  -3.30%  (p=0.000 n=98+94)
      XML               198ms ± 5%        192ms ± 4%  -2.88%  (p=0.000 n=92+95)
      [Geo mean]        178ms             173ms       -2.65%
      
      name        old user-time/op  new user-time/op  delta
      Template          229ms ± 5%        224ms ± 5%  -2.36%  (p=0.000 n=95+99)
      Unicode           107ms ± 6%        106ms ± 5%  -1.13%  (p=0.001 n=93+95)
      GoTypes           696ms ± 4%        679ms ± 4%  -2.45%  (p=0.000 n=97+99)
      Flate             137ms ± 4%        134ms ± 5%  -2.66%  (p=0.000 n=99+96)
      GoParser          176ms ± 5%        172ms ± 8%  -2.27%  (p=0.000 n=98+100)
      Reflect           430ms ± 6%        411ms ± 5%  -4.46%  (p=0.000 n=100+92)
      Tar               128ms ±13%        123ms ±13%  -4.21%  (p=0.000 n=100+100)
      XML               239ms ± 6%        233ms ± 6%  -2.50%  (p=0.000 n=95+97)
      [Geo mean]        220ms             213ms       -2.76%
      
      
      Change-Id: I15c7d6268347f8358e75066dfdbd77db24e8d0c1
      Reviewed-on: https://go-review.googlesource.com/42145
      Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Keith Randall <khr@golang.org>
    • cmd/compile: add boolean simplification rules · 6a24b2d0
      Josh Bleecher Snyder authored
      These collectively fire a few hundred times during make.bash,
      mostly rewriting XOR SETNE -> SETEQ.
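
The dominant rewrite corresponds to a simple source-level identity: negating a not-equal test gives an equal test. The sketch below checks that identity on a small range of values; the function names are invented for the example, and it says nothing about which instructions a particular compiler version emits.

```go
package main

import "fmt"

// negatedNotEqual is the shape that yields a "not equal" result
// followed by a boolean negation.
func negatedNotEqual(a, b int) bool { return !(a != b) }

// equal is the simplified form.
func equal(a, b int) bool { return a == b }

func main() {
	same := true
	for a := -2; a <= 2; a++ {
		for b := -2; b <= 2; b++ {
			if negatedNotEqual(a, b) != equal(a, b) {
				same = false
			}
		}
	}
	fmt.Println(same)
}
```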
      
      Fixes #17905.
      
      Change-Id: Ic5eb241ee93ed67099da3de11f59e4df9fab64a3
      Reviewed-on: https://go-review.googlesource.com/42491
      Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Keith Randall <khr@golang.org>
    • cmd/compile/internal/ssa: mark boolean instructions commutative · 9aeced65
      Marvin Stenger authored
      Mark AndB, OrB, EqB, and NeqB as commutative.
      
      Change-Id: Ife7cfcb9780cc5dd669617cb52339ab336667da4
      Reviewed-on: https://go-review.googlesource.com/42515
      Reviewed-by: Giovanni Bajo <rasky@develer.com>
      Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
      Reviewed-by: Keith Randall <khr@golang.org>
      Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
    • cmd/compile: make builds reproducible in presence of **byte and **uint8 · 6f2ee0f3
      Josh Bleecher Snyder authored
      CL 39915 introduced sorting of signats by ShortString
      for reproducible builds. But ShortString treats types
      byte and uint8 identically; same for rune and int32.
      CL 39915 attempted to compensate for this by only
      adding the underlying type (uint8) to signats in addsignat.
      
      This only works for byte and uint8. For e.g. *byte and *uint8,
      both get added, and their sort order is random,
      leading to non-reproducible builds.
      
      One fix would be to add yet another type printing mode
      that doesn't eliminate byte and rune, and use it
      for sorting signats. But the formatting routines
      are complicated enough as it is.
      
      Instead, just sort first by ShortString and then by String.
      We can't just use String, because ShortString makes distinctions
      that String doesn't. ShortString is really preferred here;
      String is serving only as a backstop for handling of bytes and runes.
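
The two-level ordering can be sketched like this. The sig type and its fields are hypothetical stand-ins for a type's ShortString and String output, and sortSigs is a name invented for the example; the real compiler sorts its own data structures.

```go
package main

import (
	"fmt"
	"sort"
)

// sig is a hypothetical stand-in for a type signature entry.
type sig struct {
	short string // plays the role of ShortString: *byte and *uint8 print the same
	long  string // plays the role of String: keeps *byte and *uint8 apart
}

// sortSigs orders by short first and uses long only to break ties,
// so the result does not depend on the input order.
func sortSigs(s []sig) {
	sort.Slice(s, func(i, j int) bool {
		if s[i].short != s[j].short {
			return s[i].short < s[j].short
		}
		return s[i].long < s[j].long
	})
}

func main() {
	s := []sig{
		{"*uint8", "*uint8"},
		{"*uint8", "*byte"},
		{"*int", "*int"},
	}
	sortSigs(s)
	fmt.Println(s)
}
```

Without the tie-break, the two entries whose short form is *uint8 compare as equal, and their relative order is left to whatever order they arrived in.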
      
      The long series of types in the test helps increase the odds of
      failure, allowing a smaller number of iterations in the test.
      On my machine, a full test takes 700ms.
      
      Passes toolstash-check.
      
      Updates #19961
      Fixes #20272
      
      name        old alloc/op      new alloc/op      delta
      Template         37.9MB ± 0%       37.9MB ± 0%  +0.12%  (p=0.032 n=5+5)
      Unicode          28.9MB ± 0%       28.9MB ± 0%    ~     (p=0.841 n=5+5)
      GoTypes           110MB ± 0%        110MB ± 0%    ~     (p=0.841 n=5+5)
      Compiler          463MB ± 0%        463MB ± 0%    ~     (p=0.056 n=5+5)
      SSA              1.11GB ± 0%       1.11GB ± 0%  +0.02%  (p=0.016 n=5+5)
      Flate            24.7MB ± 0%       24.8MB ± 0%  +0.14%  (p=0.032 n=5+5)
      GoParser         31.1MB ± 0%       31.1MB ± 0%    ~     (p=0.421 n=5+5)
      Reflect          73.9MB ± 0%       73.9MB ± 0%    ~     (p=1.000 n=5+5)
      Tar              25.8MB ± 0%       25.8MB ± 0%  +0.15%  (p=0.016 n=5+5)
      XML              41.2MB ± 0%       41.2MB ± 0%    ~     (p=0.310 n=5+5)
      [Geo mean]       72.0MB            72.0MB       +0.07%
      
      name        old allocs/op     new allocs/op     delta
      Template           384k ± 0%         385k ± 1%    ~     (p=0.056 n=5+5)
      Unicode            343k ± 0%         344k ± 0%    ~     (p=0.548 n=5+5)
      GoTypes           1.16M ± 0%        1.16M ± 0%    ~     (p=0.421 n=5+5)
      Compiler          4.43M ± 0%        4.44M ± 0%  +0.26%  (p=0.032 n=5+5)
      SSA               9.86M ± 0%        9.87M ± 0%  +0.10%  (p=0.032 n=5+5)
      Flate              237k ± 1%         238k ± 0%  +0.49%  (p=0.032 n=5+5)
      GoParser           319k ± 1%         320k ± 1%    ~     (p=0.151 n=5+5)
      Reflect            957k ± 0%         957k ± 0%    ~     (p=1.000 n=5+5)
      Tar                251k ± 0%         252k ± 1%  +0.49%  (p=0.016 n=5+5)
      XML                399k ± 0%         401k ± 1%    ~     (p=0.310 n=5+5)
      [Geo mean]         739k              741k       +0.26%
      
      Change-Id: Ic27995a8d374d012b8aca14546b1df9d28d30df7
      Reviewed-on: https://go-review.googlesource.com/42955
      Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Robert Griesemer <gri@golang.org>
    • cmd/compile: make "imported and not used" errors deterministic · 9fda4df9
      Josh Bleecher Snyder authored
      If there were more unused imports than
      the maximum default number of errors to report,
      the set of reported imports was non-deterministic.
      
      Fix by accumulating and sorting them prior to output.
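
The accumulate-then-sort approach can be sketched as below. This is an illustration, not the compiler's code: reportUnused is a hypothetical helper, and sorting by import path is an assumption of the sketch (the CL may order by something else, such as source position).

```go
package main

import (
	"fmt"
	"sort"
)

// reportUnused collects the unused import paths, sorts them, and only
// then truncates to max entries. Go map iteration order is randomized,
// so truncating before sorting would pick a different subset per run.
func reportUnused(unused map[string]bool, max int) []string {
	names := make([]string, 0, len(unused))
	for name := range unused {
		names = append(names, name)
	}
	sort.Strings(names)
	if len(names) > max {
		names = names[:max]
	}
	return names
}

func main() {
	unused := map[string]bool{"os": true, "fmt": true, "io": true, "bytes": true}
	fmt.Println(reportUnused(unused, 2))
}
```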
      
      Fixes #20298
      
      Change-Id: Ib3d5a15fd7dc40009523fcdc1b93ddc62a1b05f2
      Reviewed-on: https://go-review.googlesource.com/42954
      Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Robert Griesemer <gri@golang.org>
    • cmd/internal/obj/arm64, cmd/compile: improve offset folding on ARM64 · fb0ccc5d
      Cherry Zhang authored
      The ARM64 assembler backend only accepts loads and stores with
      small or aligned offsets. The compiler can therefore only fold
      small or aligned offsets into loads and stores. For locals and
      args, the offsets from SP are not known until very late, so the
      compiler makes the conservative decision not to fold some of them.
      However, in most cases the offset is in fact small or aligned and
      could have been folded into the load or store, but is not.
      
      This CL adds support for loads and stores with large or unaligned
      offsets. When the offset doesn't fit into the instruction, the
      assembler uses two instructions and (for very large offsets) the
      constant pool. This way, the compiler doesn't need to be
      conservative and can simply fold the offset.
      
      To make this work, the assembler's optab matching rules need to
      change. Before, MOVD accepted C_UAUTO32K, which matches multiples
      of 8 between 0 and 32K, and also C_UAUTO16K, which may not be a
      multiple of 8 and may not fit into a MOVD instruction. The
      assembler reported an error in the latter case. With this change,
      MOVD only matches multiples of 8 (or offsets within ±256, which
      also fit in the instruction), and uses the
      large-or-unaligned-offset rule for anything that doesn't fit
      (without error). The rules for other move sizes are changed
      similarly.
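
The "fits in the instruction" test for an 8-byte move can be sketched as a predicate. This is not the assembler's code: fitsMOVD is a hypothetical name, and the exact bounds (0 to 32760 in steps of 8 for the scaled form, -256 to 255 for the unscaled form) are my reading of the A64 load/store immediate encodings rather than something stated in the commit message.

```go
package main

import "fmt"

// fitsMOVD reports whether off can be encoded directly in a single
// 8-byte load or store, under the assumptions stated above.
func fitsMOVD(off int64) bool {
	// Scaled form: unsigned 12-bit immediate multiplied by 8.
	scaled := off >= 0 && off <= 32760 && off%8 == 0
	// Unscaled form: signed 9-bit immediate, any alignment.
	unscaled := off >= -256 && off <= 255
	return scaled || unscaled
}

func main() {
	for _, off := range []int64{8, 1000, 1001, 32760, 32768, -256, 257} {
		fmt.Println(off, fitsMOVD(off))
	}
}
```

Offsets for which the predicate is false are the ones that, after this CL, take the two-instruction (or constant pool) path instead of producing an assembler error.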
      
      Classes C_UAUTO64K and C_UOREG64K are removed, as they are never
      used.
      
      In shared library mode, a load or store of a global is rewritten
      to use the GOT and the temp register, which conflicts with the use
      of the temp register for assembling large offsets. So the folding
      is disabled for globals in shared library mode.
      
      Reduces the cmd/go binary size by 2%.
      
      name                     old time/op    new time/op    delta
      BinaryTree17-8              8.67s ± 0%     8.61s ± 0%   -0.60%  (p=0.000 n=9+10)
      Fannkuch11-8                6.24s ± 0%     6.19s ± 0%   -0.83%  (p=0.000 n=10+9)
      FmtFprintfEmpty-8           116ns ± 0%     116ns ± 0%     ~     (all equal)
      FmtFprintfString-8          196ns ± 0%     192ns ± 0%   -1.89%  (p=0.000 n=10+10)
      FmtFprintfInt-8             199ns ± 0%     198ns ± 0%   -0.35%  (p=0.001 n=9+10)
      FmtFprintfIntInt-8          294ns ± 0%     293ns ± 0%   -0.34%  (p=0.000 n=8+8)
      FmtFprintfPrefixedInt-8     318ns ± 1%     318ns ± 1%     ~     (p=1.000 n=10+10)
      FmtFprintfFloat-8           537ns ± 0%     531ns ± 0%   -1.17%  (p=0.000 n=9+10)
      FmtManyArgs-8              1.19µs ± 1%    1.18µs ± 1%   -1.41%  (p=0.001 n=10+10)
      GobDecode-8                17.2ms ± 1%    17.3ms ± 2%     ~     (p=0.165 n=10+10)
      GobEncode-8                14.7ms ± 1%    14.7ms ± 2%     ~     (p=0.631 n=10+10)
      Gzip-8                      837ms ± 0%     836ms ± 0%   -0.14%  (p=0.006 n=9+10)
      Gunzip-8                    141ms ± 0%     139ms ± 0%   -1.24%  (p=0.000 n=9+10)
      HTTPClientServer-8          256µs ± 1%     253µs ± 1%   -1.35%  (p=0.000 n=10+10)
      JSONEncode-8               40.1ms ± 1%    41.3ms ± 1%   +3.06%  (p=0.000 n=10+9)
      JSONDecode-8                157ms ± 1%     156ms ± 1%   -0.83%  (p=0.001 n=9+8)
      Mandelbrot200-8            8.94ms ± 0%    8.94ms ± 0%   +0.02%  (p=0.000 n=9+9)
      GoParse-8                  8.69ms ± 0%    8.54ms ± 1%   -1.69%  (p=0.000 n=8+10)
      RegexpMatchEasy0_32-8       227ns ± 1%     228ns ± 1%   +0.48%  (p=0.016 n=10+9)
      RegexpMatchEasy0_1K-8      1.92µs ± 0%    1.63µs ± 0%  -15.08%  (p=0.000 n=10+9)
      RegexpMatchEasy1_32-8       256ns ± 0%     251ns ± 0%   -2.19%  (p=0.000 n=10+9)
      RegexpMatchEasy1_1K-8      2.38µs ± 0%    2.09µs ± 0%  -12.49%  (p=0.000 n=10+9)
      RegexpMatchMedium_32-8      352ns ± 0%     354ns ± 0%   +0.39%  (p=0.002 n=10+9)
      RegexpMatchMedium_1K-8      106µs ± 0%     106µs ± 0%   -0.05%  (p=0.005 n=10+9)
      RegexpMatchHard_32-8       5.92µs ± 0%    5.89µs ± 0%   -0.40%  (p=0.000 n=9+8)
      RegexpMatchHard_1K-8        180µs ± 0%     179µs ± 0%   -0.14%  (p=0.000 n=10+9)
      Revcomp-8                   1.20s ± 0%     1.13s ± 0%   -6.29%  (p=0.000 n=9+8)
      Template-8                  159ms ± 1%     154ms ± 1%   -3.14%  (p=0.000 n=9+10)
      TimeParse-8                 800ns ± 3%     769ns ± 1%   -3.91%  (p=0.000 n=10+10)
      TimeFormat-8                826ns ± 2%     817ns ± 2%   -1.04%  (p=0.050 n=10+10)
      [Geo mean]                  145µs          143µs        -1.79%
      
      Change-Id: I5fc42087cee9b54ea414f8ef6d6d020b80eb5985
      Reviewed-on: https://go-review.googlesource.com/42172
      Run-TryBot: Cherry Zhang <cherryyz@google.com>
      Reviewed-by: David Chase <drchase@google.com>
    • cmd/go: enable concurrent backend compilation by default · 5e0bcb38
      Josh Bleecher Snyder authored
      It can be disabled by setting the environment variable
      GO19CONCURRENTCOMPILATION=0, or with -gcflags=-c=1.
      
      Fixes #15756.
      
      Change-Id: I7acbf16330512b62ee14ecbab1f46b53ec5a67b6
      Reviewed-on: https://go-review.googlesource.com/41820
      Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Keith Randall <khr@golang.org>
    • cmd/go: add support for concurrent backend compilation · f4e5bd48
      Josh Bleecher Snyder authored
      It is disabled by default.
      It can be enabled by setting the environment variable
      GO19CONCURRENTCOMPILATION=1.
      
      Benchmarking results are presented in a grid.
      Columns are different values of c (compiler backend concurrency);
      rows are different values of p (process concurrency).
      
      'go build -a std cmd', a 4 core raspberry pi 3:
      
                  c=1        c=2        c=4
      StdCmd/p=1  504s ± 2%  413s ± 4%  367s ± 3%
      StdCmd/p=2  314s ± 3%  266s ± 4%  267s ± 4%
      StdCmd/p=4  254s ± 5%  241s ± 5%  238s ± 6%
      
      'go build -a std cmd', an 8 core darwin/amd64 laptop:
      
                  c=1         c=2         c=4         c=6         c=8
      StdCmd/p=1  40.4s ± 7%  31.0s ± 1%  27.3s ± 1%  27.8s ± 0%  27.7s ± 0%
      StdCmd/p=2  21.9s ± 1%  17.9s ± 1%  16.9s ± 1%  17.0s ± 1%  17.2s ± 0%
      StdCmd/p=4  17.4s ± 2%  14.5s ± 2%  13.3s ± 2%  13.5s ± 2%  13.6s ± 2%
      StdCmd/p=6  16.9s ± 1%  14.2s ± 2%  13.1s ± 2%  13.2s ± 2%  13.3s ± 2%
      StdCmd/p=8  16.7s ± 2%  14.2s ± 2%  13.2s ± 3%  13.2s ± 2%  13.4s ± 2%
      
      'go build -a std cmd', a 96 core arm64 server:
      
                   c=1         c=2         c=4         c=6         c=8         c=16        c=32        c=64        c=96
      StdCmd/p=1    173s ± 1%   133s ± 1%   114s ± 1%   109s ± 1%   106s ± 0%   106s ± 1%   107s ± 1%   110s ± 1%   113s ± 1%
      StdCmd/p=2   94.2s ± 2%  71.5s ± 1%  61.7s ± 1%  58.7s ± 1%  57.5s ± 2%  56.9s ± 1%  58.0s ± 1%  59.6s ± 1%  61.0s ± 1%
      StdCmd/p=4   74.1s ± 2%  53.5s ± 1%  43.7s ± 2%  40.5s ± 1%  39.2s ± 2%  38.9s ± 2%  39.5s ± 3%  40.3s ± 2%  40.8s ± 1%
      StdCmd/p=6   69.3s ± 1%  50.2s ± 2%  40.3s ± 2%  37.3s ± 3%  36.0s ± 3%  35.3s ± 2%  36.0s ± 2%  36.8s ± 2%  37.5s ± 2%
      StdCmd/p=8   66.1s ± 2%  47.7s ± 2%  38.6s ± 2%  35.7s ± 2%  34.4s ± 1%  33.6s ± 2%  34.2s ± 2%  34.6s ± 1%  35.0s ± 1%
      StdCmd/p=16  63.4s ± 2%  45.3s ± 2%  36.3s ± 2%  33.3s ± 2%  32.0s ± 3%  31.6s ± 2%  32.1s ± 2%  32.5s ± 2%  32.7s ± 2%
      StdCmd/p=32  62.2s ± 1%  44.2s ± 2%  35.3s ± 2%  32.4s ± 2%  31.2s ± 2%  30.9s ± 2%  31.1s ± 2%  31.7s ± 2%  32.0s ± 2%
      StdCmd/p=64  62.2s ± 1%  44.3s ± 2%  35.4s ± 2%  32.4s ± 2%  31.2s ± 2%  30.9s ± 2%  31.2s ± 2%  31.8s ± 3%  32.2s ± 3%
      StdCmd/p=96  62.2s ± 2%  44.4s ± 2%  35.3s ± 2%  32.3s ± 2%  31.1s ± 2%  30.9s ± 3%  31.3s ± 2%  31.7s ± 1%  32.1s ± 2%
      
      benchjuju, an 8 core darwin/amd64 laptop:
      
                     c=1         c=2         c=4         c=6         c=8
      BuildJuju/p=1  55.3s ± 0%  46.3s ± 0%  41.9s ± 0%  41.4s ± 1%  41.3s ± 0%
      BuildJuju/p=2  33.7s ± 1%  28.4s ± 1%  26.7s ± 1%  26.6s ± 1%  26.8s ± 1%
      BuildJuju/p=4  24.7s ± 1%  22.3s ± 1%  21.4s ± 1%  21.7s ± 1%  21.8s ± 1%
      BuildJuju/p=6  20.6s ± 1%  19.3s ± 2%  19.4s ± 1%  19.7s ± 1%  19.9s ± 1%
      BuildJuju/p=8  20.6s ± 2%  19.5s ± 2%  19.3s ± 2%  19.6s ± 1%  19.8s ± 2%
      
      Updates #15756
      
      Change-Id: I8a56e88953071a05eee764002024c54cd888a56c
      Reviewed-on: https://go-review.googlesource.com/41819
      Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
      f4e5bd48
    • Robert Griesemer's avatar
      spec: clarify unsafe.Pointer conversions · 86f5f7fd
      Robert Griesemer authored
      A pointer type of underlying type unsafe.Pointer can be used in
      unsafe conversions. Document unfortunate status quo.
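
A small sketch of what this permits. The type name Ptr and the helper roundTrip are made up for illustration and do not come from the spec change.

```go
package main

import (
	"fmt"
	"unsafe"
)

// Ptr is a hypothetical defined type whose underlying type is unsafe.Pointer.
type Ptr unsafe.Pointer

// roundTrip converts *int to Ptr and back to *int.
func roundTrip(x *int) *int {
	p := Ptr(unsafe.Pointer(x)) // ordinary conversion: identical underlying types
	return (*int)(p)            // Ptr takes part in an unsafe conversion directly
}

func main() {
	v := 42
	q := roundTrip(&v)
	*q = 7
	fmt.Println(v) // prints 7
}
```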
      
      Fixes #19306.
      
      Change-Id: I28172508a200561f8df366bbf2c2807ef3b48c97
      Reviewed-on: https://go-review.googlesource.com/42132
      Reviewed-by: Matthew Dempsky <mdempsky@google.com>
      Reviewed-by: Russ Cox <rsc@golang.org>
      Reviewed-by: Ian Lance Taylor <iant@golang.org>
      86f5f7fd
    • Ibrahim AshShohail's avatar
      go/token: remove excess parenthesis in NoPos.IsValid() documentation · 54102963
      Ibrahim AshShohail authored
      Fixes #20294
      
      Change-Id: I32ac862fe00180210a04103cc94c4d9fef5d1b6c
      Reviewed-on: https://go-review.googlesource.com/42992
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
      54102963
    • Austin Clements's avatar
      runtime/pprof: deflake TestGoroutineCounts · d659682d
      Austin Clements authored
      TestGoroutineCounts currently depends on timing to get 100 goroutines
      to a known blocking point before taking a profile. This fails
      frequently, with different goroutines captured at different stacks.
      The test is disabled on openbsd because it was too flaky, but in fact
      it flakes on all platforms.
      
      Fix this by using Gosched instead of timing. This is both much more
      reliable and makes the test run faster.
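
A rough sketch of the general idea, not the test's actual code: yield with runtime.Gosched until every goroutine has announced that it is about to block, instead of sleeping for a fixed time. The function name parkAndRelease and the atomic counter are assumptions added for this illustration.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
	"sync/atomic"
)

// parkAndRelease (hypothetical) starts n goroutines that announce themselves
// just before blocking on a channel. The caller yields with runtime.Gosched,
// not a sleep, until all n have announced, then releases them. It returns
// the number of goroutines observed.
func parkAndRelease(n int) int64 {
	var (
		ready   int64
		wg      sync.WaitGroup
		release = make(chan struct{})
	)
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			atomic.AddInt64(&ready, 1)
			<-release // the known blocking point
		}()
	}
	for atomic.LoadInt64(&ready) < int64(n) {
		runtime.Gosched() // let the other goroutines run
	}
	seen := atomic.LoadInt64(&ready)
	close(release)
	wg.Wait()
	return seen
}

func main() {
	fmt.Println(parkAndRelease(100)) // prints 100
}
```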
      
      Fixes #15156.
      
      Change-Id: Ia6e894196d717655b8fb4ee96df53f6cc8bc5f1f
      Reviewed-on: https://go-review.googlesource.com/42953
      Run-TryBot: Austin Clements <austin@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Ian Lance Taylor <iant@golang.org>
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
      d659682d
    • Ian Lance Taylor's avatar
      cmd/go: put user flags after code generation flag · 9eacd977
      Ian Lance Taylor authored
      This permits the user to override the code generation flag when they
      know better. This is always a good policy for all flags automatically
      inserted by the build system.
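
To illustrate why the order matters, here is a sketch using the standard flag package, where the last occurrence of a flag is the one in effect. The flag name -shared and both helpers are hypothetical and are not taken from cmd/go; other tools may resolve repeated flags differently.

```go
package main

import (
	"flag"
	"fmt"
	"io"
)

// buildArgs (hypothetical) puts the automatically inserted flag first and
// the user's flags after it.
func buildArgs(autoFlag string, userFlags []string) []string {
	args := make([]string, 0, 1+len(userFlags))
	args = append(args, autoFlag)
	return append(args, userFlags...)
}

// effectiveShared parses args with a hypothetical boolean -shared flag and
// reports the value in effect after parsing.
func effectiveShared(args []string) (bool, error) {
	fs := flag.NewFlagSet("tool", flag.ContinueOnError)
	fs.SetOutput(io.Discard)
	shared := fs.Bool("shared", false, "hypothetical code generation flag")
	if err := fs.Parse(args); err != nil {
		return false, err
	}
	return *shared, nil
}

func main() {
	args := buildArgs("-shared=true", []string{"-shared=false"})
	v, err := effectiveShared(args)
	fmt.Println(args, v, err) // prints [-shared=true -shared=false] false <nil>
}
```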
      
      Doing this now so that I can write a test for #20290.
      
      Update #20290
      
      Change-Id: I5c6708a277238d571b8d037993a5a59e2a442e98
      Reviewed-on: https://go-review.googlesource.com/42952
      Run-TryBot: Ian Lance Taylor <iant@golang.org>
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      9eacd977
    • Rob Phoenix's avatar
      net: fix ExampleParseCIDR IPv4 prefix length · 1e732ca3
      Rob Phoenix authored
      Issue #15228 describes that reserved address blocks should be used for
      documentation purposes. This change updates the prefix length so the
      IPv4 address adheres to this.
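
For illustration, a sketch that parses a CIDR string from the 192.0.2.0/24 block reserved for documentation. The exact address used in ExampleParseCIDR is not quoted in this message, so the value below is an assumption, and parse is a hypothetical wrapper.

```go
package main

import (
	"fmt"
	"net"
)

// parse (hypothetical wrapper) returns the address and network of a CIDR
// string in their textual forms.
func parse(s string) (ip, network string, err error) {
	addr, ipnet, err := net.ParseCIDR(s)
	if err != nil {
		return "", "", err
	}
	return addr.String(), ipnet.String(), nil
}

func main() {
	ip, network, err := parse("192.0.2.1/24")
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(ip)      // prints 192.0.2.1
	fmt.Println(network) // prints 192.0.2.0/24
}
```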
      
      Change-Id: I237d9cce1a71f4fd95f927ec894ce53fa806047f
      Reviewed-on: https://go-review.googlesource.com/42991
      Run-TryBot: Ian Lance Taylor <iant@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
      1e732ca3
    • Alex Brainman's avatar
      cmd/go: run tests that require symlinks · 096e2bff
      Alex Brainman authored
      Change-Id: I19a724ea4eb1ba0ff558721650c89a949e53b7c7
      Reviewed-on: https://go-review.googlesource.com/42895
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
      Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      096e2bff
    • Alex Brainman's avatar
      os: avoid calculating fileStat.path until it is needed · 6dcaa095
      Alex Brainman authored
      This CL improves
      
      on my Windows 7
      
      name         old time/op    new time/op    delta
      Readdirname    58.1µs ± 1%    58.1µs ± 0%     ~     (p=0.817 n=8+8)
      Readdir        58.0µs ± 3%    57.8µs ± 0%     ~     (p=0.944 n=9+8)
      
      name         old alloc/op   new alloc/op   delta
      Readdirname    3.03kB ± 0%    2.84kB ± 0%   -6.33%  (p=0.000 n=10+10)
      Readdir        3.00kB ± 0%    2.81kB ± 0%   -6.40%  (p=0.000 n=10+10)
      
      name         old allocs/op  new allocs/op  delta
      Readdirname      34.0 ± 0%      30.0 ± 0%  -11.76%  (p=0.000 n=10+10)
      Readdir          33.0 ± 0%      29.0 ± 0%  -12.12%  (p=0.000 n=10+10)
      
      on my Windows XP
      
      name           old time/op    new time/op    delta
      Readdirname-2    85.5µs ± 0%    84.0µs ± 0%   -1.83%  (p=0.000 n=10+10)
      Readdir-2        84.6µs ± 0%    83.5µs ± 0%   -1.31%  (p=0.000 n=10+9)
      
      name           old alloc/op   new alloc/op   delta
      Readdirname-2    6.52kB ± 0%    5.66kB ± 0%  -13.25%  (p=0.000 n=10+10)
      Readdir-2        6.39kB ± 0%    5.53kB ± 0%  -13.52%  (p=0.000 n=10+10)
      
      name           old allocs/op  new allocs/op  delta
      Readdirname-2      78.0 ± 0%      66.0 ± 0%  -15.38%  (p=0.000 n=10+10)
      Readdir-2          77.0 ± 0%      65.0 ± 0%  -15.58%  (p=0.000 n=10+10)
      
      Change-Id: I5d698eca86b8e94a46b6cfbd5947898b7b3fbdbd
      Reviewed-on: https://go-review.googlesource.com/42894
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
      Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      6dcaa095
    • ltnwgl's avatar
      container/heap: optimization when selecting smaller child · f5352a77
      ltnwgl authored
      In down(), if two children are equal, we can choose either one.
      Inspired by https://codereview.appspot.com/6613064/
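
A simplified sketch of the idea on a plain []int min-heap, not the container/heap source: the right child is chosen only when it is strictly smaller, so a tie keeps the left child. The helper isHeap is added only to check the result.

```go
package main

import "fmt"

// down sifts h[i] toward the leaves of a min-heap occupying h[:n].
func down(h []int, i, n int) {
	for {
		j1 := 2*i + 1 // left child
		if j1 >= n {
			break
		}
		j := j1
		// Switch to the right child only if it is strictly smaller;
		// when both children are equal either one is a valid choice.
		if j2 := j1 + 1; j2 < n && h[j2] < h[j1] {
			j = j2
		}
		if h[i] <= h[j] {
			break
		}
		h[i], h[j] = h[j], h[i]
		i = j
	}
}

// isHeap reports whether h satisfies the min-heap property.
func isHeap(h []int) bool {
	for i := 1; i < len(h); i++ {
		if h[(i-1)/2] > h[i] {
			return false
		}
	}
	return true
}

func main() {
	h := []int{9, 3, 3, 5, 4}
	down(h, 0, len(h))
	fmt.Println(h, isHeap(h)) // prints [3 4 3 5 9] true
}
```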
      
      Change-Id: Iaad4ca5e2f5111bf3abb87f606584e7d274c620b
      Reviewed-on: https://go-review.googlesource.com/38612
      Run-TryBot: Robert Griesemer <gri@golang.org>
      Reviewed-by: Robert Griesemer <gri@golang.org>
      f5352a77
    • Rob Phoenix's avatar
      net: add examples for IPv4, ParseCIDR & IPv4Mask · 716761b8
      Rob Phoenix authored
      Further examples to support the net package.
      
      See issue #5757
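
The added examples themselves are not quoted here. As a rough sketch of the kind of usage they cover, with the values and the helpers ipString and maskString chosen for illustration:

```go
package main

import (
	"fmt"
	"net"
)

// ipString formats the IPv4 address a.b.c.d.
func ipString(a, b, c, d byte) string {
	return net.IPv4(a, b, c, d).String()
}

// maskString formats an IPv4 mask; IPMask prints as hexadecimal.
func maskString(a, b, c, d byte) string {
	return net.IPv4Mask(a, b, c, d).String()
}

func main() {
	fmt.Println(ipString(192, 0, 2, 1))       // prints 192.0.2.1
	fmt.Println(maskString(255, 255, 255, 0)) // prints ffffff00

	ones, bits := net.IPv4Mask(255, 255, 255, 0).Size()
	fmt.Println(ones, bits) // prints 24 32
}
```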
      
      Change-Id: I839fd97a468c8d9195e8f4a0ee886ba50ca3f382
      Reviewed-on: https://go-review.googlesource.com/42912
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
      Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      716761b8
  3. 08 May, 2017 4 commits
    • Robert Griesemer's avatar
      cmd/compile: better errors for float constants with large exponents · bcf2d74c
      Robert Griesemer authored
      Also: Removed misleading comment.
      
      Fixes #20232.
      
      Change-Id: I0b141b1360ac53267b7ebfcec7a2e2a238f3f46c
      Reviewed-on: https://go-review.googlesource.com/42930
      Run-TryBot: Robert Griesemer <gri@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Matthew Dempsky <mdempsky@google.com>
      bcf2d74c
    • Bill O'Farrell's avatar
      math: use SIMD to accelerate additional scalar math functions on s390x · 88672de7
      Bill O'Farrell authored
      As necessary, math functions were structured to use stubs, so that they can
      be accelerated with assembly on any platform.
      
      The technique used was minimax polynomial approximation using tables of
      polynomial coefficients, with argument range reduction.
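
A scalar sketch of range reduction followed by polynomial evaluation, for exp only. It uses Taylor coefficients rather than minimax coefficients, has no coefficient tables and no SIMD, and assumes moderate finite inputs, so it illustrates the structure but is not the s390x implementation. The names expApprox and relErr are hypothetical.

```go
package main

import (
	"fmt"
	"math"
)

// expApprox approximates e**x for moderate finite x.
func expApprox(x float64) float64 {
	// Range reduction: x = k*ln2 + r with |r| <= ln2/2.
	k := math.Round(x / math.Ln2)
	r := x - k*math.Ln2
	// Degree-13 Taylor polynomial for e**r, evaluated with Horner's rule.
	// (A minimax polynomial would reach similar accuracy with fewer terms.)
	p := 1.0
	for i := 13; i >= 1; i-- {
		p = 1 + p*r/float64(i)
	}
	// Undo the reduction: e**x = 2**k * e**r.
	return math.Ldexp(p, int(k))
}

// relErr returns the relative error of expApprox against math.Exp.
func relErr(x float64) float64 {
	want := math.Exp(x)
	return math.Abs(expApprox(x)-want) / want
}

func main() {
	for _, x := range []float64{-5, -0.5, 0, 0.3, 1, 7.25} {
		fmt.Printf("x=%5.2f approx=%.15g relerr=%.2g\n", x, expApprox(x), relErr(x))
	}
}
```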
      
      Benchmark         New     Old     Speedup
      BenchmarkAcos     12.2    47.5    3.89
      BenchmarkAcosh    18.5    56.2    3.04
      BenchmarkAsin     13.1    40.6    3.10
      BenchmarkAsinh    19.4    62.8    3.24
      BenchmarkAtan     10.1    23      2.28
      BenchmarkAtanh    19.1    53.2    2.79
      BenchmarkAtan2    16.5    33.9    2.05
      BenchmarkCbrt     14.8    58      3.92
      BenchmarkErf      10.8    20.1    1.86
      BenchmarkErfc     11.2    23.5    2.10
      BenchmarkExp      8.77    53.8    6.13
      BenchmarkExpm1    10.1    38.3    3.79
      BenchmarkLog      13.1    40.1    3.06
      BenchmarkLog1p    12.7    38.3    3.02
      BenchmarkPowInt   31.7    40.5    1.28
      BenchmarkPowFrac  33.1    141     4.26
      BenchmarkTan      11.5    30      2.61
      
      Accuracy was tested against a high precision
      reference function to determine maximum error.
      Note: ulperr is error in "units in the last place"
      
             max
            ulperr
      Acos  1.15
      Acosh 1.07
      Asin  2.22
      Asinh 1.72
      Atan  1.41
      Atanh 3.00
      Atan2 1.45
      Cbrt  1.18
      Erf   1.29
      Erfc  4.82
      Exp   1.00
      Expm1 2.26
      Log   0.94
      Log1p 2.39
      Tan   3.14
      
      Pow will have 99.99% correctly rounded results with reasonable inputs
      producing numeric (non Inf or NaN) results.
      
      Change-Id: I850e8cf7b70426e8b54ec49d74acd4cddc8c6cb2
      Reviewed-on: https://go-review.googlesource.com/38585
      Reviewed-by: Michael Munday <munday@ca.ibm.com>
      Run-TryBot: Michael Munday <munday@ca.ibm.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      88672de7
    • Marvin Stenger's avatar
      bytes: skip inline test by default · 8c49c06b
      Marvin Stenger authored
      The test "TestTryGrowByResliceInlined" introduced in c08ac367 broke the
      noopt builder as it fails when inlining is disabled.
      Since there is currently no way to check whether a function was
      inlined other than looking at the emitted symbols of the compilation,
      for now we skip the problem-causing test by default and run it only
      on one specific builder ("linux-amd64").
      Also see CL 42813, which introduced the test and contains comments
      suggesting this temporary solution.
      
      Change-Id: I3978ab0831da04876cf873d78959f821c459282b
      Reviewed-on: https://go-review.googlesource.com/42820
      Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
      Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      8c49c06b
    • Alex Brainman's avatar
      internal/poll: remove allocation in windows FD.Writev · ddcb975f
      Alex Brainman authored
      Use closure parameter instead of external variable to
      remove 1 allocation.
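
A sketch of the pattern with hypothetical types (op, execIO), not the internal/poll code. A closure that reads a variable from the enclosing function has to carry that variable with it, while a closure that only uses its own parameter captures nothing. Whether the capturing form actually allocates in a given program depends on the compiler's escape analysis.

```go
package main

import "fmt"

// op is a hypothetical stand-in for a pending I/O operation.
type op struct {
	fd   int
	sent int
}

// execIO is a hypothetical helper that runs submit for o.
func execIO(o *op, submit func(*op) error) error {
	return submit(o)
}

// writeCaptured uses a closure that captures fd from the enclosing function.
func writeCaptured(fd int, o *op) error {
	return execIO(o, func(o *op) error {
		o.sent = fd * 2 // fd is an external (captured) variable
		return nil
	})
}

// writeParam reads the same value through the closure parameter instead,
// so the closure captures nothing.
func writeParam(o *op) error {
	return execIO(o, func(o *op) error {
		o.sent = o.fd * 2
		return nil
	})
}

func main() {
	a := &op{fd: 21}
	b := &op{fd: 21}
	_ = writeCaptured(a.fd, a)
	_ = writeParam(b)
	fmt.Println(a.sent, b.sent) // prints 42 42
}
```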
      
      I tried to add a test, but it is difficult to add something simple
      that does not flake here. I did test this with:
      
      diff --git a/src/net/writev_test.go b/src/net/writev_test.go
      index 4c05be4..e417d68 100644
      --- a/src/net/writev_test.go
      +++ b/src/net/writev_test.go
      @@ -99,6 +99,15 @@ func TestBuffers_WriteTo(t *testing.T) {
       	}
       }
      
      +func TestBuffers_WriteToAllocs(t *testing.T) {
      +	allocs := testing.AllocsPerRun(10, func() {
      +		testBuffer_writeTo(t, 10, false)
      +	})
      +	if allocs > 0 {
      +		t.Fatalf("got %v; want 0", allocs)
      +	}
      +}
      +
       func testBuffer_writeTo(t *testing.T, chunks int, useCopy bool) {
       	oldHook := poll.TestHookDidWritev
       	defer func() { poll.TestHookDidWritev = oldHook }()
      
      The allocation count goes down by 1 after the fix.
      
      Before:
      
      C:\>u:\test -test.v -test.run=WriteToAllocs
      === RUN   TestBuffers_WriteToAllocs
      --- FAIL: TestBuffers_WriteToAllocs (0.05s)
              writev_test.go:107: got 66; want 0
      FAIL
      
      and after:
      
      C:\>u:\test -test.v -test.run=WriteToAllocs
      === RUN   TestBuffers_WriteToAllocs
      --- FAIL: TestBuffers_WriteToAllocs (0.04s)
              writev_test.go:107: got 65; want 0
      FAIL
      
      Thanks to @MichaelMonashev for the report and the fix.
      
      Fixes #19222
      
      Change-Id: I0f73cd9e2c8bbaa0653083f81f3ccb83b5ea84e1
      Reviewed-on: https://go-review.googlesource.com/42893
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
      Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      ddcb975f
  4. 07 May, 2017 5 commits
    • Elias Naur's avatar
      cmd/link/internal/ld: don't link with -no_pie on darwin/arm64 · 45d42fdc
      Elias Naur authored
      Ever since CL 33301 linking darwin/arm64 executables has resulted in
      warnings like:
      
      ld: warning: -no_pie ignored for arm64
      
      Remove -no_pie on darwin/arm64.
      
      Change-Id: I9f7685351fa8cce29795283e1a24fc7a6753d698
      Reviewed-on: https://go-review.googlesource.com/42815
      Run-TryBot: Elias Naur <elias.naur@gmail.com>
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      45d42fdc
    • Kevin Burke's avatar
      os, cmd/link: fix typos · 9058b9ae
      Kevin Burke authored
      Also switch "stating" to "statting" to describe applying os.Stat to
      a resource; the former is more confusable than the latter.
      
      Change-Id: I9d8e3506bd383f8f1479c05948c03b8c633dc4af
      Reviewed-on: https://go-review.googlesource.com/42855
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
      9058b9ae
    • Marvin Stenger's avatar
      bytes: optimize Buffer's Write, WriteString, WriteByte, and WriteRune · c08ac367
      Marvin Stenger authored
      In the common case, the grow method only needs to reslice the internal
      buffer. Making another function call to grow can be expensive when Write
      is called very often with small pieces of data (like a byte or rune).
      Thus, we add a tryGrowByReslice method that is inlineable so that we can
      avoid an extra call in most cases.
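
A simplified sketch of the fast-path and slow-path split on a made-up buffer type. The method name tryGrowByReslice comes from the message above; the body, the grow policy, and the type are assumptions, and whether the fast path is actually inlined is up to the compiler.

```go
package main

import "fmt"

// buffer is a hypothetical, stripped-down byte buffer.
type buffer struct {
	buf []byte
}

// tryGrowByReslice is the fast path: if the capacity already allows n more
// bytes it only reslices. It is kept small so the compiler can inline it.
func (b *buffer) tryGrowByReslice(n int) (int, bool) {
	if l := len(b.buf); n <= cap(b.buf)-l {
		b.buf = b.buf[:l+n]
		return l, true
	}
	return 0, false
}

// grow is the slow path: it allocates a larger slice and copies.
func (b *buffer) grow(n int) int {
	l := len(b.buf)
	nb := make([]byte, l+n, 2*cap(b.buf)+n)
	copy(nb, b.buf)
	b.buf = nb
	return l
}

// WriteByte appends c, calling grow only when reslicing is not enough.
func (b *buffer) WriteByte(c byte) error {
	m, ok := b.tryGrowByReslice(1)
	if !ok {
		m = b.grow(1)
	}
	b.buf[m] = c
	return nil
}

func main() {
	var b buffer
	for _, c := range []byte("hello") {
		_ = b.WriteByte(c)
	}
	fmt.Println(string(b.buf)) // prints hello
}
```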
      
      name                       old time/op    new time/op    delta
      WriteByte-4                  35.5µs ± 0%    17.4µs ± 1%   -51.03%  (p=0.000 n=19+20)
      WriteRune-4                  55.7µs ± 1%    38.7µs ± 1%   -30.56%  (p=0.000 n=18+19)
      BufferNotEmptyWriteRead-4     304µs ± 5%     283µs ± 3%    -6.86%  (p=0.000 n=19+17)
      BufferFullSmallReads-4       87.0µs ± 5%    66.8µs ± 2%   -23.26%  (p=0.000 n=17+17)
      
      name                       old speed      new speed      delta
      WriteByte-4                 115MB/s ± 0%   235MB/s ± 1%  +104.19%  (p=0.000 n=19+20)
      WriteRune-4                 221MB/s ± 1%   318MB/s ± 1%   +44.01%  (p=0.000 n=18+19)
      
      Fixes #17857
      
      Change-Id: I08dfb10a1c7e001817729dbfcc951bda12fe8814
      Reviewed-on: https://go-review.googlesource.com/42813
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
      Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      c08ac367
    • Damien Lespiau's avatar
      cmd/asm: enable MOVSD in the encoding end-to-end test · 23c5db9b
      Damien Lespiau authored
      MOVSD is properly handled but its encoding test wasn't enabled. Enable
      it.
      
      For reference this was found with a little tool I wrote [1] to explore
      which instructions are missing or not tested in the go obj package and
      assembler:
      
      "which SSE2 instructions aren't tested? And don't list instructions
      which can take MMX operands"
      
      $ x86db-gogen list --extension SSE2 --not-tested --not-mmx
      CLFLUSH mem           [m:  np 0f ae /7] WILLAMETTE,SSE2
      MOVSD   xmmreg,xmmreg [rm: f2 0f 10 /r] WILLAMETTE,SSE2
      MOVSD   xmmreg,xmmreg [mr: f2 0f 11 /r] WILLAMETTE,SSE2
      MOVSD   mem64,xmmreg  [mr: f2 0f 11 /r] WILLAMETTE,SSE2
      MOVSD   xmmreg,mem64  [rm: f2 0f 10 /r] WILLAMETTE,SSE2
      
      (CLFLUSH was introduced with SSE2, but has its own CPUID bit)
      
      [1] https://github.com/dlespiau/x86db
      
      Change-Id: Ic3af3028cb8d4f02e53fdebb9b30fb311f4ee454
      Reviewed-on: https://go-review.googlesource.com/42814
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
      Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      23c5db9b
    • Alex Brainman's avatar
      os: reimplement windows os.Stat · 53003621
      Alex Brainman authored
      Currently Windows Stat uses a combination of Lstat and Readlink to
      walk symlinks until it reaches a file or directory. Windows Readlink
      is implemented via the Windows DeviceIoControl(FSCTL_GET_REPARSE_POINT, ...)
      call, but that call does not work on network shares or inside a
      Docker container (see issues #18555 and #19922 for details).
      
      But Raymond Chen suggests a different approach:
      https://blogs.msdn.microsoft.com/oldnewthing/20100212-00/?p=14963/
      - he suggests using the Windows I/O manager to dereference the
      symbolic link.
      
      This appears to work for all normal symlinks, and also for network
      shares and inside a Docker container.
      
      This CL implements the described procedure.
      
      I also had to adjust TestStatSymlinkLoop, because the test
      expects Stat to return syscall.ELOOP for a symlink with a loop.
      But the new Stat returns the Windows error ERROR_CANT_RESOLVE_FILENAME
      = 1921 instead. I could map ERROR_CANT_RESOLVE_FILENAME to
      syscall.ELOOP, but I suspect the former is broader than the latter.
      And the ERROR_CANT_RESOLVE_FILENAME message text of "The name of
      the file cannot be resolved by the system." sounds fine to me.
      
      Fixes #10935
      Fixes #18555
      Fixes #19922
      
      Change-Id: I979636064cdbdb9c7c840cf8ae73fe2c24499879
      Reviewed-on: https://go-review.googlesource.com/41834
      Reviewed-by: Harshavardhana <hrshvardhana@gmail.com>
      Reviewed-by: Ian Lance Taylor <iant@golang.org>
      53003621
  5. 06 May, 2017 3 commits