1. 15 Mar, 2018 14 commits
    • cmd/compile: cache sparse maps across ssa passes · cd2cb6e3
      Daniel Martí authored
      This is done for sparse sets already, but it was missing for sparse
      maps. Only affects deadstore and regalloc, as they're the only ones that
      use sparse maps.
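
      As an illustration of the idea, here is a minimal sketch of a sparse map
      plus a scratch cache that hands a cleared map back out instead of
      allocating a new one. The names sparseMap, mapCache, acquire and release
      are hypothetical and simplified; they are not the compiler's actual API.

```go
package main

import "fmt"

// sparseEntry pairs a key with its value in the dense array.
type sparseEntry struct {
	key, val int
}

// sparseMap maps keys in [0, n) to ints with O(1) set, lookup, and clear.
type sparseMap struct {
	dense  []sparseEntry
	sparse []int // sparse[key] is the index of key in dense, if present
}

func newSparseMap(n int) *sparseMap {
	return &sparseMap{sparse: make([]int, n)}
}

// contains tolerates stale data in sparse: an entry only counts if the
// dense slot it points to really holds the same key.
func (m *sparseMap) contains(k int) bool {
	i := m.sparse[k]
	return i < len(m.dense) && m.dense[i].key == k
}

func (m *sparseMap) set(k, v int) {
	if m.contains(k) {
		m.dense[m.sparse[k]].val = v
		return
	}
	m.sparse[k] = len(m.dense)
	m.dense = append(m.dense, sparseEntry{k, v})
}

// clear is O(1): the backing arrays are kept for reuse.
func (m *sparseMap) clear() { m.dense = m.dense[:0] }

// mapCache is a hypothetical scratch cache shared by several passes.
type mapCache struct {
	free []*sparseMap
}

// acquire returns a cached map that is large enough, or allocates one.
func (c *mapCache) acquire(n int) *sparseMap {
	for i, m := range c.free {
		if len(m.sparse) >= n {
			last := len(c.free) - 1
			c.free[i] = c.free[last]
			c.free = c.free[:last]
			return m
		}
	}
	return newSparseMap(n)
}

// release clears the map and makes it available to the next pass.
func (c *mapCache) release(m *sparseMap) {
	m.clear()
	c.free = append(c.free, m)
}

func main() {
	c := &mapCache{}
	m1 := c.acquire(16)
	m1.set(3, 30)
	c.release(m1)
	m2 := c.acquire(8) // reuses m1's storage, no new allocation
	fmt.Println(m1 == m2, m2.contains(3))
}
```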
      
      name                 old time/op    new time/op    delta
      DSEPass-4               247µs ± 0%     216µs ± 0%  -12.75%  (p=0.008 n=5+5)
      DSEPassBlock-4         3.05ms ± 1%    2.87ms ± 1%   -6.02%  (p=0.002 n=6+6)
      CSEPass-4              2.30ms ± 0%    2.32ms ± 0%   +0.53%  (p=0.026 n=6+6)
      CSEPassBlock-4         23.8ms ± 0%    23.8ms ± 0%     ~     (p=0.931 n=6+5)
      DeadcodePass-4         51.7µs ± 1%    51.5µs ± 2%     ~     (p=0.429 n=5+6)
      DeadcodePassBlock-4     734µs ± 1%     742µs ± 3%     ~     (p=0.394 n=6+6)
      MultiPass-4             152µs ± 0%     149µs ± 2%     ~     (p=0.082 n=5+6)
      MultiPassBlock-4       2.67ms ± 1%    2.41ms ± 2%   -9.77%  (p=0.008 n=5+5)
      
      name                 old alloc/op   new alloc/op   delta
      DSEPass-4              41.2kB ± 0%     0.1kB ± 0%  -99.68%  (p=0.002 n=6+6)
      DSEPassBlock-4          560kB ± 0%       4kB ± 0%  -99.34%  (p=0.026 n=5+6)
      CSEPass-4               189kB ± 0%     189kB ± 0%     ~     (all equal)
      CSEPassBlock-4         3.10MB ± 0%    3.10MB ± 0%     ~     (p=0.444 n=5+5)
      DeadcodePass-4         10.5kB ± 0%    10.5kB ± 0%     ~     (all equal)
      DeadcodePassBlock-4     164kB ± 0%     164kB ± 0%     ~     (all equal)
      MultiPass-4             240kB ± 0%     199kB ± 0%  -17.06%  (p=0.002 n=6+6)
      MultiPassBlock-4       3.60MB ± 0%    2.99MB ± 0%  -17.06%  (p=0.002 n=6+6)
      
      name                 old allocs/op  new allocs/op  delta
      DSEPass-4                8.00 ± 0%      4.00 ± 0%  -50.00%  (p=0.002 n=6+6)
      DSEPassBlock-4            240 ± 0%       120 ± 0%  -50.00%  (p=0.002 n=6+6)
      CSEPass-4                9.00 ± 0%      9.00 ± 0%     ~     (all equal)
      CSEPassBlock-4          1.35k ± 0%     1.35k ± 0%     ~     (all equal)
      DeadcodePass-4           3.00 ± 0%      3.00 ± 0%     ~     (all equal)
      DeadcodePassBlock-4      9.00 ± 0%      9.00 ± 0%     ~     (all equal)
      MultiPass-4              11.0 ± 0%      10.0 ± 0%   -9.09%  (p=0.002 n=6+6)
      MultiPassBlock-4          165 ± 0%       150 ± 0%   -9.09%  (p=0.002 n=6+6)
      
      Change-Id: I43860687c88f33605eb1415f36473c5cfe8fde4a
      Reviewed-on: https://go-review.googlesource.com/98449
      Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
    • cmd/compile: implement CMOV on amd64 · a35ec9a5
      Giovanni Bajo authored
      This builds upon the branchelim pass, activating it for amd64 and
      lowering CondSelect. Special care is taken with FPU instructions for
      NaN handling.
      
      Benchmark results on Xeon E5630 (Westmere EP):
      
      name                      old time/op    new time/op    delta
      BinaryTree17-16              4.99s ± 9%     4.66s ± 2%     ~     (p=0.095 n=5+5)
      Fannkuch11-16                4.93s ± 3%     5.04s ± 2%     ~     (p=0.548 n=5+5)
      FmtFprintfEmpty-16          58.8ns ± 7%    61.4ns ±14%     ~     (p=0.579 n=5+5)
      FmtFprintfString-16          114ns ± 2%     114ns ± 4%     ~     (p=0.603 n=5+5)
      FmtFprintfInt-16             181ns ± 4%     125ns ± 3%  -30.90%  (p=0.008 n=5+5)
      FmtFprintfIntInt-16          263ns ± 2%     217ns ± 2%  -17.34%  (p=0.008 n=5+5)
      FmtFprintfPrefixedInt-16     230ns ± 1%     212ns ± 1%   -7.99%  (p=0.008 n=5+5)
      FmtFprintfFloat-16           411ns ± 3%     344ns ± 5%  -16.43%  (p=0.008 n=5+5)
      FmtManyArgs-16               828ns ± 4%     790ns ± 2%   -4.59%  (p=0.032 n=5+5)
      GobDecode-16                10.9ms ± 4%    10.8ms ± 5%     ~     (p=0.548 n=5+5)
      GobEncode-16                9.52ms ± 5%    9.46ms ± 2%     ~     (p=1.000 n=5+5)
      Gzip-16                      334ms ± 2%     337ms ± 2%     ~     (p=0.548 n=5+5)
      Gunzip-16                   64.4ms ± 1%    65.0ms ± 1%   +1.00%  (p=0.008 n=5+5)
      HTTPClientServer-16          156µs ± 3%     155µs ± 3%     ~     (p=0.690 n=5+5)
      JSONEncode-16               21.0ms ± 1%    21.8ms ± 0%   +3.76%  (p=0.016 n=5+4)
      JSONDecode-16               95.1ms ± 0%    95.7ms ± 1%     ~     (p=0.151 n=5+5)
      Mandelbrot200-16            6.38ms ± 1%    6.42ms ± 1%     ~     (p=0.095 n=5+5)
      GoParse-16                  5.47ms ± 2%    5.36ms ± 1%   -1.95%  (p=0.016 n=5+5)
      RegexpMatchEasy0_32-16       111ns ± 1%     111ns ± 1%     ~     (p=0.635 n=5+4)
      RegexpMatchEasy0_1K-16       408ns ± 1%     411ns ± 2%     ~     (p=0.087 n=5+5)
      RegexpMatchEasy1_32-16       103ns ± 1%     104ns ± 1%     ~     (p=0.484 n=5+5)
      RegexpMatchEasy1_1K-16       659ns ± 2%     652ns ± 1%     ~     (p=0.571 n=5+5)
      RegexpMatchMedium_32-16      176ns ± 2%     174ns ± 1%     ~     (p=0.476 n=5+5)
      RegexpMatchMedium_1K-16     58.6µs ± 4%    57.7µs ± 4%     ~     (p=0.548 n=5+5)
      RegexpMatchHard_32-16       3.07µs ± 3%    3.04µs ± 4%     ~     (p=0.421 n=5+5)
      RegexpMatchHard_1K-16       89.2µs ± 1%    87.9µs ± 2%   -1.52%  (p=0.032 n=5+5)
      Revcomp-16                   575ms ± 0%     587ms ± 2%   +2.12%  (p=0.032 n=4+5)
      Template-16                  110ms ± 1%     107ms ± 3%   -3.00%  (p=0.032 n=5+5)
      TimeParse-16                 463ns ± 0%     462ns ± 0%     ~     (p=0.810 n=5+4)
      TimeFormat-16                538ns ± 0%     535ns ± 0%   -0.63%  (p=0.024 n=5+5)
      
      name                      old speed      new speed      delta
      GobDecode-16              70.7MB/s ± 4%  71.4MB/s ± 5%     ~     (p=0.452 n=5+5)
      GobEncode-16              80.7MB/s ± 5%  81.2MB/s ± 2%     ~     (p=1.000 n=5+5)
      Gzip-16                   58.2MB/s ± 2%  57.7MB/s ± 2%     ~     (p=0.452 n=5+5)
      Gunzip-16                  302MB/s ± 1%   299MB/s ± 1%   -0.99%  (p=0.008 n=5+5)
      JSONEncode-16             92.4MB/s ± 1%  89.1MB/s ± 0%   -3.63%  (p=0.016 n=5+4)
      JSONDecode-16             20.4MB/s ± 0%  20.3MB/s ± 1%     ~     (p=0.135 n=5+5)
      GoParse-16                10.6MB/s ± 2%  10.8MB/s ± 1%   +2.00%  (p=0.016 n=5+5)
      RegexpMatchEasy0_32-16     286MB/s ± 1%   285MB/s ± 3%     ~     (p=1.000 n=5+5)
      RegexpMatchEasy0_1K-16    2.51GB/s ± 1%  2.49GB/s ± 2%     ~     (p=0.095 n=5+5)
      RegexpMatchEasy1_32-16     309MB/s ± 1%   307MB/s ± 1%     ~     (p=0.548 n=5+5)
      RegexpMatchEasy1_1K-16    1.55GB/s ± 2%  1.57GB/s ± 1%     ~     (p=0.690 n=5+5)
      RegexpMatchMedium_32-16   5.68MB/s ± 2%  5.73MB/s ± 1%     ~     (p=0.579 n=5+5)
      RegexpMatchMedium_1K-16   17.5MB/s ± 4%  17.8MB/s ± 4%     ~     (p=0.500 n=5+5)
      RegexpMatchHard_32-16     10.4MB/s ± 3%  10.5MB/s ± 4%     ~     (p=0.460 n=5+5)
      RegexpMatchHard_1K-16     11.5MB/s ± 1%  11.7MB/s ± 2%   +1.57%  (p=0.032 n=5+5)
      Revcomp-16                 442MB/s ± 0%   433MB/s ± 2%   -2.05%  (p=0.032 n=4+5)
      Template-16               17.7MB/s ± 1%  18.2MB/s ± 3%   +3.12%  (p=0.032 n=5+5)
      
      Change-Id: I6972e8f35f2b31f9a42ac473a6bf419a18022558
      Reviewed-on: https://go-review.googlesource.com/100935
      Run-TryBot: Giovanni Bajo <rasky@develer.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Keith Randall <khr@golang.org>
    • cmd/internal/obj/mips: load/store even float registers first · 42311108
      James Cowgill authored
      There is a bug in Octeon III processors where storing an odd floating
      point register after it has recently been written to by a double
      floating point operation will store the old value from before the double
      operation (there are some extra details - the operation and store
      must be a certain number of cycles apart). However, this bug does not
      occur if the even register is stored first. Currently the bug only
      happens on big endian because Go always loads the even register first on
      little endian.
      
      Work around the bug by always loading / storing the even floating point
      register first. Since this is just an instruction reordering, it should
      have no performance penalty. This follows other compilers like GCC which
      will always store the even register first (although you do have to set
      the ISA level to MIPS I to prevent it from using SDC1).
      
      Change-Id: I5e73daa4d724ca1df7bf5228aab19f53f26a4976
      Reviewed-on: https://go-review.googlesource.com/97735
      Reviewed-by: Keith Randall <khr@golang.org>
    • cmd/compile/internal/ssa: add patterns for arm64 bitfield opcodes · e244a7a7
      Geoff Berry authored
      Add patterns to match common idioms for EXTR, BFI, BFXIL, SBFIZ, SBFX,
      UBFIZ and UBFX opcodes.
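
      As a rough illustration, these are the kinds of Go shift-and-mask idioms
      that correspond to bitfield extract and insert operations. The function
      names are hypothetical, and which instruction the compiler actually
      selects for each is an assumption, not something this sketch verifies.

```go
package main

import "fmt"

// extractUnsigned takes 8 bits starting at bit 3, zero-extended.
// This is the shape of an unsigned bitfield extract (UBFX #3, #8).
func extractUnsigned(x uint64) uint64 {
	return (x >> 3) & 0xff
}

// insertInZero takes the low 8 bits and places them at bit 3.
// This is the shape of an unsigned bitfield insert in zero (UBFIZ #3, #8).
func insertInZero(x uint64) uint64 {
	return (x & 0xff) << 3
}

// extractSigned takes 8 bits starting at bit 8, sign-extended.
// This is the shape of a signed bitfield extract (SBFX #8, #8).
func extractSigned(x int64) int64 {
	return (x << 48) >> 56
}

func main() {
	fmt.Printf("%#x\n", extractUnsigned(0x7f8))
	fmt.Printf("%#x\n", insertInZero(0x1ff))
	fmt.Println(extractSigned(0xff00), extractSigned(0x7f00))
}
```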
      
      go1 benchmark results on Amberwing:
      name                   old time/op    new time/op    delta
      FmtManyArgs               786ns ± 2%     714ns ± 1%  -9.20%  (p=0.000 n=10+10)
      Gzip                      437ms ± 0%     402ms ± 0%  -7.99%  (p=0.000 n=10+10)
      FmtFprintfIntInt          196ns ± 0%     182ns ± 0%  -7.28%  (p=0.000 n=10+9)
      FmtFprintfPrefixedInt     207ns ± 0%     199ns ± 0%  -3.86%  (p=0.000 n=10+10)
      FmtFprintfFloat           324ns ± 0%     316ns ± 0%  -2.47%  (p=0.000 n=10+8)
      FmtFprintfInt             119ns ± 0%     117ns ± 0%  -1.68%  (p=0.000 n=10+9)
      GobDecode                12.8ms ± 2%    12.6ms ± 1%  -1.62%  (p=0.002 n=10+10)
      JSONDecode               94.4ms ± 1%    93.4ms ± 0%  -1.10%  (p=0.000 n=10+10)
      RegexpMatchEasy0_32       247ns ± 0%     245ns ± 0%  -0.65%  (p=0.000 n=10+10)
      RegexpMatchMedium_32      314ns ± 0%     312ns ± 0%  -0.64%  (p=0.000 n=10+10)
      RegexpMatchEasy0_1K       541ns ± 0%     538ns ± 0%  -0.55%  (p=0.000 n=10+9)
      TimeParse                 450ns ± 1%     448ns ± 1%  -0.42%  (p=0.035 n=9+9)
      RegexpMatchEasy1_32       244ns ± 0%     243ns ± 0%  -0.41%  (p=0.000 n=10+10)
      GoParse                  6.03ms ± 0%    6.00ms ± 0%  -0.40%  (p=0.002 n=10+10)
      RegexpMatchEasy1_1K       779ns ± 0%     777ns ± 0%  -0.26%  (p=0.000 n=10+10)
      RegexpMatchHard_32       2.75µs ± 0%    2.74µs ± 1%  -0.06%  (p=0.026 n=9+9)
      BinaryTree17              11.7s ± 0%     11.6s ± 0%    ~     (p=0.089 n=10+10)
      HTTPClientServer         89.1µs ± 1%    89.5µs ± 2%    ~     (p=0.436 n=10+10)
      RegexpMatchHard_1K       78.9µs ± 0%    79.5µs ± 2%    ~     (p=0.469 n=10+10)
      FmtFprintfEmpty          58.5ns ± 0%    58.5ns ± 0%    ~     (all equal)
      GobEncode                12.0ms ± 1%    12.1ms ± 0%    ~     (p=0.075 n=10+10)
      Revcomp                   669ms ± 0%     668ms ± 0%    ~     (p=0.091 n=7+9)
      Mandelbrot200            5.35ms ± 0%    5.36ms ± 0%  +0.07%  (p=0.000 n=9+9)
      RegexpMatchMedium_1K     52.1µs ± 0%    52.1µs ± 0%  +0.10%  (p=0.000 n=9+9)
      Fannkuch11                3.25s ± 0%     3.26s ± 0%  +0.36%  (p=0.000 n=9+10)
      FmtFprintfString          114ns ± 1%     115ns ± 0%  +0.52%  (p=0.011 n=10+10)
      JSONEncode               20.2ms ± 0%    20.3ms ± 0%  +0.65%  (p=0.000 n=10+10)
      Template                 91.3ms ± 0%    92.3ms ± 0%  +1.08%  (p=0.000 n=10+10)
      TimeFormat                484ns ± 0%     495ns ± 1%  +2.30%  (p=0.000 n=9+10)
      
      There are some opportunities to improve this change further by adding
      patterns to match the "extended register" versions of ADD/SUB/CMP, but I
      think that should be evaluated on its own.  The regressions in Template
      and TimeFormat would likely be recovered by this, as they seem to be due
      to generating:
      
          ubfiz x0, x0, #3, #8
          add x1, x2, x0
      
      instead of
      
          add x1, x2, x0, lsl #3
      
      Change-Id: I5644a8d70ac7a98e784a377a2b76ab47a3415a4b
      Reviewed-on: https://go-review.googlesource.com/88355
      Reviewed-by: Cherry Zhang <cherryyz@google.com>
      Run-TryBot: Cherry Zhang <cherryyz@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
    • test/codegen: port len/cap pow2 div tests to codegen · ded9a1b3
      Alberto Donizetti authored
      And delete them from asm_test.
      
      Change-Id: I29c8d098a8893e6b669b6272a2f508985ac9d618
      Reviewed-on: https://go-review.googlesource.com/100876
      Reviewed-by: Cherry Zhang <cherryyz@google.com>
    • syscall: use Android O friendly fstatat syscall to implement Stat on linux/amd64 · 10732562
      Tobias Klauser authored
      The Android O seccomp policy disallows the stat syscall on amd64, see
      https://android.googlesource.com/platform/bionic/+/android-4.2.2_r1.2/libc/SYSCALLS.TXT
      
      Use the fstatat syscall with AT_FDCWD and zero flags instead to achieve
      the same behavior.
      
      Fixes #24403
      
      Change-Id: I36fc9ec9bc938cd8e9de30f66c0eb9d2e24debf6
      Reviewed-on: https://go-review.googlesource.com/100878
      Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
      Reviewed-by: Elias Naur <elias.naur@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
    • runtime: use Android O friendly faccessat syscall on linux/amd64 · 5fcfe6b6
      Tobias Klauser authored
      The Android O seccomp policy disallows the access syscall on amd64, see
      https://android.googlesource.com/platform/bionic/+/android-4.2.2_r1.2/libc/SYSCALLS.TXT
      
      Use the faccessat syscall with AT_FDCWD instead to achieve the same
      behavior.
      
      Updates #24403
      
      Change-Id: I9db847c1c0f33987a3479b3f96e721fb9588cde2
      Reviewed-on: https://go-review.googlesource.com/100877
      Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
    • sync: make WaitGroup more space-efficient · 9ff7df00
      Diogo Pinela authored
      The struct stores its 64-bit state field in a 12-byte array to
      ensure that it can be 64-bit-aligned. This leaves 4 spare bytes,
      which we can reuse to store the sema field.
      
      (32-bit alignment is still guaranteed because the array type was
      changed to [3]uint32.)
      
      Fixes #19149.
      
      Change-Id: I9bc20e69e45e0e07fbf496080f3650e8be0d6e8d
      Reviewed-on: https://go-review.googlesource.com/100515
      Reviewed-by: Dmitry Vyukov <dvyukov@google.com>
      Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
    • net: use golang.org/x/net/dns/dnsmessage for DNS resolution · 672729eb
      Ian Gudger authored
      Vendors golang.org/x/net/dns/dnsmessage from x/net git rev
      892bf7b0c6e2f93b51166bf3882e50277fa5afc6
      
      Updates #16218
      Updates #21160
      
      Change-Id: Ic4e8f3c3d83c2936354ec14c5be93b0d2b42dd91
      Reviewed-on: https://go-review.googlesource.com/37879
      Run-TryBot: Matthew Dempsky <mdempsky@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
    • runtime: fix another typo in runtime-gdb.py · c830e05a
      Josh Bleecher Snyder authored
      tuple, touple,
      gdb, gdv,
      let's call the whole thing off.
      
      Change-Id: I72d12f6c75061777474e7dec2c90d2a8a3715da6
      Reviewed-on: https://go-review.googlesource.com/100836
      Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Ian Lance Taylor <iant@golang.org>
    • cmd/compile: extract common noding code from func{Decl,Lit} · 29517daf
      Matthew Dempsky authored
      Passes toolstash-check.
      
      Change-Id: I8290221d6169e077dfa4ea737d685c7fcecf6841
      Reviewed-on: https://go-review.googlesource.com/100835
      Run-TryBot: Matthew Dempsky <mdempsky@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Robert Griesemer <gri@golang.org>
    • cmd/compile: fix duplicate code generation in swt.go · 463fe95b
      Matthew Dempsky authored
      When combining adjacent type switch cases with the same type hash, we
      failed to actually remove the combined cases, so we would generate
      code for them twice.
      
      We use MD5 for type hashes, so collisions are rare, but they do
      currently appear in test/fixedbugs/bug248.dir/bug2.go, which is how I
      noticed this failure.
      
      Passes toolstash-check.
      
      Change-Id: I66729b3366b96cb8ddc8fa6f3ebea11ef6d74012
      Reviewed-on: https://go-review.googlesource.com/100461
      Run-TryBot: Matthew Dempsky <mdempsky@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
    • runtime: print goid when throwing for split stack overflow · 183fd6f1
      Josh Bleecher Snyder authored
      Change-Id: I66515156c2fc6886312c0eccb86d7ceaf7947042
      Reviewed-on: https://go-review.googlesource.com/100465
      Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Austin Clements <austin@google.com>
    • runtime: refactor gdb PC parsing · ef400ed2
      Josh Bleecher Snyder authored
      Change-Id: I91607edaf9c256e6723eb3d6e18c8210eb86b704
      Reviewed-on: https://go-review.googlesource.com/100464
      Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Austin Clements <austin@google.com>
  2. 14 Mar, 2018 14 commits
    • cmd/compile: cleanup closure.go · eb3c44b2
      Matthew Dempsky authored
      The main thing is we now eagerly create the ODCLFUNC node for
      closures, immediately cross-link them, and assign fields (e.g., Nbody,
      Dcl, Parents, Marks) directly on the ODCLFUNC (previously they were
      assigned on the OCLOSURE and later moved to the ODCLFUNC).
      
      This allows us to set Curfn to the ODCLFUNC instead of the OCLOSURE,
      which makes things more consistent with normal function declarations.
      (Notably, this means Cvars now hang off the ODCLFUNC instead of the
      OCLOSURE.)
      
      Assignment of xfunc symbol names also now happens before typechecking
      their body, which means debugging output now provides a more helpful
      name than "<S>".
      
      In golang.org/cl/66810, we changed "x := y" statements to avoid
      creating false closure variables for x, but we still create them for
      struct literals like "s{f: x}". Update comment in capturevars
      accordingly.
      
      More opportunity for cleanups still, but this makes some substantial
      progress, IMO.
      
      Passes toolstash-check.
      
      Change-Id: I65a4efc91886e3dcd1000561348af88297775cd7
      Reviewed-on: https://go-review.googlesource.com/100197
      Run-TryBot: Matthew Dempsky <mdempsky@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Robert Griesemer <gri@golang.org>
    • Revert "cmd/compile: implement CMOV on amd64" · 644d14ea
      Ilya Tocar authored
      This reverts commit 080187f4.
      
      It broke build of golang.org/x/exp/shiny/iconvg
      See issue 24395 for details
      
      Change-Id: Ifd6134f6214e6cee40bd3c63c32941d5fc96ae8b
      Reviewed-on: https://go-review.googlesource.com/100755
      Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Keith Randall <khr@golang.org>
    • cmd/compile/internal/ssa: track stack-only vars · 44e65f2c
      Heschi Kreinick authored
      User variables that cannot be SSA'd, either because their addresses are
      taken or because they are too large for the decomposition heuristic, do
      not explicitly appear as operands of SSA values. Instead they are written
      to directly via the stack pointer.
      
      This hid them from the location list generation, which is only
      interested in the named value table. Fortunately, the lifetime of
      stack-only variables is delineated by VarDef/VarKill ops, and it's easy
      enough to turn those into location list bounds.
      
      One wrinkle: stack frame information is not explicitly available in the
      SSA phases, because it's owned by the frontend in AllocFrame. It would
      be easier if the set of live LocalSlots were returned by that, but this
      is the minimal change to fix missing variables. Or VarDef/VarKills
      could appear in NamedValues, which would make this change even easier.
      
      Change-Id: Ice6654dad6f9babb0286e95c7ec28594561dc91f
      Reviewed-on: https://go-review.googlesource.com/100458
      Reviewed-by: David Chase <drchase@google.com>
      Run-TryBot: David Chase <drchase@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
    • cmd/compile: improve PPC64.rules to reduce size of rewritePPC64.go · aff222cd
      Lynn Boger authored
      Some rules in PPC64.rules cause an extremely large rewritePPC64.go
      file to be generated, due to rules with commutative operations and
      many operands. This happens with the existing
      rules for combining byte loads in little endian order, and
      also happens with the pending change to do the same for bytes
      in big endian order.
      
      The change improves the existing rules and reduces the size of
      the rewrite file by more than 60%. Once this change is merged,
      then the pending change for big endian ordered rules will be
      updated to use rules that avoid generating an excessively large
      rewrite file.
      
      This also includes a fix to a performance regression for
      littleEndian.PutUint16 on ppc64le.
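
      For context, the byte-load rules target Go code of the following shape,
      where several single-byte loads are OR-ed together in little endian
      order. The helper names are hypothetical; whether a given compiler
      version merges them into one wider load or store is an assumption that
      this sketch does not check.

```go
package main

import "fmt"

// load32le assembles a uint32 from four bytes in little endian order.
// The OR operands are commutative, which is why rewrite rules matching
// this pattern can multiply into many generated variants.
func load32le(b []byte) uint32 {
	_ = b[3] // one bounds check up front
	return uint32(b[0]) | uint32(b[1])<<8 | uint32(b[2])<<16 | uint32(b[3])<<24
}

// put16le stores a uint16 as two bytes in little endian order,
// the same shape as a little endian PutUint16.
func put16le(b []byte, v uint16) {
	_ = b[1] // one bounds check up front
	b[0] = byte(v)
	b[1] = byte(v >> 8)
}

func main() {
	fmt.Printf("%#x\n", load32le([]byte{0x78, 0x56, 0x34, 0x12}))
	buf := make([]byte, 2)
	put16le(buf, 0xbeef)
	fmt.Printf("%#x %#x\n", buf[0], buf[1])
}
```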
      
      Change-Id: I8d2ea42885fa2b84b30c63aa124b0a9b130564ff
      Reviewed-on: https://go-review.googlesource.com/100675
      Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Keith Randall <khr@golang.org>
    • math/big: add comment about internal assumptions on nat values · 7d4d2cb6
      Robert Griesemer authored
      Change-Id: I7ed40507a019c0bf521ba748fc22c03d74bb17b7
      Reviewed-on: https://go-review.googlesource.com/100719
      Reviewed-by: Ian Lance Taylor <iant@golang.org>
    • runtime: improve arm64 memclr implementation · b46d3988
      Balaram Makam authored
      Improve runtime memclr_arm64.s using ZVA feature to zero out memory when n
      is at least 64 bytes.
      
      Also add DCZID_EL0 system register to use in MRS instruction.
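
      For readers wondering where memclr shows up in ordinary code: the Go
      compiler recognizes the range-and-zero loop below and replaces it with a
      call to the runtime's memory-clearing routine, so a faster memclr
      speeds up this idiom. The function name zero is hypothetical, and the
      4096-byte size is chosen only because it is above the 64-byte threshold
      mentioned above.

```go
package main

import "fmt"

// zero clears b using the loop form that the compiler recognizes
// as a memory clear.
func zero(b []byte) {
	for i := range b {
		b[i] = 0
	}
}

func main() {
	buf := make([]byte, 4096)
	for i := range buf {
		buf[i] = 0xff
	}
	zero(buf)
	fmt.Println(buf[0], buf[len(buf)-1])
}
```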
      
      Benchmark results of runtime/Memclr on Amberwing:
      name          old time/op    new time/op    delta
      Memclr/5        12.7ns ± 0%    12.7ns ± 0%      ~     (all equal)
      Memclr/16       12.7ns ± 0%    12.2ns ± 1%    -4.13%  (p=0.000 n=7+8)
      Memclr/64       14.0ns ± 0%    14.6ns ± 1%    +4.29%  (p=0.000 n=7+8)
      Memclr/256      23.7ns ± 0%    25.7ns ± 0%    +8.44%  (p=0.000 n=8+7)
      Memclr/4096      204ns ± 0%      74ns ± 0%   -63.71%  (p=0.000 n=8+8)
      Memclr/65536    2.89µs ± 0%    0.84µs ± 0%   -70.91%  (p=0.000 n=8+8)
      Memclr/1M       45.9µs ± 0%    17.0µs ± 0%   -62.88%  (p=0.000 n=8+8)
      Memclr/4M        184µs ± 0%      77µs ± 4%   -57.94%  (p=0.001 n=6+8)
      Memclr/8M        367µs ± 0%     144µs ± 1%   -60.72%  (p=0.000 n=7+8)
      Memclr/16M       734µs ± 0%     293µs ± 1%   -60.09%  (p=0.000 n=8+8)
      Memclr/64M      2.94ms ± 0%    1.23ms ± 0%   -58.06%  (p=0.000 n=7+8)
      GoMemclr/5      8.00ns ± 0%    8.79ns ± 0%    +9.83%  (p=0.000 n=8+8)
      GoMemclr/16     8.00ns ± 0%    7.60ns ± 0%    -5.00%  (p=0.000 n=8+8)
      GoMemclr/64     10.8ns ± 0%    10.4ns ± 0%    -3.70%  (p=0.000 n=8+8)
      GoMemclr/256    20.4ns ± 0%    21.2ns ± 0%    +3.92%  (p=0.000 n=8+8)
      
      name          old speed      new speed      delta
      Memclr/5       394MB/s ± 0%   393MB/s ± 0%    -0.28%  (p=0.006 n=8+8)
      Memclr/16     1.26GB/s ± 0%  1.31GB/s ± 1%    +4.07%  (p=0.000 n=7+8)
      Memclr/64     4.57GB/s ± 0%  4.39GB/s ± 2%    -3.91%  (p=0.000 n=7+8)
      Memclr/256    10.8GB/s ± 0%  10.0GB/s ± 0%    -7.95%  (p=0.001 n=7+6)
      Memclr/4096   20.1GB/s ± 0%  55.3GB/s ± 0%  +175.46%  (p=0.000 n=8+8)
      Memclr/65536  22.6GB/s ± 0%  77.8GB/s ± 0%  +243.63%  (p=0.000 n=7+8)
      Memclr/1M     22.8GB/s ± 0%  61.5GB/s ± 0%  +169.38%  (p=0.000 n=8+8)
      Memclr/4M     22.8GB/s ± 0%  54.3GB/s ± 4%  +137.85%  (p=0.001 n=6+8)
      Memclr/8M     22.8GB/s ± 0%  58.1GB/s ± 1%  +154.56%  (p=0.000 n=7+8)
      Memclr/16M    22.8GB/s ± 0%  57.2GB/s ± 1%  +150.54%  (p=0.000 n=8+8)
      Memclr/64M    22.8GB/s ± 0%  54.4GB/s ± 0%  +138.42%  (p=0.000 n=7+8)
      GoMemclr/5     625MB/s ± 0%   569MB/s ± 0%    -8.90%  (p=0.000 n=7+8)
      GoMemclr/16   2.00GB/s ± 0%  2.10GB/s ± 0%    +5.26%  (p=0.000 n=8+8)
      GoMemclr/64   5.92GB/s ± 0%  6.15GB/s ± 0%    +3.83%  (p=0.000 n=7+8)
      GoMemclr/256  12.5GB/s ± 0%  12.1GB/s ± 0%    -3.77%  (p=0.000 n=8+7)
      
      Benchmark results of runtime/Memclr on Amberwing without ZVA:
      name          old time/op    new time/op    delta
      Memclr/5        12.7ns ± 0%    12.8ns ± 0%   +0.79%  (p=0.008 n=5+5)
      Memclr/16       12.7ns ± 0%    12.7ns ± 0%     ~     (p=0.444 n=5+5)
      Memclr/64       14.0ns ± 0%    14.4ns ± 0%   +2.86%  (p=0.008 n=5+5)
      Memclr/256      23.7ns ± 1%    19.2ns ± 0%  -19.06%  (p=0.008 n=5+5)
      Memclr/4096      203ns ± 0%     119ns ± 0%  -41.38%  (p=0.008 n=5+5)
      Memclr/65536    2.89µs ± 0%    1.66µs ± 0%  -42.76%  (p=0.008 n=5+5)
      Memclr/1M       45.9µs ± 0%    26.2µs ± 0%  -42.82%  (p=0.008 n=5+5)
      Memclr/4M        184µs ± 0%     105µs ± 0%  -42.81%  (p=0.008 n=5+5)
      Memclr/8M        367µs ± 0%     210µs ± 0%  -42.76%  (p=0.008 n=5+5)
      Memclr/16M       734µs ± 0%     420µs ± 0%  -42.74%  (p=0.008 n=5+5)
      Memclr/64M      2.94ms ± 0%    1.69ms ± 0%  -42.46%  (p=0.008 n=5+5)
      GoMemclr/5      8.00ns ± 0%    8.40ns ± 0%   +5.00%  (p=0.008 n=5+5)
      GoMemclr/16     8.00ns ± 0%    8.40ns ± 0%   +5.00%  (p=0.008 n=5+5)
      GoMemclr/64     10.8ns ± 0%     9.6ns ± 0%  -11.02%  (p=0.008 n=5+5)
      GoMemclr/256    20.4ns ± 0%    17.2ns ± 0%  -15.69%  (p=0.008 n=5+5)
      
      name          old speed      new speed      delta
      Memclr/5       393MB/s ± 0%   391MB/s ± 0%   -0.64%  (p=0.008 n=5+5)
      Memclr/16     1.26GB/s ± 0%  1.26GB/s ± 0%   -0.55%  (p=0.008 n=5+5)
      Memclr/64     4.57GB/s ± 0%  4.44GB/s ± 0%   -2.79%  (p=0.008 n=5+5)
      Memclr/256    10.8GB/s ± 0%  13.3GB/s ± 0%  +23.07%  (p=0.016 n=4+5)
      Memclr/4096   20.1GB/s ± 0%  34.3GB/s ± 0%  +70.91%  (p=0.008 n=5+5)
      Memclr/65536  22.7GB/s ± 0%  39.6GB/s ± 0%  +74.65%  (p=0.008 n=5+5)
      Memclr/1M     22.8GB/s ± 0%  40.0GB/s ± 0%  +74.88%  (p=0.008 n=5+5)
      Memclr/4M     22.8GB/s ± 0%  39.9GB/s ± 0%  +74.84%  (p=0.008 n=5+5)
      Memclr/8M     22.9GB/s ± 0%  39.9GB/s ± 0%  +74.71%  (p=0.008 n=5+5)
      Memclr/16M    22.9GB/s ± 0%  39.9GB/s ± 0%  +74.64%  (p=0.008 n=5+5)
      Memclr/64M    22.8GB/s ± 0%  39.7GB/s ± 0%  +73.79%  (p=0.008 n=5+5)
      GoMemclr/5     625MB/s ± 0%   595MB/s ± 0%   -4.77%  (p=0.000 n=4+5)
      GoMemclr/16   2.00GB/s ± 0%  1.90GB/s ± 0%   -4.77%  (p=0.008 n=5+5)
      GoMemclr/64   5.92GB/s ± 0%  6.66GB/s ± 0%  +12.48%  (p=0.016 n=4+5)
      GoMemclr/256  12.5GB/s ± 0%  14.9GB/s ± 0%  +18.95%  (p=0.008 n=5+5)
      
      Fixes #22948
      
      Change-Id: Iaae4e22391e25b54d299821bb7f8a81ac3986b93
      Reviewed-on: https://go-review.googlesource.com/82055
      Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Cherry Zhang <cherryyz@google.com>
    • cmd/compile: document new line directives · e65d6a6a
      Robert Griesemer authored
      Fixes #24183.
      
      Change-Id: I5ef31c4a3aad7e05568b7de1227745d686d4aff8
      Reviewed-on: https://go-review.googlesource.com/100462
      Reviewed-by: Ian Lance Taylor <iant@golang.org>
    • runtime, syscall: add RawSyscall6 on Solaris and make it panic · f0939ba5
      Tobias Klauser authored
      The syscall package currently declares RawSyscall6 for every GOOS, but
      does not define it on Solaris. This leads to code using said function
      to compile but it will not link. Fix it by adding RawSyscall6 and make
      it panic.
      
      Also remove the obsolete comment above runtime.syscall_syscall as
      pointed out by Aram.
      
      Updates #24357
      
      Change-Id: I1b1423121d1c99de2ecc61cd9a935dba9b39e3a4
      Reviewed-on: https://go-review.googlesource.com/100655
      Reviewed-by: Aram Hăvărneanu <aram@mgk.ro>
    • test/codegen: port all small memmove tests to codegen · cd3aae9b
      Alberto Donizetti authored
      This change ports all the remaining tests checking that small memmoves
      are replaced with MOVs to the new codegen test harness, and deletes
      them from the asm_test file.
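
      A test of this kind might look roughly like the sketch below: a small
      copy with a fixed size, annotated with per-architecture comments that
      the codegen harness matches against the generated assembly. The
      function name is hypothetical and the exact comment syntax shown is an
      assumption about the harness; when run as a normal program the comments
      are simply ignored.

```go
package main

import "fmt"

// moveSmall4 performs an overlapping 3-byte copy inside a 4-byte array.
// The comments before the copy express the expectation that no call to
// memmove appears in the generated code for those architectures.
func moveSmall4() [4]byte {
	x := [...]byte{1, 2, 3, 4}
	// 386:-".*memmove"
	// amd64:-".*memmove"
	copy(x[1:], x[:])
	return x
}

func main() {
	fmt.Println(moveSmall4())
}
```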
      
      Change-Id: I01c94b441e27a5d61518035af62d62779dafeb56
      Reviewed-on: https://go-review.googlesource.com/100476
      Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Keith Randall <khr@golang.org>
    • test/codegen: add codegen tests for div · 858042b8
      Alberto Donizetti authored
      Change-Id: I6ce8981e85fd55ade6078b0946e54a9215d9deca
      Reviewed-on: https://go-review.googlesource.com/100575
      Run-TryBot: Alberto Donizetti <alb.donizetti@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Keith Randall <khr@golang.org>
      Reviewed-by: Cherry Zhang <cherryyz@google.com>
    • cmd/asm: move manual tests out of generated file · b8d26225
      Daniel Martí authored
      Thanks to Iskander Sharipov for spotting this in an earlier CL of mine.
      
      Change-Id: Idf45ad266205ff83985367cb38f585badfbed151
      Reviewed-on: https://go-review.googlesource.com/100535
      Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
      Reviewed-by: Iskander Sharipov <iskander.sharipov@intel.com>
      Reviewed-by: Keith Randall <khr@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
    • runtime: don't use floating point in findnull on Plan 9 · 523f2ea7
      David du Colombier authored
      In CL 98015, findnull was rewritten so it uses bytes.IndexByte.
      
      This broke the build on plan9/amd64 because the implementation
      of bytes.IndexByte on AMD64 relies on SSE instructions while
      floating point instructions are not allowed in the note handler.
      
      This change fixes findnull by using the former implementation
      on Plan 9, so it doesn't use bytes.IndexByte.
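
      A byte-at-a-time scan of the kind described can be sketched as below.
      This is an illustrative stand-in, not the runtime's code: it uses
      unsafe.Add (Go 1.17 and later) for the pointer arithmetic, and it
      involves no vector or floating point instructions at the source level.

```go
package main

import (
	"fmt"
	"unsafe"
)

// findnull returns the number of bytes before the first NUL byte in the
// C-style string starting at s. The caller must guarantee that a NUL
// terminator exists within the same allocation.
func findnull(s *byte) int {
	if s == nil {
		return 0
	}
	n := 0
	for *(*byte)(unsafe.Add(unsafe.Pointer(s), n)) != 0 {
		n++
	}
	return n
}

func main() {
	b := []byte("plan9\x00ignored")
	fmt.Println(findnull(&b[0]))
}
```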
      
      Fixes #24387.
      
      Change-Id: I084d1a44d38d9f77a6c1ad492773f0a98226be16
      Reviewed-on: https://go-review.googlesource.com/100577
      Run-TryBot: David du Colombier <0intro@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Rob Pike <r@golang.org>
    • test: check that size argument errors are emitted at call site · d32018a5
      Tobias Klauser authored
      Add tests for the "negative size argument in make.*" and "size argument
      too large in make.*" error messages to appear at call sites in case the
      size is a const defined on another line.
      
      As suggested by Matthew in a comment on CL 69910.
      
      Change-Id: I5c33d4bec4e3d20bb21fe8019df27999997ddff3
      Reviewed-on: https://go-review.googlesource.com/100395
      Reviewed-by: Matthew Dempsky <mdempsky@google.com>
    • runtime: fix typo in gdb script · 4d38d3ae
      Josh Bleecher Snyder authored
      Change-Id: I9d4b3e25b00724f0e4870c6082671b4f14cc18fc
      Reviewed-on: https://go-review.googlesource.com/100463
      Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Ian Lance Taylor <iant@golang.org>
  3. 13 Mar, 2018 12 commits