1. 15 Oct, 2018 14 commits
    • cmd/compile: avoid implicit bounds checks after explicit checks for append · 9f66b41b
      Martin Möhrmann authored
      The generated code for the append builtin already checks whether the
      slice being appended to is large enough and calls growslice if it is not.
      Trust that this ensures the slice is large enough and avoid the
      implicit bounds check when slicing the slice to its new size.
      
      Removes 365 panicslice calls (-14%) from the go binary which
      reduces the binary size by ~12kbyte.
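A minimal sketch of the pattern described above, written as ordinary Go. The name appendInt and the growth policy are assumptions made for illustration; the real work is done by compiler-generated code and the runtime's growslice. The point is that the explicit capacity check already guarantees the reslice is in range.

```go
package main

import "fmt"

// appendInt is a hypothetical, simplified model of what the compiler
// generates for s = append(s, v) on a []int.
func appendInt(s []int, v int) []int {
	n := len(s) + 1
	if n > cap(s) {
		// Stand-in for the runtime's growslice: allocate a larger
		// backing array and copy the old elements over.
		grown := make([]int, len(s), 2*cap(s)+1)
		copy(grown, s)
		s = grown
	}
	// The explicit check above guarantees n <= cap(s), so this reslice
	// cannot fail. In generated append code, the CL omits the implicit
	// bounds check at this point.
	s = s[:n]
	s[n-1] = v
	return s
}

func main() {
	s := make([]int, 0, 1)
	for i := 0; i < 4; i++ {
		s = appendInt(s, i)
	}
	fmt.Println(s, len(s)) // prints "[0 1 2 3] 4"
}
```

In hand-written Go like this the reslice still carries its own check; only the compiler's expansion of append can safely skip it.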
      
      Change-Id: I1b88418675ff409bc0b956853c9e95241274d5a6
      Reviewed-on: https://go-review.googlesource.com/c/119315
      Run-TryBot: Martin Möhrmann <moehrmann@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Keith Randall <khr@golang.org>
    • runtime/internal/math: add multiplication with overflow check · c9130cae
      Martin Möhrmann authored
      This CL adds a new internal math package for use by the runtime.
      The new package exports a MulUintptr function that takes uintptr arguments
      a and b and returns uintptr(a*b) along with whether the full-width product
      a*b overflows the uintptr value range (uintptr(a*b) != a*b).
      
      Uses of MulUintptr in the runtime and intrinsics for performance
      will be added in followup CLs.
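As a rough sketch of overflow-checked multiplication, assuming a user-level helper named mulUintptr (the internal package cannot be imported from ordinary code, and this is not claimed to be its exact implementation): if both operands fit in half a machine word the product cannot overflow, otherwise a division decides.

```go
package main

import (
	"fmt"
	"unsafe"
)

const maxUintptr = ^uintptr(0)

// halfBits is half the width of uintptr in bits (32 on 64-bit platforms).
const halfBits = 4 * unsafe.Sizeof(uintptr(0))

// mulUintptr is a hypothetical stand-in: it returns a*b truncated to
// uintptr and reports whether the full-width product overflowed.
func mulUintptr(a, b uintptr) (uintptr, bool) {
	// Fast path: two half-word operands cannot overflow; a == 0 also
	// guards the division below.
	if a|b < 1<<halfBits || a == 0 {
		return a * b, false
	}
	return a * b, b > maxUintptr/a
}

func main() {
	p, overflow := mulUintptr(3, 4)
	fmt.Println(p, overflow) // prints "12 false"

	_, overflow = mulUintptr(maxUintptr, 2)
	fmt.Println(overflow) // prints "true"
}
```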
      
      Updates #21588
      
      Change-Id: Ia5a02eeabc955249118e4edf68c67d9fc0858058
      Reviewed-on: https://go-review.googlesource.com/c/91755
      Run-TryBot: Martin Möhrmann <moehrmann@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Keith Randall <khr@golang.org>
    • cmd/compile: check order temp has correct type · 240a30da
      Keith Randall authored
      Follow-on from CL 140306.
      
      Change-Id: Ic71033d2301105b15b60645d895a076107f44a2e
      Reviewed-on: https://go-review.googlesource.com/c/142178
      Run-TryBot: Keith Randall <khr@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Cherry Zhang <cherryyz@google.com>
    • test/codegen: test ppc64 TrailingZeros, OnesCount codegen · 7c96d87e
      Alberto Donizetti authored
      This change adds codegen tests for the intrinsification on ppc64 of
      the OnesCount{64,32,16,8}, and TrailingZeros{64,32,16,8} math/bits
      functions.
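For reference, a small example of the math/bits functions whose code generation is being tested. The wrapper names popcount and ctz are assumptions for illustration; whether a given call compiles to a single instruction depends on the target architecture and compiler version.

```go
package main

import (
	"fmt"
	"math/bits"
)

// popcount returns the number of set bits in x.
func popcount(x uint64) int {
	return bits.OnesCount64(x)
}

// ctz returns the number of trailing zero bits in x (16 when x is 0).
func ctz(x uint16) int {
	return bits.TrailingZeros16(x)
}

func main() {
	fmt.Println(popcount(0xF0)) // prints 4
	fmt.Println(ctz(0x08))      // prints 3
}
```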
      
      Change-Id: Id3364921fbd18316850e15c8c71330c906187fdb
      Reviewed-on: https://go-review.googlesource.com/c/141897
      Reviewed-by: Lynn Boger <laboger@linux.vnet.ibm.com>
    • cmd/compile: fuse before branchelim · a55f3ee4
      Josh Bleecher Snyder authored
      The branchelim pass works better after fuse.
      Running fuse before branchelim also increases
      the stability of generated code amidst other compiler changes,
      which was the original motivation behind this change.
      
      The fuse pass is not cheap enough to run in its entirety
      before branchelim, but the most important half of it is.
      This change makes it possible to run "plain fuse" independently
      and does so before branchelim.
      
      During make.bash, elimIf occurrences increase from 4244 to 4288 (1%),
      and elimIfElse occurrences increase from 989 to 1079 (9%).
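To illustrate the kind of source that a branch elimination pass can target, here is a hypothetical function with a simple conditional assignment. Whether the compiler actually replaces the branch with a conditional move depends on the architecture, the compiler version, and its cost heuristics; this is only a sketch of the shape involved.

```go
package main

import "fmt"

// clamp caps x at limit. The body is a plain "if" with a single
// assignment, the shape that an elimIf-style rewrite looks for.
func clamp(x, limit int) int {
	r := x
	if x > limit {
		r = limit
	}
	return r
}

func main() {
	fmt.Println(clamp(5, 3)) // prints 3
	fmt.Println(clamp(2, 3)) // prints 2
}
```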
      
      Toolspeed impact is marginal; plain fuse pays for itself.
      
      name        old time/op       new time/op       delta
      Template          189ms ± 2%        189ms ± 2%    ~     (p=0.890 n=45+46)
      Unicode          93.2ms ± 5%       93.4ms ± 7%    ~     (p=0.790 n=48+48)
      GoTypes           662ms ± 4%        660ms ± 4%    ~     (p=0.186 n=48+49)
      Compiler          2.89s ± 4%        2.91s ± 3%  +0.89%  (p=0.050 n=49+44)
      SSA               8.23s ± 2%        8.21s ± 1%    ~     (p=0.165 n=46+44)
      Flate             123ms ± 4%        123ms ± 3%  +0.58%  (p=0.031 n=47+49)
      GoParser          154ms ± 4%        154ms ± 4%    ~     (p=0.492 n=49+48)
      Reflect           430ms ± 4%        429ms ± 4%    ~     (p=1.000 n=48+48)
      Tar               171ms ± 3%        170ms ± 4%    ~     (p=0.122 n=48+48)
      XML               232ms ± 3%        232ms ± 2%    ~     (p=0.850 n=46+49)
      [Geo mean]        394ms             394ms       +0.02%
      
      name        old user-time/op  new user-time/op  delta
      Template          236ms ± 5%        236ms ± 4%    ~     (p=0.934 n=50+50)
      Unicode           132ms ± 7%        130ms ± 9%    ~     (p=0.087 n=50+50)
      GoTypes           861ms ± 3%        867ms ± 4%    ~     (p=0.124 n=48+50)
      Compiler          3.93s ± 4%        3.94s ± 3%    ~     (p=0.584 n=49+44)
      SSA               12.2s ± 2%        12.3s ± 1%    ~     (p=0.610 n=46+45)
      Flate             149ms ± 4%        150ms ± 4%    ~     (p=0.194 n=48+49)
      GoParser          193ms ± 5%        191ms ± 6%    ~     (p=0.239 n=49+50)
      Reflect           553ms ± 5%        556ms ± 5%    ~     (p=0.091 n=49+49)
      Tar               218ms ± 5%        218ms ± 5%    ~     (p=0.359 n=49+50)
      XML               299ms ± 5%        298ms ± 4%    ~     (p=0.482 n=50+49)
      [Geo mean]        516ms             516ms       -0.01%
      
      name        old alloc/op      new alloc/op      delta
      Template         36.3MB ± 0%       36.3MB ± 0%  -0.02%  (p=0.000 n=49+49)
      Unicode          29.7MB ± 0%       29.7MB ± 0%    ~     (p=0.270 n=50+50)
      GoTypes           126MB ± 0%        126MB ± 0%  -0.34%  (p=0.000 n=50+49)
      Compiler          534MB ± 0%        531MB ± 0%  -0.50%  (p=0.000 n=50+50)
      SSA              1.98GB ± 0%       1.98GB ± 0%  -0.06%  (p=0.000 n=49+49)
      Flate            24.6MB ± 0%       24.6MB ± 0%  -0.29%  (p=0.000 n=50+50)
      GoParser         29.5MB ± 0%       29.4MB ± 0%  -0.15%  (p=0.000 n=49+50)
      Reflect          87.3MB ± 0%       87.2MB ± 0%  -0.13%  (p=0.000 n=49+50)
      Tar              35.6MB ± 0%       35.5MB ± 0%  -0.17%  (p=0.000 n=50+50)
      XML              48.2MB ± 0%       48.0MB ± 0%  -0.30%  (p=0.000 n=48+50)
      [Geo mean]       83.1MB            82.9MB       -0.20%
      
      name        old allocs/op     new allocs/op     delta
      Template           352k ± 0%         352k ± 0%  -0.01%  (p=0.004 n=49+49)
      Unicode            341k ± 0%         341k ± 0%    ~     (p=0.341 n=48+50)
      GoTypes           1.28M ± 0%        1.28M ± 0%  -0.03%  (p=0.000 n=50+49)
      Compiler          4.96M ± 0%        4.96M ± 0%  -0.05%  (p=0.000 n=50+49)
      SSA               15.5M ± 0%        15.5M ± 0%  -0.01%  (p=0.000 n=50+49)
      Flate              233k ± 0%         233k ± 0%  +0.01%  (p=0.032 n=49+49)
      GoParser           294k ± 0%         294k ± 0%    ~     (p=0.052 n=46+48)
      Reflect           1.04M ± 0%        1.04M ± 0%    ~     (p=0.171 n=50+47)
      Tar                343k ± 0%         343k ± 0%  -0.03%  (p=0.000 n=50+50)
      XML                429k ± 0%         429k ± 0%  -0.04%  (p=0.000 n=50+50)
      [Geo mean]         812k              812k       -0.02%
      
      Object files grow slightly; branchelim often increases binary size, at least on amd64.
      
      name        old object-bytes  new object-bytes  delta
      Template          509kB ± 0%        509kB ± 0%  -0.01%  (p=0.008 n=5+5)
      Unicode           224kB ± 0%        224kB ± 0%    ~     (all equal)
      GoTypes          1.84MB ± 0%       1.84MB ± 0%  +0.00%  (p=0.008 n=5+5)
      Compiler         6.71MB ± 0%       6.71MB ± 0%  +0.01%  (p=0.008 n=5+5)
      SSA              21.2MB ± 0%       21.2MB ± 0%  +0.01%  (p=0.008 n=5+5)
      Flate             324kB ± 0%        324kB ± 0%  -0.00%  (p=0.008 n=5+5)
      GoParser          404kB ± 0%        404kB ± 0%  -0.02%  (p=0.008 n=5+5)
      Reflect          1.40MB ± 0%       1.40MB ± 0%  +0.09%  (p=0.008 n=5+5)
      Tar               452kB ± 0%        452kB ± 0%  +0.06%  (p=0.008 n=5+5)
      XML               596kB ± 0%        596kB ± 0%  +0.00%  (p=0.008 n=5+5)
      [Geo mean]       1.04MB            1.04MB       +0.01%
      
      Change-Id: I535c711b85380ff657fc0f022bebd9cb14ddd07f
      Reviewed-on: https://go-review.googlesource.com/c/129378
      Run-TryBot: Josh Bleecher Snyder <josharian@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Keith Randall <khr@golang.org>
    • cmd/compile: provide types for all order-allocated temporaries · 63e964e1
      Keith Randall authored
      Ensure that we correctly type the stack temps for regular closures,
      method function closures, and slice literals.
      
      Then we don't need to override the dummy types later.
      Furthermore, this allows order to reuse temporaries of these types.
      
      OARRAYLIT doesn't need a temporary as far as I can tell, so I
      removed that case from order.
      
      Change-Id: Ic58520fa50c90639393ff78f33d3c831d5c4acb9
      Reviewed-on: https://go-review.googlesource.com/c/140306
      Reviewed-by: Cherry Zhang <cherryyz@google.com>
    • cmd/compile: fix gdb stepping test · 296b7aea
      Keith Randall authored
      Not sure why this changed behavior, but seems mostly harmless.
      
      Fixes #28198
      
      Change-Id: Ie25c6e1fcb64912a582c7ae7bf92c4c1642e83cb
      Reviewed-on: https://go-review.googlesource.com/c/141649
      Reviewed-by: David Chase <drchase@google.com>
    • test/codegen: add tests of FMA for arm/arm64 · 93e27e01
      Ben Shi authored
      This CL adds tests of fused multiplication-accumulation
      on arm/arm64.
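A minimal example of the expression shape such tests exercise, assuming a helper named mulAdd. On targets with fused multiply-add support the compiler may emit a single instruction for it; that is not guaranteed for every architecture or version.

```go
package main

import "fmt"

// mulAdd computes x*y + z, a candidate for a fused multiply-add.
func mulAdd(x, y, z float64) float64 {
	return x*y + z
}

func main() {
	fmt.Println(mulAdd(2, 3, 4)) // prints 10
}
```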
      
      Change-Id: Ic85d5277c0d6acb7e1e723653372dfaf96824a39
      Reviewed-on: https://go-review.googlesource.com/c/141652
      Run-TryBot: Ben Shi <powerman1st@163.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Cherry Zhang <cherryyz@google.com>
    • internal/cpu: expose ARM feature flags for FMA · bb3bf5bb
      Akhil Indurti authored
      This change exposes feature flags needed to implement an FMA intrinsic
      on ARM CPUs via auxv's HWCAP bits. Specifically, it exposes HasVFPv4 to
      detect if an ARM processor has the fourth version of the vector floating
      point unit. The relevant instruction for this CL is VFMA, emitted in Go
      as FMULAD.
      
      Updates #26630.
      
      Change-Id: Ibbc04fb24c2b4d994f93762360f1a37bc6d83ff7
      Reviewed-on: https://go-review.googlesource.com/c/126315
      Run-TryBot: Martin Möhrmann <moehrmann@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Martin Möhrmann <moehrmann@google.com>
    • cmd/compile: simplify as2 method of *Order · d6e80069
      Martin Möhrmann authored
      Merge the two for loops that set up the node lists for
      temporaries into one for loop.
      
      Passes toolstash -cmp
      
      Change-Id: Ibc739115f38c8869b0dcfbf9819fdc2fc96962e0
      Reviewed-on: https://go-review.googlesource.com/c/141819
      Reviewed-by: Keith Randall <khr@golang.org>
      Run-TryBot: Martin Möhrmann <moehrmann@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
    • cmd/cgo: simplify switch statement to if statement · 9322b533
      avsharapov authored
      Change-Id: Ie7dce45d554fde69d682680f55abba6a7fc55036
      Reviewed-on: https://go-review.googlesource.com/c/142017
      Reviewed-by: Ian Lance Taylor <iant@golang.org>
    • pprof: replace bits = bits + "..." to bits += "..." where bits is a string. · e47c11d8
      Ivan Sharavuev authored
      Change-Id: Ic77ebbdf2670b7fdf2c381cd1ba768624b07e57c
      Reviewed-on: https://go-review.googlesource.com/c/141998
      Run-TryBot: Iskander Sharipov <iskander.sharipov@intel.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Ian Lance Taylor <iant@golang.org>
    • src/cmd/compile/internal/ssa: replace `s = s + x' => 's += x'. · 85066acc
      OlgaVlPetrova authored
      Change-Id: I1f399a8a0aa200bfda01f97f920b1345e59956ba
      Reviewed-on: https://go-review.googlesource.com/c/142057
      Run-TryBot: Iskander Sharipov <iskander.sharipov@intel.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Ian Lance Taylor <iant@golang.org>
    • test/codegen: add tests for multiplication-subtraction · c3208842
      Ben Shi authored
      This CL adds tests for armv7's MULS and arm64's MSUBW.
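A small example of the multiply-subtract shape involved, assuming a helper named mulSub. Whether it compiles to a single multiply-subtract instruction depends on the target and compiler version.

```go
package main

import "fmt"

// mulSub computes a - b*c on 32-bit integers.
func mulSub(a, b, c int32) int32 {
	return a - b*c
}

func main() {
	fmt.Println(mulSub(10, 2, 3)) // prints 4
}
```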
      
      Change-Id: Id0fd5d26fd477e4ed14389b0d33cad930423eb5b
      Reviewed-on: https://go-review.googlesource.com/c/141651
      Run-TryBot: Ben Shi <powerman1st@163.com>
      Reviewed-by: Cherry Zhang <cherryyz@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
  2. 14 Oct, 2018 3 commits
    • cmd/compile: reuse temporaries in order pass · 389e9427
      Keith Randall authored
      Instead of allocating a new temporary each time one
      is needed, keep a list of temporaries which are free
      (have already been VARKILLed on every path) and use
      one of them.
      
      Should save a lot of stack space. In a function like this:
      
      func main() {
           fmt.Printf("%d %d\n", 2, 3)
           fmt.Printf("%d %d\n", 4, 5)
           fmt.Printf("%d %d\n", 6, 7)
      }
      
      The three [2]interface{} arrays used to hold the ... args
      all use the same autotmp, instead of 3 different autotmps
      as happened before this CL.
      
      Change-Id: I2d728e226f81e05ae68ca8247af62014a1b032d3
      Reviewed-on: https://go-review.googlesource.com/c/140301
      Run-TryBot: Keith Randall <khr@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Cherry Zhang <cherryyz@google.com>
    • runtime,cmd/compile: pass strings and slices to convT2{E,I} by value · 0e9f8a21
      Keith Randall authored
      When we pass these types by reference, we usually have to allocate
      temporaries on the stack, initialize them, then pass their address
      to the conversion functions. It's simpler to pass these types
      directly by value.
      
      This particularly applies to conversions needed for fmt.Printf
      (to interface{} for constructing a [...]interface{}).
      
      func f(a, b, c string) {
           fmt.Printf("%s %s\n", a, b)
           fmt.Printf("%s %s\n", b, c)
      }
      
      This function's stack frame shrinks from 200 to 136 bytes, and
      its code shrinks from 535 to 453 bytes.
      
      The go binary shrinks 0.3%.
      
      Update #24286
      
      Aside: for this function f, we don't really need to allocate
      temporaries for the convT2E function. We could use the address
      of a, b, and c directly. That might get similar (or maybe better?)
      improvements. I investigated a bit, but it seemed complicated
      to do it safely. This change was much easier.
      
      Change-Id: I78cbe51b501fb41e1e324ce4203f0de56a1db82d
      Reviewed-on: https://go-review.googlesource.com/c/135377
      Run-TryBot: Keith Randall <khr@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
    • cmd/compile: optimize loads from readonly globals into constants · 653a4bd8
      Keith Randall authored
      Instead of
         MOVB go.string."foo"(SB), AX
      do
         MOVB $102, AX
      
      When we know the global we're loading from is readonly, we can
      do that read at compile time.
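A sketch of source code that reads from a read-only string constant, assuming a helper named prefix. Copying from a string literal is the kind of memory-to-memory move where the loaded bytes are known at compile time; whether a particular build folds them into immediates depends on the architecture and compiler version.

```go
package main

import "fmt"

// prefix copies the bytes of a string literal into a fixed-size array.
// The source bytes live in read-only data, so their values ('f' is 102,
// 'o' is 111) are known when the code is compiled.
func prefix() [3]byte {
	var b [3]byte
	copy(b[:], "foo")
	return b
}

func main() {
	fmt.Println(prefix()) // prints "[102 111 111]"
}
```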
      
      I've made this arch-dependent mostly because the cases where this
      happens often are memory->memory moves, and those don't get
      decomposed until lowering.
      
      Did amd64/386/arm/arm64. Other architectures could follow.
      
      Update #26498
      
      Change-Id: I41b1dc831b2cd0a52dac9b97f4f4457888a46389
      Reviewed-on: https://go-review.googlesource.com/c/141118
      Run-TryBot: Keith Randall <khr@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Josh Bleecher Snyder <josharian@gmail.com>
  3. 13 Oct, 2018 4 commits
  4. 12 Oct, 2018 17 commits
    • cmd/compile: remove ineffectual -i flag · b4150f76
      Matthew Dempsky authored
      This flag lost its usefulness in CL 34273.
      
      Change-Id: I033c29f105937139b4e359a340906be439f1ed07
      Reviewed-on: https://go-review.googlesource.com/c/141646
      Run-TryBot: Matthew Dempsky <mdempsky@google.com>
      Reviewed-by: Robert Griesemer <gri@golang.org>
    • doc: fix spelling of `comp[]hensive` to `comp[r]ehensive` · 57456527
      Mak Kolybabi authored
      Change-Id: Idd93e45fab30e7496105b84fc2fce1884711b580
      GitHub-Last-Rev: 43aa04e876655e31fc1c4b2b5ae0702472e49102
      GitHub-Pull-Request: golang/go#27983
      Reviewed-on: https://go-review.googlesource.com/c/141645
      Reviewed-by: Ian Lance Taylor <iant@golang.org>
    • regexp: add partial Deprecation comment to Copy · bf68744a
      Russ Cox authored
      Change-Id: I21b7817e604a48330f1ee250f7b1b2adc1f16067
      Reviewed-on: https://go-review.googlesource.com/c/139784
      Run-TryBot: Russ Cox <rsc@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
    • regexp: add DeepEqual test · 5160e0d1
      Russ Cox authored
      This locks in behavior we accidentally broke
      and then restored during the Go 1.11 cycle.
      See #26219.
      
      It also locks in new behavior that DeepEqual
      always works, instead of only usually working.
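The behavior being locked in can be sketched as follows, assuming a helper named sameAfterUse and a Go release that includes this series of CLs: two Regexps compiled from the same pattern should compare equal under reflect.DeepEqual even after one of them has been used for matching.

```go
package main

import (
	"fmt"
	"reflect"
	"regexp"
)

// sameAfterUse compiles pattern twice, runs a match on only one copy,
// and reports whether the two values are still deeply equal.
func sameAfterUse(pattern, input string) bool {
	re1 := regexp.MustCompile(pattern)
	re2 := regexp.MustCompile(pattern)
	re1.MatchString(input) // use only one of the two
	return reflect.DeepEqual(re1, re2)
}

func main() {
	fmt.Println(sameAfterUse("a.*b.*c.*d", "abcdefghijklmn"))
}
```

Before the machine cache was eliminated, running a match could leave per-Regexp cached state behind, which is why equality only "usually" held.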
      
      This CL is the final piece of a series of CLs to make
      DeepEqual always work, by eliminating the machine
      cache and making other related optimizations.
      Overall, this whole sequence of CLs achieves:
      
      name                             old time/op    new time/op    delta
      Find-12                             264ns ± 3%     260ns ± 0%   -1.59%  (p=0.000 n=10+9)
      FindAllNoMatches-12                 140ns ± 2%     133ns ± 0%   -5.34%  (p=0.000 n=10+7)
      FindString-12                       256ns ± 0%     249ns ± 0%   -2.73%  (p=0.000 n=8+8)
      FindSubmatch-12                     339ns ± 1%     333ns ± 1%   -1.73%  (p=0.000 n=9+10)
      FindStringSubmatch-12               322ns ± 0%     322ns ± 1%     ~     (p=0.450 n=8+10)
      Literal-12                          100ns ± 2%      92ns ± 0%   -8.13%  (p=0.000 n=10+10)
      NotLiteral-12                      1.50µs ± 0%    1.47µs ± 0%   -1.65%  (p=0.000 n=8+8)
      MatchClass-12                      2.18µs ± 0%    2.15µs ± 0%   -1.05%  (p=0.000 n=10+9)
      MatchClass_InRange-12              2.12µs ± 0%    2.11µs ± 0%   -0.65%  (p=0.000 n=10+9)
      ReplaceAll-12                      1.41µs ± 0%    1.41µs ± 0%     ~     (p=0.254 n=7+10)
      AnchoredLiteralShortNonMatch-12    89.8ns ± 0%    81.5ns ± 0%   -9.22%  (p=0.000 n=8+9)
      AnchoredLiteralLongNonMatch-12      105ns ± 3%      97ns ± 0%   -7.21%  (p=0.000 n=10+10)
      AnchoredShortMatch-12               141ns ± 0%     128ns ± 0%   -9.22%  (p=0.000 n=9+9)
      AnchoredLongMatch-12                276ns ± 4%     253ns ± 2%   -8.23%  (p=0.000 n=10+10)
      OnePassShortA-12                    620ns ± 0%     587ns ± 0%   -5.26%  (p=0.000 n=10+6)
      NotOnePassShortA-12                 575ns ± 3%     547ns ± 1%   -4.77%  (p=0.000 n=10+10)
      OnePassShortB-12                    493ns ± 0%     455ns ± 0%   -7.62%  (p=0.000 n=8+9)
      NotOnePassShortB-12                 423ns ± 0%     406ns ± 1%   -3.95%  (p=0.000 n=8+10)
      OnePassLongPrefix-12                112ns ± 0%     109ns ± 1%   -2.77%  (p=0.000 n=9+10)
      OnePassLongNotPrefix-12             405ns ± 0%     349ns ± 0%  -13.74%  (p=0.000 n=8+9)
      MatchParallelShared-12              501ns ± 1%      38ns ± 2%  -92.42%  (p=0.000 n=10+10)
      MatchParallelCopied-12             39.1ns ± 0%    38.6ns ± 1%   -1.38%  (p=0.002 n=6+10)
      QuoteMetaAll-12                    94.6ns ± 0%    94.8ns ± 0%   +0.26%  (p=0.001 n=10+9)
      QuoteMetaNone-12                   52.7ns ± 0%    52.7ns ± 0%     ~     (all equal)
      Match/Easy0/32-12                  79.1ns ± 0%    72.0ns ± 0%   -8.95%  (p=0.000 n=9+9)
      Match/Easy0/1K-12                   307ns ± 1%     297ns ± 0%   -3.32%  (p=0.000 n=10+7)
      Match/Easy0/32K-12                 4.65µs ± 2%    4.67µs ± 1%     ~     (p=0.633 n=10+8)
      Match/Easy0/1M-12                   234µs ± 0%     234µs ± 0%     ~     (p=0.684 n=10+10)
      Match/Easy0/32M-12                 7.98ms ± 1%    7.96ms ± 0%   -0.31%  (p=0.014 n=9+9)
      Match/Easy0i/32-12                 1.13µs ± 1%    1.10µs ± 0%   -3.18%  (p=0.000 n=9+10)
      Match/Easy0i/1K-12                 32.5µs ± 0%    31.7µs ± 0%   -2.61%  (p=0.000 n=9+9)
      Match/Easy0i/32K-12                1.59ms ± 0%    1.26ms ± 0%  -20.71%  (p=0.000 n=9+7)
      Match/Easy0i/1M-12                 51.0ms ± 0%    40.4ms ± 0%  -20.68%  (p=0.000 n=10+7)
      Match/Easy0i/32M-12                 1.63s ± 0%     1.30s ± 0%  -20.62%  (p=0.001 n=7+7)
      Match/Easy1/32-12                  75.1ns ± 1%    67.4ns ± 0%  -10.24%  (p=0.000 n=8+10)
      Match/Easy1/1K-12                   861ns ± 0%     879ns ± 0%   +2.18%  (p=0.000 n=8+8)
      Match/Easy1/32K-12                 39.2µs ± 1%    34.1µs ± 0%  -13.01%  (p=0.000 n=10+8)
      Match/Easy1/1M-12                  1.38ms ± 0%    1.17ms ± 0%  -15.06%  (p=0.000 n=10+8)
      Match/Easy1/32M-12                 44.2ms ± 1%    37.5ms ± 0%  -15.15%  (p=0.000 n=10+9)
      Match/Medium/32-12                 1.04µs ± 1%    1.03µs ± 0%   -0.64%  (p=0.002 n=9+8)
      Match/Medium/1K-12                 31.3µs ± 0%    31.2µs ± 0%   -0.36%  (p=0.000 n=9+9)
      Match/Medium/32K-12                1.44ms ± 0%    1.20ms ± 0%  -17.02%  (p=0.000 n=8+7)
      Match/Medium/1M-12                 46.1ms ± 0%    38.2ms ± 0%  -17.14%  (p=0.001 n=6+8)
      Match/Medium/32M-12                 1.48s ± 0%     1.23s ± 0%  -17.10%  (p=0.000 n=9+7)
      Match/Hard/32-12                   1.54µs ± 1%    1.47µs ± 0%   -4.64%  (p=0.000 n=9+10)
      Match/Hard/1K-12                   46.4µs ± 1%    44.4µs ± 0%   -4.35%  (p=0.000 n=9+8)
      Match/Hard/32K-12                  2.19ms ± 0%    1.78ms ± 7%  -18.74%  (p=0.000 n=8+10)
      Match/Hard/1M-12                   70.1ms ± 0%    57.7ms ± 7%  -17.62%  (p=0.000 n=8+10)
      Match/Hard/32M-12                   2.24s ± 0%     1.84s ± 8%  -17.92%  (p=0.000 n=8+10)
      Match/Hard1/32-12                  8.17µs ± 1%    7.95µs ± 0%   -2.72%  (p=0.000 n=8+10)
      Match/Hard1/1K-12                   254µs ± 2%     245µs ± 0%   -3.62%  (p=0.000 n=9+10)
      Match/Hard1/32K-12                 9.58ms ± 1%    8.54ms ± 7%  -10.87%  (p=0.000 n=10+10)
      Match/Hard1/1M-12                   306ms ± 1%     271ms ± 8%  -11.42%  (p=0.000 n=9+10)
      Match/Hard1/32M-12                  9.79s ± 1%     8.58s ± 9%  -12.37%  (p=0.000 n=9+10)
      Match_onepass_regex/32-12           808ns ± 0%     716ns ± 1%  -11.39%  (p=0.000 n=8+9)
      Match_onepass_regex/1K-12          27.8µs ± 0%    19.9µs ± 2%  -28.51%  (p=0.000 n=8+9)
      Match_onepass_regex/32K-12          925µs ± 0%     631µs ± 2%  -31.71%  (p=0.000 n=9+9)
      Match_onepass_regex/1M-12          29.5ms ± 0%    20.2ms ± 2%  -31.53%  (p=0.000 n=10+9)
      Match_onepass_regex/32M-12          945ms ± 0%     648ms ± 2%  -31.39%  (p=0.000 n=9+9)
      CompileOnepass-12                  4.67µs ± 0%    4.60µs ± 0%   -1.48%  (p=0.000 n=10+10)
      [Geo mean]                         24.5µs         21.4µs       -12.94%
      
      https://perf.golang.org/search?q=upload:20181004.5
      
      Change-Id: Icb17b306830dc5489efbb55900937b94ce0eb047
      Reviewed-on: https://go-review.googlesource.com/c/139783
      Run-TryBot: Russ Cox <rsc@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
    • regexp: evaluate context flags lazily · 3ca1f28e
      Russ Cox authored
      There's no point in computing whether we're at the
      beginning of the line if the NFA isn't going to ask.
      Wait to compute that until asked.
      
      Whatever minor slowdowns were introduced by
      the conversion to pools that were not repaid by
      other optimizations are taken care of by this one.
      
      name                             old time/op    new time/op    delta
      Find-12                             252ns ± 0%     260ns ± 0%   +3.34%  (p=0.000 n=10+8)
      FindAllNoMatches-12                 136ns ± 4%     134ns ± 4%   -0.96%  (p=0.033 n=10+10)
      FindString-12                       246ns ± 0%     250ns ± 0%   +1.46%  (p=0.000 n=8+10)
      FindSubmatch-12                     332ns ± 1%     332ns ± 0%     ~     (p=0.101 n=9+10)
      FindStringSubmatch-12               321ns ± 1%     322ns ± 1%     ~     (p=0.717 n=9+10)
      Literal-12                         91.6ns ± 0%    92.3ns ± 0%   +0.74%  (p=0.000 n=9+9)
      NotLiteral-12                      1.47µs ± 0%    1.47µs ± 0%   +0.38%  (p=0.000 n=9+8)
      MatchClass-12                      2.15µs ± 0%    2.15µs ± 0%   +0.39%  (p=0.000 n=10+10)
      MatchClass_InRange-12              2.09µs ± 0%    2.11µs ± 0%   +0.75%  (p=0.000 n=9+9)
      ReplaceAll-12                      1.40µs ± 0%    1.40µs ± 0%     ~     (p=0.525 n=10+10)
      AnchoredLiteralShortNonMatch-12    83.5ns ± 0%    81.6ns ± 0%   -2.28%  (p=0.000 n=9+10)
      AnchoredLiteralLongNonMatch-12      101ns ± 0%      97ns ± 1%   -3.54%  (p=0.000 n=10+10)
      AnchoredShortMatch-12               131ns ± 0%     128ns ± 0%   -2.29%  (p=0.000 n=10+9)
      AnchoredLongMatch-12                268ns ± 1%     252ns ± 1%   -6.04%  (p=0.000 n=10+10)
      OnePassShortA-12                    614ns ± 0%     587ns ± 1%   -4.33%  (p=0.000 n=6+10)
      NotOnePassShortA-12                 552ns ± 0%     547ns ± 1%   -0.89%  (p=0.000 n=10+10)
      OnePassShortB-12                    494ns ± 0%     455ns ± 0%   -7.96%  (p=0.000 n=9+9)
      NotOnePassShortB-12                 411ns ± 0%     406ns ± 0%   -1.30%  (p=0.000 n=9+9)
      OnePassLongPrefix-12                109ns ± 0%     108ns ± 1%     ~     (p=0.064 n=8+9)
      OnePassLongNotPrefix-12             403ns ± 0%     349ns ± 0%  -13.30%  (p=0.000 n=9+8)
      MatchParallelShared-12             38.9ns ± 1%    37.9ns ± 1%   -2.65%  (p=0.000 n=10+8)
      MatchParallelCopied-12             39.2ns ± 1%    38.3ns ± 2%   -2.20%  (p=0.001 n=10+10)
      QuoteMetaAll-12                    94.5ns ± 0%    94.7ns ± 0%   +0.18%  (p=0.043 n=10+9)
      QuoteMetaNone-12                   52.7ns ± 0%    52.7ns ± 0%     ~     (all equal)
      Match/Easy0/32-12                  72.2ns ± 0%    71.9ns ± 0%   -0.38%  (p=0.009 n=8+10)
      Match/Easy0/1K-12                   296ns ± 1%     297ns ± 0%   +0.51%  (p=0.001 n=10+9)
      Match/Easy0/32K-12                 4.57µs ± 3%    4.61µs ± 2%     ~     (p=0.280 n=10+10)
      Match/Easy0/1M-12                   234µs ± 0%     234µs ± 0%     ~     (p=0.986 n=10+10)
      Match/Easy0/32M-12                 7.96ms ± 0%    7.98ms ± 0%   +0.22%  (p=0.010 n=10+9)
      Match/Easy0i/32-12                 1.09µs ± 0%    1.10µs ± 0%   +0.23%  (p=0.000 n=8+9)
      Match/Easy0i/1K-12                 31.7µs ± 0%    31.7µs ± 0%   +0.09%  (p=0.003 n=9+8)
      Match/Easy0i/32K-12                1.61ms ± 0%    1.27ms ± 1%  -21.03%  (p=0.000 n=8+10)
      Match/Easy0i/1M-12                 51.4ms ± 0%    40.4ms ± 0%  -21.29%  (p=0.000 n=8+8)
      Match/Easy0i/32M-12                 1.65s ± 0%     1.30s ± 1%  -21.22%  (p=0.000 n=9+9)
      Match/Easy1/32-12                  67.6ns ± 1%    67.2ns ± 0%     ~     (p=0.085 n=10+9)
      Match/Easy1/1K-12                   873ns ± 2%     880ns ± 0%   +0.78%  (p=0.006 n=9+7)
      Match/Easy1/32K-12                 39.7µs ± 1%    34.3µs ± 3%  -13.53%  (p=0.000 n=10+10)
      Match/Easy1/1M-12                  1.41ms ± 1%    1.19ms ± 3%  -15.48%  (p=0.000 n=10+10)
      Match/Easy1/32M-12                 44.9ms ± 1%    38.0ms ± 2%  -15.21%  (p=0.000 n=10+10)
      Match/Medium/32-12                 1.04µs ± 0%    1.03µs ± 0%   -0.57%  (p=0.000 n=9+9)
      Match/Medium/1K-12                 31.2µs ± 0%    31.4µs ± 1%   +0.61%  (p=0.000 n=8+10)
      Match/Medium/32K-12                1.45ms ± 1%    1.20ms ± 0%  -17.70%  (p=0.000 n=10+8)
      Match/Medium/1M-12                 46.4ms ± 0%    38.4ms ± 2%  -17.32%  (p=0.000 n=6+9)
      Match/Medium/32M-12                 1.49s ± 1%     1.24s ± 1%  -16.81%  (p=0.000 n=10+10)
      Match/Hard/32-12                   1.47µs ± 0%    1.47µs ± 0%   -0.31%  (p=0.000 n=9+10)
      Match/Hard/1K-12                   44.5µs ± 1%    44.4µs ± 0%     ~     (p=0.075 n=10+10)
      Match/Hard/32K-12                  2.09ms ± 0%    1.78ms ± 7%  -14.88%  (p=0.000 n=8+10)
      Match/Hard/1M-12                   67.8ms ± 5%    56.9ms ± 7%  -16.05%  (p=0.000 n=10+10)
      Match/Hard/32M-12                   2.17s ± 5%     1.84s ± 6%  -15.21%  (p=0.000 n=10+10)
      Match/Hard1/32-12                  7.89µs ± 0%    7.94µs ± 0%   +0.61%  (p=0.000 n=9+9)
      Match/Hard1/1K-12                   246µs ± 0%     245µs ± 0%   -0.30%  (p=0.010 n=9+10)
      Match/Hard1/32K-12                 8.93ms ± 0%    8.17ms ± 0%   -8.44%  (p=0.000 n=9+8)
      Match/Hard1/1M-12                   286ms ± 0%     269ms ± 9%   -5.66%  (p=0.028 n=9+10)
      Match/Hard1/32M-12                  9.16s ± 0%     8.61s ± 8%   -5.98%  (p=0.028 n=9+10)
      Match_onepass_regex/32-12           825ns ± 0%     712ns ± 0%  -13.75%  (p=0.000 n=8+8)
      Match_onepass_regex/1K-12          28.7µs ± 1%    19.8µs ± 0%  -30.99%  (p=0.000 n=9+8)
      Match_onepass_regex/32K-12          950µs ± 1%     628µs ± 0%  -33.83%  (p=0.000 n=9+8)
      Match_onepass_regex/1M-12          30.4ms ± 0%    20.1ms ± 0%  -33.74%  (p=0.000 n=9+8)
      Match_onepass_regex/32M-12          974ms ± 1%     646ms ± 0%  -33.73%  (p=0.000 n=9+8)
      CompileOnepass-12                  4.60µs ± 0%    4.59µs ± 0%     ~     (p=0.063 n=8+9)
      [Geo mean]                         23.1µs         21.3µs        -7.44%
      
      https://perf.golang.org/search?q=upload:20181004.4
      
      Change-Id: I47cdd09f6dcde1d7c317080e9b4df42c7d0a8d24
      Reviewed-on: https://go-review.googlesource.com/c/139782
      Run-TryBot: Russ Cox <rsc@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
    • regexp: use pools for NFA machines · a376435a
      Russ Cox authored
      Now the machine struct is only used for NFA execution.
      Use global pools to cache machines instead of per-Regexp lists.
      
      Also eliminate some tail calls in NFA execution, to pay for
      the added overhead of sync.Pool.
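A minimal sketch of caching scratch state in a global sync.Pool rather than in a per-object list. The machine type, the machinePool variable, and countRunes are all assumptions for illustration and do not reflect the regexp package's actual fields.

```go
package main

import (
	"fmt"
	"sync"
)

// machine is a hypothetical stand-in for per-execution scratch state.
type machine struct {
	queue []rune
}

// machinePool is shared by all callers, so idle machines can be reused
// across goroutines and reclaimed by the garbage collector.
var machinePool = sync.Pool{
	New: func() interface{} { return new(machine) },
}

// countRunes borrows a machine, uses its buffer, and returns it.
func countRunes(input string) int {
	m := machinePool.Get().(*machine)
	m.queue = m.queue[:0] // reset state left over from a previous use
	for _, r := range input {
		m.queue = append(m.queue, r)
	}
	n := len(m.queue)
	machinePool.Put(m)
	return n
}

func main() {
	fmt.Println(countRunes("héllo")) // prints 5
}
```

A global pool keeps the owning value free of mutable cache state, at the cost of the Get and Put overhead the message mentions.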
      
      name                             old time/op    new time/op    delta
      Find-12                             252ns ± 0%     252ns ± 0%     ~     (p=1.000 n=10+10)
      FindAllNoMatches-12                 134ns ± 1%     136ns ± 4%     ~     (p=0.443 n=9+10)
      FindString-12                       246ns ± 0%     246ns ± 0%   -0.16%  (p=0.046 n=10+8)
      FindSubmatch-12                     333ns ± 2%     332ns ± 1%     ~     (p=0.489 n=10+9)
      FindStringSubmatch-12               320ns ± 0%     321ns ± 1%   +0.55%  (p=0.005 n=10+9)
      Literal-12                         91.1ns ± 0%    91.6ns ± 0%   +0.55%  (p=0.000 n=10+9)
      NotLiteral-12                      1.45µs ± 0%    1.47µs ± 0%   +0.82%  (p=0.000 n=10+9)
      MatchClass-12                      2.19µs ± 0%    2.15µs ± 0%   -2.01%  (p=0.000 n=9+10)
      MatchClass_InRange-12              2.09µs ± 0%    2.09µs ± 0%     ~     (p=0.082 n=10+9)
      ReplaceAll-12                      1.39µs ± 0%    1.40µs ± 0%   +0.50%  (p=0.000 n=10+10)
      AnchoredLiteralShortNonMatch-12    82.4ns ± 0%    83.5ns ± 0%   +1.36%  (p=0.000 n=8+9)
      AnchoredLiteralLongNonMatch-12      106ns ± 1%     101ns ± 0%   -4.36%  (p=0.000 n=10+10)
      AnchoredShortMatch-12               130ns ± 0%     131ns ± 0%   +0.77%  (p=0.000 n=9+10)
      AnchoredLongMatch-12                272ns ± 0%     268ns ± 1%   -1.46%  (p=0.000 n=8+10)
      OnePassShortA-12                    615ns ± 0%     614ns ± 0%     ~     (p=0.094 n=10+6)
      NotOnePassShortA-12                 549ns ± 0%     552ns ± 0%   +0.52%  (p=0.000 n=9+10)
      OnePassShortB-12                    494ns ± 0%     494ns ± 0%     ~     (p=0.247 n=8+9)
      NotOnePassShortB-12                 412ns ± 1%     411ns ± 0%     ~     (p=0.625 n=10+9)
      OnePassLongPrefix-12                108ns ± 0%     109ns ± 0%   +0.93%  (p=0.000 n=10+8)
      OnePassLongNotPrefix-12             402ns ± 0%     403ns ± 0%   +0.14%  (p=0.041 n=8+9)
      MatchParallelShared-12             38.6ns ± 2%    38.9ns ± 1%     ~     (p=0.172 n=9+10)
      MatchParallelCopied-12             39.4ns ± 7%    39.2ns ± 1%     ~     (p=0.423 n=10+10)
      QuoteMetaAll-12                    94.9ns ± 0%    94.5ns ± 0%   -0.42%  (p=0.000 n=9+10)
      QuoteMetaNone-12                   52.7ns ± 0%    52.7ns ± 0%     ~     (all equal)
      Match/Easy0/32-12                  72.1ns ± 0%    72.2ns ± 0%     ~     (p=0.435 n=9+8)
      Match/Easy0/1K-12                   298ns ± 0%     296ns ± 1%   -1.01%  (p=0.000 n=8+10)
      Match/Easy0/32K-12                 4.64µs ± 1%    4.57µs ± 3%   -1.39%  (p=0.030 n=10+10)
      Match/Easy0/1M-12                   234µs ± 0%     234µs ± 0%     ~     (p=0.971 n=10+10)
      Match/Easy0/32M-12                 7.95ms ± 0%    7.96ms ± 0%     ~     (p=0.278 n=9+10)
      Match/Easy0i/32-12                 1.10µs ± 0%    1.09µs ± 0%   -0.29%  (p=0.000 n=9+8)
      Match/Easy0i/1K-12                 31.8µs ± 1%    31.7µs ± 0%     ~     (p=0.704 n=10+9)
      Match/Easy0i/32K-12                1.62ms ± 1%    1.61ms ± 0%   -1.12%  (p=0.000 n=10+8)
      Match/Easy0i/1M-12                 51.8ms ± 0%    51.4ms ± 0%   -0.84%  (p=0.000 n=8+8)
      Match/Easy0i/32M-12                 1.65s ± 0%     1.65s ± 0%   -0.46%  (p=0.000 n=9+9)
      Match/Easy1/32-12                  67.7ns ± 1%    67.6ns ± 1%     ~     (p=0.723 n=10+10)
      Match/Easy1/1K-12                   873ns ± 0%     873ns ± 2%     ~     (p=0.345 n=10+9)
      Match/Easy1/32K-12                 39.4µs ± 0%    39.7µs ± 1%   +0.66%  (p=0.000 n=10+10)
      Match/Easy1/1M-12                  1.39ms ± 0%    1.41ms ± 1%   +1.10%  (p=0.000 n=10+10)
      Match/Easy1/32M-12                 44.3ms ± 0%    44.9ms ± 1%   +1.18%  (p=0.000 n=10+10)
      Match/Medium/32-12                 1.04µs ± 0%    1.04µs ± 0%   -0.58%  (p=0.000 n=9+9)
      Match/Medium/1K-12                 31.4µs ± 0%    31.2µs ± 0%   -0.62%  (p=0.000 n=8+8)
      Match/Medium/32K-12                1.45ms ± 0%    1.45ms ± 1%     ~     (p=0.356 n=9+10)
      Match/Medium/1M-12                 46.4ms ± 0%    46.4ms ± 0%     ~     (p=0.142 n=8+6)
      Match/Medium/32M-12                 1.49s ± 1%     1.49s ± 1%     ~     (p=0.739 n=10+10)
      Match/Hard/32-12                   1.48µs ± 0%    1.47µs ± 0%   -0.53%  (p=0.000 n=9+9)
      Match/Hard/1K-12                   45.0µs ± 1%    44.5µs ± 1%   -1.06%  (p=0.000 n=10+10)
      Match/Hard/32K-12                  2.24ms ± 0%    2.09ms ± 0%   -6.56%  (p=0.000 n=8+8)
      Match/Hard/1M-12                   71.6ms ± 0%    67.8ms ± 5%   -5.36%  (p=0.000 n=7+10)
      Match/Hard/32M-12                   2.29s ± 0%     2.17s ± 5%   -5.40%  (p=0.000 n=9+10)
      Match/Hard1/32-12                  7.89µs ± 0%    7.89µs ± 0%     ~     (p=0.053 n=9+9)
      Match/Hard1/1K-12                   244µs ± 0%     246µs ± 0%   +0.71%  (p=0.000 n=10+9)
      Match/Hard1/32K-12                 10.3ms ± 0%     8.9ms ± 0%  -13.76%  (p=0.000 n=10+9)
      Match/Hard1/1M-12                   331ms ± 0%     286ms ± 0%  -13.72%  (p=0.000 n=9+9)
      Match/Hard1/32M-12                  10.6s ± 0%      9.2s ± 0%  -13.72%  (p=0.000 n=10+9)
      Match_onepass_regex/32-12           830ns ± 0%     825ns ± 0%   -0.57%  (p=0.000 n=9+8)
      Match_onepass_regex/1K-12          28.7µs ± 1%    28.7µs ± 1%   -0.22%  (p=0.040 n=9+9)
      Match_onepass_regex/32K-12          949µs ± 0%     950µs ± 1%     ~     (p=0.236 n=8+9)
      Match_onepass_regex/1M-12          30.4ms ± 0%    30.4ms ± 0%     ~     (p=0.059 n=8+9)
      Match_onepass_regex/32M-12          973ms ± 0%     974ms ± 1%     ~     (p=0.258 n=9+9)
      CompileOnepass-12                  4.64µs ± 0%    4.60µs ± 0%   -0.90%  (p=0.000 n=10+8)
      [Geo mean]                         23.3µs         23.1µs        -1.16%
      
      https://perf.golang.org/search?q=upload:20181004.3
      
      Change-Id: I46f3d52ce89c8cd992cf554473c27af81fd81bfd
      Reviewed-on: https://go-review.googlesource.com/c/139781
      Run-TryBot: Russ Cox <rsc@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
      a376435a
    • Russ Cox's avatar
      regexp: split one-pass execution out of machine struct · 60b29711
      Russ Cox authored
      This allows the one-pass executions to have their
      own pool of (much smaller) allocated structures.
      A step toward eliminating the per-Regexp machine cache.
      
      Not much effect on benchmarks, since there are no
      optimizations here, and pools are a tiny bit slower than a
      locked data structure for single-threaded code.
      
      name                             old time/op    new time/op    delta
      Find-12                             254ns ± 0%     252ns ± 0%  -0.94%  (p=0.000 n=9+10)
      FindAllNoMatches-12                 135ns ± 0%     134ns ± 1%  -0.49%  (p=0.002 n=9+9)
      FindString-12                       247ns ± 0%     246ns ± 0%  -0.24%  (p=0.003 n=8+10)
      FindSubmatch-12                     334ns ± 0%     333ns ± 2%    ~     (p=0.283 n=10+10)
      FindStringSubmatch-12               321ns ± 0%     320ns ± 0%  -0.51%  (p=0.000 n=9+10)
      Literal-12                         92.2ns ± 0%    91.1ns ± 0%  -1.25%  (p=0.000 n=9+10)
      NotLiteral-12                      1.47µs ± 0%    1.45µs ± 0%  -0.99%  (p=0.000 n=9+10)
      MatchClass-12                      2.17µs ± 0%    2.19µs ± 0%  +0.84%  (p=0.000 n=7+9)
      MatchClass_InRange-12              2.13µs ± 0%    2.09µs ± 0%  -1.70%  (p=0.000 n=10+10)
      ReplaceAll-12                      1.39µs ± 0%    1.39µs ± 0%  +0.51%  (p=0.000 n=10+10)
      AnchoredLiteralShortNonMatch-12    83.2ns ± 0%    82.4ns ± 0%  -0.96%  (p=0.000 n=8+8)
      AnchoredLiteralLongNonMatch-12      105ns ± 0%     106ns ± 1%    ~     (p=0.087 n=10+10)
      AnchoredShortMatch-12               131ns ± 0%     130ns ± 0%  -0.76%  (p=0.000 n=10+9)
      AnchoredLongMatch-12                267ns ± 0%     272ns ± 0%  +2.01%  (p=0.000 n=10+8)
      OnePassShortA-12                    611ns ± 0%     615ns ± 0%  +0.61%  (p=0.000 n=9+10)
      NotOnePassShortA-12                 552ns ± 0%     549ns ± 0%  -0.46%  (p=0.000 n=8+9)
      OnePassShortB-12                    491ns ± 0%     494ns ± 0%  +0.61%  (p=0.000 n=8+8)
      NotOnePassShortB-12                 412ns ± 0%     412ns ± 1%    ~     (p=0.151 n=9+10)
      OnePassLongPrefix-12                112ns ± 0%     108ns ± 0%  -3.57%  (p=0.000 n=10+10)
      OnePassLongNotPrefix-12             410ns ± 0%     402ns ± 0%  -1.95%  (p=0.000 n=9+8)
      MatchParallelShared-12             38.8ns ± 1%    38.6ns ± 2%    ~     (p=0.536 n=10+9)
      MatchParallelCopied-12             39.2ns ± 3%    39.4ns ± 7%    ~     (p=0.986 n=10+10)
      QuoteMetaAll-12                    94.6ns ± 0%    94.9ns ± 0%  +0.29%  (p=0.001 n=8+9)
      QuoteMetaNone-12                   52.7ns ± 0%    52.7ns ± 0%    ~     (all equal)
      Match/Easy0/32-12                  72.9ns ± 0%    72.1ns ± 0%  -1.07%  (p=0.000 n=9+9)
      Match/Easy0/1K-12                   298ns ± 0%     298ns ± 0%    ~     (p=0.140 n=6+8)
      Match/Easy0/32K-12                 4.60µs ± 2%    4.64µs ± 1%    ~     (p=0.171 n=10+10)
      Match/Easy0/1M-12                   235µs ± 0%     234µs ± 0%  -0.14%  (p=0.004 n=10+10)
      Match/Easy0/32M-12                 7.96ms ± 0%    7.95ms ± 0%  -0.12%  (p=0.043 n=10+9)
      Match/Easy0i/32-12                 1.09µs ± 0%    1.10µs ± 0%  +0.15%  (p=0.000 n=8+9)
      Match/Easy0i/1K-12                 31.7µs ± 0%    31.8µs ± 1%    ~     (p=0.905 n=9+10)
      Match/Easy0i/32K-12                1.61ms ± 0%    1.62ms ± 1%  +1.12%  (p=0.000 n=9+10)
      Match/Easy0i/1M-12                 51.4ms ± 0%    51.8ms ± 0%  +0.85%  (p=0.000 n=8+8)
      Match/Easy0i/32M-12                 1.65s ± 1%     1.65s ± 0%    ~     (p=0.113 n=9+9)
      Match/Easy1/32-12                  67.9ns ± 0%    67.7ns ± 1%    ~     (p=0.232 n=8+10)
      Match/Easy1/1K-12                   884ns ± 0%     873ns ± 0%  -1.29%  (p=0.000 n=9+10)
      Match/Easy1/32K-12                 39.2µs ± 0%    39.4µs ± 0%  +0.50%  (p=0.000 n=9+10)
      Match/Easy1/1M-12                  1.39ms ± 0%    1.39ms ± 0%  +0.29%  (p=0.000 n=9+10)
      Match/Easy1/32M-12                 44.2ms ± 1%    44.3ms ± 0%  +0.21%  (p=0.029 n=10+10)
      Match/Medium/32-12                 1.05µs ± 0%    1.04µs ± 0%  -0.27%  (p=0.001 n=8+9)
      Match/Medium/1K-12                 31.3µs ± 0%    31.4µs ± 0%  +0.39%  (p=0.000 n=9+8)
      Match/Medium/32K-12                1.45ms ± 0%    1.45ms ± 0%  +0.33%  (p=0.000 n=8+9)
      Match/Medium/1M-12                 46.2ms ± 0%    46.4ms ± 0%  +0.35%  (p=0.000 n=9+8)
      Match/Medium/32M-12                 1.48s ± 0%     1.49s ± 1%  +0.70%  (p=0.000 n=8+10)
      Match/Hard/32-12                   1.49µs ± 0%    1.48µs ± 0%  -0.43%  (p=0.000 n=10+9)
      Match/Hard/1K-12                   45.1µs ± 1%    45.0µs ± 1%    ~     (p=0.393 n=10+10)
      Match/Hard/32K-12                  2.18ms ± 1%    2.24ms ± 0%  +2.71%  (p=0.000 n=9+8)
      Match/Hard/1M-12                   69.7ms ± 1%    71.6ms ± 0%  +2.76%  (p=0.000 n=9+7)
      Match/Hard/32M-12                   2.23s ± 1%     2.29s ± 0%  +2.65%  (p=0.000 n=9+9)
      Match/Hard1/32-12                  7.89µs ± 0%    7.89µs ± 0%    ~     (p=0.286 n=9+9)
      Match/Hard1/1K-12                   244µs ± 0%     244µs ± 0%    ~     (p=0.905 n=9+10)
      Match/Hard1/32K-12                 10.3ms ± 0%    10.3ms ± 0%    ~     (p=0.796 n=10+10)
      Match/Hard1/1M-12                   331ms ± 0%     331ms ± 0%    ~     (p=0.167 n=8+9)
      Match/Hard1/32M-12                  10.6s ± 0%     10.6s ± 0%    ~     (p=0.315 n=8+10)
      Match_onepass_regex/32-12           812ns ± 0%     830ns ± 0%  +2.19%  (p=0.000 n=10+9)
      Match_onepass_regex/1K-12          28.5µs ± 0%    28.7µs ± 1%  +0.97%  (p=0.000 n=10+9)
      Match_onepass_regex/32K-12          936µs ± 0%     949µs ± 0%  +1.43%  (p=0.000 n=10+8)
      Match_onepass_regex/1M-12          30.2ms ± 0%    30.4ms ± 0%  +0.62%  (p=0.000 n=10+8)
      Match_onepass_regex/32M-12          970ms ± 0%     973ms ± 0%  +0.35%  (p=0.000 n=10+9)
      CompileOnepass-12                  4.63µs ± 1%    4.64µs ± 0%    ~     (p=0.060 n=10+10)
      [Geo mean]                         23.3µs         23.3µs       +0.12%
      
      https://perf.golang.org/search?q=upload:20181004.2
      
      Change-Id: Iff9e9f9d4a4698162126a2f300e8ed1b1a39361e
      Reviewed-on: https://go-review.googlesource.com/c/139780
      Run-TryBot: Russ Cox <rsc@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
      60b29711
    • Russ Cox's avatar
      regexp: split bit-state execution out of machine struct · 2d4346b3
      Russ Cox authored
      This allows the bit-state executions to have their
      own pool of allocated structures. A step toward
      eliminating the per-Regexp machine cache.
      
      Note especially the -92% on MatchParallelShared.
      This is real but not a complete story: the other
      execution engines still need to be de-shared,
      but the benchmark was only using bit-state.
      
      The tiny slowdowns in unrelated code are noise.
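
      For context, the MatchParallelShared case corresponds to many goroutines
      matching against one shared compiled Regexp. The following is a minimal
      sketch of that usage pattern, not the benchmark itself; countMatches and
      demoShared are hypothetical helper names.

```go
package main

import (
	"fmt"
	"regexp"
	"sync"
)

// countMatches calls MatchString on one shared *regexp.Regexp from a
// separate goroutine per input and counts how many inputs match.
func countMatches(re *regexp.Regexp, inputs []string) int {
	var (
		wg    sync.WaitGroup
		mu    sync.Mutex
		count int
	)
	for _, s := range inputs {
		wg.Add(1)
		go func(s string) {
			defer wg.Done()
			if re.MatchString(s) {
				mu.Lock()
				count++
				mu.Unlock()
			}
		}(s)
	}
	wg.Wait()
	return count
}

// demoShared compiles one pattern and shares it across all goroutines.
func demoShared() int {
	re := regexp.MustCompile(`^a+b$`)
	return countMatches(re, []string{"aab", "ab", "ba", "b"})
}

func main() {
	fmt.Println(demoShared()) // prints 2
}
```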
      
      name                             old time/op    new time/op    delta
      Find-12                             264ns ± 3%     254ns ± 0%   -3.86%  (p=0.000 n=10+9)
      FindAllNoMatches-12                 140ns ± 2%     135ns ± 0%   -3.91%  (p=0.000 n=10+9)
      FindString-12                       256ns ± 0%     247ns ± 0%   -3.52%  (p=0.000 n=8+8)
      FindSubmatch-12                     339ns ± 1%     334ns ± 0%   -1.41%  (p=0.000 n=9+10)
      FindStringSubmatch-12               322ns ± 0%     321ns ± 0%   -0.21%  (p=0.005 n=8+9)
      Literal-12                          100ns ± 2%      92ns ± 0%   -8.10%  (p=0.000 n=10+9)
      NotLiteral-12                      1.50µs ± 0%    1.47µs ± 0%   -1.91%  (p=0.000 n=8+9)
      MatchClass-12                      2.18µs ± 0%    2.17µs ± 0%   -0.20%  (p=0.001 n=10+7)
      MatchClass_InRange-12              2.12µs ± 0%    2.13µs ± 0%   +0.23%  (p=0.000 n=10+10)
      ReplaceAll-12                      1.41µs ± 0%    1.39µs ± 0%   -1.30%  (p=0.000 n=7+10)
      AnchoredLiteralShortNonMatch-12    89.8ns ± 0%    83.2ns ± 0%   -7.35%  (p=0.000 n=8+8)
      AnchoredLiteralLongNonMatch-12      105ns ± 3%     105ns ± 0%     ~     (p=0.186 n=10+10)
      AnchoredShortMatch-12               141ns ± 0%     131ns ± 0%   -7.09%  (p=0.000 n=9+10)
      AnchoredLongMatch-12                276ns ± 4%     267ns ± 0%   -3.23%  (p=0.000 n=10+10)
      OnePassShortA-12                    620ns ± 0%     611ns ± 0%   -1.39%  (p=0.000 n=10+9)
      NotOnePassShortA-12                 575ns ± 3%     552ns ± 0%   -3.97%  (p=0.000 n=10+8)
      OnePassShortB-12                    493ns ± 0%     491ns ± 0%   -0.33%  (p=0.000 n=8+8)
      NotOnePassShortB-12                 423ns ± 0%     412ns ± 0%   -2.60%  (p=0.000 n=8+9)
      OnePassLongPrefix-12                112ns ± 0%     112ns ± 0%     ~     (all equal)
      OnePassLongNotPrefix-12             405ns ± 0%     410ns ± 0%   +1.23%  (p=0.000 n=8+9)
      MatchParallelShared-12              501ns ± 1%      39ns ± 1%  -92.27%  (p=0.000 n=10+10)
      MatchParallelCopied-12             39.1ns ± 0%    39.2ns ± 3%     ~     (p=0.785 n=6+10)
      QuoteMetaAll-12                    94.6ns ± 0%    94.6ns ± 0%     ~     (p=0.439 n=10+8)
      QuoteMetaNone-12                   52.7ns ± 0%    52.7ns ± 0%     ~     (all equal)
      Match/Easy0/32-12                  79.1ns ± 0%    72.9ns ± 0%   -7.85%  (p=0.000 n=9+9)
      Match/Easy0/1K-12                   307ns ± 1%     298ns ± 0%   -2.99%  (p=0.000 n=10+6)
      Match/Easy0/32K-12                 4.65µs ± 2%    4.60µs ± 2%     ~     (p=0.159 n=10+10)
      Match/Easy0/1M-12                   234µs ± 0%     235µs ± 0%   +0.17%  (p=0.003 n=10+10)
      Match/Easy0/32M-12                 7.98ms ± 1%    7.96ms ± 0%     ~     (p=0.278 n=9+10)
      Match/Easy0i/32-12                 1.13µs ± 1%    1.09µs ± 0%   -3.24%  (p=0.000 n=9+8)
      Match/Easy0i/1K-12                 32.5µs ± 0%    31.7µs ± 0%   -2.66%  (p=0.000 n=9+9)
      Match/Easy0i/32K-12                1.59ms ± 0%    1.61ms ± 0%   +0.75%  (p=0.000 n=9+9)
      Match/Easy0i/1M-12                 51.0ms ± 0%    51.4ms ± 0%   +0.77%  (p=0.000 n=10+8)
      Match/Easy0i/32M-12                 1.63s ± 0%     1.65s ± 1%   +1.24%  (p=0.000 n=7+9)
      Match/Easy1/32-12                  75.1ns ± 1%    67.9ns ± 0%   -9.54%  (p=0.000 n=8+8)
      Match/Easy1/1K-12                   861ns ± 0%     884ns ± 0%   +2.71%  (p=0.000 n=8+9)
      Match/Easy1/32K-12                 39.2µs ± 1%    39.2µs ± 0%     ~     (p=0.090 n=10+9)
      Match/Easy1/1M-12                  1.38ms ± 0%    1.39ms ± 0%     ~     (p=0.095 n=10+9)
      Match/Easy1/32M-12                 44.2ms ± 1%    44.2ms ± 1%     ~     (p=0.218 n=10+10)
      Match/Medium/32-12                 1.04µs ± 1%    1.05µs ± 0%   +1.05%  (p=0.000 n=9+8)
      Match/Medium/1K-12                 31.3µs ± 0%    31.3µs ± 0%   -0.14%  (p=0.004 n=9+9)
      Match/Medium/32K-12                1.44ms ± 0%    1.45ms ± 0%   +0.18%  (p=0.001 n=8+8)
      Match/Medium/1M-12                 46.1ms ± 0%    46.2ms ± 0%   +0.13%  (p=0.003 n=6+9)
      Match/Medium/32M-12                 1.48s ± 0%     1.48s ± 0%   +0.20%  (p=0.002 n=9+8)
      Match/Hard/32-12                   1.54µs ± 1%    1.49µs ± 0%   -3.60%  (p=0.000 n=9+10)
      Match/Hard/1K-12                   46.4µs ± 1%    45.1µs ± 1%   -2.78%  (p=0.000 n=9+10)
      Match/Hard/32K-12                  2.19ms ± 0%    2.18ms ± 1%   -0.51%  (p=0.006 n=8+9)
      Match/Hard/1M-12                   70.1ms ± 0%    69.7ms ± 1%   -0.52%  (p=0.006 n=8+9)
      Match/Hard/32M-12                   2.24s ± 0%     2.23s ± 1%   -0.42%  (p=0.046 n=8+9)
      Match/Hard1/32-12                  8.17µs ± 1%    7.89µs ± 0%   -3.42%  (p=0.000 n=8+9)
      Match/Hard1/1K-12                   254µs ± 2%     244µs ± 0%   -3.91%  (p=0.000 n=9+9)
      Match/Hard1/32K-12                 9.58ms ± 1%   10.35ms ± 0%   +8.00%  (p=0.000 n=10+10)
      Match/Hard1/1M-12                   306ms ± 1%     331ms ± 0%   +8.27%  (p=0.000 n=9+8)
      Match/Hard1/32M-12                  9.79s ± 1%    10.60s ± 0%   +8.29%  (p=0.000 n=9+8)
      Match_onepass_regex/32-12           808ns ± 0%     812ns ± 0%   +0.47%  (p=0.000 n=8+10)
      Match_onepass_regex/1K-12          27.8µs ± 0%    28.5µs ± 0%   +2.32%  (p=0.000 n=8+10)
      Match_onepass_regex/32K-12          925µs ± 0%     936µs ± 0%   +1.24%  (p=0.000 n=9+10)
      Match_onepass_regex/1M-12          29.5ms ± 0%    30.2ms ± 0%   +2.38%  (p=0.000 n=10+10)
      Match_onepass_regex/32M-12          945ms ± 0%     970ms ± 0%   +2.60%  (p=0.000 n=9+10)
      CompileOnepass-12                  4.67µs ± 0%    4.63µs ± 1%   -0.84%  (p=0.000 n=10+10)
      [Geo mean]                         24.5µs         23.3µs        -5.04%
      
      https://perf.golang.org/search?q=upload:20181004.1
      
      Change-Id: Idbc2b76223718265657819ff38be2d9aba1c54b4
      Reviewed-on: https://go-review.googlesource.com/c/139779
      Run-TryBot: Russ Cox <rsc@golang.org>
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      2d4346b3
    • Russ Cox's avatar
      testing: implement -benchtime=100x · 8e0aea16
      Russ Cox authored
      When running benchmarks with profilers and trying to
      compare one run against another, it is very useful to be
      able to force each run to execute exactly the same number
      of iterations.
      
      Discussion on the proposal issue #24735 led to the decision
      to overload -benchtime, so that instead of saying
      -benchtime 10s to run a benchmark for 10 seconds,
      you say -benchtime 100x to run a benchmark 100 times.
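
      To make the overloaded syntax concrete, here is a minimal sketch of how
      a value such as 10s or 100x could be told apart. The parseBenchtime
      function is a hypothetical illustration under that assumption, not the
      testing package's actual implementation.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
	"time"
)

// parseBenchtime is a hypothetical helper. A value ending in "x" is read
// as an iteration count; anything else is read as a duration.
// Exactly one of iters and d is non-zero on success.
func parseBenchtime(s string) (iters int, d time.Duration, err error) {
	if strings.HasSuffix(s, "x") {
		n, perr := strconv.ParseInt(strings.TrimSuffix(s, "x"), 10, 0)
		if perr != nil || n <= 0 {
			return 0, 0, fmt.Errorf("invalid count %q", s)
		}
		return int(n), 0, nil
	}
	dur, perr := time.ParseDuration(s)
	if perr != nil || dur <= 0 {
		return 0, 0, fmt.Errorf("invalid duration %q", s)
	}
	return 0, dur, nil
}

func main() {
	for _, v := range []string{"10s", "100x", "x"} {
		n, d, err := parseBenchtime(v)
		fmt.Println(v, n, d, err)
	}
}
```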
      
      Fixes #24735.
      
      Change-Id: Id17c5bd18bd09987bb48ed12420d61ae9e200fd7
      Reviewed-on: https://go-review.googlesource.com/c/139258
      Run-TryBot: Russ Cox <rsc@golang.org>
      Reviewed-by: Austin Clements <austin@google.com>
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
      8e0aea16
    • Robert Griesemer's avatar
      go/types: remove a test case and update comment · 56131cbd
      Robert Griesemer authored
      The original need for the extra test case and issue was eliminated
      by https://golang.org/cl/116815 which introduced systematic cycle
      detection. Now that we correctly report the cycle, we can't say much
      about the invalid cast anyway (the type is invalid due to the cycle).
      
      A more sophisticated approach would be able to tell the size of
      a function type independent of the details of that type, but the
      type-checker is not set up for this kind of lazy type-checking.
      
      Fixes #23127.
      
      Change-Id: Ia8479e66baf630ce96f6f36770c8e1c810c59ddc
      Reviewed-on: https://go-review.googlesource.com/c/141640
      Run-TryBot: Robert Griesemer <gri@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Alan Donovan <adonovan@google.com>
      56131cbd
    • Martin Möhrmann's avatar
      internal/cpu: use 'off' for disabling cpu capabilities instead of '0' · 4fb8b1de
      Martin Möhrmann authored
      Updates #27218
      
      Change-Id: I4ce20376fd601b5f958d79014af7eaf89e9de613
      Reviewed-on: https://go-review.googlesource.com/c/141818
      Run-TryBot: Martin Möhrmann <moehrmann@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
      4fb8b1de
    • Tobias Klauser's avatar
      internal/poll: add FD.Fsync on aix · d82e51a1
      Tobias Klauser authored
      Follow-up for CL 138717. This fixes the build of the os package on
      aix.
      
      Change-Id: I879b9360e71837ab622ae3a7b6144782cf5a9ce7
      Reviewed-on: https://go-review.googlesource.com/c/141797
      Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
      d82e51a1
    • Martin Möhrmann's avatar
      internal/cpu: enable support for GODEBUGCPU in non-experimental builds · a5248acd
      Martin Möhrmann authored
      Enabling GODEBUGCPU without the need to set GOEXPERIMENT=debugcpu enables
      trybots and builders to run tests for GODEBUGCPU features in upcoming CLs
      that will implement the new syntax and features for non-experimental
      GODEBUGCPU support from proposal golang.org/issue/27218.
      
      Updates #27218
      
      Change-Id: Icc69e51e736711a86b02b46bd441ffc28423beba
      Reviewed-on: https://go-review.googlesource.com/c/141817
      Run-TryBot: Martin Möhrmann <moehrmann@google.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
      a5248acd
    • Brad Fitzpatrick's avatar
      net/http: flesh out Transport's HTTP/1 CONNECT+bidi support to match HTTP/2 · da6c1683
      Brad Fitzpatrick authored
      Fixes #17227
      
      Change-Id: I0f8964593d69623b85d5759f6276063ee62b2915
      Reviewed-on: https://go-review.googlesource.com/c/123156
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
      da6c1683
    • Yuval Pavel Zholkover's avatar
      syscall: correctly pad with NUL in FreeBSD convertFromDirents11 · e19f5754
      Yuval Pavel Zholkover authored
      We weren't writing a terminating NUL after dstDirent.Namlen bytes of dstDirent.Name.
      And we weren't filling the possible additional bytes until dstDirent.Reclen.
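
      The padding described above can be sketched as follows. This is a
      minimal illustration over a plain byte slice; writeName and demoPad are
      hypothetical helpers, not the syscall package's code, and the slice
      stands in for the name area of a fixed-size record.

```go
package main

import "fmt"

// writeName copies name into dst and fills every remaining byte of dst
// with NUL, so the name is always followed by a terminator and no stale
// bytes survive. It reports false if dst has no room for the terminator.
func writeName(dst []byte, name string) bool {
	if len(name) >= len(dst) {
		return false
	}
	n := copy(dst, name)
	for i := n; i < len(dst); i++ {
		dst[i] = 0
	}
	return true
}

// demoPad starts from a buffer holding stale data and overwrites it.
func demoPad() string {
	buf := []byte("XXXXXXXX")
	writeName(buf, "abc")
	return string(buf)
}

func main() {
	fmt.Printf("%q\n", demoPad()) // prints "abc\x00\x00\x00\x00\x00"
}
```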
      
      Fixes #28131
      
      Change-Id: Id691c25225795c0dbb0d7004bfca7bb7fc706de9
      Reviewed-on: https://go-review.googlesource.com/c/141297
      Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
      e19f5754
    • Mihai Todor's avatar
      encoding/base64: fix typo in decodeQuantum docs · 1f95e0a9
      Mihai Todor authored
      Change-Id: I643540bcea574d8a70b79237d97097dcc7368766
      GitHub-Last-Rev: e2be58d1ab84f91dfbba1067aae7145f24fd650d
      GitHub-Pull-Request: golang/go#28125
      Reviewed-on: https://go-review.googlesource.com/c/141119
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
      1f95e0a9
    • Elias Naur's avatar
      os: make UserHomeDir return "/" on iOS · 93cf82f0
      Elias Naur authored
      The UserHomeDir test succeeds on the builder, but not when run
      manually where HOME is set to the host $HOME.
      
      Change-Id: I1db0f608b04b311b53cc0c8160a3778caaf542f6
      Reviewed-on: https://go-review.googlesource.com/c/141798
      Run-TryBot: Elias Naur <elias.naur@gmail.com>
      Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
      93cf82f0
  5. 11 Oct, 2018 2 commits