1. 16 Aug, 2017 18 commits
    • cmd/compile: combine x*n + y*n into (x+y)*n · a0453a18
      Alberto Donizetti authored
      There are a few cases where this can be useful. Apart from the obvious
      (and silly)
      
        100*n + 200*n
      
      where we generate one IMUL instead of two, consider:
      
        15*n + 31*n
      
      Currently, the compiler strength-reduces both imuls, generating:
      
          0x0000 00000	MOVQ	"".n+8(SP), AX
      	0x0005 00005 	MOVQ	AX, CX
      	0x0008 00008 	SHLQ	$4, AX
      	0x000c 00012 	SUBQ	CX, AX
      	0x000f 00015 	MOVQ	CX, DX
      	0x0012 00018 	SHLQ	$5, CX
      	0x0016 00022 	SUBQ	DX, CX
      	0x0019 00025 	ADDQ	CX, AX
      	0x001c 00028 	MOVQ	AX, "".~r1+16(SP)
      	0x0021 00033 	RET
      
      But combining the imuls is both faster and shorter:
      
      	0x0000 00000	MOVQ	"".n+8(SP), AX
      	0x0005 00005 	IMULQ	$46, AX
      	0x0009 00009	MOVQ	AX, "".~r1+16(SP)
      	0x000e 00014 	RET
      
      even without strength-reduction.
      
      Moreover, consider:
      
        5*n + 7*(n+1) + 11*(n+2)
      
      We already have a rule that rewrites 7(n+1) into 7n+7, so the
      generated code (without imuls merging) looks like this:
      
      	0x0000 00000 	MOVQ	"".n+8(SP), AX
      	0x0005 00005 	LEAQ	(AX)(AX*4), CX
      	0x0009 00009 	MOVQ	AX, DX
      	0x000c 00012 	NEGQ	AX
      	0x000f 00015 	LEAQ	(AX)(DX*8), AX
      	0x0013 00019 	ADDQ	CX, AX
      	0x0016 00022 	LEAQ	(DX)(CX*2), CX
      	0x001a 00026 	LEAQ	29(AX)(CX*1), AX
      	0x001f 00031 	MOVQ	AX, "".~r1+16(SP)
      
      But with imuls merging, the 5n, 7n and 11n factors get merged, and the
      generated code looks like this:
      
      	0x0000 00000 	MOVQ	"".n+8(SP), AX
      	0x0005 00005 	IMULQ	$23, AX
      	0x0009 00009 	ADDQ	$29, AX
      	0x000d 00013 	MOVQ	AX, "".~r1+16(SP)
      	0x0012 00018 	RET
      
      Which is both faster and shorter; that's also the exact same code that
      clang and the intel c compiler generate for the above expression.
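
The rewrite rests on the distributive law, which also holds for Go's wrapping integer arithmetic. A minimal sketch that checks both expressions from the message above; the function names are hypothetical, and nothing here asserts which instructions a particular compiler version emits:

```go
package main

import "fmt"

// split15and31 is the unmerged form: two multiplications by constants.
func split15and31(n int64) int64 { return 15*n + 31*n }

// merged46 is the form the rewrite aims for: (15+31)*n.
func merged46(n int64) int64 { return 46 * n }

// polyExpanded is 5*n + 7*(n+1) + 11*(n+2) as written in the source.
func polyExpanded(n int64) int64 { return 5*n + 7*(n+1) + 11*(n+2) }

// polyMerged folds the three n terms (5+7+11) and the constants (7+22).
func polyMerged(n int64) int64 { return 23*n + 29 }

func main() {
	// 1<<62 makes the products wrap around; the identity still holds.
	for _, n := range []int64{-3, 0, 1, 1 << 62} {
		fmt.Println(n, split15and31(n) == merged46(n), polyExpanded(n) == polyMerged(n))
	}
}
```

To see what the compiler actually generates for these functions, the assembly listing from `go build -gcflags=-S` can be compared before and after the change.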
      
      Change-Id: Ib4d5503f05d2f2efe31a1be14e2fe6cac33730a9
      Reviewed-on: https://go-review.googlesource.com/55143
      Reviewed-by: Keith Randall <khr@golang.org>
    • cmd/link: fix bad dwarf for sudog<T> · e70fae8a
      Keith Randall authored
      The DWARF entries for type-specific sudog entries used the
      channel value type instead of a pointer-to-value type for the elem field.
      
      Fixes #21094
      
      R=go1.10
      
      Change-Id: I3f63a5664f42b571f729931309f2c9f6f38ab031
      Reviewed-on: https://go-review.googlesource.com/50170
      Reviewed-by: Ian Lance Taylor <iant@golang.org>
    • cmd/compile/internal/ssa: use sse to zero on amd64 · df709828
      Ilya Tocar authored
      Use 16-byte stores instead of 8-byte stores to zero small blocks.
      Also switch to duffzero only for blocks of 65+ bytes: each
      duffzero call also saves and restores BP, so a call requires 4 instructions,
      and replacing it with 4 SSE stores doesn't cause code bloat.
      Also switch duffzero to use LEAQ instead of ADDQ, to avoid clobbering flags.
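
For orientation, this is the kind of source-level code the change affects: zeroing a small fixed-size value. The sketch below is illustrative only; the type and function names are hypothetical, and which instructions are emitted depends on the compiler version and target architecture.

```go
package main

import "fmt"

// block32 is a hypothetical 32-byte value, one of the small sizes
// covered by the ClearFat benchmarks.
type block32 [32]byte

// clearBlock zeroes *p. An assignment like this is what the compiler
// may lower to a few wide stores instead of a duffzero call.
func clearBlock(p *block32) { *p = block32{} }

func main() {
	b := block32{1, 2, 3}
	clearBlock(&b)
	fmt.Println(b == block32{})
}
```

The generated stores for clearBlock can be inspected with `go build -gcflags=-S`.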
      
      ClearFat8-6     0.54ns ± 0%  0.54ns ± 0%     ~     (all equal)
      ClearFat12-6    1.07ns ± 0%  1.07ns ± 0%     ~     (all equal)
      ClearFat16-6    1.07ns ± 0%  0.69ns ± 0%  -35.51%  (p=0.001 n=8+9)
      ClearFat24-6    1.61ns ± 1%  1.07ns ± 0%  -33.33%  (p=0.000 n=10+10)
      ClearFat32-6    2.14ns ± 0%  1.07ns ± 0%  -50.00%  (p=0.001 n=8+9)
      ClearFat40-6    2.67ns ± 1%  1.61ns ± 0%  -39.72%  (p=0.000 n=10+8)
      ClearFat48-6    3.75ns ± 0%  2.68ns ± 0%  -28.59%  (p=0.000 n=9+9)
      ClearFat56-6    4.29ns ± 0%  3.22ns ± 0%  -25.10%  (p=0.000 n=9+9)
      ClearFat64-6    4.30ns ± 0%  3.22ns ± 0%  -25.15%  (p=0.000 n=8+8)
      ClearFat128-6   7.50ns ± 1%  7.51ns ± 0%     ~     (p=0.767 n=10+9)
      ClearFat256-6   13.9ns ± 1%  13.9ns ± 1%     ~     (p=0.257 n=10+10)
      ClearFat512-6   26.8ns ± 0%  26.8ns ± 0%     ~     (p=0.467 n=8+8)
      ClearFat1024-6  52.5ns ± 0%  52.5ns ± 0%     ~     (p=1.000 n=8+8)
      
      Also shaves ~20kb from go tool:
      
      go_old 10384994
      go_new 10364514 [-20480 bytes]
      
      section differences
      global text (code) = -20585 bytes (-0.532047%)
      read-only data = -302 bytes (-0.018101%)
      Total difference -20887 bytes (-0.348731%)
      
      Change-Id: I15854e87544545c1af24775df895e38e16e12694
      Reviewed-on: https://go-review.googlesource.com/54410
      Run-TryBot: Ilya Tocar <ilya.tocar@intel.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Keith Randall <khr@golang.org>
    • go/importer: make source importer more tolerant in presence of errors · b26ad605
      griesemer authored
      If the source importer only encounters "soft" type checking errors
      it can safely return the type-checked package because it will be
      completely set up. This makes the source importer slightly more
      robust in the presence of errors.
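
The distinction between soft and hard errors is visible through the Soft field of types.Error. The sketch below uses go/types directly rather than the source importer itself, and assumes an unused local variable is reported as a soft error; the source snippet and the checkSoft helper are made up for illustration.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"go/types"
)

// src declares a variable that is never used.
const src = `package p

func f() {
	x := 1
}
`

// checkSoft type-checks src and reports the resulting package, the number
// of errors seen, and whether every one of them was a soft error.
func checkSoft() (pkg *types.Package, nerr int, allSoft bool) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "p.go", src, 0)
	if err != nil {
		panic(err)
	}
	allSoft = true
	conf := types.Config{
		// With Error set, checking continues past the first error.
		Error: func(err error) {
			nerr++
			if terr, ok := err.(types.Error); !ok || !terr.Soft {
				allSoft = false
			}
		},
	}
	// Check returns the package even when it also returns an error.
	pkg, _ = conf.Check("p", fset, []*ast.File{file}, nil)
	return pkg, nerr, allSoft
}

func main() {
	pkg, nerr, allSoft := checkSoft()
	fmt.Println(pkg.Name(), nerr, allSoft, pkg.Scope().Lookup("f") != nil)
}
```

When allSoft is true, the package object is fully set up and a caller can reasonably keep using it, which is the situation the commit message describes.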
      
      Fixes #20855.
      
      Change-Id: I5af9ccdb30eee6bca7a0fab872f6057bde521bf3
      Reviewed-on: https://go-review.googlesource.com/55730
      Reviewed-by: Alan Donovan <adonovan@google.com>
    • reflect: remove useless parameter from newName · 9c9df65c
      Daniel Martí authored
      pkgPath always received the empty string. Worse yet, it panicked if it
      received anything else. This has been the case ever since newName was
      introduced in early 2016.
      
      Change-Id: I5f164305bd30c34455ef35e776c7616f303b37e4
      Reviewed-on: https://go-review.googlesource.com/54331
      Run-TryBot: Daniel Martí <mvdan@mvdan.cc>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: David Crawshaw <crawshaw@golang.org>
    • go/internal/gcimporter: fix typo: cmd/compiler → cmd/compile · 29186526
      Michael Stapelberg authored
      Change-Id: I087980d30308353c4a450636122f7e87c8310090
      Reviewed-on: https://go-review.googlesource.com/56090
      Reviewed-by: Ian Lance Taylor <iant@golang.org>
    • math/big: recognize z.Mul(x, x) as squaring of x · 25b040c2
      Brian Kessler authored
      updates #13745
      
      Multiprecision squaring can be done in a straightforward manner
      with about half the multiplications of a basic multiplication
      due to the symmetry of the operands.  This change implements
      basic squaring for nat types and uses it for Int multiplication
      when the same variable is supplied to both arguments of
      z.Mul(x, x). This has some overhead: a temporary variable is
      allocated to hold the cross products, which are shifted to
      double them and then added to the diagonal terms.  There is a
      speed benefit in the intermediate range where the overhead is
      negligible and the asymptotic performance of Karatsuba
      multiplication has not been reached.
      
      basicSqrThreshold = 20
      karatsubaSqrThreshold = 400
      
      Were set by running calibrate_test.go to measure timing differences
      between the algorithms.  Benchmarks for squaring:
      
      name           old time/op  new time/op  delta
      IntSqr/1-4     51.5ns ±25%  25.1ns ± 7%  -51.38%  (p=0.008 n=5+5)
      IntSqr/2-4     79.1ns ± 4%  72.4ns ± 2%   -8.47%  (p=0.008 n=5+5)
      IntSqr/3-4      102ns ± 4%    97ns ± 5%     ~     (p=0.056 n=5+5)
      IntSqr/5-4      161ns ± 4%   163ns ± 7%     ~     (p=0.952 n=5+5)
      IntSqr/8-4      277ns ± 5%   267ns ± 6%     ~     (p=0.087 n=5+5)
      IntSqr/10-4     358ns ± 3%   360ns ± 4%     ~     (p=0.730 n=5+5)
      IntSqr/20-4    1.07µs ± 3%  1.01µs ± 6%     ~     (p=0.056 n=5+5)
      IntSqr/30-4    2.36µs ± 4%  1.72µs ± 2%  -27.03%  (p=0.008 n=5+5)
      IntSqr/50-4    5.19µs ± 3%  3.88µs ± 4%  -25.37%  (p=0.008 n=5+5)
      IntSqr/80-4    11.3µs ± 4%   8.6µs ± 3%  -23.78%  (p=0.008 n=5+5)
      IntSqr/100-4   16.2µs ± 4%  12.8µs ± 3%  -21.49%  (p=0.008 n=5+5)
      IntSqr/200-4   50.1µs ± 5%  44.7µs ± 3%  -10.65%  (p=0.008 n=5+5)
      IntSqr/300-4    105µs ±11%    95µs ± 3%   -9.50%  (p=0.008 n=5+5)
      IntSqr/500-4    231µs ± 5%   227µs ± 2%     ~     (p=0.310 n=5+5)
      IntSqr/800-4    496µs ± 9%   459µs ± 3%   -7.40%  (p=0.016 n=5+5)
      IntSqr/1000-4   700µs ± 3%   710µs ± 5%     ~     (p=0.841 n=5+5)
      
      Show a speed up of 10-25% in the range where basicSqr is optimal,
      improved single word squaring and no significant difference when
      the fallback to standard multiplication is used.
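
The idea of computing each cross product once, doubling, and then adding the diagonal terms can be sketched as follows. This is a simplified illustration on 32-bit limbs, not the math/big implementation; sqrLimbs, toBig and matchesBig are hypothetical helpers, and the result is checked against big.Int.Mul.

```go
package main

import (
	"fmt"
	"math/big"
)

// sqrLimbs squares a little-endian base-2^32 number.
func sqrLimbs(x []uint32) []uint32 {
	n := len(x)
	z := make([]uint32, 2*n)

	// Cross products x[i]*x[j] for i < j, each computed only once.
	for i := 0; i < n; i++ {
		var carry uint64
		for j := i + 1; j < n; j++ {
			t := uint64(x[i])*uint64(x[j]) + uint64(z[i+j]) + carry
			z[i+j] = uint32(t)
			carry = t >> 32
		}
		z[i+n] = uint32(carry)
	}

	// Double the cross products with a one-bit left shift.
	var top uint32
	for k := range z {
		next := z[k] >> 31
		z[k] = z[k]<<1 | top
		top = next
	}

	// Add the diagonal terms x[i]*x[i] at limb position 2*i.
	var carry uint64
	for i := 0; i < n; i++ {
		sq := uint64(x[i]) * uint64(x[i])
		lo := uint64(z[2*i]) + (sq & 0xffffffff) + carry
		z[2*i] = uint32(lo)
		hi := uint64(z[2*i+1]) + (sq >> 32) + (lo >> 32)
		z[2*i+1] = uint32(hi)
		carry = hi >> 32
	}
	return z
}

// toBig converts little-endian limbs to a big.Int.
func toBig(limbs []uint32) *big.Int {
	v := new(big.Int)
	for i := len(limbs) - 1; i >= 0; i-- {
		v.Lsh(v, 32)
		v.Add(v, big.NewInt(int64(limbs[i])))
	}
	return v
}

// matchesBig reports whether sqrLimbs agrees with z.Mul(x, x).
func matchesBig(x []uint32) bool {
	bx := toBig(x)
	want := new(big.Int).Mul(bx, bx) // same operand twice
	return toBig(sqrLimbs(x)).Cmp(want) == 0
}

func main() {
	fmt.Println(matchesBig([]uint32{0xffffffff, 0xffffffff})) // (2^64 - 1)^2
	fmt.Println(matchesBig([]uint32{1, 2, 3}))
}
```

For n limbs the inner loop performs n(n-1)/2 multiplications plus n for the diagonal, roughly half of the n*n needed by schoolbook multiplication.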
      
      Change-Id: Iae2c82ca91cf890823f91e5c83bbe9a2c534b72b
      Reviewed-on: https://go-review.googlesource.com/53638
      Reviewed-by: Robert Griesemer <gri@golang.org>
      Run-TryBot: Robert Griesemer <gri@golang.org>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
    • cmd/go: make go tool suggest 'go doc cmd/<command>' · 259f78f0
      Alberto Donizetti authored
      $ gotip tool -h says:
      
        For more about each tool command, see 'go tool command -h'.
      
      But it's better to suggest
      
        go doc cmd/<command>
      
      Fixes #18313
      
      Change-Id: I0a36d585906a5e1879e5b7927d1b6173e97cb500
      Reviewed-on: https://go-review.googlesource.com/55990
      Reviewed-by: Ian Lance Taylor <iant@golang.org>
    • go/types: document that Signature.Recv() is ignored for type identity · f6f125dd
      griesemer authored
      Fixes #21367.
      
      Change-Id: I50704c5a613abcce57b340db8992c7bcb1cb728f
      Reviewed-on: https://go-review.googlesource.com/55710
      Reviewed-by: Alan Donovan <adonovan@google.com>
    • math/big: speed up GCD x, y calculation · 53836a74
      Brian Kessler authored
      The current implementation of the extended Euclidean GCD algorithm
      calculates both cosequences x and y inside the division loop. This
      is unnecessary since the second Bezout coefficient can be obtained
      at the end of the calculation via a multiplication, a subtraction and
      a division.  If only one coefficient is needed, e.g. in ModInverse,
      this calculation can be skipped entirely.  This is a standard
      optimization, see e.g.
      
      "Handbook of Elliptic and Hyperelliptic Curve Cryptography"
      Cohen et al pp 191
      Available at:
      http://cs.ucsb.edu/~koc/ccs130h/2013/EllipticHyperelliptic-CohenFrey.pdf
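
A minimal sketch of the optimization on machine integers: only the x cosequence is tracked in the loop, and y is recovered afterwards as (g - a*x)/b. It assumes positive inputs small enough not to overflow int64; gcdX and bezout are hypothetical names, and the real code operates on big.Int.

```go
package main

import "fmt"

// gcdX returns g = gcd(a, b) and a coefficient x with a*x ≡ g (mod b).
// Only one cosequence is updated inside the division loop.
func gcdX(a, b int64) (g, x int64) {
	r0, r1 := a, b
	x0, x1 := int64(1), int64(0)
	for r1 != 0 {
		q := r0 / r1
		r0, r1 = r1, r0-q*r1
		x0, x1 = x1, x0-q*x1
	}
	return r0, x0
}

// bezout recovers the second coefficient after the loop with one
// multiplication, one subtraction and one (exact) division.
func bezout(a, b int64) (g, x, y int64) {
	g, x = gcdX(a, b)
	y = (g - a*x) / b
	return g, x, y
}

func main() {
	g, x, y := bezout(240, 46)
	fmt.Println(g, x, y, 240*x+46*y == g)

	// A modular inverse needs only x, so y is never computed.
	_, inv := gcdX(17, 3120)
	fmt.Println(((inv % 3120) + 3120) % 3120)
}
```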
      
      Updates #15833
      
      Change-Id: I1e0d2e63567cfed97fd955048fe6373d36f22757
      Reviewed-on: https://go-review.googlesource.com/50530
      Reviewed-by: Robert Griesemer <gri@golang.org>
    • math: eliminate overflow in Pow(x,y) for large y · 12465661
      Brian Kessler authored
      The current implementation uses a shift and add
      loop to compute the product of x's exponent xe and
      the integer part of y (yi) for yi up to 1<<63.
      Since xe is an 11-bit exponent, this product can be
      up to 74-bits and overflow both 32 and 64-bit int.
      
      This change checks whether the accumulated exponent
      will fit in the 11-bit float exponent of the output
      and breaks out of the loop early if overflow is detected.
      
      The current handling of yi >= 1<<63 uses Exp(y * Log(x)),
      which incorrectly returns NaN for x < 0.  In addition,
      for y this large, Exp(y * Log(x)) can only overflow
      or underflow, except when x == -1, since the
      boundary cases, computed exactly,
      
      Pow(NextAfter(1.0, Inf(1)), 1<<63)  == 2.72332... * 10^889
      Pow(NextAfter(1.0, Inf(-1)), 1<<63) == 1.91624... * 10^-445
      
      exceed the range of float64. So, the call can be
      replaced with a simple case statement analogous to
      y == Inf that correctly handles x < 0 as well.
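
The cases described above can be exercised through the public math.Pow API. The sketch assumes a Go release that includes this change; powLarge and the two predicate helpers are hypothetical names introduced only for the example.

```go
package main

import (
	"fmt"
	"math"
)

// powLarge raises x to 1<<63, which is exactly representable as a float64.
func powLarge(x float64) float64 { return math.Pow(x, 1<<63) }

// overflowsToInf reports whether powLarge(x) is +Inf.
func overflowsToInf(x float64) bool { return math.IsInf(powLarge(x), 1) }

// justAbove and justBelow are the float64 neighbours of 1.0, the
// boundary cases named in the commit message.
func justAbove() float64 { return math.Nextafter(1, math.Inf(1)) }
func justBelow() float64 { return math.Nextafter(1, math.Inf(-1)) }

func main() {
	fmt.Println(powLarge(-1))                // x == -1 is the one finite, nonzero case
	fmt.Println(overflowsToInf(-2))          // negative base: overflow, not NaN
	fmt.Println(powLarge(0.5))               // |x| < 1 underflows
	fmt.Println(overflowsToInf(justAbove())) // smallest base above 1 already overflows
	fmt.Println(powLarge(justBelow()))       // largest base below 1 already underflows
}
```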
      
      Fixes #7394
      
      Change-Id: I6f50dc951f3693697f9669697599860604323102
      Reviewed-on: https://go-review.googlesource.com/48290
      Reviewed-by: Robert Griesemer <gri@golang.org>
    • cmd/link: delete shNames · a9257b6b
      Alex Brainman authored
      Change-Id: Ie5d12ba4105fec17551637d066d0dffd508f74a4
      Reviewed-on: https://go-review.googlesource.com/55261
      Run-TryBot: Alex Brainman <alex.brainman@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Ian Lance Taylor <iant@golang.org>
    • cmd/link: delete addpesection · 6aa38668
      Alex Brainman authored
      Change-Id: Iee9db172d28d4d372fa617907078a494e764bf12
      Reviewed-on: https://go-review.googlesource.com/55260
      Reviewed-by: Ian Lance Taylor <iant@golang.org>
    • cmd/link: use peSection everywhere · babc5b1d
      Alex Brainman authored
      Change-Id: I4d4e8452b9b9e628f3ea8b2b727ad63ec2a1dd31
      Reviewed-on: https://go-review.googlesource.com/55259
      Reviewed-by: Ian Lance Taylor <iant@golang.org>
    • cmd/link: add peSection · 2c2b1723
      Alex Brainman authored
      Change-Id: Id3aeeaeaacf5f079fb2ddad579f2f209b7fc0e06
      Reviewed-on: https://go-review.googlesource.com/55258
      Reviewed-by: Ian Lance Taylor <iant@golang.org>
    • cmd/link: introduce and use peFile and peStringTable · 20832e6d
      Alex Brainman authored
      Change-Id: Icd13b32d35cde474c9292227471f916a64af88eb
      Reviewed-on: https://go-review.googlesource.com/55257
      Reviewed-by: Ian Lance Taylor <iant@golang.org>
    • archive/tar: make Writer error handling consistent · b9a79f32
      Joe Tsai authored
      The Writer logic was not consistent about when an IO error would
      persist across multiple calls on Writer's methods.
      
      Thus, to make the error handling more consistent we always check
      the persistent state of the error prior to every exported method
      call, and return an error if set. Otherwise, it is the responsibility
      of every exported method to persist any fatal errors that may occur.
      
      As a simplification, we can remove the close field since that
      information can be represented by simply storing ErrWriteAfterClose
      in the err field.
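
The pattern described above can be sketched with a small stand-alone type. stickyWriter, errWriteAfterClose, failWriter and okWriter are hypothetical names for illustration; this is not the archive/tar Writer.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// errWriteAfterClose stands in for tar.ErrWriteAfterClose in this sketch.
var errWriteAfterClose = errors.New("write after close")

// errBroken simulates a fatal IO error from the underlying writer.
var errBroken = errors.New("broken pipe")

// stickyWriter keeps the first fatal error and returns it from then on.
type stickyWriter struct {
	w   io.Writer
	err error // persistent state; also encodes "closed"
}

// Write checks the persistent error first, then persists any new one.
func (sw *stickyWriter) Write(p []byte) (int, error) {
	if sw.err != nil {
		return 0, sw.err
	}
	n, err := sw.w.Write(p)
	if err != nil {
		sw.err = err
	}
	return n, err
}

// Close records the closed state in err, so no separate flag is needed.
func (sw *stickyWriter) Close() error {
	if sw.err == errWriteAfterClose {
		return nil // already closed
	}
	if sw.err != nil {
		return sw.err
	}
	sw.err = errWriteAfterClose
	return nil
}

// failWriter always fails; okWriter always succeeds.
type failWriter struct{}

func (failWriter) Write([]byte) (int, error) { return 0, errBroken }

type okWriter struct{}

func (okWriter) Write(p []byte) (int, error) { return len(p), nil }

func main() {
	bad := &stickyWriter{w: failWriter{}}
	_, err1 := bad.Write([]byte("a"))
	_, err2 := bad.Write([]byte("b")) // underlying writer is not called again
	fmt.Println(err1 == errBroken, err2 == errBroken)

	good := &stickyWriter{w: okWriter{}}
	good.Close()
	_, err3 := good.Write([]byte("c"))
	fmt.Println(err3 == errWriteAfterClose)
}
```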
      
      Change-Id: I8746ca36b3739803e0373253450db69b3bd12f38
      Reviewed-on: https://go-review.googlesource.com/55590
      Run-TryBot: Joe Tsai <joetsai@digital-static.net>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
      Reviewed-by: Ian Lance Taylor <iant@golang.org>
    • archive/tar: add support for long binary strings in GNU format · 5c20ffbb
      Joe Tsai authored
      The GNU tar format defines the following type flags:
      	TypeGNULongName = 'L' // Next file has a long name
      	TypeGNULongLink = 'K' // Next file symlinks to a file w/ a long name
      
      Anytime a string exceeds the field dedicated to store it, the GNU format
      permits a fake "file" to be prepended where that file entry has a Typeflag
      of 'L' or 'K' and the contents of the file is a NUL-terminated string.
      
      Contrary to previous TODO comments,
      the GNU format supports arbitrary strings (without NUL) rather than UTF-8 strings.
      The manual says the following:
      <<<
      The name, linkname, magic, uname, and gname are
      null-terminated character strings
      >>>
      <<<
      All characters in header blocks are represented
      by using 8-bit characters in the local variant of ASCII.
      >>>
      
      
      From this description, we gather the following:
      * We must forbid NULs in any GNU strings
      * Any 8-bit value (other than NUL) is permitted
      
      Since the modern world has moved to UTF-8, it is really difficult to
      determine what a "local variant of ASCII" means. For this reason,
      we treat strings as just an arbitrary binary string (without NUL)
      and leave it to the user to determine the encoding of this string.
      (Practically, it seems that UTF-8 is the typical encoding used
      in GNU archives seen in the wild).
      
      The implementation of GNU tar seems to confirm this interpretation
      of the manual where it permits any arbitrary binary string to exist
      within these fields so long as they do not contain the NUL character.
      
       $ touch `echo -e "not\x80\x81\x82\x83utf8"`
       $ gnutar -H gnu --tar -cvf gnu-not-utf8.tar $(echo -e "not\x80\x81\x82\x83utf8")
      
      The fact that we permit arbitrary binary in GNU strings goes
      hand-in-hand with the fact that GNU also permits a "base-256" encoding
      of numeric fields, which is effectively two's-complement binary.
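
A round trip through archive/tar illustrates the behaviour. The sketch assumes a Go release where Header has the Format field and tar.FormatGNU is defined (Go 1.10 or later); roundTrip is a hypothetical helper, and the name is deliberately both longer than the 100-byte name field and not valid UTF-8.

```go
package main

import (
	"archive/tar"
	"bytes"
	"fmt"
	"strings"
)

// roundTrip writes one empty GNU-format entry called name into an
// in-memory archive and returns the name the reader reports for it.
func roundTrip(name string) (string, error) {
	var buf bytes.Buffer
	tw := tar.NewWriter(&buf)
	hdr := &tar.Header{
		Typeflag: tar.TypeReg,
		Name:     name,
		Mode:     0644,
		Format:   tar.FormatGNU, // request the GNU format explicitly
	}
	if err := tw.WriteHeader(hdr); err != nil {
		return "", err
	}
	if err := tw.Close(); err != nil {
		return "", err
	}
	got, err := tar.NewReader(&buf).Next()
	if err != nil {
		return "", err
	}
	return got.Name, nil
}

func main() {
	// Over 100 bytes, so the writer must emit a TypeGNULongName entry;
	// the \x80..\x83 bytes make the string invalid as UTF-8.
	name := strings.Repeat("d/", 60) + "not\x80\x81\x82\x83utf8"
	got, err := roundTrip(name)
	fmt.Println(got == name, err)
}
```

A name containing a NUL byte is the case the format cannot represent, so it is expected to be rejected rather than truncated.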
      
      Change-Id: Ic037ec6bed306d07d1312f0058594bd9b64d9880
      Reviewed-on: https://go-review.googlesource.com/55573
      Reviewed-by: Ian Lance Taylor <iant@golang.org>
      Run-TryBot: Joe Tsai <thebrokentoaster@gmail.com>
      TryBot-Result: Gobot Gobot <gobot@golang.org>
  2. 15 Aug, 2017 22 commits