- 29 May, 2014 2 commits
Russ Cox authored
[Same as CL 102820043 except applied changes to 6g/gsubr.c also to 5g/gsubr.c and 8g/gsubr.c. The problem I had last night trying to do that was that 8g's copy of nodarg has different (but equivalent) control flow and I was pasting the new code into the wrong place.]

Description from CL 102820043:

The 'nodarg' function is used to obtain a Node* representing a function argument or result. It returned a brand new Node*, but that violates the guarantee in most places in the compiler that two Node*s refer to the same variable if and only if they are the same Node* pointer. Reestablish that invariant by making nodarg return a preexisting named variable if present.

Having fixed that, avoid any copy during x=x in componentgen, because the VARDEF we emit before the copy marks the lhs x as dead incorrectly.

The change in walk.c avoids modifying the result of nodarg. This was the only place in the compiler that did so.

Fixes #8097.

LGTM=khr
R=golang-codereviews, khr
CC=golang-codereviews, iant, khr, r
https://golang.org/cl/103750043
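For illustration, the self-assignment shape at issue might look like this in Go (a hypothetical sketch, not a test case from the CL): x is a multiword value, so componentgen copies it piecewise, and the VARDEF emitted before the copy wrongly marked the lhs dead.

    // hypothetical example of the x=x pattern
    func id(x []int) []int {
        x = x // self-assignment of a multiword (slice) value
        return x
    }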
Russ Cox authored
Breaks 386 and arm builds.

The obvious reason is that this CL only edited 6g/gsubr.c and failed to edit 5g/gsubr.c and 8g/gsubr.c. However, the obvious CL applying the same edit to those files (CL 101900043) causes mysterious build failures in various of the standard package tests, usually involving reflect. Something deep and subtle is broken but only on the 32-bit systems. Undo this CL for now.

««« original CL description
cmd/gc: fix x=x crash

The 'nodarg' function is used to obtain a Node* representing a function argument or result. It returned a brand new Node*, but that violates the guarantee in most places in the compiler that two Node*s refer to the same variable if and only if they are the same Node* pointer. Reestablish that invariant by making nodarg return a preexisting named variable if present.

Having fixed that, avoid any copy during x=x in componentgen, because the VARDEF we emit before the copy marks the lhs x as dead incorrectly.

The change in walk.c avoids modifying the result of nodarg. This was the only place in the compiler that did so.

Fixes #8097.

LGTM=r, khr
R=golang-codereviews, r, khr
CC=golang-codereviews, iant
https://golang.org/cl/102820043
»»»

TBR=r
CC=golang-codereviews, khr
https://golang.org/cl/95660043
- 28 May, 2014 1 commit
Russ Cox authored
The 'nodarg' function is used to obtain a Node* representing a function argument or result. It returned a brand new Node*, but that violates the guarantee in most places in the compiler that two Node*s refer to the same variable if and only if they are the same Node* pointer. Reestablish that invariant by making nodarg return a preexisting named variable if present.

Having fixed that, avoid any copy during x=x in componentgen, because the VARDEF we emit before the copy marks the lhs x as dead incorrectly.

The change in walk.c avoids modifying the result of nodarg. This was the only place in the compiler that did so.

Fixes #8097.

LGTM=r, khr
R=golang-codereviews, r, khr
CC=golang-codereviews, iant
https://golang.org/cl/102820043
- 12 May, 2014 1 commit
Josh Bleecher Snyder authored
This is joint work with Daniel Morsing.

In order for the register allocator to alias two variables, they must have the same width, stack offset, and etype. Code generation was altering a variable's etype in a few places. This prevented the variable from being moved to a register, which in turn prevented peephole optimization. This failure to alias was very common, with almost 23,000 instances just running make.bash.

This phenomenon was not visible in the register allocation debug output because the variables that failed to alias had the same name. The debugging-only change to bits.c fixes this by printing the variable number with its name.

This CL fixes the source of all etype mismatches for 6g, all but one case for 8g, and depressingly few cases for 5g. (I believe that extending CL 6819083 to 5g is a prerequisite.) Fixing the remaining cases in 8g and 5g is work for the future.

The etype mismatch fixes are:

* [gc] Slicing changed the type of the base pointer into a uintptr in order to perform arithmetic on it. Instead, support addition directly on pointers.
* [*g] OSPTR was giving type uintptr to slice base pointers; undo that. This arose, for example, while compiling copy(dst, src).
* [8g] 64 bit float conversion was assigning int64 type during codegen, overwriting the existing uint64 type.

Note that some etype mismatches are appropriate, such as a struct with a single field or an array with a single element.

With these fixes, the number of registerizations that occur while running make.bash for 6g increases ~10%. Hello world binary size shrinks ~1.5%. Running all benchmarks in the standard library shows performance improvements ranging from nominal to substantive (>10%); a full comparison using 6g on my laptop is available at https://gist.github.com/josharian/8f9b5beb46667c272064. The microbenchmarks must be taken with a grain of salt; see issue 7920. The few benchmarks that show real regressions are likely due to issue 7920; I manually examined the generated code for the top few regressions and none had any assembly output changes. The few benchmarks that show extraordinary improvements are likely also due to issue 7920.

Performance results from 8g appear similar to 6g. 5g shows no performance improvements. This is not surprising, given the discussion above.

Update #7316

LGTM=rsc
R=rsc, daniel.morsing, bradfitz
CC=dave, golang-codereviews
https://golang.org/cl/91850043
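As a rough sketch of the source constructs involved (assumed for illustration, not taken from the CL), the slicing and OSPTR fixes affect code like:

    func touch(buf, dst, src []byte, i, j int) int {
        p := buf[i:j] // slicing: base-pointer arithmetic, formerly routed through uintptr
        _ = p
        return copy(dst, src) // copy: OSPTR yields the slice base pointers
    }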
- 04 Apr, 2014 1 commit
Rémy Oudompheng authored
Native Client forbids jumps/calls to arbitrary locations and enforces a particular alignment, which makes the Duff's device ineffective.

LGTM=khr
R=rsc, dave, khr
CC=golang-codereviews
https://golang.org/cl/84400043
- 01 Apr, 2014 1 commit
Keith Randall authored
REP MOVSQ and REP STOSQ have a really high startup overhead. Use a Duff's device to do the repetition instead.

benchmark              old ns/op  new ns/op  delta
BenchmarkClearFat32    7.20       1.60       -77.78%
BenchmarkCopyFat32     6.88       2.38       -65.41%
BenchmarkClearFat64    7.15       3.20       -55.24%
BenchmarkCopyFat64     6.88       3.44       -50.00%
BenchmarkClearFat128   9.53       5.34       -43.97%
BenchmarkCopyFat128    9.27       5.56       -40.02%
BenchmarkClearFat256   13.8       9.53       -30.94%
BenchmarkCopyFat256    13.5       10.3       -23.70%
BenchmarkClearFat512   22.3       18.0       -19.28%
BenchmarkCopyFat512    22.0       19.7       -10.45%
BenchmarkCopyFat1024   36.5       38.4       +5.21%
BenchmarkClearFat1024  35.1       35.0       -0.28%

TODO: use for stack frame zeroing
TODO: REP prefixes are still used for "reverse" copying when src/dst regions overlap. Might be worth fixing.

LGTM=rsc
R=golang-codereviews, rsc
CC=golang-codereviews, r
https://golang.org/cl/81370046
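Go has no computed goto, so a literal Duff's device cannot be written in the source language (the real duffzero/duffcopy routines are assembly), but the idea, entering an unrolled loop at an offset chosen by the remaining count, can be sketched with a fallthrough switch (hypothetical illustration):

    // unrolled clear with Duff-style remainder dispatch
    func clearFat(p []uint64) {
        i := 0
        for ; i+4 <= len(p); i += 4 { // unrolled body
            p[i], p[i+1], p[i+2], p[i+3] = 0, 0, 0, 0
        }
        switch len(p) - i { // "jump into the middle" of the tail
        case 3:
            p[i+2] = 0
            fallthrough
        case 2:
            p[i+1] = 0
            fallthrough
        case 1:
            p[i] = 0
        }
    }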
- 20 Mar, 2014 1 commit
Rémy Oudompheng authored
Revision 3ae4607a43ff introduced CONVNOP layers to fix type checking issues arising from comparisons. The added complexity made 8g run out of registers when compiling an equality function in go.net/ipv6. A similar issue occurred in test/sizeof.go on amd64p32 with 6g.

Fixes #7405.

LGTM=khr
R=rsc, dave, iant, khr
CC=golang-codereviews
https://golang.org/cl/78100044
- 26 Feb, 2014 1 commit
Ian Lance Taylor authored
The gvardef function does nothing if n->class == PEXTERN, so we don't need to test for that before calling it. This makes the 6g/8g code more like the 5g code and clarifies that the cases that do not test for n->class != PEXTERN are not buggy.

LGTM=rsc
R=rsc
CC=golang-codereviews
https://golang.org/cl/68900044
- 15 Feb, 2014 1 commit
Russ Cox authored
The VARDEF placement must be before the initialization but after any final use. If you have something like

    s = ... using s ...

the rhs must be evaluated, then the VARDEF, then the lhs assigned. There is a large comment in pgen.c on gvardef explaining this in more detail.

This CL also includes Ian's suggestions from earlier CLs, namely commenting the use of mode in link.h and fixing the precedence of the ~r check in dcl.c.

This CL enables the check that if liveness analysis decides a variable is live on entry to the function, that variable must be a function parameter (not a result, and not a local variable). If this check fails, it indicates a bug in the liveness analysis or in the generated code being analyzed.

The race detector generates invalid code for append(x, y...). The code declares a temporary t and then uses cap(t) before initializing t. The new liveness check catches this bug and stops the compiler from writing out the buggy code. Consequently, this CL disables the race detector tests in run.bash until the race detector bug can be fixed (golang.org/issue/7334).

Except for the race detector bug, the liveness analysis check does not detect any problems (this CL and the previous CLs fixed all the detected problems).

The net test still fails with GOGC=0 but the rest of the tests now pass or time out (because GOGC=0 is so slow).

TBR=iant
CC=golang-codereviews
https://golang.org/cl/64170043
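A hypothetical sketch of that ordering in Go terms (names invented for illustration):

    func grow(s []int) []int {
        n := len(s) + 1 // 1) evaluate the rhs while the old s is still live
        // 2) the compiler emits VARDEF s here, opening s's new live range
        s = make([]int, n) // 3) only then is the lhs assigned
        return s
    }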
- 14 Feb, 2014 1 commit
Russ Cox authored
Any initialization of a variable by a block copy or block zeroing or by multiple assignments (componentwise copying or zeroing of a multiword variable) needs to emit a VARDEF. These cases were not.

Fixes #7205.

TBR=iant
CC=golang-codereviews
https://golang.org/cl/63650044
- 17 Sep, 2013 1 commit
Russ Cox authored
This eliminates ~75% of the nil checks being emitted, on all architectures. We can do better, but we need a bit more general support from the compiler, and I don't want to do that so close to Go 1.2. What's here is simple but effective and safe.

A few small code generation cleanups were required to make the analysis consistent on all systems about which nil checks are omitted, at least in the test.

Fixes #6019.

R=ken2
CC=golang-dev
https://golang.org/cl/13334052
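For a flavor of the redundancy being removed (an assumed example, not from the CL): once one dereference of a pointer has succeeded, later dereferences of the same pointer need no further check.

    type pair struct{ a, b int }

    func sum(p *pair) int {
        x := p.a       // nil check emitted here
        return x + p.b // check omitted: p is already known non-nil
    }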
- 11 Sep, 2013 1 commit
Rémy Oudompheng authored
A new node type OSPTR is added to refer to the data pointer of strings and slices in a simple way during walk(). It will be useful for future work on simplification of slice arithmetic.

benchmark                old ns/op  new ns/op  delta
BenchmarkCopy1Byte       9          8          -13.98%
BenchmarkCopy2Byte       14         8          -40.49%
BenchmarkCopy4Byte       13         8          -35.04%
BenchmarkCopy8Byte       13         8          -37.10%
BenchmarkCopy12Byte      14         12         -15.38%
BenchmarkCopy16Byte      14         12         -17.24%
BenchmarkCopy32Byte      19         14         -27.32%
BenchmarkCopy128Byte     31         26         -15.29%
BenchmarkCopy1024Byte    100        92         -7.50%
BenchmarkCopy1String     10         7          -28.99%
BenchmarkCopy2String     10         7          -28.06%
BenchmarkCopy4String     10         8          -22.69%
BenchmarkCopy8String     10         8          -23.30%
BenchmarkCopy12String    11         11         -5.88%
BenchmarkCopy16String    11         11         -5.08%
BenchmarkCopy32String    15         14         -6.58%
BenchmarkCopy128String   28         25         -10.60%
BenchmarkCopy1024String  95         95         +0.53%

R=golang-dev, bradfitz, cshapiro, dave, daniel.morsing, rsc, khr, khr
CC=golang-dev
https://golang.org/cl/9101048
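What OSPTR denotes can be approximated from user code with unsafe (illustration only; the node has no Go-level spelling):

    import (
        "reflect"
        "unsafe"
    )

    // dataPtr returns the data pointer of the slice header,
    // which is what an OSPTR node evaluates to for a slice.
    func dataPtr(s []byte) unsafe.Pointer {
        h := (*reflect.SliceHeader)(unsafe.Pointer(&s))
        return unsafe.Pointer(h.Data)
    }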
- 15 Aug, 2013 1 commit
Russ Cox authored
See golang.org/s/go12nil.

This CL is about getting all the right checks inserted. A followup CL will add an optimization pass to remove redundant checks.

R=ken2
CC=golang-dev
https://golang.org/cl/12970043
- 02 Jul, 2013 1 commit
Russ Cox authored
Design doc at golang.org/s/go12slice.

This is an experimental feature and may not be included in the release.

R=golang-dev, r
CC=golang-dev
https://golang.org/cl/10743046
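The log omits the commit title, but golang.org/s/go12slice is the design doc for the three-index slice expression x[i:j:k], which bounds the capacity of the result as well as its length; a minimal sketch:

    func threeIndex() {
        s := make([]byte, 10)
        t := s[2:5:7] // len(t) == 3, cap(t) == 5: the third index caps the result
        _ = t
    }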
- 09 Jun, 2013 1 commit
Shenghou Ma authored
R=golang-dev, bradfitz, khr, r
CC=golang-dev
https://golang.org/cl/7461046
- 30 Apr, 2013 1 commit
Rob Pike authored
Some 64-bit fields were run through 32-bit words, some counts were not checked for overflow, and relocations must fit in 32 bits.

Tests to follow.

R=golang-dev, dsymonds
CC=golang-dev
https://golang.org/cl/9033043
- 24 Apr, 2013 1 commit
Ian Lance Taylor authored
R=r, ken, khr, daniel.morsing
CC=dsymonds, golang-dev, rickyz
https://golang.org/cl/8925043
- 07 Mar, 2013 1 commit
Rémy Oudompheng authored
The code would violate the contract of cmp64.

Fixes #5002.

R=rsc, golang-dev
CC=golang-dev
https://golang.org/cl/7593043
- 02 Jan, 2013 1 commit
Rémy Oudompheng authored
A new environment variable GO386 is introduced to choose between code generation targeting 387 or SSE2. No auto-detection is performed and the setting defaults to 387 to preserve previous behaviour.

The patch is a reorganization of CL 6549052 by rsc.

Fixes #3912.

R=minux.ma, rsc
CC=golang-dev
https://golang.org/cl/6962043
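A hypothetical usage sketch (assuming 8g consults the GO386 environment variable when invoked through go build):

    % GO386=387 go build std   # x87 floating-point code (the default)
    % GO386=sse2 go build std  # SSE2 floating-point code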
- 21 Dec, 2012 1 commit
Rémy Oudompheng authored
Also restore the smallintconst case for binary ops.

Fixes #3835.

R=daniel.morsing, rsc
CC=golang-dev
https://golang.org/cl/6999043
- 27 Nov, 2012 1 commit
Rémy Oudompheng authored
Fixes #4448.

R=golang-dev, rsc
CC=golang-dev
https://golang.org/cl/6855100
- 26 Nov, 2012 1 commit
Rémy Oudompheng authored
This allows 5g and 8g to benefit from the rewrite as shifts or magic multiplies. The 64-bit arithmetic is not handled there, and left in 6g.

Update #2230.

R=golang-dev, dave, mtj, iant, rsc
CC=golang-dev
https://golang.org/cl/6819123
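A worked example of the magic-multiply form of the rewrite (the constant is the standard one from Hacker's Delight, not taken from this CL): unsigned division by 10 becomes a widening multiply and a shift.

    // 0xCCCCCCCD is ceil(2^35 / 10), so for any uint32 n the
    // 64-bit product shifted right by 35 is exactly n / 10.
    func div10(n uint32) uint32 {
        return uint32((uint64(n) * 0xCCCCCCCD) >> 35)
    }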
- 21 Nov, 2012 1 commit
Rémy Oudompheng authored
Fixes #4399.

R=golang-dev, nigeltao
CC=golang-dev
https://golang.org/cl/6845053
- 01 Nov, 2012 1 commit
Rémy Oudompheng authored
The move to 64-bit ints in 6g made componentgen ineffective. In componentgen, the code already selects which values it can handle.

On amd64:

benchmark               old ns/op   new ns/op   delta
BenchmarkBinaryTree17   9477970000  9582314000  +1.10%
BenchmarkFannkuch11     5928750000  5255080000  -11.36%
BenchmarkGobDecode      37103040    31451120    -15.23%
BenchmarkGobEncode      16042490    16844730    +5.00%
BenchmarkGzip           811337400   741373600   -8.62%
BenchmarkGunzip         197928700   192844500   -2.57%
BenchmarkJSONEncode     224164100   140064200   -37.52%
BenchmarkJSONDecode     258346800   231829000   -10.26%
BenchmarkMandelbrot200  7561780     7601615     +0.53%
BenchmarkParse          12970340    11624360    -10.38%
BenchmarkRevcomp        1969917000  1699137000  -13.75%
BenchmarkTemplate       296182000   263117400   -11.16%

R=nigeltao, dave, daniel.morsing
CC=golang-dev
https://golang.org/cl/6821052
- 16 Oct, 2012 1 commit
Rémy Oudompheng authored
This patch is enough to fix compilation of exp/types tests but only passes a stripped down version of the appropriate torture test.

Update #4207.

R=dave, nigeltao, rsc, golang-dev
CC=golang-dev
https://golang.org/cl/6621061
- 02 Oct, 2012 1 commit
Rémy Oudompheng authored
A similar change was made in 6g recently.

LEALs in cmd/go: 31440 before, 27867 after.

benchmark               old ns/op   new ns/op   delta
BenchmarkBinaryTree17   7065794000  6723617000  -4.84%
BenchmarkFannkuch11     7767395000  7477945000  -3.73%
BenchmarkGobDecode      34708140    34857820    +0.43%
BenchmarkGobEncode      10998780    10960060    -0.35%
BenchmarkGzip           1603630000  1471052000  -8.27%
BenchmarkGunzip         242573900   240650400   -0.79%
BenchmarkJSONEncode     120842200   117966100   -2.38%
BenchmarkJSONDecode     247254900   249103100   +0.75%
BenchmarkMandelbrot200  29237330    29241790    +0.02%
BenchmarkParse          8111320     8096865     -0.18%
BenchmarkRevcomp        2595780000  2694153000  +3.79%
BenchmarkTemplate       276679600   264497000   -4.40%

benchmark                            old ns/op  new ns/op  delta
BenchmarkAppendFloatDecimal          429        416        -3.03%
BenchmarkAppendFloat                 780        740        -5.13%
BenchmarkAppendFloatExp              746        700        -6.17%
BenchmarkAppendFloatNegExp           752        694        -7.71%
BenchmarkAppendFloatBig              1228       1108       -9.77%
BenchmarkAppendFloat32Integer        457        416        -8.97%
BenchmarkAppendFloat32ExactFraction  662        631        -4.68%
BenchmarkAppendFloat32Point          771        735        -4.67%
BenchmarkAppendFloat32Exp            722        672        -6.93%
BenchmarkAppendFloat32NegExp         724        659        -8.98%
BenchmarkAppendFloat64Fixed1         429        400        -6.76%
BenchmarkAppendFloat64Fixed2         463        442        -4.54%

Update #1914.

R=golang-dev, daniel.morsing, rsc
CC=golang-dev
https://golang.org/cl/6574043
- 26 Sep, 2012 1 commit
Rémy Oudompheng authored
In two cases, registers were allocated too early, resulting in exhaustion of the available registers when nesting these operations.

The case of method calls was due to missing cases in igen, which only makes calls but doesn't allocate a register for the result.

The case of 8-bit multiplication was due to a wrong order in register allocation when Ullman numbers were bigger on the RHS.

Fixes #3907.
Fixes #4156.

R=rsc
CC=golang-dev, remy
https://golang.org/cl/6560054
- 24 Sep, 2012 2 commits
Rémy Oudompheng authored
Apart from reducing the number of LEAL/LEAQ instructions by about 30%, it gives 8g easier registerization in several cases, for example in strconv. Performance with 6g is not affected.

Before (386):
src/pkg/strconv/decimal.go:22    TEXT  (*decimal).String+0(SB),$240-12
src/pkg/strconv/extfloat.go:540  TEXT  (*extFloat).ShortestDecimal+0(SB),$584-20

After (386):
src/pkg/strconv/decimal.go:22    TEXT  (*decimal).String+0(SB),$196-12
src/pkg/strconv/extfloat.go:540  TEXT  (*extFloat).ShortestDecimal+0(SB),$420-20

Benchmarks with GOARCH=386 (on a Core 2).

benchmark               old ns/op   new ns/op   delta
BenchmarkBinaryTree17   7110191000  7079644000  -0.43%
BenchmarkFannkuch11     7769274000  7766514000  -0.04%
BenchmarkGobDecode      33454820    34755400    +3.89%
BenchmarkGobEncode      11675710    11007050    -5.73%
BenchmarkGzip           2013519000  1593855000  -20.84%
BenchmarkGunzip         253368200   242667600   -4.22%
BenchmarkJSONEncode     152443900   120763400   -20.78%
BenchmarkJSONDecode     304112800   247461800   -18.63%
BenchmarkMandelbrot200  29245520    29240490    -0.02%
BenchmarkParse          8484105     8088660     -4.66%
BenchmarkRevcomp        2695688000  2841263000  +5.40%
BenchmarkTemplate       363759800   277271200   -23.78%

benchmark                            old ns/op  new ns/op  delta
BenchmarkAtof64Decimal               127        129        +1.57%
BenchmarkAtof64Float                 166        164        -1.20%
BenchmarkAtof64FloatExp              308        300        -2.60%
BenchmarkAtof64Big                   584        571        -2.23%
BenchmarkAppendFloatDecimal          440        430        -2.27%
BenchmarkAppendFloat                 995        776        -22.01%
BenchmarkAppendFloatExp              897        746        -16.83%
BenchmarkAppendFloatNegExp           900        752        -16.44%
BenchmarkAppendFloatBig              1528       1228       -19.63%
BenchmarkAppendFloat32Integer        443        453        +2.26%
BenchmarkAppendFloat32ExactFraction  812        661        -18.60%
BenchmarkAppendFloat32Point          1002       773        -22.85%
BenchmarkAppendFloat32Exp            858        725        -15.50%
BenchmarkAppendFloat32NegExp         848        728        -14.15%
BenchmarkAppendFloat64Fixed1         447        431        -3.58%
BenchmarkAppendFloat64Fixed2         480        462        -3.75%
BenchmarkAppendFloat64Fixed3         461        457        -0.87%
BenchmarkAppendFloat64Fixed4         509        484        -4.91%

Update #1914.

R=rsc, nigeltao
CC=golang-dev, remy
https://golang.org/cl/6494107
Rémy Oudompheng authored
Comparisons used to create temporaries for arguments even if they were already variables or addressable. Removing the extra ones reduces pressure on regopt.

benchmark               old ns/op   new ns/op   delta
BenchmarkGobDecode      50787620    49908980    -1.73%
BenchmarkGobEncode      19870190    19473030    -2.00%
BenchmarkGzip           3214321000  3067929000  -4.55%
BenchmarkGunzip         496792800   465828600   -6.23%
BenchmarkJSONEncode     232524800   263864400   +13.48%
BenchmarkJSONDecode     622038400   506600600   -18.56%
BenchmarkMandelbrot200  23937310    45913060    +91.81%
BenchmarkParse          14364450    13997010    -2.56%
BenchmarkRevcomp        6919028000  6480009000  -6.35%
BenchmarkTemplate       594458800   539528200   -9.24%

benchmark            old MB/s  new MB/s  speedup
BenchmarkGobDecode   15.11     15.38     1.02x
BenchmarkGobEncode   38.63     39.42     1.02x
BenchmarkGzip        6.04      6.33      1.05x
BenchmarkGunzip      39.06     41.66     1.07x
BenchmarkJSONEncode  8.35      7.35      0.88x
BenchmarkJSONDecode  3.12      3.83      1.23x
BenchmarkParse       4.03      4.14      1.03x
BenchmarkRevcomp     36.73     39.22     1.07x
BenchmarkTemplate    3.26      3.60      1.10x

R=mtj, daniel.morsing, rsc
CC=golang-dev
https://golang.org/cl/6547064
- 23 Sep, 2012 1 commit
Russ Cox authored
Fixes #3670.

R=ken2
CC=golang-dev
https://golang.org/cl/6542058
- 12 Sep, 2012 1 commit
Nigel Tao authored
Code higher up in the function already catches these cases.

R=remyoudompheng, rsc
CC=golang-dev
https://golang.org/cl/6496106
- 11 Sep, 2012 1 commit
Rémy Oudompheng authored
Removes extra LEAL/LEAQ instructions there and usually saves a useless temporary in the idiom

    if err := foo(); err != nil {...}

Generated code is also less involved:

    MOVQ err+n(SP), AX
    CMPQ AX, $0

(potentially CMPQ n(SP), $0) instead of

    LEAQ err+n(SP), AX
    CMPQ (AX), $0

Update #1914.

R=daniel.morsing, nigeltao, rsc
CC=golang-dev, remy
https://golang.org/cl/6493099
- 09 Sep, 2012 1 commit
Rémy Oudompheng authored
This makes the compilers' code more similar and improves code generation a lot. The number of LEAL instructions generated for cmd/go drops by 60%.

% GOARCH=386 go build -gcflags -S -a cmd/go | grep LEAL | wc -l
Before: 89774
After:  47548

benchmark                    old ns/op  new ns/op  delta
BenchmarkAppendFloatDecimal  540        444        -17.78%
BenchmarkAppendFloat         1160       1035       -10.78%
BenchmarkAppendFloatExp      1060       922        -13.02%
BenchmarkAppendFloatNegExp   1053       920        -12.63%
BenchmarkAppendFloatBig      1773       1558       -12.13%
BenchmarkFormatInt           13065      12481      -4.47%
BenchmarkAppendInt           10981      9900       -9.84%
BenchmarkFormatUint          3804       3650       -4.05%
BenchmarkAppendUint          3506       3303       -5.79%
BenchmarkUnquoteEasy         714        683        -4.34%
BenchmarkUnquoteHard         5117       2915       -43.03%

Update #1914.

R=nigeltao, rsc, golang-dev
CC=golang-dev, remy
https://golang.org/cl/6489067
- 23 Aug, 2012 1 commit
Nigel Tao authored
in 13416:67c0b8c8fb29 "faster code, mainly for rotate" [1].

The codegen can run out of registers if there are too many small-int arithmetic ops.

An alternative approach is to copy 6g's sbop/abop codegen to 8g, but this change is less risky.

Fixes #3835.

[1] http://code.google.com/p/go/source/diff?spec=svn67c0b8c8fb29b1b7b6221977af6b89cae787b941&name=67c0b8c8fb29&r=67c0b8c8fb29b1b7b6221977af6b89cae787b941&format=side&path=/src/cmd/8g/cgen.c

R=rsc, remyoudompheng, r
CC=golang-dev
https://golang.org/cl/6450163
- 14 Jun, 2012 1 commit
Nigel Tao authored
GOARCH=amd64 benchmarks

src/pkg/runtime
benchmark                old ns/op  new ns/op  delta
BenchmarkConvT2ESmall    10         10         +1.00%
BenchmarkConvT2EUintptr  9          0          -92.07%
BenchmarkConvT2EBig      74         74         -0.27%
BenchmarkConvT2I         27         26         -3.62%
BenchmarkConvI2E         4          4          -7.05%
BenchmarkConvI2I         20         19         -2.99%

test/bench/go1
benchmark               old ns/op   new ns/op   delta
BenchmarkBinaryTree17   5930908000  5937260000  +0.11%
BenchmarkFannkuch11     3927057000  3933556000  +0.17%
BenchmarkGobDecode      21998090    21870620    -0.58%
BenchmarkGobEncode      12725310    12734480    +0.07%
BenchmarkGzip           567617600   567892800   +0.05%
BenchmarkGunzip         178284100   178706900   +0.24%
BenchmarkJSONEncode     87693550    86794300    -1.03%
BenchmarkJSONDecode     314212600   324115000   +3.15%
BenchmarkMandelbrot200  7016640     7073766     +0.81%
BenchmarkParse          7852100     7892085     +0.51%
BenchmarkRevcomp        1285663000  1286147000  +0.04%
BenchmarkTemplate       566823800   567606200   +0.14%

I'm not entirely sure why the JSON* numbers have changed, but eyeballing the profile suggests that it could be spending less and more time in runtime.{new,old}stack, so it could simply be stack-split boundary noise.

R=rsc, dave, bsiegert, dsymonds
CC=golang-dev
https://golang.org/cl/6280049
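The ConvT2E benchmarks above exercise conversions of concrete values to the empty interface, for example (illustrative; the -92% uintptr case is consistent with a word-sized, pointer-free value being stored directly in the interface data word, as interfaces worked at the time):

    func box() interface{} {
        var u uintptr = 42
        return u // ConvT2E of a word-sized, pointer-free value
    }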
- 03 Jun, 2012 1 commit
Luuk van Dijk authored
R=rsc, ality, rogpeppe, minux.ma, dave
CC=golang-dev
https://golang.org/cl/5966075
- 30 May, 2012 1 commit
Russ Cox authored
Drop expecttaken function in favor of extra argument to gbranch and bgen. Mark loop condition as likely to be true, so that loops are generated inline.

The main benefit here is contiguous code when trying to read the generated assembly. It has only minor effects on the timing, and they mostly cancel the minor effects that aligning function entry points had. One exception: both changes made Fannkuch faster.

Compared to before CL 6244066 (before aligned functions)

benchmark              old ns/op   new ns/op   delta
BenchmarkBinaryTree17  4222117400  4201958800  -0.48%
BenchmarkFannkuch11    3462631800  3215908600  -7.13%
BenchmarkGobDecode     20887622    20899164    +0.06%
BenchmarkGobEncode     9548772     9439083     -1.15%
BenchmarkGzip          151687      152060      +0.25%
BenchmarkGunzip        8742        8711        -0.35%
BenchmarkJSONEncode    62730560    62686700    -0.07%
BenchmarkJSONDecode    252569180   252368960   -0.0...
- 29 May, 2012 1 commit
Russ Cox authored
The old code generated for a bounds check was

    CMP
    JLT ok
    CALL panicindex
ok:
    ...

The new code is (once the linker finishes with it):

    CMP
    JGE panic
    ...
panic:
    CALL panicindex

which moves the calls out of line, putting more useful code in each cache line. This matters especially in tight loops, such as in Fannkuch. The benefit is more modest elsewhere, but real.

From test/bench/go1, amd64:

benchmark              old ns/op   new ns/op   delta
BenchmarkBinaryTree17  6096092000  6088808000  -0.12%
BenchmarkFannkuch11    6151404000  4020463000  -34.64%
BenchmarkGobDecode     28990050    28894630    -0.33%
BenchmarkGobEncode     12406310    12136730    -2.17%
BenchmarkGzip          179923      179903      -0.01%
BenchmarkGunzip        11219       11130       -0.79%
BenchmarkJSONEncode    86429350    86515900    +0.10%
BenchmarkJSONDecode    334593800   315728400   -5.64%
BenchmarkRevcomp25M    1219763000  1180767000  -3.20%
BenchmarkTemplate      492947600   483646800   -1.89%

And 386:

benchmark              old ns/op   new ns/op   delta
BenchmarkBinaryTree17  6354902000  6243000000  -1.76%
BenchmarkFannkuch11    8043769000  7326965000  -8.91%
BenchmarkGobDecode     19010800    18941230    -0.37%
BenchmarkGobEncode     14077500    13792460    -2.02%
BenchmarkGzip          194087      193619      -0.24%
BenchmarkGunzip        12495       12457       -0.30%
BenchmarkJSONEncode    125636400   125451400   -0.15%
BenchmarkJSONDecode    696648600   685032800   -1.67%
BenchmarkRevcomp25M    2058088000  2052545000  -0.27%
BenchmarkTemplate      602140000   589876800   -2.04%

To implement this, two new instruction forms:

    JLT target     // same as always
    JLT $0, target // branch expected not taken
    JLT $1, target // branch expected taken

The linker could also emit the prediction prefixes, but it does not: expected taken branches are reversed so that the expected case is not taken (as in the example above), and the default expectation for such a jump is not taken already.

R=golang-dev, gri, r, dave
CC=golang-dev
https://golang.org/cl/6248049
- 24 May, 2012 1 commit
Russ Cox authored
* Eliminate bounds check on known small shifts.
* Rewrite x<<s | x>>(32-s) as a rotate (constant s).
* More aggressive (but still minimal) range analysis.

R=ken, dave, iant
CC=golang-dev
https://golang.org/cl/6209077
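The rotate pattern named above, as it appears in source (for constant s, this form is now intended to compile to a rotate instruction rather than two shifts and an OR):

    func rotl7(x uint32) uint32 {
        const s = 7
        return x<<s | x>>(32-s)
    }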
- 13 Feb, 2012 1 commit
Anthony Martin authored
8g/cgen.c
    print format type mismatch
8l/asm.c
    resoff set and not used
gc/pgen.c
    misleading comparison INT > 0x80000000
gc/reflect.c
    dalgsym must be static to match forward declaration
gc/subr.c
    assumed_equal set and not used
    hashmem's second argument is not used
gc/walk.c
    duplicated (unreachable) code

R=rsc
CC=golang-dev
https://golang.org/cl/5651079