1. 22 Mar, 2013 20 commits
  2. 21 Mar, 2013 20 commits
    • test: more systematic shift tests · f8ff6893
      Robert Griesemer authored
      To be submitted once gc agrees.
      
      R=rsc, iant, remyoudompheng
      CC=golang-dev
      https://golang.org/cl/7861045
      f8ff6893
    • cmd/gc: accept ideal float as indices. · 88b98ff7
      Rémy Oudompheng authored
      Fixes #4813.
      
      R=golang-dev, daniel.morsing, rsc
      CC=golang-dev
      https://golang.org/cl/7625050
      88b98ff7
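As an illustration of what "ideal float as indices" means in practice: an untyped floating-point constant with an integral value, such as 2.0, can be used as an index. This is a minimal sketch; the function name lastOfThree is hypothetical and is not taken from the CL.

```go
package main

import "fmt"

// lastOfThree indexes a slice with the untyped floating-point
// constant 2.0, which is exactly representable as an int.
func lastOfThree(s []int) int {
	return s[2.0]
}

func main() {
	fmt.Println(lastOfThree([]int{10, 20, 30})) // prints 30
}
```

A constant such as 2.5 is still rejected at compile time, because it cannot be represented as an int.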
    • go/doc: use regexp for -notes instead of comma-sep. list · 5e6a1a3f
      Robert Griesemer authored
      -notes="BUG|TODO" instead of -notes="BUG,TODO".
      Permits -notes=".*" to see all notes.
      
      R=cnicolaou
      CC=golang-dev
      https://golang.org/cl/7951043
      5e6a1a3f
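The effect of switching from a comma-separated list to a regular expression can be sketched as follows. The helper selectMarkers is hypothetical, and anchoring the pattern to the whole marker is an assumption of this sketch, not something the commit message states.

```go
package main

import (
	"fmt"
	"os"
	"regexp"
)

// selectMarkers returns the note markers that pattern matches in full.
// Anchoring with ^(?:...)$ is an assumption made for this sketch.
func selectMarkers(pattern string, markers []string) ([]string, error) {
	re, err := regexp.Compile("^(?:" + pattern + ")$")
	if err != nil {
		return nil, err
	}
	var out []string
	for _, m := range markers {
		if re.MatchString(m) {
			out = append(out, m)
		}
	}
	return out, nil
}

func main() {
	markers := []string{"BUG", "TODO", "FIXME"}
	for _, p := range []string{"BUG|TODO", ".*"} {
		got, err := selectMarkers(p, markers)
		if err != nil {
			fmt.Fprintln(os.Stderr, "bad pattern:", err)
			os.Exit(1)
		}
		fmt.Println(p, "selects", got)
	}
}
```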
    • bufio.Scanner: delete obsolete TODO · 5bbdf405
      Rob Pike authored
      Also fix the examples to use stderr for errors.
      
      R=golang-dev, bradfitz
      CC=golang-dev
      https://golang.org/cl/7716052
      5bbdf405
    • net/http/fcgi: fix a shutdown race · d7c1f67c
      Brad Fitzpatrick authored
      If a handler didn't consume all its Request.Body, child.go was
      closing the socket while the host was still writing to it,
      causing the child to send a RST and the host (at least nginx)
      to send an empty response body.
      
      Now, we tell the host we're done with the request/response
      first, and then close our input pipe after consuming a bit of
      it. Consuming the body fixes the problem, and flushing to the
      host first to tell it that we're done increases the chance
      that the host cuts off further data to us, meaning we won't
      have much to consume.
      
      No new tests, because this package is lacking in tests.
      Tested by hand with nginx.  See issue for testing details.
      
      Fixes #4183
      
      R=golang-dev, rsc
      CC=golang-dev
      https://golang.org/cl/7939045
      d7c1f67c
    • debug/elf: restore Go 1.0 semantics for (*File).Symbols · aafc444b
      Russ Cox authored
      Also adjust the implementation of applyRelocationsAMD64
      so that the test added in CL 6848044 still passes.
      
      R=golang-dev, minux.ma
      CC=golang-dev
      https://golang.org/cl/7686049
      aafc444b
    • reflect: implement method values · 3be70366
      Russ Cox authored
      Fixes #1517.
      
      R=golang-dev, r
      CC=golang-dev
      https://golang.org/cl/7906043
      3be70366
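One way to see method values through reflection: a method obtained from a reflect.Value can be converted back into an ordinary func value bound to its receiver. The Counter type and boundAdd helper are hypothetical, introduced only for this sketch.

```go
package main

import (
	"fmt"
	"reflect"
)

// Counter is a hypothetical type used only for this illustration.
type Counter struct{ n int }

// Add increases the counter by d and returns the new total.
func (c *Counter) Add(d int) int {
	c.n += d
	return c.n
}

// boundAdd obtains c.Add through reflection and returns it as a plain
// func value that stays bound to c.
func boundAdd(c *Counter) func(int) int {
	m := reflect.ValueOf(c).MethodByName("Add")
	return m.Interface().(func(int) int)
}

func main() {
	add := boundAdd(&Counter{})
	add(2)
	fmt.Println(add(3)) // prints 5
}
```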
    • doc/go1.1.html: more TODOs done · 178d8d4f
      Rob Pike authored
      Only the net stuff remains as significant work in the "minor changes" section.
      
      R=golang-dev, dave, elias.naur, rsc
      CC=golang-dev
      https://golang.org/cl/7933044
      178d8d4f
    • crypto/rc4: faster amd64 implementation · b505ff62
      Russ Cox authored
      XOR key into data 128 bits at a time instead of 64 bits
      and pipeline half of state loads. Rotate loop to allow
      single-register indexing for state[i].
      
      On a MacBookPro10,2 (Core i5):
      
      benchmark           old ns/op    new ns/op    delta
      BenchmarkRC4_128          412          224  -45.63%
      BenchmarkRC4_1K          3179         1613  -49.26%
      BenchmarkRC4_8K         25223        12545  -50.26%
      
      benchmark            old MB/s     new MB/s  speedup
      BenchmarkRC4_128       310.51       570.42    1.84x
      BenchmarkRC4_1K        322.09       634.48    1.97x
      BenchmarkRC4_8K        320.97       645.32    2.01x
      
      For comparison, on the same machine, openssl 0.9.8r reports
      its rc4 speed as somewhat under 350 MB/s for both 1K and 8K
      (it is operating 64 bits at a time).
      
      On an Intel Xeon E5520:
      
      benchmark           old ns/op    new ns/op    delta
      BenchmarkRC4_128          418          259  -38.04%
      BenchmarkRC4_1K          3200         1884  -41.12%
      BenchmarkRC4_8K         25173        14529  -42.28%
      
      benchmark            old MB/s     new MB/s  speedup
      BenchmarkRC4_128       306.04       492.48    1.61x
      BenchmarkRC4_1K        319.93       543.26    1.70x
      BenchmarkRC4_8K        321.61       557.20    1.73x
      
      For comparison, on the same machine, openssl 1.0.1
      reports its rc4 speed as 587 MB/s for 1K and 601 MB/s for 8K.
      
      R=agl
      CC=golang-dev
      https://golang.org/cl/7865046
      b505ff62
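The delta and speedup columns in tables like these are consistent with the following arithmetic: delta is the relative change in ns/op, and speedup is the ratio of new to old throughput. This is a sketch of the calculation, not the tool that produced the tables; the function names are hypothetical.

```go
package main

import "fmt"

// delta formats the relative change in ns/op as a signed percentage.
func delta(oldNs, newNs float64) string {
	return fmt.Sprintf("%+.2f%%", (newNs-oldNs)/oldNs*100)
}

// speedup formats the throughput ratio new/old.
func speedup(oldMBs, newMBs float64) string {
	return fmt.Sprintf("%.2fx", newMBs/oldMBs)
}

func main() {
	// BenchmarkRC4_128 on the Core i5, taken from the tables above.
	fmt.Println(delta(412, 224))         // prints -45.63%
	fmt.Println(speedup(310.51, 570.42)) // prints 1.84x
}
```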
    • cmd/ld: portability fixes · d04ac4b0
      Shenghou Ma authored
      fix code that implicitly assumes little-endian machines.
      
      R=golang-dev, bradfitz, rsc, alex.brainman
      CC=golang-dev
      https://golang.org/cl/6792043
      d04ac4b0
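The class of bug being fixed, code that depends on the host's byte order, is avoided by encoding multi-byte values with an explicit byte order. The sketch below shows the idea in Go with encoding/binary; the linker code touched by this CL is not reproduced here, and putUint32LE is a hypothetical name.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// putUint32LE encodes v in little-endian order regardless of the byte
// order of the machine running the program.
func putUint32LE(v uint32) [4]byte {
	var b [4]byte
	binary.LittleEndian.PutUint32(b[:], v)
	return b
}

func main() {
	b := putUint32LE(0x11223344)
	fmt.Printf("% x\n", b[:]) // prints 44 33 22 11
}
```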
    • cmd/ld: don't generate DW_AT_type attr for unsafe.Pointer to match gcc behavior · a891b916
      Shenghou Ma authored
      gcc generates only the DW_AT_byte_size attribute for the DW_TAG_pointer_type of "void *",
      but we used to also generate DW_AT_type pointing to an imaginary unspecified
      type "void", which confuses some versions of gdb.
      This change makes old Apple gdb 6.x (specifically, Apple version gdb-1515)
      accept our binary without errors like this:
      (gdb) b 'main.main'
      Die: DW_TAG_unspecified_type (abbrev = 10, offset = 47079)
          has children: FALSE
          attributes:
              DW_AT_name (DW_FORM_string) string: "void"
      Dwarf Error: Cannot find type of die [in module /Users/minux/go/go2.hg/bin/go]
      
      Special thanks to Russ Cox for pointing out the problem in comment #6 of
      CL 7891044.
      
      R=golang-dev, rsc
      CC=golang-dev
      https://golang.org/cl/7744051
      a891b916
    • cmd/ld: fix bad merge · f74b4d3d
      Shenghou Ma authored
      CL 7504044 accidentally reverted parts of CL 7891044 and CL 7552045; this CL
      brings those parts back.
      
      R=golang-dev
      TBR=rsc
      CC=golang-dev
      https://golang.org/cl/7950045
      f74b4d3d
    • go/format: fix documentation · e95c41f3
      Robert Griesemer authored
      R=r
      CC=golang-dev
      https://golang.org/cl/7920048
      e95c41f3
    • crypto/sha1: faster amd64, 386 implementations · 2f32138a
      Russ Cox authored
      -- amd64 --
      
      On a MacBookPro10,2 (Core i5):
      
      benchmark              old ns/op    new ns/op    delta
      BenchmarkHash8Bytes          785          592  -24.59%
      BenchmarkHash1K             8727         3014  -65.46%
      BenchmarkHash8K            64926        20723  -68.08%
      
      benchmark               old MB/s     new MB/s  speedup
      BenchmarkHash8Bytes        10.19        13.50    1.32x
      BenchmarkHash1K           117.34       339.71    2.90x
      BenchmarkHash8K           126.17       395.31    3.13x
      
      For comparison, on the same machine, openssl 0.9.8r reports
      its sha1 speed as 341 MB/s for 1K and 404 MB/s for 8K.
      
      On an Intel Xeon E5520:
      
      benchmark              old ns/op    new ns/op    delta
      BenchmarkHash8Bytes          984          707  -28.15%
      BenchmarkHash1K            11141         3466  -68.89%
      BenchmarkHash8K            82435        23411  -71.60%
      
      benchmark               old MB/s     new MB/s  speedup
      BenchmarkHash8Bytes         8.13        11.31    1.39x
      BenchmarkHash1K            91.91       295.36    3.21x
      BenchmarkHash8K            99.37       349.91    3.52x
      
      For comparison, on the same machine, openssl 1.0.1 reports
      its sha1 speed as 286 MB/s for 1K and 394 MB/s for 8K.
      
      -- 386 --
      
      On a MacBookPro10,2 (Core i5):
      
      benchmark              old ns/op    new ns/op    delta
      BenchmarkHash8Bytes         1041          713  -31.51%
      BenchmarkHash1K            15612         3382  -78.34%
      BenchmarkHash8K           110152        22733  -79.36%
      
      benchmark               old MB/s     new MB/s  speedup
      BenchmarkHash8Bytes         7.68        11.21    1.46x
      BenchmarkHash1K            65.59       302.76    4.62x
      BenchmarkHash8K            74.37       360.36    4.85x
      
      On an Intel Xeon E5520:
      
      benchmark              old ns/op    new ns/op    delta
      BenchmarkHash8Bytes         1221          842  -31.04%
      BenchmarkHash1K            14643         4137  -71.75%
      BenchmarkHash8K           108722        27394  -74.80%
      
      benchmark               old MB/s     new MB/s  speedup
      BenchmarkHash8Bytes         6.55         9.49    1.45x
      BenchmarkHash1K            69.93       247.51    3.54x
      BenchmarkHash8K            75.35       299.04    3.97x
      
      R=agl, dave
      CC=golang-dev
      https://golang.org/cl/7763049
      2f32138a
    • crypto/md5: faster amd64, 386 implementations · 25cbd534
      Russ Cox authored
      -- amd64 --
      
      On a MacBookPro10,2 (Core i5):
      
      benchmark                       old ns/op    new ns/op    delta
      BenchmarkHash8Bytes                   471          524  +11.25%
      BenchmarkHash1K                      3018         2220  -26.44%
      BenchmarkHash8K                     20634        14604  -29.22%
      BenchmarkHash8BytesUnaligned          468          523  +11.75%
      BenchmarkHash1KUnaligned             3006         2212  -26.41%
      BenchmarkHash8KUnaligned            20820        14652  -29.63%
      
      benchmark                        old MB/s     new MB/s  speedup
      BenchmarkHash8Bytes                 16.98        15.26    0.90x
      BenchmarkHash1K                    339.26       461.19    1.36x
      BenchmarkHash8K                    397.00       560.92    1.41x
      BenchmarkHash8BytesUnaligned        17.08        15.27    0.89x
      BenchmarkHash1KUnaligned           340.65       462.75    1.36x
      BenchmarkHash8KUnaligned           393.45       559.08    1.42x
      
      For comparison, on the same machine, openssl 0.9.8r reports
      its md5 speed as 350 MB/s for 1K and 410 MB/s for 8K.
      
      On an Intel Xeon E5520:
      
      benchmark                       old ns/op    new ns/op    delta
      BenchmarkHash8Bytes                   565          607   +7.43%
      BenchmarkHash1K                      3753         2475  -34.05%
      BenchmarkHash8K                     25945        16250  -37.37%
      BenchmarkHash8BytesUnaligned          559          594   +6.26%
      BenchmarkHash1KUnaligned             3754         2474  -34.10%
      BenchmarkHash8KUnaligned            26011        16359  -37.11%
      
      benchmark                        old MB/s     new MB/s  speedup
      BenchmarkHash8Bytes                 14.15        13.17    0.93x
      BenchmarkHash1K                    272.83       413.58    1.52x
      BenchmarkHash8K                    315.74       504.11    1.60x
      BenchmarkHash8BytesUnaligned        14.31        13.46    0.94x
      BenchmarkHash1KUnaligned           272.73       413.78    1.52x
      BenchmarkHash8KUnaligned           314.93       500.73    1.59x
      
      For comparison, on the same machine, openssl 1.0.1 reports
      its md5 speed as 443 MB/s for 1K and 513 MB/s for 8K.
      
      -- 386 --
      
      On a MacBookPro10,2 (Core i5):
      
      benchmark                       old ns/op    new ns/op    delta
      BenchmarkHash8Bytes                   602          670  +11.30%
      BenchmarkHash1K                      4038         2549  -36.87%
      BenchmarkHash8K                     27879        16690  -40.13%
      BenchmarkHash8BytesUnaligned          602          670  +11.30%
      BenchmarkHash1KUnaligned             4025         2546  -36.75%
      BenchmarkHash8KUnaligned            27844        16692  -40.05%
      
      benchmark                        old MB/s     new MB/s  speedup
      BenchmarkHash8Bytes                 13.28        11.93    0.90x
      BenchmarkHash1K                    253.58       401.69    1.58x
      BenchmarkHash8K                    293.83       490.81    1.67x
      BenchmarkHash8BytesUnaligned        13.27        11.94    0.90x
      BenchmarkHash1KUnaligned           254.40       402.05    1.58x
      BenchmarkHash8KUnaligned           294.21       490.77    1.67x
      
      On an Intel Xeon E5520:
      
      benchmark                       old ns/op    new ns/op    delta
      BenchmarkHash8Bytes                   752          716   -4.79%
      BenchmarkHash1K                      5307         2799  -47.26%
      BenchmarkHash8K                     36993        18042  -51.23%
      BenchmarkHash8BytesUnaligned          748          730   -2.41%
      BenchmarkHash1KUnaligned             5301         2795  -47.27%
      BenchmarkHash8KUnaligned            36983        18085  -51.10%
      
      benchmark                        old MB/s     new MB/s  speedup
      BenchmarkHash8Bytes                 10.64        11.16    1.05x
      BenchmarkHash1K                    192.93       365.80    1.90x
      BenchmarkHash8K                    221.44       454.03    2.05x
      BenchmarkHash8BytesUnaligned        10.69        10.95    1.02x
      BenchmarkHash1KUnaligned           193.15       366.36    1.90x
      BenchmarkHash8KUnaligned           221.51       452.96    2.04x
      
      R=agl
      CC=golang-dev
      https://golang.org/cl/7621049
      25cbd534
    • crypto/rc4: faster amd64, 386 implementations · 1af96080
      Russ Cox authored
      -- amd64 --
      
      On a MacBookPro10,2 (Core i5):
      
      benchmark           old ns/op    new ns/op    delta
      BenchmarkRC4_128          470          421  -10.43%
      BenchmarkRC4_1K          3123         3275   +4.87%
      BenchmarkRC4_8K         26351        25866   -1.84%
      
      benchmark            old MB/s     new MB/s  speedup
      BenchmarkRC4_128       272.22       303.40    1.11x
      BenchmarkRC4_1K        327.80       312.58    0.95x
      BenchmarkRC4_8K        307.24       313.00    1.02x
      
      For comparison, on the same machine, openssl 0.9.8r reports
      its rc4 speed as somewhat under 350 MB/s for both 1K and 8K.
      The Core i5 performance can be boosted another 20%, but only
      by making the Xeon performance significantly slower.
      
      On an Intel Xeon E5520:
      
      benchmark           old ns/op    new ns/op    delta
      BenchmarkRC4_128          774          417  -46.12%
      BenchmarkRC4_1K          6121         3200  -47.72%
      BenchmarkRC4_8K         48394        25151  -48.03%
      
      benchmark            old MB/s     new MB/s  speedup
      BenchmarkRC4_128       165.18       306.84    1.86x
      BenchmarkRC4_1K        167.28       319.92    1.91x
      BenchmarkRC4_8K        167.29       321.89    1.92x
      
      For comparison, on the same machine, openssl 1.0.1
      (which uses a different implementation than 0.9.8r)
      reports its rc4 speed as 587 MB/s for 1K and 601 MB/s for 8K.
      It is using SIMD instructions to do more in parallel.
      
      So there's still some improvement to be had, but even so,
      this is almost 2x faster than what it replaced.
      
      -- 386 --
      
      On a MacBookPro10,2 (Core i5):
      
      benchmark           old ns/op    new ns/op    delta
      BenchmarkRC4_128         3491          421  -87.94%
      BenchmarkRC4_1K         28063         3205  -88.58%
      BenchmarkRC4_8K        220392        25228  -88.55%
      
      benchmark            old MB/s     new MB/s  speedup
      BenchmarkRC4_128        36.66       303.81    8.29x
      BenchmarkRC4_1K         36.49       319.42    8.75x
      BenchmarkRC4_8K         36.73       320.90    8.74x
      
      On an Intel Xeon E5520:
      
      benchmark           old ns/op    new ns/op    delta
      BenchmarkRC4_128         2268          524  -76.90%
      BenchmarkRC4_1K         18161         4137  -77.22%
      BenchmarkRC4_8K        142396        32350  -77.28%
      
      benchmark            old MB/s     new MB/s  speedup
      BenchmarkRC4_128        56.42       244.13    4.33x
      BenchmarkRC4_1K         56.38       247.46    4.39x
      BenchmarkRC4_8K         56.86       250.26    4.40x
      
      R=agl
      CC=golang-dev
      https://golang.org/cl/7547050
      1af96080
    • runtime: explicitly remove fd's from epoll waitset before close() · 44840786
      Dmitriy Vyukov authored
      Fixes #5061.
      
      The current code assumes that fd's are automatically removed from the epoll set when closed. That is not true: the underlying file description is removed from the epoll set only when *all* fd's referring to it are closed.
      
      There are 2 bad consequences:
      1. Kernel delivers notifications on already closed fd's.
      2. The following sequence of events leads to error:
         - add fd1 to epoll
         - dup fd1 = fd2
         - close fd1 (not removed from epoll since we've dup'ed the fd)
         - dup fd2 = fd1 (get the same fd as fd1)
         - add fd1 to epoll = EEXIST
      
      So, if an fd can potentially be dup'ed or fork'ed, it's necessary to explicitly remove the fd from the epoll set.
      
      R=golang-dev, bradfitz, dave
      CC=golang-dev
      https://golang.org/cl/7870043
      44840786
    • runtime: faster parallel GC · d4c80d19
      Dmitriy Vyukov authored
      Use per-thread work buffers instead of a global mutex-protected pool. This eliminates contention in the parallel scan phase.
      
      benchmark                             old ns/op    new ns/op    delta
      garbage.BenchmarkTree2-8               97100768     71417553  -26.45%
      garbage.BenchmarkTree2LastPause-8     970931485    714103692  -26.45%
      garbage.BenchmarkTree2Pause-8         469127802    345029253  -26.45%
      garbage.BenchmarkParser-8            2880950854   2715456901   -5.74%
      garbage.BenchmarkParserLastPause-8    137047399    103336476  -24.60%
      garbage.BenchmarkParserPause-8         80686028     58922680  -26.97%
      
      R=golang-dev, 0xe2.0x9a.0x9b, dave, adg, rsc, iant
      CC=golang-dev
      https://golang.org/cl/7816044
      d4c80d19
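The general idea of replacing a shared, mutex-protected pool with per-worker buffers can be sketched as follows. This is far simpler than the runtime's work buffers and is not derived from the CL; markAll and the doubling "work" are hypothetical stand-ins.

```go
package main

import (
	"fmt"
	"sync"
)

// markAll processes items with the given number of workers. Each worker
// appends results to its own buffer, so no lock is taken per item; the
// buffers are merged once after all workers have finished.
func markAll(items []int, workers int) []int {
	local := make([][]int, workers)
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func(w int) {
			defer wg.Done()
			// Worker w handles items w, w+workers, w+2*workers, ...
			for i := w; i < len(items); i += workers {
				local[w] = append(local[w], items[i]*2)
			}
		}(w)
	}
	wg.Wait()

	var out []int
	for _, buf := range local {
		out = append(out, buf...)
	}
	return out
}

func main() {
	items := make([]int, 100)
	for i := range items {
		items[i] = i + 1
	}
	sum := 0
	for _, v := range markAll(items, 4) {
		sum += v
	}
	fmt.Println(sum) // prints 10100
}
```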
    • cmd/gc: implement more cases in racewalk. · 656bc3eb
      Rémy Oudompheng authored
      Add missing CLOSUREVAR in switch.
      Mark MAKE, string conversion nodes as impossible.
      Control statements do not need instrumentation.
      Instrument COM and LROT nodes.
      Instrument map length.
      
      Update #4228
      
      R=dvyukov, golang-dev
      CC=golang-dev
      https://golang.org/cl/7504047
      656bc3eb