• Kirill Smelkov's avatar
    libgolang: Fix chan.close to dequeue subscribers atomically · dcf4ebd1
    Kirill Smelkov authored
    Currently chan.close iterates all send/recv subscribers
    unlocking/relocking the channel for each and notifying dequeued
    subscriber with channel unlocked. This leads to that even if channel had
    only one subscriber, chan.close accesses chan._mu again - after
    notifying that subscriber. That in turn means that an idiom where e.g.
    a done channel is passed to worker, which worker closes at the end, and
    main task waiting on the done and destroying done right after wakeup
    cannot work - because close, internally, accesses already destroyed
    channel as the following TSAN report shows for _test_go_c:
    
        WARNING: ThreadSanitizer: data race (pid=7143)
          Write of size 8 at 0x7b1400000650 by main thread:
            #0 free ../../../../src/libsanitizer/tsan/tsan_interceptors.cc:649 (libtsan.so.0+0x2b46a)
            #1 free ../../../../src/libsanitizer/tsan/tsan_interceptors.cc:643 (libtsan.so.0+0x2b46a)
            #2 golang::_chan::decref() golang/runtime/libgolang.cpp:470 (liblibgolang.so.0.1+0x47f2)
            #3 _chanxdecref golang/runtime/libgolang.cpp:452 (liblibgolang.so.0.1+0x484a)
            #4 _test_go_c golang/runtime/libgolang_test_c.c:86 (_golang_test.so+0x13a2e)
            #5 __pyx_pf_6golang_12_golang_test_12test_go_c golang/_golang_test.cpp:3340 (_golang_test.so+0xcbaa)
            #6 __pyx_pw_6golang_12_golang_test_13test_go_c golang/_golang_test.cpp:3305 (_golang_test.so+0xcbaa)
            #7 PyEval_EvalFrameEx <null> (python2.7+0xf68b4)
    
          Previous read of size 8 at 0x7b1400000650 by thread T8:
            #0 golang::Sema::acquire() golang/runtime/libgolang.cpp:164 (liblibgolang.so.0.1+0x410a)
            #1 golang::Mutex::lock() golang/runtime/libgolang.cpp:175 (liblibgolang.so.0.1+0x4c82)
            #2 golang::_chan::close() golang/runtime/libgolang.cpp:754 (liblibgolang.so.0.1+0x4c82)
            #3 _chanclose golang/runtime/libgolang.cpp:732 (liblibgolang.so.0.1+0x4d1a)
            #4 _work golang/runtime/libgolang_test_c.c:92 (_golang_test.so+0x136cc)
            #5 <null> <null> (python2.7+0x1929e3)
    
          Thread T8 (tid=7311, finished) created by main thread at:
            #0 pthread_create ../../../../src/libsanitizer/tsan/tsan_interceptors.cc:915 (libtsan.so.0+0x2be1b)
            #1 PyThread_start_new_thread <null> (python2.7+0x19299f)
            #2 _taskgo golang/runtime/libgolang.cpp:119 (liblibgolang.so.0.1+0x3f68)
            #3 _test_go_c golang/runtime/libgolang_test_c.c:84 (_golang_test.so+0x13a1c)
            #4 __pyx_pf_6golang_12_golang_test_12test_go_c golang/_golang_test.cpp:3340 (_golang_test.so+0xcbaa)
            #5 __pyx_pw_6golang_12_golang_test_13test_go_c golang/_golang_test.cpp:3305 (_golang_test.so+0xcbaa)
            #6 PyEval_EvalFrameEx <null> (python2.7+0xf68b4)
    
    -> Fix close to dequeue all channel's subscribers atomically, and notify
    them all after channel is unlocked and _no_ longer accessed.
    
    Close was already working this way when channels were done at Python
    level, but in 3b241983 (Port/move channels to C/C++/Pyx) I introduced
    this bug while trying to avoid additional memory allocation in close.
    
    Added test catches the bug on all - even not under TSAN - builds.
    
    ----
    
    Added test also reveals another bug: recv<onstack=false> uses channel
    after wakeup, and, as at the time of wakeup the channel could be
    already destroyed, that segfaults. Fix it by pre-reading in recv
    everything needed from _chan object before going to sleep.
    
    This fix cannot go separately from close fix, as fixed close is required
    for recv-uses-chan-after-wakeup testcase.
    dcf4ebd1
libgolang.cpp 33.1 KB