KVM: s390: Fix ipte locking · 1365039d
    ipte_unlock_siif uses cmpxchg to replace the in-memory data of the ipte
    lock together with ACCESS_ONCE for the initial read.
    
    union ipte_control {
            unsigned long val;
            struct {
                    unsigned long k  : 1;
                    unsigned long kh : 31;
                    unsigned long kg : 32;
            };
    };
    [...]
    static void ipte_unlock_siif(struct kvm_vcpu *vcpu)
    {
            union ipte_control old, new, *ic;
    
            ic = &vcpu->kvm->arch.sca->ipte_control;
            do {
                    new = old = ACCESS_ONCE(*ic);
                    new.kh--;
                    if (!new.kh)
                            new.k = 0;
            } while (cmpxchg(&ic->val, old.val, new.val) != old.val);
            if (!new.kh)
                    wake_up(&vcpu->kvm->arch.ipte_wq);
    }
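
    For reference, ACCESS_ONCE is essentially the volatile cast from
    include/linux/compiler.h, which is what should force the compiler to read
    *ic exactly once:

            #define ACCESS_ONCE(x) (*(volatile typeof(x) *)&(x))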
    
    The new value is loaded twice from memory with gcc 4.7.2 of
    Fedora 18, despite the ACCESS_ONCE:
    
    --->
    
    l       %r4,0(%r3)      <--- load first 32 bit of lock (k and kh) in r4
    alfi    %r4,2147483647  <--- add -1 to r4
    llgtr   %r4,%r4         <--- zero out the sign bit of r4
    lg      %r1,0(%r3)      <--- load all 64 bit of lock into new
    lgr     %r2,%r1         <--- load the same into old
    risbg   %r1,%r4,1,31,32 <--- shift and insert r4 into the bits 1-31 of new
    llihf   %r4,2147483647  <--- load a mask for the kh bits into r4
    ngrk    %r4,%r1,%r4     <--- test whether the new kh is zero
    jne     aa0 <ipte_unlock+0xf8>  <--- skip clearing k if kh is not zero
    nihh    %r1,32767       <--- clear the k bit in new
    lgr     %r4,%r2         <--- use old as the compare value
    csg     %r4,%r1,0(%r3)  <--- compare and swap old with new in memory
    cgr     %r2,%r4         <--- compare the returned value with old
    jne     a70 <ipte_unlock+0xc8>  <--- retry the loop if the cmpxchg failed
    
    If the memory value changes between the first load (l) and the second
    load (lg), we are broken. If that happens VCPU threads will hang
    (unkillable) in handle_ipte_interlock.
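
    In C terms the generated sequence behaves roughly like the sketch below
    (an illustration of the broken code, not code from the kernel): kh is
    derived from the first load, while old and the rest of new come from the
    second load, so an update that sneaks in between the two loads produces a
    new value whose kh no longer corresponds to old.

            do {
                    union ipte_control tmp;

                    tmp = *ic;          /* first load (l): provides kh       */
                    tmp.kh--;
                    old = *ic;          /* second load (lg): may be newer    */
                    new = old;
                    new.kh = tmp.kh;    /* possibly stale kh merged into new */
                    if (!new.kh)
                            new.k = 0;
            } while (cmpxchg(&ic->val, old.val, new.val) != old.val);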
    
    Andreas Krebbel analyzed this and tracked it down to a compiler bug in
    that version:
    "while it is not that obvious the C99 standard basically forbids
    duplicating the memory access also in that case. For an argumentation of
    a similiar case please see:
    https://gcc.gnu.org/bugzilla/show_bug.cgi?id=22278#c43
    
    For the implementation-defined cases regarding volatile there are some
    GCC-specific clarifications which can be found here:
    https://gcc.gnu.org/onlinedocs/gcc/Volatiles.html#Volatiles
    
    I've tracked down the problem with a reduced testcase. The problem was
    that during a tree level optimization (SRA - scalar replacement of
    aggregates) the volatile marker is lost. And an RTL level optimizer (CSE
    - common subexpression elimination) then propagated the memory read into
    its second use, introducing another access to the memory location. So
    indeed Christian's suspicion that the union access has something to do
    with it is correct (since it triggered the SRA optimization).
    
    This issue has been reported and fixed in the GCC 4.8 development cycle:
    https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58145"
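
    The reduced testcase itself is attached to the GCC bug report above; the
    pattern that triggers SRA boils down to copying a bitfield union out of an
    ACCESS_ONCE-style volatile read, roughly like this self-contained sketch
    (not the testcase from the report):

            union ipte_control {
                    unsigned long val;
                    struct {
                            unsigned long k  : 1;
                            unsigned long kh : 31;
                            unsigned long kg : 32;
                    };
            };

            void read_and_decrement(union ipte_control *ic,
                                    unsigned long *o, unsigned long *n)
            {
                    union ipte_control old, new;

                    /* one volatile read, copied into two union objects */
                    new = old = *(volatile union ipte_control *)ic;
                    new.kh--;
                    if (!new.kh)
                            new.k = 0;
                    *o = old.val;
                    *n = new.val;
            }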
    
    This patch replaces the ACCESS_ONCE scheme with a barrier()-based scheme
    that should work for all supported compilers.
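
    A barrier()-based loop (a sketch of the idea; the exact change is in the
    gaccess.c hunks of this patch) first copies the lock into old with a plain
    load and puts a compiler barrier between that copy and everything derived
    from it, so the compiler cannot legally repeat the memory access:

            do {
                    old = *ic;
                    barrier();      /* no re-reading of *ic below this point */
                    new = old;
                    new.kh--;
                    if (!new.kh)
                            new.k = 0;
            } while (cmpxchg(&ic->val, old.val, new.val) != old.val);
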
    Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
    Cc: stable@vger.kernel.org # v3.16+