    powerpc/paravirt: Use is_kvm_guest() in vcpu_is_preempted() · ca3f969d
    Srikar Dronamraju authored
    If it's a shared LPAR but not a KVM guest, check whether the vCPU shares
    a core with the calling vCPU. On PowerVM, only whole cores can be
    preempted. So if one vCPU (here, the calling one) is in a non-preempted
    state, we can infer that all other vCPUs sharing the same core are in a
    non-preempted state as well.
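
The check described above can be sketched as a small user-space model. This is not the kernel code: `struct guest_model`, `first_thread_sibling()`, `model_vcpu_is_preempted()` and the `THREADS_PER_CORE` value are hypothetical names chosen for illustration, and the "odd yield count means preempted" fallback is an assumption about how the remaining case is decided.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption: 8 hardware threads per core, numbered contiguously. */
#define THREADS_PER_CORE 8

/* Hypothetical stand-in for the guest state the real check consults. */
struct guest_model {
    bool shared_lpar;                /* running as a shared-processor LPAR? */
    bool kvm_guest;                  /* running under KVM rather than PowerVM? */
    int calling_cpu;                 /* vCPU that is asking the question */
    int nr_cpus;                     /* number of entries in yield_count */
    const unsigned int *yield_count; /* assumed: odd value == preempted */
};

/* First thread of the core that 'cpu' belongs to. */
static int first_thread_sibling(int cpu)
{
    return cpu - (cpu % THREADS_PER_CORE);
}

static bool model_vcpu_is_preempted(const struct guest_model *g, int cpu)
{
    assert(cpu >= 0 && cpu < g->nr_cpus);

    /* Dedicated processors are never preempted by the hypervisor. */
    if (!g->shared_lpar)
        return false;

    /*
     * Not a KVM guest: preemption happens at core granularity. The
     * caller is running, so every vCPU on the caller's core is running.
     */
    if (!g->kvm_guest &&
        first_thread_sibling(cpu) == first_thread_sibling(g->calling_cpu))
        return false;

    /* Otherwise fall back to the per-vCPU yield count. */
    return (g->yield_count[cpu] & 1) != 0;
}
```

In this model the same-core shortcut answers the question without reading the target vCPU's yield count, and it is skipped for KVM guests, where individual vCPUs can be preempted.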
    
    Performance results:
    
      $ perf stat -r 5 -a perf bench sched pipe -l 10000000 (lower time is better)
    
      powerpc/next
           35,107,951.20 msec cpu-clock                 #  255.898 CPUs utilized            ( +-  0.31% )
              23,655,348      context-switches          #    0.674 K/sec                    ( +-  3.72% )
                  14,465      cpu-migrations            #    0.000 K/sec                    ( +-  5.37% )
                  82,463      page-faults               #    0.002 K/sec                    ( +-  8.40% )
       1,127,182,328,206      cycles                    #    0.032 GHz                      ( +-  1.60% )  (66.67%)
          78,587,300,622      stalled-cycles-frontend   #    6.97% frontend cycles idle     ( +-  0.08% )  (50.01%)
         654,124,218,432      stalled-cycles-backend    #   58.03% backend cycles idle      ( +-  1.74% )  (50.01%)
         834,013,059,242      instructions              #    0.74  insn per cycle
                                                        #    0.78  stalled cycles per insn  ( +-  0.73% )  (66.67%)
         132,911,454,387      branches                  #    3.786 M/sec                    ( +-  0.59% )  (50.00%)
           2,890,882,143      branch-misses             #    2.18% of all branches          ( +-  0.46% )  (50.00%)
    
                 137.195 +- 0.419 seconds time elapsed  ( +-  0.31% )
    
      powerpc/next + patchset
           29,981,702.64 msec cpu-clock                 #  255.881 CPUs utilized            ( +-  1.30% )
              40,162,456      context-switches          #    0.001 M/sec                    ( +-  0.01% )
                   1,110      cpu-migrations            #    0.000 K/sec                    ( +-  5.20% )
                  62,616      page-faults               #    0.002 K/sec                    ( +-  3.93% )
       1,430,030,626,037      cycles                    #    0.048 GHz                      ( +-  1.41% )  (66.67%)
          83,202,707,288      stalled-cycles-frontend   #    5.82% frontend cycles idle     ( +-  0.75% )  (50.01%)
         744,556,088,520      stalled-cycles-backend    #   52.07% backend cycles idle      ( +-  1.39% )  (50.01%)
         940,138,418,674      instructions              #    0.66  insn per cycle
                                                        #    0.79  stalled cycles per insn  ( +-  0.51% )  (66.67%)
         146,452,852,283      branches                  #    4.885 M/sec                    ( +-  0.80% )  (50.00%)
           3,237,743,996      branch-misses             #    2.21% of all branches          ( +-  1.18% )  (50.01%)
    
                  117.17 +- 1.52 seconds time elapsed  ( +-  1.30% )
    
    This is an improvement of around 14.6% in elapsed time (137.195 s down
    to 117.17 s).
    
    Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
    Acked-by: Waiman Long <longman@redhat.com>
    [mpe: Fold in performance results from cover letter]
    Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    Link: https://lore.kernel.org/r/20201202050456.164005-5-srikar@linux.vnet.ibm.com
paravirt.h 2.16 KB