    x86/percpu: Relax smp_processor_id() · 9ed7d75b
    Peter Zijlstra authored
    Nadav reported that since this_cpu_read() became asm-volatile, many
    smp_processor_id() users generated worse code due to the extra
    constraints.
    
    However, since smp_processor_id() reads a stable value, we can use
    __this_cpu_read() instead.
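The difference between a constrained and a relaxed read can be sketched in user space. This is only an analogy, not the kernel code: `cpu_number` here is an ordinary variable standing in for the per-CPU one, a `volatile` access stands in for the asm-volatile `this_cpu_read()`, and the helper names `strict_cpu_id()`, `relaxed_cpu_id()`, `sum_strict()` and `sum_relaxed()` are hypothetical.

```c
#include <assert.h>

/* Hypothetical stand-in for the per-CPU cpu_number variable. */
static int cpu_number = 3;

/*
 * Constrained read: the volatile access must be performed at every call,
 * so the compiler cannot merge, hoist, or drop repeated loads. This is
 * the user-space analogue of an asm-volatile read.
 */
static inline int strict_cpu_id(void)
{
    return *(volatile int *)&cpu_number;
}

/*
 * Relaxed read: a plain load. Because the value is stable for the
 * caller, the compiler is free to load it once and reuse the result.
 */
static inline int relaxed_cpu_id(void)
{
    return cpu_number;
}

/* Reads the id n times; every iteration performs a real load. */
int sum_strict(int n)
{
    int sum = 0;

    assert(n >= 0);
    for (int i = 0; i < n; i++)
        sum += strict_cpu_id();
    return sum;
}

/* Same result, but the load may be hoisted out of the loop. */
int sum_relaxed(int n)
{
    int sum = 0;

    assert(n >= 0);
    for (int i = 0; i < n; i++)
        sum += relaxed_cpu_id();
    return sum;
}
```

Both functions return the same value; only the freedom given to the optimizer differs, which is the kind of constraint the size comparison below reflects.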
    
    While this does reduce text size somewhat, this mostly results in code
    movement to .text.unlikely as a result of more/larger .cold.
    subfunctions. Less text on the hotpath is good for I$.
    
      $ ./compare.sh defconfig-build1 defconfig-build2 vmlinux.o
      setup_APIC_ibs                                             90         98   -12,+20
      force_ibs_eilvt_setup                                     400        413   -57,+70
      pci_serr_error                                            109        104   -54,+49
      pci_serr_error                                            109        104   -54,+49
      unknown_nmi_error                                         125        120   -76,+71
      unknown_nmi_error                                         125        120   -76,+71
      io_check_error                                            125        132   -97,+104
      intel_thermal_interrupt                                   730        822   +92,+0
      intel_init_thermal                                        951        945   -6,+0
      generic_get_mtrr                                          301        294   -7,+0
      generic_get_mtrr                                          301        294   -7,+0
      generic_set_all                                           749        754   -44,+49
      get_fixed_ranges                                          352        360   -41,+49
      x86_acpi_suspend_lowlevel                                 369        363   -6,+0
      check_tsc_sync_source                                     412        412   -71,+71
      irq_migrate_all_off_this_cpu                              662        674   -14,+26
      clocksource_watchdog                                      748        748   -113,+113
      __perf_event_account_interrupt                            204        197   -7,+0
      attempt_merge                                            1748       1741   -7,+0
      intel_guc_send_ct                                        1424       1409   -15,+0
      __fini_doorbell                                           235        231   -4,+0
      bdw_set_cdclk                                             928        923   -5,+0
      gen11_dsi_disable                                        1571       1556   -15,+0
      gmbus_wait                                                493        488   -5,+0
      md_make_request                                           376        369   -7,+0
      __split_and_process_bio                                   543        536   -7,+0
      delay_tsc                                                  96         89   -7,+0
      hsw_disable_pc8                                           696        691   -5,+0
      tsc_verify_tsc_adjust                                     215        228   -22,+35
      cpuidle_driver_unref                                       56         49   -7,+0
      blk_account_io_completion                                 159        148   -11,+0
      mtrr_wrmsr                                                 95         99   -29,+33
      __intel_wait_for_register_fw                              401        419   +18,+0
      cpuidle_driver_ref                                         43         36   -7,+0
      cpuidle_get_driver                                         15          8   -7,+0
      blk_account_io_done                                       535        528   -7,+0
      irq_migrate_all_off_this_cpu                              662        674   -14,+26
      check_tsc_sync_source                                     412        412   -71,+71
      irq_wait_for_poll                                         170        163   -7,+0
      generic_end_io_acct                                       329        322   -7,+0
      x86_acpi_suspend_lowlevel                                 369        363   -6,+0
      nohz_balance_enter_idle                                   198        191   -7,+0
      generic_start_io_acct                                     254        247   -7,+0
      blk_account_io_start                                      341        334   -7,+0
      perf_event_task_tick                                      682        675   -7,+0
      intel_init_thermal                                        951        945   -6,+0
      amd_e400_c1e_apic_setup                                    47         51   -28,+32
      setup_APIC_eilvt                                          350        328   -22,+0
      hsw_enable_pc8                                           1611       1605   -6,+0
                                                   total   12985947   12985892   -994,+939
    Reported-by: Nadav Amit <nadav.amit@gmail.com>
    Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    Cc: Linus Torvalds <torvalds@linux-foundation.org>
    Cc: Peter Zijlstra <peterz@infradead.org>
    Cc: Thomas Gleixner <tglx@linutronix.de>
    Signed-off-by: Ingo Molnar <mingo@kernel.org>
smp.h 5.98 KB