1. 12 Nov, 2019 7 commits
    • Arnaldo Carvalho de Melo's avatar
      perf unwind: Use 'struct map_symbol' in 'struct unwind_entry' · c1529738
      Arnaldo Carvalho de Melo authored
      To help in passing that info around to callchain routines that, for the
      same reason, are moving to use 'struct map_symbol'.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lkml.kernel.org/n/tip-epsiibeprpxa8qpwji47uskc@git.kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      c1529738
    • Arnaldo Carvalho de Melo's avatar
      perf annotate: Pass a 'map_symbol' in places receiving a pair of 'map' and 'symbol' pointers · 29754894
      Arnaldo Carvalho de Melo authored
      We are already passing things like:
      
        symbol__annotate(ms->sym, ms->map, ...)
      
      So shorten the signature of such functions to receive the 'map_symbol'
      pointer.
      
      This also paves the way to having the 'struct map_groups' pointer in the
      'struct map_symbol' so that we can get rid of 'struct map'->groups.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lkml.kernel.org/n/tip-23yx8v1t41nzpkpi7rdrozww@git.kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      29754894
    • Arnaldo Carvalho de Melo's avatar
      perf tools: Add map_groups to 'struct addr_location' · d3a022cb
      Arnaldo Carvalho de Melo authored
      From there we can get al->mg->machine, so replace that field with the
      more useful 'struct map_groups' that for now we're obtaining from
      al->map->groups, and that is one thing getting into the way of maps
      being fully shareable.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lkml.kernel.org/n/tip-4qdducrm32tgrjupcp0kjh1e@git.kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      d3a022cb
    • Arnaldo Carvalho de Melo's avatar
      perf map_groups: Pass the object to map_groups__find_ams() · 9d355b38
      Arnaldo Carvalho de Melo authored
      We were just passing a map to look for and reuse its map->groups member,
      but the idea is that this is going away, as a map can be in multiple
      rb_trees when being reused via a map_node, so do as all the other
      map_groups methods and pass as its first arg the object being operated
      on.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lkml.kernel.org/n/tip-nmi2pbggqloogwl6vxrvex5a@git.kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      9d355b38
    • Arnaldo Carvalho de Melo's avatar
      perf symbols: Stop using map->groups, we can use kmaps instead · f2baa060
      Arnaldo Carvalho de Melo authored
      To test that that function is being called I just added a probe on that
      place, enabled it via 'perf trace' asking for at most 16 levels of
      backtraces, system wide, and then ran 'perf top' on another xterm,
      voilà:
      
        # perf probe -x ~/bin/perf dso__process_kernel_symbol
        Added new event:
          probe_perf:dso__process_kernel_symbol (on dso__process_kernel_symbol in /home/acme/bin/perf)
      
        You can now use it in all perf tools, such as:
      
        	perf record -e probe_perf:dso__process_kernel_symbol -aR sleep 1
      
        # perf trace -e probe_perf:dso__process_kernel_symbol/max-stack=16/ --max-events=2
        # perf trace -e probe_perf:dso__process_kernel_symbol/max-stack=16/ --max-events=2
             0.000 :17345/17345 probe_perf:dso__process_kernel_symbol(__probe_ip: 5680224)
                                               dso__process_kernel_symbol (/home/acme/bin/perf)
                                               dso__load_vmlinux (/home/acme/bin/perf)
                                               dso__load_vmlinux_path (/home/acme/bin/perf)
                                               dso__load (/home/acme/bin/perf)
                                               map__load (/home/acme/bin/perf)
                                               thread__find_map (/home/acme/bin/perf)
                                               machine__resolve (/home/acme/bin/perf)
                                               deliver_event (/home/acme/bin/perf)
                                               __ordered_events__flush.part.0 (/home/acme/bin/perf)
                                               process_thread (/home/acme/bin/perf)
                                               start_thread (/usr/lib64/libpthread-2.29.so)
             0.064 :17345/17345 probe_perf:dso__process_kernel_symbol(__probe_ip: 5680224)
                                               dso__process_kernel_symbol (/home/acme/bin/perf)
                                               dso__load_vmlinux (/home/acme/bin/perf)
                                               dso__load_vmlinux_path (/home/acme/bin/perf)
                                               dso__load (/home/acme/bin/perf)
                                               map__load (/home/acme/bin/perf)
                                               thread__find_map (/home/acme/bin/perf)
                                               machine__resolve (/home/acme/bin/perf)
                                               deliver_event (/home/acme/bin/perf)
                                               __ordered_events__flush.part.0 (/home/acme/bin/perf)
                                               process_thread (/home/acme/bin/perf)
                                               start_thread (/usr/lib64/libpthread-2.29.so)
        #
        # perf stat -e probe_perf:dso__process_kernel_symbol
        ^C
         Performance counter stats for 'system wide':
      
                 107,308      probe_perf:dso__process_kernel_symbol
      
             8.215399813 seconds time elapsed
        #
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lkml.kernel.org/n/tip-5fy66x5hr5ct9pmw84jkiwvm@git.kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      f2baa060
    • Arnaldo Carvalho de Melo's avatar
      perf map: Use map->dso->kernel + map__kmaps() in map__kmaps() · de90d513
      Arnaldo Carvalho de Melo authored
      Its equivalent to using map->groups to obtain the machine struct.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lkml.kernel.org/n/tip-bdbazuj4ggrmzxdviaqdrdwh@git.kernel.orgSigned-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      de90d513
    • Ingo Molnar's avatar
      Merge tag 'perf-core-for-mingo-5.5-20191107' of... · 56b2147f
      Ingo Molnar authored
      Merge tag 'perf-core-for-mingo-5.5-20191107' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
      
      Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
      
      perf report:
      
        Jin Yao:
      
        - Introduce --total-cycles, for basic block profiling, further using data
          obtained from LBR, an example should suffice:
      
            # perf record -b
            ^C[ perf record: Woken up 595 times to write data ]
            [ perf record: Captured and wrote 156.672 MB perf.data (196873 samples) ]
      
            # perf evlist -v
            cycles: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|CPU|PERIOD|BRANCH_STACK, read_format: ID, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, task: 1, precise_ip: 3, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1, branch_sample_type: ANY
      
            # perf report --total-cycles --stdio
            # To display the perf.data header info, please use --header/--header-only options.
            #
            # Total Lost Samples: 0
            #
            # Samples: 6M of event 'cycles'
            # Event count (approx.): 6299936
            #
            # Sampled  Sampled   Avg     Avg
            # Cycles%  Cycles  Cycles%  Cycles                 [Program Block Range]     Shared Object
            # .......  ......  .......  .....   ....................................  ................
            #
               2.17%     1.7M   0.08%     607       [compiler.h:199 -> common.c:221]  [kernel.vmlinux]
               0.72%   544.5K   0.03%     230     [entry_64.S:657 -> entry_64.S:662]  [kernel.vmlinux]
               0.56%   541.8K   0.09%     672       [compiler.h:199 -> common.c:300]  [kernel.vmlinux]
               0.39%   293.2K   0.01%     104   [list_debug.c:43 -> list_debug.c:61]  [kernel.vmlinux]
               0.36%   278.6K   0.03%     272   [entry_64.S:1289 -> entry_64.S:1308]  [kernel.vmlinux]
      
      perf record:
      
        Adrian Hunter:
      
        - Allow storing perf.data in a directory together with a copy of /proc/kcore.
      
        Jiwei Sun:
      
        - Add support for limit perf output file size, i.e.:
      
          # perf record --all-cpus -F 10000 --max-size=4M sleep 10h
          [ perf record: perf size limit reached (4097 KB), stopping session ]
          [ perf record: Woken up 6 times to write data ]
          [ perf record: Captured and wrote 4.048 MB perf.data (54094 samples) ]
          Terminated
          # ls -lah perf.data
          -rw-------. 1 root root 4.1M Nov  7 15:27 perf.data
          #
      
      perf stat:
      
        Jiri Olsa:
      
        - Add --per-node agregation support:
      
          In live mode:
      
            # perf stat  -a -I 1000 -e cycles --per-node
            #           time node   cpus             counts unit events
                 1.000542550 N0       20          6,202,097      cycles
                 1.000542550 N1       20            639,559      cycles
                 2.002040063 N0       20          7,412,495      cycles
                 2.002040063 N1       20          2,185,577      cycles
                 3.003451699 N0       20          6,508,917      cycles
                 3.003451699 N1       20            765,607      cycles
            ...
      
          Or in the record/report stat session:
      
            # perf stat record -a -I 1000 -e cycles
            #           time             counts unit events
                 1.000536937         10,008,468      cycles
                 2.002090152          9,578,539      cycles
                 3.003625233          7,647,869      cycles
                 4.005135036          7,032,086      cycles
            ^C     4.340902364          3,923,893      cycles
      
            # perf stat report --per-node
            #           time node   cpus             counts unit events
                 1.000536937 N0       20          9,355,086      cycles
                 1.000536937 N1       20            653,382      cycles
                 2.002090152 N0       20          7,712,838      cycles
                 2.002090152 N1       20          1,865,701      cycles
             ...
      
      perf probe:
      
        Masami Hiramatsu:
      
        Various fixes related to recent additions to the DWARF format:
      
        - Fix to find range-only function instance
      
        - Walk function lines in lexical blocks
      
        - Fix to show function entry line as probe-able
      
        - Fix wrong address verification
      
        - Fix to probe a function which has no entry pc
      
        - Fix to probe an inline function which has no entry pc
      
        - Fix to list probe event with correct line number
      
        - Fix to show inlined function callsite without entry_pc
      
        - Fix to show ranges of variables in functions without entry_pc
      
        - Return a better scope DIE if there is no best scope
      
        - Skip end-of-sequence and non statement lines
      
        - Filter out instances except for inlined subroutine and subprogram
      
        - Fix to show calling lines of inlined functions
      
        - Skip overlapped location on searching variables
      
      perf inject:
      
        Adrian Hunter:
      
        - Do not strip evsels with --strip, as they are needed for create_gcov
          (see the autofdo example in tools/perf/Documentation/intel-pt.txt).
      
      Intel PT:
      
        Adrian Hunter:
      
        - Intel PT uses an auxtrace_cache to store the results of code-walking, to avoid
          repeated decoding. Add an auxtrace_cache__remove to handle text poke events.
      
      core:
      
        Andi Kleen:
      
        - Always preserve errno while cleaning up perf_event_open failures.
      
      llvm:
      
        Arnaldo Carvalho de Melo:
      
        - No need to tell that the request for saving a .o file for BPF events, as
          expressed in ~/.perfconfig was satisfied, make that a debug message.
      
      perf vendor events:
      
      Intel:
      
        Haiyan Song:
      
        - Update CascadelakeX events to v1.05.
      
        - Update all the Intel JSON metrics from TMAM 3.6.
      
      Treewide:
      
        Ian Rogers:
      
        - Improve error paths, plugging leaks found using LLVM tools
          such as libFuzzer.
      
      jevents:
      
        Yunfeng Ye:
      
        - Fix resource leak in process_mapfile() and main()
      
      perf kvm:
      
        Igor Lubashev:
      
        - Use evlist layer api when possible.
      
      libsubcmd:
      
        James Clark:
      
        - Move EXTRA_FLAGS to the end to allow overriding existing flags.
      
        - Use -O0 with DEBUG=1
      
      perf diff:
      
        Jin Yao:
      
        - Don't use hack to skip column length calculation
      
      CoreSight ETM:
      
        Leo yan:
      
        - Fix definition of macro TO_CS_QUEUE_NR
      
      ARM64:
      
        John Garry:
      
        - Do not try to include libelf header files when its feature detection
          failed, fixing the cross build for ARM64.
      
      perf tests:
      
        Leo Yan:
      
        - Fix out of bounds memory access in the backward ring buffer test.
      Signed-off-by: default avatarArnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      56b2147f
  2. 11 Nov, 2019 3 commits
  3. 10 Nov, 2019 13 commits
    • Linus Torvalds's avatar
      Merge tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc · 44866956
      Linus Torvalds authored
      Pull ARM SoC fixes from Olof Johansson:
       "A set of fixes that have trickled in over the last couple of weeks:
      
         - MAINTAINER update for Cavium/Marvell ThunderX2
      
         - stm32 tweaks to pinmux for Joystick/Camera, and RAM allocation for
           CAN interfaces
      
         - i.MX fixes for voltage regulator GPIO mappings, fixes voltage
           scaling issues
      
         - More i.MX fixes for various issues on i.MX eval boards: interrupt
           storm due to u-boot leaving pins in new states, fixing power button
           config, a couple of compatible-string corrections.
      
         - Powerdown and Suspend/Resume fixes for Allwinner A83-based tablets
      
         - A few documentation tweaks and a fix of a memory leak in the reset
           subsystem"
      
      * tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
        MAINTAINERS: update Cavium ThunderX2 maintainers
        ARM: dts: stm32: change joystick pinctrl definition on stm32mp157c-ev1
        ARM: dts: stm32: remove OV5640 pinctrl definition on stm32mp157c-ev1
        ARM: dts: stm32: Fix CAN RAM mapping on stm32mp157c
        ARM: dts: stm32: relax qspi pins slew-rate for stm32mp157
        arm64: dts: zii-ultra: fix ARM regulator GPIO handle
        ARM: sunxi: Fix CPU powerdown on A83T
        ARM: dts: sun8i-a83t-tbs-a711: Fix WiFi resume from suspend
        arm64: dts: imx8mn: fix compatible string for sdma
        arm64: dts: imx8mm: fix compatible string for sdma
        reset: fix reset_control_ops kerneldoc comment
        ARM: dts: imx6-logicpd: Re-enable SNVS power key
        soc: imx: gpc: fix initialiser format
        ARM: dts: imx6qdl-sabreauto: Fix storm of accelerometer interrupts
        arm64: dts: ls1028a: fix a compatible issue
        reset: fix reset_control_get_exclusive kerneldoc comment
        reset: fix reset_control_lookup kerneldoc comment
        reset: fix of_reset_control_get_count kerneldoc comment
        reset: fix of_reset_simple_xlate kerneldoc comment
        reset: Fix memory leak in reset_control_array_put()
      44866956
    • Linus Torvalds's avatar
      Merge tag 'staging-5.4-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging · dd892625
      Linus Torvalds authored
      Pull IIO fixes and staging driver from Greg KH:
       "Here is a mix of a number of IIO driver fixes for 5.4-rc7, and a whole
        new staging driver.
      
        The IIO fixes resolve some reported issues, all are tiny.
      
        The staging driver addition is the vboxsf filesystem, which is the
        VirtualBox guest shared folder code. Hans has been trying to get
        filesystem reviewers to review the code for many months now, and
        Christoph finally said to just merge it in staging now as it is
        stand-alone and the filesystem people can review it easier over time
        that way.
      
        I know it's late for this big of an addition, but it is stand-alone.
      
        The code has been in linux-next for a while, long enough to pick up a
        few tiny fixes for it already so people are looking at it.
      
        All of these have been in linux-next with no reported issues"
      
      * tag 'staging-5.4-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
        staging: Fix error return code in vboxsf_fill_super()
        staging: vboxsf: fix dereference of pointer dentry before it is null checked
        staging: vboxsf: Remove unused including <linux/version.h>
        staging: Add VirtualBox guest shared folder (vboxsf) support
        iio: adc: stm32-adc: fix stopping dma
        iio: imu: inv_mpu6050: fix no data on MPU6050
        iio: srf04: fix wrong limitation in distance measuring
        iio: imu: adis16480: make sure provided frequency is positive
      dd892625
    • Linus Torvalds's avatar
      Merge tag 'char-misc-5.4-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc · 3de2a3e9
      Linus Torvalds authored
      Pull char/misc driver fixes from Greg KH:
       "Here are a number of late-arrival driver fixes for issues reported for
        some char/misc drivers for 5.4-rc7
      
        These all come from the different subsystem/driver maintainers as
        things that they had reports for and wanted to see fixed.
      
        All of these have been in linux-next with no reported issues"
      
      * tag 'char-misc-5.4-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
        intel_th: pci: Add Jasper Lake PCH support
        intel_th: pci: Add Comet Lake PCH support
        intel_th: msu: Fix possible memory leak in mode_store()
        intel_th: msu: Fix overflow in shift of an unsigned int
        intel_th: msu: Fix missing allocation failure check on a kstrndup
        intel_th: msu: Fix an uninitialized mutex
        intel_th: gth: Fix the window switching sequence
        soundwire: slave: fix scanf format
        soundwire: intel: fix intel_register_dai PDI offsets and numbers
        interconnect: Add locking in icc_set_tag()
        interconnect: qcom: Fix icc_onecell_data allocation
        soundwire: depend on ACPI || OF
        soundwire: depend on ACPI
        thunderbolt: Drop unnecessary read when writing LC command in Ice Lake
        thunderbolt: Fix lockdep circular locking depedency warning
        thunderbolt: Read DP IN adapter first two dwords in one go
      3de2a3e9
    • Linus Torvalds's avatar
      Merge tag 'configfs-for-5.4-2' of git://git.infradead.org/users/hch/configfs · a5871fcb
      Linus Torvalds authored
      Pull configfs regression fix from Christoph Hellwig:
       "Fix a regression from this merge window in the configfs symlink
        handling (Honggang Li)"
      
      * tag 'configfs-for-5.4-2' of git://git.infradead.org/users/hch/configfs:
        configfs: calculate the depth of parent item
      a5871fcb
    • Linus Torvalds's avatar
      Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 9805a683
      Linus Torvalds authored
      Pull x86 fixes from Thomas Gleixner:
       "A small set of fixes for x86:
      
         - Make the tsc=reliable/nowatchdog command line parameter work again.
           It was broken with the introduction of the early TSC clocksource.
      
         - Prevent the evaluation of exception stacks before they are set up.
           This causes a crash in dumpstack because the stack walk termination
           gets screwed up.
      
         - Prevent a NULL pointer dereference in the rescource control file
           system.
      
         - Avoid bogus warnings about APIC id mismatch related to the LDR
           which can happen when the LDR is not in use and therefore not
           initialized. Only evaluate that when the APIC is in logical
           destination mode"
      
      * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        x86/tsc: Respect tsc command line paraemeter for clocksource_tsc_early
        x86/dumpstack/64: Don't evaluate exception stacks before setup
        x86/apic/32: Avoid bogus LDR warnings
        x86/resctrl: Prevent NULL pointer dereference when reading mondata
      9805a683
    • Linus Torvalds's avatar
      Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 621084cd
      Linus Torvalds authored
      Pull timer fixes from Thomas Gleixner:
       "A small set of fixes for timekeepoing and clocksource drivers:
      
         - VDSO data was updated conditional on the availability of a VDSO
           capable clocksource. This causes the VDSO functions which do not
           depend on a VDSO capable clocksource to operate on stale data.
           Always update unconditionally.
      
         - Prevent a double free in the mediatek driver
      
         - Use the proper helper in the sh_mtu2 driver so it won't attempt to
           initialize non-existing interrupts"
      
      * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        timekeeping/vsyscall: Update VDSO data unconditionally
        clocksource/drivers/sh_mtu2: Do not loop using platform_get_irq_by_name()
        clocksource/drivers/mediatek: Fix error handling
      621084cd
    • Linus Torvalds's avatar
      Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 81388c2b
      Linus Torvalds authored
      Pull scheduler fixes from Thomas Gleixner:
       "Two fixes for scheduler regressions:
      
         - Plug a subtle race condition which was introduced with the rework
           of the next task selection functionality. The change of task
           properties became unprotected which can be observed inconsistently
           causing state corruption.
      
         - A trivial compile fix for CONFIG_CGROUPS=n"
      
      * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        sched: Fix pick_next_task() vs 'change' pattern race
        sched/core: Fix compilation error when cgroup not selected
      81388c2b
    • Linus Torvalds's avatar
      Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · b584a176
      Linus Torvalds authored
      Pull perf tooling fixes from Thomas Gleixner:
      
       - Fix the time sorting algorithm which was broken due to truncation of
         big numbers
      
       - Fix the python script generator fail caused by a broken tracepoint
         array iterator
      
      * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        perf tools: Fix time sorting
        perf tools: Remove unused trace_find_next_event()
        perf scripting engines: Iterate on tep event arrays directly
      b584a176
    • Linus Torvalds's avatar
      Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · ffba65ea
      Linus Torvalds authored
      Pull irq fixlet from Thomas Gleixner:
       "A trivial fix for a kernel doc regression where an argument change was
        not reflected in the documentation"
      
      * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        irq/irqdomain: Update __irq_domain_alloc_fwnode() function documentation
      ffba65ea
    • Linus Torvalds's avatar
      Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 20c7e296
      Linus Torvalds authored
      Pull stacktrace fix from Thomas Gleixner:
       "A small fix for a stacktrace regression.
      
        Saving a stacktrace for a foreign task skipped an extra entry which
        makes e.g. the output of /proc/$PID/stack incomplete"
      
      * 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        stacktrace: Don't skip first entry on noncurrent tasks
      20c7e296
    • Linus Torvalds's avatar
      Merge tag '5.4-rc7-smb3-fix' of git://git.samba.org/sfrench/cifs-2.6 · 79a64063
      Linus Torvalds authored
      Pull cifs fix from Steve French:
       "Small fix for an smb3 reconnect bug (also marked for stable)"
      
      * tag '5.4-rc7-smb3-fix' of git://git.samba.org/sfrench/cifs-2.6:
        SMB3: Fix persistent handles reconnect
      79a64063
    • Corentin Labbe's avatar
      lib: Remove select of inexistant GENERIC_IO · 820b7c71
      Corentin Labbe authored
      config option GENERIC_IO was removed but still selected by lib/kconfig
      This patch finish the cleaning.
      
      Fixes: 9de8da47 ("kconfig: kill off GENERIC_IO option")
      Acked-by: default avatarRob Herring <robh@kernel.org>
      Signed-off-by: default avatarCorentin Labbe <clabbe@baylibre.com>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      820b7c71
    • Linus Torvalds's avatar
      Merge tag 'pinctrl-v5.4-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl · 4763c089
      Linus Torvalds authored
      Pull pin control fixes from Linus Walleij:
      
       - Fix glitch risks in the Intel GPIO
      
       - Fix the Intel Cherryview valid irq mask calculation.
      
       - Allocate the Intel Cherryview irqchip dynamically.
      
       - Fix the valid mask init sequency on the ST STMFX driver.
      
      * tag 'pinctrl-v5.4-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
        pinctrl: stmfx: fix valid_mask init sequence
        pinctrl: cherryview: Allocate IRQ chip dynamic
        pinctrl: cherryview: Fix irq_valid_mask calculation
        pinctrl: intel: Avoid potential glitches if pin is in GPIO mode
      4763c089
  4. 09 Nov, 2019 11 commits
    • Linus Torvalds's avatar
      Merge tag 'for-5.4-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux · 00aff683
      Linus Torvalds authored
      Pull btrfs fixes from David Sterba:
       "A few regressions and fixes for stable.
      
        Regressions:
      
         - fix a race leading to metadata space leak after task received a
           signal
      
         - un-deprecate 2 ioctls, marked as deprecated by mistake
      
        Fixes:
      
         - fix limit check for number of devices during chunk allocation
      
         - fix a race due to double evaluation of i_size_read inside max()
           macro, can cause a crash
      
         - remove wrong device id check in tree-checker"
      
      * tag 'for-5.4-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
        btrfs: un-deprecate ioctls START_SYNC and WAIT_SYNC
        btrfs: save i_size to avoid double evaluation of i_size_read in compress_file_range
        Btrfs: fix race leading to metadata space leak after task received signal
        btrfs: tree-checker: Fix wrong check on max devid
        btrfs: Consider system chunk array size for new SYSTEM chunks
      00aff683
    • Linus Torvalds's avatar
      Merge tag 'linux-watchdog-5.4-rc7' of git://www.linux-watchdog.org/linux-watchdog · 4aba1a7e
      Linus Torvalds authored
      Pull watchdog fixes from Wim Van Sebroeck:
      
       - cpwd: fix build regression
      
       - pm8916_wdt: fix pretimeout registration flow
      
       - meson: Fix the wrong value of left time
      
       - imx_sc_wdt: Pretimeout should follow SCU firmware format
      
       - bd70528: Add MODULE_ALIAS to allow module auto loading
      
      * tag 'linux-watchdog-5.4-rc7' of git://www.linux-watchdog.org/linux-watchdog:
        watchdog: bd70528: Add MODULE_ALIAS to allow module auto loading
        watchdog: imx_sc_wdt: Pretimeout should follow SCU firmware format
        watchdog: meson: Fix the wrong value of left time
        watchdog: pm8916_wdt: fix pretimeout registration flow
        watchdog: cpwd: fix build regression
      4aba1a7e
    • Linus Torvalds's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · 0058b0a5
      Linus Torvalds authored
      Pull networking fixes from David Miller:
      
       1) BPF sample build fixes from Björn Töpel
      
       2) Fix powerpc bpf tail call implementation, from Eric Dumazet.
      
       3) DCCP leaks jiffies on the wire, fix also from Eric Dumazet.
      
       4) Fix crash in ebtables when using dnat target, from Florian Westphal.
      
       5) Fix port disable handling whne removing bcm_sf2 driver, from Florian
          Fainelli.
      
       6) Fix kTLS sk_msg trim on fallback to copy mode, from Jakub Kicinski.
      
       7) Various KCSAN fixes all over the networking, from Eric Dumazet.
      
       8) Memory leaks in mlx5 driver, from Alex Vesker.
      
       9) SMC interface refcounting fix, from Ursula Braun.
      
      10) TSO descriptor handling fixes in stmmac driver, from Jose Abreu.
      
      11) Add a TX lock to synchonize the kTLS TX path properly with crypto
          operations. From Jakub Kicinski.
      
      12) Sock refcount during shutdown fix in vsock/virtio code, from Stefano
          Garzarella.
      
      13) Infinite loop in Intel ice driver, from Colin Ian King.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (108 commits)
        ixgbe: need_wakeup flag might not be set for Tx
        i40e: need_wakeup flag might not be set for Tx
        igb/igc: use ktime accessors for skb->tstamp
        i40e: Fix for ethtool -m issue on X722 NIC
        iavf: initialize ITRN registers with correct values
        ice: fix potential infinite loop because loop counter being too small
        qede: fix NULL pointer deref in __qede_remove()
        net: fix data-race in neigh_event_send()
        vsock/virtio: fix sock refcnt holding during the shutdown
        net: ethernet: octeon_mgmt: Account for second possible VLAN header
        mac80211: fix station inactive_time shortly after boot
        net/fq_impl: Switch to kvmalloc() for memory allocation
        mac80211: fix ieee80211_txq_setup_flows() failure path
        ipv4: Fix table id reference in fib_sync_down_addr
        ipv6: fixes rt6_probe() and fib6_nh->last_probe init
        net: hns: Fix the stray netpoll locks causing deadlock in NAPI path
        net: usb: qmi_wwan: add support for DW5821e with eSIM support
        CDC-NCM: handle incomplete transfer of MTU
        nfc: netlink: fix double device reference drop
        NFC: st21nfca: fix double free
        ...
      0058b0a5
    • Linus Torvalds's avatar
      Merge tag 'for-linus-2019-11-08' of git://git.kernel.dk/linux-block · 5cb8418c
      Linus Torvalds authored
      Pull block fixes from Jens Axboe:
      
       - Two NVMe device removal crash fixes, and a compat fixup for for an
         ioctl that was introduced in this release (Anton, Charles, Max - via
         Keith)
      
       - Missing error path mutex unlock for drbd (Dan)
      
       - cgroup writeback fixup on dead memcg (Tejun)
      
       - blkcg online stats print fix (Tejun)
      
      * tag 'for-linus-2019-11-08' of git://git.kernel.dk/linux-block:
        cgroup,writeback: don't switch wbs immediately on dead wbs if the memcg is dead
        block: drbd: remove a stray unlock in __drbd_send_protocol()
        blkcg: make blkcg_print_stat() print stats only for online blkgs
        nvme: change nvme_passthru_cmd64 to explicitly mark rsvd
        nvme-multipath: fix crash in nvme_mpath_clear_ctrl_paths
        nvme-rdma: fix a segmentation fault during module unload
      5cb8418c
    • David S. Miller's avatar
      Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-queue · a2582cdc
      David S. Miller authored
      Jeff Kirsher says:
      
      ====================
      Intel Wired LAN Driver Fixes 2019-11-08
      
      This series contains fixes to igb, igc, ixgbe, i40e, iavf and ice
      drivers.
      
      Colin Ian King fixes a potentially wrap-around counter in a for-loop.
      
      Nick fixes the default ITR values for the iavf driver to 50 usecs
      interval.
      
      Arkadiusz fixes 'ethtool -m' for X722 devices where the correct value
      cannot be obtained from the firmware, so add X722 to the check to ensure
      the wrong value is not returned.
      
      Jake fixes igb and igc drivers in their implementation of launch time
      support by declaring skb->tstamp value as ktime_t instead of s64.
      
      Magnus fixes ixgbe and i40e where the need_wakeup flag for transmit may
      not be set for AF_XDP sockets that are only used to send packets.
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      a2582cdc
    • Magnus Karlsson's avatar
      ixgbe: need_wakeup flag might not be set for Tx · 0843aa8f
      Magnus Karlsson authored
      The need_wakeup flag for Tx might not be set for AF_XDP sockets that
      are only used to send packets. This happens if there is at least one
      outstanding packet that has not been completed by the hardware and we
      get that corresponding completion (which will not generate an
      interrupt since interrupts are disabled in the napi poll loop) between
      the time we stopped processing the Tx completions and interrupts are
      enabled again. In this case, the need_wakeup flag will have been
      cleared at the end of the Tx completion processing as we believe we
      will get an interrupt from the outstanding completion at a later point
      in time. But if this completion interrupt occurs before interrupts
      are enable, we lose it and should at that point really have set the
      need_wakeup flag since there are no more outstanding completions that
      can generate an interrupt to continue the processing. When this
      happens, user space will see a Tx queue need_wakeup of 0 and skip
      issuing a syscall, which means will never get into the Tx processing
      again and we have a deadlock.
      
      This patch introduces a quick fix for this issue by just setting the
      need_wakeup flag for Tx to 1 all the time. I am working on a proper
      fix for this that will toggle the flag appropriately, but it is more
      challenging than I anticipated and I am afraid that this patch will
      not be completed before the merge window closes, therefore this easier
      fix for now. This fix has a negative performance impact in the range
      of 0% to 4%. Towards the higher end of the scale if you have driver
      and application on the same core and issue a lot of packets, and
      towards no negative impact if you use two cores, lower transmission
      speeds and/or a workload that also receives packets.
      Signed-off-by: default avatarMagnus Karlsson <magnus.karlsson@intel.com>
      Tested-by: default avatarAndrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: default avatarJeff Kirsher <jeffrey.t.kirsher@intel.com>
      0843aa8f
    • Magnus Karlsson's avatar
      i40e: need_wakeup flag might not be set for Tx · 70563957
      Magnus Karlsson authored
      The need_wakeup flag for Tx might not be set for AF_XDP sockets that
      are only used to send packets. This happens if there is at least one
      outstanding packet that has not been completed by the hardware and we
      get that corresponding completion (which will not generate an
      interrupt since interrupts are disabled in the napi poll loop) between
      the time we stopped processing the Tx completions and interrupts are
      enabled again. In this case, the need_wakeup flag will have been
      cleared at the end of the Tx completion processing as we believe we
      will get an interrupt from the outstanding completion at a later point
      in time. But if this completion interrupt occurs before interrupts
      are enable, we lose it and should at that point really have set the
      need_wakeup flag since there are no more outstanding completions that
      can generate an interrupt to continue the processing. When this
      happens, user space will see a Tx queue need_wakeup of 0 and skip
      issuing a syscall, which means will never get into the Tx processing
      again and we have a deadlock.
      
      This patch introduces a quick fix for this issue by just setting the
      need_wakeup flag for Tx to 1 all the time. I am working on a proper
      fix for this that will toggle the flag appropriately, but it is more
      challenging than I anticipated and I am afraid that this patch will
      not be completed before the merge window closes, therefore this easier
      fix for now. This fix has a negative performance impact in the range
      of 0% to 4%. Towards the higher end of the scale if you have driver
      and application on the same core and issue a lot of packets, and
      towards no negative impact if you use two cores, lower transmission
      speeds and/or a workload that also receives packets.
      Signed-off-by: default avatarMagnus Karlsson <magnus.karlsson@intel.com>
      Tested-by: default avatarAndrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: default avatarJeff Kirsher <jeffrey.t.kirsher@intel.com>
      70563957
    • Jacob Keller's avatar
      igb/igc: use ktime accessors for skb->tstamp · 6acab13b
      Jacob Keller authored
      When implementing launch time support in the igb and igc drivers, the
      skb->tstamp value is assumed to be a s64, but it's declared as a ktime_t
      value.
      
      Although ktime_t is typedef'd to s64 it wasn't always, and the kernel
      provides accessors for ktime_t values.
      
      Use the ktime_to_timespec64 and ktime_set accessors instead of directly
      assuming that the variable is always an s64.
      
      This improves portability if the code is ever moved to another kernel
      version, or if the definition of ktime_t ever changes again in the
      future.
      Signed-off-by: default avatarJacob Keller <jacob.e.keller@intel.com>
      Acked-by: default avatarVinicius Costa Gomes <vinicius.gomes@intel.com>
      Signed-off-by: default avatarJeff Kirsher <jeffrey.t.kirsher@intel.com>
      6acab13b
    • Arkadiusz Kubalewski's avatar
      i40e: Fix for ethtool -m issue on X722 NIC · 4c9da6f2
      Arkadiusz Kubalewski authored
      This patch contains fix for a problem with command:
      'ethtool -m <dev>'
      which breaks functionality of:
      'ethtool <dev>'
      when called on X722 NIC
      
      Disallowed update of link phy_types on X722 NIC
      Currently correct value cannot be obtained from FW
      Previously wrong value returned by FW was used and was
      a root cause for incorrect output of 'ethtool <dev>' command
      Signed-off-by: default avatarArkadiusz Kubalewski <arkadiusz.kubalewski@intel.com>
      Tested-by: default avatarAndrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: default avatarJeff Kirsher <jeffrey.t.kirsher@intel.com>
      4c9da6f2
    • Nicholas Nunley's avatar
      iavf: initialize ITRN registers with correct values · 4eda4e00
      Nicholas Nunley authored
      Since commit 92418fb1 ("i40e/i40evf: Use usec value instead of reg
      value for ITR defines") the driver tracks the interrupt throttling
      intervals in single usec units, although the actual ITRN registers are
      programmed in 2 usec units. Most register programming flows in the driver
      correctly handle the conversion, although it is currently not applied when
      the registers are initialized to their default values. Most of the time
      this doesn't present a problem since the default values are usually
      immediately overwritten through the standard adaptive throttling mechanism,
      or updated manually by the user, but if adaptive throttling is disabled and
      the interval values are left alone then the incorrect value will persist.
      
      Since the intended default interval of 50 usecs (vs. 100 usecs as
      programmed) performs better for most traffic workloads, this can lead to
      performance regressions.
      
      This patch adds the correct conversion when writing the initial values to
      the ITRN registers.
      Signed-off-by: default avatarNicholas Nunley <nicholas.d.nunley@intel.com>
      Tested-by: default avatarAndrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: default avatarJeff Kirsher <jeffrey.t.kirsher@intel.com>
      4eda4e00
    • Colin Ian King's avatar
      ice: fix potential infinite loop because loop counter being too small · 615457a2
      Colin Ian King authored
      Currently the for-loop counter i is a u8 however it is being checked
      against a maximum value hw->num_tx_sched_layers which is a u16. Hence
      there is a potential wrap-around of counter i back to zero if
      hw->num_tx_sched_layers is greater than 255.  Fix this by making i
      a u16.
      
      Addresses-Coverity: ("Infinite loop")
      Fixes: b36c598c ("ice: Updates to Tx scheduler code")
      Signed-off-by: default avatarColin Ian King <colin.king@canonical.com>
      Tested-by: default avatarAndrew Bowers <andrewx.bowers@intel.com>
      Signed-off-by: default avatarJeff Kirsher <jeffrey.t.kirsher@intel.com>
      615457a2
  5. 08 Nov, 2019 6 commits
    • Manish Chopra's avatar
      qede: fix NULL pointer deref in __qede_remove() · deabc871
      Manish Chopra authored
      While rebooting the system with SR-IOV vfs enabled leads
      to below crash due to recurrence of __qede_remove() on the VF
      devices (first from .shutdown() flow of the VF itself and
      another from PF's .shutdown() flow executing pci_disable_sriov())
      
      This patch adds a safeguard in __qede_remove() flow to fix this,
      so that driver doesn't attempt to remove "already removed" devices.
      
      [  194.360134] BUG: unable to handle kernel NULL pointer dereference at 00000000000008dc
      [  194.360227] IP: [<ffffffffc03553c4>] __qede_remove+0x24/0x130 [qede]
      [  194.360304] PGD 0
      [  194.360325] Oops: 0000 [#1] SMP
      [  194.360360] Modules linked in: tcp_lp fuse tun bridge stp llc devlink bonding ip_set nfnetlink ib_isert iscsi_target_mod ib_srpt target_core_mod ib_srp scsi_transport_srp scsi_tgt ib_ipoib ib_umad rpcrdma sunrpc rdma_ucm ib_uverbs ib_iser rdma_cm iw_cm ib_cm libiscsi scsi_transport_iscsi dell_smbios iTCO_wdt iTCO_vendor_support dell_wmi_descriptor dcdbas vfat fat pcc_cpufreq skx_edac intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel kvm irqbypass crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd qedr ib_core pcspkr ses enclosure joydev ipmi_ssif sg i2c_i801 lpc_ich mei_me mei wmi ipmi_si ipmi_devintf ipmi_msghandler tpm_crb acpi_pad acpi_power_meter xfs libcrc32c sd_mod crc_t10dif crct10dif_generic crct10dif_pclmul crct10dif_common crc32c_intel mgag200
      [  194.361044]  qede i2c_algo_bit drm_kms_helper qed syscopyarea sysfillrect nvme sysimgblt fb_sys_fops ttm nvme_core mpt3sas crc8 ptp drm pps_core ahci raid_class scsi_transport_sas libahci libata drm_panel_orientation_quirks nfit libnvdimm dm_mirror dm_region_hash dm_log dm_mod [last unloaded: ip_tables]
      [  194.361297] CPU: 51 PID: 7996 Comm: reboot Kdump: loaded Not tainted 3.10.0-1062.el7.x86_64 #1
      [  194.361359] Hardware name: Dell Inc. PowerEdge MX840c/0740HW, BIOS 2.4.6 10/15/2019
      [  194.361412] task: ffff9cea9b360000 ti: ffff9ceabebdc000 task.ti: ffff9ceabebdc000
      [  194.361463] RIP: 0010:[<ffffffffc03553c4>]  [<ffffffffc03553c4>] __qede_remove+0x24/0x130 [qede]
      [  194.361534] RSP: 0018:ffff9ceabebdfac0  EFLAGS: 00010282
      [  194.361570] RAX: 0000000000000000 RBX: ffff9cd013846098 RCX: 0000000000000000
      [  194.361621] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff9cd013846098
      [  194.361668] RBP: ffff9ceabebdfae8 R08: 0000000000000000 R09: 0000000000000000
      [  194.361715] R10: 00000000bfe14201 R11: ffff9ceabfe141e0 R12: 0000000000000000
      [  194.361762] R13: ffff9cd013846098 R14: 0000000000000000 R15: ffff9ceab5e48000
      [  194.361810] FS:  00007f799c02d880(0000) GS:ffff9ceacb0c0000(0000) knlGS:0000000000000000
      [  194.361865] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  194.361903] CR2: 00000000000008dc CR3: 0000001bdac76000 CR4: 00000000007607e0
      [  194.361953] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [  194.362002] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [  194.362051] PKRU: 55555554
      [  194.362073] Call Trace:
      [  194.362109]  [<ffffffffc0355500>] qede_remove+0x10/0x20 [qede]
      [  194.362180]  [<ffffffffb97d0f3e>] pci_device_remove+0x3e/0xc0
      [  194.362240]  [<ffffffffb98b3c52>] __device_release_driver+0x82/0xf0
      [  194.362285]  [<ffffffffb98b3ce3>] device_release_driver+0x23/0x30
      [  194.362343]  [<ffffffffb97c86d4>] pci_stop_bus_device+0x84/0xa0
      [  194.362388]  [<ffffffffb97c87e2>] pci_stop_and_remove_bus_device+0x12/0x20
      [  194.362450]  [<ffffffffb97f153f>] pci_iov_remove_virtfn+0xaf/0x160
      [  194.362496]  [<ffffffffb97f1aec>] sriov_disable+0x3c/0xf0
      [  194.362534]  [<ffffffffb97f1bc3>] pci_disable_sriov+0x23/0x30
      [  194.362599]  [<ffffffffc02f83c3>] qed_sriov_disable+0x5e3/0x650 [qed]
      [  194.362658]  [<ffffffffb9622df6>] ? kfree+0x106/0x140
      [  194.362709]  [<ffffffffc02cc0c0>] ? qed_free_stream_mem+0x70/0x90 [qed]
      [  194.362754]  [<ffffffffb9622df6>] ? kfree+0x106/0x140
      [  194.362803]  [<ffffffffc02cd659>] qed_slowpath_stop+0x1a9/0x1d0 [qed]
      [  194.362854]  [<ffffffffc035544e>] __qede_remove+0xae/0x130 [qede]
      [  194.362904]  [<ffffffffc03554e0>] qede_shutdown+0x10/0x20 [qede]
      [  194.362956]  [<ffffffffb97cf90a>] pci_device_shutdown+0x3a/0x60
      [  194.363010]  [<ffffffffb98b180b>] device_shutdown+0xfb/0x1f0
      [  194.363066]  [<ffffffffb94b66c6>] kernel_restart_prepare+0x36/0x40
      [  194.363107]  [<ffffffffb94b66e2>] kernel_restart+0x12/0x60
      [  194.363146]  [<ffffffffb94b6959>] SYSC_reboot+0x229/0x260
      [  194.363196]  [<ffffffffb95f200d>] ? handle_mm_fault+0x39d/0x9b0
      [  194.363253]  [<ffffffffb942b621>] ? __switch_to+0x151/0x580
      [  194.363304]  [<ffffffffb9b7ec28>] ? __schedule+0x448/0x9c0
      [  194.363343]  [<ffffffffb94b69fe>] SyS_reboot+0xe/0x10
      [  194.363387]  [<ffffffffb9b8bede>] system_call_fastpath+0x25/0x2a
      [  194.363430] Code: f9 e9 37 ff ff ff 90 0f 1f 44 00 00 55 48 89 e5 41 57 41 56 41 55 4c 8d af 98 00 00 00 41 54 4c 89 ef 41 89 f4 53 e8 4c e4 55 f9 <80> b8 dc 08 00 00 01 48 89 c3 4c 8d b8 c0 08 00 00 4c 8b b0 c0
      [  194.363712] RIP  [<ffffffffc03553c4>] __qede_remove+0x24/0x130 [qede]
      [  194.363764]  RSP <ffff9ceabebdfac0>
      [  194.363791] CR2: 00000000000008dc
      Signed-off-by: default avatarManish Chopra <manishc@marvell.com>
      Signed-off-by: default avatarAriel Elior <aelior@marvell.com>
      Signed-off-by: default avatarSudarsana Kalluru <skalluru@marvell.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      deabc871
    • Eric Dumazet's avatar
      net: fix data-race in neigh_event_send() · 1b53d644
      Eric Dumazet authored
      KCSAN reported the following data-race [1]
      
      The fix will also prevent the compiler from optimizing out
      the condition.
      
      [1]
      
      BUG: KCSAN: data-race in neigh_resolve_output / neigh_resolve_output
      
      write to 0xffff8880a41dba78 of 8 bytes by interrupt on cpu 1:
       neigh_event_send include/net/neighbour.h:443 [inline]
       neigh_resolve_output+0x78/0x480 net/core/neighbour.c:1474
       neigh_output include/net/neighbour.h:511 [inline]
       ip_finish_output2+0x4af/0xe40 net/ipv4/ip_output.c:228
       __ip_finish_output net/ipv4/ip_output.c:308 [inline]
       __ip_finish_output+0x23a/0x490 net/ipv4/ip_output.c:290
       ip_finish_output+0x41/0x160 net/ipv4/ip_output.c:318
       NF_HOOK_COND include/linux/netfilter.h:294 [inline]
       ip_output+0xdf/0x210 net/ipv4/ip_output.c:432
       dst_output include/net/dst.h:436 [inline]
       ip_local_out+0x74/0x90 net/ipv4/ip_output.c:125
       __ip_queue_xmit+0x3a8/0xa40 net/ipv4/ip_output.c:532
       ip_queue_xmit+0x45/0x60 include/net/ip.h:237
       __tcp_transmit_skb+0xe81/0x1d60 net/ipv4/tcp_output.c:1169
       tcp_transmit_skb net/ipv4/tcp_output.c:1185 [inline]
       __tcp_retransmit_skb+0x4bd/0x15f0 net/ipv4/tcp_output.c:2976
       tcp_retransmit_skb+0x36/0x1a0 net/ipv4/tcp_output.c:2999
       tcp_retransmit_timer+0x719/0x16d0 net/ipv4/tcp_timer.c:515
       tcp_write_timer_handler+0x42d/0x510 net/ipv4/tcp_timer.c:598
       tcp_write_timer+0xd1/0xf0 net/ipv4/tcp_timer.c:618
      
      read to 0xffff8880a41dba78 of 8 bytes by interrupt on cpu 0:
       neigh_event_send include/net/neighbour.h:442 [inline]
       neigh_resolve_output+0x57/0x480 net/core/neighbour.c:1474
       neigh_output include/net/neighbour.h:511 [inline]
       ip_finish_output2+0x4af/0xe40 net/ipv4/ip_output.c:228
       __ip_finish_output net/ipv4/ip_output.c:308 [inline]
       __ip_finish_output+0x23a/0x490 net/ipv4/ip_output.c:290
       ip_finish_output+0x41/0x160 net/ipv4/ip_output.c:318
       NF_HOOK_COND include/linux/netfilter.h:294 [inline]
       ip_output+0xdf/0x210 net/ipv4/ip_output.c:432
       dst_output include/net/dst.h:436 [inline]
       ip_local_out+0x74/0x90 net/ipv4/ip_output.c:125
       __ip_queue_xmit+0x3a8/0xa40 net/ipv4/ip_output.c:532
       ip_queue_xmit+0x45/0x60 include/net/ip.h:237
       __tcp_transmit_skb+0xe81/0x1d60 net/ipv4/tcp_output.c:1169
       tcp_transmit_skb net/ipv4/tcp_output.c:1185 [inline]
       __tcp_retransmit_skb+0x4bd/0x15f0 net/ipv4/tcp_output.c:2976
       tcp_retransmit_skb+0x36/0x1a0 net/ipv4/tcp_output.c:2999
       tcp_retransmit_timer+0x719/0x16d0 net/ipv4/tcp_timer.c:515
       tcp_write_timer_handler+0x42d/0x510 net/ipv4/tcp_timer.c:598
      
      Reported by Kernel Concurrency Sanitizer on:
      CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.4.0-rc3+ #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarsyzbot <syzkaller@googlegroups.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      1b53d644
    • Linus Torvalds's avatar
      Merge tag 'pwm/for-5.4-rc7' of... · abf6c397
      Linus Torvalds authored
      Merge tag 'pwm/for-5.4-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm
      
      Pull pwm fix from Thierry Reding:
       "One more fix to keep a reference to the driver's module as long as
        there are users of the PWM exposed by the driver"
      
      * tag 'pwm/for-5.4-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/thierry.reding/linux-pwm:
        pwm: bcm-iproc: Prevent unloading the driver module while in use
      abf6c397
    • Peter Zijlstra's avatar
      sched: Fix pick_next_task() vs 'change' pattern race · 6e2df058
      Peter Zijlstra authored
      Commit 67692435 ("sched: Rework pick_next_task() slow-path")
      inadvertly introduced a race because it changed a previously
      unexplored dependency between dropping the rq->lock and
      sched_class::put_prev_task().
      
      The comments about dropping rq->lock, in for example
      newidle_balance(), only mentions the task being current and ->on_cpu
      being set. But when we look at the 'change' pattern (in for example
      sched_setnuma()):
      
      	queued = task_on_rq_queued(p); /* p->on_rq == TASK_ON_RQ_QUEUED */
      	running = task_current(rq, p); /* rq->curr == p */
      
      	if (queued)
      		dequeue_task(...);
      	if (running)
      		put_prev_task(...);
      
      	/* change task properties */
      
      	if (queued)
      		enqueue_task(...);
      	if (running)
      		set_next_task(...);
      
      It becomes obvious that if we do this after put_prev_task() has
      already been called on @p, things go sideways. This is exactly what
      the commit in question allows to happen when it does:
      
      	prev->sched_class->put_prev_task(rq, prev, rf);
      	if (!rq->nr_running)
      		newidle_balance(rq, rf);
      
      The newidle_balance() call will drop rq->lock after we've called
      put_prev_task() and that allows the above 'change' pattern to
      interleave and mess up the state.
      
      Furthermore, it turns out we lost the RT-pull when we put the last DL
      task.
      
      Fix both problems by extracting the balancing from put_prev_task() and
      doing a multi-class balance() pass before put_prev_task().
      
      Fixes: 67692435 ("sched: Rework pick_next_task() slow-path")
      Reported-by: default avatarQuentin Perret <qperret@google.com>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Tested-by: default avatarQuentin Perret <qperret@google.com>
      Tested-by: default avatarValentin Schneider <valentin.schneider@arm.com>
      6e2df058
    • Qais Yousef's avatar
      sched/core: Fix compilation error when cgroup not selected · e3b8b6a0
      Qais Yousef authored
      When cgroup is disabled the following compilation error was hit
      
      	kernel/sched/core.c: In function ‘uclamp_update_active_tasks’:
      	kernel/sched/core.c:1081:23: error: storage size of ‘it’ isn’t known
      	  struct css_task_iter it;
      			       ^~
      	kernel/sched/core.c:1084:2: error: implicit declaration of function ‘css_task_iter_start’; did you mean ‘__sg_page_iter_start’? [-Werror=implicit-function-declaration]
      	  css_task_iter_start(css, 0, &it);
      	  ^~~~~~~~~~~~~~~~~~~
      	  __sg_page_iter_start
      	kernel/sched/core.c:1085:14: error: implicit declaration of function ‘css_task_iter_next’; did you mean ‘__sg_page_iter_next’? [-Werror=implicit-function-declaration]
      	  while ((p = css_task_iter_next(&it))) {
      		      ^~~~~~~~~~~~~~~~~~
      		      __sg_page_iter_next
      	kernel/sched/core.c:1091:2: error: implicit declaration of function ‘css_task_iter_end’; did you mean ‘get_task_cred’? [-Werror=implicit-function-declaration]
      	  css_task_iter_end(&it);
      	  ^~~~~~~~~~~~~~~~~
      	  get_task_cred
      	kernel/sched/core.c:1081:23: warning: unused variable ‘it’ [-Wunused-variable]
      	  struct css_task_iter it;
      			       ^~
      	cc1: some warnings being treated as errors
      	make[2]: *** [kernel/sched/core.o] Error 1
      
      Fix by protetion uclamp_update_active_tasks() with
      CONFIG_UCLAMP_TASK_GROUP
      
      Fixes: babbe170 ("sched/uclamp: Update CPU's refcount on TG's clamp changes")
      Reported-by: default avatarRandy Dunlap <rdunlap@infradead.org>
      Signed-off-by: default avatarQais Yousef <qais.yousef@arm.com>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Tested-by: default avatarRandy Dunlap <rdunlap@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Vincent Guittot <vincent.guittot@linaro.org>
      Cc: Patrick Bellasi <patrick.bellasi@matbug.net>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
      Cc: Juri Lelli <juri.lelli@redhat.com>
      Cc: Ben Segall <bsegall@google.com>
      Link: https://lkml.kernel.org/r/20191105112212.596-1-qais.yousef@arm.com
      e3b8b6a0
    • Tejun Heo's avatar
      cgroup,writeback: don't switch wbs immediately on dead wbs if the memcg is dead · 65de03e2
      Tejun Heo authored
      cgroup writeback tries to refresh the associated wb immediately if the
      current wb is dead.  This is to avoid keeping issuing IOs on the stale
      wb after memcg - blkcg association has changed (ie. when blkcg got
      disabled / enabled higher up in the hierarchy).
      
      Unfortunately, the logic gets triggered spuriously on inodes which are
      associated with dead cgroups.  When the logic is triggered on dead
      cgroups, the attempt fails only after doing quite a bit of work
      allocating and initializing a new wb.
      
      While c3aab9a0 ("mm/filemap.c: don't initiate writeback if mapping
      has no dirty pages") alleviated the issue significantly as it now only
      triggers when the inode has dirty pages.  However, the condition can
      still be triggered before the inode is switched to a different cgroup
      and the logic simply doesn't make sense.
      
      Skip the immediate switching if the associated memcg is dying.
      
      This is a simplified version of the following two patches:
      
       * https://lore.kernel.org/linux-mm/20190513183053.GA73423@dennisz-mbp/
       * http://lkml.kernel.org/r/156355839560.2063.5265687291430814589.stgit@buzz
      
      Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
      Fixes: e8a7abf5 ("writeback: disassociate inodes from dying bdi_writebacks")
      Acked-by: default avatarDennis Zhou <dennis@kernel.org>
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      65de03e2