  1. 21 May, 2022 2 commits
    • perf regs x86: Fix arch__intr_reg_mask() for the hybrid platform · 01b28e4a
      Kan Liang authored
      The x86-specific arch__intr_reg_mask() checks whether the kernel
      and hardware can collect XMM registers, but it doesn't work on some
      hybrid platforms.
      
      Without the patch on ADL-N:
      
        $ perf record -I?
        available registers: AX BX CX DX SI DI BP SP IP FLAGS CS SS R8 R9 R10
        R11 R12 R13 R14 R15
      
      The config of the test event doesn't contain the PMU information. The
      kernel may fail to initialize it on the correct hybrid PMU and wrongly
      report that the XMM registers are not supported.
      
      Add the PMU information into the config for the hybrid platform. The
      same register set is supported among different hybrid PMUs. Checking
      the first available one is good enough.
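
      A minimal sketch of the idea, not the patch itself: on hybrid systems
      a hardware event can name its PMU by carrying the PMU type in the
      upper 32 bits of the config. The two constants mirror
      PERF_PMU_TYPE_SHIFT and PERF_HW_EVENT_MASK from the perf uapi header
      but are redefined locally; hybrid_config(), config_pmu_type() and the
      PMU type value used in the note below are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-ins for the uapi constants, so the sketch is self-contained. */
#define PMU_TYPE_SHIFT 32
#define HW_EVENT_MASK  0xffffffffULL

/* Hypothetical helper: tag a hardware event id with the type of one hybrid PMU. */
uint64_t hybrid_config(uint64_t hw_event, uint32_t pmu_type)
{
	return (hw_event & HW_EVENT_MASK) | ((uint64_t)pmu_type << PMU_TYPE_SHIFT);
}

/* Recover the PMU type from a config; 0 means no PMU was specified. */
uint32_t config_pmu_type(uint64_t config)
{
	return (uint32_t)(config >> PMU_TYPE_SHIFT);
}
```

      With a made-up PMU type of 8, hybrid_config(0, 8) yields 0x800000000,
      while an untagged config reports PMU type 0.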
      
      With the patch on ADL-N:
      
        $ perf record -I?
        available registers: AX BX CX DX SI DI BP SP IP FLAGS CS SS R8 R9 R10
        R11 R12 R13 R14 R15 XMM0 XMM1 XMM2 XMM3 XMM4 XMM5 XMM6 XMM7 XMM8 XMM9
        XMM10 XMM11 XMM12 XMM13 XMM14 XMM15
      
      Fixes: 6466ec14 ("perf regs x86: Add X86 specific arch__intr_reg_mask()")
      Reported-by: Ammy Yi <ammy.yi@intel.com>
      Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
      Acked-by: Ian Rogers <irogers@google.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
      Link: https://lore.kernel.org/r/20220518145125.1494156-1-kan.liang@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf test: Fix "all PMU test" to skip hv_24x7/hv_gpci tests on powerpc · 451ed805
      Athira Rajeev authored
      "perf all PMU test" picks the input events from "perf list --raw-dump
      pmu" list and runs "perf stat -e" for each event in the list. On
      powerpc, the PowerVM environment supports events from the hv_24x7 and
      hv_gpci PMUs, which have a format like the examples below:
      
      - hv_24x7/CPM_ADJUNCT_INST,domain=?,core=?/
      - hv_gpci/event,partition_id=?/
      
      The value for "?" needs to be filled in depending on the system and the
      respective event. CPM_ADJUNCT_INST needs a core value and a domain
      value. The hv_gpci event needs a partition_id. Similarly, there are other
      events for hv_24x7 and hv_gpci that have "?" in the event format. Hence
      skip these events on the powerpc platform, since values like partition_id
      and domain are specific to the system and event.
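
      A minimal sketch of such a filter, written in C for illustration only
      (the test itself is a shell script, and should_skip_event() is a
      hypothetical name). It assumes events are skipped by PMU name rather
      than by looking for the "?" placeholder.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* True if the event belongs to one of the two PowerVM PMUs named above. */
bool should_skip_event(const char *event)
{
	/* Both prefixes are 8 characters long, including the trailing '/'. */
	return strncmp(event, "hv_24x7/", 8) == 0 ||
	       strncmp(event, "hv_gpci/", 8) == 0;
}
```

      An event such as "hv_gpci/event,partition_id=?/" is skipped, while an
      ordinary event such as "cpu/instructions/" is still run.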
      
      Fixes: 3d5ac9ef ("perf test: Workload test of all PMUs")
      Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
      Acked-by: Ian Rogers <irogers@google.com>
      Cc: Disha Goel <disgoel@linux.vnet.ibm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kajol Jain <kjain@linux.ibm.com>
      Cc: linuxppc-dev@lists.ozlabs.org
      Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Nageswara R Sastry <rnsastry@linux.ibm.com>
      Link: https://lore.kernel.org/r/20220520101236.17249-1-atrajeev@linux.vnet.ibm.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
  2. 20 May, 2022 3 commits
    • perf stat: Fix and validate CPU map inputs in synthetic PERF_RECORD_STAT events · 92d579ea
      Ian Rogers authored
      Stat events can come from disk and so need a degree of validation. They
      contain a CPU which needs looking up via CPU map to access a counter.
      
      Add the CPU to index translation, alongside validity checking.
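
      A minimal sketch of a CPU-to-index translation with validity
      checking, assuming a simplified CPU map that is just a list of CPU
      numbers. struct cpu_map, cpu_map_index() and read_counter() are
      hypothetical and do not reflect perf's real data structures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified CPU map: a list of CPU numbers. */
struct cpu_map {
	size_t nr;
	const int *cpus;
};

/* Return the index of cpu in the map, or -1 if the CPU is not present. */
int cpu_map_index(const struct cpu_map *map, int cpu)
{
	for (size_t i = 0; i < map->nr; i++) {
		if (map->cpus[i] == cpu)
			return (int)i;
	}
	return -1;
}

/* Look up a counter by CPU number, rejecting CPUs the map does not contain. */
int read_counter(const struct cpu_map *map, const long *counts, int cpu, long *out)
{
	int idx = cpu_map_index(map, cpu);

	if (idx < 0)
		return -1;	/* the event named a CPU that is not in the map */
	*out = counts[idx];
	return 0;
}
```

      The counter array is indexed by map position, never by the raw CPU
      number read from the event, so a bogus CPU value cannot index out of
      bounds.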
      
      Discussion thread:
      
        https://lore.kernel.org/linux-perf-users/CAP-5=fWQR=sCuiSMktvUtcbOLidEpUJLCybVF6=BRvORcDOq+g@mail.gmail.com/
      
      Fixes: 7ac0089d ("perf evsel: Pass cpu not cpu map index to synthesize")
      Reported-by: Michael Petlan <mpetlan@redhat.com>
      Suggested-by: Michael Petlan <mpetlan@redhat.com>
      Signed-off-by: Ian Rogers <irogers@google.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Andrii Nakryiko <andrii@kernel.org>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Dave Marchevsky <davemarchevsky@fb.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: James Clark <james.clark@arm.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: John Fastabend <john.fastabend@gmail.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: KP Singh <kpsingh@kernel.org>
      Cc: Lv Ruyi <lv.ruyi@zte.com.cn>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Martin KaFai Lau <kafai@fb.com>
      Cc: Michael Petlan <mpetlan@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: netdev@vger.kernel.org
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Quentin Monnet <quentin@isovalent.com>
      Cc: Song Liu <songliubraving@fb.com>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
      Cc: Yonghong Song <yhs@fb.com>
      Link: http://lore.kernel.org/lkml/20220519032005.1273691-2-irogers@google.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • perf build: Fix check for btf__load_from_kernel_by_id() in libbpf · 0ae065a5
      Arnaldo Carvalho de Melo authored
      Avi Kivity reported a problem where the __weak
      btf__load_from_kernel_by_id() in tools/perf/util/bpf-event.c was being
      used and it called btf__get_from_id() in tools/lib/bpf/btf.c that in
      turn called back to btf__load_from_kernel_by_id(), resulting in an
      endless loop.
      
      Fix this by adding a feature test to check if
      btf__load_from_kernel_by_id() is available when building perf with
      LIBBPF_DYNAMIC=1, and if not then provide the fallback to the old
      btf__get_from_id(), that doesn't call back to btf__load_from_kernel_by_id()
      since at that time it didn't exist at all.
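
      A minimal sketch of a fallback guarded by a build-time feature test,
      using hypothetical names throughout: HAVE_NEW_LOOKUP stands in for
      the macro a feature test would define, and legacy_lookup() and
      new_lookup() stand in for the old and new library entry points.

```c
#include <assert.h>

/* Hypothetical stand-in for the older library entry point. */
int legacy_lookup(unsigned int id, int *out)
{
	*out = (int)id + 1;
	return 0;	/* 0 on success */
}

/*
 * HAVE_NEW_LOOKUP is a hypothetical macro that a build-time feature test
 * would define when the library already exports new_lookup(). The local
 * version is compiled only when the macro is absent, and it is built on
 * the older call alone, so the two cannot end up calling each other.
 */
#ifndef HAVE_NEW_LOOKUP
int new_lookup(unsigned int id)
{
	int value;

	if (legacy_lookup(id, &value))
		return -1;
	return value;
}
#endif
```

      Deciding at build time, instead of relying on a weak symbol at link
      time, means the fallback exists only in builds where the library
      lacks the newer function.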
      
      Tested on Fedora 35 with libbpf-devel 0.4.0 and LIBBPF_DYNAMIC, where
      btf__load_from_kernel_by_id() is not available and thus its feature
      test fails, leaving HAVE_LIBBPF_BTF__LOAD_FROM_KERNEL_BY_ID undefined:
      
        $ cat /tmp/build/perf-urgent/feature/test-libbpf-btf__load_from_kernel_by_id.make.output
        test-libbpf-btf__load_from_kernel_by_id.c: In function ‘main’:
        test-libbpf-btf__load_from_kernel_by_id.c:6:16: error: implicit declaration of function ‘btf__load_from_kernel_by_id’ [-Werror=implicit-function-declaration]
            6 |         return btf__load_from_kernel_by_id(20151128, NULL);
              |                ^~~~~~~~~~~~~~~~~~~~~~~~~~~
        cc1: all warnings being treated as errors
        $
      
        $ nm /tmp/build/perf-urgent/perf | grep btf__load_from_kernel_by_id
        00000000005ba180 T btf__load_from_kernel_by_id
        $
      
        $ objdump --disassemble=btf__load_from_kernel_by_id -S /tmp/build/perf-urgent/perf
      
        /tmp/build/perf-urgent/perf:     file format elf64-x86-64
        <SNIP>
        00000000005ba180 <btf__load_from_kernel_by_id>:
        #include "record.h"
        #include "util/synthetic-events.h"
      
        #ifndef HAVE_LIBBPF_BTF__LOAD_FROM_KERNEL_BY_ID
        struct btf *btf__load_from_kernel_by_id(__u32 id)
        {
          5ba180:	55                   	push   %rbp
          5ba181:	48 89 e5             	mov    %rsp,%rbp
          5ba184:	48 83 ec 10          	sub    $0x10,%rsp
          5ba188:	64 48 8b 04 25 28 00 	mov    %fs:0x28,%rax
          5ba18f:	00 00
          5ba191:	48 89 45 f8          	mov    %rax,-0x8(%rbp)
          5ba195:	31 c0                	xor    %eax,%eax
               struct btf *btf;
        #pragma GCC diagnostic push
        #pragma GCC diagnostic ignored "-Wdeprecated-declarations"
               int err = btf__get_from_id(id, &btf);
          5ba197:	48 8d 75 f0          	lea    -0x10(%rbp),%rsi
          5ba19b:	e8 a0 57 e5 ff       	call   40f940 <btf__get_from_id@plt>
          5ba1a0:	89 c2                	mov    %eax,%edx
        #pragma GCC diagnostic pop
      
               return err ? ERR_PTR(err) : btf;
          5ba1a2:	48 98                	cltq
          5ba1a4:	85 d2                	test   %edx,%edx
          5ba1a6:	48 0f 44 45 f0       	cmove  -0x10(%rbp),%rax
        }
        <SNIP>
      
      Fixes: 218e7b77 ("perf bpf: Provide a weak btf__load_from_kernel_by_id() for older libbpf versions")
      Reported-by: Avi Kivity <avi@scylladb.com>
      Link: https://lore.kernel.org/linux-perf-users/f0add43b-3de5-20c5-22c4-70aff4af959f@scylladb.com
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Ian Rogers <irogers@google.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lore.kernel.org/linux-perf-users/YobjjFOblY4Xvwo7@kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
    • Merge tag 'v5.18-p2' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 · 3d7285a3
      Linus Torvalds authored
      Pull crypto fix from Herbert Xu:
       "Fix a regression in a recent fix to qcom-rng"
      
      * tag 'v5.18-p2' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
        crypto: qcom-rng - fix infinite loop on requests not multiple of WORD_SZ
  3. 19 May, 2022 14 commits
    • Merge tag 'for-5.18/parisc-4' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux · b015dcd6
      Linus Torvalds authored
      Pull parisc architecture fixes from Helge Deller:
       "We had two big outstanding issues after v5.18-rc6:
      
         a) 32-bit kernels on 64-bit machines (e.g. on a C3700 which is able
            to run 32- and 64-bit kernels) failed early in userspace.
      
         b) 64-bit kernels on PA8800/PA8900 CPUs (e.g. in a C8000) showed
            random userspace segfaults. We assumed that those problems were
            caused by the tmpalias flushes.
      
        Dave did a lot of testing and reorganization of the current flush code
        and fixed the 32-bit cache flushing. For PA8800/PA8900 CPUs he
        switched the code to flush using the virtual address of user and
        kernel pages instead of using tmpalias flushes. The tmpalias flushes
        don't seem to work reliably on such CPUs.
      
        We tested the patches on a wide range of machines (715/64, B160L, C3000,
        C3700, C8000, rp3440) and they have been in for-next without any
        conflicts.
      
        Summary:
      
         - Rewrite the cache flush code for PA8800/PA8900 CPUs to flush using
           the virtual address of user and kernel pages instead of using
           tmpalias flushes. Testing showed that tmpalias flushes don't work
           reliably on PA8800/PA8900 CPUs
      
         - Fix flush code to allow 32-bit kernels to run on 64-bit capable
           machines, e.g. a 32-bit kernel on C3700 machines"
      
      * tag 'for-5.18/parisc-4' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
        parisc: Fix patch code locking and flushing
        parisc: Rewrite cache flush code for PA8800/PA8900
        parisc: Disable debug code regarding cache flushes in handle_nadtlb_fault()
    • Merge tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm · 99b05644
      Linus Torvalds authored
      Pull ARM fixes from Russell King:
       "Two further fixes for Spectre-BHB from Ard for Cortex A15 and to use
        the wide branch instruction for Thumb2"
      
      * tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm:
        ARM: 9197/1: spectre-bhb: fix loop8 sequence for Thumb2
        ARM: 9196/1: spectre-bhb: enable for Cortex-A15
    • Merge tag 'pinctrl-v5.18-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl · 18e471dd
      Linus Torvalds authored
      Pull pin control fixes from Linus Walleij:
      
       - Fix an altmode in the Ocelot driver
      
       - Fix the IES control pins in the Mediatek MT8365 driver
      
       - Sunxi (Allwinner) driver:
          - Fix the UART2 function pin assignments
          - Fix the signal name of the PA2 SPI pin
      
      * tag 'pinctrl-v5.18-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
        pinctrl: sunxi: f1c100s: Fix signal name comment for PA2 SPI pin
        pinctrl: sunxi: fix f1c100s uart2 function
        pinctrl: mediatek: mt8365: fix IES control pins
        pinctrl: ocelot: Fix for lan966x alt mode
    • Merge tag 'net-5.18-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · d904c8cc
      Linus Torvalds authored
      Pull networking fixes from Paolo Abeni:
       "Including fixes from can, xfrm and netfilter subtrees.
      
        Notably this reverts a recent TCP/DCCP netns-related change to address
        a possible UaF.
      
        Current release - regressions:
      
         - tcp: revert "tcp/dccp: get rid of inet_twsk_purge()"
      
         - xfrm: set dst dev to blackhole_netdev instead of loopback_dev in
           ifdown
      
        Previous releases - regressions:
      
         - netfilter: flowtable: fix TCP flow teardown
      
         - can: revert "can: m_can: pci: use custom bit timings for Elkhart
           Lake"
      
         - xfrm: check encryption module availability consistency
      
         - eth: vmxnet3: fix possible use-after-free bugs in
           vmxnet3_rq_alloc_rx_buf()
      
         - eth: mlx5: initialize flow steering during driver probe
      
         - eth: ice: fix crash when writing timestamp on RX rings
      
        Previous releases - always broken:
      
         - mptcp: fix checksum byte order
      
         - eth: lan966x: fix assignment of the MAC address
      
         - eth: mlx5: remove HW-GRO from reported features
      
         - eth: ftgmac100: disable hardware checksum on AST2600"
      
      * tag 'net-5.18-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (50 commits)
        net: bridge: Clear offload_fwd_mark when passing frame up bridge interface.
        ptp: ocp: change sysfs attr group handling
        selftests: forwarding: fix missing backslash
        netfilter: nf_tables: disable expression reduction infra
        netfilter: flowtable: move dst_check to packet path
        netfilter: flowtable: fix TCP flow teardown
        net: ftgmac100: Disable hardware checksum on AST2600
        igb: skip phy status check where unavailable
        nfc: pn533: Fix buggy cleanup order
        mptcp: Do TCP fallback on early DSS checksum failure
        mptcp: fix checksum byte order
        net: af_key: check encryption module availability consistency
        net: af_key: add check for pfkey_broadcast in function pfkey_process
        net/mlx5: Drain fw_reset when removing device
        net/mlx5e: CT: Fix setting flow_source for smfs ct tuples
        net/mlx5e: CT: Fix support for GRE tuples
        net/mlx5e: Remove HW-GRO from reported features
        net/mlx5e: Properly block HW GRO when XDP is enabled
        net/mlx5e: Properly block LRO when XDP is enabled
        net/mlx5e: Block rx-gro-hw feature in switchdev mode
        ...
    • net: bridge: Clear offload_fwd_mark when passing frame up bridge interface. · fbb3abdf
      Andrew Lunn authored
      It is possible to stack bridges on top of each other. Consider the
      following which makes use of an Ethernet switch:
      
             br1
           /    \
          /      \
         /        \
       br0.11    wlan0
         |
         br0
       /  |  \
      p1  p2  p3
      
      br0 is offloaded to the switch. Above br0 is a vlan interface, for
      vlan 11. This vlan interface is then a slave of br1. br1 also has a
      wireless interface as a slave. This setup trunks wireless lan traffic
      over the copper network inside a VLAN.
      
      A frame received on p1 which is passed up to the bridge has the
      skb->offload_fwd_mark flag set to true, indicating that the switch has
      dealt with forwarding the frame out ports p2 and p3 as needed. This
      flag instructs the software bridge it does not need to pass the frame
      back down again. However, the flag is not getting reset when the frame
      is passed upwards. As a result br1 sees the flag, wrongly interprets
      it, and fails to forward the frame to wlan0.
      
      When passing a frame upwards, clear the flag. This is the Rx
      equivalent of br_switchdev_frame_unmark() in br_dev_xmit().
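
      A minimal sketch of the behaviour described above, using a
      hypothetical stripped-down frame that carries only the flag in
      question. bridge_should_forward() and bridge_pass_up() are
      illustrative names, not the bridge code's functions.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, stripped-down frame: only the flag discussed above. */
struct frame {
	bool offload_fwd_mark;
};

/* A software bridge skips forwarding when the hardware has already done it. */
bool bridge_should_forward(const struct frame *f)
{
	return !f->offload_fwd_mark;
}

/* Passing a frame up out of a bridge: the mark applied to that bridge only. */
void bridge_pass_up(struct frame *f)
{
	f->offload_fwd_mark = false;
}
```

      In the stacked setup, br0 sets the mark, bridge_pass_up() clears it on
      the way to br0.11, and br1 then sees an unmarked frame that it
      forwards to wlan0.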
      
      Fixes: f1c2eddf ("bridge: switchdev: Use an helper to clear forward mark")
      Signed-off-by: Andrew Lunn <andrew@lunn.ch>
      Reviewed-by: Ido Schimmel <idosch@nvidia.com>
      Tested-by: Ido Schimmel <idosch@nvidia.com>
      Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
      Link: https://lore.kernel.org/r/20220518005840.771575-1-andrew@lunn.ch
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • ptp: ocp: change sysfs attr group handling · c2239294
      Jonathan Lemon authored
      In the detach path, the driver calls sysfs_remove_group() for the
      groups it believes have been registered.  However, if a group was
      never previously registered, then this causes a splat.
      
      Instead, compute the groups that should be registered in advance,
      and then call sysfs_create_groups(), which registers them all at once.
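
      A minimal sketch of computing the group list up front, assuming a
      hypothetical table that pairs capability bits with group names. The
      NULL-terminated result has the shape of a list that could be handed
      to a single bulk registration call; nothing here is the driver's
      actual code.

```c
#include <assert.h>
#include <stddef.h>

#define CAP_A 0x1u
#define CAP_B 0x2u
#define MAX_GROUPS 4

/* Hypothetical table pairing a capability bit with an attribute group name. */
struct group_entry {
	unsigned int cap;
	const char *name;
};

static const struct group_entry all_groups[] = {
	{ CAP_A, "group_a" },
	{ CAP_B, "group_b" },
};

/*
 * Fill out[] with the groups enabled by caps, NULL-terminated, and return
 * the count. out must have room for MAX_GROUPS pointers.
 */
size_t select_groups(unsigned int caps, const char *out[MAX_GROUPS])
{
	size_t n = 0;

	for (size_t i = 0; i < sizeof(all_groups) / sizeof(all_groups[0]); i++) {
		if (caps & all_groups[i].cap)
			out[n++] = all_groups[i].name;
	}
	out[n] = NULL;
	return n;
}
```

      Because the same list drives both registration and removal, the
      detach path never touches a group that was not registered.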
      
      Update the error handling appropriately.
      
      Fixes: c205d53c ("ptp: ocp: Add firmware capability bits for feature gating")
      Reported-by: Zheyu Ma <zheyuma97@gmail.com>
      Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
      Link: https://lore.kernel.org/r/20220517214600.10606-1-jonathan.lemon@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • selftests: forwarding: fix missing backslash · 090f9dd0
      Joachim Wiberg authored
      Fix a missing backslash introduced in f62c5acc. The missing backslash
      causes all tests to not be installed.
      
      Fixes: f62c5acc ("selftests/net/forwarding: add missing tests to Makefile")
      Signed-off-by: Joachim Wiberg <troglobit@gmail.com>
      Acked-by: Hangbin Liu <liuhangbin@gmail.com>
      Link: https://lore.kernel.org/r/20220518151630.2747773-1-troglobit@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf · 7dc02d7f
      Jakub Kicinski authored
      Pablo Neira Ayuso says:
      
      ====================
      Netfilter fixes for net
      
      1) Reduce number of hardware offload retries from flowtable datapath
         which might hog system with retries, from Felix Fietkau.
      
      2) Skip neighbour lookup for PPPoE device, fill_forward_path() already
         provides this and set on destination address from fill_forward_path for
         PPPoE device, also from Felix.
      
      4) When combining PPPoE on top of a VLAN device, set info->outdev to the
         PPPoE device so software offload works, from Felix.
      
      5) Fix TCP teardown flowtable state, races with conntrack gc might result
         in resetting the state to ESTABLISHED and the time to one day. Joint
         work with Oz Shlomo and Sven Auhagen.
      
      6) Call dst_check() from flowtable datapath to check if dst is stale
         instead of doing it from garbage collector path.
      
      7) Disable the register tracking infrastructure: either user space or
         the kernel needs to pre-fetch keys unconditionally; otherwise register
         tracking assumes data is already available in a register when it
         might well not be there, leading to incorrect reductions.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
        netfilter: nf_tables: disable expression reduction infra
        netfilter: flowtable: move dst_check to packet path
        netfilter: flowtable: fix TCP flow teardown
        netfilter: nft_flow_offload: fix offload with pppoe + vlan
        net: fix dev_fill_forward_path with pppoe + bridge
        netfilter: nft_flow_offload: skip dst neigh lookup for ppp devices
        netfilter: flowtable: fix excessive hw offload attempts after failure
      ====================
      
      Link: https://lore.kernel.org/r/20220518213841.359653-1-pablo@netfilter.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Merge tag 'block-5.18-2022-05-18' of git://git.kernel.dk/linux-block · f993aed4
      Linus Torvalds authored
      Pull block fix from Jens Axboe:
       "Just a small fix for a missing fifo time assignment for the head
        insertion case in mq-deadline"
      
      * tag 'block-5.18-2022-05-18' of git://git.kernel.dk/linux-block:
        block/mq-deadline: Set the fifo_time member also if inserting at head
    • Merge tag 'io_uring-5.18-2022-05-18' of git://git.kernel.dk/linux-block · 01464a73
      Linus Torvalds authored
      Pull io_uring fixes from Jens Axboe:
       "Two small changes fixing issues from the 5.18 merge window:
      
         - Fix wrong ordering of a tracepoint (Dylan)
      
         - Fix MSG_RING on IOPOLL rings (me)"
      
      * tag 'io_uring-5.18-2022-05-18' of git://git.kernel.dk/linux-block:
        io_uring: don't attempt to IOPOLL for MSG_RING requests
        io_uring: fix ordering of args in io_uring_queue_async_work
    • Merge tag 'audit-pr-20220518' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit · 8194a008
      Linus Torvalds authored
      Pull audit fix from Paul Moore:
       "A single audit patch to fix a problem where a task's audit_context was
        not being properly reset with io_uring"
      
      * tag 'audit-pr-20220518' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit:
        audit,io_uring,io-wq: call __audit_uring_exit for dummy contexts
    • Merge tag 'selinux-pr-20220518' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux · 6899c161
      Linus Torvalds authored
      Pull selinux fix from Paul Moore:
       "A single SELinux patch to fix an error path that was doing the wrong
        thing with respect to freeing memory"
      
      * tag 'selinux-pr-20220518' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux:
        selinux: fix bad cleanup on error in hashtab_duplicate()
    • Merge branch 'arm/fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc · 5494d0eb
      Linus Torvalds authored
      Pull ARM SoC fixes from Arnd Bergmann:
       "The SoC bug fixes have calmed down sufficiently, there is one minor
        update for the MAINTAINERS file, and a few bug fixes for dts
        descriptions:
      
         - Updates to the BananaPi R2-Pro (rk3568) dts to match production
           hardware rather than the prototype version.
      
         - Qualcomm sm8250 soundwire gets disabled on some machines to avoid
           crashes
      
         - A number of aspeed SoC specific fixes, addressing incorrect pin
           control settings, some values in the romed8hm board, and a revert
           for an accidental removal of a DT node"
      
      * 'arm/fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
        MAINTAINERS: omap: remove me as a maintainer
        ARM: dts: aspeed: Add video engine to g6
        ARM: dts: aspeed: romed8hm3: Fix GPIOB0 name
        ARM: dts: aspeed: romed8hm3: Add lm25066 sense resistor values
        ARM: dts: aspeed-g6: fix SPI1/SPI2 quad pin group
        ARM: dts: aspeed-g6: add FWQSPI group in pinctrl dtsi
        dt-bindings: pinctrl: aspeed-g6: add FWQSPI function/group
        pinctrl: pinctrl-aspeed-g6: add FWQSPI function-group
        dt-bindings: pinctrl: aspeed-g6: remove FWQSPID group
        pinctrl: pinctrl-aspeed-g6: remove FWQSPID group in pinctrl
        ARM: dts: aspeed-g6: remove FWQSPID group in pinctrl dtsi
        arm64: dts: qcom: sm8250: don't enable rx/tx macro by default
        arm64: dts: rockchip: Add gmac1 and change network settings of bpi-r2-pro
        arm64: dts: rockchip: Change io-domains of bpi-r2-pro
    • Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · dbd380bb
      Linus Torvalds authored
      Pull misc fixes from Al Viro:
       "vhost race fix and a percpu_ref_init-caused cgroup double-free fix.
      
        The latter had manifested as buggered struct mount refcounting - those
        are also using percpu data structures, but anything that does percpu
        allocations could be hit"
      
      * 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
        Fix double fget() in vhost_net_set_backend()
        percpu_ref_init(): clean ->percpu_count_ref on failure
  4. 18 May, 2022 21 commits
    • Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost · db1fd3fc
      Linus Torvalds authored
      Pull mlx5 fix from Michael Tsirkin:
       "One last minute fixup
      
        The patch has been on list for a while but as it was posted as part of
        a thread it was missed"
      
      * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
        vdpa/mlx5: Use consistent RQT size
    • Fix double fget() in vhost_net_set_backend() · fb4554c2
      Al Viro authored
      Descriptor table is a shared resource; two fget() on the same descriptor
      may return different struct file references.  get_tap_ptr_ring() is
      called after we'd found (and pinned) the socket we'll be using and it
      tries to find the private tun/tap data structures associated with it.
      Redoing the lookup by the same file descriptor we'd used to get the
      socket is racy - we need the same struct file.
      
      Thanks to Jason for spotting a braino in the original variant of patch -
      I'd missed the use of fd == -1 for disabling backend, and in that case
      we can end up with sock == NULL and sock != oldsock.
      
      Cc: stable@kernel.org
      Acked-by: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: Jason Wang <jasowang@redhat.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • vdpa/mlx5: Use consistent RQT size · acde3929
      Eli Cohen authored
      The current code evaluates RQT size based on the configured number of
      virtqueues. This can raise an issue in the following scenario:
      
      Assume MQ was negotiated.
      1. mlx5_vdpa_set_map() gets called.
      2. handle_ctrl_mq() is called setting cur_num_vqs to some value, lower
         than the configured max VQs.
      3. A second set_map gets called, but now a smaller number of VQs is used
         to evaluate the size of the RQT.
      4. handle_ctrl_mq() is called with a value larger than what the RQT can
         hold. This will emit errors and the driver state is compromised.
      
      To fix this, we use a new field in struct mlx5_vdpa_net to hold the
      required number of entries in the RQT. This value is evaluated in
      mlx5_vdpa_set_driver_features() where we have the negotiated features
      all set up.
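
      A minimal sketch of the approach, assuming a reduced device state in
      which the table size is fixed once at feature negotiation and later
      queue-pair requests are checked against it. struct net_dev and both
      functions are hypothetical names, not the driver's code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, reduced device state. */
struct net_dev {
	unsigned int rqt_size;		/* fixed once features are negotiated */
	unsigned int cur_num_qps;	/* changes with control-queue requests */
};

/* Called once when features are negotiated. */
void set_driver_features(struct net_dev *dev, bool mq, unsigned int max_qps)
{
	dev->rqt_size = mq ? max_qps : 1;
	dev->cur_num_qps = 1;
}

/* Control-queue request: accept only what the table was sized for. */
int handle_mq_request(struct net_dev *dev, unsigned int new_qps)
{
	if (new_qps == 0 || new_qps > dev->rqt_size)
		return -1;
	dev->cur_num_qps = new_qps;
	return 0;
}
```

      Lowering the number of active queue pairs never shrinks rqt_size, so
      a later request up to the negotiated maximum still fits in the table.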
      
      In addition to that, we take into consideration the max capability of RQT
      entries early when the device is added so we don't need to consider
      it when creating the RQT.
      
      Last, we remove the use of mlx5_vdpa_max_qps(), which just returns
      max_vqs / 2, and make the code clearer.
      
      Fixes: 52893733 ("vdpa/mlx5: Add multiqueue support")
      Acked-by: Jason Wang <jasowang@redhat.com>
      Signed-off-by: Eli Cohen <elic@nvidia.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
    • Merge tag 'sound-5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound · ef130216
      Linus Torvalds authored
      Pull sound fixes from Takashi Iwai:
       "A collection of last-minute HD- and USB-audio quirks in addition to a
        fix for the legacy ISA wavefront driver.
      
        All look small and easy"
      
      * tag 'sound-5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
        ALSA: usb-audio: Restore Rane SL-1 quirk
        ALSA: hda/realtek: fix right sounds and mute/micmute LEDs for HP machine
        ALSA: hda/realtek: Add quirk for TongFang devices with pop noise
        ALSA: hda/realtek: Add quirk for the Framework Laptop
        ALSA: wavefront: Proper check of get_user() error
        ALSA: hda/realtek: Add quirk for Dell Latitude 7520
        ALSA: hda - fix unused Realtek function when PM is not enabled
        ALSA: usb-audio: Don't get sample rate for MCT Trigger 5 USB-to-HDMI
    • netfilter: nf_tables: disable expression reduction infra · 9e539c5b
      Pablo Neira Ayuso authored
      Either userspace or kernelspace needs to pre-fetch keys unconditionally
      before comparisons for this to work. Otherwise, register tracking data
      is misleading and it might result in reducing expressions whose data is
      not yet in registers.
      
      The first expression is also guaranteed to always be evaluated; however,
      certain expressions break before writing data to registers, before
      comparing the data, leaving the register in an undetermined state.
      
      This patch disables this infrastructure for now.
      
      Fixes: b2d30654 ("netfilter: nf_tables: do not reduce read-only expressions")
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • netfilter: flowtable: move dst_check to packet path · 2738d9d9
      Ritaro Takenaka authored
      Fixes sporadic IPv6 packet loss when flow offloading is enabled.
      
      IPv6 route GC and flowtable GC are not synchronized.
      When dst_cache becomes stale and a packet passes through the flow before
      the flowtable GC tears it down, the packet can be dropped.
      So, it is necessary to check dst every time in packet path.
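
      A minimal sketch of checking the cached route on every packet instead
      of waiting for garbage collection. Staleness is modelled with a
      hypothetical generation counter, which is an assumption for
      illustration and not how dst_check() decides; all names are made up.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical cached route: valid only while its generation matches. */
struct cached_route {
	unsigned int gen;
};

struct flow {
	struct cached_route route;
	bool torn_down;
};

bool route_is_stale(const struct cached_route *r, unsigned int current_gen)
{
	return r->gen != current_gen;
}

/*
 * Fast path for one packet. Returns true if the packet may use the
 * offloaded flow; false sends it back to the slow path.
 */
bool flow_fast_path(struct flow *f, unsigned int current_gen)
{
	if (f->torn_down)
		return false;
	if (route_is_stale(&f->route, current_gen)) {
		f->torn_down = true;	/* tear down now; do not wait for GC */
		return false;
	}
	return true;
}
```

      The first packet that meets a stale route falls back to the slow path
      instead of being sent along a route that is no longer valid.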
      
      Fixes: 227e1e4d ("netfilter: nf_flowtable: skip device lookup from interface index")
      Signed-off-by: Ritaro Takenaka <ritarot634@gmail.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
    • netfilter: flowtable: fix TCP flow teardown · e5eaac2b
      Pablo Neira Ayuso authored
      This patch addresses three possible problems:
      
      1. ct gc may race to undo the timeout adjustment of the packet path, leaving
         the conntrack entry in place with the internal offload timeout (one day).
      
      2. ct gc removes the ct because the IPS_OFFLOAD_BIT is not set and the CLOSE
         timeout is reached before the flow offload del.
      
      3. tcp ct is always set to ESTABLISHED with a very long timeout
         in flow offload teardown/delete even though the state might be already
         CLOSED. Also as a remark we cannot assume that the FIN or RST packet
         is hitting flow table teardown as the packet might get bumped to the
         slow path in nftables.
      
      This patch resets IPS_OFFLOAD_BIT from flow_offload_teardown(), so
      conntrack handles the tcp rst/fin packet which triggers the CLOSE/FIN
      state transition.
      
      Moreover, return the connection's ownership to conntrack upon teardown
      by clearing the offload flag and fixing the established timeout value.
      The flow table GC thread will asynchronously free the flow table and
      hardware offload entries.
      
      Before this patch, the IPS_OFFLOAD_BIT remained set for expired flows,
      which is also misleading since the flow is back to classic conntrack
      path.
      
      If nf_ct_delete() removes the entry from the conntrack table, then it
      calls nf_ct_put() which decrements the refcnt. This is not a problem
      because the flowtable holds a reference to the conntrack object from
      flow_offload_alloc() path which is released via flow_offload_free().
      
      This patch also updates nft_flow_offload to skip packets in SYN_RECV
      state. Since we might miss or bump packets to slow path, we do not know
      what will happen there while we are still in SYN_RECV. This patch
      therefore postpones offload up to the next packet, which also aligns
      with the existing behaviour in tc-ct.
      
      flow_offload_teardown() does not reset the existing tcp state from
      flow_offload_fixup_tcp() to ESTABLISHED anymore; packets bumped to the
      slow path might have already updated the state to CLOSE/FIN.
      
      Joint work with Oz and Sven.
      
      Fixes: 1e5b2471 ("netfilter: nf_flow_table: teardown flow timeout race")
      Signed-off-by: Oz Shlomo <ozsh@nvidia.com>
      Signed-off-by: Sven Auhagen <sven.auhagen@voleatech.de>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      e5eaac2b
    • Joel Stanley's avatar
      net: ftgmac100: Disable hardware checksum on AST2600 · 6fd45e79
      Joel Stanley authored
      The AST2600 when using the i210 NIC over NC-SI has been observed to
      produce incorrect checksum results with specific MTU values. This was
      first observed when sending data across a long distance set of networks.
      
      On a local network, the following test was performed using a 1MB file of
      random data.
      
      On the receiver run this script:
      
       #!/bin/bash
       while true; do
              # Zero the stats
              nstat -r > /dev/null
              # Receive the file sent by the AST2600 system
              nc -l 9899 > test-file
              # Check for checksum errors
              TcpInCsumErrors=$(nstat | grep TcpInCsumErrors)
              if [ -z "$TcpInCsumErrors" ]; then
                      echo "No TcpInCsumErrors"
              else
                      echo "TcpInCsumErrors = $TcpInCsumErrors"
              fi
       done
      
      On an AST2600 system:
      
       # nc <IP of  receiver host> 9899 < test-file
      
      The test was repeated with various MTU values:
      
       # ip link set mtu 1410 dev eth0
      
      The observed results:
      
       1500 - good
       1434 - bad
       1400 - good
       1410 - bad
       1420 - good
      
      The test was repeated after disabling tx checksumming:
      
       # ethtool -K eth0 tx-checksumming off
      
      And all MTU values tested resulted in transfers without error.
      
      An issue with the driver cannot be ruled out, however there has been no
      bug discovered so far.
      
      David has done the work to take the original bug report of slow data
      transfer between long distance connections and triaged it down to this
      test case.
      
      The vendor suspects this is a hardware issue when using NC-SI. The
      fixes line refers to the patch that introduced AST2600 support.
      
      Reported-by: David Wilder <wilder@us.ibm.com>
      Reviewed-by: Dylan Hung <dylan_hung@aspeedtech.com>
      Signed-off-by: Joel Stanley <joel@jms.id.au>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      6fd45e79
    • Kevin Mitchell's avatar
      igb: skip phy status check where unavailable · 942d2ad5
      Kevin Mitchell authored
      igb_read_phy_reg() will silently return, leaving phy_data untouched, if
      hw->ops.read_reg isn't set. Depending on the uninitialized value of
      phy_data, this led to the phy status check either succeeding immediately
      or looping continuously for 2 seconds before emitting a noisy err-level
      timeout. This message went out to the console even though there was no
      actual problem.
      
      Instead, first check if there is a read_reg function pointer. If not,
      proceed without trying to check the phy status register.
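
      A minimal sketch of the guard described above, assuming simplified
      stand-in types. The names toy_hw, TOY_STATUS_OK, toy_phy_status_ok and
      the two sample accessors are hypothetical and only illustrate the
      pattern; they are not the igb driver's actual code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the driver's hardware structure. */
struct toy_hw {
	/* May be NULL when the PHY registers cannot be read on this hardware. */
	int (*read_reg)(struct toy_hw *hw, uint32_t offset, uint16_t *data);
};

#define TOY_STATUS_OK 0x1000	/* hypothetical "remote receiver OK" bit */

/* Sample accessor that always reports a good status. */
static int toy_read_ok(struct toy_hw *hw, uint32_t offset, uint16_t *data)
{
	(void)hw;
	(void)offset;
	*data = TOY_STATUS_OK;
	return 0;
}

/* Sample accessor whose status bit never comes up. */
static int toy_read_stuck(struct toy_hw *hw, uint32_t offset, uint16_t *data)
{
	(void)hw;
	(void)offset;
	*data = 0;
	return 0;
}

/* Returns true if link setup may proceed, false on a status timeout. */
static bool toy_phy_status_ok(struct toy_hw *hw, int max_tries)
{
	uint16_t phy_data = 0;

	/* No accessor: skip the check instead of polling data nobody wrote. */
	if (!hw->read_reg)
		return true;

	for (int i = 0; i < max_tries; i++) {
		if (hw->read_reg(hw, 0, &phy_data) == 0 &&
		    (phy_data & TOY_STATUS_OK))
			return true;
	}
	return false;
}
```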
      
      Fixes: b72f3f72 ("igb: When GbE link up, wait for Remote receiver status condition")
      Signed-off-by: Kevin Mitchell <kevmitch@arista.com>
      Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel)
      Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      942d2ad5
    • Lin Ma's avatar
      nfc: pn533: Fix buggy cleanup order · b8cedb70
      Lin Ma authored
      When removing the pn533 device (i2c or USB), there is a logic error. The
      original code first cancels the worker (flush_delayed_work) and then
      destroys the workqueue (destroy_workqueue), leaving the timer the last
      one to be deleted (del_timer). This results in a possible race condition
      in a multi-core preempt-able kernel. That is, if the cleanup
      (pn53x_common_clean) is concurrently run with the timer handler
      (pn533_listen_mode_timer), the timer can queue the poll_work to the
      already destroyed workqueue, causing use-after-free.
      
      This patch reorders the cleanup: it uses del_timer_sync to make sure
      the handler has finished before the routine destroys the workqueue.
      Note that the timer cannot be activated by the worker again.
      
      static void pn533_wq_poll(struct work_struct *work)
      ...
       rc = pn533_send_poll_frame(dev);
       if (rc)
         return;
      
       if (cur_mod->len == 0 && dev->poll_mod_count > 1)
         mod_timer(&dev->listen_timer, ...);
      
      That is, mod_timer can be called only when pn533_send_poll_frame()
      returns no error, which is impossible because the device is detaching
      and the lower driver should return the ENODEV code.
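
      The ordering problem can be illustrated with a deterministic,
      single-threaded toy model. It is not kernel code: toy_dev and the
      toy_* helpers are hypothetical, and the explicit toy_timer_fire() call
      stands in for a timer handler racing on another CPU.

```c
#include <stdbool.h>

/* Toy model only; none of these names exist in the pn533 driver. */
struct toy_dev {
	bool timer_armed;     /* the timer may still fire */
	bool wq_alive;        /* the workqueue has not been destroyed */
	bool use_after_free;  /* work was queued on a destroyed workqueue */
};

/* Timer handler: queues the poll work on the workqueue. */
static void toy_timer_fire(struct toy_dev *d)
{
	if (!d->timer_armed)
		return;
	if (!d->wq_alive)
		d->use_after_free = true;
}

static void toy_del_timer_sync(struct toy_dev *d)
{
	d->timer_armed = false;
}

static void toy_destroy_workqueue(struct toy_dev *d)
{
	d->wq_alive = false;
}

/* Old order: the workqueue goes away while the timer can still fire. */
static bool toy_cleanup_old_order(struct toy_dev *d)
{
	toy_destroy_workqueue(d);
	toy_timer_fire(d);        /* racing handler hits the freed workqueue */
	toy_del_timer_sync(d);
	return d->use_after_free;
}

/* New order: the timer is stopped first, so a late fire is a no-op. */
static bool toy_cleanup_new_order(struct toy_dev *d)
{
	toy_del_timer_sync(d);
	toy_destroy_workqueue(d);
	toy_timer_fire(d);        /* nothing happens: the timer is disarmed */
	return d->use_after_free;
}
```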
      
      Signed-off-by: Lin Ma <linma@zju.edu.cn>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b8cedb70
    • David S. Miller's avatar
      Merge branch 'mptcp-checksums' · 575fb4fb
      David S. Miller authored
      Mat Martineau says:
      
      ====================
      mptcp: Fix checksum byte order on little-endian
      
      These patches address a bug in the byte ordering of MPTCP checksums on
      little-endian architectures. The __sum16 type is always big endian, but
      was being cast to u16 and then byte-swapped (on little-endian archs)
      when reading/writing the checksum field in MPTCP option headers.
      
      MPTCP checksums are off by default, but are enabled if one or both peers
      request it in the SYN/SYNACK handshake.
      
      The corrected code is verified to interoperate between big-endian and
      little-endian machines.
      
      Patch 1 fixes the checksum byte order, patch 2 partially mitigates
      interoperation with peers sending bad checksums by falling back to TCP
      instead of resetting the connection.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      575fb4fb
    • Mat Martineau's avatar
      mptcp: Do TCP fallback on early DSS checksum failure · ae66fb2b
      Mat Martineau authored
      RFC 8684 section 3.7 describes several opportunities for an MPTCP
      connection to "fall back" to regular TCP early in the connection
      process, before it has been confirmed that MPTCP options can be
      successfully propagated on all SYN, SYN/ACK, and data packets. If a peer
      acknowledges the first received data packet with a regular TCP header
      (no MPTCP options), fallback is allowed.
      
      If the recipient of that first data packet finds an MPTCP DSS checksum
      error, this provides an opportunity to fail gracefully with a TCP
      fallback rather than resetting the connection (as might happen if a
      checksum failure were detected later).
      
      This commit modifies the checksum failure code to attempt fallback on
      the initial subflow of an MPTCP connection, only if it's a failure in the
      first data mapping. In cases where the peer initiates the connection,
      requests checksums, is the first to send data, and is sending
      incorrect checksums (see
      https://github.com/multipath-tcp/mptcp_net-next/issues/275), this allows
      the connection to proceed as TCP rather than reset.
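
      The condition above can be sketched as a small decision function. This
      is a simplified illustration, not the kernel implementation: the
      csum_fail_ctx structure, its fields and the action names are
      hypothetical, and every non-fallback outcome is labeled "reset" here
      for brevity.

```c
#include <stdbool.h>

/* Hypothetical view of the state consulted on a DSS checksum failure. */
struct csum_fail_ctx {
	bool initial_subflow;     /* failure is on the connection's first subflow */
	bool first_data_mapping;  /* failure is in the first data mapping */
};

enum csum_fail_action { CSUM_FAIL_RESET, CSUM_FAIL_FALLBACK_TO_TCP };

/* Fall back to TCP only for an early failure on the initial subflow. */
static enum csum_fail_action on_csum_failure(const struct csum_fail_ctx *ctx)
{
	if (ctx->initial_subflow && ctx->first_data_mapping)
		return CSUM_FAIL_FALLBACK_TO_TCP;
	return CSUM_FAIL_RESET;
}
```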
      
      Fixes: dd8bcd17 ("mptcp: validate the data checksum")
      Acked-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      ae66fb2b
    • Paolo Abeni's avatar
      mptcp: fix checksum byte order · ba2c89e0
      Paolo Abeni authored
      The MPTCP code typecasts the checksum value to u16 and
      then converts it to big endian while storing the value into
      the MPTCP option.
      
      As a result, the wire encoding for little endian hosts is
      wrong, and that causes interoperability issues with other
      implementations or hosts with different endianness.
      
      Address the issue by writing the unmodified __sum16 value into the packet.
      
      MPTCP checksum is disabled by default; interoperating with systems
      with bad mptcp-level csum encoding should cause fallback to TCP.
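
      A userspace sketch of the byte-order bug, assuming sum16_t as a
      hypothetical stand-in for the kernel's __sum16 (a 16-bit value whose
      in-memory bytes are already in wire order). The helper names are
      illustrative only. On a little-endian host the buggy variant swaps
      bytes that were already correct; on a big-endian host both variants
      produce the same bytes.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for __sum16: bytes in memory are in wire order. */
typedef uint16_t sum16_t;

/* Build a sum16_t whose in-memory bytes are hi, lo, as sent on the wire. */
static sum16_t sum16_from_wire(uint8_t hi, uint8_t lo)
{
	uint8_t bytes[2] = { hi, lo };
	sum16_t csum;

	memcpy(&csum, bytes, sizeof(csum));
	return csum;
}

/* Buggy pattern: treat the checksum as a host-order integer and convert it. */
static void put_csum_buggy(uint8_t *opt, sum16_t csum)
{
	uint16_t v = htons((uint16_t)csum);

	memcpy(opt, &v, sizeof(v));
}

/* Fixed pattern: store the value unmodified. */
static void put_csum_fixed(uint8_t *opt, sum16_t csum)
{
	memcpy(opt, &csum, sizeof(csum));
}
```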
      
      Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/275
      Fixes: c5b39e26 ("mptcp: send out checksum for DSS")
      Fixes: 390b95a5 ("mptcp: receive checksum for DSS")
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      ba2c89e0
    • David S. Miller's avatar
      Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue · 680b8926
      David S. Miller authored
      Tony Nguyen says:
      
      ====================
      Intel Wired LAN Driver Updates 2022-05-17
      
      This series contains updates to ice driver only.
      
      Arkadiusz prevents writing of timestamps when rings are being
      configured to resolve a null pointer dereference.
      
      Paul changes a delayed call to baseline statistics to occur immediately;
      the delay was causing statistics to be misreported.
      
      Michal fixes incorrect restoration of interrupt moderation settings.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      680b8926
    • David S. Miller's avatar
      Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec · 089403a3
      David S. Miller authored
      Steffen Klassert says:
      
      ====================
      pull request (net): ipsec 2022-05-18
      
      1) Fix "disable_policy" flag use when arriving from different devices.
         From Eyal Birger.
      
      2) Fix error handling of pfkey_broadcast in function pfkey_process.
         From Jiasheng Jiang.
      
      3) Check the encryption module availability consistency in pfkey.
         From Thomas Bartschies.
      
      Please pull or let me know if there are problems.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      089403a3
    • Ard Biesheuvel's avatar
      ARM: 9197/1: spectre-bhb: fix loop8 sequence for Thumb2 · 3cfb3019
      Ard Biesheuvel authored
      In Thumb2, 'b . + 4' produces a branch instruction that uses a narrow
      encoding, and so it does not jump to the following instruction as
      expected. So use W(b) instead.
      
      Fixes: 6c7cb60b ("ARM: fix Thumb2 regression with Spectre BHB")
      Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
      Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
      3cfb3019
    • Ard Biesheuvel's avatar
      ARM: 9196/1: spectre-bhb: enable for Cortex-A15 · 0dc14aa9
      Ard Biesheuvel authored
      The Spectre-BHB mitigations were inadvertently left disabled for
      Cortex-A15, due to the fact that cpu_v7_bugs_init() is not called in
      that case. So fix that.
      
      Fixes: b9baf5c8 ("ARM: Spectre-BHB workaround")
      Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
      Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
      0dc14aa9
    • David S. Miller's avatar
      Merge tag 'mlx5-fixes-2022-05-17' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux · 765d1216
      David S. Miller authored
      Saeed Mahameed says:
      
      ====================
      mlx5 fixes 2022-05-17
      
      This series provides bug fixes to mlx5 driver.
      Please pull and let me know if there is any problem.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      765d1216
    • Thomas Bartschies's avatar
      net: af_key: check encryption module availability consistency · 015c44d7
      Thomas Bartschies authored
      Since the recent introduction of support for the SM3 and SM4 hash algos
      for IPsec, the kernel produces invalid pfkey acquire messages when these
      encryption modules are disabled. This happens because the availability
      of the algos wasn't checked in all necessary functions.
      This patch adds these checks.
      
      Signed-off-by: Thomas Bartschies <thomas.bartschies@cvk.de>
      Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
      015c44d7
    • Jiasheng Jiang's avatar
      net: af_key: add check for pfkey_broadcast in function pfkey_process · 4dc2a5a8
      Jiasheng Jiang authored
      If skb_clone() returns a null pointer, pfkey_broadcast() will
      return an error.
      Therefore, it is better to check the return value of
      pfkey_broadcast() and return an error if it fails.
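
      The pattern is plain error propagation. A minimal sketch with
      hypothetical names (toy_broadcast, toy_process) and an assumed error
      code; it does not reproduce the af_key functions or their signatures.

```c
#include <errno.h>

/* Hypothetical helper: fails when the (simulated) buffer clone fails.
 * The specific error code is an assumption made for this sketch. */
static int toy_broadcast(int clone_fails)
{
	return clone_fails ? -ENOBUFS : 0;
}

/* Caller that checks the result instead of ignoring it. */
static int toy_process(int clone_fails)
{
	int err = toy_broadcast(clone_fails);

	if (err)
		return err;	/* stop here rather than continue after a failure */

	/* ... continue handling the message ... */
	return 0;
}
```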
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: Jiasheng Jiang <jiasheng@iscas.ac.cn>
      Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
      4dc2a5a8
    • Al Viro's avatar
      percpu_ref_init(): clean ->percpu_count_ref on failure · a9171431
      Al Viro authored
      That way percpu_ref_exit() is safe after failing percpu_ref_init().
      At least one user (cgroup_create()) had a double-free that way;
      there might be other similar bugs.  Easier to fix in percpu_ref_init(),
      rather than playing whack-a-mole in sloppy users...
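
      A userspace sketch of why clearing the pointer on the failure path
      makes a later exit call safe. This is a toy model with hypothetical
      names (toy_ref, toy_ref_init, toy_ref_exit, toy_frees); it is not the
      percpu_ref API, and malloc/free stand in for the kernel allocators.

```c
#include <stdlib.h>

/* Toy model; field and function names are hypothetical. */
struct toy_ref {
	long *counter;  /* stands in for the per-CPU counter allocation */
	void *data;     /* second allocation whose failure aborts init */
};

static int toy_frees;  /* counts frees of counter, for demonstration */

static int toy_ref_init(struct toy_ref *ref, int fail_second_alloc)
{
	ref->counter = calloc(1, sizeof(*ref->counter));
	if (!ref->counter)
		return -1;

	ref->data = fail_second_alloc ? NULL : malloc(16);
	if (!ref->data) {
		free(ref->counter);
		toy_frees++;
		ref->counter = NULL;  /* the fix: leave no stale pointer behind */
		return -1;
	}
	return 0;
}

/* Safe to call after a failed init because counter was reset to NULL. */
static void toy_ref_exit(struct toy_ref *ref)
{
	if (ref->counter) {
		free(ref->counter);
		toy_frees++;
		ref->counter = NULL;
	}
	free(ref->data);
	ref->data = NULL;
}
```

      Without the NULL assignment in the failure path, toy_ref_exit() would
      free the same counter a second time.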
      
      Usual symptoms look like messed-up refcounting in one of the subsystems
      that use percpu allocations (might be percpu-refcount, might be
      something else).  Having refcounts for two different objects share
      memory is Not Nice(tm)...
      
      Reported-by: syzbot+5b1e53987f858500ec00@syzkaller.appspotmail.com
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      a9171431