1. 09 Jun, 2023 4 commits
    • Max Tottenham's avatar
      net/sched: act_pedit: Parse L3 Header for L4 offset · 6c02568f
      Max Tottenham authored
      Instead of relying on skb->transport_header being set correctly, opt
      instead to parse the L3 header length out of the L3 headers for both
      IPv4/IPv6 when the Extended Layer Op for tcp/udp is used. This fixes a
      bug if GRO is disabled, when GRO is disabled skb->transport_header is
      set by __netif_receive_skb_core() to point to the L3 header, it's later
      fixed by the upper protocol layers, but act_pedit will receive the SKB
      before the fixups are completed. The existing behavior causes the
      following to edit the L3 header if GRO is disabled instead of the UDP
      header:
      
          tc filter add dev eth0 ingress protocol ip flower ip_proto udp \
       dst_ip 192.168.1.3 action pedit ex munge udp set dport 18053
      
      Also re-introduce a rate-limited warning if we were unable to extract
      the header offset when using the 'ex' interface.
      
      Fixes: 71d0ed70 ("net/act_pedit: Support using offset relative to
      the conventional network headers")
      Signed-off-by: default avatarMax Tottenham <mtottenh@akamai.com>
      Reviewed-by: default avatarJosh Hunt <johunt@akamai.com>
      Reported-by: default avatarkernel test robot <lkp@intel.com>
      Closes: https://lore.kernel.org/oe-kbuild-all/202305261541.N165u9TZ-lkp@intel.com/Reviewed-by: default avatarPedro Tammela <pctammela@mojatatu.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      6c02568f
    • Wes Huang's avatar
      net: usb: qmi_wwan: add support for Compal RXM-G1 · 86319919
      Wes Huang authored
      Add support for Compal RXM-G1 which is based on Qualcomm SDX55 chip.
      This patch adds support for two compositions:
      
      0x9091: DIAG + MODEM + QMI_RMNET + ADB
      0x90db: DIAG + DUN + RMNET + DPL + QDSS(Trace) + ADB
      
      T:  Bus=03 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#=  2 Spd=5000 MxCh= 0
      D:  Ver= 3.20 Cls=00(>ifc ) Sub=00 Prot=00 MxPS= 9 #Cfgs=  1
      P:  Vendor=05c6 ProdID=9091 Rev= 4.14
      S:  Manufacturer=QCOM
      S:  Product=SDXPRAIRIE-MTP _SN:719AB680
      S:  SerialNumber=719ab680
      C:* #Ifs= 4 Cfg#= 1 Atr=80 MxPwr=896mA
      I:* If#= 0 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=(none)
      E:  Ad=81(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms
      E:  Ad=01(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms
      I:* If#= 1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=(none)
      E:  Ad=83(I) Atr=03(Int.) MxPS=  10 Ivl=32ms
      E:  Ad=82(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms
      E:  Ad=02(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms
      I:* If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=qmi_wwan
      E:  Ad=84(I) Atr=03(Int.) MxPS=   8 Ivl=32ms
      E:  Ad=8e(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms
      E:  Ad=0f(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms
      I:* If#= 3 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=42 Prot=01 Driver=(none)
      E:  Ad=03(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms
      E:  Ad=85(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms
      
      T:  Bus=03 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#=  2 Spd=5000 MxCh= 0
      D:  Ver= 3.20 Cls=00(>ifc ) Sub=00 Prot=00 MxPS= 9 #Cfgs=  1
      P:  Vendor=05c6 ProdID=90db Rev= 4.14
      S:  Manufacturer=QCOM
      S:  Product=SDXPRAIRIE-MTP _SN:719AB680
      S:  SerialNumber=719ab680
      C:* #Ifs= 6 Cfg#= 1 Atr=80 MxPwr=896mA
      I:* If#= 0 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=(none)
      E:  Ad=81(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms
      E:  Ad=01(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms
      I:* If#= 1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=(none)
      E:  Ad=83(I) Atr=03(Int.) MxPS=  10 Ivl=32ms
      E:  Ad=82(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms
      E:  Ad=02(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms
      I:* If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=qmi_wwan
      E:  Ad=84(I) Atr=03(Int.) MxPS=   8 Ivl=32ms
      E:  Ad=8e(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms
      E:  Ad=0f(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms
      I:* If#= 3 Alt= 0 #EPs= 1 Cls=ff(vend.) Sub=ff Prot=ff Driver=(none)
      E:  Ad=8f(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms
      I:* If#= 4 Alt= 0 #EPs= 1 Cls=ff(vend.) Sub=ff Prot=ff Driver=(none)
      E:  Ad=85(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms
      I:* If#= 5 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=42 Prot=01 Driver=(none)
      E:  Ad=03(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms
      E:  Ad=86(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms
      
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarWes Huang <wes.huang@moxa.com>
      Acked-by: default avatarBjørn Mork <bjorn@mork.no>
      Link: https://lore.kernel.org/r/20230608030141.3546-1-wes.huang@moxa.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      86319919
    • Yuezhen Luan's avatar
      igb: Fix extts capture value format for 82580/i354/i350 · 6292d743
      Yuezhen Luan authored
      82580/i354/i350 features circle-counter-like timestamp registers
      that are different with newer i210. The EXTTS capture value in
      AUXTSMPx should be converted from raw circle counter value to
      timestamp value in resolution of 1 nanosec by the driver.
      
      This issue can be reproduced on i350 nics, connecting an 1PPS
      signal to a SDP pin, and run 'ts2phc' command to read external
      1PPS timestamp value. On i210 this works fine, but on i350 the
      extts is not correctly converted.
      
      The i350/i354/82580's SYSTIM and other timestamp registers are
      40bit counters, presenting time range of 2^40 ns, that means these
      registers overflows every about 1099s. This causes all these regs
      can't be used directly in contrast to the newer i210/i211s.
      
      The igb driver needs to convert these raw register values to
      valid time stamp format by using kernel timecounter apis for i350s
      families. Here the igb_extts() just forgot to do the convert.
      
      Fixes: 38970eac ("igb: support EXTTS on 82580/i354/i350")
      Signed-off-by: default avatarYuezhen Luan <eggcar.luan@gmail.com>
      Reviewed-by: default avatarJacob Keller <jacob.e.keller@intel.com>
      Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
      Signed-off-by: default avatarTony Nguyen <anthony.l.nguyen@intel.com>
      Reviewed-by: default avatarSimon Horman <simon.horman@corigine.com>
      Link: https://lore.kernel.org/r/20230607164116.3768175-1-anthony.l.nguyen@intel.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      6292d743
    • Guillaume Nault's avatar
      ping6: Fix send to link-local addresses with VRF. · 91ffd1ba
      Guillaume Nault authored
      Ping sockets can't send packets when they're bound to a VRF master
      device and the output interface is set to a slave device.
      
      For example, when net.ipv4.ping_group_range is properly set, so that
      ping6 can use ping sockets, the following kind of commands fails:
        $ ip vrf exec red ping6 fe80::854:e7ff:fe88:4bf1%eth1
      
      What happens is that sk->sk_bound_dev_if is set to the VRF master
      device, but 'oif' is set to the real output device. Since both are set
      but different, ping_v6_sendmsg() sees their value as inconsistent and
      fails.
      
      Fix this by allowing 'oif' to be a slave device of ->sk_bound_dev_if.
      
      This fixes the following kselftest failure:
        $ ./fcnal-test.sh -t ipv6_ping
        [...]
        TEST: ping out, vrf device+address bind - ns-B IPv6 LLA        [FAIL]
      Reported-by: default avatarMirsad Todorovac <mirsad.todorovac@alu.unizg.hr>
      Closes: https://lore.kernel.org/netdev/b6191f90-ffca-dbca-7d06-88a9788def9c@alu.unizg.hr/Tested-by: default avatarMirsad Todorovac <mirsad.todorovac@alu.unizg.hr>
      Fixes: 5e457896 ("net: ipv6: Fix ping to link-local addresses.")
      Signed-off-by: default avatarGuillaume Nault <gnault@redhat.com>
      Reviewed-by: default avatarDavid Ahern <dsahern@kernel.org>
      Link: https://lore.kernel.org/r/6c8b53108816a8d0d5705ae37bdc5a8322b5e3d9.1686153846.git.gnault@redhat.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      91ffd1ba
  2. 08 Jun, 2023 16 commits
  3. 07 Jun, 2023 20 commits
    • Linus Torvalds's avatar
      Merge tag 'input-for-v6.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input · 5f63595e
      Linus Torvalds authored
      Pull input fixes from Dmitry Torokhov:
      
       - a fix for unbalanced open count for inhibited input devices
      
       - fixups in Elantech PS/2 and Cyppress TTSP v5 drivers
      
       - a quirk to soc_button_array driver to make it work with Lenovo
         Yoga Book X90F / X90L
      
       - a removal of erroneous entry from xpad driver
      
      * tag 'input-for-v6.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
        Input: xpad - delete a Razer DeathAdder mouse VID/PID entry
        Input: psmouse - fix OOB access in Elantech protocol
        Input: soc_button_array - add invalid acpi_index DMI quirk handling
        Input: fix open count when closing inhibited device
        Input: cyttsp5 - fix array length
      5f63595e
    • Thomas Gleixner's avatar
      MAINTAINERS: Add entry for debug objects · 25bda386
      Thomas Gleixner authored
      This is overdue and an oversight.
      
      Add myself to this file deespite the fact that I'm trying to reduce the
      number of entries in this file which have my name attached, but in the
      hope that patches wont get picked up elsewhere completely unreviewed and
      unnoticed.
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      25bda386
    • David Howells's avatar
      afs: Fix setting of mtime when creating a file/dir/symlink · a27648c7
      David Howells authored
      kafs incorrectly passes a zero mtime (ie. 1st Jan 1970) to the server when
      creating a file, dir or symlink because the mtime recorded in the
      afs_operation struct gets passed to the server by the marshalling routines,
      but the afs_mkdir(), afs_create() and afs_symlink() functions don't set it.
      
      This gets masked if a file or directory is subsequently modified.
      
      Fix this by filling in op->mtime before calling the create op.
      
      Fixes: e49c7b2f ("afs: Build an abstraction around an "operation" concept")
      Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
      Reviewed-by: default avatarJeffrey Altman <jaltman@auristor.com>
      Reviewed-by: default avatarMarc Dionne <marc.dionne@auristor.com>
      cc: linux-afs@lists.infradead.org
      cc: linux-fsdevel@vger.kernel.org
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      a27648c7
    • Jiri Olsa's avatar
      bpf: Add extra path pointer check to d_path helper · f46fab0e
      Jiri Olsa authored
      Anastasios reported crash on stable 5.15 kernel with following
      BPF attached to lsm hook:
      
        SEC("lsm.s/bprm_creds_for_exec")
        int BPF_PROG(bprm_creds_for_exec, struct linux_binprm *bprm)
        {
                struct path *path = &bprm->executable->f_path;
                char p[128] = { 0 };
      
                bpf_d_path(path, p, 128);
                return 0;
        }
      
      But bprm->executable can be NULL, so bpf_d_path call will crash:
      
        BUG: kernel NULL pointer dereference, address: 0000000000000018
        #PF: supervisor read access in kernel mode
        #PF: error_code(0x0000) - not-present page
        PGD 0 P4D 0
        Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC NOPTI
        ...
        RIP: 0010:d_path+0x22/0x280
        ...
        Call Trace:
         <TASK>
         bpf_d_path+0x21/0x60
         bpf_prog_db9cf176e84498d9_bprm_creds_for_exec+0x94/0x99
         bpf_trampoline_6442506293_0+0x55/0x1000
         bpf_lsm_bprm_creds_for_exec+0x5/0x10
         security_bprm_creds_for_exec+0x29/0x40
         bprm_execve+0x1c1/0x900
         do_execveat_common.isra.0+0x1af/0x260
         __x64_sys_execve+0x32/0x40
      
      It's problem for all stable trees with bpf_d_path helper, which was
      added in 5.9.
      
      This issue is fixed in current bpf code, where we identify and mark
      trusted pointers, so the above code would fail even to load.
      
      For the sake of the stable trees and to workaround potentially broken
      verifier in the future, adding the code that reads the path object from
      the passed pointer and verifies it's valid in kernel space.
      
      Fixes: 6e22ab9d ("bpf: Add d_path helper")
      Reported-by: default avatarAnastasios Papagiannis <tasos.papagiannnis@gmail.com>
      Suggested-by: default avatarAlexei Starovoitov <ast@kernel.org>
      Signed-off-by: default avatarJiri Olsa <jolsa@kernel.org>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Acked-by: default avatarStanislav Fomichev <sdf@google.com>
      Acked-by: default avatarYonghong Song <yhs@fb.com>
      Link: https://lore.kernel.org/bpf/20230606181714.532998-1-jolsa@kernel.org
      f46fab0e
    • Hangyu Hua's avatar
      net: sched: fix possible refcount leak in tc_chain_tmplt_add() · 44f8baaf
      Hangyu Hua authored
      try_module_get will be called in tcf_proto_lookup_ops. So module_put needs
      to be called to drop the refcount if ops don't implement the required
      function.
      
      Fixes: 9f407f17 ("net: sched: introduce chain templates")
      Signed-off-by: default avatarHangyu Hua <hbh25y@gmail.com>
      Reviewed-by: default avatarLarysa Zaremba <larysa.zaremba@intel.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      44f8baaf
    • Eric Dumazet's avatar
      net: sched: act_police: fix sparse errors in tcf_police_dump() · 682881ee
      Eric Dumazet authored
      Fixes following sparse errors:
      
      net/sched/act_police.c:360:28: warning: dereference of noderef expression
      net/sched/act_police.c:362:45: warning: dereference of noderef expression
      net/sched/act_police.c:362:45: warning: dereference of noderef expression
      net/sched/act_police.c:368:28: warning: dereference of noderef expression
      net/sched/act_police.c:370:45: warning: dereference of noderef expression
      net/sched/act_police.c:370:45: warning: dereference of noderef expression
      net/sched/act_police.c:376:45: warning: dereference of noderef expression
      net/sched/act_police.c:376:45: warning: dereference of noderef expression
      
      Fixes: d1967e49 ("net_sched: act_police: add 2 new attributes to support police 64bit rate and peakrate")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reviewed-by: default avatarSimon Horman <simon.horman@corigine.com>
      Acked-by: default avatarJamal Hadi Salim <jhs@mojatatu.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      682881ee
    • Eelco Chaudron's avatar
      net: openvswitch: fix upcall counter access before allocation · de9df6c6
      Eelco Chaudron authored
      Currently, the per cpu upcall counters are allocated after the vport is
      created and inserted into the system. This could lead to the datapath
      accessing the counters before they are allocated resulting in a kernel
      Oops.
      
      Here is an example:
      
        PID: 59693    TASK: ffff0005f4f51500  CPU: 0    COMMAND: "ovs-vswitchd"
         #0 [ffff80000a39b5b0] __switch_to at ffffb70f0629f2f4
         #1 [ffff80000a39b5d0] __schedule at ffffb70f0629f5cc
         #2 [ffff80000a39b650] preempt_schedule_common at ffffb70f0629fa60
         #3 [ffff80000a39b670] dynamic_might_resched at ffffb70f0629fb58
         #4 [ffff80000a39b680] mutex_lock_killable at ffffb70f062a1388
         #5 [ffff80000a39b6a0] pcpu_alloc at ffffb70f0594460c
         #6 [ffff80000a39b750] __alloc_percpu_gfp at ffffb70f05944e68
         #7 [ffff80000a39b760] ovs_vport_cmd_new at ffffb70ee6961b90 [openvswitch]
         ...
      
        PID: 58682    TASK: ffff0005b2f0bf00  CPU: 0    COMMAND: "kworker/0:3"
         #0 [ffff80000a5d2f40] machine_kexec at ffffb70f056a0758
         #1 [ffff80000a5d2f70] __crash_kexec at ffffb70f057e2994
         #2 [ffff80000a5d3100] crash_kexec at ffffb70f057e2ad8
         #3 [ffff80000a5d3120] die at ffffb70f0628234c
         #4 [ffff80000a5d31e0] die_kernel_fault at ffffb70f062828a8
         #5 [ffff80000a5d3210] __do_kernel_fault at ffffb70f056a31f4
         #6 [ffff80000a5d3240] do_bad_area at ffffb70f056a32a4
         #7 [ffff80000a5d3260] do_translation_fault at ffffb70f062a9710
         #8 [ffff80000a5d3270] do_mem_abort at ffffb70f056a2f74
         #9 [ffff80000a5d32a0] el1_abort at ffffb70f06297dac
        #10 [ffff80000a5d32d0] el1h_64_sync_handler at ffffb70f06299b24
        #11 [ffff80000a5d3410] el1h_64_sync at ffffb70f056812dc
        #12 [ffff80000a5d3430] ovs_dp_upcall at ffffb70ee6963c84 [openvswitch]
        #13 [ffff80000a5d3470] ovs_dp_process_packet at ffffb70ee6963fdc [openvswitch]
        #14 [ffff80000a5d34f0] ovs_vport_receive at ffffb70ee6972c78 [openvswitch]
        #15 [ffff80000a5d36f0] netdev_port_receive at ffffb70ee6973948 [openvswitch]
        #16 [ffff80000a5d3720] netdev_frame_hook at ffffb70ee6973a28 [openvswitch]
        #17 [ffff80000a5d3730] __netif_receive_skb_core.constprop.0 at ffffb70f06079f90
      
      We moved the per cpu upcall counter allocation to the existing vport
      alloc and free functions to solve this.
      
      Fixes: 95637d91 ("net: openvswitch: release vport resources on failure")
      Fixes: 1933ea36 ("net: openvswitch: Add support to count upcall packets")
      Signed-off-by: default avatarEelco Chaudron <echaudro@redhat.com>
      Reviewed-by: default avatarSimon Horman <simon.horman@corigine.com>
      Acked-by: default avatarAaron Conole <aconole@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      de9df6c6
    • Eric Dumazet's avatar
      net: sched: move rtm_tca_policy declaration to include file · 886bc7d6
      Eric Dumazet authored
      rtm_tca_policy is used from net/sched/sch_api.c and net/sched/cls_api.c,
      thus should be declared in an include file.
      
      This fixes the following sparse warning:
      net/sched/sch_api.c:1434:25: warning: symbol 'rtm_tca_policy' was not declared. Should it be static?
      
      Fixes: e331473f ("net/sched: cls_api: add missing validation of netlink attributes")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Acked-by: default avatarJamal Hadi Salim <jhs@mojatatu.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      886bc7d6
    • Michal Schmidt's avatar
      ice: make writes to /dev/gnssX synchronous · bf15bb38
      Michal Schmidt authored
      The current ice driver's GNSS write implementation buffers writes and
      works through them asynchronously in a kthread. That's bad because:
       - The GNSS write_raw operation is supposed to be synchronous[1][2].
       - There is no upper bound on the number of pending writes.
         Userspace can submit writes much faster than the driver can process,
         consuming unlimited amounts of kernel memory.
      
      A patch that's currently on review[3] ("[v3,net] ice: Write all GNSS
      buffers instead of first one") would add one more problem:
       - The possibility of waiting for a very long time to flush the write
         work when doing rmmod, softlockups.
      
      To fix these issues, simplify the implementation: Drop the buffering,
      the write_work, and make the writes synchronous.
      
      I tested this with gpsd and ubxtool.
      
      [1] https://events19.linuxfoundation.org/wp-content/uploads/2017/12/The-GNSS-Subsystem-Johan-Hovold-Hovold-Consulting-AB.pdf
          "User interface" slide.
      [2] A comment in drivers/gnss/core.c:gnss_write():
              /* Ignoring O_NONBLOCK, write_raw() is synchronous. */
      [3] https://patchwork.ozlabs.org/project/intel-wired-lan/patch/20230217120541.16745-1-karol.kolacinski@intel.com/
      
      Fixes: d6b98c8d ("ice: add write functionality for GNSS TTY")
      Signed-off-by: default avatarMichal Schmidt <mschmidt@redhat.com>
      Reviewed-by: default avatarSimon Horman <simon.horman@corigine.com>
      Tested-by: Sunitha Mekala <sunithax.d.mekala@intel.com> (A Contingent worker at Intel)
      Signed-off-by: default avatarTony Nguyen <anthony.l.nguyen@intel.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      bf15bb38
    • Eric Dumazet's avatar
      net: sched: add rcu annotations around qdisc->qdisc_sleeping · d636fc5d
      Eric Dumazet authored
      syzbot reported a race around qdisc->qdisc_sleeping [1]
      
      It is time we add proper annotations to reads and writes to/from
      qdisc->qdisc_sleeping.
      
      [1]
      BUG: KCSAN: data-race in dev_graft_qdisc / qdisc_lookup_rcu
      
      read to 0xffff8881286fc618 of 8 bytes by task 6928 on cpu 1:
      qdisc_lookup_rcu+0x192/0x2c0 net/sched/sch_api.c:331
      __tcf_qdisc_find+0x74/0x3c0 net/sched/cls_api.c:1174
      tc_get_tfilter+0x18f/0x990 net/sched/cls_api.c:2547
      rtnetlink_rcv_msg+0x7af/0x8c0 net/core/rtnetlink.c:6386
      netlink_rcv_skb+0x126/0x220 net/netlink/af_netlink.c:2546
      rtnetlink_rcv+0x1c/0x20 net/core/rtnetlink.c:6413
      netlink_unicast_kernel net/netlink/af_netlink.c:1339 [inline]
      netlink_unicast+0x56f/0x640 net/netlink/af_netlink.c:1365
      netlink_sendmsg+0x665/0x770 net/netlink/af_netlink.c:1913
      sock_sendmsg_nosec net/socket.c:724 [inline]
      sock_sendmsg net/socket.c:747 [inline]
      ____sys_sendmsg+0x375/0x4c0 net/socket.c:2503
      ___sys_sendmsg net/socket.c:2557 [inline]
      __sys_sendmsg+0x1e3/0x270 net/socket.c:2586
      __do_sys_sendmsg net/socket.c:2595 [inline]
      __se_sys_sendmsg net/socket.c:2593 [inline]
      __x64_sys_sendmsg+0x46/0x50 net/socket.c:2593
      do_syscall_x64 arch/x86/entry/common.c:50 [inline]
      do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80
      entry_SYSCALL_64_after_hwframe+0x63/0xcd
      
      write to 0xffff8881286fc618 of 8 bytes by task 6912 on cpu 0:
      dev_graft_qdisc+0x4f/0x80 net/sched/sch_generic.c:1115
      qdisc_graft+0x7d0/0xb60 net/sched/sch_api.c:1103
      tc_modify_qdisc+0x712/0xf10 net/sched/sch_api.c:1693
      rtnetlink_rcv_msg+0x807/0x8c0 net/core/rtnetlink.c:6395
      netlink_rcv_skb+0x126/0x220 net/netlink/af_netlink.c:2546
      rtnetlink_rcv+0x1c/0x20 net/core/rtnetlink.c:6413
      netlink_unicast_kernel net/netlink/af_netlink.c:1339 [inline]
      netlink_unicast+0x56f/0x640 net/netlink/af_netlink.c:1365
      netlink_sendmsg+0x665/0x770 net/netlink/af_netlink.c:1913
      sock_sendmsg_nosec net/socket.c:724 [inline]
      sock_sendmsg net/socket.c:747 [inline]
      ____sys_sendmsg+0x375/0x4c0 net/socket.c:2503
      ___sys_sendmsg net/socket.c:2557 [inline]
      __sys_sendmsg+0x1e3/0x270 net/socket.c:2586
      __do_sys_sendmsg net/socket.c:2595 [inline]
      __se_sys_sendmsg net/socket.c:2593 [inline]
      __x64_sys_sendmsg+0x46/0x50 net/socket.c:2593
      do_syscall_x64 arch/x86/entry/common.c:50 [inline]
      do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80
      entry_SYSCALL_64_after_hwframe+0x63/0xcd
      
      Reported by Kernel Concurrency Sanitizer on:
      CPU: 0 PID: 6912 Comm: syz-executor.5 Not tainted 6.4.0-rc3-syzkaller-00190-g0d85b27b #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 05/16/2023
      
      Fixes: 3a7d0d07 ("net: sched: extend Qdisc with rcu")
      Reported-by: default avatarsyzbot <syzkaller@googlegroups.com>
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Cc: Vlad Buslov <vladbu@nvidia.com>
      Acked-by: Jamal Hadi Salim<jhs@mojatatu.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      d636fc5d
    • David S. Miller's avatar
      Merge branch 'rfs-lockless-annotate' · e3144ff5
      David S. Miller authored
      Eric Dumazet says:
      
      ====================
      rfs: annotate lockless accesses
      
      rfs runs without locks held, so we should annotate
      read and writes to shared variables.
      
      It should prevent compilers forcing writes
      in the following situation:
      
        if (var != val)
           var = val;
      
      A compiler could indeed simply avoid the conditional:
      
          var = val;
      
      This matters if var is shared between many cpus.
      
      v2: aligns one closing bracket (Simon)
          adds Fixes: tags (Jakub)
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      e3144ff5
    • Eric Dumazet's avatar
      rfs: annotate lockless accesses to RFS sock flow table · 5c3b74a9
      Eric Dumazet authored
      Add READ_ONCE()/WRITE_ONCE() on accesses to the sock flow table.
      
      This also prevents a (smart ?) compiler to remove the condition in:
      
      if (table->ents[index] != newval)
              table->ents[index] = newval;
      
      We need the condition to avoid dirtying a shared cache line.
      
      Fixes: fec5e652 ("rfs: Receive Flow Steering")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reviewed-by: default avatarSimon Horman <simon.horman@corigine.com>
      Reviewed-by: default avatarKuniyuki Iwashima <kuniyu@amazon.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      5c3b74a9
    • Eric Dumazet's avatar
      rfs: annotate lockless accesses to sk->sk_rxhash · 1e5c647c
      Eric Dumazet authored
      Add READ_ONCE()/WRITE_ONCE() on accesses to sk->sk_rxhash.
      
      This also prevents a (smart ?) compiler to remove the condition in:
      
      if (sk->sk_rxhash != newval)
      	sk->sk_rxhash = newval;
      
      We need the condition to avoid dirtying a shared cache line.
      
      Fixes: fec5e652 ("rfs: Receive Flow Steering")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reviewed-by: default avatarSimon Horman <simon.horman@corigine.com>
      Reviewed-by: default avatarKuniyuki Iwashima <kuniyu@amazon.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      1e5c647c
    • Jakub Kicinski's avatar
      Merge tag 'for-net-2023-06-05' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth · ab39b113
      Jakub Kicinski authored
      Luiz Augusto von Dentz says:
      
      ====================
      bluetooth pull request for net:
      
       - Fixes to debugfs registration
       - Fix use-after-free in hci_remove_ltk/hci_remove_irk
       - Fixes to ISO channel support
       - Fix missing checks for invalid L2CAP DCID
       - Fix l2cap_disconnect_req deadlock
       - Add lock to protect HCI_UNREGISTER
      
      * tag 'for-net-2023-06-05' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth:
        Bluetooth: L2CAP: Add missing checks for invalid DCID
        Bluetooth: ISO: use correct CIS order in Set CIG Parameters event
        Bluetooth: ISO: don't try to remove CIG if there are bound CIS left
        Bluetooth: Fix l2cap_disconnect_req deadlock
        Bluetooth: hci_qca: fix debugfs registration
        Bluetooth: fix debugfs registration
        Bluetooth: hci_sync: add lock to protect HCI_UNREGISTER
        Bluetooth: Fix use-after-free in hci_remove_ltk/hci_remove_irk
        Bluetooth: ISO: Fix CIG auto-allocation to select configurable CIG
        Bluetooth: ISO: consider right CIS when removing CIG at cleanup
      ====================
      
      Link: https://lore.kernel.org/r/20230606003454.2392552-1-luiz.dentz@gmail.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      ab39b113
    • Jakub Kicinski's avatar
      Merge tag 'nf-23-06-07' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf · 20c47646
      Jakub Kicinski authored
      Pablo Neira Ayuso says:
      
      ====================
      Netfilter fixes for net
      
      The following patchset contains Netfilter fixes for net:
      
      1) Missing nul-check in basechain hook netlink dump path, from Gavrilov Ilia.
      
      2) Fix bitwise register tracking, from Jeremy Sowden.
      
      3) Null pointer dereference when accessing conntrack helper,
         from Tijs Van Buggenhout.
      
      4) Add schedule point to ipset's call_ad, from Kuniyuki Iwashima.
      
      5) Incorrect boundary check when building chain blob.
      
      * tag 'nf-23-06-07' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
        netfilter: nf_tables: out-of-bound check in chain blob
        netfilter: ipset: Add schedule point in call_ad().
        netfilter: conntrack: fix NULL pointer dereference in nf_confirm_cthelper
        netfilter: nft_bitwise: fix register tracking
        netfilter: nf_tables: Add null check for nla_nest_start_noflag() in nft_dump_basechain_hook()
      ====================
      
      Link: https://lore.kernel.org/r/20230606225851.67394-1-pablo@netfilter.orgSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      20c47646
    • Jakub Kicinski's avatar
      Merge tag 'wireless-2023-06-06' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless · e684ab76
      Jakub Kicinski authored
      Kalle Valo says:
      
      ====================
      wireless fixes for v6.4
      
      Both rtw88 and rtw89 have a 802.11 powersave fix for a regression
      introduced in v6.0. mt76 fixes a race and a null pointer dereference.
      iwlwifi fixes an issue where not enough memory was allocated for a
      firmware event. And finally the stack has several smaller fixes all
      over.
      
      * tag 'wireless-2023-06-06' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless:
        wifi: cfg80211: fix locking in regulatory disconnect
        wifi: cfg80211: fix locking in sched scan stop work
        wifi: iwlwifi: mvm: Fix -Warray-bounds bug in iwl_mvm_wait_d3_notif()
        wifi: mac80211: fix switch count in EMA beacons
        wifi: mac80211: don't translate beacon/presp addrs
        wifi: mac80211: mlme: fix non-inheritence element
        wifi: cfg80211: reject bad AP MLD address
        wifi: mac80211: use correct iftype HE cap
        wifi: mt76: mt7996: fix possible NULL pointer dereference in mt7996_mac_write_txwi()
        wifi: rtw89: remove redundant check of entering LPS
        wifi: rtw89: correct PS calculation for SUPPORTS_DYNAMIC_PS
        wifi: rtw88: correct PS calculation for SUPPORTS_DYNAMIC_PS
        wifi: mt76: mt7615: fix possible race in mt7615_mac_sta_poll
      ====================
      
      Link: https://lore.kernel.org/r/20230606150817.EC133C433D2@smtp.kernel.orgSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      e684ab76
    • Brett Creeley's avatar
      virtio_net: use control_buf for coalesce params · accc1bf2
      Brett Creeley authored
      Commit 699b045a ("net: virtio_net: notifications coalescing
      support") added coalescing command support for virtio_net. However,
      the coalesce commands are using buffers on the stack, which is causing
      the device to see DMA errors. There should also be a complaint from
      check_for_stack() in debug_dma_map_xyz(). Fix this by adding and using
      coalesce params from the control_buf struct, which aligns with other
      commands.
      
      Cc: stable@vger.kernel.org
      Fixes: 699b045a ("net: virtio_net: notifications coalescing support")
      Reviewed-by: default avatarShannon Nelson <shannon.nelson@amd.com>
      Signed-off-by: default avatarAllen Hubbe <allen.hubbe@amd.com>
      Signed-off-by: default avatarBrett Creeley <brett.creeley@amd.com>
      Acked-by: default avatarJason Wang <jasowang@redhat.com>
      Reviewed-by: default avatarXuan Zhuo <xuanzhuo@linux.alibaba.com>
      Acked-by: default avatarMichael S. Tsirkin <mst@redhat.com>
      Link: https://lore.kernel.org/r/20230605195925.51625-1-brett.creeley@amd.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      accc1bf2
    • Brett Creeley's avatar
      pds_core: Fix FW recovery detection · 4f48c303
      Brett Creeley authored
      Commit 523847df ("pds_core: add devcmd device interfaces") included
      initial support for FW recovery detection. Unfortunately, the ordering
      in pdsc_is_fw_good() was incorrect, which was causing FW recovery to be
      undetected by the driver. Fix this by making sure to update the cached
      fw_status by calling pdsc_is_fw_running() before setting the local FW
      gen.
      
      Fixes: 523847df ("pds_core: add devcmd device interfaces")
      Signed-off-by: default avatarShannon Nelson <shannon.nelson@amd.com>
      Signed-off-by: default avatarBrett Creeley <brett.creeley@amd.com>
      Reviewed-by: default avatarSimon Horman <simon.horman@corigine.com>
      Link: https://lore.kernel.org/r/20230605195116.49653-1-brett.creeley@amd.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      4f48c303
    • Eric Dumazet's avatar
      tcp: gso: really support BIG TCP · 82a01ab3
      Eric Dumazet authored
      We missed that tcp_gso_segment() was assuming skb->len was smaller than 65535 :
      
      oldlen = (u16)~skb->len;
      
      This part came with commit 0718bcc0 ("[NET]: Fix CHECKSUM_HW GSO problems.")
      
      This leads to wrong TCP checksum.
      
      Adapt the code to accept arbitrary packet length.
      
      v2:
        - use two csum_add() instead of csum_fold() (Alexander Duyck)
        - Change delta type to __wsum to reduce casts (Alexander Duyck)
      
      Fixes: 09f3d1a3 ("ipv6/gso: remove temporary HBH/jumbo header")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reviewed-by: default avatarAlexander Duyck <alexanderduyck@fb.com>
      Reviewed-by: default avatarSimon Horman <simon.horman@corigine.com>
      Link: https://lore.kernel.org/r/20230605161647.3624428-1-edumazet@google.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      82a01ab3
    • Kuniyuki Iwashima's avatar
      ipv6: rpl: Fix Route of Death. · a2f4c143
      Kuniyuki Iwashima authored
      A remote DoS vulnerability of RPL Source Routing is assigned CVE-2023-2156.
      
      The Source Routing Header (SRH) has the following format:
      
        0                   1                   2                   3
        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
        +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
        |  Next Header  |  Hdr Ext Len  | Routing Type  | Segments Left |
        +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
        | CmprI | CmprE |  Pad  |               Reserved                |
        +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
        |                                                               |
        .                                                               .
        .                        Addresses[1..n]                        .
        .                                                               .
        |                                                               |
        +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      
      The originator of an SRH places the first hop's IPv6 address in the IPv6
      header's IPv6 Destination Address and the second hop's IPv6 address as
      the first address in Addresses[1..n].
      
      The CmprI and CmprE fields indicate the number of prefix octets that are
      shared with the IPv6 Destination Address.  When CmprI or CmprE is not 0,
      Addresses[1..n] are compressed as follows:
      
        1..n-1 : (16 - CmprI) bytes
             n : (16 - CmprE) bytes
      
      Segments Left indicates the number of route segments remaining.  When the
      value is not zero, the SRH is forwarded to the next hop.  Its address
      is extracted from Addresses[n - Segment Left + 1] and swapped with IPv6
      Destination Address.
      
      When Segment Left is greater than or equal to 2, the size of SRH is not
      changed because Addresses[1..n-1] are decompressed and recompressed with
      CmprI.
      
      OTOH, when Segment Left changes from 1 to 0, the new SRH could have a
      different size because Addresses[1..n-1] are decompressed with CmprI and
      recompressed with CmprE.
      
      Let's say CmprI is 15 and CmprE is 0.  When we receive SRH with Segment
      Left >= 2, Addresses[1..n-1] have 1 byte for each, and Addresses[n] has
      16 bytes.  When Segment Left is 1, Addresses[1..n-1] is decompressed to
      16 bytes and not recompressed.  Finally, the new SRH will need more room
      in the header, and the size is (16 - 1) * (n - 1) bytes.
      
      Here the max value of n is 255 as Segment Left is u8, so in the worst case,
      we have to allocate 3825 bytes in the skb headroom.  However, now we only
      allocate a small fixed buffer that is IPV6_RPL_SRH_WORST_SWAP_SIZE (16 + 7
      bytes).  If the decompressed size overflows the room, skb_push() hits BUG()
      below [0].
      
      Instead of allocating the fixed buffer for every packet, let's allocate
      enough headroom only when we receive SRH with Segment Left 1.
      
      [0]:
      skbuff: skb_under_panic: text:ffffffff81c9f6e2 len:576 put:576 head:ffff8880070b5180 data:ffff8880070b4fb0 tail:0x70 end:0x140 dev:lo
      kernel BUG at net/core/skbuff.c:200!
      invalid opcode: 0000 [#1] PREEMPT SMP PTI
      CPU: 0 PID: 154 Comm: python3 Not tainted 6.4.0-rc4-00190-gc308e9ec #7
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
      RIP: 0010:skb_panic (net/core/skbuff.c:200)
      Code: 4f 70 50 8b 87 bc 00 00 00 50 8b 87 b8 00 00 00 50 ff b7 c8 00 00 00 4c 8b 8f c0 00 00 00 48 c7 c7 80 6e 77 82 e8 ad 8b 60 ff <0f> 0b 66 66 2e 0f 1f 84 00 00 00 00 00 90 90 90 90 90 90 90 90 90
      RSP: 0018:ffffc90000003da0 EFLAGS: 00000246
      RAX: 0000000000000085 RBX: ffff8880058a6600 RCX: 0000000000000000
      RDX: 0000000000000000 RSI: ffff88807dc1c540 RDI: ffff88807dc1c540
      RBP: ffffc90000003e48 R08: ffffffff82b392c8 R09: 00000000ffffdfff
      R10: ffffffff82a592e0 R11: ffffffff82b092e0 R12: ffff888005b1c800
      R13: ffff8880070b51b8 R14: ffff888005b1ca18 R15: ffff8880070b5190
      FS:  00007f4539f0b740(0000) GS:ffff88807dc00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 000055670baf3000 CR3: 0000000005b0e000 CR4: 00000000007506f0
      PKRU: 55555554
      Call Trace:
       <IRQ>
       skb_push (net/core/skbuff.c:210)
       ipv6_rthdr_rcv (./include/linux/skbuff.h:2880 net/ipv6/exthdrs.c:634 net/ipv6/exthdrs.c:718)
       ip6_protocol_deliver_rcu (net/ipv6/ip6_input.c:437 (discriminator 5))
       ip6_input_finish (./include/linux/rcupdate.h:805 net/ipv6/ip6_input.c:483)
       __netif_receive_skb_one_core (net/core/dev.c:5494)
       process_backlog (./include/linux/rcupdate.h:805 net/core/dev.c:5934)
       __napi_poll (net/core/dev.c:6496)
       net_rx_action (net/core/dev.c:6565 net/core/dev.c:6696)
       __do_softirq (./arch/x86/include/asm/jump_label.h:27 ./include/linux/jump_label.h:207 ./include/trace/events/irq.h:142 kernel/softirq.c:572)
       do_softirq (kernel/softirq.c:472 kernel/softirq.c:459)
       </IRQ>
       <TASK>
       __local_bh_enable_ip (kernel/softirq.c:396)
       __dev_queue_xmit (net/core/dev.c:4272)
       ip6_finish_output2 (./include/net/neighbour.h:544 net/ipv6/ip6_output.c:134)
       rawv6_sendmsg (./include/net/dst.h:458 ./include/linux/netfilter.h:303 net/ipv6/raw.c:656 net/ipv6/raw.c:914)
       sock_sendmsg (net/socket.c:724 net/socket.c:747)
       __sys_sendto (net/socket.c:2144)
       __x64_sys_sendto (net/socket.c:2156 net/socket.c:2152 net/socket.c:2152)
       do_syscall_64 (arch/x86/entry/common.c:50 arch/x86/entry/common.c:80)
       entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:120)
      RIP: 0033:0x7f453a138aea
      Code: d8 64 89 02 48 c7 c0 ff ff ff ff eb b8 0f 1f 00 f3 0f 1e fa 41 89 ca 64 8b 04 25 18 00 00 00 85 c0 75 15 b8 2c 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 7e c3 0f 1f 44 00 00 41 54 48 83 ec 30 44 89
      RSP: 002b:00007ffcc212a1c8 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
      RAX: ffffffffffffffda RBX: 00007ffcc212a288 RCX: 00007f453a138aea
      RDX: 0000000000000060 RSI: 00007f4539084c20 RDI: 0000000000000003
      RBP: 00007f4538308e80 R08: 00007ffcc212a300 R09: 000000000000001c
      R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
      R13: ffffffffc4653600 R14: 0000000000000001 R15: 00007f4539712d1b
       </TASK>
      Modules linked in:
      
      Fixes: 8610c7c6 ("net: ipv6: add support for rpl sr exthdr")
      Reported-by: Max VA
      Closes: https://www.interruptlabs.co.uk/articles/linux-ipv6-route-of-deathSigned-off-by: default avatarKuniyuki Iwashima <kuniyu@amazon.com>
      Reviewed-by: default avatarEric Dumazet <edumazet@google.com>
      Link: https://lore.kernel.org/r/20230605180617.67284-1-kuniyu@amazon.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      a2f4c143