1. 29 Jun, 2023 24 commits
    • Bluetooth: hci_event: fix Set CIG Parameters error status handling · db9cbcad
      Pauli Virtanen authored
      If the event has an error status, return the right error code and don't
      show incorrect "response malformed" messages.
      Signed-off-by: Pauli Virtanen <pav@iki.fi>
      Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Bluetooth: ISO: use hci_sync for setting CIG parameters · 6b9545dc
      Pauli Virtanen authored
      When reconfiguring CIG after disconnection of the last CIS, LE Remove
      CIG shall be sent before LE Set CIG Parameters.  Otherwise, it fails
      because CIG is in the inactive state and not configurable (Core v5.3
      Vol 6 Part B Sec. 4.5.14.3). This ordering is currently wrong under
      suitable timing conditions, because LE Remove CIG is sent via the
      hci_sync queue and may be delayed, but Set CIG Parameters is via
      hci_send_cmd.
      
      Make the ordering well-defined by also sending Set CIG Parameters via
      hci_sync.
      
      Fixes: 26afbd82 ("Bluetooth: Add initial implementation of CIS connections")
      Signed-off-by: Pauli Virtanen <pav@iki.fi>
      Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Bluetooth: hci_bcm: do not mark valid bd_addr as invalid · 56b7f325
      Johan Hovold authored
      A recent commit restored the original (and still documented) semantics
      for the HCI_QUIRK_USE_BDADDR_PROPERTY quirk so that the device address
      is considered invalid unless an address is provided by firmware.
      
      This specifically means that this flag must only be set for devices with
      invalid addresses, but the Broadcom driver has so far been setting this
      flag unconditionally.
      
      Fortunately the driver already checks for invalid addresses during setup
      and sets the HCI_QUIRK_INVALID_BDADDR flag. Use this flag to indicate
      when the address can be overridden by firmware (long term, this should
      probably just always be allowed).
      
      Fixes: 6945795b ("Bluetooth: fix use-bdaddr-property quirk")
      Reported-by: Marek Szyprowski <m.szyprowski@samsung.com>
      Link: https://lore.kernel.org/lkml/ecef83c8-497f-4011-607b-a63c24764867@samsung.com
      Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
      Tested-by: Marek Szyprowski <m.szyprowski@samsung.com>
      Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Bluetooth: L2CAP: Fix use-after-free in l2cap_sock_ready_cb · 1728137b
      Sungwoo Kim authored
      l2cap_sock_release(sk) frees sk. However, sk's children are still alive
      and point to the already free'd sk's address.
      To fix this, l2cap_sock_release(sk) also cleans sk's children.
      
      ==================================================================
      BUG: KASAN: use-after-free in l2cap_sock_ready_cb+0xb7/0x100 net/bluetooth/l2cap_sock.c:1650
      Read of size 8 at addr ffff888104617aa8 by task kworker/u3:0/276
      
      CPU: 0 PID: 276 Comm: kworker/u3:0 Not tainted 6.2.0-00001-gef397bd4d5fb-dirty #59
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
      Workqueue: hci2 hci_rx_work
      Call Trace:
       <TASK>
       __dump_stack lib/dump_stack.c:88 [inline]
       dump_stack_lvl+0x72/0x95 lib/dump_stack.c:106
       print_address_description mm/kasan/report.c:306 [inline]
       print_report+0x175/0x478 mm/kasan/report.c:417
       kasan_report+0xb1/0x130 mm/kasan/report.c:517
       l2cap_sock_ready_cb+0xb7/0x100 net/bluetooth/l2cap_sock.c:1650
       l2cap_chan_ready+0x10e/0x1e0 net/bluetooth/l2cap_core.c:1386
       l2cap_config_req+0x753/0x9f0 net/bluetooth/l2cap_core.c:4480
       l2cap_bredr_sig_cmd net/bluetooth/l2cap_core.c:5739 [inline]
       l2cap_sig_channel net/bluetooth/l2cap_core.c:6509 [inline]
       l2cap_recv_frame+0xe2e/0x43c0 net/bluetooth/l2cap_core.c:7788
       l2cap_recv_acldata+0x6ed/0x7e0 net/bluetooth/l2cap_core.c:8506
       hci_acldata_packet net/bluetooth/hci_core.c:3813 [inline]
       hci_rx_work+0x66e/0xbc0 net/bluetooth/hci_core.c:4048
       process_one_work+0x4ea/0x8e0 kernel/workqueue.c:2289
       worker_thread+0x364/0x8e0 kernel/workqueue.c:2436
       kthread+0x1b9/0x200 kernel/kthread.c:376
       ret_from_fork+0x2c/0x50 arch/x86/entry/entry_64.S:308
       </TASK>
      
      Allocated by task 288:
       kasan_save_stack+0x22/0x50 mm/kasan/common.c:45
       kasan_set_track+0x25/0x30 mm/kasan/common.c:52
       ____kasan_kmalloc mm/kasan/common.c:374 [inline]
       __kasan_kmalloc+0x82/0x90 mm/kasan/common.c:383
       kasan_kmalloc include/linux/kasan.h:211 [inline]
       __do_kmalloc_node mm/slab_common.c:968 [inline]
       __kmalloc+0x5a/0x140 mm/slab_common.c:981
       kmalloc include/linux/slab.h:584 [inline]
       sk_prot_alloc+0x113/0x1f0 net/core/sock.c:2040
       sk_alloc+0x36/0x3c0 net/core/sock.c:2093
       l2cap_sock_alloc.constprop.0+0x39/0x1c0 net/bluetooth/l2cap_sock.c:1852
       l2cap_sock_create+0x10d/0x220 net/bluetooth/l2cap_sock.c:1898
       bt_sock_create+0x183/0x290 net/bluetooth/af_bluetooth.c:132
       __sock_create+0x226/0x380 net/socket.c:1518
       sock_create net/socket.c:1569 [inline]
       __sys_socket_create net/socket.c:1606 [inline]
       __sys_socket_create net/socket.c:1591 [inline]
       __sys_socket+0x112/0x200 net/socket.c:1639
       __do_sys_socket net/socket.c:1652 [inline]
       __se_sys_socket net/socket.c:1650 [inline]
       __x64_sys_socket+0x40/0x50 net/socket.c:1650
       do_syscall_x64 arch/x86/entry/common.c:50 [inline]
       do_syscall_64+0x3f/0x90 arch/x86/entry/common.c:80
       entry_SYSCALL_64_after_hwframe+0x72/0xdc
      
      Freed by task 288:
       kasan_save_stack+0x22/0x50 mm/kasan/common.c:45
       kasan_set_track+0x25/0x30 mm/kasan/common.c:52
       kasan_save_free_info+0x2e/0x50 mm/kasan/generic.c:523
       ____kasan_slab_free mm/kasan/common.c:236 [inline]
       ____kasan_slab_free mm/kasan/common.c:200 [inline]
       __kasan_slab_free+0x10a/0x190 mm/kasan/common.c:244
       kasan_slab_free include/linux/kasan.h:177 [inline]
       slab_free_hook mm/slub.c:1781 [inline]
       slab_free_freelist_hook mm/slub.c:1807 [inline]
       slab_free mm/slub.c:3787 [inline]
       __kmem_cache_free+0x88/0x1f0 mm/slub.c:3800
       sk_prot_free net/core/sock.c:2076 [inline]
       __sk_destruct+0x347/0x430 net/core/sock.c:2168
       sk_destruct+0x9c/0xb0 net/core/sock.c:2183
       __sk_free+0x82/0x220 net/core/sock.c:2194
       sk_free+0x7c/0xa0 net/core/sock.c:2205
       sock_put include/net/sock.h:1991 [inline]
       l2cap_sock_kill+0x256/0x2b0 net/bluetooth/l2cap_sock.c:1257
       l2cap_sock_release+0x1a7/0x220 net/bluetooth/l2cap_sock.c:1428
       __sock_release+0x80/0x150 net/socket.c:650
       sock_close+0x19/0x30 net/socket.c:1368
       __fput+0x17a/0x5c0 fs/file_table.c:320
       task_work_run+0x132/0x1c0 kernel/task_work.c:179
       resume_user_mode_work include/linux/resume_user_mode.h:49 [inline]
       exit_to_user_mode_loop kernel/entry/common.c:171 [inline]
       exit_to_user_mode_prepare+0x113/0x120 kernel/entry/common.c:203
       __syscall_exit_to_user_mode_work kernel/entry/common.c:285 [inline]
       syscall_exit_to_user_mode+0x21/0x50 kernel/entry/common.c:296
       do_syscall_64+0x4c/0x90 arch/x86/entry/common.c:86
       entry_SYSCALL_64_after_hwframe+0x72/0xdc
      
      The buggy address belongs to the object at ffff888104617800
       which belongs to the cache kmalloc-1k of size 1024
      The buggy address is located 680 bytes inside of
       1024-byte region [ffff888104617800, ffff888104617c00)
      
      The buggy address belongs to the physical page:
      page:00000000dbca6a80 refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff888104614000 pfn:0x104614
      head:00000000dbca6a80 order:2 compound_mapcount:0 subpages_mapcount:0 compound_pincount:0
      flags: 0x200000000010200(slab|head|node=0|zone=2)
      raw: 0200000000010200 ffff888100041dc0 ffffea0004212c10 ffffea0004234b10
      raw: ffff888104614000 0000000000080002 00000001ffffffff 0000000000000000
      page dumped because: kasan: bad access detected
      
      Memory state around the buggy address:
       ffff888104617980: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
       ffff888104617a00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      >ffff888104617a80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                                        ^
       ffff888104617b00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
       ffff888104617b80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
      ==================================================================
      
      Ack: This bug is found by FuzzBT with a modified Syzkaller. Other
      contributors are Ruoyu Wu and Hui Peng.
      Signed-off-by: Sungwoo Kim <iam@sung-woo.kim>
      Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Bluetooth: fix use-bdaddr-property quirk · 6945795b
      Johan Hovold authored
      Devices that lack persistent storage for the device address can indicate
      this by setting the HCI_QUIRK_INVALID_BDADDR which causes the controller
      to be marked as unconfigured until user space has set a valid address.
      
      The related HCI_QUIRK_USE_BDADDR_PROPERTY was later added to similarly
      indicate that the device lacks a valid address but that one may be
      specified in the devicetree.
      
      As is clear from commit 7a0e5b15 ("Bluetooth: Add quirk for reading
      BD_ADDR from fwnode property") that added and documented this quirk and
      commits like de79a9df ("Bluetooth: btqcomsmd: use
      HCI_QUIRK_USE_BDADDR_PROPERTY"), the device address of controllers with
      this flag should be treated as invalid until user space has had a chance
      to configure the controller in case the devicetree property is missing.
      
      As it does not make sense to allow controllers with invalid addresses,
      restore the original semantics, which also makes sure that the
      implementation is consistent (e.g. get_missing_options() indicates that
      the address must be set) and matches the documentation (including
      comments in the code, such as, "In case any of them is set, the
      controller has to start up as unconfigured.").
      
      Fixes: e668eb1e ("Bluetooth: hci_core: Don't stop BT if the BD address missing in dts")
      Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
      Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Bluetooth: fix invalid-bdaddr quirk for non-persistent setup · 0cb73658
      Johan Hovold authored
      Devices that lack persistent storage for the device address can indicate
      this by setting the HCI_QUIRK_INVALID_BDADDR which causes the controller
      to be marked as unconfigured until user space has set a valid address.
      
      Once configured, the device address must be set on every setup for
      controllers with HCI_QUIRK_NON_PERSISTENT_SETUP to avoid marking the
      controller as unconfigured and requiring the address to be set again.
      
      Fixes: 740011cf ("Bluetooth: Add new quirk for non-persistent setup settings")
      Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
      Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Bluetooth: L2CAP: Fix use-after-free · f752a0b3
      Zhengping Jiang authored
      Fix potential use-after-free in l2cap_le_command_rej.
      Signed-off-by: Zhengping Jiang <jiangzp@google.com>
      Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Bluetooth: btqca: use le32_to_cpu for ver.soc_id · 8153b738
      Min-Hua Chen authored
      Use le32_to_cpu for ver.soc_id to fix the following
      sparse warning.
      
      drivers/bluetooth/btqca.c:640:24: sparse: warning: restricted
      __le32 degrades to integer
      Signed-off-by: Min-Hua Chen <minhuadotchen@gmail.com>
      Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
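The sparse warning in the btqca commit arises because a field declared as little-endian (`__le32`) was used as a plain integer. As a minimal userspace sketch of what such a conversion does, the value can be assembled byte by byte so that the result does not depend on host endianness. The helper name `le32_bytes_to_cpu()` is hypothetical and is not the kernel's `le32_to_cpu()` macro.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical userspace stand-in for a little-endian to CPU conversion:
 * decode a 32-bit value stored least-significant byte first.
 * Building the result from individual bytes gives the same answer on
 * little-endian and big-endian hosts.
 */
static uint32_t le32_bytes_to_cpu(const uint8_t *b)
{
	assert(b != NULL);

	return (uint32_t)b[0] |
	       ((uint32_t)b[1] << 8) |
	       ((uint32_t)b[2] << 16) |
	       ((uint32_t)b[3] << 24);
}
```

In the kernel itself the conversion is a no-op on little-endian CPUs and a byte swap on big-endian ones; the point of annotating the type is that sparse can flag any use that skips the conversion.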
    • Bluetooth: btusb: Add device 6655:8771 to device tables · 022b6101
      Dan Gora authored
      This device is an Inspire branded BT 5.1 USB dongle with a
      Realtek RTL8761BU chip using the "Best Buy China" vendor ID.
      
      The device table is as follows:
      
      T:  Bus=01 Lev=01 Prnt=02 Port=09 Cnt=01 Dev#=  7 Spd=12   MxCh= 0
      D:  Ver= 1.10 Cls=e0(wlcon) Sub=01 Prot=01 MxPS=64 #Cfgs=  1
      P:  Vendor=6655 ProdID=8771 Rev=02.00
      S:  Manufacturer=Realtek
      S:  Product=Bluetooth Radio
      S:  SerialNumber=00E04C239987
      C:  #Ifs= 2 Cfg#= 1 Atr=e0 MxPwr=500mA
      I:  If#= 0 Alt= 0 #EPs= 3 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
      E:  Ad=02(O) Atr=02(Bulk) MxPS=  64 Ivl=0ms
      E:  Ad=81(I) Atr=03(Int.) MxPS=  16 Ivl=1ms
      E:  Ad=82(I) Atr=02(Bulk) MxPS=  64 Ivl=0ms
      I:  If#= 1 Alt= 0 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
      E:  Ad=03(O) Atr=01(Isoc) MxPS=   0 Ivl=1ms
      E:  Ad=83(I) Atr=01(Isoc) MxPS=   0 Ivl=1ms
      Signed-off-by: Dan Gora <dan.gora@gmail.com>
      Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Bluetooth: btrtl: Add missing MODULE_FIRMWARE declarations · bb23f07c
      Dan Gora authored
      Add missing MODULE_FIRMWARE declarations for firmware referenced in
      btrtl.c.
      Signed-off-by: Dan Gora <dan.gora@gmail.com>
      Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Add MODULE_FIRMWARE() for FIRMWARE_TG357766. · 046f753d
      Tobias Heider authored
      Fixes a bug where on the M1 mac mini initramfs-tools fails to
      include the necessary firmware into the initrd.
      
      Fixes: c4dab506 ("tg3: Download 57766 EEE service patch firmware")
      Signed-off-by: Tobias Heider <me@tobhe.de>
      Reviewed-by: Michael Chan <michael.chan@broadcom.com>
      Link: https://lore.kernel.org/r/ZJt7LKzjdz8+dClx@tobhe.de
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Merge branch 'fix-ptp-received-on-wrong-port-with-bridged-sja1105-dsa' · 5998bb76
      Paolo Abeni authored
      Vladimir Oltean says:
      
      ====================
      Fix PTP received on wrong port with bridged SJA1105 DSA
      
      Since the changes were made to tag_8021q to support imprecise RX for
      bridged ports, the tag_sja1105 driver still prefers the source port
      information deduced from the VLAN headers for link-local traffic, even
      though the switch can theoretically do better and report the precise
      source port.
      
      The problem is that the tagger doesn't know when to trust one source of
      information over another, because the INCL_SRCPT option (to "tag" link
      local frames) is sometimes enabled and sometimes it isn't.
      
      The first patch makes the switch provide the hardware tag for link local
      traffic under all circumstances, and the second patch makes the tagger
      always use that hardware tag as primary source of information for link
      local packets.
      ====================
      
      Link: https://lore.kernel.org/r/20230627094207.3385231-1-vladimir.oltean@nxp.com
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • net: dsa: tag_sja1105: always prefer source port information from INCL_SRCPT · c1ae02d8
      Vladimir Oltean authored
      Currently the sja1105 tagging protocol prefers using the source port
      information from the VLAN header if that is available, falling back to
      the INCL_SRCPT option if it isn't. The VLAN header is available for all
      frames except for META frames initiated by the switch (containing RX
      timestamps), and thus, the "if (is_link_local)" branch is practically
      dead.
      
      The tag_8021q source port identification has become more loose
      ("imprecise") and will report a plausible rather than exact bridge port,
      when under a bridge (be it VLAN-aware or VLAN-unaware). But link-local
      traffic always needs to know the precise source port. With incorrect
      source port reporting, for example PTP traffic over 2 bridged ports will
      all be seen on sockets opened on the first such port, which is incorrect.
      
      Now that the tagging protocol has been changed to make link-local frames
      always contain source port information, we can reverse the order of the
      checks so that we always give precedence to that information (which is
      always precise) in lieu of the tag_8021q VID which is only precise for a
      standalone port.
      
      Fixes: d7f9787a ("net: dsa: tag_8021q: add support for imprecise RX based on the VBID")
      Fixes: 91495f21 ("net: dsa: tag_8021q: replace the SVL bridging with VLAN-unaware IVL bridging")
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: Simon Horman <simon.horman@corigine.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • net: dsa: sja1105: always enable the INCL_SRCPT option · b4638af8
      Vladimir Oltean authored
      Link-local traffic on bridged SJA1105 ports is sometimes tagged by the
      hardware with source port information (when the port is under a VLAN
      aware bridge).
      
      The tag_8021q source port identification has become more loose
      ("imprecise") and will report a plausible rather than exact bridge port,
      when under a bridge (be it VLAN-aware or VLAN-unaware). But link-local
      traffic always needs to know the precise source port.
      
      Modify the driver logic (and therefore: the tagging protocol itself) to
      always include the source port information with link-local packets,
      regardless of whether the port is standalone, under a VLAN-aware or
      VLAN-unaware bridge. This makes it possible for the tagging driver to
      give priority to that information over the tag_8021q VLAN header.
      
      The big drawback with INCL_SRCPT is that it makes it impossible to
      distinguish between an original MAC DA of 01:80:C2:XX:YY:ZZ and
      01:80:C2:AA:BB:ZZ, because the tagger just patches MAC DA bytes 3 and 4
      with zeroes. Only if PTP RX timestamping is enabled, the switch will
      generate a META follow-up frame containing the RX timestamp and the
      original bytes 3 and 4 of the MAC DA. Those will be used to patch up the
      original packet. Nonetheless, in the absence of PTP RX timestamping, we
      have to live with this limitation, since it is more important to have
      the more precise source port information for link-local traffic.
      
      Fixes: d7f9787a ("net: dsa: tag_8021q: add support for imprecise RX based on the VBID")
      Fixes: 91495f21 ("net: dsa: tag_8021q: replace the SVL bridging with VLAN-unaware IVL bridging")
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: Simon Horman <simon.horman@corigine.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
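The INCL_SRCPT drawback described in the sja1105 commit can be illustrated with a small userspace sketch. It assumes, for illustration only, that the switch writes the source port into MAC DA byte 3 and the switch ID into byte 4 (0-based); the struct and function names are hypothetical. Once the tagger reads those bytes and zeroes them, the original values are gone, so two frames that differed only in those bytes look identical afterwards.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ETH_ALEN 6

/* Hypothetical result type for this sketch. */
struct srcpt_info {
	uint8_t source_port;
	uint8_t switch_id;
};

/*
 * Assumption for illustration: the switch has overwritten MAC DA bytes 3
 * and 4 with the source port and the switch ID. Read them out, then zero
 * them, which is the "patching with zeroes" the commit message mentions.
 */
static struct srcpt_info extract_srcpt(uint8_t *dmac)
{
	struct srcpt_info info;

	assert(dmac != NULL);

	info.source_port = dmac[3];
	info.switch_id = dmac[4];

	/* The original contents of these two bytes cannot be recovered here. */
	dmac[3] = 0;
	dmac[4] = 0;

	return info;
}
```

According to the commit message, the original bytes can only be restored from the META follow-up frame, which the switch generates when PTP RX timestamping is enabled.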
    • Merge branch 'fix-ptp-packet-drops-with-ocelot-8021q-dsa-tag-protocol' · e999c897
      Paolo Abeni authored
      Vladimir Oltean says:
      
      ====================
      Fix PTP packet drops with ocelot-8021q DSA tag protocol
      
      Changes in v2:
      - Distinguish between L2 and L4 PTP packets
      v1 at:
      https://lore.kernel.org/netdev/20230626154003.3153076-1-vladimir.oltean@nxp.com/
      
      Patch 3/3 fixes an issue with the ocelot/felix driver, where it would
      drop PTP traffic on RX unless hardware timestamping for that packet type
      was enabled.
      
      Fixing that requires the driver to know whether it had previously
      configured the hardware to timestamp PTP packets on that port. But it
      cannot correctly determine that today using the existing code structure,
      so patches 1/3 and 2/3 fix the control path of the code such that
      ocelot->ports[port]->trap_proto faithfully reflects whether that
      configuration took place.
      ====================
      
      Link: https://lore.kernel.org/r/20230627163114.3561597-1-vladimir.oltean@nxp.com
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • net: dsa: felix: don't drop PTP frames with tag_8021q when RX timestamping is disabled · 2edcfcbb
      Vladimir Oltean authored
      The driver implements a workaround for the fact that it doesn't have an
      IRQ source to tell it whether PTP frames are available through the
      extraction registers, for those frames to be processed and passed
      towards the network stack. That workaround is to configure the switch,
      through felix_hwtstamp_set() -> felix_update_trapping_destinations(),
      to create two copies of PTP packets: one sent over Ethernet to the DSA
      master, and one to be consumed through the aforementioned CPU extraction
      queue registers.
      
      The reason why we want PTP packets to be consumed through the CPU
      extraction registers in the first place is because we want to see their
      hardware RX timestamp. With tag_8021q, that is only visible that way,
      and it isn't visible with the copy of the packet that's transmitted over
      Ethernet.
      
      The problem with the workaround implementation is that it drops the
      packet received over Ethernet, in expectation of its copy being present
      in the CPU extraction registers. However, if felix_hwtstamp_set() hasn't
      run (aka PTP RX timestamping is disabled), the driver will drop the
      original PTP frame and there will be no copy of it in the CPU extraction
      registers. So, the network stack will simply not see any PTP frame.
      
      Look at the port's trapping configuration to see whether the driver has
      previously enabled the CPU extraction registers. If it hasn't, just
      don't RX timestamp the frame and let it be passed up the stack by DSA,
      which is perfectly fine.
      
      Fixes: 0a6f17c6 ("net: dsa: tag_ocelot_8021q: add support for PTP timestamping")
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • net: mscc: ocelot: don't keep PTP configuration of all ports in single structure · 45d0fcb5
      Vladimir Oltean authored
      In a future change, the driver will need to determine whether PTP RX
      timestamping is enabled on a port (including whether traps were set up
      on that port in particular) and that is currently not possible.
      
      The driver supports different RX filters (L2, L4) and kinds of TX
      timestamping (one-step, two-step) on its ports, but it saves all
      configuration in a single struct hwtstamp_config that is global to the
      switch. So, the latest timestamping configuration on one port
      (including a request to disable timestamping) affects what gets reported
      for all ports, even though the configuration itself is still individual
      to each port.
      
      The port timestamping configurations are only coupled because of the
      common structure, so replace the hwtstamp_config with a mask of trapped
      protocols saved per port. We also have the ptp_cmd to distinguish
      between one-step and two-step PTP timestamping, so with those 2 bits of
      information we can fully reconstruct a descriptive struct
      hwtstamp_config for each port, during the SIOCGHWTSTAMP ioctl.
      
      Fixes: 4e3b0468 ("net: mscc: PTP Hardware Clock (PHC) support")
      Fixes: 96ca08c0 ("net: mscc: ocelot: set up traps for PTP packets")
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
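The per-port reconstruction described in the ocelot commit can be sketched as follows. This is a simplified userspace model, not the driver code: every type, enum value, and function name below is hypothetical. It only shows how a per-port mask of trapped protocols plus a per-port PTP command is enough to rebuild a timestamping configuration for that port without consulting any switch-wide state.

```c
#include <assert.h>
#include <stddef.h>

/* All names below are hypothetical and exist only for this sketch. */
enum trap_proto {
	TRAP_PTP_L2 = 1 << 0,	/* PTP over Ethernet is trapped */
	TRAP_PTP_L4 = 1 << 1,	/* PTP over UDP is trapped */
};

enum ptp_cmd { PTP_CMD_NONE, PTP_CMD_TWO_STEP, PTP_CMD_ONE_STEP };
enum tx_type { TX_OFF, TX_ON, TX_ONESTEP_SYNC };
enum rx_filter { RX_NONE, RX_PTP_L2_EVENT, RX_PTP_L4_EVENT, RX_PTP_EVENT };

/* State kept per port instead of one structure shared by the switch. */
struct port_state {
	unsigned int trap_proto;
	enum ptp_cmd ptp_cmd;
};

struct ts_config {
	enum tx_type tx_type;
	enum rx_filter rx_filter;
};

/* Rebuild the reported configuration from this port's own state only. */
static struct ts_config port_get_ts_config(const struct port_state *port)
{
	struct ts_config cfg = { TX_OFF, RX_NONE };
	int l2, l4;

	assert(port != NULL);

	if (port->ptp_cmd == PTP_CMD_TWO_STEP)
		cfg.tx_type = TX_ON;
	else if (port->ptp_cmd == PTP_CMD_ONE_STEP)
		cfg.tx_type = TX_ONESTEP_SYNC;

	l2 = (port->trap_proto & TRAP_PTP_L2) != 0;
	l4 = (port->trap_proto & TRAP_PTP_L4) != 0;

	if (l2 && l4)
		cfg.rx_filter = RX_PTP_EVENT;
	else if (l2)
		cfg.rx_filter = RX_PTP_L2_EVENT;
	else if (l4)
		cfg.rx_filter = RX_PTP_L4_EVENT;

	return cfg;
}
```

Because each port carries its own state, changing or disabling timestamping on one port no longer alters what is reported for another.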
    • net: mscc: ocelot: don't report that RX timestamping is enabled by default · 4fd44b82
      Vladimir Oltean authored
      PTP RX timestamping should be enabled when the user requests it, not by
      default. If it is enabled by default, it can be problematic when the
      ocelot driver is a DSA master, and it sidesteps what DSA tries to avoid
      through __dsa_master_hwtstamp_validate().
      
      Additionally, after the change which made ocelot trap PTP packets only
      to the CPU at ocelot_hwtstamp_set() time, it is no longer even true that
      RX timestamping is enabled by default, because until ocelot_hwtstamp_set()
      is called, the PTP traps are actually not set up. So the rx_filter field
      of ocelot->hwtstamp_config reflects an incorrect reality.
      
      Fixes: 96ca08c0 ("net: mscc: ocelot: set up traps for PTP packets")
      Fixes: 4e3b0468 ("net: mscc: PTP Hardware Clock (PHC) support")
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • Merge branch 'net-sched-act_ipt-bug-fixes' · 3c4bb45a
      Paolo Abeni authored
      Florian Westphal says:
      
      ====================
      net/sched: act_ipt bug fixes
      
      v3: prefer skb_header() helper in patch 2.  No other changes.
      I've retained Acks and RvB-Tags of v2.
      
      While checking if netfilter could be updated to replace selected
      instances of NF_DROP with kfree_skb_reason+NF_STOLEN to improve
      debugging info via drop monitor I found that act_ipt is incompatible
      with such an approach.  Moreover, it lacks multiple sanity checks
      to avoid certain code paths that make assumptions that the tc layer
      doesn't meet, such as header sanity checks, availability of skb_dst,
      skb_nfct() and so on.
      
      The act_ipt test in the tc selftests still passes with this applied.
      
      I think that we should consider removal of this module: while
      this should take care of all problems, it's IPv4-only and I don't
      think there are any netfilter targets that lack a native tc
      equivalent, even when ignoring bpf.
      ====================
      
      Link: https://lore.kernel.org/r/20230627123813.3036-1-fw@strlen.de
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • net/sched: act_ipt: zero skb->cb before calling target · 93d75d47
      Florian Westphal authored
      xtables relies on skb being owned by ip stack, i.e. with ipv4
      check in place skb->cb is supposed to be IPCB.
      
      I don't see an immediate problem (REJECT target cannot be used anymore
      now that PRE/POSTROUTING hook validation has been fixed), but better be
      safe than sorry.
      
      A much better patch would be to either mark act_ipt as
      "depends on BROKEN" or remove it altogether. I plan to do this
      for -next in the near future.
      
      This tc extension is broken in the sense that tc lacks an
      equivalent of NF_STOLEN verdict.
      
      With NF_STOLEN, target function takes complete ownership of skb, caller
      cannot dereference it anymore.
      
      ACT_STOLEN cannot be used for this: it has a different meaning, caller
      is allowed to dereference the skb.
      
      At this time NF_STOLEN won't be returned by any targets as far as I can
      see, but this may change in the future.
      
      It might be possible to work around this via list of allowed
      target extensions known to only return DROP or ACCEPT verdicts, but this
      is error prone/fragile.
      
      Existing selftest only validates xt_LOG and act_ipt is restricted
      to ipv4 so I don't think this action is used widely.
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Reviewed-by: Simon Horman <simon.horman@corigine.com>
      Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • net/sched: act_ipt: add sanity checks on skb before calling target · b2dc32dc
      Florian Westphal authored
      Netfilter targets make assumptions on the skb state, for example
      iphdr is supposed to be in the linear area.
      
      This is normally done by IP stack, but in act_ipt case no
      such checks are made.
      
      Some targets can even assume that skb_dst will be valid.
      Make a minimum effort to check for this:
      
      - Don't call the targets eval function for non-ipv4 skbs.
      - Don't call the targets eval function for POSTROUTING
        emulation when the skb has no dst set.
      
      v3: use skb_protocol helper (Davide Caratti)
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Reviewed-by: Simon Horman <simon.horman@corigine.com>
      Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • net/sched: act_ipt: add sanity checks on table name and hook locations · b4ee9338
      Florian Westphal authored
      Looks like "tc" hard-codes "mangle" as the only supported table
      name, but on kernel side there are no checks.
      
      This is wrong.  Not all xtables targets are safe to call from tc.
      E.g. "nat" targets assume skb has a conntrack object assigned to it.
      Normally those get called from netfilter nat core which consults the
      nat table to obtain the address mapping.
      
      "tc" userspace either sets PRE or POSTROUTING as hook number, but there
      is no validation of this on kernel side, so update netlink policy to
      reject bogus numbers.  Some targets may assume skb_dst is set for
      input/forward hooks, so prevent those from being used.
      
      act_ipt uses the hook number in two places:
      1. the state hook number, this is fine as-is
      2. to set par.hook_mask
      
      The latter is a bit mask, so update the assignment to make
      xt_check_target() do the right thing.
      
      Followup patch adds required checks for the skb/packet headers before
      calling the targets evaluation function.
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Reviewed-by: Simon Horman <simon.horman@corigine.com>
      Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      b4ee9338
    • Chengfeng Ye's avatar
      sctp: fix potential deadlock on &net->sctp.addr_wq_lock · 6feb37b3
      Chengfeng Ye authored
      As &net->sctp.addr_wq_lock is also acquired by the timer
      sctp_addr_wq_timeout_handler() in protocol.c, the same lock acquisition
      at sctp_auto_asconf_init() seems to need to disable irq, since it is
      called from sctp_accept() under process context.
      
      Possible deadlock scenario:
      sctp_accept()
          -> sctp_sock_migrate()
          -> sctp_auto_asconf_init()
          -> spin_lock(&net->sctp.addr_wq_lock)
              <timer interrupt>
              -> sctp_addr_wq_timeout_handler()
              -> spin_lock_bh(&net->sctp.addr_wq_lock); (deadlock here)
      
      This flaw was found using an experimental static analysis tool we are
      developing for irq-related deadlock.
      
      The tentative patch fixes the potential deadlock by using spin_lock_bh().
      Signed-off-by: Chengfeng Ye <dg573847474@gmail.com>
      Fixes: 34e5b011 ("sctp: delay auto_asconf init until binding the first addr")
      Acked-by: Xin Long <lucien.xin@gmail.com>
      Link: https://lore.kernel.org/r/20230627120340.19432-1-dg573847474@gmail.com
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      6feb37b3
    • Moritz Fischer's avatar
      net: lan743x: Don't sleep in atomic context · 7a8227b2
      Moritz Fischer authored
      dev_set_rx_mode() grabs a spin_lock, and the lan743x implementation
      proceeds subsequently to go to sleep using readx_poll_timeout().
      
      Introduce a helper wrapping the readx_poll_timeout_atomic() function
      and use it to replace the calls to readx_poll_timeout().
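
The difference between the two polling styles is that the atomic variant busy-waits between reads instead of sleeping, so it is safe while a spinlock is held. The following userspace sketch models that idea only; it is not the lan743x helper, and `model_poll_atomic()`, `model_read_fn` and `struct model_reg` are hypothetical names introduced for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical register read callback; a driver would read MMIO here. */
typedef uint32_t (*model_read_fn)(void *ctx);

/*
 * Poll until (value & mask) is non-zero or the poll budget runs out.
 * Nothing in the loop may sleep; kernel code would busy-wait with a
 * short delay between reads. Returns 0 on success, -1 on timeout.
 */
static int model_poll_atomic(model_read_fn read, void *ctx, uint32_t mask,
			     unsigned int max_polls, uint32_t *last)
{
	for (unsigned int i = 0; i < max_polls; i++) {
		*last = read(ctx);
		if (*last & mask)
			return 0;
	}
	return -1;
}

/* Fake register that reports "ready" (bit 0) from the Nth read onward. */
struct model_reg {
	unsigned int reads;
	unsigned int ready_after;
};

static uint32_t model_reg_read(void *ctx)
{
	struct model_reg *reg = ctx;

	reg->reads++;
	return reg->reads >= reg->ready_after ? 1u : 0u;
}
```

A bounded poll count plays the role of the timeout here; the real kernel helper takes its timeout in microseconds rather than a number of reads.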
      
      Fixes: 23f0703c ("lan743x: Add main source files for new lan743x driver")
      Cc: stable@vger.kernel.org
      Cc: Bryan Whitehead <bryan.whitehead@microchip.com>
      Cc: UNGLinuxDriver@microchip.com
      Signed-off-by: Moritz Fischer <moritzf@google.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Link: https://lore.kernel.org/r/20230627035000.1295254-1-moritzf@google.com
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      7a8227b2
  2. 28 Jun, 2023 16 commits
    • Linus Torvalds's avatar
      Merge tag 'net-next-6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next · 3a8a670e
      Linus Torvalds authored
      Pull networking changes from Jakub Kicinski:
       "WiFi 7 and sendpage changes are the biggest pieces of work for this
        release. The latter will definitely require fixes but I think that we
        got it to a reasonable point.
      
        Core:
      
         - Rework the sendpage & splice implementations
      
           Instead of feeding data into sockets page by page extend sendmsg
           handlers to support taking a reference on the data, controlled by a
           new flag called MSG_SPLICE_PAGES
      
           Rework the handling of unexpected-end-of-file to invoke an
           additional callback instead of trying to predict what the right
           combination of MORE/NOTLAST flags is
      
           Remove the MSG_SENDPAGE_NOTLAST flag completely
      
         - Implement SCM_PIDFD, a new type of CMSG type analogous to
           SCM_CREDENTIALS, but it contains pidfd instead of plain pid
      
         - Enable socket busy polling with CONFIG_RT
      
         - Improve reliability and efficiency of reporting for ref_tracker
      
         - Auto-generate a user space C library for various Netlink families
      
        Protocols:
      
         - Allow TCP to shrink the advertised window when necessary, prevent
           sk_rcvbuf auto-tuning from growing the window all the way up to
           tcp_rmem[2]
      
         - Use per-VMA locking for "page-flipping" TCP receive zerocopy
      
         - Prepare TCP for device-to-device data transfers, by making sure
           that payloads are always attached to skbs as page frags
      
         - Make the backoff time for the first N TCP SYN retransmissions
           linear. Exponential backoff is unnecessarily conservative
      
         - Create a new MPTCP getsockopt to retrieve all info
           (MPTCP_FULL_INFO)
      
         - Avoid waking up applications using TLS sockets until we have a full
           record
      
         - Allow using kernel memory for protocol ioctl callbacks, paving the
           way to issuing ioctls over io_uring
      
         - Add nolocalbypass option to VxLAN, forcing packets to be fully
           encapsulated even if they are destined for a local IP address
      
         - Make TCPv4 use consistent hash in TIME_WAIT and SYN_RECV. Ensure
           in-kernel ECMP implementation (e.g. Open vSwitch) select the same
           link for all packets. Support L4 symmetric hashing in Open vSwitch
      
         - PPPoE: make number of hash bits configurable
      
         - Allow DNS to be overwritten by DHCPACK in the in-kernel DHCP client
           (ipconfig)
      
         - Add layer 2 miss indication and filtering, allowing higher layers
           (e.g. ACL filters) to make forwarding decisions based on whether
           packet matched forwarding state in lower devices (bridge)
      
         - Support matching on Connectivity Fault Management (CFM) packets
      
         - Hide the "link becomes ready" IPv6 messages by demoting their
           printk level to debug
      
         - HSR: don't enable promiscuous mode if device offloads the proto
      
         - Support active scanning in IEEE 802.15.4
      
         - Continue work on Multi-Link Operation for WiFi 7
      
        BPF:
      
         - Add precision propagation for subprogs and callbacks. This allows
           maintaining verification efficiency when subprograms are used, or
           in fact passing the verifier at all for complex programs,
           especially those using open-coded iterators
      
         - Improve BPF's {g,s}etsockopt() length handling. Previously BPF
           assumed the length is always equal to the amount of written data.
           But some protos allow passing a NULL buffer to discover what the
           output buffer *should* be, without writing anything
      
         - Accept dynptr memory as memory arguments passed to helpers
      
         - Add routing table ID to bpf_fib_lookup BPF helper
      
         - Support O_PATH FDs in BPF_OBJ_PIN and BPF_OBJ_GET commands
      
         - Drop bpf_capable() check in BPF_MAP_FREEZE command (used to mark
           maps as read-only)
      
         - Show target_{obj,btf}_id in tracing link fdinfo
      
         - Addition of several new kfuncs (most of the names are
           self-explanatory):
            - Add a set of new dynptr kfuncs: bpf_dynptr_adjust(),
              bpf_dynptr_is_null(), bpf_dynptr_is_rdonly(), bpf_dynptr_size()
              and bpf_dynptr_clone().
            - bpf_task_under_cgroup()
            - bpf_sock_destroy() - force closing sockets
            - bpf_cpumask_first_and(), rework bpf_cpumask_any*() kfuncs
      
        Netfilter:
      
         - Relax set/map validation checks in nf_tables. Allow checking
           presence of an entry in a map without using the value
      
         - Increase ip_vs_conn_tab_bits range for 64BIT builds
      
         - Allow updating size of a set
      
         - Improve NAT tuple selection when connection is closing
      
        Driver API:
      
         - Integrate netdev with LED subsystem, to allow configuring HW
           "offloaded" blinking of LEDs based on link state and activity
           (i.e. packets coming in and out)
      
         - Support configuring rate selection pins of SFP modules
      
         - Factor Clause 73 auto-negotiation code out of the drivers, provide
           common helper routines
      
         - Add more fool-proof helpers for managing lifetime of MDIO devices
           associated with the PCS layer
      
         - Allow drivers to report advanced statistics related to Time Aware
           scheduler offload (taprio)
      
         - Allow opting out of VF statistics in link dump, to allow more VFs
           to fit into the message
      
         - Split devlink instance and devlink port operations
      
        New hardware / drivers:
      
         - Ethernet:
            - Synopsys EMAC4 IP support (stmmac)
            - Marvell 88E6361 8 port (5x1GE + 3x2.5GE) switches
            - Marvell 88E6250 7 port switches
            - Microchip LAN8650/1 Rev.B0 PHYs
            - MediaTek MT7981/MT7988 built-in 1GE PHY driver
      
         - WiFi:
            - Realtek RTL8192FU, 2.4 GHz, b/g/n mode, 2T2R, 300 Mbps
            - Realtek RTL8723DS (SDIO variant)
            - Realtek RTL8851BE
      
         - CAN:
            - Fintek F81604
      
        Drivers:
      
         - Ethernet NICs:
            - Intel (100G, ice):
               - support dynamic interrupt allocation
               - use meta data match instead of VF MAC addr on slow-path
            - nVidia/Mellanox:
               - extend link aggregation to handle 4, rather than just 2 ports
               - spawn sub-functions without any features by default
            - OcteonTX2:
               - support HTB (Tx scheduling/QoS) offload
               - make RSS hash generation configurable
               - support selecting Rx queue using TC filters
            - Wangxun (ngbe/txgbe):
               - add basic Tx/Rx packet offloads
               - add phylink support (SFP/PCS control)
            - Freescale/NXP (enetc):
               - report TAPRIO packet statistics
            - Solarflare/AMD:
               - support matching on IP ToS and UDP source port of outer
                 header
               - VxLAN and GENEVE tunnel encapsulation over IPv4 or IPv6
               - add devlink dev info support for EF10
      
         - Virtual NICs:
            - Microsoft vNIC:
               - size the Rx indirection table based on requested
                 configuration
               - support VLAN tagging
            - Amazon vNIC:
               - try to reuse Rx buffers if not fully consumed, useful for ARM
                 servers running with 16kB pages
            - Google vNIC:
               - support TCP segmentation of >64kB frames
      
         - Ethernet embedded switches:
            - Marvell (mv88e6xxx):
               - enable USXGMII (88E6191X)
            - Microchip:
               - lan966x: add support for Egress Stage 0 ACL engine
               - lan966x: support mapping packet priority to internal switch
                 priority (based on PCP or DSCP)
      
         - Ethernet PHYs:
            - Broadcom PHYs:
               - support for Wake-on-LAN for BCM54210E/B50212E
               - report LPI counter
            - Microsemi PHYs: support RGMII delay configuration (VSC85xx)
            - Micrel PHYs: receive timestamp in the frame (LAN8841)
            - Realtek PHYs: support optional external PHY clock
            - Altera TSE PCS: merge the driver into Lynx PCS which it is a
              variant of
      
         - CAN: Kvaser PCIEcan:
            - support packet timestamping
      
         - WiFi:
            - Intel (iwlwifi):
               - major update for new firmware and Multi-Link Operation (MLO)
               - configuration rework to drop test devices and split the
                 different families
               - support for segmented PNVM images and power tables
               - new vendor entries for PPAG (platform antenna gain) feature
            - Qualcomm 802.11ax (ath11k):
               - Multiple Basic Service Set Identifier (MBSSID) and Enhanced
                 MBSSID Advertisement (EMA) support in AP mode
               - support factory test mode
            - RealTek (rtw89):
               - add RSSI based antenna diversity
               - support U-NII-4 channels on 5 GHz band
            - RealTek (rtl8xxxu):
               - AP mode support for 8188f
               - support USB RX aggregation for the newer chips"
      
      * tag 'net-next-6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1602 commits)
        net: scm: introduce and use scm_recv_unix helper
        af_unix: Skip SCM_PIDFD if scm->pid is NULL.
        net: lan743x: Simplify comparison
        netlink: Add __sock_i_ino() for __netlink_diag_dump().
        net: dsa: avoid suspicious RCU usage for synced VLAN-aware MAC addresses
        Revert "af_unix: Call scm_recv() only after scm_set_cred()."
        phylink: ReST-ify the phylink_pcs_neg_mode() kdoc
        libceph: Partially revert changes to support MSG_SPLICE_PAGES
        net: phy: mscc: fix packet loss due to RGMII delays
        net: mana: use vmalloc_array and vcalloc
        net: enetc: use vmalloc_array and vcalloc
        ionic: use vmalloc_array and vcalloc
        pds_core: use vmalloc_array and vcalloc
        gve: use vmalloc_array and vcalloc
        octeon_ep: use vmalloc_array and vcalloc
        net: usb: qmi_wwan: add u-blox 0x1312 composition
        perf trace: fix MSG_SPLICE_PAGES build error
        ipvlan: Fix return value of ipvlan_queue_xmit()
        netfilter: nf_tables: fix underflow in chain reference counter
        netfilter: nf_tables: unbind non-anonymous set if rule construction fails
        ...
      3a8a670e
    • Linus Torvalds's avatar
      Merge tag 'v6.5-rc1-sysctl-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux · 6a8cbd92
      Linus Torvalds authored
      Pull sysctl updates from Luis Chamberlain:
       "The changes for sysctl are in line with prior efforts to stop usage of
        deprecated routines which incur recursion and also make it hard to
        remove the empty array element in each sysctl array declaration.
      
        The most difficult user to modify was parport which required a bit of
        re-thinking of how to declare shared sysctls there, Joel Granados has
        stepped up to the plate to do most of this work and eventual removal
        of register_sysctl_table(). That work ended up saving us about 1465
        bytes according to bloat-o-meter. Since we gained a few bloat-o-meter
        karma points I moved two rather small sysctl arrays from
        kernel/sysctl.c leaving us only two more sysctl arrays to move left.
      
        Most changes have been tested on linux-next for about a month. The
        last straggler patches are a minor parport fix, changes to the sysctl
        kernel selftest so as to verify correctness and prevent regressions for
        the future change he made to provide an alternative solution for the
        special sysctl mount point target which was using the now deprecated
        sysctl child element.
      
        This is all prep work to now finally be able to remove the empty array
        element in all sysctl declarations / registrations which is expected
        to save us a bit of bytes all over the kernel. That work will be
        tested early after v6.5-rc1 is out"
      
      * tag 'v6.5-rc1-sysctl-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux:
        sysctl: replace child with an enumeration
        sysctl: Remove debugging dump_stack
        test_sysclt: Test for registering a mount point
        test_sysctl: Add an option to prevent test skip
        test_sysctl: Add an unregister sysctl test
        test_sysctl: Group node sysctl test under one func
        test_sysctl: Fix test metadata getters
        parport: plug a sysctl register leak
        sysctl: move security keys sysctl registration to its own file
        sysctl: move umh sysctl registration to its own file
        signal: move show_unhandled_signals sysctl to its own file
        sysctl: remove empty dev table
        sysctl: Remove register_sysctl_table
        sysctl: Refactor base paths registrations
        sysctl: stop exporting register_sysctl_table
        parport: Removed sysctl related defines
        parport: Remove register_sysctl_table from parport_default_proc_register
        parport: Remove register_sysctl_table from parport_device_proc_register
        parport: Remove register_sysctl_table from parport_proc_register
        parport: Move magic number "15" to a define
      6a8cbd92
    • Linus Torvalds's avatar
      Merge tag 'v6.5-rc1-modules-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux · 4e3c09e9
      Linus Torvalds authored
      Pull module updates from Luis Chamberlain:
       "The changes queued up for modules are pretty tame, mostly code removal
        or moving of code.
      
        Only two minor functional changes are made, the only one which stands
        out is Sebastian Andrzej Siewior's simplification of module reference
        counting by removing preempt_disable() and that has been tested on
        linux-next for well over a month without any regressions.
      
        I'm now, I guess, also a kitchen sink for some kallsyms changes"
      
      [ There was a mis-communication about the concurrent module load changes
        that I had expected to come through Luis despite me authoring the
        patch. So some of the module updates were left hanging in the email
        ether, and I just committed them separately.
      
        It's my bad - I should have made it more clear that I expected my
        own patches to come through the module tree too. Now they missed
        linux-next, but hopefully that won't cause any issues    - Linus ]
      
      * tag 'v6.5-rc1-modules-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux:
        kallsyms: make kallsyms_show_value() as generic function
        kallsyms: move kallsyms_show_value() out of kallsyms.c
        kallsyms: remove unsed API lookup_symbol_attrs
        kallsyms: remove unused arch_get_kallsym() helper
        module: Remove preempt_disable() from module reference counting.
      4e3c09e9
    • Linus Torvalds's avatar
      modules: catch concurrent module loads, treat them as idempotent · 9b9879fc
      Linus Torvalds authored
      This is the new-and-improved attempt at avoiding huge memory load spikes
      when the user space boot sequence tries to load hundreds (or even
      thousands) of redundant duplicate modules in parallel.
      
      See commit 9828ed3f ("module: error out early on concurrent load of
      the same module file") for background and an earlier failed attempt that
      was reverted.
      
      That earlier attempt just said "concurrently loading the same module is
      silly, just open the module file exclusively and return -ETXTBSY if
      somebody else is already loading it".
      
      While it is true that concurrent module loads of the same module are
      silly, the reason that earlier attempt then failed was that the
      concurrently loaded module would often be a prerequisite for another
      module.
      
      Thus failing to load the prerequisite would then cause cascading
      failures of the other modules, rather than just short-circuiting that
      one unnecessary module load.
      
      At the same time, we still really don't want to load the contents of the
      same module file hundreds of times, only to then wait for an eventually
      successful load, and have everybody else return -EEXIST.
      
      As a result, this takes another approach, and treats concurrent module
      loads from the same file as "idempotent" in the inode.  So if one module
      load is ongoing, we don't start a new one, but instead just wait for the
      first one to complete and return the same return value as it did.
      
      So unlike the first attempt, this does not return early: the intent is
      not to speed up the boot, but to avoid a thundering herd problem in
      allocating memory (both physical and virtual) for a module more than
      once.
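
The "one load per file, everybody shares the result" idea can be sketched as a small table of in-flight loads keyed by an opaque cookie (the commit keys on the inode). This is a single-threaded model under stated assumptions, not the kernel implementation: all `model_*` names are hypothetical, and where the kernel would make later callers wait for completion, the model simply lets them read the result once the owner has recorded it.

```c
#include <stdbool.h>
#include <stddef.h>

#define MODEL_SLOTS 8

/* One in-flight load, identified by an opaque cookie. */
struct model_load {
	const void *cookie;
	bool in_use;
	bool done;
	int ret;
};

static struct model_load model_table[MODEL_SLOTS];

/*
 * Find the in-flight load for cookie, or start tracking a new one.
 * *owner is true only for the caller that must actually do the load.
 * Returns NULL when the table is full.
 */
static struct model_load *model_begin(const void *cookie, bool *owner)
{
	struct model_load *free_slot = NULL;

	*owner = false;
	for (size_t i = 0; i < MODEL_SLOTS; i++) {
		struct model_load *e = &model_table[i];

		if (e->in_use && e->cookie == cookie)
			return e; /* somebody else is already loading it */
		if (!e->in_use && !free_slot)
			free_slot = e;
	}
	if (free_slot) {
		free_slot->cookie = cookie;
		free_slot->in_use = true;
		free_slot->done = false;
		free_slot->ret = 0;
		*owner = true;
	}
	return free_slot;
}

/* The owner records the outcome; every sharer then sees the same ret. */
static void model_complete(struct model_load *e, int ret)
{
	e->ret = ret;
	e->done = true;
}
```

Two callers that begin a load for the same cookie get the same entry, only the first is marked as owner, and both observe the return value the owner stores. A caller with a different cookie gets its own entry and does its own load.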
      
      Also note that this does change behavior: it used to be that when you
      had concurrent loads, you'd have one "winner" that would return success,
      and everybody else would return -EEXIST.
      
      In contrast, this idempotent logic goes all Oprah on the problem, and
      says "You are a winner! And you are a winner! We are ALL winners".  But
      since there's no possible actual real semantic difference between "you
      loaded the module" and "somebody else already loaded the module", this
      is more of a feel-good change than an actual honest-to-goodness semantic
      change.
      
      Of course, any true Johnny-come-latelies that don't get caught in the
      concurrency filter will still return -EEXIST.  It's no different from
      not even getting a seat at an Oprah taping.  That's life.
      
      See the long thread on the kernel mailing list about this all, which
      includes some numbers for memory use before and after the patch.
      
      Link: https://lore.kernel.org/lkml/20230524213620.3509138-1-mcgrof@kernel.org/
      Reviewed-by: Johan Hovold <johan@kernel.org>
      Tested-by: Johan Hovold <johan@kernel.org>
      Tested-by: Luis Chamberlain <mcgrof@kernel.org>
      Tested-by: Dan Williams <dan.j.williams@intel.com>
      Tested-by: Rudi Heitbaum <rudi@heitbaum.com>
      Tested-by: David Hildenbrand <david@redhat.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      9b9879fc
    • Linus Torvalds's avatar
      module: split up 'finit_module()' into init_module_from_file() helper · 054a7300
      Linus Torvalds authored
      This will simplify the next step, where we can then key off the inode to
      do one idempotent module load.
      
      Let's do the obvious re-organization in one step, and then the new code
      in another.
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      054a7300
    • Linus Torvalds's avatar
      Merge tag 'mmc-v6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc · 89181f54
      Linus Torvalds authored
      Pull MMC updates from Ulf Hansson:
       "MMC core:
         - Allow synchronous detection of (e)MMC/SD/SDIO cards
         - Fixup error check for ioctls for SPI hosts
         - Disable broken SD-Cache support for Kingston Canvas Go Plus from 2019
         - Disable broken eMMC-Trim support for Kingston EMMC04G-M627
         - Disable broken eMMC-Trim support for Micron MTFC4GACAJCN-1M
      
        MMC host:
         - bcm2835: Convert DT bindings to YAML
         - mmci:
            - Enable asynchronous probe
            - Transform the ux500 HW-busy detection into a proper state machine
            - Add support for SW busy-end timeouts for the ux500 variants
         - mmci_stm32:
            - Add support for sdmmc variant revision v3.0 used on STM32MP25
            - Improve the tuning sequence
         - mtk-sd: Tune polling-period to improve performance
         - sdhci: Fixup DMA configuration for 64-bit DMA mode
         - sdhci-bcm-kona: Convert DT bindings to YAML
         - sdhci-msm:
            - Switch to use the new ICE API
            - Add support for the SC8280XP/IPQ6018/QDU1000/QRU1000 variants
         - sdhci-pci-gli:
            - Add support for SD Express cards for GL9767
            - Add support for the Genesys Logic GL9767 variant"
      
      * tag 'mmc-v6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc: (42 commits)
        dt-bindings: mmc: fsl-imx-esdhc: Add imx6ul support
        mmc: mmci: Add support for SW busy-end timeouts
        mmc: Add MMC_QUIRK_BROKEN_SD_CACHE for Kingston Canvas Go Plus from 11/2019
        mmc: core: disable TRIM on Kingston EMMC04G-M627
        mmc: mmci: stm32: add delay block support for STM32MP25
        mmc: mmci: stm32: prepare other delay block support
        mmc: mmci: stm32: manage block gap hardware flow control
        mmc: mmci: Add support for sdmmc variant revision v3.0
        mmc: mmci: add stm32_idmabsize_align parameter
        dt-bindings: mmc: mmci: Add st,stm32mp25-sdmmc2 compatible
        mmc: core: disable TRIM on Micron MTFC4GACAJCN-1M
        mmc: mmci: Break out a helper function
        mmc: mmci: Use a switch statement machine
        mmc: mmci: Use state machine state as exit condition
        mmc: mmci: Retry the busy start condition
        mmc: mmci: Make busy complete state machine explicit
        mmc: mmci: Break out error check in busy detect
        mmc: mmci: Stash status while waiting for busy
        mmc: mmci: Unwind big if() clause
        mmc: mmci: Clear busy_status when starting command
        ...
      89181f54
    • Linus Torvalds's avatar
      Merge tag 'mtd/for-6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux · 1364b406
      Linus Torvalds authored
      Pull mtd updates from
       "Core MTD changes:
         - otp:
            - Put factory OTP/NVRAM into the entropy pool
            - Clean up on error in mtd_otp_nvmem_add()
      
        MTD devices changes:
         - sm_ftl: Fix typos in comments
         - Use SPDX license headers
         - pismo: Switch back to use i2c_driver's .probe()
         - mtdpart: Drop useless LIST_HEAD
         - st_spi_fsm: Use the devm_clk_get_enabled() helper function
      
        DT binding changes:
         - partitions:
            - Include TP-Link SafeLoader in allowed list
            - Add missing type for "linux,rootfs"
         - Extend the nand node names filter
         - Create a file for raw NAND chip properties
         - Mark nand-ecc-placement deprecated
         - Describe nand-ecc-mode
         - Prevent NAND chip unevaluated properties in all NAND bindings with
           a NAND chip reference.
         - Qcom: Fix a property position
         - Marvell: Convert to YAML DT schema
      
        Raw NAND chip drivers changes:
         - Macronix: OTP access for MX30LFxG18AC
         - Add basic Sandisk manufacturer ops
         - Add support for Sandisk SDTNQGAMA
      
        Raw NAND controller driver changes:
         - Meson:
            - Replace integer consts with proper defines
            - Allow waiting w/o wired ready/busy pin
            - Check buffer length validity
            - Fix unaligned DMA buffers handling
            - dt-bindings: Fix 'nand-rb' property
         - Arasan: Revert "mtd: rawnand: arasan: Prevent an unsupported
           configuration" as this limitation is no longer true thanks to the
           recent efforts in improving the clocks support in this driver
      
        SPI-NAND changes:
         - Gigadevice: add support for GD5F2GQ5xExxH
         - Macronix: Add support for serial NAND flashes"
      
      * tag 'mtd/for-6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux: (38 commits)
        dt-bindings: mtd: marvell-nand: Convert to YAML DT scheme
        dt-bindings: mtd: ti,am654: Prevent unevaluated properties
        dt-bindings: mtd: mediatek: Prevent NAND chip unevaluated properties
        dt-bindings: mtd: mediatek: Reference raw-nand-chip.yaml
        dt-bindings: mtd: stm32: Prevent NAND chip unevaluated properties
        dt-bindings: mtd: rockchip: Prevent NAND chip unevaluated properties
        dt-bindings: mtd: intel: Prevent NAND chip unevaluated properties
        dt-bindings: mtd: denali: Prevent NAND chip unevaluated properties
        dt-bindings: mtd: brcmnand: Prevent NAND chip unevaluated properties
        dt-bindings: mtd: meson: Prevent NAND chip unevaluated properties
        dt-bindings: mtd: sunxi: Prevent NAND chip unevaluated properties
        dt-bindings: mtd: ingenic: Prevent NAND chip unevaluated properties
        dt-bindings: mtd: qcom: Prevent NAND chip unevaluated properties
        dt-bindings: mtd: qcom: Fix a property position
        dt-bindings: mtd: Describe nand-ecc-mode
        dt-bindings: mtd: Mark nand-ecc-placement deprecated
        dt-bindings: mtd: Create a file for raw NAND chip properties
        dt-bindings: mtd: Accept nand related node names
        mtd: sm_ftl: Fix typos in comments
        mtd: otp: clean up on error in mtd_otp_nvmem_add()
        ...
      1364b406
    • Linus Torvalds's avatar
      Merge tag 'spi-v6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi · 84fccbba
      Linus Torvalds authored
      Pull spi updates from Mark Brown:
       "One small core feature this time around but mostly driver improvements
        and additions for SPI:
      
         - Add support for controlling the idle state of MOSI, some systems
           can support this and depending on the system integration may need
           it to avoid glitching in some situations
      
         - Support for polling mode in the S3C64xx driver and DMA on the
           Qualcomm QSPI driver
      
         - Support for several Allwinner SoCs, AMD Pensando Elba, Intel Mount
           Evans, Renesas RZ/V2M, and ST STM32H7"
      
      * tag 'spi-v6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi: (66 commits)
        spi: dt-bindings: atmel,at91rm9200-spi: fix broken sam9x7 compatible
        spi: dt-bindings: atmel,at91rm9200-spi: add sam9x7 compatible
        spi: Add support for Renesas CSI
        spi: dt-bindings: Add bindings for RZ/V2M CSI
        spi: sun6i: Use the new helper to derive the xfer timeout value
        spi: atmel: Prevent false timeouts on long transfers
        spi: dt-bindings: stm32: do not disable spi-slave property for stm32f4-f7
        spi: Create a helper to derive adaptive timeouts
        spi: spi-geni-qcom: correctly handle -EPROBE_DEFER from dma_request_chan()
        spi: stm32: disable spi-slave property for stm32f4-f7
        spi: stm32: introduction of stm32h7 SPI device mode support
        spi: stm32: use dmaengine_terminate_{a}sync instead of _all
        spi: stm32: renaming of spi_master into spi_controller
        spi: dw: Remove misleading comment for Mount Evans SoC
        spi: dt-bindings: snps,dw-apb-ssi: Add compatible for Intel Mount Evans SoC
        spi: dw: Add compatible for Intel Mount Evans SoC
        spi: s3c64xx: Use dev_err_probe()
        spi: s3c64xx: Use the managed spi master allocation function
        spi: spl022: Probe defer is no error
        spi: spi-imx: fix mixing of native and gpio chipselects for imx51/imx53/imx6 variants
        ...
      84fccbba
    • Linus Torvalds's avatar
      Merge tag 'regulator-v6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator · 362067b6
      Linus Torvalds authored
      Pull regulator updates from Mark Brown:
       "This release is almost all drivers, there's some small improvements in
        the core but otherwise everything is updates to drivers, mostly the
        addition of new ones.
      
        There's also a bunch of changes pulled in from the MFD subsystem as
        dependencies, Rockchip and TI core MFD code that the regulator drivers
        depend on.
      
        I've also yet again managed to put a SPI commit in the regulator tree,
        I don't know what it is about those two trees (this for
        spi-geni-qcom).
      
        Summary:
      
         - Support for Renesas RAA215300, Rockchip RK808, Texas Instruments
           TPS6594 and TPS6287x, and X-Powers AXP15060 and AXP313a"
      
      * tag 'regulator-v6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator: (43 commits)
        regulator: Add Renesas PMIC RAA215300 driver
        regulator: dt-bindings: Add Renesas RAA215300 PMIC bindings
        regulator: ltc3676: Use maple tree register cache
        regulator: ltc3589: Use maple tree register cache
        regulator: helper: Document ramp_delay parameter of regulator_set_ramp_delay_regmap()
        regulator: mt6358: Use linear voltage helpers for single range regulators
        regulator: mt6358: Const-ify mt6358_regulator_info data structures
        regulator: mt6358: Drop *_SSHUB regulators
        regulator: mt6358: Merge VCN33_* regulators
        regulator: dt-bindings: mt6358: Drop *_sshub regulators
        regulator: dt-bindings: mt6358: Merge ldo_vcn33_* regulators
        regulator: dt-bindings: pwm-regulator: Add missing type for "pwm-dutycycle-unit"
        regulator: Switch two more i2c drivers back to use .probe()
        spi: spi-geni-qcom: Do not do DMA map/unmap inside driver, use framework instead
        soc: qcom: geni-se: Add interfaces geni_se_tx_init_dma() and geni_se_rx_init_dma()
        regulator: tps6594-regulator: Add driver for TI TPS6594 regulators
        regulator: axp20x: Add AXP15060 support
        regulator: axp20x: Add support for AXP313a variant
        dt-bindings: pfuze100.yaml: Add an entry for interrupts
        regulator: stm32-pwr: Fix regulator disabling
        ...
      362067b6
    • Merge tag 'regmap-v6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap · 4171a9aa
      Linus Torvalds authored
      Pull regmap updates from Mark Brown:
       "Another busy release for regmap with the second half of the maple tree
        register cache implementation, there's some smaller optimisations that
        could be done but this should now be able to replace the rbtree cache
        for most devices.
      
        We also had a followup from Aidan MacDonald's refactoring of some of
        the regmap-irq interfaces, the conversion is complete so the old
        interfaces are removed. This means that even with the new features for
        the maple tree cache we'd have a nice negative diffstat were it not
        for the addition of a bunch more KUnit coverage.
      
        There's one GPIO patch in here, it was a dependency for a cleanup of
        an API in the regmap-irq code for which the gpio-104-dio-48e driver
        was the only user.
      
        Highlights:
      
         - The maple tree cache can now load in default values more
           efficiently, and is capable of syncing multiple registers
           in a single write during cache sync
      
         - More KUnit coverage, including some coverage for raw I/O
           and a dummy RAM backed cache to support it
      
         - Removal of several old interfaces in regmap-irq now all
           users have been modernised"
      
      * tag 'regmap-v6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap: (23 commits)
        regmap: Allow reads from write only registers with the flat cache
        regmap: Drop early readability check
        regmap: Check for register readability before checking cache during read
        regmap: Add test to make sure we don't sync to read only registers
        regmap: Add a test case for write only registers
        regmap: Add test that writes to write only registers are prevented
        regmap: Add debugfs file for forcing field writes
        regmap: Don't check for changes in regcache_set_val()
        regmap: maple: Implement block sync for the maple tree cache
        regmap: Provide basic KUnit coverage for the raw register I/O
        regmap: Provide a ram backed regmap with raw support
        regmap: Add missing cache_only checks
        regmap: regmap-irq: Move handle_post_irq to before pm_runtime_put
        regmap: Load register defaults in blocks rather than register by register
        regmap: mmio: Allow passing an empty config->reg_stride
        regmap-irq: Drop backward compatibility for inverted mask/unmask
        regmap-irq: Minor adjustments to .handle_mask_sync()
        regmap-irq: Remove support for not_fixed_stride
        regmap-irq: Remove type registers
        regmap-irq: Remove virtual registers
        ...
      4171a9aa
    • x86/mem_encrypt: Remove stale mem_encrypt_init() declaration · 1b2c92a1
      Linus Torvalds authored
      The memory encryption initialization logic was moved from init/main.c
      into arch_cpu_finalize_init() in commit 439e1757 ("init, x86: Move
      mem_encrypt_init() into arch_cpu_finalize_init()"), but a stale
      declaration for the init function was left in <linux/init.h>.
      
      That didn't cause any problems if you had X86_MEM_ENCRYPT enabled, which
      apparently everybody involved did have.  See also commit 0a9567ac
      ("x86/mem_encrypt: Unbreak the AMD_MEM_ENCRYPT=n build") in this whole
      sad saga of conflicting declarations for different situations.
      Reported-by: Matthew Wilcox <willy@infradead.org>
      Fixes: 439e1757 ("init, x86: Move mem_encrypt_init() into arch_cpu_finalize_init()")
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      1b2c92a1
    • mm: fix __access_remote_vm() GUP failure case · 6581ccf0
      Linus Torvalds authored
      Commit ca5e8632 ("mm/gup: remove vmas parameter from
      get_user_pages_remote()") removed the vma argument from GUP handling,
      and instead added a helper function (get_user_page_vma_remote()) that
      looks it up separately using 'vma_lookup()'.  And then converted
      existing users that needed a vma to use the helper instead.
      
      However, the helper function intentionally acts exactly like the old
      get_user_pages_remote() did, and only fills in 'vma' on successful page
      lookup.  Fine so far.
      
      However, __access_remote_vm() wants the vma even for the unsuccessful
      case, and used to do a
      
      	vma = vma_lookup(mm, addr);
      
      explicitly to look it up when the get_user_page() failed.
      
      However, that conversion commit incorrectly removed that vma lookup,
      thinking that get_user_page_vma_remote() would have done it.  Not so.
      
      So add the vma_lookup() back in.
      
      Fixes: ca5e8632 ("mm/gup: remove vmas parameter from get_user_pages_remote()")
      Cc: Lorenzo Stoakes <lstoakes@gmail.com>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      6581ccf0
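The failure mode described in the __access_remote_vm() fix above can be modelled in a few lines of userspace C. This is only a sketch under stated assumptions: `struct vma`, `struct mm`, and every `model_*` function are hypothetical stand-ins invented for illustration, not the kernel's definitions. The one property carried over from the commit message is that the helper fills in the vma only on a successful page lookup, so a caller that needs the vma on failure has to look it up itself.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
struct vma { unsigned long start, end; int has_page; };
struct mm  { struct vma *vmas; size_t nr; };

/* Model of vma_lookup(): the vma containing addr, or NULL. */
static struct vma *model_vma_lookup(struct mm *mm, unsigned long addr)
{
    for (size_t i = 0; i < mm->nr; i++)
        if (addr >= mm->vmas[i].start && addr < mm->vmas[i].end)
            return &mm->vmas[i];
    return NULL;
}

/*
 * Model of the helper: returns 1 and sets *vmap on success,
 * returns 0 and leaves *vmap untouched on failure.
 */
static int model_get_page_vma(struct mm *mm, unsigned long addr,
                              struct vma **vmap)
{
    struct vma *vma = model_vma_lookup(mm, addr);

    assert(vmap != NULL);
    if (!vma || !vma->has_page)
        return 0;
    *vmap = vma;
    return 1;
}

/*
 * Model of the caller. With apply_fix == 0 it trusts the helper to have
 * filled in the vma (the regression); with apply_fix != 0 it repeats the
 * lookup on failure (the fix). Returns the vma the caller ends up with.
 */
static struct vma *model_access(struct mm *mm, unsigned long addr,
                                int apply_fix)
{
    struct vma *vma = NULL;

    if (!model_get_page_vma(mm, addr, &vma) && apply_fix)
        vma = model_vma_lookup(mm, addr);
    return vma;
}
```

In this model, an address inside a mapping that has no page yields NULL without the extra lookup and the containing vma with it, which mirrors the behaviour the commit restores.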
    • Merge tag 'mm-nonmm-stable-2023-06-24-19-23' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm · 77b1a7f7
      Linus Torvalds authored
      Pull non-mm updates from Andrew Morton:
      
       - Arnd Bergmann has fixed a bunch of -Wmissing-prototypes in top-level
         directories
      
       - Douglas Anderson has added a new "buddy" mode to the hardlockup
         detector. It permits the detector to work on architectures which
         cannot provide the required interrupts, by having CPUs periodically
         perform checks on other CPUs
      
       - Zhen Lei has enhanced kexec's ability to support two crash regions
      
       - Petr Mladek has done a lot of cleanup on the hard lockup detector's
         Kconfig entries
      
       - And the usual bunch of singleton patches in various places
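The "buddy" idea in the list above (CPUs periodically checking on other CPUs) can be illustrated with a small userspace model. This is a rough sketch, not the kernel's implementation: the per-CPU tick counters, the "next CPU" pairing, and all `model_*` names are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4

static unsigned long ticks[NR_CPUS]; /* bumped by each CPU's own timer */
static unsigned long seen[NR_CPUS];  /* value last observed by the checker */

/* Called from a CPU's periodic timer: proof that the CPU is still alive. */
static void model_timer_tick(int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    ticks[cpu]++;
}

/* Each CPU watches the next one, wrapping around at the end. */
static int model_next_cpu(int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    return (cpu + 1) % NR_CPUS;
}

/*
 * Run on `cpu`: returns true if its buddy has made no progress since the
 * previous check, which this model treats as a suspected hard lockup.
 */
static bool model_buddy_check(int cpu)
{
    int buddy = model_next_cpu(cpu);
    bool stuck = (ticks[buddy] == seen[buddy]);

    seen[buddy] = ticks[buddy];
    return stuck;
}
```

Because the check only reads a counter that another CPU updates, this scheme needs no special interrupt support on the monitored CPU, which is the property the pull message highlights.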
      
      * tag 'mm-nonmm-stable-2023-06-24-19-23' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (72 commits)
        kernel/time/posix-stubs.c: remove duplicated include
        ocfs2: remove redundant assignment to variable bit_off
        watchdog/hardlockup: fix typo in config HARDLOCKUP_DETECTOR_PREFER_BUDDY
        powerpc: move arch_trigger_cpumask_backtrace from nmi.h to irq.h
        devres: show which resource was invalid in __devm_ioremap_resource()
        watchdog/hardlockup: define HARDLOCKUP_DETECTOR_ARCH
        watchdog/sparc64: define HARDLOCKUP_DETECTOR_SPARC64
        watchdog/hardlockup: make HAVE_NMI_WATCHDOG sparc64-specific
        watchdog/hardlockup: declare arch_touch_nmi_watchdog() only in linux/nmi.h
        watchdog/hardlockup: make the config checks more straightforward
        watchdog/hardlockup: sort hardlockup detector related config values a logical way
        watchdog/hardlockup: move SMP barriers from common code to buddy code
        watchdog/buddy: simplify the dependency for HARDLOCKUP_DETECTOR_PREFER_BUDDY
        watchdog/buddy: don't copy the cpumask in watchdog_next_cpu()
        watchdog/buddy: cleanup how watchdog_buddy_check_hardlockup() is called
        watchdog/hardlockup: remove softlockup comment in touch_nmi_watchdog()
        watchdog/hardlockup: in watchdog_hardlockup_check() use cpumask_copy()
        watchdog/hardlockup: don't use raw_cpu_ptr() in watchdog_hardlockup_kick()
        watchdog/hardlockup: HAVE_NMI_WATCHDOG must implement watchdog_hardlockup_probe()
        watchdog/hardlockup: keep kernel.nmi_watchdog sysctl as 0444 if probe fails
        ...
      77b1a7f7
    • Merge tag 'mm-stable-2023-06-24-19-15' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm · 6e17c6de
      Linus Torvalds authored
      Pull mm updates from Andrew Morton:
      
       - Yosry Ahmed brought back some cgroup v1 stats in OOM logs
      
       - Yosry has also eliminated cgroup's atomic rstat flushing
      
       - Nhat Pham adds the new cachestat() syscall. It provides userspace
         with the ability to query pagecache status - a similar concept to
         mincore() but more powerful and with improved usability
      
       - Mel Gorman provides more optimizations for compaction, reducing the
         prevalence of page rescanning
      
       - Lorenzo Stoakes has done some maintenance work on the
         get_user_pages() interface
      
       - Liam Howlett continues with cleanups and maintenance work to the
         maple tree code. Peng Zhang also does some work on maple tree
      
       - Johannes Weiner has done some cleanup work on the compaction code
      
       - David Hildenbrand has contributed additional selftests for
         get_user_pages()
      
       - Thomas Gleixner has contributed some maintenance and optimization
         work for the vmalloc code
      
       - Baolin Wang has provided some compaction cleanups
      
       - SeongJae Park continues maintenance work on the DAMON code
      
       - Huang Ying has done some maintenance on the swap code's usage of
         device refcounting
      
       - Christoph Hellwig has some cleanups for the filemap/directio code
      
       - Ryan Roberts provides two patch series which yield some
         rationalization of the kernel's access to pte entries - use the
         provided APIs rather than open-coding accesses
      
       - Lorenzo Stoakes has some fixes to the interaction between pagecache
         and directio access to file mappings
      
       - John Hubbard has a series of fixes to the MM selftesting code
      
       - ZhangPeng continues the folio conversion campaign
      
       - Hugh Dickins has been working on the pagetable handling code, mainly
         with a view to reducing the load on the mmap_lock
      
       - Catalin Marinas has reduced the arm64 kmalloc() minimum alignment
         from 128 to 8
      
       - Domenico Cerasuolo has improved the zswap reclaim mechanism by
         reorganizing the LRU management
      
       - Matthew Wilcox provides some fixups to make gfs2 work better with the
         buffer_head code
      
       - Vishal Moola also has done some folio conversion work
      
       - Matthew Wilcox has removed the remnants of the pagevec code - their
         functionality is migrated over to struct folio_batch
      
      * tag 'mm-stable-2023-06-24-19-15' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (380 commits)
        mm/hugetlb: remove hugetlb_set_page_subpool()
        mm: nommu: correct the range of mmap_sem_read_lock in task_mem()
        hugetlb: revert use of page_cache_next_miss()
        Revert "page cache: fix page_cache_next/prev_miss off by one"
        mm/vmscan: fix root proactive reclaim unthrottling unbalanced node
        mm: memcg: rename and document global_reclaim()
        mm: kill [add|del]_page_to_lru_list()
        mm: compaction: convert to use a folio in isolate_migratepages_block()
        mm: zswap: fix double invalidate with exclusive loads
        mm: remove unnecessary pagevec includes
        mm: remove references to pagevec
        mm: rename invalidate_mapping_pagevec to mapping_try_invalidate
        mm: remove struct pagevec
        net: convert sunrpc from pagevec to folio_batch
        i915: convert i915_gpu_error to use a folio_batch
        pagevec: rename fbatch_count()
        mm: remove check_move_unevictable_pages()
        drm: convert drm_gem_put_pages() to use a folio_batch
        i915: convert shmem_sg_free_table() to use a folio_batch
        scatterlist: add sg_set_folio()
        ...
      6e17c6de
    • Merge tag 'docs-arm64-move' of git://git.lwn.net/linux · 6aeadf78
      Linus Torvalds authored
      Pull arm64 documentation move from Jonathan Corbet:
       "Move the arm64 architecture documentation under Documentation/arch/.
      
        This brings some order to the documentation directory, declutters the
        top-level directory, and makes the documentation organization more
        closely match that of the source"
      
      * tag 'docs-arm64-move' of git://git.lwn.net/linux:
        perf arm-spe: Fix a dangling Documentation/arm64 reference
        mm: Fix a dangling Documentation/arm64 reference
        arm64: Fix dangling references to Documentation/arm64
        dt-bindings: fix dangling Documentation/arm64 reference
        docs: arm64: Move arm64 documentation under Documentation/arch/
      6aeadf78
    • Merge tag 'hardening-v6.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux · 582c161c
      Linus Torvalds authored
      Pull hardening updates from Kees Cook:
       "There are three areas of note:
      
        A bunch of strlcpy()->strscpy() conversions ended up living in my tree
        since they were either Acked by maintainers for me to carry, or got
        ignored for multiple weeks (and were trivial changes).
      
        The compiler option '-fstrict-flex-arrays=3' has been enabled
        globally, and has been in -next for the entire devel cycle. This
        changes compiler diagnostics (though mainly just -Warray-bounds which
        is disabled) and potential UBSAN_BOUNDS and FORTIFY _warning_
        coverage. In other words, there are no new restrictions, just
        potentially new warnings. Any new FORTIFY warnings we've seen have
        been fixed (usually in their respective subsystem trees). For more
        details, see commit df8fc4e9.
      
        The under-development compiler attribute __counted_by has been added
        so that we can start annotating flexible array members with their
        associated structure member that tracks the count of flexible array
        elements at run-time. It is possible (likely?) that the exact syntax
        of the attribute will change before it is finalized, but GCC and Clang
        are working together to sort it out. Any changes can be made to the
        macro while we continue to add annotations.
      
        As an example of that last case, I have a treewide commit waiting with
        such annotations found via Coccinelle:
      
          https://git.kernel.org/linus/adc5b3cb48a049563dc673f348eab7b6beba8a9b
      
        Also see commit dd06e72e for more details.
      
        Summary:
      
         - Fix KMSAN vs FORTIFY in strlcpy/strlcat (Alexander Potapenko)
      
         - Convert strreplace() to return string start (Andy Shevchenko)
      
         - Flexible array conversions (Arnd Bergmann, Wyes Karny, Kees Cook)
      
         - Add missing function prototypes seen with W=1 (Arnd Bergmann)
      
         - Fix strscpy() kernel-doc typo (Arne Welzel)
      
         - Replace strlcpy() with strscpy() across many subsystems which were
           either Acked by respective maintainers or were trivial changes that
           went ignored for multiple weeks (Azeem Shaikh)
      
         - Remove unneeded cc-option test for UBSAN_TRAP (Nick Desaulniers)
      
         - Add KUnit tests for strcat()-family
      
         - Enable KUnit tests of FORTIFY wrappers under UML
      
         - Add more complete FORTIFY protections for strlcat()
      
         - Add missed disabling of FORTIFY for all arch purgatories
      
         - Enable -fstrict-flex-arrays=3 globally
      
         - Tighten UBSAN_BOUNDS when using GCC
      
         - Improve checkpatch to check for strcpy, strncpy, and fake flex
           arrays
      
         - Improve use of const variables in FORTIFY
      
         - Add requested struct_size_t() helper for types rather than pointers
      
         - Add __counted_by macro for annotating flexible array size members"
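To make the __counted_by annotation described in the pull message above more concrete, here is a minimal userspace sketch. It assumes the macro expands to the compiler's `counted_by` attribute where that attribute exists and to nothing otherwise; `struct item_list` and the `item_list_*` helpers are hypothetical names made up for this example. As the message notes, the attribute syntax may still change, so treat this as an illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#ifndef __has_attribute
# define __has_attribute(x) 0
#endif

/* Expand to the attribute only when the compiler knows it. */
#ifndef __counted_by
# if __has_attribute(__counted_by__)
#  define __counted_by(member) __attribute__((__counted_by__(member)))
# else
#  define __counted_by(member)
# endif
#endif

/* Hypothetical structure: `count` tracks the number of `items` elements. */
struct item_list {
    size_t count;
    int items[] __counted_by(count);
};

/* Allocate a list with room for n items; returns NULL on overflow or OOM. */
static struct item_list *item_list_alloc(size_t n)
{
    struct item_list *l;

    if (n > (SIZE_MAX - sizeof(*l)) / sizeof(l->items[0]))
        return NULL;
    l = calloc(1, sizeof(*l) + n * sizeof(l->items[0]));
    if (l)
        l->count = n; /* set the count before touching items[] */
    return l;
}

/* The explicit check below is what the annotation lets tooling automate. */
static void item_list_set(struct item_list *l, size_t idx, int value)
{
    assert(idx < l->count);
    l->items[idx] = value;
}
```

Keeping the attribute behind a macro means existing annotations stay valid if the compilers settle on a different spelling; only the macro definition would need to change.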
      
      * tag 'hardening-v6.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (54 commits)
        netfilter: ipset: Replace strlcpy with strscpy
        uml: Replace strlcpy with strscpy
        um: Use HOST_DIR for mrproper
        kallsyms: Replace all non-returning strlcpy with strscpy
        sh: Replace all non-returning strlcpy with strscpy
        of/flattree: Replace all non-returning strlcpy with strscpy
        sparc64: Replace all non-returning strlcpy with strscpy
        Hexagon: Replace all non-returning strlcpy with strscpy
        kobject: Use return value of strreplace()
        lib/string_helpers: Change returned value of the strreplace()
        jbd2: Avoid printing outside the boundary of the buffer
        checkpatch: Check for 0-length and 1-element arrays
        riscv/purgatory: Do not use fortified string functions
        s390/purgatory: Do not use fortified string functions
        x86/purgatory: Do not use fortified string functions
        acpi: Replace struct acpi_table_slit 1-element array with flex-array
        clocksource: Replace all non-returning strlcpy with strscpy
        string: use __builtin_memcpy() in strlcpy/strlcat
        staging: most: Replace all non-returning strlcpy with strscpy
        drm/i2c: tda998x: Replace all non-returning strlcpy with strscpy
        ...
      582c161c
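Many entries in the hardening shortlog above replace "non-returning" strlcpy() calls with strscpy(), meaning calls whose return value is ignored. The sketch below models the return-value difference that makes only those conversions mechanical. Both functions are userspace approximations written for this example (`model_strlcpy` and `model_strscpy` are hypothetical names), not the kernel implementations.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Model of strlcpy(): always returns strlen(src), so it reads all of src. */
static size_t model_strlcpy(char *dst, const char *src, size_t size)
{
    size_t len = strlen(src);

    if (size) {
        size_t n = len < size - 1 ? len : size - 1;

        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return len;
}

/*
 * Model of strscpy(): returns the number of characters copied (excluding
 * the NUL), or -E2BIG if size is 0 or the source had to be truncated.
 * It never reads more than `size` bytes of src.
 */
static long model_strscpy(char *dst, const char *src, size_t size)
{
    size_t i;

    assert(dst != NULL && src != NULL);
    if (size == 0)
        return -E2BIG;
    for (i = 0; i < size - 1 && src[i] != '\0'; i++)
        dst[i] = src[i];
    dst[i] = '\0';
    return src[i] == '\0' ? (long)i : -E2BIG;
}
```

Both models leave the same bytes in the destination buffer; they differ only in what they return and in how much of the source they read. A caller that ignores the return value can therefore switch without further changes, while a caller that uses it needs its truncation check rewritten.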