- 26 Apr, 2024 17 commits
-
-
Linus Torvalds authored
Merge tag 'mm-hotfixes-stable-2024-04-26-13-30' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull misc fixes from Andrew Morton: "11 hotfixes. 8 are cc:stable and the remaining 3 (nice ratio!) address post-6.8 issues or aren't considered suitable for backporting. All except one of these are for MM. I see no particular theme - it's singletons all over" * tag 'mm-hotfixes-stable-2024-04-26-13-30' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: mm/hugetlb: fix DEBUG_LOCKS_WARN_ON(1) when dissolve_free_hugetlb_folio() selftests: mm: protection_keys: save/restore nr_hugepages value from launch script stackdepot: respect __GFP_NOLOCKDEP allocation flag hugetlb: check for anon_vma prior to folio allocation mm: zswap: fix shrinker NULL crash with cgroup_disable=memory mm: turn folio_test_hugetlb into a PageType mm: support page_mapcount() on page_has_type() pages mm: create FOLIO_FLAG_FALSE and FOLIO_TYPE_OPS macros mm/hugetlb: fix missing hugetlb_lock for resv uncharge selftests: mm: fix unused and uninitialized variable warning selftests/harness: remove use of LINE_MAX
-
git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmcLinus Torvalds authored
Pull MMC host fixes from Ulf Hansson: - moxart: Fix regression for sg_miter for PIO mode - sdhci-msm: Avoid hang by preventing access to suspended controller - sdhci-of-dwcmshc: Fix SD card tuning error for th1520 * tag 'mmc-v6.9-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc: mmc: moxart: fix handling of sgm->consumed, otherwise WARN_ON triggers mmc: sdhci-of-dwcmshc: th1520: Increase tuning loop count to 128 mmc: sdhci-msm: pervent access to suspended controller
-
git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arcLinus Torvalds authored
Pull ARC fixes from Vineet Gupta: - Incorrect VIPT aliasing assumption - Misc build warning fixes and some typos * tag 'arc-6.9-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc: ARC: [plat-hsdk]: Remove misplaced interrupt-cells property ARC: Fix typos ARC: mm: fix new code about cache aliasing ARC: Fix -Wmissing-prototypes warnings
-
git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linuxLinus Torvalds authored
Pull MTD fixes from Miquel Raynal: "There has been OTP support improvements in the NVMEM subsystem, and later also improvements of OTP support in the NAND subsystem. This lead to situations that we currently cannot handle, so better prevent this situation from happening in order to avoid canceling device's probe. In the raw NAND subsystem, two runtime fixes have been shared, one fixing two important commands in the Qcom driver since it got reworked and a NULL pointer dereference happening on STB chips. Arnd also fixed a UBSAN link failure on diskonchip" * tag 'mtd/fixes-for-6.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux: mtd: limit OTP NVMEM cell parse to non-NAND devices mtd: diskonchip: work around ubsan link failure mtd: rawnand: qcom: Fix broken OP_RESET_DEVICE command in qcom_misc_cmd_type_exec() mtd: rawnand: brcmnand: Fix data access violation for STB chip
-
git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linuxLinus Torvalds authored
Pull gpio fixes from Bartosz Golaszewski: - fix a regression in pin access control in gpio-tegra186 - make data pointer dereference robust in Intel Tangier driver * tag 'gpio-fixes-for-v6.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux: gpio: tegra186: Fix tegra186_gpio_is_accessible() check gpio: tangier: Use correct type for the IRQ chip data
-
git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxlLinus Torvalds authored
Pull cxl fix from Dave Jiang: - Fix potential payload size confusion in cxl_mem_get_poison() * tag 'cxl-fixes-6.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: cxl/core: Fix potential payload size confusion in cxl_mem_get_poison()
-
Linus Torvalds authored
Merge tag 'for-6.9/dm-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm Pull device mapper fixes from Mike Snitzer: - Fix 6.9 regression so that DM device removal is performed synchronously by default. Asynchronous removal has always been possible but it isn't the default. It is important that synchronous removal be preserved, otherwise it is an interface change that breaks lvm2. - Remove errant semicolon in drivers/md/dm-vdo/murmurhash3.c * tag 'for-6.9/dm-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: dm: restore synchronous close of device mapper block device dm vdo murmurhash: remove unneeded semicolon
-
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfsLinus Torvalds authored
Pull vfs fixes from Christian Brauner: "This contains a few small fixes for this merge window and the attempt to handle the ntfs removal regression that was reported a little while ago: - After the removal of the legacy ntfs driver we received reports about regressions for some people that do mount "ntfs" explicitly and expect the driver to be available. Since ntfs3 is a drop-in for legacy ntfs we alias legacy ntfs to ntfs3 just like ext3 is aliased to ext4. We also enforce legacy ntfs is always mounted read-only and give it custom file operations to ensure that ioctl()'s can't be abused to perform write operations. - Fix an unbalanced module_get() in bdev_open(). - Two smaller fixes for the netfs work done earlier in this cycle. - Fix the errno returned from the new FS_IOC_GETUUID and FS_IOC_GETFSSYSFSPATH ioctls. Both commands just pull information out of the superblock so there's no need to call into the actual ioctl handlers. So instead of returning ENOIOCTLCMD to indicate to fallback we just return ENOTTY directly avoiding that indirection" * tag 'vfs-6.9-rc6.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: netfs: Fix the pre-flush when appending to a file in writethrough mode netfs: Fix writethrough-mode error handling ntfs3: add legacy ntfs file operations ntfs3: enforce read-only when used as legacy ntfs driver ntfs3: serve as alias for the legacy ntfs driver block: fix module reference leakage from bdev_open_by_dev error path fs: Return ENOTTY directly if FS_IOC_GETUUID or FS_IOC_GETFSSYSFSPATH fail
-
Linus Torvalds authored
Merge tag 'loongarch-fixes-6.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson Pull LoongArch fixes from Huacai Chen: "Fix some build errors and some trivial runtime bugs" * tag 'loongarch-fixes-6.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson: LoongArch: Lately init pmu after smp is online LoongArch: Fix callchain parse error with kernel tracepoint events LoongArch: Fix access error when read fault on a write-only VMA LoongArch: Fix a build error due to __tlb_remove_tlb_entry() LoongArch: Fix Kconfig item and left code related to CRASH_CORE
-
git://git.kernel.org/pub/scm/linux/kernel/git/ukleinek/linuxLinus Torvalds authored
Pull maintainer entry update from Uwe Kleine-König: "This is just an update to my maintainer entries as I will switch jobs soon. Getting a contact email address into the MAINTAINERS file that will work also after my switch will hopefully reduce people mailing to the then non-existing address. I also drop my co-maintenance for SIOX, but that continues to be in good hands" * tag 'pwm/for-6.9-rc6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/ukleinek/linux: MAINTAINERS: Update Uwe's email address, drop SIOX maintenance
-
https://gitlab.freedesktop.org/drm/kernelLinus Torvalds authored
Pull drm fixes from Dave Airlie: "Regular weekly merge request, mostly amdgpu and misc bits in xe/etnaviv/gma500 and some core changes. Nothing too outlandish, seems to be about normal for this time of release. atomic-helpers: - Fix memory leak in drm_format_conv_state_copy() fbdev: - fbdefio: Fix address calculation amdgpu: - Suspend/resume fix - Don't expose gpu_od directory if it's empty - SDMA 4.4.2 fix - VPE fix - BO eviction fix - UMSCH fix - SMU 13.0.6 reset fixes - GPUVM flush accounting fix - SDMA 5.2 fix - Fix possible UAF in mes code amdkfd: - Eviction fence handling fix - Fix memory leak when GPU memory allocation fails - Fix dma-buf validation - Fix rescheduling of restore worker - SVM fix gma500: - Fix crash during boot etnaviv: - fix GC7000 TX clock gating - revert NPU UAPI changes xe: - Fix error paths on managed allocations - Fix PF/VF relay messages" * tag 'drm-fixes-2024-04-26' of https://gitlab.freedesktop.org/drm/kernel: (23 commits) Revert "drm/etnaviv: Expose a few more chipspecs to userspace" drm/etnaviv: fix tx clock gating on some GC7000 variants drm/xe/guc: Fix arguments passed to relay G2H handlers drm/xe: call free_gsc_pkt only once on action add failure drm/xe: Remove sysfs only once on action add failure fbdev: fix incorrect address computation in deferred IO drm/amdgpu/mes: fix use-after-free issue drm/amdgpu/sdma5.2: use legacy HDP flush for SDMA2/3 drm/amdgpu: Fix the ring buffer size for queue VM flush drm/amdkfd: Add VRAM accounting for SVM migration drm/amd/pm: Restore config space after reset drm/amdgpu/umsch: don't execute umsch test when GPU is in reset/suspend drm/amdkfd: Fix rescheduling of restore worker drm/amdgpu: Update BO eviction priorities drm/amdgpu/vpe: fix vpe dpm setup failed drm/amdgpu: Assign correct bits for SDMA HDP flush drm/amdgpu/pm: Remove gpu_od if it's an empty directory drm/amdkfd: make sure VM is ready for updating operations drm/amdgpu: Fix leak when GPU memory allocation fails drm/amdkfd: Fix eviction fence handling ...
-
David Howells authored
In netfs_perform_write(), when the file is marked NETFS_ICTX_WRITETHROUGH or O_*SYNC or RWF_*SYNC was specified, write-through caching is performed on a buffered file. When setting up for write-through, we flush any conflicting writes in the region and wait for the write to complete, failing if there's a write error to return. The issue arises if we're writing at or above the EOF position because we skip the flush and - more importantly - the wait. This becomes a problem if there's a partial folio at the end of the file that is being written out and we want to make a write to it too. Both the already-running write and the write we start both want to clear the writeback mark, but whoever is second causes a warning looking something like: ------------[ cut here ]------------ R=00000012: folio 11 is not under writeback WARNING: CPU: 34 PID: 654 at fs/netfs/write_collect.c:105 ... CPU: 34 PID: 654 Comm: kworker/u386:27 Tainted: G S ... ... Workqueue: events_unbound netfs_write_collection_worker ... RIP: 0010:netfs_writeback_lookup_folio Fix this by making the flush-and-wait unconditional. It will do nothing if there are no folios in the pagecache and will return quickly if there are no folios in the region specified. Further, move the WBC attachment above the flush call as the flush is going to attach a WBC and detach it again if it is not present - and since we need one anyway we might as well share it. Fixes: 41d8e767 ("netfs: Implement a write-through caching option") Reported-by: kernel test robot <oliver.sang@intel.com> Closes: https://lore.kernel.org/oe-lkp/202404161031.468b84f-oliver.sang@intel.comSigned-off-by: David Howells <dhowells@redhat.com> Link: https://lore.kernel.org/r/2150448.1714130115@warthog.procyon.org.ukReviewed-by: Jeffrey Layton <jlayton@kernel.org> cc: Eric Van Hensbergen <ericvh@kernel.org> cc: Latchesar Ionkov <lucho@ionkov.net> cc: Dominique Martinet <asmadeus@codewreck.org> cc: Christian Schoenebeck <linux_oss@crudebyte.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org cc: linux-mm@kvack.org cc: v9fs@lists.linux.dev cc: linux-afs@lists.infradead.org cc: linux-cifs@vger.kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
-
Uwe Kleine-König authored
In the context of changing my career path, my Pengutronix email address will soon stop to be available to me. Update the PWM maintainer entry to my kernel.org identity. I drop my co-maintenance of SIOX. Thorsten will continue to care for it with the support of the Pengutronix kernel team. Signed-off-by: Uwe Kleine-König <ukleinek@kernel.org> Acked-by: Thorsten Scherer <t.scherer@eckelmann.de> Link: https://lore.kernel.org/r/20240424212626.603631-2-ukleinek@kernel.orgSigned-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
-
https://gitlab.freedesktop.org/drm/xe/kernelDave Airlie authored
- Fix error paths on managed allocations - Fix PF/VF relay messages Signed-off-by: Dave Airlie <airlied@redhat.com> From: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/gxaxtvxeoax7mnddxbl3tfn2hfnm5e4ngnl3wpi4p5tvn7il4s@fwsvpntse7bh
-
https://git.pengutronix.de/git/lst/linuxDave Airlie authored
- fix GC7000 TX clock gating - revert NPU UAPI changes Signed-off-by: Dave Airlie <airlied@redhat.com> From: Lucas Stach <l.stach@pengutronix.de> Link: https://patchwork.freedesktop.org/patch/msgid/c24457dc18ba9eab3ff919b398a25b1af9f1124e.camel@pengutronix.de
-
Dave Airlie authored
Merge tag 'drm-misc-fixes-2024-04-25' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes Short summary of fixes pull: atomic-helpers: - Fix memory leak in drm_format_conv_state_copy() fbdev: - fbdefio: Fix address calculation gma500: - Fix crash during boot Signed-off-by: Dave Airlie <airlied@redhat.com> From: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/patch/msgid/20240425102413.GA6301@localhost.localdomain
-
Dave Airlie authored
Merge tag 'amd-drm-fixes-6.9-2024-04-24' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes amd-drm-fixes-6.9-2024-04-24: amdgpu: - Suspend/resume fix - Don't expose gpu_od directory if it's empty - SDMA 4.4.2 fix - VPE fix - BO eviction fix - UMSCH fix - SMU 13.0.6 reset fixes - GPUVM flush accounting fix - SDMA 5.2 fix - Fix possible UAF in mes code amdkfd: - Eviction fence handling fix - Fix memory leak when GPU memory allocation fails - Fix dma-buf validation - Fix rescheduling of restore worker - SVM fix Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240424202408.1973661-1-alexander.deucher@amd.com
-
- 25 Apr, 2024 23 commits
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhostLinus Torvalds authored
Pull virtio fix from Michael Tsirkin: "enum renames for vdpa uapi - we better do this now before the names have been exposed in any releases" * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: vDPA: code clean for vhost_vdpa uapi
-
git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fsLinus Torvalds authored
Pull 9p fix from Eric Van Hensbergen: "This contains a single mitigation to help deal with an apparent race condition between client and server having to deal with inode number collisions" * tag '9p-for-6.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs: fs/9p: mitigate inode collisions
-
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pmLinus Torvalds authored
Pull ACPI fixes from Rafael Wysocki: "These fix three recent regressions, one introduced while enabling a new platform firmware feature for power management, and two introduced by a recent CPPC library update. Specifics: - Allow two overlapping Low-Power S0 Idle _DSM function sets to be used at the same time (Rafael Wysocki) - Fix bit offset computation in MASK_VAL() macro used for applying a bitmask to a new CPPC register value (Jarred White) - Fix access width field usage for PCC registers in CPPC (Vanshidhar Konda)" * tag 'acpi-6.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: ACPI: PM: s2idle: Evaluate all Low-Power S0 Idle _DSM functions ACPI: CPPC: Fix access width used for PCC registers ACPI: CPPC: Fix bit_offset shift in MASK_VAL() macro
-
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netLinus Torvalds authored
Pull networking fixes from Jakub Kicinski: "Including fixes from netfilter, wireless and bluetooth. Nothing major, regression fixes are mostly in drivers, two more of those are flowing towards us thru various trees. I wish some of the changes went into -rc5, we'll try to keep an eye on frequency of PRs from sub-trees. Also disproportional number of fixes for bugs added in v6.4, strange coincidence. Current release - regressions: - igc: fix LED-related deadlock on driver unbind - wifi: mac80211: small fixes to recent clean up of the connection process - Revert "wifi: iwlwifi: bump FW API to 90 for BZ/SC devices", kernel doesn't have all the code to deal with that version, yet - Bluetooth: - set power_ctrl_enabled on NULL returned by gpiod_get_optional() - qca: fix invalid device address check, again - eth: ravb: fix registered interrupt names Current release - new code bugs: - wifi: mac80211: check EHT/TTLM action frame length Previous releases - regressions: - fix sk_memory_allocated_{add|sub} for architectures where __this_cpu_{add|sub}* are not IRQ-safe - dsa: mv88e6xx: fix link setup for 88E6250 Previous releases - always broken: - ip: validate dev returned from __in_dev_get_rcu(), prevent possible null-derefs in a few places - switch number of for_each_rcu() loops using call_rcu() on the iterator to for_each_safe() - macsec: fix isolation of broadcast traffic in presence of offload - vxlan: drop packets from invalid source address - eth: mlxsw: trap and ACL programming fixes - eth: bnxt: PCIe error recovery fixes, fix counting dropped packets - Bluetooth: - lots of fixes for the command submission rework from v6.4 - qca: fix NULL-deref on non-serdev suspend Misc: - tools: ynl: don't ignore errors in NLMSG_DONE messages" * tag 'net-6.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (88 commits) af_unix: Suppress false-positive lockdep splat for spin_lock() in __unix_gc(). net: b44: set pause params only when interface is up tls: fix lockless read of strp->msg_ready in ->poll dpll: fix dpll_pin_on_pin_register() for multiple parent pins net: ravb: Fix registered interrupt names octeontx2-af: fix the double free in rvu_npc_freemem() net: ethernet: ti: am65-cpts: Fix PTPv1 message type on TX packets ice: fix LAG and VF lock dependency in ice_reset_vf() iavf: Fix TC config comparison with existing adapter TC config i40e: Report MFS in decimal base instead of hex i40e: Do not use WQ_MEM_RECLAIM flag for workqueue net: ti: icssg-prueth: Fix signedness bug in prueth_init_rx_chns() net/mlx5e: Advertise mlx5 ethernet driver updates sk_buff md_dst for MACsec macsec: Detect if Rx skb is macsec-related for offloading devices that update md_dst ethernet: Add helper for assigning packet type when dest address does not match device address macsec: Enable devices to advertise whether they update sk_buff md_dst during offloads net: phy: dp83869: Fix MII mode failure netfilter: nf_tables: honor table dormant flag from netdev release event path eth: bnxt: fix counting packets discarded due to OOM and netpoll igc: Fix LED-related deadlock on driver unbind ...
-
Rafael J. Wysocki authored
* acpi-cppc: ACPI: CPPC: Fix access width used for PCC registers ACPI: CPPC: Fix bit_offset shift in MASK_VAL() macro
-
Miaohe Lin authored
When I did memory failure tests recently, below warning occurs: DEBUG_LOCKS_WARN_ON(1) WARNING: CPU: 8 PID: 1011 at kernel/locking/lockdep.c:232 __lock_acquire+0xccb/0x1ca0 Modules linked in: mce_inject hwpoison_inject CPU: 8 PID: 1011 Comm: bash Kdump: loaded Not tainted 6.9.0-rc3-next-20240410-00012-gdb69f219f4be #3 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014 RIP: 0010:__lock_acquire+0xccb/0x1ca0 RSP: 0018:ffffa7a1c7fe3bd0 EFLAGS: 00000082 RAX: 0000000000000000 RBX: eb851eb853975fcf RCX: ffffa1ce5fc1c9c8 RDX: 00000000ffffffd8 RSI: 0000000000000027 RDI: ffffa1ce5fc1c9c0 RBP: ffffa1c6865d3280 R08: ffffffffb0f570a8 R09: 0000000000009ffb R10: 0000000000000286 R11: ffffffffb0f2ad50 R12: ffffa1c6865d3d10 R13: ffffa1c6865d3c70 R14: 0000000000000000 R15: 0000000000000004 FS: 00007ff9f32aa740(0000) GS:ffffa1ce5fc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007ff9f3134ba0 CR3: 00000008484e4000 CR4: 00000000000006f0 Call Trace: <TASK> lock_acquire+0xbe/0x2d0 _raw_spin_lock_irqsave+0x3a/0x60 hugepage_subpool_put_pages.part.0+0xe/0xc0 free_huge_folio+0x253/0x3f0 dissolve_free_huge_page+0x147/0x210 __page_handle_poison+0x9/0x70 memory_failure+0x4e6/0x8c0 hard_offline_page_store+0x55/0xa0 kernfs_fop_write_iter+0x12c/0x1d0 vfs_write+0x380/0x540 ksys_write+0x64/0xe0 do_syscall_64+0xbc/0x1d0 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7ff9f3114887 RSP: 002b:00007ffecbacb458 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 000000000000000c RCX: 00007ff9f3114887 RDX: 000000000000000c RSI: 0000564494164e10 RDI: 0000000000000001 RBP: 0000564494164e10 R08: 00007ff9f31d1460 R09: 000000007fffffff R10: 0000000000000000 R11: 0000000000000246 R12: 000000000000000c R13: 00007ff9f321b780 R14: 00007ff9f3217600 R15: 00007ff9f3216a00 </TASK> Kernel panic - not syncing: kernel: panic_on_warn set ... CPU: 8 PID: 1011 Comm: bash Kdump: loaded Not tainted 6.9.0-rc3-next-20240410-00012-gdb69f219f4be #3 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014 Call Trace: <TASK> panic+0x326/0x350 check_panic_on_warn+0x4f/0x50 __warn+0x98/0x190 report_bug+0x18e/0x1a0 handle_bug+0x3d/0x70 exc_invalid_op+0x18/0x70 asm_exc_invalid_op+0x1a/0x20 RIP: 0010:__lock_acquire+0xccb/0x1ca0 RSP: 0018:ffffa7a1c7fe3bd0 EFLAGS: 00000082 RAX: 0000000000000000 RBX: eb851eb853975fcf RCX: ffffa1ce5fc1c9c8 RDX: 00000000ffffffd8 RSI: 0000000000000027 RDI: ffffa1ce5fc1c9c0 RBP: ffffa1c6865d3280 R08: ffffffffb0f570a8 R09: 0000000000009ffb R10: 0000000000000286 R11: ffffffffb0f2ad50 R12: ffffa1c6865d3d10 R13: ffffa1c6865d3c70 R14: 0000000000000000 R15: 0000000000000004 lock_acquire+0xbe/0x2d0 _raw_spin_lock_irqsave+0x3a/0x60 hugepage_subpool_put_pages.part.0+0xe/0xc0 free_huge_folio+0x253/0x3f0 dissolve_free_huge_page+0x147/0x210 __page_handle_poison+0x9/0x70 memory_failure+0x4e6/0x8c0 hard_offline_page_store+0x55/0xa0 kernfs_fop_write_iter+0x12c/0x1d0 vfs_write+0x380/0x540 ksys_write+0x64/0xe0 do_syscall_64+0xbc/0x1d0 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7ff9f3114887 RSP: 002b:00007ffecbacb458 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 000000000000000c RCX: 00007ff9f3114887 RDX: 000000000000000c RSI: 0000564494164e10 RDI: 0000000000000001 RBP: 0000564494164e10 R08: 00007ff9f31d1460 R09: 000000007fffffff R10: 0000000000000000 R11: 0000000000000246 R12: 000000000000000c R13: 00007ff9f321b780 R14: 00007ff9f3217600 R15: 00007ff9f3216a00 </TASK> After git bisecting and digging into the code, I believe the root cause is that _deferred_list field of folio is unioned with _hugetlb_subpool field. In __update_and_free_hugetlb_folio(), folio->_deferred_list is initialized leading to corrupted folio->_hugetlb_subpool when folio is hugetlb. Later free_huge_folio() will use _hugetlb_subpool and above warning happens. But it is assumed hugetlb flag must have been cleared when calling folio_put() in update_and_free_hugetlb_folio(). This assumption is broken due to below race: CPU1 CPU2 dissolve_free_huge_page update_and_free_pages_bulk update_and_free_hugetlb_folio hugetlb_vmemmap_restore_folios folio_clear_hugetlb_vmemmap_optimized clear_flag = folio_test_hugetlb_vmemmap_optimized if (clear_flag) <-- False, it's already cleared. __folio_clear_hugetlb(folio) <-- Hugetlb is not cleared. folio_put free_huge_folio <-- free_the_page is expected. list_for_each_entry() __folio_clear_hugetlb <-- Too late. Fix this issue by checking whether folio is hugetlb directly instead of checking clear_flag to close the race window. Link: https://lkml.kernel.org/r/20240419085819.1901645-1-linmiaohe@huawei.com Fixes: 32c87719 ("hugetlb: do not clear hugetlb dtor until allocating vmemmap") Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Reviewed-by: Oscar Salvador <osalvador@suse.de> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
Muhammad Usama Anjum authored
The save/restore of nr_hugepages was added to the test itself by using the atexit() functionality. But it is broken as parent exits after creating child. Hence calling the atexit() function early. That's not it. The child exits after creating its child and so on. The parent cannot wait to get the termination status for its children as it'll keep on holding the resources until the new pkey allocation fails. It is impossible to wait for exits of all the grand and great grand children. Hence the restoring of nr_hugepages value from parent is wrong. Let's save/restore the nr_hugepages settings in the launch script instead of doing it in the test. Link: https://lkml.kernel.org/r/20240419115027.3848958-1-usama.anjum@collabora.com Fixes: c52eb6db ("selftests: mm: restore settings from only parent process") Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com> Reported-by: Joey Gouly <joey.gouly@arm.com> Closes: https://lore.kernel.org/all/20240418125250.GA2941398@e124191.cambridge.arm.com Cc: Joey Gouly <joey.gouly@arm.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
-
git://git.kernel.org/pub/scm/linux/kernel/git/cel/linuxLinus Torvalds authored
Pull nfsd fixes from Chuck Lever: - Revert some backchannel fixes that went into v6.9-rc * tag 'nfsd-6.9-5' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: Revert "NFSD: Convert the callback workqueue to use delayed_work" Revert "NFSD: Reschedule CB operations when backchannel rpc_clnt is shut down"
-
git://git.kernel.org/pub/scm/linux/kernel/git/hid/hidLinus Torvalds authored
Pull HID fixes from Benjamin Tissoires: - A couple of i2c-hid fixes (Kenny Levinsen & Nam Cao) - A config issue with mcp-2221 when CONFIG_IIO is not enabled (Abdelrahman Morsy) - A dev_err fix in intel-ish-hid (Zhang Lixu) - A couple of mouse fixes for both nintendo and Logitech-dj (Nuno Pereira and Yaraslau Furman) - I'm changing my main kernel email address as it's way simpler for me than the Red Hat one (Benjamin Tissoires) * tag 'for-linus-2024042501' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid: HID: mcp-2221: cancel delayed_work only when CONFIG_IIO is enabled HID: logitech-dj: allow mice to use all types of reports HID: i2c-hid: Revert to await reset ACK before reading report descriptor HID: nintendo: Fix N64 controller being identified as mouse MAINTAINERS: update Benjamin's email address HID: intel-ish-hid: ipc: Fix dev_err usage with uninitialized dev->devc HID: i2c-hid: remove I2C_HID_READ_PENDING flag to prevent lock-up
-
Sergei Antonov authored
When e.g. 8 bytes are to be read, sgm->consumed equals 8 immediately after sg_miter_next() call. The driver then increments it as bytes are read, so sgm->consumed becomes 16 and this warning triggers in sg_miter_stop(): WARN_ON(miter->consumed > miter->length); WARNING: CPU: 0 PID: 28 at lib/scatterlist.c:925 sg_miter_stop+0x2c/0x10c CPU: 0 PID: 28 Comm: kworker/0:2 Tainted: G W 6.9.0-rc5-dirty #249 Hardware name: Generic DT based system Workqueue: events_freezable mmc_rescan Call trace:. unwind_backtrace from show_stack+0x10/0x14 show_stack from dump_stack_lvl+0x44/0x5c dump_stack_lvl from __warn+0x78/0x16c __warn from warn_slowpath_fmt+0xb0/0x160 warn_slowpath_fmt from sg_miter_stop+0x2c/0x10c sg_miter_stop from moxart_request+0xb0/0x468 moxart_request from mmc_start_request+0x94/0xa8 mmc_start_request from mmc_wait_for_req+0x60/0xa8 mmc_wait_for_req from mmc_app_send_scr+0xf8/0x150 mmc_app_send_scr from mmc_sd_setup_card+0x1c/0x420 mmc_sd_setup_card from mmc_sd_init_card+0x12c/0x4dc mmc_sd_init_card from mmc_attach_sd+0xf0/0x16c mmc_attach_sd from mmc_rescan+0x1e0/0x298 mmc_rescan from process_scheduled_works+0x2e4/0x4ec process_scheduled_works from worker_thread+0x1ec/0x24c worker_thread from kthread+0xd4/0xe0 kthread from ret_from_fork+0x14/0x38 This patch adds initial zeroing of sgm->consumed. It is then incremented as bytes are read or written. Signed-off-by: Sergei Antonov <saproj@gmail.com> Cc: Linus Walleij <linus.walleij@linaro.org> Fixes: 3ee0e7c3 ("mmc: moxart-mmc: Use sg_miter for PIO") Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Link: https://lore.kernel.org/r/20240422153607.963672-1-saproj@gmail.comSigned-off-by: Ulf Hansson <ulf.hansson@linaro.org>
-
git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nfJakub Kicinski authored
Pablo Neira Ayuso says: ==================== Netfilter/IPVS fixes for net The following patchset contains two Netfilter/IPVS fixes for net: Patch #1 fixes SCTP checksumming for IPVS with gso packets, from Ismael Luceno. Patch #2 honor dormant flag from netdev event path to fix a possible double hook unregistration. * tag 'nf-24-04-25' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf: netfilter: nf_tables: honor table dormant flag from netdev release event path ipvs: Fix checksumming on GSO of SCTP packets ==================== Link: https://lore.kernel.org/r/20240425090149.1359547-1-pablo@netfilter.orgSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Kuniyuki Iwashima authored
syzbot reported a lockdep splat regarding unix_gc_lock and unix_state_lock(). One is called from recvmsg() for a connected socket, and another is called from GC for TCP_LISTEN socket. So, the splat is false-positive. Let's add a dedicated lock class for the latter to suppress the splat. Note that this change is not necessary for net-next.git as the issue is only applied to the old GC impl. [0]: WARNING: possible circular locking dependency detected 6.9.0-rc5-syzkaller-00007-g4d200843 #0 Not tainted ----------------------------------------------------- kworker/u8:1/11 is trying to acquire lock: ffff88807cea4e70 (&u->lock){+.+.}-{2:2}, at: spin_lock include/linux/spinlock.h:351 [inline] ffff88807cea4e70 (&u->lock){+.+.}-{2:2}, at: __unix_gc+0x40e/0xf70 net/unix/garbage.c:302 but task is already holding lock: ffffffff8f6ab638 (unix_gc_lock){+.+.}-{2:2}, at: spin_lock include/linux/spinlock.h:351 [inline] ffffffff8f6ab638 (unix_gc_lock){+.+.}-{2:2}, at: __unix_gc+0x117/0xf70 net/unix/garbage.c:261 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (unix_gc_lock){+.+.}-{2:2}: lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5754 __raw_spin_lock include/linux/spinlock_api_smp.h:133 [inline] _raw_spin_lock+0x2e/0x40 kernel/locking/spinlock.c:154 spin_lock include/linux/spinlock.h:351 [inline] unix_notinflight+0x13d/0x390 net/unix/garbage.c:140 unix_detach_fds net/unix/af_unix.c:1819 [inline] unix_destruct_scm+0x221/0x350 net/unix/af_unix.c:1876 skb_release_head_state+0x100/0x250 net/core/skbuff.c:1188 skb_release_all net/core/skbuff.c:1200 [inline] __kfree_skb net/core/skbuff.c:1216 [inline] kfree_skb_reason+0x16d/0x3b0 net/core/skbuff.c:1252 kfree_skb include/linux/skbuff.h:1262 [inline] manage_oob net/unix/af_unix.c:2672 [inline] unix_stream_read_generic+0x1125/0x2700 net/unix/af_unix.c:2749 unix_stream_splice_read+0x239/0x320 net/unix/af_unix.c:2981 do_splice_read fs/splice.c:985 [inline] splice_file_to_pipe+0x299/0x500 fs/splice.c:1295 do_splice+0xf2d/0x1880 fs/splice.c:1379 __do_splice fs/splice.c:1436 [inline] __do_sys_splice fs/splice.c:1652 [inline] __se_sys_splice+0x331/0x4a0 fs/splice.c:1634 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xf5/0x240 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f -> #0 (&u->lock){+.+.}-{2:2}: check_prev_add kernel/locking/lockdep.c:3134 [inline] check_prevs_add kernel/locking/lockdep.c:3253 [inline] validate_chain+0x18cb/0x58e0 kernel/locking/lockdep.c:3869 __lock_acquire+0x1346/0x1fd0 kernel/locking/lockdep.c:5137 lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5754 __raw_spin_lock include/linux/spinlock_api_smp.h:133 [inline] _raw_spin_lock+0x2e/0x40 kernel/locking/spinlock.c:154 spin_lock include/linux/spinlock.h:351 [inline] __unix_gc+0x40e/0xf70 net/unix/garbage.c:302 process_one_work kernel/workqueue.c:3254 [inline] process_scheduled_works+0xa10/0x17c0 kernel/workqueue.c:3335 worker_thread+0x86d/0xd70 kernel/workqueue.c:3416 kthread+0x2f0/0x390 kernel/kthread.c:388 ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244 other info that might help us debug this: Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(unix_gc_lock); lock(&u->lock); lock(unix_gc_lock); lock(&u->lock); *** DEADLOCK *** 3 locks held by kworker/u8:1/11: #0: ffff888015089148 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work kernel/workqueue.c:3229 [inline] #0: ffff888015089148 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_scheduled_works+0x8e0/0x17c0 kernel/workqueue.c:3335 #1: ffffc90000107d00 (unix_gc_work){+.+.}-{0:0}, at: process_one_work kernel/workqueue.c:3230 [inline] #1: ffffc90000107d00 (unix_gc_work){+.+.}-{0:0}, at: process_scheduled_works+0x91b/0x17c0 kernel/workqueue.c:3335 #2: ffffffff8f6ab638 (unix_gc_lock){+.+.}-{2:2}, at: spin_lock include/linux/spinlock.h:351 [inline] #2: ffffffff8f6ab638 (unix_gc_lock){+.+.}-{2:2}, at: __unix_gc+0x117/0xf70 net/unix/garbage.c:261 stack backtrace: CPU: 0 PID: 11 Comm: kworker/u8:1 Not tainted 6.9.0-rc5-syzkaller-00007-g4d200843 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/27/2024 Workqueue: events_unbound __unix_gc Call Trace: <TASK> __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0x241/0x360 lib/dump_stack.c:114 check_noncircular+0x36a/0x4a0 kernel/locking/lockdep.c:2187 check_prev_add kernel/locking/lockdep.c:3134 [inline] check_prevs_add kernel/locking/lockdep.c:3253 [inline] validate_chain+0x18cb/0x58e0 kernel/locking/lockdep.c:3869 __lock_acquire+0x1346/0x1fd0 kernel/locking/lockdep.c:5137 lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5754 __raw_spin_lock include/linux/spinlock_api_smp.h:133 [inline] _raw_spin_lock+0x2e/0x40 kernel/locking/spinlock.c:154 spin_lock include/linux/spinlock.h:351 [inline] __unix_gc+0x40e/0xf70 net/unix/garbage.c:302 process_one_work kernel/workqueue.c:3254 [inline] process_scheduled_works+0xa10/0x17c0 kernel/workqueue.c:3335 worker_thread+0x86d/0xd70 kernel/workqueue.c:3416 kthread+0x2f0/0x390 kernel/kthread.c:388 ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244 </TASK> Fixes: 47d8ac01 ("af_unix: Fix garbage collector racing against connect()") Reported-and-tested-by: syzbot+fa379358c28cc87cc307@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=fa379358c28cc87cc307Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://lore.kernel.org/r/20240424170443.9832-1-kuniyu@amazon.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Peter Münster authored
b44_free_rings() accesses b44::rx_buffers (and ::tx_buffers) unconditionally, but b44::rx_buffers is only valid when the device is up (they get allocated in b44_open(), and deallocated again in b44_close()), any other time these are just a NULL pointers. So if you try to change the pause params while the network interface is disabled/administratively down, everything explodes (which likely netifd tries to do). Link: https://github.com/openwrt/openwrt/issues/13789 Fixes: 1da177e4 (Linux-2.6.12-rc2) Cc: stable@vger.kernel.org Reported-by: Peter Münster <pm@a16n.net> Suggested-by: Jonas Gorski <jonas.gorski@gmail.com> Signed-off-by: Vaclav Svoboda <svoboda@neng.cz> Tested-by: Peter Münster <pm@a16n.net> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Peter Münster <pm@a16n.net> Reviewed-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/87y192oolj.fsf@a16n.netSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Sabrina Dubroca authored
tls_sk_poll is called without locking the socket, and needs to read strp->msg_ready (via tls_strp_msg_ready). Convert msg_ready to a bool and use READ_ONCE/WRITE_ONCE where needed. The remaining reads are only performed when the socket is locked. Fixes: 121dca78 ("tls: suppress wakeups unless we have a full record") Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Link: https://lore.kernel.org/r/0b7ee062319037cf86af6b317b3d72f7bfcd2e97.1713797701.git.sd@queasysnail.netSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Arkadiusz Kubalewski authored
In scenario where pin is registered with multiple parent pins via dpll_pin_on_pin_register(..), all belonging to the same dpll device. A second call to dpll_pin_on_pin_unregister(..) would cause a call trace, as it tries to use already released registration resources (due to fix introduced in b446631f). In this scenario pin was registered twice, so resources are not yet expected to be release until each registered pin/pin pair is unregistered. Currently, the following crash/call trace is produced when ice driver is removed on the system with installed E810T NIC which includes dpll device: WARNING: CPU: 51 PID: 9155 at drivers/dpll/dpll_core.c:809 dpll_pin_ops+0x20/0x30 RIP: 0010:dpll_pin_ops+0x20/0x30 Call Trace: ? __warn+0x7f/0x130 ? dpll_pin_ops+0x20/0x30 dpll_msg_add_pin_freq+0x37/0x1d0 dpll_cmd_pin_get_one+0x1c0/0x400 ? __nlmsg_put+0x63/0x80 dpll_pin_event_send+0x93/0x140 dpll_pin_on_pin_unregister+0x3f/0x100 ice_dpll_deinit_pins+0xa1/0x230 [ice] ice_remove+0xf1/0x210 [ice] Fix by adding a parent pointer as a cookie when creating a registration, also when searching for it. For the regular pins pass NULL, this allows to create separated registration for each parent the pin is registered with. Fixes: b446631f ("dpll: fix dpll_xa_ref_*_del() for multiple registrations") Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Link: https://lore.kernel.org/r/20240424101636.1491424-1-arkadiusz.kubalewski@intel.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Geert Uytterhoeven authored
As interrupts are now requested from ravb_probe(), before calling register_netdev(), ndev->name still contains the template "eth%d", leading to funny names in /proc/interrupts. E.g. on R-Car E3: 89: 0 0 GICv2 93 Level eth%d:ch22:multi 90: 0 3 GICv2 95 Level eth%d:ch24:emac 91: 0 23484 GICv2 71 Level eth%d:ch0:rx_be 92: 0 0 GICv2 72 Level eth%d:ch1:rx_nc 93: 0 13735 GICv2 89 Level eth%d:ch18:tx_be 94: 0 0 GICv2 90 Level eth%d:ch19:tx_nc Worse, on platforms with multiple RAVB instances (e.g. R-Car V4H), all interrupts have similar names. Fix this by using the device name instead, like is done in several other drivers: 89: 0 0 GICv2 93 Level e6800000.ethernet:ch22:multi 90: 0 1 GICv2 95 Level e6800000.ethernet:ch24:emac 91: 0 28578 GICv2 71 Level e6800000.ethernet:ch0:rx_be 92: 0 0 GICv2 72 Level e6800000.ethernet:ch1:rx_nc 93: 0 14044 GICv2 89 Level e6800000.ethernet:ch18:tx_be 94: 0 0 GICv2 90 Level e6800000.ethernet:ch19:tx_nc Rename the local variable dev_name, as it shadows the dev_name() function, and pre-initialize it, to simplify the code. Fixes: 32f012b8 ("net: ravb: Move getting/requesting IRQs in the probe() method") Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se> Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru> Reviewed-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com> Tested-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com> # on RZ/G3S Link: https://lore.kernel.org/r/cde67b68adf115b3cf0b44c32334ae00b2fbb321.1713944647.git.geert+renesas@glider.beSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Su Hui authored
Clang static checker(scan-build) warning: drivers/net/ethernet/marvell/octeontx2/af/rvu_npc.c:line 2184, column 2 Attempt to free released memory. npc_mcam_rsrcs_deinit() has released 'mcam->counters.bmap'. Deleted this redundant kfree() to fix this double free problem. Fixes: dd784287 ("octeontx2-af: Add new devlink param to configure maximum usable NIX block LFs") Signed-off-by: Su Hui <suhui@nfschina.com> Reviewed-by: Geetha sowjanya <gakula@marvell.com> Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Reviewed-by: Hariprasad Kelam <hkelam@marvell.com> Link: https://lore.kernel.org/r/20240424022724.144587-1-suhui@nfschina.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Jason Reeder authored
The CPTS, by design, captures the messageType (Sync, Delay_Req, etc.) field from the second nibble of the PTP header which is defined in the PTPv2 (1588-2008) specification. In the PTPv1 (1588-2002) specification the first two bytes of the PTP header are defined as the versionType which is always 0x0001. This means that any PTPv1 packets that are tagged for TX timestamping by the CPTS will have their messageType set to 0x0 which corresponds to a Sync message type. This causes issues when a PTPv1 stack is expecting a Delay_Req (messageType: 0x1) timestamp that never appears. Fix this by checking if the ptp_class of the timestamped TX packet is PTP_CLASS_V1 and then matching the PTP sequence ID to the stored sequence ID in the skb->cb data structure. If the sequence IDs match and the packet is of type PTPv1 then there is a chance that the messageType has been incorrectly stored by the CPTS so overwrite the messageType stored by the CPTS with the messageType from the skb->cb data structure. This allows the PTPv1 stack to receive TX timestamps for Delay_Req packets which are necessary to lock onto a PTP Leader. Signed-off-by: Jason Reeder <jreeder@ti.com> Signed-off-by: Ravi Gunasekaran <r-gunasekaran@ti.com> Tested-by: Ed Trexel <ed.trexel@hp.com> Fixes: f6bd5952 ("net: ethernet: ti: introduce am654 common platform time sync driver") Link: https://lore.kernel.org/r/20240424071626.32558-1-r-gunasekaran@ti.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Jakub Kicinski authored
Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2024-04-23 (i40e, iavf, ice) This series contains updates to i40e, iavf, and ice drivers. Sindhu removes WQ_MEM_RECLAIM flag from workqueue for i40e. Erwan Velu adjusts message to avoid confusion on base being reported on i40e. Sudheer corrects insufficient check for TC equality on iavf. Jake corrects ordering of locks to avoid possible deadlock on ice. ==================== Link: https://lore.kernel.org/r/20240423182723.740401-1-anthony.l.nguyen@intel.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Jakub Kicinski authored
Rahul Rameshbabu says: ==================== Fix isolation of broadcast traffic and unmatched unicast traffic with MACsec offload Some device drivers support devices that enable them to annotate whether a Rx skb refers to a packet that was processed by the MACsec offloading functionality of the device. Logic in the Rx handling for MACsec offload does not utilize this information to preemptively avoid forwarding to the macsec netdev currently. Because of this, things like multicast messages or unicast messages with an unmatched destination address such as ARP requests are forwarded to the macsec netdev whether the message received was MACsec encrypted or not. The goal of this patch series is to improve the Rx handling for MACsec offload for devices capable of annotating skbs received that were decrypted by the NIC offload for MACsec. Here is a summary of the issue that occurs with the existing logic today. * The current design of the MACsec offload handling path tries to use "best guess" mechanisms for determining whether a packet associated with the currently handled skb in the datapath was processed via HW offload * The best guess mechanism uses the following heuristic logic (in order of precedence) - Check if header destination MAC address matches MACsec netdev MAC address -> forward to MACsec port - Check if packet is multicast traffic -> forward to MACsec port - MACsec security channel was able to be looked up from skb offload context (mlx5 only) -> forward to MACsec port * Problem: plaintext traffic can potentially solicit a MACsec encrypted response from the offload device - Core aspect of MACsec is that it identifies unauthorized LAN connections and excludes them from communication + This behavior can be seen when not enabling offload for MACsec - The offload behavior violates this principle in MACsec I believe this behavior is a security bug since applications utilizing MACsec could be exploited using this behavior, and the correct way to resolve this is by having the hardware correctly indicate whether MACsec offload occurred for the packet or not. In the patches in this series, I leave a warning for when the problematic path occurs because I cannot figure out a secure way to fix the security issue that applies to the core MACsec offload handling in the Rx path without breaking MACsec offload for other vendors. Shown at the bottom is an example use case where plaintext traffic sent to a physical port of a NIC configured for MACsec offload is unable to be handled correctly by the software stack when the NIC provides awareness to the kernel about whether the received packet is MACsec traffic or not. In this specific example, plaintext ARP requests are being responded with MACsec encrypted ARP replies (which leads to routing information being unable to be built for the requester). Side 1 ip link del macsec0 ip address flush mlx5_1 ip address add 1.1.1.1/24 dev mlx5_1 ip link set dev mlx5_1 up ip link add link mlx5_1 macsec0 type macsec sci 1 encrypt on ip link set dev macsec0 address 00:11:22:33:44:66 ip macsec offload macsec0 mac ip macsec add macsec0 tx sa 0 pn 1 on key 00 dffafc8d7b9a43d5b9a3dfbbf6a30c16 ip macsec add macsec0 rx sci 2 on ip macsec add macsec0 rx sci 2 sa 0 pn 1 on key 00 ead3664f508eb06c40ac7104cdae4ce5 ip address flush macsec0 ip address add 2.2.2.1/24 dev macsec0 ip link set dev macsec0 up # macsec0 enters promiscuous mode. # This enables all traffic received on macsec_vlan to be processed by # the macsec offload rx datapath. This however means that traffic # meant to be received by mlx5_1 will be incorrectly steered to # macsec0 as well. ip link add link macsec0 name macsec_vlan type vlan id 1 ip link set dev macsec_vlan address 00:11:22:33:44:88 ip address flush macsec_vlan ip address add 3.3.3.1/24 dev macsec_vlan ip link set dev macsec_vlan up Side 2 ip link del macsec0 ip address flush mlx5_1 ip address add 1.1.1.2/24 dev mlx5_1 ip link set dev mlx5_1 up ip link add link mlx5_1 macsec0 type macsec sci 2 encrypt on ip link set dev macsec0 address 00:11:22:33:44:77 ip macsec offload macsec0 mac ip macsec add macsec0 tx sa 0 pn 1 on key 00 ead3664f508eb06c40ac7104cdae4ce5 ip macsec add macsec0 rx sci 1 on ip macsec add macsec0 rx sci 1 sa 0 pn 1 on key 00 dffafc8d7b9a43d5b9a3dfbbf6a30c16 ip address flush macsec0 ip address add 2.2.2.2/24 dev macsec0 ip link set dev macsec0 up # macsec0 enters promiscuous mode. # This enables all traffic received on macsec_vlan to be processed by # the macsec offload rx datapath. This however means that traffic # meant to be received by mlx5_1 will be incorrectly steered to # macsec0 as well. ip link add link macsec0 name macsec_vlan type vlan id 1 ip link set dev macsec_vlan address 00:11:22:33:44:99 ip address flush macsec_vlan ip address add 3.3.3.2/24 dev macsec_vlan ip link set dev macsec_vlan up Side 1 ping -I mlx5_1 1.1.1.2 PING 1.1.1.2 (1.1.1.2) from 1.1.1.1 mlx5_1: 56(84) bytes of data. From 1.1.1.1 icmp_seq=1 Destination Host Unreachable ping: sendmsg: No route to host From 1.1.1.1 icmp_seq=2 Destination Host Unreachable From 1.1.1.1 icmp_seq=3 Destination Host Unreachable Changes: v2->v3: * Made dev paramater const for eth_skb_pkt_type helper as suggested by Sabrina Dubroca <sd@queasysnail.net> v1->v2: * Fixed series subject to detail the issue being fixed * Removed strange characters from cover letter * Added comment in example that illustrates the impact involving promiscuous mode * Added patch for generalizing packet type detection * Added Fixes: tags and targeting net * Removed pointless warning in the heuristic Rx path for macsec offload * Applied small refactor in Rx path offload to minimize scope of rx_sc local variable Link: https://github.com/Binary-Eater/macsec-rx-offload/blob/trunk/MACsec_violation_in_core_stack_offload_rx_handling.pdf Link: https://lore.kernel.org/netdev/20240419213033.400467-5-rrameshbabu@nvidia.com/ Link: https://lore.kernel.org/netdev/20240419011740.333714-1-rrameshbabu@nvidia.com/ Link: https://lore.kernel.org/netdev/87r0l25y1c.fsf@nvidia.com/ Link: https://lore.kernel.org/netdev/20231116182900.46052-1-rrameshbabu@nvidia.com/ ==================== Link: https://lore.kernel.org/r/20240423181319.115860-1-rrameshbabu@nvidia.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Jacob Keller authored
9f74a3df ("ice: Fix VF Reset paths when interface in a failed over aggregate"), the ice driver has acquired the LAG mutex in ice_reset_vf(). The commit placed this lock acquisition just prior to the acquisition of the VF configuration lock. If ice_reset_vf() acquires the configuration lock via the ICE_VF_RESET_LOCK flag, this could deadlock with ice_vc_cfg_qs_msg() because it always acquires the locks in the order of the VF configuration lock and then the LAG mutex. Lockdep reports this violation almost immediately on creating and then removing 2 VF: ====================================================== WARNING: possible circular locking dependency detected 6.8.0-rc6 #54 Tainted: G W O ------------------------------------------------------ kworker/60:3/6771 is trying to acquire lock: ff40d43e099380a0 (&vf->cfg_lock){+.+.}-{3:3}, at: ice_reset_vf+0x22f/0x4d0 [ice] but task is already holding lock: ff40d43ea1961210 (&pf->lag_mutex){+.+.}-{3:3}, at: ice_reset_vf+0xb7/0x4d0 [ice] which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (&pf->lag_mutex){+.+.}-{3:3}: __lock_acquire+0x4f8/0xb40 lock_acquire+0xd4/0x2d0 __mutex_lock+0x9b/0xbf0 ice_vc_cfg_qs_msg+0x45/0x690 [ice] ice_vc_process_vf_msg+0x4f5/0x870 [ice] __ice_clean_ctrlq+0x2b5/0x600 [ice] ice_service_task+0x2c9/0x480 [ice] process_one_work+0x1e9/0x4d0 worker_thread+0x1e1/0x3d0 kthread+0x104/0x140 ret_from_fork+0x31/0x50 ret_from_fork_asm+0x1b/0x30 -> #0 (&vf->cfg_lock){+.+.}-{3:3}: check_prev_add+0xe2/0xc50 validate_chain+0x558/0x800 __lock_acquire+0x4f8/0xb40 lock_acquire+0xd4/0x2d0 __mutex_lock+0x9b/0xbf0 ice_reset_vf+0x22f/0x4d0 [ice] ice_process_vflr_event+0x98/0xd0 [ice] ice_service_task+0x1cc/0x480 [ice] process_one_work+0x1e9/0x4d0 worker_thread+0x1e1/0x3d0 kthread+0x104/0x140 ret_from_fork+0x31/0x50 ret_from_fork_asm+0x1b/0x30 other info that might help us debug this: Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&pf->lag_mutex); lock(&vf->cfg_lock); lock(&pf->lag_mutex); lock(&vf->cfg_lock); *** DEADLOCK *** 4 locks held by kworker/60:3/6771: #0: ff40d43e05428b38 ((wq_completion)ice){+.+.}-{0:0}, at: process_one_work+0x176/0x4d0 #1: ff50d06e05197e58 ((work_completion)(&pf->serv_task)){+.+.}-{0:0}, at: process_one_work+0x176/0x4d0 #2: ff40d43ea1960e50 (&pf->vfs.table_lock){+.+.}-{3:3}, at: ice_process_vflr_event+0x48/0xd0 [ice] #3: ff40d43ea1961210 (&pf->lag_mutex){+.+.}-{3:3}, at: ice_reset_vf+0xb7/0x4d0 [ice] stack backtrace: CPU: 60 PID: 6771 Comm: kworker/60:3 Tainted: G W O 6.8.0-rc6 #54 Hardware name: Workqueue: ice ice_service_task [ice] Call Trace: <TASK> dump_stack_lvl+0x4a/0x80 check_noncircular+0x12d/0x150 check_prev_add+0xe2/0xc50 ? save_trace+0x59/0x230 ? add_chain_cache+0x109/0x450 validate_chain+0x558/0x800 __lock_acquire+0x4f8/0xb40 ? lockdep_hardirqs_on+0x7d/0x100 lock_acquire+0xd4/0x2d0 ? ice_reset_vf+0x22f/0x4d0 [ice] ? lock_is_held_type+0xc7/0x120 __mutex_lock+0x9b/0xbf0 ? ice_reset_vf+0x22f/0x4d0 [ice] ? ice_reset_vf+0x22f/0x4d0 [ice] ? rcu_is_watching+0x11/0x50 ? ice_reset_vf+0x22f/0x4d0 [ice] ice_reset_vf+0x22f/0x4d0 [ice] ? process_one_work+0x176/0x4d0 ice_process_vflr_event+0x98/0xd0 [ice] ice_service_task+0x1cc/0x480 [ice] process_one_work+0x1e9/0x4d0 worker_thread+0x1e1/0x3d0 ? __pfx_worker_thread+0x10/0x10 kthread+0x104/0x140 ? __pfx_kthread+0x10/0x10 ret_from_fork+0x31/0x50 ? __pfx_kthread+0x10/0x10 ret_from_fork_asm+0x1b/0x30 </TASK> To avoid deadlock, we must acquire the LAG mutex only after acquiring the VF configuration lock. Fix the ice_reset_vf() to acquire the LAG mutex only after we either acquire or check that the VF configuration lock is held. Fixes: 9f74a3df ("ice: Fix VF Reset paths when interface in a failed over aggregate") Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Dave Ertman <david.m.ertman@intel.com> Reviewed-by: Mateusz Polchlopek <mateusz.polchlopek@intel.com> Tested-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://lore.kernel.org/r/20240423182723.740401-5-anthony.l.nguyen@intel.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Sudheer Mogilappagari authored
Same number of TCs doesn't imply that underlying TC configs are same. The config could be different due to difference in number of queues in each TC. Add utility function to determine if TC configs are same. Fixes: d5b33d02 ("i40evf: add ndo_setup_tc callback to i40evf") Signed-off-by: Sudheer Mogilappagari <sudheer.mogilappagari@intel.com> Tested-by: Mineri Bhange <minerix.bhange@intel.com> (A Contingent Worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://lore.kernel.org/r/20240423182723.740401-4-anthony.l.nguyen@intel.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Erwan Velu authored
If the MFS is set below the default (0x2600), a warning message is reported like the following : MFS for port 1 has been set below the default: 600 This message is a bit confusing as the number shown here (600) is in fact an hexa number: 0x600 = 1536 Without any explicit "0x" prefix, this message is read like the MFS is set to 600 bytes. MFS, as per MTUs, are usually expressed in decimal base. This commit reports both current and default MFS values in decimal so it's less confusing for end-users. A typical warning message looks like the following : MFS for port 1 (1536) has been set below the default (9728) Signed-off-by: Erwan Velu <e.velu@criteo.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Tony Brelinski <tony.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Fixes: 3a2c6ced ("i40e: Add a check to see if MFS is set") Link: https://lore.kernel.org/r/20240423182723.740401-3-anthony.l.nguyen@intel.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-