- 05 Apr, 2016 8 commits
-
-
Thadeu Lima de Souza Cascardo authored
When creating an ip6tnl tunnel with ip tunnel, rtnl_link_ops is not set before ip6_tnl_create2 is called. When register_netdevice is called, there is no linkinfo attribute in the NEWLINK message because of that. Setting rtnl_link_ops before calling register_netdevice fixes that. Fixes: 0b112457 ("ip6tnl: add support of link creation via rtnl") Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@redhat.com> Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Bjorn Helgaas authored
This reverts commit 543e3a8d. Direct callers of __netpoll_setup() depend on it to set np->dev, so we can't simply move that assignment up to netpoll_stup(). Reported-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-queueDavid S. Miller authored
Jeff Kirsher says: ==================== Intel Wired LAN Driver Updates 2016-04-05 This series contains updates to i40e and e1000. Jesse fixes an issue where code was added by a previous commit but the flag to enable it was never set. Alex fixes the e1000 driver from grossly overestimated the descriptors needed to transmit a frame. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Alexander Duyck authored
The 82544 has code that adds one additional descriptor per data buffer. However we weren't taking that into account when determining the descriptors needed for the next transmit at the end of the xmit_frame path. This change takes that into account by doubling the number of descriptors needed for the 82544 so that we can avoid a potential issue where we could hang the Tx ring by loading frames with xmit_more enabled and then stopping the ring without writing the tail. In addition it adds a few more descriptors to account for some additional workarounds that have been added over time. Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
Alexander Duyck authored
The current code path is capable of grossly overestimating the number of descriptors needed to transmit a new frame. This specifically occurs if the skb contains a number of 4K pages. The issue is that the logic for determining the descriptors needed is ((S) >> (X)) + 1. When X is 12 it means that we were indicating that we required 2 descriptors for each 4K page when we only needed one. This change corrects this by instead adding (1 << (X)) - 1 to the S value instead of adding 1 after the fact. This way we get an accurate descriptor needed count as we are essentially doing a DIV_ROUNDUP(). Reported-by: Ivan Suzdal <isuzdal@mirantis.com> Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
Jesse Brandeburg authored
There was an error introduced with commit 3fced535 ("i40e: X722 is on the IOSF bus and does not report the PCI bus info"), where code was added but the enabling flag is never set. CC: Anjali Singhai Jain <anjali.singhai@intel.com> CC: Stefan Assman <sassman@redhat.com> Fixes: 3fced535 ("i40e: X722 is on the IOSF bus ...") Reported-by: Steve Best <sbest@redhat.com> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
-
Marcelo Ricardo Leitner authored
Use list_* helpers in sctp_list_dequeue, more readable. Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Marcelo Ricardo Leitner authored
There is no point on delaying the packet if we can't fit a single byte of data on it anymore. So lets just reduce the threshold by the amount that a data chunk with 4 bytes (rounding) would use. v2: based on the right tree Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 04 Apr, 2016 3 commits
-
-
Bastien Philbert authored
This fixes the incorrect variable assignment on error path in br_sysfs_addbr for when the call to kobject_create_and_add fails to assign the value of -EINVAL to the returned variable of err rather then incorrectly return zero making callers think this function has succeededed due to the previous assignment being assigned zero when assigning it the successful return value of the call to sysfs_create_group which is zero. Signed-off-by: Bastien Philbert <bastienphilbert@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Haishuang Yan authored
pskb_may_pull() can change skb->data, so we have to load ptr/optr at the right place. Signed-off-by: Haishuang Yan <yanhaishuang@cmss.chinamobile.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Haishuang Yan authored
pskb_may_pull() can change skb->data, so we have to load ptr/optr at the right place. Signed-off-by: Haishuang Yan <yanhaishuang@cmss.chinamobile.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 02 Apr, 2016 4 commits
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds authored
Pull networking fixes from David Miller: 1) Missing device reference in IPSEC input path results in crashes during device unregistration. From Subash Abhinov Kasiviswanathan. 2) Per-queue ISR register writes not being done properly in macb driver, from Cyrille Pitchen. 3) Stats accounting bugs in bcmgenet, from Patri Gynther. 4) Lightweight tunnel's TTL and TOS were swapped in netlink dumps, from Quentin Armitage. 5) SXGBE driver has off-by-one in probe error paths, from Rasmus Villemoes. 6) Fix race in save/swap/delete options in netfilter ipset, from Vishwanath Pai. 7) Ageing time of bridge not set properly when not operating over a switchdev device. Fix from Haishuang Yan. 8) Fix GRO regression wrt nested FOU/GUE based tunnels, from Alexander Duyck. 9) IPV6 UDP code bumps wrong stats, from Eric Dumazet. 10) FEC driver should only access registers that actually exist on the given chipset, fix from Fabio Estevam. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (73 commits) net: mvneta: fix changing MTU when using per-cpu processing stmmac: fix MDIO settings Revert "stmmac: Fix 'eth0: No PHY found' regression" stmmac: fix TX normal DESC net: mvneta: use cache_line_size() to get cacheline size net: mvpp2: use cache_line_size() to get cacheline size net: mvpp2: fix maybe-uninitialized warning tun, bpf: fix suspicious RCU usage in tun_{attach, detach}_filter net: usb: cdc_ncm: adding Telit LE910 V2 mobile broadband card rtnl: fix msg size calculation in if_nlmsg_size() fec: Do not access unexisting register in Coldfire net: mvneta: replace MVNETA_CPU_D_CACHE_LINE_SIZE with L1_CACHE_BYTES net: mvpp2: replace MVPP2_CPU_D_CACHE_LINE_SIZE with L1_CACHE_BYTES net: dsa: mv88e6xxx: Clear the PDOWN bit on setup net: dsa: mv88e6xxx: Introduce _mv88e6xxx_phy_page_{read, write} bpf: make padding in bpf_tunnel_key explicit ipv6: udp: fix UDP_MIB_IGNOREDMULTI updates bnxt_en: Fix ethtool -a reporting. bnxt_en: Fix typo in bnxt_hwrm_set_pause_common(). bnxt_en: Implement proper firmware message padding. ...
-
git://git.kernel.org/pub/scm/linux/kernel/git/clk/linuxLinus Torvalds authored
Pull clk fixes from Stephen Boyd: "A handful of const updates for reset ops and a couple fixes to the newly introduced IPQ4019 clock driver" * tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: clk: qcom: ipq4019: add some fixed clocks for ddrppl and fepll clk: qcom: ipq4019: switch remaining defines to enums clk: qcom: Make reset_control_ops const clk: tegra: Make reset_control_ops const clk: sunxi: Make reset_control_ops const clk: atlas7: Make reset_control_ops const clk: rockchip: Make reset_control_ops const clk: mmp: Make reset_control_ops const clk: mediatek: Make reset_control_ops const
-
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pmLinus Torvalds authored
Pull power management and ACPI fix from Rafael J. Wysocki: "Just one fix for a nasty boot failure on some systems based on Intel Skylake that shipped with broken firmware where enabling hardware-coordinated P-states management (HWP) causes a faulty interrupt handler in SMM to be invoked and crash the system (Srinivas Pandruvada)" * tag 'pm+acpi-4.6-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: ACPI / processor: Request native thermal interrupt handling via _OSC
-
Linus Torvalds authored
Merge fixes from Andrew Morton: "11 fixes" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: .mailmap: add Christophe Ricard Make CONFIG_FHANDLE default y mm/page_isolation.c: fix the function comments oom, oom_reaper: do not enqueue task if it is on the oom_reaper_list head mm/page_isolation: fix tracepoint to mirror check function behavior mm/rmap: batched invalidations should use existing api x86/mm: TLB_REMOTE_SEND_IPI should count pages mm: fix invalid node in alloc_migrate_target() include/linux/huge_mm.h: return NULL instead of false for pmd_trans_huge_lock() mm, kasan: fix compilation for CONFIG_SLAB MAINTAINERS: orangefs mailing list is subscribers-only
-
- 01 Apr, 2016 25 commits
-
-
Rafael J. Wysocki authored
* acpi-processor: ACPI / processor: Request native thermal interrupt handling via _OSC
-
git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfsLinus Torvalds authored
Pull btrfs fixes from Chris Mason: "This has a few fixes Dave Sterba had queued up. These are all pretty small, but since they were tested I decided against waiting for more" * 'for-linus-4.6' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: btrfs: transaction_kthread() is not freezable btrfs: cleaner_kthread() doesn't need explicit freeze btrfs: do not write corrupted metadata blocks to disk btrfs: csum_tree_block: return proper errno value
-
git://github.com/martinbrandenburg/linuxLinus Torvalds authored
Pull OrangeFS fixes from Martin Brandenburg: "Two bugfixes for OrangeFS. One is a reference counting bug and the other is a typo in client minimum version" * tag 'for-linus' of git://github.com/martinbrandenburg/linux: orangefs: minimum userspace version is 2.9.3 orangefs: don't put readdir slot twice
-
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linuxLinus Torvalds authored
Pull arm64 fixes from Will Deacon: - fix oops when patching in alternative sequences on big-endian CPUs - reconcile asm/perf_event.h after merge window fallout with KVM ARM - defconfig updates * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64: defconfig: updates for 4.6 arm64: perf: Move PMU register related defines to asm/perf_event.h arm64: opcodes.h: Add arm big-endian config options before including arm header
-
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/soundLinus Torvalds authored
Pull sound fixes from Takashi Iwai: "A collection of small fixes: - a fix in ALSA timer core to avoid possible BUG() trigger - a fix in ALSA timer core 32bit compat layer - a few HD-audio quirks for ASUS and HP machines - AMD HD-audio HDMI controller quirks - fixes of USB-audio double-free at some error paths - a fix for memory leak in DICE driver at hotunplug" * tag 'sound-4.6-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: ALSA: timer: Use mod_timer() for rearming the system timer ALSA: hda - fix front mic problem for a HP desktop ALSA: usb-audio: Fix double-free in error paths after snd_usb_add_audio_stream() call ALSA: hda: add AMD Polaris-10/11 AZ PCI IDs with proper driver caps ALSA: dice: fix memory leak when unplugging ALSA: hda - Apply fix for white noise on Asus N550JV, too ALSA: hda - Fix white noise on Asus N750JV headphone ALSA: hda - Asus N750JV external subwoofer fixup ALSA: timer: fix gparams ioctl compatibility for different architectures
-
Christophe Ricard authored
Different computers had different settings in the mail client. Some contributions appear as Christophe Ricard, others as Christophe RICARD. Signed-off-by: Christophe Ricard <christophe-h.ricard@st.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Andi Kleen authored
Newer Fedora and OpenSUSE didn't boot with my standard configuration. It took me some time to figure out why, in fact I had to write a script to try different config options systematically. The problem is that something (systemd) in dracut depends on CONFIG_FHANDLE, which adds open by file handle syscalls. While it is set in defconfigs it is very easy to miss when updating older configs because it is not default y. Make it default y and also depend on EXPERT, as dracut use is likely widespread. Signed-off-by: Andi Kleen <ak@linux.intel.com> Cc: Richard Weinberger <richard.weinberger@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Neil Zhang authored
Commit fea85cff ("mm/page_isolation.c: return last tested pfn rather than failure indicator") changed the meaning of the return value. Let's change the function comments as well. Signed-off-by: Neil Zhang <neilzhang1123@hotmail.com> Cc: Joonsoo Kim <js1304@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Michal Hocko authored
Commit bb29902a ("oom, oom_reaper: protect oom_reaper_list using simpler way") has simplified the check for tasks already enqueued for the oom reaper by checking tsk->oom_reaper_list != NULL. This check is not sufficient because the tsk might be the head of the queue without any other tasks queued and then we would simply lockup looping on the same task. Fix the condition by checking for the head as well. Fixes: bb29902a ("oom, oom_reaper: protect oom_reaper_list using simpler way") Signed-off-by: Michal Hocko <mhocko@suse.com> Acked-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Lucas Stach authored
Page isolation has not failed if the fin pfn extends beyond the end pfn and test_pages_isolated checks this correctly. Fix the tracepoint to report the same result as the actual check function. Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Joonsoo Kim <js1304@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Nadav Amit authored
The recently introduced batched invalidations mechanism uses its own mechanism for shootdown. However, it does wrong accounting of interrupts (e.g., inc_irq_stat is called for local invalidations), trace-points (e.g., TLB_REMOTE_SHOOTDOWN for local invalidations) and may break some platforms as it bypasses the invalidation mechanisms of Xen and SGI UV. This patch reuses the existing TLB flushing mechnaisms instead. We use NULL as mm to indicate a global invalidation is required. Fixes 72b252ae ("mm: send one IPI per CPU to TLB flush all entries after unmapping pages") Signed-off-by: Nadav Amit <namit@vmware.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Rik van Riel <riel@redhat.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Ingo Molnar <mingo@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Nadav Amit authored
TLB_REMOTE_SEND_IPI was recently introduced, but it counts bytes instead of pages. In addition, it does not report correctly the case in which flush_tlb_page flushes a page. Fix it to be consistent with other TLB counters. Fixes: 5b74283a ("x86, mm: trace when an IPI is about to be sent") Signed-off-by: Nadav Amit <namit@vmware.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Rik van Riel <riel@redhat.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Ingo Molnar <mingo@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Xishi Qiu authored
It is incorrect to use next_node to find a target node, it will return MAX_NUMNODES or invalid node. This will lead to crash in buddy system allocation. Fixes: c8721bbb ("mm: memory-hotplug: enable memory hotplug to handle hugepage") Signed-off-by: Xishi Qiu <qiuxishi@huawei.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Joonsoo Kim <js1304@gmail.com> Cc: David Rientjes <rientjes@google.com> Cc: "Laura Abbott" <lauraa@codeaurora.org> Cc: Hui Zhu <zhuhui@xiaomi.com> Cc: Wang Xiaoqiang <wangxq10@lzu.edu.cn> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Chen Gang authored
The return value of pmd_trans_huge_lock() is a pointer, not a boolean value, so use NULL instead of false as the return value. Signed-off-by: Chen Gang <gang.chen.5i5j@gmail.com> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Alexander Potapenko authored
Add the missing argument to set_track(). Fixes: cd11016e ("mm, kasan: stackdepot implementation. Enable stackdepot for SLAB") Signed-off-by: Alexander Potapenko <glider@google.com> Cc: Andrey Konovalov <adech.fo@gmail.com> Cc: Christoph Lameter <cl@linux.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Konstantin Serebryany <kcc@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Joe Perches authored
So update MAINTAINERS to say so. Signed-off-by: Joe Perches <joe@perches.com> Cc: Mike Marshall <hubcap@omnibond.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Marcin Wojtas authored
After enabling per-cpu processing it appeared that under heavy load changing MTU can result in blocking all port's interrupts and transmitting data is not possible after the change. This commit fixes above issue by disabling percpu interrupts for the time, when TXQs and RXQs are reconfigured. Signed-off-by: Marcin Wojtas <mw@semihalf.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Giuseppe Cavallaro says: ==================== stmmac MDIO and normal descr fixes This patch series is to fix the problems below and recently debugged in this mailing list: o to fix a problem for the HW where the normal descriptor o to fix the mdio registration according to the different platform configurations I am resending all the patches again: built on top of net.git repo. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Giuseppe CAVALLARO authored
Initially the phy_bus_name was added to manipulate the driver name but it was recently just used to manage the fixed-link and then to take some decision at run-time. So the patch uses the is_pseudo_fixed_link and removes the phy_bus_name variable not necessary anymore. The driver can manage the mdio registration by using phy-handle, dwmac-mdio and own parameter e.g. snps,phy-addr. This patch takes care about all these possible configurations and fixes the mdio registration in case of there is a real transceiver or a switch (that needs to be managed by using fixed-link). Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com> Reviewed-by: Andreas Färber <afaerber@suse.de> Tested-by: Frank Schäfer <fschaefer.oss@googlemail.com> Cc: Gabriel Fernandez <gabriel.fernandez@linaro.org> Cc: Dinh Nguyen <dinh.linux@gmail.com> Cc: David S. Miller <davem@davemloft.net> Cc: Phil Reid <preid@electromag.com.au> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Giuseppe CAVALLARO authored
This reverts commit 88f8b1bb. due to problems on GeekBox and Banana Pi M1 board when connected to a real transceiver instead of a switch via fixed-link. Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com> Cc: Gabriel Fernandez <gabriel.fernandez@linaro.org> Cc: Andreas Färber <afaerber@suse.de> Cc: Frank Schäfer <fschaefer.oss@googlemail.com> Cc: Dinh Nguyen <dinh.linux@gmail.com> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Giuseppe CAVALLARO authored
This patch fixs a regression raised when test on chips that use the normal descriptor layout. In fact, no len bits were set for the TDES1 and no OWN bit inside the TDES0. Signed-off-by: Giuseppe CAVALLARO <peppe.cavallaro@st.com> Tested-by: Andreas Färber <afaerber@suse.de> Cc: Fabrice Gasnier <fabrice.gasnier@st.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jisheng Zhang authored
L1_CACHE_BYTES may not be the real cacheline size, use cache_line_size to determine the cacheline size in runtime. Signed-off-by: Jisheng Zhang <jszhang@marvell.com> Suggested-by: Marcin Wojtas <mw@semihalf.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jisheng Zhang authored
L1_CACHE_BYTES may not be the real cacheline size, use cache_line_size to determine the cacheline size in runtime. Signed-off-by: Jisheng Zhang <jszhang@marvell.com> Suggested-by: Marcin Wojtas <mw@semihalf.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jisheng Zhang authored
This is to fix the following maybe-uninitialized warning: drivers/net/ethernet/marvell/mvpp2.c:6007:18: warning: 'err' may be used uninitialized in this function [-Wmaybe-uninitialized] Signed-off-by: Jisheng Zhang <jszhang@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Daniel Borkmann authored
Sasha Levin reported a suspicious rcu_dereference_protected() warning found while fuzzing with trinity that is similar to this one: [ 52.765684] net/core/filter.c:2262 suspicious rcu_dereference_protected() usage! [ 52.765688] other info that might help us debug this: [ 52.765695] rcu_scheduler_active = 1, debug_locks = 1 [ 52.765701] 1 lock held by a.out/1525: [ 52.765704] #0: (rtnl_mutex){+.+.+.}, at: [<ffffffff816a64b7>] rtnl_lock+0x17/0x20 [ 52.765721] stack backtrace: [ 52.765728] CPU: 1 PID: 1525 Comm: a.out Not tainted 4.5.0+ #264 [...] [ 52.765768] Call Trace: [ 52.765775] [<ffffffff813e488d>] dump_stack+0x85/0xc8 [ 52.765784] [<ffffffff810f2fa5>] lockdep_rcu_suspicious+0xd5/0x110 [ 52.765792] [<ffffffff816afdc2>] sk_detach_filter+0x82/0x90 [ 52.765801] [<ffffffffa0883425>] tun_detach_filter+0x35/0x90 [tun] [ 52.765810] [<ffffffffa0884ed4>] __tun_chr_ioctl+0x354/0x1130 [tun] [ 52.765818] [<ffffffff8136fed0>] ? selinux_file_ioctl+0x130/0x210 [ 52.765827] [<ffffffffa0885ce3>] tun_chr_ioctl+0x13/0x20 [tun] [ 52.765834] [<ffffffff81260ea6>] do_vfs_ioctl+0x96/0x690 [ 52.765843] [<ffffffff81364af3>] ? security_file_ioctl+0x43/0x60 [ 52.765850] [<ffffffff81261519>] SyS_ioctl+0x79/0x90 [ 52.765858] [<ffffffff81003ba2>] do_syscall_64+0x62/0x140 [ 52.765866] [<ffffffff817d563f>] entry_SYSCALL64_slow_path+0x25/0x25 Same can be triggered with PROVE_RCU (+ PROVE_RCU_REPEATEDLY) enabled from tun_attach_filter() when user space calls ioctl(tun_fd, TUN{ATTACH, DETACH}FILTER, ...) for adding/removing a BPF filter on tap devices. Since the fix in f91ff5b9 ("net: sk_{detach|attach}_filter() rcu fixes") sk_attach_filter()/sk_detach_filter() now dereferences the filter with rcu_dereference_protected(), checking whether socket lock is held in control path. Since its introduction in 99405162 ("tun: socket filter support"), tap filters are managed under RTNL lock from __tun_chr_ioctl(). Thus the sock_owned_by_user(sk) doesn't apply in this specific case and therefore triggers the false positive. Extend the BPF API with __sk_attach_filter()/__sk_detach_filter() pair that is used by tap filters and pass in lockdep_rtnl_is_held() for the rcu_dereference_protected() checks instead. Reported-by: Sasha Levin <sasha.levin@oracle.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>
-