- 04 Jan, 2021 15 commits
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nfJakub Kicinski authored
Pablo Neira Ayuso says: ==================== Netfilter fixes for net The following patchset contains Netfilter fixes for net: 1) Missing sanitization of rateest userspace string, bug has been triggered by syzbot, patch from Florian Westphal. 2) Report EOPNOTSUPP on missing set features in nft_dynset, otherwise error reporting to userspace via EINVAL is misleading since this is reserved for malformed netlink requests. 3) New binaries with old kernels might silently accept several set element expressions. New binaries set on the NFT_SET_EXPR and NFT_DYNSET_F_EXPR flags to request for several expressions per element, hence old kernels which do not support for this bail out with EOPNOTSUPP. * git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf: netfilter: nftables: add set expression flags netfilter: nft_dynset: report EOPNOTSUPP on missing set feature netfilter: xt_RATEEST: reject non-null terminated string from userspace ==================== Link: https://lore.kernel.org/r/20210103192920.18639-1-pablo@netfilter.orgSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Jakub Kicinski authored
Martin Blumenstingl says: ==================== net: dsa: lantiq_gswip: two fixes for -net/-stable While testing the lantiq_gswip driver in OpenWrt at least one board had a non-working Ethernet port connected to an internal 100Mbit/s PHY22F GPHY. The problem which could be observed: - the PHY would detect the link just fine - ethtool stats would see the TX counter rise - the RX counter in ethtool was stuck at zero It turns out that two independent patches are needed to fix this: - first we need to enable the MII data lines also for internal PHYs - second we need to program the GSWIP_MII_CFG registers for all ports except the CPU port These two patches have also been tested by back-porting them on top of Linux 5.4.86 in OpenWrt. Special thanks to Hauke for debugging and brainstorming this on IRC with me! ==================== Link: https://lore.kernel.org/r/20210103012544.3259029-1-martin.blumenstingl@googlemail.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Martin Blumenstingl authored
There is one GSWIP_MII_CFG register for each switch-port except the CPU port. The register offset for the first port is 0x0, 0x02 for the second, 0x04 for the third and so on. Update the driver to not only restrict the GSWIP_MII_CFG registers to ports 0, 1 and 5. Handle ports 0..5 instead but skip the CPU port. This means we are not overwriting the configuration for the third port (port two since we start counting from zero) with the settings for the sixth port (with number five) anymore. The GSWIP_MII_PCDU(p) registers are not updated because there's really only three (one for each of the following ports: 0, 1, 5). Fixes: 14fceff4 ("net: dsa: Add Lantiq / Intel DSA driver for vrx200") Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com> Acked-by: Hauke Mehrtens <hauke@hauke-m.de> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Martin Blumenstingl authored
Enable GSWIP_MII_CFG_EN also for internal PHYs to make traffic flow. Without this the PHY link is detected properly and ethtool statistics for TX are increasing but there's no RX traffic coming in. Fixes: 14fceff4 ("net: dsa: Add Lantiq / Intel DSA driver for vrx200") Suggested-by: Hauke Mehrtens <hauke@hauke-m.de> Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com> Acked-by: Hauke Mehrtens <hauke@hauke-m.de> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Xie He authored
In lapb_device_event, lapb_devtostruct is called to get a reference to an object of "struct lapb_cb". lapb_devtostruct increases the refcount of the object and returns a pointer to it. However, we didn't decrease the refcount after we finished using the pointer. This patch fixes this problem. Fixes: a4989fa9 ("net/lapb: support netdev events") Cc: Martin Schiller <ms@dev.tdt.de> Signed-off-by: Xie He <xie.he.0141@gmail.com> Link: https://lore.kernel.org/r/20201231174331.64539-1-xie.he.0141@gmail.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Heiner Kallweit authored
A user reported failing network with RTL8168dp (a quite rare chip version). Realtek confirmed that few chip versions suffer from a PLL power-down hw bug. Fixes: 07df5bd8 ("r8169: power down chip in probe") Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Link: https://lore.kernel.org/r/a1c39460-d533-7f9e-fa9d-2b8990b02426@gmail.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Bjørn Mork authored
New modem using ff/ff/30 for QCDM, ff/00/00 for AT and NMEA, and ff/ff/ff for RMNET/QMI. T: Bus=02 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#= 2 Spd=5000 MxCh= 0 D: Ver= 3.20 Cls=ef(misc ) Sub=02 Prot=01 MxPS= 9 #Cfgs= 1 P: Vendor=2c7c ProdID=0620 Rev= 4.09 S: Manufacturer=Quectel S: Product=EM160R-GL S: SerialNumber=e31cedc1 C:* #Ifs= 5 Cfg#= 1 Atr=a0 MxPwr=896mA I:* If#= 0 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=(none) E: Ad=81(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms E: Ad=01(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms I:* If#= 1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=(none) E: Ad=83(I) Atr=03(Int.) MxPS= 10 Ivl=32ms E: Ad=82(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms E: Ad=02(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms I:* If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=(none) E: Ad=85(I) Atr=03(Int.) MxPS= 10 Ivl=32ms E: Ad=84(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms E: Ad=03(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms I:* If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=(none) E: Ad=87(I) Atr=03(Int.) MxPS= 10 Ivl=32ms E: Ad=86(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms E: Ad=04(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms I:* If#= 4 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=(none) E: Ad=88(I) Atr=03(Int.) MxPS= 8 Ivl=32ms E: Ad=8e(I) Atr=02(Bulk) MxPS=1024 Ivl=0ms E: Ad=0f(O) Atr=02(Bulk) MxPS=1024 Ivl=0ms Signed-off-by: Bjørn Mork <bjorn@mork.no> Link: https://lore.kernel.org/r/20201230152451.245271-1-bjorn@mork.noSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Ido Schimmel authored
The test was setting the headroom size of the wrong port. This was not visible because of a firmware bug that canceled this bug. Set the headroom size of the correct port, so that the test will pass with both old and new firmware versions. Fixes: bfa80478 ("selftests: mlxsw: Add a PFC test") Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Link: https://lore.kernel.org/r/20201230114251.394009-1-idosch@idosch.orgSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Charles Keepax authored
A new flag MACB_CAPS_CLK_HW_CHG was added and all callers of macb_set_tx_clk were gated on the presence of this flag. - if (!clk) + if (!bp->tx_clk || !(bp->caps & MACB_CAPS_CLK_HW_CHG)) However the flag was not added to anything other than the new sama7g5_gem, turning that function call into a no op for all other systems. This breaks the networking on Zynq. The commit message adding this states: a new capability so that macb_set_tx_clock() to not be called for IPs having this capability This strongly implies that present of the flag was intended to skip the function not absence of the flag. Update the if statement to this effect, which repairs the existing users. Fixes: daafa1d3 ("net: macb: add capability to not set the clock rate") Suggested-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Charles Keepax <ckeepax@opensource.cirrus.com> Reviewed-by: Claudiu Beznea <claudiu.beznea@microchip.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/20210104103802.13091-1-ckeepax@opensource.cirrus.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
YANG LI authored
The error is due to dereference a null pointer in function reset_one_sub_crq_queue(): if (!scrq) { netdev_dbg(adapter->netdev, "Invalid scrq reset. irq (%d) or msgs(%p).\n", scrq->irq, scrq->msgs); return -EINVAL; } If the expression is true, scrq must be a null pointer and cannot dereference. Fixes: 9281cf2d ("ibmvnic: avoid memset null scrq msgs") Signed-off-by: YANG LI <abaci-bugfix@linux.alibaba.com> Reported-by: Abaci <abaci@linux.alibaba.com> Acked-by: Lijun Pan <ljp@linux.ibm.com> Link: https://lore.kernel.org/r/1609312994-121032-1-git-send-email-abaci-bugfix@linux.alibaba.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Baruch Siach authored
Before commit 889b8f96 ("packet: Kill CONFIG_PACKET_MMAP.") there used to be a CONFIG_PACKET_MMAP config symbol that depended on CONFIG_PACKET. The text still implies that PACKET_MMAP can be disabled. Remove that from the text, as well as reference to old kernel versions. Also, drop reference to broken link to information for pre 2.6.5 kernels. Make a slight working improvement (s/In/On/) while at it. Signed-off-by: Baruch Siach <baruch@tkos.co.il> Link: https://lore.kernel.org/r/80089f3783372c8fd7833f28ce774a171b2ef252.1609232919.git.baruch@tkos.co.ilSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Baruch Siach authored
The citation of macro definitions should appear in a code block. Signed-off-by: Baruch Siach <baruch@tkos.co.il> Link: https://lore.kernel.org/r/5cb47005e7a59b64299e038827e295822193384c.1609232919.git.baruch@tkos.co.ilSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Yunjian Wang authored
Currently the vhost_zerocopy_callback() maybe be called to decrease the refcount when sendmsg fails in tun. The error handling in vhost handle_tx_zerocopy() will try to decrease the same refcount again. This is wrong. To fix this issue, we only call vhost_net_ubuf_put() when vq->heads[nvq->desc].len == VHOST_DMA_IN_PROGRESS. Fixes: bab632d6 ("vhost: vhost TX zero-copy support") Signed-off-by: Yunjian Wang <wangyunjian@huawei.com> Acked-by: Willem de Bruijn <willemb@google.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Link: https://lore.kernel.org/r/1609207308-20544-1-git-send-email-wangyunjian@huawei.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Taehee Yoo authored
In the bareudp6_xmit_skb(), it calculates min_headroom. At that point, it uses struct iphdr, but it's not correct. So panic could occur. The struct ipv6hdr should be used. Test commands: ip netns add A ip netns add B ip link add veth0 netns A type veth peer name veth1 netns B ip netns exec A ip link set veth0 up ip netns exec A ip a a 2001:db8:0::1/64 dev veth0 ip netns exec B ip link set veth1 up ip netns exec B ip a a 2001:db8:0::2/64 dev veth1 for i in {10..1} do let A=$i-1 ip netns exec A ip link add bareudp$i type bareudp dstport $i \ ethertype 0x86dd ip netns exec A ip link set bareudp$i up ip netns exec A ip -6 a a 2001:db8:$i::1/64 dev bareudp$i ip netns exec A ip -6 r a 2001:db8:$i::2 encap ip6 src \ 2001:db8:$A::1 dst 2001:db8:$A::2 via 2001:db8:$i::2 \ dev bareudp$i ip netns exec B ip link add bareudp$i type bareudp dstport $i \ ethertype 0x86dd ip netns exec B ip link set bareudp$i up ip netns exec B ip -6 a a 2001:db8:$i::2/64 dev bareudp$i ip netns exec B ip -6 r a 2001:db8:$i::1 encap ip6 src \ 2001:db8:$A::2 dst 2001:db8:$A::1 via 2001:db8:$i::1 \ dev bareudp$i done ip netns exec A ping 2001:db8:7::2 Splat looks like: [ 66.436679][ C2] skbuff: skb_under_panic: text:ffffffff928614c8 len:454 put:14 head:ffff88810abb4000 data:ffff88810abb3ffa tail:0x1c0 end:0x3ec0 dev:veth0 [ 66.441626][ C2] ------------[ cut here ]------------ [ 66.443458][ C2] kernel BUG at net/core/skbuff.c:109! [ 66.445313][ C2] invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI [ 66.447606][ C2] CPU: 2 PID: 913 Comm: ping Not tainted 5.10.0+ #819 [ 66.450251][ C2] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014 [ 66.453713][ C2] RIP: 0010:skb_panic+0x15d/0x15f [ 66.455345][ C2] Code: 98 fe 4c 8b 4c 24 10 53 8b 4d 70 45 89 e0 48 c7 c7 60 8b 78 93 41 57 41 56 41 55 48 8b 54 24 20 48 8b 74 24 28 e8 b5 40 f9 ff <0f> 0b 48 8b 6c 24 20 89 34 24 e8 08 c9 98 fe 8b 34 24 48 c7 c1 80 [ 66.462314][ C2] RSP: 0018:ffff888119209648 EFLAGS: 00010286 [ 66.464281][ C2] RAX: 0000000000000089 RBX: ffff888003159000 RCX: 0000000000000000 [ 66.467216][ C2] RDX: 0000000000000089 RSI: 0000000000000008 RDI: ffffed10232412c0 [ 66.469768][ C2] RBP: ffff88810a53d440 R08: ffffed102328018d R09: ffffed102328018d [ 66.472297][ C2] R10: ffff888119400c67 R11: ffffed102328018c R12: 000000000000000e [ 66.474833][ C2] R13: ffff88810abb3ffa R14: 00000000000001c0 R15: 0000000000003ec0 [ 66.477361][ C2] FS: 00007f37c0c72f00(0000) GS:ffff888119200000(0000) knlGS:0000000000000000 [ 66.480214][ C2] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 66.482296][ C2] CR2: 000055a058808570 CR3: 000000011039e002 CR4: 00000000003706e0 [ 66.484811][ C2] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 66.487793][ C2] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 66.490424][ C2] Call Trace: [ 66.491469][ C2] <IRQ> [ 66.492374][ C2] ? eth_header+0x28/0x190 [ 66.494054][ C2] ? eth_header+0x28/0x190 [ 66.495401][ C2] skb_push.cold.99+0x22/0x22 [ 66.496700][ C2] eth_header+0x28/0x190 [ 66.497867][ C2] neigh_resolve_output+0x3de/0x720 [ 66.499615][ C2] ? __neigh_update+0x7e8/0x20a0 [ 66.501176][ C2] __neigh_update+0x8bd/0x20a0 [ 66.502749][ C2] ndisc_update+0x34/0xc0 [ 66.504010][ C2] ndisc_recv_na+0x8da/0xb80 [ 66.505041][ C2] ? pndisc_redo+0x20/0x20 [ 66.505888][ C2] ? rcu_read_lock_sched_held+0xc0/0xc0 [ 66.506965][ C2] ndisc_rcv+0x3a0/0x470 [ 66.507797][ C2] icmpv6_rcv+0xad9/0x1b00 [ 66.508645][ C2] ip6_protocol_deliver_rcu+0xcd6/0x1560 [ 66.509719][ C2] ip6_input_finish+0x5b/0xf0 [ 66.510615][ C2] ip6_input+0xcd/0x2d0 [ 66.511406][ C2] ? ip6_input_finish+0xf0/0xf0 [ 66.512327][ C2] ? rcu_read_lock_held+0x91/0xa0 [ 66.513279][ C2] ? ip6_protocol_deliver_rcu+0x1560/0x1560 [ 66.514414][ C2] ipv6_rcv+0xe8/0x300 [ ... ] Acked-by: Guillaume Nault <gnault@redhat.com> Fixes: 571912c6 ("net: UDP tunnel encapsulation module for tunnelling different protocols like MPLS, IP, NSH etc.") Signed-off-by: Taehee Yoo <ap420073@gmail.com> Link: https://lore.kernel.org/r/20201228152146.24270-1-ap420073@gmail.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Taehee Yoo authored
Like other tunneling interfaces, the bareudp doesn't need TXLOCK. So, It is good to set the NETIF_F_LLTX flag to improve performance and to avoid lockdep's false-positive warning. Test commands: ip netns add A ip netns add B ip link add veth0 netns A type veth peer name veth1 netns B ip netns exec A ip link set veth0 up ip netns exec A ip a a 10.0.0.1/24 dev veth0 ip netns exec B ip link set veth1 up ip netns exec B ip a a 10.0.0.2/24 dev veth1 for i in {2..1} do let A=$i-1 ip netns exec A ip link add bareudp$i type bareudp \ dstport $i ethertype ip ip netns exec A ip link set bareudp$i up ip netns exec A ip a a 10.0.$i.1/24 dev bareudp$i ip netns exec A ip r a 10.0.$i.2 encap ip src 10.0.$A.1 \ dst 10.0.$A.2 via 10.0.$i.2 dev bareudp$i ip netns exec B ip link add bareudp$i type bareudp \ dstport $i ethertype ip ip netns exec B ip link set bareudp$i up ip netns exec B ip a a 10.0.$i.2/24 dev bareudp$i ip netns exec B ip r a 10.0.$i.1 encap ip src 10.0.$A.2 \ dst 10.0.$A.1 via 10.0.$i.1 dev bareudp$i done ip netns exec A ping 10.0.2.2 Splat looks like: [ 96.992803][ T822] ============================================ [ 96.993954][ T822] WARNING: possible recursive locking detected [ 96.995102][ T822] 5.10.0+ #819 Not tainted [ 96.995927][ T822] -------------------------------------------- [ 96.997091][ T822] ping/822 is trying to acquire lock: [ 96.998083][ T822] ffff88810f753898 (_xmit_NONE#2){+.-.}-{2:2}, at: __dev_queue_xmit+0x1f52/0x2960 [ 96.999813][ T822] [ 96.999813][ T822] but task is already holding lock: [ 97.001192][ T822] ffff88810c385498 (_xmit_NONE#2){+.-.}-{2:2}, at: __dev_queue_xmit+0x1f52/0x2960 [ 97.002908][ T822] [ 97.002908][ T822] other info that might help us debug this: [ 97.004401][ T822] Possible unsafe locking scenario: [ 97.004401][ T822] [ 97.005784][ T822] CPU0 [ 97.006407][ T822] ---- [ 97.007010][ T822] lock(_xmit_NONE#2); [ 97.007779][ T822] lock(_xmit_NONE#2); [ 97.008550][ T822] [ 97.008550][ T822] *** DEADLOCK *** [ 97.008550][ T822] [ 97.010057][ T822] May be due to missing lock nesting notation [ 97.010057][ T822] [ 97.011594][ T822] 7 locks held by ping/822: [ 97.012426][ T822] #0: ffff888109a144f0 (sk_lock-AF_INET){+.+.}-{0:0}, at: raw_sendmsg+0x12f7/0x2b00 [ 97.014191][ T822] #1: ffffffffbce2f5a0 (rcu_read_lock_bh){....}-{1:2}, at: ip_finish_output2+0x249/0x2020 [ 97.016045][ T822] #2: ffffffffbce2f5a0 (rcu_read_lock_bh){....}-{1:2}, at: __dev_queue_xmit+0x1fd/0x2960 [ 97.017897][ T822] #3: ffff88810c385498 (_xmit_NONE#2){+.-.}-{2:2}, at: __dev_queue_xmit+0x1f52/0x2960 [ 97.019684][ T822] #4: ffffffffbce2f600 (rcu_read_lock){....}-{1:2}, at: bareudp_xmit+0x31b/0x3690 [bareudp] [ 97.021573][ T822] #5: ffffffffbce2f5a0 (rcu_read_lock_bh){....}-{1:2}, at: ip_finish_output2+0x249/0x2020 [ 97.023424][ T822] #6: ffffffffbce2f5a0 (rcu_read_lock_bh){....}-{1:2}, at: __dev_queue_xmit+0x1fd/0x2960 [ 97.025259][ T822] [ 97.025259][ T822] stack backtrace: [ 97.026349][ T822] CPU: 3 PID: 822 Comm: ping Not tainted 5.10.0+ #819 [ 97.027609][ T822] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014 [ 97.029407][ T822] Call Trace: [ 97.030015][ T822] dump_stack+0x99/0xcb [ 97.030783][ T822] __lock_acquire.cold.77+0x149/0x3a9 [ 97.031773][ T822] ? stack_trace_save+0x81/0xa0 [ 97.032661][ T822] ? register_lock_class+0x1910/0x1910 [ 97.033673][ T822] ? register_lock_class+0x1910/0x1910 [ 97.034679][ T822] ? rcu_read_lock_sched_held+0x91/0xc0 [ 97.035697][ T822] ? rcu_read_lock_bh_held+0xa0/0xa0 [ 97.036690][ T822] lock_acquire+0x1b2/0x730 [ 97.037515][ T822] ? __dev_queue_xmit+0x1f52/0x2960 [ 97.038466][ T822] ? check_flags+0x50/0x50 [ 97.039277][ T822] ? netif_skb_features+0x296/0x9c0 [ 97.040226][ T822] ? validate_xmit_skb+0x29/0xb10 [ 97.041151][ T822] _raw_spin_lock+0x30/0x70 [ 97.041977][ T822] ? __dev_queue_xmit+0x1f52/0x2960 [ 97.042927][ T822] __dev_queue_xmit+0x1f52/0x2960 [ 97.043852][ T822] ? netdev_core_pick_tx+0x290/0x290 [ 97.044824][ T822] ? mark_held_locks+0xb7/0x120 [ 97.045712][ T822] ? lockdep_hardirqs_on_prepare+0x12c/0x3e0 [ 97.046824][ T822] ? __local_bh_enable_ip+0xa5/0xf0 [ 97.047771][ T822] ? ___neigh_create+0x12a8/0x1eb0 [ 97.048710][ T822] ? trace_hardirqs_on+0x41/0x120 [ 97.049626][ T822] ? ___neigh_create+0x12a8/0x1eb0 [ 97.050556][ T822] ? __local_bh_enable_ip+0xa5/0xf0 [ 97.051509][ T822] ? ___neigh_create+0x12a8/0x1eb0 [ 97.052443][ T822] ? check_chain_key+0x244/0x5f0 [ 97.053352][ T822] ? rcu_read_lock_bh_held+0x56/0xa0 [ 97.054317][ T822] ? ip_finish_output2+0x6ea/0x2020 [ 97.055263][ T822] ? pneigh_lookup+0x410/0x410 [ 97.056135][ T822] ip_finish_output2+0x6ea/0x2020 [ ... ] Acked-by: Guillaume Nault <gnault@redhat.com> Fixes: 571912c6 ("net: UDP tunnel encapsulation module for tunnelling different protocols like MPLS, IP, NSH etc.") Signed-off-by: Taehee Yoo <ap420073@gmail.com> Link: https://lore.kernel.org/r/20201228152136.24215-1-ap420073@gmail.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
- 28 Dec, 2020 25 commits
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfDavid S. Miller authored
Daniel Borkmann says: ==================== pull-request: bpf 2020-12-28 The following pull-request contains BPF updates for your *net* tree. There is a small merge conflict between bpf tree commit 69ca310f ("bpf: Save correct stopping point in file seq iteration") and net tree commit 66ed5944 ("bpf/task_iter: In task_file_seq_get_next use task_lookup_next_fd_rcu"). The get_files_struct() does not exist anymore in net, so take the hunk in HEAD and add the `info->tid = curr_tid` to the error path: [...] curr_task = task_seq_get_next(ns, &curr_tid, true); if (!curr_task) { info->task = NULL; info->tid = curr_tid; return NULL; } /* set info->task and info->tid */ [...] We've added 10 non-merge commits during the last 9 day(s) which contain a total of 11 files changed, 75 insertions(+), 20 deletions(-). The main changes are: 1) Various AF_XDP fixes such as fill/completion ring leak on failed bind and fixing a race in skb mode's backpressure mechanism, from Magnus Karlsson. 2) Fix latency spikes on lockdep enabled kernels by adding a rescheduling point to BPF hashtab initialization, from Eric Dumazet. 3) Fix a splat in task iterator by saving the correct stopping point in the seq file iteration, from Jonathan Lemon. 4) Fix BPF maps selftest by adding retries in case hashtab returns EBUSY errors on update/deletes, from Andrii Nakryiko. 5) Fix BPF selftest error reporting to something more user friendly if the vmlinux BTF cannot be found, from Kamal Mostafa. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Xie He authored
ppp_cp_event is called directly or indirectly by ppp_rx with "ppp->lock" held. It may call mod_timer to add a new timer. However, at the same time ppp_timer may be already running and waiting for "ppp->lock". In this case, there's no need for ppp_timer to continue running and it can just exit. If we let ppp_timer continue running, it may call add_timer. This causes kernel panic because add_timer can't be called with a timer pending. This patch fixes this problem. Fixes: e022c2f0 ("WAN: new synchronous PPP implementation for generic HDLC.") Cc: Krzysztof Halasa <khc@pm.waw.pl> Signed-off-by: Xie He <xie.he.0141@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Leo Le Bouter authored
This was tested on a RaptorCS Talos II with IBM POWER9 DD2.2 CPUs and an ASUS XG-C100F PCI-e card without any issue. Speeds of ~8Gbps could be attained with not-very-scientific (wget HTTP) both-ways measurements on a local network. No warning or error reported in kernel logs. The drivers seems to be portable enough for it not to be gated like such. Signed-off-by: Léo Le Bouter <lle-bout@zaclys.net> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Cong Wang authored
Both version 0 and version 1 use ETH_P_ERSPAN, but version 0 does not have an erspan header. So the check in gre_parse_header() is wrong, we have to distinguish version 1 from version 0. We can just check the gre header length like is_erspan_type1(). Fixes: cb73ee40 ("net: ip_gre: use erspan key field for tunnel lookup") Reported-by: syzbot+f583ce3d4ddf9836b27a@syzkaller.appspotmail.com Cc: William Tu <u9012063@gmail.com> Cc: Lorenzo Bianconi <lorenzo.bianconi@redhat.com> Signed-off-by: Cong Wang <cong.wang@bytedance.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Yunjian Wang authored
The function skb_copy() could return NULL, the return value need to be checked. Fixes: b5996f11 ("net: add Hisilicon Network Subsystem basic ethernet support") Signed-off-by: Yunjian Wang <wangyunjian@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Randy Dunlap authored
Check Scell_log shift size in red_check_params() and modify all callers of red_check_params() to pass Scell_log. This prevents a shift out-of-bounds as detected by UBSAN: UBSAN: shift-out-of-bounds in ./include/net/red.h:252:22 shift exponent 72 is too large for 32-bit type 'int' Fixes: 8afa10cb ("net_sched: red: Avoid illegal values") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Reported-by: syzbot+97c5bd9cc81eca63d36e@syzkaller.appspotmail.com Cc: Nogah Frankel <nogahf@mellanox.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: Cong Wang <xiyou.wangcong@gmail.com> Cc: Jiri Pirko <jiri@resnulli.us> Cc: netdev@vger.kernel.org Cc: "David S. Miller" <davem@davemloft.net> Cc: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
weichenchen authored
pneigh_enqueue() tries to obtain a random delay by mod NEIGH_VAR(p, PROXY_DELAY). However, NEIGH_VAR(p, PROXY_DELAY) migth be zero at that point because someone could write zero to /proc/sys/net/ipv4/neigh/[device]/proxy_delay after the callers check it. This patch uses prandom_u32_max() to get a random delay instead which avoids potential division by zero. Signed-off-by: weichenchen <weichen.chen@linux.alibaba.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Guillaume Nault authored
RT_TOS() only clears one of the ECN bits. Therefore, when fib_compute_spec_dst() resorts to a fib lookup, it can return different results depending on the value of the second ECN bit. For example, ECT(0) and ECT(1) packets could be treated differently. $ ip netns add ns0 $ ip netns add ns1 $ ip link add name veth01 netns ns0 type veth peer name veth10 netns ns1 $ ip -netns ns0 link set dev lo up $ ip -netns ns1 link set dev lo up $ ip -netns ns0 link set dev veth01 up $ ip -netns ns1 link set dev veth10 up $ ip -netns ns0 address add 192.0.2.10/24 dev veth01 $ ip -netns ns1 address add 192.0.2.11/24 dev veth10 $ ip -netns ns1 address add 192.0.2.21/32 dev lo $ ip -netns ns1 route add 192.0.2.10/32 tos 4 dev veth10 src 192.0.2.21 $ ip netns exec ns1 sysctl -wq net.ipv4.icmp_echo_ignore_broadcasts=0 With TOS 4 and ECT(1), ns1 replies using source address 192.0.2.21 (ping uses -Q to set all TOS and ECN bits): $ ip netns exec ns0 ping -c 1 -b -Q 5 192.0.2.255 [...] 64 bytes from 192.0.2.21: icmp_seq=1 ttl=64 time=0.544 ms But with TOS 4 and ECT(0), ns1 replies using source address 192.0.2.11 because the "tos 4" route isn't matched: $ ip netns exec ns0 ping -c 1 -b -Q 6 192.0.2.255 [...] 64 bytes from 192.0.2.11: icmp_seq=1 ttl=64 time=0.597 ms After this patch the ECN bits don't affect the result anymore: $ ip netns exec ns0 ping -c 1 -b -Q 6 192.0.2.255 [...] 64 bytes from 192.0.2.21: icmp_seq=1 ttl=64 time=0.591 ms Fixes: 35ebf65e ("ipv4: Create and use fib_compute_spec_dst() helper.") Signed-off-by: Guillaume Nault <gnault@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Stefan Chulski authored
The packet coalescing interrupt threshold has separated registers for different aggregated/cpu (sw-thread). The required value should be loaded for every thread but not only for 1 current cpu. Fixes: 213f428f ("net: mvpp2: add support for TX interrupts and RX queue distribution modes") Signed-off-by: Stefan Chulski <stefanc@marvell.com> Link: https://lore.kernel.org/r/1608748521-11033-1-git-send-email-stefanc@marvell.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Jakub Kicinski authored
Alex Elder says: ==================== net: ipa: fix some new build warnings I got a super friendly message from the Intel kernel test robot that pointed out that two patches I posted last week caused new build warnings. I already had these problems fixed in my own tree but the fix was not included in what I sent out last week. ==================== Link: https://lore.kernel.org/r/20201226213737.338928-1-elder@linaro.orgSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Alex Elder authored
Callers of evt_ring_command() no longer care whether the command times out, and don't use what evt_ring_command() returns. Redefine that function to have void return type. Reported-by: kernel test robot <lkp@intel.com> Fixes: 428b448e ("net: ipa: use state to determine event ring command success") Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Alex Elder authored
Callers of gsi_channel_command() no longer care whether the command times out, and don't use what gsi_channel_command() returns. Redefine that function to have void return type. Reported-by: kernel test robot <lkp@intel.com> Fixes: 6ffddf3b ("net: ipa: use state to determine channel command success") Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Jakub Kicinski authored
Michael Chan says: ==================== bnxt_en: Bug fixes. The first patch fixes recovery of fatal AER errors. The second one fixes a potential array out of bounds issue. ==================== Link: https://lore.kernel.org/r/1609096698-15009-1-git-send-email-michael.chan@broadcom.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Michael Chan authored
TQM rings are hardware resources that require host context memory managed by the driver. The driver supports up to 9 TQM rings and the number of rings to use is requested by firmware during run-time. Cap this number to the maximum supported to prevent accessing beyond the array. Future firmware may request more than 9 TQM rings. Define macros to remove the magic number 9 from the C code. Fixes: ac3158cb ("bnxt_en: Allocate TQM ring context memory according to fw specification.") Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Vasundhara Volam authored
A recent change skips sending firmware messages to the firmware when pci_channel_offline() is true during fatal AER error. To make this complete, we need to move the re-initialization sequence to bnxt_io_resume(), otherwise the firmware messages to re-initialize will all be skipped. In any case, it is more correct to re-initialize in bnxt_io_resume(). Also, fix the reverse x-mas tree format when defining variables in bnxt_io_slot_reset(). Fixes: b340dc68 ("bnxt_en: Avoid sending firmware messages when AER error is detected.") Reviewed-by: Edwin Peer <edwin.peer@broadcom.com> Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queueJakub Kicinski authored
Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2020-12-23 Commit e086ba2f ("e1000e: disable s0ix entry and exit flows for ME systems") disabled S0ix flows for systems that have various incarnations of the i219-LM ethernet controller. This was done because of some regressions caused by an earlier commit 632fbd5e ("e1000e: fix S0ix flows for cable connected case") with i219-LM controller. Per discussion with Intel architecture team this direction should be changed and allow S0ix flows to be used by default. This patch series includes directional changes for their conclusions in https://lkml.org/lkml/2020/12/13/15. * '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue: e1000e: Export S0ix flags to ethtool Revert "e1000e: disable s0ix entry and exit flows for ME systems" e1000e: bump up timeout to wait when ME un-configures ULP mode e1000e: Only run S0ix flows if shutdown succeeded ==================== Link: https://lore.kernel.org/r/20201223233625.92519-1-anthony.l.nguyen@intel.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Davide Caratti authored
the following syzkaller reproducer: r0 = socket$inet_mptcp(0x2, 0x1, 0x106) bind$inet(r0, &(0x7f0000000080)={0x2, 0x4e24, @multicast2}, 0x10) connect$inet(r0, &(0x7f0000000480)={0x2, 0x4e24, @local}, 0x10) sendto$inet(r0, &(0x7f0000000100)="f6", 0xffffffe7, 0xc000, 0x0, 0x0) systematically triggers the following warning: WARNING: CPU: 2 PID: 8618 at net/core/stream.c:208 sk_stream_kill_queues+0x3fa/0x580 Modules linked in: CPU: 2 PID: 8618 Comm: syz-executor Not tainted 5.10.0+ #334 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.1-4.module+el8.1.0+4066+0f1aadab 04/04 RIP: 0010:sk_stream_kill_queues+0x3fa/0x580 Code: df 48 c1 ea 03 0f b6 04 02 84 c0 74 04 3c 03 7e 40 8b ab 20 02 00 00 e9 64 ff ff ff e8 df f0 81 2 RSP: 0018:ffffc9000290fcb0 EFLAGS: 00010293 RAX: ffff888011cb8000 RBX: 0000000000000000 RCX: ffffffff86eecf0e RDX: 0000000000000000 RSI: ffffffff86eecf6a RDI: 0000000000000005 RBP: 0000000000000e28 R08: ffff888011cb8000 R09: fffffbfff1f48139 R10: ffffffff8fa409c7 R11: fffffbfff1f48138 R12: ffff8880215e6220 R13: ffffffff8fa409c0 R14: ffffc9000290fd30 R15: 1ffff92000521fa2 FS: 00007f41c78f4800(0000) GS:ffff88802d000000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f95c803d088 CR3: 0000000025ed2000 CR4: 00000000000006f0 Call Trace: __mptcp_destroy_sock+0x4f5/0x8e0 mptcp_close+0x5e2/0x7f0 inet_release+0x12b/0x270 __sock_release+0xc8/0x270 sock_close+0x18/0x20 __fput+0x272/0x8e0 task_work_run+0xe0/0x1a0 exit_to_user_mode_prepare+0x1df/0x200 syscall_exit_to_user_mode+0x19/0x50 entry_SYSCALL_64_after_hwframe+0x44/0xa9 userspace programs provide arbitrarily high values of 'len' in sendmsg(): this is causing integer overflow of 'amount'. Cap forward allocation to 1 megabyte: higher values are not really useful. Suggested-by: Paolo Abeni <pabeni@redhat.com> Fixes: e93da928 ("mptcp: implement wmem reservation") Signed-off-by: Davide Caratti <dcaratti@redhat.com> Link: https://lore.kernel.org/r/3334d00d8b2faecafdfab9aa593efcbf61442756.1608584474.git.dcaratti@redhat.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Yunjian Wang authored
Currently the tun_napi_alloc_frags() function returns -ENOMEM when the number of iovs exceeds MAX_SKB_FRAGS + 1. However this is inappropriate, we should use -EMSGSIZE instead of -ENOMEM. The following distinctions are matters: 1. the caller need to drop the bad packet when -EMSGSIZE is returned, which means meeting a persistent failure. 2. the caller can try again when -ENOMEM is returned, which means meeting a transient failure. Fixes: 90e33d45 ("tun: enable napi_gro_frags() for TUN/TAP driver") Signed-off-by: Yunjian Wang <wangyunjian@huawei.com> Acked-by: Willem de Bruijn <willemb@google.com> Acked-by: Jason Wang <jasowang@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Link: https://lore.kernel.org/r/1608864736-24332-1-git-send-email-wangyunjian@huawei.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Grygorii Strashko authored
The CPTS driver registers PTP PHC clock when first netif is going up and unregister it when all netif are down. Now ethtool will show: - PTP PHC clock index 0 after boot until first netif is up; - the last assigned PTP PHC clock index even if PTP PHC clock is not registered any more after all netifs are down. This patch ensures that -1 is returned by ethtool when PTP PHC clock is not registered any more. Fixes: 8a2c9a5a ("net: ethernet: ti: cpts: rework initialization/deinitialization") Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Acked-by: Richard Cochran <richardcochran@gmail.com> Link: https://lore.kernel.org/r/20201224162405.28032-1-grygorii.strashko@ti.comSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Jakub Kicinski authored
Antoine Tenart says: ==================== net-sysfs: fix race conditions in the xps code This series fixes race conditions in the xps code, where out of bound accesses can occur when dev->num_tc is updated, triggering oops. The root cause is linked to locking issues. An explanation is given in each of the commit logs. We had a discussion on the v1 of this series about using the xps_map mutex instead of the rtnl lock. While that seemed a better compromise, v2 showed the added complexity wasn't best for fixes. So we decided to go back to v1 and use the rtnl lock. Because of this, the only differences between v1 and v3 are improvements in the commit messages. ==================== Link: https://lore.kernel.org/r/20201223212323.3603139-1-atenart@kernel.orgSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-
Antoine Tenart authored
Accesses to dev->xps_rxqs_map (when using dev->num_tc) should be protected by the rtnl lock, like we do for netif_set_xps_queue. I didn't see an actual bug being triggered, but let's be safe here and take the rtnl lock while accessing the map in sysfs. Fixes: 8af2c06f ("net-sysfs: Add interface for Rx queue(s) map per Tx queue") Signed-off-by: Antoine Tenart <atenart@kernel.org> Reviewed-by: Alexander Duyck <alexanderduyck@fb.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Antoine Tenart authored
Two race conditions can be triggered when storing xps rxqs, resulting in various oops and invalid memory accesses: 1. Calling netdev_set_num_tc while netif_set_xps_queue: - netif_set_xps_queue uses dev->tc_num as one of the parameters to compute the size of new_dev_maps when allocating it. dev->tc_num is also used to access the map, and the compiler may generate code to retrieve this field multiple times in the function. - netdev_set_num_tc sets dev->tc_num. If new_dev_maps is allocated using dev->tc_num and then dev->tc_num is set to a higher value through netdev_set_num_tc, later accesses to new_dev_maps in netif_set_xps_queue could lead to accessing memory outside of new_dev_maps; triggering an oops. 2. Calling netif_set_xps_queue while netdev_set_num_tc is running: 2.1. netdev_set_num_tc starts by resetting the xps queues, dev->tc_num isn't updated yet. 2.2. netif_set_xps_queue is called, setting up the map with the *old* dev->num_tc. 2.3. netdev_set_num_tc updates dev->tc_num. 2.4. Later accesses to the map lead to out of bound accesses and oops. A similar issue can be found with netdev_reset_tc. One way of triggering this is to set an iface up (for which the driver uses netdev_set_num_tc in the open path, such as bnx2x) and writing to xps_rxqs in a concurrent thread. With the right timing an oops is triggered. Both issues have the same fix: netif_set_xps_queue, netdev_set_num_tc and netdev_reset_tc should be mutually exclusive. We do that by taking the rtnl lock in xps_rxqs_store. Fixes: 8af2c06f ("net-sysfs: Add interface for Rx queue(s) map per Tx queue") Signed-off-by: Antoine Tenart <atenart@kernel.org> Reviewed-by: Alexander Duyck <alexanderduyck@fb.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Antoine Tenart authored
Accesses to dev->xps_cpus_map (when using dev->num_tc) should be protected by the rtnl lock, like we do for netif_set_xps_queue. I didn't see an actual bug being triggered, but let's be safe here and take the rtnl lock while accessing the map in sysfs. Fixes: 184c449f ("net: Add support for XPS with QoS via traffic classes") Signed-off-by: Antoine Tenart <atenart@kernel.org> Reviewed-by: Alexander Duyck <alexanderduyck@fb.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Antoine Tenart authored
Two race conditions can be triggered when storing xps cpus, resulting in various oops and invalid memory accesses: 1. Calling netdev_set_num_tc while netif_set_xps_queue: - netif_set_xps_queue uses dev->tc_num as one of the parameters to compute the size of new_dev_maps when allocating it. dev->tc_num is also used to access the map, and the compiler may generate code to retrieve this field multiple times in the function. - netdev_set_num_tc sets dev->tc_num. If new_dev_maps is allocated using dev->tc_num and then dev->tc_num is set to a higher value through netdev_set_num_tc, later accesses to new_dev_maps in netif_set_xps_queue could lead to accessing memory outside of new_dev_maps; triggering an oops. 2. Calling netif_set_xps_queue while netdev_set_num_tc is running: 2.1. netdev_set_num_tc starts by resetting the xps queues, dev->tc_num isn't updated yet. 2.2. netif_set_xps_queue is called, setting up the map with the *old* dev->num_tc. 2.3. netdev_set_num_tc updates dev->tc_num. 2.4. Later accesses to the map lead to out of bound accesses and oops. A similar issue can be found with netdev_reset_tc. One way of triggering this is to set an iface up (for which the driver uses netdev_set_num_tc in the open path, such as bnx2x) and writing to xps_cpus in a concurrent thread. With the right timing an oops is triggered. Both issues have the same fix: netif_set_xps_queue, netdev_set_num_tc and netdev_reset_tc should be mutually exclusive. We do that by taking the rtnl lock in xps_cpus_store. Fixes: 184c449f ("net: Add support for XPS with QoS via traffic classes") Signed-off-by: Antoine Tenart <atenart@kernel.org> Reviewed-by: Alexander Duyck <alexanderduyck@fb.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Roland Dreier authored
The cdc_ncm driver passes network connection notifications up to usbnet_link_change(), which is the right place for any logging. Remove the netdev_info() duplicating this from the driver itself. This stops devices such as my "TRENDnet USB 10/100/1G/2.5G LAN" (ID 20f4:e02b) adapter from spamming the kernel log with cdc_ncm 2-2:2.0 enp0s2u2c2: network connection: connected messages every 60 msec or so. Signed-off-by: Roland Dreier <roland@kernel.org> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Link: https://lore.kernel.org/r/20201224032116.2453938-1-roland@kernel.orgSigned-off-by: Jakub Kicinski <kuba@kernel.org>
-