- 24 Apr, 2016 3 commits
-
-
Sven Eckelmann authored
The shutdown of a batman-adv interface can happen with one of its slave interfaces still being in the BATADV_IF_TO_BE_ACTIVATED state. A possible reason for it is that the routing algorithm BATMAN_V was selected and batadv_schedule_bat_ogm was not yet called for this interface. This slave interface still has to be set to BATADV_IF_INACTIVE or the batman-adv interface will never reduce its usage counter and thus never gets shut down. This problem can be simulated via:
  $ modprobe dummy
  $ modprobe batman-adv routing_algo=BATMAN_V
  $ ip link add bat0 type batadv
  $ ip link set dummy0 master bat0
  $ ip link set dummy0 up
  $ ip link del bat0
  unregister_netdevice: waiting for bat0 to become free. Usage count = 3
Reported-by: Matthias Schiffer <mschiffer@universe-factory.net> Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch> Signed-off-by: Antonio Quartulli <a@unstable.cc>
-
Marek Lindner authored
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch> [sven@narfation.org: fix conflicts with current version] Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Antonio Quartulli <a@unstable.cc>
-
Sven Eckelmann authored
The encapsulated ethernet and VLAN header may be outside the received ethernet frame. Thus the skb buffer size has to be checked before it can be parsed to find out if it encapsulates another batman-adv packet. Fixes: 42019357 ("batman-adv: softif bridge loop avoidance") Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch> Signed-off-by: Antonio Quartulli <a@unstable.cc>
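The size check described above can be illustrated with a minimal sketch. It is not the batman-adv code: the function name `encapsulates_batman` is hypothetical, and the frame is modelled as a plain `bytes` object rather than an skb. Only the idea is taken from the commit message: verify that the buffer is long enough before reading the Ethernet header, and again before reading an optional VLAN header.

```python
import struct

ETH_HLEN = 14            # destination MAC (6) + source MAC (6) + EtherType (2)
VLAN_HLEN = 4            # TCI (2) + encapsulated EtherType (2)
ETH_P_8021Q = 0x8100     # VLAN tag protocol identifier
ETH_P_BATMAN = 0x4305    # EtherType used by batman-adv

def encapsulates_batman(frame: bytes) -> bool:
    """Return True if the frame carries a batman-adv packet.

    Hypothetical helper: every header is length-checked before it is parsed,
    so a truncated frame is rejected instead of being read out of bounds.
    """
    if len(frame) < ETH_HLEN:
        return False
    (proto,) = struct.unpack_from("!H", frame, 12)
    if proto == ETH_P_8021Q:
        # The VLAN header may lie outside the received frame: check first.
        if len(frame) < ETH_HLEN + VLAN_HLEN:
            return False
        (proto,) = struct.unpack_from("!H", frame, 16)
    return proto == ETH_P_BATMAN

macs = b"\x00" * 12
print(encapsulates_batman(macs + b"\x43\x05"))                  # True
print(encapsulates_batman(macs + b"\x81\x00\x00\x01"))          # False (truncated VLAN header)
print(encapsulates_batman(macs + b"\x81\x00\x00\x01\x43\x05"))  # True
```

Without the second length check, the truncated VLAN frame would be parsed past the end of the buffer.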
-
- 21 Apr, 2016 30 commits
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux
Linus Torvalds authored
Pull RTC fixes from Alexandre Belloni: "A few fixes for the RTC subsystem. The documentation fix already missed 4.5 so I think it is worth taking it now: A documentation fix for s3c and two fixes for the ds1307" * tag 'rtc-4.6-3' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux: rtc: ds1307: Use irq when available for wakeup-source device rtc: ds1307: ds3231 temperature s16 overflow rtc: s3c: Document in binding that only s3c6410 needs a src clk
-
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Linus Torvalds authored
Pull power management fixes from Rafael Wysocki: "Two fixes for issues introduced recently, one for an intel_pstate driver problem uncovered by the recent switch over from using timers and the other one for a potential cpufreq core problem related to system suspend/resume. Specifics: - Fix an intel_pstate driver problem causing CPUs to get stuck in the highest P-state when completely idle uncovered by the recent switch over from using timers (Rafael Wysocki). - Avoid attempts to get the current CPU frequency when all devices (like I2C controllers that may be needed for that purpose) have been suspended during system suspend/resume (Rafael Wysocki)" * tag 'pm+acpi-4.6-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: cpufreq: Abort cpufreq_update_current_freq() for cpufreq_suspended set intel_pstate: Avoid getting stuck in high P-states when idle
-
Nishanth Menon authored
With commit 8bc2a407 ("rtc: ds1307: add support for the DT property 'wakeup-source'") we lost the ability for rtc irq functionality for devices that are actually hooked on a real IRQ line and have capability to wakeup as well. This is not an expected behavior. So, instead of just not requesting IRQ, skip the IRQ requirement only if interrupts are not defined for the device. Fixes: 8bc2a407 ("rtc: ds1307: add support for the DT property 'wakeup-source'") Reported-by: Tony Lindgren <tony@atomide.com> Cc: Michael Lange <linuxstuff@milaw.biz> Cc: Alexandre Belloni <alexandre.belloni@free-electrons.com> Signed-off-by: Nishanth Menon <nm@ti.com> Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
-
Zhuang Yuyao authored
While retrieving the temperature from the ds3231, the result may overflow since s16 is too small for a multiplication with 250, i.e. if temp_buf[0] == 0x2d, the result (s16 temp) will be negative. Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Tested-by: Michael Tatarinov <kukabu@gmail.com> Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
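The overflow can be reproduced with a short sketch that emulates fixed-width integers through `ctypes`. The register handling (combine two bytes, shift right by 6 to get quarter-degree units) is an assumption about the driver, not something the commit message spells out; only the multiplication by 250 and the `temp_buf[0] == 0x2d` example come from the message. The function names are hypothetical.

```python
import ctypes

def temp_millideg_s16(msb, lsb):
    """Keep every intermediate result in a signed 16-bit variable (the bug)."""
    raw = ctypes.c_int16((msb << 8) | lsb).value   # assumed register layout
    quarter_deg = raw >> 6                         # assumed 0.25 degC units
    return ctypes.c_int16(quarter_deg * 250).value # wraps around at 32767

def temp_millideg_s32(msb, lsb):
    """Same computation with a 32-bit result (no overflow in this range)."""
    raw = ctypes.c_int16((msb << 8) | lsb).value
    return ctypes.c_int32((raw >> 6) * 250).value

print(temp_millideg_s16(0x2d, 0x00))  # -20536: 45 degC reported as negative
print(temp_millideg_s32(0x2d, 0x00))  # 45000
```

With `temp_buf[0] == 0x2d` the intermediate value is 180, and 180 * 250 = 45000 exceeds the s16 maximum of 32767, which is why the 16-bit result wraps to a negative number.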
-
git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Linus Torvalds authored
Pull networking fixes from David Miller:
 1) Fix memory leak in iwlwifi, from Matti Gottlieb.
 2) Add missing registration of netfilter arp_tables into initial namespace, from Florian Westphal.
 3) Fix potential NULL deref in DecNET routing code.
 4) Restrict NETLINK_URELEASE to truly bound sockets only, from Dmitry Ivanov.
 5) Fix dst ref counting in VRF, from David Ahern.
 6) Fix TSO segmenting limits in i40e driver, from Alexander Duyck.
 7) Fix heap leak in PACKET_DIAG_MCLIST, from Mathias Krause.
 8) Revalidate IPV6 datagram socket cached routes properly, particularly with UDP, from Martin KaFai Lau.
 9) Fix endian bug in RDS dp_ack_seq handling, from Qing Huang.
10) Fix stats typing in bcmgenet driver, from Eric Dumazet.
11) Openvswitch needs to orphan SKBs before ipv6 fragmentation handling, from Joe Stringer.
12) SPI device reference leak in spi_ks8895 PHY driver, from Mark Brown.
13) atl2 doesn't actually support scatter-gather, so don't advertise the feature. From Ben Hutchings.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (72 commits)
  openvswitch: use flow protocol when recalculating ipv6 checksums
  Driver: Vmxnet3: set CHECKSUM_UNNECESSARY for IPv6 packets
  atl2: Disable unimplemented scatter/gather feature
  net/mlx4_en: Split SW RX dropped counter per RX ring
  net/mlx4_core: Don't allow to VF change global pause settings
  net/mlx4_core: Avoid repeated calls to pci enable/disable
  net/mlx4_core: Implement pci_resume callback
  net: phy: spi_ks8895: Don't leak references to SPI devices
  net: ethernet: davinci_emac: Fix platform_data overwrite
  net: ethernet: davinci_emac: Fix Unbalanced pm_runtime_enable
  qede: Fix single MTU sized packet from firmware GRO flow
  qede: Fix setting Skb network header
  qede: Fix various memory allocation error flows for fastpath
  tcp: Merge tx_flags and tskey in tcp_shifted_skb
  tcp: Merge tx_flags and tskey in tcp_collapse_retrans
  drivers: net: cpsw: fix wrong regs access in cpsw_ndo_open
  tcp: Fix SOF_TIMESTAMPING_TX_ACK when handling dup acks
  openvswitch: Orphan skbs before IPv6 defrag
  Revert "Prevent NUll pointer dereference with two PHYs on cpsw"
  VSOCK: Only check error on skb_recv_datagram when skb is NULL
  ...
-
Simon Horman authored
When using masked actions the ipv6_proto field of an action to set IPv6 fields may be zero rather than the prevailing protocol which will result in skipping checksum recalculation. This patch resolves the problem by relying on the protocol in the flow key rather than that in the set field action. Fixes: 83d2b9ba ("net: openvswitch: Support masked set actions.") Cc: Jarno Rajahalme <jrajahalme@nicira.com> Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Shrikrishna Khare authored
For IPv6, if the device indicates that the checksum is correct, set CHECKSUM_UNNECESSARY. Reported-by: Subbarao Narahari <snarahari@vmware.com> Signed-off-by: Shrikrishna Khare <skhare@vmware.com> Signed-off-by: Jin Heo <heoj@vmware.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Ben Hutchings authored
atl2 includes NETIF_F_SG in hw_features even though it has no support for non-linear skbs. This bug was originally harmless since the driver does not claim to implement checksum offload and that used to be a requirement for SG. Now that SG and checksum offload are independent features, if you explicitly enable SG *and* use one of the rare protocols that can use SG without checksum offload, this potentially leaks sensitive information (before you notice that it just isn't working). Therefore this obscure bug has been designated CVE-2016-2117. Reported-by: Justin Yackoski <jyackoski@crypto-nite.com> Signed-off-by: Ben Hutchings <ben@decadent.org.uk> Fixes: ec5f0615 ("net: Kill link between CSUM and SG features.") Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Or Gerlitz says: ==================== Mellanox 40G driver fixes for 4.6-rc With the fix for the ARM bug being in the works, these are a few other fixes for mlx4 we have ready to go. Eran addressed the problematic/wrong reporting of dropped packets, Daniel fixed some matters related to PPC EEH's and Jenny's patch makes sure VFs can't change the port's pause settings. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eran Ben Elisha authored
Count SW packet drops per RX ring instead of a global counter. This will allow monitoring the number of rx drops per ring. In addition, SW rx_dropped counter was overwritten by HW rx_dropped counter, sum both of them instead to show the accurate value. Fixes: a3333b35 ('net/mlx4_en: Moderate ethtool callback to [...] ') Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Reported-by: Brenden Blanco <bblanco@plumgrid.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Reported-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eugenia Emantayev authored
Currently changing global pause settings is done via SET_PORT command with input modifier GENERAL. This command is allowed for each VF since MTU setting is done via the same command. Change the above to the following scheme: before passing the request to the FW, the PF will check whether it was issued by a slave. If yes, don't change global pause and warn, otherwise change to the requested value and store for further reference. Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Daniel Jurgens authored
Maintain the PCI status and provide wrappers for enabling and disabling the PCI device. Performing the actions more than once without doing its opposite results in warning logs. This occurred when EEH hotplugged the device causing a warning for disabling an already disabled device. Fixes: 2ba5fbd6 ('net/mlx4_core: Handle AER flow properly') Signed-off-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
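The wrapper idea above can be sketched as follows. This is an illustration only: the class `PciDevice` and its fields are hypothetical and do not mirror the mlx4 data structures, and the real driver would presumably also need locking, which is omitted here. The sketch only shows how remembering the current state turns a repeated enable or disable into a no-op.

```python
class PciDevice:
    """Hypothetical device wrapper that remembers whether it is enabled."""

    def __init__(self):
        self.enabled = False
        self.hw_calls = []   # records the calls that actually reach the device

    def enable(self):
        # Only touch the device if it is not already enabled.
        if not self.enabled:
            self.hw_calls.append("enable")
            self.enabled = True

    def disable(self):
        # A second disable without an enable in between is silently skipped.
        if self.enabled:
            self.hw_calls.append("disable")
            self.enabled = False

dev = PciDevice()
dev.enable()
dev.disable()
dev.disable()        # e.g. a hotplug path disabling an already disabled device
print(dev.hw_calls)  # ['enable', 'disable']
```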
-
Daniel Jurgens authored
Move resume related activities to a new pci_resume function instead of performing them in mlx4_pci_slot_reset. This change is needed to avoid a hotplug during EEH recovery due to commit f2da4ccf ("powerpc/eeh: More relaxed hotplug criterion"). Fixes: 2ba5fbd6 ('net/mlx4_core: Handle AER flow properly') Signed-off-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Mark Brown authored
The ks8895 driver is using spi_dev_get() apparently just to take a copy of the SPI device used to instantiate it but never calls spi_dev_put() to free it. Since the device is guaranteed to exist between probe() and remove() there should be no need for the driver to take an extra reference to it so fix the leak by just using a straight assignment. Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Neil Armstrong authored
When the DaVinci emac driver is removed and re-probed, the actual pdev->dev.platform_data is populated with an unwanted valid pointer saved by the previous davinci_emac_of_get_pdata() call, causing a kernel crash when calling priv->int_disable() in emac_int_disable().
  Unable to handle kernel paging request at virtual address c8622a80
  ...
  [<c0426fb4>] (emac_int_disable) from [<c0427700>] (emac_dev_open+0x290/0x5f8)
  [<c0427700>] (emac_dev_open) from [<c04c00ec>] (__dev_open+0xb8/0x120)
  [<c04c00ec>] (__dev_open) from [<c04c0370>] (__dev_change_flags+0x88/0x14c)
  [<c04c0370>] (__dev_change_flags) from [<c04c044c>] (dev_change_flags+0x18/0x48)
  [<c04c044c>] (dev_change_flags) from [<c052bafc>] (devinet_ioctl+0x6b4/0x7ac)
  [<c052bafc>] (devinet_ioctl) from [<c04a1428>] (sock_ioctl+0x1d8/0x2c0)
  [<c04a1428>] (sock_ioctl) from [<c014f054>] (do_vfs_ioctl+0x41c/0x600)
  [<c014f054>] (do_vfs_ioctl) from [<c014f2a4>] (SyS_ioctl+0x6c/0x7c)
  [<c014f2a4>] (SyS_ioctl) from [<c000ff60>] (ret_fast_syscall+0x0/0x1c)
Fixes: 42f59967 ("net: ethernet: davinci_emac: add OF support") Cc: Brian Hutchinson <b.hutchman@gmail.com> Signed-off-by: Neil Armstrong <narmstrong@baylibre.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Neil Armstrong authored
In order to avoid an Unbalanced pm_runtime_enable in the DaVinci emac driver when the device is removed and re-probed, add a pm_runtime_disable() call in davinci_emac_remove(). Actually, using unbind/bind on a TI DM8168 SoC gives:
  $ echo 4a120000.ethernet > /sys/bus/platform/drivers/davinci_emac/unbind
  net eth1: DaVinci EMAC: davinci_emac_remove()
  $ echo 4a120000.ethernet > /sys/bus/platform/drivers/davinci_emac/bind
  davinci_emac 4a120000.ethernet: Unbalanced pm_runtime_enable
Cc: Brian Hutchinson <b.hutchman@gmail.com> Fixes: 3ba97381 ("net: ethernet: davinci_emac: add pm_runtime support") Signed-off-by: Neil Armstrong <narmstrong@baylibre.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Rafael J. Wysocki authored
* pm-cpufreq-fixes: cpufreq: Abort cpufreq_update_current_freq() for cpufreq_suspended set intel_pstate: Avoid getting stuck in high P-states when idle
-
David S. Miller authored
Manish Chopra says: ==================== qede: Bug fixes This series fixes - * various memory allocation failure flows for fastpath * issues with respect to driver GRO packets handling V1->V2 * Send series against net instead of net-next. Please consider applying this series to "net" ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Manish Chopra authored
In the firmware assisted GRO flow there could be a single MTU sized segment arriving due to firmware aggregation timeout/last segment in an aggregation flow, which is not expected to be an actual GRO packet. So if an skb has zero frags from the GRO flow then simply push it into the stack as a non-GSO skb. Signed-off-by: Manish Chopra <manish.chopra@qlogic.com> Signed-off-by: Yuval Mintz <yuval.mintz@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Manish Chopra authored
Skb's network header needs to be set before extracting IPv4/IPv6 headers from it. Signed-off-by: Manish Chopra <manish.chopra@qlogic.com> Signed-off-by: Yuval Mintz <yuval.mintz@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Manish Chopra authored
This patch handles memory allocation failures for fastpath gracefully in the driver. Signed-off-by: Manish Chopra <manish.chopra@qlogic.com> Signed-off-by: Yuval Mintz <yuval.mintz@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Martin KaFai Lau says: ==================== tcp: Merge timestamp info when coalescing skbs This series is separated from the RFC series related to tcp_sendmsg(MSG_EOR) and it is targeting the net branch. This patchset focuses on fixing cases where the TCP timestamp could be lost after coalescing skbs. A BPF prog is used to kprobe sock_queue_err_skb() and print out the value of serr->ee.ee_data. The BPF prog (run-able from bcc) is attached here:

BPF prog used for testing:
~~~~~
from __future__ import print_function
from bcc import BPF

bpf_text = """
int trace_err_skb(struct pt_regs *ctx)
{
    struct sk_buff *skb = (struct sk_buff *)ctx->si;
    struct sock *sk = (struct sock *)ctx->di;
    struct sock_exterr_skb *serr;
    u32 ee_data = 0;

    if (!sk || !skb)
        return 0;

    serr = SKB_EXT_ERR(skb);
    bpf_probe_read(&ee_data, sizeof(ee_data), &serr->ee.ee_data);
    bpf_trace_printk("ee_data:%u\\n", ee_data);

    return 0;
};
"""

b = BPF(text=bpf_text)
b.attach_kprobe(event="sock_queue_err_skb", fn_name="trace_err_skb")
print("Attached to kprobe")
b.trace_print()
==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Martin KaFai Lau authored
After receiving sacks, tcp_shifted_skb() will collapse skbs if possible. tx_flags and tskey also have to be merged. This patch reuses the tcp_skb_collapse_tstamp() to handle them.

BPF Output Before:
~~~~~
<no-output-due-to-missing-tstamp-event>

BPF Output After:
~~~~~
<...>-2024 [007] d.s. 88.644374: : ee_data:14599

Packetdrill Script:
~~~~~
+0 `sysctl -q -w net.ipv4.tcp_min_tso_segs=10`
+0 `sysctl -q -w net.ipv4.tcp_no_metrics_save=1`
+0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
+0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
+0 bind(3, ..., ...) = 0
+0 listen(3, 1) = 0
0.100 < S 0:0(0) win 32792 <mss 1460,sackOK,nop,nop,nop,wscale 7>
0.100 > S. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK,nop,wscale 7>
0.200 < . 1:1(0) ack 1 win 257
0.200 accept(3, ..., ...) = 4
+0 setsockopt(4, SOL_TCP, TCP_NODELAY, [1], 4) = 0
0.200 write(4, ..., 1460) = 1460
+0 setsockopt(4, SOL_SOCKET, 37, [2688], 4) = 0
0.200 write(4, ..., 13140) = 13140
0.200 > P. 1:1461(1460) ack 1
0.200 > . 1461:8761(7300) ack 1
0.200 > P. 8761:14601(5840) ack 1
0.300 < . 1:1(0) ack 1 win 257 <sack 1461:14601,nop,nop>
0.300 > P. 1:1461(1460) ack 1
0.400 < . 1:1(0) ack 14601 win 257
0.400 close(4) = 0
0.400 > F. 14601:14601(0) ack 1
0.500 < F. 1:1(0) ack 14602 win 257
0.500 > . 14602:14602(0) ack 2

Signed-off-by: Martin KaFai Lau <kafai@fb.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Neal Cardwell <ncardwell@google.com> Cc: Soheil Hassas Yeganeh <soheil@google.com> Cc: Willem de Bruijn <willemb@google.com> Cc: Yuchung Cheng <ycheng@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Tested-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Martin KaFai Lau authored
If two skbs are merged/collapsed during retransmission, the current logic does not merge the tx_flags and tskey. The end result is the SCM_TSTAMP_ACK timestamp could be missing for a packet. The patch:
1. Merge the tx_flags
2. Overwrite the prev_skb's tskey with the next_skb's tskey

BPF Output Before:
~~~~~~
<no-output-due-to-missing-tstamp-event>

BPF Output After:
~~~~~~
packetdrill-2092 [001] d.s. 453.998486: : ee_data:1459

Packetdrill Script:
~~~~~~
+0 `sysctl -q -w net.ipv4.tcp_min_tso_segs=10`
+0 `sysctl -q -w net.ipv4.tcp_no_metrics_save=1`
+0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
+0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
+0 bind(3, ..., ...) = 0
+0 listen(3, 1) = 0
0.100 < S 0:0(0) win 32792 <mss 1460,sackOK,nop,nop,nop,wscale 7>
0.100 > S. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK,nop,wscale 7>
0.200 < . 1:1(0) ack 1 win 257
0.200 accept(3, ..., ...) = 4
+0 setsockopt(4, SOL_TCP, TCP_NODELAY, [1], 4) = 0
0.200 write(4, ..., 730) = 730
+0 setsockopt(4, SOL_SOCKET, 37, [2688], 4) = 0
0.200 write(4, ..., 730) = 730
+0 setsockopt(4, SOL_SOCKET, 37, [2176], 4) = 0
0.200 write(4, ..., 11680) = 11680
+0 setsockopt(4, SOL_SOCKET, 37, [2688], 4) = 0
0.200 > P. 1:731(730) ack 1
0.200 > P. 731:1461(730) ack 1
0.200 > . 1461:8761(7300) ack 1
0.200 > P. 8761:13141(4380) ack 1
0.300 < . 1:1(0) ack 1 win 257 <sack 1461:2921,nop,nop>
0.300 < . 1:1(0) ack 1 win 257 <sack 1461:4381,nop,nop>
0.300 < . 1:1(0) ack 1 win 257 <sack 1461:5841,nop,nop>
0.300 > P. 1:1461(1460) ack 1
0.400 < . 1:1(0) ack 13141 win 257
0.400 close(4) = 0
0.400 > F. 13141:13141(0) ack 1
0.500 < F. 1:1(0) ack 13142 win 257
0.500 > . 13142:13142(0) ack 2

Signed-off-by: Martin KaFai Lau <kafai@fb.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Neal Cardwell <ncardwell@google.com> Cc: Soheil Hassas Yeganeh <soheil@google.com> Cc: Willem de Bruijn <willemb@google.com> Cc: Yuchung Cheng <ycheng@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Tested-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
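The two steps of the patch (merge the tx_flags, overwrite the tskey) can be sketched on a toy model. Everything here is hypothetical: `SkbInfo` stands in for the skb's shared info, `TSTAMP_FLAGS` is a made-up mask for the timestamp request bits, and doing the merge only when the next skb actually requested a timestamp is an assumption that the commit message does not state.

```python
from dataclasses import dataclass

TSTAMP_FLAGS = 0b0111   # hypothetical mask for the timestamp request bits

@dataclass
class SkbInfo:
    tx_flags: int   # which timestamps were requested for this skb
    tskey: int      # key reported back with the timestamp

def collapse_tstamp(prev: SkbInfo, nxt: SkbInfo) -> None:
    """Fold nxt's timestamp request into prev before nxt is merged away."""
    if nxt.tx_flags & TSTAMP_FLAGS:                   # assumed guard
        prev.tx_flags |= nxt.tx_flags & TSTAMP_FLAGS  # 1. merge the tx_flags
        prev.tskey = nxt.tskey                        # 2. take the later tskey

prev = SkbInfo(tx_flags=0, tskey=729)
nxt = SkbInfo(tx_flags=0b0100, tskey=1459)
collapse_tstamp(prev, nxt)
print(prev)  # SkbInfo(tx_flags=4, tskey=1459)
```

Without the merge, the collapsed skb would carry `tx_flags == 0` and no SCM_TSTAMP_ACK event would be queued, which matches the empty "BPF Output Before" in the commit message.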
-
Grygorii Strashko authored
The cpsw_ndo_open() could try to access CPSW registers before calling pm_runtime_get_sync(). This will trigger L3 error: WARNING: CPU: 0 PID: 21 at drivers/bus/omap_l3_noc.c:147 l3_interrupt_handler+0x220/0x34c() 44000000.ocp:L3 Custom Error: MASTER M2 (64-bit) TARGET L4_FAST (Idle): Data Access in Supervisor mode during Functional access and CPSW will stop functioning. Hence, fix it by moving pm_runtime_get_sync() before the first access to CPSW registers in cpsw_ndo_open(). Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Martin KaFai Lau authored
Assume SOF_TIMESTAMPING_TX_ACK is on. When dup acks are received, the stack could incorrectly think that an skb has already been acked and queue an SCM_TSTAMP_ACK cmsg to the sk->sk_error_queue. In tcp_ack_tstamp(), it checks 'between(shinfo->tskey, prior_snd_una, tcp_sk(sk)->snd_una - 1)'. If prior_snd_una == tcp_sk(sk)->snd_una, as in the following packetdrill script, between() returns true but the tskey is actually not acked, e.g. try between(3, 2, 1). The fix is to replace between() with one before() and one !before(). By doing this, the -1 offset on the tcp_sk(sk)->snd_una can also be removed. A packetdrill script is used to reproduce the dup ack scenario. Due to the lack of cmsg support in packetdrill (or maybe I just cannot find it), a BPF prog is used to kprobe sock_queue_err_skb() and print out the value of serr->ee.ee_data. Both the packetdrill and the bcc BPF script are attached at the end of this commit message.

BPF Output Before Fix:
~~~~~~
<...>-2056 [001] d.s. 433.927987: : ee_data:1459 #incorrect
packetdrill-2056 [001] d.s. 433.929563: : ee_data:1459 #incorrect
packetdrill-2056 [001] d.s. 433.930765: : ee_data:1459 #incorrect
packetdrill-2056 [001] d.s. 434.028177: : ee_data:1459
packetdrill-2056 [001] d.s. 434.029686: : ee_data:14599

BPF Output After Fix:
~~~~~~
<...>-2049 [000] d.s. 113.517039: : ee_data:1459
<...>-2049 [000] d.s. 113.517253: : ee_data:14599

BCC BPF Script:
~~~~~~
#!/usr/bin/env python
from __future__ import print_function
from bcc import BPF

bpf_text = """
#include <uapi/linux/ptrace.h>
#include <net/sock.h>
#include <bcc/proto.h>
#include <linux/errqueue.h>

#ifdef memset
#undef memset
#endif

int trace_err_skb(struct pt_regs *ctx)
{
    struct sk_buff *skb = (struct sk_buff *)ctx->si;
    struct sock *sk = (struct sock *)ctx->di;
    struct sock_exterr_skb *serr;
    u32 ee_data = 0;

    if (!sk || !skb)
        return 0;

    serr = SKB_EXT_ERR(skb);
    bpf_probe_read(&ee_data, sizeof(ee_data), &serr->ee.ee_data);
    bpf_trace_printk("ee_data:%u\\n", ee_data);

    return 0;
};
"""

b = BPF(text=bpf_text)
b.attach_kprobe(event="sock_queue_err_skb", fn_name="trace_err_skb")
print("Attached to kprobe")
b.trace_print()

Packetdrill Script:
~~~~~~
+0 `sysctl -q -w net.ipv4.tcp_min_tso_segs=10`
+0 `sysctl -q -w net.ipv4.tcp_no_metrics_save=1`
+0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
+0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
+0 bind(3, ..., ...) = 0
+0 listen(3, 1) = 0
0.100 < S 0:0(0) win 32792 <mss 1460,sackOK,nop,nop,nop,wscale 7>
0.100 > S. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK,nop,wscale 7>
0.200 < . 1:1(0) ack 1 win 257
0.200 accept(3, ..., ...) = 4
+0 setsockopt(4, SOL_TCP, TCP_NODELAY, [1], 4) = 0
+0 setsockopt(4, SOL_SOCKET, 37, [2688], 4) = 0
0.200 write(4, ..., 1460) = 1460
0.200 write(4, ..., 13140) = 13140
0.200 > P. 1:1461(1460) ack 1
0.200 > . 1461:8761(7300) ack 1
0.200 > P. 8761:14601(5840) ack 1
0.300 < . 1:1(0) ack 1 win 257 <sack 1461:2921,nop,nop>
0.300 < . 1:1(0) ack 1 win 257 <sack 1461:4381,nop,nop>
0.300 < . 1:1(0) ack 1 win 257 <sack 1461:5841,nop,nop>
0.300 > P. 1:1461(1460) ack 1
0.400 < . 1:1(0) ack 14601 win 257
0.400 close(4) = 0
0.400 > F. 14601:14601(0) ack 1
0.500 < F. 1:1(0) ack 14602 win 257
0.500 > . 14602:14602(0) ack 2

Signed-off-by: Martin KaFai Lau <kafai@fb.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Neal Cardwell <ncardwell@google.com> Cc: Soheil Hassas Yeganeh <soheil.kdev@gmail.com> Cc: Willem de Bruijn <willemb@google.com> Cc: Yuchung Cheng <ycheng@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Tested-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
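The between(3, 2, 1) case from this commit message can be checked with a small model of the 32-bit sequence arithmetic. The `before()` and `between()` functions below imitate the kernel helpers with explicit masking. `acked_old()` and `acked_new()` are hypothetical names, and the exact argument order of the replacement check is an assumption derived from the description ("one before() and one !before()", with the -1 offset removed).

```python
MASK32 = 0xFFFFFFFF

def before(seq1, seq2):
    """True if seq1 precedes seq2 in wrapping 32-bit sequence space."""
    return ((seq1 - seq2) & MASK32) >= 0x80000000   # (s32)(seq1 - seq2) < 0

def between(seq1, seq2, seq3):
    """Unsigned range test: is seq2 <= seq1 <= seq3 (mod 2**32)?"""
    return ((seq3 - seq2) & MASK32) >= ((seq1 - seq2) & MASK32)

def acked_old(tskey, prior_snd_una, snd_una):
    # Check as quoted in the commit message.
    return between(tskey, prior_snd_una, (snd_una - 1) & MASK32)

def acked_new(tskey, prior_snd_una, snd_una):
    # Assumed form of the fix: prior_snd_una <= tskey < snd_una.
    return (not before(tskey, prior_snd_una)) and before(tskey, snd_una)

print(between(3, 2, 1))           # True: the range wraps and covers everything
print(acked_old(1459, 1, 1))      # True: a dup ack is mistaken for an ack
print(acked_new(1459, 1, 1))      # False: nothing new was acked
print(acked_new(1459, 1, 14601))  # True: tskey 1459 is covered by the real ack
```

When a dup ack leaves snd_una unchanged, the upper bound `snd_una - 1` falls just below the lower bound, so the unsigned range wraps around nearly the whole sequence space. That corresponds to the three "#incorrect" lines in the before-fix output.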
-
Joe Stringer authored
This is the IPv6 counterpart to commit 8282f274 ("inet: frag: Always orphan skbs inside ip_defrag()"). Prior to commit 029f7f3b ("netfilter: ipv6: nf_defrag: avoid/free clone operations"), ipv6 fragments sent to nf_ct_frag6_gather() would be cloned (implicitly orphaning) prior to queueing for reassembly. As such, when the IPv6 message is eventually reassembled, the skb->sk for all fragments would be NULL. After that commit was introduced, rather than cloning, the original skbs were queued directly without orphaning. The end result is that all frags except for the first and last may have a socket attached. This commit explicitly orphans such skbs during nf_ct_frag6_gather() to prevent BUG_ON(skb->sk) during a later call to ip6_fragment(). kernel BUG at net/ipv6/ip6_output.c:631! [...] Call Trace: <IRQ> [<ffffffff810be8f7>] ? __lock_acquire+0x927/0x20a0 [<ffffffffa042c7c0>] ? do_output.isra.28+0x1b0/0x1b0 [openvswitch] [<ffffffff810bb8a2>] ? __lock_is_held+0x52/0x70 [<ffffffffa042c587>] ovs_fragment+0x1f7/0x280 [openvswitch] [<ffffffff810bdab5>] ? mark_held_locks+0x75/0xa0 [<ffffffff817be416>] ? _raw_spin_unlock_irqrestore+0x36/0x50 [<ffffffff81697ea0>] ? dst_discard_out+0x20/0x20 [<ffffffff81697e80>] ? dst_ifdown+0x80/0x80 [<ffffffffa042c703>] do_output.isra.28+0xf3/0x1b0 [openvswitch] [<ffffffffa042d279>] do_execute_actions+0x709/0x12c0 [openvswitch] [<ffffffffa04340a4>] ? ovs_flow_stats_update+0x74/0x1e0 [openvswitch] [<ffffffffa04340d1>] ? ovs_flow_stats_update+0xa1/0x1e0 [openvswitch] [<ffffffff817be387>] ? _raw_spin_unlock+0x27/0x40 [<ffffffffa042de75>] ovs_execute_actions+0x45/0x120 [openvswitch] [<ffffffffa0432d65>] ovs_dp_process_packet+0x85/0x150 [openvswitch] [<ffffffff817be387>] ? _raw_spin_unlock+0x27/0x40 [<ffffffffa042def4>] ovs_execute_actions+0xc4/0x120 [openvswitch] [<ffffffffa0432d65>] ovs_dp_process_packet+0x85/0x150 [openvswitch] [<ffffffffa04337f2>] ? 
key_extract+0x442/0xc10 [openvswitch] [<ffffffffa043b26d>] ovs_vport_receive+0x5d/0xb0 [openvswitch] [<ffffffff810be8f7>] ? __lock_acquire+0x927/0x20a0 [<ffffffff810be8f7>] ? __lock_acquire+0x927/0x20a0 [<ffffffff810be8f7>] ? __lock_acquire+0x927/0x20a0 [<ffffffff817be416>] ? _raw_spin_unlock_irqrestore+0x36/0x50 [<ffffffffa043c11d>] internal_dev_xmit+0x6d/0x150 [openvswitch] [<ffffffffa043c0b5>] ? internal_dev_xmit+0x5/0x150 [openvswitch] [<ffffffff8168fb5f>] dev_hard_start_xmit+0x2df/0x660 [<ffffffff8168f5ea>] ? validate_xmit_skb.isra.105.part.106+0x1a/0x2b0 [<ffffffff81690925>] __dev_queue_xmit+0x8f5/0x950 [<ffffffff81690080>] ? __dev_queue_xmit+0x50/0x950 [<ffffffff810bdab5>] ? mark_held_locks+0x75/0xa0 [<ffffffff81690990>] dev_queue_xmit+0x10/0x20 [<ffffffff8169a418>] neigh_resolve_output+0x178/0x220 [<ffffffff81752759>] ? ip6_finish_output2+0x219/0x7b0 [<ffffffff81752759>] ip6_finish_output2+0x219/0x7b0 [<ffffffff817525a5>] ? ip6_finish_output2+0x65/0x7b0 [<ffffffff816cde2b>] ? ip_idents_reserve+0x6b/0x80 [<ffffffff8175488f>] ? ip6_fragment+0x93f/0xc50 [<ffffffff81754af1>] ip6_fragment+0xba1/0xc50 [<ffffffff81752540>] ? ip6_flush_pending_frames+0x40/0x40 [<ffffffff81754c6b>] ip6_finish_output+0xcb/0x1d0 [<ffffffff81754dcf>] ip6_output+0x5f/0x1a0 [<ffffffff81754ba0>] ? ip6_fragment+0xc50/0xc50 [<ffffffff81797fbd>] ip6_local_out+0x3d/0x80 [<ffffffff817554df>] ip6_send_skb+0x2f/0xc0 [<ffffffff817555bd>] ip6_push_pending_frames+0x4d/0x50 [<ffffffff817796cc>] icmpv6_push_pending_frames+0xac/0xe0 [<ffffffff8177a4be>] icmpv6_echo_reply+0x42e/0x500 [<ffffffff8177acbf>] icmpv6_rcv+0x4cf/0x580 [<ffffffff81755ac7>] ip6_input_finish+0x1a7/0x690 [<ffffffff81755925>] ? ip6_input_finish+0x5/0x690 [<ffffffff817567a0>] ip6_input+0x30/0xa0 [<ffffffff81755920>] ? ip6_rcv_finish+0x1a0/0x1a0 [<ffffffff817557ce>] ip6_rcv_finish+0x4e/0x1a0 [<ffffffff8175640f>] ipv6_rcv+0x45f/0x7c0 [<ffffffff81755fe6>] ? ipv6_rcv+0x36/0x7c0 [<ffffffff81755780>] ? 
ip6_make_skb+0x1c0/0x1c0 [<ffffffff8168b649>] __netif_receive_skb_core+0x229/0xb80 [<ffffffff810bdab5>] ? mark_held_locks+0x75/0xa0 [<ffffffff8168c07f>] ? process_backlog+0x6f/0x230 [<ffffffff8168bfb6>] __netif_receive_skb+0x16/0x70 [<ffffffff8168c088>] process_backlog+0x78/0x230 [<ffffffff8168c0ed>] ? process_backlog+0xdd/0x230 [<ffffffff8168db43>] net_rx_action+0x203/0x480 [<ffffffff810bdab5>] ? mark_held_locks+0x75/0xa0 [<ffffffff817c156e>] __do_softirq+0xde/0x49f [<ffffffff81752768>] ? ip6_finish_output2+0x228/0x7b0 [<ffffffff817c070c>] do_softirq_own_stack+0x1c/0x30 <EOI> [<ffffffff8106f88b>] do_softirq.part.18+0x3b/0x40 [<ffffffff8106f946>] __local_bh_enable_ip+0xb6/0xc0 [<ffffffff81752791>] ip6_finish_output2+0x251/0x7b0 [<ffffffff81754af1>] ? ip6_fragment+0xba1/0xc50 [<ffffffff816cde2b>] ? ip_idents_reserve+0x6b/0x80 [<ffffffff8175488f>] ? ip6_fragment+0x93f/0xc50 [<ffffffff81754af1>] ip6_fragment+0xba1/0xc50 [<ffffffff81752540>] ? ip6_flush_pending_frames+0x40/0x40 [<ffffffff81754c6b>] ip6_finish_output+0xcb/0x1d0 [<ffffffff81754dcf>] ip6_output+0x5f/0x1a0 [<ffffffff81754ba0>] ? ip6_fragment+0xc50/0xc50 [<ffffffff81797fbd>] ip6_local_out+0x3d/0x80 [<ffffffff817554df>] ip6_send_skb+0x2f/0xc0 [<ffffffff817555bd>] ip6_push_pending_frames+0x4d/0x50 [<ffffffff81778558>] rawv6_sendmsg+0xa28/0xe30 [<ffffffff81719097>] ? inet_sendmsg+0xc7/0x1d0 [<ffffffff817190d6>] inet_sendmsg+0x106/0x1d0 [<ffffffff81718fd5>] ? inet_sendmsg+0x5/0x1d0 [<ffffffff8166d078>] sock_sendmsg+0x38/0x50 [<ffffffff8166d4d6>] SYSC_sendto+0xf6/0x170 [<ffffffff8100201b>] ? 
trace_hardirqs_on_thunk+0x1b/0x1d [<ffffffff8166e38e>] SyS_sendto+0xe/0x10 [<ffffffff817bebe5>] entry_SYSCALL_64_fastpath+0x18/0xa8 Code: 06 48 83 3f 00 75 26 48 8b 87 d8 00 00 00 2b 87 d0 00 00 00 48 39 d0 72 14 8b 87 e4 00 00 00 83 f8 01 75 09 48 83 7f 18 00 74 9a <0f> 0b 41 8b 86 cc 00 00 00 49 8# RIP [<ffffffff8175468a>] ip6_fragment+0x73a/0xc50 RSP <ffff880072803120> Fixes: 029f7f3b ("netfilter: ipv6: nf_defrag: avoid/free clone operations") Reported-by: Daniele Di Proietto <diproiettod@vmware.com> Signed-off-by: Joe Stringer <joe@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
git://ftp.arm.linux.org.uk/~rmk/linux-arm
Linus Torvalds authored
Pull ARM fixes from Russell King: "Three further fixes for ARM. Alexandre Courbot was having problems with DMA allocations with the GFP flags affecting where the tracking data was being allocated from. Vladimir Murzin noticed that the CPU feature code was not entirely correct, which can cause some features to be misreported" * 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm: ARM: 8564/1: fix cpu feature extracting helper ARM: 8563/1: fix demoting HWCAP_SWP ARM: 8551/2: DMA: Fix kzalloc flags in __dma_alloc
-
git://git.kernel.org/pub/scm/linux/kernel/git/tomba/linux
Linus Torvalds authored
Pull fbdev fixes from Tomi Valkeinen: - ARM CLCD: fix regression on multiplatform kernels - panel-sharp-ls037v7dw01: fix possible NULL deref * tag 'fbdev-fixes-4.6' of git://git.kernel.org/pub/scm/linux/kernel/git/tomba/linux: omapfb: panel-sharp-ls037v7dw01: fix check of gpio_to_desc() return value video: ARM CLCD: runtime check for Versatile
-
Linus Torvalds authored
Merge tag 'platform-drivers-x86-v4.6-2' of git://git.infradead.org/users/dvhart/linux-platform-drivers-x86 Pull x86 platform driver fixes from Darren Hart: "An S4 fix for intel-hid, new platform 'quirk' for hp_accel, a fix for broader support of ACPI resources for the Intel P-unit, and a few uninitialized variable fixes. intel p-unit: - decouple telemetry driver from the optional IPC resources thinkpad_acpi: - Silence an uninitialized variable warning intel_telemetry_pltdrv: - Silence an uninitialized variable warning hp_accel: - Silence an uninitialized variable warning - Add support for HP ProBook 440 G3 intel-hid: - add a workaround to ignore an event after waking up from S4" * tag 'platform-drivers-x86-v4.6-2' of git://git.infradead.org/users/dvhart/linux-platform-drivers-x86: platform:x86 decouple telemetry driver from the optional IPC resources thinkpad_acpi: Silence an uninitialized variable warning intel_telemetry_pltdrv: Silence an uninitialized variable warning hp_accel: Silence an uninitialized variable warning hp_accel: Add support for HP ProBook 440 G3 intel-hid: add a workaround to ignore an event after waking up from S4.
-
- 20 Apr, 2016 6 commits
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Linus Torvalds authored
Pull crypto fixes from Herbert Xu: "This fixes the following issues: - Incorrect output buffer size calculation in rsa-pkcs1pad - Uninitialised padding bytes on exported state in ccp driver - Potentially freed pointer used on completion callback in sha1-mb" * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: crypto: ccp - Prevent information leakage on export crypto: sha1-mb - use corrcet pointer while completing jobs crypto: rsa-pkcs1pad - fix dst len
-
Andrew Goodbody authored
This reverts commit cfe25560. This can result in an "Unable to handle kernel paging request" during boot. This was due to using an uninitialised struct member, data->slaves. Signed-off-by: Andrew Goodbody <andrew.goodbody@cambrionix.com> Tested-by: Tony Lindgren <tony@atomide.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jorgen Hansen authored
If skb_recv_datagram returns an skb, we should ignore the err value returned. Otherwise, datagram receives will return EAGAIN when they have to wait for a datagram. Acked-by: Adit Ranadive <aditr@vmware.com> Signed-off-by: Jorgen Hansen <jhansen@vmware.com> Signed-off-by: David S. Miller <davem@davemloft.net>
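A toy model of the rule above: the error value is only meaningful when no skb was returned. `recv_stub()` is a hypothetical stand-in for skb_recv_datagram() that, like the case described in the commit message, can hand back a datagram together with a leftover EAGAIN after having waited. None of the names below are taken from the VSOCK code.

```python
import errno

def recv_stub(queue):
    """Hypothetical stand-in for skb_recv_datagram(): returns (skb, err)."""
    if queue:
        # A datagram arrived after waiting; err still holds the old -EAGAIN.
        return queue.pop(0), -errno.EAGAIN
    return None, -errno.EAGAIN

def dequeue_buggy(queue):
    skb, err = recv_stub(queue)
    if err:                 # checks err even though an skb was returned
        return err
    return len(skb)

def dequeue_fixed(queue):
    skb, err = recv_stub(queue)
    if skb is None:         # err is only consulted when there is no skb
        return err
    return len(skb)

print(dequeue_buggy([b"hello"]) == -errno.EAGAIN)  # True: datagram is lost
print(dequeue_fixed([b"hello"]))                   # 5
print(dequeue_fixed([]) == -errno.EAGAIN)          # True
```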
-
Konstantin Khlebnikov authored
skb->sk could point to timewait or request socket which has no sk_classid. Detected as "BUG: KASAN: slab-out-of-bounds in cls_cgroup_classify". Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Konstantin Khlebnikov authored
This patch fixes a couple of error paths after allocation failures. An atomic set of the page reference counter is safe only if the counter is zero; otherwise the set can race with any speculative get_page_unless_zero. Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Konstantin Khlebnikov authored
High order pages are optional here since commit 51151a16 ("mlx4: allow order-0 memory allocations in RX path"), so there is no reason for depleting reserves. Generic __netdev_alloc_frag() implements the same logic. Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 19 Apr, 2016 1 commit
-
-
Linus Torvalds authored
Merge the ptmx internal interface cleanup branch. This doesn't change semantics, but it should be a sane basis for eventually getting the multi-instance devpts code into some sane shape where we can get rid of the kernel config option. Which we can hopefully get done next merge window.. * ptmx-cleanup: devpts: clean up interface to pty drivers
-