1. 15 Nov, 2017 15 commits
    • tcp: highest_sack fix · 50895b9d
      Eric Dumazet authored
      syzbot easily found a regression added in our latest patches [1]
      
      No longer set tp->highest_sack to the head of the send queue, since
      this is illogical and error prone.
      
      Only SACK processing should maintain the pointer to an skb from the rtx queue.
      
      We might in the future only remember the sequence instead of a pointer to skb,
      since rb-tree should allow a fast lookup.
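The idea in the last paragraph, remembering a sequence number rather than a pointer to an skb, can be sketched in user-space C. This is only an illustration: `struct sack_state`, `sack_note()` and `seq_after()` are hypothetical names that do not come from the kernel, and only the wraparound-safe comparison follows the usual signed-difference idiom for 32-bit TCP sequence numbers.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if sequence number a is later than b, allowing for 32-bit wraparound. */
bool seq_after(uint32_t a, uint32_t b)
{
    return (int32_t)(b - a) < 0;
}

/*
 * Hypothetical SACK bookkeeping: the highest SACKed sequence is stored as a
 * plain value, so freeing a buffer can never leave this field dangling.
 */
struct sack_state {
    bool     valid;
    uint32_t highest_sack_seq;
};

/* Record seq if it is the first value seen or later than the stored one. */
void sack_note(struct sack_state *st, uint32_t seq)
{
    if (!st->valid || seq_after(seq, st->highest_sack_seq)) {
        st->valid = true;
        st->highest_sack_seq = seq;
    }
}
```

Because the state holds a value instead of a pointer, a use-after-free like the one in the report below cannot occur on this field; the cost is a lookup whenever the skb itself is needed.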
      
      [1]
      BUG: KASAN: use-after-free in tcp_highest_sack_seq include/net/tcp.h:1706 [inline]
      BUG: KASAN: use-after-free in tcp_ack+0x42bb/0x4fd0 net/ipv4/tcp_input.c:3537
      Read of size 4 at addr ffff8801c154faa8 by task syz-executor4/12860
      
      CPU: 0 PID: 12860 Comm: syz-executor4 Not tainted 4.14.0-next-20171113+ #41
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:17 [inline]
       dump_stack+0x194/0x257 lib/dump_stack.c:53
       print_address_description+0x73/0x250 mm/kasan/report.c:252
       kasan_report_error mm/kasan/report.c:351 [inline]
       kasan_report+0x25b/0x340 mm/kasan/report.c:409
       __asan_report_load4_noabort+0x14/0x20 mm/kasan/report.c:429
       tcp_highest_sack_seq include/net/tcp.h:1706 [inline]
       tcp_ack+0x42bb/0x4fd0 net/ipv4/tcp_input.c:3537
       tcp_rcv_established+0x672/0x18a0 net/ipv4/tcp_input.c:5439
       tcp_v4_do_rcv+0x2ab/0x7d0 net/ipv4/tcp_ipv4.c:1468
       sk_backlog_rcv include/net/sock.h:909 [inline]
       __release_sock+0x124/0x360 net/core/sock.c:2264
       release_sock+0xa4/0x2a0 net/core/sock.c:2778
       tcp_sendmsg+0x3a/0x50 net/ipv4/tcp.c:1462
       inet_sendmsg+0x11f/0x5e0 net/ipv4/af_inet.c:763
       sock_sendmsg_nosec net/socket.c:632 [inline]
       sock_sendmsg+0xca/0x110 net/socket.c:642
       ___sys_sendmsg+0x75b/0x8a0 net/socket.c:2048
       __sys_sendmsg+0xe5/0x210 net/socket.c:2082
       SYSC_sendmsg net/socket.c:2093 [inline]
       SyS_sendmsg+0x2d/0x50 net/socket.c:2089
       entry_SYSCALL_64_fastpath+0x1f/0x96
      RIP: 0033:0x452879
      RSP: 002b:00007fc9761bfbe8 EFLAGS: 00000212 ORIG_RAX: 000000000000002e
      RAX: ffffffffffffffda RBX: 0000000000758020 RCX: 0000000000452879
      RDX: 0000000000000000 RSI: 0000000020917fc8 RDI: 0000000000000015
      RBP: 0000000000000086 R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000212 R12: 00000000006ee3a0
      R13: 00000000ffffffff R14: 00007fc9761c06d4 R15: 0000000000000000
      
      Allocated by task 12860:
       save_stack+0x43/0xd0 mm/kasan/kasan.c:447
       set_track mm/kasan/kasan.c:459 [inline]
       kasan_kmalloc+0xad/0xe0 mm/kasan/kasan.c:551
       kasan_slab_alloc+0x12/0x20 mm/kasan/kasan.c:489
       kmem_cache_alloc_node+0x144/0x760 mm/slab.c:3638
       __alloc_skb+0xf1/0x780 net/core/skbuff.c:193
       alloc_skb_fclone include/linux/skbuff.h:1023 [inline]
       sk_stream_alloc_skb+0x11d/0x900 net/ipv4/tcp.c:870
       tcp_sendmsg_locked+0x1341/0x3b80 net/ipv4/tcp.c:1299
       tcp_sendmsg+0x2f/0x50 net/ipv4/tcp.c:1461
       inet_sendmsg+0x11f/0x5e0 net/ipv4/af_inet.c:763
       sock_sendmsg_nosec net/socket.c:632 [inline]
       sock_sendmsg+0xca/0x110 net/socket.c:642
       SYSC_sendto+0x358/0x5a0 net/socket.c:1749
       SyS_sendto+0x40/0x50 net/socket.c:1717
       entry_SYSCALL_64_fastpath+0x1f/0x96
      
      Freed by task 12860:
       save_stack+0x43/0xd0 mm/kasan/kasan.c:447
       set_track mm/kasan/kasan.c:459 [inline]
       kasan_slab_free+0x71/0xc0 mm/kasan/kasan.c:524
       __cache_free mm/slab.c:3492 [inline]
       kmem_cache_free+0x77/0x280 mm/slab.c:3750
       kfree_skbmem+0xdd/0x1d0 net/core/skbuff.c:603
       __kfree_skb+0x1d/0x20 net/core/skbuff.c:642
       sk_wmem_free_skb include/net/sock.h:1419 [inline]
       tcp_rtx_queue_unlink_and_free include/net/tcp.h:1682 [inline]
       tcp_clean_rtx_queue net/ipv4/tcp_input.c:3111 [inline]
       tcp_ack+0x1b17/0x4fd0 net/ipv4/tcp_input.c:3593
       tcp_rcv_established+0x672/0x18a0 net/ipv4/tcp_input.c:5439
       tcp_v4_do_rcv+0x2ab/0x7d0 net/ipv4/tcp_ipv4.c:1468
       sk_backlog_rcv include/net/sock.h:909 [inline]
       __release_sock+0x124/0x360 net/core/sock.c:2264
       release_sock+0xa4/0x2a0 net/core/sock.c:2778
       tcp_sendmsg+0x3a/0x50 net/ipv4/tcp.c:1462
       inet_sendmsg+0x11f/0x5e0 net/ipv4/af_inet.c:763
       sock_sendmsg_nosec net/socket.c:632 [inline]
       sock_sendmsg+0xca/0x110 net/socket.c:642
       ___sys_sendmsg+0x75b/0x8a0 net/socket.c:2048
       __sys_sendmsg+0xe5/0x210 net/socket.c:2082
       SYSC_sendmsg net/socket.c:2093 [inline]
       SyS_sendmsg+0x2d/0x50 net/socket.c:2089
       entry_SYSCALL_64_fastpath+0x1f/0x96
      
      The buggy address belongs to the object at ffff8801c154fa80
       which belongs to the cache skbuff_fclone_cache of size 456
      The buggy address is located 40 bytes inside of
       456-byte region [ffff8801c154fa80, ffff8801c154fc48)
      The buggy address belongs to the page:
      page:ffffea00070553c0 count:1 mapcount:0 mapping:ffff8801c154f080 index:0x0
      flags: 0x2fffc0000000100(slab)
      raw: 02fffc0000000100 ffff8801c154f080 0000000000000000 0000000100000006
      raw: ffffea00070a5a20 ffffea0006a18360 ffff8801d9ca0500 0000000000000000
      page dumped because: kasan: bad access detected
      
      Fixes: 737ff314 ("tcp: use sequence distance to detect reordering")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: Yuchung Cheng <ycheng@google.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • geneve: fix fill_info when link down · fd7eafd0
      Hangbin Liu authored
      geneve->sock4/6 are added in geneve_open and released in geneve_stop.
      So when the geneve link is down, we are not able to show the remote address
      and checksum info after commit 11387fe4 ("geneve: fix fill_info when using
      collect_metadata").
      
      Fix this by not passing *_REMOTE{,6} for COLLECT_METADATA, since they are
      mutually exclusive, and by always showing the UDP_ZERO_CSUM6_RX info.
      
      Fixes: 11387fe4 ("geneve: fix fill_info when using collect_metadata")
      Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bpf: fix lockdep splat · 89ad2fa3
      Eric Dumazet authored
      pcpu_freelist_pop() needs the same lockdep awareness as
      pcpu_freelist_populate() to avoid a false positive.
      
       [ INFO: SOFTIRQ-safe -> SOFTIRQ-unsafe lock order detected ]
      
       switchto-defaul/12508 [HC0[0]:SC0[6]:HE0:SE0] is trying to acquire:
        (&htab->buckets[i].lock){......}, at: [<ffffffff9dc099cb>] __htab_percpu_map_update_elem+0x1cb/0x300
      
       and this task is already holding:
        (dev_queue->dev->qdisc_class ?: &qdisc_tx_lock#2){+.-...}, at: [<ffffffff9e135848>] __dev_queue_xmit+0x868/0x1240
       which would create a new lock dependency:
        (dev_queue->dev->qdisc_class ?: &qdisc_tx_lock#2){+.-...} -> (&htab->buckets[i].lock){......}
      
       but this new dependency connects a SOFTIRQ-irq-safe lock:
        (dev_queue->dev->qdisc_class ?: &qdisc_tx_lock#2){+.-...}
       ... which became SOFTIRQ-irq-safe at:
         [<ffffffff9db5931b>] __lock_acquire+0x42b/0x1f10
         [<ffffffff9db5b32c>] lock_acquire+0xbc/0x1b0
         [<ffffffff9da05e38>] _raw_spin_lock+0x38/0x50
         [<ffffffff9e135848>] __dev_queue_xmit+0x868/0x1240
         [<ffffffff9e136240>] dev_queue_xmit+0x10/0x20
         [<ffffffff9e1965d9>] ip_finish_output2+0x439/0x590
         [<ffffffff9e197410>] ip_finish_output+0x150/0x2f0
         [<ffffffff9e19886d>] ip_output+0x7d/0x260
         [<ffffffff9e19789e>] ip_local_out+0x5e/0xe0
         [<ffffffff9e197b25>] ip_queue_xmit+0x205/0x620
         [<ffffffff9e1b8398>] tcp_transmit_skb+0x5a8/0xcb0
         [<ffffffff9e1ba152>] tcp_write_xmit+0x242/0x1070
         [<ffffffff9e1baffc>] __tcp_push_pending_frames+0x3c/0xf0
         [<ffffffff9e1b3472>] tcp_rcv_established+0x312/0x700
         [<ffffffff9e1c1acc>] tcp_v4_do_rcv+0x11c/0x200
         [<ffffffff9e1c3dc2>] tcp_v4_rcv+0xaa2/0xc30
         [<ffffffff9e191107>] ip_local_deliver_finish+0xa7/0x240
         [<ffffffff9e191a36>] ip_local_deliver+0x66/0x200
         [<ffffffff9e19137d>] ip_rcv_finish+0xdd/0x560
         [<ffffffff9e191e65>] ip_rcv+0x295/0x510
         [<ffffffff9e12ff88>] __netif_receive_skb_core+0x988/0x1020
         [<ffffffff9e130641>] __netif_receive_skb+0x21/0x70
         [<ffffffff9e1306ff>] process_backlog+0x6f/0x230
         [<ffffffff9e132129>] net_rx_action+0x229/0x420
         [<ffffffff9da07ee8>] __do_softirq+0xd8/0x43d
         [<ffffffff9e282bcc>] do_softirq_own_stack+0x1c/0x30
         [<ffffffff9dafc2f5>] do_softirq+0x55/0x60
         [<ffffffff9dafc3a8>] __local_bh_enable_ip+0xa8/0xb0
         [<ffffffff9db4c727>] cpu_startup_entry+0x1c7/0x500
         [<ffffffff9daab333>] start_secondary+0x113/0x140
      
       to a SOFTIRQ-irq-unsafe lock:
        (&head->lock){+.+...}
       ... which became SOFTIRQ-irq-unsafe at:
       ...  [<ffffffff9db5971f>] __lock_acquire+0x82f/0x1f10
         [<ffffffff9db5b32c>] lock_acquire+0xbc/0x1b0
         [<ffffffff9da05e38>] _raw_spin_lock+0x38/0x50
         [<ffffffff9dc0b7fa>] pcpu_freelist_pop+0x7a/0xb0
         [<ffffffff9dc08b2c>] htab_map_alloc+0x50c/0x5f0
         [<ffffffff9dc00dc5>] SyS_bpf+0x265/0x1200
         [<ffffffff9e28195f>] entry_SYSCALL_64_fastpath+0x12/0x17
      
       other info that might help us debug this:
      
       Chain exists of:
         dev_queue->dev->qdisc_class ?: &qdisc_tx_lock#2 --> &htab->buckets[i].lock --> &head->lock
      
        Possible interrupt unsafe locking scenario:
      
              CPU0                    CPU1
              ----                    ----
         lock(&head->lock);
                                      local_irq_disable();
                                      lock(dev_queue->dev->qdisc_class ?: &qdisc_tx_lock#2);
                                      lock(&htab->buckets[i].lock);
         <Interrupt>
           lock(dev_queue->dev->qdisc_class ?: &qdisc_tx_lock#2);
      
        *** DEADLOCK ***
      
      Fixes: e19494ed ("bpf: introduce percpu_freelist")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: cdc_ncm: GetNtbFormat endian fix · 6314dab4
      Bjørn Mork authored
      The GetNtbFormat and SetNtbFormat requests operate on 16 bit little
      endian values. We get away with ignoring this most of the time, because
      we only care about USB_CDC_NCM_NTB16_FORMAT which is 0x0000.  This
      fails for USB_CDC_NCM_NTB32_FORMAT.
      
      Fix comparison between LE value from device and constant by converting
      the constant to LE.
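The comparison described above can be sketched in portable user-space C. This is not the driver code: `put_le16()` and `ntb_format_is()` are hypothetical helpers, and the value 0x0001 for the 32-bit format is an assumption (the commit message only states that the 16-bit format is 0x0000).

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Local stand-ins for the NTB format codes; 0x0001 is assumed for NTB32. */
enum { NTB16_FORMAT = 0x0000, NTB32_FORMAT = 0x0001 };

/* Encode a host-order value as the two little-endian bytes used on the wire. */
void put_le16(uint16_t v, uint8_t out[2])
{
    out[0] = (uint8_t)(v & 0xff);
    out[1] = (uint8_t)(v >> 8);
}

/*
 * Compare the device's raw little-endian reply with a host-order constant
 * by converting the constant to little endian first, as the fix does.
 */
bool ntb_format_is(const uint8_t wire[2], uint16_t format)
{
    uint8_t expected[2];

    put_le16(format, expected);
    return memcmp(wire, expected, sizeof(expected)) == 0;
}
```

Comparing bytes gives the same answer on little- and big-endian hosts. A direct comparison of the raw 16-bit value with the constant would only be reliable for 0x0000, which is why the bug went unnoticed for the 16-bit format.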
      
      Reported-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
      Fixes: 2b02c20c ("cdc_ncm: Set NTB format again after altsetting switch for Huawei devices")
      Cc: Enrico Mioso <mrkiko.rs@gmail.com>
      Cc: Christian Panton <christian@panton.org>
      Signed-off-by: Bjørn Mork <bjorn@mork.no>
      Acked-by: Enrico Mioso <mrkiko.rs@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • openvswitch: meter: fix NULL pointer dereference in ovs_meter_cmd_reply_start · b74912a2
      Gustavo A. R. Silva authored
      It seems that the intention of the code is to null check the value
      returned by function genlmsg_put. But the current code is null
      checking the address of the pointer that holds the value returned
      by genlmsg_put.
      
      Fix this by properly null checking the value returned by function
      genlmsg_put in order to avoid a potential null pointer dereference.
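The difference between checking the pointer's address and checking the returned value can be illustrated with a small stand-alone sketch. All names here (`struct reply_hdr`, `fake_put()`, `reply_start()`) are hypothetical; `fake_put()` merely stands in for a function that, like genlmsg_put, may return NULL.

```c
#include <stddef.h>

struct reply_hdr {
    int id;
};

/* Hypothetical stand-in for an allocator that returns NULL on failure. */
struct reply_hdr *fake_put(struct reply_hdr *storage, int fail)
{
    return fail ? NULL : storage;
}

/* Returns 0 on success, -1 if no header could be placed. */
int reply_start(struct reply_hdr **hdr_out, struct reply_hdr *storage, int fail)
{
    *hdr_out = fake_put(storage, fail);

    /*
     * The buggy form would be `if (!hdr_out)`: that tests the caller's
     * pointer-to-pointer, which is valid here, so a NULL result slips through.
     */
    if (!*hdr_out)
        return -1;

    (*hdr_out)->id = 1; /* safe: the returned value was checked */
    return 0;
}
```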
      
      Addresses-Coverity-ID: 1461561 ("Dereference before null check")
      Addresses-Coverity-ID: 1461562 ("Dereference null return value")
      Fixes: 96fbc13d ("openvswitch: Add meter infrastructure")
      Signed-off-by: Gustavo A. R. Silva <garsilva@embeddedor.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'netem-fix-compilation-on-32-bit' · 69d48179
      David S. Miller authored
      Stephen Hemminger says:
      
      ====================
      netem: fix compilation on 32 bit
      
      A couple of places where 64 bit CPU was being assumed incorrectly.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • netem: remove unnecessary 64 bit modulus · 9b0ed891
      Stephen Hemminger authored
      Fix compilation on 32 bit platforms (where doing modulus operation
      with 64 bit requires extra glibc functions) by truncation.
      The jitter for table distribution is limited to a 32 bit value
      because random numbers are scaled as 32 bit value.
      
      Also fix some whitespace.
      
      Fixes: 99803171 ("netem: add uapi to express delay and jitter in nanoseconds")
      Reported-by: Randy Dunlap <rdunlap@infradead.org>
      Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • netem: use 64 bit divide by rate · bce552fd
      Stephen Hemminger authored
      Since times are now expressed in nanoseconds, a true 64 bit divide
      is needed. The old code would truncate the rate to 32 bits.
      Rename the function to better express its current usage.
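A minimal user-space sketch of why the divisor must stay 64 bits wide. The function names are hypothetical and the rate is assumed to be in bytes per second; the second function exists only to show the effect of truncating the rate.

```c
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Nanoseconds needed to send len bytes at rate bytes/second (rate != 0). */
uint64_t packet_time_ns(uint64_t len, uint64_t rate)
{
    return (len * NSEC_PER_SEC) / rate;
}

/*
 * Same computation with the rate truncated to 32 bits, for comparison only.
 * The result is wrong for rates of 2^32 bytes/second and above, and the
 * truncated divisor is zero when the rate is an exact multiple of 2^32.
 */
uint64_t packet_time_ns_trunc32(uint64_t len, uint64_t rate)
{
    return (len * NSEC_PER_SEC) / (uint32_t)rate;
}
```

For a 1500-byte packet at 5,000,000,000 bytes per second the full 64 bit divide gives 300 ns, while the truncated divisor yields a much larger value because the rate no longer fits in 32 bits.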
      
      Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tcp: Namespace-ify sysctl_tcp_default_congestion_control · 6670e152
      Stephen Hemminger authored
      Make the default TCP congestion control a per-namespace value.
      This changes the default congestion control to a pointer to congestion ops
      (rather than implicitly the first element of the available list).
      
      The congestion control setting of new namespaces is inherited
      from the current setting of the root namespace.
      
      Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: Protect iterations over net::fib_notifier_ops in fib_seq_sum() · 11bf284f
      Kirill Tkhai authored
      There is at least one unlocked deletion of net->ipv4.fib_notifier_ops
      from net::fib_notifier_ops:
      
      ip_fib_net_exit()
        rtnl_unlock()
        fib4_notifier_exit()
          fib_notifier_ops_unregister(net->ipv4.notifier_ops)
            list_del_rcu(&ops->list)
      
      So fib_seq_sum() can't rely on rtnl_lock() alone for protection.
      
      A possible solution would be to use rtnl_lock()
      in fib_notifier_ops_unregister(), but this adds
      a possible delay during net namespace creation,
      so we better use rcu_read_lock() until someone
      really needs the mutex (if that happens).
      
      Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv6: set all.accept_dad to 0 by default · 09400953
      Nicolas Dichtel authored
      With commits 35e015e1 and a2d3f3e3, the global 'accept_dad' flag
      is also taken into account (default value is 1). If either global or
      per-interface flag is non-zero, DAD will be enabled on a given interface.
      
      This is not backward compatible: before those patches, the user could
      disable DAD just by setting the per-interface flag to 0. Now, the
      user instead needs to set both flags to 0 to actually disable DAD.
      
      Restore the previous behaviour by setting the default for the global
      'accept_dad' flag to 0. This way, DAD is still enabled by default,
      as per-interface flags are set to 1 on device creation, but setting
      them to 0 is enough to disable DAD on a given interface.
      
      - Before 35e015e1 and a2d3f3e3:
                global    per-interface    DAD enabled
      [default]   1             1              yes
                  X             0              no
                  X             1              yes
      
      - After 35e015e1 and a2d3f3e3:
                global    per-interface    DAD enabled
      [default]   1             1              yes
                  0             0              no
                  0             1              yes
                  1             0              yes
      
      - After this fix:
                global    per-interface    DAD enabled
                  1             1              yes
                  0             0              no
      [default]   0             1              yes
                  1             0              yes
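The three tables reduce to two small predicates, sketched below in plain C. The function names are hypothetical; the logic is taken directly from the tables, where X in the first table means the global flag is ignored.

```c
#include <stdbool.h>

/* Before 35e015e1 and a2d3f3e3: only the per-interface flag matters. */
bool dad_enabled_old(int global, int per_if)
{
    (void)global; /* ignored, the "X" rows in the first table */
    return per_if != 0;
}

/* After those commits: DAD is enabled if either flag is non-zero. */
bool dad_enabled_new(int global, int per_if)
{
    return global != 0 || per_if != 0;
}
```

With the global default changed to 0, `dad_enabled_new(0, per_if)` gives the same result as `dad_enabled_old()` for every per-interface value, which is how the fix restores the previous behaviour without reverting the new semantics.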
      
      Fixes: 35e015e1 ("ipv6: fix net.ipv6.conf.all interface DAD handlers")
      Fixes: a2d3f3e3 ("ipv6: fix net.ipv6.conf.all.accept_dad behaviour for real")
      CC: Stefano Brivio <sbrivio@redhat.com>
      CC: Matteo Croce <mcroce@redhat.com>
      CC: Erik Kline <ek@google.com>
      Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
      Acked-by: Stefano Brivio <sbrivio@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • uapi: fix linux/tls.h userspace compilation error · b9f3eb49
      Dmitry V. Levin authored
      Move inclusion of a private kernel header <net/tcp.h>
      from uapi/linux/tls.h to its only user - net/tls.h,
      to fix the following linux/tls.h userspace compilation error:
      
      /usr/include/linux/tls.h:41:21: fatal error: net/tcp.h: No such file or directory
      
      As up to this point uapi/linux/tls.h was totally unusable for userspace,
      clean up this header file further by moving other redundant includes
      to net/tls.h.
      
      Fixes: 3c4d7559 ("tls: kernel TLS support")
      Cc: <stable@vger.kernel.org> # v4.13+
      Signed-off-by: Dmitry V. Levin <ldv@altlinux.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • usbnet: ipheth: prevent TX queue timeouts when device not ready · bb1b40c7
      Alexander Kappner authored
      iOS devices require the host to be "trusted" before servicing network
      packets. Establishing trust requires the user to confirm a dialog on the
      iOS device. Until trust is established, the iOS device will silently discard
      network packets from the host. Currently, the ipheth driver does not detect
      whether an iOS device has established trust with the host, and immediately
      sets up the transmit queues.
      
      This causes the following problems:
      
      - Kernel taint due to WARN() in netdev watchdog.
      - Dmesg spam ("TX timeout").
      - Disruption of user space networking activity (dhcpd, etc.) when a new
        interface comes up but cannot be used.
      - Unnecessary host and device wakeups and USB traffic.
      
      Example dmesg output:
      
      [ 1101.319778] NETDEV WATCHDOG: eth1 (ipheth): transmit queue 0 timed out
      [ 1101.319817] ------------[ cut here ]------------
      [ 1101.319828] WARNING: CPU: 0 PID: 0 at net/sched/sch_generic.c:316 dev_watchdog+0x20f/0x220
      [ 1101.319831] Modules linked in: ipheth usbmon nvidia_drm(PO) nvidia_modeset(PO) nvidia(PO) iwlmvm mac80211 iwlwifi btusb btrtl btbcm btintel qmi_wwan bluetooth cfg80211 ecdh_generic thinkpad_acpi rfkill [last unloaded: ipheth]
      [ 1101.319861] CPU: 0 PID: 0 Comm: swapper/0 Tainted: P           O    4.13.12.1 #1
      [ 1101.319864] Hardware name: LENOVO 20ENCTO1WW/20ENCTO1WW, BIOS N1EET62W (1.35 ) 11/10/2016
      [ 1101.319867] task: ffffffff81e11500 task.stack: ffffffff81e00000
      [ 1101.319873] RIP: 0010:dev_watchdog+0x20f/0x220
      [ 1101.319876] RSP: 0018:ffff8810a3c03e98 EFLAGS: 00010292
      [ 1101.319880] RAX: 000000000000003a RBX: 0000000000000000 RCX: 0000000000000000
      [ 1101.319883] RDX: ffff8810a3c15c48 RSI: ffffffff81ccbfc2 RDI: 00000000ffffffff
      [ 1101.319886] RBP: ffff880c04ebc41c R08: 0000000000000000 R09: 0000000000000379
      [ 1101.319889] R10: 00000100696589d0 R11: 0000000000000378 R12: ffff880c04ebc000
      [ 1101.319892] R13: 0000000000000000 R14: 0000000000000001 R15: ffff880c2865fc80
      [ 1101.319896] FS:  0000000000000000(0000) GS:ffff8810a3c00000(0000) knlGS:0000000000000000
      [ 1101.319899] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [ 1101.319902] CR2: 00007f3ff24ac000 CR3: 0000000001e0a000 CR4: 00000000003406f0
      [ 1101.319905] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [ 1101.319908] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [ 1101.319910] Call Trace:
      [ 1101.319914]  <IRQ>
      [ 1101.319921]  ? dev_graft_qdisc+0x70/0x70
      [ 1101.319928]  ? dev_graft_qdisc+0x70/0x70
      [ 1101.319934]  ? call_timer_fn+0x2e/0x170
      [ 1101.319939]  ? dev_graft_qdisc+0x70/0x70
      [ 1101.319944]  ? run_timer_softirq+0x1ea/0x440
      [ 1101.319951]  ? timerqueue_add+0x54/0x80
      [ 1101.319956]  ? enqueue_hrtimer+0x38/0xa0
      [ 1101.319963]  ? __do_softirq+0xed/0x2e7
      [ 1101.319970]  ? irq_exit+0xb4/0xc0
      [ 1101.319976]  ? smp_apic_timer_interrupt+0x39/0x50
      [ 1101.319981]  ? apic_timer_interrupt+0x8c/0xa0
      [ 1101.319983]  </IRQ>
      [ 1101.319992]  ? cpuidle_enter_state+0xfa/0x2a0
      [ 1101.319999]  ? do_idle+0x1a3/0x1f0
      [ 1101.320004]  ? cpu_startup_entry+0x5f/0x70
      [ 1101.320011]  ? start_kernel+0x444/0x44c
      [ 1101.320017]  ? early_idt_handler_array+0x120/0x120
      [ 1101.320023]  ? x86_64_start_kernel+0x145/0x154
      [ 1101.320028]  ? secondary_startup_64+0x9f/0x9f
      [ 1101.320033] Code: 20 04 00 00 eb 9f 4c 89 e7 c6 05 59 44 71 00 01 e8 a7 df fd ff 89 d9 4c 89 e6 48 c7 c7 70 b7 cd 81 48 89 c2 31 c0 e8 97 64 90 ff <0f> ff eb bf 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00
      [ 1101.320103] ---[ end trace 0cc4d251e2b57080 ]---
      [ 1101.320110] ipheth 1-5:4.2: ipheth_tx_timeout: TX timeout
      
      The last message "TX timeout" is repeated every 5 seconds until trust is
      established or the device is disconnected, filling up dmesg.
      
      The proposed patch eliminates the problem by, upon connection, keeping the
      TX queue and carrier disabled until a packet is first received from the iOS
      device. This is reflected by the confirmed_pairing variable in the device
      structure. Only after at least one packet has been received from the iOS
      device, the transmit queue and carrier are brought up during the periodic
      device poll in ipheth_carrier_set. Because the iOS device will always send
      a packet immediately upon trust being established, this should not delay
      the interface becoming usable. To prevent failed URBs in
      ipheth_rcvbulk_callback from perpetually re-enabling the queue if it was
      disabled, a new check is added so only successful transfers re-enable the
      queue, whereas failed transfers only trigger an immediate poll.
      
      This has the added benefit of removing the periodic control requests to the
      iOS device until trust has been established and thus should reduce wakeup
      events on both the host and the iOS device.
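The gating logic described above can be sketched as a tiny state machine in user-space C. This is a simplification under stated assumptions, not the driver: `struct link_state`, `rx_done()` and `periodic_poll()` are hypothetical names, the USB transfers are reduced to a success flag, and the carrier query performed by the real poll is omitted.

```c
#include <stdbool.h>

/* Hypothetical per-device state; only confirmed_pairing is named in the text. */
struct link_state {
    bool confirmed_pairing; /* set once a packet has arrived from the device */
    bool carrier_on;
    bool tx_queue_running;
};

/* Receive completion: only a successful transfer confirms pairing. */
void rx_done(struct link_state *s, bool transfer_ok)
{
    if (transfer_ok)
        s->confirmed_pairing = true;
    /* A failed transfer enables nothing; the real driver would only
     * schedule an immediate poll at this point. */
}

/* Periodic poll: bring up carrier and TX queue only after pairing. */
void periodic_poll(struct link_state *s)
{
    if (!s->confirmed_pairing)
        return;
    s->carrier_on = true;
    s->tx_queue_running = true;
}
```

Until `rx_done()` has seen one successful transfer, repeated polls leave the queue stopped, so the netdev watchdog has no pending transmissions to time out on.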
      
      Signed-off-by: Alexander Kappner <agk@godking.net>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • vhost_net: conditionally enable tx polling · feb8892c
      Jason Wang authored
      We always poll tx for the socket. This is suboptimal since it
      slightly increases the waitqueue traversal time and, more importantly,
      vhost cannot benefit from commit 9e641bdc ("net-tun:
      restructure tun_do_read for better sleep/wakeup efficiency"): even if
      we've stopped rx polling during handle_rx(), the tx poll is still left
      in the waitqueue.
      
      Pktgen from a remote host to VM over mlx4 on two 2.00GHz Xeon E5-2650
      shows an 11.7% improvement in rx PPS (from 1.28Mpps to 1.44Mpps).
      
      Cc: Wei Xu <wexu@redhat.com>
      Cc: Matthew Rosato <mjrosato@linux.vnet.ibm.com>
      Signed-off-by: Jason Wang <jasowang@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • uapi: fix linux/rxrpc.h userspace compilation errors · 0eef304b
      Dmitry V. Levin authored
      Consistently use types provided by <linux/types.h> to fix the following
      linux/rxrpc.h userspace compilation errors:
      
      /usr/include/linux/rxrpc.h:24:2: error: unknown type name 'u16'
        u16  srx_service; /* service desired */
      /usr/include/linux/rxrpc.h:25:2: error: unknown type name 'u16'
        u16  transport_type; /* type of transport socket (SOCK_DGRAM) */
      /usr/include/linux/rxrpc.h:26:2: error: unknown type name 'u16'
        u16  transport_len; /* length of transport address */
      
      Use __kernel_sa_family_t instead of sa_family_t the same way
      as uapi/linux/in.h does, to fix the following
      linux/rxrpc.h userspace compilation errors:
      
      /usr/include/linux/rxrpc.h:23:2: error: unknown type name 'sa_family_t'
        sa_family_t srx_family; /* address family */
      /usr/include/linux/rxrpc.h:28:3: error: unknown type name 'sa_family_t'
        sa_family_t family;  /* transport address family */
      
      Fixes: 727f8914 ("rxrpc: Expose UAPI definitions to userspace")
      Cc: <stable@vger.kernel.org> # v4.14
      Signed-off-by: Dmitry V. Levin <ldv@altlinux.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 14 Nov, 2017 25 commits