1. 10 Jun, 2016 15 commits
  2. 09 Jun, 2016 11 commits
  3. 08 Jun, 2016 9 commits
    • sfc: report supported link speeds on SFP connections · 3497ed8c
      Bert Kenward authored
      7000-series SFC NICs connected with an SFP+ module currently fail to
      report any supported link speeds.
      Reported-by: Jarod Wilson <jarod@redhat.com>
      Signed-off-by: Bert Kenward <bkenward@solarflare.com>
      Reviewed-by: Jarod Wilson <jarod@redhat.com>
      Tested-by: Jarod Wilson <jarod@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      3497ed8c
    • net_sched: add missing paddattr description · e0d194ad
      Eric Dumazet authored
      "make htmldocs" complains otherwise:
      
      .//net/core/gen_stats.c:65: warning: No description found for parameter 'padattr'
      .//net/core/gen_stats.c:101: warning: No description found for parameter 'padattr'
      
      Fixes: 9854518e ("sched: align nlattr properly when needed")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: kbuild test robot <fengguang.wu@intel.com>
      Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e0d194ad
    • ipv6: Skip XFRM lookup if dst_entry in socket cache is valid · 00bc0ef5
      Jakub Sitnicki authored
      At present we perform an xfrm_lookup() for each UDPv6 message we
      send. The lookup involves querying the flow cache (flow_cache_lookup)
      and, in case of a cache miss, creating an XFRM bundle.
      
      If we miss the flow cache, we can end up creating a new bundle and
      deriving the path MTU (xfrm_init_pmtu) from an already transformed
      dst_entry, which we pass from the socket cache (sk->sk_dst_cache) down
      to xfrm_lookup(). This can happen only if we're caching the dst_entry
      in the socket, that is when we're using a connected UDP socket.
      
      To put it another way, the path MTU shrinks each time we miss the flow
      cache, which later on leads to incorrectly fragmented payload. It can
      be observed with ESPv6 in transport mode:
      
        1) Set up a transformation and lower the MTU to trigger fragmentation
          # ip xfrm policy add dir out src ::1 dst ::1 \
            tmpl src ::1 dst ::1 proto esp spi 1
          # ip xfrm state add src ::1 dst ::1 \
            proto esp spi 1 enc 'aes' 0x0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b0b
          # ip link set dev lo mtu 1500
      
        2) Monitor the packet flow and set up a UDP sink
          # tcpdump -ni lo -ttt &
          # socat udp6-listen:12345,fork /dev/null &
      
        3) Send a datagram that needs fragmentation with a connected socket
          # perl -e 'print "@" x 1470' | socat - udp6:[::1]:12345
          2016/06/07 18:52:52 socat[724] E read(3, 0x555bb3d5ba00, 8192): Protocol error
          00:00:00.000000 IP6 ::1 > ::1: frag (0|1448) ESP(spi=0x00000001,seq=0x2), length 1448
          00:00:00.000014 IP6 ::1 > ::1: frag (1448|32)
          00:00:00.000050 IP6 ::1 > ::1: ESP(spi=0x00000001,seq=0x3), length 1272
          (^ ICMPv6 Parameter Problem)
          00:00:00.000022 IP6 ::1 > ::1: ESP(spi=0x00000001,seq=0x5), length 136
      
        4) Compare it to a non-connected socket
          # perl -e 'print "@" x 1470' | socat - udp6-sendto:[::1]:12345
          00:00:40.535488 IP6 ::1 > ::1: frag (0|1448) ESP(spi=0x00000001,seq=0x6), length 1448
          00:00:00.000010 IP6 ::1 > ::1: frag (1448|64)
      
      What happens in step (3) is:
      
        1) when connecting the socket in __ip6_datagram_connect(), we
           perform an XFRM lookup, miss the flow cache, create an XFRM
           bundle, and cache the destination,
      
        2) afterwards, when sending the datagram, we perform an XFRM lookup
           again, miss the flow cache (due to mismatch of flowi6_iif and
           flowi6_oif, which is an issue of its own), and recreate an XFRM
           bundle based on the cached (and already transformed) destination.
      
      To prevent the recreation of an XFRM bundle, avoid an XFRM lookup
      altogether whenever we already have a destination entry cached in the
      socket. This prevents the path MTU shrinkage and brings us on par with
      UDPv4.
      
      The fix also benefits connected PINGv6 sockets, another user of
      ip6_sk_dst_lookup_flow(), which also suffer from messages being
      transformed twice.
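
The shrinking path MTU described above can be illustrated with a small user-space model. This is only a sketch under stated assumptions: the `toy_*` names, the fixed overhead of 52 bytes, and the rule that every lookup misses the flow cache are all hypothetical and are not taken from the kernel sources.

```c
#include <stdbool.h>

/* Hypothetical toy types; they only mimic the roles of dst_entry and
 * sk->sk_dst_cache and are not the kernel structures. */
struct toy_dst {
    int pmtu;
};

struct toy_sock {
    struct toy_dst cache;
    bool has_cache;
};

/* Arbitrary per-transformation overhead, chosen for illustration. */
enum { TOY_XFRM_OVERHEAD = 52 };

/* Models bundle creation: the path MTU is derived from the input entry. */
static struct toy_dst toy_xfrm_bundle(struct toy_dst in)
{
    struct toy_dst out = { in.pmtu - TOY_XFRM_OVERHEAD };
    return out;
}

/* Returns the path MTU used for one send. Every lookup is assumed to
 * miss the flow cache, which is the worst case described above. */
int toy_send_pmtu(struct toy_sock *sk, int link_mtu, bool skip_if_cached)
{
    struct toy_dst dst;

    if (sk->has_cache) {
        if (skip_if_cached)
            return sk->cache.pmtu;          /* fixed: reuse the cached entry */
        dst = toy_xfrm_bundle(sk->cache);   /* old: transform it again */
    } else {
        struct toy_dst route = { link_mtu };
        dst = toy_xfrm_bundle(route);       /* first lookup at connect time */
    }
    sk->cache = dst;
    sk->has_cache = true;
    return dst.pmtu;
}
```

Calling `toy_send_pmtu()` repeatedly on one socket with `skip_if_cached` set to false lowers the returned value on every call, while with it set to true the value stays at the one computed by the first lookup.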
      
      Joint work with Hannes Frederic Sowa.
      Reported-by: Jan Tluka <jtluka@redhat.com>
      Signed-off-by: Jakub Sitnicki <jkbs@redhat.com>
      Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      00bc0ef5
    • l2tp: fix configuration passed to setup_udp_tunnel_sock() · a5c5e2da
      Guillaume Nault authored
      Unused fields of udp_cfg must be all zeros. Otherwise
      setup_udp_tunnel_sock() fills ->gro_receive and ->gro_complete
      callbacks with garbage, eventually resulting in panic when used by
      udp_gro_receive().
      
      [   72.694123] BUG: unable to handle kernel paging request at ffff880033f87d78
      [   72.695518] IP: [<ffff880033f87d78>] 0xffff880033f87d78
      [   72.696530] PGD 26e2067 PUD 26e3067 PMD 342ed063 PTE 8000000033f87163
      [   72.696530] Oops: 0011 [#1] SMP KASAN
      [   72.696530] Modules linked in: l2tp_ppp l2tp_netlink l2tp_core ip6_udp_tunnel udp_tunnel pptp gre pppox ppp_generic slhc crc32c_intel ghash_clmulni_intel jitterentropy_rng sha256_generic hmac drbg ansi_cprng aesni_intel evdev aes_x86_64 ablk_helper cryptd lrw gf128mul glue_helper serio_raw acpi_cpufreq button proc\
      essor ext4 crc16 jbd2 mbcache virtio_blk virtio_net virtio_pci virtio_ring virtio
      [   72.696530] CPU: 3 PID: 0 Comm: swapper/3 Not tainted 4.7.0-rc1 #1
      [   72.696530] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Debian-1.8.2-1 04/01/2014
      [   72.696530] task: ffff880035b59700 ti: ffff880035b70000 task.ti: ffff880035b70000
      [   72.696530] RIP: 0010:[<ffff880033f87d78>]  [<ffff880033f87d78>] 0xffff880033f87d78
      [   72.696530] RSP: 0018:ffff880035f87bc0  EFLAGS: 00010246
      [   72.696530] RAX: ffffed000698f996 RBX: ffff88003326b840 RCX: ffffffff814cc823
      [   72.696530] RDX: ffff88003326b840 RSI: ffff880033e48038 RDI: ffff880034c7c780
      [   72.696530] RBP: ffff880035f87c18 R08: 000000000000a506 R09: 0000000000000000
      [   72.696530] R10: ffff880035f87b38 R11: ffff880034b9344d R12: 00000000ebfea715
      [   72.696530] R13: 0000000000000000 R14: ffff880034c7c780 R15: 0000000000000000
      [   72.696530] FS:  0000000000000000(0000) GS:ffff880035f80000(0000) knlGS:0000000000000000
      [   72.696530] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [   72.696530] CR2: ffff880033f87d78 CR3: 0000000033c98000 CR4: 00000000000406a0
      [   72.696530] Stack:
      [   72.696530]  ffffffff814cc834 ffff880034b93468 0000001481416818 ffff88003326b874
      [   72.696530]  ffff880034c7ccb0 ffff880033e48038 ffff88003326b840 ffff880034b93462
      [   72.696530]  ffff88003326b88a ffff88003326b88c ffff880034b93468 ffff880035f87c70
      [   72.696530] Call Trace:
      [   72.696530]  <IRQ>
      [   72.696530]  [<ffffffff814cc834>] ? udp_gro_receive+0x1c6/0x1f9
      [   72.696530]  [<ffffffff814ccb1c>] udp4_gro_receive+0x2b5/0x310
      [   72.696530]  [<ffffffff814d989b>] inet_gro_receive+0x4a3/0x4cd
      [   72.696530]  [<ffffffff81431b32>] dev_gro_receive+0x584/0x7a3
      [   72.696530]  [<ffffffff810adf7a>] ? __lock_is_held+0x29/0x64
      [   72.696530]  [<ffffffff814321f7>] napi_gro_receive+0x124/0x21d
      [   72.696530]  [<ffffffffa000b145>] virtnet_receive+0x8df/0x8f6 [virtio_net]
      [   72.696530]  [<ffffffffa000b27e>] virtnet_poll+0x1d/0x8d [virtio_net]
      [   72.696530]  [<ffffffff81431350>] net_rx_action+0x15b/0x3b9
      [   72.696530]  [<ffffffff815893d6>] __do_softirq+0x216/0x546
      [   72.696530]  [<ffffffff81062392>] irq_exit+0x49/0xb6
      [   72.696530]  [<ffffffff81588e9a>] do_IRQ+0xe2/0xfa
      [   72.696530]  [<ffffffff81587a49>] common_interrupt+0x89/0x89
      [   72.696530]  <EOI>
      [   72.696530]  [<ffffffff810b05df>] ? trace_hardirqs_on_caller+0x229/0x270
      [   72.696530]  [<ffffffff8102b3c7>] ? default_idle+0x1c/0x2d
      [   72.696530]  [<ffffffff8102b3c5>] ? default_idle+0x1a/0x2d
      [   72.696530]  [<ffffffff8102bb8c>] arch_cpu_idle+0xa/0xc
      [   72.696530]  [<ffffffff810a6c39>] default_idle_call+0x1a/0x1c
      [   72.696530]  [<ffffffff810a6d96>] cpu_startup_entry+0x15b/0x20f
      [   72.696530]  [<ffffffff81039a81>] start_secondary+0x12c/0x133
      [   72.696530] Code: ff ff ff ff ff ff ff ff ff ff 7f ff ff ff ff ff ff ff 7f 00 7e f8 33 00 88 ff ff 6d 61 58 81 ff ff ff ff 5e de 0a 81 ff ff ff ff <00> 5c e2 34 00 88 ff ff 00 00 00 00 00 00 00 00 00 00 00 00 00
      [   72.696530] RIP  [<ffff880033f87d78>] 0xffff880033f87d78
      [   72.696530]  RSP <ffff880035f87bc0>
      [   72.696530] CR2: ffff880033f87d78
      [   72.696530] ---[ end trace ad7758b9a1dccf99 ]---
      [   72.696530] Kernel panic - not syncing: Fatal exception in interrupt
      [   72.696530] Kernel Offset: disabled
      [   72.696530] ---[ end Kernel panic - not syncing: Fatal exception in interrupt
      
      v2: use empty initialiser instead of "{ NULL }" to avoid relying on
          first field's type.
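
The requirement that unused configuration fields be zero can be sketched in plain C. The struct and function names below are hypothetical stand-ins, not the kernel's udp_tunnel API, and the sketch uses the portable `{0}` initializer, whereas the commit note refers to an empty initializer (a GNU C extension before C23).

```c
#include <stddef.h>

typedef int (*toy_cb)(void);

/* Hypothetical configuration struct with mandatory and optional callbacks. */
struct toy_tunnel_cfg {
    void  *user_data;
    int    encap_type;
    toy_cb encap_rcv;      /* mandatory in this toy model */
    toy_cb gro_receive;    /* optional: must be NULL when unused */
    toy_cb gro_complete;   /* optional: must be NULL when unused */
};

static int toy_rcv(void)
{
    return 1;
}

/* Zero the whole struct first, then set only the fields that are needed.
 * Without the initializer, the optional callbacks would hold whatever
 * happened to be on the stack. */
struct toy_tunnel_cfg toy_make_cfg(void *user_data)
{
    struct toy_tunnel_cfg cfg = {0};

    cfg.user_data = user_data;
    cfg.encap_type = 1;    /* arbitrary value for illustration */
    cfg.encap_rcv = toy_rcv;
    return cfg;
}

/* A consumer that copies callbacks verbatim can only tell "unset" from
 * "set" by comparing against NULL. */
int toy_has_gro(const struct toy_tunnel_cfg *cfg)
{
    return cfg->gro_receive != NULL || cfg->gro_complete != NULL;
}
```

With the zeroing initializer in place, `toy_has_gro()` reports no GRO callbacks for a configuration that never set them, which is the property the consumer depends on.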
      
      Fixes: 38fd2af2 ("udp: Add socket based GRO and config")
      Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      a5c5e2da
    • Hariprasad Shenai's avatar
    • net-sysfs: fix missing <linux/of_net.h> · 88832a22
      Ben Dooks authored
      The of_find_net_device_by_node() function is declared in
      <linux/of_net.h>, but that header is not included in the .c file
      that implements it. Fix the following warning by including the
      header:
      
      net/core/net-sysfs.c:1494:19: warning: symbol 'of_find_net_device_by_node' was not declared. Should it be static?
      Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      88832a22
    • bridge: Don't insert unnecessary local fdb entry on changing mac address · 0b148def
      Toshiaki Makita authored
      The missing br_vlan_should_use() test caused creation of an unneeded
      local fdb entry on changing mac address of a bridge device when there is
      a vlan which is configured on a bridge port but not on the bridge
      device.
      
      Fixes: 2594e906 ("bridge: vlan: add per-vlan struct and move to rhashtables")
      Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
      Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      0b148def
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf · 32565644
      David S. Miller authored
      Pablo Neira Ayuso says:
      
      ====================
      Netfilter/IPVS fixes for net
      
      The following patchset contains two Netfilter/IPVS fixes for your net
      tree, they are:
      
      1) Fix missing alignment in next offset calculation for standard
         targets, introduced in the previous merge window, patch from
         Florian Westphal.
      
      2) Fix to correct the handling of outgoing connections which use the
         SIP-pe such that the binding of a real-server is updated when needed.
         This was an omission from changes introduced by Marco Angaroni in
         the previous merge window too, to allow handling of outgoing
         connections by the SIP-pe. Patch and report came via Simon Horman.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      32565644
    • tcp: record TLP and ER timer stats in v6 stats · ce3cf4ec
      Yuchung Cheng authored
      The v6 TCP stats scan does not report TLP and ER timer information
      correctly, unlike the v4 version. This patch fixes that.
      
      Fixes: 6ba8a3b1 ("tcp: Tail loss probe (TLP)")
      Fixes: eed530b6 ("tcp: early retransmit")
      Signed-off-by: Yuchung Cheng <ycheng@google.com>
      Signed-off-by: Neal Cardwell <ncardwell@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      ce3cf4ec
  4. 07 Jun, 2016 5 commits
    • net: sched: fix tc_should_offload for specific clsact classes · 92c075db
      Daniel Borkmann authored
      When offloading classifiers such as u32 or flower to hardware while the
      qdisc is clsact (TC_H_CLSACT), we need to differentiate its classes:
      not all of them handle ingress, so those that do not must stay in the
      software path. Add a .tcf_cl_offload() callback so that this can be
      handled generically. Tested on ixgbe.
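
The per-class decision can be sketched as an optional callback consulted after the device-level checks. All `toy_*` and `TOY_*` names are hypothetical, and the assumption that only the ingress class may be offloaded is a simplification made for this sketch, not the kernel implementation.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical class identifiers for a clsact-like qdisc. */
#define TOY_CLASS_INGRESS 1u
#define TOY_CLASS_EGRESS  2u

/* Hypothetical class operations; cl_offload is optional. */
struct toy_class_ops {
    bool (*cl_offload)(uint32_t classid);
};

struct toy_dev {
    bool hw_tc_enabled;
};

/* Assumption for this sketch: only the ingress class can be offloaded. */
bool toy_clsact_cl_offload(uint32_t classid)
{
    return classid == TOY_CLASS_INGRESS;
}

/* Device-level checks first, then the per-class callback if one exists. */
bool toy_should_offload(const struct toy_dev *dev,
                        const struct toy_class_ops *ops,
                        uint32_t classid, bool skip_hw)
{
    if (!dev->hw_tc_enabled)
        return false;
    if (skip_hw)
        return false;
    if (ops != NULL && ops->cl_offload != NULL)
        return ops->cl_offload(classid);
    return true;   /* qdiscs without the callback keep the old behaviour */
}
```

Keeping the callback optional means qdiscs that do not distinguish classes need no change, while a clsact-like qdisc can veto offload for individual classes.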
      
      Fixes: 10cbc684 ("net/sched: cls_flower: Hardware offloaded filters statistics support")
      Fixes: 5b33f488 ("net/flower: Introduce hardware offload support")
      Fixes: a1b7c5fd ("net: sched: add cls_u32 offload hooks for netdevs")
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: John Fastabend <john.r.fastabend@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      92c075db
    • act_police: fix a crash during removal · a03e6fe5
      WANG Cong authored
      The police action uses its own code to initialize the tcf hash
      info, which made us forget to initialize a->hinfo correctly.
      Fix this by calling the helper function tcf_hash_create() directly.
      
      This patch fixes the following crash:
      
       BUG: unable to handle kernel NULL pointer dereference at 0000000000000028
       IP: [<ffffffff810c099f>] __lock_acquire+0xd3/0xf91
       PGD d3c34067 PUD d3e18067 PMD 0
       Oops: 0000 [#1] SMP
       CPU: 2 PID: 853 Comm: tc Not tainted 4.6.0+ #87
       Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
       task: ffff8800d3e28040 ti: ffff8800d3f6c000 task.ti: ffff8800d3f6c000
       RIP: 0010:[<ffffffff810c099f>]  [<ffffffff810c099f>] __lock_acquire+0xd3/0xf91
       RSP: 0000:ffff88011b203c80  EFLAGS: 00010002
       RAX: 0000000000000046 RBX: 0000000000000000 RCX: 0000000000000000
       RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000028
       RBP: ffff88011b203d40 R08: 0000000000000001 R09: 0000000000000000
       R10: ffff88011b203d58 R11: ffff88011b208000 R12: 0000000000000001
       R13: ffff8800d3e28040 R14: 0000000000000028 R15: 0000000000000000
       FS:  0000000000000000(0000) GS:ffff88011b200000(0000) knlGS:0000000000000000
       CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
       CR2: 0000000000000028 CR3: 00000000d4be1000 CR4: 00000000000006e0
       Stack:
        ffff8800d3e289c0 0000000000000046 000000001b203d60 ffffffff00000000
        0000000000000000 ffff880000000000 0000000000000000 ffffffff00000000
        ffffffff8187142c ffff88011b203ce8 ffff88011b203ce8 ffffffff8101dbfc
       Call Trace:
        <IRQ>
        [<ffffffff8187142c>] ? __tcf_hash_release+0x77/0xd1
        [<ffffffff8101dbfc>] ? native_sched_clock+0x1a/0x35
        [<ffffffff8101dbfc>] ? native_sched_clock+0x1a/0x35
        [<ffffffff810a9604>] ? sched_clock_local+0x11/0x78
        [<ffffffff810bf6a1>] ? mark_lock+0x24/0x201
        [<ffffffff810c1dbd>] lock_acquire+0x120/0x1b4
        [<ffffffff810c1dbd>] ? lock_acquire+0x120/0x1b4
        [<ffffffff8187142c>] ? __tcf_hash_release+0x77/0xd1
        [<ffffffff81aad89f>] _raw_spin_lock_bh+0x3c/0x72
        [<ffffffff8187142c>] ? __tcf_hash_release+0x77/0xd1
        [<ffffffff8187142c>] __tcf_hash_release+0x77/0xd1
        [<ffffffff81871a27>] tcf_action_destroy+0x49/0x7c
        [<ffffffff81870b1c>] tcf_exts_destroy+0x20/0x2d
        [<ffffffff8189273b>] u32_destroy_key+0x1b/0x4d
        [<ffffffff81892788>] u32_delete_key_freepf_rcu+0x1b/0x1d
        [<ffffffff810de3b8>] rcu_process_callbacks+0x610/0x82e
        [<ffffffff8189276d>] ? u32_destroy_key+0x4d/0x4d
        [<ffffffff81ab0bc1>] __do_softirq+0x191/0x3f4
      
      Fixes: ddf97ccd ("net_sched: add network namespace support for tc actions")
      Cc: Jamal Hadi Salim <jhs@mojatatu.com>
      Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      a03e6fe5
    • fq_codel: return non zero qlen in class dumps · aafddbf0
      Eric Dumazet authored
      We properly scan the flow list to count number of packets,
      but John passed 0 to gnet_stats_copy_queue() so we report
      a zero value to user space instead of the result.
      
      Fixes: 64015853 ("net: sched: restrict use of qstats qlen")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: John Fastabend <john.r.fastabend@intel.com>
      Acked-by: John Fastabend <john.r.fastabend@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      aafddbf0
    • Merge branch 'u32-hwoffload-fixes' · 064d5e6f
      David S. Miller authored
      Jakub Kicinski says:
      
      ====================
      cls_u32 hardware offload fixes
      
      This set fixes two small issues with error codes I noticed
      in cls_u32. The second patch could be viewed as a user space API
      change, but that portion of the API is not part of any release
      yet.
      
      Compile tested only.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      064d5e6f
    • net: cls_u32: be more strict about skip-sw flag · d47a0f38
      Jakub Kicinski authored
      Return an error if the user requested skip-sw and the underlying
      hardware cannot handle tc offloads (or offloads are disabled).
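
The stricter behaviour amounts to a small decision table, sketched here with a hypothetical `toy_install_filter()` helper. The choice of `-EINVAL` as the error code is an assumption made for this sketch; the message above does not say which code is returned.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical helper: returns 0 when the filter can be installed
 * somewhere, or a negative errno when the request cannot be honoured. */
int toy_install_filter(bool skip_sw, bool hw_can_offload)
{
    if (hw_can_offload)
        return 0;          /* the hardware takes the filter */
    if (skip_sw)
        return -EINVAL;    /* no hardware and software forbidden: reject */
    return 0;              /* fall back to the software path */
}
```

Only the combination of skip-sw with unavailable offload is rejected; without that check, such a filter would be accepted but never installed anywhere.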
      Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d47a0f38