1. 16 Oct, 2018 21 commits
    • nfp: flower: use offsets provided by pedit instead of index for ipv6 · 140b6aba
      Pieter Jansen van Vuuren authored
      Previously when populating the set ipv6 address action, we incorrectly
      made use of pedit's key index to determine which 32bit word should be
      set. We now calculate which word has been selected based on the offset
      provided by the pedit action.
      
      Fixes: 354b82bb ("nfp: add set ipv6 source and destination address")
      Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
      Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
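The offset-based word selection described in commit 140b6aba can be sketched in userspace C. This is a minimal illustration, not the driver's code: `struct set_ipv6_addr` and `set_ipv6_word()` are hypothetical names, and the sketch assumes a mask whose set bits mark the bits to write, which may differ from how pedit encodes its mask. The only facts relied on are that an IPv6 address is four 32-bit words and that the source and destination addresses start at byte offsets 8 and 24 of the IPv6 header.

```c
#include <stddef.h>
#include <stdint.h>

#define IPV6_ADDR_WORDS 4

/* Hypothetical stand-in for a "set IPv6 address" action. */
struct set_ipv6_addr {
    uint32_t exact[IPV6_ADDR_WORDS]; /* values to write */
    uint32_t mask[IPV6_ADDR_WORDS];  /* set bits = bits to write (assumption) */
};

/*
 * Apply one 32-bit key. 'off' is the key's byte offset from the start of
 * the IPv6 header; 'addr_off' is the byte offset of the address field
 * (8 for the source address, 24 for the destination address).
 * The word is derived from the offset, not from the key's position in
 * the action. Returns 0 on success, -1 if the offset is not a word of
 * the address.
 */
int set_ipv6_word(struct set_ipv6_addr *act, uint32_t off, uint32_t addr_off,
                  uint32_t value, uint32_t mask)
{
    if (off < addr_off || (off - addr_off) % sizeof(uint32_t) != 0)
        return -1;

    uint32_t word = (off - addr_off) / sizeof(uint32_t);
    if (word >= IPV6_ADDR_WORDS)
        return -1;

    act->exact[word] = (act->exact[word] & ~mask) | (value & mask);
    act->mask[word] |= mask;
    return 0;
}
```

With this shape, a key at offset 16 with `addr_off` 8 updates word 2 of the source address regardless of whether it is the first or the third key in the action.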
    • nfp: flower: fix multiple keys per pedit action · d08c9e58
      Pieter Jansen van Vuuren authored
      Previously we only allowed a single header key per pedit action to
      change the header. As a result, the last header key in the pedit
      action overwrote previous headers. We now keep track of them
      and allow multiple header keys per pedit action.
      
      Fixes: c0b1bd9a ("nfp: add set ipv4 header action flower offload")
      Fixes: 354b82bb ("nfp: add set ipv6 source and destination address")
      Fixes: f8b7b0a6 ("nfp: add set tcp and udp header action flower offload")
      Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
      Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • nfp: flower: fix pedit set actions for multiple partial masks · 8913806f
      Pieter Jansen van Vuuren authored
      Previously we did not correctly change headers when using multiple
      pedit actions with partial masks. We now take this into account and
      no longer just commit the last pedit action.
      
      Fixes: c0b1bd9a ("nfp: add set ipv4 header action flower offload")
      Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
      Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
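The idea behind commit 8913806f, accumulating several partially masked writes instead of committing only the last one, can be illustrated as follows. This is a sketch under assumptions: `struct set_word`, `merge_partial()` and `apply_set()` are hypothetical names, and set mask bits are taken to mean "write this bit".

```c
#include <stdint.h>

/* Hypothetical accumulator for one 32-bit header word. */
struct set_word {
    uint32_t value; /* bits to write */
    uint32_t mask;  /* set bits = positions that will be written */
};

/* Fold one partial write into the accumulator instead of replacing it. */
void merge_partial(struct set_word *acc, uint32_t value, uint32_t mask)
{
    acc->value = (acc->value & ~mask) | (value & mask);
    acc->mask |= mask;
}

/* Apply the accumulated write to a header word. */
uint32_t apply_set(const struct set_word *acc, uint32_t header_word)
{
    return (header_word & ~acc->mask) | (acc->value & acc->mask);
}
```

Two writes that touch the top byte and the bottom byte of the same word both survive the merge; overwriting the accumulator on each write would keep only the second.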
    • rxrpc: Fix a missing rxrpc_put_peer() in the error_report handler · 1890fea7
      David Howells authored
      Fix a missing call to rxrpc_put_peer() on the main path through the
      rxrpc_error_report() function.  This manifests itself as a ref leak
      whenever an ICMP packet or other error comes in.
      
      In commit f3344303, the hand-off of the ref to a work item was removed
      and was not replaced with a put.
      
      Fixes: f3344303 ("rxrpc: Fix error distribution")
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sctp: use the pmtu from the icmp packet to update transport pathmtu · d805397c
      Xin Long authored
      Other than asoc pmtu sync from all transports, sctp_assoc_sync_pmtu
      is also processing transport pmtu_pending by icmp packets. But it's
      meaningless to use sctp_dst_mtu(t->dst) as new pmtu for a transport.
      
      The right pmtu value should come from the icmp packet, and it would
      be saved into transport->mtu_info in this patch and used later when
      the pmtu sync happens in sctp_sendmsg_to_asoc or sctp_packet_config.
      
      Besides, without this patch, as pmtu can only be updated correctly
      when receiving an icmp packet while no place is holding the sock lock,
      it will take a long time if the sock is busy with sending packets.
      
      Note that it doesn't process transport->mtu_info in .release_cb(),
      as there is not enough information for the pmtu update, such as for
      which asoc or transport. It is not worth traversing all asocs to check
      pmtu_pending. So unlike tcp, sctp does this in the tx path, for which
      mtu_info needs to be atomic_t.
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
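The hand-off described in commit d805397c, where the ICMP handler records the reported pmtu and the tx path consumes it later, might look like the following userspace sketch. It uses C11 `<stdatomic.h>` as a stand-in for the kernel's `atomic_t`; `struct transport`, `transport_init()`, `icmp_record_pmtu()` and `tx_sync_pmtu()` are hypothetical names, and the sketch does not reproduce the actual sctp code paths.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical transport with a pending-pmtu slot. */
struct transport {
    atomic_uint mtu_info; /* pmtu reported by ICMP, 0 = nothing pending */
    uint32_t pathmtu;     /* value used by the tx path */
};

void transport_init(struct transport *t, uint32_t pathmtu)
{
    atomic_init(&t->mtu_info, 0);
    t->pathmtu = pathmtu;
}

/* ICMP side: may run while the socket is busy, so it only records. */
void icmp_record_pmtu(struct transport *t, uint32_t pmtu)
{
    atomic_store(&t->mtu_info, pmtu);
}

/*
 * TX side: atomically take the pending value and clear the slot.
 * Returns 1 if pathmtu was updated, 0 if nothing was pending.
 */
int tx_sync_pmtu(struct transport *t)
{
    uint32_t pmtu = atomic_exchange(&t->mtu_info, 0);

    if (pmtu == 0)
        return 0;
    t->pathmtu = pmtu;
    return 1;
}
```

Because the slot is read and cleared in one atomic step, a value written by the ICMP side between two tx calls is consumed exactly once; if several reports arrive, the most recent one wins.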
    • net: fec: don't dump RX FIFO register when not available · ec20a63a
      Fugang Duan authored
      Commit db65f35f ("net: fec: add support of ethtool get_regs") introduced
      the ethtool "--register-dump" interface to dump all FEC registers.
      
      But not all silicon implementations of the Freescale FEC hardware module
      have the FRBR (FIFO Receive Bound Register) and FRSR (FIFO Receive Start
      Register) registers, so we should not be trying to dump them on those that
      don't.
      
      To fix it we create a quirk flag, FEC_QUIRK_HAS_RFREG, and check it before
      dumping those RX FIFO registers.
      Signed-off-by: Fugang Duan <fugang.duan@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
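The quirk check from commit ec20a63a can be sketched in plain C. Only the flag name FEC_QUIRK_HAS_RFREG comes from the commit message; its bit value, `struct reg_desc`, `dump_regs()` and the array that stands in for the register window are assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define FEC_QUIRK_HAS_RFREG (1u << 0) /* bit value is an assumption */

/* Hypothetical register descriptor. */
struct reg_desc {
    uint32_t offset; /* byte offset into the register window */
    int is_rx_fifo;  /* nonzero for the RX FIFO registers */
};

/*
 * Copy registers from 'mmio' (an array standing in for the device) into
 * 'out'. RX FIFO registers are skipped, and their slots zeroed, unless the
 * quirk flag says the silicon has them. Returns the number of registers read.
 */
size_t dump_regs(uint32_t quirks, const struct reg_desc *regs, size_t n,
                 const uint32_t *mmio, uint32_t *out)
{
    size_t dumped = 0;

    for (size_t i = 0; i < n; i++) {
        size_t idx = regs[i].offset / sizeof(uint32_t);

        if (regs[i].is_rx_fifo && !(quirks & FEC_QUIRK_HAS_RFREG)) {
            out[idx] = 0; /* never touch a register that is not there */
            continue;
        }
        out[idx] = mmio[idx];
        dumped++;
    }
    return dumped;
}
```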
    • qed: fix spelling mistake "Ireelevant" -> "Irrelevant" · fbe1222c
      Colin Ian King authored
      Trivial fix to spelling mistake in DP_INFO message
      Signed-off-by: Colin Ian King <colin.king@canonical.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv6: mcast: fix a use-after-free in inet6_mc_check · dc012f36
      Eric Dumazet authored
      syzbot found a use-after-free in inet6_mc_check [1]
      
      The problem here is that inet6_mc_check() uses rcu
      and read_lock(&iml->sflock)
      
      So the fact that ip6_mc_leave_src() is called under RTNL
      and the socket lock does not help us, we need to acquire
      iml->sflock in write mode.
      
      In the future, we should convert all this stuff to RCU.
      
      [1]
      BUG: KASAN: use-after-free in ipv6_addr_equal include/net/ipv6.h:521 [inline]
      BUG: KASAN: use-after-free in inet6_mc_check+0xae7/0xb40 net/ipv6/mcast.c:649
      Read of size 8 at addr ffff8801ce7f2510 by task syz-executor0/22432
      
      CPU: 1 PID: 22432 Comm: syz-executor0 Not tainted 4.19.0-rc7+ #280
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0x1c4/0x2b4 lib/dump_stack.c:113
       print_address_description.cold.8+0x9/0x1ff mm/kasan/report.c:256
       kasan_report_error mm/kasan/report.c:354 [inline]
       kasan_report.cold.9+0x242/0x309 mm/kasan/report.c:412
       __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:433
       ipv6_addr_equal include/net/ipv6.h:521 [inline]
       inet6_mc_check+0xae7/0xb40 net/ipv6/mcast.c:649
       __raw_v6_lookup+0x320/0x3f0 net/ipv6/raw.c:98
       ipv6_raw_deliver net/ipv6/raw.c:183 [inline]
       raw6_local_deliver+0x3d3/0xcb0 net/ipv6/raw.c:240
       ip6_input_finish+0x467/0x1aa0 net/ipv6/ip6_input.c:345
       NF_HOOK include/linux/netfilter.h:289 [inline]
       ip6_input+0xe9/0x600 net/ipv6/ip6_input.c:426
       ip6_mc_input+0x48a/0xd20 net/ipv6/ip6_input.c:503
       dst_input include/net/dst.h:450 [inline]
       ip6_rcv_finish+0x17a/0x330 net/ipv6/ip6_input.c:76
       NF_HOOK include/linux/netfilter.h:289 [inline]
       ipv6_rcv+0x120/0x640 net/ipv6/ip6_input.c:271
       __netif_receive_skb_one_core+0x14d/0x200 net/core/dev.c:4913
       __netif_receive_skb+0x2c/0x1e0 net/core/dev.c:5023
       netif_receive_skb_internal+0x12c/0x620 net/core/dev.c:5126
       napi_frags_finish net/core/dev.c:5664 [inline]
       napi_gro_frags+0x75a/0xc90 net/core/dev.c:5737
       tun_get_user+0x3189/0x4250 drivers/net/tun.c:1923
       tun_chr_write_iter+0xb9/0x154 drivers/net/tun.c:1968
       call_write_iter include/linux/fs.h:1808 [inline]
       do_iter_readv_writev+0x8b0/0xa80 fs/read_write.c:680
       do_iter_write+0x185/0x5f0 fs/read_write.c:959
       vfs_writev+0x1f1/0x360 fs/read_write.c:1004
       do_writev+0x11a/0x310 fs/read_write.c:1039
       __do_sys_writev fs/read_write.c:1112 [inline]
       __se_sys_writev fs/read_write.c:1109 [inline]
       __x64_sys_writev+0x75/0xb0 fs/read_write.c:1109
       do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      RIP: 0033:0x457421
      Code: 75 14 b8 14 00 00 00 0f 05 48 3d 01 f0 ff ff 0f 83 34 b5 fb ff c3 48 83 ec 08 e8 1a 2d 00 00 48 89 04 24 b8 14 00 00 00 0f 05 <48> 8b 3c 24 48 89 c2 e8 63 2d 00 00 48 89 d0 48 83 c4 08 48 3d 01
      RSP: 002b:00007f2d30ecaba0 EFLAGS: 00000293 ORIG_RAX: 0000000000000014
      RAX: ffffffffffffffda RBX: 000000000000003e RCX: 0000000000457421
      RDX: 0000000000000001 RSI: 00007f2d30ecabf0 RDI: 00000000000000f0
      RBP: 0000000020000500 R08: 00000000000000f0 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000293 R12: 00007f2d30ecb6d4
      R13: 00000000004c4890 R14: 00000000004d7b90 R15: 00000000ffffffff
      
      Allocated by task 22437:
       save_stack+0x43/0xd0 mm/kasan/kasan.c:448
       set_track mm/kasan/kasan.c:460 [inline]
       kasan_kmalloc+0xc7/0xe0 mm/kasan/kasan.c:553
       __do_kmalloc mm/slab.c:3718 [inline]
       __kmalloc+0x14e/0x760 mm/slab.c:3727
       kmalloc include/linux/slab.h:518 [inline]
       sock_kmalloc+0x15a/0x1f0 net/core/sock.c:1983
       ip6_mc_source+0x14dd/0x1960 net/ipv6/mcast.c:427
       do_ipv6_setsockopt.isra.9+0x3afb/0x45d0 net/ipv6/ipv6_sockglue.c:743
       ipv6_setsockopt+0xbd/0x170 net/ipv6/ipv6_sockglue.c:933
       rawv6_setsockopt+0x59/0x140 net/ipv6/raw.c:1069
       sock_common_setsockopt+0x9a/0xe0 net/core/sock.c:3038
       __sys_setsockopt+0x1ba/0x3c0 net/socket.c:1902
       __do_sys_setsockopt net/socket.c:1913 [inline]
       __se_sys_setsockopt net/socket.c:1910 [inline]
       __x64_sys_setsockopt+0xbe/0x150 net/socket.c:1910
       do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      Freed by task 22430:
       save_stack+0x43/0xd0 mm/kasan/kasan.c:448
       set_track mm/kasan/kasan.c:460 [inline]
       __kasan_slab_free+0x102/0x150 mm/kasan/kasan.c:521
       kasan_slab_free+0xe/0x10 mm/kasan/kasan.c:528
       __cache_free mm/slab.c:3498 [inline]
       kfree+0xcf/0x230 mm/slab.c:3813
       __sock_kfree_s net/core/sock.c:2004 [inline]
       sock_kfree_s+0x29/0x60 net/core/sock.c:2010
       ip6_mc_leave_src+0x11a/0x1d0 net/ipv6/mcast.c:2448
       __ipv6_sock_mc_close+0x20b/0x4e0 net/ipv6/mcast.c:310
       ipv6_sock_mc_close+0x158/0x1d0 net/ipv6/mcast.c:328
       inet6_release+0x40/0x70 net/ipv6/af_inet6.c:452
       __sock_release+0xd7/0x250 net/socket.c:579
       sock_close+0x19/0x20 net/socket.c:1141
       __fput+0x385/0xa30 fs/file_table.c:278
       ____fput+0x15/0x20 fs/file_table.c:309
       task_work_run+0x1e8/0x2a0 kernel/task_work.c:113
       tracehook_notify_resume include/linux/tracehook.h:193 [inline]
       exit_to_usermode_loop+0x318/0x380 arch/x86/entry/common.c:166
       prepare_exit_to_usermode arch/x86/entry/common.c:197 [inline]
       syscall_return_slowpath arch/x86/entry/common.c:268 [inline]
       do_syscall_64+0x6be/0x820 arch/x86/entry/common.c:293
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      The buggy address belongs to the object at ffff8801ce7f2500
       which belongs to the cache kmalloc-192 of size 192
      The buggy address is located 16 bytes inside of
       192-byte region [ffff8801ce7f2500, ffff8801ce7f25c0)
      The buggy address belongs to the page:
      page:ffffea000739fc80 count:1 mapcount:0 mapping:ffff8801da800040 index:0x0
      flags: 0x2fffc0000000100(slab)
      raw: 02fffc0000000100 ffffea0006f6e548 ffffea000737b948 ffff8801da800040
      raw: 0000000000000000 ffff8801ce7f2000 0000000100000010 0000000000000000
      page dumped because: kasan: bad access detected
      
      Memory state around the buggy address:
       ffff8801ce7f2400: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
       ffff8801ce7f2480: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
      >ffff8801ce7f2500: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                               ^
       ffff8801ce7f2580: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
       ffff8801ce7f2600: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tipc: fix unsafe rcu locking when accessing publication list · d3092b2e
      Tung Nguyen authored
      The binding table's 'cluster_scope' list is rcu protected to handle
      races between threads changing the list and those traversing the list at
      the same moment. We have now found that the function named_distribute()
      uses the regular list_for_each() macro to traverse the said list.
      Likewise, the function tipc_named_withdraw() is removing items from the
      same list using the regular list_del() call. When these two functions
      execute in parallel we see occasional crashes.
      
      This commit fixes this by adding the missing _rcu() suffixes.
      Signed-off-by: Tung Nguyen <tung.q.nguyen@dektech.com.au>
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • rxrpc: Fix incorrect conditional on IPV6 · 7ec8dc96
      David Howells authored
      The udpv6_encap_enable() function is part of the ipv6 code, and if that is
      configured as a loadable module and rxrpc is built in then a build failure
      will occur because the conditional check is wrong:
      
        net/rxrpc/local_object.o: In function `rxrpc_lookup_local':
        local_object.c:(.text+0x2688): undefined reference to `udpv6_encap_enable'
      
      Use the correct config symbol (CONFIG_AF_RXRPC_IPV6) in the conditional
      check rather than CONFIG_IPV6 as that will do the right thing.
      
      Fixes: 5271953c ("rxrpc: Use the UDP encap_rcv hook")
      Reported-by: kbuild-all@01.org
      Reported-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: David Howells <dhowells@redhat.com>
      Reviewed-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv6: rate-limit probes for neighbourless routes · f547fac6
      Sabrina Dubroca authored
      When commit 27097255 ("[IPV6]: ROUTE: Add Router Reachability
      Probing (RFC4191).") introduced router probing, the rt6_probe() function
      required that a neighbour entry existed. This neighbour entry is used to
      record the timestamp of the last probe via the ->updated field.
      
      Later, commit 2152caea ("ipv6: Do not depend on rt->n in rt6_probe().")
      removed the requirement for a neighbour entry. Neighbourless routes skip
      the interval check and are not rate-limited.
      
      This patch adds rate-limiting for neighbourless routes, by recording the
      timestamp of the last probe in the fib6_info itself.
      
      Fixes: 2152caea ("ipv6: Do not depend on rt->n in rt6_probe().")
      Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
      Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
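The rate limit from commit f547fac6 amounts to keeping a "last probe" timestamp on the route itself when there is no neighbour entry to hold one. A minimal sketch, assuming a monotonically increasing tick counter in place of jiffies; `struct route`, its fields and `should_probe()` are hypothetical names, not the kernel's fib6_info layout.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical route record. */
struct route {
    bool has_neigh;         /* is there a neighbour entry? */
    uint64_t neigh_updated; /* timestamp kept in the neighbour entry */
    uint64_t last_probe;    /* timestamp kept in the route itself */
};

/*
 * Decide whether a probe may be sent at time 'now'. Neighbourless routes
 * use the timestamp stored in the route, so they are rate-limited too.
 */
bool should_probe(struct route *rt, uint64_t now, uint64_t interval)
{
    uint64_t *stamp = rt->has_neigh ? &rt->neigh_updated : &rt->last_probe;

    if (now - *stamp < interval)
        return false; /* probed too recently */

    *stamp = now;
    return true;
}
```

Without the `last_probe` field, the neighbourless branch has nowhere to record the previous probe, which is why such routes were never limited.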
    • net: bcmgenet: Poll internal PHY for GENETv5 · 64bd9c81
      Florian Fainelli authored
      On GENETv5, there is a hardware issue which prevents the GENET hardware
      from generating a link UP interrupt when the link is operating at
      10Mbits/sec. Since we do not have any way to configure the link
      detection logic, fall back to polling in that case.
      
      Fixes: 42138085 ("net: bcmgenet: add support for the GENETv5 hardware")
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • rxrpc: use correct kvec num when sending BUSY response packet · d6672a5a
      YueHaibing authored
      Fixes gcc '-Wunused-but-set-variable' warning:
      
      net/rxrpc/output.c: In function 'rxrpc_reject_packets':
      net/rxrpc/output.c:527:11: warning:
       variable 'ioc' set but not used [-Wunused-but-set-variable]
      
      'ioc' is the correct kvec num when sending a BUSY (or an ABORT) response
      packet.
      
      Fixes: ece64fec ("rxrpc: Emit BUSY packets when supposed to rather than ABORTs")
      Signed-off-by: YueHaibing <yuehaibing@huawei.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • rxrpc: Fix an uninitialised variable · d7b4c24f
      David Howells authored
      Fix an uninitialised variable introduced by the last patch.  This can cause
      a crash when a new call comes in to a local service, such as when an AFS
      fileserver calls back to the local cache manager.
      
      Fixes: c1e15b49 ("rxrpc: Fix the packet reception routine")
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tipc: initialize broadcast link stale counter correctly · 4af00f4c
      Jon Maloy authored
      In the commit referred to below we added link tolerance as an additional
      criterion for declaring broadcast transmission "stale" and resetting the
      unicast links to the affected node.
      
      Unfortunately, this 'improvement' introduced two bugs, each of which
      alone causes only limited problems, but which combined lead to seemingly
      stochastic unicast link resets, depending on the amount of broadcast
      traffic transmitted.
      
      The first issue, a missing initialization of the 'tolerance' field of
      the receiver broadcast link, was recently fixed by commit 047491ea
      ("tipc: set link tolerance correctly in broadcast link").
      
      The second issue, where we omit to reset the 'stale_cnt' field of
      the same link after a 'stale' period is over, leads to this counter
      accumulating over time, and in the absence of the 'tolerance' criterion
      leads to the above described symptoms. This commit adds the missing
      initialization.
      
      Fixes: a4dc70d4 ("tipc: extend link reset criteria for stale packet retransmission")
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Acked-by: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
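The interaction described in commit 4af00f4c, a retransmission counter combined with a tolerance deadline, can be illustrated with a small state machine. This is a sketch under assumptions rather than the tipc code: `struct bc_link`, `on_retransmit()` and the `max_stale` threshold are hypothetical, and time is a plain tick counter. The point it shows is that the counter must be reset whenever a new stale period starts.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical broadcast receive link state. */
struct bc_link {
    uint32_t last_retransm; /* sequence number of the last retransmitted packet */
    uint64_t stale_limit;   /* time after which the link may be declared stale */
    uint64_t tolerance;     /* link tolerance, in ticks */
    unsigned int stale_cnt; /* retransmissions of the same packet so far */
};

/*
 * Called for each retransmission request. Returns true when the link should
 * be reset: the same packet was requested more than 'max_stale' times AND
 * the tolerance period has expired.
 */
bool on_retransmit(struct bc_link *l, uint32_t seqno, uint64_t now,
                   unsigned int max_stale)
{
    if (l->last_retransm != seqno) {
        /* A different packet: start a new stale period from scratch. */
        l->last_retransm = seqno;
        l->stale_limit = now + l->tolerance;
        l->stale_cnt = 0; /* without this reset the count accumulates */
        return false;
    }
    return ++l->stale_cnt > max_stale && now > l->stale_limit;
}
```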
    • llc: set SOCK_RCU_FREE in llc_sap_add_socket() · 5a8e7aea
      Cong Wang authored
      When an llc sock is added into the sk_laddr_hash of an llc_sap,
      it is not marked with SOCK_RCU_FREE.
      
      As a result, the sock could be freed while it is still being
      read by __llc_lookup_established() under the RCU read lock. The sock
      is refcounted, but with only the RCU read lock held, nothing prevents
      the readers from getting a zero refcnt.
      
      Fix it by setting SOCK_RCU_FREE in llc_sap_add_socket().
      
      Reported-by: syzbot+11e05f04c15e03be5254@syzkaller.appspotmail.com
      Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge tag 'mlx5-fixes-2018-10-10' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux · d0f068e5
      David S. Miller authored
      Saeed Mahameed says:
      
      ====================
      Mellanox, mlx5 fixes 2018-10-10
      
      This pull request includes some fixes to mlx5 driver,
      Please pull and let me know if there's any problem.
      
      For -stable v4.11:
      ('net/mlx5: Take only bit 24-26 of wqe.pftype_wq for page fault type')
      For -stable v4.17:
      ('net/mlx5: Fix memory leak when setting fpga ipsec caps')
      For -stable v4.18:
      ('net/mlx5: WQ, fixes for fragmented WQ buffers API')
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/sched: cls_api: add missing validation of netlink attributes · e331473f
      Davide Caratti authored
      Similarly to what has been done in 8b4c3cdd ("net: sched: Add policy
      validation for tc attributes"), fix classifier code to add validation of
      TCA_CHAIN and TCA_KIND netlink attributes.
      
      tested with:
       # ./tdc.py -c filter
      
      v2: Let sch_api and cls_api share nla_policy they have in common, thanks
          to David Ahern.
      v3: Avoid EXPORT_SYMBOL(), as validation of those attributes is not done
          by TC modules, thanks to Cong Wang.
          While at it, restore the 'Delete / get qdisc' comment to its original
          position, just above tc_get_qdisc() function prototype.
      
      Fixes: 5bc17018 ("net: sched: introduce multichain support for filters")
      Signed-off-by: Davide Caratti <dcaratti@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ethtool: fix a privilege escalation bug · 58f5bbe3
      Wenwen Wang authored
      In dev_ethtool(), the eth command 'ethcmd' is first copied from the
      user-space buffer 'useraddr' and checked to see whether it is
      ETHTOOL_PERQUEUE. If yes, the sub-command 'sub_cmd' is further copied from
      the user space. Otherwise, 'sub_cmd' is the same as 'ethcmd'. Next,
      according to 'sub_cmd', a permission check is enforced through the function
      ns_capable(). For example, the permission check is required if 'sub_cmd' is
      ETHTOOL_SCOALESCE, but it is not necessary if 'sub_cmd' is
      ETHTOOL_GCOALESCE, as suggested in the comment "Allow some commands to be
      done by anyone". The following execution invokes different handlers
      according to 'ethcmd'. Specifically, if 'ethcmd' is ETHTOOL_PERQUEUE,
      ethtool_set_per_queue() is called. In ethtool_set_per_queue(), the kernel
      object 'per_queue_opt' is copied again from the user-space buffer
      'useraddr' and 'per_queue_opt.sub_command' is used to determine which
      operation should be performed. Given that the buffer 'useraddr' is in the
      user space, a malicious user can race to change the sub-command between the
      two copies. In particular, the attacker can supply ETHTOOL_PERQUEUE and
      ETHTOOL_GCOALESCE to bypass the permission check in dev_ethtool(). Then
      before ethtool_set_per_queue() is called, the attacker changes
      ETHTOOL_GCOALESCE to ETHTOOL_SCOALESCE. In this way, the attacker can
      bypass the permission check and execute ETHTOOL_SCOALESCE.
      
      This patch enforces a check in ethtool_set_per_queue() after the second
      copy from 'useraddr'. If the sub-command is different from the one obtained
      in the first copy in dev_ethtool(), an error code EINVAL will be returned.
      
      Fixes: f38d138a ("net/ethtool: support set coalesce per queue")
      Signed-off-by: Wenwen Wang <wang6495@umn.edu>
      Reviewed-by: Michal Kubecek <mkubecek@suse.cz>
      Signed-off-by: David S. Miller <davem@davemloft.net>
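The double-fetch problem fixed by commit 58f5bbe3 can be reproduced in miniature. In this userspace sketch a `memcpy()` stands in for the copy from user space, and `struct per_queue_op`, `fetch()`, `set_per_queue()` and the `SUB_GET`/`SUB_SET` values are hypothetical. What it shows is the shape of the fix: the second copy is compared against the sub-command that was already permission-checked, and a mismatch returns -EINVAL.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sub-command values, for illustration only. */
enum { SUB_GET = 1, SUB_SET = 2 };

/* Hypothetical request layout living in the "user" buffer. */
struct per_queue_op {
    uint32_t cmd;
    uint32_t sub_command;
};

/* Stand-in for a copy from user space: each call re-reads the buffer. */
static void fetch(struct per_queue_op *dst, const struct per_queue_op *user)
{
    memcpy(dst, user, sizeof(*dst));
}

/*
 * Second stage of the request. 'checked_sub_cmd' is the sub-command that
 * the caller fetched earlier and ran the permission check on. The buffer
 * is fetched again here, so it must be re-validated before use.
 */
int set_per_queue(const struct per_queue_op *useraddr, uint32_t checked_sub_cmd)
{
    struct per_queue_op op;

    fetch(&op, useraddr);
    if (op.sub_command != checked_sub_cmd)
        return -EINVAL; /* buffer changed between the two copies */

    return 0; /* safe to dispatch on op.sub_command */
}
```

If the buffer is switched from `SUB_GET` to `SUB_SET` after the first fetch, the call is rejected instead of running the privileged operation with the unprivileged check.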
    • ethtool: fix a missing-check bug · 2bb3207d
      Wenwen Wang authored
      In ethtool_get_rxnfc(), the eth command 'cmd' is compared against
      'ETHTOOL_GRXFH' to see whether it is necessary to adjust the variable
      'info_size'. Then the whole structure of 'info' is copied from the
      user-space buffer 'useraddr' with 'info_size' bytes. In the following
      execution, 'info' may be copied again from the buffer 'useraddr' depending
      on the 'cmd' and the 'info.flow_type'. However, after these two copies,
      there is no check between 'cmd' and 'info.cmd'. In fact, 'cmd' is also
      copied from the buffer 'useraddr' in dev_ethtool(), which is the caller
      function of ethtool_get_rxnfc(). Given that 'useraddr' is in the user
      space, a malicious user can race to change the eth command in the buffer
      between these copies. By doing so, the attacker can supply inconsistent
      data and cause undefined behavior because in the following execution 'info'
      will be passed to ops->get_rxnfc().
      
      This patch adds a necessary check on 'info.cmd' and 'cmd' to confirm that
      they are still the same after the two copies in ethtool_get_rxnfc().
      Otherwise, an error code EINVAL will be returned.
      Signed-off-by: Wenwen Wang <wang6495@umn.edu>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • r8169: Enable MSI-X on RTL8106e · d49c88d7
      Jian-Hong Pan authored
      Originally, we had an issue where the r8169 MSI-X interrupt was broken
      after S3 suspend/resume on the RTL8106e of the ASUS X441UAR.
      
      02:00.0 Ethernet controller [0200]: Realtek Semiconductor Co., Ltd.
      RTL8101/2/6E PCI Express Fast/Gigabit Ethernet controller [10ec:8136]
      (rev 07)
      	Subsystem: ASUSTeK Computer Inc. RTL810xE PCI Express Fast
      Ethernet controller [1043:200f]
      	Flags: bus master, fast devsel, latency 0, IRQ 16
      	I/O ports at e000 [size=256]
      	Memory at ef100000 (64-bit, non-prefetchable) [size=4K]
      	Memory at e0000000 (64-bit, prefetchable) [size=16K]
      	Capabilities: [40] Power Management version 3
      	Capabilities: [50] MSI: Enable- Count=1/1 Maskable- 64bit+
      	Capabilities: [70] Express Endpoint, MSI 01
      	Capabilities: [b0] MSI-X: Enable+ Count=4 Masked-
      	Capabilities: [d0] Vital Product Data
      	Capabilities: [100] Advanced Error Reporting
      	Capabilities: [140] Virtual Channel
      	Capabilities: [160] Device Serial Number 01-00-00-00-36-4c-e0-00
      	Capabilities: [170] Latency Tolerance Reporting
      	Kernel driver in use: r8169
      	Kernel modules: r8169
      
      We found that all of the values in PCI BAR=4 of the ethernet adapter
      become 0xFF after the system resumes.  That breaks the MSI-X interrupt.
      Therefore, we could only fall back to the MSI interrupt to fix the issue
      at that time.
      
      However, there is a commit which resolves the drivers getting nothing in
      PCI BAR=4 after the system resumes.  It is 04cb3ae895d7 "PCI: Reprogram
      bridge prefetch registers on resume" by Daniel Drake.
      
      After applying the patch, the ethernet adapter works fine before suspend
      and after resume.  So, we can revert the workaround after the commit
      "PCI: Reprogram bridge prefetch registers on resume" is merged into the
      main tree.
      
      This patch reverts commit 7bb05b85
      "r8169: don't use MSI-X on RTL8106e".
      
      Buglink: https://bugzilla.kernel.org/show_bug.cgi?id=201181
      Fixes: 7bb05b85 ("r8169: don't use MSI-X on RTL8106e")
      Signed-off-by: Jian-Hong Pan <jian-hong@endlessm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 14 Oct, 2018 1 commit
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf · 028c99fa
      David S. Miller authored
      Daniel Borkmann says:
      
      ====================
      pull-request: bpf 2018-10-14
      
      The following pull-request contains BPF updates for your *net* tree.
      
      The main changes are:
      
      1) Fix xsk map update and delete operation to not call synchronize_net()
         but to piggy back on SOCK_RCU_FREE for sockets instead as we are not
         allowed to sleep under RCU, from Björn.
      
      2) Do not change RLIMIT_MEMLOCK in reuseport_bpf selftest if the process
         already has unlimited RLIMIT_MEMLOCK, from Eric.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
  3. 12 Oct, 2018 15 commits
  4. 11 Oct, 2018 3 commits
    • Merge branch 'net-dsa-bcm_sf2-Couple-of-fixes' · 6b9bab55
      David S. Miller authored
      Florian Fainelli says:
      
      ====================
      net: dsa: bcm_sf2: Couple of fixes
      
      Here are two fixes for the bcm_sf2 driver that were found during
      testing unbind and analysing another issue during system
      suspend/resume.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: bcm_sf2: Call setup during switch resume · 54baca09
      Florian Fainelli authored
      There is no reason to open code what the switch setup function does. In
      fact, because we just issued a switch reset, all the registers get
      their default values, which includes, for instance, unused ports being
      enabled again, wasting power and leading to an inappropriate switch core
      clock being selected.
      
      Fixes: 8cfa9498 ("net: dsa: bcm_sf2: add suspend/resume callbacks")
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: bcm_sf2: Fix unbind ordering · bf3b452b
      Florian Fainelli authored
      The order in which we release resources is unfortunately leading to bus
      errors while dismantling the port. This is because we set
      priv->wol_ports_mask to 0 to tell bcm_sf2_sw_suspend() that it is now
      permissible to clock gate the switch. Later on, when dsa_slave_destroy()
      comes in from dsa_unregister_switch() and calls
      dsa_switch_ops::port_disable, we perform the same dismantling again, and
      this time we hit registers that are clock gated.
      
      Make sure that dsa_unregister_switch() is the first thing that happens,
      which takes care of releasing all user visible resources, then proceed
      with clock gating hardware. We still need to set priv->wol_ports_mask to
      0 to make sure that an enabled port properly gets disabled in case it
      was previously used as part of Wake-on-LAN.
      
      Fixes: d9338023 ("net: dsa: bcm_sf2: Make it a real platform device driver")
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>