1. 02 Nov, 2020 3 commits
  2. 01 Nov, 2020 3 commits
    • Jakub Kicinski's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf · 859191b2
      Jakub Kicinski authored
      Pablo Neira Ayuso says:
      
      ====================
      Netfilter fixes for net
      
      The following patchset contains Netfilter fixes for net:
      
      1) Incorrect netlink report logic in flowtable and genID.
      
      2) Add a selftest to check that wireguard passes the right sk
         to ip_route_me_harder, from Jason A. Donenfeld.
      
      3) Pass the actual sk to ip_route_me_harder(), also from Jason.
      
      4) Missing expression validation of updates via nft --check.
      
      5) Update byte and packet counters regardless of whether they
         match, from Stefano Brivio.
      ====================
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      859191b2
    • wenxu's avatar
      ip_tunnel: fix over-mtu packet send fail without TUNNEL_DONT_FRAGMENT flags · 20149e9e
      wenxu authored
      The tunnel device such as vxlan, bareudp and geneve in the lwt mode set
      the outer df only based TUNNEL_DONT_FRAGMENT.
      And this was also the behavior for gre device before switching to use
      ip_md_tunnel_xmit in commit 962924fa ("ip_gre: Refactor collect
      metatdata mode tunnel xmit to ip_md_tunnel_xmit")
      
      When the ip_gre in lwt mode xmit with ip_md_tunnel_xmi changed the rule and
      make the discrepancy between handling of DF by different tunnels. So in the
      ip_md_tunnel_xmit should follow the same rule like other tunnels.
      
      Fixes: cfc7381b ("ip_tunnel: add collect_md mode to IPIP tunnel")
      Signed-off-by: default avatarwenxu <wenxu@ucloud.cn>
      Link: https://lore.kernel.org/r/1604028728-31100-1-git-send-email-wenxu@ucloud.cnSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      20149e9e
    • Mark Deneen's avatar
      cadence: force nonlinear buffers to be cloned · 403dc167
      Mark Deneen authored
      In my test setup, I had a SAMA5D27 device configured with ip forwarding, and
      second device with usb ethernet (r8152) sending ICMP packets.  If the packet
      was larger than about 220 bytes, the SAMA5 device would "oops" with the
      following trace:
      
      kernel BUG at net/core/skbuff.c:1863!
      Internal error: Oops - BUG: 0 [#1] ARM
      Modules linked in: xt_MASQUERADE ppp_async ppp_generic slhc iptable_nat xt_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 can_raw can bridge stp llc ipt_REJECT nf_reject_ipv4 sd_mod cdc_ether usbnet usb_storage r8152 scsi_mod mii o
      ption usb_wwan usbserial micrel macb at91_sama5d2_adc phylink gpio_sama5d2_piobu m_can_platform m_can industrialio_triggered_buffer kfifo_buf of_mdio can_dev fixed_phy sdhci_of_at91 sdhci_pltfm libphy sdhci mmc_core ohci_at91 ehci_atmel o
      hci_hcd iio_rescale industrialio sch_fq_codel spidev prox2_hal(O)
      CPU: 0 PID: 0 Comm: swapper Tainted: G           O      5.9.1-prox2+ #1
      Hardware name: Atmel SAMA5
      PC is at skb_put+0x3c/0x50
      LR is at macb_start_xmit+0x134/0xad0 [macb]
      pc : [<c05258cc>]    lr : [<bf0ea5b8>]    psr: 20070113
      sp : c0d01a60  ip : c07232c0  fp : c4250000
      r10: c0d03cc8  r9 : 00000000  r8 : c0d038c0
      r7 : 00000000  r6 : 00000008  r5 : c59b66c0  r4 : 0000002a
      r3 : 8f659eff  r2 : c59e9eea  r1 : 00000001  r0 : c59b66c0
      Flags: nzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
      Control: 10c53c7d  Table: 2640c059  DAC: 00000051
      Process swapper (pid: 0, stack limit = 0x75002d81)
      
      <snipped stack>
      
      [<c05258cc>] (skb_put) from [<bf0ea5b8>] (macb_start_xmit+0x134/0xad0 [macb])
      [<bf0ea5b8>] (macb_start_xmit [macb]) from [<c053e504>] (dev_hard_start_xmit+0x90/0x11c)
      [<c053e504>] (dev_hard_start_xmit) from [<c0571180>] (sch_direct_xmit+0x124/0x260)
      [<c0571180>] (sch_direct_xmit) from [<c053eae4>] (__dev_queue_xmit+0x4b0/0x6d0)
      [<c053eae4>] (__dev_queue_xmit) from [<c05a5650>] (ip_finish_output2+0x350/0x580)
      [<c05a5650>] (ip_finish_output2) from [<c05a7e24>] (ip_output+0xb4/0x13c)
      [<c05a7e24>] (ip_output) from [<c05a39d0>] (ip_forward+0x474/0x500)
      [<c05a39d0>] (ip_forward) from [<c05a13d8>] (ip_sublist_rcv_finish+0x3c/0x50)
      [<c05a13d8>] (ip_sublist_rcv_finish) from [<c05a19b8>] (ip_sublist_rcv+0x11c/0x188)
      [<c05a19b8>] (ip_sublist_rcv) from [<c05a2494>] (ip_list_rcv+0xf8/0x124)
      [<c05a2494>] (ip_list_rcv) from [<c05403c4>] (__netif_receive_skb_list_core+0x1a0/0x20c)
      [<c05403c4>] (__netif_receive_skb_list_core) from [<c05405c4>] (netif_receive_skb_list_internal+0x194/0x230)
      [<c05405c4>] (netif_receive_skb_list_internal) from [<c0540684>] (gro_normal_list.part.0+0x14/0x28)
      [<c0540684>] (gro_normal_list.part.0) from [<c0541280>] (napi_complete_done+0x16c/0x210)
      [<c0541280>] (napi_complete_done) from [<bf14c1c0>] (r8152_poll+0x684/0x708 [r8152])
      [<bf14c1c0>] (r8152_poll [r8152]) from [<c0541424>] (net_rx_action+0x100/0x328)
      [<c0541424>] (net_rx_action) from [<c01012ec>] (__do_softirq+0xec/0x274)
      [<c01012ec>] (__do_softirq) from [<c012d6d4>] (irq_exit+0xcc/0xd0)
      [<c012d6d4>] (irq_exit) from [<c0160960>] (__handle_domain_irq+0x58/0xa4)
      [<c0160960>] (__handle_domain_irq) from [<c0100b0c>] (__irq_svc+0x6c/0x90)
      Exception stack(0xc0d01ef0 to 0xc0d01f38)
      1ee0:                                     00000000 0000003d 0c31f383 c0d0fa00
      1f00: c0d2eb80 00000000 c0d2e630 4dad8c49 4da967b0 0000003d 0000003d 00000000
      1f20: fffffff5 c0d01f40 c04e0f88 c04e0f8c 30070013 ffffffff
      [<c0100b0c>] (__irq_svc) from [<c04e0f8c>] (cpuidle_enter_state+0x7c/0x378)
      [<c04e0f8c>] (cpuidle_enter_state) from [<c04e12c4>] (cpuidle_enter+0x28/0x38)
      [<c04e12c4>] (cpuidle_enter) from [<c014f710>] (do_idle+0x194/0x214)
      [<c014f710>] (do_idle) from [<c014fa50>] (cpu_startup_entry+0xc/0x14)
      [<c014fa50>] (cpu_startup_entry) from [<c0a00dc8>] (start_kernel+0x46c/0x4a0)
      Code: e580c054 8a000002 e1a00002 e8bd8070 (e7f001f2)
      ---[ end trace 146c8a334115490c ]---
      
      The solution was to force nonlinear buffers to be cloned.  This was previously
      reported by Klaus Doth (https://www.spinics.net/lists/netdev/msg556937.html)
      but never formally submitted as a patch.
      
      This is the third revision, hopefully the formatting is correct this time!
      Suggested-by: default avatarKlaus Doth <krnl@doth.eu>
      Fixes: 653e92a9 ("net: macb: add support for padding and fcs computation")
      Signed-off-by: default avatarMark Deneen <mdeneen@saucontech.com>
      Link: https://lore.kernel.org/r/20201030155814.622831-1-mdeneen@saucontech.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      403dc167
  3. 31 Oct, 2020 5 commits
    • Jakub Kicinski's avatar
      Merge branch 'ipv6-reply-icmp-error-if-fragment-doesn-t-contain-all-headers' · 72a41f95
      Jakub Kicinski authored
      Hangbin Liu says:
      
      ====================
      IPv6: reply ICMP error if fragment doesn't contain all headers
      
      When our Engineer run latest IPv6 Core Conformance test, test v6LC.1.3.6:
      First Fragment Doesn’t Contain All Headers[1] failed. The test purpose is to
      verify that the node (Linux for example) should properly process IPv6 packets
      that don’t include all the headers through the Upper-Layer header.
      
      Based on RFC 8200, Section 4.5 Fragment Header
      
        -  If the first fragment does not include all headers through an
           Upper-Layer header, then that fragment should be discarded and
           an ICMP Parameter Problem, Code 3, message should be sent to
           the source of the fragment, with the Pointer field set to zero.
      
      The first patch add a definition for ICMPv6 Parameter Problem, code 3.
      The second patch add a check for the 1st fragment packet to make sure
      Upper-Layer header exist.
      
      [1] Page 68, v6LC.1.3.6: First Fragment Doesn’t Contain All Headers part A, B,
      C and D at https://ipv6ready.org/docs/Core_Conformance_5_0_0.pdf
      [2] My reproducer:
      
      import sys, os
      from scapy.all import *
      
      def send_frag_dst_opt(src_ip6, dst_ip6):
          ip6 = IPv6(src = src_ip6, dst = dst_ip6, nh = 44)
      
          frag_1 = IPv6ExtHdrFragment(nh = 60, m = 1)
          dst_opt = IPv6ExtHdrDestOpt(nh = 58)
      
          frag_2 = IPv6ExtHdrFragment(nh = 58, offset = 4, m = 1)
          icmp_echo = ICMPv6EchoRequest(seq = 1)
      
          pkt_1 = ip6/frag_1/dst_opt
          pkt_2 = ip6/frag_2/icmp_echo
      
          send(pkt_1)
          send(pkt_2)
      
      def send_frag_route_opt(src_ip6, dst_ip6):
          ip6 = IPv6(src = src_ip6, dst = dst_ip6, nh = 44)
      
          frag_1 = IPv6ExtHdrFragment(nh = 43, m = 1)
          route_opt = IPv6ExtHdrRouting(nh = 58)
      
          frag_2 = IPv6ExtHdrFragment(nh = 58, offset = 4, m = 1)
          icmp_echo = ICMPv6EchoRequest(seq = 2)
      
          pkt_1 = ip6/frag_1/route_opt
          pkt_2 = ip6/frag_2/icmp_echo
      
          send(pkt_1)
          send(pkt_2)
      
      if __name__ == '__main__':
          src = sys.argv[1]
          dst = sys.argv[2]
          conf.iface = sys.argv[3]
          send_frag_dst_opt(src, dst)
          send_frag_route_opt(src, dst)
      ====================
      
      Link: https://lore.kernel.org/r/20201027123313.3717941-1-liuhangbin@gmail.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      72a41f95
    • Hangbin Liu's avatar
      IPv6: reply ICMP error if the first fragment don't include all headers · 2efdaaaf
      Hangbin Liu authored
      Based on RFC 8200, Section 4.5 Fragment Header:
      
        -  If the first fragment does not include all headers through an
           Upper-Layer header, then that fragment should be discarded and
           an ICMP Parameter Problem, Code 3, message should be sent to
           the source of the fragment, with the Pointer field set to zero.
      
      Checking each packet header in IPv6 fast path will have performance impact,
      so I put the checking in ipv6_frag_rcv().
      
      As the packet may be any kind of L4 protocol, I only checked some common
      protocols' header length and handle others by (offset + 1) > skb->len.
      Also use !(frag_off & htons(IP6_OFFSET)) to catch atomic fragments
      (fragmented packet with only one fragment).
      
      When send ICMP error message, if the 1st truncated fragment is ICMP message,
      icmp6_send() will break as is_ineligible() return true. So I added a check
      in is_ineligible() to let fragment packet with nexthdr ICMP but no ICMP header
      return false.
      Signed-off-by: default avatarHangbin Liu <liuhangbin@gmail.com>
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      2efdaaaf
    • Hangbin Liu's avatar
      ICMPv6: Add ICMPv6 Parameter Problem, code 3 definition · b59e286b
      Hangbin Liu authored
      Based on RFC7112, Section 6:
      
         IANA has added the following "Type 4 - Parameter Problem" message to
         the "Internet Control Message Protocol version 6 (ICMPv6) Parameters"
         registry:
      
            CODE     NAME/DESCRIPTION
             3       IPv6 First Fragment has incomplete IPv6 Header Chain
      Signed-off-by: default avatarHangbin Liu <liuhangbin@gmail.com>
      Signed-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      b59e286b
    • Colin Ian King's avatar
      net: atm: fix update of position index in lec_seq_next · 2f71e006
      Colin Ian King authored
      The position index in leq_seq_next is not updated when the next
      entry is fetched an no more entries are available. This causes
      seq_file to report the following error:
      
      "seq_file: buggy .next function lec_seq_next [lec] did not update
       position index"
      
      Fix this by always updating the position index.
      
      [ Note: this is an ancient 2002 bug, the sha is from the
        tglx/history repo ]
      
      Fixes 4aea2cbf ("[ATM]: Move lan seq_file ops to lec.c [1/3]")
      Signed-off-by: default avatarColin Ian King <colin.king@canonical.com>
      Link: https://lore.kernel.org/r/20201027114925.21843-1-colin.king@canonical.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      2f71e006
    • Stefano Brivio's avatar
      netfilter: ipset: Update byte and packet counters regardless of whether they match · 7d10e62c
      Stefano Brivio authored
      In ip_set_match_extensions(), for sets with counters, we take care of
      updating counters themselves by calling ip_set_update_counter(), and of
      checking if the given comparison and values match, by calling
      ip_set_match_counter() if needed.
      
      However, if a given comparison on counters doesn't match the configured
      values, that doesn't mean the set entry itself isn't matching.
      
      This fix restores the behaviour we had before commit 4750005a
      ("netfilter: ipset: Fix "don't update counters" mode when counters used
      at the matching"), without reintroducing the issue fixed there: back
      then, mtype_data_match() first updated counters in any case, and then
      took care of matching on counters.
      
      Now, if the IPSET_FLAG_SKIP_COUNTER_UPDATE flag is set,
      ip_set_update_counter() will anyway skip counter updates if desired.
      
      The issue observed is illustrated by this reproducer:
      
        ipset create c hash:ip counters
        ipset add c 192.0.2.1
        iptables -I INPUT -m set --match-set c src --bytes-gt 800 -j DROP
      
      if we now send packets from 192.0.2.1, bytes and packets counters
      for the entry as shown by 'ipset list' are always zero, and, no
      matter how many bytes we send, the rule will never match, because
      counters themselves are not updated.
      Reported-by: default avatarMithil Mhatre <mmhatre@redhat.com>
      Fixes: 4750005a ("netfilter: ipset: Fix "don't update counters" mode when counters used at the matching")
      Signed-off-by: default avatarStefano Brivio <sbrivio@redhat.com>
      Signed-off-by: default avatarJozsef Kadlecsik <kadlec@netfilter.org>
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      7d10e62c
  4. 30 Oct, 2020 17 commits
  5. 29 Oct, 2020 12 commits
    • Linus Torvalds's avatar
      Merge tag 'fallthrough-fixes-clang-5.10-rc2' of... · 07e08873
      Linus Torvalds authored
      Merge tag 'fallthrough-fixes-clang-5.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux
      
      Pull fallthrough fix from Gustavo A. R. Silva:
       "This fixes a ton of fall-through warnings when building with Clang
        12.0.0 and -Wimplicit-fallthrough"
      
      * tag 'fallthrough-fixes-clang-5.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux:
        include: jhash/signal: Fix fall-through warnings for Clang
      07e08873
    • Linus Torvalds's avatar
      Merge tag 'net-5.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · 934291ff
      Linus Torvalds authored
      Pull networking fixes from Jakub Kicinski:
       "Current release regressions:
      
         - r8169: fix forced threading conflicting with other shared
           interrupts; we tried to fix the use of raise_softirq_irqoff from an
           IRQ handler on RT by forcing hard irqs, but this driver shares
           legacy PCI IRQs so drop the _irqoff() instead
      
         - tipc: fix memory leak caused by a recent syzbot report fix to
           tipc_buf_append()
      
        Current release - bugs in new features:
      
         - devlink: Unlock on error in dumpit() and fix some error codes
      
         - net/smc: fix null pointer dereference in smc_listen_decline()
      
        Previous release - regressions:
      
         - tcp: Prevent low rmem stalls with SO_RCVLOWAT.
      
         - net: protect tcf_block_unbind with block lock
      
         - ibmveth: Fix use of ibmveth in a bridge; the self-imposed filtering
           to only send legal frames to the hypervisor was too strict
      
         - net: hns3: Clear the CMDQ registers before unmapping BAR region;
           incorrect cleanup order was leading to a crash
      
         - bnxt_en - handful of fixes to fixes:
            - Send HWRM_FUNC_RESET fw command unconditionally, even if there
              are PCIe errors being reported
            - Check abort error state in bnxt_open_nic().
            - Invoke cancel_delayed_work_sync() for PFs also.
            - Fix regression in workqueue cleanup logic in bnxt_remove_one().
      
         - mlxsw: Only advertise link modes supported by both driver and
           device, after removal of 56G support from the driver 56G was not
           cleared from advertised modes
      
         - net/smc: fix suppressed return code
      
        Previous release - always broken:
      
         - netem: fix zero division in tabledist, caused by integer overflow
      
         - bnxt_en: Re-write PCI BARs after PCI fatal error.
      
         - cxgb4: set up filter action after rewrites
      
         - net: ipa: command payloads already mapped
      
        Misc:
      
         - s390/ism: fix incorrect system EID, it's okay to change since it
           was added in current release
      
         - vsock: use ns_capable_noaudit() on socket create to suppress false
           positive audit messages"
      
      * tag 'net-5.10-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (36 commits)
        r8169: fix issue with forced threading in combination with shared interrupts
        netem: fix zero division in tabledist
        ibmvnic: fix ibmvnic_set_mac
        mptcp: add missing memory scheduling in the rx path
        tipc: fix memory leak caused by tipc_buf_append()
        gtp: fix an use-before-init in gtp_newlink()
        net: protect tcf_block_unbind with block lock
        ibmveth: Fix use of ibmveth in a bridge.
        net/sched: act_mpls: Add softdep on mpls_gso.ko
        ravb: Fix bit fields checking in ravb_hwtstamp_get()
        devlink: Unlock on error in dumpit()
        devlink: Fix some error codes
        chelsio/chtls: fix memory leaks in CPL handlers
        chelsio/chtls: fix deadlock issue
        net: hns3: Clear the CMDQ registers before unmapping BAR region
        bnxt_en: Send HWRM_FUNC_RESET fw command unconditionally.
        bnxt_en: Check abort error state in bnxt_open_nic().
        bnxt_en: Re-write PCI BARs after PCI fatal error.
        bnxt_en: Invoke cancel_delayed_work_sync() for PFs also.
        bnxt_en: Fix regression in workqueue cleanup logic in bnxt_remove_one().
        ...
      934291ff
    • Linus Torvalds's avatar
      Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma · b9c0f4bd
      Linus Torvalds authored
      Pull rdma fixes from Jason Gunthorpe:
       "The good news is people are testing rc1 in the RDMA world - the bad
        news is testing of the for-next area is not as good as I had hoped, as
        we really should have caught at least the rdma_connect_locked() issue
        before now.
      
        Notable merge window regressions that didn't get caught/fixed in time
        for rc1:
      
         - Fix in kernel users of rxe, they were broken by the rapid fix to
           undo the uABI breakage in rxe from another patch
      
         - EFA userspace needs to read the GID table but was broken with the
           new GID table logic
      
         - Fix user triggerable deadlock in mlx5 using devlink reload
      
         - Fix deadlock in several ULPs using rdma_connect from the CM handler
           callbacks
      
         - Memory leak in qedr"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
        RDMA/qedr: Fix memory leak in iWARP CM
        RDMA: Add rdma_connect_locked()
        RDMA/uverbs: Fix false error in query gid IOCTL
        RDMA/mlx5: Fix devlink deadlock on net namespace deletion
        RDMA/rxe: Fix small problem in network_type patch
      b9c0f4bd
    • Heiner Kallweit's avatar
      r8169: fix issue with forced threading in combination with shared interrupts · 2734a24e
      Heiner Kallweit authored
      As reported by Serge flag IRQF_NO_THREAD causes an error if the
      interrupt is actually shared and the other driver(s) don't have this
      flag set. This situation can occur if a PCI(e) legacy interrupt is
      used in combination with forced threading.
      There's no good way to deal with this properly, therefore we have to
      remove flag IRQF_NO_THREAD. For fixing the original forced threading
      issue switch to napi_schedule().
      
      Fixes: 424a646e ("r8169: fix operation under forced interrupt threading")
      Link: https://www.spinics.net/lists/netdev/msg694960.htmlReported-by: default avatarSerge Belyshev <belyshev@depni.sinp.msu.ru>
      Signed-off-by: default avatarHeiner Kallweit <hkallweit1@gmail.com>
      Tested-by: default avatarSerge Belyshev <belyshev@depni.sinp.msu.ru>
      Link: https://lore.kernel.org/r/b5b53bfe-35ac-3768-85bf-74d1290cf394@gmail.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      2734a24e
    • Aleksandr Nogikh's avatar
      netem: fix zero division in tabledist · eadd1bef
      Aleksandr Nogikh authored
      Currently it is possible to craft a special netlink RTM_NEWQDISC
      command that can result in jitter being equal to 0x80000000. It is
      enough to set the 32 bit jitter to 0x02000000 (it will later be
      multiplied by 2^6) or just set the 64 bit jitter via
      TCA_NETEM_JITTER64. This causes an overflow during the generation of
      uniformly distributed numbers in tabledist(), which in turn leads to
      division by zero (sigma != 0, but sigma * 2 is 0).
      
      The related fragment of code needs 32-bit division - see commit
      9b0ed89 ("netem: remove unnecessary 64 bit modulus"), so switching to
      64 bit is not an option.
      
      Fix the issue by keeping the value of jitter within the range that can
      be adequately handled by tabledist() - [0;INT_MAX]. As negative std
      deviation makes no sense, take the absolute value of the passed value
      and cap it at INT_MAX. Inside tabledist(), switch to unsigned 32 bit
      arithmetic in order to prevent overflows.
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: default avatarAleksandr Nogikh <nogikh@google.com>
      Reported-by: syzbot+ec762a6342ad0d3c0d8f@syzkaller.appspotmail.com
      Acked-by: default avatarStephen Hemminger <stephen@networkplumber.org>
      Link: https://lore.kernel.org/r/20201028170731.1383332-1-aleksandrnogikh@gmail.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      eadd1bef
    • Lijun Pan's avatar
      ibmvnic: fix ibmvnic_set_mac · 8fc3672a
      Lijun Pan authored
      Jakub Kicinski brought up a concern in ibmvnic_set_mac().
      ibmvnic_set_mac() does this:
      
      	ether_addr_copy(adapter->mac_addr, addr->sa_data);
      	if (adapter->state != VNIC_PROBED)
      		rc = __ibmvnic_set_mac(netdev, addr->sa_data);
      
      So if state == VNIC_PROBED, the user can assign an invalid address to
      adapter->mac_addr, and ibmvnic_set_mac() will still return 0.
      
      The fix is to validate ethernet address at the beginning of
      ibmvnic_set_mac(), and move the ether_addr_copy to
      the case of "adapter->state != VNIC_PROBED".
      
      Fixes: c26eba03 ("ibmvnic: Update reset infrastructure to support tunable parameters")
      Signed-off-by: default avatarLijun Pan <ljp@linux.ibm.com>
      Link: https://lore.kernel.org/r/20201027220456.71450-1-ljp@linux.ibm.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      8fc3672a
    • Paolo Abeni's avatar
      mptcp: add missing memory scheduling in the rx path · 9c3f94e1
      Paolo Abeni authored
      When moving the skbs from the subflow into the msk receive
      queue, we must schedule there the required amount of memory.
      
      Try to borrow the required memory from the subflow, if needed,
      so that we leverage the existing TCP heuristic.
      
      Fixes: 6771bfd9 ("mptcp: update mptcp ack sequence from work queue")
      Signed-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
      Reviewed-by: default avatarMat Martineau <mathew.j.martineau@linux.intel.com>
      Link: https://lore.kernel.org/r/f6143a6193a083574f11b00dbf7b5ad151bc4ff4.1603810630.git.pabeni@redhat.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      9c3f94e1
    • Gustavo A. R. Silva's avatar
      include: jhash/signal: Fix fall-through warnings for Clang · 4169e889
      Gustavo A. R. Silva authored
      In preparation to enable -Wimplicit-fallthrough for Clang, explicitly
      add break statements instead of letting the code fall through to the
      next case.
      
      This patch adds four break statements that, together, fix almost 40,000
      warnings when building Linux 5.10-rc1 with Clang 12.0.0 and this[1] change
      reverted. Notice that in order to enable -Wimplicit-fallthrough for Clang,
      such change[1] is meant to be reverted at some point. So, this patch helps
      to move in that direction.
      
      Something important to mention is that there is currently a discrepancy
      between GCC and Clang when dealing with switch fall-through to empty case
      statements or to cases that only contain a break/continue/return
      statement[2][3][4].
      
      Now that the -Wimplicit-fallthrough option has been globally enabled[5],
      any compiler should really warn on missing either a fallthrough annotation
      or any of the other case-terminating statements (break/continue/return/
      goto) when falling through to the next case statement. Making exceptions
      to this introduces variation in case handling which may continue to lead
      to bugs, misunderstandings, and a general lack of robustness. The point
      of enabling options like -Wimplicit-fallthrough is to prevent human error
      and aid developers in spotting bugs before their code is even built/
      submitted/committed, therefore eliminating classes of bugs. So, in order
      to really accomplish this, we should, and can, move in the direction of
      addressing any error-prone scenarios and get rid of the unintentional
      fallthrough bug-class in the kernel, entirely, even if there is some minor
      redundancy. Better to have explicit case-ending statements than continue to
      have exceptions where one must guess as to the right result. The compiler
      will eliminate any actual redundancy.
      
      [1] commit e2079e93 ("kbuild: Do not enable -Wimplicit-fallthrough for clang for now")
      [2] https://github.com/ClangBuiltLinux/linux/issues/636
      [3] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91432
      [4] https://godbolt.org/z/xgkvIh
      [5] commit a035d552 ("Makefile: Globally enable fall-through warning")
      Co-developed-by: default avatarKees Cook <keescook@chromium.org>
      Signed-off-by: default avatarKees Cook <keescook@chromium.org>
      Signed-off-by: default avatarGustavo A. R. Silva <gustavoars@kernel.org>
      4169e889
    • Linus Torvalds's avatar
      Merge tag 'afs-fixes-20201029' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs · 598a5976
      Linus Torvalds authored
      Pull AFS fixes from David Howells:
      
       - Fix copy_file_range() to an afs file now returning EINVAL if the
         splice_write file op isn't supplied.
      
       - Fix a deref-before-check in afs_unuse_cell().
      
       - Fix a use-after-free in afs_xattr_get_acl().
      
       - Fix afs to not try to clear PG_writeback when laundering a page.
      
       - Fix afs to take a ref on a page that it sets PG_private on and to
         drop that ref when clearing PG_private. This is done through recently
         added helpers.
      
       - Fix a page leak if write_begin() fails.
      
       - Fix afs_write_begin() to not alter the dirty region info stored in
         page->private, but rather do this in afs_write_end() instead when we
         know what we actually changed.
      
       - Fix afs_invalidatepage() to alter the dirty region info on a page
         when partial page invalidation occurs so that we don't inadvertantly
         include a span of zeros that will get written back if a page gets
         laundered due to a remote 3rd-party induced invalidation.
      
         We mustn't, however, reduce the dirty region if the page has been
         seen to be mapped (ie. we got called through the page_mkwrite vector)
         as the page might still be mapped and we might lose data if the file
         is extended again.
      
       - Fix the dirty region info to have a lower resolution if the size of
         the page is too large for this to be encoded (e.g. powerpc32 with 64K
         pages).
      
         Note that this might not be the ideal way to handle this, since it
         may allow some leakage of undirtied zero bytes to the server's copy
         in the case of a 3rd-party conflict.
      
      To aid the last two fixes, two additional changes:
      
       - Wrap the manipulations of the dirty region info stored in
         page->private into helper functions.
      
       - Alter the encoding of the dirty region so that the region bounds can
         be stored with one fewer bit, making a bit available for the
         indication of mappedness.
      
      * tag 'afs-fixes-20201029' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
        afs: Fix dirty-region encoding on ppc32 with 64K pages
        afs: Fix afs_invalidatepage to adjust the dirty region
        afs: Alter dirty range encoding in page->private
        afs: Wrap page->private manipulations in inline functions
        afs: Fix where page->private is set during write
        afs: Fix page leak on afs_write_begin() failure
        afs: Fix to take ref on page when PG_private is set
        afs: Fix afs_launder_page to not clear PG_writeback
        afs: Fix a use after free in afs_xattr_get_acl()
        afs: Fix tracing deref-before-check
        afs: Fix copy_file_range()
      598a5976
    • Tung Nguyen's avatar
      tipc: fix memory leak caused by tipc_buf_append() · ceb1eb2f
      Tung Nguyen authored
      Commit ed42989e ("tipc: fix the skb_unshare() in tipc_buf_append()")
      replaced skb_unshare() with skb_copy() to not reduce the data reference
      counter of the original skb intentionally. This is not the correct
      way to handle the cloned skb because it causes memory leak in 2
      following cases:
       1/ Sending multicast messages via broadcast link
        The original skb list is cloned to the local skb list for local
        destination. After that, the data reference counter of each skb
        in the original list has the value of 2. This causes each skb not
        to be freed after receiving ACK:
        tipc_link_advance_transmq()
        {
         ...
         /* release skb */
         __skb_unlink(skb, &l->transmq);
         kfree_skb(skb); <-- memory exists after being freed
        }
      
       2/ Sending multicast messages via replicast link
        Similar to the above case, each skb cannot be freed after purging
        the skb list:
        tipc_mcast_xmit()
        {
         ...
         __skb_queue_purge(pkts); <-- memory exists after being freed
        }
      
      This commit fixes this issue by using skb_unshare() instead. Besides,
      to avoid use-after-free error reported by KASAN, the pointer to the
      fragment is set to NULL before calling skb_unshare() to make sure that
      the original skb is not freed after freeing the fragment 2 times in
      case skb_unshare() returns NULL.
      
      Fixes: ed42989e ("tipc: fix the skb_unshare() in tipc_buf_append()")
      Acked-by: default avatarJon Maloy <jmaloy@redhat.com>
      Reported-by: default avatarThang Hoang Ngo <thang.h.ngo@dektech.com.au>
      Signed-off-by: default avatarTung Nguyen <tung.q.nguyen@dektech.com.au>
      Reviewed-by: default avatarXin Long <lucien.xin@gmail.com>
      Acked-by: default avatarCong Wang <xiyou.wangcong@gmail.com>
      Link: https://lore.kernel.org/r/20201027032403.1823-1-tung.q.nguyen@dektech.com.auSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      ceb1eb2f
    • Masahiro Fujiwara's avatar
      gtp: fix an use-before-init in gtp_newlink() · 51467431
      Masahiro Fujiwara authored
      *_pdp_find() from gtp_encap_recv() would trigger a crash when a peer
      sends GTP packets while creating new GTP device.
      
      RIP: 0010:gtp1_pdp_find.isra.0+0x68/0x90 [gtp]
      <SNIP>
      Call Trace:
       <IRQ>
       gtp_encap_recv+0xc2/0x2e0 [gtp]
       ? gtp1_pdp_find.isra.0+0x90/0x90 [gtp]
       udp_queue_rcv_one_skb+0x1fe/0x530
       udp_queue_rcv_skb+0x40/0x1b0
       udp_unicast_rcv_skb.isra.0+0x78/0x90
       __udp4_lib_rcv+0x5af/0xc70
       udp_rcv+0x1a/0x20
       ip_protocol_deliver_rcu+0xc5/0x1b0
       ip_local_deliver_finish+0x48/0x50
       ip_local_deliver+0xe5/0xf0
       ? ip_protocol_deliver_rcu+0x1b0/0x1b0
      
      gtp_encap_enable() should be called after gtp_hastable_new() otherwise
      *_pdp_find() will access the uninitialized hash table.
      
      Fixes: 1e3a3abd ("gtp: make GTP sockets in gtp_newlink optional")
      Signed-off-by: default avatarMasahiro Fujiwara <fujiwara.masahiro@gmail.com>
      Link: https://lore.kernel.org/r/20201027114846.3924-1-fujiwara.masahiro@gmail.comSigned-off-by: default avatarJakub Kicinski <kuba@kernel.org>
      51467431
    • Linus Torvalds's avatar
      Merge tag 'ext4_for_linus_fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 · 58130a6c
      Linus Torvalds authored
      Pull ext4 fixes from Ted Ts'o:
       "Bug fixes for the new ext4 fast commit feature, plus a fix for the
        'data=journal' bug fix.
      
        Also use the generic casefolding support which has now landed in
        fs/libfs.c for 5.10"
      
      * tag 'ext4_for_linus_fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
        ext4: indicate that fast_commit is available via /sys/fs/ext4/feature/...
        ext4: use generic casefolding support
        ext4: do not use extent after put_bh
        ext4: use IS_ERR() for error checking of path
        ext4: fix mmap write protection for data=journal mode
        jbd2: fix a kernel-doc markup
        ext4: use s_mount_flags instead of s_mount_state for fast commit state
        ext4: make num of fast commit blocks configurable
        ext4: properly check for dirty state in ext4_inode_datasync_dirty()
        ext4: fix double locking in ext4_fc_commit_dentry_updates()
      58130a6c