1. 28 Jan, 2022 28 commits
  2. 27 Jan, 2022 12 commits
    • Merge tag 'net-5.17-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · 23a46422
      Linus Torvalds authored
      Pull networking fixes from Jakub Kicinski:
       "Including fixes from netfilter and can.
      
        Current release - new code bugs:
      
         - tcp: add a missing sk_defer_free_flush() in tcp_splice_read()
      
         - tcp: add a stub for sk_defer_free_flush(), fix CONFIG_INET=n
      
         - nf_tables: set last expression in register tracking area
      
         - nft_connlimit: fix memleak if nf_ct_netns_get() fails
      
         - mptcp: fix removing ids bitmap setting
      
         - bonding: use rcu_dereference_rtnl when getting active slave
      
         - fix three cases of sleep in atomic context in drivers: lan966x, gve
      
         - handful of build fixes for esoteric drivers after netdev->dev_addr
           was made const
      
        Previous releases - regressions:
      
         - revert "ipv6: Honor all IPv6 PIO Valid Lifetime values", it broke
           Linux compatibility with USGv6 tests
      
         - procfs: show net device bound packet types
      
         - ipv4: fix ip option filtering for locally generated fragments
      
         - phy: broadcom: hook up soft_reset for BCM54616S
      
        Previous releases - always broken:
      
         - ipv4: raw: lock the socket in raw_bind()
      
         - ipv4: decrease the use of shared IPID generator to decrease the
           chance of attackers guessing the values
      
         - procfs: fix cross-netns information leakage in /proc/net/ptype
      
         - ethtool: fix link extended state for big endian
      
         - bridge: vlan: fix single net device option dumping
      
         - ping: fix the sk_bound_dev_if match in ping_lookup"
      
      * tag 'net-5.17-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (86 commits)
        net: bridge: vlan: fix memory leak in __allowed_ingress
        net: socket: rename SKB_DROP_REASON_SOCKET_FILTER
        ipv4: remove sparse error in ip_neigh_gw4()
        ipv4: avoid using shared IP generator for connected sockets
        ipv4: tcp: send zero IPID in SYNACK messages
        ipv4: raw: lock the socket in raw_bind()
        MAINTAINERS: add missing IPv4/IPv6 header paths
        MAINTAINERS: add more files to eth PHY
        net: stmmac: dwmac-sun8i: use return val of readl_poll_timeout()
        net: bridge: vlan: fix single net device option dumping
        net: stmmac: skip only stmmac_ptp_register when resume from suspend
        net: stmmac: configure PTP clock source prior to PTP initialization
        Revert "ipv6: Honor all IPv6 PIO Valid Lifetime values"
        connector/cn_proc: Use task_is_in_init_pid_ns()
        pid: Introduce helper task_is_in_init_pid_ns()
        gve: Fix GFP flags when allocing pages
        net: lan966x: Fix sleep in atomic context when updating MAC table
        net: lan966x: Fix sleep in atomic context when injecting frames
        ethernet: seeq/ether3: don't write directly to netdev->dev_addr
        ethernet: 8390/etherh: don't write directly to netdev->dev_addr
        ...
    • net: bridge: vlan: fix memory leak in __allowed_ingress · fd20d973
      Tim Yi authored
      When using per-vlan state, if vlan snooping and stats are disabled,
      an untagged or priority-tagged ingress frame goes on to the pvid state
      check. If the port state is forwarding and the pvid state is not
      learning/forwarding, the untagged or priority-tagged frame is dropped
      but the skb memory is not freed.
      Free the skb when __allowed_ingress returns false.
      
      Fixes: a580c76d ("net: bridge: vlan: add per-vlan state")
      Signed-off-by: Tim Yi <tim.yi@pica8.com>
      Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com>
      Link: https://lore.kernel.org/r/20220127074953.12632-1-tim.yi@pica8.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
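The ownership rule behind the fd20d973 fix can be illustrated with a small userspace model. This is only a sketch, not the kernel code: `struct fake_skb`, `fake_alloc_skb()`, `fake_kfree_skb()`, `fake_allowed_ingress()` and the `live_skbs` counter are hypothetical stand-ins, and the single `vlan_state_ok` flag is an assumed simplification of the pvid state check.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for the kernel's struct sk_buff. */
struct fake_skb {
    int vlan_state_ok;   /* 1 if the pvid state allows the frame */
};

static int live_skbs;    /* number of buffers allocated and not yet freed */

static struct fake_skb *fake_alloc_skb(int vlan_state_ok)
{
    struct fake_skb *skb = malloc(sizeof(*skb));

    if (!skb)
        return NULL;
    skb->vlan_state_ok = vlan_state_ok;
    live_skbs++;
    return skb;
}

static void fake_kfree_skb(struct fake_skb *skb)
{
    free(skb);
    live_skbs--;
}

/*
 * Models the contract described in the commit message: when the check
 * returns false, the frame is dropped and its buffer must be freed on
 * that same path, because the caller will not touch it again.
 */
static bool fake_allowed_ingress(struct fake_skb *skb)
{
    if (!skb->vlan_state_ok) {
        fake_kfree_skb(skb);   /* the step that was missing on this drop path */
        return false;
    }
    return true;               /* caller keeps ownership of the buffer */
}
```

With the free in place, a rejected frame leaves `live_skbs` at zero; removing that call from the model reproduces the leak the commit describes.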
    • net: socket: rename SKB_DROP_REASON_SOCKET_FILTER · 364df53c
      Menglong Dong authored
      Rename SKB_DROP_REASON_SOCKET_FILTER, which is the reason recorded
      when an skb is dropped by a socket filter, before it becomes part of
      a released kernel. It will be used for more protocols than just TCP
      in a future series.
      Signed-off-by: Menglong Dong <imagedong@tencent.com>
      Reviewed-by: David Ahern <dsahern@kernel.org>
      Link: https://lore.kernel.org/all/20220127091308.91401-2-imagedong@tencent.com/
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • ipv4: remove sparse error in ip_neigh_gw4() · 3c42b201
      Eric Dumazet authored
      ./include/net/route.h:373:48: warning: incorrect type in argument 2 (different base types)
      ./include/net/route.h:373:48:    expected unsigned int [usertype] key
      ./include/net/route.h:373:48:    got restricted __be32 [usertype] daddr
      
      Fixes: 5c9f7c1d ("ipv4: Add helpers for neigh lookup for nexthop")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reviewed-by: David Ahern <dsahern@kernel.org>
      Link: https://lore.kernel.org/r/20220127013404.1279313-1-eric.dumazet@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Merge branch 'ipv4-less-uses-of-shared-ip-generator' · 3ede6465
      Jakub Kicinski authored
      Eric Dumazet says:
      
      ====================
      ipv4: less uses of shared IP generator
      
      From: Eric Dumazet <edumazet@google.com>
      
      We keep receiving research reports based on Linux IPID generation.
      
      Before breaking part of the Internet by switching to a purely
      random generator, this series reduces the need for the
      shared IP generator for TCP sockets.
      ====================
      
      Link: https://lore.kernel.org/r/20220127011022.1274803-1-eric.dumazet@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • ipv4: avoid using shared IP generator for connected sockets · 23f57406
      Eric Dumazet authored
      ip_select_ident_segs() has been very conservative about using
      the connected socket private generator only for packets with IP_DF
      set, claiming it was needed for some VJ compression implementations.
      
      As mentioned in this referenced document, this can be abused.
      (Ref: Off-Path TCP Exploits of the Mixed IPID Assignment)
      
      Before switching to pure random IPID generation and possibly hurting
      some workloads, let's use the private inet socket generator.
      
      Not only will this remove one vulnerability, it will also
      improve performance of TCP flows using pmtudisc==IP_PMTUDISC_DONT.
      
      Fixes: 73f156a6 ("inetpeer: get rid of ip_id_count")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reviewed-by: David Ahern <dsahern@kernel.org>
      Reported-by: Ray Che <xijiache@gmail.com>
      Cc: Willy Tarreau <w@1wt.eu>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
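The selection policy described in 23f57406 can be sketched as a small decision function. This is a userspace model under stated assumptions, not the body of ip_select_ident_segs(): `struct fake_sock`, `enum ipid_source`, `pick_ipid_source()` and `take_private_ids()` are hypothetical names, "connected" is modelled as a non-zero destination address, and the zero-ID branch for unconnected packets with IP_DF set is an assumption that is not stated in this commit message.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the fields of a connected inet socket. */
struct fake_sock {
    uint32_t daddr;     /* non-zero once the socket is connected */
    uint16_t inet_id;   /* per-socket private IPID counter */
};

enum ipid_source {
    IPID_PRIVATE,   /* per-socket counter */
    IPID_ZERO,      /* ID 0 for packets that cannot be fragmented */
    IPID_SHARED     /* shared (hashed) generator */
};

/*
 * After the change, a connected socket uses its private generator
 * whether or not IP_DF is set; previously that was done only for
 * packets with IP_DF set.
 */
static enum ipid_source pick_ipid_source(const struct fake_sock *sk, bool df_set)
{
    if (sk != NULL && sk->daddr != 0)
        return IPID_PRIVATE;
    if (df_set)
        return IPID_ZERO;     /* assumed behaviour for unconnected DF packets */
    return IPID_SHARED;
}

/* Hand out 'segs' consecutive IDs from the private counter (wraps at 16 bits). */
static uint16_t take_private_ids(struct fake_sock *sk, uint16_t segs)
{
    uint16_t id = sk->inet_id;

    sk->inet_id = (uint16_t)(sk->inet_id + segs);
    return id;
}
```

In this model the shared generator is reached only by packets that have no connected socket and do not carry IP_DF, which is the reduction in exposure the series aims for.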
    • ipv4: tcp: send zero IPID in SYNACK messages · 970a5a3e
      Eric Dumazet authored
      In commit 431280ee ("ipv4: tcp: send zero IPID for RST and
      ACK sent in SYN-RECV and TIME-WAIT state") we took care of some
      ctl packets sent by TCP.
      
      It turns out we need to use a similar strategy for SYNACK packets.
      
      By default, they carry IP_DF and IPID==0, but there are ways
      to ask them to use the hashed IP ident generator and thus
      be used to build off-path attacks.
      (Ref: Off-Path TCP Exploits of the Mixed IPID Assignment)
      
      One such way is to set, before the listener is started:
      echo 1 >/proc/sys/net/ipv4/ip_no_pmtu_disc
      
      Another way is to use a forged ICMP_FRAG_NEEDED message
      with a very small MTU (like 68) to force a false return from
      ip_dont_fragment().
      
      In this patch, ip_build_and_send_pkt() uses the following
      heuristics.
      
      1) Most SYNACK packets are smaller than IPV4_MIN_MTU and therefore
      can use IP_DF regardless of the listener or route pmtu setting.
      
      2) In case the SYNACK packet is bigger than IPV4_MIN_MTU,
      we use prandom_u32() generator instead of the IPv4 hashed ident one.
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: Ray Che <xijiache@gmail.com>
      Reviewed-by: David Ahern <dsahern@kernel.org>
      Cc: Geoff Alexander <alexandg@cs.unm.edu>
      Cc: Willy Tarreau <w@1wt.eu>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
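The two heuristics listed in 970a5a3e can be sketched as follows. This is a simplified userspace model, not the code in ip_build_and_send_pkt(): `choose_synack_id()`, `struct ip_id_choice` and `SKETCH_IPV4_MIN_MTU` are hypothetical names, `rand()` stands in for the kernel's prandom_u32(), and treating a length equal to the minimum MTU as "small" is an assumption.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Minimum IPv4 MTU; 68 matches the "very small MTU" mentioned in the message. */
#define SKETCH_IPV4_MIN_MTU 68u

struct ip_id_choice {
    bool df;        /* whether IP_DF is set on the packet */
    uint16_t id;    /* IP identification field */
};

/*
 * Heuristic 1: a packet no larger than the minimum MTU can never need
 * fragmentation, so it can carry IP_DF and ID 0 regardless of the
 * listener or route pmtu setting.
 * Heuristic 2: a larger SYNACK that may be fragmented gets a random ID
 * instead of one from the hashed ident generator.
 */
static struct ip_id_choice choose_synack_id(unsigned int pkt_len, bool route_allows_df)
{
    struct ip_id_choice c;

    if (pkt_len <= SKETCH_IPV4_MIN_MTU || route_allows_df) {
        c.df = true;
        c.id = 0;
    } else {
        c.df = false;
        c.id = (uint16_t)rand();   /* stand-in for prandom_u32() */
    }
    return c;
}
```

The point of the first branch is that neither the ip_no_pmtu_disc sysctl nor a forged ICMP_FRAG_NEEDED can push a small SYNACK onto the hashed generator, since the length check alone decides the outcome.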
    • drm/vmwgfx: Fix stale file descriptors on failed usercopy · a0f90c88
      Mathias Krause authored
      A failing usercopy of the fence_rep object will lead to a stale entry in
      the file descriptor table as put_unused_fd() won't release it. This
      enables userland to refer to a dangling 'file' object through that still
      valid file descriptor, leading to all kinds of use-after-free
      exploitation scenarios.
      
      Fix this by deferring the call to fd_install() until after the usercopy
      has succeeded.
      
      Fixes: c906965d ("drm/vmwgfx: Add export fence to file descriptor support")
      Signed-off-by: Mathias Krause <minipli@grsecurity.net>
      Signed-off-by: Zack Rusin <zackr@vmware.com>
      Signed-off-by: Dave Airlie <airlied@redhat.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
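The ordering problem fixed in a0f90c88 can be modelled with a toy descriptor table. This is a sketch under assumptions, not the vmwgfx code: `fd_table`, `reserve_fd()`, `release_reserved_fd()`, `install_fd()` and the two `export_fence_*()` functions are hypothetical, the `copy_ok` flag stands in for the result of the usercopy, and the table models only the one property the commit message relies on, namely that put_unused_fd() does not undo an fd_install().

```c
#include <stdbool.h>

#define FD_SLOTS 4

enum slot_state { SLOT_FREE, SLOT_RESERVED, SLOT_INSTALLED };

static enum slot_state fd_table[FD_SLOTS];

/* Models get_unused_fd_flags(): reserve the lowest free slot. */
static int reserve_fd(void)
{
    for (int i = 0; i < FD_SLOTS; i++) {
        if (fd_table[i] == SLOT_FREE) {
            fd_table[i] = SLOT_RESERVED;
            return i;
        }
    }
    return -1;
}

/* Models put_unused_fd(): only a reservation can be undone. */
static void release_reserved_fd(int fd)
{
    if (fd_table[fd] == SLOT_RESERVED)
        fd_table[fd] = SLOT_FREE;
}

/* Models fd_install(): from here on userland can use the descriptor. */
static void install_fd(int fd)
{
    fd_table[fd] = SLOT_INSTALLED;
}

/* Old ordering: install first, then attempt the copy to userspace. */
static int export_fence_buggy(bool copy_ok)
{
    int fd = reserve_fd();

    if (fd < 0)
        return -1;
    install_fd(fd);
    if (!copy_ok) {
        release_reserved_fd(fd);   /* too late: the entry stays installed */
        return -1;
    }
    return fd;
}

/* Fixed ordering: install only after the copy has succeeded. */
static int export_fence_fixed(bool copy_ok)
{
    int fd = reserve_fd();

    if (fd < 0)
        return -1;
    if (!copy_ok) {
        release_reserved_fd(fd);   /* reservation is cleanly undone */
        return -1;
    }
    install_fd(fd);
    return fd;
}
```

After a failed copy, the buggy variant leaves a slot in `SLOT_INSTALLED` even though the caller saw an error, while the fixed variant returns the slot to `SLOT_FREE`.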
    • ipv4: raw: lock the socket in raw_bind() · 153a0d18
      Eric Dumazet authored
      For some reason, raw_bind() forgot to lock the socket.
      
      BUG: KCSAN: data-race in __ip4_datagram_connect / raw_bind
      
      write to 0xffff8881170d4308 of 4 bytes by task 5466 on cpu 0:
       raw_bind+0x1b0/0x250 net/ipv4/raw.c:739
       inet_bind+0x56/0xa0 net/ipv4/af_inet.c:443
       __sys_bind+0x14b/0x1b0 net/socket.c:1697
       __do_sys_bind net/socket.c:1708 [inline]
       __se_sys_bind net/socket.c:1706 [inline]
       __x64_sys_bind+0x3d/0x50 net/socket.c:1706
       do_syscall_x64 arch/x86/entry/common.c:50 [inline]
       do_syscall_64+0x44/0xd0 arch/x86/entry/common.c:80
       entry_SYSCALL_64_after_hwframe+0x44/0xae
      
      read to 0xffff8881170d4308 of 4 bytes by task 5468 on cpu 1:
       __ip4_datagram_connect+0xb7/0x7b0 net/ipv4/datagram.c:39
       ip4_datagram_connect+0x2a/0x40 net/ipv4/datagram.c:89
       inet_dgram_connect+0x107/0x190 net/ipv4/af_inet.c:576
       __sys_connect_file net/socket.c:1900 [inline]
       __sys_connect+0x197/0x1b0 net/socket.c:1917
       __do_sys_connect net/socket.c:1927 [inline]
       __se_sys_connect net/socket.c:1924 [inline]
       __x64_sys_connect+0x3d/0x50 net/socket.c:1924
       do_syscall_x64 arch/x86/entry/common.c:50 [inline]
       do_syscall_64+0x44/0xd0 arch/x86/entry/common.c:80
       entry_SYSCALL_64_after_hwframe+0x44/0xae
      
      value changed: 0x00000000 -> 0x0003007f
      
      Reported by Kernel Concurrency Sanitizer on:
      CPU: 1 PID: 5468 Comm: syz-executor.5 Not tainted 5.17.0-rc1-syzkaller #0
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • MAINTAINERS: add missing IPv4/IPv6 header paths · 966f435a
      Jakub Kicinski authored
      Add missing headers to the IP entry.
      Reviewed-by: David Ahern <dsahern@kernel.org>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • MAINTAINERS: add more files to eth PHY · 492fefba
      Jakub Kicinski authored
      include/linux/linkmode.h and include/linux/mii.h
      do not match anything in MAINTAINERS. Looks like
      they should be under Ethernet PHY.
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: stmmac: dwmac-sun8i: use return val of readl_poll_timeout() · 9e0db41e
      Jisheng Zhang authored
      When readl_poll_timeout() times out, we should return its return
      value directly.
      
      Before this patch:
      [    2.145528] dwmac-sun8i: probe of 4500000.ethernet failed with error -14
      
      After this patch:
      [    2.138520] dwmac-sun8i: probe of 4500000.ethernet failed with error -110
      Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
      Acked-by: Jernej Skrabec <jernej.skrabec@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
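The change in 9e0db41e amounts to propagating the poll helper's own error code instead of replacing it. The sketch below models that in userspace: `fake_poll_timeout()`, `reset_old()` and `reset_new()` are hypothetical names, and the helper only imitates the documented contract of readl_poll_timeout() (0 on success, -ETIMEDOUT on timeout). On Linux, EFAULT is 14 and ETIMEDOUT is 110, which matches the "before" and "after" log lines in the commit message.

```c
#include <errno.h>

/*
 * Hypothetical stand-in for readl_poll_timeout(): the condition becomes
 * true on poll number 'ready_after'; give up after 'max_polls' attempts.
 */
static int fake_poll_timeout(int ready_after, int max_polls)
{
    for (int i = 0; i < max_polls; i++) {
        if (i >= ready_after)
            return 0;
    }
    return -ETIMEDOUT;
}

/* Old behaviour: any poll failure is reported as -EFAULT. */
static int reset_old(int ready_after, int max_polls)
{
    int err = fake_poll_timeout(ready_after, max_polls);

    if (err)
        return -EFAULT;
    return 0;
}

/* New behaviour: pass the helper's return value through unchanged. */
static int reset_new(int ready_after, int max_polls)
{
    int err = fake_poll_timeout(ready_after, max_polls);

    if (err)
        return err;
    return 0;
}
```

Returning the original code keeps the probe failure message accurate: a timeout is reported as a timeout rather than as a bad address.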