1. 22 Aug, 2018 2 commits
    • l2tp: use sk_dst_check() to avoid race on sk->sk_dst_cache · 46be8e44
      Wei Wang authored
      [ Upstream commit 6d37fa49 ]
      
      In l2tp code, if it is a L2TP_UDP_ENCAP tunnel, tunnel->sk points to a
      UDP socket. A user can call sendmsg() on both this tunnel and the UDP
      socket itself concurrently. Because l2tp_xmit_skb() holds the socket lock
      and calls __sk_dst_check() to refresh sk->sk_dst_cache, while
      udpv6_sendmsg() is lockless and calls sk_dst_check() to refresh
      sk->sk_dst_cache, the two paths can race and free the dst cache
      multiple times. So we fix the l2tp side to always call sk_dst_check(),
      which guarantees that xchg() is used when refreshing sk->sk_dst_cache
      and so avoids the race.
      
      Syzkaller reported stack trace:
      BUG: KASAN: use-after-free in atomic_read include/asm-generic/atomic-instrumented.h:21 [inline]
      BUG: KASAN: use-after-free in atomic_fetch_add_unless include/linux/atomic.h:575 [inline]
      BUG: KASAN: use-after-free in atomic_add_unless include/linux/atomic.h:597 [inline]
      BUG: KASAN: use-after-free in dst_hold_safe include/net/dst.h:308 [inline]
      BUG: KASAN: use-after-free in ip6_hold_safe+0xe6/0x670 net/ipv6/route.c:1029
      Read of size 4 at addr ffff8801aea9a880 by task syz-executor129/4829
      
      CPU: 0 PID: 4829 Comm: syz-executor129 Not tainted 4.18.0-rc7-next-20180802+ #30
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0x1c9/0x2b4 lib/dump_stack.c:113
       print_address_description+0x6c/0x20b mm/kasan/report.c:256
       kasan_report_error mm/kasan/report.c:354 [inline]
       kasan_report.cold.7+0x242/0x30d mm/kasan/report.c:412
       check_memory_region_inline mm/kasan/kasan.c:260 [inline]
       check_memory_region+0x13e/0x1b0 mm/kasan/kasan.c:267
       kasan_check_read+0x11/0x20 mm/kasan/kasan.c:272
       atomic_read include/asm-generic/atomic-instrumented.h:21 [inline]
       atomic_fetch_add_unless include/linux/atomic.h:575 [inline]
       atomic_add_unless include/linux/atomic.h:597 [inline]
       dst_hold_safe include/net/dst.h:308 [inline]
       ip6_hold_safe+0xe6/0x670 net/ipv6/route.c:1029
       rt6_get_pcpu_route net/ipv6/route.c:1249 [inline]
       ip6_pol_route+0x354/0xd20 net/ipv6/route.c:1922
       ip6_pol_route_output+0x54/0x70 net/ipv6/route.c:2098
       fib6_rule_lookup+0x283/0x890 net/ipv6/fib6_rules.c:122
       ip6_route_output_flags+0x2c5/0x350 net/ipv6/route.c:2126
       ip6_dst_lookup_tail+0x1278/0x1da0 net/ipv6/ip6_output.c:978
       ip6_dst_lookup_flow+0xc8/0x270 net/ipv6/ip6_output.c:1079
       ip6_sk_dst_lookup_flow+0x5ed/0xc50 net/ipv6/ip6_output.c:1117
       udpv6_sendmsg+0x2163/0x36b0 net/ipv6/udp.c:1354
       inet_sendmsg+0x1a1/0x690 net/ipv4/af_inet.c:798
       sock_sendmsg_nosec net/socket.c:622 [inline]
       sock_sendmsg+0xd5/0x120 net/socket.c:632
       ___sys_sendmsg+0x51d/0x930 net/socket.c:2115
       __sys_sendmmsg+0x240/0x6f0 net/socket.c:2210
       __do_sys_sendmmsg net/socket.c:2239 [inline]
       __se_sys_sendmmsg net/socket.c:2236 [inline]
       __x64_sys_sendmmsg+0x9d/0x100 net/socket.c:2236
       do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      RIP: 0033:0x446a29
      Code: e8 ac b8 02 00 48 83 c4 18 c3 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 eb 08 fc ff c3 66 2e 0f 1f 84 00 00 00 00
      RSP: 002b:00007f4de5532db8 EFLAGS: 00000246 ORIG_RAX: 0000000000000133
      RAX: ffffffffffffffda RBX: 00000000006dcc38 RCX: 0000000000446a29
      RDX: 00000000000000b8 RSI: 0000000020001b00 RDI: 0000000000000003
      RBP: 00000000006dcc30 R08: 00007f4de5533700 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000246 R12: 00000000006dcc3c
      R13: 00007ffe2b830fdf R14: 00007f4de55339c0 R15: 0000000000000001
      
      Fixes: 71b1391a ("l2tp: ensure sk->dst is still valid")
      Reported-by: syzbot+05f840f3b04f211bad55@syzkaller.appspotmail.com
      Signed-off-by: Wei Wang <weiwan@google.com>
      Signed-off-by: Martin KaFai Lau <kafai@fb.com>
      Cc: Guillaume Nault <g.nault@alphalink.fr>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Cong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
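The race described in the l2tp commit message above can be illustrated with a small userspace model. This is only a sketch under stated assumptions: `model_dst`, `model_sock` and all `model_*` functions are hypothetical names, C11 atomics stand in for the kernel's xchg(), and the "plain" reset is split into two steps so that an unlucky interleaving of two callers can be replayed deterministically. It is not the kernel code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for a dst entry: counts how often it was released. */
struct model_dst {
    int releases;
};

/* Hypothetical stand-in for a socket holding a cached dst pointer. */
struct model_sock {
    _Atomic(struct model_dst *) dst_cache;
};

static void model_dst_release(struct model_dst *dst)
{
    if (dst)
        dst->releases++;
}

/* Plain reset, step 1: read the cached pointer. */
static struct model_dst *model_plain_peek(struct model_sock *sk)
{
    assert(sk != NULL);
    return atomic_load(&sk->dst_cache);
}

/* Plain reset, step 2: clear the cache, then drop the pointer read in step 1.
 * Two callers that both completed step 1 will both release the same entry. */
static void model_plain_finish(struct model_sock *sk, struct model_dst *seen)
{
    assert(sk != NULL);
    atomic_store(&sk->dst_cache, (struct model_dst *)NULL);
    model_dst_release(seen);
}

/* Exchange-based reset: the read and the clear are one atomic step, so only
 * the caller that swaps out a non-NULL pointer releases it. */
static void model_xchg_reset(struct model_sock *sk)
{
    assert(sk != NULL);
    model_dst_release(atomic_exchange(&sk->dst_cache, (struct model_dst *)NULL));
}
```

Replaying "caller A peeks, caller B peeks, A finishes, B finishes" with the plain variant releases the same entry twice, whereas two back-to-back calls to `model_xchg_reset()` release it once.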
    • dccp: fix undefined behavior with 'cwnd' shift in ccid2_cwnd_restart() · 64d9b03d
      Alexey Kodanev authored
      [ Upstream commit 61ef4b07 ]
      
      The shift of 'cwnd' with '(now - hc->tx_lsndtime) / hc->tx_rto' value
      can lead to undefined behavior [1].
      
      In order to fix this use a gradual shift of the window with a 'while'
      loop, similar to what tcp_cwnd_restart() is doing.
      
      When comparing delta and RTO there is a minor difference between TCP
      and DCCP: the latter also invokes ccid2_cwnd_restart() and reduces
      'cwnd' if delta equals RTO. That case is preserved in this change.
      
      [1]:
      [40850.963623] UBSAN: Undefined behaviour in net/dccp/ccids/ccid2.c:237:7
      [40851.043858] shift exponent 67 is too large for 32-bit type 'unsigned int'
      [40851.127163] CPU: 3 PID: 15940 Comm: netstress Tainted: G        W   E     4.18.0-rc7.x86_64 #1
      ...
      [40851.377176] Call Trace:
      [40851.408503]  dump_stack+0xf1/0x17b
      [40851.451331]  ? show_regs_print_info+0x5/0x5
      [40851.503555]  ubsan_epilogue+0x9/0x7c
      [40851.548363]  __ubsan_handle_shift_out_of_bounds+0x25b/0x2b4
      [40851.617109]  ? __ubsan_handle_load_invalid_value+0x18f/0x18f
      [40851.686796]  ? xfrm4_output_finish+0x80/0x80
      [40851.739827]  ? lock_downgrade+0x6d0/0x6d0
      [40851.789744]  ? xfrm4_prepare_output+0x160/0x160
      [40851.845912]  ? ip_queue_xmit+0x810/0x1db0
      [40851.895845]  ? ccid2_hc_tx_packet_sent+0xd36/0x10a0 [dccp]
      [40851.963530]  ccid2_hc_tx_packet_sent+0xd36/0x10a0 [dccp]
      [40852.029063]  dccp_xmit_packet+0x1d3/0x720 [dccp]
      [40852.086254]  dccp_write_xmit+0x116/0x1d0 [dccp]
      [40852.142412]  dccp_sendmsg+0x428/0xb20 [dccp]
      [40852.195454]  ? inet_dccp_listen+0x200/0x200 [dccp]
      [40852.254833]  ? sched_clock+0x5/0x10
      [40852.298508]  ? sched_clock+0x5/0x10
      [40852.342194]  ? inet_create+0xdf0/0xdf0
      [40852.388988]  sock_sendmsg+0xd9/0x160
      ...
      
      Fixes: 113ced1f ("dccp ccid-2: Perform congestion-window validation")
      Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
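The gradual shift described in the dccp commit message above can be sketched as follows. Shifting a 32-bit value by 32 or more bits in a single step is undefined in C, which is what the UBSAN report (exponent 67) flags; halving once per elapsed RTO in a loop never uses a shift count other than 1. The function name, its parameters and the `floor_cwnd` lower bound are assumptions made for illustration, not the kernel's code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: halve cwnd once for every full RTO contained in
 * delta, but never go below floor_cwnd. Assumes rto > 0. */
static uint32_t model_cwnd_restart(uint32_t cwnd, uint32_t floor_cwnd,
                                   int32_t delta, int32_t rto)
{
    int64_t left = delta; /* wider type so the subtraction cannot overflow */

    assert(rto > 0);

    /* ">= 0" keeps the case where delta equals exactly one RTO. */
    while ((left -= rto) >= 0 && cwnd > floor_cwnd)
        cwnd >>= 1;

    return cwnd > floor_cwnd ? cwnd : floor_cwnd;
}
```

With `cwnd = 64`, `floor_cwnd = 4` and `rto = 10`, an idle time of 670 (67 RTOs) stops at the floor of 4 after four halvings, while an idle time of exactly 10 halves the window once to 32.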
  2. 18 Aug, 2018 2 commits
    • Linux 4.14.65 · 4cea13b6
      Greg Kroah-Hartman authored
    • x86/speculation/l1tf: Exempt zeroed PTEs from inversion · 3f2e4f5d
      Sean Christopherson authored
      commit f19f5c49 upstream.
      
      It turns out that we should *not* invert all not-present mappings,
      because the all zeroes case is obviously special.
      
      clear_page() does not undergo the XOR logic to invert the address bits,
      i.e. PTE, PMD and PUD entries that have not been individually written
      will have val=0 and so will trigger __pte_needs_invert(). As a result,
      {pte,pmd,pud}_pfn() will return the wrong PFN value, i.e. all ones
      (adjusted by the max PFN mask) instead of zero. A zeroed entry is ok
      because the page at physical address 0 is reserved early in boot
      specifically to mitigate L1TF, so explicitly exempt them from the
      inversion when reading the PFN.
      
      Manifested as an unexpected mprotect(..., PROT_NONE) failure when called
      on a VMA that has VM_PFNMAP and was mmap'd with a protection other than
      PROT_NONE but never used. mprotect() sends the PROT_NONE request down
      prot_none_walk(), which walks the PTEs to check the PFNs.
      prot_none_pte_entry() gets the bogus PFN from pte_pfn() and returns
      -EACCES because it thinks mprotect() is trying to adjust a high MMIO
      address.
      
      [ This is a very modified version of Sean's original patch, but all
        credit goes to Sean for doing this and also pointing out that
        sometimes the __pte_needs_invert() function only gets the protection
        bits, not the full eventual pte.  But zero remains special even in
        just protection bits, so that's ok.   - Linus ]
      
      Fixes: f22cc87f ("x86/speculation/l1tf: Invert all not present mappings")
      Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
      Acked-by: Andi Kleen <ak@linux.intel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
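The effect of exempting zeroed entries, as described in the l1tf commit message above, can be modelled in userspace. This is a simplified sketch: the `MODEL_*` constants (present bit 0, PFN in bits 12..51) and all `model_*` names are assumptions chosen for illustration and are not taken from the kernel headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_PRESENT    0x1ull
#define MODEL_PAGE_SHIFT 12
#define MODEL_PFN_MASK   0x000ffffffffff000ull /* assumed: bits 12..51 */

/* Before the fix: every not-present entry is treated as inverted. */
static bool model_needs_invert_old(uint64_t val)
{
    return !(val & MODEL_PRESENT);
}

/* After the fix: an all-zero entry is exempt from inversion. */
static bool model_needs_invert_new(uint64_t val)
{
    return val != 0 && !(val & MODEL_PRESENT);
}

/* Extract the PFN, undoing the inversion when the predicate asks for it. */
static uint64_t model_pfn(uint64_t val, bool (*needs_invert)(uint64_t))
{
    if (needs_invert(val))
        val = ~val;
    return (val & MODEL_PFN_MASK) >> MODEL_PAGE_SHIFT;
}

/* Build a not-present entry whose PFN bits are stored inverted. */
static uint64_t model_make_nonpresent(uint64_t pfn)
{
    assert(((pfn << MODEL_PAGE_SHIFT) & ~MODEL_PFN_MASK) == 0);
    return ~(pfn << MODEL_PAGE_SHIFT) & MODEL_PFN_MASK;
}
```

In this model a zeroed entry decodes to PFN 0 with the new predicate, but to all ones within the PFN mask with the old one, which corresponds to the bogus high PFN that makes prot_none_pte_entry() return -EACCES. Entries that were written individually decode identically under both predicates.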
  3. 17 Aug, 2018 23 commits
  4. 15 Aug, 2018 13 commits