1. 25 Feb, 2018 25 commits
    • netfilter: on sockopt() acquire sock lock only in the required scope · 4ec264d8
      Paolo Abeni authored
      commit 3f34cfae upstream.
      
      Syzbot reported several deadlocks in the netfilter area caused by
      rtnl lock and socket lock being acquired with a different order on
      different code paths, leading to backtraces like the following one:
      
      ======================================================
      WARNING: possible circular locking dependency detected
      4.15.0-rc9+ #212 Not tainted
      ------------------------------------------------------
      syzkaller041579/3682 is trying to acquire lock:
        (sk_lock-AF_INET6){+.+.}, at: [<000000008775e4dd>] lock_sock
      include/net/sock.h:1463 [inline]
        (sk_lock-AF_INET6){+.+.}, at: [<000000008775e4dd>]
      do_ipv6_setsockopt.isra.8+0x3c5/0x39d0 net/ipv6/ipv6_sockglue.c:167
      
      but task is already holding lock:
        (rtnl_mutex){+.+.}, at: [<000000004342eaa9>] rtnl_lock+0x17/0x20
      net/core/rtnetlink.c:74
      
      which lock already depends on the new lock.
      
      the existing dependency chain (in reverse order) is:
      
      -> #1 (rtnl_mutex){+.+.}:
              __mutex_lock_common kernel/locking/mutex.c:756 [inline]
              __mutex_lock+0x16f/0x1a80 kernel/locking/mutex.c:893
              mutex_lock_nested+0x16/0x20 kernel/locking/mutex.c:908
              rtnl_lock+0x17/0x20 net/core/rtnetlink.c:74
              register_netdevice_notifier+0xad/0x860 net/core/dev.c:1607
              tee_tg_check+0x1a0/0x280 net/netfilter/xt_TEE.c:106
              xt_check_target+0x22c/0x7d0 net/netfilter/x_tables.c:845
              check_target net/ipv6/netfilter/ip6_tables.c:538 [inline]
              find_check_entry.isra.7+0x935/0xcf0
      net/ipv6/netfilter/ip6_tables.c:580
              translate_table+0xf52/0x1690 net/ipv6/netfilter/ip6_tables.c:749
              do_replace net/ipv6/netfilter/ip6_tables.c:1165 [inline]
              do_ip6t_set_ctl+0x370/0x5f0 net/ipv6/netfilter/ip6_tables.c:1691
              nf_sockopt net/netfilter/nf_sockopt.c:106 [inline]
              nf_setsockopt+0x67/0xc0 net/netfilter/nf_sockopt.c:115
              ipv6_setsockopt+0x115/0x150 net/ipv6/ipv6_sockglue.c:928
              udpv6_setsockopt+0x45/0x80 net/ipv6/udp.c:1422
              sock_common_setsockopt+0x95/0xd0 net/core/sock.c:2978
              SYSC_setsockopt net/socket.c:1849 [inline]
              SyS_setsockopt+0x189/0x360 net/socket.c:1828
              entry_SYSCALL_64_fastpath+0x29/0xa0
      
      -> #0 (sk_lock-AF_INET6){+.+.}:
              lock_acquire+0x1d5/0x580 kernel/locking/lockdep.c:3914
              lock_sock_nested+0xc2/0x110 net/core/sock.c:2780
              lock_sock include/net/sock.h:1463 [inline]
              do_ipv6_setsockopt.isra.8+0x3c5/0x39d0 net/ipv6/ipv6_sockglue.c:167
              ipv6_setsockopt+0xd7/0x150 net/ipv6/ipv6_sockglue.c:922
              udpv6_setsockopt+0x45/0x80 net/ipv6/udp.c:1422
              sock_common_setsockopt+0x95/0xd0 net/core/sock.c:2978
              SYSC_setsockopt net/socket.c:1849 [inline]
              SyS_setsockopt+0x189/0x360 net/socket.c:1828
              entry_SYSCALL_64_fastpath+0x29/0xa0
      
      other info that might help us debug this:
      
        Possible unsafe locking scenario:
      
              CPU0                    CPU1
              ----                    ----
         lock(rtnl_mutex);
                                      lock(sk_lock-AF_INET6);
                                      lock(rtnl_mutex);
         lock(sk_lock-AF_INET6);
      
        *** DEADLOCK ***
      
      1 lock held by syzkaller041579/3682:
        #0:  (rtnl_mutex){+.+.}, at: [<000000004342eaa9>] rtnl_lock+0x17/0x20
      net/core/rtnetlink.c:74
      
      The problem, as Florian noted, is that nf_setsockopt() is always
      called with the socket lock held, even though the lock itself is
      required only for very tight scopes and only for some operations.
      
      This patch addresses the issue by moving the lock_sock() call to
      where it is really needed, namely in ipv*_getorigdst(), so that
      nf_setsockopt() no longer needs to acquire both locks.
      
      Fixes: 22265a5c ("netfilter: xt_TEE: resolve oif using netdevice notifiers")
      Reported-by: syzbot+a4c2dc980ac1af699b36@syzkaller.appspotmail.com
      Suggested-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
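The lock inversion reported above can be reproduced in miniature with a toy order tracker. This is a user-space sketch only: the tracker, the `demo_`-style path functions and the two lock ids are hypothetical stand-ins, not kernel code. It records which lock was taken while another was held and flags the reverse order, which is roughly the property lockdep complains about.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

enum lock_id { RTNL, SK_LOCK, NLOCKS };   /* stand-ins for rtnl_mutex and the socket lock */

static bool held[NLOCKS];
static bool seen_order[NLOCKS][NLOCKS];   /* seen_order[a][b]: b was taken while a was held */

void tracker_reset(void)
{
    memset(held, 0, sizeof(held));
    memset(seen_order, 0, sizeof(seen_order));
}

/* Record the acquisition; return false if the reverse order was seen earlier. */
static bool acquire(enum lock_id l)
{
    bool ok = true;

    assert(l < NLOCKS);
    for (int h = 0; h < NLOCKS; h++) {
        if (!held[h])
            continue;
        if (seen_order[l][h])
            ok = false;               /* inversion: h was once taken under l */
        seen_order[h][l] = true;
    }
    held[l] = true;
    return ok;
}

static void release(enum lock_id l) { held[l] = false; }

/* Old shape: socket lock held around the whole handler, which then takes rtnl. */
bool setsockopt_wide_scope(void)
{
    bool ok = acquire(SK_LOCK);

    ok = acquire(RTNL) && ok;
    release(RTNL);
    release(SK_LOCK);
    return ok;
}

/* New shape: the handler that needs rtnl runs without the socket lock. */
bool setsockopt_narrow_scope(void)
{
    bool ok = acquire(RTNL);

    release(RTNL);
    return ok;
}

/* Some other path that takes rtnl first and the socket lock second. */
bool rtnl_then_sock(void)
{
    bool ok = acquire(RTNL);

    ok = acquire(SK_LOCK) && ok;
    release(SK_LOCK);
    release(RTNL);
    return ok;
}
```

With the wide scope, running `setsockopt_wide_scope()` followed by `rtnl_then_sock()` makes the second call report an inversion; after a `tracker_reset()`, the narrow-scope variant leaves no conflicting order behind.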
    • netfilter: ipt_CLUSTERIP: fix out-of-bounds accesses in clusterip_tg_check() · ab2b0f7b
      Dmitry Vyukov authored
      commit 1a38956c upstream.
      
      Commit 136e92bb switched local_nodes from an array to a bitmask
      but did not add proper bounds checks. As a result,
      clusterip_config_init_nodelist() can both over-read
      ipt_clusterip_tgt_info.local_nodes and over-write
      clusterip_config.local_nodes.
      
      Add bounds checks for both.
      
      Fixes: 136e92bb ("[NETFILTER] CLUSTERIP: use a bitmap to store node responsibility data")
      Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
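The two bounds checks described above (one against over-reading the user-supplied array, one against writing past the bitmask) might look like the following user-space sketch. The struct layouts, `DEMO_MAX_NODES` and `demo_init_nodelist()` are hypothetical simplifications, and the "node N maps to bit N-1" convention is an assumption made for the example.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_NODES 16 /* hypothetical capacity of the user-supplied array */

struct demo_tgt_info {               /* stands in for ipt_clusterip_tgt_info */
    uint16_t num_local_nodes;
    uint16_t local_nodes[DEMO_MAX_NODES];
};

struct demo_config {                 /* stands in for clusterip_config */
    unsigned long local_nodes;       /* one bit per node: node N -> bit N-1 (assumed) */
};

/* Returns 0 on success, -1 if either bound would be violated. */
int demo_init_nodelist(struct demo_config *c, const struct demo_tgt_info *i)
{
    const size_t nbits = sizeof(c->local_nodes) * CHAR_BIT;
    unsigned long mask = 0;

    assert(c != NULL && i != NULL);

    /* Guard against over-reading the source array. */
    if (i->num_local_nodes > DEMO_MAX_NODES)
        return -1;

    for (size_t n = 0; n < i->num_local_nodes; n++) {
        size_t node = i->local_nodes[n];

        /* Guard against setting a bit outside the destination bitmask. */
        if (node == 0 || node > nbits)
            return -1;
        mask |= 1UL << (node - 1);
    }
    c->local_nodes = mask;
    return 0;
}
```

The destination is only written once every entry has been validated, so a rejected request leaves the config untouched.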
    • netfilter: x_tables: avoid out-of-bounds reads in xt_request_find_{match|target} · b39f3f38
      Eric Dumazet authored
      commit da17c73b upstream.
      
      It looks like syzbot found its way into netfilter territory.
      
      Issue here is that @name comes from user space and might
      not be null terminated.
      
      Out-of-bound reads happen, KASAN is not happy.
      
      v2 added similar fix for xt_request_find_target(),
      as Florian advised.
      
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Acked-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
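A check of the kind described above, rejecting a fixed-size name field that carries no terminator before any string function touches it, can be sketched in user space as follows. `DEMO_NAME_LEN` and `demo_name_is_terminated()` are hypothetical, and `memchr()` is used here as a portable stand-in; the kernel patch itself may use a different helper.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define DEMO_NAME_LEN 8 /* hypothetical size of the fixed name field */

/* True only if a NUL byte occurs within the first DEMO_NAME_LEN bytes. */
bool demo_name_is_terminated(const char *name)
{
    assert(name != NULL);
    return memchr(name, '\0', DEMO_NAME_LEN) != NULL;
}
```

A caller would reject the request when this returns false, instead of passing the buffer on to code that assumes a C string.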
    • netfilter: x_tables: fix int overflow in xt_alloc_table_info() · 1099c708
      Dmitry Vyukov authored
      commit 889c604f upstream.
      
      syzkaller triggered OOM kills by passing ipt_replace.size = -1
      to IPT_SO_SET_REPLACE. The root cause is that SMP_ALIGN() in
      xt_alloc_table_info() causes int overflow and the size check passes
      when it should not. SMP_ALIGN() is a leftover that is no longer needed.
      
      Remove SMP_ALIGN() call in xt_alloc_table_info().
      
      Reported-by: syzbot+4396883fa8c4f64e0175@syzkaller.appspotmail.com
      Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
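The way an alignment round-up can defeat a size check is easy to show with 32-bit unsigned arithmetic. The sketch below is illustrative only: the 64-byte cache line, the 12-bit page shift and the `demo_` functions are assumptions, and the check is a simplified form of "does this many pages fit in RAM".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_CACHE_BYTES 64u  /* assumed cache line size */
#define DEMO_PAGE_SHIFT  12   /* assumed 4 KiB pages */

/* Round x up to a multiple of DEMO_CACHE_BYTES; wraps for x near UINT32_MAX. */
static uint32_t demo_smp_align(uint32_t x)
{
    return (x + DEMO_CACHE_BYTES - 1) & ~(DEMO_CACHE_BYTES - 1);
}

/* Size check applied to the aligned value: a wrapped size looks tiny. */
bool demo_size_ok_aligned(uint32_t size, uint64_t total_pages)
{
    assert(total_pages > 0);
    return (uint64_t)(demo_smp_align(size) >> DEMO_PAGE_SHIFT) + 2 <= total_pages;
}

/* Same check without the alignment step. */
bool demo_size_ok(uint32_t size, uint64_t total_pages)
{
    assert(total_pages > 0);
    return ((uint64_t)size >> DEMO_PAGE_SHIFT) + 2 <= total_pages;
}
```

With `size = UINT32_MAX` (what `-1` becomes in an unsigned field), the aligned value wraps to 0 and the first check passes, while the second correctly rejects the request.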
    • kcov: detect double association with a single task · c33f9272
      Dmitry Vyukov authored
      commit a77660d2 upstream.
      
      Currently KCOV_ENABLE does not check if the current task is already
      associated with another kcov descriptor.  As a result, it is possible
      to associate a single task with more than one kcov descriptor, which
      later leads to a memory leak of the old descriptor.  This relation is
      really meant to be one-to-one (task has only one back link).
      
      Extend validation to detect such misuse.
      
      Link: http://lkml.kernel.org/r/20180122082520.15716-1-dvyukov@google.com
      Fixes: 5c9a8750 ("kernel: add kcov code coverage")
      Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
      Reported-by: Shankara Pailoor <sp3485@columbia.edu>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
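The one-to-one relation described above can be enforced by checking both back links before associating. The following is a minimal user-space sketch with hypothetical `demo_` types; it is not the kernel's data layout, and the `-EBUSY` return value is an assumption for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct demo_task;

struct demo_kcov {
    struct demo_task *t;      /* task this descriptor is attached to, or NULL */
};

struct demo_task {
    struct demo_kcov *kcov;   /* the single back link to a descriptor, or NULL */
};

/* Attach descriptor k to task cur; refuse if either side is already in use. */
int demo_kcov_enable(struct demo_kcov *k, struct demo_task *cur)
{
    assert(k != NULL && cur != NULL);

    if (k->t != NULL || cur->kcov != NULL)
        return -EBUSY;

    k->t = cur;
    cur->kcov = k;
    return 0;
}
```

Checking only `k->t` would still let a second descriptor overwrite `cur->kcov`, which is the leak the commit message describes.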
    • KVM: x86: fix escape of guest dr6 to the host · 9748fd5b
      Wanpeng Li authored
      commit efdab992 upstream.
      
      syzkaller reported:
      
         WARNING: CPU: 0 PID: 12927 at arch/x86/kernel/traps.c:780 do_debug+0x222/0x250
         CPU: 0 PID: 12927 Comm: syz-executor Tainted: G           OE    4.15.0-rc2+ #16
         RIP: 0010:do_debug+0x222/0x250
         Call Trace:
          <#DB>
          debug+0x3e/0x70
         RIP: 0010:copy_user_enhanced_fast_string+0x10/0x20
          </#DB>
          _copy_from_user+0x5b/0x90
          SyS_timer_create+0x33/0x80
          entry_SYSCALL_64_fastpath+0x23/0x9a
      
      The testcase sets a watchpoint (with perf_event_open) on a buffer that is
      passed to timer_create() as the struct sigevent argument.  In timer_create(),
      copy_from_user()'s rep movsb triggers the BP.  The testcase also sets
      the debug registers for the guest.
      
      However, KVM only restores host debug registers when the host has active
      watchpoints, which triggers a race condition when running the testcase with
      multiple threads.  The guest's DR6.BS bit can escape to the host before
      another thread invokes timer_create(), and do_debug() complains.
      
      The fix is to respect do_debug()'s dr6 invariant when leaving KVM.
      
      Reported-by: Dmitry Vyukov <dvyukov@google.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: David Hildenbrand <david@redhat.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Reviewed-by: David Hildenbrand <david@redhat.com>
      Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • blk_rq_map_user_iov: fix error override · 7abb5e9d
      Douglas Gilbert authored
      commit 69e0927b upstream.
      
      During stress tests by syzkaller on the sg driver the block layer
      infrequently returns EINVAL. Closer inspection shows the block
      layer was trying to return ENOMEM (which is much more
      understandable) but for some reason overrode that useful error.
      
      Patch below does not show this (unchanged) line:
         ret =__blk_rq_map_user_iov(rq, map_data, &i, gfp_mask, copy);
      That 'ret' was being overridden when that function failed.
      
      Signed-off-by: Douglas Gilbert <dgilbert@interlog.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
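The error-override pattern described above is a common shape: a shared failure label returns a hard-coded code and discards whatever the helper reported. The sketch below contrasts the two shapes; `demo_map_one()` and both callers are hypothetical and only model the control flow.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical helper: pretends to map one segment, returns 0 or a -errno. */
static int demo_map_one(int result)
{
    assert(result <= 0);
    return result;
}

/* Overriding shape: the failure label hard-codes -EINVAL. */
int demo_map_old(int result)
{
    int ret = demo_map_one(result);

    if (ret)
        goto fail;
    return 0;
fail:
    return -EINVAL;   /* the helper's error code is lost here */
}

/* Preserving shape: the failure label passes the helper's code through. */
int demo_map_new(int result)
{
    int ret = demo_map_one(result);

    if (ret)
        goto fail;
    return 0;
fail:
    return ret;
}
```

With a helper failure of `-ENOMEM`, the first variant reports `-EINVAL` and the second reports `-ENOMEM`.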
    • staging: android: ion: Switch from WARN to pr_warn · 3ee287d3
      Laura Abbott authored
      commit e4e179a8 upstream.
      
      Syzbot reported a warning with Ion:
      
      WARNING: CPU: 0 PID: 3502 at drivers/staging/android/ion/ion-ioctl.c:73 ion_ioctl+0x2db/0x380 drivers/staging/android/ion/ion-ioctl.c:73
      Kernel panic - not syncing: panic_on_warn set ...
      
      This is a warning that validation of the ioctl fields failed. This was
      deliberately added as a warning to make it very obvious to developers that
      something needed to be fixed. In reality, this is overkill and disturbs
      fuzzing. Switch to pr_warn for a message instead.
      
      Reported-by: syzbot+fa2d5f63ee5904a0115a@syzkaller.appspotmail.com
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: Laura Abbott <labbott@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • staging: android: ion: Add __GFP_NOWARN for system contig heap · 458d2fc9
      Laura Abbott authored
      commit 0c75f103 upstream.
      
      syzbot reported a warning from Ion:
      
        WARNING: CPU: 1 PID: 3485 at mm/page_alloc.c:3926
      
        ...
         __alloc_pages_nodemask+0x9fb/0xd80 mm/page_alloc.c:4252
        alloc_pages_current+0xb6/0x1e0 mm/mempolicy.c:2036
        alloc_pages include/linux/gfp.h:492 [inline]
        ion_system_contig_heap_allocate+0x40/0x2c0
        drivers/staging/android/ion/ion_system_heap.c:374
        ion_buffer_create drivers/staging/android/ion/ion.c:93 [inline]
        ion_alloc+0x2c1/0x9e0 drivers/staging/android/ion/ion.c:420
        ion_ioctl+0x26d/0x380 drivers/staging/android/ion/ion-ioctl.c:84
        vfs_ioctl fs/ioctl.c:46 [inline]
        do_vfs_ioctl+0x1b1/0x1520 fs/ioctl.c:686
        SYSC_ioctl fs/ioctl.c:701 [inline]
        SyS_ioctl+0x8f/0xc0 fs/ioctl.c:692
      
      This is a warning about attempting to allocate order > MAX_ORDER. This
      is coming from a userspace Ion allocation request. Since userspace is
      free to request however much memory it wants (and the kernel is free to
      deny its allocation), silence the allocation attempt with __GFP_NOWARN
      in case it fails.
      
      Reported-by: syzbot+76e7efc4748495855a4d@syzkaller.appspotmail.com
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: Laura Abbott <labbott@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • crypto: x86/twofish-3way - Fix %rbp usage · eda4a836
      Eric Biggers authored
      commit d8c7fe9f upstream.
      
      Using %rbp as a temporary register breaks frame pointer convention and
      breaks stack traces when unwinding from an interrupt in the crypto code.
      
      In twofish-3way, we can't simply replace %rbp with another register
      because there are none available.  Instead, we use the stack to hold the
      values that %rbp, %r11, and %r12 were holding previously.  Each of these
      values represents the half of the output from the previous Feistel round
      that is being passed on unchanged to the following round.  They are only
      used once per round, when they are exchanged with %rax, %rbx, and %rcx.
      
      As a result, we free up 3 registers (one per block) and can reassign
      them so that %rbp is not used, and additionally %r14 and %r15 are not
      used so they do not need to be saved/restored.
      
      There may be a small overhead caused by replacing 'xchg REG, REG' with
      the needed sequence 'mov MEM, REG; mov REG, MEM; mov REG, REG' once per
      round.  But, counterintuitively, when I tested "ctr-twofish-3way" on a
      Haswell processor, the new version was actually about 2% faster.
      (Perhaps 'xchg' is not as well optimized as plain moves.)
      
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: Eric Biggers <ebiggers@google.com>
      Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • selinux: skip bounded transition processing if the policy isn't loaded · 5e6f51aa
      Paul Moore authored
      commit 4b14752e upstream.
      
      We can't do anything reasonable in security_bounded_transition() if we
      don't have a policy loaded, and in fact we could run into problems
      with some of the code inside expecting a policy.  Fix these problems
      like we do many others in security/selinux/ss/services.c by checking
      to see if the policy is loaded (ss_initialized) and returning quickly
      if it isn't.
      
      Reported-by: syzbot <syzkaller-bugs@googlegroups.com>
      Signed-off-by: Paul Moore <paul@paul-moore.com>
      Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
      Reviewed-by: James Morris <james.l.morris@oracle.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • selinux: ensure the context is NUL terminated in security_context_to_sid_core() · fe1cb580
      Paul Moore authored
      commit ef28df55 upstream.
      
      The syzbot/syzkaller automated tests found a problem in
      security_context_to_sid_core() during early boot (before we load the
      SELinux policy) where we could potentially feed context strings without
      NUL terminators into the strcmp() function.
      
      We already guard against this during normal operation (after the SELinux
      policy has been loaded) by making a copy of the context strings and
      explicitly adding a NUL terminator to the end.  The patch extends this
      protection to the early boot case (no loaded policy) by moving the context
      copy earlier in security_context_to_sid_core().
      
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: Paul Moore <paul@paul-moore.com>
      Reviewed-By: William Roberts <william.c.roberts@intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • Provide a function to create a NUL-terminated string from unterminated data · 5cab144f
      David Howells authored
      commit f3515741 upstream.
      
      Provide a function, kmemdup_nul(), that will create a NUL-terminated string
      from an unterminated character array where the length is known in advance.
      
      This is better than kstrndup() in situations where we already know the
      string length as the strnlen() in kstrndup() is superfluous.
      
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
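A user-space analogue of the helper described above might look like this. `demo_memdup_nul()` is a hypothetical name, and `malloc()` stands in for the kernel allocator; the point is that a known length makes any scan for a terminator unnecessary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Copy exactly len bytes from s and append a NUL terminator.
 * Returns a heap buffer the caller must free(), or NULL on failure.
 */
char *demo_memdup_nul(const char *s, size_t len)
{
    char *buf;

    assert(s != NULL);
    if (len == SIZE_MAX)          /* len + 1 would wrap to 0 */
        return NULL;

    buf = malloc(len + 1);
    if (buf == NULL)
        return NULL;

    memcpy(buf, s, len);          /* no strnlen(): the length is already known */
    buf[len] = '\0';
    return buf;
}
```

The source bytes never need to contain a terminator, so the function is safe on raw character arrays as long as `len` does not exceed their size.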
    • ptr_ring: fail early if queue occupies more than KMALLOC_MAX_SIZE · 5fd4db30
      Jason Wang authored
      commit 6e6e41c3 upstream.
      
      To keep slab from warning about an exceeded size, fail early if the
      queue occupies more than KMALLOC_MAX_SIZE.
      
      Reported-by: syzbot+e4d4f9ddd4295539735d@syzkaller.appspotmail.com
      Fixes: 2e0ab8ca ("ptr_ring: array based FIFO for pointers")
      Signed-off-by: Jason Wang <jasowang@redhat.com>
      Acked-by: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
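A "fail early" size check of the kind described above can be sketched in user space as follows. `DEMO_MAX_ALLOC` is a hypothetical cap standing in for KMALLOC_MAX_SIZE, and the comparison is written as a division so that the byte count itself can never overflow; that is a design choice of this sketch, not a statement about the kernel patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define DEMO_MAX_ALLOC ((size_t)4 * 1024 * 1024) /* hypothetical allocator cap, in bytes */

/* Allocate a zeroed array of `entries` pointers, or return NULL if it is too large. */
void **demo_ring_alloc(size_t entries)
{
    assert(entries > 0);

    /* Reject before calling the allocator, so it never sees an oversized request. */
    if (entries > DEMO_MAX_ALLOC / sizeof(void *))
        return NULL;

    return calloc(entries, sizeof(void *));
}
```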
    • drm: Require __GFP_NOFAIL for the legacy drm_modeset_lock_all · eeb1f9bd
      Chris Wilson authored
      commit d18d1a5a upstream.
      
      To acquire all modeset locks requires a ww_ctx to be allocated. As this
      is the legacy path and the allocation small, to reduce the changes
      required (and complex untested error handling) to the legacy drivers, we
      simply assume that the allocation succeeds. At present, it relies on the
      too-small-to-fail rule, but syzbot found that by injecting a failure
      here we would hit the WARN. Document that this allocation must succeed
      with __GFP_NOFAIL.
      
      Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
      Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20171031115535.15166-1-chris@chris-wilson.co.uk
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • blktrace: fix unlocked registration of tracepoints · 7569adcf
      Jens Axboe authored
      commit a6da0024 upstream.
      
      We need to ensure that tracepoints are registered and unregistered
      with the users of them. The existing atomic count isn't enough for
      that. Add a lock around the tracepoints, so we serialize access
      to them.
      
      This fixes cases where we have multiple users setting up and
      tearing down tracepoints, like this:
      
      CPU: 0 PID: 2995 Comm: syzkaller857118 Not tainted
      4.14.0-rc5-next-20171018+ #36
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
      Google 01/01/2011
      Call Trace:
        __dump_stack lib/dump_stack.c:16 [inline]
        dump_stack+0x194/0x257 lib/dump_stack.c:52
        panic+0x1e4/0x41c kernel/panic.c:183
        __warn+0x1c4/0x1e0 kernel/panic.c:546
        report_bug+0x211/0x2d0 lib/bug.c:183
        fixup_bug+0x40/0x90 arch/x86/kernel/traps.c:177
        do_trap_no_signal arch/x86/kernel/traps.c:211 [inline]
        do_trap+0x260/0x390 arch/x86/kernel/traps.c:260
        do_error_trap+0x120/0x390 arch/x86/kernel/traps.c:297
        do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:310
        invalid_op+0x18/0x20 arch/x86/entry/entry_64.S:905
      RIP: 0010:tracepoint_add_func kernel/tracepoint.c:210 [inline]
      RIP: 0010:tracepoint_probe_register_prio+0x397/0x9a0 kernel/tracepoint.c:283
      RSP: 0018:ffff8801d1d1f6c0 EFLAGS: 00010293
      RAX: ffff8801d22e8540 RBX: 00000000ffffffef RCX: ffffffff81710f07
      RDX: 0000000000000000 RSI: ffffffff85b679c0 RDI: ffff8801d5f19818
      RBP: ffff8801d1d1f7c8 R08: ffffffff81710c10 R09: 0000000000000004
      R10: ffff8801d1d1f6b0 R11: 0000000000000003 R12: ffffffff817597f0
      R13: 0000000000000000 R14: 00000000ffffffff R15: ffff8801d1d1f7a0
        tracepoint_probe_register+0x2a/0x40 kernel/tracepoint.c:304
        register_trace_block_rq_insert include/trace/events/block.h:191 [inline]
        blk_register_tracepoints+0x1e/0x2f0 kernel/trace/blktrace.c:1043
        do_blk_trace_setup+0xa10/0xcf0 kernel/trace/blktrace.c:542
        blk_trace_setup+0xbd/0x180 kernel/trace/blktrace.c:564
        sg_ioctl+0xc71/0x2d90 drivers/scsi/sg.c:1089
        vfs_ioctl fs/ioctl.c:45 [inline]
        do_vfs_ioctl+0x1b1/0x1520 fs/ioctl.c:685
        SYSC_ioctl fs/ioctl.c:700 [inline]
        SyS_ioctl+0x8f/0xc0 fs/ioctl.c:691
        entry_SYSCALL_64_fastpath+0x1f/0xbe
      RIP: 0033:0x444339
      RSP: 002b:00007ffe05bb5b18 EFLAGS: 00000206 ORIG_RAX: 0000000000000010
      RAX: ffffffffffffffda RBX: 00000000006d66c0 RCX: 0000000000444339
      RDX: 000000002084cf90 RSI: 00000000c0481273 RDI: 0000000000000009
      RBP: 0000000000000082 R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000206 R12: ffffffffffffffff
      R13: 00000000c0481273 R14: 0000000000000000 R15: 0000000000000000
      
      since we can now run these in parallel. Ensure that the exported helpers
      for doing this are grabbing the queue trace mutex.
      
      Reported-by: Steven Rostedt <rostedt@goodmis.org>
      Tested-by: Dmitry Vyukov <dvyukov@google.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • sctp: set frag_point in sctp_setsockopt_maxseg correctly · 2e671223
      Xin Long authored
      commit ecca8f88 upstream.
      
      Now in sctp_setsockopt_maxseg user_frag or frag_point can be set with
      val >= 8 and val <= SCTP_MAX_CHUNK_LEN. But both checks are incorrect.
      
      val >= 8 means frag_point can even be less than SCTP_DEFAULT_MINSEGMENT.
      Then in sctp_datamsg_from_user(), when its value is greater than the
      cookie echo len and it tries to bundle with the cookie echo chunk,
      first_len will overflow.
      
      The worst case is when its value is equal to the cookie echo len:
      first_len becomes 0 and it goes into a dead loop for fragmenting later
      on. In Hangbin's syzkaller testing env, oom was even triggered due to
      consecutive memory allocation in that loop.
      
      Besides, SCTP_MAX_CHUNK_LEN is the max size of the whole chunk, so the
      data header should be deducted for the frag_point or user_frag check.
      
      This patch does a proper check with SCTP_DEFAULT_MINSEGMENT subtracting
      the sctphdr and datahdr, SCTP_MAX_CHUNK_LEN subtracting datahdr when
      setting frag_point via sockopt. It also improves the
      sctp_setsockopt_maxseg code.
      
      Suggested-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Reported-by: Hangbin Liu <liuhangbin@gmail.com>
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
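The range check described above (a lower bound derived from the minimum segment minus headers, an upper bound derived from the maximum chunk length minus the data header) can be sketched as below. All sizes are passed in as parameters because the real constants and header sizes are not given in the commit message; `struct demo_limits` and `demo_check_maxseg()` are hypothetical, and the kernel may subtract further headers that this sketch does not model.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct demo_limits {
    size_t min_segment;    /* stands in for SCTP_DEFAULT_MINSEGMENT */
    size_t max_chunk_len;  /* stands in for SCTP_MAX_CHUNK_LEN */
    size_t sctp_hdr_len;   /* size of the common SCTP header */
    size_t data_hdr_len;   /* size of the DATA chunk header */
};

/* Returns 0 if val is an acceptable fragmentation point, -EINVAL otherwise. */
int demo_check_maxseg(size_t val, const struct demo_limits *lim)
{
    size_t min_len, max_len;

    assert(lim != NULL);
    assert(lim->min_segment >= lim->sctp_hdr_len + lim->data_hdr_len);
    assert(lim->max_chunk_len >= lim->data_hdr_len);

    min_len = lim->min_segment - lim->sctp_hdr_len - lim->data_hdr_len;
    max_len = lim->max_chunk_len - lim->data_hdr_len;

    if (val < min_len || val > max_len)
        return -EINVAL;
    return 0;
}
```

With illustrative limits of 512 and 65532 bytes and header sizes of 12 and 16 bytes, the accepted range is 484 to 65516, so the old lower bound of 8 is rejected.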
    • xfrm: check id proto in validate_tmpl() · 85552886
      Cong Wang authored
      commit 6a53b759 upstream.
      
      syzbot reported a kernel warning in xfrm_state_fini(), which
      indicates that we have entries left in the list
      net->xfrm.state_all whose proto is zero. And
      xfrm_id_proto_match() doesn't consider them as a match with
      IPSEC_PROTO_ANY in this case.
      
      Proto with value 0 is probably not a valid value, at least
      verify_newsa_info() doesn't consider it valid either.
      
      This patch fixes it by checking the proto value in
      validate_tmpl() and rejecting invalid ones, like what iproute2
      does in xfrm_xfrmproto_getbyname().
      
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Cc: Steffen Klassert <steffen.klassert@secunet.com>
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • xfrm: Fix stack-out-of-bounds read on socket policy lookup. · 46b31716
      Steffen Klassert authored
      commit ddc47e44 upstream.
      
      When we do tunnel or beet mode, we pass saddr and daddr from the
      template to xfrm_state_find(); this is ok. In transport mode,
      we pass the addresses from the flowi, assuming that the IP
      addresses (and address family) don't change during transformation.
      This assumption is wrong in the IPv4-mapped IPv6 case, where the
      packet is IPv4 and the template is IPv6.
      
      Fix this by catching address family mismatches between the policy
      and the flow before we do the lookup.
      
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • mm,vmscan: Make unregister_shrinker() no-op if register_shrinker() failed. · 274ee93f
      Tetsuo Handa authored
      commit bb422a73 upstream.
      
      Syzbot caught an oops at unregister_shrinker() because the combination of
      commit 1d3d4437 ("vmscan: per-node deferred work") and fault
      injection made register_shrinker() fail and the caller of
      register_shrinker() did not check for failure.
      
      ----------
      [  554.881422] FAULT_INJECTION: forcing a failure.
      [  554.881422] name failslab, interval 1, probability 0, space 0, times 0
      [  554.881438] CPU: 1 PID: 13231 Comm: syz-executor1 Not tainted 4.14.0-rc8+ #82
      [  554.881443] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      [  554.881445] Call Trace:
      [  554.881459]  dump_stack+0x194/0x257
      [  554.881474]  ? arch_local_irq_restore+0x53/0x53
      [  554.881486]  ? find_held_lock+0x35/0x1d0
      [  554.881507]  should_fail+0x8c0/0xa40
      [  554.881522]  ? fault_create_debugfs_attr+0x1f0/0x1f0
      [  554.881537]  ? check_noncircular+0x20/0x20
      [  554.881546]  ? find_next_zero_bit+0x2c/0x40
      [  554.881560]  ? ida_get_new_above+0x421/0x9d0
      [  554.881577]  ? find_held_lock+0x35/0x1d0
      [  554.881594]  ? __lock_is_held+0xb6/0x140
      [  554.881628]  ? check_same_owner+0x320/0x320
      [  554.881634]  ? lock_downgrade+0x990/0x990
      [  554.881649]  ? find_held_lock+0x35/0x1d0
      [  554.881672]  should_failslab+0xec/0x120
      [  554.881684]  __kmalloc+0x63/0x760
      [  554.881692]  ? lock_downgrade+0x990/0x990
      [  554.881712]  ? register_shrinker+0x10e/0x2d0
      [  554.881721]  ? trace_event_raw_event_module_request+0x320/0x320
      [  554.881737]  register_shrinker+0x10e/0x2d0
      [  554.881747]  ? prepare_kswapd_sleep+0x1f0/0x1f0
      [  554.881755]  ? _down_write_nest_lock+0x120/0x120
      [  554.881765]  ? memcpy+0x45/0x50
      [  554.881785]  sget_userns+0xbcd/0xe20
      (...snipped...)
      [  554.898693] kasan: CONFIG_KASAN_INLINE enabled
      [  554.898724] kasan: GPF could be caused by NULL-ptr deref or user memory access
      [  554.898732] general protection fault: 0000 [#1] SMP KASAN
      [  554.898737] Dumping ftrace buffer:
      [  554.898741]    (ftrace buffer empty)
      [  554.898743] Modules linked in:
      [  554.898752] CPU: 1 PID: 13231 Comm: syz-executor1 Not tainted 4.14.0-rc8+ #82
      [  554.898755] Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      [  554.898760] task: ffff8801d1dbe5c0 task.stack: ffff8801c9e38000
      [  554.898772] RIP: 0010:__list_del_entry_valid+0x7e/0x150
      [  554.898775] RSP: 0018:ffff8801c9e3f108 EFLAGS: 00010246
      [  554.898780] RAX: dffffc0000000000 RBX: 0000000000000000 RCX: 0000000000000000
      [  554.898784] RDX: 0000000000000000 RSI: ffff8801c53c6f98 RDI: ffff8801c53c6fa0
      [  554.898788] RBP: ffff8801c9e3f120 R08: 1ffff100393c7d55 R09: 0000000000000004
      [  554.898791] R10: ffff8801c9e3ef70 R11: 0000000000000000 R12: 0000000000000000
      [  554.898795] R13: dffffc0000000000 R14: 1ffff100393c7e45 R15: ffff8801c53c6f98
      [  554.898800] FS:  0000000000000000(0000) GS:ffff8801db300000(0000) knlGS:0000000000000000
      [  554.898804] CS:  0010 DS: 002b ES: 002b CR0: 0000000080050033
      [  554.898807] CR2: 00000000dbc23000 CR3: 00000001c7269000 CR4: 00000000001406e0
      [  554.898813] DR0: 0000000020000000 DR1: 0000000020000000 DR2: 0000000000000000
      [  554.898816] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600
      [  554.898818] Call Trace:
      [  554.898828]  unregister_shrinker+0x79/0x300
      [  554.898837]  ? perf_trace_mm_vmscan_writepage+0x750/0x750
      [  554.898844]  ? down_write+0x87/0x120
      [  554.898851]  ? deactivate_super+0x139/0x1b0
      [  554.898857]  ? down_read+0x150/0x150
      [  554.898864]  ? check_same_owner+0x320/0x320
      [  554.898875]  deactivate_locked_super+0x64/0xd0
      [  554.898883]  deactivate_super+0x141/0x1b0
      ----------
      
      Allowing register_shrinker() callers to call unregister_shrinker()
      even when register_shrinker() failed simplifies the error recovery
      path, so this patch makes unregister_shrinker() a no-op in that case.
      Also, reset shrinker->nr_deferred in case unregister_shrinker() is
      called twice by mistake.
      Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Signed-off-by: Aliaksei Karaliou <akaraliou.dev@gmail.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Cc: Glauber Costa <glauber@scylladb.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      274ee93f
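The no-op behaviour described in the shrinker commit above can be illustrated with a small userspace sketch. This is not the kernel code: `struct shrinker` here is a simplified stand-in, and `toy_register_shrinker()`, `toy_unregister_shrinker()`, the `on_list` field, and the `fail_alloc` parameter are hypothetical names introduced only for illustration. The assumption is that a NULL `nr_deferred` marks a shrinker that was never successfully registered.

```c
#include <assert.h>
#include <stdlib.h>

/* Simplified stand-in for the kernel's struct shrinker (hypothetical layout). */
struct shrinker {
    long *nr_deferred; /* allocated on successful registration, NULL otherwise */
    int on_list;       /* stand-in for membership in the global shrinker list */
};

/* fail_alloc simulates an allocation failure, e.g. under fault injection. */
static int toy_register_shrinker(struct shrinker *s, int fail_alloc)
{
    s->nr_deferred = fail_alloc ? NULL : calloc(1, sizeof(long));
    if (!s->nr_deferred)
        return -1; /* registration failed, nothing was added to the list */
    s->on_list = 1;
    return 0;
}

static void toy_unregister_shrinker(struct shrinker *s)
{
    if (!s->nr_deferred)
        return; /* never registered, or already unregistered: do nothing */
    s->on_list = 0;
    free(s->nr_deferred);
    s->nr_deferred = NULL; /* reset so that a second call is harmless */
}
```

With this shape, an error path can call the unregister function unconditionally, whether or not registration succeeded.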
    • Florian Westphal's avatar
      xfrm: skip policies marked as dead while rehashing · 5d89917c
      Florian Westphal authored
      commit 862591bf upstream.
      
      syzkaller triggered the following KASAN splat:
      
      BUG: KASAN: slab-out-of-bounds in xfrm_hash_rebuild+0xdbe/0xf00 net/xfrm/xfrm_policy.c:618
      read of size 2 at addr ffff8801c8e92fe4 by task kworker/1:1/23 [..]
      Workqueue: events xfrm_hash_rebuild [..]
       __asan_report_load2_noabort+0x14/0x20 mm/kasan/report.c:428
       xfrm_hash_rebuild+0xdbe/0xf00 net/xfrm/xfrm_policy.c:618
       process_one_work+0xbbf/0x1b10 kernel/workqueue.c:2112
       worker_thread+0x223/0x1990 kernel/workqueue.c:2246 [..]
      
      The reproducer triggers:
              if (error) {
                      list_move_tail(&walk->walk.all, &x->all);
                      goto out;
              }
      
      in xfrm_policy_walk() via pfkey (it sets a tiny receive buffer, so
      the dump callback returns -ENOBUFS).

      In this case, *walk is located in the pfkey socket struct, so this
      socket becomes visible in the global policy list.
      
      It looks like this is intentional: the phony walker has walk.dead set
      to 1, and all other places skip such "policies".
      
      Ccing the original authors of the two commits that seem to expose this
      issue (the first patch missed the ->dead check, the second patch adds
      pfkey sockets to the policy dumper list).
      
      Fixes: 880a6fab ("xfrm: configure policy hash table thresholds by netlink")
      Fixes: 12a169e7 ("ipsec: Put dumpers on the dump list")
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
      Cc: Timo Teras <timo.teras@iki.fi>
      Cc: Christophe Gouault <christophe.gouault@6wind.com>
      Reported-by: syzbot <bot+c028095236fcb6f4348811565b75084c754dc729@syzkaller.appspotmail.com>
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      5d89917c
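The idea behind the xfrm fix above (skip list entries marked dead before treating them as policies) can be sketched in userspace C. This is only a model under stated assumptions: `toy_walk_entry`, `toy_policy`, `toy_container_of`, and `toy_rehash_count()` are hypothetical names, and a plain array stands in for the kernel's global policy list. The point it shows is that a dead entry may be embedded in some other, smaller object (a dumper), so converting it to a policy pointer and reading policy fields would touch memory that is not a policy.

```c
#include <assert.h>
#include <stddef.h>

/* Common list header shared by real policies and dumper placeholders. */
struct toy_walk_entry {
    int dead; /* 1 for placeholders that are not embedded in a policy */
};

struct toy_policy {
    unsigned short family;
    struct toy_walk_entry walk;
};

#define toy_container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Count the live policies of one family, as a rehash pass would visit them. */
static int toy_rehash_count(struct toy_walk_entry **all, size_t n,
                            unsigned short family)
{
    int moved = 0;

    for (size_t i = 0; i < n; i++) {
        struct toy_policy *pol;

        if (all[i]->dead)
            continue; /* not a policy: do not dereference it as one */
        pol = toy_container_of(all[i], struct toy_policy, walk);
        if (pol->family == family)
            moved++;
    }
    return moved;
}
```

Without the `dead` check, the `toy_container_of()` conversion on a placeholder would yield a pointer to unrelated memory, which is the kind of access the KASAN report flags.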
    • Johannes Berg's avatar
      cfg80211: check dev_set_name() return value · 75898034
      Johannes Berg authored
      commit 59b179b4 upstream.
      
      syzbot reported a warning from rfkill_alloc(). After a while, I
      think the reason is that it was doing fault injection and
      dev_set_name() failed, leaving the name NULL; we didn't check the
      return value and so reached rfkill_alloc() with a NULL name.
      Since we really don't want a NULL name, we ought to check the
      return value.
      
      Fixes: fb28ad35 ("net: struct device - replace bus_id with dev_name(), dev_set_name()")
      Reported-by: syzbot+1ddfb3357e1d7bb5b5d3@syzkaller.appspotmail.com
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      75898034
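The pattern in the cfg80211 commit above, checking the result of a name-setting call that can fail on allocation, might look like the following userspace sketch. Everything here is an assumption made for illustration: `toy_device`, `toy_dev_set_name()`, `toy_wiphy_new()`, the `fail_alloc` switch that simulates fault injection, and the `"phy%d"` name format do not come from the commit message.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct toy_device {
    char *name;
};

/* Formats and stores a name; fail_alloc simulates an injected failure. */
static int toy_dev_set_name(struct toy_device *dev, int idx, int fail_alloc)
{
    char buf[32];
    char *name;
    int len = snprintf(buf, sizeof(buf), "phy%d", idx);

    if (len < 0 || (size_t)len >= sizeof(buf))
        return -1;
    name = fail_alloc ? NULL : malloc((size_t)len + 1);
    if (!name)
        return -1; /* name stays NULL, caller must notice */
    memcpy(name, buf, (size_t)len + 1);
    dev->name = name;
    return 0;
}

/* Returns NULL instead of handing out a device with a NULL name. */
static struct toy_device *toy_wiphy_new(int idx, int fail_alloc)
{
    struct toy_device *dev = calloc(1, sizeof(*dev));

    if (!dev)
        return NULL;
    if (toy_dev_set_name(dev, idx, fail_alloc) < 0) {
        free(dev); /* bail out before any later step sees the NULL name */
        return NULL;
    }
    return dev;
}
```

The design choice is simply to fail early: once the constructor returns, every later consumer can rely on the name being non-NULL.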
    • Tom Herbert's avatar
      kcm: Only allow TCP sockets to be attached to a KCM mux · 2bb174af
      Tom Herbert authored
      commit 581e7226 upstream.
      
      Only TCP sockets for IPv4 and IPv6 that are neither listeners nor in
      the closed state are allowed to be attached to a KCM mux.
      
      Fixes: ab7ac4eb ("kcm: Kernel Connection Multiplexor module")
      Reported-by: syzbot+8865eaff7f9acd593945@syzkaller.appspotmail.com
      Signed-off-by: Tom Herbert <tom@quantonium.net>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      2bb174af
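The attach rule stated in the KCM commit above can be written as a small predicate. This is a sketch, not the kernel check: the `toy_*` enums, `struct toy_sock`, and `toy_kcm_may_attach()` are hypothetical stand-ins for the real socket family, protocol, and state fields.

```c
#include <assert.h>

enum toy_family { TOY_AF_INET, TOY_AF_INET6, TOY_AF_UNIX };
enum toy_proto  { TOY_PROTO_TCP, TOY_PROTO_UDP };
enum toy_state  { TOY_ESTABLISHED, TOY_LISTEN, TOY_CLOSE };

struct toy_sock {
    enum toy_family family;
    enum toy_proto proto;
    enum toy_state state;
};

/* Returns 1 if the socket may be attached to a mux, 0 otherwise. */
static int toy_kcm_may_attach(const struct toy_sock *sk)
{
    /* Only IPv4 and IPv6 sockets qualify. */
    if (sk->family != TOY_AF_INET && sk->family != TOY_AF_INET6)
        return 0;
    /* Only TCP qualifies. */
    if (sk->proto != TOY_PROTO_TCP)
        return 0;
    /* Listeners and closed sockets are rejected. */
    if (sk->state == TOY_LISTEN || sk->state == TOY_CLOSE)
        return 0;
    return 1;
}
```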
    • Tom Herbert's avatar
      kcm: Check if sk_user_data already set in kcm_attach · 085cbbda
      Tom Herbert authored
      commit e5571240 upstream.
      
      This is needed to prevent sk_user_data being overwritten.
      The check is done under the callback lock. This should prevent
      a socket from being attached twice to a KCM mux. It also prevents
      a socket from being attached for other use cases of sk_user_data
      as long as the other cases set sk_user_data under the lock.
      Follow-up work is needed to unify all the use cases of sk_user_data
      so that they use the same locking.
      
      Reported-by: syzbot+114b15f2be420a8886c3@syzkaller.appspotmail.com
      Fixes: ab7ac4eb ("kcm: Kernel Connection Multiplexor module")
      Signed-off-by: Tom Herbert <tom@quantonium.net>
      Reviewed-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      085cbbda
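The check-then-set under a lock described in the sk_user_data commit above can be modelled in userspace as follows. This is a sketch under assumptions: a C11 `atomic_flag` spinlock stands in for the kernel's callback lock, and `toy_sock`, `toy_lock()`, `toy_unlock()`, and `toy_attach()` are hypothetical names. The error value is a placeholder, not the errno the kernel returns.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct toy_sock {
    atomic_flag callback_lock; /* stand-in for the socket callback lock */
    void *user_data;           /* stand-in for sk_user_data */
};

static void toy_lock(atomic_flag *l)
{
    while (atomic_flag_test_and_set(l))
        ; /* spin until the flag was previously clear */
}

static void toy_unlock(atomic_flag *l)
{
    atomic_flag_clear(l);
}

/* Claims the socket for owner; fails if somebody already claimed it. */
static int toy_attach(struct toy_sock *sk, void *owner)
{
    int err = 0;

    toy_lock(&sk->callback_lock);
    if (sk->user_data)
        err = -1; /* already in use: do not overwrite */
    else
        sk->user_data = owner;
    toy_unlock(&sk->callback_lock);
    return err;
}
```

Because the test and the assignment happen inside the same critical section, two concurrent attach attempts cannot both succeed, provided every other writer of the field takes the same lock.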
    • Jason Wang's avatar
      vhost: use mutex_lock_nested() in vhost_dev_lock_vqs() · bd3ccdc6
      Jason Wang authored
      commit e9cb4239 upstream.
      
      We used to call mutex_lock() in vhost_dev_lock_vqs(), which tries to
      hold the mutexes of all virtqueues. This may confuse lockdep into
      reporting a possible deadlock, because the locks being held belong to
      the same class. Switch to mutex_lock_nested() to avoid the false
      positive.
      
      Fixes: 6b1e6cc7 ("vhost: new device IOTLB API")
      Reported-by: syzbot+dbb7c1161485e61b0241@syzkaller.appspotmail.com
      Signed-off-by: Jason Wang <jasowang@redhat.com>
      Acked-by: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      bd3ccdc6
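Why a subclass annotation silences the warning in the vhost commit above can be shown with a toy lock-class checker. This is a heavily simplified model, not lockdep itself: it only flags acquiring a lock whose (class, subclass) pair is already held by the same task. The names `toy_held`, `toy_task`, `toy_acquire()`, and `toy_lock_all_vqs()` are hypothetical, and using the queue index as the subclass is an assumption made for the example.

```c
#include <assert.h>

#define TOY_MAX_HELD 8

struct toy_held {
    int class_id;
    int subclass;
};

struct toy_task {
    struct toy_held held[TOY_MAX_HELD];
    int depth;
};

/* Returns 0 on success, -1 if the checker would warn, -2 if the table is full. */
static int toy_acquire(struct toy_task *t, int class_id, int subclass)
{
    for (int i = 0; i < t->depth; i++) {
        if (t->held[i].class_id == class_id &&
            t->held[i].subclass == subclass)
            return -1; /* same class and subclass already held */
    }
    if (t->depth == TOY_MAX_HELD)
        return -2;
    t->held[t->depth].class_id = class_id;
    t->held[t->depth].subclass = subclass;
    t->depth++;
    return 0;
}

/* Locks every queue mutex; all of them share class 1 in this model. */
static int toy_lock_all_vqs(struct toy_task *t, int nvqs, int nested)
{
    for (int i = 0; i < nvqs; i++) {
        int err = toy_acquire(t, 1, nested ? i : 0);

        if (err < 0)
            return err;
    }
    return 0;
}
```

In this model, locking two queues without distinct subclasses trips the checker on the second acquisition, while giving each queue its own subclass lets all acquisitions pass.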
  2. 22 Feb, 2018 15 commits