1. 23 Feb, 2017 40 commits
    • Al Viro's avatar
      sg_write()/bsg_write() is not fit to be called under KERNEL_DS · 249741c2
      Al Viro authored
      commit 128394ef upstream.
      
      Both damn things interpret userland pointers embedded into the payload;
      worse, they are actually traversing those.  Leaving aside the bad
      API design, this is very much _not_ safe to call with KERNEL_DS.
      Bail out early if that happens.
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      249741c2
    • Marcelo Ricardo Leitner's avatar
      sctp: validate chunk len before actually using it · 1685cd22
      Marcelo Ricardo Leitner authored
      commit bf911e98 upstream.
      
      Andrey Konovalov reported that KASAN detected that SCTP was using a slab
      beyond the boundaries. It was caused because when handling out of the
      blue packets in function sctp_sf_ootb() it was checking the chunk len
      only after already processing the first chunk, validating only for the
      2nd and subsequent ones.
      
      The fix is to just move the check upwards so it's also validated for the
      1st chunk.
      Reported-by: default avatarAndrey Konovalov <andreyknvl@google.com>
      Tested-by: default avatarAndrey Konovalov <andreyknvl@google.com>
      Signed-off-by: default avatarMarcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Reviewed-by: default avatarXin Long <lucien.xin@gmail.com>
      Acked-by: default avatarNeil Horman <nhorman@tuxdriver.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      [bwh: Backported to 3.16: moved code is slightly different]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      1685cd22
    • Zhou Chengming's avatar
      sysctl: Drop reference added by grab_header in proc_sys_readdir · 0b66ea3b
      Zhou Chengming authored
      commit 93362fa4 upstream.
      
      Fixes CVE-2016-9191, proc_sys_readdir doesn't drop reference
      added by grab_header when return from !dir_emit_dots path.
      It can cause any path called unregister_sysctl_table will
      wait forever.
      
      The calltrace of CVE-2016-9191:
      
      [ 5535.960522] Call Trace:
      [ 5535.963265]  [<ffffffff817cdaaf>] schedule+0x3f/0xa0
      [ 5535.968817]  [<ffffffff817d33fb>] schedule_timeout+0x3db/0x6f0
      [ 5535.975346]  [<ffffffff817cf055>] ? wait_for_completion+0x45/0x130
      [ 5535.982256]  [<ffffffff817cf0d3>] wait_for_completion+0xc3/0x130
      [ 5535.988972]  [<ffffffff810d1fd0>] ? wake_up_q+0x80/0x80
      [ 5535.994804]  [<ffffffff8130de64>] drop_sysctl_table+0xc4/0xe0
      [ 5536.001227]  [<ffffffff8130de17>] drop_sysctl_table+0x77/0xe0
      [ 5536.007648]  [<ffffffff8130decd>] unregister_sysctl_table+0x4d/0xa0
      [ 5536.014654]  [<ffffffff8130deff>] unregister_sysctl_table+0x7f/0xa0
      [ 5536.021657]  [<ffffffff810f57f5>] unregister_sched_domain_sysctl+0x15/0x40
      [ 5536.029344]  [<ffffffff810d7704>] partition_sched_domains+0x44/0x450
      [ 5536.036447]  [<ffffffff817d0761>] ? __mutex_unlock_slowpath+0x111/0x1f0
      [ 5536.043844]  [<ffffffff81167684>] rebuild_sched_domains_locked+0x64/0xb0
      [ 5536.051336]  [<ffffffff8116789d>] update_flag+0x11d/0x210
      [ 5536.057373]  [<ffffffff817cf61f>] ? mutex_lock_nested+0x2df/0x450
      [ 5536.064186]  [<ffffffff81167acb>] ? cpuset_css_offline+0x1b/0x60
      [ 5536.070899]  [<ffffffff810fce3d>] ? trace_hardirqs_on+0xd/0x10
      [ 5536.077420]  [<ffffffff817cf61f>] ? mutex_lock_nested+0x2df/0x450
      [ 5536.084234]  [<ffffffff8115a9f5>] ? css_killed_work_fn+0x25/0x220
      [ 5536.091049]  [<ffffffff81167ae5>] cpuset_css_offline+0x35/0x60
      [ 5536.097571]  [<ffffffff8115aa2c>] css_killed_work_fn+0x5c/0x220
      [ 5536.104207]  [<ffffffff810bc83f>] process_one_work+0x1df/0x710
      [ 5536.110736]  [<ffffffff810bc7c0>] ? process_one_work+0x160/0x710
      [ 5536.117461]  [<ffffffff810bce9b>] worker_thread+0x12b/0x4a0
      [ 5536.123697]  [<ffffffff810bcd70>] ? process_one_work+0x710/0x710
      [ 5536.130426]  [<ffffffff810c3f7e>] kthread+0xfe/0x120
      [ 5536.135991]  [<ffffffff817d4baf>] ret_from_fork+0x1f/0x40
      [ 5536.142041]  [<ffffffff810c3e80>] ? kthread_create_on_node+0x230/0x230
      
      One cgroup maintainer mentioned that "cgroup is trying to offline
      a cpuset css, which takes place under cgroup_mutex.  The offlining
      ends up trying to drain active usages of a sysctl table which apprently
      is not happening."
      The real reason is that proc_sys_readdir doesn't drop reference added
      by grab_header when return from !dir_emit_dots path. So this cpuset
      offline path will wait here forever.
      
      See here for details: http://www.openwall.com/lists/oss-security/2016/11/04/13
      
      Fixes: f0c3b509 ("[readdir] convert procfs")
      Reported-by: default avatarCAI Qian <caiqian@redhat.com>
      Tested-by: default avatarYang Shukui <yangshukui@huawei.com>
      Signed-off-by: default avatarZhou Chengming <zhouchengming1@huawei.com>
      Acked-by: default avatarAl Viro <viro@ZenIV.linux.org.uk>
      Signed-off-by: default avatarEric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      0b66ea3b
    • Linus Torvalds's avatar
      Fix potential infoleak in older kernels · 17a3ea20
      Linus Torvalds authored
      Not upstream as it is not needed there.
      
      So a patch something like this might be a safe way to fix the
      potential infoleak in older kernels.
      
      THIS IS UNTESTED. It's a very obvious patch, though, so if it compiles
      it probably works. It just initializes the output variable with 0 in
      the inline asm description, instead of doing it in the exception
      handler.
      
      It will generate slightly worse code (a few unnecessary ALU
      operations), but it doesn't have any interactions with the exception
      handler implementation.
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      17a3ea20
    • EunTaik Lee's avatar
      staging/android/ion : fix a race condition in the ion driver · ce626e14
      EunTaik Lee authored
      commit 9590232b upstream.
      
      There is a use-after-free problem in the ion driver.
      This is caused by a race condition in the ion_ioctl()
      function.
      
      A handle has ref count of 1 and two tasks on different
      cpus calls ION_IOC_FREE simultaneously.
      
      cpu 0                                   cpu 1
      -------------------------------------------------------
      ion_handle_get_by_id()
      (ref == 2)
                                  ion_handle_get_by_id()
                                  (ref == 3)
      
      ion_free()
      (ref == 2)
      
      ion_handle_put()
      (ref == 1)
      
                                  ion_free()
                                  (ref == 0 so ion_handle_destroy() is
                                  called
                                  and the handle is freed.)
      
                                  ion_handle_put() is called and it
                                  decreases the slub's next free pointer
      
      The problem is detected as an unaligned access in the
      spin lock functions since it uses load exclusive
       instruction. In some cases it corrupts the slub's
      free pointer which causes a mis-aligned access to the
      next free pointer.(kmalloc returns a pointer like
      ffffc0745b4580aa). And it causes lots of other
      hard-to-debug problems.
      
      This symptom is caused since the first member in the
      ion_handle structure is the reference count and the
      ion driver decrements the reference after it has been
      freed.
      
      To fix this problem client->lock mutex is extended
      to protect all the codes that uses the handle.
      Signed-off-by: default avatarEun Taik Lee <eun.taik.lee@samsung.com>
      Reviewed-by: default avatarLaura Abbott <labbott@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      ce626e14
    • Eric Dumazet's avatar
      tcp: take care of truncations done by sk_filter() · 3d59e6e2
      Eric Dumazet authored
      commit ac6e7800 upstream.
      
      With syzkaller help, Marco Grassi found a bug in TCP stack,
      crashing in tcp_collapse()
      
      Root cause is that sk_filter() can truncate the incoming skb,
      but TCP stack was not really expecting this to happen.
      It probably was expecting a simple DROP or ACCEPT behavior.
      
      We first need to make sure no part of TCP header could be removed.
      Then we need to adjust TCP_SKB_CB(skb)->end_seq
      
      Many thanks to syzkaller team and Marco for giving us a reproducer.
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarMarco Grassi <marco.gra@gmail.com>
      Reported-by: default avatarVladis Dronov <vdronov@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      [bwh: Backported to 3.16: adjust context]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      3d59e6e2
    • Willem de Bruijn's avatar
      dccp: limit sk_filter trim to payload · 49bbb1dc
      Willem de Bruijn authored
      commit 4f0c40d9 upstream.
      
      Dccp verifies packet integrity, including length, at initial rcv in
      dccp_invalid_packet, later pulls headers in dccp_enqueue_skb.
      
      A call to sk_filter in-between can cause __skb_pull to wrap skb->len.
      skb_copy_datagram_msg interprets this as a negative value, so
      (correctly) fails with EFAULT. The negative length is reported in
      ioctl SIOCINQ or possibly in a DCCP_WARN in dccp_close.
      
      Introduce an sk_receive_skb variant that caps how small a filter
      program can trim packets, and call this in dccp with the header
      length. Excessively trimmed packets are now processed normally and
      queued for reception as 0B payloads.
      
      Fixes: 7c657876 ("[DCCP]: Initial implementation")
      Signed-off-by: default avatarWillem de Bruijn <willemb@google.com>
      Acked-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      49bbb1dc
    • Willem de Bruijn's avatar
      rose: limit sk_filter trim to payload · d0fb92f2
      Willem de Bruijn authored
      commit f4979fce upstream.
      
      Sockets can have a filter program attached that drops or trims
      incoming packets based on the filter program return value.
      
      Rose requires data packets to have at least ROSE_MIN_LEN bytes. It
      verifies this on arrival in rose_route_frame and unconditionally pulls
      the bytes in rose_recvmsg. The filter can trim packets to below this
      value in-between, causing pull to fail, leaving the partial header at
      the time of skb_copy_datagram_msg.
      
      Place a lower bound on the size to which sk_filter may trim packets
      by introducing sk_filter_trim_cap and call this for rose packets.
      Signed-off-by: default avatarWillem de Bruijn <willemb@google.com>
      Acked-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      [bwh: Backported to 3.16: adjust context]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      d0fb92f2
    • Ben Hutchings's avatar
      net: Add __sock_queue_rcv_skb() · bed7167a
      Ben Hutchings authored
      Extraxcted from commit e6afc8ac
      "udp: remove headers from UDP packets before queueing".
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      bed7167a
    • Kees Cook's avatar
      fbdev: color map copying bounds checking · 4952d0fe
      Kees Cook authored
      commit 2dc705a9 upstream.
      
      Copying color maps to userspace doesn't check the value of to->start,
      which will cause kernel heap buffer OOB read due to signedness wraps.
      
      CVE-2016-8405
      
      Link: http://lkml.kernel.org/r/20170105224249.GA50925@beast
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: default avatarKees Cook <keescook@chromium.org>
      Reported-by: Peter Pi (@heisecode) of Trend Micro
      Cc: Min Chong <mchong@google.com>
      Cc: Dan Carpenter <dan.carpenter@oracle.com>
      Cc: Tomi Valkeinen <tomi.valkeinen@ti.com>
      Cc: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      4952d0fe
    • Phil Turnbull's avatar
      netfilter: nfnetlink: correctly validate length of batch messages · 8a984a47
      Phil Turnbull authored
      commit c58d6c93 upstream.
      
      If nlh->nlmsg_len is zero then an infinite loop is triggered because
      'skb_pull(skb, msglen);' pulls zero bytes.
      
      The calculation in nlmsg_len() underflows if 'nlh->nlmsg_len <
      NLMSG_HDRLEN' which bypasses the length validation and will later
      trigger an out-of-bound read.
      
      If the length validation does fail then the malformed batch message is
      copied back to userspace. However, we cannot do this because the
      nlh->nlmsg_len can be invalid. This leads to an out-of-bounds read in
      netlink_ack:
      
          [   41.455421] ==================================================================
          [   41.456431] BUG: KASAN: slab-out-of-bounds in memcpy+0x1d/0x40 at addr ffff880119e79340
          [   41.456431] Read of size 4294967280 by task a.out/987
          [   41.456431] =============================================================================
          [   41.456431] BUG kmalloc-512 (Not tainted): kasan: bad access detected
          [   41.456431] -----------------------------------------------------------------------------
          ...
          [   41.456431] Bytes b4 ffff880119e79310: 00 00 00 00 d5 03 00 00 b0 fb fe ff 00 00 00 00  ................
          [   41.456431] Object ffff880119e79320: 20 00 00 00 10 00 05 00 00 00 00 00 00 00 00 00   ...............
          [   41.456431] Object ffff880119e79330: 14 00 0a 00 01 03 fc 40 45 56 11 22 33 10 00 05  .......@EV."3...
          [   41.456431] Object ffff880119e79340: f0 ff ff ff 88 99 aa bb 00 14 00 0a 00 06 fe fb  ................
                                                  ^^ start of batch nlmsg with
                                                     nlmsg_len=4294967280
          ...
          [   41.456431] Memory state around the buggy address:
          [   41.456431]  ffff880119e79400: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
          [   41.456431]  ffff880119e79480: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
          [   41.456431] >ffff880119e79500: 00 00 00 00 fc fc fc fc fc fc fc fc fc fc fc fc
          [   41.456431]                                ^
          [   41.456431]  ffff880119e79580: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
          [   41.456431]  ffff880119e79600: fc fc fc fc fc fc fc fc fc fc fb fb fb fb fb fb
          [   41.456431] ==================================================================
      
      Fix this with better validation of nlh->nlmsg_len and by setting
      NFNL_BATCH_FAILURE if any batch message fails length validation.
      
      CAP_NET_ADMIN is required to trigger the bugs.
      
      Fixes: 9ea2aa8b ("netfilter: nfnetlink: validate nfnetlink header from batch")
      Signed-off-by: default avatarPhil Turnbull <phil.turnbull@oracle.com>
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      [bwh: Backported to 3.16:
       - We don't have an error list so don't call nfnl_err_reset()
       - Set 'success' variable instead of 'status']
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      8a984a47
    • Benjamin Tissoires's avatar
      HID: core: prevent out-of-bound readings · e137da9c
      Benjamin Tissoires authored
      commit 50220dea upstream.
      
      Plugging a Logitech DJ receiver with KASAN activated raises a bunch of
      out-of-bound readings.
      
      The fields are allocated up to MAX_USAGE, meaning that potentially, we do
      not have enough fields to fit the incoming values.
      Add checks and silence KASAN.
      Signed-off-by: default avatarBenjamin Tissoires <benjamin.tissoires@redhat.com>
      Signed-off-by: default avatarJiri Kosina <jkosina@suse.cz>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      e137da9c
    • Lars-Peter Clausen's avatar
      usb: gadget: f_fs: Fix use-after-free · 0fbed614
      Lars-Peter Clausen authored
      commit 38740a5b upstream.
      
      When using asynchronous read or write operations on the USB endpoints the
      issuer of the IO request is notified by calling the ki_complete() callback
      of the submitted kiocb when the URB has been completed.
      
      Calling this ki_complete() callback will free kiocb. Make sure that the
      structure is no longer accessed beyond that point, otherwise undefined
      behaviour might occur.
      
      Fixes: 2e4c7553 ("usb: gadget: f_fs: add aio support")
      Signed-off-by: default avatarLars-Peter Clausen <lars@metafoo.de>
      Signed-off-by: default avatarFelipe Balbi <felipe.balbi@linux.intel.com>
      [bwh: Backported to 3.16:
       - Adjust filename
       - We only use kiocb::private, not kiocb::ki_flags]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      0fbed614
    • Peter Zijlstra's avatar
      perf/core: Fix concurrent sys_perf_event_open() vs. 'move_group' race · fe525a28
      Peter Zijlstra authored
      commit 321027c1 upstream.
      
      Di Shen reported a race between two concurrent sys_perf_event_open()
      calls where both try and move the same pre-existing software group
      into a hardware context.
      
      The problem is exactly that described in commit:
      
        f63a8daa ("perf: Fix event->ctx locking")
      
      ... where, while we wait for a ctx->mutex acquisition, the event->ctx
      relation can have changed under us.
      
      That very same commit failed to recognise sys_perf_event_context() as an
      external access vector to the events and thereby didn't apply the
      established locking rules correctly.
      
      So while one sys_perf_event_open() call is stuck waiting on
      mutex_lock_double(), the other (which owns said locks) moves the group
      about. So by the time the former sys_perf_event_open() acquires the
      locks, the context we've acquired is stale (and possibly dead).
      
      Apply the established locking rules as per perf_event_ctx_lock_nested()
      to the mutex_lock_double() for the 'move_group' case. This obviously means
      we need to validate state after we acquire the locks.
      
      Reported-by: Di Shen (Keen Lab)
      Tested-by: default avatarJohn Dias <joaodias@google.com>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Min Chong <mchong@google.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Fixes: f63a8daa ("perf: Fix event->ctx locking")
      Link: http://lkml.kernel.org/r/20170106131444.GZ3174@twins.programming.kicks-ass.netSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      [bwh: Backported to 3.16:
       - Use ACCESS_ONCE() instead of READ_ONCE()
       - Test perf_event::group_flags instead of group_caps
       - Add the err_locked cleanup block, which we didn't need before
       - Adjust context]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      fe525a28
    • Peter Zijlstra's avatar
      perf: Do not double free · 5838f3ef
      Peter Zijlstra authored
      commit 13005627 upstream.
      
      In case of: err_file: fput(event_file), we'll end up calling
      perf_release() which in turn will free the event.
      
      Do not then free the event _again_.
      Tested-by: default avatarAlexander Shishkin <alexander.shishkin@linux.intel.com>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Reviewed-by: default avatarAlexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: dvyukov@google.com
      Cc: eranian@google.com
      Cc: oleg@redhat.com
      Cc: panand@redhat.com
      Cc: sasha.levin@oracle.com
      Cc: vince@deater.net
      Link: http://lkml.kernel.org/r/20160224174947.697350349@infradead.orgSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      5838f3ef
    • Peter Zijlstra's avatar
      perf: Fix event->ctx locking · 18163dd1
      Peter Zijlstra authored
      commit f63a8daa upstream.
      
      There have been a few reported issues wrt. the lack of locking around
      changing event->ctx. This patch tries to address those.
      
      It avoids the whole rwsem thing; and while it appears to work, please
      give it some thought in review.
      
      What I did fail at is sensible runtime checks on the use of
      event->ctx, the RCU use makes it very hard.
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Link: http://lkml.kernel.org/r/20150123125834.209535886@infradead.orgSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      18163dd1
    • Peter Hurley's avatar
      tty: Prevent ldisc drivers from re-using stale tty fields · 16c30eea
      Peter Hurley authored
      commit dd42bf11 upstream.
      
      Line discipline drivers may mistakenly misuse ldisc-related fields
      when initializing. For example, a failure to initialize tty->receive_room
      in the N_GIGASET_M101 line discipline was recently found and fixed [1].
      Now, the N_X25 line discipline has been discovered accessing the previous
      line discipline's already-freed private data [2].
      
      Harden the ldisc interface against misuse by initializing revelant
      tty fields before instancing the new line discipline.
      
      [1]
          commit fd98e941
          Author: Tilman Schmidt <tilman@imap.cc>
          Date:   Tue Jul 14 00:37:13 2015 +0200
      
          isdn/gigaset: reset tty->receive_room when attaching ser_gigaset
      
      [2] Report from Sasha Levin <sasha.levin@oracle.com>
          [  634.336761] ==================================================================
          [  634.338226] BUG: KASAN: use-after-free in x25_asy_open_tty+0x13d/0x490 at addr ffff8800a743efd0
          [  634.339558] Read of size 4 by task syzkaller_execu/8981
          [  634.340359] =============================================================================
          [  634.341598] BUG kmalloc-512 (Not tainted): kasan: bad access detected
          ...
          [  634.405018] Call Trace:
          [  634.405277] dump_stack (lib/dump_stack.c:52)
          [  634.405775] print_trailer (mm/slub.c:655)
          [  634.406361] object_err (mm/slub.c:662)
          [  634.406824] kasan_report_error (mm/kasan/report.c:138 mm/kasan/report.c:236)
          [  634.409581] __asan_report_load4_noabort (mm/kasan/report.c:279)
          [  634.411355] x25_asy_open_tty (drivers/net/wan/x25_asy.c:559 (discriminator 1))
          [  634.413997] tty_ldisc_open.isra.2 (drivers/tty/tty_ldisc.c:447)
          [  634.414549] tty_set_ldisc (drivers/tty/tty_ldisc.c:567)
          [  634.415057] tty_ioctl (drivers/tty/tty_io.c:2646 drivers/tty/tty_io.c:2879)
          [  634.423524] do_vfs_ioctl (fs/ioctl.c:43 fs/ioctl.c:607)
          [  634.427491] SyS_ioctl (fs/ioctl.c:622 fs/ioctl.c:613)
          [  634.427945] entry_SYSCALL_64_fastpath (arch/x86/entry/entry_64.S:188)
      
      Cc: Tilman Schmidt <tilman@imap.cc>
      Cc: Sasha Levin <sasha.levin@oracle.com>
      Signed-off-by: default avatarPeter Hurley <peter@hurleysoftware.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      16c30eea
    • Peter Zijlstra's avatar
      perf: Fix race in swevent hash · 311c3b32
      Peter Zijlstra authored
      commit 12ca6ad2 upstream.
      
      There's a race on CPU unplug where we free the swevent hash array
      while it can still have events on. This will result in a
      use-after-free which is BAD.
      
      Simply do not free the hash array on unplug. This leaves the thing
      around and no use-after-free takes place.
      
      When the last swevent dies, we do a for_each_possible_cpu() iteration
      anyway to clean these up, at which time we'll free it, so no leakage
      will occur.
      Reported-by: default avatarSasha Levin <sasha.levin@oracle.com>
      Tested-by: default avatarSasha Levin <sasha.levin@oracle.com>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      311c3b32
    • Calvin Owens's avatar
      sg: Fix double-free when drives detach during SG_IO · 79cfd634
      Calvin Owens authored
      commit f3951a37 upstream.
      
      In sg_common_write(), we free the block request and return -ENODEV if
      the device is detached in the middle of the SG_IO ioctl().
      
      Unfortunately, sg_finish_rem_req() also tries to free srp->rq, so we
      end up freeing rq->cmd in the already free rq object, and then free
      the object itself out from under the current user.
      
      This ends up corrupting random memory via the list_head on the rq
      object. The most common crash trace I saw is this:
      
        ------------[ cut here ]------------
        kernel BUG at block/blk-core.c:1420!
        Call Trace:
        [<ffffffff81281eab>] blk_put_request+0x5b/0x80
        [<ffffffffa0069e5b>] sg_finish_rem_req+0x6b/0x120 [sg]
        [<ffffffffa006bcb9>] sg_common_write.isra.14+0x459/0x5a0 [sg]
        [<ffffffff8125b328>] ? selinux_file_alloc_security+0x48/0x70
        [<ffffffffa006bf95>] sg_new_write.isra.17+0x195/0x2d0 [sg]
        [<ffffffffa006cef4>] sg_ioctl+0x644/0xdb0 [sg]
        [<ffffffff81170f80>] do_vfs_ioctl+0x90/0x520
        [<ffffffff81258967>] ? file_has_perm+0x97/0xb0
        [<ffffffff811714a1>] SyS_ioctl+0x91/0xb0
        [<ffffffff81602afb>] tracesys+0xdd/0xe2
          RIP [<ffffffff81281e04>] __blk_put_request+0x154/0x1a0
      
      The solution is straightforward: just set srp->rq to NULL in the
      failure branch so that sg_finish_rem_req() doesn't attempt to re-free
      it.
      
      Additionally, since sg_rq_end_io() will never be called on the object
      when this happens, we need to free memory backing ->cmd if it isn't
      embedded in the object itself.
      
      KASAN was extremely helpful in finding the root cause of this bug.
      Signed-off-by: default avatarCalvin Owens <calvinowens@fb.com>
      Acked-by: default avatarDouglas Gilbert <dgilbert@interlog.com>
      Signed-off-by: default avatarMartin K. Petersen <martin.petersen@oracle.com>
      [bwh: Backported to 3.16:
       - sg_finish_rem_req() would not free srp->rq->cmd so don't do it here either
       - Adjust context]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      79cfd634
    • Dan Carpenter's avatar
      ser_gigaset: return -ENOMEM on error instead of success · fd61e9c0
      Dan Carpenter authored
      commit 93a97c50 upstream.
      
      If we can't allocate the resources in gigaset_initdriver() then we
      should return -ENOMEM instead of zero.
      
      Fixes: 2869b23e ("[PATCH] drivers/isdn/gigaset: new M101 driver (v2)")
      Signed-off-by: default avatarDan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      fd61e9c0
    • 추지호's avatar
      can: peak: fix bad memory access and free sequence · 6623e57d
      추지호 authored
      commit b67d0dd7 upstream.
      
      Fix for bad memory access while disconnecting. netdev is freed before
      private data free, and dev is accessed after freeing netdev.
      
      This makes a slub problem, and it raise kernel oops with slub debugger
      config.
      Signed-off-by: default avatarJiho Chu <jiho.chu@samsung.com>
      Signed-off-by: default avatarMarc Kleine-Budde <mkl@pengutronix.de>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      6623e57d
    • Marc Kleine-Budde's avatar
      can: raw: raw_setsockopt: limit number of can_filter that can be set · 2dd51775
      Marc Kleine-Budde authored
      commit 332b05ca upstream.
      
      This patch adds a check to limit the number of can_filters that can be
      set via setsockopt on CAN_RAW sockets. Otherwise allocations > MAX_ORDER
      are not prevented resulting in a warning.
      
      Reference: https://lkml.org/lkml/2016/12/2/230Reported-by: default avatarAndrey Konovalov <andreyknvl@google.com>
      Tested-by: default avatarAndrey Konovalov <andreyknvl@google.com>
      Signed-off-by: default avatarMarc Kleine-Budde <mkl@pengutronix.de>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      2dd51775
    • John David Anglin's avatar
      parisc: Remove unnecessary TLB purges from flush_dcache_page_asm and flush_icache_page_asm · 2cc1ff50
      John David Anglin authored
      commit febe4296 upstream.
      
      We have four routines in pacache.S that use temporary alias pages:
      copy_user_page_asm(), clear_user_page_asm(), flush_dcache_page_asm() and
      flush_icache_page_asm().  copy_user_page_asm() and clear_user_page_asm()
      don't purge the TLB entry used for the operation.
      flush_dcache_page_asm() and flush_icache_page_asm do purge the entry.
      
      Presumably, this was thought to optimize TLB use.  However, the
      operation is quite heavy weight on PA 1.X processors as we need to take
      the TLB lock and a TLB broadcast is sent to all processors.
      
      This patch removes the purges from flush_dcache_page_asm() and
      flush_icache_page_asm.
      Signed-off-by: default avatarJohn David Anglin  <dave.anglin@bell.net>
      Signed-off-by: default avatarHelge Deller <deller@gmx.de>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      2cc1ff50
    • John David Anglin's avatar
      parisc: Purge TLB before setting PTE · 76d794a5
      John David Anglin authored
      commit c78e710c upstream.
      
      The attached change interchanges the order of purging the TLB and
      setting the corresponding page table entry.  TLB purges are strongly
      ordered.  It occurred to me one night that setting the PTE first might
      have subtle ordering issues on SMP machines and cause random memory
      corruption.
      
      A TLB lock guards the insertion of user TLB entries.  So after the TLB
      is purged, a new entry can't be inserted until the lock is released.
      This ensures that the new PTE value is used when the lock is released.
      
      Since making this change, no random segmentation faults have been
      observed on the Debian hppa buildd servers.
      Signed-off-by: default avatarJohn David Anglin  <dave.anglin@bell.net>
      Signed-off-by: default avatarHelge Deller <deller@gmx.de>
      [bwh: Backported to 3.16: adjust context]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      76d794a5
    • Miklos Szeredi's avatar
      fuse: fix clearing suid, sgid for chown() · a4ab1eeb
      Miklos Szeredi authored
      commit c01638f5 upstream.
      
      Basically, the pjdfstests set the ownership of a file to 06555, and then
      chowns it (as root) to a new uid/gid. Prior to commit a09f99ed ("fuse:
      fix killing s[ug]id in setattr"), fuse would send down a setattr with both
      the uid/gid change and a new mode.  Now, it just sends down the uid/gid
      change.
      
      Technically this is NOTABUG, since POSIX doesn't _require_ that we clear
      these bits for a privileged process, but Linux (wisely) has done that and I
      think we don't want to change that behavior here.
      
      This is caused by the use of should_remove_suid(), which will always return
      0 when the process has CAP_FSETID.
      
      In fact we really don't need to be calling should_remove_suid() at all,
      since we've already been indicated that we should remove the suid, we just
      don't want to use a (very) stale mode for that.
      
      This patch should fix the above as well as simplify the logic.
      Reported-by: default avatarJeff Layton <jlayton@redhat.com>
      Signed-off-by: default avatarMiklos Szeredi <mszeredi@redhat.com>
      Fixes: a09f99ed ("fuse: fix killing s[ug]id in setattr")
      Reviewed-by: default avatarJeff Layton <jlayton@redhat.com>
      [bwh: Backported to 3.16: adjust context, indentation]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      a4ab1eeb
    • Peter Zijlstra (Intel)'s avatar
      perf/x86: Fix full width counter, counter overflow · dd613d33
      Peter Zijlstra (Intel) authored
      commit 7f612a7f upstream.
      
      Lukasz reported that perf stat counters overflow handling is broken on KNL/SLM.
      
      Both these parts have full_width_write set, and that does indeed have
      a problem. In order to deal with counter wrap, we must sample the
      counter at at least half the counter period (see also the sampling
      theorem) such that we can unambiguously reconstruct the count.
      
      However commit:
      
        069e0c3c ("perf/x86/intel: Support full width counting")
      
      sets the sampling interval to the full period, not half.
      
      Fixing that exposes another issue, in that we must not sign extend the
      delta value when we shift it right; the counter cannot have
      decremented after all.
      
      With both these issues fixed, counter overflow functions correctly
      again.
      Reported-by: default avatarLukasz Odzioba <lukasz.odzioba@intel.com>
      Tested-by: default avatarLiang, Kan <kan.liang@intel.com>
      Tested-by: default avatarOdzioba, Lukasz <lukasz.odzioba@intel.com>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Fixes: 069e0c3c ("perf/x86/intel: Support full width counting")
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      [bwh: Backported to 3.16: adjust filenames]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      dd613d33
    • Florian Fainelli's avatar
      net: ep93xx_eth: Do not crash unloading module · 4ba569b0
      Florian Fainelli authored
      commit c823abac upstream.
      
      When we unload the ep93xx_eth, whether we have opened the network
      interface or not, we will either hit a kernel paging request error, or a
      simple NULL pointer de-reference because:
      
      - if ep93xx_open has been called, we have created a valid DMA mapping
        for ep->descs, when we call ep93xx_stop, we also call
        ep93xx_free_buffers, ep->descs now has a stale value
      
      - if ep93xx_open has not been called, we have a NULL pointer for
        ep->descs, so performing any operation against that address just won't
        work
      
      Fix this by adding a NULL pointer check for ep->descs which means that
      ep93xx_free_buffers() was able to successfully tear down the descriptors
      and free the DMA cookie as well.
      
      Fixes: 1d22e05d ("[PATCH] Cirrus Logic ep93xx ethernet driver")
      Signed-off-by: default avatarFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      4ba569b0
    • Kees Cook's avatar
      net: ping: check minimum size on ICMP header length · 718d94ed
      Kees Cook authored
      commit 0eab121e upstream.
      
      Prior to commit c0371da6 ("put iov_iter into msghdr") in v3.19, there
      was no check that the iovec contained enough bytes for an ICMP header,
      and the read loop would walk across neighboring stack contents. Since the
      iov_iter conversion, bad arguments are noticed, but the returned error is
      EFAULT. Returning EINVAL is a clearer error and also solves the problem
      prior to v3.19.
      
      This was found using trinity with KASAN on v3.18:
      
      BUG: KASAN: stack-out-of-bounds in memcpy_fromiovec+0x60/0x114 at addr ffffffc071077da0
      Read of size 8 by task trinity-c2/9623
      page:ffffffbe034b9a08 count:0 mapcount:0 mapping:          (null) index:0x0
      flags: 0x0()
      page dumped because: kasan: bad access detected
      CPU: 0 PID: 9623 Comm: trinity-c2 Tainted: G    BU         3.18.0-dirty #15
      Hardware name: Google Tegra210 Smaug Rev 1,3+ (DT)
      Call trace:
      [<ffffffc000209c98>] dump_backtrace+0x0/0x1ac arch/arm64/kernel/traps.c:90
      [<ffffffc000209e54>] show_stack+0x10/0x1c arch/arm64/kernel/traps.c:171
      [<     inline     >] __dump_stack lib/dump_stack.c:15
      [<ffffffc000f18dc4>] dump_stack+0x7c/0xd0 lib/dump_stack.c:50
      [<     inline     >] print_address_description mm/kasan/report.c:147
      [<     inline     >] kasan_report_error mm/kasan/report.c:236
      [<ffffffc000373dcc>] kasan_report+0x380/0x4b8 mm/kasan/report.c:259
      [<     inline     >] check_memory_region mm/kasan/kasan.c:264
      [<ffffffc00037352c>] __asan_load8+0x20/0x70 mm/kasan/kasan.c:507
      [<ffffffc0005b9624>] memcpy_fromiovec+0x5c/0x114 lib/iovec.c:15
      [<     inline     >] memcpy_from_msg include/linux/skbuff.h:2667
      [<ffffffc000ddeba0>] ping_common_sendmsg+0x50/0x108 net/ipv4/ping.c:674
      [<ffffffc000dded30>] ping_v4_sendmsg+0xd8/0x698 net/ipv4/ping.c:714
      [<ffffffc000dc91dc>] inet_sendmsg+0xe0/0x12c net/ipv4/af_inet.c:749
      [<     inline     >] __sock_sendmsg_nosec net/socket.c:624
      [<     inline     >] __sock_sendmsg net/socket.c:632
      [<ffffffc000cab61c>] sock_sendmsg+0x124/0x164 net/socket.c:643
      [<     inline     >] SYSC_sendto net/socket.c:1797
      [<ffffffc000cad270>] SyS_sendto+0x178/0x1d8 net/socket.c:1761
      
      CVE-2016-8399
      Reported-by: default avatarQidan He <i@flanker017.me>
      Fixes: c319b4d7 ("net: ipv4: add IPPROTO_ICMP socket kind")
      Signed-off-by: default avatarKees Cook <keescook@chromium.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      718d94ed
    • Michal Kubeček's avatar
      tipc: check minimum bearer MTU · cd539242
      Michal Kubeček authored
      commit 3de81b75 upstream.
      
      Qian Zhang (张谦) reported a potential socket buffer overflow in
      tipc_msg_build() which is also known as CVE-2016-8632: due to
      insufficient checks, a buffer overflow can occur if MTU is too short for
      even tipc headers. As anyone can set device MTU in a user/net namespace,
      this issue can be abused by a regular user.
      
      As agreed in the discussion on Ben Hutchings' original patch, we should
      check the MTU at the moment a bearer is attached rather than for each
      processed packet. We also need to repeat the check when bearer MTU is
      adjusted to new device MTU. UDP case also needs a check to avoid
      overflow when calculating bearer MTU.
      
      Fixes: b97bf3fd ("[TIPC] Initial merge")
      Signed-off-by: default avatarMichal Kubecek <mkubecek@suse.cz>
      Reported-by: default avatarQian Zhang (张谦) <zhangqian-c@360.cn>
      Acked-by: default avatarYing Xue <ying.xue@windriver.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      [bwh: Backported to 3.16:
       - Adjust context
       - Duplicate macro definitions in bearer.h to avoid mutual inclusion
       - NETDEV_DOWN and NETDEV_CHANGEMTU cases in net notifier were combined
       - Drop changes in udp_media.c]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      cd539242
    • Chris Brandt's avatar
      sh_eth: remove unchecked interrupts for RZ/A1 · e07d83a4
      Chris Brandt authored
      commit 33d446db upstream.
      
      When streaming a lot of data and the RZ/A1 can't keep up, some status bits
      will get set that are not being checked or cleared which cause the
      following messages and the Ethernet driver to stop working. This
      patch fixes that issue.
      
      irq 21: nobody cared (try booting with the "irqpoll" option)
      handlers:
      [<c036b71c>] sh_eth_interrupt
      Disabling IRQ #21
      
      Fixes: db893473 ("sh_eth: Add support for r7s72100")
      Signed-off-by: default avatarChris Brandt <chris.brandt@renesas.com>
      Acked-by: default avatarSergei Shtylyov <sergei.shtylyov@cogentembedded.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      e07d83a4
    • Florian Fainelli's avatar
      net: bcmgenet: Utilize correct struct device for all DMA operations · a0579d7e
      Florian Fainelli authored
      commit 8c4799ac upstream.
      
      __bcmgenet_tx_reclaim() and bcmgenet_free_rx_buffers() are not using the
      same struct device during unmap that was used for the map operation,
      which makes DMA-API debugging warn about it. Fix this by always using
      &priv->pdev->dev throughout the driver, using an identical device
      reference for all map/unmap calls.
      
      Fixes: 1c1008c7 ("net: bcmgenet: add main driver file")
      Signed-off-by: default avatarFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      [bwh: Backported to 3.16:
       - Adjust context
       - Also fix call from bcmgenet_desc_rx(), fixed upstream in commit
         d6707bec "net: bcmgenet: rewrite bcmgenet_rx_refill()"]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      a0579d7e
    • Eli Cooper's avatar
      ipv6: Set skb->protocol properly for local output · 770e9171
      Eli Cooper authored
      commit b4e479a9 upstream.
      
      When xfrm is applied to TSO/GSO packets, it follows this path:
      
          xfrm_output() -> xfrm_output_gso() -> skb_gso_segment()
      
      where skb_gso_segment() relies on skb->protocol to function properly.
      
      This patch sets skb->protocol to ETH_P_IPV6 before dst_output() is called,
      fixing a bug where GSO packets sent through an ipip6 tunnel are dropped
      when xfrm is involved.
      Signed-off-by: default avatarEli Cooper <elicooper@gmx.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      [bwh: Backported to 3.16: adjust context]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      770e9171
    • Eli Cooper's avatar
      ipv4: Set skb->protocol properly for local output · bb8815da
      Eli Cooper authored
      commit f4180439 upstream.
      
      When xfrm is applied to TSO/GSO packets, it follows this path:
      
          xfrm_output() -> xfrm_output_gso() -> skb_gso_segment()
      
      where skb_gso_segment() relies on skb->protocol to function properly.
      
      This patch sets skb->protocol to ETH_P_IP before dst_output() is called,
      fixing a bug where GSO packets sent through a sit tunnel are dropped
      when xfrm is involved.
      Signed-off-by: default avatarEli Cooper <elicooper@gmx.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      [bwh: Backported to 3.16: adjust context]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      bb8815da
    • Philip Pettersson's avatar
      packet: fix race condition in packet_set_ring · 943e7299
      Philip Pettersson authored
      commit 84ac7260 upstream.
      
      When packet_set_ring creates a ring buffer it will initialize a
      struct timer_list if the packet version is TPACKET_V3. This value
      can then be raced by a different thread calling setsockopt to
      set the version to TPACKET_V1 before packet_set_ring has finished.
      
      This leads to a use-after-free on a function pointer in the
      struct timer_list when the socket is closed as the previously
      initialized timer will not be deleted.
      
      The bug is fixed by taking lock_sock(sk) in packet_setsockopt when
      changing the packet version while also taking the lock at the start
      of packet_set_ring.
      
      Fixes: f6fb8f10 ("af-packet: TPACKET_V3 flexible buffer implementation.")
      Signed-off-by: default avatarPhilip Pettersson <philip.pettersson@gmail.com>
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      943e7299
    • Thomas Gleixner's avatar
      locking/rtmutex: Prevent dequeue vs. unlock race · b9419575
      Thomas Gleixner authored
      commit dbb26055 upstream.
      
      David reported a futex/rtmutex state corruption. It's caused by the
      following problem:
      
      CPU0		CPU1		CPU2
      
      l->owner=T1
      		rt_mutex_lock(l)
      		lock(l->wait_lock)
      		l->owner = T1 | HAS_WAITERS;
      		enqueue(T2)
      		boost()
      		  unlock(l->wait_lock)
      		schedule()
      
      				rt_mutex_lock(l)
      				lock(l->wait_lock)
      				l->owner = T1 | HAS_WAITERS;
      				enqueue(T3)
      				boost()
      				  unlock(l->wait_lock)
      				schedule()
      		signal(->T2)	signal(->T3)
      		lock(l->wait_lock)
      		dequeue(T2)
      		deboost()
      		  unlock(l->wait_lock)
      				lock(l->wait_lock)
      				dequeue(T3)
      				  ===> wait list is now empty
      				deboost()
      				 unlock(l->wait_lock)
      		lock(l->wait_lock)
      		fixup_rt_mutex_waiters()
      		  if (wait_list_empty(l)) {
      		    owner = l->owner & ~HAS_WAITERS;
      		    l->owner = owner
      		     ==> l->owner = T1
      		  }
      
      				lock(l->wait_lock)
      rt_mutex_unlock(l)		fixup_rt_mutex_waiters()
      				  if (wait_list_empty(l)) {
      				    owner = l->owner & ~HAS_WAITERS;
      cmpxchg(l->owner, T1, NULL)
       ===> Success (l->owner = NULL)
      				    l->owner = owner
      				     ==> l->owner = T1
      				  }
      
      That means the problem is caused by fixup_rt_mutex_waiters() which does the
      RMW to clear the waiters bit unconditionally when there are no waiters in
      the rtmutexes rbtree.
      
      This can be fatal: A concurrent unlock can release the rtmutex in the
      fastpath because the waiters bit is not set. If the cmpxchg() gets in the
      middle of the RMW operation then the previous owner, which just unlocked
      the rtmutex is set as the owner again when the write takes place after the
      successfull cmpxchg().
      
      The solution is rather trivial: verify that the owner member of the rtmutex
      has the waiters bit set before clearing it. This does not require a
      cmpxchg() or other atomic operations because the waiters bit can only be
      set and cleared with the rtmutex wait_lock held. It's also safe against the
      fast path unlock attempt. The unlock attempt via cmpxchg() will either see
      the bit set and take the slowpath or see the bit cleared and release it
      atomically in the fastpath.
      
      It's remarkable that the test program provided by David triggers on ARM64
      and MIPS64 really quick, but it refuses to reproduce on x86-64, while the
      problem exists there as well. That refusal might explain that this got not
      discovered earlier despite the bug existing from day one of the rtmutex
      implementation more than 10 years ago.
      
      Thanks to David for meticulously instrumenting the code and providing the
      information which allowed to decode this subtle problem.
      Reported-by: default avatarDavid Daney <ddaney@caviumnetworks.com>
      Tested-by: default avatarDavid Daney <david.daney@cavium.com>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      Acked-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Sebastian Siewior <bigeasy@linutronix.de>
      Cc: Will Deacon <will.deacon@arm.com>
      Fixes: 23f78d4a ("[PATCH] pi-futex: rt mutex core")
      Link: http://lkml.kernel.org/r/20161130210030.351136722@linutronix.deSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      [bwh: Backported to 3.16: use ACCESS_ONCE() instead of {READ,WRITE}_ONCE()]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      b9419575
    • Sven Eckelmann's avatar
      batman-adv: Check for alloc errors when preparing TT local data · 907742aa
      Sven Eckelmann authored
      commit c2d0f48a upstream.
      
      batadv_tt_prepare_tvlv_local_data can fail to allocate the memory for the
      new TVLV block. The caller is informed about this problem with the returned
      length of 0. Not checking this value results in an invalid memory access
      when either tt_data or tt_change is accessed.
      Reported-by: default avatarDan Carpenter <dan.carpenter@oracle.com>
      Fixes: 7ea7b4a1 ("batman-adv: make the TT CRC logic VLAN specific")
      Signed-off-by: default avatarSven Eckelmann <sven@narfation.org>
      Signed-off-by: default avatarSimon Wunderlich <sw@simonwunderlich.de>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      907742aa
    • Andrew Donnellan's avatar
      powerpc/eeh: Fix deadlock when PE frozen state can't be cleared · 1344f5b6
      Andrew Donnellan authored
      commit 409bf7f8 upstream.
      
      In eeh_reset_device(), we take the pci_rescan_remove_lock immediately after
      after we call eeh_reset_pe() to reset the PCI controller. We then call
      eeh_clear_pe_frozen_state(), which can return an error. In this case, we
      bail out of eeh_reset_device() without calling pci_unlock_rescan_remove().
      
      Add a call to pci_unlock_rescan_remove() in the eeh_clear_pe_frozen_state()
      error path so that we don't cause a deadlock later on.
      Reported-by: default avatarPradipta Ghosh <pradghos@in.ibm.com>
      Fixes: 78954700 ("powerpc/eeh: Avoid I/O access during PE reset")
      Signed-off-by: default avatarAndrew Donnellan <andrew.donnellan@au1.ibm.com>
      Acked-by: default avatarRussell Currey <ruscur@russell.cc>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      [bwh: Backported to 3.16: adjust context]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      1344f5b6
    • Hongxu Jia's avatar
      netfilter: arp_tables: fix invoking 32bit "iptable -P INPUT ACCEPT" failed in 64bit kernel · 85dc53fe
      Hongxu Jia authored
      commit 17a49cd5 upstream.
      
      Since 09d96860 ("netfilter: x_tables: do compat validation via
      translate_table"), it used compatr structure to assign newinfo
      structure.  In translate_compat_table of ip_tables.c and ip6_tables.c,
      it used compatr->hook_entry to replace info->hook_entry and
      compatr->underflow to replace info->underflow, but not do the same
      replacement in arp_tables.c.
      
      It caused invoking 32-bit "arptbale -P INPUT ACCEPT" failed in 64bit
      kernel.
      --------------------------------------
      root@qemux86-64:~# arptables -P INPUT ACCEPT
      root@qemux86-64:~# arptables -P INPUT ACCEPT
      ERROR: Policy for `INPUT' offset 448 != underflow 0
      arptables: Incompatible with this kernel
      --------------------------------------
      
      Fixes: 09d96860 ("netfilter: x_tables: do compat validation via translate_table")
      Signed-off-by: default avatarHongxu Jia <hongxu.jia@windriver.com>
      Acked-by: default avatarFlorian Westphal <fw@strlen.de>
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      85dc53fe
    • Johan Hovold's avatar
      pwm: Fix device reference leak · 61c64c67
      Johan Hovold authored
      commit 0e1614ac upstream.
      
      Make sure to drop the reference to the parent device taken by
      class_find_device() after "unexporting" any children when deregistering
      a PWM chip.
      
      Fixes: 0733424c ("pwm: Unexport children before chip removal")
      Signed-off-by: default avatarJohan Hovold <johan@kernel.org>
      Signed-off-by: default avatarThierry Reding <thierry.reding@gmail.com>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      61c64c67
    • Jack Morgenstein's avatar
      net/mlx4: Fix uninitialized fields in rule when adding promiscuous mode to... · 96239ede
      Jack Morgenstein authored
      net/mlx4: Fix uninitialized fields in rule when adding promiscuous mode to device managed flow steering
      
      commit 44b911e7 upstream.
      
      In procedure mlx4_flow_steer_promisc_add(), several fields
      were left uninitialized in the rule structure.
      Correctly initialize these fields.
      
      Fixes: 592e49dd ("net/mlx4: Implement promiscuous mode with device managed flow-steering")
      Signed-off-by: default avatarJack Morgenstein <jackm@dev.mellanox.co.il>
      Signed-off-by: default avatarTariq Toukan <tariqt@mellanox.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      96239ede