1. 21 Nov, 2014 31 commits
    • dm btree: fix a recursion depth bug in btree walking code · fe30b804
      Joe Thornber authored
      commit 9b460d36 upstream.
      
      The walk code was using a 'ro_spine' to hold its locked btree nodes.
      But this data structure is designed for the rolling lock scheme, and
      as such automatically unlocks blocks that are two steps up the call
      chain.  This is not suitable for the simple recursive walk algorithm,
      which retraces its steps.
      
      This code is only used by the persistent array code, which in turn is
      only used by dm-cache.  In order to trigger it you need to have a
      mapping tree that is more than 2 levels deep; which equates to 8-16
      million cache blocks.  For instance a 4T ssd with a very small block
      size of 32k only just triggers this bug.
      
      The fix just places the locked blocks on the stack, and stops using
      the ro_spine altogether.
      Signed-off-by: Joe Thornber <ejt@redhat.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • block: Fix computation of merged request priority · 9d0c2702
      Jan Kara authored
      commit ece9c72a upstream.
      
      Priority of a merged request is computed by ioprio_best(). If one of the
      requests has undefined priority (IOPRIO_CLASS_NONE) and another request
      has priority from IOPRIO_CLASS_BE, the function will return the
      undefined priority which is wrong. Fix the function to properly return
      priority of a request with the defined priority.
      
      Fixes: d58cdfb8
      Signed-off-by: Jan Kara <jack@suse.cz>
      Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
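
The rule described in the ioprio_best() message above can be sketched in user-space C. This is a simplified model, not the kernel function: the macro layout (class stored above a 13-bit level field) is an assumption made for the sketch, and `merged_prio` is a hypothetical name.

```c
#include <assert.h>

/* Simplified stand-ins for the kernel's ioprio encoding (assumed layout:
 * class in the bits above IOPRIO_CLASS_SHIFT, level in the bits below). */
#define IOPRIO_CLASS_SHIFT 13
enum { IOPRIO_CLASS_NONE, IOPRIO_CLASS_RT, IOPRIO_CLASS_BE, IOPRIO_CLASS_IDLE };
#define IOPRIO_PRIO_VALUE(cls, level) (((cls) << IOPRIO_CLASS_SHIFT) | (level))
#define IOPRIO_PRIO_CLASS(prio) ((prio) >> IOPRIO_CLASS_SHIFT)

/* Hypothetical helper: picks the priority of a merged request. */
unsigned short merged_prio(unsigned short a, unsigned short b)
{
	/* An undefined priority must never win over a defined one. */
	if (IOPRIO_PRIO_CLASS(a) == IOPRIO_CLASS_NONE)
		return b;
	if (IOPRIO_PRIO_CLASS(b) == IOPRIO_CLASS_NONE)
		return a;
	/* Both defined: the numerically smaller value is the better priority
	 * (lower class number first, then lower level within a class). */
	return a < b ? a : b;
}
```

With this model, merging an IOPRIO_CLASS_NONE request with an IOPRIO_CLASS_BE request yields the BE priority regardless of argument order, which is the behaviour the commit message asks for.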
    • parisc: Use compat layer for msgctl, shmat, shmctl and semtimedop syscalls · aca0ab61
      Helge Deller authored
      commit 2fe749f5 upstream.
      
      Switch over the msgctl, shmat, shmctl and semtimedop syscalls to use the compat
      layer. The problem was found with the debian procenv package, which called
      	shmctl(0, SHM_INFO, &info);
      in which the shmctl syscall then overwrote parts of the surrounding areas on
      the stack on which the info variable was stored and thus led to a segfault
      later on.
      
      Additionally fix the definition of struct shminfo64 to use unsigned longs like
      the other architectures. This has no impact on userspace since we only have a
      32bit userspace up to now.
      Signed-off-by: Helge Deller <deller@gmx.de>
      Cc: John David Anglin <dave.anglin@bell.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • scsi: only re-lock door after EH on devices that were reset · 945f341a
      Christoph Hellwig authored
      commit 48379270 upstream.
      
      Setups that use the blk-mq I/O path can lock up if a host with a single
      device that has its door locked enters EH.  Make sure to only send the
      command to re-lock the door to devices that actually were reset and thus
      might have lost their state.  Otherwise the EH code might get blocked
      on blk_get_request as all requests for non-reset devices might be in use.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reported-by: Meelis Roos <meelis.roos@ut.ee>
      Tested-by: Meelis Roos <meelis.roos@ut.ee>
      Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • nfs: fix pnfs direct write memory leak · 32049712
      Peng Tao authored
      commit 8c393f9a upstream.
      
      For pNFS direct writes, layout driver may dynamically allocate ds_cinfo.buckets.
      So we need to take care to free them when freeing dreq.
      
      Ideally this needs to be done inside layout driver where ds_cinfo.buckets
      are allocated. But buckets are attached to dreq and reused across LD IO iterations.
      So I feel it's OK to free them in the generic layer.
      Signed-off-by: Peng Tao <tao.peng@primarydata.com>
      Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • firewire: cdev: prevent kernel stack leaking into ioctl arguments · 562e4948
      Stefan Richter authored
      commit eaca2d8e upstream.
      
      Found by the UC-KLEE tool:  A user could supply less input to
      firewire-cdev ioctls than write- or write/read-type ioctl handlers
      expect.  The handlers used data from uninitialized kernel stack then.
      
      This could partially leak back to the user if the kernel subsequently
      generated fw_cdev_event_'s (to be read from the firewire-cdev fd)
      which notably would contain the _u64 closure field which many of the
      ioctl argument structures contain.
      
      The fact that the handlers would act on random garbage input is a
      lesser issue since all handlers must check their input anyway.
      
      The fix simply always null-initializes the entire ioctl argument buffer
      regardless of the actual length of expected user input.  That is, a
      runtime overhead of memset(..., 40) is added to each firewire-cdev
      ioctl() call.  [Comment from Clemens Ladisch:  This part of the stack is
      most likely to be already in the cache.]
      
      Remarks:
        - There was never any leak from kernel stack to the ioctl output
          buffer itself.  IOW, it was not possible to read kernel stack by a
          read-type or write/read-type ioctl alone; the leak could at most
          happen in combination with read()ing subsequent event data.
        - The actual expected minimum user input of each ioctl from
          include/uapi/linux/firewire-cdev.h is, in bytes:
          [0x00] = 32, [0x05] =  4, [0x0a] = 16, [0x0f] = 20, [0x14] = 16,
          [0x01] = 36, [0x06] = 20, [0x0b] =  4, [0x10] = 20, [0x15] = 20,
          [0x02] = 20, [0x07] =  4, [0x0c] =  0, [0x11] =  0, [0x16] =  8,
          [0x03] =  4, [0x08] = 24, [0x0d] = 20, [0x12] = 36, [0x17] = 12,
          [0x04] = 20, [0x09] = 24, [0x0e] =  4, [0x13] = 40, [0x18] =  4.
      Reported-by: David Ramos <daramos@stanford.edu>
      Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
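
The firewire-cdev fix above (zero the whole argument buffer before copying in however many bytes the user supplied) can be illustrated with a small user-space sketch. The buffer size of 40 comes from the message; `fetch_ioctl_arg` and `stale_bytes_after_fetch` are hypothetical names, and plain `memcpy` stands in for the kernel's copy from user space.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define ARG_BUF_SIZE 40  /* largest ioctl argument size named in the message */

/* Hypothetical helper: copy `len` bytes of user input into a buffer that is
 * fully zeroed first, so short input can never expose stale contents. */
int fetch_ioctl_arg(unsigned char buf[ARG_BUF_SIZE], const void *user, size_t len)
{
	if (len > ARG_BUF_SIZE)
		return -1;
	memset(buf, 0, ARG_BUF_SIZE);  /* the added memset(..., 40) */
	memcpy(buf, user, len);        /* stand-in for copy_from_user() */
	return 0;
}

/* Demo: poison the buffer the way stale stack data would, fetch `len` bytes,
 * and count any non-zero bytes left beyond the supplied input. */
size_t stale_bytes_after_fetch(size_t len)
{
	unsigned char buf[ARG_BUF_SIZE];
	unsigned char input[ARG_BUF_SIZE];
	size_t i, stale = 0;

	memset(buf, 0xAA, sizeof(buf));
	memset(input, 0x11, sizeof(input));
	if (fetch_ioctl_arg(buf, input, len) != 0)
		return (size_t)-1;
	for (i = len; i < ARG_BUF_SIZE; i++)
		if (buf[i] != 0)
			stale++;
	return stale;
}
```

Whatever length the caller supplies, every byte past it reads back as zero, so a handler that later echoes a field such as the closure value cannot return leftover stack contents.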
    • arm64: __clear_user: handle exceptions on strb · 16640ca6
      Kyle McMartin authored
      commit 97fc1543 upstream.
      
      ARM64 currently doesn't fix up faults on the single-byte (strb) case of
      __clear_user... which means that we can cause a nasty kernel panic as an
      ordinary user with any multiple of PAGE_SIZE+1 read from /dev/zero.
      i.e.: dd if=/dev/zero of=foo ibs=1 count=1 (or ibs=65537, etc.)
      
      This is a pretty obscure bug in the general case since we'll only
      __do_kernel_fault (since there's no extable entry for pc) if the
      mmap_sem is contended. However, with CONFIG_DEBUG_VM enabled, we'll
      always fault.
      
      if (!down_read_trylock(&mm->mmap_sem)) {
      	if (!user_mode(regs) && !search_exception_tables(regs->pc))
      		goto no_context;
      retry:
      	down_read(&mm->mmap_sem);
      } else {
      	/*
      	 * The above down_read_trylock() might have succeeded in
      	 * which case, we'll have missed the might_sleep() from
      	 * down_read().
      	 */
      	might_sleep();
      	if (!user_mode(regs) && !search_exception_tables(regs->pc))
      		goto no_context;
      }
      
      Fix that by adding an extable entry for the strb instruction, since it
      touches user memory, similar to the other stores in __clear_user.
      Signed-off-by: Kyle McMartin <kyle@redhat.com>
      Reported-by: Miloš Prchlík <mprchlik@redhat.com>
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ARM: 8198/1: make kuser helpers depend on MMU · 3e1f6a23
      Nathan Lynch authored
      commit 08b964ff upstream.
      
      The kuser helpers page is not set up on non-MMU systems, so it does
      not make sense to allow CONFIG_KUSER_HELPERS to be enabled when
      CONFIG_MMU=n.  Allowing it to be set on !MMU results in an oops in
      set_tls (used in execve and the arm_syscall trap handler):
      
      Unhandled exception: IPSR = 00000005 LR = fffffff1
      CPU: 0 PID: 1 Comm: swapper Not tainted 3.18.0-rc1-00041-ga30465a #216
      task: 8b838000 ti: 8b82a000 task.ti: 8b82a000
      PC is at flush_thread+0x32/0x40
      LR is at flush_thread+0x21/0x40
      pc : [<8f00157a>]    lr : [<8f001569>]    psr: 4100000b
      sp : 8b82be20  ip : 00000000  fp : 8b83c000
      r10: 00000001  r9 : 88018c84  r8 : 8bb85000
      r7 : 8b838000  r6 : 00000000  r5 : 8bb77400  r4 : 8b82a000
      r3 : ffff0ff0  r2 : 8b82a000  r1 : 00000000  r0 : 88020354
      xPSR: 4100000b
      CPU: 0 PID: 1 Comm: swapper Not tainted 3.18.0-rc1-00041-ga30465a #216
      [<8f002bc1>] (unwind_backtrace) from [<8f002033>] (show_stack+0xb/0xc)
      [<8f002033>] (show_stack) from [<8f00265b>] (__invalid_entry+0x4b/0x4c)
      
      As best I can tell this issue existed for the set_tls ARM syscall
      before commit fbfb872f "ARM: 8148/1: flush TLS and thumbee
      register state during exec" consolidated the TLS manipulation code
      into the set_tls helper function, but now that we're using it to flush
      register state during execve, !MMU users encounter the oops at the
      first exec.
      
      Prevent CONFIG_MMU=n configurations from enabling
      CONFIG_KUSER_HELPERS.
      
      Fixes: fbfb872f (ARM: 8148/1: flush TLS and thumbee register state during exec)
      Signed-off-by: Nathan Lynch <nathan_lynch@mentor.com>
      Reported-by: Stefan Agner <stefan@agner.ch>
      Acked-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • drm/radeon: add missing crtc unlock when setting up the MC · 9458c73c
      Alex Deucher authored
      commit f0d7bfb9 upstream.
      
      Need to unlock the crtc after updating the blanking state.
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • mac80211: fix use-after-free in defragmentation · 2e613ff8
      Johannes Berg authored
      commit b8fff407 upstream.
      
      Upon receiving the last fragment, all but the first fragment
      are freed, but the multicast check for statistics at the end
      of the function refers to the current skb (the last fragment)
      causing a use-after-free bug.
      
      Since multicast frames cannot be fragmented and we check for
      this early in the function, just modify that check to also
      do the accounting to fix the issue.
      Reported-by: Yosef Khyal <yosefx.khyal@intel.com>
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • macvtap: Fix csum_start when VLAN tags are present · b34fafa3
      Herbert Xu authored
      commit 3ce9b20f upstream.
      
      When VLAN is in use in macvtap_put_user, we end up setting
      csum_start to the wrong place.  The result is that whoever
      ends up setting the checksum will corrupt the packet instead
      of writing the checksum to the expected location; usually this
      means writing the checksum with an offset of -4.
      
      This patch fixes this by adjusting csum_start when VLAN tags are
      detected.
      
      Fixes: f09e2249 ("macvtap: restore vlan header on user read")
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
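
The offset arithmetic behind the macvtap fix above is small enough to show directly. This is only a sketch of the adjustment, not the driver code: `user_visible_csum_start` is a hypothetical name, and the only fact relied on is that an 802.1Q VLAN header is 4 bytes long, which matches the "-4" error the message reports.

```c
#include <assert.h>

#define VLAN_HLEN 4  /* length of an 802.1Q VLAN header in bytes */

/* Hypothetical helper: when the VLAN header is re-inserted into the frame
 * handed to user space, everything after it moves by VLAN_HLEN bytes, so the
 * checksum start offset has to move with it. */
unsigned int user_visible_csum_start(unsigned int csum_start, int has_vlan_tag)
{
	return csum_start + (has_vlan_tag ? VLAN_HLEN : 0);
}
```

For example, a checksum region that starts at offset 34 in the untagged frame starts at offset 38 once the tag has been restored; without the adjustment the checksum lands 4 bytes too early.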
    • iwlwifi: configure the LTR · 597b3896
      Emmanuel Grumbach authored
      commit 9180ac50 upstream.
      
      The LTR is the handshake between the device and the root
      complex about the latency allowed when the bus exits power
      save. This configuration was missing and this led to high
      latency in the link power up. The end user could experience
      high latency in the network because of this.
      Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • libceph: do not crash on large auth tickets · 8169b2b9
      Ilya Dryomov authored
      commit aaef3170 upstream.
      
      Large (greater than 32k, the value of PAGE_ALLOC_COSTLY_ORDER) auth
      tickets will have their buffers vmalloc'ed, which leads to the
      following crash in crypto:
      
      [   28.685082] BUG: unable to handle kernel paging request at ffffeb04000032c0
      [   28.686032] IP: [<ffffffff81392b42>] scatterwalk_pagedone+0x22/0x80
      [   28.686032] PGD 0
      [   28.688088] Oops: 0000 [#1] PREEMPT SMP
      [   28.688088] Modules linked in:
      [   28.688088] CPU: 0 PID: 878 Comm: kworker/0:2 Not tainted 3.17.0-vm+ #305
      [   28.688088] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
      [   28.688088] Workqueue: ceph-msgr con_work
      [   28.688088] task: ffff88011a7f9030 ti: ffff8800d903c000 task.ti: ffff8800d903c000
      [   28.688088] RIP: 0010:[<ffffffff81392b42>]  [<ffffffff81392b42>] scatterwalk_pagedone+0x22/0x80
      [   28.688088] RSP: 0018:ffff8800d903f688  EFLAGS: 00010286
      [   28.688088] RAX: ffffeb04000032c0 RBX: ffff8800d903f718 RCX: ffffeb04000032c0
      [   28.688088] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff8800d903f750
      [   28.688088] RBP: ffff8800d903f688 R08: 00000000000007de R09: ffff8800d903f880
      [   28.688088] R10: 18df467c72d6257b R11: 0000000000000000 R12: 0000000000000010
      [   28.688088] R13: ffff8800d903f750 R14: ffff8800d903f8a0 R15: 0000000000000000
      [   28.688088] FS:  00007f50a41c7700(0000) GS:ffff88011fc00000(0000) knlGS:0000000000000000
      [   28.688088] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      [   28.688088] CR2: ffffeb04000032c0 CR3: 00000000da3f3000 CR4: 00000000000006b0
      [   28.688088] Stack:
      [   28.688088]  ffff8800d903f698 ffffffff81392ca8 ffff8800d903f6e8 ffffffff81395d32
      [   28.688088]  ffff8800dac96000 ffff880000000000 ffff8800d903f980 ffff880119b7e020
      [   28.688088]  ffff880119b7e010 0000000000000000 0000000000000010 0000000000000010
      [   28.688088] Call Trace:
      [   28.688088]  [<ffffffff81392ca8>] scatterwalk_done+0x38/0x40
      [   28.688088]  [<ffffffff81392ca8>] scatterwalk_done+0x38/0x40
      [   28.688088]  [<ffffffff81395d32>] blkcipher_walk_done+0x182/0x220
      [   28.688088]  [<ffffffff813990bf>] crypto_cbc_encrypt+0x15f/0x180
      [   28.688088]  [<ffffffff81399780>] ? crypto_aes_set_key+0x30/0x30
      [   28.688088]  [<ffffffff8156c40c>] ceph_aes_encrypt2+0x29c/0x2e0
      [   28.688088]  [<ffffffff8156d2a3>] ceph_encrypt2+0x93/0xb0
      [   28.688088]  [<ffffffff8156d7da>] ceph_x_encrypt+0x4a/0x60
      [   28.688088]  [<ffffffff8155b39d>] ? ceph_buffer_new+0x5d/0xf0
      [   28.688088]  [<ffffffff8156e837>] ceph_x_build_authorizer.isra.6+0x297/0x360
      [   28.688088]  [<ffffffff8112089b>] ? kmem_cache_alloc_trace+0x11b/0x1c0
      [   28.688088]  [<ffffffff8156b496>] ? ceph_auth_create_authorizer+0x36/0x80
      [   28.688088]  [<ffffffff8156ed83>] ceph_x_create_authorizer+0x63/0xd0
      [   28.688088]  [<ffffffff8156b4b4>] ceph_auth_create_authorizer+0x54/0x80
      [   28.688088]  [<ffffffff8155f7c0>] get_authorizer+0x80/0xd0
      [   28.688088]  [<ffffffff81555a8b>] prepare_write_connect+0x18b/0x2b0
      [   28.688088]  [<ffffffff81559289>] try_read+0x1e59/0x1f10
      
      This is because we set up crypto scatterlists as if all buffers were
      kmalloc'ed.  Fix it.
      Signed-off-by: Ilya Dryomov <idryomov@redhat.com>
      Reviewed-by: Sage Weil <sage@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • xtensa: re-wire umount syscall to sys_oldumount · 86800fc6
      Max Filippov authored
      commit 2651cc69 upstream.
      
      Userspace actually passes a single parameter (path name) to the umount
      syscall, so new umount just fails. Fix it by requesting old umount
      syscall implementation and re-wiring umount to it.
      Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ALSA: usb-audio: Fix memory leak in FTU quirk · 9a262b4b
      Takashi Iwai authored
      commit 1a290581 upstream.
      
      The M-Audio FastTrack Ultra quirk doesn't release the kzalloc'ed memory.
      This patch adds the private_free callback to release it properly.
      Signed-off-by: Takashi Iwai <tiwai@suse.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ahci: disable MSI instead of NCQ on Samsung pci-e SSDs on macbooks · 9921a2d5
      Tejun Heo authored
      commit 66a7cbc3 upstream.
      
      Samsung pci-e SSDs on macbooks failed miserably on NCQ commands, so
      67809f85 ("ahci: disable NCQ on Samsung pci-e SSDs on macbooks")
      disabled NCQ on them.  It turns out that NCQ is fine as long as MSI is
      not used, so let's turn off MSI and leave NCQ on.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=60731
      Tested-by: <dorin@i51.org>
      Tested-by: Imre Kaloz <kaloz@openwrt.org>
      Fixes: 67809f85 ("ahci: disable NCQ on Samsung pci-e SSDs on macbooks")
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ahci: Add Device IDs for Intel Sunrise Point PCH · 4886eb4a
      James Ralston authored
      commit 690000b9 upstream.
      
      This patch adds the AHCI-mode SATA Device IDs for the Intel Sunrise Point PCH.
      Signed-off-by: James Ralston <james.d.ralston@intel.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • audit: keep inode pinned · bd501a2e
      Miklos Szeredi authored
      commit 799b6014 upstream.
      
      Audit rules disappear when an inode they watch is evicted from the cache.
      This is likely not what we want.
      
      The guilty commit is "fsnotify: allow marks to not pin inodes in core",
      which didn't take into account that audit_tree adds watches with a zero
      mask.
      
      Adding any mask should fix this.
      
      Fixes: 90b1e7a5 ("fsnotify: allow marks to not pin inodes in core")
      Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
      Signed-off-by: Paul Moore <pmoore@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • x86, x32, audit: Fix x32's AUDIT_ARCH wrt audit · 89b27dc7
      Andy Lutomirski authored
      commit 81f49a8f upstream.
      
      is_compat_task() is the wrong check for audit arch; the check should
      be is_ia32_task(): x32 syscalls should be AUDIT_ARCH_X86_64, not
      AUDIT_ARCH_I386.
      
      CONFIG_AUDITSYSCALL is currently incompatible with x32, so this has
      no visible effect.
      Signed-off-by: Andy Lutomirski <luto@amacapital.net>
      Link: http://lkml.kernel.org/r/a0138ed8c709882aec06e4acc30bfa9b623b8717.1409954077.git.luto@amacapital.net
      Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
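
The difference between the two checks in the x32 audit fix above can be modelled in a few lines. This is an illustrative sketch, not the kernel code: `enum task_abi` and the two `audit_arch_*` functions are hypothetical, and the predicates only mimic what is_compat_task() and is_ia32_task() distinguish (x32 counts as compat but is not ia32). The two constants are the standard AUDIT_ARCH values.

```c
#include <assert.h>
#include <stdint.h>

#define AUDIT_ARCH_I386   0x40000003u  /* EM_386 | little-endian flag */
#define AUDIT_ARCH_X86_64 0xC000003Eu  /* EM_X86_64 | 64-bit | little-endian flags */

/* Hypothetical model of the three x86 syscall ABIs. */
enum task_abi { ABI_NATIVE_64, ABI_IA32, ABI_X32 };

static int is_compat(enum task_abi abi) { return abi != ABI_NATIVE_64; }
static int is_ia32(enum task_abi abi)   { return abi == ABI_IA32; }

/* Old behaviour: any compat task is reported as i386, including x32. */
uint32_t audit_arch_before(enum task_abi abi)
{
	return is_compat(abi) ? AUDIT_ARCH_I386 : AUDIT_ARCH_X86_64;
}

/* Fixed behaviour: only genuine ia32 tasks are reported as i386. */
uint32_t audit_arch_after(enum task_abi abi)
{
	return is_ia32(abi) ? AUDIT_ARCH_I386 : AUDIT_ARCH_X86_64;
}
```

Only the x32 case changes between the two functions; native 64-bit and ia32 tasks are classified identically before and after.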
    • sparc32: Implement xchg and atomic_xchg using ATOMIC_HASH locks · 96920746
      Andreas Larsson authored
      [ Upstream commit 1a17fdc4 ]
      
      Atomicity between xchg and cmpxchg cannot be guaranteed when xchg is
      implemented with a swap and cmpxchg is implemented with locks.
      Without this, e.g. mcs_spin_lock and mcs_spin_unlock are broken.
      Signed-off-by: Andreas Larsson <andreas@gaisler.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
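
The sparc32 message above turns on one idea: xchg and cmpxchg are only atomic with respect to each other if both go through the same lock. A minimal user-space sketch of that idea follows, using C11 atomics as a stand-in spinlock. The table size, the hash function and all function names are assumptions made for the example, not the kernel's definitions.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define ATOMIC_HASH_SIZE 4  /* assumed table size, must be a power of two */

/* One spinlock per bucket; zero means unlocked. */
static atomic_int hash_locks[ATOMIC_HASH_SIZE];

/* Pick the lock that guards a given word by hashing its address. */
static atomic_int *lock_for(const volatile void *addr)
{
	return &hash_locks[((uintptr_t)addr >> 8) & (ATOMIC_HASH_SIZE - 1)];
}

static void bucket_lock(atomic_int *l)
{
	while (atomic_exchange_explicit(l, 1, memory_order_acquire))
		; /* spin until the previous holder releases */
}

static void bucket_unlock(atomic_int *l)
{
	atomic_store_explicit(l, 0, memory_order_release);
}

/* Both operations take the same per-address lock, so neither can slip in
 * between the load and the store of the other. */
int locked_xchg(volatile int *ptr, int new_val)
{
	atomic_int *l = lock_for(ptr);
	int old;

	bucket_lock(l);
	old = *ptr;
	*ptr = new_val;
	bucket_unlock(l);
	return old;
}

int locked_cmpxchg(volatile int *ptr, int expected, int new_val)
{
	atomic_int *l = lock_for(ptr);
	int old;

	bucket_lock(l);
	old = *ptr;
	if (old == expected)
		*ptr = new_val;
	bucket_unlock(l);
	return old;
}

/* Demo: returns 1 if the three operations behave as expected. */
int demo_xchg_cmpxchg(void)
{
	volatile int word = 1;
	int a = locked_xchg(&word, 2);       /* old value 1, word becomes 2 */
	int b = locked_cmpxchg(&word, 2, 3); /* matches, word becomes 3 */
	int c = locked_cmpxchg(&word, 2, 4); /* no match, word stays 3 */

	return a == 1 && b == 2 && c == 3 && word == 3;
}
```

If xchg instead used a bare swap instruction, it could modify the word while cmpxchg held the lock between its compare and its store, which is the breakage the message describes for mcs_spin_lock and mcs_spin_unlock.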
    • sparc64: Do irq_{enter,exit}() around generic_smp_call_function*(). · 13471246
      David S. Miller authored
      [ Upstream commit ab5c7809 ]
      
      Otherwise rcu_irq_{enter,exit}() do not happen and we get dumps like:
      
      ====================
      [  188.275021] ===============================
      [  188.309351] [ INFO: suspicious RCU usage. ]
      [  188.343737] 3.18.0-rc3-00068-g20f3963d-dirty #54 Not tainted
      [  188.394786] -------------------------------
      [  188.429170] include/linux/rcupdate.h:883 rcu_read_lock() used illegally while idle!
      [  188.505235]
      other info that might help us debug this:
      
      [  188.554230]
      RCU used illegally from idle CPU!
      rcu_scheduler_active = 1, debug_locks = 0
      [  188.637587] RCU used illegally from extended quiescent state!
      [  188.690684] 3 locks held by swapper/7/0:
      [  188.721932]  #0:  (&x->wait#11){......}, at: [<0000000000495de8>] complete+0x8/0x60
      [  188.797994]  #1:  (&p->pi_lock){-.-.-.}, at: [<000000000048510c>] try_to_wake_up+0xc/0x400
      [  188.881343]  #2:  (rcu_read_lock){......}, at: [<000000000048a910>] select_task_rq_fair+0x90/0xb40
      [  188.973043]
      stack backtrace:
      [  188.993879] CPU: 7 PID: 0 Comm: swapper/7 Not tainted 3.18.0-rc3-00068-g20f3963d-dirty #54
      [  189.076187] Call Trace:
      [  189.089719]  [0000000000499360] lockdep_rcu_suspicious+0xe0/0x100
      [  189.147035]  [000000000048a99c] select_task_rq_fair+0x11c/0xb40
      [  189.202253]  [00000000004852d8] try_to_wake_up+0x1d8/0x400
      [  189.252258]  [000000000048554c] default_wake_function+0xc/0x20
      [  189.306435]  [0000000000495554] __wake_up_common+0x34/0x80
      [  189.356448]  [00000000004955b4] __wake_up_locked+0x14/0x40
      [  189.406456]  [0000000000495e08] complete+0x28/0x60
      [  189.448142]  [0000000000636e28] blk_end_sync_rq+0x8/0x20
      [  189.496057]  [0000000000639898] __blk_mq_end_request+0x18/0x60
      [  189.550249]  [00000000006ee014] scsi_end_request+0x94/0x180
      [  189.601286]  [00000000006ee334] scsi_io_completion+0x1d4/0x600
      [  189.655463]  [00000000006e51c4] scsi_finish_command+0xc4/0xe0
      [  189.708598]  [00000000006ed958] scsi_softirq_done+0x118/0x140
      [  189.761735]  [00000000006398ec] __blk_mq_complete_request_remote+0xc/0x20
      [  189.827383]  [00000000004c75d0] generic_smp_call_function_single_interrupt+0x150/0x1c0
      [  189.906581]  [000000000043e514] smp_call_function_single_client+0x14/0x40
      ====================
      
      Based almost entirely upon a patch by Paul E. McKenney.
      Reported-by: Meelis Roos <mroos@linux.ee>
      Tested-by: Meelis Roos <mroos@linux.ee>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • sparc64: Fix crashes in schizo_pcierr_intr_other(). · 865db7fb
      David S. Miller authored
      [ Upstream commit 7da89a2a ]
      
      Meelis Roos reports crashes during bootup on a V480 that look like
      this:
      
      ====================
      [   61.300577] PCI: Scanning PBM /pci@9,600000
      [   61.304867] schizo f009b070: PCI host bridge to bus 0003:00
      [   61.310385] pci_bus 0003:00: root bus resource [io  0x7ffe9000000-0x7ffe9ffffff] (bus address [0x0000-0xffffff])
      [   61.320515] pci_bus 0003:00: root bus resource [mem 0x7fb00000000-0x7fbffffffff] (bus address [0x00000000-0xffffffff])
      [   61.331173] pci_bus 0003:00: root bus resource [bus 00]
      [   61.385344] Unable to handle kernel NULL pointer dereference
      [   61.390970] tsk->{mm,active_mm}->context = 0000000000000000
      [   61.396515] tsk->{mm,active_mm}->pgd = fff000b000002000
      [   61.401716]               \|/ ____ \|/
      [   61.401716]               "@'/ .. \`@"
      [   61.401716]               /_| \__/ |_\
      [   61.401716]                  \__U_/
      [   61.416362] swapper/0(0): Oops [#1]
      [   61.419837] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.18.0-rc1-00422-g2cc91884-dirty #24
      [   61.427975] task: fff000b0fd8e9c40 ti: fff000b0fd928000 task.ti: fff000b0fd928000
      [   61.435426] TSTATE: 0000004480e01602 TPC: 00000000004455e4 TNPC: 00000000004455e8 Y: 00000000    Not tainted
      [   61.445230] TPC: <schizo_pcierr_intr+0x104/0x560>
      [   61.449897] g0: 0000000000000000 g1: 0000000000000000 g2: 0000000000a10f78 g3: 000000000000000a
      [   61.458563] g4: fff000b0fd8e9c40 g5: fff000b0fdd82000 g6: fff000b0fd928000 g7: 000000000000000a
      [   61.467229] o0: 000000000000003d o1: 0000000000000000 o2: 0000000000000006 o3: fff000b0ffa5fc7e
      [   61.475894] o4: 0000000000060000 o5: c000000000000000 sp: fff000b0ffa5f3c1 ret_pc: 00000000004455cc
      [   61.484909] RPC: <schizo_pcierr_intr+0xec/0x560>
      [   61.489500] l0: fff000b0fd8e9c40 l1: 0000000000a20800 l2: 0000000000000000 l3: 000000000119a430
      [   61.498164] l4: 0000000001742400 l5: 00000000011cfbe0 l6: 00000000011319c0 l7: fff000b0fd8ea348
      [   61.506830] i0: 0000000000000000 i1: fff000b0fdb34000 i2: 0000000320000000 i3: 0000000000000000
      [   61.515497] i4: 00060002010b003f i5: 0000040004e02000 i6: fff000b0ffa5f481 i7: 00000000004a9920
      [   61.524175] I7: <handle_irq_event_percpu+0x40/0x140>
      [   61.529099] Call Trace:
      [   61.531531]  [00000000004a9920] handle_irq_event_percpu+0x40/0x140
      [   61.537681]  [00000000004a9a58] handle_irq_event+0x38/0x80
      [   61.543145]  [00000000004ac77c] handle_fasteoi_irq+0xbc/0x200
      [   61.548860]  [00000000004a9084] generic_handle_irq+0x24/0x40
      [   61.554500]  [000000000042be0c] handler_irq+0xac/0x100
      ====================
      
      The problem is that pbm->pci_bus->self is NULL.
      
      This code is trying to go through the standard PCI config space
      interfaces to read the PCI controller's PCI_STATUS register.
      
      This doesn't work, because we more often than not do not enumerate
      the PCI controller as a bona fide PCI device during the OF device
      node scan.  Therefore bus->self remains NULL.
      
      Existing common code for PSYCHO and PSYCHO-like PCI controllers
      handles this properly, by doing the config space access directly.
      
      Do the same here, pbm->pci_ops->{read,write}().
      Reported-by: Meelis Roos <mroos@linux.ee>
      Tested-by: Meelis Roos <mroos@linux.ee>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • sunvdc: don't call VD_OP_GET_VTOC · df6329d2
      Dwight Engen authored
      [ Upstream commit 85b0c6e6 ]
      
      The VD_OP_GET_VTOC operation will succeed only if the vdisk backend has a
      VTOC label, otherwise it will fail. In particular, it will return error
      48 (ENOTSUP) if the disk has an EFI label. VTOC disk labels are already
      handled by directly reading the disk in block/partitions/sun.c (enabled by
      CONFIG_SUN_PARTITION which defaults to y on SPARC). Since port->label is
      unused in the driver, remove the call and the field.
      Signed-off-by: Dwight Engen <dwight.engen@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • vio: fix reuse of vio_dring slot · 891b6057
      Dwight Engen authored
      [ Upstream commit d0aedcd4 ]
      
      vio_dring_avail() will allow use of every dring entry, but when the last
      entry is allocated then dr->prod == dr->cons which is indistinguishable from
      the ring empty condition. This causes the next allocation to reuse an entry.
      When this happens in sunvdc, the server side vds driver begins nack'ing the
      messages and ends up resetting the ldc channel. This problem does not affect
      sunvnet since it checks for < 2.
      
      The fix here is to just never allocate the very last dring slot so that full
      and empty are not the same condition. The request start path was changed to
      check for the ring being full a bit earlier, and to stop the blk_queue if
      there is no space left. The blk_queue will be restarted once the ring is
      only half full again. The number of ring entries was increased to 512 which
      matches the sunvnet and Solaris vdc drivers, and greatly reduces the
      frequency of hitting the ring full condition and the associated blk_queue
      stop/starting. The checks in sunvnet were adjusted to account for
      vio_dring_avail() returning 1 less.
      
      Orabug: 19441666
      OraBZ: 14983
      Signed-off-by: Dwight Engen <dwight.engen@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
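
The full-versus-empty ambiguity described in the vio_dring message above is a general ring-buffer property, and the fix (never hand out the last slot) can be sketched independently of the driver. The function names below are hypothetical, and the sketch assumes producer and consumer indices that wrap at a power-of-two ring size.

```c
#include <assert.h>
#include <stdint.h>

/* Number of entries currently in use; ring_size must be a power of two. */
uint32_t ring_used(uint32_t prod, uint32_t cons, uint32_t ring_size)
{
	return (prod - cons) & (ring_size - 1);
}

/* Entries that may still be allocated. One slot is always held back, so
 * prod == cons can only ever mean "empty", never "full". */
uint32_t ring_avail(uint32_t prod, uint32_t cons, uint32_t ring_size)
{
	return ring_size - 1 - ring_used(prod, cons, ring_size);
}
```

With a 512-entry ring, an empty ring reports 511 available slots and a ring with 511 entries in flight reports 0, so the producer stops one step before its index would catch up with the consumer.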
    • sunvdc: limit each sg segment to a page · 5cf61378
      Dwight Engen authored
      [ Upstream commit 5eed69ff ]
      
      ldc_map_sg() could fail its check that the number of pages referred to
      by the sg scatterlist was <= the number of cookies.
      
      This fixes the issue by doing a similar thing to the xen-blkfront driver,
      ensuring that the scatterlist will only ever contain a segment count <=
      port->ring_cookies, and each segment will be page aligned, and <= page
      size. This ensures that the scatterlist is always mappable.
      
      Orabug: 19347817
      OraBZ: 15945
      Signed-off-by: default avatarDwight Engen <dwight.engen@oracle.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
    • sunvdc: compute vdisk geometry from capacity · 9e23c211
      Allen Pais authored
      [ Upstream commit de5b73f0 ]
      
      The LDom diskserver doesn't return reliable geometry data. In addition,
      the types for all fields in the vio_disk_geom are u16, which were being
      truncated in the cast into the u8's of the Linux struct hd_geometry.
      
      Modify vdc_getgeo() to compute the geometry from the disk's capacity in a
      manner consistent with xen-blkfront::blkif_getgeo().
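
A minimal user-space sketch of deriving a geometry from capacity is shown below. The fixed 255 heads and 63 sectors per track, and the clamp to 0xffff cylinders, are assumptions modelled on the common convention for synthetic geometries; `struct geom` and `geom_from_capacity()` are hypothetical names, with field widths chosen to mirror the u8/u16 limits discussed above.

```c
#include <stdint.h>

/* Hypothetical stand-in for a heads/sectors/cylinders geometry record. */
struct geom {
    uint8_t heads;
    uint8_t sectors;
    uint16_t cylinders;
};

/* Derive a synthetic geometry from a capacity given in sectors. */
struct geom geom_from_capacity(uint64_t nsect)
{
    struct geom g;

    g.heads = 0xff;   /* assumed fixed value */
    g.sectors = 0x3f; /* assumed fixed value */

    /* Integer division rounds down; the cast keeps only the low 16 bits. */
    uint64_t cyl = nsect / ((uint64_t)g.heads * g.sectors);
    g.cylinders = (uint16_t)cyl;

    /* If the truncated value cannot cover the disk, report the maximum. */
    if ((uint64_t)(g.cylinders + 1) * g.heads * g.sectors < nsect)
        g.cylinders = 0xffff;

    return g;
}
```

For a 1 GiB disk (2097152 sectors) this yields 130 cylinders. For very large disks the cylinder count no longer fits in 16 bits, so it is pinned at 0xffff rather than silently wrapping.
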
      Signed-off-by: Dwight Engen <dwight.engen@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • sunvdc: add cdrom and v1.1 protocol support · e4da88a6
      Allen Pais authored
      [ Upstream commit 9bce2182 ]
      
      Interpret the media type from v1.1 protocol to support CDROM/DVD.
      
      For v1.0 protocol, a disk's size continues to be calculated from the
      geometry returned by the vdisk server. The geometry returned by the server
      can be less than the actual number of sectors available in the backing
      image/device due to the rounding in the division used to compute the
      geometry in the vdisk server.
      
      In v1.1 protocol a disk's actual size in sectors is returned during the
      handshake. Use this size when v1.1 protocol is negotiated. Since this size
      is never smaller than the size computed from the former geometry, disks
      created under v1.0 are forward compatible with v1.1, but not vice versa.
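
The rounding effect can be shown with a small worked example. The function name and the 16-head, 63-sector geometry are assumptions chosen only to make the arithmetic concrete; the actual values used by the vdisk server are not stated here.

```c
#include <stdint.h>

/*
 * Hypothetical model: the server derives a cylinder count by integer
 * division, and the client rebuilds the size as C * H * S.
 */
uint64_t size_from_geometry(uint64_t actual_sectors, uint32_t heads,
                            uint32_t sectors_per_track)
{
    uint64_t per_cyl = (uint64_t)heads * sectors_per_track;
    uint64_t cylinders = actual_sectors / per_cyl; /* rounds down */

    return cylinders * per_cyl;
}
```

For a backing device of 1000000 sectors with 16 heads and 63 sectors per track, the division gives 992 cylinders and a rebuilt size of 999936 sectors, so the last 64 sectors are not visible under the geometry-based calculation.
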
      Signed-off-by: Dwight Engen <dwight.engen@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • net: sctp: fix memory leak in auth key management · e79c2487
      Daniel Borkmann authored
      [ Upstream commit 4184b2a7 ]
      
      A very minimal and simple user space application allocating an SCTP
      socket, setting SCTP_AUTH_KEY setsockopt(2) on it and then closing
      the socket again will leak the memory containing the authentication
      key from user space:
      
      unreferenced object 0xffff8800837047c0 (size 16):
        comm "a.out", pid 2789, jiffies 4296954322 (age 192.258s)
        hex dump (first 16 bytes):
          01 00 00 00 04 00 00 00 00 00 00 00 00 00 00 00  ................
        backtrace:
          [<ffffffff816d7e8e>] kmemleak_alloc+0x4e/0xb0
          [<ffffffff811c88d8>] __kmalloc+0xe8/0x270
          [<ffffffffa0870c23>] sctp_auth_create_key+0x23/0x50 [sctp]
          [<ffffffffa08718b1>] sctp_auth_set_key+0xa1/0x140 [sctp]
          [<ffffffffa086b383>] sctp_setsockopt+0xd03/0x1180 [sctp]
          [<ffffffff815bfd94>] sock_common_setsockopt+0x14/0x20
          [<ffffffff815beb61>] SyS_setsockopt+0x71/0xd0
          [<ffffffff816e58a9>] system_call_fastpath+0x12/0x17
          [<ffffffffffffffff>] 0xffffffffffffffff
      
      This is bad for two reasons: we can bring down a machine from user
      space when auth_enable=1, and we leave security-sensitive keying
      material in memory without clearing it after use. The issue is that
      sctp_auth_create_key() already sets the refcount to 1, but after
      allocation sctp_auth_set_key() takes an additional reference on it,
      thus leaving it around when we free the socket.
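
The leak pattern can be modelled in user space as follows. This is a sketch with hypothetical names (`struct auth_key`, `key_create()`, `set_key_buggy()`, `live_keys`), not the SCTP code; it only shows how one extra reference keeps an object alive after its owner drops the reference it knows about.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical refcounted key object. */
struct auth_key {
    int refcnt;
    size_t len;
    unsigned char data[];
};

int live_keys; /* number of keys currently allocated, to make leaks visible */

/* Allocation already hands the caller one reference. */
struct auth_key *key_create(size_t len)
{
    struct auth_key *k = calloc(1, sizeof(*k) + len);

    if (k == NULL)
        return NULL;
    k->refcnt = 1;
    k->len = len;
    live_keys++;
    return k;
}

void key_hold(struct auth_key *k)
{
    k->refcnt++;
}

/* Drop one reference; wipe and free on the last one. */
void key_put(struct auth_key *k)
{
    if (--k->refcnt == 0) {
        /* Real code needs a wipe the compiler cannot optimise away. */
        memset(k->data, 0, k->len);
        free(k);
        live_keys--;
    }
}

/* Buggy pattern: a second reference is taken that nobody ever drops. */
struct auth_key *set_key_buggy(size_t len)
{
    struct auth_key *k = key_create(len);

    if (k != NULL)
        key_hold(k);
    return k;
}

/* Fixed pattern: rely on the reference returned by key_create(). */
struct auth_key *set_key_fixed(size_t len)
{
    return key_create(len);
}
```

With the buggy variant, the single `key_put()` issued at teardown leaves the count at 1, so the key material is neither wiped nor freed.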
      
      Fixes: 65b07e5d ("[SCTP]: API updates to suport SCTP-AUTH extensions.")
      Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
      Cc: Vlad Yasevich <vyasevich@gmail.com>
      Acked-by: Neil Horman <nhorman@tuxdriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • net: sctp: fix NULL pointer dereference in af->from_addr_param on malformed packet · 7031dcb0
      Daniel Borkmann authored
      [ Upstream commit e40607cb ]
      
      An SCTP server doing ASCONF will panic on malformed INIT ping-of-death
      in the form of:
      
        ------------ INIT[PARAM: SET_PRIMARY_IP] ------------>
      
      While the INIT chunk parameter verification dissects many things in
      order to detect malformed input, it fails to check parameters nested
      inside other parameters. E.g. RFC5061, section 4.2.4 proposes a 'set
      primary IP address' parameter in ASCONF, which has as a subparameter an
      address parameter.
      
      So an attacker may send a parameter type other than SCTP_PARAM_IPV4_ADDRESS
      or SCTP_PARAM_IPV6_ADDRESS, param_type2af() will subsequently return 0
      and thus sctp_get_af_specific() returns NULL, too, which we then happily
      dereference unconditionally through af->from_addr_param().
      
      The trace for the log:
      
      BUG: unable to handle kernel NULL pointer dereference at 0000000000000078
      IP: [<ffffffffa01e9c62>] sctp_process_init+0x492/0x990 [sctp]
      PGD 0
      Oops: 0000 [#1] SMP
      [...]
      Pid: 0, comm: swapper Not tainted 2.6.32-504.el6.x86_64 #1 Bochs Bochs
      RIP: 0010:[<ffffffffa01e9c62>]  [<ffffffffa01e9c62>] sctp_process_init+0x492/0x990 [sctp]
      [...]
      Call Trace:
       <IRQ>
       [<ffffffffa01f2add>] ? sctp_bind_addr_copy+0x5d/0xe0 [sctp]
       [<ffffffffa01e1fcb>] sctp_sf_do_5_1B_init+0x21b/0x340 [sctp]
       [<ffffffffa01e3751>] sctp_do_sm+0x71/0x1210 [sctp]
       [<ffffffffa01e5c09>] ? sctp_endpoint_lookup_assoc+0xc9/0xf0 [sctp]
       [<ffffffffa01e61f6>] sctp_endpoint_bh_rcv+0x116/0x230 [sctp]
       [<ffffffffa01ee986>] sctp_inq_push+0x56/0x80 [sctp]
       [<ffffffffa01fcc42>] sctp_rcv+0x982/0xa10 [sctp]
       [<ffffffffa01d5123>] ? ipt_local_in_hook+0x23/0x28 [iptable_filter]
       [<ffffffff8148bdc9>] ? nf_iterate+0x69/0xb0
       [<ffffffff81496d10>] ? ip_local_deliver_finish+0x0/0x2d0
       [<ffffffff8148bf86>] ? nf_hook_slow+0x76/0x120
       [<ffffffff81496d10>] ? ip_local_deliver_finish+0x0/0x2d0
      [...]
      
      A minimal way to address this is to check for NULL as we do on all
      other such occasions where we know sctp_get_af_specific() could
      possibly return with NULL.
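
The shape of such a check can be sketched in user space as follows. All names here (`struct af_ops`, `get_af_specific()`, `process_addr_param()`) are hypothetical stand-ins rather than the kernel functions, and the length checks are simplified; the parameter type values 5 and 6 are the IPv4 and IPv6 address parameter types from RFC 4960.

```c
#include <stddef.h>
#include <stdint.h>

enum { PARAM_IPV4_ADDRESS = 5, PARAM_IPV6_ADDRESS = 6 };

/* Hypothetical per-address-family operations table. */
struct af_ops {
    int (*from_addr_param)(const uint8_t *raw, size_t len);
};

static int from_v4(const uint8_t *raw, size_t len)
{
    (void)raw;
    return len == 4 ? 0 : -1;
}

static int from_v6(const uint8_t *raw, size_t len)
{
    (void)raw;
    return len == 16 ? 0 : -1;
}

static const struct af_ops v4_ops = { from_v4 };
static const struct af_ops v6_ops = { from_v6 };

/* Returns NULL for any parameter type that is not an address. */
const struct af_ops *get_af_specific(uint16_t param_type)
{
    switch (param_type) {
    case PARAM_IPV4_ADDRESS:
        return &v4_ops;
    case PARAM_IPV6_ADDRESS:
        return &v6_ops;
    default:
        return NULL;
    }
}

/* Reject unknown subparameter types instead of dereferencing NULL. */
int process_addr_param(uint16_t param_type, const uint8_t *raw, size_t len)
{
    const struct af_ops *af = get_af_specific(param_type);

    if (af == NULL)
        return -1;
    return af->from_addr_param(raw, len);
}
```

The point is only that attacker-controlled type fields select the table entry, so the lookup result has to be validated before any call through it.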
      
      Fixes: d6de3097 ("[SCTP]: Add the handling of "Set Primary IP Address" parameter to INIT")
      Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
      Cc: Vlad Yasevich <vyasevich@gmail.com>
      Acked-by: Neil Horman <nhorman@tuxdriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • gre6: Move the setting of dev->iflink into the ndo_init functions. · 460ceaa0
      Steffen Klassert authored
      [ Upstream commit f03eb128 ]
      
      Otherwise it gets overwritten by register_netdev().
      Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ip6_tunnel: Use ip6_tnl_dev_init as the ndo_init function. · 2d2a8385
      Steffen Klassert authored
      [ Upstream commit 6c6151da ]
      
      ip6_tnl_dev_init() sets the dev->iflink via a call to
      ip6_tnl_link_config(). After that, register_netdevice()
      sets dev->iflink = -1. So we lose the iflink configuration
      for ipv6 tunnels. Fix this by using ip6_tnl_dev_init() as the
      ndo_init function. Then ip6_tnl_dev_init() is called after
      dev->iflink is set to -1 from register_netdevice().
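
The ordering problem can be modelled with a small user-space sketch. `struct netdev`, `tnl_dev_init()` and `register_dev()` are hypothetical stand-ins; the only behaviour taken from the description above is that registration resets iflink to -1 and then invokes the ndo_init callback.

```c
#include <stddef.h>

struct netdev;

/* Hypothetical subset of a device operations table. */
struct netdev_ops {
    int (*ndo_init)(struct netdev *dev);
};

struct netdev {
    int iflink; /* link index reported for the device */
    int link;   /* configured underlying link */
    const struct netdev_ops *ops;
};

/* Stand-in for the tunnel init routine: derive iflink from the config. */
int tnl_dev_init(struct netdev *dev)
{
    dev->iflink = dev->link;
    return 0;
}

/* Stand-in for registration: reset iflink, then run ndo_init if present. */
int register_dev(struct netdev *dev)
{
    dev->iflink = -1;
    if (dev->ops != NULL && dev->ops->ndo_init != NULL)
        return dev->ops->ndo_init(dev);
    return 0;
}
```

Calling `tnl_dev_init()` by hand before `register_dev()` loses the value, while installing it as `ndo_init` lets it run after the reset so the configured link survives.
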
      Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  2. 14 Nov, 2014 9 commits