1. 25 Aug, 2016 2 commits
  2. 24 Aug, 2016 4 commits
    • Jens Axboe's avatar
      blk-mq: improve warning for running a queue on the wrong CPU · 0e87e58b
      Jens Axboe authored
      __blk_mq_run_hw_queue() currently warns if we are running the queue on a
      CPU that isn't set in its mask. However, this can happen if a CPU is
      being offlined, and the workqueue handling will place the work on CPU0
      instead. Improve the warning so that it only triggers if the batch cpu
      in the hardware queue is currently online.  If it triggers for that
      case, then it's indicative of a flow problem in blk-mq, so we want to
      retain it for that case.
      Signed-off-by: default avatarJens Axboe <axboe@fb.com>
      0e87e58b
    • Jens Axboe's avatar
      blk-mq: don't overwrite rq->mq_ctx · e57690fe
      Jens Axboe authored
      We do this in a few places, if the CPU is offline. This isn't allowed,
      though, since on multi queue hardware, we can't just move a request
      from one software queue to another, if they map to different hardware
      queues. The request and tag isn't valid on another hardware queue.
      
      This can happen if plugging races with CPU offlining. But it does
      no harm, since it can only happen in the window where we are
      currently busy freezing the queue and flushing IO, in preparation
      for redoing the software <-> hardware queue mappings.
      Signed-off-by: default avatarJens Axboe <axboe@fb.com>
      e57690fe
    • Ming Lei's avatar
      block: make sure a big bio is split into at most 256 bvecs · 4d70dca4
      Ming Lei authored
      After arbitrary bio size was introduced, the incoming bio may
      be very big. We have to split the bio into small bios so that
      each holds at most BIO_MAX_PAGES bvecs for safety reason, such
      as bio_clone().
      
      This patch fixes the following kernel crash:
      
      > [  172.660142] BUG: unable to handle kernel NULL pointer dereference at 0000000000000028
      > [  172.660229] IP: [<ffffffff811e53b4>] bio_trim+0xf/0x2a
      > [  172.660289] PGD 7faf3e067 PUD 7f9279067 PMD 0
      > [  172.660399] Oops: 0000 [#1] SMP
      > [...]
      > [  172.664780] Call Trace:
      > [  172.664813]  [<ffffffffa007f3be>] ? raid1_make_request+0x2e8/0xad7 [raid1]
      > [  172.664846]  [<ffffffff811f07da>] ? blk_queue_split+0x377/0x3d4
      > [  172.664880]  [<ffffffffa005fb5f>] ? md_make_request+0xf6/0x1e9 [md_mod]
      > [  172.664912]  [<ffffffff811eb860>] ? generic_make_request+0xb5/0x155
      > [  172.664947]  [<ffffffffa0445c89>] ? prio_io+0x85/0x95 [bcache]
      > [  172.664981]  [<ffffffffa0448252>] ? register_cache_set+0x355/0x8d0 [bcache]
      > [  172.665016]  [<ffffffffa04497d3>] ? register_bcache+0x1006/0x1174 [bcache]
      
      The issue can be reproduced by the following steps:
      	- create one raid1 over two virtio-blk
      	- build bcache device over the above raid1 and another cache device
      	and bucket size is set as 2Mbytes
      	- set cache mode as writeback
      	- run random write over ext4 on the bcache device
      
      Fixes: 54efd50b(block: make generic_make_request handle arbitrarily sized bios)
      Reported-by: default avatarSebastian Roesner <sroesner-kernelorg@roesner-online.de>
      Reported-by: default avatarEric Wheeler <bcache@lists.ewheeler.net>
      Cc: stable@vger.kernel.org (4.3+)
      Cc: Shaohua Li <shli@fb.com>
      Acked-by: default avatarKent Overstreet <kent.overstreet@gmail.com>
      Signed-off-by: default avatarMing Lei <ming.lei@canonical.com>
      Signed-off-by: default avatarJens Axboe <axboe@fb.com>
      4d70dca4
    • Andy Lutomirski's avatar
      nvme: Fix nvme_get/set_features() with a NULL result pointer · 9b47f77a
      Andy Lutomirski authored
      nvme_set_features() callers seem to expect that passing NULL as the
      result pointer is acceptable.  Teach nvme_set_features() not to try to
      write to the NULL address.
      
      For symmetry, make the same change to nvme_get_features(), despite the
      fact that all current callers pass a valid result pointer.
      
      I assume that this bug hasn't been reported in practice because
      the callers that pass NULL are all in the SCSI translation layer
      and no one uses the relevant operations.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarAndy Lutomirski <luto@kernel.org>
      Reviewed-by: default avatarSagi Grimberg <sagi@grimberg.me>
      Signed-off-by: default avatarJens Axboe <axboe@fb.com>
      9b47f77a
  3. 22 Aug, 2016 1 commit
    • Vegard Nossum's avatar
      bdev: fix NULL pointer dereference · e9e5e3fa
      Vegard Nossum authored
      I got this:
      
          kasan: GPF could be caused by NULL-ptr deref or user memory access
          general protection fault: 0000 [#1] PREEMPT SMP KASAN
          Dumping ftrace buffer:
             (ftrace buffer empty)
          CPU: 0 PID: 5505 Comm: syz-executor Not tainted 4.8.0-rc2+ #161
          Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.3-0-ge2fc41e-prebuilt.qemu-project.org 04/01/2014
          task: ffff880113415940 task.stack: ffff880118350000
          RIP: 0010:[<ffffffff8172cb32>]  [<ffffffff8172cb32>] bd_mount+0x52/0xa0
          RSP: 0018:ffff880118357ca0  EFLAGS: 00010207
          RAX: dffffc0000000000 RBX: ffffffffffffffff RCX: ffffc90000bb6000
          RDX: 0000000000000018 RSI: ffffffff846d6b20 RDI: 00000000000000c7
          RBP: ffff880118357cb0 R08: ffff880115967c68 R09: 0000000000000000
          R10: 0000000000000000 R11: 0000000000000000 R12: ffff8801188211e8
          R13: ffffffff847baa20 R14: ffff8801139cb000 R15: 0000000000000080
          FS:  00007fa3ff6c0700(0000) GS:ffff88011aa00000(0000) knlGS:0000000000000000
          CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
          CR2: 00007fc1d8cc7e78 CR3: 0000000109f20000 CR4: 00000000000006f0
          DR0: 000000000000001e DR1: 000000000000001e DR2: 0000000000000000
          DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000600
          Stack:
           ffff880112cfd6c0 ffff8801188211e8 ffff880118357cf0 ffffffff8167f207
           ffffffff816d7a1e ffff880112a413c0 ffffffff847baa20 ffff8801188211e8
           0000000000000080 ffff880112cfd6c0 ffff880118357d38 ffffffff816dce0a
          Call Trace:
           [<ffffffff8167f207>] mount_fs+0x97/0x2e0
           [<ffffffff816d7a1e>] ? alloc_vfsmnt+0x55e/0x760
           [<ffffffff816dce0a>] vfs_kern_mount+0x7a/0x300
           [<ffffffff83c3247c>] ? _raw_read_unlock+0x2c/0x50
           [<ffffffff816dfc87>] do_mount+0x3d7/0x2730
           [<ffffffff81235fd4>] ? trace_do_page_fault+0x1f4/0x3a0
           [<ffffffff816df8b0>] ? copy_mount_string+0x40/0x40
           [<ffffffff8161ea81>] ? memset+0x31/0x40
           [<ffffffff816df73e>] ? copy_mount_options+0x1ee/0x320
           [<ffffffff816e2a02>] SyS_mount+0xb2/0x120
           [<ffffffff816e2950>] ? copy_mnt_ns+0x970/0x970
           [<ffffffff81005524>] do_syscall_64+0x1c4/0x4e0
           [<ffffffff83c3282a>] entry_SYSCALL64_slow_path+0x25/0x25
          Code: 83 e8 63 1b fc ff 48 85 c0 48 89 c3 74 4c e8 56 35 d1 ff 48 8d bb c8 00 00 00 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <80> 3c 02 00 75 36 4c 8b a3 c8 00 00 00 48 b8 00 00 00 00 00 fc
          RIP  [<ffffffff8172cb32>] bd_mount+0x52/0xa0
           RSP <ffff880118357ca0>
          ---[ end trace 13690ad962168b98 ]---
      
      mount_pseudo() returns ERR_PTR(), not NULL, on error.
      
      Fixes: 3684aa70 ("block-dev: enable writeback cgroup support")
      Cc: Shaohua Li <shli@fb.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Jens Axboe <axboe@fb.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarVegard Nossum <vegard.nossum@oracle.com>
      Signed-off-by: default avatarJens Axboe <axboe@fb.com>
      e9e5e3fa
  4. 19 Aug, 2016 8 commits
    • Jens Axboe's avatar
      Merge branch 'v4.8-rc2-bcache-fixes' of https://bitbucket.org/ewheelerinc/linux into for-linus · cbaaf6ef
      Jens Axboe authored
      Eric writes:
      
      Please pull this bcache branch based on v4.8-rc2.  These fix one
      deadlock, one use blkdev_put() use counter, and one dmesg output with a
      better pr_err() description.
      cbaaf6ef
    • Jens Axboe's avatar
      Merge branch 'stable/for-jens-4.8-for-linus' of... · c9d8fa6d
      Jens Axboe authored
      Merge branch 'stable/for-jens-4.8-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen into for-linus
      
      Konrad writes:
      
      Please git pull the following three fixes in to your 'for-linus' branch.
      It is against 'for-linus' instead of 'for-4.8/drivers' because we had
      some code in xen-blkfront go through the Xen tree (with your Ack). The
      reason was that you 'for-4.8/drivers' was based on 4.7-rc2, and the
      fixes needed to be against newer tag.
      
      This branch fixes as we are exercising the multiqueue components more
      aggressively.
      c9d8fa6d
    • Bob Liu's avatar
      xen-blkfront: free resources if xlvbd_alloc_gendisk fails · 4e876c2b
      Bob Liu authored
      Current code forgets to free resources in the failure path of
      xlvbd_alloc_gendisk(), this patch fix it.
      Signed-off-by: default avatarBob Liu <bob.liu@oracle.com>
      Signed-off-by: default avatarKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      4e876c2b
    • Bob Liu's avatar
      xen-blkfront: introduce blkif_set_queue_limits() · 172335ad
      Bob Liu authored
      blk_mq_update_nr_hw_queues() reset all queue limits to default which it's
      not as xen-blkfront expected, introducing blkif_set_queue_limits() to reset
      limits with initial correct values.
      Signed-off-by: default avatarBob Liu <bob.liu@oracle.com>
      Acked-by: default avatarRoger Pau Monné <roger.pau@citrix.com>
      Signed-off-by: default avatarKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      172335ad
    • Bob Liu's avatar
      xen-blkfront: fix places not updated after introducing 64KB page granularity · 6c647b0e
      Bob Liu authored
      Two places didn't get updated when 64KB page granularity was introduced,
      this patch fix them.
      Signed-off-by: default avatarBob Liu <bob.liu@oracle.com>
      Acked-by: default avatarRoger Pau Monné <roger.pau@citrix.com>
      Signed-off-by: default avatarKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      6c647b0e
    • Eric Wheeler's avatar
      bcache: pr_err: more meaningful error message when nr_stripes is invalid · 90706094
      Eric Wheeler authored
      The original error was thought to be corruption, but was actually caused by:
      	make-bcache --data-offset N
      where N was in bytes and should have been in sectors.  While userspace
      tools should be updated to check --data-offset beyond end of volume,
      hopefully this will help others that might not have noticed the units.
      Signed-off-by: default avatarEric Wheeler <bcache@linux.ewheeler.net>
      Cc: Kent Overstreet <kent.overstreet@gmail.com>
      90706094
    • Kent Overstreet's avatar
      bcache: RESERVE_PRIO is too small by one when prio_buckets() is a power of two. · acc9cf8c
      Kent Overstreet authored
      This patch fixes a cachedev registration-time allocation deadlock.
      This can deadlock on boot if your initrd auto-registeres bcache devices:
      
      Allocator thread:
      [  720.727614] INFO: task bcache_allocato:3833 blocked for more than 120 seconds.
      [  720.732361]  [<ffffffff816eeac7>] schedule+0x37/0x90
      [  720.732963]  [<ffffffffa05192b8>] bch_bucket_alloc+0x188/0x360 [bcache]
      [  720.733538]  [<ffffffff810e6950>] ? prepare_to_wait_event+0xf0/0xf0
      [  720.734137]  [<ffffffffa05302bd>] bch_prio_write+0x19d/0x340 [bcache]
      [  720.734715]  [<ffffffffa05190bf>] bch_allocator_thread+0x3ff/0x470 [bcache]
      [  720.735311]  [<ffffffff816ee41c>] ? __schedule+0x2dc/0x950
      [  720.735884]  [<ffffffffa0518cc0>] ? invalidate_buckets+0x980/0x980 [bcache]
      
      Registration thread:
      [  720.710403] INFO: task bash:3531 blocked for more than 120 seconds.
      [  720.715226]  [<ffffffff816eeac7>] schedule+0x37/0x90
      [  720.715805]  [<ffffffffa05235cd>] __bch_btree_map_nodes+0x12d/0x150 [bcache]
      [  720.716409]  [<ffffffffa0522d30>] ? bch_btree_insert_check_key+0x1c0/0x1c0 [bcache]
      [  720.717008]  [<ffffffffa05236e4>] bch_btree_insert+0xf4/0x170 [bcache]
      [  720.717586]  [<ffffffff810e6950>] ? prepare_to_wait_event+0xf0/0xf0
      [  720.718191]  [<ffffffffa0527d9a>] bch_journal_replay+0x14a/0x290 [bcache]
      [  720.718766]  [<ffffffff810cc90d>] ? ttwu_do_activate.constprop.94+0x5d/0x70
      [  720.719369]  [<ffffffff810cf684>] ? try_to_wake_up+0x1d4/0x350
      [  720.719968]  [<ffffffffa05317d0>] run_cache_set+0x580/0x8e0 [bcache]
      [  720.720553]  [<ffffffffa053302e>] register_bcache+0xe2e/0x13b0 [bcache]
      [  720.721153]  [<ffffffff81354cef>] kobj_attr_store+0xf/0x20
      [  720.721730]  [<ffffffff812a2dad>] sysfs_kf_write+0x3d/0x50
      [  720.722327]  [<ffffffff812a225a>] kernfs_fop_write+0x12a/0x180
      [  720.722904]  [<ffffffff81225177>] __vfs_write+0x37/0x110
      [  720.723503]  [<ffffffff81228048>] ? __sb_start_write+0x58/0x110
      [  720.724100]  [<ffffffff812cedb3>] ? security_file_permission+0x23/0xa0
      [  720.724675]  [<ffffffff812258a9>] vfs_write+0xa9/0x1b0
      [  720.725275]  [<ffffffff8102479c>] ? do_audit_syscall_entry+0x6c/0x70
      [  720.725849]  [<ffffffff81226755>] SyS_write+0x55/0xd0
      [  720.726451]  [<ffffffff8106a390>] ? do_page_fault+0x30/0x80
      [  720.727045]  [<ffffffff816f2cae>] system_call_fastpath+0x12/0x71
      
      The fifo code in upstream bcache can't use the last element in the buffer,
      which was the cause of the bug: if you asked for a power of two size,
      it'd give you a fifo that could hold one less than what you asked for
      rather than allocating a buffer twice as big.
      Signed-off-by: default avatarKent Overstreet <kent.overstreet@gmail.com>
      Tested-by: default avatarEric Wheeler <bcache@linux.ewheeler.net>
      Cc: stable@vger.kernel.org
      acc9cf8c
    • Eric Wheeler's avatar
      bcache: register_bcache(): call blkdev_put() when cache_alloc() fails · d9dc1702
      Eric Wheeler authored
      register_cache() is supposed to return an error string on error so that
      register_bcache() will will blkdev_put and cleanup other user counters,
      but it does not set 'char *err' when cache_alloc() fails (eg, due to
      memory pressure) and thus register_bcache() performs no cleanup.
      
      register_bcache() <----------\  <- no jump to err_close, no blkdev_put()
         |                         |
         +->register_cache()       |  <- fails to set char *err
               |                   |
               +->cache_alloc() ---/  <- returns error
      
      This patch sets `char *err` for this failure case so that register_cache()
      will cause register_bcache() to correctly jump to err_close and do
      cleanup.  This was tested under OOM conditions that triggered the bug.
      Signed-off-by: default avatarEric Wheeler <bcache@linux.ewheeler.net>
      Cc: Kent Overstreet <kent.overstreet@gmail.com>
      Cc: stable@vger.kernel.org
      d9dc1702
  5. 17 Aug, 2016 1 commit
  6. 16 Aug, 2016 1 commit
  7. 15 Aug, 2016 4 commits
  8. 14 Aug, 2016 2 commits
    • Linus Torvalds's avatar
      Merge tag 'fixes-for-linus-4.8' of... · 118253a5
      Linus Torvalds authored
      Merge tag 'fixes-for-linus-4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging
      
      Pull h8300 and unicore32 architecture fixes from Guenter Roeck:
       "Two patches to fix h8300 and unicore32 builds.
      
        unicore32 builds have been broken since v4.6.  The fix has been
        available in -next since March of this year.
      
        h8300 builds have been broken since the last commit window.  The fix
        has been available in -next since June of this year"
      
      * tag 'fixes-for-linus-4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
        h8300: Add missing include file to asm/io.h
        unicore32: mm: Add missing parameter to arch_vma_access_permitted
      118253a5
    • Linus Torvalds's avatar
      Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux · 120c5475
      Linus Torvalds authored
      Pull arm64 fixes from Catalin Marinas:
      
       - support for nr_cpus= command line argument (maxcpus was previously
         changed to allow secondary CPUs to be hot-plugged)
      
       - ARM PMU interrupt handling fix
      
       - fix potential TLB conflict in the hibernate code
      
       - improved handling of EL1 instruction aborts (better error reporting)
      
       - removal of useless jprobes code for stack saving/restoring
      
       - defconfig updates
      
      * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
        arm64: defconfig: enable CONFIG_LOCALVERSION_AUTO
        arm64: defconfig: add options for virtualization and containers
        arm64: hibernate: handle allocation failures
        arm64: hibernate: avoid potential TLB conflict
        arm64: Handle el1 synchronous instruction aborts cleanly
        arm64: Remove stack duplicating code from jprobes
        drivers/perf: arm-pmu: Fix handling of SPI lacking "interrupt-affinity" property
        drivers/perf: arm-pmu: convert arm_pmu_mutex to spinlock
        arm64: Support hard limit of cpu count by nr_cpus
      120c5475
  9. 13 Aug, 2016 4 commits
    • Linus Torvalds's avatar
      Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm · 329f4152
      Linus Torvalds authored
      Pull KVM fixes from Radim Krčmář:
       "KVM:
         - lock kvm_device list to prevent corruption on device creation.
      
        PPC:
         - split debugfs initialization from creation of the xics device to
           unlock the newly taken kvm lock earlier.
      
        s390:
         - prevent userspace from triggering two WARN_ON_ONCE.
      
        MIPS:
         - fix several issues in the management of TLB faults (Cc: stable)"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
        MIPS: KVM: Propagate kseg0/mapped tlb fault errors
        MIPS: KVM: Fix gfn range check in kseg0 tlb faults
        MIPS: KVM: Add missing gfn range check
        MIPS: KVM: Fix mapped fault broken commpage handling
        KVM: Protect device ops->create and list_add with kvm->lock
        KVM: PPC: Move xics_debugfs_init out of create
        KVM: s390: reset KVM_REQ_MMU_RELOAD if mapping the prefix failed
        KVM: s390: set the prefix initially properly
      329f4152
    • Linus Torvalds's avatar
      Merge branch 'for-linus' of git://git.kernel.dk/linux-block · a1e21033
      Linus Torvalds authored
      Pull block fixes from Jens Axboe:
      
       - an NVMe fix from Gabriel, fixing a suspend/resume issue on some
         setups
      
       - addition of a few missing entries in the block queue sysfs
         documentation, from Joe
      
       - a fix for a sparse shadow warning for the bvec iterator, from
         Johannes
      
       - a writeback deadlock involving raid issuing barriers, and not
         flushing the plug when we wakeup the flusher threads.  From
         Konstantin
      
       - a set of patches for the NVMe target/loop/rdma code, from Roland and
         Sagi
      
      * 'for-linus' of git://git.kernel.dk/linux-block:
        bvec: avoid variable shadowing warning
        doc: update block/queue-sysfs.txt entries
        nvme: Suspend all queues before deletion
        mm, writeback: flush plugged IO in wakeup_flusher_threads()
        nvme-rdma: Remove unused includes
        nvme-rdma: start async event handler after reconnecting to a controller
        nvmet: Fix controller serial number inconsistency
        nvmet-rdma: Don't use the inline buffer in order to avoid allocation for small reads
        nvmet-rdma: Correctly handle RDMA device hot removal
        nvme-rdma: Make sure to shutdown the controller if we can
        nvme-loop: Remove duplicate call to nvme_remove_namespaces
        nvme-rdma: Free the I/O tags when we delete the controller
        nvme-rdma: Remove duplicate call to nvme_remove_namespaces
        nvme-rdma: Fix device removal handling
        nvme-rdma: Queue ns scanning after a sucessful reconnection
        nvme-rdma: Don't leak uninitialized memory in connect request private data
      a1e21033
    • Guenter Roeck's avatar
      h8300: Add missing include file to asm/io.h · 2b05980d
      Guenter Roeck authored
      h8300 builds fail with
      
      arch/h8300/include/asm/io.h:9:15: error: unknown type name ‘u8’
      arch/h8300/include/asm/io.h:15:15: error: unknown type name ‘u16’
      arch/h8300/include/asm/io.h:21:15: error: unknown type name ‘u32’
      
      and many related errors.
      
      Fixes: 23c82d41bdf4 ("kexec-allow-architectures-to-override-boot-mapping-fix")
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarGuenter Roeck <linux@roeck-us.net>
      2b05980d
    • Guenter Roeck's avatar
      unicore32: mm: Add missing parameter to arch_vma_access_permitted · 783011b1
      Guenter Roeck authored
      unicore32 fails to compile with the following errors.
      
      mm/memory.c: In function ‘__handle_mm_fault’:
      mm/memory.c:3381: error:
      	too many arguments to function ‘arch_vma_access_permitted’
      mm/gup.c: In function ‘check_vma_flags’:
      mm/gup.c:456: error:
      	too many arguments to function ‘arch_vma_access_permitted’
      mm/gup.c: In function ‘vma_permits_fault’:
      mm/gup.c:640: error:
      	too many arguments to function ‘arch_vma_access_permitted’
      
      Fixes: d61172b4 ("mm/core, x86/mm/pkeys: Differentiate instruction fetches")
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarGuenter Roeck <linux@roeck-us.net>
      Acked-by: default avatarGuan Xuetao <gxt@mprc.pku.edu.cn>
      783011b1
  10. 12 Aug, 2016 13 commits