1. 28 Feb, 2019 4 commits
    • Carlos Maiolino's avatar
      fs: fix guard_bio_eod to check for real EOD errors · dce30ca9
      Carlos Maiolino authored
      guard_bio_eod() can truncate a segment in bio to allow it to do IO on
      odd last sectors of a device.
      
      It already checks if the IO starts past EOD, but it does not consider
      the possibility of an IO request starting within device boundaries can
      contain more than one segment past EOD.
      
      In such cases, truncated_bytes can be bigger than PAGE_SIZE, and will
      underflow bvec->bv_len.
      
      Fix this by checking if truncated_bytes is lower than PAGE_SIZE.
      
      This situation has been found on filesystems such as isofs and vfat,
      which doesn't check the device size before mount, if the device is
      smaller than the filesystem itself, a readahead on such filesystem,
      which spans EOD, can trigger this situation, leading a call to
      zero_user() with a wrong size possibly corrupting memory.
      
      I didn't see any crash, or didn't let the system run long enough to
      check if memory corruption will be hit somewhere, but adding
      instrumentation to guard_bio_end() to check truncated_bytes size, was
      enough to see the error.
      
      The following script can trigger the error.
      
      MNT=/mnt
      IMG=./DISK.img
      DEV=/dev/loop0
      
      mkfs.vfat $IMG
      mount $IMG $MNT
      cp -R /etc $MNT &> /dev/null
      umount $MNT
      
      losetup -D
      
      losetup --find --show --sizelimit 16247280 $IMG
      mount $DEV $MNT
      
      find $MNT -type f -exec cat {} + >/dev/null
      
      Kudos to Eric Sandeen for coming up with the reproducer above
      Reviewed-by: default avatarMing Lei <ming.lei@redhat.com>
      Signed-off-by: default avatarCarlos Maiolino <cmaiolino@redhat.com>
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      dce30ca9
    • Dongli Zhang's avatar
      blk-mq: use HCTX_TYPE_DEFAULT but not 0 to index blk_mq_tag_set->map · 7d76f856
      Dongli Zhang authored
      Replace set->map[0] with set->map[HCTX_TYPE_DEFAULT] to avoid hardcoding.
      Signed-off-by: default avatarDongli Zhang <dongli.zhang@oracle.com>
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      7d76f856
    • Christoph Hellwig's avatar
      block: optimize bvec iteration in bvec_iter_advance · 5b88a17c
      Christoph Hellwig authored
      There is no need to only iterate in chunks of PAGE_SIZE or less in
      bvec_iter_advance, given that the callers pass in the chunk length that
      they are operating on - either that already is less than PAGE_SIZE
      because they do classic page-based iteration, or it is larger because
      the caller operates on multi-page bvecs.
      
      This should help shaving off a few cycles of the I/O hot path.
      Reviewed-by: default avatarJohannes Thumshirn <jthumshirn@suse.de>
      Signed-off-by: default avatarChristoph Hellwig <hch@lst.de>
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      5b88a17c
    • Ming Lei's avatar
      block: introduce mp_bvec_for_each_page() for iterating over page · 594b9a89
      Ming Lei authored
      mp_bvec_for_each_segment() is a bit big for the iteration, so introduce
      a light-weight helper for iterating over pages, then 32bytes stack
      space can be saved.
      Signed-off-by: default avatarMing Lei <ming.lei@redhat.com>
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      594b9a89
  2. 27 Feb, 2019 3 commits
  3. 24 Feb, 2019 4 commits
  4. 22 Feb, 2019 2 commits
  5. 21 Feb, 2019 3 commits
    • Ming Lei's avatar
      block: bounce: make sure that bvec table is updated · 8f4e80da
      Ming Lei authored
      Block bounce needs to allocate new page for doing IO, and the
      new page has to be updated to bvec table.
      
      Commit 6dc4f100 switches __blk_queue_bounce() to use the new
      bio_for_each_segment_all() interface. Unfortunately the new
      bio_for_each_segment_all() can't be used to update bvec table.
      
      This patch fixes this issue by retrieving bvec from the table
      directly, then the new allocated page can be updated to the bio.
      This way is safe because the cloned bio is single page bvec.
      
      Fixes: 6dc4f100 ("block: allow bio_for_each_segment_all() to iterate over multi-page bvec")
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Omar Sandoval <osandov@fb.com>
      Signed-off-by: default avatarMing Lei <ming.lei@redhat.com>
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      8f4e80da
    • Jens Axboe's avatar
      Merge branch 'nvme-5.1' of git://git.infradead.org/nvme into for-5.1/block · 037b2625
      Jens Axboe authored
      Pull NVMe changes for 5.1 from Christoph
      
      * 'nvme-5.1' of git://git.infradead.org/nvme: (22 commits)
        nvme-rdma: use nr_phys_segments when map rq to sgl
        nvmet: convert to SPDX identifiers
        nvmet-rdma: convert to SPDX identifiers
        nvme-loop: convert to SPDX identifiers
        nvmet-fcloop: convert to SPDX identifiers
        nvmet-fc: convert to SPDX identifiers
        nvme: convert to SPDX identifiers
        nvme-pci: convert to SPDX identifiers
        nvme-lightnvm: convert to SPDX identifiers
        nvme-rdma: convert to SPDX identifiers
        nvme-fc: convert to SPDX identifiers
        nvme-fabrics: convert to SPDX identifiers
        nvme-tcp.h: fix SPDX header
        nvme_ioctl.h: remove duplicate GPL boilerplate
        nvme: return error from nvme_alloc_ns()
        nvme: avoid that deleting a controller triggers a circular locking complaint
        nvme: introduce a helper function for controller deletion
        nvme: unexport nvme_delete_ctrl_sync()
        nvme-pci: check kstrtoint() return value in queue_count_set()
        nvme-fabrics: document the poll function argument
        ...
      037b2625
    • Chaitanya Kulkarni's avatar
      nvme-rdma: use nr_phys_segments when map rq to sgl · 34e08191
      Chaitanya Kulkarni authored
      Use blk_rq_nr_phys_segments() instead of blk_rq_payload_bytes() to check
      if a command contains data to be mapped.  This fixes the case where
      a struct request contains LBAs, but it has no payload, such as
      Write Zeroes support.
      
      Fixes: 6e02318e ("nvme: add support for the Write Zeroes command")
      Reported-by: default avatarMing Lei <tom.leiming@gmail.com>
      Signed-off-by: default avatarChaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
      Tested-by: default avatarMing Lei <tom.leiming@gmail.com>
      Signed-off-by: default avatarChristoph Hellwig <hch@lst.de>
      34e08191
  6. 20 Feb, 2019 21 commits
  7. 19 Feb, 2019 1 commit
  8. 15 Feb, 2019 2 commits
    • Jens Axboe's avatar
      Merge tag 'v5.0-rc6' into for-5.1/block · 6fb845f0
      Jens Axboe authored
      Pull in 5.0-rc6 to avoid a dumb merge conflict with fs/iomap.c.
      This is needed since io_uring is now based on the block branch,
      to avoid a conflict between the multi-page bvecs and the bits
      of io_uring that touch the core block parts.
      
      * tag 'v5.0-rc6': (525 commits)
        Linux 5.0-rc6
        x86/mm: Make set_pmd_at() paravirt aware
        MAINTAINERS: Update the ocores i2c bus driver maintainer, etc
        blk-mq: remove duplicated definition of blk_mq_freeze_queue
        Blk-iolatency: warn on negative inflight IO counter
        blk-iolatency: fix IO hang due to negative inflight counter
        MAINTAINERS: unify reference to xen-devel list
        x86/mm/cpa: Fix set_mce_nospec()
        futex: Handle early deadlock return correctly
        futex: Fix barrier comment
        net: dsa: b53: Fix for failure when irq is not defined in dt
        blktrace: Show requests without sector
        mips: cm: reprime error cause
        mips: loongson64: remove unreachable(), fix loongson_poweroff().
        sit: check if IPv6 enabled before calling ip6_err_gen_icmpv6_unreach()
        geneve: should not call rt6_lookup() when ipv6 was disabled
        KVM: nVMX: unconditionally cancel preemption timer in free_nested (CVE-2019-7221)
        KVM: x86: work around leak of uninitialized stack contents (CVE-2019-7222)
        kvm: fix kvm_ioctl_create_device() reference counting (CVE-2019-6974)
        signal: Better detection of synchronous signals
        ...
      6fb845f0
    • Ming Lei's avatar
      block: kill BLK_MQ_F_SG_MERGE · 56d18f62
      Ming Lei authored
      QUEUE_FLAG_NO_SG_MERGE has been killed, so kill BLK_MQ_F_SG_MERGE too.
      Reviewed-by: default avatarChristoph Hellwig <hch@lst.de>
      Reviewed-by: default avatarOmar Sandoval <osandov@fb.com>
      Signed-off-by: default avatarMing Lei <ming.lei@redhat.com>
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      56d18f62