1. 13 Dec, 2019 4 commits
    • Merge tag 'for-linus-20191212' of git://git.kernel.dk/linux-block · f1fcd778
      Linus Torvalds authored
      Pull block fixes from Jens Axboe:
      
       - stable fix for the bi_size overflow. Not a corruption issue, but a
         case where we could merge but disallowed (Andreas)
      
       - NVMe pull request via Keith, with various fixes.
      
       - MD pull request from Song.
      
       - Merge window regression fix for the rq passthrough stats (Logan)
      
       - Remove unused blkcg_drain_queue() function (Guoqing)
      
      * tag 'for-linus-20191212' of git://git.kernel.dk/linux-block:
        blk-cgroup: remove blkcg_drain_queue
        block: fix NULL pointer dereference in account statistics with IDE
        md: make sure desc_nr less than MD_SB_DISKS
        md: raid1: check rdev before reference in raid1_sync_request func
        raid5: need to set STRIPE_HANDLE for batch head
        block: fix "check bi_size overflow before merge"
        nvme/pci: Fix read queue count
        nvme/pci Limit write queue sizes to possible cpus
        nvme/pci: Fix write and poll queue types
        nvme/pci: Remove last_cq_head
        nvme: Namepace identification descriptor list is optional
        nvme-fc: fix double-free scenarios on hw queues
        nvme: else following return is not needed
        nvme: add error message on mismatching controller ids
        nvme_fc: add module to ops template to allow module references
        nvmet-loop: Avoid preallocating big SGL for data
        nvme-fc: Avoid preallocating big SGL for data
        nvme-rdma: Avoid preallocating big SGL for data
    • Merge tag 'io_uring-5.5-20191212' of git://git.kernel.dk/linux-block · 5bd831a4
      Linus Torvalds authored
      Pull io_uring fixes from Jens Axboe:
      
       - A tweak to IOSQE_IO_LINK (also marked for stable) to allow links that
         don't sever if the result is < 0.
      
         This is mostly for linked timeouts, where if we ask for a pure
         timeout we always get -ETIME. This makes links useless for that case,
         hence allow a case where it works.
      
       - Five minor optimizations to fix and improve cases that regressed
         since v5.4.
      
       - An SQTHREAD locking fix.
      
       - A sendmsg/recvmsg iov assignment fix.
      
       - Net fix where read_iter/write_iter don't honor IOCB_NOWAIT, and
         subsequently ensuring that works for io_uring.
      
       - Fix a case where for an invalid opcode we might return -EBADF instead
         of -EINVAL, if the ->fd of that sqe was set to an invalid fd value.
      
      * tag 'io_uring-5.5-20191212' of git://git.kernel.dk/linux-block:
        io_uring: ensure we return -EINVAL on unknown opcode
        io_uring: add sockets to list of files that support non-blocking issue
        net: make socket read/write_iter() honor IOCB_NOWAIT
        io_uring: only hash regular files for async work execution
        io_uring: run next sqe inline if possible
        io_uring: don't dynamically allocate poll data
        io_uring: deferred send/recvmsg should assign iov
        io_uring: sqthread should grab ctx->uring_lock for submissions
        io-wq: briefly spin for new work after finishing work
        io-wq: remove worker->wait waitqueue
        io_uring: allow unbreakable links
    • Merge tag 'for-5.5/dm-fixes' of... · 15da849c
      Linus Torvalds authored
      Merge tag 'for-5.5/dm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm
      
      Pull device mapper fixes from Mike Snitzer:
      
       - Fix DM multipath by restoring full path selector functionality for
         bio-based configurations that don't have a SCSI device handler.
      
       - Fix dm-btree removal to ensure non-root btree nodes have at least
         (max_entries / 3) entries. This resolves userspace thin_check
         utility's report of "too few entries in btree_node".
      
       - Fix both the DM thin-provisioning and dm-clone targets to properly
         flush the data device prior to metadata commit. This resolves the
         potential for inconsistency across a power loss event when the data
         device has a volatile writeback cache.
      
       - Small documentation fixes to dm-clone and dm-integrity.
      
      * tag 'for-5.5/dm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
        docs: dm-integrity: remove reference to ARC4
        dm thin: Flush data device before committing metadata
        dm thin metadata: Add support for a pre-commit callback
        dm clone: Flush destination device before committing metadata
        dm clone metadata: Use a two phase commit
        dm clone metadata: Track exact changes per transaction
        dm btree: increase rebalance threshold in __rebalance2()
        dm: add dm-clone to the documentation index
        dm mpath: remove harmful bio-based optimization
    • Merge tag 'sizeof_field-v5.5-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux · 22ff311a
      Linus Torvalds authored
      Pull FIELD_SIZEOF conversion from Kees Cook:
       "A mostly mechanical treewide conversion from FIELD_SIZEOF() to
        sizeof_field(). This avoids the redundancy of having 2 macros
        (actually 3) doing the same thing, and consolidates on sizeof_field().
        While "field" is not an accurate name, it is the common name used in
        the kernel, and doesn't result in any unintended innuendo.
      
        As there are still users of FIELD_SIZEOF() in -next, I will clean up
        those during this coming development cycle and send the final old
        macro removal patch at that time"
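
      For readers unfamiliar with the macro, here is a minimal userspace sketch
      of what sizeof_field() computes. The macro body follows the usual
      null-pointer-cast idiom; struct packet_hdr, its members, and addr_len()
      are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Size of a struct member without needing an instance of the struct.
 * sizeof does not evaluate its operand, so the null pointer is never
 * actually dereferenced.
 */
#define sizeof_field(TYPE, MEMBER) sizeof((((TYPE *)0)->MEMBER))

/* Hypothetical structure, used only for illustration. */
struct packet_hdr {
	uint16_t type;
	uint8_t  addr[6];
	uint32_t seq;
};

/* The result is a constant expression, so it works in static assertions. */
static_assert(sizeof_field(struct packet_hdr, addr) == 6,
	      "addr must be 6 bytes");

/* Typical use: ask for a member's size when no object is at hand. */
size_t addr_len(void)
{
	return sizeof_field(struct packet_hdr, addr);
}
```

      Having a single spelling of this helper is the point of the conversion:
      the expansion is the same whichever macro name a caller used before.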
      
      * tag 'sizeof_field-v5.5-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
        treewide: Use sizeof_field() macro
        MIPS: OCTEON: Replace SIZEOF_FIELD() macro
  2. 12 Dec, 2019 5 commits
    • Merge tag 'ceph-for-5.5-rc2' of git://github.com/ceph/ceph-client · 37d4e84f
      Linus Torvalds authored
      Pull ceph fixes from Ilya Dryomov:
       "A fix to avoid a corner case when scheduling cap reclaim in batches
        from Xiubo, a patch to add some observability into cap waiters from
        Jeff and a couple of cleanups"
      
      * tag 'ceph-for-5.5-rc2' of git://github.com/ceph/ceph-client:
        ceph: add more debug info when decoding mdsmap
        ceph: switch to global cap helper
        ceph: trigger the reclaim work once there has enough pending caps
        ceph: show tasks waiting on caps in debugfs caps file
        ceph: convert int fields in ceph_mount_options to unsigned int
    • blk-cgroup: remove blkcg_drain_queue · 5addeae1
      Guoqing Jiang authored
      Since blk_drain_queue() has already been removed, this function
      is no longer needed.
      Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • block: fix NULL pointer dereference in account statistics with IDE · ecb6186c
      Logan Gunthorpe authored
      The IDE driver creates some passthru requests which never get
      submitted to the block layer in such a way that blk_account_io_start()
      gets called. However, the driver still calls __blk_mq_end_request() in
      ide_end_rq() which will call blk_account_io_completion() which tries
      to dereference req->part, which is never set. See ide_prep_sense() for
      an example of where these requests come from.
      
      To fix this, blk_account_io_completion() and blk_account_io_done()
      should do nothing if req->part is not set.
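
      The shape of that guard can be sketched in userspace as follows. This is
      not the kernel code: struct hd_part and struct request here are
      hypothetical, heavily simplified stand-ins, and the two functions only
      mirror the early-return pattern described above.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct hd_part {
	unsigned long sectors;   /* sectors accounted to this partition */
	unsigned long ios;       /* completed requests */
};

struct request {
	struct hd_part *part;    /* NULL if accounting was never started */
};

/* Account a (partial) completion; skip requests that were never accounted. */
void account_io_completion(struct request *req, unsigned int bytes)
{
	if (!req->part)
		return;
	req->part->sectors += bytes >> 9;   /* convert bytes to 512-byte sectors */
}

/* Account the end of a request; same guard as above. */
void account_io_done(struct request *req)
{
	if (!req->part)
		return;
	req->part->ios++;
}
```

      With the guard in place, a passthrough request that bypassed the start
      of accounting is simply ignored at completion time instead of faulting.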
      
      The back trace of this bug is:
      
          BUG: kernel NULL pointer dereference, address: 000002ac
          #PF: supervisor write access in kernel mode
          #PF: error_code(0x0002) - not-present page
          *pde = 00000000
          Oops: 0002 [#1]
          CPU: 0 PID: 237 Comm: kworker/0:1H Not tainted
          5.4.0-rc2-00011-g48d9b0d4 #1
          Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1
          04/01/2014
          Workqueue: kblockd drive_rq_insert_work
          EIP: blk_account_io_completion+0x7a/0xf0
          Code: 89 54 24 08 31 d2 89 4c 24 04 31 c9 c7 04 24 02 00 00 00 c1 ee
          09 e8 f5 21 a6 ff e8 70 5c a7 ff 8b 53 60 8d 04 bd 00 00 00 00 <01> b4
          02 ac 02 00 00 8b 9a 88 02 00 00 85 db 74 11 85 d2 74 51 8b
          EAX: 00000000 EBX: f5b80000 ECX: 00000000 EDX: 00000000
          ESI: 00000000 EDI: 00000000 EBP: f3031e70 ESP: f3031e54
          DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068 EFLAGS: 00010046
          CR0: 80050033 CR2: 000002ac CR3: 03c25000 CR4: 000406d0
          Call Trace:
           <IRQ>
            blk_update_request+0x85/0x420
            ide_end_rq+0x38/0xa0
            ide_complete_rq+0x3d/0x70
            cdrom_newpc_intr+0x258/0xba0
            ide_intr+0x135/0x250
            __handle_irq_event_percpu+0x3e/0x250
            handle_irq_event_percpu+0x1f/0x50
            handle_irq_event+0x32/0x60
            handle_level_irq+0x6c/0x110
            handle_irq+0x72/0xa0
            </IRQ>
            do_IRQ+0x45/0xad
            common_interrupt+0x115/0x11c
      
      Fixes: 48d9b0d4 ("block: account statistics for passthrough requests")
      Reported-by: kernel test robot <rong.a.chen@intel.com>
      Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • Merge branch 'md-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/song/md into for-linus · 296aec45
      Jens Axboe authored
      Pull MD fixes from Song.
      
      * 'md-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/song/md:
        md: make sure desc_nr less than MD_SB_DISKS
        md: raid1: check rdev before reference in raid1_sync_request func
        raid5: need to set STRIPE_HANDLE for batch head
    • Merge tag 'afs-fixes-20191211' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs · ae4b064e
      Linus Torvalds authored
      Pull AFS fixes from David Howells:
       "Fixes for AFS plus one patch to make debugging easier:
      
         - Fix how addresses are matched to server records. This is currently
           incorrect which means cache invalidation callbacks from the server
           don't necessarily get delivered correctly. This causes stale data
           and metadata to be seen under some circumstances.
      
         - Make the dynamic root superblock R/W so that rpm/dnf can reapply
           the SELinux label to it when upgrading the Fedora filesystem-afs
           package. If the filesystem is R/O, this fails and the upgrade
           fails.
      
           It might be better in future to allow setxattr from an LSM to
           bypass the R/O protections, if only for pseudo-filesystems.
      
         - Fix the parsing of mountpoint strings. The mountpoint object has to
           have a terminal dot, whereas the source/device string passed to
           mount should not. This confuses type-forcing suffix detection
           leading to the wrong volume variant being mounted.
      
         - Make lookups in the dynamic root superblock for creation events
           (such as mkdir) fail with EOPNOTSUPP rather than something like
           EEXIST. The dynamic root only allows implicit creation by the
           ->lookup() method - and only if the target cell exists.
      
         - Fix the looking up of an AFS superblock to include the cell in the
           matching key - otherwise all volumes with the same ID number are
           treated as the same thing, irrespective of which cell they're in.
      
         - Show the volume name of each volume in the volume records displayed
           in /proc/net/afs/<cell>/volumes. This proved useful in debugging as
           it provides a way to map the volume IDs to names, where the names
           are what appear in /proc/mounts"
      
      * tag 'afs-fixes-20191211' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
        afs: Show volume name in /proc/net/afs/<cell>/volumes
        afs: Fix missing cell comparison in afs_test_super()
        afs: Fix creation calls in the dynamic root to fail with EOPNOTSUPP
        afs: Fix mountpoint parsing
        afs: Fix SELinux setting security label on /afs
        afs: Fix afs_find_server lookups for ipv4 peers
  3. 11 Dec, 2019 11 commits
    • io_uring: ensure we return -EINVAL on unknown opcode · 9e3aa61a
      Jens Axboe authored
      If we submit an unknown opcode and have fd == -1, io_op_needs_file()
      will return true as we default to needing a file. Then when we go and
      assign the file, we find the 'fd' invalid and return -EBADF. We really
      should be returning -EINVAL for that case, as we normally do for
      unsupported opcodes.
      
      Change io_op_needs_file() to have the following return values:
      
      0   - does not need a file
      1   - does need a file
      < 0 - error value
      
      and use this to pass back the right value for this invalid case.
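
      A minimal sketch of that three-way return convention, assuming a made-up
      opcode list: the SKETCH_OP_* values, op_needs_file() and prep_file() are
      hypothetical names and do not reproduce the io_uring source.

```c
#include <errno.h>

/* Hypothetical opcode values; the real io_uring opcode list is larger. */
enum sketch_opcode {
	SKETCH_OP_NOP,
	SKETCH_OP_TIMEOUT,
	SKETCH_OP_READV,
	SKETCH_OP_WRITEV,
	SKETCH_OP_LAST,   /* first invalid value */
};

/* Returns 0 if no file is needed, 1 if one is, or a negative errno. */
int op_needs_file(int opcode)
{
	if (opcode < 0 || opcode >= SKETCH_OP_LAST)
		return -EINVAL;
	switch (opcode) {
	case SKETCH_OP_NOP:
	case SKETCH_OP_TIMEOUT:
		return 0;
	default:
		return 1;
	}
}

/* Hypothetical caller: the opcode is validated before fd is looked at. */
int prep_file(int opcode, int fd)
{
	int ret = op_needs_file(opcode);

	if (ret <= 0)
		return ret;      /* 0: nothing to do; < 0: propagate the error */
	if (fd < 0)
		return -EBADF;   /* known opcode, but unusable descriptor */
	return 0;
}
```

      Because the error is reported by the helper itself, an unknown opcode
      with fd == -1 yields -EINVAL rather than -EBADF.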
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
    • Merge tag 'erofs-for-5.5-rc2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs · 687dec9b
      Linus Torvalds authored
      Pull erofs fixes from Gao Xiang:
       "Mainly address a regression reported by David recently observed
        together with overlayfs due to the improper return value of
        listxattr() without xattr. Update outdated expressions in document as
        well.
      
        Summary:
      
         - Fix improper return value of listxattr() with no xattr
      
         - Keep up documentation with latest code"
      
      * tag 'erofs-for-5.5-rc2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs:
        erofs: update documentation
        erofs: zero out when listxattr is called with no xattr
    • Merge tag 'trace-v5.5-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace · 6674fdb2
      Linus Torvalds authored
      Pull tracing fixes from Steven Rostedt:
      
       - Remove code I accidentally applied when doing a minor fix up to a
         patch, and then using "git commit -a --amend", which pulled in some
         other changes I was playing with.
      
       - Remove an unused variable in trace_events_inject code
      
       - Fix function graph tracer when it traces a ftrace direct function.
         It will now ignore tracing a function that has a ftrace direct
         trampoline attached. This is needed for eBPF to use the ftrace direct
         code.
      
      * tag 'trace-v5.5-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
        ftrace: Fix function_graph tracer interaction with BPF trampoline
        tracing: remove set but not used variable 'buffer'
        module: Remove accidental change of module_enable_x()
    • pipe: simplify signal handling in pipe_read() and add comments · d1c6a2aa
      Linus Torvalds authored
      There's no need to separately check for signals while inside the locked
      region, since we're going to do "wait_event_interruptible()" right
      afterwards anyway, and the error handling is much simpler there.
      
      The check for whether we had already read anything was also redundant,
      since we no longer do the odd merging of reads when there are pending
      writers.
      
      But perhaps more importantly, this adds commentary about why we still
      need to wake up possible writers even though we didn't read any data,
      and why we can skip all the finishing touches now if we get a signal (or
      had a signal pending) while waiting for more data.
      
      [ This is a split-out cleanup from my "make pipe IO use exclusive wait
        queues" thing, which I can't apply because it triggers a nasty bug in
        the GNU make jobserver   - Linus ]
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • md: make sure desc_nr less than MD_SB_DISKS · 3b7436cc
      Yufen Yu authored
      In super_90_load(), we need to make sure 'desc_nr' is less than
      MD_SB_DISKS, to avoid an invalid memory access of 'sb->disks'.
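
      The check amounts to validating an index before it is used on a
      fixed-size array. A minimal sketch, assuming hypothetical names
      (SKETCH_SB_DISKS, struct sketch_sb, sb_get_disk()) that stand in for the
      md superblock definitions:

```c
#include <errno.h>

/* Hypothetical array size standing in for MD_SB_DISKS. */
#define SKETCH_SB_DISKS 27

struct sketch_disk {
	int state;
};

struct sketch_sb {
	struct sketch_disk disks[SKETCH_SB_DISKS];
};

/*
 * Store the descriptor for desc_nr in *out and return 0, or return
 * -EINVAL without touching the array if desc_nr is out of range.
 */
int sb_get_disk(struct sketch_sb *sb, int desc_nr, struct sketch_disk **out)
{
	if (desc_nr < 0 || desc_nr >= SKETCH_SB_DISKS)
		return -EINVAL;
	*out = &sb->disks[desc_nr];
	return 0;
}
```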
      
      Fixes: 228fc7d7 ("md: avoid invalid memory access for array sb->dev_roles")
      Signed-off-by: Yufen Yu <yuyufen@huawei.com>
      Signed-off-by: Song Liu <songliubraving@fb.com>
    • md: raid1: check rdev before reference in raid1_sync_request func · 028288df
      Zhiqiang Liu authored
      In raid1_sync_request(), rdev should be checked before it is referenced.
      Signed-off-by: Zhiqiang Liu <liuzhiqiang26@huawei.com>
      Signed-off-by: Song Liu <songliubraving@fb.com>
    • raid5: need to set STRIPE_HANDLE for batch head · a7ede3d1
      Guoqing Jiang authored
      With commit 6ce220dd ("raid5: don't set
      STRIPE_HANDLE to stripe which is in batch list"), we don't want to set
      STRIPE_HANDLE flag for sh which is already in batch list.
      
      However, the stripe that is the head of the batch list should have this
      flag set, otherwise a panic can occur in init_stripe() at
      BUG_ON(sh->batch_head). This is reproducible with raid5 on top of nvdimm
      devices, as Xiao observed.

      Thanks to Xiao for verifying the change.
      
      Fixes: 6ce220dd ("raid5: don't set STRIPE_HANDLE to stripe which is in batch list")
      Reported-by: Xiao Ni <xni@redhat.com>
      Tested-by: Xiao Ni <xni@redhat.com>
      Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>
      Signed-off-by: Song Liu <songliubraving@fb.com>
    • afs: Show volume name in /proc/net/afs/<cell>/volumes · 50559800
      David Howells authored
      Show the name of each volume in /proc/net/afs/<cell>/volumes to make it
      easier to work out the name corresponding to a volume ID.  This makes it
      easier to work out which mounts in /proc/mounts correspond to which volume
      ID.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Reviewed-by: Marc Dionne <marc.dionne@auristor.com>
    • afs: Fix missing cell comparison in afs_test_super() · 106bc798
      David Howells authored
      Fix missing cell comparison in afs_test_super().  Without this, any pair
      of volumes that have the same volume ID will share a superblock, no matter
      the cell, unless they're in different network namespaces.
      
      Normally, most users will only deal with a single cell and so they won't
      see this.  Even if they do look into a second cell, they won't see a
      problem unless they happen to hit a volume with the same ID as one they've
      already got mounted.
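
      The matching rule can be sketched as a three-part key comparison. This
      is an illustration only: struct sketch_super_key and super_key_matches()
      are hypothetical and much simpler than the real superblock test.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified key identifying a mounted volume. */
struct sketch_super_key {
	const void *net_ns;  /* network namespace */
	const void *cell;    /* cell the volume belongs to */
	uint64_t    vid;     /* volume ID, unique only within a cell */
};

/* Two mounts may share a superblock only if all three parts match. */
bool super_key_matches(const struct sketch_super_key *a,
		       const struct sketch_super_key *b)
{
	return a->net_ns == b->net_ns &&
	       a->cell == b->cell &&
	       a->vid == b->vid;
}
```

      Leaving out the cell comparison in such a test is what lets two
      unrelated volumes with the same ID collapse into one superblock.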
      
      Before the patch:
      
          # ls /afs/grand.central.org/archive
          linuxdev/  mailman/  moin/  mysql/  pipermail/  stage/  twiki/
          # ls /afs/kth.se/
          linuxdev/  mailman/  moin/  mysql/  pipermail/  stage/  twiki/
          # cat /proc/mounts | grep afs
          none /afs afs rw,relatime,dyn,autocell 0 0
          #grand.central.org:root.cell /afs/grand.central.org afs ro,relatime 0 0
          #grand.central.org:root.archive /afs/grand.central.org/archive afs ro,relatime 0 0
          #grand.central.org:root.archive /afs/kth.se afs ro,relatime 0 0
      
      After the patch:
      
          # ls /afs/grand.central.org/archive
          linuxdev/  mailman/  moin/  mysql/  pipermail/  stage/  twiki/
          # ls /afs/kth.se/
          admin/        common/  install/  OldFiles/  service/  system/
          bakrestores/  home/    misc/     pkg/       src/      wsadmin/
          # cat /proc/mounts | grep afs
          none /afs afs rw,relatime,dyn,autocell 0 0
          #grand.central.org:root.cell /afs/grand.central.org afs ro,relatime 0 0
          #grand.central.org:root.archive /afs/grand.central.org/archive afs ro,relatime 0 0
          #kth.se:root.cell /afs/kth.se afs ro,relatime 0 0
      
      Fixes: ^1da177e4 ("Linux-2.6.12-rc2")
      Reported-by: Carsten Jacobi <jacobi@de.ibm.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
      Reviewed-by: Marc Dionne <marc.dionne@auristor.com>
      Tested-by: Jonathan Billings <jsbillings@jsbillings.org>
      cc: Todd DeSantis <atd@us.ibm.com>
    • afs: Fix creation calls in the dynamic root to fail with EOPNOTSUPP · 1da4bd9f
      David Howells authored
      Fix the lookup method on the dynamic root directory such that creation
      calls, such as mkdir, open(O_CREAT), symlink, etc. fail with EOPNOTSUPP
      rather than failing with some odd error (such as EEXIST).
      
      lookup() itself tries to create automount directories when it is invoked.
      These are cached locally in RAM and not committed to storage.
      Signed-off-by: David Howells <dhowells@redhat.com>
      Reviewed-by: Marc Dionne <marc.dionne@auristor.com>
      Tested-by: Jonathan Billings <jsbillings@jsbillings.org>
    • afs: Fix mountpoint parsing · 158d5833
      David Howells authored
      Each AFS mountpoint has strings that define the target to be mounted.  This
      is required to end in a dot that is supposed to be stripped off.  The
      string can include suffixes of ".readonly" or ".backup" - which are
      supposed to come before the terminal dot.  To add to the confusion, the "fs
      lsmount" afs utility does not show the terminal dot when displaying the
      string.
      
      The kernel mount source string parser, however, assumes that the terminal
      dot marks the suffix and that the suffix is always "" and is thus ignored.
      In most cases, there is no suffix and this is not a problem - but if there
      is a suffix, it is lost and this affects the ability to mount the correct
      volume.
      
      The command line mount command, on the other hand, is expected not to
      include a terminal dot - so the problem doesn't arise there.
      
      Fix this by making sure that the dot exists and then stripping it when
      passing the string to the mount configuration.
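
      The "check for the dot, then strip it" step can be sketched in userspace
      like this. strip_terminal_dot() is a hypothetical helper, not the
      function the patch touches, and the error codes are an assumption.

```c
#include <errno.h>
#include <string.h>

/*
 * Copy a mountpoint target into buf with its terminal dot removed, so
 * that any ".readonly" or ".backup" suffix becomes the true end of the
 * string. Returns the stripped length, or a negative errno.
 */
int strip_terminal_dot(const char *target, char *buf, size_t buflen)
{
	size_t len = strlen(target);

	if (len < 2 || target[len - 1] != '.')
		return -EINVAL;          /* the terminal dot is mandatory */
	if (len > buflen)
		return -ENAMETOOLONG;    /* need len - 1 chars plus the NUL */
	memcpy(buf, target, len - 1);
	buf[len - 1] = '\0';
	return (int)(len - 1);
}
```

      For a target such as "#example.org:root.cell.readonly." (a made-up cell
      name), the stripped result ends in ".readonly", which a suffix parser
      can then recognise.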
      
      Fixes: bec5eb61 ("AFS: Implement an autocell mount capability [ver #2]")
      Reported-by: Jonathan Billings <jsbillings@jsbillings.org>
      Signed-off-by: David Howells <dhowells@redhat.com>
      Reviewed-by: Marc Dionne <marc.dionne@auristor.com>
      Tested-by: Jonathan Billings <jsbillings@jsbillings.org>
  4. 10 Dec, 2019 15 commits
  5. 09 Dec, 2019 5 commits