Commits · e59cd88028dbd41472453e5883f78330aa73c56e · nexedi / linux

30 Mar, 2020 6 commits

Merge tag 'for-5.7/io_uring-2020-03-29' of git://git.kernel.dk/linux-block · e59cd880

Linus Torvalds authored Mar 30, 2020

Pull io_uring updates from Jens Axboe:
 "Here are the io_uring changes for this merge window. Light on new
  features this time around (just splice + buffer selection), lots of
  cleanups, fixes, and improvements to existing support. In particular,
  this contains:

   - Cleanup fixed file update handling for stack fallback (Hillf)

   - Re-work of how pollable async IO is handled, we no longer require
     thread offload to handle that. Instead we rely using poll to drive
     this, with task_work execution.

   - In conjunction with the above, allow expendable buffer selection,
     so that poll+recv (for example) no longer has to be a split
     operation.

   - Make sure we honor RLIMIT_FSIZE for buffered writes

   - Add support for splice (Pavel)

   - Linked work inheritance fixes and optimizations (Pavel)

   - Async work fixes and cleanups (Pavel)

   - Improve io-wq locking (Pavel)

   - Hashed link write improvements (Pavel)

   - SETUP_IOPOLL|SETUP_SQPOLL improvements (Xiaoguang)"

* tag 'for-5.7/io_uring-2020-03-29' of git://git.kernel.dk/linux-block: (54 commits)
  io_uring: cleanup io_alloc_async_ctx()
  io_uring: fix missing 'return' in comment
  io-wq: handle hashed writes in chains
  io-uring: drop 'free_pfile' in struct io_file_put
  io-uring: drop completion when removing file
  io_uring: Fix ->data corruption on re-enqueue
  io-wq: close cancel gap for hashed linked work
  io_uring: make spdxcheck.py happy
  io_uring: honor original task RLIMIT_FSIZE
  io-wq: hash dependent work
  io-wq: split hashing and enqueueing
  io-wq: don't resched if there is no work
  io-wq: remove duplicated cancel code
  io_uring: fix truncated async read/readv and write/writev retry
  io_uring: dual license io_uring.h uapi header
  io_uring: io_uring_enter(2) don't poll while SETUP_IOPOLL|SETUP_SQPOLL enabled
  io_uring: Fix unused function warnings
  io_uring: add end-of-bits marker and build time verify it
  io_uring: provide means of removing buffers
  io_uring: add IOSQE_BUFFER_SELECT support for IORING_OP_RECVMSG
  ...

e59cd880

Merge tag 'for-5.7/drivers-2020-03-29' of git://git.kernel.dk/linux-block · 15926148

Linus Torvalds authored Mar 30, 2020

Pull block driver updates from Jens Axboe:

 - floppy driver cleanup series from Willy

 - NVMe updates and fixes (Various)

 - null_blk trace improvements (Chaitanya)

 - bcache fixes (Coly)

 - md fixes (via Song)

 - loop block size change optimizations (Martijn)

 - scnprintf() use (Takashi)

* tag 'for-5.7/drivers-2020-03-29' of git://git.kernel.dk/linux-block: (81 commits)
  null_blk: add trace in null_blk_zoned.c
  null_blk: add tracepoint helpers for zoned mode
  block: add a zone condition debug helper
  nvme: cleanup namespace identifier reporting in nvme_init_ns_head
  nvme: rename __nvme_find_ns_head to nvme_find_ns_head
  nvme: refactor nvme_identify_ns_descs error handling
  nvme-tcp: Add warning on state change failure at nvme_tcp_setup_ctrl
  nvme-rdma: Add warning on state change failure at nvme_rdma_setup_ctrl
  nvme: Fix controller creation races with teardown flow
  nvme: Make nvme_uninit_ctrl symmetric to nvme_init_ctrl
  nvme: Fix ctrl use-after-free during sysfs deletion
  nvme-pci: Re-order nvme_pci_free_ctrl
  nvme: Remove unused return code from nvme_delete_ctrl_sync
  nvme: Use nvme_state_terminal helper
  nvme: release ida resources
  nvme: Add compat_ioctl handler for NVME_IOCTL_SUBMIT_IO
  nvmet-tcp: optimize tcp stack TX when data digest is used
  nvme-fabrics: Use scnprintf() for avoiding potential buffer overflow
  nvme-multipath: do not reset on unknown status
  nvmet-rdma: allocate RW ctxs according to mdts
  ...

15926148

Merge tag 'for-5.7/block-2020-03-29' of git://git.kernel.dk/linux-block · 10f36b1e

Linus Torvalds authored Mar 30, 2020

Pull block updates from Jens Axboe:

 - Online capacity resizing (Balbir)

 - Number of hardware queue change fixes (Bart)

 - null_blk fault injection addition (Bart)

 - Cleanup of queue allocation, unifying the node/no-node API
   (Christoph)

 - Cleanup of genhd, moving code to where it makes sense (Christoph)

 - Cleanup of the partition handling code (Christoph)

 - disk stat fixes/improvements (Konstantin)

 - BFQ improvements (Paolo)

 - Various fixes and improvements

* tag 'for-5.7/block-2020-03-29' of git://git.kernel.dk/linux-block: (72 commits)
  block: return NULL in blk_alloc_queue() on error
  block: move bio_map_* to blk-map.c
  Revert "blkdev: check for valid request queue before issuing flush"
  block: simplify queue allocation
  bcache: pass the make_request methods to blk_queue_make_request
  null_blk: use blk_mq_init_queue_data
  block: add a blk_mq_init_queue_data helper
  block: move the ->devnode callback to struct block_device_operations
  block: move the part_stat* helpers from genhd.h to a new header
  block: move block layer internals out of include/linux/genhd.h
  block: move guard_bio_eod to bio.c
  block: unexport get_gendisk
  block: unexport disk_map_sector_rcu
  block: unexport disk_get_part
  block: mark part_in_flight and part_in_flight_rw static
  block: mark block_depr static
  block: factor out requeue handling from dispatch code
  block/diskstats: replace time_in_queue with sum of request times
  block/diskstats: accumulate all per-cpu counters in one pass
  block/diskstats: more accurate approximation of io_ticks for slow disks
  ...

10f36b1e

Merge tag 'for-5.7/libata-2020-03-29' of git://git.kernel.dk/linux-block · 3a0eb192

Linus Torvalds authored Mar 30, 2020

Pull libata updates from Jens Axboe:

 - Series from Bart, making the libata code smaller on PATA only setups.
   This is useful for smaller/embedded use cases, and will help us move
   some of those off drivers/ide.

 - Kill unused BPRINTK() (Hannes)

 - Add various Comet Lake ahci PCI ids (Kai-Heng, Mika)

 - Fix for a double scsi_host_put() in error handling (John)

 - Use scnprintf (Takashi)

 - Assign OF node to the SCSI device (Linus Walleij)

* tag 'for-5.7/libata-2020-03-29' of git://git.kernel.dk/linux-block: (36 commits)
  ata: make "libata.force" kernel parameter optional
  ata: move ata_eh_analyze_ncq_error() & co. to libata-sata.c
  ata: start separating SATA specific code from libata-eh.c
  ata: move ata_sas_*() to libata-sata.c
  ata: start separating SATA specific code from libata-scsi.c
  ata: move sata_deb_timing_*() to libata-sata.c
  ata: move ata_qc_complete_multiple() to libata-sata.c
  ata: move sata_link_hardreset() to libata-sata.c
  ata: move sata_link_{debounce,resume}() to libata-sata.c
  ata: move *sata_set_spd*() to libata-sata.c
  ata: move sata_scr_*() to libata-sata.c
  ata: start separating SATA specific code from libata-core.c
  ata: let compiler optimize out ata_eh_set_lpm() on non-SATA hosts
  ata: let compiler optimize out ata_dev_config_ncq() on non-SATA hosts
  ata: add CONFIG_SATA_HOST=n version of ata_ncq_enabled()
  ata: separate PATA timings code from libata-core.c
  ata: fix CodingStyle issues in PATA timings code
  ata: remove EXPORT_SYMBOL_GPL()s not used by modules
  ata: move EXPORT_SYMBOL_GPL()s close to exported code
  ata: optimize ata_scsi_rbuf[] size
  ...

3a0eb192

Merge tag 'i3c/for-5.7' of git://git.kernel.org/pub/scm/linux/kernel/git/i3c/linux · c03cb664

Linus Torvalds authored Mar 30, 2020

Pull i3c updates from Boris Brezillon:

 - Fix driver auto-probing related issues

 - Stop using the deprecated i2c_new_device() function

 - Replace zero-length array with flexible-array member

* tag 'i3c/for-5.7' of git://git.kernel.org/pub/scm/linux/kernel/git/i3c/linux:
  i3c: convert to use i2c_new_client_device()
  i3c: master: Replace zero-length array with flexible-array member
  i3c: Simplify i3c_device_match_id()
  i3c: Generate aliases for i3c modules
  i3c: Add a modalias sysfs attribute
  i3c: Fix MODALIAS uevents
  i3c: master: no need to iterate master device twice

c03cb664

Merge tag 'tpmdd-next-20200316' of git://git.infradead.org/users/jjs/linux-tpmdd · 0f751396

Linus Torvalds authored Mar 30, 2020

Pull tpm updates from Jarkko Sakkinen:
 "tpmdd updates for Linux v5.7"

* tag 'tpmdd-next-20200316' of git://git.infradead.org/users/jjs/linux-tpmdd:
  KEYS: reaching the keys quotas correctly
  tpm: ibmvtpm: Add support for TPM2
  tpm: ibmvtpm: Wait for buffer to be set before proceeding
  tpm: of: Handle IBM,vtpm20 case when getting log parameters
  MAINTAINERS: adjust to trusted keys subsystem creation
  tpm: tpm_tis_spi_cr50: use new structure for SPI transfer delays
  tpm_tis_spi: use new 'delay' structure for SPI transfer delays
  tpm: tpm2_bios_measurements_next should increase position index
  tpm: tpm1_bios_measurements_next should increase position index
  tpm: Don't make log failures fatal

0f751396

29 Mar, 2020 12 commits

Linux 5.6 · 7111951b
Linus Torvalds authored Mar 29, 2020

7111951b

Merge branch 'akpm' (patches from Andrew) · 570203ec

Linus Torvalds authored Mar 29, 2020

Merge vm fixes from Andrew Morton:
 "5 fixes"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  mm/sparse: fix kernel crash with pfn_section_valid check
  mm: fork: fix kernel_stack memcg stats for various stack implementations
  hugetlb_cgroup: fix illegal access to memory
  drivers/base/memory.c: indicate all memory blocks as removable
  mm/swapfile.c: move inode_lock out of claim_swapfile

570203ec

Merge tag 'timers-urgent-2020-03-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · ab93e984

Linus Torvalds authored Mar 29, 2020

Pull timer fix from Thomas Gleixner:
 "A single fix for the Hyper-V clocksource driver to make sched clock
  actually return nanoseconds and not the virtual clock value which
  increments at 10e7 HZ (100ns)"

* tag 'timers-urgent-2020-03-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  clocksource/drivers/hyper-v: Make sched clock return nanoseconds correctly

ab93e984

Merge tag 'irq-urgent-2020-03-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 01af08bd

Linus Torvalds authored Mar 29, 2020

Pull irq fix from Thomas Gleixner:
 "A single bugfix to prevent reference leaks in irq affinity notifiers"

* tag 'irq-urgent-2020-03-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  genirq: Fix reference leaks on irq affinity notifiers

01af08bd

mm/sparse: fix kernel crash with pfn_section_valid check · b943f045

Aneesh Kumar K.V authored Mar 28, 2020

Fix the crash like this:

    BUG: Kernel NULL pointer dereference on read at 0x00000000
    Faulting instruction address: 0xc000000000c3447c
    Oops: Kernel access of bad area, sig: 11 [#1]
    LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
    CPU: 11 PID: 7519 Comm: lt-ndctl Not tainted 5.6.0-rc7-autotest #1
    ...
    NIP [c000000000c3447c] vmemmap_populated+0x98/0xc0
    LR [c000000000088354] vmemmap_free+0x144/0x320
    Call Trace:
       section_deactivate+0x220/0x240
       __remove_pages+0x118/0x170
       arch_remove_memory+0x3c/0x150
       memunmap_pages+0x1cc/0x2f0
       devm_action_release+0x30/0x50
       release_nodes+0x2f8/0x3e0
       device_release_driver_internal+0x168/0x270
       unbind_store+0x130/0x170
       drv_attr_store+0x44/0x60
       sysfs_kf_write+0x68/0x80
       kernfs_fop_write+0x100/0x290
       __vfs_write+0x3c/0x70
       vfs_write+0xcc/0x240
       ksys_write+0x7c/0x140
       system_call+0x5c/0x68

The crash is due to NULL dereference at

	test_bit(idx, ms->usage->subsection_map);

due to ms->usage = NULL in pfn_section_valid()

With commit d41e2f3b ("mm/hotplug: fix hot remove failure in
SPARSEMEM|!VMEMMAP case") section_mem_map is set to NULL after
depopulate_section_mem().  This was done so that pfn_page() can work
correctly with kernel config that disables SPARSEMEM_VMEMMAP.  With that
config pfn_to_page does

	__section_mem_map_addr(__sec) + __pfn;

where

  static inline struct page *__section_mem_map_addr(struct mem_section *section)
  {
	unsigned long map = section->section_mem_map;
	map &= SECTION_MAP_MASK;
	return (struct page *)map;
  }

Now with SPASEMEM_VMEMAP enabled, mem_section->usage->subsection_map is
used to check the pfn validity (pfn_valid()).  Since section_deactivate
release mem_section->usage if a section is fully deactivated,
pfn_valid() check after a subsection_deactivate cause a kernel crash.

  static inline int pfn_valid(unsigned long pfn)
  {
  ...
	return early_section(ms) || pfn_section_valid(ms, pfn);
  }

where

  static inline int pfn_section_valid(struct mem_section *ms, unsigned long pfn)
  {
	int idx = subsection_map_index(pfn);

	return test_bit(idx, ms->usage->subsection_map);
  }

Avoid this by clearing SECTION_HAS_MEM_MAP when mem_section->usage is
freed.  For architectures like ppc64 where large pages are used for
vmmemap mapping (16MB), a specific vmemmap mapping can cover multiple
sections.  Hence before a vmemmap mapping page can be freed, the kernel
needs to make sure there are no valid sections within that mapping.
Clearing the section valid bit before depopulate_section_memap enables
this.

[aneesh.kumar@linux.ibm.com: add comment]
  Link: http://lkml.kernel.org/r/20200326133235.343616-1-aneesh.kumar@linux.ibm.comLink: http://lkml.kernel.org/r/20200325031914.107660-1-aneesh.kumar@linux.ibm.com
Fixes: d41e2f3b ("mm/hotplug: fix hot remove failure in SPARSEMEM|!VMEMMAP case")
Reported-by: Sachin Sant <sachinp@linux.vnet.ibm.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Tested-by: Sachin Sant <sachinp@linux.vnet.ibm.com>
Reviewed-by: Baoquan He <bhe@redhat.com>
Reviewed-by: Wei Yang <richard.weiyang@gmail.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

b943f045

mm: fork: fix kernel_stack memcg stats for various stack implementations · 8380ce47

Roman Gushchin authored Mar 28, 2020

Depending on CONFIG_VMAP_STACK and the THREAD_SIZE / PAGE_SIZE ratio the
space for task stacks can be allocated using __vmalloc_node_range(),
alloc_pages_node() and kmem_cache_alloc_node().

In the first and the second cases page->mem_cgroup pointer is set, but
in the third it's not: memcg membership of a slab page should be
determined using the memcg_from_slab_page() function, which looks at
page->slab_cache->memcg_params.memcg .  In this case, using
mod_memcg_page_state() (as in account_kernel_stack()) is incorrect:
page->mem_cgroup pointer is NULL even for pages charged to a non-root
memory cgroup.

It can lead to kernel_stack per-memcg counters permanently showing 0 on
some architectures (depending on the configuration).

In order to fix it, let's introduce a mod_memcg_obj_state() helper,
which takes a pointer to a kernel object as a first argument, uses
mem_cgroup_from_obj() to get a RCU-protected memcg pointer and calls
mod_memcg_state().  It allows to handle all possible configurations
(CONFIG_VMAP_STACK and various THREAD_SIZE/PAGE_SIZE values) without
spilling any memcg/kmem specifics into fork.c .

Note: This is a special version of the patch created for stable
backports.  It contains code from the following two patches:
  - mm: memcg/slab: introduce mem_cgroup_from_obj()
  - mm: fork: fix kernel_stack memcg stats for various stack implementations

[guro@fb.com: introduce mem_cgroup_from_obj()]
  Link: http://lkml.kernel.org/r/20200324004221.GA36662@carbon.dhcp.thefacebook.com
Fixes: 4d96ba35 ("mm: memcg/slab: stop setting page->mem_cgroup pointer for slab pages")
Signed-off-by: Roman Gushchin <guro@fb.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Bharata B Rao <bharata@linux.ibm.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/20200303233550.251375-1-guro@fb.comSigned-off-by: Linus Torvalds <torvalds@linux-foundation.org>

8380ce47

hugetlb_cgroup: fix illegal access to memory · 726b7bbe

Mina Almasry authored Mar 28, 2020

This appears to be a mistake in commit faced7e0 ("mm: hugetlb
controller for cgroups v2").

Essentially that commit does a hugetlb_cgroup_from_counter assuming that
page_counter_try_charge has initialized counter.

But if that has failed then it seems will not initialize counter, so
hugetlb_cgroup_from_counter(counter) ends up pointing to random memory,
causing kasan to complain.

The solution is to simply use 'h_cg', instead of
hugetlb_cgroup_from_counter(counter), since that is a reference to the
hugetlb_cgroup anyway.  After this change kasan ceases to complain.

Fixes: faced7e0 ("mm: hugetlb controller for cgroups v2")
Reported-by: syzbot+cac0c4e204952cf449b1@syzkaller.appspotmail.com
Signed-off-by: Mina Almasry <almasrymina@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Giuseppe Scrivano <gscrivan@redhat.com>
Acked-by: Tejun Heo <tj@kernel.org>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: David Rientjes <rientjes@google.com>
Link: http://lkml.kernel.org/r/20200313223920.124230-1-almasrymina@google.comSigned-off-by: Linus Torvalds <torvalds@linux-foundation.org>

726b7bbe

drivers/base/memory.c: indicate all memory blocks as removable · 53cdc1cb

David Hildenbrand authored Mar 28, 2020

We see multiple issues with the implementation/interface to compute
whether a memory block can be offlined (exposed via
/sys/devices/system/memory/memoryX/removable) and would like to simplify
it (remove the implementation).

1. It runs basically lockless. While this might be good for performance,
   we see possible races with memory offlining that will require at
   least some sort of locking to fix.

2. Nowadays, more false positives are possible. No arch-specific checks
   are performed that validate if memory offlining will not be denied
   right away (and such check will require locking). For example, arm64
   won't allow to offline any memory block that was added during boot -
   which will imply a very high error rate. Other archs have other
   constraints.

3. The interface is inherently racy. E.g., if a memory block is detected
   to be removable (and was not a false positive at that time), there is
   still no guarantee that offlining will actually succeed. So any
   caller already has to deal with false positives.

4. It is unclear which performance benefit this interface actually
   provides. The introducing commit 5c755e9f ("memory-hotplug: add
   sysfs removable attribute for hotplug memory remove") mentioned

	"A user-level agent must be able to identify which sections
	 of memory are likely to be removable before attempting the
	 potentially expensive operation."

   However, no actual performance comparison was included.

Known users:

 - lsmem: Will group memory blocks based on the "removable" property. [1]

 - chmem: Indirect user. It has a RANGE mode where one can specify
          removable ranges identified via lsmem to be offlined. However,
          it also has a "SIZE" mode, which allows a sysadmin to skip the
          manual "identify removable blocks" step. [2]

 - powerpc-utils: Uses the "removable" attribute to skip some memory
          blocks right away when trying to find some to offline+remove.
          However, with ballooning enabled, it already skips this
          information completely (because it once resulted in many false
          negatives). Therefore, the implementation can deal with false
          positives properly already. [3]

According to Nathan Fontenot, DLPAR on powerpc is nowadays no longer
driven from userspace via the drmgr command (powerpc-utils).  Nowadays
it's managed in the kernel - including onlining/offlining of memory
blocks - triggered by drmgr writing to /sys/kernel/dlpar.  So the
affected legacy userspace handling is only active on old kernels.  Only
very old versions of drmgr on a new kernel (unlikely) might execute
slower - totally acceptable.

With CONFIG_MEMORY_HOTREMOVE, always indicating "removable" should not
break any user space tool.  We implement a very bad heuristic now.
Without CONFIG_MEMORY_HOTREMOVE we cannot offline anything, so report
"not removable" as before.

Original discussion can be found in [4] ("[PATCH RFC v1] mm:
is_mem_section_removable() overhaul").

Other users of is_mem_section_removable() will be removed next, so that
we can remove is_mem_section_removable() completely.

[1] http://man7.org/linux/man-pages/man1/lsmem.1.html
[2] http://man7.org/linux/man-pages/man8/chmem.8.html
[3] https://github.com/ibm-power-utilities/powerpc-utils
[4] https://lkml.kernel.org/r/20200117105759.27905-1-david@redhat.com

Also, this patch probably fixes a crash reported by Steve.
http://lkml.kernel.org/r/CAPcyv4jpdaNvJ67SkjyUJLBnBnXXQv686BiVW042g03FUmWLXw@mail.gmail.comReported-by: "Scargall, Steve" <steve.scargall@intel.com>
Suggested-by: Michal Hocko <mhocko@kernel.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Nathan Fontenot <ndfont@gmail.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Badari Pulavarty <pbadari@us.ibm.com>
Cc: Robert Jennings <rcj@linux.vnet.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Karel Zak <kzak@redhat.com>
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/20200128093542.6908-1-david@redhat.comSigned-off-by: Linus Torvalds <torvalds@linux-foundation.org>

53cdc1cb

mm/swapfile.c: move inode_lock out of claim_swapfile · d795a90e

Naohiro Aota authored Mar 28, 2020

claim_swapfile() currently keeps the inode locked when it is successful,
or the file is already swapfile (with -EBUSY).  And, on the other error
cases, it does not lock the inode.

This inconsistency of the lock state and return value is quite confusing
and actually causing a bad unlock balance as below in the "bad_swap"
section of __do_sys_swapon().

This commit fixes this issue by moving the inode_lock() and IS_SWAPFILE
check out of claim_swapfile().  The inode is unlocked in
"bad_swap_unlock_inode" section, so that the inode is ensured to be
unlocked at "bad_swap".  Thus, error handling codes after the locking now
jumps to "bad_swap_unlock_inode" instead of "bad_swap".

    =====================================
    WARNING: bad unlock balance detected!
    5.5.0-rc7+ #176 Not tainted
    -------------------------------------
    swapon/4294 is trying to release lock (&sb->s_type->i_mutex_key) at: __do_sys_swapon+0x94b/0x3550
    but there are no more locks to release!

    other info that might help us debug this:
    no locks held by swapon/4294.

    stack backtrace:
    CPU: 5 PID: 4294 Comm: swapon Not tainted 5.5.0-rc7-BTRFS-ZNS+ #176
    Hardware name: ASUS All Series/H87-PRO, BIOS 2102 07/29/2014
    Call Trace:
     dump_stack+0xa1/0xea
     print_unlock_imbalance_bug.cold+0x114/0x123
     lock_release+0x562/0xed0
     up_write+0x2d/0x490
     __do_sys_swapon+0x94b/0x3550
     __x64_sys_swapon+0x54/0x80
     do_syscall_64+0xa4/0x4b0
     entry_SYSCALL_64_after_hwframe+0x49/0xbe
    RIP: 0033:0x7f15da0a0dc7

Fixes: 1638045c ("mm: set S_SWAPFILE on blockdev swap devices")
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Tested-by: Qais Youef <qais.yousef@arm.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/20200206090132.154869-1-naohiro.aota@wdc.comSigned-off-by: Linus Torvalds <torvalds@linux-foundation.org>

d795a90e

block: return NULL in blk_alloc_queue() on error · 654a3667

Chaitanya Kulkarni authored Mar 29, 2020

This patch fixes follwoing warning:

block/blk-core.c: In function ‘blk_alloc_queue’:
block/blk-core.c:558:10: warning: returning ‘int’ from a function with return type ‘struct request_queue *’ makes pointer from integer without a cast [-Wint-conversion]
   return -EINVAL;

Fixes: 3d745ea5 ("block: simplify queue allocation")
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

654a3667

i3c: convert to use i2c_new_client_device() · c4b9de11

Wolfram Sang authored Mar 26, 2020

Move away from the deprecated API.
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Link: https://lore.kernel.org/linux-i3c/20200326211002.13241-2-wsa+renesas@sang-engineering.com

c4b9de11

Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · e595dd94

Linus Torvalds authored Mar 28, 2020

Pull networking fixes from David Miller:

 1) Fix memory leak in vti6, from Torsten Hilbrich.

 2) Fix double free in xfrm_policy_timer, from YueHaibing.

 3) NL80211_ATTR_CHANNEL_WIDTH attribute is put with wrong type, from
    Johannes Berg.

 4) Wrong allocation failure check in qlcnic driver, from Xu Wang.

 5) Get ks8851-ml IO operations right, for real this time, from Marek
    Vasut.

* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (22 commits)
  r8169: fix PHY driver check on platforms w/o module softdeps
  net: ks8851-ml: Fix IO operations, again
  mlxsw: spectrum_mr: Fix list iteration in error path
  qlcnic: Fix bad kzalloc null test
  mac80211: set IEEE80211_TX_CTRL_PORT_CTRL_PROTO for nl80211 TX
  mac80211: mark station unauthorized before key removal
  mac80211: Check port authorization in the ieee80211_tx_dequeue() case
  cfg80211: Do not warn on same channel at the end of CSA
  mac80211: drop data frames without key on encrypted links
  ieee80211: fix HE SPR size calculation
  nl80211: fix NL80211_ATTR_CHANNEL_WIDTH attribute type
  xfrm: policy: Fix doulbe free in xfrm_policy_timer
  bpf: Explicitly memset some bpf info structures declared on the stack
  bpf: Explicitly memset the bpf_attr structure
  bpf: Sanitize the bpf_struct_ops tcp-cc name
  vti6: Fix memory leak of skb if input policy check fails
  esp: remove the skb from the chain when it's enqueued in cryptd_wq
  ipv6: xfrm6_tunnel.c: Use built-in RCU list checking
  xfrm: add the missing verify_sec_ctx_len check in xfrm_add_acquire
  xfrm: fix uctx len check in verify_sec_ctx_len
  ...

e595dd94

28 Mar, 2020 4 commits

Merge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux · 906c4043

Linus Torvalds authored Mar 28, 2020

Pull i2c fixes from Wolfram Sang:
 "Three more driver bugfixes, and two doc improvements fixing build
  warnings while we are here"

* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
  i2c: pca-platform: Use platform_irq_get_optional
  i2c: st: fix missing struct parameter description
  i2c: nvidia-gpu: Handle timeout correctly in gpu_i2c_check_status()
  i2c: fix a doc warning
  i2c: hix5hd2: add missed clk_disable_unprepare in remove

906c4043

Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · 83fd69c9

Linus Torvalds authored Mar 28, 2020

Pull SCSI fixes from James Bottomley:
 "Two small fixes: one in drivers (qla2xxx), and one in the core (sd) to
  try to cope with USB enclosures that silently change reported
  parameters"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: sd: Fix optimal I/O size for devices that change reported values
  scsi: qla2xxx: Fix I/Os being passed down when FC device is being deleted

83fd69c9

i2c: pca-platform: Use platform_irq_get_optional · 14c1fe69

Chris Packham authored Mar 27, 2020

The interrupt is not required so use platform_irq_get_optional() to
avoid error messages like

  i2c-pca-platform 22080000.i2c: IRQ index 0 not found
Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>

14c1fe69

i2c: st: fix missing struct parameter description · f491c668

Alain Volmat authored Mar 26, 2020

Fix a missing struct parameter description to allow
warning free W=1 compilation.
Signed-off-by: Alain Volmat <avolmat@me.com>
Reviewed-by: Patrice Chotard <patrice.chotard@st.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>

f491c668

27 Mar, 2020 18 commits

Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf · a0ba26f3

David S. Miller authored Mar 27, 2020

Daniel Borkmann says:

====================
pull-request: bpf 2020-03-27

The following pull-request contains BPF updates for your *net* tree.

We've added 3 non-merge commits during the last 4 day(s) which contain
a total of 4 files changed, 25 insertions(+), 20 deletions(-).

The main changes are:

1) Explicitly memset the bpf_attr structure on bpf() syscall to avoid
   having to rely on compiler to do so. Issues have been noticed on
   some compilers with padding and other oddities where the request was
   then unexpectedly rejected, from Greg Kroah-Hartman.

2) Sanitize the bpf_struct_ops TCP congestion control name in order to
   avoid problematic characters such as whitespaces, from Martin KaFai Lau.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

a0ba26f3

r8169: fix PHY driver check on platforms w/o module softdeps · 2e8c339b

Heiner Kallweit authored Mar 27, 2020

On Android/x86 the module loading infrastructure can't deal with
softdeps. Therefore the check for presence of the Realtek PHY driver
module fails. mdiobus_register() will try to load the PHY driver
module, therefore move the check to after this call and explicitly
check that a dedicated PHY driver is bound to the PHY device.

Fixes: f3259377 ("r8169: check that Realtek PHY driver module is loaded")
Reported-by: Chih-Wei Huang <cwhuang@android-x86.org>
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

2e8c339b

Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec · e00dd941

David S. Miller authored Mar 27, 2020

Steffen Klassert says:

====================
pull request (net): ipsec 2020-03-27

1) Handle NETDEV_UNREGISTER for xfrm device to handle asynchronous
   unregister events cleanly. From Raed Salem.

2) Fix vti6 tunnel inter address family TX through bpf_redirect().
   From Nicolas Dichtel.

3) Fix lenght check in verify_sec_ctx_len() to avoid a
   slab-out-of-bounds. From Xin Long.

4) Add a missing verify_sec_ctx_len check in xfrm_add_acquire
   to avoid a possible out-of-bounds to access. From Xin Long.

5) Use built-in RCU list checking of hlist_for_each_entry_rcu
   to silence false lockdep warning in __xfrm6_tunnel_spi_lookup
   when CONFIG_PROVE_RCU_LIST is enabled. From Madhuparna Bhowmik.

6) Fix a panic on esp offload when crypto is done asynchronously.
   From Xin Long.

7) Fix a skb memory leak in an error path of vti6_rcv.
   From Torsten Hilbrich.

8) Fix a race that can lead to a doulbe free in xfrm_policy_timer.
   From Xin Long.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

e00dd941

Merge branch 'parisc-5.6-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux · 69c5eea3

Linus Torvalds authored Mar 27, 2020

Pull parsic fix from Helge Deller:
 "Fix a recursive loop when running 'make ARCH=parisc defconfig'"

* 'parisc-5.6-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
  parisc: Fix defconfig selection

69c5eea3

Merge tag 'arm-soc-fixes-5.6' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc · 32db9f10

Linus Torvalds authored Mar 27, 2020

Pull ARM DT and driver fixes from Arnd Bergmann:
 "For the devicetree files, there are a total of 20 patches, almost
  entirely for 32-bit machines:

   - The Allwinner/sun9i r40 SoC dtsi file contains a number of issues,
     both for correctness and for style that are addressed in separate
     patches. This causes most of the changed lines of the DT updates
     this time.

   - More Allwinner updates fixing the identification of the security
     system on sun8i/A33, a recent regression of the A83t ethernet, and
     a few board specific issues on the TBS-A711 macine.

   - Several bug fixes for OMAP dts files, most notably fixing the
     timings for the NAND flash on the Nokia N900 that regressed a while
     ago after the move to configuring them from DT. Some other OMAPs
     now set the correct dma limits on the L3 bus, and a regression fix
     addresses lost Ethernet on dm814x

   - One incorrect setting in the newly added Raspberry Pi Zero W that
     may cause issues with the SD card controller.

   - A missing property on the bcm2835 firmware node caused incorrect
     DMA settings.

   - An old bug on the oxnas platform causing spurious interrupts is
     finally addressed.

   - A regression on the Exynos Midas board broke the OLED panel power
     supply.

   - The i.MX6 phycore SoM specified the wrong voltage for the SoC, this
     is now set to the values from the datasheet.

   - Some 64-bit machines use a deprecated string to identify the PSCI
     firmware.

  There are also several small code fixes addressing mostly serious
  issues:

   - Fix the sunxi rsb bus access to no longer return incorrect data
     when mixing 8 and 16 bit I/O.

   - Fix a suspend/resume regression on the OMAP2+ lcdc from a missing
     quirk in the ti-sysc driver

   - Fix a NULL pointer access from a race in the fsl dpio driver

   - Fix a v5.5 regression in the exynos-chipid driver that caused an
     invalid error code probing the device on non-exynos platforms

   - Fix an out-of-bounds access in the AMD TEE driver"

* tag 'arm-soc-fixes-5.6' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (24 commits)
  soc: samsung: chipid: Fix return value on non-Exynos platforms
  arm64: dts: Fix leftover entry-methods for PSCI
  ARM: dts: exynos: Fix regulator node aliasing on Midas-based boards
  ARM: dts: oxnas: Fix clear-mask property
  ARM: dts: bcm283x: Fix vc4's firmware bus DMA limitations
  ARM: dts: omap5: Add bus_dma_limit for L3 bus
  ARM: dts: omap4-droid4: Fix lost touchscreen interrupts
  ARM: dts: dra7: Add bus_dma_limit for L3 bus
  ARM: bcm2835-rpi-zero-w: Add missing pinctrl name
  ARM: dts: sun8i: a33: add the new SS compatible
  dt-bindings: crypto: add new compatible for A33 SS
  ARM: dts: sun8i: r40: Move SPI device nodes based on address order
  ARM: dts: sun8i: r40: Fix register base address for SPI2 and SPI3
  ARM: dts: sun8i: r40: Move AHCI device node based on address order
  ARM: dts: imx6: phycore-som: fix arm and soc minimum voltage
  soc: fsl: dpio: register dpio irq handlers after dpio create
  tee: amdtee: out of bounds read in find_session()
  ARM: dts: N900: fix onenand timings
  bus: ti-sysc: Fix quirk flags for lcdc on am335x
  ARM: dts: Fix dm814x Ethernet by changing to use rgmii-id mode
  ...

32db9f10

null_blk: add trace in null_blk_zoned.c · 766c3297

Chaitanya Kulkarni authored Mar 25, 2020

With the help of previously added tracepoints we can now trace
report-zones, zone-write and zone-mgmt ops in null_blk_zoned.c.
Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

766c3297

null_blk: add tracepoint helpers for zoned mode · c51d0419

Chaitanya Kulkarni authored Mar 25, 2020

This patch adds two new tracpoints for null_blk_zoned.c that allows us
to trace report-zones, zone-mgmt-op and zone-write operations which has
direct effect on the zone condition state machine.

Also, we update drivers/block/Makefile so that new null_blk related
tracefiles can be compiled.
Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

c51d0419

block: add a zone condition debug helper · 02694e86

Chaitanya Kulkarni authored Mar 25, 2020

Add a helper to stringify the zone conditions. We use this helper in the
next patch to track zone conditions in tracepoints.
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

02694e86

Merge tag 'riscv-for-linus-5.6' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux · 823846c3

Linus Torvalds authored Mar 27, 2020

Pull RISC-V fixes from Palmer Dabbelt:
 "Sorry for the last minute patches, but a few things fell through the
  cracks recently. I was on the fence about sending a late pull request
  just for the M-mode fixes, as we don't really have any users, but the
  last patch fixes the build for Fedora which I consider pretty
  important.

  Given that the M-mode fixes should be very low risk, I figured it's
  worth sending them along as well.

  Thhis passes my standard 'boot in QEMU' test"

* tag 'riscv-for-linus-5.6' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  RISC-V: Move all address space definition macros to one place
  RISC-V: Only select essential drivers for SOC_VIRT config
  riscv: fix the IPI missing issue in nommu mode
  riscv: uaccess should be used in nommu mode

823846c3

block: move bio_map_* to blk-map.c · 130879f1

Christoph Hellwig authored Mar 27, 2020

The bio_map_* helpers are just the low-level helpers for the
blk_rq_map_* APIs.  Move them together for better logical grouping,
as no there isn't much overlap with other code in bio.c.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

130879f1

Merge tag 'devicetree-fixes-for-5.6-4' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux · bb36d37e

Linus Torvalds authored Mar 27, 2020

Pull Devicetree fix from Rob Herring:
 "A single fix for building dtc with GCC 10"

* tag 'devicetree-fixes-for-5.6-4' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
  scripts/dtc: Remove redundant YYLOC global declaration

bb36d37e

Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux · 1fa8cb0b

Linus Torvalds authored Mar 27, 2020

Pull arm64 fix from Will Deacon:
 "Fix defconfig build when using Clang's integrated assembler"

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: alternative: fix build with clang integrated assembler

1fa8cb0b

Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux · 527630fb

Linus Torvalds authored Mar 27, 2020

Pull clk fixes from Stephen Boyd:
 "A handful of clk driver fixes.

  Mostly they're around the i.MX drivers fixing the parents of a few
  clks and making KASAN happy with how the message passing code works.

  Besides that we have a TI driver fix for the RTC parent and a fix for
  the basic gate type registration functions introduced this release
  where they didn't actually pass the arguments in the right places to
  the multiplexer function down below"

* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
  clk: imx: Align imx sc clock parent msg structs to 4
  clk: imx: Align imx sc clock msg structs to 4
  clk: Pass correct arguments to __clk_hw_register_gate()
  clk: ti: am43xx: Fix clock parent for RTC clock
  clk: imx8mp: Correct the enet_qos parent clock
  clk: imx8mp: Correct IMX8MP_CLK_HDMI_AXI clock parent

527630fb

Revert "blkdev: check for valid request queue before issuing flush" · f01b411f

Christoph Hellwig authored Mar 27, 2020

This reverts commit f10d9f61.

We can't have queues without a make_request_fn any more (and the
loop device uses blk-mq these days anyway..).
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

f01b411f

block: simplify queue allocation · 3d745ea5

Christoph Hellwig authored Mar 27, 2020

Current make_request based drivers use either blk_alloc_queue_node or
blk_alloc_queue to allocate a queue, and then set up the make_request_fn
function pointer and a few parameters using the blk_queue_make_request
helper. Simplify this by passing the make_request pointer to
blk_alloc_queue, and while at it merge the _node variant into the main
helper by always passing a node_id, and remove the superfluous gfp_mask
parameter. A lower-level __blk_alloc_queue is kept for the blk-mq case.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

3d745ea5

bcache: pass the make_request methods to blk_queue_make_request · ff27668c

Christoph Hellwig authored Mar 27, 2020

bcache is the only driver not actually passing its make_request
methods to blk_queue_make_request, but instead just sets them up
manually a little later.  Make bcache follow the common way of
setting up make_request based queues.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

ff27668c

null_blk: use blk_mq_init_queue_data · 8d96a111

Christoph Hellwig authored Mar 27, 2020

Use the new blk_mq_init_queue_data instead of open coding the queue
allocation and initialization.
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

8d96a111

block: add a blk_mq_init_queue_data helper · 2f227bb9

Christoph Hellwig authored Mar 27, 2020

This allows a driver to pass a queuedata member before ->init_hctx is
called.  null_blk currently open codes this logic, but I'd rather have
it in the core to ease future maintainance.
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>

2f227bb9