1. 14 Oct, 2020 40 commits
    • Merge tag 'pinctrl-v5.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl · b4e1bce8
      Linus Torvalds authored
      Pull pin control updates from Linus Walleij:
       "Core changes:
      
         - NONE whatsoever, we don't even touch the core files this time
           around.
      
        New drivers:
      
         - New driver for the Toshiba Visconti SoC.
      
         - New subdriver for the Qualcomm MSM8226 SoC.
      
         - New subdriver for the Actions Semiconductor S500 SoC.
      
         - New subdriver for the Mediatek MT8192 SoC.
      
         - New subdriver for the Microchip SAMA7G5 SoC.
      
        Driver enhancements:
      
         - Intel Cherryview and Baytrail cleanups and refactorings.
      
         - Enhanced support for the Renesas R8A7790, more pins and groups.
      
         - Some optimizations for the MCP23S08 MCP23x17 variant.
      
         - Some cleanups around the Actions Semiconductor subdrivers.
      
         - A bunch of cleanups around the SH-PFC and Emma Mobile drivers.
      
         - The "SH-PFC" (literally SuperH pin function controller, I think)
           subdirectory is now renamed to the more neutral "renesas", as these
           are not very much centered around SuperH anymore.
      
         - Non-critical fixes for the Aspeed driver.
      
         - Non-critical fixes for the Ingenic (MIPS!) driver.
      
         - Fix a bunch of missing pins on the AMD pinctrl driver"
      
      * tag 'pinctrl-v5.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl: (78 commits)
        pinctrl: amd: Add missing pins to the pin group list
        dt-bindings: pinctrl: sunxi: Allow pinctrl with more interrupt banks
        pinctrl: visconti: PINCTRL_TMPV7700 should depend on ARCH_VISCONTI
        pinctrl: mediatek: Free eint data on failure
        pinctrl: single: fix debug output when #pinctrl-cells = 2
        pinctrl: single: fix pinctrl_spec.args_count bounds check
        pinctrl: sunrisepoint: Modify COMMUNITY macros to be consistent
        pinctrl: cannonlake: Modify COMMUNITY macros to be consistent
        pinctrl: tigerlake: Fix register offsets for TGL-H variant
        pinctrl: Document pinctrl-single,pins when #pinctrl-cells = 2
        pinctrl: mediatek: use devm_platform_ioremap_resource_byname()
        pinctrl: nuvoton: npcm7xx: Constify static ops structs
        pinctrl: mediatek: mt7622: add antsel pins/groups
        pinctrl: ocelot: simplify the return expression of ocelot_gpiochip_register()
        pinctrl: at91-pio4: add support for sama7g5 SoC
        dt-bindings: pinctrl: at91-pio4: add microchip,sama7g5
        pinctrl: spear: simplify the return expression of tvc_connect()
        pinctrl: spear: simplify the return expression of spear310_pinctrl_probe
        pinctrl: sprd: use module_platform_driver to simplify the code
        pinctrl: Ingenic: Add I2S pins support for Ingenic SoCs.
        ...
      b4e1bce8
    • Merge tag 'leds-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/pavel/linux-leds · 7fafb54c
      Linus Torvalds authored
      Pull LED updates from Pavel Machek:
       "Quite a lot of stuff is going on here. Great cleanups/fixes from Marek
        and others are biggest part.
      
        I limited CPU LED trigger to 8 LEDs, because it was willing to
        register 1024 'triggers' on machine with 1024 CPUs. I don't believe it
        will cause any problems, but we can raise the limit if it does"
      
      * tag 'leds-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/pavel/linux-leds: (84 commits)
        leds: pwm: Remove platform_data support
        leds: lm3697: Fix out-of-bound access
        leds: ns2: do not guard OF match pointer with of_match_ptr
        leds: ns2: convert to fwnode API
        leds: tlc591xx: fix leak of device node iterator
        leds: pca963x: use struct led_init_data when registering
        leds: pca963x: register LEDs immediately after parsing, get rid of platdata
        leds: tca6507: remove binding comment
        leds: tca6507: cosmetic change: use helper variable
        leds: tca6507: do not set GPIO names
        dt-bindings: leds: tca6507: convert to YAML
        ledtrig-cpu: Limit to 8 CPUs
        leds: TODO: Add documentation about possible subsystem improvements
        leds: pca9532: read pwm settings from device tree
        leds: pca9532: correct shift computation in pca9532_getled
        leds: lm36274: Fix warning for undefined parameters
        leds: lm3532: Fix warnings for undefined parameters
        leds: pca963x: use flexible array
        leds: pca963x: cosmetic: rename variables
        leds: pca963x: cosmetic: rename variables
        ...
      7fafb54c
    • Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · 55e0500e
      Linus Torvalds authored
      Pull SCSI updates from James Bottomley:
       "The usual driver updates (ufs, qla2xxx, tcmu, ibmvfc, lpfc, smartpqi,
        hisi_sas, qedi, qedf, mpt3sas) and minor bug fixes.
      
        There are only three core changes: adding sense codes, cleaning up
        noretry and adding an option for limitless retries"
      
      * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (226 commits)
        scsi: hisi_sas: Recover PHY state according to the status before reset
        scsi: hisi_sas: Filter out new PHY up events during suspend
        scsi: hisi_sas: Add device link between SCSI devices and hisi_hba
        scsi: hisi_sas: Add check for methods _PS0 and _PR0
        scsi: hisi_sas: Add controller runtime PM support for v3 hw
        scsi: hisi_sas: Switch to new framework to support suspend and resume
        scsi: hisi_sas: Use hisi_hba->cq_nvecs for calling calling synchronize_irq()
        scsi: qedf: Remove redundant assignment to variable 'rc'
        scsi: lpfc: Remove unneeded variable 'status' in lpfc_fcp_cpu_map_store()
        scsi: snic: Convert to use DEFINE_SEQ_ATTRIBUTE macro
        scsi: qla4xxx: Delete unneeded variable 'status' in qla4xxx_process_ddb_changed
        scsi: sun_esp: Use module_platform_driver to simplify the code
        scsi: sun3x_esp: Use module_platform_driver to simplify the code
        scsi: sni_53c710: Use module_platform_driver to simplify the code
        scsi: qlogicpti: Use module_platform_driver to simplify the code
        scsi: mac_esp: Use module_platform_driver to simplify the code
        scsi: jazz_esp: Use module_platform_driver to simplify the code
        scsi: mvumi: Fix error return in mvumi_io_attach()
        scsi: lpfc: Drop nodelist reference on error in lpfc_gen_req()
        scsi: be2iscsi: Fix a theoretical leak in beiscsi_create_eqs()
        ...
      55e0500e
    • Merge tag 'for-5.10/dm-changes' of... · 4815519e
      Linus Torvalds authored
      Merge tag 'for-5.10/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm
      
      Pull device mapper updates from Mike Snitzer:
      
       - Improve DM core's bio splitting to use blk_max_size_offset(). Also
         fix bio splitting for bios that were deferred to the worker thread
         due to a DM device being suspended.
      
       - Remove DM core's special handling of NVMe devices now that block core
         has internalized efficiencies drivers previously needed to be
         concerned about (via now removed direct_make_request).
      
       - Fix request-based DM to not bounce through indirect dm_submit_bio;
         instead have block core make direct call to blk_mq_submit_bio().
      
       - Various DM core cleanups to simplify and improve code.
      
       - Update DM crypt to not use drivers that set
         CRYPTO_ALG_ALLOCATES_MEMORY.
      
       - Fix DM raid's raid1 and raid10 discard limits for the purposes of
         linux-stable. But then remove DM raid's discard limits settings now
         that MD raid can efficiently handle large discards.
      
       - A couple small cleanups across various targets.
      
      * tag 'for-5.10/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
        dm: fix request-based DM to not bounce through indirect dm_submit_bio
        dm: remove special-casing of bio-based immutable singleton target on NVMe
        dm: export dm_copy_name_and_uuid
        dm: fix comment in __dm_suspend()
        dm: fold dm_process_bio() into dm_submit_bio()
        dm: fix missing imposition of queue_limits from dm_wq_work() thread
        dm snap persistent: simplify area_io()
        dm thin metadata: Remove unused local variable when create thin and snap
        dm raid: remove unnecessary discard limits for raid10
        dm raid: fix discard limits for raid1 and raid10
        dm crypt: don't use drivers that have CRYPTO_ALG_ALLOCATES_MEMORY
        dm: use dm_table_get_device_name() where appropriate in targets
        dm table: make 'struct dm_table' definition accessible to all of DM core
        dm: eliminate need for start_io_acct() forward declaration
        dm: simplify __process_abnormal_io()
        dm: push use of on-stack flush_bio down to __send_empty_flush()
        dm: optimize max_io_len() by inlining max_io_len_target_boundary()
        dm: push md->immutable_target optimization down to __process_bio()
        dm: change max_io_len() to use blk_max_size_offset()
        dm table: stack 'chunk_sectors' limit to account for target-specific splitting
      4815519e
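The first item in the device mapper summary concerns splitting a bio so that it never crosses a chunk boundary. A minimal userspace sketch of that arithmetic follows. It is not the kernel's blk_max_size_offset() code: the function name `max_sectors_at_offset` is hypothetical, and the sketch assumes a power-of-two chunk size.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical model: how many sectors may an I/O that starts at `offset`
 * cover before it either crosses the next chunk boundary or exceeds
 * max_sectors? Assumes chunk_sectors is a power of two; 0 means "no chunks".
 */
static uint32_t max_sectors_at_offset(uint64_t offset, uint32_t chunk_sectors,
                                      uint32_t max_sectors)
{
    uint32_t to_boundary;

    if (chunk_sectors == 0)
        return max_sectors;

    /* The mask trick below is only valid for power-of-two chunk sizes. */
    assert((chunk_sectors & (chunk_sectors - 1)) == 0);

    /* Sectors left between `offset` and the end of its chunk. */
    to_boundary = chunk_sectors - (uint32_t)(offset & (chunk_sectors - 1));

    return to_boundary < max_sectors ? to_boundary : max_sectors;
}
```

For example, with 128-sector chunks an I/O starting at sector 100 can cover at most 28 sectors before it has to be split.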
    • Merge tag 'for-linus-5.10-1' of git://github.com/cminyard/linux-ipmi · 6e4dc3d5
      Linus Torvalds authored
      Pull IPMI updates from Corey Minyard:
       "Some minor bug fixes, return values, cleanups of prints, conversion of
        tasklets to the new API.
      
        The biggest change is retrying the initial information fetch from the
        management controller. If that fails, the interface is not operational,
        and one group was having trouble with the management controller not
        being ready when the OS started up. So a retry was added"
      
      * tag 'for-linus-5.10-1' of git://github.com/cminyard/linux-ipmi:
        ipmi_si: Fix wrong return value in try_smi_init()
        ipmi: msghandler: Fix a signedness bug
        ipmi: add retry in try_get_dev_id()
        ipmi: Clean up some printks
        ipmi:msghandler: retry to get device id on an error
        ipmi:sm: Print current state when the state is invalid
        ipmi: Reset response handler when failing to send the command
        ipmi: add a newline when printing parameter 'panic_op' by sysfs
        char: ipmi: convert tasklets to use new tasklet_setup() API
      6e4dc3d5
    • Merge branch 'for-5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup · 2f6c6d08
      Linus Torvalds authored
      Pull cgroup updates from Tejun Heo:
       "Two minor changes.
      
        One makes cgroup interface files ignore zero-sized writes rather than
        triggering -EINVAL on them. The other change is a cleanup which
        doesn't cause any behavior changes"
      
      * 'for-5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
        cgroup: Zero sized write should be no-op
        cgroup: remove redundant kernfs_activate in cgroup_setup_root()
      2f6c6d08
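The behavior change in the cgroup pull (zero-sized writes become a no-op instead of failing with -EINVAL) can be illustrated with a small userspace model. This is only a sketch: `model_file_write` is a hypothetical stand-in for an interface-file write handler, not the kernel function.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>

/*
 * Hypothetical model of an interface-file write handler.
 * Returns the number of bytes consumed.
 */
static ssize_t model_file_write(const char *buf, size_t nbytes)
{
    /* New behavior: an empty write succeeds and changes nothing. */
    if (nbytes == 0)
        return 0;

    assert(buf != NULL);

    /* A real handler would parse `buf` here; the model just accepts it. */
    return (ssize_t)nbytes;
}
```

With the early return, a zero-length write to such a file reports success with 0 bytes written rather than an error.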
    • fs: fix NULL dereference due to data race in prepend_path() · 09cad075
      Andrii Nakryiko authored
      Fix data race in prepend_path() with re-reading mnt->mnt_ns twice
      without holding the lock.
      
      is_mounted() does check for NULL, but is_anon_ns(mnt->mnt_ns) might
      re-read the pointer again which could be NULL already, if in between
      reads one of kern_unmount()/kern_unmount_array()/umount_tree() sets
      mnt->mnt_ns to NULL.
      
      This is seen in production with the following stack trace:
      
        BUG: kernel NULL pointer dereference, address: 0000000000000048
        ...
        RIP: 0010:prepend_path.isra.4+0x1ce/0x2e0
        Call Trace:
          d_path+0xe6/0x150
          proc_pid_readlink+0x8f/0x100
          vfs_readlink+0xf8/0x110
          do_readlinkat+0xfd/0x120
          __x64_sys_readlinkat+0x1a/0x20
          do_syscall_64+0x42/0x110
          entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Fixes: f2683bd8 ("[PATCH] fix d_absolute_path() interplay with fsmount()")
      Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
      Reviewed-by: Josef Bacik <josef@toxicpanda.com>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      09cad075
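The race described in the prepend_path() commit comes from loading the same pointer twice: once for the NULL check and once for the namespace test. A minimal userspace sketch of the "load once, then use the local copy" pattern is shown below using C11 atomics. The struct and function names are hypothetical, and the sketch ignores object lifetime, which the kernel has to handle separately.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures. */
struct mnt_namespace_model {
    bool anon;
};

struct mount_model {
    _Atomic(struct mnt_namespace_model *) mnt_ns;
};

/*
 * Take one snapshot of the pointer and make every decision on that
 * snapshot. A concurrent writer setting mnt_ns to NULL can no longer
 * slip in between the NULL check and the dereference.
 */
static bool mounted_in_named_ns(struct mount_model *mnt)
{
    struct mnt_namespace_model *ns;

    assert(mnt != NULL);
    ns = atomic_load_explicit(&mnt->mnt_ns, memory_order_relaxed);

    if (ns == NULL)
        return false;   /* not mounted (any more) */

    return !ns->anon;   /* safe: uses the local copy */
}
```

The buggy shape would instead check `mnt->mnt_ns` for NULL and then read `mnt->mnt_ns` again to inspect it, which is where the second read can observe NULL.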
    • Merge tag 'threads-v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux · 4da9af00
      Linus Torvalds authored
      Pull pidfd updates from Christian Brauner:
       "This introduces a new extension to the pidfd_open() syscall. Users can
        now raise the new PIDFD_NONBLOCK flag to support non-blocking pidfd
        file descriptors. This has been requested for uses in async process
        management libraries such as async-pidfd in Rust.
      
        Ever since the introduction of pidfds and more advanced async io
        various programming languages such as Rust have grown support for
        async event libraries. These libraries are created to help build
        epoll-based event loops around file descriptors. A common pattern is
        to automatically make all file descriptors they manage to O_NONBLOCK.
      
        For such libraries the EAGAIN error code is treated specially. When a
        function is called that returns EAGAIN the function isn't called again
        until the event loop indicates that the file descriptor is ready.
        Supporting EAGAIN when waiting on pidfds makes such libraries just
        work with little effort.
      
        This introduces a new flag PIDFD_NONBLOCK that is equivalent to
        O_NONBLOCK. This follows the same patterns we have for other (anon
        inode) file descriptors such as EFD_NONBLOCK, IN_NONBLOCK,
        SFD_NONBLOCK, TFD_NONBLOCK and the same for close-on-exec flags.
      
        Passing a non-blocking pidfd to waitid() currently has no effect, i.e.
        is not supported. There are users which would like to use waitid() on
        pidfds that are O_NONBLOCK and mix it with pidfds that are blocking
        and both pass them to waitid().
      
        The expected behavior is to have waitid() return -EAGAIN for
        non-blocking pidfds and to block for blocking pidfds without needing
        to perform any additional checks for flags set on the pidfd before
        passing it to waitid(). Non-blocking pidfds will return EAGAIN from
        waitid() when no child process is ready yet. Returning -EAGAIN for
        non-blocking pidfds makes it easier for event loops that handle EAGAIN
        specially.
      
        It also makes the API more consistent and uniform. In essence,
        waitid() is treated like a read on a non-blocking pidfd or a recvmsg()
        on a non-blocking socket.
      
        With the addition of support for non-blocking pidfds we support the
        same functionality that sockets do. For sockets, recvmsg() supports
        MSG_DONTWAIT; for pidfds, waitid() supports WNOHANG. Both flags are
        per-call options. In contrast, non-blocking pidfds and non-blocking
        sockets are a setting on an open file description affecting all
        threads in the calling process as well as other processes that hold
        file descriptors referring to the same open file description. Both
        behaviors, per call and per open file description, have genuine
        use-cases.
      
        The interaction with the WNOHANG flag is documented as follows:
      
         - If a non-blocking pidfd is passed and WNOHANG is not raised we
           simply raise the WNOHANG flag internally. When do_wait() returns
           indicating that there are eligible child processes but none have
           exited yet we set EAGAIN. If no child process exists we continue
           returning ECHILD.
      
         - If a non-blocking pidfd is passed and WNOHANG is raised waitid()
           will continue returning 0, i.e. it will not set EAGAIN. This ensures
           backwards compatibility with applications passing WNOHANG
           explicitly with pidfds"
      
      * tag 'threads-v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
        tests: remove O_NONBLOCK before waiting for WSTOPPED
        tests: add waitid() tests for non-blocking pidfds
        tests: port pidfd_wait to kselftest harness
        pidfd: support PIDFD_NONBLOCK in pidfd_open()
        exit: support non-blocking pidfds
      4da9af00
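The WNOHANG interaction documented in the pidfd pull message can be condensed into a small decision table. The sketch below is a userspace model of that table only, not the kernel's do_wait() code. The names `model_waitid_pidfd`, `enum child_state`, and `MODEL_WOULD_BLOCK` are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical description of the target process. */
enum child_state { NO_CHILD, CHILD_RUNNING, CHILD_EXITED };

/* Sentinel meaning "a real waitid() call would sleep here". */
#define MODEL_WOULD_BLOCK 1

/* Model of waitid() on a pidfd, following the rules quoted above. */
static int model_waitid_pidfd(bool pidfd_nonblock, bool wnohang,
                              enum child_state state)
{
    /* A non-blocking pidfd raises WNOHANG internally. */
    bool effective_wnohang = wnohang || pidfd_nonblock;

    assert(state == NO_CHILD || state == CHILD_RUNNING ||
           state == CHILD_EXITED);

    if (state == NO_CHILD)
        return -ECHILD;             /* unchanged: no eligible child */
    if (state == CHILD_EXITED)
        return 0;                   /* status is available */

    /* Eligible child exists but has not exited yet. */
    if (!effective_wnohang)
        return MODEL_WOULD_BLOCK;   /* blocking pidfd, no WNOHANG */
    if (wnohang)
        return 0;                   /* explicit WNOHANG keeps returning 0 */
    return -EAGAIN;                 /* non-blocking pidfd without WNOHANG */
}
```

In a real program the non-blocking descriptor would come from pidfd_open() with PIDFD_NONBLOCK and be passed to waitid() with P_PIDFD; an event loop can then treat EAGAIN as "wait for readiness and retry".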
    • Merge tag 'kernel-clone-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux · 612e7a4c
      Linus Torvalds authored
      Pull kernel_clone() updates from Christian Brauner:
       "During the v5.9 merge window we reworked the process creation
        codepaths across multiple architectures. After this work we were only
        left with the _do_fork() helper based on the struct kernel_clone_args
        calling convention. As was pointed out _do_fork() isn't valid
        kernelese especially for a helper that isn't just static.
      
        This series removes the _do_fork() helper and introduces the new
        kernel_clone() helper. The process creation cleanup didn't change the
        name to something more reasonable mainly because _do_fork() was used
        in quite a few places. So sending this as a separate series seemed the
        better strategy.
      
        I originally intended to send this early in the v5.9 development cycle
        after the merge window had closed but given that this was touching
        quite a few places I decided to defer this until the v5.10 merge
        window"
      
      * tag 'kernel-clone-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
        sched: remove _do_fork()
        tracing: switch to kernel_clone()
        kgdbts: switch to kernel_clone()
        kprobes: switch to kernel_clone()
        x86: switch to kernel_clone()
        sparc: switch to kernel_clone()
        nios2: switch to kernel_clone()
        m68k: switch to kernel_clone()
        ia64: switch to kernel_clone()
        h8300: switch to kernel_clone()
        fork: introduce kernel_clone()
      612e7a4c
    • Merge tag 'linux-kselftest-fixes-5.10-rc1' of... · 9e51183e
      Linus Torvalds authored
      Merge tag 'linux-kselftest-fixes-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
      
      Pull kselftest updates from Shuah Khan:
      
       - a selftests harness fix to flush stdout before forking to avoid
         parent and child printing duplicate messages. This is evident when
         test output is redirected to a file.
      
       - a tools/ wide change to avoid comma separated statements from Joe
         Perches. This fix spans tools/lib, tools/power/cpupower, and
         selftests.
      
      * tag 'linux-kselftest-fixes-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
        tools: Avoid comma separated statements
        selftests/harness: Flush stdout before forking
      9e51183e
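The harness fix above relies on a general stdio property: text still sitting in the stdout buffer at fork() time is copied into the child, so both processes may later write it out. A minimal sketch of the pattern, assuming a POSIX system, is given below. The function name `run_child_flushed` is hypothetical and is not part of the kselftest harness.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Print a banner, then fork a child that prints one line of its own.
 * Returns the child's exit status, or -1 on failure.
 */
static int run_child_flushed(const char *banner)
{
    pid_t pid;
    int status;

    assert(banner != NULL);
    fputs(banner, stdout);

    /*
     * Flush before forking. When stdout is a file or pipe it is fully
     * buffered, and without this call the banner would still be in the
     * buffer that the child inherits, so it could be printed twice.
     */
    fflush(stdout);

    pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        puts("child: running test body");
        exit(0);    /* flushes only the child's own output */
    }

    if (waitpid(pid, &status, 0) != pid)
        return -1;

    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Redirecting the program's output to a file and removing the fflush() call is a simple way to observe the duplicated banner.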
    • Merge tag 'xfs-5.10-merge-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux · 2fc61f25
      Linus Torvalds authored
      Pull xfs updates from Darrick Wong:
       "The biggest changes are two new features for the ondisk metadata: one
        to record the sizes of the inode btrees in the AG to increase
        redundancy checks and to improve mount times; and a second new feature
        to support timestamps until the year 2486.
      
        We also fixed a problem where reflinking into a file that requires
        synchronous writes wouldn't actually flush the updates to disk; clean
        up a fair amount of cruft; and started fixing some bugs in the
        realtime volume code.
      
        Summary:
      
         - Clean up the buffer ioend calling path so that the retry strategy
           isn't quite so scattered everywhere.
      
         - Clean up m_sb_bp handling.
      
         - New feature: storing inode btree counts in the AGI to speed up
           certain mount time per-AG block reservation operations and add a
           little more metadata redundancy.
      
         - New feature: Widen inode timestamps and quota grace expiration
           timestamps to support dates through the year 2486.
      
         - Get rid of more of our custom buffer allocation API wrappers.
      
         - Use a proper VLA for shortform xattr structure namevals.
      
         - Force the log after reflinking or deduping into a file that is
           opened with O_SYNC or O_DSYNC.
      
         - Fix some math errors in the realtime allocator"
      
      * tag 'xfs-5.10-merge-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: (42 commits)
        xfs: ensure that fpunch, fcollapse, and finsert operations are aligned to rt extent size
        xfs: make sure the rt allocator doesn't run off the end
        xfs: Remove unneeded semicolon
        xfs: force the log after remapping a synchronous-writes file
        xfs: Convert xfs_attr_sf macros to inline functions
        xfs: Use variable-size array for nameval in xfs_attr_sf_entry
        xfs: Remove typedef xfs_attr_shortform_t
        xfs: remove typedef xfs_attr_sf_entry_t
        xfs: Remove kmem_zalloc_large()
        xfs: enable big timestamps
        xfs: trace timestamp limits
        xfs: widen ondisk quota expiration timestamps to handle y2038+
        xfs: widen ondisk inode timestamps to deal with y2038+
        xfs: redefine xfs_ictimestamp_t
        xfs: redefine xfs_timestamp_t
        xfs: move xfs_log_dinode_to_disk to the log recovery code
        xfs: refactor quota timestamp coding
        xfs: refactor default quota grace period setting code
        xfs: refactor quota expiration timer modification
        xfs: explicitly define inode timestamp range
        ...
      2fc61f25
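The year 2486 in the xfs summary can be sanity-checked with a little arithmetic. The sketch below assumes the widened timestamp is an unsigned 64-bit count of nanoseconds starting at the old format's minimum (2^31 seconds before 1970); that layout is an assumption here, since the pull message only states the end year. The function names are hypothetical and the calendar conversion uses a mean Gregorian year, so it is approximate.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC   1000000000ULL
#define LEGACY_MIN_SEC 2147483648ULL  /* 2^31 seconds before 1970 */
#define AVG_YEAR_SEC   31556952ULL    /* 365.2425 days */

/* Approximate last year representable by a signed 32-bit seconds counter. */
static unsigned int approx_last_year_s32_sec(void)
{
    return 1970u + (unsigned int)((uint64_t)INT32_MAX / AVG_YEAR_SEC);
}

/* Approximate last year for the assumed unsigned 64-bit nanosecond counter. */
static unsigned int approx_last_year_u64_nsec(void)
{
    uint64_t span_sec = UINT64_MAX / NSEC_PER_SEC;  /* about 584 years */
    uint64_t after_1970;

    assert(span_sec > LEGACY_MIN_SEC);
    after_1970 = span_sec - LEGACY_MIN_SEC;         /* seconds past 1970 */

    return 1970u + (unsigned int)(after_1970 / AVG_YEAR_SEC);
}
```

Under these assumptions the old range ends in 2038 and the widened range ends in 2486, which matches the figure quoted in the summary.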
    • Merge tag 'iomap-5.10-merge-4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux · 37187df4
      Linus Torvalds authored
      Pull iomap updates from Darrick Wong:
       "There's not a lot of new stuff going on here -- a little bit of code
        refactoring to make iomap workable with btrfs' fsync locking model,
        cleanups in preparation for adding THP support for filesystems, and
        fixing a data corruption issue for blocksize < pagesize filesystems.
      
        Summary:
      
         - Don't WARN_ON weird states that unprivileged users can create.
      
         - Don't invalidate page cache when direct writes want to fall back to
           buffered.
      
         - Fix some problems when readahead ios fail.
      
         - Fix a problem where inline data pages weren't getting flushed
           during an unshare operation.
      
         - Rework iomap to support arbitrarily many blocks per page in
           preparation to support THP for the page cache.
      
         - Fix a bug in the blocksize < pagesize buffered io path where we
           could fail to initialize the many-blocks-per-page uptodate bitmap
           correctly when the backing page is actually up to date. This could
           cause us to forget to write out dirty pages.
      
         - Split out the generic_write_sync at the end of the directio write
           path so that btrfs can drop the inode lock before sync'ing the
           file.
      
         - Call inode_dio_end before trying to sync the file after a O_DSYNC
           direct write (instead of afterwards) to match the behavior of the
           old directio code"
      
      * tag 'iomap-5.10-merge-4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
        iomap: Call inode_dio_end() before generic_write_sync()
        iomap: Allow filesystem to call iomap_dio_complete without i_rwsem
        iomap: Set all uptodate bits for an Uptodate page
        iomap: Change calling convention for zeroing
        iomap: Convert iomap_write_end types
        iomap: Convert write_count to write_bytes_pending
        iomap: Convert read_count to read_bytes_pending
        iomap: Support arbitrarily many blocks per page
        iomap: Use bitmap ops to set uptodate bits
        iomap: Use kzalloc to allocate iomap_page
        fs: Introduce i_blocks_per_page
        iomap: Fix misplaced page flushing
        iomap: Use round_down/round_up macros in __iomap_write_begin
        iomap: Mark read blocks uptodate in write_begin
        iomap: Clear page error before beginning a write
        iomap: Fix direct I/O write consistency check
        iomap: fix WARN_ON_ONCE() from unprivileged users
      37187df4
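The uptodate-bitmap bug in the iomap summary is easier to picture with a toy model of per-page block tracking. The sketch below is a userspace illustration only: the structure, the 4096-byte page size, and all function names are hypothetical and do not mirror the kernel's iomap_page layout.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <string.h>

#define MODEL_PAGE_SIZE  4096u
#define MODEL_MAX_BLOCKS (MODEL_PAGE_SIZE / 512u)   /* smallest block: 512 B */

/* Hypothetical per-page state: one uptodate bit per filesystem block. */
struct page_state_model {
    unsigned int blkbits;   /* log2 of the block size */
    bool page_uptodate;
    unsigned char uptodate[(MODEL_MAX_BLOCKS + CHAR_BIT - 1) / CHAR_BIT];
};

static unsigned int blocks_per_page(unsigned int blkbits)
{
    return MODEL_PAGE_SIZE >> blkbits;
}

/*
 * Initialize the per-block bitmap. If the page as a whole is already up to
 * date, every block bit must be set too; leaving them clear is the kind of
 * inconsistency the fix in the summary addresses.
 */
static void page_state_init(struct page_state_model *ps, unsigned int blkbits,
                            bool page_uptodate)
{
    unsigned int i, nr;

    assert(blkbits >= 9 && blkbits <= 12);
    nr = blocks_per_page(blkbits);

    memset(ps, 0, sizeof(*ps));
    ps->blkbits = blkbits;
    ps->page_uptodate = page_uptodate;

    if (page_uptodate)
        for (i = 0; i < nr; i++)
            ps->uptodate[i / CHAR_BIT] |= (unsigned char)(1u << (i % CHAR_BIT));
}

static bool block_uptodate(const struct page_state_model *ps, unsigned int block)
{
    return (ps->uptodate[block / CHAR_BIT] >> (block % CHAR_BIT)) & 1u;
}
```

With 1024-byte blocks the model tracks four bits per page, and initializing from an up-to-date page sets all four.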
    • Merge tag 'iommu-updates-v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu · 531d29b0
      Linus Torvalds authored
      Pull iommu updates from Joerg Roedel:
      
       - ARM-SMMU Updates from Will:
      
            - Continued SVM enablement, where page-table is shared with CPU
      
            - Groundwork to support integrated SMMU with Adreno GPU
      
            - Allow disabling of MSI-based polling on the kernel command-line
      
            - Minor driver fixes and cleanups (octal permissions, error
              messages, ...)
      
       - Secure Nested Paging Support for AMD IOMMU. The IOMMU will fault when
         a device tries DMA on memory owned by a guest. This needs new
         fault-types as well as a rewrite of the IOMMU memory semaphore for
         command completions.
      
       - Allow broken Intel IOMMUs (wrong address widths reported) to still be
         used for interrupt remapping.
      
       - IOMMU UAPI updates for supporting vSVA, where the IOMMU can access
         address spaces of processes running in a VM.
      
       - Support for the MT8167 IOMMU in the Mediatek IOMMU driver.
      
       - Device-tree updates for the Renesas driver to support r8a7742.
      
       - Several smaller fixes and cleanups all over the place.
      
      * tag 'iommu-updates-v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (57 commits)
        iommu/vt-d: Gracefully handle DMAR units with no supported address widths
        iommu/vt-d: Check UAPI data processed by IOMMU core
        iommu/uapi: Handle data and argsz filled by users
        iommu/uapi: Rename uapi functions
        iommu/uapi: Use named union for user data
        iommu/uapi: Add argsz for user filled data
        docs: IOMMU user API
        iommu/qcom: add missing put_device() call in qcom_iommu_of_xlate()
        iommu/arm-smmu-v3: Add SVA device feature
        iommu/arm-smmu-v3: Check for SVA features
        iommu/arm-smmu-v3: Seize private ASID
        iommu/arm-smmu-v3: Share process page tables
        iommu/arm-smmu-v3: Move definitions to a header
        iommu/io-pgtable-arm: Move some definitions to a header
        iommu/arm-smmu-v3: Ensure queue is read after updating prod pointer
        iommu/amd: Re-purpose Exclusion range registers to support SNP CWWB
        iommu/amd: Add support for RMP_PAGE_FAULT and RMP_HW_ERR
        iommu/amd: Use 4K page for completion wait write-back semaphore
        iommu/tegra-smmu: Allow to group clients in same swgroup
        iommu/tegra-smmu: Fix iova->phys translation
        ...
      531d29b0
    • Merge branch 'stable/for-linus-5.10' of... · 79db2b74
      Linus Torvalds authored
      Merge branch 'stable/for-linus-5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb
      
      Pull swiotlb updates from Konrad Rzeszutek Wilk:
       "Minor enhancement of using %pa to print phys_addr_t and also compiler
        warnings"
      
      * 'stable/for-linus-5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb:
        swiotlb: Mark max_segment with static keyword
        swiotlb: Declare swiotlb_late_init_with_default_size() in header
        swiotlb: Use %pa to print phys_addr_t variables
      79db2b74
    • Merge tag 'pnp-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · defb53a7
      Linus Torvalds authored
      Pull PNP updates from Rafael Wysocki:
       "These clean the PNP code somewhat:
      
         - Remove the now unused pnp_find_card() function (Christoph Hellwig)
      
         - Drop duplicate pci.h include from the quirks code and add an
           "internal.h" include to acpi_pnp.c to fix a compiler warning (Tian
           Tao)"
      
      * tag 'pnp-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
        PNP: remove the now unused pnp_find_card() function
        PNP: ACPI: Fix missing-prototypes in acpi_pnp.c
        PNP: quirks: Fix duplicate included pci.h
      defb53a7
    • Merge tag 'acpi-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · cf1d2b44
      Linus Torvalds authored
      Pull ACPI updates from Rafael Wysocki:
       "These add support for generic initiator-only proximity domains to the
        ACPI NUMA code and the architectures using it, clean up some
        non-ACPICA code referring to debug facilities from ACPICA, reduce the
        overhead related to accessing GPE registers, add a new DPTF (Dynamic
        Power and Thermal Framework) participant driver, update the ACPICA
        code in the kernel to upstream revision 20200925, add a new ACPI
        backlight whitelist entry, fix a few assorted issues and clean up some
        code.
      
        Specifics:
      
         - Add support for generic initiator-only proximity domains to the
           ACPI NUMA code and the architectures using it (Jonathan Cameron)
      
         - Clean up some non-ACPICA code referring to debug facilities from
           ACPICA that are not actually used in there (Hanjun Guo)
      
         - Add new DPTF driver for the PCH FIVR participant (Srinivas
           Pandruvada)
      
         - Reduce overhead related to accessing GPE registers in ACPICA and
           the OS interface layer and make it possible to access GPE registers
           using logical addresses if they are memory-mapped (Rafael Wysocki)
      
         - Update the ACPICA code in the kernel to upstream revision 20200925
           including changes as follows:
            + Add predefined names from the SMBus specification (Bob Moore)
            + Update acpi_help UUID list (Bob Moore)
            + Return exceptions for string-to-integer conversions in iASL (Bob
              Moore)
            + Add a new "ALL <NameSeg>" debugger command (Bob Moore)
            + Add support for 64 bit risc-v compilation (Colin Ian King)
            + Do assorted cleanups (Bob Moore, Colin Ian King, Randy Dunlap)
      
         - Add new ACPI backlight whitelist entry for HP 635 Notebook (Alex
           Hung)
      
         - Move TPS68470 OpRegion driver to drivers/acpi/pmic/ and split out
           Kconfig and Makefile specific for ACPI PMIC (Andy Shevchenko)
      
         - Clean up the ACPI SoC driver for AMD SoCs (Hanjun Guo)
      
         - Add missing config_item_put() to fix refcount leak (Hanjun Guo)
      
         - Drop leftover field from struct acpi_memory_device (Hanjun Guo)
      
         - Make the ACPI extlog driver check for RDMSR failures (Ben
           Hutchings)
      
         - Fix handling of lid state changes in the ACPI button driver when
           input device is closed (Dmitry Torokhov)
      
         - Fix several assorted build issues (Barnabás Pőcze, John Garry,
           Nathan Chancellor, Tian Tao)
      
         - Drop unused inline functions and reduce code duplication by using
           kobj_to_dev() in the NFIT parsing code (YueHaibing, Wang Qing)
      
         - Serialize tools/power/acpi Makefile (Thomas Renninger)"
      
      * tag 'acpi-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (64 commits)
        ACPICA: Update version to 20200925 Version 20200925
        ACPICA: Remove unnecessary semicolon
        ACPICA: Debugger: Add a new command: "ALL <NameSeg>"
        ACPICA: iASL: Return exceptions for string-to-integer conversions
        ACPICA: acpi_help: Update UUID list
        ACPICA: Add predefined names found in the SMBus sepcification
        ACPICA: Tree-wide: fix various typos and spelling mistakes
        ACPICA: Drop the repeated word "an" in a comment
        ACPICA: Add support for 64 bit risc-v compilation
        ACPI: button: fix handling lid state changes when input device closed
        tools/power/acpi: Serialize Makefile
        ACPI: scan: Replace ACPI_DEBUG_PRINT() with pr_debug()
        ACPI: memhotplug: Remove 'state' from struct acpi_memory_device
        ACPI / extlog: Check for RDMSR failure
        ACPI: Make acpi_evaluate_dsm() prototype consistent
        docs: mm: numaperf.rst Add brief description for access class 1.
        node: Add access1 class to represent CPU to memory characteristics
        ACPI: HMAT: Fix handling of changes from ACPI 6.2 to ACPI 6.3
        ACPI: Let ACPI know we support Generic Initiator Affinity Structures
        x86: Support Generic Initiator only proximity domains
        ...
      cf1d2b44
    • Merge tag 'pm-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · 0b8417c1
      Linus Torvalds authored
      Pull power management updates from Rafael Wysocki:
       "These rework the collection of cpufreq statistics to allow it to take
        place if fast frequency switching is enabled in the governor, rework
        the frequency invariance handling in the cpufreq core and drivers, add
        new hardware support to a couple of cpufreq drivers, fix a number of
        assorted issues and clean up the code all over.
      
        Specifics:
      
         - Rework cpufreq statistics collection to allow it to take place when
           fast frequency switching is enabled in the governor (Viresh Kumar).
      
         - Make the cpufreq core set the frequency scale on behalf of the
           driver and update several cpufreq drivers accordingly (Ionela
           Voinescu, Valentin Schneider).
      
         - Add new hardware support to the STI and qcom cpufreq drivers and
           improve them (Alain Volmat, Manivannan Sadhasivam).
      
         - Fix multiple assorted issues in cpufreq drivers (Jon Hunter,
           Krzysztof Kozlowski, Matthias Kaehlcke, Pali Rohár, Stephan
           Gerhold, Viresh Kumar).
      
         - Fix several assorted issues in the operating performance points
           (OPP) framework (Stephan Gerhold, Viresh Kumar).
      
         - Allow devfreq drivers to fetch devfreq instances by DT enumeration
           instead of using explicit phandles and modify the devfreq core code
           to support driver-specific devfreq DT bindings (Leonard Crestez,
           Chanwoo Choi).
      
         - Improve initial hardware resetting in the tegra30 devfreq driver
           and clean up the tegra cpuidle driver (Dmitry Osipenko).
      
         - Update the cpuidle core to collect state entry rejection statistics
           and expose them via sysfs (Lina Iyer).
      
         - Improve the ACPI _CST code handling diagnostics (Chen Yu).
      
         - Update the PSCI cpuidle driver to allow the PM domain
           initialization to occur in the OSI mode as well as in the PC mode
           (Ulf Hansson).
      
         - Rework the generic power domains (genpd) core code to allow domain
           power off transition to be aborted in the absence of the "power
           off" domain callback (Ulf Hansson).
      
         - Fix two suspend-to-idle issues in the ACPI EC driver (Rafael
           Wysocki).
      
         - Fix the handling of timer_expires in the PM-runtime framework on
           32-bit systems and the handling of device links in it (Grygorii
           Strashko, Xiang Chen).
      
         - Add IO requests batching support to the hibernate image saving and
           reading code and drop a bogus get_gendisk() from there (Xiaoyi
           Chen, Christoph Hellwig).
      
         - Allow PCIe ports to be put into the D3cold power state if they are
           power-manageable via ACPI (Lukas Wunner).
      
         - Add missing header file include to a power capping driver (Pujin
           Shi).
      
         - Clean up the qcom-cpr AVS driver a bit (Liu Shixin).
      
         - Kevin Hilman steps down as designated reviewer of adaptive voltage
           scaling (AVS) drivers (Kevin Hilman)"
      
      * tag 'pm-5.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (65 commits)
        cpufreq: stats: Fix string format specifier mismatch
        arm: disable frequency invariance for CONFIG_BL_SWITCHER
        cpufreq,arm,arm64: restructure definitions of arch_set_freq_scale()
        cpufreq: stats: Add memory barrier to store_reset()
        cpufreq: schedutil: Simplify sugov_fast_switch()
        ACPI: EC: PM: Drop ec_no_wakeup check from acpi_ec_dispatch_gpe()
        ACPI: EC: PM: Flush EC work unconditionally after wakeup
        PCI/ACPI: Whitelist hotplug ports for D3 if power managed by ACPI
        PM: hibernate: remove the bogus call to get_gendisk() in software_resume()
        cpufreq: Move traces and update to policy->cur to cpufreq core
        cpufreq: stats: Enable stats for fast-switch as well
        cpufreq: stats: Mark few conditionals with unlikely()
        cpufreq: stats: Remove locking
        cpufreq: stats: Defer stats update to cpufreq_stats_record_transition()
        PM: domains: Allow to abort power off when no ->power_off() callback
        PM: domains: Rename power state enums for genpd
        PM / devfreq: tegra30: Improve initial hardware resetting
        PM / devfreq: event: Change prototype of devfreq_event_get_edev_by_phandle function
        PM / devfreq: Change prototype of devfreq_get_devfreq_by_phandle function
        PM / devfreq: Add devfreq_get_devfreq_by_node function
        ...
      0b8417c1
    • Merge tag 'platform-drivers-x86-v5.10-1' of... · 15cb5469
      Linus Torvalds authored
      Merge tag 'platform-drivers-x86-v5.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86
      
      Pull x86 platform driver updates from Hans de Goede:
       "Rather calm cycle for x86 platform drivers, all these have been in
        for-next for a couple of days with no bot complaints.
      
        Highlights:
      
         - PMC TigerLake fixes and new RocketLake support
      
         - various small fixes / updates in other drivers/tools"
      
      * tag 'platform-drivers-x86-v5.10-1' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86:
        MAINTAINERS: update X86 PLATFORM DRIVERS entry with new kernel.org git repo
        platform/x86: mlx-platform: Add capability field to platform FAN description
        platform_data/mlxreg: Extend core platform structure
        platform_data/mlxreg: Update module license
        platform/x86: mlx-platform: Remove PSU EEPROM configuration
        MAINTAINERS: Update maintainers for pmc_core driver
        platform/x86: intel_pmc_core: fix: Replace dev_dbg macro with dev_info()
        platform/x86: intel_pmc_core: Add Intel RocketLake (RKL) support
        platform/x86: intel_pmc_core: Clean up: Remove the duplicate comments and reorganize
        platform/x86: intel_pmc_core: Fix the slp_s0 counter displayed value
        platform/x86: intel_pmc_core: Fix TigerLake power gating status map
        platform/x86: pmc_core: Use descriptive names for LPM registers
        tools/power/x86/intel-speed-select: Update version for v5.10
        tools/power/x86/intel-speed-select: Fix missing base-freq core IDs
        platform/x86: hp-wmi: add support for thermal policy
      15cb5469
    • Merge tag 'for-linus-5.10b-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip · a09b1d78
      Linus Torvalds authored
      Pull xen updates from Juergen Gross:
      
       - two small cleanup patches
      
       - avoid error messages when initializing MCA banks in a Xen dom0
      
       - a small series for converting the Xen gntdev driver to use
         pin_user_pages*() instead of get_user_pages*()
      
       - intermediate fix for running as a Xen guest on Arm with KPTI enabled
         (the final solution will need new Xen functionality)
      
      * tag 'for-linus-5.10b-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
        x86/xen: Fix typo in xen_pagetable_p2m_free()
        x86/xen: disable Firmware First mode for correctable memory errors
        xen/arm: do not setup the runstate info page if kpti is enabled
        xen: remove redundant initialization of variable ret
        xen/gntdev.c: Convert get_user_pages*() to pin_user_pages*()
        xen/gntdev.c: Mark pages as dirty
      a09b1d78
    • Merge tag 'hyperv-next-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux · 4907a43d
      Linus Torvalds authored
      Pull Hyper-V updates from Wei Liu:
      
       - a series from Boqun Feng to support page size larger than 4K
      
       - a few miscellaneous clean-ups
      
      * tag 'hyperv-next-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
        hv: clocksource: Add notrace attribute to read_hv_sched_clock_*() functions
        x86/hyperv: Remove aliases with X64 in their name
        PCI: hv: Document missing hv_pci_protocol_negotiation() parameter
        scsi: storvsc: Support PAGE_SIZE larger than 4K
        Driver: hv: util: Use VMBUS_RING_SIZE() for ringbuffer sizes
        HID: hyperv: Use VMBUS_RING_SIZE() for ringbuffer sizes
        Input: hyperv-keyboard: Use VMBUS_RING_SIZE() for ringbuffer sizes
        hv_netvsc: Use HV_HYP_PAGE_SIZE for Hyper-V communication
        hv: hyperv.h: Introduce some hvpfn helper functions
        Drivers: hv: vmbus: Move virt_to_hvpfn() to hyperv header
        Drivers: hv: Use HV_HYP_PAGE in hv_synic_enable_regs()
        Drivers: hv: vmbus: Introduce types of GPADL
        Drivers: hv: vmbus: Move __vmbus_open()
        Drivers: hv: vmbus: Always use HV_HYP_PAGE_SIZE for gpadl
        drivers: hv: remove cast from hyperv_die_event
      4907a43d
    • Merge tag 'x86_seves_for_v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · da9803df
      Linus Torvalds authored
      Pull x86 SEV-ES support from Borislav Petkov:
       "SEV-ES enhances the current guest memory encryption support called SEV
        by also encrypting the guest register state, making the registers
        inaccessible to the hypervisor by en-/decrypting them on world
        switches. Thus, it adds additional protection to Linux guests against
        exfiltration, control flow and rollback attacks.
      
        With SEV-ES, the guest is in full control of what registers the
        hypervisor can access. This is provided by a guest-host exchange
        mechanism based on a new exception vector called VMM Communication
        Exception (#VC), a new instruction called VMGEXIT and a shared
        Guest-Host Communication Block which is a decrypted page shared
        between the guest and the hypervisor.
      
        Intercepts to the hypervisor become #VC exceptions in an SEV-ES guest
        so in order for that exception mechanism to work, the early x86 init
        code needed to be made able to handle exceptions, which, in itself,
        brings a bunch of very nice cleanups and improvements to the early
        boot code like an early page fault handler, allowing for on-demand
        building of the identity mapping. With that, !KASLR configurations do
        not use the EFI page table anymore but switch to a kernel-controlled
        one.
      
        The main part of this series adds the support for that new exchange
        mechanism. The goal has been to keep this as separate as possible
        from the core x86 code by concentrating the machinery in two
        SEV-ES-specific files:
      
          arch/x86/kernel/sev-es-shared.c
          arch/x86/kernel/sev-es.c
      
        Other interaction with core x86 code has been kept to a minimum and
        behind static keys to minimize the performance impact on !SEV-ES
        setups.
      
        Work by Joerg Roedel and Thomas Lendacky and others"
      
      * tag 'x86_seves_for_v5.10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (73 commits)
        x86/sev-es: Use GHCB accessor for setting the MMIO scratch buffer
        x86/sev-es: Check required CPU features for SEV-ES
        x86/efi: Add GHCB mappings when SEV-ES is active
        x86/sev-es: Handle NMI State
        x86/sev-es: Support CPU offline/online
        x86/head/64: Don't call verify_cpu() on starting APs
        x86/smpboot: Load TSS and getcpu GDT entry before loading IDT
        x86/realmode: Setup AP jump table
        x86/realmode: Add SEV-ES specific trampoline entry point
        x86/vmware: Add VMware-specific handling for VMMCALL under SEV-ES
        x86/kvm: Add KVM-specific VMMCALL handling under SEV-ES
        x86/paravirt: Allow hypervisor-specific VMMCALL handling under SEV-ES
        x86/sev-es: Handle #DB Events
        x86/sev-es: Handle #AC Events
        x86/sev-es: Handle VMMCALL Events
        x86/sev-es: Handle MWAIT/MWAITX Events
        x86/sev-es: Handle MONITOR/MONITORX Events
        x86/sev-es: Handle INVD Events
        x86/sev-es: Handle RDPMC Events
        x86/sev-es: Handle RDTSC(P) Events
        ...
      da9803df
    • Merge tag 'objtool-core-2020-10-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 6873139e
      Linus Torvalds authored
      Pull objtool updates from Ingo Molnar:
       "Most of the changes are cleanups and reorganization to make the
        objtool code more arch-agnostic. This is in preparation for non-x86
        support.
      
        Other changes:
      
         - KASAN fixes
      
         - Handle unreachable trap after call to noreturn functions better
      
         - Ignore unreachable fake jumps
      
         - Misc smaller fixes & cleanups"
      
      * tag 'objtool-core-2020-10-13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (21 commits)
        perf build: Allow nested externs to enable BUILD_BUG() usage
        objtool: Allow nested externs to enable BUILD_BUG()
        objtool: Permit __kasan_check_{read,write} under UACCESS
        objtool: Ignore unreachable trap after call to noreturn functions
        objtool: Handle calling non-function symbols in other sections
        objtool: Ignore unreachable fake jumps
        objtool: Remove useless tests before save_reg()
        objtool: Decode unwind hint register depending on architecture
        objtool: Make unwind hint definitions available to other architectures
        objtool: Only include valid definitions depending on source file type
        objtool: Rename frame.h -> objtool.h
        objtool: Refactor jump table code to support other architectures
        objtool: Make relocation in alternative handling arch dependent
        objtool: Abstract alternative special case handling
        objtool: Move macros describing structures to arch-dependent code
        objtool: Make sync-check consider the target architecture
        objtool: Group headers to check in a single list
        objtool: Define 'struct orc_entry' only when needed
        objtool: Skip ORC entry creation for non-text sections
        objtool: Move ORC logic out of check()
        ...
      6873139e
    • Merge branch 'akpm' (patches from Andrew) · d5660df4
      Linus Torvalds authored
      Merge misc updates from Andrew Morton:
       "181 patches.
      
        Subsystems affected by this patch series: kbuild, scripts, ntfs,
        ocfs2, vfs, mm (slab, slub, kmemleak, dax, debug, pagecache, fadvise,
        gup, swap, memremap, memcg, selftests, pagemap, mincore, hmm, dma,
        memory-failure, vmalloc and migration)"
      
      * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (181 commits)
        mm/migrate: remove obsolete comment about device public
        mm/migrate: remove cpages-- in migrate_vma_finalize()
        mm, oom_adj: don't loop through tasks in __set_oom_adj when not necessary
        memblock: use separate iterators for memory and reserved regions
        memblock: implement for_each_reserved_mem_region() using __next_mem_region()
        memblock: remove unused memblock_mem_size()
        x86/setup: simplify reserve_crashkernel()
        x86/setup: simplify initrd relocation and reservation
        arch, drivers: replace for_each_membock() with for_each_mem_range()
        arch, mm: replace for_each_memblock() with for_each_mem_pfn_range()
        memblock: reduce number of parameters in for_each_mem_range()
        memblock: make memblock_debug and related functionality private
        memblock: make for_each_memblock_type() iterator private
        mircoblaze: drop unneeded NUMA and sparsemem initializations
        riscv: drop unneeded node initialization
        h8300, nds32, openrisc: simplify detection of memory extents
        arm64: numa: simplify dummy_numa_init()
        arm, xtensa: simplify initialization of high memory pages
        dma-contiguous: simplify cma_early_percent_memory()
        KVM: PPC: Book3S HV: simplify kvm_cma_reserve()
        ...
      d5660df4
    • mm/migrate: remove obsolete comment about device public · f1f4f3ab
      Ralph Campbell authored
      Device public memory never had an in-tree consumer and was removed in
      commit 25b2995a ("mm: remove MEMORY_DEVICE_PUBLIC support").  Delete
      the obsolete comment.
      Signed-off-by: Ralph Campbell <rcampbell@nvidia.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
      Cc: Jerome Glisse <jglisse@redhat.com>
      Cc: John Hubbard <jhubbard@nvidia.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Link: http://lkml.kernel.org/r/20200827190735.12752-2-rcampbell@nvidia.com
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      f1f4f3ab
    • mm/migrate: remove cpages-- in migrate_vma_finalize() · 42578891
      Ralph Campbell authored
      The variable struct migrate_vma->cpages is only used in
      migrate_vma_setup().  There is no need to decrement it in
      migrate_vma_finalize() since it is never checked.
      Signed-off-by: Ralph Campbell <rcampbell@nvidia.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: Jason Gunthorpe <jgg@nvidia.com>
      Cc: Jerome Glisse <jglisse@redhat.com>
      Cc: John Hubbard <jhubbard@nvidia.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Link: http://lkml.kernel.org/r/20200827190735.12752-1-rcampbell@nvidia.com
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      42578891
    • mm, oom_adj: don't loop through tasks in __set_oom_adj when not necessary · 67197a4f
      Suren Baghdasaryan authored
      Currently __set_oom_adj loops through all processes in the system to keep
      oom_score_adj and oom_score_adj_min in sync between processes sharing
      their mm.  This is done for any task with more than one mm_users, which
      includes processes with multiple threads (sharing mm and signals).
      However for such processes the loop is unnecessary because their signal
      structure is shared as well.
      
      Android updates oom_score_adj whenever a task changes its role
      (background/foreground/...) or binds to/unbinds from a service, making it
      more/less important.  Such operations can happen frequently.  We noticed
      that updates to oom_score_adj became more expensive and after further
      investigation found out that the patch mentioned in "Fixes" introduced a
      regression.  Using Pixel 4 with a typical Android workload, write time to
      oom_score_adj increased from ~3.57us to ~362us.  Moreover this regression
      linearly depends on the number of multi-threaded processes running on the
      system.
      
      Mark the mm with a new MMF_MULTIPROCESS flag bit when task is created with
      (CLONE_VM && !CLONE_THREAD && !CLONE_VFORK).  Change __set_oom_adj to use
      MMF_MULTIPROCESS instead of mm_users to decide whether oom_score_adj
      update should be synchronized between multiple processes.  To prevent
      races between clone() and __set_oom_adj(), when oom_score_adj of the
      process being cloned might be modified from userspace, we use
      oom_adj_mutex.  Its scope is changed to global.
      
      The combination of (CLONE_VM && !CLONE_THREAD) is rarely used except for
      the case of vfork().  To prevent performance regressions of vfork(), we
      skip taking oom_adj_mutex and setting MMF_MULTIPROCESS when CLONE_VFORK is
      specified.  Clearing the MMF_MULTIPROCESS flag (when the last process
      sharing the mm exits) is left out of this patch to keep it simple and
      because it is believed that this threading model is rare.  Should there
      ever be a need for optimizing that case as well, it can be done by hooking
      into the exit path, likely following the mm_update_next_owner pattern.
      
      With the combination of (CLONE_VM && !CLONE_THREAD && !CLONE_VFORK) being
      quite rare, the regression is gone after the change is applied.
      
      [surenb@google.com: v3]
        Link: https://lkml.kernel.org/r/20200902012558.2335613-1-surenb@google.com
      
      Fixes: 44a70ade ("mm, oom_adj: make sure processes sharing mm have same view of oom_score_adj")
      Reported-by: Tim Murray <timmurray@google.com>
      Suggested-by: Michal Hocko <mhocko@kernel.org>
      Signed-off-by: Suren Baghdasaryan <surenb@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Acked-by: Oleg Nesterov <oleg@redhat.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Eugene Syromiatnikov <esyr@redhat.com>
      Cc: Christian Kellner <christian@kellner.me>
      Cc: Adrian Reber <areber@redhat.com>
      Cc: Shakeel Butt <shakeelb@google.com>
      Cc: Aleksa Sarai <cyphar@cyphar.com>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Alexey Gladkov <gladkov.alexey@gmail.com>
      Cc: Michel Lespinasse <walken@google.com>
      Cc: Daniel Jordan <daniel.m.jordan@oracle.com>
      Cc: Andrei Vagin <avagin@gmail.com>
      Cc: Bernd Edlinger <bernd.edlinger@hotmail.de>
      Cc: John Johansen <john.johansen@canonical.com>
      Cc: Yafang Shao <laoar.shao@gmail.com>
      Link: https://lkml.kernel.org/r/20200824153036.3201505-1-surenb@google.com
      Debugged-by: Minchan Kim <minchan@kernel.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      67197a4f
    • memblock: use separate iterators for memory and reserved regions · cc6de168
      Mike Rapoport authored
      for_each_memblock() is used to iterate over memblock.memory in a few
      places that use data from memblock_region rather than the memory ranges.
      
      Introduce separate for_each_mem_region() and
      for_each_reserved_mem_region() to improve encapsulation of memblock
      internals from its users.
      Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Reviewed-by: Baoquan He <bhe@redhat.com>
      Acked-by: Ingo Molnar <mingo@kernel.org>			[x86]
      Acked-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>	[MIPS]
      Acked-by: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>	[.clang-format]
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Daniel Axtens <dja@axtens.net>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Emil Renner Berthing <kernel@esmil.dk>
      Cc: Hari Bathini <hbathini@linux.ibm.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
      Cc: Marek Szyprowski <m.szyprowski@samsung.com>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Palmer Dabbelt <palmer@dabbelt.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Paul Walmsley <paul.walmsley@sifive.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Stafford Horne <shorne@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Deacon <will@kernel.org>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Link: https://lkml.kernel.org/r/20200818151634.14343-18-rppt@kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      cc6de168
    • memblock: implement for_each_reserved_mem_region() using __next_mem_region() · 9f3d5eaa
      Mike Rapoport authored
      Iteration over memblock.reserved with for_each_reserved_mem_region() used
      __next_reserved_mem_region() that implemented a subset of
      __next_mem_region().
      
      Use __for_each_mem_range() and, essentially, __next_mem_region() with
      appropriate parameters to reduce code duplication.
      
      While at it, rename for_each_reserved_mem_region() to
      for_each_reserved_mem_range() for consistency.
      Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Acked-by: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>	[.clang-format]
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Daniel Axtens <dja@axtens.net>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Emil Renner Berthing <kernel@esmil.dk>
      Cc: Hari Bathini <hbathini@linux.ibm.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
      Cc: Marek Szyprowski <m.szyprowski@samsung.com>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Palmer Dabbelt <palmer@dabbelt.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Paul Walmsley <paul.walmsley@sifive.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Stafford Horne <shorne@gmail.com>
      Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Deacon <will@kernel.org>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Link: https://lkml.kernel.org/r/20200818151634.14343-17-rppt@kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      9f3d5eaa
    • memblock: remove unused memblock_mem_size() · 5bd0960b
      Mike Rapoport authored
      The only user of memblock_mem_size() was the x86 setup code.  It is gone
      now, so the memblock_mem_size() function can be removed.
      Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Reviewed-by: Baoquan He <bhe@redhat.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Daniel Axtens <dja@axtens.net>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Emil Renner Berthing <kernel@esmil.dk>
      Cc: Hari Bathini <hbathini@linux.ibm.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
      Cc: Marek Szyprowski <m.szyprowski@samsung.com>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
      Cc: Palmer Dabbelt <palmer@dabbelt.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Paul Walmsley <paul.walmsley@sifive.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Stafford Horne <shorne@gmail.com>
      Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Deacon <will@kernel.org>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Link: https://lkml.kernel.org/r/20200818151634.14343-16-rppt@kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      5bd0960b
    • x86/setup: simplify reserve_crashkernel() · 6120cdc0
      Mike Rapoport authored
      * Replace magic numbers with defines
      * Replace memblock_find_in_range() + memblock_reserve() with
        memblock_phys_alloc_range()
      * Stop checking for low memory size in reserve_crashkernel_low(). The
        allocation from a limited range will fail anyway if there is not enough
        memory, so there is no need for extra traversal of memblock.memory
      Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Reviewed-by: Baoquan He <bhe@redhat.com>
      Acked-by: Ingo Molnar <mingo@kernel.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Daniel Axtens <dja@axtens.net>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Emil Renner Berthing <kernel@esmil.dk>
      Cc: Hari Bathini <hbathini@linux.ibm.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
      Cc: Marek Szyprowski <m.szyprowski@samsung.com>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
      Cc: Palmer Dabbelt <palmer@dabbelt.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Paul Walmsley <paul.walmsley@sifive.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Stafford Horne <shorne@gmail.com>
      Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Deacon <will@kernel.org>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Link: https://lkml.kernel.org/r/20200818151634.14343-15-rppt@kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      6120cdc0
    • Mike Rapoport's avatar
      x86/setup: simplify initrd relocation and reservation · 3c45ee6d
      Mike Rapoport authored
      Currently, the initrd image is reserved very early during setup, and it
      might then be relocated and re-reserved after the initial physical memory
      mapping is created.  The "late" reservation verifies that the mapped
      memory size exceeds the size of the initrd, then checks whether
      relocation is required and, if so, relocates the initrd to new memory
      allocated from memblock and frees the old location.
      
      The check for memory size is redundant, as the memblock allocation will
      fail anyway if there is not enough memory.  Besides, there is no point in
      allocating memory from memblock using memblock_find_in_range() +
      memblock_reserve() when memblock_phys_alloc_range() already provides the
      required functionality.
      
      Remove the redundant check and simplify memblock allocation.
      Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Reviewed-by: Baoquan He <bhe@redhat.com>
      Acked-by: Ingo Molnar <mingo@kernel.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Daniel Axtens <dja@axtens.net>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Emil Renner Berthing <kernel@esmil.dk>
      Cc: Hari Bathini <hbathini@linux.ibm.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
      Cc: Marek Szyprowski <m.szyprowski@samsung.com>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
      Cc: Palmer Dabbelt <palmer@dabbelt.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Paul Walmsley <paul.walmsley@sifive.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Stafford Horne <shorne@gmail.com>
      Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Deacon <will@kernel.org>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Link: https://lkml.kernel.org/r/20200818151634.14343-14-rppt@kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      3c45ee6d
    • Mike Rapoport's avatar
      arch, drivers: replace for_each_memblock() with for_each_mem_range() · b10d6bca
      Mike Rapoport authored
      There are several occurrences of the following pattern:
      
      	for_each_memblock(memory, reg) {
      		start = __pfn_to_phys(memblock_region_memory_base_pfn(reg));
      		end = __pfn_to_phys(memblock_region_memory_end_pfn(reg));
      
      		/* do something with start and end */
      	}
      
      Using for_each_mem_range() iterator is more appropriate in such cases and
      allows simpler and cleaner code.
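
      A minimal userspace sketch of the idea, assuming a hypothetical
      toy_for_each_mem_range() iterator over a made-up region table: the
      iterator hands out physical [start, end) pairs directly, so the loop
      body needs no per-region PFN conversions.  None of the toy_* names
      exist in the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical memory map; the addresses are invented for illustration. */
struct toy_region {
	uint64_t base;
	uint64_t size;
};

static const struct toy_region toy_memory[] = {
	{ 0x00000000, 0x00100000 },
	{ 0x80000000, 0x00400000 },
};

#define TOY_NR_REGIONS (sizeof(toy_memory) / sizeof(toy_memory[0]))

/* Yields the physical range [*p_start, *p_end) of each region in turn. */
#define toy_for_each_mem_range(i, p_start, p_end)			\
	for ((i) = 0;							\
	     (i) < TOY_NR_REGIONS &&					\
	     (*(p_start) = toy_memory[(i)].base,			\
	      *(p_end) = toy_memory[(i)].base + toy_memory[(i)].size, 1); \
	     (i)++)

/* Example consumer: sum the sizes of all ranges. */
static uint64_t toy_total_bytes(void)
{
	uint64_t start, end, total = 0;
	size_t i;

	toy_for_each_mem_range(i, &start, &end) {
		assert(end >= start);
		total += end - start;
	}
	return total;
}
```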
      
      [akpm@linux-foundation.org: fix arch/arm/mm/pmsa-v7.c build]
      [rppt@linux.ibm.com: mips: fix cavium-octeon build caused by memblock refactoring]
        Link: http://lkml.kernel.org/r/20200827124549.GD167163@linux.ibm.com
      Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Daniel Axtens <dja@axtens.net>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Emil Renner Berthing <kernel@esmil.dk>
      Cc: Hari Bathini <hbathini@linux.ibm.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
      Cc: Marek Szyprowski <m.szyprowski@samsung.com>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
      Cc: Palmer Dabbelt <palmer@dabbelt.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Paul Walmsley <paul.walmsley@sifive.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Stafford Horne <shorne@gmail.com>
      Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Deacon <will@kernel.org>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Link: https://lkml.kernel.org/r/20200818151634.14343-13-rppt@kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      b10d6bca
    • Mike Rapoport's avatar
      arch, mm: replace for_each_memblock() with for_each_mem_pfn_range() · c9118e6c
      Mike Rapoport authored
      There are several occurrences of the following pattern:
      
      	for_each_memblock(memory, reg) {
      		start_pfn = memblock_region_memory_base_pfn(reg);
      		end_pfn = memblock_region_memory_end_pfn(reg);
      
      		/* do something with start_pfn and end_pfn */
      	}
      
      Rather than iterate over all memblock.memory regions and each time query
      for their start and end PFNs, use for_each_mem_pfn_range() iterator to get
      simpler and clearer code.
      Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Reviewed-by: Baoquan He <bhe@redhat.com>
      Acked-by: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>	[.clang-format]
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Daniel Axtens <dja@axtens.net>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Emil Renner Berthing <kernel@esmil.dk>
      Cc: Hari Bathini <hbathini@linux.ibm.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
      Cc: Marek Szyprowski <m.szyprowski@samsung.com>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Palmer Dabbelt <palmer@dabbelt.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Paul Walmsley <paul.walmsley@sifive.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Stafford Horne <shorne@gmail.com>
      Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Deacon <will@kernel.org>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Link: https://lkml.kernel.org/r/20200818151634.14343-12-rppt@kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      c9118e6c
    • Mike Rapoport's avatar
      memblock: reduce number of parameters in for_each_mem_range() · 6e245ad4
      Mike Rapoport authored
      Currently, the for_each_mem_range() and for_each_mem_range_rev()
      iterators are the most generic way to traverse memblock regions.  As
      such, they take 8 parameters and are hardly convenient for users.  Most
      users choose one of their wrappers, and the only user that actually
      needs most of the parameters is memblock itself.
      
      To avoid yet another name for memblock iterators, rename the existing
      for_each_mem_range[_rev]() to __for_each_mem_range[_rev]() and add new
      for_each_mem_range[_rev]() wrappers with only index, start and end
      parameters.
      
      The new wrapper nicely fits into init_unavailable_mem() and will be used
      in upcoming changes to simplify memblock traversals.
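
      The renaming scheme can be sketched in userspace C as follows.  This is
      only an illustration under stated assumptions: the toy_* types, the
      flag values and the reduced parameter list are invented here, and the
      kernel's __for_each_mem_range() takes more arguments than this model.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum { TOY_NONE = 0, TOY_HOTPLUG = 1 };	/* invented flag values */

struct toy_region {
	uint64_t base;
	uint64_t size;
	unsigned int flags;
};

struct toy_type {
	const struct toy_region *regions;
	size_t cnt;
};

static const struct toy_region toy_regions[] = {
	{ 0x0000, 0x1000, TOY_NONE },
	{ 0x4000, 0x1000, TOY_HOTPLUG },
	{ 0x8000, 0x2000, TOY_NONE },
};
static const struct toy_type toy_memory = { toy_regions, 3 };

/* Find the next region with matching flags; *idx is a cursor that ends up
 * one past the region just returned. Returns false when none is left. */
static bool toy_next_range(size_t *idx, const struct toy_type *type,
			   unsigned int flags, uint64_t *start, uint64_t *end)
{
	assert(idx && type && start && end);
	for (; *idx < type->cnt; (*idx)++) {
		const struct toy_region *r = &type->regions[*idx];

		if (r->flags != flags)
			continue;
		*start = r->base;
		*end = r->base + r->size;
		(*idx)++;
		return true;
	}
	return false;
}

/* Generic form: the caller chooses the region table and the flags. */
#define __toy_for_each_range(i, type, flags, p_start, p_end)		\
	for ((i) = 0; toy_next_range(&(i), type, flags, p_start, p_end); )

/* Convenience wrapper: only cursor, start and end remain. */
#define toy_for_each_range(i, p_start, p_end)				\
	__toy_for_each_range(i, &toy_memory, TOY_NONE, p_start, p_end)

/* Example consumer: total size of the regions without special flags. */
static uint64_t toy_plain_bytes(void)
{
	uint64_t start, end, total = 0;
	size_t i;

	toy_for_each_range(i, &start, &end)
		total += end - start;
	return total;
}
```

      Callers that only need start and end use the short form, while code
      that has to pick a different table or flag filter keeps using the
      double-underscore variant.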
      Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Acked-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>	[MIPS]
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Daniel Axtens <dja@axtens.net>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Emil Renner Berthing <kernel@esmil.dk>
      Cc: Hari Bathini <hbathini@linux.ibm.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
      Cc: Marek Szyprowski <m.szyprowski@samsung.com>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
      Cc: Palmer Dabbelt <palmer@dabbelt.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Paul Walmsley <paul.walmsley@sifive.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Stafford Horne <shorne@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Deacon <will@kernel.org>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Link: https://lkml.kernel.org/r/20200818151634.14343-11-rppt@kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      6e245ad4
    • Mike Rapoport's avatar
      memblock: make memblock_debug and related functionality private · 87c55870
      Mike Rapoport authored
      The only user of memblock_dbg() outside memblock was the s390 setup
      code, which is converted to use pr_debug() instead.  This makes it
      possible to stop exposing memblock_debug and memblock_dbg() to the rest
      of the kernel.
      
      [akpm@linux-foundation.org: make memblock_dbg() safer and neater]
      Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Reviewed-by: Baoquan He <bhe@redhat.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Daniel Axtens <dja@axtens.net>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Emil Renner Berthing <kernel@esmil.dk>
      Cc: Hari Bathini <hbathini@linux.ibm.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
      Cc: Marek Szyprowski <m.szyprowski@samsung.com>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
      Cc: Palmer Dabbelt <palmer@dabbelt.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Paul Walmsley <paul.walmsley@sifive.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Stafford Horne <shorne@gmail.com>
      Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Deacon <will@kernel.org>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Link: https://lkml.kernel.org/r/20200818151634.14343-10-rppt@kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      87c55870
    • Mike Rapoport's avatar
      memblock: make for_each_memblock_type() iterator private · cd991db8
      Mike Rapoport authored
      for_each_memblock_type() is not used outside mm/memblock.c, so move it
      there from include/linux/memblock.h.
      Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Reviewed-by: Baoquan He <bhe@redhat.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Daniel Axtens <dja@axtens.net>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Emil Renner Berthing <kernel@esmil.dk>
      Cc: Hari Bathini <hbathini@linux.ibm.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
      Cc: Marek Szyprowski <m.szyprowski@samsung.com>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
      Cc: Palmer Dabbelt <palmer@dabbelt.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Paul Walmsley <paul.walmsley@sifive.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Stafford Horne <shorne@gmail.com>
      Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Deacon <will@kernel.org>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Link: https://lkml.kernel.org/r/20200818151634.14343-9-rppt@kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      cd991db8
    • Mike Rapoport's avatar
      microblaze: drop unneeded NUMA and sparsemem initializations · 49645793
      Mike Rapoport authored
      microblaze supports neither NUMA nor SPARSEMEM, so there is no point in
      calling the memblock_set_node() and
      sparse_memory_present_with_active_regions() functions during microblaze
      memory initialization.
      
      Remove these calls and the surrounding code.
      Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Daniel Axtens <dja@axtens.net>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Emil Renner Berthing <kernel@esmil.dk>
      Cc: Hari Bathini <hbathini@linux.ibm.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
      Cc: Marek Szyprowski <m.szyprowski@samsung.com>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
      Cc: Palmer Dabbelt <palmer@dabbelt.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Paul Walmsley <paul.walmsley@sifive.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Stafford Horne <shorne@gmail.com>
      Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Deacon <will@kernel.org>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Link: https://lkml.kernel.org/r/20200818151634.14343-8-rppt@kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      49645793
    • Mike Rapoport's avatar
      riscv: drop unneeded node initialization · c8e47018
      Mike Rapoport authored
      RISC-V does not (yet) support NUMA and for UMA architectures node 0 is
      used implicitly during early memory initialization.
      
      There is no need to call memblock_set_node(), so remove this call and
      the surrounding code.
      Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Daniel Axtens <dja@axtens.net>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Emil Renner Berthing <kernel@esmil.dk>
      Cc: Hari Bathini <hbathini@linux.ibm.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
      Cc: Marek Szyprowski <m.szyprowski@samsung.com>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
      Cc: Palmer Dabbelt <palmer@dabbelt.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Paul Walmsley <paul.walmsley@sifive.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Stafford Horne <shorne@gmail.com>
      Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Deacon <will@kernel.org>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Link: https://lkml.kernel.org/r/20200818151634.14343-7-rppt@kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      c8e47018
    • Mike Rapoport's avatar
      h8300, nds32, openrisc: simplify detection of memory extents · 80c45744
      Mike Rapoport authored
      Instead of traversing memblock.memory regions to find memory_start and
      memory_end, simply query memblock_{start,end}_of_DRAM().
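
      The equivalence can be sketched in userspace C, assuming a region array
      that is kept sorted by base address as memblock.memory is.  The toy_*
      helpers below are stand-ins written for this example, not the kernel's
      memblock_start_of_DRAM() and memblock_end_of_DRAM().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct toy_region {
	uint64_t base;
	uint64_t size;
};

/* Before: walk every region to find the lowest start and highest end. */
static void toy_extents_by_loop(const struct toy_region *r, size_t cnt,
				uint64_t *start, uint64_t *end)
{
	size_t i;

	assert(cnt > 0);
	*start = UINT64_MAX;
	*end = 0;
	for (i = 0; i < cnt; i++) {
		if (r[i].base < *start)
			*start = r[i].base;
		if (r[i].base + r[i].size > *end)
			*end = r[i].base + r[i].size;
	}
}

/* After: in a sorted array the lowest start is in the first region... */
static uint64_t toy_start_of_dram(const struct toy_region *r, size_t cnt)
{
	assert(cnt > 0);
	return r[0].base;
}

/* ...and the highest end is in the last one. */
static uint64_t toy_end_of_dram(const struct toy_region *r, size_t cnt)
{
	assert(cnt > 0);
	return r[cnt - 1].base + r[cnt - 1].size;
}
```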
      Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Acked-by: Stafford Horne <shorne@gmail.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Daniel Axtens <dja@axtens.net>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Emil Renner Berthing <kernel@esmil.dk>
      Cc: Hari Bathini <hbathini@linux.ibm.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
      Cc: Marek Szyprowski <m.szyprowski@samsung.com>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
      Cc: Palmer Dabbelt <palmer@dabbelt.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Paul Walmsley <paul.walmsley@sifive.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Deacon <will@kernel.org>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Link: https://lkml.kernel.org/r/20200818151634.14343-6-rppt@kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      80c45744
    • Mike Rapoport's avatar
      arm64: numa: simplify dummy_numa_init() · ab8f21aa
      Mike Rapoport authored
      dummy_numa_init() loops over memblock.memory and passes nid=0 to
      numa_add_memblk(), which essentially wraps memblock_set_node().  However,
      memblock_set_node() can cope with the entire memory span itself, so the
      loop over memblock.memory regions is redundant.
      
      Using a single call to memblock_set_node() rather than a loop also fixes
      an issue with a buggy ACPI firmware in which the SRAT table covers some
      but not all of the memory in the EFI memory map.
      
      Jonathan Cameron says:
      
        This issue can be easily triggered by having an SRAT table which fails
        to cover all elements of the EFI memory map.
      
        This firmware error is detected and a warning printed. e.g.
        "NUMA: Warning: invalid memblk node 64 [mem 0x240000000-0x27fffffff]"
        At that point we fall back to dummy_numa_init().
      
        However, the failed ACPI init has left us with our memblocks all broken
        up as we split them when trying to assign them to NUMA nodes.
      
        We then iterate over the memblocks and add them to node 0.
      
        numa_add_memblk() calls memblock_set_node() which merges regions that
        were previously split up during the earlier attempt to add them to
        different nodes during parsing of SRAT.
      
        This means elements are moved in the memblock array and we can end up
        in a different memblock after the call to numa_add_memblk().
        Result is:
      
        Unable to handle kernel paging request at virtual address 0000000000003a40
        Mem abort info:
          ESR = 0x96000004
          EC = 0x25: DABT (current EL), IL = 32 bits
          SET = 0, FnV = 0
          EA = 0, S1PTW = 0
        Data abort info:
          ISV = 0, ISS = 0x00000004
          CM = 0, WnR = 0
        [0000000000003a40] user address but active_mm is swapper
        Internal error: Oops: 96000004 [#1] PREEMPT SMP
      
        ...
      
        Call trace:
          sparse_init_nid+0x5c/0x2b0
          sparse_init+0x138/0x170
          bootmem_init+0x80/0xe0
          setup_arch+0x2a0/0x5fc
          start_kernel+0x8c/0x648
      
      Replace the loop with a single call to memblock_set_node() that covers
      the entire memory.
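
      A simplified userspace model of why one call over the whole span is
      safer than a per-region loop: setting the node merges adjacent regions,
      so the array shrinks underneath any index that is walking it.  The
      toy_* names and the fixed-size map are assumptions for illustration,
      and regions are assumed to lie wholly inside or outside the span.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct toy_region {
	uint64_t base;
	uint64_t size;
	int nid;
};

struct toy_map {
	struct toy_region r[8];
	size_t cnt;
};

/* Assign nid to the regions inside [base, base + size), then merge
 * neighbours that touch and share a node. The merge moves array entries. */
static void toy_set_node(struct toy_map *m, uint64_t base, uint64_t size,
			 int nid)
{
	size_t i, out = 0;

	for (i = 0; i < m->cnt; i++)
		if (m->r[i].base >= base &&
		    m->r[i].base + m->r[i].size <= base + size)
			m->r[i].nid = nid;

	for (i = 1; i < m->cnt; i++) {
		struct toy_region *prev = &m->r[out];
		const struct toy_region *cur = &m->r[i];

		if (prev->base + prev->size == cur->base &&
		    prev->nid == cur->nid)
			prev->size += cur->size;
		else
			m->r[++out] = *cur;
	}
	if (m->cnt)
		m->cnt = out + 1;
}

/* One call for the whole span: no index is held across the merge. */
static void toy_dummy_numa_init(struct toy_map *m)
{
	uint64_t start, end;

	assert(m->cnt > 0);
	start = m->r[0].base;
	end = m->r[m->cnt - 1].base + m->r[m->cnt - 1].size;
	toy_set_node(m, start, end - start, 0);
}
```

      Starting from three touching regions left on nodes 1, 2 and 64 by a
      failed SRAT parse, toy_dummy_numa_init() leaves a single region on
      node 0.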
      Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
      Acked-by: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Baoquan He <bhe@redhat.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Daniel Axtens <dja@axtens.net>
      Cc: Dave Hansen <dave.hansen@linux.intel.com>
      Cc: Emil Renner Berthing <kernel@esmil.dk>
      Cc: Hari Bathini <hbathini@linux.ibm.com>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Marek Szyprowski <m.szyprowski@samsung.com>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
      Cc: Palmer Dabbelt <palmer@dabbelt.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Paul Walmsley <paul.walmsley@sifive.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Stafford Horne <shorne@gmail.com>
      Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Deacon <will@kernel.org>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Link: https://lkml.kernel.org/r/20200818151634.14343-5-rppt@kernel.org
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      ab8f21aa