1. 15 Mar, 2019 2 commits
    • filemap: drop the mmap_sem for all blocking operations · 6b4c9f44
      Josef Bacik authored
      Currently we only drop the mmap_sem if there is contention on the page
      lock.  The idea is that we issue readahead and then go to lock the page
      while it is under IO and we want to not hold the mmap_sem during the IO.
      
      The problem with this is the assumption that the readahead does anything.
      In the case that the box is under extreme memory or IO pressure we may end
      up not reading anything at all for readahead, which means we will end up
      reading in the page under the mmap_sem.
      
      Even if the readahead does something, it could get throttled because of IO
      pressure on the system if the process is in a lower priority cgroup.
      
      Holding the mmap_sem while doing IO is problematic because it can cause
      system-wide priority inversions.  Consider some large company that does a
      lot of web traffic.  This large company has load balancing logic in its
      core web server, because some engineer thought this was a brilliant plan.
      This load balancing logic gets statistics from /proc about the system,
      which trip over processes' mmap_sem for various reasons.  Now the web
      server application is in a protected cgroup, but these other processes may
      not be, and if they are being throttled while their mmap_sem is held we'll
      stall, and cause this nice death spiral.
      
      Instead, rework the filemap fault path to drop the mmap_sem at any point that
      we may do IO or block for an extended period of time.  This includes while
      issuing readahead, locking the page, or needing to call ->readpage because
      readahead did not occur.  Then once we have a fully uptodate page we can
      return with VM_FAULT_RETRY and come back again to find our nicely in-cache
      page that was gotten outside of the mmap_sem.
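
      The drop-and-retry flow described above can be modelled in userspace.  The
      following is only a sketch under stated assumptions: every name in it
      (toy_mm, toy_filemap_fault, toy_handle_fault and so on) is hypothetical,
      the semaphore is reduced to a reader count, and the retry limits that the
      real fault path enforces are left out.  It shows only that the read
      happens after the semaphore is dropped and that the second pass finds the
      page already uptodate.

```c
#include <assert.h>
#include <stdbool.h>

/* All names below are hypothetical stand-ins, not kernel APIs. */
enum toy_fault { TOY_FAULT_OK, TOY_FAULT_RETRY };

struct toy_mm {
	int mmap_sem_readers;      /* models read holds on mm->mmap_sem */
	bool page_uptodate;        /* models an uptodate page in the page cache */
	int io_with_sem_held;      /* reads issued while the semaphore was held */
	int io_with_sem_dropped;   /* reads issued after dropping the semaphore */
};

static void toy_mm_init(struct toy_mm *mm)
{
	mm->mmap_sem_readers = 0;
	mm->page_uptodate = false;
	mm->io_with_sem_held = 0;
	mm->io_with_sem_dropped = 0;
}

/* Stand-in for ->readpage: the point where a throttled task could block. */
static void toy_read_page(struct toy_mm *mm)
{
	if (mm->mmap_sem_readers > 0)
		mm->io_with_sem_held++;
	else
		mm->io_with_sem_dropped++;
	mm->page_uptodate = true;
}

/*
 * Entered with the semaphore held for read.  Returns TOY_FAULT_OK with the
 * semaphore still held, or TOY_FAULT_RETRY after dropping it and doing the IO.
 */
static enum toy_fault toy_filemap_fault(struct toy_mm *mm)
{
	assert(mm->mmap_sem_readers > 0);
	if (mm->page_uptodate)
		return TOY_FAULT_OK;
	mm->mmap_sem_readers--;    /* drop the semaphore before blocking */
	toy_read_page(mm);
	return TOY_FAULT_RETRY;
}

/* Caller loop: retake the semaphore and retry until the page is found. */
static int toy_handle_fault(struct toy_mm *mm)
{
	int attempts = 0;
	enum toy_fault ret;

	do {
		mm->mmap_sem_readers++;    /* models down_read() */
		attempts++;
		ret = toy_filemap_fault(mm);
	} while (ret == TOY_FAULT_RETRY);
	mm->mmap_sem_readers--;            /* models up_read() */
	return attempts;
}
```

      In this model a cold fault takes two passes through toy_handle_fault()
      and a warm fault takes one; in both cases io_with_sem_held stays at zero.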
      
      This patch also adds a new helper for locking the page with the mmap_sem
      dropped.  This doesn't make sense currently as generally speaking if the
      page is already locked it'll have been read in (unless there was an error)
      before it was unlocked.  However a forthcoming patchset will change this
      with the ability to abort read-ahead bio's if necessary, making it more
      likely that we could contend for a page lock and still have a not uptodate
      page.  This allows us to deal with this case by grabbing the lock and
      issuing the IO without the mmap_sem held, and then returning
      VM_FAULT_RETRY to come back around.
      
      [josef@toxicpanda.com: v6]
        Link: http://lkml.kernel.org/r/20181212152757.10017-1-josef@toxicpanda.com
      [kirill@shutemov.name: fix race in filemap_fault()]
        Link: http://lkml.kernel.org/r/20181228235106.okk3oastsnpxusxs@kshutemo-mobl1
      [akpm@linux-foundation.org: coding style fixes]
      Link: http://lkml.kernel.org/r/20181211173801.29535-4-josef@toxicpanda.com
      Signed-off-by: Josef Bacik <josef@toxicpanda.com>
      Acked-by: Johannes Weiner <hannes@cmpxchg.org>
      Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Tested-by: syzbot+b437b5a429d680cf2217@syzkaller.appspotmail.com
      Cc: Dave Chinner <david@fromorbit.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • filemap: kill page_cache_read usage in filemap_fault · a75d4c33
      Josef Bacik authored
      Patch series "drop the mmap_sem when doing IO in the fault path", v6.
      
      Now that we have proper isolation in place with cgroups2 we have started
      going through and fixing the various priority inversions.  Most are
      gone now, but this one is sort of weird since it's not necessarily a
      priority inversion that happens within the kernel, but rather because of
      something userspace does.
      
      We have giant applications that we want to protect, and parts of these
      giant applications do things like watch the system state to determine how
      healthy the box is for load balancing and such.  This involves running
      'ps' or other such utilities.  These utilities will often walk
      /proc/<pid>/whatever, and these files can sometimes need to
      down_read(&task->mmap_sem).  Not usually a big deal, but we noticed when
      we are stress testing that sometimes our protected application has latency
      spikes trying to get the mmap_sem for tasks that are in lower priority
      cgroups.
      
      This is because any down_write() on a semaphore essentially turns it into
      a mutex, so even if we currently have it held for reading, any new readers
      will not be allowed on to keep from starving the writer.  This is fine,
      except a lower priority task could be stuck doing IO because it has been
      throttled to the point that its IO is taking much longer than normal.  But
      because a higher priority group depends on this completing it is now stuck
      behind lower priority work.
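
      The reader/writer interaction described above can be illustrated with a
      small state model.  This is a sketch under stated assumptions, not the
      kernel's rw_semaphore: the toy_rwsem type and its helpers are
      hypothetical, and real rwsems have queueing and handoff details that are
      left out.  The only rule modelled is the one the text relies on: once a
      writer is waiting, new readers are refused even though the lock is
      currently held only for reading.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a writer-fair reader/writer semaphore. */
struct toy_rwsem {
	int active_readers;    /* tasks currently holding the lock for read */
	bool writer_active;    /* a task currently holds the lock for write */
	int waiting_writers;   /* writers queued behind the current holders */
};

/* A reader gets in only if no writer holds the lock or is waiting for it. */
static bool toy_down_read_trylock(struct toy_rwsem *sem)
{
	if (sem->writer_active || sem->waiting_writers > 0)
		return false;
	sem->active_readers++;
	return true;
}

static void toy_up_read(struct toy_rwsem *sem)
{
	assert(sem->active_readers > 0);
	sem->active_readers--;
}

/* A writer either takes the lock at once (true) or is queued (false). */
static bool toy_down_write_or_queue(struct toy_rwsem *sem)
{
	if (!sem->writer_active && sem->active_readers == 0 &&
	    sem->waiting_writers == 0) {
		sem->writer_active = true;
		return true;
	}
	sem->waiting_writers++;
	return false;
}
```

      With a throttled task holding the lock for read, a call to
      toy_down_write_or_queue() queues the writer, and a later
      toy_down_read_trylock() from a higher priority task fails until the
      throttled reader finishes its IO and calls toy_up_read().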
      
      In order to avoid this particular priority inversion we want to use the
      existing retry mechanism to stop from holding the mmap_sem at all if we
      are going to do IO.  This already exists in the read case sort of, but
      needed to be extended for more than just grabbing the page lock.  With
      io.latency we throttle at submit_bio() time, so the readahead stuff can
      block and even page_cache_read can block, so all these paths need to have
      the mmap_sem dropped.
      
      The other big thing is ->page_mkwrite.  btrfs is particularly shitty here
      because we have to reserve space for the dirty page, which can be a very
      expensive operation.  We use the same retry method as the read path, and
      simply cache the page and verify the page is still setup properly the next
      pass through ->page_mkwrite().
      
      I've tested these patches with xfstests and there are no regressions.
      
      This patch (of 3):
      
      If we do not have a page at filemap_fault time we'll do this weird forced
      page_cache_read thing to populate the page, and then drop it again and
      loop around and find it.  This makes for 2 ways we can read a page in
      filemap_fault, and it's not really needed.  Instead add a FGP_FOR_MMAP
      flag so that pagecache_get_page() will return an unlocked page that's in
      pagecache.  Then use the normal page locking and readpage logic already in
      filemap_fault.  This simplifies the no page in page cache case
      significantly.
      
      [akpm@linux-foundation.org: fix comment text]
      [josef@toxicpanda.com: don't unlock null page in FGP_FOR_MMAP case]
        Link: http://lkml.kernel.org/r/20190312201742.22935-1-josef@toxicpanda.com
      Link: http://lkml.kernel.org/r/20181211173801.29535-2-josef@toxicpanda.com
      Signed-off-by: Josef Bacik <josef@toxicpanda.com>
      Acked-by: Johannes Weiner <hannes@cmpxchg.org>
      Reviewed-by: Jan Kara <jack@suse.cz>
      Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Dave Chinner <david@fromorbit.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  2. 14 Mar, 2019 4 commits
  3. 12 Mar, 2019 34 commits
    • Merge tag 'nfsd-5.1' of git://linux-nfs.org/~bfields/linux · ebc551f2
      Linus Torvalds authored
      Pull NFS server updates from Bruce Fields:
       "Miscellaneous NFS server fixes.
      
        Probably the most visible bug is one that could artificially limit
        NFSv4.1 performance by limiting the number of outstanding RPCs from a
        single client.
      
        Neil Brown also gets a special mention for fixing a 14.5-year-old
        memory-corruption bug in the encoding of NFSv3 readdir responses"
      
      * tag 'nfsd-5.1' of git://linux-nfs.org/~bfields/linux:
        nfsd: allow nfsv3 readdir request to be larger.
        nfsd: fix wrong check in write_v4_end_grace()
        nfsd: fix memory corruption caused by readdir
        nfsd: fix performance-limiting session calculation
        svcrpc: fix UDP on servers with lots of threads
        svcrdma: Remove syslog warnings in work completion handlers
        svcrdma: Squelch compiler warning when SUNRPC_DEBUG is disabled
        svcrdma: Use struct_size() in kmalloc()
        svcrpc: fix unlikely races preventing queueing of sockets
        svcrpc: svc_xprt_has_something_to_do seems a little long
        SUNRPC: Don't allow compiler optimisation of svc_xprt_release_slot()
        nfsd: fix an IS_ERR() vs NULL check
    • Merge tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 · a5adcfca
      Linus Torvalds authored
      Pull ext4 updates from Ted Ts'o:
       "A large number of bug fixes and cleanups.
      
        One new feature to allow users to more easily find the jbd2 journal
        thread for a particular ext4 file system"
      
      * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (25 commits)
        jbd2: jbd2_get_transaction does not need to return a value
        jbd2: fix invalid descriptor block checksum
        ext4: fix bigalloc cluster freeing when hole punching under load
        ext4: add sysfs attr /sys/fs/ext4/<disk>/journal_task
        ext4: Change debugging support help prefix from EXT4 to Ext4
        ext4: fix compile error when using BUFFER_TRACE
        jbd2: fix compile warning when using JBUFFER_TRACE
        ext4: fix some error pointer dereferences
        ext4: annotate more implicit fall throughs
        ext4: annotate implicit fall throughs
        ext4: don't update s_rev_level if not required
        jbd2: fold jbd2_superblock_csum_{verify,set} into their callers
        jbd2: fix race when writing superblock
        ext4: fix crash during online resizing
        ext4: disallow files with EXT4_JOURNAL_DATA_FL from EXT4_IOC_SWAP_BOOT
        ext4: add mask of ext4 flags to swap
        ext4: update quota information while swapping boot loader inode
        ext4: cleanup pagecache before swap i_data
        ext4: fix check of inode in swap_inode_boot_loader
        ext4: unlock unused_pages timely when doing writeback
        ...
    • Merge tag 'ceph-for-5.1-rc1' of git://github.com/ceph/ceph-client · 2b0a80b0
      Linus Torvalds authored
      Pull ceph updates from Ilya Dryomov:
       "The highlights are:
      
         - rbd will now ignore discards that aren't aligned and big enough to
           actually free up some space (myself). This is controlled by the new
           alloc_size map option and can be disabled if needed.
      
         - support for rbd deep-flatten feature (myself). Deep-flatten allows
           "rbd flatten" to fully disconnect the clone image and its snapshots
           from the parent and make the parent snapshot removable.
      
         - a new round of cap handling improvements (Zheng Yan). The kernel
           client should now be much more prompt about releasing its caps and
           it is possible to put a limit on the number of caps held.
      
         - support for getting ceph.dir.pin extended attribute (Zheng Yan)"
      
      * tag 'ceph-for-5.1-rc1' of git://github.com/ceph/ceph-client: (26 commits)
        Documentation: modern versions of ceph are not backed by btrfs
        rbd: advertise support for RBD_FEATURE_DEEP_FLATTEN
        rbd: whole-object write and zeroout should copyup when snapshots exist
        rbd: copyup with an empty snapshot context (aka deep-copyup)
        rbd: introduce rbd_obj_issue_copyup_ops()
        rbd: stop copying num_osd_ops in rbd_obj_issue_copyup()
        rbd: factor out __rbd_osd_req_create()
        rbd: clear ->xferred on error from rbd_obj_issue_copyup()
        rbd: remove experimental designation from kernel layering
        ceph: add mount option to limit caps count
        ceph: periodically trim stale dentries
        ceph: delete stale dentry when last reference is dropped
        ceph: remove dentry_lru file from debugfs
        ceph: touch existing cap when handling reply
        ceph: pass inclusive lend parameter to filemap_write_and_wait_range()
        rbd: round off and ignore discards that are too small
        rbd: handle DISCARD and WRITE_ZEROES separately
        rbd: get rid of obj_req->obj_request_count
        libceph: use struct_size() for kmalloc() in crush_decode()
        ceph: send cap releases more aggressively
        ...
    • Merge tag 'for-5.1-part2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux · 92825b02
      Linus Torvalds authored
      Pull btrfs fixes from David Sterba:
       "Correctness and deadlock fixes"
      
      * tag 'for-5.1-part2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
        btrfs: zstd: ensure reclaim timer is properly cleaned up
        btrfs: move ulist allocation out of transaction in quota enable
        btrfs: save drop_progress if we drop refs at all
        btrfs: check for refs on snapshot delete resume
        Btrfs: fix deadlock between clone/dedupe and rename
        Btrfs: fix corruption reading shared and compressed extents after hole punching
    • Merge tag 'nfs-for-5.1-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs · 1fbf3e48
      Linus Torvalds authored
      Pull NFS client updates from Trond Myklebust:
       "Highlights include:
      
        Stable fixes:
         - Fixes for NFS I/O request leakages
         - Fix error handling paths in the NFS I/O recoalescing code
         - Reinitialise NFSv4.1 sequence results before retransmitting a
           request
         - Fix a soft lockup in the delegation recovery code
         - Bulk destroy of layouts needs to be safe w.r.t. umount
         - Prevent thundering herd issues when the SUNRPC socket is not
           connected
         - Respect RPC call timeouts when retrying transmission
      
        Features:
         - Convert rpc auth layer to use xdr_streams
         - Config option to disable insecure RPCSEC_GSS crypto types
         - Reduce size of RPC receive buffers
         - Readdirplus optimization by cache mechanism
         - Convert SUNRPC socket send code to use iov_iter()
         - SUNRPC micro-optimisations to avoid indirect calls
         - Add support for the pNFS LAYOUTERROR operation and use it with the
           pNFS/flexfiles driver
         - Add trace events to report non-zero NFS status codes
         - Various removals of unnecessary dprintks
      
        Bugfixes and cleanups:
         - Fix a number of sparse warnings and documentation format warnings
         - Fix nfs_parse_devname to not modify its argument
         - Fix potential corruption of page being written through pNFS/blocks
         - fix xfstest generic/099 failures on nfsv3
         - Avoid NFSv4.1 "false retries" when RPC calls are interrupted
         - Abort I/O early if the pNFS/flexfiles layout segment was
           invalidated
         - Avoid unnecessary pNFS/flexfiles layout invalidations"
      
      * tag 'nfs-for-5.1-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (90 commits)
        SUNRPC: Take the transport send lock before binding+connecting
        SUNRPC: Micro-optimise when the task is known not to be sleeping
        SUNRPC: Check whether the task was transmitted before rebind/reconnect
        SUNRPC: Remove redundant calls to RPC_IS_QUEUED()
        SUNRPC: Clean up
        SUNRPC: Respect RPC call timeouts when retrying transmission
        SUNRPC: Fix up RPC back channel transmission
        SUNRPC: Prevent thundering herd when the socket is not connected
        SUNRPC: Allow dynamic allocation of back channel slots
        NFSv4.1: Bump the default callback session slot count to 16
        SUNRPC: Convert remaining GFP_NOIO, and GFP_NOWAIT sites in sunrpc
        NFS/flexfiles: Clean up mirror DS initialisation
        NFS/flexfiles: Remove dead code in ff_layout_mirror_valid()
        NFS/flexfile: Simplify nfs4_ff_layout_select_ds_stateid()
        NFS/flexfile: Simplify nfs4_ff_layout_ds_version()
        NFS/flexfiles: Simplify ff_layout_get_ds_cred()
        NFS/flexfiles: Simplify nfs4_ff_find_or_create_ds_client()
        NFS/flexfiles: Simplify nfs4_ff_layout_select_ds_fh()
        NFS/flexfiles: Speed up read failover when DSes are down
        NFS/flexfiles: Don't invalidate DS deviceids for being unresponsive
        ...
    • Merge tag 'ovl-update-5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs · f88c5942
      Linus Torvalds authored
      Pull overlayfs updates from Miklos Szeredi:
       "Fix copy up of security related xattrs"
      
      * tag 'ovl-update-5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs:
        ovl: Do not lose security.capability xattr over metadata file copy-up
        ovl: During copy up, first copy up data and then xattrs
    • Merge tag 'fuse-update-5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse · dfee9c25
      Linus Torvalds authored
      Pull fuse updates from Miklos Szeredi:
       "Scalability and performance improvements, as well as minor bug fixes
        and cleanups"
      
      * tag 'fuse-update-5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse: (25 commits)
        fuse: cache readdir calls if filesystem opts out of opendir
        fuse: support clients that don't implement 'opendir'
        fuse: lift bad inode checks into callers
        fuse: multiplex cached/direct_io file operations
        fuse add copy_file_range to direct io fops
        fuse: use iov_iter based generic splice helpers
        fuse: Switch to using async direct IO for FOPEN_DIRECT_IO
        fuse: use atomic64_t for khctr
        fuse: clean up aborted
        fuse: Protect ff->reserved_req via corresponding fi->lock
        fuse: Protect fi->nlookup with fi->lock
        fuse: Introduce fi->lock to protect write related fields
        fuse: Convert fc->attr_version into atomic64_t
        fuse: Add fuse_inode argument to fuse_prepare_release()
        fuse: Verify userspace asks to requeue interrupt that we really sent
        fuse: Do some refactoring in fuse_dev_do_write()
        fuse: Wake up req->waitq of only if not background
        fuse: Optimize request_end() by not taking fiq->waitq.lock
        fuse: Kill fasync only if interrupt is queued in queue_interrupt()
        fuse: Remove stale comment in end_requests()
        ...
    • Merge branch 'work.mount' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · 7b47a9e7
      Linus Torvalds authored
      Pull vfs mount infrastructure updates from Al Viro:
       "The rest of core infrastructure; no new syscalls in that pile, but the
        old parts are switched to new infrastructure. At that point
        conversions of individual filesystems can happen independently; some
        are done here (afs, cgroup, procfs, etc.), there's also a large series
        outside of that pile dealing with NFS (quite a bit of option-parsing
        stuff is getting used there - it's one of the most convoluted
        filesystems in terms of mount-related logics), but NFS bits are the
        next cycle fodder.
      
        It got seriously simplified since the last cycle; documentation is
        probably the weakest bit at the moment - I considered dropping the
        commit introducing Documentation/filesystems/mount_api.txt (cutting
        the size increase by quarter ;-), but decided that it would be better
        to fix it up after -rc1 instead.
      
        That pile allows to do followup work in independent branches, which
        should make life much easier for the next cycle. fs/super.c size
        increase is unpleasant; there's a followup series that allows to
        shrink it considerably, but I decided to leave that until the next
        cycle"
      
      * 'work.mount' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (41 commits)
        afs: Use fs_context to pass parameters over automount
        afs: Add fs_context support
        vfs: Add some logging to the core users of the fs_context log
        vfs: Implement logging through fs_context
        vfs: Provide documentation for new mount API
        vfs: Remove kern_mount_data()
        hugetlbfs: Convert to fs_context
        cpuset: Use fs_context
        kernfs, sysfs, cgroup, intel_rdt: Support fs_context
        cgroup: store a reference to cgroup_ns into cgroup_fs_context
        cgroup1_get_tree(): separate "get cgroup_root to use" into a separate helper
        cgroup_do_mount(): massage calling conventions
        cgroup: stash cgroup_root reference into cgroup_fs_context
        cgroup2: switch to option-by-option parsing
        cgroup1: switch to option-by-option parsing
        cgroup: take options parsing into ->parse_monolithic()
        cgroup: fold cgroup1_mount() into cgroup1_get_tree()
        cgroup: start switching to fs_context
        ipc: Convert mqueue fs to fs_context
        proc: Add fs_context support to procfs
        ...
    • Merge branch 'work.iov_iter' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · dbc2fba3
      Linus Torvalds authored
      Pull iov_iter updates from Al Viro:
       "A couple of iov_iter patches - Christoph's crapectomy (the last
        remaining user of iov_for_each() went away with lustre, IIRC) and
        Eric's optimization of sanity checks"
      
      * 'work.iov_iter' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
        iov_iter: optimize page_copy_sane()
        uio: remove the unused iov_for_each macro
    • Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · 5f739e4a
      Linus Torvalds authored
      Pull misc vfs updates from Al Viro:
       "Assorted fixes (really no common topic here)"
      
      * 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
        vfs: Make __vfs_write() static
        vfs: fix preadv64v2 and pwritev64v2 compat syscalls with offset == -1
        pipe: stop using ->can_merge
        splice: don't merge into linked buffers
        fs: move generic stat response attr handling to vfs_getattr_nosec
        orangefs: don't reinitialize result_mask in ->getattr
        fs/devpts: always delete dcache dentry-s in dput()
    • Merge branch 'akpm' (patches from Andrew) · a667cb7a
      Linus Torvalds authored
      Merge misc updates from Andrew Morton:
      
       - a few misc things
      
       - the rest of MM
      
       - remove flex_arrays, replace with new simple radix-tree implementation
      
      * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (38 commits)
        Drop flex_arrays
        sctp: convert to genradix
        proc: commit to genradix
        generic radix trees
        selinux: convert to kvmalloc
        md: convert to kvmalloc
        openvswitch: convert to kvmalloc
        of: fix kmemleak crash caused by imbalance in early memory reservation
        mm: memblock: update comments and kernel-doc
        memblock: split checks whether a region should be skipped to a helper function
        memblock: remove memblock_{set,clear}_region_flags
        memblock: drop memblock_alloc_*_nopanic() variants
        memblock: memblock_alloc_try_nid: don't panic
        treewide: add checks for the return value of memblock_alloc*()
        swiotlb: add checks for the return value of memblock_alloc*()
        init/main: add checks for the return value of memblock_alloc*()
        mm/percpu: add checks for the return value of memblock_alloc*()
        sparc: add checks for the return value of memblock_alloc*()
        ia64: add checks for the return value of memblock_alloc*()
        arch: don't memset(0) memory returned by memblock_alloc()
        ...
    • Drop flex_arrays · 586187d7
      Kent Overstreet authored
      All existing users have been converted to generic radix trees
      
      Link: http://lkml.kernel.org/r/20181217131929.11727-8-kent.overstreet@gmail.com
      Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
      Acked-by: Dave Hansen <dave.hansen@intel.com>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Eric Paris <eparis@parisplace.org>
      Cc: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Neil Horman <nhorman@tuxdriver.com>
      Cc: Paul Moore <paul@paul-moore.com>
      Cc: Pravin B Shelar <pshelar@ovn.org>
      Cc: Shaohua Li <shli@kernel.org>
      Cc: Stephen Smalley <sds@tycho.nsa.gov>
      Cc: Vlad Yasevich <vyasevich@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • sctp: convert to genradix · 2075e50c
      Kent Overstreet authored
      This also makes sctp_stream_alloc_(out|in) saner, in that they no longer
      allocate new flex_arrays/genradixes, they just preallocate more
      elements.
      
      This code does however have a suspicious lack of locking.
      
      Link: http://lkml.kernel.org/r/20181217131929.11727-7-kent.overstreet@gmail.com
      Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
      Cc: Vlad Yasevich <vyasevich@gmail.com>
      Cc: Neil Horman <nhorman@tuxdriver.com>
      Cc: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Eric Paris <eparis@parisplace.org>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Paul Moore <paul@paul-moore.com>
      Cc: Pravin B Shelar <pshelar@ovn.org>
      Cc: Shaohua Li <shli@kernel.org>
      Cc: Stephen Smalley <sds@tycho.nsa.gov>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • proc: commit to genradix · 94f8f3b0
      Kent Overstreet authored
      The new generic radix trees have a simpler API and implementation, and
      no limitations on number of elements, so all flex_array users are being
      converted
      
      Link: http://lkml.kernel.org/r/20181217131929.11727-6-kent.overstreet@gmail.com
      Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
      Reviewed-by: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Eric Paris <eparis@parisplace.org>
      Cc: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Neil Horman <nhorman@tuxdriver.com>
      Cc: Paul Moore <paul@paul-moore.com>
      Cc: Pravin B Shelar <pshelar@ovn.org>
      Cc: Shaohua Li <shli@kernel.org>
      Cc: Stephen Smalley <sds@tycho.nsa.gov>
      Cc: Vlad Yasevich <vyasevich@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • generic radix trees · ba20ba2e
      Kent Overstreet authored
      Very simple radix tree implementation that supports storing arbitrary
      size entries, up to PAGE_SIZE - upcoming patches will convert existing
      flex_array users to genradixes.  The new genradix code has a much
      simpler API and implementation, and doesn't have a hard limit on the
      number of elements like flex_array does.
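
      To make the idea concrete, here is a deliberately reduced userspace
      sketch of the data structure described above: fixed-size entries packed
      into zero-filled pages that are allocated on first use.  It is not the
      kernel's genradix code.  All names (toy_genradix and its helpers) are
      hypothetical, and the sketch uses a single fixed-fanout interior node, so
      unlike the real implementation it does cap the number of elements.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define TOY_PAGE_SIZE 4096
#define TOY_FANOUT    64    /* fixed here; a real radix tree grows in depth */

/* Hypothetical toy structure, not the kernel's genradix types. */
struct toy_genradix {
	size_t entry_size;           /* bytes per entry, at most one page */
	void *leaves[TOY_FANOUT];    /* leaf pages, allocated lazily */
};

static void toy_genradix_init(struct toy_genradix *r, size_t entry_size)
{
	assert(entry_size > 0 && entry_size <= TOY_PAGE_SIZE);
	r->entry_size = entry_size;
	for (size_t i = 0; i < TOY_FANOUT; i++)
		r->leaves[i] = NULL;
}

/*
 * Return a pointer to entry idx.  If its leaf page does not exist yet, it is
 * allocated (zero-filled) when alloc is true; otherwise NULL is returned.
 * Entries never straddle a page boundary.
 */
static void *toy_genradix_ptr(struct toy_genradix *r, size_t idx, bool alloc)
{
	size_t per_leaf = TOY_PAGE_SIZE / r->entry_size;
	size_t leaf = idx / per_leaf;

	if (leaf >= TOY_FANOUT)
		return NULL;    /* beyond what this toy can index */
	if (!r->leaves[leaf]) {
		if (!alloc)
			return NULL;
		r->leaves[leaf] = calloc(1, TOY_PAGE_SIZE);
		if (!r->leaves[leaf])
			return NULL;
	}
	return (char *)r->leaves[leaf] + (idx % per_leaf) * r->entry_size;
}

/* Release every leaf page and reset the slots. */
static void toy_genradix_free(struct toy_genradix *r)
{
	for (size_t i = 0; i < TOY_FANOUT; i++) {
		free(r->leaves[i]);
		r->leaves[i] = NULL;
	}
}
```

      A caller would initialise the structure with the entry size, call
      toy_genradix_ptr() with alloc set to true before writing an entry, and
      call toy_genradix_free() when done.  Lookups with alloc set to false never
      allocate, so sparse indices cost one page per touched leaf.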
      
      Link: http://lkml.kernel.org/r/20181217131929.11727-5-kent.overstreet@gmail.com
      Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Eric Paris <eparis@parisplace.org>
      Cc: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Neil Horman <nhorman@tuxdriver.com>
      Cc: Paul Moore <paul@paul-moore.com>
      Cc: Pravin B Shelar <pshelar@ovn.org>
      Cc: Shaohua Li <shli@kernel.org>
      Cc: Stephen Smalley <sds@tycho.nsa.gov>
      Cc: Vlad Yasevich <vyasevich@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • selinux: convert to kvmalloc · acdf52d9
      Kent Overstreet authored
      The flex arrays were being used for constant sized arrays, so there's no
      benefit to using flex_arrays over something simpler.
      
      Link: http://lkml.kernel.org/r/20181217131929.11727-4-kent.overstreet@gmail.com
      Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
      Cc: Paul Moore <paul@paul-moore.com>
      Cc: Stephen Smalley <sds@tycho.nsa.gov>
      Cc: Eric Paris <eparis@parisplace.org>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Neil Horman <nhorman@tuxdriver.com>
      Cc: Pravin B Shelar <pshelar@ovn.org>
      Cc: Shaohua Li <shli@kernel.org>
      Cc: Vlad Yasevich <vyasevich@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • md: convert to kvmalloc · b330e6a4
      Kent Overstreet authored
      The code really just wants a big flat buffer, so just do that.
      
      Link: http://lkml.kernel.org/r/20181217131929.11727-3-kent.overstreet@gmail.com
      Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
      Reviewed-by: Matthew Wilcox <willy@infradead.org>
      Cc: Shaohua Li <shli@kernel.org>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Eric Paris <eparis@parisplace.org>
      Cc: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Cc: Neil Horman <nhorman@tuxdriver.com>
      Cc: Paul Moore <paul@paul-moore.com>
      Cc: Pravin B Shelar <pshelar@ovn.org>
      Cc: Stephen Smalley <sds@tycho.nsa.gov>
      Cc: Vlad Yasevich <vyasevich@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Kent Overstreet's avatar
      openvswitch: convert to kvmalloc · ee9c5e67
      Kent Overstreet authored
      Patch series "generic radix trees; drop flex arrays".
      
      This patch (of 7):
      
      There was no real need for this code to be using flex arrays; it's just
      implementing a hash table.  Ideally it would be using rhashtables, but
      that conversion would be significantly more complicated.
      
      Link: http://lkml.kernel.org/r/20181217131929.11727-2-kent.overstreet@gmail.com
      Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
      Reviewed-by: Matthew Wilcox <willy@infradead.org>
      Cc: Pravin B Shelar <pshelar@ovn.org>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Eric Paris <eparis@parisplace.org>
      Cc: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Cc: Neil Horman <nhorman@tuxdriver.com>
      Cc: Paul Moore <paul@paul-moore.com>
      Cc: Shaohua Li <shli@kernel.org>
      Cc: Stephen Smalley <sds@tycho.nsa.gov>
      Cc: Vlad Yasevich <vyasevich@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      ee9c5e67
    • Mike Rapoport's avatar
      of: fix kmemleak crash caused by imbalance in early memory reservation · 5c01a25a
      Mike Rapoport authored
      Marc Gonzalez reported the following kmemleak crash:
      
        Unable to handle kernel paging request at virtual address ffffffc021e00000
        Mem abort info:
          ESR = 0x96000006
          Exception class = DABT (current EL), IL = 32 bits
          SET = 0, FnV = 0
          EA = 0, S1PTW = 0
        Data abort info:
          ISV = 0, ISS = 0x00000006
          CM = 0, WnR = 0
        swapper pgtable: 4k pages, 39-bit VAs, pgdp = (____ptrval____) [ffffffc021e00000] pgd=000000017e3ba803, pud=000000017e3ba803, pmd=0000000000000000
        Internal error: Oops: 96000006 [#1] PREEMPT SMP
        Modules linked in:
        CPU: 6 PID: 523 Comm: kmemleak Tainted: G S      W         5.0.0-rc1 #13
        Hardware name: Qualcomm Technologies, Inc. MSM8998 v1 MTP (DT)
        pstate: 80000085 (Nzcv daIf -PAN -UAO)
        pc : scan_block+0x70/0x190
        lr : scan_block+0x6c/0x190
        Process kmemleak (pid: 523, stack limit = 0x(____ptrval____))
        Call trace:
         scan_block+0x70/0x190
         scan_gray_list+0x108/0x1c0
         kmemleak_scan+0x33c/0x7c0
         kmemleak_scan_thread+0x98/0xf0
         kthread+0x11c/0x120
         ret_from_fork+0x10/0x1c
        Code: f9000fb4 d503201f 97ffffd2 35000580 (f9400260)
      
      The crash happens when a no-map area is allocated in
      early_init_dt_alloc_reserved_memory_arch().  The allocated region is
      registered with kmemleak, but it is then removed from memblock using
      memblock_remove(), which is not kmemleak-aware.
      
      Replacing memblock_phys_alloc_range() with memblock_find_in_range()
      ensures that the allocated memory is never registered with kmemleak,
      so removing it with memblock_remove() is safe.
      
      As a bonus, since memblock_find_in_range() ensures that the allocation
      falls within the specified range, the bounds check can be removed.
      
      [rppt@linux.ibm.com: of: fix parameters order for call to memblock_find_in_range()]
        Link: http://lkml.kernel.org/r/20190221112619.GC32004@rapoport-lnx
      Link: http://lkml.kernel.org/r/20190213181921.GB15270@rapoport-lnx
      Fixes: 3f0c8206 ("drivers: of: add initialization code for dynamic reserved memory")
      Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
      Acked-by: Marek Szyprowski <m.szyprowski@samsung.com>
      Acked-by: Prateek Patel <prpatel@nvidia.com>
      Tested-by: Marc Gonzalez <marc.w.gonzalez@free.fr>
      Cc: Rob Herring <robh+dt@kernel.org>
      Cc: Frank Rowand <frowand.list@gmail.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      5c01a25a
    • Mike Rapoport's avatar
      mm: memblock: update comments and kernel-doc · a2974133
      Mike Rapoport authored
      * Remove comments mentioning bootmem
      * Extend "DOC: memblock overview"
      * Add kernel-doc comments for several more functions
      
      [akpm@linux-foundation.org: fix copy-n-paste error]
      Link: http://lkml.kernel.org/r/1549626347-25461-1-git-send-email-rppt@linux.ibm.com
      Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
      Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      a2974133
    • Mike Rapoport's avatar
      memblock: split checks whether a region should be skipped to a helper function · c9a688a3
      Mike Rapoport authored
      __next_mem_range() and __next_mem_range_rev() duplicate the code that
      checks whether a region should be skipped because of node or flags
      incompatibility.
      
      Split this code into a helper function.
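
As an illustration only, the refactoring described above can be sketched in user-space C. This is not the kernel code: struct model_region, the MODEL_* flags, ANY_NODE, should_skip_region(), next_region() and prev_region() are hypothetical names, and the skip rules are a simplified assumption (node mismatch, mirror requested but region not mirrored, no-map region not explicitly requested).

```c
#include <stdbool.h>

#define ANY_NODE  (-1)   /* hypothetical "no node preference" marker */
#define NO_REGION (-1)   /* hypothetical "nothing found" result */

/* Hypothetical flags, used both on regions and in requests. */
enum { MODEL_MIRROR = 0x1, MODEL_NOMAP = 0x2 };

struct model_region {
	int nid;            /* node the region belongs to */
	unsigned int flags; /* MODEL_* attributes of the region */
};

/* The single place that decides whether a region is incompatible. */
static bool should_skip_region(const struct model_region *r, int nid,
			       unsigned int flags)
{
	/* Only regions on the requested node are eligible. */
	if (nid != ANY_NODE && nid != r->nid)
		return true;
	/* Mirrored memory was requested: skip non-mirrored regions. */
	if ((flags & MODEL_MIRROR) && !(r->flags & MODEL_MIRROR))
		return true;
	/* Skip no-map regions unless they were asked for explicitly. */
	if (!(flags & MODEL_NOMAP) && (r->flags & MODEL_NOMAP))
		return true;
	return false;
}

/* Forward walk: first compatible region at or after 'start'. */
static int next_region(const struct model_region *rs, int count, int start,
		       int nid, unsigned int flags)
{
	for (int i = start; i < count; i++)
		if (!should_skip_region(&rs[i], nid, flags))
			return i;
	return NO_REGION;
}

/* Reverse walk: first compatible region at or before 'start'. */
static int prev_region(const struct model_region *rs, int start,
		       int nid, unsigned int flags)
{
	for (int i = start; i >= 0; i--)
		if (!should_skip_region(&rs[i], nid, flags))
			return i;
	return NO_REGION;
}
```

Both walkers share one predicate, so a change to the skip rules is made in one place instead of two.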
      
      Link: http://lkml.kernel.org/r/1549455025-17706-3-git-send-email-rppt@linux.ibm.com
      Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
      Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: Michal Hocko <mhocko@suse.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      c9a688a3
    • Mike Rapoport's avatar
      memblock: remove memblock_{set,clear}_region_flags · fe145124
      Mike Rapoport authored
      The memblock API provides dedicated helpers to set or clear a flag on a
      memory region, e.g.  memblock_{mark,clear}_hotplug().
      
      The memblock_{set,clear}_region_flags() functions are used only by the
      memblock internal function that adjusts the region flags.  Drop these
      functions and use open-coded implementation instead.
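
A minimal user-space sketch of what "open-coded" means here, assuming a simplified region type. The names model_region, MODEL_HOTPLUG, setclr_flag(), mark_hotplug() and clear_hotplug() are hypothetical and only mirror the shape of the kernel helpers named above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical region attributes. */
enum { MODEL_HOTPLUG = 0x1, MODEL_MIRROR = 0x2 };

struct model_region {
	unsigned long flags;
};

/*
 * Internal helper that adjusts a flag on a run of regions. The bit
 * manipulation is written inline instead of going through separate
 * set_region_flags()/clear_region_flags() wrappers.
 */
static void setclr_flag(struct model_region *rs, size_t count, bool set,
			unsigned long flag)
{
	for (size_t i = 0; i < count; i++) {
		if (set)
			rs[i].flags |= flag;
		else
			rs[i].flags &= ~flag;
	}
}

/* Dedicated per-flag helpers remain the public interface. */
static void mark_hotplug(struct model_region *rs, size_t count)
{
	setclr_flag(rs, count, true, MODEL_HOTPLUG);
}

static void clear_hotplug(struct model_region *rs, size_t count)
{
	setclr_flag(rs, count, false, MODEL_HOTPLUG);
}
```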
      
      Link: http://lkml.kernel.org/r/1549455025-17706-2-git-send-email-rppt@linux.ibm.com
      Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
      Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
      Cc: Michal Hocko <mhocko@suse.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      fe145124
    • Mike Rapoport's avatar
      memblock: drop memblock_alloc_*_nopanic() variants · 26fb3dae
      Mike Rapoport authored
      As all the memblock allocation functions return NULL in case of error
      rather than panic(), the duplicates with _nopanic suffix can be removed.
      
      Link: http://lkml.kernel.org/r/1548057848-15136-22-git-send-email-rppt@linux.ibm.com
      Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
      Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Reviewed-by: Petr Mladek <pmladek@suse.com>		[printk]
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christophe Leroy <christophe.leroy@c-s.fr>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Dennis Zhou <dennis@kernel.org>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Greentime Hu <green.hu@gmail.com>
      Cc: Guan Xuetao <gxt@pku.edu.cn>
      Cc: Guo Ren <guoren@kernel.org>
      Cc: Guo Ren <ren_guo@c-sky.com>				[c-sky]
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Juergen Gross <jgross@suse.com>			[Xen]
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Paul Burton <paul.burton@mips.com>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Rich Felker <dalias@libc.org>
      Cc: Rob Herring <robh+dt@kernel.org>
      Cc: Rob Herring <robh@kernel.org>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Stafford Horne <shorne@gmail.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      26fb3dae
    • Mike Rapoport's avatar
      memblock: memblock_alloc_try_nid: don't panic · c0dbe825
      Mike Rapoport authored
      As all the memblock_alloc*() users are now checking the return value and
      panic() in case of error, the panic() call can be removed from the core
      memblock allocator, namely memblock_alloc_try_nid().
      
      Link: http://lkml.kernel.org/r/1548057848-15136-21-git-send-email-rppt@linux.ibm.com
      Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christophe Leroy <christophe.leroy@c-s.fr>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Dennis Zhou <dennis@kernel.org>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Greentime Hu <green.hu@gmail.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Guan Xuetao <gxt@pku.edu.cn>
      Cc: Guo Ren <guoren@kernel.org>
      Cc: Guo Ren <ren_guo@c-sky.com>				[c-sky]
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Juergen Gross <jgross@suse.com>			[Xen]
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Paul Burton <paul.burton@mips.com>
      Cc: Petr Mladek <pmladek@suse.com>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Rich Felker <dalias@libc.org>
      Cc: Rob Herring <robh+dt@kernel.org>
      Cc: Rob Herring <robh@kernel.org>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Stafford Horne <shorne@gmail.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      c0dbe825
    • Mike Rapoport's avatar
      treewide: add checks for the return value of memblock_alloc*() · 8a7f97b9
      Mike Rapoport authored
      Add checks for the return value of memblock_alloc*() functions and call
      panic() in case of error.  The panic message repeats the one used by
      the panicking memblock allocators, with the parameters adjusted to
      include only the relevant ones.
      
      The replacement was mostly automated with semantic patches like the one
      below with manual massaging of format strings.
      
        @@
        expression ptr, size, align;
        @@
        ptr = memblock_alloc(size, align);
        + if (!ptr)
        + 	panic("%s: Failed to allocate %lu bytes align=0x%lx\n", __func__, size, align);
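
For readers unfamiliar with the pattern, here is a user-space sketch of a call site after such a transformation. It is an assumption-laden model, not kernel code: model_alloc() stands in for memblock_alloc() (it uses calloc() and ignores the alignment), abort() stands in for panic(), and format_alloc_failure() and alloc_or_die() are hypothetical helpers. Only the format string is taken from the semantic patch above.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical stand-in for memblock_alloc(): returns zeroed memory or
 * NULL. The alignment is accepted only so it can be reported on failure.
 */
static void *model_alloc(unsigned long size, unsigned long align)
{
	(void)align;
	if (size == 0)
		return NULL;
	return calloc(1, size);
}

/* Build the failure message using the format from the semantic patch. */
static int format_alloc_failure(char *buf, size_t len, const char *func,
				unsigned long size, unsigned long align)
{
	return snprintf(buf, len,
			"%s: Failed to allocate %lu bytes align=0x%lx\n",
			func, size, align);
}

/* The caller, not the allocator, decides that failure is fatal. */
static void *alloc_or_die(const char *func, unsigned long size,
			  unsigned long align)
{
	void *ptr = model_alloc(size, align);

	if (!ptr) {
		char msg[128];

		format_alloc_failure(msg, sizeof(msg), func, size, align);
		fputs(msg, stderr);
		abort(); /* user-space substitute for panic() */
	}
	return ptr;
}
```

Keeping the check at the call site lets a caller that can tolerate failure skip the fatal path and handle NULL itself.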
      
      [anders.roxell@linaro.org: use '%pa' with 'phys_addr_t' type]
        Link: http://lkml.kernel.org/r/20190131161046.21886-1-anders.roxell@linaro.org
      [rppt@linux.ibm.com: fix format strings for panics after memblock_alloc]
        Link: http://lkml.kernel.org/r/1548950940-15145-1-git-send-email-rppt@linux.ibm.com
      [rppt@linux.ibm.com: don't panic if the allocation in sparse_buffer_init fails]
        Link: http://lkml.kernel.org/r/20190131074018.GD28876@rapoport-lnx
      [akpm@linux-foundation.org: fix xtensa printk warning]
      Link: http://lkml.kernel.org/r/1548057848-15136-20-git-send-email-rppt@linux.ibm.com
      Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
      Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
      Reviewed-by: Guo Ren <ren_guo@c-sky.com>		[c-sky]
      Acked-by: Paul Burton <paul.burton@mips.com>		[MIPS]
      Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>	[s390]
      Reviewed-by: Juergen Gross <jgross@suse.com>		[Xen]
      Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org>	[m68k]
      Acked-by: Max Filippov <jcmvbkbc@gmail.com>		[xtensa]
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christophe Leroy <christophe.leroy@c-s.fr>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Dennis Zhou <dennis@kernel.org>
      Cc: Greentime Hu <green.hu@gmail.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Guan Xuetao <gxt@pku.edu.cn>
      Cc: Guo Ren <guoren@kernel.org>
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Petr Mladek <pmladek@suse.com>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Rich Felker <dalias@libc.org>
      Cc: Rob Herring <robh+dt@kernel.org>
      Cc: Rob Herring <robh@kernel.org>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Stafford Horne <shorne@gmail.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      8a7f97b9
    • Mike Rapoport's avatar
      swiotlb: add checks for the return value of memblock_alloc*() · a0bf842e
      Mike Rapoport authored
      Add panic() calls if memblock_alloc() returns NULL.
      
      The panic() format duplicates the one used by memblock itself.  To
      avoid overly long parameter lists, open-coded allocation size
      calculations are replaced with a local variable.
      
      Link: http://lkml.kernel.org/r/1548057848-15136-19-git-send-email-rppt@linux.ibm.com
      Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christophe Leroy <christophe.leroy@c-s.fr>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Dennis Zhou <dennis@kernel.org>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Greentime Hu <green.hu@gmail.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Guan Xuetao <gxt@pku.edu.cn>
      Cc: Guo Ren <guoren@kernel.org>
      Cc: Guo Ren <ren_guo@c-sky.com>				[c-sky]
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Juergen Gross <jgross@suse.com>			[Xen]
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Paul Burton <paul.burton@mips.com>
      Cc: Petr Mladek <pmladek@suse.com>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Rich Felker <dalias@libc.org>
      Cc: Rob Herring <robh+dt@kernel.org>
      Cc: Rob Herring <robh@kernel.org>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Stafford Horne <shorne@gmail.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      a0bf842e
    • Mike Rapoport's avatar
      init/main: add checks for the return value of memblock_alloc*() · f5c7310a
      Mike Rapoport authored
      Add panic() calls if memblock_alloc() returns NULL.
      
      The panic() format duplicates the one used by memblock itself.  To
      avoid overly long parameter lists, open-coded allocation size
      calculations are replaced with a local variable.
      
      Link: http://lkml.kernel.org/r/1548057848-15136-18-git-send-email-rppt@linux.ibm.com
      Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christophe Leroy <christophe.leroy@c-s.fr>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Dennis Zhou <dennis@kernel.org>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Greentime Hu <green.hu@gmail.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Guan Xuetao <gxt@pku.edu.cn>
      Cc: Guo Ren <guoren@kernel.org>
      Cc: Guo Ren <ren_guo@c-sky.com>				[c-sky]
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Juergen Gross <jgross@suse.com>			[Xen]
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Paul Burton <paul.burton@mips.com>
      Cc: Petr Mladek <pmladek@suse.com>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Rich Felker <dalias@libc.org>
      Cc: Rob Herring <robh+dt@kernel.org>
      Cc: Rob Herring <robh@kernel.org>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Stafford Horne <shorne@gmail.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      f5c7310a
    • Mike Rapoport's avatar
      mm/percpu: add checks for the return value of memblock_alloc*() · f655f405
      Mike Rapoport authored
      Add panic() calls if memblock_alloc() returns NULL.
      
      The panic() format duplicates the one used by memblock itself.  To
      avoid overly long parameter lists, open-coded allocation size
      calculations are replaced with a local variable.
      
      Link: http://lkml.kernel.org/r/1548057848-15136-17-git-send-email-rppt@linux.ibm.com
      Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christophe Leroy <christophe.leroy@c-s.fr>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Dennis Zhou <dennis@kernel.org>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Greentime Hu <green.hu@gmail.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Guan Xuetao <gxt@pku.edu.cn>
      Cc: Guo Ren <guoren@kernel.org>
      Cc: Guo Ren <ren_guo@c-sky.com>				[c-sky]
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Juergen Gross <jgross@suse.com>			[Xen]
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Paul Burton <paul.burton@mips.com>
      Cc: Petr Mladek <pmladek@suse.com>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Rich Felker <dalias@libc.org>
      Cc: Rob Herring <robh+dt@kernel.org>
      Cc: Rob Herring <robh@kernel.org>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Stafford Horne <shorne@gmail.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      f655f405
    • Mike Rapoport's avatar
      sparc: add checks for the return value of memblock_alloc*() · b1e1c869
      Mike Rapoport authored
      Add panic() calls if memblock_alloc*() returns NULL.
      
      Most of the changes simply add
      
              if(!ptr)
                      panic();
      
      statements after the calls to memblock_alloc*() variants.
      
      Exceptions are pcpu_populate_pte() and kernel_map_range() that were
      slightly refactored to accommodate the change.
      
      Link: http://lkml.kernel.org/r/1548057848-15136-16-git-send-email-rppt@linux.ibm.com
      Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
      Acked-by: David S. Miller <davem@davemloft.net>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christophe Leroy <christophe.leroy@c-s.fr>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Dennis Zhou <dennis@kernel.org>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Greentime Hu <green.hu@gmail.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Guan Xuetao <gxt@pku.edu.cn>
      Cc: Guo Ren <guoren@kernel.org>
      Cc: Guo Ren <ren_guo@c-sky.com>				[c-sky]
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Juergen Gross <jgross@suse.com>			[Xen]
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Paul Burton <paul.burton@mips.com>
      Cc: Petr Mladek <pmladek@suse.com>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Rich Felker <dalias@libc.org>
      Cc: Rob Herring <robh+dt@kernel.org>
      Cc: Rob Herring <robh@kernel.org>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Stafford Horne <shorne@gmail.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      b1e1c869
    • Mike Rapoport's avatar
      ia64: add checks for the return value of memblock_alloc*() · d80db5c1
      Mike Rapoport authored
      Add panic() calls if memblock_alloc*() returns NULL.
      
      Most of the changes simply add
      
      	if(!ptr)
      		panic();
      
      statements after the calls to memblock_alloc*() variants.
      
      Exceptions are create_mem_map_page_table() and ia64_log_init() that were
      slightly refactored to accommodate the change.
      
      Link: http://lkml.kernel.org/r/1548057848-15136-15-git-send-email-rppt@linux.ibm.com
      Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christophe Leroy <christophe.leroy@c-s.fr>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Dennis Zhou <dennis@kernel.org>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Greentime Hu <green.hu@gmail.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Guan Xuetao <gxt@pku.edu.cn>
      Cc: Guo Ren <guoren@kernel.org>
      Cc: Guo Ren <ren_guo@c-sky.com>				[c-sky]
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Juergen Gross <jgross@suse.com>			[Xen]
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Paul Burton <paul.burton@mips.com>
      Cc: Petr Mladek <pmladek@suse.com>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Rich Felker <dalias@libc.org>
      Cc: Rob Herring <robh+dt@kernel.org>
      Cc: Rob Herring <robh@kernel.org>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Stafford Horne <shorne@gmail.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      d80db5c1
    • Mike Rapoport's avatar
      arch: don't memset(0) memory returned by memblock_alloc() · 0240dfd5
      Mike Rapoport authored
      memblock_alloc() already clears the allocated memory, no point in doing
      it twice.
      
      Link: http://lkml.kernel.org/r/1548057848-15136-14-git-send-email-rppt@linux.ibm.com
      Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
      Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>	[m68k]
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christophe Leroy <christophe.leroy@c-s.fr>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Dennis Zhou <dennis@kernel.org>
      Cc: Greentime Hu <green.hu@gmail.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Guan Xuetao <gxt@pku.edu.cn>
      Cc: Guo Ren <guoren@kernel.org>
      Cc: Guo Ren <ren_guo@c-sky.com>				[c-sky]
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Juergen Gross <jgross@suse.com>			[Xen]
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Paul Burton <paul.burton@mips.com>
      Cc: Petr Mladek <pmladek@suse.com>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Rich Felker <dalias@libc.org>
      Cc: Rob Herring <robh+dt@kernel.org>
      Cc: Rob Herring <robh@kernel.org>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Stafford Horne <shorne@gmail.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      0240dfd5
    • Mike Rapoport's avatar
      arch: use memblock_alloc() instead of memblock_alloc_from(size, align, 0) · 9415673e
      Mike Rapoport authored
      The last parameter of memblock_alloc_from() is the lower limit for the
      memory allocation.  When it is 0, the call is equivalent to
      memblock_alloc().
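
The equivalence can be shown with a toy allocator. This is a sketch under stated assumptions, not the memblock implementation: a simple bump allocator over a hypothetical 4096-byte pool that hands out offsets, where model_alloc_from(), model_alloc(), model_reset(), POOL_SIZE and ALLOC_FAIL are all illustrative names and the alignment is assumed to be a power of two.

```c
#include <stddef.h>

#define POOL_SIZE  4096u          /* hypothetical pool size in bytes */
#define ALLOC_FAIL ((size_t)-1)   /* hypothetical failure value */

static size_t pool_next;          /* first free offset in the pool */

/* Round v up to a multiple of align (align must be a power of two). */
static size_t align_up(size_t v, size_t align)
{
	return (v + align - 1) & ~(align - 1);
}

/* Allocate 'size' bytes at an aligned offset no lower than 'min_off'. */
static size_t model_alloc_from(size_t size, size_t align, size_t min_off)
{
	size_t start = pool_next > min_off ? pool_next : min_off;

	start = align_up(start, align);
	if (start > POOL_SIZE || size > POOL_SIZE - start)
		return ALLOC_FAIL;
	pool_next = start + size;
	return start;
}

/* With a lower limit of 0 the limit never constrains the search. */
static size_t model_alloc(size_t size, size_t align)
{
	return model_alloc_from(size, align, 0);
}

/* Return the pool to its initial, empty state. */
static void model_reset(void)
{
	pool_next = 0;
}
```

Because a lower limit of 0 can never exclude any offset, the two-argument form is the natural spelling at call sites that pass 0.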
      
      Link: http://lkml.kernel.org/r/1548057848-15136-13-git-send-email-rppt@linux.ibm.com
      Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
      Acked-by: Paul Burton <paul.burton@mips.com> # MIPS part
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christophe Leroy <christophe.leroy@c-s.fr>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Dennis Zhou <dennis@kernel.org>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Greentime Hu <green.hu@gmail.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Guan Xuetao <gxt@pku.edu.cn>
      Cc: Guo Ren <guoren@kernel.org>
      Cc: Guo Ren <ren_guo@c-sky.com>				[c-sky]
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Juergen Gross <jgross@suse.com>			[Xen]
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Petr Mladek <pmladek@suse.com>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Rich Felker <dalias@libc.org>
      Cc: Rob Herring <robh+dt@kernel.org>
      Cc: Rob Herring <robh@kernel.org>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Stafford Horne <shorne@gmail.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      9415673e
    • Mike Rapoport's avatar
      memblock: make memblock_find_in_range_node() and choose_memblock_flags() static · c366ea89
      Mike Rapoport authored
      These functions are not used outside memblock.  Make them static.
      
      Link: http://lkml.kernel.org/r/1548057848-15136-12-git-send-email-rppt@linux.ibm.com
      Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christophe Leroy <christophe.leroy@c-s.fr>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Dennis Zhou <dennis@kernel.org>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Greentime Hu <green.hu@gmail.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Guan Xuetao <gxt@pku.edu.cn>
      Cc: Guo Ren <guoren@kernel.org>
      Cc: Guo Ren <ren_guo@c-sky.com>				[c-sky]
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Juergen Gross <jgross@suse.com>			[Xen]
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Paul Burton <paul.burton@mips.com>
      Cc: Petr Mladek <pmladek@suse.com>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Rich Felker <dalias@libc.org>
      Cc: Rob Herring <robh+dt@kernel.org>
      Cc: Rob Herring <robh@kernel.org>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Stafford Horne <shorne@gmail.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      c366ea89
    • Mike Rapoport's avatar
      memblock: refactor internal allocation functions · 92d12f95
      Mike Rapoport authored
      Currently, memblock has several internal functions with overlapping
      functionality.  They all call memblock_find_in_range_node() to find free
      memory and then reserve the allocated range and mark it with kmemleak.
      However, they differ in their allocation constraints and fallback
      strategies.
      
      The allocations returning a physical address first attempt to find free
      memory on the specified node within mirrored memory regions, then retry
      on the same node without the requirement for memory mirroring, and
      finally fall back to all available memory.
      
      The allocations returning a virtual address start by clamping the
      allowed range to memblock.current_limit, then attempt to allocate from
      mirrored regions on the specified node, above the user-defined minimal
      address.  If that allocation fails, the next attempt is made with the
      node restriction lifted.  Next, the allocation is retried with the
      minimal address reset to zero, and finally without the requirement for
      mirrored regions.
      
      Let's consolidate the handling of the various fallbacks and make it more
      consistent for the physical and virtual variants.  Most of the fallback
      handling is moved to memblock_alloc_range_nid(), which now handles the
      node and mirror fallbacks.
      
      memblock_alloc_internal() uses memblock_alloc_range_nid() to get the
      physical address of the allocated range and converts it to a virtual
      address.
      
      The fallback for allocation below the specified minimal address remains
      in memblock_alloc_internal() because memblock_alloc_range_nid() is used
      by CMA, which has an exact requirement for the lower bound.
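
      The split of responsibilities can be illustrated with a small
      user-space sketch.  All names here (range_alloc_fn, struct free_range,
      toy_strict_alloc(), alloc_with_min_addr_fallback()) are hypothetical
      and are not part of memblock; the point is only that the strict
      allocator never goes below min_addr, while the wrapper may retry with
      the lower bound reset to zero.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical strict allocator type: honours [min_addr, max_addr) exactly. */
typedef uint64_t (*range_alloc_fn)(uint64_t size, uint64_t min_addr,
				   uint64_t max_addr, void *ctx);

/* Toy model: a single free range [base, base + len). */
struct free_range {
	uint64_t base;
	uint64_t len;
};

/*
 * Return the lowest address inside the free range that is >= min_addr and
 * leaves room for size bytes below max_addr, or 0 if there is none.
 * Nothing is marked as used; this only models the search.
 */
static uint64_t toy_strict_alloc(uint64_t size, uint64_t min_addr,
				 uint64_t max_addr, void *ctx)
{
	struct free_range *fr = ctx;
	uint64_t start = fr->base > min_addr ? fr->base : min_addr;
	uint64_t end = fr->base + fr->len < max_addr ?
		       fr->base + fr->len : max_addr;

	if (start >= end || end - start < size)
		return 0;
	return start;
}

/*
 * Wrapper in the spirit of the paragraph above: only this level is allowed
 * to drop the lower bound, so direct users of the strict allocator keep
 * their exact limits.
 */
static uint64_t alloc_with_min_addr_fallback(range_alloc_fn strict_alloc,
					     void *ctx, uint64_t size,
					     uint64_t min_addr,
					     uint64_t max_addr)
{
	uint64_t phys = strict_alloc(size, min_addr, max_addr, ctx);

	/* Retry without the lower limit, but only if one was requested. */
	if (!phys && min_addr)
		phys = strict_alloc(size, 0, max_addr, ctx);

	return phys;
}
```

      A caller that needs an exact lower bound calls the strict allocator
      directly and sees a failure instead of an address below its limit.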
      
      The memblock_phys_alloc_nid() function is completely dropped as it is not
      used anywhere outside memblock and its only usage can be replaced by a
      call to memblock_alloc_range_nid().
      
      [rppt@linux.ibm.com: fix parameter order in memblock_phys_alloc_try_nid()]
        Link: http://lkml.kernel.org/r/20190203113915.GC8620@rapoport-lnx
      Link: http://lkml.kernel.org/r/1548057848-15136-11-git-send-email-rppt@linux.ibm.com
      Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
      Tested-by: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Christophe Leroy <christophe.leroy@c-s.fr>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Dennis Zhou <dennis@kernel.org>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Greentime Hu <green.hu@gmail.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Guan Xuetao <gxt@pku.edu.cn>
      Cc: Guo Ren <guoren@kernel.org>
      Cc: Guo Ren <ren_guo@c-sky.com>				[c-sky]
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Juergen Gross <jgross@suse.com>			[Xen]
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Max Filippov <jcmvbkbc@gmail.com>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Paul Burton <paul.burton@mips.com>
      Cc: Petr Mladek <pmladek@suse.com>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Rich Felker <dalias@libc.org>
      Cc: Rob Herring <robh+dt@kernel.org>
      Cc: Rob Herring <robh@kernel.org>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Stafford Horne <shorne@gmail.com>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      92d12f95