1. 02 May, 2019 5 commits
    • mips: Reserve memory for the kernel image resources · b93ddc4f
      Serge Semin authored
      The reserved_end variable had been used by the bootmem_init() code
      to find the lowest limit of memory available for the memmap blob. The
      original code just tried to find free memory space above where the
      kernel was placed. This limitation seems justified for the memmap region
      search process, but I can't see any obvious reason to reserve the unused
      space below the kernel, seeing that some platforms place it much higher
      than the standard 1MB. Moreover, the RELOCATION config enables it to be
      loaded at any memory address.
      So let's reserve only the memory occupied by the kernel, leaving the
      region below free for allocations. After doing this we can discard the
      code freeing the space between the kernel _text and VMLINUX_LOAD_ADDRESS
      symbols, since it's going to be free anyway (unless marked as reserved by
      platforms).
      Signed-off-by: Serge Semin <fancer.lancer@gmail.com>
      Signed-off-by: Paul Burton <paul.burton@mips.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: James Hogan <jhogan@kernel.org>
      Cc: Matt Redfearn <matt.redfearn@mips.com>
      Cc: Mike Rapoport <rppt@linux.ibm.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Thomas Bogendoerfer <tbogendoerfer@suse.de>
      Cc: Huacai Chen <chenhc@lemote.com>
      Cc: Stefan Agner <stefan@agner.ch>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Cc: Alexandre Belloni <alexandre.belloni@bootlin.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: linux-mips@vger.kernel.org
      Cc: linux-kernel@vger.kernel.org
    • MIPS: Remove duplicate EBase configuration · de56d4c1
      Paul Burton authored
      Clean up our configuration of the EBase register by making
      configure_exception_vector() write to it unconditionally on systems
      implementing MIPSr2 or higher, and removing the duplicate code in
      per_cpu_trap_init(). The latter would have duplicated work on systems
      with vectored interrupts, and didn't set BEV for safety like the
      configure_exception_vector() version of the code does.
      Signed-off-by: Paul Burton <paul.burton@mips.com>
      Reviewed-by: Serge Semin <fancer.lancer@gmail.com>
      Tested-by: Serge Semin <fancer.lancer@gmail.com>
      Cc: linux-mips@vger.kernel.org
    • MIPS: Sync icache for whole exception vector · 783454e2
      Paul Burton authored
      Rather than performing cache flushing for a fixed 0x400 bytes, use the
      actual size of the vector in order to ensure we cover all emitted code
      on systems that make use of vectored interrupts.
      Signed-off-by: Paul Burton <paul.burton@mips.com>
      Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
      Reviewed-by: Serge Semin <fancer.lancer@gmail.com>
      Tested-by: Serge Semin <fancer.lancer@gmail.com>
      Cc: linux-mips@vger.kernel.org
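The gap that the "Sync icache for whole exception vector" commit closes can be illustrated with a little arithmetic. The sketch below is not the kernel code: the constants (a 0x200-byte base area, 0x100 bytes between vectored-interrupt entries, 64 vectors) and the helper names are assumptions modelled on the MIPS trap setup code of that period, and only the 0x400 figure comes from the commit message itself.

```c
#include <stddef.h>

/* Assumed layout values; treat them as illustrative, not authoritative. */
#define VECTOR_BASE_SIZE 0x200u /* space before the first interrupt vector */
#define VECTOR_SPACING   0x100u /* bytes between vectored-interrupt entries */
#define VECTOR_COUNT     64u    /* number of interrupt vectors */
#define OLD_FLUSH_SIZE   0x400u /* fixed length mentioned in the commit */

/* Size of the exception vector area for a given interrupt mode. */
static size_t vector_size(int vectored, size_t page_size)
{
	if (vectored)
		return VECTOR_BASE_SIZE + VECTOR_SPACING * VECTOR_COUNT;
	return page_size;
}

/* Bytes of emitted code that a fixed 0x400-byte flush would not cover. */
static size_t unflushed_bytes(size_t vec_size)
{
	return vec_size > OLD_FLUSH_SIZE ? vec_size - OLD_FLUSH_SIZE : 0;
}
```

Under these assumptions a vectored-interrupt system has a vector area far larger than 0x400 bytes, which is why flushing by the actual size matters.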
    • MIPS: Always allocate exception vector for MIPSr2+ · 172dcd93
      Paul Burton authored
      Currently we allocate the exception vector on systems which use a
      vectored interrupt mode, but otherwise attempt to reuse whatever
      exception vector the bootloader uses.
      
      This can be problematic for a number of reasons:
      
        1) The memory isn't properly marked reserved in the memblock
           allocator. We've relied on the fact that EBase is generally in the
           memory below the kernel image which we don't free, but this is
           about to change.
      
        2) Recent versions of U-Boot place their exception vector high in
           kseg0, in memory which isn't protected by being lower than the
           kernel anyway & can end up being clobbered.
      
        3) We are unnecessarily reliant upon there being memory at the address
           EBase points to upon entry to the kernel. This is often the case,
           but if the bootloader doesn't configure EBase & leaves it with its
           default value then we rely upon there being memory at physical
           address 0 for no good reason.
      
      Improve this situation by allocating the exception vector in all cases
      when running on MIPSr2 or higher, and reserving the memory for MIPSr1 or
      lower. This ensures we don't clobber the exception vector in any
      configuration, and for MIPSr2 & higher removes the need for memory at
      physical address 0.
      Signed-off-by: Paul Burton <paul.burton@mips.com>
      Reviewed-by: Serge Semin <fancer.lancer@gmail.com>
      Tested-by: Serge Semin <fancer.lancer@gmail.com>
      Cc: linux-mips@vger.kernel.org
    • MIPS: Use memblock_phys_alloc() for exception vector · f995adb0
      Paul Burton authored
      Allocate the exception vector using memblock_phys_alloc() which gives us
      a physical address, rather than the previous convoluted setup which
      obtained a virtual address using memblock_alloc(), converted it to a
      physical address & then back to a virtual address.
      Signed-off-by: Paul Burton <paul.burton@mips.com>
      Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
      Reviewed-by: Serge Semin <fancer.lancer@gmail.com>
      Tested-by: Serge Semin <fancer.lancer@gmail.com>
      Cc: linux-mips@vger.kernel.org
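The round trip that the memblock_phys_alloc() commit removes can be pictured with a toy model of the 32-bit kseg0 mapping, in which a virtual address is the physical address with the top bits set. Everything here is a hypothetical stand-in: the bump allocator, the starting address and the function names are invented for illustration, alignment is ignored, and the real kernel helpers are not called.

```c
#include <stdint.h>

#define KSEG0_BASE      0x80000000u /* start of the unmapped cached segment */
#define KSEG0_PHYS_MASK 0x1fffffffu /* kseg0 covers the low 512MB of memory */

/* Hypothetical next free physical address for the toy allocator. */
static uint32_t toy_next_free = 0x01000000u;

static uint32_t kseg0_to_phys(uint32_t vaddr)
{
	return vaddr & KSEG0_PHYS_MASK;
}

static uint32_t phys_to_kseg0(uint32_t paddr)
{
	return KSEG0_BASE | (paddr & KSEG0_PHYS_MASK);
}

/* Toy allocator that hands back a physical address. */
static uint32_t toy_phys_alloc(uint32_t size)
{
	uint32_t paddr = toy_next_free;

	toy_next_free += size;
	return paddr;
}

/* Toy allocator that hands back a kseg0 virtual address. */
static uint32_t toy_virt_alloc(uint32_t size)
{
	return phys_to_kseg0(toy_phys_alloc(size));
}

/* Old shape: virtual -> physical -> virtual again. */
static uint32_t ebase_old_flow(uint32_t size)
{
	uint32_t vaddr = toy_virt_alloc(size);
	uint32_t paddr = kseg0_to_phys(vaddr);

	return phys_to_kseg0(paddr);
}

/* New shape: start from the physical address, convert once. */
static uint32_t ebase_new_flow(uint32_t size)
{
	return phys_to_kseg0(toy_phys_alloc(size));
}
```

Both flows end at the same kind of address; the second simply skips two conversions that cancel each other out.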
  2. 24 Apr, 2019 3 commits
    • mips: Combine memblock init and memory reservation loops · cf0c4876
      Serge Semin authored
      Before bootmem was completely removed from the kernel, the last loop
      in bootmem_init() had been used to reserve the correspondingly
      marked regions, initialize sparsemem sections and free the low memory
      pages, which would then be used for early memory allocations. After the
      bootmem removal patchset had been merged, the loop was left to do only
      the first two things. But it didn't do them quite well.
      
      First of all, it leaves the BOOT_MEM_INIT_RAM memory types unreserved,
      which is definitely a bug (although it isn't noticeable, since that type
      is used by the kernel region only, which is fully marked as reserved).
      Secondly, the reservation is supposed to be done for any memory,
      including highmem. (I couldn't figure out why highmem was ignored in the
      first place, since platforms and dts' may declare any memory region for
      reservation.) Thirdly, the reserved_end variable had been used here to
      avoid accidentally freeing memory occupied by the kernel. Since we have
      already reserved the corresponding region earlier in this method, there
      is no need to use the variable here anymore. Fourthly, sparsemem should
      be aware of all the memory types in the system, including ROM_DATA, even
      if it is going to be reserved for the whole system uptime. Finally, once
      all these issues are fixed, the memory reservation loop can be freely
      merged into the memory installation loop, as is done in this patch.
      Signed-off-by: Serge Semin <fancer.lancer@gmail.com>
      Signed-off-by: Paul Burton <paul.burton@mips.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: James Hogan <jhogan@kernel.org>
      Cc: Matt Redfearn <matt.redfearn@mips.com>
      Cc: Mike Rapoport <rppt@linux.ibm.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Thomas Bogendoerfer <tbogendoerfer@suse.de>
      Cc: Huacai Chen <chenhc@lemote.com>
      Cc: Stefan Agner <stefan@agner.ch>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Cc: Alexandre Belloni <alexandre.belloni@bootlin.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: linux-mips@vger.kernel.org
      Cc: linux-kernel@vger.kernel.org
    • mips: Discard rudiments from bootmem_init · 6ea3ba6f
      Serge Semin authored
      There is some pointless code left in the bootmem_init() method since
      the bootmem allocator removal. The first part resides in the PFN ranges
      calculation loop. The conditional expressions and continue operator
      are useless there, since nothing is done after them. The second part is
      in the RAM ranges installation loop. We can simplify the cascade of
      conditions a bit without changing much of the logic, reducing the code
      length. In particular, the end boundary value can be verified after
      its possible reduction to below max_low_pfn.
      Signed-off-by: Serge Semin <fancer.lancer@gmail.com>
      Signed-off-by: Paul Burton <paul.burton@mips.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: James Hogan <jhogan@kernel.org>
      Cc: Matt Redfearn <matt.redfearn@mips.com>
      Cc: Mike Rapoport <rppt@linux.ibm.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Thomas Bogendoerfer <tbogendoerfer@suse.de>
      Cc: Huacai Chen <chenhc@lemote.com>
      Cc: Stefan Agner <stefan@agner.ch>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Cc: Alexandre Belloni <alexandre.belloni@bootlin.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: linux-mips@vger.kernel.org
      Cc: linux-kernel@vger.kernel.org
    • mips: Make sure kernel .bss exists in boot mem pool · a703db3d
      Serge Semin authored
      Current MIPS platform code makes sure the kernel text, data and init
      sections are added to the boot memory map pool right after the
      arch-specific memory setup method has been executed. But for some reason
      the MIPS platform code skipped the kernel .bss section, which definitely
      should be in the boot mem pool as well in any case. Let's fix this just
      by adding the space between __bss_start and __bss_stop.
      Reviewed-by: Matt Redfearn <matt.redfearn@mips.com>
      Signed-off-by: Serge Semin <fancer.lancer@gmail.com>
      Signed-off-by: Paul Burton <paul.burton@mips.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: James Hogan <jhogan@kernel.org>
      Cc: Mike Rapoport <rppt@linux.ibm.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Thomas Bogendoerfer <tbogendoerfer@suse.de>
      Cc: Huacai Chen <chenhc@lemote.com>
      Cc: Stefan Agner <stefan@agner.ch>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Cc: Alexandre Belloni <alexandre.belloni@bootlin.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: linux-mips@vger.kernel.org
      Cc: linux-kernel@vger.kernel.org
  3. 23 Apr, 2019 1 commit
  4. 12 Apr, 2019 1 commit
  5. 09 Apr, 2019 4 commits
    • MIPS: generic: Enable CONFIG_JUMP_LABEL · 3e3d1dfd
      Paul Burton authored
      Enable CONFIG_JUMP_LABEL for generic configs in order to better optimize
      at runtime and get better test coverage for our jump label support.
      Signed-off-by: Paul Burton <paul.burton@mips.com>
      Cc: linux-mips@vger.kernel.org
    • MIPS: jump_label: Use compact branches for >= r6 · 9b6584e3
      Paul Burton authored
      MIPSr6 introduced compact branches which have no delay slots. Make use
      of them for jump labels in order to avoid the need for a nop to fill the
      branch or jump delay slot, saving 4 bytes of code for each static branch.
      Signed-off-by: Paul Burton <paul.burton@mips.com>
      Cc: linux-mips@vger.kernel.org
    • MIPS: jump_label: Remove redundant nops · c838b580
      Paul Burton authored
      Both arch_static_branch() & arch_static_branch_jump() emit a control
      transfer instruction (ie. branch or jump) without disabling assembler
      re-ordering. As such the assembler will automatically fill their delay
      slots.
      
      Both functions follow their branch or jump with an explicit nop that at
      first appears to be there to fill the delay slot, but given that the
      assembler will do that the explicit nops serve no purpose & we end up
      with our branch or jump followed by 2 nops. Remove the redundant nops.
      Signed-off-by: Paul Burton <paul.burton@mips.com>
      Cc: linux-mips@vger.kernel.org
    • Merge tag 'mips_fixes_5.1_1' into mips-next · ec86e545
      Paul Burton authored
      A small batch of MIPS fixes for 5.1:
      
      - An interrupt masking fix for Loongson-based Lemote 2F systems (fixing
        a regression from v3.19).
      
      - A relocation fix for configurations in which the devicetree is stored
        in an ELF section (fixing a regression from v4.7).
      
      - Fix jump labels for MIPSr6 kernels where they previously could
        inadvertently place a control transfer instruction in a forbidden slot
        & take unexpected exceptions (fixing MIPSr6 support added in v4.0).
      
      - Extend an existing USB power workaround for the Netgear WNDR3400 to v2
        boards in addition to the v3 ones that already used it.
      
      - Remove the custom MIPS32 definition of __kernel_fsid_t to make it
        consistent with MIPS64 & every other architecture, in particular
        resolving issues for code which tries to print the val field whose
        type previously differed (though had identical memory layout).
      
      Merged into mips-next to gain the MIPSr6 jump label fix before enabling
      jump labels by default for generic kernel builds.
      Signed-off-by: Paul Burton <paul.burton@mips.com>
  6. 04 Apr, 2019 1 commit
  7. 25 Mar, 2019 1 commit
    • MIPS: KVM: Use prandom_u32_max() to generate tlbwr index · e6331a32
      Paul Burton authored
      Emulation of the tlbwr instruction, which writes a TLB entry to a random
      index in the TLB, currently uses get_random_bytes() to generate a 4 byte
      random number which we then mask to form the index. This is overkill in
      a couple of ways:
      
        - We don't need 4 bytes here since we mask the value to form a 6 bit
          number anyway, so we waste /dev/random entropy generating 3 random
          bytes that are unused.
      
        - We don't need crypto-grade randomness here - the architecture spec
          allows implementations to use any algorithm & merely encourages that
          some pseudo-randomness be used rather than a simple counter. The
          fast prandom_u32() function fits those criteria well.
      
      So rather than using get_random_bytes() & consuming /dev/random entropy,
      switch to using the faster prandom_u32_max() which provides what we need
      here whilst also performing the masking/modulo for us.
      Signed-off-by: Paul Burton <paul.burton@mips.com>
      Reported-by: George Spelvin <lkml@sdf.org>
      Cc: James Hogan <jhogan@kernel.org>
      Cc: linux-mips@vger.kernel.org
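The "masking/modulo for us" remark in the tlbwr commit refers to producing a bounded index directly from one pseudo-random word. A minimal userspace sketch of that idea follows, assuming a multiply-and-shift range reduction. The xorshift generator and both function names are stand-ins invented here; they are not the kernel's prandom implementation, and the 64-entry bound is only inferred from the "6 bit number" in the message.

```c
#include <stdint.h>

/* Stand-in PRNG (xorshift32). The state must be seeded with a non-zero value. */
static uint32_t toy_prandom_u32(uint32_t *state)
{
	uint32_t x = *state;

	x ^= x << 13;
	x ^= x >> 17;
	x ^= x << 5;
	*state = x;
	return x;
}

/* Scale a 32-bit pseudo-random value into [0, ceil) with a multiply and shift. */
static uint32_t toy_prandom_u32_max(uint32_t *state, uint32_t ceil)
{
	return (uint32_t)(((uint64_t)toy_prandom_u32(state) * ceil) >> 32);
}
```

Calling `toy_prandom_u32_max(&state, 64)` always yields a value from 0 to 63, so no separate mask is needed and no entropy pool is touched.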
  8. 19 Mar, 2019 5 commits
    • arch: mips: Kconfig: pedantic formatting · 371a4151
      Enrico Weigelt, metux IT consult authored
      Formatting of Kconfig files doesn't look so pretty, so let the
      Great White Handkerchief come around and clean it up.
      Signed-off-by: Enrico Weigelt, metux IT consult <info@metux.net>
      Signed-off-by: Paul Burton <paul.burton@mips.com>
      Cc: linux-kernel@vger.kernel.org
      Cc: hauke@hauke-m.de
      Cc: zajec5@gmail.com
      Cc: f.fainelli@gmail.com
      Cc: bcm-kernel-feedback-list@broadcom.com
      Cc: linux-mips@vger.kernel.org
    • MIPS: eBPF: Initial eBPF support for MIPS32 architecture. · 716850ab
      Hassan Naveed authored
      Currently MIPS32 supports a JIT for classic BPF only, not extended BPF.
      This patch adds JIT support for extended BPF on MIPS32, so code is
      actually JIT'ed instead of being only interpreted. Instructions with
      64-bit operands are not supported at this point.
      We can delete classic BPF because the kernel will translate classic BPF
      programs into extended BPF and JIT them, eliminating the need for
      classic BPF.
      Signed-off-by: Hassan Naveed <hnaveed@wavecomp.com>
      Signed-off-by: Paul Burton <paul.burton@mips.com>
      Cc: kafai@fb.com
      Cc: songliubraving@fb.com
      Cc: yhs@fb.com
      Cc: netdev@vger.kernel.org
      Cc: bpf@vger.kernel.org
      Cc: linux-mips@vger.kernel.org
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: James Hogan <jhogan@kernel.org>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: open list:MIPS <linux-mips@linux-mips.org>
      Cc: open list <linux-kernel@vger.kernel.org>
    • MIPS: eBPF: Provide eBPF support for MIPS64R6 · 6c2c8a18
      Hassan Naveed authored
      Currently eBPF support is available on MIPS64R2 only. Use MIPS64R6
      variants of instructions like multiply, divide, movn, movz so eBPF
      can run on the newer ISA. Also, we only need to check the ISA revision
      before JIT'ing code: we know the CPU is running a 64-bit kernel, since
      the eBPF JIT is only included in kernels with CONFIG_64BIT=y due to
      Kconfig dependencies.
      Signed-off-by: Hassan Naveed <hnaveed@wavecomp.com>
      Signed-off-by: Paul Burton <paul.burton@mips.com>
      Cc: kafai@fb.com
      Cc: songliubraving@fb.com
      Cc: yhs@fb.com
      Cc: netdev@vger.kernel.org
      Cc: bpf@vger.kernel.org
      Cc: linux-mips@vger.kernel.org
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: James Hogan <jhogan@kernel.org>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: open list:MIPS <linux-mips@linux-mips.org>
      Cc: open list <linux-kernel@vger.kernel.org>
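The MIPS64R6 commit mentions replacing movn and movz, which the R6 ISA dropped in favour of the select instructions added by the uasm commit in this series. The sketch below models the instruction semantics in plain C and shows one way a conditional move could be rebuilt from two selects and an OR. The commit message does not spell out the sequence the JIT emits, so this mapping is an assumption for illustration only.

```c
#include <stdint.h>

/* R6 SELNEZ rd, rs, rt: rd = (rt != 0) ? rs : 0 */
static uint64_t selnez(uint64_t rs, uint64_t rt)
{
	return rt != 0 ? rs : 0;
}

/* R6 SELEQZ rd, rs, rt: rd = (rt == 0) ? rs : 0 */
static uint64_t seleqz(uint64_t rs, uint64_t rt)
{
	return rt == 0 ? rs : 0;
}

/* Pre-R6 MOVN rd, rs, rt: if (rt != 0) rd = rs; otherwise rd is unchanged. */
static uint64_t movn_via_sel(uint64_t rd, uint64_t rs, uint64_t rt)
{
	return selnez(rs, rt) | seleqz(rd, rt);
}

/* Pre-R6 MOVZ rd, rs, rt: if (rt == 0) rd = rs; otherwise rd is unchanged. */
static uint64_t movz_via_sel(uint64_t rd, uint64_t rs, uint64_t rt)
{
	return seleqz(rs, rt) | selnez(rd, rt);
}
```

Exactly one of the two selects is non-zero for any value of rt, so the OR picks either the new source or the old destination value.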
    • MIPS: uasm: Add div, mul and sel instructions for mipsr6 · 0d1d17b9
      Hassan Naveed authored
      Add the following instructions for use by eBPF on mipsr6:
      insn_ddivu_r6, insn_divu_r6, insn_dmodu, insn_dmulu, insn_modu,
      insn_mulu, insn_seleqz, insn_selnez
      Signed-off-by: Hassan Naveed <hnaveed@wavecomp.com>
      Reviewed-by: Paul Burton <paul.burton@mips.com>
      Signed-off-by: Paul Burton <paul.burton@mips.com>
      Cc: kafai@fb.com
      Cc: songliubraving@fb.com
      Cc: yhs@fb.com
      Cc: netdev@vger.kernel.org
      Cc: bpf@vger.kernel.org
      Cc: linux-mips@vger.kernel.org
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: James Hogan <jhogan@kernel.org>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: open list:MIPS <linux-mips@linux-mips.org>
      Cc: open list <linux-kernel@vger.kernel.org>
    • MIPS: entry: Remove unneeded need_resched() loop · b8f3b15a
      Valentin Schneider authored
      Since the enabling and disabling of IRQs within preempt_schedule_irq()
      is contained in a need_resched() loop, we don't need the outer arch
      code loop.
      
      Note that commit a18815ab ("Use preempt_schedule_irq.") initially
      removed the existing loop, but missed the final branch to restore_all.
      Commit cdaed73a ("Fix preemption bug.") missed that and reintroduced
      the loop.
      Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Paul Burton <paul.burton@mips.com>
      Cc: James Hogan <jhogan@kernel.org>
      Cc: linux-mips@vger.kernel.org
      Signed-off-by: Paul Burton <paul.burton@mips.com>
      Cc: linux-kernel@vger.kernel.org
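Why the outer loop in the MIPS entry code became redundant can be seen in a small userspace model. The struct and every `toy_` function below are hypothetical; the real entry code is assembly and the real scheduler is far more involved. The sketch only mirrors the shape described in the commit: the callee already loops on need_resched(), so the caller needs a single check.

```c
#include <stdbool.h>

/* Toy CPU state: pending reschedule requests and a call counter. */
struct toy_cpu {
	int pending_resched;
	int schedule_calls;
};

static bool toy_need_resched(const struct toy_cpu *cpu)
{
	return cpu->pending_resched > 0;
}

/* One scheduling pass consumes one pending request. */
static void toy_schedule(struct toy_cpu *cpu)
{
	cpu->pending_resched--;
	cpu->schedule_calls++;
}

/* Models preempt_schedule_irq(): it loops until nothing is pending. */
static void toy_preempt_schedule_irq(struct toy_cpu *cpu)
{
	do {
		/* The kernel enables IRQs, schedules, then disables IRQs here. */
		toy_schedule(cpu);
	} while (toy_need_resched(cpu));
}

/* Models the arch entry path after the change: one check, one call. */
static void toy_resume_kernel(struct toy_cpu *cpu)
{
	if (toy_need_resched(cpu))
		toy_preempt_schedule_irq(cpu);
}
```

With three pending requests, a single call to `toy_resume_kernel()` drains all of them, which is the behaviour an outer loop would only have duplicated.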
  9. 17 Mar, 2019 14 commits
  10. 16 Mar, 2019 5 commits
    • Merge tag 'pidfd-v5.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux · a9dce667
      Linus Torvalds authored
      Pull pidfd system call from Christian Brauner:
       "This introduces the ability to use file descriptors from /proc/<pid>/
        as stable handles on struct pid. Even if a pid is recycled the handle
        will not change. For a start these fds can be used to send signals to
        the processes they refer to.
      
        With the ability to use /proc/<pid> fds as stable handles on struct
        pid we can fix a long-standing issue where after a process has exited
        its pid can be reused by another process. If a caller sends a signal
        to a reused pid it will end up signaling the wrong process.
      
        With this patchset we enable a variety of use cases. One obvious
        example is that we can now safely delegate an important part of
        process management - sending signals - to processes other than the
        parent of a given process by sending file descriptors around via scm
        rights and not fearing that the given process will have been recycled
        in the meantime. It also allows for easy testing whether a given
        process is still alive or not by sending signal 0 to a pidfd which is
        quite handy.
      
        There has been some interest in this feature e.g. from systems
        management (systemd, glibc) and container managers. I have requested
        and gotten comments from glibc to make sure that this syscall is
        suitable for their needs as well. In the future I expect it to take on
        most other pid-based signal syscalls. But such features are left for
        the future once they are needed.
      
        This has been sitting in linux-next for quite a while and has not
        caused any issues. It comes with selftests which verify basic
        functionality and also test that a recycled pid cannot be signaled via
        a pidfd.
      
        Jon has written about a prior version of this patchset. It should
        cover the basic functionality since not a lot has changed since then:
      
            https://lwn.net/Articles/773459/
      
        The commit message for the syscall itself is extensively documenting
        the syscall, including its functionality and extensibility"
      
      * tag 'pidfd-v5.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
        selftests: add tests for pidfd_send_signal()
        signal: add pidfd_send_signal() syscall
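The liveness check described in the pidfd pull message (sending signal 0 through a /proc/<pid> file descriptor) might look roughly like this from userspace. It is a hedged sketch: `probe_process()` is a name invented here, the raw syscall() wrapper is used because C libraries of that period had no dedicated wrapper, and the fallback syscall number 424 is an assumption that holds for the unified numbering used on most architectures but not necessarily all (MIPS ABIs add an offset).

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <stdio.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

#ifndef __NR_pidfd_send_signal
#define __NR_pidfd_send_signal 424 /* assumed fallback for older headers */
#endif

/*
 * Probe a process through a /proc/<pid> directory fd.
 * Returns 0 if signal 0 could be delivered, or a negative errno value.
 */
static int probe_process(pid_t pid)
{
	char path[64];
	int fd, ret;

	snprintf(path, sizeof(path), "/proc/%d", (int)pid);
	fd = open(path, O_DIRECTORY | O_CLOEXEC);
	if (fd < 0)
		return -errno;

	/* Signal 0 performs the permission and existence checks only. */
	ret = (int)syscall(__NR_pidfd_send_signal, fd, 0, NULL, 0);
	if (ret < 0)
		ret = -errno;

	close(fd);
	return ret;
}
```

Because the descriptor refers to one specific process, a later probe through the same fd fails rather than reaching whichever process has since been given the same pid. On kernels older than 5.1 the call is expected to fail with ENOSYS.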
    • Merge tag 'devdax-for-5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm · f67e3fb4
      Linus Torvalds authored
      Pull device-dax updates from Dan Williams:
       "New device-dax infrastructure to allow persistent memory and other
        "reserved" / performance differentiated memories, to be assigned to
        the core-mm as "System RAM".
      
        Some users want to use persistent memory as additional volatile
        memory. They are willing to cope with potential performance
        differences, for example between DRAM and 3D Xpoint, and want to use
        typical Linux memory management apis rather than a userspace memory
        allocator layered over an mmap() of a dax file. The administration
        model is to decide how much Persistent Memory (pmem) to use as System
        RAM, create a device-dax-mode namespace of that size, and then assign
        it to the core-mm. The rationale for device-dax is that it is a
        generic memory-mapping driver that can be layered over any "special
        purpose" memory, not just pmem. On subsequent boots udev rules can be
        used to restore the memory assignment.
      
        One implication of using pmem as RAM is that mlock() no longer keeps
        data off persistent media. For this reason it is recommended to enable
        NVDIMM Security (previously merged for 5.0) to encrypt pmem contents
        at rest. We considered making this recommendation an actively enforced
        requirement, but in the end decided to leave it as a distribution /
        administrator policy to allow for emulation and test environments that
        lack security capable NVDIMMs.
      
        Summary:
      
         - Replace the /sys/class/dax device model with /sys/bus/dax, and
           include a compat driver so distributions can opt-in to the new ABI.
      
         - Allow for an alternative driver for the device-dax address-range
      
         - Introduce the 'kmem' driver to hotplug / assign a device-dax
           address-range to the core-mm.
      
         - Arrange for the device-dax target-node to be onlined so that the
           newly added memory range can be uniquely referenced by numa apis"
      
      NOTE! I'm not entirely happy with the whole "PMEM as RAM" model because
      we currently have special - and very annoying rules in the kernel about
      accessing PMEM only with the "MC safe" accessors, because machine checks
      inside the regular repeat string copy functions can be fatal in some
      (not described) circumstances.
      
      And apparently the PMEM modules can cause that a lot more than regular
      RAM.  The argument is that this happens because PMEM doesn't necessarily
      get scrubbed at boot like RAM does, but that is planned to be added for
      the user space tooling.
      
      Quoting Dan from another email:
       "The exposure can be reduced in the volatile-RAM case by scanning for
        and clearing errors before it is onlined as RAM. The userspace tooling
        for that can be in place before v5.1-final. There's also runtime
        notifications of errors via acpi_nfit_uc_error_notify() from
        background scrubbers on the DIMM devices. With that mechanism the
        kernel could proactively clear newly discovered poison in the volatile
        case, but that would be additional development more suitable for v5.2.
      
        I understand the concern, and the need to highlight this issue by
        tapping the brakes on feature development, but I don't see PMEM as RAM
        making the situation worse when the exposure is also there via DAX in
        the PMEM case. Volatile-RAM is arguably a safer use case since it's
        possible to repair pages where the persistent case needs active
        application coordination"
      
      * tag 'devdax-for-5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
        device-dax: "Hotplug" persistent memory for use like normal RAM
        mm/resource: Let walk_system_ram_range() search child resources
        mm/memory-hotplug: Allow memory resources to be children
        mm/resource: Move HMM pr_debug() deeper into resource code
        mm/resource: Return real error codes from walk failures
        device-dax: Add a 'modalias' attribute to DAX 'bus' devices
        device-dax: Add a 'target_node' attribute
        device-dax: Auto-bind device after successful new_id
        acpi/nfit, device-dax: Identify differentiated memory with a unique numa-node
        device-dax: Add /sys/class/dax backwards compatibility
        device-dax: Add support for a dax override driver
        device-dax: Move resource pinning+mapping into the common driver
        device-dax: Introduce bus + driver model
        device-dax: Start defining a dax bus model
        device-dax: Remove multi-resource infrastructure
        device-dax: Kill dax_region base
        device-dax: Kill dax_region ida
    • Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · 477558d7
      Linus Torvalds authored
      Pull more SCSI updates from James Bottomley:
       "This is the final round of mostly small fixes and performance
        improvements to our initial submit.
      
        The main regression fix is the ia64 simscsi build failure which was
        missed in the serial number elimination conversion"
      
      * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (24 commits)
        scsi: ia64: simscsi: use request tag instead of serial_number
        scsi: aacraid: Fix performance issue on logical drives
        scsi: lpfc: Fix error codes in lpfc_sli4_pci_mem_setup()
        scsi: libiscsi: Hold back_lock when calling iscsi_complete_task
        scsi: hisi_sas: Change SERDES_CFG init value to increase reliability of HiLink
        scsi: hisi_sas: Send HARD RESET to clear the previous affiliation of STP target port
        scsi: hisi_sas: Set PHY linkrate when disconnected
        scsi: hisi_sas: print PHY RX errors count for later revision of v3 hw
        scsi: hisi_sas: Fix a timeout race of driver internal and SMP IO
        scsi: hisi_sas: Change return variable type in phy_up_v3_hw()
        scsi: qla2xxx: check for kstrtol() failure
        scsi: lpfc: fix 32-bit format string warning
        scsi: lpfc: fix unused variable warning
        scsi: target: tcmu: Switch to bitmap_zalloc()
        scsi: libiscsi: fall back to sendmsg for slab pages
        scsi: qla2xxx: avoid printf format warning
        scsi: lpfc: resolve static checker warning in lpfc_sli4_hba_unset
        scsi: lpfc: Correct __lpfc_sli_issue_iocb_s4 lockdep check
        scsi: ufs: hisi: fix ufs_hba_variant_ops passing
        scsi: qla2xxx: Fix panic in qla_dfs_tgt_counters_show
        ...
    • Merge tag 'for-5.1/block-post-20190315' of git://git.kernel.dk/linux-block · 11efae35
      Linus Torvalds authored
      Pull more block layer changes from Jens Axboe:
       "This is a collection of both stragglers, and fixes that came in after
        I finalized the initial pull. This contains:
      
         - An MD pull request from Song, with a few minor fixes
      
         - Set of NVMe patches via Christoph
      
         - Pull request from Konrad, with a few fixes for xen/blkback
      
         - pblk IO calculation fix (Javier)
      
         - Segment calculation fix for pass-through (Ming)
      
         - Fallthrough annotation for blkcg (Mathieu)"
      
      * tag 'for-5.1/block-post-20190315' of git://git.kernel.dk/linux-block: (25 commits)
        blkcg: annotate implicit fall through
        nvme-tcp: support C2HData with SUCCESS flag
        nvmet: ignore EOPNOTSUPP for discard
        nvme: add proper write zeroes setup for the multipath device
        nvme: add proper discard setup for the multipath device
        nvme: remove nvme_ns_config_oncs
        nvme: disable Write Zeroes for qemu controllers
        nvmet-fc: bring Disconnect into compliance with FC-NVME spec
        nvmet-fc: fix issues with targetport assoc_list list walking
        nvme-fc: reject reconnect if io queue count is reduced to zero
        nvme-fc: fix numa_node when dev is null
        nvme-fc: use nr_phys_segments to determine existence of sgl
        nvme-loop: init nvmet_ctrl fatal_err_work when allocate
        nvme: update comment to make the code easier to read
        nvme: put ns_head ref if namespace fails allocation
        nvme-trace: fix cdw10 buffer overrun
        nvme: don't warn on block content change effects
        nvme: add get-feature to admin cmds tracer
        md: Fix failed allocation of md_register_thread
        It's wrong to add len to sector_nr in raid10 reshape twice
        ...
    • Merge tag 'nfs-for-5.1-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs · 465c209d
      Linus Torvalds authored
      Pull NFS client bugfixes from Trond Myklebust:
       "Highlights include:
      
        Bugfixes:
         - Fix an Oops in SUNRPC back channel tracepoints
         - Fix a SUNRPC client regression when handling oversized replies
         - Fix the minimal size for SUNRPC reply buffer allocation
         - rpc_decode_header() must always return a non-zero value on error
         - Fix a typo in pnfs_update_layout()
      
        Cleanup:
         - Remove redundant check for the reply length in call_decode()"
      
      * tag 'nfs-for-5.1-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
        SUNRPC: Remove redundant check for the reply length in call_decode()
        SUNRPC: Handle the SYSTEM_ERR rpc error
        SUNRPC: rpc_decode_header() must always return a non-zero value on error
        SUNRPC: Use the ENOTCONN error on socket disconnect
        SUNRPC: Fix the minimal size for reply buffer allocation
        SUNRPC: Fix a client regression when handling oversized replies
        pNFS: Fix a typo in pnfs_update_layout
        fix null pointer deref in tracepoints in back channel