- 20 Nov, 2019 2 commits
-
Vasily Gorbik authored
Currently, if the kernel is built with CONFIG_TRACE_IRQFLAGS and KASAN and used as a crash kernel, it crashes itself because trace_hardirqs_off/trace_hardirqs_on are called with DAT off. This happens because trace_hardirqs_off/trace_hardirqs_on are instrumented and the KASAN code tries to access shadow memory to validate memory accesses. The KASAN shadow memory is populated via vmemmap, so all accesses require DAT on.

memcpy_real could be called with DAT on or off (with KASAN enabled, DAT is set even before early code is executed). Make sure that trace_hardirqs_off/trace_hardirqs_on are called with DAT on and only the actual __memcpy_real is called with DAT off.

Also annotate __memcpy_real and _memcpy_real with __no_sanitize_address to avoid further problems due to switching DAT off.

Reviewed-by: Philipp Rudo <prudo@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
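A hedged sketch of the resulting shape (simplified, not the actual s390 code; disable_dat()/enable_dat() are hypothetical stand-ins for the real low-level DAT switching, while __no_sanitize_address, arch_local_irq_save() and the trace_hardirqs_* hooks are existing kernel facilities):

  /* Raw copy that runs with DAT off - must not be instrumented by KASAN. */
  static int __no_sanitize_address __memcpy_real(void *dest, void *src, size_t count)
  {
          /* ... plain copy loop, no shadow-memory accesses ... */
          return 0;
  }

  int memcpy_real(void *dest, void *src, size_t count)
  {
          unsigned long flags;
          int rc;

          /*
           * Inform irq tracing about the irqs-off section while DAT is
           * still on: the tracing code is instrumented, so it needs the
           * KASAN shadow (vmemmap, i.e. DAT on) to be accessible.
           */
          flags = arch_local_irq_save();
          if (!irqs_disabled_flags(flags))
                  trace_hardirqs_off();

          disable_dat();                  /* hypothetical helper */
          rc = __memcpy_real(dest, src, count);
          enable_dat();                   /* hypothetical helper */

          if (!irqs_disabled_flags(flags))
                  trace_hardirqs_on();
          arch_local_irq_restore(flags);
          return rc;
  }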
-
YueHaibing authored
s390_crypto_shash_parmsize() has return type int; its result should not be stored in an unsigned variable that is then compared against zero.

Reported-by: Hulk Robot <hulkci@huawei.com>
Fixes: 3c2eb6b7 ("s390/crypto: Support for SHA3 via CPACF (MSA6)")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Joerg Schmidbauer <jschmidb@linux.vnet.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
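A minimal, generic C illustration of this bug class (not the actual s390 crypto code; query_parmsize() is a made-up stand-in that reports errors as negative values):

  #include <stdio.h>

  static int query_parmsize(void)
  {
          return -1;      /* pretend the query failed */
  }

  int main(void)
  {
          unsigned int usize = query_parmsize();  /* -1 wraps to a huge unsigned value */
          int size = query_parmsize();

          if (usize < 0)          /* always false: unsigned values are never negative */
                  printf("never reached\n");
          if (size < 0)           /* correct: the error is detected */
                  printf("query failed\n");
          return 0;
  }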
-
- 12 Nov, 2019 7 commits
-
Markus Elfring authored
Generated by: scripts/coccinelle/api/memdup_user.cocci

Link: http://lkml.kernel.org/r/aca044e8-e4b2-eda8-d724-b08772a44ed9@web.de
[borntraeger@de.ibm.com: use ==0 instead of <=0 for a size_t variable]
[heiko.carstens@de.ibm.com: split bugfix into separate patch; shorten changelog]
Signed-off-by: Markus Elfring <Markus.Elfring@web.de>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
-
Heiko Carstens authored
Fixes: f2bbc96e ("s390/pkey: add CCA AES cipher key support")
Reported-by: Markus Elfring <Markus.Elfring@web.de>
Reported-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
-
Vasily Gorbik authored
Merge tag 'vfio-ccw-20191111' of https://git.kernel.org/pub/scm/linux/kernel/git/kvms390/vfio-ccw into features

enhance tracing in vfio-ccw

* tag 'vfio-ccw-20191111' of https://git.kernel.org/pub/scm/linux/kernel/git/kvms390/vfio-ccw:
  vfio-ccw: Rework the io_fctl trace
  vfio-ccw: Add a trace for asynchronous requests
  vfio-ccw: Trace the FSM jumptable
  vfio-ccw: Refactor how the traces are built

Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
-
Ilya Leoshkevich authored
Due to kptr_restrict, JITted BPF code is now displayed like this:

  000000000b6ed1b2: ebdff0800024   stmg   %r13,%r15,128(%r15)
  000000004cde2ba0: 41d0f040       la     %r13,64(%r15)
  00000000fbad41b0: a7fbffa0       aghi   %r15,-96

Leaking kernel addresses to dmesg is not a concern in this case, because this happens only when JIT debugging is explicitly activated, which only root can do.

Use %px in this particular instance, and also to print an instruction address in show_code and PCREL (e.g. brasl) arguments in print_insn. While at present functionally equivalent to %016lx, %px is recommended by Documentation/core-api/printk-formats.rst for such cases.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Reviewed-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
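For reference, a hedged sketch of the printk formats involved (illustrative kernel-style snippet, not the actual show_code/print_insn code; show_insn_addr() is a made-up helper):

  static void show_insn_addr(void *addr, const char *mnemonic)
  {
          pr_info("%p: %s\n", addr, mnemonic);    /* hashed pointer - useless for JIT debugging */
          pr_info("%px: %s\n", addr, mnemonic);   /* raw address - acceptable here, root-only debug output */
          pr_info("%016lx: %s\n",                 /* equivalent today, but %px documents the intent */
                  (unsigned long)addr, mnemonic);
  }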
-
Thomas Richter authored
When starting the CPU Measurement sampling facility using the qsi() function, this function may return an error value. This error value is referenced in the else part of the if statement to dump its value in a debug statement. Right now the dumped value is always zero because the return value is never assigned.

Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
-
Thomas Richter authored
Replace hard-coded function names in debug statements with the "%s ...", __func__ construct suggested by the checkpatch.pl script.

Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
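A minimal illustration of the construct (generic C; the real changes use the s390 debug_sprintf_event helpers, and cpumsf_start() here is a made-up function name):

  #include <stdio.h>

  #define pr_debug(fmt, ...) printf(fmt, ##__VA_ARGS__)

  static void cpumsf_start(void)
  {
          /* Before: the function name is duplicated by hand and can go stale. */
          pr_debug("cpumsf_start: enable sampling\n");
          /* After: the compiler supplies the enclosing function's name. */
          pr_debug("%s: enable sampling\n", __func__);
  }

  int main(void)
  {
          cpumsf_start();
          return 0;
  }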
-
Thomas Richter authored
Use a consistent debug print format of the form "variable blank value". Also add a leading 0x for all hex values.

Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
-
- 05 Nov, 2019 1 commit
-
Miroslav Benes authored
The current code around calling ftrace_graph_ret_addr() is ifdeffed and also tests if ftrace redirection is present on the stack. ftrace_graph_ret_addr() however performs the test internally, and there is a version for !CONFIG_FUNCTION_GRAPH_TRACER as well. The unnecessary code can thus be dropped.

Link: http://lkml.kernel.org/r/20191029143904.24051-2-mbenes@suse.cz
Signed-off-by: Miroslav Benes <mbenes@suse.cz>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
-
- 31 Oct, 2019 21 commits
-
Ilya Leoshkevich authored
perf_callchain_kernel() stops neither when it encounters a garbage address nor when it runs out of entry space. Fix both issues, using the x86 version as inspiration.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
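The resulting function plausibly looks close to the sketch below (an approximation built on the s390 unwinder API, not a quote of the committed code):

  void perf_callchain_kernel(struct perf_callchain_entry_ctx *entry,
                             struct pt_regs *regs)
  {
          struct unwind_state state;
          unsigned long addr;

          unwind_for_each_frame(&state, current, regs, 0) {
                  addr = unwind_get_return_address(&state);
                  /* Stop on a garbage address or when the entry buffer is full. */
                  if (!addr || perf_callchain_store(entry, addr))
                          return;
          }
  }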
-
Heiko Carstens authored
This function must be inlined since any caller expects its own current stack pointer, which would not be the case if the function were not inlined.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
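To show why inlining matters here, a hedged sketch (the helper name and the register usage are assumptions, not a quote of the patch):

  /*
   * If this were an out-of-line function, the "la" would read the stack
   * pointer inside the callee's own freshly allocated frame rather than
   * the caller's. Forcing inlining makes the asm execute in the caller's
   * frame, so the caller really gets its own stack pointer.
   */
  static __always_inline unsigned long current_stack_pointer(void)
  {
          unsigned long sp;

          asm volatile("la %0,0(%%r15)" : "=a" (sp));
          return sp;
  }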
-
Gerald Schaefer authored
Unlike pxd_free_tlb(), the pxd_free() functions do not check for folded page tables. This is not an issue so far, as those functions will actually never be called, since no code will reach them when page tables are folded. In order to avoid future issues, and to make the s390 code more similar to other architectures, add mm_pxd_folded() checks, similar to how it is done in pxd_free_tlb().

This was found by testing a patch from Anshuman Khandual, which is currently discussed on LKML ("mm/debug: Add tests validating architecture page table helpers").

Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
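A hedged sketch of what such a check looks like for one level (approximate, not a verbatim quote of the patch):

  static inline void p4d_free(struct mm_struct *mm, p4d_t *p4d)
  {
          /* With a folded p4d level there is no separate table to free. */
          if (!mm_p4d_folded(mm))
                  crst_table_free(mm, (unsigned long *) p4d);
  }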
-
Gerald Schaefer authored
On older HW or under a hypervisor, w/o the instruction-execution-protection (IEP) facility, and also w/o EDAT-1, a translation-specification exception may be recognized when bit 55 of a pte is one (_PAGE_NOEXEC).

The current code tries to prevent setting _PAGE_NOEXEC in such cases, by removing it within set_pte_at(). However, ptep_set_access_flags() will modify a pte directly, w/o using set_pte_at(). There is at least one scenario where this can result in an active pte with _PAGE_NOEXEC set, which would then lead to a panic due to a translation-specification exception (write to swapped out page):

  do_swap_page
    pte = mk_pte (with _PAGE_NOEXEC bit)
    set_pte_at (will remove _PAGE_NOEXEC bit in page table, but keep it
                in local variable pte)
    vmf->orig_pte = pte (pte still contains _PAGE_NOEXEC bit)
    do_wp_page
      wp_page_reuse
        entry = vmf->orig_pte (still with _PAGE_NOEXEC bit)
        ptep_set_access_flags (writes entry with _PAGE_NOEXEC bit)

Fix this by clearing _PAGE_NOEXEC already in mk_pte_phys(), where the pgprot value is applied, so that no pte with _PAGE_NOEXEC will ever be visible, if it is not supported. The check in set_pte_at() can then also be removed.

Cc: <stable@vger.kernel.org> # 4.11+
Fixes: 57d7f939 ("s390: add no-execute support")
Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
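The shape of the fix, as a hedged sketch (MACHINE_HAS_NX is the existing s390 facility flag; the body of mk_pte_phys() here is simplified and approximate):

  static inline pte_t mk_pte_phys(unsigned long physpage, pgprot_t pgprot)
  {
          pte_t pte = __pte(physpage | pgprot_val(pgprot));

          /*
           * Strip the bit at the single place where the pgprot is applied,
           * so an unsupported _PAGE_NOEXEC can never reach a live pte -
           * no matter whether it is installed via set_pte_at() or written
           * directly by ptep_set_access_flags().
           */
          if (!MACHINE_HAS_NX)
                  pte = __pte(pte_val(pte) & ~_PAGE_NOEXEC);
          return pte;
  }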
-
Gerald Schaefer authored
For pmds and puds, there are a couple of page table helper functions that only make sense for large entries, like pxd_(mk)dirty/young/write etc. We currently explicitly check if the entries are large, but in practice those functions must never be used for normal entries, which point to lower level page tables, so the code can be simplified.

This also fixes a theoretical bug, where common code could use one of the functions before actually marking a pmd large, like this:

  pmd = pmd_mkhuge(pmd_mkdirty(pmd))

With the current implementation, the resulting large pmd would not be dirty as requested. This could in theory result in the loss of dirty information, e.g. after collapsing into a transparent hugepage. Common code currently always marks an entry large before using one of the functions, but there is no hard requirement for this. The only requirement would be that it never uses the functions for normal entries pointing to lower level page tables, but they might be called before marking an entry large during its creation.

In order to avoid issues with future common code, and to simplify the page table helpers, remove the checks for large entries and rely on common code never using them for normal entries.

This was found by testing a patch from Anshuman Khandual, which is currently discussed on LKML ("mm/debug: Add tests validating architecture page table helpers").

Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
-
Gerald Schaefer authored
The semantics of pmd/pud_bad() expect that large entries are reported as bad, but we also check large entries for sanity. There is currently no issue with this wrong behaviour, but let's conform to the semantics by reporting large pmd/pud entries as bad, in order to prevent future issues.

This was found by testing a patch from Anshuman Khandual, which is currently discussed on LKML ("mm/debug: Add tests validating architecture page table helpers").

Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
-
Heiko Carstens authored
The current implementation of get_clock_monotonic() leaves it up to the caller to call the function with preemption disabled. The only core kernel caller (sched_clock) however does not disable preemption. In order to make sure that all callers of this function see monotonic values, handle disabling preemption within the function itself.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
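A hedged sketch of the pattern (simplified; tod_clock_offset is a made-up stand-in for the real boot-time TOD base handling):

  static inline unsigned long long get_clock_monotonic(void)
  {
          unsigned long long tod;

          /*
           * Disable preemption so the TOD read and the base value it is
           * adjusted by are used consistently; a caller preempted in
           * between could otherwise observe non-monotonic results. The
           * _notrace variants avoid recursion, since sched_clock() can
           * be called from tracing code.
           */
          preempt_disable_notrace();
          tod = get_tod_clock() - tod_clock_offset;
          preempt_enable_notrace();
          return tod;
  }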
-
Vasily Gorbik authored
Currently get_wchan uses a custom stack unwinding implementation which relies on the presence of the back_chain. Replace it with the more abstract stack unwinding API.

Suggested-by: Ilya Leoshkevich <iii@linux.ibm.com>
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
-
Ilya Leoshkevich authored
unwind_for_each_frame(NULL, NULL, 0) does not return any valid frames. The reason is that get_stack_pointer, unlike get_stack_info and show_stack, does not handle a NULL argument. Fix by making get_stack_pointer treat NULL as current, like get_stack_info and show_stack do.

Reviewed-by: Vasily Gorbik <gor@linux.ibm.com>
Tested-by: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
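Roughly, the fix amounts to something like this (a sketch; the helper and field names are from memory and may not match the tree exactly):

  static inline unsigned long get_stack_pointer(struct task_struct *task,
                                                struct pt_regs *regs)
  {
          if (regs)
                  return (unsigned long) kernel_stack_pointer(regs);
          if (!task || task == current)
                  return current_stack_pointer();
          return (unsigned long) task->thread.ksp;
  }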
-
Vasily Gorbik authored
"noexec" option is already parsed during startup and its value is exposed via noexec_disabled variable. Simply reuse that value during machine facilities detection. Suggested-by: Ilya Leoshkevich <iii@linux.ibm.com> Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
-
Heiko Carstens authored
Remove unused monotonic_clock() function.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
-
Julian Wiedmann authored
Put the Sniffer bit next to all the other CHSC AC2 bits.

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Reviewed-by: Steffen Maier <maier@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
-
Nick Desaulniers authored
GCC unescapes escaped string section names while Clang does not. Because __section uses the `#` stringification operator for the section name, it doesn't need to be escaped.

This antipattern was found with:

  $ grep -e __section\(\" -e __section__\(\" -r

Reported-by: Sedat Dilek <sedat.dilek@gmail.com>
Suggested-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Message-Id: <20190812215052.71840-1-ndesaulniers@google.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
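To illustrate (the macro definition below is a simplified sketch of the stringifying __section in use at the time, not a quote of the kernel header):

  /* Simplified: the macro stringifies its argument with '#'. */
  #define __section(S) __attribute__((__section__(#S)))

  static int good __section(.init.data);   /* expands to ".init.data" - fine for both compilers */
  static int bad  __section(".init.data"); /* expands to "\".init.data\"" - GCC quietly unescapes
                                              it, Clang keeps the stray quotes in the section name */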
-
Heiko Carstens authored
This is the s390 version of commit 40576e5e ("x86: alternative.h: use asm_inline for all alternative variants"). See commit eb111869 ("compiler-types.h: add asm_inline definition") for more details.

With this change the compiler will not generate many out-of-line versions of the three-instruction-sized arch_spin_unlock() function anymore. Due to this, gcc seems to change a lot of other inlining decisions, which results in a net 6k text size growth according to bloat-o-meter (gcc 9.2 with defconfig). But that's still better than having many out-of-line versions of arch_spin_unlock().

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
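For context, a hedged generic sketch of what asm_inline changes (not the actual s390 unlock sequence or the ALTERNATIVE macro):

  /*
   * asm_inline (available with compilers that support "asm inline")
   * tells the compiler to treat the asm statement as minimal-size for
   * its inlining heuristics, instead of estimating the cost from the
   * length of the template string - which is wildly pessimistic when
   * macros expand to long text but only a few machine instructions.
   */
  static inline void release_lock(int *lock)
  {
          asm_inline volatile("st %1,%0" : "=Q" (*lock) : "d" (0) : "memory");
  }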
-
Heiko Carstens authored
This is the s390 version of commit 32ee8230 ("x86: bug.h: use asm_inline in _BUG_FLAGS definitions"). See commit eb111869 ("compiler-types.h: add asm_inline definition") for more details.

Just like on x86, the .text section size decreases a bit while the .data section size increases by about the same amount (gcc 9.2 with defconfig).

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
-
Julian Wiedmann authored
Output interrupts are not subject to SLSB-based avoidance, so remove the gratuitous SLSB updates for Output SBALs in ERROR state.

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
-
Julian Wiedmann authored
On an interrupt, tiqdio_thinint_handler() walks a list of all objects that might require attention, and checks their DSCI. This list is awkwardly built from Input Queues, even though the IRQs are per-device and the queue is then only used to dereference its qdio_irq parent.

To simplify the logic, change the code so that tiq_list contains qdio_irq entries.

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
-
Julian Wiedmann authored
qperf_inc() takes a queue as input, but actually updates the statistics in its qdio_irq parent. In some contexts we already have access to the qdio_irq struct, and can avoid the additional dereference.

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
-
Julian Wiedmann authored
Shift the definition of tiqdio_airq around, so that it doesn't require a forward declaration for tiqdio_thinint_handler().

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
-
Julian Wiedmann authored
Partial EQBS completion is no significant event, and the WARN ends up spamming the debug logs for no good reason.

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
-
Julian Wiedmann authored
qdio.h recently gained a new helper macro that handles wrap-around on a QDIO queue, use it.

Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
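The helper masks a buffer index back into the ring; a sketch with assumed names (the real macro and constants live in qdio.h and may be spelled differently):

  #define QDIO_MAX_BUFFERS_PER_Q  128
  #define QDIO_BUFNR(num)         ((num) & (QDIO_MAX_BUFFERS_PER_Q - 1))

  static inline int next_buf(int bufnr, int count)
  {
          return QDIO_BUFNR(bufnr + count);  /* wraps past the end of the ring */
  }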
-
- 17 Oct, 2019 4 commits
-
Eric Farman authored
Using __field_struct for the schib is convenient, but it doesn't appear to let us filter based on any of the schib elements. Specifying the full schid or any element within it results in various errors by the parser. So, expand that out to its component elements, so we can limit the trace to a single device.

While we are at it, rename this trace to the function name, so we remember what is being traced instead of an abstract reference to the function control bit of the SCSW.

Signed-off-by: Eric Farman <farman@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Message-Id: <20191016142040.14132-5-farman@linux.ibm.com>
Acked-by: Halil Pasic <pasic@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
-
Eric Farman authored
Since the asynchronous requests are typically associated with error recovery, let's add a simple trace when one of those is issued to a device.

Signed-off-by: Eric Farman <farman@linux.ibm.com>
Message-Id: <20191016142040.14132-4-farman@linux.ibm.com>
Acked-by: Halil Pasic <pasic@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
-
Eric Farman authored
It would be nice if we could track the sequence of events within vfio-ccw, based on the state of the device/FSM and our calling sequence within it. So let's add a simple trace here so we can watch the states change as things go, and allow it to be folded into the rest of the other cio traces.

Signed-off-by: Eric Farman <farman@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Message-Id: <20191016142040.14132-3-farman@linux.ibm.com>
Acked-by: Halil Pasic <pasic@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
-
Eric Farman authored
Commit 3cd90214 ("vfio: ccw: add tracepoints for interesting error paths") added a quick trace point to determine where a channel program failed while being processed. It's a great addition, but adding more traces to vfio-ccw is more cumbersome than it needs to be. Let's refactor how this is done, so that additional traces are easier to add and can exist outside of the FSM if we ever desire.

Signed-off-by: Eric Farman <farman@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Message-Id: <20191016142040.14132-2-farman@linux.ibm.com>
Acked-by: Halil Pasic <pasic@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
-
- 10 Oct, 2019 2 commits
-
Heiko Carstens authored
The names for the z13s and z14 ZR1 machines are missing for the TUNE_Z13 and TUNE_Z14 descriptions. Just add them.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
-
Heiko Carstens authored
Make use of 'depends on cc-option' to only display those Kconfig options for which compiler support is available. Add this for the MARCH and TUNE options, which are the only options that may result in compile errors if the selected architecture is not supported by the compiler.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
-
- 06 Oct, 2019 3 commits
-
Linus Torvalds authored
-
Linus Torvalds authored
In commit 4ed28639 ("fs, elf: drop MAP_FIXED usage from elf_map") we changed elf to use MAP_FIXED_NOREPLACE instead of MAP_FIXED for the executable mappings.

Then, people reported that it broke some binaries that had overlapping segments from the same file, and commit ad55eac7 ("elf: enforce MAP_FIXED on overlaying elf segments") re-instated MAP_FIXED for some overlaying elf segment cases. But only some - despite the summary line of that commit, it only did it when it also does a temporary brk vma for one obvious overlapping case.

Now Russell King reports another overlapping case with old 32-bit x86 binaries, which doesn't trigger that limited case. End result: we had better just drop MAP_FIXED_NOREPLACE entirely, and go back to MAP_FIXED.

Yes, it's a sign of old binaries generated with old tool-chains, but we do pride ourselves on not breaking existing setups.

This still leaves MAP_FIXED_NOREPLACE in place for the load_elf_interp() and the old load_elf_library() use-cases, because nobody has reported breakage for those. Yet.

Note that in all the cases seen so far, the overlapping elf sections seem to be just re-mapping of the same executable with different section attributes. We could possibly introduce a new MAP_FIXED_NOFILECHANGE flag or similar, which acts like NOREPLACE, but allows just remapping the same executable file using different protection flags.

It's not clear that would make a huge difference to anything, but if people really hate that "elf remaps over previous maps" behavior, maybe at least a more limited form of remapping would alleviate some concerns.

Alternatively, we should take a look at our elf_map() logic to see if we end up not mapping things properly the first time.

In the meantime, this is the minimal "don't do that then" patch while people hopefully think about it more.

Reported-by: Russell King <linux@armlinux.org.uk>
Fixes: 4ed28639 ("fs, elf: drop MAP_FIXED usage from elf_map")
Fixes: ad55eac7 ("elf: enforce MAP_FIXED on overlaying elf segments")
Cc: Michal Hocko <mhocko@suse.com>
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
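For readers unfamiliar with the two flags, a small userspace sketch of the behavioural difference (illustrative only, assuming a glibc recent enough to expose MAP_FIXED_NOREPLACE; the actual change is in the kernel's elf_map()):

  #define _GNU_SOURCE
  #include <errno.h>
  #include <stdio.h>
  #include <string.h>
  #include <sys/mman.h>

  int main(void)
  {
          size_t len = 2 * 4096;
          /* Create a mapping, then try to map over the same range again. */
          char *base = mmap(NULL, len, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

          /* MAP_FIXED_NOREPLACE refuses to clobber the existing mapping ... */
          void *a = mmap(base, len, PROT_READ,
                         MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED_NOREPLACE, -1, 0);
          printf("NOREPLACE: %s\n", a == MAP_FAILED ? strerror(errno) : "mapped");

          /* ... while MAP_FIXED silently replaces it, which overlapping ELF
           * segments produced by old toolchains rely on. */
          void *b = mmap(base, len, PROT_READ,
                         MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
          printf("FIXED:     %s\n", b == MAP_FAILED ? strerror(errno) : "mapped");
          return 0;
  }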
-
Linus Torvalds authored
Pull dma-mapping regression fix from Christoph Hellwig:
 "Revert an incorrect hunk from a patch that caused problems on various arm boards (Andrey Smirnov)"

* tag 'dma-mapping-5.4-1' of git://git.infradead.org/users/hch/dma-mapping:
  dma-mapping: fix false positive warnings in dma_common_free_remap()
-