Commits · e6b93f4e48af8fe8373ec1132b8c7787890e7578 · nexedi / linux

04 Jun, 2015 1 commit

x86/asm/entry: Move the 'thunk' functions to arch/x86/entry/ · e6b93f4e

Ingo Molnar authored Jun 03, 2015

These are all calling x86 entry code functions, so move them close
to other entry code.

Change lib-y to obj-y: there's no real difference between the two
as we don't really drop any of them during the linking stage, and
obj-y is the more common approach for core kernel object code.

Cc: Borislav Petkov <bp@alien8.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>

e6b93f4e

03 Jun, 2015 3 commits

x86/asm/entry, x86/vdso: Move the vDSO code to arch/x86/entry/vdso/ · d603c8e1

Ingo Molnar authored Jun 03, 2015

Cc: Borislav Petkov <bp@alien8.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>

d603c8e1

x86/asm/entry: Move the compat syscall entry code to arch/x86/entry/ · 19a433f4

Ingo Molnar authored Jun 03, 2015

Move the ia32entry.S file over into arch/x86/entry/.

Cc: Borislav Petkov <bp@alien8.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>

19a433f4

x86/asm/entry: Move entry_64.S and entry_32.S to arch/x86/entry/ · 905a36a2

Ingo Molnar authored Jun 03, 2015

Create a new directory hierarchy for the low level x86 entry code:

    arch/x86/entry/*

This will host all the low level glue that is currently scattered
all across arch/x86/.

Start with entry_64.S and entry_32.S.

Cc: Borislav Petkov <bp@alien8.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>

905a36a2

02 Jun, 2015 3 commits

x86/asm/entry/64: Fold identical code paths · 2f63b9db

Jan Beulich authored Jun 01, 2015

retint_kernel doesn't require %rcx to be pointing to thread info
(anymore?), and the code on the two alternative paths is - not
really surprisingly - identical.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/556C664F020000780007FB64@mail.emea.novell.comSigned-off-by: Ingo Molnar <mingo@kernel.org>

2f63b9db

x86/asm/entry/64: Use negative immediates for stack adjustments · 2bf557ea

Jan Beulich authored Jun 01, 2015

Doing so allows adjustments by 128 bytes (occurring for
REMOVE_PT_GPREGS_FROM_STACK 8 uses) to be expressed with a
single byte immediate.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/556C660F020000780007FB60@mail.emea.novell.comSigned-off-by: Ingo Molnar <mingo@kernel.org>

2bf557ea

x86/debug: Remove perpetually broken, unmaintainable dwarf annotations · 131484c8

Ingo Molnar authored May 28, 2015

So the dwarf2 annotations in low level assembly code have
become an increasing hindrance: unreadable, messy macros
mixed into some of the most security sensitive code paths
of the Linux kernel.

These debug info annotations don't even buy the upstream
kernel anything: dwarf driven stack unwinding has caused
problems in the past so it's out of tree, and the upstream
kernel only uses the much more robust framepointers based
stack unwinding method.

In addition to that there's a steady, slow bitrot going
on with these annotations, requiring frequent fixups.
There's no tooling and no functionality upstream that
keeps it correct.

So burn down the sick forest, allowing new, healthier growth:

   27 files changed, 350 insertions(+), 1101 deletions(-)

Someone who has the willingness and time to do this
properly can attempt to reintroduce dwarf debuginfo in x86
assembly code plus dwarf unwinding from first principles,
with the following conditions:

 - it should be maximally readable, and maximally low-key to
   'ordinary' code reading and maintenance.

 - find a build time method to insert dwarf annotations
   automatically in the most common cases, for pop/push
   instructions that manipulate the stack pointer. This could
   be done for example via a preprocessing step that just
   looks for common patterns - plus special annotations for
   the few cases where we want to depart from the default.
   We have hundreds of CFI annotations, so automating most of
   that makes sense.

 - it should come with build tooling checks that ensure that
   CFI annotations are sensible. We've seen such efforts from
   the framepointer side, and there's no reason it couldn't be
   done on the dwarf side.

Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Frédéric Weisbecker <fweisbec@gmail.com
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jan Beulich <JBeulich@suse.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>

131484c8

24 May, 2015 1 commit

x86/asm/irq: Stop relying on magic JMP behavior for early_idt_handlers · cdeb6048

Andy Lutomirski authored May 22, 2015

The early_idt_handlers asm code generates an array of entry
points spaced nine bytes apart.  It's not really clear from that
code or from the places that reference it what's going on, and
the code only works in the first place because GAS never
generates two-byte JMP instructions when jumping to global
labels.

Clean up the code to generate the correct array stride (member size)
explicitly. This should be considerably more robust against
screw-ups, as GAS will warn if a .fill directive has a negative
count.  Using '. =' to advance would have been even more robust
(it would generate an actual error if it tried to move
backwards), but it would pad with nulls, confusing anyone who
tries to disassemble the code.  The new scheme should be much
clearer to future readers.

While we're at it, improve the comments and rename the array and
common code.

Binutils may start relaxing jumps to non-weak labels.  If so,
this change will fix our build, and we may need to backport this
change.

Before, on x86_64:

  0000000000000000 <early_idt_handlers>:
     0:   6a 00                   pushq  $0x0
     2:   6a 00                   pushq  $0x0
     4:   e9 00 00 00 00          jmpq   9 <early_idt_handlers+0x9>
                          5: R_X86_64_PC32        early_idt_handler-0x4
  ...
    48:   66 90                   xchg   %ax,%ax
    4a:   6a 08                   pushq  $0x8
    4c:   e9 00 00 00 00          jmpq   51 <early_idt_handlers+0x51>
                          4d: R_X86_64_PC32       early_idt_handler-0x4
  ...
   117:   6a 00                   pushq  $0x0
   119:   6a 1f                   pushq  $0x1f
   11b:   e9 00 00 00 00          jmpq   120 <early_idt_handler>
                          11c: R_X86_64_PC32      early_idt_handler-0x4

After:

  0000000000000000 <early_idt_handler_array>:
     0:   6a 00                   pushq  $0x0
     2:   6a 00                   pushq  $0x0
     4:   e9 14 01 00 00          jmpq   11d <early_idt_handler_common>
  ...
    48:   6a 08                   pushq  $0x8
    4a:   e9 d1 00 00 00          jmpq   120 <early_idt_handler_common>
    4f:   cc                      int3
    50:   cc                      int3
  ...
   117:   6a 00                   pushq  $0x0
   119:   6a 1f                   pushq  $0x1f
   11b:   eb 03                   jmp    120 <early_idt_handler_common>
   11d:   cc                      int3
   11e:   cc                      int3
   11f:   cc                      int3
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Acked-by: H. Peter Anvin <hpa@linux.intel.com>
Cc: Binutils <binutils@sourceware.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: H.J. Lu <hjl.tools@gmail.com>
Cc: Jan Beulich <JBeulich@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/ac027962af343b0c599cbfcf50b945ad2ef3d7a8.1432336324.git.luto@kernel.orgSigned-off-by: Ingo Molnar <mingo@kernel.org>

cdeb6048

17 May, 2015 8 commits

x86/asm/entry/64: Use shorter MOVs from segment registers · adeb5537

Denys Vlasenko authored May 15, 2015

The "movw %ds,%cx" instruction needs a 0x66 prefix, while
"movl %ds,%ecx" does not.

The difference is that latter form (on 64-bit CPUs)
overwrites the entire %ecx, not only its lower half.

But subsequent code doesn't depend on the value of upper
half of %ecx, so we can safely use the shorter instruction.

The new code is also faster than the old one - now we don't
depend on the old value of %ecx, but this code fragment is
not performance-critical so it does not matter much.
Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Drewry <wad@chromium.org>
Link: http://lkml.kernel.org/r/1431722346-26585-1-git-send-email-dvlasenk@redhat.comSigned-off-by: Ingo Molnar <mingo@kernel.org>

adeb5537

x86/asm/head*.S: Change global labels to local · e839004b

Borislav Petkov authored May 16, 2015

Make the disassembly look less confusing:

  -- head_64.o.before.asm
  ++ head_64.o.after.asm
   0000000000000120 <early_idt_handler>:
    120:	fc                   	cld
    121:	83 3c 24 02          	cmpl   $0x2,(%rsp)
  - 125:	0f 84 9d 00 00 00    	je     1c8 <is_nmi>
  + 125:	0f 84 9d 00 00 00    	je     1c8 <early_idt_handler+0xa8>
    12b:	83 3d 00 00 00 00 02 	cmpl   $0x2,0x0(%rip)        # 132 <early_idt_handler+0x12>
    132:	74 7e                	je     1b2 <early_idt_handler+0x92>
    134:	ff 05 00 00 00 00    	incl   0x0(%rip)        # 13a <early_idt_handler+0x1a>
  @@ -1198,9 +1198,7 @@ Disassembly of section .init.text:
    1bf:	5a                   	pop    %rdx
    1c0:	59                   	pop    %rcx
    1c1:	58                   	pop    %rax
  - 1c2:	ff 0d 00 00 00 00    	decl   0x0(%rip)        # 1c8 <is_nmi>
  -
  -00000000000001c8 <is_nmi>:
  + 1c2:	ff 0d 00 00 00 00    	decl   0x0(%rip)        # 1c8 <early_idt_handler+0xa8>
    1c8:	48 83 c4 10          	add    $0x10,%rsp
    1cc:	48 cf                	iretq

  -- head_32.o.before.asm
  ++ head_32.o.after.asm
   0000016c <early_idt_handler>:
    16c:  fc                      cld
    16d:  83 3c 24 02             cmpl   $0x2,(%esp)
  - 171:  74 73                   je     1e6 <is_nmi>
  + 171:  74 73                   je     1e6 <ex_entry+0xc>
    173:  36 83 3d 00 00 00 00    cmpl   $0x2,%ss:0x0
    17a:  02
    17b:  74 5a                   je     1d7 <hlt_loop>
  @@ -483,8 +483,6 @@ Disassembly of section .init.text:
    1dd:  59                      pop    %ecx
    1de:  58                      pop    %eax
    1df:  36 ff 0d 00 00 00 00    decl   %ss:0x0
  -
  -000001e6 <is_nmi>:
    1e6:  83 c4 08                add    $0x8,%esp
    1e9:  cf                      iret
    1ea:  66 90                   xchg   %ax,%ax

No functionality change.
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@linux.intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1431793079-11153-1-git-send-email-bp@alien8.deSigned-off-by: Ingo Molnar <mingo@kernel.org>

e839004b

Merge branch 'linus' into x86/asm, to resolve conflicts · 75d95d84
Ingo Molnar authored May 17, 2015
```
Conflicts:
	tools/testing/selftests/x86/Makefile
	tools/testing/selftests/x86/run_x86_tests.sh
```
75d95d84

x86: Pack loops tightly as well · 52648e83

Ingo Molnar authored May 17, 2015

Packing loops tightly (-falign-loops=1) is beneficial to code size:

     text        data    bss     dec              filename
 12566391        1617840 1089536 15273767         vmlinux.align.16-byte
 12224951        1617840 1089536 14932327         vmlinux.align.1-byte
 11976567        1617840 1089536 14683943         vmlinux.align.1-byte.funcs-1-byte
 11903735        1617840 1089536 14611111         vmlinux.align.1-byte.funcs-1-byte.loops-1-byte

Which reduces the size of the kernel by another 0.6%, so the
the total combined size reduction of the alignment-packing
patches is ~5.5%.

The x86 decoder bandwidth and caching arguments laid out in:

  be6cb027 ("x86: Align jump targets to 1-byte boundaries")

apply to loop alignment as well.

Furtermore, modern CPU uarchs have a loop cache/buffer that
is a L0 cache before even any uop cache, covering a few
dozen most recently executed instructions.

This loop cache generally does not have the 16-byte alignment
restrictions of the uop cache.

Now loop alignment can still be beneficial if:

 - a loop is cache-hot and its surroundings are not.

 - if the loop is so cache hot that the instruction
   flow becomes x86 decoder bandwidth limited

But loop alignment is harmful if:

 - a loop is cache-cold

 - a loop's surroundings are cache-hot as well

 - two cache-hot loops are close to each other

 - if the loop fits into the loop cache

 - if the code flow is not decoder bandwidth limited

and I'd argue that the latter five scenarios are much
more common in the kernel, as our hottest loops are
typically:

 - pointer chasing: this should fit into the loop cache
   in most cases and is typically data cache and address
   generation limited

 - generic memory ops (memset, memcpy, etc.): these generally
   fit into the loop cache as well, and are likewise data
   cache limited.

So this patch packs loop addresses tightly as well.
Acked-by: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Aswin Chandramouleeswaran <aswin@hp.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jason Low <jason.low2@hp.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Link: http://lkml.kernel.org/r/20150410123017.GB19918@gmail.comSigned-off-by: Ingo Molnar <mingo@kernel.org>

52648e83

Merge tag 'usb-4.1-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb · c0655fe9

Linus Torvalds authored May 16, 2015

Pull USB fixes from Greg KH:
 "Here are some USB fixes and new device ids for 4.1-rc4.

  All are pretty minor, and have been in linux-next successfully"

* tag 'usb-4.1-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
  usb-storage: Add NO_WP_DETECT quirk for Lacie 059f:0651 devices
  Added another USB product ID for ELAN touchscreen quirks.
  xhci: gracefully handle xhci_irq dead device
  xhci: Solve full event ring by increasing TRBS_PER_SEGMENT to 256
  xhci: fix isoc endpoint dequeue from advancing too far on transaction error
  usb: chipidea: debug: avoid out of bound read
  USB: visor: Match I330 phone more precisely
  USB: pl2303: Remove support for Samsung I330
  USB: cp210x: add ID for KCF Technologies PRN device
  usb: gadget: remove incorrect __init/__exit annotations
  usb: phy: isp1301: work around tps65010 dependency
  usb: gadget: serial: fix re-ordering of tx data
  usb: gadget: hid: Fix static variable usage
  usb: gadget: configfs: Fix interfaces array NULL-termination
  usb: gadget: xilinx: fix devm_ioremap_resource() check
  usb: dwc3: dwc3-omap: correct the register macros

c0655fe9

Merge tag 'tty-4.1-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty · dd8edd7e

Linus Torvalds authored May 16, 2015

Pull tty/serial fixes from Greg KH:
 "Here's some TTY and serial driver fixes for reported issues.

  All of these have been in linux-next successfully"

* tag 'tty-4.1-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
  pty: Fix input race when closing
  tty/n_gsm.c: fix a memory leak when gsmtty is removed
  Revert "serial/amba-pl011: Leave the TX IRQ alone when the UART is not open"
  serial: omap: Fix error handling in probe
  earlycon: Revert log warnings

dd8edd7e

Merge tag 'staging-4.1-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging · 3f4741b1

Linus Torvalds authored May 16, 2015

Pull staging / IIO driver fixes from Greg KH:
 "Here's some staging and iio driver fixes to resolve a number of
  reported issues.

  All of these have been in linux-next for a while"

* tag 'staging-4.1-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: (31 commits)
  iio: light: hid-sensor-prox: Fix memory leak in probe()
  iio: adc: cc10001: Add delay before setting START bit
  iio: adc: cc10001: Fix regulator_get_voltage() return value check
  iio: adc: cc10001: Fix incorrect use of power-up/power-down register
  staging: gdm724x: Correction of variable usage after applying ALIGN()
  iio: adc: cc10001: Fix the channel number mapping
  staging: vt6655: lock MACvWriteBSSIDAddress.
  staging: vt6655: CARDbUpdateTSF bss timestamp correct tsf counter value.
  staging: vt6655: vnt_tx_packet Correct TX order of OWNED_BY_NIC
  staging: vt6655: Fix 80211 control and management status reporting.
  staging: vt6655: implement IEEE80211_TX_STAT_NOACK_TRANSMITTED
  staging: vt6655: device_free_tx_buf use only ieee80211_tx_status_irqsafe
  staging: vt6656: use ieee80211_tx_info to select packet type.
  staging: rtl8712: freeing an ERR_PTR
  staging: sm750: remove incorrect __exit annotation
  iio: kfifo: Set update_needed to false only if a buffer was allocated
  iio: mcp320x: Fix occasional incorrect readings
  iio: accel: mma9553: check input value for activity period
  iio: accel: mma9553: add enable channel for activity
  iio: accel: mma9551_core: prevent buffer overrun
  ...

3f4741b1

Merge tag 'char-misc-4.1-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc · 148c46f3

Linus Torvalds authored May 16, 2015

Pull char/misc fix from Greg KH:
 "Here is one fix, in the extcon subsystem, that resolves a reported
  issue.

  It's been in linux-next for a number of weeks now, sorry for not
  getting it to you sooner"

* tag 'char-misc-4.1-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
  extcon: usb-gpio: register extcon device before IRQ registration

148c46f3

16 May, 2015 9 commits

Merge branch 'for-linus-4.1-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml · 92752b5c

Linus Torvalds authored May 16, 2015

Pull UML hostfs fix from Richard Weinberger:
 "This contains a single fix for a regression introduced in 4.1-rc1"

* 'for-linus-4.1-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml:
  hostfs: Use correct mask for file mode

92752b5c

Merge tag 'upstream-4.1-rc4' of git://git.infradead.org/linux-ubifs · 1630ee5e

Linus Torvalds authored May 16, 2015

Pull UBI bufix from Richard Weinberger:
 "This contains a single bug fix for the UBI block driver"

* tag 'upstream-4.1-rc4' of git://git.infradead.org/linux-ubifs:
  UBI: block: Add missing cache flushes

1630ee5e

Merge tag 'for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 · 6a8098a4

Linus Torvalds authored May 16, 2015

Pull ext4 fixes from Ted Ts'o:
 "Fix a number of ext4 bugs; the most serious of which is a bug in the
  lazytime mount optimization code where we could end up updating the
  timestamps to the wrong inode"

* tag 'for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
  ext4: fix an ext3 collapse range regression in xfstests
  jbd2: fix r_count overflows leading to buffer overflow in journal recovery
  ext4: check for zero length extent explicitly
  ext4: fix NULL pointer dereference when journal restart fails
  ext4: remove unused function prototype from ext4.h
  ext4: don't save the error information if the block device is read-only
  ext4: fix lazytime optimization

6a8098a4

Merge branch 'for-linus-4.1' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs · c7309e88

Linus Torvalds authored May 16, 2015

Pull btrfs fixes from Chris Mason:
 "The first commit is a fix from Filipe for a very old extent buffer
  reuse race that triggered a BUG_ON.  It hasn't come up often, I looked
  through old logs at FB and we hit it a handful of times over the last
  year.

  The rest are other corners he hit during testing"

* 'for-linus-4.1' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
  Btrfs: fix race when reusing stale extent buffers that leads to BUG_ON
  Btrfs: fix race between block group creation and their cache writeout
  Btrfs: fix panic when starting bg cache writeout after IO error
  Btrfs: fix crash after inode cache writeback failure

c7309e88

Merge branch 'master' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus · 518af3cb

Linus Torvalds authored May 16, 2015

Pull MIPS fixes from Ralf Baechle:
 "Seven small fixes.  The shortlog below is a good description so no
  need to elaborate.

  It has sat in linux-next and survived the usual automated testing by
  Imagination's test farm"

* 'master' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
  MIPS: tlb-r4k: Fix PG_ELPA comment
  MIPS: Fix up obsolete cpu_set usage
  MIPS: IP32: Fix build errors in reset code in DS1685 platform hook.
  MIPS: KVM: Fix unused variable build warning
  MIPS: traps: remove extra Tainted: line from __show_regs() output
  MIPS: Fix wrong CHECKFLAGS (sparse builds) with GCC 5.1
  MIPS: Fix a preemption issue with thread's FPU defaults

518af3cb

Merge tag 'arc-4.1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc · 2ed3d795

Linus Torvalds authored May 16, 2015

Pull ARC fixes from Vineet Gupta.

* tag 'arc-4.1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
  ARC: inline cache flush toggle helpers
  ARC: With earlycon in use, retire EARLY_PRINTK
  ARC: unbork !LLSC build

2ed3d795

Merge tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc · d6610270

Linus Torvalds authored May 16, 2015

Pull ARM SoC fixes from Arnd Bergmann:
 "Nothing frightening this time, just smaller fixes in a number of
  places.

  The other changes contained here are:

   MAINTAINERS file updates:

   - The mach-gemini maintainer is back in action and has a new git tree

   - Krzysztof Kozlowski has volunteered to be a new co-maintainer for
     the samsung platforms

   - updates to the files that belong to Marvell mvebu

  Bug fixes:

   - The largest changes are on omap2, but are only to avoid some
     harmless warnings and to fix reset on omap4

   - a small regression fix on tegra

   - multiple fixes for incorrect IRQ affinity on vexpress

   - the missing system controller on arm64 juno is added

   - one revert of a patch that was accidentally applied twice for
     mach-rockchip

   - two clock related DT fixes for mvebu

   - a workaround for suspend with old DT binaries on new exynos kernels

   - Another fix for suspend on exynos, needs to be backported"

* tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (21 commits)
  MAINTAINERS: Add dts entries for some of the Marvell SoCs
  MAINTAINERS: ARM: EXYNOS: Add Krzysztof Kozlowski as co-maintainer
  ARM: EXYNOS: Use of_machine_is_compatible instead of soc_is_exynos4
  ARM: EXYNOS: Fix failed second suspend on Exynos4
  Revert "ARM: rockchip: fix undefined instruction of reset_ctrl_regs"
  ARM: EXYNOS: Fix dereference of ERR_PTR returned by of_genpd_get_from_provider
  ARM: EXYNOS: Don't try to initialize suspend on old DT
  ARM: dts: Add keep-power-in-suspend to WiFi SDIO node for Peach Boards
  ARM: gemini: fix compiler warning due wrong data type
  ARM: vexpress/tc2: Add interrupt-affinity to the PMU node
  ARM: vexpress/ca9: Add interrupt-affinity to the PMU node
  ARM: vexpress/ca9: Add unified-cache property to l2 cache node
  ARM64: juno: add sp810 support and fix sp804 clock frequency
  ARM: Gemini: Maintainers update
  ARM: OMAP2+: Remove bogus struct clk comparison for timer clock
  ARM: dove: Add clock-names to CuBox Si5351 clk generator
  ARM: AM33xx+: hwmod: re-use omap4 implementations for reset functionality
  ARM: OMAP4+: PRM: add support for passing status register/bit info to reset
  ARM: AM43xx: hwmod: add VPFE hwmod entries
  ARM: mvebu: Fix the main PLL frequency on Armada 375, 38x and 39x SoCs
  ...

d6610270

Merge branch 'for-rc' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux · 73786683

Linus Torvalds authored May 16, 2015

Pull thermal fixes from Zhang Rui:
 "Specifics:

   - fix an issue in intel_powerclamp driver that idle injection target
     is not accurately maintained on newer Intel CPUs.  Package C8 to
     C10 states are introduced on these CPUs but they were not included
     in the package c-state residency calculation.  From Jacob Pan.

   - fix a problem that package c-state idle injection was missing on
     Broadwell server, by adding its id to intel_powerclamp driver.
     From Jacob Pan.

   - a couple of small fixes and cleanups from Joe Perches, Mathias
     Krause, Dan Carpenter and Anand Moon"

* 'for-rc' of git://git.kernel.org/pub/scm/linux/kernel/git/rzhang/linux:
  tools/thermal: tmon: fixed the 'make install' command
  thermal: rockchip: fix an error code
  thermal/powerclamp: fix missing newer package c-states
  thermal/intel_powerclamp: add id for broadwell server
  thermal/intel_powerclamp: add __init / __exit annotations
  thermal: Use bool function return values of true/false not 1/0

73786683

Merge tag 'linux-kselftest-4.1-rc4' of... · d70933be

Linus Torvalds authored May 16, 2015

Merge tag 'linux-kselftest-4.1-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest

Pull kselftest fixes from Shuah Khan:
 "Urgent fix for Kselftest regression introduced in 4.1-rc1 by the new
  x86 test due to its hard dependency on 32-bit build environment.

  A set of 5 patches fix the make kselftest run and kselftest install"

* tag 'linux-kselftest-4.1-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  selftests, x86: Rework x86 target architecture detection
  selftests, x86: Remove useless run_tests rule
  selftests/x86: install tests
  selftest/x86: have no dependency on all when cross building
  selftest/x86: build both bitnesses

d70933be

15 May, 2015 15 commits

Merge branch 'parisc-4.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux · 4b470f12

Linus Torvalds authored May 15, 2015

Pull parisc fixes from Helge Deller:
 "One important patch which fixes crashes due to stack randomization on
  architectures where the stack grows upwards (currently parisc and
  metag only).

  This bug went unnoticed on parisc since kernel 3.14 where the flexible
  mmap memory layout support was added by commit 9dabf60d.  The
  changes in fs/exec.c are inside an #ifdef CONFIG_STACK_GROWSUP section
  and will not affect other platforms.

  The other two patches rename args of the kthread_arg() function and
  fixes a printk output"

* 'parisc-4.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
  parisc,metag: Fix crashes due to stack randomization on stack-grows-upwards architectures
  parisc: copy_thread(): rename 'arg' argument to 'kthread_arg'
  parisc: %pf is only for function pointers

4b470f12

MIPS: tlb-r4k: Fix PG_ELPA comment · e05cb568

James Hogan authored May 13, 2015

The ELPA bit in PageGrain is all about large *physical* addresses, so
correct the reference to "large virtual address" in the comment above
where it is set for MIPS64.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: David Daney <ddaney@caviumnetworks.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/10038/Signed-off-by: Ralf Baechle <ralf@linux-mips.org>

e05cb568

MIPS: Fix up obsolete cpu_set usage · 7363cb7d

Ezequiel Garcia authored Apr 28, 2015

cpu_set was removed (along with a bunch of cpumask helpers) by
commit 2f0f267e ("cpumask: remove deprecated functions.").

Fix this by replacing cpu_set with cpumask_set_cpu. Without this
fix the following error is triggered when CONFIG_MIPS_MT_FPAFF=y.

arch/mips/kernel/smp-cps.c: In function 'cps_smp_setup':
arch/mips/kernel/smp-cps.c:95:3: error: implicit declaration of function 'cpu_set' [-Werror=implicit-function-declaration]

Fixes: 90db024f ("MIPS: smp-cps: cpu_set FPU mask if FPU present")
Signed-off-by: Ezequiel Garcia <ezequiel.garcia@imgtec.com>
Acked-by: Niklas Cassel <niklass@axis.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/9912/Signed-off-by: Ralf Baechle <ralf@linux-mips.org>

7363cb7d

Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · be5e32fc

Linus Torvalds authored May 15, 2015

Pull x86 build fix from Ingo Molnar:
 "A bzImage build fix on older distros"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/vdso: Fix 'make bzImage' on older distros

be5e32fc

Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 14db1e8d

Linus Torvalds authored May 15, 2015

Pull scheduler fixes from Ingo Molnar:
 "Two fixes: a suspend/resume related regression fix, and an RT priority
  boosting fix"

* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/core: Fix regression in cpuset_cpu_inactive() for suspend
  sched: Handle priority boosted tasks proper in setscheduler()

14db1e8d

Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · ef4a293a

Linus Torvalds authored May 15, 2015

Pull perf fixes from Ingo Molnar:
 "Mostly tooling fixes, but also a lockdep annotation fix, a PMU event
  list fix and a new model addition"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  tools/liblockdep: Fix compilation error
  tools/liblockdep: Fix linker error in case of cross compile
  perf tools: Use getconf to determine number of online CPUs
  tools: Fix tools/vm build
  perf/x86/rapl: Enable Broadwell-U RAPL support
  perf/x86/intel: Fix SLM cache event list
  perf: Annotate inherited event ctx->mutex recursion

ef4a293a

Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 214e9f72

Linus Torvalds authored May 15, 2015

Pull irq fix from Ingo Molnar:
 "A tegra irqchip driver memory corruption fix"

* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  irqchip: tegra: Set the proper base address in irq chip data

214e9f72

Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux · c4d0bcc2

Linus Torvalds authored May 15, 2015

Pull drm fixes from Dave Airlie:
 "Radeon:
     one oops fix, one bug fix, one pci id addition patch

  i915:
     one suspend/resume regression fix.

  All seems quiet enough."

* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
  drm/radeon: don't do mst probing if MST isn't enabled.
  drm/radeon: add new bonaire pci id
  drm/radeon: fix VM_CONTEXT*_PAGE_TABLE_END_ADDR handling
  drm/i915: Avoid GPU hang when coming out of s3 or s4

c4d0bcc2

Merge branch 'akpm' (patches from Andrew) · 0336104d

Linus Torvalds authored May 15, 2015

Merge misc fixes from Andrew Morton:
 "8 fixes"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  mm, numa: really disable NUMA balancing by default on single node machines
  MAINTAINERS: update Jingoo Han's email address
  CMA: page_isolation: check buddy before accessing it
  uidgid: make uid_valid and gid_valid work with !CONFIG_MULTIUSER
  kernfs: do not account ino_ida allocations to memcg
  gfp: add __GFP_NOACCOUNT
  tools/vm: fix page-flags build
  drivers/rtc/rtc-armada38x.c: remove unused local `flags'

0336104d

Merge tag 'samsung-fixes-2' of... · 56523eef

Arnd Bergmann authored May 15, 2015

Merge tag 'samsung-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kgene/linux-samsung into fixes

Merge "Samsung 2nd fixes for v4.1" from Kukjin Kim:

- fix second S2R on exynos4412 based Trats2, Odroid U3 boards which
  happened after enabling L2$ and caused by commit 13cfa6c4 ("ARM:
  EXYNOS: Fix CPU idle clock down after CPU off")
  And replace the soc_is_exynosxxx() macro with of_compatible_xxx

- fix dereference of ERR_PTR of of_genpd_get_from_provider()

- fix suspend problem on old DT machines to skip the initialization
  suspend and caused by commit 8b283c02 ("ARM: exynos4/5: convert
  pmu wakeup to stacked domains")

- add keep-power-in-suspend for Peach Boards to support S2R and has
  been missed in previous pull-request for fixes

* tag 'samsung-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kgene/linux-samsung:
  ARM: EXYNOS: Use of_machine_is_compatible instead of soc_is_exynos4
  ARM: EXYNOS: Fix failed second suspend on Exynos4
  ARM: EXYNOS: Fix dereference of ERR_PTR returned by of_genpd_get_from_provider
  ARM: EXYNOS: Don't try to initialize suspend on old DT
  ARM: dts: Add keep-power-in-suspend to WiFi SDIO node for Peach Boards

56523eef

Merge tag 'mvebu-fixes-4.1-2' of git://git.infradead.org/linux-mvebu into fixes · cc1c1b5d

Arnd Bergmann authored May 15, 2015

Merge "mvebu fixes for 4.1 (part 2)" from Gregory CLEMENT:

Fix the main PLL frequency on Armada 375, 38x and 39x SoCs
Add clock-names to CuBox Si5351 clk generator
Add dts entries in the MAINTAINERS file

* tag 'mvebu-fixes-4.1-2' of git://git.infradead.org/linux-mvebu:
  MAINTAINERS: Add dts entries for some of the Marvell SoCs
  ARM: dove: Add clock-names to CuBox Si5351 clk generator
  ARM: mvebu: Fix the main PLL frequency on Armada 375, 38x and 39x SoCs

cc1c1b5d

MAINTAINERS: Add dts entries for some of the Marvell SoCs · 31c17ac9

Gregory CLEMENT authored May 15, 2015

Since many releases, the modifications of the mvebu and berlin device
tree files are merged through the mvebu subsystem. This patch makes it
official in order to help the contributors using the get_maintainer.pl
to find the accurate peoples.

In the same time, updated the mvebu description which now includes the
kirkwood SoCs and new Armada SoCs.
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Acked-by: Sebastian Hesselbarth <sebastian.hesselbarth@gmail.com>
Acked-by: Jason Cooper <jason@lakedaemon.net>
Acked-by: Andrew Lunn <andrew@lunn.ch>

31c17ac9

x86: Align jump targets to 1-byte boundaries · be6cb027

Ingo Molnar authored Apr 10, 2015

The following NOP in a hot function caught my attention:

  >   5a:	66 0f 1f 44 00 00    	nopw   0x0(%rax,%rax,1)

That's a dead NOP that bloats the function a bit, added for the
default 16-byte alignment that GCC applies for jump targets.

I realize that x86 CPU manufacturers recommend 16-byte jump
target alignments (it's in the Intel optimization manual),
to help their relatively narrow decoder prefetch alignment
and uop cache constraints, but the cost of that is very
significant:

        text           data       bss         dec      filename
    12566391        1617840   1089536    15273767      vmlinux.align.16-byte
    12224951        1617840   1089536    14932327      vmlinux.align.1-byte

By using 1-byte jump target alignment (i.e. no alignment at all)
we get an almost 3% reduction in kernel size (!) - and a
probably similar reduction in I$ footprint.

Now, the usual justification for jump target alignment is the
following:

 - modern decoders tend to have 16-byte (effective) decoder
   prefetch windows. (AMD documents it higher but measurements
   suggest the effective prefetch window on curretn uarchs is
   still around 16 bytes)

 - on Intel there's also the uop-cache with cachelines that have
   16-byte granularity and limited associativity.

 - older x86 uarchs had a penalty for decoder fetches that crossed
   16-byte boundaries. These limits are mostly gone from recent
   uarchs.

So if a forward jump target is aligned to cacheline boundary then
prefetches will start from a new prefetch-cacheline and there's
higher chance for decoding in fewer steps and packing tightly.

But I think that argument is flawed for typical optimized kernel
code flows: forward jumps often go to 'cold' (uncommon) pieces
of code, and  aligning cold code to cache lines does not bring a
lot of advantages  (they are uncommon), while it causes
collateral damage:

 - their alignment 'spreads out' the cache footprint, it shifts
   followup hot code further out

 - plus it slows down even 'cold' code that immediately follows 'hot'
   code (like in the above case), which could have benefited from the
   partial cacheline that comes off the end of hot code.

But even in the cache-hot case the 16 byte alignment brings
disadvantages:

 - it spreads out the cache footprint, possibly making the code
   fall out of the L1 I$.

 - On Intel CPUs, recent microarchitectures have plenty of
   uop cache (typically doubling every 3 years) - while the
   size of the L1 cache grows much less aggressively. So
   workloads are rarely uop cache limited.

The only situation where alignment might matter are tight
loops that could fit into a single 16 byte chunk - but those
are pretty rare in the kernel: if they exist they tend
to be pointer chasing or generic memory ops, which both tend
to be cache miss (or cache allocation) intensive and are not
decoder bandwidth limited.

So the balance of arguments strongly favors packing kernel
instructions tightly versus maximizing for decoder bandwidth:
this patch changes the jump target alignment from 16 bytes
to 1 byte (tightly packed, unaligned).
Acked-by: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Aswin Chandramouleeswaran <aswin@hp.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jason Low <jason.low2@hp.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Link: http://lkml.kernel.org/r/20150410120846.GA17101@gmail.comSigned-off-by: Ingo Molnar <mingo@kernel.org>

be6cb027

Merge branch 'liblockdep-fixes' of... · 60d5ddea

Ingo Molnar authored May 15, 2015

Merge branch 'liblockdep-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/sashal/linux into perf/urgent

Pull liblockdep fixes from Sasha Levin:

"two fixes that deal with compilation errors in liblockdep."
Signed-off-by: Ingo Molnar <mingo@kernel.org>

60d5ddea

Merge tag 'drm-intel-fixes-2015-05-13' of git://anongit.freedesktop.org/drm-intel into drm-fixes · 47231324

Dave Airlie authored May 15, 2015

fix one gpu hang on resume.

* tag 'drm-intel-fixes-2015-05-13' of git://anongit.freedesktop.org/drm-intel:
  drm/i915: Avoid GPU hang when coming out of s3 or s4

47231324