Commits · f71cd88c4320b4fbe75cadba7df80b1e299a6a27 · Kirill Smelkov / linux

19 Feb, 2015 40 commits

lib/checksum.c: fix build for generic csum_tcpudp_nofold · f71cd88c

karl beldan authored Jan 29, 2015

Fixed commit added from64to32 under _#ifndef do_csum_ but used it
under _#ifndef csum_tcpudp_nofold_, breaking some builds (Fengguang's
robot reported TILEGX's). Move from64to32 under the latter.

Fixes: 150ae0e9 ("lib/checksum.c: fix carry in csum_tcpudp_nofold")
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Karl Beldan <karl.beldan@rivierawaves.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

(cherry picked from commit 9ce35779)
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

f71cd88c

lib/checksum.c: fix build for generic csum_tcpudp_nofold · 93536369

karl beldan authored Jan 28, 2015

Fixed commit added from64to32 under _#ifndef do_csum_ but used it
under _#ifndef csum_tcpudp_nofold_, breaking some builds (Fengguang's
robot reported TILEGX's). Move from64to32 under the latter.

Fixes: 150ae0e9 ("lib/checksum.c: fix carry in csum_tcpudp_nofold")
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Karl Beldan <karl.beldan@rivierawaves.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

tcp: ipv4: initialize unicast_sock sk_pacing_rate

When I added sk_pacing_rate field, I forgot to initialize its value
in the per cpu unicast_sock used in ip_send_unicast_reply()

This means that for sch_fq users, RST packets, or ACK packets sent
on behalf of TIME_WAIT sockets might be sent to slowly or even dropped
once we reach the per flow limit.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Fixes: 95bd09eb ("tcp: TSO packets automatic sizing")
Signed-off-by: David S. Miller <davem@davemloft.net>

lib/checksum.c: fix carry in csum_tcpudp_nofold

The carry from the 64->32bits folding was dropped, e.g with:
saddr=0xFFFFFFFF daddr=0xFF0000FF len=0xFFFF proto=0 sum=1,
csum_tcpudp_nofold returned 0 instead of 1.
Signed-off-by: Karl Beldan <karl.beldan@rivierawaves.com>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Mike Frysinger <vapier@gentoo.org>
Cc: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: stable@vger.kernel.org
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

(cherry picked from commit 9ce35779
150ae0e9)
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

93536369

tcm_loop: Fix wrong I_T nexus association · 7b453b80

Hannes Reinecke authored Jan 30, 2015

tcm_loop has the I_T nexus associated with the HBA. This causes
commands to become misdirected if the HBA has more than one
target portal group; any command is then being sent to the
first target portal group instead of the correct one.

The nexus needs to be associated with the target portal group
instead.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Cc: <stable@vger.kernel.org> # 3.0+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>

(cherry picked from commit 506787a2)
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

7b453b80

gpio: squelch a compiler warning · bca042cf

Martin Kaiser authored Jan 30, 2015

drivers/gpio/gpiolib-of.c: In function 'of_gpiochip_find_and_xlate':
drivers/gpio/gpiolib-of.c:51:21: warning: assignment makes integer from
pointer without a cast [enabled by default]
   gg_data->out_gpio = ERR_PTR(ret);
                     ^
this was introduced in d1c34491

the upstream kernel changed the type of out_gpio from int to struct gpio_desc *
as part of a larger refactoring that wasn't backported
Signed-off-by: Martin Kaiser <martin@kaiser.cx>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

(cherry picked from commit 60e353b1)

(cherry picked from commit HEAD)
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

bca042cf

pstore: d_alloc_name() doesn't return an ERR_PTR · 023391e1

Dan Carpenter authored Aug 14, 2013

d_alloc_name() returns NULL on error.  Also I changed the error code
from -ENOSPC to -ENOMEM to reflect that we were short on RAM not disk
space.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Tony Luck <tony.luck@intel.com>

(cherry picked from commit c39524e6)
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

023391e1

pstore: Fail to unlink if a driver has not defined pstore_erase · e29e8202

Aruna Balakrishnaiah authored Jun 25, 2013

pstore_erase is used to erase the record from the persistent store.
So if a driver has not defined pstore_erase callback return
-EPERM instead of unlinking a file as deleting the file without
erasing its record in persistent store will give a wrong impression
to customers.
Signed-off-by: Aruna Balakrishnaiah <aruna@linux.vnet.ibm.com>
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Tony Luck <tony.luck@intel.com>

(cherry picked from commit bf288333)
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

e29e8202

ARM: 8108/1: mm: Introduce {pte,pmd}_isset and {pte,pmd}_isclear · ecf51c22

Steven Capper authored Jul 18, 2014

Long descriptors on ARM are 64 bits, and some pte functions such as
pte_dirty return a bitwise-and of a flag with the pte value. If the
flag to be tested resides in the upper 32 bits of the pte, then we run
into the danger of the result being dropped if downcast.

For example:
	gather_stats(page, md, pte_dirty(*pte), 1);
where pte_dirty(*pte) is downcast to an int.

This patch introduces a new macro pte_isset which performs the bitwise
and, then performs a double logical invert (where needed) to ensure
predictable downcasting. The logical inverse pte_isclear is also
introduced.

Equivalent pmd functions for Transparent HugePages have also been
added.
Signed-off-by: Steve Capper <steve.capper@linaro.org>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

(cherry picked from commit f2950706)
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

ecf51c22

ARM: 7931/1: Correct virt_addr_valid · 4d332d1b

Laura Abbott authored Dec 21, 2013

The definition of virt_addr_valid is that virt_addr_valid should
return true if and only if virt_to_page returns a valid pointer.
The current definition of virt_addr_valid only checks against the
virtual address range. There's no guarantee that just because a
virtual address falls bewteen PAGE_OFFSET and high_memory the
associated physical memory has a valid backing struct page. Follow
the example of other architectures and convert to pfn_valid to
verify that the virtual address is actually valid. The check for
an address between PAGE_OFFSET and high_memory is still necessary
as vmalloc/highmem addresses are not valid with virt_to_page.

Cc: Will Deacon <will.deacon@arm.com>
Cc: Nicolas Pitre <nico@linaro.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

(cherry picked from commit efea3403)
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

4d332d1b

ARM: 7867/1: include: asm: use 'int' instead of 'unsigned long' for 'oldval' in atomic_cmpxchg(). · 37668df2

Chen Gang authored Oct 26, 2013

For atomic_cmpxchg(), the type of 'oldval' need be 'int' to match the
type of "*ptr" (used by 'ldrex' instruction) and 'old' (used by 'teq'
instruction).
Reviewed-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Chen Gang <gang.chen@asianux.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

(cherry picked from commit 4dcc1cf7)
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

37668df2

ARM: lpae: fix definition of PTE_HWTABLE_PTRS · 33450464

Will Deacon authored May 02, 2013

For 2-level page tables, PTE_HWTABLE_PTRS describes the offset between
Linux PTEs and hardware PTEs. On LPAE, there is no distinction (since
we have 64-bit descriptors with plenty of space) so PTE_HWTABLE_PTRS
should be 0. Unfortunately, it is wrongly defined as PTRS_PER_PTE,
meaning that current pte table flushing is off by a page. Luckily,
all current LPAE implementations are SMP, so the hardware walker can
snoop L1.

This patch fixes the broken definition.
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>

(cherry picked from commit e38a5175)
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

33450464

ARM: fix type of PHYS_PFN_OFFSET to unsigned long · b758191f

Cyril Chemparathy authored Sep 12, 2012

On LPAE machines, PHYS_OFFSET evaluates to a phys_addr_t and this type is
inherited by the PHYS_PFN_OFFSET definition as well.  Consequently, the kernel
build emits warnings of the form:

init/main.c: In function 'start_kernel':
init/main.c:588:7: warning: format '%lx' expects argument of type 'long unsigned int', but argument 2 has type 'phys_addr_t' [-Wformat]

This patch fixes this warning by pinning down the PFN type to unsigned long.
Signed-off-by: Cyril Chemparathy <cyril@ti.com>
Acked-by: Nicolas Pitre <nico@linaro.org>
Tested-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Tested-by: Subash Patel <subash.rp@samsung.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>

(cherry picked from commit 5b20c5b2)
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

b758191f

ARM: LPAE: use phys_addr_t in alloc_init_pud() · 1af1c20d

Vitaly Andrianov authored Jul 10, 2012

This patch fixes the alloc_init_pud() function to use phys_addr_t instead of
unsigned long when passing in the phys argument.

This is an extension to commit 97092e0c (ARM:
pgtable: use phys_addr_t for physical addresses), which applied similar changes
elsewhere in the ARM memory management code.
Signed-off-by: Vitaly Andrianov <vitalya@ti.com>
Signed-off-by: Cyril Chemparathy <cyril@ti.com>
Acked-by: Nicolas Pitre <nico@linaro.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Tested-by: Subash Patel <subash.rp@samsung.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>

(cherry picked from commit 20d6956d)
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

1af1c20d

ARM: LPAE: use signed arithmetic for mask definitions · 5877a739

Cyril Chemparathy authored Jul 22, 2012

This patch applies to PAGE_MASK, PMD_MASK, and PGDIR_MASK, where forcing
unsigned long math truncates the mask at the 32-bits.  This clearly does bad
things on PAE systems.

This patch fixes this problem by defining these masks as signed quantities.
We then rely on sign extension to do the right thing.
Signed-off-by: Cyril Chemparathy <cyril@ti.com>
Signed-off-by: Vitaly Andrianov <vitalya@ti.com>
Reviewed-by: Nicolas Pitre <nico@linaro.org>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Tested-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
Tested-by: Subash Patel <subash.rp@samsung.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>

(cherry picked from commit 926edcc7)
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

5877a739

ARM: mm: correct pte_same behaviour for LPAE. · ec1f11ca

Steve Capper authored May 17, 2013

For 3 levels of paging the PTE_EXT_NG bit will be set for user
address ptes that are written to a page table but not for ptes
created with mk_pte.

This can cause some comparison tests made by pte_same to fail
spuriously and lead to other problems.

To correct this behaviour, we mask off PTE_EXT_NG for any pte that
is present before running the comparison.
Signed-off-by: Steve Capper <steve.capper@linaro.org>
Reviewed-by: Will Deacon <will.deacon@arm.com>

(cherry picked from commit dde1b651)
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

ec1f11ca

ARM: 7829/1: Add ".text.unlikely" and ".text.hot" to arm unwind tables · b067506d

Douglas Anderson authored Aug 29, 2013

It appears that gcc may put some code in ".text.unlikely" or
".text.hot" sections.  Right now those aren't accounted for in unwind
tables.  Add them.

I found some docs about this at:
  http://gcc.gnu.org/onlinedocs/gcc-4.6.2/gcc.pdf

Without this, if you have slub_debug turned on, you can get messages
that look like this:
  unwind: Index not found 7f008c50
Signed-off-by: Doug Anderson <dianders@chromium.org>
Acked-by: Mike Frysinger <vapier@gentoo.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

(cherry picked from commit 849b882b)
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

b067506d

regulator: core: fix race condition in regulator_put() · b64e8ae7

Ashay Jaiswal authored Jan 08, 2015

The regulator framework maintains a list of consumer regulators
for a regulator device and protects it from concurrent access using
the regulator device's mutex lock.

In the case of regulator_put() the consumer is removed and regulator
device's parameters are updated without holding the regulator device's
mutex. This would lead to a race condition between the regulator_put()
and any function which traverses the consumer list or modifies regulator
device's parameters.
Fix this race condition by holding the regulator device's mutex in case
of regulator_put.
Signed-off-by: Ashay Jaiswal <ashayj@codeaurora.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
Cc: stable@vger.kernel.org

(cherry picked from commit 83b0302d)
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

b64e8ae7

quota: provide interface for readding allocated space into reserved space · 53cdae84

Jan Kara authored Aug 17, 2013

ext4 needs to convert allocated (metadata) blocks back into blocks
reserved for delayed allocation. Add functions into quota code for
supporting such operation.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>

(cherry picked from commit 1c8924eb)
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

53cdae84

Revert "swiotlb-xen: pass dev_addr to swiotlb_tbl_unmap_single" · 719fd3b4

David Vrabel authored Dec 10, 2014

This reverts commit 2c3fc8d2.

This commit broke on x86 PV because entries in the generic SWIOTLB are
indexed using (pseudo-)physical address not DMA address and these are
not the same in a x86 PV guest.
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>

Revert "swiotlb-xen: pass dev_addr to swiotlb_tbl_unmap_single"

This reverts commit 2c3fc8d2.

This commit broke on x86 PV because entries in the generic SWIOTLB are
indexed using (pseudo-)physical address not DMA address and these are
not the same in a x86 PV guest.
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>

xen: annotate xen_set_identity_and_remap_chunk() with __init

Commit 5b8e7d80 removed the __init
annotation from xen_set_identity_and_remap_chunk(). Add it again.
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>

xen: introduce helper functions to do safe read and write accesses

Introduce two helper functions to safely read and write unsigned long
values from or to memory when the access may fault because the mapping
is non-present or read-only.

These helpers can be used instead of open coded uses of __get_user()
and __put_user() avoiding the need to do casts to fix sparse warnings.

Use the helpers in page.h and p2m.c. This will fix the sparse
warnings when doing "make C=1".
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>

xen: Speed up set_phys_to_machine() by using read-only mappings

Instead of checking at each call of set_phys_to_machine() whether a
new p2m page has to be allocated due to writing an entry in a large
invalid or identity area, just map those areas read only and react
to a page fault on write by allocating the new page.

This change will make the common path with no allocation much
faster as it only requires a single write of the new mfn instead
of walking the address translation tables and checking for the
special cases.
Suggested-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>

xen: switch to linear virtual mapped sparse p2m list

At start of the day the Xen hypervisor presents a contiguous mfn list
to a pv-domain. In order to support sparse memory this mfn list is
accessed via a three level p2m tree built early in the boot process.
Whenever the system needs the mfn associated with a pfn this tree is
used to find the mfn.

Instead of using a software walked tree for accessing a specific mfn
list entry this patch is creating a virtual address area for the
entire possible mfn list including memory holes. The holes are
covered by mapping a pre-defined  page consisting only of "invalid
mfn" entries. Access to a mfn entry is possible by just using the
virtual base address of the mfn list and the pfn as index into that
list. This speeds up the (hot) path of determining the mfn of a
pfn.

Kernel build on a Dell Latitude E6440 (2 cores, HT) in 64 bit Dom0
showed following improvements:

Elapsed time: 32:50 ->  32:35
System:       18:07 ->  17:47
User:        104:00 -> 103:30

Tested with following configurations:
- 64 bit dom0, 8GB RAM
- 64 bit dom0, 128 GB RAM, PCI-area above 4 GB
- 32 bit domU, 512 MB, 8 GB, 43 GB (more wouldn't work even without
                                    the patch)
- 32 bit domU, ballooning up and down
- 32 bit domU, save and restore
- 32 bit domU with PCI passthrough
- 64 bit domU, 8 GB, 2049 MB, 5000 MB
- 64 bit domU, ballooning up and down
- 64 bit domU, save and restore
- 64 bit domU with PCI passthrough
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>

xen: Hide get_phys_to_machine() to be able to tune common path

Today get_phys_to_machine() is always called when the mfn for a pfn
is to be obtained. Add a wrapper __pfn_to_mfn() as inline function
to be able to avoid calling get_phys_to_machine() when possible as
soon as the switch to a linear mapped p2m list has been done.
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>

x86: Introduce function to get pmd entry pointer

Introduces lookup_pmd_address() to get the address of the pmd entry
related to a virtual address in the current address space. This
function is needed for support of a virtual mapped sparse p2m list
in xen pv domains, as we need the address of the pmd entry, not the
one of the pte in that case.
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>

xen: Delay invalidating extra memory

When the physical memory configuration is initialized the p2m entries
for not pouplated memory pages are set to "invalid". As those pages
are beyond the hypervisor built p2m list the p2m tree has to be
extended.

This patch delays processing the extra memory related p2m entries
during the boot process until some more basic memory management
functions are callable. This removes the need to create new p2m
entries until virtual memory management is available.
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>

xen: Delay m2p_override initialization

The m2p overrides are used to be able to find the local pfn for a
foreign mfn mapped into the domain. They are used by driver backends
having to access frontend data.

As this functionality isn't used in early boot it makes no sense to
initialize the m2p override functions very early. It can be done
later without doing any harm, removing the need for allocating memory
via extend_brk().

While at it make some m2p override functions static as they are only
used internally.
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>

xen: Delay remapping memory of pv-domain

Early in the boot process the memory layout of a pv-domain is changed
to match the E820 map (either the host one for Dom0 or the Xen one)
regarding placement of RAM and PCI holes. This requires removing memory
pages initially located at positions not suitable for RAM and adding
them later at higher addresses where no restrictions apply.

To be able to operate on the hypervisor supported p2m list until a
virtual mapped linear p2m list can be constructed, remapping must
be delayed until virtual memory management is initialized, as the
initial p2m list can't be extended unlimited at physical memory
initialization time due to it's fixed structure.

A further advantage is the reduction in complexity and code volume as
we don't have to be careful regarding memory restrictions during p2m
updates.
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>

xen: use common page allocation function in p2m.c

In arch/x86/xen/p2m.c three different allocation functions for
obtaining a memory page are used: extend_brk(), alloc_bootmem_align()
or __get_free_page().  Which of those functions is used depends on the
progress of the boot process of the system.

Introduce a common allocation routine selecting the to be called
allocation routine dynamically based on the boot progress. This allows
moving initialization steps without having to care about changing
allocation calls.
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>

xen: Make functions static

Some functions in arch/x86/xen/p2m.c are used locally only. Make them
static. Rearrange the functions in p2m.c to avoid forward declarations.
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>

xen: fix some style issues in p2m.c

The source arch/x86/xen/p2m.c has some coding style issues. Fix them.
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>

(cherry picked from commit dbdd7476
4ef8e3f3)
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

719fd3b4

gpio: sysfs: fix gpio device-attribute leak · 77e757f8

Johan Hovold authored Jan 13, 2015

The gpio device attributes were never destroyed when the gpio was
unexported (or on export failures).

Use device_create_with_groups() to create the default device attributes
of the gpio class device. Note that this also fixes the
attribute-creation race with userspace for these attributes.

Remove contingent attributes in export error path and on unexport.

Fixes: d8f388d8 ("gpio: sysfs interface")
Cc: stable <stable@vger.kernel.org>	# v2.6.27+
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>

(cherry picked from commit 0915e6fe)
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

77e757f8

gpiolib: of: Correct error handling in of_get_named_gpiod_flags · 55635c52

Hans Holmberg authored Jan 09, 2015

of_get_named_gpiod_flags fails with -EPROBE_DEFER in cases
where the gpio chip is available and the GPIO translation fails.

This causes drivers to be re-probed erroneusly, and hides the
real problem(i.e. the GPIO number being out of range).

Cc: Stable <stable@vger.kernel.org>
Signed-off-by: Hans Holmberg <hans.holmberg@intel.com>
Reviewed-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>

(cherry picked from commit 7b8792bb)
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

55635c52

drm/i915: Force the CS stall for invalidate flushes · 7106485c

Chris Wilson authored Dec 16, 2014

In order to act as a full command barrier by itself, we need to tell the
pipecontrol to actually stall the command streamer while the flush runs.
We require the full command barrier before operations like
MI_SET_CONTEXT, which currently rely on a prior invalidate flush.

References: https://bugs.freedesktop.org/show_bug.cgi?id=83677
Cc: Simon Farnsworth <simon@farnz.org.uk>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: stable@vger.kernel.org
Signed-off-by: Jani Nikula <jani.nikula@intel.com>

(cherry picked from commit add284a3)
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

7106485c

Revert "swiotlb-xen: pass dev_addr to swiotlb_tbl_unmap_single" · 645d9f8c

Stefano Stabellini authored Nov 21, 2014

This reverts commit 2c3fc8d2.

This commit broke on x86 PV because entries in the generic SWIOTLB are
indexed using (pseudo-)physical address not DMA address and these are
not the same in a x86 PV guest.
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>

Revert "swiotlb-xen: pass dev_addr to swiotlb_tbl_unmap_single"

This reverts commit 2c3fc8d2.

This commit broke on x86 PV because entries in the generic SWIOTLB are
indexed using (pseudo-)physical address not DMA address and these are
not the same in a x86 PV guest.
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>

xen: annotate xen_set_identity_and_remap_chunk() with __init

Commit 5b8e7d80 removed the __init
annotation from xen_set_identity_and_remap_chunk(). Add it again.
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>

xen: introduce helper functions to do safe read and write accesses

Introduce two helper functions to safely read and write unsigned long
values from or to memory when the access may fault because the mapping
is non-present or read-only.

These helpers can be used instead of open coded uses of __get_user()
and __put_user() avoiding the need to do casts to fix sparse warnings.

Use the helpers in page.h and p2m.c. This will fix the sparse
warnings when doing "make C=1".
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>

xen: Speed up set_phys_to_machine() by using read-only mappings

Instead of checking at each call of set_phys_to_machine() whether a
new p2m page has to be allocated due to writing an entry in a large
invalid or identity area, just map those areas read only and react
to a page fault on write by allocating the new page.

This change will make the common path with no allocation much
faster as it only requires a single write of the new mfn instead
of walking the address translation tables and checking for the
special cases.
Suggested-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>

xen: switch to linear virtual mapped sparse p2m list

At start of the day the Xen hypervisor presents a contiguous mfn list
to a pv-domain. In order to support sparse memory this mfn list is
accessed via a three level p2m tree built early in the boot process.
Whenever the system needs the mfn associated with a pfn this tree is
used to find the mfn.

Instead of using a software walked tree for accessing a specific mfn
list entry this patch is creating a virtual address area for the
entire possible mfn list including memory holes. The holes are
covered by mapping a pre-defined  page consisting only of "invalid
mfn" entries. Access to a mfn entry is possible by just using the
virtual base address of the mfn list and the pfn as index into that
list. This speeds up the (hot) path of determining the mfn of a
pfn.

Kernel build on a Dell Latitude E6440 (2 cores, HT) in 64 bit Dom0
showed following improvements:

Elapsed time: 32:50 ->  32:35
System:       18:07 ->  17:47
User:        104:00 -> 103:30

Tested with following configurations:
- 64 bit dom0, 8GB RAM
- 64 bit dom0, 128 GB RAM, PCI-area above 4 GB
- 32 bit domU, 512 MB, 8 GB, 43 GB (more wouldn't work even without
                                    the patch)
- 32 bit domU, ballooning up and down
- 32 bit domU, save and restore
- 32 bit domU with PCI passthrough
- 64 bit domU, 8 GB, 2049 MB, 5000 MB
- 64 bit domU, ballooning up and down
- 64 bit domU, save and restore
- 64 bit domU with PCI passthrough
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>

xen: Hide get_phys_to_machine() to be able to tune common path

Today get_phys_to_machine() is always called when the mfn for a pfn
is to be obtained. Add a wrapper __pfn_to_mfn() as inline function
to be able to avoid calling get_phys_to_machine() when possible as
soon as the switch to a linear mapped p2m list has been done.
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>

x86: Introduce function to get pmd entry pointer

Introduces lookup_pmd_address() to get the address of the pmd entry
related to a virtual address in the current address space. This
function is needed for support of a virtual mapped sparse p2m list
in xen pv domains, as we need the address of the pmd entry, not the
one of the pte in that case.
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>

xen: Delay invalidating extra memory

When the physical memory configuration is initialized the p2m entries
for not pouplated memory pages are set to "invalid". As those pages
are beyond the hypervisor built p2m list the p2m tree has to be
extended.

This patch delays processing the extra memory related p2m entries
during the boot process until some more basic memory management
functions are callable. This removes the need to create new p2m
entries until virtual memory management is available.
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>

xen: Delay m2p_override initialization

The m2p overrides are used to be able to find the local pfn for a
foreign mfn mapped into the domain. They are used by driver backends
having to access frontend data.

As this functionality isn't used in early boot it makes no sense to
initialize the m2p override functions very early. It can be done
later without doing any harm, removing the need for allocating memory
via extend_brk().

While at it make some m2p override functions static as they are only
used internally.
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>

xen: Delay remapping memory of pv-domain

Early in the boot process the memory layout of a pv-domain is changed
to match the E820 map (either the host one for Dom0 or the Xen one)
regarding placement of RAM and PCI holes. This requires removing memory
pages initially located at positions not suitable for RAM and adding
them later at higher addresses where no restrictions apply.

To be able to operate on the hypervisor supported p2m list until a
virtual mapped linear p2m list can be constructed, remapping must
be delayed until virtual memory management is initialized, as the
initial p2m list can't be extended unlimited at physical memory
initialization time due to it's fixed structure.

A further advantage is the reduction in complexity and code volume as
we don't have to be careful regarding memory restrictions during p2m
updates.
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>

xen: use common page allocation function in p2m.c

In arch/x86/xen/p2m.c three different allocation functions for
obtaining a memory page are used: extend_brk(), alloc_bootmem_align()
or __get_free_page().  Which of those functions is used depends on the
progress of the boot process of the system.

Introduce a common allocation routine selecting the to be called
allocation routine dynamically based on the boot progress. This allows
moving initialization steps without having to care about changing
allocation calls.
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>

xen: Make functions static

Some functions in arch/x86/xen/p2m.c are used locally only. Make them
static. Rearrange the functions in p2m.c to avoid forward declarations.
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>

xen: fix some style issues in p2m.c

The source arch/x86/xen/p2m.c has some coding style issues. Fix them.
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>

xen/pci: Use APIC directly when APIC virtualization hardware is available

When hardware supports APIC/x2APIC virtualization we don't need to use
pirqs for MSI handling and instead use APIC since most APIC accesses
(MMIO or MSR) will now be processed without VMEXITs.

As an example, netperf on the original code produces this profile
(collected wih 'xentrace -e 0x0008ffff -T 5'):

    342 cpu_change
    260 CPUID
  34638 HLT
  64067 INJ_VIRQ
  28374 INTR
  82733 INTR_WINDOW
     10 NPF
  24337 TRAP
 370610 vlapic_accept_pic_intr
 307528 VMENTRY
 307527 VMEXIT
 140998 VMMCALL
    127 wrap_buffer

After applying this patch the same test shows

    230 cpu_change
    260 CPUID
  36542 HLT
    174 INJ_VIRQ
  27250 INTR
    222 INTR_WINDOW
     20 NPF
  24999 TRAP
 381812 vlapic_accept_pic_intr
 166480 VMENTRY
 166479 VMEXIT
  77208 VMMCALL
     81 wrap_buffer

ApacheBench results (ab -n 10000 -c 200) improve by about 10%
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>

xen/pci: Defer initialization of MSI ops on HVM guests

If the hardware supports APIC virtualization we may decide not to use
pirqs and instead use APIC/x2APIC directly, meaning that we don't want
to set x86_msi.setup_msi_irqs and x86_msi.teardown_msi_irq to
Xen-specific routines.  However, x2APIC is not set up by the time
pci_xen_hvm_init() is called so we need to postpone setting these ops
until later, when we know which APIC mode is used.

(Note that currently x2APIC is never initialized on HVM guests. This
may change in the future)
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>

xen-pciback: drop SR-IOV VFs when PF driver unloads

When a PF driver unloads, it may find it necessary to leave the VFs
around simply because of pciback having marked them as assigned to a
guest. Utilize a suitable notification to let go of the VFs, thus
allowing the PF to go back into the state it was before its driver
loaded (which in particular allows the driver to be loaded again with
it being able to create the VFs anew, but which also allows to then
pass through the PF instead of the VFs).

Don't do this however for any VFs currently in active use by a guest.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
[v2: Removed the switch statement, moved it about]
[v3: Redid it a bit differently]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>

xen/pciback: Restore configuration space when detaching from a guest.

The commit "xen/pciback: Don't deadlock when unbinding." was using
the version of pci_reset_function which would lock the device lock.
That is no good as we can dead-lock. As such we swapped to using
the lock-less version and requiring that the callers
of 'pcistub_put_pci_dev' take the device lock. And as such
this bug got exposed.

Using the lock-less version is  OK, except that we tried to
use 'pci_restore_state' after the lock-less version of
__pci_reset_function_locked - which won't work as 'state_saved'
is set to false. Said 'state_saved' is a toggle boolean that
is to be used by the sequence of a) pci_save_state/pci_restore_state
or b) pci_load_and_free_saved_state/pci_restore_state. We don't
want to use a) as the guest might have messed up the PCI
configuration space and we want it to revert to the state
when the PCI device was binded to us. Therefore we pick
b) to restore the configuration space.

We restore from our 'golden' version of PCI configuration space, when an:
 - Device is unbinded from pciback
 - Device is detached from a guest.
Reported-by: Sander Eikelenboom <linux@eikelenboom.it>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>

PCI: Expose pci_load_saved_state for public consumption.

We have the pci_load_and_free_saved_state, and pci_store_saved_state
but are missing the functionality to just load the state
multiple times in the PCI device without having to free/save
the state.

This patch makes it possible to use this function.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>

xen/pciback: Remove tons of dereferences

A little cleanup. No functional difference.
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>

xen/pciback: Print out the domain owning the device.

We had been printing it only if the device was built with
debug enabled. But this information is useful in the field
to troubleshoot.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>

xen/pciback: Include the domain id if removing the device whilst still in use

Cleanup the function a bit - also include the id of the
domain that is using the device.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>

driver core: Provide an wrapper around the mutex to do lockdep warnings

Instead of open-coding it in drivers that want to double check
that their functions are indeed holding the device lock.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Suggested-by: David Vrabel <david.vrabel@citrix.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>

xen/pciback: Don't deadlock when unbinding.

As commit 0a9fd015
'xen/pciback: Document the entry points for 'pcistub_put_pci_dev''
explained there are four entry points in this function.
Two of them are when the user fiddles in the SysFS to
unbind a device which might be in use by a guest or not.

Both 'unbind' states will cause a deadlock as the the PCI lock has
already been taken, which then pci_device_reset tries to take.

We can simplify this by requiring that all callers of
pcistub_put_pci_dev MUST hold the device lock. And then
we can just call the lockless version of pci_device_reset.

To make it even simpler we will modify xen_pcibk_release_pci_dev
to quality whether it should take a lock or not - as it ends
up calling xen_pcibk_release_pci_dev and needs to hold the lock.
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>

swiotlb-xen: pass dev_addr to swiotlb_tbl_unmap_single

Need to pass the pointer within the swiotlb internal buffer to the
swiotlb library, that in the case of xen_unmap_single is dev_addr, not
paddr.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
CC: stable@vger.kernel.org

(cherry picked from commit dbdd7476
4ef8e3f3
2c3fc8d2)
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

645d9f8c

can: dev: fix crtlmode_supported check · 17099e5f

Oliver Hartkopp authored Jan 05, 2015

When changing flags in the CAN drivers ctrlmode the provided new content has to
be checked whether the bits are allowed to be changed. The bits that are to be
changed are given as a bitfield in cm->mask. Therefore checking against
cm->flags is wrong as the content can hold any kind of values.

The iproute2 tool sets the bits in cm->mask and cm->flags depending on the
detected command line options. To be robust against bogus user space
applications additionally sanitize the provided flags with the provided mask.

Cc: Wolfgang Grandegger <wg@grandegger.com>
Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Cc: linux-stable <stable@vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>

(cherry picked from commit 9b1087aa)
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

17099e5f

ARM: omap5/dra7xx: Fix frequency typos · 1d2a3a05

Lennart Sorensen authored Jan 05, 2015

The switch statement of the possible list of SYSCLK1 frequencies is
missing a 0 in 4 out of the 7 frequencies.

Fixes: fa6d79d2 ("ARM: OMAP: Add initialisation for the real-time counter")
Cc: stable@vger.kernel.org # v3.7+
Signed-off-by: Len Sorensen <lsorense@csclub.uwaterloo.ca>
Reviewed-by: Lokesh Vutla <lokeshvutla@ti.com>
Acked-by: Nishanth Menon <nm@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>

(cherry picked from commit 572b24e6)
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

1d2a3a05

USB: EHCI: fix initialization bug in iso_stream_schedule() · 7d14c9d1

Alan Stern authored Dec 04, 2014

Commit c3ee9b76 (EHCI: improved logic for isochronous scheduling)
introduced the idea of using ehci->last_iso_frame as the origin (or
base) for the circular calculations involved in modifying the
isochronous schedule.  However, the new code it added used
ehci->last_iso_frame before the value was properly initialized.  This
patch rectifies the mistake by moving the initialization lines earlier
in iso_stream_schedule().

This fixes Bugzilla #72891.
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Fixes: c3ee9b76Reported-by: Joe Bryant <tenminjoe@yahoo.com>
Tested-by: Joe Bryant <tenminjoe@yahoo.com>
Tested-by: Martin Long <martin@longhome.co.uk>
CC: <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

(cherry picked from commit 6d89252a)
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

7d14c9d1

USB: console: fix potential use after free · 8b3e40c4

Johan Hovold authored Jan 05, 2015

Use tty kref to release the fake tty in usb_console_setup to avoid use
after free if the underlying serial driver has acquired a reference.

Note that using the tty destructor release_one_tty requires some more
state to be initialised.

Fixes: 4a90f09b ("tty: usb-serial krefs")
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Johan Hovold <johan@kernel.org>

(cherry picked from commit 32a4bf2e)
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

8b3e40c4

gpiolib: of: Correct error handling in of_get_named_gpiod_flags · 625bb65d

Hans Holmberg authored Jan 09, 2015

of_get_named_gpiod_flags fails with -EPROBE_DEFER in cases
where the gpio chip is available and the GPIO translation fails.

This causes drivers to be re-probed erroneusly, and hides the
real problem(i.e. the GPIO number being out of range).

Cc: Stable <stable@vger.kernel.org>
Signed-off-by: Hans Holmberg <hans.holmberg@intel.com>
Reviewed-by: Alexandre Courbot <acourbot@nvidia.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>

(cherry picked from commit 7b8792bb)
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

625bb65d

ipvs: correct usage/allocation of seqadj ext in ipvs · 683a0be8

Jesper Dangaard Brouer authored Jan 26, 2015

The IPVS FTP helper ip_vs_ftp could trigger an OOPS in nf_ct_seqadj_set,
after commit 41d73ec0 (netfilter: nf_conntrack: make sequence number
adjustments usuable without NAT).

This is because, the seqadj ext is now allocated dynamically, and the
IPVS code didn't handle this situation.  Fix this in the IPVS nfct
code by invoking the alloc function nfct_seqadj_ext_add().

Fixes: 41d73ec0 (netfilter: nf_conntrack: make sequence number adjustments usuable without NAT)
Suggested-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Simon Horman <horms@verge.net.au>

(cherry picked from commit b25adce1)
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

683a0be8

drm/i915: Force the CS stall for invalidate flushes · ab7cdcf3

Chris Wilson authored Dec 16, 2014

In order to act as a full command barrier by itself, we need to tell the
pipecontrol to actually stall the command streamer while the flush runs.
We require the full command barrier before operations like
MI_SET_CONTEXT, which currently rely on a prior invalidate flush.

References: https://bugs.freedesktop.org/show_bug.cgi?id=83677
Cc: Simon Farnsworth <simon@farnz.org.uk>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: stable@vger.kernel.org
Signed-off-by: Jani Nikula <jani.nikula@intel.com>

(cherry picked from commit add284a3)
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

ab7cdcf3

ACPI / osl: speedup grace period in acpi_os_map_cleanup · 607b6d58

Konstantin Khlebnikov authored Nov 09, 2014

ACPI maintains cache of ioremap regions to speed up operations and
access to them from irq context where ioremap() calls aren't allowed.
This code abuses synchronize_rcu() on unmap path for synchronization
with fast-path in acpi_os_read/write_memory which uses this cache.

Since v3.10 CPUs are allowed to enter idle state even if they have RCU
callbacks queued, see commit c0f4dfd4
("rcu: Make RCU_FAST_NO_HZ take advantage of numbered callbacks").
That change caused problems with nvidia proprietary driver which calls
acpi_os_map/unmap_generic_address several times during initialization.
Each unmap calls synchronize_rcu and adds significant delay. Totally
initialization is slowed for a couple of seconds and that is enough to
trigger timeout in hardware, gpu decides to "fell off the bus". Widely
spread workaround is reducing "rcu_idle_gp_delay" from 4 to 1 jiffy.

This patch replaces synchronize_rcu() with synchronize_rcu_expedited()
which is much faster.

Link: https://devtalk.nvidia.com/default/topic/567297/linux/linux-3-10-driver-crash/Signed-off-by: Konstantin Khlebnikov <koct9i@gmail.com>
Reported-and-tested-by: Alexander Monakov <amonakov@gmail.com>
Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

(cherry picked from commit 74b51ee1)
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

607b6d58

umount: Disallow unprivileged mount force · ad582b5a

Eric W. Biederman authored Oct 04, 2014

Forced unmount affects not just the mount namespace but the underlying
superblock as well.  Restrict forced unmount to the global root user
for now.  Otherwise it becomes possible a user in a less privileged
mount namespace to force the shutdown of a superblock of a filesystem
in a more privileged mount namespace, allowing a DOS attack on root.

Cc: stable@vger.kernel.org
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>

(cherry picked from commit b2f5d4dc)
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

ad582b5a

crypto: crc32c - add missing crypto module alias · b4a6b1f5

Mathias Krause authored Feb 10, 2015

The backport of commit 5d26a105 ("crypto: prefix module autoloading
with "crypto-"") lost the MODULE_ALIAS_CRYPTO() annotation of crc32c.c.
Add it to fix the reported filesystem related regressions.
Signed-off-by: Mathias Krause <minipli@googlemail.com>
Reported-by: Philip Müller <philm@manjaro.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Rob McCathie <rob@manjaro.org>
Cc: Luis Henriques <luis.henriques@canonical.com>
Cc: Kamal Mostafa <kamal@canonical.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

(cherry picked from commit 28e24c6d)

(cherry picked from commit HEAD)
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

b4a6b1f5

x86,kvm,vmx: Preserve CR4 across VM entry · a635abde

Andy Lutomirski authored Oct 08, 2014

CR4 isn't constant; at least the TSD and PCE bits can vary.

TBH, treating CR0 and CR3 as constant scares me a bit, too, but it looks
like it's correct.

This adds a branch and a read from cr4 to each vm entry.  Because it is
extremely likely that consecutive entries into the same vcpu will have
the same host cr4 value, this fixes up the vmcs instead of restoring cr4
after the fact.  A subsequent patch will add a kernel-wide cr4 shadow,
reducing the overhead in the common case to just two memory reads and a
branch.
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Cc: stable@vger.kernel.org
Cc: Petr Matousek <pmatouse@redhat.com>
Cc: Gleb Natapov <gleb@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

(cherry picked from commit d974baa3)
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

a635abde

lib/checksum.c: fix build for generic csum_tcpudp_nofold · 021d269d

karl beldan authored Jan 29, 2015

Fixed commit added from64to32 under _#ifndef do_csum_ but used it
under _#ifndef csum_tcpudp_nofold_, breaking some builds (Fengguang's
robot reported TILEGX's). Move from64to32 under the latter.

Fixes: 150ae0e9 ("lib/checksum.c: fix carry in csum_tcpudp_nofold")
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Karl Beldan <karl.beldan@rivierawaves.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

(cherry picked from commit 9ce35779)
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

021d269d

ext4: prevent bugon on race between write/fcntl · 1a567928

Dmitry Monakhov authored Oct 30, 2014

O_DIRECT flags can be toggeled via fcntl(F_SETFL). But this value checked
twice inside ext4_file_write_iter() and __generic_file_write() which
result in BUG_ON inside ext4_direct_IO.

Let's initialize iocb->private unconditionally.

TESTCASE: xfstest:generic/036  https://patchwork.ozlabs.org/patch/402445/

#TYPICAL STACK TRACE:
kernel BUG at fs/ext4/inode.c:2960!
invalid opcode: 0000 [#1] SMP
Modules linked in: brd iTCO_wdt lpc_ich mfd_core igb ptp dm_mirror dm_region_hash dm_log dm_mod
CPU: 6 PID: 5505 Comm: aio-dio-fcntl-r Not tainted 3.17.0-rc2-00176-gff5c017 #161
Hardware name: Intel Corporation W2600CR/W2600CR, BIOS SE5C600.86B.99.99.x028.061320111235 06/13/2011
task: ffff88080e95a7c0 ti: ffff88080f908000 task.ti: ffff88080f908000
RIP: 0010:[<ffffffff811fabf2>]  [<ffffffff811fabf2>] ext4_direct_IO+0x162/0x3d0
RSP: 0018:ffff88080f90bb58  EFLAGS: 00010246
RAX: 0000000000000400 RBX: ffff88080fdb2a28 RCX: 00000000a802c818
RDX: 0000040000080000 RSI: ffff88080d8aeb80 RDI: 0000000000000001
RBP: ffff88080f90bbc8 R08: 0000000000000000 R09: 0000000000001581
R10: 0000000000000000 R11: 0000000000000000 R12: ffff88080d8aeb80
R13: ffff88080f90bbf8 R14: ffff88080fdb28c8 R15: ffff88080fdb2a28
FS:  00007f23b2055700(0000) GS:ffff880818400000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f23b2045000 CR3: 000000080cedf000 CR4: 00000000000407e0
Stack:
 ffff88080f90bb98 0000000000000000 7ffffffffffffffe ffff88080fdb2c30
 0000000000000200 0000000000000200 0000000000000001 0000000000000200
 ffff88080f90bbc8 ffff88080fdb2c30 ffff88080f90be08 0000000000000200
Call Trace:
 [<ffffffff8112ca9d>] generic_file_direct_write+0xed/0x180
 [<ffffffff8112f2b2>] __generic_file_write_iter+0x222/0x370
 [<ffffffff811f495b>] ext4_file_write_iter+0x34b/0x400
 [<ffffffff811bd709>] ? aio_run_iocb+0x239/0x410
 [<ffffffff811bd709>] ? aio_run_iocb+0x239/0x410
 [<ffffffff810990e5>] ? local_clock+0x25/0x30
 [<ffffffff810abd94>] ? __lock_acquire+0x274/0x700
 [<ffffffff811f4610>] ? ext4_unwritten_wait+0xb0/0xb0
 [<ffffffff811bd756>] aio_run_iocb+0x286/0x410
 [<ffffffff810990e5>] ? local_clock+0x25/0x30
 [<ffffffff810ac359>] ? lock_release_holdtime+0x29/0x190
 [<ffffffff811bc05b>] ? lookup_ioctx+0x4b/0xf0
 [<ffffffff811bde3b>] do_io_submit+0x55b/0x740
 [<ffffffff811bdcaa>] ? do_io_submit+0x3ca/0x740
 [<ffffffff811be030>] SyS_io_submit+0x10/0x20
 [<ffffffff815ce192>] system_call_fastpath+0x16/0x1b
Code: 01 48 8b 80 f0 01 00 00 48 8b 18 49 8b 45 10 0f 85 f1 01 00 00 48 03 45 c8 48 3b 43 48 0f 8f e3 01 00 00 49 83 7c
24 18 00 75 04 <0f> 0b eb fe f0 ff 83 ec 01 00 00 49 8b 44 24 18 8b 00 85 c0 89
RIP  [<ffffffff811fabf2>] ext4_direct_IO+0x162/0x3d0
 RSP <ffff88080f90bb58>
Reported-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
Cc: stable@vger.kernel.org

(cherry picked from commit a41537e6)
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

1a567928

lib/checksum.c: fix build for generic csum_tcpudp_nofold · fe7144fe

karl beldan authored Jan 28, 2015

Fixed commit added from64to32 under _#ifndef do_csum_ but used it
under _#ifndef csum_tcpudp_nofold_, breaking some builds (Fengguang's
robot reported TILEGX's). Move from64to32 under the latter.

Fixes: 150ae0e9 ("lib/checksum.c: fix carry in csum_tcpudp_nofold")
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Karl Beldan <karl.beldan@rivierawaves.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

tcp: ipv4: initialize unicast_sock sk_pacing_rate

When I added sk_pacing_rate field, I forgot to initialize its value
in the per cpu unicast_sock used in ip_send_unicast_reply()

This means that for sch_fq users, RST packets, or ACK packets sent
on behalf of TIME_WAIT sockets might be sent to slowly or even dropped
once we reach the per flow limit.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Fixes: 95bd09eb ("tcp: TSO packets automatic sizing")
Signed-off-by: David S. Miller <davem@davemloft.net>

lib/checksum.c: fix carry in csum_tcpudp_nofold

The carry from the 64->32bits folding was dropped, e.g with:
saddr=0xFFFFFFFF daddr=0xFF0000FF len=0xFFFF proto=0 sum=1,
csum_tcpudp_nofold returned 0 instead of 1.
Signed-off-by: Karl Beldan <karl.beldan@rivierawaves.com>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Mike Frysinger <vapier@gentoo.org>
Cc: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: stable@vger.kernel.org
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

(cherry picked from commit 9ce35779
150ae0e9)
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

fe7144fe

gpio: sysfs: fix memory leak in gpiod_sysfs_set_active_low · 40a4f586

Johan Hovold authored Jan 26, 2015

Fix memory leak in the gpio sysfs interface due to failure to drop
reference to device returned by class_find_device when setting the
gpio-line polarity.

Fixes: 07697461 ("gpiolib: add support for changing value polarity
in sysfs")
Cc: stable <stable@vger.kernel.org>	# v2.6.33
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>

(cherry picked from commit 49d2ca84)
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

40a4f586

gpio: sysfs: fix memory leak in gpiod_export_link · ce307097

Johan Hovold authored Jan 26, 2015

Fix memory leak in the gpio sysfs interface due to failure to drop
reference to device returned by class_find_device when creating a link.

Fixes: a4177ee7 ("gpiolib: allow exported GPIO nodes to be named
using sysfs links")
Cc: stable <stable@vger.kernel.org>	# v2.6.32
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>

(cherry picked from commit 0f303db0)
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

ce307097

target: Drop arbitrary maximum I/O size limit · 9928994f

Nicholas Bellinger authored Jan 06, 2015

This patch drops the arbitrary maximum I/O size limit in sbc_parse_cdb(),
which currently for fabric_max_sectors is hardcoded to 8192 (4 MB for 512
byte sector devices), and for hw_max_sectors is a backend driver dependent
value.

This limit is problematic because Linux initiators have only recently
started to honor block limits MAXIMUM TRANSFER LENGTH, and other non-Linux
based initiators (eg: MSFT Fibre Channel) can also generate I/Os larger
than 4 MB in size.

Currently when this happens, the following message will appear on the
target resulting in I/Os being returned with non recoverable status:

  SCSI OP 28h with too big sectors 16384 exceeds fabric_max_sectors: 8192

Instead, drop both [fabric,hw]_max_sector checks in sbc_parse_cdb(),
and convert the existing hw_max_sectors into a purely informational
attribute used to represent the granuality that backend driver and/or
subsystem code is splitting I/Os upon.

Also, update FILEIO with an explicit FD_MAX_BYTES check in fd_execute_rw()
to deal with the one special iovec limitiation case.

v2 changes:
  - Drop hw_max_sectors check in sbc_parse_cdb()
Reported-by: Lance Gropper <lance.gropper@qosserver.com>
Reported-by: Stefan Priebe <s.priebe@profihost.ag>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Roland Dreier <roland@purestorage.com>
Cc: stable@vger.kernel.org # 3.4
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>

(cherry picked from commit 046ba642)
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

9928994f

pstore: Fix NULL pointer fault if get NULL prz in ramoops_get_next_prz · 4592cd41

Liu ShuoX authored Mar 17, 2014

ramoops_get_next_prz get the prz according the paramters. If it get a
uninitialized prz, access its members by following persistent_ram_old_size(prz)
will cause a NULL pointer crash.
Ex: if ftrace_size is 0, fprz will be NULL.

Fix it by return NULL in advance.
Signed-off-by: Liu ShuoX <shuox.liu@intel.com>
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Tony Luck <tony.luck@intel.com>

(cherry picked from commit b0aa931f)
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>

4592cd41