Commits · 2512f298cb9886e06938e761c9e924c8448d9ab8 · nexedi / linux

15 Jan, 2013 6 commits

xen/gntdev: fix unsafe vma access · 2512f298

Daniel De Graaf authored Jan 02, 2013

In gntdev_ioctl_get_offset_for_vaddr, we need to hold mmap_sem while
calling find_vma() to avoid potentially having the result freed out from
under us. Similarly, the MMU notifier functions need to synchronize with
gntdev_vma_close to avoid map->vma being freed during their iteration.
Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

2512f298

xen/privcmd: Fix mmap batch ioctl. · 99beae6c

Andres Lagar-Cavilla authored Jan 14, 2013

1. If any individual mapping error happens, the V1 case will mark *all*
operations as failed. Fixed.

2. The err_array was allocated with kcalloc, resulting in potentially O(n) page
allocations. Refactor code to not use this array.
Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

99beae6c

Merge tag 'v3.7' into stable/for-linus-3.8 · 7bcc1ec0

Konrad Rzeszutek Wilk authored Jan 15, 2013

Linux 3.7

* tag 'v3.7': (833 commits)
  Linux 3.7
  Input: matrix-keymap - provide proper module license
  Revert "revert "Revert "mm: remove __GFP_NO_KSWAPD""" and associated damage
  ipv4: ip_check_defrag must not modify skb before unsharing
  Revert "mm: avoid waking kswapd for THP allocations when compaction is deferred or contended"
  inet_diag: validate port comparison byte code to prevent unsafe reads
  inet_diag: avoid unsafe and nonsensical prefix matches in inet_diag_bc_run()
  inet_diag: validate byte code to prevent oops in inet_diag_bc_run()
  inet_diag: fix oops for IPv4 AF_INET6 TCP SYN-RECV state
  mm: vmscan: fix inappropriate zone congestion clearing
  vfs: fix O_DIRECT read past end of block device
  net: gro: fix possible panic in skb_gro_receive()
  tcp: bug fix Fast Open client retransmission
  tmpfs: fix shared mempolicy leak
  mm: vmscan: do not keep kswapd looping forever due to individual uncompactable zones
  mm: compaction: validate pfn range passed to isolate_freepages_block
  mmc: sh-mmcif: avoid oops on spurious interrupts (second try)
  Revert misapplied "mmc: sh-mmcif: avoid oops on spurious interrupts"
  mmc: sdhci-s3c: fix missing clock for gpio card-detect
  lib/Makefile: Fix oid_registry build dependency
  ...
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

Conflicts:
	arch/arm/xen/enlighten.c
	drivers/xen/Makefile

[We need to have the v3.7 base as the 'for-3.8' was based off v3.7-rc3
and there are some patches in v3.7-rc6 that we to have in our branch]

7bcc1ec0

Xen: properly bound buffer access when parsing cpu/*/availability · e5c702d3

Jan Beulich authored Jan 15, 2013

At the same time reduce the local buffers to 16 bytes each.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

e5c702d3

xen/grant-table: correctly initialize grant table version 1 · d0b4d64a

Matt Wilson authored Jan 15, 2013

Commit 85ff6acb (xen/granttable: Grant
tables V2 implementation) changed the GREFS_PER_GRANT_FRAME macro from
a constant to a conditional expression. The expression depends on
grant_table_version being appropriately set. Unfortunately, at init
time grant_table_version will be 0. The GREFS_PER_GRANT_FRAME
conditional expression checks for "grant_table_version == 1", and
therefore returns the number of grant references per frame for v2.

This causes gnttab_init() to allocate fewer pages for gnttab_list, as
a frame can old half the number of v2 entries than v1 entries. After
gnttab_resume() is called, grant_table_version is appropriately
set. nr_init_grefs will then be miscalculated and gnttab_free_count
will hold a value larger than the actual number of free gref entries.

If a guest is heavily utilizing improperly initialized v1 grant
tables, memory corruption can occur. One common manifestation is
corruption of the vmalloc list, resulting in a poisoned pointer
derefrence when accessing /proc/meminfo or /proc/vmallocinfo:

[   40.770064] BUG: unable to handle kernel paging request at 0000200200001407
[   40.770083] IP: [<ffffffff811a6fb0>] get_vmalloc_info+0x70/0x110
[   40.770102] PGD 0
[   40.770107] Oops: 0000 [#1] SMP
[   40.770114] CPU 10

This patch introduces a static variable, grefs_per_grant_frame, to
cache the calculated value. gnttab_init() now calls
gnttab_request_version() early so that grant_table_version and
grefs_per_grant_frame can be appropriately set. A few BUG_ON()s have
been added to prevent this type of bug from reoccurring in the future.
Signed-off-by: Matt Wilson <msw@amazon.com>
Reviewed-and-Tested-by: Steven Noonan <snoonan@amazon.com>
Acked-by: Ian Campbell <Ian.Campbell@citrix.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Annie Li <annie.li@oracle.com>
Cc: xen-devel@lists.xen.org
Cc: linux-kernel@vger.kernel.org
Cc: stable@vger.kernel.org # v3.3 and newer
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

d0b4d64a

x86/xen : Fix the wrong check in pciback · 6337a239

Yang Zhang authored Nov 22, 2012

Fix the wrong check in pciback.
Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

6337a239

11 Jan, 2013 1 commit

xen/privcmd: Relax access control in privcmd_ioctl_mmap · 30d4b180

Tamas Lengyel authored Dec 31, 2012

In the privcmd Linux driver two checks in the functions
privcmd_ioctl_mmap and privcmd_ioctl_mmap_batch are not needed as they
are trying to enforce hypervisor-level access control.  They should be
removed as they break secondary control domains when performing dom0
disaggregation. Xen itself provides adequate security controls around
these hypercalls and these checks prevent those controls from
functioning as intended.
Signed-off-by: Tamas K Lengyel <tamas.lengyel@zentific.com>
Cc: Daniel De Graaf <dgdegra@tycho.nsa.gov>
[v1: Fixed up the patch and commit description]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

30d4b180

18 Dec, 2012 3 commits

xen/vcpu: Fix vcpu restore path. · 9d328a94

Wei Liu authored Dec 13, 2012

The runstate of vcpu should be restored for all possible cpus, as well as the
vcpu info placement.
Acked-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

9d328a94

xen: Add EVTCHNOP_reset in Xen interface header files. · cc31fd9c

Wei Liu authored Dec 13, 2012

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

cc31fd9c

xen/smp: Use smp_store_boot_cpu_info() to store cpu info for BSP during boot time. · 06d0b5d9

Konrad Rzeszutek Wilk authored Dec 17, 2012

Git commit 30106c17
("x86, hotplug: Support functions for CPU0 online/offline") alters what
the call to smp_store_cpu_info() does. For BSP we should use the
smp_store_boot_cpu_info() and for secondary CPU's the old
variant of smp_store_cpu_info() should be used. This fixes
the regression introduced by said commit.
Reported-and-Tested-by: Sander Eikelenboom <linux@eikelenboom.it>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

06d0b5d9

11 Dec, 2012 3 commits

Linux 3.7 · 29594404
Linus Torvalds authored Dec 10, 2012

29594404

Input: matrix-keymap - provide proper module license · 55220bb3

Florian Fainelli authored Dec 10, 2012

The matrix-keymap module is currently lacking a proper module license,
add one so we don't have this module tainting the entire kernel.  This
issue has been present since commit 1932811f ("Input: matrix-keymap
- uninline and prepare for device tree support")
Signed-off-by: Florian Fainelli <florian@openwrt.org>
CC: stable@vger.kernel.org # v3.5+
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

55220bb3

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 2c68bc72

Linus Torvalds authored Dec 10, 2012

Pull networking fixes from David Miller:

 1) Netlink socket dumping had several missing verifications and checks.

    In particular, address comparisons in the request byte code
    interpreter could access past the end of the address in the
    inet_request_sock.

    Also, address family and address prefix lengths were not validated
    properly at all.

    This means arbitrary applications can read past the end of certain
    kernel data structures.

    Fixes from Neal Cardwell.

 2) ip_check_defrag() operates in contexts where we're in the process
    of, or about to, input the packet into the real protocols
    (specifically macvlan and AF_PACKET snooping).

    Unfortunately, it does a pskb_may_pull() which can modify the
    backing packet data which is not legal if the SKB is shared.  It
    very much can be shared in this context.

    Deal with the possibility that the SKB is segmented by using
    skb_copy_bits().

    Fix from Johannes Berg based upon a report by Eric Leblond.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
  ipv4: ip_check_defrag must not modify skb before unsharing
  inet_diag: validate port comparison byte code to prevent unsafe reads
  inet_diag: avoid unsafe and nonsensical prefix matches in inet_diag_bc_run()
  inet_diag: validate byte code to prevent oops in inet_diag_bc_run()
  inet_diag: fix oops for IPv4 AF_INET6 TCP SYN-RECV state

2c68bc72

10 Dec, 2012 4 commits

Revert "revert "Revert "mm: remove __GFP_NO_KSWAPD""" and associated damage · caf49191

Linus Torvalds authored Dec 10, 2012

This reverts commits a5091539 and
d7c3b937.

This is a revert of a revert of a revert.  In addition, it reverts the
even older i915 change to stop using the __GFP_NO_KSWAPD flag due to the
original commits in linux-next.

It turns out that the original patch really was bogus, and that the
original revert was the correct thing to do after all.  We thought we
had fixed the problem, and then reverted the revert, but the problem
really is fundamental: waking up kswapd simply isn't the right thing to
do, and direct reclaim sometimes simply _is_ the right thing to do.

When certain allocations fail, we simply should try some direct reclaim,
and if that fails, fail the allocation.  That's the right thing to do
for THP allocations, which can easily fail, and the GPU allocations want
to do that too.

So starting kswapd is sometimes simply wrong, and removing the flag that
said "don't start kswapd" was a mistake.  Let's hope we never revisit
this mistake again - and certainly not this many times ;)
Acked-by: Mel Gorman <mgorman@suse.de>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

caf49191

ipv4: ip_check_defrag must not modify skb before unsharing · 1bf3751e

Johannes Berg authored Dec 09, 2012

ip_check_defrag() might be called from af_packet within the
RX path where shared SKBs are used, so it must not modify
the input SKB before it has unshared it for defragmentation.
Use skb_copy_bits() to get the IP header and only pull in
everything later.

The same is true for the other caller in macvlan as it is
called from dev->rx_handler which can also get a shared SKB.
Reported-by: Eric Leblond <eric@regit.org>
Cc: stable@vger.kernel.org
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

1bf3751e

Revert "mm: avoid waking kswapd for THP allocations when compaction is deferred or contended" · 31f8d42d

Linus Torvalds authored Dec 10, 2012

This reverts commit 782fd304.

We are going to reinstate the __GFP_NO_KSWAPD flag that has been
removed, the removal reverted, and then removed again.  Making this
commit a pointless fixup for a problem that was caused by the removal of
__GFP_NO_KSWAPD flag.

The thing is, we really don't want to wake up kswapd for THP allocations
(because they fail quite commonly under any kind of memory pressure,
including when there is tons of memory free), and these patches were
just trying to fix up the underlying bug: the original removal of
__GFP_NO_KSWAPD in commit c6543459 ("mm: remove __GFP_NO_KSWAPD")
was simply bogus.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

31f8d42d

inet_diag: validate port comparison byte code to prevent unsafe reads · 5e1f5420

Neal Cardwell authored Dec 09, 2012

Add logic to verify that a port comparison byte code operation
actually has the second inet_diag_bc_op from which we read the port
for such operations.

Previously the code blindly referenced op[1] without first checking
whether a second inet_diag_bc_op struct could fit there. So a
malicious user could make the kernel read 4 bytes beyond the end of
the bytecode array by claiming to have a whole port comparison byte
code (2 inet_diag_bc_op structs) when in fact the bytecode was not
long enough to hold both.
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

5e1f5420

09 Dec, 2012 3 commits

inet_diag: avoid unsafe and nonsensical prefix matches in inet_diag_bc_run() · f67caec9

Neal Cardwell authored Dec 08, 2012

Add logic to check the address family of the user-supplied conditional
and the address family of the connection entry. We now do not do
prefix matching of addresses from different address families (AF_INET
vs AF_INET6), except for the previously existing support for having an
IPv4 prefix match an IPv4-mapped IPv6 address (which this commit
maintains as-is).

This change is needed for two reasons:

(1) The addresses are different lengths, so comparing a 128-bit IPv6
prefix match condition to a 32-bit IPv4 connection address can cause
us to unwittingly walk off the end of the IPv4 address and read
garbage or oops.

(2) The IPv4 and IPv6 address spaces are semantically distinct, so a
simple bit-wise comparison of the prefixes is not meaningful, and
would lead to bogus results (except for the IPv4-mapped IPv6 case,
which this commit maintains).
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

f67caec9

inet_diag: validate byte code to prevent oops in inet_diag_bc_run() · 405c0059

Neal Cardwell authored Dec 08, 2012

Add logic to validate INET_DIAG_BC_S_COND and INET_DIAG_BC_D_COND
operations.

Previously we did not validate the inet_diag_hostcond, address family,
address length, and prefix length. So a malicious user could make the
kernel read beyond the end of the bytecode array by claiming to have a
whole inet_diag_hostcond when the bytecode was not long enough to
contain a whole inet_diag_hostcond of the given address family. Or
they could make the kernel read up to about 27 bytes beyond the end of
a connection address by passing a prefix length that exceeded the
length of addresses of the given family.
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

405c0059

inet_diag: fix oops for IPv4 AF_INET6 TCP SYN-RECV state · 1c95df85

Neal Cardwell authored Dec 08, 2012

Fix inet_diag to be aware of the fact that AF_INET6 TCP connections
instantiated for IPv4 traffic and in the SYN-RECV state were actually
created with inet_reqsk_alloc(), instead of inet6_reqsk_alloc(). This
means that for such connections inet6_rsk(req) returns a pointer to a
random spot in memory up to roughly 64KB beyond the end of the
request_sock.

With this bug, for a server using AF_INET6 TCP sockets and serving
IPv4 traffic, an inet_diag user like `ss state SYN-RECV` would lead to
inet_diag_fill_req() causing an oops or the export to user space of 16
bytes of kernel memory as a garbage IPv6 address, depending on where
the garbage inet6_rsk(req) pointed.
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

1c95df85

08 Dec, 2012 3 commits

mm: vmscan: fix inappropriate zone congestion clearing · ed23ec4f

Johannes Weiner authored Dec 06, 2012

commit c702418f ("mm: vmscan: do not keep kswapd looping forever due
to individual uncompactable zones") removed zone watermark checks from
the compaction code in kswapd but left in the zone congestion clearing,
which now happens unconditionally on higher order reclaim.

This messes up the reclaim throttling logic for zones with
dirty/writeback pages, where zones should only lose their congestion
status when their watermarks have been restored.

Remove the clearing from the zone compaction section entirely.  The
preliminary zone check and the reclaim loop in kswapd will clear it if
the zone is considered balanced.
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reviewed-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

ed23ec4f

vfs: fix O_DIRECT read past end of block device · 684c9aae

Linus Torvalds authored Dec 07, 2012

The direct-IO write path already had the i_size checks in mm/filemap.c,
but it turns out the read path did not, and removing the block size
checks in fs/block_dev.c (commit bbec0270: "blkdev_max_block: make
private to fs/buffer.c") removed the magic "shrink IO to past the end of
the device" code there.

Fix it by truncating the IO to the size of the block device, like the
write path already does.

NOTE! I suspect the write path would be *much* better off doing it this
way in fs/block_dev.c, rather than hidden deep in mm/filemap.c.  The
mm/filemap.c code is extremely hard to follow, and has various
conditionals on the target being a block device (ie the flag passed in
to 'generic_write_checks()', along with a conditional update of the
inode timestamp etc).

It is also quite possible that we should treat this whole block device
size as a "s_maxbytes" issue, and try to make the logic even more
generic.  However, in the meantime this is the fairly minimal targeted
fix.

Noted by Milan Broz thanks to a regression test for the cryptsetup
reencrypt tool.
Reported-and-tested-by: Milan Broz <mbroz@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

684c9aae

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 1b3c393c

Linus Torvalds authored Dec 07, 2012

Pull networking fixes from David Miller:
 "Two stragglers:

   1) The new code that adds new flushing semantics to GRO can cause SKB
      pointer list corruption, manage the lists differently to avoid the
      OOPS.  Fix from Eric Dumazet.

   2) When TCP fast open does a retransmit of data in a SYN-ACK or
      similar, we update retransmit state that we shouldn't triggering a
      WARN_ON later.  Fix from Yuchung Cheng."

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
  net: gro: fix possible panic in skb_gro_receive()
  tcp: bug fix Fast Open client retransmission

1b3c393c

07 Dec, 2012 3 commits

net: gro: fix possible panic in skb_gro_receive() · c3c7c254

Eric Dumazet authored Dec 06, 2012

commit 2e71a6f8 (net: gro: selective flush of packets) added
a bug for skbs using frag_list. This part of the GRO stack is rarely
used, as it needs skb not using a page fragment for their skb->head.

Most drivers do use a page fragment, but some of them use GFP_KERNEL
allocations for the initial fill of their RX ring buffer.

napi_gro_flush() overwrite skb->prev that was used for these skb to
point to the last skb in frag_list.

Fix this using a separate field in struct napi_gro_cb to point to the
last fragment.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

c3c7c254

tcp: bug fix Fast Open client retransmission · 93b174ad

Yuchung Cheng authored Dec 06, 2012

If SYN-ACK partially acks SYN-data, the client retransmits the
remaining data by tcp_retransmit_skb(). This increments lost recovery
state variables like tp->retrans_out in Open state. If loss recovery
happens before the retransmission is acked, it triggers the WARN_ON
check in tcp_fastretrans_alert(). For example: the client sends
SYN-data, gets SYN-ACK acking only ISN, retransmits data, sends
another 4 data packets and get 3 dupacks.

Since the retransmission is not caused by network drop it should not
update the recovery state variables. Further the server may return a
smaller MSS than the cached MSS used for SYN-data, so the retranmission
needs a loop. Otherwise some data will not be retransmitted until timeout
or other loss recovery events.
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

93b174ad

Merge tag 'mmc-fixes-for-3.7' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc · 1afa4717

Linus Torvalds authored Dec 07, 2012

Pull MMC fixes from Chris Ball:
 "Two small regression fixes:

   - sdhci-s3c: Fix runtime PM regression against 3.7-rc1
   - sh-mmcif: Fix oops against 3.6"

* tag 'mmc-fixes-for-3.7' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc:
  mmc: sh-mmcif: avoid oops on spurious interrupts (second try)
  Revert misapplied "mmc: sh-mmcif: avoid oops on spurious interrupts"
  mmc: sdhci-s3c: fix missing clock for gpio card-detect

1afa4717

06 Dec, 2012 11 commits

tmpfs: fix shared mempolicy leak · 18a2f371

Mel Gorman authored Dec 05, 2012

This fixes a regression in 3.7-rc, which has since gone into stable.

Commit 00442ad0 ("mempolicy: fix a memory corruption by refcount
imbalance in alloc_pages_vma()") changed get_vma_policy() to raise the
refcount on a shmem shared mempolicy; whereas shmem_alloc_page() went
on expecting alloc_page_vma() to drop the refcount it had acquired.
This deserves a rework: but for now fix the leak in shmem_alloc_page().

Hugh: shmem_swapin() did not need a fix, but surely it's clearer to use
the same refcounting there as in shmem_alloc_page(), delete its onstack
mempolicy, and the strange mpol_cond_copy() and __mpol_cond_copy() -
those were invented to let swapin_readahead() make an unknown number of
calls to alloc_pages_vma() with one mempolicy; but since 00442ad0,
alloc_pages_vma() has kept refcount in balance, so now no problem.
Reported-and-tested-by: Tommi Rantala <tt.rantala@gmail.com>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

18a2f371

mm: vmscan: do not keep kswapd looping forever due to individual uncompactable zones · c702418f

Johannes Weiner authored Dec 04, 2012

When a zone meets its high watermark and is compactable in case of
higher order allocations, it contributes to the percentage of the node's
memory that is considered balanced.

This requirement, that a node be only partially balanced, came about
when kswapd was desparately trying to balance tiny zones when all bigger
zones in the node had plenty of free memory.  Arguably, the same should
apply to compaction: if a significant part of the node is balanced
enough to run compaction, do not get hung up on that tiny zone that
might never get in shape.

When the compaction logic in kswapd is reached, we know that at least
25% of the node's memory is balanced properly for compaction (see
zone_balanced and pgdat_balanced).  Remove the individual zone checks
that restart the kswapd cycle.

Otherwise, we may observe more endless looping in kswapd where the
compaction code loops back to reclaim because of a single zone and
reclaim does nothing because the node is considered balanced overall.

See for example

  https://bugzilla.redhat.com/show_bug.cgi?id=866988Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reported-and-tested-by: Thorsten Leemhuis <fedora@leemhuis.info>
Reported-by: Jiri Slaby <jslaby@suse.cz>
Tested-by: John Ellson <john.ellson@comcast.net>
Tested-by: Zdenek Kabelac <zkabelac@redhat.com>
Tested-by: Bruno Wolff III <bruno@wolff.to>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

c702418f

mm: compaction: validate pfn range passed to isolate_freepages_block · 60177d31

Mel Gorman authored Dec 06, 2012

Commit 0bf380bc ("mm: compaction: check pfn_valid when entering a
new MAX_ORDER_NR_PAGES block during isolation for migration") added a
check for pfn_valid() when isolating pages for migration as the scanner
does not necessarily start pageblock-aligned.

Since commit c89511ab ("mm: compaction: Restart compaction from near
where it left off"), the free scanner has the same problem. This patch
makes sure that the pfn range passed to isolate_freepages_block() is
within the same block so that pfn_valid() checks are unnecessary.

In answer to Henrik's wondering why others have not reported this:
reproducing this requires a large enough hole with the right aligment to
have compaction walk into a PFN range with no memmap. Size and
alignment depends in the memory model - 4M for FLATMEM and 128M for
SPARSEMEM on x86. It needs a "lucky" machine.
Reported-by: Henrik Rydberg <rydberg@euromail.se>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

60177d31

mmc: sh-mmcif: avoid oops on spurious interrupts (second try) · 91ab252a

Guennadi Liakhovetski authored Aug 22, 2012

On some systems, e.g., kzm9g, MMCIF interfaces can produce spurious
interrupts without any active request. To prevent the Oops, that results
in such cases, don't dereference the mmc request pointer until we make
sure, that we are indeed processing such a request.
Reported-by: Tetsuyuki Kobayashi <koba@kmckk.co.jp>
Signed-off-by: Guennadi Liakhovetski <g.liakhovetski@gmx.de>
Tested-by: Tetsuyuki Kobayashi <koba@kmckk.co.jp>
Cc: stable@vger.kernel.org
Signed-off-by: Chris Ball <cjb@laptop.org>

91ab252a

Revert misapplied "mmc: sh-mmcif: avoid oops on spurious interrupts" · 6984f3c3

Chris Ball authored Dec 03, 2012

This reverts commit 8464dd52, which was a misapplied debugging
version of the patch, not the final patch itself.
Signed-off-by: Chris Ball <cjb@laptop.org>
Cc: stable@vger.kernel.org

6984f3c3

mmc: sdhci-s3c: fix missing clock for gpio card-detect · fe007c02

Heiko Stübner authored Nov 18, 2012

2abeb5c5 ("Add clk_(enable/disable) in runtime suspend/resume")
added the capability to stop the clocks when the device is runtime
suspended, but forgot to handle the case of the card-detect using
an external gpio.

Therefore in the case that runtime-pm is enabled, start the io-clock
when a card is inserted and stop it again once it is removed.
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Signed-off-by: Chris Ball <cjb@laptop.org>

fe007c02

Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus · 04c5decd

Linus Torvalds authored Dec 06, 2012

Pull MIPS fixes from Ralf Baechle:
 "These are the fixes for the N32 syscall bugs found by Al, an
  extraneous break that broke detection for R3000 and R3081 processors,
  an endless loop processing signals for kernel task (x86 received the
  same fix a while ago) and a fix for transparent huge page which took
  ages to track down because it was so hard to come up with a workable
  test case."

* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
  MIPS: Fix endless loop when processing signals for kernel tasks
  MIPS: R3000/R3081: Fix CPU detection.
  MIPS: N32: Fix signalfd4 syscall entry point
  MIPS: N32: Fix preadv(2) and pwritev(2) entry points.
  MIPS: Avoid mcheck by flushing page range in huge_ptep_set_access_flags()

04c5decd

Merge branch 'more-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux · d91fa971

Linus Torvalds authored Dec 06, 2012

Pull build fix from Rusty Russell:
 "Tim Gardner <tim.gardner@canonical.com> writes:
  > It is $(obj)/oid_registry.o that is dependent on $(obj)/oid_registry_data.c.
  > The object file cannot be built until $(obj)/oid_registry_data.c has been
  > generated.
  >
  > A periodic and hard to reproduce parallel build failure is due to
  > this incorrect lib/Makefile dependency. The compile error is completely
  > disingenuous.
  >
  >   GEN     lib/oid_registry_data.c
  > Compiling 49 OIDs
  >   CC      lib/oid_registry.o
  > gcc: error: lib/oid_registry.c: No such file or directory
  > gcc: fatal error: no input files
  > compilation terminated.
  > make[3]: *** [lib/oid_registry.o] Error 4

  I can't reproduce it either.  It's completely weird; nothing ever
  removes lib/oid_registry.c, so either gcc is giving the wrong message
  or it's a weird fs with a very odd race.

  But your version is definitely more correct than the previous one,
  so..."

* 'more-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
  lib/Makefile: Fix oid_registry build dependency

d91fa971

Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux · 54d1ae49

Linus Torvalds authored Dec 06, 2012

Pull module signing fixes from Rusty Russell:
 "David gave me these a month ago, during my git workflow churn :("

* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
  ASN.1: Fix an indefinite length skip error
  MODSIGN: Don't use enum-type bitfields in module signature info block

54d1ae49

Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · cfd1f032

Linus Torvalds authored Dec 06, 2012

Pull watchdog fix from Thomas Gleixner:
 "Trivial CPU hotplug regression fix for the watchdog code"

* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  watchdog: Fix CPU hotplug regression

cfd1f032

lib/Makefile: Fix oid_registry build dependency · 527897cc

Tim Gardner authored Dec 04, 2012

It is $(obj)/oid_registry.o that is dependent on $(obj)/oid_registry_data.c.
The object file cannot be built until $(obj)/oid_registry_data.c has been
generated.

A periodic and hard to reproduce parallel build failure is due to
this incorrect lib/Makefile dependency. The compile error is completely
disingenuous.

  GEN     lib/oid_registry_data.c
Compiling 49 OIDs
  CC      lib/oid_registry.o
gcc: error: lib/oid_registry.c: No such file or directory
gcc: fatal error: no input files
compilation terminated.
make[3]: *** [lib/oid_registry.o] Error 4

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Michel Lespinasse <walken@google.com>
Cc: David Howells <dhowells@redhat.com>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

527897cc

05 Dec, 2012 3 commits

MIPS: Fix endless loop when processing signals for kernel tasks · c90e6fbb

Dmitry Adamushko authored Apr 05, 2012

The problem occurs [1] when a kernel-mode task returns from a system
call with a pending signal.

A real-life scenario is a child of 'khelper' returning from a failed
kernel_execve() in ____call_usermodehelper() [ kernel/kmod.c ].
kernel_execve() fails due to a pending SIGKILL, which is the result of
"kill -9 -1" (at least, busybox's init does it upon reboot).

The loop is as follows:

* syscall_exit_work:
 - work_pending:            // start_of_the_loop
 - work_notifysig:
   - do_notify_resume()
     - do_signal()
       - if (!user_mode(regs)) return;
 - resume_userspace         // TIF_SIGPENDING is still set
 - work_pending             // so we call work_pending => goto
                            // start_of_the_loop

More information can be found in another LKML thread:
http://www.serverphorums.com/read.php?12,457826

[1] The problem was also reproduced on !CONFIG_VM86 x86, and the
following fix was accepted.

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=29a2e2836ff9ea65a603c89df217f4198973a74fSigned-off-by: Dmitry Adamushko <dmitry.adamushko@gmail.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/3571/Signed-off-by: Ralf Baechle <ralf@linux-mips.org>

c90e6fbb

MIPS: R3000/R3081: Fix CPU detection. · 2d33976f

Ralf Baechle authored Jun 08, 2012

Broken since e05ea74fc56f347f872ef9946d27c53e8bf20864 (lmo) rsp.
cea7e2df (kernel.org) [MIPS: Sort out CPU
type to name translation.]  These CPUs are no longer very popular to say
the least ...
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Reported-by: Murphy McCauley <murphy.mccauley@gmail.com>

2d33976f

MIPS: N32: Fix signalfd4 syscall entry point · 97daa768

Ralf Baechle authored Dec 04, 2012

This needs to use the compat entry point or it's going to fail on big
endian systems.

Noticed by Al Viro.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>

97daa768