1. 19 Sep, 2012 10 commits
  2. 12 Sep, 2012 30 commits
    • Ben Hutchings's avatar
      Linux 3.2.29 · 21094cfa
      Ben Hutchings authored
      21094cfa
    • Satoru Moriya's avatar
      mm: avoid swapping out with swappiness==0 · a65afe79
      Satoru Moriya authored
      commit fe35004f upstream.
      
      Sometimes we'd like to avoid swapping out anonymous memory.  In
      particular, avoid swapping out pages of important process or process
      groups while there is a reasonable amount of pagecache on RAM so that we
      can satisfy our customers' requirements.
      
      OTOH, we can control how aggressive the kernel will swap memory pages with
      /proc/sys/vm/swappiness for global and
      /sys/fs/cgroup/memory/memory.swappiness for each memcg.
      
      But with current reclaim implementation, the kernel may swap out even if
      we set swappiness=0 and there is pagecache in RAM.
      
      This patch changes the behavior with swappiness==0.  If we set
      swappiness==0, the kernel does not swap out completely (for global reclaim
      until the amount of free pages and filebacked pages in a zone has been
      reduced to something very very small (nr_free + nr_filebacked < high
      watermark)).
      Signed-off-by: default avatarSatoru Moriya <satoru.moriya@hds.com>
      Acked-by: default avatarMinchan Kim <minchan@kernel.org>
      Reviewed-by: default avatarRik van Riel <riel@redhat.com>
      Acked-by: default avatarJerome Marchand <jmarchan@redhat.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      [bwh: Backported to 3.2:
       - Adjust context
       - vmscan_swappiness() does not have a zone parameter]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      a65afe79
    • Phillip Lougher's avatar
      Squashfs: fix mount time sanity check for corrupted superblock · 4bddad58
      Phillip Lougher authored
      commit cc37f75a upstream.
      
      A Squashfs filesystem containing nothing but an empty directory,
      although unusual and ultimately pointless, is still valid.
      
      The directory_table >= next_table sanity check rejects these
      filesystems as invalid because the directory_table is empty and
      equal to next_table.
      Signed-off-by: default avatarPhillip Lougher <phillip@squashfs.org.uk>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      4bddad58
    • AceLan Kao's avatar
      asus-nb-wmi: add some video toggle keys · b64c7d34
      AceLan Kao authored
      commit 3766054f upstream.
      
      There are some new video switch keys that used by newer machines.
      0xA0 - SDSP HDMI only
      0xA1 - SDSP LCD + HDMI
      0xA2 - SDSP CRT + HDMI
      0xA3 - SDSP TV + HDMI
      But in Linux, there is no suitable userspace application to handle this,
      so, mapping them all to KEY_SWITCHVIDEOMODE.
      Signed-off-by: default avatarAceLan Kao <acelan.kao@canonical.com>
      Signed-off-by: default avatarMatthew Garrett <mjg@redhat.com>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      b64c7d34
    • Trond Myklebust's avatar
      NFS: Fix Oopses in nfs_lookup_revalidate and nfs4_lookup_revalidate · 2a53885e
      Trond Myklebust authored
      Fix the following Oops in 3.5.1:
      
       BUG: unable to handle kernel NULL pointer dereference at 0000000000000038
       IP: [<ffffffffa03789cd>] nfs_lookup_revalidate+0x2d/0x480 [nfs]
       PGD 337c63067 PUD 0
       Oops: 0000 [#1] SMP
       CPU 5
       Modules linked in: nfs fscache nfsd lockd nfs_acl auth_rpcgss sunrpc af_packet binfmt_misc cpufreq_conservative cpufreq_userspace cpufreq_powersave dm_mod acpi_cpufreq mperf coretemp gpio_ich kvm_intel joydev kvm ioatdma hid_generic igb lpc_ich i7core_edac edac_core ptp serio_raw dca pcspkr i2c_i801 mfd_core sg pps_core usbhid crc32c_intel microcode button autofs4 uhci_hcd ttm drm_kms_helper drm i2c_algo_bit sysimgblt sysfillrect syscopyarea ehci_hcd usbcore usb_common scsi_dh_rdac scsi_dh_emc scsi_dh_hp_sw scsi_dh_alua scsi_dh edd fan ata_piix thermal processor thermal_sys
      
       Pid: 30431, comm: java Not tainted 3.5.1-2-default #1 Supermicro X8DTT/X8DTT
       RIP: 0010:[<ffffffffa03789cd>]  [<ffffffffa03789cd>] nfs_lookup_revalidate+0x2d/0x480 [nfs]
       RSP: 0018:ffff8801b418bd38  EFLAGS: 00010292
       RAX: 00000000fffffff6 RBX: ffff88032016d800 RCX: 0000000000000020
       RDX: ffffffff00000000 RSI: 0000000000000000 RDI: ffff8801824a7b00
       RBP: ffff8801b418bdf8 R08: 7fffff0034323030 R09: fffffffff04c03ed
       R10: ffff8801824a7b00 R11: 0000000000000002 R12: ffff8801824a7b00
       R13: ffff8801824a7b00 R14: 0000000000000000 R15: ffff8803201725d0
       FS:  00002b53a46cb700(0000) GS:ffff88033fc20000(0000) knlGS:0000000000000000
       CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
       CR2: 0000000000000038 CR3: 000000020a426000 CR4: 00000000000007e0
       DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
       DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
       Process java (pid: 30431, threadinfo ffff8801b418a000, task ffff8801b5d20600)
       Stack:
        ffff8801b418be44 ffff88032016d800 ffff8801b418bdf8 0000000000000000
        ffff8801824a7b00 ffff8801b418bdd7 ffff8803201725d0 ffffffff8116a9c0
        ffff8801b5c38dc0 0000000000000007 ffff88032016d800 0000000000000000
       Call Trace:
        [<ffffffff8116a9c0>] lookup_dcache+0x80/0xe0
        [<ffffffff8116aa43>] __lookup_hash+0x23/0x90
        [<ffffffff8116b4a5>] lookup_one_len+0xc5/0x100
        [<ffffffffa03869a3>] nfs_sillyrename+0xe3/0x210 [nfs]
        [<ffffffff8116cadf>] vfs_unlink.part.25+0x7f/0xe0
        [<ffffffff8116f22c>] do_unlinkat+0x1ac/0x1d0
        [<ffffffff815717b9>] system_call_fastpath+0x16/0x1b
        [<00002b5348b5f527>] 0x2b5348b5f526
       Code: ec 38 b8 f6 ff ff ff 4c 89 64 24 18 4c 89 74 24 28 49 89 fc 48 89 5c 24 08 48 89 6c 24 10 49 89 f6 4c 89 6c 24 20 4c 89 7c 24 30 <f6> 46 38 40 0f 85 d1 00 00 00 e8 c4 c4 df e0 48 8b 58 30 49 89
       RIP  [<ffffffffa03789cd>] nfs_lookup_revalidate+0x2d/0x480 [nfs]
        RSP <ffff8801b418bd38>
       CR2: 0000000000000038
       ---[ end trace 845113ed191985dd ]---
      
      This Oops affects 3.5 kernels and older, and is due to lookup_one_len()
      calling down to the dentry revalidation code with a NULL pointer
      to struct nameidata.
      
      It is fixed upstream by commit 0b728e19 (stop passing nameidata *
      to ->d_revalidate())
      Reported-by: default avatarRichard Ems <richard.ems@cape-horn-eng.com>
      Signed-off-by: default avatarTrond Myklebust <Trond.Myklebust@netapp.com>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      2a53885e
    • Christopher Brannon's avatar
      Staging: speakup: fix an improperly-declared variable. · fc1dd47c
      Christopher Brannon authored
      commit 4ea418b8 upstream.
      
      A local static variable was declared as a pointer to a string
      constant.  We're assigning to the underlying memory, so it
      needs to be an array instead.
      Signed-off-by: default avatarChristopher Brannon <chris@the-brannons.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      fc1dd47c
    • Jason Wessel's avatar
      pmac_zilog,kdb: Fix console poll hook to return instead of loop · 4422e6fe
      Jason Wessel authored
      commit 38f8eefc upstream.
      
      kdb <-> kgdb transitioning does not work properly with this UART
      driver because the get character routine loops indefinitely as opposed
      to returning NO_POLL_CHAR per the expectation of the KDB I/O driver
      API.
      
      The symptom is a kernel hang when trying to switch debug modes.
      
      Cc: Alan Cox <alan@linux.intel.com>
      Signed-off-by: default avatarJason Wessel <jason.wessel@windriver.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      4422e6fe
    • Lionel Vaux's avatar
      HID: add support for Cypress barcode scanner 04B4:ED81 · 7a42c228
      Lionel Vaux authored
      commit 76c9d8fe upstream.
      
      Add yet another device to the list of Cypress barcode scanners
      needing the CP_RDESC_SWAPPED_MIN_MAX quirk.
      Signed-off-by: default avatarLionel Vaux (iouri) <lionel.vaux@free.fr>
      Signed-off-by: default avatarJiri Kosina <jkosina@suse.cz>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      7a42c228
    • Asias He's avatar
      virtio-blk: Reset device after blk_cleanup_queue() · 95964784
      Asias He authored
      commit 483001c7 upstream.
      
      blk_cleanup_queue() will call blk_drian_queue() to drain all the
      requests before queue DEAD marking. If we reset the device before
      blk_cleanup_queue() the drain would fail.
      
      1) if the queue is stopped in do_virtblk_request() because device is
      full, the q->request_fn() will not be called.
      
      blk_drain_queue() {
         while(true) {
            ...
            if (!list_empty(&q->queue_head))
              __blk_run_queue(q) {
      	    if (queue is not stoped)
      		q->request_fn()
      	}
            ...
         }
      }
      
      Do no reset the device before blk_cleanup_queue() gives the chance to
      start the queue in interrupt handler blk_done().
      
      2) In commit b79d866c, We abort requests
      dispatched to driver before blk_cleanup_queue(). There is a race if
      requests are dispatched to driver after the abort and before the queue
      DEAD mark. To fix this, instead of aborting the requests explicitly, we
      can just reset the device after after blk_cleanup_queue so that the
      device can complete all the requests before queue DEAD marking in the
      drain process.
      
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      Cc: virtualization@lists.linux-foundation.org
      Cc: kvm@vger.kernel.org
      Signed-off-by: default avatarAsias He <asias@redhat.com>
      Acked-by: default avatarMichael S. Tsirkin <mst@redhat.com>
      Signed-off-by: default avatarRusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      95964784
    • Asias He's avatar
      virtio-blk: Call del_gendisk() before disable guest kick · 88963fd6
      Asias He authored
      commit 02e2b124 upstream.
      
      del_gendisk() might not return due to failing to remove the
      /sys/block/vda/serial sysfs entry when another thread (udev) is
      trying to read it.
      
      virtblk_remove()
        vdev->config->reset() : guest will not kick us through interrupt
          del_gendisk()
            device_del()
              kobject_del(): got stuck, sysfs entry ref count non zero
      
      sysfs_open_file(): user space process read /sys/block/vda/serial
         sysfs_get_active() : got sysfs entry ref count
            dev_attr_show()
              virtblk_serial_show()
                 blk_execute_rq() : got stuck, interrupt is disabled
                                    request cannot be finished
      
      This patch fixes it by calling del_gendisk() before we disable guest's
      interrupt so that the request sent in virtblk_serial_show() will be
      finished and del_gendisk() will success.
      
      This fixes another race in hot-unplug process.
      
      It is save to call del_gendisk(vblk->disk) before
      flush_work(&vblk->config_work) which might access vblk->disk, because
      vblk->disk is not freed until put_disk(vblk->disk).
      
      Cc: virtualization@lists.linux-foundation.org
      Cc: kvm@vger.kernel.org
      Signed-off-by: default avatarAsias He <asias@redhat.com>
      Acked-by: default avatarMichael S. Tsirkin <mst@redhat.com>
      Signed-off-by: default avatarRusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      88963fd6
    • Asias He's avatar
      virtio-blk: Fix hot-unplug race in remove method · e5000b33
      Asias He authored
      commit b79d866c upstream.
      
      If we reset the virtio-blk device before the requests already dispatched
      to the virtio-blk driver from the block layer are finised, we will stuck
      in blk_cleanup_queue() and the remove will fail.
      
      blk_cleanup_queue() calls blk_drain_queue() to drain all requests queued
      before DEAD marking. However it will never success if the device is
      already stopped. We'll have q->in_flight[] > 0, so the drain will not
      finish.
      
      How to reproduce the race:
      1. hot-plug a virtio-blk device
      2. keep reading/writing the device in guest
      3. hot-unplug while the device is busy serving I/O
      
      Test:
      ~1000 rounds of hot-plug/hot-unplug test passed with this patch.
      
      Changes in v3:
      - Drop blk_abort_queue and blk_abort_request
      - Use __blk_end_request_all to complete request dispatched to driver
      
      Changes in v2:
      - Drop req_in_flight
      - Use virtqueue_detach_unused_buf to get request dispatched to driver
      Signed-off-by: default avatarAsias He <asias@redhat.com>
      Signed-off-by: default avatarRusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      e5000b33
    • Asias He's avatar
      virtio_blk: Drop unused request tracking list · 788c8de3
      Asias He authored
      commit f65ca1dc upstream.
      
      Benchmark shows small performance improvement on fusion io device.
      
      Before:
        seq-read : io=1,024MB, bw=19,982KB/s, iops=39,964, runt= 52475msec
        seq-write: io=1,024MB, bw=20,321KB/s, iops=40,641, runt= 51601msec
        rnd-read : io=1,024MB, bw=15,404KB/s, iops=30,808, runt= 68070msec
        rnd-write: io=1,024MB, bw=14,776KB/s, iops=29,552, runt= 70963msec
      
      After:
        seq-read : io=1,024MB, bw=20,343KB/s, iops=40,685, runt= 51546msec
        seq-write: io=1,024MB, bw=20,803KB/s, iops=41,606, runt= 50404msec
        rnd-read : io=1,024MB, bw=16,221KB/s, iops=32,442, runt= 64642msec
        rnd-write: io=1,024MB, bw=15,199KB/s, iops=30,397, runt= 68991msec
      Signed-off-by: default avatarAsias He <asias@redhat.com>
      Signed-off-by: default avatarRusty Russell <rusty@rustcorp.com.au>
      [bwh: Backported to 3.2: adjust context]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      788c8de3
    • Michael S. Tsirkin's avatar
      virtio_blk: fix config handler race · 0984c4c4
      Michael S. Tsirkin authored
      commit 4678d6f9 upstream.
      
      Fix a theoretical race related to config work
      handler: a config interrupt might happen
      after we flush config work but before we
      reset the device. It will then cause the
      config work to run during or after reset.
      
      Two problems with this:
      - if this runs after device is gone we will get use after free
      - access of config while reset is in progress is racy
      (as layout is changing).
      
      As a solution
      1. flush after reset when we know there will be no more interrupts
      2. add a flag to disable config access before reset
      Signed-off-by: default avatarMichael S. Tsirkin <mst@redhat.com>
      Signed-off-by: default avatarRusty Russell <rusty@rustcorp.com.au>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      0984c4c4
    • Tony Luck's avatar
      dmi: Feed DMI table to /dev/random driver · d9033bcc
      Tony Luck authored
      commit d114a333 upstream.
      
      Send the entire DMI (SMBIOS) table to the /dev/random driver to
      help seed its pools.
      Signed-off-by: default avatarTony Luck <tony.luck@intel.com>
      Signed-off-by: default avatarTheodore Ts'o <tytso@mit.edu>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      d9033bcc
    • Tony Luck's avatar
      random: Add comment to random_initialize() · 86946487
      Tony Luck authored
      commit cbc96b75 upstream.
      
      Many platforms have per-machine instance data (serial numbers,
      asset tags, etc.) squirreled away in areas that are accessed
      during early system bringup. Mixing this data into the random
      pools has a very high value in providing better random data,
      so we should allow (and even encourage) architecture code to
      call add_device_randomness() from the setup_arch() paths.
      
      However, this limits our options for internal structure of
      the random driver since random_initialize() is not called
      until long after setup_arch().
      
      Add a big fat comment to rand_initialize() spelling out
      this requirement.
      Suggested-by: default avatarTheodore Ts'o <tytso@mit.edu>
      Signed-off-by: default avatarTony Luck <tony.luck@intel.com>
      Signed-off-by: default avatarTheodore Ts'o <tytso@mit.edu>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      86946487
    • Theodore Ts'o's avatar
      MAINTAINERS: Theodore Ts'o is taking over the random driver · 94688740
      Theodore Ts'o authored
      commit 330e0a01 upstream.
      
      Matt Mackall stepped down as the /dev/random driver maintainer last
      year, so Theodore Ts'o is taking back the /dev/random driver.
      
      Cc: Matt Mackall <mpm@selenic.com>
      Signed-off-by: default avatar"Theodore Ts'o" <tytso@mit.edu>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      94688740
    • Mathieu Desnoyers's avatar
      drivers/char/random.c: fix boot id uniqueness race · a457fb13
      Mathieu Desnoyers authored
      commit 44e4360f upstream.
      
      /proc/sys/kernel/random/boot_id can be read concurrently by userspace
      processes.  If two (or more) user-space processes concurrently read
      boot_id when sysctl_bootid is not yet assigned, a race can occur making
      boot_id differ between the reads.  Because the whole point of the boot id
      is to be unique across a kernel execution, fix this by protecting this
      operation with a spinlock.
      
      Given that this operation is not frequently used, hitting the spinlock
      on each call should not be an issue.
      Signed-off-by: default avatarMathieu Desnoyers <mathieu.desnoyers@efficios.com>
      Cc: "Theodore Ts'o" <tytso@mit.edu>
      Cc: Matt Mackall <mpm@selenic.com>
      Signed-off-by: default avatarEric Dumazet <eric.dumazet@gmail.com>
      Cc: Greg Kroah-Hartman <greg@kroah.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      a457fb13
    • Szymon Janc's avatar
      Bluetooth: Fix using uninitialized option in RFCMode · 39b78b60
      Szymon Janc authored
      commit 8f321f85 upstream.
      
      If remote device sends bogus RFC option with invalid length,
      undefined options values are used. Fix this by using defaults when
      remote misbehaves.
      
      This also fixes the following warning reported by gcc 4.7.0:
      
      net/bluetooth/l2cap_core.c: In function 'l2cap_config_rsp':
      net/bluetooth/l2cap_core.c:3302:13: warning: 'rfc.max_pdu_size' may be used uninitialized in this function [-Wmaybe-uninitialized]
      net/bluetooth/l2cap_core.c:3266:24: note: 'rfc.max_pdu_size' was declared here
      net/bluetooth/l2cap_core.c:3298:25: warning: 'rfc.monitor_timeout' may be used uninitialized in this function [-Wmaybe-uninitialized]
      net/bluetooth/l2cap_core.c:3266:24: note: 'rfc.monitor_timeout' was declared here
      net/bluetooth/l2cap_core.c:3297:25: warning: 'rfc.retrans_timeout' may be used uninitialized in this function [-Wmaybe-uninitialized]
      net/bluetooth/l2cap_core.c:3266:24: note: 'rfc.retrans_timeout' was declared here
      net/bluetooth/l2cap_core.c:3295:2: warning: 'rfc.mode' may be used uninitialized in this function [-Wmaybe-uninitialized]
      net/bluetooth/l2cap_core.c:3266:24: note: 'rfc.mode' was declared here
      Signed-off-by: default avatarSzymon Janc <szymon.janc@tieto.com>
      Signed-off-by: default avatarGustavo Padovan <gustavo.padovan@collabora.co.uk>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      39b78b60
    • Hugh Dickins's avatar
      block: replace __getblk_slow misfix by grow_dev_page fix · 46d271ae
      Hugh Dickins authored
      commit 676ce6d5 upstream.
      
      Commit 91f68c89 ("block: fix infinite loop in __getblk_slow")
      is not good: a successful call to grow_buffers() cannot guarantee
      that the page won't be reclaimed before the immediate next call to
      __find_get_block(), which is why there was always a loop there.
      
      Yesterday I got "EXT4-fs error (device loop0): __ext4_get_inode_loc:3595:
      inode #19278: block 664: comm cc1: unable to read itable block" on console,
      which pointed to this commit.
      
      I've been trying to bisect for weeks, why kbuild-on-ext4-on-loop-on-tmpfs
      sometimes fails from a missing header file, under memory pressure on
      ppc G5.  I've never seen this on x86, and I've never seen it on 3.5-rc7
      itself, despite that commit being in there: bisection pointed to an
      irrelevant pinctrl merge, but hard to tell when failure takes between
      18 minutes and 38 hours (but so far it's happened quicker on 3.6-rc2).
      
      (I've since found such __ext4_get_inode_loc errors in /var/log/messages
      from previous weeks: why the message never appeared on console until
      yesterday morning is a mystery for another day.)
      
      Revert 91f68c89, restoring __getblk_slow() to how it was (plus
      a checkpatch nitfix).  Simplify the interface between grow_buffers()
      and grow_dev_page(), and avoid the infinite loop beyond end of device
      by instead checking init_page_buffers()'s end_block there (I presume
      that's more efficient than a repeated call to blkdev_max_block()),
      returning -ENXIO to __getblk_slow() in that case.
      
      And remove akpm's ten-year-old "__getblk() cannot fail ... weird"
      comment, but that is worrying: are all users of __getblk() really
      now prepared for a NULL bh beyond end of device, or will some oops??
      Signed-off-by: default avatarHugh Dickins <hughd@google.com>
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      46d271ae
    • Glauber Costa's avatar
      fs/buffer.c: remove BUG() in possible but rare condition · 5835d2dc
      Glauber Costa authored
      commit 61065a30 upstream.
      
      While stressing the kernel with with failing allocations today, I hit the
      following chain of events:
      
      alloc_page_buffers():
      
      	bh = alloc_buffer_head(GFP_NOFS);
      	if (!bh)
      		goto no_grow; <= path taken
      
      grow_dev_page():
              bh = alloc_page_buffers(page, size, 0);
              if (!bh)
                      goto failed;  <= taken, consequence of the above
      
      and then the failed path BUG()s the kernel.
      
      The failure is inserted a litte bit artificially, but even then, I see no
      reason why it should be deemed impossible in a real box.
      
      Even though this is not a condition that we expect to see around every
      time, failed allocations are expected to be handled, and BUG() sounds just
      too much.  As a matter of fact, grow_dev_page() can return NULL just fine
      in other circumstances, so I propose we just remove it, then.
      Signed-off-by: default avatarGlauber Costa <glommer@parallels.com>
      Cc: Michal Hocko <mhocko@suse.cz>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      5835d2dc
    • Alexandre Bounine's avatar
      rapidio/tsi721: fix unused variable compiler warning · 9bf9e4d2
      Alexandre Bounine authored
      commit 9a9a9a7a upstream.
      
      Fix unused variable compiler warning when built with CONFIG_RAPIDIO_DEBUG
      option off.
      
      This patch is applicable to kernel versions starting from v3.2
      Signed-off-by: default avatarAlexandre Bounine <alexandre.bounine@idt.com>
      Cc: Matt Porter <mporter@kernel.crashing.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      9bf9e4d2
    • Alexandre Bounine's avatar
      rapidio/tsi721: fix inbound doorbell interrupt handling · 61ee0199
      Alexandre Bounine authored
      commit 3670e7e1 upstream.
      
      Make sure that there is no doorbell messages left behind due to disabled
      interrupts during inbound doorbell processing.
      
      The most common case for this bug is loss of rionet JOIN messages in
      systems with three or more rionet participants and MSI or MSI-X enabled.
      As result, requests for packet transfers may finish with "destination
      unreachable" error message.
      
      This patch is applicable to kernel versions starting from v3.2.
      Signed-off-by: default avatarAlexandre Bounine <alexandre.bounine@idt.com>
      Cc: Matt Porter <mporter@kernel.crashing.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      61ee0199
    • Atsushi Nemoto's avatar
      drivers/rtc/rtc-rs5c348.c: fix hour decoding in 12-hour mode · 05621c97
      Atsushi Nemoto authored
      commit 7dbfb315 upstream.
      
      Correct the offset by subtracting 20 from tm_hour before taking the
      modulo 12.
      
      [ "Why 20?" I hear you ask. Or at least I did.
      
        Here's the reason why: RS5C348_BIT_PM is 32, and is - stupidly -
        included in the RS5C348_HOURS_MASK define.  So it's really subtracting
        out that bit to get "hour+12".  But then because it does things modulo
        12, it needs to add the 12 in again afterwards anyway.
      
        This code is confused.  It would be much clearer if RS5C348_HOURS_MASK
        just didn't include the RS5C348_BIT_PM bit at all, then it wouldn't
        need to do the silly subtract either.
      
        Whatever. It's all just math, the end result is the same.   - Linus ]
      Reported-by: default avatarJames Nute <newten82@gmail.com>
      Tested-by: default avatarJames Nute <newten82@gmail.com>
      Signed-off-by: default avatarAtsushi Nemoto <anemo@mba.ocn.ne.jp>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      05621c97
    • Robin Holt's avatar
      drivers/misc/sgi-xp/xpc_uv.c: SGI XPC fails to load when cpu 0 is out of IRQ resources · 33da4c1e
      Robin Holt authored
      commit 7838f994 upstream.
      
      On many of our larger systems, CPU 0 has had all of its IRQ resources
      consumed before XPC loads.  Worst cases on machines with multiple 10
      GigE cards and multiple IB cards have depleted the entire first socket
      of IRQs.
      
      This patch makes selecting the node upon which IRQs are allocated (as
      well as all the other GRU Message Queue structures) specifiable as a
      module load param and has a default behavior of searching all nodes/cpus
      for an available resources.
      
      [akpm@linux-foundation.org: fix build: include cpu.h and module.h]
      Signed-off-by: default avatarRobin Holt <holt@sgi.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      33da4c1e
    • Michal Hocko's avatar
      mm: hugetlbfs: correctly populate shared pmd · cf727a57
      Michal Hocko authored
      commit eb48c071 upstream.
      
      Each page mapped in a process's address space must be correctly
      accounted for in _mapcount.  Normally the rules for this are
      straightforward but hugetlbfs page table sharing is different.  The page
      table pages at the PMD level are reference counted while the mapcount
      remains the same.
      
      If this accounting is wrong, it causes bugs like this one reported by
      Larry Woodman:
      
        kernel BUG at mm/filemap.c:135!
        invalid opcode: 0000 [#1] SMP
        CPU 22
        Modules linked in: bridge stp llc sunrpc binfmt_misc dcdbas microcode pcspkr acpi_pad acpi]
        Pid: 18001, comm: mpitest Tainted: G        W    3.3.0+ #4 Dell Inc. PowerEdge R620/07NDJ2
        RIP: 0010:[<ffffffff8112cfed>]  [<ffffffff8112cfed>] __delete_from_page_cache+0x15d/0x170
        Process mpitest (pid: 18001, threadinfo ffff880428972000, task ffff880428b5cc20)
        Call Trace:
          delete_from_page_cache+0x40/0x80
          truncate_hugepages+0x115/0x1f0
          hugetlbfs_evict_inode+0x18/0x30
          evict+0x9f/0x1b0
          iput_final+0xe3/0x1e0
          iput+0x3e/0x50
          d_kill+0xf8/0x110
          dput+0xe2/0x1b0
          __fput+0x162/0x240
      
      During fork(), copy_hugetlb_page_range() detects if huge_pte_alloc()
      shared page tables with the check dst_pte == src_pte.  The logic is if
      the PMD page is the same, they must be shared.  This assumes that the
      sharing is between the parent and child.  However, if the sharing is
      with a different process entirely then this check fails as in this
      diagram:
      
        parent
          |
          ------------>pmd
                       src_pte----------> data page
                                              ^
        other--------->pmd--------------------|
                        ^
        child-----------|
                       dst_pte
      
      For this situation to occur, it must be possible for Parent and Other to
      have faulted and failed to share page tables with each other.  This is
      possible due to the following style of race.
      
        PROC A                                          PROC B
        copy_hugetlb_page_range                         copy_hugetlb_page_range
          src_pte == huge_pte_offset                      src_pte == huge_pte_offset
          !src_pte so no sharing                          !src_pte so no sharing
      
        (time passes)
      
        hugetlb_fault                                   hugetlb_fault
          huge_pte_alloc                                  huge_pte_alloc
            huge_pmd_share                                 huge_pmd_share
              LOCK(i_mmap_mutex)
              find nothing, no sharing
              UNLOCK(i_mmap_mutex)
                                                            LOCK(i_mmap_mutex)
                                                            find nothing, no sharing
                                                            UNLOCK(i_mmap_mutex)
            pmd_alloc                                       pmd_alloc
            LOCK(instantiation_mutex)
            fault
            UNLOCK(instantiation_mutex)
                                                        LOCK(instantiation_mutex)
                                                        fault
                                                        UNLOCK(instantiation_mutex)
      
      These two processes are not poing to the same data page but are not
      sharing page tables because the opportunity was missed.  When either
      process later forks, the src_pte == dst pte is potentially insufficient.
      As the check falls through, the wrong PTE information is copied in
      (harmless but wrong) and the mapcount is bumped for a page mapped by a
      shared page table leading to the BUG_ON.
      
      This patch addresses the issue by moving pmd_alloc into huge_pmd_share
      which guarantees that the shared pud is populated in the same critical
      section as pmd.  This also means that huge_pte_offset test in
      huge_pmd_share is serialized correctly now which in turn means that the
      success of the sharing will be higher as the racing tasks see the pud
      and pmd populated together.
      
      Race identified and changelog written mostly by Mel Gorman.
      
      {akpm@linux-foundation.org: attempt to make the huge_pmd_share() comment comprehensible, clean up coding style]
      Reported-by: default avatarLarry Woodman <lwoodman@redhat.com>
      Tested-by: default avatarLarry Woodman <lwoodman@redhat.com>
      Reviewed-by: default avatarMel Gorman <mgorman@suse.de>
      Signed-off-by: default avatarMichal Hocko <mhocko@suse.cz>
      Reviewed-by: default avatarRik van Riel <riel@redhat.com>
      Cc: David Gibson <david@gibson.dropbear.id.au>
      Cc: Ken Chen <kenchen@google.com>
      Cc: Cong Wang <xiyou.wangcong@gmail.com>
      Cc: Hillf Danton <dhillf@gmail.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      cf727a57
    • Stephen M. Cameron's avatar
      cciss: fix incorrect scsi status reporting · e371958b
      Stephen M. Cameron authored
      commit b0cf0b11 upstream.
      
      Delete code which sets SCSI status incorrectly as it's already been set
      correctly above this incorrect code.  The bug was introduced in 2009 by
      commit b0e15f6d ("cciss: fix typo that causes scsi status to be
      lost.")
      Signed-off-by: default avatarStephen M. Cameron <scameron@beardog.cce.hp.com>
      Reported-by: default avatarRoel van Meer <roel.vanmeer@bokxing.nl>
      Tested-by: default avatarRoel van Meer <roel.vanmeer@bokxing.nl>
      Cc: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      e371958b
    • Dave Airlie's avatar
      fbcon: fix race condition between console lock and cursor timer (v1.1) · 94fb2469
      Dave Airlie authored
      commit d8636a27 upstream.
      
      So we've had a fair few reports of fbcon handover breakage between
      efi/vesafb and i915 surface recently, so I dedicated a couple of
      days to finding the problem.
      
      Essentially the last thing we saw was the conflicting framebuffer
      message and that was all.
      
      So after much tracing with direct netconsole writes (printks
      under console_lock not so useful), I think I found the race.
      
      Thread A (driver load)    Thread B (timer thread)
        unbind_con_driver ->              |
        bind_con_driver ->                |
        vc->vc_sw->con_deinit ->          |
        fbcon_deinit ->                   |
        console_lock()                    |
            |                             |
            |                       fbcon_flashcursor timer fires
            |                       console_lock() <- blocked for A
            |
            |
      fbcon_del_cursor_timer ->
        del_timer_sync
        (BOOM)
      
      Of course because all of this is under the console lock,
      we never see anything, also since we also just unbound the active
      console guess what we never see anything.
      
      Hopefully this fixes the problem for anyone seeing vesafb->kms
      driver handoff.
      
      v1.1: add comment suggestion from Alan.
      Signed-off-by: default avatarDave Airlie <airlied@redhat.com>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      94fb2469
    • Alex Deucher's avatar
      Revert "drm/radeon: fix bo creation retry path" · 2744f4e7
      Alex Deucher authored
      commit 676bc2e1 upstream.
      
      This reverts commit d1c7871d.
      
      ttm_bo_init() destroys the BO on failure. So this patch makes
      the retry path work with freed memory.  This ends up causing
      kernel panics when this path is hit.
      Signed-off-by: default avatarAlex Deucher <alexander.deucher@amd.com>
      [bwh: Backported to 3.2: adjust context]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      2744f4e7
    • J. Bruce Fields's avatar
      svcrpc: fix svc_xprt_enqueue/svc_recv busy-looping · 9d16d84d
      J. Bruce Fields authored
      commit d10f27a7 upstream.
      
      The rpc server tries to ensure that there will be room to send a reply
      before it receives a request.
      
      It does this by tracking, in xpt_reserved, an upper bound on the total
      size of the replies that is has already committed to for the socket.
      
      Currently it is adding in the estimate for a new reply *before* it
      checks whether there is space available.  If it finds that there is not
      space, it then subtracts the estimate back out.
      
      This may lead the subsequent svc_xprt_enqueue to decide that there is
      space after all.
      
      The results is a svc_recv() that will repeatedly return -EAGAIN, causing
      server threads to loop without doing any actual work.
      Reported-by: default avatarMichael Tokarev <mjt@tls.msk.ru>
      Tested-by: default avatarMichael Tokarev <mjt@tls.msk.ru>
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      9d16d84d
    • J. Bruce Fields's avatar
      svcrpc: sends on closed socket should stop immediately · 1057f770
      J. Bruce Fields authored
      commit f06f00a2 upstream.
      
      svc_tcp_sendto sets XPT_CLOSE if we fail to transmit the entire reply.
      However, the XPT_CLOSE won't be acted on immediately.  Meanwhile other
      threads could send further replies before the socket is really shut
      down.  This can manifest as data corruption: for example, if a truncated
      read reply is followed by another rpc reply, that second reply will look
      to the client like further read data.
      
      Symptoms were data corruption preceded by svc_tcp_sendto logging
      something like
      
      	kernel: rpc-srv/tcp: nfsd: sent only 963696 when sending 1048708 bytes - shutting down socket
      Reported-by: default avatarMalahal Naineni <malahal@us.ibm.com>
      Tested-by: default avatarMalahal Naineni <malahal@us.ibm.com>
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      1057f770