1. 12 Feb, 2011 8 commits
    • Michel Lespinasse's avatar
      mlock: do not munlock pages in __do_fault() · 419d8c96
      Michel Lespinasse authored
      If the page is going to be written to, __do_page needs to break COW.
      
      However, the old page (before breaking COW) was never mapped mapped into
      the current pte (__do_fault is only called when the pte is not present),
      so vmscan can't have marked the old page as PageMlocked due to being
      mapped in __do_fault's VMA.  Therefore, __do_fault() does not need to
      worry about clearing PageMlocked() on the old page.
      Signed-off-by: default avatarMichel Lespinasse <walken@google.com>
      Reviewed-by: default avatarKAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Acked-by: default avatarHugh Dickins <hughd@google.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      419d8c96
    • Michel Lespinasse's avatar
      mlock: fix race when munlocking pages in do_wp_page() · e15f8c01
      Michel Lespinasse authored
      vmscan can lazily find pages that are mapped within VM_LOCKED vmas, and
      set the PageMlocked bit on these pages, transfering them onto the
      unevictable list.  When do_wp_page() breaks COW within a VM_LOCKED vma,
      it may need to clear PageMlocked on the old page and set it on the new
      page instead.
      
      This change fixes an issue where do_wp_page() was clearing PageMlocked
      on the old page while the pte was still pointing to it (as well as
      rmap).  Therefore, we were not protected against vmscan immediately
      transfering the old page back onto the unevictable list.  This could
      cause pages to get stranded there forever.
      
      I propose to move the corresponding code to the end of do_wp_page(),
      after the pte (and rmap) have been pointed to the new page.
      Additionally, we can use munlock_vma_page() instead of
      clear_page_mlock(), so that the old page stays mlocked if there are
      still other VM_LOCKED vmas mapping it.
      Signed-off-by: default avatarMichel Lespinasse <walken@google.com>
      Reviewed-by: default avatarKAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Acked-by: default avatarHugh Dickins <hughd@google.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      e15f8c01
    • Yinghai Lu's avatar
      memblock: don't adjust size in memblock_find_base() · e6d2e2b2
      Yinghai Lu authored
      While applying patch to use memblock to find aperture for 64bit x86.
      Ingo found system with 1g + force_iommu
      
      > No AGP bridge found
      > Node 0: aperture @ 38000000 size 32 MB
      > Aperture pointing to e820 RAM. Ignoring.
      > Your BIOS doesn't leave a aperture memory hole
      > Please enable the IOMMU option in the BIOS setup
      > This costs you 64 MB of RAM
      > Cannot allocate aperture memory hole (0,65536K)
      
      the corresponding code:
      
      	addr = memblock_find_in_range(0, 1ULL<<32, aper_size, 512ULL<<20);
      	if (addr == MEMBLOCK_ERROR || addr + aper_size > 0xffffffff) {
      		printk(KERN_ERR
      			"Cannot allocate aperture memory hole (%lx,%uK)\n",
      				addr, aper_size>>10);
      		return 0;
      	}
      	memblock_x86_reserve_range(addr, addr + aper_size, "aperture64")
      
      fails because memblock core code align the size with 512M.  That could
      make size way too big.
      
      So don't align the size in that case.
      
      actually __memblock_alloc_base, the another caller already align that
      before calling that function.
      
      BTW. x86 does not use __memblock_alloc_base...
      Signed-off-by: default avatarYinghai Lu <yinghai@kernel.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: David Miller <davem@davemloft.net>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Dave Airlie <airlied@linux.ie>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      e6d2e2b2
    • Soren Hansen's avatar
      nbd: remove module-level ioctl mutex · de1f016f
      Soren Hansen authored
      Commit 2a48fc0a ("block: autoconvert trivial BKL users to private
      mutex") replaced uses of the BKL in the nbd driver with mutex
      operations.  Since then, I've been been seeing these lock ups:
      
       INFO: task qemu-nbd:16115 blocked for more than 120 seconds.
       "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
       qemu-nbd      D 0000000000000001     0 16115  16114 0x00000004
        ffff88007d775d98 0000000000000082 ffff88007d775fd8 ffff88007d774000
        0000000000013a80 ffff8800020347e0 ffff88007d775fd8 0000000000013a80
        ffff880133730000 ffff880002034440 ffffea0004333db8 ffffffffa071c020
       Call Trace:
        [<ffffffff815b9997>] __mutex_lock_slowpath+0xf7/0x180
        [<ffffffff815b93eb>] mutex_lock+0x2b/0x50
        [<ffffffffa071a21c>] nbd_ioctl+0x6c/0x1c0 [nbd]
        [<ffffffff812cb970>] blkdev_ioctl+0x230/0x730
        [<ffffffff811967a1>] block_ioctl+0x41/0x50
        [<ffffffff81175c03>] do_vfs_ioctl+0x93/0x370
        [<ffffffff81175f61>] sys_ioctl+0x81/0xa0
        [<ffffffff8100c0c2>] system_call_fastpath+0x16/0x1b
      
      Instrumenting the nbd module's ioctl handler with some extra logging
      clearly shows the NBD_DO_IT ioctl being invoked which is a long-lived
      ioctl in the sense that it doesn't return until another ioctl asks the
      driver to disconnect.  However, that other ioctl blocks, waiting for the
      module-level mutex that replaced the BKL, and then we're stuck.
      
      This patch removes the module-level mutex altogether.  It's clearly
      wrong, and as far as I can see, it's entirely unnecessary, since the nbd
      driver maintains per-device mutexes, and I don't see anything that would
      require a module-level (or kernel-level, for that matter) mutex.
      Signed-off-by: default avatarSoren Hansen <soren@linux2go.dk>
      Acked-by: default avatarSerge Hallyn <serge.hallyn@canonical.com>
      Acked-by: default avatarPaul Clements <paul.clements@steeleye.com>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: <stable@kernel.org>		[2.6.37.x]
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      de1f016f
    • Alexander Strakh's avatar
      drivers/rtc/rtc-proc.c: add module_put on error path in rtc_proc_open() · 24a6f5b8
      Alexander Strakh authored
      In file drivers/rtc/rtc-proc.c seq_open() can return -ENOMEM.
      
       86        if (!try_module_get(THIS_MODULE))
       87                return -ENODEV;
       88
       89        return single_open(file, rtc_proc_show, rtc);
      
      In this case before exiting (line 89) from rtc_proc_open the
      module_put(THIS_MODULE) must be called.
      
      Found by Linux Device Drivers Verification Project
      Signed-off-by: default avatarAlexander Strakh <strakh@ispras.ru>
      Cc: Alessandro Zummo <a.zummo@towertech.it>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      24a6f5b8
    • Roland Stigge's avatar
      drivers/gpio/pca953x.c: add a mutex to fix race condition · 6e20fb18
      Roland Stigge authored
      Add a mutex to register communication and handling.  Without the mutex,
      GPIOs didn't switch as expected when toggled in a fast sequence of
      status changes of multiple outputs.
      Signed-off-by: default avatarRoland Stigge <stigge@antcom.de>
      Acked-by: default avatarEric Miao <eric.y.miao@gmail.com>
      Cc: Grant Likely <grant.likely@secretlab.ca>
      Cc: Marc Zyngier <maz@misterjones.org>
      Cc: Ben Gardner <bgardner@wabtec.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      6e20fb18
    • Tejun Heo's avatar
      ptrace: use safer wake up on ptrace_detach() · 01e05e9a
      Tejun Heo authored
      The wake_up_process() call in ptrace_detach() is spurious and not
      interlocked with the tracee state.  IOW, the tracee could be running or
      sleeping in any place in the kernel by the time wake_up_process() is
      called.  This can lead to the tracee waking up unexpectedly which can be
      dangerous.
      
      The wake_up is spurious and should be removed but for now reduce its
      toxicity by only waking up if the tracee is in TRACED or STOPPED state.
      
      This bug can possibly be used as an attack vector.  I don't think it
      will take too much effort to come up with an attack which triggers oops
      somewhere.  Most sleeps are wrapped in condition test loops and should
      be safe but we have quite a number of places where sleep and wakeup
      conditions are expected to be interlocked.  Although the window of
      opportunity is tiny, ptrace can be used by non-privileged users and with
      some loading the window can definitely be extended and exploited.
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Acked-by: default avatarRoland McGrath <roland@redhat.com>
      Acked-by: default avatarOleg Nesterov <oleg@redhat.com>
      Cc: <stable@kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      01e05e9a
    • Boaz Harrosh's avatar
      vfs: call rcu_barrier after ->kill_sb() · d863b50a
      Boaz Harrosh authored
      In commit fa0d7e3d ("fs: icache RCU free inodes"), we use rcu free
      inode instead of freeing the inode directly.  It causes a crash when we
      rmmod immediately after we umount the volume[1].
      
      So we need to call rcu_barrier after we kill_sb so that the inode is
      freed before we do rmmod.  The idea is inspired by Aneesh Kumar.
      rcu_barrier will wait for all callbacks to end before preceding.  The
      original patch was done by Tao Ma, but synchronize_rcu() is not enough
      here.
      
      1. http://marc.info/?l=linux-fsdevel&m=129680863330185&w=2Tested-by: default avatarTao Ma <boyu.mt@taobao.com>
      Signed-off-by: default avatarBoaz Harrosh <bharrosh@panasas.com>
      Cc: Nick Piggin <npiggin@kernel.dk>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Chris Mason <chris.mason@oracle.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      d863b50a
  2. 11 Feb, 2011 3 commits
    • Linus Torvalds's avatar
      Fix possible filp_cachep memory corruption · 2dab5974
      Linus Torvalds authored
      In commit 31e6b01f ("fs: rcu-walk for path lookup") we started doing
      path lookup using RCU, which then falls back to a careful non-RCU lookup
      in case of problems (LOOKUP_REVAL).  So do_filp_open() has this "re-do
      the lookup carefully" looping case.
      
      However, that means that we must not release the open-intent file data
      if we are going to loop around and use it once more!
      
      Fix this by moving the release of the open-intent data to the function
      that allocates it (do_filp_open() itself) rather than the helper
      functions that can get called multiple times (finish_open() and
      do_last()).  This makes the logic for the lifetime of that field much
      more obvious, and avoids the possible double free.
      Reported-by: default avatarJ. R. Okajima <hooanon05@yahoo.co.jp>
      Acked-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      Cc: Nick Piggin <npiggin@kernel.dk>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      2dab5974
    • Corey Minyard's avatar
      char/ipmi: fix OOPS caused by pnp_unregister_driver on unregistered driver · d2478521
      Corey Minyard authored
      This patch fixes an OOPS triggered when calling modprobe ipmi_si a
      second time after the first modprobe returned without finding any ipmi
      devices.  This can happen if you reload the module after having the
      first module load fail.  The driver was not deregistering from PNP in
      that case.
      
      Peter Huewe originally reported this patch and supplied a fix, I have a
      different patch based on Linus' suggestion that cleans things up a bit
      more.
      
      Cc: stable@kernel.org
      Cc: openipmi-developer@lists.sourceforge.net
      Reviewed-by: default avatarPeter Huewe <peterhuewe@gmx.de>
      Cc: Randy Dunlap <randy.dunlap@oracle.com>
      Signed-off-by: default avatarCorey Minyard <cminyard@mvista.com>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      d2478521
    • Linus Torvalds's avatar
      cap_syslog: accept CAP_SYS_ADMIN for now · ee24aebf
      Linus Torvalds authored
      In commit ce6ada35 ("security: Define CAP_SYSLOG") Serge Hallyn
      introduced CAP_SYSLOG, but broke backwards compatibility by no longer
      accepting CAP_SYS_ADMIN as an override (it would cause a warning and
      then reject the operation).
      
      Re-instate CAP_SYS_ADMIN - but keeping the warning - as an acceptable
      capability until any legacy applications have been updated.  There are
      apparently applications out there that drop all capabilities except for
      CAP_SYS_ADMIN in order to access the syslog.
      
      (This is a re-implementation of a patch by Serge, cleaning the logic up
      and making the code more readable)
      Acked-by: default avatarSerge Hallyn <serge@hallyn.com>
      Reviewed-by: default avatarJames Morris <jmorris@namei.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      ee24aebf
  3. 10 Feb, 2011 9 commits
  4. 09 Feb, 2011 16 commits
  5. 08 Feb, 2011 4 commits