1. 03 Mar, 2013 40 commits
    • staging: comedi: check s->async for poll(), read() and write() · 9cb9a591
      Ian Abbott authored
      commit cc400e18 upstream.
      
      Some low-level comedi drivers (incorrectly) point `dev->read_subdev` or
      `dev->write_subdev` to a subdevice that does not support asynchronous
      commands.  Comedi's poll(), read() and write() file operation handlers
      assume these subdevices do support asynchronous commands.  In
      particular, they assume `s->async` is valid (where `s` points to the
      read or write subdevice), which it won't be if it has been set
      incorrectly.  This can lead to a NULL pointer dereference.
      
      Check `s->async` is non-NULL in `comedi_poll()`, `comedi_read()` and
      `comedi_write()` to avoid the bug.
      Signed-off-by: Ian Abbott <abbotti@mev.co.uk>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
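The kind of guard described in the comedi commit can be sketched in plain C as follows. This is a minimal illustration, not the driver code: `struct subdevice`, `struct async_state`, `demo_read()` and the choice of `-EIO` as the error value are all assumptions made for the example.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the comedi structures. */
struct async_state {
    unsigned int bytes_ready;
};

struct subdevice {
    struct async_state *async; /* NULL if async commands are unsupported */
};

/*
 * Sketch of a read-style handler: bail out when the subdevice is missing
 * or has no async state, instead of dereferencing a NULL pointer.
 * Returning -EIO is an assumption made for this sketch.
 */
long demo_read(const struct subdevice *s, unsigned int want)
{
    if (s == NULL || s->async == NULL)
        return -EIO;

    /* Hand back at most what is available. */
    return want < s->async->bytes_ready ? want : s->async->bytes_ready;
}
```

The check comes before any use of `s->async`, so a misconfigured subdevice produces an error return rather than an oops.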
    • ACPI: Add DMI entry for Sony VGN-FW41E_H · 30f3a0a7
      Joseph Salisbury authored
      commit 66f2fda9 upstream.
      
      This patch adds a quirk to allow the Sony VGN-FW41E_H to suspend/resume
      properly.
      
      References: http://bugs.launchpad.net/bugs/1113547
      Signed-off-by: Joseph Salisbury <joseph.salisbury@canonical.com>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ab8500_btemp: Demote initcall sequence · 15642204
      Rajanikanth H.V authored
      commit eeb0751c upstream.
      
      Power supply subsystem creates thermal zone device for the property
      'POWER_SUPPLY_PROP_TEMP' which requires thermal subsystem to be ready
      before 'ab8500 battery temperature monitor' driver is initialized. ab8500
      btemp driver is initialized with subsys_initcall whereas thermal subsystem
      is initialized with fs_initcall which causes
      thermal_zone_device_register(...) to crash since the required structure
      'thermal_class' is not initialized yet:
      
      Unable to handle kernel NULL pointer dereference at virtual address 000000a4
      pgd = c0004000
      [000000a4] *pgd=00000000
      Internal error: Oops: 5 [#1] PREEMPT SMP ARM
      Modules linked in:
      CPU: 0    Tainted: G        W     (3.8.0-rc4-00001-g632fda8-dirty #1)
      PC is at _raw_spin_lock+0x18/0x54
      LR is at get_device_parent+0x50/0x1b8
      pc : [<c02f1dd0>]    lr : [<c01cb248>]    psr: 60000013
      sp : ef04bdc8  ip : 00000000  fp : c0446180
      r10: ef216e38  r9 : c03af5d0  r8 : ef275c18
      r7 : 00000000  r6 : c0476c14  r5 : ef275c18  r4 : ef095840
      r3 : ef04a000  r2 : 00000001  r1 : 00000000  r0 : 000000a4
      Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment kernel
      Control: 10c5787d  Table: 0000404a  DAC: 00000015
      Process swapper/0 (pid: 1, stack limit = 0xef04a238)
      Stack: (0xef04bdc8 to 0xef04c000)
      [...]
      [<c02f1dd0>] (_raw_spin_lock+0x18/0x54) from [<c01cb248>] (get_device_parent+0x50/0x1b8)
      [<c01cb248>] (get_device_parent+0x50/0x1b8) from [<c01cb8d8>] (device_add+0xa4/0x574)
      [<c01cb8d8>] (device_add+0xa4/0x574) from [<c020b91c>] (thermal_zone_device_register+0x118/0x938)
      [<c020b91c>] (thermal_zone_device_register+0x118/0x938) from [<c0202030>] (power_supply_register+0x170/0x1f8)
      [<c0202030>] (power_supply_register+0x170/0x1f8) from [<c02055ec>] (ab8500_btemp_probe+0x208/0x47c)
      [<c02055ec>] (ab8500_btemp_probe+0x208/0x47c) from [<c01cf0dc>] (platform_drv_probe+0x14/0x18)
      [<c01cf0dc>] (platform_drv_probe+0x14/0x18) from [<c01cde70>] (driver_probe_device+0x74/0x20c)
      [<c01cde70>] (driver_probe_device+0x74/0x20c) from [<c01ce094>] (__driver_attach+0x8c/0x90)
      [<c01ce094>] (__driver_attach+0x8c/0x90) from [<c01cc640>] (bus_for_each_dev+0x4c/0x80)
      [<c01cc640>] (bus_for_each_dev+0x4c/0x80) from [<c01cd6b4>] (bus_add_driver+0x16c/0x23c)
      [<c01cd6b4>] (bus_add_driver+0x16c/0x23c) from [<c01ce54c>] (driver_register+0x78/0x14c)
      [<c01ce54c>] (driver_register+0x78/0x14c) from [<c00086ac>] (do_one_initcall+0xfc/0x164)
      [<c00086ac>] (do_one_initcall+0xfc/0x164) from [<c02e89c8>] (kernel_init+0x120/0x2b8)
      [<c02e89c8>] (kernel_init+0x120/0x2b8) from [<c000e358>] (ret_from_fork+0x14/0x3c)
      Code: e3c3303f e5932004 e2822001 e5832004 (e1903f9f)
      ---[ end trace ed9df72941b5bada ]---
      Signed-off-by: Rajanikanth H.V <rajanikanth.hv@stericsson.com>
      Signed-off-by: Anton Vorontsov <anton@enomsg.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
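The ordering problem in the ab8500_btemp commit can be modelled with a small userspace sketch. Initcall levels run in ascending order (subsys before fs before device), so a driver registered at the subsys level runs before a subsystem registered at the fs level. The commit text does not say which level the driver was demoted to; `LEVEL_DEVICE` is used here as an assumption, and all function names are hypothetical.

```c
/* Initcall levels in boot order; lower values run earlier. */
enum initcall_level {
    LEVEL_SUBSYS = 4, /* subsys_initcall */
    LEVEL_FS     = 5, /* fs_initcall */
    LEVEL_DEVICE = 6  /* device_initcall */
};

static int thermal_class_ready; /* set once the thermal subsystem is up */
static int btemp_probe_result;  /* 0 = ok, -1 = thermal class used too early */

static void thermal_init(void)
{
    thermal_class_ready = 1;
}

static void btemp_init(void)
{
    /* Registering a thermal zone needs the thermal class to exist. */
    btemp_probe_result = thermal_class_ready ? 0 : -1;
}

/*
 * Run both init functions in level order, with the battery driver placed
 * at btemp_level, and report whether its probe found the thermal class.
 */
int boot_with_btemp_at(enum initcall_level btemp_level)
{
    int level;

    thermal_class_ready = 0;
    btemp_probe_result = -1;
    for (level = 0; level <= 7; level++) {
        if (level == LEVEL_FS)
            thermal_init();
        if (level == (int)btemp_level)
            btemp_init();
    }
    return btemp_probe_result;
}
```

With the driver at `LEVEL_SUBSYS` the probe runs before `thermal_init()` and fails; at `LEVEL_DEVICE` it succeeds.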
    • ab8500-chargalg: Only root should have write permission on sysfs file · 71b9101a
      Lee Jones authored
      commit e3455002 upstream.
      
      Only root should have write permission on sysfs file ab8500_chargalg/chargalg.
      Signed-off-by: Lee Jones <lee.jones@linaro.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • bq27x00_battery: Fix bugs introduced with BQ27425 support · eddf61b2
      NeilBrown authored
      commit bde83b9a upstream.
      
      commit a66f59ba
      
          bq27x00_battery: Add support for BQ27425 chip
      
      introduced 2 bugs.
      
      1/ 'chip' was set to BQ27425 unconditionally - breaking support for
         other devices;
      
      2/ BQ27425 does not support cycle count, however the code still tries to
         get the cycle count for BQ27425, and now does it twice for other chips.
      Signed-off-by: NeilBrown <neilb@suse.de>
      Cc: Saranya Gopal <saranya.gopal@intel.com>
      Signed-off-by: Anton Vorontsov <anton@enomsg.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • cgroup: fix exit() vs rmdir() race · ec463f0c
      Li Zefan authored
      commit 71b5707e upstream.
      
      In cgroup_exit() put_css_set_taskexit() is called without any lock,
      which might lead to accessing a freed cgroup:
      
      thread1                           thread2
      ---------------------------------------------
      exit()
        cgroup_exit()
          put_css_set_taskexit()
            atomic_dec(cgrp->count);
                                         rmdir();
            /* not safe !! */
            check_for_release(cgrp);
      
      rcu_read_lock() can be used to make sure the cgroup is alive.
      Signed-off-by: Li Zefan <lizefan@huawei.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • cpuset: fix cpuset_print_task_mems_allowed() vs rename() race · b19c8d0b
      Li Zefan authored
      commit 63f43f55 upstream.
      
      rename() will change dentry->d_name. The result of this race can
      be worse than seeing a partially rewritten name: we might access
      a stale pointer, because rename() will re-allocate memory to hold
      a longer name.
      
      Accessing the name is safe under the protection of dentry->d_lock.
      
      v2: check NULL dentry before acquiring dentry lock.
      Signed-off-by: Li Zefan <lizefan@huawei.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • pstore: Avoid deadlock in panic and emergency-restart path · 225234a2
      Seiji Aguchi authored
      commit 9f244e9c upstream.
      
      [Issue]
      
      When pstore is in panic and emergency-restart paths, it may be blocked
      in those paths because it simply takes spin_lock.
      
      This is an example scenario in which pstore may hang in a panic path:
      
       - cpuA grabs psinfo->buf_lock
       - cpuB panics and calls smp_send_stop
       - smp_send_stop sends IRQ to cpuA
       - after 1 second, cpuB gives up on cpuA and sends an NMI instead
       - cpuA is now in an NMI handler while still holding buf_lock
       - cpuB is deadlocked
      
      This case may happen if the firmware has a bug and
      cpuA is stuck talking with it for more than one second.
      
      A similar scenario exists in the emergency-restart path:
      
       - cpuA grabs psinfo->buf_lock and gets stuck in firmware
       - cpuB kicks emergency-restart via either sysrq-b or the hangcheck timer.
         cpuB then deadlocks trying to take psinfo->buf_lock.
      
      [Solution]
      
      This patch avoids the deadlocking issues in both panic and emergency_restart
      paths by introducing a function, is_non_blocking_path(), to check if a cpu
      can be blocked in current path.
      
      With this patch, pstore is not blocked in those paths even if another
      cpu has taken the spin_lock, because spin_lock_irqsave is replaced with
      spin_trylock_irqsave there.
      
      In addition, according to a comment of emergency_restart() in kernel/sys.c,
      spin_lock shouldn't be taken in an emergency_restart path to avoid
      deadlock. This patch fits the comment below.
      
      <snip>
      /**
       *      emergency_restart - reboot the system
       *
       *      Without shutting down any hardware or taking any locks
       *      reboot the system.  This is called when we know we are in
       *      trouble so this is our best effort to reboot.  This is
       *      safe to call in interrupt context.
       */
      void emergency_restart(void)
      <snip>
      Signed-off-by: Seiji Aguchi <seiji.aguchi@hds.com>
      Acked-by: Don Zickus <dzickus@redhat.com>
      Signed-off-by: Tony Luck <tony.luck@intel.com>
      Cc: CAI Qian <caiqian@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
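The trylock idea from the pstore commit can be sketched in userspace with a C11 `atomic_flag` standing in for `psinfo->buf_lock`. All names here are hypothetical. The sketch simply gives up on the record when the lock is busy in a non-blocking path, which is a choice made for illustration and may differ from what the patch itself does in that case.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Minimal test-and-set lock standing in for psinfo->buf_lock. */
static atomic_flag buf_lock = ATOMIC_FLAG_INIT;

static bool buf_trylock(void)
{
    /* test_and_set returns the previous value: false means we got it. */
    return !atomic_flag_test_and_set_explicit(&buf_lock,
                                              memory_order_acquire);
}

void buf_unlock(void)
{
    atomic_flag_clear_explicit(&buf_lock, memory_order_release);
}

void buf_lock_blocking(void)
{
    while (!buf_trylock())
        ; /* spin until the holder releases the lock */
}

/*
 * Sketch of a dump path: in a path that must not block (panic or
 * emergency restart), try the lock once and give up if it is held;
 * otherwise take it normally. Returns true if the record was written.
 */
bool dump_record(bool non_blocking_path)
{
    if (non_blocking_path) {
        if (!buf_trylock())
            return false; /* held elsewhere: do not spin forever */
    } else {
        buf_lock_blocking();
    }
    /* ... the record would be written into the buffer here ... */
    buf_unlock();
    return true;
}
```

The point of the design is that the caller in a panic path never spins on a lock whose holder may have been stopped for good.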
    • workqueue: consider work function when searching for busy work items · e50e7d63
      Tejun Heo authored
      commit a2c1c57b upstream.
      
      To avoid executing the same work item concurrently, workqueue hashes
      currently busy workers according to their current work items and looks
      up the table when it wants to execute a new work item.  If there
      already is a worker which is executing the new work item, the new item
      is queued to the found worker so that it gets executed only after the
      current execution finishes.
      
      Unfortunately, a work item may be freed while being executed and thus
      recycled for different purposes.  If it gets recycled for a different
      work item and queued while the previous execution is still in
      progress, workqueue may make the new work item wait for the old one
      although the two aren't really related in any way.
      
      In extreme cases, this false dependency may lead to deadlock although
      it's extremely unlikely given that there aren't too many self-freeing
      work item users and they usually don't wait for other work items.
      
      To alleviate the problem, record the current work function in each
      busy worker and match it together with the work item address in
      find_worker_executing_work().  While this isn't complete, it ensures
      that unrelated work items don't interact with each other and in the
      very unlikely case where a twisted wq user triggers it, it's always
      onto itself making the culprit easy to spot.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reported-by: Andrey Isakov <andy51@gmx.ru>
      Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=51701
      Cc: stable@vger.kernel.org
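The matching rule described in the workqueue commit, comparing the work function as well as the work item address, can be sketched as below. The structures are simplified stand-ins, a linear scan replaces the hash lookup, and `old_job()` / `new_job()` are hypothetical work functions added only so the example can be exercised.

```c
#include <stddef.h>

typedef void (*work_func_t)(void *arg); /* simplified signature */

struct work_item {
    work_func_t func;
};

struct worker {
    struct work_item *current_work; /* address of the item being executed */
    work_func_t current_func;       /* function recorded when it started */
};

/* Two unrelated work functions used for demonstration. */
void old_job(void *arg) { (void)arg; }
void new_job(void *arg) { (void)arg; }

/*
 * Return the worker that is executing `work`, or NULL. A worker only
 * matches if both the item address and the recorded function agree, so
 * recycled memory holding a different work item is not mistaken for the
 * one still in flight.
 */
struct worker *find_worker_executing(struct worker *workers, size_t n,
                                     struct work_item *work)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (workers[i].current_work == work &&
            workers[i].current_func == work->func)
            return &workers[i];
    }
    return NULL;
}
```

If the memory behind a work item is reused with a different function while the old execution is still running, the lookup no longer reports a match.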
    • fuse: don't WARN when nlink is zero · c205ae0e
      Miklos Szeredi authored
      commit dfca7ceb upstream.
      
      drop_nlink() warns if nlink is already zero.  This is triggerable by a buggy
      userspace filesystem.  The cure, I think, is worse than the disease so disable
      the warning.
      Reported-by: Tero Roponen <tero.roponen@gmail.com>
      Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • HID: clean up quirk for Sony RF receivers · d62365d1
      Fernando Luis Vázquez Cao authored
      commit 99d24902 upstream.
      
      Document what the fix-up does and make it more robust by ensuring
      that it is only applied to the USB interface that corresponds to the
      mouse (sony_report_fixup() is called once per interface during probing).
      Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
      Signed-off-by: Jiri Kosina <jkosina@suse.cz>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • HID: add support for Sony RF receiver with USB product id 0x0374 · 9637341b
      Fernando Luis Vázquez Cao authored
      commit a4649184 upstream.
      
      Some Vaio desktop computers, among them the VGC-LN51JGB multimedia PC, have
      a RF receiver, multi-interface USB device 054c:0374, that is used to connect
      a wireless keyboard and a wireless mouse.
      
      The keyboard works flawlessly, but the mouse (VGP-WMS3 in my case) does not
      seem to be generating any pointer events. The problem is that the mouse pointer
      is wrongly declared as a constant non-data variable in the report descriptor
      (see lsusb and usbhid-dump output below), with the consequence that it is
      ignored by the HID code.
      
      Add this device to the have-special-driver list and fix up the report
      descriptor in the Sony-specific driver which happens to already have a fixup
      for a similar firmware bug.
      
      # lsusb -vd 054C:0374
      Bus 003 Device 002: ID 054c:0374 Sony Corp.
      Device Descriptor:
        bLength                18
        bDescriptorType         1
        bcdUSB               2.00
        bDeviceClass            0 (Defined at Interface level)
        bDeviceSubClass         0
        bDeviceProtocol         0
        bMaxPacketSize0         8
        idVendor           0x054c Sony Corp.
        idProduct          0x0374
        iSerial                 0
      [...]
          Interface Descriptor:
            bLength                 9
            bDescriptorType         4
            bInterfaceNumber        1
            bAlternateSetting       0
            bNumEndpoints           1
            bInterfaceClass         3 Human Interface Device
            bInterfaceSubClass      1 Boot Interface Subclass
            bInterfaceProtocol      2 Mouse
            iInterface              2 RF Receiver
      [...]
                Report Descriptor: (length is 100)
      [...]
                  Item(Global): Usage Page, data= [ 0x01 ] 1
                                  Generic Desktop Controls
                  Item(Local ): Usage, data= [ 0x30 ] 48
                                  Direction-X
                  Item(Local ): Usage, data= [ 0x31 ] 49
                                  Direction-Y
                  Item(Global): Report Count, data= [ 0x02 ] 2
                  Item(Global): Report Size, data= [ 0x08 ] 8
                  Item(Global): Logical Minimum, data= [ 0x81 ] 129
                  Item(Global): Logical Maximum, data= [ 0x7f ] 127
                  Item(Main  ): Input, data= [ 0x07 ] 7
                                  Constant Variable Relative No_Wrap Linear
                                  Preferred_State No_Null_Position Non_Volatile Bitfield
      
      # usbhid-dump
      003:002:001:DESCRIPTOR         1357910009.758544
       05 01 09 02 A1 01 05 01 09 02 A1 02 85 01 09 01
       A1 00 05 09 19 01 29 05 95 05 75 01 15 00 25 01
       81 02 75 03 95 01 81 01 05 01 09 30 09 31 95 02
       75 08 15 81 25 7F 81 07 A1 02 85 01 09 38 35 00
       45 00 15 81 25 7F 95 01 75 08 81 06 C0 A1 02 85
       01 05 0C 15 81 25 7F 95 01 75 08 0A 38 02 81 06
       C0 C0 C0 C0
      Signed-off-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
      Signed-off-by: Jiri Kosina <jkosina@suse.cz>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
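A report descriptor fix-up of the kind described in the Sony RF receiver commit can be sketched against the dump quoted there. In an HID Input item, bit 0 of the data byte selects Constant (1) or Data (0), so clearing it turns `0x07` (Constant, Variable, Relative) into `0x06` (Data, Variable, Relative). The byte array is copied from the usbhid-dump output; the offsets 54 and 55 were counted from that dump for this sketch, and `fixup_pointer_item()` is a hypothetical name, not the driver's function.

```c
#include <stddef.h>
#include <stdint.h>

/* Report descriptor bytes copied from the usbhid-dump output. */
const uint8_t vaio_rdesc[100] = {
    0x05, 0x01, 0x09, 0x02, 0xA1, 0x01, 0x05, 0x01,
    0x09, 0x02, 0xA1, 0x02, 0x85, 0x01, 0x09, 0x01,
    0xA1, 0x00, 0x05, 0x09, 0x19, 0x01, 0x29, 0x05,
    0x95, 0x05, 0x75, 0x01, 0x15, 0x00, 0x25, 0x01,
    0x81, 0x02, 0x75, 0x03, 0x95, 0x01, 0x81, 0x01,
    0x05, 0x01, 0x09, 0x30, 0x09, 0x31, 0x95, 0x02,
    0x75, 0x08, 0x15, 0x81, 0x25, 0x7F, 0x81, 0x07,
    0xA1, 0x02, 0x85, 0x01, 0x09, 0x38, 0x35, 0x00,
    0x45, 0x00, 0x15, 0x81, 0x25, 0x7F, 0x95, 0x01,
    0x75, 0x08, 0x81, 0x06, 0xC0, 0xA1, 0x02, 0x85,
    0x01, 0x05, 0x0C, 0x15, 0x81, 0x25, 0x7F, 0x95,
    0x01, 0x75, 0x08, 0x0A, 0x38, 0x02, 0x81, 0x06,
    0xC0, 0xC0, 0xC0, 0xC0
};

#define HID_INPUT_ITEM    0x81 /* Input main item with one data byte */
#define HID_CONST_VAR_REL 0x07 /* Constant, Variable, Relative */
#define HID_DATA_VAR_REL  0x06 /* Data, Variable, Relative */

/*
 * Patch the X/Y Input item so it is reported as data instead of a
 * constant. Returns 1 if the descriptor was changed, 0 if it was too
 * short or did not contain the expected bytes at the expected offset.
 */
int fixup_pointer_item(uint8_t *rdesc, size_t size)
{
    if (size >= 56 && rdesc[54] == HID_INPUT_ITEM &&
        rdesc[55] == HID_CONST_VAR_REL) {
        rdesc[55] = HID_DATA_VAR_REL;
        return 1;
    }
    return 0;
}
```

Checking both the length and the exact bytes before patching keeps the fix-up from touching a descriptor that does not have the faulty item.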
    • svcrpc: fix rpc server shutdown races · acb9bc5f
      J. Bruce Fields authored
      commit cc630d9f upstream.
      
      Rewrite server shutdown to remove the assumption that there are no
      longer any threads running (no longer true, for example, when shutting
      down the service in one network namespace while it's still running in
      others).
      
      Do that by doing what we'd do in normal circumstances: just CLOSE each
      socket, then enqueue it.
      
      Since there may not be threads to handle the resulting queued xprts,
      also run a simplified version of the svc_recv() loop run by a server to
      clean up any closed xprts afterwards.
      Tested-by: Jason Tibbitts <tibbs@math.uh.edu>
      Tested-by: Paweł Sikora <pawel.sikora@agmk.net>
      Acked-by: Stanislav Kinsbursky <skinsbursky@parallels.com>
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • svcrpc: make svc_age_temp_xprts enqueue under sv_lock · cc5e7bc7
      J. Bruce Fields authored
      commit e75bafbf upstream.
      
      svc_age_temp_xprts expires xprts in a two-step process: first it takes
      the sv_lock and moves the xprts to expire off their server-wide list
      (sv_tempsocks or sv_permsocks) to a local list.  Then it drops the
      sv_lock and enqueues and puts each one.
      
      I see no reason for this: svc_xprt_enqueue() will take sp_lock, but the
      sv_lock and sp_lock are not otherwise nested anywhere (and documentation
      at the top of this file claims it's correct to nest these with sp_lock
      inside.)
      Tested-by: Jason Tibbitts <tibbs@math.uh.edu>
      Tested-by: Paweł Sikora <pawel.sikora@agmk.net>
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • nfsd: Fix memleak · d7bfb000
      majianpeng authored
      commit 2d32b29a upstream.
      
      When freeing the nfs client, ->cl_stateids must be freed too.
      Signed-off-by: Jianpeng Ma <majianpeng@gmail.com>
      Signed-off-by: J. Bruce Fields <bfields@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ext4: fix free clusters calculation in bigalloc filesystem · 0ff827cf
      Lukas Czerner authored
      commit 304e220f upstream.
      
      ext4_has_free_clusters() should tell us whether there are enough free
      clusters to allocate. However, the number of free clusters in the file
      system is converted to blocks using EXT4_C2B(), which is not only the
      wrong use of the macro (we should have used EXT4_NUM_B2C) but also a
      completely wrong concept, since everything else is in cluster units.
      
      Moreover when calculating number of root clusters we should be using
      macro EXT4_NUM_B2C() instead of EXT4_B2C() otherwise the result might be
      off by one. However r_blocks_count should always be a multiple of the
      cluster ratio so doing a plain bit shift should be enough here. We
      avoid using EXT4_B2C() because it's confusing.
      
      As a result of the first problem the number of free clusters is much
      bigger than it should have been and ext4_has_free_clusters() would
      return 1 even if there are really not enough free clusters available.
      
      Fix this by removing the EXT4_C2B() conversion of free clusters and
      using bit shift when calculating number of root clusters. This bug
      affects number of xfstests tests covering file system ENOSPC situation
      handling. With this patch most of the ENOSPC problems with bigalloc file
      system disappear, especially the errors caused by delayed allocation not
      having enough space when the actual allocation is finally requested.
      Signed-off-by: Lukas Czerner <lczerner@redhat.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
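The unit conversions discussed in the bigalloc commit can be illustrated with simplified versions of the three macros. Here they take the cluster shift (`bits`, log2 of blocks per cluster) directly instead of a superblock info pointer, and `has_free_clusters()` is a heavily reduced stand-in for the real check; both simplifications are assumptions made for this sketch.

```c
typedef unsigned long long blk_t;

/* Clusters to blocks: multiply by the cluster ratio. */
blk_t c2b(blk_t clusters, unsigned bits)
{
    return clusters << bits;
}

/* Blocks to clusters, rounding down. */
blk_t b2c(blk_t blocks, unsigned bits)
{
    return blocks >> bits;
}

/* Number of clusters needed to hold `blocks`, rounding up. */
blk_t num_b2c(blk_t blocks, unsigned bits)
{
    return (blocks + ((blk_t)1 << bits) - 1) >> bits;
}

/*
 * Reduced free-space check: every argument must be in cluster units.
 * Passing a block count as free_clusters inflates the result by the
 * cluster ratio, which is the kind of error the commit describes.
 */
int has_free_clusters(blk_t free_clusters, blk_t reserved_clusters,
                      blk_t wanted)
{
    return free_clusters >= reserved_clusters + wanted;
}
```

With 16 blocks per cluster (`bits` = 4), 17 blocks round down to 1 cluster but need 2 clusters of storage, which is why the rounding direction matters for reserved space.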
    • ext4: fix xattr block allocation/release with bigalloc · 808b5ab0
      Lukas Czerner authored
      commit 1231b3a1 upstream.
      
      Currently when a new xattr block is created or released we call
      dquot_alloc_block() or dquot_free_block() respectively, among other
      things incrementing or decrementing the number of blocks assigned to
      the inode by one block.
      
      This however does not work for a bigalloc file system, because we always
      allocate/free the whole cluster, so we have to account for that in
      dquot_free_block() and dquot_alloc_block() as well.
      
      Use the clusters-to-blocks conversion EXT4_C2B() when passing number of
      blocks to the dquot_alloc/free functions to fix the problem.
      
      The problem has been revealed by xfstests #117 (and possibly others).
      Signed-off-by: Lukas Czerner <lczerner@redhat.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      Reviewed-by: Eric Sandeen <sandeen@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ext4: fix race in ext4_mb_add_n_trim() · 8a6a8f04
      Niu Yawei authored
      commit f1167009 upstream.
      
      In ext4_mb_add_n_trim(), lg_prealloc_lock should be taken when
      changing the lg_prealloc_list.
      Signed-off-by: Niu Yawei <yawei.niu@intel.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ext4: release sysfs kobject when failing to enable quotas on mount · bcd7f174
      Theodore Ts'o authored
      commit 72ba7450 upstream.
      
      In addition, print the error returned from ext4_enable_quotas()
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ext4: check bh in ext4_read_block_bitmap() · 70d31ea8
      Eryu Guan authored
      commit 15b49132 upstream.
      
      Validate the bh pointer before using it, since
      ext4_read_block_bitmap_nowait() might return NULL.
      
      I've seen this in fsfuzz testing.
      
       EXT4-fs error (device loop0): ext4_read_block_bitmap_nowait:385: comm touch: Cannot get buffer for block bitmap - block_group = 0, block_bitmap = 3925999616
       BUG: unable to handle kernel NULL pointer dereference at           (null)
       IP: [<ffffffff8121de25>] ext4_wait_block_bitmap+0x25/0xe0
       ...
       Call Trace:
        [<ffffffff8121e1e5>] ext4_read_block_bitmap+0x35/0x60
        [<ffffffff8125e9c6>] ext4_free_blocks+0x236/0xb80
        [<ffffffff811d0d36>] ? __getblk+0x36/0x70
        [<ffffffff811d0a5f>] ? __find_get_block+0x8f/0x210
        [<ffffffff81191ef3>] ? kmem_cache_free+0x33/0x140
        [<ffffffff812678e5>] ext4_xattr_release_block+0x1b5/0x1d0
        [<ffffffff812679be>] ext4_xattr_delete_inode+0xbe/0x100
        [<ffffffff81222a7c>] ext4_free_inode+0x7c/0x4d0
        [<ffffffff812277b8>] ? ext4_mark_inode_dirty+0x88/0x230
        [<ffffffff8122993c>] ext4_evict_inode+0x32c/0x490
        [<ffffffff811b8cd7>] evict+0xa7/0x1c0
        [<ffffffff811b8ed3>] iput_final+0xe3/0x170
        [<ffffffff811b8f9e>] iput+0x3e/0x50
        [<ffffffff812316fd>] ext4_add_nondir+0x4d/0x90
        [<ffffffff81231d0b>] ext4_create+0xeb/0x170
        [<ffffffff811aae9c>] vfs_create+0xac/0xd0
        [<ffffffff811ac845>] lookup_open+0x185/0x1c0
        [<ffffffff8129e3b9>] ? selinux_inode_permission+0xa9/0x170
        [<ffffffff811acb54>] do_last+0x2d4/0x7a0
        [<ffffffff811af743>] path_openat+0xb3/0x480
        [<ffffffff8116a8a1>] ? handle_mm_fault+0x251/0x3b0
        [<ffffffff811afc49>] do_filp_open+0x49/0xa0
        [<ffffffff811bbaad>] ? __alloc_fd+0xdd/0x150
        [<ffffffff8119da28>] do_sys_open+0x108/0x1f0
        [<ffffffff8119db51>] sys_open+0x21/0x30
        [<ffffffff81618959>] system_call_fastpath+0x16/0x1b
      
      Also fix comment for ext4_read_block_bitmap_nowait()
      Signed-off-by: Eryu Guan <guaneryu@gmail.com>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ext4: return ENOMEM if sb_getblk() fails · 94bd696b
      Theodore Ts'o authored
      commit 860d21e2 upstream.
      
      The only reason for sb_getblk() failing is if it can't allocate the
      buffer_head.  So ENOMEM is more appropriate than EIO.  In addition,
      make sure that the file system is marked as being inconsistent if
      sb_getblk() fails.
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • media: rc: unlock on error in show_protocols() · 84d239fa
      Dan Carpenter authored
      commit 30ebc5e4 upstream.
      
      We recently introduced a new return -ENODEV in this function but we need
      to unlock before returning.
      
      [mchehab@redhat.com: found two patches with the same fix. Merged SOB's/acks into one patch]
      Acked-by: Herton R. Krzesinski <herton.krzesinski@canonical.com>
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: Douglas Bagnall <douglas@paradise.net.nz>
      Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
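The bug class fixed in show_protocols(), returning early while a lock is still held, is commonly avoided with a single unlock label. The sketch below shows that pattern with a counter standing in for the mutex; the structure, the field names and the use of `goto` are assumptions for illustration and are not taken from the patch.

```c
#include <errno.h>

/* Hypothetical device; lock_depth stands in for a mutex (1 while held). */
struct demo_dev {
    int lock_depth;
    int present;        /* 0 simulates the "no such device" case */
    unsigned protocols; /* value the show function reports */
};

static void demo_lock(struct demo_dev *d)   { d->lock_depth++; }
static void demo_unlock(struct demo_dev *d) { d->lock_depth--; }

/*
 * Report the enabled protocols. Every exit path after demo_lock() goes
 * through out_unlock, so the error return cannot leak the lock.
 */
int show_protocols_demo(struct demo_dev *d, unsigned *out)
{
    int ret = 0;

    demo_lock(d);
    if (!d->present) {
        ret = -ENODEV;
        goto out_unlock;
    }
    *out = d->protocols;

out_unlock:
    demo_unlock(d);
    return ret;
}
```

After either outcome the lock depth is back to zero, which is the property the original error path was missing.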
    • media: omap_vout: find_vma() needs ->mmap_sem held · 1952a8d9
      Al Viro authored
      commit 55ee64b3 upstream.
      
      Walking rbtree while it's modified is a Bad Idea(tm); besides,
      the result of find_vma() can be freed just as it's getting returned
      to caller.  Fortunately, it's easy to fix - just take ->mmap_sem a bit
      earlier (and don't bother with find_vma() at all if virtp >= PAGE_OFFSET -
      in that case we don't even look at its result).
      
      While we are at it, what prevents VIDIOC_PREPARE_BUF calling
      v4l_prepare_buf() -> (e.g) vb2_ioctl_prepare_buf() -> vb2_prepare_buf() ->
      __buf_prepare() -> __qbuf_userptr() -> vb2_vmalloc_get_userptr() -> find_vma(),
      AFAICS without having taken ->mmap_sem anywhere in process?  The code flow
      is bloody convoluted and depends on a bunch of things done by initialization,
      so I certainly might've missed something...
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Sakari Ailus <sakari.ailus@iki.fi>
      Cc: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
      Cc: Archit Taneja <archit@ti.com>
      Cc: Prabhakar Lad <prabhakar.lad@ti.com>
      Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • media: v4l: Reset subdev v4l2_dev field to NULL if registration fails · 59af41c9
      Laurent Pinchart authored
      commit 317efce9 upstream.
      
      When subdev registration fails, the subdev v4l2_dev field is left set to
      a non-NULL value. Later calls to v4l2_device_unregister_subdev() will
      consider the subdev as registered and will module_put() the subdev
      module without any matching module_get().
      Fix this by setting the subdev v4l2_dev field to NULL in
      v4l2_device_register_subdev() when the function fails.
      Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
      Acked-by: Sylwester Nawrocki <s.nawrocki@samsung.com>
      Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
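The reference imbalance described in the v4l2 subdev commit can be modelled with a counter. In this sketch the back-pointer doubles as the "registered" marker, so resetting it on failure keeps a later unregister call from dropping a reference that was never kept. All types, names and the `fail_registration` knob are hypothetical.

```c
#include <stddef.h>

struct demo_v4l2_dev {
    int id;
};

struct demo_subdev {
    struct demo_v4l2_dev *v4l2_dev; /* non-NULL only while registered */
    int module_refs;                /* stand-in for the module refcount */
    int fail_registration;          /* test knob: force a failure */
};

int demo_register_subdev(struct demo_v4l2_dev *dev, struct demo_subdev *sd)
{
    sd->module_refs++; /* stand-in for taking a module reference */
    sd->v4l2_dev = dev;

    if (sd->fail_registration) {
        sd->module_refs--;   /* undo the reference taken above */
        sd->v4l2_dev = NULL; /* the fix: do not look registered */
        return -1;
    }
    return 0;
}

void demo_unregister_subdev(struct demo_subdev *sd)
{
    if (sd->v4l2_dev == NULL)
        return; /* never registered: nothing to release */

    sd->v4l2_dev = NULL;
    sd->module_refs--; /* stand-in for dropping the module reference */
}
```

Without the reset in the failure branch, the unregister call would see a non-NULL pointer and drive the counter below zero.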
    • media: cx18/ivtv: fix regression: remove __init from a non-init function · 6afda116
      Hans Verkuil authored
      commit cfb046cb upstream.
      
      Commits 5e6e81b2 (cx18) and
      2aebbf67 (ivtv) added an __init
      annotation to the cx18-alsa-load and ivtv-alsa-load functions. However,
      these functions are called *after* initialization by the main cx18/ivtv
      driver. By that time the memory containing those functions is already
      freed and your machine goes BOOM.
      Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
      Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ext4: fix possible use-after-free with AIO · 3aa7a466
      Jan Kara authored
      commit 091e26df upstream.
      
      A running AIO pins the inode in memory through a file reference. Once
      the AIO is completed using aio_complete(), the file reference is put and
      the inode can be freed from memory. So we have to be sure that calling
      aio_complete() is the last thing we do with the inode.
      Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
      Acked-by: Jeff Moyer <jmoyer@redhat.com>
      Signed-off-by: Jan Kara <jack@suse.cz>
      Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • fs: Fix possible use-after-free with AIO · a50f8141
      Jan Kara authored
      commit 54c807e7 upstream.
      
      A running AIO pins the inode in memory through a file reference. Once
      the AIO is completed using aio_complete(), the file reference is put and
      the inode can be freed from memory. So we have to be sure that calling
      aio_complete() is the last thing we do with the inode.
      Acked-by: Jeff Moyer <jmoyer@redhat.com>
      CC: Christoph Hellwig <hch@infradead.org>
      CC: Jens Axboe <axboe@kernel.dk>
      CC: Jeff Moyer <jmoyer@redhat.com>
      Signed-off-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • nbd: fsync and kill block device on shutdown · f9cf4f43
      Paolo Bonzini authored
      commit 3a2d63f8 upstream.
      
      There are two problems with shutdown in the NBD driver.
      
      1: Receiving the NBD_DISCONNECT ioctl does not sync the filesystem.
      
         This patch adds the sync operation into __nbd_ioctl()'s
         NBD_DISCONNECT handler.  This is useful because BLKFLSBUF is restricted
         to processes that have CAP_SYS_ADMIN, and the NBD client may not
         possess it (fsync of the block device does not sync the filesystem,
         either).
      
      2: Once we clear the socket we have no guarantee that later reads will
         come from the same backing storage.
      
         The patch adds calls to kill_bdev() in __nbd_ioctl()'s socket
         clearing code so the page cache is cleaned; otherwise reads that hit
         the page cache would return stale data from the previously
         accessible disk.
      
      Example:
      
          # qemu-nbd -r -c/dev/nbd0 /dev/sr0
          # file -s /dev/nbd0
          /dev/stdin: # UDF filesystem data (version 1.5) etc.
          # qemu-nbd -d /dev/nbd0
          # qemu-nbd -r -c/dev/nbd0 /dev/sda
          # file -s /dev/nbd0
          /dev/stdin: # UDF filesystem data (version 1.5) etc.
      
      While /dev/sda has:
      
          # file -s /dev/sda
          /dev/sda: x86 boot sector; etc.
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Acked-by: Paul Clements <Paul.Clements@steeleye.com>
      Cc: Alex Bligh <alex@alex.org.uk>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      f9cf4f43
    • Xi Wang's avatar
      sysctl: fix null checking in bin_dn_node_address() · 603e070f
      Xi Wang authored
      commit df1778be upstream.
      
      The null check of `strchr() + 1' is broken: the result is always
      non-null, which leads to an out-of-bounds read.  Instead, check the
      result of strchr().
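A minimal sketch of the corrected pattern, written as a user-space helper. The name after_dot() is hypothetical and is not the kernel function; it only shows that the strchr() result must be tested before any offset is added to it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns the text after the first '.', or NULL when there is no '.'. */
static const char *after_dot(const char *buf)
{
	const char *dot;

	assert(buf != NULL);
	dot = strchr(buf, '.');

	/*
	 * Test the strchr() result itself.  A check of the form
	 *     p = strchr(buf, '.') + 1;  if (!p) ...
	 * cannot detect a miss, because the offset has already been added.
	 */
	if (!dot)
		return NULL;
	return dot + 1;
}
```

With this shape, after_dot("12.34") points at "34" and after_dot("1234") returns NULL instead of a pointer outside the buffer.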
      Signed-off-by: Xi Wang <xi.wang@gmail.com>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      603e070f
    • Tejun Heo's avatar
      firewire: add minor number range check to fw_device_init() · fd9471ef
      Tejun Heo authored
      commit 3bec60d5 upstream.
      
      fw_device_init() didn't check whether the allocated minor number is
      too large.  Fail if it overflows MINORBITS.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Suggested-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
      Acked-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      fd9471ef
    • Tejun Heo's avatar
      block: fix synchronization and limit check in blk_alloc_devt() · ca4e7610
      Tejun Heo authored
      commit ce23bba8 upstream.
      
      idr allocation in blk_alloc_devt() wasn't synchronized against lookup
      and removal, and its limit check was off by one - 1 << MINORBITS is
      the number of minors allowed, not the maximum allowed minor.
      
      Add locking and rename MAX_EXT_DEVT to NR_EXT_DEVT and fix limit
      checking.
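The off-by-one can be made concrete with a short sketch. MINORBITS and NR_EXT_DEVT are redefined here for user space, and ext_minor_in_range() is a hypothetical helper, not the function touched by the patch.

```c
#include <assert.h>
#include <stdbool.h>

#define MINORBITS	20			/* same value as the kernel's kdev_t.h */
#define NR_EXT_DEVT	(1 << MINORBITS)	/* a count: valid minors are 0 .. NR_EXT_DEVT - 1 */

/* An allocated index is usable only if it is strictly below the count. */
static bool ext_minor_in_range(int idx)
{
	return idx >= 0 && idx < NR_EXT_DEVT;
}

/* Boundary cases: the last valid minor and the first invalid one. */
static void ext_minor_self_check(void)
{
	assert(ext_minor_in_range(NR_EXT_DEVT - 1));
	assert(!ext_minor_in_range(NR_EXT_DEVT));
}
```

Treating 1 << MINORBITS as the largest allowed minor would accept one value too many, which is why the rename from a "MAX" to an "NR" constant goes together with the corrected comparison.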
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      ca4e7610
    • Tejun Heo's avatar
      idr: fix a subtle bug in idr_get_next() · bf949343
      Tejun Heo authored
      commit 6cdae741 upstream.
      
      The iteration logic of idr_get_next() is borrowed mostly verbatim from
      idr_for_each().  It walks down the tree looking for the slot matching
      the current ID.  If the matching slot is not found, the ID is
      incremented by the distance of a single slot at the given level and
      the walk repeats.
      
      The implementation assumes that during the whole iteration id is aligned
      to the layer boundaries of the level closest to the leaf, which is true
      for all iterations starting from zero or an existing element and thus is
      fine for idr_for_each().
      
      However, idr_get_next() may be given any starting point.  If the
      starting id falls in the middle of a non-existent layer, incrementing
      to the next layer skips the same offset into that layer.  For example,
      an IDR with IDs filled between [64, 127] would look like the following.
      
                [  0  64 ... ]
             /----/   |
             |        |
            NULL    [ 64 ... 127 ]
      
      If idr_get_next() is called with 63 as the starting point, it will try
      to follow down the pointer from 0.  As it is NULL, it will then try to
      proceed to the next slot in the same level by adding the slot distance
      at that level which is 64 - making the next try 127.  It goes around the
      loop and finds and returns 127 skipping [64, 126].
      
      Note that this bug also triggers in idr_for_each_entry() loop which
      deletes during iteration as deletions can make layers go away leaving
      the iteration with unaligned ID into missing layers.
      
      Fix it by ensuring that proceeding to the next slot doesn't carry over
      the unaligned offset, i.e. use round_up(id + 1, slot_distance) instead
      of id += slot_distance.
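The arithmetic of the two stepping rules can be checked in isolation. This is a user-space sketch: round_up_pow2() is a local helper assumed to behave like the kernel's round_up() for power-of-two steps, and next_id_old() / next_id_fixed() are hypothetical names for the old and new increments.

```c
#include <assert.h>

/* Round x up to the next multiple of step; step must be a power of two. */
static int round_up_pow2(int x, int step)
{
	assert(step > 0 && (step & (step - 1)) == 0);
	return (x + step - 1) & ~(step - 1);
}

/* Old stepping: keeps the unaligned offset of id within the slot. */
static int next_id_old(int id, int slot_distance)
{
	return id + slot_distance;
}

/* Fixed stepping: lands on the first ID of the next slot. */
static int next_id_fixed(int id, int slot_distance)
{
	return round_up_pow2(id + 1, slot_distance);
}
```

For the example in the text, a start of 63 with a slot distance of 64 steps to 127 under the old rule and to 64 under the fixed rule, so the range [64, 126] is no longer skipped.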
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reported-by: David Teigland <teigland@redhat.com>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      bf949343
    • Roger Pau Monne's avatar
      xen-blkback: use balloon pages for persistent grants · dfd7c4e8
      Roger Pau Monne authored
      commit 087ffecd upstream.
      
      With current persistent grants implementation we are not freeing the
      persistent grants after we disconnect the device. Since grant map
      operations change the mfn of the allocated page, and we can no longer
      pass it to __free_page without setting the mfn to a sane value, use
      balloon grant pages instead, as the gntdev device does.
      Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      dfd7c4e8
    • Konrad Rzeszutek Wilk's avatar
      xen-blkfront: drop the use of llist_for_each_entry_safe · a4c06c2a
      Konrad Rzeszutek Wilk authored
      commit f84adf49 upstream.
      
      Replace llist_for_each_entry_safe with a while loop.
      
      llist_for_each_entry_safe can trigger a bug in GCC 4.1, so it's best
      to remove it and use a while loop and do the deletion manually.
      
      Specifically this bug can be triggered by hot-unplugging a disk, either
      by doing xm block-detach or by save/restore cycle.
      
      BUG: unable to handle kernel paging request at fffffffffffffff0
      IP: [<ffffffffa0047223>] blkif_free+0x63/0x130 [xen_blkfront]
      The crash call trace is:
      	...
      bad_area_nosemaphore+0x13/0x20
      do_page_fault+0x25e/0x4b0
      page_fault+0x25/0x30
      ? blkif_free+0x63/0x130 [xen_blkfront]
      blkfront_resume+0x46/0xa0 [xen_blkfront]
      xenbus_dev_resume+0x6c/0x140
      pm_op+0x192/0x1b0
      device_resume+0x82/0x1e0
      dpm_resume+0xc9/0x1a0
      dpm_resume_end+0x15/0x30
      do_suspend+0x117/0x1e0
      
      When drilling down to the assembler code, on newer GCC it does
      .L29:
              cmpq    $-16, %r12      #, persistent_gnt check
              je      .L30    	#, out of the loop
      .L25:
      	... code in the loop
              testq   %r13, %r13      # n
              je      .L29    	#, back to the top of the loop
              cmpq    $-16, %r12      #, persistent_gnt check
              movq    16(%r12), %r13  # <variable>.node.next, n
              jne     .L25    	#,	back to the top of the loop
      .L30:
      
      While on GCC 4.1, it is:
      L78:
      	... code in the loop
      	testq   %r13, %r13      # n
              je      .L78    #,	back to the top of the loop
              movq    16(%rbx), %r13  # <variable>.node.next, n
              jmp     .L78    #,	back to the top of the loop
      
      Which basically means that the exit loop condition instead of
      being:
      
      	&(pos)->member != NULL;
      
      is:
      	;
      
      which makes the loop unbounded.
      
      Since xen-blkfront is the only user of the llist_for_each_entry_safe
      macro, remove it from llist.h.
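A manual while loop of the kind described can be sketched in user space as follows. This is not the actual patch: demo_node, demo_grant, demo_entry() and the helper functions are hypothetical, and the list is a plain singly linked list rather than the kernel's llist. The point is that the loop tests the node pointer itself before converting it to the containing entry, so the exit condition does not depend on comparing a member address against NULL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct demo_node {
	struct demo_node *next;
};

struct demo_grant {
	int gref;
	struct demo_node node;	/* embedded link, not the first member */
};

/* Map a pointer to the embedded node back to its containing entry. */
#define demo_entry(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Push a new entry on the front of the list; returns 0 on success. */
static int demo_push(struct demo_node **head, int gref)
{
	struct demo_grant *g;

	assert(head != NULL);
	g = malloc(sizeof(*g));
	if (!g)
		return -1;
	g->gref = gref;
	g->node.next = *head;
	*head = &g->node;
	return 0;
}

/* Detach the whole list and free every entry; returns the number freed. */
static int demo_free_all(struct demo_node **head)
{
	struct demo_node *n = *head;
	int freed = 0;

	*head = NULL;
	while (n) {			/* exit test on the node pointer */
		struct demo_grant *g = demo_entry(n, struct demo_grant, node);

		n = n->next;		/* fetch the successor before freeing */
		free(g);
		freed++;
	}
	return freed;
}
```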
      
      Orabug: 16263164
      Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      a4c06c2a
    • Konrad Rzeszutek Wilk's avatar
      xen/blkback: Don't trust the handle from the frontend. · ef56ca64
      Konrad Rzeszutek Wilk authored
      commit 01c681d4 upstream.
      
      The 'handle' is the device that the request is from. For the lifetime
      of the ring we copy it from a request to a response so that the frontend
      is not surprised by it. But we do not need it: when we start processing
      I/Os we have our own 'struct phys_req', which holds only the most
      essential information about the request. In fact, 'vbd_translate' ends
      up overwriting preq.dev with a value from the backend.
      
      This assignment of preq.dev with the 'handle' value is superfluous,
      so let's not do it.
      Acked-by: Jan Beulich <jbeulich@suse.com>
      Acked-by: Ian Campbell <ian.campbell@citrix.com>
      Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      ef56ca64
    • Jan Beulich's avatar
      xen-blkback: do not leak mode property · 55782573
      Jan Beulich authored
      commit 9d092603 upstream.
      
      "be->mode" is obtained from xenbus_read(), which does a kmalloc() for
      the message body. The short string is never released, so do it along
      with freeing "be" itself, and make sure the string isn't kept when
      backend_changed() doesn't complete successfully (which made it
      desirable to slightly re-structure that function, so that the error
      cleanup can be done in one place).
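The "error cleanup in one place" structure can be sketched in user space as below. Everything here is hypothetical: demo_backend, demo_read_mode() (a stand-in for a reader that returns a heap-allocated string, as xenbus_read() is described to do) and the fail_later flag that models a later step failing. It is a sketch of the pattern, not of the driver.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct demo_backend {
	char *mode;	/* owned by the backend once assigned */
};

/* Stand-in for a reader that returns a heap-allocated copy. */
static char *demo_read_mode(const char *value)
{
	size_t len = strlen(value) + 1;
	char *copy = malloc(len);

	if (copy)
		memcpy(copy, value, len);
	return copy;
}

/* Returns 0 on success; on failure be->mode is left unchanged. */
static int demo_backend_changed(struct demo_backend *be, const char *value,
				int fail_later)
{
	char *mode;
	int err = 0;

	assert(be != NULL && value != NULL);
	mode = demo_read_mode(value);
	if (!mode)
		return -1;

	if (fail_later) {	/* models a later setup step failing */
		err = -1;
		goto out;
	}

	free(be->mode);		/* drop any previous string */
	be->mode = mode;	/* ownership moves to the backend */
	mode = NULL;
out:
	free(mode);		/* single cleanup point; free(NULL) is a no-op */
	return err;
}

/* Release the string together with the rest of the backend state. */
static void demo_backend_free(struct demo_backend *be)
{
	free(be->mode);
	be->mode = NULL;
}
```

Routing both outcomes through one label means the temporary string is freed exactly once on failure and never freed after ownership has been handed over.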
      Reported-by: Olaf Hering <olaf@aepfle.de>
      Signed-off-by: Jan Beulich <jbeulich@suse.com>
      Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      55782573
    • Tomas Henzl's avatar
      block: fix ext_devt_idr handling · c5799187
      Tomas Henzl authored
      commit 7b74e912 upstream.
      
      While adding and removing a lot of disks and partitions, this
      sometimes shows up:
      
        WARNING: at fs/sysfs/dir.c:512 sysfs_add_one+0xc9/0x130() (Not tainted)
        Hardware name:
        sysfs: cannot create duplicate filename '/dev/block/259:751'
        Modules linked in: raid1 autofs4 bnx2fc cnic uio fcoe libfcoe libfc 8021q scsi_transport_fc scsi_tgt garp stp llc sunrpc cpufreq_ondemand powernow_k8 freq_table mperf ipv6 dm_mirror dm_region_hash dm_log power_meter microcode dcdbas serio_raw amd64_edac_mod edac_core edac_mce_amd i2c_piix4 i2c_core k10temp bnx2 sg ixgbe dca mdio ext4 mbcache jbd2 dm_round_robin sr_mod cdrom sd_mod crc_t10dif ata_generic pata_acpi pata_atiixp ahci mptsas mptscsih mptbase scsi_transport_sas dm_multipath dm_mod [last unloaded: scsi_wait_scan]
        Pid: 44103, comm: async/16 Not tainted 2.6.32-195.el6.x86_64 #1
        Call Trace:
          warn_slowpath_common+0x87/0xc0
          warn_slowpath_fmt+0x46/0x50
          sysfs_add_one+0xc9/0x130
          sysfs_do_create_link+0x12b/0x170
          sysfs_create_link+0x13/0x20
          device_add+0x317/0x650
          idr_get_new+0x13/0x50
          add_partition+0x21c/0x390
          rescan_partitions+0x32b/0x470
          sd_open+0x81/0x1f0 [sd_mod]
          __blkdev_get+0x1b6/0x3c0
          blkdev_get+0x10/0x20
          register_disk+0x155/0x170
          add_disk+0xa6/0x160
          sd_probe_async+0x13b/0x210 [sd_mod]
          add_wait_queue+0x46/0x60
          async_thread+0x102/0x250
          default_wake_function+0x0/0x20
          async_thread+0x0/0x250
          kthread+0x96/0xa0
          child_rip+0xa/0x20
          kthread+0x0/0xa0
          child_rip+0x0/0x20
      
      This most likely happens because dev_t is freed while the number is
      still used and idr_get_new() is not protected on every use.  The fix
      adds a mutex where it wasn't before and moves the dev_t free function so
      it is called after device del.
      Signed-off-by: Tomas Henzl <thenzl@redhat.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      c5799187
    • Xiaowei.Hu's avatar
      ocfs2: ac->ac_allow_chain_relink=0 won't disable group relink · 68719e23
      Xiaowei.Hu authored
      commit 309a85b6 upstream.
      
      ocfs2_block_group_alloc_discontig() disables chain relink by setting
      ac->ac_allow_chain_relink = 0 because it grabs clusters from multiple
      cluster groups.
      
      It doesn't keep the credits for all chain relink, but
      ocfs2_claim_suballoc_bits() overrides this in this call trace:
      ocfs2_block_group_claim_bits()->ocfs2_claim_clusters()->
      __ocfs2_claim_clusters()->ocfs2_claim_suballoc_bits()
      ocfs2_claim_suballoc_bits() sets ac->ac_allow_chain_relink = 1, then
      calls ocfs2_search_chain() one time and disables it again, and then we
      run out of credits.
      
      The fix is to allow relink by default and disable it in
      ocfs2_block_group_alloc_discontig().
      
      Without this patch, end users will run into a crash caused by running
      out of credits, with a backtrace like this:
      
        RIP: 0010:[<ffffffffa0808b14>]  [<ffffffffa0808b14>]
        jbd2_journal_dirty_metadata+0x164/0x170 [jbd2]
        RSP: 0018:ffff8801b919b5b8  EFLAGS: 00010246
        RAX: 0000000000000000 RBX: ffff88022139ddc0 RCX: ffff880159f652d0
        RDX: ffff880178aa3000 RSI: ffff880159f652d0 RDI: ffff880087f09bf8
        RBP: ffff8801b919b5e8 R08: 0000000000000000 R09: 0000000000000000
        R10: 0000000000001e00 R11: 00000000000150b0 R12: ffff880159f652d0
        R13: ffff8801a0cae908 R14: ffff880087f09bf8 R15: ffff88018d177800
        FS:  00007fc9b0b6b6e0(0000) GS:ffff88022fd40000(0000) knlGS:0000000000000000
        CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
        CR2: 000000000040819c CR3: 0000000184017000 CR4: 00000000000006e0
        DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
        DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
        Process dd (pid: 9945, threadinfo ffff8801b919a000, task ffff880149a264c0)
        Call Trace:
          ocfs2_journal_dirty+0x2f/0x70 [ocfs2]
          ocfs2_relink_block_group+0x111/0x480 [ocfs2]
          ocfs2_search_chain+0x455/0x9a0 [ocfs2]
          ...
      Signed-off-by: Xiaowei.Hu <xiaowei.hu@oracle.com>
      Reviewed-by: Srinivas Eeda <srinivas.eeda@oracle.com>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      68719e23
    • Jeff Liu's avatar
      ocfs2: fix ocfs2_init_security_and_acl() to initialize acl correctly · a13433e0
      Jeff Liu authored
      commit 32918dd9 upstream.
      
      We need to re-initialize the security for a new reflinked inode with its
      parent dirs if it isn't specified to be preserved for ocfs2_reflink().
      However, the code logic is broken in ocfs2_init_security_and_acl()
      even though ocfs2_init_security_get() succeeds.  As a result,
      ocfs2_acl_init() is not invoked and therefore the default ACL of the
      parent dir is missing on the new inode.
      
      Note this was introduced by 9d8f13ba ("security: new
      security_inode_init_security API adds function callback")
      
      To reproduce:
      
          Set the default ACL for the parent dir (ocfs2 in this case):
          $ setfacl -m default:user:jeff:rwx ../ocfs2/
          $ getfacl ../ocfs2/
          # file: ../ocfs2/
          # owner: jeff
          # group: jeff
          user::rwx
          group::r-x
          other::r-x
          default:user::rwx
          default:user:jeff:rwx
          default:group::r-x
          default:mask::rwx
          default:other::r-x
      
          $ touch a
          $ getfacl a
          # file: a
          # owner: jeff
          # group: jeff
          user::rw-
          group::rw-
          other::r--
      
      Before patching, create reflink file b from a; the user
      default ACL entry (user:jeff:rwx) is missing:
      
          $ ./ocfs2_reflink a b
          $ getfacl b
          # file: b
          # owner: jeff
          # group: jeff
          user::rw-
          group::rw-
          other::r--
      
      In this case, the end user can also observe an error message in syslog:
      
        (ocfs2_reflink,3229,2):ocfs2_init_security_and_acl:7193 ERROR: status = 0
      
      After applying this patch, create reflink file c from a:
      
          $ ./ocfs2_reflink a c
          $ getfacl c
          # file: c
          # owner: jeff
          # group: jeff
          user::rw-
          user:jeff:rwx			#effective:rw-
          group::r-x			#effective:r--
          mask::rw-
          other::r--
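The "#effective" annotations in the getfacl output follow from the POSIX ACL rule that named-user and group entries are limited by the mask entry. A minimal sketch of that rule, with hypothetical DEMO_* permission bits and a hypothetical helper name:

```c
#include <assert.h>

/* Permission bits in the usual r/w/x order. */
enum { DEMO_R = 4, DEMO_W = 2, DEMO_X = 1 };

/* Effective rights of a named-user or group entry: entry AND mask. */
static unsigned demo_effective_perms(unsigned entry, unsigned mask)
{
	return entry & mask;
}

/* The two annotated entries from the listing, with mask::rw-. */
static void demo_effective_self_check(void)
{
	unsigned mask = DEMO_R | DEMO_W;

	/* user:jeff:rwx limited to rw- */
	assert(demo_effective_perms(DEMO_R | DEMO_W | DEMO_X, mask) == (DEMO_R | DEMO_W));
	/* group::r-x limited to r-- */
	assert(demo_effective_perms(DEMO_R | DEMO_X, mask) == DEMO_R);
}
```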
      
      Test program:
      /* Usage: reflink <source> <dest> */
      #include <stdio.h>
      #include <stdint.h>
      #include <stdbool.h>
      #include <string.h>
      #include <errno.h>
      #include <sys/types.h>
      #include <sys/stat.h>
      #include <fcntl.h>
      #include <sys/ioctl.h>
      #include <unistd.h>
      
      static int
      reflink_file(char const *src_name, char const *dst_name,
      	     bool preserve_attrs)
      {
      	int fd;
      	int ret = 0;
      
      #ifndef REFLINK_ATTR_NONE
      #  define REFLINK_ATTR_NONE 0
      #endif
      #ifndef REFLINK_ATTR_PRESERVE
      #  define REFLINK_ATTR_PRESERVE 1
      #endif
      #ifndef OCFS2_IOC_REFLINK
      	struct reflink_arguments {
      		uint64_t old_path;
      		uint64_t new_path;
      		uint64_t preserve;
      	};
      
      #  define OCFS2_IOC_REFLINK _IOW ('o', 4, struct reflink_arguments)
      #endif
      	struct reflink_arguments args = {
      		.old_path = (unsigned long) src_name,
      		.new_path = (unsigned long) dst_name,
      		.preserve = preserve_attrs ? REFLINK_ATTR_PRESERVE :
      					     REFLINK_ATTR_NONE,
      	};
      
      	fd = open(src_name, O_RDONLY);
      	if (fd < 0) {
      		fprintf(stderr, "Failed to open %s: %s\n",
      			src_name, strerror(errno));
      		return -1;
      	}
      
      	if (ioctl(fd, OCFS2_IOC_REFLINK, &args) < 0) {
      		fprintf(stderr, "Failed to reflink %s to %s: %s\n",
      			src_name, dst_name, strerror(errno));
      		ret = -1;
      	}
      
      	/* Release the descriptor on both the success and the error path. */
      	close(fd);
      	return ret;
      }
      
      int
      main(int argc, char *argv[])
      {
      	if (argc != 3) {
      		fprintf(stdout, "Usage: %s source dest\n", argv[0]);
      		return 1;
      	}
      
      	return reflink_file(argv[1], argv[2], 0);
      }
      Signed-off-by: Jie Liu <jeff.liu@oracle.com>
      Reviewed-by: Tao Ma <boyu.mt@taobao.com>
      Cc: Mimi Zohar <zohar@linux.vnet.ibm.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      a13433e0
    • Jan Kara's avatar
      ocfs2: fix possible use-after-free with AIO · 37a47398
      Jan Kara authored
      commit 9b171e0c upstream.
      
      A running AIO pins the inode in memory through its file reference. Once
      the AIO is completed with aio_complete(), the file reference is dropped
      and the inode can be freed. So we have to make sure that calling
      aio_complete() is the last thing we do with the inode.
      Signed-off-by: Jan Kara <jack@suse.cz>
      Acked-by: Jeff Moyer <jmoyer@redhat.com>
      Acked-by: Joel Becker <jlbec@evilplan.org>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      37a47398