1. 10 Aug, 2016 40 commits
    • Andrey Ulanov's avatar
      namespace: update event counter when umounting a deleted dentry · 7b1789f9
      Andrey Ulanov authored
      commit e06b933e upstream.
      
      - m_start() in fs/namespace.c expects that ns->event is incremented each
        time a mount added or removed from ns->list.
      - umount_tree() removes items from the list but does not increment event
        counter, expecting that it's done before the function is called.
      - There are some codepaths that call umount_tree() without updating
        "event" counter. e.g. from __detach_mounts().
      - When this happens m_start may reuse a cached mount structure that no
        longer belongs to ns->list (i.e. use after free which usually leads
        to infinite loop).
      
      This change fixes the above problem by incrementing global event counter
      before invoking umount_tree().
      
      Change-Id: I622c8e84dcb9fb63542372c5dbf0178ee86bb589
      Signed-off-by: default avatarAndrey Ulanov <andreyu@google.com>
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      7b1789f9
    • Colin Ian King's avatar
      devpts: fix null pointer dereference on failed memory allocation · a0292a0f
      Colin Ian King authored
      commit 5353ed8d upstream.
      
      An ENOMEM when creating a pair tty in tty_ldisc_setup causes a null
      pointer dereference in devpts_kill_index because tty->link->driver_data
      is NULL.  The oops was triggered with the pty stressor in stress-ng when
      in a low memory condition.
      
      tty_init_dev tries to clean up a tty_ldisc_setup ENOMEM error by calling
      release_tty, however, this ultimately tries to clean up the NULL pair'd
      tty in pty_unix98_remove, triggering the Oops.
      
      Add check to pty_unix98_remove to only clean up fsi if it is not NULL.
      
      Ooops:
      
      [   23.020961] Oops: 0000 [#1] SMP
      [   23.020976] Modules linked in: ppdev snd_hda_codec_generic snd_hda_intel snd_hda_codec parport_pc snd_hda_core snd_hwdep parport snd_pcm input_leds joydev snd_timer serio_raw snd soundcore i2c_piix4 mac_hid ib_iser rdma_cm iw_cm ib_cm ib_core configfs iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi autofs4 btrfs raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel qxl aes_x86_64 ttm lrw gf128mul glue_helper ablk_helper drm_kms_helper cryptd syscopyarea sysfillrect psmouse sysimgblt floppy fb_sys_fops drm pata_acpi jitterentropy_rng drbg ansi_cprng
      [   23.020978] CPU: 0 PID: 1452 Comm: stress-ng-pty Not tainted 4.7.0-rc4+ #2
      [   23.020978] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
      [   23.020979] task: ffff88007ba30000 ti: ffff880078ea8000 task.ti: ffff880078ea8000
      [   23.020981] RIP: 0010:[<ffffffff813f11ff>]  [<ffffffff813f11ff>] ida_remove+0x1f/0x120
      [   23.020981] RSP: 0018:ffff880078eabb60  EFLAGS: 00010a03
      [   23.020982] RAX: 4444444444444567 RBX: 0000000000000000 RCX: 000000000000001f
      [   23.020982] RDX: 000000000000014c RSI: 000000000000026f RDI: 0000000000000000
      [   23.020982] RBP: ffff880078eabb70 R08: 0000000000000004 R09: 0000000000000036
      [   23.020983] R10: 000000000000026f R11: 0000000000000000 R12: 000000000000026f
      [   23.020983] R13: 000000000000026f R14: ffff88007c944b40 R15: 000000000000026f
      [   23.020984] FS:  00007f9a2f3cc700(0000) GS:ffff88007fc00000(0000) knlGS:0000000000000000
      [   23.020984] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [   23.020985] CR2: 0000000000000010 CR3: 000000006c81b000 CR4: 00000000001406f0
      [   23.020988] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [   23.020988] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [   23.020988] Stack:
      [   23.020989]  0000000000000000 000000000000026f ffff880078eabb90 ffffffff812a5a99
      [   23.020990]  0000000000000000 00000000fffffff4 ffff880078eabba8 ffffffff814f9cbe
      [   23.020991]  ffff88007965c800 ffff880078eabbc8 ffffffff814eef43 fffffffffffffff4
      [   23.020991] Call Trace:
      [   23.021000]  [<ffffffff812a5a99>] devpts_kill_index+0x29/0x50
      [   23.021002]  [<ffffffff814f9cbe>] pty_unix98_remove+0x2e/0x50
      [   23.021006]  [<ffffffff814eef43>] release_tty+0xb3/0x1b0
      [   23.021007]  [<ffffffff814f18d4>] tty_init_dev+0xd4/0x1c0
      [   23.021011]  [<ffffffff814f9fae>] ptmx_open+0xae/0x190
      [   23.021013]  [<ffffffff812254ef>] chrdev_open+0xbf/0x1b0
      [   23.021015]  [<ffffffff8121d973>] do_dentry_open+0x203/0x310
      [   23.021016]  [<ffffffff81225430>] ? cdev_put+0x30/0x30
      [   23.021017]  [<ffffffff8121ee44>] vfs_open+0x54/0x80
      [   23.021018]  [<ffffffff8122b8fc>] ? may_open+0x8c/0x100
      [   23.021019]  [<ffffffff8122f26b>] path_openat+0x2eb/0x1440
      [   23.021020]  [<ffffffff81230534>] ? putname+0x54/0x60
      [   23.021022]  [<ffffffff814f6f97>] ? n_tty_ioctl_helper+0x27/0x100
      [   23.021023]  [<ffffffff81231651>] do_filp_open+0x91/0x100
      [   23.021024]  [<ffffffff81230596>] ? getname_flags+0x56/0x1f0
      [   23.021026]  [<ffffffff8123fc66>] ? __alloc_fd+0x46/0x190
      [   23.021027]  [<ffffffff8121f1e4>] do_sys_open+0x124/0x210
      [   23.021028]  [<ffffffff8121f2ee>] SyS_open+0x1e/0x20
      [   23.021035]  [<ffffffff81845576>] entry_SYSCALL_64_fastpath+0x1e/0xa8
      [   23.021044] Code: 63 28 45 31 e4 eb dd 0f 1f 44 00 00 55 4c 63 d6 48 ba 89 88 88 88 88 88 88 88 4c 89 d0 b9 1f 00 00 00 48 f7 e2 48 89 e5 41 54 53 <8b> 47 10 48 89 fb 8d 3c c5 00 00 00 00 48 c1 ea 09 b8 01 00 00
      [   23.021045] RIP  [<ffffffff813f11ff>] ida_remove+0x1f/0x120
      [   23.021045]  RSP <ffff880078eabb60>
      [   23.021046] CR2: 0000000000000010
      Signed-off-by: default avatarColin Ian King <colin.king@canonical.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      a0292a0f
    • Rafael J. Wysocki's avatar
      cpufreq: Avoid false-positive WARN_ON()s in cpufreq_update_policy() · 4be5c33b
      Rafael J. Wysocki authored
      commit 742c87bf upstream.
      
      CPU notifications from the firmware coming in when cpufreq is
      suspended cause cpufreq_update_current_freq() to return 0 which
      triggers the WARN_ON() in cpufreq_update_policy() for no reason.
      
      Avoid that by checking cpufreq_suspended before calling
      cpufreq_update_current_freq().
      
      Fixes: c9d9c929 (cpufreq: Abort cpufreq_update_current_freq() for cpufreq_suspended set)
      Signed-off-by: default avatarRafael J. Wysocki <rafael.j.wysocki@intel.com>
      Acked-by: default avatarViresh Kumar <viresh.kumar@linaro.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      4be5c33b
    • Miklos Szeredi's avatar
      9p: use file_dentry() · 8b592e2a
      Miklos Szeredi authored
      commit b403f0e3 upstream.
      
      v9fs may be used as lower layer of overlayfs and accessing f_path.dentry
      can lead to a crash.  In this case it's a NULL pointer dereference in
      p9_fid_create().
      
      Fix by replacing direct access of file->f_path.dentry with the
      file_dentry() accessor, which will always return a native object.
      Reported-by: default avatarAlessio Igor Bogani <alessioigorbogani@gmail.com>
      Signed-off-by: default avatarMiklos Szeredi <mszeredi@redhat.com>
      Tested-by: default avatarAlessio Igor Bogani <alessioigorbogani@gmail.com>
      Fixes: 4bacc9c9 ("overlayfs: Make f_path always point to the overlay and f_inode to the underlay")
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      8b592e2a
    • Vegard Nossum's avatar
      ext4: verify extent header depth · af2be3d2
      Vegard Nossum authored
      commit 7bc94916 upstream.
      
      Although the extent tree depth of 5 should enough be for the worst
      case of 2*32 extents of length 1, the extent tree code does not
      currently to merge nodes which are less than half-full with a sibling
      node, or to shrink the tree depth if possible.  So it's possible, at
      least in theory, for the tree depth to be greater than 5.  However,
      even in the worst case, a tree depth of 32 is highly unlikely, and if
      the file system is maliciously corrupted, an insanely large eh_depth
      can cause memory allocation failures that will trigger kernel warnings
      (here, eh_depth = 65280):
      
          JBD2: ext4.exe wants too many credits credits:195849 rsv_credits:0 max:256
          ------------[ cut here ]------------
          WARNING: CPU: 0 PID: 50 at fs/jbd2/transaction.c:293 start_this_handle+0x569/0x580
          CPU: 0 PID: 50 Comm: ext4.exe Not tainted 4.7.0-rc5+ #508
          Stack:
           604a8947 625badd8 0002fd09 00000000
           60078643 00000000 62623910 601bf9bc
           62623970 6002fc84 626239b0 900000125
          Call Trace:
           [<6001c2dc>] show_stack+0xdc/0x1a0
           [<601bf9bc>] dump_stack+0x2a/0x2e
           [<6002fc84>] __warn+0x114/0x140
           [<6002fdff>] warn_slowpath_null+0x1f/0x30
           [<60165829>] start_this_handle+0x569/0x580
           [<60165d4e>] jbd2__journal_start+0x11e/0x220
           [<60146690>] __ext4_journal_start_sb+0x60/0xa0
           [<60120a81>] ext4_truncate+0x131/0x3a0
           [<60123677>] ext4_setattr+0x757/0x840
           [<600d5d0f>] notify_change+0x16f/0x2a0
           [<600b2b16>] do_truncate+0x76/0xc0
           [<600c3e56>] path_openat+0x806/0x1300
           [<600c55c9>] do_filp_open+0x89/0xf0
           [<600b4074>] do_sys_open+0x134/0x1e0
           [<600b4140>] SyS_open+0x20/0x30
           [<6001ea68>] handle_syscall+0x88/0x90
           [<600295fd>] userspace+0x3fd/0x500
           [<6001ac55>] fork_handler+0x85/0x90
      
          ---[ end trace 08b0b88b6387a244 ]---
      
      [ Commit message modified and the extent tree depath check changed
      from 5 to 32 -- tytso ]
      
      Cc: Darrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: default avatarVegard Nossum <vegard.nossum@oracle.com>
      Signed-off-by: default avatarTheodore Ts'o <tytso@mit.edu>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      af2be3d2
    • Jeff Mahoney's avatar
      ecryptfs: don't allow mmap when the lower fs doesn't support it · f7803b2f
      Jeff Mahoney authored
      commit f0fe970d upstream.
      
      There are legitimate reasons to disallow mmap on certain files, notably
      in sysfs or procfs.  We shouldn't emulate mmap support on file systems
      that don't offer support natively.
      
      CVE-2016-1583
      Signed-off-by: default avatarJeff Mahoney <jeffm@suse.com>
      [tyhicks: clean up f_op check by using ecryptfs_file_to_lower()]
      Signed-off-by: default avatarTyler Hicks <tyhicks@canonical.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      f7803b2f
    • Jeff Mahoney's avatar
      Revert "ecryptfs: forbid opening files without mmap handler" · 9cd0ae8d
      Jeff Mahoney authored
      commit 78c4e172 upstream.
      
      This reverts commit 2f36db71.
      
      It fixed a local root exploit but also introduced a dependency on
      the lower file system implementing an mmap operation just to open a file,
      which is a bit of a heavy hammer.  The right fix is to have mmap depend
      on the existence of the mmap handler instead.
      Signed-off-by: default avatarJeff Mahoney <jeffm@suse.com>
      Signed-off-by: default avatarTyler Hicks <tyhicks@canonical.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      9cd0ae8d
    • Miklos Szeredi's avatar
      locks: use file_inode() · e629d6d9
      Miklos Szeredi authored
      commit 6343a212 upstream.
      
      (Another one for the f_path debacle.)
      
      ltp fcntl33 testcase caused an Oops in selinux_file_send_sigiotask.
      
      The reason is that generic_add_lease() used filp->f_path.dentry->inode
      while all the others use file_inode().  This makes a difference for files
      opened on overlayfs since the former will point to the overlay inode the
      latter to the underlying inode.
      
      So generic_add_lease() added the lease to the overlay inode and
      generic_delete_lease() removed it from the underlying inode.  When the file
      was released the lease remained on the overlay inode's lock list, resulting
      in use after free.
      Reported-by: default avatarEryu Guan <eguan@redhat.com>
      Fixes: 4bacc9c9 ("overlayfs: Make f_path always point to the overlay and f_inode to the underlay")
      Signed-off-by: default avatarMiklos Szeredi <mszeredi@redhat.com>
      Reviewed-by: default avatarJeff Layton <jlayton@redhat.com>
      Signed-off-by: default avatarJ. Bruce Fields <bfields@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e629d6d9
    • Rhyland Klein's avatar
      power_supply: power_supply_read_temp only if use_cnt > 0 · 3f0f77b6
      Rhyland Klein authored
      commit 5bc28b93 upstream.
      
      Change power_supply_read_temp() to use power_supply_get_property()
      so that it will check the use_cnt and ensure it is > 0. The use_cnt
      will be incremented at the end of __power_supply_register, so this
      will block to case where get_property can be called before the supply
      is fully registered. This fixes the issue show in the stack below:
      
      [    1.452598] power_supply_read_temp+0x78/0x80
      [    1.458680] thermal_zone_get_temp+0x5c/0x11c
      [    1.464765] thermal_zone_device_update+0x34/0xb4
      [    1.471195] thermal_zone_device_register+0x87c/0x8cc
      [    1.477974] __power_supply_register+0x364/0x424
      [    1.484317] power_supply_register_no_ws+0x10/0x18
      [    1.490833] bq27xxx_battery_setup+0x10c/0x164
      [    1.497003] bq27xxx_battery_i2c_probe+0xd0/0x1b0
      [    1.503435] i2c_device_probe+0x174/0x240
      [    1.509172] driver_probe_device+0x1fc/0x29c
      [    1.515167] __driver_attach+0xa4/0xa8
      [    1.520643] bus_for_each_dev+0x58/0x98
      [    1.526204] driver_attach+0x20/0x28
      [    1.531505] bus_add_driver+0x1c8/0x22c
      [    1.537067] driver_register+0x68/0x108
      [    1.542630] i2c_register_driver+0x38/0x7c
      [    1.548457] bq27xxx_battery_i2c_driver_init+0x18/0x20
      [    1.555321] do_one_initcall+0x38/0x12c
      [    1.560886] kernel_init_freeable+0x148/0x1ec
      [    1.566972] kernel_init+0x10/0xfc
      [    1.572101] ret_from_fork+0x10/0x40
      
      Also make the same change to ps_get_max_charge_cntl_limit() and
      ps_get_cur_chrage_cntl_limit() to be safe. Lastly, change the return
      value of power_supply_get_property() to -EAGAIN from -ENODEV if
      use_cnt <= 0.
      
      Fixes: 297d716f ("power_supply: Change ownership from driver to core")
      Signed-off-by: default avatarRhyland Klein <rklein@nvidia.com>
      Reviewed-by: default avatarKrzysztof Kozlowski <k.kozlowski@samsung.com>
      Signed-off-by: default avatarSebastian Reichel <sre@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      3f0f77b6
    • Daniel Bristot de Oliveira's avatar
      cgroup: Disable IRQs while holding css_set_lock · 2644b978
      Daniel Bristot de Oliveira authored
      commit 82d6489d upstream.
      
      While testing the deadline scheduler + cgroup setup I hit this
      warning.
      
      [  132.612935] ------------[ cut here ]------------
      [  132.612951] WARNING: CPU: 5 PID: 0 at kernel/softirq.c:150 __local_bh_enable_ip+0x6b/0x80
      [  132.612952] Modules linked in: (a ton of modules...)
      [  132.612981] CPU: 5 PID: 0 Comm: swapper/5 Not tainted 4.7.0-rc2 #2
      [  132.612981] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.8.2-20150714_191134- 04/01/2014
      [  132.612982]  0000000000000086 45c8bb5effdd088b ffff88013fd43da0 ffffffff813d229e
      [  132.612984]  0000000000000000 0000000000000000 ffff88013fd43de0 ffffffff810a652b
      [  132.612985]  00000096811387b5 0000000000000200 ffff8800bab29d80 ffff880034c54c00
      [  132.612986] Call Trace:
      [  132.612987]  <IRQ>  [<ffffffff813d229e>] dump_stack+0x63/0x85
      [  132.612994]  [<ffffffff810a652b>] __warn+0xcb/0xf0
      [  132.612997]  [<ffffffff810e76a0>] ? push_dl_task.part.32+0x170/0x170
      [  132.612999]  [<ffffffff810a665d>] warn_slowpath_null+0x1d/0x20
      [  132.613000]  [<ffffffff810aba5b>] __local_bh_enable_ip+0x6b/0x80
      [  132.613008]  [<ffffffff817d6c8a>] _raw_write_unlock_bh+0x1a/0x20
      [  132.613010]  [<ffffffff817d6c9e>] _raw_spin_unlock_bh+0xe/0x10
      [  132.613015]  [<ffffffff811388ac>] put_css_set+0x5c/0x60
      [  132.613016]  [<ffffffff8113dc7f>] cgroup_free+0x7f/0xa0
      [  132.613017]  [<ffffffff810a3912>] __put_task_struct+0x42/0x140
      [  132.613018]  [<ffffffff810e776a>] dl_task_timer+0xca/0x250
      [  132.613027]  [<ffffffff810e76a0>] ? push_dl_task.part.32+0x170/0x170
      [  132.613030]  [<ffffffff8111371e>] __hrtimer_run_queues+0xee/0x270
      [  132.613031]  [<ffffffff81113ec8>] hrtimer_interrupt+0xa8/0x190
      [  132.613034]  [<ffffffff81051a58>] local_apic_timer_interrupt+0x38/0x60
      [  132.613035]  [<ffffffff817d9b0d>] smp_apic_timer_interrupt+0x3d/0x50
      [  132.613037]  [<ffffffff817d7c5c>] apic_timer_interrupt+0x8c/0xa0
      [  132.613038]  <EOI>  [<ffffffff81063466>] ? native_safe_halt+0x6/0x10
      [  132.613043]  [<ffffffff81037a4e>] default_idle+0x1e/0xd0
      [  132.613044]  [<ffffffff810381cf>] arch_cpu_idle+0xf/0x20
      [  132.613046]  [<ffffffff810e8fda>] default_idle_call+0x2a/0x40
      [  132.613047]  [<ffffffff810e92d7>] cpu_startup_entry+0x2e7/0x340
      [  132.613048]  [<ffffffff81050235>] start_secondary+0x155/0x190
      [  132.613049] ---[ end trace f91934d162ce9977 ]---
      
      The warn is the spin_(lock|unlock)_bh(&css_set_lock) in the interrupt
      context. Converting the spin_lock_bh to spin_lock_irq(save) to avoid
      this problem - and other problems of sharing a spinlock with an
      interrupt.
      
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Li Zefan <lizefan@huawei.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Juri Lelli <juri.lelli@arm.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: cgroups@vger.kernel.org
      Cc: linux-kernel@vger.kernel.org
      Reviewed-by: default avatarRik van Riel <riel@redhat.com>
      Reviewed-by: default avatar"Luis Claudio R. Goncalves" <lgoncalv@redhat.com>
      Signed-off-by: default avatarDaniel Bristot de Oliveira <bristot@redhat.com>
      Acked-by: default avatarZefan Li <lizefan@huawei.com>
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      2644b978
    • Tejun Heo's avatar
      cgroup: set css->id to -1 during init · 65856851
      Tejun Heo authored
      commit 8fa3b8d6 upstream.
      
      If percpu_ref initialization fails during css_create(), the free path
      can end up trying to free css->id of zero.  As ID 0 is unused, it
      doesn't cause a critical breakage but it does trigger a warning
      message.  Fix it by setting css->id to -1 from init_and_link_css().
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Cc: Wenwei Tao <ww.tao0320@gmail.com>
      Fixes: 01e58659 ("cgroup: release css->id after css_free")
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      65856851
    • Wenwei Tao's avatar
      cgroup: remove redundant cleanup in css_create · 64e7dc48
      Wenwei Tao authored
      commit b00c52da upstream.
      
      When create css failed, before call css_free_rcu_fn, we remove the css
      id and exit the percpu_ref, but we will do these again in
      css_free_work_fn, so they are redundant.  Especially the css id, that
      would cause problem if we remove it twice, since it may be assigned to
      another css after the first remove.
      
      tj: This was broken by two commits updating the free path without
          synchronizing the creation failure path.  This can be easily
          triggered by trying to create more than 64k memory cgroups.
      Signed-off-by: default avatarWenwei Tao <ww.tao0320@gmail.com>
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Cc: Vladimir Davydov <vdavydov@parallels.com>
      Fixes: 9a1049da ("percpu-refcount: require percpu_ref to be exited explicitly")
      Fixes: 01e58659 ("cgroup: release css->id after css_free")
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      64e7dc48
    • Alexander Shiyan's avatar
      pinctrl: imx: Do not treat a PIN without MUX register as an error · 03cbdca3
      Alexander Shiyan authored
      commit ba562d5e upstream.
      
      Some PINs do not have a MUX register, it is not an error.
      It is necessary to allow the continuation of the PINs configuration,
      otherwise the whole PIN-group will be configured incorrectly.
      Signed-off-by: default avatarAlexander Shiyan <shc_work@mail.ru>
      Signed-off-by: default avatarLinus Walleij <linus.walleij@linaro.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      03cbdca3
    • Tony Lindgren's avatar
      pinctrl: single: Fix missing flush of posted write for a wakeirq · 4d5ce009
      Tony Lindgren authored
      commit 0ac3c0a4 upstream.
      
      With many repeated suspend resume cycles, the pin specific wakeirq
      may not always work on omaps. This is because the write to enable the
      pin interrupt may not have reached the device over the interconnect
      before suspend happens.
      
      Let's fix the issue with a flush of posted write with a readback.
      Reported-by: default avatarNishanth Menon <nm@ti.com>
      Signed-off-by: default avatarTony Lindgren <tony@atomide.com>
      Signed-off-by: default avatarLinus Walleij <linus.walleij@linaro.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      4d5ce009
    • Minfei Huang's avatar
      pvclock: Add CPU barriers to get correct version value · c02b2ac1
      Minfei Huang authored
      commit 749d088b upstream.
      
      Protocol for the "version" fields is: hypervisor raises it (making it
      uneven) before it starts updating the fields and raises it again (making
      it even) when it is done.  Thus the guest can make sure the time values
      it got are consistent by checking the version before and after reading
      them.
      
      Add CPU barries after getting version value just like what function
      vread_pvclock does, because all of callees in this function is inline.
      
      Fixes: 502dfeffSigned-off-by: default avatarMinfei Huang <mnghuan@gmail.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      c02b2ac1
    • Michael Welling's avatar
      Input: tsc200x - report proper input_dev name · d056c1c7
      Michael Welling authored
      commit e9003c9c upstream.
      
      Passes input_id struct to the common probe function for the tsc200x drivers
      instead of just the bustype.
      
      This allows for the use of the product variable to set the input_dev->name
      variable according to the type of touchscreen used. Note that when we
      introduced support for TSC2004 we started calling everything TSC200X, so
      let's keep this quirk.
      Signed-off-by: default avatarMichael Welling <mwelling@ieee.org>
      Acked-by: default avatarPavel Machek <pavel@ucw.cz>
      Acked-by: default avatarPali Rohár <pali.rohar@gmail.com>
      Signed-off-by: default avatarDmitry Torokhov <dmitry.torokhov@gmail.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d056c1c7
    • Andrew Duggan's avatar
      Input: synaptics-rmi4 - fix maximum size check for F12 control register 8 · 99d52327
      Andrew Duggan authored
      commit e4add7b6 upstream.
      
      According to the RMI4 spec the maximum size of F12 control register 8 is
      15 bytes. The current code incorrectly reports an error if control 8 is
      greater then 14. Making sensors with a control register 8 with 15 bytes
      unusable.
      Signed-off-by: default avatarAndrew Duggan <aduggan@synaptics.com>
      Reported-by: default avatarChris Healy <cphealy@gmail.com>
      Signed-off-by: default avatarDmitry Torokhov <dmitry.torokhov@gmail.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      99d52327
    • Dmitry Torokhov's avatar
      Revert "Input: wacom_w8001 - drop use of ABS_MT_TOOL_TYPE" · 0cc1789c
      Dmitry Torokhov authored
      commit 3e9161bf upstream.
      
      This reverts commit 5f7e5445 because
      removal of input_mt_report_slot_state() means we no longer generate
      tracking IDs for the reported contacts.
      Acked-by: default avatarPeter Hutterer <peter.hutterer@who-t.net>
      Acked-by: default avatarPing Cheng <pinglinux@gmail.com>
      0cc1789c
    • Cameron Gutman's avatar
      Input: xpad - validate USB endpoint count during probe · 8316597e
      Cameron Gutman authored
      commit caca925f upstream.
      
      This prevents a malicious USB device from causing an oops.
      Signed-off-by: default avatarCameron Gutman <aicommander@gmail.com>
      Signed-off-by: default avatarDmitry Torokhov <dmitry.torokhov@gmail.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      8316597e
    • Ping Cheng's avatar
      Input: wacom_w8001 - ignore invalid pen data packets · fccb8892
      Ping Cheng authored
      commit 9e72ac74 upstream.
      
      ThinkPad X60 Tablet PC (pen only device) sometime posts
      packets that are larger than W8001_PKTLEN_TPCPEN.
      Reported-by: default avatarChris J Arges <christopherarges@gmail.com>
      Tested-by: default avatarChris J Arges <christopherarges@gmail.com>
      Signed-off-by: default avatarPing Cheng <pingc@wacom.com>
      Reviewed-by: default avatarPeter Hutterer <peter.hutterer@who-t.net>
      Signed-off-by: default avatarDmitry Torokhov <dmitry.torokhov@gmail.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      fccb8892
    • Ping Cheng's avatar
      Input: wacom_w8001 - w8001_MAX_LENGTH should be 13 · d80f15e3
      Ping Cheng authored
      commit 12afb344 upstream.
      
      Somehow the patch that added two-finger touch support forgot to update
      W8001_MAX_LENGTH from 11 to 13.
      Signed-off-by: default avatarPing Cheng <pingc@wacom.com>
      Reviewed-by: default avatarPeter Hutterer <peter.hutterer@who-t.net>
      Signed-off-by: default avatarDmitry Torokhov <dmitry.torokhov@gmail.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d80f15e3
    • Cameron Gutman's avatar
      Input: xpad - fix oops when attaching an unknown Xbox One gamepad · 28a4cb13
      Cameron Gutman authored
      commit c7f14293 upstream.
      
      Xbox One controllers have multiple interfaces which all have the
      same class, subclass, and protocol. One of the these interfaces
      has only a single endpoint. When Xpad attempts to bind to this
      interface, it causes an oops when trying initialize the output URB
      by trying to access the second endpoint's descriptor.
      
      This situation was avoided for known Xbox One devices by checking
      the XTYPE constant associated with the VID and PID tuple. However,
      this breaks when new or previously unknown Xbox One controllers
      are attached to the system.
      
      This change addresses the problem by deriving the XTYPE for Xbox
      One controllers based on the interface protocol before checking
      the interface number.
      
      Fixes: 1a48ff81 ("Input: xpad - add support for Xbox One controllers")
      Signed-off-by: default avatarCameron Gutman <aicommander@gmail.com>
      Signed-off-by: default avatarDmitry Torokhov <dmitry.torokhov@gmail.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      28a4cb13
    • Dmitry Torokhov's avatar
      Input: elantech - add more IC body types to the list · a4c41dfb
      Dmitry Torokhov authored
      commit 226ba707 upstream.
      
      The touchpad in HP Pavilion 14-ab057ca reports it's version as 12 and
      according to Elan both 11 and 12 are valid IC types and should be
      identified as hw_version 4.
      Reported-by: default avatarPatrick Lessard <Patrick.Lessard@cogeco.com>
      Tested-by: default avatarPatrick Lessard <Patrick.Lessard@cogeco.com>
      Signed-off-by: default avatarDmitry Torokhov <dmitry.torokhov@gmail.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      a4c41dfb
    • Sinclair Yeh's avatar
      Input: vmmouse - remove port reservation · bc05227c
      Sinclair Yeh authored
      commit 60842ef8 upstream.
      
      The VMWare EFI BIOS will expose port 0x5658 as an ACPI resource.  This
      causes the port to be reserved by the APCI module as the system comes up,
      making it unavailable to be reserved again by other drivers, thus
      preserving this VMWare port for special use in a VMWare guest.
      
      This port is designed to be shared among multiple VMWare services, such as
      the VMMOUSE.  Because of this, VMMOUSE should not try to reserve this port
      on its own.
      
      The VMWare non-EFI BIOS does not do this to preserve compatibility with
      existing/legacy VMs.  It is known that there is small chance a VM may be
      configured such that these ports get reserved by other non-VMWare devices,
      and if this ever happens, the result is undefined.
      Signed-off-by: default avatarSinclair Yeh <syeh@vmware.com>
      Reviewed-by: default avatarThomas Hellstrom <thellstrom@vmware.com>
      Signed-off-by: default avatarDmitry Torokhov <dmitry.torokhov@gmail.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      bc05227c
    • Kangjie Lu's avatar
      ALSA: timer: Fix leak in events via snd_timer_user_tinterrupt · 1fae440a
      Kangjie Lu authored
      commit e4ec8cc8 upstream.
      
      The stack object “r1” has a total size of 32 bytes. Its field
      “event” and “val” both contain 4 bytes padding. These 8 bytes
      padding bytes are sent to user without being initialized.
      Signed-off-by: default avatarKangjie Lu <kjlu@gatech.edu>
      Signed-off-by: default avatarTakashi Iwai <tiwai@suse.de>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      1fae440a
    • Kangjie Lu's avatar
      ALSA: timer: Fix leak in events via snd_timer_user_ccallback · 5b6fc00b
      Kangjie Lu authored
      commit 9a47e9cf upstream.
      
      The stack object “r1” has a total size of 32 bytes. Its field
      “event” and “val” both contain 4 bytes padding. These 8 bytes
      padding bytes are sent to user without being initialized.
      Signed-off-by: default avatarKangjie Lu <kjlu@gatech.edu>
      Signed-off-by: default avatarTakashi Iwai <tiwai@suse.de>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      5b6fc00b
    • Kangjie Lu's avatar
      ALSA: timer: Fix leak in SNDRV_TIMER_IOCTL_PARAMS · 82a638a2
      Kangjie Lu authored
      commit cec8f96e upstream.
      
      The stack object “tread” has a total size of 32 bytes. Its field
      “event” and “val” both contain 4 bytes padding. These 8 bytes
      padding bytes are sent to user without being initialized.
      Signed-off-by: default avatarKangjie Lu <kjlu@gatech.edu>
      Signed-off-by: default avatarTakashi Iwai <tiwai@suse.de>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      82a638a2
    • Bob Liu's avatar
      xen-blkfront: don't call talk_to_blkback when already connected to blkback · 2e9aedc4
      Bob Liu authored
      commit efd15352 upstream.
      
      Sometimes blkfront may twice receive blkback_changed() notification
      (XenbusStateConnected) after migration, which will cause
      talk_to_blkback() to be called twice too and confuse xen-blkback.
      
      The flow is as follow:
         blkfront                                        blkback
      blkfront_resume()
       > talk_to_blkback()
        > Set blkfront to XenbusStateInitialised
                                                      front changed()
                                                       > Connect()
                                                        > Set blkback to XenbusStateConnected
      
      blkback_changed()
       > Skip talk_to_blkback()
         because frontstate == XenbusStateInitialised
       > blkfront_connect()
        > Set blkfront to XenbusStateConnected
      
      -----
      And here we get another XenbusStateConnected notification leading
      to:
      -----
      blkback_changed()
       > because now frontstate != XenbusStateInitialised
         talk_to_blkback() is also called again
        > blkfront state changed from
        XenbusStateConnected to XenbusStateInitialised
          (Which is not correct!)
      
      						front_changed():
                                                       > Do nothing because blkback
                                                         already in XenbusStateConnected
      
      Now blkback is in XenbusStateConnected but blkfront is still
      in XenbusStateInitialised - leading to no disks.
      
      Poking of the XenbusStateConnected state is allowed (to deal with
      block disk change) and has to be dealt with. The most likely
      cause of this bug are custom udev scripts hooking up the disks
      and then validating the size.
      Signed-off-by: default avatarBob Liu <bob.liu@oracle.com>
      Signed-off-by: default avatarKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      2e9aedc4
    • Bob Liu's avatar
      xen-blkfront: fix resume issues after a migration · 1d8d6ed6
      Bob Liu authored
      commit 2a6f71ad upstream.
      
      After a migrate to another host (which may not have multiqueue
      support), the number of rings (block hardware queues)
      may be changed and the ring info structure will also be reallocated.
      
      This patch fixes two related bugs:
       * call blk_mq_update_nr_hw_queues() to make blk-core know the number
         of hardware queues have been changed.
       * Don't store rinfo pointer to hctx->driver_data, because rinfo may be
         reallocated so use hctx->queue_num to get the rinfo structure instead.
      Signed-off-by: default avatarBob Liu <bob.liu@oracle.com>
      Signed-off-by: default avatarKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      1d8d6ed6
    • Jan Beulich's avatar
      xenbus: don't bail early from xenbus_dev_request_and_reply() · 77bdc603
      Jan Beulich authored
      commit 7469be95 upstream.
      
      xenbus_dev_request_and_reply() needs to track whether a transaction is
      open.  For XS_TRANSACTION_START messages it calls transaction_start()
      and for XS_TRANSACTION_END messages it calls transaction_end().
      
      If sending an XS_TRANSACTION_START message fails or responds with an
      an error, the transaction is not open and transaction_end() must be
      called.
      
      If sending an XS_TRANSACTION_END message fails, the transaction is
      still open, but if an error response is returned the transaction is
      closed.
      
      Commit 027bd7e8 ("xen/xenbus: Avoid synchronous wait on XenBus
      stalling shutdown/restart") introduced a regression where failed
      XS_TRANSACTION_START messages were leaving the transaction open.  This
      can cause problems with suspend (and migration) as all transactions
      must be closed before suspending.
      
      It appears that the problematic change was added accidentally, so just
      remove it.
      Signed-off-by: default avatarJan Beulich <jbeulich@suse.com>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Signed-off-by: default avatarDavid Vrabel <david.vrabel@citrix.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      77bdc603
    • Jan Beulich's avatar
      xenbus: don't BUG() on user mode induced condition · e8f5fe05
      Jan Beulich authored
      commit 0beef634 upstream.
      
      Inability to locate a user mode specified transaction ID should not
      lead to a kernel crash. For other than XS_TRANSACTION_START also
      don't issue anything to xenbus if the specified ID doesn't match that
      of any active transaction.
      Signed-off-by: default avatarJan Beulich <jbeulich@suse.com>
      Signed-off-by: default avatarDavid Vrabel <david.vrabel@citrix.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e8f5fe05
    • Bob Liu's avatar
      xen-blkfront: save uncompleted reqs in blkfront_resume() · 61675789
      Bob Liu authored
      commit 7b427a59 upstream.
      
      Uncompleted reqs used to be 'saved and resubmitted' in blkfront_recover() during
      migration, but that's too late after multi-queue was introduced.
      
      After a migrate to another host (which may not have multiqueue support), the
      number of rings (block hardware queues) may be changed and the ring and shadow
      structure will also be reallocated.
      
      The blkfront_recover() then can't 'save and resubmit' the real
      uncompleted reqs because shadow structure have been reallocated.
      
      This patch fixes this issue by moving the 'save' logic out of
      blkfront_recover() to earlier place in blkfront_resume().
      
      The 'resubmit' is not changed and still in blkfront_recover().
      Signed-off-by: default avatarBob Liu <bob.liu@oracle.com>
      Signed-off-by: default avatarKonrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      61675789
    • Andrey Grodzovsky's avatar
      xen/pciback: Fix conf_space read/write overlap check. · 45e4b704
      Andrey Grodzovsky authored
      commit 02ef871e upstream.
      
      Current overlap check is evaluating to false a case where a filter
      field is fully contained (proper subset) of a r/w request.  This
      change applies classical overlap check instead to include all the
      scenarios.
      
      More specifically, for (Hilscher GmbH CIFX 50E-DP(M/S)) device driver
      the logic is such that the entire confspace is read and written in 4
      byte chunks. In this case as an example, CACHE_LINE_SIZE,
      LATENCY_TIMER and PCI_BIST are arriving together in one call to
      xen_pcibk_config_write() with offset == 0xc and size == 4.  With the
      exsisting overlap check the LATENCY_TIMER field (offset == 0xd, length
      == 1) is fully contained in the write request and hence is excluded
      from write, which is incorrect.
      Signed-off-by: default avatarAndrey Grodzovsky <andrey2805@gmail.com>
      Reviewed-by: default avatarBoris Ostrovsky <boris.ostrovsky@oracle.com>
      Reviewed-by: default avatarJan Beulich <JBeulich@suse.com>
      Signed-off-by: default avatarDavid Vrabel <david.vrabel@citrix.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      45e4b704
    • Vineet Gupta's avatar
      ARC: unwind: ensure that .debug_frame is generated (vs. .eh_frame) · 4d1fa7a0
      Vineet Gupta authored
      commit f52e126c upstream.
      
      With recent binutils update to support dwarf CFI pseudo-ops in gas, we
      now get .eh_frame vs. .debug_frame. Although the call frame info is
      exactly the same in both, the CIE differs, which the current kernel
      unwinder can't cope with.
      
      This broke both the kernel unwinder as well as loadable modules (latter
      because of a new unhandled relo R_ARC_32_PCREL from .rela.eh_frame in
      the module loader)
      
      The ideal solution would be to switch unwinder to .eh_frame.
      For now however we can make do by just ensureing .debug_frame is
      generated by removing -fasynchronous-unwind-tables
      
       .eh_frame    generated with -gdwarf-2 -fasynchronous-unwind-tables
       .debug_frame generated with -gdwarf-2
      
      Fixes STAR 9001058196
      Signed-off-by: default avatarVineet Gupta <vgupta@synopsys.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      4d1fa7a0
    • Alexey Brodkin's avatar
      arc: unwind: warn only once if DW2_UNWIND is disabled · bcd07ba6
      Alexey Brodkin authored
      commit 9bd54517 upstream.
      
      If CONFIG_ARC_DW2_UNWIND is disabled every time arc_unwind_core()
      gets called following message gets printed in debug console:
      ----------------->8---------------
      CONFIG_ARC_DW2_UNWIND needs to be enabled
      ----------------->8---------------
      
      That message makes sense if user indeed wants to see a backtrace or
      get nice function call-graphs in perf but what if user disabled
      unwinder for the purpose? Why pollute his debug console?
      
      So instead we'll warn user about possibly missing feature once and
      let him decide if that was what he or she really wanted.
      Signed-off-by: default avatarAlexey Brodkin <abrodkin@synopsys.com>
      Signed-off-by: default avatarVineet Gupta <vgupta@synopsys.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      bcd07ba6
    • Josh Poimboeuf's avatar
      sched/debug: Fix deadlock when enabling sched events · f6e9bad4
      Josh Poimboeuf authored
      commit eda8dca5 upstream.
      
      I see a hang when enabling sched events:
      
        echo 1 > /sys/kernel/debug/tracing/events/sched/enable
      
      The printk buffer shows:
      
        BUG: spinlock recursion on CPU#1, swapper/1/0
         lock: 0xffff88007d5d8c00, .magic: dead4ead, .owner: swapper/1/0, .owner_cpu: 1
        CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.7.0-rc2+ #1
        Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.8.1-20150318_183358- 04/01/2014
        ...
        Call Trace:
         <IRQ>  [<ffffffff8143d663>] dump_stack+0x85/0xc2
         [<ffffffff81115948>] spin_dump+0x78/0xc0
         [<ffffffff81115aea>] do_raw_spin_lock+0x11a/0x150
         [<ffffffff81891471>] _raw_spin_lock+0x61/0x80
         [<ffffffff810e5466>] ? try_to_wake_up+0x256/0x4e0
         [<ffffffff810e5466>] try_to_wake_up+0x256/0x4e0
         [<ffffffff81891a0a>] ? _raw_spin_unlock_irqrestore+0x4a/0x80
         [<ffffffff810e5705>] wake_up_process+0x15/0x20
         [<ffffffff810cebb4>] insert_work+0x84/0xc0
         [<ffffffff810ced7f>] __queue_work+0x18f/0x660
         [<ffffffff810cf9a6>] queue_work_on+0x46/0x90
         [<ffffffffa00cd95b>] drm_fb_helper_dirty.isra.11+0xcb/0xe0 [drm_kms_helper]
         [<ffffffffa00cdac0>] drm_fb_helper_sys_imageblit+0x30/0x40 [drm_kms_helper]
         [<ffffffff814babcd>] soft_cursor+0x1ad/0x230
         [<ffffffff814ba379>] bit_cursor+0x649/0x680
         [<ffffffff814b9d30>] ? update_attr.isra.2+0x90/0x90
         [<ffffffff814b5e6a>] fbcon_cursor+0x14a/0x1c0
         [<ffffffff81555ef8>] hide_cursor+0x28/0x90
         [<ffffffff81558b6f>] vt_console_print+0x3bf/0x3f0
         [<ffffffff81122c63>] call_console_drivers.constprop.24+0x183/0x200
         [<ffffffff811241f4>] console_unlock+0x3d4/0x610
         [<ffffffff811247f5>] vprintk_emit+0x3c5/0x610
         [<ffffffff81124bc9>] vprintk_default+0x29/0x40
         [<ffffffff811e965b>] printk+0x57/0x73
         [<ffffffff810f7a9e>] enqueue_entity+0xc2e/0xc70
         [<ffffffff810f7b39>] enqueue_task_fair+0x59/0xab0
         [<ffffffff8106dcd9>] ? kvm_sched_clock_read+0x9/0x20
         [<ffffffff8103fb39>] ? sched_clock+0x9/0x10
         [<ffffffff810e3fcc>] activate_task+0x5c/0xa0
         [<ffffffff810e4514>] ttwu_do_activate+0x54/0xb0
         [<ffffffff810e5cea>] sched_ttwu_pending+0x7a/0xb0
         [<ffffffff810e5e51>] scheduler_ipi+0x61/0x170
         [<ffffffff81059e7f>] smp_trace_reschedule_interrupt+0x4f/0x2a0
         [<ffffffff81893ba6>] trace_reschedule_interrupt+0x96/0xa0
         <EOI>  [<ffffffff8106e0d6>] ? native_safe_halt+0x6/0x10
         [<ffffffff8110fb1d>] ? trace_hardirqs_on+0xd/0x10
         [<ffffffff81040ac0>] default_idle+0x20/0x1a0
         [<ffffffff8104147f>] arch_cpu_idle+0xf/0x20
         [<ffffffff81102f8f>] default_idle_call+0x2f/0x50
         [<ffffffff8110332e>] cpu_startup_entry+0x37e/0x450
         [<ffffffff8105af70>] start_secondary+0x160/0x1a0
      
      Note the hang only occurs when echoing the above from a physical serial
      console, not from an ssh session.
      
      The bug is caused by a deadlock where the task is trying to grab the rq
      lock twice because printk()'s aren't safe in sched code.
      Signed-off-by: default avatarJosh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Matt Fleming <matt@codeblueprint.co.uk>
      Cc: Mel Gorman <mgorman@techsingularity.net>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Fixes: cb251765 ("sched/debug: Make schedstats a runtime tunable that is disabled by default")
      Link: http://lkml.kernel.org/r/20160613073209.gdvdybiruljbkn3p@trebleSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      f6e9bad4
    • Andrey Ryabinin's avatar
      kernel/sysrq, watchdog, sched/core: Reset watchdog on all CPUs while processing sysrq-w · 543a87ab
      Andrey Ryabinin authored
      commit 57675cb9 upstream.
      
      Lengthy output of sysrq-w may take a lot of time on slow serial console.
      
      Currently we reset NMI-watchdog on the current CPU to avoid spurious
      lockup messages. Sometimes this doesn't work since softlockup watchdog
      might trigger on another CPU which is waiting for an IPI to proceed.
      We reset softlockup watchdogs on all CPUs, but we do this only after
      listing all tasks, and this may be too late on a busy system.
      
      So, reset watchdogs CPUs earlier, in for_each_process_thread() loop.
      Signed-off-by: default avatarAndrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1465474805-14641-1-git-send-email-aryabinin@virtuozzo.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      543a87ab
    • Jiri Slaby's avatar
      pps: do not crash when failed to register · 266f1d62
      Jiri Slaby authored
      commit 368301f2 upstream.
      
      With this command sequence:
      
        modprobe plip
        modprobe pps_parport
        rmmod pps_parport
      
      the partport_pps modules causes this crash:
      
        BUG: unable to handle kernel NULL pointer dereference at (null)
        IP: parport_detach+0x1d/0x60 [pps_parport]
        Oops: 0000 [#1] SMP
        ...
        Call Trace:
          parport_unregister_driver+0x65/0xc0 [parport]
          SyS_delete_module+0x187/0x210
      
      The sequence that builds up to this is:
      
       1) plip is loaded and takes the parport device for exclusive use:
      
          plip0: Parallel port at 0x378, using IRQ 7.
      
       2) pps_parport then fails to grab the device:
      
          pps_parport: parallel port PPS client
          parport0: cannot grant exclusive access for device pps_parport
          pps_parport: couldn't register with parport0
      
       3) rmmod of pps_parport is then killed because it tries to access
          pardev->name, but pardev (taken from port->cad) is NULL.
      
      So add a check for NULL in the test there too.
      
      Link: http://lkml.kernel.org/r/20160714115245.12651-1-jslaby@suse.czSigned-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      Acked-by: default avatarRodolfo Giometti <giometti@enneenne.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      266f1d62
    • Andrey Ryabinin's avatar
      radix-tree: fix radix_tree_iter_retry() for tagged iterators. · d9e8be11
      Andrey Ryabinin authored
      commit 3cb9185c upstream.
      
      radix_tree_iter_retry() resets slot to NULL, but it doesn't reset tags.
      Then NULL slot and non-zero iter.tags passed to radix_tree_next_slot()
      leading to crash:
      
        RIP: radix_tree_next_slot include/linux/radix-tree.h:473
          find_get_pages_tag+0x334/0x930 mm/filemap.c:1452
        ....
        Call Trace:
          pagevec_lookup_tag+0x3a/0x80 mm/swap.c:960
          mpage_prepare_extent_to_map+0x321/0xa90 fs/ext4/inode.c:2516
          ext4_writepages+0x10be/0x2b20 fs/ext4/inode.c:2736
          do_writepages+0x97/0x100 mm/page-writeback.c:2364
          __filemap_fdatawrite_range+0x248/0x2e0 mm/filemap.c:300
          filemap_write_and_wait_range+0x121/0x1b0 mm/filemap.c:490
          ext4_sync_file+0x34d/0xdb0 fs/ext4/fsync.c:115
          vfs_fsync_range+0x10a/0x250 fs/sync.c:195
          vfs_fsync fs/sync.c:209
          do_fsync+0x42/0x70 fs/sync.c:219
          SYSC_fdatasync fs/sync.c:232
          SyS_fdatasync+0x19/0x20 fs/sync.c:230
          entry_SYSCALL_64_fastpath+0x23/0xc1 arch/x86/entry/entry_64.S:207
      
      We must reset iterator's tags to bail out from radix_tree_next_slot()
      and go to the slow-path in radix_tree_next_chunk().
      
      Fixes: 46437f9a ("radix-tree: fix race in gang lookup")
      Link: http://lkml.kernel.org/r/1468495196-10604-1-git-send-email-aryabinin@virtuozzo.comSigned-off-by: default avatarAndrey Ryabinin <aryabinin@virtuozzo.com>
      Reported-by: default avatarDmitry Vyukov <dvyukov@google.com>
      Acked-by: default avatarKonstantin Khlebnikov <koct9i@gmail.com>
      Cc: Matthew Wilcox <willy@linux.intel.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d9e8be11
    • Johannes Weiner's avatar
      mm: memcontrol: fix cgroup creation failure after many small jobs · db70cd18
      Johannes Weiner authored
      commit 73f576c0 upstream.
      
      The memory controller has quite a bit of state that usually outlives the
      cgroup and pins its CSS until said state disappears.  At the same time
      it imposes a 16-bit limit on the CSS ID space to economically store IDs
      in the wild.  Consequently, when we use cgroups to contain frequent but
      small and short-lived jobs that leave behind some page cache, we quickly
      run into the 64k limitations of outstanding CSSs.  Creating a new cgroup
      fails with -ENOSPC while there are only a few, or even no user-visible
      cgroups in existence.
      
      Although pinning CSSs past cgroup removal is common, there are only two
      instances that actually need an ID after a cgroup is deleted: cache
      shadow entries and swapout records.
      
      Cache shadow entries reference the ID weakly and can deal with the CSS
      having disappeared when it's looked up later.  They pose no hurdle.
      
      Swap-out records do need to pin the css to hierarchically attribute
      swapins after the cgroup has been deleted; though the only pages that
      remain swapped out after offlining are tmpfs/shmem pages.  And those
      references are under the user's control, so they are manageable.
      
      This patch introduces a private 16-bit memcg ID and switches swap and
      cache shadow entries over to using that.  This ID can then be recycled
      after offlining when the CSS remains pinned only by objects that don't
      specifically need it.
      
      This script demonstrates the problem by faulting one cache page in a new
      cgroup and deleting it again:
      
        set -e
        mkdir -p pages
        for x in `seq 128000`; do
          [ $((x % 1000)) -eq 0 ] && echo $x
          mkdir /cgroup/foo
          echo $$ >/cgroup/foo/cgroup.procs
          echo trex >pages/$x
          echo $$ >/cgroup/cgroup.procs
          rmdir /cgroup/foo
        done
      
      When run on an unpatched kernel, we eventually run out of possible IDs
      even though there are no visible cgroups:
      
        [root@ham ~]# ./cssidstress.sh
        [...]
        65000
        mkdir: cannot create directory '/cgroup/foo': No space left on device
      
      After this patch, the IDs get released upon cgroup destruction and the
      cache and css objects get released once memory reclaim kicks in.
      
      [hannes@cmpxchg.org: init the IDR]
        Link: http://lkml.kernel.org/r/20160621154601.GA22431@cmpxchg.org
      Fixes: b2052564 ("mm: memcontrol: continue cache reclaim from offlined groups")
      Link: http://lkml.kernel.org/r/20160617162516.GD19084@cmpxchg.orgSigned-off-by: default avatarJohannes Weiner <hannes@cmpxchg.org>
      Reported-by: default avatarJohn Garcia <john.garcia@mesosphere.io>
      Reviewed-by: default avatarVladimir Davydov <vdavydov@virtuozzo.com>
      Acked-by: default avatarTejun Heo <tj@kernel.org>
      Cc: Nikolay Borisov <kernel@kyup.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      db70cd18