1. 15 Sep, 2016 40 commits
    • Vegard Nossum's avatar
      ALSA: timer: fix division by zero after SNDRV_TIMER_IOCTL_CONTINUE · 6fd91313
      Vegard Nossum authored
      commit 6b760bb2 upstream.
      
      I got this:
      
          divide error: 0000 [#1] PREEMPT SMP KASAN
          CPU: 1 PID: 1327 Comm: a.out Not tainted 4.8.0-rc2+ #189
          Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.3-0-ge2fc41e-prebuilt.qemu-project.org 04/01/2014
          task: ffff8801120a9580 task.stack: ffff8801120b0000
          RIP: 0010:[<ffffffff82c8bd9a>]  [<ffffffff82c8bd9a>] snd_hrtimer_callback+0x1da/0x3f0
          RSP: 0018:ffff88011aa87da8  EFLAGS: 00010006
          RAX: 0000000000004f76 RBX: ffff880112655e88 RCX: 0000000000000000
          RDX: 0000000000000000 RSI: ffff880112655ea0 RDI: 0000000000000001
          RBP: ffff88011aa87e00 R08: ffff88013fff905c R09: ffff88013fff9048
          R10: ffff88013fff9050 R11: 00000001050a7b8c R12: ffff880114778a00
          R13: ffff880114778ab4 R14: ffff880114778b30 R15: 0000000000000000
          FS:  00007f071647c700(0000) GS:ffff88011aa80000(0000) knlGS:0000000000000000
          CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
          CR2: 0000000000603001 CR3: 0000000112021000 CR4: 00000000000006e0
          Stack:
           0000000000000000 ffff880114778ab8 ffff880112655ea0 0000000000004f76
           ffff880112655ec8 ffff880112655e80 ffff880112655e88 ffff88011aa98fc0
           00000000b97ccf2b dffffc0000000000 ffff88011aa98fc0 ffff88011aa87ef0
          Call Trace:
           <IRQ>
           [<ffffffff813abce7>] __hrtimer_run_queues+0x347/0xa00
           [<ffffffff82c8bbc0>] ? snd_hrtimer_close+0x130/0x130
           [<ffffffff813ab9a0>] ? retrigger_next_event+0x1b0/0x1b0
           [<ffffffff813ae1a6>] ? hrtimer_interrupt+0x136/0x4b0
           [<ffffffff813ae220>] hrtimer_interrupt+0x1b0/0x4b0
           [<ffffffff8120f91e>] local_apic_timer_interrupt+0x6e/0xf0
           [<ffffffff81227ad3>] ? kvm_guest_apic_eoi_write+0x13/0xc0
           [<ffffffff83c35086>] smp_apic_timer_interrupt+0x76/0xa0
           [<ffffffff83c3416c>] apic_timer_interrupt+0x8c/0xa0
           <EOI>
           [<ffffffff83c3239c>] ? _raw_spin_unlock_irqrestore+0x2c/0x60
           [<ffffffff82c8185d>] snd_timer_start1+0xdd/0x670
           [<ffffffff82c87015>] snd_timer_continue+0x45/0x80
           [<ffffffff82c88100>] snd_timer_user_ioctl+0x1030/0x2830
           [<ffffffff8159f3a0>] ? __follow_pte.isra.49+0x430/0x430
           [<ffffffff82c870d0>] ? snd_timer_pause+0x80/0x80
           [<ffffffff815a26fa>] ? do_wp_page+0x3aa/0x1c90
           [<ffffffff815aa4f8>] ? handle_mm_fault+0xbc8/0x27f0
           [<ffffffff815a9930>] ? __pmd_alloc+0x370/0x370
           [<ffffffff82c870d0>] ? snd_timer_pause+0x80/0x80
           [<ffffffff816b0733>] do_vfs_ioctl+0x193/0x1050
           [<ffffffff816b05a0>] ? ioctl_preallocate+0x200/0x200
           [<ffffffff81002f2f>] ? syscall_trace_enter+0x3cf/0xdb0
           [<ffffffff815045ba>] ? __context_tracking_exit.part.4+0x9a/0x1e0
           [<ffffffff81002b60>] ? exit_to_usermode_loop+0x190/0x190
           [<ffffffff82001a97>] ? check_preemption_disabled+0x37/0x1e0
           [<ffffffff81d93889>] ? security_file_ioctl+0x89/0xb0
           [<ffffffff816b167f>] SyS_ioctl+0x8f/0xc0
           [<ffffffff816b15f0>] ? do_vfs_ioctl+0x1050/0x1050
           [<ffffffff81005524>] do_syscall_64+0x1c4/0x4e0
           [<ffffffff83c32b2a>] entry_SYSCALL64_slow_path+0x25/0x25
          Code: e8 fc 42 7b fe 8b 0d 06 8a 50 03 49 0f af cf 48 85 c9 0f 88 7c 01 00 00 48 89 4d a8 e8 e0 42 7b fe 48 8b 45 c0 48 8b 4d a8 48 99 <48> f7 f9 49 01 c7 e8 cb 42 7b fe 48 8b 55 d0 48 b8 00 00 00 00
          RIP  [<ffffffff82c8bd9a>] snd_hrtimer_callback+0x1da/0x3f0
           RSP <ffff88011aa87da8>
          ---[ end trace 6aa380f756a21074 ]---
      
      The problem happens when you call ioctl(SNDRV_TIMER_IOCTL_CONTINUE) on a
      completely new/unused timer -- it will have ->sticks == 0, which causes a
      divide by 0 in snd_hrtimer_callback().
      Signed-off-by: default avatarVegard Nossum <vegard.nossum@oracle.com>
      Signed-off-by: default avatarTakashi Iwai <tiwai@suse.de>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      6fd91313
    • Vegard Nossum's avatar
      ALSA: timer: fix NULL pointer dereference in read()/ioctl() race · 6664ad65
      Vegard Nossum authored
      commit 11749e08 upstream.
      
      I got this with syzkaller:
      
          ==================================================================
          BUG: KASAN: null-ptr-deref on address 0000000000000020
          Read of size 32 by task syz-executor/22519
          CPU: 1 PID: 22519 Comm: syz-executor Not tainted 4.8.0-rc2+ #169
          Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.3-0-ge2fc41e-prebuilt.qemu-project.org 04/01/2
          014
           0000000000000001 ffff880111a17a00 ffffffff81f9f141 ffff880111a17a90
           ffff880111a17c50 ffff880114584a58 ffff880114584a10 ffff880111a17a80
           ffffffff8161fe3f ffff880100000000 ffff880118d74a48 ffff880118d74a68
          Call Trace:
           [<ffffffff81f9f141>] dump_stack+0x83/0xb2
           [<ffffffff8161fe3f>] kasan_report_error+0x41f/0x4c0
           [<ffffffff8161ff74>] kasan_report+0x34/0x40
           [<ffffffff82c84b54>] ? snd_timer_user_read+0x554/0x790
           [<ffffffff8161e79e>] check_memory_region+0x13e/0x1a0
           [<ffffffff8161e9c1>] kasan_check_read+0x11/0x20
           [<ffffffff82c84b54>] snd_timer_user_read+0x554/0x790
           [<ffffffff82c84600>] ? snd_timer_user_info_compat.isra.5+0x2b0/0x2b0
           [<ffffffff817d0831>] ? proc_fault_inject_write+0x1c1/0x250
           [<ffffffff817d0670>] ? next_tgid+0x2a0/0x2a0
           [<ffffffff8127c278>] ? do_group_exit+0x108/0x330
           [<ffffffff8174653a>] ? fsnotify+0x72a/0xca0
           [<ffffffff81674dfe>] __vfs_read+0x10e/0x550
           [<ffffffff82c84600>] ? snd_timer_user_info_compat.isra.5+0x2b0/0x2b0
           [<ffffffff81674cf0>] ? do_sendfile+0xc50/0xc50
           [<ffffffff81745e10>] ? __fsnotify_update_child_dentry_flags+0x60/0x60
           [<ffffffff8143fec6>] ? kcov_ioctl+0x56/0x190
           [<ffffffff81e5ada2>] ? common_file_perm+0x2e2/0x380
           [<ffffffff81746b0e>] ? __fsnotify_parent+0x5e/0x2b0
           [<ffffffff81d93536>] ? security_file_permission+0x86/0x1e0
           [<ffffffff816728f5>] ? rw_verify_area+0xe5/0x2b0
           [<ffffffff81675355>] vfs_read+0x115/0x330
           [<ffffffff81676371>] SyS_read+0xd1/0x1a0
           [<ffffffff816762a0>] ? vfs_write+0x4b0/0x4b0
           [<ffffffff82001c2c>] ? __this_cpu_preempt_check+0x1c/0x20
           [<ffffffff8150455a>] ? __context_tracking_exit.part.4+0x3a/0x1e0
           [<ffffffff816762a0>] ? vfs_write+0x4b0/0x4b0
           [<ffffffff81005524>] do_syscall_64+0x1c4/0x4e0
           [<ffffffff810052fc>] ? syscall_return_slowpath+0x16c/0x1d0
           [<ffffffff83c3276a>] entry_SYSCALL64_slow_path+0x25/0x25
          ==================================================================
      
      There are a couple of problems that I can see:
      
       - ioctl(SNDRV_TIMER_IOCTL_SELECT), which potentially sets
         tu->queue/tu->tqueue to NULL on memory allocation failure, so read()
         would get a NULL pointer dereference like the above splat
      
       - the same ioctl() can free tu->queue/to->tqueue which means read()
         could potentially see (and dereference) the freed pointer
      
      We can fix both by taking the ioctl_lock mutex when dereferencing
      ->queue/->tqueue, since that's always held over all the ioctl() code.
      
      Just looking at the code I find it likely that there are more problems
      here such as tu->qhead pointing outside the buffer if the size is
      changed concurrently using SNDRV_TIMER_IOCTL_PARAMS.
      Signed-off-by: default avatarVegard Nossum <vegard.nossum@oracle.com>
      Signed-off-by: default avatarTakashi Iwai <tiwai@suse.de>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      6664ad65
    • Kai-Heng Feng's avatar
      ALSA: hda - Enable subwoofer on Dell Inspiron 7559 · 857fbd7a
      Kai-Heng Feng authored
      commit fd06c77e upstream.
      
      The subwoofer on Inspiron 7559 was disabled originally.
      Applying a pin fixup to node 0x1b can enable it and make it work.
      
      Old pin: 0x411111f0
      New pin: 0x90170151
      Signed-off-by: default avatarKai-Heng Feng <kai.heng.feng@canonical.com>
      Signed-off-by: default avatarTakashi Iwai <tiwai@suse.de>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      857fbd7a
    • Shrirang Bagul's avatar
      ALSA: hda - Add headset mic quirk for Dell Inspiron 5468 · eea36eb2
      Shrirang Bagul authored
      commit 311042d1 upstream.
      
      This patch enables headset microphone on some variants of
      Dell Inspiron 5468. (Dell SSID 0x07ad)
      
      BugLink: https://bugs.launchpad.net/bugs/1617900Signed-off-by: default avatarShrirang Bagul <shrirang.bagul@canonical.com>
      Signed-off-by: default avatarTakashi Iwai <tiwai@suse.de>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      eea36eb2
    • Takashi Iwai's avatar
      ALSA: rawmidi: Fix possible deadlock with virmidi registration · 42c51d00
      Takashi Iwai authored
      commit 816f318b upstream.
      
      When a seq-virmidi driver is initialized, it registers a rawmidi
      instance with its callback to create an associated seq kernel client.
      Currently it's done throughly in rawmidi's register_mutex context.
      Recently it was found that this may lead to a deadlock another rawmidi
      device that is being attached with the sequencer is accessed, as both
      open with the same register_mutex.  This was actually triggered by
      syzkaller, as Dmitry Vyukov reported:
      
      ======================================================
       [ INFO: possible circular locking dependency detected ]
       4.8.0-rc1+ #11 Not tainted
       -------------------------------------------------------
       syz-executor/7154 is trying to acquire lock:
        (register_mutex#5){+.+.+.}, at: [<ffffffff84fd6d4b>] snd_rawmidi_kernel_open+0x4b/0x260 sound/core/rawmidi.c:341
      
       but task is already holding lock:
        (&grp->list_mutex){++++.+}, at: [<ffffffff850138bb>] check_and_subscribe_port+0x5b/0x5c0 sound/core/seq/seq_ports.c:495
      
       which lock already depends on the new lock.
      
       the existing dependency chain (in reverse order) is:
      
       -> #1 (&grp->list_mutex){++++.+}:
          [<ffffffff8147a3a8>] lock_acquire+0x208/0x430 kernel/locking/lockdep.c:3746
          [<ffffffff863f6199>] down_read+0x49/0xc0 kernel/locking/rwsem.c:22
          [<     inline     >] deliver_to_subscribers sound/core/seq/seq_clientmgr.c:681
          [<ffffffff85005c5e>] snd_seq_deliver_event+0x35e/0x890 sound/core/seq/seq_clientmgr.c:822
          [<ffffffff85006e96>] > snd_seq_kernel_client_dispatch+0x126/0x170 sound/core/seq/seq_clientmgr.c:2418
          [<ffffffff85012c52>] snd_seq_system_broadcast+0xb2/0xf0 sound/core/seq/seq_system.c:101
          [<ffffffff84fff70a>] snd_seq_create_kernel_client+0x24a/0x330 sound/core/seq/seq_clientmgr.c:2297
          [<     inline     >] snd_virmidi_dev_attach_seq sound/core/seq/seq_virmidi.c:383
          [<ffffffff8502d29f>] snd_virmidi_dev_register+0x29f/0x750 sound/core/seq/seq_virmidi.c:450
          [<ffffffff84fd208c>] snd_rawmidi_dev_register+0x30c/0xd40 sound/core/rawmidi.c:1645
          [<ffffffff84f816d3>] __snd_device_register.part.0+0x63/0xc0 sound/core/device.c:164
          [<     inline     >] __snd_device_register sound/core/device.c:162
          [<ffffffff84f8235d>] snd_device_register_all+0xad/0x110 sound/core/device.c:212
          [<ffffffff84f7546f>] snd_card_register+0xef/0x6c0 sound/core/init.c:749
          [<ffffffff85040b7f>] snd_virmidi_probe+0x3ef/0x590 sound/drivers/virmidi.c:123
          [<ffffffff833ebf7b>] platform_drv_probe+0x8b/0x170 drivers/base/platform.c:564
          ......
      
       -> #0 (register_mutex#5){+.+.+.}:
          [<     inline     >] check_prev_add kernel/locking/lockdep.c:1829
          [<     inline     >] check_prevs_add kernel/locking/lockdep.c:1939
          [<     inline     >] validate_chain kernel/locking/lockdep.c:2266
          [<ffffffff814791f4>] __lock_acquire+0x4d44/0x4d80 kernel/locking/lockdep.c:3335
          [<ffffffff8147a3a8>] lock_acquire+0x208/0x430 kernel/locking/lockdep.c:3746
          [<     inline     >] __mutex_lock_common kernel/locking/mutex.c:521
          [<ffffffff863f0ef1>] mutex_lock_nested+0xb1/0xa20 kernel/locking/mutex.c:621
          [<ffffffff84fd6d4b>] snd_rawmidi_kernel_open+0x4b/0x260 sound/core/rawmidi.c:341
          [<ffffffff8502e7c7>] midisynth_subscribe+0xf7/0x350 sound/core/seq/seq_midi.c:188
          [<     inline     >] subscribe_port sound/core/seq/seq_ports.c:427
          [<ffffffff85013cc7>] check_and_subscribe_port+0x467/0x5c0 sound/core/seq/seq_ports.c:510
          [<ffffffff85015da9>] snd_seq_port_connect+0x2c9/0x500 sound/core/seq/seq_ports.c:579
          [<ffffffff850079b8>] snd_seq_ioctl_subscribe_port+0x1d8/0x2b0 sound/core/seq/seq_clientmgr.c:1480
          [<ffffffff84ffe9e4>] snd_seq_do_ioctl+0x184/0x1e0 sound/core/seq/seq_clientmgr.c:2225
          [<ffffffff84ffeae8>] snd_seq_kernel_client_ctl+0xa8/0x110 sound/core/seq/seq_clientmgr.c:2440
          [<ffffffff85027664>] snd_seq_oss_midi_open+0x3b4/0x610 sound/core/seq/oss/seq_oss_midi.c:375
          [<ffffffff85023d67>] snd_seq_oss_synth_setup_midi+0x107/0x4c0 sound/core/seq/oss/seq_oss_synth.c:281
          [<ffffffff8501b0a8>] snd_seq_oss_open+0x748/0x8d0 sound/core/seq/oss/seq_oss_init.c:274
          [<ffffffff85019d8a>] odev_open+0x6a/0x90 sound/core/seq/oss/seq_oss.c:138
          [<ffffffff84f7040f>] soundcore_open+0x30f/0x640 sound/sound_core.c:639
          ......
      
       other info that might help us debug this:
      
       Possible unsafe locking scenario:
      
              CPU0                    CPU1
              ----                    ----
         lock(&grp->list_mutex);
                                      lock(register_mutex#5);
                                      lock(&grp->list_mutex);
         lock(register_mutex#5);
      
       *** DEADLOCK ***
      ======================================================
      
      The fix is to simply move the registration parts in
      snd_rawmidi_dev_register() to the outside of the register_mutex lock.
      The lock is needed only to manage the linked list, and it's not
      necessarily to cover the whole initialization process.
      Reported-by: default avatarDmitry Vyukov <dvyukov@google.com>
      Signed-off-by: default avatarTakashi Iwai <tiwai@suse.de>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      42c51d00
    • Takashi Sakamoto's avatar
      ALSA: fireworks: accessing to user space outside spinlock · e6c4138b
      Takashi Sakamoto authored
      commit 6b1ca4bc upstream.
      
      In hwdep interface of fireworks driver, accessing to user space is in a
      critical section with disabled local interrupt. Depending on architecture,
      accessing to user space can cause page fault exception. Then local
      processor stores machine status and handles the synchronous event. A
      handler corresponding to the event can call task scheduler to wait for
      preparing pages. In a case of usage of single core processor, the state to
      disable local interrupt is worse because it don't handle usual interrupts
      from hardware.
      
      This commit fixes this bug, performing the accessing outside spinlock. This
      commit also gives up counting the number of queued response messages to
      simplify ring-buffer management.
      Reported-by: default avatarVaishali Thakkar <vaishali.thakkar@oracle.com>
      Fixes: 555e8a8f('ALSA: fireworks: Add command/response functionality into hwdep interface')
      Signed-off-by: default avatarTakashi Sakamoto <o-takashi@sakamocchi.jp>
      Signed-off-by: default avatarTakashi Iwai <tiwai@suse.de>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e6c4138b
    • Takashi Sakamoto's avatar
      ALSA: firewire-tascam: accessing to user space outside spinlock · ad3bccfd
      Takashi Sakamoto authored
      commit 04b2d9c9 upstream.
      
      In hwdep interface of firewire-tascam driver, accessing to user space is
      in a critical section with disabled local interrupt. Depending on
      architecture, accessing to user space can cause page fault exception. Then
      local processor stores machine status and handle the synchronous event. A
      handler corresponding to the event can call task scheduler to wait for
      preparing pages. In a case of usage of single core processor, the state to
      disable local interrupt is worse because it doesn't handle usual interrupts
      from hardware.
      
      This commit fixes this bug, by performing the accessing outside spinlock.
      Reported-by: default avatarVaishali Thakkar <vaishali.thakkar@oracle.com>
      Fixes: e5e0c3dd('ALSA: firewire-tascam: add hwdep interface')
      Signed-off-by: default avatarTakashi Sakamoto <o-takashi@sakamocchi.jp>
      Signed-off-by: default avatarTakashi Iwai <tiwai@suse.de>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ad3bccfd
    • Ken Lin's avatar
      ALSA: usb-audio: Add sample rate inquiry quirk for B850V3 CP2114 · 9947ec2c
      Ken Lin authored
      commit 83d9956b upstream.
      
      Avoid getting sample rate on B850V3 CP2114 as it is unsupported and
      causes noisy "current rate is different from the runtime rate" messages
      when playback starts.
      Signed-off-by: default avatarKen Lin <ken.lin@advantech.com.tw>
      Signed-off-by: default avatarAkshay Bhat <akshay.bhat@timesys.com>
      Signed-off-by: default avatarTakashi Iwai <tiwai@suse.de>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      9947ec2c
    • Horia Geantă's avatar
      crypto: caam - fix IV loading for authenc (giv)decryption · f973851a
      Horia Geantă authored
      commit 8b18e235 upstream.
      
      For algorithms that implement IV generators before the crypto ops,
      the IV needed for decryption is initially located in req->src
      scatterlist, not in req->iv.
      
      Avoid copying the IV into req->iv by modifying the (givdecrypt)
      descriptors to load it directly from req->src.
      aead_givdecrypt() is no longer needed and goes away.
      
      Fixes: 479bcc7c ("crypto: caam - Convert authenc to new AEAD interface")
      Signed-off-by: default avatarHoria Geantă <horia.geanta@nxp.com>
      Signed-off-by: default avatarHerbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      f973851a
    • Oleg Nesterov's avatar
      uprobes: Fix the memcg accounting · f964b3b3
      Oleg Nesterov authored
      commit 6c4687cc upstream.
      
      __replace_page() wronlgy calls mem_cgroup_cancel_charge() in "success" path,
      it should only do this if page_check_address() fails.
      
      This means that every enable/disable leads to unbalanced mem_cgroup_uncharge()
      from put_page(old_page), it is trivial to underflow the page_counter->count
      and trigger OOM.
      Reported-and-tested-by: default avatarBrenden Blanco <bblanco@plumgrid.com>
      Signed-off-by: default avatarOleg Nesterov <oleg@redhat.com>
      Reviewed-by: default avatarJohannes Weiner <hannes@cmpxchg.org>
      Acked-by: default avatarMichal Hocko <mhocko@kernel.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vladimir Davydov <vdavydov@virtuozzo.com>
      Fixes: 00501b53 ("mm: memcontrol: rewrite charge API")
      Link: http://lkml.kernel.org/r/20160817153629.GB29724@redhat.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      f964b3b3
    • Wanpeng Li's avatar
      x86/apic: Do not init irq remapping if ioapic is disabled · 2e133f2d
      Wanpeng Li authored
      commit 2e63ad4b upstream.
      
      native_smp_prepare_cpus
        -> default_setup_apic_routing
          -> enable_IR_x2apic
            -> irq_remapping_prepare
              -> intel_prepare_irq_remapping
                -> intel_setup_irq_remapping
      
      So IR table is setup even if "noapic" boot parameter is added. As a result we
      crash later when the interrupt affinity is set due to a half initialized
      remapping infrastructure.
      
      Prevent remap initialization when IOAPIC is disabled.
      Signed-off-by: default avatarWanpeng Li <wanpeng.li@hotmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Joerg Roedel <joro@8bytes.org>
      Link: http://lkml.kernel.org/r/1471954039-3942-1-git-send-email-wanpeng.li@hotmail.comSigned-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      2e133f2d
    • Benjamin Coddington's avatar
      vhost/scsi: fix reuse of &vq->iov[out] in response · 9cb2e06a
      Benjamin Coddington authored
      commit a77ec83a upstream.
      
      The address of the iovec &vq->iov[out] is not guaranteed to contain the scsi
      command's response iovec throughout the lifetime of the command.  Rather, it
      is more likely to contain an iovec from an immediately following command
      after looping back around to vhost_get_vq_desc().  Pass along the iovec
      entirely instead.
      
      Fixes: 79c14141 ("vhost/scsi: Convert completion path to use copy_to_iter")
      Signed-off-by: default avatarBenjamin Coddington <bcodding@redhat.com>
      Signed-off-by: default avatarMichael S. Tsirkin <mst@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      9cb2e06a
    • Kent Overstreet's avatar
      bcache: RESERVE_PRIO is too small by one when prio_buckets() is a power of two. · 2d64cbc8
      Kent Overstreet authored
      commit acc9cf8c upstream.
      
      This patch fixes a cachedev registration-time allocation deadlock.
      This can deadlock on boot if your initrd auto-registeres bcache devices:
      
      Allocator thread:
      [  720.727614] INFO: task bcache_allocato:3833 blocked for more than 120 seconds.
      [  720.732361]  [<ffffffff816eeac7>] schedule+0x37/0x90
      [  720.732963]  [<ffffffffa05192b8>] bch_bucket_alloc+0x188/0x360 [bcache]
      [  720.733538]  [<ffffffff810e6950>] ? prepare_to_wait_event+0xf0/0xf0
      [  720.734137]  [<ffffffffa05302bd>] bch_prio_write+0x19d/0x340 [bcache]
      [  720.734715]  [<ffffffffa05190bf>] bch_allocator_thread+0x3ff/0x470 [bcache]
      [  720.735311]  [<ffffffff816ee41c>] ? __schedule+0x2dc/0x950
      [  720.735884]  [<ffffffffa0518cc0>] ? invalidate_buckets+0x980/0x980 [bcache]
      
      Registration thread:
      [  720.710403] INFO: task bash:3531 blocked for more than 120 seconds.
      [  720.715226]  [<ffffffff816eeac7>] schedule+0x37/0x90
      [  720.715805]  [<ffffffffa05235cd>] __bch_btree_map_nodes+0x12d/0x150 [bcache]
      [  720.716409]  [<ffffffffa0522d30>] ? bch_btree_insert_check_key+0x1c0/0x1c0 [bcache]
      [  720.717008]  [<ffffffffa05236e4>] bch_btree_insert+0xf4/0x170 [bcache]
      [  720.717586]  [<ffffffff810e6950>] ? prepare_to_wait_event+0xf0/0xf0
      [  720.718191]  [<ffffffffa0527d9a>] bch_journal_replay+0x14a/0x290 [bcache]
      [  720.718766]  [<ffffffff810cc90d>] ? ttwu_do_activate.constprop.94+0x5d/0x70
      [  720.719369]  [<ffffffff810cf684>] ? try_to_wake_up+0x1d4/0x350
      [  720.719968]  [<ffffffffa05317d0>] run_cache_set+0x580/0x8e0 [bcache]
      [  720.720553]  [<ffffffffa053302e>] register_bcache+0xe2e/0x13b0 [bcache]
      [  720.721153]  [<ffffffff81354cef>] kobj_attr_store+0xf/0x20
      [  720.721730]  [<ffffffff812a2dad>] sysfs_kf_write+0x3d/0x50
      [  720.722327]  [<ffffffff812a225a>] kernfs_fop_write+0x12a/0x180
      [  720.722904]  [<ffffffff81225177>] __vfs_write+0x37/0x110
      [  720.723503]  [<ffffffff81228048>] ? __sb_start_write+0x58/0x110
      [  720.724100]  [<ffffffff812cedb3>] ? security_file_permission+0x23/0xa0
      [  720.724675]  [<ffffffff812258a9>] vfs_write+0xa9/0x1b0
      [  720.725275]  [<ffffffff8102479c>] ? do_audit_syscall_entry+0x6c/0x70
      [  720.725849]  [<ffffffff81226755>] SyS_write+0x55/0xd0
      [  720.726451]  [<ffffffff8106a390>] ? do_page_fault+0x30/0x80
      [  720.727045]  [<ffffffff816f2cae>] system_call_fastpath+0x12/0x71
      
      The fifo code in upstream bcache can't use the last element in the buffer,
      which was the cause of the bug: if you asked for a power of two size,
      it'd give you a fifo that could hold one less than what you asked for
      rather than allocating a buffer twice as big.
      Signed-off-by: default avatarKent Overstreet <kent.overstreet@gmail.com>
      Tested-by: default avatarEric Wheeler <bcache@linux.ewheeler.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      2d64cbc8
    • Vincent Stehlé's avatar
      ubifs: Fix assertion in layout_in_gaps() · 4c109816
      Vincent Stehlé authored
      commit c0082e98 upstream.
      
      An assertion in layout_in_gaps() verifies that the gap_lebs pointer is
      below the maximum bound. When computing this maximum bound the idx_lebs
      count is multiplied by sizeof(int), while C pointers arithmetic does take
      into account the size of the pointed elements implicitly already. Remove
      the multiplication to fix the assertion.
      
      Fixes: 1e51764a ("UBIFS: add new flash file system")
      Signed-off-by: default avatarVincent Stehlé <vincent.stehle@intel.com>
      Cc: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
      Signed-off-by: default avatarArtem Bityutskiy <artem.bityutskiy@linux.intel.com>
      Signed-off-by: default avatarRichard Weinberger <richard@nod.at>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      4c109816
    • Miklos Szeredi's avatar
      ovl: fix workdir creation · 2f949da9
      Miklos Szeredi authored
      commit e1ff3dd1 upstream.
      
      Workdir creation fails in latest kernel.
      
      Fix by allowing EOPNOTSUPP as a valid return value from
      vfs_removexattr(XATTR_NAME_POSIX_ACL_*).  Upper filesystem may not support
      ACL and still be perfectly able to support overlayfs.
      Reported-by: default avatarMartin Ziegler <ziegler@uni-freiburg.de>
      Signed-off-by: default avatarMiklos Szeredi <mszeredi@redhat.com>
      Fixes: c11b9fdd ("ovl: remove posix_acl_default from workdir")
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      2f949da9
    • Miklos Szeredi's avatar
      ovl: listxattr: use strnlen() · 708cb42f
      Miklos Szeredi authored
      commit 7cb35119 upstream.
      
      Be defensive about what underlying fs provides us in the returned xattr
      list buffer.  If it's not properly null terminated, bail out with a warning
      insead of BUG.
      Signed-off-by: default avatarMiklos Szeredi <mszeredi@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      708cb42f
    • Miklos Szeredi's avatar
      ovl: remove posix_acl_default from workdir · d57a6c74
      Miklos Szeredi authored
      commit c11b9fdd upstream.
      
      Clear out posix acl xattrs on workdir and also reset the mode after
      creation so that an inherited sgid bit is cleared.
      Signed-off-by: default avatarMiklos Szeredi <mszeredi@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d57a6c74
    • Miklos Szeredi's avatar
      ovl: don't copy up opaqueness · 48fd20d7
      Miklos Szeredi authored
      commit 0956254a upstream.
      
      When a copy up of a directory occurs which has the opaque xattr set, the
      xattr remains in the upper directory. The immediate behavior with overlayfs
      is that the upper directory is not treated as opaque, however after a
      remount the opaque flag is used and upper directory is treated as opaque.
      This causes files created in the lower layer to be hidden when using
      multiple lower directories.
      
      Fix by not copying up the opaque flag.
      
      To reproduce:
      
       ----8<---------8<---------8<---------8<---------8<---------8<----
      mkdir -p l/d/s u v w mnt
      mount -t overlay overlay -olowerdir=l,upperdir=u,workdir=w mnt
      rm -rf mnt/d/
      mkdir -p mnt/d/n
      umount mnt
      mount -t overlay overlay -olowerdir=u:l,upperdir=v,workdir=w mnt
      touch mnt/d/foo
      umount mnt
      mount -t overlay overlay -olowerdir=u:l,upperdir=v,workdir=w mnt
      ls mnt/d
       ----8<---------8<---------8<---------8<---------8<---------8<----
      
      output should be:  "foo  n"
      Reported-by: default avatarDerek McGowan <dmcg@drizz.net>
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=151291Signed-off-by: default avatarMiklos Szeredi <mszeredi@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      48fd20d7
    • Al Viro's avatar
      wrappers for ->i_mutex access · d72e9b25
      Al Viro authored
      commit 5955102c upstream
      
      parallel to mutex_{lock,unlock,trylock,is_locked,lock_nested},
      inode_foo(inode) being mutex_foo(&inode->i_mutex).
      
      Please, use those for access to ->i_mutex; over the coming cycle
      ->i_mutex will become rwsem, with ->lookup() done with it held
      only shared.
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      [only the fs.h change included to make backports easier - gregkh]
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d72e9b25
    • Al Viro's avatar
      lustre: remove unused declaration · 09ca40a6
      Al Viro authored
      commit 57b8f112 upstream.
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      09ca40a6
    • John Stultz's avatar
      timekeeping: Avoid taking lock in NMI path with CONFIG_DEBUG_TIMEKEEPING · 4eca11db
      John Stultz authored
      commit 27727df2 upstream.
      
      When I added some extra sanity checking in timekeeping_get_ns() under
      CONFIG_DEBUG_TIMEKEEPING, I missed that the NMI safe __ktime_get_fast_ns()
      method was using timekeeping_get_ns().
      
      Thus the locking added to the debug checks broke the NMI-safety of
      __ktime_get_fast_ns().
      
      This patch open-codes the timekeeping_get_ns() logic for
      __ktime_get_fast_ns(), so can avoid any deadlocks in NMI.
      
      Fixes: 4ca22c26 "timekeeping: Add warnings when overflows or underflows are observed"
      Reported-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      Reported-by: default avatarPeter Zijlstra <peterz@infradead.org>
      Signed-off-by: default avatarJohn Stultz <john.stultz@linaro.org>
      Link: http://lkml.kernel.org/r/1471993702-29148-2-git-send-email-john.stultz@linaro.orgSigned-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      4eca11db
    • John Stultz's avatar
      timekeeping: Cap array access in timekeeping_debug · 42ef9015
      John Stultz authored
      commit a4f8f666 upstream.
      
      It was reported that hibernation could fail on the 2nd attempt, where the
      system hangs at hibernate() -> syscore_resume() -> i8237A_resume() ->
      claim_dma_lock(), because the lock has already been taken.
      
      However there is actually no other process would like to grab this lock on
      that problematic platform.
      
      Further investigation showed that the problem is triggered by setting
      /sys/power/pm_trace to 1 before the 1st hibernation.
      
      Since once pm_trace is enabled, the rtc becomes unmeaningful after suspend,
      and meanwhile some BIOSes would like to adjust the 'invalid' RTC (e.g, smaller
      than 1970) to the release date of that motherboard during POST stage, thus
      after resumed, it may seem that the system had a significant long sleep time
      which is a completely meaningless value.
      
      Then in timekeeping_resume -> tk_debug_account_sleep_time, if the bit31 of the
      sleep time happened to be set to 1, fls() returns 32 and we add 1 to
      sleep_time_bin[32], which causes an out of bounds array access and therefor
      memory being overwritten.
      
      As depicted by System.map:
      0xffffffff81c9d080 b sleep_time_bin
      0xffffffff81c9d100 B dma_spin_lock
      the dma_spin_lock.val is set to 1, which caused this problem.
      
      This patch adds a sanity check in tk_debug_account_sleep_time()
      to ensure we don't index past the sleep_time_bin array.
      
      [jstultz: Problem diagnosed and original patch by Chen Yu, I've solved the
       issue slightly differently, but borrowed his excelent explanation of the
       issue here.]
      
      Fixes: 5c83545f "power: Add option to log time spent in suspend"
      Reported-by: default avatarJanek Kozicki <cosurgi@gmail.com>
      Reported-by: default avatarChen Yu <yu.c.chen@intel.com>
      Signed-off-by: default avatarJohn Stultz <john.stultz@linaro.org>
      Cc: linux-pm@vger.kernel.org
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Xunlei Pang <xpang@redhat.com>
      Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
      Cc: Zhang Rui <rui.zhang@intel.com>
      Link: http://lkml.kernel.org/r/1471993702-29148-3-git-send-email-john.stultz@linaro.orgSigned-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      42ef9015
    • Dave Chinner's avatar
      xfs: fix superblock inprogress check · f5edb04b
      Dave Chinner authored
      commit f3d7ebde upstream.
      
      From inspection, the superblock sb_inprogress check is done in the
      verifier and triggered only for the primary superblock via a
      "bp->b_bn == XFS_SB_DADDR" check.
      
      Unfortunately, the primary superblock is an uncached buffer, and
      hence it is configured by xfs_buf_read_uncached() with:
      
      	bp->b_bn = XFS_BUF_DADDR_NULL;  /* always null for uncached buffers */
      
      And so this check never triggers. Fix it.
      Signed-off-by: default avatarDave Chinner <dchinner@redhat.com>
      Reviewed-by: default avatarBrian Foster <bfoster@redhat.com>
      Reviewed-by: default avatarChristoph Hellwig <hch@lst.de>
      Signed-off-by: default avatarDave Chinner <david@fromorbit.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      f5edb04b
    • Christoph Huber's avatar
      ASoC: atmel_ssc_dai: Don't unconditionally reset SSC on stream startup · 4757f7ed
      Christoph Huber authored
      commit 3e103a65 upstream.
      
      commit cbaadf0f ("ASoC: atmel_ssc_dai: refactor the startup and
      shutdown") refactored code such that the SSC is reset on every
      startup; this breaks duplex audio (e.g. first start audio playback,
      then start record, causing the playback to stop/hang)
      
      Fixes: cbaadf0f (ASoC: atmel_ssc_dai: refactor the startup and shutdown)
      Signed-off-by: default avatarChristoph Huber <c.huber@bct-electronic.com>
      Signed-off-by: default avatarPeter Meerwald-Stadler <p.meerwald@bct-electronic.com>
      Signed-off-by: default avatarMark Brown <broonie@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      4757f7ed
    • Rob Clark's avatar
      drm/msm: fix use of copy_from_user() while holding spinlock · 103898dd
      Rob Clark authored
      commit 89f82cbb upstream.
      
      Use instead __copy_from_user_inatomic() and fallback to slow-path where
      we drop and re-aquire the lock in case of fault.
      Reported-by: default avatarVaishali Thakkar <vaishali.thakkar@oracle.com>
      Signed-off-by: default avatarRob Clark <robdclark@gmail.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      103898dd
    • Daniel Vetter's avatar
      drm: Reject page_flip for !DRIVER_MODESET · b7e99f78
      Daniel Vetter authored
      commit 6f00975c upstream.
      
      Somehow this one slipped through, which means drivers without modeset
      support can be oopsed (since those also don't call
      drm_mode_config_init, which means the crtc lookup will chase an
      uninitalized idr).
      Reported-by: default avatarAlexander Potapenko <glider@google.com>
      Cc: Alexander Potapenko <glider@google.com>
      Signed-off-by: default avatarDaniel Vetter <daniel.vetter@intel.com>
      Reviewed-by: default avatarChris Wilson <chris@chris-wilson.co.uk>
      Signed-off-by: default avatarDave Airlie <airlied@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      b7e99f78
    • Christian König's avatar
      drm/radeon: fix radeon_move_blit on 32bit systems · 314c7e8a
      Christian König authored
      commit 13f479b9 upstream.
      
      This bug seems to be present for a very long time.
      Signed-off-by: default avatarChristian König <christian.koenig@amd.com>
      Reviewed-by: default avatarAlex Deucher <alexander.deucher@amd.com>
      Signed-off-by: default avatarAlex Deucher <alexander.deucher@amd.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      314c7e8a
    • Martin Schwidefsky's avatar
      s390/sclp_ctl: fix potential information leak with /dev/sclp · 2d29d6ce
      Martin Schwidefsky authored
      commit 532c34b5 upstream.
      
      The sclp_ctl_ioctl_sccb function uses two copy_from_user calls to
      retrieve the sclp request from user space. The first copy_from_user
      fetches the length of the request which is stored in the first two
      bytes of the request. The second copy_from_user gets the complete
      sclp request, but this copies the length field a second time.
      A malicious user may have changed the length in the meantime.
      Reported-by: default avatarPengfei Wang <wpengfeinudt@gmail.com>
      Reviewed-by: default avatarMichael Holzheu <holzheu@linux.vnet.ibm.com>
      Signed-off-by: default avatarMartin Schwidefsky <schwidefsky@de.ibm.com>
      Signed-off-by: default avatarJuerg Haefliger <juerg.haefliger@hpe.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      2d29d6ce
    • Kangjie Lu's avatar
      rds: fix an infoleak in rds_inc_info_copy · ffd5ce2a
      Kangjie Lu authored
      commit 4116def2 upstream.
      
      The last field "flags" of object "minfo" is not initialized.
      Copying this object out may leak kernel stack data.
      Assign 0 to it to avoid leak.
      Signed-off-by: default avatarKangjie Lu <kjlu@gatech.edu>
      Acked-by: default avatarSantosh Shilimkar <santosh.shilimkar@oracle.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJuerg Haefliger <juerg.haefliger@hpe.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ffd5ce2a
    • Michael Neuling's avatar
      powerpc/tm: Avoid SLB faults in treclaim/trecheckpoint when RI=0 · 0e324f6d
      Michael Neuling authored
      commit 190ce869 upstream.
      
      Currently we have 2 segments that are bolted for the kernel linear
      mapping (ie 0xc000... addresses). This is 0 to 1TB and also the kernel
      stacks. Anything accessed outside of these regions may need to be
      faulted in. (In practice machines with TM always have 1T segments)
      
      If a machine has < 2TB of memory we never fault on the kernel linear
      mapping as these two segments cover all physical memory. If a machine
      has > 2TB of memory, there may be structures outside of these two
      segments that need to be faulted in. This faulting can occur when
      running as a guest as the hypervisor may remove any SLB that's not
      bolted.
      
      When we treclaim and trecheckpoint we have a window where we need to
      run with the userspace GPRs. This means that we no longer have a valid
      stack pointer in r1. For this window we therefore clear MSR RI to
      indicate that any exceptions taken at this point won't be able to be
      handled. This means that we can't take segment misses in this RI=0
      window.
      
      In this RI=0 region, we currently access the thread_struct for the
      process being context switched to or from. This thread_struct access
      may cause a segment fault since it's not guaranteed to be covered by
      the two bolted segment entries described above.
      
      We've seen this with a crash when running as a guest with > 2TB of
      memory on PowerVM:
      
        Unrecoverable exception 4100 at c00000000004f138
        Oops: Unrecoverable exception, sig: 6 [#1]
        SMP NR_CPUS=2048 NUMA pSeries
        CPU: 1280 PID: 7755 Comm: kworker/1280:1 Tainted: G                 X 4.4.13-46-default #1
        task: c000189001df4210 ti: c000189001d5c000 task.ti: c000189001d5c000
        NIP: c00000000004f138 LR: 0000000010003a24 CTR: 0000000010001b20
        REGS: c000189001d5f730 TRAP: 4100   Tainted: G                 X  (4.4.13-46-default)
        MSR: 8000000100001031 <SF,ME,IR,DR,LE>  CR: 24000048  XER: 00000000
        CFAR: c00000000004ed18 SOFTE: 0
        GPR00: ffffffffc58d7b60 c000189001d5f9b0 00000000100d7d00 000000003a738288
        GPR04: 0000000000002781 0000000000000006 0000000000000000 c0000d1f4d889620
        GPR08: 000000000000c350 00000000000008ab 00000000000008ab 00000000100d7af0
        GPR12: 00000000100d7ae8 00003ffe787e67a0 0000000000000000 0000000000000211
        GPR16: 0000000010001b20 0000000000000000 0000000000800000 00003ffe787df110
        GPR20: 0000000000000001 00000000100d1e10 0000000000000000 00003ffe787df050
        GPR24: 0000000000000003 0000000000010000 0000000000000000 00003fffe79e2e30
        GPR28: 00003fffe79e2e68 00000000003d0f00 00003ffe787e67a0 00003ffe787de680
        NIP [c00000000004f138] restore_gprs+0xd0/0x16c
        LR [0000000010003a24] 0x10003a24
        Call Trace:
        [c000189001d5f9b0] [c000189001d5f9f0] 0xc000189001d5f9f0 (unreliable)
        [c000189001d5fb90] [c00000000001583c] tm_recheckpoint+0x6c/0xa0
        [c000189001d5fbd0] [c000000000015c40] __switch_to+0x2c0/0x350
        [c000189001d5fc30] [c0000000007e647c] __schedule+0x32c/0x9c0
        [c000189001d5fcb0] [c0000000007e6b58] schedule+0x48/0xc0
        [c000189001d5fce0] [c0000000000deabc] worker_thread+0x22c/0x5b0
        [c000189001d5fd80] [c0000000000e7000] kthread+0x110/0x130
        [c000189001d5fe30] [c000000000009538] ret_from_kernel_thread+0x5c/0xa4
        Instruction dump:
        7cb103a6 7cc0e3a6 7ca222a6 78a58402 38c00800 7cc62838 08860000 7cc000a6
        38a00006 78c60022 7cc62838 0b060000 <e8c701a0> 7ccff120 e8270078 e8a70098
        ---[ end trace 602126d0a1dedd54 ]---
      
      This fixes this by copying the required data from the thread_struct to
      the stack before we clear MSR RI. Then once we clear RI, we only access
      the stack, guaranteeing there's no segment miss.
      
      We also tighten the region over which we set RI=0 on the treclaim()
      path. This may have a slight performance impact since we're adding an
      mtmsr instruction.
      
      Fixes: 090b9284 ("powerpc/tm: Clear MSR RI in non-recoverable TM code")
      Signed-off-by: default avatarMichael Neuling <mikey@neuling.org>
      Reviewed-by: default avatarCyril Bur <cyrilbur@gmail.com>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      0e324f6d
    • Gabriel Krisman Bertazi's avatar
      nvme: Call pci_disable_device on the error path. · 81e9a969
      Gabriel Krisman Bertazi authored
      Commit 5706aca74fe4 ("NVMe: Don't unmap controller registers on reset"),
      which backported b00a726a to the 4.4.y kernel introduced a
      regression in which it didn't call pci_disable_device in the error path
      of nvme_pci_enable.
      Reported-by: default avatarJiri Slaby <jslaby@suse.cz>
      Embarassed-developer: Gabriel Krisman Bertazi <krisman@linux.vnet.ibm.com>
      Signed-off-by: default avatarGabriel Krisman Bertazi <krisman@linux.vnet.ibm.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      81e9a969
    • Balbir Singh's avatar
      cgroup: reduce read locked section of cgroup_threadgroup_rwsem during fork · db8c7fff
      Balbir Singh authored
      commit 568ac888 upstream.
      
      cgroup_threadgroup_rwsem is acquired in read mode during process exit
      and fork.  It is also grabbed in write mode during
      __cgroups_proc_write().  I've recently run into a scenario with lots
      of memory pressure and OOM and I am beginning to see
      
      systemd
      
       __switch_to+0x1f8/0x350
       __schedule+0x30c/0x990
       schedule+0x48/0xc0
       percpu_down_write+0x114/0x170
       __cgroup_procs_write.isra.12+0xb8/0x3c0
       cgroup_file_write+0x74/0x1a0
       kernfs_fop_write+0x188/0x200
       __vfs_write+0x6c/0xe0
       vfs_write+0xc0/0x230
       SyS_write+0x6c/0x110
       system_call+0x38/0xb4
      
      This thread is waiting on the reader of cgroup_threadgroup_rwsem to
      exit.  The reader itself is under memory pressure and has gone into
      reclaim after fork. There are times the reader also ends up waiting on
      oom_lock as well.
      
       __switch_to+0x1f8/0x350
       __schedule+0x30c/0x990
       schedule+0x48/0xc0
       jbd2_log_wait_commit+0xd4/0x180
       ext4_evict_inode+0x88/0x5c0
       evict+0xf8/0x2a0
       dispose_list+0x50/0x80
       prune_icache_sb+0x6c/0x90
       super_cache_scan+0x190/0x210
       shrink_slab.part.15+0x22c/0x4c0
       shrink_zone+0x288/0x3c0
       do_try_to_free_pages+0x1dc/0x590
       try_to_free_pages+0xdc/0x260
       __alloc_pages_nodemask+0x72c/0xc90
       alloc_pages_current+0xb4/0x1a0
       page_table_alloc+0xc0/0x170
       __pte_alloc+0x58/0x1f0
       copy_page_range+0x4ec/0x950
       copy_process.isra.5+0x15a0/0x1870
       _do_fork+0xa8/0x4b0
       ppc_clone+0x8/0xc
      
      In the meanwhile, all processes exiting/forking are blocked almost
      stalling the system.
      
      This patch moves the threadgroup_change_begin from before
      cgroup_fork() to just before cgroup_canfork().  There is no nee to
      worry about threadgroup changes till the task is actually added to the
      threadgroup.  This avoids having to call reclaim with
      cgroup_threadgroup_rwsem held.
      
      tj: Subject and description edits.
      Signed-off-by: default avatarBalbir Singh <bsingharora@gmail.com>
      Acked-by: default avatarZefan Li <lizefan@huawei.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      db8c7fff
    • Ming Lei's avatar
      block: make sure a big bio is split into at most 256 bvecs · 02989f49
      Ming Lei authored
      commit 4d70dca4 upstream.
      
      After arbitrary bio size was introduced, the incoming bio may
      be very big. We have to split the bio into small bios so that
      each holds at most BIO_MAX_PAGES bvecs for safety reason, such
      as bio_clone().
      
      This patch fixes the following kernel crash:
      
      > [  172.660142] BUG: unable to handle kernel NULL pointer dereference at 0000000000000028
      > [  172.660229] IP: [<ffffffff811e53b4>] bio_trim+0xf/0x2a
      > [  172.660289] PGD 7faf3e067 PUD 7f9279067 PMD 0
      > [  172.660399] Oops: 0000 [#1] SMP
      > [...]
      > [  172.664780] Call Trace:
      > [  172.664813]  [<ffffffffa007f3be>] ? raid1_make_request+0x2e8/0xad7 [raid1]
      > [  172.664846]  [<ffffffff811f07da>] ? blk_queue_split+0x377/0x3d4
      > [  172.664880]  [<ffffffffa005fb5f>] ? md_make_request+0xf6/0x1e9 [md_mod]
      > [  172.664912]  [<ffffffff811eb860>] ? generic_make_request+0xb5/0x155
      > [  172.664947]  [<ffffffffa0445c89>] ? prio_io+0x85/0x95 [bcache]
      > [  172.664981]  [<ffffffffa0448252>] ? register_cache_set+0x355/0x8d0 [bcache]
      > [  172.665016]  [<ffffffffa04497d3>] ? register_bcache+0x1006/0x1174 [bcache]
      
      The issue can be reproduced by the following steps:
      	- create one raid1 over two virtio-blk
      	- build bcache device over the above raid1 and another cache device
      	and bucket size is set as 2Mbytes
      	- set cache mode as writeback
      	- run random write over ext4 on the bcache device
      
      Fixes: 54efd50b(block: make generic_make_request handle arbitrarily sized bios)
      Reported-by: default avatarSebastian Roesner <sroesner-kernelorg@roesner-online.de>
      Reported-by: default avatarEric Wheeler <bcache@lists.ewheeler.net>
      Cc: Shaohua Li <shli@fb.com>
      Acked-by: default avatarKent Overstreet <kent.overstreet@gmail.com>
      Signed-off-by: default avatarMing Lei <ming.lei@canonical.com>
      Signed-off-by: default avatarJens Axboe <axboe@fb.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      02989f49
    • Bart Van Assche's avatar
      block: Fix race triggered by blk_set_queue_dying() · d3a6bd7b
      Bart Van Assche authored
      commit 1b856086 upstream.
      
      blk_set_queue_dying() can be called while another thread is
      submitting I/O or changing queue flags, e.g. through dm_stop_queue().
      Hence protect the QUEUE_FLAG_DYING flag change with locking.
      Signed-off-by: default avatarBart Van Assche <bart.vanassche@sandisk.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Mike Snitzer <snitzer@redhat.com>
      Signed-off-by: default avatarJens Axboe <axboe@fb.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d3a6bd7b
    • Daeho Jeong's avatar
      ext4: avoid modifying checksum fields directly during checksum verification · 1d12bad7
      Daeho Jeong authored
      commit b47820ed upstream.
      
      We temporally change checksum fields in buffers of some types of
      metadata into '0' for verifying the checksum values. By doing this
      without locking the buffer, some metadata's checksums, which are
      being committed or written back to the storage, could be damaged.
      In our test, several metadata blocks were found with damaged metadata
      checksum value during recovery process. When we only verify the
      checksum value, we have to avoid modifying checksum fields directly.
      Signed-off-by: default avatarDaeho Jeong <daeho.jeong@samsung.com>
      Signed-off-by: default avatarYoungjin Gil <youngjin.gil@samsung.com>
      Signed-off-by: default avatarTheodore Ts'o <tytso@mit.edu>
      Reviewed-by: default avatarDarrick J. Wong <darrick.wong@oracle.com>
      Cc: Török Edwin <edwin@etorok.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      1d12bad7
    • Jan Kara's avatar
      ext4: avoid deadlock when expanding inode size · 77ae14d0
      Jan Kara authored
      commit 2e81a4ee upstream.
      
      When we need to move xattrs into external xattr block, we call
      ext4_xattr_block_set() from ext4_expand_extra_isize_ea(). That may end
      up calling ext4_mark_inode_dirty() again which will recurse back into
      the inode expansion code leading to deadlocks.
      
      Protect from recursion using EXT4_STATE_NO_EXPAND inode flag and move
      its management into ext4_expand_extra_isize_ea() since its manipulation
      is safe there (due to xattr_sem) from possible races with
      ext4_xattr_set_handle() which plays with it as well.
      Signed-off-by: default avatarJan Kara <jack@suse.cz>
      Signed-off-by: default avatarTheodore Ts'o <tytso@mit.edu>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      77ae14d0
    • Jan Kara's avatar
      ext4: properly align shifted xattrs when expanding inodes · a79f1f7f
      Jan Kara authored
      commit 443a8c41 upstream.
      
      We did not count with the padding of xattr value when computing desired
      shift of xattrs in the inode when expanding i_extra_isize. As a result
      we could create unaligned start of inline xattrs. Account for alignment
      properly.
      Signed-off-by: default avatarJan Kara <jack@suse.cz>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      a79f1f7f
    • Jan Kara's avatar
      ext4: fix xattr shifting when expanding inodes part 2 · e6abdbf8
      Jan Kara authored
      commit 418c12d0 upstream.
      
      When multiple xattrs need to be moved out of inode, we did not properly
      recompute total size of xattr headers in the inode and the new header
      position. Thus when moving the second and further xattr we asked
      ext4_xattr_shift_entries() to move too much and from the wrong place,
      resulting in possible xattr value corruption or general memory
      corruption.
      Signed-off-by: default avatarJan Kara <jack@suse.cz>
      Signed-off-by: default avatarTheodore Ts'o <tytso@mit.edu>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e6abdbf8
    • Jan Kara's avatar
      ext4: fix xattr shifting when expanding inodes · f2c06c73
      Jan Kara authored
      commit d0141191 upstream.
      
      The code in ext4_expand_extra_isize_ea() treated new_extra_isize
      argument sometimes as the desired target i_extra_isize and sometimes as
      the amount by which we need to grow current i_extra_isize. These happen
      to coincide when i_extra_isize is 0 which used to be the common case and
      so nobody noticed this until recently when we added i_projid to the
      inode and so i_extra_isize now needs to grow from 28 to 32 bytes.
      
      The result of these bugs was that we sometimes unnecessarily decided to
      move xattrs out of inode even if there was enough space and we often
      ended up corrupting in-inode xattrs because arguments to
      ext4_xattr_shift_entries() were just wrong. This could demonstrate
      itself as BUG_ON in ext4_xattr_shift_entries() triggering.
      
      Fix the problem by introducing new isize_diff variable and use it where
      appropriate.
      Reported-by: default avatarDave Chinner <david@fromorbit.com>
      Signed-off-by: default avatarJan Kara <jack@suse.cz>
      Signed-off-by: default avatarTheodore Ts'o <tytso@mit.edu>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      f2c06c73
    • Theodore Ts'o's avatar
      ext4: validate that metadata blocks do not overlap superblock · dfa0a227
      Theodore Ts'o authored
      commit 829fa70d upstream.
      
      A number of fuzzing failures seem to be caused by allocation bitmaps
      or other metadata blocks being pointed at the superblock.
      
      This can cause kernel BUG or WARNings once the superblock is
      overwritten, so validate the group descriptor blocks to make sure this
      doesn't happen.
      Signed-off-by: default avatarTheodore Ts'o <tytso@mit.edu>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      dfa0a227