1. 06 May, 2015 20 commits
    • Steven Rostedt's avatar
      ring-buffer: Replace this_cpu_*() with __this_cpu_*() · 16d35612
      Steven Rostedt authored
      commit 80a9b64e upstream.
      
      It has come to my attention that this_cpu_read/write are horrible on
      architectures other than x86. Worse yet, they actually disable
      preemption or interrupts! This caused some unexpected tracing results
      on ARM.
      
         101.356868: preempt_count_add <-ring_buffer_lock_reserve
         101.356870: preempt_count_sub <-ring_buffer_lock_reserve
      
      The ring_buffer_lock_reserve has recursion protection that requires
      accessing a per cpu variable. But since preempt_disable() is traced, it
      too got traced while accessing the variable that is suppose to prevent
      recursion like this.
      
      The generic version of this_cpu_read() and write() are:
      
       #define this_cpu_generic_read(pcp)					\
       ({	typeof(pcp) ret__;						\
      	preempt_disable();						\
      	ret__ = *this_cpu_ptr(&(pcp));					\
      	preempt_enable();						\
      	ret__;								\
       })
      
       #define this_cpu_generic_to_op(pcp, val, op)				\
       do {									\
      	unsigned long flags;						\
      	raw_local_irq_save(flags);					\
      	*__this_cpu_ptr(&(pcp)) op val;					\
      	raw_local_irq_restore(flags);					\
       } while (0)
      
      Which is unacceptable for locations that know they are within preempt
      disabled or interrupt disabled locations.
      
      Paul McKenney stated that __this_cpu_() versions produce much better code on
      other architectures than this_cpu_() does, if we know that the call is done in
      a preempt disabled location.
      
      I also changed the recursive_unlock() to use two local variables instead
      of accessing the per_cpu variable twice.
      
      Link: http://lkml.kernel.org/r/20150317114411.GE3589@linux.vnet.ibm.com
      Link: http://lkml.kernel.org/r/20150317104038.312e73d1@gandalf.local.homeAcked-by: default avatarChristoph Lameter <cl@linux.com>
      Reported-by: default avatarUwe Kleine-Koenig <u.kleine-koenig@pengutronix.de>
      Tested-by: default avatarUwe Kleine-Koenig <u.kleine-koenig@pengutronix.de>
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      16d35612
    • Krzysztof Kozlowski's avatar
      compal-laptop: Check return value of power_supply_register · 223bdc47
      Krzysztof Kozlowski authored
      commit 1915a718 upstream.
      
      The return value of power_supply_register() call was not checked and
      even on error probe() function returned 0. If registering failed then
      during unbind the driver tried to unregister power supply which was not
      actually registered.
      
      This could lead to memory corruption because power_supply_unregister()
      unconditionally cleans up given power supply.
      
      Fix this by checking return status of power_supply_register() call. In
      case of failure, clean up sysfs entries and fail the probe.
      Signed-off-by: default avatarKrzysztof Kozlowski <k.kozlowski@samsung.com>
      Fixes: 9be0fcb5 ("compal-laptop: add JHL90, battery & hwmon interface")
      Signed-off-by: default avatarSebastian Reichel <sre@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      223bdc47
    • Ian Abbott's avatar
      spi: spidev: fix possible arithmetic overflow for multi-transfer message · f53d3116
      Ian Abbott authored
      commit f20fbaad upstream.
      
      `spidev_message()` sums the lengths of the individual SPI transfers to
      determine the overall SPI message length.  It restricts the total
      length, returning an error if too long, but it does not check for
      arithmetic overflow.  For example, if the SPI message consisted of two
      transfers and the first has a length of 10 and the second has a length
      of (__u32)(-1), the total length would be seen as 9, even though the
      second transfer is actually very long.  If the second transfer specifies
      a null `rx_buf` and a non-null `tx_buf`, the `copy_from_user()` could
      overrun the spidev's pre-allocated tx buffer before it reaches an
      invalid user memory address.  Fix it by checking that neither the total
      nor the individual transfer lengths exceed the maximum allowed value.
      
      Thanks to Dan Carpenter for reporting the potential integer overflow.
      Signed-off-by: default avatarIan Abbott <abbotti@mev.co.uk>
      Signed-off-by: default avatarMark Brown <broonie@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      f53d3116
    • Oliver Neukum's avatar
      cdc-wdm: fix endianness bug in debug statements · a50c3862
      Oliver Neukum authored
      commit 323ece54 upstream.
      
      Values directly from descriptors given in debug statements
      must be converted to native endianness.
      Signed-off-by: default avatarOliver Neukum <oneukum@suse.de>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      a50c3862
    • NeilBrown's avatar
      md/raid0: fix bug with chunksize not a power of 2. · d12a39f9
      NeilBrown authored
      commit 47d68979 upstream.
      
      Since commit 20d0189b
      in v3.14-rc1 RAID0 has performed incorrect calculations
      when the chunksize is not a power of 2.
      
      This happens because "sector_div()" modifies its first argument, but
      this wasn't taken into account in the patch.
      
      So restore that first arg before re-using the variable.
      Reported-by: default avatarJoe Landman <joe.landman@gmail.com>
      Reported-by: default avatarDave Chinner <david@fromorbit.com>
      Fixes: 20d0189bSigned-off-by: default avatarNeilBrown <neilb@suse.de>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d12a39f9
    • Huacai Chen's avatar
      MIPS: Hibernate: flush TLB entries earlier · dc47dff3
      Huacai Chen authored
      commit a843d00d upstream.
      
      We found that TLB mismatch not only happens after kernel resume, but
      also happens during snapshot restore. So move it to the beginning of
      swsusp_arch_suspend().
      Signed-off-by: default avatarHuacai Chen <chenhc@lemote.com>
      Cc: Steven J. Hill <Steven.Hill@imgtec.com>
      Cc: linux-mips@linux-mips.org
      Cc: Fuxin Zhang <zhangfx@lemote.com>
      Cc: Zhangjin Wu <wuzhangjin@gmail.com>
      Patchwork: https://patchwork.linux-mips.org/patch/9621/Signed-off-by: default avatarRalf Baechle <ralf@linux-mips.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      dc47dff3
    • Radim Krčmář's avatar
      KVM: use slowpath for cross page cached accesses · 89c9a457
      Radim Krčmář authored
      commit ca3f0874 upstream.
      
      kvm_write_guest_cached() does not mark all written pages as dirty and
      code comments in kvm_gfn_to_hva_cache_init() talk about NULL memslot
      with cross page accesses.  Fix all the easy way.
      
      The check is '<= 1' to have the same result for 'len = 0' cache anywhere
      in the page.  (nr_pages_needed is 0 on page boundary.)
      
      Fixes: 8f964525 ("KVM: Allow cross page reads and writes from cached translations.")
      Signed-off-by: default avatarRadim Krčmář <rkrcmar@redhat.com>
      Message-Id: <20150408121648.GA3519@potion.brq.redhat.com>
      Reviewed-by: default avatarWanpeng Li <wanpeng.li@linux.intel.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      89c9a457
    • Heiko Carstens's avatar
      s390/hibernate: fix save and restore of kernel text section · 71f6e012
      Heiko Carstens authored
      commit d7441949 upstream.
      
      Sebastian reported a crash caused by a jump label mismatch after resume.
      This happens because we do not save the kernel text section during suspend
      and therefore also do not restore it during resume, but use the kernel image
      that restores the old system.
      
      This means that after a suspend/resume cycle we lost all modifications done
      to the kernel text section.
      The reason for this is the pfn_is_nosave() function, which incorrectly
      returns that read-only pages don't need to be saved. This is incorrect since
      we mark the kernel text section read-only.
      We still need to make sure to not save and restore pages contained within
      NSS and DCSS segment.
      To fix this add an extra case for the kernel text section and only save
      those pages if they are not contained within an NSS segment.
      
      Fixes the following crash (and the above bugs as well):
      
      Jump label code mismatch at netif_receive_skb_internal+0x28/0xd0
      Found:    c0 04 00 00 00 00
      Expected: c0 f4 00 00 00 11
      New:      c0 04 00 00 00 00
      Kernel panic - not syncing: Corrupted kernel text
      CPU: 0 PID: 9 Comm: migration/0 Not tainted 3.19.0-01975-gb1b096e70f23 #4
      Call Trace:
        [<0000000000113972>] show_stack+0x72/0xf0
        [<000000000081f15e>] dump_stack+0x6e/0x90
        [<000000000081c4e8>] panic+0x108/0x2b0
        [<000000000081be64>] jump_label_bug.isra.2+0x104/0x108
        [<0000000000112176>] __jump_label_transform+0x9e/0xd0
        [<00000000001121e6>] __sm_arch_jump_label_transform+0x3e/0x50
        [<00000000001d1136>] multi_cpu_stop+0x12e/0x170
        [<00000000001d1472>] cpu_stopper_thread+0xb2/0x168
        [<000000000015d2ac>] smpboot_thread_fn+0x134/0x1b0
        [<0000000000158baa>] kthread+0x10a/0x110
        [<0000000000824a86>] kernel_thread_starter+0x6/0xc
      Reported-and-tested-by: default avatarSebastian Ott <sebott@linux.vnet.ibm.com>
      Signed-off-by: default avatarHeiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: default avatarMartin Schwidefsky <schwidefsky@de.ibm.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      71f6e012
    • Ekaterina Tumanova's avatar
      KVM: s390: Zero out current VMDB of STSI before including level3 data. · 86de6aaf
      Ekaterina Tumanova authored
      commit b75f4c9a upstream.
      
      s390 documentation requires words 0 and 10-15 to be reserved and stored as
      zeros. As we fill out all other fields, we can memset the full structure.
      Signed-off-by: default avatarEkaterina Tumanova <tumanova@linux.vnet.ibm.com>
      Reviewed-by: default avatarDavid Hildenbrand <dahi@linux.vnet.ibm.com>
      Signed-off-by: default avatarChristian Borntraeger <borntraeger@de.ibm.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      86de6aaf
    • Felipe Balbi's avatar
      usb: gadget: composite: enable BESL support · 8bd13670
      Felipe Balbi authored
      commit a6615937 upstream.
      
      According to USB 2.0 ECN Errata for Link Power
      Management (USB2-LPM-Errata-final.pdf), BESL
      must be enabled if LPM is enabled.
      
      This helps with USB30CV TD 9.21 LPM L1
      Suspend Resume Test.
      Signed-off-by: default avatarFelipe Balbi <balbi@ti.com>
      Signed-off-by: default avatarDu, Changbin <changbin.du@intel.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      8bd13670
    • Len Brown's avatar
      sched/idle/x86: Restore mwait_idle() to fix boot hangs, to improve power... · 9308cced
      Len Brown authored
      sched/idle/x86: Restore mwait_idle() to fix boot hangs, to improve power savings and to improve performance
      
      commit b253149b upstream.
      
      In Linux-3.9 we removed the mwait_idle() loop:
      
        69fb3676 ("x86 idle: remove mwait_idle() and "idle=mwait" cmdline param")
      
      The reasoning was that modern machines should be sufficiently
      happy during the boot process using the default_idle() HALT
      loop, until cpuidle loads and either acpi_idle or intel_idle
      invoke the newer MWAIT-with-hints idle loop.
      
      But two machines reported problems:
      
       1. Certain Core2-era machines support MWAIT-C1 and HALT only.
          MWAIT-C1 is preferred for optimal power and performance.
          But if they support just C1, cpuidle never loads and
          so they use the boot-time default idle loop forever.
      
       2. Some laptops will boot-hang if HALT is used,
          but will boot successfully if MWAIT is used.
          This appears to be a hidden assumption in BIOS SMI,
          that is presumably valid on the proprietary OS
          where the BIOS was validated.
      
             https://bugzilla.kernel.org/show_bug.cgi?id=60770
      
      So here we effectively revert the patch above, restoring
      the mwait_idle() loop.  However, we don't bother restoring
      the idle=mwait cmdline parameter, since it appears to add
      no value.
      
      Maintainer notes:
      
        For 3.9, simply revert 69fb3676
        for 3.10, patch -F3 applies, fuzz needed due to __cpuinit use in
        context For 3.11, 3.12, 3.13, this patch applies cleanly
      Tested-by: default avatarMike Galbraith <bitbucket@online.de>
      Signed-off-by: default avatarLen Brown <len.brown@intel.com>
      Acked-by: default avatarMike Galbraith <bitbucket@online.de>
      Cc: <stable@vger.kernel.org> # 3.9+
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Ian Malone <ibmalone@gmail.com>
      Cc: Josh Boyer <jwboyer@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/345254a551eb5a6a866e048d7ab570fd2193aca4.1389763084.git.len.brown@intel.com
      [ Ported to recent kernels. ]
      [ Mike: 3.10 backport ]
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarMike Galbraith <efault@gmx.de>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      
      9308cced
    • Filipe Manana's avatar
      Btrfs: fix inode eviction infinite loop after extent_same ioctl · 09ad914f
      Filipe Manana authored
      commit 113e8283 upstream.
      
      If we pass a length of 0 to the extent_same ioctl, we end up locking an
      extent range with a start offset greater then its end offset (if the
      destination file's offset is greater than zero). This results in a warning
      from extent_io.c:insert_state through the following call chain:
      
        btrfs_extent_same()
          btrfs_double_lock()
            lock_extent_range()
              lock_extent(inode->io_tree, offset, offset + len - 1)
                lock_extent_bits()
                  __set_extent_bit()
                    insert_state()
                      --> WARN_ON(end < start)
      
      This leads to an infinite loop when evicting the inode. This is the same
      problem that my previous patch titled
      "Btrfs: fix inode eviction infinite loop after cloning into it" addressed
      but for the extent_same ioctl instead of the clone ioctl.
      Signed-off-by: default avatarFilipe Manana <fdmanana@suse.com>
      Reviewed-by: default avatarOmar Sandoval <osandov@osandov.com>
      Signed-off-by: default avatarChris Mason <clm@fb.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      09ad914f
    • Filipe Manana's avatar
      Btrfs: fix inode eviction infinite loop after cloning into it · 81a65d1f
      Filipe Manana authored
      commit ccccf3d6 upstream.
      
      If we attempt to clone a 0 length region into a file we can end up
      inserting a range in the inode's extent_io tree with a start offset
      that is greater then the end offset, which triggers immediately the
      following warning:
      
      [ 3914.619057] WARNING: CPU: 17 PID: 4199 at fs/btrfs/extent_io.c:435 insert_state+0x4b/0x10b [btrfs]()
      [ 3914.620886] BTRFS: end < start 4095 4096
      (...)
      [ 3914.638093] Call Trace:
      [ 3914.638636]  [<ffffffff81425fd9>] dump_stack+0x4c/0x65
      [ 3914.639620]  [<ffffffff81045390>] warn_slowpath_common+0xa1/0xbb
      [ 3914.640789]  [<ffffffffa03ca44f>] ? insert_state+0x4b/0x10b [btrfs]
      [ 3914.642041]  [<ffffffff810453f0>] warn_slowpath_fmt+0x46/0x48
      [ 3914.643236]  [<ffffffffa03ca44f>] insert_state+0x4b/0x10b [btrfs]
      [ 3914.644441]  [<ffffffffa03ca729>] __set_extent_bit+0x107/0x3f4 [btrfs]
      [ 3914.645711]  [<ffffffffa03cb256>] lock_extent_bits+0x65/0x1bf [btrfs]
      [ 3914.646914]  [<ffffffff8142b2fb>] ? _raw_spin_unlock+0x28/0x33
      [ 3914.648058]  [<ffffffffa03cbac4>] ? test_range_bit+0xcc/0xde [btrfs]
      [ 3914.650105]  [<ffffffffa03cb3c3>] lock_extent+0x13/0x15 [btrfs]
      [ 3914.651361]  [<ffffffffa03db39e>] lock_extent_range+0x3d/0xcd [btrfs]
      [ 3914.652761]  [<ffffffffa03de1fe>] btrfs_ioctl_clone+0x278/0x388 [btrfs]
      [ 3914.654128]  [<ffffffff811226dd>] ? might_fault+0x58/0xb5
      [ 3914.655320]  [<ffffffffa03e0909>] btrfs_ioctl+0xb51/0x2195 [btrfs]
      (...)
      [ 3914.669271] ---[ end trace 14843d3e2e622fc1 ]---
      
      This later makes the inode eviction handler enter an infinite loop that
      keeps dumping the following warning over and over:
      
      [ 3915.117629] WARNING: CPU: 22 PID: 4228 at fs/btrfs/extent_io.c:435 insert_state+0x4b/0x10b [btrfs]()
      [ 3915.119913] BTRFS: end < start 4095 4096
      (...)
      [ 3915.137394] Call Trace:
      [ 3915.137913]  [<ffffffff81425fd9>] dump_stack+0x4c/0x65
      [ 3915.139154]  [<ffffffff81045390>] warn_slowpath_common+0xa1/0xbb
      [ 3915.140316]  [<ffffffffa03ca44f>] ? insert_state+0x4b/0x10b [btrfs]
      [ 3915.141505]  [<ffffffff810453f0>] warn_slowpath_fmt+0x46/0x48
      [ 3915.142709]  [<ffffffffa03ca44f>] insert_state+0x4b/0x10b [btrfs]
      [ 3915.143849]  [<ffffffffa03ca729>] __set_extent_bit+0x107/0x3f4 [btrfs]
      [ 3915.145120]  [<ffffffffa038c1e3>] ? btrfs_kill_super+0x17/0x23 [btrfs]
      [ 3915.146352]  [<ffffffff811548f6>] ? deactivate_locked_super+0x3b/0x50
      [ 3915.147565]  [<ffffffffa03cb256>] lock_extent_bits+0x65/0x1bf [btrfs]
      [ 3915.148785]  [<ffffffff8142b7e2>] ? _raw_write_unlock+0x28/0x33
      [ 3915.149931]  [<ffffffffa03bc325>] btrfs_evict_inode+0x196/0x482 [btrfs]
      [ 3915.151154]  [<ffffffff81168904>] evict+0xa0/0x148
      [ 3915.152094]  [<ffffffff811689e5>] dispose_list+0x39/0x43
      [ 3915.153081]  [<ffffffff81169564>] evict_inodes+0xdc/0xeb
      [ 3915.154062]  [<ffffffff81154418>] generic_shutdown_super+0x49/0xef
      [ 3915.155193]  [<ffffffff811546d1>] kill_anon_super+0x13/0x1e
      [ 3915.156274]  [<ffffffffa038c1e3>] btrfs_kill_super+0x17/0x23 [btrfs]
      (...)
      [ 3915.167404] ---[ end trace 14843d3e2e622fc2 ]---
      
      So just bail out of the clone ioctl if the length of the region to clone
      is zero, without locking any extent range, in order to prevent this issue
      (same behaviour as a pwrite with a 0 length for example).
      
      This is trivial to reproduce. For example, the steps for the test I just
      made for fstests:
      
        mkfs.btrfs -f SCRATCH_DEV
        mount SCRATCH_DEV $SCRATCH_MNT
      
        touch $SCRATCH_MNT/foo
        touch $SCRATCH_MNT/bar
      
        $CLONER_PROG -s 0 -d 4096 -l 0 $SCRATCH_MNT/foo $SCRATCH_MNT/bar
        umount $SCRATCH_MNT
      
      A test case for fstests follows soon.
      Signed-off-by: default avatarFilipe Manana <fdmanana@suse.com>
      Reviewed-by: default avatarOmar Sandoval <osandov@osandov.com>
      Signed-off-by: default avatarChris Mason <clm@fb.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      81a65d1f
    • David Sterba's avatar
      btrfs: don't accept bare namespace as a valid xattr · 16c855a6
      David Sterba authored
      commit 3c3b04d1 upstream.
      
      Due to insufficient check in btrfs_is_valid_xattr, this unexpectedly
      works:
      
       $ touch file
       $ setfattr -n user. -v 1 file
       $ getfattr -d file
      user.="1"
      
      ie. the missing attribute name after the namespace.
      
      Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=94291Reported-by: default avatarWilliam Douglas <william.douglas@intel.com>
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.cz>
      Signed-off-by: default avatarChris Mason <clm@fb.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      16c855a6
    • Filipe Manana's avatar
      Btrfs: fix log tree corruption when fs mounted with -o discard · f1e1dad2
      Filipe Manana authored
      commit dcc82f47 upstream.
      
      While committing a transaction we free the log roots before we write the
      new super block. Freeing the log roots implies marking the disk location
      of every node/leaf (metadata extent) as pinned before the new super block
      is written. This is to prevent the disk location of log metadata extents
      from being reused before the new super block is written, otherwise we
      would have a corrupted log tree if before the new super block is written
      a crash/reboot happens and the location of any log tree metadata extent
      ended up being reused and rewritten.
      
      Even though we pinned the log tree's metadata extents, we were issuing a
      discard against them if the fs was mounted with the -o discard option,
      resulting in corruption of the log tree if a crash/reboot happened before
      writing the new super block - the next time the fs was mounted, during
      the log replay process we would find nodes/leafs of the log btree with
      a content full of zeroes, causing the process to fail and require the
      use of the tool btrfs-zero-log to wipeout the log tree (and all data
      previously fsynced becoming lost forever).
      
      Fix this by not doing a discard when pinning an extent. The discard will
      be done later when it's safe (after the new super block is committed) at
      extent-tree.c:btrfs_finish_extent_commit().
      
      Fixes: e688b725 (Btrfs: fix extent pinning bugs in the tree log)
      Signed-off-by: default avatarFilipe Manana <fdmanana@suse.com>
      Signed-off-by: default avatarChris Mason <clm@fb.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      f1e1dad2
    • Eric Dumazet's avatar
      net: fix crash in build_skb() · 503c0602
      Eric Dumazet authored
      [ Upstream commit 2ea2f62c ]
      
      When I added pfmemalloc support in build_skb(), I forgot netlink
      was using build_skb() with a vmalloc() area.
      
      In this patch I introduce __build_skb() for netlink use,
      and build_skb() is a wrapper handling both skb->head_frag and
      skb->pfmemalloc
      
      This means netlink no longer has to hack skb->head_frag
      
      [ 1567.700067] kernel BUG at arch/x86/mm/physaddr.c:26!
      [ 1567.700067] invalid opcode: 0000 [#1] PREEMPT SMP KASAN
      [ 1567.700067] Dumping ftrace buffer:
      [ 1567.700067]    (ftrace buffer empty)
      [ 1567.700067] Modules linked in:
      [ 1567.700067] CPU: 9 PID: 16186 Comm: trinity-c182 Not tainted 4.0.0-next-20150424-sasha-00037-g4796e21 #2167
      [ 1567.700067] task: ffff880127efb000 ti: ffff880246770000 task.ti: ffff880246770000
      [ 1567.700067] RIP: __phys_addr (arch/x86/mm/physaddr.c:26 (discriminator 3))
      [ 1567.700067] RSP: 0018:ffff8802467779d8  EFLAGS: 00010202
      [ 1567.700067] RAX: 000041000ed8e000 RBX: ffffc9008ed8e000 RCX: 000000000000002c
      [ 1567.700067] RDX: 0000000000000004 RSI: 0000000000000000 RDI: ffffffffb3fd6049
      [ 1567.700067] RBP: ffff8802467779f8 R08: 0000000000000019 R09: ffff8801d0168000
      [ 1567.700067] R10: ffff8801d01680c7 R11: ffffed003a02d019 R12: ffffc9000ed8e000
      [ 1567.700067] R13: 0000000000000f40 R14: 0000000000001180 R15: ffffc9000ed8e000
      [ 1567.700067] FS:  00007f2a7da3f700(0000) GS:ffff8801d1000000(0000) knlGS:0000000000000000
      [ 1567.700067] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [ 1567.700067] CR2: 0000000000738308 CR3: 000000022e329000 CR4: 00000000000007e0
      [ 1567.700067] Stack:
      [ 1567.700067]  ffffc9000ed8e000 ffff8801d0168000 ffffc9000ed8e000 ffff8801d0168000
      [ 1567.700067]  ffff880246777a28 ffffffffad7c0a21 0000000000001080 ffff880246777c08
      [ 1567.700067]  ffff88060d302e68 ffff880246777b58 ffff880246777b88 ffffffffad9a6821
      [ 1567.700067] Call Trace:
      [ 1567.700067] build_skb (include/linux/mm.h:508 net/core/skbuff.c:316)
      [ 1567.700067] netlink_sendmsg (net/netlink/af_netlink.c:1633 net/netlink/af_netlink.c:2329)
      [ 1567.774369] ? sched_clock_cpu (kernel/sched/clock.c:311)
      [ 1567.774369] ? netlink_unicast (net/netlink/af_netlink.c:2273)
      [ 1567.774369] ? netlink_unicast (net/netlink/af_netlink.c:2273)
      [ 1567.774369] sock_sendmsg (net/socket.c:614 net/socket.c:623)
      [ 1567.774369] sock_write_iter (net/socket.c:823)
      [ 1567.774369] ? sock_sendmsg (net/socket.c:806)
      [ 1567.774369] __vfs_write (fs/read_write.c:479 fs/read_write.c:491)
      [ 1567.774369] ? get_lock_stats (kernel/locking/lockdep.c:249)
      [ 1567.774369] ? default_llseek (fs/read_write.c:487)
      [ 1567.774369] ? vtime_account_user (kernel/sched/cputime.c:701)
      [ 1567.774369] ? rw_verify_area (fs/read_write.c:406 (discriminator 4))
      [ 1567.774369] vfs_write (fs/read_write.c:539)
      [ 1567.774369] SyS_write (fs/read_write.c:586 fs/read_write.c:577)
      [ 1567.774369] ? SyS_read (fs/read_write.c:577)
      [ 1567.774369] ? __this_cpu_preempt_check (lib/smp_processor_id.c:63)
      [ 1567.774369] ? trace_hardirqs_on_caller (kernel/locking/lockdep.c:2594 kernel/locking/lockdep.c:2636)
      [ 1567.774369] ? trace_hardirqs_on_thunk (arch/x86/lib/thunk_64.S:42)
      [ 1567.774369] system_call_fastpath (arch/x86/kernel/entry_64.S:261)
      
      Fixes: 79930f58 ("net: do not deplete pfmemalloc reserve")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarSasha Levin <sasha.levin@oracle.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      503c0602
    • Eric Dumazet's avatar
      net: do not deplete pfmemalloc reserve · d8101562
      Eric Dumazet authored
      [ Upstream commit 79930f58 ]
      
      build_skb() should look at the page pfmemalloc status.
      If set, this means page allocator allocated this page in the
      expectation it would help to free other pages. Networking
      stack can do that only if skb->pfmemalloc is also set.
      
      Also, we must refrain using high order pages from the pfmemalloc
      reserve, so __page_frag_refill() must also use __GFP_NOMEMALLOC for
      them. Under memory pressure, using order-0 pages is probably the best
      strategy.
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d8101562
    • Eric Dumazet's avatar
      tcp: avoid looping in tcp_send_fin() · 4fbc9082
      Eric Dumazet authored
      [ Upstream commit 845704a5 ]
      
      Presence of an unbound loop in tcp_send_fin() had always been hard
      to explain when analyzing crash dumps involving gigantic dying processes
      with millions of sockets.
      
      Lets try a different strategy :
      
      In case of memory pressure, try to add the FIN flag to last packet
      in write queue, even if packet was already sent. TCP stack will
      be able to deliver this FIN after a timeout event. Note that this
      FIN being delivered by a retransmit, it also carries a Push flag
      given our current implementation.
      
      By checking sk_under_memory_pressure(), we anticipate that cooking
      many FIN packets might deplete tcp memory.
      
      In the case we could not allocate a packet, even with __GFP_WAIT
      allocation, then not sending a FIN seems quite reasonable if it allows
      to get rid of this socket, free memory, and not block the process from
      eventually doing other useful work.
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      4fbc9082
    • Eric Dumazet's avatar
      tcp: fix possible deadlock in tcp_send_fin() · 6dac0a78
      Eric Dumazet authored
      [ Upstream commit d83769a5 ]
      
      Using sk_stream_alloc_skb() in tcp_send_fin() is dangerous in
      case a huge process is killed by OOM, and tcp_mem[2] is hit.
      
      To be able to free memory we need to make progress, so this
      patch allows FIN packets to not care about tcp_mem[2], if
      skb allocation succeeded.
      
      In a follow-up patch, we might abort tcp_send_fin() infinite loop
      in case TIF_MEMDIE is set on this thread, as memory allocator
      did its best getting extra memory already.
      
      This patch reverts d22e1537 ("tcp: fix tcp fin memory accounting")
      
      Fixes: d22e1537 ("tcp: fix tcp fin memory accounting")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      6dac0a78
    • Sebastian Pöhn's avatar
      ip_forward: Drop frames with attached skb->sk · 880e4c22
      Sebastian Pöhn authored
      [ Upstream commit 2ab95749 ]
      
      Initial discussion was:
      [FYI] xfrm: Don't lookup sk_policy for timewait sockets
      
      Forwarded frames should not have a socket attached. Especially
      tw sockets will lead to panics later-on in the stack.
      
      This was observed with TPROXY assigning a tw socket and broken
      policy routing (misconfigured). As a result frame enters
      forwarding path instead of input. We cannot solve this in
      TPROXY as it cannot know that policy routing is broken.
      
      v2:
      Remove useless comment
      Signed-off-by: default avatarSebastian Poehn <sebastian.poehn@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      880e4c22
  2. 29 Apr, 2015 20 commits
    • Greg Kroah-Hartman's avatar
      Linux 3.14.40 · 7b103792
      Greg Kroah-Hartman authored
      7b103792
    • Guenter Roeck's avatar
      arc: mm: Fix build failure · 9c5a2098
      Guenter Roeck authored
      commit e262eb93 upstream.
      
      Fix misspelled define.
      
      Fixes: 33692f27 ("vm: add VM_FAULT_SIGSEGV handling support")
      Signed-off-by: default avatarGuenter Roeck <linux@roeck-us.net>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      9c5a2098
    • Konstantin Khlebnikov's avatar
      proc/pagemap: walk page tables under pte lock · 3b095426
      Konstantin Khlebnikov authored
      commit 05fbf357 upstream.
      
      Lockless access to pte in pagemap_pte_range() might race with page
      migration and trigger BUG_ON(!PageLocked()) in migration_entry_to_page():
      
      CPU A (pagemap)                           CPU B (migration)
                                                lock_page()
                                                try_to_unmap(page, TTU_MIGRATION...)
                                                     make_migration_entry()
                                                     set_pte_at()
      <read *pte>
      pte_to_pagemap_entry()
                                                remove_migration_ptes()
                                                unlock_page()
          if(is_migration_entry())
              migration_entry_to_page()
                  BUG_ON(!PageLocked(page))
      
      Also lockless read might be non-atomic if pte is larger than wordsize.
      Other pte walkers (smaps, numa_maps, clear_refs) already lock ptes.
      
      Fixes: 052fb0d6 ("proc: report file/anon bit in /proc/pid/pagemap")
      Signed-off-by: default avatarKonstantin Khlebnikov <khlebnikov@yandex-team.ru>
      Reported-by: default avatarAndrey Ryabinin <a.ryabinin@samsung.com>
      Reviewed-by: default avatarCyrill Gorcunov <gorcunov@openvz.org>
      Acked-by: default avatarNaoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Acked-by: default avatarKirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Cc: <stable@vger.kernel.org>	[3.5+]
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      3b095426
    • Peter Feiner's avatar
      mm: softdirty: unmapped addresses between VMAs are clean · 620d77bd
      Peter Feiner authored
      commit 81d0fa62 upstream.
      
      If a /proc/pid/pagemap read spans a [VMA, an unmapped region, then a
      VM_SOFTDIRTY VMA], the virtual pages in the unmapped region are reported
      as softdirty.  Here's a program to demonstrate the bug:
      
      int main() {
      	const uint64_t PAGEMAP_SOFTDIRTY = 1ul << 55;
      	uint64_t pme[3];
      	int fd = open("/proc/self/pagemap", O_RDONLY);;
      	char *m = mmap(NULL, 3 * getpagesize(), PROT_READ,
      	               MAP_ANONYMOUS | MAP_SHARED, -1, 0);
      	munmap(m + getpagesize(), getpagesize());
      	pread(fd, pme, 24, (unsigned long) m / getpagesize() * 8);
      	assert(pme[0] & PAGEMAP_SOFTDIRTY);    /* passes */
      	assert(!(pme[1] & PAGEMAP_SOFTDIRTY)); /* fails */
      	assert(pme[2] & PAGEMAP_SOFTDIRTY);    /* passes */
      	return 0;
      }
      
      (Note that all pages in new VMAs are softdirty until cleared).
      
      Tested:
      	Used the program given above. I'm going to include this code in
      	a selftest in the future.
      
      [n-horiguchi@ah.jp.nec.com: prevent pagemap_pte_range() from overrunning]
      Signed-off-by: default avatarPeter Feiner <pfeiner@google.com>
      Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
      Cc: Cyrill Gorcunov <gorcunov@openvz.org>
      Cc: Pavel Emelyanov <xemul@parallels.com>
      Cc: Jamie Liu <jamieliu@google.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Signed-off-by: default avatarNaoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      620d77bd
    • Seth Jennings's avatar
      sb_edac: avoid INTERNAL ERROR message in EDAC with unspecified channel · 1127e68e
      Seth Jennings authored
      commit 351fc4a9 upstream.
      
      Intel IA32 SDM Table 15-14 defines channel 0xf as 'not specified', but
      EDAC doesn't know about this and returns and INTERNAL ERROR when the
      channel is greater than NUM_CHANNELS:
      
      kernel: [ 1538.886456] CPU 0: Machine Check Exception: 0 Bank 1: 940000000000009f
      kernel: [ 1538.886669] TSC 2bc68b22e7e812 ADDR 46dae7000 MISC 0 PROCESSOR 0:306e4 TIME 1390414572 SOCKET 0 APIC 0
      kernel: [ 1538.971948] EDAC MC1: INTERNAL ERROR: channel value is out of range (15 >= 4)
      kernel: [ 1538.972203] EDAC MC1: 0 CE memory read error on unknown memory (slot:0 page:0x46dae7 offset:0x0 grain:0 syndrome:0x0 -  area:DRAM err_code:0000:009f socket:1 channel_mask:1 rank:0)
      
      This commit changes sb_edac to forward a channel of -1 to EDAC if the
      channel is not specified.  edac_mc_handle_error() sets the channel to -1
      internally after the error message anyway, so this commit should have no
      effect other than avoiding the INTERNAL ERROR message when the channel
      is not specified.
      Signed-off-by: default avatarSeth Jennings <sjenning@redhat.com>
      Signed-off-by: default avatarMauro Carvalho Chehab <mchehab@osg.samsung.com>
      Cc: Vinson Lee <vlee@twopensource.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      1127e68e
    • Linus Torvalds's avatar
      x86: mm: move mmap_sem unlock from mm_fault_error() to caller · 87f9c4f5
      Linus Torvalds authored
      commit 7fb08eca upstream.
      
      This replaces four copies in various stages of mm_fault_error() handling
      with just a single one.  It will also allow for more natural placement
      of the unlocking after some further cleanup.
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      87f9c4f5
    • Steven Capper's avatar
      ARM: 8109/1: mm: Modify pte_write and pmd_write logic for LPAE · 74166021
      Steven Capper authored
      commit ded94779 upstream.
      
      For LPAE, we have the following means for encoding writable or dirty
      ptes:
                                    L_PTE_DIRTY       L_PTE_RDONLY
          !pte_dirty && !pte_write        0               1
          !pte_dirty && pte_write         0               1
          pte_dirty && !pte_write         1               1
          pte_dirty && pte_write          1               0
      
      So we can't distinguish between writeable clean ptes and read only
      ptes. This can cause problems with ptes being incorrectly flagged as
      read only when they are writeable but not dirty.
      
      This patch renumbers L_PTE_RDONLY from AP[2] to a software bit #58,
      and adds additional logic to set AP[2] whenever the pte is read only
      or not dirty. That way we can distinguish between clean writeable ptes
      and read only ptes.
      
      HugeTLB pages will use this new logic automatically.
      
      We need to add some logic to Transparent HugePages to ensure that they
      correctly interpret the revised pgprot permissions (L_PTE_RDONLY has
      moved and no longer matches PMD_SECT_AP2). In the process of revising
      THP, the names of the PMD software bits have been prefixed with L_ to
      make them easier to distinguish from their hardware bit counterparts.
      Signed-off-by: default avatarSteve Capper <steve.capper@linaro.org>
      Reviewed-by: default avatarWill Deacon <will.deacon@arm.com>
      Signed-off-by: default avatarRussell King <rmk+kernel@arm.linux.org.uk>
      [hpy: Backported to 3.14
       - adjust the context ]
      Signed-off-by: default avatarHou Pengyang <houpengyang@huawei.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      74166021
    • Steven Capper's avatar
      ARM: 8108/1: mm: Introduce {pte,pmd}_isset and {pte,pmd}_isclear · e8043c52
      Steven Capper authored
      commit f2950706 upstream.
      
      Long descriptors on ARM are 64 bits, and some pte functions such as
      pte_dirty return a bitwise-and of a flag with the pte value. If the
      flag to be tested resides in the upper 32 bits of the pte, then we run
      into the danger of the result being dropped if downcast.
      
      For example:
      	gather_stats(page, md, pte_dirty(*pte), 1);
      where pte_dirty(*pte) is downcast to an int.
      
      This patch introduces a new macro pte_isset which performs the bitwise
      and, then performs a double logical invert (where needed) to ensure
      predictable downcasting. The logical inverse pte_isclear is also
      introduced.
      
      Equivalent pmd functions for Transparent HugePages have also been
      added.
      Signed-off-by: default avatarSteve Capper <steve.capper@linaro.org>
      Reviewed-by: default avatarWill Deacon <will.deacon@arm.com>
      Signed-off-by: default avatarRussell King <rmk+kernel@arm.linux.org.uk>
      [hpy: Backported to 3.14
       - adjust the context ]
      Signed-off-by: default avatarHou Pengyang <houpengyang@huawei.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e8043c52
    • Linus Torvalds's avatar
      vm: make stack guard page errors return VM_FAULT_SIGSEGV rather than SIGBUS · 910584ae
      Linus Torvalds authored
      commit 9c145c56 upstream.
      
      The stack guard page error case has long incorrectly caused a SIGBUS
      rather than a SIGSEGV, but nobody actually noticed until commit
      fee7e49d ("mm: propagate error from stack expansion even for guard
      page") because that error case was never actually triggered in any
      normal situations.
      
      Now that we actually report the error, people noticed the wrong signal
      that resulted.  So far, only the test suite of libsigsegv seems to have
      actually cared, but there are real applications that use libsigsegv, so
      let's not wait for any of those to break.
      Reported-and-tested-by: default avatarTakashi Iwai <tiwai@suse.de>
      Tested-by: default avatarJan Engelhardt <jengelh@inai.de>
      Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> # "s390 still compiles and boots"
      Cc: linux-arch@vger.kernel.org
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      910584ae
    • Linus Torvalds's avatar
      vm: add VM_FAULT_SIGSEGV handling support · 1c2af919
      Linus Torvalds authored
      commit 33692f27 upstream.
      
      The core VM already knows about VM_FAULT_SIGBUS, but cannot return a
      "you should SIGSEGV" error, because the SIGSEGV case was generally
      handled by the caller - usually the architecture fault handler.
      
      That results in lots of duplication - all the architecture fault
      handlers end up doing very similar "look up vma, check permissions, do
      retries etc" - but it generally works.  However, there are cases where
      the VM actually wants to SIGSEGV, and applications _expect_ SIGSEGV.
      
      In particular, when accessing the stack guard page, libsigsegv expects a
      SIGSEGV.  And it usually got one, because the stack growth is handled by
      that duplicated architecture fault handler.
      
      However, when the generic VM layer started propagating the error return
      from the stack expansion in commit fee7e49d ("mm: propagate error
      from stack expansion even for guard page"), that now exposed the
      existing VM_FAULT_SIGBUS result to user space.  And user space really
      expected SIGSEGV, not SIGBUS.
      
      To fix that case, we need to add a VM_FAULT_SIGSEGV, and teach all those
      duplicate architecture fault handlers about it.  They all already have
      the code to handle SIGSEGV, so it's about just tying that new return
      value to the existing code, but it's all a bit annoying.
      
      This is the mindless minimal patch to do this.  A more extensive patch
      would be to try to gather up the mostly shared fault handling logic into
      one generic helper routine, and long-term we really should do that
      cleanup.
      
      Just from this patch, you can generally see that most architectures just
      copied (directly or indirectly) the old x86 way of doing things, but in
      the meantime that original x86 model has been improved to hold the VM
      semaphore for shorter times etc and to handle VM_FAULT_RETRY and other
      "newer" things, so it would be a good idea to bring all those
      improvements to the generic case and teach other architectures about
      them too.
      Reported-and-tested-by: default avatarTakashi Iwai <tiwai@suse.de>
      Tested-by: default avatarJan Engelhardt <jengelh@inai.de>
      Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> # "s390 still compiles and boots"
      Cc: linux-arch@vger.kernel.org
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      [shengyong: Backport to 3.14
       - adjust context
       - ignore modification for arch nios2, because 3.14 does not support it
       - add SIGSEGV handling to powerpc/cell spu_fault.c, because 3.14 does not
         separate it to copro_fault.c
       - add SIGSEGV handling to mm/memory.c, because 3.14 does not separate it
         to gup.c
      ]
      Signed-off-by: default avatarSheng Yong <shengyong1@huawei.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      1c2af919
    • Richard Guy Briggs's avatar
      sched: declare pid_alive as inline · 3e5e7e6b
      Richard Guy Briggs authored
      commit 80e0b6e8 upstream.
      
      We accidentally declared pid_alive without any extern/inline connotation.
      Some platforms were fine with this, some like ia64 and mips were very angry.
      If the function is inline, the prototype should be inline!
      
      on ia64:
      include/linux/sched.h:1718: warning: 'pid_alive' declared inline after
      being called
      Signed-off-by: default avatarRichard Guy Briggs <rgb@redhat.com>
      Signed-off-by: default avatarEric Paris <eparis@redhat.com>
      Signed-off-by: default avatarhujianyang <hujianyang@huawei.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      3e5e7e6b
    • Al Viro's avatar
      move d_rcu from overlapping d_child to overlapping d_alias · 5c48ea64
      Al Viro authored
      commit 946e51f2 upstream.
      
      move d_rcu from overlapping d_child to overlapping d_alias
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      Cc: Ben Hutchings <ben@decadent.org.uk>
      [hujianyang: Backported to 3.14 refer to the work of Ben Hutchings in 3.2:
       - Apply name changes in all the different places we use d_alias and d_child
       - Move the WARN_ON() in __d_free() to d_free() as we don't have dentry_free()]
      Signed-off-by: default avatarhujianyang <hujianyang@huawei.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      5c48ea64
    • Nadav Amit's avatar
      KVM: x86: SYSENTER emulation is broken · ce599692
      Nadav Amit authored
      commit f3747379 upstream.
      
      SYSENTER emulation is broken in several ways:
      1. It misses the case of 16-bit code segments completely (CVE-2015-0239).
      2. MSR_IA32_SYSENTER_CS is checked in 64-bit mode incorrectly (bits 0 and 1 can
         still be set without causing #GP).
      3. MSR_IA32_SYSENTER_EIP and MSR_IA32_SYSENTER_ESP are not masked in
         legacy-mode.
      4. There is some unneeded code.
      
      Fix it.
      Signed-off-by: default avatarNadav Amit <namit@cs.technion.ac.il>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      [zhangzhiqiang: backport to 3.10:
       - adjust context
       - in 3.10 context "ctxt->eflags &= ~(EFLG_VM | EFLG_IF | EFLG_RF)" is replaced by
         "ctxt->eflags &= ~(EFLG_VM | EFLG_IF)" in upstream, which was changed by another commit.
       - After the above adjustments, becomes same to the original patch:
             https://github.com/torvalds/linux/commit/f3747379accba8e95d70cec0eae0582c8c182050
      ]
      Signed-off-by: default avatarZhiqiang Zhang <zhangzhiqiang.zhang@huawei.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ce599692
    • Florian Westphal's avatar
      netfilter: conntrack: disable generic tracking for known protocols · efbf300e
      Florian Westphal authored
      commit db29a950 upstream.
      
      Given following iptables ruleset:
      
      -P FORWARD DROP
      -A FORWARD -m sctp --dport 9 -j ACCEPT
      -A FORWARD -p tcp --dport 80 -j ACCEPT
      -A FORWARD -p tcp -m conntrack -m state ESTABLISHED,RELATED -j ACCEPT
      
      One would assume that this allows SCTP on port 9 and TCP on port 80.
      Unfortunately, if the SCTP conntrack module is not loaded, this allows
      *all* SCTP communication, to pass though, i.e. -p sctp -j ACCEPT,
      which we think is a security issue.
      
      This is because on the first SCTP packet on port 9, we create a dummy
      "generic l4" conntrack entry without any port information (since
      conntrack doesn't know how to extract this information).
      
      All subsequent packets that are unknown will then be in established
      state since they will fallback to proto_generic and will match the
      'generic' entry.
      
      Our originally proposed version [1] completely disabled generic protocol
      tracking, but Jozsef suggests to not track protocols for which a more
      suitable helper is available, hence we now mitigate the issue for in
      tree known ct protocol helpers only, so that at least NAT and direction
      information will still be preserved for others.
      
       [1] http://www.spinics.net/lists/netfilter-devel/msg33430.html
      
      Joint work with Daniel Borkmann.
      
      Fixes CVE-2014-8160.
      Signed-off-by: default avatarFlorian Westphal <fw@strlen.de>
      Signed-off-by: default avatarDaniel Borkmann <dborkman@redhat.com>
      Acked-by: default avatarJozsef Kadlecsik <kadlec@blackhole.kfki.hu>
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: default avatarZhiqiang Zhang <zhangzhiqiang.zhang@huawei.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      efbf300e
    • Naoya Horiguchi's avatar
      mm: hwpoison: drop lru_add_drain_all() in __soft_offline_page() · 259af80f
      Naoya Horiguchi authored
      commit 9ab3b598 upstream.
      
      A race condition starts to be visible in recent mmotm, where a PG_hwpoison
      flag is set on a migration source page *before* it's back in buddy page
      poo= l.
      
      This is problematic because no page flag is supposed to be set when
      freeing (see __free_one_page().) So the user-visible effect of this race
      is that it could trigger the BUG_ON() when soft-offlining is called.
      
      The root cause is that we call lru_add_drain_all() to make sure that the
      page is in buddy, but that doesn't work because this function just
      schedule= s a work item and doesn't wait its completion.
      drain_all_pages() does drainin= g directly, so simply dropping
      lru_add_drain_all() solves this problem.
      
      [n-horiguchi@ah.jp.nec.com: resolve conflict to apply on v3.11.10]
      Fixes: f15bdfa8 ("mm/memory-failure.c: fix memory leak in successful soft offlining")
      Signed-off-by: default avatarNaoya Horiguchi <n-horiguchi@ah.jp.nec.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Chen Gong <gong.chen@linux.intel.com>
      Cc: <stable@vger.kernel.org>	[3.11+]
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      259af80f
    • Janne Heikkinen's avatar
      Bluetooth: Add USB device 04ca:3010 as Atheros AR3012 · 2d47ca2e
      Janne Heikkinen authored
      commit 134d3b35 upstream.
      
      Asus X553MA has USB device 04ca:3010 that is Atheros AR3012
      or compatible.
      
      Device from /sys/kernel/debug/usb/devices:
      
      T:  Bus=01 Lev=02 Prnt=02 Port=03 Cnt=02 Dev#= 27 Spd=12   MxCh= 0
      D:  Ver= 1.10 Cls=e0(wlcon) Sub=01 Prot=01 MxPS=64 #Cfgs=  1
      P:  Vendor=04ca ProdID=3010 Rev= 0.02
      C:* #Ifs= 2 Cfg#= 1 Atr=e0 MxPwr=100mA
      A:  FirstIf#= 0 IfCount= 2 Cls=e0(wlcon) Sub=01 Prot=01
      I:* If#= 0 Alt= 0 #EPs= 3 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
      E:  Ad=81(I) Atr=03(Int.) MxPS=  16 Ivl=1ms
      E:  Ad=82(I) Atr=02(Bulk) MxPS=  64 Ivl=0ms
      E:  Ad=02(O) Atr=02(Bulk) MxPS=  64 Ivl=0ms
      I:* If#= 1 Alt= 0 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
      E:  Ad=83(I) Atr=01(Isoc) MxPS=   0 Ivl=1ms
      E:  Ad=03(O) Atr=01(Isoc) MxPS=   0 Ivl=1ms
      I:  If#= 1 Alt= 1 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
      E:  Ad=83(I) Atr=01(Isoc) MxPS=   9 Ivl=1ms
      E:  Ad=03(O) Atr=01(Isoc) MxPS=   9 Ivl=1ms
      I:  If#= 1 Alt= 2 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
      E:  Ad=83(I) Atr=01(Isoc) MxPS=  17 Ivl=1ms
      E:  Ad=03(O) Atr=01(Isoc) MxPS=  17 Ivl=1ms
      I:  If#= 1 Alt= 3 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
      E:  Ad=83(I) Atr=01(Isoc) MxPS=  25 Ivl=1ms
      E:  Ad=03(O) Atr=01(Isoc) MxPS=  25 Ivl=1ms
      I:  If#= 1 Alt= 4 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
      E:  Ad=83(I) Atr=01(Isoc) MxPS=  33 Ivl=1ms
      E:  Ad=03(O) Atr=01(Isoc) MxPS=  33 Ivl=1ms
      I:  If#= 1 Alt= 5 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
      E:  Ad=83(I) Atr=01(Isoc) MxPS=  49 Ivl=1ms
      E:  Ad=03(O) Atr=01(Isoc) MxPS=  49 Ivl=1ms
      Signed-off-by: default avatarJanne Heikkinen <janne.m.heikkinen@gmail.com>
      Signed-off-by: default avatarMarcel Holtmann <marcel@holtmann.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      2d47ca2e
    • Dmitry Tunin's avatar
      Bluetooth: ath3k: Add support of MCI 13d3:3408 bt device · 2eb92f79
      Dmitry Tunin authored
      commit 3bb30a7c upstream.
      
      Add support for Bluetooth MCI WB335 (AR9565) Wi-Fi+bt module. This
      Bluetooth module requires loading patch and sysconfig by ath3k driver.
      
      T:  Bus=01 Lev=02 Prnt=03 Port=00 Cnt=01 Dev#= 20 Spd=12   MxCh= 0
      D:  Ver= 1.10 Cls=e0(wlcon) Sub=01 Prot=01 MxPS=64 #Cfgs=  1
      P:  Vendor=13d3 ProdID=3408 Rev= 0.02
      C:* #Ifs= 2 Cfg#= 1 Atr=e0 MxPwr=100mA
      A:  FirstIf#= 0 IfCount= 2 Cls=e0(wlcon) Sub=01 Prot=01
      I:* If#= 0 Alt= 0 #EPs= 3 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
      E:  Ad=81(I) Atr=03(Int.) MxPS=  16 Ivl=1ms
      E:  Ad=82(I) Atr=02(Bulk) MxPS=  64 Ivl=0ms
      E:  Ad=02(O) Atr=02(Bulk) MxPS=  64 Ivl=0ms
      I:* If#= 1 Alt= 0 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
      E:  Ad=83(I) Atr=01(Isoc) MxPS=   0 Ivl=1ms
      E:  Ad=03(O) Atr=01(Isoc) MxPS=   0 Ivl=1ms
      I:  If#= 1 Alt= 1 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
      E:  Ad=83(I) Atr=01(Isoc) MxPS=   9 Ivl=1ms
      E:  Ad=03(O) Atr=01(Isoc) MxPS=   9 Ivl=1ms
      I:  If#= 1 Alt= 2 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
      E:  Ad=83(I) Atr=01(Isoc) MxPS=  17 Ivl=1ms
      E:  Ad=03(O) Atr=01(Isoc) MxPS=  17 Ivl=1ms
      I:  If#= 1 Alt= 3 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
      E:  Ad=83(I) Atr=01(Isoc) MxPS=  25 Ivl=1ms
      E:  Ad=03(O) Atr=01(Isoc) MxPS=  25 Ivl=1ms
      I:  If#= 1 Alt= 4 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
      E:  Ad=83(I) Atr=01(Isoc) MxPS=  33 Ivl=1ms
      E:  Ad=03(O) Atr=01(Isoc) MxPS=  33 Ivl=1ms
      I:  If#= 1 Alt= 5 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
      E:  Ad=83(I) Atr=01(Isoc) MxPS=  49 Ivl=1ms
      E:  Ad=03(O) Atr=01(Isoc) MxPS=  49 Ivl=1ms
      Signed-off-by: default avatarDmitry Tunin <hanipouspilot@gmail.com>
      Signed-off-by: default avatarJohan Hedberg <johan.hedberg@intel.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      2eb92f79
    • Anantha Krishnan's avatar
      Bluetooth: Add support for Acer [0489:e078] · 050f1742
      Anantha Krishnan authored
      commit 4b552bc9 upstream.
      
      Add support for the QCA6174 chip.
      
          T:  Bus=06 Lev=01 Prnt=01 Port=01 Cnt=02 Dev#=  3 Spd=12  MxCh= 0
          D:  Ver= 1.10 Cls=e0(wlcon) Sub=01 Prot=01 MxPS=64 #Cfgs=  1
          P:  Vendor=0489 ProdID=e078 Rev=00.01
          C:  #Ifs= 2 Cfg#= 1 Atr=e0 MxPwr=100mA
          I:  If#= 0 Alt= 0 #EPs= 3 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
          I:  If#= 1 Alt= 0 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
      Signed-off-by: default avatarAnantha Krishnan <ananthk@codeaurora.org>
      Signed-off-by: default avatarMarcel Holtmann <marcel@holtmann.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      050f1742
    • Vincent Zwanenburg's avatar
      Add a new PID/VID 0227/0930 for AR3012. · b2b616c2
      Vincent Zwanenburg authored
      commit 89d2975f upstream.
      
      usb devices info:
      
      T:  Bus=01 Lev=02 Prnt=05 Port=00 Cnt=01 Dev#= 20 Spd=12   MxCh= 0
      D:  Ver= 1.10 Cls=e0(wlcon) Sub=01 Prot=01 MxPS=64 #Cfgs=  1
      P:  Vendor=0930 ProdID=0227 Rev= 0.02
      C:* #Ifs= 2 Cfg#= 1 Atr=e0 MxPwr=100mA
      A:  FirstIf#= 0 IfCount= 2 Cls=e0(wlcon) Sub=01 Prot=01
      I:* If#= 0 Alt= 0 #EPs= 3 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
      E:  Ad=81(I) Atr=03(Int.) MxPS=  16 Ivl=1ms
      E:  Ad=82(I) Atr=02(Bulk) MxPS=  64 Ivl=0ms
      E:  Ad=02(O) Atr=02(Bulk) MxPS=  64 Ivl=0ms
      I:* If#= 1 Alt= 0 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
      E:  Ad=83(I) Atr=01(Isoc) MxPS=   0 Ivl=1ms
      E:  Ad=03(O) Atr=01(Isoc) MxPS=   0 Ivl=1ms
      I:  If#= 1 Alt= 1 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
      E:  Ad=83(I) Atr=01(Isoc) MxPS=   9 Ivl=1ms
      E:  Ad=03(O) Atr=01(Isoc) MxPS=   9 Ivl=1ms
      I:  If#= 1 Alt= 2 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
      E:  Ad=83(I) Atr=01(Isoc) MxPS=  17 Ivl=1ms
      E:  Ad=03(O) Atr=01(Isoc) MxPS=  17 Ivl=1ms
      I:  If#= 1 Alt= 3 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
      E:  Ad=83(I) Atr=01(Isoc) MxPS=  25 Ivl=1ms
      E:  Ad=03(O) Atr=01(Isoc) MxPS=  25 Ivl=1ms
      I:  If#= 1 Alt= 4 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
      E:  Ad=83(I) Atr=01(Isoc) MxPS=  33 Ivl=1ms
      E:  Ad=03(O) Atr=01(Isoc) MxPS=  33 Ivl=1ms
      I:  If#= 1 Alt= 5 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
      E:  Ad=83(I) Atr=01(Isoc) MxPS=  49 Ivl=1ms
      E:  Ad=03(O) Atr=01(Isoc) MxPS=  49 Ivl=1ms
      Signed-off-by: default avatarVincent Zwanenburg <vincentz@topmail.ie>
      Signed-off-by: default avatarMarcel Holtmann <marcel@holtmann.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      b2b616c2
    • Marcel Holtmann's avatar
      Bluetooth: Add support for Broadcom device of Asus Z97-DELUXE motherboard · 1903a10d
      Marcel Holtmann authored
      commit c2aef6e8 upstream.
      
      The Asus Z97-DELUXE motherboard contains a Broadcom based Bluetooth
      controller on the USB bus. However vendor and product ID are listed
      as ASUSTek Computer.
      
      T:  Bus=01 Lev=01 Prnt=01 Port=01 Cnt=02 Dev#=  3 Spd=12   MxCh= 0
      D:  Ver= 2.00 Cls=ff(vend.) Sub=01 Prot=01 MxPS=64 #Cfgs=  1
      P:  Vendor=0b05 ProdID=17cf Rev= 1.12
      S:  Manufacturer=Broadcom Corp
      S:  Product=BCM20702A0
      S:  SerialNumber=54271E910064
      C:* #Ifs= 4 Cfg#= 1 Atr=e0 MxPwr=  0mA
      I:* If#= 0 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=01 Prot=01 Driver=btusb
      E:  Ad=81(I) Atr=03(Int.) MxPS=  16 Ivl=1ms
      E:  Ad=82(I) Atr=02(Bulk) MxPS=  64 Ivl=0ms
      E:  Ad=02(O) Atr=02(Bulk) MxPS=  64 Ivl=0ms
      I:* If#= 1 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=01 Prot=01 Driver=btusb
      E:  Ad=83(I) Atr=01(Isoc) MxPS=   0 Ivl=1ms
      E:  Ad=03(O) Atr=01(Isoc) MxPS=   0 Ivl=1ms
      I:  If#= 1 Alt= 1 #EPs= 2 Cls=ff(vend.) Sub=01 Prot=01 Driver=btusb
      E:  Ad=83(I) Atr=01(Isoc) MxPS=   9 Ivl=1ms
      E:  Ad=03(O) Atr=01(Isoc) MxPS=   9 Ivl=1ms
      I:  If#= 1 Alt= 2 #EPs= 2 Cls=ff(vend.) Sub=01 Prot=01 Driver=btusb
      E:  Ad=83(I) Atr=01(Isoc) MxPS=  17 Ivl=1ms
      E:  Ad=03(O) Atr=01(Isoc) MxPS=  17 Ivl=1ms
      I:  If#= 1 Alt= 3 #EPs= 2 Cls=ff(vend.) Sub=01 Prot=01 Driver=btusb
      E:  Ad=83(I) Atr=01(Isoc) MxPS=  25 Ivl=1ms
      E:  Ad=03(O) Atr=01(Isoc) MxPS=  25 Ivl=1ms
      I:  If#= 1 Alt= 4 #EPs= 2 Cls=ff(vend.) Sub=01 Prot=01 Driver=btusb
      E:  Ad=83(I) Atr=01(Isoc) MxPS=  33 Ivl=1ms
      E:  Ad=03(O) Atr=01(Isoc) MxPS=  33 Ivl=1ms
      I:  If#= 1 Alt= 5 #EPs= 2 Cls=ff(vend.) Sub=01 Prot=01 Driver=btusb
      E:  Ad=83(I) Atr=01(Isoc) MxPS=  49 Ivl=1ms
      E:  Ad=03(O) Atr=01(Isoc) MxPS=  49 Ivl=1ms
      I:* If#= 2 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=ff Driver=(none)
      E:  Ad=84(I) Atr=02(Bulk) MxPS=  32 Ivl=0ms
      E:  Ad=04(O) Atr=02(Bulk) MxPS=  32 Ivl=0ms
      I:* If#= 3 Alt= 0 #EPs= 0 Cls=fe(app. ) Sub=01 Prot=01 Driver=(none)
      Reported-by: default avatarJerome Leclanche <jerome@leclan.ch>
      Signed-off-by: default avatarMarcel Holtmann <marcel@holtmann.org>
      Signed-off-by: default avatarJohan Hedberg <johan.hedberg@intel.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      1903a10d