1. 12 Sep, 2014 40 commits
    • perf tools: Add anonymous huge page recognition · a2e8e658
      Joshua Zhu authored
      When a vm_area_struct describes anonymous memory, perf_mmap_event's
      filename is set to "//anon" to indicate that the vma belongs to
      anonymous memory.
      
      Once huge pages are used, the vma's vm_file points to hugetlbfs. As a
      result, the vma is no longer regarded as anonymous memory by
      is_anon_memory() in the perf user space utility.
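As a rough illustration of the user-space check described above, the sketch below classifies an mmap event's filename. It reuses the is_anon_memory() name from the message, but the body and the "/anon_hugepage (deleted)" string are assumptions made for this example, not a copy of the perf source.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Sketch only: returns true when a filename reported in an mmap event
 * denotes anonymous memory. The hugepage string is an assumed name for
 * hugetlbfs-backed anonymous mappings.
 */
bool is_anon_memory(const char *filename)
{
	return strcmp(filename, "//anon") == 0 ||
	       strcmp(filename, "/anon_hugepage (deleted)") == 0;
}
```

With a check of this shape, a hugetlbfs-backed anonymous mapping is grouped with ordinary "//anon" mappings instead of being treated as a file mapping.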
      Signed-off-by: Joshua Zhu <zhu.wen-jie@hp.com>
      Cc: Akihiro Nagai <akihiro.nagai.hw@hitachi.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Joshua Zhu <zhu.wen-jie@hp.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1357363797-3550-1-git-send-email-zhu.wen-jie@hp.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      
      (cherry picked from commit d0528b5d)
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
    • libata: make it clear that sata_inic162x is experimental · 8e2f67fc
      Tejun Heo authored
      sata_inic162x never reached a state where it's reliable enough for
      production use and data corruption is a relatively common occurrence.
      Make the driver generate a warning about the issues and mark the Kconfig
      option as experimental.
      
      If the situation doesn't improve, we'd be better off making it depend
      on CONFIG_BROKEN.  Let's wait for several cycles and see if the kernel
      message draws any attention.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reported-by: Martin Braure de Calignon <braurede@free.fr>
      Reported-by: Ben Hutchings <ben@decadent.org.uk>
      Reported-by: risc4all@yahoo.com
      
      (cherry picked from commit bb969619)
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
    • ifb: Include <linux/sched.h> · 32a40775
      Ben Hutchings authored
      Commit b51c3427 ('ifb: fix rcu_sched self-detected stalls', commit
      440d57bc upstream) added a call to cond_resched(), which is
      declared in <linux/sched.h>.  In Linux 3.2.y that header is
      included indirectly in some but not all configurations, so add a
      direct #include.
      Reported-by: Teck Choon Giam <giamteckchoon@gmail.com>
      Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
      (cherry picked from commit 22cbb1bd)
      
      (cherry picked from commit HEAD)
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
    • sparc64: Do not insert non-valid PTEs into the TSB hash table. · 866c34a1
      David S. Miller authored
      The assumption was that update_mmu_cache() (and the equivalent for PMDs) would
      only be called when the PTE being installed will be accessible by the user.
      
      This is not true for code paths originating from remove_migration_pte().
      
      There are dire consequences for placing a non-valid PTE into the TSB.  The TLB
      miss framework assumes that when a TSB entry matches we can just load it into
      the TLB and return from the TLB miss trap.
      
      So if a non-valid PTE is in there, we will deadlock taking the TLB miss over
      and over, never satisfying the miss.
      
      Just exit early from update_mmu_cache() and friends in this situation.
      
      Based upon a report and patch from Christopher Alexander Tobias Schulze.
      Signed-off-by: David S. Miller <davem@davemloft.net>
      
      (cherry picked from commit 18f38132)
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
    • x86, espfix: Make espfix64 a Kconfig option, fix UML · 8461e05f
      H. Peter Anvin authored
      Make espfix64 a hidden Kconfig option.  This fixes the x86-64 UML
      build which had broken due to the non-existence of init_espfix_bsp()
      in UML: since UML uses its own Kconfig, this option does not appear in
      the UML build.
      
      This also makes it possible to make support for 16-bit segments a
      configuration option, for the people who want to minimize the size of
      the kernel.
      Reported-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
      Cc: Richard Weinberger <richard@nod.at>
      Link: http://lkml.kernel.org/r/1398816946-3351-1-git-send-email-hpa@linux.intel.com
      
      (cherry picked from commit 197725de)
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
    • x86_32, entry: Store badsys error code in %eax · b47af4d9
      Sven Wegener authored
      Commit 554086d8 ("x86_32, entry: Do syscall exit work on badsys
      (CVE-2014-4508)") introduced a regression in the x86_32 syscall entry
      code, resulting in syscall() not returning proper errors for undefined
      syscalls on CPUs supporting the sysenter feature.
      
      The following code:
      
      > int result = syscall(666);
      > printf("result=%d errno=%d error=%s\n", result, errno, strerror(errno));
      
      results in:
      
      > result=666 errno=0 error=Success
      
      Obviously, the syscall return value is the called syscall number, but it
      should have been an ENOSYS error. When run under ptrace it behaves
      correctly, which makes it hard to debug in the wild:
      
      > result=-1 errno=38 error=Function not implemented
      
      The %eax register is the return value register. For debugging via ptrace
      the syscall entry code stores the complete register context on the
      stack. The badsys handlers only store the ENOSYS error code in the
      ptrace register set and do not set %eax like a regular syscall handler
      would. The old resume_userspace call chain contains code that clobbers
      %eax and it restores %eax from the ptrace registers afterwards. The same
      goes for the ptrace-enabled call chain. When ptrace is not used, the
      syscall return value is the passed-in syscall number from the untouched
      %eax register.
      
      Use %eax as the return value register in syscall_badsys and
      sysenter_badsys, like a real syscall handler does, and have the caller
      push the value onto the stack for ptrace access.
      Signed-off-by: Sven Wegener <sven.wegener@stealer.net>
      Link: http://lkml.kernel.org/r/alpine.LNX.2.11.1407221022380.31021@titan.int.lan.stealer.net
      Reviewed-and-tested-by: Andy Lutomirski <luto@amacapital.net>
      Cc: <stable@vger.kernel.org> # If 554086d8 is backported
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
      
      (cherry picked from commit 8142b215)
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
    • PM / sleep: Fix request_firmware() error at resume · 13bfcd58
      Takashi Iwai authored
      The commit [247bc037: PM / Sleep: Mitigate race between the freezer
      and request_firmware()] introduced the finer state control, but it
      also leads to a new bug; for example, a bug report regarding the
      firmware loading of intel BT device at suspend/resume:
        https://bugzilla.novell.com/show_bug.cgi?id=873790
      
      The root cause seems to be a small window between the resumption of
      processes and the clearing of the usermodehelper lock.  The
      request_firmware() function checks the UMH lock and gives up when it's in
      UMH_DISABLE state.  This avoids invalid f/w loading during the
      suspend/resume phase.
      The problem is, however, that usermodehelper_enable() is called at the
      end of thaw_processes().  Thus, a thawed process in between can kick
      off the f/w loader code path (in this case, via btusb_setup_intel())
      even before the call of usermodehelper_enable().  Then
      usermodehelper_read_trylock() returns an error and request_firmware()
      spews WARN_ON() in the end.
      
      This one-liner patch fixes the issue just by setting the state to
      UMH_FREEZING again before restarting tasks, so that a call to
      request_firmware() blocks until the end of this function instead of
      returning an error.
      
      Fixes: 247bc037 (PM / Sleep: Mitigate race between the freezer and request_firmware())
      Link: https://bugzilla.novell.com/show_bug.cgi?id=873790
      Cc: 3.4+ <stable@vger.kernel.org> # 3.4+
      Signed-off-by: Takashi Iwai <tiwai@suse.de>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      
      (cherry picked from commit 4320f6b1)
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
    • alarmtimer: Fix bug where relative alarm timers were treated as absolute · 4864494f
      John Stultz authored
      Sharvil noticed that with the POSIX timer_settime interface, when using
      the CLOCK_REALTIME_ALARM or CLOCK_BOOTTIME_ALARM clockid, a timer that
      the user specified as relative would incorrectly be treated as
      absolute, regardless of the state of the flags argument.
      
      This patch corrects this by properly checking the absolute/relative flag,
      and adds further error checking to ensure that no invalid flag bits are set.
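The flag handling described above can be sketched in plain C. The helper name alarm_expiry_ns and its nanosecond parameters are hypothetical and chosen only for illustration; the sketch shows the intended semantics (honor TIMER_ABSTIME, reject unknown flag bits) rather than the kernel's actual code.

```c
#include <errno.h>
#include <stdint.h>
#include <time.h>

#ifndef TIMER_ABSTIME
#define TIMER_ABSTIME 1 /* fallback for strict ISO C builds; Linux value */
#endif

/*
 * Hypothetical helper: compute the absolute expiry for a
 * timer_settime()-style request. Returns 0 on success, or -EINVAL
 * when flag bits other than TIMER_ABSTIME are set.
 */
int alarm_expiry_ns(int64_t now_ns, int64_t value_ns, int flags,
		    int64_t *expiry_ns)
{
	if (flags & ~TIMER_ABSTIME)
		return -EINVAL;

	if (flags & TIMER_ABSTIME)
		*expiry_ns = value_ns;          /* already absolute */
	else
		*expiry_ns = now_ns + value_ns; /* relative to "now" */
	return 0;
}
```

The bug described in the message corresponds to always taking the absolute branch, whatever the caller passed in flags.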
      Reported-by: Sharvil Nanavati <sharvil@google.com>
      Signed-off-by: John Stultz <john.stultz@linaro.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@kernel.org>
      Cc: Prarit Bhargava <prarit@redhat.com>
      Cc: Sharvil Nanavati <sharvil@google.com>
      Cc: stable <stable@vger.kernel.org> #3.0+
      Link: http://lkml.kernel.org/r/1404767171-6902-1-git-send-email-john.stultz@linaro.org
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      
      (cherry picked from commit 16927776)
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
    • drm/radeon: avoid leaking edid data · d76fa2c7
      Alex Deucher authored
      In some cases we fetch the edid in the detect() callback
      in order to determine what sort of monitor is connected.
      If that happens, don't fetch the edid again in the get_modes()
      callback or we will leak the edid.
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
      Cc: stable@vger.kernel.org
      
      (cherry picked from commit 0ac66eff)
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
    • mwifiex: fix Tx timeout issue · 0832a4fd
      Amitkumar Karwar authored
      https://bugzilla.kernel.org/show_bug.cgi?id=70191
      https://bugzilla.kernel.org/show_bug.cgi?id=77581
      
      It is observed that sometimes a Tx packet is downloaded without
      the driver's txpd header added. This results in the firmware parsing
      garbage data as the packet length. Sometimes the firmware is unable
      to read the packet if the length comes out as invalid. This stops
      further traffic, and a timeout occurs.
      
      The root cause is that uninitialized fields in the packet's
      tx_info (skb->cb) can hold garbage values. If the
      MWIFIEX_BUF_FLAG_REQUEUED_PKT flag happens to be set this way, the
      txpd header is skipped. This patch makes sure that tx_info is
      correctly initialized to fix the problem.
      
      Cc: <stable@vger.kernel.org>
      Reported-by: Andrew Wiley <wiley.andrew.j@gmail.com>
      Reported-by: Linus Gasser <list@markas-al-nour.org>
      Reported-by: Michael Hirsch <hirsch@teufel.de>
      Tested-by: Xinming Hu <huxm@marvell.com>
      Signed-off-by: Amitkumar Karwar <akarwar@marvell.com>
      Signed-off-by: Maithili Hinge <maithili@marvell.com>
      Signed-off-by: Avinash Patil <patila@marvell.com>
      Signed-off-by: Bing Zhao <bzhao@marvell.com>
      Signed-off-by: John W. Linville <linville@tuxdriver.com>
      
      (cherry picked from commit d76744a9)
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
    • perf/x86/intel: ignore CondChgd bit to avoid false NMI handling · 76a2b2ed
      HATAYAMA Daisuke authored
      Currently, any NMI is falsely handled by a NMI handler of NMI watchdog
      if CondChgd bit in MSR_CORE_PERF_GLOBAL_STATUS MSR is set.
      
      For example, we use an external NMI to make the system panic and get a
      crash dump, but in this case the external NMI is falsely handled due to
      this issue.
      
      This commit deals with the issue simply by ignoring CondChgd bit.
      
      Here is explanation in detail.
      
      On x86 NMI watchdog uses performance monitoring feature to
      periodically signal NMI each time performance counter gets overflowed.
      
      intel_pmu_handle_irq() is called as a NMI_LOCAL handler from a NMI
      handler of NMI watchdog, perf_event_nmi_handler(). It identifies an
      owner of a given NMI by looking at overflow status bits in
      MSR_CORE_PERF_GLOBAL_STATUS MSR. If some of the bits are set, then it
      handles the given NMI as its own NMI.
      
      The problem is that the intel_pmu_handle_irq() doesn't distinguish
      CondChgd bit from other bits. Unlike the other status bits, CondChgd
      bit doesn't represent overflow status for performance counters. Thus,
      CondChgd bit cannot be thought of as a mark indicating a given NMI is
      NMI watchdog's.
      
      As a result, if CondChgd bit is set, any NMI is falsely handled by the
      NMI handler of NMI watchdog. Also, if type of the falsely handled NMI
      is either NMI_UNKNOWN, NMI_SERR or NMI_IO_CHECK, the corresponding
      action is never performed until CondChgd bit is cleared.
      
      I noticed this behavior on systems with Ivy Bridge processors: Intel
      Xeon CPU E5-2630 v2 and Intel Xeon CPU E7-8890 v2. On both systems,
      CondChgd bit in MSR_CORE_PERF_GLOBAL_STATUS MSR has already been set
      in the beginning at boot. Then the CondChgd bit is immediately cleared
      by next wrmsr to MSR_CORE_PERF_GLOBAL_CTRL MSR and appears to remain
      0.
      
      On the other hand, on older processors such as Nehalem, Xeon E7540,
      CondChgd bit is not set in the beginning at boot.
      
      I'm not sure about exact behavior of CondChgd bit, in particular when
      this bit is set. Although I read Intel System Programmer's Manual to
      figure out that, the descriptions I found are:
      
        In 18.9.1:
      
        "The MSR_PERF_GLOBAL_STATUS MSR also provides a 'sticky bit' to
         indicate changes to the state of performance monitoring hardware"
      
        In Table 35-2 IA-32 Architectural MSRs
      
        63 CondChg: status bits of this register has changed.
      
      These are different from the behaviour I see on the actual system, as I
      explained above.
      
      At least, I think ignoring CondChgd bit should be enough for NMI
      watchdog perspective.
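A minimal sketch of the idea, assuming that every status bit other than bit 63 marks an overflow (a simplification made only for this example). The names GLOBAL_STATUS_COND_CHG and pmu_owns_nmi are hypothetical; bit 63 is taken from the manual excerpt quoted above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 63 of the global status MSR, per the table excerpt quoted above. */
#define GLOBAL_STATUS_COND_CHG (UINT64_C(1) << 63)

/*
 * Hypothetical helper: decide whether the PMU handler should claim an
 * NMI, given the value read from the global status MSR. CondChgd is
 * masked out because it does not indicate a counter overflow.
 */
bool pmu_owns_nmi(uint64_t global_status)
{
	uint64_t overflow_bits = global_status & ~GLOBAL_STATUS_COND_CHG;

	return overflow_bits != 0;
}
```

With the mask in place, a status value that has only CondChgd set no longer causes the handler to claim an unrelated NMI.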
      Signed-off-by: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
      Acked-by: Don Zickus <dzickus@redhat.com>
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Cc: <stable@vger.kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: linux-kernel@vger.kernel.org
      Link: http://lkml.kernel.org/r/20140625.103503.409316067.d.hatayama@jp.fujitsu.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      
      (cherry picked from commit b292d7a1)
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
    • ipv4: fix buffer overflow in ip_options_compile() · 863b3474
      Eric Dumazet authored
      There is a benign buffer overflow in ip_options_compile spotted by
      AddressSanitizer[1] :
      
      It's benign because we can always access one extra byte in skb->head
      (because header is followed by struct skb_shared_info), and in this case
      this byte is not even used.
      
      [28504.910798] ==================================================================
      [28504.912046] AddressSanitizer: heap-buffer-overflow in ip_options_compile
      [28504.913170] Read of size 1 by thread T15843:
      [28504.914026]  [<ffffffff81802f91>] ip_options_compile+0x121/0x9c0
      [28504.915394]  [<ffffffff81804a0d>] ip_options_get_from_user+0xad/0x120
      [28504.916843]  [<ffffffff8180dedf>] do_ip_setsockopt.isra.15+0x8df/0x1630
      [28504.918175]  [<ffffffff8180ec60>] ip_setsockopt+0x30/0xa0
      [28504.919490]  [<ffffffff8181e59b>] tcp_setsockopt+0x5b/0x90
      [28504.920835]  [<ffffffff8177462f>] sock_common_setsockopt+0x5f/0x70
      [28504.922208]  [<ffffffff817729c2>] SyS_setsockopt+0xa2/0x140
      [28504.923459]  [<ffffffff818cfb69>] system_call_fastpath+0x16/0x1b
      [28504.924722]
      [28504.925106] Allocated by thread T15843:
      [28504.925815]  [<ffffffff81804995>] ip_options_get_from_user+0x35/0x120
      [28504.926884]  [<ffffffff8180dedf>] do_ip_setsockopt.isra.15+0x8df/0x1630
      [28504.927975]  [<ffffffff8180ec60>] ip_setsockopt+0x30/0xa0
      [28504.929175]  [<ffffffff8181e59b>] tcp_setsockopt+0x5b/0x90
      [28504.930400]  [<ffffffff8177462f>] sock_common_setsockopt+0x5f/0x70
      [28504.931677]  [<ffffffff817729c2>] SyS_setsockopt+0xa2/0x140
      [28504.932851]  [<ffffffff818cfb69>] system_call_fastpath+0x16/0x1b
      [28504.934018]
      [28504.934377] The buggy address ffff880026382828 is located 0 bytes to the right
      [28504.934377]  of 40-byte region [ffff880026382800, ffff880026382828)
      [28504.937144]
      [28504.937474] Memory state around the buggy address:
      [28504.938430]  ffff880026382300: ........ rrrrrrrr rrrrrrrr rrrrrrrr
      [28504.939884]  ffff880026382400: ffffffff rrrrrrrr rrrrrrrr rrrrrrrr
      [28504.941294]  ffff880026382500: .....rrr rrrrrrrr rrrrrrrr rrrrrrrr
      [28504.942504]  ffff880026382600: ffffffff rrrrrrrr rrrrrrrr rrrrrrrr
      [28504.943483]  ffff880026382700: ffffffff rrrrrrrr rrrrrrrr rrrrrrrr
      [28504.944511] >ffff880026382800: .....rrr rrrrrrrr rrrrrrrr rrrrrrrr
      [28504.945573]                         ^
      [28504.946277]  ffff880026382900: ffffffff rrrrrrrr rrrrrrrr rrrrrrrr
      [28505.094949]  ffff880026382a00: ffffffff rrrrrrrr rrrrrrrr rrrrrrrr
      [28505.096114]  ffff880026382b00: ffffffff rrrrrrrr rrrrrrrr rrrrrrrr
      [28505.097116]  ffff880026382c00: ffffffff rrrrrrrr rrrrrrrr rrrrrrrr
      [28505.098472]  ffff880026382d00: ffffffff rrrrrrrr rrrrrrrr rrrrrrrr
      [28505.099804] Legend:
      [28505.100269]  f - 8 freed bytes
      [28505.100884]  r - 8 redzone bytes
      [28505.101649]  . - 8 allocated bytes
      [28505.102406]  x=1..7 - x allocated bytes + (8-x) redzone bytes
      [28505.103637] ==================================================================
      
      [1] https://code.google.com/p/address-sanitizer/wiki/AddressSanitizerForKernel
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      
      (cherry picked from commit 10ec9472)
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
    • dns_resolver: Null-terminate the right string · e0c4ff36
      Ben Hutchings authored
      *_result[len] is parsed as *(_result[len]) which is not at all what we
      want to touch here.
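The precedence issue can be shown with a small stand-alone function. The helper copy_result and its parameters are hypothetical and only illustrate writing through a char ** out-parameter; this is not the dns_resolver code itself.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: copy len bytes of data into a freshly allocated
 * buffer returned through _result, and null-terminate it.
 * Returns 0 on success, -1 if the allocation fails.
 */
int copy_result(const char *data, size_t len, char **_result)
{
	*_result = malloc(len + 1);
	if (!*_result)
		return -1;
	memcpy(*_result, data, len);

	/*
	 * Wrong:  *_result[len] = '\0';
	 * That parses as *(_result[len]) and indexes the char ** itself.
	 */
	(*_result)[len] = '\0';
	return 0;
}
```

Because the subscript operator binds tighter than the unary dereference, the parentheses are what make the write land inside the newly allocated string.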
      Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
      Fixes: 84a7c0b1 ("dns_resolver: assure that dns_query() result is null-terminated")
      Signed-off-by: David S. Miller <davem@davemloft.net>
      
      (cherry picked from commit 640d7efe)
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
    • dns_resolver: assure that dns_query() result is null-terminated · 3cf95798
      Manuel Schölling authored
      [ Upstream commit 84a7c0b1 ]
      
      dns_query() credulously assumes that keys are null-terminated and
      returns a copy of a memory block that is off by one.
      Signed-off-by: Manuel Schölling <manuel.schoelling@gmx.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      (cherry picked from commit c304a23b)
    • sunvnet: clean up objects created in vnet_new() on vnet_exit() · ab4be9ea
      Sowmini Varadhan authored
      Nothing cleans up the objects created by
      vnet_new(), they are completely leaked.
      
      vnet_exit(), after doing the vio_unregister_driver() to clean
      up ports, should call a helper function that iterates over vnet_list
      and cleans up those objects. This includes unregister_netdevice()
      as well as free_netdev().
      Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
      Acked-by: Dave Kleikamp <dave.kleikamp@oracle.com>
      Reviewed-by: Karl Volz <karl.volz@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      
      (cherry picked from commit a4b70a07)
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
    • net: pppoe: use correct channel MTU when using Multilink PPP · 35c2ad27
      Christoph Schulz authored
      The PPP channel MTU is used with Multilink PPP when ppp_mp_explode() (see
      ppp_generic module) tries to determine how big a fragment might be. According
      to RFC 1661, the MTU excludes the 2-byte PPP protocol field, see the
      corresponding comment and code in ppp_mp_explode():
      
      		/*
      		 * hdrlen includes the 2-byte PPP protocol field, but the
      		 * MTU counts only the payload excluding the protocol field.
      		 * (RFC1661 Section 2)
      		 */
      		mtu = pch->chan->mtu - (hdrlen - 2);
      
      However, the pppoe module *does* include the PPP protocol field in the channel
      MTU, which is wrong as it causes the PPP payload to be 1-2 bytes too big under
      certain circumstances (one byte if PPP protocol compression is used, two
      otherwise), causing the generated Ethernet packets to be dropped. So the pppoe
      module has to subtract two bytes from the channel MTU. This error only
      manifests itself when using Multilink PPP, as otherwise the channel MTU is not
      used anywhere.
      
      In the following, I will describe how to reproduce this bug. We configure two
      pppd instances for multilink PPP over two PPPoE links, say eth2 and eth3, with
      an MTU of 1492 bytes for each link and an MRRU of 2976 bytes. (This MRRU is
      computed by adding the two link MTUs and subtracting the MP header twice, which
      is 4 bytes long.) The necessary pppd statements on both sides are "multilink
      mtu 1492 mru 1492 mrru 2976". On the client side, we additionally need "plugin
      rp-pppoe.so eth2" and "plugin rp-pppoe.so eth3", respectively; on the server
      side, we additionally need to start two pppoe-server instances to be able to
      establish two PPPoE sessions, one over eth2 and one over eth3. We set the MTU
      of the PPP network interface to the MRRU (2976) on both sides of the connection
      in order to make use of the higher bandwidth. (If we didn't do that, IP
      fragmentation would kick in, which we want to avoid.)
      
      Now we send an ICMPv4 echo request with a payload of 2948 bytes from client to
      server over the PPP link. This results in the following network packet:
      
         2948 (echo payload)
       +    8 (ICMPv4 header)
       +   20 (IPv4 header)
      ---------------------
         2976 (PPP payload)
      
      These 2976 bytes do not exceed the MTU of the PPP network interface, so the
      IP packet is not fragmented. Now the multilink PPP code in ppp_mp_explode()
      prepends one protocol byte (0x21 for IPv4), making the packet one byte bigger
      than the negotiated MRRU. So this packet would have to be divided into three
      fragments. But this does not happen as each link MTU is assumed to be two bytes
      larger. So this packet is divided into only two fragments, one of size 1489 and
      one of size 1488. For the bigger fragment we now have:
      
         1489 (PPP payload)
       +    4 (MP header)
       +    2 (PPP protocol field for the MP payload (0x3d))
       +    6 (PPPoE header)
      --------------------------
         1501 (Ethernet payload)
      
      This packet exceeds the link MTU and is discarded.
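The per-fragment arithmetic above can be checked with a few lines of C. The overhead values come from the breakdown in the message; the 1500-byte Ethernet payload limit and all identifier names are assumptions made for this sketch.

```c
/* Per-fragment overheads in bytes, taken from the breakdown above. */
enum {
	MP_HEADER_LEN    = 4,    /* Multilink PPP header */
	PPP_PROTO_LEN    = 2,    /* PPP protocol field (0x3d) */
	PPPOE_HEADER_LEN = 6,    /* PPPoE header */
	ETH_PAYLOAD_MAX  = 1500  /* assumed standard Ethernet payload limit */
};

/* Ethernet payload size for one Multilink fragment of frag_len bytes. */
int eth_payload_len(int frag_len)
{
	return frag_len + MP_HEADER_LEN + PPP_PROTO_LEN + PPPOE_HEADER_LEN;
}

/* Largest fragment that still fits into one Ethernet frame. */
int max_fragment_len(void)
{
	return ETH_PAYLOAD_MAX - PPPOE_HEADER_LEN - PPP_PROTO_LEN - MP_HEADER_LEN;
}
```

Under these assumptions a 1489-byte fragment needs 1501 bytes of Ethernet payload, one byte over the limit, while 1488 bytes is the largest fragment that fits.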
      
      If one configures the link MTU on the client side to 1501, one can see the
      discarded Ethernet frames with tcpdump running on the client. A
      
      ping -s 2948 -c 1 192.168.15.254
      
      leads to the smaller fragment that is correctly received on the server side:
      
      (tcpdump -vvvne -i eth3 pppoes and ppp proto 0x3d)
      52:54:00:ad:87:fd > 52:54:00:79:5c:d0, ethertype PPPoE S (0x8864),
        length 1514: PPPoE  [ses 0x3] MLPPP (0x003d), length 1494: seq 0x000,
        Flags [end], length 1492
      
      and to the bigger fragment that is not received on the server side:
      
      (tcpdump -vvvne -i eth2 pppoes and ppp proto 0x3d)
      52:54:00:70:9e:89 > 52:54:00:5d:6f:b0, ethertype PPPoE S (0x8864),
        length 1515: PPPoE  [ses 0x5] MLPPP (0x003d), length 1495: seq 0x000,
        Flags [begin], length 1493
      
      With the patch below, we correctly obtain three fragments:
      
      52:54:00:ad:87:fd > 52:54:00:79:5c:d0, ethertype PPPoE S (0x8864),
        length 1514: PPPoE  [ses 0x1] MLPPP (0x003d), length 1494: seq 0x000,
        Flags [begin], length 1492
      52:54:00:70:9e:89 > 52:54:00:5d:6f:b0, ethertype PPPoE S (0x8864),
        length 1514: PPPoE  [ses 0x1] MLPPP (0x003d), length 1494: seq 0x000,
        Flags [none], length 1492
      52:54:00:ad:87:fd > 52:54:00:79:5c:d0, ethertype PPPoE S (0x8864),
        length 27: PPPoE  [ses 0x1] MLPPP (0x003d), length 7: seq 0x000,
        Flags [end], length 5
      
      And the ICMPv4 echo request is successfully received at the server side:
      
      IP (tos 0x0, ttl 64, id 21925, offset 0, flags [DF], proto ICMP (1),
        length 2976)
          192.168.222.2 > 192.168.15.254: ICMP echo request, id 30530, seq 0,
            length 2956
      
      The bug was introduced in commit c9aa6895
      ("[PPPOE]: Advertise PPPoE MTU") from the very beginning. This patch applies
      to 3.10 upwards but the fix can be applied (with minor modifications) to
      kernels as old as 2.6.32.
      Signed-off-by: Christoph Schulz <develop@kristov.de>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      
      (cherry picked from commit a8a3e41c)
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
    • net: sctp: fix information leaks in ulpevent layer · 136b4c39
      Daniel Borkmann authored
      While working on some other SCTP code, I noticed that some
      structures shared with user space leak uninitialized stack or
      heap memory. In particular, struct sctp_sndrcvinfo has a 2-byte
      hole between .sinfo_flags and .sinfo_ppid that remains unfilled
      by us in sctp_ulpevent_read_sndrcvinfo() when putting this into
      cmsg. struct sctp_remote_error also contains a 2-byte hole that
      we don't fill but place into a skb through skb_copy_expand() via
      sctp_ulpevent_make_remote_error().
      
      Both structures are defined by the IETF in RFC6458:
      
      * Section 5.3.2. SCTP Header Information Structure:
      
        The sctp_sndrcvinfo structure is defined below:
      
        struct sctp_sndrcvinfo {
          uint16_t sinfo_stream;
          uint16_t sinfo_ssn;
          uint16_t sinfo_flags;
          <-- 2 bytes hole  -->
          uint32_t sinfo_ppid;
          uint32_t sinfo_context;
          uint32_t sinfo_timetolive;
          uint32_t sinfo_tsn;
          uint32_t sinfo_cumtsn;
          sctp_assoc_t sinfo_assoc_id;
        };
      
      * 6.1.3. SCTP_REMOTE_ERROR:
      
        A remote peer may send an Operation Error message to its peer.
        This message indicates a variety of error conditions on an
        association. The entire ERROR chunk as it appears on the wire
        is included in an SCTP_REMOTE_ERROR event. Please refer to the
        SCTP specification [RFC4960] and any extensions for a list of
        possible error formats. An SCTP error notification has the
        following format:
      
        struct sctp_remote_error {
          uint16_t sre_type;
          uint16_t sre_flags;
          uint32_t sre_length;
          uint16_t sre_error;
          <-- 2 bytes hole  -->
          sctp_assoc_t sre_assoc_id;
          uint8_t  sre_data[];
        };
      
      Fix this by setting both to 0 before filling them out. We also
      have other structures shared between user and kernel space in
      SCTP that contain holes (e.g. struct sctp_paddrthlds), but we
      copy that buffer over from user space first and thus don't need
      to care about it in those cases.
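The hole and the zero-before-fill approach can be reproduced in user space. The sketch below copies the sctp_sndrcvinfo layout quoted above into a hypothetical struct sndrcvinfo_sketch, assumes sctp_assoc_t is a 32-bit integer, and assumes a common ABI where uint32_t is 4-byte aligned.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef int32_t sctp_assoc_t; /* assumed 32-bit for this sketch */

struct sndrcvinfo_sketch {
	uint16_t sinfo_stream;
	uint16_t sinfo_ssn;
	uint16_t sinfo_flags;
	/* 2 bytes of compiler-inserted padding align sinfo_ppid */
	uint32_t sinfo_ppid;
	uint32_t sinfo_context;
	uint32_t sinfo_timetolive;
	uint32_t sinfo_tsn;
	uint32_t sinfo_cumtsn;
	sctp_assoc_t sinfo_assoc_id;
};

/* Zero the whole object first so the padding holds no stale bytes. */
void fill_sndrcvinfo(struct sndrcvinfo_sketch *info, uint16_t stream,
		     uint32_t ppid)
{
	memset(info, 0, sizeof(*info));
	info->sinfo_stream = stream;
	info->sinfo_ppid = ppid;
}
```

Assigning each member individually would leave bytes 6 and 7 untouched, which is how stale stack or heap contents can reach user space when the whole structure is copied out.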
      
      While at it, we can also remove lengthy comments copied from
      the draft, instead, we update the comment with the correct RFC
      number where one can look it up.
      Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      
      (cherry picked from commit 8f2e5ae4)
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
    • tipc: clear 'next'-pointer of message fragments before reassembly · 05110600
      Jon Paul Maloy authored
      If the 'next' pointer of the last fragment buffer in a message is not
      zeroed before reassembly, we risk ending up with a corrupt message,
      since the reassembly function itself isn't doing this.
      
      Currently, when a buffer is retrieved from the deferred queue of the
      broadcast link, the next pointer is not cleared, with the result as
      described above.
      
      This commit corrects this, and thereby fixes a bug that may occur when
      long broadcast messages are transmitted across dual interfaces. The bug
      has been present since 40ba3cdf ("tipc:
      message reassembly using fragment chain").
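A generic sketch of the pattern, using a hypothetical singly linked buffer type rather than TIPC's real data structures: when a buffer is taken off a queue, its 'next' pointer is cleared so that a later reassembly step sees a properly terminated buffer.

```c
#include <stddef.h>

/* Hypothetical buffer type; not the structure TIPC actually uses. */
struct frag_buf {
	struct frag_buf *next;
	int seqno;
};

/*
 * Unlink and return the head of the queue, or NULL if it is empty.
 * The returned buffer's 'next' pointer is cleared so it no longer
 * refers to buffers that are still queued.
 */
struct frag_buf *dequeue_head(struct frag_buf **queue)
{
	struct frag_buf *buf = *queue;

	if (!buf)
		return NULL;
	*queue = buf->next;
	buf->next = NULL;
	return buf;
}
```

Without the final assignment, the dequeued buffer would still point at the rest of the queue, and code that walks a fragment chain until it reaches NULL would run past the intended last fragment.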
      
      This commit should be applied to both net and net-next.
      Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      
      (cherry picked from commit 99941754)
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
    • be2net: set EQ DB clear-intr bit in be_open() · 344905db
      Suresh Reddy authored
      On BE3, if the clear-interrupt bit of the EQ doorbell is not set the first
      time it is armed, we have occasionally observed that the EQ doesn't raise
      any more interrupts even if it is in the armed state.
      This patch fixes this by setting the clear-interrupt bit when EQs are
      armed for the first time in be_open().
      Signed-off-by: Suresh Reddy <Suresh.Reddy@emulex.com>
      Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      
      (cherry picked from commit 4cad9f3b)
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
      344905db
    • Andrey Utkin's avatar
      appletalk: Fix socket referencing in skb · 799445c4
      Andrey Utkin authored
      Setting just skb->sk without taking its reference and setting a
      destructor is invalid. However, in the places where this was done, skb
      is used in a way that does not require skb->sk to be set. So drop the
      setting of skb->sk.
      Thanks to Eric Dumazet <eric.dumazet@gmail.com> for the correct solution.
      
      Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=79441
      Reported-by: Ed Martin <edman007@edman007.com>
      Signed-off-by: Andrey Utkin <andrey.krieger.utkin@gmail.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      
      (cherry picked from commit 36beddc2)
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
      799445c4
    • dingtianhong's avatar
      igmp: fix the problem when mc leave group · 8ad105cc
      dingtianhong authored
      The problem was triggered by these steps:
      
      1) create socket, bind and then setsockopt for add mc group.
         mreq.imr_multiaddr.s_addr = inet_addr("255.0.0.37");
         mreq.imr_interface.s_addr = inet_addr("192.168.1.2");
         setsockopt(sockfd, IPPROTO_IP, IP_ADD_MEMBERSHIP, &mreq, sizeof(mreq));
      
      2) drop the mc group for this socket.
         mreq.imr_multiaddr.s_addr = inet_addr("255.0.0.37");
         mreq.imr_interface.s_addr = inet_addr("0.0.0.0");
         setsockopt(sockfd, IPPROTO_IP, IP_DROP_MEMBERSHIP, &mreq, sizeof(mreq));
      
      3) then drop the socket; I found the mc group was still used by the dev:
      
         netstat -g
      
         Interface       RefCnt Group
         --------------- ------ ---------------------
         eth2            1      255.0.0.37
      
      Normally, even if IP_DROP_MEMBERSHIP returns an error, the mc group still needs
      to be released from the netdev when the socket is dropped, but this process was
      broken when the default route is NULL. The reason is as follows:
      
      ip_mc_leave_group() chooses the in_dev by imr_interface.s_addr. If the input addr
      is NULL, the default route dev is chosen and the ifindex is taken from that dev;
      inet->mc_list is then polled and -ENODEV is returned. But if the default route dev
      is NULL, the in_dev and ifindex are both NULL. When inet->mc_list is polled, the mc
      group is released from the mc_list, but the dev does not dec the refcnt for this mc
      group, so when the socket is dropped, the mc_list is NULL and the dev still keeps
      this group.
      
      v1->v2: According to Hideaki's suggestion, we should align with IPv6 (RFC3493) and
      	BSDs, so I added a check for the in_dev before polling the mc_list, to make sure
      	that when we remove the mc group, we dec the refcnt on the real dev which was
      	using the mc address. The problem should never happen again.
      Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      
      (cherry picked from commit 52ad353a)
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
      8ad105cc
    • Li RongQing's avatar
      8021q: fix a potential memory leak · 65c5b9d8
      Li RongQing authored
      skb_cow(), called in vlan_reorder_header(), does not free the skb when it fails,
      and vlan_reorder_header() returns NULL, which resets the original skb pointer
      when it is called from vlan_untag(), leading to a memory leak.
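
      The leak pattern and one way to close it can be sketched in user space.
      This is an illustration under stated assumptions, not the kernel patch:
      toy_buf, toy_cow() and toy_reorder_header() are hypothetical stand-ins,
      and the sketch assumes the fix is to free the buffer before returning
      NULL.

```c
#include <stdlib.h>

static int toy_live_bufs;   /* number of buffers currently allocated */

struct toy_buf {
    int payload;
};

static struct toy_buf *toy_alloc(void)
{
    struct toy_buf *buf = calloc(1, sizeof(*buf));

    if (buf)
        toy_live_bufs++;
    return buf;
}

static void toy_free(struct toy_buf *buf)
{
    if (buf) {
        toy_live_bufs--;
        free(buf);
    }
}

/* Like skb_cow() as described above: on failure it leaves the buffer alone. */
static int toy_cow(struct toy_buf *buf, int simulate_failure)
{
    (void)buf;
    return simulate_failure ? -1 : 0;
}

/* Returns buf on success; on failure frees buf and returns NULL. */
static struct toy_buf *toy_reorder_header(struct toy_buf *buf, int simulate_failure)
{
    if (toy_cow(buf, simulate_failure) < 0) {
        toy_free(buf);   /* without this, the caller's only pointer is lost */
        return NULL;
    }
    return buf;
}
```

      Because the caller overwrites its pointer with the return value, the
      callee is the last place that can still release the buffer on failure.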
      Signed-off-by: Li RongQing <roy.qing.li@gmail.com>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      
      (cherry picked from commit 916c1689)
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
      65c5b9d8
    • Neal Cardwell's avatar
      tcp: fix tcp_match_skb_to_sack() for unaligned SACK at end of an skb · 01d3428d
      Neal Cardwell authored
      If there is an MSS change (or misbehaving receiver) that causes a SACK
      to arrive that covers the end of an skb but is less than one MSS, then
      tcp_match_skb_to_sack() was rounding up pkt_len to the full length of
      the skb ("Round if necessary..."), then chopping all bytes off the skb
      and creating a zero-byte skb in the write queue.
      
      This was visible now because the recently simplified TLP logic in
      bef1909e ("tcp: fixing TLP's FIN recovery") could find that 0-byte
      skb at the end of the write queue, and now that we do not check that
      skb's length we could send it as a TLP probe.
      
      Consider the following example scenario:
      
       mss: 1000
       skb: seq: 0 end_seq: 4000  len: 4000
       SACK: start_seq: 3999 end_seq: 4000
      
      The tcp_match_skb_to_sack() code will compute:
      
       in_sack = false
       pkt_len = start_seq - TCP_SKB_CB(skb)->seq = 3999 - 0 = 3999
       new_len = (pkt_len / mss) * mss = (3999/1000)*1000 = 3000
       new_len += mss = 4000
      
      Previously we would find the new_len > skb->len check failing, so we
      would fall through and set pkt_len = new_len = 4000 and chop off
      pkt_len of 4000 from the 4000-byte skb, leaving a 0-byte segment
      afterward in the write queue.
      
      With this new commit, we notice that the new new_len >= skb->len check
      succeeds, so that we return without trying to fragment.
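
      The arithmetic above can be replayed with a small user-space model. It
      covers only the in_sack == false case, and toy_sack_split_len() with its
      fixed_check parameter is a hypothetical helper, not the kernel function.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Toy model of the rounding step for the in_sack == false case only.
 * Returns the number of bytes to split off the front of the skb, or 0
 * when the function would return without fragmenting.
 */
static uint32_t toy_sack_split_len(uint32_t skb_seq, uint32_t skb_len,
                                   uint32_t sack_start_seq, uint32_t mss,
                                   bool fixed_check)
{
    uint32_t pkt_len = sack_start_seq - skb_seq;

    if (pkt_len > mss) {
        uint32_t new_len = (pkt_len / mss) * mss;

        if (new_len < pkt_len) {
            new_len += mss;   /* round up to the next MSS boundary */
            /* Old check: new_len > skb_len. New check: new_len >= skb_len. */
            if (fixed_check ? new_len >= skb_len : new_len > skb_len)
                return 0;
        }
        pkt_len = new_len;
    }
    return pkt_len;
}
```

      With skb_seq 0, skb_len 4000, sack_start_seq 3999 and mss 1000, the old
      check yields a split length of 4000 (the whole skb), while the new check
      yields 0, meaning no fragmentation is attempted.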
      
      Fixes: adb92db8 ("tcp: Make SACK code to split only at mss boundaries")
      Reported-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Neal Cardwell <ncardwell@google.com>
      Cc: Eric Dumazet <edumazet@google.com>
      Cc: Yuchung Cheng <ycheng@google.com>
      Cc: Ilpo Jarvinen <ilpo.jarvinen@helsinki.fi>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      
      (cherry picked from commit 2cd0d743)
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
      01d3428d
    • Markus F.X.J. Oberhumer's avatar
      crypto: testmgr - update LZO compression test vectors · e762e28d
      Markus F.X.J. Oberhumer authored
      Update the LZO compression test vectors according to the latest compressor
      version.
      Signed-off-by: Markus F.X.J. Oberhumer <markus@oberhumer.com>
      
      (cherry picked from commit 0ec73820)
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
      e762e28d
    • Roland Dreier's avatar
      x86, ioremap: Speed up check for RAM pages · 5898e552
      Roland Dreier authored
      In __ioremap_caller() (the guts of ioremap), we loop over the range of
      pfns being remapped and check each one individually with page_is_ram().
      For large ioremaps, this can be very slow.  For example, we have a
      device with a 256 GiB PCI BAR, and ioremapping this BAR can take 20+
      seconds -- sometimes long enough to trigger the soft lockup detector!
      
      Internally, page_is_ram() calls walk_system_ram_range() on a single
      page.  Instead, we can make a single call to walk_system_ram_range()
      from __ioremap_caller(), and do our further checks only for any RAM
      pages that we find.  For the common case of MMIO, this saves an enormous
      amount of work, since the range being ioremapped doesn't intersect
      system RAM at all.
      
      With this change, ioremap on our 256 GiB BAR takes less than 1 second.
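
      The difference between the two strategies can be sketched in user space.
      This is not the kernel implementation: struct toy_range and both counting
      functions are hypothetical, and the sketch assumes the RAM ranges do not
      overlap each other.

```c
#include <stddef.h>

/* Half-open pfn range [start, end). */
struct toy_range {
    unsigned long start;
    unsigned long end;
};

/* Per-page approach: one lookup over all RAM ranges for every pfn. */
static unsigned long toy_count_ram_per_page(const struct toy_range *ram,
                                            size_t nr, struct toy_range map)
{
    unsigned long count = 0;

    for (unsigned long pfn = map.start; pfn < map.end; pfn++) {
        for (size_t i = 0; i < nr; i++) {
            if (pfn >= ram[i].start && pfn < ram[i].end) {
                count++;
                break;
            }
        }
    }
    return count;
}

/* Range approach: one pass over the RAM ranges, intersecting each with map. */
static unsigned long toy_count_ram_by_range(const struct toy_range *ram,
                                            size_t nr, struct toy_range map)
{
    unsigned long count = 0;

    for (size_t i = 0; i < nr; i++) {
        unsigned long lo = ram[i].start > map.start ? ram[i].start : map.start;
        unsigned long hi = ram[i].end < map.end ? ram[i].end : map.end;

        if (lo < hi)
            count += hi - lo;
    }
    return count;
}
```

      The per-page version does work proportional to the size of the mapping,
      while the range version does work proportional to the number of RAM
      ranges, which is why a large MMIO mapping that touches no RAM becomes
      cheap.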
      Signed-off-by: Roland Dreier <roland@purestorage.com>
      Link: http://lkml.kernel.org/r/1399054721-1331-1-git-send-email-roland@kernel.org
      Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
      
      (cherry picked from commit c81c8a1e)
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
      5898e552
    • Thomas Gleixner's avatar
      rtmutex: Plug slow unlock race · 26a8b8b7
      Thomas Gleixner authored
      When the rtmutex fast path is enabled the slow unlock function can
      create the following situation:
      
      spin_lock(foo->m->wait_lock);
      foo->m->owner = NULL;
      	    			rt_mutex_lock(foo->m); <-- fast path
      				free = atomic_dec_and_test(foo->refcnt);
      				rt_mutex_unlock(foo->m); <-- fast path
      				if (free)
      				   kfree(foo);
      
      spin_unlock(foo->m->wait_lock); <--- Use after free.
      
      Plug the race by changing the slow unlock to the following scheme:
      
           while (!rt_mutex_has_waiters(m)) {
                   /* Clear the waiters bit in m->owner */
                   clear_rt_mutex_waiters(m);
                   owner = rt_mutex_owner(m);
                   spin_unlock(m->wait_lock);
                   if (cmpxchg(m->owner, owner, 0) == owner)
                           return;
                   spin_lock(m->wait_lock);
           }
      
      So in case of a new waiter incoming while the owner tries the slow
      path unlock we have two situations:
      
       unlock(wait_lock);
      					lock(wait_lock);
       cmpxchg(p, owner, 0) == owner
       	    	   			mark_rt_mutex_waiters(lock);
      	 				acquire(lock);
      
      Or:
      
       unlock(wait_lock);
      					lock(wait_lock);
      	 				mark_rt_mutex_waiters(lock);
       cmpxchg(p, owner, 0) != owner
      					enqueue_waiter();
      					unlock(wait_lock);
       lock(wait_lock);
       wakeup_next_waiter();
       unlock(wait_lock);
      					lock(wait_lock);
      					acquire(lock);
      
      If the fast path is disabled, then the simple
      
         m->owner = NULL;
         unlock(m->wait_lock);
      
      is sufficient as all access to m->owner is serialized via
      m->wait_lock;
      
      Also document and clarify the wakeup_next_waiter function as suggested
      by Oleg Nesterov.
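
      The slow unlock scheme can be approximated with C11 atomics. This is a
      single-file sketch under stated assumptions, not the kernel code: struct
      toy_rtmutex, the atomic_flag spinlock, the separate nr_waiters counter
      and the use of bit 0 as the waiters bit are all illustrative choices.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_HAS_WAITERS ((uintptr_t)1)

struct toy_rtmutex {
    atomic_flag wait_lock;      /* toy spinlock */
    _Atomic uintptr_t owner;    /* owner value, bit 0 = waiters bit */
    int nr_waiters;             /* protected by wait_lock */
};

static void toy_init(struct toy_rtmutex *m, uintptr_t owner, int nr_waiters)
{
    atomic_flag_clear(&m->wait_lock);
    atomic_init(&m->owner, owner);
    m->nr_waiters = nr_waiters;
}

static void toy_spin_lock(atomic_flag *lock)
{
    while (atomic_flag_test_and_set(lock))
        ;   /* spin */
}

static void toy_spin_unlock(atomic_flag *lock)
{
    atomic_flag_clear(lock);
}

/*
 * Returns true if the lock was released with no waiters. Returns false
 * with wait_lock still held when a waiter exists and must be woken.
 */
static bool toy_slow_unlock(struct toy_rtmutex *m)
{
    toy_spin_lock(&m->wait_lock);
    while (m->nr_waiters == 0) {
        uintptr_t owner;

        /* Clear the waiters bit in m->owner */
        atomic_fetch_and(&m->owner, ~TOY_HAS_WAITERS);
        owner = atomic_load(&m->owner);
        toy_spin_unlock(&m->wait_lock);
        /* After a successful exchange, m is never touched again. */
        if (atomic_compare_exchange_strong(&m->owner, &owner, (uintptr_t)0))
            return true;
        toy_spin_lock(&m->wait_lock);
    }
    return false;
}
```

      The point of the ordering is that wait_lock is dropped before the owner
      field is cleared, so a fast-path locker that then frees the structure
      cannot leave the unlocker holding a pointer to freed memory.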
      Reported-by: Steven Rostedt <rostedt@goodmis.org>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Steven Rostedt <rostedt@goodmis.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20140611183852.937945560@linutronix.de
      Cc: stable@vger.kernel.org
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      
      (cherry picked from commit 27e35715)
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
      26a8b8b7
    • Thomas Gleixner's avatar
      rtmutex: Handle deadlock detection smarter · 6c13cf4e
      Thomas Gleixner authored
      Even in the case when deadlock detection is not requested by the
      caller, we can detect deadlocks. Right now the code stops the lock
      chain walk and keeps the waiter enqueued, even on itself. Silly not to
      yell when such a scenario is detected and to keep the waiter enqueued.
      
      Return -EDEADLK unconditionally and handle it at the call sites.
      
      The futex calls return -EDEADLK. The non futex ones dequeue the
      waiter, throw a warning and put the task into a schedule loop.
      
      Tagged for stable as it makes the code more robust.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Brad Mouring <bmouring@ni.com>
      Link: http://lkml.kernel.org/r/20140605152801.836501969@linutronix.de
      Cc: stable@vger.kernel.org
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      
      (cherry picked from commit 3d5c9340)
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
      6c13cf4e
    • Thomas Gleixner's avatar
      rtmutex: Detect changes in the pi lock chain · 4cd98445
      Thomas Gleixner authored
      When we walk the lock chain, we drop all locks after each step. So the
      lock chain can change under us before we reacquire the locks. That's
      harmless in principle as we just follow the wrong lock path. But it
      can lead to a false positive in the deadlock detection logic:
      
      T0 holds L0
      T0 blocks on L1 held by T1
      T1 blocks on L2 held by T2
      T2 blocks on L3 held by T3
      T3 blocks on L4 held by T4
      
      Now we walk the chain
      
      lock T1 -> lock L2 -> adjust L2 -> unlock T1 ->
           lock T2 ->  adjust T2 ->  drop locks
      
      T2 times out and blocks on L0
      
      Now we continue:
      
      lock T2 -> lock L0 -> deadlock detected, but it's not a deadlock at all.
      
      Brad tried to work around that in the deadlock detection logic itself,
      but the more I looked at it the less I liked it, because it's crystal
      ball magic after the fact.
      
      We can actually detect a chain change very simply:
      
      lock T1 -> lock L2 -> adjust L2 -> unlock T1 -> lock T2 -> adjust T2 ->
      
           next_lock = T2->pi_blocked_on->lock;
      
      drop locks
      
      T2 times out and blocks on L0
      
      Now we continue:
      
      lock T2 ->
      
           if (next_lock != T2->pi_blocked_on->lock)
           	   return;
      
      So if we detect that T2 is now blocked on a different lock we stop the
      chain walk. That's also correct in the following scenario:
      
      lock T1 -> lock L2 -> adjust L2 -> unlock T1 -> lock T2 -> adjust T2 ->
      
           next_lock = T2->pi_blocked_on->lock;
      
      drop locks
      
      T3 times out and drops L3
      T2 acquires L3 and blocks on L4 now
      
      Now we continue:
      
      lock T2 ->
      
           if (next_lock != T2->pi_blocked_on->lock)
           	   return;
      
      We don't have to follow up the chain at that point, because T2
      propagated our priority up to T4 already.
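
      The check itself is small enough to model directly. The sketch below uses
      hypothetical toy_lock and toy_task types in place of the kernel
      structures and shows only the record-then-compare step, not the chain
      walk around it.

```c
#include <stdbool.h>
#include <stddef.h>

struct toy_lock {
    int id;
};

struct toy_task {
    struct toy_lock *blocked_on;   /* lock this task waits for, or NULL */
};

/* Remember which lock the task waits on before all locks are dropped. */
static struct toy_lock *toy_record_next_lock(const struct toy_task *task)
{
    return task->blocked_on;
}

/* After reacquiring: continue the walk only if the task still waits on it. */
static bool toy_chain_unchanged(const struct toy_task *task,
                                const struct toy_lock *next_lock)
{
    return task->blocked_on == next_lock;
}
```

      In the first scenario above, T2 is recorded as blocked on L3 and is later
      found blocked on L0, so toy_chain_unchanged() would return false and the
      walk would stop instead of reporting a deadlock.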
      
      [ Folded a cleanup patch from peterz ]
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reported-by: Brad Mouring <bmouring@ni.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20140605152801.930031935@linutronix.de
      Cc: stable@vger.kernel.org
      
      (cherry picked from commit 82084984)
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
      4cd98445
    • Thomas Gleixner's avatar
      rtmutex: Fix deadlock detector for real · 8512d8a2
      Thomas Gleixner authored
      The current deadlock detection logic does not work reliably due to the
      following early exit path:
      
      	/*
      	 * Drop out, when the task has no waiters. Note,
      	 * top_waiter can be NULL, when we are in the deboosting
      	 * mode!
      	 */
      	if (top_waiter && (!task_has_pi_waiters(task) ||
      			   top_waiter != task_top_pi_waiter(task)))
      		goto out_unlock_pi;
      
      So this not only exits when the task has no waiters, it also exits
      unconditionally when the current waiter is not the top priority waiter
      of the task.
      
      So in a nested locking scenario, it might abort the lock chain walk
      and therefore miss a potential deadlock.
      
      Simple fix: Continue the chain walk, when deadlock detection is
      enabled.
      
      We also avoid the whole enqueue if we detect the deadlock right away
      (A-A). It's an optimization, but it also prevents another waiter, who
      comes in after the detection and before the task has undone the damage,
      from observing the situation, detecting the deadlock, and returning
      -EDEADLOCK, which is wrong as the other task is not in a deadlock
      situation.
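
      The effect of the early exit can be expressed as two predicates. This is
      a sketch, not the patch: the function names and boolean parameters are
      hypothetical, and the new predicate assumes the fix keeps the no-waiters
      exit and only skips the top-waiter exit when deadlock detection is
      enabled.

```c
#include <stdbool.h>

/* Old condition quoted above: true means "stop the chain walk". */
static bool toy_stop_walk_old(bool have_top_waiter, bool task_has_waiters,
                              bool is_task_top_waiter)
{
    return have_top_waiter && (!task_has_waiters || !is_task_top_waiter);
}

/* Sketch of the described fix: keep walking when deadlock detection is on. */
static bool toy_stop_walk_new(bool have_top_waiter, bool task_has_waiters,
                              bool is_task_top_waiter, bool detect_deadlock)
{
    if (!have_top_waiter)
        return false;
    if (!task_has_waiters)
        return true;
    return !detect_deadlock && !is_task_top_waiter;
}
```

      For a waiter that is not the task's top priority waiter, the old
      predicate always stops the walk, while the new one stops only when
      deadlock detection is off.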
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Reviewed-by: Steven Rostedt <rostedt@goodmis.org>
      Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
      Cc: stable@vger.kernel.org
      Link: http://lkml.kernel.org/r/20140522031949.725272460@linutronix.de
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      
      (cherry picked from commit 397335f0)
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
      8512d8a2
    • Steven Rostedt (Red Hat)'s avatar
      tracing: Remove ftrace_stop/start() from reading the trace file · a5f3acde
      Steven Rostedt (Red Hat) authored
      Disabling reading and writing to the trace file should not be able to
      disable all function tracing callbacks. There are other users today
      (like kprobes and perf). Reading a trace file should not stop those
      from happening.
      
      Cc: stable@vger.kernel.org # 3.0+
      Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
      
      (cherry picked from commit 099ed151)
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
      a5f3acde
    • Christian König's avatar
      drm/radeon: stop poisoning the GART TLB · fb912114
      Christian König authored
      When we set the valid bit on invalid GART entries they are
      loaded into the TLB when an adjacent entry is loaded. This
      poisons the TLB with invalid entries which are sometimes
      not correctly removed on TLB flush.
      
      For stable inclusion the patch probably needs to be modified a bit.
      Signed-off-by: Christian König <christian.koenig@amd.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
      
      (cherry picked from commit 0986c1a5)
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
      fb912114
    • Theodore Ts'o's avatar
      ext4: clarify error count warning messages · a8ee92c7
      Theodore Ts'o authored
      Make it clear that the values printed are times, and that the count is
      of errors since the last fsck. Also add a note about the fsck version
      required.
      Signed-off-by: Pavel Machek <pavel@ucw.cz>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Reviewed-by: Andreas Dilger <adilger@dilger.ca>
      Cc: stable@vger.kernel.org
      
      (cherry picked from commit ae0f78de)
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
      a8ee92c7
    • Anton Blanchard's avatar
      powerpc/perf: Never program book3s PMCs with values >= 0x80000000 · 7d767325
      Anton Blanchard authored
      We are seeing a lot of PMU warnings on POWER8:
      
          Can't find PMC that caused IRQ
      
      Looking closer, the active PMC is 0 at this point and we took a PMU
      exception on the transition from negative to 0. Some versions of POWER8
      have an issue where they edge detect and not level detect PMC overflows.
      
      A number of places program the PMC with (0x80000000 - period_left),
      where period_left can be negative. We can either fix all of these or
      just ensure that period_left is always >= 1.
      
      This patch takes the second option.
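
      The second option amounts to clamping period_left before computing the
      value written to the counter. The helper below is a hypothetical
      user-space sketch of that arithmetic, not the powerpc perf code; the name
      toy_pmc_start_value() and the choice to return 0 for periods of
      0x80000000 or more are assumptions.

```c
#include <stdint.h>

/* Compute the start value for a counter that overflows at 0x80000000. */
static uint32_t toy_pmc_start_value(int64_t period_left)
{
    uint32_t val = 0;

    /* Ensure period_left is always >= 1 so val stays below 0x80000000. */
    if (period_left < 1)
        period_left = 1;
    if (period_left < 0x80000000LL)
        val = (uint32_t)(0x80000000LL - period_left);
    return val;
}
```

      Without the clamp, a period_left of 0 or less would give a start value at
      or above 0x80000000, which is the case the commit title rules out.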
      
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Anton Blanchard <anton@samba.org>
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      
      (cherry picked from commit f5602941)
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
      7d767325
    • Axel Lin's avatar
      hwmon: (adm1029) Ensure the fan_div cache is updated in set_fan_div · aeec260f
      Axel Lin authored
      Writing to fanX_div does not clear the cache. As a result, reading
      from fanX_div may return the old value for up to two seconds
      after writing a new value.
      
      This patch ensures the fan_div cache is updated in set_fan_div().
      Reported-by: Guenter Roeck <linux@roeck-us.net>
      Signed-off-by: Axel Lin <axel.lin@ingics.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Guenter Roeck <linux@roeck-us.net>
      
      (cherry picked from commit 1035a9e3)
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
      aeec260f
    • Axel Lin's avatar
      hwmon: (amc6821) Fix permissions for temp2_input · 33e69f80
      Axel Lin authored
      temp2_input should not be writable, fix it.
      Reported-by: Guenter Roeck <linux@roeck-us.net>
      Signed-off-by: Axel Lin <axel.lin@ingics.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Guenter Roeck <linux@roeck-us.net>
      
      (cherry picked from commit df86754b)
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
      33e69f80
    • Gu Zheng's avatar
      cpuset,mempolicy: fix sleeping function called from invalid context · 0a56cef1
      Gu Zheng authored
      When running with the kernel (3.15-rc7+), the following bug occurs:
      [ 9969.258987] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:586
      [ 9969.359906] in_atomic(): 1, irqs_disabled(): 0, pid: 160655, name: python
      [ 9969.441175] INFO: lockdep is turned off.
      [ 9969.488184] CPU: 26 PID: 160655 Comm: python Tainted: G       A      3.15.0-rc7+ #85
      [ 9969.581032] Hardware name: FUJITSU-SV PRIMEQUEST 1800E/SB, BIOS PRIMEQUEST 1000 Series BIOS Version 1.39 11/16/2012
      [ 9969.706052]  ffffffff81a20e60 ffff8803e941fbd0 ffffffff8162f523 ffff8803e941fd18
      [ 9969.795323]  ffff8803e941fbe0 ffffffff8109995a ffff8803e941fc58 ffffffff81633e6c
      [ 9969.884710]  ffffffff811ba5dc ffff880405c6b480 ffff88041fdd90a0 0000000000002000
      [ 9969.974071] Call Trace:
      [ 9970.003403]  [<ffffffff8162f523>] dump_stack+0x4d/0x66
      [ 9970.065074]  [<ffffffff8109995a>] __might_sleep+0xfa/0x130
      [ 9970.130743]  [<ffffffff81633e6c>] mutex_lock_nested+0x3c/0x4f0
      [ 9970.200638]  [<ffffffff811ba5dc>] ? kmem_cache_alloc+0x1bc/0x210
      [ 9970.272610]  [<ffffffff81105807>] cpuset_mems_allowed+0x27/0x140
      [ 9970.344584]  [<ffffffff811b1303>] ? __mpol_dup+0x63/0x150
      [ 9970.409282]  [<ffffffff811b1385>] __mpol_dup+0xe5/0x150
      [ 9970.471897]  [<ffffffff811b1303>] ? __mpol_dup+0x63/0x150
      [ 9970.536585]  [<ffffffff81068c86>] ? copy_process.part.23+0x606/0x1d40
      [ 9970.613763]  [<ffffffff810bf28d>] ? trace_hardirqs_on+0xd/0x10
      [ 9970.683660]  [<ffffffff810ddddf>] ? monotonic_to_bootbased+0x2f/0x50
      [ 9970.759795]  [<ffffffff81068cf0>] copy_process.part.23+0x670/0x1d40
      [ 9970.834885]  [<ffffffff8106a598>] do_fork+0xd8/0x380
      [ 9970.894375]  [<ffffffff81110e4c>] ? __audit_syscall_entry+0x9c/0xf0
      [ 9970.969470]  [<ffffffff8106a8c6>] SyS_clone+0x16/0x20
      [ 9971.030011]  [<ffffffff81642009>] stub_clone+0x69/0x90
      [ 9971.091573]  [<ffffffff81641c29>] ? system_call_fastpath+0x16/0x1b
      
      The cause is that cpuset_mems_allowed() tries to take
      mutex_lock(&callback_mutex) under the rcu_read_lock (which is held in
      __mpol_dup()). In cpuset_mems_allowed(), the access to the cpuset is
      already under rcu_read_lock, so in __mpol_dup() we can reduce the
      rcu_read_lock protection region to cover only the access to the cpuset
      in current_cpuset_is_being_rebound(). That avoids this bug.
      
      This patch is a temporary solution that just addresses the bug
      mentioned above; it cannot fix the long-standing issue about cpuset.mems
      rebinding on fork():
      
      "When the forker's task_struct is duplicated (which includes
       ->mems_allowed) and it races with an update to cpuset_being_rebound
       in update_tasks_nodemask() then the task's mems_allowed doesn't get
       updated. And the child task's mems_allowed can be wrong if the
       cpuset's nodemask changes before the child has been added to the
       cgroup's tasklist."
      Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
      Acked-by: Li Zefan <lizefan@huawei.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: stable <stable@vger.kernel.org>
      
      (cherry picked from commit 391acf97)
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
      0a56cef1
    • Bert Vermeulen's avatar
      USB: ftdi_sio: Add extra PID. · 1a80b45c
      Bert Vermeulen authored
      This patch adds PID 0x0003 to the VID 0x128d (Testo). At least the
      Testo 435-4 uses this, likely other gear as well.
      Signed-off-by: Bert Vermeulen <bert@biot.com>
      Cc: Johan Hovold <johan@kernel.org>
      Cc: stable <stable@vger.kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      
      (cherry picked from commit 5a7fbe7e)
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
      1a80b45c
    • Andras Kovacs's avatar
      USB: cp210x: add support for Corsair usb dongle · 5da8328e
      Andras Kovacs authored
      Corsair USB dongles are shipped with Corsair AXi series PSUs.
      These are cp210x serial USB devices, so make the driver detect them.
      I have a program that can get information from these PSUs.
      
      Tested with 2 different dongles shipped with Corsair AX860i and
      AX1200i units.
      Signed-off-by: Andras Kovacs <andras@sth.sze.hu>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Johan Hovold <johan@kernel.org>
      
      (cherry picked from commit b9326057)
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
      5da8328e
    • Bernd Wachter's avatar
      usb: option: Add ID for Telewell TW-LTE 4G v2 · 9f7a1f77
      Bernd Wachter authored
      Add the ID of the Telewell 4G v2 hardware to the option driver to get
      the legacy serial interface working.
      Signed-off-by: Bernd Wachter <bernd.wachter@jolla.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Johan Hovold <johan@kernel.org>
      
      (cherry picked from commit 3d28bd84)
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
      9f7a1f77
    • Gustavo Maciel Dias Vieira's avatar
      ACPI video: ignore BIOS backlight value for HP dm4 · 7a35ec4d
      Gustavo Maciel Dias Vieira authored
      On an HP Pavilion dm4 laptop the BIOS sets minimum backlight on boot,
      completely dimming the screen. Ignore this initial value for this
      machine.
      Signed-off-by: Gustavo Maciel Dias Vieira <gustavo@sagui.org>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      
      (cherry picked from commit 771d09b3)
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
      7a35ec4d