1. 27 Apr, 2017 24 commits
    • device-dax: switch to srcu, fix rcu_read_lock() vs pte allocation · 9254ada0
      Dan Williams authored
      commit 956a4cd2 upstream.
      
      The following warning triggers with a new unit test that stresses the
      device-dax interface.
      
       ===============================
       [ ERR: suspicious RCU usage.  ]
       4.11.0-rc4+ #1049 Tainted: G           O
       -------------------------------
       ./include/linux/rcupdate.h:521 Illegal context switch in RCU read-side critical section!
      
       other info that might help us debug this:
      
       rcu_scheduler_active = 2, debug_locks = 0
       2 locks held by fio/9070:
        #0:  (&mm->mmap_sem){++++++}, at: [<ffffffff8d0739d7>] __do_page_fault+0x167/0x4f0
        #1:  (rcu_read_lock){......}, at: [<ffffffffc03fbd02>] dax_dev_huge_fault+0x32/0x620 [dax]
      
       Call Trace:
        dump_stack+0x86/0xc3
        lockdep_rcu_suspicious+0xd7/0x110
        ___might_sleep+0xac/0x250
        __might_sleep+0x4a/0x80
        __alloc_pages_nodemask+0x23a/0x360
        alloc_pages_current+0xa1/0x1f0
        pte_alloc_one+0x17/0x80
        __pte_alloc+0x1e/0x120
        __get_locked_pte+0x1bf/0x1d0
        insert_pfn.isra.70+0x3a/0x100
        ? lookup_memtype+0xa6/0xd0
        vm_insert_mixed+0x64/0x90
        dax_dev_huge_fault+0x520/0x620 [dax]
        ? dax_dev_huge_fault+0x32/0x620 [dax]
        dax_dev_fault+0x10/0x20 [dax]
        __do_fault+0x1e/0x140
        __handle_mm_fault+0x9af/0x10d0
        handle_mm_fault+0x16d/0x370
        ? handle_mm_fault+0x47/0x370
        __do_page_fault+0x28c/0x4f0
        trace_do_page_fault+0x58/0x2a0
        do_async_page_fault+0x1a/0xa0
        async_page_fault+0x28/0x30
      
      Inserting a page table entry may trigger an allocation while we are
      holding a read lock to keep the device instance alive for the duration
      of the fault. Use srcu for this keep-alive protection.
      
      Fixes: dee41079 ("/dev/dax, core: file operations and dax-mmap")
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • x86/mce/AMD: Give a name to MCA bank 3 when accessed with legacy MSRs · 7d1c1be6
      Yazen Ghannam authored
      commit 29f72ce3 upstream.
      
      MCA bank 3 is reserved on systems pre-Fam17h, so it didn't have a name.
      However, MCA bank 3 is defined on Fam17h systems and can be accessed
      using legacy MSRs. Without a name we get a stack trace on Fam17h systems
      when trying to register sysfs files for bank 3 on kernels that don't
      recognize Scalable MCA.
      
      Call MCA bank 3 "decode_unit" since this is what it represents on
      Fam17h. This will allow kernels without SMCA support to see this bank on
      Fam17h+ and prevent the stack trace. This will not affect older systems
      since this bank is reserved on them, i.e. it'll be ignored.
      
      Tested on AMD Fam15h and Fam17h systems.
      
        WARNING: CPU: 26 PID: 1 at lib/kobject.c:210 kobject_add_internal
        kobject: (ffff88085bb256c0): attempted to be registered with empty name!
        ...
        Call Trace:
         kobject_add_internal
         kobject_add
         kobject_create_and_add
         threshold_create_device
         threshold_init_device
      Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Link: http://lkml.kernel.org/r/1490102285-3659-1-git-send-email-Yazen.Ghannam@amd.com
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • powerpc/kprobe: Fix oops when kprobed on 'stdu' instruction · 1136723a
      Ravi Bangoria authored
      commit 9e1ba4f2 upstream.
      
      If we set a kprobe on a 'stdu' instruction on powerpc64, we see a kernel
      OOPS:
      
        Bad kernel stack pointer cd93c840 at c000000000009868
        Oops: Bad kernel stack pointer, sig: 6 [#1]
        ...
        GPR00: c000001fcd93cb30 00000000cd93c840 c0000000015c5e00 00000000cd93c840
        ...
        NIP [c000000000009868] resume_kernel+0x2c/0x58
        LR [c000000000006208] program_check_common+0x108/0x180
      
      On a 64-bit system when the user probes on a 'stdu' instruction, the kernel does
      not emulate actual store in emulate_step() because it may corrupt the exception
      frame. So the kernel does the actual store operation in exception return code
      i.e. resume_kernel().
      
      resume_kernel() loads the saved stack pointer from memory using lwz, which only
      loads the low 32-bits of the address, causing the kernel crash.
      
      Fix this by loading the 64-bit value instead.
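
      The truncation can be illustrated in plain userspace C. This is only a
      sketch of the arithmetic, not the exception-return assembly: the two
      function names are hypothetical, and the full pointer value used in the
      usage note (0xc000001fcd93c840) is an assumption pieced together from
      the register dump above.

```c
#include <stdint.h>

/* Models a 32-bit load such as lwz: only the low word survives, zero-extended. */
uint64_t load_word_zero_extended(uint64_t saved)
{
	return (uint32_t)saved;
}

/* Models a 64-bit load such as ld: the full value is preserved. */
uint64_t load_doubleword(uint64_t saved)
{
	return saved;
}
```

      Under that assumption, load_word_zero_extended(0xc000001fcd93c840ULL)
      yields 0x00000000cd93c840, which matches the bad stack pointer in the
      oops, while load_doubleword() returns the value unchanged.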
      
      Fixes: be96f633 ("powerpc: Split out instruction analysis part of emulate_step()")
      Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
      Reviewed-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
      Reviewed-by: Ananth N Mavinakayanahalli <ananth@linux.vnet.ibm.com>
      [mpe: Change log massage, add stable tag]
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ubi/upd: Always flush after prepared for an update · a6db4334
      Sebastian Siewior authored
      commit 9cd9a21c upstream.
      
      In commit 6afaf8a4 ("UBI: flush wl before clearing update marker") I
      managed to trigger and fix a similar bug. Here is another variant, which
      I assumed back then would not matter, but it turns out UBI has a check
      for it and will error out like this:
      
      |ubi0 warning: validate_vid_hdr: inconsistent used_ebs
      |ubi0 error: validate_vid_hdr: inconsistent VID header at PEB 592
      
      All you need to trigger this is "ubiupdatevol /dev/ubi0_0 file" plus a
      power cut in the middle of the operation.
      ubi_start_update() sets the update-marker and puts all EBs on the erase
      list. After that userland can proceed to write new data while the old EBs
      aren't erased completely. A power cut at this point is usually not that
      much of a tragedy. UBI won't give read access to the static volume
      because it has the update marker. It will most likely set the corrupted
      flag because it misses some EBs.
      So we are all good, unless the size of the image that has been written
      differs from the old image by at least one EB. In that
      case UBI will find two different values for `used_ebs' and refuse to
      attach the image with the error message mentioned above.
      
      To avoid this situation, the patch ensures that we wait until
      everything is erased before any data is written.
      The alternative would be to detect such a case and remove all EBs at
      attach time, after we have processed the volume table and seen the
      update-marker set. That patch looks bigger and I doubt it is worth it,
      since the write() will usually wait from time to time for a new EB
      anyway, as there are usually not that many spare EBs available.
      Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Signed-off-by: Richard Weinberger <richard@nod.at>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • x86/mce: Make the MCE notifier a blocking one · a32ff3f0
      Vishal Verma authored
      commit 0dc9c639 upstream.
      
      The NFIT MCE handler callback (for handling media errors on NVDIMMs)
      takes a mutex to add the location of a memory error to a list. But since
      the notifier call chain for machine checks (x86_mce_decoder_chain) is
      atomic, we get a lockdep splat like:
      
        BUG: sleeping function called from invalid context at kernel/locking/mutex.c:620
        in_atomic(): 1, irqs_disabled(): 0, pid: 4, name: kworker/0:0
        [..]
        Call Trace:
         dump_stack
         ___might_sleep
         __might_sleep
         mutex_lock_nested
         ? __lock_acquire
         nfit_handle_mce
         notifier_call_chain
         atomic_notifier_call_chain
         ? atomic_notifier_call_chain
         mce_gen_pool_process
      
      Convert the notifier to a blocking one which gets to run only in process
      context.
      
      Boris: remove the notifier call in atomic context in print_mce(). For
      now, let's print the MCE on the atomic path so that we can make sure
      they go out and get logged at least.
      
      Fixes: 6839a6d9 ("nfit: do an ARS scrub on hitting a latent media error")
      Reported-by: Ross Zwisler <ross.zwisler@linux.intel.com>
      Signed-off-by: Vishal Verma <vishal.l.verma@intel.com>
      Acked-by: Tony Luck <tony.luck@intel.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: linux-edac <linux-edac@vger.kernel.org>
      Cc: x86-ml <x86@kernel.org>
      Link: http://lkml.kernel.org/r/20170411224457.24777-1-vishal.l.verma@intel.com
      Signed-off-by: Borislav Petkov <bp@suse.de>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • mac80211: fix MU-MIMO follow-MAC mode · c77e7d37
      Johannes Berg authored
      commit 9e478066 upstream.
      
      There are two bugs in the follow-MAC code:
       * it treats the radiotap header as the 802.11 header
         (therefore it can't possibly work)
       * it doesn't verify that the skb data it accesses is actually
         present in the header, which is mitigated by the first point
      
      Fix this by moving all of this out into a separate function.
      This function copies the data it needs using skb_copy_bits()
      to make sure it can be accessed if it's paged, and offsets
      that by the possibly present vendor radiotap header.
      
      This also makes all those conditions more readable.
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • mac80211: reject ToDS broadcast data frames · ee9b4899
      Johannes Berg authored
      commit 3018e947 upstream.
      
      AP/AP_VLAN modes don't accept any real 802.11 multicast data
      frames, but since they do need to accept broadcast management
      frames the same is currently permitted for data frames. This
      opens a security problem because such frames would be decrypted
      with the GTK, and could even contain unicast L3 frames.
      
      Since the spec says that ToDS frames must always have the BSSID
      as the RA (addr1), reject any other data frames.
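
      The rule stated above can be sketched in userspace C. This is a
      simplified illustration, not the mac80211 code: struct hdr_sketch and
      ap_accepts_frame() are hypothetical names, the frame control field is
      assumed to be in host byte order, and only the ToDS data-frame case is
      modelled. The three bit masks are the standard 802.11 frame control
      values.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define FCTL_FTYPE 0x000c	/* frame type mask */
#define FTYPE_DATA 0x0008	/* data frame */
#define FCTL_TODS  0x0100	/* To DS bit */

/* Hypothetical, simplified header: only the fields used here. */
struct hdr_sketch {
	uint16_t frame_control;	/* host byte order in this sketch */
	uint8_t addr1[6];	/* receiver address (RA) */
};

/* Returns true if an AP with the given BSSID should accept the frame. */
bool ap_accepts_frame(const struct hdr_sketch *hdr, const uint8_t bssid[6])
{
	bool is_data = (hdr->frame_control & FCTL_FTYPE) == FTYPE_DATA;
	bool to_ds = (hdr->frame_control & FCTL_TODS) != 0;

	/* ToDS data frames must carry the BSSID as addr1. */
	if (is_data && to_ds)
		return memcmp(hdr->addr1, bssid, 6) == 0;

	return true;	/* other frame types are outside this sketch */
}
```

      With this check a ToDS data frame addressed to ff:ff:ff:ff:ff:ff is
      dropped, while a broadcast management frame still passes.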
      
      The problem was originally reported in "Predicting, Decrypting,
      and Abusing WPA2/802.11 Group Keys" at usenix
      https://www.usenix.org/conference/usenixsecurity16/technical-sessions/presentation/vanhoef
      and brought to my attention by Jouni.
      Reported-by: Jouni Malinen <j@w1.fi>
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ubifs: Fix O_TMPFILE corner case in ubifs_link() · 71a3e367
      Richard Weinberger authored
      commit 32fe905c upstream.
      
      It is perfectly fine to link a tmpfile back using linkat().
      Since tmpfiles are created with a link count of 0, they appear
      on the orphan list. Upon re-linking, the inode has to be removed
      from the orphan list again.
      
      Ralph faced a filesystem corruption in combination with overlayfs
      due to this bug.
      
      Cc: Ralph Sennhauser <ralph.sennhauser@gmail.com>
      Cc: Amir Goldstein <amir73il@gmail.com>
      Reported-by: Ralph Sennhauser <ralph.sennhauser@gmail.com>
      Tested-by: Ralph Sennhauser <ralph.sennhauser@gmail.com>
      Reported-by: Amir Goldstein <amir73il@gmail.com>
      Fixes: 474b9370 ("ubifs: Implement O_TMPFILE")
      Signed-off-by: Richard Weinberger <richard@nod.at>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ubifs: Fix RENAME_WHITEOUT support · c1cadf6a
      Felix Fietkau authored
      commit c3d9fda6 upstream.
      
      Remove faulty leftover check in do_rename(), apparently introduced in a
      merge that combined whiteout support changes with commit f03b8ad8
      ("fs: support RENAME_NOREPLACE for local filesystems")
      
      Fixes: f03b8ad8 ("fs: support RENAME_NOREPLACE for local filesystems")
      Fixes: 9e0a1fff ("ubifs: Implement RENAME_WHITEOUT")
      Signed-off-by: Felix Fietkau <nbd@nbd.name>
      Signed-off-by: Richard Weinberger <richard@nod.at>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • mmc: sdhci-esdhc-imx: increase the pad I/O drive strength for DDR50 card · 27456652
      Haibo Chen authored
      commit 9f327845 upstream.
      
      Currently, DDR50 cards need tuning by default. We see tuning failures
      for DDR50 cards, and some data CRC errors when a DDR50 SD card is in use.
      
      This is because the default pad I/O drive strength cannot guarantee that
      a DDR50 card works stably. So increase the pad I/O drive strength for
      DDR50 cards, and use pins_100mhz.
      
      This fixes DDR50 card support for IMX since DDR50 tuning was enabled from
      commit 9faac7b9 ("mmc: sdhci: enable tuning for DDR50")
      Tested-and-reported-by: Tim Harvey <tharvey@gateworks.com>
      Signed-off-by: Haibo Chen <haibo.chen@nxp.com>
      Acked-by: Dong Aisheng <aisheng.dong@nxp.com>
      Acked-by: Adrian Hunter <adrian.hunter@intel.com>
      Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • mmc: dw_mmc: Don't allow Runtime PM for SDIO cards · b478c19f
      Douglas Anderson authored
      commit a6db2c86 upstream.
      
      According to the SDIO standard interrupts are normally signalled in a
      very complicated way.  They require the card clock to be running and
      require the controller to be paying close attention to the signals
      coming from the card.  This simply can't happen with the clock stopped
      or with the controller in a low power mode.
      
      To that end, we'll disable runtime_pm when we detect that an SDIO card
      was inserted.  This is much like what we do with the special
      "SDMMC_CLKEN_LOW_PWR" bit that dw_mmc supports.
      
      NOTE: we specifically do this Runtime PM disabling at card init time
      rather than in the enable_sdio_irq() callback.  This is _different_
      than how SDHCI does it.  Why do we do it differently?
      
      - Unlike SDHCI, dw_mmc uses the standard sdio_irq code in Linux (AKA
        dw_mmc doesn't set MMC_CAP2_SDIO_IRQ_NOTHREAD).
      - Because we use the standard sdio_irq code:
        - We see a constant stream of enable_sdio_irq(0) and
          enable_sdio_irq(1) calls.  This is because the standard code
          disables interrupts while processing and re-enables them after.
        - While interrupts are disabled, there's technically a period where
          we could get runtime disabled while processing interrupts.
        - If we are runtime disabled while processing interrupts, we'll
          reset the controller at resume time (see dw_mci_runtime_resume),
          which seems like a terrible idea because we could possibly have
          another interrupt pending.
      
      To fix the above issues we'd want to put something in the standard
      sdio_irq code that makes sure to call pm_runtime get/put when
      interrupts are actively being processed.  That's possible to do,
      but it seems like a more complicated mechanism when we really just
      want the runtime pm disabled always for SDIO cards given that all the
      other bits needed to get Runtime PM vs. SDIO just aren't there.
      
      NOTE: at some point in time someone might come up with a fancy way to
      do SDIO interrupts and still allow (some) amount of runtime PM.
      Technically we could turn off the card clock if we used an alternate
      way of signaling SDIO interrupts (an out-of-band interrupt is one way
      to do this).  We probably wouldn't actually want to fully runtime
      suspend in this case though--at least not with the current
      dw_mci_runtime_resume() which basically fully resets the controller at
      resume time.
      
      Fixes: e9ed8835 ("mmc: dw_mmc: add runtime PM callback")
      Reported-by: Brian Norris <briannorris@chromium.org>
      Signed-off-by: Douglas Anderson <dianders@chromium.org>
      Acked-by: Jaehoon Chung <jh80.chung@samsung.com>
      Reviewed-by: Shawn Lin <shawn.lin@rock-chips.com>
      Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ACPI / power: Avoid maybe-uninitialized warning · 9b02ecd1
      Arnd Bergmann authored
      commit fe8c470a upstream.
      
      gcc -O2 cannot always prove that the loop in acpi_power_get_inferred_state()
      is entered at least once, so it assumes that cur_state might not get
      initialized:
      
      drivers/acpi/power.c: In function 'acpi_power_get_inferred_state':
      drivers/acpi/power.c:222:9: error: 'cur_state' may be used uninitialized in this function [-Werror=maybe-uninitialized]
      
      This sets the variable to zero at the start of the loop, to ensure that
      there is well-defined behavior even for an empty list. This gets rid of
      the warning.
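
      The pattern can be sketched in userspace C. This is an illustration of
      the idea only, not drivers/acpi/power.c: list_state() and its array
      argument are hypothetical stand-ins for the kernel's list walk.

```c
#include <stddef.h>

/*
 * Hypothetical stand-in for a "list state" helper. Returns the last
 * state seen, which is nonzero only if every entry is on.
 */
int list_state(const int *entries, size_t n)
{
	int cur_state = 0;	/* defined even when the loop body never runs */
	size_t i;

	for (i = 0; i < n; i++) {
		cur_state = entries[i];
		if (!cur_state)
			break;	/* one "off" entry decides the result */
	}
	return cur_state;
}
```

      Without the initializer the compiler has to assume that n may be zero,
      in which case the return statement would read an indeterminate value.
      With it, an empty list simply reports 0.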
      
      The warning first showed up when the -Os flag got removed in a bug fix
      patch in linux-4.11-rc5.
      
      I would suggest merging this addon patch on top of that bug fix to avoid
      introducing a new warning in the stable kernels.
      
      Fixes: 61b79e16 (ACPI: Fix incompatibility with mcount-based function graph tracing)
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • Input: elantech - add Fujitsu Lifebook E547 to force crc_enabled · 7010e15d
      Thorsten Leemhuis authored
      commit 704de489 upstream.
      
      I temporarily got a Lifebook E547 into my hands and noticed the touchpad
      only works after running:
      
      	echo "1" > /sys/devices/platform/i8042/serio2/crc_enabled
      
      Add it to the list of machines that need this workaround.
      Signed-off-by: Thorsten Leemhuis <linux@leemhuis.info>
      Reviewed-by: Ulrik De Bie <ulrik.debie-os@e2big.org>
      Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • s390/mm: fix CMMA vs KSM vs others · 0cb760df
      Christian Borntraeger authored
      commit a8f60d1f upstream.
      
      On heavy paging with KSM I see guest data corruption. It turns out that
      KSM will add pages to its tree where the mapping returns true for
      pte_unused (or might become so later).  KSM will unmap such pages
      and reinstantiate them with different attributes (e.g. write protected
      or special, e.g. in replace_page or write_protect_page). This uncovered
      a bug in our pagetable handling: We must remove the unused flag as
      soon as an entry becomes present again.
      Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • mmc: dw_mmc: silent verbose log when calling from PM context · 71766b91
      Shawn Lin authored
      commit ce69e2fe upstream.
      
      When deploying runtime PM, it's quite verbose to print the
      log of the ios settings. It's also useless to print it from system
      PM, as it should be the same as at boot time. We also have
      sysfs to get all this information from the ios attribute, so let's
      skip this print from PM context.
      Signed-off-by: Shawn Lin <shawn.lin@rock-chips.com>
      Signed-off-by: Jaehoon Chung <jh80.chung@samsung.com>
      Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
      Cc: Alexander Kochetkov <al.kochet@gmail.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • CIFS: remove bad_network_name flag · 9f829677
      Germano Percossi authored
      commit a0918f1c upstream.
      
      STATUS_BAD_NETWORK_NAME can be received during node failover,
      causing the flag to be set and making the reconnect thread
      always unsuccessful, thereafter.
      
      Once the only place where it is set is removed, the remaining
      bits are rendered moot.
      
      Removing it does not prevent "mount" from failing when a
      nonexistent share is passed.
      
      What happens when the share really ceases to exist while the
      share is mounted is undefined now as much as it was before.
      Signed-off-by: Germano Percossi <germano.percossi@citrix.com>
      Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
      Signed-off-by: Steve French <smfrench@gmail.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • cifs: Do not send echoes before Negotiate is complete · 5cd77ebf
      Sachin Prabhu authored
      commit 62a6cfdd upstream.
      
      commit 4fcd1813 ("Fix reconnect to not defer smb3 session reconnect
      long after socket reconnect") added support for Negotiate requests to
      be initiated by echo calls.
      
      To avoid delays in calling echo after a reconnect, I added the patch
      introduced by the commit b8c60012 ("Call echo service immediately
      after socket reconnect").
      
      This has however caused a regression with cifs shares which do not have
      support for echo calls to trigger Negotiate requests. On connections
      which need to call Negotiate, the echo calls trigger an error which
      triggers a reconnect which in turn triggers another echo call. This
      results in a loop which is only broken when an operation is performed on
      the cifs share. For an idle share, it can DoS a server.
      
      The patch uses the smb_operation can_echo() for cifs so that echo is
      called only if the connection has already been set up.
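
      The gating idea can be sketched in userspace C. The enum values and
      function names below are hypothetical and much simpler than the real
      connection state machine. The sketch only shows that an echo is
      suppressed until the connection is fully set up, which is what breaks
      the echo/error/reconnect loop.

```c
#include <stdbool.h>

/* Hypothetical connection states for this sketch. */
enum conn_state { CONN_NEW, CONN_NEED_NEGOTIATE, CONN_GOOD };

/* An echo may only be sent once the connection is fully set up. */
bool can_echo(enum conn_state state)
{
	return state == CONN_GOOD;
}

/* Counts how many echoes a periodic timer would send over n ticks. */
int echoes_sent(const enum conn_state *states, int n)
{
	int sent = 0;
	int i;

	for (i = 0; i < n; i++)
		if (can_echo(states[i]))
			sent++;
	return sent;
}
```

      For a connection that stays in CONN_NEED_NEGOTIATE, echoes_sent()
      returns 0, so an idle share no longer generates traffic.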
      
      kernel bz: 194531
      Signed-off-by: Sachin Prabhu <sprabhu@redhat.com>
      Tested-by: Jonathan Liu <net147@gmail.com>
      Acked-by: Pavel Shilovsky <pshilov@microsoft.com>
      Signed-off-by: Steve French <smfrench@gmail.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • mm: prevent NR_ISOLATE_* stats from going negative · 63ad4051
      Rabin Vincent authored
      commit fc280fe8 upstream.
      
      Commit 6afcf8ef ("mm, compaction: fix NR_ISOLATED_* stats for pfn
      based migration") moved the dec_node_page_state() call (along with the
      page_is_file_cache() call) to after putback_lru_page().
      
      But page_is_file_cache() can change after putback_lru_page() is called,
      so it should be called before putback_lru_page(), as it was before that
      patch, to prevent NR_ISOLATE_* stats from going negative.
      
      Without this fix, non-CONFIG_SMP kernels end up hanging in the
      while(too_many_isolated()) { congestion_wait() } loop in
      shrink_active_list() due to the negative stats.
      
       Mem-Info:
        active_anon:32567 inactive_anon:121 isolated_anon:1
        active_file:6066 inactive_file:6639 isolated_file:4294967295
                                                          ^^^^^^^^^^
        unevictable:0 dirty:115 writeback:0 unstable:0
        slab_reclaimable:2086 slab_unreclaimable:3167
        mapped:3398 shmem:18366 pagetables:1145 bounce:0
        free:1798 free_pcp:13 free_cma:0
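
      The accounting mismatch can be modelled in userspace C. This is a toy
      model rather than the migration code: the types and function names are
      hypothetical, and the class change during putback is simply assumed.
      It shows how a decrement applied to the wrong counter leaves
      one counter at 1 and the other at -1, which reads as 4294967295 when
      shown as an unsigned 32-bit value.

```c
#include <stdint.h>

enum page_class { CLASS_ANON, CLASS_FILE };

struct isolated { long anon; long file; };

static void adjust(struct isolated *s, enum page_class c, long delta)
{
	if (c == CLASS_FILE)
		s->file += delta;
	else
		s->anon += delta;
}

/* Buggy order: the class is looked up again after it may have changed. */
struct isolated account_late(void)
{
	struct isolated s = { 0, 0 };
	enum page_class cls = CLASS_ANON;

	adjust(&s, cls, +1);	/* isolate: counted as anon */
	cls = CLASS_FILE;	/* assumed: class changes during putback */
	adjust(&s, cls, -1);	/* decrement hits the wrong counter */
	return s;
}

/* Fixed order: remember the class before putback and reuse it. */
struct isolated account_early(void)
{
	struct isolated s = { 0, 0 };
	enum page_class cls = CLASS_ANON;
	enum page_class saved;

	adjust(&s, cls, +1);
	saved = cls;		/* sampled before putback */
	cls = CLASS_FILE;	/* the later change no longer matters */
	(void)cls;
	adjust(&s, saved, -1);
	return s;
}
```

      account_late() ends with anon = 1 and file = -1, the same shape as the
      isolated_anon and isolated_file values in the dump above, while
      account_early() returns both counters to zero.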
      
      Fixes: 6afcf8ef ("mm, compaction: fix NR_ISOLATED_* stats for pfn based migration")
      Link: http://lkml.kernel.org/r/1492683865-27549-1-git-send-email-rabin.vincent@axis.com
      Signed-off-by: Rabin Vincent <rabinv@axis.com>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Cc: Ming Ling <ming.ling@spreadtrum.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ring-buffer: Have ring_buffer_iter_empty() return true when empty · 64d25336
      Steven Rostedt (VMware) authored
      commit 78f7a45d upstream.
      
      I noticed that reading the snapshot file when it is empty no longer gives a
      status. It is supposed to show the status of the snapshot buffer as well as
      how to allocate and use it. For example:
      
       ># cat snapshot
       # tracer: nop
       #
       #
       # * Snapshot is allocated *
       #
       # Snapshot commands:
       # echo 0 > snapshot : Clears and frees snapshot buffer
       # echo 1 > snapshot : Allocates snapshot buffer, if not already allocated.
       #                      Takes a snapshot of the main buffer.
       # echo 2 > snapshot : Clears snapshot buffer (but does not allocate or free)
       #                      (Doesn't have to be '2' works with any number that
       #                       is not a '0' or '1')
      
      But instead it just showed an empty buffer:
      
       ># cat snapshot
       # tracer: nop
       #
       # entries-in-buffer/entries-written: 0/0   #P:4
       #
       #                              _-----=> irqs-off
       #                             / _----=> need-resched
       #                            | / _---=> hardirq/softirq
       #                            || / _--=> preempt-depth
       #                            ||| /     delay
       #           TASK-PID   CPU#  ||||    TIMESTAMP  FUNCTION
       #              | |       |   ||||       |         |
      
      What happened was that it was using the ring_buffer_iter_empty() function to
      see if it was empty, and if it was, it showed the status. But that function
      was returning false when it was empty. The reason was that the iter header
      page was on the reader page, and the reader page was empty, but so was the
      buffer itself. The check only tested whether the iter was on the commit
      page. The commit page was no longer pointing to the reader page, but since
      all pages were empty, the buffer was empty too.
      
      Fixes: 651e22f2 ("ring-buffer: Always reset iterator to reader page")
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • HID: wacom: Treat HID_DG_TOOLSERIALNUMBER as unsigned · eff24861
      Jason Gerecke authored
      commit 286f3f47 upstream.
      
      Because the HID_DG_TOOLSERIALNUMBER handling doesn't first cast the value
      received from HID to an unsigned type, sign-extension rules can cause the value of
      wacom_wac->serial[0] to inadvertently wind up with all 32 of its highest bits
      set if the highest bit of "value" was set.
      
      This can cause problems for Tablet PC devices which use AES sensors and the
      xf86-input-wacom userspace driver. It is not uncommon for AES sensors to send a
      serial number of '0' while the pen is entering or leaving proximity. The
      xf86-input-wacom driver ignores events with a serial number of '0' since it
      cannot match them up to an in-use tool.  To ensure the xf86-input-wacom driver
      does not ignore the final out-of-proximity event, the kernel does not send
      MSC_SERIAL events when the value of wacom_wac->serial[0] is '0'. If the highest
      bit of HID_DG_TOOLSERIALNUMBER is set by an in-prox pen which later leaves
      proximity and sends a '0' for HID_DG_TOOLSERIALNUMBER, then only the lowest 32
      bits of wacom_wac->serial[0] are actually cleared, causing the kernel to send
      an MSC_SERIAL event. Since the 'input_event' function takes an 'int' as
      argument, only those lowest (now-cleared) 32 bits of wacom_wac->serial[0] are
      sent to userspace, causing xf86-input-wacom to ignore the event. If the event
      was the final out-of-prox event, then xf86-input-wacom may remain in a state
      where it believes the pen is in proximity and refuses to allow other devices
      under its control (e.g. the touchscreen) to move the cursor.
      
      It should be noted that EMR devices and devices which use both the
      HID_DG_TOOLSERIALNUMBER and WACOM_HID_WD_SERIALHI usages (in that order) would
      be immune to this issue. It appears only AES devices are affected.
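
      The sign-extension effect can be reproduced in userspace C. The two
      helpers below are hypothetical and only mimic "replace the low 32 bits
      of a 64-bit serial with a 32-bit signed value". They are not the
      driver code.

```c
#include <stdint.h>

/* Signed path: the 32-bit value is sign-extended before being OR-ed in. */
uint64_t set_low_signed(uint64_t serial, int32_t value)
{
	serial &= ~0xFFFFFFFFULL;
	/* Mirrors what an implicit int-to-64-bit conversion does. */
	serial |= (uint64_t)(int64_t)value;
	return serial;
}

/* Unsigned path: casting to uint32_t first keeps the upper half clean. */
uint64_t set_low_unsigned(uint64_t serial, int32_t value)
{
	serial &= ~0xFFFFFFFFULL;
	serial |= (uint32_t)value;
	return serial;
}
```

      Starting from 0, set_low_signed() with INT32_MIN gives
      0xFFFFFFFF80000000. A following call with 0 leaves 0xFFFFFFFF00000000,
      which is nonzero as a 64-bit value even though its low 32 bits are 0.
      The unsigned variant goes 0x80000000 and then back to 0.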
      
      Fixes: f85c9dc6 ("HID: wacom: generic: Support tool ID and additional tool types")
      Signed-off-by: Jason Gerecke <jason.gerecke@wacom.com>
      Acked-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
      Signed-off-by: Jiri Kosina <jkosina@suse.cz>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • tracing: Allocate the snapshot buffer before enabling probe · 838a281c
      Steven Rostedt (VMware) authored
      commit df62db5b upstream.
      
      Currently the snapshot trigger enables the probe and then allocates the
      snapshot. If the probe triggers before the allocation, it could cause the
      snapshot to fail and turn tracing off. It's best to allocate the snapshot
      buffer first, and then enable the trigger. If something goes wrong in the
      enabling of the trigger, the snapshot buffer is still allocated, but it can
      also be freed by the user by writing zero into the snapshot buffer file.
      
      Also add a check of the return status of alloc_snapshot().
      
      Fixes: 77fd5c15 ("tracing: Add snapshot trigger to function probes")
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • KEYS: fix keyctl_set_reqkey_keyring() to not leak thread keyrings · 523ae2e9
      Eric Biggers authored
      commit c9f838d1 upstream.
      
      This fixes CVE-2017-7472.
      
      Running the following program as an unprivileged user exhausts kernel
      memory by leaking thread keyrings:
      
      	#include <keyutils.h>
      
      	int main()
      	{
      		for (;;)
      			keyctl_set_reqkey_keyring(KEY_REQKEY_DEFL_THREAD_KEYRING);
      	}
      
      Fix it by only creating a new thread keyring if there wasn't one before.
      To make things more consistent, make install_thread_keyring_to_cred()
      and install_process_keyring_to_cred() both return 0 if the corresponding
      keyring is already present.
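
      The "create only if absent, otherwise return 0" behaviour can be
      sketched in user-space C as below. The types and names (struct
      demo_keyring, struct demo_cred, demo_install_thread_keyring(), and the
      demo_keyrings_allocated counter) are hypothetical and only model the
      idea; they are not the kernel implementation.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-ins for a keyring and a set of credentials. */
struct demo_keyring { int id; };
struct demo_cred { struct demo_keyring *thread_keyring; };

/* Counts allocations so that a leak would be visible. */
static int demo_keyrings_allocated;

/*
 * Install a thread keyring only when none is present. Returning 0 in
 * the already-present case makes repeated calls harmless.
 */
static int demo_install_thread_keyring(struct demo_cred *new)
{
    struct demo_keyring *kr;

    if (new->thread_keyring)
        return 0;

    kr = malloc(sizeof(*kr));
    if (!kr)
        return -ENOMEM;

    kr->id = ++demo_keyrings_allocated;
    new->thread_keyring = kr;
    return 0;
}
```

      Calling demo_install_thread_keyring() in a loop on the same
      credentials allocates exactly one keyring, which is the property the
      reproducer above violates on an unfixed kernel.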
      
      Fixes: d84f4f99 ("CRED: Inaugurate COW credentials")
      Signed-off-by: Eric Biggers <ebiggers@google.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • KEYS: Change the name of the dead type to ".dead" to prevent user access · cc4f9841
      David Howells authored
      commit c1644fe0 upstream.
      
      This fixes CVE-2017-6951.
      
      Userspace should not be able to do things with the "dead" key type as it
      doesn't have some of the helper functions set upon it that the kernel
      needs.  Attempting to use it may cause the kernel to crash.
      
      Fix this by changing the name of the type to ".dead" so that it's rejected
      up front on userspace syscalls by key_get_type_from_user().
      
      Though this doesn't seem to affect recent kernels, it does affect older
      ones, certainly those prior to:
      
      	commit c06cfb08
      	Author: David Howells <dhowells@redhat.com>
      	Date:   Tue Sep 16 17:36:06 2014 +0100
      	KEYS: Remove key_type::match in favour of overriding default by match_preparse
      
      which went in before 3.18-rc1.

      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • KEYS: Disallow keyrings beginning with '.' to be joined as session keyrings · 4cbbfd6a
      David Howells authored
      commit ee8f844e upstream.
      
      This fixes CVE-2016-9604.
      
      Keyrings whose names begin with a '.' are special internal keyrings and so
      userspace isn't allowed to create keyrings by this name to prevent
      shadowing.  However, the patch that added the guard didn't fix
      KEYCTL_JOIN_SESSION_KEYRING.  Not only can that create dot-named keyrings,
      it can also subscribe to them as a session keyring if they grant SEARCH
      permission to the user.
      
      This, for example, allows a root process to set .builtin_trusted_keys as
      its session keyring, at which point it has full access because now the
      possessor permissions are added.  This permits root to add extra public
      keys, thereby bypassing module verification.
      
      This also affects kexec and IMA.
      
      This can be tested by (as root):
      
      	keyctl session .builtin_trusted_keys
      	keyctl add user a a @s
      	keyctl list @s
      
      which on my test box gives me:
      
      	2 keys in keyring:
      	180010936: ---lswrv     0     0 asymmetric: Build time autogenerated kernel key: ae3d4a31b82daa8e1a75b49dc2bba949fd992a05
      	801382539: --alswrv     0     0 user: a
      
      
      Fix this by rejecting names beginning with a '.' in the keyctl.
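
      The name check can be sketched as a small standalone C function. The
      function name demo_check_session_keyring_name(), the choice of -EPERM
      as the error code, and the treatment of a NULL name as "no name
      supplied" are assumptions made for illustration, not a copy of the
      kernel change.

```c
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical check: refuse any user-supplied keyring name that
 * starts with '.', since such names are reserved for internal keyrings.
 */
static int demo_check_session_keyring_name(const char *name)
{
    if (name == NULL)
        return 0;          /* assumed: no name given, nothing to reject */
    if (name[0] == '.')
        return -EPERM;     /* assumed error code for a reserved name */
    return 0;
}
```

      Under these assumptions ".builtin_trusted_keys" is rejected up front,
      while an ordinary name is accepted.
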
      Signed-off-by: David Howells <dhowells@redhat.com>
      Acked-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
      cc: linux-ima-devel@lists.sourceforge.net
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  2. 21 Apr, 2017 16 commits