1. 17 Jul, 2018 28 commits
    • Theodore Ts'o's avatar
      loop: add recursion validation to LOOP_CHANGE_FD · e3cf1cc9
      Theodore Ts'o authored
      commit d2ac838e upstream.
      
      Refactor the validation code used in LOOP_SET_FD so it is also used in
      LOOP_CHANGE_FD.  Otherwise it is possible to construct a set of loop
      devices that all refer to each other.  This can lead to a infinite
      loop in starting with "while (is_loop_device(f)) .." in loop_set_fd().
      
      Fix this by refactoring out the validation code and using it for
      LOOP_CHANGE_FD as well as LOOP_SET_FD.
      
      Reported-by: syzbot+4349872271ece473a7c91190b68b4bac7c5dbc87@syzkaller.appspotmail.com
      Reported-by: syzbot+40bd32c4d9a3cc12a339@syzkaller.appspotmail.com
      Reported-by: syzbot+769c54e66f994b041be7@syzkaller.appspotmail.com
      Reported-by: syzbot+0a89a9ce473936c57065@syzkaller.appspotmail.com
      Signed-off-by: default avatarTheodore Ts'o <tytso@mit.edu>
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e3cf1cc9
    • Florian Westphal's avatar
      netfilter: x_tables: initialise match/target check parameter struct · 40352e79
      Florian Westphal authored
      commit c568503e upstream.
      
      syzbot reports following splat:
      
      BUG: KMSAN: uninit-value in ebt_stp_mt_check+0x24b/0x450
       net/bridge/netfilter/ebt_stp.c:162
       ebt_stp_mt_check+0x24b/0x450 net/bridge/netfilter/ebt_stp.c:162
       xt_check_match+0x1438/0x1650 net/netfilter/x_tables.c:506
       ebt_check_match net/bridge/netfilter/ebtables.c:372 [inline]
       ebt_check_entry net/bridge/netfilter/ebtables.c:702 [inline]
      
      The uninitialised access is
         xt_mtchk_param->nft_compat
      
      ... which should be set to 0.
      Fix it by zeroing the struct beforehand, same for tgchk.
      
      ip(6)tables targetinfo uses c99-style initialiser, so no change
      needed there.
      
      Reported-by: syzbot+da4494182233c23a5fcf@syzkaller.appspotmail.com
      Fixes: 55917a21 ("netfilter: x_tables: add context to know if extension runs from nft_compat")
      Signed-off-by: default avatarFlorian Westphal <fw@strlen.de>
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      40352e79
    • Eric Dumazet's avatar
      netfilter: nf_queue: augment nfqa_cfg_policy · ac378e6a
      Eric Dumazet authored
      commit ba062ebb upstream.
      
      Three attributes are currently not verified, thus can trigger KMSAN
      warnings such as :
      
      BUG: KMSAN: uninit-value in __arch_swab32 arch/x86/include/uapi/asm/swab.h:10 [inline]
      BUG: KMSAN: uninit-value in __fswab32 include/uapi/linux/swab.h:59 [inline]
      BUG: KMSAN: uninit-value in nfqnl_recv_config+0x939/0x17d0 net/netfilter/nfnetlink_queue.c:1268
      CPU: 1 PID: 4521 Comm: syz-executor120 Not tainted 4.17.0+ #5
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0x185/0x1d0 lib/dump_stack.c:113
       kmsan_report+0x188/0x2a0 mm/kmsan/kmsan.c:1117
       __msan_warning_32+0x70/0xc0 mm/kmsan/kmsan_instr.c:620
       __arch_swab32 arch/x86/include/uapi/asm/swab.h:10 [inline]
       __fswab32 include/uapi/linux/swab.h:59 [inline]
       nfqnl_recv_config+0x939/0x17d0 net/netfilter/nfnetlink_queue.c:1268
       nfnetlink_rcv_msg+0xb2e/0xc80 net/netfilter/nfnetlink.c:212
       netlink_rcv_skb+0x37e/0x600 net/netlink/af_netlink.c:2448
       nfnetlink_rcv+0x2fe/0x680 net/netfilter/nfnetlink.c:513
       netlink_unicast_kernel net/netlink/af_netlink.c:1310 [inline]
       netlink_unicast+0x1680/0x1750 net/netlink/af_netlink.c:1336
       netlink_sendmsg+0x104f/0x1350 net/netlink/af_netlink.c:1901
       sock_sendmsg_nosec net/socket.c:629 [inline]
       sock_sendmsg net/socket.c:639 [inline]
       ___sys_sendmsg+0xec8/0x1320 net/socket.c:2117
       __sys_sendmsg net/socket.c:2155 [inline]
       __do_sys_sendmsg net/socket.c:2164 [inline]
       __se_sys_sendmsg net/socket.c:2162 [inline]
       __x64_sys_sendmsg+0x331/0x460 net/socket.c:2162
       do_syscall_64+0x15b/0x230 arch/x86/entry/common.c:287
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      RIP: 0033:0x43fd59
      RSP: 002b:00007ffde0e30d28 EFLAGS: 00000213 ORIG_RAX: 000000000000002e
      RAX: ffffffffffffffda RBX: 00000000004002c8 RCX: 000000000043fd59
      RDX: 0000000000000000 RSI: 0000000020000080 RDI: 0000000000000003
      RBP: 00000000006ca018 R08: 00000000004002c8 R09: 00000000004002c8
      R10: 00000000004002c8 R11: 0000000000000213 R12: 0000000000401680
      R13: 0000000000401710 R14: 0000000000000000 R15: 0000000000000000
      
      Uninit was created at:
       kmsan_save_stack_with_flags mm/kmsan/kmsan.c:279 [inline]
       kmsan_internal_poison_shadow+0xb8/0x1b0 mm/kmsan/kmsan.c:189
       kmsan_kmalloc+0x94/0x100 mm/kmsan/kmsan.c:315
       kmsan_slab_alloc+0x10/0x20 mm/kmsan/kmsan.c:322
       slab_post_alloc_hook mm/slab.h:446 [inline]
       slab_alloc_node mm/slub.c:2753 [inline]
       __kmalloc_node_track_caller+0xb35/0x11b0 mm/slub.c:4395
       __kmalloc_reserve net/core/skbuff.c:138 [inline]
       __alloc_skb+0x2cb/0x9e0 net/core/skbuff.c:206
       alloc_skb include/linux/skbuff.h:988 [inline]
       netlink_alloc_large_skb net/netlink/af_netlink.c:1182 [inline]
       netlink_sendmsg+0x76e/0x1350 net/netlink/af_netlink.c:1876
       sock_sendmsg_nosec net/socket.c:629 [inline]
       sock_sendmsg net/socket.c:639 [inline]
       ___sys_sendmsg+0xec8/0x1320 net/socket.c:2117
       __sys_sendmsg net/socket.c:2155 [inline]
       __do_sys_sendmsg net/socket.c:2164 [inline]
       __se_sys_sendmsg net/socket.c:2162 [inline]
       __x64_sys_sendmsg+0x331/0x460 net/socket.c:2162
       do_syscall_64+0x15b/0x230 arch/x86/entry/common.c:287
       entry_SYSCALL_64_after_hwframe+0x44/0xa9
      
      Fixes: fdb694a0 ("netfilter: Add fail-open support")
      Fixes: 829e17a1 ("[NETFILTER]: nfnetlink_queue: allow changing queue length through netlink")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarsyzbot <syzkaller@googlegroups.com>
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ac378e6a
    • Oleg Nesterov's avatar
      uprobes/x86: Remove incorrect WARN_ON() in uprobe_init_insn() · 377fb3d8
      Oleg Nesterov authored
      commit 90718e32 upstream.
      
      insn_get_length() has the side-effect of processing the entire instruction
      but only if it was decoded successfully, otherwise insn_complete() can fail
      and in this case we need to just return an error without warning.
      
      Reported-by: syzbot+30d675e3ca03c1c351e7@syzkaller.appspotmail.com
      Signed-off-by: default avatarOleg Nesterov <oleg@redhat.com>
      Reviewed-by: default avatarMasami Hiramatsu <mhiramat@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: syzkaller-bugs@googlegroups.com
      Link: https://lkml.kernel.org/lkml/20180518162739.GA5559@redhat.comSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      377fb3d8
    • Keith Busch's avatar
      nvme-pci: Remap CMB SQ entries on every controller reset · 062c4965
      Keith Busch authored
      commit 815c6704 upstream.
      
      The controller memory buffer is remapped into a kernel address on each
      reset, but the driver was setting the submission queue base address
      only on the very first queue creation. The remapped address is likely to
      change after a reset, so accessing the old address will hit a kernel bug.
      
      This patch fixes that by setting the queue's CMB base address each time
      the queue is created.
      
      Fixes: f63572df ("nvme: unmap CMB and remove sysfs file in reset path")
      Reported-by: default avatarChristian Black <christian.d.black@intel.com>
      Cc: Jon Derrick <jonathan.derrick@intel.com>
      Cc: <stable@vger.kernel.org> # 4.9+
      Signed-off-by: default avatarKeith Busch <keith.busch@intel.com>
      Reviewed-by: default avatarChristoph Hellwig <hch@lst.de>
      Signed-off-by: default avatarScott Bauer <scott.bauer@intel.com>
      Reviewed-by: default avatarJon Derrick <jonathan.derrick@intel.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      062c4965
    • Steve Wise's avatar
      iw_cxgb4: correctly enforce the max reg_mr depth · 70c89bcc
      Steve Wise authored
      commit 7b72717a upstream.
      
      The code was mistakenly using the length of the page array memory instead
      of the depth of the page array.
      
      This would cause MR creation to fail in some cases.
      
      Fixes: 8376b86d ("iw_cxgb4: Support the new memory registration API")
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarSteve Wise <swise@opengridcomputing.com>
      Signed-off-by: default avatarJason Gunthorpe <jgg@mellanox.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      70c89bcc
    • Jon Hunter's avatar
      i2c: tegra: Fix NACK error handling · e78e3706
      Jon Hunter authored
      commit 54836e2d upstream.
      
      On Tegra30 Cardhu the PCA9546 I2C mux is not ACK'ing I2C commands on
      resume from suspend (which is caused by the reset signal for the I2C
      mux not being configured correctl). However, this NACK is causing the
      Tegra30 to hang on resuming from suspend which is not expected as we
      detect NACKs and handle them. The hang observed appears to occur when
      resetting the I2C controller to recover from the NACK.
      
      Commit 77821b46 ("i2c: tegra: proper handling of error cases") added
      additional error handling for some error cases including NACK, however,
      it appears that this change conflicts with an early fix by commit
      f70893d0 ("i2c: tegra: Add delay before resetting the controller
      after NACK"). After commit 77821b46 was made we now disable 'packet
      mode' before the delay from commit f70893d0 happens. Testing shows
      that moving the delay to before disabling 'packet mode' fixes the hang
      observed on Tegra30. The delay was added to give the I2C controller
      chance to send a stop condition and so it makes sense to move this to
      before we disable packet mode. Please note that packet mode is always
      enabled for Tegra.
      
      Fixes: 77821b46 ("i2c: tegra: proper handling of error cases")
      Signed-off-by: default avatarJon Hunter <jonathanh@nvidia.com>
      Acked-by: default avatarThierry Reding <treding@nvidia.com>
      Signed-off-by: default avatarWolfram Sang <wsa@the-dreams.de>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e78e3706
    • Paul Menzel's avatar
      tools build: fix # escaping in .cmd files for future Make · 36c038f0
      Paul Menzel authored
      commit 9feeb638 upstream.
      
      In 2016 GNU Make made a backwards incompatible change to the way '#'
      characters were handled in Makefiles when used inside functions or
      macros:
      
      http://git.savannah.gnu.org/cgit/make.git/commit/?id=c6966b323811c37acedff05b57
      
      Due to this change, when attempting to run `make prepare' I get a
      spurious make syntax error:
      
          /home/earnest/linux/tools/objtool/.fixdep.o.cmd:1: *** missing separator.  Stop.
      
      When inspecting `.fixdep.o.cmd' it includes two lines which use
      unescaped comment characters at the top:
      
          \# cannot find fixdep (/home/earnest/linux/tools/objtool//fixdep)
          \# using basic dep data
      
      This is because `tools/build/Build.include' prints these '\#'
      characters:
      
          printf '\# cannot find fixdep (%s)\n' $(fixdep) > $(dot-target).cmd; \
          printf '\# using basic dep data\n\n' >> $(dot-target).cmd;           \
      
      This completes commit 9564a8cf ("Kbuild: fix # escaping in .cmd files
      for future Make").
      
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=197847
      Cc: Randy Dunlap <rdunlap@infradead.org>
      Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarPaul Menzel <pmenzel@molgen.mpg.de>
      Signed-off-by: default avatarMasahiro Yamada <yamada.masahiro@socionext.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      36c038f0
    • Oscar Salvador's avatar
      fs, elf: make sure to page align bss in load_elf_library · db2858f1
      Oscar Salvador authored
      commit 24962af7 upstream.
      
      The current code does not make sure to page align bss before calling
      vm_brk(), and this can lead to a VM_BUG_ON() in __mm_populate() due to
      the requested lenght not being correctly aligned.
      
      Let us make sure to align it properly.
      
      Kees: only applicable to CONFIG_USELIB kernels: 32-bit and configured
      for libc5.
      
      Link: http://lkml.kernel.org/r/20180705145539.9627-1-osalvador@techadventures.netSigned-off-by: default avatarOscar Salvador <osalvador@suse.de>
      Reported-by: syzbot+5dcb560fe12aa5091c06@syzkaller.appspotmail.com
      Tested-by: default avatarTetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
      Acked-by: default avatarKees Cook <keescook@chromium.org>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Nicolas Pitre <nicolas.pitre@linaro.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      db2858f1
    • Chris Wilson's avatar
      ALSA: hda - Handle pm failure during hotplug · bc193057
      Chris Wilson authored
      commit aaa23f86 upstream.
      
      Obtaining the runtime pm wakeref can fail, especially in a hotplug
      scenario where i915.ko has been unloaded. If we do not catch the
      failure, we end up with an unbalanced pm.
      
      v2 additions by tiwai:
      hdmi_present_sense() checks the return value and handle only a
      negative error case and bails out only if it's really still suspended.
      Also, snd_hda_power_down() is called at the error path so that the
      refcount is balanced.
      
      Along with it, the spec->pcm_lock is taken outside
      hdmi_present_sense() in the caller side, so that it won't cause
      deadlock at reentrace via runtime resume.
      
      v3 fix by tiwai:
      Missing linux/pm_runtime.h is included.
      
      References: 222bde03 ("ALSA: hda - Fix mutex deadlock at HDMI/DP hotplug")
      Signed-off-by: default avatarChris Wilson <chris@chris-wilson.co.uk>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarTakashi Iwai <tiwai@suse.de>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      bc193057
    • Linus Torvalds's avatar
      Fix up non-directory creation in SGID directories · d2c7c524
      Linus Torvalds authored
      commit 0fa3ecd8 upstream.
      
      sgid directories have special semantics, making newly created files in
      the directory belong to the group of the directory, and newly created
      subdirectories will also become sgid.  This is historically used for
      group-shared directories.
      
      But group directories writable by non-group members should not imply
      that such non-group members can magically join the group, so make sure
      to clear the sgid bit on non-directories for non-members (but remember
      that sgid without group execute means "mandatory locking", just to
      confuse things even more).
      Reported-by: default avatarJann Horn <jannh@google.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d2c7c524
    • Tomasz Kramkowski's avatar
      HID: usbhid: add quirk for innomedia INNEX GENESIS/ATARI adapter · 16387eb5
      Tomasz Kramkowski authored
      commit 9547837b upstream.
      
      The (1292:4745) Innomedia INNEX GENESIS/ATARI adapter needs
      HID_QUIRK_MULTI_INPUT to split the device up into two controllers
      instead of inputs from both being merged into one.
      Signed-off-by: default avatarTomasz Kramkowski <tk@the-tk.com>
      Acked-By: default avatarBenjamin Tissoires <benjamin.tissoires@redhat.com>
      Signed-off-by: default avatarJiri Kosina <jkosina@suse.cz>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      16387eb5
    • Dan Carpenter's avatar
      xhci: xhci-mem: off by one in xhci_stream_id_to_ring() · 268476c9
      Dan Carpenter authored
      commit 313db3d6 upstream.
      
      The > should be >= here so that we don't read one element beyond the end
      of the ep->stream_info->stream_rings[] array.
      
      Fixes: e9df17eb ("USB: xhci: Correct assumptions about number of rings per endpoint.")
      Signed-off-by: default avatarDan Carpenter <dan.carpenter@oracle.com>
      Cc: stable <stable@vger.kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      268476c9
    • Nico Sneck's avatar
      usb: quirks: add delay quirks for Corsair Strafe · cac38ab7
      Nico Sneck authored
      commit bba57edd upstream.
      
      Corsair Strafe appears to suffer from the same issues
      as the Corsair Strafe RGB.
      Apply the same quirks (control message delay and init delay)
      that the RGB version has to 1b1c:1b15.
      
      With these quirks in place the keyboard works correctly upon
      booting the system, and no longer requires reattaching the device.
      Signed-off-by: default avatarNico Sneck <snecknico@gmail.com>
      Cc: stable <stable@vger.kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      cac38ab7
    • Johan Hovold's avatar
      USB: serial: mos7840: fix status-register error handling · 7675c7b7
      Johan Hovold authored
      commit 794744ab upstream.
      
      Add missing transfer-length sanity check to the status-register
      completion handler to avoid leaking bits of uninitialised slab data to
      user space.
      
      Fixes: 3f542974 ("USB: Moschip 7840 USB-Serial Driver")
      Cc: stable <stable@vger.kernel.org>     # 2.6.19
      Signed-off-by: default avatarJohan Hovold <johan@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      7675c7b7
    • Jann Horn's avatar
      USB: yurex: fix out-of-bounds uaccess in read handler · 0fdef314
      Jann Horn authored
      commit f1e255d6 upstream.
      
      In general, accessing userspace memory beyond the length of the supplied
      buffer in VFS read/write handlers can lead to both kernel memory corruption
      (via kernel_read()/kernel_write(), which can e.g. be triggered via
      sys_splice()) and privilege escalation inside userspace.
      
      Fix it by using simple_read_from_buffer() instead of custom logic.
      
      Fixes: 6bc235a2 ("USB: add driver for Meywa-Denki & Kayac YUREX")
      Signed-off-by: default avatarJann Horn <jannh@google.com>
      Cc: stable <stable@vger.kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      0fdef314
    • Johan Hovold's avatar
      USB: serial: keyspan_pda: fix modem-status error handling · 7e7c86d2
      Johan Hovold authored
      commit 01b3cdfc upstream.
      
      Fix broken modem-status error handling which could lead to bits of slab
      data leaking to user space.
      
      Fixes: 3b36a8fd ("usb: fix uninitialized variable warning in keyspan_pda")
      Cc: stable <stable@vger.kernel.org>     # 2.6.27
      Signed-off-by: default avatarJohan Hovold <johan@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      7e7c86d2
    • Olli Salonen's avatar
      USB: serial: cp210x: add another USB ID for Qivicon ZigBee stick · 4115045f
      Olli Salonen authored
      commit 367b160f upstream.
      
      There are two versions of the Qivicon Zigbee stick in circulation. This
      adds the second USB ID to the cp210x driver.
      Signed-off-by: default avatarOlli Salonen <olli.salonen@iki.fi>
      Cc: stable <stable@vger.kernel.org>
      Signed-off-by: default avatarJohan Hovold <johan@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      4115045f
    • Dan Carpenter's avatar
      USB: serial: ch341: fix type promotion bug in ch341_control_in() · 4c73f193
      Dan Carpenter authored
      commit e33eab9d upstream.
      
      The "r" variable is an int and "bufsize" is an unsigned int so the
      comparison is type promoted to unsigned.  If usb_control_msg() returns a
      negative that is treated as a high positive value and the error handling
      doesn't work.
      
      Fixes: 2d5a9c72 ("USB: serial: ch341: fix control-message error handling")
      Signed-off-by: default avatarDan Carpenter <dan.carpenter@oracle.com>
      Cc: stable <stable@vger.kernel.org>
      Signed-off-by: default avatarJohan Hovold <johan@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      4c73f193
    • Hans de Goede's avatar
      ahci: Disable LPM on Lenovo 50 series laptops with a too old BIOS · f510cc3a
      Hans de Goede authored
      commit 240630e6 upstream.
      
      There have been several reports of LPM related hard freezes about once
      a day on multiple Lenovo 50 series models. Strange enough these reports
      where not disk model specific as LPM issues usually are and some users
      with the exact same disk + laptop where seeing them while other users
      where not seeing these issues.
      
      It turns out that enabling LPM triggers a firmware bug somewhere, which
      has been fixed in later BIOS versions.
      
      This commit adds a new ahci_broken_lpm() function and a new ATA_FLAG_NO_LPM
      for dealing with this.
      
      The ahci_broken_lpm() function contains DMI match info for the 4 models
      which are known to be affected by this and the DMI BIOS date field for
      known good BIOS versions. If the BIOS date is older then the one in the
      table LPM will be disabled and a warning will be printed.
      
      Note the BIOS dates are for known good versions, some older versions may
      work too, but we don't know for sure, the table is using dates from BIOS
      versions for which users have confirmed that upgrading to that version
      makes the problem go away.
      
      Unfortunately I've been unable to get hold of the reporter who reported
      that BIOS version 2.35 fixed the problems on the W541 for him. I've been
      able to verify the DMI_SYS_VENDOR and DMI_PRODUCT_VERSION from an older
      dmidecode, but I don't know the exact BIOS date as reported in the DMI.
      Lenovo keeps a changelog with dates in their release notes, but the
      dates there are the release dates not the build dates which are in DMI.
      So I've chosen to set the date to which we compare to one day past the
      release date of the 2.34 BIOS. I plan to fix this with a follow up
      commit once I've the necessary info.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarHans de Goede <hdegoede@redhat.com>
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      f510cc3a
    • Nadav Amit's avatar
      vmw_balloon: fix inflation with batching · 63c003e3
      Nadav Amit authored
      commit 90d72ce0 upstream.
      
      Embarrassingly, the recent fix introduced worse problem than it solved,
      causing the balloon not to inflate. The VM informed the hypervisor that
      the pages for lock/unlock are sitting in the wrong address, as it used
      the page that is used the uninitialized page variable.
      
      Fixes: b23220fe ("vmw_balloon: fixing double free when batching mode is off")
      Cc: stable@vger.kernel.org
      Reviewed-by: default avatarXavier Deguillard <xdeguillard@vmware.com>
      Signed-off-by: default avatarNadav Amit <namit@vmware.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      63c003e3
    • Damien Le Moal's avatar
      ata: Fix ZBC_OUT all bit handling · 3f205d7a
      Damien Le Moal authored
      commit 6edf1d4c upstream.
      
      If the ALL bit is set in the ZBC_OUT command, the command zone ID field
      (block) should be ignored.
      Reported-by: default avatarDavid Butterfield <david.butterfield@wdc.com>
      Signed-off-by: default avatarDamien Le Moal <damien.lemoal@wdc.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      3f205d7a
    • Damien Le Moal's avatar
      ata: Fix ZBC_OUT command block check · 51bacd84
      Damien Le Moal authored
      commit b320a0a9 upstream.
      
      The block (LBA) specified must not exceed the last addressable LBA,
      which is dev->nr_sectors - 1. So fix the correct check is
      "if (block >= dev->n_sectors)" and not "if (block > dev->n_sectords)".
      
      Additionally, the asc/ascq to return for an LBA that is not a zone start
      LBA should be ILLEGAL REQUEST, regardless if the bad LBA is out of
      range.
      Reported-by: default avatarDavid Butterfield <david.butterfield@wdc.com>
      Signed-off-by: default avatarDamien Le Moal <damien.lemoal@wdc.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      51bacd84
    • Jann Horn's avatar
      ibmasm: don't write out of bounds in read handler · 2823345c
      Jann Horn authored
      commit a0341fc1 upstream.
      
      This read handler had a lot of custom logic and wrote outside the bounds of
      the provided buffer. This could lead to kernel and userspace memory
      corruption. Just use simple_read_from_buffer() with a stack buffer.
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarJann Horn <jannh@google.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      2823345c
    • x00270170's avatar
      mmc: dw_mmc: fix card threshold control configuration · 35479c22
      x00270170 authored
      commit 7a6b9f4d upstream.
      
      Card write threshold control is supposed to be set since controller
      version 2.80a for data write in HS400 mode and data read in
      HS200/HS400/SDR104 mode. However the current code returns without
      configuring it in the case of data writing in HS400 mode.
      Meanwhile the patch fixes that the current code goes to
      'disable' when doing data reading in HS400 mode.
      
      Fixes: 7e4bf1bc ("mmc: dw_mmc: add the card write threshold for HS400 mode")
      Signed-off-by: default avatarQing Xia <xiaqing17@hisilicon.com>
      Cc: stable@vger.kernel.org # v4.8+
      Signed-off-by: default avatarUlf Hansson <ulf.hansson@linaro.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      35479c22
    • Paul Burton's avatar
      MIPS: Fix ioremap() RAM check · 92cb1184
      Paul Burton authored
      commit 523402fa upstream.
      
      We currently attempt to check whether a physical address range provided
      to __ioremap() may be in use by the page allocator by examining the
      value of PageReserved for each page in the region - lowmem pages not
      marked reserved are presumed to be in use by the page allocator, and
      requests to ioremap them fail.
      
      The way we check this has been broken since commit 92923ca3 ("mm:
      meminit: only set page reserved in the memblock region"), because
      memblock will typically not have any knowledge of non-RAM pages and
      therefore those pages will not have the PageReserved flag set. Thus when
      we attempt to ioremap a region outside of RAM we incorrectly fail
      believing that the region is RAM that may be in use.
      
      In most cases ioremap() on MIPS will take a fast-path to use the
      unmapped kseg1 or xkphys virtual address spaces and never hit this path,
      so the only way to hit it is for a MIPS32 system to attempt to ioremap()
      an address range in lowmem with flags other than _CACHE_UNCACHED.
      Perhaps the most straightforward way to do this is using
      ioremap_uncached_accelerated(), which is how the problem was discovered.
      
      Fix this by making use of walk_system_ram_range() to test the address
      range provided to __ioremap() against only RAM pages, rather than all
      lowmem pages. This means that if we have a lowmem I/O region, which is
      very common for MIPS systems, we're free to ioremap() address ranges
      within it. A nice bonus is that the test is no longer limited to lowmem.
      
      The approach here matches the way x86 performed the same test after
      commit c81c8a1e ("x86, ioremap: Speed up check for RAM pages") until
      x86 moved towards a slightly more complicated check using walk_mem_res()
      for unrelated reasons with commit 0e4c12b4 ("x86/mm, resource: Use
      PAGE_KERNEL protection for ioremap of memory pages").
      Signed-off-by: default avatarPaul Burton <paul.burton@mips.com>
      Reported-by: default avatarSerge Semin <fancer.lancer@gmail.com>
      Tested-by: default avatarSerge Semin <fancer.lancer@gmail.com>
      Fixes: 92923ca3 ("mm: meminit: only set page reserved in the memblock region")
      Cc: James Hogan <jhogan@kernel.org>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: linux-mips@linux-mips.org
      Cc: stable@vger.kernel.org # v4.2+
      Patchwork: https://patchwork.linux-mips.org/patch/19786/Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      92cb1184
    • Paul Burton's avatar
      MIPS: Call dump_stack() from show_regs() · 473b33dd
      Paul Burton authored
      commit 5a267832 upstream.
      
      The generic nmi_cpu_backtrace() function calls show_regs() when a struct
      pt_regs is available, and dump_stack() otherwise. If we were to make use
      of the generic nmi_cpu_backtrace() with MIPS' current implementation of
      show_regs() this would mean that we see only register data with no
      accompanying stack information, in contrast with our current
      implementation which calls dump_stack() regardless of whether register
      state is available.
      
      In preparation for making use of the generic nmi_cpu_backtrace() to
      implement arch_trigger_cpumask_backtrace(), have our implementation of
      show_regs() call dump_stack() and drop the explicit dump_stack() call in
      arch_dump_stack() which is invoked by arch_trigger_cpumask_backtrace().
      
      This will allow the output we produce to remain the same after a later
      patch switches to using nmi_cpu_backtrace(). It may mean that we produce
      extra stack output in other uses of show_regs(), but this:
      
        1) Seems harmless.
        2) Is good for consistency between arch_trigger_cpumask_backtrace()
           and other users of show_regs().
        3) Matches the behaviour of the ARM & PowerPC architectures.
      
      Marked for stable back to v4.9 as a prerequisite of the following patch
      "MIPS: Call dump_stack() from show_regs()".
      Signed-off-by: default avatarPaul Burton <paul.burton@mips.com>
      Patchwork: https://patchwork.linux-mips.org/patch/19596/
      Cc: James Hogan <jhogan@kernel.org>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Huacai Chen <chenhc@lemote.com>
      Cc: linux-mips@linux-mips.org
      Cc: stable@vger.kernel.org # v4.9+
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      473b33dd
    • Scott Bauer's avatar
      nvme: validate admin queue before unquiesce · 93e54f40
      Scott Bauer authored
      commit 7dd1ab16 upstream.
      
      With a misbehaving controller it's possible we'll never
      enter the live state and create an admin queue. When we
      fail out of reset work it's possible we failed out early
      enough without setting up the admin queue. We tear down
      queues after a failed reset, but needed to do some more
      sanitization.
      
      Fixes 443bd90f: "nvme: host: unquiesce queue in nvme_kill_queues()"
      
      [  189.650995] nvme nvme1: pci function 0000:0b:00.0
      [  317.680055] nvme nvme0: Device not ready; aborting reset
      [  317.680183] nvme nvme0: Removing after probe failure status: -19
      [  317.681258] kasan: GPF could be caused by NULL-ptr deref or user memory access
      [  317.681397] general protection fault: 0000 [#1] SMP KASAN
      [  317.682984] CPU: 3 PID: 477 Comm: kworker/3:2 Not tainted 4.13.0-rc1+ #5
      [  317.683112] Hardware name: Gigabyte Technology Co., Ltd. Z170X-UD5/Z170X-UD5-CF, BIOS F5 03/07/2016
      [  317.683284] Workqueue: events nvme_remove_dead_ctrl_work [nvme]
      [  317.683398] task: ffff8803b0990000 task.stack: ffff8803c2ef0000
      [  317.683516] RIP: 0010:blk_mq_unquiesce_queue+0x2b/0xa0
      [  317.683614] RSP: 0018:ffff8803c2ef7d40 EFLAGS: 00010282
      [  317.683716] RAX: dffffc0000000000 RBX: 0000000000000000 RCX: 1ffff1006fbdcde3
      [  317.683847] RDX: 0000000000000038 RSI: 1ffff1006f5a9245 RDI: 0000000000000000
      [  317.683978] RBP: ffff8803c2ef7d58 R08: 1ffff1007bcdc974 R09: 0000000000000000
      [  317.684108] R10: 1ffff1007bcdc975 R11: 0000000000000000 R12: 00000000000001c0
      [  317.684239] R13: ffff88037ad49228 R14: ffff88037ad492d0 R15: ffff88037ad492e0
      [  317.684371] FS:  0000000000000000(0000) GS:ffff8803de6c0000(0000) knlGS:0000000000000000
      [  317.684519] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  317.684627] CR2: 0000002d1860c000 CR3: 000000045b40d000 CR4: 00000000003406e0
      [  317.684758] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [  317.684888] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [  317.685018] Call Trace:
      [  317.685084]  nvme_kill_queues+0x4d/0x170 [nvme_core]
      [  317.685185]  nvme_remove_dead_ctrl_work+0x3a/0x90 [nvme]
      [  317.685289]  process_one_work+0x771/0x1170
      [  317.685372]  worker_thread+0xde/0x11e0
      [  317.685452]  ? pci_mmcfg_check_reserved+0x110/0x110
      [  317.685550]  kthread+0x2d3/0x3d0
      [  317.685617]  ? process_one_work+0x1170/0x1170
      [  317.685704]  ? kthread_create_on_node+0xc0/0xc0
      [  317.685785]  ret_from_fork+0x25/0x30
      [  317.685798] Code: 0f 1f 44 00 00 55 48 b8 00 00 00 00 00 fc ff df 48 89 e5 41 54 4c 8d a7 c0 01 00 00 53 48 89 fb 4c 89 e2 48 c1 ea 03 48 83 ec 08 <80> 3c 02 00 75 50 48 8b bb c0 01 00 00 e8 33 8a f9 00 0f ba b3
      [  317.685872] RIP: blk_mq_unquiesce_queue+0x2b/0xa0 RSP: ffff8803c2ef7d40
      [  317.685908] ---[ end trace a3f8704150b1e8b4 ]---
      Signed-off-by: default avatarScott Bauer <scott.bauer@intel.com>
      Signed-off-by: default avatarChristoph Hellwig <hch@lst.de>
      [ adapted for 4.9: added check around blk_mq_start_hw_queues() call
        instead of upstream blk_mq_unquiesce_queue() ]
      Fixes: 4aae4388 ("nvme: fix hang in remove path")
      Signed-off-by: default avatarSimon Veith <sveith@amazon.de>
      Signed-off-by: default avatarDavid Woodhouse <dwmw@amazon.co.uk>
      Signed-off-by: default avatarAmit Shah <aams@amazon.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      93e54f40
  2. 11 Jul, 2018 12 commits
    • Greg Kroah-Hartman's avatar
      Linux 4.9.112 · 06074401
      Greg Kroah-Hartman authored
      06074401
    • Dan Carpenter's avatar
      staging: comedi: quatech_daqp_cs: fix no-op loop daqp_ao_insn_write() · e31cd420
      Dan Carpenter authored
      commit 1376b0a2 upstream.
      
      There is a '>' vs '<' typo so this loop is a no-op.
      
      Fixes: d35dcc89 ("staging: comedi: quatech_daqp_cs: fix daqp_ao_insn_write()")
      Signed-off-by: default avatarDan Carpenter <dan.carpenter@oracle.com>
      Reviewed-by: default avatarIan Abbott <abbotti@mev.co.uk>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e31cd420
    • Jann Horn's avatar
      netfilter: nf_log: don't hold nf_log_mutex during user access · 1712fae9
      Jann Horn authored
      commit ce00bf07 upstream.
      
      The old code would indefinitely block other users of nf_log_mutex if
      a userspace access in proc_dostring() blocked e.g. due to a userfaultfd
      region. Fix it by moving proc_dostring() out of the locked region.
      
      This is a followup to commit 266d07cb ("netfilter: nf_log: fix
      sleeping function called from invalid context"), which changed this code
      from using rcu_read_lock() to taking nf_log_mutex.
      
      Fixes: 266d07cb ("netfilter: nf_log: fix sleeping function calle[...]")
      Signed-off-by: default avatarJann Horn <jannh@google.com>
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      1712fae9
    • Tokunori Ikegami's avatar
      mtd: cfi_cmdset_0002: Change erase functions to check chip good only · a0239d83
      Tokunori Ikegami authored
      commit 79ca484b upstream.
      
      Currently the functions use to check both chip ready and good.
      But the chip ready is not enough to check the operation status.
      So change this to check the chip good instead of this.
      About the retry functions to make sure the error handling remain it.
      Signed-off-by: default avatarTokunori Ikegami <ikegami@allied-telesis.co.jp>
      Reviewed-by: default avatarJoakim Tjernlund <Joakim.Tjernlund@infinera.com>
      Cc: Chris Packham <chris.packham@alliedtelesis.co.nz>
      Cc: Brian Norris <computersforpeace@gmail.com>
      Cc: David Woodhouse <dwmw2@infradead.org>
      Cc: Boris Brezillon <boris.brezillon@free-electrons.com>
      Cc: Marek Vasut <marek.vasut@gmail.com>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Cyrille Pitchen <cyrille.pitchen@wedev4u.fr>
      Cc: linux-mtd@lists.infradead.org
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarBoris Brezillon <boris.brezillon@bootlin.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      
      a0239d83
    • Tokunori Ikegami's avatar
      mtd: cfi_cmdset_0002: Change erase functions to retry for error · ed174614
      Tokunori Ikegami authored
      commit 45f75b8a upstream.
      
      For the word write functions it is retried for error.
      But it is not implemented to retry for the erase functions.
      To make sure for the erase functions change to retry as same.
      
      This is needed to prevent the flash erase error caused only once.
      It was caused by the error case of chip_good() in the do_erase_oneblock().
      Also it was confirmed on the MACRONIX flash device MX29GL512FHT2I-11G.
      But the error issue behavior is not able to reproduce at this moment.
      The flash controller is parallel Flash interface integrated on BCM53003.
      Signed-off-by: default avatarTokunori Ikegami <ikegami@allied-telesis.co.jp>
      Reviewed-by: default avatarJoakim Tjernlund <Joakim.Tjernlund@infinera.com>
      Cc: Chris Packham <chris.packham@alliedtelesis.co.nz>
      Cc: Brian Norris <computersforpeace@gmail.com>
      Cc: David Woodhouse <dwmw2@infradead.org>
      Cc: Boris Brezillon <boris.brezillon@free-electrons.com>
      Cc: Marek Vasut <marek.vasut@gmail.com>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Cyrille Pitchen <cyrille.pitchen@wedev4u.fr>
      Cc: linux-mtd@lists.infradead.org
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarBoris Brezillon <boris.brezillon@bootlin.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ed174614
    • Tokunori Ikegami's avatar
      mtd: cfi_cmdset_0002: Change definition naming to retry write operation · c2f163e3
      Tokunori Ikegami authored
      commit 85a82e28 upstream.
      
      The definition can be used for other program and erase operations also.
      So change the naming to MAX_RETRIES from MAX_WORD_RETRIES.
      Signed-off-by: default avatarTokunori Ikegami <ikegami@allied-telesis.co.jp>
      Reviewed-by: default avatarJoakim Tjernlund <Joakim.Tjernlund@infinera.com>
      Cc: Chris Packham <chris.packham@alliedtelesis.co.nz>
      Cc: Brian Norris <computersforpeace@gmail.com>
      Cc: David Woodhouse <dwmw2@infradead.org>
      Cc: Boris Brezillon <boris.brezillon@free-electrons.com>
      Cc: Marek Vasut <marek.vasut@gmail.com>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Cyrille Pitchen <cyrille.pitchen@wedev4u.fr>
      Cc: linux-mtd@lists.infradead.org
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarBoris Brezillon <boris.brezillon@bootlin.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      c2f163e3
    • Mikulas Patocka's avatar
      dm bufio: don't take the lock in dm_bufio_shrink_count · 4779184a
      Mikulas Patocka authored
      commit d12067f4 upstream.
      
      dm_bufio_shrink_count() is called from do_shrink_slab to find out how many
      freeable objects are there. The reported value doesn't have to be precise,
      so we don't need to take the dm-bufio lock.
      Suggested-by: default avatarDavid Rientjes <rientjes@google.com>
      Signed-off-by: default avatarMikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: default avatarMike Snitzer <snitzer@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      4779184a
    • Martin Kaiser's avatar
      mtd: rawnand: mxc: set spare area size register explicitly · 9d1304f5
      Martin Kaiser authored
      commit 3f77f244 upstream.
      
      The v21 version of the NAND flash controller contains a Spare Area Size
      Register (SPAS) at offset 0x10. Its setting defaults to the maximum
      spare area size of 218 bytes. The size that is set in this register is
      used by the controller when it calculates the ECC bytes internally in
      hardware.
      
      Usually, this register is updated from settings in the IIM fuses when
      the system is booting from NAND flash. For other boot media, however,
      the SPAS register remains at the default setting, which may not work for
      the particular flash chip on the board. The same goes for flash chips
      whose configuration cannot be set in the IIM fuses (e.g. chips with 2k
      sector size and 128 bytes spare area size can't be configured in the IIM
      fuses on imx25 systems).
      
      Set the SPAS register explicitly during the preset operation. Derive the
      register value from mtd->oobsize that was detected during probe by
      decoding the flash chip's ID bytes.
      
      While at it, rename the define for the spare area register's offset to
      NFC_V21_RSLTSPARE_AREA. The register at offset 0x10 on v1 controllers is
      different from the register on v21 controllers.
      
      Fixes: d4840180 ("mtd: mxc_nand: set NFC registers after reset")
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarMartin Kaiser <martin@kaiser.cx>
      Reviewed-by: default avatarSascha Hauer <s.hauer@pengutronix.de>
      Reviewed-by: default avatarMiquel Raynal <miquel.raynal@bootlin.com>
      Signed-off-by: default avatarBoris Brezillon <boris.brezillon@bootlin.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      
      9d1304f5
    • Mikulas Patocka's avatar
      dm bufio: drop the lock when doing GFP_NOIO allocation · 34d2fe72
      Mikulas Patocka authored
      commit 41c73a49 upstream.
      
      If the first allocation attempt using GFP_NOWAIT fails, drop the lock
      and retry using GFP_NOIO allocation (lock is dropped because the
      allocation can take some time).
      
      Note that we won't do GFP_NOIO allocation when we loop for the second
      time, because the lock shouldn't be dropped between __wait_for_free_buffer
      and __get_unclaimed_buffer.
      Signed-off-by: default avatarMikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: default avatarMike Snitzer <snitzer@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      34d2fe72
    • Douglas Anderson's avatar
      dm bufio: avoid sleeping while holding the dm_bufio lock · 0758c35b
      Douglas Anderson authored
      commit 9ea61cac upstream.
      
      We've seen in-field reports showing _lots_ (18 in one case, 41 in
      another) of tasks all sitting there blocked on:
      
        mutex_lock+0x4c/0x68
        dm_bufio_shrink_count+0x38/0x78
        shrink_slab.part.54.constprop.65+0x100/0x464
        shrink_zone+0xa8/0x198
      
      In the two cases analyzed, we see one task that looks like this:
      
        Workqueue: kverityd verity_prefetch_io
      
        __switch_to+0x9c/0xa8
        __schedule+0x440/0x6d8
        schedule+0x94/0xb4
        schedule_timeout+0x204/0x27c
        schedule_timeout_uninterruptible+0x44/0x50
        wait_iff_congested+0x9c/0x1f0
        shrink_inactive_list+0x3a0/0x4cc
        shrink_lruvec+0x418/0x5cc
        shrink_zone+0x88/0x198
        try_to_free_pages+0x51c/0x588
        __alloc_pages_nodemask+0x648/0xa88
        __get_free_pages+0x34/0x7c
        alloc_buffer+0xa4/0x144
        __bufio_new+0x84/0x278
        dm_bufio_prefetch+0x9c/0x154
        verity_prefetch_io+0xe8/0x10c
        process_one_work+0x240/0x424
        worker_thread+0x2fc/0x424
        kthread+0x10c/0x114
      
      ...and that looks to be the one holding the mutex.
      
      The problem has been reproduced on fairly easily:
      0. Be running Chrome OS w/ verity enabled on the root filesystem
      1. Pick test patch: http://crosreview.com/412360
      2. Install launchBalloons.sh and balloon.arm from
           http://crbug.com/468342
         ...that's just a memory stress test app.
      3. On a 4GB rk3399 machine, run
           nice ./launchBalloons.sh 4 900 100000
         ...that tries to eat 4 * 900 MB of memory and keep accessing.
      4. Login to the Chrome web browser and restore many tabs
      
      With that, I've seen printouts like:
        DOUG: long bufio 90758 ms
      ...and stack trace always show's we're in dm_bufio_prefetch().
      
      The problem is that we try to allocate memory with GFP_NOIO while
      we're holding the dm_bufio lock.  Instead we should be using
      GFP_NOWAIT.  Using GFP_NOIO can cause us to sleep while holding the
      lock and that causes the above problems.
      
      The current behavior explained by David Rientjes:
      
        It will still try reclaim initially because __GFP_WAIT (or
        __GFP_KSWAPD_RECLAIM) is set by GFP_NOIO.  This is the cause of
        contention on dm_bufio_lock() that the thread holds.  You want to
        pass GFP_NOWAIT instead of GFP_NOIO to alloc_buffer() when holding a
        mutex that can be contended by a concurrent slab shrinker (if
        count_objects didn't use a trylock, this pattern would trivially
        deadlock).
      
      This change significantly increases responsiveness of the system while
      in this state.  It makes a real difference because it unblocks kswapd.
      In the bug report analyzed, kswapd was hung:
      
         kswapd0         D ffffffc000204fd8     0    72      2 0x00000000
         Call trace:
         [<ffffffc000204fd8>] __switch_to+0x9c/0xa8
         [<ffffffc00090b794>] __schedule+0x440/0x6d8
         [<ffffffc00090bac0>] schedule+0x94/0xb4
         [<ffffffc00090be44>] schedule_preempt_disabled+0x28/0x44
         [<ffffffc00090d900>] __mutex_lock_slowpath+0x120/0x1ac
         [<ffffffc00090d9d8>] mutex_lock+0x4c/0x68
         [<ffffffc000708e7c>] dm_bufio_shrink_count+0x38/0x78
         [<ffffffc00030b268>] shrink_slab.part.54.constprop.65+0x100/0x464
         [<ffffffc00030dbd8>] shrink_zone+0xa8/0x198
         [<ffffffc00030e578>] balance_pgdat+0x328/0x508
         [<ffffffc00030eb7c>] kswapd+0x424/0x51c
         [<ffffffc00023f06c>] kthread+0x10c/0x114
         [<ffffffc000203dd0>] ret_from_fork+0x10/0x40
      
      By unblocking kswapd memory pressure should be reduced.
      Suggested-by: default avatarDavid Rientjes <rientjes@google.com>
      Reviewed-by: default avatarGuenter Roeck <linux@roeck-us.net>
      Signed-off-by: default avatarDouglas Anderson <dianders@chromium.org>
      Signed-off-by: default avatarMike Snitzer <snitzer@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      0758c35b
    • Vlastimil Babka's avatar
      mm, page_alloc: do not break __GFP_THISNODE by zonelist reset · 6cfbbdd2
      Vlastimil Babka authored
      commit 7810e678 upstream.
      
      In __alloc_pages_slowpath() we reset zonelist and preferred_zoneref for
      allocations that can ignore memory policies.  The zonelist is obtained
      from current CPU's node.  This is a problem for __GFP_THISNODE
      allocations that want to allocate on a different node, e.g.  because the
      allocating thread has been migrated to a different CPU.
      
      This has been observed to break SLAB in our 4.4-based kernel, because
      there it relies on __GFP_THISNODE working as intended.  If a slab page
      is put on wrong node's list, then further list manipulations may corrupt
      the list because page_to_nid() is used to determine which node's
      list_lock should be locked and thus we may take a wrong lock and race.
      
      Current SLAB implementation seems to be immune by luck thanks to commit
      511e3a05 ("mm/slab: make cache_grow() handle the page allocated on
      arbitrary node") but there may be others assuming that __GFP_THISNODE
      works as promised.
      
      We can fix it by simply removing the zonelist reset completely.  There
      is actually no reason to reset it, because memory policies and cpusets
      don't affect the zonelist choice in the first place.  This was different
      when commit 183f6371 ("mm: ignore mempolicies when using
      ALLOC_NO_WATERMARK") introduced the code, as mempolicies provided their
      own restricted zonelists.
      
      We might consider this for 4.17 although I don't know if there's
      anything currently broken.
      
      SLAB is currently not affected, but in kernels older than 4.7 that don't
      yet have 511e3a05 ("mm/slab: make cache_grow() handle the page
      allocated on arbitrary node") it is.  That's at least 4.4 LTS.  Older
      ones I'll have to check.
      
      So stable backports should be more important, but will have to be
      reviewed carefully, as the code went through many changes.  BTW I think
      that also the ac->preferred_zoneref reset is currently useless if we
      don't also reset ac->nodemask from a mempolicy to NULL first (which we
      probably should for the OOM victims etc?), but I would leave that for a
      separate patch.
      
      Link: http://lkml.kernel.org/r/20180525130853.13915-1-vbabka@suse.czSigned-off-by: default avatarVlastimil Babka <vbabka@suse.cz>
      Fixes: 183f6371 ("mm: ignore mempolicies when using ALLOC_NO_WATERMARK")
      Acked-by: default avatarMel Gorman <mgorman@techsingularity.net>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      6cfbbdd2
    • Brad Love's avatar
      media: cx25840: Use subdev host data for PLL override · d96a0d3c
      Brad Love authored
      commit 3ee9bc12 upstream.
      
      The cx25840 driver currently configures 885, 887, and 888 using
      default divisors for each chip. This check to see if the cx23885
      driver has passed the cx25840 a non-default clock rate for a
      specific chip. If a cx23885 board has left clk_freq at 0, the
      clock default values will be used to configure the PLLs.
      
      This patch only has effect on 888 boards who set clk_freq to 25M.
      Signed-off-by: default avatarBrad Love <brad@nextdimension.cc>
      Signed-off-by: default avatarMauro Carvalho Chehab <mchehab@s-opensource.com>
      Cc: Ben Hutchings <ben.hutchings@codethink.co.uk>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d96a0d3c