1. 20 Dec, 2017 22 commits
    • ext4: fix crash when a directory's i_size is too small · 32e2ae03
      Chandan Rajendra authored
      commit 9d5afec6 upstream.
      
      On a ppc64 machine, when mounting a fuzzed ext2 image (generated by
      fsfuzzer) the following call trace is seen,
      
      VFS: brelse: Trying to free free buffer
      WARNING: CPU: 1 PID: 6913 at /root/repos/linux/fs/buffer.c:1165 .__brelse.part.6+0x24/0x40
      .__brelse.part.6+0x20/0x40 (unreliable)
      .ext4_find_entry+0x384/0x4f0
      .ext4_lookup+0x84/0x250
      .lookup_slow+0xdc/0x230
      .walk_component+0x268/0x400
      .path_lookupat+0xec/0x2d0
      .filename_lookup+0x9c/0x1d0
      .vfs_statx+0x98/0x140
      .SyS_newfstatat+0x48/0x80
      system_call+0x58/0x6c
      
      This happens because the directory that ext4_find_entry() looks up has
      inode->i_size that is less than the block size of the filesystem. This
      causes 'nblocks' to have a value of zero, so ext4_bread_batch() ends up
      not reading any of the directory file's blocks. The entries in the
      bh_use[] array are therefore left holding garbage data. buffer_uptodate()
      on bh_use[0] can then return zero, upon which brelse() is invoked.
      
      This commit fixes the bug by returning -ENOENT when the directory file
      has no associated blocks.
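      The arithmetic behind the bug can be sketched in userspace as follows.
      This is only an illustration under assumptions: dir_nblocks() and
      find_entry_precheck() are hypothetical names, not the kernel's code, and
      the block size is passed in as a shift count.

```c
#include <errno.h>
#include <stdint.h>

/* Number of whole directory blocks implied by i_size (hypothetical helper). */
static uint32_t dir_nblocks(uint64_t i_size, unsigned int blkbits)
{
	return (uint32_t)(i_size >> blkbits);
}

/*
 * Mirrors the idea of the fix: report -ENOENT before touching any buffer
 * bookkeeping when the directory has no blocks to search.
 */
static int find_entry_precheck(uint64_t i_size, unsigned int blkbits)
{
	if (dir_nblocks(i_size, blkbits) == 0)
		return -ENOENT;
	return 0;
}
```

      With 4096-byte blocks (blkbits = 12), an i_size of 100 yields zero
      blocks, so the lookup is rejected instead of walking uninitialised
      buffer heads.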
      Reported-by: Abdul Haleem <abdhalee@linux.vnet.ibm.com>
      Signed-off-by: Chandan Rajendra <chandan@linux.vnet.ibm.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ext4: fix fdatasync(2) after fallocate(2) operation · 6a851bb9
      Eryu Guan authored
      commit c894aa97 upstream.
      
      Currently, if fallocate(2) with KEEP_SIZE is followed by an fdatasync(2)
      and then a crash, we'll see a wrong allocated block count (stat -c %b):
      the blocks allocated beyond EOF are all lost. fstests generic/468
      exposes this bug.
      
      Commit 67a7d5f5 ("ext4: fix fdatasync(2) after extent
      manipulation operations") fixed all the other extent manipulation
      operation paths such as hole punch, zero range, collapse range etc.,
      but forgot the fallocate case.
      
      So similarly, fix it by recording the correct journal tid in ext4
      inode in fallocate(2) path, so that ext4_sync_file() will wait for
      the right tid to be committed on fdatasync(2).
      
      This addresses the test failure in xfstests test generic/468.
      Signed-off-by: Eryu Guan <eguan@redhat.com>
      Signed-off-by: Theodore Ts'o <tytso@mit.edu>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • dmaengine: dmatest: move callback wait queue to thread context · 679dbeac
      Adam Wallis authored
      commit 6f6a23a2 upstream.
      
      Commit adfa543e ("dmatest: don't use set_freezable_with_signal()")
      introduced a bug (that is in fact documented by the patch commit text)
      that leaves behind a dangling pointer. Since the done_wait structure is
      allocated on the stack, future invocations to the DMATEST can produce
      undesirable results (e.g., corrupted spinlocks).
      
      Commit a9df21e3 ("dmaengine: dmatest: warn user when dma test times
      out") attempted to WARN the user that the stack was likely corrupted but
      did not fix the actual issue.
      
      This patch fixes the issue by pushing the wait queue and callback
      structs into the thread structure. If a failure occurs due to a timeout,
      dmaengine_terminate_all will force the callback to safely call
      wake_up_all() without the possibility of using a freed pointer.
      
      Bug: https://bugzilla.kernel.org/show_bug.cgi?id=197605
      Fixes: adfa543e ("dmatest: don't use set_freezable_with_signal()")
      Reviewed-by: Sinan Kaya <okaya@codeaurora.org>
      Suggested-by: Shunyong Yang <shunyong.yang@hxt-semitech.com>
      Signed-off-by: Adam Wallis <awallis@codeaurora.org>
      Signed-off-by: Vinod Koul <vinod.koul@intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • eeprom: at24: change nvmem stride to 1 · 744cb5ab
      David Lechner authored
      commit 7f6d2ecd upstream.
      
      Trying to read the MAC address from an eeprom that has an offset that
      is not a multiple of 4 causes an error currently.
      
      Fix it by changing the nvmem stride to 1.
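      A minimal sketch of why the stride matters, assuming that a read is
      rejected whenever its offset is not a multiple of the stride.
      offset_allowed() is a hypothetical helper, not the nvmem API.

```c
#include <stdbool.h>
#include <stddef.h>

/* True when 'offset' is a multiple of 'stride' (stride must be non-zero). */
static bool offset_allowed(size_t offset, size_t stride)
{
	return offset % stride == 0;
}
```

      Under that assumption, an offset such as 6 fails with a stride of 4 but
      is accepted with a stride of 1.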
      Signed-off-by: David Lechner <david@lechnology.com>
      [Bartosz: tweaked the commit message]
      Signed-off-by: Bartosz Golaszewski <brgl@bgdev.pl>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • sched/rt: Do not pull from current CPU if only one CPU to pull · d266817f
      Steven Rostedt authored
      commit f73c52a5 upstream.
      
      Daniel Wagner reported a crash on the BeagleBone Black SoC.
      
      This is a single-CPU architecture and does not have a functional
      arch_send_call_function_single_ipi() implementation, so the kernel can
      crash if that function is called.
      
      As it only has one CPU, the function shouldn't be called. But if the
      kernel is compiled for SMP, the push/pull RT scheduling logic now calls
      it for irq_work: if the one CPU is overloaded, it can use that function
      to call itself and crash the kernel.
      
      Ideally, we should disable the SCHED_FEAT(RT_PUSH_IPI) if the system
      only has a single CPU. But SCHED_FEAT is a constant if sched debugging
      is turned off. Another fix can also be used, and this should also help
      with normal SMP machines. That is, do not initiate the pull code if
      there's only one RT overloaded CPU, and that CPU happens to be the
      current CPU that is scheduling in a lower priority task.
      
      Even on a system with many CPUs, if there's many RT tasks waiting to
      run on a single CPU, and that CPU schedules in another RT task of lower
      priority, it will initiate the PULL logic in case there's a higher
      priority RT task on another CPU that is waiting to run. But if there is
      no other CPU with waiting RT tasks, it will initiate the RT pull logic
      on itself (as it still has RT tasks waiting to run). This is a wasted
      effort.
      
      Not only does this help with SMP code where the current CPU is the only
      one with RT overloaded tasks, it should also solve the issue that
      Daniel encountered, because it will prevent the PULL logic from
      executing, as there's only one CPU on the system, and the check added
      here will cause it to exit the RT pull code.
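      The added check can be illustrated with a small userspace sketch. It
      assumes the set of RT-overloaded CPUs fits in a 64-bit mask and that
      this_cpu is below 64; should_pull() and overloaded_count() are
      hypothetical names, not scheduler functions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Count the CPUs marked as RT overloaded (portable popcount). */
static unsigned int overloaded_count(uint64_t rto_mask)
{
	unsigned int n = 0;

	while (rto_mask) {
		rto_mask &= rto_mask - 1;	/* clear the lowest set bit */
		n++;
	}
	return n;
}

/* Return true only when the pull logic could find work on another CPU. */
static bool should_pull(uint64_t rto_mask, unsigned int this_cpu)
{
	if (rto_mask == 0)
		return false;
	/* The only overloaded CPU is the current one: nothing to pull from. */
	if (overloaded_count(rto_mask) == 1 &&
	    (rto_mask & (UINT64_C(1) << this_cpu)))
		return false;
	return true;
}
```

      On a single-CPU system the mask can only ever contain the current CPU,
      so should_pull() is always false there and the IPI path is never
      entered.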
      Reported-by: Daniel Wagner <wagi@monom.org>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Acked-by: Peter Zijlstra <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-rt-users <linux-rt-users@vger.kernel.org>
      Fixes: 4bdced5c ("sched/rt: Simplify the IPI based RT balancing logic")
      Link: http://lkml.kernel.org/r/20171202130454.4cbbfe8d@vmware.local.home
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • nfs: don't wait on commit in nfs_commit_inode() if there were no commit requests · 9c537f06
      Scott Mayhew authored
      commit dc4fd9ab upstream.
      
      If there were no commit requests, then nfs_commit_inode() should not
      wait on the commit or mark the inode dirty, otherwise the following
      BUG_ON can be triggered:
      
      [ 1917.130762] kernel BUG at fs/inode.c:578!
      [ 1917.130766] Oops: Exception in kernel mode, sig: 5 [#1]
      [ 1917.130768] SMP NR_CPUS=2048 NUMA pSeries
      [ 1917.130772] Modules linked in: iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi blocklayoutdriver rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache sunrpc sg nx_crypto pseries_rng ip_tables xfs libcrc32c sd_mod crc_t10dif crct10dif_generic crct10dif_common ibmvscsi scsi_transport_srp ibmveth scsi_tgt dm_mirror dm_region_hash dm_log dm_mod
      [ 1917.130805] CPU: 2 PID: 14923 Comm: umount.nfs4 Tainted: G               ------------ T 3.10.0-768.el7.ppc64 #1
      [ 1917.130810] task: c0000005ecd88040 ti: c00000004cea0000 task.ti: c00000004cea0000
      [ 1917.130813] NIP: c000000000354178 LR: c000000000354160 CTR: c00000000012db80
      [ 1917.130816] REGS: c00000004cea3720 TRAP: 0700   Tainted: G               ------------ T  (3.10.0-768.el7.ppc64)
      [ 1917.130820] MSR: 8000000100029032 <SF,EE,ME,IR,DR,RI>  CR: 22002822  XER: 20000000
      [ 1917.130828] CFAR: c00000000011f594 SOFTE: 1
      GPR00: c000000000354160 c00000004cea39a0 c0000000014c4700 c0000000018cc750
      GPR04: 000000000000c750 80c0000000000000 0600000000000000 04eeb76bea749a03
      GPR08: 0000000000000034 c0000000018cc758 0000000000000001 d000000005e619e8
      GPR12: c00000000012db80 c000000007b31200 0000000000000000 0000000000000000
      GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
      GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
      GPR24: 0000000000000000 c000000000dfc3ec 0000000000000000 c0000005eefc02c0
      GPR28: d0000000079dbd50 c0000005b94a02c0 c0000005b94a0250 c0000005b94a01c8
      [ 1917.130867] NIP [c000000000354178] .evict+0x1c8/0x350
      [ 1917.130871] LR [c000000000354160] .evict+0x1b0/0x350
      [ 1917.130873] Call Trace:
      [ 1917.130876] [c00000004cea39a0] [c000000000354160] .evict+0x1b0/0x350 (unreliable)
      [ 1917.130880] [c00000004cea3a30] [c0000000003558cc] .evict_inodes+0x13c/0x270
      [ 1917.130884] [c00000004cea3af0] [c000000000327d20] .kill_anon_super+0x70/0x1e0
      [ 1917.130896] [c00000004cea3b80] [d000000005e43e30] .nfs_kill_super+0x20/0x60 [nfs]
      [ 1917.130900] [c00000004cea3c00] [c000000000328a20] .deactivate_locked_super+0xa0/0x1b0
      [ 1917.130903] [c00000004cea3c80] [c00000000035ba54] .cleanup_mnt+0xd4/0x180
      [ 1917.130907] [c00000004cea3d10] [c000000000119034] .task_work_run+0x114/0x150
      [ 1917.130912] [c00000004cea3db0] [c00000000001ba6c] .do_notify_resume+0xcc/0x100
      [ 1917.130916] [c00000004cea3e30] [c00000000000a7b0] .ret_from_except_lite+0x5c/0x60
      [ 1917.130919] Instruction dump:
      [ 1917.130921] 7fc3f378 486734b5 60000000 387f00a0 38800003 4bdcb365 60000000 e95f00a0
      [ 1917.130927] 694a0060 7d4a0074 794ad182 694a0001 <0b0a0000> 892d02a4 2f890000 40de0134
      Signed-off-by: Scott Mayhew <smayhew@redhat.com>
      Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • xhci: Don't add a virt_dev to the devs array before it's fully allocated · 3bdb508d
      Mathias Nyman authored
      commit 5d9b70f7 upstream.
      
      Avoid null pointer dereference if some function is walking through the
      devs array accessing members of a new virt_dev that is mid allocation.
      
      Add the virt_dev to xhci->devs[i] _after_ the virt_device and all its
      members are properly allocated.
      
      issue found by KASAN: null-ptr-deref in xhci_find_slot_id_by_port
      
      "Quick analysis suggests that xhci_alloc_virt_device() is not mutex
      protected. If so, there is a time frame where xhci->devs[slot_id] is set
      but not fully initialized. Specifically, xhci->devs[i]->udev can be NULL."
      Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • Bluetooth: btusb: driver to enable the usb-wakeup feature · 7336f548
      Sukumar Ghorai authored
      commit a0085f25 upstream.
      
      The BT controller is connected as a platform non-root-hub device, and
      the USB driver initializes such devices with wakeup disabled (see
      usb_new_device()).
      
      At present, the wakeup capability is enabled at runtime by the hid-input
      device from the USB function driver (e.g. a BT HID device). However,
      some function drivers do not set the usb-wakeup capability (e.g. an LE
      HID device implemented as HID-over-GATT) and so cannot wake up the host
      over USB.
      
      Most device operations (such as mass storage) are initiated from the
      host (except HID), and USB wakeup is aligned with the host resume
      procedure. For BT devices, the usb-wakeup capability needs to be enabled
      from the btusb driver as a generic solution for multiple-profile use
      cases; it is required for USB remote wakeup (in-bus wakeup) while the
      host is suspended. The usb-wakeup feature also needs to be enabled and
      disabled as the HCI interface goes up and down.
      Signed-off-by: Sukumar Ghorai <sukumar.ghorai@intel.com>
      Signed-off-by: Amit K Bag <amit.k.bag@intel.com>
      Acked-by: Oliver Neukum <oneukum@suse.com>
      Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
      Cc: Matthias Kaehlcke <mka@chromium.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • usb: xhci: fix TDS for MTK xHCI1.1 · cdfe4c00
      Chunfeng Yun authored
      commit 72b663a9 upstream.
      
      For MTK's xHCI 1.0 or later, TD size is the number of max
      packet sized packets remaining in the TD, not including
      this TRB (following spec).
      
      For MTK's xHCI 0.96 and older, TD size is the number of max
      packet sized packets remaining in the TD, including this TRB
      (not following spec).
      Signed-off-by: Chunfeng Yun <chunfeng.yun@mediatek.com>
      Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ceph: drop negative child dentries before try pruning inode's alias · e081bd0d
      Yan, Zheng authored
      commit 040d7860 upstream.
      
      A negative child dentry holds a reference on the inode's alias, which
      makes d_prune_aliases() do nothing.
      Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
      Reviewed-by: Jeff Layton <jlayton@redhat.com>
      Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • usbip: fix stub_send_ret_submit() vulnerability to null transfer_buffer · 14513e49
      Shuah Khan authored
      commit be6123df upstream.
      
      stub_send_ret_submit() handles urb with a potential null transfer_buffer,
      when it replays a packet with potential malicious data that could contain
      a null buffer. Add a check for the condition when actual_length > 0 and
      transfer_buffer is null.
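      The condition described above can be sketched as a small predicate.
      struct reply_urb is a hypothetical, reduced stand-in for the two urb
      fields involved, and reply_is_sendable() is not a usbip function.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced stand-in for the fields of interest in struct urb. */
struct reply_urb {
	void *transfer_buffer;
	int actual_length;
};

/* A reply that claims payload bytes but carries no buffer must be rejected. */
static bool reply_is_sendable(const struct reply_urb *urb)
{
	return !(urb->actual_length > 0 && urb->transfer_buffer == NULL);
}
```

      A zero-length reply with a null buffer remains valid; only the
      combination of a positive length and a null buffer is refused.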
      Reported-by: Secunia Research <vuln@secunia.com>
      Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • usbip: fix stub_rx: harden CMD_SUBMIT path to handle malicious input · f3e95726
      Shuah Khan authored
      commit c6688ef9 upstream.
      
      Harden CMD_SUBMIT path to handle malicious input that could trigger
      large memory allocations. Add checks to validate transfer_buffer_length
      and number_of_packets to protect against bad input requesting for
      unbounded memory allocations. Validate early in get_pipe() and return
      failure.
      Reported-by: Secunia Research <vuln@secunia.com>
      Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • usb: add helper to extract bits 12:11 of wMaxPacketSize · b6dbace9
      Felipe Balbi authored
      commit 541b6fe6 upstream.
      
      According to USB Specification 2.0 table 9-4,
      wMaxPacketSize is a bitfield. Endpoint's maxpacket
      is laid out in bits 10:0. For high-speed,
      high-bandwidth isochronous endpoints, bits 12:11
      contain a multiplier to tell us how many
      transactions we want to try per uframe.
      
      This means that if we want an isochronous endpoint
      to issue 3 transfers of 1024 bytes per uframe,
      wMaxPacketSize should contain the value:
      
      	1024 | (2 << 11)
      
      or 5120 (0x1400). In order to make Host and
      Peripheral controller drivers' life easier, we're
      adding a helper which returns bits 12:11. Note that
      no care is taken to check the endpoint type or the
      gadget's speed. That's left for drivers to handle.
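      The bitfield layout described above can be sketched in plain C as
      follows. maxp_size() and maxp_mult_bits() are hypothetical names used
      for illustration and are not claimed to match the helper added by this
      commit.

```c
#include <stdint.h>

/* Bits 10:0: the endpoint's maximum packet size in bytes. */
static unsigned int maxp_size(uint16_t w_max_packet_size)
{
	return w_max_packet_size & 0x07ff;
}

/* Bits 12:11: the raw multiplier field from the descriptor. */
static unsigned int maxp_mult_bits(uint16_t w_max_packet_size)
{
	return (w_max_packet_size >> 11) & 0x3;
}
```

      For the value from the example, 1024 | (2 << 11) = 0x1400, the size is
      1024 and the raw field is 2, i.e. 3 transactions per uframe.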
      Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • usbip: fix stub_rx: get_pipe() to validate endpoint number · 20e825cd
      Shuah Khan authored
      commit 635f545a upstream.
      
      The get_pipe() routine doesn't validate the input endpoint number
      and uses it to index the ep_in and ep_out arrays. An invalid endpoint
      number can trigger BUG(). Range check the epnum and return an
      error instead of calling BUG().
      
      Change caller stub_recv_cmd_submit() to handle the get_pipe()
      error return.
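      A minimal sketch of such a range check, assuming the arrays hold one
      slot per USB endpoint number (0 to 15). check_epnum() and the -EINVAL
      return value are illustrative choices, not the values used by the
      patch.

```c
#include <errno.h>

#define NUM_ENDPOINTS 16	/* USB endpoint numbers are 4 bits wide: 0..15 */

/* Validate an endpoint number before it is used as an array index. */
static int check_epnum(int epnum)
{
	if (epnum < 0 || epnum >= NUM_ENDPOINTS)
		return -EINVAL;	/* report an error to the caller, no BUG() */
	return 0;
}
```

      The caller is then expected to test the result and fail the request
      rather than index the arrays.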
      Reported-by: Secunia Research <vuln@secunia.com>
      Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • USB: core: prevent malicious bNumInterfaces overflow · 99542e46
      Alan Stern authored
      commit 48a4ff1c upstream.
      
      A malicious USB device with crafted descriptors can cause the kernel
      to access unallocated memory by setting the bNumInterfaces value too
      high in a configuration descriptor.  Although the value is adjusted
      during parsing, this adjustment is skipped in one of the error return
      paths.
      
      This patch prevents the problem by setting bNumInterfaces to 0
      initially.  The existing code already sets it to the proper value
      after parsing is complete.
      Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
      Reported-by: Andrey Konovalov <andreyknvl@google.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • USB: uas and storage: Add US_FL_BROKEN_FUA for another JMicron JMS567 ID · 0d29ae4f
      David Kozub authored
      commit 62354454 upstream.
      
      There is another JMS567-based USB3 UAS enclosure (152d:0578) that fails
      with the following error:
      
      [sda] tag#0 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
      [sda] tag#0 Sense Key : Illegal Request [current]
      [sda] tag#0 Add. Sense: Invalid field in cdb
      
      The issue occurs both with UAS (occasionally) and mass storage
      (immediately after mounting a FS on a disk in the enclosure).
      
      Enabling US_FL_BROKEN_FUA quirk solves this issue.
      
      This patch adds an UNUSUAL_DEV with US_FL_BROKEN_FUA for the enclosure
      for both UAS and mass storage.
      Signed-off-by: David Kozub <zub@linux.fjfi.cvut.cz>
      Acked-by: Alan Stern <stern@rowland.harvard.edu>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • tracing: Allocate mask_str buffer dynamically · d760f903
      Changbin Du authored
      commit 90e406f9 upstream.
      
      The default NR_CPUS can be very large, but actual possible nr_cpu_ids
      usually is very small. For my x86 distribution, the NR_CPUS is 8192 and
      nr_cpu_ids is 4. About 2 pages are wasted.
      
      Most machines don't have so many CPUs, so defining an array with NR_CPUS
      entries just wastes memory. So let's allocate the buffer dynamically
      when needed.
      
      With this change, the mutex tracing_cpumask_update_lock, which was used
      to protect mask_str, can also be removed.
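      The "measure, then allocate" idea can be shown with a userspace
      analogue. This sketch uses snprintf() and malloc() rather than kernel
      interfaces, and format_mask() is a hypothetical name.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Format a CPU mask as hex into a buffer sized to fit exactly.
 * Returns NULL on allocation failure; the caller frees the result.
 */
static char *format_mask(unsigned long mask)
{
	/* With a NULL buffer and size 0, snprintf only reports the length. */
	int len = snprintf(NULL, 0, "%lx\n", mask) + 1;
	char *buf = malloc((size_t)len);

	if (!buf)
		return NULL;
	snprintf(buf, (size_t)len, "%lx\n", mask);
	return buf;
}
```

      Because each call owns its own buffer, no lock is needed to protect a
      shared static string.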
      
      Link: http://lkml.kernel.org/r/1512013183-19107-1-git-send-email-changbin.du@intel.com
      
      Fixes: 36dfe925 ("ftrace: make use of tracing_cpumask")
      Signed-off-by: Changbin Du <changbin.du@intel.com>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • autofs: fix careless error in recent commit · d1175423
      NeilBrown authored
      commit 302ec300 upstream.
      
      Commit ecc0c469 ("autofs: don't fail mount for transient error") was
      meant to replace an 'if' with a 'switch', but instead added the 'switch'
      leaving the 'if' in place.
      
      Link: http://lkml.kernel.org/r/87zi6wstmw.fsf@notabene.neil.brown.name
      Fixes: ecc0c469 ("autofs: don't fail mount for transient error")
      Reported-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
      Signed-off-by: NeilBrown <neilb@suse.com>
      Cc: Ian Kent <raven@themaw.net>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • crypto: salsa20 - fix blkcipher_walk API usage · c32e053a
      Eric Biggers authored
      commit ecaaab56 upstream.
      
      When asked to encrypt or decrypt 0 bytes, both the generic and x86
      implementations of Salsa20 crash in blkcipher_walk_done(), either when
      doing 'kfree(walk->buffer)' or 'free_page((unsigned long)walk->page)',
      because walk->buffer and walk->page have not been initialized.
      
      The bug is that Salsa20 is calling blkcipher_walk_done() even when
      nothing is in 'walk.nbytes'.  But blkcipher_walk_done() is only meant to
      be called when a nonzero number of bytes have been provided.
      
      The broken code is part of an optimization that tries to make only one
      call to salsa20_encrypt_bytes() to process inputs that are not evenly
      divisible by 64 bytes.  To fix the bug, just remove this "optimization"
      and use the blkcipher_walk API the same way all the other users do.
      
      Reproducer:
      
          #include <linux/if_alg.h>
          #include <sys/socket.h>
          #include <unistd.h>
      
          int main()
          {
                  int algfd, reqfd;
                  struct sockaddr_alg addr = {
                          .salg_type = "skcipher",
                          .salg_name = "salsa20",
                  };
                  char key[16] = { 0 };
      
                  algfd = socket(AF_ALG, SOCK_SEQPACKET, 0);
                  bind(algfd, (void *)&addr, sizeof(addr));
                  reqfd = accept(algfd, 0, 0);
                  setsockopt(algfd, SOL_ALG, ALG_SET_KEY, key, sizeof(key));
                  read(reqfd, key, sizeof(key));
          }
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Fixes: eb6f13eb ("[CRYPTO] salsa20_generic: Fix multi-page processing")
      Signed-off-by: Eric Biggers <ebiggers@google.com>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • crypto: hmac - require that the underlying hash algorithm is unkeyed · 43259d07
      Eric Biggers authored
      commit af3ff804 upstream.
      
      Because the HMAC template didn't check that its underlying hash
      algorithm is unkeyed, trying to use "hmac(hmac(sha3-512-generic))"
      through AF_ALG or through KEYCTL_DH_COMPUTE resulted in the inner HMAC
      being used without having been keyed, resulting in sha3_update() being
      called without sha3_init(), causing a stack buffer overflow.
      
      This is a very old bug, but it seems to have only started causing real
      problems when SHA-3 support was added (requires CONFIG_CRYPTO_SHA3)
      because the innermost hash's state is ->import()ed from a zeroed buffer,
      and it just so happens that other hash algorithms are fine with that,
      but SHA-3 is not.  However, there could be arch or hardware-dependent
      hash algorithms also affected; I couldn't test everything.
      
      Fix the bug by introducing a function crypto_shash_alg_has_setkey()
      which tests whether a shash algorithm is keyed.  Then update the HMAC
      template to require that its underlying hash algorithm is unkeyed.
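      The shape of that check can be sketched in userspace. Everything here
      is illustrative: struct hash_alg, no_setkey(), keyed_setkey(),
      alg_has_setkey() and hmac_check_inner() are hypothetical stand-ins,
      and the -EINVAL result is an assumption rather than the kernel's
      documented behaviour.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced algorithm descriptor. */
struct hash_alg {
	const char *name;
	int (*setkey)(const unsigned char *key, size_t len);
};

/* Default installed for algorithms that take no key. */
static int no_setkey(const unsigned char *key, size_t len)
{
	(void)key;
	(void)len;
	return -ENOSYS;
}

/* Stand-in for a keyed algorithm's setkey, e.g. an inner HMAC. */
static int keyed_setkey(const unsigned char *key, size_t len)
{
	(void)key;
	(void)len;
	return 0;
}

/* An algorithm counts as keyed when it overrides the default setkey. */
static bool alg_has_setkey(const struct hash_alg *alg)
{
	return alg->setkey != no_setkey;
}

/* Template instantiation check: refuse to wrap a keyed hash. */
static int hmac_check_inner(const struct hash_alg *inner)
{
	return alg_has_setkey(inner) ? -EINVAL : 0;
}
```

      With this check, a nested construction such as hmac(hmac(...)) is
      refused when the template is instantiated, before any unkeyed inner
      state can be used.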
      
      Here is a reproducer:
      
          #include <linux/if_alg.h>
          #include <sys/socket.h>
      
          int main()
          {
              int algfd;
              struct sockaddr_alg addr = {
                  .salg_type = "hash",
                  .salg_name = "hmac(hmac(sha3-512-generic))",
              };
              char key[4096] = { 0 };
      
              algfd = socket(AF_ALG, SOCK_SEQPACKET, 0);
              bind(algfd, (const struct sockaddr *)&addr, sizeof(addr));
              setsockopt(algfd, SOL_ALG, ALG_SET_KEY, key, sizeof(key));
          }
      
      Here was the KASAN report from syzbot:
      
          BUG: KASAN: stack-out-of-bounds in memcpy include/linux/string.h:341  [inline]
          BUG: KASAN: stack-out-of-bounds in sha3_update+0xdf/0x2e0  crypto/sha3_generic.c:161
          Write of size 4096 at addr ffff8801cca07c40 by task syzkaller076574/3044
      
          CPU: 1 PID: 3044 Comm: syzkaller076574 Not tainted 4.14.0-mm1+ #25
          Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS  Google 01/01/2011
          Call Trace:
            __dump_stack lib/dump_stack.c:17 [inline]
            dump_stack+0x194/0x257 lib/dump_stack.c:53
            print_address_description+0x73/0x250 mm/kasan/report.c:252
            kasan_report_error mm/kasan/report.c:351 [inline]
            kasan_report+0x25b/0x340 mm/kasan/report.c:409
            check_memory_region_inline mm/kasan/kasan.c:260 [inline]
            check_memory_region+0x137/0x190 mm/kasan/kasan.c:267
            memcpy+0x37/0x50 mm/kasan/kasan.c:303
            memcpy include/linux/string.h:341 [inline]
            sha3_update+0xdf/0x2e0 crypto/sha3_generic.c:161
            crypto_shash_update+0xcb/0x220 crypto/shash.c:109
            shash_finup_unaligned+0x2a/0x60 crypto/shash.c:151
            crypto_shash_finup+0xc4/0x120 crypto/shash.c:165
            hmac_finup+0x182/0x330 crypto/hmac.c:152
            crypto_shash_finup+0xc4/0x120 crypto/shash.c:165
            shash_digest_unaligned+0x9e/0xd0 crypto/shash.c:172
            crypto_shash_digest+0xc4/0x120 crypto/shash.c:186
            hmac_setkey+0x36a/0x690 crypto/hmac.c:66
            crypto_shash_setkey+0xad/0x190 crypto/shash.c:64
            shash_async_setkey+0x47/0x60 crypto/shash.c:207
            crypto_ahash_setkey+0xaf/0x180 crypto/ahash.c:200
            hash_setkey+0x40/0x90 crypto/algif_hash.c:446
            alg_setkey crypto/af_alg.c:221 [inline]
            alg_setsockopt+0x2a1/0x350 crypto/af_alg.c:254
            SYSC_setsockopt net/socket.c:1851 [inline]
            SyS_setsockopt+0x189/0x360 net/socket.c:1830
            entry_SYSCALL_64_fastpath+0x1f/0x96
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: Eric Biggers <ebiggers@google.com>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • crypto: rsa - fix buffer overread when stripping leading zeroes · cd9b5986
      Eric Biggers authored
      commit d2890c37 upstream.
      
      In rsa_get_n(), if the buffer contained all 0's and "FIPS mode" is
      enabled, we would read one byte past the end of the buffer while
      scanning the leading zeroes.  Fix it by checking 'n_sz' before '!*ptr'.
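      The ordering of the two tests is the whole fix, which a standalone
      sketch makes visible. strip_leading_zeroes() is a hypothetical
      userspace function written for illustration, not the kernel's
      rsa_get_n().

```c
#include <stddef.h>

/*
 * Skip leading zero bytes. The length is tested before the byte is read,
 * so an all-zero buffer is never dereferenced past its end.
 */
static const unsigned char *strip_leading_zeroes(const unsigned char *ptr,
						 size_t *n_sz)
{
	while (*n_sz && !*ptr) {
		ptr++;
		(*n_sz)--;
	}
	return ptr;
}
```

      With the operands in the opposite order, the final iteration on an
      all-zero buffer would read *ptr one byte beyond the allocation before
      noticing that the length had reached zero.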
      
      This bug was reachable by adding a specially crafted key of type
      "asymmetric" (requires CONFIG_RSA and CONFIG_X509_CERTIFICATE_PARSER).
      
      KASAN report:
      
          BUG: KASAN: slab-out-of-bounds in rsa_get_n+0x19e/0x1d0 crypto/rsa_helper.c:33
          Read of size 1 at addr ffff88003501a708 by task keyctl/196
      
          CPU: 1 PID: 196 Comm: keyctl Not tainted 4.14.0-09238-g1d3b78bb #26
          Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-20171110_100015-anatol 04/01/2014
          Call Trace:
           rsa_get_n+0x19e/0x1d0 crypto/rsa_helper.c:33
           asn1_ber_decoder+0x82a/0x1fd0 lib/asn1_decoder.c:328
           rsa_set_pub_key+0xd3/0x320 crypto/rsa.c:278
           crypto_akcipher_set_pub_key ./include/crypto/akcipher.h:364 [inline]
           pkcs1pad_set_pub_key+0xae/0x200 crypto/rsa-pkcs1pad.c:117
           crypto_akcipher_set_pub_key ./include/crypto/akcipher.h:364 [inline]
           public_key_verify_signature+0x270/0x9d0 crypto/asymmetric_keys/public_key.c:106
           x509_check_for_self_signed+0x2ea/0x480 crypto/asymmetric_keys/x509_public_key.c:141
           x509_cert_parse+0x46a/0x620 crypto/asymmetric_keys/x509_cert_parser.c:129
           x509_key_preparse+0x61/0x750 crypto/asymmetric_keys/x509_public_key.c:174
           asymmetric_key_preparse+0xa4/0x150 crypto/asymmetric_keys/asymmetric_type.c:388
           key_create_or_update+0x4d4/0x10a0 security/keys/key.c:850
           SYSC_add_key security/keys/keyctl.c:122 [inline]
           SyS_add_key+0xe8/0x290 security/keys/keyctl.c:62
           entry_SYSCALL_64_fastpath+0x1f/0x96
      
          Allocated by task 196:
           __do_kmalloc mm/slab.c:3711 [inline]
           __kmalloc_track_caller+0x118/0x2e0 mm/slab.c:3726
           kmemdup+0x17/0x40 mm/util.c:118
           kmemdup ./include/linux/string.h:414 [inline]
           x509_cert_parse+0x2cb/0x620 crypto/asymmetric_keys/x509_cert_parser.c:106
           x509_key_preparse+0x61/0x750 crypto/asymmetric_keys/x509_public_key.c:174
           asymmetric_key_preparse+0xa4/0x150 crypto/asymmetric_keys/asymmetric_type.c:388
           key_create_or_update+0x4d4/0x10a0 security/keys/key.c:850
           SYSC_add_key security/keys/keyctl.c:122 [inline]
           SyS_add_key+0xe8/0x290 security/keys/keyctl.c:62
           entry_SYSCALL_64_fastpath+0x1f/0x96
      
      Fixes: 5a7de973 ("crypto: rsa - return raw integers for the ASN.1 parser")
      Cc: Tudor Ambarus <tudor-dan.ambarus@nxp.com>
      Signed-off-by: Eric Biggers <ebiggers@google.com>
      Reviewed-by: James Morris <james.l.morris@oracle.com>
      Reviewed-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • mfd: fsl-imx25: Clean up irq settings during removal · 1fb73eae
      Martin Kaiser authored
      commit 18f77393 upstream.
      
      When fsl-imx25-tsadc is compiled as a module, loading, unloading and
      reloading the module will lead to a crash.
      
      Unable to handle kernel paging request at virtual address bf005430
      [<c004df6c>] (irq_find_matching_fwspec)
         from [<c028d5ec>] (of_irq_get+0x58/0x74)
      [<c028d594>] (of_irq_get)
         from [<c01ff970>] (platform_get_irq+0x48/0xc8)
      [<c01ff928>] (platform_get_irq)
         from [<bf00e33c>] (mx25_tsadc_probe+0x220/0x2f4 [fsl_imx25_tsadc])
      
      irq_find_matching_fwspec() loops over all registered irq domains. The
      irq domain is still registered from last time the module was loaded but
      the pointer to its operations is invalid after the module was unloaded.
      
      Add a removal function which clears the irq handler and removes the irq
      domain. With this cleanup in place, it's possible to unload and reload
      the module.
      Signed-off-by: Martin Kaiser <martin@kaiser.cx>
      Reviewed-by: Lucas Stach <l.stach@pengutronix.de>
      Signed-off-by: Lee Jones <lee.jones@linaro.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      1fb73eae
  2. 16 Dec, 2017 18 commits
    • Greg Kroah-Hartman's avatar
      Linux 4.9.70 · ee52d08d
      Greg Kroah-Hartman authored
      ee52d08d
    • Leon Romanovsky's avatar
      RDMA/cxgb4: Annotate r2 and stag as __be32 · 349130bb
      Leon Romanovsky authored
      
      [ Upstream commit 7d7d065a ]
      
      Chelsio cxgb4 HW is big-endian, hence there is a need to properly
      annotate the r2 and stag fields as __be32 and not __u32 to fix the
      following sparse warnings.
      
        drivers/infiniband/hw/cxgb4/qp.c:614:16:
          warning: incorrect type in assignment (different base types)
            expected unsigned int [unsigned] [usertype] r2
            got restricted __be32 [usertype] <noident>
        drivers/infiniband/hw/cxgb4/qp.c:615:18:
          warning: incorrect type in assignment (different base types)
            expected unsigned int [unsigned] [usertype] stag
            got restricted __be32 [usertype] <noident>
      
      Cc: Steve Wise <swise@opengridcomputing.com>
      Signed-off-by: Leon Romanovsky <leon@kernel.org>
      Reviewed-by: Steve Wise <swise@opengridcomputing.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
      Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      349130bb
    • Zdenek Kabelac's avatar
      md: free unused memory after bitmap resize · b7d3f2b5
      Zdenek Kabelac authored
      
      [ Upstream commit 0868b99c ]
      
      When the bitmap is resized, the old allocated chunks are simply not
      released once the resized bitmap starts to use the new space.
      
      This fixes in particular kmemleak reports like this one:
      
      unreferenced object 0xffff8f4311e9c000 (size 4096):
        comm "lvm", pid 19333, jiffies 4295263268 (age 528.265s)
        hex dump (first 32 bytes):
          02 80 02 80 02 80 02 80 02 80 02 80 02 80 02 80  ................
          02 80 02 80 02 80 02 80 02 80 02 80 02 80 02 80  ................
        backtrace:
          [<ffffffffa69471ca>] kmemleak_alloc+0x4a/0xa0
          [<ffffffffa628c10e>] kmem_cache_alloc_trace+0x14e/0x2e0
          [<ffffffffa676cfec>] bitmap_checkpage+0x7c/0x110
          [<ffffffffa676d0c5>] bitmap_get_counter+0x45/0xd0
          [<ffffffffa676d6b3>] bitmap_set_memory_bits+0x43/0xe0
          [<ffffffffa676e41c>] bitmap_init_from_disk+0x23c/0x530
          [<ffffffffa676f1ae>] bitmap_load+0xbe/0x160
          [<ffffffffc04c47d3>] raid_preresume+0x203/0x2f0 [dm_raid]
          [<ffffffffa677762f>] dm_table_resume_targets+0x4f/0xe0
          [<ffffffffa6774b52>] dm_resume+0x122/0x140
          [<ffffffffa6779b9f>] dev_suspend+0x18f/0x290
          [<ffffffffa677a3a7>] ctl_ioctl+0x287/0x560
          [<ffffffffa677a693>] dm_ctl_ioctl+0x13/0x20
          [<ffffffffa62d6b46>] do_vfs_ioctl+0xa6/0x750
          [<ffffffffa62d7269>] SyS_ioctl+0x79/0x90
          [<ffffffffa6956d41>] entry_SYSCALL_64_fastpath+0x1f/0xc2
      Signed-off-by: Zdenek Kabelac <zkabelac@redhat.com>
      Signed-off-by: Shaohua Li <shli@fb.com>
      Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      b7d3f2b5
    • Paul Moore's avatar
      audit: ensure that 'audit=1' actually enables audit for PID 1 · 93dedcf5
      Paul Moore authored
      
      [ Upstream commit 173743dd ]
      
      Prior to this patch we enabled audit in audit_init(), which is too
      late for PID 1 as the standard initcalls are run after the PID 1 task
      is forked.  This means that we never allocate an audit_context (see
      audit_alloc()) for PID 1 and therefore miss a lot of audit events
      generated by PID 1.
      
      This patch enables audit as early as possible to help ensure that when
      PID 1 is forked it can allocate an audit_context if required.
      Reviewed-by: Richard Guy Briggs <rgb@redhat.com>
      Signed-off-by: Paul Moore <paul@paul-moore.com>
      Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      93dedcf5
    • Keefe Liu's avatar
      ipvlan: fix ipv6 outbound device · a625a16c
      Keefe Liu authored
      
      [ Upstream commit ca29fd7c ]
      
      When processing an outbound IPv6 packet, we should assign the master
      device as the output device rather than the input device.
      Signed-off-by: Keefe Liu <liuqifa@huawei.com>
      Acked-by: Mahesh Bandewar <maheshb@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      a625a16c
    • Masahiro Yamada's avatar
      kbuild: do not call cc-option before KBUILD_CFLAGS initialization · 97c66870
      Masahiro Yamada authored
      
      [ Upstream commit 433dc2eb ]
      
      Some $(call cc-option,...) are invoked very early, even before
      KBUILD_CFLAGS, etc. are initialized.
      
      The returned string from $(call cc-option,...) depends on
      KBUILD_CPPFLAGS, KBUILD_CFLAGS, and GCC_PLUGINS_CFLAGS.
      
      Since they are exported, they are not empty when the top Makefile
      is recursively invoked.
      
      The recursion occurs in several places.  For example, the top
      Makefile invokes itself for silentoldconfig.  "make tinyconfig",
      "make rpm-pkg" are the cases, too.
      
      In those cases, the second call of cc-option from the same line
      runs a different shell command due to non-pristine KBUILD_CFLAGS.
      
      To get the same result all the time, KBUILD_* and GCC_PLUGINS_CFLAGS
      must be initialized before any call of cc-option.  This avoids
      garbage data in the .cache.mk file.
      
      Move all calls of cc-option below the config targets because target
      compiler flags are unnecessary for Kconfig.
      Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
      Reviewed-by: Douglas Anderson <dianders@chromium.org>
      Signed-off-by: Sasha Levin <alexander.levin@verizon.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      97c66870
    • Paul Mackerras's avatar
      powerpc/64: Fix checksum folding in csum_tcpudp_nofold and ip_fast_csum_nofold · eae3f3ab
      Paul Mackerras authored
      commit b492f7e4 upstream.
      
      These functions compute an IP checksum by computing a 64-bit sum and
      folding it to 32 bits (the "nofold" in their names refers to folding
      down to 16 bits).  However, doing (u32) (s + (s >> 32)) is not
      sufficient to fold a 64-bit sum to 32 bits correctly.  The addition
      can produce a carry out from bit 31, which needs to be added in to
      the sum to produce the correct result.
      
      To fix this, we copy the from64to32() function from lib/checksum.c
      and use that.
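
      The missing carry can be reproduced in plain user-space C. The sketch
      below is not the powerpc patch itself: naive_fold() is a hypothetical
      name for the one-step expression quoted above, and from64to32() is
      written from the two-step fold described here, using <stdint.h> types
      instead of the kernel's u32/u64.

```c
#include <stdint.h>

/* Hypothetical name: the one-step fold that the commit message calls
 * insufficient. A carry out of bit 31 is silently truncated away. */
static inline uint32_t naive_fold(uint64_t s)
{
    return (uint32_t)(s + (s >> 32));
}

/* Two-step fold: add the high and low halves, then add the carry that
 * the first addition may have produced back into the low 32 bits. */
static inline uint32_t from64to32(uint64_t x)
{
    /* 32 bits + 32 bits gives a result of up to 33 bits */
    x = (x & 0xffffffffULL) + (x >> 32);
    /* fold the possible carry (bit 32) back in */
    x = (x & 0xffffffffULL) + (x >> 32);
    return (uint32_t)x;
}
```

      For the input 0x00000001ffffffff the one-step fold yields 0, while the
      two-step fold yields 1, which is the ones' complement sum of the two
      32-bit halves.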
      Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      eae3f3ab
    • Marc Zyngier's avatar
      KVM: arm/arm64: vgic-its: Preserve the revious read from the pending table · 9414a630
      Marc Zyngier authored
      commit 64afe6e9 upstream.
      
      The current pending table parsing code assumes that we keep the
      previous read of the pending bits, but keep that variable in
      the current block, making sure it is discarded on each loop.
      
      We end up using whatever is on the stack. Who knows, it might
      just be the right thing...
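
      The scoping problem can be illustrated with a small user-space sketch.
      All names here (read_pending_byte(), count_pending(), pendmask) are
      hypothetical stand-ins, not the vgic-its code: a cached byte is
      re-read only when the byte index changes, so the cache variable has to
      be declared outside the loop body to survive from one iteration to
      the next.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for reading one byte of a pending table. */
static uint8_t read_pending_byte(const uint8_t *table, size_t byte_index)
{
    return table[byte_index];
}

/* Count how many of the given interrupt IDs have their pending bit set. */
static size_t count_pending(const uint8_t *table, const uint32_t *ids, size_t n)
{
    size_t last_byte_index = (size_t)-1; /* nothing cached yet */
    uint8_t pendmask = 0;                /* outside the loop: kept across iterations */
    size_t pending = 0;

    for (size_t i = 0; i < n; i++) {
        size_t byte_index = ids[i] / 8;
        unsigned int bit = ids[i] % 8;

        /* Declaring pendmask here instead would leave it uninitialised
         * on every iteration that skips the read below. */
        if (byte_index != last_byte_index) {
            pendmask = read_pending_byte(table, byte_index);
            last_byte_index = byte_index;
        }

        if (pendmask & (1u << bit))
            pending++;
    }
    return pending;
}
```

      With the variable inside the loop, the second and later IDs that share
      a byte with their predecessor would test an indeterminate value.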
      
      Fixes: 33d3bc95 ("KVM: arm64: vgic-its: Read initial LPI pending table")
      Cc: stable@vger.kernel.org # 4.8
      Reported-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
      Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      9414a630
    • Al Viro's avatar
      fix kcm_clone() · 80c0f477
      Al Viro authored
      commit a5739435 upstream.
      
      1) it's fput() or sock_release(), not both
      2) don't do fd_install() until the last failure exit.
      3) not a bug per se, but... don't attach socket to struct file
         until it's set up.
      
      Take reserving descriptor into the caller, move fd_install() to the
      caller, sanitize failure exits and calling conventions.
      Acked-by: Tom Herbert <tom@herbertland.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      80c0f477
    • Vincent Pelletier's avatar
      usb: gadget: ffs: Forbid usb_ep_alloc_request from sleeping · 16648cbc
      Vincent Pelletier authored
      commit 30bf90cc upstream.
      
      Found using DEBUG_ATOMIC_SLEEP while submitting an AIO read operation:
      
      [  100.853642] BUG: sleeping function called from invalid context at mm/slab.h:421
      [  100.861148] in_atomic(): 1, irqs_disabled(): 1, pid: 1880, name: python
      [  100.867954] 2 locks held by python/1880:
      [  100.867961]  #0:  (&epfile->mutex){....}, at: [<f8188627>] ffs_mutex_lock+0x27/0x30 [usb_f_fs]
      [  100.868020]  #1:  (&(&ffs->eps_lock)->rlock){....}, at: [<f818ad4b>] ffs_epfile_io.isra.17+0x24b/0x590 [usb_f_fs]
      [  100.868076] CPU: 1 PID: 1880 Comm: python Not tainted 4.14.0-edison+ #118
      [  100.868085] Hardware name: Intel Corporation Merrifield/BODEGA BAY, BIOS 542 2015.01.21:18.19.48
      [  100.868093] Call Trace:
      [  100.868122]  dump_stack+0x47/0x62
      [  100.868156]  ___might_sleep+0xfd/0x110
      [  100.868182]  __might_sleep+0x68/0x70
      [  100.868217]  kmem_cache_alloc_trace+0x4b/0x200
      [  100.868248]  ? dwc3_gadget_ep_alloc_request+0x24/0xe0 [dwc3]
      [  100.868302]  dwc3_gadget_ep_alloc_request+0x24/0xe0 [dwc3]
      [  100.868343]  usb_ep_alloc_request+0x16/0xc0 [udc_core]
      [  100.868386]  ffs_epfile_io.isra.17+0x444/0x590 [usb_f_fs]
      [  100.868424]  ? _raw_spin_unlock_irqrestore+0x27/0x40
      [  100.868457]  ? kiocb_set_cancel_fn+0x57/0x60
      [  100.868477]  ? ffs_ep0_poll+0xc0/0xc0 [usb_f_fs]
      [  100.868512]  ffs_epfile_read_iter+0xfe/0x157 [usb_f_fs]
      [  100.868551]  ? security_file_permission+0x9c/0xd0
      [  100.868587]  ? rw_verify_area+0xac/0x120
      [  100.868633]  aio_read+0x9d/0x100
      [  100.868692]  ? __fget+0xa2/0xd0
      [  100.868727]  ? __might_sleep+0x68/0x70
      [  100.868763]  SyS_io_submit+0x471/0x680
      [  100.868878]  do_int80_syscall_32+0x4e/0xd0
      [  100.868921]  entry_INT80_32+0x2a/0x2a
      [  100.868932] EIP: 0xb7fbb676
      [  100.868941] EFLAGS: 00000292 CPU: 1
      [  100.868951] EAX: ffffffda EBX: b7aa2000 ECX: 00000002 EDX: b7af8368
      [  100.868961] ESI: b7fbb660 EDI: b7aab000 EBP: bfb6c658 ESP: bfb6c638
      [  100.868973]  DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 007b
      Signed-off-by: Vincent Pelletier <plr.vincent@gmail.com>
      Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      16648cbc
    • Heiko Carstens's avatar
      s390: always save and restore all registers on context switch · 47273f0d
      Heiko Carstens authored
      commit fbbd7f1a upstream.
      
      The switch_to() macro has an optimization to avoid saving and
      restoring register contents that aren't needed for kernel threads.
      
      There is however the possibility that a kernel thread execve's a user
      space program. In such a case the execve'd process can partially see
      the contents of the previous process, which shouldn't be allowed.
      
      To avoid this, simply always save and restore register contents on
      context switch.
      
      Fixes: fdb6d070 ("switch_to: dont restore/save access & fpu regs for kernel threads")
      Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
      Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      47273f0d
    • Masamitsu Yamazaki's avatar
      ipmi: Stop timers before cleaning up the module · f8dac5bf
      Masamitsu Yamazaki authored
      commit 4f7f5551 upstream.
      
      System may crash after unloading ipmi_si.ko module
      because a timer may remain and fire after the module cleaned up resources.
      
      cleanup_one_si() contains the following processing.
      
              /*
               * Make sure that interrupts, the timer and the thread are
               * stopped and will not run again.
               */
              if (to_clean->irq_cleanup)
                      to_clean->irq_cleanup(to_clean);
              wait_for_timer_and_thread(to_clean);
      
              /*
               * Timeouts are stopped, now make sure the interrupts are off
               * in the BMC.  Note that timers and CPU interrupts are off,
               * so no need for locks.
               */
              while (to_clean->curr_msg || (to_clean->si_state != SI_NORMAL)) {
                      poll(to_clean);
                      schedule_timeout_uninterruptible(1);
              }
      
      si_state changes as follows in the while loop calling poll(to_clean).
      
        SI_GETTING_MESSAGES
          => SI_CHECKING_ENABLES
           => SI_SETTING_ENABLES
            => SI_GETTING_EVENTS
             => SI_NORMAL
      
      As written in the code comments above,
      timers are expected to stop before the polling loop and not to run again.
      But the timer is set again in the following process
      when si_state becomes SI_SETTING_ENABLES.
      
        => poll
           => smi_event_handler
             => handle_transaction_done
                // smi_info->si_state == SI_SETTING_ENABLES
               => start_getting_events
                 => start_new_msg
                  => smi_mod_timer
                    => mod_timer
      
      As a result, before the timer set in start_new_msg() expires,
      the polling loop may see si_state becoming SI_NORMAL
      and the module clean-up finishes.
      
      For example, a hard LOCKUP and panic occurred as follows.
      smi_timeout was called after smi_event_handler,
      kcs_event and hangs at port_inb()
      trying to access I/O port after release.
      
          [exception RIP: port_inb+19]
          RIP: ffffffffc0473053  RSP: ffff88069fdc3d80  RFLAGS: 00000006
          RAX: ffff8806800f8e00  RBX: ffff880682bd9400  RCX: 0000000000000000
          RDX: 0000000000000ca3  RSI: 0000000000000ca3  RDI: ffff8806800f8e40
          RBP: ffff88069fdc3d80   R8: ffffffff81d86dfc   R9: ffffffff81e36426
          R10: 00000000000509f0  R11: 0000000000100000  R12: 0000000000]:000000
          R13: 0000000000000000  R14: 0000000000000246  R15: ffff8806800f8e00
          ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0000
       --- <NMI exception stack> ---
      
      To fix the problem I defined a flag, timer_can_start,
      as member of struct smi_info.
      The flag is enabled immediately after initializing the timer
      and disabled immediately before waiting for timer deletion.
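
      The idea behind the flag can be sketched in user-space C. This is an
      illustration only, not the driver code: struct smi_sketch and the
      sketch_*() helpers are hypothetical, the timer is reduced to a
      boolean, and the locking that a real driver would need around the
      flag is omitted.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the driver state. */
struct smi_sketch {
    bool timer_can_start; /* gate: may the timer be (re)armed? */
    bool timer_running;   /* stands in for an armed kernel timer */
};

/* Enable the gate immediately after initialising the timer. */
static void sketch_setup_timer(struct smi_sketch *s)
{
    s->timer_running = false;
    s->timer_can_start = true;
}

/* Arm the timer only while the gate is open. */
static void sketch_mod_timer(struct smi_sketch *s)
{
    if (!s->timer_can_start)
        return;
    s->timer_running = true;
}

/* Close the gate first, then stop the timer (this stands in for waiting
 * for timer deletion), so that later polling cannot re-arm it. */
static void sketch_stop_timer(struct smi_sketch *s)
{
    s->timer_can_start = false;
    s->timer_running = false;
}
```

      After sketch_stop_timer(), a call to sketch_mod_timer() from the
      polling path is a no-op, which is the property the cleanup loop
      relies on.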
      
      Fixes: 0cfec916 ("ipmi: Start the timer and thread on internal msgs")
      Signed-off-by: Yamazaki Masamitsu <m-yamazaki@ah.jp.nec.com>
      [Adjusted for recent changes in the driver.]
      [Some fairly major changes went into the IPMI driver in 4.15, so this
       required a backport as the code had changed and moved to a different
       file.  The 4.14 version of this patch moved some code under an
       if statement causing it to not apply to 4.7-4.13.]
      Signed-off-by: Corey Minyard <cminyard@mvista.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      f8dac5bf
    • Debabrata Banerjee's avatar
      Fix handling of verdicts after NF_QUEUE · 0cab694a
      Debabrata Banerjee authored
      [This fix is only needed for v4.9 stable since v4.10+ does not have the issue]
      
      A verdict of NF_STOLEN after NF_QUEUE will cause an incorrect return value
      and a potential kernel panic via a double free of skbs.
      
      This was broken by commit 7034b566 ("netfilter: fix nf_queue handling")
      and subsequently fixed in v4.10 by commit c63cbc46 ("netfilter:
      use switch() to handle verdict cases from nf_hook_slow()"). However that
      commit cannot be cleanly cherry-picked to v4.9.
      Signed-off-by: Debabrata Banerjee <dbanerje@akamai.com>
      Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
      0cab694a
    • Tommi Rantala's avatar
      tipc: call tipc_rcv() only if bearer is up in tipc_udp_recv() · cf00fd3d
      Tommi Rantala authored
      
      [ Upstream commit c7799c06 ]
      
      Remove the second tipc_rcv() call in tipc_udp_recv(). We have just
      checked that the bearer is not up, and calling tipc_rcv() with a bearer
      that is not up leads to a TIPC div-by-zero crash in
      tipc_node_calculate_timer(). The crash is rare in practice, but can
      happen like this:
      
        We're enabling a bearer, but it's not yet up and fully initialized.
        At the same time we receive a discovery packet, and in tipc_udp_recv()
        we end up calling tipc_rcv() with the not-yet-initialized bearer,
        causing later the div-by-zero crash in tipc_node_calculate_timer().
      
      Jon Maloy explains the impact of removing the second tipc_rcv() call:
        "link setup in the worst case will be delayed until the next arriving
         discovery messages, 1 sec later, and this is an acceptable delay."
      
      As the tipc_rcv() call is removed, just leave the function via the
      rcu_out label, so that we will kfree_skb().
      
      [   12.590450] Own node address <1.1.1>, network identity 1
      [   12.668088] divide error: 0000 [#1] SMP
      [   12.676952] CPU: 2 PID: 0 Comm: swapper/2 Not tainted 4.14.2-dirty #1
      [   12.679225] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-2.fc27 04/01/2014
      [   12.682095] task: ffff8c2a761edb80 task.stack: ffffa41cc0cac000
      [   12.684087] RIP: 0010:tipc_node_calculate_timer.isra.12+0x45/0x60 [tipc]
      [   12.686486] RSP: 0018:ffff8c2a7fc838a0 EFLAGS: 00010246
      [   12.688451] RAX: 0000000000000000 RBX: ffff8c2a5b382600 RCX: 0000000000000000
      [   12.691197] RDX: 0000000000000000 RSI: ffff8c2a5b382600 RDI: ffff8c2a5b382600
      [   12.693945] RBP: ffff8c2a7fc838b0 R08: 0000000000000001 R09: 0000000000000001
      [   12.696632] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8c2a5d8949d8
      [   12.699491] R13: ffffffff95ede400 R14: 0000000000000000 R15: ffff8c2a5d894800
      [   12.702338] FS:  0000000000000000(0000) GS:ffff8c2a7fc80000(0000) knlGS:0000000000000000
      [   12.705099] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [   12.706776] CR2: 0000000001bb9440 CR3: 00000000bd009001 CR4: 00000000003606e0
      [   12.708847] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [   12.711016] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [   12.712627] Call Trace:
      [   12.713390]  <IRQ>
      [   12.714011]  tipc_node_check_dest+0x2e8/0x350 [tipc]
      [   12.715286]  tipc_disc_rcv+0x14d/0x1d0 [tipc]
      [   12.716370]  tipc_rcv+0x8b0/0xd40 [tipc]
      [   12.717396]  ? minmax_running_min+0x2f/0x60
      [   12.718248]  ? dst_alloc+0x4c/0xa0
      [   12.718964]  ? tcp_ack+0xaf1/0x10b0
      [   12.719658]  ? tipc_udp_is_known_peer+0xa0/0xa0 [tipc]
      [   12.720634]  tipc_udp_recv+0x71/0x1d0 [tipc]
      [   12.721459]  ? dst_alloc+0x4c/0xa0
      [   12.722130]  udp_queue_rcv_skb+0x264/0x490
      [   12.722924]  __udp4_lib_rcv+0x21e/0x990
      [   12.723670]  ? ip_route_input_rcu+0x2dd/0xbf0
      [   12.724442]  ? tcp_v4_rcv+0x958/0xa40
      [   12.725039]  udp_rcv+0x1a/0x20
      [   12.725587]  ip_local_deliver_finish+0x97/0x1d0
      [   12.726323]  ip_local_deliver+0xaf/0xc0
      [   12.726959]  ? ip_route_input_noref+0x19/0x20
      [   12.727689]  ip_rcv_finish+0xdd/0x3b0
      [   12.728307]  ip_rcv+0x2ac/0x360
      [   12.728839]  __netif_receive_skb_core+0x6fb/0xa90
      [   12.729580]  ? udp4_gro_receive+0x1a7/0x2c0
      [   12.730274]  __netif_receive_skb+0x1d/0x60
      [   12.730953]  ? __netif_receive_skb+0x1d/0x60
      [   12.731637]  netif_receive_skb_internal+0x37/0xd0
      [   12.732371]  napi_gro_receive+0xc7/0xf0
      [   12.732920]  receive_buf+0x3c3/0xd40
      [   12.733441]  virtnet_poll+0xb1/0x250
      [   12.733944]  net_rx_action+0x23e/0x370
      [   12.734476]  __do_softirq+0xc5/0x2f8
      [   12.734922]  irq_exit+0xfa/0x100
      [   12.735315]  do_IRQ+0x4f/0xd0
      [   12.735680]  common_interrupt+0xa2/0xa2
      [   12.736126]  </IRQ>
      [   12.736416] RIP: 0010:native_safe_halt+0x6/0x10
      [   12.736925] RSP: 0018:ffffa41cc0cafe90 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff4d
      [   12.737756] RAX: 0000000000000000 RBX: ffff8c2a761edb80 RCX: 0000000000000000
      [   12.738504] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
      [   12.739258] RBP: ffffa41cc0cafe90 R08: 0000014b5b9795e5 R09: ffffa41cc12c7e88
      [   12.740118] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000002
      [   12.740964] R13: ffff8c2a761edb80 R14: 0000000000000000 R15: 0000000000000000
      [   12.741831]  default_idle+0x2a/0x100
      [   12.742323]  arch_cpu_idle+0xf/0x20
      [   12.742796]  default_idle_call+0x28/0x40
      [   12.743312]  do_idle+0x179/0x1f0
      [   12.743761]  cpu_startup_entry+0x1d/0x20
      [   12.744291]  start_secondary+0x112/0x120
      [   12.744816]  secondary_startup_64+0xa5/0xa5
      [   12.745367] Code: b9 f4 01 00 00 48 89 c2 48 c1 ea 02 48 3d d3 07 00
      00 48 0f 47 d1 49 8b 0c 24 48 39 d1 76 07 49 89 14 24 48 89 d1 31 d2 48
      89 df <48> f7 f1 89 c6 e8 81 6e ff ff 5b 41 5c 5d c3 66 90 66 2e 0f 1f
      [   12.747527] RIP: tipc_node_calculate_timer.isra.12+0x45/0x60 [tipc] RSP: ffff8c2a7fc838a0
      [   12.748555] ---[ end trace 1399ab83390650fd ]---
      [   12.749296] Kernel panic - not syncing: Fatal exception in interrupt
      [   12.750123] Kernel Offset: 0x13200000 from 0xffffffff82000000
      (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
      [   12.751215] Rebooting in 60 seconds..
      
      Fixes: c9b64d49 ("tipc: add replicast peer discovery")
      Signed-off-by: Tommi Rantala <tommi.t.rantala@nokia.com>
      Cc: Jon Maloy <jon.maloy@ericsson.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      cf00fd3d
    • Julian Wiedmann's avatar
      s390/qeth: fix thinko in IPv4 multicast address tracking · 0cfe6df9
      Julian Wiedmann authored
      
      [ Upstream commit bc3ab705 ]
      
      Commit 5f78e29c ("qeth: optimize IP handling in rx_mode callback")
      reworked how secondary addresses are managed for qeth devices.
      Instead of dropping & subsequently re-adding all addresses on every
      ndo_set_rx_mode() call, qeth now keeps track of the addresses that are
      currently registered with the HW.
      On a ndo_set_rx_mode(), we thus only need to do (de-)registration
      requests for the addresses that have actually changed.
      
      On L3 devices, the lookup for IPv4 Multicast addresses checks the wrong
      hashtable - and thus never finds a match. As a result, we first delete
      *all* such addresses, and then re-add them again. So each set_rx_mode()
      causes a short period where the IPv4 Multicast addresses are not
      registered, and the card stops forwarding inbound traffic for them.
      
      Fix this by setting the ->is_multicast flag on the lookup object, thus
      enabling qeth_l3_ip_from_hash() to search the correct hashtable and
      find a match there.
      
      Fixes: 5f78e29c ("qeth: optimize IP handling in rx_mode callback")
      Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      0cfe6df9
    • Julian Wiedmann's avatar
      s390/qeth: fix GSO throughput regression · 1d55222b
      Julian Wiedmann authored
      
      [ Upstream commit 6d69b1f1 ]
      
      Using GSO with small MTUs currently results in a substantial throughput
      regression - which is caused by how qeth needs to map non-linear skbs
      into its IO buffer elements:
      compared to a linear skb, each GSO-segmented skb effectively consumes
      twice as many buffer elements (ie two instead of one) due to the
      additional header-only part. This causes the Output Queue to be
      congested with low-utilized IO buffers.
      
      Fix this as follows:
      If the MSS is low enough so that a non-SG GSO segmentation produces
      order-0 skbs (currently ~3500 bytes), opt out from NETIF_F_SG. This is
      where we anticipate the biggest savings, since an SG-enabled
      GSO segmentation produces skbs that always consume at least two
      buffer elements.
      
      Larger MSS values continue to get a SG-enabled GSO segmentation, since
      1) the relative overhead of the additional header-only buffer element
      becomes less noticeable, and
      2) the linearization overhead increases.
      
      With the throughput regression fixed, re-enable NETIF_F_SG by default to
      reap the significant CPU savings of GSO.
      
      Fixes: 5722963a ("qeth: do not turn on SG per default")
      Reported-by: Nils Hoppmann <niho@de.ibm.com>
      Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      1d55222b
    • Julian Wiedmann's avatar
      s390/qeth: build max size GSO skbs on L2 devices · fbf0dfe7
      Julian Wiedmann authored
      
      [ Upstream commit 0cbff6d4 ]
      
      The current GSO skb size limit was copy&pasted over from the L3 path,
      where it is needed due to a TSO limitation.
      As L2 devices don't offer TSO support (and thus all GSO skbs are
      segmented before they reach the driver), there's no reason to restrict
      the stack in how large it may build the GSO skbs.
      
      Fixes: d52aec97 ("qeth: enable scatter/gather in layer 2 mode")
      Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      fbf0dfe7
    • Eric Dumazet's avatar
      tcp/dccp: block bh before arming time_wait timer · aa0080f1
      Eric Dumazet authored
      
      [ Upstream commit cfac7f83 ]
      
      Maciej Żenczykowski reported some panics in tcp_twsk_destructor()
      that might be caused by the following bug.
      
      The timewait timer is pinned to the cpu, because we want to transition
      the timewait refcount from 0 to 4 in one go, once everything has been
      initialized.
      
      At the time commit ed2e9239 ("tcp/dccp: fix timewait races in timer
      handling") was merged, TCP was always running from the BH handler.
      
      After commit 5413d1ba ("net: do not block BH while processing
      socket backlog") we definitely can run tcp_time_wait() from process
      context.
      
      We need to block BH in the critical section so that the pinned timer
      still serves its purpose.
      
      This bug is more likely to happen under stress and when very small RTOs
      are used in datacenter flows.
      
      Fixes: 5413d1ba ("net: do not block BH while processing socket backlog")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: Maciej Żenczykowski <maze@google.com>
      Acked-by: Maciej Żenczykowski <maze@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      aa0080f1