1. 01 Feb, 2019 5 commits
    • Merge tag 'i3c/fixes-for-5.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/i3c/linux · 520fac05
      Linus Torvalds authored
      Pull i3c fixes from Boris Brezillon:
      
       - Fix a deadlock in the designware driver
      
       - Fix the error path in i3c_master_add_i3c_dev_locked()
      
      * tag 'i3c/fixes-for-5.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/i3c/linux:
        i3c: master: dw: fix deadlock
        i3c: fix missing detach if failed to retrieve i3c dev
    • x86: explicitly align IO accesses in memcpy_{to,from}io · c228d294
      Linus Torvalds authored
      In commit 170d13ca ("x86: re-introduce non-generic memcpy_{to,from}io")
      I made our copy from IO space use a separate copy routine rather than
      rely on the generic memcpy.  I did that because our generic memory copy
      isn't actually well-defined when it comes to internal access ordering or
      alignment, and will in fact depend on various CPUID flags.
      
      In particular, the default memcpy() for a modern Intel CPU will
      generally be just a "rep movsb", which works reasonably well for
      medium-sized memory copies of regular RAM, since the CPU will turn it
      into fairly optimized microcode.
      
      However, for non-cached memory and IO, "rep movs" ends up being
      horrendously slow and will just do the architectural "one byte at a
      time" accesses implied by the movsb.
      
      At the other end of the spectrum, if you _don't_ end up using the "rep
      movsb" code, you'd likely fall back to the software copy, which does
      overlapping accesses for the tail, and may copy things backwards.
      Again, for regular memory that's fine, for IO memory not so much.
      
      The thinking was that clearly nobody really cared (because things
      worked), but some people had seen horrible performance due to the byte
      accesses, so let's just revert back to our long ago version that did
      "rep movsl" for the bulk of the copy, and then fixed up the potentially
      last few bytes of the tail with "movsw/b".
      
      Interestingly (and perhaps not entirely surprisingly), while that was
      our original memory copy implementation, and had been used before for
      IO, in the meantime many new users of memcpy_*io() had come about.  And
      while the access patterns for the memory copy weren't well-defined (so
      arguably _any_ access pattern should work), in practice the "rep movsb"
      case had been very common for the last several years.
      
      In particular Jarkko Sakkinen reported that the memcpy_*io() change
      resulted in weird errors from his Geminilake NUC TPM module.
      
      And it turns out that, according to the TCG spec, TPM accesses must
      be
      
       (a) done strictly sequentially
      
       (b) naturally aligned
      
      otherwise the TPM chip will abort the PCI transaction.
      
      And, in fact, the tpm_crb.c driver did this:
      
      	memcpy_fromio(buf, priv->rsp, 6);
      	...
      	memcpy_fromio(&buf[6], &priv->rsp[6], expected - 6);
      
      which really should never have worked in the first place, but back
      before commit 170d13ca it *happened* to work, because the
      memcpy_fromio() would be expanded to a regular memcpy, and
      
       (a) gcc would expand the first memcpy in-line, and turn it into a
           4-byte and a 2-byte read, and they happened to be in the right
           order, and the alignment was right.
      
       (b) gcc would call "memcpy()" for the second one, and the machines that
           had this TPM chip also apparently ended up always having ERMS
           ("Enhanced REP MOVSB/STOSB instructions"), so we'd use the "rep
           movsb" for that copy.
      
      In other words, basically by pure luck, the code happened to use the
      right access sizes in the (two different!) memcpy() implementations to
      make it all work.
      
      But after commit 170d13ca, both of the memcpy_fromio() calls
      resulted in a call to the routine with the consistent memory accesses,
      and in both cases it started out transferring with 4-byte accesses.
      Which worked for the first copy, but resulted in the second copy doing a
      32-bit read at an address that was only 2-byte aligned.
      
      Jarkko is actually fixing the fragile code in the TPM driver, but since
      this is an excellent example of why we absolutely must not use a generic
      memcpy for IO accesses, _and_ an IO-specific one really should strive to
      align the IO accesses, let's do exactly that.
      
      Side note: Jarkko also noted that the driver had been used on ARM
      platforms, and had worked.  That was because on 32-bit ARM, memcpy_*io()
      ends up always doing byte accesses, and on 64-bit ARM it first does byte
      accesses to align to 8-byte boundaries, and then does 8-byte accesses
      for the bulk.
      
      So ARM actually worked by design, and the x86 case worked by pure luck.
      
      We *might* want to make x86-64 do the 8-byte case too.  That should be a
      pretty straightforward extension, but let's do one thing at a time.  And
      generally MMIO accesses aren't really all that performance-critical, as
      shown by the fact that for a long time we just did them a byte at a
      time, and very few people ever noticed.
      
      Reported-and-tested-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
      Tested-by: Jerry Snitselaar <jsnitsel@redhat.com>
      Cc: David Laight <David.Laight@aculab.com>
      Fixes: 170d13ca ("x86: re-introduce non-generic memcpy_{to,from}io")
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
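The alignment strategy described in the commit above can be modelled in user space. What follows is a minimal sketch, not the kernel's implementation: `io_trace`, `io_read` and `sketch_memcpy_fromio` are hypothetical names, the "IO region" is an ordinary byte array, and its base is assumed to be 4-byte aligned so that offsets can stand in for bus addresses. It records the access widths chosen for a copy: a byte and a 16-bit access to align the IO side, 32-bit accesses for the bulk, and a short tail.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_ACCESSES 64

/* One recorded read from the simulated IO region (hypothetical type). */
struct io_access {
	size_t offset;	/* byte offset into the region */
	size_t size;	/* access width in bytes: 1, 2 or 4 */
};

struct io_trace {
	struct io_access acc[MAX_ACCESSES];
	size_t count;
};

/* Copy one naturally aligned unit and log it. A real driver would issue a
 * single readb/readw/readl here; memcpy only stands in for that. */
static void io_read(struct io_trace *t, uint8_t *to, const uint8_t *io,
		    size_t off, size_t size)
{
	assert(t->count < MAX_ACCESSES);
	assert(off % size == 0);	/* natural alignment */
	t->acc[t->count].offset = off;
	t->acc[t->count].size = size;
	t->count++;
	memcpy(to, io + off, size);
}

/* Copy n bytes starting at IO offset off, keeping every access aligned. */
static void sketch_memcpy_fromio(struct io_trace *t, uint8_t *to,
				 const uint8_t *io, size_t off, size_t n)
{
	/* Head: align the IO side to 2 bytes, then to 4 bytes. */
	if (n >= 1 && (off & 1)) {
		io_read(t, to, io, off, 1);
		to += 1; off += 1; n -= 1;
	}
	if (n >= 2 && (off & 2)) {
		io_read(t, to, io, off, 2);
		to += 2; off += 2; n -= 2;
	}
	/* Bulk: sequential 32-bit accesses. */
	while (n >= 4) {
		io_read(t, to, io, off, 4);
		to += 4; off += 4; n -= 4;
	}
	/* Tail: at most one 16-bit and one 8-bit access. */
	if (n >= 2) {
		io_read(t, to, io, off, 2);
		to += 2; off += 2; n -= 2;
	}
	if (n >= 1)
		io_read(t, to, io, off, 1);
}
```

Applied to the tpm_crb pattern quoted above, a 6-byte copy from offset 0 is traced as a 4-byte read followed by a 2-byte read, and the following copy from offset 6 begins with a 2-byte read before switching to 32-bit reads at offset 8, so every access stays sequential and naturally aligned.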
    • Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux · 5b4746a0
      Linus Torvalds authored
      Pull clk fixes from Stephen Boyd:
       "Mostly driver fixes, but there's a core framework fix in here too:
      
         - Revert the commits that introduce clk management for the SP clk on
           MMP2 SoCs (used for OLPC). Turns out it wasn't a good idea and
           there isn't any need to manage this clk, it just causes more
           headaches.
      
         - A performance regression that went unnoticed for many years where
           we would traverse the entire clk tree looking for a clk by name
           when we already have the pointer to said clk that we're looking for
      
         - A parent linkage fix for the qcom SDM845 clk driver
      
         - An i.MX clk driver rate miscalculation fix where the order of
           operations was messed up
      
         - One error handling fix from the static checkers"
      
      * tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
        clk: qcom: gcc: Use active only source for CPUSS clocks
        clk: ti: Fix error handling in ti_clk_parse_divider_data()
        clk: imx: Fix fractional clock set rate computation
        clk: Remove global clk traversal on fetch parent index
        Revert "dt-bindings: marvell,mmp2: Add clock id for the SP clock"
        Revert "clk: mmp2: add SP clock"
        Revert "Input: olpc_apsp - enable the SP clock"
    • Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 · 52107c54
      Linus Torvalds authored
      Pull crypto fix from Herbert Xu:
       "This fixes a bug in cavium/nitrox where the callback is invoked prior
        to the DMA unmap"
      
      * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
        crypto: cavium/nitrox - Invoke callback after DMA unmap
    • Merge tag 'pci-v5.0-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci · 44e56f32
      Linus Torvalds authored
      Pull PCI fixes from Bjorn Helgaas:
      
       - Revert armada8k GPIO reset change that broke Macchiatobin booting
         (Baruch Siach)
      
       - Use actual size config reads on ARM cns3xxx (Koen Vandeputte)
      
       - Fix ARM cns3xxx config write alignment issue (Koen Vandeputte)
      
       - Fix imx6 PHY device link error checking (Leonard Crestez)
      
       - Fix imx6 probe failure on chips without separate PCI power domain
         (Leonard Crestez)
      
      * tag 'pci-v5.0-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
        Revert "PCI: armada8k: Add support for gpio controlled reset signal"
        ARM: cns3xxx: Use actual size reads for PCIe
        ARM: cns3xxx: Fix writing to wrong PCI config registers after alignment
        PCI: imx: Fix checking pd_pcie_phy device link addition
        PCI: imx: Fix probe failure without power domain
  2. 31 Jan, 2019 9 commits
  3. 30 Jan, 2019 6 commits
    • fs/dcache: Track & report number of negative dentries · af0c9af1
      Waiman Long authored
      The current dentry number tracking code doesn't distinguish between
      positive & negative dentries.  It just reports the total number of
      dentries in the LRU lists.
      
      As an excessive number of negative dentries can have an impact on
      system performance, it is wise to track the number of positive and
      negative dentries separately.
      
      This patch adds tracking for the total number of negative dentries in
      the system LRU lists and reports it in the 5th field in the
      /proc/sys/fs/dentry-state file.  The number, however, includes neither
      negative dentries that are in flight but not yet in the LRU nor
      those in the shrinker lists, which are on the way out anyway.
      
      The number of positive dentries in the LRU lists can be roughly found by
      subtracting the number of negative dentries from the unused count.
      
      Matthew Wilcox has confirmed that the dummy array has been there since
      the introduction of the dentry_stat structure in 2.1.60, probably for
      future extension.  Its entries were not replacements of pre-existing
      fields.  So no sane application that reads /proc/sys/fs/dentry-state
      will do anything dumb if the last 2 fields of the sysctl parameter are
      not zero.  IOW, it is safe to use one of the dummy array entries for
      the negative dentry count.
      
      Signed-off-by: Waiman Long <longman@redhat.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
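As a rough illustration of how a reader of /proc/sys/fs/dentry-state could use the new field, here is a minimal user-space sketch. The struct and function names are hypothetical, and the field names other than the negative count are assumptions about the file layout (six whitespace-separated integers). It parses one line and estimates the positive dentries in the LRU lists as the unused count minus the negative count, as described above.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical mirror of the six fields in /proc/sys/fs/dentry-state. */
struct dentry_state {
	long nr_dentry;
	long nr_unused;
	long age_limit;
	long want_pages;
	long nr_negative;	/* 5th field, added by this patch */
	long dummy;
};

/* Parse one line; returns 0 on success, -1 if the line is malformed. */
static int parse_dentry_state(const char *line, struct dentry_state *st)
{
	int n;

	assert(line && st);
	n = sscanf(line, "%ld %ld %ld %ld %ld %ld",
		   &st->nr_dentry, &st->nr_unused, &st->age_limit,
		   &st->want_pages, &st->nr_negative, &st->dummy);

	return n == 6 ? 0 : -1;
}

/* Rough count of positive dentries in the LRU lists. */
static long positive_lru_estimate(const struct dentry_state *st)
{
	return st->nr_unused - st->nr_negative;
}
```

The result is only an estimate, since the negative count leaves out dentries that are in flight or already on a shrinker list.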
    • fs: Don't need to put list_lru into its own cacheline · 7d10f70f
      Waiman Long authored
      The list_lru structure is essentially just a pointer to a table of
      per-node LRU lists.  Even if CONFIG_MEMCG_KMEM is defined, the list
      field is just used for LRU list registration and shrinker_id is set at
      initialization.  Those fields won't need to be touched that often.
      
      So there is no point in making the list_lru structures sit in their
      own cachelines.
      
      Signed-off-by: Waiman Long <longman@redhat.com>
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • fs/dcache: Fix incorrect nr_dentry_unused accounting in shrink_dcache_sb() · 1dbd449c
      Waiman Long authored
      The nr_dentry_unused per-cpu counter tracks dentries in both the LRU
      lists and the shrink lists where the DCACHE_LRU_LIST bit is set.
      
      The shrink_dcache_sb() function moves dentries from the LRU list to a
      shrink list and subtracts the dentry count from nr_dentry_unused.  This
      is incorrect as the nr_dentry_unused count will also be decremented in
      shrink_dentry_list() via d_shrink_del().
      
      To fix this double decrement, the decrement in the shrink_dcache_sb()
      function is taken out.
      
      Fixes: 4e717f5c ("list_lru: remove special case function list_lru_dispose_all.")
      Cc: stable@kernel.org
      Signed-off-by: Waiman Long <longman@redhat.com>
      Reviewed-by: Dave Chinner <dchinner@redhat.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
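The double decrement can be reproduced with a toy model. This is only a sketch under simplifying assumptions: `dcache_model`, `move_to_shrink_list` and `dispose_shrink_list` are hypothetical stand-ins for the lists, shrink_dcache_sb() and shrink_dentry_list(), and plain counters replace the per-cpu counter and the real list handling.

```c
#include <assert.h>

/* Toy model: counts only, no real dentries or lists. */
struct dcache_model {
	long nr_unused;	/* models nr_dentry_unused */
	long on_lru;	/* dentries on the LRU list */
	long on_shrink;	/* dentries on a shrink list */
};

/* Models shrink_dcache_sb() isolating n dentries. With dec_here set it
 * also decrements the counter, which is the behaviour being removed. */
static void move_to_shrink_list(struct dcache_model *m, long n, int dec_here)
{
	assert(n >= 0 && n <= m->on_lru);
	m->on_lru -= n;
	m->on_shrink += n;
	if (dec_here)
		m->nr_unused -= n;
}

/* Models shrink_dentry_list(): each removal decrements the counter. */
static void dispose_shrink_list(struct dcache_model *m)
{
	m->nr_unused -= m->on_shrink;
	m->on_shrink = 0;
}
```

Starting from 10 unused dentries and shrinking 4 of them, the model with the extra decrement ends with a counter of 2 while 6 dentries remain on the LRU. Without it the counter ends at 6, matching the list.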
    • Merge tag 'iommu-fixes-v5.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu · 1c0490ce
      Linus Torvalds authored
      Pull IOMMU fixes from Joerg Roedel:
       "A few more fixes this time:
      
         - Two patches to fix the error path of the map_sg implementation of
           the AMD IOMMU driver.
      
         - Also a missing IOTLB flush is fixed in the AMD IOMMU driver.
      
         - Memory leak fix for the Intel IOMMU driver.
      
         - Fix a regression in the Mediatek IOMMU driver which caused device
           initialization to fail (seen as broken HDMI output)"
      
      * tag 'iommu-fixes-v5.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
        iommu/amd: Fix IOMMU page flush when detach device from a domain
        iommu/mediatek: Use correct fwspec in mtk_iommu_add_device()
        iommu/vt-d: Fix memory leak in intel_iommu_put_resv_regions()
        iommu/amd: Unmap all mapped pages in error path of map_sg
        iommu/amd: Call free_iova_fast with pfn in map_sg
    • Merge tag 'gpio-v5.0-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio · 877ef51d
      Linus Torvalds authored
      Pull GPIO fixes from Linus Walleij:
       "Here is a bunch of GPIO fixes for the v5.0 series. I was helped out by
        Bartosz in collecting these fixes, for which I am very grateful, the
        biggest achievement in GPIO right now is work distribution.
      
        There is one serious core fix (timestamping) and a bunch of driver
        fixes:
      
         - Fix timestamps on nested IRQs
      
         - Handle IRQs properly in multiple instances of PCF857x
      
         - Use the right data register and IRQ type setting in the Spreadtrum
           GPIO driver
      
         - Let the value argument work properly when setting direction in the
           Altera GPIO driver
      
         - Mask interrupts properly in the vf610 driver"
      
      * tag 'gpio-v5.0-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
        gpio: vf610: Mask all GPIO interrupts
        gpio: altera-a10sr: Set proper output level for direction_output
        gpio: sprd: Fix incorrect irq type setting for the async EIC
        gpio: sprd: Fix the incorrect data register
        gpiolib: fix line event timestamps for nested irqs
        gpio: pcf857x: Fix interrupts on multiple instances
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 62967898
      Linus Torvalds authored
      Pull networking fixes from David Miller:
      
       1) Need to save away the IV across tls async operations, from Dave
          Watson.
      
       2) Upon successful packet processing, we should liberate the SKB with
          dev_consume_skb{_irq}(). From Yang Wei.
      
       3) Only apply RX hang workaround on affected macb chips, from Harini
          Katakam.
      
       4) Dummy netdevs need a proper namespace assigned to them, from Josh
          Elsasser.
      
       5) Some paths of nft_compat run lockless now, and thus we need to use a
          proper refcount_t. From Florian Westphal.
      
       6) Avoid deadlock in mlx5 by doing IRQ locking, from Moni Shoua.
      
       7) netrom does not refcount sockets properly wrt. timers, fix that by
          using the sock timer API. From Cong Wang.
      
       8) Fix locking of inexact inserts of xfrm policies, from Florian
          Westphal.
      
       9) Missing xfrm hash generation bump, also from Florian.
      
      10) Missing of_node_put() in hns driver, from Yonglong Liu.
      
      11) Fix DN_IFREQ_SIZE, from Johannes Berg.
      
      12) ip6mr notifier is invoked during traversal of wrong table, from Nir
          Dotan.
      
      13) TX promisc settings not performed correctly in qed, from Manish
          Chopra.
      
      14) Fix OOB access in vhost, from Jason Wang.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (52 commits)
        MAINTAINERS: Add entry for XDP (eXpress Data Path)
        net: set default network namespace in init_dummy_netdev()
        net: b44: replace dev_kfree_skb_xxx by dev_consume_skb_xxx for drop profiles
        net: caif: call dev_consume_skb_any when skb xmit done
        net: 8139cp: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profiles
        net: macb: Apply RXUBR workaround only to versions with errata
        net: ti: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profiles
        net: apple: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profiles
        net: amd8111e: replace dev_kfree_skb_irq by dev_consume_skb_irq
        net: alteon: replace dev_kfree_skb_irq by dev_consume_skb_irq
        net: tls: Fix deadlock in free_resources tx
        net: tls: Save iv in tls_rec for async crypto requests
        vhost: fix OOB in get_rx_bufs()
        qed: Fix stack out of bounds bug
        qed: Fix system crash in ll2 xmit
        qed: Fix VF probe failure while FLR
        qed: Fix LACP pdu drops for VFs
        qed: Fix bug in tx promiscuous mode settings
        net: i825xx: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profiles
        netfilter: ipt_CLUSTERIP: fix warning unused variable cn
        ...
  4. 29 Jan, 2019 15 commits
  5. 28 Jan, 2019 5 commits
    • Merge branch 'qed-Bug-fixes' · bfe2599d
      David S. Miller authored
      Manish Chopra says:
      
      ====================
      qed: Bug fixes
      
      This series has SR-IOV and some general fixes.
      Please consider applying it to "net"
      ====================
      
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • qed: Fix stack out of bounds bug · ffb057f9
      Manish Chopra authored
      KASAN reported the following bug in qed_init_qm_get_idx_from_flags
      due to inappropriate casting of "pq_flags". Fix the type of "pq_flags".
      
      [  196.624707] BUG: KASAN: stack-out-of-bounds in qed_init_qm_get_idx_from_flags+0x1a4/0x1b8 [qed]
      [  196.624712] Read of size 8 at addr ffff809b00bc7360 by task kworker/0:9/1712
      [  196.624714]
      [  196.624720] CPU: 0 PID: 1712 Comm: kworker/0:9 Not tainted 4.18.0-60.el8.aarch64+debug #1
      [  196.624723] Hardware name: To be filled by O.E.M. Saber/Saber, BIOS 0ACKL024 09/26/2018
      [  196.624733] Workqueue: events work_for_cpu_fn
      [  196.624738] Call trace:
      [  196.624742]  dump_backtrace+0x0/0x2f8
      [  196.624745]  show_stack+0x24/0x30
      [  196.624749]  dump_stack+0xe0/0x11c
      [  196.624755]  print_address_description+0x68/0x260
      [  196.624759]  kasan_report+0x178/0x340
      [  196.624762]  __asan_report_load_n_noabort+0x38/0x48
      [  196.624786]  qed_init_qm_get_idx_from_flags+0x1a4/0x1b8 [qed]
      [  196.624808]  qed_init_qm_info+0xec0/0x2200 [qed]
      [  196.624830]  qed_resc_alloc+0x284/0x7e8 [qed]
      [  196.624853]  qed_slowpath_start+0x6cc/0x1ae8 [qed]
      [  196.624864]  __qede_probe.isra.10+0x1cc/0x12c0 [qede]
      [  196.624874]  qede_probe+0x78/0xf0 [qede]
      [  196.624879]  local_pci_probe+0xc4/0x180
      [  196.624882]  work_for_cpu_fn+0x54/0x98
      [  196.624885]  process_one_work+0x758/0x1900
      [  196.624888]  worker_thread+0x4e0/0xd18
      [  196.624892]  kthread+0x2c8/0x350
      [  196.624897]  ret_from_fork+0x10/0x18
      [  196.624899]
      [  196.624902] Allocated by task 2:
      [  196.624906]  kasan_kmalloc.part.1+0x40/0x108
      [  196.624909]  kasan_kmalloc+0xb4/0xc8
      [  196.624913]  kasan_slab_alloc+0x14/0x20
      [  196.624916]  kmem_cache_alloc_node+0x1dc/0x480
      [  196.624921]  copy_process.isra.1.part.2+0x1d8/0x4a98
      [  196.624924]  _do_fork+0x150/0xfa0
      [  196.624926]  kernel_thread+0x48/0x58
      [  196.624930]  kthreadd+0x3a4/0x5a0
      [  196.624932]  ret_from_fork+0x10/0x18
      [  196.624934]
      [  196.624937] Freed by task 0:
      [  196.624938] (stack is not available)
      [  196.624940]
      [  196.624943] The buggy address belongs to the object at ffff809b00bc0000
      [  196.624943]  which belongs to the cache thread_stack of size 32768
      [  196.624946] The buggy address is located 29536 bytes inside of
      [  196.624946]  32768-byte region [ffff809b00bc0000, ffff809b00bc8000)
      [  196.624948] The buggy address belongs to the page:
      [  196.624952] page:ffff7fe026c02e00 count:1 mapcount:0 mapping:ffff809b4001c000 index:0x0 compound_mapcount: 0
      [  196.624960] flags: 0xfffff8000008100(slab|head)
      [  196.624967] raw: 0fffff8000008100 dead000000000100 dead000000000200 ffff809b4001c000
      [  196.624970] raw: 0000000000000000 0000000000080008 00000001ffffffff 0000000000000000
      [  196.624973] page dumped because: kasan: bad access detected
      [  196.624974]
      [  196.624976] Memory state around the buggy address:
      [  196.624980]  ffff809b00bc7200: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      [  196.624983]  ffff809b00bc7280: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      [  196.624985] >ffff809b00bc7300: 00 00 00 00 00 00 00 00 f1 f1 f1 f1 04 f2 f2 f2
      [  196.624988]                                                        ^
      [  196.624990]  ffff809b00bc7380: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      [  196.624993]  ffff809b00bc7400: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      [  196.624995] ==================================================================
      
      Signed-off-by: Manish Chopra <manishc@marvell.com>
      Signed-off-by: Ariel Elior <aelior@marvell.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
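The commit message does not show the offending code, so the following is only a guess at the general hazard, written as a user-space sketch with hypothetical names. Bitmap-style helpers take an `unsigned long *` and read whole words; if a 32-bit stack variable is cast to that pointer type on a 64-bit machine, the helper reads 8 bytes from a 4-byte object, which would be consistent with the "Read of size 8" in the report above. Keeping the flags in an `unsigned long` makes the cast unnecessary.

```c
#include <assert.h>
#include <limits.h>

#define SKETCH_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Stand-in for a bitmap helper: it always loads whole unsigned longs. */
static unsigned int sketch_bitmap_weight(const unsigned long *map,
					 unsigned int nbits)
{
	unsigned int i, weight = 0;

	assert(map);
	for (i = 0; i < nbits; i++) {
		unsigned long word = map[i / SKETCH_BITS_PER_LONG];

		if (word & (1UL << (i % SKETCH_BITS_PER_LONG)))
			weight++;
	}
	return weight;
}

/* Safe pattern: the argument already has the type the helper expects,
 * so &pq_flags points at a full unsigned long and no cast is needed. */
static unsigned int count_pq_flags(unsigned long pq_flags)
{
	return sketch_bitmap_weight(&pq_flags,
				    sizeof(pq_flags) * CHAR_BIT);
}
```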
    • qed: Fix system crash in ll2 xmit · 7c81626a
      Manish Chopra authored
      Cache the number of fragments in the skb locally, because in the
      case of a linear skb (with zero fragments), tx completion
      (or freeing of the skb) may happen before the driver tries
      to get the number of fragments from the skb, which could
      lead to a stale access to an already freed skb.
      
      Signed-off-by: Manish Chopra <manishc@marvell.com>
      Signed-off-by: Ariel Elior <aelior@marvell.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
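The pattern of caching a value before ownership of the buffer is handed over can be sketched as follows. Everything here is hypothetical and heavily simplified: `sketch_skb` is not the kernel's sk_buff, and completion is modelled as happening immediately after the linear part is posted, which is the case the commit describes for a linear skb.

```c
#include <assert.h>

/* Hypothetical, heavily simplified packet buffer. */
struct sketch_skb {
	unsigned int nr_frags;
	int freed;
};

/* Stand-in for posting the linear part to hardware. Completion is
 * modelled as immediate: the buffer is "freed" and poisoned. */
static void sketch_post_linear(struct sketch_skb *skb)
{
	skb->freed = 1;
	skb->nr_frags = 0xdead;	/* stale value a late reader would see */
}

/* Returns how many fragments would be mapped after the linear part. */
static unsigned int sketch_ll2_xmit(struct sketch_skb *skb)
{
	/* Cache the count while the buffer is still owned by the caller. */
	const unsigned int nr_frags = skb->nr_frags;
	unsigned int i, mapped = 0;

	assert(!skb->freed);
	sketch_post_linear(skb);

	/* From here on only the cached value is used, never skb. */
	for (i = 0; i < nr_frags; i++)
		mapped++;
	return mapped;
}
```

A loop that re-read `skb->nr_frags` after the hand-off would see the poisoned value in this model; the cached copy keeps the loop bound at zero for a linear buffer.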
    • qed: Fix VF probe failure while FLR · 327852ec
      Manish Chopra authored
      VFs may hit a VF-PF channel timeout while probing. In some
      cases it was observed that a VF FLR and the VF "acquire" message
      transaction (i.e. the first message from VF to PF in the VF's probe
      flow) can occur simultaneously. The VF then fails to send the
      "acquire" message to the PF because it is marked disabled from the
      HW perspective due to the FLR, which results in a channel timeout
      and VF probe failure.
      
      In such cases, retry the VF "acquire" message so that a later
      attempt can pass the message to the PF after the VF FLR has
      completed, and the VF can be probed successfully.
      
      Signed-off-by: Manish Chopra <manishc@marvell.com>
      Signed-off-by: Ariel Elior <aelior@marvell.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
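The retry idea can be sketched with a small model. All names are hypothetical and the retry limit is an arbitrary parameter, not a value taken from the driver: `sketch_channel` pretends that the VF stays disabled for a fixed number of attempts while the FLR is in progress, and `sketch_acquire_with_retry` resends "acquire" until it gets through or the attempts run out.

```c
#include <assert.h>

enum { SKETCH_OK = 0, SKETCH_TIMEOUT = -1 };

/* Hypothetical channel: the VF stays disabled for flr_pending attempts. */
struct sketch_channel {
	int flr_pending;
	int sends;
};

/* One attempt to send "acquire"; it times out while the FLR is pending. */
static int sketch_send_acquire(struct sketch_channel *ch)
{
	ch->sends++;
	if (ch->flr_pending > 0) {
		ch->flr_pending--;
		return SKETCH_TIMEOUT;
	}
	return SKETCH_OK;
}

/* Retry on timeout, up to max_attempts tries in total. */
static int sketch_acquire_with_retry(struct sketch_channel *ch,
				     int max_attempts)
{
	int rc = SKETCH_TIMEOUT;
	int attempt;

	assert(max_attempts > 0);
	for (attempt = 0; attempt < max_attempts; attempt++) {
		rc = sketch_send_acquire(ch);
		if (rc != SKETCH_TIMEOUT)
			break;
	}
	return rc;
}
```

Bounding the number of attempts keeps a VF whose FLR never completes from retrying forever; it still fails the probe, only later.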
    • qed: Fix LACP pdu drops for VFs · ff929696
      Manish Chopra authored
      A VF is always configured to drop control frames
      (with reserved MAC addresses), but for LACP to work
      on VFs, LACP control frames must be
      forwarded or transmitted successfully.
      
      This patch fixes this by allowing trusted VFs
      (marked through ndo_set_vf_trust) to
      pass control frames such as LACP PDUs.
      
      Signed-off-by: Manish Chopra <manishc@marvell.com>
      Signed-off-by: Ariel Elior <aelior@marvell.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
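The resulting policy can be written down as a small predicate. This is a sketch with hypothetical function names, and it rests on two assumptions: that "reserved mac addresses" refers to the IEEE range 01:80:C2:00:00:00 to 01:80:C2:00:00:0F (LACP uses the address ending in 02), and that the decision can be modelled per frame, whereas a real device would more likely have the filter configured once in hardware.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True for destination MACs in the assumed reserved range
 * 01:80:C2:00:00:00 to 01:80:C2:00:00:0F. */
static bool sketch_is_reserved_mac(const uint8_t mac[6])
{
	assert(mac);
	return mac[0] == 0x01 && mac[1] == 0x80 && mac[2] == 0xc2 &&
	       mac[3] == 0x00 && mac[4] == 0x00 && (mac[5] & 0xf0) == 0x00;
}

/* Policy from the commit: drop control frames unless the VF is trusted. */
static bool sketch_vf_drops_frame(const uint8_t mac[6], bool vf_trusted)
{
	return sketch_is_reserved_mac(mac) && !vf_trusted;
}
```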