1. 26 May, 2018 1 commit
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 03250e10
      Linus Torvalds authored
      Pull networking fixes from David Miller:
       "Let's begin the holiday weekend with some networking fixes:
      
         1) Whoops need to restrict cfg80211 wiphy names even more to 64
            bytes. From Eric Biggers.
      
         2) Fix flags being ignored when using kernel_connect() with SCTP,
            from Xin Long.
      
         3) Use after free in DCCP, from Alexey Kodanev.
      
         4) Need to check rhltable_init() return value in ipmr code, from Eric
            Dumazet.
      
         5) XDP handling fixes in virtio_net from Jason Wang.
      
         6) Missing RTA_TABLE in rtm_ipv4_policy[], from Roopa Prabhu.
      
         7) Need to use IRQ disabling spinlocks in mlx4_qp_lookup(), from Jack
            Morgenstein.
      
         8) Prevent out-of-bounds speculation using indexes in BPF, from
            Daniel Borkmann.
      
         9) Fix regression added by AF_PACKET link layer cure, from Willem de
            Bruijn.
      
        10) Correct ENIC dma mask, from Govindarajulu Varadarajan.
      
        11) Missing config options for PMTU tests, from Stefano Brivio"
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (48 commits)
        ibmvnic: Fix partial success login retries
        selftests/net: Add missing config options for PMTU tests
        mlx4_core: allocate ICM memory in page size chunks
        enic: set DMA mask to 47 bit
        ppp: remove the PPPIOCDETACH ioctl
        ipv4: remove warning in ip_recv_error
        net : sched: cls_api: deal with egdev path only if needed
        vhost: synchronize IOTLB message with dev cleanup
        packet: fix reserve calculation
        net/mlx5: IPSec, Fix a race between concurrent sandbox QP commands
        net/mlx5e: When RXFCS is set, add FCS data into checksum calculation
        bpf: properly enforce index mask to prevent out-of-bounds speculation
        net/mlx4: Fix irq-unsafe spinlock usage
        net: phy: broadcom: Fix bcm_write_exp()
        net: phy: broadcom: Fix auxiliary control register reads
        net: ipv4: add missing RTA_TABLE to rtm_ipv4_policy
        net/mlx4: fix spelling mistake: "Inrerface" -> "Interface" and rephrase message
        ibmvnic: Only do H_EOI for mobility events
        tuntap: correctly set SOCKWQ_ASYNC_NOSPACE
        virtio-net: fix leaking page for gso packet during mergeable XDP
        ...
  2. 25 May, 2018 17 commits
    • ibmvnic: Fix partial success login retries · eb110410
      Thomas Falcon authored
      In its current state, the driver will handle backing device
      login in a loop for a certain number of retries while the
      device returns a partial success, indicating that the driver
      may need to try again using a smaller number of resources.
      
      The variable it checks to continue retrying may change
      over the course of operations, so the driver can reallocate
      resources and then exit the loop without sending the login
      attempt. Guard against this by introducing a boolean variable
      that retains the state indicating that the driver needs to
      reattempt login with backing device firmware.
      Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf · d2f30f51
      David S. Miller authored
      Daniel Borkmann says:
      
      ====================
      pull-request: bpf 2018-05-24
      
      The following pull-request contains BPF updates for your *net* tree.
      
      The main changes are:
      
      1) Fix a bug in the original fix to prevent out of bounds speculation when
         multiple tail call maps from different branches or calls end up at the
         same tail call helper invocation, from Daniel.
      
      2) Two selftest fixes, one in reuseport_bpf_numa where test is skipped in
         case of missing numa support and another one to update kernel config to
         properly support xdp_meta.sh test, from Anders.
      
       ...
      
      Would be great if you have a chance to merge net into net-next after that.
      
      The verifier fix would be needed later as a dependency in bpf-next for
      upcoming work there. When you do the merge there's a trivial conflict on
      BPF side with 849fa506 ("bpf/verifier: refine retval R0 state for
      bpf_get_stack helper"): Resolution is to keep both functions, the
      do_refine_retval_range() and record_func_map().
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • selftests/net: Add missing config options for PMTU tests · 24e4b075
      Stefano Brivio authored
      PMTU tests in pmtu.sh need support for VTI, VTI6 and dummy
      interfaces: add them to config file.
      Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
      Fixes: d1f1b9cb ("selftests: net: Introduce first PMTU test")
      Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge tag 'batadv-net-for-davem-20180524' of git://git.open-mesh.org/linux-merge · e3ffec48
      David S. Miller authored
      Simon Wunderlich says:
      
      ====================
      Here are some batman-adv bugfixes:
      
       - prevent hardif_put call with NULL parameter, by Colin Ian King
      
       - Avoid race in Translation Table allocator, by Sven Eckelmann
      
       - Fix Translation Table sync flags for intermediate Responses,
         by Linus Luessing
      
       - prevent sending inconsistent Translation Table TVLVs,
         by Marek Lindner
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux · 62d18ecf
      Linus Torvalds authored
      Pull more arm64 fixes from Will Deacon:
      
       - fix application of read-only permissions to kernel section mappings
      
       - sanitise reported ESR values for signals delivered on a kernel
         address
      
       - ensure tishift GCC helpers are exported to modules
      
       - fix inline asm constraints for some LSE atomics
      
      * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
        arm64: Make sure permission updates happen for pmd/pud
        arm64: fault: Don't leak data in ESR context for user fault on kernel VA
        arm64: export tishift functions to modules
        arm64: lse: Add early clobbers to some input/output asm operands
    • Merge tag 'powerpc-4.17-7' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux · b133ef6e
      Linus Torvalds authored
      Pull powerpc fix from Michael Ellerman:
       "Just one fix, to make sure the PCR (Processor Compatibility Register)
        is reset on boot.
      
        Otherwise if we're running in compat mode in a guest (eg. pretending a
        Power9 is a Power8) and the host kernel oopses and kdumps then the
        kdump kernel's userspace will be running in Power8 mode, and will
        SIGILL if it uses Power9-only instructions.
      
        Thanks to Michael Neuling"
      
      * tag 'powerpc-4.17-7' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
        powerpc/64s: Clear PCR on boot
    • Merge tag 'mmc-v4.17-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc · f287fe35
      Linus Torvalds authored
      Pull MMC fixes from Ulf Hansson:
       "MMC core:
         - Propagate correct error code for RPMB requests
      
        MMC host:
         - sdhci-iproc: Drop hard coded cap for 1.8v
         - sdhci-iproc: Fix 32bit writes for transfer mode
         - sdhci-iproc: Enable SDHCI_QUIRK2_HOST_OFF_CARD_ON for cygnus"
      
      * tag 'mmc-v4.17-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
        mmc: sdhci-iproc: add SDHCI_QUIRK2_HOST_OFF_CARD_ON for cygnus
        mmc: sdhci-iproc: fix 32bit writes for TRANSFER_MODE register
        mmc: sdhci-iproc: remove hard coded mmc cap 1.8v
        mmc: block: propagate correct returned value in mmc_rpmb_ioctl
    • Merge tag 'drm-fixes-for-v4.17-rc7' of git://people.freedesktop.org/~airlied/linux · b9f57019
      Linus Torvalds authored
      Pull drm fixes from Dave Airlie:
       "Only two sets of drivers fixes: one rcar-du lvds regression fix, and a
        group of fixes for vmwgfx"
      
      * tag 'drm-fixes-for-v4.17-rc7' of git://people.freedesktop.org/~airlied/linux:
        drm/vmwgfx: Schedule an fb dirty update after resume
        drm/vmwgfx: Fix host logging / guestinfo reading error paths
        drm/vmwgfx: Fix 32-bit VMW_PORT_HB_[IN|OUT] macros
        drm: rcar-du: lvds: Fix crash in .atomic_check when disabling connector
    • Merge tag 'sound-4.17-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound · a1a9f537
      Linus Torvalds authored
      Pull sound fixes from Takashi Iwai:
       "Two fixes:
      
         - a timer pause event notification was garbled upon the recent
           hardening work; corrected now
      
         - HD-audio runtime PM regression fix due to the incorrect return
           type"
      
      * tag 'sound-4.17-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
        ALSA: hda - Fix runtime PM
        ALSA: timer: Fix pause event notification
    • mlx4_core: allocate ICM memory in page size chunks · 1383cb81
      Qing Huang authored
      When a system is under memory pressure (high usage with fragments),
      the original 256KB ICM chunk allocations will likely trigger kernel
      memory management to enter slow path doing memory compact/migration
      ops in order to complete high order memory allocations.
      
      When that happens, user processes calling uverb APIs may get stuck
      for more than 120s easily even though there are a lot of free pages
      in smaller chunks available in the system.
      
      Syslog:
      ...
      Dec 10 09:04:51 slcc03db02 kernel: [397078.572732] INFO: task
      oracle_205573_e:205573 blocked for more than 120 seconds.
      ...
      
      With 4KB ICM chunk size on x86_64 arch, the above issue is fixed.
      
      However in order to support smaller ICM chunk size, we need to fix
      another issue in large size kcalloc allocations.
      
      E.g.
      Setting log_num_mtt=30 requires 1G mtt entries. With the 4KB ICM chunk
      size, each ICM chunk can only hold 512 mtt entries (8 bytes for each mtt
      entry). So we need a 16MB allocation for a table->icm pointer array to
      hold 2M pointers which can easily cause kcalloc to fail.
      
      The solution is to use kvzalloc to replace kcalloc which will fall back
      to vmalloc automatically if kmalloc fails.
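
The sizes quoted in the example above can be checked with a few lines of arithmetic. This is only a sketch: the constant and function names are illustrative and not taken from the mlx4 driver, and the 8-byte pointer size assumes a 64-bit kernel.

```python
# Back-of-the-envelope check of the numbers quoted in the commit message.
# All names here are hypothetical; none come from the mlx4 driver.

MTT_ENTRY_SIZE = 8          # bytes per mtt entry (stated in the message)
ICM_CHUNK_SIZE = 4 * 1024   # 4KB ICM chunk on x86_64 (stated in the message)
POINTER_SIZE = 8            # bytes per pointer, assuming a 64-bit kernel


def entries_per_chunk() -> int:
    """How many mtt entries fit into one ICM chunk."""
    return ICM_CHUNK_SIZE // MTT_ENTRY_SIZE


def icm_pointer_count(log_num_mtt: int) -> int:
    """Number of chunk pointers needed for 2**log_num_mtt mtt entries."""
    return (1 << log_num_mtt) // entries_per_chunk()


def icm_pointer_array_bytes(log_num_mtt: int) -> int:
    """Size in bytes of the pointer array that has to be allocated."""
    return icm_pointer_count(log_num_mtt) * POINTER_SIZE


if __name__ == "__main__":
    print(entries_per_chunk())                          # 512 entries per chunk
    print(icm_pointer_count(30))                        # 2097152 (2M) pointers
    print(icm_pointer_array_bytes(30) // (1024 * 1024)) # 16 (MB)
```

A physically contiguous allocation of that size is what makes kcalloc likely to fail on a fragmented system, which is why the message switches to an allocator with a vmalloc fallback.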
      Signed-off-by: Qing Huang <qing.huang@oracle.com>
      Acked-by: Daniel Jurgens <danielj@mellanox.com>
      Reviewed-by: Zhu Yanjun <yanjun.zhu@oracle.com>
      Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • enic: set DMA mask to 47 bit · 322eaa06
      Govindarajulu Varadarajan authored
      In commit 624dbf55 ("driver/net: enic: Try DMA 64 first, then
      failover to DMA") DMA mask was changed from 40 bits to 64 bits.
      Hardware actually supports only 47 bits.
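
To see why a 64-bit mask is a problem for hardware that addresses 47 bits, here is a small sketch. The helper names are hypothetical; `dma_bit_mask` mirrors the usual "low n bits set" definition of a DMA mask rather than quoting the kernel macro.

```python
# Illustrative model of a DMA address mask. Names are hypothetical.

def dma_bit_mask(bits: int) -> int:
    """Return a mask with the low `bits` bits set (1 <= bits <= 64)."""
    if not 1 <= bits <= 64:
        raise ValueError("bits must be in 1..64")
    return (1 << bits) - 1


def addressable(addr: int, bits: int) -> bool:
    """True if `addr` can be expressed within a `bits`-wide DMA mask."""
    return 0 <= addr <= dma_bit_mask(bits)


if __name__ == "__main__":
    high_addr = 1 << 47   # first address that needs a 48th bit
    # A 64-bit mask tells the DMA layer this address is fine...
    print(addressable(high_addr, 64))
    # ...but a device limited to 47 bits cannot reach it.
    print(addressable(high_addr, 47))
```

With the mask set to 47 bits, the DMA layer is told the device's real limit and can avoid handing it buffers it cannot address.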
      
      Fixes: 624dbf55 ("driver/net: enic: Try DMA 64 first, then failover to DMA")
      Signed-off-by: Govindarajulu Varadarajan <gvaradar@cisco.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ppp: remove the PPPIOCDETACH ioctl · af8d3c7c
      Eric Biggers authored
      The PPPIOCDETACH ioctl effectively tries to "close" the given ppp file
      before f_count has reached 0, which is fundamentally a bad idea.  It
      does check 'f_count < 2', which excludes concurrent operations on the
      file since they would only be possible with a shared fd table, in which
      case each fdget() would take a file reference.  However, it fails to
      account for the fact that even with 'f_count == 1' the file can still be
      linked into epoll instances.  As reported by syzbot, this can trivially
      be used to cause a use-after-free.
      
      Yet, the only known user of PPPIOCDETACH is pppd versions older than
      ppp-2.4.2, which was released almost 15 years ago (November 2003).
      Also, PPPIOCDETACH apparently stopped working reliably at around the
      same time, when the f_count check was added to the kernel, e.g. see
      https://lkml.org/lkml/2002/12/31/83.  Also, the current 'f_count < 2'
      check makes PPPIOCDETACH only work in single-threaded applications; it
      always fails if called from a multithreaded application.
      
      All pppd versions released in the last 15 years just close() the file
      descriptor instead.
      
      Therefore, instead of hacking around this bug by exporting epoll
      internals to modules, and probably missing other related bugs, just
      remove the PPPIOCDETACH ioctl and see if anyone actually notices.  Leave
      a stub in place that prints a one-time warning and returns EINVAL.
      
      Reported-by: syzbot+16363c99d4134717c05b@syzkaller.appspotmail.com
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: Eric Biggers <ebiggers@google.com>
      Acked-by: Paul Mackerras <paulus@ozlabs.org>
      Reviewed-by: Guillaume Nault <g.nault@alphalink.fr>
      Tested-by: Guillaume Nault <g.nault@alphalink.fr>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv4: remove warning in ip_recv_error · 730c54d5
      Willem de Bruijn authored
      A precondition check in ip_recv_error triggered on an otherwise benign
      race. Remove the warning.
      
      The warning triggers when passing an ipv6 socket to this ipv4 error
      handling function. RaceFuzzer was able to trigger it due to a race
      in setsockopt IPV6_ADDRFORM.
      
        ---
        CPU0
          do_ipv6_setsockopt
            sk->sk_socket->ops = &inet_dgram_ops;
      
        ---
        CPU1
          sk->sk_prot->recvmsg
            udp_recvmsg
              ip_recv_error
                WARN_ON_ONCE(sk->sk_family == AF_INET6);
      
        ---
        CPU0
          do_ipv6_setsockopt
            sk->sk_family = PF_INET;
      
      This socket option converts a v6 socket that is connected to a v4 peer
      to a v4 socket. It updates the socket on the fly, changing fields in
      sk as well as other structs. This is inherently non-atomic. It races
      with the lockless udp_recvmsg path.
      
      No other code makes an assumption that these fields are updated
      atomically. It is benign here, too, as ip_recv_error cares only about
      the protocol of the skbs enqueued on the error queue, for which
      sk_family is not a precise predictor (thanks to another issue with
      IPV6_ADDRFORM).
      
      Link: http://lkml.kernel.org/r/20180518120826.GA19515@dragonet.kaist.ac.kr
      Fixes: 7ce875e5 ("ipv4: warn once on passing AF_INET6 socket to ip_recv_error")
      Reported-by: DaeRyong Jeong <threeearcat@gmail.com>
      Suggested-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: Willem de Bruijn <willemb@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net : sched: cls_api: deal with egdev path only if needed · f8f4bef3
      Or Gerlitz authored
      When dealing with ingress rule on a netdev, if we did fine through the
      conventional path, there's no need to continue into the egdev route,
      and we can stop right there.
      
      Not doing so may cause a 2nd rule to be added by the cls api layer
      with the ingress being the egdev.
      
      For example, under sriov switchdev scheme, a user rule of VFR A --> VFR B
      will end up with two HW rules (1) VF A --> VF B and (2) uplink --> VF B
      
      Fixes: 208c0f4b ('net: sched: use tc_setup_cb_call to call per-block callbacks')
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • vhost: synchronize IOTLB message with dev cleanup · 1b15ad68
      Jason Wang authored
      DaeRyong Jeong reports a race between vhost_dev_cleanup() and
      vhost_process_iotlb_msg():
      
      Thread interleaving:
      CPU0 (vhost_process_iotlb_msg)          CPU1 (vhost_dev_cleanup)
      (In the case of both VHOST_IOTLB_UPDATE
      and VHOST_IOTLB_INVALIDATE)

      =====                                   =====
                                              vhost_umem_clean(dev->iotlb);
      if (!dev->iotlb) {
              ret = -EFAULT;
              break;
      }
                                              dev->iotlb = NULL;
      
      The reason is that we don't synchronize between them; fix this by
      protecting vhost_process_iotlb_msg() with the dev mutex.
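
The effect of the locking can be sketched with a toy model. Everything below is hypothetical (the class, its method names, the dict standing in for the IOTLB), and it assumes the cleanup side runs under the same mutex; it is not the vhost code.

```python
import threading

EFAULT = 14  # errno value, used only to mirror the -EFAULT in the trace


class ToyVhostDev:
    """Toy stand-in for a vhost device; not the kernel data structure."""

    def __init__(self):
        self.mutex = threading.Lock()
        self.iotlb = {}          # None once the device has been cleaned up

    def process_iotlb_msg(self, addr, value):
        # Check and use of self.iotlb happen under the same lock, so
        # cleanup() cannot free the table between the two steps.
        with self.mutex:
            if self.iotlb is None:
                return -EFAULT
            self.iotlb[addr] = value
            return 0

    def cleanup(self):
        # Assumed to hold the same mutex as the message handler.
        with self.mutex:
            if self.iotlb is not None:
                self.iotlb.clear()   # stands in for vhost_umem_clean()
            self.iotlb = None


if __name__ == "__main__":
    dev = ToyVhostDev()
    print(dev.process_iotlb_msg(0x1000, "mapping"))  # handled normally
    dev.cleanup()
    print(dev.process_iotlb_msg(0x2000, "mapping"))  # rejected after cleanup
```

Without the lock, the window in the interleaving above lets the handler pass the NULL check and then touch a table that has already been cleaned.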
      Reported-by: DaeRyong Jeong <threeearcat@gmail.com>
      Fixes: 6b1e6cc7 ("vhost: new device IOTLB API")
      Signed-off-by: Jason Wang <jasowang@redhat.com>
      Acked-by: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge tag 'mlx5-fixes-2018-05-24' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux · d681bc02
      David S. Miller authored
      Saeed Mahameed says:
      
      ====================
      Mellanox, mlx5 fixes 2018-05-24
      
      This series includes two mlx5 fixes.
      
      1) add FCS data to checksum complete when required, from Eran Ben
      Elisha.
      
      2) Fix a race in IPSec sandbox QP commands, from Yossi Kuperman.
      
      Please pull and let me know if there's any problem.
      
      for -stable v4.15
      ("net/mlx5e: When RXFCS is set, add FCS data into checksum calculation")
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • packet: fix reserve calculation · 9aad13b0
      Willem de Bruijn authored
      Commit b84bbaf7 ("packet: in packet_snd start writing at link
      layer allocation") ensures that packet_snd always starts writing
      the link layer header in reserved headroom allocated for this
      purpose.
      
      This is needed because packets may be shorter than hard_header_len,
      in which case the space up to hard_header_len may be zeroed. But
      that necessary padding is not accounted for in skb->len.
      
      The fix, however, is buggy. It calls skb_push, which grows skb->len
      when moving skb->data back. But in this case packet length should not
      change.
      
      Instead, call skb_reserve, which moves both skb->data and skb->tail
      back, without changing length.
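
The difference between the two calls is easiest to see on a toy buffer. The class below is a hypothetical model that tracks only the `data`, `tail` and `len` bookkeeping described in the message; the 32-byte headroom, 14-byte link-layer header and 60-byte packet are made-up numbers.

```python
class ToySkb:
    """Tracks offsets only; a stand-in for the sk_buff bookkeeping."""

    def __init__(self, headroom: int):
        self.data = headroom   # offset of the first payload byte
        self.tail = headroom   # offset one past the last payload byte
        self.len = 0           # number of payload bytes

    def put(self, n: int):
        """Append n bytes at the tail (like skb_put)."""
        self.tail += n
        self.len += n

    def push(self, n: int):
        """Move data back by n and count those bytes (like skb_push)."""
        self.data -= n
        self.len += n

    def reserve(self, n: int):
        """Shift data and tail together; len is untouched (like skb_reserve)."""
        self.data += n
        self.tail += n


def send_with_push(headroom: int, ll_len: int, pkt_len: int) -> ToySkb:
    skb = ToySkb(headroom)
    skb.push(ll_len)       # buggy: len grows before anything is written
    skb.put(pkt_len)
    return skb


def send_with_reserve(headroom: int, ll_len: int, pkt_len: int) -> ToySkb:
    skb = ToySkb(headroom)
    skb.reserve(-ll_len)   # fixed: only the write position moves back
    skb.put(pkt_len)
    return skb


if __name__ == "__main__":
    print(send_with_push(32, 14, 60).len)     # 74: header counted twice
    print(send_with_reserve(32, 14, 60).len)  # 60: matches the packet
```

In both variants writing starts at the same offset; only the reported length differs, which is the regression the commit describes.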
      
      Fixes: b84bbaf7 ("packet: in packet_snd start writing at link layer allocation")
      Reported-by: Tariq Toukan <tariqt@mellanox.com>
      Signed-off-by: Willem de Bruijn <willemb@google.com>
      Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  3. 24 May, 2018 15 commits
    • Merge branch 'vmwgfx-fixes-4.17' of git://people.freedesktop.org/~thomash/linux into drm-fixes · 4bc6f777
      Dave Airlie authored
      Three fixes for vmwgfx. Two are cc'd stable and fix host logging and its
      error paths on 32-bit VMs. One is a fix for a hibernate flaw
      introduced with the 4.17 merge window.
      
      * 'vmwgfx-fixes-4.17' of git://people.freedesktop.org/~thomash/linux:
        drm/vmwgfx: Schedule an fb dirty update after resume
        drm/vmwgfx: Fix host logging / guestinfo reading error paths
        drm/vmwgfx: Fix 32-bit VMW_PORT_HB_[IN|OUT] macros
    • Merge branch 'stable/for-linus-4.17' of... · b5069438
      Linus Torvalds authored
      Merge branch 'stable/for-linus-4.17' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb
      
      Pull swiotlb fix from Konrad Rzeszutek Wilk:
       "One single fix in here: under Xen the DMA32 heap (in the hypervisor)
        would end up looking like swiss cheese.
      
        The reason being that for every coherent DMA allocation we didn't do
        the proper hypercall to tell Xen to return the page back to the DMA32
        heap. End result was (eventually) no DMA32 space if you (for example)
        continuously unloaded and loaded modules"
      
      * 'stable/for-linus-4.17' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb:
        xen-swiotlb: fix the check condition for xen_swiotlb_free_coherent
    • net/mlx5: IPSec, Fix a race between concurrent sandbox QP commands · 1dcbc01f
      Yossi Kuperman authored
      Sandbox QP Commands are retired in the order they are sent. Outstanding
      commands are stored in a linked-list in the order they appear. Once a
      response is received and the callback gets called, we pull the first
      element off the pending list, assuming they correspond.
      
      Sending a message and adding it to the pending list is not done atomically,
      hence there is an opportunity for a race between concurrent requests.
      
      Bind both send and add under a critical section.
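
The invariant being protected is that the pending list has the same order as the commands on the wire. A minimal sketch, assuming a single lock around both steps; the class and its fields are hypothetical and only model the ordering, not the mlx5 code.

```python
import threading
from collections import deque


class ToyCommandQueue:
    """Toy model: `wire` is the order the device sees, `pending` the list
    that completions are matched against."""

    def __init__(self):
        self.lock = threading.Lock()
        self.wire = []
        self.pending = deque()

    def submit(self, cmd):
        # Send and add form one critical section, so no other submitter
        # can slip in between the two steps and reorder the lists.
        with self.lock:
            self.wire.append(cmd)      # "send" the command
            self.pending.append(cmd)   # remember it as outstanding

    def complete(self):
        # Responses arrive in send order, so the head of the pending
        # list is assumed to be the command being completed.
        with self.lock:
            return self.pending.popleft()


def run_submitters(queue: ToyCommandQueue, workers: int, per_worker: int):
    """Submit commands from several threads at once."""
    def work(worker_id):
        for n in range(per_worker):
            queue.submit((worker_id, n))

    threads = [threading.Thread(target=work, args=(i,)) for i in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


if __name__ == "__main__":
    q = ToyCommandQueue()
    run_submitters(q, workers=4, per_worker=50)
    print(list(q.pending) == q.wire)   # both lists share one order
```

If the two appends were not under one lock, two submitters could send in one order and enqueue in the other, and a completion would then be matched to the wrong command.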
      
      Fixes: bebb23e6 ("net/mlx5: Accel, Add IPSec acceleration interface")
      Signed-off-by: Yossi Kuperman <yossiku@mellanox.com>
      Signed-off-by: Adi Nissim <adin@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • net/mlx5e: When RXFCS is set, add FCS data into checksum calculation · 902a5459
      Eran Ben Elisha authored
      When RXFCS feature is enabled, the HW does not strip the FCS data,
      however it is not present in the checksum calculated by the HW.
      
      Fix that by manually calculating the FCS checksum and adding it to the SKB
      checksum field.
      
      Add helper function to find the FCS data for all SKB forms (linear,
      one fragment or more).
      
      Fixes: 102722fc ("net/mlx5e: Add support for RXFCS feature flag")
      Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma · 34b48b87
      Linus Torvalds authored
      Pull rdma fixes from Jason Gunthorpe:
       "This is pretty much just the usual array of smallish driver bugs.
      
         - remove bouncing addresses from the MAINTAINERS file
      
         - kernel oops and bad error handling fixes for hfi, i40iw, cxgb4, and
           hns drivers
      
         - various small LOC behavioral/operational bugs in mlx5, hns, qedr
           and i40iw drivers
      
         - two fixes for patches already sent during the merge window
      
         - a long-standing bug related to not decreasing the pinned pages
           count in the right MM was found and fixed"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (28 commits)
        RDMA/hns: Move the location for initializing tmp_len
        RDMA/hns: Bugfix for cq record db for kernel
        IB/uverbs: Fix uverbs_attr_get_obj
        RDMA/qedr: Fix doorbell bar mapping for dpi > 1
        IB/umem: Use the correct mm during ib_umem_release
        iw_cxgb4: Fix an error handling path in 'c4iw_get_dma_mr()'
        RDMA/i40iw: Avoid panic when reading back the IRQ affinity hint
        RDMA/i40iw: Avoid reference leaks when processing the AEQ
        RDMA/i40iw: Avoid panic when objects are being created and destroyed
        RDMA/hns: Fix the bug with NULL pointer
        RDMA/hns: Set NULL for __internal_mr
        RDMA/hns: Enable inner_pa_vld filed of mpt
        RDMA/hns: Set desc_dma_addr for zero when free cmq desc
        RDMA/hns: Fix the bug with rq sge
        RDMA/hns: Not support qp transition from reset to reset for hip06
        RDMA/hns: Add return operation when configured global param fail
        RDMA/hns: Update convert function of endian format
        RDMA/hns: Load the RoCE dirver automatically
        RDMA/hns: Bugfix for rq record db for kernel
        RDMA/hns: Add rq inline flags judgement
        ...
    • Merge tag 'for-4.17-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux · d7b66b4a
      Linus Torvalds authored
      Pull btrfs fix from David Sterba:
       "A one-liner that prevents leaking an internal error value 1 out of the
        ftruncate syscall.
      
        This has been observed in practice. The steps to reproduce make a
        common pattern (open/write/fsync/ftruncate) but also need the
        application to not check only for negative values and happens only for
        compressed inlined files.
      
        The conditions are narrow but as this could break userspace I think
        it's better to merge it now and not wait for the merge window"
      
      * tag 'for-4.17-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
        Btrfs: fix error handling in btrfs_truncate()
    • ALSA: hda - Fix runtime PM · 009f8c90
      Lukas Wunner authored
      Before commit 3b5b899c ("ALSA: hda: Make use of core codec functions
      to sync power state"), hda_set_power_state() returned the response to
      the Get Power State verb, a 32-bit unsigned integer whose expected value
      is 0x233 after transitioning a codec to D3, and 0x0 after transitioning
      it to D0.
      
      The response value is significant because hda_codec_runtime_suspend()
      does not clear the codec's bit in the codec_powered bitmask unless the
      AC_PWRST_CLK_STOP_OK bit (0x200) is set in the response value.  That in
      turn prevents the HDA controller from runtime suspending because
      azx_runtime_idle() checks that the codec_powered bitmask is zero.
      
      Since commit 3b5b899c, hda_set_power_state() only returns 0x0 or
      0x1, thereby breaking runtime PM for any HDA controller.  That's because
      an inline function introduced by the commit returns a bool instead of a
      32-bit unsigned int.  The change was likely erroneous and resulted from
      copying and pasting snd_hda_check_power_state(), which is immediately
      preceding the newly introduced inline function.  Fix it.
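
The truncation described above can be reproduced in a few lines. This is a sketch: `as_c_bool` is a hypothetical helper that imitates converting an integer to a C bool, and the two constants are the values quoted in the message.

```python
AC_PWRST_CLK_STOP_OK = 0x200   # bit named in the commit message
D3_RESPONSE = 0x233            # expected Get Power State response in D3


def as_c_bool(value: int) -> int:
    """Imitate returning an integer through a C `bool`: any nonzero
    value collapses to 1."""
    return 1 if value else 0


def clock_stop_ok(state: int) -> bool:
    """The check the suspend path relies on, per the commit message."""
    return bool(state & AC_PWRST_CLK_STOP_OK)


if __name__ == "__main__":
    # Full 32-bit response: the CLK_STOP_OK bit is visible.
    print(clock_stop_ok(D3_RESPONSE))
    # After passing through a bool, only 0x1 is left and the bit is gone.
    print(clock_stop_ok(as_c_bool(D3_RESPONSE)))
```

Because the bit test never succeeds on the truncated value, the codec's bit stays set in the bitmask and the controller is never allowed to runtime suspend.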
      
      Link: https://bugs.freedesktop.org/show_bug.cgi?id=106597
      Fixes: 3b5b899c ("ALSA: hda: Make use of core codec functions to sync power state")
      Cc: Alex Deucher <alexander.deucher@amd.com>
      Cc: Abhijeet Kumar <abhijeet.kumar@intel.com>
      Reported-and-tested-by: Gunnar Krüger <taijian@posteo.de>
      Signed-off-by: Lukas Wunner <lukas@wunner.de>
      Acked-by: Alex Deucher <alexander.deucher@amd.com>
      Signed-off-by: Takashi Iwai <tiwai@suse.de>
    • Revert "mm/cma: manage the memory of the CMA area by using the ZONE_MOVABLE" · d883c6cf
      Joonsoo Kim authored
      This reverts the following commits that change CMA design in MM.
      
       3d2054ad ("ARM: CMA: avoid double mapping to the CMA area if CONFIG_HIGHMEM=y")
      
       1d47a3ec ("mm/cma: remove ALLOC_CMA")
      
       bad8c6c0 ("mm/cma: manage the memory of the CMA area by using the ZONE_MOVABLE")
      
      Ville reported the following error on i386.
      
        Inode-cache hash table entries: 65536 (order: 6, 262144 bytes)
        microcode: microcode updated early to revision 0x4, date = 2013-06-28
        Initializing CPU#0
        Initializing HighMem for node 0 (000377fe:00118000)
        Initializing Movable for node 0 (00000001:00118000)
        BUG: Bad page state in process swapper  pfn:377fe
        page:f53effc0 count:0 mapcount:-127 mapping:00000000 index:0x0
        flags: 0x80000000()
        raw: 80000000 00000000 00000000 ffffff80 00000000 00000100 00000200 00000001
        page dumped because: nonzero mapcount
        Modules linked in:
        CPU: 0 PID: 0 Comm: swapper Not tainted 4.17.0-rc5-elk+ #145
        Hardware name: Dell Inc. Latitude E5410/03VXMC, BIOS A15 07/11/2013
        Call Trace:
         dump_stack+0x60/0x96
         bad_page+0x9a/0x100
         free_pages_check_bad+0x3f/0x60
         free_pcppages_bulk+0x29d/0x5b0
         free_unref_page_commit+0x84/0xb0
         free_unref_page+0x3e/0x70
         __free_pages+0x1d/0x20
         free_highmem_page+0x19/0x40
         add_highpages_with_active_regions+0xab/0xeb
         set_highmem_pages_init+0x66/0x73
         mem_init+0x1b/0x1d7
         start_kernel+0x17a/0x363
         i386_start_kernel+0x95/0x99
         startup_32_smp+0x164/0x168
      
      The reason for this error is that the span of MOVABLE_ZONE is extended
      to whole node span for future CMA initialization, and, normal memory is
      wrongly freed here.  I submitted the fix and it seems to work, but,
      another problem happened.
      
      It is too late in the cycle to fix the latter problem, so I decided to
      revert the series.
      Reported-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
      Acked-by: Laura Abbott <labbott@redhat.com>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Merge branch 'for-4.17-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata · 577e75e0
      Linus Torvalds authored
      Pull libata fixes from Tejun Heo:
       "Nothing too interesting.  Four patches to update the blacklist and
        add a controller ID"
      
      * 'for-4.17-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata:
        ahci: Add PCI ID for Cannon Lake PCH-LP AHCI
        libata: blacklist Micron 500IT SSD with MU01 firmware
        libata: Apply NOLPM quirk for SAMSUNG PM830 CXM13D1Q.
        libata: Blacklist some Sandisk SSDs for NCQ
    • Merge tag 'for-linus-20180524' of git://git.kernel.dk/linux-block · b68ea0ee
      Linus Torvalds authored
      Pull block fixes from Jens Axboe:
       "Two fixes that should go into this release:
      
         - a loop writeback error clearing fix from Jeff
      
         - the sr sense fix from myself"
      
      * tag 'for-linus-20180524' of git://git.kernel.dk/linux-block:
        loop: clear wb_err in bd_inode when detaching backing file
        sr: pass down correctly sized SCSI sense buffer
      b68ea0ee
    • Linus Torvalds's avatar
      Merge tag 'pm-4.17-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · 9ca5a2ae
      Linus Torvalds authored
      Pull power management fix from Rafael Wysocki:
       "Fix a regression from the 4.15 cycle that caused the system suspend
        and resume overhead to increase on many systems and triggered more
        serious problems on some of them (Rafael Wysocki)"
      
      * tag 'pm-4.17-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
        PM / core: Fix direct_complete handling for devices with no callbacks
      9ca5a2ae
    • Daniel Borkmann's avatar
      bpf: properly enforce index mask to prevent out-of-bounds speculation · c93552c4
      Daniel Borkmann authored
      While reviewing the verifier code, I recently noticed that the
      following two program variants in relation to tail calls can be
      loaded.
      
      Variant 1:
      
        # bpftool p d x i 15
          0: (15) if r1 == 0x0 goto pc+3
          1: (18) r2 = map[id:5]
          3: (05) goto pc+2
          4: (18) r2 = map[id:6]
          6: (b7) r3 = 7
          7: (35) if r3 >= 0xa0 goto pc+2
          8: (54) (u32) r3 &= (u32) 255
          9: (85) call bpf_tail_call#12
         10: (b7) r0 = 1
         11: (95) exit
      
        # bpftool m s i 5
          5: prog_array  flags 0x0
              key 4B  value 4B  max_entries 4  memlock 4096B
        # bpftool m s i 6
          6: prog_array  flags 0x0
              key 4B  value 4B  max_entries 160  memlock 4096B
      
      Variant 2:
      
        # bpftool p d x i 20
          0: (15) if r1 == 0x0 goto pc+3
          1: (18) r2 = map[id:8]
          3: (05) goto pc+2
          4: (18) r2 = map[id:7]
          6: (b7) r3 = 7
          7: (35) if r3 >= 0x4 goto pc+2
          8: (54) (u32) r3 &= (u32) 3
          9: (85) call bpf_tail_call#12
         10: (b7) r0 = 1
         11: (95) exit
      
        # bpftool m s i 8
          8: prog_array  flags 0x0
              key 4B  value 4B  max_entries 160  memlock 4096B
        # bpftool m s i 7
          7: prog_array  flags 0x0
              key 4B  value 4B  max_entries 4  memlock 4096B
      
      In both cases the index masking inserted by the verifier in order
      to control out of bounds speculation from a CPU via b2157399
      ("bpf: prevent out-of-bounds speculation") seems to be incorrect
      in what it is enforcing. In the 1st variant, the mask is applied
      from the map with the significantly larger number of entries, so
      we would allow out of bounds speculation to a certain degree for
      the smaller map. In the 2nd variant, the mask is applied from the
      map with the smaller number of entries, so we get buggy behavior
      since we truncate the index of the larger map.
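
      The mismatch can be reproduced with plain arithmetic. The sketch
      below is illustrative only: it assumes the mask is derived by
      rounding max_entries up to a power of two and subtracting one, which
      is consistent with the 255 and 3 seen in the dumps above. The helper
      names are hypothetical and are not kernel functions.

```c
#include <assert.h>
#include <stdint.h>

/* Round v up to the next power of two (1 <= v <= 2^31). */
static uint32_t roundup_pow2(uint32_t v)
{
	uint32_t p = 1;

	assert(v >= 1 && v <= 0x80000000u);
	while (p < v)
		p <<= 1;
	return p;
}

/* Assumed derivation of the per-map mask from max_entries. */
static uint32_t index_mask_for(uint32_t max_entries)
{
	return roundup_pow2(max_entries) - 1;
}

/* Index that is actually used after the inserted AND with a mask. */
static uint32_t masked_index(uint32_t index, uint32_t mask)
{
	return index & mask;
}
```

      With these helpers, an index of 200 masked with the 160-entry map's
      mask stays 200, far beyond the bounds of a 4-entry map (variant 1),
      while an index of 7 masked with the 4-entry map's mask becomes 3 and
      so selects the wrong slot of the larger map (variant 2).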
      
      The original intent from commit b2157399 is to reject such
      occasions where two or more different tail call maps are used
      in the same tail call helper invocation. However, the check on
      the BPF_MAP_PTR_POISON is never hit since we never poisoned the
      saved pointer in the first place! We do this explicitly for map
      lookups but in case of tail calls we basically used the tail
      call map in insn_aux_data that was processed in the most recent
      path which the verifier walked. Thus any prior path that stored
      a pointer in insn_aux_data at the helper location was always
      overridden.
      
      Fix it by moving the map pointer poison logic into a small helper
      that covers both BPF helpers with the same logic. After that, the
      poison check in fixup_bpf_calls() is hit for tail calls and the
      program is rejected. The latter only happens in the unprivileged
      case since this is the *only* occasion where a rewrite needs to
      happen, and where such a rewrite is specific to the map (max_entries,
      index_mask). In the privileged case the rewrite is generic for
      the insn->imm / insn->code update so multiple maps from different
      paths can be handled just fine since all the remaining logic
      happens in the instruction processing itself. This is similar
      to the case of map lookups: in case there is a collision of
      maps in fixup_bpf_calls() we must skip the inlined rewrite since
      this will turn the generic instruction sequence into a non-
      generic one. Thus the patch_call_imm will simply update the
      insn->imm location where the bpf_map_lookup_elem() will later
      take care of the dispatch. Given we need this 'poison' state
      as a check, the information of whether a map is an unpriv_array
      gets lost, so enforcing it prior to that needs an additional
      state. In general this check is needed since there are some
      complex and tail call intensive BPF programs out there where
      LLVM tends to generate such code occasionally. We therefore
      convert the map_ptr rather into map_state to store all this
      w/o extra memory overhead, and the bit whether one of the maps
      involved in the collision was from an unpriv_array thus needs
      to be retained as well there.
      
      Fixes: b2157399 ("bpf: prevent out-of-bounds speculation")
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      c93552c4
    • Mika Westerberg's avatar
      ahci: Add PCI ID for Cannon Lake PCH-LP AHCI · 4544e403
      Mika Westerberg authored
      This one should be using the default LPM policy for mobile chipsets so
      add the PCI ID to the driver list of supported devices.
      Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: stable@vger.kernel.org
      4544e403
    • Laura Abbott's avatar
      arm64: Make sure permission updates happen for pmd/pud · 82034c23
      Laura Abbott authored
      Commit 15122ee2 ("arm64: Enforce BBM for huge IO/VMAP mappings")
      disallowed block mappings for ioremap since that code does not honor
      break-before-make. The same APIs are also used for permission updating
      though and the extra checks prevent the permission updates from happening,
      even though this should be permitted. This results in read-only permissions
      not being fully applied. Visibly, this can occasionally be seen as a failure
      on the built-in rodata test when the test data ends up in a section or
      as an odd RW gap on the page table dump. Fix this by using
      pgattr_change_is_safe instead of p*d_present for determining if the
      change is permitted.
      Reviewed-by: Kees Cook <keescook@chromium.org>
      Tested-by: Peter Robinson <pbrobinson@gmail.com>
      Reported-by: Peter Robinson <pbrobinson@gmail.com>
      Fixes: 15122ee2 ("arm64: Enforce BBM for huge IO/VMAP mappings")
      Signed-off-by: Laura Abbott <labbott@redhat.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      82034c23
    • Omar Sandoval's avatar
      Btrfs: fix error handling in btrfs_truncate() · d5014738
      Omar Sandoval authored
      Jun Wu at Facebook reported that an internal service was seeing a return
      value of 1 from ftruncate() on Btrfs in some cases. This is coming from
      the NEED_TRUNCATE_BLOCK return value from btrfs_truncate_inode_items().
      
      btrfs_truncate() uses two variables for error handling, ret and err.
      When btrfs_truncate_inode_items() returns non-zero, we set err to the
      return value. However, NEED_TRUNCATE_BLOCK is not an error. Make sure we
      only set err if ret is an error (i.e., negative).
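
      The rule can be sketched in isolation. This is a minimal
      illustration, not the actual btrfs_truncate() code: record_error()
      is a hypothetical helper, and NEED_TRUNCATE_BLOCK is redefined here
      as 1 only because that is the value reported above.

```c
#include <assert.h>

/* Stand-in for the kernel's positive, non-error status value. */
#define NEED_TRUNCATE_BLOCK 1

/*
 * Fold the return value of one step into the accumulated error.
 * Only negative values are errors; zero and positive status codes
 * leave the accumulated error untouched.
 */
static int record_error(int err, int ret)
{
	assert(err <= 0);
	if (ret < 0)
		return ret;
	return err;
}
```

      Feeding NEED_TRUNCATE_BLOCK through record_error() leaves err at 0,
      so the positive status can no longer leak out to ftruncate().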
      
      To reproduce the issue, mount a filesystem with -o compress-force=zstd
      and run the following program, which will see a return value of 1
      from ftruncate():
      
      #include <fcntl.h>
      #include <stdio.h>
      #include <stdlib.h>
      #include <unistd.h>
      
      int main(void) {
              char buf[256] = { 0 };
              int ret;
              int fd;
      
              fd = open("test", O_CREAT | O_WRONLY | O_TRUNC, 0666);
              if (fd == -1) {
                      perror("open");
                      return EXIT_FAILURE;
              }
      
              if (write(fd, buf, sizeof(buf)) != sizeof(buf)) {
                      perror("write");
                      close(fd);
                      return EXIT_FAILURE;
              }
      
              if (fsync(fd) == -1) {
                      perror("fsync");
                      close(fd);
                      return EXIT_FAILURE;
              }
      
              ret = ftruncate(fd, 128);
              if (ret) {
                      printf("ftruncate() returned %d\n", ret);
                      close(fd);
                      return EXIT_FAILURE;
              }
      
              close(fd);
              return EXIT_SUCCESS;
      }
      
      Fixes: ddfae63c ("btrfs: move btrfs_truncate_block out of trans handle")
      CC: stable@vger.kernel.org # 4.15+
      Reported-by: Jun Wu <quark@fb.com>
      Signed-off-by: Omar Sandoval <osandov@fb.com>
      Signed-off-by: David Sterba <dsterba@suse.com>
      d5014738
  4. 23 May, 2018 7 commits
    • oulijun's avatar
      RDMA/hns: Move the location for initializing tmp_len · 55ba49cb
      oulijun authored
      When posting a work request, the driver needs to compute the total
      length of all sges of each wr and fill it into the msg_len field of
      the send wqe. Thus, when posting multiple wrs, tmp_len should be
      reinitialized to zero for each one.
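
      A minimal sketch of the intended accumulation pattern follows. The
      structures and the fill_msg_len() name are hypothetical simplified
      stand-ins, not the hns driver's types; the point is only where the
      accumulator is initialized.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for a scatter/gather entry and a work request. */
struct sge {
	uint32_t length;
};

struct work_request {
	const struct sge *sg_list;
	size_t num_sge;
	uint32_t msg_len;
	struct work_request *next;
};

/* Compute msg_len for every work request in a chained list. */
static void fill_msg_len(struct work_request *wr)
{
	for (; wr != NULL; wr = wr->next) {
		/* Reset per WR so lengths do not carry over. */
		uint32_t tmp_len = 0;
		size_t i;

		assert(wr->num_sge == 0 || wr->sg_list != NULL);
		for (i = 0; i < wr->num_sge; i++)
			tmp_len += wr->sg_list[i].length;
		wr->msg_len = tmp_len;
	}
}
```

      If tmp_len were initialized once outside the outer loop, the second
      work request in a chain would report the sum of both requests.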
      
      Fixes: 8b9b8d14 ("RDMA/hns: Fix the endian problem for hns")
      Signed-off-by: Lijun Ou <oulijun@huawei.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
      55ba49cb
    • oulijun's avatar
      RDMA/hns: Bugfix for cq record db for kernel · 05d6a4dd
      oulijun authored
      When using the cq record db for the kernel, the driver needs to set
      hr_cq->db_en to 1 and configure the dma address of the record cq db
      of the qp context.
      
      Fixes: 86188a88 ("RDMA/hns: Support cq record doorbell for kernel space")
      Signed-off-by: Lijun Ou <oulijun@huawei.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
      05d6a4dd
    • Jason Gunthorpe's avatar
      IB/uverbs: Fix uverbs_attr_get_obj · f4602cbb
      Jason Gunthorpe authored
      The err pointer comes from uverbs_attr_get, not from the uobject member,
      which does not store an ERR_PTR.
      
      Fixes: be934cca ("IB/uverbs: Add device memory registration ioctl support")
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
      Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
      f4602cbb
    • Kalderon, Michal's avatar
      RDMA/qedr: Fix doorbell bar mapping for dpi > 1 · 30bf066c
      Kalderon, Michal authored
      Each user_context receives a separate dpi value and thus a different
      address on the doorbell bar. The qedr_mmap function needs to validate
      the address and map the doorbell bar accordingly.
      The current implementation always checked against the dpi=0 doorbell
      range, leading to a wrong mapping for the doorbell bar (it entered an
      else case that mapped the address differently). qedr_mmap should only
      be used for doorbells, so the else was actually wrong in the first
      place.
      This only has an effect on the arm architecture and is not an issue
      on x86 based architectures.
      This led to doorbells not occurring on arm based systems and left
      applications that use more than one dpi (or several applications
      running simultaneously) hanging.
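
      The per-dpi validation can be sketched as below. This is an
      illustration under stated assumptions, not the qedr code: it assumes
      equally sized, consecutive doorbell windows starting at db_base, and
      doorbell_range_ok() and its parameters are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Return true if [addr, addr + len) lies entirely inside the doorbell
 * window that belongs to the given dpi.
 */
static bool doorbell_range_ok(uint64_t db_base, uint64_t dpi_size,
			      uint32_t dpi, uint64_t addr, uint64_t len)
{
	uint64_t start, end;

	assert(dpi_size > 0);
	start = db_base + (uint64_t)dpi * dpi_size;
	end = start + dpi_size;

	return len > 0 && addr >= start && addr < end && len <= end - addr;
}
```

      Checking a request for dpi 1 against the window of dpi 0 fails in
      this sketch, which mirrors why a fixed dpi=0 check sent valid
      doorbell mappings down the wrong path.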
      
      Fixes: ac1b36e5 ("qedr: Add support for user context verbs")
      Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com>
      Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com>
      Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
      30bf066c
    • Jack Morgenstein's avatar
      net/mlx4: Fix irq-unsafe spinlock usage · d546b67c
      Jack Morgenstein authored
      spin_lock/unlock was used instead of spin_un/lock_irq
      in a procedure used in process space, on a spinlock
      which can be grabbed in an interrupt.
      
      This caused the stack trace below to be displayed (on kernel
      4.17.0-rc1 compiled with Lock Debugging enabled):
      
      [  154.661474] WARNING: SOFTIRQ-safe -> SOFTIRQ-unsafe lock order detected
      [  154.668909] 4.17.0-rc1-rdma_rc_mlx+ #3 Tainted: G          I
      [  154.675856] -----------------------------------------------------
      [  154.682706] modprobe/10159 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire:
      [  154.690254] 00000000f3b0e495 (&(&qp_table->lock)->rlock){+.+.}, at: mlx4_qp_remove+0x20/0x50 [mlx4_core]
      [  154.700927]
      and this task is already holding:
      [  154.707461] 0000000094373b5d (&(&cq->lock)->rlock/1){....}, at: destroy_qp_common+0x111/0x560 [mlx4_ib]
      [  154.718028] which would create a new lock dependency:
      [  154.723705]  (&(&cq->lock)->rlock/1){....} -> (&(&qp_table->lock)->rlock){+.+.}
      [  154.731922]
      but this new dependency connects a SOFTIRQ-irq-safe lock:
      [  154.740798]  (&(&cq->lock)->rlock){..-.}
      [  154.740800]
      ... which became SOFTIRQ-irq-safe at:
      [  154.752163]   _raw_spin_lock_irqsave+0x3e/0x50
      [  154.757163]   mlx4_ib_poll_cq+0x36/0x900 [mlx4_ib]
      [  154.762554]   ipoib_tx_poll+0x4a/0xf0 [ib_ipoib]
      ...
      to a SOFTIRQ-irq-unsafe lock:
      [  154.815603]  (&(&qp_table->lock)->rlock){+.+.}
      [  154.815604]
      ... which became SOFTIRQ-irq-unsafe at:
      [  154.827718] ...
      [  154.827720]   _raw_spin_lock+0x35/0x50
      [  154.833912]   mlx4_qp_lookup+0x1e/0x50 [mlx4_core]
      [  154.839302]   mlx4_flow_attach+0x3f/0x3d0 [mlx4_core]
      
      Since mlx4_qp_lookup() is called only in process space, we can
      simply replace the spin_un/lock calls with spin_un/lock_irq calls.
      
      Fixes: 6dc06c08 ("net/mlx4: Fix the check in attaching steering rules")
      Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
      Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      d546b67c
    • Florian Fainelli's avatar
      net: phy: broadcom: Fix bcm_write_exp() · 79fb218d
      Florian Fainelli authored
      On newer PHYs, we need to select the expansion register to write with
      setting bits [11:8] to 0xf. This was done correctly by bcm7xxx.c prior
      to being migrated to generic code under bcm-phy-lib.c which
      unfortunately used the older implementation from the BCM54xx days.
      
      Fix this by creating an inline stub: bcm_write_exp_sel() which adds the
      correct value (MII_BCM54XX_EXP_SEL_ER) and update both the Cygnus PHY
      and BCM7xxx PHY drivers which require setting these bits.
      
      broadcom.c is unchanged because some PHYs even use a different selector
      method, so let them specify it directly (e.g: SerDes secondary selector).
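
      The selector arithmetic can be sketched in userspace as follows.
      EXP_SEL_ER and exp_sel_reg() are hypothetical stand-ins: the constant
      is simply 0xf placed in bits [11:8] as described above, and the
      function only computes the register number that a stub such as
      bcm_write_exp_sel() would pass on.

```c
#include <assert.h>
#include <stdint.h>

/* 0xf in bits [11:8], standing in for MII_BCM54XX_EXP_SEL_ER. */
#define EXP_SEL_ER 0x0f00u

static_assert(EXP_SEL_ER == (0xfu << 8), "selector must occupy bits [11:8]");

/* Add the expansion register selector bits to a register number. */
static uint16_t exp_sel_reg(uint16_t reg)
{
	return (uint16_t)(reg | EXP_SEL_ER);
}
```

      For example, a register number of 0x0038 becomes 0x0f38 once the
      selector bits are set.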
      
      Fixes: a1cba561 ("net: phy: Add Broadcom phy library for common interfaces")
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      79fb218d
    • Florian Fainelli's avatar
      net: phy: broadcom: Fix auxiliary control register reads · 733a969a
      Florian Fainelli authored
      We are currently doing auxiliary control register reads with the shadow
      register value 0b111 (0x7) which incidentally is also the selector value
      that should be present in bits [2:0]. Fix this by using the appropriate
      selector mask which is defined (MII_BCM54XX_AUXCTL_SHDWSEL_MASK).
      
      This does not have a functional impact yet because we always access the
      MII_BCM54XX_AUXCTL_SHDWSEL_MISC (0x7) register in the current code.
      This might change at some point though.
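
      Why the two encodings coincide today can be sketched as below. This
      is an assumption-laden illustration: the read selector shift of 12
      and the shape of the old encoding (register number reused in bits
      [2:0]) are assumptions, and all names are hypothetical stand-ins for
      the driver's definitions.

```c
#include <assert.h>
#include <stdint.h>

#define AUXCTL_SHDWSEL_MASK 0x0007u /* selector value for bits [2:0] */
#define AUXCTL_SHDWSEL_MISC 0x0007u /* the only register read so far */
#define AUXCTL_READ_SHIFT   12      /* assumed position of read selector */

/* Assumed old encoding: register number reused as the selector. */
static uint16_t auxctl_read_word_old(uint16_t regnum)
{
	assert(regnum <= AUXCTL_SHDWSEL_MASK);
	return (uint16_t)(regnum | (regnum << AUXCTL_READ_SHIFT));
}

/* Fixed encoding: selector mask in bits [2:0], register number shifted. */
static uint16_t auxctl_read_word_new(uint16_t regnum)
{
	assert(regnum <= AUXCTL_SHDWSEL_MASK);
	return (uint16_t)(AUXCTL_SHDWSEL_MASK |
			  (regnum << AUXCTL_READ_SHIFT));
}
```

      Both functions return the same word for AUXCTL_SHDWSEL_MISC, which
      matches the statement that there is no functional impact yet; they
      diverge for any other register number.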
      
      Fixes: 5b4e2900 ("net: phy: broadcom: add bcm54xx_auxctl_read")
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      733a969a