1. 17 Nov, 2016 28 commits
    • Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma · 57400d30
      Linus Torvalds authored
      Pull rdma fixes from Doug Ledford.
       "First round of -rc fixes.
      
        Due to various issues, I've been away and couldn't send a pull request
        for about three weeks. There were a number of -rc patches that built
        up in the meantime (some were there already from the early -rc
        stages). Obviously, there were way too many to send now, so I tried to
        pare the list down to the more important patches for the -rc cycle.
      
        Most of the code has had plenty of soak time at the various vendors'
        testing setups, so I doubt there will be another -rc pull request this
        cycle. I also tried to limit the patches to those with smaller
        footprints, so even though a shortlog is longer than I would like, the
        actual diffstat is mostly very small with the exception of just three
        files that had more changes, and a couple files with pure removals.
      
        Summary:
         - Misc Intel hfi1 fixes
         - Misc Mellanox mlx4, mlx5, and rxe fixes
         - A couple cxgb4 fixes"
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (34 commits)
        iw_cxgb4: invalidate the mr when posting a read_w_inv wr
        iw_cxgb4: set *bad_wr for post_send/post_recv errors
        IB/rxe: Update qp state for user query
        IB/rxe: Clear queue buffer when modifying QP to reset
        IB/rxe: Fix handling of erroneous WR
        IB/rxe: Fix kernel panic in UDP tunnel with GRO and RX checksum
        IB/mlx4: Fix create CQ error flow
        IB/mlx4: Check gid_index return value
        IB/mlx5: Fix NULL pointer dereference on debug print
        IB/mlx5: Fix fatal error dispatching
        IB/mlx5: Resolve soft lock on massive reg MRs
        IB/mlx5: Use cache line size to select CQE stride
        IB/mlx5: Validate requested RQT size
        IB/mlx5: Fix memory leak in query device
        IB/core: Avoid unsigned int overflow in sg_alloc_table
        IB/core: Add missing check for addr_resolve callback return value
        IB/core: Set routable RoCE gid type for ipv4/ipv6 networks
        IB/cm: Mark stale CM id's whenever the mad agent was unregistered
        IB/uverbs: Fix leak of XRC target QPs
        IB/hfi1: Remove incorrect IS_ERR check
        ...
    • Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · bec1b089
      Linus Torvalds authored
      Pull vfs fixes from Al Viro:
       "A couple of regression fixes"
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
        fix iov_iter_advance() for ITER_PIPE
        xattr: Fix setting security xattrs on sockfs
    • Merge tag 'for-linus-4.9-rc5-ofs-1' of git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linux · d46bc34d
      Linus Torvalds authored
      Pull orangefs fix from Mike Marshall:
       "orangefs: add .owner to debugfs file_operations
      
        Without ".owner = THIS_MODULE" it is possible to crash the kernel by
        unloading the Orangefs module while someone is reading debugfs files"
      
      * tag 'for-linus-4.9-rc5-ofs-1' of git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linux:
        orangefs: add .owner to debugfs file_operations
    • mremap: fix race between mremap() and page cleanning · 5d190420
      Aaron Lu authored
      Prior to 3.15, there was a race between zap_pte_range() and
      page_mkclean() where writes to a page could be lost.  Dave Hansen
      discovered by inspection that there is a similar race between
      move_ptes() and page_mkclean().
      
      We've been able to reproduce the issue by enlarging the race window with
      a msleep(), but have not been able to hit it without modifying the code.
      So, we think it's a real issue, but is difficult or impossible to hit in
      practice.
      
      The zap_pte_range() issue was fixed by commit 1cf35d47 ("mm: split
      'tlb_flush_mmu()' into tlb flushing and memory freeing parts").  This
      patch fixes the race between page_mkclean() and mremap().
      
      Here is one possible way to hit the race: suppose a process has mmapped
      a file with READ | WRITE and SHARED, and has two threads bound to two
      different CPUs, e.g. CPU1 and CPU2.  mmap returned X, then thread 1
      wrote to addr X, so CPU1 now has a writable TLB entry for addr X.
      Thread 2 starts mremapping from addr X to Y while thread 1 cleans the
      page and then writes to the old addr X again.  The second write from
      thread 1 can succeed, but the value will be lost.
      
              thread 1                           thread 2
           (bound to CPU1)                    (bound to CPU2)
      
        1: write 1 to addr X to get a
           writeable TLB on this CPU
      
                                              2: mremap starts
      
                                              3: move_ptes emptied PTE for addr X
                                                 and setup new PTE for addr Y and
                                                 then dropped PTL for X and Y
      
        4: page laundering for N by doing
           fadvise FADV_DONTNEED. When done,
           pageframe N is deemed clean.
      
        5: *write 2 to addr X
      
                                              6: tlb flush for addr X
      
        7: munmap (Y, pagesize) to make the
           page unmapped
      
        8: fadvise with FADV_DONTNEED again
           to kick the page off the pagecache
      
        9: pread the page from file to verify
           the value. If 1 is there, it means
           we have lost the written 2.
      
        * The write may or may not cause a segmentation fault, depending on
          whether the TLB entry is still on the CPU.
      
      Please note that this is only one specific way the race could occur;
      it does not mean that the race can only occur in exactly the above
      configuration, e.g. more than 2 threads could be involved and fadvise()
      could be done in another thread, etc.
      
      Anonymous pages could race between mremap() and page reclaim.  For THP:
      a huge PMD is moved by mremap to a new huge PMD, then the new huge PMD
      gets unmapped/split/paged out before the TLB flush happens for the old
      huge PMD in move_page_tables(), so data could still be written to it.
      Normal anonymous pages are in a similar situation.
      
      To fix this, check for any dirty PTE in move_ptes()/move_huge_pmd() and,
      if there is one, do the flush before dropping the PTL.  If the flush was
      done for every move_ptes()/move_huge_pmd() call, there is no need to do
      the flush in move_page_tables() for the whole range.  Otherwise, the
      whole range flush is still needed.
      
      Alternatively, we could track which parts of the range were flushed in
      move_ptes()/move_huge_pmd() and which were not, to avoid flushing the
      whole range in move_page_tables().  But that would require multiple TLB
      flushes for the different sub-ranges and should be less efficient than
      the single whole range flush.
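
The flush decision described in the fix can be modeled in user space. The following is a minimal sketch and not the kernel code: `struct pte_model`, `struct flush_stats`, `move_ptes_model()` and `move_page_tables_model()` are hypothetical names, and a "flush" is only a counter increment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a page table entry. */
struct pte_model {
    bool present;
    bool dirty;
};

/* Counters standing in for TLB flushes; illustration only. */
struct flush_stats {
    int early_flushes;  /* done while the (modeled) PTL is still held */
    int range_flushes;  /* done once for the whole range afterwards   */
};

/*
 * Model of one move_ptes() call: if any moved entry is dirty, flush
 * before "dropping the lock"; otherwise tell the caller that a later
 * whole-range flush is still needed.
 */
void move_ptes_model(const struct pte_model *ptes, size_t n,
                     struct flush_stats *st, bool *need_flush)
{
    bool force_flush = false;

    for (size_t i = 0; i < n; i++) {
        if (ptes[i].present && ptes[i].dirty)
            force_flush = true;
    }

    if (force_flush)
        st->early_flushes++;    /* flush under the lock */
    else
        *need_flush = true;     /* defer to the caller  */
    /* the page table lock would be dropped here */
}

/* Model of the caller: one whole-range flush only if some chunk deferred. */
void move_page_tables_model(const struct pte_model *ptes, size_t n,
                            size_t chunk, struct flush_stats *st)
{
    bool need_flush = false;

    assert(chunk > 0);
    for (size_t off = 0; off < n; off += chunk) {
        size_t len = (n - off < chunk) ? n - off : chunk;

        move_ptes_model(ptes + off, len, st, &need_flush);
    }
    if (need_flush)
        st->range_flushes++;
}
```

In this model a range with no dirty entries costs exactly one whole-range flush, which matches the trade-off argued above: early flushes happen only for the chunks that actually contain dirty entries.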
      
      KBuild test on my Sandybridge desktop doesn't show any noticeable change.
      v4.9-rc4:
        real    5m14.048s
        user    32m19.800s
        sys     4m50.320s
      
      With this commit:
        real    5m13.888s
        user    32m19.330s
        sys     4m51.200s
      Reported-by: Dave Hansen <dave.hansen@intel.com>
      Signed-off-by: Aaron Lu <aaron.lu@intel.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • fix iov_iter_advance() for ITER_PIPE · 680bb946
      Abhi Das authored
      iov_iter_advance() needs to decrement iter->count by the number of
      bytes we'd moved beyond.  Normal flavours do that, but ITER_PIPE
      doesn't and ITER_PIPE generic_file_read_iter() for O_DIRECT files
      ends up with a bogus fallback to page cache read, resulting in incorrect
      values for file offset and bytes read.
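
The invariant at stake can be shown with a toy iterator. This is a sketch under stated assumptions, not the kernel's `struct iov_iter`: `struct iter_model`, `iter_advance_model()` and `iter_has_more()` are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified iterator; not the kernel's struct iov_iter. */
struct iter_model {
    size_t pos;    /* bytes already consumed      */
    size_t count;  /* bytes still left to process */
};

/* Advance by up to `size` bytes, keeping `count` in step with `pos`. */
void iter_advance_model(struct iter_model *it, size_t size)
{
    assert(it != NULL);
    if (size > it->count)
        size = it->count;
    it->pos += size;
    it->count -= size;  /* the step the commit says ITER_PIPE skipped */
}

/* A caller that uses `count` to decide whether more work remains. */
int iter_has_more(const struct iter_model *it)
{
    return it->count != 0;
}
```

If `count` were left untouched, a caller like `iter_has_more()` would still see pending bytes after a complete read and would go on to a second, unnecessary read path.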
      Signed-off-by: Abhi Das <adas@redhat.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • xattr: Fix setting security xattrs on sockfs · 4a590153
      Andreas Gruenbacher authored
      The IOP_XATTR flag is set on sockfs because sockfs supports getting the
      "system.sockprotoname" xattr.  Since commit 6c6ef9f2, this flag is checked for
      setxattr support as well.  This is wrong on sockfs because security xattr
      support there is supposed to be provided by security_inode_setsecurity.  The
      smack security module relies on socket labels (xattrs).
      
      Fix this by adding a security xattr handler on sockfs that returns
      -EAGAIN, and by checking for -EAGAIN in setxattr.
      
      We cannot simply check for -EOPNOTSUPP in setxattr because there are
      filesystems that neither have direct security xattr support nor support
      via security_inode_setsecurity.  A more proper fix might be to move the
      call to security_inode_setsecurity into sockfs, but it's not clear to me
      if that is safe: we would end up calling security_inode_post_setxattr after
      that as well.
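
The -EAGAIN handshake can be sketched in user space as follows. All function names here (`sockfs_like_set()`, `plain_fs_set()`, `security_hook_model()`, `setxattr_model()`) are hypothetical stand-ins, and the security hook is assumed to always succeed; the sketch only illustrates why -EAGAIN, rather than -EOPNOTSUPP, is used as the signal.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

#define SEC_PREFIX "security."
#define SEC_PREFIX_LEN (sizeof(SEC_PREFIX) - 1)

/* Hypothetical sockfs-like handler: defers security.* to the LSM. */
int sockfs_like_set(const char *name)
{
    assert(name != NULL);
    if (strncmp(name, SEC_PREFIX, SEC_PREFIX_LEN) == 0)
        return -EAGAIN;     /* "ask the security module instead" */
    return -EOPNOTSUPP;
}

/* Hypothetical filesystem with no security xattr support at all. */
int plain_fs_set(const char *name)
{
    (void)name;
    return -EOPNOTSUPP;
}

/* Hypothetical stand-in for the LSM hook; assumed to succeed. */
int security_hook_model(const char *name)
{
    (void)name;
    return 0;
}

/* Model of the generic setxattr path with the -EAGAIN fallback. */
int setxattr_model(int (*fs_set)(const char *), const char *name)
{
    int err = fs_set(name);

    if (err == -EAGAIN) {
        err = -EOPNOTSUPP;
        if (strncmp(name, SEC_PREFIX, SEC_PREFIX_LEN) == 0)
            err = security_hook_model(name);
    }
    return err;
}
```

A filesystem that returns plain -EOPNOTSUPP never reaches the security hook in this model, which is the distinction the commit message draws.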
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • Merge tag 'drm-fixes-for-v4.9-rc6' of git://people.freedesktop.org/~airlied/linux · 961b708e
      Linus Torvalds authored
      Pull drm fixes from Dave Airlie:
       "Fixes for amdgpu, and a bunch of arm drivers.
      
        There seems to be an uptick in the ARM drivers sending things for
        fixes which is good, so I've decided to dequeue a bit early, more
        stuff may arrive before the weekend.
      
        This contains mediatek, arcpgu, sunxi, fsl-dcu display controller
        fixes along with 3 amdgpu fixes, one for a fencing issue with
        secondary GPUs"
      
      * tag 'drm-fixes-for-v4.9-rc6' of git://people.freedesktop.org/~airlied/linux:
        drm/amdgpu:fix vpost_needed routine
        drm/amdgpu/powerplay: drop a redundant NULL check
        drm/amdgpu: Attach exclusive fence to prime exported bo's. (v5)
        drm/arcpgu: Accommodate adv7511 switch to DRM bridge
        drm/fsl-dcu: disable planes before disabling CRTC
        drm/fsl-dcu: update all registers on flush
        drm/fsl-dcu: do not update when modifying irq registers
        drm/sun4i: Propagate error to the caller
        drm/sun4i: Fix error handling
        drm/mediatek: modify the factor to make the pll_rate set in the 1G-2G range
        drm/mediatek: enhance the HDMI driving current
        drm/mediatek: do mtk_hdmi_send_infoframe after HDMI clock enable
        drm/mediatek: clear IRQ status before enable OVL interrupt
        drm/mediatek: set vblank_disable_allowed to true
        drm/mediatek: fix a typo of OD_CFG to OD_RELAYMODE
        drm/sun4i: rgb: Remove the bridge enable/disable functions
        drm/sun4i: rgb: Enable panel after controller
    • iw_cxgb4: invalidate the mr when posting a read_w_inv wr · 5c6b2aaf
      Steve Wise authored
      Also, rearrange things a bit to have a common c4iw_invalidate_mr()
      function used everywhere that we need to invalidate.
      
      Fixes: 49b53a93 ("iw_cxgb4: add fast-path for small REG_MR operations")
      Signed-off-by: Steve Wise <swise@opengridcomputing.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
    • iw_cxgb4: set *bad_wr for post_send/post_recv errors · 4ff522ea
      Steve Wise authored
      There are a few cases in c4iw_post_send() and c4iw_post_receive()
      where *bad_wr is not set when an error is returned.  This can
      cause a crash if the application tries to use bad_wr.
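
The contract being restored, that every error return also reports the failing request through `*bad_wr`, can be sketched with a simplified model. `struct wr_model`, `MAX_SGE_MODEL` and `post_send_model()` are hypothetical and are not the cxgb4 or verbs definitions.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified work request; not struct ib_send_wr. */
struct wr_model {
    int num_sge;
    struct wr_model *next;
};

#define MAX_SGE_MODEL 4  /* assumed limit, for illustration only */

/*
 * Post a chain of work requests. On every error path, *bad_wr is set
 * to the request that failed so the caller can tell where posting
 * stopped.
 */
int post_send_model(int qp_ready, struct wr_model *wr,
                    struct wr_model **bad_wr)
{
    assert(bad_wr != NULL);

    if (!qp_ready) {
        *bad_wr = wr;   /* early exit: still report bad_wr */
        return -EINVAL;
    }
    for (; wr; wr = wr->next) {
        if (wr->num_sge > MAX_SGE_MODEL) {
            *bad_wr = wr;
            return -EINVAL;
        }
        /* ... the request would be queued to hardware here ... */
    }
    return 0;
}
```

A caller that inspects `bad_wr` after a failure then always finds a valid pointer instead of an uninitialized one.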
      Signed-off-by: Steve Wise <swise@opengridcomputing.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
    • 6fa1f2f0
      Doug Ledford authored
    • IB/rxe: Update qp state for user query · 6d931308
      Yonatan Cohen authored
      The method rxe_qp_error() transitions the QP to the error state
      and makes sure the QP is drained. However, it did not update
      the QP state reported to a user's query.
      
      This patch fixes this.
      
      Fixes: 8700e3e7 ("Soft RoCE driver")
      Signed-off-by: Yonatan Cohen <yonatanc@mellanox.com>
      Reviewed-by: Moni Shoua <monis@mellanox.com>
      Signed-off-by: Leon Romanovsky <leon@kernel.org>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
    • IB/rxe: Clear queue buffer when modifying QP to reset · aa75b07b
      Yonatan Cohen authored
      RXE resets the send-q only once in rxe_qp_init_req() when
      QP is created, but when the QP is reused after QP reset, the send-q
      holds previous garbage data.
      
      This garbage data wrongly fails CQEs that otherwise
      should have completed successfully.
      
      Fixes: 8700e3e7 ("Soft RoCE driver")
      Signed-off-by: Yonatan Cohen <yonatanc@mellanox.com>
      Reviewed-by: Moni Shoua <monis@mellanox.com>
      Signed-off-by: Leon Romanovsky <leon@kernel.org>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
    • IB/rxe: Fix handling of erroneous WR · 002e062e
      Yonatan Cohen authored
      To correctly handle an erroneous WR, this fix does the following:
      1. Make sure the bad WQE causes a user completion event.
      2. Call rxe_completer to handle the erred WQE.
      
      Before the fix, when rxe_requester found a bad WQE, it changed its
      status to IB_WC_LOC_PROT_ERR and exited with 0 for non-RC QPs.
      
      If this was the 1st WQE then there would be no ACK to invoke the
      completer and this bad WQE would be stuck in the QP's send-q.
      
      On top of that the requester exiting with 0 caused rxe_do_task to
      endlessly invoke rxe_requester, resulting in a soft-lockup attached
      below.
      
      In case the WQE was not the 1st and rxe_completer did get a chance to
      handle the bad WQE, it did not cause a completion event since the WQE's
      IB_SEND_SIGNALED flag was not set.
      
      Setting WQE status to IB_SEND_SIGNALED is subject to IBA spec
      version 1.2.1, section 10.7.3.1 Signaled Completions.
      
      NMI watchdog: BUG: soft lockup - CPU#7 stuck for 22s!
      [<ffffffffa0590145>] ? rxe_pool_get_index+0x35/0xb0 [rdma_rxe]
      [<ffffffffa05952ec>] lookup_mem+0x3c/0xc0 [rdma_rxe]
      [<ffffffffa0595534>] copy_data+0x1c4/0x230 [rdma_rxe]
      [<ffffffffa058c180>] rxe_requester+0x9d0/0x1100 [rdma_rxe]
      [<ffffffff8158e98a>] ? kfree_skbmem+0x5a/0x60
      [<ffffffffa05962c9>] rxe_do_task+0x89/0xf0 [rdma_rxe]
      [<ffffffffa05963e2>] rxe_run_task+0x12/0x30 [rdma_rxe]
      [<ffffffffa059110a>] rxe_post_send+0x41a/0x550 [rdma_rxe]
      [<ffffffff811ef922>] ? __kmalloc+0x182/0x200
      [<ffffffff816ba512>] ? down_read+0x12/0x40
      [<ffffffffa054bd32>] ib_uverbs_post_send+0x532/0x540 [ib_uverbs]
      [<ffffffff815f8722>] ? tcp_sendmsg+0x402/0xb80
      [<ffffffffa05453dc>] ib_uverbs_write+0x18c/0x3f0 [ib_uverbs]
      [<ffffffff81623c2e>] ? inet_recvmsg+0x7e/0xb0
      [<ffffffff8158764d>] ? sock_recvmsg+0x3d/0x50
      [<ffffffff81215b87>] __vfs_write+0x37/0x140
      [<ffffffff81216892>] vfs_write+0xb2/0x1b0
      [<ffffffff81217ce5>] SyS_write+0x55/0xc0
      [<ffffffff816bc672>] entry_SYSCALL_64_fastpath+0x1a/0xa
      
      Fixes: 8700e3e7 ("Soft RoCE driver")
      Signed-off-by: Yonatan Cohen <yonatanc@mellanox.com>
      Reviewed-by: Moni Shoua <monis@mellanox.com>
      Signed-off-by: Leon Romanovsky <leon@kernel.org>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
    • IB/rxe: Fix kernel panic in UDP tunnel with GRO and RX checksum · 1454ca3a
      Yonatan Cohen authored
      Missing initialization of udp_tunnel_sock_cfg causes the following
      kernel panic while the kernel tries to execute gro_receive().

      While at it, we converted udp_port_cfg to use the same
      initialization scheme as udp_tunnel_sock_cfg.
      
      ------------[ cut here ]------------
      kernel tried to execute NX-protected page - exploit attempt? (uid: 0)
      BUG: unable to handle kernel paging request at ffffffffa0588c50
      IP: [<ffffffffa0588c50>] __this_module+0x50/0xffffffffffff8400 [ib_rxe]
      PGD 1c09067 PUD 1c0a063 PMD bb394067 PTE 80000000ad5e8163
      Oops: 0011 [#1] SMP
      Modules linked in: ib_rxe ip6_udp_tunnel udp_tunnel
      CPU: 5 PID: 0 Comm: swapper/5 Not tainted 4.7.0-rc3+ #2
      Hardware name: Red Hat KVM, BIOS Bochs 01/01/2011
      task: ffff880235e4e680 ti: ffff880235e68000 task.ti: ffff880235e68000
      RIP: 0010:[<ffffffffa0588c50>]
      [<ffffffffa0588c50>] __this_module+0x50/0xffffffffffff8400 [ib_rxe]
      RSP: 0018:ffff880237343c80  EFLAGS: 00010282
      RAX: 00000000dffe482d RBX: ffff8800ae330900 RCX: 000000002001b712
      RDX: ffff8800ae330900 RSI: ffff8800ae102578 RDI: ffff880235589c00
      RBP: ffff880237343cb0 R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000000 R12: ffff8800ae33e262
      R13: ffff880235589c00 R14: 0000000000000014 R15: ffff8800ae102578
      FS:  0000000000000000(0000) GS:ffff880237340000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: ffffffffa0588c50 CR3: 0000000001c06000 CR4: 00000000000006e0
      Stack:
      ffffffff8160860e ffff8800ae330900 ffff8800ae102578 0000000000000014
      000000000000004e ffff8800ae102578 ffff880237343ce0 ffffffff816088fb
      0000000000000000 ffff8800ae330900 0000000000000000 00000000ffad0000
      Call Trace:
      <IRQ>
      [<ffffffff8160860e>] ? udp_gro_receive+0xde/0x130
      [<ffffffff816088fb>] udp4_gro_receive+0x10b/0x2d0
      [<ffffffff81611373>] inet_gro_receive+0x1d3/0x270
      [<ffffffff81594e29>] dev_gro_receive+0x269/0x3b0
      [<ffffffff81595188>] napi_gro_receive+0x38/0x120
      [<ffffffffa011caee>] mlx5e_handle_rx_cqe+0x27e/0x340 [mlx5_core]
      [<ffffffffa011d076>] mlx5e_poll_rx_cq+0x66/0x6d0 [mlx5_core]
      [<ffffffffa011d7ae>] mlx5e_napi_poll+0x8e/0x400 [mlx5_core]
      [<ffffffff815949a0>] net_rx_action+0x160/0x380
      [<ffffffff816a9197>] __do_softirq+0xd7/0x2c5
      [<ffffffff81085c35>] irq_exit+0xf5/0x100
      [<ffffffff816a8f16>] do_IRQ+0x56/0xd0
      [<ffffffff816a6dcc>] common_interrupt+0x8c/0x8c
      <EOI>
      [<ffffffff81061f96>] ? native_safe_halt+0x6/0x10
      [<ffffffff81037ade>] default_idle+0x1e/0xd0
      [<ffffffff8103828f>] arch_cpu_idle+0xf/0x20
      [<ffffffff810c37dc>] default_idle_call+0x3c/0x50
      [<ffffffff810c3b13>] cpu_startup_entry+0x323/0x3c0
      [<ffffffff81050d8c>] start_secondary+0x15c/0x1a0
      RIP  [<ffffffffa0588c50>] __this_module+0x50/0xffffffffffff8400 [ib_rxe]
      RSP <ffff880237343c80>
      CR2: ffffffffa0588c50
      ---[ end trace 489ee31fa7614ac5 ]---
      Kernel panic - not syncing: Fatal exception in interrupt
      Kernel Offset: disabled
      ---[ end Kernel panic - not syncing: Fatal exception in interrupt
      ------------[ cut here ]------------
      
      Fixes: 8700e3e7 ("Soft RoCE driver")
      Signed-off-by: Yonatan Cohen <yonatanc@mellanox.com>
      Reviewed-by: Moni Shoua <monis@mellanox.com>
      Signed-off-by: Leon Romanovsky <leon@kernel.org>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
    • IB/mlx4: Fix create CQ error flow · 593ff73b
      Matan Barak authored
      Currently, if ib_copy_to_udata fails, the CQ
      won't be deleted from the radix tree and the HW (HW2SW).
      
      Fixes: 225c7b1f ('IB/mlx4: Add a driver Mellanox ConnectX InfiniBand adapters')
      Signed-off-by: Matan Barak <matanb@mellanox.com>
      Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
      Reviewed-by: Mark Bloch <markb@mellanox.com>
      Signed-off-by: Leon Romanovsky <leon@kernel.org>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
    • IB/mlx4: Check gid_index return value · 37995116
      Daniel Jurgens authored
      Check the returned GID index value and return an error if it is invalid.
      
      Fixes: 5070cd22 ('IB/mlx4: Replace mechanism for RoCE GID management')
      Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
      Reviewed-by: Mark Bloch <markb@mellanox.com>
      Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
      Signed-off-by: Leon Romanovsky <leon@kernel.org>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
    • IB/mlx5: Fix NULL pointer dereference on debug print · a1ab8402
      Eli Cohen authored
      For XRC QPs, CQs may not exist. Check before attempting to dereference them.
      
      Fixes: e126ba97 ('mlx5: Add driver for Mellanox Connect-IB adapters')
      Signed-off-by: Eli Cohen <eli@mellanox.com>
      Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
      Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
      Signed-off-by: Leon Romanovsky <leon@kernel.org>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
    • IB/mlx5: Fix fatal error dispatching · dbaaff2a
      Eli Cohen authored
      When an internal error condition is detected, make sure to set the
      device inactive after dispatching the event so ULPs can get a
      notification of this event.
      
      Fixes: e126ba97 ('mlx5: Add driver for Mellanox Connect-IB adapters')
      Signed-off-by: Eli Cohen <eli@mellanox.com>
      Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
      Reviewed-by: Mohamad Haj Yahia <mohamad@mellanox.com>
      Signed-off-by: Leon Romanovsky <leon@kernel.org>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
    • IB/mlx5: Resolve soft lock on massive reg MRs · 6bc1a656
      Moshe Lazer authored
      When reg_mr is called for large MRs (e.g. 4GB) from multiple processes
      and the MR caches can't supply the required number of MRs, the slow path
      of MR allocation may be used. In this case we need to serialize the
      slow path between the processes to avoid a soft lockup.
      
      Fixes: e126ba97 ('mlx5: Add driver for Mellanox Connect-IB adapters')
      Signed-off-by: Moshe Lazer <moshel@mellanox.com>
      Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
      Reviewed-by: Eli Cohen <eli@mellanox.com>
      Signed-off-by: Leon Romanovsky <leon@kernel.org>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
    • IB/mlx5: Use cache line size to select CQE stride · 16b0e069
      Daniel Jurgens authored
      When creating kernel CQs, use a 128B CQE stride if the
      cache line size is 128B, 64B otherwise.  This prevents
      multiple CQEs from residing in a 128B cache line,
      which can cause retries when there are concurrent
      reads and writes in one cache line.
      
      Tested with IPoIB on PPC64, saw ~5% throughput
      improvement.
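
The selection rule is small enough to state as code. This is a sketch only: `cqe_stride_for()` and `cqes_per_cache_line()` are hypothetical helpers, not mlx5 functions, and sizes are plain byte counts.

```c
#include <assert.h>

/* Choose the CQE stride (bytes) from the cache line size (bytes). */
int cqe_stride_for(int cache_line_size)
{
    return cache_line_size == 128 ? 128 : 64;
}

/* How many CQEs of a given stride share one cache line (at least 1). */
int cqes_per_cache_line(int cache_line_size, int stride)
{
    int n;

    assert(stride > 0);
    n = cache_line_size / stride;
    return n > 0 ? n : 1;
}
```

With a 128B cache line, a 64B stride packs two CQEs into one line, while the selected 128B stride leaves exactly one CQE per line.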
      
      Fixes: e126ba97 ('mlx5: Add driver for Mellanox Connect-IB adapters')
      Signed-off-by: Daniel Jurgens <danielj@mellanox.com>
      Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
      Signed-off-by: Leon Romanovsky <leon@kernel.org>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
    • IB/mlx5: Validate requested RQT size · efd7f400
      Maor Gottlieb authored
      Validate that the requested size of RQT is supported by firmware.
      
      Fixes: c5f90929 ('IB/mlx5: Add Receive Work Queue Indirection table operations')
      Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
      Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
      Signed-off-by: Leon Romanovsky <leon@kernel.org>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
    • IB/mlx5: Fix memory leak in query device · 90be7c8a
      Majd Dibbiny authored
      We need to free dev->port when we fail to enable RoCE or
      initialize node data.
      
      Fixes: 0837e86a ('IB/mlx5: Add per port counters')
      Signed-off-by: Majd Dibbiny <majd@mellanox.com>
      Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
      Reviewed-by: Mark Bloch <markb@mellanox.com>
      Signed-off-by: Leon Romanovsky <leon@kernel.org>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
    • IB/core: Avoid unsigned int overflow in sg_alloc_table · 3c7ba576
      Mark Bloch authored
      sg_alloc_table() takes an unsigned int parameter, while the driver
      passes the page count as a size_t. Check that npages isn't greater
      than the maximum unsigned int.
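
The kind of range check described here can be sketched as follows. `alloc_table_model()` and `pin_pages_model()` are hypothetical stand-ins, not the kernel functions; the point is only that a `size_t` value must be validated before it is narrowed to `unsigned int`.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stddef.h>

/* The narrowing below only makes sense if size_t is at least as wide. */
static_assert(sizeof(size_t) >= sizeof(unsigned int),
              "size_t narrower than unsigned int");

/* Hypothetical stand-in for an allocator that takes an unsigned int. */
int alloc_table_model(unsigned int nents)
{
    return nents ? 0 : -EINVAL;
}

/* Validate the size_t page count before narrowing it. */
int pin_pages_model(size_t npages)
{
    if (npages == 0 || npages > UINT_MAX)
        return -EINVAL;
    return alloc_table_model((unsigned int)npages);
}
```

Without the check, a page count just above `UINT_MAX` would be silently truncated to a small value when passed to the allocator.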
      
      Fixes: eeb8461e ("IB: Refactor umem to use linear SG table")
      Signed-off-by: Mark Bloch <markb@mellanox.com>
      Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
      Signed-off-by: Leon Romanovsky <leon@kernel.org>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
    • IB/core: Add missing check for addr_resolve callback return value · 61c37028
      Mark Bloch authored
      When calling rdma_resolve_ip inside rdma_addr_find_l2_eth_by_grh,
      the return status of the request was ignored in the callback function
      causing a successful return and an empty dmac.
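
The pattern of propagating a callback's status can be sketched in user space. Everything named here (`struct resolve_ctx`, `resolve_cb_model()`, `find_dmac_model()`, `ETH_ALEN_MODEL`) is hypothetical, and the resolver is modeled as a direct, synchronous call to the callback.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

#define ETH_ALEN_MODEL 6

/* Hypothetical context shared between the caller and the callback. */
struct resolve_ctx {
    int status;
    unsigned char dmac[ETH_ALEN_MODEL];
};

/* Callback: record the status first, copy the address only on success. */
void resolve_cb_model(int status, const unsigned char *mac, void *context)
{
    struct resolve_ctx *ctx = context;

    assert(ctx != NULL);
    ctx->status = status;
    if (status == 0)
        memcpy(ctx->dmac, mac, ETH_ALEN_MODEL);
}

/* Caller: propagate the callback status instead of returning success. */
int find_dmac_model(int resolver_status, const unsigned char *mac,
                    unsigned char *dmac_out)
{
    struct resolve_ctx ctx;

    memset(&ctx, 0, sizeof(ctx));
    resolve_cb_model(resolver_status, mac, &ctx);
    if (ctx.status)
        return ctx.status;
    memcpy(dmac_out, ctx.dmac, ETH_ALEN_MODEL);
    return 0;
}
```

In this model a failed resolution returns the error and leaves the output buffer untouched, rather than reporting success with an empty address.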
      Signed-off-by: Mark Bloch <markb@mellanox.com>
      Signed-off-by: Alex Vesker <valex@mellanox.com>
      Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: Leon Romanovsky <leon@kernel.org>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
    • IB/core: Set routable RoCE gid type for ipv4/ipv6 networks · aeb76df4
      Leon Romanovsky authored
      On Thu, Oct 27, 2016 at 04:36:28PM +0300, Leon Romanovsky wrote:
      > From: Mark Bloch <markb@mellanox.com>
      >
      > If the underlying netowrk type is ipv4 or ipv6 and the device supports
      > routable RoCE, prefer it so the traffic could cross subnets.
      >
      > Signed-off-by: Mark Bloch <markb@mellanox.com>
      > Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
      > Signed-off-by: Leon Romanovsky <leon@kernel.org>
      > ---
      
      Hi Doug,
      
      Please take the following v1 of this patch where I fixed spelling error
      from "netowrk" to be "network".
      
      Thanks.
      
      >From 09f96ba3e9b4442cfb44dca04c6726e55525c9c3 Mon Sep 17 00:00:00 2001
      From: Mark Bloch <markb@mellanox.com>
      Date: Sun, 11 Sep 2016 06:25:10 +0000
      Subject: [PATCH rdma-rc v1 3/6] IB/core: Set routable RoCE gid type for ipv4/ipv6
       networks
      
      If the underlying network type is ipv4 or ipv6 and the device supports
      routable RoCE, prefer it so the traffic could cross subnets.
      Signed-off-by: Mark Bloch <markb@mellanox.com>
      Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
      Signed-off-by: Leon Romanovsky <leon@kernel.org>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
    • IB/cm: Mark stale CM id's whenever the mad agent was unregistered · 9db0ff53
      Mark Bloch authored
      When a CM id object has a port assigned to it, the cm-id has asked to
      go through that specific port, but if that port was removed (hot-unplug
      event) the cm-id was not updated.  To fix that, the port keeps a list
      of all the cm-ids that plan to go through it, and whenever the port is
      removed it marks all of them as invalid.
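
The bookkeeping described here can be sketched with a singly linked list. `struct cm_id_model`, `struct port_model` and the three functions are hypothetical names; locking, which real code would need, is deliberately left out of this model.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical CM id that remembers whether its port is still usable. */
struct cm_id_model {
    bool port_valid;
    struct cm_id_model *next;
};

/* Hypothetical port that tracks every CM id planning to use it. */
struct port_model {
    struct cm_id_model *ids;
};

/* Register a CM id with the port it intends to go through. */
void port_attach_id(struct port_model *port, struct cm_id_model *id)
{
    assert(port != NULL && id != NULL);
    id->port_valid = true;
    id->next = port->ids;
    port->ids = id;
}

/* On port removal, mark every tracked CM id as stale and unlink it. */
void port_remove(struct port_model *port)
{
    struct cm_id_model *id = port->ids;

    while (id) {
        struct cm_id_model *next = id->next;

        id->port_valid = false;
        id->next = NULL;
        id = next;
    }
    port->ids = NULL;
}

/* A send attempt fails cleanly instead of touching a removed port. */
int cm_send_model(const struct cm_id_model *id)
{
    return id->port_valid ? 0 : -ENODEV;
}
```

After `port_remove()`, a later send on any of the tracked ids returns an error instead of dereferencing state that belonged to the removed port.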
      
      This commit fixes a kernel panic that occurs when traffic is running
      between guests and one guest is force-rebooted mid-traffic:
      
       Call Trace:
        [<ffffffff815271fa>] ? panic+0xa7/0x16f
        [<ffffffff8152b534>] ? oops_end+0xe4/0x100
        [<ffffffff8104a00b>] ? no_context+0xfb/0x260
        [<ffffffff81084db2>] ? del_timer_sync+0x22/0x30
        [<ffffffff8104a295>] ? __bad_area_nosemaphore+0x125/0x1e0
        [<ffffffff81084240>] ? process_timeout+0x0/0x10
        [<ffffffff8104a363>] ? bad_area_nosemaphore+0x13/0x20
        [<ffffffff8104aabf>] ? __do_page_fault+0x31f/0x480
        [<ffffffff81065df0>] ? default_wake_function+0x0/0x20
        [<ffffffffa0752675>] ? free_msg+0x55/0x70 [mlx5_core]
        [<ffffffffa0753434>] ? cmd_exec+0x124/0x840 [mlx5_core]
        [<ffffffff8105a924>] ? find_busiest_group+0x244/0x9f0
        [<ffffffff8152d45e>] ? do_page_fault+0x3e/0xa0
        [<ffffffff8152a815>] ? page_fault+0x25/0x30
        [<ffffffffa024da25>] ? cm_alloc_msg+0x35/0xc0 [ib_cm]
        [<ffffffffa024e821>] ? ib_send_cm_dreq+0xb1/0x1e0 [ib_cm]
        [<ffffffffa024f836>] ? cm_destroy_id+0x176/0x320 [ib_cm]
        [<ffffffffa024fb00>] ? ib_destroy_cm_id+0x10/0x20 [ib_cm]
        [<ffffffffa034f527>] ? ipoib_cm_free_rx_reap_list+0xa7/0x110 [ib_ipoib]
        [<ffffffffa034f590>] ? ipoib_cm_rx_reap+0x0/0x20 [ib_ipoib]
        [<ffffffffa034f5a5>] ? ipoib_cm_rx_reap+0x15/0x20 [ib_ipoib]
        [<ffffffff81094d20>] ? worker_thread+0x170/0x2a0
        [<ffffffff8109b2a0>] ? autoremove_wake_function+0x0/0x40
        [<ffffffff81094bb0>] ? worker_thread+0x0/0x2a0
        [<ffffffff8109aef6>] ? kthread+0x96/0xa0
        [<ffffffff8100c20a>] ? child_rip+0xa/0x20
        [<ffffffff8109ae60>] ? kthread+0x0/0xa0
        [<ffffffff8100c200>] ? child_rip+0x0/0x20
      
      Fixes: a977049d ("[PATCH] IB: Add the kernel CM implementation")
      Signed-off-by: Mark Bloch <markb@mellanox.com>
      Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
      Reviewed-by: Maor Gottlieb <maorg@mellanox.com>
      Signed-off-by: Leon Romanovsky <leon@kernel.org>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
    • IB/uverbs: Fix leak of XRC target QPs · 5b810a24
      Tariq Toukan authored
      The real QP is destroyed when the ref count reaches zero, but
      for XRC target QPs this call was missed, which caused QP leaks.

      Call destroy for all flows.
      
      Fixes: 0e0ec7e0 ('RDMA/core: Export ib_open_qp() to share XRC...')
      Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
      Signed-off-by: Noa Osherovich <noaos@mellanox.com>
      Signed-off-by: Leon Romanovsky <leon@kernel.org>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
    • Merge tag 'xtensa-20161116' of git://github.com/jcmvbkbc/linux-xtensa · 5fd0f1ca
      Linus Torvalds authored
      Pull Xtensa fixes from Max Filippov:
      
       - fix register dumps, stack dumps and stack traces that got torn due to
         recent printk changes
      
       - wire up pkey_{mprotect,alloc,free} syscalls
      
      * tag 'xtensa-20161116' of git://github.com/jcmvbkbc/linux-xtensa:
        xtensa: wire up new pkey_{mprotect,alloc,free} syscalls
        xtensa: clean up printk usage for boot/crash logging
  2. 16 Nov, 2016 10 commits
  3. 15 Nov, 2016 2 commits