- 17 Nov, 2016 28 commits
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdmaLinus Torvalds authored
Pull rmda fixes from Doug Ledford. "First round of -rc fixes. Due to various issues, I've been away and couldn't send a pull request for about three weeks. There were a number of -rc patches that built up in the meantime (some where there already from the early -rc stages). Obviously, there were way too many to send now, so I tried to pare the list down to the more important patches for the -rc cycle. Most of the code has had plenty of soak time at the various vendor's testing setups, so I doubt there will be another -rc pull request this cycle. I also tried to limit the patches to those with smaller footprints, so even though a shortlog is longer than I would like, the actual diffstat is mostly very small with the exception of just three files that had more changes, and a couple files with pure removals. Summary: - Misc Intel hfi1 fixes - Misc Mellanox mlx4, mlx5, and rxe fixes - A couple cxgb4 fixes" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (34 commits) iw_cxgb4: invalidate the mr when posting a read_w_inv wr iw_cxgb4: set *bad_wr for post_send/post_recv errors IB/rxe: Update qp state for user query IB/rxe: Clear queue buffer when modifying QP to reset IB/rxe: Fix handling of erroneous WR IB/rxe: Fix kernel panic in UDP tunnel with GRO and RX checksum IB/mlx4: Fix create CQ error flow IB/mlx4: Check gid_index return value IB/mlx5: Fix NULL pointer dereference on debug print IB/mlx5: Fix fatal error dispatching IB/mlx5: Resolve soft lock on massive reg MRs IB/mlx5: Use cache line size to select CQE stride IB/mlx5: Validate requested RQT size IB/mlx5: Fix memory leak in query device IB/core: Avoid unsigned int overflow in sg_alloc_table IB/core: Add missing check for addr_resolve callback return value IB/core: Set routable RoCE gid type for ipv4/ipv6 networks IB/cm: Mark stale CM id's whenever the mad agent was unregistered IB/uverbs: Fix leak of XRC target QPs IB/hfi1: Remove incorrect IS_ERR check ...
-
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfsLinus Torvalds authored
Pull vfs fixes from Al Viro: "A couple of regression fixes" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: fix iov_iter_advance() for ITER_PIPE xattr: Fix setting security xattrs on sockfs
-
git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linuxLinus Torvalds authored
Pull orangefs fix from Mike Marshall: "orangefs: add .owner to debugfs file_operations Without ".owner = THIS_MODULE" it is possible to crash the kernel by unloading the Orangefs module while someone is reading debugfs files" * tag 'for-linus-4.9-rc5-ofs-1' of git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linux: orangefs: add .owner to debugfs file_operations
-
Aaron Lu authored
Prior to 3.15, there was a race between zap_pte_range() and page_mkclean() where writes to a page could be lost. Dave Hansen discovered by inspection that there is a similar race between move_ptes() and page_mkclean(). We've been able to reproduce the issue by enlarging the race window with a msleep(), but have not been able to hit it without modifying the code. So, we think it's a real issue, but is difficult or impossible to hit in practice. The zap_pte_range() issue is fixed by commit 1cf35d47("mm: split 'tlb_flush_mmu()' into tlb flushing and memory freeing parts"). And this patch is to fix the race between page_mkclean() and mremap(). Here is one possible way to hit the race: suppose a process mmapped a file with READ | WRITE and SHARED, it has two threads and they are bound to 2 different CPUs, e.g. CPU1 and CPU2. mmap returned X, then thread 1 did a write to addr X so that CPU1 now has a writable TLB for addr X on it. Thread 2 starts mremaping from addr X to Y while thread 1 cleaned the page and then did another write to the old addr X again. The 2nd write from thread 1 could succeed but the value will get lost. thread 1 thread 2 (bound to CPU1) (bound to CPU2) 1: write 1 to addr X to get a writeable TLB on this CPU 2: mremap starts 3: move_ptes emptied PTE for addr X and setup new PTE for addr Y and then dropped PTL for X and Y 4: page laundering for N by doing fadvise FADV_DONTNEED. When done, pageframe N is deemed clean. 5: *write 2 to addr X 6: tlb flush for addr X 7: munmap (Y, pagesize) to make the page unmapped 8: fadvise with FADV_DONTNEED again to kick the page off the pagecache 9: pread the page from file to verify the value. If 1 is there, it means we have lost the written 2. *the write may or may not cause segmentation fault, it depends on if the TLB is still on the CPU. Please note that this is only one specific way of how the race could occur, it didn't mean that the race could only occur in exact the above config, e.g. more than 2 threads could be involved and fadvise() could be done in another thread, etc. For anonymous pages, they could race between mremap() and page reclaim: THP: a huge PMD is moved by mremap to a new huge PMD, then the new huge PMD gets unmapped/splitted/pagedout before the flush tlb happened for the old huge PMD in move_page_tables() and we could still write data to it. The normal anonymous page has similar situation. To fix this, check for any dirty PTE in move_ptes()/move_huge_pmd() and if any, did the flush before dropping the PTL. If we did the flush for every move_ptes()/move_huge_pmd() call then we do not need to do the flush in move_pages_tables() for the whole range. But if we didn't, we still need to do the whole range flush. Alternatively, we can track which part of the range is flushed in move_ptes()/move_huge_pmd() and which didn't to avoid flushing the whole range in move_page_tables(). But that would require multiple tlb flushes for the different sub-ranges and should be less efficient than the single whole range flush. KBuild test on my Sandybridge desktop doesn't show any noticeable change. v4.9-rc4: real 5m14.048s user 32m19.800s sys 4m50.320s With this commit: real 5m13.888s user 32m19.330s sys 4m51.200s Reported-by: Dave Hansen <dave.hansen@intel.com> Signed-off-by: Aaron Lu <aaron.lu@intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Abhi Das authored
iov_iter_advance() needs to decrement iter->count by the number of bytes we'd moved beyond. Normal flavours do that, but ITER_PIPE doesn't and ITER_PIPE generic_file_read_iter() for O_DIRECT files ends up with a bogus fallback to page cache read, resulting in incorrect values for file offset and bytes read. Signed-off-by: Abhi Das <adas@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
Andreas Gruenbacher authored
The IOP_XATTR flag is set on sockfs because sockfs supports getting the "system.sockprotoname" xattr. Since commit 6c6ef9f2, this flag is checked for setxattr support as well. This is wrong on sockfs because security xattr support there is supposed to be provided by security_inode_setsecurity. The smack security module relies on socket labels (xattrs). Fix this by adding a security xattr handler on sockfs that returns -EAGAIN, and by checking for -EAGAIN in setxattr. We cannot simply check for -EOPNOTSUPP in setxattr because there are filesystems that neither have direct security xattr support nor support via security_inode_setsecurity. A more proper fix might be to move the call to security_inode_setsecurity into sockfs, but it's not clear to me if that is safe: we would end up calling security_inode_post_setxattr after that as well. Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
git://people.freedesktop.org/~airlied/linuxLinus Torvalds authored
Pull drm fixes fr9om Dave Airlie: "Fixes for amdgpu, and a bunch of arm drivers. There seems to be an uptick in the ARM drivers sending things for fixes which is good, so I've decided to dequeue a bit early, more stuff may arrive before the weekend. This contains mediatek, arcpgu, sunxi, fsl-dcu display controller fixes along with 3 amdgpu fixes, one for a fencing issue with secondary GPUs" * tag 'drm-fixes-for-v4.9-rc6' of git://people.freedesktop.org/~airlied/linux: drm/amdgpu:fix vpost_needed routine drm/amdgpu/powerplay: drop a redundant NULL check drm/amdgpu: Attach exclusive fence to prime exported bo's. (v5) drm/arcpgu: Accommodate adv7511 switch to DRM bridge drm/fsl-dcu: disable planes before disabling CRTC drm/fsl-dcu: update all registers on flush drm/fsl-dcu: do not update when modifying irq registers drm/sun4i: Propagate error to the caller drm/sun4i: Fix error handling drm/mediatek: modify the factor to make the pll_rate set in the 1G-2G range drm/mediatek: enhance the HDMI driving current drm/mediatek: do mtk_hdmi_send_infoframe after HDMI clock enable drm/mediatek: clear IRQ status before enable OVL interrupt drm/mediatek: set vblank_disable_allowed to true drm/mediatek: fix a typo of OD_CFG to OD_RELAYMODE drm/sun4i: rgb: Remove the bridge enable/disable functions drm/sun4i: rgb: Enable panel after controller
-
Steve Wise authored
Also, rearrange things a bit to have a common c4iw_invalidate_mr() function used everywhere that we need to invalidate. Fixes: 49b53a93 ("iw_cxgb4: add fast-path for small REG_MR operations") Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Steve Wise authored
There are a few cases in c4iw_post_send() and c4iw_post_receive() where *bad_wr is not set when an error is returned. This can cause a crash if the application tries to use bad_wr. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Doug Ledford authored
-
Yonatan Cohen authored
The method rxe_qp_error() transitions QP to error state and make sure the QP is drained. It did not though update the QP state for user's query. This patch fixes this. Fixes: 8700e3e7 ("Soft RoCE driver") Signed-off-by: Yonatan Cohen <yonatanc@mellanox.com> Reviewed-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Yonatan Cohen authored
RXE resets the send-q only once in rxe_qp_init_req() when QP is created, but when the QP is reused after QP reset, the send-q holds previous garbage data. This garbage data wrongly fails CQEs that otherwise should have completed successfully. Fixes: 8700e3e7 ("Soft RoCE driver") Signed-off-by: Yonatan Cohen <yonatanc@mellanox.com> Reviewed-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Yonatan Cohen authored
To correctly handle a erroneous WR this fix does the following 1. Make sure the bad WQE causes a user completion event. 2. Call rxe_completer to handle the erred WQE. Before the fix, when rxe_requester found a bad WQE, it changed its status to IB_WC_LOC_PROT_ERR and exit with 0 for non RC QPs. If this was the 1st WQE then there would be no ACK to invoke the completer and this bad WQE would be stuck in the QP's send-q. On top of that the requester exiting with 0 caused rxe_do_task to endlessly invoke rxe_requester, resulting in a soft-lockup attached below. In case the WQE was not the 1st and rxe_completer did get a chance to handle the bad WQE, it did not cause a complete event since the WQE's IB_SEND_SIGNALED flag was not set. Setting WQE status to IB_SEND_SIGNALED is subject to IBA spec version 1.2.1, section 10.7.3.1 Signaled Completions. NMI watchdog: BUG: soft lockup - CPU#7 stuck for 22s! [<ffffffffa0590145>] ? rxe_pool_get_index+0x35/0xb0 [rdma_rxe] [<ffffffffa05952ec>] lookup_mem+0x3c/0xc0 [rdma_rxe] [<ffffffffa0595534>] copy_data+0x1c4/0x230 [rdma_rxe] [<ffffffffa058c180>] rxe_requester+0x9d0/0x1100 [rdma_rxe] [<ffffffff8158e98a>] ? kfree_skbmem+0x5a/0x60 [<ffffffffa05962c9>] rxe_do_task+0x89/0xf0 [rdma_rxe] [<ffffffffa05963e2>] rxe_run_task+0x12/0x30 [rdma_rxe] [<ffffffffa059110a>] rxe_post_send+0x41a/0x550 [rdma_rxe] [<ffffffff811ef922>] ? __kmalloc+0x182/0x200 [<ffffffff816ba512>] ? down_read+0x12/0x40 [<ffffffffa054bd32>] ib_uverbs_post_send+0x532/0x540 [ib_uverbs] [<ffffffff815f8722>] ? tcp_sendmsg+0x402/0xb80 [<ffffffffa05453dc>] ib_uverbs_write+0x18c/0x3f0 [ib_uverbs] [<ffffffff81623c2e>] ? inet_recvmsg+0x7e/0xb0 [<ffffffff8158764d>] ? sock_recvmsg+0x3d/0x50 [<ffffffff81215b87>] __vfs_write+0x37/0x140 [<ffffffff81216892>] vfs_write+0xb2/0x1b0 [<ffffffff81217ce5>] SyS_write+0x55/0xc0 [<ffffffff816bc672>] entry_SYSCALL_64_fastpath+0x1a/0xa Fixes: 8700e3e7 ("Soft RoCE driver") Signed-off-by: Yonatan Cohen <yonatanc@mellanox.com> Reviewed-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Yonatan Cohen authored
Missing initialization of udp_tunnel_sock_cfg causes to following kernel panic, while kernel tries to execute gro_receive(). While being there, we converted udp_port_cfg to use the same initialization scheme as udp_tunnel_sock_cfg. ------------[ cut here ]------------ kernel tried to execute NX-protected page - exploit attempt? (uid: 0) BUG: unable to handle kernel paging request at ffffffffa0588c50 IP: [<ffffffffa0588c50>] __this_module+0x50/0xffffffffffff8400 [ib_rxe] PGD 1c09067 PUD 1c0a063 PMD bb394067 PTE 80000000ad5e8163 Oops: 0011 [#1] SMP Modules linked in: ib_rxe ip6_udp_tunnel udp_tunnel CPU: 5 PID: 0 Comm: swapper/5 Not tainted 4.7.0-rc3+ #2 Hardware name: Red Hat KVM, BIOS Bochs 01/01/2011 task: ffff880235e4e680 ti: ffff880235e68000 task.ti: ffff880235e68000 RIP: 0010:[<ffffffffa0588c50>] [<ffffffffa0588c50>] __this_module+0x50/0xffffffffffff8400 [ib_rxe] RSP: 0018:ffff880237343c80 EFLAGS: 00010282 RAX: 00000000dffe482d RBX: ffff8800ae330900 RCX: 000000002001b712 RDX: ffff8800ae330900 RSI: ffff8800ae102578 RDI: ffff880235589c00 RBP: ffff880237343cb0 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: ffff8800ae33e262 R13: ffff880235589c00 R14: 0000000000000014 R15: ffff8800ae102578 FS: 0000000000000000(0000) GS:ffff880237340000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffffffa0588c50 CR3: 0000000001c06000 CR4: 00000000000006e0 Stack: ffffffff8160860e ffff8800ae330900 ffff8800ae102578 0000000000000014 000000000000004e ffff8800ae102578 ffff880237343ce0 ffffffff816088fb 0000000000000000 ffff8800ae330900 0000000000000000 00000000ffad0000 Call Trace: <IRQ> [<ffffffff8160860e>] ? udp_gro_receive+0xde/0x130 [<ffffffff816088fb>] udp4_gro_receive+0x10b/0x2d0 [<ffffffff81611373>] inet_gro_receive+0x1d3/0x270 [<ffffffff81594e29>] dev_gro_receive+0x269/0x3b0 [<ffffffff81595188>] napi_gro_receive+0x38/0x120 [<ffffffffa011caee>] mlx5e_handle_rx_cqe+0x27e/0x340 [mlx5_core] [<ffffffffa011d076>] mlx5e_poll_rx_cq+0x66/0x6d0 [mlx5_core] [<ffffffffa011d7ae>] mlx5e_napi_poll+0x8e/0x400 [mlx5_core] [<ffffffff815949a0>] net_rx_action+0x160/0x380 [<ffffffff816a9197>] __do_softirq+0xd7/0x2c5 [<ffffffff81085c35>] irq_exit+0xf5/0x100 [<ffffffff816a8f16>] do_IRQ+0x56/0xd0 [<ffffffff816a6dcc>] common_interrupt+0x8c/0x8c <EOI> [<ffffffff81061f96>] ? native_safe_halt+0x6/0x10 [<ffffffff81037ade>] default_idle+0x1e/0xd0 [<ffffffff8103828f>] arch_cpu_idle+0xf/0x20 [<ffffffff810c37dc>] default_idle_call+0x3c/0x50 [<ffffffff810c3b13>] cpu_startup_entry+0x323/0x3c0 [<ffffffff81050d8c>] start_secondary+0x15c/0x1a0 RIP [<ffffffffa0588c50>] __this_module+0x50/0xffffffffffff8400 [ib_rxe] RSP <ffff880237343c80> CR2: ffffffffa0588c50 ---[ end trace 489ee31fa7614ac5 ]--- Kernel panic - not syncing: Fatal exception in interrupt Kernel Offset: disabled ---[ end Kernel panic - not syncing: Fatal exception in interrupt ------------[ cut here ]------------ Fixes: 8700e3e7 ("Soft RoCE driver") Signed-off-by: Yonatan Cohen <yonatanc@mellanox.com> Reviewed-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Matan Barak authored
Currently, if ib_copy_to_udata fails, the CQ won't be deleted from the radix tree and the HW (HW2SW). Fixes: 225c7b1f ('IB/mlx4: Add a driver Mellanox ConnectX InfiniBand adapters') Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Daniel Jurgens <danielj@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Daniel Jurgens authored
Check the returned GID index value and return an error if it is invalid. Fixes: 5070cd22 ('IB/mlx4: Replace mechanism for RoCE GID management') Signed-off-by: Daniel Jurgens <danielj@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Eli Cohen authored
For XRC QP CQs may not exist. Check before attempting dereference. Fixes: e126ba97 ('mlx5: Add driver for Mellanox Connect-IB adapters') Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Eli Cohen authored
When an internal error condition is detected, make sure to set the device inactive after dispatching the event so ULPs can get a notification of this event. Fixes: e126ba97 ('mlx5: Add driver for Mellanox Connect-IB adapters') Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Reviewed-by: Mohamad Haj Yahia <mohamad@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Moshe Lazer authored
When calling reg_mr of large MRs (e.g. 4GB) from multiple processes and MR caches can't supply the required amount of MRs the slow-path of MR allocation may be used. In this case we need to serialize the slow-path between the processes to avoid soft lock. Fixes: e126ba97 ('mlx5: Add driver for Mellanox Connect-IB adapters') Signed-off-by: Moshe Lazer <moshel@mellanox.com> Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Reviewed-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Daniel Jurgens authored
When creating kernel CQs use 128B CQE stride if the cache line size is 128B, 64B otherwise. This prevents multiple CQEs from residing in a 128B cache line, which can cause retries when there are concurrent read and writes in one cache line. Tested with IPoIB on PPC64, saw ~5% throughput improvement. Fixes: e126ba97 ('mlx5: Add driver for Mellanox Connect-IB adapters') Signed-off-by: Daniel Jurgens <danielj@mellanox.com> Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Maor Gottlieb authored
Validate that the requested size of RQT is supported by firmware. Fixes: c5f90929 ('IB/mlx5: Add Receive Work Queue Indirection table operations') Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Majd Dibbiny authored
We need to free dev->port when we fail to enable RoCE or initialize node data. Fixes: 0837e86a ('IB/mlx5: Add per port counters') Signed-off-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Mark Bloch authored
sg_alloc_table gets unsigned int as parameter while the driver returns it as size_t. Check npages isn't greater than maximum unsigned int. Fixes: eeb8461e ("IB: Refactor umem to use linear SG table") Signed-off-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Mark Bloch authored
When calling rdma_resolve_ip inside rdma_addr_find_l2_eth_by_grh, the return status of the request was ignored in the callback function causing a successful return and an empty dmac. Signed-off-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Alex Vesker <valex@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Leon Romanovsky authored
On Thu, Oct 27, 2016 at 04:36:28PM +0300, Leon Romanovsky wrote: > From: Mark Bloch <markb@mellanox.com> > > If the underlying netowrk type is ipv4 or ipv6 and the device supports > routable RoCE, prefer it so the traffic could cross subnets. > > Signed-off-by: Mark Bloch <markb@mellanox.com> > Signed-off-by: Maor Gottlieb <maorg@mellanox.com> > Signed-off-by: Leon Romanovsky <leon@kernel.org> > --- Hi Doug, Please take the following v1 of this patch where I fixed spelling error from "netowrk" to be "network". Thanks. >From 09f96ba3e9b4442cfb44dca04c6726e55525c9c3 Mon Sep 17 00:00:00 2001 From: Mark Bloch <markb@mellanox.com> Date: Sun, 11 Sep 2016 06:25:10 +0000 Subject: [PATCH rdma-rc v1 3/6] IB/core: Set routable RoCE gid type for ipv4/ipv6 networks If the underlying network type is ipv4 or ipv6 and the device supports routable RoCE, prefer it so the traffic could cross subnets. Signed-off-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Mark Bloch authored
When there is a CM id object that has port assigned to it, it means that the cm-id asked for the specific port that it should go by it, but if that port was removed (hot-unplug event) the cm-id was not updated. In order to fix that the port keeps a list of all the cm-id's that are planning to go by it, whenever the port is removed it marks all of them as invalid. This commit fixes a kernel panic which happens when running traffic between guests and we force reboot a guest mid traffic, it triggers a kernel panic: Call Trace: [<ffffffff815271fa>] ? panic+0xa7/0x16f [<ffffffff8152b534>] ? oops_end+0xe4/0x100 [<ffffffff8104a00b>] ? no_context+0xfb/0x260 [<ffffffff81084db2>] ? del_timer_sync+0x22/0x30 [<ffffffff8104a295>] ? __bad_area_nosemaphore+0x125/0x1e0 [<ffffffff81084240>] ? process_timeout+0x0/0x10 [<ffffffff8104a363>] ? bad_area_nosemaphore+0x13/0x20 [<ffffffff8104aabf>] ? __do_page_fault+0x31f/0x480 [<ffffffff81065df0>] ? default_wake_function+0x0/0x20 [<ffffffffa0752675>] ? free_msg+0x55/0x70 [mlx5_core] [<ffffffffa0753434>] ? cmd_exec+0x124/0x840 [mlx5_core] [<ffffffff8105a924>] ? find_busiest_group+0x244/0x9f0 [<ffffffff8152d45e>] ? do_page_fault+0x3e/0xa0 [<ffffffff8152a815>] ? page_fault+0x25/0x30 [<ffffffffa024da25>] ? cm_alloc_msg+0x35/0xc0 [ib_cm] [<ffffffffa024e821>] ? ib_send_cm_dreq+0xb1/0x1e0 [ib_cm] [<ffffffffa024f836>] ? cm_destroy_id+0x176/0x320 [ib_cm] [<ffffffffa024fb00>] ? ib_destroy_cm_id+0x10/0x20 [ib_cm] [<ffffffffa034f527>] ? ipoib_cm_free_rx_reap_list+0xa7/0x110 [ib_ipoib] [<ffffffffa034f590>] ? ipoib_cm_rx_reap+0x0/0x20 [ib_ipoib] [<ffffffffa034f5a5>] ? ipoib_cm_rx_reap+0x15/0x20 [ib_ipoib] [<ffffffff81094d20>] ? worker_thread+0x170/0x2a0 [<ffffffff8109b2a0>] ? autoremove_wake_function+0x0/0x40 [<ffffffff81094bb0>] ? worker_thread+0x0/0x2a0 [<ffffffff8109aef6>] ? kthread+0x96/0xa0 [<ffffffff8100c20a>] ? child_rip+0xa/0x20 [<ffffffff8109ae60>] ? kthread+0x0/0xa0 [<ffffffff8100c200>] ? child_rip+0x0/0x20 Fixes: a977049d ("[PATCH] IB: Add the kernel CM implementation") Signed-off-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Erez Shitrit <erezsh@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
Tariq Toukan authored
The real QP is destroyed in case of the ref count reaches zero, but for XRC target QPs this call was missed and caused to QP leaks. Let's call to destroy for all flows. Fixes: 0e0ec7e0 ('RDMA/core: Export ib_open_qp() to share XRC...') Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Noa Osherovich <noaos@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
-
git://github.com/jcmvbkbc/linux-xtensaLinus Torvalds authored
Pull Xtensa fixes from Max Filippov: - fix register dumps, stack dumps and stack traces that got torn due to recent printk changes - wire up pkey_{mprotect,alloc,free} syscalls * tag 'xtensa-20161116' of git://github.com/jcmvbkbc/linux-xtensa: xtensa: wire up new pkey_{mprotect,alloc,free} syscalls xtensa: clean up printk usage for boot/crash logging
-
- 16 Nov, 2016 10 commits
-
-
git://people.freedesktop.org/~agd5f/linuxDave Airlie authored
Just a few bug fixes for 4.9. The big one is Mario's prime fencing fix. * 'drm-fixes-4.9' of git://people.freedesktop.org/~agd5f/linux: drm/amdgpu:fix vpost_needed routine drm/amdgpu/powerplay: drop a redundant NULL check drm/amdgpu: Attach exclusive fence to prime exported bo's. (v5)
-
Dave Airlie authored
Merge branch 'mediatek-drm-fixes-2016-11-11' of https://github.com/ckhu-mediatek/linux.git-tags into drm-fixes This branch include one patch to fix a typo, two patches to disable vblank interrupt, and three patches to support HDMI 4K resolution. * 'mediatek-drm-fixes-2016-11-11' of https://github.com/ckhu-mediatek/linux.git-tags: drm/mediatek: modify the factor to make the pll_rate set in the 1G-2G range drm/mediatek: enhance the HDMI driving current drm/mediatek: do mtk_hdmi_send_infoframe after HDMI clock enable drm/mediatek: clear IRQ status before enable OVL interrupt drm/mediatek: set vblank_disable_allowed to true drm/mediatek: fix a typo of OD_CFG to OD_RELAYMODE
-
git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuseLinus Torvalds authored
Pull fuse fixes from Miklos Szeredi: "A regression fix and bug fix bound for stable" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse: fuse: fix fuse_write_end() if zero bytes were copied fuse: fix root dentry initialization
-
git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfdLinus Torvalds authored
Pull MFD fixes from Lee Jones: - Fix PCI properties in intel-lpss-pci - Fix Resetting issue during suspend in intel-lpss-pci - Seperate IRQs for USBC device and CHRG in intel_soc_pmic_bxtwc - Add timeout to fix Resetting issue in stmpe - Ensure we 'put' reference to device when done in mfd-core * tag 'mfd-fixes-4.9' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd: mfd: core: Fix device reference leak in mfd_clone_cell mfd: stmpe: Fix RESET regression on STMPE2401 mfd: intel_soc_pmic_bxtwc: Fix usbc interrupt mfd: intel-lpss: Do not put device in reset state on suspend mfd: lpss: Fix Intel Kaby Lake PCH-H properties
-
Mike Marshall authored
Without ".owner = THIS_MODULE" it is possible to crash the kernel by unloading the Orangefs module while someone is reading debugfs files. Signed-off-by: Mike Marshall <hubcap@omnibond.com>
-
Johan Hovold authored
Make sure to drop the reference taken by bus_find_device_by_name() before returning from mfd_clone_cell(). Fixes: a9bbba99 ("mfd: add platform_device sharing support for mfd") Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: Lee Jones <lee.jones@linaro.org>
-
Linus Walleij authored
Since commit c4dd1ba3 ("mfd: stmpe: Add reset support for all STMPE variant") we're resetting the STMPE expanders before use. This caused a regression on the STMP2401 on the Nomadik NHK8815: stmpe-i2c 0-0043: stmpe2401 detected, chip id: 0x101 nmk-i2c 101f8000.i2c0: write to slave 0x43 timed out nmk-i2c 101f8000.i2c0: no ack received after address transmission stmpe-i2c 0-0044: stmpe2401 detected, chip id: 0x101 nmk-i2c 101f8000.i2c0: write to slave 0x44 timed out nmk-i2c 101f8000.i2c0: no ack received after address transmission It turns out that we start to poll for the reset bit to go low again too quickly: the STMPE2401 is not yet online and ready to be asked for the status of the RESET bit. By introducing a 10ms delay before starting to hammer the register for information, we get back to normal: stmpe-i2c 0-0043: stmpe2401 detected, chip id: 0x101 stmpe-i2c 0-0044: stmpe2401 detected, chip id: 0x101 Cc: stable@vger.kernel.org Cc: Amelie Delaunay <amelie.delaunay@st.com> Fixes: c4dd1ba3 ("mfd: stmpe: Add reset support for all STMPE variant") Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Acked-by: Patrice Chotard <patrice.chotard@st.com> Signed-off-by: Lee Jones <lee.jones@linaro.org>
-
Heikki Krogerus authored
The wcove USB Type-C driver is currently being flooded with interrupts that are not targeted to it. The reason for that is because all CHRG first level interrupts are mapped to it. This fixes the issue by introducing separate irq for the usbc device, and mapping only USB Type-C PHY interrupts to it. Fixes: 9c6235c8 ("mfd: intel_soc_pmic_bxtwc: Add bxt_wcove_usbc device") Signed-off-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Signed-off-by: Lee Jones <lee.jones@linaro.org>
-
Azhar Shaikh authored
Commit 41a3da2b ("mfd: intel-lpss: Save register context on suspend") saved the register context while going to suspend and also put the device in reset state. Due to the resetting of device, system cannot enter S3/S0ix states when no_console_suspend flag is enabled. The system and serial console both hang. The resetting of device is not needed while going to suspend. Hence remove this code. Cc: stable@vger.kernel.org Fixes: 41a3da2b ("mfd: intel-lpss: Save register context on suspend") Signed-off-by: Azhar Shaikh <azhar.shaikh@intel.com> Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Lee Jones <lee.jones@linaro.org>
-
Jarkko Nikula authored
There are a few issues on Intel Kaby Lake PCH-H properties added by commit a6a576b78e09 ("mfd: lpss: Add Intel Kaby Lake PCH-H PCI IDs"): - Input clock of I2C controller on Intel Kaby Lake PCH-H is 120 MHz not 133 MHz. This was probably copy-paste error from Intel Broxton I2C properties. - There is no default I2C SDA hold time specified which is used when ACPI doesn't provide it. I got information from Windows driver team that Kaby Lake PCH-H can use the same configuration than Intel Sunrisepoint PCH. - Common HS-UART properties are not used. Fix these by reusing the Sunrisepoint properties on Kaby Lake PCH-H. Fixes: a6a576b78e09 ("mfd: lpss: Add Intel Kaby Lake PCH-H PCI IDs") Reported-by: Xiang A Wang <xiang.a.wang@intel.com> Signed-off-by: Jarkko Nikula <jarkko.nikula@linux.intel.com> Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Lee Jones <lee.jones@linaro.org>
-
- 15 Nov, 2016 2 commits
-
-
Dave Airlie authored
Merge tag 'sunxi-drm-fixes-for-4.9' of https://git.kernel.org/pub/scm/linux/kernel/git/mripard/linux into drm-fixes sun4i-drm fixes for 4.9 A few patches to fix our error handling and our panel / bridge calls. * tag 'sunxi-drm-fixes-for-4.9' of https://git.kernel.org/pub/scm/linux/kernel/git/mripard/linux: drm/sun4i: Propagate error to the caller drm/sun4i: Fix error handling drm/sun4i: rgb: Remove the bridge enable/disable functions drm/sun4i: rgb: Enable panel after controller
-
Dennis Dalessandro authored
Remove IS_ERR check from caching code as the function being called does not actually return error pointers. Fixes: f19bd643: "IB/hfi1: Prevent NULL pointer deferences in caching code" Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
-