- 07 Dec, 2016 28 commits
-
-
Grygorii Strashko authored
The current implementation CPTS initialization and deinitialization (represented by cpts_register/unregister()) does too many static initialization from .ndo_open(), which is reasonable to do once at probe time instead, and also require caller to allocate memory for struct cpts, which is internal for CPTS driver in general. This patch splits CPTS initialization and deinitialization on two parts: - static initializtion cpts_create()/cpts_release() which expected to be executed when parent driver is probed/removed; - dynamic part cpts_register/unregister() which expected to be executed when network device is opened/closed. As result, current code of CPTS parent driver - CPSW - will be simplified (and it also will allow simplify adding support for Keystone 2 devices in the future), plus more initialization errors will be catched earlier. In addition, this change allows to clean up cpts.h for the case when CPTS is disabled. Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Grygorii Strashko authored
CPTS module and IRQs are always enabled when CPTS is registered, before starting overflow check work, and disabled during deregistration, when overflow check work has been canceled already. So, It doesn't require to (re)enable CPTS module and IRQs in cpts_overflow_check(). Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Acked-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
WingMan Kwok authored
When a CPTS user does not exit gracefully by disabling cpts timestamping and leaving a joined multicast group, the system continues to receive and timestamps the ptp packets which eventually occupy all the event list entries. When this happns, the added code tries to remove some list entries which are expired. Signed-off-by: WingMan Kwok <w-kwok2@ti.com> Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Grygorii Strashko authored
The cpts now is left enabled after unregistration. Hence, disable it in cpts_unregister(). Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Acked-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Grygorii Strashko authored
The ptp clock registered before spinlock, which is protecting it, and before timecounter and cyclecounter initialization in cpts_register(). So, ensure that ptp clock is registered the last, after everything else is done. Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Acked-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Grygorii Strashko authored
There are two issues with TI CPTS code which are reproducible when TI CPSW ethX device passes few up/down iterations: - cpts refclk prepare counter continuously incremented after each up/down iteration; - devm_clk_get(dev, "cpts") is called many times. Hence, fix these issues by using clk_disable_unprepare() in cpts_clk_release() and skipping devm_clk_get() if cpts refclk has been acquired already. Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Acked-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Grygorii Strashko authored
This will provide more flexibility in changing CPTS internals and also required for further changes. Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Grygorii Strashko authored
TI CPTS IP is used as part of TI OMAP CPSW driver, but it's also present as part of NETCP on TI Keystone 2 SoCs. So, It's required to enable build of CPTS for both this drivers and this can be achieved by allowing CPTS to be built separately. Hence, allow cpts to be built separately and convert it to be a module as both CPSW and NETCP drives can be built as modules. Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Grygorii Strashko authored
Switch to readl/writel_relaxed() APIs, because this is recommended API and the CPTS IP is reused on Keystone 2 SoCs where LE/BE modes are supported. Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Acked-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Michael Chan says: ==================== bnxt_en: Add interface to support RDMA driver. This series adds an interface to support a brand new RDMA driver bnxt_re. The first step is to re-arrange some code so that pci_enable_msix() can be called during pci probe. The purpose is to allow the RDMA driver to initialize and stay initialized whether the netdev is up or down. Then we make some changes to VF resource allocation so that there is enough resources to support RDMA. Finally the last patch adds a simple interface to allow the RDMA driver to probe and register itself with any bnxt_en devices that support RDMA. Once registered, the RDMA driver can request MSIX, send fw messages, and receive some notifications. v2: Fixed kbuild test robot warnings. David, please consider this series for net-next. Thanks. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Michael Chan authored
Since the network driver and RDMA driver operate on the same PCI function, we need to create an interface to allow the RDMA driver to share resources with the network driver. 1. Create a new bnxt_en_dev struct which will be returned by bnxt_ulp_probe() upon success. After that, all calls from the RDMA driver to bnxt_en will pass a pointer to this struct. 2. This struct contains additional function pointers to register, request msix, send fw messages, register for async events. 3. If the RDMA driver wants to enable RDMA on the function, it needs to call the function pointer bnxt_register_device(). A ulp_ops structure is passed for RCU protected upcalls from bnxt_en to the RDMA driver. 4. The RDMA driver can call firmware APIs using the bnxt_send_fw_msg() function pointer. 5. 1 stats context is reserved when the RDMA driver registers. MSIX and completion rings are reserved when the RDMA driver calls bnxt_request_msix() function pointer. 6. When the RDMA driver calls bnxt_unregister_device(), all RDMA resources will be cleaned up. v2: Fixed 2 uninitialized variable warnings. Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Michael Chan authored
The driver register function with firmware consists of passing version information and registering for async events. To support the RDMA driver, the async events that we need to register may change. Separate the driver register function into 2 parts so that we can just update the async events for the RDMA driver. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Michael Chan authored
If the device supports RDMA, we'll setup network default rings so that there are enough minimum resources for RDMA, if possible. However, the user can still increase network rings to the max if he wants. The actual RDMA resources won't be reserved until the RDMA driver registers. v2: Fix compile warning when BNXT_CONFIG_SRIOV is not set. Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Michael Chan authored
All available remaining completion rings not used by the PF should be made available for the VFs so that there are enough rings in the VF to support RDMA. The earlier workaround code of capping the rings by the statistics context is removed. When SRIOV is disabled, call a new function bnxt_restore_pf_fw_resources() to restore FW resources. Later on we need to add some logic to account for RDMA resources. Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Michael Chan authored
Now that MSIX is enabled in bnxt_init_one(), resources may be allocated by the RDMA driver before the network device is opened. So we cannot do function reset in bnxt_open() which will clear all the resources. The proper place to do function reset now is in bnxt_init_one(). If we get AER, we'll do function reset as well. Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Michael Chan authored
To better support the new RDMA driver, we need to move pci_enable_msix() from bnxt_open() to bnxt_init_one(). This way, MSIX vectors are available to the RDMA driver whether the network device is up or down. Part of the existing bnxt_setup_int_mode() function is now refactored into a new bnxt_init_int_mode(). bnxt_init_int_mode() is called during bnxt_init_one() to enable MSIX. The remaining logic in bnxt_setup_int_mode() to map the IRQs to the completion rings is called during bnxt_open(). v2: Fixed compile warning when CONFIG_BNXT_SRIOV is not set. Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Michael Chan authored
By refactoring existing code into this new function. The new function will be used in subsequent patches. v2: Fixed compile warning when CONFIG_BNXT_SRIOV is not set. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eric Dumazet authored
Paolo noticed a cache line miss in UDP recvmsg() to access sk_rxhash, sharing a cache line with sk_drops. sk_drops might be heavily incremented by cpus handling a flood targeting this socket. We might place sk_drops on a separate cache line, but lets try to avoid wasting 64 bytes per socket just for this, since we have other bottlenecks to take care of. sock_rps_record_flow() should only access sk_rxhash for connected flows. Testing sk_state for TCP_ESTABLISHED covers most of the cases for connected sockets, for a zero cost, since system calls using sock_rps_record_flow() also access sk->sk_prot which is on the same cache line. A follow up patch will provide a static_key (Jump Label) since most hosts do not even use RFS. Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Paolo Abeni <pabeni@redhat.com> Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Andrey Konovalov authored
This patch changes tun.c to call netif_receive_skb instead of netif_rx when a packet is received (if CONFIG_4KSTACKS is not enabled to avoid stack exhaustion). The difference between the two is that netif_rx queues the packet into the backlog, and netif_receive_skb proccesses the packet in the current context. This patch is required for syzkaller [1] to collect coverage from packet receive paths, when a packet being received through tun (syzkaller collects coverage per process in the process context). As mentioned by Eric this change also speeds up tun/tap. As measured by Peter it speeds up his closed-loop single-stream tap/OVS benchmark by about 23%, from 700k packets/second to 867k packets/second. A similar patch was introduced back in 2010 [2, 3], but the author found out that the patch doesn't help with the task he had in mind (for cgroups to shape network traffic based on the original process) and decided not to go further with it. The main concern back then was about possible stack exhaustion with 4K stacks. [1] https://github.com/google/syzkaller [2] https://www.spinics.net/lists/netdev/thrd440.html#130570 [3] https://www.spinics.net/lists/netdev/msg130570.htmlSigned-off-by: Andrey Konovalov <andreyknvl@google.com> Acked-by: Jason Wang <jasowang@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Joe Perches says: ==================== irda: w83977af_ir: Neatening Originally on top of Arnd's overly long udelay patches because I noticed a misindented block. That's now already fixed along with some other whitespace problems. These patches are the remainder style issues from my original series. Even though I haven't turned on the netwinder in a box in the garage in who knows how long, if this device is still used somewhere, might as well neaten the code too. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Joe Perches authored
Use more common logging style, standardize function output logging use. Miscellanea: o Add and use pr_fmt o Convert printks to pr_<level> Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Joe Perches authored
Neaten function declaration and definition arguments. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Joe Perches authored
Add braces where appropriate and remove an unnecessary else. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Joe Perches authored
Convert pointer comparisons to NULL. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Joe Perches authored
Use a more typical vertical spacing style. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Joe Perches authored
Add spaces around operators. git diff -w shows no differences. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Joe Perches authored
Remove leading and trailing whitespace. git diff -w shows no differences. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
-
- 06 Dec, 2016 12 commits
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparcLinus Torvalds authored
Pull sparc fix from David Miller: "A use-before-NULL-check from Dan Carpenter" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc: dbri: move dereference after check for NULL
-
Dan Carpenter authored
We accidentally introduced a dereference before the NULL check in xmit_descs() as part of silencing a GCC warning. Fixes: 16f46050 ("dbri: Fix compiler warning") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds authored
Pull networking fixes from David Miller: 1) When dcbnl_cee_fill() fails to be able to push a new netlink attribute, it return 0 instead of an error code. From Pan Bian. 2) Two suffix handling fixes to FIB trie code, from Alexander Duyck. 3) bnxt_hwrm_stat_ctx_alloc() goes through all the trouble of setting and maintaining a return code 'rc' but fails to actually return it. Also from Pan Bian. 4) ping socket ICMP handler needs to validate ICMP header length, from Kees Cook. 5) caif_sktinit_module() has this interesting logic: int err = sock_register(...); if (!err) return err; return 0; Just return sock_register()'s return value directly which is the only possible correct thing to do. 6) Two bnx2x driver fixes from Yuval Mintz, return a reasonable estimate from get_ringparam() ethtool op when interface is down and avoid trying to use UDP port based tunneling on 577xx chips. 7) Fix ep93xx_eth crash on module unload from Florian Fainelli. 8) Missing uapi exports, from Stephen Hemminger. 9) Don't schedule work from sk_destruct(), because the socket will be freed upon return from that function. From Herbert Xu. 10) Buggy drivers, of which we know there is at least one, can send a huge packet into the TCP stack but forget to set the gso_size in the SKB, which causes all kinds of problems. Correct this when it happens, and emit a one-time warning with the device name included so that it can be diagnosed more easily. From Marcelo Ricardo Leitner. 11) virtio-net does DMA off the stack causes hiccups with VMAP_STACK, fix from Andy Lutomirski. 12) Fix fec driver compilation with CONFIG_M5272, from Nikita Yushchenko. 13) mlx5 fixes from Kamal Heib, Saeed Mahameed, and Mohamad Haj Yahia. (erroneously flushing queues on error, module parameter validation, etc) * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (34 commits) net/mlx5e: Change the SQ/RQ operational state to positive logic net/mlx5e: Don't flush SQ on error net/mlx5e: Don't notify HW when filling the edge of ICO SQ net/mlx5: Fix query ISSI flow net/mlx5: Remove duplicate pci dev name print net/mlx5: Verify module parameters net: fec: fix compile with CONFIG_M5272 be2net: Add DEVSEC privilege to SET_HSW_CONFIG command. virtio-net: Fix DMA-from-the-stack in virtnet_set_mac_address() tcp: warn on bogus MSS and try to amend it uapi glibc compat: fix outer guard of net device flags enum net: stmmac: clear reset value of snps, wr_osr_lmt/snps, rd_osr_lmt before writing netlink: Do not schedule work from sk_destruct uapi: export nf_log.h uapi: export tc_skbmod.h net: ep93xx_eth: Do not crash unloading module bnx2x: Prevent tunnel config for 577xx bnx2x: Correct ringparam estimate when DOWN isdn: hisax: set error code on failure net: bnx2x: fix improper return value ...
-
Linus Torvalds authored
The shmem hole punching with fallocate(FALLOC_FL_PUNCH_HOLE) does not want to race with generating new pages by faulting them in. However, the wait-queue used to delay the page faulting has a serious problem: the wait queue head (in shmem_fallocate()) is allocated on the stack, and the code expects that "wake_up_all()" will make sure that all the queue entries are gone before the stack frame is de-allocated. And that is not at all necessarily the case. Yes, a normal wake-up sequence will remove the wait-queue entry that caused the wakeup (see "autoremove_wake_function()"), but the key wording there is "that caused the wakeup". When there are multiple possible wakeup sources, the wait queue entry may well stay around. And _particularly_ in a page fault path, we may be faulting in new pages from user space while we also have other things going on, and there may well be other pending wakeups. So despite the "wake_up_all()", it's not at all guaranteed that all list entries are removed from the wait queue head on the stack. Fix this by introducing a new wakeup function that removes the list entry unconditionally, even if the target process had already woken up for other reasons. Use that "synchronous" function to set up the waiters in shmem_fault(). This problem has never been seen in the wild afaik, but Dave Jones has reported it on and off while running trinity. We thought we fixed the stack corruption with the blk-mq rq_list locking fix (commit 7fe31130: "blk-mq: update hardware and software queues for sleeping alloc"), but it turns out there was _another_ stack corruptor hiding in the trinity runs. Vegard Nossum (also running trinity) was able to trigger this one fairly consistently, and made us look once again at the shmem code due to the faults often being in that area. Reported-and-tested-by: Vegard Nossum <vegard.nossum@oracle.com>. Reported-by: Dave Jones <davej@codemonkey.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
David S. Miller authored
Saeed Mahameed says: ==================== Mellanox 100G mlx5 fixes 2016-12-04 Some bug fixes for mlx5 core and mlx5e driver. v1->v2: - replace "uint" with "unsigned int" ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Mohamad Haj Yahia authored
When using the negative logic (i.e. FLUSH state), after the RQ/SQ reopen we will have a time interval that the RQ/SQ is not really ready and the state indicates that its not in FLUSH state because the initial SQ/RQ struct memory starts as zeros. Now we changed the state to indicate if the SQ/RQ is opened and we will set the READY state after finishing preparing all the SQ/RQ resources. Fixes: 6e8dd6d6 ("net/mlx5e: Don't wait for SQ completions on close") Fixes: f2fde18c ("net/mlx5e: Don't wait for RQ completions on close") Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Saeed Mahameed authored
We are doing SQ descriptors cleanup in driver. Fixes: 6e8dd6d6 ("net/mlx5e: Don't wait for SQ completions on close") Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Saeed Mahameed authored
We are going to do this a couple of steps ahead anyway. Fixes: d3c9bc27 ("net/mlx5e: Added ICO SQs") Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Kamal Heib authored
In old FWs query ISSI command is not supported and for some of those FWs it might fail with status other than "MLX5_CMD_STAT_BAD_OP_ERR". In such case instead of failing the driver load, we will treat any FW status other than 0 for Query ISSI FW command as ISSI not supported and assume ISSI=0 (most basic driver/FW interface). In case of driver syndrom (query ISSI failure by driver) we will fail driver load. Fixes: f62b8bb8 ('net/mlx5: Extend mlx5_core to support ConnectX-4 Ethernet functionality') Signed-off-by: Kamal Heib <kamalh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Kamal Heib authored
Remove duplicate pci dev name printing from mlx5_core_warn/dbg. Fixes: 5a788398 ('net/mlx5_core: Improve mlx5 messages') Signed-off-by: Kamal Heib <kamalh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Kamal Heib authored
Verify the mlx5_core module parameters by making sure that they are in the expected range and if they aren't restore them to their default values. Fixes: 9603b61d ('mlx5: Move pci device handling from mlx5_ib to mlx5_core') Signed-off-by: Kamal Heib <kamalh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Salil authored
This patch introduces the RX checksum function to check the status of the hardware calculated checksum and its error and appropriately convey status to the upper stack in skb->ip_summed field. In hardware, we only support checksum for the following protocols: 1) IPv4, 2) TCP(over IPv4 or IPv6), 3) UDP(over IPv4 or IPv6), 4) SCTP(over IPv4 or IPv6) but we support many L3(IPv4, IPv6, MPLS, PPPoE etc) and L4(TCP, UDP, GRE, SCTP, IGMP, ICMP etc.) protocols. Hardware limitation: Our present hardware RX Descriptor lacks L3/L4 checksum "Status & Error" bit (which usually can be used to indicate whether checksum was calculated by the hardware and if there was any error encountered during checksum calculation). Software workaround: We do get info within the RX descriptor about the kind of L3/L4 protocol coming in the packet and the error status. These errors might not just be checksum errors but could be related to version, length of IPv4, UDP, TCP etc. Because there is no-way of knowing if it is a L3/L4 error due to bad checksum or any other L3/L4 error, we will not (cannot) convey hardware checksum status(CHECKSUM_UNNECESSARY) for such cases to upper stack and will not maintain the RX L3/L4 checksum counters as well. Signed-off-by: Salil Mehta <salil.mehta@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-