- 07 Mar, 2013 23 commits
-
-
Daniel Pieczko authored
Allocating 2 buffers per page is insanely inefficient when MTU is 1500 and PAGE_SIZE is 64K (as it usually is on POWER). Allocate as many as we can fit, and choose the refill batch size at run-time so that we still always use a whole page at once. [bwh: Fix loop condition to allow for compound pages; rebase] Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
-
Ben Hutchings authored
This condition is brittle and we have lots of flags to spare. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
-
Daniel Pieczko authored
On POWER systems, DMA mapping/unmapping operations are very expensive. These changes reduce these costs by trying to reuse DMA mapped pages. After all the buffers associated with a page have been processed and passed up, the page is placed into a ring (if there is room). For each page that is required for a refill operation, a page in the ring is examined to determine if its page count has fallen to 1, ie. the kernel has released its reference to these packets. If this is the case, the page can be immediately added back into the RX descriptor ring, without having to re-map it for DMA. If the kernel is still holding a reference to this page, it is removed from the ring and unmapped for DMA. Then a new page, which can immediately be used by RX buffers in the descriptor ring, is allocated and DMA mapped. The time a page needs to spend in the recycle ring before the kernel has released its page references is based on the number of buffers that use this page. As large pages can hold more RX buffers, the RX recycle ring can be shorter. This reduces memory usage on POWER systems, while maintaining the performance gain achieved by recycling pages, following the driver change to pack more than two RX buffers into large pages. When an IOMMU is not present, the recycle ring can be small to reduce memory usage, since DMA mapping operations are inexpensive. With a small recycle ring, attempting to refill the descriptor queue with more buffers than the equivalent size of the recycle ring could ultimately lead to memory leaks if page entries in the recycle ring were overwritten. To prevent this, the check to see if the recycle ring is full is changed to check if the next entry to be written is NULL. [bwh: Combine and rebase several commits so this is complete before the following buffer-packing changes. Remove module parameter.] Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
-
Ben Hutchings authored
Enable RX DMA scattering iff an RX buffer large enough for the current MTU will not fit into a single page and the NIC supports DMA scattering for kernel-mode RX queues. On Falcon and Siena, the RX_USR_BUF_SIZE field is used as the DMA limit for both all RX queues with scatter enabled. Set it to 1824, matching what Onload uses now. Maintain a statistic for frames truncated due to lack of descriptors (rx_nodesc_trunc). This is distinct from rx_frm_trunc which may be incremented when scattering is disabled and implies an over-length frame. Whenever an MTU change causes scattering to be turned on or off, update filters that point to the PF queues, but leave others unchanged, as VF drivers assume scattering is off. Add n_frags parameters to various functions, and make them iterate: - efx_rx_packet() - efx_recycle_rx_buffers() - efx_rx_mk_skb() - efx_rx_deliver() Make efx_handle_rx_event() responsible for updating efx_rx_queue::removed_count. Change the RX pipeline state to a starting ring index and number of fragments, and make __efx_rx_packet() responsible for clearing it. Based on earlier versions by David Riddoch and Jon Cooper. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
-
Ben Hutchings authored
Adjust rx_buf->page_offset when we eat the RX hash prefix. Remove efx_rx_buf_offset(), which is now redundant. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
-
Ben Hutchings authored
Currently we prefetch from the Ethernet header, but we will also read the hash prefix. In practice they should be in the same cache line and this won't hurt, but it is still pointless to add on the hash prefix size. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
-
Ben Hutchings authored
efx_rx_buf_va() returns the virtual address of the current start of the buffer. The callers must add the hash prefix size themselves. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
-
Ben Hutchings authored
The pipeline mechanism will need to change a bit for scattered packets. Add a wrapper to insulate efx_process_channel() from this. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
-
Ben Hutchings authored
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
-
Ben Hutchings authored
Replace efx_nic::rx_buffer_len with efx_nic::rx_dma_len, the maximum RX DMA length. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
-
Ben Hutchings authored
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
-
Alexandre Rames authored
The Linux side of EEH is triggered by MMIO reads, but this driver's data path does not issue any MMIO reads (except in legacy interrupt mode). Therefore add a monitor function to poll EEH periodically. When preparing to reset the device based on our own error detection, also poll EEH and defer to its recovery mechanism if appropriate. [bwh: Use a separate condition for the initial link poll; fix some style errors] Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
-
Ben Hutchings authored
On Siena, VFs share RSS configuration with the PF. We attempted to support configurations where the PF only uses 1 RX queue and VFs use multiple RX queues, by (1) setting up RSS for the number of RX queues per VF (2) disabling RSS in the PF's RX default filters. Unfortunately commit cd2d5b52 ('sfc: Add SR-IOV back-end support for SFC9000 family') only included (1). This is (2). Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
-
Ben Hutchings authored
efx_filter_insert_filter() uses the first table entry in the hash chain that either has the same match values or is empty. This means that replacement doesn't always work correctly: 1. Insert filter F1 with match values M1, hashing to H1, at first possible entry E1. 2. Insert filter F2 with match values M2, hashing to H1, at second possible entry E2. 3. Remove filter F1. 4. Insert filter F3 with match values M2, hashing to H1, at first possible entry E1. F3 should have either replaced F2 or been rejected (depending on priority and the replace_equal parameter). Instead, search for both a matching filter that the inserted filter would replace, and an available insertion point, up to the applicable maximum search depths. If we insert at lower depth than a replaced filter, clear the replaced filter. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
-
Ben Hutchings authored
efx_filter_search() is only called from efx_filter_insert(), and neither function is very long. The following bug fix requires a more sophisticated search with a third result, which is going to be easier to implement as part of the same function. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
-
Ben Hutchings authored
These functions happen to work for default MAC filters: they generate an initial index of 1/0 for unicast/multicast respectively and an increment of 1 for either, so a search succeeds at depth 2. But this is a matter of luck rather than design, and it really won't work well with the bug fix we're about to do. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
-
Ben Hutchings authored
The 'for_insert' parameter is redundant since there are no longer any other operations that need to search based on a filter spec. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
-
Ben Hutchings authored
The 'replace' flag to efx_filter_insert_filter() controls whether the new filter may replace *any* filter, and is checked even before priority comparison. But lower-priority filters should never block insertion of higher-priority filters. Change the priority checking so that lower-priority filters are replaced regardless of the value of the flag, and rename the flag to 'replace_equal'. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
-
Alexandre Rames authored
[bwh: Remove more dead code, and make efx_ptp_rx() pull the data it needs into the header area.] Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
-
Laurence Evans authored
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
-
Laurence Evans authored
There is a long-standing problem with the packet-timestamp matching in the driver. When a PTP packet is received by the MC, the FPGA timestamps the packet and the MC sends the timestamp and 6 bytes of the UUID to the driver. The driver then matches the timestamp against received packets using the same 6 bytes of UUID. The problem comes from the choice of which 6 bytes to use. The PTP spec is slightly contradictory and misleading in one of the two places where the UUIDs are discussed. From section 7.2.2.2 of the spec, a PTPD2 UUID can be either a EUI-64 or a EUI-64 constructed from a EUI-48. The typical ethernet based implementation uses a EUI-64 constructed from a EUI-48. This works by taking the first 3 bytes of the MAC address of the NIC being used for PTP (the OUI), then inserting 0xFF, 0xFE, then taking the last 3 bytes of the MAC address giving MAC[0], MAC[1], MAC[2], 0xFF, 0xFE, MAC[3], MAC[4], MAC[5] The current MC firmware and driver discard the first two bytes of this UUID and packets are matched against timestamps using bytes 2 to 7 so there is a small risk that in a deployment of Solarflare PTP NICs used with other vendors NICs, that a PTP packet could be matched against the wrong timestamp. This applies to all other organisations whose third byte of the OUI is 0x53. It's a long list but I notice that it includes Cisco. The necessary modifications to use bytes 0-2 and 5-7 of the UUID to match against are quite small but introduce incompatibility between older version of the firmware and driver. When PTP is enabled via SO_TIMESTAMPING specifying PTP V2, the driver will try to enable PTP in the firmware using the enhanced mode (above). If the firmware returns an error, the driver will enable PTP in the firmware using the old mode. [bwh: Fix some style errors; remove private ioctl bits] Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
-
Ben Hutchings authored
Instead of having efx_ptp_rx() call netif_receive_skb() for an invalid PTP packet, make it return false for rejected packets and have efx_rx_deliver() pass them up. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
-
Ben Hutchings authored
-
- 06 Mar, 2013 17 commits
-
-
Eric Dumazet authored
HTB uses an internal pfifo queue, which limit is not reported to userland tools (tc), and value inherited from device tx_queue_len at setup time. Introduce TCA_HTB_DIRECT_QLEN attribute to allow finer control. Remove two obsolete pr_err() calls as well. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Florian Fainelli authored
We are currently busy waiting for MDIO registers to complete their operation but we did not propagate the result back to the caller. Update r6040_phy_{read,write} to report the busy waiting result accordingly. Signed-off-by: Florian Fainelli <florian@openwrt.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Nicolas Dichtel authored
It's useful to be able to get the initial state of all entries. The patch adds the support for IPv4 and IPv6. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jiri Pirko authored
As suggested by Eric Dumazet, allow user to select mode which chooses TX port randomly. Functionality should be more of less similar to round-robin mode with even lower overhead. Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jiri Pirko authored
No need to duplicate code for this. Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Ben Hutchings authored
RX DMA buffers start at an offset of EFX_PAGE_IP_ALIGN bytes from the start of a cache line. This offset obviously needs to be included in the virtual address, but this was missed in commit b590ace0 ('sfc: Fix efx_rx_buf_offset() in the presence of swiotlb') since EFX_PAGE_IP_ALIGN is equal to 0 on both x86 and powerpc. Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
-
Ben Hutchings authored
efx_device_detach_sync() locks all TX queues before marking the device detached and thus disabling further TX scheduling. But it can still be interrupted by TX completions which then result in TX scheduling in soft interrupt context. This will deadlock when it tries to acquire a TX queue lock that efx_device_detach_sync() already acquired. To avoid deadlock, we must use netif_tx_{,un}lock_bh(). Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
-
Eric Dumazet authored
BQL (Byte Queue Limits) proper operation needs TX completion being serviced in a timely fashion. bnx2x uses a non standard NAPI poll weight, and thats not fair to other napi poll handlers, and even not reasonable. Use the default value instead. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eric Dumazet authored
Some drivers use a too big NAPI poll weight. This patch adds a NAPI_POLL_WEIGHT default value and issues an error message if a driver attempts to use a bigger weight. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Eilon Greenstein <eilong@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Flavio Leitner authored
We must try harder to get unique (addr, port) pairs when doing port autoselection for sockets with SO_REUSEADDR option set. This is a continuation of commit aacd9289 for IPv6. Signed-off-by: Flavio Leitner <fbl@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jingoo Han authored
This patch uses module_platform_driver_probe() macro which makes the code smaller and simpler. Signed-off-by: Jingoo Han <jg1.han@samsung.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jingoo Han authored
This patch uses module_platform_driver_probe() macro which makes the code smaller and simpler. Signed-off-by: Jingoo Han <jg1.han@samsung.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jingoo Han authored
This patch uses module_platform_driver_probe() macro which makes the code smaller and simpler. Signed-off-by: Jingoo Han <jg1.han@samsung.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpcLinus Torvalds authored
Pull powerpc fixes from Ben Herrenschmidt: "Here are a few powerpc bits & fixes for rc1. A couple of str*cpy fixes, some fixes in handling the FSCR register on Power8 (controls the enabling of processor features), a 32-bit build fix and a couple more nits." * 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: powerpc: Set DSCR bit in FSCR setup powerpc: Add DSCR FSCR register bit definition powerpc: Fix setting FSCR for HV=0 and on secondary CPUs powerpc: Wireup the kcmp syscall to sys_ni powerpc: Remove unused BITOP_LE_SWIZZLE macro powerpc: Avoid link stack corruption in MMU on syscall entry path drivers/tty/hvc: Use strlcpy instead of strncpy powerpc/pseries/hvcserver: Fix strncpy buffer limit in location code powerpc: Fix compile of sha1-powerpc-asm.S on 32-bit
-
git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linuxLinus Torvalds authored
Pull virtio hwrng fix from Rusty Russell: "Nasty side-effect of vmalloc'ing modules: their static vars cannot be put into scatterlists. Jens has a check queued for this, so it shouldn't happen again. We could fix this in virtio_rng, but it's actually far easier to just do it in the core" * tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux: hw_random: make buffer usable in scatterlist.
-
git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds authored
Pull networking fixes from David Miller: "A moderately sized pile of fixes, some specifically for merge window introduced regressions although others are for longer standing items and have been queued up for -stable. I'm kind of tired of all the RDS protocol bugs over the years, to be honest, it's way out of proportion to the number of people who actually use it. 1) Fix missing range initialization in netfilter IPSET, from Jozsef Kadlecsik. 2) ieee80211_local->tim_lock needs to use BH disabling, from Johannes Berg. 3) Fix DMA syncing in SFC driver, from Ben Hutchings. 4) Fix regression in BOND device MAC address setting, from Jiri Pirko. 5) Missing usb_free_urb in ISDN Hisax driver, from Marina Makienko. 6) Fix UDP checksumming in bnx2x driver for 57710 and 57711 chips, fix from Dmitry Kravkov. 7) Missing cfgspace_lock initialization in BCMA driver. 8) Validate parameter size for SCTP assoc stats getsockopt(), from Guenter Roeck. 9) Fix SCTP association hangs, from Lee A Roberts. 10) Fix jumbo frame handling in r8169, from Francois Romieu. 11) Fix phy_device memory leak, from Petr Malat. 12) Omit trailing FCS from frames received in BGMAC driver, from Hauke Mehrtens. 13) Missing socket refcount release in L2TP, from Guillaume Nault. 14) sctp_endpoint_init should respect passed in gfp_t, rather than use GFP_KERNEL unconditionally. From Dan Carpenter. 15) Add AISX AX88179 USB driver, from Freddy Xin. 16) Remove MAINTAINERS entries for drivers deleted during the merge window, from Cesar Eduardo Barros. 17) RDS protocol can try to allocate huge amounts of memory, check that the user's request length makes sense, from Cong Wang. 18) SCTP should use the provided KMALLOC_MAX_SIZE instead of it's own, bogus, definition. From Cong Wang. 19) Fix deadlocks in FEC driver by moving TX reclaim into NAPI poll, from Frank Li. Also, fix a build error introduced in the merge window. 20) Fix bogus purging of default routes in ipv6, from Lorenzo Colitti. 21) Don't double count RTT measurements when we leave the TCP receive fast path, from Neal Cardwell." * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (61 commits) tcp: fix double-counted receiver RTT when leaving receiver fast path CAIF: fix sparse warning for caif_usb rds: simplify a warning message net: fec: fix build error in no MXC platform net: ipv6: Don't purge default router if accept_ra=2 net: fec: put tx to napi poll function to fix dead lock sctp: use KMALLOC_MAX_SIZE instead of its own MAX_KMALLOC_SIZE rds: limit the size allocated by rds_message_alloc() MAINTAINERS: remove eexpress MAINTAINERS: remove drivers/net/wan/cycx* MAINTAINERS: remove 3c505 caif_dev: fix sparse warnings for caif_flow_cb ax88179_178a: ASIX AX88179_178A USB 3.0/2.0 to gigabit ethernet adapter driver sctp: use the passed in gfp flags instead GFP_KERNEL ipv[4|6]: correct dropwatch false positive in local_deliver_finish l2tp: Restore socket refcount when sendmsg succeeds net/phy: micrel: Disable asymmetric pause for KSZ9021 bgmac: omit the fcs phy: Fix phy_device_free memory leak bnx2x: Fix KR2 work-around condition ...
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull irq fixes and cleanups from Thomas Gleixner: "Commit e5ab012c ("nohz: Make tick_nohz_irq_exit() irq safe") is the first commit in the series and the minimal necessary bugfix, which needs to go back into stable. The remanining commits enforce irq disabling in irq_exit(), sanitize the hardirq/softirq preempt count transition and remove a bunch of no longer necessary conditionals." I personally love getting rid of the very subtle and confusing IRQ_EXIT_OFFSET thing. Even apart from the whole "more lines removed than added" thing. * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: irq: Don't re-enable interrupts at the end of irq_exit irq: Remove IRQ_EXIT_OFFSET workaround Revert "nohz: Make tick_nohz_irq_exit() irq safe" irq: Sanitize invoke_softirq irq: Ensure irq_exit() code runs with interrupts disabled nohz: Make tick_nohz_irq_exit() irq safe
-