- 25 Nov, 2008 3 commits
-
-
Ilpo Järvinen authored
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Ilpo Järvinen authored
I always had thought that collapsing up to two at a time was intentional decision to avoid excessive processing if 1 byte sized skbs are to be combined for a full mtu, and consecutive retransmissions would make the size of the retransmittee double each round anyway, but some recent discussion made me to understand that was not the case. Thus make collapse work more and wait less. It would be possible to take advantage of the shifting machinery (added in the later patch) in the case of paged data but that can be implemented on top of this change. tcp_skb_is_last check is now provided by the loop. I tested a bit (ss-after-idle-off, fill 4096x4096B xfer, 10s sleep + 4096 x 1byte writes while dropping them for some a while with netem): . 16774097:16775545(1448) ack 1 win 46 . 16775545:16776993(1448) ack 1 win 46 . ack 16759617 win 2399 P 16776993:16777217(224) ack 1 win 46 . ack 16762513 win 2399 . ack 16765409 win 2399 . ack 16768305 win 2399 . ack 16771201 win 2399 . ack 16774097 win 2399 . ack 16776993 win 2399 . ack 16777217 win 2399 P 16777217:16777257(40) ack 1 win 46 . ack 16777257 win 2399 P 16777257:16778705(1448) ack 1 win 46 P 16778705:16780153(1448) ack 1 win 46 FP 16780153:16781313(1160) ack 1 win 46 . ack 16778705 win 2399 . ack 16780153 win 2399 F 1:1(0) ack 16781314 win 2399 While without drop-all period I get this: . 16773585:16775033(1448) ack 1 win 46 . ack 16764897 win 9367 . ack 16767793 win 9367 . ack 16770689 win 9367 . ack 16773585 win 9367 . 16775033:16776481(1448) ack 1 win 46 P 16776481:16777217(736) ack 1 win 46 . ack 16776481 win 9367 . ack 16777217 win 9367 P 16777217:16777218(1) ack 1 win 46 P 16777218:16777219(1) ack 1 win 46 P 16777219:16777220(1) ack 1 win 46 ... P 16777247:16777248(1) ack 1 win 46 . ack 16777218 win 9367 . ack 16777219 win 9367 ... . ack 16777233 win 9367 . ack 16777248 win 9367 P 16777248:16778696(1448) ack 1 win 46 P 16778696:16780144(1448) ack 1 win 46 FP 16780144:16781313(1169) ack 1 win 46 . ack 16780144 win 9367 F 1:1(0) ack 16781314 win 9367 The window seems to be 30-40 segments, which were successfully combined into: P 16777217:16777257(40) ack 1 win 46 Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eric Dumazet authored
We can reduce pressure on dst entry refcount that slowdown UDP transmit path on SMP machines. This pressure is visible on RTP servers when delivering content to mediagateways, especially big ones, handling thousand of streams. Several cpus send UDP frames to the same destination, hence use the same dst entry. This patch makes ip_push_pending_frames() steal the refcount its callers had to take when filling inet->cork.dst. This doesnt avoid all refcounting, but still gives speedups on SMP, on UDP/RAW transmit path. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 24 Nov, 2008 21 commits
-
-
Eric Dumazet authored
We can reduce pressure on dst entry refcount that slowdown UDP transmit path on SMP machines. This pressure is visible on RTP servers when delivering content to mediagateways, especially big ones, handling thousand of streams. Several cpus send UDP frames to the same destination, hence use the same dst entry. This patch makes ip_append_data() eventually steal the refcount its callers had to take on the dst entry. This doesnt avoid all refcounting, but still gives speedups on SMP, on UDP/RAW transmit path Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jarek Poplawski authored
gen_kill_estimator() linear lists lookups are very slow, and e.g. while deleting a large number of HTB classes soft lockups were reported. Here is another try to fix this problem: this time internally, with rbtree, so similarly to Jamal's hashing idea IIRC. (Looking for next hits could be still optimized, but it's really fast as it is.) Reported-by: Badalian Vyacheslav <slavon@bigtelecom.ru> Reported-by: Denys Fedoryshchenko <denys@visp.net.lb> Signed-off-by: Jarek Poplawski <jarkao2@gmail.com> Acked-by: Jamal Hadi Salim <hadi@cyberus.ca> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Patrick McHardy authored
Jarek Poplawski points out: If all child qdiscs of sch_drr are non-work-conserving (e.g. sch_tbf) drr_dequeue() will busy-loop waiting for skbs instead of leaving the job for a watchdog. Checking for list_empty() in each loop isn't necessary either, because this can never be true except the first time. Using non-work-conserving qdiscs as children of DRR makes no sense, simply bail out in that case. Reported-by: Jarek Poplawski <jarkao2@gmail.com> Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Wang Chen authored
This use of netdev->priv is wrong. The right way is: alloc_netdev() with no memory for private data. make netdev->ml_priv to point to c2_dev. Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com> Acked-by: Roland Dreier <rolandd@cisco.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Wang Chen authored
1. convert netdev->priv to netdev_priv(). 2. make sbni_pci_probe() be static. Signed-off-by: Wang Chen <wangchen@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jirka Pirko authored
Freeing previously allocated buffers in case of error. Signed-off-by: Jirka Pirko <jirka@pirko.cz> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jirka Pirko authored
Pointed out by Joe Perches. Error message after tx_ring allocation check was wrong. Signed-off-by: Jirka Pirko <jirka@jirka.pirko.cz> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jirka Pirko authored
Fixed typo when allocating rx_ring, tx_ring was checked for null instead. Signed-off-by: Jirka Pirko <jirka@pirko.cz> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Stephen Hemminger authored
Instead of using call by reference use the PTR_ERR macros to handle return value with error case. Compile tested only. Signed-off-by: Stephen Hemminger <shemminger@vyatta.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eric Dumazet authored
There is still a call to sock_prot_inuse_add() in af_netlink while in a preemptable section. Add explicit BH disable around this call. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eric Dumazet authored
The rule of calling sock_prot_inuse_add() is that BHs must be disabled. Some new calls were added where this was not true and this tiggers warnings as reported by Ilpo. Fix this by adding explicit BH disabling around those call sites, or moving sock_prot_inuse_add() call inside an existing BH disabled section. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eric Dumazet authored
Linus mentioned we could try to perform long word operations, even on potentially unaligned addresses, on x86 at least. David mentioned the HAVE_EFFICIENT_UNALIGNED_ACCESS test to handle this on all arches that have efficient unailgned accesses. I tried this idea and got nice assembly on 32 bits: 158: 33 82 38 01 00 00 xor 0x138(%edx),%eax 15e: 33 8a 34 01 00 00 xor 0x134(%edx),%ecx 164: c1 e0 10 shl $0x10,%eax 167: 09 c1 or %eax,%ecx 169: 74 0b je 176 <eth_type_trans+0x87> And very nice assembly on 64 bits of course (one xor, one shl) Nice oprofile improvement in eth_type_trans(), 0.17 % instead of 0.41 %, expected since we remove 8 instructions on a fast path. This patch implements a compare_ether_addr_64bits() function, that uses the CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS ifdef to efficiently perform the 6 bytes comparison on all capable arches. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Commit 4e4fd4e4 ("ne2k: convert to net_device_ops") exported some ei_* symbols from the 8390 library, but the axnet_cs driver defines local static versions of the same functions. Rename them to avoid the namespace conflict. Reported by Stephen Rothwell. Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
The rule of calling sock_prot_inuse_add() is that BHs must be disabled. Some new calls were added where this was not true and this tiggers warnings as reported by Ilpo. Fix this by adding explicit BH disabling around those call sites. Signed-off-by: David S. Miller <davem@davemloft.net>
-
Alexey Dobriyan authored
dev_net_set() should be the very first thing after alloc_netdev(). "ndo_" changes turned simple assignment (which is OK to do before netns assignment) into quite non-trivial operation (which is not OK, init_net was used). This leads to incomplete initialisation of tunnel device in netns. BUG: unable to handle kernel NULL pointer dereference at 00000004 IP: [<c02efdb5>] ip6_tnl_exit_net+0x37/0x4f *pde = 00000000 Oops: 0000 [#1] PREEMPT DEBUG_PAGEALLOC last sysfs file: /sys/class/net/lo/operstate Pid: 10, comm: netns Not tainted (2.6.28-rc6 #1) EIP: 0060:[<c02efdb5>] EFLAGS: 00010246 CPU: 0 EIP is at ip6_tnl_exit_net+0x37/0x4f EAX: 00000000 EBX: 00000020 ECX: 00000000 EDX: 00000003 ESI: c5caef30 EDI: c782bbe8 EBP: c7909f50 ESP: c7909f48 DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068 Process netns (pid: 10, ti=c7908000 task=c7905780 task.ti=c7908000) Stack: c03e75e0 c7390bc8 c7909f60 c0245448 c7390bd8 c7390bf0 c7909fa8 c012577a 00000000 00000002 00000000 c0125736 c782bbe8 c7909f90 c0308fe3 c782bc04 c7390bd4 c0245406 c084b718 c04f0770 c03ad785 c782bbe8 c782bc04 c782bc0c Call Trace: [<c0245448>] ? cleanup_net+0x42/0x82 [<c012577a>] ? run_workqueue+0xd6/0x1ae [<c0125736>] ? run_workqueue+0x92/0x1ae [<c0308fe3>] ? schedule+0x275/0x285 [<c0245406>] ? cleanup_net+0x0/0x82 [<c0125ae1>] ? worker_thread+0x81/0x8d [<c0128344>] ? autoremove_wake_function+0x0/0x33 [<c0125a60>] ? worker_thread+0x0/0x8d [<c012815c>] ? kthread+0x39/0x5e [<c0128123>] ? kthread+0x0/0x5e [<c0103b9f>] ? kernel_thread_helper+0x7/0x10 Code: db e8 05 ff ff ff 89 c6 e8 dc 04 f6 ff eb 08 8b 40 04 e8 38 89 f5 ff 8b 44 9e 04 85 c0 75 f0 43 83 fb 20 75 f2 8b 86 84 00 00 00 <8b> 40 04 e8 1c 89 f5 ff e8 98 04 f6 ff 89 f0 e8 f8 63 e6 ff 5b EIP: [<c02efdb5>] ip6_tnl_exit_net+0x37/0x4f SS:ESP 0068:c7909f48 ---[ end trace 6c2f2328fccd3e0c ]--- Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eric Dumazet authored
This is the last step to be able to perform full RCU lookups in __inet_lookup() : After established/timewait tables, we add RCU lookups to listening hash table. The only trick here is that a socket of a given type (TCP ipv4, TCP ipv6, ...) can now flight between two different tables (established and listening) during a RCU grace period, so we must use different 'nulls' end-of-chain values for two tables. We define a large value : #define LISTENING_NULLS_BASE (1U << 29) So that slots in listening table are guaranteed to have different end-of-chain values than slots in established table. A reader can still detect it finished its lookup in the right chain. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Gerrit Renker authored
The patch extends existing code: * Confirm options divide into the confirmed value plus an optional preference list for SP values. Previously only the preference list was echoed for SP values, now the confirmed value is added as per RFC 4340, 6.1; * length and sanity checks are added to avoid illegal memory (or NULL) access. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Gerrit Renker authored
Support for Mandatory options is provided by this patch, which will be used by subsequent feature-negotiation patches. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Gerrit Renker authored
This extends the scope of two available functions, encode|decode_value_var, to work up to 6 (8) bytes, to match maximum requirements in the RFC. These functions are going to be used both by general option processing and feature negotiation code, hence declarations have been put into feat.h. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Gerrit Renker authored
This provides function to query the current TX/RX CCID dynamically, without reliance on the minisock value, using dynamic information available in the currently loaded CCID module. This query function is then used to (a) provide the getsockopt part for getting/setting CCIDs via sockopts; (b) replace the current test for "which CCID is in use" in probe.c. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Gerrit Renker authored
With this patch, TX/RX CCIDs can now be changed on a per-connection basis, which overrides the defaults set by the global sysctl variables for TX/RX CCIDs. To make full use of this facility, the remaining patches of this patch set are needed, which track dependencies and activate negotiated feature values. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 23 Nov, 2008 4 commits
-
-
Brice Goglin authored
Update myri10ge firmware headers. Signed-off-by: Brice Goglin <brice@myri.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Brice Goglin authored
Update DCA sections closing comments. Signed-off-by: Brice Goglin <brice@myri.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eric Dumazet authored
In order to have relevant information for NETLINK protocol, in /proc/net/protocols, we should use sock_prot_inuse_add() to update a (percpu and pernamespace) counter of inuse sockets. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eric Dumazet authored
1) Use eq_net() in inet_netns_ok() to speedup socket creation if !CONFIG_NET_NS 2) Reorder the tests about inet_ehash_secret generation (once only) Use the unlikely() macro when testing if inet_ehash_secret already generated. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 22 Nov, 2008 12 commits
-
-
-
Alexander Duyck authored
Currently the igb driver is experiencing a panic due to a null function pointer being used during the cleanup of the ethtool looback test on fiber/serdes parts. This patch prevents that and adds a check prior to calling any phy function. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Scott Feldman authored
Clarrify reading PBA has no side-effect (clearing). Add missing GPL license text. Signed-off-by: Scott Feldman <scofeldm@cisco.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Scott Feldman authored
Signed-off-by: Scott Feldman <scofeldm@cisco.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Scott Feldman authored
Signed-off-by: Scott Feldman <scofeldm@cisco.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Scott Feldman authored
Add driver/firmware compatibility check. Update firmware notify cmd to honor notify area size. Add new version of init cmd. Add link_down_cnt to notify area to track link down count. Signed-off-by: Scott Feldman <scofeldm@cisco.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Scott Feldman authored
Enable ethtool support for get/set_flags so LRO can be turned on/off by fwding drivers such as the bridge driver. LRO is not compatible with fwding drivers. Signed-off-by: Scott Feldman <scofeldm@cisco.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Krzysztof Hałasa authored
pc300too driver works around a bug in PCI9050 bridge. Unfortunately it was doing that too late. Signed-off-by: Krzysztof Hałasa <khc@pm.waw.pl>
-
Krzysztof Hałasa authored
Signed-off-by: Krzysztof Hałasa <khc@pm.waw.pl>
-
Krzysztof Hałasa authored
Signed-off-by: Krzysztof Hałasa <khc@pm.waw.pl>
-
Krzysztof Hałasa authored
Signed-off-by: Krzysztof Hałasa <khc@pm.waw.pl>
-
Krzysztof Hałasa authored
Signed-off-by: Krzysztof Hałasa <khc@pm.waw.pl>
-