- 21 Mar, 2013 17 commits
-
-
Daniel Borkmann authored
If bpf_jit_enable > 1, then we dump the emitted JIT compiled image after creation. Currently, only SPARC and PowerPC has similar output as in the reference implementation on x86_64. Make a small helper function in order to reduce duplicated code and make the dump output uniform across architectures x86_64, SPARC, PPC, ARM (e.g. on ARM flen, pass and proglen are currently not shown, but would be interesting to know as well), also for future BPF JIT implementations on other archs. Cc: Mircea Gherzan <mgherzan@gmail.com> Cc: Matt Evans <matt@ozlabs.org> Cc: Eric Dumazet <eric.dumazet@google.com> Cc: David S. Miller <davem@davemloft.net> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Sergei Shtylyov authored
Switch the driver to the managed device API by replacing ioremap() calls with devm_ioremap_resource() (that will also result in calling request_mem_region() which the driver forgot to do until now) and k[mz]alloc() with devm_kzalloc() -- this permits to simplify driver's probe()/remove() method cleanup. We can now remove the ioremap() error messages since the error messages are printed by devm_ioremap_resource() itself. We can also remove the 'bitbang' field from 'struct sh_eth_private' as we don't need it anymore in order to free the memory behind it... Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Sergei Shtylyov authored
sh_eth_drv_probe() does cast from 'void *' when assigning to the 'pd' variable which is automatic anyway. Turn the assignment into initializer, while removing the cast... Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Sergei Shtylyov authored
sh_mdio_init() uses the bare numbers instead of the PHY interface bits, despite these are declared in sh_eth.h as 'enum PIR_BIT'... Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Willem de Bruijn authored
The packetsocket fanout test uses a packet ring. Use TPACKET_V2 instead of TPACKET_V1 to work around a known 32/64 bit issue in the older ring that manifests on sparc64. Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Andrey Vagin authored
The netlink_diag can be built as a module, just like it's done in unix sockets. The core dumping message carries the basic info about netlink sockets: family, type and protocol, portis, dst_group, dst_portid, state. Groups can be received as an optional parameter NETLINK_DIAG_GROUPS. Netlink sockets cab be filtered by protocols. The socket inode number and cookie is reserved for future per-socket info retrieving. The per-protocol filtering is also reserved for future by requiring the sdiag_protocol to be zero. The file /proc/net/netlink doesn't provide enough information for dumping netlink sockets. It doesn't provide dst_group, dst_portid, groups above 32. v2: fix NETLINK_DIAG_MAX. Now it's equal to the last constant. Acked-by: Pavel Emelyanov <xemul@parallels.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Pablo Neira Ayuso <pablo@netfilter.org> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Gao feng <gaofeng@cn.fujitsu.com> Cc: Thomas Graf <tgraf@suug.ch> Signed-off-by: Andrey Vagin <avagin@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Andrey Vagin authored
Move a few declarations in a header. Acked-by: Pavel Emelyanov <xemul@parallels.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Pablo Neira Ayuso <pablo@netfilter.org> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Gao feng <gaofeng@cn.fujitsu.com> Cc: Thomas Graf <tgraf@suug.ch> Signed-off-by: Andrey Vagin <avagin@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Claudiu Manoil authored
The GRO_DROP return code is handled by the core network layer. The current kernel approach is to factorize this kind of statistics into the upper layers, instead of having all the drivers maintaining them. Signed-off-by: Claudiu Manoil <claudiu.manoil@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Claudiu Manoil authored
Signed-off-by: Claudiu Manoil <claudiu.manoil@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Claudiu Manoil authored
Warning message: warning: 'budget_per_q' may be used uninitialized in this function budget_per_q won't be used uninitialized since the only time it doesn't get initialized is when entering gfar_poll with num_act_queues == 0, meaning rstat_rxf == 0, in which case budget_per_q is not utilized (as it has no meaning). Inititalize budget_per_q to 0 though to suppress this compile warning. Signed-off-by: Claudiu Manoil <claudiu.manoil@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Nobuhiro Iwamatsu authored
Signed-off-by: Nobuhiro Iwamatsu <nobuhiro.iwamatsu.yj@renesas.com> Signed-off-by: Simon Horman <horms+renesas@verge.net.au> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Zefan Li authored
The cgroup code has been surrounded by ifdef CONFIG_NET_CLS_CGROUP and CONFIG_NETPRIO_CGROUP. Signed-off-by: Li Zefan <lizefan@huawei.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Yuchung Cheng authored
This patch implements F-RTO (foward RTO recovery): When the first retransmission after timeout is acknowledged, F-RTO sends new data instead of old data. If the next ACK acknowledges some never-retransmitted data, then the timeout was spurious and the congestion state is reverted. Otherwise if the next ACK selectively acknowledges the new data, then the timeout was genuine and the loss recovery continues. This idea applies to recurring timeouts as well. While F-RTO sends different data during timeout recovery, it does not (and should not) change the congestion control. The implementaion follows the three steps of SACK enhanced algorithm (section 3) in RFC5682. Step 1 is in tcp_enter_loss(). Step 2 and 3 are in tcp_process_loss(). The basic version is not supported because SACK enhanced version also works for non-SACK connections. The new implementation is functionally in parity with the old F-RTO implementation except the one case where it increases undo events: In addition to the RFC algorithm, a spurious timeout may be detected without sending data in step 2, as long as the SACK confirms not all the original data are dropped. When this happens, the sender will undo the cwnd and perhaps enter fast recovery instead. This additional check increases the F-RTO undo events by 5x compared to the prior implementation on Google Web servers, since the sender often does not have new data to send for HTTP. Note F-RTO may detect spurious timeout before Eifel with timestamps does so. Signed-off-by: Yuchung Cheng <ycheng@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Yuchung Cheng authored
Consolidate all of TCP CA_Loss state processing in tcp_fastretrans_alert() into a new function called tcp_process_loss(). This is to prepare the new F-RTO implementation in the next patch. Signed-off-by: Yuchung Cheng <ycheng@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Yuchung Cheng authored
The patch series refactor the F-RTO feature (RFC4138/5682). This is to simplify the loss recovery processing. Existing F-RTO was developed during the experimental stage (RFC4138) and has many experimental features. It takes a separate code path from the traditional timeout processing by overloading CA_Disorder instead of using CA_Loss state. This complicates CA_Disorder state handling because it's also used for handling dubious ACKs and undos. While the algorithm in the RFC does not change the congestion control, the implementation intercepts congestion control in various places (e.g., frto_cwnd in tcp_ack()). The new code implements newer F-RTO RFC5682 using CA_Loss processing path. F-RTO becomes a small extension in the timeout processing and interfaces with congestion control and Eifel undo modules. It lets congestion control (module) determines how many to send independently. F-RTO only chooses what to send in order to detect spurious retranmission. If timeout is found spurious it invokes existing Eifel undo algorithms like DSACK or TCP timestamp based detection. The first patch removes all F-RTO code except the sysctl_tcp_frto is left for the new implementation. Since CA_EVENT_FRTO is removed, TCP westwood now computes ssthresh on regular timeout CA_EVENT_LOSS event. Signed-off-by: Yuchung Cheng <ycheng@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Daniel Borkmann authored
This is a minimal stand-alone user space helper, that allows for debugging or verification of emitted BPF JIT images. This is in particular useful for emitted opcode debugging, since minor bugs in the JIT compiler can be fatal. The disassembler is architecture generic and uses libopcodes and libbfd. How to get to the disassembly, example: 1) `echo 2 > /proc/sys/net/core/bpf_jit_enable` 2) Load a BPF filter (e.g. `tcpdump -p -n -s 0 -i eth1 host 192.168.20.0/24`) 3) Run e.g. `bpf_jit_disasm -o` to disassemble the most recent JIT code output `bpf_jit_disasm -o` will display the related opcodes to a particular instruction as well. Example for x86_64: $ ./bpf_jit_disasm 94 bytes emitted from JIT compiler (pass:3, flen:9) ffffffffa0356000 + <x>: 0: push %rbp 1: mov %rsp,%rbp 4: sub $0x60,%rsp 8: mov %rbx,-0x8(%rbp) c: mov 0x68(%rdi),%r9d 10: sub 0x6c(%rdi),%r9d 14: mov 0xe0(%rdi),%r8 1b: mov $0xc,%esi 20: callq 0xffffffffe0d01b71 25: cmp $0x86dd,%eax 2a: jne 0x000000000000003d 2c: mov $0x14,%esi 31: callq 0xffffffffe0d01b8d 36: cmp $0x6,%eax [...] 5c: leaveq 5d: retq $ ./bpf_jit_disasm -o 94 bytes emitted from JIT compiler (pass:3, flen:9) ffffffffa0356000 + <x>: 0: push %rbp 55 1: mov %rsp,%rbp 48 89 e5 4: sub $0x60,%rsp 48 83 ec 60 8: mov %rbx,-0x8(%rbp) 48 89 5d f8 c: mov 0x68(%rdi),%r9d 44 8b 4f 68 10: sub 0x6c(%rdi),%r9d 44 2b 4f 6c [...] 5c: leaveq c9 5d: retq c3 Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next into wireless John W. Linville says: ==================== This is a big pull request for new features intended for the 3.10 stream... Regarding mac80211, Johannes says: "First, I merged mac80211/master to avoid some conflicts. This brings in a bunch of fixes you're already familiar with. For real -next material, I have a whole bunch of minstrel work, minstrel_ht from Felix and legacy minstrel from Thomas (Huehn). The other Thomas (Pedersen) did a number of changes in mesh to allow userspace peering management even when the mesh isn't secured. Stanislaw changes suspend/resume to always disconnect the networks. This is typically already done by network-manager so won't make a huge difference for most users, but fixes a number problems, particularly with USB drivers that can easily disconnect while suspended. Ilan has a small change to allow mac80211 drivers to differentiate remain-on-channel reasons, and Jouni extends nl80211 to allow fast roaming with full-MAC devices. I have a fairly large number of patches as well, many of them fairly simple cleanups, but also allowing split wiphy dumps and adding back the full wiphy information in nl80211, station entry change checking and more VHT work including VHT capability overrides (mostly for testing purposes)." And for iwlwifi, Johannes says: "Here, I also merged iwlwifi-fixes to avoid conflicts, and otherwise have various cleanups and improvements on the MVM driver, along with a few throughout the driver. Other than Bluetooth Coexistence from Emmanuel there's no over-arching theme, so listing them would pretty much reproduce the shortlog." Regarding NFC, Samuel says: "The 2 features we have with this one are: - An LLCP Service Name Lookup (SNL) netlink interface for querying LLCP service availability from user space. Along the way, Thierry also improved the existing SNL interface for aggregating SNL responses. - An initial LLCP socket options implementation, for setting the Receive Window (RW) and the Maximum Information Unit Extension (MIUX) per socket. This is need for the LLCP validation tests. We also have a microread MEI build failure here: I am not sending this one to 3.9 because the MEI bus code is not there yet, so it won't break for anyone else than me." And for ath6kl, Kalle says: "I added tracing support to ath6kl, along with a new Kconfig option. Now there's also a workaround to reset USB devices when the firmware upload fails, this happened when host was warm rebooted. There are also quite a few small fixes or cleanup." On top of all that, there is the usual bundle of driver updates with new features, new hardware support and the like mixed-in. The ath9k, b43, brcmfmac, mwifiex, rt2800, and wil6210 drivers are all well-represented, and a few other drivers are hit as well. I also pulled-in the wireless fixes tree in order to resolve some pending merge conflicts. Please let me know if there are problems! ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 20 Mar, 2013 23 commits
-
-
John W. Linville authored
Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next into for-davem
-
stephen hemminger authored
Use netdev_alloc_sk_ip_align in the case where packet is copied. This handles case where NET_IP_ALIGN == 0 as well as adding required header padding. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Suggested-by: Daniel Baluta <daniel.baluta@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Daniel Baluta authored
Signed-off-by: Daniel Baluta <dbaluta@ixiacom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eric Dumazet authored
Drivers should reserve some headroom in skb used in receive path, to avoid future head reallocation. One possible way to do that is to use dev_alloc_skb() instead of alloc_skb(), so that NET_SKB_PAD bytes are reserved. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Chris Metcalf authored
Previously, if you did an "ifconfig down" or similar on one core, and the kernel had CONFIG_XFRM enabled, every core would be interrupted to check its percpu flow list for items that could be garbage collected. With this change, we generate a mask of cores that actually have any percpu items, and only interrupt those cores. When we are trying to isolate a set of cpus from interrupts, this is important to do. Signed-off-by: Chris Metcalf <cmetcalf@tilera.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Yuval Mintz authored
Revised bnx2x implementation of PCI Express Advanced Error Recovery - stop and free driver resources according to the AER flow (instead of the currently implemented `hope-for-the-best' release approach), and do not make any assumptions on the HW state after slot reset. Signed-off-by: Yuval Mintz <yuvalmin@broadcom.com> Signed-off-by: Ariel Elior <ariele@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Wei Yongjun authored
fec_poll_controller() was not declared. It should be static. Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Wei Yongjun authored
emac_poll_controller() was not declared. It should be static. Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Acked-by: Mugunthan V N <mugunthanvnm@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Sachin Kamat authored
module_platform_driver macro removes some boilerplate and simplifies the code. Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org> Cc: David Daney <ddaney@caviumnetworks.com> Acked-by: David Daney <ddaney@caviumnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Sachin Kamat authored
module_platform_driver macro removes some boilerplate and simplifies the code. Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Sachin Kamat authored
module_platform_driver macro removes some boilerplate and simplifies the code. Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org> Cc: Samuel Ortiz <samuel@sortiz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Sachin Kamat authored
module_platform_driver macro removes some boilerplate and simplifies the code. Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Sachin Kamat authored
module_platform_driver macro removes some boilerplate and simplifies the code. Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jesper Derehag authored
Process connector can now also detect coredumping events. Main aim of patch is get notified at start of coredumping, instead of having to wait for it to finish and then being notified through EXIT event. Could be used for instance by process-managers that want to get notified as soon as possible about process failures, and not necessarily beeing notified after coredump, which could be in the order of minutes depending on size of coredump, piping and so on. Signed-off-by: Jesper Derehag <jderehag@hotmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Claudiu Manoil authored
The only place where gfar_configure_coalescing is called with an actual bitmask (other than 0xff) is in gfar_poll (on the hot path). So make gfar_configure_coalescing() static for the buffer processing path, and export gfar_configure_coalescing_all() for the remaining cases that require to set coalescing for all the queues at once (on the slow path). Signed-off-by: Claudiu Manoil <claudiu.manoil@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Claudiu Manoil authored
For Multi Q Multi Group (MQ_MG_MODE) mode, the Rx/Tx colescing registers [rt]xic are aliased with the [rt]xic0 registers (coalescing setting regs for Q0). This avoids programming twice in a row the coalescing registers for the Rx/Tx hw Q0. Signed-off-by: Claudiu Manoil <claudiu.manoil@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Claudiu Manoil authored
Split the napi budget fairly among the active queues only, instead of dividing it by the total number of Rx queues assigned to the given interrupt group. Use the h/w indication field RXFi in rstat (receive status register) to identify the active rx queues from the current interrupt group (i.e. receive event occured on ring i, if ring i is part of the current interrupt group). This indication field in rstat, RXFi i=0..7, allows us to find out on which queues of the same interrupt group do we have incomming traffic once we entered the polling routine for the given interrupt group. After servicing the ring i, the corresponding bit RXFi will be written with 1 to clear the active queue indication for that ring. Signed-off-by: Claudiu Manoil <claudiu.manoil@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Claudiu Manoil authored
There are 2 issues with the current napi poll routine, with regards to tx ring cleanup: 1) for multi-queue devices (MQ_MG_MODE), should tx_bit_map != rx_bit_map, which is possible (and supported in h/w) if the DT property "fsl,tx-bit-map" holds a different value than rx_bit_map, the current polling routine will service the wrong Tx queues in this case (i.e. the interrupt group will receive interrupts from tx queues that it will not service) 2) Tx cleanup completion consumes napi budget, whereas the napi budget should be reserved for Rx work only. The patch fixes these issues and provides a clean napi polling routine. Napi poll completion is reached when all the Rx queues have been serviced and there is no Tx work to do. Signed-off-by: Claudiu Manoil <claudiu.manoil@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Daniel Borkmann authored
It is very useful to do dynamic truncation of packets. In particular, we're interested to push the necessary header bytes to the user space and cut off user payload that should probably not be transferred for some reasons (e.g. privacy, speed, or others). With the ancillary extension PAY_OFFSET, we can load it into the accumulator, and return it. E.g. in bpfc syntax ... ld #poff ; { 0x20, 0, 0, 0xfffff034 }, ret a ; { 0x16, 0, 0, 0x00000000 }, ... as a filter will accomplish this without having to do a big hackery in a BPF filter itself. Follow-up JIT implementations are welcome. Thanks to Eric Dumazet for suggesting and discussing this during the Netfilter Workshop in Copenhagen. Suggested-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Daniel Borkmann authored
__skb_get_poff() returns the offset to the payload as far as it could be dissected. The main user is currently BPF, so that we can dynamically truncate packets without needing to push actual payload to the user space and instead can analyze headers only. Suggested-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller authored
Pull in the 'net' tree to get Daniel Borkmann's flow dissector infrastructure change. Signed-off-by: David S. Miller <davem@davemloft.net>
-
Willem de Bruijn authored
Fix flaky results with PACKET_FANOUT_HASH depending on whether the two flows hash into the same packet socket or not. Also adds tests for PACKET_FANOUT_LB and PACKET_FANOUT_CPU and replaces the counting method with a packet ring. Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-