- 26 Aug, 2016 1 commit
-
-
Xin Long authored
Commit b17c7069 ("loopback: sctp: add NETIF_F_SCTP_CSUM to device features") added NETIF_F_SCTP_CRC to device features for lo device to improve the performance of sctp over lo. This patch is to add NETIF_F_SCTP_CRC to device features for veth to improve the performance of sctp over veth.
Before this patch:
    ip netns exec cs_client netperf -H 10.167.12.2 -t SCTP_STREAM -- -m 10K
    Recv   Send   Send
    Socket Socket Message Elapsed
    Size   Size   Size    Time    Throughput
    bytes  bytes  bytes   secs.   10^6bits/sec
    212992 212992 10240   10.00   1117.16
After this patch:
    ip netns exec cs_client netperf -H 10.167.12.2 -t SCTP_STREAM -- -m 10K
    Recv   Send   Send
    Socket Socket Message Elapsed
    Size   Size   Size    Time    Throughput
    bytes  bytes  bytes   secs.   10^6bits/sec
    212992 212992 10240   10.20   1415.22
Tested-by: Li Shuang <tjlishuang@yeah.net> Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 25 Aug, 2016 10 commits
-
-
Florian Fainelli authored
Since we keep shadow copies of which interrupt sources are enabled through the intrl2_*_mask_{set,clear} macros, make sure that the ordering in which we do these two operations: update the copy, then unmask the register is correct. This is not currently a problem because we actually do not use them, but we will in a subsequent patch optimizing register accesses, so better be safe here. Fixes: 80105bef ("net: systemport: add Broadcom SYSTEMPORT Ethernet MAC driver") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
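The ordering described above can be illustrated with a small userspace sketch. It is not the driver's code: `struct fake_intc`, its fields and both helpers are hypothetical stand-ins for the intrl2_*_mask_{set,clear} macros. The mirror-image ordering used when masking is an assumption, since the message only spells out the unmask case.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a memory-mapped interrupt controller. */
struct fake_intc {
	uint32_t mask_reg;    /* bit set = source masked in "hardware" */
	uint32_t irq_enabled; /* software shadow: bit set = source enabled */
};

/* Unmask sources: update the shadow copy first, then the register,
 * so the shadow is already correct when the first interrupt can fire. */
static void intc_mask_clear(struct fake_intc *ic, uint32_t bits)
{
	assert(ic);
	ic->irq_enabled |= bits;
	ic->mask_reg &= ~bits;
}

/* Mask sources: silence the register first, then update the shadow
 * (assumed mirror ordering, not stated in the commit message). */
static void intc_mask_set(struct fake_intc *ic, uint32_t bits)
{
	assert(ic);
	ic->mask_reg |= bits;
	ic->irq_enabled &= ~bits;
}

/* The shadow never claims a source is enabled while it is masked. */
static int intc_consistent(const struct fake_intc *ic)
{
	return (ic->irq_enabled & ic->mask_reg) == 0;
}
```

With this ordering, a handler that consults the shadow copy right after an unmask sees the source as enabled.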
-
Eric Dumazet authored
per_cpu_inc() is faster (at least on x86) than per_cpu_ptr(xxx)++; Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eric Dumazet authored
Adds SNMP counter for drops caused by MD5 mismatches. The current syslog might help, but a counter is more precise and helps monitoring. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eric Dumazet authored
TCP MD5 mismatches do increment sk_drops counter in all states but SYN_RECV. This is very unlikely to happen in the real world, but worth adding to help diagnostics. We increase the parent (listener) sk_drops. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Wei Yongjun authored
Fixes the following sparse warning: drivers/net/vmxnet3/vmxnet3_drv.c:1645:1: warning: symbol 'vmxnet3_rq_destroy_all_rxdataring' was not declared. Should it be static? Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by: Shrikrishna Khare <skhare@vmware.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Wei Yongjun authored
Fix to return error code -ENOMEM from the dma_map_single error handling case instead of 0, as done elsewhere in this function. Fixes: 032c5e82 ("Driver for IBM System i/p VNIC protocol") Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
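As a rough illustration of this bug class (not the ibmvnic code itself), the sketch below shows an error path that must set a negative errno before jumping to the cleanup label. `fake_dma_map()` and `setup_rx_buffer()` are hypothetical names, and the mapping failure is simulated by a flag.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for a DMA mapping call: NULL means failure. */
static void *fake_dma_map(void *buf, int fail)
{
	return fail ? NULL : buf;
}

/* Allocate and "map" a buffer. Every failure path returns a negative
 * errno; falling through to the cleanup label with rc still 0 would
 * report success to the caller even though nothing was set up. */
static int setup_rx_buffer(size_t len, int simulate_map_failure, void **out)
{
	int rc;
	void *buf = malloc(len);

	if (!buf)
		return -ENOMEM;

	if (!fake_dma_map(buf, simulate_map_failure)) {
		rc = -ENOMEM; /* the fix: set the error code explicitly */
		goto err_free;
	}

	*out = buf;
	return 0;

err_free:
	free(buf);
	return rc;
}
```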
-
Wei Yongjun authored
Remove an open coded simple_open() function and replace file operations references to the function with simple_open() instead. Generated by: scripts/coccinelle/api/simple_open.cocci Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Lorenzo Colitti authored
This allows a privileged process to filter by socket mark when dumping sockets via INET_DIAG_BY_FAMILY. This is useful on systems that use mark-based routing such as Android. The ability to filter socket marks requires CAP_NET_ADMIN, which is consistent with other privileged operations allowed by the SOCK_DIAG interface such as the ability to destroy sockets and the ability to inspect BPF filters attached to packet sockets. Tested: https://android-review.googlesource.com/261350Signed-off-by: Lorenzo Colitti <lorenzo@google.com> Acked-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Lorenzo Colitti authored
This simplifies the code a bit and also allows inet_diag_bc_audit to send to userspace an error that isn't EINVAL. Signed-off-by: Lorenzo Colitti <lorenzo@google.com> Acked-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Vivien Didelot authored
Now that the dsa_switch_driver structure contains only function pointers as it is supposed to, rename it to the more appropriate dsa_switch_ops, uniformly to any other operations structure in the kernel. No functional changes here, basically just the result of something like: s/dsa_switch_driver *drv/dsa_switch_ops *ops/g However keep the {un,}register_switch_driver functions and their dsa_switch_drivers list as is, since they represent the -- likely to be deprecated soon -- legacy DSA registration framework. In the meantime, also fix the following checks from checkpatch.pl to make it happy with this patch: CHECK: Comparison to NULL could be written "!ops" #403: FILE: net/dsa/dsa.c:470: + if (ops == NULL) { CHECK: Comparison to NULL could be written "ds->ops->get_strings" #773: FILE: net/dsa/slave.c:697: + if (ds->ops->get_strings != NULL) CHECK: Comparison to NULL could be written "ds->ops->get_ethtool_stats" #824: FILE: net/dsa/slave.c:785: + if (ds->ops->get_ethtool_stats != NULL) CHECK: Comparison to NULL could be written "ds->ops->get_sset_count" #835: FILE: net/dsa/slave.c:798: + if (ds->ops->get_sset_count != NULL) total: 0 errors, 0 warnings, 4 checks, 784 lines checked Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 24 Aug, 2016 29 commits
-
-
Yuval Mintz authored
When ndo_set_rx_mode() is called for bnx2x, as part of the process of configuring the new MAC address filters [both unicast & multicast], the driver begins by flushing the existing configuration, then iterates over the network device's list of addresses and configures those instead. This has the side-effect of creating a short gap where traffic wouldn't be properly classified, as no filters are configured in HW. While for unicasts this is rather insignificant [as unicast MACs don't frequently change while the interface is actually running], for multicast traffic it does pose an issue, as there are multicast-based networks where multicast groups are constantly being removed and added. This patch tries to remedy this [at least for the newer adapters]: instead of flushing & reconfiguring all existing multicast filters, the driver creates the approximate hash match that would result from the required filters. It then compares it against the currently configured approximate hash match, and only adds and removes the delta between those. Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
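The delta computation can be sketched in plain C as follows. This is an illustration under stated assumptions, not the bnx2x implementation: the 256-bin table size, the `mc_hash` type and the toy `mc_bin()` hash are all hypothetical, and the real hardware uses its own hash function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MC_BINS  256             /* assumed number of approximate-match bins */
#define MC_WORDS (MC_BINS / 64)

struct mc_hash {
	uint64_t bits[MC_WORDS];
};

/* Toy hash, NOT the hardware's: folds the six MAC bytes to a bin index. */
static unsigned int mc_bin(const uint8_t mac[6])
{
	unsigned int h = 0;

	for (int i = 0; i < 6; i++)
		h = h * 31u + mac[i];
	return h % MC_BINS;
}

/* Build the approximate hash match that a list of addresses would produce. */
static void mc_hash_build(struct mc_hash *h, const uint8_t (*macs)[6], size_t n)
{
	memset(h, 0, sizeof(*h));
	for (size_t i = 0; i < n; i++) {
		unsigned int bin = mc_bin(macs[i]);

		h->bits[bin / 64] |= (uint64_t)1 << (bin % 64);
	}
}

/* Only the bins that differ need to be touched: bins present in the wanted
 * set but not the current one are added, and the reverse are removed. */
static void mc_hash_delta(const struct mc_hash *cur, const struct mc_hash *want,
			  struct mc_hash *to_add, struct mc_hash *to_del)
{
	for (int w = 0; w < MC_WORDS; w++) {
		to_add->bits[w] = want->bits[w] & ~cur->bits[w];
		to_del->bits[w] = cur->bits[w] & ~want->bits[w];
	}
}

/* Count set bins (helper for inspecting a delta). */
static unsigned int mc_hash_count(const struct mc_hash *h)
{
	unsigned int n = 0;

	for (int w = 0; w < MC_WORDS; w++)
		for (uint64_t v = h->bits[w]; v; v &= v - 1)
			n++;
	return n;
}
```

Bins shared by the old and new sets are never cleared, so traffic for groups that stay subscribed keeps matching throughout the update.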
-
David S. Miller authored
Merge tag 'rxrpc-rewrite-20160824-2' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs David Howells says: ==================== rxrpc: Add better client conn management strategy These two patches add a better client connection management strategy. They need to be applied on top of the just-posted fixes. (1) Duplicate the connection list and separate out procfs iteration from garbage collection. This is necessary for the next patch as with that client connections no longer appear on a single list and may not appear on a list at all - and really don't want to be exposed to the old garbage collector. (Note that client conns aren't left dangling, they're also in a tree rooted in the local endpoint so that they can be found by a user wanting to make a new client call. Service conns do not appear in this tree.) (2) Implement a better lifetime management and garbage collection strategy for client connections. In this, a client connection can be in one of five cache states (inactive, waiting, active, culled and idle). Limits are set on the number of client conns that may be active at any one time and makes users wait if they want to start a new call when there isn't capacity available. To make capacity available, active and idle connections can be culled, after a short delay (to allow for retransmission). The delay is reduced if the capacity exceeds a tunable threshold. If there is spare capacity, client conns are permitted to hang around a fair bit longer (tunable) so as to allow reuse of negotiated security contexts. After this patch, the client conn strategy is separate from that of service conns (which continues to use the old code for the moment). This difference in strategy is because the client side retains control over when it allows a connection to become active, whereas the service side has no control over when it sees a new connection or a new call on an old connection. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Merge tag 'rxrpc-rewrite-20160824-1' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs David Howells says: ==================== rxrpc: More fixes Here are a couple of fix patches: (1) Fix the conn-based retransmission patch posted yesterday. This breaks if it actually has to retransmit. However, it seems the likelihood of this happening is really low, despite the server I'm testing against being located >3000 miles away, and some of the time it's handled in the call background processor before we manage to disconnect the call - hence why I didn't spot it. (2) /proc/net/rxrpc_calls can cause a crash if accessed whilst a call is being torn down. The window of opportunity is pretty small, however, as calls don't stay in this state for long. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Jiri Pirko says: ==================== mlxsw: Offload FDB learning configuration Ido says: This patchset addresses two long standing issues in the mlxsw driver concerning FDB learning. Patch 1 limits the number of FDB records processed by the driver in a single session. This is useful in situations in which many new records need to be processed, thereby causing the RTNL mutex to be held for long periods of time. Patches 2-6 offload the learning configuration (on / off) of bridge ports to the device instead of having the driver decide whether a record needs to be learned or not. The last patch is fallout and removes configuration no longer necessary after the first patches are applied. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Ido Schimmel authored
Before commit 99724c18 ("mlxsw: spectrum: Introduce support for router interfaces") we used to assign vFIDs to the created vPorts. Since these vPorts were used for slow path traffic we had to disable learning for them, as it doesn't make sense to have it enabled. This is no longer the case and now vPorts are either used for router interfaces (for which learning is disabled by the firmware) or bridge ports (for which learning is explicitly enabled by the driver). Therefore, we can remove the learning configuration upon vPort creation. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Ido Schimmel authored
We now offload the learning configuration to the device and don't rely on the driver to decide whether to learn the FDB record, so remove the check. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Ido Schimmel authored
Up until now we simply stored the learning configuration of a bridge port in the driver and decided whether to learn a new FDB record based on this value. However, this is sub-optimal in cases where learning is disabled on the bridge port, as the device repeatedly generates learning notifications for the same record. Instead, offload the learning configuration to the device, thereby preventing it from generating notifications when learning is disabled. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Ido Schimmel authored
We are going to prevent the device from generating learning notifications for a port that was configured with learning disabled. Since learning configuration is done per {Port, VID} we need to apply the port's learning configuration for any VID that is added to the bridge port's VLAN filter list. When a VID is added to the VLAN filter list of a VLAN-aware bridge port, configure the {Port, VID} learning status according to the port's configuration. When the VID is removed, disable learning for the {Port, VID}. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Ido Schimmel authored
When removing VLANs from the VLAN-aware bridge we shouldn't abort on the first error, as we'll otherwise have resources that will never be freed. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
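A minimal sketch of the "keep going on error" teardown pattern described above. All names are hypothetical and the failure for VID 20 is simulated. Returning the first error seen is an assumption made for the sketch, not something the commit message states.

```c
#include <assert.h>
#include <errno.h>

#define VLAN_N_VID 4096

static int released[VLAN_N_VID];

/* Hypothetical per-VLAN teardown; fails for VID 20 to simulate an error. */
static int vlan_release(unsigned int vid)
{
	if (vid == 20)
		return -EIO;
	released[vid] = 1;
	return 0;
}

/* Remove a VID range without aborting on the first failure, so that the
 * resources of the remaining VIDs are still freed. */
static int vlans_del(unsigned int vid_begin, unsigned int vid_end)
{
	int first_err = 0;

	assert(vid_begin <= vid_end && vid_end < VLAN_N_VID);
	for (unsigned int vid = vid_begin; vid <= vid_end; vid++) {
		int err = vlan_release(vid);

		if (err && !first_err)
			first_err = err; /* remember it, but keep going */
	}
	return first_err;
}
```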
-
Ido Schimmel authored
Commit 05978481 ("mlxsw: spectrum: Create PVID vPort before registering netdevice") removed __mlxsw_sp_port_vlans_del() from the init sequence of the driver, which forced it to be non-symmetric with regards to __mlxsw_sp_port_vlans_add(). Make both functions symmetric as the constraint no longer exists. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Ido Schimmel authored
Up until now a learning session ended whenever the number of queried records was zero. This turned out to be problematic in situations where a large number of MACs (48K) had to be processed by the switch driver, as RTNL mutex is held during the learning session. Instead, limit the number of FDB records that can be processed in a session to 64. This means that every time the device is queried for learning notifications (currently, every 100ms), up to 64 records will be processed by the switch driver. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
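The bounded session can be sketched as below. Only the limit of 64 comes from the message; the queue type and function name are hypothetical, and the pending notifications are reduced to a counter.

```c
#include <assert.h>
#include <stddef.h>

#define FDB_RECORDS_PER_SESSION 64 /* limit named in the commit message */

/* Hypothetical queue of pending learning notifications. */
struct fdb_queue {
	size_t pending;
};

/* Process at most one session's worth of records and return how many were
 * handled; the caller would drop the lock and poll again later. */
static size_t fdb_session(struct fdb_queue *q)
{
	size_t handled = 0;

	assert(q);
	while (q->pending > 0 && handled < FDB_RECORDS_PER_SESSION) {
		q->pending--; /* stand-in for handling one FDB record */
		handled++;
	}
	return handled;
}
```

Taking 48K as 49152 records, and assuming one session per 100 ms poll, draining the queue takes 768 polls (about 77 seconds), but the lock is held only for 64 records at a time.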
-
git://git.kernel.org/pub/scm/linux/kernel/git/leon/linux-rdma
David S. Miller authored
Saeed Mahameed says: ==================== Mellanox mlx5 core driver updates 2016-08-24 This series contains some low level and API updates for mlx5 core driver interface and mlx5_ifc.h, plus mlx5 LAG core driver support, to be shared as base code for net-next and rdma mlx5 4.9 submissions. From Alex and Artemy, Update mlx5_ifc for modify RQ and XRC bits. From Noa, Expose mlx5 link modes so they can be used in RDMA tree for rdma tools. From Aviv, LAG support needed for RDMA. - Add needed hardware structures, layouts and interface - mlx5 core driver LAG implementation - Introduce mlx5 core driver LAG API for mlx5_ib From Maor, add two low level patches for mlx5 hardware sniffer QP infrastructure bits and capabilities, plus added the namespace for sniffer steering tables. Needed for RDMA subtree. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
David Howells authored
Improve the management and caching of client rxrpc connection objects. From this point, client connections will be managed separately from service connections because AF_RXRPC controls the creation and re-use of client connections but doesn't have that luxury with service connections. Further, there will be limits on the numbers of client connections that may be live on a machine. No direct restriction will be placed on the number of client calls, excepting that each client connection can support a maximum of four concurrent calls. Note that, for a number of reasons, we don't want to simply discard a client connection as soon as the last call is apparently finished: (1) Security is negotiated per-connection and the context is then shared between all calls on that connection. The context can be negotiated again if the connection lapses, but that involves holding up calls whilst at least two packets are exchanged and various crypto bits are performed - so we'd ideally like to cache it for a little while at least. (2) If a packet goes astray, we will need to retransmit a final ACK or ABORT packet. To make this work, we need to keep around the connection details for a little while. (3) The locally held structures represent some amount of setup time, to be weighed against their occupation of memory when idle. To this end, the client connection cache is managed by a state machine on each connection. There are five states: (1) INACTIVE - The connection is not held in any list and may not have been exposed to the world. If it has been previously exposed, it was discarded from the idle list after expiring. (2) WAITING - The connection is waiting for the number of client conns to drop below the maximum capacity. Calls may be in progress upon it from when it was active and got culled. The connection is on the rxrpc_waiting_client_conns list which is kept in to-be-granted order. Culled conns with waiters go to the back of the queue just like new conns. 
(3) ACTIVE - The connection has at least one call in progress upon it, it may freely grant available channels to new calls and calls may be waiting on it for channels to become available. The connection is on the rxrpc_active_client_conns list which is kept in activation order for culling purposes. (4) CULLED - The connection got summarily culled to try and free up capacity. Calls currently in progress on the connection are allowed to continue, but new calls will have to wait. There can be no waiters in this state - the conn would have to go to the WAITING state instead. (5) IDLE - The connection has no calls in progress upon it and must have been exposed to the world (ie. the EXPOSED flag must be set). When it expires, the EXPOSED flag is cleared and the connection transitions to the INACTIVE state. The connection is on the rxrpc_idle_client_conns list which is kept in order of how soon they'll expire. A connection in the ACTIVE or CULLED state must have at least one active call upon it; if in the WAITING state it may have active calls upon it; other states may not have active calls. As long as a connection remains active and doesn't get culled, it may continue to process calls - even if there are connections on the wait queue. This simplifies things a bit and reduces the amount of checking we need do. There are a couple flags of relevance to the cache: (1) EXPOSED - The connection ID got exposed to the world. If this flag is set, an extra ref is added to the connection preventing it from being reaped when it has no calls outstanding. This flag is cleared and the ref dropped when a conn is discarded from the idle list. (2) DONT_REUSE - The connection should be discarded as soon as possible and should not be reused. This commit also provides a number of new settings: (*) /proc/net/rxrpc/max_client_conns The maximum number of live client connections. Above this number, new connections get added to the wait list and must wait for an active conn to be culled. 
Culled connections can be reused, but they will go to the back of the wait list and have to wait. (*) /proc/net/rxrpc/reap_client_conns If the number of desired connections exceeds the maximum above, the active connection list will be culled until there are only this many left in it. (*) /proc/net/rxrpc/idle_conn_expiry The normal expiry time for a client connection, provided there are fewer than reap_client_conns of them around. (*) /proc/net/rxrpc/idle_conn_fast_expiry The expedited expiry time, used when there are more than reap_client_conns of them around. Note that I combined the Tx wait queue with the channel grant wait queue to save space as only one of these should be in use at once. Note also that, for the moment, the service connection cache still uses the old connection management code. Signed-off-by: David Howells <dhowells@redhat.com>
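The five cache states and the invariants stated in this message can be written down as a small checker. This is a sketch for illustration only: the enum, struct and function names are hypothetical and do not mirror the kernel's rxrpc definitions.

```c
#include <assert.h>
#include <stdbool.h>

#define CONN_MAX_CALLS 4 /* "a maximum of four concurrent calls" */

enum conn_cache_state {
	CONN_INACTIVE,
	CONN_WAITING,
	CONN_ACTIVE,
	CONN_CULLED,
	CONN_IDLE,
};

struct client_conn {
	enum conn_cache_state state;
	unsigned int active_calls;
	bool exposed; /* models the EXPOSED flag */
};

/* Check the per-state rules on active calls given in the message. */
static bool conn_state_valid(const struct client_conn *c)
{
	if (c->active_calls > CONN_MAX_CALLS)
		return false;

	switch (c->state) {
	case CONN_ACTIVE:
	case CONN_CULLED:
		return c->active_calls >= 1; /* must have an active call */
	case CONN_WAITING:
		return true;                 /* may have active calls */
	case CONN_IDLE:
		return c->active_calls == 0 && c->exposed;
	case CONN_INACTIVE:
		return c->active_calls == 0;
	}
	return false;
}

/* IDLE -> INACTIVE on expiry: the EXPOSED flag is cleared. */
static void conn_idle_expire(struct client_conn *c)
{
	assert(c->state == CONN_IDLE);
	c->exposed = false;
	c->state = CONN_INACTIVE;
}
```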
-
David Howells authored
The main connection list is used for two independent purposes: primarily it is used to find connections to reap and secondarily it is used to list connections in procfs. Split the procfs list out from the reap list. This allows us to stop using the reap list for client connections when they acquire a separate management strategy from service connections. The client connections will not be on a single management list, and sometimes won't be on a management list at all. This doesn't leave them floating, however, as they will also be on an rb-tree rooted on the socket so that the socket can find them to dispatch calls. Signed-off-by: David Howells <dhowells@redhat.com>
-
David Howells authored
Make /proc/net/rxrpc_calls safer by stashing a copy of the peer pointer in the rxrpc_call struct and checking in the show routine that the peer pointer, the socket pointer and the local pointer obtained from the socket pointer aren't NULL before we use them. Signed-off-by: David Howells <dhowells@redhat.com>
-
David Howells authored
If a duplicate packet comes in for a call that has just completed on a connection's channel then there will be an oops in the data_ready handler because it tries to examine the connection struct via a call struct (which we don't have - the pointer is unset). Since the connection struct pointer is available to us, go direct instead. Also, the ACK packet to be retransmitted needs three octets of padding between the soft ack list and the ackinfo. Fixes: 18bfeba5 ("rxrpc: Perform terminal call ACK/ABORT retransmission from conn processor") Signed-off-by: David Howells <dhowells@redhat.com>
-
David S. Miller authored
Eric Dumazet says: ==================== net: remove clear_sk() method Since IPv6 socket lookups no longer dereference pinet6 pointer and UDP lost SLAB_DESTROY_BY_RCU special rules, we no longer need special clear_sk() methods. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eric Dumazet authored
We no longer use this handler, we can delete it. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eric Dumazet authored
Now that RCU lookups of IPv6 TCP sockets no longer dereference pinet6, we do not need tcp_v6_clear_sk() anymore. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eric Dumazet authored
Since we no longer use SLAB_DESTROY_BY_RCU for UDP, we do not need sk_prot_clear_portaddr_nulls() helper. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Eric Dumazet authored
Now that RCU lookups of IPv6 UDP sockets no longer dereference the pinet6 field, we can get rid of the udp_v6_clear_sk() helper. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David Ahern authored
This implements SOCK_DESTROY for UDP sockets similar to what was done for TCP with commit c1e64e29 ("net: diag: Support destroying TCP sockets.") A process with a UDP socket targeted for destroy is awakened and recvmsg fails with ECONNABORTED. Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Wei Yongjun authored
Use kfree_skb() instead of kfree() to free sk_buff. Fixes: 0d051bf9 ("tipc: make bearer packet filtering generic") Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Rami Rosen authored
This patch changes the return type of ena_set_push_mode() to be void, as it always returns 0. Signed-off-by: Rami Rosen <ramirose@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Merge tag 'rxrpc-rewrite-20160823-2' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs David Howells says: ==================== rxrpc: Miscellaneous improvements Here are some improvements that are part of the AF_RXRPC rewrite. They need to be applied on top of the just posted cleanups. (1) Set the connection expiry on the connection becoming idle when its last currently active call completes rather than each time put is called. This means that the connection isn't held open by retransmissions, pings and duplicate packets. Future patches will limit the number of live connections that the kernel will support, so making sure that old connections don't overstay their welcome is necessary. (2) Calculate packet serial skew in the UDP data_ready callback rather than in the call processor on a work queue. Deferring it like this causes the skew to be elevated by further packets coming in before we get to make the calculation. (3) Move retransmission of the terminal ACK or ABORT packet for a connection to the connection processor, using the terminal state cached in the rxrpc_connection struct. This means that once last_call is set in a channel to the current call's ID, no more packets will be routed to that rxrpc_call struct. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Merge tag 'rxrpc-rewrite-20160823-1' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs David Howells says: ==================== rxrpc: Cleanups Here are some cleanups for the AF_RXRPC rewrite: (1) Remove some unused bits. (2) Call releasing on socket closure is now done in the order in which calls progress through the phases so that we don't miss a call actively moving list. (3) The rxrpc_call struct's channel number field is redundant and replaced with accesses to the masked off cid field instead. (4) Use a tracepoint for socket buffer accounting rather than printks. Unfortunately, since this would require currently non-existent arch-specific help to divine the current instruction location, the accounting functions are moved out of line so that __builtin_return_address() can be used. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Wei Yongjun authored
There is an error message within devm_ioremap_resource already, so remove the dev_err call to avoid a redundant error message. Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Wei Yongjun authored
Remove the inclusion of <linux/version.h> from files that don't need it. Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Wei Yongjun authored
Fixes the following sparse warning: drivers/net/phy/xilinx_gmii2rgmii.c:61:5: warning: symbol 'xgmiitorgmii_probe' was not declared. Should it be static? Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-