- 24 Aug, 2016 17 commits
-
-
David Howells authored
Improve the management and caching of client rxrpc connection objects. From this point, client connections will be managed separately from service connections because AF_RXRPC controls the creation and re-use of client connections but doesn't have that luxury with service connections. Further, there will be limits on the numbers of client connections that may be live on a machine. No direct restriction will be placed on the number of client calls, excepting that each client connection can support a maximum of four concurrent calls. Note that, for a number of reasons, we don't want to simply discard a client connection as soon as the last call is apparently finished: (1) Security is negotiated per-connection and the context is then shared between all calls on that connection. The context can be negotiated again if the connection lapses, but that involves holding up calls whilst at least two packets are exchanged and various crypto bits are performed - so we'd ideally like to cache it for a little while at least. (2) If a packet goes astray, we will need to retransmit a final ACK or ABORT packet. To make this work, we need to keep around the connection details for a little while. (3) The locally held structures represent some amount of setup time, to be weighed against their occupation of memory when idle. To this end, the client connection cache is managed by a state machine on each connection. There are five states: (1) INACTIVE - The connection is not held in any list and may not have been exposed to the world. If it has been previously exposed, it was discarded from the idle list after expiring. (2) WAITING - The connection is waiting for the number of client conns to drop below the maximum capacity. Calls may be in progress upon it from when it was active and got culled. The connection is on the rxrpc_waiting_client_conns list which is kept in to-be-granted order. Culled conns with waiters go to the back of the queue just like new conns. (3) ACTIVE - The connection has at least one call in progress upon it, it may freely grant available channels to new calls and calls may be waiting on it for channels to become available. The connection is on the rxrpc_active_client_conns list which is kept in activation order for culling purposes. (4) CULLED - The connection got summarily culled to try and free up capacity. Calls currently in progress on the connection are allowed to continue, but new calls will have to wait. There can be no waiters in this state - the conn would have to go to the WAITING state instead. (5) IDLE - The connection has no calls in progress upon it and must have been exposed to the world (ie. the EXPOSED flag must be set). When it expires, the EXPOSED flag is cleared and the connection transitions to the INACTIVE state. The connection is on the rxrpc_idle_client_conns list which is kept in order of how soon they'll expire. A connection in the ACTIVE or CULLED state must have at least one active call upon it; if in the WAITING state it may have active calls upon it; other states may not have active calls. As long as a connection remains active and doesn't get culled, it may continue to process calls - even if there are connections on the wait queue. This simplifies things a bit and reduces the amount of checking we need do. There are a couple flags of relevance to the cache: (1) EXPOSED - The connection ID got exposed to the world. If this flag is set, an extra ref is added to the connection preventing it from being reaped when it has no calls outstanding. This flag is cleared and the ref dropped when a conn is discarded from the idle list. (2) DONT_REUSE - The connection should be discarded as soon as possible and should not be reused. This commit also provides a number of new settings: (*) /proc/net/rxrpc/max_client_conns The maximum number of live client connections. Above this number, new connections get added to the wait list and must wait for an active conn to be culled. Culled connections can be reused, but they will go to the back of the wait list and have to wait. (*) /proc/net/rxrpc/reap_client_conns If the number of desired connections exceeds the maximum above, the active connection list will be culled until there are only this many left in it. (*) /proc/net/rxrpc/idle_conn_expiry The normal expiry time for a client connection, provided there are fewer than reap_client_conns of them around. (*) /proc/net/rxrpc/idle_conn_fast_expiry The expedited expiry time, used when there are more than reap_client_conns of them around. Note that I combined the Tx wait queue with the channel grant wait queue to save space as only one of these should be in use at once. Note also that, for the moment, the service connection cache still uses the old connection management code. Signed-off-by: David Howells <dhowells@redhat.com>
-
David Howells authored
The main connection list is used for two independent purposes: primarily it is used to find connections to reap and secondarily it is used to list connections in procfs. Split the procfs list out from the reap list. This allows us to stop using the reap list for client connections when they acquire a separate management strategy from service collections. The client connections will not be on a management single list, and sometimes won't be on a management list at all. This doesn't leave them floating, however, as they will also be on an rb-tree rooted on the socket so that the socket can find them to dispatch calls. Signed-off-by: David Howells <dhowells@redhat.com>
-
David Howells authored
Make /proc/net/rxrpc_calls safer by stashing a copy of the peer pointer in the rxrpc_call struct and checking in the show routine that the peer pointer, the socket pointer and the local pointer obtained from the socket pointer aren't NULL before we use them. Signed-off-by: David Howells <dhowells@redhat.com>
-
David Howells authored
If a duplicate packet comes in for a call that has just completed on a connection's channel then there will be an oops in the data_ready handler because it tries to examine the connection struct via a call struct (which we don't have - the pointer is unset). Since the connection struct pointer is available to us, go direct instead. Also, the ACK packet to be retransmitted needs three octets of padding between the soft ack list and the ackinfo. Fixes: 18bfeba5 ("rxrpc: Perform terminal call ACK/ABORT retransmission from conn processor") Signed-off-by: David Howells <dhowells@redhat.com>
-
David Ahern authored
This implements SOCK_DESTROY for UDP sockets similar to what was done for TCP with commit c1e64e29 ("net: diag: Support destroying TCP sockets.") A process with a UDP socket targeted for destroy is awakened and recvmsg fails with ECONNABORTED. Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Wei Yongjun authored
Use kfree_skb() instead of kfree() to free sk_buff. Fixes: 0d051bf9 ("tipc: make bearer packet filtering generic") Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Rami Rosen authored
This patch changes the return type of ena_set_push_mode() to be void, as it always returns 0. Signed-off-by: Rami Rosen <ramirose@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Merge tag 'rxrpc-rewrite-20160823-2' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs David Howells says: ==================== rxrpc: Miscellaneous improvements Here are some improvements that are part of the AF_RXRPC rewrite. They need to be applied on top of the just posted cleanups. (1) Set the connection expiry on the connection becoming idle when its last currently active call completes rather than each time put is called. This means that the connection isn't held open by retransmissions, pings and duplicate packets. Future patches will limit the number of live connections that the kernel will support, so making sure that old connections don't overstay their welcome is necessary. (2) Calculate packet serial skew in the UDP data_ready callback rather than in the call processor on a work queue. Deferring it like this causes the skew to be elevated by further packets coming in before we get to make the calculation. (3) Move retransmission of the terminal ACK or ABORT packet for a connection to the connection processor, using the terminal state cached in the rxrpc_connection struct. This means that once last_call is set in a channel to the current call's ID, no more packets will be routed to that rxrpc_call struct. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Merge tag 'rxrpc-rewrite-20160823-1' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs David Howells says: ==================== rxrpc: Cleanups Here are some cleanups for the AF_RXRPC rewrite: (1) Remove some unused bits. (2) Call releasing on socket closure is now done in the order in which calls progress through the phases so that we don't miss a call actively moving list. (3) The rxrpc_call struct's channel number field is redundant and replaced with accesses to the masked off cid field instead. (4) Use a tracepoint for socket buffer accounting rather than printks. Unfortunately, since this would require currently non-existend arch-specific help to divine the current instruction location, the accounting functions are moved out of line so that __builtin_return_address() can be used. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Wei Yongjun authored
There is a error message within devm_ioremap_resource already, so remove the dev_err call to avoid redundant error message. Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Wei Yongjun authored
Remove including <linux/version.h> that don't need it. Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Wei Yongjun authored
Fixes the following sparse warning: drivers/net/phy/xilinx_gmii2rgmii.c:61:5: warning: symbol 'xgmiitorgmii_probe' was not declared. Should it be static? Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Sudarsana Reddy Kalluru authored
Add provision for configuring the fastpath queues with Tx (or Rx) only functionality. Signed-off-by: Sudarsana Reddy Kalluru <sudarsana.kalluru@qlogic.com> Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Phil Sutter authored
Since the features bit field has bits for internal only use as well, it may happen that the kernel exports RTAX_FEATURES attribute with zero value which is pointless. Fix this by making sure the attribute is added only if the exported value is non-zero. Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Hariprasad Shenai authored
When we disable SRIOV, we used to unregister the netdev but wasn't freed. But next time when the same netdev is registered, since the state was in 'NETREG_UNREGISTERED', we used to hit BUG_ON in register_netdevice, where it expects the state to be 'NETREG_UNINITIALIZED'. Alloc netdev and register them while configuring SRIOV, and free them when SRIOV is disabled. Also added a new function to setup ethernet properties instead of using ether_setup. Set carrier off by default, since we don't have to do any transmit on the interface. Fixes: 7829451c ("cxgb4: Add control net_device for configuring PCIe VF") Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Yuchung Cheng authored
TFO_SERVER_WO_SOCKOPT2 was intended for debugging purposes during Fast Open development. Remove this config option and also update/clean-up the documentation of the Fast Open sysctl. Reported-by: Piotr Jurkiewicz <piotr.jerzy.jurkiewicz@gmail.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Wei Yongjun authored
The callback function of call_rcu() just calls a kfree(), so we can use kfree_rcu() instead of call_rcu() + callback function. Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 23 Aug, 2016 23 commits
-
-
Nicholas Mc Guire authored
liquidio_set_rxcsum_command is a local function only, no need to expose it outside of lio_main.c so declare it static and make sparse happy. Signed-off-by: Nicholas Mc Guire <hofrat@osadl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Gao Feng authored
Use PPP_ALLSTATIONS, PPP_UI, and SEND_SHUTDOWN instead of 0xff, 0x03, and 2 separately. Signed-off-by: Gao Feng <fgao@ikuai8.com> Acked-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Tom Herbert says: ==================== strp: Minor fixes to strparser and kcm Fix locking issue in kcm and losing events when paused. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Tom Herbert authored
Lock the lower socket in kcm_unattach. Release during call to strp_done since that function cancels the RX timers and work queue with sync. Also added some status information in psock reporting. Signed-off-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Tom Herbert authored
When the upper layer unpauses a stream parser connection we need to queue rx_work to make sure no events are missed. Signed-off-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Stephen Hemminger says: ==================== Hyper-V network driver cleanups. The only new functionality is minor extensions to ethtool. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Stephen Hemminger authored
Printing console messages is not helpful when system is out of memory; and can be disastrous with netconsole. Instead keep statistics of these anomalous conditions. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Stephen Hemminger authored
Make netvsc on vmbus behave more like PCI. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Stephen Hemminger authored
The variable m_ret is only used in one basic block. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Stephen Hemminger authored
No caller checks the return value. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Stephen Hemminger authored
Break the different cases, code is cleaner if broken up Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Stephen Hemminger authored
Rearrange the transmit routine to eliminate goto's and unnecessary boolean variables. Use standard functions to test for vlan tag. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Stephen Hemminger authored
Move initialization to allocate where other fields are initialized. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Stephen Hemminger authored
Always returns 0 and no callers check. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Stephen Hemminger authored
Don't hard code size of array of NDIS versions. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Stephen Hemminger authored
Several new functions were introduced into hyperv.h but only used in one file. Move them and let compiler decide on inline. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Stephen Hemminger authored
Fix most of the complaints about the style of the code. Things like extra blank lines and return statements. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Stephen Hemminger authored
Better to use kcalloc rather than kzalloc and multiply for an array. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Stephen Hemminger authored
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Stephen Hemminger authored
The function get_netvsc_net_device had conditional locking. This was unnecessary, incorrect, but harmless. It was unnecessary since the code is only called from netlink netdev event callback where RTNL is always acquired before the callbacks are run. It was incorrect because of use of trylock and then continuing. Fix by replacing with proper assertion. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
git://git.kernel.org/pub/scm/linux/kernel/git/leon/linux-rdmaDavid S. Miller authored
Saeed Mahameed says: ==================== Mellanox mlx5 core driver updates 2016-08-20 This series contains several low level and API updates for mlx5 core commands interface and mlx5_ifc.h to be shared as base code for net-next and rdma mlx5 4.9 submissions. From Saeed, ten patches that refactors old layouts of firmware commands which were manually generated before we introduced the mlx5_ifc, now all of the firmware commands inbox/outbox layouts moved to use mlx5_ifc and we remove the old manually generated structures. Plus to those ten patches, we add two patches that unifies mlx5 commands execution interface and improve the driver log messages in that area. From Hadar and Ilya, added the needed hardware bits and infrastructure for minimum inline headers setting and encap/decap commands and capabilities, needed for E-Switch offloads. This series applies on top latest net-next and rdma/master, and smoothly merges with the latest "Mellanox 100G mlx5 fixes 2016-08-16" series already applied into net branch. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
David Howells authored
Perform terminal call ACK/ABORT retransmission in the connection processor rather than in the call processor. With this change, once last_call is set, no more incoming packets will be routed to the corresponding call or any earlier calls on that channel (call IDs must only increase on a channel on a connection). Further, if a packet's callNumber is before the last_call ID or a packet is aimed at successfully completed service call then that packet is discarded and ignored. Signed-off-by: David Howells <dhowells@redhat.com>
-
David Howells authored
Calculate the serial number skew in the data_ready handler when a packet has been received and a connection looked up. The skew is cached in the sk_buff's priority field. The connection highest received serial number is updated at this time also. This can be done without locks or atomic instructions because, at this point, the code is serialised by the socket. This generates more accurate skew data because if the packet is offloaded to a work queue before this is determined, more packets may come in, bumping the highest serial number and thereby increasing the apparent skew. This also removes some unnecessary atomic ops. Signed-off-by: David Howells <dhowells@redhat.com>
-