Commits · d825da2ede50160e567e666ff43c89a403bf0193 · Kirill Smelkov / linux

10 Dec, 2012 12 commits

doc: Tighten-up and clarify description of tcp_fin_timeout · d825da2e

Rick Jones authored Dec 10, 2012

The description for tcp_fin_timeout should be tighter and more clear.

In addition to being tighter, we should make the spelling of the
state name consistent with what utilities report, remove the now
dated reference to 2.2 and put the default in the consistent place.
Signed-off-by: Rick Jones <rick.jones2@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: 8390: use io{read,write}*_rep accessors · b146ecd6

Matthew Leach authored Dec 10, 2012

The {read,write}s{b,w,l} operations are not defined by all
architectures and are being removed from the asm-generic/io.h
interface.

This patch replaces the usage of these string functions in the 8390
accessors with io{read,write}{8,16,32}_rep calls instead.

Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Ben Herrenschmidt <benh@kernel.crashing.org>
Cc: David Miller <davem@davemloft.net>
Cc: netdev@vger.kernel.org
Signed-off-by: Matthew Leach <matthew@mattleach.net>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
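
As a rough userspace model of what a "_rep" accessor does (this is not the
kernel implementation; struct fake_port, fake_ioread16() and
model_ioread16_rep() are hypothetical names), the sketch below reads the
same 16-bit data port repeatedly and stores the results in consecutive
buffer slots:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a device data port backed by a FIFO. */
struct fake_port {
	const uint16_t *fifo;	/* words the "device" will return */
	size_t len;		/* number of words in the FIFO */
	size_t pos;		/* next word to hand out */
};

/* One 16-bit read from the port; each read pops the next FIFO word. */
static uint16_t fake_ioread16(struct fake_port *p)
{
	assert(p->pos < p->len);
	return p->fifo[p->pos++];
}

/*
 * Model of a "_rep" accessor: read the same port `count` times and
 * store the results in consecutive buffer slots.
 */
static void model_ioread16_rep(struct fake_port *p, uint16_t *dst, size_t count)
{
	for (size_t i = 0; i < count; i++)
		dst[i] = fake_ioread16(p);
}
```

The write direction would follow the same shape, with the loop pushing
successive buffer words to the one port.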

net: dm9000: use io{read,write}*_rep accessors · daadaf6f

Matthew Leach authored Dec 10, 2012

The {read,write}s{b,w,l} operations are not defined by all
architectures and are being removed from the asm-generic/io.h
interface.

This patch replaces the usage of these string functions in the default
DM9000 accessors with io{read,write}{8,16,32}_rep calls instead. This
is required as the dm9000 driver is in use by the blackfin
architecture which uses the asm-generic io accessors.

Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Ben Herrenschmidt <benh@kernel.crashing.org>
Cc: David Miller <davem@davemloft.net>
Cc: netdev@vger.kernel.org
Signed-off-by: Matthew Leach <matthew@mattleach.net>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: smc91x: use io{read, write}*_rep accessors instead of string functions · 4ba73aa1

Will Deacon authored Dec 10, 2012

The {read,write}s{b,w,l} operations are not defined by all architectures
and are being removed from the asm-generic/io.h interface.

This patch replaces the usage of these string functions in the default
SMC accessors with io{read,write}{8,16,32}_rep calls instead, which are
defined for all architectures.

Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Ben Herrenschmidt <benh@kernel.crashing.org>
Cc: Nicolas Pitre <nico@fluxnic.net>
Cc: netdev@vger.kernel.org
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

rtnetlink: add missing message types to selinux perm table · 6e73d71d

Cong Wang authored Dec 07, 2012

Rebased on the latest net-next tree.

RTM_NEWNETCONF and RTM_GETNETCONF are missing in this table.

Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: Allow DCBnl to use other namespaces besides init_net · 7c77ab24

John Fastabend authored Dec 09, 2012

Allow DCB and net namespace to work together. This is useful if you
have containers that are bound to 'phys' interfaces that want to
also manage their DCB attributes.

The net namespace is taken from sock_net(skb->sk) of the netlink skb.

CC: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

smsc95xx: fix async register writes on big endian platforms · 7b9e7580

Steve Glendinning authored Dec 10, 2012

This patch fixes a missing endian conversion which results in the
interface failing to come up on BE platforms.

It also removes an unnecessary pointer dereference from this
function.
Signed-off-by: Steve Glendinning <steve.glendinning@shawell.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
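
To illustrate the kind of conversion involved, here is a minimal userspace
sketch (put_le32() is a hypothetical helper, not the driver's code) that
serializes a CPU-order 32-bit value into little-endian byte order
regardless of host endianness. In the kernel this role is normally played
by helpers such as cpu_to_le32().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Store v into out[0..3] in little-endian order. Using shifts instead
 * of a memcpy() of the value makes the result independent of host
 * endianness.
 */
static void put_le32(uint8_t *out, uint32_t v)
{
	assert(out != NULL);
	out[0] = (uint8_t)(v & 0xff);
	out[1] = (uint8_t)((v >> 8) & 0xff);
	out[2] = (uint8_t)((v >> 16) & 0xff);
	out[3] = (uint8_t)((v >> 24) & 0xff);
}
```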

smsc95xx: fix register dump of last register · 96245317

Steve Glendinning authored Dec 10, 2012

This patch fixes the ethtool register dump for smsc95xx to dump
all 4 bytes of the final register (COE_CR) instead of just the
first byte.
Signed-off-by: Steve Glendinning <steve.glendinning@shawell.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

smsc75xx: only set mac address once on bind · 481705a1

Steve Glendinning authored Dec 10, 2012

This patch changes when we decide what the device's MAC address
is: once when the device is connected, instead of on every ifconfig up.

Without this patch, a manually forced device MAC is overwritten
on ifconfig down/up.  Also devices that have no EEPROM are
assigned a new random address on ifconfig down/up instead of
persisting the same one.
Signed-off-by: Steve Glendinning <steve.glendinning@shawell.net>
Reported-by: Robert Cunningham <rcunningham@nsmsurveillance.com>
Cc: Bjorn Mork <bjorn@mork.no>
Cc: Dan Williams <dcbw@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: remove obsolete simple_strto<foo> · 4b5511eb

Abhijit Pawar authored Dec 09, 2012

This patch replaces the obsolete simple_strto<foo> with kstrto<foo>.
Signed-off-by: Abhijit Pawar <abhi.c.pawar@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
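
The practical difference is strictness: the kstrto<foo> helpers return an
error code for malformed or out-of-range input and hand the value back
through a pointer. A userspace sketch of that contract, built on strtol()
(parse_long_strict() is a hypothetical name and only loosely modelled on
kstrtol(), not a copy of it):

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Returns 0 on success, -EINVAL on malformed input and -ERANGE on
 * overflow. A single trailing newline is accepted.
 */
static int parse_long_strict(const char *s, int base, long *res)
{
	char *end;
	long val;

	assert(s != NULL && res != NULL);

	if (isspace((unsigned char)s[0]))
		return -EINVAL;		/* strtol() would skip this silently */

	errno = 0;
	val = strtol(s, &end, base);
	if (end == s)
		return -EINVAL;		/* no digits consumed */
	if (errno == ERANGE)
		return -ERANGE;		/* value does not fit in a long */
	if (*end == '\n')
		end++;			/* tolerate one trailing newline */
	if (*end != '\0')
		return -EINVAL;		/* trailing garbage, e.g. "12abc" */

	*res = val;
	return 0;
}
```

A caller checks the return value before using the result, which is the
habit the conversion is meant to encourage.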

virtio_net: fix a typo in virtnet_alloc_queues() · 008d4278

Amerigo Wang authored Dec 10, 2012

Obviously it should check !vi->rq.
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
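
The message does not show the faulty line. Assuming the typo was a check
of the wrong pointer after the second allocation, the intended pattern
looks roughly like this toy sketch (struct toy_queues and
toy_alloc_queues() are hypothetical, not the driver's code):

```c
#include <assert.h>
#include <stdlib.h>

struct toy_queues {
	int *sq;	/* stand-in for the send queue array */
	int *rq;	/* stand-in for the receive queue array */
};

/* Returns 0 on success, -1 if either allocation fails. */
static int toy_alloc_queues(struct toy_queues *q, size_t n)
{
	assert(q != NULL && n > 0);

	q->sq = calloc(n, sizeof(*q->sq));
	if (!q->sq)
		goto err_sq;

	q->rq = calloc(n, sizeof(*q->rq));
	if (!q->rq)	/* check the pointer that was just allocated */
		goto err_rq;

	return 0;

err_rq:
	free(q->sq);
	q->sq = NULL;
err_sq:
	return -1;
}
```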

bridge: make buffer larger in br_setlink() · 2062cc20

Dan Carpenter authored Dec 07, 2012

We pass IFLA_BRPORT_MAX to nla_parse_nested() so we need
IFLA_BRPORT_MAX + 1 elements.  Also Smatch complains that we read past
the end of the array in br_set_port_flag() when it's called with
IFLA_BRPORT_FAST_LEAVE.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
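
The "+ 1" follows from the table being indexed by attribute type, with the
maximum type itself a valid index. A toy model of such a parse (TOY_ATTR_MAX,
struct toy_attr and toy_parse() are hypothetical and much simpler than the
real netlink code):

```c
#include <assert.h>
#include <stddef.h>

#define TOY_ATTR_MAX 3	/* hypothetical highest attribute type */

struct toy_attr {
	int type;	/* attribute type, valid range 0..TOY_ATTR_MAX */
	int value;
};

/*
 * Fill tb[] so that tb[type] points at the attribute of that type.
 * Because type TOY_ATTR_MAX itself is a valid index, the caller must
 * supply TOY_ATTR_MAX + 1 slots. Out-of-range types are skipped.
 */
static void toy_parse(const struct toy_attr *attrs, size_t n,
		      const struct toy_attr *tb[TOY_ATTR_MAX + 1])
{
	assert(attrs != NULL && tb != NULL);

	for (int t = 0; t <= TOY_ATTR_MAX; t++)
		tb[t] = NULL;
	for (size_t i = 0; i < n; i++) {
		if (attrs[i].type >= 0 && attrs[i].type <= TOY_ATTR_MAX)
			tb[attrs[i].type] = &attrs[i];
	}
}
```

With only TOY_ATTR_MAX slots, the write to tb[TOY_ATTR_MAX] would land one
element past the end of the array.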

09 Dec, 2012 11 commits

caif_usb: Make the driver name check more efficient · 65d2897c

Ben Hutchings authored Dec 07, 2012

Use the device model to get just the name, rather than using the
ethtool API to get all driver information.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

caif_usb: Check driver name before reading driver state in netdev notifier · 40663634

Ben Hutchings authored Dec 07, 2012

In cfusbl_device_notify(), the usbnet and usbdev variables are
initialised before the driver name has been checked.  In case the
device's driver is not cdc_ncm, this may result in reading beyond the
end of the netdev private area.  Move the initialisation below the
driver name check.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

virtio-net: support changing the number of queue pairs through ethtool · d73bcd2c

Jason Wang authored Dec 07, 2012

This patch implements the ethtool_{set|get}_channels method of virtio-net to
allow the user to change the number of queues on demand while the device is running.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

virtio_net: multiqueue support · 986a4f4d

Jason Wang authored Dec 07, 2012

This patch adds the multiqueue (VIRTIO_NET_F_MQ) support to virtio_net
driver. A VIRTIO_NET_F_MQ capable device allows the driver to do packet
transmission and reception through multiple queue pairs and does the packet
steering to get better performance. By default, only one queue pair is used; the
user can change the number of queue pairs with ethtool in the next patch.

When multiple queue pairs are used and the number of queue pairs is equal to the
number of vcpus, the driver does the following optimizations to implement per-cpu
virt queue pairs:

- select the txq based on the smp processor id.
- smp affinity hint to the cpu that owns the queue pairs.

This could be used with the flow steering support of the device to guarantee that
the packets of a single flow are handled by the same cpu.
Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
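
The message does not show the selection code. As a sketch of the idea only
(toy_select_txq() is a hypothetical helper, and folding with a modulo is an
assumption rather than the driver's actual logic), a per-cpu mapping can be
written as:

```c
#include <assert.h>

/*
 * Map the current CPU id to a tx queue index. With one queue pair per
 * CPU this is the identity; otherwise the index is folded back into
 * the range of available queue pairs.
 */
static unsigned int toy_select_txq(unsigned int cpu, unsigned int num_queue_pairs)
{
	assert(num_queue_pairs > 0);
	return cpu % num_queue_pairs;
}
```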

virtio-net: separate fields of sending/receiving queue from virtnet_info · e9d7417b

Jason Wang authored Dec 07, 2012

To support multiqueue transmitq/receiveq, the first step is to separate the
queue-related structures from virtnet_info. This patch introduces send_queue and
receive_queue structures and uses pointers to them as the parameter in
functions handling sending/receiving.
Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

vxlan: Add capability of Rx checksum offload for inner packet · 0afb1666

Joseph Gasparakis authored Dec 07, 2012

This patch adds capability in vxlan to identify received
checksummed inner packets and signal them to the upper layers of
the stack. The driver needs to set the skb->encapsulation bit
and also set the skb->ip_summed to CHECKSUM_UNNECESSARY.
Signed-off-by: Joseph Gasparakis <joseph.gasparakis@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

vxlan: capture inner headers during encapsulation · d6727fe3

Joseph Gasparakis authored Dec 07, 2012

Allow VXLAN to make use of Tx checksum offloading and Tx scatter-gather.
The advantage to these two changes is that it also allows the VXLAN to
make use of GSO.
Signed-off-by: Joseph Gasparakis <joseph.gasparakis@intel.com>
Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: Handle encapsulated offloads before fragmentation or handing to lower dev · fc70fb64

Alexander Duyck authored Dec 07, 2012

This change allows the VXLAN to enable Tx checksum offloading even on
devices that do not support encapsulated checksum offloads. The
advantage to this is that it allows for the lower device to change due
to routing table changes without impacting features on the VXLAN itself.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: Add support for hardware-offloaded encapsulation · 6a674e9c

Joseph Gasparakis authored Dec 07, 2012

This patch adds kernel support for offloading Tx and Rx checksumming of
encapsulated packets (such as VXLAN and IP GRE) to the NIC.

For Tx encapsulation offload, the driver will need to set the right bits
in netdev->hw_enc_features. The protocol driver will have to set the
skb->encapsulation bit and populate the inner headers, so the NIC driver will
use those inner headers to calculate the csum in hardware.

For Rx encapsulation offload, the driver will again need to set the
skb->encapsulation flag and set skb->ip_summed to CHECKSUM_UNNECESSARY.
In that case the protocol driver should push the decapsulated packet up
to the stack, again with CHECKSUM_UNNECESSARY. In either case, the protocol
driver should set the skb->encapsulation flag back to zero. Finally the
protocol driver should have NETIF_F_RXCSUM flag set in its features.
Signed-off-by: Joseph Gasparakis <joseph.gasparakis@intel.com>
Signed-off-by: Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

r8169: workaround for missing extended GigaMAC registers · 9ecb9aab

françois romieu authored Dec 07, 2012

GigaMAC registers have been reported left uninitialized in several
situations:
- after cold boot from power-off state
- after S3 resume

Tweaking rtl_hw_phy_config takes care of both.

This patch removes an excess entry (",") at the end of the exgmac_reg
array as well.
Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Signed-off-by: Wang YanQing <udknight@gmail.com>
Cc: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'tipc_net-next_v2' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux · ba501666

David S. Miller authored Dec 08, 2012

Paul Gortmaker says:

====================
Changes since v1:
	-get rid of essentially unused variable spotted by
	 Neil Horman (patch #2)

	-drop patch #3; defer it for 3.9 content, so Neil,
	 Jon and Ying can discuss its specifics at their
	 leisure while net-next is closed.  (It had no
	 direct dependencies to the rest of the series, and
	 was just an optimization)

	-fix indentation of accept() code directly in place
	 vs. forking it out to a separate function (was patch
	 #10, now patch #9).

Rebuilt and re-ran tests just to ensure nothing odd happened.

Original v1 text follows, updated pull information follows that.

           ---------

Here is another batch of TIPC changes.  The most interesting
thing is probably the non-blocking socket connect - I'm told
there were several users looking forward to seeing this.

Also there were some resource limitation changes that had
the right intent back in 2005, but were now apparently causing
needless limitations to people's real use cases; those have
been relaxed/removed.

There is a lockdep splat fix, but no need for a stable backport,
since it is virtually impossible to trigger in mainline; you
have to essentially modify code to force the probabilities
in your favour to see it.

The rest can largely be categorized as general cleanup of things
seen in the process of getting the above changes done.

Tested between 64 and 32 bit nodes with the test suite.  I've
also compile tested all the individual commits on the chain.

I'd originally figured on this queue not being ready for 3.8, but
the extended stabilization window of 3.7 has changed that.  On
the other hand, this can still be 3.9 material, if that simply
works better for folks - no problem for me to defer it to 2013.
If anyone spots any problems then I'll definitely defer it,
rather than rush a last minute respin.
===================
Signed-off-by: David S. Miller <davem@davemloft.net>

07 Dec, 2012 17 commits

tipc: refactor accept() code for improved readability · 0fef8f20

Paul Gortmaker authored Dec 04, 2012

In TIPC's accept() routine, there is a large block of code relating
to initialization of a new socket, all within an if condition checking
if the allocation succeeded.

Here, we simply flip the check of the if, so that the main execution
path stays at the same indentation level, which improves readability.
If the allocation fails, we jump to an already existing exit label.
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
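
The shape of the refactoring, in a toy form (struct toy_sock and
toy_accept() are hypothetical and unrelated to the real TIPC structures):
failure jumps to the exit label, and the initialization stays at the
function's base indentation level.

```c
#include <assert.h>
#include <stdlib.h>

struct toy_sock {
	int state;
	int peer;
};

/*
 * Guard-clause form: bail out to the exit label on failure so the
 * main path is not nested inside an "if (allocation succeeded)" block.
 * Returns 0 on success, -1 on allocation failure.
 */
static int toy_accept(int peer, struct toy_sock **out)
{
	struct toy_sock *new_sock;
	int res = -1;

	assert(out != NULL);
	*out = NULL;

	new_sock = malloc(sizeof(*new_sock));
	if (!new_sock)
		goto exit;

	/* Main path, at the function's base indentation level. */
	new_sock->state = 1;
	new_sock->peer = peer;
	*out = new_sock;
	res = 0;
exit:
	return res;
}
```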

tipc: add lock nesting notation to quiet lockdep warning · 258f8667

Ying Xue authored Dec 03, 2012

TIPC accept() call grabs the socket lock on a newly allocated
socket while holding the socket lock on an old socket. But lockdep
worries that this might be a recursive lock attempt:

  [ INFO: possible recursive locking detected ]
  ---------------------------------------------
  kworker/u:0/6 is trying to acquire lock:
  (sk_lock-AF_TIPC){+.+.+.}, at: [<c8c1226c>] accept+0x15c/0x310 [tipc]

  but task is already holding lock:
  (sk_lock-AF_TIPC){+.+.+.}, at: [<c8c12138>] accept+0x28/0x310 [tipc]

  other info that might help us debug this:
  Possible unsafe locking scenario:

          CPU0
          ----
          lock(sk_lock-AF_TIPC);
          lock(sk_lock-AF_TIPC);

          *** DEADLOCK ***

  May be due to missing lock nesting notation
  [...]

Tell lockdep that this locking is safe by using lock_sock_nested().
This is similar to what was done in commit 5131a184 for
SCTP code ("SCTP: lock_sock_nested in sctp_sock_migrate").

Also note that this isn't something that is seen normally,
as it was uncovered with some experimental work-in-progress
code not yet ready for mainline.  So no need for stable
backports or similar of this commit.
Signed-off-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>

tipc: eliminate connection setup for implied connect in recv_msg() · cbab3687

Ying Xue authored Nov 29, 2012

As connection setup is now completed asynchronously in BH context,
in the function filter_connect(), the corresponding code in recv_msg()
becomes redundant.
Signed-off-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>

tipc: introduce non-blocking socket connect · 584d24b3

Ying Xue authored Nov 29, 2012

TIPC has so far only supported blocking connect(), meaning that a call
to connect() doesn't return until either the connection is fully
established, or an error occurs. This has proved insufficient for many
users, so we now introduce non-blocking connect(), analogous to how
this is done in TCP and other protocols.

With this feature, if a connection cannot be established instantly,
connect() will return the error code "-EINPROGRESS".
If the user later calls connect() again, the return code will be
either "-EALREADY" or "-EISCONN", depending on whether the
connection has been established or not.

The user must have explicitly set the socket to be non-blocking
(SOCK_NONBLOCK or O_NONBLOCK, depending on method used), so unless
for some reason they had set this already (the socket would anyway
remain blocking in current TIPC) this change should be completely
backwards compatible.

It is also now possible to call select() or poll() to wait for the
completion of a connection.

An effect of the above is that the actual completion of a connection
may now be performed asynchronously, independent of the calls from
user space. Therefore, we now execute this code in BH context, in
the function filter_rcv(), which is executed upon reception of
messages in the socket.
Signed-off-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
[PG: minor refactoring for improved connect/disconnect function names]
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
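
From user space, the flow described above follows the usual POSIX pattern
for non-blocking connects. The sketch below is generic socket code, not
TIPC-specific: wait_connected() and connect_nonblocking() are hypothetical
helper names, and the assumption that a failed attempt is reported through
SO_ERROR in the same way as for TCP is not confirmed by the commit message.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <sys/socket.h>

/*
 * Wait until a non-blocking connect() on fd finishes.
 * Returns 0 once connected, -ETIMEDOUT on timeout, or a negative
 * errno value if the connection attempt failed.
 */
int wait_connected(int fd, int timeout_ms)
{
	struct pollfd pfd = { .fd = fd, .events = POLLOUT };
	int err = 0;
	socklen_t len = sizeof(err);
	int n;

	assert(fd >= 0);

	n = poll(&pfd, 1, timeout_ms);
	if (n < 0)
		return -errno;
	if (n == 0)
		return -ETIMEDOUT;
	/* The socket is writable: fetch the outcome of the attempt. */
	if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len) < 0)
		return -errno;
	return -err;
}

/* Start a connect() without blocking, then wait for its completion. */
int connect_nonblocking(int fd, const struct sockaddr *addr,
			socklen_t addrlen, int timeout_ms)
{
	int flags = fcntl(fd, F_GETFL, 0);

	if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0)
		return -errno;
	if (connect(fd, addr, addrlen) == 0)
		return 0;		/* connected immediately */
	if (errno != EINPROGRESS)
		return -errno;		/* immediate failure */
	return wait_connected(fd, timeout_ms);
}
```

An event loop would typically register the descriptor for POLLOUT alongside
its other descriptors instead of blocking in a dedicated poll() call.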

tipc: consolidate connection-oriented message reception in one function · 7e6c131e

Ying Xue authored Nov 29, 2012

Handling of connection-related message reception is currently scattered
around at different places in the code. This makes it harder to verify
that things are handled correctly in all possible scenarios.
So we consolidate the existing processing of connection-oriented
message reception in a single routine.  In the process, we convert the
chain of if/else into a switch/case for improved readability.

A cast on the socket_state in the switch is needed to avoid compile
warnings on 32 bit, like "net/tipc/socket.c:1252:2: warning: case value
‘4294967295’ not in enumerated type".  This happens because existing
tipc code pseudo extends the default linux socket state values with:

	#define SS_LISTENING    -1      /* socket is listening */
	#define SS_READY        -2      /* socket is connectionless */

It may make sense to add these as _positive_ values to the existing
socket state enum list someday, vs. these already existing defines.
Signed-off-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
[PG: add cast to fix warning; remove returns from middle of switch]
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
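
A self-contained sketch of the situation (all TOY_ names are hypothetical;
the enum mirrors the generic socket state list, and the two defines mirror
the TIPC ones quoted above): the negative pseudo-states are not members of
the enum, so the switch operates on an int cast of the state.

```c
/* Mirrors the generic socket state enum: non-negative values only. */
enum toy_socket_state {
	TOY_SS_FREE = 0,
	TOY_SS_UNCONNECTED,
	TOY_SS_CONNECTING,
	TOY_SS_CONNECTED,
	TOY_SS_DISCONNECTING
};

/* Extra pseudo-states that live outside the enum. */
#define TOY_SS_LISTENING	-1
#define TOY_SS_READY		-2

static const char *toy_state_name(enum toy_socket_state state)
{
	/* Cast to int so the negative case labels match as intended. */
	switch ((int)state) {
	case TOY_SS_LISTENING:
		return "listening";
	case TOY_SS_READY:
		return "ready";
	case TOY_SS_CONNECTING:
		return "connecting";
	case TOY_SS_CONNECTED:
		return "established";
	default:
		return "other";
	}
}
```

Adding the two pseudo-states to the enum as positive values, as the message
suggests, would make the cast unnecessary.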

tipc: standardize across connect/disconnect function naming · bc879117

Paul Gortmaker authored Nov 29, 2012

Currently we have tipc_disconnect and tipc_disconnect_port.  It is
not clear from the names alone, what they do or how they differ.
It turns out that tipc_disconnect just deals with the port locking
and then calls tipc_disconnect_port which does all the work.

If we rename as follows: tipc_disconnect_port --> __tipc_disconnect
then we will be following typical linux convention, where:

   __tipc_disconnect: "raw" function that does all the work.

   tipc_disconnect: wrapper that deals with locking and then calls
		    the real core __tipc_disconnect function

With this, the difference is immediately evident, and locking
violations are more apt to be spotted by chance while working on,
or even just while reading the code.

On the connect side of things, we currently only have the single
"tipc_connect2port" function.  It does both the locking at enter/exit,
and the core of the work.  Pending changes will make it desirable to
have the connect be a two part locking wrapper + worker function,
just like the disconnect is already.

Here, we make the connect look just like the updated disconnect case,
for the above reason, and for consistency.  In the process, we also
get rid of the "2port" suffix that was on the original name, since
it adds no descriptive value.

On close examination, one might notice that the above connect
changes implicitly move the call to tipc_link_get_max_pkt() to be
within the scope of tipc_port_lock() protected region; when it was
not previously.  We don't see any issues with this, and it is in
keeping with __tipc_connect doing the work and tipc_connect just
handling the locking.
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>

tipc: change sk_receive_queue upper limit · e643df15

Jon Maloy authored Nov 27, 2012

The sk_recv_queue upper limit for connectionless sockets has empirically
turned out to be too low. When we double the current limit we get much
fewer rejected messages and no noticeable negative side-effects.
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>

bonding: Fix check for ethtool get_link operation support · c772dde3

Ben Hutchings authored Dec 07, 2012

Since commit 2c60db03 ('net: provide a default dev->ethtool_ops')
all devices have a non-null ethtool_ops.  Test only
dev->ethtool_ops->get_link in both places where we care.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bridge: export multicast database via netlink · ee07c6e7

Cong Wang authored Dec 07, 2012

V5: fix two bugs pointed out by Thomas
    remove seq check for now, mark it as TODO

V4: remove some useless #include
    some coding style fix

V3: drop debugging printk's
    update selinux perm table as well

V2: drop patch 1/2, export ifindex directly
    Redesign netlink attributes
    Improve netlink seq check
    Handle IPv6 addr as well

This patch exports bridge multicast database via netlink
message type RTM_GETMDB. Similar to fdb, but currently bridge-specific.
We may need to support modifying the multicast database too (RTM_{ADD,DEL}MDB).

(Thanks to Thomas for patient reviews)

Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Thomas Graf <tgraf@suug.ch>
Cc: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Cong Wang <amwang@redhat.com>
Acked-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: doc : use more suitable word 'unexpected' to replace 'secluded' · 5d248c49

Shan Wei authored Dec 06, 2012

 'secluded' is used to describe places, not suitable here.
Suggested-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: Shan Wei <davidshan@tencent.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: phy: smsc: Fix config_init typo · 4257d583

Patrick Trantham authored Dec 06, 2012

Correct a mistake made in the previous commit due to reckless
copy-and-pasting.
Signed-off-by: Patrick Trantham <patrick.trantham@fuel7.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

drivers/net: fix up function prototypes after __dev* removals · 1dd06ae8

Greg Kroah-Hartman authored Dec 06, 2012

The __dev* removal patches for the network drivers ended up messing up
the function prototypes for a bunch of drivers.  This patch fixes all of
them back up to be properly aligned.

Bonus is that this removes almost 100 lines of code, always a nice
surprise.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

tipc: eliminate aggregate sk_receive_queue limit · 9da3d475

Ying Xue authored Nov 27, 2012

As a complement to the per-socket sk_recv_queue limit, TIPC keeps a
global atomic counter for the sum of sk_recv_queue sizes across all
tipc sockets. When incremented, the counter is compared to an upper
threshold value, and if this is reached, the message is rejected
with error code TIPC_OVERLOAD.

This check was originally meant to protect the node against
buffer exhaustion and general CPU overload. However, all experience
indicates that the feature not only is redundant on Linux, but even
harmful. Users run into the limit very often, causing disturbances
for their applications, while removing it seems to have no negative
effects at all. We have also seen that overall performance is
boosted significantly when this bottleneck is removed.

Furthermore, we don't see any other network protocols maintaining
such a mechanism, something strengthening our conviction that this
control can be eliminated.

As a result, the atomic variable tipc_queue_size is now unused
and so it can be deleted.  There is a getsockopt call that used
to allow reading it; we retain that but just return zero for
maximum compatibility.
Signed-off-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Cc: Neil Horman <nhorman@tuxdriver.com>
[PG: phase out tipc_queue_size as pointed out by Neil Horman]
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>

sctp: Add RCU protection to assoc->transport_addr_list · 45122ca2

Thomas Graf authored Dec 06, 2012

peer.transport_addr_list is currently only protected by sk_sock
which is impractical to acquire for procfs dumping purposes.

This patch adds RCU protection allowing for the procfs readers to
enter RCU read-side critical sections.

Modification of the list continues to be serialized via sk_lock.

V2: Use list_del_rcu() in sctp_association_free() to be safe
    Skip transports marked dead when dumping for procfs

Cc: Vlad Yasevich <vyasevich@gmail.com>
Cc: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

sctp: proc: protect bind_addr->address_list accesses with rcu_read_lock() · 0b0fe913

Thomas Graf authored Dec 06, 2012

address_list is protected via the socket lock or RCU. Since we don't want
to take the socket lock for each assoc we dump in procfs, an RCU read-side
critical section must be entered.

V2: Skip local addresses marked as dead

Cc: Vlad Yasevich <vyasevich@gmail.com>
Cc: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Acked-by: Vlad Yasevich <vyasevic@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next · 36f0ffa5

David S. Miller authored Dec 07, 2012

John W. Linville says:

====================
This pull request is intended for 3.8...

This includes a Bluetooth pull.  Gustavo says:

"A few more patches to 3.8, I hope they can still make it to mainline!
The most important ones are the socket option for the SCO protocol to allow
accept/refuse new connections from userspace. Other than that I added some
fixes and Andrei did more AMP work."

Also, a mac80211 pull.  Johannes says:

"If you think there's any chance this might make it still, please pull my
mac80211-next tree (per below). This contains a relatively large number
of fixes to the previous code, as well as a few small features:
 * VHT association in mac80211
 * some new debugfs files
 * P2P GO powersave configuration
 * masked MAC address verification

The biggest patch is probably the BSS struct changes to use RCU for
their IE buffers to fix potential races. I've not tagged this for stable
because it's pretty invasive and nobody has ever seen any bugs in this
area as far as I know."

Several other drivers get some attention, including ath9k, brcmfmac,
brcmsmac, and a number of others.  Also, Hauke gives us a series that
improves watchdog support for the bcma and ssb busses.  Finally, Bill
Pemberton delivers a group of "remove __dev* attributes" for wireless
drivers -- these generate some "section mismatch" warnings, but Greg
K-H assures me that they will disappear by the time -rc1 is released.

This also includes a pull of the wireless tree to avoid merge
conflicts.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

tun: correctly report an error in tun_flow_init() · b3943aef

Paul Moore authored Dec 06, 2012

On error, the error code from tun_flow_init() is lost inside
tun_set_iff(). This patch fixes this by assigning the tun_flow_init()
error code to the "err" variable, which is returned by
the tun_set_iff() function on error.
Signed-off-by: Paul Moore <pmoore@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
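
The pattern in miniature (toy_flow_init() and toy_set_iff() are hypothetical
stand-ins, not the tun driver's code): the callee's return value is stored
in err before the jump to the error path, so the caller sees the real error
code.

```c
#include <errno.h>

/* Hypothetical stand-in for an init step that can fail. */
static int toy_flow_init(int should_fail)
{
	return should_fail ? -ENOMEM : 0;
}

/*
 * Store the callee's return value in err before jumping to the
 * error path; a bare "if (toy_flow_init(...) < 0) goto err_out;"
 * would discard the code.
 */
static int toy_set_iff(int should_fail)
{
	int err;

	err = toy_flow_init(should_fail);
	if (err < 0)
		goto err_out;

	return 0;

err_out:
	/* cleanup would go here */
	return err;
}
```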
