Commits · e29ecd51de1683e6aeb88d76251f194b0311f749 · Kirill Smelkov / linux

24 Apr, 2012 9 commits

bnx2x: Update driver version to 1.72.50-0 · e29ecd51

Barak Witkowski authored Apr 23, 2012

Signed-off-by: Barak Witkowski <barak@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

e29ecd51

bnx2x: remove gro workaround · 94b2f9ba

Dmitry Kravkov authored Apr 23, 2012

Removes GRO workaround, as issue is fixed in FW 7.2.51.
Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
Signed-off-by: Barak Witkowski <barak@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

94b2f9ba

bnx2x: add afex support · a3348722

Barak Witkowski authored Apr 23, 2012

Following patch adds afex multifunction support to the driver (afex
multifunction is based on vntag header) and updates FW version used to 7.2.51.

Support includes the following:

1. Configure vif parameters in firmware (default vlan, vif id, default
   priority, allowed priorities) according to values received from NIC.
2. Configure FW to strip/add default vlan according to afex vlan mode.
3. Notify link up to OS only after vif is fully initialized.
4. Support vif list set/get requests and configure FW accordingly.
5. Supply afex statistics upon request from NIC.
6. Special handling to L2 interface in case of FCoE vif.
Signed-off-by: Barak Witkowski <barak@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

a3348722

mlx4_en: Byte Queue Limit support · 5b263f53

Yevgeny Petrilin authored Apr 23, 2012

Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>

5b263f53

mlx4_en: Moving to Interrupts for TX completions · e22979d9

Yevgeny Petrilin authored Apr 23, 2012

Moving to interrupts instead of polling fpr TX completions
Avoiding situations where skb can be held in by the driver for
a long time (till timer expires).
The change is also necessary for supporting BQL.

Removing comp_lock that was required because we could handle TX
completions from several contexts: Interrupts, timer, polling.
Now there is only interrupts
Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>

e22979d9

mlx4_en: Added Ethtool support for TX Interrupt coalescing · a19a848a

Yevgeny Petrilin authored Apr 23, 2012

Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Signed-off-by: David S. Miller <davem@davemloft.net>

a19a848a

tcp: sk_add_backlog() is too agressive for TCP · da882c1f

Eric Dumazet authored Apr 22, 2012

While investigating TCP performance problems on 10Gb+ links, we found a
tcp sender was dropping lot of incoming ACKS because of sk_rcvbuf limit
in sk_add_backlog(), especially if receiver doesnt use GRO/LRO and sends
one ACK every two MSS segments.

A sender usually tweaks sk_sndbuf, but sk_rcvbuf stays at its default
value (87380), allowing a too small backlog.

A TCP ACK, even being small, can consume nearly same truesize space than
outgoing packets. Using sk_rcvbuf + sk_sndbuf as a limit makes sense and
is fast to compute.

Performance results on netperf, single flow, receiver with disabled
GRO/LRO : 7500 Mbits instead of 6050 Mbits, no more TCPBacklogDrop
increments at sender.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Tom Herbert <therbert@google.com>
Cc: Maciej Żenczykowski <maze@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Cc: Rick Jones <rick.jones2@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

da882c1f

net: add a limit parameter to sk_add_backlog() · f545a38f

Eric Dumazet authored Apr 22, 2012

sk_add_backlog() & sk_rcvqueues_full() hard coded sk_rcvbuf as the
memory limit. We need to make this limit a parameter for TCP use.

No functional change expected in this patch, all callers still using the
old sk_rcvbuf limit.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Tom Herbert <therbert@google.com>
Cc: Maciej Żenczykowski <maze@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Cc: Rick Jones <rick.jones2@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

f545a38f

net ax25: Fix the build when sysctl support is disabled. · b9898507

Eric W. Biederman authored Apr 23, 2012

Randy Dunlap <rdunlap@xenotime.net> reported:

> On 04/23/2012 12:07 AM, Stephen Rothwell wrote:
>
>> Hi all,
>>
>> Changes since 20120420:
>
>
> include/net/ax25.h:447:75: error: expected ';' before '}' token
>
> static inline int ax25_register_dev_sysctl(ax25_dev *ax25_dev) { return 0 };
> static inline void ax25_unregister_dev_sysctl(ax25_dev *ax25_dev) {};
>
> First function:  move ';' inside braces.
> Second function:  drop the ';'.

Put the semicolons where it makes sense.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

b9898507

23 Apr, 2012 3 commits

net sysctl: Add place holder functions for when sysctl support is compiled out of the kernel. · 48c74958

Eric W. Biederman authored Apr 23, 2012

Randy Dunlap <rdunlap@xenotime.net> reported:
> On 04/23/2012 12:07 AM, Stephen Rothwell wrote:
>
>> Hi all,
>>
>> Changes since 20120420:
>
>
>
> ERROR: "unregister_net_sysctl_table" [net/phonet/phonet.ko] undefined!
> ERROR: "register_net_sysctl" [net/phonet/phonet.ko] undefined!
>
> when CONFIG_SYSCTL is not enabled.

Add static inline stub functions to gracefully handle the case when sysctl
support is not present.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Acked-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

48c74958

be2net: fix ethtool get settings · 42f11cf2

Ajit Khaparde authored Apr 21, 2012

ethtool get settings was not displaying all the settings correctly.
use the get_phy_info to get more information about the PHY to fix this.
Signed-off-by: Ajit Khaparde <ajit.khaparde@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

42f11cf2

tcp: Fix build warning after tcp_{v4,v6}_init_sock consolidation. · ac807fa8

David S. Miller authored Apr 23, 2012

net/ipv4/tcp_ipv4.c: In function 'tcp_v4_init_sock':
net/ipv4/tcp_ipv4.c:1891:19: warning: unused variable 'tp' [-Wunused-variable]
net/ipv6/tcp_ipv6.c: In function 'tcp_v6_init_sock':
net/ipv6/tcp_ipv6.c:1836:19: warning: unused variable 'tp' [-Wunused-variable]
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>

ac807fa8

21 Apr, 2012 28 commits

gianfar: add GRO support · cd754a57

Wu Jiajun-B06378 authored Apr 19, 2012

Replace netif_receive_skb with napi_gro_receive.
Signed-off-by: Jiajun Wu <b06378@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

cd754a57

af_packet: packet_getsockopt() cleanup · c06fff6e

Eric Dumazet authored Apr 19, 2012

Factorize code, since most fetched values are int type.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

c06fff6e

tcp: move duplicate code from tcp_v4_init_sock()/tcp_v6_init_sock() · 900f65d3

Neal Cardwell authored Apr 19, 2012

This commit moves the (substantial) common code shared between
tcp_v4_init_sock() and tcp_v6_init_sock() to a new address-family
independent function, tcp_init_sock().

Centralizing this functionality should help avoid drift issues,
e.g. where the IPv4 side is updated without a corresponding update to
IPv6. There was already some drift: IPv4 initialized snd_cwnd to
TCP_INIT_CWND, while the IPv6 side was still initializing snd_cwnd to
2 (in this case it should not matter, since snd_cwnd is also
initialized in tcp_init_metrics(), but the general risks and
maintenance overhead remain).

When diffing the old and new code, note that new tcp_init_sock()
function uses the order of steps from the tcp_v4_init_sock()
implementation (the order is slightly different in
tcp_v6_init_sock()).
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

900f65d3

net: allow better page reuse in splice(sock -> pipe) · e66e9a31

Eric Dumazet authored Apr 19, 2012

splice() from socket to pipe needs linear_to_page() helper to transfert
skb header to part of page.

We can reset the offset in the current sk->sk_sndmsg_page if we are the
last user of the page.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

e66e9a31

team: add per-port option for enabling/disabling ports · acd69962

Jiri Pirko authored Apr 20, 2012

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

acd69962

team: allow to enable/disable ports · 19a0b58e

Jiri Pirko authored Apr 20, 2012

This patch changes content of hashlist (used to get port struct by
computed index (0...en_port_count-1)). Now the hash list contains only
enabled ports so userspace will be able to say what ports can be used
for tx/rx. This becomes handy when userspace will need to disable ports
which does not belong to active aggregator. By default, newly added port
is enabled.
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

19a0b58e

team: lb: let userspace care about port macs · 4c78bb84

Jiri Pirko authored Apr 20, 2012

Better to leave this for userspace
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

4c78bb84

net: change big iov allocations · a74e9106

Eric Dumazet authored Apr 20, 2012

iov of more than 8 entries are allocated in sendmsg()/recvmsg() through
sock_kmalloc()

As these allocations are temporary only and small enough, it makes sense
to use plain kmalloc() and avoid sk_omem_alloc atomic overhead.

Slightly changed fast path to be even faster.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Mike Waychison <mikew@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

a74e9106

tcp: Repair connection-time negotiated parameters · b139ba4e

Pavel Emelyanov authored Apr 19, 2012

There are options, which are set up on a socket while performing
TCP handshake. Need to resurrect them on a socket while repairing.
A new sockoption accepts a buffer and parses it. The buffer should
be CODE:VALUE sequence of bytes, where CODE is standard option
code and VALUE is the respective value.

Only 4 options should be handled on repaired socket.

To read 3 out of 4 of these options the TCP_INFO sockoption can be
used. An ability to get the last one (the mss_clamp) was added by
the previous patch.

Now the restore. Three of these options -- timestamp_ok, mss_clamp
and snd_wscale -- are just restored on a coket.

The sack_ok flags has 2 issues. First, whether or not to do sacks
at all. This flag is just read and set back. No other sack  info is
saved or restored, since according to the standart and the code
dropping all sack-ed segments is OK, the sender will resubmit them
again, so after the repair we will probably experience a pause in
connection. Next, the fack bit. It's just set back on a socket if
the respective sysctl is set. No collected stats about packets flow
is preserved. As far as I see (plz, correct me if I'm wrong) the
fack-based congestion algorithm survives dropping all of the stats
and repairs itself eventually, probably losing the performance for
that period.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

b139ba4e

tcp: Report mss_clamp with TCP_MAXSEG option in repair mode · 5e6a3ce6

Pavel Emelyanov authored Apr 19, 2012

The mss_clamp is the only connection-time negotiated option which
cannot be obtained from the user space. Make the TCP_MAXSEG sockopt
report one in the repair mode.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

5e6a3ce6

tcp: Repair socket queues · c0e88ff0

Pavel Emelyanov authored Apr 19, 2012

Reading queues under repair mode is done with recvmsg call.
The queue-under-repair set by TCP_REPAIR_QUEUE option is used
to determine which queue should be read. Thus both send and
receive queue can be read with this.

Caller must pass the MSG_PEEK flag.

Writing to queues is done with sendmsg call and yet again --
the repair-queue option can be used to push data into the
receive queue.

When putting an skb into receive queue a zero tcp header is
appented to its head to address the tcp_hdr(skb)->syn and
the ->fin checks by the (after repair) tcp_recvmsg. These
flags flags are both set to zero and that's why.

The fin cannot be met in the queue while reading the source
socket, since the repair only works for closed/established
sockets and queueing fin packet always changes its state.

The syn in the queue denotes that the respective skb's seq
is "off-by-one" as compared to the actual payload lenght. Thus,
at the rcv queue refill we can just drop this flag and set the
skb's sequences to precice values.

When the repair mode is turned off, the write queue seqs are
updated so that the whole queue is considered to be 'already sent,
waiting for ACKs' (write_seq = snd_nxt <= snd_una). From the
protocol POV the send queue looks like it was sent, but the data
between the write_seq and snd_nxt is lost in the network.

This helps to avoid another sockoption for setting the snd_nxt
sequence. Leaving the whole queue in a 'not yet sent' state (as
it will be after sendmsg-s) will not allow to receive any acks
from the peer since the ack_seq will be after the snd_nxt. Thus
even the ack for the window probe will be dropped and the
connection will be 'locked' with the zero peer window.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

c0e88ff0

tcp: Initial repair mode · ee995283

Pavel Emelyanov authored Apr 19, 2012

This includes (according the the previous description):

* TCP_REPAIR sockoption

This one just puts the socket in/out of the repair mode.
Allowed for CAP_NET_ADMIN and for closed/establised sockets only.
When repair mode is turned off and the socket happens to be in
the established state the window probe is sent to the peer to
'unlock' the connection.

* TCP_REPAIR_QUEUE sockoption

This one sets the queue which we're about to repair. The
'no-queue' is set by default.

* TCP_QUEUE_SEQ socoption

Sets the write_seq/rcv_nxt of a selected repaired queue.
Allowed for TCP_CLOSE-d sockets only. When the socket changes
its state the other seq-s are changed by the kernel according
to the protocol rules (most of the existing code is actually
reused).

* Ability to forcibly bind a socket to a port

The sk->sk_reuse is set to SK_FORCE_REUSE.

* Immediate connect modification

The connect syscall initializes the connection, then directly jumps
to the code which finalizes it.

* Silent close modification

The close just aborts the connection (similar to SO_LINGER with 0
time) but without sending any FIN/RST-s to peer.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ee995283

tcp: Move code around · 370816ae

Pavel Emelyanov authored Apr 19, 2012

This is just the preparation patch, which makes the needed for
TCP repair code ready for use.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

370816ae

sock: Introduce named constants for sk_reuse · 4a17fd52

Pavel Emelyanov authored Apr 19, 2012

Name them in a "backward compatible" manner, i.e. reuse or not
are still 1 and 0 respectively. The reuse value of 2 means that
the socket with it will forcibly reuse everyone else's port.
Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

4a17fd52

drivers/net: decouple ISA and ISA_DMA_API · 59c55bdd

Arnd Bergmann authored Apr 20, 2012

The two options are separate, and some platforms (e.g. arm pxa)
have ISA slots but no ISA dma controller, so they cannot build
drivers using the DMA API functions.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

59c55bdd

sungem: use mdelay instead of udelay where necessary · 3a22d5d5

Arnd Bergmann authored Apr 20, 2012

Some architectures like ARM cannot handle large numbers as
arguments to udelay, so the drivers should use mdelay when
delaying for multiple miliseconds.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

3a22d5d5

donauboe: replace excessive udelay with msleep · 32fd32a5

Arnd Bergmann authored Apr 20, 2012

No driver should spin the CPU for 10ms, so better use
an msleep, which is allowed in the ->suspend function.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

32fd32a5

8390: select CRC32 support · 31f31204

Arnd Bergmann authored Apr 20, 2012

The ax88796 driver uses the CRC32 functions, so make sure that
they are actually enabled.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

31f31204

drivers/net: iwmc3200 depends on EXPERIMENTAL · ee29e613

Arnd Bergmann authored Apr 20, 2012

The iwmc3200 driver selects other code in Kconfig that depends on
EXPERIMENTAL. Kconfig warns about this when CONFIG_EXPERIMENTAL
is not already set, so logically, these options should also
be marked experimental or promoted to stable.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

ee29e613

caif: include linux/io.h · 6e4a7629

Arnd Bergmann authored Apr 20, 2012

The caif_shmcore requires io.h in order to use ioremap, so include that
explicitly to compile in all configurations.

Also add a note about the use of ioremap(), which is not a proper way
to map a DMA buffer into kernel space. It's not completely clear what
the intention is for using ioremap, but it is clear that the result
of ioremap must not simply be accessed using kernel pointers but
should use readl/writel or memcopy_{to,from}io. Assigning the result
of ioremap to a regular pointer that can also be set to something
else is not ok.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

6e4a7629

drivers/net: add missing __devexit_p() annotations · 65f60925

Arnd Bergmann authored Apr 20, 2012

Drivers that refer to a __devexit function in an operations
structure need to annotate that pointer with __devexit_p so
replace it with a NULL pointer when the section gets discarded.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

65f60925

davinci_cpdma: export symbols used by other drivers · 32a6d90b

Arnd Bergmann authored Apr 20, 2012

The davinci_emac driver can be a module, so the symbols
it needs from the cpdma driver must be exported.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

32a6d90b

pch_gbe: remove suspicious comment · 8e7073a3

Richard Cochran authored Apr 20, 2012

The time stamping code in this driver appears to have been copied from
the ixp4xx_eth.c driver, including this timing comment. I had actually
measured the time stamp delay on an IXP425, but I really doubt that this
value also applies here.
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

8e7073a3

pch_gbe: run the ptp bpf just once per packet · 32127a0a

Richard Cochran authored Apr 20, 2012

This patch fixes code which needlessly ran the BPF twice per
packet. Instead, we just run the classifier once and test
whether the packet is any kind of PTP event message.
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

32127a0a

pch_gbe: correct receive time stamp filtering · 358dfb6d

Takahiro Shimizu authored Apr 20, 2012

This patch fixes the driver so that multicast PTP event messages can
be recognized by the hardware time stamping unit. The station address
register must be set according to the desired transport type.

[ RC - Rebased Takahiro's changes and wrote a commit message
  explaining the changes. ]
Signed-off-by: Takahiro Shimizu <tshimizu818@gmail.com>
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

358dfb6d

pch_gbe: do not set the channel control register · a6891ac7

Takahiro Shimizu authored Apr 20, 2012

We will let the pch_gbe code do that according to the receive time stamp
filter.

[ RC - Rebased Takahiro's changes and wrote a commit message
  explaining the changes. ]
Signed-off-by: Takahiro Shimizu <tshimizu818@gmail.com>
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

a6891ac7

pch_gbe: improve coding style · 93c8acb5

Takahiro Shimizu authored Apr 20, 2012

This patch clears up a few coding style issues:

- Makes two function definitions a bit nicer looking.
- Remove unneeded parentheses.
- Simplify macros for register bits.

[ RC - Rebased Takahiro's changes and wrote a commit message
  explaining the changes. ]
Signed-off-by: Takahiro Shimizu <tshimizu818@gmail.com>
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

93c8acb5

pch_gbe: export a method to set the receive match address · 17cdedf3

Takahiro Shimizu authored Apr 20, 2012

The code in phc_gbe_main will need to call this method in order to set the
station address register according to the receive time stamping filter.

[ RC - Rebased Takahiro's changes and wrote a commit message
  explaining the changes. ]
Signed-off-by: Takahiro Shimizu <tshimizu818@gmail.com>
Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

17cdedf3