Commits · a4412a681a995d733e51cc1f6394b5db428613c3 · nexedi / linux

14 Jan, 2013 8 commits

Merge tag 'batman-adv-for-davem' of git://git.open-mesh.org/linux-merge · a4412a68

David S. Miller authored Jan 13, 2013

Included changes:
- use per_cpu_add when possible
- prevent the TT component to add multicast address as "mesh clients"
- some debug output improvements
- proper lockdeps class initializations
- new style fixes (space before/after brackets)
- other minor fixes and refactoring
Signed-off-by: David S. Miller <davem@davemloft.net>

a4412a68

ipv6: Store Router Alert option in IP6CB directly. · dd3332bf

YOSHIFUJI Hideaki / 吉藤英明 authored Jan 13, 2013

Router Alert option is very small and we can store the value
itself in the skb.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

dd3332bf

ipv6 xfrm: Use ipv6_addr_hash() in xfrm6_tunnel_spi_hash_byaddr(). · 2b464f61
YOSHIFUJI Hideaki / 吉藤英明 authored Jan 13, 2013
```
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
```
2b464f61
ipv6 route: Use ipv6_addr_hash() in rt6_info_hash_nhsfn(). · c08977bb
YOSHIFUJI Hideaki / 吉藤英明 authored Jan 13, 2013
```
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
```
c08977bb

ipv6: Make ipv6_is_mld() inline and use it from ip6_mc_input(). · daad1512

YOSHIFUJI Hideaki / 吉藤英明 authored Jan 13, 2013

Move generalized version of ipv6_is_mld() to header,
and use it from ip6_mc_input().
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

daad1512

ipv6: Use ipv6_get_dsfield() instead of ipv6_tclass(). · e7219858

YOSHIFUJI Hideaki / 吉藤英明 authored Jan 13, 2013

Commit 7a3198a8 ("ipv6: helper function to get tclass") introduced
ipv6_tclass(), but similar function is already available as
ipv6_get_dsfield().

We might be able to call ipv6_tclass() from ipv6_get_dsfield(),
but it is confusing to have two versions.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

e7219858

ipv6: Introduce ip6_flowinfo() to extract flowinfo (tclass + flowlabel). · 6502ca52
YOSHIFUJI Hideaki / 吉藤英明 authored Jan 13, 2013
```
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
```
6502ca52

ipv6: Introduce ip6_flow_hdr() to fill version, tclass and flowlabel. · 3e4e4c1f

YOSHIFUJI Hideaki / 吉藤英明 authored Jan 13, 2013

This is not only for readability but also for optimization.
What we do here is to build the 32bit word at the beginning of the ipv6
header (the "ip6_flow" virtual member of struct ip6_hdr in RFC3542) and
we do not need to read the tclass portion of the target buffer.
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

3e4e4c1f

12 Jan, 2013 17 commits

batman-adv: unbloat batadv_priv if debug is not enabled · 0c430d0d

Marek Lindner authored Dec 16, 2012

Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
Signed-off-by: Antonio Quartulli <ordex@autistici.org>

0c430d0d

batman-adv: remove unused variable from orig_node struct · 93380261

Marek Lindner authored Dec 15, 2012

Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
Signed-off-by: Antonio Quartulli <ordex@autistici.org>

93380261

batman-adv: fix typo in debug message · c0275e24

Antonio Quartulli authored Dec 11, 2012

in bat_iv_ogm.c a debug message should print "tq" instead of "td"
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>

c0275e24

batman-adv: use the const qualifier in hash functions · 467b5fe6

Antonio Quartulli authored Dec 07, 2012

The data argument in each hash function should carry the
"const" qualifier as it is never modified.
Signed-off-by: Antonio Quartulli <antonio@open-mesh.com>
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>

467b5fe6

batman-adv: don't compile the BLA switch if not requested · fa706554

Antonio Quartulli authored Nov 26, 2012

When the Bridge Loop Avoidance component is not compiled-in, its boolean switch
should be not compiled as well. This patch surrounds the switch with a proper
ifdef.

This behaviour was introduced by 9fd6b0615b5499b270d39a92b8790e206cf75833
("batman-adv: add bridge loop avoidance compile option")
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
Acked-by: Simon Wunderlich <siwu@hrz.tu-chemnitz.de>
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>

fa706554

batman-adv: remove useless NULL check · 3f87c423

Antonio Quartulli authored Nov 29, 2012

debugfs_remove_recursive() checks whether its argument is not null
on its own, therefore it is possible to remove the external check.
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>

3f87c423

batman-adv: remove useless blank lines before and after brackets · 46d160ef
Antonio Quartulli authored Dec 01, 2012
```
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
```
46d160ef

batman-adv: Initialize lockdep class keys for hashes · dec05074

Antonio Quartulli authored Nov 10, 2012

Different hashes have the same class key because they get
initialised with the same one. For this reason lockdep can create
false warning when they are used recursively.

Re-initialise the key for each hash after the invocation to hash_new()
to avoid this problem.
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
Tested-by: Linus Lüssing <linus.luessing@web.de>
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>

dec05074

batman-adv: remove useless assignment in tt_local_add() · 8425ec6a

Antonio Quartulli authored Nov 19, 2012

The flag field of the tt_local_entry->common structure in
tt_local_add() is first assigned NO_FLAGS and then TT_CLIENT_NEW so
nullifying the first operation. For this reason it is safe to remove
the first assignment.

This was introuduced by ("batman-adv: keep local table consistency for
further TT_RESPONSE")
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>

8425ec6a

batman-adv: unify and properly print hex values · 39a32991

Antonio Quartulli authored Nov 19, 2012

Values are printed in hexadecimal format in several points in the
code, but they are not printed using the same format string.

This patches unifies the format used for such numbers so that they
look the same everywhere.

Given the fact that all the variables printed as hexadecimal are 16
bit long, this is the chosen printing format: %#.4x
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>

39a32991

batman-adv: print the CRC together with the translation tables · f9d8a537

Antonio Quartulli authored Nov 19, 2012

To simplify debugging operations, it is better to print the related
CRC together with the translation table (local CRC for the local
table and global CRC for each entry in the global table)
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>

f9d8a537

batman-adv: improve local translation table output · 85766a82

Antonio Quartulli authored Nov 08, 2012

This patch adds a nice header to the local translation table and
the last_seen time for each local entry
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>

85766a82

batman-adv: reduce local TT entry timeout to 10 minutes · 7cf4d520

Antonio Quartulli authored Nov 08, 2012

The current timeout is set to one hour. However a client connected to the mesh
network will always generate traffic. In the worst case it will send ARP
requests every 4 or 5 minutes. On the other hand having a long timeout means
storing dead entries for one hour and it leads to very big trans-tables
containing useless clients.

This patch reduces the timeout to 10 minutes
Signed-off-by: Antonio Quartulli <ordex@autistici.org>
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>

7cf4d520

batman-adv: Do not add multicast MAC addresses to translation table · 02233e0c

Linus Lüssing authored Oct 17, 2012

The current translation table mechanism is not suitable for multicast
addresses and we are currently flooding such frames anyway.

Therefore this patch prevents multicast MAC addresses being added to the
translation table.
Signed-off-by: Linus Lüssing <linus.luessing@web.de>
Acked-by: Antonio Quartulli <ordex@autistici.org>
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
Signed-off-by: Antonio Quartulli <ordex@autistici.org>

02233e0c

batman-adv: use per_cpu_add helper · 56917443

Shan Wei authored Nov 13, 2012

this_cpu_add is an atomic operation.
and be more faster than per_cpu_ptr operation.
Signed-off-by: Shan Wei <davidshan@tencent.com>
Reviewed-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Marek Lindner <lindner_marek@yahoo.de>
Signed-off-by: Antonio Quartulli <ordex@autistici.org>

56917443

networking/cs89x0.txt: delete stale information about hand patching · 00494be4

Paul Gortmaker authored Jan 11, 2013

Output of a git grep happened to make me look into this file, and
I found instructions about how to hand patch (without using patch)
the driver into the kernel tree.

Since the driver has been a part of the mainline kernel for years,
we can dump this whole section.  Fortunately it doesn't even cause
a renumbering of the sections to do so.
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

00494be4

net: splice: fix __splice_segment() · 18aafc62

Eric Dumazet authored Jan 11, 2013

commit 9ca1b22d (net: splice: avoid high order page splitting)
forgot that skb->head could need a copy into several page frags.

This could be the case for loopback traffic mostly.

Also remove now useless skb argument from linear_to_page()
and __splice_segment() prototypes.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Willy Tarreau <w@1wt.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>

18aafc62

11 Jan, 2013 9 commits

gianfar: use more portable i/o accessors · fb017472

Kim Phillips authored Jan 11, 2013

in/out_be32 accessors are Power arch centric whereas
ioread/writebe32 are available in other arches.  Also, unlike
in/out_be32, ioread/writebe32 expect non-volatile address arguments.
Signed-off-by: Kim Phillips <kim.phillips@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

fb017472

ipv4: fib: fix a comment. · 28a28283

Rami Rosen authored Jan 11, 2013

In fib_frontend.c, there is a confusing comment; NETLINK_CB(skb).portid does not
refer to a pid of sending process, but rather to a netlink portid.
Signed-off-by: Rami Rosen <ramirose@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

28a28283

net: Export __netdev_pick_tx so that it can be used in modules · 87696f92

Alexander Duyck authored Jan 11, 2013

When testing with FCoE enabled we discovered that I had not exported
__netdev_pick_tx.  As a result ixgbe doesn't build with the RFC patches
applied because ixgbe_select_queue was calling the function.  This change
corrects that build issue by correctly exporting __netdev_pick_tx so it
can be used by modules.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

87696f92

tg3: missing break statement in tg3_get_5720_nvram_info() · 17e1a42f

Dan Carpenter authored Jan 11, 2013

There is a missing break statement so FLASH_5762_EEPROM_HD gets treated
like FLASH_5762_EEPROM_LD.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

17e1a42f

net: Add support for XPS without sysfs being defined · 024e9679

Alexander Duyck authored Jan 10, 2013

This patch makes it so that we can support transmit packet steering without
sysfs needing to be enabled.  The reason for making this change is to make
it so that a driver can make use of the XPS even while the sysfs portion of
the interface is not present.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

024e9679

net: Rewrite netif_set_xps_queues to address several issues · 01c5f864

Alexander Duyck authored Jan 10, 2013

This change is meant to address several issues I found within the
netif_set_xps_queues function.

If the allocation of one of the maps to be assigned to new_dev_maps failed
we could end up with the device map in an inconsistent state since we had
already worked through a number of CPUs and removed or added the queue. To
address that I split the process into several steps. The first of which is
just the allocation of updated maps for CPUs that will need larger maps to
store the queue. By doing this we can fail gracefully without actually
altering the contents of the current device map.

The second issue I found was the fact that we were always allocating a new
device map even if we were not adding any queues. I have updated the code
so that we only allocate a new device map if we are adding queues,
otherwise if we are not adding any queues to CPUs we just skip to the
removal process.

The last change I made was to reuse the code from remove_xps_queue to remove
the queue from the CPU. By making this change we can be consistent in how
we go about adding and removing the queues from the CPUs.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

01c5f864

net: Rewrite netif_reset_xps_queue to allow for better code reuse · 10cdc3f3

Alexander Duyck authored Jan 10, 2013

This patch does a minor refactor on netif_reset_xps_queue to address a few
items I noticed.

First is the fact that we are doing removal of queues in both
netif_reset_xps_queue and netif_set_xps_queue.  Since there is no need to
have the code in two places I am pushing it out into a separate function
and will come back in another patch and reuse the code in
netif_set_xps_queue.

The second item this change addresses is the fact that the Tx queues were
not getting their numa_node value cleared as a part of the XPS queue reset.
This patch resolves that by resetting the numa_node value if the dev_maps
value is set.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

10cdc3f3

net: Add functions netif_reset_xps_queue and netif_set_xps_queue · 537c00de

Alexander Duyck authored Jan 10, 2013

This patch adds two functions, netif_reset_xps_queue and
netif_set_xps_queue.  The main idea behind these two functions is to
provide a mechanism through which drivers can update their defaults in
regards to XPS.

Currently no such mechanism exists and as a result we cannot use XPS for
things such as ATR which would require a basic configuration to start in
which the Tx queues are mapped to CPUs via a 1:1 mapping.  With this change
I am making it possible for drivers such as ixgbe to be able to use the XPS
feature by controlling the default configuration.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

537c00de

net: Split core bits of netdev_pick_tx into __netdev_pick_tx · 416186fb

Alexander Duyck authored Jan 10, 2013

This change splits the core bits of netdev_pick_tx into a separate function.
The main idea behind this is to make this code accessible to select queue
functions when they decide to process the standard path instead of their
own custom path in their select queue routine.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

416186fb

10 Jan, 2013 6 commits

softirq: reduce latencies · c10d7367

Eric Dumazet authored Jan 10, 2013

In various network workloads, __do_softirq() latencies can be up
to 20 ms if HZ=1000, and 200 ms if HZ=100.

This is because we iterate 10 times in the softirq dispatcher,
and some actions can consume a lot of cycles.

This patch changes the fallback to ksoftirqd condition to :

- A time limit of 2 ms.
- need_resched() being set on current task

When one of this condition is met, we wakeup ksoftirqd for further
softirq processing if we still have pending softirqs.

Using need_resched() as the only condition can trigger RCU stalls,
as we can keep BH disabled for too long.

I ran several benchmarks and got no significant difference in
throughput, but a very significant reduction of latencies (one order
of magnitude) :

In following bench, 200 antagonist "netperf -t TCP_RR" are started in
background, using all available cpus.

Then we start one "netperf -t TCP_RR", bound to the cpu handling the NIC
IRQ (hard+soft)

Before patch :

# netperf -H 7.7.7.84 -t TCP_RR -T2,2 -- -k
RT_LATENCY,MIN_LATENCY,MAX_LATENCY,P50_LATENCY,P90_LATENCY,P99_LATENCY,MEAN_LATENCY,STDDEV_LATENCY
MIGRATED TCP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET
to 7.7.7.84 () port 0 AF_INET : first burst 0 : cpu bind
RT_LATENCY=550110.424
MIN_LATENCY=146858
MAX_LATENCY=997109
P50_LATENCY=305000
P90_LATENCY=550000
P99_LATENCY=710000
MEAN_LATENCY=376989.12
STDDEV_LATENCY=184046.92

After patch :

# netperf -H 7.7.7.84 -t TCP_RR -T2,2 -- -k
RT_LATENCY,MIN_LATENCY,MAX_LATENCY,P50_LATENCY,P90_LATENCY,P99_LATENCY,MEAN_LATENCY,STDDEV_LATENCY
MIGRATED TCP REQUEST/RESPONSE TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET
to 7.7.7.84 () port 0 AF_INET : first burst 0 : cpu bind
RT_LATENCY=40545.492
MIN_LATENCY=9834
MAX_LATENCY=78366
P50_LATENCY=33583
P90_LATENCY=59000
P99_LATENCY=69000
MEAN_LATENCY=38364.67
STDDEV_LATENCY=12865.26
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: David Miller <davem@davemloft.net>
Cc: Tom Herbert <therbert@google.com>
Cc: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

c10d7367

net_sched: more precise pkt_len computation · 1def9238

Eric Dumazet authored Jan 10, 2013

One long standing problem with TSO/GSO/GRO packets is that skb->len
doesn't represent a precise amount of bytes on wire.

Headers are only accounted for the first segment.
For TCP, thats typically 66 bytes per 1448 bytes segment missing,
an error of 4.5 % for normal MSS value.

As consequences :

1) TBF/CBQ/HTB/NETEM/... can send more bytes than the assigned limits.
2) Device stats are slightly under estimated as well.

Fix this by taking account of headers in qdisc_skb_cb(skb)->pkt_len
computation.

Packet schedulers should use qdisc pkt_len instead of skb->len for their
bandwidth limitations, and TSO enabled devices drivers could use pkt_len
if their statistics are not hardware assisted, and if they don't scratch
skb->cb[] first word.

Both egress and ingress paths work, thanks to commit fda55eca
(net: introduce skb_transport_header_was_set()) : If GRO built
a GSO packet, it also set the transport header for us.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Stephen Hemminger <shemminger@vyatta.com>
Cc: Paolo Valente <paolo.valente@unimore.it>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

1def9238

doc: Clarify behavior when sysctl tcp_ecn = 1 · 3d55b323

Vijay Subramanian authored Jan 09, 2013

Recent commit (commit 7e3a2dc5 doc: make the description of how tcp_ecn
works more explicit and clear ) clarified the behavior of tcp_ecn sysctl
variable but description is inconsistent. When requested by incoming conections,
ECN is enabled with not just tcp_ecn = 2 but also with tcp_ecn = 1.

This patch makes it clear that with tcp_ecn = 1, ECN is enabled when requested
by incoming connections.

Also fix spelling of 'incoming'.
Signed-off-by: Vijay Subramanian <subramanian.vijay@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

3d55b323

bnx2x: align define usage to satisfy static checkers · 23826850

Ariel Elior authored Jan 09, 2013

Static checkers complained that the E1H_FUNC_MAX define is used
incorrectly in bnx2x_pretend_func(). The complaint was justified,
although its not a real bug, as the first part of the conditional
protects us in this case (a real bug would happen if a VF tried to
use the pretend func, but there are no VFs in E1H chips).
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Ariel Elior <ariele@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

23826850

veth: fix a NULL deref in netif_carrier_off · 2efd32ee

Eric Dumazet authored Jan 10, 2013

In commit d0e2c55e (veth: avoid a NULL deref in veth_stats_one)
we now clear the peer pointers in veth_dellink()

veth_close() must therefore make sure the peer pointer is set.
Reported-by: Tom Parkin <tom.parkin@gmail.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

2efd32ee

cxgb4: Fix incorrect PFVF CMASK · 1f1e4958

Vipul Pandya authored Jan 09, 2013

With Hard-Wired firmware configuration it was incorrectly provisioning the VFs
Channel Access Rights Mask.
Signed-off-by: Jay Hernandez <jay@chelsio.com>
Signed-off-by: Vipul Pandya <vipul@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

1f1e4958