Commits · 4054ff45454a9a4652e29ac750a6600f6fdcc216 · Kirill Smelkov / linux

13 Apr, 2016 29 commits

netfilter: ctnetlink: remove unnecessary inlining · 4054ff45

Pablo Neira Ayuso authored Apr 12, 2016

Many of these functions are called from control plane path.  Move
ctnetlink_nlmsg_size() under CONFIG_NF_CONNTRACK_EVENTS to avoid a
compilation warning when CONFIG_NF_CONNTRACK_EVENTS=n.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

4054ff45

netfilter: x_tables: introduce and use xt_copy_counters_from_user · d7591f0c

Florian Westphal authored Apr 01, 2016

The three variants use same copy&pasted code, condense this into a
helper and use that.

Make sure info.name is 0-terminated.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

d7591f0c

netfilter: x_tables: remove obsolete check · aded9f3e

Florian Westphal authored Apr 01, 2016

Since 'netfilter: x_tables: validate targets of jumps' change we
validate that the target aligns exactly with beginning of a rule,
so offset test is now redundant.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

aded9f3e

netfilter: x_tables: remove obsolete overflow check for compat case too · 95609155

Florian Westphal authored Apr 01, 2016

commit 9e67d5a7
("[NETFILTER]: x_tables: remove obsolete overflow check") left the
compat parts alone, but we can kill it there as well.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

95609155

netfilter: x_tables: do compat validation via translate_table · 09d96860

Florian Westphal authored Apr 01, 2016

This looks like refactoring, but its also a bug fix.

Problem is that the compat path (32bit iptables, 64bit kernel) lacks a few
sanity tests that are done in the normal path.

For example, we do not check for underflows and the base chain policies.

While its possible to also add such checks to the compat path, its more
copy&pastry, for instance we cannot reuse check_underflow() helper as
e->target_offset differs in the compat case.

Other problem is that it makes auditing for validation errors harder; two
places need to be checked and kept in sync.

At a high level 32 bit compat works like this:
1- initial pass over blob:
   validate match/entry offsets, bounds checking
   lookup all matches and targets
   do bookkeeping wrt. size delta of 32/64bit structures
   assign match/target.u.kernel pointer (points at kernel
   implementation, needed to access ->compatsize etc.)

2- allocate memory according to the total bookkeeping size to
   contain the translated ruleset

3- second pass over original blob:
   for each entry, copy the 32bit representation to the newly allocated
   memory.  This also does any special match translations (e.g.
   adjust 32bit to 64bit longs, etc).

4- check if ruleset is free of loops (chase all jumps)

5-first pass over translated blob:
   call the checkentry function of all matches and targets.

The alternative implemented by this patch is to drop steps 3&4 from the
compat process, the translation is changed into an intermediate step
rather than a full 1:1 translate_table replacement.

In the 2nd pass (step #3), change the 64bit ruleset back to a kernel
representation, i.e. put() the kernel pointer and restore ->u.user.name .

This gets us a 64bit ruleset that is in the format generated by a 64bit
iptables userspace -- we can then use translate_table() to get the
'native' sanity checks.

This has two drawbacks:

1. we re-validate all the match and target entry structure sizes even
though compat translation is supposed to never generate bogus offsets.
2. we put and then re-lookup each match and target.

THe upside is that we get all sanity tests and ruleset validations
provided by the normal path and can remove some duplicated compat code.

iptables-restore time of autogenerated ruleset with 300k chains of form
-A CHAIN0001 -m limit --limit 1/s -j CHAIN0002
-A CHAIN0002 -m limit --limit 1/s -j CHAIN0003

shows no noticeable differences in restore times:
old:   0m30.796s
new:   0m31.521s
64bit: 0m25.674s
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

09d96860

netfilter: x_tables: xt_compat_match_from_user doesn't need a retval · 0188346f

Florian Westphal authored Apr 01, 2016

Always returned 0.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

0188346f

netfilter: arp_tables: simplify translate_compat_table args · 8dddd327
Florian Westphal authored Apr 01, 2016
```
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
```
8dddd327
netfilter: ip6_tables: simplify translate_compat_table args · 329a0807
Florian Westphal authored Apr 01, 2016
```
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
```
329a0807
netfilter: ip_tables: simplify translate_compat_table args · 7d3f843e
Florian Westphal authored Apr 01, 2016
```
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
```
7d3f843e

netfilter: x_tables: validate all offsets and sizes in a rule · 13631bfc

Florian Westphal authored Apr 01, 2016

Validate that all matches (if any) add up to the beginning of
the target and that each match covers at least the base structure size.

The compat path should be able to safely re-use the function
as the structures only differ in alignment; added a
BUILD_BUG_ON just in case we have an arch that adds padding as well.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

13631bfc

netfilter: x_tables: check for bogus target offset · ce683e5f

Florian Westphal authored Apr 01, 2016

We're currently asserting that targetoff + targetsize <= nextoff.

Extend it to also check that targetoff is >= sizeof(xt_entry).
Since this is generic code, add an argument pointing to the start of the
match/target, we can then derive the base structure size from the delta.

We also need the e->elems pointer in a followup change to validate matches.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

ce683e5f

netfilter: x_tables: check standard target size too · 7ed2abdd

Florian Westphal authored Apr 01, 2016

We have targets and standard targets -- the latter carries a verdict.

The ip/ip6tables validation functions will access t->verdict for the
standard targets to fetch the jump offset or verdict for chainloop
detection, but this happens before the targets get checked/validated.

Thus we also need to check for verdict presence here, else t->verdict
can point right after a blob.

Spotted with UBSAN while testing malformed blobs.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

7ed2abdd

netfilter: x_tables: add compat version of xt_check_entry_offsets · fc1221b3

Florian Westphal authored Apr 01, 2016

32bit rulesets have different layout and alignment requirements, so once
more integrity checks get added to xt_check_entry_offsets it will reject
well-formed 32bit rulesets.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

fc1221b3

netfilter: x_tables: assert minimum target size · a08e4e19

Florian Westphal authored Apr 01, 2016

The target size includes the size of the xt_entry_target struct.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

a08e4e19

netfilter: x_tables: kill check_entry helper · aa412ba2

Florian Westphal authored Apr 01, 2016

Once we add more sanity testing to xt_check_entry_offsets it
becomes relvant if we're expecting a 32bit 'config_compat' blob
or a normal one.

Since we already have a lot of similar-named functions (check_entry,
compat_check_entry, find_and_check_entry, etc.) and the current
incarnation is short just fold its contents into the callers.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

aa412ba2

netfilter: x_tables: add and use xt_check_entry_offsets · 7d35812c

Florian Westphal authored Apr 01, 2016

Currently arp/ip and ip6tables each implement a short helper to check that
the target offset is large enough to hold one xt_entry_target struct and
that t->u.target_size fits within the current rule.

Unfortunately these checks are not sufficient.

To avoid adding new tests to all of ip/ip6/arptables move the current
checks into a helper, then extend this helper in followup patches.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

7d35812c

netfilter: x_tables: validate targets of jumps · 36472341

Florian Westphal authored Apr 01, 2016

When we see a jump also check that the offset gets us to beginning of
a rule (an ipt_entry).

The extra overhead is negible, even with absurd cases.

300k custom rules, 300k jumps to 'next' user chain:
[ plus one jump from INPUT to first userchain ]:

Before:
real    0m24.874s
user    0m7.532s
sys     0m16.076s

After:
real    0m27.464s
user    0m7.436s
sys     0m18.840s
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

36472341

netfilter: x_tables: don't move to non-existent next rule · f24e230d

Florian Westphal authored Apr 01, 2016

Ben Hawkes says:

 In the mark_source_chains function (net/ipv4/netfilter/ip_tables.c) it
 is possible for a user-supplied ipt_entry structure to have a large
 next_offset field. This field is not bounds checked prior to writing a
 counter value at the supplied offset.

Base chains enforce absolute verdict.

User defined chains are supposed to end with an unconditional return,
xtables userspace adds them automatically.

But if such return is missing we will move to non-existent next rule.
Reported-by: Ben Hawkes <hawkes@google.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

f24e230d

tipc: remove remnants of old broadcast code · 7d45a04c

Jon Paul Maloy authored Apr 13, 2016

We remove a couple of leftover fields in struct tipc_bearer. Those
were used by the old broadcast implementation, and are not needed
any longer. There is no functional changes in this commit.
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

7d45a04c

Merge branch 'mediatek-stress-test-fixes' · b30d27f5

David S. Miller authored Apr 12, 2016

John Crispin says:

====================
net: mediatek: make the driver pass stress tests

While testing the driver we managed to get the TX path to stall and fail
to recover. When dual MAC support was added to the driver, the whole queue
stop/wake code was not properly adapted. There was also a regression in the
locking of the xmit function. The fact that watchdog_timeo was not set and
that the tx_timeout code failed to properly reset the dma, irq and queue
just made the mess complete.

This series make the driver pass stress testing. With this series applied
the testbed has been running for several days and still has not locked up.
We have a second setup that has a small hack patch applied to randomly stop
irqs and/or one of the queues and successfully manages to recover from these
simulated tx stalls.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

b30d27f5

net: mediatek: do not set the QID field in the TX DMA descriptors · 369f0453

John Crispin authored Apr 08, 2016

The QID field gets set to the mac id. This made the DMA linked list queue
the traffic of each MAC on a different internal queue. However during long
term testing we found that this will cause traffic stalls as the multi
queue setup requires a more complete initialisation which is not part of
the upstream driver yet.

This patch removes the code setting the QID field, resulting in all
traffic ending up in queue 0 which works without any special setup.
Signed-off-by: John Crispin <blogic@openwrt.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

369f0453

net: mediatek: move the pending_work struct to the device generic struct · 7c78b4ad

John Crispin authored Apr 08, 2016

The worker always touches both netdevs. It is ethernet core and not MAC
specific. We only need one worker, which belongs into the ethernets core
struct.
Signed-off-by: John Crispin <blogic@openwrt.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

7c78b4ad

net: mediatek: fix mtk_pending_work · e7d425dc

John Crispin authored Apr 08, 2016

The driver supports 2 MACs. Both run on the same DMA ring. If we hit a TX
timeout we need to stop both netdevs before restarting them again. If we
don't do this, mtk_stop() wont shutdown DMA and the consecutive call to
mtk_open() wont restart DMA and enable IRQs.
Signed-off-by: John Crispin <blogic@openwrt.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

e7d425dc

net: mediatek: fix TX locking · 34c2e4c9

John Crispin authored Apr 08, 2016

Inside the TX path there is a lock inside the tx_map function. This is
however too late. The patch moves the lock to the start of the xmit
function right before the free count check of the DMA ring happens.
If we do not do this, the code becomes racy leading to TX stalls and
dropped packets. This happens as there are 2 netdevs running on the
same physical DMA ring.
Signed-off-by: John Crispin <blogic@openwrt.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

34c2e4c9

net: mediatek: fix stop and wakeup of queue · 13c822f6

John Crispin authored Apr 08, 2016

The driver supports 2 MACs. Both run on the same DMA ring. If we go
above/below the TX rings threshold value, we always need to wake/stop
the queue of both devices. Not doing to can cause TX stalls and packet
drops on one of the devices.
Signed-off-by: John Crispin <blogic@openwrt.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

13c822f6

net: mediatek: remove superfluous reset call · 13439eec

John Crispin authored Apr 08, 2016

HW reset is triggered in the mtk_hw_init() function. There is no need to
also reset the core during probe.
Signed-off-by: John Crispin <blogic@openwrt.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

13439eec

net: mediatek: mtk_cal_txd_req() returns bad value · beeb4ca4

John Crispin authored Apr 08, 2016

The code used to also support the PDMA engine, which had 2 packet pointers
per descriptor. Because of this we had to divide the result by 2 and round
it up. This is no longer needed as the code only supports QDMA.
Signed-off-by: John Crispin <blogic@openwrt.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

beeb4ca4

net: mediatek: watchdog_timeo was not set · 82500aa0

John Crispin authored Apr 08, 2016

The original commit failed to set watchdog_timeo. This patch sets
watchdog_timeo to HZ.
Signed-off-by: John Crispin <blogic@openwrt.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

82500aa0

Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next · da0caadf

David S. Miller authored Apr 12, 2016

Pablo Neira Ayuso says:

====================
Netfilter updates for net-next

The following patchset contains the first batch of Netfilter updates for
your net-next tree.

1) Define pr_fmt() in nf_conntrack, from Weongyo Jeong.

2) Define and register netfilter's afinfo for the bridge family,
   this comes in preparation for native nfqueue's bridge for nft,
   from Stephane Bryant.

3) Add new attributes to store layer 2 and VLAN headers to nfqueue,
   also from Stephane Bryant.

4) Parse new NFQA_VLAN and NFQA_L2HDR nfqueue netlink attributes
   coming from userspace, from Stephane Bryant.

5) Use net->ipv6.devconf_all->hop_limit instead of hardcoded hop_limit
   in IPv6 SYNPROXY, from Liping Zhang.

6) Remove unnecessary check for dst == NULL in nf_reject_ipv6,
   from Haishuang Yan.

7) Deinline ctnetlink event report functions, from Florian Westphal.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

da0caadf

12 Apr, 2016 4 commits

netfilter: conntrack: move expectation event helper to ecache.c · ecdfb48c

Florian Westphal authored Apr 11, 2016

Not performance critical, it is only invoked when an expectation is
added/destroyed.

While at it, kill unused nf_ct_expect_event() wrapper.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

ecdfb48c

netfilter: conntrack: de-inline nf_conntrack_eventmask_report · 3c435e2e

Florian Westphal authored Apr 11, 2016

Way too large; move it to nf_conntrack_ecache.c.
Reduces total object size by 1216 byte on my machine.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

3c435e2e

Merge branch 'for-upstream' of... · 69fb7812

David S. Miller authored Apr 12, 2016

Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next

Johan Hedberg says:

====================
pull request: bluetooth-next 2016-04-12

Here's a set of Bluetooth & 802.15.4 patches intended for the 4.7 kernel:

 - Fix for race condition in vhci driver
 - Memory leak fix for ieee802154/adf7242 driver
 - Improvements to deal with single-mode (LE-only) Bluetooth controllers
 - Fix for allowing the BT_SECURITY_FIPS security level
 - New BCM2E71 ACPI ID
 - NULL pointer dereference fix fox hci_ldisc driver

Let me know if there are any issues pulling. Thanks.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

69fb7812

net: mdio: Fix lockdep falls positive splat · 9a6f2b01

Andrew Lunn authored Apr 11, 2016

MDIO devices can be stacked upon each other. The current code supports
two levels, which until recently has been enough for a DSA mdio bus on
top of another bus. Now we have hardware which has an MDIO mux in the
middle.

Define an MDIO MUTEX class with three levels.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

9a6f2b01

11 Apr, 2016 7 commits

Merge branch 'rprpc-2nd-rewrite-part-1' · 7c3da7d0

David S. Miller authored Apr 11, 2016

David Howells says:

====================
RxRPC: 2nd rewrite part 1

Okay, I'm in the process of rewriting the RxRPC rewrite.  The primary aim of
this second rewrite is to strictly control the number of active connections we
know about and to get rid of connections we don't need much more quickly.

On top of this, there are fixes to the protocol handling which will all occur
in later parts.

Here's the first set of patches from the second go, aimed at net-next.  These
are all fixes and cleanups preparatory to the main event.

Notable parts of this set include:

 (1) A fix for the AFS filesystem to wait for outstanding calls to complete
     before closing the RxRPC socket.

 (2) Differentiation of local and remote abort codes.  At a future point
     userspace will get to see this via control message data on recvmsg().

 (3) Absorb the rxkad module into the af_rxrpc module to prevent a dependency
     loop.

 (4) Create a null security module and unconditionalise calls into the
     security module that's in force (there will always be a security module
     applied to a connection, even if it's just the null one).
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

7c3da7d0

rxrpc: Create a null security type and get rid of conditional calls · e0e4d82f

David Howells authored Apr 07, 2016

Create a null security type for security index 0 and get rid of all
conditional calls to the security operations.  We expect normally to be
using security, so this should be of little negative impact.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

e0e4d82f

rxrpc: Absorb the rxkad security module · 648af7fc

David Howells authored Apr 07, 2016

Absorb the rxkad security module into the af_rxrpc module so that there's
only one module file.  This avoids a circular dependency whereby rxkad pins
af_rxrpc and cached connections pin rxkad but can't be manually evicted
(they will expire eventually and cease pinning).

With this change, af_rxrpc can just be unloaded, despite having cached
connections.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

648af7fc

rxrpc: Don't assume transport address family and size when using it · 6dd050f8

David Howells authored Apr 07, 2016

Don't assume transport address family and size when using the peer address
to send a packet.  Instead, use the start of the transport address rather
than any particular element of the union and use the transport address
length noted inside the sockaddr_rxrpc struct.

This will be necessary when IPv6 support is introduced.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

6dd050f8

rxrpc: Don't pass gfp around in incoming call handling functions · 843099ca

David Howells authored Apr 07, 2016

Don't pass gfp around in incoming call handling functions, but rather hard
code it at the points where we actually need it since the value comes from
within the rxrpc driver and is always the same.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

843099ca

rxrpc: Differentiate local and remote abort codes in structs · dc44b3a0

David Howells authored Apr 07, 2016

In the rxrpc_connection and rxrpc_call structs, there's one field to hold
the abort code, no matter whether that value was generated locally to be
sent or was received from the peer via an abort packet.

Split the abort code fields in two for cleanliness sake and add an error
field to hold the Linux error number to the rxrpc_call struct too
(sometimes this is generated in a context where we can't return it to
userspace directly).

Furthermore, add a skb mark to indicate a packet that caused a local abort
to be generated so that recvmsg() can pick up the correct abort code.  A
future addition will need to be to indicate to userspace the difference
between aborts via a control message.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

dc44b3a0

rxrpc: Static arrays of strings should be const char *const[] · 5b3e87f1

David Howells authored Apr 07, 2016

Static arrays of strings should be const char *const[].
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

5b3e87f1