Commits · 278ed676cf453ee41fe1882f872e0ec2741ee191 · nexedi / linux

02 Sep, 2016 3 commits

David S. Miller authored Sep 01, 2016

Nikolay Aleksandrov says:

====================
net: bridge: add per-port unknown multicast flood control

The first patch prepares the forwarding path by having the exact packet
type passed down so we can later filter based on it and the per-port
unknown mcast flood flag introduced in the second patch. It is similar to
how the per-port unknown unicast flood flag works.
Nice side-effects of patch 01 are the slight reduction of tests in the
fast-path and a few minor checkpatch fixes.

v3: don't change br_auto_mask as that will change user-visible behaviour
v2: make pkt_type an enum as per Stephen's comment
====================
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

278ed676

net: bridge: add per-port multicast flood flag · b6cb5ac8

Nikolay Aleksandrov authored Aug 31, 2016

Add a per-port flag to control the unknown multicast flood, similar to the
unknown unicast flood flag, and break a few long lines in the netlink flag
exports.
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
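
A minimal userspace sketch of how a per-port flag can gate flooding by packet type. The names (PORT_FLOOD, PORT_MCAST_FLOOD, should_flood) are illustrative assumptions for this sketch and are not taken from the patch itself.

```c
#include <stdbool.h>

/* Illustrative packet classes; the names are assumptions for this sketch. */
enum pkt_type { PKT_UNICAST, PKT_MULTICAST, PKT_BROADCAST };

/* Hypothetical per-port flag bits. */
#define PORT_FLOOD       (1u << 0)	/* flood unknown unicast */
#define PORT_MCAST_FLOOD (1u << 1)	/* flood unknown multicast */

struct port {
	unsigned int flags;
};

/* Decide whether a frame with no known destination is flooded to port p. */
static bool should_flood(const struct port *p, enum pkt_type type)
{
	switch (type) {
	case PKT_UNICAST:
		return (p->flags & PORT_FLOOD) != 0;
	case PKT_MULTICAST:
		return (p->flags & PORT_MCAST_FLOOD) != 0;
	case PKT_BROADCAST:
		return true;	/* broadcast is not filtered in this sketch */
	}
	return true;
}
```

With only PORT_FLOOD set, unknown unicast and broadcast are flooded to the port while unknown multicast is not.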

b6cb5ac8

net: bridge: change unicast boolean to exact pkt_type · 8addd5e7

Nikolay Aleksandrov authored Aug 31, 2016

Remove the unicast flag and introduce an exact pkt_type. That would help us
for the upcoming per-port multicast flood flag and also slightly reduce the
tests in the input fast path.
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
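
As a rough illustration of replacing a unicast boolean with an exact packet type, the destination address can be classified once and the result switched on later. This is a userspace sketch; the enum and function names are assumptions, not the kernel's.

```c
#include <stdint.h>
#include <string.h>

/* Illustrative packet classes; the names are assumptions for this sketch. */
enum pkt_type { PKT_UNICAST, PKT_MULTICAST, PKT_BROADCAST };

/* Classify a destination MAC once so later code can switch on the result. */
static enum pkt_type pkt_type_of(const uint8_t dest[6])
{
	static const uint8_t bcast[6] = { 0xff, 0xff, 0xff, 0xff, 0xff, 0xff };

	if (memcmp(dest, bcast, 6) == 0)
		return PKT_BROADCAST;
	/* The group bit is the least significant bit of the first octet. */
	if (dest[0] & 0x01)
		return PKT_MULTICAST;
	return PKT_UNICAST;
}
```

A single classification result carries more information than a boolean, so a caller does not need to re-test the address to tell multicast from broadcast.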

8addd5e7

01 Sep, 2016 30 commits

rtnetlink: fdb dump: optimize by saving last interface markers · d297653d

Roopa Prabhu authored Aug 30, 2016

fdb dumps spanning multiple skbs currently restart from the first
interface for every skb. This results in unnecessary
iterations over the already visited interfaces and their fdb
entries. In large-scale setups, we have seen this slow
down fdb dumps considerably. On a system with 30k MACs we
see fdb dumps spanning more than 300 skbs.

To fix the problem, this patch replaces the existing single fdb
marker with three markers: netdev hash entries, netdevs and fdb
index to continue where we left off instead of restarting from the
first netdev. This is consistent with link dumps.

In the process of fixing the performance issue, this patch also
re-implements the fix done by
commit 472681d5 ("net: ndo_fdb_dump should report -EMSGSIZE to rtnl_fdb_dump")
(with an internal fix from Wilson Kok) in the following ways:
- change ndo_fdb_dump handlers to return an error code instead
  of the last fdb index
- use cb->args strictly for dump frag markers and not error codes.
  This is consistent with other dump functions.

Below results were taken on a system with 1000 netdevs
and 35085 fdb entries:
before patch:
$ time bridge fdb show | wc -l
15065

real    1m11.791s
user    0m0.070s
sys     1m8.395s

(existing code does not return all macs)

after patch:
$ time bridge fdb show | wc -l
35085

real    0m2.017s
user    0m0.113s
sys     0m1.942s
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: Wilson Kok <wkok@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
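
The three-marker idea can be sketched in userspace as a resumable nested loop: one saved position per nesting level, so a dump that runs out of room continues where it stopped instead of restarting. The structures and names below (struct dump_cb, dump_chunk) are hypothetical stand-ins, not the rtnetlink code.

```c
#define NBUCKETS 2
#define MAX_DEVS 2

struct dev {
	int nfdb;			/* number of fdb entries on this device */
};

struct bucket {
	int ndevs;
	struct dev devs[MAX_DEVS];
};

/* Resume markers, one per nesting level (a stand-in for cb->args[]). */
struct dump_cb {
	int s_h, s_idx, s_fidx;
};

/*
 * Emit up to 'room' entries, continuing from the saved markers.
 * Returns the number of entries emitted; 0 means the dump is complete.
 */
static int dump_chunk(const struct bucket *tbl, struct dump_cb *cb, int room)
{
	int h, idx = 0, fidx = 0, emitted = 0;
	int s_idx = cb->s_idx, s_fidx = cb->s_fidx;

	for (h = cb->s_h; h < NBUCKETS; h++) {
		for (idx = s_idx; idx < tbl[h].ndevs; idx++) {
			for (fidx = s_fidx; fidx < tbl[h].devs[idx].nfdb; fidx++) {
				if (emitted == room)
					goto out;	/* buffer full: save position */
				emitted++;
			}
			s_fidx = 0;	/* later devices start at their first entry */
		}
		s_idx = 0;		/* later buckets start at their first device */
	}
	idx = 0;
	fidx = 0;
out:
	cb->s_h = h;
	cb->s_idx = idx;
	cb->s_fidx = fidx;
	return emitted;
}
```

A caller keeps one struct dump_cb across calls and invokes dump_chunk() until it returns 0; each entry is visited exactly once regardless of how small the per-call room is.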

d297653d

rps: flow_dissector: Add the const for the parameter of flow_keys_have_l4 · 66fdd05e

Gao Feng authored Aug 31, 2016

Add const to the parameter of flow_keys_have_l4() for readability.
Signed-off-by: Gao Feng <fgao@ikuai8.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

66fdd05e

rxrpc: Don't expose skbs to in-kernel users [ver #2] · d001648e

David Howells authored Aug 30, 2016

Don't expose skbs to in-kernel users, such as the AFS filesystem, but
instead provide a notification hook that indicates that a call needs
attention and another that indicates that there's a new call to be
collected.

This makes the following possibilities more achievable:

 (1) Call refcounting can be made simpler if skbs don't hold refs to calls.

 (2) skbs referring to non-data events will be able to be freed much sooner
     rather than being queued for AFS to pick up as rxrpc_kernel_recv_data
     will be able to consult the call state.

 (3) We can shortcut the receive phase when a call is remotely aborted
     because we don't have to go through all the packets to get to the one
     cancelling the operation.

 (4) It makes it easier to do encryption/decryption directly between AFS's
     buffers and sk_buffs.

 (5) Encryption/decryption can more easily be done in the AFS's thread
     contexts - usually that of the userspace process that issued a syscall
     - rather than in one of rxrpc's background threads on a workqueue.

 (6) AFS will be able to wait synchronously on a call inside AF_RXRPC.

To make this work, the following interface function has been added:

	int rxrpc_kernel_recv_data(
		struct socket *sock, struct rxrpc_call *call,
		void *buffer, size_t bufsize, size_t *_offset,
		bool want_more, u32 *_abort_code);

This is the recvmsg equivalent.  It allows the caller to find out about the
state of a specific call and to transfer received data into a buffer
piecemeal.

afs_extract_data() and rxrpc_kernel_recv_data() now do all the extraction
logic between them.  They don't wait synchronously yet because the socket
lock needs to be dealt with.

Five interface functions have been removed:

	rxrpc_kernel_is_data_last()
	rxrpc_kernel_get_abort_code()
	rxrpc_kernel_get_error_number()
	rxrpc_kernel_free_skb()
	rxrpc_kernel_data_consumed()

As a temporary hack, sk_buffs going to an in-kernel call are queued on the
rxrpc_call struct (->knlrecv_queue) rather than being handed over to the
in-kernel user.  To process the queue internally, a temporary function,
temp_deliver_data(), has been added.  This will be replaced with common code
between the rxrpc_recvmsg() path and the rxrpc_kernel_recv_data() path in a
future patch.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
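
The piecemeal transfer described above can be pictured with a simplified userspace analogue. Everything here (struct fake_call, recv_data, the exact return values) is an assumption made for illustration; it omits want_more and abort handling and is not the AF_RXRPC implementation.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a call with some received data queued on it. */
struct fake_call {
	const unsigned char *rx;	/* data received so far */
	size_t rx_len;			/* bytes available in rx */
	size_t rx_pos;			/* bytes already handed to the caller */
	bool rx_complete;		/* true once the last packet has arrived */
};

/*
 * Copy received data into buffer, continuing at *offset.
 * Returns -EAGAIN if the buffer is not full yet and more data may arrive,
 * -EBADMSG if the data ended before the buffer was filled,
 * 0 if the buffer is full and more data remains, and
 * 1 if the buffer is full and the call's data is exhausted.
 */
static int recv_data(struct fake_call *c, void *buffer, size_t bufsize,
		     size_t *offset)
{
	size_t want = bufsize - *offset;
	size_t avail = c->rx_len - c->rx_pos;
	size_t n = want < avail ? want : avail;

	memcpy((unsigned char *)buffer + *offset, c->rx + c->rx_pos, n);
	*offset += n;
	c->rx_pos += n;

	if (*offset < bufsize)
		return c->rx_complete ? -EBADMSG : -EAGAIN;
	if (c->rx_complete && c->rx_pos == c->rx_len)
		return 1;
	return 0;
}
```

Because the offset lives with the caller, the same buffer can be passed again after -EAGAIN and filling resumes where it stopped.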

d001648e

net: pegasus: Remove deprecated create_singlethread_workqueue · 95ac3994

Bhaktipriya Shridhar authored Aug 30, 2016

The workqueue "pegasus_workqueue" queues a single work item per pegasus
instance and hence it doesn't require execution ordering. Hence,
alloc_workqueue has been used to replace the deprecated
create_singlethread_workqueue instance.

The WQ_MEM_RECLAIM flag has been set to ensure forward progress under
memory pressure since it's a network driver.

Since there is a fixed number of work items, an explicit concurrency
limit is unnecessary here.
Signed-off-by: Bhaktipriya Shridhar <bhaktipriya96@gmail.com>
Acked-by: Tejun Heo <tj@kernel.org>
Acked-by: Petko Manolov <petkan@mip-labs.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

95ac3994

bonding: Remove deprecated create_singlethread_workqueue · f9f225eb

Bhaktipriya Shridhar authored Aug 30, 2016

alloc_ordered_workqueue() with WQ_MEM_RECLAIM set replaces the
deprecated create_singlethread_workqueue(). This is the identity
conversion.

The workqueue "wq" queues multiple work items, viz.
&bond->mcast_work, &nnw->work, &bond->mii_work, &bond->arp_work,
&bond->alb_work, &bond->ad_work, &bond->slave_arr_work,
which require strict execution ordering. Hence, an ordered dedicated
workqueue has been used.

Since it is a network driver, WQ_MEM_RECLAIM has been set to
ensure forward progress under memory pressure.
Signed-off-by: Bhaktipriya Shridhar <bhaktipriya96@gmail.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

f9f225eb

sky2: use napi_complete_done · f4b63ea0

stephen hemminger authored Aug 29, 2016

Update the sky2 driver to pass number of packets done to NAPI.
The driver was never updated when napi_complete_done was added.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
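
The general shape of a poll routine that reports its work count on completion can be sketched with mock types. The fake_napi and fake_ring structures and the fake_napi_complete_done() helper are stand-ins invented for this sketch; they only mimic the idea of passing the number of packets done, not the real NAPI API or the sky2 code.

```c
/* Mock NAPI context: records whether polling is still scheduled. */
struct fake_napi {
	int scheduled;		/* 1 while the poll routine should run again */
	int completed_work;	/* work count reported on completion */
};

/* Mock receive ring with a number of packets waiting. */
struct fake_ring {
	int pending;
};

/* Stand-in for a completion call that also takes the work count. */
static void fake_napi_complete_done(struct fake_napi *n, int work_done)
{
	n->scheduled = 0;
	n->completed_work = work_done;
}

/* Process at most 'budget' packets and complete only if the ring drained. */
static int poll_ring(struct fake_napi *n, struct fake_ring *r, int budget)
{
	int work_done = 0;

	while (work_done < budget && r->pending > 0) {
		r->pending--;		/* "receive" one packet */
		work_done++;
	}

	if (work_done < budget)
		fake_napi_complete_done(n, work_done);

	return work_done;
}
```

When the budget is exhausted the routine returns without completing, so polling stays scheduled and the remaining packets are handled on the next pass.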

f4b63ea0

l2tp: make nla_policy const · f5bb341e

stephen hemminger authored Aug 31, 2016

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

f5bb341e

tcp: make nla_policy const · 4f70c96f

stephen hemminger authored Aug 31, 2016

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

4f70c96f

ila: make nla_policy const · 6501f34f

stephen hemminger authored Aug 31, 2016

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

6501f34f

fou: make nla_policy const · 3f18ff2b

stephen hemminger authored Aug 31, 2016

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

3f18ff2b

netns: make nla_policy const · 3ee5256d

stephen hemminger authored Aug 31, 2016

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

3ee5256d

batman: make netlink attributes const · deeb91f5

stephen hemminger authored Aug 31, 2016

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

deeb91f5

drop_monitor: make genl_multicast_group const · 85bae4bd

stephen hemminger authored Aug 31, 2016

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

85bae4bd

net: make genetlink ctrl ops const · 12d8de6d

stephen hemminger authored Aug 31, 2016

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

12d8de6d

Merge branch 'stmmac-STM32F429' · 5a256347

David S. Miller authored Sep 01, 2016

Alexandre TORGUE says:

====================
Add Ethernet support on STM32F429

STM32F429 Chip embeds a Synopsys 3.50a MAC IP.

This series enhances the current stmmac driver to control it (code already
available) and adds basic glue for the STM32F429 chip.

Changes since v5:
 -Fix typo in bindings documentation patch.
 -Change clocks names in stm32-dwmac glue driver / Documentation.
 -After rebase, stm32 ethernet node is now available. It has to be updated
according to new clocks names.

Changes since v4:
 -Fix dirty copy/paste in bindings documentation patch.

Changes since v3:
 -Fix "tx-clk" and "rx-clk" as required clocks. Driver and bindings are
modified.

Changes since v2:
 -Fix alphabetic order in Kconfig and Makefile.
 -Improve code according to Joachim review.
 -Binding: remove useless entry.

Changes since v1:
 -Fix Kbuild issue in Kconfig.
 -Remove init/exit callbacks. Suspend/resume and driver removal are no
longer driven by stmmac_pltfr but handled directly in the dwmac-stm32 glue driver.
 -Take into account Joachim review.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

5a256347

net: ethernet: stmmac: add support of Synopsys 3.50a MAC IP · f9a09687

Alexandre TORGUE authored Aug 29, 2016

Adds support of Synopsys 3.50a MAC IP in stmmac driver.
Acked-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Tested-by: Maxime Coquelin <maxime.coquelin@st.com>
Signed-off-by: Alexandre TORGUE <alexandre.torgue@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

f9a09687

Documentation: Bindings: Add STM32 DWMAC glue · 99abf9d6

Alexandre TORGUE authored Aug 29, 2016

Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Alexandre TORGUE <alexandre.torgue@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

99abf9d6

net: ethernet: dwmac: add Ethernet glue logic for stm32 chip · c6eec6f3

Alexandre TORGUE authored Aug 29, 2016

stm324xx family chips support Synopsys MAC 3.50a IP.
This patch adds settings for the glue logic:
-clocks
-mode selection MII or RMII.
Reviewed-by: Joachim Eastwood <manabian@gmail.com>
Acked-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Tested-by: Maxime Coquelin <maxime.coquelin@st.com>
Signed-off-by: Alexandre TORGUE <alexandre.torgue@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

c6eec6f3

mpls: get rid of trivial returns · ce927bf1

stephen hemminger authored Sep 01, 2016

return at end of function is useless.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

ce927bf1

net: xgene: fix backward compatibility fix · ba3d0dda

Arnd Bergmann authored Aug 29, 2016

A bugfix for backward compatibility handling introduced undefined
behavior for the case that of_parse_phandle() does not return
a valid entry, as "gcc -Wmaybe-uninitialized" reports:

drivers/net/ethernet/apm/xgene/xgene_enet_hw.c: In function 'xgene_enet_phy_connect':
drivers/net/ethernet/apm/xgene/xgene_enet_hw.c:776:6: error: 'phy_dev' may be used uninitialized in this function [-Werror=maybe-uninitialized]
drivers/net/ethernet/apm/xgene/xgene_enet_hw.c: In function 'xgene_enet_mdio_config':
drivers/net/ethernet/apm/xgene/xgene_enet_hw.c:776:6: error: 'phy_dev' may be used uninitialized in this function [-Werror=maybe-uninitialized]

We can work around this by removing the check for zero "np", as
of_phy_connect() will correctly handle a NULL argument so we fall
back into the normal error handling case.

Note that I had previously fixed another bug that resulted in the
exact same warning, but this is a different problem that was
introduced after my original fix.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes: 03377e38 ("drivers: net: xgene: Fix backward compatibility")
Signed-off-by: David S. Miller <davem@davemloft.net>
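
The pattern behind the warning and the workaround can be shown in a small userspace sketch. The fake_phy type and fake_phy_connect() helper are hypothetical; the only property borrowed from the commit message is that the connect helper tolerates a NULL argument and reports failure.

```c
#include <errno.h>
#include <stddef.h>

struct fake_phy {
	int id;
};

static struct fake_phy the_phy = { 7 };

/* Stand-in for a connect helper that accepts a NULL node and returns NULL. */
static struct fake_phy *fake_phy_connect(const char *np)
{
	return np ? &the_phy : NULL;
}

/*
 * Buggy shape (not compiled here): phy_dev is only assigned inside
 * "if (np) { ... }", yet it is tested afterwards on every path.
 *
 * Fixed shape: call the helper unconditionally so phy_dev is always set.
 */
static int phy_setup(const char *np)
{
	struct fake_phy *phy_dev = fake_phy_connect(np);

	if (!phy_dev)
		return -ENODEV;	/* normal error handling covers the NULL case */

	return phy_dev->id;
}
```

Dropping the guard removes the path on which the variable was read without being written, which is what the compiler was complaining about.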

ba3d0dda

r8152: fix the coding style with checkpatch.pl · 53700f0c

hayeswang authored Sep 01, 2016

Check the coding style with checkpatch.pl and fix the warnings and errors.
Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

53700f0c

Merge branch 'asix-pm-improvements' · d00a90ca

David S. Miller authored Aug 31, 2016

Robert Foss says:

====================
net/usb: asix driver improvements

This is a resubmission of v3, since the previous submission was
not sent to the netdev mailing list.

This series improves power management of the asix driver.

 - Suspend/resume support is improved to save needed registers.
 - Device disconnection is improved.
 - Fixes AX88772x resume failures
 - Implements IEEE 802.3 spec section "22.2.4.1.1 Reset" correctly
 - Fixes AX_CMD_WRITE_MEDIUM_MODE being set incorrectly

Changes since v1:
- Added proper metadata tags to series.
- Added two more patches to series.

Changes since v2:
- Added cover letter
- Tested patches on AX88772A/AX88772B/AX88178/AX88179 hardware
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

d00a90ca

net: asix: autoneg will set WRITE_MEDIUM reg · 535baf85

Robert Foss authored Aug 29, 2016

From: Grant Grundler <grundler@chromium.org>

mii_nway_restart() causes PHY link change activity, so
ax88772_link_reset will be called. link_reset will set the
AX_CMD_WRITE_MEDIUM_MODE register correctly.

The asix_write_medium_mode in reset() fills in a default value to the register
which may be different from the negotiation result. So do this first.

Ignore the ret value since it's ignored in XXX_link_reset() functions.
Signed-off-by: Grant Grundler <grundler@google.com>
Signed-off-by: Robert Foss <robert.foss@collabora.com>
Tested-by: Robert Foss <robert.foss@collabora.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

535baf85

net: asix: see 802.3 spec for phy reset · a243c2ef

Robert Foss authored Aug 29, 2016

From: Grant Grundler <grundler@chromium.org>

https://lkml.org/lkml/2014/11/11/947

Ben Hutchings is correct. IEEE 802.3 spec section "22.2.4.1.1 Reset" requires
up to a 500 ms delay. Mitigate the "max" delay by polling the PHY until the
BMCR_RESET bit is clear.
Signed-off-by: Grant Grundler <grundler@chromium.org>
Signed-off-by: Robert Foss <robert.foss@collabora.com>
Tested-by: Robert Foss <robert.foss@collabora.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
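
A bounded polling loop of this kind can be sketched in userspace with a fake PHY. BMCR_RESET (0x8000) is the reset bit of the MII basic mode control register; the fake_phy structure, its accessors, and the 10 ms polling step are assumptions chosen for this sketch rather than details of the driver.

```c
#include <errno.h>
#include <stdint.h>

#define BMCR_RESET 0x8000	/* reset bit of the MII basic mode control register */

/* Fake PHY: reports "still resetting" for a number of reads. */
struct fake_phy {
	int busy_reads;		/* reads that will still show the reset bit */
	unsigned int slept_ms;	/* total time "slept" while polling */
};

static uint16_t fake_read_bmcr(struct fake_phy *p)
{
	if (p->busy_reads > 0) {
		p->busy_reads--;
		return BMCR_RESET;
	}
	return 0;
}

static void fake_sleep_ms(struct fake_phy *p, unsigned int ms)
{
	p->slept_ms += ms;	/* a real driver would sleep here */
}

/* Poll until the reset bit clears, giving up after 50 * 10 ms = 500 ms. */
static int wait_phy_reset(struct fake_phy *p)
{
	int i;

	for (i = 0; i < 50; i++) {
		if (!(fake_read_bmcr(p) & BMCR_RESET))
			return 0;
		fake_sleep_ms(p, 10);
	}
	return -ETIMEDOUT;
}
```

Polling returns as soon as the bit clears, so a PHY that resets quickly does not pay the full worst-case delay.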

a243c2ef

net: asix: Fix AX88772x resume failures · 4c1442aa

Robert Foss authored Aug 29, 2016

From: Allan Chou <allan@asix.com.tw>

The change fixes AX88772x resume failures by:
- restoring incorrect AX88772A PHY registers when resetting
- stopping MAC operation when suspending
- restarting MII when restoring the PHY
Signed-off-by: Allan Chou <allan@asix.com.tw>
Signed-off-by: Robert Foss <robert.foss@collabora.com>
Tested-by: Robert Foss <robert.foss@collabora.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

4c1442aa

net: asix: Avoid looping when the device is disconnected · 8a46f665

Robert Foss authored Aug 29, 2016

From: Vincent Palatin <vpalatin@chromium.org>

Check the answers from the USB stack and avoid re-sending the request
multiple times if the device has disappeared.
Signed-off-by: Vincent Palatin <vpalatin@chromium.org>
Signed-off-by: Robert Foss <robert.foss@collabora.com>
Tested-by: Robert Foss <robert.foss@collabora.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
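
The idea of stopping a retry loop once the device is gone can be sketched as follows. The fake_dev structure, fake_usb_read(), and the retry limit of 30 are assumptions for this userspace sketch, not the asix driver's code.

```c
#include <errno.h>

/* Fake device: may be unplugged, or may fail transiently a few times. */
struct fake_dev {
	int present;		/* 0 once the device has been disconnected */
	int transient_failures;	/* reads that fail with -EIO before success */
	int attempts;		/* number of requests actually sent */
};

static int fake_usb_read(struct fake_dev *d)
{
	d->attempts++;
	if (!d->present)
		return -ENODEV;
	if (d->transient_failures > 0) {
		d->transient_failures--;
		return -EIO;
	}
	return 0;
}

/* Retry transient errors, but give up immediately if the device is gone. */
static int read_with_retry(struct fake_dev *d)
{
	int i, ret = -EIO;

	for (i = 0; i < 30; i++) {
		ret = fake_usb_read(d);
		if (ret == -ENODEV)
			break;	/* retrying cannot help: device disappeared */
		if (ret >= 0)
			break;	/* success */
	}
	return ret;
}
```

For a disconnected device exactly one request is sent, instead of the full retry count.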

8a46f665

net: asix: Add in_pm parameter · d9fe64e5

Robert Foss authored Aug 29, 2016

From: Freddy Xin <freddy@asix.com.tw>

In order to R/W registers in suspend/resume functions, in_pm flags are
added to some functions to determine whether the nopm version of usb
functions is called.

Save BMCR and ANAR PHY registers in suspend function and restore them
in resume function.

Reset HW in resume function to ensure the PHY works correctly.
Signed-off-by: Freddy Xin <freddy@asix.com.tw>
Signed-off-by: Robert Foss <robert.foss@collabora.com>
Tested-by: Robert Foss <robert.foss@collabora.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

d9fe64e5

net: axienet: constify ethtool_ops structures · c7735f1b

Julia Lawall authored Sep 01, 2016

Check for ethtool_ops structures that are only stored in the ethtool_ops
field of a net_device structure or passed as the second argument to
netdev_set_default_ethtool_ops.  These contexts are declared const, so
ethtool_ops structures that have these properties can be declared as const
also.

The semantic patch that makes this change is as follows:
(http://coccinelle.lip6.fr/)

// <smpl>
@r disable optional_qualifier@
identifier i;
position p;
@@
static struct ethtool_ops i@p = { ... };

@ok1@
identifier r.i;
struct net_device e;
position p;
@@
e.ethtool_ops = &i@p;

@ok2@
identifier r.i;
expression e;
position p;
@@
netdev_set_default_ethtool_ops(e, &i@p)

@bad@
position p != {r.p,ok1.p,ok2.p};
identifier r.i;
@@
i@p

@depends on !bad disable optional_qualifier@
identifier r.i;
@@
static
+const
 struct ethtool_ops i = { ... };
// </smpl>
Suggested-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
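
The effect of the semantic patch can be illustrated with a small userspace analogue: an operations table that is only ever read through a pointer-to-const can itself be declared const. The fake_ops and fake_device types below are hypothetical and merely mirror the shape of ethtool_ops and net_device.

```c
/* Hypothetical operations table, analogous in shape to ethtool_ops. */
struct fake_ops {
	int (*get_link)(void);
};

/* The device only ever reads the table, so it holds a pointer to const. */
struct fake_device {
	const struct fake_ops *ops;
};

static int my_get_link(void)
{
	return 1;
}

/* Because every use goes through a const pointer, the table can be const. */
static const struct fake_ops my_ops = {
	.get_link = my_get_link,
};

static void fake_dev_setup(struct fake_device *d)
{
	d->ops = &my_ops;
}
```

Declaring the table const lets the compiler reject accidental writes to it and allows it to be placed in read-only data.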

c7735f1b

r8152: constify ethtool_ops structures · 407a471d

Julia Lawall authored Sep 01, 2016

Check for ethtool_ops structures that are only stored in the ethtool_ops
field of a net_device structure or passed as the second argument to
netdev_set_default_ethtool_ops.  These contexts are declared const, so
ethtool_ops structures that have these properties can be declared as const
also.

The semantic patch that makes this change is as follows:
(http://coccinelle.lip6.fr/)

// <smpl>
@r disable optional_qualifier@
identifier i;
position p;
@@
static struct ethtool_ops i@p = { ... };

@ok1@
identifier r.i;
struct net_device e;
position p;
@@
e.ethtool_ops = &i@p;

@ok2@
identifier r.i;
expression e;
position p;
@@
netdev_set_default_ethtool_ops(e, &i@p)

@bad@
position p != {r.p,ok1.p,ok2.p};
identifier r.i;
@@
i@p

@depends on !bad disable optional_qualifier@
identifier r.i;
@@
static
+const
 struct ethtool_ops i = { ... };
// </smpl>
Suggested-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>

407a471d

net: mediatek: constify ethtool_ops structures · 6a38cb15

Julia Lawall authored Sep 01, 2016

Check for ethtool_ops structures that are only stored in the ethtool_ops
field of a net_device structure or passed as the second argument to
netdev_set_default_ethtool_ops.  These contexts are declared const, so
ethtool_ops structures that have these properties can be declared as const
also.

The semantic patch that makes this change is as follows:
(http://coccinelle.lip6.fr/)

// <smpl>
@r disable optional_qualifier@
identifier i;
position p;
@@
static struct ethtool_ops i@p = { ... };

@ok1@
identifier r.i;
struct net_device e;
position p;
@@
e.ethtool_ops = &i@p;

@ok2@
identifier r.i;
expression e;
position p;
@@
netdev_set_default_ethtool_ops(e, &i@p)

@bad@
position p != {r.p,ok1.p,ok2.p};
identifier r.i;
@@
i@p

@depends on !bad disable optional_qualifier@
identifier r.i;
@@
static
+const
 struct ethtool_ops i = { ... };
// </smpl>
Suggested-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>

6a38cb15

31 Aug, 2016 7 commits

Merge branch 'ppp-recursion' · 127661a4

David S. Miller authored Aug 31, 2016

Guillaume Nault says:

====================
ppp: fix deadlock upon recursive xmit

This series fixes the issue reported by Feng where packets looping
through a ppp device make the module deadlock:
https://marc.info/?l=linux-netdev&m=147134567319038&w=2

The problem can occur on virtual interfaces (e.g. PPP over L2TP, or
PPPoE on vxlan devices), when a PPP packet is routed back to the PPP
interface.

PPP's xmit path isn't reentrant, so patch #1 uses a per-cpu variable
to detect and break recursion. Patch #2 sets the NETIF_F_LLTX flag to
avoid lock inversion issues between ppp and txqueue locks.

There are multiple entry points to the PPP xmit path. This series has
been tested with lockdep and should address recursion issues no matter
how the packet entered the path.

A similar issue in L2TP is not covered by this series:
l2tp_xmit_skb() also isn't reentrant, and it can be called as part of
PPP's xmit path (pppol2tp_xmit()), or directly from the L2TP socket
(l2tp_ppp_sendmsg()). If a packet is sent by l2tp_ppp_sendmsg() and
routed to the parent PPP interface, then it's going to hit
l2tp_xmit_skb() again.

Breaking recursion as done in ppp_generic is not enough, because we'd
still have a lock inversion issue (locking in l2tp_xmit_skb() can
happen before or after locking in ppp_generic). The best approach would
be to use the ip_tunnel functions and remove the socket locking in
l2tp_xmit_skb(). But that'd be something for net-next.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
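
The recursion guard described for patch #1 can be approximated in userspace with a thread-local flag standing in for the per-cpu variable. All names here (xmit_recursion, dev_xmit, xmit_locked, struct pkt) and the choice of -ELOOP are assumptions for this sketch; the real code also takes locks that are omitted here.

```c
#include <errno.h>

/* Userspace stand-in for a per-CPU flag: one instance per thread (C11). */
static _Thread_local int xmit_recursion;

/* Count of packets dropped to break a loop. */
static int dropped;

struct pkt {
	int loops_back;	/* nonzero: routing sends it to the same device again */
};

static int dev_xmit(struct pkt *p);

/* The non-reentrant part of the path; it may route the packet back to us. */
static int xmit_locked(struct pkt *p)
{
	if (p->loops_back)
		return dev_xmit(p);
	return 0;
}

/* Entry point: detect re-entry on the same thread and drop the packet. */
static int dev_xmit(struct pkt *p)
{
	int ret;

	if (xmit_recursion) {
		dropped++;
		return -ELOOP;	/* break the recursion instead of deadlocking */
	}

	xmit_recursion++;
	ret = xmit_locked(p);
	xmit_recursion--;

	return ret;
}
```

A packet that is routed back into the same path is dropped on re-entry, and the flag is cleared again on the way out so later transmissions proceed normally.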

127661a4

ppp: declare PPP devices as LLTX · 07712770

Guillaume Nault authored Aug 27, 2016

ppp_xmit_process() already locks the xmit path. If HARD_TX_LOCK() tries
to hold the _xmit_lock we can get lock inversion.

[  973.726130] ======================================================
[  973.727311] [ INFO: possible circular locking dependency detected ]
[  973.728546] 4.8.0-rc2 #1 Tainted: G           O
[  973.728986] -------------------------------------------------------
[  973.728986] accel-pppd/1806 is trying to acquire lock:
[  973.728986]  (&qdisc_xmit_lock_key){+.-...}, at: [<ffffffff8146f6fe>] sch_direct_xmit+0x8d/0x221
[  973.728986]
[  973.728986] but task is already holding lock:
[  973.728986]  (l2tp_sock){+.-...}, at: [<ffffffffa0202c4a>] l2tp_xmit_skb+0x1e8/0x5d7 [l2tp_core]
[  973.728986]
[  973.728986] which lock already depends on the new lock.
[  973.728986]
[  973.728986]
[  973.728986] the existing dependency chain (in reverse order) is:
[  973.728986]
-> #3 (l2tp_sock){+.-...}:
[  973.728986]        [<ffffffff810b3130>] lock_acquire+0x150/0x217
[  973.728986]        [<ffffffff815752f4>] _raw_spin_lock+0x2d/0x3c
[  973.728986]        [<ffffffffa0202c4a>] l2tp_xmit_skb+0x1e8/0x5d7 [l2tp_core]
[  973.728986]        [<ffffffffa01b2466>] pppol2tp_xmit+0x1f2/0x25e [l2tp_ppp]
[  973.728986]        [<ffffffffa0184f59>] ppp_channel_push+0xb5/0x14a [ppp_generic]
[  973.728986]        [<ffffffffa01853ed>] ppp_write+0x104/0x11c [ppp_generic]
[  973.728986]        [<ffffffff811b2ec6>] __vfs_write+0x56/0x120
[  973.728986]        [<ffffffff811b3f4c>] vfs_write+0xbd/0x11b
[  973.728986]        [<ffffffff811b4cb2>] SyS_write+0x5e/0x96
[  973.728986]        [<ffffffff81575ba5>] entry_SYSCALL_64_fastpath+0x18/0xa8
[  973.728986]
-> #2 (&(&pch->downl)->rlock){+.-...}:
[  973.728986]        [<ffffffff810b3130>] lock_acquire+0x150/0x217
[  973.728986]        [<ffffffff81575334>] _raw_spin_lock_bh+0x31/0x40
[  973.728986]        [<ffffffffa01808e2>] ppp_push+0xa7/0x82d [ppp_generic]
[  973.728986]        [<ffffffffa0184675>] __ppp_xmit_process+0x48/0x877 [ppp_generic]
[  973.728986]        [<ffffffffa018505b>] ppp_xmit_process+0x4b/0xaf [ppp_generic]
[  973.728986]        [<ffffffffa01853f7>] ppp_write+0x10e/0x11c [ppp_generic]
[  973.728986]        [<ffffffff811b2ec6>] __vfs_write+0x56/0x120
[  973.728986]        [<ffffffff811b3f4c>] vfs_write+0xbd/0x11b
[  973.728986]        [<ffffffff811b4cb2>] SyS_write+0x5e/0x96
[  973.728986]        [<ffffffff81575ba5>] entry_SYSCALL_64_fastpath+0x18/0xa8
[  973.728986]
-> #1 (&(&ppp->wlock)->rlock){+.-...}:
[  973.728986]        [<ffffffff810b3130>] lock_acquire+0x150/0x217
[  973.728986]        [<ffffffff81575334>] _raw_spin_lock_bh+0x31/0x40
[  973.728986]        [<ffffffffa0184654>] __ppp_xmit_process+0x27/0x877 [ppp_generic]
[  973.728986]        [<ffffffffa018505b>] ppp_xmit_process+0x4b/0xaf [ppp_generic]
[  973.728986]        [<ffffffffa01852da>] ppp_start_xmit+0x21b/0x22a [ppp_generic]
[  973.728986]        [<ffffffff8143f767>] dev_hard_start_xmit+0x1a9/0x43d
[  973.728986]        [<ffffffff8146f747>] sch_direct_xmit+0xd6/0x221
[  973.728986]        [<ffffffff814401e4>] __dev_queue_xmit+0x62a/0x912
[  973.728986]        [<ffffffff814404d7>] dev_queue_xmit+0xb/0xd
[  973.728986]        [<ffffffff81449978>] neigh_direct_output+0xc/0xe
[  973.728986]        [<ffffffff8150e62b>] ip6_finish_output2+0x5a9/0x623
[  973.728986]        [<ffffffff81512128>] ip6_output+0x15e/0x16a
[  973.728986]        [<ffffffff8153ef86>] dst_output+0x76/0x7f
[  973.728986]        [<ffffffff8153f737>] mld_sendpack+0x335/0x404
[  973.728986]        [<ffffffff81541c61>] mld_send_initial_cr.part.21+0x99/0xa2
[  973.728986]        [<ffffffff8154441d>] ipv6_mc_dad_complete+0x42/0x71
[  973.728986]        [<ffffffff8151c4bd>] addrconf_dad_completed+0x1cf/0x2ea
[  973.728986]        [<ffffffff8151e4fa>] addrconf_dad_work+0x453/0x520
[  973.728986]        [<ffffffff8107a393>] process_one_work+0x365/0x6f0
[  973.728986]        [<ffffffff8107aecd>] worker_thread+0x2de/0x421
[  973.728986]        [<ffffffff810816fb>] kthread+0x121/0x130
[  973.728986]        [<ffffffff81575dbf>] ret_from_fork+0x1f/0x40
[  973.728986]
-> #0 (&qdisc_xmit_lock_key){+.-...}:
[  973.728986]        [<ffffffff810b28d6>] __lock_acquire+0x1118/0x1483
[  973.728986]        [<ffffffff810b3130>] lock_acquire+0x150/0x217
[  973.728986]        [<ffffffff815752f4>] _raw_spin_lock+0x2d/0x3c
[  973.728986]        [<ffffffff8146f6fe>] sch_direct_xmit+0x8d/0x221
[  973.728986]        [<ffffffff814401e4>] __dev_queue_xmit+0x62a/0x912
[  973.728986]        [<ffffffff814404d7>] dev_queue_xmit+0xb/0xd
[  973.728986]        [<ffffffff81449978>] neigh_direct_output+0xc/0xe
[  973.728986]        [<ffffffff81487811>] ip_finish_output2+0x5db/0x609
[  973.728986]        [<ffffffff81489590>] ip_finish_output+0x152/0x15e
[  973.728986]        [<ffffffff8148a0d4>] ip_output+0x8c/0x96
[  973.728986]        [<ffffffff81489652>] ip_local_out+0x41/0x4a
[  973.728986]        [<ffffffff81489e7d>] ip_queue_xmit+0x5a5/0x609
[  973.728986]        [<ffffffffa0202fe4>] l2tp_xmit_skb+0x582/0x5d7 [l2tp_core]
[  973.728986]        [<ffffffffa01b2466>] pppol2tp_xmit+0x1f2/0x25e [l2tp_ppp]
[  973.728986]        [<ffffffffa0184f59>] ppp_channel_push+0xb5/0x14a [ppp_generic]
[  973.728986]        [<ffffffffa01853ed>] ppp_write+0x104/0x11c [ppp_generic]
[  973.728986]        [<ffffffff811b2ec6>] __vfs_write+0x56/0x120
[  973.728986]        [<ffffffff811b3f4c>] vfs_write+0xbd/0x11b
[  973.728986]        [<ffffffff811b4cb2>] SyS_write+0x5e/0x96
[  973.728986]        [<ffffffff81575ba5>] entry_SYSCALL_64_fastpath+0x18/0xa8
[  973.728986]
[  973.728986] other info that might help us debug this:
[  973.728986]
[  973.728986] Chain exists of:
  &qdisc_xmit_lock_key --> &(&pch->downl)->rlock --> l2tp_sock

[  973.728986]  Possible unsafe locking scenario:
[  973.728986]
[  973.728986]        CPU0                    CPU1
[  973.728986]        ----                    ----
[  973.728986]   lock(l2tp_sock);
[  973.728986]                                lock(&(&pch->downl)->rlock);
[  973.728986]                                lock(l2tp_sock);
[  973.728986]   lock(&qdisc_xmit_lock_key);
[  973.728986]
[  973.728986]  *** DEADLOCK ***
[  973.728986]
[  973.728986] 6 locks held by accel-pppd/1806:
[  973.728986]  #0:  (&(&pch->downl)->rlock){+.-...}, at: [<ffffffffa0184efa>] ppp_channel_push+0x56/0x14a [ppp_generic]
[  973.728986]  #1:  (l2tp_sock){+.-...}, at: [<ffffffffa0202c4a>] l2tp_xmit_skb+0x1e8/0x5d7 [l2tp_core]
[  973.728986]  #2:  (rcu_read_lock){......}, at: [<ffffffff81486981>] rcu_lock_acquire+0x0/0x20
[  973.728986]  #3:  (rcu_read_lock_bh){......}, at: [<ffffffff81486981>] rcu_lock_acquire+0x0/0x20
[  973.728986]  #4:  (rcu_read_lock_bh){......}, at: [<ffffffff814340e3>] rcu_lock_acquire+0x0/0x20
[  973.728986]  #5:  (dev->qdisc_running_key ?: &qdisc_running_key#2){+.....}, at: [<ffffffff8144011e>] __dev_queue_xmit+0x564/0x912
[  973.728986]
[  973.728986] stack backtrace:
[  973.728986] CPU: 2 PID: 1806 Comm: accel-pppd Tainted: G           O    4.8.0-rc2 #1
[  973.728986] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Debian-1.8.2-1 04/01/2014
[  973.728986]  ffff7fffffffffff ffff88003436f850 ffffffff812a20f4 ffffffff82156e30
[  973.728986]  ffffffff82156920 ffff88003436f890 ffffffff8115c759 ffff88003344ae00
[  973.728986]  ffff88003344b5c0 0000000000000002 0000000000000006 ffff88003344b5e8
[  973.728986] Call Trace:
[  973.728986]  [<ffffffff812a20f4>] dump_stack+0x67/0x90
[  973.728986]  [<ffffffff8115c759>] print_circular_bug+0x22e/0x23c
[  973.728986]  [<ffffffff810b28d6>] __lock_acquire+0x1118/0x1483
[  973.728986]  [<ffffffff810b3130>] lock_acquire+0x150/0x217
[  973.728986]  [<ffffffff810b3130>] ? lock_acquire+0x150/0x217
[  973.728986]  [<ffffffff8146f6fe>] ? sch_direct_xmit+0x8d/0x221
[  973.728986]  [<ffffffff815752f4>] _raw_spin_lock+0x2d/0x3c
[  973.728986]  [<ffffffff8146f6fe>] ? sch_direct_xmit+0x8d/0x221
[  973.728986]  [<ffffffff8146f6fe>] sch_direct_xmit+0x8d/0x221
[  973.728986]  [<ffffffff814401e4>] __dev_queue_xmit+0x62a/0x912
[  973.728986]  [<ffffffff814404d7>] dev_queue_xmit+0xb/0xd
[  973.728986]  [<ffffffff81449978>] neigh_direct_output+0xc/0xe
[  973.728986]  [<ffffffff81487811>] ip_finish_output2+0x5db/0x609
[  973.728986]  [<ffffffff81486853>] ? dst_mtu+0x29/0x2e
[  973.728986]  [<ffffffff81489590>] ip_finish_output+0x152/0x15e
[  973.728986]  [<ffffffff8148a0bc>] ? ip_output+0x74/0x96
[  973.728986]  [<ffffffff8148a0d4>] ip_output+0x8c/0x96
[  973.728986]  [<ffffffff81489652>] ip_local_out+0x41/0x4a
[  973.728986]  [<ffffffff81489e7d>] ip_queue_xmit+0x5a5/0x609
[  973.728986]  [<ffffffff814c559e>] ? udp_set_csum+0x207/0x21e
[  973.728986]  [<ffffffffa0202fe4>] l2tp_xmit_skb+0x582/0x5d7 [l2tp_core]
[  973.728986]  [<ffffffffa01b2466>] pppol2tp_xmit+0x1f2/0x25e [l2tp_ppp]
[  973.728986]  [<ffffffffa0184f59>] ppp_channel_push+0xb5/0x14a [ppp_generic]
[  973.728986]  [<ffffffffa01853ed>] ppp_write+0x104/0x11c [ppp_generic]
[  973.728986]  [<ffffffff811b2ec6>] __vfs_write+0x56/0x120
[  973.728986]  [<ffffffff8124c11d>] ? fsnotify_perm+0x27/0x95
[  973.728986]  [<ffffffff8124d41d>] ? security_file_permission+0x4d/0x54
[  973.728986]  [<ffffffff811b3f4c>] vfs_write+0xbd/0x11b
[  973.728986]  [<ffffffff811b4cb2>] SyS_write+0x5e/0x96
[  973.728986]  [<ffffffff81575ba5>] entry_SYSCALL_64_fastpath+0x18/0xa8
[  973.728986]  [<ffffffff810ae0fa>] ? trace_hardirqs_off_caller+0x121/0x12f
Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>

07712770

ppp: avoid deadlock on recursive xmit · 55454a56

Guillaume Nault authored Aug 27, 2016

In case of misconfiguration, a virtual PPP channel might send packets
back to their parent PPP interface. This typically happens in
misconfigured L2TP setups, where PPP's peer IP address is set to the
IP address of the L2TP peer.
When that happens, the system hangs because PPP tries to recursively
lock its xmit path.

[  243.332155] BUG: spinlock recursion on CPU#1, accel-pppd/926
[  243.333272]  lock: 0xffff880033d90f18, .magic: dead4ead, .owner: accel-pppd/926, .owner_cpu: 1
[  243.334859] CPU: 1 PID: 926 Comm: accel-pppd Not tainted 4.8.0-rc2 #1
[  243.336010] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Debian-1.8.2-1 04/01/2014
[  243.336018]  ffff7fffffffffff ffff8800319a77a0 ffffffff8128de85 ffff880033d90f18
[  243.336018]  ffff880033ad8000 ffff8800319a77d8 ffffffff810ad7c0 ffffffff0000039e
[  243.336018]  ffff880033d90f18 ffff880033d90f60 ffff880033d90f18 ffff880033d90f28
[  243.336018] Call Trace:
[  243.336018]  [<ffffffff8128de85>] dump_stack+0x4f/0x65
[  243.336018]  [<ffffffff810ad7c0>] spin_dump+0xe1/0xeb
[  243.336018]  [<ffffffff810ad7f0>] spin_bug+0x26/0x28
[  243.336018]  [<ffffffff810ad8b9>] do_raw_spin_lock+0x5c/0x160
[  243.336018]  [<ffffffff815522aa>] _raw_spin_lock_bh+0x35/0x3c
[  243.336018]  [<ffffffffa01a88e2>] ? ppp_push+0xa7/0x82d [ppp_generic]
[  243.336018]  [<ffffffffa01a88e2>] ppp_push+0xa7/0x82d [ppp_generic]
[  243.336018]  [<ffffffff810adada>] ? do_raw_spin_unlock+0xc2/0xcc
[  243.336018]  [<ffffffff81084962>] ? preempt_count_sub+0x13/0xc7
[  243.336018]  [<ffffffff81552438>] ? _raw_spin_unlock_irqrestore+0x34/0x49
[  243.336018]  [<ffffffffa01ac657>] ppp_xmit_process+0x48/0x877 [ppp_generic]
[  243.336018]  [<ffffffff81084962>] ? preempt_count_sub+0x13/0xc7
[  243.336018]  [<ffffffff81408cd3>] ? skb_queue_tail+0x71/0x7c
[  243.336018]  [<ffffffffa01ad1c5>] ppp_start_xmit+0x21b/0x22a [ppp_generic]
[  243.336018]  [<ffffffff81426af1>] dev_hard_start_xmit+0x15e/0x32c
[  243.336018]  [<ffffffff81454ed7>] sch_direct_xmit+0xd6/0x221
[  243.336018]  [<ffffffff814273a8>] __dev_queue_xmit+0x52a/0x820
[  243.336018]  [<ffffffff814276a9>] dev_queue_xmit+0xb/0xd
[  243.336018]  [<ffffffff81430a3c>] neigh_direct_output+0xc/0xe
[  243.336018]  [<ffffffff8146b5d7>] ip_finish_output2+0x4d2/0x548
[  243.336018]  [<ffffffff8146a8e6>] ? dst_mtu+0x29/0x2e
[  243.336018]  [<ffffffff8146d49c>] ip_finish_output+0x152/0x15e
[  243.336018]  [<ffffffff8146df84>] ? ip_output+0x74/0x96
[  243.336018]  [<ffffffff8146df9c>] ip_output+0x8c/0x96
[  243.336018]  [<ffffffff8146d55e>] ip_local_out+0x41/0x4a
[  243.336018]  [<ffffffff8146dd15>] ip_queue_xmit+0x531/0x5c5
[  243.336018]  [<ffffffff814a82cd>] ? udp_set_csum+0x207/0x21e
[  243.336018]  [<ffffffffa01f2f04>] l2tp_xmit_skb+0x582/0x5d7 [l2tp_core]
[  243.336018]  [<ffffffffa01ea458>] pppol2tp_xmit+0x1eb/0x257 [l2tp_ppp]
[  243.336018]  [<ffffffffa01acf17>] ppp_channel_push+0x91/0x102 [ppp_generic]
[  243.336018]  [<ffffffffa01ad2d8>] ppp_write+0x104/0x11c [ppp_generic]
[  243.336018]  [<ffffffff811a3c1e>] __vfs_write+0x56/0x120
[  243.336018]  [<ffffffff81239801>] ? fsnotify_perm+0x27/0x95
[  243.336018]  [<ffffffff8123ab01>] ? security_file_permission+0x4d/0x54
[  243.336018]  [<ffffffff811a4ca4>] vfs_write+0xbd/0x11b
[  243.336018]  [<ffffffff811a5a0a>] SyS_write+0x5e/0x96
[  243.336018]  [<ffffffff81552a1b>] entry_SYSCALL_64_fastpath+0x13/0x94

The main entry points for sending packets over a PPP unit are the
.write() and .ndo_start_xmit() callbacks (simplified view):

.write(unit fd) or .ndo_start_xmit()
       \
        CALL ppp_xmit_process()
               \
                LOCK unit's xmit path (ppp->wlock)
                |
                CALL ppp_push()
                       \
                        LOCK channel's xmit path (chan->downl)
                        |
                        CALL lower layer's .start_xmit() callback
                               \
                                ... might recursively call .ndo_start_xmit() ...
                               /
                        RETURN from .start_xmit()
                        |
                        UNLOCK channel's xmit path
                       /
                RETURN from ppp_push()
                |
                UNLOCK unit's xmit path
               /
        RETURN from ppp_xmit_process()

Packets can also be directly sent on channels (e.g. LCP packets):

.write(channel fd) or ppp_output_wakeup()
       \
        CALL ppp_channel_push()
               \
                LOCK channel's xmit path (chan->downl)
                |
                CALL lower layer's .start_xmit() callback
                       \
                        ... might call .ndo_start_xmit() ...
                       /
                RETURN from .start_xmit()
                |
                UNLOCK channel's xmit path
               /
        RETURN from ppp_channel_push()

Key points about the lower layer's .start_xmit() callback:

  * It can be called directly by a channel fd .write() or by
    ppp_output_wakeup() or indirectly by a unit fd .write() or by
    .ndo_start_xmit().

  * In any case, it's always called with chan->downl held.

  * It might route the packet back to its parent unit using
    .ndo_start_xmit() as entry point.

This patch detects and breaks recursion in ppp_xmit_process(). This
function is a good candidate for the task because it's called early
enough after .ndo_start_xmit(), it's always part of the recursion
loop and it's on the path of whatever entry point is used to send
a packet on a PPP unit.

Recursion detection is done using the per-cpu ppp_xmit_recursion
variable.

Since ppp_channel_push() also locks the channel's xmit path and calls
the lower layer's .start_xmit() callback, we need to increment
ppp_xmit_recursion there as well. However, there's no need to check for
recursion there, as ppp_channel_push() is outside the recursion loop.
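
The detection scheme described above can be sketched in user-space C.
This is only an illustration, not the code from the patch: the demo_*
functions, the counters and the loop_back switch are hypothetical, a
thread-local variable stands in for the per-cpu ppp_xmit_recursion
variable, and dropping the looped-back packet is an assumption about
how the recursion gets broken.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for the per-cpu ppp_xmit_recursion variable (assumption:
 * thread-local storage replaces per-cpu storage in user space). */
static _Thread_local int xmit_recursion;

static int packets_sent;      /* packets handed to the lower layer */
static int packets_dropped;   /* packets refused because of recursion */
static bool loop_back = true; /* simulates the misconfigured setup */

static void demo_xmit_process(int pkt);

/* Hypothetical lower layer .start_xmit(): when misconfigured, it
 * routes the packet back to its parent unit. */
static void demo_start_xmit(int pkt)
{
	packets_sent++;
	if (loop_back)
		demo_xmit_process(pkt);
}

/* Unit entry point: detects recursion and breaks it. */
static void demo_xmit_process(int pkt)
{
	if (xmit_recursion) {
		/* Already inside an xmit path on this CPU: taking the
		 * locks again would deadlock, so refuse the packet. */
		packets_dropped++;
		return;
	}

	xmit_recursion++;
	/* The unit and channel xmit locks would be taken here. */
	demo_start_xmit(pkt);
	xmit_recursion--;
	assert(xmit_recursion >= 0);
}

/* Channel entry point: increments the counter but does not check it,
 * since this function is never re-entered by the loop itself. */
static void demo_channel_push(int pkt)
{
	xmit_recursion++;
	/* The channel xmit lock would be taken here. */
	demo_start_xmit(pkt);
	xmit_recursion--;
	assert(xmit_recursion >= 0);
}
```

In this sketch, calling demo_xmit_process() or demo_channel_push() with
loop_back set hands one packet to the lower layer and refuses the copy
that comes back, instead of re-entering the locked section.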
Reported-by: Feng Gao <gfree.wind@gmail.com>
Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>

55454a56

xgbe: constify get_netdev_ops and get_ethtool_ops · ce0b15d1

stephen hemminger authored Aug 31, 2016

Casting away const is bad practice. Since this is an ARM-specific
driver, I don't have the hardware to actually test this.

Having getter functions for ops is really unnecessary code bloat, but
I'm not going to touch that.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

ce0b15d1

Merge branch 'dsa-mdb-support' · 07469f8c

David S. Miller authored Aug 31, 2016

Vivien Didelot says:

====================
net: dsa: add MDB support

This patchset adds the switchdev MDB object support to the DSA layer.

The MDB support for the mv88e6xxx driver is very similar to the FDB
support. The FDB operations care about unicast addresses while the MDB
operations care about multicast addresses.

Both operation sets load/purge/dump the Address Translation Unit (ATU),
thus common code is used.

Changes in v2 based on Andrew's comments:
  - drop "group" in multicast database related doc and comment
  - change _one to the more relevant _fid in mv88e6xxx_port_db_dump_one
  - return -EOPNOTSUPP if switchdev obj ID is neither _FDB nor _MDB
====================
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

07469f8c

net: dsa: mv88e6xxx: add MDB support · 7df8fbdd

Vivien Didelot authored Aug 31, 2016

Add support for the MDB operations. This consists of
loading/purging/dumping multicast addresses for a given port in the ATU.
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

7df8fbdd

net: dsa: mv88e6xxx: make switchdev DB ops generic · 83dabd1f

Vivien Didelot authored Aug 31, 2016

The MDB support for the mv88e6xxx driver will be very similar to the FDB
support, since it consists of loading/purging/dumping addresses to/from
the Address Translation Unit (ATU).

Prepare for MDB support by making the FDB code that accesses the ATU
generic. The FDB operations now provide access to the unicast addresses
while the MDB operations will provide access to the multicast addresses.
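
The split between generic ATU helpers and thin FDB/MDB wrappers can be
sketched as follows. This is a hypothetical user-space model, not the
mv88e6xxx code: every demo_* name, the table size and the choice to
reject mismatched address types with -EINVAL are assumptions. The only
fact relied on is that an Ethernet address is multicast when the least
significant bit of its first octet is set.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define DEMO_ATU_SIZE 8
#define DEMO_ETH_ALEN 6

/* One entry of a toy address table standing in for the ATU. */
struct demo_atu_entry {
	uint8_t mac[DEMO_ETH_ALEN];
	int port;
	bool used;
};

static struct demo_atu_entry demo_atu[DEMO_ATU_SIZE];

/* Multicast addresses have the I/G bit (bit 0 of octet 0) set. */
static bool demo_is_multicast(const uint8_t *mac)
{
	assert(mac != NULL);
	return mac[0] & 0x01;
}

/* Generic load: shared by the FDB and MDB wrappers. */
static int demo_db_load(const uint8_t *mac, int port)
{
	for (int i = 0; i < DEMO_ATU_SIZE; i++) {
		if (!demo_atu[i].used) {
			memcpy(demo_atu[i].mac, mac, DEMO_ETH_ALEN);
			demo_atu[i].port = port;
			demo_atu[i].used = true;
			return 0;
		}
	}
	return -ENOSPC;
}

/* Generic purge: removes the matching entry, if any. */
static int demo_db_purge(const uint8_t *mac, int port)
{
	for (int i = 0; i < DEMO_ATU_SIZE; i++) {
		if (demo_atu[i].used && demo_atu[i].port == port &&
		    memcmp(demo_atu[i].mac, mac, DEMO_ETH_ALEN) == 0) {
			demo_atu[i].used = false;
			return 0;
		}
	}
	return -ENOENT;
}

/* Generic dump: counts a port's entries of the requested kind. */
static int demo_db_dump(int port, bool multicast)
{
	int count = 0;

	for (int i = 0; i < DEMO_ATU_SIZE; i++) {
		if (demo_atu[i].used && demo_atu[i].port == port &&
		    demo_is_multicast(demo_atu[i].mac) == multicast)
			count++;
	}
	return count;
}

/* FDB wrapper: unicast addresses only. */
static int demo_fdb_add(const uint8_t *mac, int port)
{
	return demo_is_multicast(mac) ? -EINVAL : demo_db_load(mac, port);
}

/* MDB wrapper: multicast addresses only. */
static int demo_mdb_add(const uint8_t *mac, int port)
{
	return demo_is_multicast(mac) ? demo_db_load(mac, port) : -EINVAL;
}
```

Both wrappers end up in the same demo_db_load() helper, which is the
point of making the table access generic: only the address type check
differs between the two operation sets.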
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

83dabd1f