Commits · 0243508edd317ff1fa63b495643a7c192fbfcd92 · Kirill Smelkov / linux

08 Jun, 2015 6 commits

ipv6: Fix protocol resubmission · 0243508e

Josh Hunt authored Jun 08, 2015

UDP encapsulation is broken on IPv6. This is because the logic to resubmit
the nexthdr is inverted, checking for a ret value > 0 instead of < 0. Also,
the resubmit label is in the wrong position since we already get the
nexthdr value when performing decapsulation. In addition the skb pull is no
longer necessary either.

This changes the return value check to look for < 0, using it for the
nexthdr on the next iteration, and moves the resubmit label to the proper
location.

With these changes the v6 code now matches what we do in the v4 ip input
code wrt resubmitting when decapsulating.
Signed-off-by: Josh Hunt <johunt@akamai.com>
Acked-by: "Tom Herbert" <tom@herbertland.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

0243508e

ipv6: fix possible use after free of dev stats · 27e41fcf

Robert Shearman authored Jun 05, 2015

The memory pointed to by idev->stats.icmpv6msgdev,
idev->stats.icmpv6dev and idev->stats.ipv6 can each be used in an RCU
read context without taking a reference on idev. For example, through
IP6_*_STATS_* calls in ip6_rcv. These memory blocks are freed without
waiting for an RCU grace period to elapse. This could lead to the
memory being written to after it has been freed.

Fix this by using call_rcu to free the memory used for stats, as well
as idev after an RCU grace period has elapsed.
Signed-off-by: Robert Shearman <rshearma@brocade.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

27e41fcf

b44: call netif_napi_del() · 1489bdee

Hauke Mehrtens authored Jun 07, 2015

When the driver gets unregistered a call to netif_napi_del() was
missing, this all was also missing in the error paths of
b44_init_one().
Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

1489bdee

bridge: disable softirqs around br_fdb_update to avoid lockup · c4c832f8

Nikolay Aleksandrov authored Jun 06, 2015

br_fdb_update() can be called in process context in the following way:
br_fdb_add() -> __br_fdb_add() -> br_fdb_update() (if NTF_USE flag is set)
so we need to disable softirqs because there are softirq users of the
hash_lock. One easy way to reproduce this is to modify the bridge utility
to set NTF_USE, enable stp and then set maxageing to a low value so
br_fdb_cleanup() is called frequently and then just add new entries in
a loop. This happens because br_fdb_cleanup() is called from timer/softirq
context. The spin locks in br_fdb_update were _bh before commit f8ae737d
("[BRIDGE]: forwarding remove unneeded preempt and bh diasables")
and at the time that commit was correct because br_fdb_update() couldn't be
called from process context, but that changed after commit:
292d1398 ("bridge: add NTF_USE support")
Using local_bh_disable/enable around br_fdb_update() allows us to keep
using the spin_lock/unlock in br_fdb_update for the fast-path.
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Fixes: 292d1398 ("bridge: add NTF_USE support")
Signed-off-by: David S. Miller <davem@davemloft.net>

c4c832f8

Revert "bridge: use _bh spinlock variant for br_fdb_update to avoid lockup" · 7ff46e79

David S. Miller authored Jun 07, 2015

This reverts commit 1d7c4903.

Nikolay Aleksandrov has a better version of this fix.
Signed-off-by: David S. Miller <davem@davemloft.net>

7ff46e79

mpls: fix possible use after free of device · 25cc8f07

Robert Shearman authored Jun 05, 2015

The mpls device is used in an RCU read context without a lock being
held. As the memory is freed without waiting for the RCU grace period
to elapse, the freed memory could still be in use.

Address this by using kfree_rcu to free the memory for the mpls device
after the RCU grace period has elapsed.

Fixes: 03c57747 ("mpls: Per-device MPLS state")
Signed-off-by: Robert Shearman <rshearma@brocade.com>
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

25cc8f07

07 Jun, 2015 4 commits

be2net: Replace dma/pci_alloc_coherent() calls with dma_zalloc_coherent() · e51000db

Sriharsha Basavapatna authored Jun 05, 2015

There are several places in the driver (all in control paths) where
coherent dma memory is being allocated using either dma_alloc_coherent()
or the deprecated pci_alloc_consistent(). All these calls should be
changed to use dma_zalloc_coherent() to avoid uninitialized fields in
data structures backed by this memory.
Reported-by: Joerg Roedel <jroedel@suse.de>
Tested-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna@avagotech.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

e51000db

bridge: use _bh spinlock variant for br_fdb_update to avoid lockup · 1d7c4903

Wilson Kok authored Jun 05, 2015

br_fdb_update() can be called in process context in the following way:
br_fdb_add() -> __br_fdb_add() -> br_fdb_update() (if NTF_USE flag is set)
so we need to use spin_lock_bh because there are softirq users of the
hash_lock. One easy way to reproduce this is to modify the bridge utility
to set NTF_USE, enable stp and then set maxageing to a low value so
br_fdb_cleanup() is called frequently and then just add new entries in
a loop. This happens because br_fdb_cleanup() is called from timer/softirq
context. These locks were _bh before commit f8ae737d
("[BRIDGE]: forwarding remove unneeded preempt and bh diasables")
and at the time that commit was correct because br_fdb_update() couldn't be
called from process context, but that changed after commit:
292d1398 ("bridge: add NTF_USE support")
Signed-off-by: Wilson Kok <wkok@cumulusnetworks.com>
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Fixes: 292d1398 ("bridge: add NTF_USE support")
Signed-off-by: David S. Miller <davem@davemloft.net>

1d7c4903

amd-xgbe: Use disable_irq_nosync from within timer function · 078b29d7

Lendacky, Thomas authored Jun 05, 2015

Since the Tx timer function runs in softirq context the driver needs
to call disable_irq_nosync instead of a disable_irq.
Reported-by: Josh Stone <jistone@redhat.com>
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

078b29d7

rhashtable: add missing import <linux/export.h> · 6d795413

Hauke Mehrtens authored Jun 06, 2015

rhashtable uses EXPORT_SYMBOL_GPL() without importing linux/export.h
directly it is only imported indirectly through some other includes.
Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

6d795413

06 Jun, 2015 1 commit

Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-queue · c6271b76

David S. Miller authored Jun 05, 2015

Jeff Kirsher says:

====================
Intel Wired LAN Driver Updates 2015-06-04

This series contains updates to i40e and i40evf.

Anjali provides three fixes, first to resolve a Tx queue hang if mixed
size frags are passed to the driver while using TSO.  There was a corner
case where we needed to linearize but we were not.  Next fixes a bug in
the default configuration which prevented a software bridge loaded on the
PF interface from working correctly because broadcast packets are
incorrectly looped back.  Lastly fixes an NPAR bug when SRIOV is enabled,
where we need to be in VEB mode, not VEPA mode at probe.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

c6271b76

05 Jun, 2015 3 commits

i40e: Make sure to be in VEB mode if SRIOV is enabled at probe · fa11cb3d

Anjali Singhai Jain authored May 27, 2015

If SRIOV is enabled we need to be in VEB mode not VEPA mode at probe.
This fixes an NPAR bug when SRIOV is enabled in the BIOS.

Change-ID: Ibf006abafd9a0ca3698ec24848cd771cf345cbbc
Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com>
Tested-by: Jim Young <james.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

fa11cb3d

i40e: start up in VEPA mode by default · fc60861e

Anjali Singhai Jain authored May 08, 2015

The patch fixes a bug in the default configuration which
prevented a software bridge loaded on the PF interface from
working correctly because broadcast packets are incorrectly
looped back.

Fix the general case, by loading the driver in VEPA mode Until a
VF or VMDq VSI is added. This way loopback on the Main VSI is
turned off until needed and can resolve the issue of unnecessary
reflection for users that do not have VF or VMDq VSIs setup.

The driver must now coordinate the loopback setting for the Flow
Director (FDIR) VSI to make sure it is in sync with the current
VEB or VEPA mode setting.

The user can still switch bridge modes from the bridge commands and
choose to be in VEPA mode with VF VSIs. Because of hardware
requirements, the call to switch to VEB mode when no VF/VMDqs are
present will be rejected.

NOTE: This patch uses BIT_ULL as that is preferred going forward,
a followup patch in the lower priority queue to net-next will fix
up the remaining 1 << usages.

Change-ID: Ib121ddb18fe4b3c4f52e9deda6fcbeb9105683d1
Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Jim Young <james.m.young@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

fc60861e

i40e/i40evf: Fix mixed size frags and linearization · 30520831

Anjali Singhai Jain authored May 08, 2015

This patch fixes a bug where the i40e Tx queue will hang if this
skb is passed to the driver.

With mixed size fragments while using TSO there was a corner case
where we needed to linearize but we were not. This was seen with
iSCSI traffic and could be reproduced with a frag list that looks
like this:

num_frags = 17, gso_segs = 17, hdr_len = 66,
skb_shinfo(skb)->gso_size = 1448
size = 3002, j = 1, frag_size = 2936, num_frags = 17
size = 4268, j = 1, frag_size = 4096, num_frags = 16
size = 5534, j = 1, frag_size = 4096, num_frags = 15
size = 5352, j = 1, frag_size = 4096, num_frags = 14
size = 5170, j = 1, frag_size = 4096, num_frags = 13
size = 3468, j = 1, frag_size = 2576, num_frags = 12
size = 750, j = 1, frag_size = 112, num_frags = 11
size = 862, j = 2, frag_size = 112, num_frags = 10
size = 974, j = 3, frag_size = 112, num_frags = 9
size = 1126, j = 4, frag_size = 152, num_frags = 8
size = 1330, j = 5, frag_size = 204, num_frags = 7
size = 1534, j = 6, frag_size = 204, num_frags = 6
size = 356, j = 1, frag_size = 204, num_frags = 5
size = 560, j = 2, frag_size = 204, num_frags = 4
size = 764, j = 3, frag_size = 204, num_frags = 3
size = 968, j = 4, frag_size = 204, num_frags = 2
size = 1140, j = 5, frag_size = 172, num_frags = 1
result: linearize = 0, j = 6

Change-ID: I79bb1aeab0af255fe2ce28e93672a85d85bf47e8
Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>

30520831

04 Jun, 2015 4 commits

ipv4/udp: Verify multicast group is ours in upd_v4_early_demux() · 6e540309

Shawn Bohrer authored Jun 03, 2015

421b3885 "udp: ipv4: Add udp early
demux" introduced a regression that allowed sockets bound to INADDR_ANY
to receive packets from multicast groups that the socket had not joined.
For example a socket that had joined 224.168.2.9 could also receive
packets from 225.168.2.9 despite not having joined that group if
ip_early_demux is enabled.

Fix this by calling ip_check_mc_rcu() in udp_v4_early_demux() to verify
that the multicast packet is indeed ours.
Signed-off-by: Shawn Bohrer <sbohrer@rgmadvisors.com>
Reported-by: Yurij M. Plotnikov <Yurij.Plotnikov@oktetlabs.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>

6e540309

openvswitch: disable LRO · 640b2b10

Jiri Benc authored Jun 02, 2015

Currently, openvswitch tries to disable LRO from the user space. This does
not work correctly when the device added is a vlan interface, though.
Instead of dealing with possibly complex stacked cross name space relations
in the user space, do the same as bridging does and call dev_disable_lro in
the kernel.
Signed-off-by: Jiri Benc <jbenc@redhat.com>
Acked-by: Flavio Leitner <fbl@redhat.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

640b2b10

s390/bpf: fix bpf frame pointer setup · 88aeca15

Michael Holzheu authored Jun 01, 2015

Currently the bpf frame pointer is set to the old r15. This is
wrong because of packed stack. Fix this and adjust the frame pointer
to respect packed stack. This now generates a prolog like the following:

 3ff8001c3fa: eb67f0480024   stmg    %r6,%r7,72(%r15)
 3ff8001c400: ebcff0780024   stmg    %r12,%r15,120(%r15)
 3ff8001c406: b904001f       lgr     %r1,%r15      <- load backchain
 3ff8001c40a: 41d0f048       la      %r13,72(%r15) <- load adjusted bfp
 3ff8001c40e: a7fbfd98       aghi    %r15,-616
 3ff8001c412: e310f0980024   stg     %r1,152(%r15) <- save backchain

Fixes: 05462310 ("s390/bpf: Add s390x eBPF JIT compiler backend")
Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

88aeca15

s390/bpf: fix stack allocation · bbac1c94

Michael Holzheu authored Jun 01, 2015

On s390x we have to provide 160 bytes stack space before we can call
the next function. From the 160 bytes that we got from the previous
function we only use 11 * 8 bytes and have 160 - 11 * 8 bytes left.
Currently for BPF we allocate additional 160 - 11 * 8 bytes for the
next function. This is wrong because then the next function only gets:

 (160 - 11 * 8) + (160 - 11 * 8) = 2 * 72 = 144 bytes

Fix this and allocate enough memory for the next function.

Fixes: 05462310 ("s390/bpf: Add s390x eBPF JIT compiler backend")
Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bbac1c94

02 Jun, 2015 3 commits

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · c46a024e

Linus Torvalds authored Jun 01, 2015

Pull networking fixes from David Miller:

 1) Various VTI tunnel (mark handling, PMTU) bug fixes from Alexander
    Duyck and Steffen Klassert.

 2) Revert ethtool PHY query change, it wasn't correct.  The PHY address
    selected by the driver running the PHY to MAC connection decides
    what PHY address GET ethtool operations return information from.

 3) Fix handling of sequence number bits for encryption IV generation in
    ESP driver, from Herbert Xu.

 4) UDP can return -EAGAIN when we hit a bad checksum on receive, even
    when there are other packets in the receive queue which is wrong.
    Just respect the error returned from the generic socket recv
    datagram helper.  From Eric Dumazet.

 5) Fix BNA driver firmware loading on big-endian systems, from Ivan
    Vecera.

 6) Fix regression in that we were inheriting the congestion control of
    the listening socket for new connections, the intended behavior
    always was to use the default in this case.  From Neal Cardwell.

 7) Fix NULL deref in brcmfmac driver, from Arend van Spriel.

 8) OTP parsing fix in iwlwifi from Liad Kaufman.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (26 commits)
  vti6: Add pmtu handling to vti6_xmit.
  Revert "net: core: 'ethtool' issue with querying phy settings"
  bnx2x: Move statistics implementation into semaphores
  xen: netback: read hotplug script once at start of day.
  xen: netback: fix printf format string warning
  Revert "netfilter: ensure number of counters is >0 in do_replace()"
  net: dsa: Properly propagate errors from dsa_switch_setup_one
  tcp: fix child sockets to use system default congestion control if not set
  udp: fix behavior of wrong checksums
  sfc: free multiple Rx buffers when required
  bna: fix soft lock-up during firmware initialization failure
  bna: remove unreasonable iocpf timer start
  bna: fix firmware loading on big-endian machines
  bridge: fix br_multicast_query_expired() bug
  via-rhine: Resigning as maintainer
  brcmfmac: avoid null pointer access when brcmf_msgbuf_get_pktid() fails
  mac80211: Fix mac80211.h docbook comments
  iwlwifi: nvm: fix otp parsing in 8000 hw family
  iwlwifi: pcie: fix tracking of cmd_in_flight
  ip_vti/ip6_vti: Preserve skb->mark after rcv_cb call
  ...

c46a024e

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc · 2459c609

Linus Torvalds authored Jun 01, 2015

Pull Sparc fixes from David Miller:

 1) Setup the core/threads/sockets bitmaps correctly so that 'lscpus'
    and friends operate properly.  Frtom Chris Hyser.

 2) The bit that normally means "Cached Virtually" on sun4v systems,
    actually changes meaning in M7 and later chips.  Fix from Khalid
    Aziz.

 3) One some PCI-E systems we need to probe different OF properties to
    fill in the PCI slot information properly, from Eric Snowberg.

 4) Kill an extraneous memset after kzalloc(), from Christophe Jaillet.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
  sparc: Resolve conflict between sparc v9 and M7 on usage of bit 9 of TTE
  sparc64: pci slots information is not populated in sysfs
  sparc: kernel: GRPCI2: Remove a useless memset
  sparc64: Setup sysfs to mark LDOM sockets, cores and threads correctly

2459c609

Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost · fec345ba

Linus Torvalds authored Jun 01, 2015

Pull virtio fix from Michael Tsirkin:
 "Last-minute virtio fix for 4.1

  This tweaks an exported user-space header to fix build breakage for
  userspace using it"

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
  include/uapi/linux/virtio_balloon.h: include linux/virtio_types.h

fec345ba

01 Jun, 2015 17 commits

Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf · e453581d

David S. Miller authored Jun 01, 2015

Pablo Neira Ayuso says:

====================
Netfilter fix for net

The following patch reverts the ebtables chunk that enforces counters that was
introduced in the recently applied d26e2c9f ('Revert "netfilter: ensure
number of counters is >0 in do_replace()"') since this breaks ebtables.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

e453581d

Merge tag 'wireless-drivers-for-davem-2015-06-01' of... · cd842a67

David S. Miller authored Jun 01, 2015

Merge tag 'wireless-drivers-for-davem-2015-06-01' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers

Kalle Valo says:

====================
iwlwifi:

* fix OTP parsing 8260
* fix powersave handling for 8260

brcmfmac:

* fix null pointer crash
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

cd842a67

vti6: Add pmtu handling to vti6_xmit. · ccd740cb

Steffen Klassert authored May 29, 2015

We currently rely on the PMTU discovery of xfrm.
However if a packet is localy sent, the PMTU mechanism
of xfrm tries to to local socket notification what
might not work for applications like ping that don't
check for this. So add pmtu handling to vti6_xmit to
report MTU changes immediately.
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ccd740cb

Revert "net: core: 'ethtool' issue with querying phy settings" · 18ec898e

David S. Miller authored Jun 01, 2015

This reverts commit f96dee13.

It isn't right, ethtool is meant to manage one PHY instance
per netdevice at a time, and this is selected by the SET
command.  Therefore by definition the GET command must only
return the settings for the configured and selected PHY.
Reported-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>

18ec898e

bnx2x: Move statistics implementation into semaphores · c6e36d8c

Yuval Mintz authored Jun 01, 2015

Commit dff173de ("bnx2x: Fix statistics locking scheme") changed the
bnx2x locking around statistics state into using a mutex - but the lock
is being accessed via a timer which is forbidden.

[If compiled with CONFIG_DEBUG_MUTEXES, logs show a warning about
accessing the mutex in interrupt context]

This moves the implementation into using a semaphore [with size '1']
instead.
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: Ariel Elior <Ariel.Elior@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

c6e36d8c

xen: netback: read hotplug script once at start of day. · 31a41898

Ian Campbell authored Jun 01, 2015

When we come to tear things down in netback_remove() and generate the
uevent it is possible that the xenstore directory has already been
removed (details below).

In such cases netback_uevent() won't be able to read the hotplug
script and will write a xenstore error node.

A recent change to the hypervisor exposed this race such that we now
sometimes lose it (where apparently we didn't ever before).

Instead read the hotplug script configuration during setup and use it
for the lifetime of the backend device.

The apparently more obvious fix of moving the transition to
state=Closed in netback_remove() to after the uevent does not work
because it is possible that we are already in state=Closed (in
reaction to the guest having disconnected as it shutdown). Being
already in Closed means the toolstack is at liberty to start tearing
down the xenstore directories. In principal it might be possible to
arrange to unregister the device sooner (e.g on transition to Closing)
such that xenstore would still be there but this state machine is
fragile and prone to anger...

A modern Xen system only relies on the hotplug uevent for driver
domains, when the backend is in the same domain as the toolstack it
will run the necessary setup/teardown directly in the correct sequence
wrt xenstore changes.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

31a41898

xen: netback: fix printf format string warning · dc5e7a81

Ian Campbell authored Jun 01, 2015

drivers/net/xen-netback/netback.c: In function ‘xenvif_tx_build_gops’:
drivers/net/xen-netback/netback.c:1253:8: warning: format ‘%lu’ expects argument of type ‘long unsigned int’, but argument 5 has type ‘int’ [-Wformat=]
        (txreq.offset&~PAGE_MASK) + txreq.size);
        ^

PAGE_MASK's type can vary by arch, so a cast is needed.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
----
v2: Cast to unsigned long, since PAGE_MASK can vary by arch.
Acked-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

dc5e7a81

Revert "netfilter: ensure number of counters is >0 in do_replace()" · d26e2c9f

Bernhard Thaler authored May 28, 2015

This partially reverts commit 1086bbe9 ("netfilter: ensure number of
counters is >0 in do_replace()") in net/bridge/netfilter/ebtables.c.

Setting rules with ebtables does not work any more with 1086bbe9 place.

There is an error message and no rules set in the end.

e.g.

~# ebtables -t nat -A POSTROUTING --src 12:34:56:78:9a:bc -j DROP
Unable to update the kernel. Two possible causes:
1. Multiple ebtables programs were executing simultaneously. The ebtables
   userspace tool doesn't by default support multiple ebtables programs
running

Reverting the ebtables part of 1086bbe9 makes this work again.
Signed-off-by: Bernhard Thaler <bernhard.thaler@wvnet.at>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

d26e2c9f

include/uapi/linux/virtio_balloon.h: include linux/virtio_types.h · 8a7b19d8

Mikko Rapeli authored May 30, 2015

Fixes userspace compilation error:

error: unknown type name ‘__virtio16’
  __virtio16 tag;
Signed-off-by: Mikko Rapeli <mikko.rapeli@iki.fi>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

8a7b19d8

sparc: Resolve conflict between sparc v9 and M7 on usage of bit 9 of TTE · 494e5b6f

Khalid Aziz authored May 27, 2015

sparc: Resolve conflict between sparc v9 and M7 on usage of bit 9 of TTE

Bit 9 of TTE is CV (Cacheable in V-cache) on sparc v9 processor while
the same bit 9 is MCDE (Memory Corruption Detection Enable) on M7
processor. This creates a conflicting usage of the same bit. Kernel
sets TTE.cv bit on all pages for sun4v architecture which works well
for sparc v9 but enables memory corruption detection on M7 processor
which is not the intent. This patch adds code to determine if kernel
is running on M7 processor and takes steps to not enable memory
corruption detection in TTE erroneously.
Signed-off-by: Khalid Aziz <khalid.aziz@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

494e5b6f

sparc64: pci slots information is not populated in sysfs · f0c1a117

Eric Snowberg authored May 27, 2015

Add PCI slot numbers within sysfs for PCIe hardware.  Larger
PCIe systems with nested PCI bridges and slots further
down on these bridges were not being populated within sysfs.
This will add ACPI style PCI slot numbers for these systems
since the OF 'slot-names' information is not available on
all PCIe platforms.
Signed-off-by: Eric Snowberg <eric.snowberg@oracle.com>
Reviewed-by: Bob Picco <bob.picco@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

f0c1a117

sparc: kernel: GRPCI2: Remove a useless memset · 8642ad1c

Christophe Jaillet authored May 01, 2015

grpci2priv is allocated using kzalloc, so there is no need to memset it.
Signed-off-by: Christophe Jaillet <christophe.jaillet@wanadoo.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>

8642ad1c

net: dsa: Properly propagate errors from dsa_switch_setup_one · 24595346

Florian Fainelli authored May 29, 2015

While shuffling some code around, dsa_switch_setup_one() was introduced,
and it was modified to return either an error code using ERR_PTR() or a
NULL pointer when running out of memory or failing to setup a switch.

This is a problem for its caler: dsa_switch_setup() which uses IS_ERR()
and expects to find an error code, not a NULL pointer, so we still try
to proceed with dsa_switch_setup() and operate on invalid memory
addresses. This can be easily reproduced by having e.g: the bcm_sf2
driver built-in, but having no such switch, such that drv->setup will
fail.

Fix this by using PTR_ERR() consistently which is both more informative
and avoids for the caller to use IS_ERR_OR_NULL().

Fixes: df197195 ("net: dsa: split dsa_switch_setup into two functions")
Reported-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Tested-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

24595346

tcp: fix child sockets to use system default congestion control if not set · 9f950415

Neal Cardwell authored May 29, 2015

Linux 3.17 and earlier are explicitly engineered so that if the app
doesn't specifically request a CC module on a listener before the SYN
arrives, then the child gets the system default CC when the connection
is established. See tcp_init_congestion_control() in 3.17 or earlier,
which says "if no choice made yet assign the current value set as
default". The change ("net: tcp: assign tcp cong_ops when tcp sk is
created") altered these semantics, so that children got their parent
listener's congestion control even if the system default had changed
after the listener was created.

This commit returns to those original semantics from 3.17 and earlier,
since they are the original semantics from 2007 in 4d4d3d1e ("[TCP]:
Congestion control initialization."), and some Linux congestion
control workflows depend on that.

In summary, if a listener socket specifically sets TCP_CONGESTION to
"x", or the route locks the CC module to "x", then the child gets
"x". Otherwise the child gets current system default from
net.ipv4.tcp_congestion_control. That's the behavior in 3.17 and
earlier, and this commit restores that.

Fixes: 55d8694f ("net: tcp: assign tcp cong_ops when tcp sk is created")
Cc: Florian Westphal <fw@strlen.de>
Cc: Daniel Borkmann <dborkman@redhat.com>
Cc: Glenn Judd <glenn.judd@morganstanley.com>
Cc: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>

9f950415

udp: fix behavior of wrong checksums · beb39db5

Eric Dumazet authored May 30, 2015

We have two problems in UDP stack related to bogus checksums :

1) We return -EAGAIN to application even if receive queue is not empty.
   This breaks applications using edge trigger epoll()

2) Under UDP flood, we can loop forever without yielding to other
   processes, potentially hanging the host, especially on non SMP.

This patch is an attempt to make things better.

We might in the future add extra support for rt applications
wanting to better control time spent doing a recv() in a hostile
environment. For example we could validate checksums before queuing
packets in socket receive queue.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

beb39db5

Linux 4.1-rc6 · c65b99f0
Linus Torvalds authored May 31, 2015

c65b99f0

sfc: free multiple Rx buffers when required · 9eb0a5d1

Daniel Pieczko authored May 29, 2015

When Rx packet data must be dropped, all the buffers
associated with that Rx packet must be freed. Extend
and rename efx_free_rx_buffer() to efx_free_rx_buffers()
and loop through all the fragments.
By doing so this patch fixes a possible memory leak.
Signed-off-by: Shradha Shah <sshah@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9eb0a5d1

31 May, 2015 2 commits

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · 8ba64dc3

Linus Torvalds authored May 31, 2015

Pull vfs fix from Al Viro:
 "Off-by-one in d_walk()/__dentry_kill() race fix.

  It's very hard to hit; possible in the same conditions as the original
  bug, except that you need the skipped branch to contain all the
  remaining evictables, so that the d_walk()-calling loop in
  d_invalidate() decides there's nothing more to do and doesn't go for
  another pass - otherwise that next pass will sweep the sucker.

  So it's not too urgent, but seeing that the fix is obvious and the
  original commit has spread into all -stable branches..."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  d_walk() might skip too much

8ba64dc3

Merge branch 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm · 36a8b9a7

Linus Torvalds authored May 31, 2015

Pull ARM fixes from Russell King:
 "Three fixes this time around:

   - fix a memory leak which occurs when probing performance monitoring
     unit interrupts

   - fix handling of non-PMD aligned end of RAM causing boot failures

   - fix missing syscall trace exit path with syscall tracing enabled
     causing a kernel oops in the audit code"

* 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm:
  ARM: 8357/1: perf: fix memory leak when probing PMU PPIs
  ARM: fix missing syscall trace exit
  ARM: 8356/1: mm: handle non-pmd-aligned end of RAM

36a8b9a7