Commits · 1c999c47a9f1772e6d53028b6bf965c6ddb743bd · nexedi / linux

29 Jan, 2015 7 commits

Merge tag 'dm-3.19-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm · 1c999c47

Linus Torvalds authored Jan 29, 2015

Pull device mapper fixes from Mike Snitzer:
 "One stable fix for a dm-cache 3.19-rc6 regression and one stable fix
  for dm-thin:

   - fix DM cache metadata open/lookup error paths to properly use
     ERR_PTR and IS_ERR (fixes: 3.19-rc6 "stable" commit 9b1cc9f2)

   - fix DM thin-provisioning to disallow userspace from sending
     messages to the thin-pool if the pool is in READ_ONLY or FAIL mode
     since no metadata changes are allowed in these modes"

* tag 'dm-3.19-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
  dm thin: don't allow messages to be sent to a pool target in READ_ONLY or FAIL mode
  dm cache: fix missing ERR_PTR returns and handling

1c999c47

Merge tag 'nfs-for-3.19-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs · 353a0c6f

Linus Torvalds authored Jan 29, 2015

Pull NFS client bugfixes from Trond Myklebust:
 "Highlights include:

   - Stable fix for a NFSv4.1 Oops on mount
   - Stable fix for an O_DIRECT deadlock condition
   - Fix an issue with submounted volumes and fake duplicate inode
     numbers"

* tag 'nfs-for-3.19-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
  NFS: Fix use of nfs_attr_use_mounted_on_fileid()
  NFSv4.1: Fix an Oops in nfs41_walk_client_list
  nfs: fix dio deadlock when O_DIRECT flag is flipped

353a0c6f

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client · 884e00f3

Linus Torvalds authored Jan 29, 2015

Pull Ceph fixes from Sage Weil:
 "These paches from Ilya finally squash a race condition with layered
  images that he's been chasing for a while"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
  rbd: drop parent_ref in rbd_dev_unprobe() unconditionally
  rbd: fix rbd_dev_parent_get() when parent_overlap == 0

884e00f3

Merge tag 'sound-3.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound · a2ae004a

Linus Torvalds authored Jan 29, 2015

Pull sound fixes from Takashi Iwai:
 "This batch ended up being larger than wished, but there is nothing to
  worry too much there.

  Most of commits are for ASoC, a compress NULL dereference fix, a fix
  for probe error handling, and the rest are device-specific fixes.  In
  addition, we have a fix for a long-standing but of seq-dummy driver,
  which just cuts off the buggy part in the end"

* tag 'sound-3.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ALSA: seq-dummy: remove deadlock-causing events on close
  ASoC: omap-mcbsp: Correct CBM_CFS dai format configuration
  ASoC: soc-compress.c: fix NULL dereference
  ASoC: rt286: set the same format for dac and adc
  ASoC: wm8904: fix runtime warning
  ASoC: simple-card: Fix crash in asoc_simple_card_unref()
  ASoC: fsl: imx-wm8962: Set the card owner field
  ASoC: pcm512x: Fix DSP program selection
  ASoC: rt5677: Modify the behavior that updates the PLL parameter.
  ASoC: fsl_ssi: Fix irq error check
  ASoC: rockchip: i2s: applys rate symmetry for CPU DAI
  ASoC: Intel: Add NULL checks for the stream pointer
  ASoC: wm8960: Fix capture sample rate from 11250 to 11025
  ASoC: adi: Add missing return statement.
  ASoC: Intel: Don't change offset of block allocator during fixed allocate
  ASoC: ts3a227e: Check and report jack status at probe
  ASoC: fsl_esai: Fix incorrect xDC field width of xCCR registers

a2ae004a

Merge tag 'pinctrl-v3.19-4' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl · 297614f3

Linus Torvalds authored Jan 29, 2015

Pull final pin control fix from Linus Walleij:
 "A late pin control fix for the v3.19 series: The AT91 gpio controller
  would miss wakeup events, this single fix make it work properly"

[ "Final"? Yeah, I'll believe that once I've actually released 3.19 ;)   - Linus ]

* tag 'pinctrl-v3.19-4' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
  pinctrl: at91: allow to have disabled gpio bank

297614f3

vm: make stack guard page errors return VM_FAULT_SIGSEGV rather than SIGBUS · 9c145c56

Linus Torvalds authored Jan 29, 2015

The stack guard page error case has long incorrectly caused a SIGBUS
rather than a SIGSEGV, but nobody actually noticed until commit
fee7e49d ("mm: propagate error from stack expansion even for guard
page") because that error case was never actually triggered in any
normal situations.

Now that we actually report the error, people noticed the wrong signal
that resulted.  So far, only the test suite of libsigsegv seems to have
actually cared, but there are real applications that use libsigsegv, so
let's not wait for any of those to break.
Reported-and-tested-by: Takashi Iwai <tiwai@suse.de>
Tested-by: Jan Engelhardt <jengelh@inai.de>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> # "s390 still compiles and boots"
Cc: linux-arch@vger.kernel.org
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

9c145c56

vm: add VM_FAULT_SIGSEGV handling support · 33692f27

Linus Torvalds authored Jan 29, 2015

The core VM already knows about VM_FAULT_SIGBUS, but cannot return a
"you should SIGSEGV" error, because the SIGSEGV case was generally
handled by the caller - usually the architecture fault handler.

That results in lots of duplication - all the architecture fault
handlers end up doing very similar "look up vma, check permissions, do
retries etc" - but it generally works.  However, there are cases where
the VM actually wants to SIGSEGV, and applications _expect_ SIGSEGV.

In particular, when accessing the stack guard page, libsigsegv expects a
SIGSEGV.  And it usually got one, because the stack growth is handled by
that duplicated architecture fault handler.

However, when the generic VM layer started propagating the error return
from the stack expansion in commit fee7e49d ("mm: propagate error
from stack expansion even for guard page"), that now exposed the
existing VM_FAULT_SIGBUS result to user space.  And user space really
expected SIGSEGV, not SIGBUS.

To fix that case, we need to add a VM_FAULT_SIGSEGV, and teach all those
duplicate architecture fault handlers about it.  They all already have
the code to handle SIGSEGV, so it's about just tying that new return
value to the existing code, but it's all a bit annoying.

This is the mindless minimal patch to do this.  A more extensive patch
would be to try to gather up the mostly shared fault handling logic into
one generic helper routine, and long-term we really should do that
cleanup.

Just from this patch, you can generally see that most architectures just
copied (directly or indirectly) the old x86 way of doing things, but in
the meantime that original x86 model has been improved to hold the VM
semaphore for shorter times etc and to handle VM_FAULT_RETRY and other
"newer" things, so it would be a good idea to bring all those
improvements to the generic case and teach other architectures about
them too.
Reported-and-tested-by: Takashi Iwai <tiwai@suse.de>
Tested-by: Jan Engelhardt <jengelh@inai.de>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com> # "s390 still compiles and boots"
Cc: linux-arch@vger.kernel.org
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

33692f27

28 Jan, 2015 5 commits

dm thin: don't allow messages to be sent to a pool target in READ_ONLY or FAIL mode · 2a7eaea0

Joe Thornber authored Jan 26, 2015

You can't modify the metadata in these modes.  It's better to fail these
messages immediately than let the block-manager deny write locks on
metadata blocks.  Otherwise these failed metadata changes will trigger
'needs_check' to get set in the metadata superblock -- requiring repair
using the thin_check utility.
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Cc: stable@vger.kernel.org

2a7eaea0

dm cache: fix missing ERR_PTR returns and handling · 766a7888

Joe Thornber authored Jan 28, 2015

Commit 9b1cc9f2 ("dm cache: share cache-metadata object across
inactive and active DM tables") mistakenly ignored the use of ERR_PTR
returns.  Restore missing IS_ERR checks and ERR_PTR returns where
appropriate.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Joe Thornber <ejt@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Cc: stable@vger.kernel.org

766a7888

rbd: drop parent_ref in rbd_dev_unprobe() unconditionally · e69b8d41

Ilya Dryomov authored Jan 19, 2015

This effectively reverts the last hunk of 392a9dad ("rbd: detect
when clone image is flattened").

The problem with parent_overlap != 0 condition is that it's possible
and completely valid to have an image with parent_overlap == 0 whose
parent state needs to be cleaned up on unmap.  The next commit, which
drops the "clone image now standalone" logic, opens up another window
of opportunity to hit this, but even without it

    # cat parent-ref.sh
    #!/bin/bash
    rbd create --image-format 2 --size 1 foo
    rbd snap create foo@snap
    rbd snap protect foo@snap
    rbd clone foo@snap bar
    rbd resize --allow-shrink --size 0 bar
    rbd resize --size 1 bar
    DEV=$(rbd map bar)
    rbd unmap $DEV

leaves rbd_device/rbd_spec/etc and rbd_client along with ceph_client
hanging around.

My thinking behind calling rbd_dev_parent_put() unconditionally is that
there shouldn't be any requests in flight at that point in time as we
are deep into unmap sequence.  Hence, even if rbd_dev_unparent() caused
by flatten is delayed by in-flight requests, it will have finished by
the time we reach rbd_dev_unprobe() caused by unmap, thus turning
unconditional rbd_dev_parent_put() into a no-op.

Fixes: http://tracker.ceph.com/issues/10352

Cc: stable@vger.kernel.org # 3.11+
Signed-off-by: Ilya Dryomov <idryomov@redhat.com>
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Reviewed-by: Alex Elder <elder@linaro.org>

e69b8d41

rbd: fix rbd_dev_parent_get() when parent_overlap == 0 · ae43e9d0

Ilya Dryomov authored Jan 19, 2015

The comment for rbd_dev_parent_get() said

    * We must get the reference before checking for the overlap to
    * coordinate properly with zeroing the parent overlap in
    * rbd_dev_v2_parent_info() when an image gets flattened.  We
    * drop it again if there is no overlap.

but the "drop it again if there is no overlap" part was missing from
the implementation.  This lead to absurd parent_ref values for images
with parent_overlap == 0, as parent_ref was incremented for each
img_request and virtually never decremented.

Fix this by leveraging the fact that refresh path calls
rbd_dev_v2_parent_info() under header_rwsem and use it for read in
rbd_dev_parent_get(), instead of messing around with atomics.  Get rid
of barriers in rbd_dev_v2_parent_info() while at it - I don't see what
they'd pair with now and I suspect we are in a pretty miserable
situation as far as proper locking goes regardless.

Cc: stable@vger.kernel.org # 3.11+
Signed-off-by: Ilya Dryomov <idryomov@redhat.com>
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Reviewed-by: Alex Elder <elder@linaro.org>

ae43e9d0

Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux · c59c961c

Linus Torvalds authored Jan 27, 2015

Pull drm fixes from Dave Airlie:
 "This feels larger than I'd like but its for three reasons.

   a) amdkfd finalising the API more, this is a new feature introduced
      last merge window, and I'd prefer to make the tweaks to the API
      before it first gets into a stable release.

   b) radeon regression required splitting an internal API to fix
      properly, so it just changed a few more lines

   c) vmwgfx fix changes a lock from a mutex->spin lock, this is fallout
      from the new sleep checking.

  Otherwise there is just some tda998x fixes"

* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
  drm/radeon: Remove rdev->gart.pages_addr array
  drm/radeon: Restore GART table contents after pinning it in VRAM v3
  drm/radeon: Split off gart_get_page_entry ASIC hook from set_page_entry
  drm/amdkfd: Fix bug in call to init_pipelines()
  drm/amdkfd: Fix bug in pipelines initialization
  drm/radeon: Don't increment pipe_id in kgd_init_pipeline
  drm/i2c: tda998x: set the CEC I2C address based on the slave I2C address
  drm/vmwgfx: Replace the hw mutex with a hw spinlock
  drm/amdkfd: Allow user to limit only queues per device
  drm/amdkfd: PQM handle queue creation fault
  drm: tda998x: Fix EDID read timeout on HDMI connect
  drm: tda998x: Protect the page register

c59c961c

27 Jan, 2015 28 commits

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 59343cd7

Linus Torvalds authored Jan 27, 2015

Pull networking fixes from David Miller:

 1) Don't OOPS on socket AIO, from Christoph Hellwig.

 2) Scheduled scans should be aborted upon RFKILL, from Emmanuel
    Grumbach.

 3) Fix sleep in atomic context in kvaser_usb, from Ahmed S Darwish.

 4) Fix RCU locking across copy_to_user() in bpf code, from Alexei
    Starovoitov.

 5) Lots of crash, memory leak, short TX packet et al bug fixes in
    sh_eth from Ben Hutchings.

 6) Fix memory corruption in SCTP wrt.  INIT collitions, from Daniel
    Borkmann.

 7) Fix return value logic for poll handlers in netxen, enic, and bnx2x.
    From Eric Dumazet and Govindarajulu Varadarajan.

 8) Header length calculation fix in mac80211 from Fred Chou.

 9) mv643xx_eth doesn't handle highmem correctly in non-TSO code paths.
    From Ezequiel Garcia.

10) udp_diag has bogus logic in it's hash chain skipping, copy same fix
    tcp diag used.  From Herbert Xu.

11) amd-xgbe programs wrong rx flow control register, from Thomas
    Lendacky.

12) Fix race leading to use after free in ping receive path, from Subash
    Abhinov Kasiviswanathan.

13) Cache redirect routes otherwise we can get a heavy backlog of rcu
    jobs liberating DST_NOCACHE entries.  From Hannes Frederic Sowa.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (48 commits)
  net: don't OOPS on socket aio
  stmmac: prevent probe drivers to crash kernel
  bnx2x: fix napi poll return value for repoll
  ipv6: replacing a rt6_info needs to purge possible propagated rt6_infos too
  sh_eth: Fix DMA-API usage for RX buffers
  sh_eth: Check for DMA mapping errors on transmit
  sh_eth: Ensure DMA engines are stopped before freeing buffers
  sh_eth: Remove RX overflow log messages
  ping: Fix race in free in receive path
  udp_diag: Fix socket skipping within chain
  can: kvaser_usb: Fix state handling upon BUS_ERROR events
  can: kvaser_usb: Retry the first bulk transfer on -ETIMEDOUT
  can: kvaser_usb: Send correct context to URB completion
  can: kvaser_usb: Do not sleep in atomic context
  ipv4: try to cache dst_entries which would cause a redirect
  samples: bpf: relax test_maps check
  bpf: rcu lock must not be held when calling copy_to_user()
  net: sctp: fix slab corruption from use after free on INIT collisions
  net: mv643xx_eth: Fix highmem support in non-TSO egress path
  sh_eth: Fix serialisation of interrupt disable with interrupt & NAPI handlers
  ...

59343cd7

net: don't OOPS on socket aio · 06539d30

Christoph Hellwig authored Jan 27, 2015

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>

06539d30

stmmac: prevent probe drivers to crash kernel · 9afec6ef

Andy Shevchenko authored Jan 27, 2015

In the case when alloc_netdev fails we return NULL to a caller. But there is no
check for NULL in the probe drivers. This patch changes NULL to an error
pointer. The function description is amended to reflect what we may get
returned.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9afec6ef

Merge tag 'powerpc-3.19-5' of git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux · 7da323bb

Linus Torvalds authored Jan 27, 2015

Pull powerpc fixes from Michael Ellerman:
 "Two powerpc fixes"

* tag 'powerpc-3.19-5' of git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux:
  powerpc/powernv: Restore LPCR with LPCR_PECE1 cleared
  powerpc/xmon: Fix another endiannes issue in RTAS call from xmon

7da323bb

Merge tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux · 41592e2f

Linus Torvalds authored Jan 27, 2015

Pull one more module fix from Rusty Russell:
 "SCSI was using module_refcount() to figure out when the module was
  unloading: this broke with new atomic refcounting.  The code is still
  suspicious, but this solves the WARN_ON()"

* tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
  scsi: always increment reference count

41592e2f

bnx2x: fix napi poll return value for repoll · 24e579c8

Govindarajulu Varadarajan authored Jan 25, 2015

With the commit d75b1ade ("net: less interrupt masking in NAPI") napi
repoll is done only when work_done == budget. When in busy_poll is we return 0
in napi_poll. We should return budget.
Signed-off-by: Govindarajulu Varadarajan <_govind@gmx.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

24e579c8

Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec · bf693f7b

David S. Miller authored Jan 27, 2015

Steffen Klassert says:

====================
ipsec 2015-01-26

Just two small fixes for _decode_session6() where we
might decode to wrong header information in some rare
situations.

Please pull or let me know if there are problems.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

bf693f7b

ipv6: replacing a rt6_info needs to purge possible propagated rt6_infos too · 6e9e16e6

Hannes Frederic Sowa authored Jan 26, 2015

Lubomir Rintel reported that during replacing a route the interface
reference counter isn't correctly decremented.

To quote bug <https://bugzilla.kernel.org/show_bug.cgi?id=91941>:
| [root@rhel7-5 lkundrak]# sh -x lal
| + ip link add dev0 type dummy
| + ip link set dev0 up
| + ip link add dev1 type dummy
| + ip link set dev1 up
| + ip addr add 2001:db8:8086::2/64 dev dev0
| + ip route add 2001:db8:8086::/48 dev dev0 proto static metric 20
| + ip route add 2001:db8:8088::/48 dev dev1 proto static metric 10
| + ip route replace 2001:db8:8086::/48 dev dev1 proto static metric 20
| + ip link del dev0 type dummy
| Message from syslogd@rhel7-5 at Jan 23 10:54:41 ...
|  kernel:unregister_netdevice: waiting for dev0 to become free. Usage count = 2
|
| Message from syslogd@rhel7-5 at Jan 23 10:54:51 ...
|  kernel:unregister_netdevice: waiting for dev0 to become free. Usage count = 2

During replacement of a rt6_info we must walk all parent nodes and check
if the to be replaced rt6_info got propagated. If so, replace it with
an alive one.

Fixes: 4a287eba ("IPv6 routing, NLM_F_* flag support: REPLACE and EXCL flags support, warn about missing CREATE flag")
Reported-by: Lubomir Rintel <lkundrak@v3.sk>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Tested-by: Lubomir Rintel <lkundrak@v3.sk>
Signed-off-by: David S. Miller <davem@davemloft.net>

6e9e16e6

Merge branch 'sh_eth' · 22577609

David S. Miller authored Jan 27, 2015

Ben Hutchings says:

====================
Fixes for sh_eth #3

I'm continuing review and testing of Ethernet support on the R-Car H2
chip.  This series fixes the last of the more serious issues I've found.

These are not tested on any of the other supported chips.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

22577609

sh_eth: Fix DMA-API usage for RX buffers · 52b9fa36

Ben Hutchings authored Jan 27, 2015

- Use the return value of dma_map_single(), rather than calling
  virt_to_page() separately
- Check for mapping failue
- Call dma_unmap_single() rather than dma_sync_single_for_cpu()
Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>

52b9fa36

sh_eth: Check for DMA mapping errors on transmit · aa3933b8

Ben Hutchings authored Jan 27, 2015

dma_map_single() may fail if an IOMMU or swiotlb is in use, so
we need to check for this.
Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>

aa3933b8

sh_eth: Ensure DMA engines are stopped before freeing buffers · 740c7f31

Ben Hutchings authored Jan 27, 2015

Currently we try to clear EDRRR and EDTRR and immediately continue to
free buffers.  This is unsafe because:

- In general, register writes are not serialised with DMA, so we still
  have to wait for DMA to complete somehow
- The R8A7790 (R-Car H2) manual states that the TX running flag cannot
  be cleared by writing to EDTRR
- The same manual states that clearing the RX running flag only stops
  RX DMA at the next packet boundary

I applied this patch to the driver to detect DMA writes to freed
buffers:

> --- a/drivers/net/ethernet/renesas/sh_eth.c
> +++ b/drivers/net/ethernet/renesas/sh_eth.c
> @@ -1098,7 +1098,14 @@ static void sh_eth_ring_free(struct net_device *ndev)
>  	/* Free Rx skb ringbuffer */
>  	if (mdp->rx_skbuff) {
>  		for (i = 0; i < mdp->num_rx_ring; i++)
> +			memcpy(mdp->rx_skbuff[i]->data,
> +			       "Hello, world", 12);
> +		msleep(100);
> +		for (i = 0; i < mdp->num_rx_ring; i++) {
> +			WARN_ON(memcmp(mdp->rx_skbuff[i]->data,
> +				       "Hello, world", 12));
>  			dev_kfree_skb(mdp->rx_skbuff[i]);
> +		}
>  	}
>  	kfree(mdp->rx_skbuff);
>  	mdp->rx_skbuff = NULL;

then ran the loop:

    while ethtool -G eth0 rx 128 ; ethtool -G eth0 rx 64; do echo -n .; done

and 'ping -f' toward the sh_eth port from another machine.  The
warning fired several times a minute.

To fix these issues:

- Deactivate all TX descriptors rather than writing to EDTRR
- As there seems to be no way of telling when RX DMA is stopped,
  perform a soft reset to ensure that both DMA enginess are stopped
- To reduce the possibility of the reset truncating a transmitted
  frame, disable egress and wait a reasonable time to reach a
  packet boundary before resetting
- Update statistics before resetting

(The 'reasonable time' does not allow for CS/CD in half-duplex
mode, but half-duplex no longer seems reasonable!)
Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>

740c7f31

sh_eth: Remove RX overflow log messages · dc1d0e6d

Ben Hutchings authored Jan 27, 2015

If RX traffic is overflowing the FIFO or DMA ring, logging every time
this happens just makes things worse.  These errors are visible in the
statistics anyway.
Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>

dc1d0e6d

Merge tag 'linux-can-fixes-for-3.19-20150127' of... · 8d8d67f1

David S. Miller authored Jan 27, 2015

Merge tag 'linux-can-fixes-for-3.19-20150127' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can

Marc Kleine-Budde says:

====================
pull-request: can 2015-01-27

this is another pull request for net/master which consists of 4 patches.

All 4 patches are contributed by Ahmed S. Darwish, he fixes more problems in
the kvaser_usb driver.

David, please merge net/master to net-next/master, as we have more kvaser_usb
patches in the queue, that target net-next.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

8d8d67f1

ping: Fix race in free in receive path · fc752f1f

subashab@codeaurora.org authored Jan 23, 2015

An exception is seen in ICMP ping receive path where the skb
destructor sock_rfree() tries to access a freed socket. This happens
because ping_rcv() releases socket reference with sock_put() and this
internally frees up the socket. Later icmp_rcv() will try to free the
skb and as part of this, skb destructor is called and which leads
to a kernel panic as the socket is freed already in ping_rcv().

-->|exception
-007|sk_mem_uncharge
-007|sock_rfree
-008|skb_release_head_state
-009|skb_release_all
-009|__kfree_skb
-010|kfree_skb
-011|icmp_rcv
-012|ip_local_deliver_finish

Fix this incorrect free by cloning this skb and processing this cloned
skb instead.

This patch was suggested by Eric Dumazet
Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

fc752f1f

udp_diag: Fix socket skipping within chain · 86f3cddb

Herbert Xu authored Jan 24, 2015

While working on rhashtable walking I noticed that the UDP diag
dumping code is buggy.  In particular, the socket skipping within
a chain never happens, even though we record the number of sockets
that should be skipped.

As this code was supposedly copied from TCP, this patch does what
TCP does and resets num before we walk a chain.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

86f3cddb

can: kvaser_usb: Fix state handling upon BUS_ERROR events · e638642b

Ahmed S. Darwish authored Jan 26, 2015

While being in an ERROR_WARNING state, and receiving further
bus error events with error counters still in the ERROR_WARNING
range of 97-127 inclusive, the state handling code erroneously
reverts back to ERROR_ACTIVE.

Per the CAN standard, only revert to ERROR_ACTIVE when the
error counters are less than 96.

Moreover, in certain Kvaser models, the BUS_ERROR flag is
always set along with undefined bits in the M16C status
register. Thus use bitwise operators instead of full equality
for checking that register against bus errors.
Signed-off-by: Ahmed S. Darwish <ahmed.darwish@valeo.com>
Cc: linux-stable <stable@vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>

e638642b

can: kvaser_usb: Retry the first bulk transfer on -ETIMEDOUT · 14c10c2a

Ahmed S. Darwish authored Jan 26, 2015

On some x86 laptops, plugging a Kvaser device again after an
unplug makes the firmware always ignore the very first command.
For such a case, provide some room for retries instead of
completely exiting the driver init code.
Signed-off-by: Ahmed S. Darwish <ahmed.darwish@valeo.com>
Cc: linux-stable <stable@vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>

14c10c2a

can: kvaser_usb: Send correct context to URB completion · 3803fa69

Ahmed S. Darwish authored Jan 26, 2015

Send expected argument to the URB completion hander: a CAN
netdevice instead of the network interface private context
`kvaser_usb_net_priv'.

This was discovered by having some garbage in the kernel
log in place of the netdevice names: can0 and can1.
Signed-off-by: Ahmed S. Darwish <ahmed.darwish@valeo.com>
Cc: linux-stable <stable@vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>

3803fa69

can: kvaser_usb: Do not sleep in atomic context · ded50066

Ahmed S. Darwish authored Jan 26, 2015

Upon receiving a hardware event with the BUS_RESET flag set,
the driver kills all of its anchored URBs and resets all of
its transmit URB contexts.

Unfortunately it does so under the context of URB completion
handler `kvaser_usb_read_bulk_callback()', which is often
called in an atomic context.

While the device is flooded with many received error packets,
usb_kill_urb() typically sleeps/reschedules till the transfer
request of each killed URB in question completes, leading to
the sleep in atomic bug. [3]

In v2 submission of the original driver patch [1], it was
stated that the URBs kill and tx contexts reset was needed
since we don't receive any tx acknowledgments later and thus
such resources will be locked down forever. Fortunately this
is no longer needed since an earlier bugfix in this patch
series is now applied: all tx URB contexts are reset upon CAN
channel close. [2]

Moreover, a BUS_RESET is now treated _exactly_ like a BUS_OFF
event, which is the recommended handling method advised by
the device manufacturer.

[1] http://article.gmane.org/gmane.linux.network/239442
    http://www.webcitation.org/6Vr2yagAQ

[2] can: kvaser_usb: Reset all URB tx contexts upon channel close
    889b77f7

[3] Stacktrace:

 <IRQ>  [<ffffffff8158de87>] dump_stack+0x45/0x57
 [<ffffffff8158b60c>] __schedule_bug+0x41/0x4f
 [<ffffffff815904b1>] __schedule+0x5f1/0x700
 [<ffffffff8159360a>] ? _raw_spin_unlock_irqrestore+0xa/0x10
 [<ffffffff81590684>] schedule+0x24/0x70
 [<ffffffff8147d0a5>] usb_kill_urb+0x65/0xa0
 [<ffffffff81077970>] ? prepare_to_wait_event+0x110/0x110
 [<ffffffff8147d7d8>] usb_kill_anchored_urbs+0x48/0x80
 [<ffffffffa01f4028>] kvaser_usb_unlink_tx_urbs+0x18/0x50 [kvaser_usb]
 [<ffffffffa01f45d0>] kvaser_usb_rx_error+0xc0/0x400 [kvaser_usb]
 [<ffffffff8108b14a>] ? vprintk_default+0x1a/0x20
 [<ffffffffa01f5241>] kvaser_usb_read_bulk_callback+0x4c1/0x5f0 [kvaser_usb]
 [<ffffffff8147a73e>] __usb_hcd_giveback_urb+0x5e/0xc0
 [<ffffffff8147a8a1>] usb_hcd_giveback_urb+0x41/0x110
 [<ffffffffa0008748>] finish_urb+0x98/0x180 [ohci_hcd]
 [<ffffffff810cd1a7>] ? acct_account_cputime+0x17/0x20
 [<ffffffff81069f65>] ? local_clock+0x15/0x30
 [<ffffffffa000a36b>] ohci_work+0x1fb/0x5a0 [ohci_hcd]
 [<ffffffff814fbb31>] ? process_backlog+0xb1/0x130
 [<ffffffffa000cd5b>] ohci_irq+0xeb/0x270 [ohci_hcd]
 [<ffffffff81479fc1>] usb_hcd_irq+0x21/0x30
 [<ffffffff8108bfd3>] handle_irq_event_percpu+0x43/0x120
 [<ffffffff8108c0ed>] handle_irq_event+0x3d/0x60
 [<ffffffff8108ec84>] handle_fasteoi_irq+0x74/0x110
 [<ffffffff81004dfd>] handle_irq+0x1d/0x30
 [<ffffffff81004727>] do_IRQ+0x57/0x100
 [<ffffffff8159482a>] common_interrupt+0x6a/0x6a
Signed-off-by: Ahmed S. Darwish <ahmed.darwish@valeo.com>
Cc: linux-stable <stable@vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>

ded50066

Merge tag 'mac80211-for-davem-2015-01-23' of... · 7d63585b

David S. Miller authored Jan 26, 2015

Merge tag 'mac80211-for-davem-2015-01-23' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211

Another set of last-minute fixes:
 * fix station double-removal when suspending while associating
 * fix the HT (802.11n) header length calculation
 * fix the CCK radiotap flag used for monitoring, a pretty
   old regression but a simple one-liner
 * fix per-station group-key handling
Signed-off-by: David S. Miller <davem@davemloft.net>

7d63585b

ipv4: try to cache dst_entries which would cause a redirect · df4d9254

Hannes Frederic Sowa authored Jan 23, 2015

Not caching dst_entries which cause redirects could be exploited by hosts
on the same subnet, causing a severe DoS attack. This effect aggravated
since commit f8864972 ("ipv4: fix dst race in sk_dst_get()").

Lookups causing redirects will be allocated with DST_NOCACHE set which
will force dst_release to free them via RCU.  Unfortunately waiting for
RCU grace period just takes too long, we can end up with >1M dst_entries
waiting to be released and the system will run OOM. rcuos threads cannot
catch up under high softirq load.

Attaching the flag to emit a redirect later on to the specific skb allows
us to cache those dst_entries thus reducing the pressure on allocation
and deallocation.

This issue was discovered by Marcelo Leitner.

Cc: Julian Anastasov <ja@ssi.bg>
Signed-off-by: Marcelo Leitner <mleitner@redhat.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>

df4d9254

Merge branch 'bpf' · 412d2907

David S. Miller authored Jan 26, 2015

Alexei Starovoitov says:

====================
bpf: fix two bugs

Michael Holzheu caught two issues (in bpf syscall and in the test).
Fix them. Details in corresponding patches.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

412d2907

samples: bpf: relax test_maps check · ba1a68bf

Alexei Starovoitov authored Jan 22, 2015

hash map is unordered, so get_next_key() iterator shouldn't
rely on particular order of elements. So relax this test.

Fixes: ffb65f27 ("bpf: add a testsuite for eBPF maps")
Reported-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Acked-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ba1a68bf

bpf: rcu lock must not be held when calling copy_to_user() · 8ebe667c

Alexei Starovoitov authored Jan 22, 2015

BUG: sleeping function called from invalid context at mm/memory.c:3732
in_atomic(): 0, irqs_disabled(): 0, pid: 671, name: test_maps
1 lock held by test_maps/671:
 #0:  (rcu_read_lock){......}, at: [<0000000000264190>] map_lookup_elem+0xe8/0x260
Call Trace:
([<0000000000115b7e>] show_trace+0x12e/0x150)
 [<0000000000115c40>] show_stack+0xa0/0x100
 [<00000000009b163c>] dump_stack+0x74/0xc8
 [<000000000017424a>] ___might_sleep+0x23a/0x248
 [<00000000002b58e8>] might_fault+0x70/0xe8
 [<0000000000264230>] map_lookup_elem+0x188/0x260
 [<0000000000264716>] SyS_bpf+0x20e/0x840

Fix it by allocating temporary buffer to store map element value.

Fixes: db20fd2b ("bpf: add lookup/update/delete/iterate methods to BPF maps")
Reported-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Acked-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

8ebe667c

net: sctp: fix slab corruption from use after free on INIT collisions · 600ddd68

Daniel Borkmann authored Jan 22, 2015

When hitting an INIT collision case during the 4WHS with AUTH enabled, as
already described in detail in commit 1be9a950 ("net: sctp: inherit
auth_capable on INIT collisions"), it can happen that we occasionally
still remotely trigger the following panic on server side which seems to
have been uncovered after the fix from commit 1be9a950 ...

[  533.876389] BUG: unable to handle kernel paging request at 00000000ffffffff
[  533.913657] IP: [<ffffffff811ac385>] __kmalloc+0x95/0x230
[  533.940559] PGD 5030f2067 PUD 0
[  533.957104] Oops: 0000 [#1] SMP
[  533.974283] Modules linked in: sctp mlx4_en [...]
[  534.939704] Call Trace:
[  534.951833]  [<ffffffff81294e30>] ? crypto_init_shash_ops+0x60/0xf0
[  534.984213]  [<ffffffff81294e30>] crypto_init_shash_ops+0x60/0xf0
[  535.015025]  [<ffffffff8128c8ed>] __crypto_alloc_tfm+0x6d/0x170
[  535.045661]  [<ffffffff8128d12c>] crypto_alloc_base+0x4c/0xb0
[  535.074593]  [<ffffffff8160bd42>] ? _raw_spin_lock_bh+0x12/0x50
[  535.105239]  [<ffffffffa0418c11>] sctp_inet_listen+0x161/0x1e0 [sctp]
[  535.138606]  [<ffffffff814e43bd>] SyS_listen+0x9d/0xb0
[  535.166848]  [<ffffffff816149a9>] system_call_fastpath+0x16/0x1b

... or depending on the the application, for example this one:

[ 1370.026490] BUG: unable to handle kernel paging request at 00000000ffffffff
[ 1370.026506] IP: [<ffffffff811ab455>] kmem_cache_alloc+0x75/0x1d0
[ 1370.054568] PGD 633c94067 PUD 0
[ 1370.070446] Oops: 0000 [#1] SMP
[ 1370.085010] Modules linked in: sctp kvm_amd kvm [...]
[ 1370.963431] Call Trace:
[ 1370.974632]  [<ffffffff8120f7cf>] ? SyS_epoll_ctl+0x53f/0x960
[ 1371.000863]  [<ffffffff8120f7cf>] SyS_epoll_ctl+0x53f/0x960
[ 1371.027154]  [<ffffffff812100d3>] ? anon_inode_getfile+0xd3/0x170
[ 1371.054679]  [<ffffffff811e3d67>] ? __alloc_fd+0xa7/0x130
[ 1371.080183]  [<ffffffff816149a9>] system_call_fastpath+0x16/0x1b

With slab debugging enabled, we can see that the poison has been overwritten:

[  669.826368] BUG kmalloc-128 (Tainted: G        W     ): Poison overwritten
[  669.826385] INFO: 0xffff880228b32e50-0xffff880228b32e50. First byte 0x6a instead of 0x6b
[  669.826414] INFO: Allocated in sctp_auth_create_key+0x23/0x50 [sctp] age=3 cpu=0 pid=18494
[  669.826424]  __slab_alloc+0x4bf/0x566
[  669.826433]  __kmalloc+0x280/0x310
[  669.826453]  sctp_auth_create_key+0x23/0x50 [sctp]
[  669.826471]  sctp_auth_asoc_create_secret+0xcb/0x1e0 [sctp]
[  669.826488]  sctp_auth_asoc_init_active_key+0x68/0xa0 [sctp]
[  669.826505]  sctp_do_sm+0x29d/0x17c0 [sctp] [...]
[  669.826629] INFO: Freed in kzfree+0x31/0x40 age=1 cpu=0 pid=18494
[  669.826635]  __slab_free+0x39/0x2a8
[  669.826643]  kfree+0x1d6/0x230
[  669.826650]  kzfree+0x31/0x40
[  669.826666]  sctp_auth_key_put+0x19/0x20 [sctp]
[  669.826681]  sctp_assoc_update+0x1ee/0x2d0 [sctp]
[  669.826695]  sctp_do_sm+0x674/0x17c0 [sctp]

Since this only triggers in some collision-cases with AUTH, the problem at
heart is that sctp_auth_key_put() on asoc->asoc_shared_key is called twice
when having refcnt 1, once directly in sctp_assoc_update() and yet again
from within sctp_auth_asoc_init_active_key() via sctp_assoc_update() on
the already kzfree'd memory, which is also consistent with the observation
of the poison decrease from 0x6b to 0x6a (note: the overwrite is detected
at a later point in time when poison is checked on new allocation).

Reference counting of auth keys revisited:

Shared keys for AUTH chunks are being stored in endpoints and associations
in endpoint_shared_keys list. On endpoint creation, a null key is being
added; on association creation, all endpoint shared keys are being cached
and thus cloned over to the association. struct sctp_shared_key only holds
a pointer to the actual key bytes, that is, struct sctp_auth_bytes which
keeps track of users internally through refcounting. Naturally, on assoc
or enpoint destruction, sctp_shared_key are being destroyed directly and
the reference on sctp_auth_bytes dropped.

User space can add keys to either list via setsockopt(2) through struct
sctp_authkey and by passing that to sctp_auth_set_key() which replaces or
adds a new auth key. There, sctp_auth_create_key() creates a new sctp_auth_bytes
with refcount 1 and in case of replacement drops the reference on the old
sctp_auth_bytes. A key can be set active from user space through setsockopt()
on the id via sctp_auth_set_active_key(), which iterates through either
endpoint_shared_keys and in case of an assoc, invokes (one of various places)
sctp_auth_asoc_init_active_key().

sctp_auth_asoc_init_active_key() computes the actual secret from local's
and peer's random, hmac and shared key parameters and returns a new key
directly as sctp_auth_bytes, that is asoc->asoc_shared_key, plus drops
the reference if there was a previous one. The secret, which where we
eventually double drop the ref comes from sctp_auth_asoc_set_secret() with
intitial refcount of 1, which also stays unchanged eventually in
sctp_assoc_update(). This key is later being used for crypto layer to
set the key for the hash in crypto_hash_setkey() from sctp_auth_calculate_hmac().

To close the loop: asoc->asoc_shared_key is freshly allocated secret
material and independant of the sctp_shared_key management keeping track
of only shared keys in endpoints and assocs. Hence, also commit 4184b2a7
("net: sctp: fix memory leak in auth key management") is independant of
this bug here since it concerns a different layer (though same structures
being used eventually). asoc->asoc_shared_key is reference dropped correctly
on assoc destruction in sctp_association_free() and when active keys are
being replaced in sctp_auth_asoc_init_active_key(), it always has a refcount
of 1. Hence, it's freed prematurely in sctp_assoc_update(). Simple fix is
to remove that sctp_auth_key_put() from there which fixes these panics.

Fixes: 730fc3d0 ("[SCTP]: Implete SCTP-AUTH parameter processing")
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Acked-by: Vlad Yasevich <vyasevich@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

600ddd68

Merge branch 'akpm' (patches from Andrew Morton) · 4adca1cb

Linus Torvalds authored Jan 26, 2015

Merge misc fixes from Andrew Morton:
 "Six fixes"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  drivers/rtc/rtc-s5m.c: terminate s5m_rtc_id array with empty element
  printk: add dummy routine for when CONFIG_PRINTK=n
  mm/vmscan: fix highidx argument type
  memcg: remove extra newlines from memcg oom kill log
  x86, build: replace Perl script with Shell script
  mm: page_alloc: embed OOM killing naturally into allocation slowpath

4adca1cb

net: mv643xx_eth: Fix highmem support in non-TSO egress path · 9e911414

Ezequiel Garcia authored Jan 22, 2015

Commit 69ad0dd7
Author: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
Date:   Mon May 19 13:59:59 2014 -0300

    net: mv643xx_eth: Use dma_map_single() to map the skb fragments

caused a nasty regression by removing the support for highmem skb
fragments. By using page_address() to get the address of a fragment's
page, we are assuming a lowmem page. However, such assumption is incorrect,
as fragments can be in highmem pages, resulting in very nasty issues.

This commit fixes this by using the skb_frag_dma_map() helper,
which takes care of mapping the skb fragment properly. Additionally,
the type of mapping is now tracked, so it can be unmapped using
dma_unmap_page or dma_unmap_single when appropriate.

This commit also fixes the error path in txq_init() to release the
resources properly.

Fixes: 69ad0dd7 ("net: mv643xx_eth: Use dma_map_single() to map the skb fragments")
Reported-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Ezequiel Garcia <ezequiel.garcia@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

9e911414