Commits · 9147efcbe0b7cc96b18eb64b1a3f0d4bba81443c · Kirill Smelkov / linux

12 Dec, 2017 6 commits

bpf: add schedule points to map alloc/free · 9147efcb

Eric Dumazet authored Dec 12, 2017

While using large percpu maps, htab_map_alloc() can hold
cpu for hundreds of ms.

This patch adds cond_resched() calls to percpu alloc/free
call sites, all running in process context.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

9147efcb

Merge branch 'bpf-misc-fixes' · 9dbd2d94

Alexei Starovoitov authored Dec 12, 2017

Daniel Borkmann says:

====================
Couple of outstanding fixes for BPF tree: 1) fixes a perf RB
corruption, 2) and 3) fixes a few build issues from the recent
bpf_perf_event.h uapi corrections. Thanks!
====================
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

9dbd2d94

bpf: fix broken BPF selftest build · 720f228e

Daniel Borkmann authored Dec 12, 2017

At least on x86_64, the kernel's BPF selftests seemed to have stopped
to build due to 618e165b ("selftests/bpf: sync kernel headers and
introduce arch support in Makefile"):

  [...]
  In file included from test_verifier.c:29:0:
  ../../../include/uapi/linux/bpf_perf_event.h:11:32:
     fatal error: asm/bpf_perf_event.h: No such file or directory
   #include <asm/bpf_perf_event.h>
                                ^
  compilation terminated.
  [...]

While pulling in tools/arch/*/include/uapi/asm/bpf_perf_event.h seems
to work fine, there's no automated fall-back logic right now that would
do the same out of tools/include/uapi/asm-generic/bpf_perf_event.h. The
usual convention today is to add a include/[uapi/]asm/ equivalent that
would pull in the correct arch header or generic one as fall-back, all
ifdef'ed based on compiler target definition. It's similarly done also
in other cases such as tools/include/asm/barrier.h, thus adapt the same
here.

Fixes: 618e165b ("selftests/bpf: sync kernel headers and introduce arch support in Makefile")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

720f228e

bpf: fix build issues on um due to mising bpf_perf_event.h · a23f06f0

Daniel Borkmann authored Dec 12, 2017

Since c895f6f7 ("bpf: correct broken uapi for
BPF_PROG_TYPE_PERF_EVENT program type") um (uml) won't build
on i386 or x86_64:

  [...]
    CC      init/main.o
  In file included from ../include/linux/perf_event.h:18:0,
                   from ../include/linux/trace_events.h:10,
                   from ../include/trace/syscall.h:7,
                   from ../include/linux/syscalls.h:82,
                   from ../init/main.c:20:
  ../include/uapi/linux/bpf_perf_event.h:11:32: fatal error:
  asm/bpf_perf_event.h: No such file or directory #include
  <asm/bpf_perf_event.h>
  [...]

Lets add missing bpf_perf_event.h also to um arch. This seems
to be the only one still missing.

Fixes: c895f6f7 ("bpf: correct broken uapi for BPF_PROG_TYPE_PERF_EVENT program type")
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Suggested-by: Richard Weinberger <richard@sigma-star.at>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Richard Weinberger <richard@sigma-star.at>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Richard Weinberger <richard@nod.at>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

a23f06f0

bpf: fix corruption on concurrent perf_event_output calls · 283ca526

Daniel Borkmann authored Dec 12, 2017

When tracing and networking programs are both attached in the
system and both use event-output helpers that eventually call
into perf_event_output(), then we could end up in a situation
where the tracing attached program runs in user context while
a cls_bpf program is triggered on that same CPU out of softirq
context.

Since both rely on the same per-cpu perf_sample_data, we could
potentially corrupt it. This can only ever happen in a combination
of the two types; all tracing programs use a bpf_prog_active
counter to bail out in case a program is already running on
that CPU out of a different context. XDP and cls_bpf programs
by themselves don't have this issue as they run in the same
context only. Therefore, split both perf_sample_data so they
cannot be accessed from each other.

Fixes: 20b9d7ac ("bpf: avoid excessive stack usage for perf_sample_data")
Reported-by: Alexei Starovoitov <ast@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Song Liu <songliubraving@fb.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>

283ca526

tcp md5sig: Use skb's saddr when replying to an incoming segment · 30791ac4

Christoph Paasch authored Dec 11, 2017

The MD5-key that belongs to a connection is identified by the peer's
IP-address. When we are in tcp_v4(6)_reqsk_send_ack(), we are replying
to an incoming segment from tcp_check_req() that failed the seq-number
checks.

Thus, to find the correct key, we need to use the skb's saddr and not
the daddr.

This bug seems to have been there since quite a while, but probably got
unnoticed because the consequences are not catastrophic. We will call
tcp_v4_reqsk_send_ack only to send a challenge-ACK back to the peer,
thus the connection doesn't really fail.

Fixes: 9501f972 ("tcp md5sig: Let the caller pass appropriate key for tcp_v{4,6}_do_calc_md5_hash().")
Signed-off-by: Christoph Paasch <cpaasch@apple.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

30791ac4

11 Dec, 2017 9 commits

fou: fix some member types in guehdr · 20080971

Xin Long authored Dec 10, 2017

guehdr struct is used to build or parse gue packets, which
are always in big endian. It's better to define all guehdr
members as __beXX types.

Also, in validate_gue_flags it's not good to use a __be32
variable for both Standard flags(__be16) and Private flags
(__be32), and pass it to other funcions.

This patch could fix a bunch of sparse warnings from fou.

Fixes: 5024c33a ("gue: Add infrastructure for flags and options")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

20080971

sctp: make sure stream nums can match optlen in sctp_setsockopt_reset_streams · 2342b8d9

Xin Long authored Dec 10, 2017

Now in sctp_setsockopt_reset_streams, it only does the check
optlen < sizeof(*params) for optlen. But it's not enough, as
params->srs_number_streams should also match optlen.

If the streams in params->srs_stream_list are less than stream
nums in params->srs_number_streams, later when dereferencing
the stream list, it could cause a slab-out-of-bounds crash, as
reported by syzbot.

This patch is to fix it by also checking the stream numbers in
sctp_setsockopt_reset_streams to make sure at least it's not
greater than the streams in the list.

Fixes: 7f9d68ac ("sctp: implement sender-side procedures for SSN Reset Request Parameter")
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

2342b8d9

net: ipv4: fix for a race condition in raw_sendmsg · 8f659a03

Mohamed Ghannam authored Dec 10, 2017

inet->hdrincl is racy, and could lead to uninitialized stack pointer
usage, so its value should be read only once.

Fixes: c008ba5b ("ipv4: Avoid reading user iov twice after raw_probe_proto_opt")
Signed-off-by: Mohamed Ghannam <simo.ghannam@gmail.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

8f659a03

netlink: Add netns check on taps · 93c64764

Kevin Cernekee authored Dec 06, 2017

Currently, a nlmon link inside a child namespace can observe systemwide
netlink activity.  Filter the traffic so that nlmon can only sniff
netlink messages from its own netns.

Test case:

    vpnns -- bash -c "ip link add nlmon0 type nlmon; \
                      ip link set nlmon0 up; \
                      tcpdump -i nlmon0 -q -w /tmp/nlmon.pcap -U" &
    sudo ip xfrm state add src 10.1.1.1 dst 10.1.1.2 proto esp \
        spi 0x1 mode transport \
        auth sha1 0x6162633132330000000000000000000000000000 \
        enc aes 0x00000000000000000000000000000000
    grep --binary abc123 /tmp/nlmon.pcap
Signed-off-by: Kevin Cernekee <cernekee@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

93c64764

net: sh_eth: do not advertise Gigabit capabilities when not available · 2aab6b40

Thomas Petazzoni authored Dec 08, 2017

Not all variants of the sh_eth hardware have Gigabit
support. Unfortunately, the current driver doesn't tell the PHY about
the limited MAC capabilities. Due to this, if you have a Gigabit
capable PHY, the PHY will advertise its Gigabit capability and
establish a link at 1Gbit/s, even though the MAC doesn't support it.

In order to avoid this, we use the recently introduced
phy_set_max_speed() to tell the PHY to not advertise speed higher than
100 MBit/s.

Tested on a SH7786 platform, with a Gigabit PHY.
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

2aab6b40

net: phy: meson-gxl: detect LPA corruption · f1e2400a

Jerome Brunet authored Dec 08, 2017

The purpose of this change is to fix the incorrect detection of the link
partner (LP) advertised capabilities which sometimes happens with this PHY
(roughly 1 time in a dozen)

This issue may cause the link to be negotiated at 10Mbps/Full or
10Mbps/Half when 100MBps/Full is actually possible. In some case, the link
is even completely broken and no communication is possible.

To detect the corruption, we must look for a magic undocumented bit in the
WOL bank (hint given by the SoC vendor kernel) but this is not enough to
cover all cases. We also have to look at the LPA ack. If the LP supports
Aneg but did not ack our base code when aneg is completed, we assume
something went wrong.

The detection of a corrupted LPA triggers a restart of the aneg process.
This solves the problem but may take up to 6 retries to complete.

Fixes: 7334b3e4 ("net: phy: Add Meson GXL Internal PHY driver")
Signed-off-by: Jerome Brunet <jbrunet@baylibre.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

f1e2400a

ptr_ring: add barriers · a8ceb5db

Michael S. Tsirkin authored Dec 05, 2017

Users of ptr_ring expect that it's safe to give the
data structure a pointer and have it be available
to consumers, but that actually requires an smb_wmb
or a stronger barrier.

In absence of such barriers and on architectures that reorder writes,
consumer might read an un=initialized value from an skb pointer stored
in the skb array.  This was observed causing crashes.

To fix, add memory barriers.  The barrier we use is a wmb, the
assumption being that producers do not need to read the value so we do
not need to order these reads.
Reported-by: George Cherian <george.cherian@cavium.com>
Suggested-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

a8ceb5db

Merge tag 'mac80211-for-davem-2017-12-11' of... · f0f1d016

David S. Miller authored Dec 11, 2017

Merge tag 'mac80211-for-davem-2017-12-11' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211

Johannes Berg says:

====================
Three fixes:
 * for certificate C file generation, don't use hexdump as it's
   not always installed by default, use pure posix instead (od/sed)
 * for certificate C file generation, don't write the file if
   anything fails, so the build abort will not cause a bad build
   upon a second attempt
 * fix locking in ieee80211_sta_tear_down_BA_sessions() which had
   been causing lots of locking warnings
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

f0f1d016

mac80211: fix locking in ieee80211_sta_tear_down_BA_sessions · 0afe9d4a

Johannes Berg authored Dec 09, 2017

Due to overlap between
commit 12811037 ("mac80211: Simplify locking in ieee80211_sta_tear_down_BA_sessions()")
and the way that Luca modified
commit 72e2c343 ("mac80211: tear down RX aggregations first")
when sending it upstream from Intel's internal tree, we get
the following warning:

WARNING: CPU: 0 PID: 5472 at net/mac80211/agg-tx.c:315 ___ieee80211_stop_tx_ba_session+0x158/0x1f0

since there's no appropriate locking around the call to
___ieee80211_stop_tx_ba_session; Sara's original just had
a call to the locked __ieee80211_stop_tx_ba_session (one
less underscore) but it looks like Luca modified both of
the calls when fixing it up for upstream, leading to the
problem at hand.

Move the locking appropriately to fix this problem.
Reported-by: Kalle Valo <kvalo@codeaurora.org>
Reported-by: Pavel Machek <pavel@ucw.cz>
Tested-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

0afe9d4a

08 Dec, 2017 25 commits

kmemcheck: rip it out for real · f335195a

Michal Hocko authored Dec 06, 2017

Commit 4675ff05 ("kmemcheck: rip it out") has removed the code but
for some reason SPDX header stayed in place.  This looks like a rebase
mistake in the mmotm tree or the merge mistake.  Let's drop those
leftovers as well.
Signed-off-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

f335195a

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · e9ef1fe3

Linus Torvalds authored Dec 08, 2017

Pull networking fixes from David Miller:

 1) CAN fixes from Martin Kelly (cancel URBs properly in all the CAN usb
    drivers).

 2) Revert returning -EEXIST from __dev_alloc_name() as this propagates
    to userspace and broke some apps. From Johannes Berg.

 3) Fix conn memory leaks and crashes in TIPC, from Jon Malloc and Cong
    Wang.

 4) Gianfar MAC can't do EEE so don't advertise it by default, from
    Claudiu Manoil.

 5) Relax strict netlink attribute validation, but emit a warning. From
    David Ahern.

 6) Fix regression in checksum offload of thunderx driver, from Florian
    Westphal.

 7) Fix UAPI bpf issues on s390, from Hendrik Brueckner.

 8) New card support in iwlwifi, from Ihab Zhaika.

 9) BBR congestion control bug fixes from Neal Cardwell.

10) Fix port stats in nfp driver, from Pieter Jansen van Vuuren.

11) Fix leaks in qualcomm rmnet, from Subash Abhinov Kasiviswanathan.

12) Fix DMA API handling in sh_eth driver, from Thomas Petazzoni.

13) Fix spurious netpoll warnings in bnxt_en, from Calvin Owens.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (67 commits)
  net: mvpp2: fix the RSS table entry offset
  tcp: evaluate packet losses upon RTT change
  tcp: fix off-by-one bug in RACK
  tcp: always evaluate losses in RACK upon undo
  tcp: correctly test congestion state in RACK
  bnxt_en: Fix sources of spurious netpoll warnings
  tcp_bbr: reset long-term bandwidth sampling on loss recovery undo
  tcp_bbr: reset full pipe detection on loss recovery undo
  tcp_bbr: record "full bw reached" decision in new full_bw_reached bit
  sfc: pass valid pointers from efx_enqueue_unwind
  gianfar: Disable EEE autoneg by default
  tcp: invalidate rate samples during SACK reneging
  can: peak/pcie_fd: fix potential bug in restarting tx queue
  can: usb_8dev: cancel urb on -EPIPE and -EPROTO
  can: kvaser_usb: cancel urb on -EPIPE and -EPROTO
  can: esd_usb2: cancel urb on -EPIPE and -EPROTO
  can: ems_usb: cancel urb on -EPIPE and -EPROTO
  can: mcba_usb: cancel urb on -EPROTO
  usbnet: fix alignment for frames with no ethernet header
  tcp: use current time in tcp_rcv_space_adjust()
  ...

e9ef1fe3

Merge tag 'media/v4.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media · 77071bc6

Linus Torvalds authored Dec 08, 2017

Pull media fixes from Mauro Carvalho Chehab:

 "A series of fixes for the media subsytem:

   - The largest amount of fixes in this series is with regards to
     comments that aren't kernel-doc, but start with "/**".

     A new check added for 4.15 makes it to produce a *huge* amount of
     new warnings (I'm compiling here with W=1). Most of the patches in
     this series fix those.

     No code changes - just comment changes at the source files

   - rc: some fixed in order to better handle RC repetition codes

   - v4l-async: use the v4l2_dev from the root notifier when matching
     sub-devices

   - v4l2-fwnode: Check subdev count after checking port

   - ov 13858 and et8ek8: compilation fix with randconfigs

   - usbtv: a trivial new USB ID addition

   - dibusb-common: don't do DMA on stack on firmware load

   - imx274: Fix error handling, add MAINTAINERS entry

   - sir_ir: detect presence of port"

* tag 'media/v4.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (50 commits)
  media: imx274: Fix error handling, add MAINTAINERS entry
  media: v4l: async: use the v4l2_dev from the root notifier when matching sub-devices
  media: v4l2-fwnode: Check subdev count after checking port
  media: et8ek8: select V4L2_FWNODE
  media: ov13858: Select V4L2_FWNODE
  media: rc: partial revert of "media: rc: per-protocol repeat period"
  media: dvb: i2c transfers over usb cannot be done from stack
  media: dvb-frontends: complete kernel-doc markups
  media: docs: add documentation for frontend attach info
  media: dvb_frontends: fix kernel-doc macros
  media: drivers: remove "/**" from non-kernel-doc comments
  media: lm3560: add a missing kernel-doc parameter
  media: rcar_jpu: fix two kernel-doc markups
  media: vsp1: add a missing kernel-doc parameter
  media: soc_camera: fix a kernel-doc markup
  media: mt2063: fix some kernel-doc warnings
  media: radio-wl1273: fix a parameter name at kernel-doc macro
  media: s3c-camif: add missing description at s3c_camif_find_format()
  media: mtk-vpu: add description for wdt fields at struct mtk_vpu
  media: vdec: fix some kernel-doc warnings
  ...

77071bc6

Merge tag 'drm-fixes-for-v4.15-rc3' of git://people.freedesktop.org/~airlied/linux · 4066aa72

Linus Torvalds authored Dec 08, 2017

Pull drm fixes from Dave Airlie:
 "This pull is a bit larger than I'd like but a large bunch of it is
  license fixes, AMD wanted to fix the licenses for a bunch of files
  that were missing them,

 Otherwise a bunch of TTM regression fix since the hugepage support,
 some i915 and gvt fixes, a core connector free in a safe context fix,
 and one bridge fix"

* tag 'drm-fixes-for-v4.15-rc3' of git://people.freedesktop.org/~airlied/linux: (26 commits)
  drm/bridge: analogix dp: Fix runtime PM state in get_modes() callback
  Revert "drm/i915: Display WA #1133 WaFbcSkipSegments:cnl, glk"
  drm/vc4: Fix false positive WARN() backtrace on refcount_inc() usage
  drm/i915: Call i915_gem_init_userptr() before taking struct_mutex
  drm/exynos: remove unnecessary function declaration
  drm/exynos: remove unnecessary descrptions
  drm/exynos: gem: Drop NONCONTIG flag for buffers allocated without IOMMU
  drm/exynos: Fix dma-buf import
  drm/ttm: swap consecutive allocated pooled pages v4
  drm: safely free connectors from connector_iter
  drm/i915/gvt: set max priority for gvt context
  drm/i915/gvt: Don't mark vgpu context as inactive when preempted
  drm/i915/gvt: Limit read hw reg to active vgpu
  drm/i915/gvt: Export intel_gvt_render_mmio_to_ring_id()
  drm/i915/gvt: Emulate PCI expansion ROM base address register
  drm/ttm: swap consecutive allocated cached pages v3
  drm/ttm: roundup the shrink request to prevent skip huge pool
  drm/ttm: add page order support in ttm_pages_put
  drm/ttm: add set_pages_wb for handling page order more than zero
  drm/ttm: add page order in page pool
  ...

4066aa72

Merge tag 'md/4.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/shli/md · 7267212c

Linus Torvalds authored Dec 08, 2017

Pull md fixes from Shaohua Li:
 "Some MD fixes.

  The notable one is a raid5-cache deadlock bug with dm-raid, others are
  not significant"

* tag 'md/4.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/shli/md:
  md/raid1/10: add missed blk plug
  md: limit mdstat resync progress to max_sectors
  md/r5cache: move mddev_lock() out of r5c_journal_mode_set()
  md/raid5: correct degraded calculation in raid5_error

7267212c

Merge tag 'devicetree-fixes-for-4.15-part2' of... · 78d9b048

Linus Torvalds authored Dec 08, 2017

Merge tag 'devicetree-fixes-for-4.15-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux

Pull DeviceTree fixes from Rob Herring:
 "Another set of DT fixes:

   - Fixes from overlay code rework. A trifecta of fixes to the locking,
     an out of bounds access, and a memory leak in of_overlay_apply()

   - Clean-up at25 eeprom binding document

   - Remove leading '0x' in unit-addresses from binding docs"

* tag 'devicetree-fixes-for-4.15-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
  of: overlay: Make node skipping in init_overlay_changeset() clearer
  of: overlay: Fix out-of-bounds write in init_overlay_changeset()
  of: overlay: Fix (un)locking in of_overlay_apply()
  of: overlay: Fix memory leak in of_overlay_apply() error path
  dt-bindings: eeprom: at25: Document device-specific compatible values
  dt-bindings: eeprom: at25: Grammar s/are can/can/
  dt-bindings: Remove leading 0x from bindings notation
  of: overlay: Remove else after goto
  of: Spelling s/changset/changeset/
  of: unittest: Remove bogus overlay mutex release from overlay_data_add()

78d9b048

Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost · 900add27

Linus Torvalds authored Dec 08, 2017

Pull virtio bugfixes from Michael Tsirkin:
 "A couple of minor bugfixes"

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
  virtio_net: fix return value check in receive_mergeable()
  virtio_mmio: add cleanup for virtio_mmio_remove
  virtio_mmio: add cleanup for virtio_mmio_probe

900add27

Merge tag 'for-linus-4.15-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip · 32abeb09

Linus Torvalds authored Dec 08, 2017

Pull xen fixes from Juergen Gross:
 "Just two small fixes for the new pvcalls frontend driver"

* tag 'for-linus-4.15-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  xen/pvcalls: Fix a check in pvcalls_front_remove()
  xen/pvcalls: check for xenbus_read() errors

32abeb09

Merge tag 'powerpc-4.15-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux · d90696ed

Linus Torvalds authored Dec 08, 2017

Pull powerpc fixes from Michael Ellerman:

 "One notable fix for kexec on Power9, where we were not clearing MMU
  PID properly which sometimes leads to hangs. Finally debugged to a
  root cause by Nick.

  A revert of a patch which tried to rework our panic handling to get
  more output on the console, but inadvertently broke reporting the
  panic to the hypervisor, which apparently people care about.

  Then a fix for an oops in the PMU code, and finally some s/%p/%px/ in
  xmon.

  Thanks to: David Gibson, Nicholas Piggin, Ravi Bangoria"

* tag 'powerpc-4.15-4' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/xmon: Don't print hashed pointers in xmon
  powerpc/64s: Initialize ISAv3 MMU registers before setting partition table
  Revert "powerpc: Do not call ppc_md.panic in fadump panic notifier"
  powerpc/perf: Fix oops when grouping different pmu events

d90696ed

Merge tag 'linux-can-fixes-for-4.15-20171208' of... · fd29117a

David S. Miller authored Dec 08, 2017

Merge tag 'linux-can-fixes-for-4.15-20171208' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can

Marc Kleine-Budde says:

====================
pull-request: can 2017-12-08

this is a pull request of 6 patches for net/master.

Martin Kelly provides 5 patches for various USB based CAN drivers, that
properly cancel the URBs on adapter unplug, so that the driver doesn't
end up in an endless loop. Stephane Grosjean provides a patch to restart
the tx queue if zero length packages are transmitted.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

fd29117a

Merge tag 'wireless-drivers-for-davem-2017-12-08' of... · 03afb6e4

David S. Miller authored Dec 08, 2017

Merge tag 'wireless-drivers-for-davem-2017-12-08' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers

Kalle Valo says:

====================
wireless-drivers fixes for 4.15

Second set of fixes for 4.15. This time a lot of iwlwifi patches and
two brcmfmac patches. Most important here are the MIC and IVC fixes
for iwlwifi to unbreak 9000 series.

iwlwifi

* fix rate-scaling to not start lowest possible rate

* fix the TX queue hang detection for AP/GO modes

* fix the TX queue hang timeout in monitor interfaces

* fix packet injection

* remove a wrong error message when dumping PCI registers

* fix race condition with RF-kill

* tell mac80211 when the MIC has been stripped (9000 series)

* tell mac80211 when the IVC has been stripped (9000 series)

* add 2 new PCI IDs, one for 9000 and one for 22000

* fix a queue hang due during a P2P Remain-on-Channel operation

brcmfmac

* fix a race which sometimes caused a crash during sdio unbind

* fix a kernel-doc related build error
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

03afb6e4

net: mvpp2: fix the RSS table entry offset · 8a7b741e

Antoine Tenart authored Dec 08, 2017

The macro used to access or set an RSS table entry was using an offset
of 8, while it should use an offset of 0. This lead to wrongly configure
the RSS table, not accessing the right entries.

Fixes: 1d7d15d7 ("net: mvpp2: initialize the RSS tables")
Signed-off-by: Antoine Tenart <antoine.tenart@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

8a7b741e

Merge branch 'tcp-RACK-loss-recovery-bug-fixes' · b7e445a1

David S. Miller authored Dec 08, 2017

Yuchung Cheng says:

====================
tcp: RACK loss recovery bug fixes

This patch set has four minor bug fixes in TCP RACK loss recovery.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

b7e445a1

tcp: evaluate packet losses upon RTT change · 6065fd0d

Yuchung Cheng authored Dec 07, 2017

RACK skips an ACK unless it advances the most recently delivered
TX timestamp (rack.mstamp). Since RACK also uses the most recent
RTT to decide if a packet is lost, RACK should still run the
loss detection whenever the most recent RTT changes. For example,
an ACK that does not advance the timestamp but triggers the cwnd
undo due to reordering, would then use the most recent (higher)
RTT measurement to detect further losses.
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Reviewed-by: Neal Cardwell <ncardwell@google.com>
Reviewed-by: Priyaranjan Jha <priyarjha@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

6065fd0d

tcp: fix off-by-one bug in RACK · 428aec5e

Yuchung Cheng authored Dec 07, 2017

RACK should mark a packet lost when remaining wait time is zero.
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Reviewed-by: Neal Cardwell <ncardwell@google.com>
Reviewed-by: Priyaranjan Jha <priyarjha@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

428aec5e

tcp: always evaluate losses in RACK upon undo · cd1fc85b

Yuchung Cheng authored Dec 07, 2017

When sender detects spurious retransmission, all packets
marked lost are remarked to be in-flight. However some may
be considered lost based on its timestamps in RACK. This patch
forces RACK to re-evaluate, which may be skipped previously if
the ACK does not advance RACK timestamp.
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Reviewed-by: Neal Cardwell <ncardwell@google.com>
Reviewed-by: Priyaranjan Jha <priyarjha@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

cd1fc85b

tcp: correctly test congestion state in RACK · 0ce294d8

Yuchung Cheng authored Dec 07, 2017

RACK does not test the loss recovery state correctly to compute
the reordering window. It assumes if lost_out is zero then TCP is
not in loss recovery. But it can be zero during recovery before
calling tcp_rack_detect_loss(): when an ACK acknowledges all
packets marked lost before receiving this ACK, but has not yet
to discover new ones by tcp_rack_detect_loss(). The fix is to
simply test the congestion state directly.
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Reviewed-by: Neal Cardwell <ncardwell@google.com>
Reviewed-by: Priyaranjan Jha <priyarjha@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

0ce294d8

bnxt_en: Fix sources of spurious netpoll warnings · 2edbdb31

Calvin Owens authored Dec 08, 2017

After applying 2270bc5d ("bnxt_en: Fix netpoll handling") and
903649e7 ("bnxt_en: Improve -ENOMEM logic in NAPI poll loop."),
we still see the following WARN fire:

  ------------[ cut here ]------------
  WARNING: CPU: 0 PID: 1875170 at net/core/netpoll.c:165 netpoll_poll_dev+0x15a/0x160
  bnxt_poll+0x0/0xd0 exceeded budget in poll
  <snip>
  Call Trace:
   [<ffffffff814be5cd>] dump_stack+0x4d/0x70
   [<ffffffff8107e013>] __warn+0xd3/0xf0
   [<ffffffff8107e07f>] warn_slowpath_fmt+0x4f/0x60
   [<ffffffff8179519a>] netpoll_poll_dev+0x15a/0x160
   [<ffffffff81795f38>] netpoll_send_skb_on_dev+0x168/0x250
   [<ffffffff817962fc>] netpoll_send_udp+0x2dc/0x440
   [<ffffffff815fa9be>] write_ext_msg+0x20e/0x250
   [<ffffffff810c8125>] call_console_drivers.constprop.23+0xa5/0x110
   [<ffffffff810c9549>] console_unlock+0x339/0x5b0
   [<ffffffff810c9a88>] vprintk_emit+0x2c8/0x450
   [<ffffffff810c9d5f>] vprintk_default+0x1f/0x30
   [<ffffffff81173df5>] printk+0x48/0x50
   [<ffffffffa0197713>] edac_raw_mc_handle_error+0x563/0x5c0 [edac_core]
   [<ffffffffa0197b9b>] edac_mc_handle_error+0x42b/0x6e0 [edac_core]
   [<ffffffffa01c3a60>] sbridge_mce_output_error+0x410/0x10d0 [sb_edac]
   [<ffffffffa01c47cc>] sbridge_check_error+0xac/0x130 [sb_edac]
   [<ffffffffa0197f3c>] edac_mc_workq_function+0x3c/0x90 [edac_core]
   [<ffffffff81095f8b>] process_one_work+0x19b/0x480
   [<ffffffff810967ca>] worker_thread+0x6a/0x520
   [<ffffffff8109c7c4>] kthread+0xe4/0x100
   [<ffffffff81884c52>] ret_from_fork+0x22/0x40

This happens because we increment rx_pkts on -ENOMEM and -EIO, resulting
in rx_pkts > 0. Fix this by only bumping rx_pkts if we were actually
given a non-zero budget.
Signed-off-by: Calvin Owens <calvinowens@fb.com>
Acked-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

2edbdb31

Merge branch 'tcp-bbr-sampling-fixes' · b25b3e2f

David S. Miller authored Dec 08, 2017

Neal Cardwell says:

====================
TCP BBR sampling fixes for loss recovery undo

This patch series has a few minor bug fixes for cases where spurious
loss recoveries can trick BBR estimators into estimating that the
available bandwidth is much lower than the true available bandwidth.
In both cases the fix here is to just reset the estimator upon loss
recovery undo.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

b25b3e2f

tcp_bbr: reset long-term bandwidth sampling on loss recovery undo · 600647d4

Neal Cardwell authored Dec 07, 2017

Fix BBR so that upon notification of a loss recovery undo BBR resets
long-term bandwidth sampling.

Under high reordering, reordering events can be interpreted as loss.
If the reordering and spurious loss estimates are high enough, this
can cause BBR to spuriously estimate that we are seeing loss rates
high enough to trigger long-term bandwidth estimation. To avoid that
problem, this commit resets long-term bandwidth sampling on loss
recovery undo events.
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Reviewed-by: Yuchung Cheng <ycheng@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

600647d4

tcp_bbr: reset full pipe detection on loss recovery undo · 2f6c498e

Neal Cardwell authored Dec 07, 2017

Fix BBR so that upon notification of a loss recovery undo BBR resets
the full pipe detection (STARTUP exit) state machine.

Under high reordering, reordering events can be interpreted as loss.
If the reordering and spurious loss estimates are high enough, this
could previously cause BBR to spuriously estimate that the pipe is
full.

Since spurious loss recovery means that our overall sending will have
slowed down spuriously, this commit gives a flow more time to probe
robustly for bandwidth and decide the pipe is really full.
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Reviewed-by: Yuchung Cheng <ycheng@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

2f6c498e

tcp_bbr: record "full bw reached" decision in new full_bw_reached bit · c589e69b

Neal Cardwell authored Dec 07, 2017

This commit records the "full bw reached" decision in a new
full_bw_reached bit. This is a pure refactor that does not change the
current behavior, but enables subsequent fixes and improvements.

In particular, this enables simple and clean fixes because the full_bw
and full_bw_cnt can be unconditionally zeroed without worrying about
forgetting that we estimated we filled the pipe in Startup. And it
enables future improvements because multiple code paths can be used
for estimating that we filled the pipe in Startup; any new code paths
only need to set this bit when they think the pipe is full.

Note that this fix intentionally reduces the width of the full_bw_cnt
counter, since we have never used the most significant bit.
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Reviewed-by: Yuchung Cheng <ycheng@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

c589e69b

sfc: pass valid pointers from efx_enqueue_unwind · d4a7a889

Bert Kenward authored Dec 07, 2017

The bytes_compl and pkts_compl pointers passed to efx_dequeue_buffers
cannot be NULL. Add a paranoid warning to check this condition and fix
the one case where they were NULL.

efx_enqueue_unwind() is called very rarely, during error handling.
Without this fix it would fail with a NULL pointer dereference in
efx_dequeue_buffer, with efx_enqueue_skb in the call stack.

Fixes: e9117e50 ("sfc: Firmware-Assisted TSO version 2")
Reported-by: Jarod Wilson <jarod@redhat.com>
Signed-off-by: Bert Kenward <bkenward@solarflare.com>
Tested-by: Jarod Wilson <jarod@redhat.com>
Acked-by: Jarod Wilson <jarod@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

d4a7a889

gianfar: Disable EEE autoneg by default · b6b5e8a6

Claudiu Manoil authored Dec 07, 2017

This controller does not support EEE, but it may connect to a PHY
which supports EEE and advertises EEE by default, while its link
partner also advertises EEE. If this happens, the PHY enters low
power mode when the traffic rate is low and causes packet loss.
This patch disables EEE advertisement by default for any PHY that
gianfar connects to, to prevent the above unwanted outcome.
Signed-off-by: Shaohui Xie <Shaohui.Xie@nxp.com>
Tested-by: Yangbo Lu <Yangbo.lu@nxp.com>
Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>

b6b5e8a6

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux · c6b3e969

Linus Torvalds authored Dec 08, 2017

Pull s390 fixes from Martin Schwidefsky:

 - three more patches in regard to the SPDX license tags. The missing
   tags for the files in arch/s390/kvm will be merged via the KVM tree.
   With that all s390 related files should have their SPDX tags.

 - a patch to get rid of 'struct timespec' in the DASD driver.

 - bug fixes

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390: fix compat system call table
  s390/mm: fix off-by-one bug in 5-level page table handling
  s390: Remove redudant license text
  s390: add a few more SPDX identifiers
  s390/dasd: prevent prefix I/O error
  s390: always save and restore all registers on context switch
  s390/dasd: remove 'struct timespec' usage
  s390/qdio: restrict target-full handling to IQDIO
  s390/qdio: consider ERROR buffers for inbound-full condition
  s390/virtio: add BSD license to virtio-ccw

c6b3e969