Commits · 06087114606c416892bd67c5fde9f0d498afb287 · Kirill Smelkov / linux

31 Oct, 2019 4 commits

Merge branch 'bpf-cleanup-btf-raw-tp' · 06087114

Daniel Borkmann authored Oct 31, 2019

Alexei Starovoitov says:

====================
v1->v2: addressed Andrii's feedback

When BTF-enabled raw_tp were introduced the plan was to follow up
with BTF-enabled kprobe and kretprobe reusing PROG_RAW_TRACEPOINT
and PROG_KPROBE types. But k[ret]probe expect pt_regs while
BTF-enabled program ctx will be the same as raw_tp.
kretprobe is indistinguishable from kprobe while BTF-enabled
kretprobe will have access to retval while kprobe will not.
Hence PROG_KPROBE type is not reusable and reusing
PROG_RAW_TRACEPOINT no longer fits well.
Hence introduce 'umbrella' prog type BPF_PROG_TYPE_TRACING
that will cover different BTF-enabled tracing attach points.
The changes make libbpf side cleaner as well.
check_attach_btf_id() is cleaner too.
====================
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

06087114

libbpf: Add support for prog_tracing · 12a8654b

Alexei Starovoitov authored Oct 30, 2019

Cleanup libbpf from expected_attach_type == attach_btf_id hack
and introduce BPF_PROG_TYPE_TRACING.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20191030223212.953010-3-ast@kernel.org

12a8654b

bpf: Replace prog_raw_tp+btf_id with prog_tracing · f1b9509c

Alexei Starovoitov authored Oct 30, 2019

The bpf program type raw_tp together with 'expected_attach_type'
was the most appropriate api to indicate BTF-enabled raw_tp programs.
But during development it became apparent that 'expected_attach_type'
cannot be used and new 'attach_btf_id' field had to be introduced.
Which means that the information is duplicated in two fields where
one of them is ignored.
Clean it up by introducing new program type where both
'expected_attach_type' and 'attach_btf_id' fields have
specific meaning.
In the future 'expected_attach_type' will be extended
with other attach points that have similar semantics to raw_tp.
This patch replaces BTF-enabled BPF_PROG_TYPE_RAW_TRACEPOINT with
  prog_type = BPF_PROG_TYPE_TRACING
  expected_attach_type = BPF_TRACE_RAW_TP
  attach_btf_id = btf_id of raw tracepoint inside the kernel
Future patches will add
  expected_attach_type = BPF_TRACE_FENTRY or BPF_TRACE_FEXIT
where programs have the same input context and the same helpers,
but different attach points.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20191030223212.953010-2-ast@kernel.org
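
The field combination described above can be illustrated with a small, self-contained sketch. It is not the kernel's code: the `demo_` enums, the struct, and the exact validation rules are assumptions chosen for illustration, and the numeric values do not match include/uapi/linux/bpf.h. The point is only that, with a dedicated program type, both 'expected_attach_type' and 'attach_btf_id' carry meaning, and other program types leave 'attach_btf_id' unset.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel enums; values are illustrative only. */
enum demo_prog_type { DEMO_PROG_TYPE_RAW_TRACEPOINT, DEMO_PROG_TYPE_TRACING };
enum demo_attach_type { DEMO_ATTACH_NONE, DEMO_TRACE_RAW_TP };

struct demo_load_attr {
	enum demo_prog_type prog_type;
	enum demo_attach_type expected_attach_type;
	uint32_t attach_btf_id; /* BTF id of the attach target inside the kernel */
};

/*
 * Assumed rule set for this sketch:
 *  - the tracing type needs a known attach type and a non-zero BTF id;
 *  - every other type must leave attach_btf_id at zero.
 */
int demo_check_attach(const struct demo_load_attr *attr)
{
	if (attr->prog_type == DEMO_PROG_TYPE_TRACING) {
		if (attr->expected_attach_type != DEMO_TRACE_RAW_TP)
			return -EINVAL;
		if (attr->attach_btf_id == 0)
			return -EINVAL;
		return 0;
	}
	return attr->attach_btf_id ? -EINVAL : 0;
}
```

A caller would fill in `struct demo_load_attr` and reject the load when `demo_check_attach()` returns `-EINVAL`; adding a further attach point would mean one more accepted value of `expected_attach_type`.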

f1b9509c

bpf: Fix bpf jit kallsym access · af91acbc

Alexei Starovoitov authored Oct 30, 2019

Jiri reported a crash when JIT is on, but net.core.bpf_jit_kallsyms is off.
bpf_prog_kallsyms_find() was skipping addr->bpf_prog resolution
logic in oops and stack traces. That's incorrect.
It should only skip addr->name resolution for 'cat /proc/kallsyms'.
That's what bpf_jit_kallsyms and bpf_jit_harden protect.

Fixes: 3dec541b ("bpf: Add support for BTF pointers to x86 JIT")
Reported-by: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20191030233019.1187404-1-ast@kernel.org

af91acbc

30 Oct, 2019 5 commits

bpf, testing: Introduce 'gso_linear_no_head_frag' skb_segment test · cf204a71

Shmulik Ladkani authored Oct 25, 2019

Following reports of skb_segment() hitting a BUG_ON when working on
GROed skbs which have their gso_size mangled (e.g. after a
bpf_skb_change_proto call), add a reproducer test that mimics the
input skbs that lead to the mentioned BUG_ON as in [1] and validates the
fix submitted in [2].

[1] https://lists.openwall.net/netdev/2019/08/26/110
[2] commit 3dcbdb13 ("net: gso: Fix skb_segment splat when splitting gso_size mangled skb having linear-headed frag_list")
Signed-off-by: Shmulik Ladkani <shmulik.ladkani@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20191025134223.2761-3-shmulik.ladkani@gmail.com

cf204a71

bpf, testing: Refactor test_skb_segment() for testing skb_segment() on different skbs · af21c717

Shmulik Ladkani authored Oct 25, 2019

Currently, test_skb_segment() builds a single test skb and runs
skb_segment() on it.

Extend test_skb_segment() so it processes an array of numerous
skb/feature pairs to test.
Signed-off-by: Shmulik Ladkani <shmulik.ladkani@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20191025134223.2761-2-shmulik.ladkani@gmail.com

af21c717

bpf: Add s390 testing documentation · 7e07e7ae

Ilya Leoshkevich authored Oct 29, 2019

This commit adds a document that explains how to test BPF in an s390
QEMU guest.
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20191029172916.36528-1-iii@linux.ibm.com

7e07e7ae

selftests/bpf: Test narrow load from bpf_sysctl.write · 9ffccb76

Ilya Leoshkevich authored Oct 29, 2019

There are tests for full and narrow loads from bpf_sysctl.file_pos, but
for bpf_sysctl.write only full load is tested. Add the missing test.
Suggested-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrey Ignatov <rdna@fb.com>
Link: https://lore.kernel.org/bpf/20191029143027.28681-1-iii@linux.ibm.com

9ffccb76

bpf: Enforce 'return 0' in BTF-enabled raw_tp programs · 15ab09bd

Alexei Starovoitov authored Oct 28, 2019

The return value of raw_tp programs is ignored by __bpf_trace_run()
that calls them. The verifier also allows any value to be returned.
For BTF-enabled raw_tp let's enforce 'return 0', so that return value
can be used for something in the future.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20191029032426.1206762-1-ast@kernel.org

15ab09bd

29 Oct, 2019 3 commits

libbpf: Don't use kernel-side u32 type in xsk.c · a566e35f

Andrii Nakryiko authored Oct 28, 2019

u32 is a kernel-side typedef. A user-space library is supposed to use __u32.
This breaks GitHub's projection of libbpf. Fix by replacing u32 with __u32.

Fixes: 94ff9ebb ("libbpf: Fix compatibility for kernels without need_wakeup")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Björn Töpel <bjorn.topel@intel.com>
Cc: Magnus Karlsson <magnus.karlsson@intel.com>
Link: https://lore.kernel.org/bpf/20191029055953.2461336-1-andriin@fb.com

a566e35f

libbpf: Fix off-by-one error in ELF sanity check · d3a3aa0c

Andrii Nakryiko authored Oct 28, 2019

libbpf's bpf_object__elf_collect() does a simple sanity check after iterating
over all ELF sections: it checks that the .strtab index is correct. Unfortunately,
due to section indices being 1-based, the check breaks when .strtab
ends up being the very last section in the ELF file.

Fixes: 77ba9a5b ("tools lib bpf: Fetch map names from correct strtab")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191028233727.1286699-1-andriin@fb.com
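
The off-by-one described above can be shown with a minimal sketch. The function names and the `nr_sections` parameter are hypothetical and not taken from libbpf; the sketch only assumes that sections are numbered from 1, so the last visited section has an index equal to the number of sections visited.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Correct check for 1-based indices: valid values are 1..nr_sections,
 * so the upper bound is inclusive.
 */
bool demo_strtab_idx_ok(size_t strtab_idx, size_t nr_sections)
{
	return strtab_idx != 0 && strtab_idx <= nr_sections;
}

/*
 * Off-by-one variant: the exclusive upper bound wrongly rejects a
 * .strtab that sits in the very last section slot.
 */
bool demo_strtab_idx_ok_buggy(size_t strtab_idx, size_t nr_sections)
{
	return strtab_idx != 0 && strtab_idx < nr_sections;
}
```

With five sections and .strtab at index 5, the first function accepts the object while the second rejects it, which is the failure mode the commit message describes.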

d3a3aa0c

libbpf: Fix compatibility for kernels without need_wakeup · 94ff9ebb

Magnus Karlsson authored Oct 25, 2019

When the need_wakeup flag was added to AF_XDP, the format of the
XDP_MMAP_OFFSETS getsockopt was extended. Code was added to the
kernel to take care of compatibility issues arising from running
applications using any of the two formats. However, libbpf was
not extended to take care of the case when the application/libbpf
uses the new format but the kernel only supports the old
format. This patch adds support in libbpf for parsing the old
format, before the need_wakeup flag was added, and emulating a
set of static need_wakeup flags that will always work for the
application.

v2 -> v3:
* Incorporated code improvements suggested by Jonathan Lemon

v1 -> v2:
* Rebased to bpf-next
* Rewrote the code as the previous version made you blind

Fixes: a4500432 ("libbpf: add support for need_wakeup flag in AF_XDP part")
Reported-by: Eloy Degen <degeneloy@gmail.com>
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Link: https://lore.kernel.org/bpf/1571995035-21889-1-git-send-email-magnus.karlsson@intel.com
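
One way to picture the compatibility handling described above is to pick the format from the length the kernel reports back and widen the old record into the new one. The sketch below is an assumption-laden illustration, not libbpf's code: the `demo_` structs model a single ring record (the real definitions are in <linux/if_xdp.h>), `buf`/`len` stand in for what getsockopt() returned, and `emulated_flags_off` is a hypothetical caller-chosen offset at which a static flags value can be read.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical single-ring layouts: old format has no flags offset. */
struct demo_ring_off_v1 { uint64_t producer, consumer, desc; };
struct demo_ring_off    { uint64_t producer, consumer, desc, flags; };

/*
 * Parse a ring-offset record in either format into the new layout.
 * Returns 0 on success, -1 if len matches neither known format.
 */
int demo_parse_ring_off(const void *buf, size_t len,
			uint64_t emulated_flags_off,
			struct demo_ring_off *out)
{
	if (len == sizeof(struct demo_ring_off)) {
		/* New format: the kernel already filled in flags. */
		memcpy(out, buf, sizeof(*out));
		return 0;
	}
	if (len == sizeof(struct demo_ring_off_v1)) {
		struct demo_ring_off_v1 old;

		memcpy(&old, buf, sizeof(old));
		out->producer = old.producer;
		out->consumer = old.consumer;
		out->desc = old.desc;
		/* Old kernels report no flags offset; use the emulated one. */
		out->flags = emulated_flags_off;
		return 0;
	}
	return -1;
}
```

Keying on the returned length keeps the application-facing structure identical on old and new kernels, so callers never need to know which format was reported.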

94ff9ebb

28 Oct, 2019 2 commits

selftests/bpf: Restore $(OUTPUT)/test_stub.o rule · e93d9918

Ilya Leoshkevich authored Oct 28, 2019

`make O=/linux-build kselftest TARGETS=bpf` fails with

	make[3]: *** No rule to make target '/linux-build/bpf/test_stub.o', needed by '/linux-build/bpf/test_verifier'

The same command without the O= part works, presumably thanks to the
implicit rule.

Fix by restoring the explicit $(OUTPUT)/test_stub.o rule.

Fixes: 74b5a596 ("selftests/bpf: Replace test_progs and test_maps w/ general rule")
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20191028102110.7545-1-iii@linux.ibm.com

e93d9918

selftest/bpf: Use -m{little, big}-endian for clang · 313e7f6f

Ilya Leoshkevich authored Oct 28, 2019

When cross-compiling tests from x86 to s390, the resulting BPF objects
fail to load due to endianness mismatch.

Fix by using BPF-GCC endianness check for clang as well.
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20191028102049.7489-1-iii@linux.ibm.com

313e7f6f

27 Oct, 2019 1 commit

Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next · 5b7fe93d

David S. Miller authored Oct 26, 2019

Daniel Borkmann says:

====================
pull-request: bpf-next 2019-10-27

The following pull-request contains BPF updates for your *net-next* tree.

We've added 52 non-merge commits during the last 11 day(s) which contain
a total of 65 files changed, 2604 insertions(+), 1100 deletions(-).

The main changes are:

 1) Revolutionize BPF tracing by using in-kernel BTF to type check BPF
    assembly code. The work here teaches BPF verifier to recognize
    kfree_skb()'s first argument as 'struct sk_buff *' in tracepoints
    such that verifier allows direct use of bpf_skb_event_output() helper
    used in tc BPF et al (w/o probing memory access) that dumps skb data
    into perf ring buffer. Also add direct loads to probe memory in order
    to speed up/replace bpf_probe_read() calls, from Alexei Starovoitov.

 2) Big batch of changes to improve libbpf and BPF kselftests. Besides
    others: generalization of libbpf's CO-RE relocation support to now
    also include field existence relocations, revamp the BPF kselftest
    Makefile to add test runner concept allowing to exercise various
    ways to build BPF programs, and teach bpf_object__open() and friends
    to automatically derive BPF program type/expected attach type from
    section names to ease their use, from Andrii Nakryiko.

 3) Fix deadlock in stackmap's build-id lookup on rq_lock(), from Song Liu.

 4) Allow to read BTF as raw data from bpftool. Most notable use case
    is to dump /sys/kernel/btf/vmlinux through this, from Jiri Olsa.

 5) Use bpf_redirect_map() helper in libbpf's AF_XDP helper prog which
    manages to improve "rx_drop" performance by ~4%, from Björn Töpel.

 6) Fix to restore the flow dissector after reattach BPF test and also
    fix error handling in bpf_helper_defs.h generation, from Jakub Sitnicki.

 7) Improve verifier's BTF ctx access for use outside of raw_tp, from
    Martin KaFai Lau.

 8) Improve documentation for AF_XDP with new sections and to reflect
    latest features, from Magnus Karlsson.

 9) Add back 'version' section parsing to libbpf for old kernels, from
    John Fastabend.

10) Fix strncat bounds error in libbpf's libbpf_prog_type_by_name(),
    from KP Singh.

11) Turn on -mattr=+alu32 in LLVM by default for BPF kselftests in order
    to improve insn coverage for built BPF progs, from Yonghong Song.

12) Misc minor cleanups and fixes, from various others.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

5b7fe93d

26 Oct, 2019 25 commits

tc-testing: list required kernel options for act_ct action · b9512485

Roman Mashak authored Oct 26, 2019

Updated config with required kernel options for conntrack TC action,
so that tdc can run the tests.
Signed-off-by: Roman Mashak <mrv@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

b9512485

Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next · 4b1f5dda

David S. Miller authored Oct 26, 2019

Pablo Neira Ayuso says:

====================
Netfilter/IPVS updates for net-next

The following patchset contains Netfilter/IPVS updates for net-next,
more specifically:

* Updates for ipset:

1) Coding style fix for ipset comment extension, from Jeremy Sowden.

2) De-inline many functions in ipset, from Jeremy Sowden.

3) Move ipset function definition from header to source file.

4) Move ip_set_put_flags() to source, export it as a symbol, remove
   inline.

5) Move range_to_mask() to the source file where this is used.

6) Move ip_set_get_ip_port() to the source file where this is used.

* IPVS selftests and netns improvements:

7) Two patches to speedup ipvs netns dismantle, from Haishuang Yan.

8) Three patches to add selftest script for ipvs, also from
   Haishuang Yan.

* Conntrack updates and new nf_hook_slow_list() function:

9) Document ct ecache extension, from Florian Westphal.

10) Skip ct extensions from ctnetlink dump, from Florian.

11) Free ct extension immediately, from Florian.

12) Skip access to ecache extension from nf_ct_deliver_cached_events()
    this is not correct as reported by Syzbot.

13) Add and use nf_hook_slow_list(), from Florian.

* Flowtable infrastructure updates:

14) Move priority to nf_flowtable definition.

15) Dynamic allocation of per-device hooks in flowtables.

16) Allow to include netdevice only once in flowtable definitions.

17) Raise maximum number of devices per flowtable.

* Netfilter hardware offload infrastructure updates:

18) Add nft_flow_block_chain() helper function.

19) Pass callback list to nft_setup_cb_call().

20) Add nft_flow_cls_offload_setup() helper function.

21) Remove rules for the unregistered device via netdevice event.

22) Support for multiple devices in a basechain definition at the
    ingress hook.

23) Add nft_chain_offload_cmd() helper function.

24) Add nft_flow_block_offload_init() helper function.

25) Rewind in case of failing to bind multiple devices to hook.

26) Typo in IPv6 tproxy module description, from Norman Rasmussen.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

4b1f5dda

Merge branch 'net-aquantia-ptp-followup-fixes' · 64fe8e97

David S. Miller authored Oct 26, 2019

Igor Russkikh says:

====================
net: aquantia: ptp followup fixes

Here are fixes for two sparse warnings; the third patch is a fix for
the missing scaled_ppm_to_ppb. Eventually I reworked this
to exclude the ptp module from the build. Please consider it instead
of this patch: https://patchwork.ozlabs.org/patch/1184171/
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

64fe8e97

net: aquantia: disable ptp object build if no config · 7873ee26

Igor Russkikh authored Oct 26, 2019

We do disable aq_ptp module build using inline
stubs when CONFIG_PTP_1588_CLOCK is not declared.

This reduces module size and removes unnecessary code.
Reported-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

7873ee26

net: aquantia: fix warnings on endianness · 5eeb6c3c

Igor Russkikh authored Oct 26, 2019

fixes to remove sparse warnings:
sparse: sparse: cast to restricted __be64

Fixes: 04a18399 ("net: aquantia: implement data PTP datapath")
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

5eeb6c3c

net: aquantia: fix var initialization warning · bb1eded1

Igor Russkikh authored Oct 26, 2019

found by sparse, simply useless local initialization with zero.

Fixes: 94ad9455 ("net: aquantia: add PTP rings infrastructure")
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Igor Russkikh <igor.russkikh@aquantia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

bb1eded1

netfilter: nf_tables_offload: unbind if multi-device binding fails · 671312e1

Pablo Neira Ayuso authored Oct 24, 2019

nft_flow_block_chain() needs to unbind in case of error when performing
the multi-device binding.

Fixes: d54725cd ("netfilter: nf_tables: support for multiple devices per netdev hook")
Reported-by: wenxu <wenxu@ucloud.cn>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

671312e1

netfilter: nf_tables_offload: add nft_flow_block_offload_init() · 75ceaf86

Pablo Neira Ayuso authored Oct 24, 2019

This patch adds the nft_flow_block_offload_init() helper function to
initialize the flow_block_offload object.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

75ceaf86

netfilter: nf_tables_offload: add nft_chain_offload_cmd() · 6df5490f

Pablo Neira Ayuso authored Oct 24, 2019

This patch adds the nft_chain_offload_cmd() helper function.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>

6df5490f

netfilter: ecache: don't look for ecache extension on dying/unconfirmed conntracks · ad88b7a6

Florian Westphal authored Oct 22, 2019

syzbot reported following splat:
BUG: KASAN: use-after-free in __nf_ct_ext_exist
include/net/netfilter/nf_conntrack_extend.h:53 [inline]
BUG: KASAN: use-after-free in nf_ct_deliver_cached_events+0x5c3/0x6d0
net/netfilter/nf_conntrack_ecache.c:205
nf_conntrack_confirm include/net/netfilter/nf_conntrack_core.h:65 [inline]
nf_confirm+0x3d8/0x4d0 net/netfilter/nf_conntrack_proto.c:154
[..]

While there is no reproducer yet, the syzbot report contains one
interesting bit of information:

Freed by task 27585:
[..]
 kfree+0x10a/0x2c0 mm/slab.c:3757
 nf_ct_ext_destroy+0x2ab/0x2e0 net/netfilter/nf_conntrack_extend.c:38
 nf_conntrack_free+0x8f/0xe0 net/netfilter/nf_conntrack_core.c:1418
 destroy_conntrack+0x1a2/0x270 net/netfilter/nf_conntrack_core.c:626
 nf_conntrack_put include/linux/netfilter/nf_conntrack_common.h:31 [inline]
 nf_ct_resolve_clash net/netfilter/nf_conntrack_core.c:915 [inline]
 ^^^^^^^^^^^^^^^^^^^
 __nf_conntrack_confirm+0x21ca/0x2830 net/netfilter/nf_conntrack_core.c:1038
 nf_conntrack_confirm include/net/netfilter/nf_conntrack_core.h:63 [inline]
 nf_confirm+0x3e7/0x4d0 net/netfilter/nf_conntrack_proto.c:154

This is what's happening:

1. a conntrack entry is about to be confirmed (added to hash table).
2. a clash with existing entry is detected.
3. nf_ct_resolve_clash() puts skb->nfct (the "losing" entry).
4. this entry now has a refcount of 0 and is freed to SLAB_TYPESAFE_BY_RCU
   kmem cache.

skb->nfct has been replaced by the one found in the hash.
Problem is that nf_conntrack_confirm() uses the old ct:

static inline int nf_conntrack_confirm(struct sk_buff *skb)
{
	struct nf_conn *ct = (struct nf_conn *)skb_nfct(skb);
	int ret = NF_ACCEPT;

	if (ct) {
		if (!nf_ct_is_confirmed(ct))
			ret = __nf_conntrack_confirm(skb);
		if (likely(ret == NF_ACCEPT))
			nf_ct_deliver_cached_events(ct); /* This ct has refcount 0! */
	}
	return ret;
}

As of "netfilter: conntrack: free extension area immediately", we can't
access conntrack extensions in this case.

To fix this, make sure we check the dying bit presence before attempting
to get the ecache extension.

Reported-by: syzbot+c7aabc9fe93e7f3637ba@syzkaller.appspotmail.com
Fixes: 2ad9d774 ("netfilter: conntrack: free extension area immediately")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
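
The guard described above can be sketched on a simplified model. This is not the netfilter code: `struct demo_ct` and `demo_ecache_find()` are hypothetical, and the state bits are reduced to two booleans. It only illustrates the ordering, namely that the state is checked first and the extension area is touched only for a confirmed entry that is not dying.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified model of a conntrack entry. */
struct demo_ct {
	bool confirmed; /* entry was added to the hash table */
	bool dying;     /* entry is being torn down */
	void *ext;      /* extension area, freed together with the entry */
};

/* Returns the ecache extension, or NULL when it must not be accessed. */
void *demo_ecache_find(const struct demo_ct *ct)
{
	/*
	 * An unconfirmed or dying entry may already have lost its
	 * extension area, so bail out before reading ct->ext.
	 */
	if (!ct->confirmed || ct->dying)
		return NULL;
	return ct->ext;
}
```

A caller that gets NULL back simply skips event delivery for that entry.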

ad88b7a6

Merge branch 'ionic-updates' · 0629d245

David S. Miller authored Oct 25, 2019

Shannon Nelson says:

====================
ionic updates

These are a few of the driver updates we've been working on internally.
These clean up a few mismatched struct comments, add checking for dead
firmware, fix an initialization bug, and change the Rx buffer management.

These are based on net-next v5.4-rc3-709-g985fd98a.

v2: clear napi->skb in the error case in ionic_rx_frags()
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

0629d245

ionic: update driver version · 63ad1cd6

Shannon Nelson authored Oct 23, 2019

Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>

63ad1cd6

ionic: implement support for rx sgl · 08f2e4b2

Shannon Nelson authored Oct 23, 2019

Even out Rx performance across MTU sizes by changing from full
skb allocations to page-based frag allocations.  The device
supports a form of scatter-gather in the Rx path, so we can
set up a number of pages for each descriptor, all of which are
easier to alloc and pass around than the standard kzalloc'd
buffer.  An skb is wrapped around the pages while processing
the received packets, and pages are recycled as needed, or
left alone if they weren't used in the Rx.
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>

08f2e4b2

ionic: add a watchdog timer to monitor heartbeat · 089406bc

Shannon Nelson authored Oct 23, 2019

Add a watchdog to periodically monitor the NIC heartbeat.
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>

089406bc

ionic: add heartbeat check · 97ca4865

Shannon Nelson authored Oct 23, 2019

Most of our firmware has a heartbeat feature that the driver
can watch for to see if the FW is still alive and likely to
answer a dev_cmd or AdminQ request.
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>

97ca4865

ionic: reverse an interrupt coalesce calculation · ff7ebed9

Shannon Nelson authored Oct 23, 2019

Fix the initial interrupt coalesce usec-to-hw setting
to actually be usec-to-hw.

Fixes: 780eded3 ("ionic: report users coalesce request")
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>

ff7ebed9

ionic: fix up struct name comments · 5c28f213

Shannon Nelson authored Oct 23, 2019

Fix up struct names in the ionic_if.h comments
Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>

5c28f213

r8169: improve rtl8169_rx_fill · e4b5c7a5

Heiner Kallweit authored Oct 23, 2019

We have only one user of the error path, so we can inline it.
In addition the call to rtl8169_make_unusable_by_asic() can be removed
because rtl8169_alloc_rx_data() didn't call rtl8169_mark_to_asic() yet
for the respective index if returning NULL.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

e4b5c7a5

r8169: align fix_features callback with vendor driver · 7cb83b21

Heiner Kallweit authored Oct 23, 2019

This patch aligns the fix_features callback with the vendor driver and
also disables IPv6 HW checksumming and TSO if jumbo packets are used
on RTL8101/RTL8168/RTL8125.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

7cb83b21

Merge branch 'for-upstream' of... · 8ca12bc3

David S. Miller authored Oct 25, 2019

Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next

Johan Hedberg says:

====================
pull request: bluetooth-next 2019-10-23

Here's the main bluetooth-next pull request for the 5.5 kernel:

 - Multiple fixes to hci_qca driver
 - Fix for HCI_USER_CHANNEL initialization
 - btwilink: drop superseded driver
 - Add support for Intel FW download error recovery
 - Various other smaller fixes & improvements

Please let me know if there are any issues pulling. Thanks.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

8ca12bc3

tcp: add TCP_INFO status for failed client TFO · 48027478

Jason Baron authored Oct 23, 2019

The TCPI_OPT_SYN_DATA bit as part of tcpi_options currently reports whether
or not data-in-SYN was ack'd on both the client and server side. We'd like
to gather more information on the client-side in the failure case in order
to indicate the reason for the failure. This can be useful for not only
debugging TFO, but also for creating TFO socket policies. For example, if
a middle box removes the TFO option or drops a data-in-SYN, we
can detect this case, and turn off TFO for these connections saving the
extra retransmits.

The newly added tcpi_fastopen_client_fail status is 2 bits and has the
following 4 states:

1) TFO_STATUS_UNSPEC

Catch-all state which includes when TFO is disabled via black hole
detection, which is indicated via LINUX_MIB_TCPFASTOPENBLACKHOLE.

2) TFO_COOKIE_UNAVAILABLE

If TFO_CLIENT_NO_COOKIE mode is off, this state indicates that no cookie
is available in the cache.

3) TFO_DATA_NOT_ACKED

Data was sent with SYN, we received a SYN/ACK but it did not cover the data
portion. Cookie is not accepted by server because the cookie may be invalid
or the server may be overloaded.

4) TFO_SYN_RETRANSMITTED

Data was sent with SYN, we received a SYN/ACK which did not cover the data
after at least 1 additional SYN was sent (without data). It may be the case
that a middle-box is dropping data-in-SYN packets. Thus, it would be more
efficient to not use TFO on this connection to avoid extra retransmits
during connection establishment.

These new fields do not cover all the cases where TFO may fail, but other
failures, such as SYN/ACK + data being dropped, will result in the
connection not becoming established. And a connection blackhole after
session establishment shows up as a stalled connection.
Signed-off-by: Jason Baron <jbaron@akamai.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Christoph Paasch <cpaasch@apple.com>
Cc: Yuchung Cheng <ycheng@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
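
The four states listed above fit in a 2-bit field, which a short sketch can make concrete. The `demo_`-prefixed names are local stand-ins so the block does not depend on a particular kernel header version, and the numeric values 0 to 3 are an assumption that simply follows the order of the list. The policy helper is hypothetical and mirrors the middle-box example from the description.

```c
/* Local stand-ins for the four states; values assumed to follow list order. */
enum demo_tfo_client_fail {
	DEMO_TFO_STATUS_UNSPEC = 0,
	DEMO_TFO_COOKIE_UNAVAILABLE = 1,
	DEMO_TFO_DATA_NOT_ACKED = 2,
	DEMO_TFO_SYN_RETRANSMITTED = 3,
};

/* Map a status to a short label; only the low 2 bits are significant. */
const char *demo_tfo_fail_str(unsigned int status)
{
	switch (status & 0x3) {
	case DEMO_TFO_COOKIE_UNAVAILABLE:
		return "cookie unavailable";
	case DEMO_TFO_DATA_NOT_ACKED:
		return "data not acked";
	case DEMO_TFO_SYN_RETRANSMITTED:
		return "syn retransmitted";
	default:
		return "unspec";
	}
}

/*
 * Hypothetical socket policy: stop using TFO towards a peer when the
 * SYN had to be retransmitted, since a middle box may be dropping
 * data-in-SYN packets on that path.
 */
int demo_tfo_should_disable(unsigned int status)
{
	return (status & 0x3) == DEMO_TFO_SYN_RETRANSMITTED;
}
```

An application would read the status from TCP_INFO after connection setup and feed it to a policy like `demo_tfo_should_disable()` when deciding whether to request TFO on the next connection to the same destination.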

48027478

Merge branch 'phy-dp83867-enable-robust-auto-mdix' · 79f2056b

David S. Miller authored Oct 25, 2019

Grygorii Strashko says:

====================
net: phy: dp83867: enable robust auto-mdix

Patch 1 - improves link detection when dp83867 PHY is configured in manual mode
by enabling CFG3[9] Robust Auto-MDIX option.

Patch 2 - is minor optimization.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>

79f2056b

net: phy: dp83867: move dt parsing to probe · ef87f7da

Grygorii Strashko authored Oct 23, 2019

Move DT parsing code to dp83867_probe() as it's a one-time operation.
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

ef87f7da

net: phy: dp83867: enable robust auto-mdix · 5a7f08c2

Grygorii Strashko authored Oct 23, 2019

The link detection timeouts can be observed (or link might not be detected
at all) when dp83867 PHY is configured in manual mode (speed/duplex).

The CFG3[9] Robust Auto-MDIX option significantly improves link detection
when dp83867 is configured in manual mode and reduces link detection
time.
As per DM: "If link partners are configured to operational modes that are
not supported by normal Auto MDI/MDIX mode (like Auto-Neg versus Force
100Base-TX or Force 100Base-TX versus Force 100Base-TX), this Robust Auto
MDI/MDIX mode allows MDI/MDIX resolution and prevents deadlock."

Hence, enable this option by default as there are no known reasons
not to do so.
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
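
Enabling the option amounts to setting bit 9 of the CFG3 register value in a read-modify-write. The sketch below shows only the bit manipulation; the macro and function names are hypothetical, and the actual register read and write through the PHY access helpers is left out.

```c
#include <stdint.h>

/* CFG3[9]: Robust Auto-MDIX enable bit (macro name is hypothetical). */
#define DEMO_CFG3_ROBUST_AUTO_MDIX (1u << 9)

/* Return the CFG3 value with Robust Auto-MDIX enabled, other bits untouched. */
uint16_t demo_enable_robust_auto_mdix(uint16_t cfg3)
{
	return (uint16_t)(cfg3 | DEMO_CFG3_ROBUST_AUTO_MDIX);
}
```

Because the operation is a plain OR, applying it to a value that already has the bit set changes nothing.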

5a7f08c2

net: sch_generic: Use pfifo_fast as fallback scheduler for CAN hardware · 546b85bb

Vincent Prince authored Oct 23, 2019

There is networking hardware that isn't based on Ethernet for layers 1 and 2.

For example CAN.

CAN is a multi-master serial bus standard for connecting Electronic Control
Units [ECUs] also known as nodes. A frame on the CAN bus carries up to 8 bytes
of payload. Frame corruption is detected by a CRC. However frame loss due to
corruption is possible, but a quite unusual phenomenon.

While fq_codel works great for TCP/IP, it doesn't for CAN. There are a lot of
legacy protocols on top of CAN, which are not built with flow control or high
CAN frame drop rates in mind.

When using fq_codel, as soon as the queue reaches a certain delay based length,
skbs from the head of the queue are silently dropped. Silently meaning that the
user space using a send() or similar syscall doesn't get an error. However
TCP's flow control algorithm will detect dropped packets and adjust the
bandwidth accordingly.

When using fq_codel and sending raw frames over CAN, which is the common use
case, the user space thinks the packet has been sent without problems, because
send() returned without an error. pfifo_fast will drop skbs, if the queue
length exceeds the maximum. But with this scheduler the skbs at the tail are
dropped and an error (-ENOBUFS) is propagated to user space, so that the user
space can slow down the packet generation.

On distributions, where fq_codel is made default via CONFIG_DEFAULT_NET_SCH
during compile time, or set default during runtime with sysctl
net.core.default_qdisc (see [1]), we get a bad user experience. In my test case
with pfifo_fast, I can transfer thousands of million CAN frames without a frame
drop. On the other hand with fq_codel there is more than one lost CAN frame per
thousand frames.

As pointed out fq_codel is not suited for CAN hardware, so this patch changes
attach_one_default_qdisc() to use pfifo_fast for "ARPHRD_CAN" network devices.

During transition of a netdev from down to up state the default queuing
discipline is attached by attach_default_qdiscs() with the help of
attach_one_default_qdisc(). This patch modifies attach_one_default_qdisc() to
attach the pfifo_fast (pfifo_fast_ops) if the network device type is
"ARPHRD_CAN".

[1] https://github.com/systemd/systemd/issues/9194
Suggested-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Signed-off-by: Vincent Prince <vincent.prince.fr@gmail.com>
Acked-by: Dave Taht <dave.taht@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
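
The selection rule described above can be sketched as a small decision function. This is an illustration rather than the patched attach_one_default_qdisc(): the function name is hypothetical, qdiscs are represented by their names instead of ops structures, and the device-type constants are defined locally with the values used in <linux/if_arp.h> so the block builds without kernel headers.

```c
/* Device type constants; values match <linux/if_arp.h>. */
#ifndef ARPHRD_ETHER
#define ARPHRD_ETHER 1
#endif
#ifndef ARPHRD_CAN
#define ARPHRD_CAN 280
#endif

/*
 * Pick the qdisc name for a device coming up. CAN devices always get
 * pfifo_fast, so a full queue is reported to the sender as an error
 * instead of frames being dropped silently; every other device keeps
 * the configured default (for example "fq_codel").
 */
const char *demo_pick_default_qdisc(unsigned short dev_type,
				    const char *configured_default)
{
	if (dev_type == ARPHRD_CAN)
		return "pfifo_fast";
	return configured_default;
}
```

Keeping the override keyed on the device type means a system-wide default such as fq_codel stays in effect for Ethernet and other devices.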

546b85bb