- 23 Sep, 2020 36 commits
-
-
Nikolay Aleksandrov authored
When excluding S,G entries we need a way to block a particular S,G,port. The new port group flag is managed based on the source's timer as per RFCs 3376 and 3810. When a source expires and its port group is in EXCLUDE mode, it will be blocked. Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Nikolay Aleksandrov authored
We need to handle group filter mode transitions and initial state. To change a port group's INCLUDE -> EXCLUDE mode (or when we have added a new port group in EXCLUDE mode) we need to add that port to all of *,G ports' S,G entries for proper replication. When the EXCLUDE state is changed from IGMPv3 report, br_multicast_fwd_filter_exclude() must be called after the source list processing because the assumption is that all of the group's S,G entries will be created before transitioning to EXCLUDE mode, i.e. most importantly its blocked entries will already be added so it will not get automatically added to them. The transition EXCLUDE -> INCLUDE happens only when a port group timer expires, it requires us to remove that port from all of *,G ports' S,G entries where it was automatically added previously. Finally when we are adding a new S,G entry we must add all of *,G's EXCLUDE ports to it. In order to distinguish automatically added *,G EXCLUDE ports we have a new port group flag - MDB_PG_FLAGS_STAR_EXCL. Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Nikolay Aleksandrov authored
This patch adds support for automatic install of S,G mdb entries based on the port group's source list and the source entry's timer. Once installed the S,G will be used when forwarding packets if the approprate multicast/mld versions are set. A new source flag called BR_SGRP_F_INSTALLED denotes if the source has a forwarding mdb entry installed. Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Nikolay Aleksandrov authored
To speedup S,G forward handling we need to be able to quickly find out if a port is a member of an S,G group. To do that add a global S,G port rhashtable with key: source addr, group addr, protocol, vid (all br_ip fields) and port pointer. Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Nikolay Aleksandrov authored
We need to be able to differentiate between pg entries created by user-space and the kernel when we start generating S,G entries for IGMPv3/MLDv2's fast path. User-space entries are created by default as RTPROT_STATIC and the kernel entries are RTPROT_KERNEL. Later we can allow user-space to provide the entry rt_protocol so we can differentiate between who added the entries specifically (e.g. clag, admin, frr etc). Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Nikolay Aleksandrov authored
If (S,G) entries are enabled (igmpv3/mldv2) then look them up first. If there isn't a present (S,G) entry then try to find (*,G). Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Nikolay Aleksandrov authored
Add new mdb attributes (MDBE_ATTR_SOURCE for setting, MDBA_MDB_EATTR_SOURCE for dumping) to allow add/del and dump of mdb entries with a source address (S,G). New S,G entries are created with filter mode of MCAST_INCLUDE. The same attributes are used for IPv4 and IPv6, they're validated and parsed based on their protocol. S,G host joined entries which are added by user are not allowed yet. Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Nikolay Aleksandrov authored
Since the MDB add/del code expects an exact struct br_mdb_entry we can't really add any extensions, thus add a new nested attribute at the level of MDBA_SET_ENTRY called MDBA_SET_ENTRY_ATTRS which will be used to pass all new options via netlink attributes. This patch doesn't change anything functionally since the new attribute is not used yet, only parsed. Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Nikolay Aleksandrov authored
Since now we have src in br_ip, u no longer makes sense so rename it to dst. No functional changes. v2: fix build with CONFIG_BATMAN_ADV_MCAST CC: Marek Lindner <mareklindner@neomailbox.ch> CC: Simon Wunderlich <sw@simonwunderlich.de> CC: Antonio Quartulli <a@unstable.cc> CC: Sven Eckelmann <sven@narfation.org> CC: b.a.t.m.a.n@lists.open-mesh.org Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Nikolay Aleksandrov authored
Now that we have src and dst in br_ip it is logical to use the src field for the cases where we need to work with a source address such as querier source address and group source address. Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Nikolay Aleksandrov authored
Add a new src field to struct br_ip which will be used to lookup S, G entries. When SSM option is added we will enable full br_ip lookups. Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Nikolay Aleksandrov authored
Pass and use extack all the way down to br_mdb_add_group(). Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Nikolay Aleksandrov authored
To avoid doing duplicate device checks and searches (the same were done in br_mdb_add and __br_mdb_add) pass the already found port to __br_mdb_add and pull the bridge's netif_running and enabled multicast checks to br_mdb_add. This would also simplify the future extack errors. Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Nikolay Aleksandrov authored
We can drop the pr_info() calls and just use extack to return a meaningful error to user-space when br_mdb_parse() fails. Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Zheng Yongjun authored
Fixes gcc '-Wunused-but-set-variable' warning: drivers/net/ethernet/realtek/8139cp.c: In function cp_tx_timeout: drivers/net/ethernet/realtek/8139cp.c:1242:6: warning: variable ‘rc’ set but not used [-Wunused-but-set-variable] `rc` is never used, so remove it. Signed-off-by: Zheng Yongjun <zhengyongjun3@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Luo bin authored
Fix the warnings about function header comments when building hinic driver with "W=1" option. Signed-off-by: Luo bin <luobin9@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-nextDavid S. Miller authored
Alexei Starovoitov says: ==================== pull-request: bpf-next 2020-09-23 The following pull-request contains BPF updates for your *net-next* tree. We've added 95 non-merge commits during the last 22 day(s) which contain a total of 124 files changed, 4211 insertions(+), 2040 deletions(-). The main changes are: 1) Full multi function support in libbpf, from Andrii. 2) Refactoring of function argument checks, from Lorenz. 3) Make bpf_tail_call compatible with functions (subprograms), from Maciej. 4) Program metadata support, from YiFei. 5) bpf iterator optimizations, from Yonghong. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Jiri Olsa authored
Seth reported problem with cross builds, that fail on resolve_btfids build, because we are trying to build it on cross build arch. Fixing this by always forcing the host arch. Reported-by: Seth Forshee <seth.forshee@canonical.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20200923185735.3048198-2-jolsa@kernel.org
-
Jiri Olsa authored
Currently all the resolve_btfids 'users' are under CONFIG_BPF code, so if we have CONFIG_BPF disabled, resolve_btfids will fail, because there's no data to resolve. Disabling resolve_btfids if there's CONFIG_BPF disabled, so we won't fail such builds. Suggested-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20200923185735.3048198-1-jolsa@kernel.org
-
David S. Miller authored
Merge tag 'linux-can-next-for-5.10-20200923' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next Marc Kleine-Budde says: ==================== pull-request: can-next 2020-09-23 this is a pull request of 20 patches for net-next. The complete series target the flexcan driver and is created by Joakim Zhang and me. The first six patches are cleanup (sort include files alphabetically, remove stray empty line, get rid of long lines) and adding more registers and documentation (registers and wakeup interrupt). Then in two patches the transceiver regulator is made optional, and a check for maximum transceiver bitrate is added. Then the ECC support for HW thats supports this is added. The next three patches improve suspend and low power mode handling. Followed by six patches that add CAN-FD support and CAN-FD related features. The last two patches add support for the flexcan IP core on the imx8qm and lx2160ar1. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Julian Wiedmann says: ==================== s390/qeth: updates 2020-09-23 please apply the following patch series for qeth to netdev's net-next tree. This brings all sorts of cleanups. Highlights are more code sharing in the init/teardown paths, and more fine-grained rollback on errors during initialization (instead of a full-blown teardown). ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Julian Wiedmann authored
Shuffle some code around (primarily all the discipline-related stuff) to get rid of all the unnecessary forward declarations. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Julian Wiedmann authored
Clarify which discipline-specific steps are needed to roll back after error in qeth_l?_set_online(), and which are common to roll back from qeth_hardsetup_card(). Some steps (cancelling the RX modeset, draining the TX queues) are only necessary if the netdev was potentially UP before, so move them to the common qeth_set_offline(). Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Julian Wiedmann authored
Move duplicated code from the disciplines into the core path. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Julian Wiedmann authored
Originators of cmd IO typically hold the rtnl or conf_mutex to protect against a concurrent teardown. Since qeth_set_offline() already holds the conf_mutex, the main reason why we still care about cancelling pending cmds is so that they release the rtnl when we need it ourselves. So move this step a little earlier into the teardown sequence. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Julian Wiedmann authored
The programming of ucast IPs via qeth_l3_modify_ip() is driven independently from any of our typical locking mechanisms (eg. detaching the netdevice, or holding the conf_mutex). So when we inspect the card state to check whether the required cmd IO should be deferred, there is no protection against concurrent state changes. But by slightly re-ordering the teardown sequence, we can rely on the ip_lock to sufficiently serialize things: 1. when running concurrently to qeth_l3_set_online(), any instance of qeth_l3_modify_ip() that aquires the ip_lock _after_ qeth_l3_recover_ip() will observe the state as CARD_STATE_SOFTSETUP and not defer the IO. 2. when running concurrently to qeth_l3_set_offline(), any instance of qeth_l3_modify_ip() that aquires the ip_lock _after_ qeth_l3_clear_ip_htable() will observe the state as CARD_STATE_DOWN and defer the IO. These guarantees in mind, we can now drop the conf_mutex from the qeth_l3_modify_rxip_vipa() wrapper. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Julian Wiedmann authored
Convert the remaining occurences in sysfs code to kstrtouint(). While at it move some input parsing out of locked sections, replace an open-coded clamp() and remove some unnecessary run-time checks for ipatoe->mask_bits that are already enforced when creating the object. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Julian Wiedmann authored
Indicate the max number of to-be-parsed characters, and avoid copying the address sub-string. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Julian Wiedmann authored
card->ipato is currently protected by the conf_mutex. But most users also hold the ip_lock - in particular qeth_l3_add_ip(). So slightly expand the sections under ip_lock in a few places (to effectively cover a few error & no-op cases), and then drop the conf_mutex where it's no longer needed. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Julian Wiedmann authored
mcast IP objects are allocated within qeth_l3_add_mcast_rtnl(), with .ref_counter already set to 1 via qeth_l3_init_ipaddr(). Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Lorenz Bauer authored
Arrays with designated initializers have an implicit length of the highest initialized value plus one. I used this to ensure that newly added entries in enum bpf_reg_type get a NULL entry in compatible_reg_types. This is difficult to understand since it requires knowledge of the peculiarities of designated initializers. Use __BPF_ARG_TYPE_MAX to size the array instead. Suggested-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Lorenz Bauer <lmb@cloudflare.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20200923160156.80814-1-lmb@cloudflare.com
-
Zheng Yongjun authored
drivers/net/ethernet/microchip/lan743x_main.c: In function lan743x_pm_suspend: `ret` is set but not used. In fact, `pci_prepare_to_sleep` function value should be the right value of `lan743x_pm_suspend` function, therefore, fix it. Signed-off-by: Zheng Yongjun <zhengyongjun3@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linuxDavid S. Miller authored
Saeed Mahameed says: ==================== mlx5-updates-2020-09-21 Multi packet TX descriptor support for SKBs. This series introduces some refactoring of the regular TX data path in mlx5 and adds the Enhanced TX MPWQE feature support. MPWQE stands for multi-packet work queue element, and it can serve multiple packets, reducing the PCI bandwidth spent on control traffic. It should improve performance in scenarios where PCI is the bottleneck, and xmit_more is signaled by the kernel. The refactoring done in this series also improves the packet rate on its own. MPWQE is already implemented in the XDP tx path, this series adds the support of MPWQE for regular kernel SKB tx path. MPWQE is supported from ConnectX-5 and onward, for legacy devices we need to keep backward compatibility for regular (Single packet) WQE descriptor. MPWQE is not compatible with certain offloads and features, such as TLS offload, TSO, nonlinear SKBs. If such incompatible features are in use, the driver gracefully falls back to non-MPWQE per SKB. Prior to the final patch "net/mlx5e: Enhanced TX MPWQE for SKBs" that adds the actual support, Maxim did some refactoring to the tx data path to split it into stages and smaller helper functions that can be utilized and reused for both legacy and new MPWQE feature. Performance testing: UDP performance is improved in a single stream pktgen test: Packet rate: 16.86 Mpps (±0.15 Mpps) -> 20.94 Mpps (±0.33 Mpps) Instructions per packet: 434 -> 329 Cycles per packet: 158 -> 123 Instructions per cycle: 2.75 -> 2.67 TCP and XDP_TX single stream tests show no performance difference. MPWQE can reduce PCI bandwidth: PCI Gen2, pktgen at fixed rate of 36864000 pps on 24 CPU cores: Inbound PCI utilization with MPWQE off: 80.3% Inbound PCI utilization with MPWQE on: 59.0% PCI Gen3, pktgen at fixed rate of 56064000 pps on 24 CPU cores: Inbound PCI utilization with MPWQE off: 65.4% Inbound PCI utilization with MPWQE on: 49.3% MPWQE can also reduce CPU load, increasing the packet rate in case of CPU bottleneck: PCI Gen2, pktgen at full rate on 24 CPU cores: Packet rate with MPWQE off: 37.5 Mpps Packet rate with MPWQE on: 49.0 Mpps PCI Gen3, pktgen at full rate on 24 CPU cores: Packet rate with MPWQE off: 57.0 Mpps Packet rate with MPWQE on: 66.8 Mpps Burst size in all pktgen tests is 32. CPU: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz (x86_64) NIC: Mellanox ConnectX-6 Dx GCC 10.2.0 ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Parav Pandit says: ==================== devlink: Use nla_policy to validate range This two small patches uses nla_policy to validate user specified fields are in valid range or not. Patch summary: Patch-1 checks the range of eswitch mode field Patch-2 checks for the port type field. It eliminates a check in code by using nla policy infrastructure. ==================== Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Parav Pandit authored
Use range checking facility of nla_policy to validate port type attribute input value is valid or not. Signed-off-by: Parav Pandit <parav@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Parav Pandit authored
Use range checking facility of nla_policy to validate eswitch mode input attribute value is valid or not. Signed-off-by: Parav Pandit <parav@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 22 Sep, 2020 4 commits
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netDavid S. Miller authored
Two minor conflicts: 1) net/ipv4/route.c, adding a new local variable while moving another local variable and removing it's initial assignment. 2) drivers/net/dsa/microchip/ksz9477.c, overlapping changes. One pretty prints the port mode differently, whilst another changes the driver to try and obtain the port mode from the port node rather than the switch node. Signed-off-by: David S. Miller <davem@davemloft.net>
-
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfsLinus Torvalds authored
Pull vfs fixes from Al Viro: "No common topic, just assorted fixes" * 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: fuse: fix the ->direct_IO() treatment of iov_iter fs: fix cast in fsparam_u32hex() macro vboxsf: Fix the check for the old binary mount-arguments struct
-
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netLinus Torvalds authored
Pull networking fixes from Jakub Kicinski: - fix failure to add bond interfaces to a bridge, the offload-handling code was too defensive there and recent refactoring unearthed that. Users complained (Ido) - fix unnecessarily reflecting ECN bits within TOS values / QoS marking in TCP ACK and reset packets (Wei) - fix a deadlock with bpf iterator. Hopefully we're in the clear on this front now... (Yonghong) - BPF fix for clobbering r2 in bpf_gen_ld_abs (Daniel) - fix AQL on mt76 devices with FW rate control and add a couple of AQL issues in mac80211 code (Felix) - fix authentication issue with mwifiex (Maximilian) - WiFi connectivity fix: revert IGTK support in ti/wlcore (Mauro) - fix exception handling for multipath routes via same device (David Ahern) - revert back to a BH spin lock flavor for nsid_lock: there are paths which do require the BH context protection (Taehee) - fix interrupt / queue / NAPI handling in the lantiq driver (Hauke) - fix ife module load deadlock (Cong) - make an adjustment to netlink reply message type for code added in this release (the sole change touching uAPI here) (Michal) - a number of fixes for small NXP and Microchip switches (Vladimir) [ Pull request acked by David: "you can expect more of this in the future as I try to delegate more things to Jakub" ] * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (167 commits) net: mscc: ocelot: fix some key offsets for IP4_TCP_UDP VCAP IS2 entries net: dsa: seville: fix some key offsets for IP4_TCP_UDP VCAP IS2 entries net: dsa: felix: fix some key offsets for IP4_TCP_UDP VCAP IS2 entries inet_diag: validate INET_DIAG_REQ_PROTOCOL attribute net: bridge: br_vlan_get_pvid_rcu() should dereference the VLAN group under RCU net: Update MAINTAINERS for MediaTek switch driver net/mlx5e: mlx5e_fec_in_caps() returns a boolean net/mlx5e: kTLS, Avoid kzalloc(GFP_KERNEL) under spinlock net/mlx5e: kTLS, Fix leak on resync error flow net/mlx5e: kTLS, Add missing dma_unmap in RX resync net/mlx5e: kTLS, Fix napi sync and possible use-after-free net/mlx5e: TLS, Do not expose FPGA TLS counter if not supported net/mlx5e: Fix using wrong stats_grps in mlx5e_update_ndo_stats() net/mlx5e: Fix multicast counter not up-to-date in "ip -s" net/mlx5e: Fix endianness when calculating pedit mask first bit net/mlx5e: Enable adding peer miss rules only if merged eswitch is supported net/mlx5e: CT: Fix freeing ct_label mapping net/mlx5e: Fix memory leak of tunnel info when rule under multipath not ready net/mlx5e: Use synchronize_rcu to sync with NAPI net/mlx5e: Use RCU to protect rq->xdp_prog ...
-
git://git.kernel.dk/linux-blockLinus Torvalds authored
Pull io_uring fixes from Jens Axboe: "A few fixes - most of them regression fixes from this cycle, but also a few stable heading fixes, and a build fix for the included demo tool since some systems now actually have gettid() available" * tag 'io_uring-5.9-2020-09-22' of git://git.kernel.dk/linux-block: io_uring: fix openat/openat2 unified prep handling io_uring: mark statx/files_update/epoll_ctl as non-SQPOLL tools/io_uring: fix compile breakage io_uring: don't use retry based buffered reads for non-async bdev io_uring: don't re-setup vecs/iter in io_resumit_prep() is already there io_uring: don't run task work on an exiting task io_uring: drop 'ctx' ref on task work cancelation io_uring: grab any needed state during defer prep
-