- 27 Nov, 2021 13 commits
-
-
Xin Long authored
The same optimization as the one in commit cc0be1ad ("net: bridge: Slightly optimize 'find_portno()'") is needed for the 'changed' bitmap in __br_vlan_set_default_pvid(). Signed-off-by: Xin Long <lucien.xin@gmail.com> Link: https://lore.kernel.org/r/4e35f415226765e79c2a11d2c96fbf3061c486e2.1637782773.git.lucien.xin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Tonghao Zhang authored
For netdevs (e.g. ifb, bareudp) that do not support ethtool ops (e.g. .get_drvinfo), we can use the rtnl kind as a default name. An ifb netdev may be created with a prefix other than ifbX. Cc: Arnd Bergmann <arnd@arndb.de> Cc: Hao Chen <chenhao288@hisilicon.com> Cc: Heiner Kallweit <hkallweit1@gmail.com> Cc: "Gustavo A. R. Silva" <gustavoars@kernel.org> Cc: Danielle Ratson <danieller@nvidia.com> Cc: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Link: https://lore.kernel.org/r/20211125163049.84970-1-xiangxia.m.yue@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Jakub Kicinski authored
Nikolay Aleksandrov says: ==================== selftests: net: bridge: vlan multicast tests This patch-set adds selftests for the new vlan multicast options that were recently added. Most of the tests check for default values, changing options and try to verify that the changes actually take effect. The last test checks if the dependency between vlan_filtering and mcast_vlan_snooping holds. The rest are pretty self-explanatory. TEST: Vlan multicast snooping enable [ OK ] TEST: Vlan global options existence [ OK ] TEST: Vlan mcast_snooping global option default value [ OK ] TEST: Vlan 10 multicast snooping control [ OK ] TEST: Vlan mcast_querier global option default value [ OK ] TEST: Vlan 10 multicast querier enable [ OK ] TEST: Vlan 10 tagged IGMPv2 general query sent [ OK ] TEST: Vlan 10 tagged MLD general query sent [ OK ] TEST: Vlan mcast_igmp_version global option default value [ OK ] TEST: Vlan mcast_mld_version global option default value [ OK ] TEST: Vlan 10 mcast_igmp_version option changed to 3 [ OK ] TEST: Vlan 10 tagged IGMPv3 general query sent [ OK ] TEST: Vlan 10 mcast_mld_version option changed to 2 [ OK ] TEST: Vlan 10 tagged MLDv2 general query sent [ OK ] TEST: Vlan mcast_last_member_count global option default value [ OK ] TEST: Vlan mcast_last_member_interval global option default value [ OK ] TEST: Vlan 10 mcast_last_member_count option changed to 3 [ OK ] TEST: Vlan 10 mcast_last_member_interval option changed to 200 [ OK ] TEST: Vlan mcast_startup_query_interval global option default value [ OK ] TEST: Vlan mcast_startup_query_count global option default value [ OK ] TEST: Vlan 10 mcast_startup_query_interval option changed to 100 [ OK ] TEST: Vlan 10 mcast_startup_query_count option changed to 3 [ OK ] TEST: Vlan mcast_membership_interval global option default value [ OK ] TEST: Vlan 10 mcast_membership_interval option changed to 200 [ OK ] TEST: Vlan 10 mcast_membership_interval mdb entry expire [ OK ] TEST: Vlan mcast_querier_interval 
global option default value [ OK ] TEST: Vlan 10 mcast_querier_interval option changed to 100 [ OK ] TEST: Vlan 10 mcast_querier_interval expire after outside query [ OK ] TEST: Vlan mcast_query_interval global option default value [ OK ] TEST: Vlan 10 mcast_query_interval option changed to 200 [ OK ] TEST: Vlan mcast_query_response_interval global option default value [ OK ] TEST: Vlan 10 mcast_query_response_interval option changed to 200 [ OK ] TEST: Port vlan 10 option mcast_router default value [ OK ] TEST: Port vlan 10 mcast_router option changed to 2 [ OK ] TEST: Flood unknown vlan multicast packets to router port only [ OK ] TEST: Disable multicast vlan snooping when vlan filtering is disabled [ OK ] ==================== Link: https://lore.kernel.org/r/20211125140858.3639139-1-razor@blackwall.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Nikolay Aleksandrov authored
Add a test for dependency of mcast_vlan_snooping on vlan_filtering. If vlan_filtering gets disabled, then mcast_vlan_snooping must be automatically disabled as well. TEST: Disable multicast vlan snooping when vlan filtering is disabled [ OK ] Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Nikolay Aleksandrov authored
Add tests for the new per-port/vlan mcast_router option, verify that unknown multicast packets are flooded only to router ports. TEST: Port vlan 10 option mcast_router default value [ OK ] TEST: Port vlan 10 mcast_router option changed to 2 [ OK ] TEST: Flood unknown vlan multicast packets to router port only [ OK ] Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Nikolay Aleksandrov authored
Add tests which change the new per-vlan mcast_query_interval and verify the new value is in effect, also add a test to change mcast_query_response_interval's value. TEST: Vlan mcast_query_interval global option default value [ OK ] TEST: Vlan 10 mcast_query_interval option changed to 200 [ OK ] TEST: Vlan mcast_query_response_interval global option default value [ OK ] TEST: Vlan 10 mcast_query_response_interval option changed to 200 [ OK ] Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Nikolay Aleksandrov authored
Add tests which change the new per-vlan mcast_querier_interval and verify that it is taken into account when an outside querier is present. TEST: Vlan mcast_querier_interval global option default value [ OK ] TEST: Vlan 10 mcast_querier_interval option changed to 100 [ OK ] TEST: Vlan 10 mcast_querier_interval expire after outside query [ OK ] Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Nikolay Aleksandrov authored
Add a test which changes the new per-vlan mcast_membership_interval and verifies that a newly learned mdb entry would expire in that interval. TEST: Vlan mcast_membership_interval global option default value [ OK ] TEST: Vlan 10 mcast_membership_interval option changed to 200 [ OK ] TEST: Vlan 10 mcast_membership_interval mdb entry expire [ OK ] Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Nikolay Aleksandrov authored
Add tests which change the new per-vlan startup query count/interval options and verify the proper number of queries are sent in the expected interval. TEST: Vlan mcast_startup_query_interval global option default value [ OK ] TEST: Vlan mcast_startup_query_count global option default value [ OK ] TEST: Vlan 10 mcast_startup_query_interval option changed to 100 [ OK ] TEST: Vlan 10 mcast_startup_query_count option changed to 3 [ OK ] Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Nikolay Aleksandrov authored
Add tests which verify the default values of mcast_last_member_count and mcast_last_member_interval and also try to change them. TEST: Vlan mcast_last_member_count global option default value [ OK ] TEST: Vlan mcast_last_member_interval global option default value [ OK ] TEST: Vlan 10 mcast_last_member_count option changed to 3 [ OK ] TEST: Vlan 10 mcast_last_member_interval option changed to 200 [ OK ] Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Nikolay Aleksandrov authored
Add tests which change the new per-vlan IGMP/MLD versions and verify that proper tagged general query packets are sent. TEST: Vlan mcast_igmp_version global option default value [ OK ] TEST: Vlan mcast_mld_version global option default value [ OK ] TEST: Vlan 10 mcast_igmp_version option changed to 3 [ OK ] TEST: Vlan 10 tagged IGMPv3 general query sent [ OK ] TEST: Vlan 10 mcast_mld_version option changed to 2 [ OK ] TEST: Vlan 10 tagged MLDv2 general query sent [ OK ] Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Nikolay Aleksandrov authored
Add a test to try the new global vlan mcast_querier control and also verify that tagged general query packets are properly generated when querier is enabled for a single vlan. TEST: Vlan mcast_querier global option default value [ OK ] TEST: Vlan 10 multicast querier enable [ OK ] TEST: Vlan 10 tagged IGMPv2 general query sent [ OK ] TEST: Vlan 10 tagged MLD general query sent [ OK ] Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Nikolay Aleksandrov authored
Add the first test for bridge per-vlan multicast snooping which checks if control of the global and per-vlan options work as expected, joins and leaves are tested at each option value. TEST: Vlan multicast snooping enable [ OK ] TEST: Vlan global options existence [ OK ] TEST: Vlan mcast_snooping global option default value [ OK ] TEST: Vlan 10 multicast snooping control [ OK ] Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
- 26 Nov, 2021 27 commits
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Jakub Kicinski authored
drivers/net/ipa/ipa_main.c 8afc7e47 ("net: ipa: separate disabling setup from modem stop") 76b5fbcd ("net: ipa: kill ipa_modem_init()") Duplicated include, drop one. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Linus Torvalds authored
Pull networking fixes from Jakub Kicinski: "Networking fixes, including fixes from netfilter. Current release - regressions: - r8169: fix incorrect mac address assignment - vlan: fix underflow for the real_dev refcnt when vlan creation fails - smc: avoid warning of possible recursive locking Current release - new code bugs: - vsock/virtio: suppress used length validation - neigh: fix crash in v6 module initialization error path Previous releases - regressions: - af_unix: fix change in behavior in read after shutdown - igb: fix netpoll exit with traffic, avoid warning - tls: fix splice_read() when starting mid-record - lan743x: fix deadlock in lan743x_phy_link_status_change() - marvell: prestera: fix bridge port operation Previous releases - always broken: - tcp_cubic: fix spurious Hystart ACK train detections for not-cwnd-limited flows - nexthop: fix refcount issues when replacing IPv6 groups - nexthop: fix null pointer dereference when IPv6 is not enabled - phylink: force link down and retrigger resolve on interface change - mptcp: fix delack timer length calculation and incorrect early clearing - ieee802154: handle iftypes as u32, prevent shift-out-of-bounds - nfc: virtual_ncidev: change default device permissions - netfilter: ctnetlink: fix error codes and flags used for kernel side filtering of dumps - netfilter: flowtable: fix IPv6 tunnel addr match - ncsi: align payload to 32-bit to fix dropped packets - iavf: fix deadlock and loss of config during VF interface reset - ice: avoid bpf_prog refcount underflow - ocelot: fix broken PTP over IP and PTP API violations Misc: - marvell: mvpp2: increase MTU limit when XDP enabled" * tag 'net-5.16-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (94 commits) net: dsa: microchip: implement multi-bridge support net: mscc: ocelot: correctly report the timestamping RX filters in ethtool net: mscc: ocelot: set up traps for PTP packets net: ptp: add a definition for the UDP port for IEEE 1588 general 
messages net: mscc: ocelot: create a function that replaces an existing VCAP filter net: mscc: ocelot: don't downgrade timestamping RX filters in SIOCSHWTSTAMP net: hns3: fix incorrect components info of ethtool --reset command net: hns3: fix one incorrect value of page pool info when queried by debugfs net: hns3: add check NULL address for page pool net: hns3: fix VF RSS failed problem after PF enable multi-TCs net: qed: fix the array may be out of bound net/smc: Don't call clcsock shutdown twice when smc shutdown net: vlan: fix underflow for the real_dev refcnt ptp: fix filter names in the documentation ethtool: ioctl: fix potential NULL deref in ethtool_set_coalesce() nfc: virtual_ncidev: change default device permissions net/sched: sch_ets: don't peek at classes beyond 'nbands' net: stmmac: Disable Tx queues when reconfiguring the interface selftests: tls: test for correct proto_ops tls: fix replacing proto_ops ...
-
Oleksij Rempel authored
The current driver version is able to handle only one bridge at a time. Configuring two bridges on two different ports would end up shorting these bridges in HW. To reproduce it:
  ip l a name br0 type bridge
  ip l a name br1 type bridge
  ip l s dev br0 up
  ip l s dev br1 up
  ip l s lan1 master br0
  ip l s dev lan1 up
  ip l s lan2 master br1
  ip l s dev lan2 up
Ping on lan1 and get a response on lan2, which should not happen. This happened because the current driver version stores one global "Port VLAN Membership" and applies it to all ports which are members of any bridge. To solve this issue, we need to handle each port separately. This patch drops the global port member storage and calculates membership dynamically depending on STP state and bridge participation. Note: STP support was broken before this patch and should be fixed separately. Fixes: c2e86691 ("net: dsa: microchip: break KSZ9477 DSA driver into two files") Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Link: https://lore.kernel.org/r/20211126123926.2981028-1-o.rempel@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Linus Torvalds authored
Pull ACPI fixes from Rafael Wysocki: "These fix a NULL pointer dereference in the CPPC library code and a locking issue related to printing the names of ACPI device nodes in the device properties framework. Specifics: - Fix NULL pointer dereference in the CPPC library code occurring on hybrid systems without CPPC support (Rafael Wysocki). - Avoid attempts to acquire a semaphore with interrupts off when printing the names of ACPI device nodes and clean up code on top of that fix (Sakari Ailus)" * tag 'acpi-5.16-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: ACPI: CPPC: Add NULL pointer check to cppc_get_perf() ACPI: Make acpi_node_get_parent() local ACPI: Get acpi_device's parent from the parent field
-
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Linus Torvalds authored
Pull power management fixes from Rafael Wysocki: "These address three issues in the intel_pstate driver and fix two problems related to hibernation. Specifics: - Make intel_pstate work correctly on Ice Lake server systems with out-of-band performance control enabled (Adamos Ttofari). - Fix EPP handling in intel_pstate during CPU offline and online in the active mode (Rafael Wysocki). - Make intel_pstate support ITMT on asymmetric systems with overclocking enabled (Srinivas Pandruvada). - Fix hibernation image saving when using the user space interface based on the snapshot special device file (Evan Green). - Make the hibernation code release the snapshot block device using the same mode that was used when acquiring it (Thomas Zeitlhofer)" * tag 'pm-5.16-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: PM: hibernate: Fix snapshot partial write lengths PM: hibernate: use correct mode for swsusp_close() cpufreq: intel_pstate: ITMT support for overclocked system cpufreq: intel_pstate: Fix active mode offline/online EPP handling cpufreq: intel_pstate: Add Ice Lake server to out-of-band IDs
-
git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse
Linus Torvalds authored
Pull fuse fix from Miklos Szeredi: "Fix a regression caused by a bugfix in the previous release. The symptom is a VM_BUG_ON triggered from splice to the fuse device. Unfortunately the original bugfix was already backported to a number of stable releases, so this fix-fix will need to be backported as well" * tag 'fuse-fixes-5.16-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse: fuse: release pipe buf after last use
-
Jakub Kicinski authored
Vladimir Oltean says: ==================== Fix broken PTP over IP on Ocelot switches Changes in v2: added patch 5, added Richard's ack for the whole series sans patch 5 which is new. Po Liu reported recently that timestamping PTP over IPv4 is broken using the felix driver on NXP LS1028A. This has been known for a while, of course, since it has always been broken. The reason is that IP PTP packets are currently treated as unknown IP multicast, which is not flooded to the CPU port in the ocelot driver design, so packets don't reach the ptp4l program. The series solves the problem by installing packet traps per port when the timestamping ioctl is called, depending on the RX filter selected (L2, L4 or both). ==================== Link: https://lore.kernel.org/r/20211126172845.3149260-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Vladimir Oltean authored
The driver doesn't support RX timestamping for non-PTP packets, but it declares that it does. Restrict the reported RX filters to PTP v2 over L2 and over L4. Fixes: 4e3b0468 ("net: mscc: PTP Hardware Clock (PHC) support") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Vladimir Oltean authored
IEEE 1588 support was declared too soon for the Ocelot switch. Out of reset, this switch does not apply any special treatment for PTP packets, i.e. when an event message is received, the natural tendency is to forward it by MAC DA/VLAN ID. This poses a problem when the ingress port is under a bridge, since user space application stacks (written primarily for endpoint ports, not switches) like ptp4l expect that PTP messages are always received on AF_PACKET / AF_INET sockets (depending on the PTP transport being used), and never being autonomously forwarded. Any forwarding, if necessary (for example in Transparent Clock mode) is handled in software by ptp4l. Having the hardware forward these packets too will cause duplicates which will confuse endpoints connected to these switches. So PTP over L2 barely works, in the sense that PTP packets reach the CPU port, but they reach it via flooding, and therefore reach lots of other unwanted destinations too. But PTP over IPv4/IPv6 does not work at all. This is because the Ocelot switch have a separate destination port mask for unknown IP multicast (which PTP over IP is) flooding compared to unknown non-IP multicast (which PTP over L2 is) flooding. Specifically, the driver allows the CPU port to be in the PGID_MC port group, but not in PGID_MCIPV4 and PGID_MCIPV6. There are several presentations from Allan Nielsen which explain that the embedded MIPS CPU on Ocelot switches is not very powerful at all, so every penny they could save by not allowing flooding to the CPU port module matters. Unknown IP multicast did not make it. The de facto consensus is that when a switch is PTP-aware and an application stack for PTP is running, switches should have some sort of trapping mechanism for PTP packets, to extract them from the hardware data path. 
This avoids both problems: (a) PTP packets are no longer flooded to unwanted destinations (b) PTP over IP packets are no longer denied from reaching the CPU since they arrive there via a trap and not via flooding It is not the first time when this change is attempted. Last time, the feedback from Allan Nielsen and Andrew Lunn was that the traps should not be installed by default, and that PTP-unaware switching may be desired for some use cases: https://patchwork.ozlabs.org/project/netdev/patch/20190813025214.18601-5-yangbo.lu@nxp.com/ To address that feedback, the present patch adds the necessary packet traps according to the RX filter configuration transmitted by user space through the SIOCSHWTSTAMP ioctl. Trapping is done via VCAP IS2, where we keep 5 filters, which are amended each time RX timestamping is enabled or disabled on a port: - 1 for PTP over L2 - 2 for PTP over IPv4 (UDP ports 319 and 320) - 2 for PTP over IPv6 (UDP ports 319 and 320) The cookie by which these filters (invisible to tc) are identified is strategically chosen such that it does not collide with the filters used for the ocelot-8021q tagging protocol by the Felix driver, or with the MRP traps set up by the Ocelot library. Other alternatives were considered, like patching user space to do something, but there are so many ways in which PTP packets could be made to reach the CPU, generically speaking, that "do what?" is a very valid question. 
The ptp4l program from the linuxptp stack already attempts to do something: it calls setsockopt(IP_ADD_MEMBERSHIP) (and PACKET_ADD_MEMBERSHIP, respectively) which translates in both cases into a dev_mc_add() on the interface, in the kernel: https://github.com/richardcochran/linuxptp/blob/v3.1.1/udp.c#L73 https://github.com/richardcochran/linuxptp/blob/v3.1.1/raw.c Reality shows that this is not sufficient in case the interface belongs to a switchdev driver, as dev_mc_add() does not show the intention to trap a packet to the CPU, but rather the intention to not drop it (it is strictly for RX filtering, same as promiscuous does not mean to send all traffic to the CPU, but to not drop traffic with unknown MAC DA). This topic is a can of worms in itself, and it would be great if user space could just stay out of it. On the other hand, setting up PTP traps privately within the driver is not new by any stretch of the imagination: https://elixir.bootlin.com/linux/v5.16-rc2/source/drivers/net/ethernet/mellanox/mlxsw/spectrum_ptp.c#L833 https://elixir.bootlin.com/linux/v5.16-rc2/source/drivers/net/dsa/hirschmann/hellcreek.c#L1050 https://elixir.bootlin.com/linux/v5.16-rc2/source/include/linux/dsa/sja1105.h#L21 So this is the approach taken here as well. The difference here being that we prepare and destroy the traps per port, dynamically at runtime, as opposed to driver init time, because apparently, PTP-unaware forwarding is a use case. Fixes: 4e3b0468 ("net: mscc: PTP Hardware Clock (PHC) support") Reported-by: Po Liu <po.liu@nxp.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Acked-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Vladimir Oltean authored
As opposed to event messages (Sync, PdelayReq etc) which require timestamping, general messages (Announce, FollowUp etc) do not. In PTP they are part of different streams of data. IEEE 1588-2008 Annex D.2 "UDP port numbers" states that the UDP destination port assigned by IANA is 319 for event messages, and 320 for general messages. Yet the kernel seems to be missing the definition for general messages. This patch adds it. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Acked-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
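The port split described above can be illustrated with a small classifier. This is only a sketch: the `EX_`-prefixed macro, enum, and function names are hypothetical and are not the definitions added by the patch; only the port numbers 319 and 320 come from the commit message.

```c
#include <stdint.h>

/* UDP destination ports per IEEE 1588-2008 Annex D.2, as quoted in the
 * commit message. The EX_ names are illustrative, not the kernel's. */
#define EX_PTP_EVENT_PORT   319
#define EX_PTP_GENERAL_PORT 320

enum ex_ptp_class { EX_NOT_PTP, EX_PTP_EVENT, EX_PTP_GENERAL };

/* Classify a UDP destination port as PTP event, PTP general, or neither. */
static enum ex_ptp_class ex_classify_udp_dport(uint16_t dport)
{
	switch (dport) {
	case EX_PTP_EVENT_PORT:
		return EX_PTP_EVENT;   /* Sync, PdelayReq, ... */
	case EX_PTP_GENERAL_PORT:
		return EX_PTP_GENERAL; /* Announce, FollowUp, ... */
	default:
		return EX_NOT_PTP;
	}
}

/* Only event messages require timestamping. */
static int ex_needs_timestamp(uint16_t dport)
{
	return ex_classify_udp_dport(dport) == EX_PTP_EVENT;
}
```

A trap that wants to catch all PTP over UDP traffic would need to match both ports, while a timestamping path only cares about the event port.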
-
Vladimir Oltean authored
VCAP (Versatile Content Aware Processor) is the TCAM-based engine behind tc flower offload on ocelot, among other things. The ingress port mask on which VCAP rules match is present as a bit field in the actual key of the rule. This means that it is possible for a rule to be shared among multiple source ports. When the rule is added one by one on each desired port, the ingress port mask of the key must be edited and rewritten to hardware. But the API in ocelot_vcap.c does not allow for this. For one thing, ocelot_vcap_filter_add() and ocelot_vcap_filter_del() are not symmetric, because ocelot_vcap_filter_add() works with a preallocated and prepopulated filter and programs it to hardware, and ocelot_vcap_filter_del() does both the job of removing the specified filter from hardware, as well as kfreeing it. That is to say, the only option of editing a filter in place, which is to delete it, modify the structure and add it back, does not work because it results in use-after-free. This patch introduces ocelot_vcap_filter_replace(), which trivially reprograms a VCAP entry to hardware, at the exact same index at which it existed before, without modifying any list or allocating any memory. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Acked-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Vladimir Oltean authored
The ocelot driver, when asked to timestamp all receiving packets, 1588 v1 or NTP, says "nah, here's 1588 v2 for you". According to this discussion: https://patchwork.kernel.org/project/netdevbpf/patch/20211104133204.19757-8-martin.kaistra@linutronix.de/#24577647 drivers that downgrade from a wider request to a narrower response (or even a response where the intersection with the request is empty) are buggy, and should return -ERANGE instead. This patch fixes that. Fixes: 4e3b0468 ("net: mscc: PTP Hardware Clock (PHC) support") Suggested-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Acked-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
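The rule stated above (reject a wider request rather than silently narrowing it) can be sketched as follows. The enum and function are simplified, hypothetical stand-ins for the kernel's hardware timestamping filter values and the driver's ioctl handler; the set of accepted filters assumes a device that supports only PTP v2 over L2 and L4, as described for this series.

```c
#include <errno.h>

/* Hypothetical, reduced set of RX filter requests. */
enum ex_rx_filter {
	EX_FILTER_NONE,
	EX_FILTER_ALL,              /* timestamp every received packet */
	EX_FILTER_PTP_V1_L4_EVENT,
	EX_FILTER_PTP_V2_L2_EVENT,
	EX_FILTER_PTP_V2_L4_EVENT,
	EX_FILTER_PTP_V2_EVENT,     /* PTP v2 over L2 and L4 */
};

/* Apply the requested filter only if the hardware can honor it exactly;
 * otherwise leave the current setting untouched and return -ERANGE. */
static int ex_set_rx_filter(enum ex_rx_filter requested,
			    enum ex_rx_filter *applied)
{
	switch (requested) {
	case EX_FILTER_NONE:
	case EX_FILTER_PTP_V2_L2_EVENT:
	case EX_FILTER_PTP_V2_L4_EVENT:
	case EX_FILTER_PTP_V2_EVENT:
		*applied = requested;
		return 0;
	default:
		/* Do not downgrade e.g. "all packets" to "PTP v2 only". */
		return -ERANGE;
	}
}
```

With this shape, user space that asks for all packets or for PTP v1 gets an explicit error instead of a configuration it did not request.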
-
Jakub Kicinski authored
Guangbin Huang says: ==================== net: hns3: add some fixes for -net This series adds some fixes for the HNS3 ethernet driver. ==================== Link: https://lore.kernel.org/r/20211126120318.33921-1-huangguangbin2@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Jie Wang authored
Currently, the HNS3 driver doesn't clear the reset flags of components after successfully executing a reset, which causes the userspace info of "Components reset" and "Components not reset" to be incorrect. Fix this problem by clearing the corresponding reset flag after the reset process. Fixes: ddccc5e3 ("net: hns3: add support for triggering reset by ethtool") Signed-off-by: Jie Wang <wangjie125@huawei.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Hao Chen authored
Currently, when a user queries page pool info with the debugfs command "cat page_pool_info", the count of allocated pages for the page pool may be incorrect because of a memory inconsistency problem caused by compiler optimization. So this patch uses READ_ONCE() to read the value of pages_state_hold_cnt to fix this problem. Fixes: 850bfb91 ("net: hns3: debugfs add support dumping page pool info") Signed-off-by: Hao Chen <chenhao288@hisilicon.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
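As a rough illustration of the READ_ONCE() idea, the sketch below uses a simplified userspace stand-in: a volatile-qualified access that makes the compiler perform one real load each time. The `EX_READ_ONCE` macro and the `ex_page_pool` struct are hypothetical; the kernel's actual macro and the real page pool structure are more involved. The macro relies on the GNU `__typeof__` extension.

```c
/* Simplified stand-in for the kernel's READ_ONCE(): the volatile cast
 * forces an actual load instead of reusing a cached or merged value. */
#define EX_READ_ONCE(x) (*(const volatile __typeof__(x) *)&(x))

/* Hypothetical, reduced page pool with only the counter of interest. */
struct ex_page_pool {
	unsigned int pages_state_hold_cnt;
};

/* Read the hold counter with a single, non-optimizable load. */
static unsigned int ex_hold_cnt(struct ex_page_pool *pool)
{
	return EX_READ_ONCE(pool->pages_state_hold_cnt);
}
```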
-
Hao Chen authored
When page pool is not enabled, its address value is still NULL and page pool should not be accessed, so add a check for it. Fixes: 850bfb91 ("net: hns3: debugfs add support dumping page pool info") Signed-off-by: Hao Chen <chenhao288@hisilicon.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Guangbin Huang authored
When a PF is set to multi-TC mode and the mapping between priorities and TCs is configured, the hardware activates these settings for the PF and its VFs. If a VF uses just one TC and its rx packets carry a priority that is not mapped to TC0, then, because the other TCs of the VF are not valid, the hardware always puts such packets into queue 0. As a result, RSS cannot be used for these VF packets. To fix this problem, set the tc mode of all unused TCs of the VF to the setting of TC0, so that rx packets whose priority maps to an unused TC are directed to TC0. Fixes: e2cb1dec ("net: hns3: Add HNS3 VF HCL(Hardware Compatibility Layer) Support") Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
zhangyue authored
If the variable 'p_bit->flags' is always 0, the loop condition is always 0. The variable 'j' may be greater than or equal to 32. At this time, the access to the array 'p_aeu->bits[32]' may be out of bounds. Signed-off-by: zhangyue <zhangyue1@kylinos.cn> Link: https://lore.kernel.org/r/20211125113610.273841-1-zhangyue1@kylinos.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Linus Torvalds authored
Pull btrfs fix from David Sterba: "One more fix to the lzo code, a missing put_page causing memory leaks when some error branches are taken" * tag 'for-5.16-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: fix the memory leak caused in lzo_compress_pages()
-
Tony Lu authored
When applications call shutdown() with SHUT_RDWR in userspace, smc_close_active() calls kernel_sock_shutdown(), and it is called twice in smc_shutdown(). Fix this by checking sk_state before doing the clcsock shutdown, and avoid missing the application's call of smc_shutdown(). Link: https://lore.kernel.org/linux-s390/1f67548e-cbf6-0dce-82b5-10288a4583bd@linux.ibm.com/ Fixes: 606a63c9 ("net/smc: Ensure the active closing peer first closes clcsock") Signed-off-by: Tony Lu <tonylu@linux.alibaba.com> Reviewed-by: Wen Gu <guwen@linux.alibaba.com> Acked-by: Karsten Graul <kgraul@linux.ibm.com> Link: https://lore.kernel.org/r/20211126024134.45693-1-tonylu@linux.alibaba.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
wengjianfeng authored
Combine two judgments that return the same value. Signed-off-by: wengjianfeng <wengjianfeng@yulong.com> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com> Link: https://lore.kernel.org/r/20211126013130.27112-1-samirweng1979@163.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Ziyang Xuan authored
Inject an error before dev_hold(real_dev) in register_vlan_dev(), and execute the following testcase:
  ip link add dev dummy1 type dummy
  ip link add name dummy1.100 link dummy1 type vlan id 100
  ip link del dev dummy1
When the dummy netdevice is removed, we will get the following WARNING:
  refcount_t: decrement hit 0; leaking memory.
  WARNING: CPU: 2 PID: 0 at lib/refcount.c:31 refcount_warn_saturate+0xbf/0x1e0
and an endless loop of:
  unregister_netdevice: waiting for dummy1 to become free. Usage count = -1073741824
That is because dev_put(real_dev) in vlan_dev_free() is called without dev_hold(real_dev) in register_vlan_dev(). It makes the refcnt of real_dev underflow. Move the dev_hold(real_dev) to vlan_dev_init(), which is the callback of ndo_init(). That makes dev_hold() and dev_put() for vlan's real_dev symmetrical. Fixes: 563bcbae ("net: vlan: fix a UAF in vlan_dev_real_dev()") Reported-by: Petr Machata <petrm@nvidia.com> Suggested-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com> Link: https://lore.kernel.org/r/20211126015942.2918542-1-william.xuanziyang@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
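The hold/put asymmetry can be modeled with a toy reference counter. This is a deliberately simplified sketch, not the VLAN code: `ex_dev`, `ex_register_old()` and `ex_register_new()` are hypothetical, and the "free" path is collapsed into the failure branch.

```c
/* Toy device with a plain integer reference count. */
struct ex_dev {
	int refcnt;
};

static void ex_hold(struct ex_dev *d) { d->refcnt++; }
static void ex_put(struct ex_dev *d)  { d->refcnt--; }

/* Old ordering: the hold is taken late during registration. If an
 * earlier step fails, the free path still drops a reference that was
 * never taken, so the count goes one below its starting value. */
static int ex_register_old(struct ex_dev *real, int fail_early)
{
	if (fail_early) {
		ex_put(real);	/* free path: unmatched put */
		return -1;
	}
	ex_hold(real);
	return 0;
}

/* New ordering: the hold is taken in the init step, before anything
 * can fail, so the put in the free path always has a matching hold. */
static int ex_register_new(struct ex_dev *real, int fail_early)
{
	ex_hold(real);		/* init step */
	if (fail_early) {
		ex_put(real);	/* free path: matched put */
		return -1;
	}
	return 0;
}
```

In this model, a failed registration leaves the count unchanged only with the new ordering.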
-
Jakub Kicinski authored
All the filter names are missing _PTP in them. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Acked-by: Richard Cochran <richardcochran@gmail.com> Link: https://lore.kernel.org/r/20211126031921.2466944-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-
Julian Wiedmann authored
ethtool_set_coalesce() now uses both the .get_coalesce() and .set_coalesce() callbacks. But the check for their availability is buggy, so changing the coalesce settings on a device where the driver provides only _one_ of the callbacks results in a NULL pointer dereference instead of an -EOPNOTSUPP. Fix the condition so that the availability of both callbacks is ensured. This also matches the netlink code. Note that reproducing this requires some effort: it only affects the legacy ioctl path, and needs a specific combination of driver options: - have .get_coalesce() and .coalesce_supported but no .set_coalesce(), or - have .set_coalesce() but no .get_coalesce(). Here, e.g., ethtool doesn't cause the crash as it first attempts to call ethtool_get_coalesce() and bails out on error. Fixes: f3ccfda1 ("ethtool: extend coalesce setting uAPI with CQE mode") Cc: Yufeng Mo <moyufeng@huawei.com> Cc: Huazhong Tan <tanhuazhong@huawei.com> Cc: Andrew Lunn <andrew@lunn.ch> Cc: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Link: https://lore.kernel.org/r/20211126175543.28000-1-jwi@linux.ibm.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
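The difference between "at least one callback present" and "both callbacks present" can be sketched as below. The struct and function names are hypothetical, and the buggy variant is only one plausible shape of such a check (an assumption, since the commit message does not quote the original condition).

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical, reduced ops table with only the two callbacks. */
struct ex_ops {
	int (*get_coalesce)(void);
	int (*set_coalesce)(void);
};

/* Trivial callback used to populate an ops table in examples. */
static int ex_noop(void)
{
	return 0;
}

/* Assumed buggy shape: rejects only when BOTH callbacks are missing,
 * so a driver providing just one of them slips through. */
static int ex_check_buggy(const struct ex_ops *ops)
{
	if (!ops->get_coalesce && !ops->set_coalesce)
		return -EOPNOTSUPP;
	return 0;
}

/* Fixed shape: rejects unless both callbacks are available. */
static int ex_check_fixed(const struct ex_ops *ops)
{
	if (!ops->get_coalesce || !ops->set_coalesce)
		return -EOPNOTSUPP;
	return 0;
}
```

A caller that passes the buggy check with only one callback set would go on to call through a NULL function pointer, which is the crash described above.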
-
Thadeu Lima de Souza Cascardo authored
The device permissions are S_IALLUGO, which sets many unnecessary bits. Remove them, and also remove read and write permissions from group and others.

Before the change:
crwsrwsrwt 1 0 0 10, 125 Nov 25 13:59 /dev/virtual_nci

After the change:
crw------- 1 0 0 10, 125 Nov 25 14:05 /dev/virtual_nci

Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
Reviewed-by: Bongsu Jeon <bongsu.jeon@samsung.com>
Link: https://lore.kernel.org/r/20211125141457.716921-1-cascardo@canonical.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
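To see why the two listings above differ, the following sketch renders a mode value the way `ls -l` prints its permission column. The `TOY_*` macros and `toy_mode_string()` are hypothetical helpers written for this illustration; the octal values are the standard Unix mode bits, and `TOY_ALLUGO` is assumed to mirror the kernel's S_IALLUGO (setuid, setgid, sticky, plus rwx for user, group and others).

```c
/* Standard Unix mode bits in octal, spelled out for portability. */
#define TOY_ISUID   04000u  /* set-user-ID */
#define TOY_ISGID   02000u  /* set-group-ID */
#define TOY_ISVTX   01000u  /* sticky bit */
#define TOY_RWXUGO  00777u  /* rwx for user, group, others */

/* Assumption: mirrors the kernel's S_IALLUGO. */
#define TOY_ALLUGO   (TOY_ISUID | TOY_ISGID | TOY_ISVTX | TOY_RWXUGO)
/* Owner read/write only, as in the "after" listing. */
#define TOY_OWNER_RW 00600u

/*
 * Write the nine permission characters for 'mode' into buf (which must
 * hold at least 10 bytes) and return buf. The file-type character
 * ('c' for a character device) is not included.
 */
char *toy_mode_string(unsigned int mode, char buf[10])
{
    static const char rwx[] = "rwxrwxrwx";
    int i;

    for (i = 0; i < 9; i++)
        buf[i] = (mode & (0400u >> i)) ? rwx[i] : '-';

    /* Special bits replace the execute slot of their triplet. */
    if (mode & TOY_ISUID)
        buf[2] = (buf[2] == 'x') ? 's' : 'S';
    if (mode & TOY_ISGID)
        buf[5] = (buf[5] == 'x') ? 's' : 'S';
    if (mode & TOY_ISVTX)
        buf[8] = (buf[8] == 'x') ? 't' : 'T';

    buf[9] = '\0';
    return buf;
}
```

Passing `TOY_ALLUGO` produces the permission string seen before the change, and `TOY_OWNER_RW` produces the one seen after it.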
-
Davide Caratti authored
When the number of DRR classes decreases, the round-robin active list can contain elements that have already been freed in ets_qdisc_change(). As a consequence, it's possible to see a NULL dereference crash, caused by the attempt to call cl->qdisc->ops->peek(cl->qdisc) when cl->qdisc is NULL:

BUG: kernel NULL pointer dereference, address: 0000000000000018
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 0 P4D 0
Oops: 0000 [#1] PREEMPT SMP NOPTI
CPU: 1 PID: 910 Comm: mausezahn Not tainted 5.16.0-rc1+ #475
Hardware name: Red Hat KVM, BIOS 1.11.1-4.module+el8.1.0+4066+0f1aadab 04/01/2014
RIP: 0010:ets_qdisc_dequeue+0x129/0x2c0 [sch_ets]
Code: c5 01 41 39 ad e4 02 00 00 0f 87 18 ff ff ff 49 8b 85 c0 02 00 00 49 39 c4 0f 84 ba 00 00 00 49 8b ad c0 02 00 00 48 8b 7d 10 <48> 8b 47 18 48 8b 40 38 0f ae e8 ff d0 48 89 c3 48 85 c0 0f 84 9d
RSP: 0000:ffffbb36c0b5fdd8 EFLAGS: 00010287
RAX: ffff956678efed30 RBX: 0000000000000000 RCX: 0000000000000000
RDX: 0000000000000002 RSI: ffffffff9b938dc9 RDI: 0000000000000000
RBP: ffff956678efed30 R08: e2f3207fe360129c R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000001 R12: ffff956678efeac0
R13: ffff956678efe800 R14: ffff956611545000 R15: ffff95667ac8f100
FS: 00007f2aa9120740(0000) GS:ffff95667b800000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000018 CR3: 000000011070c000 CR4: 0000000000350ee0
Call Trace:
<TASK>
qdisc_peek_dequeued+0x29/0x70 [sch_ets]
tbf_dequeue+0x22/0x260 [sch_tbf]
__qdisc_run+0x7f/0x630
net_tx_action+0x290/0x4c0
__do_softirq+0xee/0x4f8
irq_exit_rcu+0xf4/0x130
sysvec_apic_timer_interrupt+0x52/0xc0
asm_sysvec_apic_timer_interrupt+0x12/0x20
RIP: 0033:0x7f2aa7fc9ad4
Code: b9 ff ff 48 8b 54 24 18 48 83 c4 08 48 89 ee 48 89 df 5b 5d e9 ed fc ff ff 0f 1f 00 66 2e 0f 1f 84 00 00 00 00 00 f3 0f 1e fa <53> 48 83 ec 10 48 8b 05 10 64 33 00 48 8b 00 48 85 c0 0f 85 84 00
RSP: 002b:00007ffe5d33fab8 EFLAGS: 00000202
RAX: 0000000000000002 RBX: 0000561f72c31460 RCX: 0000561f72c31720
RDX: 0000000000000002 RSI: 0000561f72c31722 RDI: 0000561f72c31720
RBP: 000000000000002a R08: 00007ffe5d33fa40 R09: 0000000000000014
R10: 0000000000000000 R11: 0000000000000246 R12: 0000561f7187e380
R13: 0000000000000000 R14: 0000000000000000 R15: 0000561f72c31460
</TASK>
Modules linked in: sch_ets sch_tbf dummy rfkill iTCO_wdt intel_rapl_msr iTCO_vendor_support intel_rapl_common joydev virtio_balloon lpc_ich i2c_i801 i2c_smbus pcspkr ip_tables xfs libcrc32c crct10dif_pclmul crc32_pclmul crc32c_intel ahci libahci ghash_clmulni_intel serio_raw libata virtio_blk virtio_console virtio_net net_failover failover sunrpc dm_mirror dm_region_hash dm_log dm_mod
CR2: 0000000000000018

Ensuring that 'alist' was never zeroed [1] was not sufficient: we also need to remove from the active list those elements that are no longer SP or DRR.

[1] https://lore.kernel.org/netdev/60d274838bf09777f0371253416e8af71360bc08.1633609148.git.dcaratti@redhat.com/

v3: fix race between ets_qdisc_change() and ets_qdisc_dequeue() delisting DRR classes beyond 'nbands' in ets_qdisc_change() with the qdisc lock acquired, thanks to Cong Wang.

v2: when a NULL qdisc is found in the DRR active list, try to dequeue skb from the next list item.

Reported-by: Hangbin Liu <liuhangbin@gmail.com>
Fixes: dcc68b4d ("net: sch_ets: Add a new Qdisc")
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Link: https://lore.kernel.org/r/7a5c496eed2d62241620bdbb83eb03fb9d571c99.1637762721.git.dcaratti@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
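The delisting idea can be sketched with a toy active list in userspace. This is an illustration under simplifying assumptions, not the sch_ets code: all `toy_*` names are hypothetical, every band is treated as a round-robin class, and locking is left out entirely, whereas the message above notes that the real change is made with the qdisc lock held.

```c
#include <stddef.h>

#define TOY_MAX_BANDS 4

struct toy_class {
    struct toy_class *prev, *next;  /* links in the circular active list */
    int on_list;                    /* 1 while linked into the active list */
    int has_qdisc;                  /* 0 once the child qdisc is released */
};

struct toy_sched {
    struct toy_class active;        /* sentinel head of the active list */
    struct toy_class classes[TOY_MAX_BANDS];
    int nbands;
};

void toy_sched_init(struct toy_sched *q, int nbands)
{
    int i;

    q->active.prev = q->active.next = &q->active;
    q->active.on_list = 0;
    q->active.has_qdisc = 0;
    q->nbands = nbands;
    for (i = 0; i < TOY_MAX_BANDS; i++) {
        q->classes[i].prev = q->classes[i].next = NULL;
        q->classes[i].on_list = 0;
        q->classes[i].has_qdisc = i < nbands;
    }
}

/* Append a band to the tail of the active list, as if it had backlog. */
void toy_activate(struct toy_sched *q, int band)
{
    struct toy_class *cl;

    if (band < 0 || band >= q->nbands)
        return;
    cl = &q->classes[band];
    if (cl->on_list)
        return;
    cl->prev = q->active.prev;
    cl->next = &q->active;
    q->active.prev->next = cl;
    q->active.prev = cl;
    cl->on_list = 1;
}

static void toy_delist(struct toy_class *cl)
{
    cl->prev->next = cl->next;
    cl->next->prev = cl->prev;
    cl->prev = cl->next = NULL;
    cl->on_list = 0;
}

/*
 * Change the number of bands (new_nbands must be 0..TOY_MAX_BANDS).
 * Bands at or beyond new_nbands are unlinked from the active list
 * BEFORE their qdisc is released, so a later walk never meets them.
 */
void toy_change(struct toy_sched *q, int new_nbands)
{
    int i;

    for (i = new_nbands; i < q->nbands; i++) {
        if (q->classes[i].on_list)
            toy_delist(&q->classes[i]);
        q->classes[i].has_qdisc = 0;
    }
    for (i = q->nbands; i < new_nbands; i++)
        q->classes[i].has_qdisc = 1;
    q->nbands = new_nbands;
}

/* Walk the active list; return its length, or -1 on a stale entry. */
int toy_active_count(const struct toy_sched *q)
{
    const struct toy_class *cl;
    int n = 0;

    for (cl = q->active.next; cl != &q->active; cl = cl->next) {
        if (!cl->has_qdisc)
            return -1;
        n++;
    }
    return n;
}
```

If `toy_change()` skipped the `toy_delist()` call, `toy_active_count()` would report a stale entry after shrinking, which is the toy equivalent of the dequeue path reaching a class whose qdisc is gone.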
-
Rafael J. Wysocki authored
Merge fix and cleanup related to the management of ACPI device properties for 5.16-rc3.

* acpi-properties:
  ACPI: Make acpi_node_get_parent() local
  ACPI: Get acpi_device's parent from the parent field
-