- 11 Feb, 2021 1 commit
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queueDavid S. Miller authored
Tony Nguyen says: ==================== 40GbE Intel Wired LAN Driver Updates 2021-02-10 This series contains updates to i40e driver only. Arkadiusz adds support for software controlled DCB. Upon disabling of the firmware LLDP agent, the driver configures DCB with default values (only one Traffic Class). At the same time, it allows a software based LLDP agent - userspace application i.e. lldpad) to receive DCB TLVs and set desired DCB configuration through DCB related netlink callbacks. Aleksandr implements get and set ethtool ops for Energy Efficient Ethernet. Przemyslaw extends support for ntuple filters allowing for Flow Director IPv6 and VLAN filters. Kaixu Xia removes an unneeded assignment. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 10 Feb, 2021 39 commits
-
-
Stefan Chulski authored
This entry used when skipping the parser needed, for example, the custom header pretended to ethernet header. Suggested-by: Liron Himi <liron@marvell.com> Signed-off-by: Stefan Chulski <stefanc@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David Howells authored
The changes to make rxrpc create the udp socket missed a bit to add the Kconfig dependency on the udp tunnel code to do this. Fix this by adding making AF_RXRPC select NET_UDP_TUNNEL. Fixes: 1a9b86c9 ("rxrpc: use udp tunnel APIs instead of open code in rxrpc_open_socket") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Vadim Fedorenko <vfedorenko@novek.ru> Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Xin Long <lucien.xin@gmail.com> cc: alaa@dev.mellanox.co.il cc: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Hariprasad Kelam says: ==================== ethtool support for fec and link configuration This series of patches add support for forward error correction(fec) and physical link configuration. Patches 1&2 adds necessary mbox handlers for fec mode configuration request and to fetch stats. Patch 3 registers driver callbacks for fec mode configuration and display. Patch 4&5 adds support of mbox handlers for configuring link parameters like speed/duplex and autoneg etc. Patche 6&7 registers driver callbacks for physical link configuration. Change-log: v2: - Fixed review comments - Corrected indentation issues - Return -ENOMEM incase of mbox allocation failure - added validation for input fecparams bitmask values - added more comments V3: - Removed inline functions - Make use of ethtool helpers APIs to display supported advertised modes - corrected indentation issues - code changes such that return early in case of failure to aid branch prediction v4: - Corrected indentation issues - Use FEC_OFF if user requests for FEC_AUTO mode - Do not clear fec stats in case of user changes fec mode - dont hide fec stats depending on interface mode selection ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
-
Christina Jacob authored
Register set_link_ksetting callback with driver such that link configurations parameters like advertised mode,speed, duplex and autoneg can be configured. below command ethtool -s eth0 advertise 0x1 speed 10 duplex full autoneg on Signed-off-by: Christina Jacob <cjacob@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: Hariprasad Kelam <hkelam@marvell.com> Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Christina Jacob authored
Register get_link_ksettings callback to get link status information from the driver. As virtual function (vf) shares same physical link same API is used for both the drivers and for loop back drivers simply returns the fixed values as its does not have physical link. ethtool eth3 Settings for eth3: Supported ports: [ ] Supported link modes: 10baseT/Half 10baseT/Full 100baseT/Half 100baseT/Full 1000baseT/Half 1000baseT/Full 10000baseKR/Full 1000baseX/Full Supports auto-negotiation: No Supported FEC modes: BaseR RS Advertised link modes: Not reported Advertised pause frame use: No Advertised auto-negotiation: No Advertised FEC modes: None ethtool lbk0 Settings for lbk0: Speed: 100000Mb/s Duplex: Full Signed-off-by: Christina Jacob <cjacob@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: Hariprasad Kelam <hkelam@marvell.com> Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Christina Jacob authored
CGX supports setting advertised link modes on physical link. This patch adds support to derive cgx mode from ethtool link mode and pass it to firmware to configure the same. Signed-off-by: Christina Jacob <cjacob@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: Hariprasad Kelam <hkelam@marvell.com> Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Christina Jacob authored
CGX LMAC, the physical interface support link configuration parameters like speed, auto negotiation, duplex etc. Firmware saves these into memory region shared between firmware and this driver. This patch adds mailbox handler set_link_mode, fw_data_get to configure and read these parameters. Signed-off-by: Christina Jacob <cjacob@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: Hariprasad Kelam <hkelam@marvell.com> Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Christina Jacob authored
Add ethtool support to configure fec modes baser/rs and support to fecth FEC stats from CGX as well PHY. Configure fec mode - ethtool --set-fec eth0 encoding rs/baser/off/auto Query fec mode - ethtool --show-fec eth0 Signed-off-by: Christina Jacob <cjacob@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: Hariprasad Kelam <hkelam@marvell.com> Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Felix Manlunas authored
This patch adds support to fetch fec stats from PHY. The stats are put in the shared data struct fwdata. A PHY driver indicates that it has FEC stats by setting the flag fwdata.phy.misc.has_fec_stats Besides CGX_CMD_GET_PHY_FEC_STATS, also add CGX_CMD_PRBS and CGX_CMD_DISPLAY_EYE to enum cgx_cmd_id so that Linux's enum list is in sync with firmware's enum list. Signed-off-by: Felix Manlunas <fmanlunas@marvell.com> Signed-off-by: Christina Jacob <cjacob@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: Hariprasad Kelam <hkelam@marvell.com> Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Christina Jacob authored
CGX block supports forward error correction modes baseR and RS. This patch adds support to set encoding mode and to read corrected/uncorrected block counters Adds new mailbox handlers set_fec to configure encoding modes and fec_stats to read counters and also increase mbox timeout to accomdate firmware command response timeout. Along with new CGX_CMD_SET_FEC command add other commands to sync with kernel enum list with firmware. Signed-off-by: Christina Jacob <cjacob@marvell.com> Signed-off-by: Sunil Goutham <sgoutham@marvell.com> Signed-off-by: Hariprasad Kelam <hkelam@marvell.com> Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Kevin Hao authored
Pavel pointed that the return of dma_addr_t in otx2_alloc_rbuf/__otx2_alloc_rbuf() seem suspicious because a negative error code may be returned in some cases. For a dma_addr_t, the error code such as -ENOMEM does seem a valid value, so we can't judge if the buffer allocation fail or not based on that value. Add a parameter for otx2_alloc_rbuf/__otx2_alloc_rbuf() to store the dma address and make the return value to indicate if the buffer allocation really fail or not. Reported-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Kevin Hao <haokexin@gmail.com> Tested-by: Subbaraya Sundeep <sbhatta@marvell.com> Reviewed-by: Alexander Duyck <alexanderduyck@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
David S. Miller authored
Signed-off-by: David S. Miller <davem@davemloft.net>
-
Loic Poulain authored
MBIM has initially been specified by USB-IF for transporting data (IP) between a modem and a host over USB. However some modern modems also support MBIM over PCIe (via MHI). In the same way as QMAP(rmnet), it allows to aggregate IP packets and to perform context multiplexing. This change adds minimal MBIM data transport support to MHI, allowing to support MBIM only modems. MBIM being based on USB NCM, it reuses and copy some helpers/functions from the USB stack (cdc-ncm, cdc-mbim). Note that is a subset of the CDC-MBIM specification, supporting only transport of network data (IP), there is no support for DSS. Moreover the multi-session (for multi-pdn) is not supported in this initial version, but will be added latter, and aligned with the cdc-mbim solution (VLAN tags). This code has been inspired from the mhi_mbim downstream implementation (Carl Yin <carl.yin@quectel.com>). Signed-off-by: Loic Poulain <loic.poulain@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Loic Poulain authored
This can be used by proto when packet len is incorrect. Signed-off-by: Loic Poulain <loic.poulain@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Loic Poulain authored
Move mhi-net shared structures to mhi header, that will be used by upcoming proto(s). Signed-off-by: Loic Poulain <loic.poulain@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Loic Poulain authored
Create a dedicated mhi directory for mhi-net, mhi-net is going to be split into differente files (for additional protocol support). Signed-off-by: Loic Poulain <loic.poulain@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Loic Poulain authored
MHI can transport different protocols, some are handled at upper level, like IP and QMAP(rmnet/netlink), but others will need to be inside MHI net driver, like mbim. This change adds support for protocol rx and tx_fixup callbacks registration, that can be used to encode/decode the targeted protocol. Signed-off-by: Loic Poulain <loic.poulain@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Rahul Lakkireddy authored
Collect serial config version information directly from an internal register, instead of explicitly resizing VPD. v2: - Add comments on info stored in PCIE_STATIC_SPARE2 register. Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com> Reviewed-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Alexander Duyck <alexanderduyck@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Kaixu Xia authored
The variable ret is overwritten by the following call i40e_clean_arq_element() and the assignment is useless, so remove it. Reported-by: Tosk Robot <tencent_os_robot@tencent.com> Signed-off-by: Kaixu Xia <kaixuxia@tencent.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
-
Przemyslaw Patynowski authored
Allow user to specify VLAN field and add it to flow director. Show VLAN field in "ethtool -n ethx" command. Handle VLAN type and tag field provided by ethtool command. Refactored filter addition, by replacing static arrays with runtime dummy packet creation, which allows specifying VLAN field. Previously, VLAN field was omitted. Signed-off-by: Przemyslaw Patynowski <przemyslawx.patynowski@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
-
Przemyslaw Patynowski authored
Flow director for IPv6 is not supported. 1) Implementation of support for IPv6 flow director. 2) Added handlers for addition of TCP6, UDP6, SCTP6, IPv6. 3) Refactored legacy code to make it more generic. 4) Added packet templates for TCP6, UDP6, SCTP6, IPv6. 5) Added handling of IPv6 source and destination address for flow director. 6) Improved argument passing for source and destination portin TCP6, UDP6 and SCTP6. 7) Added handling of ethtool -n for IPv6, TCP6,UDP6, SCTP6. 8) Used correct bit flag regarding FLEXOFF field of flow director data descriptor. Without this patch, there would be no support for flow director on IPv6, TCP6, UDP6, SCTP6. Tested based on x710 datasheet by using: ethtool -N enp133s0f0 flow-type tcp4 src-port 13 dst-port 37 user-def 0x44142 action 1 ethtool -N enp133s0f0 flow-type tcp6 src-port 13 dst-port 40 user-def 0x44142 action 2 ethtool -N enp133s0f0 flow-type udp4 src-port 20 dst-port 40 user-def 0x44142 action 3 ethtool -N enp133s0f0 flow-type udp6 src-port 25 dst-port 40 user-def 0x44142 action 4 ethtool -N enp133s0f0 flow-type sctp4 src-port 55 dst-port 65 user-def 0x44142 action 5 ethtool -N enp133s0f0 flow-type sctp6 src-port 60 dst-port 40 user-def 0x44142 action 6 ethtool -N enp133s0f0 flow-type ip4 src-ip 1.1.1.1 dst-ip 1.1.1.4 user-def 0x44142 action 7 ethtool -N enp133s0f0 flow-type ip6 src-ip fe80::3efd:feff:fe6f:bbbb dst-ip fe80::3efd:feff:fe6f:aaaa user-def 0x44142 action 8 Then send traffic from client which matches the criteria provided to ethtool. Observe that packets are redirected to user set queues with ethtool -S <interface> Signed-off-by: Przemyslaw Patynowski <przemyslawx.patynowski@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
-
Aleksandr Loktionov authored
Implement Energy Efficient Ethernet (EEE) status getting & setting. The i40e_get_eee() requesting PHY EEE capabilities from firmware. The i40e_set_eee() function requests PHY EEE capabilities from firmware and sets PHY EEE advertising to full abilities or 0 depending whether EEE is to be enabled or disabled. Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
-
Arkadiusz Kubalewski authored
Add callbacks used by software based LLDP agent, which allows to configure DCB feature from userspace. Update copyright dates as appropriate. If LLDP agent is turned off in BIOS, or after setting private flag ("disable-fw-lldp on"). The driver initialized DCB functionality with default values, one traffic class with 100% bandwidth allocated. The new netlink callbacks are required for software LLDP agent, it must be able to acquire current DCB configuration of a network port and apply DCB configuration changes, if required. Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
-
Arkadiusz Kubalewski authored
Add extra handling on changing the "disable-fw-lldp" private flag to properly initialize software based DCB feature. Add default configuration of DCB functionality when Firmware LLDP agent is turned off, in case of driver probe and device reset on reconfiguration. Update copyright dates as appropriate. Software based DCB is a brand-new feature in i40e driver. Before, DCB was implemented by Firmware LLDP agent only. The agent was responsible for handling incoming DCB-related LLDP frames and applying received DCB configuration to hardware. Default configuration and new initialization flow for software based DCB is required. If LLDP agent is turned off in BIOS, or after setting private flag ("disable-fw-lldp on"). The driver initializes DCB functionality with default values, one traffic class with 100% bandwidth allocated. Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
-
Arkadiusz Kubalewski authored
Add registers and definitions required for applying DCB related hardware configuration. Add functions responsible for calculating and setting proper hardware configuration values for software based DCB functionality. Add function responsible for invoking Admin Queue command, which results in applying new DCB configuration to the hardware. Update copyright dates as appropriate. Software based DCB is a brand-new feature in i40e driver. Before, DCB was implemented by Firmware LLDP agent only. The agent was responsible for handling incoming DCB-related LLDP frames and applying received DCB configuration to hardware. New communication channel between software and hardware is required for software driver. It must be able to calculate and configure all the registers related for DCB feature. Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pmLinus Torvalds authored
Pull power management fixes from Rafael Wysocki: "Address a performance regression related to scale-invariance on x86 that may prevent turbo CPU frequencies from being used in certain workloads on systems using acpi-cpufreq as the CPU performance scaling driver and schedutil as the scaling governor" * tag 'pm-5.11-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: cpufreq: ACPI: Update arch scale-invariance max perf ratio if CPPC is not there cpufreq: ACPI: Extend frequency tables to cover boost frequencies
-
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pmLinus Torvalds authored
Pull ACPI fix from Rafael Wysocki: "Revert a problematic ACPICA commit that changed the code to attempt to update memory regions which may be read-only on some systems (Ard Biesheuvel)" * tag 'acpi-5.11-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: Revert "ACPICA: Interpreter: fix memory leak by using existing buffer"
-
git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengineLinus Torvalds authored
Pull dmaengine fixes from Vinod Koul: "Some late fixes for dmaengine: Core: - fix channel device_node deletion Driver fixes: - dw: revert of runtime pm enabling - idxd: device state fix, interrupt completion and list corruption - ti: resource leak * tag 'dmaengine-fix2-5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine: dmaengine dw: Revert "dmaengine: dw: Enable runtime PM" dmaengine: idxd: check device state before issue command dmaengine: ti: k3-udma: Fix a resource leak in an error handling path dmaengine: move channel device_node deletion to driver dmaengine: idxd: fix misc interrupt completion dmaengine: idxd: Fix list corruption in description completion
-
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netLinus Torvalds authored
Pull networking fixes from David Miller: "Another pile of networing fixes: 1) ath9k build error fix from Arnd Bergmann 2) dma memory leak fix in mediatec driver from Lorenzo Bianconi. 3) bpf int3 kprobe fix from Alexei Starovoitov. 4) bpf stackmap integer overflow fix from Bui Quang Minh. 5) Add usb device ids for Cinterion MV31 to qmi_qwwan driver, from Christoph Schemmel. 6) Don't update deleted entry in xt_recent netfilter module, from Jazsef Kadlecsik. 7) Use after free in nftables, fix from Pablo Neira Ayuso. 8) Header checksum fix in flowtable from Sven Auhagen. 9) Validate user controlled length in qrtr code, from Sabyrzhan Tasbolatov. 10) Fix race in xen/netback, from Juergen Gross, 11) New device ID in cxgb4, from Raju Rangoju. 12) Fix ring locking in rxrpc release call, from David Howells. 13) Don't return LAPB error codes from x25_open(), from Xie He. 14) Missing error returns in gsi_channel_setup() from Alex Elder. 15) Get skb_copy_and_csum_datagram working properly with odd segment sizes, from Willem de Bruijn. 16) Missing RFS/RSS table init in enetc driver, from Vladimir Oltean. 17) Do teardown on probe failure in DSA, from Vladimir Oltean. 18) Fix compilation failures of txtimestamp selftest, from Vadim Fedorenko. 19) Limit rx per-napi gro queue size to fix latency regression, from Eric Dumazet. 20) dpaa_eth xdp fixes from Camelia Groza. 21) Missing txq mode update when switching CBS off, in stmmac driver, from Mohammad Athari Bin Ismail. 22) Failover pending logic fix in ibmvnic driver, from Sukadev Bhattiprolu. 23) Null deref fix in vmw_vsock, from Norbert Slusarek. 24) Missing verdict update in xdp paths of ena driver, from Shay Agroskin. 25) seq_file iteration fix in sctp from Neil Brown. 26) bpf 32-bit src register truncation fix on div/mod, from Daniel Borkmann. 27) Fix jmp32 pruning in bpf verifier, from Daniel Borkmann. 28) Fix locking in vsock_shutdown(), from Stefano Garzarella. 29) Various missing index bound checks in hns3 driver, from Yufeng Mo. 30) Flush ports on .phylink_mac_link_down() in dsa felix driver, from Vladimir Oltean. 31) Don't mix up stp and mrp port states in bridge layer, from Horatiu Vultur. 32) Fix locking during netif_tx_disable(), from Edwin Peer" * git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (45 commits) bpf: Fix 32 bit src register truncation on div/mod bpf: Fix verifier jmp32 pruning decision logic bpf: Fix verifier jsgt branch analysis on max bound vsock: fix locking in vsock_shutdown() net: hns3: add a check for index in hclge_get_rss_key() net: hns3: add a check for tqp_index in hclge_get_ring_chain_from_mbx() net: hns3: add a check for queue_id in hclge_reset_vf_queue() net: dsa: felix: implement port flushing on .phylink_mac_link_down switchdev: mrp: Remove SWITCHDEV_ATTR_ID_MRP_PORT_STAT bridge: mrp: Fix the usage of br_mrp_port_switchdev_set_state net: watchdog: hold device global xmit lock during tx disable netfilter: nftables: relax check for stateful expressions in set definition netfilter: conntrack: skip identical origin tuple in same zone only vsock/virtio: update credit only if socket is not closed net: fix iteration for sctp transport seq_files net: ena: Update XDP verdict upon failure net/vmw_vsock: improve locking in vsock_connect_timeout() net/vmw_vsock: fix NULL pointer dereference ibmvnic: Clear failover_pending if unable to schedule net: stmmac: set TxQ mode back to DCB after disabling CBS ...
-
Linus Torvalds authored
Merge misc fixes from Andrew Morton: "14 patches. Subsystems affected by this patch series: mm (kasan, mremap, tmpfs, selftests, memcg, and slub), MAINTAINERS, squashfs, nilfs2, and firmware" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: nilfs2: make splice write available again mm, slub: better heuristic for number of cpus when calculating slab order Revert "mm: memcontrol: avoid workload stalls when lowering memory.high" MAINTAINERS: update Andrey Ryabinin's email address selftests/vm: rename file run_vmtests to run_vmtests.sh tmpfs: disallow CONFIG_TMPFS_INODE64 on alpha tmpfs: disallow CONFIG_TMPFS_INODE64 on s390 mm/mremap: fix BUILD_BUG_ON() error in get_extent firmware_loader: align .builtin_fw to 8 kasan: fix stack traces dependency for HW_TAGS squashfs: add more sanity checks in xattr id lookup squashfs: add more sanity checks in inode lookup squashfs: add more sanity checks in id lookup squashfs: avoid out of bounds writes in decompressors
-
Joachim Henke authored
Since 5.10, splice() or sendfile() to NILFS2 return EINVAL. This was caused by commit 36e2c742 ("fs: don't allow splice read/write without explicit ops"). This patch initializes the splice_write field in file_operations, like most file systems do, to restore the functionality. Link: https://lkml.kernel.org/r/1612784101-14353-1-git-send-email-konishi.ryusuke@gmail.comSigned-off-by: Joachim Henke <joachim.henke@t-systems.com> Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> Tested-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> Cc: <stable@vger.kernel.org> [5.10+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Vlastimil Babka authored
When creating a new kmem cache, SLUB determines how large the slab pages will based on number of inputs, including the number of CPUs in the system. Larger slab pages mean that more objects can be allocated/free from per-cpu slabs before accessing shared structures, but also potentially more memory can be wasted due to low slab usage and fragmentation. The rough idea of using number of CPUs is that larger systems will be more likely to benefit from reduced contention, and also should have enough memory to spare. Number of CPUs used to be determined as nr_cpu_ids, which is number of possible cpus, but on some systems many will never be onlined, thus commit 045ab8c9 ("mm/slub: let number of online CPUs determine the slub page order") changed it to nr_online_cpus(). However, for kmem caches created early before CPUs are onlined, this may lead to permamently low slab page sizes. Vincent reports a regression [1] of hackbench on arm64 systems: "I'm facing significant performances regression on a large arm64 server system (224 CPUs). Regressions is also present on small arm64 system (8 CPUs) but in a far smaller order of magnitude On 224 CPUs system : 9 iterations of hackbench -l 16000 -g 16 v5.11-rc4 : 9.135sec (+/- 0.45%) v5.11-rc4 + revert this patch: 3.173sec (+/- 0.48%) v5.10: 3.136sec (+/- 0.40%)" Mel reports a regression [2] of hackbench on x86_64, with lockstat suggesting page allocator contention: "i.e. the patch incurs a 7% to 32% performance penalty. This bisected cleanly yesterday when I was looking for the regression and then found the thread. Numerous caches change size. For example, kmalloc-512 goes from order-0 (vanilla) to order-2 with the revert. So mostly this is down to the number of times SLUB calls into the page allocator which only caches order-0 pages on a per-cpu basis" Clearly num_online_cpus() doesn't work too early in bootup. We could change the order dynamically in a memory hotplug callback, but runtime order changing for existing kmem caches has been already shown as dangerous, and removed in 32a6f409 ("mm, slub: remove runtime allocation order changes"). It could be resurrected in a safe manner with some effort, but to fix the regression we need something simpler. We could use num_present_cpus() that should be the number of physically present CPUs even before they are onlined. That would work for PowerPC [3], which triggered the original commit, but that still doesn't work on arm64 [4] as explained in [5]. So this patch tries to determine the best available value without specific arch knowledge. - num_present_cpus() if the number is larger than 1, as that means the arch is likely setting it properly - nr_cpu_ids otherwise This should fix the reported regressions while also keeping the effect of 045ab8c9 for PowerPC systems. It's possible there are configurations where num_present_cpus() is 1 during boot while nr_cpu_ids is at the same time bloated, so these (if they exist) would keep the large orders based on nr_cpu_ids as was before 045ab8c9. [1] https://lore.kernel.org/linux-mm/CAKfTPtA_JgMf_+zdFbcb_V9rM7JBWNPjAz9irgwFj7Rou=xzZg@mail.gmail.com/ [2] https://lore.kernel.org/linux-mm/20210128134512.GF3592@techsingularity.net/ [3] https://lore.kernel.org/linux-mm/20210123051607.GC2587010@in.ibm.com/ [4] https://lore.kernel.org/linux-mm/CAKfTPtAjyVmS5VYvU6DBxg4-JEo5bdmWbngf-03YsY18cmWv_g@mail.gmail.com/ [5] https://lore.kernel.org/linux-mm/20210126230305.GD30941@willie-the-truck/ Link: https://lkml.kernel.org/r/20210208134108.22286-1-vbabka@suse.cz Fixes: 045ab8c9 ("mm/slub: let number of online CPUs determine the slub page order") Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Reported-by: Vincent Guittot <vincent.guittot@linaro.org> Reported-by: Mel Gorman <mgorman@techsingularity.net> Tested-by: Mel Gorman <mgorman@techsingularity.net> Tested-by: Vincent Guittot <vincent.guittot@linaro.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> Cc: Bharata B Rao <bharata@linux.ibm.com> Cc: Christoph Lameter <cl@linux.com> Cc: Roman Gushchin <guro@fb.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Jann Horn <jannh@google.com> Cc: Michal Hocko <mhocko@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Shakeel Butt <shakeelb@google.com> Cc: Will Deacon <will@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfDavid S. Miller authored
Daniel Borkmann says: ==================== pull-request: bpf 2021-02-10 The following pull-request contains BPF updates for your *net* tree. We've added 5 non-merge commits during the last 8 day(s) which contain a total of 3 files changed, 22 insertions(+), 21 deletions(-). The main changes are: 1) Fix missed execution of kprobes BPF progs when kprobe is firing via int3, from Alexei Starovoitov. 2) Fix potential integer overflow in map max_entries for stackmap on 32 bit archs, from Bui Quang Minh. 3) Fix a verifier pruning and a insn rewrite issue related to 32 bit ops, from Daniel Borkmann. ==================== Signed-off-by: David S. Miller <davem@davemloft.net> c# Please enter a commit message to explain why this merge is necessary,
-
Johannes Weiner authored
This reverts commit 536d3bf2, as it can cause writers to memory.high to get stuck in the kernel forever, performing page reclaim and consuming excessive amounts of CPU cycles. Before the patch, a write to memory.high would first put the new limit in place for the workload, and then reclaim the requested delta. After the patch, the kernel tries to reclaim the delta before putting the new limit into place, in order to not overwhelm the workload with a sudden, large excess over the limit. However, if reclaim is actively racing with new allocations from the uncurbed workload, it can keep the write() working inside the kernel indefinitely. This is causing problems in Facebook production. A privileged system-level daemon that adjusts memory.high for various workloads running on a host can get unexpectedly stuck in the kernel and essentially turn into a sort of involuntary kswapd for one of the workloads. We've observed that daemon busy-spin in a write() for minutes at a time, neglecting its other duties on the system, and expending privileged system resources on behalf of a workload. To remedy this, we have first considered changing the reclaim logic to break out after a couple of loops - whether the workload has converged to the new limit or not - and bound the write() call this way. However, the root cause that inspired the sequence change in the first place has been fixed through other means, and so a revert back to the proven limit-setting sequence, also used by memory.max, is preferable. The sequence was changed to avoid extreme latencies in the workload when the limit was lowered: the sudden, large excess created by the limit lowering would erroneously trigger the penalty sleeping code that is meant to throttle excessive growth from below. Allocating threads could end up sleeping long after the write() had already reclaimed the delta for which they were being punished. However, erroneous throttling also caused problems in other scenarios at around the same time. This resulted in commit b3ff9291 ("mm, memcg: reclaim more aggressively before high allocator throttling"), included in the same release as the offending commit. When allocating threads now encounter large excess caused by a racing write() to memory.high, instead of entering punitive sleeps, they will simply be tasked with helping reclaim down the excess, and will be held no longer than it takes to accomplish that. This is in line with regular limit enforcement - i.e. if the workload allocates up against or over an otherwise unchanged limit from below. With the patch breaking userspace, and the root cause addressed by other means already, revert it again. Link: https://lkml.kernel.org/r/20210122184341.292461-1-hannes@cmpxchg.org Fixes: 536d3bf2 ("mm: memcontrol: avoid workload stalls when lowering memory.high") Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Reported-by: Tejun Heo <tj@kernel.org> Acked-by: Chris Down <chris@chrisdown.name> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Roman Gushchin <guro@fb.com> Cc: Shakeel Butt <shakeelb@google.com> Cc: Michal Koutný <mkoutny@suse.com> Cc: <stable@vger.kernel.org> [5.8+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Andrey Ryabinin authored
Update my email, @virtuozzo.com will stop working shortly. Link: https://lkml.kernel.org/r/20210204223904.3824-1-ryabinin.a.a@gmail.comSigned-off-by: Andrey Ryabinin <ryabinin.a.a@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Rong Chen authored
Commit c2aa8afc has renamed run_vmtests in Makefile, but the file still uses the old name. The kernel test robot reported the following issue: # selftests: vm: run_vmtests.sh # Warning: file run_vmtests.sh is missing! not ok 1 selftests: vm: run_vmtests.sh Link: https://lkml.kernel.org/r/20210205085507.1479894-1-rong.a.chen@intel.com Fixes: c2aa8afc (selftests/vm: rename run_vmtests --> run_vmtests.sh) Signed-off-by: Rong Chen <rong.a.chen@intel.com> Reported-by: kernel test robot <lkp@intel.com> Reviewed-by: John Hubbard <jhubbard@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Seth Forshee authored
As with s390, alpha is a 64-bit architecture with a 32-bit ino_t. With CONFIG_TMPFS_INODE64=y tmpfs mounts will get 64-bit inode numbers and display "inode64" in the mount options, whereas passing "inode64" in the mount options will fail. This leads to erroneous behaviours such as this: # mkdir mnt # mount -t tmpfs nodev mnt # mount -o remount,rw mnt mount: /home/ubuntu/mnt: mount point not mounted or bad option. Prevent CONFIG_TMPFS_INODE64 from being selected on alpha. Link: https://lkml.kernel.org/r/20210208215726.608197-1-seth.forshee@canonical.com Fixes: ea3271f7 ("tmpfs: support 64-bit inums per-sb") Signed-off-by: Seth Forshee <seth.forshee@canonical.com> Acked-by: Hugh Dickins <hughd@google.com> Cc: Chris Down <chris@chrisdown.name> Cc: Amir Goldstein <amir73il@gmail.com> Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Matt Turner <mattst88@gmail.com> Cc: <stable@vger.kernel.org> [5.9+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Seth Forshee authored
Currently there is an assumption in tmpfs that 64-bit architectures also have a 64-bit ino_t. This is not true on s390 which has a 32-bit ino_t. With CONFIG_TMPFS_INODE64=y tmpfs mounts will get 64-bit inode numbers and display "inode64" in the mount options, but passing the "inode64" mount option will fail. This leads to the following behavior: # mkdir mnt # mount -t tmpfs nodev mnt # mount -o remount,rw mnt mount: /home/ubuntu/mnt: mount point not mounted or bad option. As mount sees "inode64" in the mount options and thus passes it in the options for the remount. So prevent CONFIG_TMPFS_INODE64 from being selected on s390. Link: https://lkml.kernel.org/r/20210205230620.518245-1-seth.forshee@canonical.com Fixes: ea3271f7 ("tmpfs: support 64-bit inums per-sb") Signed-off-by: Seth Forshee <seth.forshee@canonical.com> Acked-by: Hugh Dickins <hughd@google.com> Cc: Chris Down <chris@chrisdown.name> Cc: Hugh Dickins <hughd@google.com> Cc: Amir Goldstein <amir73il@gmail.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: <stable@vger.kernel.org> [5.9+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-