1. 16 Dec, 2020 5 commits
    • net: dsa: qca: ar9331: fix sleeping function called from invalid context bug · 3e47495f
      Oleksij Rempel authored
      With lockdep enabled, we get the following warning:
      
       ar9331_switch ethernet.1:10 lan0 (uninitialized): PHY [!ahb!ethernet@1a000000!mdio!switch@10:00] driver [Qualcomm Atheros AR9331 built-in PHY] (irq=13)
       BUG: sleeping function called from invalid context at kernel/locking/mutex.c:935
       in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 18, name: kworker/0:1
       INFO: lockdep is turned off.
       irq event stamp: 602
       hardirqs last  enabled at (601): [<8073fde0>] _raw_spin_unlock_irq+0x3c/0x80
       hardirqs last disabled at (602): [<8073a4f4>] __schedule+0x184/0x800
       softirqs last  enabled at (0): [<80080f60>] copy_process+0x578/0x14c8
       softirqs last disabled at (0): [<00000000>] 0x0
       CPU: 0 PID: 18 Comm: kworker/0:1 Not tainted 5.10.0-rc3-ar9331-00734-g7d644991df0c #31
       Workqueue: events deferred_probe_work_func
       Stack : 80980000 80980000 8089ef70 80890000 804b5414 80980000 00000002 80b53728
               00000000 800d1268 804b5414 ffffffde 00000017 800afe08 81943860 0f5bfc32
               00000000 00000000 8089ef70 819436c0 ffffffea 00000000 00000000 00000000
               8194390c 808e353c 0000000f 66657272 80980000 00000000 00000000 80890000
               804b5414 80980000 00000002 80b53728 00000000 00000000 00000000 80d40000
               ...
       Call Trace:
       [<80069ce0>] show_stack+0x9c/0x140
       [<800afe08>] ___might_sleep+0x220/0x244
       [<8073bfb0>] __mutex_lock+0x70/0x374
       [<8073c2e0>] mutex_lock_nested+0x2c/0x38
       [<804b5414>] regmap_update_bits_base+0x38/0x8c
       [<804ee584>] regmap_update_bits+0x1c/0x28
       [<804ee714>] ar9331_sw_unmask_irq+0x34/0x60
       [<800d91f0>] unmask_irq+0x48/0x70
       [<800d93d4>] irq_startup+0x114/0x11c
       [<800d65b4>] __setup_irq+0x4f4/0x6d0
       [<800d68a0>] request_threaded_irq+0x110/0x190
       [<804e3ef0>] phy_request_interrupt+0x4c/0xe4
       [<804df508>] phylink_bringup_phy+0x2c0/0x37c
       [<804df7bc>] phylink_of_phy_connect+0x118/0x130
       [<806c1a64>] dsa_slave_create+0x3d0/0x578
       [<806bc4ec>] dsa_register_switch+0x934/0xa20
       [<804eef98>] ar9331_sw_probe+0x34c/0x364
       [<804eb48c>] mdio_probe+0x44/0x70
       [<8049e3b4>] really_probe+0x30c/0x4f4
       [<8049ea10>] driver_probe_device+0x264/0x26c
       [<8049bc10>] bus_for_each_drv+0xb4/0xd8
       [<8049e684>] __device_attach+0xe8/0x18c
       [<8049ce58>] bus_probe_device+0x48/0xc4
       [<8049db70>] deferred_probe_work_func+0xdc/0xf8
       [<8009ff64>] process_one_work+0x2e4/0x4a0
       [<800a0770>] worker_thread+0x2a8/0x354
       [<800a774c>] kthread+0x16c/0x174
       [<8006306c>] ret_from_kernel_thread+0x14/0x1c
      
       ar9331_switch ethernet.1:10 lan1 (uninitialized): PHY [!ahb!ethernet@1a000000!mdio!switch@10:02] driver [Qualcomm Atheros AR9331 built-in PHY] (irq=13)
       DSA: tree 0 setup
      
      To fix it, move the MDIO register access to the .irq_bus_sync_unlock
      callback, which is called from a context that is allowed to sleep.
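The deferral pattern described above can be sketched in user space. This is a minimal model, not the driver's code: every `model_*` name is hypothetical, `in_atomic` stands in for the spinlock-held section in which the IRQ core calls mask/unmask, and `model_bus_write()` stands in for a sleeping MDIO/regmap access.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
struct model_switch {
	bool in_atomic;        /* true while the modelled spinlock is held */
	bool irq_enable;       /* cached state, changed by mask/unmask */
	uint32_t hw_mask_reg;  /* value last written over the slow bus */
	int bad_sleeps;        /* slow-bus writes attempted in atomic context */
};

/* Stand-in for a register write that may sleep (it takes a mutex). */
static void model_bus_write(struct model_switch *sw, uint32_t val)
{
	if (sw->in_atomic)
		sw->bad_sleeps++;  /* this is what the BUG splat reports */
	sw->hw_mask_reg = val;
}

/* Called in atomic context: only update the cached flag. */
static void model_mask_irq(struct model_switch *sw)
{
	sw->irq_enable = false;
}

static void model_unmask_irq(struct model_switch *sw)
{
	sw->irq_enable = true;
}

/* Called after the atomic section has ended: do the slow write here. */
static void model_bus_sync_unlock(struct model_switch *sw)
{
	model_bus_write(sw, sw->irq_enable ? 1u : 0u);
}

/* Order of calls when an interrupt is requested, as modelled here. */
static int model_startup(struct model_switch *sw)
{
	sw->in_atomic = true;
	model_unmask_irq(sw);       /* no bus access while atomic */
	sw->in_atomic = false;
	model_bus_sync_unlock(sw);  /* cached state reaches the hardware */
	return sw->bad_sleeps;
}
```

In this model the register is written exactly once per request, after the atomic section, so `bad_sleeps` stays at zero.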
      
      Fixes: ec6698c2 ("net: dsa: add support for Atheros AR9331 built-in switch")
      Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
      Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
      Link: https://lore.kernel.org/r/20201211110317.17061-1-o.rempel@pengutronix.de
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • Merge branch 'i40e-ice-af_xdp-zc-fixes' · ec58c75a
      Jakub Kicinski authored
      Björn Töpel says:
      
      ====================
      i40e/ice AF_XDP ZC fixes
      
      This series addresses two crashes in the AF_XDP zero-copy mode for ice
      and i40e. More details are in each individual commit message.
      ====================
      
      Link: https://lore.kernel.org/r/20201211145712.72957-1-bjorn.topel@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • i40e, xsk: clear the status bits for the next_to_use descriptor · 64050b5b
      Björn Töpel authored
      On the Rx side, the next_to_use index points to the next item in the
      HW ring to be refilled/allocated, and next_to_clean points to the next
      item to potentially be processed.
      
      When the HW Rx ring is fully refilled, i.e. no packets have been
      processed, next_to_use will be next_to_clean - 1. When the ring is
      fully processed, next_to_clean will be equal to next_to_use. The latter
      case is where a bug is triggered.
      
      If the status bits of the next_to_use descriptor are not cleared, and
      the "fully processed" state is entered, a stale descriptor can be
      processed.
      
      The skb path correctly clears the status bits for the next_to_use
      descriptor, but the AF_XDP zero-copy path did not do that.
      
      This change adds clearing of the status bits of the next_to_use
      descriptor.
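The stale-descriptor condition can be reproduced with a small user-space ring model. This is only a sketch under assumptions: the `model_*` names, the 4-entry ring, the single "done" bit and the per-call budget are all hypothetical and do not come from the i40e or ice drivers.

```c
#include <stdbool.h>
#include <stdint.h>

#define MODEL_RING_SIZE 4u
#define MODEL_DD 0x1u  /* hypothetical "descriptor done" status bit */

struct model_ring {
	uint32_t status[MODEL_RING_SIZE];
	unsigned int next_to_use;    /* next slot to refill */
	unsigned int next_to_clean;  /* next slot to process */
	unsigned int hw_next;        /* next slot the modelled HW writes */
};

/* Modelled hardware: write back `count` received packets. */
static void model_hw_write(struct model_ring *r, unsigned int count)
{
	while (count--) {
		r->status[r->hw_next] = MODEL_DD;
		r->hw_next = (r->hw_next + 1) % MODEL_RING_SIZE;
	}
}

/* Refill `count` slots; optionally clear the status at next_to_use. */
static void model_refill(struct model_ring *r, unsigned int count, bool clear_ntu)
{
	while (count--) {
		r->status[r->next_to_use] = 0;  /* refilled slot starts clean */
		r->next_to_use = (r->next_to_use + 1) % MODEL_RING_SIZE;
	}
	if (clear_ntu)
		r->status[r->next_to_use] = 0;  /* the step the fix adds */
}

/* Process slots while they look done; bounded like a poll budget. */
static unsigned int model_clean(struct model_ring *r)
{
	unsigned int processed = 0;

	while (processed < MODEL_RING_SIZE &&
	       (r->status[r->next_to_clean] & MODEL_DD)) {
		processed++;
		r->next_to_clean = (r->next_to_clean + 1) % MODEL_RING_SIZE;
	}
	return processed;
}

/* Receive 3, clean, refill 3, receive 3 again; return the second clean count. */
static unsigned int model_run(bool clear_ntu)
{
	struct model_ring r = { .next_to_use = MODEL_RING_SIZE - 1 };

	model_hw_write(&r, 3);
	model_clean(&r);                 /* ring is now "fully processed" */
	model_refill(&r, 3, clear_ntu);
	model_hw_write(&r, 3);
	return model_clean(&r);
}
```

With `clear_ntu` false, the second clean pass walks past the three fresh packets into the slot at next_to_use, whose old status bit was never cleared, and counts it as a fourth packet. With `clear_ntu` true, the pass stops after three.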
      
      Fixes: 3b4f0b66 ("i40e, xsk: Migrate to new MEM_TYPE_XSK_BUFF_POOL")
      Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • ice, xsk: clear the status bits for the next_to_use descriptor · 8d14768a
      Björn Töpel authored
      On the Rx side, the next_to_use index points to the next item in the
      HW ring to be refilled/allocated, and next_to_clean points to the next
      item to potentially be processed.
      
      When the HW Rx ring is fully refilled, i.e. no packets have been
      processed, next_to_use will be next_to_clean - 1. When the ring is
      fully processed, next_to_clean will be equal to next_to_use. The latter
      case is where a bug is triggered.
      
      If the status bits of the next_to_use descriptor are not cleared, and
      the "fully processed" state is entered, a stale descriptor can be
      processed.
      
      The skb path correctly clears the status bits for the next_to_use
      descriptor, but the AF_XDP zero-copy path did not do that.
      
      This change adds clearing of the status bits of the next_to_use
      descriptor.
      
      Fixes: 2d4238f5 ("ice: Add support for AF_XDP")
      Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
    • lan743x: fix rx_napi_poll/interrupt ping-pong · 57030a0b
      Sven Van Asbroeck authored
      Even if there is more rx data waiting on the chip, the rx napi poll fn
      will never run more than once - it will always read a few buffers, then
      bail out and re-arm interrupts, which results in ping-pong between napi
      and interrupt.
      
      This defeats the purpose of napi, and is bad for performance.
      
      Fix by making the rx napi poll behave identically to other ethernet
      drivers:
      1. initialize rx napi polling with an arbitrary budget (64).
      2. in the polling fn, return full weight if rx queue is not depleted,
         this tells the napi core to "keep polling".
      3. update the rx tail ("ring the doorbell") once for every 8 processed
         rx ring buffers.
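The three steps above can be sketched as a user-space model of a budgeted poll function. The `model_*` names are hypothetical, and the final flush of a partial batch is an assumption of this sketch rather than something the commit message states.

```c
#include <stdbool.h>

/* Hypothetical model of an rx queue; not the lan743x data structures. */
struct model_rx {
	int pending;     /* frames waiting on the chip */
	int doorbells;   /* number of rx tail updates issued */
	bool irq_armed;  /* true once interrupts were re-enabled */
};

/*
 * Process up to `budget` frames. Returning the full budget tells the
 * caller to keep polling; returning less means the queue was drained
 * and interrupts were re-armed.
 */
static int model_rx_poll(struct model_rx *rx, int budget)
{
	int done = 0;

	while (done < budget && rx->pending > 0) {
		rx->pending--;
		done++;
		if (done % 8 == 0)
			rx->doorbells++;  /* one tail update per 8 buffers */
	}

	/* Assumption: flush a leftover partial batch before returning. */
	if (done % 8 != 0)
		rx->doorbells++;

	if (done < budget)
		rx->irq_armed = true;     /* queue depleted: back to interrupts */

	return done;
}
```

With 100 frames pending and a budget of 64, the first call returns 64 and leaves interrupts off; the second call returns 36 and re-arms them, so the whole burst is handled with a single interrupt in this model.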
      
      Thanks to Jakub Kicinski, Eric Dumazet and Andrew Lunn for their expert
      opinions and suggestions.
      
      Tested with 20 seconds of full bandwidth receive (iperf3):
              rx irqs      softirqs(NET_RX)
              -----------------------------
      before  23827        33620
      after   129          4081
      
      Tested-by: Sven Van Asbroeck <thesven73@gmail.com> # lan7430
      Fixes: 23f0703c ("lan743x: Add main source files for new lan743x driver")
      Signed-off-by: Sven Van Asbroeck <thesven73@gmail.com>
      Link: https://lore.kernel.org/r/20201215161954.5950-1-TheSven73@gmail.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
  2. 15 Dec, 2020 35 commits
    • Merge tag 'staging-5.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging · 3db1a3fa
      Linus Torvalds authored
      Pull staging / IIO driver updates from Greg KH:
       "Here is the big staging and IIO driver pull request for 5.11-rc1
      
        Lots of different things in here:
      
         - loads of driver updates
      
         - so many coding style cleanups
      
         - new IIO drivers
      
         - Android ION code is finally removed from the tree
      
         - wimax drivers are moved to staging on their way out of the kernel
      
        Nothing really exciting, just the constant grind of kernel development :)
      
        All have been in linux-next for a while with no reported issues"
      
      * tag 'staging-5.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: (341 commits)
        staging: olpc_dcon: Do not call platform_device_unregister() in dcon_probe()
        staging: most: Fix spelling mistake "tranceiver" -> "transceiver"
        staging: qlge: remove duplicate word in comment
        staging: comedi: mf6x4: Fix AI end-of-conversion detection
        staging: greybus: Add TODO item about modernizing the pwm code
        pinctrl: ralink: add a pinctrl driver for the rt2880 family
        dt-bindings: pinctrl: rt2880: add binding document
        staging: rtl8723bs: remove ELEMENT_ID enum
        staging: rtl8723bs: remove unused macros
        staging: rtl8723bs: replace EID_EXTCapability
        staging: rtl8723bs: replace EID_BSSIntolerantChlReport
        staging: rtl8723bs: replace EID_BSSCoexistence
        staging: rtl8723bs: replace _MME_IE_
        staging: rtl8723bs: replace _WAPI_IE_
        staging: rtl8723bs: replace _EXT_SUPPORTEDRATES_IE_
        staging: rtl8723bs: replace _ERPINFO_IE_
        staging: rtl8723bs: replace _CHLGETXT_IE_
        staging: rtl8723bs: replace _COUNTRY_IE_
        staging: rtl8723bs: replace _IBSS_PARA_IE_
        staging: rtl8723bs: replace _TIM_IE_
        ...
    • Merge tag 'char-misc-5.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc · 2911ed9f
      Linus Torvalds authored
      Pull char / misc driver updates from Greg KH:
       "Here is the big char/misc driver update for 5.11-rc1.
      
        Continuing the tradition of previous -rc1 pulls, there seems to be
        more and more tiny driver subsystems flowing through this tree.
      
        Lots of different things, all of which have been in linux-next for a
        while with no reported issues:
      
         - extcon driver updates
      
         - habanalabs driver updates
      
         - mei driver updates
      
         - uio driver updates
      
         - binder fixes and features added
      
         - soundwire driver updates
      
         - mhi bus driver updates
      
         - phy driver updates
      
         - coresight driver updates
      
         - fpga driver updates
      
         - speakup driver updates
      
         - slimbus driver updates
      
         - various small char and misc driver updates"
      
      * tag 'char-misc-5.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (305 commits)
        extcon: max77693: Fix modalias string
        extcon: fsa9480: Support TI TSU6111 variant
        extcon: fsa9480: Rewrite bindings in YAML and extend
        dt-bindings: extcon: add binding for TUSB320
        extcon: Add driver for TI TUSB320
        slimbus: qcom: fix potential NULL dereference in qcom_slim_prg_slew()
        siox: Make remove callback return void
        siox: Use bus_type functions for probe, remove and shutdown
        spmi: Add driver shutdown support
        spmi: fix some coding style issues at the spmi core
        spmi: get rid of a warning when built with W=1
        uio: uio_hv_generic: use devm_kzalloc() for private data alloc
        uio: uio_fsl_elbc_gpcm: use device-managed allocators
        uio: uio_aec: use devm_kzalloc() for uio_info object
        uio: uio_cif: use devm_kzalloc() for uio_info object
        uio: uio_netx: use devm_kzalloc() for or uio_info object
        uio: uio_mf624: use devm_kzalloc() for uio_info object
        uio: uio_sercos3: use device-managed functions for simple allocs
        uio: uio_dmem_genirq: finalize conversion of probe to devm_ handlers
        uio: uio_dmem_genirq: convert simple allocations to device-managed
        ...
    • Merge tag 'driver-core-5.11-rc1' of... · 7240153a
      Linus Torvalds authored
      Merge tag 'driver-core-5.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
      
      Pull driver core updates from Greg KH:
       "Here are the big driver core updates for 5.11-rc1
      
        This time there was a lot of different work happening here for some
        reason:
      
         - redo of the fwnode link logic, speeding it up greatly
      
         - auxiliary bus added (this was a tag that will be pulled in from
           other trees/maintainers this merge window as well, as driver
           subsystems started to rely on it)
      
         - platform driver core cleanups on the way to fixing some long-time
           api updates in future releases
      
         - minor fixes and tweaks.
      
        All have been in linux-next with no (finally) reported issues. Testing
        there did help in shaking issues out a lot :)"
      
      * tag 'driver-core-5.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (39 commits)
        driver core: platform: don't oops in platform_shutdown() on unbound devices
        ACPI: Use fwnode_init() to set up fwnode
        misc: pvpanic: Replace OF headers by mod_devicetable.h
        misc: pvpanic: Combine ACPI and platform drivers
        usb: host: sl811: Switch to use platform_get_mem_or_io()
        vfio: platform: Switch to use platform_get_mem_or_io()
        driver core: platform: Introduce platform_get_mem_or_io()
        dyndbg: fix use before null check
        soc: fix comment for freeing soc_dev_attr
        driver core: platform: use bus_type functions
        driver core: platform: change logic implementing platform_driver_probe
        driver core: platform: reorder functions
        driver core: make driver_probe_device() static
        driver core: Fix a couple of typos
        driver core: Reorder devices on successful probe
        driver core: Delete pointless parameter in fwnode_operations.add_links
        driver core: Refactor fw_devlink feature
        efi: Update implementation of add_links() to create fwnode links
        of: property: Update implementation of add_links() to create fwnode links
        driver core: Use device's fwnode to check if it is waiting for suppliers
        ...
    • Merge tag 'tty-5.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty · 157f8098
      Linus Torvalds authored
      Pull tty / serial updates from Greg KH:
       "Here is the "large" set of tty and serial patches for 5.11-rc1.
      
        Nothing major at all, some cleanups and some driver removals, always a
        nice sign:
      
         - build warning cleanups
      
         - vt locking and logic unwinding and cleanups
      
         - tiny serial driver fixes and updates
      
         - removal of the synclink serial driver as it's no longer needed
      
         - removal of dead termiox code
      
        All of this has been in linux-next for a while with no reported issues"
      
      * tag 'tty-5.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (89 commits)
        serial: 8250_pci: Drop bogus __refdata annotation
        tty: serial: meson: enable console as module
        serial: 8250_omap: Avoid FIFO corruption caused by MDR1 access
        serial: imx: Move imx_uart_probe_dt() content into probe()
        serial: imx: Remove unneeded of_device_get_match_data() NULL check
        tty: Fix whitespace inconsistencies in vt_io_ioctl
        serial_core: Check for port state when tty is in error state
        dt-bindings: serial: Update DT binding docs to support SiFive FU740 SoC
        tty: use const parameters in port-flag accessors
        tty: use assign_bit() in port-flag accessors
        earlycon: drop semicolon from earlycon macro
        tty: Remove dead termiox code
        tty/serial/imx: Enable TXEN bit in imx_poll_init().
        tty : serial: jsm: Fixed file by adding spacing
        tty: serial: uartlite: Support probe deferral
        earlycon: simplify earlycon-table implementation
        tty: serial: bcm63xx: lower driver dependencies
        serial: mxs-auart: Remove unneeded platform_device_id
        serial: 8250-mtk: Fix reference leak in mtk8250_probe
        serial: imx: Remove unused .id_table support
        ...
    • Merge tag 'usb-5.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb · 0cee54c8
      Linus Torvalds authored
      Pull USB / Thunderbolt updates from Greg KH:
       "Here is the big USB and thunderbolt pull request for 5.11-rc1.
      
        Nothing major in here, just the grind of constant development to
        support new hardware and fix old issues:
      
         - thunderbolt updates for new USB4 hardware
      
         - cdns3 major driver updates
      
         - lots of typec updates and additions as more hardware is available
      
         - usb serial driver updates and fixes
      
         - other tiny USB driver updates
      
        All have been in linux-next with no reported issues"
      
      * tag 'usb-5.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (172 commits)
        usb: phy: convert comma to semicolon
        usb: ucsi: convert comma to semicolon
        usb: typec: tcpm: convert comma to semicolon
        usb: typec: tcpm: Update vbus_vsafe0v on init
        usb: typec: tcpci: Enable bleed discharge when auto discharge is enabled
        usb: typec: Add class for plug alt mode device
        USB: typec: tcpci: Add Bleed discharge to POWER_CONTROL definition
        USB: typec: tcpm: Add a 30ms room for tPSSourceOn in PR_SWAP
        USB: typec: tcpm: Fix PR_SWAP error handling
        USB: typec: tcpm: Hard Reset after not receiving a Request
        USB: gadget: f_fs: remove likely/unlikely
        usb: gadget: f_fs: Re-use SS descriptors for SuperSpeedPlus
        USB: gadget: f_midi: setup SuperSpeed Plus descriptors
        USB: gadget: f_acm: add support for SuperSpeed Plus
        USB: gadget: f_rndis: fix bitrate for SuperSpeed and above
        usb: typec: intel_pmc_mux: Configure cable generation value for USB4
        MAINTAINERS: Add myself as a reviewer for CADENCE USB3 DRD IP DRIVER
        usb: chipidea: ci_hdrc_imx: Use of_device_get_match_data()
        usb: chipidea: usbmisc_imx: Use of_device_get_match_data()
        usb: cdns3: fix NULL pointer dereference on no platform data
        ...
    • Merge tag 'sound-5.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound · c367caf1
      Linus Torvalds authored
      Pull sound updates from Takashi Iwai:
       "Lots of changes (a slightly larger code increase than usual) this time,
        though most of the code changes are ASoC driver-specific.
      
        Here are some highlights:
      
        Core:
      
         - The new auxiliary bus implementation for Intel DSP, which will be
           used by other drivers as well
      
         - Lots of ASoC core cleanups and refactoring
      
         - UBSAN and KCSAN fixes in rawmidi, sequencer and a few others
      
         - Compress-offload API enhancement for the pause during draining
      
        HD- and USB-audio:
      
         - Enhancements of the USB-audio implicit feedback support, including
           better full-duplex operations
      
         - Continued CA0132 improvements and fixes
      
         - A few new quirk entries, HDMI audio fixes
      
        ASoC:
      
         - Support for boot time selection of Intel DSP firmware, which should
           help distros/users testing new stuff more easily; the kconfig was
           moved to boot time option, too
      
         - Some basic DPCM support in audio graph card
      
         - Removal of old pre-DT Freescale drivers
      
         - Support for Allwinner H6 I2S, Analog Devices ADAU1372, Intel
           Alderlake-S, Mediatek MT8192, NXP i.MX HDMI and XCVR, Realtek
           RT715, Qualcomm SM8250 and simple GPIO based muxes"
      
      * tag 'sound-5.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (445 commits)
        ALSA: pcm: oss: Fix potential out-of-bounds shift
        ALSA: usb-audio: Fix potential out-of-bounds shift
        ALSA: hda/ca0132 - Add ZxR surround DAC setup.
        ALSA: hda/ca0132 - Add 8051 PLL write helper functions.
        ALSA: hda/hdmi: packet buffer index must be set before reading value
        ASoC: SOF: imx: update kernel-doc description
        ASoC: mediatek: mt8183: delete some unreachable code
        ASoC: mediatek: mt8183: add PM ops to machine drivers
        ASoC: topology: Fix wrong size check
        ASoC: topology: Add missing size check
        ASoC: SOF: Intel: hda: fix the condition passed to sof_dev_dbg_or_err
        ASoC: SOF: modify the SOF_DBG flags
        ASoC: SOF: Intel: hda: remove duplicated status dump
        ASoC: rt1015p: delay 300ms after SDB pulling high for calibration
        ASoC: rt1015p: move SDB control from trigger to DAPM
        ASoC: wm_adsp: remove "ctl" from list on error in wm_adsp_create_control()
        ALSA: usb-audio: Fix control 'access overflow' errors from chmap
        ALSA: hda/hdmi: always print pin NIDs as hexadecimal
        ALSA: hda/realtek - Add supported for more Lenovo ALC285 Headset Button
        ALSA: hda/ca0132 - Remove now unnecessary DSP setup functions.
        ...
    • Merge tag 'net-next-5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next · d635a69d
      Linus Torvalds authored
      Pull networking updates from Jakub Kicinski:
       "Core:
      
         - support "prefer busy polling" NAPI operation mode, where we defer
           softirq for some time expecting applications to periodically busy
           poll
      
         - AF_XDP: improve efficiency by more batching and hindering the
           adjacency cache prefetcher
      
         - af_packet: make packet_fanout.arr size configurable up to 64K
      
         - tcp: optimize TCP zero copy receive in presence of partial or
           unaligned reads making zero copy a performance win for much smaller
           messages
      
         - XDP: add bulk APIs for returning / freeing frames
      
         - sched: support fragmenting IP packets as they come out of conntrack
      
         - net: allow virtual netdevs to forward UDP L4 and fraglist GSO skbs
      
        BPF:
      
         - BPF switch from crude rlimit-based to memcg-based memory accounting
      
         - BPF type format information for kernel modules and related tracing
           enhancements
      
         - BPF implement task local storage for BPF LSM
      
         - allow the FENTRY/FEXIT/RAW_TP tracing programs to use
           bpf_sk_storage
      
        Protocols:
      
         - mptcp: improve multiple xmit streams support, memory accounting and
           many smaller improvements
      
         - TLS: support CHACHA20-POLY1305 cipher
      
         - seg6: add support for SRv6 End.DT4/DT6 behavior
      
         - sctp: Implement RFC 6951: UDP Encapsulation of SCTP
      
         - ppp_generic: add ability to bridge channels directly
      
         - bridge: Connectivity Fault Management (CFM) support as is defined
           in IEEE 802.1Q section 12.14.
      
        Drivers:
      
         - mlx5: make use of the new auxiliary bus to organize the driver
           internals
      
         - mlx5: more accurate port TX timestamping support
      
         - mlxsw:
            - improve the efficiency of offloaded next hop updates by using
              the new nexthop object API
            - support blackhole nexthops
            - support IEEE 802.1ad (Q-in-Q) bridging
      
         - rtw88: major bluetooth co-existence improvements
      
         - iwlwifi: support new 6 GHz frequency band
      
         - ath11k: Fast Initial Link Setup (FILS)
      
         - mt7915: dual band concurrent (DBDC) support
      
         - net: ipa: add basic support for IPA v4.5
      
        Refactor:
      
         - a few pieces of in_interrupt() cleanup work from Sebastian Andrzej
           Siewior
      
         - phy: add support for shared interrupts; get rid of multiple driver
           APIs and have the drivers write a full IRQ handler, slight growth
           of driver code should be compensated by the simpler API which also
           allows shared IRQs
      
         - add common code for handling netdev per-cpu counters
      
         - move TX packet re-allocation from Ethernet switch tag drivers to a
           central place
      
         - improve efficiency and rename nla_strlcpy
      
         - number of W=1 warning cleanups as we now catch those in a patchwork
           build bot
      
        Old code removal:
      
         - wan: delete the DLCI / SDLA drivers
      
         - wimax: move to staging
      
         - wifi: remove old WDS wifi bridging support"
      
      * tag 'net-next-5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1922 commits)
        net: hns3: fix expression that is currently always true
        net: fix proc_fs init handling in af_packet and tls
        nfc: pn533: convert comma to semicolon
        af_vsock: Assign the vsock transport considering the vsock address flags
        af_vsock: Set VMADDR_FLAG_TO_HOST flag on the receive path
        vsock_addr: Check for supported flag values
        vm_sockets: Add VMADDR_FLAG_TO_HOST vsock flag
        vm_sockets: Add flags field in the vsock address data structure
        net: Disable NETIF_F_HW_TLS_TX when HW_CSUM is disabled
        tcp: Add logic to check for SYN w/ data in tcp_simple_retransmit
        net: mscc: ocelot: install MAC addresses in .ndo_set_rx_mode from process context
        nfc: s3fwrn5: Release the nfc firmware
        net: vxget: clean up sparse warnings
        mlxsw: spectrum_router: Use eXtended mezzanine to offload IPv4 router
        mlxsw: spectrum: Set KVH XLT cache mode for Spectrum2/3
        mlxsw: spectrum_router_xm: Introduce basic XM cache flushing
        mlxsw: reg: Add Router LPM Cache Enable Register
        mlxsw: reg: Add Router LPM Cache ML Delete Register
        mlxsw: spectrum_router_xm: Implement L-value tracking for M-index
        mlxsw: reg: Add XM Router M Table Register
        ...
    • Merge branch 'akpm' (patches from Andrew) · ac73e3dc
      Linus Torvalds authored
      Merge misc updates from Andrew Morton:
      
       - a few random little subsystems
      
       - almost all of the MM patches which are staged ahead of linux-next
         material. I'll trickle the post-linux-next work in as the dependents
         get merged up.
      
      Subsystems affected by this patch series: kthread, kbuild, ide, ntfs,
      ocfs2, arch, and mm (slab-generic, slab, slub, dax, debug, pagecache,
      gup, swap, shmem, memcg, pagemap, mremap, hmm, vmalloc, documentation,
      kasan, pagealloc, memory-failure, hugetlb, vmscan, z3fold, compaction,
      oom-kill, migration, cma, page-poison, userfaultfd, zswap, zsmalloc,
      uaccess, zram, and cleanups).
      
      * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (200 commits)
        mm: cleanup kstrto*() usage
        mm: fix fall-through warnings for Clang
        mm: slub: convert sysfs sprintf family to sysfs_emit/sysfs_emit_at
        mm: shmem: convert shmem_enabled_show to use sysfs_emit_at
        mm:backing-dev: use sysfs_emit in macro defining functions
        mm: huge_memory: convert remaining use of sprintf to sysfs_emit and neatening
        mm: use sysfs_emit for struct kobject * uses
        mm: fix kernel-doc markups
        zram: break the strict dependency from lzo
        zram: add stat to gather incompressible pages since zram set up
        zram: support page writeback
        mm/process_vm_access: remove redundant initialization of iov_r
        mm/zsmalloc.c: rework the list_add code in insert_zspage()
        mm/zswap: move to use crypto_acomp API for hardware acceleration
        mm/zswap: fix passing zero to 'PTR_ERR' warning
        mm/zswap: make struct kernel_param_ops definitions const
        userfaultfd/selftests: hint the test runner on required privilege
        userfaultfd/selftests: fix retval check for userfaultfd_open()
        userfaultfd/selftests: always dump something in modes
        userfaultfd: selftests: make __{s,u}64 format specifiers portable
        ...
    • mm: cleanup kstrto*() usage · dfefd226
      Alexey Dobriyan authored
      Range checks can be folded into the proper conversion function.  kstrto*()
      exist for all arithmetic types.
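As an illustration of folding the range check into the conversion, here is a user-space sketch of a helper sized to its target type. `model_kstrtou8()` is a hypothetical stand-in built on `strtoul()`; it only approximates the behaviour of the kernel's typed kstrto*() helpers and is not their implementation.

```c
#include <ctype.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical user-space stand-in for a typed conversion helper.
 * Returns 0 on success, -EINVAL on malformed input and -ERANGE when
 * the value does not fit in a uint8_t, so callers need no extra
 * "if (val > 255)" check of their own.
 */
static int model_kstrtou8(const char *s, unsigned int base, uint8_t *res)
{
	char *end;
	unsigned long val;

	/* strtoul() would accept these; this model rejects them. */
	if (s[0] == '-' || isspace((unsigned char)s[0]))
		return -EINVAL;

	errno = 0;
	val = strtoul(s, &end, (int)base);
	if (end == s)
		return -EINVAL;          /* no digits at all */
	if (*end == '\n')
		end++;                   /* tolerate one trailing newline */
	if (*end != '\0')
		return -EINVAL;          /* trailing garbage */
	if (errno == ERANGE || val > UINT8_MAX)
		return -ERANGE;          /* the folded-in range check */

	*res = (uint8_t)val;
	return 0;
}
```

A caller that previously parsed into an `unsigned long` and then compared against 255 by hand can call the typed helper and simply propagate its return value.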
      
      Link: https://lkml.kernel.org/r/20201122123759.GC92364@localhost.localdomain
      Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: fix fall-through warnings for Clang · 01359eb2
      Gustavo A. R. Silva authored
      In preparation to enable -Wimplicit-fallthrough for Clang, fix a couple of
      warnings by explicitly adding a break statement instead of just letting
      the code fall through to the next case, and by adding a fallthrough
      pseudo-keyword in places where the code is intended to fall through.
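A short sketch of both kinds of fix follows. The `fallthrough` macro here is defined locally so the example builds outside the kernel tree, and `model_classify()` with its flag values is hypothetical.

```c
/* Local definition so the sketch is self-contained outside the kernel. */
#if defined(__has_attribute)
# if __has_attribute(__fallthrough__)
#  define fallthrough __attribute__((__fallthrough__))
# endif
#endif
#ifndef fallthrough
# define fallthrough do {} while (0)  /* fallthrough */
#endif

/* Hypothetical example function; the flag values carry no meaning. */
static int model_classify(int level)
{
	int flags = 0;

	switch (level) {
	case 2:
		flags |= 0x4;
		fallthrough;  /* intended: level 2 also gets the level 1 flag */
	case 1:
		flags |= 0x2;
		break;        /* explicit break instead of sliding into default */
	default:
		flags |= 0x1;
		break;
	}
	return flags;
}
```

With the annotation and the explicit break in place, a compiler run with -Wimplicit-fallthrough has nothing left to warn about in this function.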
      
      Link: https://github.com/KSPP/linux/issues/115
      Link: https://lkml.kernel.org/r/f5756988b8842a3f10008fbc5b0a654f828920a9.1605896059.git.gustavoars@kernel.org
      Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: slub: convert sysfs sprintf family to sysfs_emit/sysfs_emit_at · bf16d19a
      Joe Perches authored
      Convert the unbounded uses of sprintf to sysfs_emit.
      
      A few conversions may now not end in a newline if the output buffer is
      overflowed.
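The newline caveat can be seen in a user-space model of a bounded emit helper. `model_emit_at()` and the deliberately tiny `MODEL_PAGE_SIZE` are hypothetical; the sketch only mirrors the idea of bounding output to one page and is not the kernel's sysfs_emit_at().

```c
#include <stdarg.h>
#include <stdio.h>

#define MODEL_PAGE_SIZE 16  /* deliberately tiny so truncation is visible */

/*
 * Hypothetical bounded formatter: writes at offset `at` into a buffer
 * of MODEL_PAGE_SIZE bytes and returns the number of characters
 * actually stored, never more than what fits.
 */
static int model_emit_at(char *buf, int at, const char *fmt, ...)
{
	va_list args;
	int room, len;

	if (at < 0 || at >= MODEL_PAGE_SIZE)
		return 0;
	room = MODEL_PAGE_SIZE - at;

	va_start(args, fmt);
	len = vsnprintf(buf + at, (size_t)room, fmt, args);
	va_end(args);

	if (len < 0)
		return 0;
	if (len >= room)
		len = room - 1;  /* output was truncated to fit */
	return len;
}
```

When the formatted text fits, the trailing newline is stored as usual. When it does not, the output is cut at the buffer limit and the newline is lost, which is the behaviour change the commit message points out.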
      
      Link: https://lkml.kernel.org/r/0c90a90f466167f8c37de4b737553cf49c4a277f.1605376435.git.joe@perches.com
      Signed-off-by: Joe Perches <joe@perches.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Mike Kravetz <mike.kravetz@oracle.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • mm: shmem: convert shmem_enabled_show to use sysfs_emit_at · 79d4d38a
      Joe Perches authored
      Update the function to use sysfs_emit_at while neatening the uses of
      sprintf and overwriting the last space char with a newline to avoid
      possible output buffer overflow.
      
      Miscellanea:
      
       - in shmem_enabled_show, the removal of the indirected use of fmt
         allows __printf verification
      
      Link: https://lkml.kernel.org/r/b612a93825e5ea330cb68d2e8b516e9687a06cc6.1605376435.git.joe@perches.com
      Signed-off-by: Joe Perches <joe@perches.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Mike Kravetz <mike.kravetz@oracle.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      79d4d38a
    • Joe Perches's avatar
      mm:backing-dev: use sysfs_emit in macro defining functions · 5e4c0d86
      Joe Perches authored
      The cocci script used in commit bdacbb8d04f ("mm: Use sysfs_emit for
      struct kobject * uses") does not convert the name##_show macro because the
      macro uses concatenation via ##.
      
      Convert it by hand.
      
      Link: https://lkml.kernel.org/r/45ec6cfc177d743f9c0ebaf35e43969dce43af42.1605376435.git.joe@perches.com
      Signed-off-by: Joe Perches <joe@perches.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Mike Kravetz <mike.kravetz@oracle.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      5e4c0d86
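
      Why a pattern-matching script can miss the functions mentioned in
      5e4c0d86 becomes clearer with a small token-pasting example.  This is a
      sketch under assumptions: DEFINE_SHOW, min_ratio and max_ratio are
      hypothetical names, not the macro from mm/backing-dev.c.

```c
#include <stdio.h>

/* Hypothetical generator macro: each use defines a function whose name is
 * assembled with ##, so no function literally named "..._show" appears in
 * the source text before preprocessing. */
#define DEFINE_SHOW(name, value)					\
int name##_show(char *buf, size_t size)					\
{									\
	return snprintf(buf, size, "%d\n", (value));			\
}

DEFINE_SHOW(min_ratio, 5)	/* expands to min_ratio_show() */
DEFINE_SHOW(max_ratio, 100)	/* expands to max_ratio_show() */
```

      The names min_ratio_show() and max_ratio_show() only exist after the
      preprocessor has run, which is presumably why the body of such a macro
      has to be converted by hand.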
    • Joe Perches's avatar
      mm: huge_memory: convert remaining use of sprintf to sysfs_emit and neatening · bfb0ffeb
      Joe Perches authored
      Convert the only use of sprintf with struct kobject * that the cocci
      script could not convert.
      
      Miscellanea:
      
       - Neaten the uses of a constant string with sysfs_emit to use a const
         char * to reduce overall object size
      
      Link: https://lkml.kernel.org/r/7df6be66bbd68e1a0bca9d35aca1341dbf94d2a7.1605376435.git.joe@perches.com
      Signed-off-by: Joe Perches <joe@perches.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Mike Kravetz <mike.kravetz@oracle.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      bfb0ffeb
    • Joe Perches's avatar
      mm: use sysfs_emit for struct kobject * uses · ae7a927d
      Joe Perches authored
      Patch series "mm: Convert sysfs sprintf family to sysfs_emit", v2.
      
      Use the new sysfs_emit family and not the sprintf family.
      
      This patch (of 5):
      
      Use the sysfs_emit function instead of the sprintf family.
      
      Done with cocci script as in commit 3c6bff3c ("RDMA: Convert sysfs
      kobject * show functions to use sysfs_emit()")
      
      Link: https://lkml.kernel.org/r/cover.1605376435.git.joe@perches.com
      Link: https://lkml.kernel.org/r/9c249215bad6df616ba0410ad980042694970c1b.1605376435.git.joe@perches.com
      Signed-off-by: Joe Perches <joe@perches.com>
      Cc: Mike Kravetz <mike.kravetz@oracle.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      ae7a927d
    • Mauro Carvalho Chehab's avatar
      mm: fix kernel-doc markups · a00cda3f
      Mauro Carvalho Chehab authored
      Kernel-doc markups should use this format:
              identifier - description
      
      Fix some issues on mm files:
      
      1) The definition for get_user_pages_locked() doesn't follow it.  Also,
         it expects a short description at the header, followed by a long one,
         after the parameters.  Fix it.
      
      2) Kernel-doc requires a kernel-doc markup to be immediately followed by
         the function prototype, as otherwise it will rename it.  So, move
         the get_pfnblock_flags_mask() description to the right place.
      
      3) Make invalidate_mapping_pagevec() also follow the expected
         kernel-doc format.
      
      While here, fix a few minor English syntax issues, as suggested
      by Matthew:
      	will used -> will be used
      	similar with -> similar to
      
      Link: https://lkml.kernel.org/r/80e85dddc92d333bc2159ee8a2294921612e8745.1605521731.git.mchehab+huawei@kernel.org
      Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
      Suggested-by: Matthew Wilcox <willy@infradead.org>	[English fixes]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      a00cda3f
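
      The layout that a00cda3f asks for (an "identifier - description" header,
      the parameters, then the long description) looks roughly like the comment
      below.  clamp_to_range() is a hypothetical function used only to carry
      the comment; it does not come from the mm files touched by the patch.

```c
/**
 * clamp_to_range - limit a value to an inclusive range
 * @val: value to clamp
 * @lo: lowest value that may be returned
 * @hi: highest value that may be returned
 *
 * The longer description goes here, after the parameters.  The comment
 * sits directly above the function it documents, with nothing in between.
 *
 * Return: @val if it lies within [@lo, @hi], otherwise the nearest bound.
 */
int clamp_to_range(int val, int lo, int hi)
{
	if (val < lo)
		return lo;
	if (val > hi)
		return hi;
	return val;
}
```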
    • Rui Salvaterra's avatar
      zram: break the strict dependency from lzo · 3d711a38
      Rui Salvaterra authored
      From the beginning, the zram block device always enabled CRYPTO_LZO,
      since lzo-rle is hardcoded as the fallback compression algorithm.  As a
      consequence, on systems where another compression algorithm is chosen
      (e.g.  CRYPTO_ZSTD), the lzo kernel module becomes unused, while still
      having to be built/loaded.
      
      This patch removes the hardcoded lzo-rle dependency and allows the user
      to select the default compression algorithm for zram at build time.  The
      previous behaviour is kept, as the default algorithm is still lzo-rle.
      
      Link: https://lkml.kernel.org/r/20201207121245.50529-1-rsalvaterra@gmail.com
      Signed-off-by: Rui Salvaterra <rsalvaterra@gmail.com>
      Suggested-by: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
      Suggested-by: Minchan Kim <minchan@kernel.org>
      Acked-by: Minchan Kim <minchan@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      3d711a38
    • Minchan Kim's avatar
      zram: add stat to gather incompressible pages since zram set up · 194e28da
      Minchan Kim authored
      Currently, zram supports the stat via /sys/block/zram/mm_stat to represent
      how many incompressible pages are stored at the moment, but it can't
      show how many times incompressible pages have been written since zram was
      set up.  It's also a good indication of how effective zram is in the system.
      
      Link: https://lkml.kernel.org/r/20201130201907.1284910-1-minchan@kernel.org
      Signed-off-by: Minchan Kim <minchan@kernel.org>
      Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      194e28da
    • Minchan Kim's avatar
      zram: support page writeback · 0d835962
      Minchan Kim authored
      There is demand to write back specific process pages to the backing store
      instead of all idle pages in the system, due to storage wear-out concerns
      and the launch latency of apps that are idle most of the time but are
      critical for resume latency.
      
      This patch extends the writeback knob to support a specific page
      writeback.
      
      Link: https://lkml.kernel.org/r/20201020190506.3758660-1-minchan@kernel.org
      Signed-off-by: Minchan Kim <minchan@kernel.org>
      Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      0d835962
    • Colin Ian King's avatar
      mm/process_vm_access: remove redundant initialization of iov_r · 95c9ae14
      Colin Ian King authored
      The pointer iov_r is being initialized with a value that is never read and
      it is being updated later with a new value.  The initialization is
      redundant and can be removed.
      
      Link: https://lkml.kernel.org/r/20201102120614.694917-1-colin.king@canonical.com
      Signed-off-by: Colin Ian King <colin.king@canonical.com>
      Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      95c9ae14
    • Barry Song's avatar
      mm/zswap: move to use crypto_acomp API for hardware acceleration · 1ec3b5fe
      Barry Song authored
      Right now, all new ZIP drivers are adapted to crypto_acomp APIs rather
      than legacy crypto_comp APIs.  Traditional ZIP drivers like lz4, lzo, etc.
      have also been wrapped into acomp via the scomp backend.  But zswap.c is still
      using the old APIs.  That means zswap won't be able to work on any new ZIP
      drivers in kernel.
      
      This patch moves to use crypto_acomp APIs to fix the disconnected bridge
      between new ZIP drivers and zswap.  It is probably the first real user to
      use acomp but perhaps not a good example to demonstrate how multiple acomp
      requests can be executed in parallel in one acomp instance.  frontswap is
      doing page load and store page by page synchronously.  swap_writepage()
      depends on the completion of frontswap_store() to decide if it should call
      __swap_writepage() to swap to disk.
      
      However this patch creates multiple acomp instances, so multiple threads
      running on multiple different cpus can actually do (de)compression
      in parallel, leveraging the power of multiple ZIP hardware queues.  This is
      also consistent with frontswap's page management model.
      
      The old zswap code uses atomic context and avoids the race conditions
      while shared resources like zswap_dstmem are accessed.  Here since acomp
      can sleep, per-cpu mutex is used to replace preemption-disable.
      
      While it is possible to make mm/page_io.c and mm/frontswap.c support async
      (de)compression in some way, the entire design requires careful thinking
      and performance evaluation.  For the first step, the base with fixed
      connection between ZIP drivers and zswap should be built.
      
      Link: https://lkml.kernel.org/r/20201107065332.26992-1-song.bao.hua@hisilicon.com
      Signed-off-by: Barry Song <song.bao.hua@hisilicon.com>
      Acked-by: Vitaly Wool <vitalywool@gmail.com>
      Cc: Luis Claudio R. Goncalves <lgoncalv@redhat.com>
      Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Mahipal Challa <mahipalreddy2006@gmail.com>
      Cc: Seth Jennings <sjenning@redhat.com>
      Cc: Dan Streetman <ddstreet@ieee.org>
      Cc: Zhou Wang <wangzhou1@hisilicon.com>
      Cc: Colin Ian King <colin.king@canonical.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      1ec3b5fe
    • YueHaibing's avatar
      mm/zswap: fix passing zero to 'PTR_ERR' warning · 42a44704
      YueHaibing authored
      Fix smatch warning:
      
        mm/zswap.c:425 zswap_cpu_comp_prepare() warn: passing zero to 'PTR_ERR'
      
      crypto_alloc_comp() never returns NULL, so use IS_ERR instead of
      IS_ERR_OR_NULL to fix this.
      
      Link: https://lkml.kernel.org/r/20201031055615.28080-1-yuehaibing@huawei.com
      Fixes: f1c54846 ("zswap: dynamic pool creation")
      Signed-off-by: YueHaibing <yuehaibing@huawei.com>
      Reviewed-by: David Hildenbrand <david@redhat.com>
      Cc: Seth Jennings <sjenning@redhat.com>
      Cc: Dan Streetman <ddstreet@ieee.org>
      Cc: Vitaly Wool <vitaly.wool@konsulko.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      42a44704
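
      The smatch warning fixed in 42a44704 can be reproduced in miniature.  The
      helpers below are userspace re-creations of the kernel's error-pointer
      idiom, written from memory as a sketch; alloc_tfm() and prepare() are
      hypothetical stand-ins, not the zswap code.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Userspace re-creations of the error-pointer helpers (assumption: the top
 * MAX_ERRNO addresses are never valid pointers, as in the kernel). */
static inline void *ERR_PTR(long error) { return (void *)error; }
static inline long PTR_ERR(const void *ptr) { return (long)ptr; }
static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)(-MAX_ERRNO);
}
static inline int IS_ERR_OR_NULL(const void *ptr)
{
	return !ptr || IS_ERR(ptr);
}

static int dummy_tfm;

/* Hypothetical allocator that, like the one named in the commit message,
 * reports failure with an error pointer and never returns NULL. */
void *alloc_tfm(int fail)
{
	return fail ? ERR_PTR(-ENOMEM) : (void *)&dummy_tfm;
}

/* Fixed form: IS_ERR() matches the allocator's contract, so PTR_ERR() is
 * only ever applied to a real error pointer. */
long prepare(int fail)
{
	void *tfm = alloc_tfm(fail);

	if (IS_ERR(tfm))
		return PTR_ERR(tfm);
	return 0;
}
```

      The hazard with IS_ERR_OR_NULL() is that PTR_ERR(NULL) evaluates to 0, so
      a NULL pointer taken down the error path would be reported as success.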
    • Joe Perches's avatar
      mm/zswap: make struct kernel_param_ops definitions const · 83aed6cd
      Joe Perches authored
      These should be const, so make it so.
      
      Link: https://lkml.kernel.org/r/1791535ee0b00f4a5c68cc4a8adada06593ad8f1.1601770305.git.joe@perches.com
      Signed-off-by: Joe Perches <joe@perches.com>
      Cc: Seth Jennings <sjenning@redhat.com>
      Cc: Dan Streetman <ddstreet@ieee.org>
      Cc: Vitaly Wool <vitaly.wool@konsulko.com>
      Cc: "Maciej S. Szmigiero" <mail@maciej.szmigiero.name>
      Cc: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      83aed6cd
    • Peter Xu's avatar
      userfaultfd/selftests: hint the test runner on required privilege · d9f411ba
      Peter Xu authored
      The userfaultfd test program now requires either root or ptrace privilege
      due to the signal/event tests.  When UFFDIO_API fails, hint the test
      runner about this fact verbosely.
      
      Link: https://lkml.kernel.org/r/20201208024709.7701-4-peterx@redhat.com
      Signed-off-by: Peter Xu <peterx@redhat.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      d9f411ba
    • Peter Xu's avatar
      userfaultfd/selftests: fix retval check for userfaultfd_open() · 1e17a24e
      Peter Xu authored
      userfaultfd_open() returns 1 for errors rather than negatives.  Fix it on
      all the callers so that when UFFDIO_API fails the test will bail out.
      
      Link: https://lkml.kernel.org/r/20201208024709.7701-3-peterx@redhat.com
      Signed-off-by: Peter Xu <peterx@redhat.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      1e17a24e
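
      The class of bug fixed in 1e17a24e is a mismatch between a helper's
      return convention and its callers' checks.  A minimal sketch, assuming a
      hypothetical open_resource() that follows the "1 on error" convention
      described above (it is not the selftest's userfaultfd_open()):

```c
/* Hypothetical helper: returns 0 on success and 1 on failure. */
int open_resource(int should_fail)
{
	return should_fail ? 1 : 0;
}

/* Buggy caller: a "< 0" test can never fire for a helper that signals
 * errors with 1, so the failure is silently ignored. */
int run_buggy(int should_fail)
{
	if (open_resource(should_fail) < 0)
		return 1;	/* unreachable for this helper */
	return 0;
}

/* Fixed caller: any non-zero return value is treated as an error. */
int run_fixed(int should_fail)
{
	if (open_resource(should_fail))
		return 1;	/* bail out */
	return 0;
}
```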
    • Peter Xu's avatar
      userfaultfd/selftests: always dump something in modes · 164c50be
      Peter Xu authored
      Patch series "userfaultfd: selftests: Small fixes".
      
      Some very trivial fixes that I kept locally to userfaultfd selftest
      program.
      
      This patch (of 3):
      
      BOUNCE_POLL is a special bit: when it is cleared, it means "READ" instead.
      Dump that too; otherwise we'll see tests with empty modes.
      
      Link: https://lkml.kernel.org/r/20201208024709.7701-1-peterx@redhat.com
      Link: https://lkml.kernel.org/r/20201208024709.7701-2-peterx@redhat.com
      Signed-off-by: Peter Xu <peterx@redhat.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      164c50be
    • Axel Rasmussen's avatar
      userfaultfd: selftests: make __{s,u}64 format specifiers portable · 77f962e7
      Axel Rasmussen authored
      On certain platforms (powerpcle is the one on which I ran into this),
      "%Ld" and "%Lu" are unsuitable for printing __s64 and __u64, respectively,
      resulting in build warnings.  Cast to {u,}int64_t, and use the PRI{d,u}64
      macros defined in inttypes.h to print them.  This ought to be portable to
      all platforms.
      
      Splitting this off into a separate macro lets us remove some lines, and
      get rid of some (I would argue) stylistically odd cases where we joined
      printf() and exit() into a single statement with a comma operator.
      
      Finally, this also fixes a "missing braces around initializer" warning
      when we initialize prms in wp_range().
      
      [axelrasmussen@google.com: v2]
        Link: https://lkml.kernel.org/r/20201203180244.1811601-1-axelrasmussen@google.com
      
      Link: https://lkml.kernel.org/r/20201202211542.1121189-1-axelrasmussen@google.com
      Signed-off-by: Axel Rasmussen <axelrasmussen@google.com>
      Acked-by: Peter Xu <peterx@redhat.com>
      Cc: Shuah Khan <shuah@kernel.org>
      Cc: Joe Perches <joe@perches.com>
      Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: David Alan Gilbert <dgilbert@redhat.com>
      Cc: Greg Thelen <gthelen@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      77f962e7
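
      The portable printing approach described in 77f962e7 amounts to casting
      to the fixed-width stdint types and letting inttypes.h pick the format.
      A minimal sketch, assuming long long and unsigned long long as stand-ins
      for __s64 and __u64; format_counts() is a hypothetical helper, not the
      macro added by the patch.

```c
#include <inttypes.h>
#include <stdio.h>

/* Formats a signed and an unsigned 64-bit value without relying on
 * "%Ld"/"%Lu".  The casts make the argument types match PRId64/PRIu64 on
 * every platform, whatever the underlying 64-bit typedef is. */
int format_counts(char *buf, size_t size, long long sval,
		  unsigned long long uval)
{
	return snprintf(buf, size, "%" PRId64 " %" PRIu64,
			(int64_t)sval, (uint64_t)uval);
}
```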
    • Lokesh Gidra's avatar
      userfaultfd: add user-mode only option to unprivileged_userfaultfd sysctl knob · d0d4730a
      Lokesh Gidra authored
      With this change, when the knob is set to 0, it allows unprivileged users
      to call userfaultfd, like when it is set to 1, but with the restriction
      that only page faults from user mode can be handled.  In this mode, an
      unprivileged user (without the CAP_SYS_PTRACE capability) must pass
      UFFD_USER_MODE_ONLY to userfaultfd or the API will fail with EPERM.
      
      This enables administrators to reduce the likelihood that an attacker with
      access to userfaultfd can delay faulting kernel code to widen timing
      windows for other exploits.
      
      The default value of this knob is changed to 0.  This is required for
      correct functioning of pipe mutex.  However, this will fail postcopy live
      migration, which will be unnoticeable to the VM guests.  To avoid this,
      set 'vm.unprivileged_userfaultfd = 1' in /etc/sysctl.conf.
      
      The main reason this change is desirable in the short term is that the
      Android userland will behave as with the sysctl set to zero.  So without
      this commit, any Linux binary using userfaultfd to manage its memory would
      behave differently if run within the Android userland.  For more details,
      refer to Andrea's reply [1].
      
      [1] https://lore.kernel.org/lkml/20200904033438.GI9411@redhat.com/
      
      Link: https://lkml.kernel.org/r/20201120030411.2690816-3-lokeshgidra@google.com
      Signed-off-by: Lokesh Gidra <lokeshgidra@google.com>
      Reviewed-by: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Peter Xu <peterx@redhat.com>
      Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Stephen Smalley <stephen.smalley.work@gmail.com>
      Cc: Eric Biggers <ebiggers@kernel.org>
      Cc: Daniel Colascione <dancol@dancol.org>
      Cc: "Joel Fernandes (Google)" <joel@joelfernandes.org>
      Cc: Kalesh Singh <kaleshsingh@google.com>
      Cc: Suren Baghdasaryan <surenb@google.com>
      Cc: Jeff Vander Stoep <jeffv@google.com>
      Cc: <calin@google.com>
      Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
      Cc: Shaohua Li <shli@fb.com>
      Cc: Jerome Glisse <jglisse@redhat.com>
      Cc: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Mel Gorman <mgorman@techsingularity.net>
      Cc: Nitin Gupta <nigupta@nvidia.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Iurii Zaikin <yzaikin@google.com>
      Cc: Luis Chamberlain <mcgrof@kernel.org>
      Cc: Daniel Colascione <dancol@google.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      d0d4730a
    • Lokesh Gidra's avatar
      userfaultfd: add UFFD_USER_MODE_ONLY · 37cd0575
      Lokesh Gidra authored
      Patch series "Control over userfaultfd kernel-fault handling", v6.
      
      This patch series is split from [1].  The other series enables SELinux
      support for userfaultfd file descriptors so that its creation and movement
      can be controlled.
      
      It has been demonstrated on various occasions that suspending kernel code
      execution for an arbitrary amount of time at any access to userspace
      memory (copy_from_user()/copy_to_user()/...) can be exploited to change
      the intended behavior of the kernel.  For instance, handling page faults
      in kernel-mode using userfaultfd has been exploited in [2, 3].  Likewise,
      FUSE, which is similar to userfaultfd in this respect, has been exploited
      in [4, 5] for a similar outcome.
      
      This small patch series adds a new flag to userfaultfd(2) that allows
      callers to give up the ability to handle kernel-mode faults with the
      resulting UFFD file object.  It then adds a 'user-mode only' option to the
      unprivileged_userfaultfd sysctl knob to require unprivileged callers to
      use this new flag.
      
      The purpose of this new interface is to decrease the chance of an
      unprivileged userfaultfd user taking advantage of userfaultfd to enhance
      security vulnerabilities by lengthening the race window in kernel code.
      
      [1] https://lore.kernel.org/lkml/20200211225547.235083-1-dancol@google.com/
      [2] https://duasynt.com/blog/linux-kernel-heap-spray
      [3] https://duasynt.com/blog/cve-2016-6187-heap-off-by-one-exploit
      [4] https://googleprojectzero.blogspot.com/2016/06/exploiting-recursion-in-linux-kernel_20.html
      [5] https://bugs.chromium.org/p/project-zero/issues/detail?id=808
      
      This patch (of 2):
      
      userfaultfd handles page faults from both user and kernel code.  Add a new
      UFFD_USER_MODE_ONLY flag for userfaultfd(2) that makes the resulting
      userfaultfd object refuse to handle faults from kernel mode, treating
      these faults as if SIGBUS were always raised, causing the kernel code to
      fail with EFAULT.
      
      A future patch adds a knob allowing administrators to give some processes
      the ability to create userfaultfd file objects only if they pass
      UFFD_USER_MODE_ONLY, reducing the likelihood that these processes will
      exploit userfaultfd's ability to delay kernel page faults to open timing
      windows for future exploits.
      
      Link: https://lkml.kernel.org/r/20201120030411.2690816-1-lokeshgidra@google.com
      Link: https://lkml.kernel.org/r/20201120030411.2690816-2-lokeshgidra@google.com
      Signed-off-by: Daniel Colascione <dancol@google.com>
      Signed-off-by: Lokesh Gidra <lokeshgidra@google.com>
      Reviewed-by: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: <calin@google.com>
      Cc: Daniel Colascione <dancol@dancol.org>
      Cc: Eric Biggers <ebiggers@kernel.org>
      Cc: Iurii Zaikin <yzaikin@google.com>
      Cc: Jeff Vander Stoep <jeffv@google.com>
      Cc: Jerome Glisse <jglisse@redhat.com>
      Cc: "Joel Fernandes (Google)" <joel@joelfernandes.org>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Kalesh Singh <kaleshsingh@google.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Luis Chamberlain <mcgrof@kernel.org>
      Cc: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
      Cc: Mel Gorman <mgorman@techsingularity.net>
      Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
      Cc: Nitin Gupta <nigupta@nvidia.com>
      Cc: Peter Xu <peterx@redhat.com>
      Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
      Cc: Shaohua Li <shli@fb.com>
      Cc: Stephen Smalley <stephen.smalley.work@gmail.com>
      Cc: Suren Baghdasaryan <surenb@google.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      37cd0575
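
      From userspace, opting in to the flag added by 37cd0575 would look
      roughly like this.  It is a sketch under assumptions: it needs Linux
      headers that provide SYS_userfaultfd, the fallback definition mirrors the
      value in the uapi header as far as I know, and open_user_mode_uffd() is a
      hypothetical wrapper name.  On kernels without the flag, or where the
      syscall is blocked, the call simply returns -1.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Assumption: older uapi headers lack the flag; 1 is believed to match
 * include/uapi/linux/userfaultfd.h. */
#ifndef UFFD_USER_MODE_ONLY
#define UFFD_USER_MODE_ONLY 1
#endif

/* Requests a userfaultfd object that refuses kernel-mode faults.
 * Returns a file descriptor, or -1 with errno set on failure. */
int open_user_mode_uffd(void)
{
	return (int)syscall(SYS_userfaultfd,
			    O_CLOEXEC | O_NONBLOCK | UFFD_USER_MODE_ONLY);
}
```

      A caller would still perform the UFFDIO_API handshake on the returned
      descriptor before registering any memory ranges.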
    • Vlastimil Babka's avatar
      mm, page_poison: remove CONFIG_PAGE_POISONING_ZERO · f289041e
      Vlastimil Babka authored
      CONFIG_PAGE_POISONING_ZERO uses the zero pattern instead of 0xAA.  It was
      introduced by commit 1414c7f4 ("mm/page_poisoning.c: allow for zero
      poisoning"), noting that using zeroes retains the benefit of sanitizing
      content of freed pages, with the benefit of not having to zero them again
      on alloc, and the downside of making some forms of corruption (stray
      writes of NULLs) harder to detect than with the 0xAA pattern.  Together
      with CONFIG_PAGE_POISONING_NO_SANITY it made it possible to sanitize the
      contents on free without checking it back on alloc.
      
      These days we have the init_on_free option to achieve sanitization with
      zeroes and to save clearing on alloc (and without checking on alloc).
      Arguably if someone does choose to check the poison for corruption on
      alloc, the savings of not clearing the page are secondary, and it makes
      sense to always use the 0xAA poison pattern.  Thus, remove the
      CONFIG_PAGE_POISONING_ZERO option for being redundant.
      
      Link: https://lkml.kernel.org/r/20201113104033.22907-6-vbabka@suse.cz
      Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
      Acked-by: David Hildenbrand <david@redhat.com>
      Cc: Mike Rapoport <rppt@linux.ibm.com>
      Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Laura Abbott <labbott@kernel.org>
      Cc: Mateusz Nosek <mateusznosek0@gmail.com>
      Cc: Michal Hocko <mhocko@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      f289041e
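
      The detection trade-off discussed in f289041e (a stray write of zeroes is
      visible against 0xAA but invisible against an all-zero pattern) can be
      shown with a toy buffer.  This is a userspace sketch, not the kernel's
      page_poison code; poison_page() and check_poison() are hypothetical
      names.

```c
#include <stddef.h>
#include <string.h>

#define PAGE_POISON 0xaa	/* the pattern named in the commit message */

/* Fill a freed "page" with the poison pattern. */
void poison_page(unsigned char *page, size_t size)
{
	memset(page, PAGE_POISON, size);
}

/* Check the pattern on allocation.  Returns the offset of the first byte
 * that no longer holds the poison value, or -1 if the page is intact. */
long check_poison(const unsigned char *page, size_t size)
{
	size_t i;

	for (i = 0; i < size; i++)
		if (page[i] != PAGE_POISON)
			return (long)i;
	return -1;
}
```

      Writing a zero byte into a poisoned buffer makes check_poison() report
      its offset; had the poison value itself been zero, the same stray write
      would have gone unnoticed.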
    • Vlastimil Babka's avatar
      mm, page_poison: remove CONFIG_PAGE_POISONING_NO_SANITY · 8f424750
      Vlastimil Babka authored
      CONFIG_PAGE_POISONING_NO_SANITY skips the check on page alloc whether the
      poison pattern was corrupted, suggesting a use-after-free.  The motivation
      to introduce it in commit 8823b1db ("mm/page_poison.c: enable
      PAGE_POISONING as a separate option") was to simply sanitize freed pages,
      optimally together with CONFIG_PAGE_POISONING_ZERO.
      
      These days we have an init_on_free=1 boot option, which makes this use
      case of page poisoning redundant.  For sanitizing, writing zeroes is
      sufficient; there is pretty much no benefit from writing the 0xAA poison
      pattern to freed pages, without checking it back on alloc.  Thus, remove
      this option and suggest init_on_free instead in the main config's help.
      
      Link: https://lkml.kernel.org/r/20201113104033.22907-5-vbabka@suse.cz
      Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
      Acked-by: David Hildenbrand <david@redhat.com>
      Cc: Mike Rapoport <rppt@linux.ibm.com>
      Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Laura Abbott <labbott@kernel.org>
      Cc: Mateusz Nosek <mateusznosek0@gmail.com>
      Cc: Michal Hocko <mhocko@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      8f424750
    • Vlastimil Babka's avatar
      kernel/power: allow hibernation with page_poison sanity checking · 03b6c9a3
      Vlastimil Babka authored
      Page poisoning used to be incompatible with hibernation, as the state of
      poisoned pages was lost after resume, thus enabling CONFIG_HIBERNATION
      forces CONFIG_PAGE_POISONING_NO_SANITY.  For the same reason, the
      poisoning with zeroes variant CONFIG_PAGE_POISONING_ZERO used to disable
      hibernation.  The latter restriction was removed by commit 1ad1410f
      ("PM / Hibernate: allow hibernation with PAGE_POISONING_ZERO") and
      similarly for init_on_free by commit 18451f9f ("PM: hibernate: fix
      crashes with init_on_free=1") by making sure free pages are cleared after
      resume.
      
      We can use the same mechanism to instead poison free pages with
      PAGE_POISON after resume.  This covers both zero and 0xAA patterns.  Thus
      we can remove the Kconfig restriction that disables page poison sanity
      checking when hibernation is enabled.
      
      Link: https://lkml.kernel.org/r/20201113104033.22907-4-vbabka@suse.cz
      Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
      Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	[hibernation]
      Reviewed-by: David Hildenbrand <david@redhat.com>
      Cc: Mike Rapoport <rppt@linux.ibm.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Laura Abbott <labbott@kernel.org>
      Cc: Mateusz Nosek <mateusznosek0@gmail.com>
      Cc: Michal Hocko <mhocko@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      03b6c9a3
    • Vlastimil Babka's avatar
      mm, page_poison: use static key more efficiently · 8db26a3d
      Vlastimil Babka authored
      Commit 11c9c7ed ("mm/page_poison.c: replace bool variable with static
      key") changed page_poisoning_enabled() to a static key check.  However,
      the function is not inlined, so each check still involves a function call
      with overhead not eliminated when page poisoning is disabled.
      
      Analogously to how debug_pagealloc is handled, this patch converts
      page_poisoning_enabled() back to a boolean check, and introduces
      page_poisoning_enabled_static() for fast paths.  Both functions are
      inlined.
      
      The function kernel_poison_pages() is also called unconditionally and does
      the static key check inside.  Remove it from there and move it to the
      callers.  Also split it into two functions, kernel_poison_pages() and
      kernel_unpoison_pages() instead of the confusing bool parameter.
      
      Also optimize the check that enables page poisoning instead of
      debug_pagealloc for architectures without proper debug_pagealloc support.
      Move the check to init_mem_debugging_and_hardening() to enable a single
      static key instead of having two static branches in
      page_poisoning_enabled_static().
      
      Link: https://lkml.kernel.org/r/20201113104033.22907-3-vbabka@suse.cz
      Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
      Reviewed-by: David Hildenbrand <david@redhat.com>
      Cc: Mike Rapoport <rppt@linux.ibm.com>
      Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Laura Abbott <labbott@kernel.org>
      Cc: Mateusz Nosek <mateusznosek0@gmail.com>
      Cc: Michal Hocko <mhocko@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • Vlastimil Babka's avatar
      mm, page_alloc: do not rely on the order of page_poison and init_on_alloc/free parameters · 04013513
      Vlastimil Babka authored
      Patch series "cleanup page poisoning", v3.
      
      I have identified a number of issues and opportunities for cleanup with
      CONFIG_PAGE_POISON and friends:
      
       - interaction with init_on_alloc and init_on_free parameters depends on
         the order of parameters (Patch 1)
      
       - the boot time enabling uses a static key, but inefficiently (Patch 2)
      
       - sanity checking is incompatible with hibernation (Patch 3)
      
       - CONFIG_PAGE_POISONING_NO_SANITY can be removed now that we have
         init_on_free (Patch 4)
      
       - CONFIG_PAGE_POISONING_ZERO can be most likely removed now that we
         have init_on_free (Patch 5)
      
      This patch (of 5):
      
      Enabling page_poison=1 together with init_on_alloc=1 or init_on_free=1
      produces a warning in dmesg that page_poison takes precedence.  However,
      as these warnings are printed in early_param handlers for
      init_on_alloc/free, they are not printed if page_poison is enabled later
      on the command line (handlers are called in the order of their
      parameters), or when init_on_alloc/free is always enabled by the
      respective config option - before the page_poison early param handler is
      called, it is not considered to be enabled.  This is inconsistent.
      
      We can remove the dependency on order by making the init_on_* parameters
      only set a boolean variable, and postponing the evaluation after all early
      params have been processed.  Introduce a new
      init_mem_debugging_and_hardening() function for that, and move the related
      debug_pagealloc processing there as well.
      
      As a result, init_mem_debugging_and_hardening() always knows accurately
      whether the init_on_* and/or page_poison options were enabled.  Thus we can
      also optimize want_init_on_alloc() and want_init_on_free().  We don't need
      to check page_poisoning_enabled() there; we can instead not enable the
      init_on_* static keys at all if page poisoning is enabled.  This results
      in simpler and more efficient code.
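
      The order independence can be illustrated with a small userspace model.
      It is a sketch under stated assumptions rather than the actual patch:
      struct mem_debug_state, parse_early_param() and boot() are hypothetical,
      plain bools stand in for the static keys, and the "warned" flag stands
      in for the dmesg warning.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical state; bools stand in for the kernel's static keys. */
struct mem_debug_state {
	bool want_page_poison;   /* requested on the command line */
	bool want_init_on_alloc;
	bool want_init_on_free;
	bool page_poison_key;    /* what actually gets enabled */
	bool init_on_alloc_key;
	bool init_on_free_key;
	bool warned;             /* models the "takes precedence" warning */
};

/* Handlers only record the request; they decide nothing yet. */
static void parse_early_param(struct mem_debug_state *s, const char *param)
{
	if (strcmp(param, "page_poison=1") == 0)
		s->want_page_poison = true;
	else if (strcmp(param, "init_on_alloc=1") == 0)
		s->want_init_on_alloc = true;
	else if (strcmp(param, "init_on_free=1") == 0)
		s->want_init_on_free = true;
}

/* Runs once, after every parameter has been seen. */
static void init_mem_debugging_and_hardening(struct mem_debug_state *s)
{
	if (s->want_page_poison) {
		s->page_poison_key = true;
		if (s->want_init_on_alloc || s->want_init_on_free)
			s->warned = true;
		return; /* init_on_* keys stay disabled */
	}
	s->init_on_alloc_key = s->want_init_on_alloc;
	s->init_on_free_key = s->want_init_on_free;
}

/* Hypothetical driver: parse all parameters, then evaluate them. */
static struct mem_debug_state boot(const char *const *params, size_t n)
{
	struct mem_debug_state s = {0};

	assert(params != NULL || n == 0);
	for (size_t i = 0; i < n; i++)
		parse_early_param(&s, params[i]);
	init_mem_debugging_and_hardening(&s);
	return s;
}
```

      Calling boot() with "page_poison=1" before or after "init_on_alloc=1"
      gives the same state in this model, because the decision is made only
      after parsing has finished.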
      
      Link: https://lkml.kernel.org/r/20201113104033.22907-1-vbabka@suse.cz
      Link: https://lkml.kernel.org/r/20201113104033.22907-2-vbabka@suse.cz
      Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
      Reviewed-by: David Hildenbrand <david@redhat.com>
      Reviewed-by: Mike Rapoport <rppt@linux.ibm.com>
      Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Mateusz Nosek <mateusznosek0@gmail.com>
      Cc: Laura Abbott <labbott@kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>