1. 12 Apr, 2024 3 commits
    • Merge tag 'drm-fixes-2024-04-12' of https://gitlab.freedesktop.org/drm/kernel · d1c13e80
      Linus Torvalds authored
      Pull drm fixes from Dave Airlie:
       "Looks like everyone woke up after holidays, this week's pull has a
        bunch of stuff all over, 2 weeks' worth of amdgpu is a lot of it, then
        i915/xe have a few, a bunch of msm fixes, then some scattered driver
        fixes.
      
        I expect things will settle down for rc5.
      
        client:
         - Protect connector modes with mode_config mutex
      
        ast:
         - Fix soft lockup
      
        host1x:
         - Do not setup DMA for virtual addresses
      
        ivpu:
         - Fix deadlock in context_xa
         - PCI fixes
         - Fixes to error handling
      
        nouveau:
         - gsp: Fix OOB access
         - Fix casting
      
        panfrost:
         - Fix error path in MMU code
      
        qxl:
         - Revert "drm/qxl: simplify qxl_fence_wait"
      
        vmwgfx:
         - Enable DMA for SEV mappings
      
        i915:
         - Couple CDCLK programming fixes
         - HDCP related fix
         - 4 Bigjoiner related fixes
         - Fix for a circular locking around GuC on reset+wedged case
      
        xe:
         - Fix double display mutex initializations
         - Fix u32 -> u64 implicit conversions
         - Fix RING_CONTEXT_CONTROL not marked as masked
      
        msm:
         - DP refcount leak fix on disconnect
         - Add missing newlines to prints in msm_fb and msm_kms
         - fix dpu debugfs entry permissions
         - Fix the interface table for the catalog of X1E80100
         - fix irq message printing
         - Bindings fix to add DP node as child of mdss for mdss node
         - Minor typo fix in DP driver API which handles port status change
         - fix CRASHDUMP_READ()
         - fix HBB (highest bank bit) for a619 to fix UBWC corruption
      
        amdgpu:
         - GPU reset fixes
         - Fix some confusing logging
         - UMSCH fix
         - Aborted suspend fix
         - DCN 3.5 fixes
         - S4 fix
         - MES logging fixes
         - SMU 14 fixes
         - SDMA 4.4.2 fix
         - KASAN fix
         - SMU 13.0.10 fix
         - VCN partition fix
         - GFX11 fixes
         - DWB fixes
         - Plane handling fix
         - FAMS fix
         - DCN 3.1.6 fix
         - VSC SDP fixes
         - OLED panel fix
         - GFX 11.5 fix
      
        amdkfd:
         - GPU reset fixes
         - fix ioctl integer overflow"
      
      * tag 'drm-fixes-2024-04-12' of https://gitlab.freedesktop.org/drm/kernel: (65 commits)
        amdkfd: use calloc instead of kzalloc to avoid integer overflow
        drm/xe: Label RING_CONTEXT_CONTROL as masked
        drm/xe/xe_migrate: Cast to output precision before multiplying operands
        drm/xe/hwmon: Cast result to output precision on left shift of operand
        drm/xe/display: Fix double mutex initialization
        drm/amdgpu: differentiate external rev id for gfx 11.5.0
        drm/amd/display: Adjust dprefclk by down spread percentage.
        drm/amd/display: Set VSC SDP Colorimetry same way for MST and SST
        drm/amd/display: Program VSC SDP colorimetry for all DP sinks >= 1.4
        drm/amd/display: fix disable otg wa logic in DCN316
        drm/amd/display: Do not recursively call manual trigger programming
        drm/amd/display: always reset ODM mode in context when adding first plane
        drm/amdgpu: fix incorrect number of active RBs for gfx11
        drm/amd/display: Return max resolution supported by DWB
        amd/amdkfd: sync all devices to wait all processes being evicted
        drm/amdgpu: clear set_q_mode_offs when VM changed
        drm/amdgpu: Fix VCN allocation in CPX partition
        drm/amd/pm: fix the high voltage issue after unload
        drm/amd/display: Skip on writeback when it's not applicable
        drm/amdgpu: implement IRQ_STATE_ENABLE for SDMA v4.4.2
        ...
      d1c13e80
    • amdkfd: use calloc instead of kzalloc to avoid integer overflow · 3b0daecf
      Dave Airlie authored
      This uses calloc instead of doing the multiplication which might
      overflow.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Dave Airlie <airlied@redhat.com>
      3b0daecf
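The overflow this commit guards against can be illustrated with a minimal userspace sketch. It is not the kernel code: `naive_size32()` and `alloc_array_checked()` are hypothetical names used only for illustration, and the 32-bit width of the naive size is an assumption.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Size computed with a plain 32-bit multiplication: wraps silently. */
static uint32_t naive_size32(uint32_t count, uint32_t elem_size)
{
	return count * elem_size;	/* unsigned arithmetic wraps modulo 2^32 */
}

/* Hypothetical helper: refuse the allocation when count * elem_size overflows. */
static void *alloc_array_checked(size_t count, size_t elem_size)
{
	assert(elem_size != 0);
	if (count > SIZE_MAX / elem_size)
		return NULL;		/* the product would not fit in size_t */
	return calloc(count, elem_size);
}
```

With a count and element size of 0x10000 each, the naive product wraps to 0, so a buffer far smaller than intended would be requested. An allocator that takes the count and the element size separately can reject such a request instead.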
    • Merge tag 'drm-msm-next-2024-04-11' of https://gitlab.freedesktop.org/drm/msm into drm-fixes · 6d837271
      Dave Airlie authored
      Fixes for v6.9
      
      Display:
      - Fixes for PM refcount leak when DP goes to disconnected state and
        also when link training fails. This is also one of the issues found
        with the pm runtime series
      - Add missing newlines to prints in msm_fb and msm_kms
      - Change permissions of some dpu debugfs entries which write to const
        data from catalog to read-only to avoid protection faults
      - Fix the interface table for the catalog of X1E80100. This is an
        important fix to bringup DP for X1E80100.
      - Logging fix to print the callback symbol in the invalid IRQ message
        case rather than printing when it's known to be NULL.
      - Bindings fix to add DP node as child of mdss for mdss node
      - Minor typo fix in DP driver API which handles port status change
      
      GPU:
      - fix CRASHDUMP_READ()
      - fix HBB (highest bank bit) for a619 to fix UBWC corruption
      Signed-off-by: Dave Airlie <airlied@redhat.com>
      From: Rob Clark <robdclark@gmail.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/CAF6AEGvFwRUcHGWva7oDeydq1PTiZMduuykCD2MWaFrT4iMGZA@mail.gmail.com
      6d837271
  2. 11 Apr, 2024 37 commits
    • Merge tag 'cxl-fixes-6.9-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl · 586b5dfb
      Linus Torvalds authored
      Pull cxl fixes from Dave Jiang:
      
       - Fix index of Clear Event Record handles in cxl_clear_event_record()
      
       - Fix use before init of map->reg_type in cxl_decode_regblock()
      
       - Fix initialization of mbox_cmd.size_out in cxl_mem_get_records_log()
      
       - Fix CXL path access_coordinate computation:
           - Remove unneeded check of iter in loop
           - Fix retrieval of access_coordinate in PCI topology walk
           - Fix incorrect region access_coordinate data calculation
           - Consolidate access_coordinates attached to downstream port
             context
           - Add check of access_coordinate validity to prevent
             incorrect data being exposed via sysfs
      
      * tag 'cxl-fixes-6.9-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl:
        cxl: Add checks to access_coordinate calculation to fail missing data
        cxl: Consolidate dport access_coordinate ->hb_coord and ->sw_coord into ->coord
        cxl: Fix incorrect region perf data calculation
        cxl: Fix retrieving of access_coordinates in PCIe path
        cxl: Remove checking of iter in cxl_endpoint_get_perf_coordinates()
        cxl/core: Fix initialization of mbox_cmd.size_out in get event
        cxl/core/regs: Fix usage of map->reg_type in cxl_decode_regblock() before assigned
        cxl/mem: Fix for the index of Clear Event Record Handle
      586b5dfb
    • Merge tag 'hyperv-fixes-signed-20240411' of... · 52e5070f
      Linus Torvalds authored
      Merge tag 'hyperv-fixes-signed-20240411' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux
      
      Pull hyperv fixes from Wei Liu:
      
       - Some cosmetic changes (Erni Sri Satya Vennela, Li Zhijian)
      
       - Introduce hv_numa_node_to_pxm_info() (Nuno Das Neves)
      
       - Fix KVP daemon to handle IPv4 and IPv6 combination for keyfile format
         (Shradha Gupta)
      
       - Avoid freeing decrypted memory in a confidential VM (Rick Edgecombe
         and Michael Kelley)
      
      * tag 'hyperv-fixes-signed-20240411' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
        Drivers: hv: vmbus: Don't free ring buffers that couldn't be re-encrypted
        uio_hv_generic: Don't free decrypted memory
        hv_netvsc: Don't free decrypted memory
        Drivers: hv: vmbus: Track decrypted status in vmbus_gpadl
        Drivers: hv: vmbus: Leak pages if set_memory_encrypted() fails
        hv/hv_kvp_daemon: Handle IPv4 and Ipv6 combination for keyfile format
        hv: vmbus: Convert sprintf() family to sysfs_emit() family
        mshyperv: Introduce hv_numa_node_to_pxm_info()
        x86/hyperv: Cosmetic changes for hv_apic.c
      52e5070f
    • Merge tag 'drm-xe-fixes-2024-04-11' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes · 1bafeaf2
      Dave Airlie authored
      - Fix double display mutex initializations
      - Fix u32 -> u64 implicit conversions
      - Fix RING_CONTEXT_CONTROL not marked as masked
      Signed-off-by: Dave Airlie <airlied@redhat.com>
      
      From: Lucas De Marchi <lucas.demarchi@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/ewvvtgcb2gonxvccws6nt6fqswoyfp4g43t5ex24vpqwtrxdzm@hgjoz5uirmxx
      1bafeaf2
    • Merge tag 'drm-misc-fixes-2024-04-11' of... · 1b24b3cd
      Dave Airlie authored
      Merge tag 'drm-misc-fixes-2024-04-11' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes
      
      Short summary of fixes pull:
      
      ast:
      - Fix soft lockup
      
      client:
      - Protect connector modes with mode_config mutex
      
      host1x:
      - Do not setup DMA for virtual addresses
      
      ivpu:
      - Fix deadlock in context_xa
      - PCI fixes
      - Fixes to error handling
      
      nouveau:
      - gsp: Fix OOB access
      - Fix casting
      
      panfrost:
      - Fix error path in MMU code
      
      qxl:
      - Revert "drm/qxl: simplify qxl_fence_wait"
      
      vmwgfx:
      - Enable DMA for SEV mappings
      Signed-off-by: Dave Airlie <airlied@redhat.com>
      
      From: Thomas Zimmermann <tzimmermann@suse.de>
      Link: https://patchwork.freedesktop.org/patch/msgid/20240411073403.GA9895@localhost.localdomain
      1b24b3cd
    • Merge tag 'acpi-6.9-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · 00dcf5d8
      Linus Torvalds authored
      Pull ACPI fixes from Rafael Wysocki:
       "These fix the handling of dependencies between devices in the ACPI
        device enumeration code and address a _UID matching regression from
        the 6.8 development cycle.
      
        Specifics:
      
         - Modify the ACPI device enumeration code to avoid counting
           dependencies that have been met already as unmet (Hans de Goede)
      
         - Make _UID matching take the integer value of 0 into account as
           appropriate (Raag Jadav)"
      
      * tag 'acpi-6.9-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
        ACPI: bus: allow _UID matching for integer zero
        ACPI: scan: Do not increase dep_unmet for already met dependencies
      00dcf5d8
    • Merge tag 'pm-6.9-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm · 136eb5fd
      Linus Torvalds authored
      Pull power management fix from Rafael Wysocki:
       "Fix the suspend-to-idle core code to guarantee that timers queued on
        CPUs other than the one that has first left the idle state, which
        should expire directly after resume, will be handled (Anna-Maria
        Behnsen)"
      
      * tag 'pm-6.9-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
        PM: s2idle: Make sure CPUs will wakeup directly on resume
      136eb5fd
    • Merge tag 'net-6.9-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net · 2ae9a897
      Linus Torvalds authored
      Pull networking fixes from Paolo Abeni:
       "Including fixes from bluetooth.
      
        Current release - new code bugs:
      
         - netfilter: complete validation of user input
      
         - mlx5: disallow SRIOV switchdev mode when in multi-PF netdev
      
        Previous releases - regressions:
      
         - core: fix u64_stats_init() for lockdep when used repeatedly in one
           file
      
         - ipv6: fix race condition between ipv6_get_ifaddr and ipv6_del_addr
      
         - bluetooth: fix memory leak in hci_req_sync_complete()
      
         - batman-adv: avoid infinite loop trying to resize local TT
      
         - drv: geneve: fix header validation in geneve[6]_xmit_skb
      
         - drv: bnxt_en: fix possible memory leak in
           bnxt_rdma_aux_device_init()
      
         - drv: mlx5: offset comp irq index in name by one
      
         - drv: ena: avoid double-free clearing stale tx_info->xdpf value
      
         - drv: pds_core: fix pdsc_check_pci_health deadlock
      
        Previous releases - always broken:
      
         - xsk: validate user input for XDP_{UMEM|COMPLETION}_FILL_RING
      
         - bluetooth: fix setsockopt not validating user input
      
         - af_unix: clear stale u->oob_skb.
      
         - nfc: llcp: fix nfc_llcp_setsockopt() unsafe copies
      
         - drv: virtio_net: fix guest hangup on invalid RSS update
      
         - drv: mlx5e: Fix mlx5e_priv_init() cleanup flow
      
         - dsa: mt7530: trap link-local frames regardless of ST Port State"
      
      * tag 'net-6.9-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (59 commits)
        net: ena: Set tx_info->xdpf value to NULL
        net: ena: Fix incorrect descriptor free behavior
        net: ena: Wrong missing IO completions check order
        net: ena: Fix potential sign extension issue
        af_unix: Fix garbage collector racing against connect()
        net: dsa: mt7530: trap link-local frames regardless of ST Port State
        Revert "s390/ism: fix receive message buffer allocation"
        net: sparx5: fix wrong config being used when reconfiguring PCS
        net/mlx5: fix possible stack overflows
        net/mlx5: Disallow SRIOV switchdev mode when in multi-PF netdev
        net/mlx5e: RSS, Block XOR hash with over 128 channels
        net/mlx5e: Do not produce metadata freelist entries in Tx port ts WQE xmit
        net/mlx5e: HTB, Fix inconsistencies with QoS SQs number
        net/mlx5e: Fix mlx5e_priv_init() cleanup flow
        net/mlx5e: RSS, Block changing channels number when RXFH is configured
        net/mlx5: Correctly compare pkt reformat ids
        net/mlx5: Properly link new fs rules into the tree
        net/mlx5: offset comp irq index in name by one
        net/mlx5: Register devlink first under devlink lock
        net/mlx5: E-switch, store eswitch pointer before registering devlink_param
        ...
      2ae9a897
    • Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · ab4319fd
      Linus Torvalds authored
      Pull SCSI fixes from James Bottomley:
       "The most important fix is the sg one because the regression it fixes
        (spurious warning and use after final put) is already backported to
        stable.
      
        The next biggest impact is the target fix for wrong credentials used
        to load a module because it's affecting new kernels installed on
        selinux based distributions.
      
        The other three fixes are an obvious off by one and SATA protocol
        issues"
      
      * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
        scsi: qla2xxx: Fix off by one in qla_edif_app_getstats()
        scsi: hisi_sas: Modify the deadline for ata_wait_after_reset()
        scsi: hisi_sas: Handle the NCQ error returned by D2H frame
        scsi: target: Fix SELinux error when systemd-modules loads the target module
        scsi: sg: Avoid race in error handling & drop bogus warn
      ab4319fd
    • Merge tag 'loongarch-fixes-6.9-1' of... · 5de6b467
      Linus Torvalds authored
      Merge tag 'loongarch-fixes-6.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson
      
      Pull LoongArch fixes from Huacai Chen:
      
       - make {virt, phys, page, pfn} translation work with KFENCE for
         LoongArch (otherwise NVMe and virtio-blk cannot work with KFENCE
         enabled)
      
       - update dts files for Loongson-2K series to make devices work
         correctly
      
       - fix a build error
      
      * tag 'loongarch-fixes-6.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson:
        LoongArch: Include linux/sizes.h in addrspace.h to prevent build errors
        LoongArch: Update dts for Loongson-2K2000 to support GMAC/GNET
        LoongArch: Update dts for Loongson-2K2000 to support PCI-MSI
        LoongArch: Update dts for Loongson-2K2000 to support ISA/LPC
        LoongArch: Update dts for Loongson-2K1000 to support ISA/LPC
        LoongArch: Make virt_addr_valid()/__virt_addr_valid() work with KFENCE
        LoongArch: Make {virt, phys, page, pfn} translation work with KFENCE
        mm: Move lowmem_page_address() a little later
      5de6b467
    • Merge tag 'bcachefs-2024-04-10' of https://evilpiepirate.org/git/bcachefs · e1dc191d
      Linus Torvalds authored
      Pull more bcachefs fixes from Kent Overstreet:
       "Notable user impacting bugs
      
         - On multi device filesystems, recovery was looping in
           btree_trans_too_many_iters(). This checks if a transaction has
           touched too many btree paths (because of iteration over many keys),
           and issues a restart to drop unneeded paths.
      
           But it's now possible for some paths to exceed the previous limit
           without iteration in the interior btree update path, since the
           transaction commit will do alloc updates for every old and new
           btree node, and during journal replay we don't use the btree write
           buffer for locking reasons and thus those updates use btree paths
           when they wouldn't normally.
      
         - Fix a corner case in rebalance when moving extents on a
           durability=0 device. This wouldn't be hit when a device was
           formatted with durability=0 since in that case we'll only use it as
           a write through cache (only cached extents will live on it), but
           durability can now be changed on an existing device.
      
         - bch2_get_acl() could rarely forget to handle a transaction restart;
           this manifested as the occasional missing acl that came back after
           dropping caches.
      
         - Fix a major performance regression on high iops multithreaded write
           workloads (only since 6.9-rc1); a previous fix for a deadlock in
           the interior btree update path to check the journal watermark
           introduced a dependency on the state of btree write buffer flushing
           that we didn't want.
      
         - Assorted other repair paths and recovery fixes"
      
      * tag 'bcachefs-2024-04-10' of https://evilpiepirate.org/git/bcachefs: (25 commits)
        bcachefs: Fix __bch2_btree_and_journal_iter_init_node_iter()
        bcachefs: Kill read lock dropping in bch2_btree_node_lock_write_nofail()
        bcachefs: Fix a race in btree_update_nodes_written()
        bcachefs: btree_node_scan: Respect member.data_allowed
        bcachefs: Don't scan for btree nodes when we can reconstruct
        bcachefs: Fix check_topology() when using node scan
        bcachefs: fix eytzinger0_find_gt()
        bcachefs: fix bch2_get_acl() transaction restart handling
        bcachefs: fix the count of nr_freed_pcpu after changing bc->freed_nonpcpu list
        bcachefs: Fix gap buffer bug in bch2_journal_key_insert_take()
        bcachefs: Rename struct field swap to prevent macro naming collision
        MAINTAINERS: Add entry for bcachefs documentation
        Documentation: filesystems: Add bcachefs toctree
        bcachefs: JOURNAL_SPACE_LOW
        bcachefs: Disable errors=panic for BCH_IOCTL_FSCK_OFFLINE
        bcachefs: Fix BCH_IOCTL_FSCK_OFFLINE for encrypted filesystems
        bcachefs: fix rand_delete unit test
        bcachefs: fix ! vs ~ typo in __clear_bit_le64()
        bcachefs: Fix rebalance from durability=0 device
        bcachefs: Print shutdown journal sequence number
        ...
      e1dc191d
    • Merge tag 'tag-chrome-platform-fixes-for-v6.9-rc4' of... · 346668f0
      Linus Torvalds authored
      Merge tag 'tag-chrome-platform-fixes-for-v6.9-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/chrome-platform/linux
      
      Pull chrome platform fix from Tzung-Bi Shih:
       "Fix a NULL pointer dereference"
      
      * tag 'tag-chrome-platform-fixes-for-v6.9-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/chrome-platform/linux:
        platform/chrome: cros_ec_uart: properly fix race condition
      346668f0
    • Merge branch 'acpi-bus' · d7da7e7c
      Rafael J. Wysocki authored
      * acpi-bus:
        ACPI: bus: allow _UID matching for integer zero
      d7da7e7c
    • drm/xe: Label RING_CONTEXT_CONTROL as masked · f76646c8
      Ashutosh Dixit authored
      RING_CONTEXT_CONTROL is a masked register.
      
      v2: Also clean up setting register value (Lucas)
      Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
      Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
      Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20240404161256.3852502-1-ashutosh.dixit@intel.com
      (cherry picked from commit dc30c6e7)
      Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
      f76646c8
    • drm/xe/xe_migrate: Cast to output precision before multiplying operands · 9cb46b31
      Himal Prasad Ghimiray authored
      Addressing potential overflow in result of  multiplication of two lower
      precision (u32) operands before widening it to higher precision
      (u64).
      
      -v2
      Fix commit message and description. (Rodrigo)
      
      Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
      Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
      Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20240401175300.3823653-1-himal.prasad.ghimiray@intel.com
      Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
      (cherry picked from commit 34820967)
      Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
      9cb46b31
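The class of bug described in this commit message can be sketched in plain C. The function names below are hypothetical and the operands are illustrative; this is not the xe_migrate code.

```c
#include <assert.h>
#include <stdint.h>

/* Multiplication done in 32 bits; the product is truncated before widening. */
static uint64_t mul_then_widen(uint32_t a, uint32_t b)
{
	return a * b;
}

/* One operand is cast first, so the multiplication itself is done in 64 bits. */
static uint64_t widen_then_mul(uint32_t a, uint32_t b)
{
	return (uint64_t)a * b;
}

/* Self-check for 0x10000 * 0x10000, which equals 2^32. */
static void demo_mul(void)
{
	assert(mul_then_widen(0x10000u, 0x10000u) == 0);
	assert(widen_then_mul(0x10000u, 0x10000u) == 0x100000000ULL);
}
```

Assigning the result to a 64-bit variable does not help by itself, because the width of the multiplication is decided by the operand types, not by the destination.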
    • drm/xe/hwmon: Cast result to output precision on left shift of operand · a8ad8715
      Karthik Poosa authored
      Address potential overflow in result of left shift of a
      lower precision (u32) operand before assignment to higher
      precision (u64) variable.
      
      v2:
       - Update commit message. (Himal)
      
      Fixes: 4446fcf2 ("drm/xe/hwmon: Expose power1_max_interval")
      Signed-off-by: Karthik Poosa <karthik.poosa@intel.com>
      Reviewed-by: Anshuman Gupta <anshuman.gupta@intel.com>
      Cc: Badal Nilawar <badal.nilawar@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20240405130127.1392426-5-karthik.poosa@intel.com
      Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
      (cherry picked from commit 883232b4)
      Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
      a8ad8715
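A left shift has the same pitfall as the multiplication case: the shift is carried out in the width of the left operand. The following is a minimal sketch with hypothetical function names, not the hwmon code itself.

```c
#include <assert.h>
#include <stdint.h>

/* Shift done in 32 bits; bits shifted past bit 31 are lost before widening. */
static uint64_t shift_then_widen(uint32_t v, unsigned int n)
{
	assert(n < 32);		/* shifting by >= the operand width is undefined */
	return v << n;
}

/* Operand is widened first, so the shifted-out bits land in the upper half. */
static uint64_t widen_then_shift(uint32_t v, unsigned int n)
{
	assert(n < 32);
	return (uint64_t)v << n;
}
```

For example, shifting 0x80000000 left by one yields 0 in the first variant and 0x100000000 in the second.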
    • drm/xe/display: Fix double mutex initialization · 50a9b7fc
      Lucas De Marchi authored
      All of these mutexes are already initialized by the display side since
      commit 3fef3e6f ("drm/i915: move display mutex inits to display
      code"), so the xe shouldn't initialize them.
      
      Fixes: 44e69495 ("drm/xe/display: Implement display support")
      Cc: Jani Nikula <jani.nikula@linux.intel.com>
      Cc: Arun R Murthy <arun.r.murthy@intel.com>
      Reviewed-by: Jani Nikula <jani.nikula@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20240405200711.2041428-1-lucas.demarchi@intel.com
      Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
      (cherry picked from commit 117de185)
      Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
      50a9b7fc
    • Merge branch 'ena-driver-bug-fixes' · 4e1ad31c
      Paolo Abeni authored
      David Arinzon says:
      
      ====================
      ENA driver bug fixes
      
      From: David Arinzon <darinzon@amazon.com>
      
      This patchset contains multiple bug fixes for the
      ENA driver.
      ====================
      
      Link: https://lore.kernel.org/r/20240410091358.16289-1-darinzon@amazon.com
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      4e1ad31c
    • net: ena: Set tx_info->xdpf value to NULL · 36a1ca01
      David Arinzon authored
      The patch mentioned in the `Fixes` tag removed the explicit assignment
      of tx_info->xdpf to NULL with the justification that there's no need
      to set tx_info->xdpf to NULL and tx_info->num_of_bufs to 0 in case
      of a mapping error. Both values won't be used once the mapping function
      returns an error, and their values would be overridden by the next
      transmitted packet.
      
      While both values do indeed get overridden in the next transmission
      call, the value of tx_info->xdpf is also used to check whether a TX
      descriptor's transmission has been completed (i.e. a completion for it
      was polled).
      
      An example scenario:
      1. Mapping failed, tx_info->xdpf wasn't set to NULL
      2. A VF reset occurred leading to IO resource destruction and
         a call to ena_free_tx_bufs() function
      3. Although the descriptor whose mapping failed was freed by the
         transmission function, it still passes the check
           if (!tx_info->skb)
      
         (skb and xdp_frame are in a union)
      4. The xdp_frame associated with the descriptor is freed twice
      
      This patch restores the assignment of NULL to tx_info->xdpf so that the
      cleaning function knows that the descriptor is already freed.
      
      Fixes: 504fd6a5 ("net: ena: fix DMA mapping function issues in XDP")
      Signed-off-by: Shay Agroskin <shayagr@amazon.com>
      Signed-off-by: David Arinzon <darinzon@amazon.com>
      Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      36a1ca01
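The scenario in this commit message can be modelled with a small sketch. Everything here is hypothetical (`struct tx_buf_sketch`, `xmit_map_error()`, `free_tx_bufs_sketch()`, `count_frees()`); frees are only counted, never performed, and the real driver structures are more involved.

```c
#include <assert.h>
#include <stddef.h>

struct tx_buf_sketch {
	union {			/* skb and xdpf share storage, as the message notes */
		void *skb;
		void *xdpf;
	};
};

static int frees;		/* how many times the frame would have been freed */

/* Error path of the transmit function: the frame is released here. */
static void xmit_map_error(struct tx_buf_sketch *info, int clear_ptr)
{
	assert(info != NULL);
	frees++;
	if (clear_ptr)
		info->xdpf = NULL;	/* the assignment the patch restores */
}

/* Reset-time cleanup: any non-NULL pointer is treated as uncompleted. */
static void free_tx_bufs_sketch(struct tx_buf_sketch *info)
{
	assert(info != NULL);
	if (!info->skb)
		return;
	frees++;
	info->skb = NULL;
}

/* Runs the mapping-error path followed by the cleanup and counts the frees. */
static int count_frees(int clear_ptr)
{
	int frame;
	struct tx_buf_sketch info = { .xdpf = &frame };

	frees = 0;
	xmit_map_error(&info, clear_ptr);
	free_tx_bufs_sketch(&info);
	return frees;
}
```

Without the NULL assignment the stale pointer passes the `if (!info->skb)` check and the frame is counted twice; with it, the cleanup skips the descriptor.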
    • net: ena: Fix incorrect descriptor free behavior · bf02d9fe
      David Arinzon authored
      ENA has two types of TX queues:
      - queues which only process TX packets arriving from the network stack
      - queues which only process TX packets forwarded to it by XDP_REDIRECT
        or XDP_TX instructions
      
      The ena_free_tx_bufs() cycles through all descriptors in a TX queue
      and unmaps + frees every descriptor that hasn't been acknowledged yet
      by the device (uncompleted TX transactions).
      The function assumes that the processed TX queue is necessarily from
      the first category listed above and ends up using napi_consume_skb()
      for descriptors belonging to an XDP specific queue.
      
      This patch solves a bug in which, in case of a VF reset, the
      descriptors aren't freed correctly, leading to crashes.
      
      Fixes: 548c4940 ("net: ena: Implement XDP_TX action")
      Signed-off-by: Shay Agroskin <shayagr@amazon.com>
      Signed-off-by: David Arinzon <darinzon@amazon.com>
      Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      bf02d9fe
    • net: ena: Wrong missing IO completions check order · f7e41718
      David Arinzon authored
      Missing IO completions check is called every second (HZ jiffies).
      This commit fixes several issues with this check:
      
      1. Duplicate queues check:
         Max of 4 queues are scanned on each check due to monitor budget.
         Once reaching the budget, this check exits under the assumption that
         the next check will continue to scan the remainder of the queues,
         but in practice, next check will first scan the last already scanned
         queue which is not necessary and may cause the full queue scan to
         last a couple of seconds longer.
         The fix is to start every check with the next queue to scan.
         For example, on 8 IO queues:
         Bug: [0,1,2,3], [3,4,5,6], [6,7]
         Fix: [0,1,2,3], [4,5,6,7]
      
      2. Unbalanced queues check:
         In case the number of active IO queues is not a multiple of budget,
         there will be checks which don't utilize the full budget
         because the full scan exits when reaching the last queue id.
         The fix is to run every TX completion check with exact queue budget
         regardless of the queue id.
         For example, on 7 IO queues:
         Bug: [0,1,2,3], [4,5,6], [0,1,2,3]
         Fix: [0,1,2,3], [4,5,6,0], [1,2,3,4]
         The budget may be lowered in case the number of IO queues is less
         than the budget (4) to make sure there are no duplicate queues on
         the same check.
         For example, on 3 IO queues:
         Bug: [0,1,2,0], [1,2,0,1]
         Fix: [0,1,2], [0,1,2]
      
      Fixes: 1738cd3e ("net: ena: Add a driver for Amazon Elastic Network Adapters (ENA)")
      Signed-off-by: Amit Bernstein <amitbern@amazon.com>
      Signed-off-by: David Arinzon <darinzon@amazon.com>
      Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      f7e41718
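The fixed scan order described in this commit message can be reproduced with a short simulation. This is a sketch of the behaviour as the message describes it, not the driver code; `scan_once()`, `queue_at()` and `SCAN_BUDGET` are hypothetical names.

```c
#include <assert.h>

#define SCAN_BUDGET 4	/* monitor budget quoted in the commit message */

/*
 * Fill out[] with the queue ids visited by one check and return how many
 * were written. *next is the cursor that persists between checks.
 */
static int scan_once(int nqueues, int *next, int out[SCAN_BUDGET])
{
	/* Lower the budget when there are fewer queues than the budget. */
	int budget = nqueues < SCAN_BUDGET ? nqueues : SCAN_BUDGET;
	int i;

	assert(nqueues > 0 && *next >= 0 && *next < nqueues);
	for (i = 0; i < budget; i++) {
		out[i] = *next;
		*next = (*next + 1) % nqueues;	/* wrap past the last queue id */
	}
	return budget;
}

/*
 * Queue id visited at position slot of the check-th check (both 0-based),
 * starting from cursor 0; -1 if that check visits fewer queues.
 */
static int queue_at(int nqueues, int check, int slot)
{
	int out[SCAN_BUDGET], next = 0, n = 0, i;

	for (i = 0; i <= check; i++)
		n = scan_once(nqueues, &next, out);
	return slot < n ? out[slot] : -1;
}
```

With 7 queues the successive checks visit [0,1,2,3], [4,5,6,0] and [1,2,3,4], matching the "Fix" rows in the message; with 3 queues every check visits [0,1,2].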
    • net: ena: Fix potential sign extension issue · 713a8519
      David Arinzon authored
      Small unsigned types are promoted to larger signed types in
      the case of multiplication, the result of which may overflow.
      In case the result of such a multiplication has its MSB
      turned on, it will be sign extended with '1's.
      This changes the multiplication result.
      
      Code example of the phenomenon:
      -------------------------------
      u16 x, y;
      size_t z1, z2;
      
      x = y = 0xffff;
      printk("x=%x y=%x\n",x,y);
      
      z1 = x*y;
      z2 = (size_t)x*y;
      
      printk("z1=%lx z2=%lx\n", z1, z2);
      
      Output:
      -------
      x=ffff y=ffff
      z1=fffffffffffe0001 z2=fffe0001
      
      The expected result of ffff*ffff is fffe0001, and without the
      explicit casting to avoid the unwanted sign extension we got
      fffffffffffe0001.
      
      This commit adds an explicit casting to avoid the sign extension
      issue.
      
      Fixes: 689b2bda ("net: ena: add functions for handling Low Latency Queues in ena_com")
      Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
      Signed-off-by: David Arinzon <darinzon@amazon.com>
      Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      713a8519
    • Paolo Abeni's avatar
      Merge tag 'for-net-2024-04-10' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth · fe3eb406
      Paolo Abeni authored
      Luiz Augusto von Dentz says:
      
      ====================
      bluetooth pull request for net:
      
        - L2CAP: Don't double set the HCI_CONN_MGMT_CONNECTED bit
        - Fix memory leak in hci_req_sync_complete
        - hci_sync: Fix using the same interval and window for Coded PHY
        - Fix not validating setsockopt user input
      
      * tag 'for-net-2024-04-10' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth:
        Bluetooth: l2cap: Don't double set the HCI_CONN_MGMT_CONNECTED bit
        Bluetooth: hci_sock: Fix not validating setsockopt user input
        Bluetooth: ISO: Fix not validating setsockopt user input
        Bluetooth: L2CAP: Fix not validating setsockopt user input
        Bluetooth: RFCOMM: Fix not validating setsockopt user input
        Bluetooth: SCO: Fix not validating setsockopt user input
        Bluetooth: Fix memory leak in hci_req_sync_complete()
        Bluetooth: hci_sync: Fix using the same interval and window for Coded PHY
        Bluetooth: ISO: Don't reject BT_ISO_QOS if parameters are unset
      ====================
      
      Link: https://lore.kernel.org/r/20240410191610.4156653-1-luiz.dentz@gmail.com
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      fe3eb406
    • Michal Luczaj's avatar
      af_unix: Fix garbage collector racing against connect() · 47d8ac01
      Michal Luczaj authored
      Garbage collector does not take into account the risk of embryo getting
      enqueued during the garbage collection. If such embryo has a peer that
      carries SCM_RIGHTS, two consecutive passes of scan_children() may see a
      different set of children, leading to an incorrectly elevated inflight
      count and then a dangling pointer within the gc_inflight_list.
      
      sockets are AF_UNIX/SOCK_STREAM
      S is an unconnected socket
      L is a listening in-flight socket bound to addr, not in fdtable
      V's fd will be passed via sendmsg(), gets inflight count bumped
      
      connect(S, addr)	sendmsg(S, [V]); close(V)	__unix_gc()
      ----------------	-------------------------	-----------
      
      NS = unix_create1()
      skb1 = sock_wmalloc(NS)
      L = unix_find_other(addr)
      unix_state_lock(L)
      unix_peer(S) = NS
      			// V count=1 inflight=0
      
       			NS = unix_peer(S)
       			skb2 = sock_alloc()
      			skb_queue_tail(NS, skb2[V])
      
      			// V became in-flight
      			// V count=2 inflight=1
      
      			close(V)
      
      			// V count=1 inflight=1
      			// GC candidate condition met
      
      						for u in gc_inflight_list:
      						  if (total_refs == inflight_refs)
      						    add u to gc_candidates
      
      						// gc_candidates={L, V}
      
      						for u in gc_candidates:
      						  scan_children(u, dec_inflight)
      
      						// embryo (skb1) was not
      						// reachable from L yet, so V's
      						// inflight remains unchanged
      __skb_queue_tail(L, skb1)
      unix_state_unlock(L)
      						for u in gc_candidates:
      						  if (u.inflight)
      						    scan_children(u, inc_inflight_move_tail)
      
      						// V count=1 inflight=2 (!)
      
      If there is a GC-candidate listening socket, lock/unlock its state. This
      makes GC wait until the end of any ongoing connect() to that socket. After
      flipping the lock, a possibly SCM-laden embryo is already enqueued. And if
      there is another embryo coming, it can not possibly carry SCM_RIGHTS. At
      this point, unix_inflight() can not happen because unix_gc_lock is already
      taken. Inflight graph remains unaffected.
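The lock/unlock "flip" described above is a general pattern: taking and immediately dropping a lock waits for whoever currently holds it. A minimal userspace sketch with pthreads, assuming hypothetical names (`listener_lock`, `embryo_enqueued`, `gc_side()`, `run_flip_demo()`); it models only the ordering guarantee, not the AF_UNIX garbage collector.

```c
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t listener_lock = PTHREAD_MUTEX_INITIALIZER;
static int embryo_enqueued; /* written only while listener_lock is held */

/* GC side: take and drop the lock purely to wait out the current holder. */
static void *gc_side(void *seen)
{
	pthread_mutex_lock(&listener_lock);
	pthread_mutex_unlock(&listener_lock);

	/* Everything the holder did before unlocking is visible here. */
	*(int *)seen = embryo_enqueued;
	return NULL;
}

/* Returns what the GC side observed after flipping the lock. */
static int run_flip_demo(void)
{
	pthread_t gc;
	int seen = -1;
	int err;

	/* The caller plays connect(): it holds the listener's lock ... */
	pthread_mutex_lock(&listener_lock);
	embryo_enqueued = 0;

	err = pthread_create(&gc, NULL, gc_side, &seen);
	assert(err == 0);

	/* ... enqueues the embryo, and only then releases the lock. */
	embryo_enqueued = 1;
	pthread_mutex_unlock(&listener_lock);

	pthread_join(gc, NULL);
	return seen;
}
```

Because the GC thread cannot acquire the lock before the caller releases it, the value it reads after the flip is the one written inside the critical section.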
      
      Fixes: 1fd05ba5 ("[AF_UNIX]: Rewrite garbage collector, fixes race.")
      Signed-off-by: Michal Luczaj <mhal@rbox.co>
      Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
      Link: https://lore.kernel.org/r/20240409201047.1032217-1-mhal@rbox.co
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      47d8ac01
    • Arınç ÜNAL's avatar
      net: dsa: mt7530: trap link-local frames regardless of ST Port State · 17c56011
      Arınç ÜNAL authored
      In Clause 5 of IEEE Std 802-2014, two sublayers of the data link layer
      (DLL) of the Open Systems Interconnection basic reference model (OSI/RM)
      are described; the medium access control (MAC) and logical link control
      (LLC) sublayers. The MAC sublayer is the one facing the physical layer.
      
      In 8.2 of IEEE Std 802.1Q-2022, the Bridge architecture is described. A
      Bridge component comprises a MAC Relay Entity for interconnecting the Ports
      of the Bridge, at least two Ports, and higher layer entities with at least
      a Spanning Tree Protocol Entity included.
      
      Each Bridge Port also functions as an end station and shall provide the MAC
      Service to an LLC Entity. Each instance of the MAC Service is provided to a
      distinct LLC Entity that supports protocol identification, multiplexing,
      and demultiplexing, for protocol data unit (PDU) transmission and reception
      by one or more higher layer entities.
      
      It is described in 8.13.9 of IEEE Std 802.1Q-2022 that in a Bridge, the LLC
      Entity associated with each Bridge Port is modeled as being directly
      connected to the attached Local Area Network (LAN).
      
      On the switch with CPU port architecture, CPU port functions as Management
      Port, and the Management Port functionality is provided by software which
      functions as an end station. Software is connected to an IEEE 802 LAN that
      is wholly contained within the system that incorporates the Bridge.
      Software provides access to the LLC Entity associated with each Bridge Port
      by the value of the source port field on the special tag on the frame
      received by software.
      
      We call frames that carry control information to determine the active
      topology and current extent of each Virtual Local Area Network (VLAN),
      i.e., spanning tree or Shortest Path Bridging (SPB) and Multiple VLAN
      Registration Protocol Data Units (MVRPDUs), and frames from other link
      constrained protocols, such as Extensible Authentication Protocol over LAN
      (EAPOL) and Link Layer Discovery Protocol (LLDP), link-local frames. They
      are not forwarded by a Bridge. Permanently configured entries in the
      filtering database (FDB) ensure that such frames are discarded by the
      Forwarding Process. In 8.6.3 of IEEE Std 802.1Q-2022, this is described in
      detail:
      
      Each of the reserved MAC addresses specified in Table 8-1
      (01-80-C2-00-00-[00,01,02,03,04,05,06,07,08,09,0A,0B,0C,0D,0E,0F]) shall be
      permanently configured in the FDB in C-VLAN components and ERs.
      
      Each of the reserved MAC addresses specified in Table 8-2
      (01-80-C2-00-00-[01,02,03,04,05,06,07,08,09,0A,0E]) shall be permanently
      configured in the FDB in S-VLAN components.
      
      Each of the reserved MAC addresses specified in Table 8-3
      (01-80-C2-00-00-[01,02,04,0E]) shall be permanently configured in the FDB
      in TPMR components.
      
      The FDB entries for reserved MAC addresses shall specify filtering for all
      Bridge Ports and all VIDs. Management shall not provide the capability to
      modify or remove entries for reserved MAC addresses.
      
      The addresses in Table 8-1, Table 8-2, and Table 8-3 determine the scope of
      propagation of PDUs within a Bridged Network, as follows:
      
        The Nearest Bridge group address (01-80-C2-00-00-0E) is an address that
        no conformant Two-Port MAC Relay (TPMR) component, Service VLAN (S-VLAN)
        component, Customer VLAN (C-VLAN) component, or MAC Bridge can forward.
        PDUs transmitted using this destination address, or any other addresses
        that appear in Table 8-1, Table 8-2, and Table 8-3
        (01-80-C2-00-00-[00,01,02,03,04,05,06,07,08,09,0A,0B,0C,0D,0E,0F]), can
        therefore travel no further than those stations that can be reached via a
        single individual LAN from the originating station.
      
        The Nearest non-TPMR Bridge group address (01-80-C2-00-00-03), is an
        address that no conformant S-VLAN component, C-VLAN component, or MAC
        Bridge can forward; however, this address is relayed by a TPMR component.
        PDUs using this destination address, or any of the other addresses that
        appear in both Table 8-1 and Table 8-2 but not in Table 8-3
        (01-80-C2-00-00-[00,03,05,06,07,08,09,0A,0B,0C,0D,0F]), will be relayed
        by any TPMRs but will propagate no further than the nearest S-VLAN
        component, C-VLAN component, or MAC Bridge.
      
        The Nearest Customer Bridge group address (01-80-C2-00-00-00) is an
        address that no conformant C-VLAN component, MAC Bridge can forward;
        however, it is relayed by TPMR components and S-VLAN components. PDUs
        using this destination address, or any of the other addresses that appear
        in Table 8-1 but not in either Table 8-2 or Table 8-3
        (01-80-C2-00-00-[00,0B,0C,0D,0F]), will be relayed by TPMR components and
        S-VLAN components but will propagate no further than the nearest C-VLAN
        component or MAC Bridge.
      
      Because the LLC Entity associated with each Bridge Port is provided via CPU
      port, we must not filter these frames but forward them to CPU port.
      
      In a Bridge, the transmission Port is mainly decided by ingress and egress
      rules, FDB, and spanning tree Port State functions of the Forwarding
      Process. For link-local frames, only CPU port should be designated as
      destination port in the FDB, and the other functions of the Forwarding
      Process must not interfere with the decision of the transmission Port. We
      call this process trapping frames to CPU port.
      
      Therefore, on the switch with CPU port architecture, link-local frames must
      be trapped to CPU port, and certain link-local frames received by a Port of
      a Bridge comprising a TPMR component or an S-VLAN component must be
      excluded from it.
      
      A Bridge of the switch with CPU port architecture cannot comprise a
      Two-Port MAC Relay (TPMR) component as a TPMR component supports only a
      subset of the functionality of a MAC Bridge. A Bridge comprising two Ports
      (Management Port doesn't count) of this architecture will either function
      as a standard MAC Bridge or a standard VLAN Bridge.
      
      Therefore, a Bridge of this architecture can only comprise S-VLAN
      components, C-VLAN components, or MAC Bridge components. Since there's no
      TPMR component, we don't need to relay PDUs using the destination addresses
      specified in the Nearest non-TPMR section, and the portion of the
      Nearest Customer Bridge section where they must be relayed by TPMR
      components.
      
      One option to trap link-local frames to CPU port is to add static FDB
      entries with CPU port designated as destination port. However, because
      Independent VLAN Learning (IVL) is being used on every VID, each entry only
      applies to a single VLAN Identifier (VID). For a Bridge comprising a MAC
      Bridge component or a C-VLAN component, there would have to be 16 times
      4096 entries. This switch intellectual property can only hold a maximum of
      2048 entries. Using this option, there also isn't a mechanism to prevent
      link-local frames from being discarded when the spanning tree Port State of
      the reception Port is discarding.
      
      The remaining option is to utilise the BPC, RGAC1, RGAC2, RGAC3, and RGAC4
      registers. Whilst this applies to every VID, it doesn't contain all of the
      reserved MAC addresses without affecting the remaining Standard Group MAC
      Addresses. The REV_UN frame tag utilised using the RGAC4 register covers
      the remaining 01-80-C2-00-00-[04,05,06,07,08,09,0A,0B,0C,0D,0F] destination
      addresses. It also includes the 01-80-C2-00-00-22 to 01-80-C2-00-00-FF
      destination addresses which may be relayed by MAC Bridges or VLAN Bridges.
      The latter option provides better but not complete conformance.
      
      This switch intellectual property also does not provide a mechanism to trap
      link-local frames with specific destination addresses to CPU port by
      Bridge, to conform to the filtering rules for the distinct Bridge
      components.
      
      Therefore, regardless of the type of the Bridge component, link-local
      frames with these destination addresses will be trapped to CPU port:
      
      01-80-C2-00-00-[00,01,02,03,0E]
      
      In a Bridge comprising a MAC Bridge component or a C-VLAN component:
      
        Link-local frames with these destination addresses won't be trapped to
        CPU port which won't conform to IEEE Std 802.1Q-2022:
      
        01-80-C2-00-00-[04,05,06,07,08,09,0A,0B,0C,0D,0F]
      
      In a Bridge comprising an S-VLAN component:
      
        Link-local frames with these destination addresses will be trapped to CPU
        port which won't conform to IEEE Std 802.1Q-2022:
      
        01-80-C2-00-00-00
      
        Link-local frames with these destination addresses won't be trapped to
        CPU port which won't conform to IEEE Std 802.1Q-2022:
      
        01-80-C2-00-00-[04,05,06,07,08,09,0A]
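The address sets listed above can be expressed as two small predicates. This is a sketch of the classification only, with hypothetical helper names; it does not model the BPC/RGAC register programming, and the "trapped" set is simply the [00,01,02,03,0E] list stated in the text.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

static const uint8_t reserved_prefix[5] = { 0x01, 0x80, 0xc2, 0x00, 0x00 };

/* True for 01-80-C2-00-00-00 through 01-80-C2-00-00-0F. */
static bool is_reserved_link_local(const uint8_t mac[6])
{
	assert(mac != NULL);
	return memcmp(mac, reserved_prefix, sizeof(reserved_prefix)) == 0 &&
	       (mac[5] & 0xf0) == 0;
}

/* True for the subset the text lists as always trapped: [00,01,02,03,0E]. */
static bool is_trapped_to_cpu(const uint8_t mac[6])
{
	if (!is_reserved_link_local(mac))
		return false;
	return mac[5] <= 0x03 || mac[5] == 0x0e;
}
```

An address such as 01-80-C2-00-00-04 is reserved but falls outside the trapped subset, which is the non-conformance the lists above describe.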
      
      Currently on this switch intellectual property, if the spanning tree Port
      State of the reception Port is discarding, link-local frames will be
      discarded.
      
      To trap link-local frames regardless of the spanning tree Port State, make
      the switch regard them as Bridge Protocol Data Units (BPDUs). This switch
      intellectual property only lets the frames regarded as BPDUs bypass the
      spanning tree Port State function of the Forwarding Process.
      
      With this change, the only remaining interference is the ingress rules.
      When the reception Port has no PVID assigned on software, VLAN-untagged
      frames won't be allowed in. There doesn't seem to be a mechanism on the
      switch intellectual property to have link-local frames bypass this function
      of the Forwarding Process.
      
      Fixes: b8f126a8 ("net-next: dsa: add dsa support for Mediatek MT7530 switch")
      Reviewed-by: Daniel Golle <daniel@makrotopia.org>
      Signed-off-by: Arınç ÜNAL <arinc.unal@arinc9.com>
      Link: https://lore.kernel.org/r/20240409-b4-for-net-mt7530-fix-link-local-when-stp-discarding-v2-1-07b1150164ac@arinc9.com
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      17c56011
    • Gerd Bayer's avatar
      Revert "s390/ism: fix receive message buffer allocation" · d51dc8dd
      Gerd Bayer authored
      This reverts commit 58effa34.
      Review was not finished on this patch. So it's not ready for
      upstreaming.
      Signed-off-by: Gerd Bayer <gbayer@linux.ibm.com>
      Link: https://lore.kernel.org/r/20240409113753.2181368-1-gbayer@linux.ibm.com
      Fixes: 58effa34 ("s390/ism: fix receive message buffer allocation")
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      d51dc8dd
    • Daniel Machon's avatar
      net: sparx5: fix wrong config being used when reconfiguring PCS · 33623113
      Daniel Machon authored
      The wrong port config is being used if the PCS is reconfigured. Fix this
      by correctly using the new config instead of the old one.
      
      Fixes: 946e7fd5 ("net: sparx5: add port module support")
      Signed-off-by: Daniel Machon <daniel.machon@microchip.com>
      Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
      Link: https://lore.kernel.org/r/20240409-link-mode-reconfiguration-fix-v2-1-db6a507f3627@microchip.com
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      33623113
    • Dave Airlie's avatar
      Merge tag 'amd-drm-fixes-6.9-2024-04-10' of... · b4589db5
      Dave Airlie authored
      Merge tag 'amd-drm-fixes-6.9-2024-04-10' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes
      
      amd-drm-fixes-6.9-2024-04-10:
      
      amdgpu:
      - GPU reset fixes
      - Fix some confusing logging
      - UMSCH fix
      - Aborted suspend fix
      - DCN 3.5 fixes
      - S4 fix
      - MES logging fixes
      - SMU 14 fixes
      - SDMA 4.4.2 fix
      - KASAN fix
      - SMU 13.0.10 fix
      - VCN partition fix
      - GFX11 fixes
      - DWB fixes
      - Plane handling fix
      - FAMS fix
      - DCN 3.1.6 fix
      - VSC SDP fixes
      - OLED panel fix
      - GFX 11.5 fix
      
      amdkfd:
      - GPU reset fixes
      Signed-off-by: Dave Airlie <airlied@redhat.com>
      
      From: Alex Deucher <alexander.deucher@amd.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20240411013425.6431-1-alexander.deucher@amd.com
      b4589db5
    • Dave Airlie's avatar
      Merge tag 'drm-intel-fixes-2024-04-10' of... · aaf00e61
      Dave Airlie authored
      Merge tag 'drm-intel-fixes-2024-04-10' of https://anongit.freedesktop.org/git/drm/drm-intel into drm-fixes
      
      Display fixes:
      - Couple CDCLK programming fixes (Ville)
      - HDCP related fix (Suraj)
      - 4 Bigjoiner related fixes (Ville)
      
      Core fix:
      - Fix for a circular locking around GuC on reset+wedged case (John)
      Signed-off-by: Dave Airlie <airlied@redhat.com>
      
      From: Rodrigo Vivi <rodrigo.vivi@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/ZhcJxlzc6zLMC1c-@intel.com
      aaf00e61
    • Arnd Bergmann's avatar
      net/mlx5: fix possible stack overflows · fe87922c
      Arnd Bergmann authored
      A couple of debug functions use a 512 byte temporary buffer and call another
      function that has another buffer of the same size, which in turn exceeds the
      usual warning limit for excessive stack usage:
      
      drivers/net/ethernet/mellanox/mlx5/core/steering/dr_dbg.c:1073:1: error: stack frame size (1448) exceeds limit (1024) in 'dr_dump_start' [-Werror,-Wframe-larger-than]
      dr_dump_start(struct seq_file *file, loff_t *pos)
      drivers/net/ethernet/mellanox/mlx5/core/steering/dr_dbg.c:1009:1: error: stack frame size (1120) exceeds limit (1024) in 'dr_dump_domain' [-Werror,-Wframe-larger-than]
      dr_dump_domain(struct seq_file *file, struct mlx5dr_domain *dmn)
      drivers/net/ethernet/mellanox/mlx5/core/steering/dr_dbg.c:705:1: error: stack frame size (1104) exceeds limit (1024) in 'dr_dump_matcher_rx_tx' [-Werror,-Wframe-larger-than]
      dr_dump_matcher_rx_tx(struct seq_file *file, bool is_rx,
      
      Rework these so that each of the various code paths only ever has one of
      these buffers in it, and exactly the functions that declare one have
      the 'noinline_for_stack' annotation that prevents them from all being
      inlined into the same caller.
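The shape of that rework can be sketched in userspace C. The function names and output strings below are hypothetical; the only facts relied on are the 512-byte buffer size from the text and that the kernel's `noinline_for_stack` expands to `noinline`, which is approximated here with the GCC/Clang attribute.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define DUMP_BUF_LEN 512
/* Userspace stand-in for the kernel annotation of the same name. */
#define noinline_for_stack __attribute__((noinline))

/* Each leaf helper owns exactly one buffer and is never inlined. */
static noinline_for_stack int dump_table(char *out, size_t len, int id)
{
	char buf[DUMP_BUF_LEN];

	snprintf(buf, sizeof(buf), "table %d\n", id);
	return snprintf(out, len, "%s", buf);
}

static noinline_for_stack int dump_rule(char *out, size_t len, int id)
{
	char buf[DUMP_BUF_LEN];

	snprintf(buf, sizeof(buf), "rule %d\n", id);
	return snprintf(out, len, "%s", buf);
}

/* The caller declares no buffer of its own; it only sequences the helpers. */
static int dump_all(char *out, size_t len)
{
	int n, m;

	assert(out != NULL && len > 0);
	n = dump_table(out, len, 1);
	if (n < 0 || (size_t)n >= len)
		return n; /* error or truncated: nothing more fits */

	m = dump_rule(out + n, len - (size_t)n, 7);
	return m < 0 ? m : n + m;
}
```

Because the two helpers cannot be inlined into `dump_all()`, their buffers occupy the stack one at a time rather than being summed into a single frame.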
      
      Fixes: 917d1e79 ("net/mlx5: DR, Change SWS usage to debug fs seq_file interface")
      Reviewed-by: Simon Horman <horms@kernel.org>
      Reviewed-by: Jiri Pirko <jiri@nvidia.com>
      Link: https://lore.kernel.org/all/20240219100506.648089-1-arnd@kernel.org/
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Acked-by: Tariq Toukan <tariqt@nvidia.com>
      Link: https://lore.kernel.org/r/20240408074142.3007036-1-arnd@kernel.org
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      fe87922c
    • Jakub Kicinski's avatar
      Merge branch 'mlx5-misc-fixes' · 186abfcd
      Jakub Kicinski authored
      Tariq Toukan says:
      
      ====================
      mlx5 misc fixes
      
      This patchset provides bug fixes to mlx5 driver.
      
      This is V2 of the series previously submitted as PR by Saeed:
      https://lore.kernel.org/netdev/20240326144646.2078893-1-saeed@kernel.org/T/
      
      Series generated against:
      commit 237f3cf1 ("xsk: validate user input for XDP_{UMEM|COMPLETION}_FILL_RING")
      ====================
      
      Link: https://lore.kernel.org/r/20240409190820.227554-1-tariqt@nvidia.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      186abfcd
    • Tariq Toukan's avatar
      net/mlx5: Disallow SRIOV switchdev mode when in multi-PF netdev · 7772dc74
      Tariq Toukan authored
      Adaptations need to be made for the auxiliary device management in the
      core driver level. Block this combination for now.
      
      Fixes: 678eb448 ("net/mlx5: SD, Implement basic query and instantiation")
      Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
      Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
      Reviewed-by: Gal Pressman <gal@nvidia.com>
      Link: https://lore.kernel.org/r/20240409190820.227554-12-tariqt@nvidia.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      7772dc74
    • Carolina Jubran's avatar
      net/mlx5e: RSS, Block XOR hash with over 128 channels · 49e6c938
      Carolina Jubran authored
      When supporting more than 128 channels, the RQT size is
      calculated by multiplying the number of channels by 2
      and rounding up to the nearest power of 2.
      
      The index of the RQT is derived from the RSS hash
      calculations. If XOR8 is used as the RSS hash function,
      there are only 256 possible hash results, and therefore,
      only 256 indexes can be reached in the RQT.
      
      Block setting the RSS hash function to XOR when the number
      of channels exceeds 128.
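The arithmetic behind the 128-channel limit can be checked with a short sketch. The helper names are hypothetical, and the doubling rule is applied to every channel count here for simplicity, whereas the text only states it for configurations above 128 channels.

```c
#include <assert.h>
#include <stdbool.h>

#define XOR8_HASH_RESULTS 256u /* an 8-bit hash has 256 possible values */

/* Smallest power of two that is >= v. */
static unsigned int roundup_pow2(unsigned int v)
{
	unsigned int p = 1;

	assert(v > 0 && v <= (1u << 30));
	while (p < v)
		p <<= 1;
	return p;
}

/* Table size as described in the text: channels * 2, rounded up. */
static unsigned int rqt_size(unsigned int channels)
{
	assert(channels > 0 && channels <= (1u << 29));
	return roundup_pow2(channels * 2);
}

/* XOR8 can only index the first 256 entries of the table. */
static bool xor8_reaches_whole_table(unsigned int channels)
{
	return rqt_size(channels) <= XOR8_HASH_RESULTS;
}
```

At 128 channels the table has 256 entries and every one is reachable; at 129 channels the table grows to 512 entries and half of them could never be selected by an 8-bit hash.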
      
      Fixes: 74a8dada ("net/mlx5e: Preparations for supporting larger number of channels")
      Signed-off-by: Carolina Jubran <cjubran@nvidia.com>
      Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
      Link: https://lore.kernel.org/r/20240409190820.227554-11-tariqt@nvidia.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      49e6c938
    • Rahul Rameshbabu's avatar
      net/mlx5e: Do not produce metadata freelist entries in Tx port ts WQE xmit · 86b0ca5b
      Rahul Rameshbabu authored
      Free Tx port timestamping metadata entries in the NAPI poll context and
      consume metadata entries in the WQE xmit path. Do not free a Tx port
      timestamping metadata entry in the WQE xmit path even in the error path to
      avoid a race between two metadata entry producers.
      
      Fixes: 3178308a ("net/mlx5e: Make tx_port_ts logic resilient to out-of-order CQEs")
      Signed-off-by: Rahul Rameshbabu <rrameshbabu@nvidia.com>
      Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
      Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
      Link: https://lore.kernel.org/r/20240409190820.227554-10-tariqt@nvidia.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      86b0ca5b
    • Carolina Jubran's avatar
      net/mlx5e: HTB, Fix inconsistencies with QoS SQs number · 2f436f18
      Carolina Jubran authored
      When creating a new HTB class while the interface is down,
      the variable that tracks the number of QoS SQs (htb_max_qos_sqs)
      may not be consistent with the number of HTB classes.
      
      Previously, we compared these two values to ensure that
      the node_qid is lower than the number of QoS SQs, and we
      allocated stats for that SQ when they are equal.
      
      Change the check to compare the node_qid with the current
      number of leaf nodes and fix the checking conditions to
      ensure allocation of stats_list and stats for each node.
      
      Fixes: 214baf22 ("net/mlx5e: Support HTB offload")
      Signed-off-by: Carolina Jubran <cjubran@nvidia.com>
      Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
      Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
      Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
      Link: https://lore.kernel.org/r/20240409190820.227554-9-tariqt@nvidia.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      2f436f18
    • Carolina Jubran's avatar
      net/mlx5e: Fix mlx5e_priv_init() cleanup flow · ecb82945
      Carolina Jubran authored
      When mlx5e_priv_init() fails, the cleanup flow calls mlx5e_selq_cleanup(), which
      calls mlx5e_selq_apply(), which asserts that `priv->state_lock` is held using
      lockdep_is_held().
      
      Acquire the state_lock in mlx5e_selq_cleanup().
      
      Kernel log:
      =============================
      WARNING: suspicious RCU usage
      6.8.0-rc3_net_next_841a9b5 #1 Not tainted
      -----------------------------
      drivers/net/ethernet/mellanox/mlx5/core/en/selq.c:124 suspicious rcu_dereference_protected() usage!
      
      other info that might help us debug this:
      
      rcu_scheduler_active = 2, debug_locks = 1
      2 locks held by systemd-modules/293:
       #0: ffffffffa05067b0 (devices_rwsem){++++}-{3:3}, at: ib_register_client+0x109/0x1b0 [ib_core]
       #1: ffff8881096c65c0 (&device->client_data_rwsem){++++}-{3:3}, at: add_client_context+0x104/0x1c0 [ib_core]
      
      stack backtrace:
      CPU: 4 PID: 293 Comm: systemd-modules Not tainted 6.8.0-rc3_net_next_841a9b5 #1
      Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
      Call Trace:
       <TASK>
       dump_stack_lvl+0x8a/0xa0
       lockdep_rcu_suspicious+0x154/0x1a0
       mlx5e_selq_apply+0x94/0xa0 [mlx5_core]
       mlx5e_selq_cleanup+0x3a/0x60 [mlx5_core]
       mlx5e_priv_init+0x2be/0x2f0 [mlx5_core]
       mlx5_rdma_setup_rn+0x7c/0x1a0 [mlx5_core]
       rdma_init_netdev+0x4e/0x80 [ib_core]
       ? mlx5_rdma_netdev_free+0x70/0x70 [mlx5_core]
       ipoib_intf_init+0x64/0x550 [ib_ipoib]
       ipoib_intf_alloc+0x4e/0xc0 [ib_ipoib]
       ipoib_add_one+0xb0/0x360 [ib_ipoib]
       add_client_context+0x112/0x1c0 [ib_core]
       ib_register_client+0x166/0x1b0 [ib_core]
       ? 0xffffffffa0573000
       ipoib_init_module+0xeb/0x1a0 [ib_ipoib]
       do_one_initcall+0x61/0x250
       do_init_module+0x8a/0x270
       init_module_from_file+0x8b/0xd0
       idempotent_init_module+0x17d/0x230
       __x64_sys_finit_module+0x61/0xb0
       do_syscall_64+0x71/0x140
       entry_SYSCALL_64_after_hwframe+0x46/0x4e
       </TASK>
      
      Fixes: 8bf30be7 ("net/mlx5e: Introduce select queue parameters")
      Signed-off-by: Carolina Jubran <cjubran@nvidia.com>
      Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
      Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
      Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
      Link: https://lore.kernel.org/r/20240409190820.227554-8-tariqt@nvidia.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      ecb82945
    • Carolina Jubran's avatar
      net/mlx5e: RSS, Block changing channels number when RXFH is configured · ee357240
      Carolina Jubran authored
      Changing the channels number after configuring the receive flow hash
      indirection table may affect the RSS table size. The previous
      configuration may no longer be compatible with the new receive flow
      hash indirection table.
      
      Block changing the channels number when RXFH is configured and changing
      the channels number requires resizing the RSS table size.
      
      Fixes: 74a8dada ("net/mlx5e: Preparations for supporting larger number of channels")
      Signed-off-by: Carolina Jubran <cjubran@nvidia.com>
      Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
      Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
      Link: https://lore.kernel.org/r/20240409190820.227554-7-tariqt@nvidia.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      ee357240
    • Cosmin Ratiu's avatar
      net/mlx5: Correctly compare pkt reformat ids · 9eca93f4
      Cosmin Ratiu authored
      struct mlx5_pkt_reformat contains a naked union of a u32 id and a
      dr_action pointer which is used when the action is SW-managed (when
      pkt_reformat.owner is set to MLX5_FLOW_RESOURCE_OWNER_SW). Using id
      directly in that case is incorrect, as it maps to the least significant
      32 bits of the 64-bit pointer in mlx5_fs_dr_action and not to the pkt
      reformat id allocated in firmware.
      
      For the purpose of comparing whether two rules are identical,
      interpreting the least significant 32 bits of the mlx5_fs_dr_action
      pointer as an id mostly works... until it breaks horribly and produces
      the outcome described in [1].
      
      This patch fixes mlx5_flow_dests_cmp to correctly compare ids using
      mlx5_fs_dr_action_get_pkt_reformat_id for the SW-managed rules.
      
      Link: https://lore.kernel.org/netdev/ea5264d6-6b55-4449-a602-214c6f509c1e@163.com/T/#u [1]
      
      Fixes: 6a48faee ("net/mlx5: Add direct rule fs_cmd implementation")
      Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com>
      Reviewed-by: Mark Bloch <mbloch@nvidia.com>
      Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
      Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
      Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
      Link: https://lore.kernel.org/r/20240409190820.227554-6-tariqt@nvidia.com
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      9eca93f4