1. 20 Feb, 2023 16 commits
    • ipv6: icmp6: add drop reason support to ndisc_recv_ns() · 7c9c8913
      Eric Dumazet authored
      Change ndisc_recv_ns() to return a drop reason.
      
      For the moment, return PKT_TOO_SMALL, NOT_SPECIFIED
      or SKB_CONSUMED.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: add location to trace_consume_skb() · dd1b5278
      Eric Dumazet authored
      kfree_skb() includes the location, it makes sense
      to add it to consume_skb() as well.
      
      After patch:
      
       taskd_EventMana  8602 [004]   420.406239: skb:consume_skb: skbaddr=0xffff893a4a6d0500 location=unix_stream_read_generic
               swapper     0 [011]   422.732607: skb:consume_skb: skbaddr=0xffff89597f68cee0 location=mlx4_en_free_tx_desc
            discipline  9141 [043]   423.065653: skb:consume_skb: skbaddr=0xffff893a487e9c00 location=skb_consume_udp
               swapper     0 [010]   423.073166: skb:consume_skb: skbaddr=0xffff8949ce9cdb00 location=icmpv6_rcv
               borglet  8672 [014]   425.628256: skb:consume_skb: skbaddr=0xffff8949c42e9400 location=netlink_dump
               swapper     0 [028]   426.263317: skb:consume_skb: skbaddr=0xffff893b1589dce0 location=net_rx_action
                  wget 14339 [009]   426.686380: skb:consume_skb: skbaddr=0xffff893a51b552e0 location=tcp_rcv_state_process
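The trace lines above have a regular shape, so they can be summarized per call site. The following is a minimal Python sketch, assuming the text layout shown in the sample output; the helper names `parse_consume_skb` and `count_locations` are illustrative and not part of any kernel or perf tooling.

```python
import re

# Matches the payload of an skb:consume_skb event as printed in the sample output.
EVENT_RE = re.compile(
    r"skb:consume_skb:\s+skbaddr=(?P<skbaddr>0x[0-9a-fA-F]+)\s+location=(?P<location>\S+)"
)


def parse_consume_skb(line):
    """Return (skbaddr, location) for a consume_skb trace line, or None."""
    match = EVENT_RE.search(line)
    if match is None:
        return None
    return int(match.group("skbaddr"), 16), match.group("location")


def count_locations(lines):
    """Count how many skbs each call site consumed."""
    counts = {}
    for line in lines:
        parsed = parse_consume_skb(line)
        if parsed is not None:
            _, location = parsed
            counts[location] = counts.get(location, 0) + 1
    return counts


sample = [
    "swapper     0 [010]   423.073166: skb:consume_skb: skbaddr=0xffff8949ce9cdb00 location=icmpv6_rcv",
    "wget 14339 [009]   426.686380: skb:consume_skb: skbaddr=0xffff893a51b552e0 location=tcp_rcv_state_process",
]
print(count_locations(sample))
```

Grouping by `location` is the main benefit of the new field: it shows which code paths consume the most skbs without needing stack traces.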
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • xsk: support use vaddr as ring · 9f78bf33
      Xuan Zhuo authored
      When we try to start AF_XDP on machines that have been running for a
      long time, memory fragmentation can leave too little contiguous
      physical memory, which causes the start to fail.
      
      If the size of the queue is 8 * 1024, then the size of the desc[] is
      8 * 1024 * 8 = 16 * PAGE, but we also add the struct xdp_ring size, so
      it is slightly more than 16 pages. This requires an order-5 allocation.
      If there are a lot of queues, such allocations are difficult to satisfy
      on these long-running machines.

      We also waste 15 pages: an order-5 allocation is 32 pages, but we only
      use 17 pages.
      
      This patch replaces __get_free_pages() by vmalloc() to allocate memory
      to solve these problems.
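The page arithmetic behind this change can be checked with a small model. This is a Python sketch of power-of-two page rounding, not kernel code; it assumes a 4096-byte page, and `header_size` is a hypothetical placeholder for the struct xdp_ring header.

```python
PAGE_SIZE = 4096  # assumed page size


def pages_needed(nbytes):
    """Pages needed when allocating at page granularity (as vmalloc() does)."""
    return -(-nbytes // PAGE_SIZE)  # ceiling division


def alloc_order(nbytes):
    """Smallest order such that 2**order contiguous pages hold nbytes."""
    pages = pages_needed(nbytes)
    order = 0
    while (1 << order) < pages:
        order += 1
    return order


entries = 8 * 1024
entry_size = 8
header_size = 64  # hypothetical stand-in for the struct xdp_ring header
ring_bytes = header_size + entries * entry_size

order = alloc_order(ring_bytes)
used = pages_needed(ring_bytes)
allocated = 1 << order
print(order, used, allocated, allocated - used)  # prints: 5 17 32 15
```

Any header size between 1 byte and one page gives the same result: 17 pages are used, and a physically contiguous power-of-two allocation rounds that up to 32 pages (2**5), leaving 15 unused.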
      Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
      Acked-by: Magnus Karlsson <magnus.karlsson@intel.com>
      Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'taprio-queuemaxsdu-fixes' · b148d400
      Paolo Abeni authored
      Vladimir Oltean says:
      
      ====================
      taprio queueMaxSDU fixes
      
      This fixes 3 issues noticed while attempting to reoffload the
      dynamically calculated queueMaxSDU values. These are:
      - Dynamic queueMaxSDU is not calculated correctly due to a lost patch
      - Dynamically calculated queueMaxSDU needs to be clamped on the low end
      - Dynamically calculated queueMaxSDU needs to be clamped on the high end
      ====================
      
      Link: https://lore.kernel.org/r/20230215224632.2532685-1-vladimir.oltean@nxp.com
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • net/sched: taprio: dynamic max_sdu larger than the max_mtu is unlimited · 64cb6aad
      Vladimir Oltean authored
      It makes no sense to keep randomly large max_sdu values, especially if
      larger than the device's max_mtu. These are visible in "tc qdisc show".
      Such a max_sdu is practically unlimited and will cause no packets for
      that traffic class to be dropped on enqueue.
      
      Just set max_sdu_dynamic to U32_MAX, which in the logic below causes
      taprio to save a max_frm_len of U32_MAX and a max_sdu presented to user
      space of 0 (unlimited).
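The high-end clamp described above can be sketched as follows. This is a Python model under stated assumptions, not the taprio implementation; the function names are illustrative, and only the U32_MAX and 0 conventions come from the commit message.

```python
U32_MAX = 0xFFFFFFFF


def clamp_dynamic_max_sdu(max_sdu_dynamic, max_mtu):
    """Treat any dynamic limit above the device's max_mtu as unlimited."""
    if max_sdu_dynamic > max_mtu:
        return U32_MAX
    return max_sdu_dynamic


def max_sdu_for_user_space(max_sdu_dynamic):
    """Report an unlimited value (U32_MAX) as 0, as the commit describes."""
    return 0 if max_sdu_dynamic == U32_MAX else max_sdu_dynamic


print(max_sdu_for_user_space(clamp_dynamic_max_sdu(9000, 1500)))  # prints: 0
print(max_sdu_for_user_space(clamp_dynamic_max_sdu(1000, 1500)))  # prints: 1000
```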
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • net/sched: taprio: don't allow dynamic max_sdu to go negative after stab adjustment · bdf366bd
      Vladimir Oltean authored
      The overhead specified in the size table comes from the user. With small
      time intervals (or gates always closed), the overhead can be larger than
      the max interval for that traffic class, and their difference is
      negative.
      
      What we want to happen is for max_sdu_dynamic to have the smallest
      non-zero value possible (1) which means that all packets on that traffic
      class are dropped on enqueue. However, since max_sdu_dynamic is u32, a
      negative is represented as a large value and oversized dropping never
      happens.
      
      Use max_t with int to force a truncation of max_frm_len to no smaller
      than dev->hard_header_len + 1, which in turn makes max_sdu_dynamic no
      smaller than 1.
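The wraparound and the clamp can be illustrated with a small model. This is a Python sketch that emulates u32 arithmetic with a mask; it is not the taprio code. The value 14 is the Ethernet header length, used here as an example hard_header_len, and the negative max_frm_len stands for an interval shorter than the user-supplied overhead.

```python
U32_MASK = 0xFFFFFFFF


def as_u32(value):
    """Reinterpret a Python int as an unsigned 32-bit value."""
    return value & U32_MASK


def max_sdu_unclamped(max_frm_len, hard_header_len):
    """Without a clamp, a negative difference wraps to a huge u32."""
    return as_u32(max_frm_len - hard_header_len)


def max_sdu_clamped(max_frm_len, hard_header_len):
    """Clamp with a signed comparison first, in the spirit of max_t(int, ...)."""
    max_frm_len = max(max_frm_len, hard_header_len + 1)
    return as_u32(max_frm_len - hard_header_len)


print(max_sdu_unclamped(-20, 14))  # prints: 4294967262
print(max_sdu_clamped(-20, 14))    # prints: 1
```

The comparison has to be signed: if the negative length were first converted to u32, it would already look larger than `hard_header_len + 1` and the clamp would have no effect.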
      
      Fixes: fed87cc6 ("net/sched: taprio: automatically calculate queueMaxSDU based on TC gate durations")
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • net/sched: taprio: fix calculation of maximum gate durations · 09dbdf28
      Vladimir Oltean authored
      taprio_calculate_gate_durations() depends on netdev_get_num_tc() and
      this returns 0. So it calculates the maximum gate durations for no
      traffic class.
      
      I had tested the blamed commit only with another patch in my tree, one
      which in the end I decided isn't valuable enough to submit ("net/sched:
      taprio: mask off bits in gate mask that exceed number of TCs").
      
      The problem is that having this patch threw off my testing. By moving
      the netdev_set_num_tc() call earlier, we implicitly gave to
      taprio_calculate_gate_durations() the information it needed.
      
      Extract only the portion from the unsubmitted change which applies the
      mqprio configuration to the netdev earlier.
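The ordering problem can be shown with a toy model. This Python sketch is a deliberate simplification: it records the longest single schedule entry per traffic class rather than reproducing the real duration calculation, and the `NetDev` class and its field are illustrative stand-ins.

```python
class NetDev:
    """Toy device; num_tc models what netdev_get_num_tc() would report."""

    def __init__(self):
        self.num_tc = 0  # not configured yet


def calculate_gate_durations(dev, entries):
    """entries: list of (gate_mask, interval). Longest open entry per TC."""
    max_open = [0] * dev.num_tc
    for gate_mask, interval in entries:
        for tc in range(dev.num_tc):
            if gate_mask & (1 << tc):
                max_open[tc] = max(max_open[tc], interval)
    return max_open


entries = [(0b01, 300), (0b10, 500), (0b01, 200)]
dev = NetDev()
print(calculate_gate_durations(dev, entries))  # prints: []

dev.num_tc = 2  # apply the traffic class count before calculating
print(calculate_gate_durations(dev, entries))  # prints: [300, 500]
```

With zero traffic classes the loop body never runs, so nothing is computed and no error is raised, which is why the problem was easy to miss.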
      
      Link: https://patchwork.kernel.org/project/netdevbpf/patch/20230130173145.475943-15-vladimir.oltean@nxp.com/
      Fixes: a306a90c ("net/sched: taprio: calculate tc gate durations")
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • rxrpc: Fix overproduction of wakeups to recvmsg() · c0783818
      David Howells authored
      Fix three cases of overproduction of wakeups:
      
       (1) rxrpc_input_split_jumbo() conditionally notifies the app that there's
           data for recvmsg() to collect if it queues some data - and then its
           only caller, rxrpc_input_data(), goes and wakes up recvmsg() anyway.
      
           Fix rxrpc_input_data() to only do the wakeup in failure cases.
      
       (2) If a DATA packet is received for a call by the I/O thread whilst
           recvmsg() is busy draining the call's rx queue in the app thread, the
           call will be left on the recvmsg() queue for recvmsg() to pick up, even
           though there isn't any data on it.
      
           This can cause an unexpected recvmsg() with a 0 return and no MSG_EOR
           set after the reply has been posted to a service call.
      
           Fix this by discarding pending calls from the recvmsg() queue that
           don't need servicing yet.
      
       (3) Not-yet-completed calls get requeued after having data read from them,
           even if they have no data to read.
      
           Fix this by only requeuing them if they have data waiting on them; if
           they don't, the I/O thread will requeue them when data arrives or they
           fail.
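Cases (2) and (3) above amount to a queueing policy that can be modeled in a few lines. This is a toy Python sketch, not the rxrpc implementation; the `Call` class, its fields, and `recvmsg_once` are hypothetical names chosen for illustration.

```python
from collections import deque


class Call:
    """Toy call with a receive queue and a completion flag."""

    def __init__(self, name, packets=(), completed=False):
        self.name = name
        self.rx_queue = deque(packets)
        self.completed = completed


def recvmsg_once(recvmsg_queue):
    """Service one call; return (name, data), or None if nothing needs service."""
    while recvmsg_queue:
        call = recvmsg_queue.popleft()
        if not call.rx_queue and not call.completed:
            # Case (2): discard a pending call that does not need servicing yet.
            continue
        data = call.rx_queue.popleft() if call.rx_queue else None
        if call.rx_queue:
            # Case (3): requeue only if more data is already waiting.
            recvmsg_queue.append(call)
        return call.name, data
    return None


queue = deque([Call("idle"), Call("busy", ["p1", "p2"])])
print(recvmsg_once(queue))  # prints: ('busy', 'p1')
print(recvmsg_once(queue))  # prints: ('busy', 'p2')
print(recvmsg_once(queue))  # prints: None
```

In this model a call with an empty queue never produces a spurious zero-length read; it is expected to be requeued by whoever delivers its next packet.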
      Signed-off-by: David Howells <dhowells@redhat.com>
      cc: Marc Dionne <marc.dionne@auristor.com>
      cc: linux-afs@lists.infradead.org
      Link: https://lore.kernel.org/r/3386149.1676497685@warthog.procyon.org.uk
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • Merge branch 'net-final-gsi-register-updates' · d269ac13
      Paolo Abeni authored
      Alex Elder says:
      
      ====================
      net: final GSI register updates
      
      I believe this is the last set of changes required to allow IPA v5.0
      to be supported.  There is a little cleanup work remaining, but that
      can happen in the next Linux release cycle.  Otherwise we just need
      config data and register definitions for IPA v5.0 (and DTS updates).
      These are ready but won't be posted without further testing.
      
      The first patch in this series fixes a minor bug in a patch just
      posted, which I found too late.  The second eliminates the GSI
      memory "adjustment"; this was done previously to avoid/delay the
      need to implement a more general way to define GSI register offsets.
      Note that this patch causes "checkpatch" warnings due to indentation
      that aligns with an open parenthesis.
      
      The third patch makes use of the newly-defined register offsets, to
      eliminate the need for a function that hid a few details.  The next
      modifies a different helper function to work properly for IPA v5.0+.
      The fifth patch changes the way the event ring size is specified
      based on how it's now done for IPA v5.0+.  And the last defines a
      new register required for IPA v5.0+.
      ====================
      
      Link: https://lore.kernel.org/r/20230215195352.755744-1-elder@linaro.org
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • net: ipa: add HW_PARAM_4 GSI register · f651334e
      Alex Elder authored
      Starting at IPA v5.0, the number of event rings per EE is defined
      in a field in a new HW_PARAM_4 GSI register rather than HW_PARAM_2.
      Define this new register and its fields, and update the code that
      checks the number of rings supported by hardware to use the proper
      field based on IPA version.
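Version-dependent field selection of this kind can be sketched as below. This is a Python model, not the IPA driver; the field positions in `FIELDS` are hypothetical, and only the register names and the v5.0 cutover come from the commit message.

```python
# Hypothetical (shift, width) layouts; the real ones live in the driver's
# register definition files.
FIELDS = {
    "HW_PARAM_2": (3, 5),
    "HW_PARAM_4": (0, 8),
}


def field_get(value, shift, width):
    """Extract a bit field from a register value."""
    return (value >> shift) & ((1 << width) - 1)


def num_event_rings(ipa_version, regs):
    """Read the event ring count from the register that holds it."""
    name = "HW_PARAM_4" if ipa_version >= (5, 0) else "HW_PARAM_2"
    shift, width = FIELDS[name]
    return field_get(regs[name], shift, width)


regs = {"HW_PARAM_2": 13 << 3, "HW_PARAM_4": 31}
print(num_event_rings((4, 5), regs))  # prints: 13
print(num_event_rings((5, 0), regs))  # prints: 31
```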
      Signed-off-by: Alex Elder <elder@linaro.org>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • net: ipa: support different event ring encoding · 37cd29ec
      Alex Elder authored
      Starting with IPA v5.0, a channel's event ring index is encoded in
      a field in the CH_C_CNTXT_1 GSI register rather than CH_C_CNTXT_0.
      Define a new field ID for the former register and encode the event
      ring in the appropriate register.
      Signed-off-by: Alex Elder <elder@linaro.org>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • net: ipa: avoid setting an undefined field · 62747512
      Alex Elder authored
      The GSI channel protocol field in the CH_C_CNTXT_0 GSI register is
      widened starting with IPA v5.0, making the CHTYPE_PROTOCOL_MSB field
      added in IPA v4.5 unnecessary.  Update the code to reflect this.
      Signed-off-by: Alex Elder <elder@linaro.org>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • net: ipa: kill ev_ch_e_cntxt_1_length_encode() · f75f44dd
      Alex Elder authored
      Now that we explicitly define each register field width there is no
      need to have a special encoding function for the event ring length.
      Add a field for this to the EV_CH_E_CNTXT_1 GSI register, and use it
      in place of ev_ch_e_cntxt_1_length_encode() (which can be removed).
      Signed-off-by: Alex Elder <elder@linaro.org>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • net: ipa: kill gsi->virt_raw · 59b12b1d
      Alex Elder authored
      Starting at IPA v4.5, almost all GSI registers had their offsets
      changed by a fixed amount (shifted downward by 0xd000).  Rather than
      defining offsets for all those registers dependent on version, an
      adjustment was applied for most register accesses.  This was
      implemented in commit cdeee49f ("net: ipa: adjust GSI register
      addresses").  It was later modified to be a bit more obvious about
      the adjustment, in commit 571b1e7e ("net: ipa: use a separate
      pointer for adjusted GSI memory").
      
      We now are able to define every GSI register with its own offset, so
      there's no need to implement this special adjustment.
      
      So get rid of the "virt_raw" pointer, and just maintain "virt" as
      the (non-adjusted) base address of I/O mapped GSI register memory.
      
      Redefine the offsets of all GSI registers (other than the INTER_EE
      ones, which were not subject to the adjustment) for IPA v4.5+,
      subtracting 0xd000 from their defined offsets instead.
      
      Move the ERROR_LOG and ERROR_LOG_CLR definitions further down in the
      register definition files so all registers are defined in order of
      their offset.
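The equivalence between the old adjusted pointer and the redefined offsets can be checked numerically. This is a Python sketch using plain integers as addresses; the base address and the legacy register offset are made-up example values, and only the 0xd000 shift comes from the commit message.

```python
ADJUST = 0xD000  # shift described in the commit message


def old_address(virt_raw, legacy_offset):
    """Old scheme: adjust the base pointer once, keep the legacy offsets."""
    virt = virt_raw - ADJUST
    return virt + legacy_offset


def new_address(virt, legacy_offset):
    """New scheme: unadjusted base, offsets redefined with the shift applied."""
    redefined_offset = legacy_offset - ADJUST
    return virt + redefined_offset


base = 0x1E00000         # hypothetical I/O mapped base address
legacy_offset = 0x1F000  # hypothetical pre-v4.5 register offset
print(hex(old_address(base, legacy_offset)))
print(hex(new_address(base, legacy_offset)))
```

Both schemes resolve to the same address. The difference is that the second keeps a single base pointer and puts the version dependence in the per-register offset definitions.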
      Signed-off-by: Alex Elder <elder@linaro.org>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • net: ipa: fix an incorrect assignment · ecfa80ce
      Alex Elder authored
      I spotted an error in a patch posted this week, unfortunately just
      after it got accepted.  The effect of the bug is that time-based
      interrupt moderation is disabled.  This is not technically a bug,
      but it is not what is intended.  The problem is that a |= assignment
      got implemented as a simple assignment, so the previously assigned
      value was ignored.
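The difference between the two assignments is easy to see with a two-field register value. This is a Python sketch with a hypothetical field layout; the names `TIMER_FIELD` and `COUNT_FIELD` are illustrative and do not correspond to the driver's definitions.

```python
# Hypothetical (shift, width) layout for a two-field register.
TIMER_FIELD = (0, 16)
COUNT_FIELD = (16, 8)


def field_prep(field, value):
    """Place value into its field position, rejecting values that don't fit."""
    shift, width = field
    if not 0 <= value < (1 << width):
        raise ValueError("value does not fit in field")
    return value << shift


def encode_buggy(timer, count):
    val = field_prep(TIMER_FIELD, timer)
    val = field_prep(COUNT_FIELD, count)  # plain assignment drops the timer
    return val


def encode_fixed(timer, count):
    val = field_prep(TIMER_FIELD, timer)
    val |= field_prep(COUNT_FIELD, count)  # OR keeps the earlier field
    return val


print(hex(encode_buggy(100, 1)))  # prints: 0x10000
print(hex(encode_fixed(100, 1)))  # prints: 0x10064
```

In the buggy version the first field reads back as zero. For a moderation timer that plausibly means the feature is off, which would match the symptom described in the commit message.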
      
      Fixes: edc6158b ("net: ipa: define fields for event-ring related registers")
      Signed-off-by: Alex Elder <elder@linaro.org>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
    • net: dpaa2-eth: do not always set xsk support in xdp_features flag · 1c93e48c
      Lorenzo Bianconi authored
      Do not always add NETDEV_XDP_ACT_XSK_ZEROCOPY bit in xdp_features flag
      but check if the NIC really supports it.
      Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
      Reviewed-by: Larysa Zaremba <larysa.zaremba@intel.com>
      Link: https://lore.kernel.org/r/3dba6ea42dc343a9f2d7d1a6a6a6c173235e1ebf.1676471386.git.lorenzo@kernel.org
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
  2. 17 Feb, 2023 2 commits
    • Merge ra.kernel.org:/pub/scm/linux/kernel/git/netdev/net · 675f176b
      David S. Miller authored
      Some of the devlink bits were tricky, but I think I got it right.
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge tag 'drm-fixes-2023-02-17' of git://anongit.freedesktop.org/drm/drm · ec35307e
      Linus Torvalds authored
      Pull drm fixes from Dave Airlie:
       "Just a final collection of misc fixes, the biggest disables the
        recently added dynamic debugging support, it has a regression that
        needs some bigger fixes.
      
        Otherwise a bunch of fixes across the board, vc4, amdgpu and vmwgfx
        mostly, with some smaller i915 and ast fixes.
      
        drm:
         - dynamic debug disable for now
      
        fbdev:
         - deferred i/o device close fix
      
        amdgpu:
         - Fix GC11.x suspend warning
         - Fix display warning
      
        vc4:
         - YUV planes fix
         - hdmi display fix
         - crtc reduced blanking fix
      
        ast:
         - fix start address computation
      
        vmwgfx:
         - fix bo/handle races
      
        i915:
         - gen11 WA fix"
      
      * tag 'drm-fixes-2023-02-17' of git://anongit.freedesktop.org/drm/drm:
        drm/amd/display: Fail atomic_check early on normalize_zpos error
        drm/amd/amdgpu: fix warning during suspend
        drm/vmwgfx: Do not drop the reference to the handle too soon
        drm/vmwgfx: Stop accessing buffer objects which failed init
        drm/i915/gen11: Wa_1408615072/Wa_1407596294 should be on GT list
        drm: Disable dynamic debug as broken
        drm/ast: Fix start address computation
        fbdev: Fix invalid page access after closing deferred I/O devices
        drm/vc4: crtc: Increase setup cost in core clock calculation to handle extreme reduced blanking
        drm/vc4: hdmi: Always enable GCP with AVMUTE cleared
        drm/vc4: Fix YUV plane handling when planes are in different buffers
  3. 16 Feb, 2023 22 commits