1. 18 Oct, 2021 10 commits
    • David S. Miller's avatar
      Merge branch 'remove-qdisc-running-counter' · f8ba22a1
      David S. Miller authored
      Sebastian Andrzej Siewior says:
      
      ====================
      Try to simplify the gnet_stats and remove qdisc->running sequence counter.
      
      The first few patches is a follow up to
          https://lore.kernel.org/all/20211007175000.2334713-1-bigeasy@linutronix.de/
      
      The remaining patches (#5+) remove the seqcount_t (Qdisc::running) from
      the Qdisc. The statistics (Qdisc::bstats and Qdisc::cpu_bstats) use
      u64_stats_t and the "running state" is now represented by a bit in
      Qdisc::state.
      
      By removing the seqcount_t from Qdisc and decoupling the bstats
      statistics from the seqcount_t it is possible to query the statistics
      even if the Qdisc is running instead of waiting until it is idle again.
      
      The try-lock like usage of the seqcount_t in qdisc_run_begin() is
      problematic on PREEMPT_RT. Inside the qdisc_run_begin/end() qdisc->running
      sequence counter write sections, at sch_direct_xmit(), the seqcount write
      serialization lock is released then re-acquired. This is fine for !RT, because
      the writer is in a BH disabled region and there is a no in-IRQ reader. For RT
      though, BH sections are preemptible. The earlier introduced seqcount_LOCKNAME_t
      mechanism, which for RT the reader acquires then relesaes the write
      serailization lock to avoid infinite spinning if it preempts a seqcount write
      section, cannot work: the qdisc->running write serialization lock is already
      intermittingly released inside the seqcount write section.
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      f8ba22a1
    • Ahmed S. Darwish's avatar
      net: sched: Remove Qdisc::running sequence counter · 29cbcd85
      Ahmed S. Darwish authored
      The Qdisc::running sequence counter has two uses:
      
        1. Reliably reading qdisc's tc statistics while the qdisc is running
           (a seqcount read/retry loop at gnet_stats_add_basic()).
      
        2. As a flag, indicating whether the qdisc in question is running
           (without any retry loops).
      
      For the first usage, the Qdisc::running sequence counter write section,
      qdisc_run_begin() => qdisc_run_end(), covers a much wider area than what
      is actually needed: the raw qdisc's bstats update. A u64_stats sync
      point was thus introduced (in previous commits) inside the bstats
      structure itself. A local u64_stats write section is then started and
      stopped for the bstats updates.
      
      Use that u64_stats sync point mechanism for the bstats read/retry loop
      at gnet_stats_add_basic().
      
      For the second qdisc->running usage, a __QDISC_STATE_RUNNING bit flag,
      accessed with atomic bitops, is sufficient. Using a bit flag instead of
      a sequence counter at qdisc_run_begin/end() and qdisc_is_running() leads
      to the SMP barriers implicitly added through raw_read_seqcount() and
      write_seqcount_begin/end() getting removed. All call sites have been
      surveyed though, and no required ordering was identified.
      
      Now that the qdisc->running sequence counter is no longer used, remove
      it.
      
      Note, using u64_stats implies no sequence counter protection for 64-bit
      architectures. This can lead to the qdisc tc statistics "packets" vs.
      "bytes" values getting out of sync on rare occasions. The individual
      values will still be valid.
      Signed-off-by: default avatarAhmed S. Darwish <a.darwish@linutronix.de>
      Signed-off-by: default avatarSebastian Andrzej Siewior <bigeasy@linutronix.de>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      29cbcd85
    • Ahmed S. Darwish's avatar
      net: sched: Merge Qdisc::bstats and Qdisc::cpu_bstats data types · 50dc9a85
      Ahmed S. Darwish authored
      The only factor differentiating per-CPU bstats data type (struct
      gnet_stats_basic_cpu) from the packed non-per-CPU one (struct
      gnet_stats_basic_packed) was a u64_stats sync point inside the former.
      The two data types are now equivalent: earlier commits added a u64_stats
      sync point to the latter.
      
      Combine both data types into "struct gnet_stats_basic_sync". This
      eliminates redundancy and simplifies the bstats read/write APIs.
      
      Use u64_stats_t for bstats "packets" and "bytes" data types. On 64-bit
      architectures, u64_stats sync points do not use sequence counter
      protection.
      Signed-off-by: default avatarAhmed S. Darwish <a.darwish@linutronix.de>
      Signed-off-by: default avatarSebastian Andrzej Siewior <bigeasy@linutronix.de>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      50dc9a85
    • Ahmed S. Darwish's avatar
      net: sched: Use _bstats_update/set() instead of raw writes · f56940da
      Ahmed S. Darwish authored
      The Qdisc::running sequence counter, used to protect Qdisc::bstats reads
      from parallel writes, is in the process of being removed. Qdisc::bstats
      read/writes will synchronize using an internal u64_stats sync point
      instead.
      
      Modify all bstats writes to use _bstats_update(). This ensures that
      the internal u64_stats sync point is always acquired and released as
      appropriate.
      Signed-off-by: default avatarAhmed S. Darwish <a.darwish@linutronix.de>
      Signed-off-by: default avatarSebastian Andrzej Siewior <bigeasy@linutronix.de>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      f56940da
    • Ahmed S. Darwish's avatar
      net: sched: Protect Qdisc::bstats with u64_stats · 67c9e627
      Ahmed S. Darwish authored
      The not-per-CPU variant of qdisc tc (traffic control) statistics,
      Qdisc::gnet_stats_basic_packed bstats, is protected with Qdisc::running
      sequence counter.
      
      This sequence counter is used for reliably protecting bstats reads from
      parallel writes. Meanwhile, the seqcount's write section covers a much
      wider area than bstats update: qdisc_run_begin() => qdisc_run_end().
      
      That read/write section asymmetry can lead to needless retries of the
      read section. To prepare for removing the Qdisc::running sequence
      counter altogether, introduce a u64_stats sync point inside bstats
      instead.
      
      Modify _bstats_update() to start/end the bstats u64_stats write
      section.
      
      For bisectability, and finer commits granularity, the bstats read
      section is still protected with a Qdisc::running read/retry loop and
      qdisc_run_begin/end() still starts/ends that seqcount write section.
      Once all call sites are modified to use _bstats_update(), the
      Qdisc::running seqcount will be removed and bstats read/retry loop will
      be modified to utilize the internal u64_stats sync point.
      
      Note, using u64_stats implies no sequence counter protection for 64-bit
      architectures. This can lead to the statistics "packets" vs. "bytes"
      values getting out of sync on rare occasions. The individual values will
      still be valid.
      
      [bigeasy: Minor commit message edits, init all gnet_stats_basic_packed.]
      Signed-off-by: default avatarAhmed S. Darwish <a.darwish@linutronix.de>
      Signed-off-by: default avatarSebastian Andrzej Siewior <bigeasy@linutronix.de>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      67c9e627
    • Ahmed S. Darwish's avatar
      u64_stats: Introduce u64_stats_set() · f2efdb17
      Ahmed S. Darwish authored
      Allow to directly set a u64_stats_t value which is used to provide an init
      function which sets it directly to zero intead of memset() the value.
      
      Add u64_stats_set() to the u64_stats API.
      
      [bigeasy: commit message. ]
      Signed-off-by: default avatarAhmed S. Darwish <a.darwish@linutronix.de>
      Signed-off-by: default avatarSebastian Andrzej Siewior <bigeasy@linutronix.de>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      f2efdb17
    • Sebastian Andrzej Siewior's avatar
      gen_stats: Move remaining users to gnet_stats_add_queue(). · 10940eb7
      Sebastian Andrzej Siewior authored
      The gnet_stats_queue::qlen member is only used in the SMP-case.
      
      qdisc_qstats_qlen_backlog() needs to add qdisc_qlen() to qstats.qlen to
      have the same value as that provided by qdisc_qlen_sum().
      
      gnet_stats_copy_queue() needs to overwritte the resulting qstats.qlen
      field whith the caller submitted qlen value. It might be differ from the
      submitted value.
      
      Let both functions use gnet_stats_add_queue() and remove unused
      __gnet_stats_copy_queue().
      Signed-off-by: default avatarSebastian Andrzej Siewior <bigeasy@linutronix.de>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      10940eb7
    • Sebastian Andrzej Siewior's avatar
      mq, mqprio: Use gnet_stats_add_queue(). · 7361df46
      Sebastian Andrzej Siewior authored
      gnet_stats_add_basic() and gnet_stats_add_queue() add up the statistics
      so they can be used directly for both the per-CPU and global case.
      
      gnet_stats_add_queue() copies either Qdisc's per-CPU
      gnet_stats_queue::qlen or the global member. The global
      gnet_stats_queue::qlen isn't touched in the per-CPU case so there is no
      need to consider it in the global-case.
      
      In the per-CPU case, the sum of global gnet_stats_queue::qlen and
      the per-CPU gnet_stats_queue::qlen was assigned to sch->q.qlen and
      sch->qstats.qlen. Now both fields are copied individually.
      Signed-off-by: default avatarSebastian Andrzej Siewior <bigeasy@linutronix.de>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      7361df46
    • Sebastian Andrzej Siewior's avatar
      gen_stats: Add gnet_stats_add_queue(). · 448e163f
      Sebastian Andrzej Siewior authored
      This function will replace __gnet_stats_copy_queue(). It reads all
      arguments and adds them into the passed gnet_stats_queue argument.
      In contrast to __gnet_stats_copy_queue() it also copies the qlen member.
      Signed-off-by: default avatarSebastian Andrzej Siewior <bigeasy@linutronix.de>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      448e163f
    • Sebastian Andrzej Siewior's avatar
      gen_stats: Add instead Set the value in __gnet_stats_copy_basic(). · fbf307c8
      Sebastian Andrzej Siewior authored
      __gnet_stats_copy_basic() always assigns the value to the bstats
      argument overwriting the previous value. The later added per-CPU version
      always accumulated the values in the returning gnet_stats_basic_packed
      argument.
      
      Based on review there are five users of that function as of today:
      - est_fetch_counters(), ___gnet_stats_copy_basic()
        memsets() bstats to zero, single invocation.
      
      - mq_dump(), mqprio_dump(), mqprio_dump_class_stats()
        memsets() bstats to zero, multiple invocation but does not use the
        function due to !qdisc_is_percpu_stats().
      
      Add the values in __gnet_stats_copy_basic() instead overwriting. Rename
      the function to gnet_stats_add_basic() to make it more obvious.
      Signed-off-by: default avatarSebastian Andrzej Siewior <bigeasy@linutronix.de>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      fbf307c8
  2. 16 Oct, 2021 30 commits