1. 03 Jun, 2021 32 commits
  2. 02 Jun, 2021 8 commits
    • Merge branch 'devlink-rate-objects' · 270d47dc
      David S. Miller authored
      Dmytro Linkin says:
      
      ====================
      devlink: rate objects API
      
      Resending without RFC.
      
      Currently the kernel provides a way to change the tx rate of a single VF in
      switchdev mode via the tc-police action. When lots of VFs are configured,
      management of their rates becomes a non-trivial task and some grouping
      mechanism is required. Implementing such grouping in tc-police would bring
      flow-related limitations and unwanted complications, like:
      - tc-police is a policer and there is a user request for a traffic
        shaper, so a shared tc-police action is not suitable;
      - flows require a net device to be placed on, which means "groups"
        wouldn't have a net device instance themselves. Taking the previous
        point into account, a solution was reviewed in which the representor
        has a policer and the driver uses a shaper if the qdisc contains a
        group of VFs; such an approach is ugly, complicated and misleading;
      - TC is ingress only, while configuring the "other" side of the wire looks
        more like a "real" picture where shaping is outside of the steering
        world, similar to the "ip link" command.
      
      Given that, devlink is the most appropriate place.
      
      This series introduces a devlink API for managing the tx rate of a single
      devlink port or of a group by invoking callbacks (see below) of the
      corresponding driver. A devlink port or a group can also be added to a
      parent group, in which case the driver is responsible for handling the
      rates of the group's elements. To achieve all of that, a new rate object
      is added. It can be one of two types:
      - leaf - represents a single devlink port; created/destroyed by the
        driver and bound to the devlink port. As an example, a driver may
        create a leaf rate object for every devlink port associated with a VF.
        Since a leaf has a 1:1 mapping to its devlink port, in userspace it is
        referred to as pci/<bus_addr>/<port_index>;
      - node - represents a group of rate objects; created/deleted by request
        from userspace; initially empty (no rate objects added). In
        userspace it is referred to as pci/<bus_addr>/<node_name>, where the
        node name can be anything except a decimal number, to avoid collisions
        with leafs.
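
      The naming rule for nodes can be illustrated with a small user-space
      sketch. The helper name rate_node_name_ok() is hypothetical and this is
      not the in-kernel check; it only models "any name except a decimal
      number", and it assumes that an empty name is also rejected.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper, not the kernel's implementation: returns true if
 * `name` could serve as a rate node name, i.e. it is non-empty and does not
 * consist solely of decimal digits (which would look like a port index).
 */
static bool rate_node_name_ok(const char *name)
{
	size_t len = strlen(name);

	if (len == 0)
		return false; /* assumption: empty names are rejected */

	/* strspn() yields the length of the leading all-digit prefix. */
	return strspn(name, "0123456789") != len;
}
```

      With this model, "group1" and "1a" are accepted, while "123" is
      rejected because it could be confused with a port index.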
      
      devlink_ops is extended with the following callbacks:
      - rate_{leaf|node}_tx_{share|max}_set
      - rate_node_{new|del}
      - rate_{leaf|node}_parent_set
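
      Expanded, the brace notation above names eight callbacks. The
      following user-space model lists them; the struct name, the simplified
      argument lists and the dispatcher are assumptions made for
      illustration and are not the in-kernel signatures. Treating a missing
      callback as "not supported" is likewise an assumption of this sketch.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified model of the new ops; argument lists are hypothetical. */
struct rate_ops_model {
	int (*rate_leaf_tx_share_set)(void *priv, uint64_t tx_share);
	int (*rate_leaf_tx_max_set)(void *priv, uint64_t tx_max);
	int (*rate_node_tx_share_set)(void *priv, uint64_t tx_share);
	int (*rate_node_tx_max_set)(void *priv, uint64_t tx_max);
	int (*rate_node_new)(void **priv);
	int (*rate_node_del)(void *priv);
	int (*rate_leaf_parent_set)(void *child_priv, void *parent_priv);
	int (*rate_node_parent_set)(void *child_priv, void *parent_priv);
};

/* Hypothetical dispatcher: a driver that leaves the callback unset gets -EOPNOTSUPP. */
static int model_leaf_tx_max_set(const struct rate_ops_model *ops,
				 void *priv, uint64_t tx_max)
{
	if (!ops->rate_leaf_tx_max_set)
		return -EOPNOTSUPP;
	return ops->rate_leaf_tx_max_set(priv, tx_max);
}

/* Example driver callback: stores the limit in its private data. */
static int demo_leaf_tx_max_set(void *priv, uint64_t tx_max)
{
	*(uint64_t *)priv = tx_max;
	return 0;
}
```

      A driver model that fills in only rate_leaf_tx_max_set can be passed to
      model_leaf_tx_max_set() together with a pointer to a uint64_t used as
      its private data.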
      
      KAPI provides:
      - creation/destruction of the leaf rate object associated with devlink
        port
      - destruction of rate nodes to allow a vendor driver to free allocated
        resources on driver removal or for other reasons that require the
        nodes to be destroyed
      
      UAPI provides:
      - dumping all rate objects or a single one
      - setting tx_{share|max} of a rate object of any type
      - creating/deleting a node rate object
      - setting/unsetting the parent of any rate object
      
      Added devlink rate object support for the netdevsim driver.
      
      Issues/open questions:
      - Does the user need a DEVLINK_CMD_RATE_DEL_ALL_CHILD command to clean all
        children of a particular parent node? For example:
        $ devlink port function rate flush netdevsim/netdevsim10/group
      - the priv pointer passed to the callbacks is a source of bugs; in the
        leaf case the driver can embed the rate object into an internal
        structure and use container_of() on it; in the node case this cannot
        be done since nodes are created from userspace
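
      The leaf case mentioned in the last point can be sketched in user
      space as follows. The struct names, the callback name and its
      signature are hypothetical, and container_of() is re-implemented here
      with offsetof() as a stand-in for the kernel macro.

```c
#include <stddef.h>
#include <stdint.h>

/* User-space stand-in for the kernel's container_of() macro. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical, heavily simplified rate object. */
struct leaf_rate_model {
	uint64_t tx_share;
	uint64_t tx_max;
};

/* Hypothetical driver-private VF state that embeds its leaf rate object. */
struct drv_vf_model {
	int vf_id;
	struct leaf_rate_model rate;
};

/*
 * Leaf callback model: the driver state is recovered from the embedded
 * object, so no separate priv pointer is needed. The VF id is returned only
 * so that a caller can see which VF was resolved.
 */
static int drv_leaf_tx_max_set(struct leaf_rate_model *rate, uint64_t tx_max)
{
	struct drv_vf_model *vf = container_of(rate, struct drv_vf_model, rate);

	rate->tx_max = tx_max;
	return vf->vf_id;
}
```

      This only works because the driver owns the memory that holds the leaf.
      A node is created on request from userspace, so the driver has no
      enclosing structure to recover in that case.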
      
      v1->v2:
      - fixed kernel-doc for devlink_rate_leaf_{create|destroy}()
      - s/func/function/ for all devlink port command occurrences
      
      v2->v3:
      - devlink:
        - added devlink_rate_nodes_destroy() function
      - netdevsim:
        - added call of devlink_rate_nodes_destroy() function
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Documentation: devlink rate objects · b62767e7
      Dmytro Linkin authored
      Add a devlink rate objects section to the devlink port documentation.
      Add devlink rate support info to the netdevsim devlink documentation.
      Signed-off-by: Dmytro Linkin <dlinkin@nvidia.com>
      Reviewed-by: Jiri Pirko <jiri@nvidia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • selftest: netdevsim: Add devlink rate grouping test · 1a9c0482
      Dmytro Linkin authored
      The test verifies that netdevsim correctly implements the devlink ops
      callbacks that set a node as the parent of a devlink leaf or node rate
      object.
      Co-developed-by: Vlad Buslov <vladbu@nvidia.com>
      Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
      Signed-off-by: Dmytro Linkin <dlinkin@nvidia.com>
      Reviewed-by: Jiri Pirko <jiri@nvidia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • netdevsim: Allow setting parent node of rate objects · f3d101b4
      Dmytro Linkin authored
      Implement new devlink ops that allow setting a rate node as the parent of
      a devlink port (leaf) or of another devlink node through the devlink API.
      Expose parent names in netdevsim debugfs in read-only mode.
      Co-developed-by: Vlad Buslov <vladbu@nvidia.com>
      Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
      Signed-off-by: Dmytro Linkin <dlinkin@nvidia.com>
      Reviewed-by: Jiri Pirko <jiri@nvidia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • devlink: Allow setting parent node of rate objects · d7555984
      Dmytro Linkin authored
      Refactor the DEVLINK_CMD_RATE_{GET|SET} command handlers to support setting
      a node as a parent for another rate object (leaf or node) by means of the
      new attribute DEVLINK_ATTR_RATE_PARENT_NODE_NAME. Extend devlink ops
      with new callbacks rate_{leaf|node}_parent_set() that set a node as the
      parent of a rate object, so that supporting drivers can implement rate
      grouping through devlink. Driver implementations are allowed to support
      leafs or node children only. Invoking a callback with NULL as the parent
      should be treated by the driver as an unset-parent action.
      Extend the rate object struct with a reference counter to disallow
      deleting a node while any child points to it. The user should unset the
      parent of each child explicitly.
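
      The reference-counting rule can be modelled in user space as below.
      The struct and function names are hypothetical and the -EBUSY return
      value is an assumption of this sketch, not taken from the patch; a
      NULL parent is used to mean "unset parent", as described above.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified rate object with a child reference counter. */
struct rate_model {
	struct rate_model *parent;
	unsigned int refcnt; /* number of children pointing at this node */
};

/* Set or unset (parent == NULL) the parent, keeping the counters in sync. */
static void rate_model_parent_set(struct rate_model *child,
				  struct rate_model *parent)
{
	if (child->parent)
		child->parent->refcnt--;
	child->parent = parent;
	if (parent)
		parent->refcnt++;
}

/* Refuse to delete a node that still has children pointing at it. */
static int rate_model_node_del(struct rate_model *node)
{
	if (node->refcnt)
		return -EBUSY; /* assumed error code */
	rate_model_parent_set(node, NULL); /* drop the node's own parent ref */
	return 0;
}
```

      In this model, deleting group2 while group1 still has it as a parent
      fails, and succeeds once group1 has been set to noparent.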
      
      Example:
      
      $ devlink port function rate add netdevsim/netdevsim10/group1
      
      $ devlink port function rate add netdevsim/netdevsim10/group2
      
      $ devlink port function rate set netdevsim/netdevsim10/group1 parent group2
      
      $ devlink port function rate show netdevsim/netdevsim10/group1
      netdevsim/netdevsim10/group1: type node parent group2
      
      $ devlink port function rate set netdevsim/netdevsim10/group1 noparent
      Co-developed-by: Vlad Buslov <vladbu@nvidia.com>
      Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
      Signed-off-by: Dmytro Linkin <dlinkin@nvidia.com>
      Reviewed-by: Jiri Pirko <jiri@nvidia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • selftest: netdevsim: Add devlink rate nodes test · 413ee943
      Dmytro Linkin authored
      The test verifies that it is possible to create, delete and set the
      min/max tx rate of a devlink rate node on a netdevsim VF.
      Co-developed-by: Vlad Buslov <vladbu@nvidia.com>
      Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
      Signed-off-by: Dmytro Linkin <dlinkin@nvidia.com>
      Reviewed-by: Jiri Pirko <jiri@nvidia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • netdevsim: Implement support for devlink rate nodes · 885226f5
      Dmytro Linkin authored
      Implement new devlink ops that allow creation, deletion and setting of
      the shared/max tx rate of devlink rate nodes through the devlink API.
      Expose each rate node and its tx rates in netdevsim debugfs.
      Co-developed-by: Vlad Buslov <vladbu@nvidia.com>
      Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
      Signed-off-by: Dmytro Linkin <dlinkin@nvidia.com>
      Reviewed-by: Jiri Pirko <jiri@nvidia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • devlink: Introduce rate nodes · a8ecb93e
      Dmytro Linkin authored
      Implement support for the DEVLINK_CMD_RATE_{NEW|DEL} commands, which are
      used to create and delete devlink rate nodes. Add a new attribute
      DEVLINK_ATTR_RATE_NODE_NAME that specifies the node name string. The node
      name is an alphanumeric identifier. No valid node name can be a devlink
      port index, e.g. a decimal number. Extend devlink ops with new callbacks
      rate_node_{new|del}() and rate_node_tx_{share|max}_set() to allow
      supporting drivers to implement port rate grouping and setting the tx
      rate of rate nodes through devlink.
      Expose the devlink_rate_nodes_destroy() function to allow a vendor driver
      to do proper cleanup of internally allocated resources for the nodes if
      the driver goes down or for any other reason that requires the nodes to
      be destroyed.
      Disallow moving a device from switchdev to legacy mode if any node exists
      on that device. The user must explicitly delete nodes before switching
      mode.
      
      Example:
      
      $ devlink port function rate add netdevsim/netdevsim10/group1
      
      $ devlink port function rate set netdevsim/netdevsim10/group1 \
              tx_share 10mbit tx_max 100mbit
      
      The add and set commands can be combined:
      
      $ devlink port function rate add netdevsim/netdevsim10/group1 \
              tx_share 10mbit tx_max 100mbit
      
      $ devlink port function rate show netdevsim/netdevsim10/group1
      netdevsim/netdevsim10/group1: type node tx_share 10mbit tx_max 100mbit
      
      $ devlink port function rate del netdevsim/netdevsim10/group1
      Co-developed-by: Vlad Buslov <vladbu@nvidia.com>
      Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
      Signed-off-by: Dmytro Linkin <dlinkin@nvidia.com>
      Reviewed-by: Jiri Pirko <jiri@nvidia.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>