• Jacob Keller's avatar
    ice: Report stats for allocated queues via ethtool stats · f8ba7db8
    Jacob Keller authored
    It is not safe to have the string table for statistics change order or
    size over the lifetime of a given netdevice. This is because of the
    nature of the 3-step process for obtaining stats. First, user space
    performs a request for the size of the strings table. Second it performs
    a separate request for the strings themselves, after allocating space
    for the table. Third, it requests the stats themselves, also allocating
    space for the table.
    
    If the size decreased, there is potential to see garbage data or stats
    values. In the worst case, we could potentially see stats values become
    mis-aligned with their strings, so that it looks like a statistic is
    being reported differently than it actually is.
    
    Even worse, if the size increased, there is potential that the strings
    table or stats table was not allocated large enough and the stats code
    could access and write to memory it should not, potentially resulting in
    undefined behavior and system crashes.
    
    It isn't even safe if the size always changes under the RTNL lock. This
    is because the calls take place over multiple user space commands, so it
    is not possible to hold the RTNL lock for the entire duration of
    obtaining strings and stats. Further, not all consumers of the ethtool
    API are the user space ethtool program, and it is possible that one
    assumes the strings will not change (valid under the current contract),
    and thus only requests the stats values when requesting stats in a loop.
    
    Finally, it's not possible in the general case to detect when the size
    changes, because it is quite possible that one value which could impact
    the stat size increased, while another decreased. This would result in
    the same total number of stats, but reordering them so that stats no
    longer line up with the strings they belong to. Since only size changes
    aren't enough, we would need some sort of hash or token to determine
    when the strings no longer match. This would require extending the
    ethtool stats commands, but there is no more space in the relevant
    structures.
    
    The real solution to resolve this would be to add a completely new API
    for stats, probably over netlink.
    
    In the ice driver, the only thing impacting the stats that is not
    constant is the number of queues. Instead of reporting stats for each
    used queue, report stats for each allocated queue. We do not change the
    number of queues allocated for a given netdevice, as we pass this into
    the alloc_etherdev_mq() function to set the num_tx_queues and
    num_rx_queues.
    
    This resolves the potential bugs at the slight cost of displaying many
    queue statistics which will not be activated.
    Signed-off-by: default avatarJacob Keller <jacob.e.keller@intel.com>
    Signed-off-by: default avatarAnirudh Venkataramanan <anirudh.venkataramanan@intel.com>
    Tested-by: default avatarTony Brelinski <tonyx.brelinski@intel.com>
    Signed-off-by: default avatarJeff Kirsher <jeffrey.t.kirsher@intel.com>
    f8ba7db8
ice.h 9.69 KB