    ethtool: Add ability to control transceiver modules' power mode · 353407d9
    Ido Schimmel authored
    Add a pair of new ethtool messages, 'ETHTOOL_MSG_MODULE_SET' and
    'ETHTOOL_MSG_MODULE_GET', that can be used to control transceiver
    module parameters and retrieve their status.
    
    The first parameter to control is the power mode of the module. It is
    only relevant for paged memory modules, as flat memory modules always
    operate in low power mode.
    
    When a paged memory module is in low power mode, its power consumption
    is reduced to the minimum, the management interface towards the host is
    available and the data path is deactivated.
    
    User space can choose to put modules that are not currently in use in
    low power mode and transition them to high power mode before putting the
    associated ports administratively up. This is useful for user space that
    favors reduced power consumption and lower temperatures over shorter
    link-up times. In QSFP-DD modules, the transition from low power mode to
    high power mode can take a few seconds, and this transition is only
    expected to get longer with future, more complex modules.
    
    User space can control the power mode of the module via the power mode
    policy attribute ('ETHTOOL_A_MODULE_POWER_MODE_POLICY'). Possible
    values:
    
    * high: Module is always in high power mode.
    
    * auto: Module is transitioned by the host to high power mode when the
      first port using it is put administratively up and to low power mode
      when the last port using it is put administratively down.
    
    The operational power mode of the module is available to user space via
    the 'ETHTOOL_A_MODULE_POWER_MODE' attribute. The attribute is not
    reported to user space when a module is not plugged in.
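
    The policy semantics described above can be illustrated with a small
    Python model. This is only a sketch for a paged memory module: the
    'Module' class, its fields and the 'power_mode()' helper are
    hypothetical names for illustration and are not part of the kernel or
    ethtool.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class Module:
    """Hypothetical model of a paged memory module (illustration only)."""
    policy: str = "high"            # power mode policy: "high" or "auto"
    plugged_in: bool = True
    ports_up: Set[str] = field(default_factory=set)  # admin-up ports using the module

    def power_mode(self) -> Optional[str]:
        # The operational power mode is not reported without a module.
        if not self.plugged_in:
            return None
        # 'high': module is always in high power mode.
        if self.policy == "high":
            return "high"
        # 'auto': high while at least one port using the module is up.
        return "high" if self.ports_up else "low"


m = Module()
print(m.power_mode())       # policy 'high'
m.policy = "auto"
print(m.power_mode())       # no ports up
m.ports_up.add("swp11")
print(m.power_mode())       # first port put administratively up
m.ports_up.discard("swp11")
print(m.power_mode())       # last port put administratively down
```

    The sequence at the end mirrors the order of the steps in the testing
    sections below.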
    
    The user API is designed to be generic enough so that it could be used
    for modules with different memory maps (e.g., SFF-8636, CMIS).
    
    The only implementation of the device driver API in this series is for a
    MAC driver (mlxsw) where the module is controlled by the device's
    firmware, but it is designed to be generic enough so that it could also
    be used by implementations where the module is controlled by the CPU.
    
    CMIS testing
    ============
    
     # ethtool -m swp11
     Identifier                                : 0x18 (QSFP-DD Double Density 8X Pluggable Transceiver (INF-8628))
     ...
     Module State                              : 0x03 (ModuleReady)
     LowPwrAllowRequestHW                      : Off
     LowPwrRequestSW                           : Off
    
    The module is not in low power mode, as it is not forced by hardware
    (LowPwrAllowRequestHW is off) or by software (LowPwrRequestSW is off).
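
    The reasoning above can be written as a short predicate. This is a
    sketch, not kernel code: the function and parameter names are
    hypothetical, and 'low_pwr_request_hw' stands for the hardware
    LowPwrRequestHW signal, which user space cannot read.

```python
def cmis_low_power(low_pwr_request_sw: bool,
                   low_pwr_allow_request_hw: bool,
                   low_pwr_request_hw: bool) -> bool:
    """Return True if a CMIS module is forced into low power mode."""
    # Software can always force low power mode.
    if low_pwr_request_sw:
        return True
    # Hardware can force it only when the hardware request is allowed.
    return low_pwr_allow_request_hw and low_pwr_request_hw


# Both fields are off in the dump above, so the hardware signal is irrelevant.
print(cmis_low_power(False, False, True))
```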
    
    The power mode can be queried from the kernel. If
    LowPwrAllowRequestHW were on, the kernel would need to take into account
    the state of the LowPwrRequestHW signal, which is not visible to user
    space.
    
     $ ethtool --show-module swp11
     Module parameters for swp11:
     power-mode-policy high
     power-mode high
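
    For scripted checks, the output above can be turned into a dictionary.
    This is a minimal sketch that assumes the plain-text layout stays
    exactly as shown here; 'parse_show_module' is a hypothetical helper,
    not part of ethtool.

```python
def parse_show_module(text: str) -> dict:
    """Parse 'ethtool --show-module' text output into {name: value}."""
    params = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the "Module parameters for <dev>:" header.
        if not line or line.endswith(":"):
            continue
        key, _, value = line.partition(" ")
        params[key] = value.strip()
    return params


sample = """\
 Module parameters for swp11:
 power-mode-policy high
 power-mode high
"""
print(parse_show_module(sample))
```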
    
    Change the power mode policy to 'auto':
    
     # ethtool --set-module swp11 power-mode-policy auto
    
    Query the power mode again:
    
     $ ethtool --show-module swp11
     Module parameters for swp11:
     power-mode-policy auto
     power-mode low
    
    Verify with the data read from the EEPROM:
    
     # ethtool -m swp11
     Identifier                                : 0x18 (QSFP-DD Double Density 8X Pluggable Transceiver (INF-8628))
     ...
     Module State                              : 0x01 (ModuleLowPwr)
     LowPwrAllowRequestHW                      : Off
     LowPwrRequestSW                           : On
    
    Put the associated port administratively up, which will instruct the
    host to transition the module to high power mode:
    
     # ip link set dev swp11 up
    
    Query the power mode again:
    
     $ ethtool --show-module swp11
     Module parameters for swp11:
     power-mode-policy auto
     power-mode high
    
    Verify with the data read from the EEPROM:
    
     # ethtool -m swp11
     Identifier                                : 0x18 (QSFP-DD Double Density 8X Pluggable Transceiver (INF-8628))
     ...
     Module State                              : 0x03 (ModuleReady)
     LowPwrAllowRequestHW                      : Off
     LowPwrRequestSW                           : Off
    
    Put the associated port administratively down, which will instruct the
    host to transition the module to low power mode:
    
     # ip link set dev swp11 down
    
    Query the power mode again:
    
     $ ethtool --show-module swp11
     Module parameters for swp11:
     power-mode-policy auto
     power-mode low
    
    Verify with the data read from the EEPROM:
    
     # ethtool -m swp11
     Identifier                                : 0x18 (QSFP-DD Double Density 8X Pluggable Transceiver (INF-8628))
     ...
     Module State                              : 0x01 (ModuleLowPwr)
     LowPwrAllowRequestHW                      : Off
     LowPwrRequestSW                           : On
    
    SFF-8636 testing
    ================
    
     # ethtool -m swp13
     Identifier                                : 0x11 (QSFP28)
     ...
     Extended identifier description           : 5.0W max. Power consumption,  High Power Class (> 3.5 W) enabled
     Power set                                 : Off
     Power override                            : On
     ...
     Transmit avg optical power (Channel 1)    : 0.7733 mW / -1.12 dBm
     Transmit avg optical power (Channel 2)    : 0.7649 mW / -1.16 dBm
     Transmit avg optical power (Channel 3)    : 0.7790 mW / -1.08 dBm
     Transmit avg optical power (Channel 4)    : 0.7837 mW / -1.06 dBm
     Rcvr signal avg optical power(Channel 1)  : 0.9302 mW / -0.31 dBm
     Rcvr signal avg optical power(Channel 2)  : 0.9079 mW / -0.42 dBm
     Rcvr signal avg optical power(Channel 3)  : 0.8993 mW / -0.46 dBm
     Rcvr signal avg optical power(Channel 4)  : 0.8778 mW / -0.57 dBm
    
    The module is not in low power mode, as it is not forced by hardware
    (Power override is on) or by software (Power set is off).
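
    The same reasoning for SFF-8636 can be sketched as a predicate. The
    function and parameter names are hypothetical; 'lpmode_signal' stands
    for the hardware LPMode signal, which user space cannot read.

```python
def sff8636_low_power(power_override: bool,
                      power_set: bool,
                      lpmode_signal: bool) -> bool:
    """Return True if an SFF-8636 module is forced into low power mode."""
    # With Power override on, software decides through Power set.
    if power_override:
        return power_set
    # Otherwise the hardware LPMode signal decides.
    return lpmode_signal


# Power override is on and Power set is off in the dump above.
print(sff8636_low_power(True, False, True))
```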
    
    The power mode can be queried from the kernel. If Power override
    were off, the kernel would need to take into account the state of the
    LPMode signal, which is not visible to user space.
    
     $ ethtool --show-module swp13
     Module parameters for swp13:
     power-mode-policy high
     power-mode high
    
    Change the power mode policy to 'auto':
    
     # ethtool --set-module swp13 power-mode-policy auto
    
    Query the power mode again:
    
     $ ethtool --show-module swp13
     Module parameters for swp13:
     power-mode-policy auto
     power-mode low
    
    Verify with the data read from the EEPROM:
    
     # ethtool -m swp13
     Identifier                                : 0x11 (QSFP28)
     Extended identifier description           : 5.0W max. Power consumption,  High Power Class (> 3.5 W) not enabled
     Power set                                 : On
     Power override                            : On
     ...
     Transmit avg optical power (Channel 1)    : 0.0000 mW / -inf dBm
     Transmit avg optical power (Channel 2)    : 0.0000 mW / -inf dBm
     Transmit avg optical power (Channel 3)    : 0.0000 mW / -inf dBm
     Transmit avg optical power (Channel 4)    : 0.0000 mW / -inf dBm
     Rcvr signal avg optical power(Channel 1)  : 0.0000 mW / -inf dBm
     Rcvr signal avg optical power(Channel 2)  : 0.0000 mW / -inf dBm
     Rcvr signal avg optical power(Channel 3)  : 0.0000 mW / -inf dBm
     Rcvr signal avg optical power(Channel 4)  : 0.0000 mW / -inf dBm
    
    Put the associated port administratively up, which will instruct the
    host to transition the module to high power mode:
    
     # ip link set dev swp13 up
    
    Query the power mode again:
    
     $ ethtool --show-module swp13
     Module parameters for swp13:
     power-mode-policy auto
     power-mode high
    
    Verify with the data read from the EEPROM:
    
     # ethtool -m swp13
     Identifier                                : 0x11 (QSFP28)
     ...
     Extended identifier description           : 5.0W max. Power consumption,  High Power Class (> 3.5 W) enabled
     Power set                                 : Off
     Power override                            : On
     ...
     Transmit avg optical power (Channel 1)    : 0.7934 mW / -1.01 dBm
     Transmit avg optical power (Channel 2)    : 0.7859 mW / -1.05 dBm
     Transmit avg optical power (Channel 3)    : 0.7885 mW / -1.03 dBm
     Transmit avg optical power (Channel 4)    : 0.7985 mW / -0.98 dBm
     Rcvr signal avg optical power(Channel 1)  : 0.9325 mW / -0.30 dBm
     Rcvr signal avg optical power(Channel 2)  : 0.9034 mW / -0.44 dBm
     Rcvr signal avg optical power(Channel 3)  : 0.9086 mW / -0.42 dBm
     Rcvr signal avg optical power(Channel 4)  : 0.8885 mW / -0.51 dBm
    
    Put the associated port administratively down, which will instruct the
    host to transition the module to low power mode:
    
     # ip link set dev swp13 down
    
    Query the power mode again:
    
     $ ethtool --show-module swp13
     Module parameters for swp13:
     power-mode-policy auto
     power-mode low
    
    Verify with the data read from the EEPROM:
    
     # ethtool -m swp13
     Identifier                                : 0x11 (QSFP28)
     ...
     Extended identifier description           : 5.0W max. Power consumption,  High Power Class (> 3.5 W) not enabled
     Power set                                 : On
     Power override                            : On
     ...
     Transmit avg optical power (Channel 1)    : 0.0000 mW / -inf dBm
     Transmit avg optical power (Channel 2)    : 0.0000 mW / -inf dBm
     Transmit avg optical power (Channel 3)    : 0.0000 mW / -inf dBm
     Transmit avg optical power (Channel 4)    : 0.0000 mW / -inf dBm
     Rcvr signal avg optical power(Channel 1)  : 0.0000 mW / -inf dBm
     Rcvr signal avg optical power(Channel 2)  : 0.0000 mW / -inf dBm
     Rcvr signal avg optical power(Channel 3)  : 0.0000 mW / -inf dBm
     Rcvr signal avg optical power(Channel 4)  : 0.0000 mW / -inf dBm

    Signed-off-by: Ido Schimmel <idosch@nvidia.com>
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>