1. 26 Sep, 2020 31 commits
    • net: dsa: tag_sja1105: use a custom flow dissector procedure · e6652979
      Vladimir Oltean authored
      The sja1105 is a bit of a special snowflake, in that not all frames are
      transmitted/received in the same way. L2 link-local frames are received
      with the source port/switch ID information put in the destination MAC
      address. For the rest, a tag_8021q header is used. So only the latter
      frames displace the rest of the headers and need to use the generic flow
      dissector procedure.
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e6652979
    • net: dsa: tag_qca: use the generic flow dissector procedure · 6b04f171
      Vladimir Oltean authored
      Remove the .flow_dissect procedure, so the flow dissector will call the
      generic variant which works for this tagging protocol.
      
      Cc: John Crispin <john@phrozen.org>
      Cc: Alexander Lobakin <alobakin@pm.me>
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      6b04f171
    • net: dsa: tag_mtk: use the generic flow dissector procedure · b1af3656
      Vladimir Oltean authored
      Remove the .flow_dissect procedure, so the flow dissector will call the
      generic variant which works for this tagging protocol.
      
      Cc: DENG Qingfang <dqfext@gmail.com>
      Cc: Sean Wang <sean.wang@mediatek.com>
      Cc: John Crispin <john@phrozen.org>
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b1af3656
    • net: dsa: tag_edsa: use the generic flow dissector procedure · 742b2e19
      Vladimir Oltean authored
      Remove the .flow_dissect procedure, so the flow dissector will call the
      generic variant which works for this tagging protocol.
      
      Cc: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      742b2e19
    • net: dsa: tag_dsa: use the generic flow dissector procedure · 11f50111
      Vladimir Oltean authored
      Remove the .flow_dissect procedure, so the flow dissector will call the
      generic variant which works for this tagging protocol.
      
      Cc: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      11f50111
    • net: dsa: tag_brcm: use generic flow dissector procedure · f569ad52
      Vladimir Oltean authored
      There are 2 Broadcom tags in use: one places the DSA tag before the
      Ethernet destination MAC address, and the other before the EtherType.
      Nonetheless, both displace the rest of the headers, so this tagger can
      use the generic flow dissector procedure which accounts for that.
      
      The ASCII art drawing is a good reference though, so keep it but move it
      somewhere else.
      
      Cc: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      f569ad52
    • net: flow_dissector: avoid indirect call to DSA .flow_dissect for generic case · 54fec335
      Vladimir Oltean authored
      With the recent mitigations against speculative execution exploits,
      indirect function calls are more expensive and it would be good to avoid
      them where possible.
      
      In the case of DSA, most switch taggers will shift the EtherType and
      next headers by a fixed amount equal to that tag's length in bytes.
      So we can use a generic procedure to determine that, without calling
      into custom tagger code. However we still leave the flow_dissect method
      inside struct dsa_device_ops as an override for the generic function.
      Suggested-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      54fec335
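The dispatch described in this commit can be pictured with a small user-space model. This is only a sketch: `struct tagger`, `generic_flow_dissect`, `tagger_flow_dissect` and `no_shift_flow_dissect` are hypothetical names rather than the kernel's definitions, and it assumes that a tagger without an override shifts the headers by exactly its tag length.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a tagger description; not the kernel's struct. */
struct tagger {
	int overhead;                       /* tag length in bytes */
	void (*flow_dissect)(int *offset);  /* optional override, may be NULL */
};

/* Generic case: the headers are shifted by exactly the tag length. */
static void generic_flow_dissect(const struct tagger *t, int *offset)
{
	*offset = t->overhead;
}

/* Direct call for the common case, indirect call only for an override. */
static void tagger_flow_dissect(const struct tagger *t, int *offset)
{
	assert(t != NULL && offset != NULL);
	if (t->flow_dissect)
		t->flow_dissect(offset);
	else
		generic_flow_dissect(t, offset);
}

/* Example override for a tagger that does not shift the headers at all. */
static void no_shift_flow_dissect(int *offset)
{
	*offset = 0;
}
```

In this model, only taggers that set the function pointer pay for an indirect call; every other tagger goes through the direct call to the generic helper.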
    • net: dsa: point out the tail taggers · 7a6ffe76
      Vladimir Oltean authored
      The Marvell 88E6060 uses tag_trailer.c and the KSZ8795, KSZ9477 and
      KSZ9893 switches also use tail tags.
      
      Tell that to the DSA core, since this makes a difference for the flow
      dissector. Most switches break the parsing of frame headers, but these
      ones don't, so no flow dissector adjustment needs to be done for them.
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      7a6ffe76
    • net: dsa: add a generic procedure for the flow dissector · 9790cf20
      Vladimir Oltean authored
      For all DSA formats that don't use tail tags, it looks like behind the
      obscure number crunching they're all doing the same thing: locating the
      real EtherType behind the DSA tag. Nonetheless, this is not immediately
      obvious, so create a generic helper for those DSA taggers that put the
      header before the EtherType.
      
      Another assumption for the generic function is that the DSA tags are of
      equal length on RX and on TX. Prior to the previous patch, this was not
      true for ocelot and for gswip. The problem was resolved for ocelot, but
      for gswip it still remains, so that can't use this helper yet.
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      9790cf20
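To make "locating the real EtherType behind the DSA tag" concrete, here is a minimal sketch that works on a raw byte buffer instead of an skb. The function name `real_ethertype` and the frame layout (two MAC addresses, then a tag of `tag_len` bytes, then the EtherType) are assumptions for illustration; the kernel helper operates on different data structures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ETH_ADDRS_LEN 12 /* destination MAC + source MAC */

/*
 * Return the EtherType (host byte order) of a frame whose tag of tag_len
 * bytes sits between the source MAC address and the real EtherType.
 */
static uint16_t real_ethertype(const uint8_t *frame, size_t frame_len,
			       size_t tag_len)
{
	size_t off = ETH_ADDRS_LEN + tag_len;

	assert(frame != NULL && frame_len >= off + 2);
	return (uint16_t)((frame[off] << 8) | frame[off + 1]);
}
```

With `tag_len` set to 0 the function reads what an unaware parser would see, namely the first two bytes of the tag itself, which is why the dissector has to skip the tag.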
    • net: dsa: make the .flow_dissect tagger callback return void · 2e8cb1b3
      Vladimir Oltean authored
      There is no tagger that returns anything other than zero, so just change
      the return type appropriately.
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      2e8cb1b3
    • net: dsa: tag_ocelot: use a short prefix on both ingress and egress · 5124197c
      Vladimir Oltean authored
      There are 2 goals that we follow:
      
      - Reduce the header size
      - Make the header size equal between RX and TX
      
      The issue that required long prefix on RX was the fact that the ocelot
      DSA tag, being put before Ethernet as it is, would overlap with the area
      that a DSA master uses for RX filtering (destination MAC address
      mainly).
      
      Now that we can ask DSA to put the master in promiscuous mode, in theory
      we could remove the prefix altogether and call it a day, but it looks
      like we can't. Using no prefix on ingress, some packets (such as ICMP)
      would be received, while others (such as PTP) would not be received.
      This is because the DSA master we use (enetc) triggers parse errors
      ("MAC rx frame errors") presumably because it sees Ethernet frames with
      a bad length. And indeed, when using no prefix, the EtherType (bytes
      12-13 of the frame, bits 96-111) falls over the REW_VAL field from the
      extraction header, aka the PTP timestamp.
      
      When turning the short (32-bit) prefix on, the EtherType overlaps with
      bits 64-79 of the extraction header, which are a reserved area
      transmitted as zero by the switch. The packets are not dropped by the
      DSA master with a short prefix. Actually, the frames look like this in
      tcpdump (below is a PTP frame, with an extra dsa_8021q tag - dadb 0482 -
      added by a downstream sja1105).
      
      89:0c:a9:f2:01:00 > 88:80:00:0a:00:1d, 802.3, length 0: LLC, \
      	dsap Unknown (0x10) Individual, ssap ProWay NM (0x0e) Response, \
      	ctrl 0x0004: Information, send seq 2, rcv seq 0, \
      	Flags [Response], length 78
      
      0x0000:  8880 000a 001d 890c a9f2 0100 0000 100f  ................
      0x0010:  0400 0000 0180 c200 000e 001f 7b63 0248  ............{c.H
      0x0020:  dadb 0482 88f7 1202 0036 0000 0000 0000  .........6......
      0x0030:  0000 0000 0000 0000 0000 001f 7bff fe63  ............{..c
      0x0040:  0248 0001 1f81 0500 0000 0000 0000 0000  .H..............
      0x0050:  0000 0000 0000 0000 0000 0000            ............
      
      So the short prefix is our new default: we've shortened our RX frames by
      12 octets, increased TX by 4, and headers are now equal between RX and
      TX. Note that we still need promiscuous mode for the DSA master to not
      drop it.
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      5124197c
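The bit ranges quoted in this commit follow from simple offset arithmetic: the DSA master looks for the EtherType at bytes 12-13 of what it receives, and a prefix of P bytes pushes the extraction header back by P bytes. A minimal sketch of that calculation follows; `struct bit_range` and `ethertype_overlap` are hypothetical names, and bits are numbered from the start of the extraction header as in the commit message.

```c
#include <assert.h>

#define ETHERTYPE_BYTE_OFFSET 12 /* where the master expects the EtherType */

struct bit_range {
	int first;
	int last;
};

/*
 * Which bits of the extraction header land where the master expects the
 * EtherType, for a prefix of prefix_bytes placed before that header.
 */
static struct bit_range ethertype_overlap(int prefix_bytes)
{
	struct bit_range r;

	assert(prefix_bytes >= 0 && prefix_bytes <= ETHERTYPE_BYTE_OFFSET);
	r.first = (ETHERTYPE_BYTE_OFFSET - prefix_bytes) * 8;
	r.last = r.first + 15;
	return r;
}
```

A prefix of 0 bytes gives bits 96-111 and the short 4-byte prefix gives bits 64-79, matching the two cases discussed above.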
    • net: dsa: tag_sja1105: request promiscuous mode for master · 707091eb
      Vladimir Oltean authored
      Currently PTP is broken when ports are in standalone mode (the tagger
      keeps printing this message):
      
      sja1105 spi0.1: Expected meta frame, is 01-80-c2-00-00-0e in the DSA master multicast filter?
      
      Sure, one might say "simply add 01-80-c2-00-00-0e to the master's RX
      filter" but things become more complicated because:
      
      - Actually all frames in the 01-80-c2-xx-xx-xx and 01-1b-19-xx-xx-xx
        range are trapped to the CPU automatically
      - The switch mangles bytes 3 and 4 of the MAC address via the incl_srcpt
        ("include source port [in the DMAC]") option, which is how source port
        and switch id identification is done for link-local traffic on RX. But
        this means that an address installed to the RX filter would, at the
        end of the day, not correspond to the final address seen by the DSA
        master.
      
      Assume RX filtering lists on DSA masters are typically too small to
      include all necessary addresses for PTP to work properly on sja1105, and
      just request promiscuous mode unconditionally.
      
      Just an example:
      Assuming the following addresses are trapped to the CPU:
      01-80-c2-00-00-00 to 01-80-c2-00-00-ff
      01-1b-19-00-00-00 to 01-1b-19-00-00-ff
      
      These are 512 addresses.
      Now let's say this is a board with 3 switches, and 4 ports per switch.
      The 512 addresses become 6144 addresses that must be managed by the DSA
      master's RX filtering lists.
      
      This may be refined in the future, but for now, it is simply not worth
      it to add the additional addresses to the master's RX filter, so simply
      request it to become promiscuous as soon as the driver probes.
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      707091eb
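The address count in the example can be reproduced with a few lines of arithmetic. The sketch below assumes, as the commit message implies, that each trapped address shows up in one rewritten variant per (switch, port) pair; the function names are hypothetical.

```c
#include <assert.h>

/* Trapped link-local addresses: number of ranges times addresses per range. */
static unsigned int trapped_addresses(unsigned int ranges,
				      unsigned int per_range)
{
	return ranges * per_range;
}

/*
 * RX filter entries the master would need if every trapped address appears
 * once per (switch, port) pair after the DMAC has been rewritten.
 */
static unsigned int rx_filter_entries(unsigned int trapped,
				      unsigned int switches,
				      unsigned int ports_per_switch)
{
	assert(switches > 0 && ports_per_switch > 0);
	return trapped * switches * ports_per_switch;
}
```

Two ranges of 256 addresses give 512 trapped addresses, and 3 switches with 4 ports each turn those into 512 * 3 * 4 = 6144 filter entries.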
    • net: dsa: allow drivers to request promiscuous mode on master · c3975400
      Vladimir Oltean authored
      Currently DSA assumes that taggers don't mess with the destination MAC
      address of the frames on RX. That is not always the case. Some DSA
      headers are placed before the Ethernet header (ocelot), and others
      simply mangle random bytes from the destination MAC address (sja1105
      with its incl_srcpt option).
      
      Currently the DSA master goes to promiscuous mode automatically when the
      slave devices go too (such as when enslaved to a bridge), but in
      standalone mode this is a problem that needs to be dealt with.
      
      So give drivers the possibility to signal that their tagging protocol
      will get randomly dropped otherwise, and let DSA deal with fixing that.
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      c3975400
    • net: mscc: ocelot: move NPI port configuration to DSA · 2d44b097
      Vladimir Oltean authored
      Remove the ocelot_configure_cpu() function, which was in fact bringing
      up 2 ports: the CPU port module, which both switchdev and DSA have, and
      the NPI port, which only DSA has.
      
      The (non-Ethernet) CPU port module is at a fixed index in the analyzer,
      whereas the NPI port is selected through the "ethernet" property in the
      device tree.
      
      Therefore, the function to set up an NPI port is DSA-specific, so we
      move it there, simplifying the ocelot switch library a little bit.
      
      Cc: Horatiu Vultur <horatiu.vultur@microchip.com>
      Cc: Alexandre Belloni <alexandre.belloni@bootlin.com>
      Cc: UNGLinuxDriver <UNGLinuxDriver@microchip.com>
      Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      2d44b097
    • Revert "vxlan: move encapsulation warning" · 435be28b
      Jakub Kicinski authored
      This reverts commit 546c044c.
      
      Nothing prevents the user from sending frames to "external" VxLAN devices.
      In fact, the kernel itself may generate ICMP chatter.
      
      This is fine, such frames should be dropped.
      
      The point of the "missing encapsulation" warning was that
      frames with missing encap should not make it into vxlan_xmit_one().
      And vxlan_xmit() drops them cleanly, so let it just do that.
      
      Without this revert the warning is triggered by the udp_tunnel_nic.sh
      test, but the minimal repro is:
      
      $ ip link add vxlan0 type vxlan \
             group 239.1.1.1 \
             dev lo \
             dstport 1234 \
             external
      $ ip li set dev vxlan0 up
      
      [  419.165981] vxlan0: Missing encapsulation instructions
      [  419.166551] WARNING: CPU: 0 PID: 1041 at drivers/net/vxlan.c:2889 vxlan_xmit+0x15c0/0x1fc0 [vxlan]
      Signed-off-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      435be28b
    • Merge branch 'devlink-flash-update-overwrite-mask' · cb9e4a73
      David S. Miller authored
      Jacob Keller says:
      
      ====================
      devlink flash update overwrite mask
      
      This series introduces support for a new attribute to the flash update
      command: DEVLINK_ATTR_FLASH_UPDATE_OVERWRITE_MASK.
      
      This attribute is a bitfield which allows userspace to specify what set of
      subfields to overwrite when performing a flash update for a device.
      
      The intention is to support the ability to control the behavior of
      overwriting the configuration and identifying fields in the Intel ice device
      flash update process. This is necessary as the firmware layout for the ice
      device includes some settings and configuration within the same flash
      section as the main firmware binary.
      
      This series, and the accompanying iproute2 series, introduce support for the
      attribute. Once applied, the overwrite support can be invoked via
      devlink:
      
        # overwrite settings
        devlink dev flash pci/0000:af:00.0 file firmware.bin overwrite settings
      
        # overwrite identifiers and settings
        devlink dev flash pci/0000:af:00.0 file firmware.bin overwrite settings overwrite identifiers
      
      To aid in the safe addition of new parameters, first some refactoring is
      done to the .flash_update function: its parameters are converted from a
      series of function arguments into a structure. This makes it easier to add
      the new parameter without changing the signature of the .flash_update
      handler in the future. Additionally, a "supported_flash_update_params" field
      is added to devlink_ops. This field is similar to the ethtool
      "supported_coalesce_params" field. The devlink core will now check that the
      DEVLINK_SUPPORT_FLASH_UPDATE_COMPONENT bit is set before forwarding the
      component attribute. Similarly, the new overwrite attribute will also
      require a supported bit.
      
      Doing these refactors will aid in adding any other attributes in the future,
      and creates a good pattern for other interfaces to use in the future. By
      requiring drivers to opt-in, we reduce the risk of accidentally breaking
      drivers whenever we add an additional parameter. We also reduce boilerplate
      code in drivers which do not support the parameters.
      
      Changes since v9:
      * rebased to current net-next, no other changes
      
      Changes since v7
      * resend, hopefully avoiding the SMTP server issues I experienced on Friday
      
      Changes since v6
      * Rebased to current net-next to resolve conflicts
      * Added changes to the ionic driver that recently merged flash update support
      * Fixed the changes for mlxsw to apply to core instead of spectrum.c after
        the recent refactor.
      * Picked up the review tags from Jakub
      
      Changes since v5
      * Fix *all* of the BIT usage to use _BITUL() (thanks Jakub!)
      
      Changes since v4
      * Renamed nla_overwrite to nla_overwrite_mask at Jiri's suggestion
      * Added "by this device" to the netlink error messages for unsupported
        attributes
      * Removed use of BIT() in the uapi header
      * Fixed the commit message for the netdevsim patch
      * Picked up Jakub's Reviewed-by tag
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      cb9e4a73
    • ice: add support for flash update overwrite mask · 50db1bca
      Jacob Keller authored
      Support the recently added DEVLINK_ATTR_FLASH_UPDATE_OVERWRITE_MASK
      parameter in the ice flash update handler. Convert the overwrite mask
      bitfield into the appropriate preservation level used by the firmware
      when updating.
      
      Because there is no equivalent preservation level for overwriting only
      identifiers, this combination is rejected by the driver as not supported
      with an appropriate extended ACK message.
      Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
      Reviewed-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      50db1bca
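A conversion of this kind might look like the sketch below. Only one detail comes from the commit message: an identifiers-only mask has no matching preservation level and is rejected. The enum names, the bit values, and the levels chosen for the other three combinations are assumptions made for illustration, not the ice driver's actual definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define OVERWRITE_SETTINGS    (1u << 0) /* hypothetical bit values */
#define OVERWRITE_IDENTIFIERS (1u << 1)

/* Hypothetical firmware preservation levels. */
enum preserve_level {
	PRESERVE_ALL,              /* keep settings and identifiers */
	PRESERVE_IDENTIFIERS_ONLY, /* overwrite settings only */
	PRESERVE_NONE,             /* overwrite settings and identifiers */
};

/* Returns 0 and fills *out, or -1 for a combination with no equivalent. */
static int mask_to_preservation(uint32_t mask, enum preserve_level *out)
{
	assert(out != NULL);
	switch (mask) {
	case 0:
		*out = PRESERVE_ALL;
		return 0;
	case OVERWRITE_SETTINGS:
		*out = PRESERVE_IDENTIFIERS_ONLY;
		return 0;
	case OVERWRITE_SETTINGS | OVERWRITE_IDENTIFIERS:
		*out = PRESERVE_NONE;
		return 0;
	default:
		/* Identifiers only: no matching preservation level. */
		return -1;
	}
}
```

A real driver would additionally report the rejection to userspace through an extended ACK message, which this model leaves out.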
    • netdevsim: add support for flash_update overwrite mask · cbb58368
      Jacob Keller authored
      The devlink interface recently gained support for a new "overwrite mask"
      parameter that allows specifying how various sub-sections of a flash
      component are modified when updating.
      
      Add support for this to netdevsim, to enable easily testing the
      interface. Make the allowed overwrite mask values controllable via
      a debugfs parameter. This enables testing a flow where the driver
      rejects an unsupportable overwrite mask.
      Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
      Reviewed-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      cbb58368
    • devlink: introduce flash update overwrite mask · 5d5b4128
      Jacob Keller authored
      Sections of device flash may contain settings or device identifying
      information. When performing a flash update, it is generally expected
      that these settings and identifiers are not overwritten.
      
      However, it may sometimes be useful to allow overwriting these fields
      when performing a flash update. Some examples include, 1) customizing
      the initial device config on first programming, such as overwriting
      default device identifying information, or 2) reverting a device
      configuration to known good state provided in the new firmware image, or
      3) in case it is suspected that current firmware logic for managing the
      preservation of fields during an update is broken.
      
      Although some devices are able to completely separate these types of
      settings and fields into separate components, this is not true for all
      hardware.
      
      To support controlling this behavior, a new
      DEVLINK_ATTR_FLASH_UPDATE_OVERWRITE_MASK is defined. This is an
      nla_bitfield32 which will define what subset of fields in a component
      should be overwritten during an update.
      
      If no bits are specified, or if the overwrite mask is not provided, then
      an update should not overwrite anything, and should maintain the
      settings and identifiers as they are in the previous image.
      
      If the overwrite mask has the DEVLINK_FLASH_OVERWRITE_SETTINGS bit set,
      then the device should be configured to overwrite any of the settings in
      the requested component with settings found in the provided image.
      
      Similarly, if the DEVLINK_FLASH_OVERWRITE_IDENTIFIERS bit is set, the
      device should be configured to overwrite any device identifiers in the
      requested component with the identifiers from the image.
      
      Multiple overwrite modes may be combined to indicate that a combination
      of the sets of fields should be overwritten.
      
      Drivers which support the new overwrite mask must set the
      DEVLINK_SUPPORT_FLASH_UPDATE_OVERWRITE_MASK in the
      supported_flash_update_params field of their devlink_ops.
      Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
      Reviewed-by: Jakub Kicinski <kuba@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      5d5b4128
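The mask semantics described in this commit can be summarized in a few lines. In this sketch the macro names and bit positions are hypothetical stand-ins for the uapi definitions, and `struct overwrite_plan` and `plan_from_mask` exist only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define OVERWRITE_SETTINGS    (1u << 0) /* hypothetical bit values */
#define OVERWRITE_IDENTIFIERS (1u << 1)

/* What an update is allowed to replace with data from the new image. */
struct overwrite_plan {
	bool settings;
	bool identifiers;
};

/* An empty (or absent) mask preserves everything; bits may be combined. */
static struct overwrite_plan plan_from_mask(uint32_t mask)
{
	struct overwrite_plan plan;

	/* This model knows only the two bits above. */
	assert((mask & ~(OVERWRITE_SETTINGS | OVERWRITE_IDENTIFIERS)) == 0);
	plan.settings = (mask & OVERWRITE_SETTINGS) != 0;
	plan.identifiers = (mask & OVERWRITE_IDENTIFIERS) != 0;
	return plan;
}
```

Because each class of fields has its own bit, a request to overwrite both settings and identifiers is the bitwise OR of the two values.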
    • devlink: convert flash_update to use params structure · bc75c054
      Jacob Keller authored
      The devlink core recently gained support for checking whether the driver
      supports a flash_update parameter, via `supported_flash_update_params`.
      However, parameters are specified as function arguments. Adding a new
      parameter still requires modifying the signature of the .flash_update
      callback in all drivers.
      
      Convert the .flash_update function to take a new `struct
      devlink_flash_update_params` instead. By using this structure, and the
      `supported_flash_update_params` bit field, a new parameter to
      flash_update can be added without requiring modification to existing
      drivers.
      
      As before, all parameters except file_name will require driver opt-in.
      Because file_name is a necessary field for the flash_update to make
      sense, no "SUPPORTED" bitflag is provided and it is always considered
      valid. All future additional parameters will require a new bit in the
      supported_flash_update_params bitfield.
      Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
      Reviewed-by: Jakub Kicinski <kuba@kernel.org>
      Cc: Jiri Pirko <jiri@mellanox.com>
      Cc: Jakub Kicinski <kuba@kernel.org>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Michael Chan <michael.chan@broadcom.com>
      Cc: Bin Luo <luobin9@huawei.com>
      Cc: Saeed Mahameed <saeedm@mellanox.com>
      Cc: Leon Romanovsky <leon@kernel.org>
      Cc: Ido Schimmel <idosch@mellanox.com>
      Cc: Danielle Ratson <danieller@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      bc75c054
    • devlink: check flash_update parameter support in net core · 22ec3d23
      Jacob Keller authored
      When implementing .flash_update, drivers which do not support
      per-component update are manually checking the component parameter to
      verify that it is NULL. Without this check, the driver might accept an
      update request with a component specified even though it will not honor
      such a request.
      
      Instead of having each driver check this, move the logic into
      net/core/devlink.c, and use a new `supported_flash_update_params` field
      in the devlink_ops. Drivers which will support per-component update must
      now specify this by setting DEVLINK_SUPPORT_FLASH_UPDATE_COMPONENT in
      the supported_flash_update_params in their devlink_ops.
      
      This helps ensure that drivers do not forget to check for a NULL
      component if they do not support per-component update. This also enables
      a slightly better error message by enabling the core stack to set the
      netlink bad attribute message to indicate precisely the unsupported
      attribute in the message.
      
      Going forward, any new additional parameter to flash update will require
      a bit in the supported_flash_update_params bitfield.
      Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
      Reviewed-by: Jakub Kicinski <kuba@kernel.org>
      Cc: Jiri Pirko <jiri@mellanox.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Michael Chan <michael.chan@broadcom.com>
      Cc: Bin Luo <luobin9@huawei.com>
      Cc: Saeed Mahameed <saeedm@mellanox.com>
      Cc: Leon Romanovsky <leon@kernel.org>
      Cc: Ido Schimmel <idosch@mellanox.com>
      Cc: Danielle Ratson <danieller@mellanox.com>
      Cc: Shannon Nelson <snelson@pensando.io>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      22ec3d23
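The opt-in check that moves into the core can be modelled as follows. This is a sketch under stated assumptions: `SUPPORT_COMPONENT`, `struct flash_request` and `core_check_request` are hypothetical names, and the string written to `bad_attr` merely stands in for the netlink bad-attribute reporting mentioned above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SUPPORT_COMPONENT (1u << 0) /* hypothetical bit value */

struct flash_request {
	const char *file_name; /* always required */
	const char *component; /* NULL when no component was given */
};

/*
 * Core-side check: reject a component the driver did not opt into and
 * report which attribute caused the rejection. Returns 0 or -1.
 */
static int core_check_request(uint32_t supported,
			      const struct flash_request *req,
			      const char **bad_attr)
{
	assert(req != NULL && bad_attr != NULL);
	*bad_attr = NULL;
	if (req->component && !(supported & SUPPORT_COMPONENT)) {
		*bad_attr = "component";
		return -1;
	}
	return 0;
}
```

With the check in one place, a driver that leaves its supported mask at zero never sees a request carrying a component, so it no longer needs its own NULL check.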
    • Merge branch 'simplify-TCP-loss-marking-code' · 6fba737a
      David S. Miller authored
      Yuchung Cheng says:
      
      ====================
      simplify TCP loss marking code
      
      The TCP loss marking is implemented by a set of intertwined
      subroutines. TCP has several loss detection algorithms
      (RACK, RFC6675/FACK, NewReno, etc.), each of which calls a subset of
      these routines to mark a packet lost. This has led to
      various bugs (and fixes and fixes of fixes).
      
      This patch set is to consolidate the loss marking code so
      all detection algorithms call the same routine tcp_mark_skb_lost().
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      6fba737a
    • tcp: consolidate tcp_mark_skb_lost and tcp_skb_mark_lost · 534a2109
      Yuchung Cheng authored
      tcp_skb_mark_lost is used by RFC6675-SACK and can easily be replaced
      with the new tcp_mark_skb_lost handler.
      Signed-off-by: Yuchung Cheng <ycheng@google.com>
      Signed-off-by: Neal Cardwell <ncardwell@google.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      534a2109
    • tcp: simplify tcp_mark_skb_lost · 68698970
      Yuchung Cheng authored
      This patch consolidates and simplifies the loss marking logic used
      by a few loss detections (RACK, RFC6675, NewReno). Previously
      each detection used a subset of several intertwined subroutines.
      This unnecessary complexity has led to bugs (and fixes of bug fixes).
      
      tcp_mark_skb_lost is now the single routine that marks a packet lost
      when a loss detection caller deems an skb lost:
      
         1. rewind tp->retransmit_skb_hint if skb has lower sequence or
            all lost ones have been retransmitted.
      
         2. book-keeping: adjust flags and counts depending on if skb was
            retransmitted or not.
      Signed-off-by: Yuchung Cheng <ycheng@google.com>
      Signed-off-by: Neal Cardwell <ncardwell@google.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      68698970
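The two steps listed in this commit can be sketched as a simplified user-space model. Everything here is an assumption for illustration: `struct seg` and `struct conn` stand in for the skb and socket state, each segment counts as one packet, sequence numbers are compared without wraparound handling, and the exact conditions are one reading of the commit message rather than the patch itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SEG_LOST    (1u << 0) /* hypothetical flag values */
#define SEG_RETRANS (1u << 1)

struct seg {
	uint32_t seq;
	unsigned int flags;
};

struct conn {
	const struct seg *hint;   /* where to resume retransmitting */
	unsigned int lost_out;    /* segments currently marked lost */
	unsigned int retrans_out; /* retransmissions still in flight */
};

static void mark_lost(struct conn *c, struct seg *s)
{
	assert(c != NULL && s != NULL);

	/* Step 1: rewind the hint to the earliest segment needing attention. */
	if ((!c->hint && c->retrans_out >= c->lost_out) ||
	    (c->hint && s->seq < c->hint->seq))
		c->hint = s;

	/* Step 2: book-keeping, depending on the segment's history. */
	if (s->flags & SEG_LOST) {
		if (s->flags & SEG_RETRANS) {
			/* The retransmission itself was lost. */
			s->flags &= ~SEG_RETRANS;
			c->retrans_out--;
		}
	} else {
		s->flags |= SEG_LOST;
		c->lost_out++;
	}
}
```

Having a single entry point means every detection algorithm updates the hint and the counters in the same order, which is the consolidation the commit is after.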
    • tcp: move tcp_mark_skb_lost · fd214674
      Yuchung Cheng authored
      A pure refactor to move tcp_mark_skb_lost to tcp_input.c to prepare
      for the later loss marking consolidation.
      Signed-off-by: Yuchung Cheng <ycheng@google.com>
      Signed-off-by: Neal Cardwell <ncardwell@google.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      fd214674
    • tcp: consistently check retransmit hint · 179ac35f
      Yuchung Cheng authored
      tcp_simple_retransmit(), used for path MTU discovery, may not adjust
      the retransmit hint properly because it deducts retrans_out before
      checking it to adjust the hint. This patch fixes this by using the
      correct routine, tcp_mark_skb_lost(), already used by the RACK loss
      detection.
      Signed-off-by: Yuchung Cheng <ycheng@google.com>
      Signed-off-by: Neal Cardwell <ncardwell@google.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      179ac35f
    • dpaa2-mac: Fix potential null pointer dereference · b4f43483
      Gustavo A. R. Silva authored
      There is a null-check for _pcs_, but it is being dereferenced
      prior to this null-check. So, if _pcs_ can actually be null,
      then there is a potential null pointer dereference that should
      be fixed by null-checking _pcs_ before being dereferenced.
      
      Addresses-Coverity-ID: 1497159 ("Dereference before null check")
      Fixes: 94ae899b ("dpaa2-mac: add PCS support through the Lynx module")
      Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      b4f43483
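The class of bug reported here, a dereference placed before the null check that is meant to guard it, is easy to show in isolation. The sketch below is generic: `struct pcs` and `pcs_id_or_default` are hypothetical and unrelated to the driver's real types.

```c
#include <assert.h>
#include <stddef.h>

struct pcs {
	int id;
};

/*
 * Buggy ordering, shown only as a comment:
 *
 *	int id = pcs->id;	// dereference happens first
 *	if (!pcs)		// check comes too late to help
 *		return fallback;
 *
 * Fixed ordering: check the pointer, then dereference it.
 */
static int pcs_id_or_default(const struct pcs *pcs, int fallback)
{
	if (!pcs)
		return fallback;
	return pcs->id;
}
```

Static analyzers flag the buggy ordering because the late check shows the author considered the pointer nullable, yet the earlier access would already have faulted in that case.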
    • Merge branch 'dpaa2-eth-small-updates' · 9b69e5eb
      David S. Miller authored
      Ioana Ciornei says:
      
      ====================
      dpaa2-eth: small updates
      
      This patch set is just a collection of small updates to the dpaa2-eth
      driver.
      
      First, we only need to check the availability of the DTS child node, not
      both child and parent node. Then remove a call to
      dpaa2_eth_link_state_update(), which is now just a leftover and is not
      useful given how the PHY integration works now. Lastly, modify how the
      driver behaves when the flow steering table is shared between all the
      traffic classes.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
      9b69e5eb
    • dpaa2-eth: install a single steering rule when SHARED_FS is enabled · 5e29c16f
      Ionut-robert Aron authored
      When SHARED_FS is enabled on a DPNI object the flow steering tables are
      shared between all the traffic classes. Modify the driver so that we
      only add a new flow steering entry on the TC#0 when this new option is
      enabled.
      Signed-off-by: Ionut-robert Aron <ionut-robert.aron@nxp.com>
      Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      5e29c16f
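The behavioural change amounts to stopping after the first traffic class when the table is shared. A minimal model is sketched below; `steering_entries` is a hypothetical function that only records which traffic classes would receive an entry, and it does not reflect the driver's real interfaces.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Record in tcs_out the traffic classes that get a steering entry and
 * return how many entries were added. tcs_out must hold num_tcs items.
 */
static int steering_entries(bool shared_fs, int num_tcs, int *tcs_out)
{
	int n = 0;

	assert(tcs_out != NULL && num_tcs > 0);
	for (int tc = 0; tc < num_tcs; tc++) {
		tcs_out[n++] = tc;
		/* A shared table needs the rule only once, on TC#0. */
		if (shared_fs)
			break;
	}
	return n;
}
```

With a shared table the loop adds one entry for TC#0; without it, each traffic class gets its own copy of the rule.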
    • dpaa2-eth: no need to check link state right after ndo_open · 4c33a5bd
      Ioana Ciornei authored
      The call to dpaa2_eth_link_state_update() is a leftover from the time
      when on DPAA2 platforms the PHYs were started at boot time so when an
      ifconfig was issued on the associated interface, the link status needed
      to be checked directly from the ndo_open() callback.
      This is not needed anymore since we are now properly integrated with the
      PHY layer thus a link interrupt will come directly from the PHY
      eventually without the need to call the sync function.
      Fix this up by removing the call to dpaa2_eth_link_state_update().
      Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      4c33a5bd
    • dpaa2-mac: do not check for both child and parent DTS nodes · 98179709
      Ioana Ciornei authored
      There is no need to check if both the MDIO controller node and its
      child node, the PCS device, are available since there is no chance that
      the child node would be enabled when the parent is not.
      Suggested-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      98179709
  2. 25 Sep, 2020 9 commits