1. 22 Nov, 2019 7 commits
  2. 21 Nov, 2019 33 commits
    • Merge branch 'net-introduce-and-use-route-hint' · 7d75c0cb
      David S. Miller authored
      Paolo Abeni says:
      
      ====================
      net: introduce and use route hint
      
      This series leverages the listification infrastructure to avoid
      unnecessary route lookups on ingress packets. In the absence of custom
      rules, packets with equal daddr will usually land on the same dst.

      When processing packet bursts (lists) we can easily reference the previous
      dst entry. When we hit the 'same destination' condition we can avoid the
      route lookup, copying the already available dst.
      
      Detailed performance numbers are available in the individual commit
      messages.
      
      v3 -> v4:
       - move helpers to their own patches (Eric D.)
       - enable hints for SUBTREE builds (David A.)
       - re-enable hints for ipv4 forward (David A.)
      
      v2 -> v3:
       - use fib*_has_custom_rules() helpers (David A.)
       - add ip*_extract_route_hint() helper (Edward C.)
       - use prev skb as hint instead of copying data (Willem)
      
      v1 -> v2:
       - fix build issue with !CONFIG_IP*_MULTIPLE_TABLES
       - fix potential race in ip6_list_rcv_finish()
      ====================
      Acked-by: Edward Cree <ecree@solarflare.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv4: use dst hint for ipv4 list receive · 02b24941
      Paolo Abeni authored
      This is similar to the previous change, with some additional ipv4-specific
      quirks. Even when using the route hint we still have to perform
      additional per-packet checks on source address validity: a new
      helper is added to wrap them.

      Hints are explicitly disabled if the destination is a local broadcast;
      that keeps the code simple, and local broadcasts are a slower path anyway.

      UDP flood performance vs. a recvmmsg() receiver:
      
      vanilla		patched		delta
      Kpps		Kpps		%
      1683		1871		+11
      
      In the worst case scenario - each packet has a different
      destination address - the performance delta is within noise
      range.
      
      v3 -> v4:
       - re-enable hints for forward
      
      v2 -> v3:
       - really fix build (sic) and hint usage check
       - use fib4_has_custom_rules() helpers (David A.)
       - add ip_extract_route_hint() helper (Edward C.)
       - use prev skb as hint instead of copying data (Willem)
      
      v1 -> v2:
       - fix build issue with !CONFIG_IP_MULTIPLE_TABLES
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Reviewed-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv4: move fib4_has_custom_rules() helper to public header · c43c3d76
      Paolo Abeni authored
      So that we can use it in the next patch.
      Additionally constify the helper argument.
      Suggested-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Reviewed-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv6: introduce and uses route look hints for list input. · 197dbf24
      Paolo Abeni authored
      When doing RX batch packet processing, we currently always repeat
      the route lookup for each ingress packet. When no custom rules are
      in place, and there aren't routes depending on source addresses,
      we know that packets with the same destination address will use
      the same dst.
      
      This change tries to avoid the per-packet route lookup by caching
      the destination address of the latest successful lookup, and
      reusing it for the next packet when the above conditions are
      in place. Ingress traffic for most servers should fit.

      The measured performance delta under UDP flood vs. a recvmmsg()
      receiver is as follows:
      
      vanilla		patched		delta
      Kpps		Kpps		%
      1431		1674		+17
      
      In the worst-case scenario - each packet has a different
      destination address - the performance delta is within noise
      range.
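
      The hint check described above can be sketched in user space as
      follows. This is only an illustration under stated assumptions: the
      types and functions (pkt, dst, slow_route_lookup, route_batch) are
      hypothetical stand-ins rather than kernel APIs, and the custom_rules
      flag stands for "custom rules or source-dependent routes are present".

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel's dst entry and skb. */
struct dst { int id; };
struct pkt {
    uint32_t daddr;   /* destination address */
    struct dst *dst;  /* routing decision, NULL until resolved */
};

/* Counts full lookups, only so the saving can be observed. */
static int full_lookups;

/* Stand-in for a full route lookup: one dst per destination bucket. */
static struct dst *slow_route_lookup(uint32_t daddr)
{
    static struct dst table[4];
    struct dst *d = &table[daddr % 4];

    d->id = (int)(daddr % 4);
    full_lookups++;
    return d;
}

/* The previous packet is a valid hint only if routing depends on daddr alone. */
static bool hint_usable(const struct pkt *hint, const struct pkt *cur,
                        bool custom_rules)
{
    return hint && hint->dst && !custom_rules && hint->daddr == cur->daddr;
}

/* Route a burst, reusing the previous packet's dst when the hint applies. */
static void route_batch(struct pkt *pkts, size_t n, bool custom_rules)
{
    struct pkt *hint = NULL;

    for (size_t i = 0; i < n; i++) {
        struct pkt *cur = &pkts[i];

        if (hint_usable(hint, cur, custom_rules))
            cur->dst = hint->dst;                      /* skip the lookup */
        else
            cur->dst = slow_route_lookup(cur->daddr);  /* normal path */
        hint = cur;
    }
}
```

      In this sketch a burst where every packet has a different daddr
      falls back to one full lookup per packet, which corresponds to the
      worst case mentioned above.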
      
      v3 -> v4:
       - support hints for SUBTREE build, too (David A.)
       - several style fixes (Eric)
      
      v2 -> v3:
       - add fib6_has_custom_rules() helpers (David A.)
       - add ip6_extract_route_hint() helper (Edward C.)
       - use hint directly in ip6_list_rcv_finish() (Willem)
      
      v1 -> v2:
       - fix build issue with !CONFIG_IPV6_MULTIPLE_TABLES
       - fix potential race when fib6_has_custom_rules is set
         while processing a packet batch
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Reviewed-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv6: keep track of routes using src · b9b33e7c
      Paolo Abeni authored
      Use a per namespace counter, increment it on successful creation
      of any route using the source address, decrement it on deletion
      of such routes.
      
      This allows us to check easily if the routing decision in the
      current namespace depends on the packet source. Will be used
      by the next patch.
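
      A minimal sketch of such a counter, assuming hypothetical names
      (netns_ipv6_sketch, route_sketch, src_plen) that are not taken from
      the patch itself:

```c
#include <stdbool.h>

/* Hypothetical stand-in for the per-namespace IPv6 state. */
struct netns_ipv6_sketch {
    unsigned int src_routes;  /* installed routes that match on the source */
};

/* Hypothetical route: a non-zero source prefix length means "uses src". */
struct route_sketch {
    unsigned int src_plen;
};

static bool route_uses_src(const struct route_sketch *rt)
{
    return rt->src_plen != 0;
}

/* Call only after the route was successfully created. */
static void route_added(struct netns_ipv6_sketch *ns,
                        const struct route_sketch *rt)
{
    if (route_uses_src(rt))
        ns->src_routes++;
}

/* Call when such a route is deleted. */
static void route_deleted(struct netns_ipv6_sketch *ns,
                          const struct route_sketch *rt)
{
    if (route_uses_src(rt))
        ns->src_routes--;
}

/* Cheap check: does any routing decision here depend on the source? */
static bool routing_depends_on_src(const struct netns_ipv6_sketch *ns)
{
    return ns->src_routes != 0;
}
```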
      Suggested-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Reviewed-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv6: add fib6_has_custom_rules() helper · 1f8ac570
      Paolo Abeni authored
      It wraps the namespace field with the same name, to easily
      access it regardless of build options.
      Suggested-by: David Ahern <dsahern@gmail.com>
      Suggested-by: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Reviewed-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'DSA-Felix-PTP' · 2c44713e
      David S. Miller authored
      Yangbo Lu says:
      
      ====================
      Support PTP clock and hardware timestamping for DSA Felix driver
      
      This patch set adds PTP clock and hardware timestamping support
      to the DSA Felix driver. Some functions in the ocelot.c/ocelot_board.c
      driver were reworked/exported so that the DSA Felix driver can
      reuse them as much as possible.

      On the TX path, timestamping applies only to packets that request a
      timestamp. The injection header is configured accordingly, and a clone
      of each skb that requires a timestamp is added to a list. The TX
      timestamp is finally handled in the threaded interrupt handler when
      the PTP timestamp FIFO is ready.
      On the RX path, timestamping is always active. The RX timestamp can
      be read from the extraction header.
      ====================
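
      The TX flow above (queue a clone, then complete it when the FIFO
      reports a timestamp) could look roughly like the sketch below. It
      assumes that each queued packet carries a small id that the hardware
      echoes back in the FIFO entry; that id, and every name in the code,
      is a hypothetical illustration rather than the driver's actual
      interface.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_PENDING 8

/* Hypothetical record for a cloned packet waiting for its TX timestamp. */
struct pending_tx {
    int in_use;
    uint8_t ts_id;      /* id assumed to be placed in the injection header */
    uint64_t hwtstamp;  /* filled in when the FIFO entry arrives */
};

struct port_sketch {
    struct pending_tx pending[MAX_PENDING];
};

/* TX path: remember a packet that asked for a timestamp. */
static struct pending_tx *add_pending(struct port_sketch *port, uint8_t ts_id)
{
    for (size_t i = 0; i < MAX_PENDING; i++) {
        struct pending_tx *p = &port->pending[i];

        if (!p->in_use) {
            p->in_use = 1;
            p->ts_id = ts_id;
            p->hwtstamp = 0;
            return p;
        }
    }
    return NULL;  /* list full */
}

/* Interrupt thread: match one FIFO entry against the pending list. */
static struct pending_tx *complete_pending(struct port_sketch *port,
                                           uint8_t ts_id, uint64_t ts)
{
    for (size_t i = 0; i < MAX_PENDING; i++) {
        struct pending_tx *p = &port->pending[i];

        if (p->in_use && p->ts_id == ts_id) {
            p->hwtstamp = ts;
            p->in_use = 0;  /* slot may be reused */
            return p;
        }
    }
    return NULL;  /* no packet was waiting for this id */
}
```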
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: ocelot: add hardware timestamping support for Felix · c0bcf537
      Yangbo Lu authored
      This patch reuses ocelot functions as much as possible to enable the
      PTP clock and to support hardware timestamping on Felix.
      On the TX path, timestamping applies only to packets that request a
      timestamp. The injection header is configured accordingly, and a clone
      of each skb that requires a timestamp is added to a list. The TX
      timestamp is finally handled in the threaded interrupt handler when
      the PTP timestamp FIFO is ready.
      On the RX path, timestamping is always active. The RX timestamp can
      be read from the extraction header.
      Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: dsa: ocelot: define PTP registers for felix_vsc9959 · 5df66c48
      Yangbo Lu authored
      This patch is to define PTP registers for felix_vsc9959.
      Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: mscc: ocelot: convert to use ocelot_port_add_txtstamp_skb() · 400928bf
      Yangbo Lu authored
      Convert to use ocelot_port_add_txtstamp_skb() for adding skbs which
      require TX timestamp into list. Export it so that DSA Felix driver
      could reuse it too.
      Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: mscc: ocelot: convert to use ocelot_get_txtstamp() · e23a7b3e
      Yangbo Lu authored
      Getting the TX timestamp by reading the timestamp FIFO and matching
      it against the skb list is common to the DSA Felix driver too.
      So move the code out of ocelot_board.c into an
      ocelot_get_txtstamp() function and export it.
      Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: mscc: ocelot: export ocelot_hwstamp_get/set functions · f145922d
      Yangbo Lu authored
      Export ocelot_hwstamp_get/set functions so that DSA driver
      is able to reuse them.
      Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • bpf: skmsg, fix potential psock NULL pointer dereference · 8163999d
      John Fastabend authored
      Report from Dan Carpenter:

       net/core/skmsg.c:792 sk_psock_write_space()
       error: we previously assumed 'psock' could be null (see line 790)

       net/core/skmsg.c
         789          psock = sk_psock(sk);
         790          if (likely(psock && sk_psock_test_state(psock, SK_PSOCK_TX_ENABLED)))
       Check for NULL
         791                  schedule_work(&psock->work);
         792          write_space = psock->saved_write_space;
                                    ^^^^^^^^^^^^^^^^^^^^^^^^
         793          rcu_read_unlock();
         794          write_space(sk);
      
      Ensure psock dereference on line 792 only occurs if psock is not null.
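
      One way to structure such a guard, shown as a user-space sketch and
      not as the patch's actual code. The struct and function names are
      hypothetical, and a counter stands in for schedule_work():

```c
#include <stdbool.h>
#include <stddef.h>

struct sock_sketch;
typedef void (*write_space_fn)(struct sock_sketch *sk);

/* Hypothetical stand-in for the psock state. */
struct psock_sketch {
    bool tx_enabled;
    int work_scheduled;               /* stands in for schedule_work() */
    write_space_fn saved_write_space;
};

struct sock_sketch {
    struct psock_sketch *psock;       /* may be NULL */
    int wakeups;
};

/* Example callback that records it was invoked. */
static void default_write_space(struct sock_sketch *sk)
{
    sk->wakeups++;
}

static void write_space_guarded(struct sock_sketch *sk)
{
    write_space_fn write_space = NULL;
    struct psock_sketch *psock = sk->psock;

    if (psock) {
        /* Both dereferences sit inside the NULL check. */
        if (psock->tx_enabled)
            psock->work_scheduled++;
        write_space = psock->saved_write_space;
    }
    /* Only call the saved callback if one was actually fetched. */
    if (write_space)
        write_space(sk);
}
```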
      Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
      Fixes: 604326b4 ("bpf, sockmap: convert to generic sk_msg interface")
      Signed-off-by: John Fastabend <john.fastabend@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • audit: Move audit_log_task declaration under CONFIG_AUDITSYSCALL · 7599a896
      Jiri Olsa authored
      The 0-DAY robot found that the audit_log_task declaration is not
      guarded by CONFIG_AUDITSYSCALL, which causes a build error when
      that option is not defined:

          kernel/bpf/syscall.o: In function `bpf_audit_prog.isra.30':
       >> syscall.c:(.text+0x860): undefined reference to `audit_log_task'

      Add the audit_log_task declaration and stub within the
      CONFIG_AUDITSYSCALL ifdef.
      
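      The declaration-plus-stub pattern can be illustrated with the sketch
      below. The macro SKETCH_AUDITSYSCALL, the function
      sketch_audit_log_task() and its int pointer argument are all
      hypothetical; they only mirror the shape of the fix.

```c
/*
 * With SKETCH_AUDITSYSCALL defined, callers get the real declaration and
 * the definition must be linked in from elsewhere. Without it, callers
 * get an empty inline stub, so no undefined reference can occur.
 */
#ifdef SKETCH_AUDITSYSCALL
void sketch_audit_log_task(int *records);
#else
static inline void sketch_audit_log_task(int *records)
{
    (void)records;  /* auditing compiled out: nothing to log */
}
#endif

/* A caller that builds and links in both configurations. */
static int load_prog_sketch(void)
{
    int records = 0;

    sketch_audit_log_task(&records);
    return records;
}
```
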
      Fixes: 91e6015b ("bpf: Emit audit messages upon successful prog load and unload")
      Reported-by: kbuild test robot <lkp@intel.com>
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: Fix Kconfig indentation, continued · 43da1411
      Krzysztof Kozlowski authored
      Adjust indentation from spaces to tab (+optional two spaces) as in
      coding style.  This fixes various indentation mixups (seven spaces,
      tab+one space, etc).
      Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • drivers: net: Fix Kconfig indentation, continued · 5421cf84
      Krzysztof Kozlowski authored
      Adjust indentation from spaces to tab (+optional two spaces) as in
      coding style.  This fixes various indentation mixups (seven spaces,
      tab+one space, etc).
      Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • lwtunnel: check erspan options before allocating tun_info · 1841b982
      Xin Long authored
      As Jakub suggested on another patch, it's better to do the check
      on erspan options before allocating memory.
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • lwtunnel: be STRICT to validate the new LWTUNNEL_IP(6)_OPTS · 7b6a70f7
      Xin Long authored
      LWTUNNEL_IP(6)_OPTS are the new items in ip(6)_tun_policy, which
      are parsed by nla_parse_nested_deprecated(). We should check it
      strictly by setting .strict_start_type = LWTUNNEL_IP(6)_OPTS.
      
      This patch also adds missing LWTUNNEL_IP6_OPTS in ip6_tun_policy.
      
      Fixes: 4ece4778 ("lwtunnel: add options setting and dumping for geneve")
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: remove the unnecessary strict_start_type in some policies · f3bed7f8
      Xin Long authored
      ct_policy and mpls_policy are parsed with nla_parse_nested(), which
      does NL_VALIDATE_STRICT validation, so strict_start_type does not
      need to be set: its only purpose is to get some attributes parsed
      with NL_VALIDATE_STRICT.

      This patch removes it, and does the same for rtm_nh_policy, which
      is parsed by nlmsg_parse().
      Suggested-by: Jakub Kicinski <jakub.kicinski@netronome.com>
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'net-sched-support-vxlan-and-erspan-options' · ff998a80
      David S. Miller authored
      Xin Long says:
      
      ====================
      net: sched: support vxlan and erspan options
      
      This patchset is to add vxlan and erspan options support in
      cls_flower and act_tunnel_key. The form is pretty much like
      geneve_opts in:
      
        https://patchwork.ozlabs.org/patch/935272/
        https://patchwork.ozlabs.org/patch/954564/
      
      but only one option is allowed for vxlan and erspan.
      
      v1->v2:
        - see each patch changelog.
      ====================
      Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: sched: allow flower to match erspan options · 79b1011c
      Xin Long authored
      This patch is to allow matching options in erspan.
      
      The options can be described in the form:
      VER:INDEX:DIR:HWID/VER:INDEX_MASK:DIR_MASK:HWID_MASK.
      When ver is set to 1, index will be applied while dir
      and hwid will be ignored, and when ver is set to 2,
      dir and hwid will be used while index will be ignored.
      
      Unlike geneve, only one option can be set. Also, geneve
      options, vxlan options and erspan options can't be set
      at the same time.
      
        # ip link add name erspan1 type erspan external
        # tc qdisc add dev erspan1 ingress
        # tc filter add dev erspan1 protocol ip parent ffff: \
            flower \
              enc_src_ip 10.0.99.192 \
              enc_dst_ip 10.0.99.193 \
              enc_key_id 11 \
              erspan_opts 1:12:0:0/1:ffff:0:0 \
              ip_proto udp \
              action mirred egress redirect dev eth0
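
      To make the VER:INDEX:DIR:HWID rule above concrete, here is a small
      parsing sketch. It is not the tc or kernel parser: the function and
      struct names are hypothetical, fields are read as hexadecimal (an
      assumption suggested by the ffff mask in the example), and the
      value/mask split on '/' is left to the caller.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical holder for one parsed erspan option. */
struct erspan_opt_sketch {
    uint8_t ver;
    uint32_t index;  /* meaningful only when ver == 1 */
    uint8_t dir;     /* meaningful only when ver == 2 */
    uint8_t hwid;    /* meaningful only when ver == 2 */
};

/* Returns 0 on success, -1 on malformed input or unsupported version. */
static int parse_erspan_opt(const char *s, struct erspan_opt_sketch *out)
{
    unsigned int ver, index, dir, hwid;
    char trailing;

    /* Exactly four colon-separated fields; a fifth conversion means junk. */
    if (sscanf(s, "%x:%x:%x:%x%c", &ver, &index, &dir, &hwid, &trailing) != 4)
        return -1;
    if (ver != 1 && ver != 2)
        return -1;
    if (dir > 0xff || hwid > 0xff)
        return -1;  /* would not fit the 8-bit fields of this sketch */

    out->ver = (uint8_t)ver;
    /* ver 1: index applies, dir and hwid are ignored. */
    out->index = (ver == 1) ? index : 0;
    /* ver 2: dir and hwid apply, index is ignored. */
    out->dir = (ver == 2) ? (uint8_t)dir : 0;
    out->hwid = (ver == 2) ? (uint8_t)hwid : 0;
    return 0;
}
```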
      
      v1->v2:
        - improve some err msgs of extack.
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: sched: allow flower to match vxlan options · d8f9dfae
      Xin Long authored
      This patch is to allow matching gbp option in vxlan.
      
      The options can be described in the form GBP/GBP_MASK,
      where GBP is represented as a 32bit hexadecimal value.
      Unlike geneve, only one option can be set. Also,
      geneve options and vxlan options can't be set at
      the same time.
      
        # ip link add name vxlan0 type vxlan dstport 0 external
        # tc qdisc add dev vxlan0 ingress
        # tc filter add dev vxlan0 protocol ip parent ffff: \
            flower \
              enc_src_ip 10.0.99.192 \
              enc_dst_ip 10.0.99.193 \
              enc_key_id 11 \
              vxlan_opts 01020304/ffffffff \
              ip_proto udp \
              action mirred egress redirect dev eth0
      
      v1->v2:
        - add .strict_start_type for enc_opts_policy as Jakub noticed.
        - use Duplicate instead of Wrong in err msg for extack as Jakub
          suggested.
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: sched: add erspan option support to act_tunnel_key · e20d4ff2
      Xin Long authored
      This patch allows setting erspan options using the
      act_tunnel_key action. Unlike geneve options,
      only one option can be set. Also, geneve options,
      vxlan options and erspan options can't be set at the
      same time.

      Options are expressed as ver:index:dir:hwid. When ver
      is set to 1, index is applied while dir and hwid
      are ignored; when ver is set to 2, dir and
      hwid are used while index is ignored.
      
        # ip link add name erspan1 type erspan external
        # tc qdisc add dev eth0 ingress
        # tc filter add dev eth0 protocol ip parent ffff: \
                 flower indev eth0 \
                    ip_proto udp \
                    action tunnel_key \
                        set src_ip 10.0.99.192 \
                        dst_ip 10.0.99.193 \
                        dst_port 6081 \
                        id 11 \
                        erspan_opts 1:2:0:0 \
                action mirred egress redirect dev erspan1
      
      v1->v2:
        - do the validation when dst is not yet allocated as Jakub suggested.
        - use Duplicate instead of Wrong in err msg for extack.
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: sched: add vxlan option support to act_tunnel_key · fca3f91c
      Xin Long authored
      This patch allows setting vxlan options using the
      act_tunnel_key action. Unlike geneve options,
      only one option can be set. Also, geneve options
      and vxlan options can't be set at the same time.
      
      gbp is the only param for vxlan options:
      
        # ip link add name vxlan0 type vxlan dstport 0 external
        # tc qdisc add dev eth0 ingress
        # tc filter add dev eth0 protocol ip parent ffff: \
                 flower indev eth0 \
                    ip_proto udp \
                    action tunnel_key \
                        set src_ip 10.0.99.192 \
                        dst_ip 10.0.99.193 \
                        dst_port 6081 \
                        id 11 \
                        vxlan_opts 01020304 \
                action mirred egress redirect dev vxlan0
      
      v1->v2:
        - add .strict_start_type for enc_opts_policy as Jakub noticed.
        - use Duplicate instead of Wrong in err msg for extack as Jakub
          suggested.
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • octeontx2-af: Fix uninitialized variable in debugfs · 0617aa98
      Dan Carpenter authored
      If rvu_get_blkaddr() fails, then this rvu_cgx_nix_cuml_stats() returns
      zero and we write some uninitialized data into the debugfs output.
      
      On the error paths, the use of the uninitialized "*stat" is harmless,
      but it will lead to a Smatch warning (static analysis) and a UBSan
      warning (runtime analysis) so we should prevent that as well.
      
      Fixes: f967488d ("octeontx2-af: Add per CGX port level NIX Rx/Tx counters")
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • vsock: avoid to assign transport if its initialization fails · 039fccca
      Stefano Garzarella authored
      If transport->init() fails, we can't assign the transport to the
      socket, because it's not initialized correctly, and any future
      calls to the transport callbacks would behave unexpectedly.
      
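      The ordering this implies (initialize first, assign only on success)
      can be sketched as follows. All types and names here are
      hypothetical user-space stand-ins, and -12 is just an example error
      code.

```c
#include <stddef.h>

struct vsock_sketch;

/* Hypothetical transport with only the init callback. */
struct transport_sketch {
    int (*init)(struct vsock_sketch *vsk);
};

struct vsock_sketch {
    const struct transport_sketch *transport;  /* NULL until assigned */
};

static int init_ok(struct vsock_sketch *vsk)
{
    (void)vsk;
    return 0;
}

static int init_fail(struct vsock_sketch *vsk)
{
    (void)vsk;
    return -12;  /* example error code */
}

static const struct transport_sketch good_transport = { init_ok };
static const struct transport_sketch bad_transport = { init_fail };

/* Assign the transport only after its init() succeeded. */
static int assign_transport(struct vsock_sketch *vsk,
                            const struct transport_sketch *t)
{
    int ret = t->init(vsk);

    if (ret)
        return ret;      /* vsk->transport is left untouched */
    vsk->transport = t;
    return 0;
}
```
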
      Fixes: c0cfa2d8 ("vsock: add multi-transports support")
      Reported-and-tested-by: syzbot+e2e5c07bf353b2f79daa@syzkaller.appspotmail.com
      Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
      Reviewed-by: Jorgen Hansen <jhansen@vmware.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: sfp: soft status and control support · f3c9a666
      Russell King authored
      Add support for the soft status and control register, which allows
      TX_FAULT and RX_LOS to be monitored and TX_DISABLE to be set.  We
      make use of this when the board does not support GPIOs for these
      signals.
      Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'sfp-quirks' · 9ce33351
      David S. Miller authored
      Russell King says:
      
      ====================
      Add rudimentary SFP module quirk support
      
      The SFP module EEPROM describes the capabilities of the module, but
      doesn't describe the host interface.  We have a certain amount of
      guess-work to work out how to configure the host - which works most
      of the time.
      
      However, there are some (such as GPON) modules which are able to
      support different host interfaces, such as 1000BASE-X and 2500BASE-X.
      The module will switch between each mode until it achieves link with
      the host.
      
      There is no defined way to describe this in the SFP EEPROM, so we can
      only recognise the module and handle it appropriately.  This series
      adds the necessary recognition of the modules using a quirk system,
      and tweaks the support mask to allow them to link with the host at
      2500BASE-X, thereby allowing the user to achieve full line rate.
      ====================
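
      A quirk system of the kind described above could be table-driven,
      roughly as in this sketch. The struct layout, the mode bit macros
      and the vendor/part strings are assumptions made for illustration;
      the strings are modeled on the module names mentioned in this
      series and are not verified EEPROM contents.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical link-mode bits for the support mask. */
#define MODE_1000BASEX (1u << 0)
#define MODE_2500BASEX (1u << 1)

/* Hypothetical quirk entry: identify a module, then adjust its modes. */
struct sfp_quirk_sketch {
    const char *vendor;
    const char *part;
    void (*fixup)(unsigned int *modes);
};

static void quirk_add_2500basex(unsigned int *modes)
{
    *modes |= MODE_2500BASEX;
}

/* Placeholder identification strings, see the note above. */
static const struct sfp_quirk_sketch quirks[] = {
    { "Huawei", "MA5671A", quirk_add_2500basex },
    { "Alcatel/Lucent", "G-010S-P", quirk_add_2500basex },
};

/* Return the support mask after applying any matching quirk. */
static unsigned int apply_quirks(const char *vendor, const char *part,
                                 unsigned int modes)
{
    for (size_t i = 0; i < sizeof(quirks) / sizeof(quirks[0]); i++) {
        if (strcmp(quirks[i].vendor, vendor) == 0 &&
            strcmp(quirks[i].part, part) == 0)
            quirks[i].fixup(&modes);
    }
    return modes;
}
```

      Modules without a table entry keep the mask derived from their
      EEPROM unchanged.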
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: sfp: add some quirks for GPON modules · b0eae33b
      Russell King authored
      Marc Micalizzi reports that Huawei MA5671A and Alcatel/Lucent G-010S-P
      modules are capable of 2500base-X, but incorrectly report their
      capabilities in the EEPROM.  It seems rather common that GPON modules
      mis-report.
      
      Let's fix these modules by adding some quirks.
      Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: sfp: add support for module quirks · b34bb2cb
      Russell King authored
      Add support for applying module quirks to the list of supported
      ethtool link modes.
      Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
      Reviewed-by: Andrew Lunn <andrew@lunn.ch>
      Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tcp: warn if offset reach the maxlen limit when using snprintf · 9bb59a21
      Hangbin Liu authored
      snprintf returns the number of chars that would be written, not the number
      of chars that were actually written. As such, 'offs' may get larger than
      'tbl.maxlen', making 'tbl.maxlen - offs' negative, and since the
      parameter is a size_t, it would wrap around to a huge value.

      Since using scnprintf may hide the limit error, and the buffer is still
      large enough for now, let's just add a WARN_ON_ONCE in case it reaches
      the limit in the future.
      
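      The snprintf pitfall can be reproduced in user space. The helper
      below is a hypothetical sketch (append_word and its parameters are
      not from the patch) that appends words to a buffer and detects
      truncation from the return value before the offset is advanced:

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Append "word " to buf at *offs. Returns 0 on success and -1 if the
 * word did not fit, in which case the partial output is dropped.
 */
static int append_word(char *buf, size_t maxlen, size_t *offs,
                       const char *word)
{
    int n;

    if (*offs >= maxlen)
        return -1;  /* maxlen - *offs would wrap around as a size_t */

    n = snprintf(buf + *offs, maxlen - *offs, "%s ", word);
    /* snprintf reports the length it wanted, not what it wrote. */
    if (n < 0 || (size_t)n >= maxlen - *offs) {
        buf[*offs] = '\0';  /* discard the truncated tail */
        return -1;
    }
    *offs += (size_t)n;
    return 0;
}
```

      Advancing the offset by the raw return value without this check is
      what lets it exceed the buffer size.
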
      v2: Use WARN_ON_ONCE as Jiri and Eric suggested.
      Suggested-by: Jiri Benc <jbenc@redhat.com>
      Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ip_gre: Make none-tun-dst gre tunnel store tunnel info as metadat_dst in recv · c0d59da7
      wenxu authored
      Currently a collect_md gre tunnel stores the tunnel info (metadata_dst)
      in skb_dst.
      A non-tun-dst gre tunnel can now also add the tunnel header through
      lwtunnel.

      When an ARP request is received on a non-tun-dst gre tunnel, the ARP
      response is sent through the tunnel without tunnel info,
      which leads to the ARP response packet being dropped.

      If the non-tun-dst gre tunnel also stores the tunnel info as metadata_dst,
      the ARP response packet gets the related tunnel info set in
      iptunnel_metadata_reply.
      
      The following is the test script:
      
      ip netns add cl
      ip l add dev vethc type veth peer name eth0 netns cl
      
      ifconfig vethc 172.168.0.7/24 up
      ip l add dev tun1000 type gretap key 1000
      
      ip link add user1000 type vrf table 1
      ip l set user1000 up
      ip l set dev tun1000 master user1000
      ifconfig tun1000 10.0.1.1/24 up
      
      ip netns exec cl ifconfig eth0 172.168.0.17/24 up
      ip netns exec cl ip l add dev tun type gretap local 172.168.0.17 remote 172.168.0.7 key 1000
      ip netns exec cl ifconfig tun 10.0.1.7/24 up
      ip r r 10.0.1.7 encap ip id 1000 dst 172.168.0.17 key dev tun1000 table 1
      
      With this patch,
      ip netns exec cl ping 10.0.1.1 succeeds.
      Signed-off-by: wenxu <wenxu@ucloud.cn>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next · ee5a489f
      David S. Miller authored
      Daniel Borkmann says:
      
      ====================
      pull-request: bpf-next 2019-11-20
      
      The following pull-request contains BPF updates for your *net-next* tree.
      
      We've added 81 non-merge commits during the last 17 day(s) which contain
      a total of 120 files changed, 4958 insertions(+), 1081 deletions(-).
      
      There are 3 trivial conflicts; resolve them by always taking the chunk from
      196e8ca7:
      
      <<<<<<< HEAD
      =======
      void *bpf_map_area_mmapable_alloc(u64 size, int numa_node);
      >>>>>>> 196e8ca7
      
      <<<<<<< HEAD
      void *bpf_map_area_alloc(u64 size, int numa_node)
      =======
      static void *__bpf_map_area_alloc(u64 size, int numa_node, bool mmapable)
      >>>>>>> 196e8ca7
      
      <<<<<<< HEAD
              if (size <= (PAGE_SIZE << PAGE_ALLOC_COSTLY_ORDER)) {
      =======
              /* kmalloc()'ed memory can't be mmap()'ed */
              if (!mmapable && size <= (PAGE_SIZE << PAGE_ALLOC_COSTLY_ORDER)) {
      >>>>>>> 196e8ca7
      
      The main changes are:
      
      1) Addition of BPF trampoline which works as a bridge between kernel functions,
         BPF programs and other BPF programs along with two new use cases: i) fentry/fexit
         BPF programs for tracing with practically zero overhead to call into BPF (as
         opposed to k[ret]probes) and ii) attachment of the former to networking related
         programs to see input/output of networking programs (covering xdpdump use case),
         from Alexei Starovoitov.
      
      2) BPF array map mmap support and use in libbpf for global data maps; also a big
         batch of libbpf improvements, among others, support for reading bitfields in a
         relocatable manner (via libbpf's CO-RE helper API), from Andrii Nakryiko.
      
      3) Extend s390x JIT with usage of relative long jumps and loads in order to lift
         the current 64/512k size limits on JITed BPF programs there, from Ilya Leoshkevich.
      
      4) Add BPF audit support and emit messages upon successful prog load and unload in
         order to have a timeline of events, from Daniel Borkmann and Jiri Olsa.
      
      5) Extension to libbpf and xdpsock sample programs to demo the shared umem mode
         (XDP_SHARED_UMEM) as well as RX-only and TX-only sockets, from Magnus Karlsson.
      
      6) Several follow-up bug fixes for libbpf's auto-pinning code and a new API
         call named bpf_get_link_xdp_info() for retrieving the full set of prog
         IDs attached to XDP, from Toke Høiland-Jørgensen.
      
      7) Add BTF support for array of int, array of struct and multidimensional arrays
         and enable it for skb->cb[] access in kfree_skb test, from Martin KaFai Lau.
      
      8) Fix AF_XDP by using the correct number of channels from ethtool, from Luigi Rizzo.
      
      9) Two fixes for BPF selftests to get rid of a hang in test_tc_tunnel and to avoid
         running xdping standalone, from Jiri Benc.
      
      10) Various BPF selftest fixes when run with latest LLVM trunk, from Yonghong Song.
      
      11) Fix a memory leak in BPF fentry test run data, from Colin Ian King.
      
      12) Various smaller misc cleanups and improvements mostly all over BPF selftests and
          samples, from Daniel T. Lee, Andre Guedes, Anders Roxell, Mao Wenan, Yue Haibing.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>