1. 21 Nov, 2018 2 commits
    • tcp: defer SACK compression after DupThresh · 86de5921
      Eric Dumazet authored
      Jean-Louis reported a TCP regression and bisected to recent SACK
      compression.
      
      After a loss episode (receiver not able to keep up and dropping
      packets because its backlog is full), the Linux TCP stack sends
      a single SACK (DUPACK).

      The sender then waits a full RTO before recovering the losses.
      
      While RFC 6675 says in section 5, "Algorithm Details",
      
         (2) If DupAcks < DupThresh but IsLost (HighACK + 1) returns true --
             indicating at least three segments have arrived above the current
             cumulative acknowledgment point, which is taken to indicate loss
             -- go to step (4).
      ...
         (4) Invoke fast retransmit and enter loss recovery as follows:
      
      there are old TCP stacks not implementing this strategy, and
      still counting the dupacks before starting fast retransmit.
      
      While these stacks probably perform poorly when receivers implement
      LRO/GRO, we should be a little more gentle to them.
      
      This patch makes sure we do not enable SACK compression unless
      3 dupacks have been sent since last rcv_nxt update.
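
The rule in the paragraph above can be modeled in user space roughly as follows. This is a minimal sketch, not the kernel change itself: `struct rx_state`, `must_send_ack_now()` and `DUP_THRESH` are hypothetical names chosen for illustration, and the real patch works on the kernel's TCP socket state.

```c
#include <stdbool.h>
#include <stdint.h>

#define DUP_THRESH 3 /* classic fast-retransmit duplicate-ACK threshold */

/* Hypothetical receiver-side state, for illustration only. */
struct rx_state {
    uint32_t rcv_nxt;     /* next in-order sequence number expected */
    uint32_t ack_rcv_nxt; /* rcv_nxt value the dupack counter refers to */
    unsigned int dupacks; /* dupacks sent since rcv_nxt last changed */
};

/*
 * Returns true if the ACK must be sent immediately, false if it may be
 * delayed and compressed. The counter restarts whenever rcv_nxt advances,
 * so each new loss episode gets DUP_THRESH uncompressed dupacks.
 */
bool must_send_ack_now(struct rx_state *s)
{
    if (s->ack_rcv_nxt != s->rcv_nxt) {
        s->ack_rcv_nxt = s->rcv_nxt;
        s->dupacks = 0;
    }
    if (s->dupacks < DUP_THRESH) {
        s->dupacks++;
        return true;
    }
    return false;
}
```

With this model, the first three ACKs generated for a given rcv_nxt go out immediately, which lets a sender that still counts dupacks reach its threshold; only later ACKs become candidates for compression.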
      
      Ideally we should even rearm the timer to send one or two
      more DUPACK if no more packets are coming, but that will
      be work aiming for linux-4.21.
      
      Many thanks to Jean-Louis for bisecting the issue, providing
      packet captures and testing this patch.
      
      Fixes: 5d9f4262 ("tcp: add SACK compression")
      Reported-by: Jean-Louis Dupond <jean-louis@dupond.be>
      Tested-by: Jean-Louis Dupond <jean-louis@dupond.be>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Acked-by: Neal Cardwell <ncardwell@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: skb_scrub_packet(): Scrub offload_fwd_mark · b5dd186d
      Petr Machata authored
      When a packet is trapped and the corresponding SKB marked as
      already-forwarded, it retains this marking even after it is forwarded
      across veth links into another bridge. There, since it ingresses the
      bridge over veth, which doesn't have offload_fwd_mark, it triggers a
      warning in nbp_switchdev_frame_mark().
      
      Then nbp_switchdev_allowed_egress() decides not to allow egress from
      this bridge through another veth, because the SKB is already marked, and
      the mark (of 0) of course matches. Thus the packet is incorrectly
      blocked.
      
      Solve by resetting offload_fwd_mark in skb_scrub_packet(). That
      function is called from tunnels and also from veth, and thus catches the
      cases where traffic is forwarded between bridges and transformed in a
      way that invalidates the marking.
      
      Fixes: 6bc506b4 ("bridge: switchdev: Add forward mark support for stacked devices")
      Fixes: abf4bb6b ("skbuff: Add the offload_mr_fwd_mark field")
      Signed-off-by: Petr Machata <petrm@mellanox.com>
      Suggested-by: Ido Schimmel <idosch@mellanox.com>
      Acked-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 20 Nov, 2018 9 commits
  3. 19 Nov, 2018 25 commits
    • net/mlx5e: Fix failing ethtool query on FEC query error · 9184e51b
      Shay Agroskin authored
      If the FEC caps query fails when executing 'ethtool <interface>',
      the whole callback fails unnecessarily. Fix that by replacing the
      error return code with debug logging only.

      Fixes: 6cfa9460 ("net/mlx5e: Ethtool driver callback for query/set FEC policy")
      Signed-off-by: Shay Agroskin <shayag@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • net/mlx5e: Removed unnecessary warnings in FEC caps query · 64e28334
      Shay Agroskin authored
      Querying interface FEC caps with 'ethtool [int]' after a link reset
      throws a warning regarding link speed.
      This warning is not needed as there is already an indication in
      user space that the link is not up.

      Fixes: 0696d608 ("net/mlx5e: Receive buffer configuration")
      Signed-off-by: Shay Agroskin <shayag@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • net/mlx5e: Fix wrong field name in FEC related functions · febd72f2
      Shay Agroskin authored
      This bug would result in reading wrong FEC capabilities for 10G/40G.
      
      Fixes: 2095b264 ("net/mlx5e: Add port FEC get/set functions")
      Signed-off-by: Shay Agroskin <shayag@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • net/mlx5e: Fix a bug in turning off FEC policy in unsupported speeds · 9cdeaab3
      Shay Agroskin authored
      Some speeds don't support turning FEC policy off. In case a requested
      FEC policy is not supported for a speed (including current speed), its new
      FEC policy would be:
      	no FEC - if disabling FEC is supported for that speed
      	unchanged - else
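
The fallback rule listed above can be sketched as a small selection function. This is an illustration under assumptions, not the driver code: the `FEC_*` bit values and `pick_fec_policy()` are hypothetical, and the supported-policy bitmask stands in for whatever per-speed capability the device reports.

```c
#include <stdint.h>

/* Hypothetical policy bits, for illustration only. */
#define FEC_NONE     (1u << 0)
#define FEC_FIRECODE (1u << 1)
#define FEC_RS       (1u << 2)

/*
 * Choose the FEC policy for one link speed.
 *   supported: bitmask of policies this speed supports
 *   current:   policy currently configured for this speed
 *   requested: policy the user asked for
 */
uint8_t pick_fec_policy(uint8_t supported, uint8_t current, uint8_t requested)
{
    if (supported & requested)
        return requested; /* requested policy is supported: apply it */
    if (supported & FEC_NONE)
        return FEC_NONE;  /* otherwise fall back to no FEC if allowed */
    return current;       /* FEC cannot be disabled: leave unchanged */
}
```

For example, a speed that supports only RS FEC keeps its current policy when Firecode is requested, because neither the requested policy nor "no FEC" is available for it.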
      
      Fixes: 2095b264 ("net/mlx5e: Add port FEC get/set functions")
      Signed-off-by: Shay Agroskin <shayag@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • Merge branch 'ena-hibernation-and-rmmod-bug-fixes' · d7c60210
      David S. Miller authored
      Arthur Kiyanovski says:
      
      ====================
      net: ena: hibernation and rmmod bug fixes
      
      This patchset includes 2 bug fixes:
      1. A fix to a crash during resume from hibernation.
      2. A fix to an illegal memory access during driver removal (e.g. during rmmod)
         which might cause a crash in certain systems.
      
      The subminor number in the driver version is also bumped to indicate
      that the driver was changed.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ena: update driver version from 2.0.1 to 2.0.2 · 4c23738a
      Arthur Kiyanovski authored
      Update driver version due to critical bug fixes.
      Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ena: fix crash during ena_remove() · 58a54b9c
      Arthur Kiyanovski authored
      In ena_remove() we have the following call stack:
      ena_remove()
        unregister_netdev()
        ena_destroy_device()
          netif_carrier_off()
      
      Calling netif_carrier_off() causes linkwatch to try to handle the
      link change event on the already unregistered netdev, which leads
      to a read from an unreadable memory address.
      
      This patch switches the order of the two functions, so that
      netif_carrier_off() is called on a registered netdev.

      To accomplish this fix we also had to:
      1. Remove the setting of the ENA_FLAG_TRIGGER_RESET bit.
      2. Add a sanity check in ena_close().
         Both of the above prevent a double device reset (when calling
         unregister_netdev(), ena_close() is called, but the device was
         already deleted in ena_destroy_device()).
      3. Set the admin_queue running state to false to avoid using it after
         the device was reset (for example when calling
         ena_destroy_all_io_queues() right after ena_com_dev_reset() in
         ena_down()).
      
      Fixes: 944b28aa ("net: ena: fix missing lock during device destruction")
      Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ena: fix crash during failed resume from hibernation · e76ad21d
      Arthur Kiyanovski authored
      During resume from hibernation, if ena_restore_device() fails,
      ena_com_dev_reset() is called and uses the readless read mechanism,
      which was already destroyed by the call to
      ena_com_mmio_reg_read_request_destroy(). This causes a NULL pointer
      dereference.
      
      In this commit we switch the call order of the above two functions
      to avoid this crash.
      
      Fixes: d7703ddb ("net: ena: fix rare bug when failed restart/resume is followed by driver removal")
      Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sctp: not increase stream's incnt before sending addstrm_in request · e1e46479
      Xin Long authored
      Unlike the processing of an addstrm_out request, the receiver handles
      an addstrm_in request by sending back an addstrm_out request to the
      sender, which will increase its stream's in and incnt later.

      Currently stream->incnt is already increased when the addstrm_in
      request is sent out in sctp_send_add_streams(). The wrong stream->incnt
      can even cause a crash when copying stream info from the old stream's
      in to the new one in sctp_process_strreset_addstrm_out().

      Fix it by simply removing the stream->incnt change from
      sctp_send_add_streams().
      
      Fixes: 242bd2d5 ("sctp: implement sender-side procedures for Add Incoming/Outgoing Streams Request Parameter")
      Reported-by: Jianwen Ji <jiji@redhat.com>
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/mlx5e: Fix selftest for small MTUs · 228c4cd0
      Valentine Fatiev authored
      The loopback test had a fixed packet size, which can be bigger than the
      configured MTU. Shorten the loopback packet size to be bigger than minimal MTU
      allowed by the device. The text field is removed from struct 'mlx5ehdr'
      as redundant, to allow sending packets as small as the minimal allowed MTU.
      
      Fixes: d605d668 ("net/mlx5e: Add support for ethtool self diagnostics test")
      Signed-off-by: Valentine Fatiev <valentinef@mellanox.com>
      Reviewed-by: Eran Ben Elisha <eranbe@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • net/mlx5e: RX, verify received packet size in Linear Striding RQ · 0073c8f7
      Moshe Shemesh authored
      In case of striding RQ, we use  MPWRQ (Multi Packet WQE RQ), which means
      that WQE (RX descriptor) can be used for many packets and so the WQE is
      much bigger than MTU.  In virtualization setups where the port mtu can
      be larger than the vf mtu, if received packet is bigger than MTU, it
      won't be dropped by HW on too small receive WQE. If we use linear SKB in
      striding RQ, since each stride has room for mtu size payload and skb
      info, an oversized packet can lead to crash for crossing allocated page
      boundary upon the call to build_skb. So driver needs to check packet
      size and drop it.
      
      Introduce new SW rx counter, rx_oversize_pkts_sw_drop, which counts the
      number of packets dropped by the driver for being too large.
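
The check and counter described above might look roughly like this in a user-space model. It is only a sketch under assumptions: `struct rq_model`, its `max_pkt_size` field and `rq_accept_packet()` are hypothetical names, and the counter field is simply named after the rx_oversize_pkts_sw_drop statistic.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical receive-queue model, for illustration only. */
struct rq_model {
    uint32_t max_pkt_size;          /* largest packet that fits one stride */
    uint64_t oversize_pkts_sw_drop; /* packets dropped for being too large */
};

/*
 * Decide whether a received packet may be turned into a linear SKB.
 * An oversized packet is dropped and counted instead of being allowed
 * to run past the memory reserved for its stride.
 */
bool rq_accept_packet(struct rq_model *rq, uint32_t byte_cnt)
{
    if (byte_cnt > rq->max_pkt_size) {
        rq->oversize_pkts_sw_drop++;
        return false;
    }
    return true;
}
```

The point of the sketch is that the size check happens in software before any buffer is built, since the hardware does not drop the packet when the WQE is larger than the MTU.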
      
      As a new field is added to the RQ struct, re-open the channels whenever
      this field is being used in datapath (i.e., in the case of linear
      Striding RQ).
      
      Fixes: 619a8f2a ("net/mlx5e: Use linear SKB in Striding RQ")
      Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
      Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • net/mlx5e: Apply the correct check for supporting TC esw rules split · 1392f44b
      Roi Dayan authored
      The mirror count, and not the output count, is the one denoting a split.
      Fix to condition the offload attempt on the mirror count being > 0,
      along with the firmware having the related capability.
      
      Fixes: 592d3651 ("net/mlx5e: Parse mirroring action for offloaded TC eswitch flows")
      Signed-off-by: Roi Dayan <roid@mellanox.com>
      Reviewed-by: Yossi Kuperman <yossiku@mellanox.com>
      Reviewed-by: Chris Mi <chrism@mellanox.com>
      Acked-by: Or Gerlitz <ogerlitz@mellanox.com>
      Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • net/mlx5e: Adjust to max number of channles when re-attaching · a1f240f1
      Yuval Avnery authored
      When the core driver enters the detach/attach flow after a PCI reset,
      the number of logical CPUs may have changed.
      As a result we need to update the CPU-affiliated resource tables:
      	1. indirect rqt list
      	2. eq table
      
      Reproduction (PowerPC):
      	echo 1000 > /sys/kernel/debug/powerpc/eeh_max_freezes
      	ppc64_cpu --smt=on
      	# Restart driver
      	modprobe -r ... ; modprobe ...
      	# Link up
      	ifconfig ...
      	# Only physical CPUs
      	ppc64_cpu --smt=off
      	# Inject PCI errors so PCI will reset - calling the pci error handler
      	echo 0x8000000000000000 > /sys/kernel/debug/powerpc/<PCI BUS>/err_injct_inboundA
      
      Call trace when trying to add non-existing rqs to an indirect rqt:
      	mlx5e_redirect_rqt+0x84/0x260 [mlx5_core] (unreliable)
      	mlx5e_redirect_rqts+0x188/0x190 [mlx5_core]
      	mlx5e_activate_priv_channels+0x488/0x570 [mlx5_core]
      	mlx5e_open_locked+0xbc/0x140 [mlx5_core]
      	mlx5e_open+0x50/0x130 [mlx5_core]
      	mlx5e_nic_enable+0x174/0x1b0 [mlx5_core]
      	mlx5e_attach_netdev+0x154/0x290 [mlx5_core]
      	mlx5e_attach+0x88/0xd0 [mlx5_core]
      	mlx5_attach_device+0x168/0x1e0 [mlx5_core]
      	mlx5_load_one+0x1140/0x1210 [mlx5_core]
      	mlx5_pci_resume+0x6c/0xf0 [mlx5_core]
      
      Create cq will fail when trying to use non-existing EQ.
      
      Fixes: 89d44f0a ("net/mlx5_core: Add pci error handlers to mlx5_core driver")
      Signed-off-by: Yuval Avnery <yuvalav@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • net/mlx5e: Always use the match level enum when parsing TC rule match · 83621b7d
      Or Gerlitz authored
      We get the match level (none, l2, l3, l4) while going over the match
      dissectors of an offloaded tc rule. When doing this, the match level
      enum values, and not the min inline enum values, should be used. Fix that.

      This worked by accident because both enums have the same numerical values.
      
      Fixes: d708f902 ('net/mlx5e: Get the required HW match level while parsing TC flow matches')
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Reviewed-by: Roi Dayan <roid@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • net/mlx5e: Claim TC hw offloads support only under a proper build config · 077ecd78
      Or Gerlitz authored
      Currently, we only support tc hw offloads when the eswitch
      support is compiled in, but we are not gating the advertisement
      of the NETIF_F_HW_TC feature on this config being set.
      
      Fix it, and while doing that, also avoid dealing with the feature
      on ethtool when the config is not set.
      
      Fixes: e8f887ac ('net/mlx5e: Introduce tc offload support')
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Reviewed-by: Roi Dayan <roid@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • net/mlx5e: Don't match on vlan non-existence if ethertype is wildcarded · d3a80bb5
      Or Gerlitz authored
      For the "all" ethertype we should not care whether the packet has
      vlans. Besides being wrong, the way we did it caused FW error
      for rules such as:
      
      tc filter add dev eth0 protocol all parent ffff: \
      	prio 1 flower skip_sw action drop
      
      b/c the matching meta-data (outer headers bit in struct mlx5_flow_spec)
      wasn't set. Fix that by matching on vlan non-existence only if we were
      also told to match on the ethertype.
      
      Fixes: cee26487 ('net/mlx5e: Set vlan masks for all offloaded TC rules')
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Reported-by: Slava Ovsiienko <viacheslavo@mellanox.com>
      Reviewed-by: Jianbo Liu <jianbol@mellanox.com>
      Reviewed-by: Roi Dayan <roid@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • net/mlx5e: IPoIB, Reset QP after channels are closed · acf3766b
      Denis Drozdov authored
      The mlx5e channels should be closed before mlx5i_uninit_underlay_qp
      puts the QP into RST (reset) state during mlx5i_close. Currently the QP
      state is incorrectly set to RST before the channels are deactivated and
      closed, while mlx5_post_send requests expect the QP to be in RTS
      (Ready To Send) state.
      
      The fix is to keep QP in RTS state until mlx5e channels get closed
      and to reset QP afterwards.
      
      Also this fix is simply correct in order to keep the open/close flow
      symmetric, i.e mlx5i_init_underlay_qp() is called first thing at open,
      the correct thing to do is to call mlx5i_uninit_underlay_qp() last thing
      at close, which is exactly what this patch is doing.
      
      Fixes: dae37456 ("net/mlx5: Support for attaching multiple underlay QPs to root flow table")
      Signed-off-by: Denis Drozdov <denisd@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • net/mlx5: IPSec, Fix the SA context hash key · f2b18732
      Raed Salem authored
      The commit "net/mlx5: Refactor accel IPSec code" introduced a
      bug where a short-lived asynchronous change in the hash key value,
      made by create/release SA context, might happen during an
      asynchronous hash resize operation. This could cause a subsequent
      remove SA context operation to fail, because the key value used
      during the resize is not the same key value used when the remove SA
      context operation is invoked.
      
      This commit fixes the bug by defining the SA context hash key
      such that it includes only fields that never change during the
      lifetime of the SA context object.
      
      Fixes: d6c4f029 ("net/mlx5: Refactor accel IPSec code")
      Signed-off-by: Raed Salem <raeds@mellanox.com>
      Reviewed-by: Aviad Yehezkel <aviadye@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
    • Revert "sctp: remove sctp_transport_pmtu_check" · 69fec325
      Xin Long authored
      This reverts commit 22d7be26.
      
      The dst's MTU in a transport can be updated from a non-SCTP place
      such as xfrm, where the MTU information does not get synced between
      asoc, transport and dst, so the pmtu check in sctp_packet_config
      is still needed.
      Acked-by: Neil Horman <nhorman@tuxdriver.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sctp: not allow to set asoc prsctp_enable by sockopt · cc3ccf26
      Xin Long authored
      As rfc7496#section4.5 says about SCTP_PR_SUPPORTED:
      
         This socket option allows the enabling or disabling of the
         negotiation of PR-SCTP support for future associations.  For existing
         associations, it allows one to query whether or not PR-SCTP support
         was negotiated on a particular association.
      
      It means only sctp sock's prsctp_enable can be set.
      
      Note that for the limitation of SCTP_{CURRENT|ALL}_ASSOC, we will
      add it when introducing SCTP_{FUTURE|CURRENT|ALL}_ASSOC for linux
      sctp in another patchset.
      
      v1->v2:
        - drop the params.assoc_id check as Neil suggested.
      
      Fixes: 28aa4c26 ("sctp: add SCTP_PR_SUPPORTED on sctp sockopt")
      Reported-by: Ying Xu <yinxu@redhat.com>
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sctp: count sk_wmem_alloc by skb truesize in sctp_packet_transmit · 02968ccf
      Xin Long authored
      Now sctp increases sk_wmem_alloc by 1 when doing set_owner_w for the
      skb allocated in sctp_packet_transmit and decreases it by 1 when
      freeing this skb.

      But when this skb goes through the networking stack, some subcomponents
      might change skb->truesize and add the same amount to sk_wmem_alloc.
      However, sctp doesn't know the amount to decrease by, which causes
      a leak on sk->sk_wmem_alloc, and the sock can never be freed.
      
      Xiumei found this issue when it hit esp_output_head() by using sctp
      over ipsec, where skb->truesize is added and so is sk->sk_wmem_alloc.
      
      Since sctp has used sk_wmem_queued to count for writable space since
      Commit cd305c74 ("sctp: use sk_wmem_queued to check for writable
      space"), it's ok to fix it by counting sk_wmem_alloc by skb truesize
      in sctp_packet_transmit.
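
The accounting imbalance described above can be reproduced with a toy model. This is a simplified sketch, not kernel code: `struct sock_model`, `struct skb_model` and all function names are hypothetical, and a plain counter stands in for the real sk_wmem_alloc reference counting.

```c
/* Hypothetical stand-ins for the socket and skb, for illustration only. */
struct sock_model {
    long wmem_alloc; /* models sk->sk_wmem_alloc */
};

struct skb_model {
    struct sock_model *sk;
    unsigned int truesize; /* models skb->truesize */
};

/* Old scheme: charge a fixed 1 on ownership, release a fixed 1 on free. */
void owner_set_fixed(struct skb_model *skb, struct sock_model *sk)
{
    skb->sk = sk;
    sk->wmem_alloc += 1;
}

void free_fixed(struct skb_model *skb)
{
    skb->sk->wmem_alloc -= 1;
}

/* New scheme: charge and release by the skb's truesize. */
void owner_set_truesize(struct skb_model *skb, struct sock_model *sk)
{
    skb->sk = sk;
    sk->wmem_alloc += skb->truesize;
}

void free_truesize(struct skb_model *skb)
{
    skb->sk->wmem_alloc -= skb->truesize;
}

/* A lower layer grows the skb and charges the socket by the same amount. */
void grow_skb(struct skb_model *skb, unsigned int extra)
{
    skb->truesize += extra;
    skb->sk->wmem_alloc += extra;
}
```

In the fixed-1 scheme, any growth applied by `grow_skb()` is never released, so the counter stays above zero after the skb is freed. In the truesize scheme the release uses the skb's final truesize, so the growth is paid back as well.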
      
      Fixes: cac2661c ("esp4: Avoid skb_cow_data whenever possible")
      Reported-by: Xiumei Mu <xmu@redhat.com>
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • MAINTAINERS: Add myself as third phylib maintainer · a36b5444
      Heiner Kallweit authored
      Add myself as third phylib maintainer.
      Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
      Acked-by: Andrew Lunn <andrew@lunn.ch>
      Acked-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · f2ce1065
      Linus Torvalds authored
      Pull networking fixes from David Miller:
      
       1) Fix some potentially uninitialized variables and use-after-free in
          kvaser_usb CAN driver, from Jimmy Assarsson.
      
       2) Fix leaks in qed driver, from Denis Bolotin.
      
       3) Socket leak in l2tp, from Xin Long.
      
       4) RSS context allocation fix in bnxt_en from Michael Chan.
      
       5) Fix cxgb4 build errors, from Ganesh Goudar.
      
       6) Route leaks in ipv6 when removing exceptions, from Xin Long.
      
       7) Memory leak in IDR allocation handling of act_pedit, from Davide
          Caratti.
      
       8) Use-after-free of bridge vlan stats, from Nikolay Aleksandrov.
      
       9) When MTU is locked, do not force DF bit on ipv4 tunnels. From
          Sabrina Dubroca.
      
      10) When NAPI cached skb is reused, we must set it to the proper initial
          state which includes skb->pkt_type. From Eric Dumazet.
      
      11) Lockdep and non-linear SKB handling fix in tipc from Jon Maloy.
      
      12) Set RX queue properly in various tuntap receive paths, from Matthew
          Cover.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (61 commits)
        tuntap: fix multiqueue rx
        ipv6: Fix PMTU updates for UDP/raw sockets in presence of VRF
        tipc: don't assume linear buffer when reading ancillary data
        tipc: fix lockdep warning when reinitilaizing sockets
        net-gro: reset skb->pkt_type in napi_reuse_skb()
        tc-testing: tdc.py: Guard against lack of returncode in executed command
        tc-testing: tdc.py: ignore errors when decoding stdout/stderr
        ip_tunnel: don't force DF when MTU is locked
        MAINTAINERS: Add entry for CAKE qdisc
        net: bridge: fix vlan stats use-after-free on destruction
        socket: do a generic_file_splice_read when proto_ops has no splice_read
        net: phy: mdio-gpio: Fix working over slow can_sleep GPIOs
        Revert "net: phy: mdio-gpio: Fix working over slow can_sleep GPIOs"
        net: phy: mdio-gpio: Fix working over slow can_sleep GPIOs
        net/sched: act_pedit: fix memory leak when IDR allocation fails
        net: lantiq: Fix returned value in case of error in 'xrx200_probe()'
        ipv6: fix a dst leak when removing its exception
        net: mvneta: Don't advertise 2.5G modes
        drivers/net/ethernet/qlogic/qed/qed_rdma.h: fix typo
        net/mlx4: Fix UBSAN warning of signed integer overflow
        ...
    • tuntap: fix multiqueue rx · 8ebebcba
      Matthew Cover authored
      When writing packets to a descriptor associated with a combined queue, the
      packets should end up on that queue.
      
      Before this change all packets written to any descriptor associated with a
      tap interface end up on rx-0, even when the descriptor is associated with a
      different queue.
      
      The rx traffic can be generated by either of the following.
        1. a simple tap program which spins up multiple queues and writes packets
           to each of the file descriptors
        2. tx from a qemu vm with a tap multiqueue netdev
      
      The queue for rx traffic can be observed by either of the following (done
      on the hypervisor in the qemu case).
        1. a simple netmap program which opens and reads from per-queue
           descriptors
        2. configuring RPS and doing per-cpu captures with rxtxcpu
      
      Alternatively, if you printk() the return value of skb_get_rx_queue() just
      before each instance of netif_receive_skb() in tun.c, you will get 65535
      for every skb.
      
      Calling skb_record_rx_queue() to set the rx queue to the queue_index fixes
      the association between descriptor and rx queue.
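
The 65535 value mentioned above falls out of how the rx queue is encoded. The following is a small user-space model, assuming the kernel helpers store the queue index plus one in a 16-bit field so that zero means "not recorded"; `struct skb_model` and the `model_*` functions are hypothetical names, not kernel APIs.

```c
#include <stdint.h>

/* Hypothetical stand-in for the relevant skb field, for illustration only. */
struct skb_model {
    uint16_t queue_mapping; /* 0 means no rx queue was recorded */
};

/* Models skb_record_rx_queue(): store the queue index plus one. */
void model_record_rx_queue(struct skb_model *skb, uint16_t rx_queue)
{
    skb->queue_mapping = (uint16_t)(rx_queue + 1);
}

/* Models skb_get_rx_queue(): subtract one, wrapping if nothing was recorded. */
uint16_t model_get_rx_queue(const struct skb_model *skb)
{
    return (uint16_t)(skb->queue_mapping - 1);
}
```

Reading the queue from an skb on which nothing was recorded wraps around to 65535 in this model, which matches the printk() observation; recording the queue index first makes the read return that index.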
      Signed-off-by: Matthew Cover <matthew.cover@stackpath.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv6: Fix PMTU updates for UDP/raw sockets in presence of VRF · 7ddacfa5
      David Ahern authored
      Preethi reported that PMTU discovery for UDP/raw applications is not
      working in the presence of VRF when the socket is not bound to a device.
      The problem is that ip6_sk_update_pmtu does not consider the L3 domain
      of the skb device if the socket is not bound. Update the function to
      set oif to the L3 master device if relevant.
      
      Fixes: ca254490 ("net: Add VRF support to IPv6 stack")
      Reported-by: Preethi Ramachandra <preethir@juniper.net>
      Signed-off-by: David Ahern <dsahern@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  4. 18 Nov, 2018 4 commits
    • Linux 4.20-rc3 · 9ff01193
      Linus Torvalds authored
    • Merge tag 'libnvdimm-fixes-4.20-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm · 25e19c1f
      Linus Torvalds authored
      Pull libnvdimm fixes from Dan Williams:
       "A small batch of fixes for v4.20-rc3.
      
        The overflow continuation fix addresses something that has been broken
        for several releases. Arguably it could wait even longer, but it's a
        one line fix and this finishes the last of the known address range
        scrub bug reports. The revert addresses a lockdep regression. The unit
        tests are not critical to fix, but no reason to hold this fix back.
      
        Summary:
      
         - Address Range Scrub overflow continuation handling has been broken
           since it was initially merged. It was only recently that error
           injection and platform-BIOS support enabled this corner case to be
           exercised.
      
         - The recent attempt to provide more isolation for the kernel Address
           Range Scrub state machine from userspace initiated sessions
           triggers a lockdep report. Revert and try again at the next merge
           window.

         - Fix a kasan reported buffer overflow in libnvdimm unit test
           infrastructure (nfit_test)"
      
      * tag 'libnvdimm-fixes-4.20-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
        Revert "acpi, nfit: Further restrict userspace ARS start requests"
        acpi, nfit: Fix ARS overflow continuation
        tools/testing/nvdimm: Fix the array size for dimm devices.
    • Merge branch 'akpm' (patches from Andrew) · c67a98c0
      Linus Torvalds authored
      Merge misc fixes from Andrew Morton:
       "16 fixes"
      
      * emailed patches from Andrew Morton <akpm@linux-foundation.org>:
        mm/memblock.c: fix a typo in __next_mem_pfn_range() comments
        mm, page_alloc: check for max order in hot path
        scripts/spdxcheck.py: make python3 compliant
        tmpfs: make lseek(SEEK_DATA/SEK_HOLE) return ENXIO with a negative offset
        lib/ubsan.c: don't mark __ubsan_handle_builtin_unreachable as noreturn
        mm/vmstat.c: fix NUMA statistics updates
        mm/gup.c: fix follow_page_mask() kerneldoc comment
        ocfs2: free up write context when direct IO failed
        scripts/faddr2line: fix location of start_kernel in comment
        mm: don't reclaim inodes with many attached pages
        mm, memory_hotplug: check zone_movable in has_unmovable_pages
        mm/swapfile.c: use kvzalloc for swap_info_struct allocation
        MAINTAINERS: update OMAP MMC entry
        hugetlbfs: fix kernel BUG at fs/hugetlbfs/inode.c:444!
        kernel/sched/psi.c: simplify cgroup_move_task()
        z3fold: fix possible reclaim races
    • Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 03582f33
      Linus Torvalds authored
      Pull scheduler fix from Ingo Molnar:
       "Fix an exec() related scalability/performance regression, which was
        caused by incorrectly calculating load and migrating tasks on exec()
        when they shouldn't be"
      
      * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
        sched/fair: Fix cpu_util_wake() for 'execl' type workloads