1. 04 Feb, 2017 36 commits
    • Christoph Hellwig's avatar
      xfs: don't rely on ->total in xfs_alloc_space_available · e9b77651
      Christoph Hellwig authored
      commit 12ef8301 upstream.
      
      ->total is a bit of an odd parameter passed down to the low-level
      allocator all the way from the high-level callers.  It's supposed to
      contain the maximum number of blocks to be allocated for the whole
      transaction [1].
      
      But in xfs_iomap_write_allocate we only convert existing delayed
      allocations and thus only have a minimal block reservation for the
      current transaction, so xfs_alloc_space_available can't use it for
      the allocation decisions.  Use the maximum of args->total and the
      calculated block requirement to make a decision.  We probably should
      get rid of args->total eventually and instead apply ->minleft more
      broadly, but that will require some extensive changes all over.
      
      [1] which creates lots of confusion as most callers don't decrement it
      once doing a first allocation.  But that's for a separate series.
      Signed-off-by: default avatarChristoph Hellwig <hch@lst.de>
      Reviewed-by: default avatarBrian Foster <bfoster@redhat.com>
      Signed-off-by: default avatarDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e9b77651
    • Christoph Hellwig's avatar
      xfs: adjust allocation length in xfs_alloc_space_available · 6b81365b
      Christoph Hellwig authored
      commit 54fee133 upstream.
      
      We must decide in xfs_alloc_fix_freelist if we can perform an
      allocation from a given AG is possible or not based on the available
      space, and should not fail the allocation past that point on a
      healthy file system.
      
      But currently we have two additional places that second-guess
      xfs_alloc_fix_freelist: xfs_alloc_ag_vextent tries to adjust the
      maxlen parameter to remove the reservation before doing the
      allocation (but ignores the various minium freespace requirements),
      and xfs_alloc_fix_minleft tries to fix up the allocated length
      after we've found an extent, but ignores the reservations and also
      doesn't take the AGFL into account (and thus fails allocations
      for not matching minlen in some cases).
      
      Remove all these later fixups and just correct the maxlen argument
      inside xfs_alloc_fix_freelist once we have the AGF buffer locked.
      Signed-off-by: default avatarChristoph Hellwig <hch@lst.de>
      Reviewed-by: default avatarBrian Foster <bfoster@redhat.com>
      Signed-off-by: default avatarDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      6b81365b
    • Christoph Hellwig's avatar
      xfs: fix bogus minleft manipulations · c63f4d3a
      Christoph Hellwig authored
      commit 255c5162 upstream.
      
      We can't just set minleft to 0 when we're low on space - that's exactly
      what we need minleft for: to protect space in the AG for btree block
      allocations when we are low on free space.
      Signed-off-by: default avatarChristoph Hellwig <hch@lst.de>
      Reviewed-by: default avatarBrian Foster <bfoster@redhat.com>
      Signed-off-by: default avatarDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      c63f4d3a
    • Christoph Hellwig's avatar
      xfs: bump up reserved blocks in xfs_alloc_set_aside · d20e4ad0
      Christoph Hellwig authored
      commit 5149fd32 upstream.
      
      Setting aside 4 blocks globally for bmbt splits isn't all that useful,
      as different threads can allocate space in parallel.  Bump it to 4
      blocks per AG to allow each thread that is currently doing an
      allocation to dip into it separately.  Without that we may no have
      enough reserved blocks if there are enough parallel transactions
      in an almost out space file system that all run into bmap btree
      splits.
      Signed-off-by: default avatarChristoph Hellwig <hch@lst.de>
      Reviewed-by: default avatarBrian Foster <bfoster@redhat.com>
      Signed-off-by: default avatarDarrick J. Wong <darrick.wong@oracle.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d20e4ad0
    • Florian Fainelli's avatar
      net: dsa: Bring back device detaching in dsa_slave_suspend() · 9f42bc4f
      Florian Fainelli authored
      [ Upstream commit f154be24 ]
      
      Commit 448b4482 ("net: dsa: Add lockdep class to tx queues to avoid
      lockdep splat") removed the netif_device_detach() call done in
      dsa_slave_suspend() which is necessary, and paired with a corresponding
      netif_device_attach(), bring it back.
      
      Fixes: 448b4482 ("net: dsa: Add lockdep class to tx queues to avoid lockdep splat")
      Signed-off-by: default avatarFlorian Fainelli <f.fainelli@gmail.com>
      Reviewed-by: default avatarAndrew Lunn <andrew@lunn.ch>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      9f42bc4f
    • Robert Shearman's avatar
      lwtunnel: Fix oops on state free after encap module unload · e972cce0
      Robert Shearman authored
      [ Upstream commit 85c81401 ]
      
      When attempting to free lwtunnel state after the module for the encap
      has been unloaded an oops occurs:
      
      BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
      IP: lwtstate_free+0x18/0x40
      [..]
      task: ffff88003e372380 task.stack: ffffc900001fc000
      RIP: 0010:lwtstate_free+0x18/0x40
      RSP: 0018:ffff88003fd83e88 EFLAGS: 00010246
      RAX: 0000000000000000 RBX: ffff88002bbb3380 RCX: ffff88000c91a300
      [..]
      Call Trace:
       <IRQ>
       free_fib_info_rcu+0x195/0x1a0
       ? rt_fibinfo_free+0x50/0x50
       rcu_process_callbacks+0x2d3/0x850
       ? rcu_process_callbacks+0x296/0x850
       __do_softirq+0xe4/0x4cb
       irq_exit+0xb0/0xc0
       smp_apic_timer_interrupt+0x3d/0x50
       apic_timer_interrupt+0x93/0xa0
      [..]
      Code: e8 6e c6 fc ff 89 d8 5b 5d c3 bb de ff ff ff eb f4 66 90 66 66 66 66 90 55 48 89 e5 53 0f b7 07 48 89 fb 48 8b 04 c5 00 81 d5 81 <48> 8b 40 08 48 85 c0 74 13 ff d0 48 8d 7b 20 be 20 00 00 00 e8
      
      The problem is after the module for the encap can be unloaded the
      corresponding ops is removed and is thus NULL here.
      
      Modules implementing lwtunnel ops should not be allowed to unload
      while there is state alive using those ops, so grab the module
      reference for the ops on creating lwtunnel state and of course release
      the reference when freeing the state.
      
      Fixes: 1104d9ba ("lwtunnel: Add destroy state operation")
      Signed-off-by: default avatarRobert Shearman <rshearma@brocade.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e972cce0
    • Robert Shearman's avatar
      net: Specify the owning module for lwtunnel ops · 89c25886
      Robert Shearman authored
      [ Upstream commit 88ff7334 ]
      
      Modules implementing lwtunnel ops should not be allowed to unload
      while there is state alive using those ops, so specify the owning
      module for all lwtunnel ops.
      Signed-off-by: default avatarRobert Shearman <rshearma@brocade.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      89c25886
    • Bjørn Mork's avatar
      qmi_wwan/cdc_ether: add device ID for HP lt2523 (Novatel E371) WWAN card · 087c2ecb
      Bjørn Mork authored
      [ Upstream commit 5b9f5751 ]
      
      Another rebranded Novatel E371.  qmi_wwan should drive this device, while
      cdc_ether should ignore it.  Even though the USB descriptors are plain
      CDC-ETHER that USB interface is a QMI interface.  Ref commit 7fdb7846
      ("qmi_wwan/cdc_ether: add device IDs for Dell 5804 (Novatel E371) WWAN
      card")
      
      Cc: Dan Williams <dcbw@redhat.com>
      Signed-off-by: default avatarBjørn Mork <bjorn@mork.no>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      087c2ecb
    • WANG Cong's avatar
      af_unix: move unix_mknod() out of bindlock · 93ff5e03
      WANG Cong authored
      [ Upstream commit 0fb44559 ]
      
      Dmitry reported a deadlock scenario:
      
      unix_bind() path:
      u->bindlock ==> sb_writer
      
      do_splice() path:
      sb_writer ==> pipe->mutex ==> u->bindlock
      
      In the unix_bind() code path, unix_mknod() does not have to
      be done with u->bindlock held, since it is a pure fs operation,
      so we can just move unix_mknod() out.
      Reported-by: default avatarDmitry Vyukov <dvyukov@google.com>
      Tested-by: default avatarDmitry Vyukov <dvyukov@google.com>
      Cc: Rainer Weikusat <rweikusat@mobileactivedefense.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: default avatarCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      93ff5e03
    • hayeswang's avatar
      r8152: don't execute runtime suspend if the tx is not empty · 37b27b20
      hayeswang authored
      [ Upstream commit 6a0b76c0 ]
      
      Runtime suspend shouldn't be executed if the tx queue is not empty,
      because the device is not idle.
      Signed-off-by: default avatarHayes Wang <hayeswang@realtek.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      37b27b20
    • David Ahern's avatar
      net: mpls: Fix multipath selection for LSR use case · ad864d9f
      David Ahern authored
      [ Upstream commit 9f427a0e ]
      
      MPLS multipath for LSR is broken -- always selecting the first nexthop
      in the one label case. For example:
      
          $ ip -f mpls ro ls
          100
                  nexthop as to 200 via inet 172.16.2.2  dev virt12
                  nexthop as to 300 via inet 172.16.3.2  dev virt13
          101
                  nexthop as to 201 via inet6 2000:2::2  dev virt12
                  nexthop as to 301 via inet6 2000:3::2  dev virt13
      
      In this example incoming packets have a single MPLS labels which means
      BOS bit is set. The BOS bit is passed from mpls_forward down to
      mpls_multipath_hash which never processes the hash loop because BOS is 1.
      
      Update mpls_multipath_hash to process the entire label stack. mpls_hdr_len
      tracks the total mpls header length on each pass (on pass N mpls_hdr_len
      is N * sizeof(mpls_shim_hdr)). When the label is found with the BOS set
      it verifies the skb has sufficient header for ipv4 or ipv6, and find the
      IPv4 and IPv6 header by using the last mpls_hdr pointer and adding 1 to
      advance past it.
      
      With these changes I have verified the code correctly sees the label,
      BOS, IPv4 and IPv6 addresses in the network header and icmp/tcp/udp
      traffic for ipv4 and ipv6 are distributed across the nexthops.
      
      Fixes: 1c78efa8 ("mpls: flow-based multipath selection")
      Acked-by: default avatarRobert Shearman <rshearma@brocade.com>
      Signed-off-by: default avatarDavid Ahern <dsa@cumulusnetworks.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ad864d9f
    • Ivan Vecera's avatar
      bridge: netlink: call br_changelink() during br_dev_newlink() · 74423145
      Ivan Vecera authored
      [ Upstream commit b6677449 ]
      
      Any bridge options specified during link creation (e.g. ip link add)
      are ignored as br_dev_newlink() does not process them.
      Use br_changelink() to do it.
      
      Fixes: 13323516 ("bridge: implement rtnl_link_ops->changelink")
      Signed-off-by: default avatarIvan Vecera <cera@cera.cz>
      Reviewed-by: default avatarJiri Pirko <jiri@mellanox.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      74423145
    • Eric Dumazet's avatar
      net/mlx5e: Do not recycle pages from emergency reserve · 087dced6
      Eric Dumazet authored
      [ Upstream commit e048fc50 ]
      
      A driver using dev_alloc_page() must not reuse a page allocated from
      emergency memory reserve.
      
      Otherwise all packets using this page will be immediately dropped,
      unless for very specific sockets having SOCK_MEMALLOC bit set.
      
      This issue might be hard to debug, because only a fraction of received
      packets would be dropped.
      
      Fixes: 4415a031 ("net/mlx5e: Implement RX mapped page cache for page recycle")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Cc: Tariq Toukan <tariqt@mellanox.com>
      Cc: Saeed Mahameed <saeedm@mellanox.com>
      Acked-by: default avatarSaeed Mahameed <saeedm@mellanox.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      087dced6
    • Alexey Kodanev's avatar
      tcp: initialize max window for a new fastopen socket · 0c687a73
      Alexey Kodanev authored
      [ Upstream commit 0dbd7ff3 ]
      
      Found that if we run LTP netstress test with large MSS (65K),
      the first attempt from server to send data comparable to this
      MSS on fastopen connection will be delayed by the probe timer.
      
      Here is an example:
      
           < S  seq 0:0 win 43690 options [mss 65495 wscale 7 tfo cookie] length 32
           > S. seq 0:0 ack 1 win 43690 options [mss 65495 wscale 7] length 0
           < .  ack 1 win 342 length 0
      
      Inside tcp_sendmsg(), tcp_send_mss() returns max MSS in 'mss_now',
      as well as in 'size_goal'. This results the segment not queued for
      transmition until all the data copied from user buffer. Then, inside
      __tcp_push_pending_frames(), it breaks on send window test and
      continues with the check probe timer.
      
      Fragmentation occurs in tcp_write_wakeup()...
      
      +0.2 > P. seq 1:43777 ack 1 win 342 length 43776
           < .  ack 43777, win 1365 length 0
           > P. seq 43777:65001 ack 1 win 342 options [...] length 21224
           ...
      
      This also contradicts with the fact that we should bound to the half
      of the window if it is large.
      
      Fix this flaw by correctly initializing max_window. Before that, it
      could have large values that affect further calculations of 'size_goal'.
      
      Fixes: 168a8f58 ("tcp: TCP Fast Open Server - main code path")
      Signed-off-by: default avatarAlexey Kodanev <alexey.kodanev@oracle.com>
      Acked-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      0c687a73
    • Kefeng Wang's avatar
      ipv6: addrconf: Avoid addrconf_disable_change() using RCU read-side lock · 79453ab8
      Kefeng Wang authored
      [ Upstream commit 03e4deff ]
      
      Just like commit 4acd4945 ("ipv6: addrconf: Avoid calling
      netdevice notifiers with RCU read-side lock"), it is unnecessary
      to make addrconf_disable_change() use RCU iteration over the
      netdev list, since it already holds the RTNL lock, or we may meet
      Illegal context switch in RCU read-side critical section.
      Signed-off-by: default avatarKefeng Wang <wangkefeng.wang@huawei.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      79453ab8
    • David Ahern's avatar
      lwtunnel: fix autoload of lwt modules · e9db042d
      David Ahern authored
      [ Upstream commit 9ed59592]
      
      Trying to add an mpls encap route when the MPLS modules are not loaded
      hangs. For example:
      
          CONFIG_MPLS=y
          CONFIG_NET_MPLS_GSO=m
          CONFIG_MPLS_ROUTING=m
          CONFIG_MPLS_IPTUNNEL=m
      
          $ ip route add 10.10.10.10/32 encap mpls 100 via inet 10.100.1.2
      
      The ip command hangs:
      root       880   826  0 21:25 pts/0    00:00:00 ip route add 10.10.10.10/32 encap mpls 100 via inet 10.100.1.2
      
          $ cat /proc/880/stack
          [<ffffffff81065a9b>] call_usermodehelper_exec+0xd6/0x134
          [<ffffffff81065efc>] __request_module+0x27b/0x30a
          [<ffffffff814542f6>] lwtunnel_build_state+0xe4/0x178
          [<ffffffff814aa1e4>] fib_create_info+0x47f/0xdd4
          [<ffffffff814ae451>] fib_table_insert+0x90/0x41f
          [<ffffffff814a8010>] inet_rtm_newroute+0x4b/0x52
          ...
      
      modprobe is trying to load rtnl-lwt-MPLS:
      
      root       881     5  0 21:25 ?        00:00:00 /sbin/modprobe -q -- rtnl-lwt-MPLS
      
      and it hangs after loading mpls_router:
      
          $ cat /proc/881/stack
          [<ffffffff81441537>] rtnl_lock+0x12/0x14
          [<ffffffff8142ca2a>] register_netdevice_notifier+0x16/0x179
          [<ffffffffa0033025>] mpls_init+0x25/0x1000 [mpls_router]
          [<ffffffff81000471>] do_one_initcall+0x8e/0x13f
          [<ffffffff81119961>] do_init_module+0x5a/0x1e5
          [<ffffffff810bd070>] load_module+0x13bd/0x17d6
          ...
      
      The problem is that lwtunnel_build_state is called with rtnl lock
      held preventing mpls_init from registering.
      
      Given the potential references held by the time lwtunnel_build_state it
      can not drop the rtnl lock to the load module. So, extract the module
      loading code from lwtunnel_build_state into a new function to validate
      the encap type. The new function is called while converting the user
      request into a fib_config which is well before any table, device or
      fib entries are examined.
      
      Fixes: 745041e2 ("lwtunnel: autoload of lwt modules")
      Signed-off-by: default avatarDavid Ahern <dsa@cumulusnetworks.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      e9db042d
    • Daniel Gonzalez Cabanelas's avatar
      net: phy: bcm63xx: Utilize correct config_intr function · b335e656
      Daniel Gonzalez Cabanelas authored
      [ Upstream commit cd33b3e0 ]
      
      Commit a1cba561 ("net: phy: Add Broadcom phy library for common
      interfaces") make the BCM63xx PHY driver utilize bcm_phy_config_intr()
      which would appear to do the right thing, except that it does not write
      to the MII_BCM63XX_IR register but to MII_BCM54XX_ECR which is
      different.
      
      This would be causing invalid link parameters and events from being
      generated by the PHY interrupt.
      
      Fixes: a1cba561 ("net: phy: Add Broadcom phy library for common interfaces")
      Signed-off-by: default avatarDaniel Gonzalez Cabanelas <dgcbueu@gmail.com>
      Signed-off-by: default avatarFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      b335e656
    • Eric Dumazet's avatar
      net: fix harmonize_features() vs NETIF_F_HIGHDMA · 948e137a
      Eric Dumazet authored
      [ Upstream commit 7be2c82c ]
      
      Ashizuka reported a highmem oddity and sent a patch for freescale
      fec driver.
      
      But the problem root cause is that core networking stack
      must ensure no skb with highmem fragment is ever sent through
      a device that does not assert NETIF_F_HIGHDMA in its features.
      
      We need to call illegal_highdma() from harmonize_features()
      regardless of CSUM checks.
      
      Fixes: ec5f0615 ("net: Kill link between CSUM and SG features.")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Cc: Pravin Shelar <pshelar@ovn.org>
      Reported-by: default avatar"Ashizuka, Yuusuke" <ashiduka@jp.fujitsu.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      948e137a
    • Lance Richardson's avatar
      vxlan: fix byte order of vxlan-gpe port number · d1c95f9c
      Lance Richardson authored
      [ Upstream commit d5ff72d9 ]
      
      vxlan->cfg.dst_port is in network byte order, so an htons()
      is needed here. Also reduced comment length to stay closer
      to 80 column width (still slightly over, however).
      
      Fixes: e1e5314d ("vxlan: implement GPE")
      Signed-off-by: default avatarLance Richardson <lrichard@redhat.com>
      Acked-by: default avatarJiri Benc <jbenc@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      d1c95f9c
    • Jason Wang's avatar
      virtio-net: restore VIRTIO_HDR_F_DATA_VALID on receiving · 1e7cbb41
      Jason Wang authored
      [ Upstream commit 6391a448 ]
      
      Commit 501db511 ("virtio: don't set VIRTIO_NET_HDR_F_DATA_VALID on
      xmit") in fact disables VIRTIO_HDR_F_DATA_VALID on receiving path too,
      fixing this by adding a hint (has_data_valid) and set it only on the
      receiving path.
      
      Cc: Rolf Neugebauer <rolf.neugebauer@docker.com>
      Signed-off-by: default avatarJason Wang <jasowang@redhat.com>
      Acked-by: default avatarRolf Neugebauer <rolf.neugebauer@docker.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      1e7cbb41
    • Rolf Neugebauer's avatar
      virtio: don't set VIRTIO_NET_HDR_F_DATA_VALID on xmit · 3eab5dd0
      Rolf Neugebauer authored
      [ Upstream commit 501db511 ]
      
      This patch part reverts fd2a0437 and e858fae2 which introduced a
      subtle change in how the virtio_net flags are derived from the SKBs
      ip_summed field.
      
      With the above commits, the flags are set to VIRTIO_NET_HDR_F_DATA_VALID
      when ip_summed == CHECKSUM_UNNECESSARY, thus treating it differently to
      ip_summed == CHECKSUM_NONE, which should be the same.
      
      Further, the virtio spec 1.0 / CS04 explicitly says that
      VIRTIO_NET_HDR_F_DATA_VALID must not be set by the driver.
      
      Fixes: fd2a0437 ("virtio_net: introduce virtio_net_hdr_{from,to}_skb")
      Fixes: e858fae2 (" virtio_net: use common code for virtio_net_hdr and skb GSO conversion")
      Signed-off-by: default avatarRolf Neugebauer <rolf.neugebauer@docker.com>
      Acked-by: default avatarMichael S. Tsirkin <mst@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      3eab5dd0
    • Jamal Hadi Salim's avatar
      net sched actions: fix refcnt when GETing of action after bind · b260a714
      Jamal Hadi Salim authored
      [ Upstream commit 0faa9cb5 ]
      
      Demonstrating the issue:
      
      .. add a drop action
      $sudo $TC actions add action drop index 10
      
      .. retrieve it
      $ sudo $TC -s actions get action gact index 10
      
      	action order 1: gact action drop
      	 random type none pass val 0
      	 index 10 ref 2 bind 0 installed 29 sec used 29 sec
       	Action statistics:
      	Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
      	backlog 0b 0p requeues 0
      
      ... bug 1 above: reference is two.
          Reference is actually 1 but we forget to subtract 1.
      
      ... do a GET again and we see the same issue
          try a few times and nothing changes
      ~$ sudo $TC -s actions get action gact index 10
      
      	action order 1: gact action drop
      	 random type none pass val 0
      	 index 10 ref 2 bind 0 installed 31 sec used 31 sec
       	Action statistics:
      	Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
      	backlog 0b 0p requeues 0
      
      ... lets try to bind the action to a filter..
      $ sudo $TC qdisc add dev lo ingress
      $ sudo $TC filter add dev lo parent ffff: protocol ip prio 1 \
        u32 match ip dst 127.0.0.1/32 flowid 1:1 action gact index 10
      
      ... and now a few GETs:
      $ sudo $TC -s actions get action gact index 10
      
      	action order 1: gact action drop
      	 random type none pass val 0
      	 index 10 ref 3 bind 1 installed 204 sec used 204 sec
       	Action statistics:
      	Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
      	backlog 0b 0p requeues 0
      
      $ sudo $TC -s actions get action gact index 10
      
      	action order 1: gact action drop
      	 random type none pass val 0
      	 index 10 ref 4 bind 1 installed 206 sec used 206 sec
       	Action statistics:
      	Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
      	backlog 0b 0p requeues 0
      
      $ sudo $TC -s actions get action gact index 10
      
      	action order 1: gact action drop
      	 random type none pass val 0
      	 index 10 ref 5 bind 1 installed 235 sec used 235 sec
       	Action statistics:
      	Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
      	backlog 0b 0p requeues 0
      
      .... as can be observed the reference count keeps going up.
      
      After the fix
      
      $ sudo $TC actions add action drop index 10
      $ sudo $TC -s actions get action gact index 10
      
      	action order 1: gact action drop
      	 random type none pass val 0
      	 index 10 ref 1 bind 0 installed 4 sec used 4 sec
       	Action statistics:
      	Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
      	backlog 0b 0p requeues 0
      
      $ sudo $TC -s actions get action gact index 10
      
      	action order 1: gact action drop
      	 random type none pass val 0
      	 index 10 ref 1 bind 0 installed 6 sec used 6 sec
       	Action statistics:
      	Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
      	backlog 0b 0p requeues 0
      
      $ sudo $TC qdisc add dev lo ingress
      $ sudo $TC filter add dev lo parent ffff: protocol ip prio 1 \
        u32 match ip dst 127.0.0.1/32 flowid 1:1 action gact index 10
      
      $ sudo $TC -s actions get action gact index 10
      
      	action order 1: gact action drop
      	 random type none pass val 0
      	 index 10 ref 2 bind 1 installed 32 sec used 32 sec
       	Action statistics:
      	Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
      	backlog 0b 0p requeues 0
      
      $ sudo $TC -s actions get action gact index 10
      
      	action order 1: gact action drop
      	 random type none pass val 0
      	 index 10 ref 2 bind 1 installed 33 sec used 33 sec
       	Action statistics:
      	Sent 0 bytes 0 pkt (dropped 0, overlimits 0 requeues 0)
      	backlog 0b 0p requeues 0
      
      Fixes: aecc5cef ("net sched actions: fix GETing actions")
      Signed-off-by: default avatarJamal Hadi Salim <jhs@mojatatu.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      b260a714
    • Basil Gunn's avatar
      ax25: Fix segfault after sock connection timeout · 2d6b61ec
      Basil Gunn authored
      [ Upstream commit 8a367e74 ]
      
      The ax.25 socket connection timed out & the sock struct has been
      previously taken down ie. sock struct is now a NULL pointer. Checking
      the sock_flag causes the segfault.  Check if the socket struct pointer
      is NULL before checking sock_flag. This segfault is seen in
      timed out netrom connections.
      
      Please submit to -stable.
      Signed-off-by: default avatarBasil Gunn <basil@pacabunga.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      2d6b61ec
    • Jakub Sitnicki's avatar
      ip6_tunnel: Account for tunnel header in tunnel MTU · c7a5df92
      Jakub Sitnicki authored
      [ Upstream commit 02ca0423 ]
      
      With ip6gre we have a tunnel header which also makes the tunnel MTU
      smaller. We need to reserve room for it. Previously we were using up
      space reserved for the Tunnel Encapsulation Limit option
      header (RFC 2473).
      
      Also, after commit b05229f4 ("gre6: Cleanup GREv6 transmit path,
      call common GRE functions") our contract with the caller has
      changed. Now we check if the packet length exceeds the tunnel MTU after
      the tunnel header has been pushed, unlike before.
      
      This is reflected in the check where we look at the packet length minus
      the size of the tunnel header, which is already accounted for in tunnel
      MTU.
      
      Fixes: b05229f4 ("gre6: Cleanup GREv6 transmit path, call common GRE functions")
      Signed-off-by: default avatarJakub Sitnicki <jkbs@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      c7a5df92
    • Masaru Nagai's avatar
      ravb: do not use zero-length alignment DMA descriptor · 08e65070
      Masaru Nagai authored
      [ Upstream commit 8ec3e8a1 ]
      
      Due to alignment requirements of the hardware transmissions are split into
      two DMA descriptors, a small padding descriptor of 0 - 3 bytes in length
      followed by a descriptor for rest of the packet.
      
      In the case of IP packets the first descriptor will never be zero due to
      the way that the stack aligns buffers for IP packets. However, for non-IP
      packets it may be zero.
      
      In that case it has been reported that timeouts occur, presumably because
      transmission stops at the first zero-length DMA descriptor and thus the
      packet is not transmitted. However, in my environment a BUG is triggered as
      follows:
      
      [   20.381417] ------------[ cut here ]------------
      [   20.386054] kernel BUG at lib/swiotlb.c:495!
      [   20.390324] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
      [   20.395805] Modules linked in:
      [   20.398862] CPU: 0 PID: 2089 Comm: mz Not tainted 4.10.0-rc3-00001-gf13ad2db193f #162
      [   20.406689] Hardware name: Renesas Salvator-X board based on r8a7796 (DT)
      [   20.413474] task: ffff80063b1f1900 task.stack: ffff80063a71c000
      [   20.419404] PC is at swiotlb_tbl_map_single+0x178/0x2ec
      [   20.424625] LR is at map_single+0x4c/0x98
      [   20.428629] pc : [<ffff00000839c4c0>] lr : [<ffff00000839c680>] pstate: 800001c5
      [   20.436019] sp : ffff80063a71f9b0
      [   20.439327] x29: ffff80063a71f9b0 x28: ffff80063a20d500
      [   20.444636] x27: ffff000008ed5000 x26: 0000000000000000
      [   20.449944] x25: 000000067abe2adc x24: 0000000000000000
      [   20.455252] x23: 0000000000200000 x22: 0000000000000001
      [   20.460559] x21: 0000000000175ffe x20: ffff80063b2a0010
      [   20.465866] x19: 0000000000000000 x18: 0000ffffcae6fb20
      [   20.471173] x17: 0000ffffa09ba018 x16: ffff0000087c8b70
      [   20.476480] x15: 0000ffffa084f588 x14: 0000ffffa09cfa14
      [   20.481787] x13: 0000ffffcae87ff0 x12: 000000000063abe2
      [   20.487098] x11: ffff000008096360 x10: ffff80063abe2adc
      [   20.492407] x9 : 0000000000000000 x8 : 0000000000000000
      [   20.497718] x7 : 0000000000000000 x6 : ffff000008ed50d0
      [   20.503028] x5 : 0000000000000000 x4 : 0000000000000001
      [   20.508338] x3 : 0000000000000000 x2 : 000000067abe2adc
      [   20.513648] x1 : 00000000bafff000 x0 : 0000000000000000
      [   20.518958]
      [   20.520446] Process mz (pid: 2089, stack limit = 0xffff80063a71c000)
      [   20.526798] Stack: (0xffff80063a71f9b0 to 0xffff80063a720000)
      [   20.532543] f9a0:                                   ffff80063a71fa30 ffff00000839c680
      [   20.540374] f9c0: ffff80063b2a0010 ffff80063b2a0010 0000000000000001 0000000000000000
      [   20.548204] f9e0: 000000000000006e ffff80063b23c000 ffff80063b23c000 0000000000000000
      [   20.556034] fa00: ffff80063b23c000 ffff80063a20d500 000000013b1f1900 0000000000000000
      [   20.563864] fa20: ffff80063ffd18e0 ffff80063b2a0010 ffff80063a71fa60 ffff00000839cd10
      [   20.571694] fa40: ffff80063b2a0010 0000000000000000 ffff80063ffd18e0 000000067abe2adc
      [   20.579524] fa60: ffff80063a71fa90 ffff000008096380 ffff80063b2a0010 0000000000000000
      [   20.587353] fa80: 0000000000000000 0000000000000001 ffff80063a71fac0 ffff00000864f770
      [   20.595184] faa0: ffff80063b23caf0 0000000000000000 0000000000000000 0000000000000140
      [   20.603014] fac0: ffff80063a71fb60 ffff0000087e6498 ffff80063a20d500 ffff80063b23c000
      [   20.610843] fae0: 0000000000000000 ffff000008daeaf0 0000000000000000 ffff000008daeb00
      [   20.618673] fb00: ffff80063a71fc0c ffff000008da7000 ffff80063b23c090 ffff80063a44f000
      [   20.626503] fb20: 0000000000000000 ffff000008daeb00 ffff80063a71fc0c ffff000008da7000
      [   20.634333] fb40: ffff80063b23c090 0000000000000000 ffff800600000037 ffff0000087e63d8
      [   20.642163] fb60: ffff80063a71fbc0 ffff000008807510 ffff80063a692400 ffff80063a20d500
      [   20.649993] fb80: ffff80063a44f000 ffff80063b23c000 ffff80063a69249c 0000000000000000
      [   20.657823] fba0: 0000000000000000 ffff80063a087800 ffff80063b23c000 ffff80063a20d500
      [   20.665653] fbc0: ffff80063a71fc10 ffff0000087e67dc ffff80063a20d500 ffff80063a692400
      [   20.673483] fbe0: ffff80063b23c000 0000000000000000 ffff80063a44f000 ffff80063a69249c
      [   20.681312] fc00: ffff80063a5f1a10 000000103a087800 ffff80063a71fc70 ffff0000087e6b24
      [   20.689142] fc20: ffff80063a5f1a80 ffff80063a71fde8 000000000000000f 00000000000005ea
      [   20.696972] fc40: ffff80063a5f1a10 0000000000000000 000000000000000f ffff00000887fbd0
      [   20.704802] fc60: fffffff43a5f1a80 0000000000000000 ffff80063a71fc80 ffff000008880240
      [   20.712632] fc80: ffff80063a71fd90 ffff0000087c7a34 ffff80063afc7180 0000000000000000
      [   20.720462] fca0: 0000ffffcae6fe18 0000000000000014 0000000060000000 0000000000000015
      [   20.728292] fcc0: 0000000000000123 00000000000000ce ffff0000088d2000 ffff80063b1f1900
      [   20.736122] fce0: 0000000000008933 ffff000008e7cb80 ffff80063a71fd80 ffff0000087c50a4
      [   20.743951] fd00: 0000000000008933 ffff000008e7cb80 ffff000008e7cb80 000000100000000e
      [   20.751781] fd20: ffff80063a71fe4c 0000ffff00000300 0000000000000123 0000000000000000
      [   20.759611] fd40: 0000000000000000 ffff80063b1f0000 000000000000000e 0000000000000300
      [   20.767441] fd60: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
      [   20.775271] fd80: 0000000000000000 0000000000000000 ffff80063a71fda0 ffff0000087c8c20
      [   20.783100] fda0: 0000000000000000 ffff000008082f30 0000000000000000 0000800637260000
      [   20.790930] fdc0: ffffffffffffffff 0000ffffa0903078 0000000000000000 000000001ea87232
      [   20.798760] fde0: 000000000000000f ffff80063a71fe40 ffff800600000014 ffff000000000001
      [   20.806590] fe00: 0000000000000000 0000000000000000 ffff80063a71fde8 0000000000000000
      [   20.814420] fe20: 0000000000000000 0000000000000000 0000000000000000 0000000000000001
      [   20.822249] fe40: 0000000203000011 0000000000000000 0000000000000000 ffff80063a68aa00
      [   20.830079] fe60: ffff80063a68aa00 0000000000000003 0000000000008933 ffff0000081f1b9c
      [   20.837909] fe80: 0000000000000000 ffff000008082f30 0000000000000000 0000800637260000
      [   20.845739] fea0: ffffffffffffffff 0000ffffa07ca81c 0000000060000000 0000000000000015
      [   20.853569] fec0: 0000000000000003 000000001ea87232 000000000000000f 0000000000000000
      [   20.861399] fee0: 0000ffffcae6fe18 0000000000000014 0000000000000300 0000000000000000
      [   20.869228] ff00: 00000000000000ce 0000000000000000 00000000ffffffff 0000000000000000
      [   20.877059] ff20: 0000000000000002 0000ffffcae87ff0 0000ffffa09cfa14 0000ffffa084f588
      [   20.884888] ff40: 0000000000000000 0000ffffa09ba018 0000ffffcae6fb20 000000001ea87010
      [   20.892718] ff60: 0000ffffa09b9000 0000ffffcae6fe30 0000ffffcae6fe18 000000000000000f
      [   20.900548] ff80: 0000000000000003 000000001ea87232 0000000000000000 0000000000000000
      [   20.908378] ffa0: 0000000000000000 0000ffffcae6fdc0 0000ffffa09a7824 0000ffffcae6fdc0
      [   20.916208] ffc0: 0000ffffa0903078 0000000060000000 0000000000000003 00000000000000ce
      [   20.924038] ffe0: 0000000000000000 0000000000000000 ffffffffffffffff ffffffffffffffff
      [   20.931867] Call trace:
      [   20.934312] Exception stack(0xffff80063a71f7e0 to 0xffff80063a71f910)
      [   20.940750] f7e0: 0000000000000000 0001000000000000 ffff80063a71f9b0 ffff00000839c4c0
      [   20.948580] f800: ffff80063a71f840 ffff00000888a6e4 ffff80063a24c418 ffff80063a24c448
      [   20.956410] f820: 0000000000000000 ffff00000811cd54 ffff80063a71f860 ffff80063a24c458
      [   20.964240] f840: ffff80063a71f870 ffff00000888b258 ffff80063a24c418 0000000000000001
      [   20.972070] f860: ffff80063a71f910 ffff80063a7b7028 ffff80063a71f890 ffff0000088825e4
      [   20.979899] f880: 0000000000000000 00000000bafff000 000000067abe2adc 0000000000000000
      [   20.987729] f8a0: 0000000000000001 0000000000000000 ffff000008ed50d0 0000000000000000
      [   20.995560] f8c0: 0000000000000000 0000000000000000 ffff80063abe2adc ffff000008096360
      [   21.003390] f8e0: 000000000063abe2 0000ffffcae87ff0 0000ffffa09cfa14 0000ffffa084f588
      [   21.011219] f900: ffff0000087c8b70 0000ffffa09ba018
      [   21.016097] [<ffff00000839c4c0>] swiotlb_tbl_map_single+0x178/0x2ec
      [   21.022362] [<ffff00000839c680>] map_single+0x4c/0x98
      [   21.027411] [<ffff00000839cd10>] swiotlb_map_page+0xa4/0x138
      [   21.033072] [<ffff000008096380>] __swiotlb_map_page+0x20/0x7c
      [   21.038821] [<ffff00000864f770>] ravb_start_xmit+0x174/0x668
      [   21.044484] [<ffff0000087e6498>] dev_hard_start_xmit+0x8c/0x120
      [   21.050407] [<ffff000008807510>] sch_direct_xmit+0x108/0x1a0
      [   21.056064] [<ffff0000087e67dc>] __dev_queue_xmit+0x194/0x4cc
      [   21.061807] [<ffff0000087e6b24>] dev_queue_xmit+0x10/0x18
      [   21.067214] [<ffff000008880240>] packet_sendmsg+0xf40/0x1220
      [   21.072873] [<ffff0000087c7a34>] sock_sendmsg+0x18/0x2c
      [   21.078097] [<ffff0000087c8c20>] SyS_sendto+0xb0/0xf0
      [   21.083150] [<ffff000008082f30>] el0_svc_naked+0x24/0x28
      [   21.088462] Code: d34bfef7 2a1803f3 1a9f86d6 35fff878 (d4210000)
      [   21.094611] ---[ end trace 5bc544ad491f3814 ]---
      [   21.099234] Kernel panic - not syncing: Fatal exception in interrupt
      [   21.105587] Kernel Offset: disabled
      [   21.109073] Memory Limit: none
      [   21.112126] ---[ end Kernel panic - not syncing: Fatal exception in interrupt
      
      Fixes: 2f45d190 ("ravb: minimize TX data copying")
      Signed-off-by: Masaru Nagai <masaru.nagai.vx@renesas.com
      Signed-off-by: default avatarSimon Horman <horms+renesas@verge.net.au>
      Acked-by: default avatarSergei Shtylyov <sergei.shtylyov@cogentembedded.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      08e65070
    • Eric Dumazet's avatar
      mlx4: do not call napi_schedule() without care · 77ce30dc
      Eric Dumazet authored
      [ Upstream commit 8cf699ec ]
      
      Disable BH around the call to napi_schedule() to avoid following warning
      
      [   52.095499] NOHZ: local_softirq_pending 08
      [   52.421291] NOHZ: local_softirq_pending 08
      [   52.608313] NOHZ: local_softirq_pending 08
      
      Fixes: 8d59de8f ("net/mlx4_en: Process all completions in RX rings after port goes up")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Cc: Erez Shitrit <erezsh@mellanox.com>
      Cc: Eugenia Emantayev <eugenia@mellanox.com>
      Cc: Tariq Toukan <tariqt@mellanox.com>
      Acked-by: default avatarTariq Toukan <tariqt@mellanox.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      77ce30dc
    • Lance Richardson's avatar
      openvswitch: maintain correct checksum state in conntrack actions · 18767acb
      Lance Richardson authored
      [ Upstream commit 75f01a4c ]
      
      When executing conntrack actions on skbuffs with checksum mode
      CHECKSUM_COMPLETE, the checksum must be updated to account for
      header pushes and pulls. Otherwise we get "hw csum failure"
      logs similar to this (ICMP packet received on geneve tunnel
      via ixgbe NIC):
      
      [  405.740065] genev_sys_6081: hw csum failure
      [  405.740106] CPU: 3 PID: 0 Comm: swapper/3 Tainted: G          I     4.10.0-rc3+ #1
      [  405.740108] Call Trace:
      [  405.740110]  <IRQ>
      [  405.740113]  dump_stack+0x63/0x87
      [  405.740116]  netdev_rx_csum_fault+0x3a/0x40
      [  405.740118]  __skb_checksum_complete+0xcf/0xe0
      [  405.740120]  nf_ip_checksum+0xc8/0xf0
      [  405.740124]  icmp_error+0x1de/0x351 [nf_conntrack_ipv4]
      [  405.740132]  nf_conntrack_in+0xe1/0x550 [nf_conntrack]
      [  405.740137]  ? find_bucket.isra.2+0x62/0x70 [openvswitch]
      [  405.740143]  __ovs_ct_lookup+0x95/0x980 [openvswitch]
      [  405.740145]  ? netif_rx_internal+0x44/0x110
      [  405.740149]  ovs_ct_execute+0x147/0x4b0 [openvswitch]
      [  405.740153]  do_execute_actions+0x22e/0xa70 [openvswitch]
      [  405.740157]  ovs_execute_actions+0x40/0x120 [openvswitch]
      [  405.740161]  ovs_dp_process_packet+0x84/0x120 [openvswitch]
      [  405.740166]  ovs_vport_receive+0x73/0xd0 [openvswitch]
      [  405.740168]  ? udp_rcv+0x1a/0x20
      [  405.740170]  ? ip_local_deliver_finish+0x93/0x1e0
      [  405.740172]  ? ip_local_deliver+0x6f/0xe0
      [  405.740174]  ? ip_rcv_finish+0x3a0/0x3a0
      [  405.740176]  ? ip_rcv_finish+0xdb/0x3a0
      [  405.740177]  ? ip_rcv+0x2a7/0x400
      [  405.740180]  ? __netif_receive_skb_core+0x970/0xa00
      [  405.740185]  netdev_frame_hook+0xd3/0x160 [openvswitch]
      [  405.740187]  __netif_receive_skb_core+0x1dc/0xa00
      [  405.740194]  ? ixgbe_clean_rx_irq+0x46d/0xa20 [ixgbe]
      [  405.740197]  __netif_receive_skb+0x18/0x60
      [  405.740199]  netif_receive_skb_internal+0x40/0xb0
      [  405.740201]  napi_gro_receive+0xcd/0x120
      [  405.740204]  gro_cell_poll+0x57/0x80 [geneve]
      [  405.740206]  net_rx_action+0x260/0x3c0
      [  405.740209]  __do_softirq+0xc9/0x28c
      [  405.740211]  irq_exit+0xd9/0xf0
      [  405.740213]  do_IRQ+0x51/0xd0
      [  405.740215]  common_interrupt+0x93/0x93
      
      Fixes: 7f8a436e ("openvswitch: Add conntrack action")
      Signed-off-by: default avatarLance Richardson <lrichard@redhat.com>
      Acked-by: default avatarPravin B Shelar <pshelar@ovn.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      18767acb
    • Shannon Nelson's avatar
      tcp: fix tcp_fastopen unaligned access complaints on sparc · 3524f642
      Shannon Nelson authored
      [ Upstream commit 003c9410 ]
      
      Fix up a data alignment issue on sparc by swapping the order
      of the cookie byte array field with the length field in
      struct tcp_fastopen_cookie, and making it a proper union
      to clean up the typecasting.
      
      This addresses log complaints like these:
          log_unaligned: 113 callbacks suppressed
          Kernel unaligned access at TPC[976490] tcp_try_fastopen+0x2d0/0x360
          Kernel unaligned access at TPC[9764ac] tcp_try_fastopen+0x2ec/0x360
          Kernel unaligned access at TPC[9764c8] tcp_try_fastopen+0x308/0x360
          Kernel unaligned access at TPC[9764e4] tcp_try_fastopen+0x324/0x360
          Kernel unaligned access at TPC[976490] tcp_try_fastopen+0x2d0/0x360
      
      Cc: Eric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: default avatarShannon Nelson <shannon.nelson@oracle.com>
      Acked-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      3524f642
    • Florian Fainelli's avatar
      net: systemport: Decouple flow control from __bcm_sysport_tx_reclaim · b66b1f5a
      Florian Fainelli authored
      [ Upstream commit 148d3d02 ]
      
      The __bcm_sysport_tx_reclaim() function is used to reclaim transmit
      resources in different places within the driver. Most of them should
      not affect the state of the transit flow control.
      
      Introduce bcm_sysport_tx_clean() which cleans the ring, but does not
      re-enable flow control towards the networking stack, and make
      bcm_sysport_tx_reclaim() do the actual transmit queue flow control.
      
      Fixes: 80105bef ("net: systemport: add Broadcom SYSTEMPORT Ethernet MAC driver")
      Signed-off-by: default avatarFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      b66b1f5a
    • David Ahern's avatar
      net: ipv4: fix table id in getroute response · 958bb1bd
      David Ahern authored
      [ Upstream commit 8a430ed5 ]
      
      rtm_table is an 8-bit field while table ids are allowed up to u32. Commit
      709772e6 ("net: Fix routing tables with id > 255 for legacy software")
      added the preference to set rtm_table in dumps to RT_TABLE_COMPAT if the
      table id is > 255. The table id returned on get route requests should do
      the same.
      
      Fixes: c36ba660 ("net: Allow user to get table id from route lookup")
      Signed-off-by: default avatarDavid Ahern <dsa@cumulusnetworks.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      958bb1bd
    • David Ahern's avatar
      net: lwtunnel: Handle lwtunnel_fill_encap failure · 6980c52c
      David Ahern authored
      [ Upstream commit ea7a8085 ]
      
      Handle failure in lwtunnel_fill_encap adding attributes to skb.
      
      Fixes: 571e7226 ("ipv4: support for fib route lwtunnel encap attributes")
      Fixes: 19e42e45 ("ipv6: support for fib route lwtunnel encap attributes")
      Signed-off-by: default avatarDavid Ahern <dsa@cumulusnetworks.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      6980c52c
    • Elad Raz's avatar
      mlxsw: pci: Fix EQE structure definition · ec1aa8d4
      Elad Raz authored
      [ Upstream commit 28e46a0f ]
      
      The event_data starts from address 0x00-0x0C and not from 0x08-0x014. This
      leads to duplication with other fields in the Event Queue Element such as
      sub-type, cqn and owner.
      
      Fixes: eda6500a ("mlxsw: Add PCI bus implementation")
      Signed-off-by: default avatarElad Raz <eladr@mellanox.com>
      Signed-off-by: default avatarJiri Pirko <jiri@mellanox.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      ec1aa8d4
    • Arkadi Sharshevsky's avatar
      mlxsw: switchx2: Fix memory leak at skb reallocation · 4ec59d1f
      Arkadi Sharshevsky authored
      [ Upstream commit 400fc010 ]
      
      During transmission the skb is checked for headroom in order to
      add vendor specific header. In case the skb needs to be re-allocated,
      skb_realloc_headroom() is called to make a private copy of the original,
      but doesn't release it. Current code assumes that the original skb is
      released during reallocation and only releases it at the error path
      which causes a memory leak.
      
      Fix this by adding the original skb release to the main path.
      
      Fixes: d003462a ("mlxsw: Simplify mlxsw_sx_port_xmit function")
      Signed-off-by: default avatarArkadi Sharshevsky <arkadis@mellanox.com>
      Signed-off-by: default avatarJiri Pirko <jiri@mellanox.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      4ec59d1f
    • Arkadi Sharshevsky's avatar
      mlxsw: spectrum: Fix memory leak at skb reallocation · 7c249f33
      Arkadi Sharshevsky authored
      [ Upstream commit 36bf38d1 ]
      
      During transmission the skb is checked for headroom in order to
      add vendor specific header. In case the skb needs to be re-allocated,
      skb_realloc_headroom() is called to make a private copy of the original,
      but doesn't release it. Current code assumes that the original skb is
      released during reallocation and only releases it at the error path
      which causes a memory leak.
      
      Fix this by adding the original skb release to the main path.
      
      Fixes: 56ade8fe ("mlxsw: spectrum: Add initial support for Spectrum ASIC")
      Signed-off-by: default avatarArkadi Sharshevsky <arkadis@mellanox.com>
      Reviewed-by: default avatarIdo Schimmel <idosch@mellanox.com>
      Signed-off-by: default avatarJiri Pirko <jiri@mellanox.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      7c249f33
    • stephen hemminger's avatar
      netvsc: add rcu_read locking to netvsc callback · 5b3df440
      stephen hemminger authored
      [ Upstream commit 0719e72c ]
      
      The receive callback (in tasklet context) is using RCU to get reference
      to associated VF network device but this is not safe. RCU read lock
      needs to be held. Found by running with full lockdep debugging
      enabled.
      
      Fixes: f207c10d ("hv_netvsc: use RCU to protect vf_netdev")
      Signed-off-by: default avatarStephen Hemminger <sthemmin@microsoft.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      5b3df440
    • hayeswang's avatar
      r8152: fix the sw rx checksum is unavailable · a37f2311
      hayeswang authored
      [ Upstream commit 19c0f40d ]
      
      Fix the hw rx checksum is always enabled, and the user couldn't switch
      it to sw rx checksum.
      
      Note that the RTL_VER_01 only support sw rx checksum only. Besides,
      the hw rx checksum for RTL_VER_02 is disabled after
      commit b9a321b4 ("r8152: Fix broken RX checksums."). Re-enable it.
      Signed-off-by: default avatarHayes Wang <hayeswang@realtek.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      a37f2311
  2. 01 Feb, 2017 4 commits
    • Greg Kroah-Hartman's avatar
      Linux 4.9.7 · fd2ffe57
      Greg Kroah-Hartman authored
      fd2ffe57
    • Francisco Jerez's avatar
      drm/i915: Remove WaDisableLSQCROPERFforOCL KBL workaround. · b59dd202
      Francisco Jerez authored
      commit 4fc020d8 upstream.
      
      The WaDisableLSQCROPERFforOCL workaround has the side effect of
      disabling an L3SQ optimization that has huge performance implications
      and is unlikely to be necessary for the correct functioning of usual
      graphic workloads.  Userspace is free to re-enable the workaround on
      demand, and is generally in a better position to determine whether the
      workaround is necessary than the DRM is (e.g. only during the
      execution of compute kernels that rely on both L3 fences and HDC R/W
      requests).
      
      The same workaround seems to apply to BDW (at least to production
      stepping G1) and SKL as well (the internal workaround database claims
      that it does for all steppings, while the BSpec workaround table only
      mentions pre-production steppings), but the DRM doesn't do anything
      beyond whitelisting the L3SQCREG4 register so userspace can enable it
      when it sees fit.  Do the same on KBL platforms.
      
      Improves performance of the GFXBench4 gl_manhattan31 benchmark by 60%,
      and gl_4 (AKA car chase) by 14% on a KBL GT2 running Mesa master --
      This is followed by a regression of 35% and 10% respectively for the
      same benchmarks and platform caused by my recent patch series
      switching userspace to use the dataport constant cache instead of the
      sampler to implement uniform pull constant loads, which caused us to
      hit more heavily the L3 cache (and on platforms other than KBL had the
      opposite effect of improving performance of the same two benchmarks).
      The overall effect on KBL of this change combined with the recent
      userspace change is respectively 4.6% and 2.6%.  SynMark2 OglShMapPcf
      was affected by the constant cache changes (though it improved as it
      did on other platforms rather than regressing), but is not
      significantly affected by this patch (with statistical significance of
      5% and sample size 20).
      
      v2: Drop some more code to avoid unused variable warning.
      
      Fixes: 738fa1b3 ("drm/i915/kbl: Add WaDisableLSQCROPERFforOCL")
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99256Signed-off-by: default avatarFrancisco Jerez <currojerez@riseup.net>
      Cc: Matthew Auld <matthew.william.auld@gmail.com>
      Cc: Eero Tamminen <eero.t.tamminen@intel.com>
      Cc: Jani Nikula <jani.nikula@intel.com>
      Cc: Mika Kuoppala <mika.kuoppala@intel.com>
      Cc: beignet@lists.freedesktop.org
      Reviewed-by: default avatarMika Kuoppala <mika.kuoppala@intel.com>
      [Removed double Fixes tag]
      Signed-off-by: default avatarMika Kuoppala <mika.kuoppala@intel.com>
      Link: http://patchwork.freedesktop.org/patch/msgid/1484217894-20505-1-git-send-email-mika.kuoppala@intel.com
      (cherry picked from commit 8726f2fa)
      Signed-off-by: default avatarJani Nikula <jani.nikula@intel.com>
      [ Francisco Jerez: Rebase on v4.9 branch. ]
      Signed-off-by: default avatarFrancisco Jerez <currojerez@riseup.net>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      b59dd202
    • Peter Zijlstra's avatar
      perf/core: Fix concurrent sys_perf_event_open() vs. 'move_group' race · 922813f4
      Peter Zijlstra authored
      commit 321027c1 upstream.
      
      Di Shen reported a race between two concurrent sys_perf_event_open()
      calls where both try and move the same pre-existing software group
      into a hardware context.
      
      The problem is exactly that described in commit:
      
        f63a8daa ("perf: Fix event->ctx locking")
      
      ... where, while we wait for a ctx->mutex acquisition, the event->ctx
      relation can have changed under us.
      
      That very same commit failed to recognise sys_perf_event_context() as an
      external access vector to the events and thereby didn't apply the
      established locking rules correctly.
      
      So while one sys_perf_event_open() call is stuck waiting on
      mutex_lock_double(), the other (which owns said locks) moves the group
      about. So by the time the former sys_perf_event_open() acquires the
      locks, the context we've acquired is stale (and possibly dead).
      
      Apply the established locking rules as per perf_event_ctx_lock_nested()
      to the mutex_lock_double() for the 'move_group' case. This obviously means
      we need to validate state after we acquire the locks.
      
      Reported-by: Di Shen (Keen Lab)
      Tested-by: default avatarJohn Dias <joaodias@google.com>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Min Chong <mchong@google.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Vince Weaver <vincent.weaver@maine.edu>
      Fixes: f63a8daa ("perf: Fix event->ctx locking")
      Link: http://lkml.kernel.org/r/20170106131444.GZ3174@twins.programming.kicks-ass.netSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      922813f4
    • David Rientjes's avatar
      mm, memcg: do not retry precharge charges · f5f415c1
      David Rientjes authored
      commit 3674534b upstream.
      
      When memory.move_charge_at_immigrate is enabled and precharges are
      depleted during move, mem_cgroup_move_charge_pte_range() will attempt to
      increase the size of the precharge.
      
      Prevent precharges from ever looping by setting __GFP_NORETRY.  This was
      probably the intention of the GFP_KERNEL & ~__GFP_NORETRY, which is
      pointless as written.
      
      Fixes: 0029e19e ("mm: memcontrol: remove explicit OOM parameter in charge path")
      Link: http://lkml.kernel.org/r/alpine.DEB.2.10.1701130208510.69402@chino.kir.corp.google.comSigned-off-by: default avatarDavid Rientjes <rientjes@google.com>
      Acked-by: default avatarMichal Hocko <mhocko@suse.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      f5f415c1