1. 03 Aug, 2017 1 commit
  2. 02 Aug, 2017 19 commits
  3. 01 Aug, 2017 18 commits
    • gue: fix remcsum when GRO on and CHECKSUM_PARTIAL boundary is outer UDP · 1bff8a0c
      K. Den authored
      In the case that GRO is turned on and the original received packet is
      CHECKSUM_PARTIAL, if the outer UDP header is exactly at the last
      csum-unnecessary point, which for instance could occur if the packet
      comes from another Linux guest on the same Linux host, we have to do
      either remcsum_adjust or set up CHECKSUM_PARTIAL again with its
      csum_start properly reset considering RCO.
      
      However, since b7fe10e5 ("gro: Fix remcsum offload to deal with frags
      in GRO"), that barrier can be skipped in this case when GRO is turned
      on, so we pass over it and the inner L4 validation mistakenly treats
      the packet as having a bad csum.
      
      This patch resets remcsum_offload at the same time as the GRO remcsum
      cleanup, so that this case works as it did before.
      
      Fixes: b7fe10e5 ("gro: Fix remcsum offload to deal with frags in GRO")
      Signed-off-by: Koichiro Den <den@klaipeden.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • vxlan: fix remcsum when GRO on and CHECKSUM_PARTIAL boundary is outer UDP · be73b304
      K. Den authored
      In the case that GRO is turned on and the original received packet is
      CHECKSUM_PARTIAL, if the outer UDP header is exactly at the last
      csum-unnecessary point, which for instance could occur if the packet
      comes from another Linux guest on the same Linux host, we have to do
      either remcsum_adjust or set up CHECKSUM_PARTIAL again with its
      csum_start properly reset considering RCO.
      
      However, since b7fe10e5 ("gro: Fix remcsum offload to deal with frags
      in GRO"), that barrier can be skipped in this case when GRO is turned
      on, so we pass over it and the inner L4 validation mistakenly treats
      the packet as having a bad csum.
      
      This patch resets remcsum_offload at the same time as the GRO remcsum
      cleanup, so that this case works as it did before.
      
      Fixes: b7fe10e5 ("gro: Fix remcsum offload to deal with frags in GRO")
      Signed-off-by: Koichiro Den <den@klaipeden.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Cipso: cipso_v4_optptr enter infinite loop · 40413955
      yujuan.qi authored
      In the for() loop, if ((optlen > 0) && (optptr[1] == 0)), the loop never
      terminates.
      
      Test: receive a packet whose IP header length is > 20 and whose first IP
      option byte is 0; this reproduces the issue.
      Signed-off-by: yujuan.qi <yujuan.qi@mediatek.com>
      Acked-by: Paul Moore <paul@paul-moore.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
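The hang described in this commit can be modelled outside the kernel. The sketch below is a simplified Python model, not the kernel code and not the actual patch: `find_cipso` and its guard are hypothetical, and every option is treated as a type/length pair (real IPv4 options also include single-byte types). It shows why a zero length byte stalls a walker that advances by the option length, and one possible way to bail out.

```python
IPOPT_CIPSO = 134  # IPv4 option type used by CIPSO (0x86)


def find_cipso(options: bytes):
    """Return the offset of the CIPSO option in an options area, or None.

    Hypothetical model: each option is assumed to be laid out as
    [type, length, data...], where length covers the whole option.
    """
    offset = 0
    while len(options) - offset > 1:
        opt_type = options[offset]
        if opt_type == IPOPT_CIPSO:
            return offset
        opt_len = options[offset + 1]
        remaining = len(options) - offset
        # Without this guard a zero length leaves `offset` unchanged,
        # so the loop would never make progress.
        if opt_len == 0 or opt_len > remaining:
            return None
        offset += opt_len
    return None
```

Feeding it an options area that starts with two zero bytes returns `None` immediately instead of spinning.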
    • Merge branch 'ethernet-ti-cpts-fix-tx-timestamping-timeout' · fdaa419b
      David S. Miller authored
      Grygorii Strashko says:
      
      ====================
      net: ethernet: ti: cpts: fix tx timestamping timeout
      
      On a low-speed Ethernet connection, the cpdma notification about packet
      processing can be received before the CPTS TX timestamp event: the
      timestamp event is set when the packet has actually left the CPSW, while
      the cpdma notification is sent when the packet is pushed into the CPSW
      fifo. As a result, when the connection is slow and the CPU is fast
      enough, TX timestamping does not work properly.
      The issue was discovered using a timestamping tool on am57x boards with the
      Ethernet link speed forced to 100M and on am335x-evm with the Ethernet link
      speed forced to 10M.
      
      Patch 3 - This series fixes it by introducing a TX SKB queue to store PTP
      SKBs for which the Ethernet Transmit Event hasn't been received yet, and
      then re-checking this queue against new Ethernet Transmit Events by
      scheduling the CPTS overflow work more often while the TX SKB queue is
      not empty.
      
      Patch 1,2 - As the CPTS overflow work is a time-critical task, it is
      important to ensure that its scheduling is not delayed. Unfortunately,
      there can be a significant delay in CPTS work scheduling under high
      system load and on -RT, which could cause CPTS misbehavior due to internal
      counter overflow, and there is no way to tune the CPTS overflow work
      execution policy and priority manually. The kthread_worker can be used
      instead of workqueues, as it creates a separate named kthread for each
      worker, and its execution policy and priority can be configured using
      the chrt tool. Instead of modifying the CPTS driver itself, it was
      proposed to add a PTP auxiliary worker to the PHC subsystem [1], so
      other drivers can benefit from this feature too.
      
      [1] https://www.spinics.net/lists/netdev/msg445392.html
      
      changes in v4:
      - fixed memleak in ptp_clock_register()
      - undocumented change in cpts_find_ts() moved to separate patch (minor fix)
      
      changes in v3:
      - patch 1: added proper error handling in ptp_clock_register.
        minor comments applied.
      
      changes in v2:
      - added PTP auxiliary worker to the PHC subsystem
      
      Links
      v3: https://www.spinics.net/lists/netdev/msg446058.html
      v2: https://www.spinics.net/lists/netdev/msg445859.html
      v1: https://www.spinics.net/lists/netdev/msg445387.html
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: ethernet: ti: cpts: fix fifo read in cpts_find_ts · a93439cc
      Grygorii Strashko authored
      Now the call chain
       cpts_find_ts()
        |- cpts_fifo_read(cpts, CPTS_EV_PUSH)
      
      will stop reading the CPTS FIFO if a PUSH event is found. But this is not
      expected, and the CPTS FIFO should be completely drained here. This is most
      probably a copy-paste error, and it has no negative impact, as CPTS_EV_PUSH
      should not be present in the FIFO without a TS_PUSH request, and
      cpts_systim_read() and cpts_find_ts() are synchronized by a spin_lock.
      
      Correct the above by calling cpts_fifo_read() with a -1 parameter, so it
      will read all CPTS events from the FIFO.
      Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
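The difference between stopping at a matching event and draining the FIFO can be illustrated with a small model. This is a Python sketch under stated assumptions, not the driver code: `fifo_read`, the string event names, and the use of a `deque` as the FIFO are all hypothetical stand-ins for cpts_fifo_read() and the hardware event FIFO.

```python
from collections import deque


def fifo_read(fifo, match):
    """Pop events until one equal to `match` is seen; return what was consumed.

    A `match` value that no event can equal (here -1) drains the whole FIFO.
    """
    consumed = []
    while fifo:
        event = fifo.popleft()
        consumed.append(event)
        if event == match:
            break  # stop early, leaving later events queued
    return consumed


# Stopping at "push" leaves the trailing "tx" event unread.
early = deque(["tx", "push", "tx"])
early_consumed = fifo_read(early, "push")

# Passing -1 never matches, so every event is consumed.
full = deque(["tx", "push", "tx"])
full_consumed = fifo_read(full, -1)
```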
    • net: ethernet: ti: cpts: fix tx timestamping timeout · 0d5f54fe
      Grygorii Strashko authored
      On a low-speed Ethernet connection, the CPDMA notification about packet
      processing can be received before the CPTS TX timestamp event: the
      timestamp event is set when the packet has actually left the CPSW, while
      the CPDMA notification is sent when the packet is pushed into the CPSW
      fifo. As a result, when the connection is slow and the CPU is fast
      enough, TX timestamping does not work properly.
      
      Fix it by introducing a TX SKB queue to store PTP SKBs for which the
      Ethernet Transmit Event hasn't been received yet, and then re-checking this
      queue against new Ethernet Transmit Events by scheduling the CPTS overflow
      work more often (every 1 jiffy) while the TX SKB queue is not empty.
      
      A side effect of this change is:
       - User space tools have to take into account a possible delay in TX
      timestamp processing (for example, ptp4l works with tx_timestamp_timeout=400
      under net traffic and tx_timestamp_timeout=25 when idle).
      Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
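The queue-and-recheck idea in this commit can be sketched as follows. This is a rough Python model, not the driver implementation: packets are represented by hypothetical sequence numbers, `match_pending` and `next_delay` are invented names, and the idle period of 100 is an arbitrary placeholder. Only the "every 1 jiffy while packets are pending" behaviour comes from the commit message.

```python
def match_pending(txq, events):
    """Pair queued packets with timestamp events that have arrived.

    txq    -- list of sequence numbers still waiting for a TX timestamp
    events -- dict mapping sequence number -> timestamp
    Returns (delivered, still_pending).
    """
    delivered = {}
    still_pending = []
    for seq in txq:
        if seq in events:
            delivered[seq] = events[seq]
        else:
            still_pending.append(seq)  # keep for the next re-check
    return delivered, still_pending


def next_delay(still_pending, idle_period=100):
    """Re-check after 1 tick while packets are pending, else use the idle period."""
    return 1 if still_pending else idle_period
```

The extra re-check latency is what user space tools see as the possibly longer TX timestamp delay mentioned in the commit message.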
    • net: ethernet: ti: cpts: convert to use ptp auxiliary worker · 999f1292
      Grygorii Strashko authored
      There can be a significant delay in CPTS work scheduling under high system
      load and on -RT, which could cause CPTS misbehavior due to internal counter
      overflow. Using a dedicated kthread_worker avoids this kind of issue
      and makes it possible to tune the priority of the CPTS kthread_worker thread
      on -RT (thread name "cpts").
      
      Hence, the CPTS driver is converted to use the PTP auxiliary worker, as the
      PHC subsystem now implements such functionality in a generic way.
      Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ptp: introduce ptp auxiliary worker · d9535cb7
      Grygorii Strashko authored
      Many PTP drivers are required to perform some asynchronous or periodic
      work, like periodically handling PHC counter overflow or handling delayed
      timestamps for RX/TX network packets. In most cases, such work is
      implemented using workqueues. Unfortunately, kernel workqueues might
      introduce a significant delay in work scheduling under high system load
      and on -RT, which could cause misbehavior of PTP drivers due to internal
      counter overflow, for example, and there is no way to tune their
      execution policy and priority manually.
      
      Hence, the kthread_worker can be used instead of workqueues, as it creates
      a separate named kthread for each worker, and its execution policy and
      priority can be configured using the chrt tool.
      
      This problem was reported for two drivers, TI CPSW CPTS and dp83640, so
      instead of modifying each of these drivers it was proposed to add a PTP
      auxiliary worker to the PHC subsystem.
      
      The patch adds a PTP auxiliary worker to the PHC subsystem using
      kthread_worker and kthread_delayed_work and introduces two new PHC
      subsystem APIs:
      
      - long (*do_aux_work)(struct ptp_clock_info *ptp) callback in the
      ptp_clock_info structure, which a driver should assign if it needs to
      perform asynchronous or periodic work. The driver should return the delay
      until the next scheduling of the PTP auxiliary work (>=0), or a negative
      value if further scheduling is not required.
      
      - int ptp_schedule_worker(struct ptp_clock *ptp, unsigned long delay), which
      allows scheduling the PTP auxiliary work.
      
      The name of the kthread_worker thread corresponds to the PTP PHC device
      name "ptp%d".
      Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
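The return-value contract of do_aux_work described above (a delay >= 0 means "run me again after this long", a negative value means "stop") can be modelled in a few lines. This is a userspace Python sketch of the contract only: `run_aux_worker` and `make_overflow_check` are hypothetical names, and nothing here reflects how the kernel worker is actually implemented.

```python
def run_aux_worker(do_aux_work, max_runs=10):
    """Call the callback until it returns a negative value.

    Returns the list of delays the callback asked for. `max_runs` is a
    safety bound for this model so a callback that never stops still ends.
    """
    delays = []
    for _ in range(max_runs):
        delay = do_aux_work()
        if delay < 0:
            break  # negative: no further scheduling requested
        delays.append(delay)  # a real worker would sleep for `delay` here
    return delays


def make_overflow_check(period, runs):
    """Build a hypothetical driver callback that reschedules itself `runs` times."""
    remaining = [runs]

    def do_aux_work():
        if remaining[0] == 0:
            return -1
        remaining[0] -= 1
        return period

    return do_aux_work
```

For instance, `run_aux_worker(make_overflow_check(50, 3))` records three requested delays of 50 and then stops.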
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · bc78d646
      Linus Torvalds authored
      Pull networking fixes from David Miller:
      
       1) Handle notifier registry failures properly in tun/tap driver, from
          Tonghao Zhang.
      
       2) Fix bpf verifier handling of subtraction bounds and add a testcase
          for this, from Edward Cree.
      
       3) Increase reset timeout in ftgmac100 driver, from Ben Herrenschmidt.
      
       4) Fix use after free in prb_retire_rx_blk_timer_expired() in AF_PACKET,
          from Cong Wang.
      
       5) Fix SELinux regression due to recent UDP optimizations, from Paolo
          Abeni.
      
       6) We accidentally increment IPSTATS_MIB_FRAGFAILS in the ipv6 code
          paths, fix from Stefano Brivio.
      
       7) Fix some mem leaks in dccp, from Xin Long.
      
       8) Adjust MDIO_BUS kconfig deps to avoid build errors, from Arnd
          Bergmann.
      
       9) Mac address length check and buffer size fixes from Cong Wang.
      
      10) Don't leak sockets in ipv6 udp early demux, from Paolo Abeni.
      
      11) Fix return value when copy_from_user() fails in
          bpf_prog_get_info_by_fd(), from Daniel Borkmann.
      
      12) Handle PHY_HALTED properly in phy library state machine, from
          Florian Fainelli.
      
      13) Fix OOPS in fib_sync_down_dev(), from Ido Schimmel.
      
      14) Fix truesize calculation in virtio_net which led to performance
          regressions, from Michael S Tsirkin.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (76 commits)
        samples/bpf: fix bpf tunnel cleanup
        udp6: fix jumbogram reception
        ppp: Fix a scheduling-while-atomic bug in del_chan
        Revert "net: bcmgenet: Remove init parameter from bcmgenet_mii_config"
        virtio_net: fix truesize for mergeable buffers
        mv643xx_eth: fix of_irq_to_resource() error check
        MAINTAINERS: Add more files to the PHY LIBRARY section
        ipv4: fib: Fix NULL pointer deref during fib_sync_down_dev()
        net: phy: Correctly process PHY_HALTED in phy_stop_machine()
        sunhme: fix up GREG_STAT and GREG_IMASK register offsets
        bpf: fix bpf_prog_get_info_by_fd to dump correct xlated_prog_len
        tcp: avoid bogus gcc-7 array-bounds warning
        net: tc35815: fix spelling mistake: "Intterrupt" -> "Interrupt"
        bpf: don't indicate success when copy_from_user fails
        udp6: fix socket leak on early demux
        net: thunderx: Fix BGX transmit stall due to underflow
        Revert "vhost: cache used event for better performance"
        team: use a larger struct for mac address
        net: check dev->addr_len for dev_set_mac_address()
        phy: bcm-ns-usb3: fix MDIO_BUS dependency
        ...
    • samples/bpf: fix bpf tunnel cleanup · cc75f851
      William Tu authored
      test_tunnel_bpf.sh fails to remove the vxlan11 tunnel device, causing the
      next geneve tunnelling test case to fail.  In addition, the geneve reserved
      bit in tcbpf2_kern.c should be zero, according to the RFC.
      Signed-off-by: William Tu <u9012063@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • udp6: fix jumbogram reception · cb891fa6
      Paolo Abeni authored
      Since commit 67a51780 ("ipv6: udp: leverage scratch area
      helpers") udp6_recvmsg() reads the skb len from the scratch area,
      to avoid a cache miss.
      But the UDP6 rx path supports RFC 2675 UDPv6 jumbograms, and their
      length exceeds the 16 bits available in the scratch area. As a side
      effect the length returned by recvmsg() is:
      <ingress datagram len> % (1<<16)
      
      This commit addresses the issue by allocating one more bit in the
      IP6CB flags field and setting it for incoming jumbograms.
      This field is still in the first cacheline, so at recvmsg()
      time we can check it and fall back to accessing skb->len if
      required, without a measurable overhead.
      
      Fixes: 67a51780 ("ipv6: udp: leverage scratch area helpers")
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
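The truncation and the flag-based fallback can be worked through with concrete numbers. The Python sketch below is a model of the arithmetic only: `save_len` and `recvmsg_len` are hypothetical names, and deriving the flag from the length is a simplifying assumption made for this example, whereas the commit sets the flag for incoming jumbograms.

```python
SCRATCH_LEN_BITS = 16  # width of the length kept in the scratch area


def save_len(skb_len):
    """Model of the rx path: return (truncated length, jumbogram flag)."""
    truncated = skb_len % (1 << SCRATCH_LEN_BITS)
    # Assumption for this model: anything that does not fit is a jumbogram.
    is_jumbo = skb_len >= (1 << SCRATCH_LEN_BITS)
    return truncated, is_jumbo


def recvmsg_len(truncated, is_jumbo, skb_len):
    """Model of recvmsg(): use the cached length unless the flag is set."""
    return skb_len if is_jumbo else truncated


# A 70000-byte datagram is cached as 70000 % 65536 = 4464 bytes.
cached, flag = save_len(70000)
```

Without the flag, recvmsg() would report 4464 for that datagram; with it, the full 70000 is recovered from the real length.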
    • ppp: Fix a scheduling-while-atomic bug in del_chan · ddab8282
      Gao Feng authored
      PPTP sets pptp_sock_destruct as the sock's sk_destruct, which
      triggers this bug when __sk_free is invoked in atomic context, because of
      the call path pptp_sock_destruct->del_chan->synchronize_rcu.
      
      Now move the synchronize_rcu from del_chan to pptp_release. This is the
      only case that frees the sock and needs the synchronize_rcu.
      
      The following is the panic I hit with kernel 3.3.8, but judging from the
      code, this issue should exist in the current kernel too.
      
      BUG: scheduling while atomic
      __schedule_bug+0x5e/0x64
      __schedule+0x55/0x580
      ? ppp_unregister_channel+0x1cd5/0x1de0 [ppp_generic]
      ? dev_hard_start_xmit+0x423/0x530
      ? sch_direct_xmit+0x73/0x170
      __cond_resched+0x16/0x30
      _cond_resched+0x22/0x30
      wait_for_common+0x18/0x110
      ? call_rcu_bh+0x10/0x10
      wait_for_completion+0x12/0x20
      wait_rcu_gp+0x34/0x40
      ? wait_rcu_gp+0x40/0x40
      synchronize_sched+0x1e/0x20
      0xf8417298
      0xf8417484
      ? sock_queue_rcv_skb+0x109/0x130
      __sk_free+0x16/0x110
      ? udp_queue_rcv_skb+0x1f2/0x290
      sk_free+0x16/0x20
      __udp4_lib_rcv+0x3b8/0x650
      Signed-off-by: Gao Feng <gfree.wind@vip.163.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Revert "net: bcmgenet: Remove init parameter from bcmgenet_mii_config" · 00d51094
      Florian Fainelli authored
      This reverts commit 28b45910 ("net: bcmgenet: Remove init parameter
      from bcmgenet_mii_config") because in the process of moving from
      dev_info() to dev_info_once() we essentially lost the helpful printed
      messages once the second instance of the driver is loaded.
      dev_info_once() does not actually print the message once per device
      instance, but once period.
      
      Fixes: 28b45910 ("net: bcmgenet: Remove init parameter from bcmgenet_mii_config")
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Reviewed-by: Doug Berger <opendmb@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
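The "once, period" behaviour that motivated this revert can be illustrated with a toy logger. This is a Python model with hypothetical names (`OnceLogger`, the `"mii_config"` call-site key, the message text); it only mirrors the semantics the commit message describes, where the once-only state belongs to the call site rather than to the device.

```python
class OnceLogger:
    """Toy logger contrasting per-call-site 'once' with plain logging."""

    def __init__(self):
        self.seen_sites = set()
        self.lines = []

    def info(self, device, message):
        """Always log, once per call."""
        self.lines.append(f"{device}: {message}")

    def info_once(self, site, device, message):
        """Log only the first time this call site is reached, for any device."""
        if site in self.seen_sites:
            return
        self.seen_sites.add(site)
        self.lines.append(f"{device}: {message}")


log = OnceLogger()
log.info_once("mii_config", "eth0", "configuring PHY")
log.info_once("mii_config", "eth1", "configuring PHY")  # silently dropped
```

After these two calls only the `eth0` line exists, which is the lost message for the second driver instance.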
    • virtio_net: fix truesize for mergeable buffers · 1daa8790
      Michael S. Tsirkin authored
      Seth Forshee noticed a performance degradation with some workloads.
      This turns out to be due to packet drops.  Euan Kemp noticed that this
      is because we drop all packets where length exceeds the truesize, but
      for some packets we add in extra memory without updating the truesize.
      This in turn was kept around unchanged from ab7db917 ("virtio-net:
      auto-tune mergeable rx buffer size for improved performance").  That
      commit had an internal reason not to account for the extra space: not
      enough bits to do it.  No longer true so let's account for the allocated
      length exactly.
      
      Many thanks to Seth Forshee for the report and bisecting and Euan Kemp
      for debugging the issue.
      
      Fixes: 680557cf ("virtio_net: rework mergeable buffer handling")
      Reported-by: Euan Kemp <euan.kemp@coreos.com>
      Tested-by: Euan Kemp <euan.kemp@coreos.com>
      Reported-by: Seth Forshee <seth.forshee@canonical.com>
      Tested-by: Seth Forshee <seth.forshee@canonical.com>
      Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mv643xx_eth: fix of_irq_to_resource() error check · cfbcb61f
      Sergei Shtylyov authored
      of_irq_to_resource() has recently been fixed to return negative error
      numbers along with 0 in case of failure; however, the Marvell MV643xx
      Ethernet driver still only regards 0 as an invalid IRQ -- fix it up.
      
      Fixes: 7a4228bb ("of: irq: use of_irq_get() in of_irq_to_resource()")
      Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
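The change in the error check amounts to treating negative return values as failures too. The Python sketch below models just that comparison; `irq_valid_old` and `irq_valid_fixed` are hypothetical names, and the value 22 for EINVAL is used only as an example of a negative error number.

```python
EINVAL = 22  # example errno value, returned negated on failure


def irq_valid_old(ret):
    """Old check: only 0 is considered a failure."""
    return ret != 0


def irq_valid_fixed(ret):
    """Fixed check: 0 and negative error numbers are both failures."""
    return ret > 0
```

With the old check, a return value of `-EINVAL` would be accepted as if it were a usable IRQ number.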
    • MAINTAINERS: Add more files to the PHY LIBRARY section · 13332db5
      Florian Fainelli authored
      Include missing files that are provided by, used, or directly maintained
      within the PHY LIBRARY; this includes the uapi header, header files used
      by Device Tree code, etc.
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv4: fib: Fix NULL pointer deref during fib_sync_down_dev() · 71ed7ee3
      Ido Schimmel authored
      Michał reported a NULL pointer deref during fib_sync_down_dev() when
      unregistering a netdevice. The problem is that we don't check for
      'in_dev' being NULL, which can happen in very specific cases.
      
      Usually routes are flushed upon NETDEV_DOWN sent in either the netdev or
      the inetaddr notification chains. However, if an interface isn't
      configured with any IP address, then it's possible for host routes to be
      flushed following NETDEV_UNREGISTER, after NULLing dev->ip_ptr in
      inetdev_destroy().
      
      To reproduce:
      $ ip link add type dummy
      $ ip route add local 1.1.1.0/24 dev dummy0
      $ ip link del dev dummy0
      
      Fix this by checking for the presence of 'in_dev' before referencing it.
      
      Fixes: 982acb97 ("ipv4: fib: Notify about nexthop status changes")
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Reported-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
      Tested-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: phy: Correctly process PHY_HALTED in phy_stop_machine() · 7ad813f2
      Florian Fainelli authored
      Marc reported that he was not getting the PHY library adjust_link()
      callback function to run when calling phy_stop() + phy_disconnect().
      This indeed does not happen, because we set the state machine to
      PHY_HALTED but we don't get to run it to process this state past that
      point.
      
      Fix this with a synchronous call to phy_state_machine() in order to have
      the state machine actually act on PHY_HALTED, set the PHY device's link
      down, turn the network device's carrier off and finally call the
      adjust_link() function.
      Reported-by: Marc Gonzalez <marc_gonzalez@sigmadesigns.com>
      Fixes: a390d1f3 ("phylib: convert state_queue work to delayed_work")
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: Marc Gonzalez <marc_gonzalez@sigmadesigns.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  4. 31 Jul, 2017 2 commits