1. 08 May, 2018 2 commits
  2. 07 May, 2018 1 commit
  3. 04 May, 2018 14 commits
    • David S. Miller's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf · 2ba5622f
      David S. Miller authored
      Daniel Borkmann says:
      
      ====================
      pull-request: bpf 2018-05-05
      
      The following pull-request contains BPF updates for your *net* tree.
      
      The main changes are:
      
      1) Sanitize attr->{prog,map}_type from bpf(2) since used as an array index
         to retrieve prog/map specific ops such that we prevent potential out of
         bounds value under speculation, from Mark and Daniel.
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      2ba5622f
    • Antoine Tenart's avatar
      net: phy: sfp: fix the BR,min computation · 52c5cd1b
      Antoine Tenart authored
      In an SFP EEPROM values can be read to get information about a given SFP
      module. One of those is the bitrate, which can be determined using a
      nominal bitrate in addition with min and max values (in %). The SFP code
      currently compute both BR,min and BR,max values thanks to this nominal
      and min,max values.
      
      This patch fixes the BR,min computation as the min value should be
      subtracted to the nominal one, not added.
      
      Fixes: 9962acf7 ("sfp: add support for 1000Base-PX and 1000Base-BX10")
      Signed-off-by: default avatarAntoine Tenart <antoine.tenart@bootlin.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      52c5cd1b
    • Rob Taglang's avatar
      net: ethernet: sun: niu set correct packet size in skb · 14224923
      Rob Taglang authored
      Currently, skb->len and skb->data_len are set to the page size, not
      the packet size. This causes the frame check sequence to not be
      located at the "end" of the packet resulting in ethernet frame check
      errors. The driver does work currently, but stricter kernel facing
      networking solutions like OpenVSwitch will drop these packets as
      invalid.
      
      These changes set the packet size correctly so that these errors no
      longer occur. The length does not include the frame check sequence, so
      that subtraction was removed.
      
      Tested on Oracle/SUN Multithreaded 10-Gigabit Ethernet Network
      Controller [108e:abcd] and validated in wireshark.
      Signed-off-by: default avatarRob Taglang <rob@taglang.io>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      14224923
    • YU Bo's avatar
      net/netlink: make sure the headers line up actual value output · ae552ac2
      YU Bo authored
      Making sure the headers line up properly with the actual value output of the command
      `cat /proc/net/netlink`
      
      Before the patch:
      <sk       Eth Pid    Groups   Rmem     Wmem     Dump     Locks     Drops     Inode
      <ffff8cd2c2f7b000 0   909    00000550 0        0        0 2        0        18946
      
      After the patch:
      >sk               Eth Pid        Groups   Rmem     Wmem     Dump  Locks    Drops    Inode
      >0000000033203952 0   897        00000113 0        0        0     2        0        14906
      Signed-off-by: default avatarBo YU <tsu.yubo@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      ae552ac2
    • Michael Chan's avatar
      tg3: Fix vunmap() BUG_ON() triggered from tg3_free_consistent(). · d89a2adb
      Michael Chan authored
      tg3_free_consistent() calls dma_free_coherent() to free tp->hw_stats
      under spinlock and can trigger BUG_ON() in vunmap() because vunmap()
      may sleep.  Fix it by removing the spinlock and relying on the
      TG3_FLAG_INIT_COMPLETE flag to prevent race conditions between
      tg3_get_stats64() and tg3_free_consistent().  TG3_FLAG_INIT_COMPLETE
      is always cleared under tp->lock before tg3_free_consistent()
      and therefore tg3_get_stats64() can safely access tp->hw_stats
      under tp->lock if TG3_FLAG_INIT_COMPLETE is set.
      
      Fixes: f5992b72 ("tg3: Fix race condition in tg3_get_stats64().")
      Reported-by: default avatarZumeng Chen <zumeng.chen@gmail.com>
      Signed-off-by: default avatarMichael Chan <michael.chan@broadcom.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      d89a2adb
    • Eric Dumazet's avatar
      nsh: fix infinite loop · af50e4ba
      Eric Dumazet authored
      syzbot caught an infinite recursion in nsh_gso_segment().
      
      Problem here is that we need to make sure the NSH header is of
      reasonable length.
      
      BUG: MAX_LOCK_DEPTH too low!
      turning off the locking correctness validator.
      depth: 48  max: 48!
      48 locks held by syz-executor0/10189:
       #0:         (ptrval) (rcu_read_lock_bh){....}, at: __dev_queue_xmit+0x30f/0x34c0 net/core/dev.c:3517
       #1:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
       #1:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
       #2:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
       #2:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
       #3:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
       #3:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
       #4:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
       #4:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
       #5:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
       #5:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
       #6:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
       #6:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
       #7:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
       #7:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
       #8:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
       #8:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
       #9:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
       #9:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
       #10:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
       #10:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
       #11:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
       #11:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
       #12:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
       #12:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
       #13:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
       #13:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
       #14:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
       #14:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
       #15:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
       #15:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
       #16:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
       #16:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
       #17:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
       #17:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
       #18:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
       #18:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
       #19:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
       #19:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
       #20:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
       #20:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
       #21:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
       #21:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
       #22:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
       #22:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
       #23:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
       #23:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
       #24:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
       #24:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
       #25:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
       #25:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
       #26:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
       #26:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
       #27:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
       #27:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
       #28:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
       #28:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
       #29:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
       #29:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
       #30:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
       #30:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
       #31:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
       #31:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
      dccp_close: ABORT with 65423 bytes unread
       #32:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
       #32:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
       #33:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
       #33:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
       #34:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
       #34:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
       #35:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
       #35:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
       #36:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
       #36:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
       #37:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
       #37:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
       #38:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
       #38:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
       #39:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
       #39:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
       #40:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
       #40:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
       #41:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
       #41:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
       #42:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
       #42:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
       #43:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
       #43:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
       #44:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
       #44:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
       #45:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
       #45:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
       #46:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
       #46:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
       #47:         (ptrval) (rcu_read_lock){....}, at: __skb_pull include/linux/skbuff.h:2080 [inline]
       #47:         (ptrval) (rcu_read_lock){....}, at: skb_mac_gso_segment+0x221/0x720 net/core/dev.c:2787
      INFO: lockdep is turned off.
      CPU: 1 PID: 10189 Comm: syz-executor0 Not tainted 4.17.0-rc2+ #26
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0x1b9/0x294 lib/dump_stack.c:113
       __lock_acquire+0x1788/0x5140 kernel/locking/lockdep.c:3449
       lock_acquire+0x1dc/0x520 kernel/locking/lockdep.c:3920
       rcu_lock_acquire include/linux/rcupdate.h:246 [inline]
       rcu_read_lock include/linux/rcupdate.h:632 [inline]
       skb_mac_gso_segment+0x25b/0x720 net/core/dev.c:2789
       nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
       skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
       nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
       skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
       nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
       skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
       nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
       skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
       nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
       skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
       nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
       skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
       nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
       skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
       nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
       skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
       nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
       skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
       nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
       skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
       nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
       skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
       nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
       skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
       nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
       skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
       nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
       skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
       nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
       skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
       nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
       skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
       nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
       skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
       nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
       skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
       nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
       skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
       nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
       skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
       nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
       skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
       nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
       skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
       nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
       skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
       nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
       skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
       nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
       skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
       nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
       skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
       nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
       skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
       nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
       skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
       nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
       skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
       nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
       skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
       nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
       skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
       nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
       skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
       nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
       skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
       nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
       skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
       nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
       skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
       nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
       skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
       nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
       skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
       nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
       skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
       nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
       skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
       nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
       skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
       nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
       skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
       nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
       skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
       nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
       skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
       nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
       skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
       nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
       skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
       nsh_gso_segment+0x405/0xb60 net/nsh/nsh.c:107
       skb_mac_gso_segment+0x3ad/0x720 net/core/dev.c:2792
       __skb_gso_segment+0x3bb/0x870 net/core/dev.c:2865
       skb_gso_segment include/linux/netdevice.h:4025 [inline]
       validate_xmit_skb+0x54d/0xd90 net/core/dev.c:3118
       validate_xmit_skb_list+0xbf/0x120 net/core/dev.c:3168
       sch_direct_xmit+0x354/0x11e0 net/sched/sch_generic.c:312
       qdisc_restart net/sched/sch_generic.c:399 [inline]
       __qdisc_run+0x741/0x1af0 net/sched/sch_generic.c:410
       __dev_xmit_skb net/core/dev.c:3243 [inline]
       __dev_queue_xmit+0x28ea/0x34c0 net/core/dev.c:3551
       dev_queue_xmit+0x17/0x20 net/core/dev.c:3616
       packet_snd net/packet/af_packet.c:2951 [inline]
       packet_sendmsg+0x40f8/0x6070 net/packet/af_packet.c:2976
       sock_sendmsg_nosec net/socket.c:629 [inline]
       sock_sendmsg+0xd5/0x120 net/socket.c:639
       __sys_sendto+0x3d7/0x670 net/socket.c:1789
       __do_sys_sendto net/socket.c:1801 [inline]
       __se_sys_sendto net/socket.c:1797 [inline]
       __x64_sys_sendto+0xe1/0x1a0 net/socket.c:1797
       do_syscall_64+0x1b1/0x800 arch/x86/entry/common.c:287
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      Fixes: c411ed85 ("nsh: add GSO support")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Cc: Jiri Benc <jbenc@redhat.com>
      Reported-by: default avatarsyzbot <syzkaller@googlegroups.com>
      Acked-by: default avatarJiri Benc <jbenc@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      af50e4ba
    • Gustavo A. R. Silva's avatar
      net: atm: Fix potential Spectre v1 · acf784bd
      Gustavo A. R. Silva authored
      ioc_data.dev_num can be controlled by user-space, hence leading to
      a potential exploitation of the Spectre variant 1 vulnerability.
      
      This issue was detected with the help of Smatch:
      net/atm/lec.c:702 lec_vcc_attach() warn: potential spectre issue
      'dev_lec'
      
      Fix this by sanitizing ioc_data.dev_num before using it to index
      dev_lec. Also, notice that there is another instance in which array
      dev_lec is being indexed using ioc_data.dev_num at line 705:
      lec_vcc_added(netdev_priv(dev_lec[ioc_data.dev_num]),
      
      Notice that given that speculation windows are large, the policy is
      to kill the speculation on the first load and not worry if it can be
      completed with a dependent load/store [1].
      
      [1] https://marc.info/?l=linux-kernel&m=152449131114778&w=2
      
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarGustavo A. R. Silva <gustavo@embeddedor.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      acf784bd
    • Gustavo A. R. Silva's avatar
      atm: zatm: Fix potential Spectre v1 · 2be147f7
      Gustavo A. R. Silva authored
      pool can be indirectly controlled by user-space, hence leading to
      a potential exploitation of the Spectre variant 1 vulnerability.
      
      This issue was detected with the help of Smatch:
      
      drivers/atm/zatm.c:1462 zatm_ioctl() warn: potential spectre issue
      'zatm_dev->pool_info' (local cap)
      
      Fix this by sanitizing pool before using it to index
      zatm_dev->pool_info
      
      Notice that given that speculation windows are large, the policy is
      to kill the speculation on the first load and not worry if it can be
      completed with a dependent load/store [1].
      
      [1] https://marc.info/?l=linux-kernel&m=152449131114778&w=2
      
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarGustavo A. R. Silva <gustavo@embeddedor.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      2be147f7
    • Stefano Brivio's avatar
      openvswitch: Don't swap table in nlattr_set() after OVS_ATTR_NESTED is found · 72f17baf
      Stefano Brivio authored
      If an OVS_ATTR_NESTED attribute type is found while walking
      through netlink attributes, we call nlattr_set() recursively
      passing the length table for the following nested attributes, if
      different from the current one.
      
      However, once we're done with those sub-nested attributes, we
      should continue walking through attributes using the current
      table, instead of using the one related to the sub-nested
      attributes.
      
      For example, given this sequence:
      
      1  OVS_KEY_ATTR_PRIORITY
      2  OVS_KEY_ATTR_TUNNEL
      3	OVS_TUNNEL_KEY_ATTR_ID
      4	OVS_TUNNEL_KEY_ATTR_IPV4_SRC
      5	OVS_TUNNEL_KEY_ATTR_IPV4_DST
      6	OVS_TUNNEL_KEY_ATTR_TTL
      7	OVS_TUNNEL_KEY_ATTR_TP_SRC
      8	OVS_TUNNEL_KEY_ATTR_TP_DST
      9  OVS_KEY_ATTR_IN_PORT
      10 OVS_KEY_ATTR_SKB_MARK
      11 OVS_KEY_ATTR_MPLS
      
      we switch to the 'ovs_tunnel_key_lens' table on attribute #3,
      and we don't switch back to 'ovs_key_lens' while setting
      attributes #9 to #11 in the sequence. As OVS_KEY_ATTR_MPLS
      evaluates to 21, and the array size of 'ovs_tunnel_key_lens' is
      15, we also get this kind of KASan splat while accessing the
      wrong table:
      
      [ 7654.586496] ==================================================================
      [ 7654.594573] BUG: KASAN: global-out-of-bounds in nlattr_set+0x164/0xde9 [openvswitch]
      [ 7654.603214] Read of size 4 at addr ffffffffc169ecf0 by task handler29/87430
      [ 7654.610983]
      [ 7654.612644] CPU: 21 PID: 87430 Comm: handler29 Kdump: loaded Not tainted 3.10.0-866.el7.test.x86_64 #1
      [ 7654.623030] Hardware name: Dell Inc. PowerEdge R730/072T6D, BIOS 2.1.7 06/16/2016
      [ 7654.631379] Call Trace:
      [ 7654.634108]  [<ffffffffb65a7c50>] dump_stack+0x19/0x1b
      [ 7654.639843]  [<ffffffffb53ff373>] print_address_description+0x33/0x290
      [ 7654.647129]  [<ffffffffc169b37b>] ? nlattr_set+0x164/0xde9 [openvswitch]
      [ 7654.654607]  [<ffffffffb53ff812>] kasan_report.part.3+0x242/0x330
      [ 7654.661406]  [<ffffffffb53ff9b4>] __asan_report_load4_noabort+0x34/0x40
      [ 7654.668789]  [<ffffffffc169b37b>] nlattr_set+0x164/0xde9 [openvswitch]
      [ 7654.676076]  [<ffffffffc167ef68>] ovs_nla_get_match+0x10c8/0x1900 [openvswitch]
      [ 7654.684234]  [<ffffffffb61e9cc8>] ? genl_rcv+0x28/0x40
      [ 7654.689968]  [<ffffffffb61e7733>] ? netlink_unicast+0x3f3/0x590
      [ 7654.696574]  [<ffffffffc167dea0>] ? ovs_nla_put_tunnel_info+0xb0/0xb0 [openvswitch]
      [ 7654.705122]  [<ffffffffb4f41b50>] ? unwind_get_return_address+0xb0/0xb0
      [ 7654.712503]  [<ffffffffb65d9355>] ? system_call_fastpath+0x1c/0x21
      [ 7654.719401]  [<ffffffffb4f41d79>] ? update_stack_state+0x229/0x370
      [ 7654.726298]  [<ffffffffb4f41d79>] ? update_stack_state+0x229/0x370
      [ 7654.733195]  [<ffffffffb53fe4b5>] ? kasan_unpoison_shadow+0x35/0x50
      [ 7654.740187]  [<ffffffffb53fe62a>] ? kasan_kmalloc+0xaa/0xe0
      [ 7654.746406]  [<ffffffffb53fec32>] ? kasan_slab_alloc+0x12/0x20
      [ 7654.752914]  [<ffffffffb53fe711>] ? memset+0x31/0x40
      [ 7654.758456]  [<ffffffffc165bf92>] ovs_flow_cmd_new+0x2b2/0xf00 [openvswitch]
      
      [snip]
      
      [ 7655.132484] The buggy address belongs to the variable:
      [ 7655.138226]  ovs_tunnel_key_lens+0xf0/0xffffffffffffd400 [openvswitch]
      [ 7655.145507]
      [ 7655.147166] Memory state around the buggy address:
      [ 7655.152514]  ffffffffc169eb80: 00 00 00 00 00 00 00 00 00 00 fa fa fa fa fa fa
      [ 7655.160585]  ffffffffc169ec00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      [ 7655.168644] >ffffffffc169ec80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 fa fa
      [ 7655.176701]                                                              ^
      [ 7655.184372]  ffffffffc169ed00: fa fa fa fa 00 00 00 00 fa fa fa fa 00 00 00 05
      [ 7655.192431]  ffffffffc169ed80: fa fa fa fa 00 00 00 00 00 00 00 00 00 00 00 00
      [ 7655.200490] ==================================================================
      Reported-by: default avatarHangbin Liu <liuhangbin@gmail.com>
      Fixes: 982b5270 ("openvswitch: Fix mask generation for nested attributes.")
      Signed-off-by: default avatarStefano Brivio <sbrivio@redhat.com>
      Reviewed-by: default avatarSabrina Dubroca <sd@queasysnail.net>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      72f17baf
    • Bhadram Varka's avatar
      net: phy: broadcom: add support for BCM89610 PHY · 23b83922
      Bhadram Varka authored
      It adds support for BCM89610 (Single-Port 10/100/1000BASE-T)
      transceiver which is used in P3310 Tegra186 platform.
      Signed-off-by: default avatarBhadram Varka <vbhadram@nvidia.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      23b83922
    • Linus Torvalds's avatar
      Merge tag 'linux-kselftest-4.17-rc4' of... · 15042698
      Linus Torvalds authored
      Merge tag 'linux-kselftest-4.17-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
      
      Pull kselftest fixes from Shuah Khan:
       "This Kselftest update for 4.17-rc4 consists of a fix for a syntax
        error in the script that runs selftests. Mathieu Desnoyers found this
        bug in the script on systems running GNU Make 3.8 or older"
      
      * tag 'linux-kselftest-4.17-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
        selftests: Fix lib.mk run_tests target shell script
      15042698
    • Linus Torvalds's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · e523a256
      Linus Torvalds authored
      Pull networking fixes from David Miller:
      
       1) Various sockmap fixes from John Fastabend (pinned map handling,
          blocking in recvmsg, double page put, error handling during redirect
          failures, etc.)
      
       2) Fix dead code handling in x86-64 JIT, from Gianluca Borello.
      
       3) Missing device put in RDS IB code, from Dag Moxnes.
      
       4) Don't process fast open during repair mode in TCP< from Yuchung
          Cheng.
      
       5) Move address/port comparison fixes in SCTP, from Xin Long.
      
       6) Handle add a bond slave's master into a bridge properly, from
          Hangbin Liu.
      
       7) IPv6 multipath code can operate on unitialized memory due to an
          assumption that the icmp header is in the linear SKB area. Fix from
          Eric Dumazet.
      
       8) Don't invoke do_tcp_sendpages() recursively via TLS, from Dave
          Watson.
      
      9) Fix memory leaks in x86-64 JIT, from Daniel Borkmann.
      
      10) RDS leaks kernel memory to userspace, from Eric Dumazet.
      
      11) DCCP can invoke a tasklet on a freed socket, take a refcount. Also
          from Eric Dumazet.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (78 commits)
        dccp: fix tasklet usage
        smc: fix sendpage() call
        net/smc: handle unregistered buffers
        net/smc: call consolidation
        qed: fix spelling mistake: "offloded" -> "offloaded"
        net/mlx5e: fix spelling mistake: "loobpack" -> "loopback"
        tcp: restore autocorking
        rds: do not leak kernel memory to user land
        qmi_wwan: do not steal interfaces from class drivers
        ipv4: fix fnhe usage by non-cached routes
        bpf: sockmap, fix error handling in redirect failures
        bpf: sockmap, zero sg_size on error when buffer is released
        bpf: sockmap, fix scatterlist update on error path in send with apply
        net_sched: fq: take care of throttled flows before reuse
        ipv6: Revert "ipv6: Allow non-gateway ECMP for IPv6"
        bpf, x64: fix memleak when not converging on calls
        bpf, x64: fix memleak when not converging after image
        net/smc: restrict non-blocking connect finish
        8139too: Use disable_irq_nosync() in rtl8139_poll_controller()
        sctp: fix the issue that the cookie-ack with auth can't get processed
        ...
      e523a256
    • Linus Torvalds's avatar
      Merge branch 'parisc-4.17-4' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux · bb609316
      Linus Torvalds authored
      Pull parisc fixes from Helge Deller:
       "Fix two section mismatches, convert to read_persistent_clock64(), add
        further documentation regarding the HPMC crash handler and make
        bzImage the default build target"
      
      * 'parisc-4.17-4' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
        parisc: Fix section mismatches
        parisc: drivers.c: Fix section mismatches
        parisc: time: Convert read_persistent_clock() to read_persistent_clock64()
        parisc: Document rules regarding checksum of HPMC handler
        parisc: Make bzImage default build target
      bb609316
    • Daniel Borkmann's avatar
      bpf: use array_index_nospec in find_prog_type · d0f1a451
      Daniel Borkmann authored
      Commit 9ef09e35 ("bpf: fix possible spectre-v1 in find_and_alloc_map()")
      converted find_and_alloc_map() over to use array_index_nospec() to sanitize
      map type that user space passes on map creation, and this patch does an
      analogous conversion for progs in find_prog_type() as it's also passed from
      user space when loading progs as attr->prog_type.
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      d0f1a451
  4. 03 May, 2018 17 commits
    • Mark Rutland's avatar
      bpf: fix possible spectre-v1 in find_and_alloc_map() · 9ef09e35
      Mark Rutland authored
      It's possible for userspace to control attr->map_type. Sanitize it when
      using it as an array index to prevent an out-of-bounds value being used
      under speculation.
      
      Found by smatch.
      Signed-off-by: default avatarMark Rutland <mark.rutland@arm.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Dan Carpenter <dan.carpenter@oracle.com>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: netdev@vger.kernel.org
      Acked-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      9ef09e35
    • Eric Dumazet's avatar
      dccp: fix tasklet usage · a8d7aa17
      Eric Dumazet authored
      syzbot reported a crash in tasklet_action_common() caused by dccp.
      
      dccp needs to make sure socket wont disappear before tasklet handler
      has completed.
      
      This patch takes a reference on the socket when arming the tasklet,
      and moves the sock_put() from dccp_write_xmit_timer() to dccp_write_xmitlet()
      
      kernel BUG at kernel/softirq.c:514!
      invalid opcode: 0000 [#1] SMP KASAN
      Dumping ftrace buffer:
         (ftrace buffer empty)
      Modules linked in:
      CPU: 1 PID: 17 Comm: ksoftirqd/1 Not tainted 4.17.0-rc3+ #30
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      RIP: 0010:tasklet_action_common.isra.19+0x6db/0x700 kernel/softirq.c:515
      RSP: 0018:ffff8801d9b3faf8 EFLAGS: 00010246
      dccp_close: ABORT with 65423 bytes unread
      RAX: 1ffff1003b367f6b RBX: ffff8801daf1f3f0 RCX: 0000000000000000
      RDX: ffff8801cf895498 RSI: 0000000000000004 RDI: 0000000000000000
      RBP: ffff8801d9b3fc40 R08: ffffed0039f12a95 R09: ffffed0039f12a94
      dccp_close: ABORT with 65423 bytes unread
      R10: ffffed0039f12a94 R11: ffff8801cf8954a3 R12: 0000000000000000
      R13: ffff8801d9b3fc18 R14: dffffc0000000000 R15: ffff8801cf895490
      FS:  0000000000000000(0000) GS:ffff8801daf00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000001b2bc28000 CR3: 00000001a08a9000 CR4: 00000000001406e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Call Trace:
       tasklet_action+0x1d/0x20 kernel/softirq.c:533
       __do_softirq+0x2e0/0xaf5 kernel/softirq.c:285
      dccp_close: ABORT with 65423 bytes unread
       run_ksoftirqd+0x86/0x100 kernel/softirq.c:646
       smpboot_thread_fn+0x417/0x870 kernel/smpboot.c:164
       kthread+0x345/0x410 kernel/kthread.c:238
       ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:412
      Code: 48 8b 85 e8 fe ff ff 48 8b 95 f0 fe ff ff e9 94 fb ff ff 48 89 95 f0 fe ff ff e8 81 53 6e 00 48 8b 95 f0 fe ff ff e9 62 fb ff ff <0f> 0b 48 89 cf 48 89 8d e8 fe ff ff e8 64 53 6e 00 48 8b 8d e8
      RIP: tasklet_action_common.isra.19+0x6db/0x700 kernel/softirq.c:515 RSP: ffff8801d9b3faf8
      
      Fixes: dc841e30 ("dccp: Extend CCID packet dequeueing interface")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarsyzbot <syzkaller@googlegroups.com>
      Cc: Gerrit Renker <gerrit@erg.abdn.ac.uk>
      Cc: dccp@vger.kernel.org
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      a8d7aa17
    • David S. Miller's avatar
      Merge branch 'smc-fixes' · 31140b47
      David S. Miller authored
      Ursula Braun says:
      
      ====================
      net/smc: fixes 2018/05/03
      
      here are smc fixes for 2 problems:
       * receive buffers in SMC must be registered. If registration fails
         these buffers must not be kept within the link group for reuse.
         Patch 1 is a preparational patch; patch 2 contains the fix.
       * sendpage: do not hold the sock lock when calling kernel_sendpage()
                   or sock_no_sendpage()
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      31140b47
    • Stefan Raspl's avatar
      smc: fix sendpage() call · bda27ff5
      Stefan Raspl authored
      The sendpage() call grabs the sock lock before calling the default
      implementation - which tries to grab it once again.
      Signed-off-by: default avatarStefan Raspl <raspl@linux.ibm.com>
      Signed-off-by: Ursula Braun <ubraun@linux.ibm.com><
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      bda27ff5
    • Karsten Graul's avatar
      net/smc: handle unregistered buffers · a6920d1d
      Karsten Graul authored
      When smc_wr_reg_send() fails then tag (regerr) the affected buffer and
      free it in smc_buf_unuse().
      Signed-off-by: default avatarKarsten Graul <kgraul@linux.ibm.com>
      Signed-off-by: default avatarUrsula Braun <ubraun@linux.ibm.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      a6920d1d
    • Karsten Graul's avatar
      net/smc: call consolidation · e63a5f8c
      Karsten Graul authored
      Consolidate the call to smc_wr_reg_send() in a new function.
      No functional changes.
      Signed-off-by: default avatarKarsten Graul <kgraul@linux.ibm.com>
      Signed-off-by: default avatarUrsula Braun <ubraun@linux.ibm.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      e63a5f8c
    • Colin Ian King's avatar
      qed: fix spelling mistake: "offloded" -> "offloaded" · df80b8fb
      Colin Ian King authored
      Trivial fix to spelling mistake in DP_NOTICE message
      Signed-off-by: default avatarColin Ian King <colin.king@canonical.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      df80b8fb
    • Colin Ian King's avatar
      net/mlx5e: fix spelling mistake: "loobpack" -> "loopback" · 4e11581c
      Colin Ian King authored
      Trivial fix to spelling mistake in netdev_err error message
      Signed-off-by: default avatarColin Ian King <colin.king@canonical.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      4e11581c
    • Linus Torvalds's avatar
      Merge tag 'dma-mapping-4.17-4' of git://git.infradead.org/users/hch/dma-mapping · c15f6d8d
      Linus Torvalds authored
      Pull dma-mapping fix from Christoph Hellwig:
       "Fix an incorrect warning selection introduced in the last merge
        window"
      
      * tag 'dma-mapping-4.17-4' of git://git.infradead.org/users/hch/dma-mapping:
        swiotlb: fix inversed DMA_ATTR_NO_WARN test
      c15f6d8d
    • Eric Dumazet's avatar
      tcp: restore autocorking · 114f39fe
      Eric Dumazet authored
      When adding rb-tree for TCP retransmit queue, we inadvertently broke
      TCP autocorking.
      
      tcp_should_autocork() should really check if the rtx queue is not empty.
      
      Tested:
      
      Before the fix :
      $ nstat -n;./netperf -H 10.246.7.152 -Cc -- -m 500;nstat | grep AutoCork
      MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.246.7.152 () port 0 AF_INET
      Recv   Send    Send                          Utilization       Service Demand
      Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
      Size   Size    Size     Time     Throughput  local    remote   local   remote
      bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB
      
      540000 262144    500    10.00      2682.85   2.47     1.59     3.618   2.329
      TcpExtTCPAutoCorking            33                 0.0
      
      // Same test, but forcing TCP_NODELAY
      $ nstat -n;./netperf -H 10.246.7.152 -Cc -- -D -m 500;nstat | grep AutoCork
      MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.246.7.152 () port 0 AF_INET : nodelay
      Recv   Send    Send                          Utilization       Service Demand
      Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
      Size   Size    Size     Time     Throughput  local    remote   local   remote
      bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB
      
      540000 262144    500    10.00      1408.75   2.44     2.96     6.802   8.259
      TcpExtTCPAutoCorking            1                  0.0
      
      After the fix :
      $ nstat -n;./netperf -H 10.246.7.152 -Cc -- -m 500;nstat | grep AutoCork
      MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.246.7.152 () port 0 AF_INET
      Recv   Send    Send                          Utilization       Service Demand
      Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
      Size   Size    Size     Time     Throughput  local    remote   local   remote
      bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB
      
      540000 262144    500    10.00      5472.46   2.45     1.43     1.761   1.027
      TcpExtTCPAutoCorking            361293             0.0
      
      // With TCP_NODELAY option
      $ nstat -n;./netperf -H 10.246.7.152 -Cc -- -D -m 500;nstat | grep AutoCork
      MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.246.7.152 () port 0 AF_INET : nodelay
      Recv   Send    Send                          Utilization       Service Demand
      Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
      Size   Size    Size     Time     Throughput  local    remote   local   remote
      bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB
      
      540000 262144    500    10.00      5454.96   2.46     1.63     1.775   1.174
      TcpExtTCPAutoCorking            315448             0.0
      
      Fixes: 75c119af ("tcp: implement rb-tree based retransmit queue")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarMichael Wenig <mwenig@vmware.com>
      Tested-by: default avatarMichael Wenig <mwenig@vmware.com>
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarMichael Wenig <mwenig@vmware.com>
      Tested-by: default avatarMichael Wenig <mwenig@vmware.com>
      Acked-by: default avatarNeal Cardwell <ncardwell@google.com>
      Acked-by: default avatarSoheil Hassas Yeganeh <soheil@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      114f39fe
    • Eric Dumazet's avatar
      rds: do not leak kernel memory to user land · eb80ca47
      Eric Dumazet authored
      syzbot/KMSAN reported an uninit-value in put_cmsg(), originating
      from rds_cmsg_recv().
      
      Simply clear the structure, since we have holes there, or since
      rx_traces might be smaller than RDS_MSG_RX_DGRAM_TRACE_MAX.
      
      BUG: KMSAN: uninit-value in copy_to_user include/linux/uaccess.h:184 [inline]
      BUG: KMSAN: uninit-value in put_cmsg+0x600/0x870 net/core/scm.c:242
      CPU: 0 PID: 4459 Comm: syz-executor582 Not tainted 4.16.0+ #87
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:17 [inline]
       dump_stack+0x185/0x1d0 lib/dump_stack.c:53
       kmsan_report+0x142/0x240 mm/kmsan/kmsan.c:1067
       kmsan_internal_check_memory+0x135/0x1e0 mm/kmsan/kmsan.c:1157
       kmsan_copy_to_user+0x69/0x160 mm/kmsan/kmsan.c:1199
       copy_to_user include/linux/uaccess.h:184 [inline]
       put_cmsg+0x600/0x870 net/core/scm.c:242
       rds_cmsg_recv net/rds/recv.c:570 [inline]
       rds_recvmsg+0x2db5/0x3170 net/rds/recv.c:657
       sock_recvmsg_nosec net/socket.c:803 [inline]
       sock_recvmsg+0x1d0/0x230 net/socket.c:810
       ___sys_recvmsg+0x3fb/0x810 net/socket.c:2205
       __sys_recvmsg net/socket.c:2250 [inline]
       SYSC_recvmsg+0x298/0x3c0 net/socket.c:2262
       SyS_recvmsg+0x54/0x80 net/socket.c:2257
       do_syscall_64+0x309/0x430 arch/x86/entry/common.c:287
       entry_SYSCALL_64_after_hwframe+0x3d/0xa2
      
      Fixes: 3289025a ("RDS: add receive message trace used by application")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarsyzbot <syzkaller@googlegroups.com>
      Cc: Santosh Shilimkar <santosh.shilimkar@oracle.com>
      Cc: linux-rdma <linux-rdma@vger.kernel.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      eb80ca47
    • Bjørn Mork's avatar
      qmi_wwan: do not steal interfaces from class drivers · 5697db4a
      Bjørn Mork authored
      The USB_DEVICE_INTERFACE_NUMBER matching macro assumes that
      the { vendorid, productid, interfacenumber } set uniquely
      identifies one specific function.  This has proven to fail
      for some configurable devices. One example is the Quectel
      EM06/EP06 where the same interface number can be either
      QMI or MBIM, without the device ID changing either.
      
      Fix by requiring the vendor-specific class for interface number
      based matching.  Functions of other classes can and should use
      class based matching instead.
      
      Fixes: 03304bcb ("net: qmi_wwan: use fixed interface number matching")
      Signed-off-by: default avatarBjørn Mork <bjorn@mork.no>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      5697db4a
    • Linus Torvalds's avatar
      Merge tag 'trace-v4.17-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace · f4ef6a43
      Linus Torvalds authored
      Pull tracing fixes from Steven Rostedt:
       "Various fixes in tracing:
      
         - Tracepoints should not give warning on OOM failures
      
         - Use special field for function pointer in trace event
      
         - Fix igrab issues in uprobes
      
         - Fixes to the new histogram triggers"
      
      * tag 'trace-v4.17-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
        tracepoint: Do not warn on ENOMEM
        tracing: Add field modifier parsing hist error for hist triggers
        tracing: Add field parsing hist error for hist triggers
        tracing: Restore proper field flag printing when displaying triggers
        tracing: initcall: Ordered comparison of function pointers
        tracing: Remove igrab() iput() call from uprobes.c
        tracing: Fix bad use of igrab in trace_uprobe.c
      f4ef6a43
    • Linus Torvalds's avatar
      Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input · ecd649b3
      Linus Torvalds authored
      Pull input updates from Dmitry Torokhov:
       "Just a few driver fixes"
      
      * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
        Input: atmel_mxt_ts - add missing compatible strings to OF device table
        Input: atmel_mxt_ts - fix the firmware update
        Input: atmel_mxt_ts - add touchpad button mapping for Samsung Chromebook Pro
        MAINTAINERS: Rakesh Iyer can't be reached anymore
        Input: hideep_ts - fix a typo in Kconfig
        Input: alps - fix reporting pressure of v3 trackstick
        Input: leds - fix out of bound access
        Input: synaptics-rmi4 - fix an unchecked out of memory error path
      ecd649b3
    • Julian Anastasov's avatar
      ipv4: fix fnhe usage by non-cached routes · 94720e3a
      Julian Anastasov authored
      Allow some non-cached routes to use non-expired fnhe:
      
      1. ip_del_fnhe: moved above and now called by find_exception.
      The 4.5+ commit deed49df expires fnhe only when caching
      routes. Change that to:
      
      1.1. use fnhe for non-cached local output routes, with the help
      from (2)
      
      1.2. allow __mkroute_input to detect expired fnhe (outdated
      fnhe_gw, for example) when do_cache is false, eg. when itag!=0
      for unicast destinations.
      
      2. __mkroute_output: keep fi to allow local routes with orig_oif != 0
      to use fnhe info even when the new route will not be cached into fnhe.
      After commit 839da4d9 ("net: ipv4: set orig_oif based on fib
      result for local traffic") it means all local routes will be affected
      because they are not cached. This change is used to solve a PMTU
      problem with IPVS (and probably Netfilter DNAT) setups that redirect
      local clients from target local IP (local route to Virtual IP)
      to new remote IP target, eg. IPVS TUN real server. Loopback has
      64K MTU and we need to create fnhe on the local route that will
      keep the reduced PMTU for the Virtual IP. Without this change
      fnhe_pmtu is updated from ICMP but never exposed to non-cached
      local routes. This includes routes with flowi4_oif!=0 for 4.6+ and
      with flowi4_oif=any for 4.14+).
      
      3. update_or_create_fnhe: make sure fnhe_expires is not 0 for
      new entries
      
      Fixes: 839da4d9 ("net: ipv4: set orig_oif based on fib result for local traffic")
      Fixes: d6d5e999 ("route: do not cache fib route info on local routes with oif")
      Fixes: deed49df ("route: check and remove route cache when we get route")
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Xin Long <lucien.xin@gmail.com>
      Signed-off-by: default avatarJulian Anastasov <ja@ssi.bg>
      Acked-by: default avatarDavid Ahern <dsahern@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      94720e3a
    • Linus Torvalds's avatar
      Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · 3b6f9793
      Linus Torvalds authored
      Pull SCSI fixes from James Bottomley:
       "Three small bug fixes: an illegally overlapping memcmp in target code,
        a potential infinite loop in isci under certain rare phy conditions
        and an ATA queue depth (performance) correction for storvsc"
      
      * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
        scsi: target: Fix fortify_panic kernel exception
        scsi: isci: Fix infinite loop in while loop
        scsi: storvsc: Set up correct queue depth values for IDE devices
      3b6f9793
    • David S. Miller's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf · e002434e
      David S. Miller authored
      Daniel Borkmann says:
      
      ====================
      pull-request: bpf 2018-05-03
      
      The following pull-request contains BPF updates for your *net* tree.
      
      The main changes are:
      
      1) Several BPF sockmap fixes mostly related to bugs in error path
         handling, that is, a bug in updating the scatterlist length /
         offset accounting, a missing sk_mem_uncharge() in redirect
         error handling, and a bug where the outstanding bytes counter
         sg_size was not zeroed, from John.
      
      2) Fix two memory leaks in the x86-64 BPF JIT, one in an error
         path where we still don't converge after image was allocated
         and another one where BPF calls are used and JIT passes don't
         converge, from Daniel.
      
      3) Minor fix in BPF selftests where in test_stacktrace_build_id()
         we drop useless args in urandom_read and we need to add a missing
         newline in a CHECK() error message, from Song.
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      e002434e
  5. 02 May, 2018 6 commits
    • Alexei Starovoitov's avatar
      Merge branch 'bpf-sockmap-fixes' · b5b6ff73
      Alexei Starovoitov authored
      John Fastabend says:
      
      ====================
      When I added the test_sockmap to selftests I mistakenly changed the
      test logic a bit. The result of this was on redirect cases we ended up
      choosing the wrong sock from the BPF program and ended up sending to a
      socket that had no receive handler. The result was the actual receive
      handler, running on a different socket, is timing out and closing the
      socket. This results in errors (-EPIPE to be specific) on the sending
      side. Typically happening if the sender does not complete the send
      before the receive side times out. So depending on timing and the size
      of the send we may get errors. This exposed some bugs in the sockmap
      error path handling.
      
      This series fixes the errors. The primary issue is we did not do proper
      memory accounting in these cases which resulted in missing a
      sk_mem_uncharge(). This happened in the redirect path and in one case
      on the normal send path. See the three patches for the details.
      
      The other take-away from this is we need to fix the test_sockmap and
      also add more negative test cases. That will happen in bpf-next.
      
      Finally, I tested this using the existing test_sockmap program, the
      older sockmap sample test script, and a few real use cases with
      Cilium. All of these seem to be in working correctly.
      
      v2: fix compiler warning, drop iterator variable 'i' that is no longer
          used in patch 3.
      ====================
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      b5b6ff73
    • John Fastabend's avatar
      bpf: sockmap, fix error handling in redirect failures · abaeb096
      John Fastabend authored
      When a redirect failure happens we release the buffers in-flight
      without calling a sk_mem_uncharge(), the uncharge is called before
      dropping the sock lock for the redirecte, however we missed updating
      the ring start index. When no apply actions are in progress this
      is OK because we uncharge the entire buffer before the redirect.
      But, when we have apply logic running its possible that only a
      portion of the buffer is being redirected. In this case we only
      do memory accounting for the buffer slice being redirected and
      expect to be able to loop over the BPF program again and/or if
      a sock is closed uncharge the memory at sock destruct time.
      
      With an invalid start index however the program logic looks at
      the start pointer index, checks the length, and when seeing the
      length is zero (from the initial release and failure to update
      the pointer) aborts without uncharging/releasing the remaining
      memory.
      
      The fix for this is simply to update the start index. To avoid
      fixing this error in two locations we do a small refactor and
      remove one case where it is open-coded. Then fix it in the
      single function.
      Signed-off-by: default avatarJohn Fastabend <john.fastabend@gmail.com>
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      abaeb096
    • John Fastabend's avatar
      bpf: sockmap, zero sg_size on error when buffer is released · fec51d40
      John Fastabend authored
      When an error occurs during a redirect we have two cases that need
      to be handled (i) we have a cork'ed buffer (ii) we have a normal
      sendmsg buffer.
      
      In the cork'ed buffer case we don't currently support recovering from
      errors in a redirect action. So the buffer is released and the error
      should _not_ be pushed back to the caller of sendmsg/sendpage. The
      rationale here is the user will get an error that relates to old
      data that may have been sent by some arbitrary thread on that sock.
      Instead we simple consume the data and tell the user that the data
      has been consumed. We may add proper error recovery in the future.
      However, this patch fixes a bug where the bytes outstanding counter
      sg_size was not zeroed. This could result in a case where if the user
      has both a cork'ed action and apply action in progress we may
      incorrectly call into the BPF program when the user expected an
      old verdict to be applied via the apply action. I don't have a use
      case where using apply and cork at the same time is valid but we
      never explicitly reject it because it should work fine. This patch
      ensures the sg_size is zeroed so we don't have this case.
      
      In the normal sendmsg buffer case (no cork data) we also do not
      zero sg_size. Again this can confuse the apply logic when the logic
      calls into the BPF program when the BPF programmer expected the old
      verdict to remain. So ensure we set sg_size to zero here as well. And
      additionally to keep the psock state in-sync with the sk_msg_buff
      release all the memory as well. Previously we did this before
      returning to the user but this left a gap where psock and sk_msg_buff
      states were out of sync which seems fragile. No additional overhead
      is taken here except for a call to check the length and realize its
      already been freed. This is in the error path as well so in my
      opinion lets have robust code over optimized error paths.
      Signed-off-by: default avatarJohn Fastabend <john.fastabend@gmail.com>
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      fec51d40
    • John Fastabend's avatar
      bpf: sockmap, fix scatterlist update on error path in send with apply · 3cc9a472
      John Fastabend authored
      When the call to do_tcp_sendpage() fails to send the complete block
      requested we either retry if only a partial send was completed or
      abort if we receive a error less than or equal to zero. Before
      returning though we must update the scatterlist length/offset to
      account for any partial send completed.
      
      Before this patch we did this at the end of the retry loop, but
      this was buggy when used while applying a verdict to fewer bytes
      than in the scatterlist. When the scatterlist length was being set
      we forgot to account for the apply logic reducing the size variable.
      So the result was we chopped off some bytes in the scatterlist without
      doing proper cleanup on them. This results in a WARNING when the
      sock is tore down because the bytes have previously been charged to
      the socket but are never uncharged.
      
      The simple fix is to simply do the accounting inside the retry loop
      subtracting from the absolute scatterlist values rather than trying
      to accumulate the totals and subtract at the end.
      Reported-by: default avatarAlexei Starovoitov <ast@kernel.org>
      Signed-off-by: default avatarJohn Fastabend <john.fastabend@gmail.com>
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      3cc9a472
    • Eric Dumazet's avatar
      net_sched: fq: take care of throttled flows before reuse · 7df40c26
      Eric Dumazet authored
      Normally, a socket can not be freed/reused unless all its TX packets
      left qdisc and were TX-completed. However connect(AF_UNSPEC) allows
      this to happen.
      
      With commit fc59d5bd ("pkt_sched: fq: clear time_next_packet for
      reused flows") we cleared f->time_next_packet but took no special
      action if the flow was still in the throttled rb-tree.
      
      Since f->time_next_packet is the key used in the rb-tree searches,
      blindly clearing it might break rb-tree integrity. We need to make
      sure the flow is no longer in the rb-tree to avoid this problem.
      
      Fixes: fc59d5bd ("pkt_sched: fq: clear time_next_packet for reused flows")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      7df40c26
    • Ido Schimmel's avatar
      ipv6: Revert "ipv6: Allow non-gateway ECMP for IPv6" · 30ca22e4
      Ido Schimmel authored
      This reverts commit edd7ceb7 ("ipv6: Allow non-gateway ECMP for
      IPv6").
      
      Eric reported a division by zero in rt6_multipath_rebalance() which is
      caused by above commit that considers identical local routes to be
      siblings. The division by zero happens because a nexthop weight is not
      set for local routes.
      
      Revert the commit as it does not fix a bug and has side effects.
      
      To reproduce:
      
      # ip -6 address add 2001:db8::1/64 dev dummy0
      # ip -6 address add 2001:db8::1/64 dev dummy1
      
      Fixes: edd7ceb7 ("ipv6: Allow non-gateway ECMP for IPv6")
      Signed-off-by: default avatarIdo Schimmel <idosch@mellanox.com>
      Reported-by: default avatarEric Dumazet <eric.dumazet@gmail.com>
      Tested-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      30ca22e4