1. 16 Apr, 2018 2 commits
    • Arnd Bergmann's avatar
      netfilter: fix CONFIG_NF_REJECT_IPV6=m link error · a6615743
      Arnd Bergmann authored
      We get a new link error with CONFIG_NFT_REJECT_INET=y and CONFIG_NF_REJECT_IPV6=m
      after larger parts of the nftables modules are linked together:
      
      net/netfilter/nft_reject_inet.o: In function `nft_reject_inet_eval':
      nft_reject_inet.c:(.text+0x17c): undefined reference to `nf_send_unreach6'
      nft_reject_inet.c:(.text+0x190): undefined reference to `nf_send_reset6'
      
      The problem is that with NF_TABLES_INET set, we implicitly try to use
      the ipv6 version as well for NFT_REJECT, but when CONFIG_IPV6 is set to
      a loadable module, it's impossible to reach that.
      
      The best workaround I found is to express the above as a Kconfig
      dependency, forcing NFT_REJECT itself to be 'm' in that particular
      configuration.
      
      Fixes: 02c7b25e ("netfilter: nf_tables: build-in filter chain type")
      Signed-off-by: default avatarArnd Bergmann <arnd@arndb.de>
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      a6615743
    • Cong Wang's avatar
      netfilter: conntrack: silent a memory leak warning · 114aa35d
      Cong Wang authored
      The following memory leak is false postive:
      
      unreferenced object 0xffff8f37f156fb38 (size 128):
        comm "softirq", pid 0, jiffies 4294899665 (age 11.292s)
        hex dump (first 32 bytes):
          6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b  kkkkkkkkkkkkkkkk
          00 00 00 00 30 00 20 00 48 6b 6b 6b 6b 6b 6b 6b  ....0. .Hkkkkkkk
        backtrace:
          [<000000004fda266a>] __kmalloc_track_caller+0x10d/0x141
          [<000000007b0a7e3c>] __krealloc+0x45/0x62
          [<00000000d08e0bfb>] nf_ct_ext_add+0xdc/0x133
          [<0000000099b47fd8>] init_conntrack+0x1b1/0x392
          [<0000000086dc36ec>] nf_conntrack_in+0x1ee/0x34b
          [<00000000940592de>] nf_hook_slow+0x36/0x95
          [<00000000d1bd4da7>] nf_hook.constprop.43+0x1c3/0x1dd
          [<00000000c3673266>] __ip_local_out+0xae/0xb4
          [<000000003e4192a6>] ip_local_out+0x17/0x33
          [<00000000b64356de>] igmp_ifc_timer_expire+0x23e/0x26f
          [<000000006a8f3032>] call_timer_fn+0x14c/0x2a5
          [<00000000650c1725>] __run_timers.part.34+0x150/0x182
          [<0000000090e6946e>] run_timer_softirq+0x2a/0x4c
          [<000000004d1e7293>] __do_softirq+0x1d1/0x3c2
          [<000000004643557d>] irq_exit+0x53/0xa2
          [<0000000029ddee8f>] smp_apic_timer_interrupt+0x22a/0x235
      
      because __krealloc() is not supposed to release the old
      memory and it is released later via kfree_rcu(). Since this is
      the only external user of __krealloc(), just mark it as not leak
      here.
      Signed-off-by: default avatarCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      114aa35d
  2. 11 Apr, 2018 1 commit
    • Jack Ma's avatar
      netfilter: xt_connmark: Add bit mapping for bit-shift operation. · cf43ae63
      Jack Ma authored
      With the addition of bit-shift operations, we are able to shift
      ct/skbmark based on user requirements. However, this change might also
      cause the most left/right hand- side mark to be accidentially lost
      during shift operations.
      
      This patch adds the ability to 'grep' certain bits based on ctmask or
      nfmask out of the original mark. Then, apply shift operations to achieve
      a new mapping between ctmark and skb->mark.
      
      For example: If someone would like save the fourth F bits of ctmark
      0xFFF(F)000F into the seventh hexadecimal (0) skb->mark 0xABC000(0)E.
      
      	new_targetmark = (ctmark & ctmask) >> 12;
      	(new) skb->mark = (skb->mark &~nfmask) ^
              	           new_targetmark;
      
      This will preserve the other bits that are not related to this
      operation.
      
      Fixes: 472a73e0 ("netfilter: xt_conntrack: Support bit-shifting for CONNMARK & MARK targets.")
      Reviewed-by: default avatarFlorian Westphal <fw@strlen.de>
      Signed-off-by: default avatarJack Ma <jack.ma@alliedtelesis.co.nz>
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      cf43ae63
  3. 09 Apr, 2018 6 commits
    • Florian Westphal's avatar
      netfilter: ebtables: don't attempt to allocate 0-sized compat array · 3f1e53ab
      Florian Westphal authored
      Dmitry reports 32bit ebtables on 64bit kernel got broken by
      a recent change that returns -EINVAL when ruleset has no entries.
      
      ebtables however only counts user-defined chains, so for the
      initial table nentries will be 0.
      
      Don't try to allocate the compat array in this case, as no user
      defined rules exist no rule will need 64bit translation.
      Reported-by: default avatarDmitry Vyukov <dvyukov@google.com>
      Fixes: 7d7d7e02 ("netfilter: compat: reject huge allocation requests")
      Signed-off-by: default avatarFlorian Westphal <fw@strlen.de>
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      3f1e53ab
    • Julian Anastasov's avatar
      ipvs: fix rtnl_lock lockups caused by start_sync_thread · 5c64576a
      Julian Anastasov authored
      syzkaller reports for wrong rtnl_lock usage in sync code [1] and [2]
      
      We have 2 problems in start_sync_thread if error path is
      taken, eg. on memory allocation error or failure to configure
      sockets for mcast group or addr/port binding:
      
      1. recursive locking: holding rtnl_lock while calling sock_release
      which in turn calls again rtnl_lock in ip_mc_drop_socket to leave
      the mcast group, as noticed by Florian Westphal. Additionally,
      sock_release can not be called while holding sync_mutex (ABBA
      deadlock).
      
      2. task hung: holding rtnl_lock while calling kthread_stop to
      stop the running kthreads. As the kthreads do the same to leave
      the mcast group (sock_release -> ip_mc_drop_socket -> rtnl_lock)
      they hang.
      
      Fix the problems by calling rtnl_unlock early in the error path,
      now sock_release is called after unlocking both mutexes.
      
      Problem 3 (task hung reported by syzkaller [2]) is variant of
      problem 2: use _trylock to prevent one user to call rtnl_lock and
      then while waiting for sync_mutex to block kthreads that execute
      sock_release when they are stopped by stop_sync_thread.
      
      [1]
      IPVS: stopping backup sync thread 4500 ...
      WARNING: possible recursive locking detected
      4.16.0-rc7+ #3 Not tainted
      --------------------------------------------
      syzkaller688027/4497 is trying to acquire lock:
        (rtnl_mutex){+.+.}, at: [<00000000bb14d7fb>] rtnl_lock+0x17/0x20
      net/core/rtnetlink.c:74
      
      but task is already holding lock:
      IPVS: stopping backup sync thread 4495 ...
        (rtnl_mutex){+.+.}, at: [<00000000bb14d7fb>] rtnl_lock+0x17/0x20
      net/core/rtnetlink.c:74
      
      other info that might help us debug this:
        Possible unsafe locking scenario:
      
              CPU0
              ----
         lock(rtnl_mutex);
         lock(rtnl_mutex);
      
        *** DEADLOCK ***
      
        May be due to missing lock nesting notation
      
      2 locks held by syzkaller688027/4497:
        #0:  (rtnl_mutex){+.+.}, at: [<00000000bb14d7fb>] rtnl_lock+0x17/0x20
      net/core/rtnetlink.c:74
        #1:  (ipvs->sync_mutex){+.+.}, at: [<00000000703f78e3>]
      do_ip_vs_set_ctl+0x10f8/0x1cc0 net/netfilter/ipvs/ip_vs_ctl.c:2388
      
      stack backtrace:
      CPU: 1 PID: 4497 Comm: syzkaller688027 Not tainted 4.16.0-rc7+ #3
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
      Google 01/01/2011
      Call Trace:
        __dump_stack lib/dump_stack.c:17 [inline]
        dump_stack+0x194/0x24d lib/dump_stack.c:53
        print_deadlock_bug kernel/locking/lockdep.c:1761 [inline]
        check_deadlock kernel/locking/lockdep.c:1805 [inline]
        validate_chain kernel/locking/lockdep.c:2401 [inline]
        __lock_acquire+0xe8f/0x3e00 kernel/locking/lockdep.c:3431
        lock_acquire+0x1d5/0x580 kernel/locking/lockdep.c:3920
        __mutex_lock_common kernel/locking/mutex.c:756 [inline]
        __mutex_lock+0x16f/0x1a80 kernel/locking/mutex.c:893
        mutex_lock_nested+0x16/0x20 kernel/locking/mutex.c:908
        rtnl_lock+0x17/0x20 net/core/rtnetlink.c:74
        ip_mc_drop_socket+0x88/0x230 net/ipv4/igmp.c:2643
        inet_release+0x4e/0x1c0 net/ipv4/af_inet.c:413
        sock_release+0x8d/0x1e0 net/socket.c:595
        start_sync_thread+0x2213/0x2b70 net/netfilter/ipvs/ip_vs_sync.c:1924
        do_ip_vs_set_ctl+0x1139/0x1cc0 net/netfilter/ipvs/ip_vs_ctl.c:2389
        nf_sockopt net/netfilter/nf_sockopt.c:106 [inline]
        nf_setsockopt+0x67/0xc0 net/netfilter/nf_sockopt.c:115
        ip_setsockopt+0x97/0xa0 net/ipv4/ip_sockglue.c:1261
        udp_setsockopt+0x45/0x80 net/ipv4/udp.c:2406
        sock_common_setsockopt+0x95/0xd0 net/core/sock.c:2975
        SYSC_setsockopt net/socket.c:1849 [inline]
        SyS_setsockopt+0x189/0x360 net/socket.c:1828
        do_syscall_64+0x281/0x940 arch/x86/entry/common.c:287
        entry_SYSCALL_64_after_hwframe+0x42/0xb7
      RIP: 0033:0x446a69
      RSP: 002b:00007fa1c3a64da8 EFLAGS: 00000246 ORIG_RAX: 0000000000000036
      RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 0000000000446a69
      RDX: 000000000000048b RSI: 0000000000000000 RDI: 0000000000000003
      RBP: 00000000006e29fc R08: 0000000000000018 R09: 0000000000000000
      R10: 00000000200000c0 R11: 0000000000000246 R12: 00000000006e29f8
      R13: 00676e697279656b R14: 00007fa1c3a659c0 R15: 00000000006e2b60
      
      [2]
      IPVS: sync thread started: state = BACKUP, mcast_ifn = syz_tun, syncid = 4,
      id = 0
      IPVS: stopping backup sync thread 25415 ...
      INFO: task syz-executor7:25421 blocked for more than 120 seconds.
             Not tainted 4.16.0-rc6+ #284
      "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      syz-executor7   D23688 25421   4408 0x00000004
      Call Trace:
        context_switch kernel/sched/core.c:2862 [inline]
        __schedule+0x8fb/0x1ec0 kernel/sched/core.c:3440
        schedule+0xf5/0x430 kernel/sched/core.c:3499
        schedule_timeout+0x1a3/0x230 kernel/time/timer.c:1777
        do_wait_for_common kernel/sched/completion.c:86 [inline]
        __wait_for_common kernel/sched/completion.c:107 [inline]
        wait_for_common kernel/sched/completion.c:118 [inline]
        wait_for_completion+0x415/0x770 kernel/sched/completion.c:139
        kthread_stop+0x14a/0x7a0 kernel/kthread.c:530
        stop_sync_thread+0x3d9/0x740 net/netfilter/ipvs/ip_vs_sync.c:1996
        do_ip_vs_set_ctl+0x2b1/0x1cc0 net/netfilter/ipvs/ip_vs_ctl.c:2394
        nf_sockopt net/netfilter/nf_sockopt.c:106 [inline]
        nf_setsockopt+0x67/0xc0 net/netfilter/nf_sockopt.c:115
        ip_setsockopt+0x97/0xa0 net/ipv4/ip_sockglue.c:1253
        sctp_setsockopt+0x2ca/0x63e0 net/sctp/socket.c:4154
        sock_common_setsockopt+0x95/0xd0 net/core/sock.c:3039
        SYSC_setsockopt net/socket.c:1850 [inline]
        SyS_setsockopt+0x189/0x360 net/socket.c:1829
        do_syscall_64+0x281/0x940 arch/x86/entry/common.c:287
        entry_SYSCALL_64_after_hwframe+0x42/0xb7
      RIP: 0033:0x454889
      RSP: 002b:00007fc927626c68 EFLAGS: 00000246 ORIG_RAX: 0000000000000036
      RAX: ffffffffffffffda RBX: 00007fc9276276d4 RCX: 0000000000454889
      RDX: 000000000000048c RSI: 0000000000000000 RDI: 0000000000000017
      RBP: 000000000072bf58 R08: 0000000000000018 R09: 0000000000000000
      R10: 0000000020000000 R11: 0000000000000246 R12: 00000000ffffffff
      R13: 000000000000051c R14: 00000000006f9b40 R15: 0000000000000001
      
      Showing all locks held in the system:
      2 locks held by khungtaskd/868:
        #0:  (rcu_read_lock){....}, at: [<00000000a1a8f002>]
      check_hung_uninterruptible_tasks kernel/hung_task.c:175 [inline]
        #0:  (rcu_read_lock){....}, at: [<00000000a1a8f002>] watchdog+0x1c5/0xd60
      kernel/hung_task.c:249
        #1:  (tasklist_lock){.+.+}, at: [<0000000037c2f8f9>]
      debug_show_all_locks+0xd3/0x3d0 kernel/locking/lockdep.c:4470
      1 lock held by rsyslogd/4247:
        #0:  (&f->f_pos_lock){+.+.}, at: [<000000000d8d6983>]
      __fdget_pos+0x12b/0x190 fs/file.c:765
      2 locks held by getty/4338:
        #0:  (&tty->ldisc_sem){++++}, at: [<00000000bee98654>]
      ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365
        #1:  (&ldata->atomic_read_lock){+.+.}, at: [<00000000c1d180aa>]
      n_tty_read+0x2ef/0x1a40 drivers/tty/n_tty.c:2131
      2 locks held by getty/4339:
        #0:  (&tty->ldisc_sem){++++}, at: [<00000000bee98654>]
      ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365
        #1:  (&ldata->atomic_read_lock){+.+.}, at: [<00000000c1d180aa>]
      n_tty_read+0x2ef/0x1a40 drivers/tty/n_tty.c:2131
      2 locks held by getty/4340:
        #0:  (&tty->ldisc_sem){++++}, at: [<00000000bee98654>]
      ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365
        #1:  (&ldata->atomic_read_lock){+.+.}, at: [<00000000c1d180aa>]
      n_tty_read+0x2ef/0x1a40 drivers/tty/n_tty.c:2131
      2 locks held by getty/4341:
        #0:  (&tty->ldisc_sem){++++}, at: [<00000000bee98654>]
      ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365
        #1:  (&ldata->atomic_read_lock){+.+.}, at: [<00000000c1d180aa>]
      n_tty_read+0x2ef/0x1a40 drivers/tty/n_tty.c:2131
      2 locks held by getty/4342:
        #0:  (&tty->ldisc_sem){++++}, at: [<00000000bee98654>]
      ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365
        #1:  (&ldata->atomic_read_lock){+.+.}, at: [<00000000c1d180aa>]
      n_tty_read+0x2ef/0x1a40 drivers/tty/n_tty.c:2131
      2 locks held by getty/4343:
        #0:  (&tty->ldisc_sem){++++}, at: [<00000000bee98654>]
      ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365
        #1:  (&ldata->atomic_read_lock){+.+.}, at: [<00000000c1d180aa>]
      n_tty_read+0x2ef/0x1a40 drivers/tty/n_tty.c:2131
      2 locks held by getty/4344:
        #0:  (&tty->ldisc_sem){++++}, at: [<00000000bee98654>]
      ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365
        #1:  (&ldata->atomic_read_lock){+.+.}, at: [<00000000c1d180aa>]
      n_tty_read+0x2ef/0x1a40 drivers/tty/n_tty.c:2131
      3 locks held by kworker/0:5/6494:
        #0:  ((wq_completion)"%s"("ipv6_addrconf")){+.+.}, at:
      [<00000000a062b18e>] work_static include/linux/workqueue.h:198 [inline]
        #0:  ((wq_completion)"%s"("ipv6_addrconf")){+.+.}, at:
      [<00000000a062b18e>] set_work_data kernel/workqueue.c:619 [inline]
        #0:  ((wq_completion)"%s"("ipv6_addrconf")){+.+.}, at:
      [<00000000a062b18e>] set_work_pool_and_clear_pending kernel/workqueue.c:646
      [inline]
        #0:  ((wq_completion)"%s"("ipv6_addrconf")){+.+.}, at:
      [<00000000a062b18e>] process_one_work+0xb12/0x1bb0 kernel/workqueue.c:2084
        #1:  ((addr_chk_work).work){+.+.}, at: [<00000000278427d5>]
      process_one_work+0xb89/0x1bb0 kernel/workqueue.c:2088
        #2:  (rtnl_mutex){+.+.}, at: [<00000000066e35ac>] rtnl_lock+0x17/0x20
      net/core/rtnetlink.c:74
      1 lock held by syz-executor7/25421:
        #0:  (ipvs->sync_mutex){+.+.}, at: [<00000000d414a689>]
      do_ip_vs_set_ctl+0x277/0x1cc0 net/netfilter/ipvs/ip_vs_ctl.c:2393
      2 locks held by syz-executor7/25427:
        #0:  (rtnl_mutex){+.+.}, at: [<00000000066e35ac>] rtnl_lock+0x17/0x20
      net/core/rtnetlink.c:74
        #1:  (ipvs->sync_mutex){+.+.}, at: [<00000000e6d48489>]
      do_ip_vs_set_ctl+0x10f8/0x1cc0 net/netfilter/ipvs/ip_vs_ctl.c:2388
      1 lock held by syz-executor7/25435:
        #0:  (rtnl_mutex){+.+.}, at: [<00000000066e35ac>] rtnl_lock+0x17/0x20
      net/core/rtnetlink.c:74
      1 lock held by ipvs-b:2:0/25415:
        #0:  (rtnl_mutex){+.+.}, at: [<00000000066e35ac>] rtnl_lock+0x17/0x20
      net/core/rtnetlink.c:74
      
      Reported-and-tested-by: syzbot+a46d6abf9d56b1365a72@syzkaller.appspotmail.com
      Reported-and-tested-by: syzbot+5fe074c01b2032ce9618@syzkaller.appspotmail.com
      Fixes: e0b26cc9 ("ipvs: call rtnl_lock early")
      Signed-off-by: default avatarJulian Anastasov <ja@ssi.bg>
      Signed-off-by: default avatarSimon Horman <horms@verge.net.au>
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      5c64576a
    • Florian Westphal's avatar
      netfilter: nf_conntrack_sip: allow duplicate SDP expectations · 876c2731
      Florian Westphal authored
      Callum Sinclair reported SIP IP Phone errors that he tracked down to
      such phones sending session descriptions for different media types but
      with same port numbers.
      
      The expect core will only 'refresh' existing expectation if it is
      from same master AND same expectation class (media type).
      As expectation class is different, we get an error.
      
      The SIP connection tracking code will then
      
      1). drop the SDP packet
      2). if an rtp expectation was already installed successfully,
          error on rtcp expectation will cancel the rtp one.
      
      Make the expect core report back to caller when the conflict is due
      to different expectation class and have SIP tracker ignore soft-error.
      Reported-by: default avatarCallum Sinclair <Callum.Sinclair@alliedtelesis.co.nz>
      Tested-by: default avatarCallum Sinclair <Callum.Sinclair@alliedtelesis.co.nz>
      Signed-off-by: default avatarFlorian Westphal <fw@strlen.de>
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      876c2731
    • haibinzhang(张海斌)'s avatar
      vhost-net: set packet weight of tx polling to 2 * vq size · a2ac9990
      haibinzhang(张海斌) authored
      handle_tx will delay rx for tens or even hundreds of milliseconds when tx busy
      polling udp packets with small length(e.g. 1byte udp payload), because setting
      VHOST_NET_WEIGHT takes into account only sent-bytes but no single packet length.
      
      Ping-Latencies shown below were tested between two Virtual Machines using
      netperf (UDP_STREAM, len=1), and then another machine pinged the client:
      
      vq size=256
      Packet-Weight   Ping-Latencies(millisecond)
                         min      avg       max
      Origin           3.319   18.489    57.303
      64               1.643    2.021     2.552
      128              1.825    2.600     3.224
      256              1.997    2.710     4.295
      512              1.860    3.171     4.631
      1024             2.002    4.173     9.056
      2048             2.257    5.650     9.688
      4096             2.093    8.508    15.943
      
      vq size=512
      Packet-Weight   Ping-Latencies(millisecond)
                         min      avg       max
      Origin           6.537   29.177    66.245
      64               2.798    3.614     4.403
      128              2.861    3.820     4.775
      256              3.008    4.018     4.807
      512              3.254    4.523     5.824
      1024             3.079    5.335     7.747
      2048             3.944    8.201    12.762
      4096             4.158   11.057    19.985
      
      Seems pretty consistent, a small dip at 2 VQ sizes.
      Ring size is a hint from device about a burst size it can tolerate. Based on
      benchmarks, set the weight to 2 * vq size.
      
      To evaluate this change, another tests were done using netperf(RR, TX) between
      two machines with Intel(R) Xeon(R) Gold 6133 CPU @ 2.50GHz, and vq size was
      tweaked through qemu. Results shown below does not show obvious changes.
      
      vq size=256 TCP_RR                vq size=512 TCP_RR
      size/sessions/+thu%/+normalize%   size/sessions/+thu%/+normalize%
         1/       1/  -7%/        -2%      1/       1/   0%/        -2%
         1/       4/  +1%/         0%      1/       4/  +1%/         0%
         1/       8/  +1%/        -2%      1/       8/   0%/        +1%
        64/       1/  -6%/         0%     64/       1/  +7%/        +3%
        64/       4/   0%/        +2%     64/       4/  -1%/        +1%
        64/       8/   0%/         0%     64/       8/  -1%/        -2%
       256/       1/  -3%/        -4%    256/       1/  -4%/        -2%
       256/       4/  +3%/        +4%    256/       4/  +1%/        +2%
       256/       8/  +2%/         0%    256/       8/  +1%/        -1%
      
      vq size=256 UDP_RR                vq size=512 UDP_RR
      size/sessions/+thu%/+normalize%   size/sessions/+thu%/+normalize%
         1/       1/  -5%/        +1%      1/       1/  -3%/        -2%
         1/       4/  +4%/        +1%      1/       4/  -2%/        +2%
         1/       8/  -1%/        -1%      1/       8/  -1%/         0%
        64/       1/  -2%/        -3%     64/       1/  +1%/        +1%
        64/       4/  -5%/        -1%     64/       4/  +2%/         0%
        64/       8/   0%/        -1%     64/       8/  -2%/        +1%
       256/       1/  +7%/        +1%    256/       1/  -7%/         0%
       256/       4/  +1%/        +1%    256/       4/  -3%/        -4%
       256/       8/  +2%/        +2%    256/       8/  +1%/        +1%
      
      vq size=256 TCP_STREAM            vq size=512 TCP_STREAM
      size/sessions/+thu%/+normalize%   size/sessions/+thu%/+normalize%
        64/       1/   0%/        -3%     64/       1/   0%/         0%
        64/       4/  +3%/        -1%     64/       4/  -2%/        +4%
        64/       8/  +9%/        -4%     64/       8/  -1%/        +2%
       256/       1/  +1%/        -4%    256/       1/  +1%/        +1%
       256/       4/  -1%/        -1%    256/       4/  -3%/         0%
       256/       8/  +7%/        +5%    256/       8/  -3%/         0%
       512/       1/  +1%/         0%    512/       1/  -1%/        -1%
       512/       4/  +1%/        -1%    512/       4/   0%/         0%
       512/       8/  +7%/        -5%    512/       8/  +6%/        -1%
      1024/       1/   0%/        -1%   1024/       1/   0%/        +1%
      1024/       4/  +3%/         0%   1024/       4/  +1%/         0%
      1024/       8/  +8%/        +5%   1024/       8/  -1%/         0%
      2048/       1/  +2%/        +2%   2048/       1/  -1%/         0%
      2048/       4/  +1%/         0%   2048/       4/   0%/        -1%
      2048/       8/  -2%/         0%   2048/       8/   5%/        -1%
      4096/       1/  -2%/         0%   4096/       1/  -2%/         0%
      4096/       4/  +2%/         0%   4096/       4/   0%/         0%
      4096/       8/  +9%/        -2%   4096/       8/  -5%/        -1%
      Acked-by: default avatarMichael S. Tsirkin <mst@redhat.com>
      Signed-off-by: default avatarHaibin Zhang <haibinzhang@tencent.com>
      Signed-off-by: default avatarYunfang Tai <yunfangtai@tencent.com>
      Signed-off-by: default avatarLidong Chen <lidongchen@tencent.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      a2ac9990
    • Vadim Lomovtsev's avatar
      net: thunderx: rework mac addresses list to u64 array · 9b5c4dfb
      Vadim Lomovtsev authored
      It is too expensive to pass u64 values via linked list, instead
      allocate array for them by overall number of mac addresses from netdev.
      
      This eventually removes multiple kmalloc() calls, aviod memory
      fragmentation and allow to put single null check on kmalloc
      return value in order to prevent a potential null pointer dereference.
      
      Addresses-Coverity-ID: 1467429 ("Dereference null return value")
      Fixes: 37c3347e ("net: thunderx: add ndo_set_rx_mode callback implementation for VF")
      Reported-by: default avatarDan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: default avatarVadim Lomovtsev <Vadim.Lomovtsev@cavium.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      9b5c4dfb
    • Eric Dumazet's avatar
      inetpeer: fix uninit-value in inet_getpeer · b6a37e5e
      Eric Dumazet authored
      syzbot/KMSAN reported that p->dtime was read while it was
      not yet initialized in :
      
      	delta = (__u32)jiffies - p->dtime;
      	if (delta < ttl || !refcount_dec_if_one(&p->refcnt))
      		gc_stack[i] = NULL;
      
      This is a false positive, because the inetpeer wont be erased
      from rb-tree if the refcount_dec_if_one(&p->refcnt) does not
      succeed. And this wont happen before first inet_putpeer() call
      for this inetpeer has been done, and ->dtime field is written
      exactly before the refcount_dec_and_test(&p->refcnt).
      
      The KMSAN report was :
      
      BUG: KMSAN: uninit-value in inet_peer_gc net/ipv4/inetpeer.c:163 [inline]
      BUG: KMSAN: uninit-value in inet_getpeer+0x1567/0x1e70 net/ipv4/inetpeer.c:228
      CPU: 0 PID: 9494 Comm: syz-executor5 Not tainted 4.16.0+ #82
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:17 [inline]
       dump_stack+0x185/0x1d0 lib/dump_stack.c:53
       kmsan_report+0x142/0x240 mm/kmsan/kmsan.c:1067
       __msan_warning_32+0x6c/0xb0 mm/kmsan/kmsan_instr.c:676
       inet_peer_gc net/ipv4/inetpeer.c:163 [inline]
       inet_getpeer+0x1567/0x1e70 net/ipv4/inetpeer.c:228
       inet_getpeer_v4 include/net/inetpeer.h:110 [inline]
       icmpv4_xrlim_allow net/ipv4/icmp.c:330 [inline]
       icmp_send+0x2b44/0x3050 net/ipv4/icmp.c:725
       ip_options_compile+0x237c/0x29f0 net/ipv4/ip_options.c:472
       ip_rcv_options net/ipv4/ip_input.c:284 [inline]
       ip_rcv_finish+0xda8/0x16d0 net/ipv4/ip_input.c:365
       NF_HOOK include/linux/netfilter.h:288 [inline]
       ip_rcv+0x119d/0x16f0 net/ipv4/ip_input.c:493
       __netif_receive_skb_core+0x47cf/0x4a80 net/core/dev.c:4562
       __netif_receive_skb net/core/dev.c:4627 [inline]
       netif_receive_skb_internal+0x49d/0x630 net/core/dev.c:4701
       netif_receive_skb+0x230/0x240 net/core/dev.c:4725
       tun_rx_batched drivers/net/tun.c:1555 [inline]
       tun_get_user+0x6d88/0x7580 drivers/net/tun.c:1962
       tun_chr_write_iter+0x1d4/0x330 drivers/net/tun.c:1990
       do_iter_readv_writev+0x7bb/0x970 include/linux/fs.h:1776
       do_iter_write+0x30d/0xd40 fs/read_write.c:932
       vfs_writev fs/read_write.c:977 [inline]
       do_writev+0x3c9/0x830 fs/read_write.c:1012
       SYSC_writev+0x9b/0xb0 fs/read_write.c:1085
       SyS_writev+0x56/0x80 fs/read_write.c:1082
       do_syscall_64+0x309/0x430 arch/x86/entry/common.c:287
       entry_SYSCALL_64_after_hwframe+0x3d/0xa2
      RIP: 0033:0x455111
      RSP: 002b:00007fae0365cba0 EFLAGS: 00000293 ORIG_RAX: 0000000000000014
      RAX: ffffffffffffffda RBX: 000000000000002e RCX: 0000000000455111
      RDX: 0000000000000001 RSI: 00007fae0365cbf0 RDI: 00000000000000fc
      RBP: 0000000020000040 R08: 00000000000000fc R09: 0000000000000000
      R10: 000000000000002e R11: 0000000000000293 R12: 00000000ffffffff
      R13: 0000000000000658 R14: 00000000006fc8e0 R15: 0000000000000000
      
      Uninit was created at:
       kmsan_save_stack_with_flags mm/kmsan/kmsan.c:278 [inline]
       kmsan_internal_poison_shadow+0xb8/0x1b0 mm/kmsan/kmsan.c:188
       kmsan_kmalloc+0x94/0x100 mm/kmsan/kmsan.c:314
       kmem_cache_alloc+0xaab/0xb90 mm/slub.c:2756
       inet_getpeer+0xed8/0x1e70 net/ipv4/inetpeer.c:210
       inet_getpeer_v4 include/net/inetpeer.h:110 [inline]
       ip4_frag_init+0x4d1/0x740 net/ipv4/ip_fragment.c:153
       inet_frag_alloc net/ipv4/inet_fragment.c:369 [inline]
       inet_frag_create net/ipv4/inet_fragment.c:385 [inline]
       inet_frag_find+0x7da/0x1610 net/ipv4/inet_fragment.c:418
       ip_find net/ipv4/ip_fragment.c:275 [inline]
       ip_defrag+0x448/0x67a0 net/ipv4/ip_fragment.c:676
       ip_check_defrag+0x775/0xda0 net/ipv4/ip_fragment.c:724
       packet_rcv_fanout+0x2a8/0x8d0 net/packet/af_packet.c:1447
       deliver_skb net/core/dev.c:1897 [inline]
       deliver_ptype_list_skb net/core/dev.c:1912 [inline]
       __netif_receive_skb_core+0x314a/0x4a80 net/core/dev.c:4545
       __netif_receive_skb net/core/dev.c:4627 [inline]
       netif_receive_skb_internal+0x49d/0x630 net/core/dev.c:4701
       netif_receive_skb+0x230/0x240 net/core/dev.c:4725
       tun_rx_batched drivers/net/tun.c:1555 [inline]
       tun_get_user+0x6d88/0x7580 drivers/net/tun.c:1962
       tun_chr_write_iter+0x1d4/0x330 drivers/net/tun.c:1990
       do_iter_readv_writev+0x7bb/0x970 include/linux/fs.h:1776
       do_iter_write+0x30d/0xd40 fs/read_write.c:932
       vfs_writev fs/read_write.c:977 [inline]
       do_writev+0x3c9/0x830 fs/read_write.c:1012
       SYSC_writev+0x9b/0xb0 fs/read_write.c:1085
       SyS_writev+0x56/0x80 fs/read_write.c:1082
       do_syscall_64+0x309/0x430 arch/x86/entry/common.c:287
       entry_SYSCALL_64_after_hwframe+0x3d/0xa2
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarsyzbot <syzkaller@googlegroups.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      b6a37e5e
  4. 08 Apr, 2018 25 commits
    • Esben Haabendal's avatar
      dp83640: Ensure against premature access to PHY registers after reset · 76327a35
      Esben Haabendal authored
      The datasheet specifies a 3uS pause after performing a software
      reset. The default implementation of genphy_soft_reset() does not
      provide this, so implement soft_reset with the needed pause.
      Signed-off-by: default avatarEsben Haabendal <eha@deif.com>
      Reviewed-by: default avatarAndrew Lunn <andrew@lunn.ch>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      76327a35
    • David S. Miller's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf · 1f1cba78
      David S. Miller authored
      Daniel Borkmann says:
      
      ====================
      pull-request: bpf 2018-04-09
      
      The following pull-request contains BPF updates for your *net* tree.
      
      The main changes are:
      
      1) Two sockmap fixes: i) fix a potential warning when a socket with
         pending cork data is closed by freeing the memory right when the
         socket is closed instead of seeing still outstanding memory at
         garbage collector time, ii) fix a NULL pointer deref in case of
         duplicates release calls, so make sure to only reset the sk_prot
         pointer when it's in a valid state to do so, both from John.
      
      2) Fix a compilation warning in bpf_prog_attach_check_attach_type()
         by moving the function under CONFIG_CGROUP_BPF ifdef since only
         used there, from Anders.
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      1f1cba78
    • David S. Miller's avatar
      Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth · 4c7c12e0
      David S. Miller authored
      Johan Hedberg says:
      
      ====================
      pull request: bluetooth 2018-04-08
      
      Here's one important Bluetooth fix for the 4.17-rc series that's needed
      to pass several Bluetooth qualification test cases.
      
      Let me know if there are any issues pulling. Thanks.
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      4c7c12e0
    • Jiri Pirko's avatar
      devlink: convert occ_get op to separate registration · fc56be47
      Jiri Pirko authored
      This resolves race during initialization where the resources with
      ops are registered before driver and the structures used by occ_get
      op is initialized. So keep occ_get callbacks registered only when
      all structs are initialized.
      
      The example flows, as it is in mlxsw:
      1) driver load/asic probe:
         mlxsw_core
            -> mlxsw_sp_resources_register
              -> mlxsw_sp_kvdl_resources_register
                -> devlink_resource_register IDX
         mlxsw_spectrum
            -> mlxsw_sp_kvdl_init
              -> mlxsw_sp_kvdl_parts_init
                -> mlxsw_sp_kvdl_part_init
                  -> devlink_resource_size_get IDX (to get the current setup
                                                    size from devlink)
              -> devlink_resource_occ_get_register IDX (register current
                                                        occupancy getter)
      2) reload triggered by devlink command:
        -> mlxsw_devlink_core_bus_device_reload
          -> mlxsw_sp_fini
            -> mlxsw_sp_kvdl_fini
      	-> devlink_resource_occ_get_unregister IDX
          (struct mlxsw_sp *mlxsw_sp is freed at this point, call to occ get
           which is using mlxsw_sp would cause use-after free)
          -> mlxsw_sp_init
            -> mlxsw_sp_kvdl_init
              -> mlxsw_sp_kvdl_parts_init
                -> mlxsw_sp_kvdl_part_init
                  -> devlink_resource_size_get IDX (to get the current setup
                                                    size from devlink)
              -> devlink_resource_occ_get_register IDX (register current
                                                        occupancy getter)
      
      Fixes: d9f9b9a4 ("devlink: Add support for resource abstraction")
      Signed-off-by: default avatarJiri Pirko <jiri@mellanox.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      fc56be47
    • Esben Haabendal's avatar
      ARM: dts: ls1021a: Specify TBIPA register address · 55711961
      Esben Haabendal authored
      The current (mildly evil) fsl_pq_mdio code uses an undocumented shadow of
      the TBIPA register on LS1021A, which happens to be read-only.
      Changing TBI PHY address therefore does not work on LS1021A.
      
      The real (and documented) address of the TBIPA registere lies in the eTSEC
      block and not in MDIO/MII, which is read/write, so using that fixes
      the problem.
      Signed-off-by: default avatarEsben Haabendal <eha@deif.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      55711961
    • Esben Haabendal's avatar
      net/fsl_pq_mdio: Allow explicit speficition of TBIPA address · 21481189
      Esben Haabendal authored
      This introduces a simpler and generic method for for finding (and mapping)
      the TBIPA register.
      
      Instead of relying of complicated logic for finding the TBIPA register
      address based on the MDIO or MII register block base
      address, which even in some cases relies on undocumented shadow registers,
      a second "reg" entry for the mdio bus devicetree node specifies the TBIPA
      register.
      
      Backwards compatibility is kept, as the existing logic is applied when
      only a single "reg" mapping is specified.
      Signed-off-by: default avatarEsben Haabendal <eha@deif.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      21481189
    • David S. Miller's avatar
      Merge branch 'ibmvnic-Fix-driver-reset-and-DMA-bugs' · 4e31a684
      David S. Miller authored
      Thomas Falcon says:
      
      ====================
      ibmvnic: Fix driver reset and DMA bugs
      
      This patch series introduces some fixes to the driver reset
      routines and a patch that fixes mistakes caught by the kernel
      DMA debugger.
      
      The reset fixes include a fix to reset TX queue counters properly
      after a reset as well as updates to driver reset error-handling code.
      It also provides updates to the reset handling routine for redundant
      backing VF failover and partition migration cases.
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      4e31a684
    • Nathan Fontenot's avatar
      ibmvnic: Do not reset CRQ for Mobility driver resets · 30f79625
      Nathan Fontenot authored
      When resetting the ibmvnic driver after a partition migration occurs
      there is no requirement to do a reset of the main CRQ. The current
      driver code does the required re-enable of the main CRQ, then does
      a reset of the main CRQ later.
      
      What we should be doing for a driver reset after a migration is to
      re-enable the main CRQ, release all the sub-CRQs, and then allocate
      new sub-CRQs after capability negotiation.
      
      This patch updates the handling of mobility resets to do the proper
      work and not reset the main CRQ. To do this the initialization/reset
      of the main CRQ had to be moved out of the ibmvnic_init routine
      and in to the ibmvnic_probe and do_reset routines.
      Signed-off-by: default avatarNathan Fontenot <nfont@linux.vnet.ibm.com>
      Signed-off-by: default avatarThomas Falcon <tlfalcon@linux.vnet.ibm.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      30f79625
    • Thomas Falcon's avatar
      ibmvnic: Fix failover case for non-redundant configuration · 5a18e1e0
      Thomas Falcon authored
      There is a failover case for a non-redundant pseries VNIC
      configuration that was not being handled properly. The current
      implementation assumes that the driver will always have a redandant
      device to communicate with following a failover notification. There
      are cases, however, when a non-redundant configuration can receive
      a failover request. If that happens, the driver should wait until
      it receives a signal that the device is ready for operation.
      
      The driver is agnostic of its backing hardware configuration,
      so this fix necessarily affects all device failover management.
      The driver needs to wait until it receives a signal that the device
      is ready for resetting. A flag is introduced to track this intermediary
      state where the driver is waiting for an active device.
      Signed-off-by: default avatarThomas Falcon <tlfalcon@linux.vnet.ibm.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      5a18e1e0
    • Thomas Falcon's avatar
      ibmvnic: Fix reset scheduler error handling · af894d23
      Thomas Falcon authored
      In some cases, if the driver is waiting for a reset following
      a device parameter change, failure to schedule a reset can result
      in a hang since a completion signal is never sent.
      
      If the device configuration is being altered by a tool such
      as ethtool or ifconfig, it could cause the console to hang
      if the reset request does not get scheduled. Add some additional
      error handling code to exit the wait_for_completion if there is
      one in progress.
      Signed-off-by: default avatarThomas Falcon <tlfalcon@linux.vnet.ibm.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      af894d23
    • Thomas Falcon's avatar
      ibmvnic: Zero used TX descriptor counter on reset · 41f71467
      Thomas Falcon authored
      The counter that tracks used TX descriptors pending completion
      needs to be zeroed as part of a device reset. This change fixes
      a bug causing transmit queues to be stopped unnecessarily and in
      some cases a transmit queue stall and timeout reset. If the counter
      is not reset, the remaining descriptors will not be "removed",
      effectively reducing queue capacity. If the queue is over half full,
      it will cause the queue to stall if stopped.
      Signed-off-by: default avatarThomas Falcon <tlfalcon@linux.vnet.ibm.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      41f71467
    • Thomas Falcon's avatar
      ibmvnic: Fix DMA mapping mistakes · 37e40fa8
      Thomas Falcon authored
      Fix some mistakes caught by the DMA debugger. The first change
      fixes a unnecessary unmap that should have been removed in an
      earlier update. The next hunk fixes another bad unmap by zeroing
      the bit checked to determine that an unmap is needed. The final
      change fixes some buffers that are unmapped with the wrong
      direction specified.
      Signed-off-by: default avatarThomas Falcon <tlfalcon@linux.vnet.ibm.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      37e40fa8
    • Cong Wang's avatar
      tipc: use the right skb in tipc_sk_fill_sock_diag() · e41f0548
      Cong Wang authored
      Commit 4b2e6877 ("tipc: Fix namespace violation in tipc_sk_fill_sock_diag")
      tried to fix the crash but failed, the crash is still 100% reproducible
      with it.
      
      In tipc_sk_fill_sock_diag(), skb is the diag dump we are filling, it is not
      correct to retrieve its NETLINK_CB(), instead, like other protocol diag,
      we should use NETLINK_CB(cb->skb).sk here.
      
      Reported-by: <syzbot+326e587eff1074657718@syzkaller.appspotmail.com>
      Fixes: 4b2e6877 ("tipc: Fix namespace violation in tipc_sk_fill_sock_diag")
      Fixes: c30b70de (tipc: implement socket diagnostics for AF_TIPC)
      Cc: GhantaKrishnamurthy MohanKrishna <mohan.krishna.ghanta.krishnamurthy@ericsson.com>
      Cc: Jon Maloy <jon.maloy@ericsson.com>
      Cc: Ying Xue <ying.xue@windriver.com>
      Signed-off-by: default avatarCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      e41f0548
    • Eric Dumazet's avatar
      sctp: sctp_sockaddr_af must check minimal addr length for AF_INET6 · 81e98370
      Eric Dumazet authored
      Check must happen before call to ipv6_addr_v4mapped()
      
      syzbot report was :
      
      BUG: KMSAN: uninit-value in sctp_sockaddr_af net/sctp/socket.c:359 [inline]
      BUG: KMSAN: uninit-value in sctp_do_bind+0x60f/0xdc0 net/sctp/socket.c:384
      CPU: 0 PID: 3576 Comm: syzkaller968804 Not tainted 4.16.0+ #82
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:17 [inline]
       dump_stack+0x185/0x1d0 lib/dump_stack.c:53
       kmsan_report+0x142/0x240 mm/kmsan/kmsan.c:1067
       __msan_warning_32+0x6c/0xb0 mm/kmsan/kmsan_instr.c:676
       sctp_sockaddr_af net/sctp/socket.c:359 [inline]
       sctp_do_bind+0x60f/0xdc0 net/sctp/socket.c:384
       sctp_bind+0x149/0x190 net/sctp/socket.c:332
       inet6_bind+0x1fd/0x1820 net/ipv6/af_inet6.c:293
       SYSC_bind+0x3f2/0x4b0 net/socket.c:1474
       SyS_bind+0x54/0x80 net/socket.c:1460
       do_syscall_64+0x309/0x430 arch/x86/entry/common.c:287
       entry_SYSCALL_64_after_hwframe+0x3d/0xa2
      RIP: 0033:0x43fd49
      RSP: 002b:00007ffe99df3d28 EFLAGS: 00000213 ORIG_RAX: 0000000000000031
      RAX: ffffffffffffffda RBX: 00000000004002c8 RCX: 000000000043fd49
      RDX: 0000000000000010 RSI: 0000000020000000 RDI: 0000000000000003
      RBP: 00000000006ca018 R08: 00000000004002c8 R09: 00000000004002c8
      R10: 00000000004002c8 R11: 0000000000000213 R12: 0000000000401670
      R13: 0000000000401700 R14: 0000000000000000 R15: 0000000000000000
      
      Local variable description: ----address@SYSC_bind
      Variable was created at:
       SYSC_bind+0x6f/0x4b0 net/socket.c:1461
       SyS_bind+0x54/0x80 net/socket.c:1460
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Cc: Vlad Yasevich <vyasevich@gmail.com>
      Cc: Neil Horman <nhorman@tuxdriver.com>
      Reported-by: default avatarsyzbot <syzkaller@googlegroups.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      81e98370
    • Andrew Lunn's avatar
      net: dsa: Discard frames from unused ports · fc5f3376
      Andrew Lunn authored
      The Marvell switches under some conditions will pass a frame to the
      host with the port being the CPU port. Such frames are invalid, and
      should be dropped. Not dropping them can result in a crash when
      incrementing the receive statistics for an invalid port.
      Reported-by: default avatarChris Healy <cphealy@gmail.com>
      Fixes: 91da11f8 ("net: Distributed Switch Architecture protocol support")
      Signed-off-by: default avatarAndrew Lunn <andrew@lunn.ch>
      Reviewed-by: default avatarFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      fc5f3376
    • Eric Dumazet's avatar
      sctp: do not leak kernel memory to user space · 6780db24
      Eric Dumazet authored
      syzbot produced a nice report [1]
      
      Issue here is that a recvmmsg() managed to leak 8 bytes of kernel memory
      to user space, because sin_zero (padding field) was not properly cleared.
      
      [1]
      BUG: KMSAN: uninit-value in copy_to_user include/linux/uaccess.h:184 [inline]
      BUG: KMSAN: uninit-value in move_addr_to_user+0x32e/0x530 net/socket.c:227
      CPU: 1 PID: 3586 Comm: syzkaller481044 Not tainted 4.16.0+ #82
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:17 [inline]
       dump_stack+0x185/0x1d0 lib/dump_stack.c:53
       kmsan_report+0x142/0x240 mm/kmsan/kmsan.c:1067
       kmsan_internal_check_memory+0x164/0x1d0 mm/kmsan/kmsan.c:1176
       kmsan_copy_to_user+0x69/0x160 mm/kmsan/kmsan.c:1199
       copy_to_user include/linux/uaccess.h:184 [inline]
       move_addr_to_user+0x32e/0x530 net/socket.c:227
       ___sys_recvmsg+0x4e2/0x810 net/socket.c:2211
       __sys_recvmmsg+0x54e/0xdb0 net/socket.c:2313
       SYSC_recvmmsg+0x29b/0x3e0 net/socket.c:2394
       SyS_recvmmsg+0x76/0xa0 net/socket.c:2378
       do_syscall_64+0x309/0x430 arch/x86/entry/common.c:287
       entry_SYSCALL_64_after_hwframe+0x3d/0xa2
      RIP: 0033:0x4401c9
      RSP: 002b:00007ffc56f73098 EFLAGS: 00000217 ORIG_RAX: 000000000000012b
      RAX: ffffffffffffffda RBX: 00000000004002c8 RCX: 00000000004401c9
      RDX: 0000000000000001 RSI: 0000000020003ac0 RDI: 0000000000000003
      RBP: 00000000006ca018 R08: 0000000020003bc0 R09: 0000000000000010
      R10: 0000000000000000 R11: 0000000000000217 R12: 0000000000401af0
      R13: 0000000000401b80 R14: 0000000000000000 R15: 0000000000000000
      
      Local variable description: ----addr@___sys_recvmsg
      Variable was created at:
       ___sys_recvmsg+0xd5/0x810 net/socket.c:2172
       __sys_recvmmsg+0x54e/0xdb0 net/socket.c:2313
      
      Bytes 8-15 of 16 are uninitialized
      
      ==================================================================
      Kernel panic - not syncing: panic_on_warn set ...
      
      CPU: 1 PID: 3586 Comm: syzkaller481044 Tainted: G    B            4.16.0+ #82
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:17 [inline]
       dump_stack+0x185/0x1d0 lib/dump_stack.c:53
       panic+0x39d/0x940 kernel/panic.c:183
       kmsan_report+0x238/0x240 mm/kmsan/kmsan.c:1083
       kmsan_internal_check_memory+0x164/0x1d0 mm/kmsan/kmsan.c:1176
       kmsan_copy_to_user+0x69/0x160 mm/kmsan/kmsan.c:1199
       copy_to_user include/linux/uaccess.h:184 [inline]
       move_addr_to_user+0x32e/0x530 net/socket.c:227
       ___sys_recvmsg+0x4e2/0x810 net/socket.c:2211
       __sys_recvmmsg+0x54e/0xdb0 net/socket.c:2313
       SYSC_recvmmsg+0x29b/0x3e0 net/socket.c:2394
       SyS_recvmmsg+0x76/0xa0 net/socket.c:2378
       do_syscall_64+0x309/0x430 arch/x86/entry/common.c:287
       entry_SYSCALL_64_after_hwframe+0x3d/0xa2
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Cc:	Vlad Yasevich <vyasevich@gmail.com>
      Cc:	Neil Horman <nhorman@tuxdriver.com>
      Reported-by: default avatarsyzbot <syzkaller@googlegroups.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      6780db24
    • David S. Miller's avatar
      Merge branch 'net-fix-uninit-values-in-networking-stack' · ccb48e83
      David S. Miller authored
      Eric Dumazet says:
      
      ====================
      net: fix uninit-values in networking stack
      
      It seems syzbot got new features enabled, and fired some interesting
      reports. Oh well.
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      ccb48e83
    • Eric Dumazet's avatar
      soreuseport: initialise timewait reuseport field · 3099a529
      Eric Dumazet authored
      syzbot reported an uninit-value in inet_csk_bind_conflict() [1]
      
      It turns out we never propagated sk->sk_reuseport into timewait socket.
      
      [1]
      BUG: KMSAN: uninit-value in inet_csk_bind_conflict+0x5f9/0x990 net/ipv4/inet_connection_sock.c:151
      CPU: 1 PID: 3589 Comm: syzkaller008242 Not tainted 4.16.0+ #82
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:17 [inline]
       dump_stack+0x185/0x1d0 lib/dump_stack.c:53
       kmsan_report+0x142/0x240 mm/kmsan/kmsan.c:1067
       __msan_warning_32+0x6c/0xb0 mm/kmsan/kmsan_instr.c:676
       inet_csk_bind_conflict+0x5f9/0x990 net/ipv4/inet_connection_sock.c:151
       inet_csk_get_port+0x1d28/0x1e40 net/ipv4/inet_connection_sock.c:320
       inet6_bind+0x121c/0x1820 net/ipv6/af_inet6.c:399
       SYSC_bind+0x3f2/0x4b0 net/socket.c:1474
       SyS_bind+0x54/0x80 net/socket.c:1460
       do_syscall_64+0x309/0x430 arch/x86/entry/common.c:287
       entry_SYSCALL_64_after_hwframe+0x3d/0xa2
      RIP: 0033:0x4416e9
      RSP: 002b:00007ffce6d15c88 EFLAGS: 00000217 ORIG_RAX: 0000000000000031
      RAX: ffffffffffffffda RBX: 0100000000000000 RCX: 00000000004416e9
      RDX: 000000000000001c RSI: 0000000020402000 RDI: 0000000000000004
      RBP: 0000000000000000 R08: 00000000e6d15e08 R09: 00000000e6d15e08
      R10: 0000000000000004 R11: 0000000000000217 R12: 0000000000009478
      R13: 00000000006cd448 R14: 0000000000000000 R15: 0000000000000000
      
      Uninit was stored to memory at:
       kmsan_save_stack_with_flags mm/kmsan/kmsan.c:278 [inline]
       kmsan_save_stack mm/kmsan/kmsan.c:293 [inline]
       kmsan_internal_chain_origin+0x12b/0x210 mm/kmsan/kmsan.c:684
       __msan_chain_origin+0x69/0xc0 mm/kmsan/kmsan_instr.c:521
       tcp_time_wait+0xf17/0xf50 net/ipv4/tcp_minisocks.c:283
       tcp_rcv_state_process+0xebe/0x6490 net/ipv4/tcp_input.c:6003
       tcp_v6_do_rcv+0x11dd/0x1d90 net/ipv6/tcp_ipv6.c:1331
       sk_backlog_rcv include/net/sock.h:908 [inline]
       __release_sock+0x2d6/0x680 net/core/sock.c:2271
       release_sock+0x97/0x2a0 net/core/sock.c:2786
       tcp_close+0x277/0x18f0 net/ipv4/tcp.c:2269
       inet_release+0x240/0x2a0 net/ipv4/af_inet.c:427
       inet6_release+0xaf/0x100 net/ipv6/af_inet6.c:435
       sock_release net/socket.c:595 [inline]
       sock_close+0xe0/0x300 net/socket.c:1149
       __fput+0x49e/0xa10 fs/file_table.c:209
       ____fput+0x37/0x40 fs/file_table.c:243
       task_work_run+0x243/0x2c0 kernel/task_work.c:113
       exit_task_work include/linux/task_work.h:22 [inline]
       do_exit+0x10e1/0x38d0 kernel/exit.c:867
       do_group_exit+0x1a0/0x360 kernel/exit.c:970
       SYSC_exit_group+0x21/0x30 kernel/exit.c:981
       SyS_exit_group+0x25/0x30 kernel/exit.c:979
       do_syscall_64+0x309/0x430 arch/x86/entry/common.c:287
       entry_SYSCALL_64_after_hwframe+0x3d/0xa2
      Uninit was stored to memory at:
       kmsan_save_stack_with_flags mm/kmsan/kmsan.c:278 [inline]
       kmsan_save_stack mm/kmsan/kmsan.c:293 [inline]
       kmsan_internal_chain_origin+0x12b/0x210 mm/kmsan/kmsan.c:684
       __msan_chain_origin+0x69/0xc0 mm/kmsan/kmsan_instr.c:521
       inet_twsk_alloc+0xaef/0xc00 net/ipv4/inet_timewait_sock.c:182
       tcp_time_wait+0xd9/0xf50 net/ipv4/tcp_minisocks.c:258
       tcp_rcv_state_process+0xebe/0x6490 net/ipv4/tcp_input.c:6003
       tcp_v6_do_rcv+0x11dd/0x1d90 net/ipv6/tcp_ipv6.c:1331
       sk_backlog_rcv include/net/sock.h:908 [inline]
       __release_sock+0x2d6/0x680 net/core/sock.c:2271
       release_sock+0x97/0x2a0 net/core/sock.c:2786
       tcp_close+0x277/0x18f0 net/ipv4/tcp.c:2269
       inet_release+0x240/0x2a0 net/ipv4/af_inet.c:427
       inet6_release+0xaf/0x100 net/ipv6/af_inet6.c:435
       sock_release net/socket.c:595 [inline]
       sock_close+0xe0/0x300 net/socket.c:1149
       __fput+0x49e/0xa10 fs/file_table.c:209
       ____fput+0x37/0x40 fs/file_table.c:243
       task_work_run+0x243/0x2c0 kernel/task_work.c:113
       exit_task_work include/linux/task_work.h:22 [inline]
       do_exit+0x10e1/0x38d0 kernel/exit.c:867
       do_group_exit+0x1a0/0x360 kernel/exit.c:970
       SYSC_exit_group+0x21/0x30 kernel/exit.c:981
       SyS_exit_group+0x25/0x30 kernel/exit.c:979
       do_syscall_64+0x309/0x430 arch/x86/entry/common.c:287
       entry_SYSCALL_64_after_hwframe+0x3d/0xa2
      Uninit was created at:
       kmsan_save_stack_with_flags mm/kmsan/kmsan.c:278 [inline]
       kmsan_internal_poison_shadow+0xb8/0x1b0 mm/kmsan/kmsan.c:188
       kmsan_kmalloc+0x94/0x100 mm/kmsan/kmsan.c:314
       kmem_cache_alloc+0xaab/0xb90 mm/slub.c:2756
       inet_twsk_alloc+0x13b/0xc00 net/ipv4/inet_timewait_sock.c:163
       tcp_time_wait+0xd9/0xf50 net/ipv4/tcp_minisocks.c:258
       tcp_rcv_state_process+0xebe/0x6490 net/ipv4/tcp_input.c:6003
       tcp_v6_do_rcv+0x11dd/0x1d90 net/ipv6/tcp_ipv6.c:1331
       sk_backlog_rcv include/net/sock.h:908 [inline]
       __release_sock+0x2d6/0x680 net/core/sock.c:2271
       release_sock+0x97/0x2a0 net/core/sock.c:2786
       tcp_close+0x277/0x18f0 net/ipv4/tcp.c:2269
       inet_release+0x240/0x2a0 net/ipv4/af_inet.c:427
       inet6_release+0xaf/0x100 net/ipv6/af_inet6.c:435
       sock_release net/socket.c:595 [inline]
       sock_close+0xe0/0x300 net/socket.c:1149
       __fput+0x49e/0xa10 fs/file_table.c:209
       ____fput+0x37/0x40 fs/file_table.c:243
       task_work_run+0x243/0x2c0 kernel/task_work.c:113
       exit_task_work include/linux/task_work.h:22 [inline]
       do_exit+0x10e1/0x38d0 kernel/exit.c:867
       do_group_exit+0x1a0/0x360 kernel/exit.c:970
       SYSC_exit_group+0x21/0x30 kernel/exit.c:981
       SyS_exit_group+0x25/0x30 kernel/exit.c:979
       do_syscall_64+0x309/0x430 arch/x86/entry/common.c:287
       entry_SYSCALL_64_after_hwframe+0x3d/0xa2
      
      Fixes: da5e3630 ("soreuseport: TCP/IPv4 implementation")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarsyzbot <syzkaller@googlegroups.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      3099a529
    • Eric Dumazet's avatar
      ipv4: fix uninit-value in ip_route_output_key_hash_rcu() · d0ea2b12
      Eric Dumazet authored
      syzbot complained that res.type could be used while not initialized.
      
      Using RTN_UNSPEC as initial value seems better than using garbage.
      
      BUG: KMSAN: uninit-value in __mkroute_output net/ipv4/route.c:2200 [inline]
      BUG: KMSAN: uninit-value in ip_route_output_key_hash_rcu+0x31f0/0x3940 net/ipv4/route.c:2493
      CPU: 1 PID: 12207 Comm: syz-executor0 Not tainted 4.16.0+ #81
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:17 [inline]
       dump_stack+0x185/0x1d0 lib/dump_stack.c:53
       kmsan_report+0x142/0x240 mm/kmsan/kmsan.c:1067
       __msan_warning_32+0x6c/0xb0 mm/kmsan/kmsan_instr.c:676
       __mkroute_output net/ipv4/route.c:2200 [inline]
       ip_route_output_key_hash_rcu+0x31f0/0x3940 net/ipv4/route.c:2493
       ip_route_output_key_hash net/ipv4/route.c:2322 [inline]
       __ip_route_output_key include/net/route.h:126 [inline]
       ip_route_output_flow+0x1eb/0x3c0 net/ipv4/route.c:2577
       raw_sendmsg+0x1861/0x3ed0 net/ipv4/raw.c:653
       inet_sendmsg+0x48d/0x740 net/ipv4/af_inet.c:764
       sock_sendmsg_nosec net/socket.c:630 [inline]
       sock_sendmsg net/socket.c:640 [inline]
       SYSC_sendto+0x6c3/0x7e0 net/socket.c:1747
       SyS_sendto+0x8a/0xb0 net/socket.c:1715
       do_syscall_64+0x309/0x430 arch/x86/entry/common.c:287
       entry_SYSCALL_64_after_hwframe+0x3d/0xa2
      RIP: 0033:0x455259
      RSP: 002b:00007fdc0625dc68 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
      RAX: ffffffffffffffda RBX: 00007fdc0625e6d4 RCX: 0000000000455259
      RDX: 0000000000000000 RSI: 0000000020000040 RDI: 0000000000000013
      RBP: 000000000072bea0 R08: 0000000020000080 R09: 0000000000000010
      R10: 0000000000000000 R11: 0000000000000246 R12: 00000000ffffffff
      R13: 00000000000004f7 R14: 00000000006fa7c8 R15: 0000000000000000
      
      Local variable description: ----res.i.i@ip_route_output_flow
      Variable was created at:
       ip_route_output_flow+0x75/0x3c0 net/ipv4/route.c:2576
       raw_sendmsg+0x1861/0x3ed0 net/ipv4/raw.c:653
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      d0ea2b12
    • Eric Dumazet's avatar
      dccp: initialize ireq->ir_mark · b855ff82
      Eric Dumazet authored
      syzbot reported an uninit-value read of skb->mark in iptable_mangle_hook()
      
      Thanks to the nice report, I tracked the problem to dccp not caring
      of ireq->ir_mark for passive sessions.
      
      BUG: KMSAN: uninit-value in ipt_mangle_out net/ipv4/netfilter/iptable_mangle.c:66 [inline]
      BUG: KMSAN: uninit-value in iptable_mangle_hook+0x5e5/0x720 net/ipv4/netfilter/iptable_mangle.c:84
      CPU: 0 PID: 5300 Comm: syz-executor3 Not tainted 4.16.0+ #81
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:17 [inline]
       dump_stack+0x185/0x1d0 lib/dump_stack.c:53
       kmsan_report+0x142/0x240 mm/kmsan/kmsan.c:1067
       __msan_warning_32+0x6c/0xb0 mm/kmsan/kmsan_instr.c:676
       ipt_mangle_out net/ipv4/netfilter/iptable_mangle.c:66 [inline]
       iptable_mangle_hook+0x5e5/0x720 net/ipv4/netfilter/iptable_mangle.c:84
       nf_hook_entry_hookfn include/linux/netfilter.h:120 [inline]
       nf_hook_slow+0x158/0x3d0 net/netfilter/core.c:483
       nf_hook include/linux/netfilter.h:243 [inline]
       __ip_local_out net/ipv4/ip_output.c:113 [inline]
       ip_local_out net/ipv4/ip_output.c:122 [inline]
       ip_queue_xmit+0x1d21/0x21c0 net/ipv4/ip_output.c:504
       dccp_transmit_skb+0x15eb/0x1900 net/dccp/output.c:142
       dccp_xmit_packet+0x814/0x9e0 net/dccp/output.c:281
       dccp_write_xmit+0x20f/0x480 net/dccp/output.c:363
       dccp_sendmsg+0x12ca/0x12d0 net/dccp/proto.c:818
       inet_sendmsg+0x48d/0x740 net/ipv4/af_inet.c:764
       sock_sendmsg_nosec net/socket.c:630 [inline]
       sock_sendmsg net/socket.c:640 [inline]
       ___sys_sendmsg+0xec0/0x1310 net/socket.c:2046
       __sys_sendmsg net/socket.c:2080 [inline]
       SYSC_sendmsg+0x2a3/0x3d0 net/socket.c:2091
       SyS_sendmsg+0x54/0x80 net/socket.c:2087
       do_syscall_64+0x309/0x430 arch/x86/entry/common.c:287
       entry_SYSCALL_64_after_hwframe+0x3d/0xa2
      RIP: 0033:0x455259
      RSP: 002b:00007f1a4473dc68 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
      RAX: ffffffffffffffda RBX: 00007f1a4473e6d4 RCX: 0000000000455259
      RDX: 0000000000000000 RSI: 0000000020b76fc8 RDI: 0000000000000015
      RBP: 000000000072bea0 R08: 0000000000000000 R09: 0000000000000000
      R10: 0000000000000000 R11: 0000000000000246 R12: 00000000ffffffff
      R13: 00000000000004f0 R14: 00000000006fa720 R15: 0000000000000000
      
      Uninit was stored to memory at:
       kmsan_save_stack_with_flags mm/kmsan/kmsan.c:278 [inline]
       kmsan_save_stack mm/kmsan/kmsan.c:293 [inline]
       kmsan_internal_chain_origin+0x12b/0x210 mm/kmsan/kmsan.c:684
       __msan_chain_origin+0x69/0xc0 mm/kmsan/kmsan_instr.c:521
       ip_queue_xmit+0x1e35/0x21c0 net/ipv4/ip_output.c:502
       dccp_transmit_skb+0x15eb/0x1900 net/dccp/output.c:142
       dccp_xmit_packet+0x814/0x9e0 net/dccp/output.c:281
       dccp_write_xmit+0x20f/0x480 net/dccp/output.c:363
       dccp_sendmsg+0x12ca/0x12d0 net/dccp/proto.c:818
       inet_sendmsg+0x48d/0x740 net/ipv4/af_inet.c:764
       sock_sendmsg_nosec net/socket.c:630 [inline]
       sock_sendmsg net/socket.c:640 [inline]
       ___sys_sendmsg+0xec0/0x1310 net/socket.c:2046
       __sys_sendmsg net/socket.c:2080 [inline]
       SYSC_sendmsg+0x2a3/0x3d0 net/socket.c:2091
       SyS_sendmsg+0x54/0x80 net/socket.c:2087
       do_syscall_64+0x309/0x430 arch/x86/entry/common.c:287
       entry_SYSCALL_64_after_hwframe+0x3d/0xa2
      Uninit was stored to memory at:
       kmsan_save_stack_with_flags mm/kmsan/kmsan.c:278 [inline]
       kmsan_save_stack mm/kmsan/kmsan.c:293 [inline]
       kmsan_internal_chain_origin+0x12b/0x210 mm/kmsan/kmsan.c:684
       __msan_chain_origin+0x69/0xc0 mm/kmsan/kmsan_instr.c:521
       inet_csk_clone_lock+0x503/0x580 net/ipv4/inet_connection_sock.c:797
       dccp_create_openreq_child+0x7f/0x890 net/dccp/minisocks.c:92
       dccp_v4_request_recv_sock+0x22c/0xe90 net/dccp/ipv4.c:408
       dccp_v6_request_recv_sock+0x290/0x2000 net/dccp/ipv6.c:414
       dccp_check_req+0x7b9/0x8f0 net/dccp/minisocks.c:197
       dccp_v4_rcv+0x12e4/0x2630 net/dccp/ipv4.c:840
       ip_local_deliver_finish+0x6ed/0xd40 net/ipv4/ip_input.c:216
       NF_HOOK include/linux/netfilter.h:288 [inline]
       ip_local_deliver+0x43c/0x4e0 net/ipv4/ip_input.c:257
       dst_input include/net/dst.h:449 [inline]
       ip_rcv_finish+0x1253/0x16d0 net/ipv4/ip_input.c:397
       NF_HOOK include/linux/netfilter.h:288 [inline]
       ip_rcv+0x119d/0x16f0 net/ipv4/ip_input.c:493
       __netif_receive_skb_core+0x47cf/0x4a80 net/core/dev.c:4562
       __netif_receive_skb net/core/dev.c:4627 [inline]
       process_backlog+0x62d/0xe20 net/core/dev.c:5307
       napi_poll net/core/dev.c:5705 [inline]
       net_rx_action+0x7c1/0x1a70 net/core/dev.c:5771
       __do_softirq+0x56d/0x93d kernel/softirq.c:285
      Uninit was created at:
       kmsan_save_stack_with_flags mm/kmsan/kmsan.c:278 [inline]
       kmsan_internal_poison_shadow+0xb8/0x1b0 mm/kmsan/kmsan.c:188
       kmsan_kmalloc+0x94/0x100 mm/kmsan/kmsan.c:314
       kmem_cache_alloc+0xaab/0xb90 mm/slub.c:2756
       reqsk_alloc include/net/request_sock.h:88 [inline]
       inet_reqsk_alloc+0xc4/0x7f0 net/ipv4/tcp_input.c:6145
       dccp_v4_conn_request+0x5cc/0x1770 net/dccp/ipv4.c:600
       dccp_v6_conn_request+0x299/0x1880 net/dccp/ipv6.c:317
       dccp_rcv_state_process+0x2ea/0x2410 net/dccp/input.c:612
       dccp_v4_do_rcv+0x229/0x340 net/dccp/ipv4.c:682
       dccp_v6_do_rcv+0x16d/0x1220 net/dccp/ipv6.c:578
       sk_backlog_rcv include/net/sock.h:908 [inline]
       __sk_receive_skb+0x60e/0xf20 net/core/sock.c:513
       dccp_v4_rcv+0x24d4/0x2630 net/dccp/ipv4.c:874
       ip_local_deliver_finish+0x6ed/0xd40 net/ipv4/ip_input.c:216
       NF_HOOK include/linux/netfilter.h:288 [inline]
       ip_local_deliver+0x43c/0x4e0 net/ipv4/ip_input.c:257
       dst_input include/net/dst.h:449 [inline]
       ip_rcv_finish+0x1253/0x16d0 net/ipv4/ip_input.c:397
       NF_HOOK include/linux/netfilter.h:288 [inline]
       ip_rcv+0x119d/0x16f0 net/ipv4/ip_input.c:493
       __netif_receive_skb_core+0x47cf/0x4a80 net/core/dev.c:4562
       __netif_receive_skb net/core/dev.c:4627 [inline]
       process_backlog+0x62d/0xe20 net/core/dev.c:5307
       napi_poll net/core/dev.c:5705 [inline]
       net_rx_action+0x7c1/0x1a70 net/core/dev.c:5771
       __do_softirq+0x56d/0x93d kernel/softirq.c:285
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarsyzbot <syzkaller@googlegroups.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      b855ff82
    • Eric Dumazet's avatar
      net: fix uninit-value in __hw_addr_add_ex() · 77d36398
      Eric Dumazet authored
      syzbot complained :
      
      BUG: KMSAN: uninit-value in memcmp+0x119/0x180 lib/string.c:861
      CPU: 0 PID: 3 Comm: kworker/0:0 Not tainted 4.16.0+ #82
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Workqueue: ipv6_addrconf addrconf_dad_work
      Call Trace:
       __dump_stack lib/dump_stack.c:17 [inline]
       dump_stack+0x185/0x1d0 lib/dump_stack.c:53
       kmsan_report+0x142/0x240 mm/kmsan/kmsan.c:1067
       __msan_warning_32+0x6c/0xb0 mm/kmsan/kmsan_instr.c:676
       memcmp+0x119/0x180 lib/string.c:861
       __hw_addr_add_ex net/core/dev_addr_lists.c:60 [inline]
       __dev_mc_add+0x1c2/0x8e0 net/core/dev_addr_lists.c:670
       dev_mc_add+0x6d/0x80 net/core/dev_addr_lists.c:687
       igmp6_group_added+0x2db/0xa00 net/ipv6/mcast.c:662
       ipv6_dev_mc_inc+0xe9e/0x1130 net/ipv6/mcast.c:914
       addrconf_join_solict net/ipv6/addrconf.c:2078 [inline]
       addrconf_dad_begin net/ipv6/addrconf.c:3828 [inline]
       addrconf_dad_work+0x427/0x2150 net/ipv6/addrconf.c:3954
       process_one_work+0x12c6/0x1f60 kernel/workqueue.c:2113
       worker_thread+0x113c/0x24f0 kernel/workqueue.c:2247
       kthread+0x539/0x720 kernel/kthread.c:239
      
      Fixes: f001fde5 ("net: introduce a list of device addresses dev_addr_list (v6)")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarsyzbot <syzkaller@googlegroups.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      77d36398
    • Eric Dumazet's avatar
      net: initialize skb->peeked when cloning · b13dda9f
      Eric Dumazet authored
      syzbot reported __skb_try_recv_from_queue() was using skb->peeked
      while it was potentially unitialized.
      
      We need to clear it in __skb_clone()
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarsyzbot <syzkaller@googlegroups.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      b13dda9f
    • Eric Dumazet's avatar
      net: fix rtnh_ok() · b1993a2d
      Eric Dumazet authored
      syzbot reported :
      
      BUG: KMSAN: uninit-value in rtnh_ok include/net/nexthop.h:11 [inline]
      BUG: KMSAN: uninit-value in fib_count_nexthops net/ipv4/fib_semantics.c:469 [inline]
      BUG: KMSAN: uninit-value in fib_create_info+0x554/0x8d20 net/ipv4/fib_semantics.c:1091
      
      @remaining is an integer, coming from user space.
      If it is negative we want rtnh_ok() to return false.
      
      Fixes: 4e902c57 ("[IPv4]: FIB configuration using struct fib_config")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarsyzbot <syzkaller@googlegroups.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      b1993a2d
    • Eric Dumazet's avatar
      netlink: fix uninit-value in netlink_sendmsg · 6091f09c
      Eric Dumazet authored
      syzbot reported :
      
      BUG: KMSAN: uninit-value in ffs arch/x86/include/asm/bitops.h:432 [inline]
      BUG: KMSAN: uninit-value in netlink_sendmsg+0xb26/0x1310 net/netlink/af_netlink.c:1851
      
      Fixes: 1da177e4 ("Linux-2.6.12-rc2")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarsyzbot <syzkaller@googlegroups.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      6091f09c
    • Eric Dumazet's avatar
      crypto: af_alg - fix possible uninit-value in alg_bind() · a466856e
      Eric Dumazet authored
      syzbot reported :
      
      BUG: KMSAN: uninit-value in alg_bind+0xe3/0xd90 crypto/af_alg.c:162
      
      We need to check addr_len before dereferencing sa (or uaddr)
      
      Fixes: bb30b884 ("crypto: af_alg - whitelist mask and type")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarsyzbot <syzkaller@googlegroups.com>
      Cc: Stephan Mueller <smueller@chronox.de>
      Cc: Herbert Xu <herbert@gondor.apana.org.au>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      a466856e
  5. 07 Apr, 2018 1 commit
  6. 06 Apr, 2018 5 commits
    • Esben Haabendal's avatar
      net: phy: marvell: Enable interrupt function on LED2 pin · dd9a122a
      Esben Haabendal authored
      The LED2[2]/INTn pin on Marvell 88E1318S as well as 88E1510/12/14/18 needs
      to be configured to be usable as interrupt not only when WOL is enabled,
      but whenever we rely on interrupts from the PHY.
      Signed-off-by: default avatarEsben Haabendal <eha@deif.com>
      Cc: Rasmus Villemoes <rasmus.villemoes@prevas.dk>
      Reviewed-by: default avatarAndrew Lunn <andrew@lunn.ch>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      dd9a122a
    • David S. Miller's avatar
      Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-queue · eb192480
      David S. Miller authored
      Jeff Kirsher says:
      
      ====================
      Intel Wired LAN Driver Updates 2018-04-06
      
      This series contains a couple of fixes for the new ice driver.
      
      Wei Yongjun fixes the return error code for error case during init.
      
      Anirudh fixes the incorrect use of ARRAY_SIZE() in the ice ethtool code
      and fixed "for" loop calculations.
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      eb192480
    • Anirudh Venkataramanan's avatar
      ice: Bug fixes in ethtool code · cba5957d
      Anirudh Venkataramanan authored
      1) Return correct size from ice_get_regs_len.
      2) Fix incorrect use of ARRAY_SIZE in ice_get_regs.
      
      Fixes: fcea6f3d (ice: Add stats and ethtool support)
      Signed-off-by: default avatarAnirudh Venkataramanan <anirudh.venkataramanan@intel.com>
      Tested-by: default avatarTony Brelinski <tonyx.brelinski@intel.com>
      Signed-off-by: default avatarJeff Kirsher <jeffrey.t.kirsher@intel.com>
      cba5957d
    • Wei Yongjun's avatar
      ice: Fix error return code in ice_init_hw() · 63bb4e1e
      Wei Yongjun authored
      Fix to return error code ICE_ERR_NO_MEMORY from the alloc error
      handling case instead of 0, as done elsewhere in this function.
      
      Fixes: dc49c772 ("ice: Get MAC/PHY/link info and scheduler topology")
      Signed-off-by: default avatarWei Yongjun <weiyongjun1@huawei.com>
      Acked-by: default avatarAnirudh Venkataramanan <anirudh.venkataramanan@intel.com>
      Tested-by: default avatarTony Brelinski <tonyx.brelinski@intel.com>
      Signed-off-by: default avatarJeff Kirsher <jeffrey.t.kirsher@intel.com>
      63bb4e1e
    • Davide Caratti's avatar
      net/sched: fix NULL dereference in the error path of tcf_bpf_init() · 3239534a
      Davide Caratti authored
      when tcf_bpf_init_from_ops() fails (e.g. because of program having invalid
      number of instructions), tcf_bpf_cfg_cleanup() calls bpf_prog_put(NULL) or
      bpf_prog_destroy(NULL). Unless CONFIG_BPF_SYSCALL is unset, this causes
      the following error:
      
       BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
       PGD 800000007345a067 P4D 800000007345a067 PUD 340e1067 PMD 0
       Oops: 0000 [#1] SMP PTI
       Modules linked in: act_bpf(E) ip6table_filter ip6_tables iptable_filter binfmt_misc ext4 mbcache jbd2 crct10dif_pclmul crc32_pclmul ghash_clmulni_intel snd_hda_codec_generic pcbc snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_seq snd_seq_device snd_pcm aesni_intel crypto_simd glue_helper cryptd joydev snd_timer snd virtio_balloon pcspkr soundcore i2c_piix4 nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c ata_generic pata_acpi qxl drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm virtio_blk drm virtio_net virtio_console i2c_core crc32c_intel serio_raw virtio_pci ata_piix libata virtio_ring floppy virtio dm_mirror dm_region_hash dm_log dm_mod [last unloaded: act_bpf]
       CPU: 3 PID: 5654 Comm: tc Tainted: G            E    4.16.0.bpf_test+ #408
       Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
       RIP: 0010:__bpf_prog_put+0xc/0xc0
       RSP: 0018:ffff9594003ef728 EFLAGS: 00010202
       RAX: 0000000000000000 RBX: ffff9594003ef758 RCX: 0000000000000024
       RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000000
       RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000044
       R10: 0000000000000220 R11: ffff8a7ab9f17131 R12: 0000000000000000
       R13: ffff8a7ab7c3c8e0 R14: 0000000000000001 R15: ffff8a7ab88f1054
       FS:  00007fcb2f17c740(0000) GS:ffff8a7abfd80000(0000) knlGS:0000000000000000
       CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
       CR2: 0000000000000020 CR3: 000000007c888006 CR4: 00000000001606e0
       Call Trace:
        tcf_bpf_cfg_cleanup+0x2f/0x40 [act_bpf]
        tcf_bpf_cleanup+0x4c/0x70 [act_bpf]
        __tcf_idr_release+0x79/0x140
        tcf_bpf_init+0x125/0x330 [act_bpf]
        tcf_action_init_1+0x2cc/0x430
        ? get_page_from_freelist+0x3f0/0x11b0
        tcf_action_init+0xd3/0x1b0
        tc_ctl_action+0x18b/0x240
        rtnetlink_rcv_msg+0x29c/0x310
        ? _cond_resched+0x15/0x30
        ? __kmalloc_node_track_caller+0x1b9/0x270
        ? rtnl_calcit.isra.29+0x100/0x100
        netlink_rcv_skb+0xd2/0x110
        netlink_unicast+0x17c/0x230
        netlink_sendmsg+0x2cd/0x3c0
        sock_sendmsg+0x30/0x40
        ___sys_sendmsg+0x27a/0x290
        ? mem_cgroup_commit_charge+0x80/0x130
        ? page_add_new_anon_rmap+0x73/0xc0
        ? do_anonymous_page+0x2a2/0x560
        ? __handle_mm_fault+0xc75/0xe20
        __sys_sendmsg+0x58/0xa0
        do_syscall_64+0x6e/0x1a0
        entry_SYSCALL_64_after_hwframe+0x3d/0xa2
       RIP: 0033:0x7fcb2e58eba0
       RSP: 002b:00007ffc93c496c8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
       RAX: ffffffffffffffda RBX: 00007ffc93c497f0 RCX: 00007fcb2e58eba0
       RDX: 0000000000000000 RSI: 00007ffc93c49740 RDI: 0000000000000003
       RBP: 000000005ac6a646 R08: 0000000000000002 R09: 0000000000000000
       R10: 00007ffc93c49120 R11: 0000000000000246 R12: 0000000000000000
       R13: 00007ffc93c49804 R14: 0000000000000001 R15: 000000000066afa0
       Code: 5f 00 48 8b 43 20 48 c7 c7 70 2f 7c b8 c7 40 10 00 00 00 00 5b e9 a5 8b 61 00 0f 1f 44 00 00 0f 1f 44 00 00 41 54 55 48 89 fd 53 <48> 8b 47 20 f0 ff 08 74 05 5b 5d 41 5c c3 41 89 f4 0f 1f 44 00
       RIP: __bpf_prog_put+0xc/0xc0 RSP: ffff9594003ef728
       CR2: 0000000000000020
      
      Fix it in tcf_bpf_cfg_cleanup(), ensuring that bpf_prog_{put,destroy}(f)
      is called only when f is not NULL.
      
      Fixes: bbc09e78 ("net/sched: fix idr leak on the error path of tcf_bpf_init()")
      Reported-by: default avatarLucas Bates <lucasb@mojatatu.com>
      Signed-off-by: default avatarDavide Caratti <dcaratti@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      3239534a