1. 09 May, 2019 2 commits
  2. 06 May, 2019 3 commits
  3. 05 May, 2019 3 commits
    • Taehee Yoo's avatar
      netfilter: nf_flow_table: fix missing error check for rhashtable_insert_fast · 43c8f131
      Taehee Yoo authored
      rhashtable_insert_fast() may return an error value when memory
      allocation fails, but flow_offload_add() does not check for errors.
      This patch just adds missing error checking.
      
      Fixes: ac2a6666 ("netfilter: add generic flow table infrastructure")
      Signed-off-by: default avatarTaehee Yoo <ap420073@gmail.com>
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      43c8f131
    • Florian Westphal's avatar
      netfilter: nf_tables: fix base chain stat rcu_dereference usage · edbd82c5
      Florian Westphal authored
      Following splat gets triggered when nfnetlink monitor is running while
      xtables-nft selftests are running:
      
      net/netfilter/nf_tables_api.c:1272 suspicious rcu_dereference_check() usage!
      other info that might help us debug this:
      
      1 lock held by xtables-nft-mul/27006:
       #0: 00000000e0f85be9 (&net->nft.commit_mutex){+.+.}, at: nf_tables_valid_genid+0x1a/0x50
      Call Trace:
       nf_tables_fill_chain_info.isra.45+0x6cc/0x6e0
       nf_tables_chain_notify+0xf8/0x1a0
       nf_tables_commit+0x165c/0x1740
      
      nf_tables_fill_chain_info() can be called both from dumps (rcu read locked)
      or from the transaction path if a userspace process subscribed to nftables
      notifications.
      
      In the 'table dump' case, rcu_access_pointer() cannot be used: We do not
      hold transaction mutex so the pointer can be NULLed right after the check.
      Just unconditionally fetch the value, then have the helper return
      immediately if its NULL.
      
      In the notification case we don't hold the rcu read lock, but updates are
      prevented due to transaction mutex. Use rcu_dereference_check() to make lockdep
      aware of this.
      Signed-off-by: default avatarFlorian Westphal <fw@strlen.de>
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      edbd82c5
    • Jakub Jankowski's avatar
      netfilter: nf_conntrack_h323: restore boundary check correctness · f5e85ce8
      Jakub Jankowski authored
      Since commit bc7d811a ("netfilter: nf_ct_h323: Convert
      CHECK_BOUND macro to function"), NAT traversal for H.323
      doesn't work, failing to parse H323-UserInformation.
      nf_h323_error_boundary() compares contents of the bitstring,
      not the addresses, preventing valid H.323 packets from being
      conntrack'd.
      
      This looks like an oversight from when CHECK_BOUND macro was
      converted to a function.
      
      To fix it, stop dereferencing bs->cur and bs->end.
      
      Fixes: bc7d811a ("netfilter: nf_ct_h323: Convert CHECK_BOUND macro to function")
      Signed-off-by: default avatarJakub Jankowski <shasta@toxcorp.com>
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      f5e85ce8
  4. 30 Apr, 2019 7 commits
    • Taehee Yoo's avatar
      netfilter: nf_flow_table: check ttl value in flow offload data path · 33cc3c0c
      Taehee Yoo authored
      nf_flow_offload_ip_hook() and nf_flow_offload_ipv6_hook() do not check
      ttl value. So, ttl value overflow may occur.
      
      Fixes: 97add9f0 ("netfilter: flow table support for IPv4")
      Fixes: 09952107 ("netfilter: flow table support for IPv6")
      Signed-off-by: default avatarTaehee Yoo <ap420073@gmail.com>
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      33cc3c0c
    • Taehee Yoo's avatar
      netfilter: nf_flow_table: fix netdev refcnt leak · 26a302af
      Taehee Yoo authored
      flow_offload_alloc() calls nf_route() to get a dst_entry. Internally,
      nf_route() calls ip_route_output_key() that allocates a dst_entry and
      holds it. So, a dst_entry should be released by dst_release() if
      nf_route() is successful.
      
      Otherwise, netns exit routine cannot be finished and the following
      message is printed:
      
      [  257.490952] unregister_netdevice: waiting for lo to become free. Usage count = 1
      
      Fixes: ac2a6666 ("netfilter: add generic flow table infrastructure")
      Signed-off-by: default avatarTaehee Yoo <ap420073@gmail.com>
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      26a302af
    • Pablo Neira Ayuso's avatar
      netfilter: nft_flow_offload: add entry to flowtable after confirmation · 270a8a29
      Pablo Neira Ayuso authored
      This is fixing flow offload for UDP traffic where packets only follow
      one single direction.
      
      The flow_offload_fixup_tcp() mechanism works fine in case that the
      offloaded entry remains in SYN_RECV state, given sequence tracking is
      reset and that conntrack handles syn+ack packets as a retransmission, ie.
      
      	sES + synack => sIG
      
      for reply traffic.
      
      Fixes: a3c90f7a ("netfilter: nf_tables: flow offload expression")
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      270a8a29
    • Florian Westphal's avatar
      netfilter: nf_tables: delay chain policy update until transaction is complete · 66293c46
      Florian Westphal authored
      When we process a long ruleset of the form
      
      chain input {
         type filter hook input priority filter; policy drop;
         ...
      }
      
      Then the base chain gets registered early on, we then continue to
      process/validate the next messages coming in the same transaction.
      
      Problem is that if the base chain policy is 'drop', it will take effect
      immediately, which causes all traffic to get blocked until the
      transaction completes or is aborted.
      
      Fix this by deferring the policy until the transaction has been
      processed and all of the rules have been flagged as active.
      Reported-by: default avatarJann Haber <jann.haber@selfnet.de>
      Signed-off-by: default avatarFlorian Westphal <fw@strlen.de>
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      66293c46
    • Eric Dumazet's avatar
      ipv6/flowlabel: wait rcu grace period before put_pid() · 6c0afef5
      Eric Dumazet authored
      syzbot was able to catch a use-after-free read in pid_nr_ns() [1]
      
      ip6fl_seq_show() seems to use RCU protection, dereferencing fl->owner.pid
      but fl_free() releases fl->owner.pid before rcu grace period is started.
      
      [1]
      
      BUG: KASAN: use-after-free in pid_nr_ns+0x128/0x140 kernel/pid.c:407
      Read of size 4 at addr ffff888094012a04 by task syz-executor.0/18087
      
      CPU: 0 PID: 18087 Comm: syz-executor.0 Not tainted 5.1.0-rc6+ #89
      Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
      Call Trace:
       __dump_stack lib/dump_stack.c:77 [inline]
       dump_stack+0x172/0x1f0 lib/dump_stack.c:113
       print_address_description.cold+0x7c/0x20d mm/kasan/report.c:187
       kasan_report.cold+0x1b/0x40 mm/kasan/report.c:317
       __asan_report_load4_noabort+0x14/0x20 mm/kasan/generic_report.c:131
       pid_nr_ns+0x128/0x140 kernel/pid.c:407
       ip6fl_seq_show+0x2f8/0x4f0 net/ipv6/ip6_flowlabel.c:794
       seq_read+0xad3/0x1130 fs/seq_file.c:268
       proc_reg_read+0x1fe/0x2c0 fs/proc/inode.c:227
       do_loop_readv_writev fs/read_write.c:701 [inline]
       do_loop_readv_writev fs/read_write.c:688 [inline]
       do_iter_read+0x4a9/0x660 fs/read_write.c:922
       vfs_readv+0xf0/0x160 fs/read_write.c:984
       kernel_readv fs/splice.c:358 [inline]
       default_file_splice_read+0x475/0x890 fs/splice.c:413
       do_splice_to+0x12a/0x190 fs/splice.c:876
       splice_direct_to_actor+0x2d2/0x970 fs/splice.c:953
       do_splice_direct+0x1da/0x2a0 fs/splice.c:1062
       do_sendfile+0x597/0xd00 fs/read_write.c:1443
       __do_sys_sendfile64 fs/read_write.c:1498 [inline]
       __se_sys_sendfile64 fs/read_write.c:1490 [inline]
       __x64_sys_sendfile64+0x15a/0x220 fs/read_write.c:1490
       do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      RIP: 0033:0x458da9
      Code: ad b8 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 7b b8 fb ff c3 66 2e 0f 1f 84 00 00 00 00
      RSP: 002b:00007f300d24bc78 EFLAGS: 00000246 ORIG_RAX: 0000000000000028
      RAX: ffffffffffffffda RBX: 0000000000000004 RCX: 0000000000458da9
      RDX: 00000000200000c0 RSI: 0000000000000008 RDI: 0000000000000007
      RBP: 000000000073bf00 R08: 0000000000000000 R09: 0000000000000000
      R10: 000000000000005a R11: 0000000000000246 R12: 00007f300d24c6d4
      R13: 00000000004c5fa3 R14: 00000000004da748 R15: 00000000ffffffff
      
      Allocated by task 17543:
       save_stack+0x45/0xd0 mm/kasan/common.c:75
       set_track mm/kasan/common.c:87 [inline]
       __kasan_kmalloc mm/kasan/common.c:497 [inline]
       __kasan_kmalloc.constprop.0+0xcf/0xe0 mm/kasan/common.c:470
       kasan_slab_alloc+0xf/0x20 mm/kasan/common.c:505
       slab_post_alloc_hook mm/slab.h:437 [inline]
       slab_alloc mm/slab.c:3393 [inline]
       kmem_cache_alloc+0x11a/0x6f0 mm/slab.c:3555
       alloc_pid+0x55/0x8f0 kernel/pid.c:168
       copy_process.part.0+0x3b08/0x7980 kernel/fork.c:1932
       copy_process kernel/fork.c:1709 [inline]
       _do_fork+0x257/0xfd0 kernel/fork.c:2226
       __do_sys_clone kernel/fork.c:2333 [inline]
       __se_sys_clone kernel/fork.c:2327 [inline]
       __x64_sys_clone+0xbf/0x150 kernel/fork.c:2327
       do_syscall_64+0x103/0x610 arch/x86/entry/common.c:290
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      Freed by task 7789:
       save_stack+0x45/0xd0 mm/kasan/common.c:75
       set_track mm/kasan/common.c:87 [inline]
       __kasan_slab_free+0x102/0x150 mm/kasan/common.c:459
       kasan_slab_free+0xe/0x10 mm/kasan/common.c:467
       __cache_free mm/slab.c:3499 [inline]
       kmem_cache_free+0x86/0x260 mm/slab.c:3765
       put_pid.part.0+0x111/0x150 kernel/pid.c:111
       put_pid+0x20/0x30 kernel/pid.c:105
       fl_free+0xbe/0xe0 net/ipv6/ip6_flowlabel.c:102
       ip6_fl_gc+0x295/0x3e0 net/ipv6/ip6_flowlabel.c:152
       call_timer_fn+0x190/0x720 kernel/time/timer.c:1325
       expire_timers kernel/time/timer.c:1362 [inline]
       __run_timers kernel/time/timer.c:1681 [inline]
       __run_timers kernel/time/timer.c:1649 [inline]
       run_timer_softirq+0x652/0x1700 kernel/time/timer.c:1694
       __do_softirq+0x266/0x95a kernel/softirq.c:293
      
      The buggy address belongs to the object at ffff888094012a00
       which belongs to the cache pid_2 of size 88
      The buggy address is located 4 bytes inside of
       88-byte region [ffff888094012a00, ffff888094012a58)
      The buggy address belongs to the page:
      page:ffffea0002500480 count:1 mapcount:0 mapping:ffff88809a483080 index:0xffff888094012980
      flags: 0x1fffc0000000200(slab)
      raw: 01fffc0000000200 ffffea00018a3508 ffffea0002524a88 ffff88809a483080
      raw: ffff888094012980 ffff888094012000 000000010000001b 0000000000000000
      page dumped because: kasan: bad access detected
      
      Memory state around the buggy address:
       ffff888094012900: fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc fc
       ffff888094012980: fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc fc
      >ffff888094012a00: fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc fc
                         ^
       ffff888094012a80: fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc fc
       ffff888094012b00: fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc fc
      
      Fixes: 4f82f457 ("net ip6 flowlabel: Make owner a union of struct pid * and kuid_t")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Cc: Eric W. Biederman <ebiederm@xmission.com>
      Reported-by: default avatarsyzbot <syzkaller@googlegroups.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      6c0afef5
    • Stephen Suryaputra's avatar
      vrf: Use orig netdev to count Ip6InNoRoutes and a fresh route lookup when sending dest unreach · 1d3fd8a1
      Stephen Suryaputra authored
      When there is no route to an IPv6 dest addr, skb_dst(skb) points
      to loopback dev in the case of that the IP6CB(skb)->iif is
      enslaved to a vrf. This causes Ip6InNoRoutes to be incremented on the
      loopback dev. This also causes the lookup to fail on icmpv6_send() and
      the dest unreachable to not sent and Ip6OutNoRoutes gets incremented on
      the loopback dev.
      
      To reproduce:
      * Gateway configuration:
              ip link add dev vrf_258 type vrf table 258
              ip link set dev enp0s9 master vrf_258
              ip addr add 66:1/64 dev enp0s9
              ip -6 route add unreachable default metric 8192 table 258
              sysctl -w net.ipv6.conf.all.forwarding=1
              sysctl -w net.ipv6.conf.enp0s9.forwarding=1
      * Sender configuration:
              ip addr add 66::2/64 dev enp0s9
              ip -6 route add default via 66::1
      and ping 67::1 for example from the sender.
      
      Fix this by counting on the original netdev and reset the skb dst to
      force a fresh lookup.
      
      v2: Fix typo of destination address in the repro steps.
      v3: Simplify the loopback check (per David Ahern) and use reverse
          Christmas tree format (per David Miller).
      Signed-off-by: default avatarStephen Suryaputra <ssuryaextr@gmail.com>
      Reviewed-by: default avatarDavid Ahern <dsahern@gmail.com>
      Tested-by: default avatarDavid Ahern <dsahern@gmail.com>
      Reviewed-by: default avatarDavid Ahern <dsahern@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      1d3fd8a1
    • Eric Dumazet's avatar
      tcp: add sanity tests in tcp_add_backlog() · ca2fe295
      Eric Dumazet authored
      Richard and Bruno both reported that my commit added a bug,
      and Bruno was able to determine the problem came when a segment
      wih a FIN packet was coalesced to a prior one in tcp backlog queue.
      
      It turns out the header prediction in tcp_rcv_established()
      looks back to TCP headers in the packet, not in the metadata
      (aka TCP_SKB_CB(skb)->tcp_flags)
      
      The fast path in tcp_rcv_established() is not supposed to
      handle a FIN flag (it does not call tcp_fin())
      
      Therefore we need to make sure to propagate the FIN flag,
      so that the coalesced packet does not go through the fast path,
      the same than a GRO packet carrying a FIN flag.
      
      While we are at it, make sure we do not coalesce packets with
      RST or SYN, or if they do not have ACK set.
      
      Many thanks to Richard and Bruno for pinpointing the bad commit,
      and to Richard for providing a first version of the fix.
      
      Fixes: 4f693b55 ("tcp: implement coalescing on backlog queue")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarRichard Purdie <richard.purdie@linuxfoundation.org>
      Reported-by: default avatarBruno Prémont <bonbons@sysophe.eu>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      ca2fe295
  5. 29 Apr, 2019 3 commits
  6. 28 Apr, 2019 4 commits
  7. 27 Apr, 2019 7 commits
  8. 26 Apr, 2019 11 commits
    • Andrew Lunn's avatar
      net: phy: marvell: Fix buffer overrun with stats counters · fdfdf867
      Andrew Lunn authored
      marvell_get_sset_count() returns how many statistics counters there
      are. If the PHY supports fibre, there are 3, otherwise two.
      
      marvell_get_strings() does not make this distinction, and always
      returns 3 strings. This then often results in writing past the end
      of the buffer for the strings.
      
      Fixes: 2170fef7 ("Marvell phy: add field to get errors from fiber link.")
      Signed-off-by: default avatarAndrew Lunn <andrew@lunn.ch>
      Reviewed-by: default avatarFlorian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      fdfdf867
    • Marcel Holtmann's avatar
      genetlink: use idr_alloc_cyclic for family->id assignment · 4e43df38
      Marcel Holtmann authored
      When allocating the next family->id it makes more sense to use
      idr_alloc_cyclic to avoid re-using a previously used family->id as much
      as possible.
      Signed-off-by: default avatarMarcel Holtmann <marcel@holtmann.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      4e43df38
    • Bjørn Mork's avatar
      qmi_wwan: new Wistron, ZTE and D-Link devices · 88ef66a2
      Bjørn Mork authored
      Adding device entries found in vendor modified versions of this
      driver.  Function maps for some of the devices follow:
      
      WNC D16Q1, D16Q5, D18Q1 LTE CAT3 module (1435:0918)
      
      MI_00 Qualcomm HS-USB Diagnostics
      MI_01 Android Debug interface
      MI_02 Qualcomm HS-USB Modem
      MI_03 Qualcomm Wireless HS-USB Ethernet Adapter
      MI_04 Qualcomm Wireless HS-USB Ethernet Adapter
      MI_05 Qualcomm Wireless HS-USB Ethernet Adapter
      MI_06 USB Mass Storage Device
      
       T:  Bus=02 Lev=01 Prnt=01 Port=00 Cnt=01 Dev#=  2 Spd=480 MxCh= 0
       D:  Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs=  1
       P:  Vendor=1435 ProdID=0918 Rev= 2.32
       S:  Manufacturer=Android
       S:  Product=Android
       S:  SerialNumber=0123456789ABCDEF
       C:* #Ifs= 7 Cfg#= 1 Atr=80 MxPwr=500mA
       I:* If#= 0 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=ff Driver=option
       E:  Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
       E:  Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
       I:* If#= 1 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=42 Prot=01 Driver=(none)
       E:  Ad=82(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
       E:  Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
       I:* If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=option
       E:  Ad=84(I) Atr=03(Int.) MxPS=  64 Ivl=32ms
       E:  Ad=83(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
       E:  Ad=03(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
       I:* If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=qmi_wwan
       E:  Ad=86(I) Atr=03(Int.) MxPS=  64 Ivl=32ms
       E:  Ad=85(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
       E:  Ad=04(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
       I:* If#= 4 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=qmi_wwan
       E:  Ad=88(I) Atr=03(Int.) MxPS=  64 Ivl=32ms
       E:  Ad=87(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
       I:* If#= 5 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=qmi_wwan
       E:  Ad=8a(I) Atr=03(Int.) MxPS=  64 Ivl=32ms
       E:  Ad=89(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
       E:  Ad=06(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
      
      WNC D18 LTE CAT3 module (1435:d182)
      
      MI_00 Qualcomm HS-USB Diagnostics
      MI_01 Androd Debug interface
      MI_02 Qualcomm HS-USB Modem
      MI_03 Qualcomm HS-USB NMEA
      MI_04 Qualcomm Wireless HS-USB Ethernet Adapter
      MI_05 Qualcomm Wireless HS-USB Ethernet Adapter
      MI_06 USB Mass Storage Device
      
      ZM8510/ZM8620/ME3960 (19d2:0396)
      
      MI_00 ZTE Mobile Broadband Diagnostics Port
      MI_01 ZTE Mobile Broadband AT Port
      MI_02 ZTE Mobile Broadband Modem
      MI_03 ZTE Mobile Broadband NDIS Port (qmi_wwan)
      MI_04 ZTE Mobile Broadband ADB Port
      
      ME3620_X (19d2:1432)
      
      MI_00 ZTE Diagnostics Device
      MI_01 ZTE UI AT Interface
      MI_02 ZTE Modem Device
      MI_03 ZTE Mobile Broadband Network Adapter
      MI_04 ZTE Composite ADB Interface
      Reported-by: default avatarLars Melin <larsm17@gmail.com>
      Signed-off-by: default avatarBjørn Mork <bjorn@mork.no>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      88ef66a2
    • Fabien Dessenne's avatar
      net: ethernet: stmmac: manage the get_irq probe defer case · 56c5bc18
      Fabien Dessenne authored
      Manage the -EPROBE_DEFER error case for "stm32_pwr_wakeup"  IRQ.
      Signed-off-by: default avatarFabien Dessenne <fabien.dessenne@st.com>
      Acked-by: default avatarAlexandre TORGUE <alexandre.torgue@st.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      56c5bc18
    • Eric Dumazet's avatar
      l2tp: use rcu_dereference_sk_user_data() in l2tp_udp_encap_recv() · c1c47721
      Eric Dumazet authored
      Canonical way to fetch sk_user_data from an encap_rcv() handler called
      from UDP stack in rcu protected section is to use rcu_dereference_sk_user_data(),
      otherwise compiler might read it multiple times.
      
      Fixes: d00fa9ad ("il2tp: fix races with tunnel socket close")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Cc: James Chapman <jchapman@katalix.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      c1c47721
    • David S. Miller's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf · ad759c90
      David S. Miller authored
      Alexei Starovoitov says:
      
      ====================
      pull-request: bpf 2019-04-25
      
      The following pull-request contains BPF updates for your *net* tree.
      
      The main changes are:
      
      1) the bpf verifier fix to properly mark registers in all stack frames, from Paul.
      
      2) preempt_enable_no_resched->preempt_enable fix, from Peter.
      
      3) other misc fixes.
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      ad759c90
    • Peter Zijlstra's avatar
      bpf: Fix preempt_enable_no_resched() abuse · 0edd6b64
      Peter Zijlstra authored
      Unless the very next line is schedule(), or implies it, one must not use
      preempt_enable_no_resched(). It can cause a preemption to go missing and
      thereby cause arbitrary delays, breaking the PREEMPT=y invariant.
      
      Cc: Roman Gushchin <guro@fb.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      0edd6b64
    • Paul Chaignon's avatar
      selftests/bpf: test cases for pkt/null checks in subprogs · 6dd7f140
      Paul Chaignon authored
      The first test case, for pointer null checks, is equivalent to the
      following pseudo-code.  It checks that the verifier does not complain on
      line 6 and recognizes that ptr isn't null.
      
      1: ptr = bpf_map_lookup_elem(map, &key);
      2: ret = subprog(ptr) {
      3:   return ptr != NULL;
      4: }
      5: if (ret)
      6:   value = *ptr;
      
      The second test case, for packet bound checks, is equivalent to the
      following pseudo-code.  It checks that the verifier does not complain on
      line 7 and recognizes that the packet is at least 1 byte long.
      
      1: pkt_end = ctx.pkt_end;
      2: ptr = ctx.pkt + 8;
      3: ret = subprog(ptr, pkt_end) {
      4:   return ptr <= pkt_end;
      5: }
      6: if (ret)
      7:   value = *(u8 *)ctx.pkt;
      Signed-off-by: default avatarPaul Chaignon <paul.chaignon@orange.com>
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      6dd7f140
    • Paul Chaignon's avatar
      bpf: mark registers in all frames after pkt/null checks · c6a9efa1
      Paul Chaignon authored
      In case of a null check on a pointer inside a subprog, we should mark all
      registers with this pointer as either safe or unknown, in both the current
      and previous frames.  Currently, only spilled registers and registers in
      the current frame are marked.  Packet bound checks in subprogs have the
      same issue.  This patch fixes it to mark registers in previous frames as
      well.
      
      A good reproducer for null checks looks as follow:
      
      1: ptr = bpf_map_lookup_elem(map, &key);
      2: ret = subprog(ptr) {
      3:   return ptr != NULL;
      4: }
      5: if (ret)
      6:   value = *ptr;
      
      With the above, the verifier will complain on line 6 because it sees ptr
      as map_value_or_null despite the null check in subprog 1.
      
      Note that this patch fixes another resulting bug when using
      bpf_sk_release():
      
      1: sk = bpf_sk_lookup_tcp(...);
      2: subprog(sk) {
      3:   if (sk)
      4:     bpf_sk_release(sk);
      5: }
      6: if (!sk)
      7:   return 0;
      8: return 1;
      
      In the above, mark_ptr_or_null_regs will warn on line 6 because it will
      try to free the reference state, even though it was already freed on
      line 3.
      
      Fixes: f4d7e40a ("bpf: introduce function calls (verification)")
      Signed-off-by: default avatarPaul Chaignon <paul.chaignon@orange.com>
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      c6a9efa1
    • Matteo Croce's avatar
      libbpf: add binary to gitignore · 39391377
      Matteo Croce authored
      Some binaries are generated when building libbpf from tools/lib/bpf/,
      namely libbpf.so.0.0.2 and libbpf.so.0.
      Add them to the local .gitignore.
      Signed-off-by: default avatarMatteo Croce <mcroce@redhat.com>
      Reviewed-by: default avatarJakub Kicinski <jakub.kicinski@netronome.com>
      Acked-by: default avatarSong Liu <songliubraving@fb.com>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      39391377
    • Alban Crequy's avatar
      tools: bpftool: fix infinite loop in map create · 8694d8c1
      Alban Crequy authored
      "bpftool map create" has an infinite loop on "while (argc)". The error
      case is missing.
      
      Symptoms: when forgetting to type the keyword 'type' in front of 'hash':
      $ sudo bpftool map create /sys/fs/bpf/dir/foobar hash key 8 value 8 entries 128
      (infinite loop, taking all the CPU)
      ^C
      
      After the patch:
      $ sudo bpftool map create /sys/fs/bpf/dir/foobar hash key 8 value 8 entries 128
      Error: unknown arg hash
      
      Fixes: 0b592b5a ("tools: bpftool: add map create command")
      Signed-off-by: default avatarAlban Crequy <alban@kinvolk.io>
      Reviewed-by: default avatarQuentin Monnet <quentin.monnet@netronome.com>
      Acked-by: default avatarSong Liu <songliubraving@fb.com>
      Reviewed-by: default avatarJakub Kicinski <jakub.kicinski@netronome.com>
      Signed-off-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: default avatarAlexei Starovoitov <ast@kernel.org>
      8694d8c1