1. 03 Jun, 2019 2 commits
  2. 01 Jun, 2019 2 commits
    • bpf, riscv: clear high 32 bits for ALU32 add/sub/neg/lsh/rsh/arsh · 1e692f09
      Luke Nelson authored
      In BPF, 32-bit ALU operations should zero-extend their results into
      the 64-bit registers.
      
      The current BPF JIT on RISC-V emits incorrect instructions that perform
      sign extension only (e.g., addw, subw) on 32-bit add, sub, lsh, rsh,
      arsh, and neg. This behavior diverges from the interpreter and JITs
      for other architectures.
      
      This patch fixes the bugs by performing zero extension on the destination
      register of 32-bit ALU operations.
      
      Fixes: 2353ecc6 ("bpf, riscv: add BPF JIT for RV64G")
      Cc: Xi Wang <xi.wang@gmail.com>
      Signed-off-by: Luke Nelson <luke.r.nels@gmail.com>
      Acked-by: Song Liu <songliubraving@fb.com>
      Acked-by: Björn Töpel <bjorn.topel@gmail.com>
      Reviewed-by: Palmer Dabbelt <palmer@sifive.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      1e692f09
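The difference between sign extension and zero extension described in the commit above can be modelled in plain C. This is only an illustrative sketch, not the JIT code: `rv64_addw`, `zext32`, and `bpf_alu32_add` are hypothetical helper names, and the model assumes the usual two's-complement conversion when casting to `int32_t`.

```c
#include <stdint.h>

/* Models RV64 addw: a 32-bit add whose result is sign-extended to 64 bits. */
static uint64_t rv64_addw(uint64_t a, uint64_t b)
{
	uint32_t sum = (uint32_t)a + (uint32_t)b;

	return (uint64_t)(int64_t)(int32_t)sum;
}

/* Clears the high 32 bits: shift left by 32, then logical shift right by 32. */
static uint64_t zext32(uint64_t v)
{
	return (v << 32) >> 32;
}

/* Models what BPF expects from a 32-bit add: the result is zero-extended. */
static uint64_t bpf_alu32_add(uint64_t dst, uint64_t src)
{
	return zext32(rv64_addw(dst, src));
}
```

With `dst = 0x7fffffff` and `src = 1`, the sign-extending model yields `0xffffffff80000000`, while the zero-extended result is `0x0000000080000000`, which is the value the interpreter would produce.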
    • libbpf: Return btf_fd for load_sk_storage_btf · cfd49210
      Michal Rostecki authored
      Before this change, load_sk_storage_btf expected
      libbpf__probe_raw_btf to return a BTF descriptor, but it actually
      returned whether the probe was successful (0 or 1).
      load_sk_storage_btf passed that value to close(), which closed
      stdout (file descriptor 1) and thus terminated the process that
      called the function.
      
      The bug was visible in bpftool. The `bpftool feature` subcommand always
      exited too early (because of the closed stdout) and did not display all
      requested probes. `bpftool -j feature` and `bpftool -p feature` did not
      return a valid JSON object.
      
      This change renames the libbpf__probe_raw_btf function to
      libbpf__load_raw_btf, which now returns a BTF descriptor, as expected in
      load_sk_storage_btf.
      
      v2:
      - Fix typo in the commit message.
      
      v3:
      - Simplify BTF descriptor handling in bpf_object__probe_btf_* functions.
      - Rename libbpf__probe_raw_btf function to libbpf__load_raw_btf and
      return a BTF descriptor.
      
      v4:
      - Fix typo in the commit message.
      
      Fixes: d7c4b398 ("libbpf: detect supported kernel BTF features and sanitize BTF")
      Signed-off-by: Michal Rostecki <mrostecki@opensuse.org>
      Acked-by: Andrii Nakryiko <andriin@fb.com>
      Acked-by: Song Liu <songliubraving@fb.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      cfd49210
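The mix-up described in the commit above (a 0/1 status value handed to close() as if it were a descriptor) can be reproduced in a small user-space sketch. The function names here are hypothetical and are not libbpf APIs; the demo saves and restores stdout so that it is harmless to run.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical probe: reports success (1) or failure (0), not a descriptor. */
static int probe_returns_status(void)
{
	return 1;
}

/* Returns 1 if fd refers to an open file descriptor, 0 otherwise. */
static int fd_is_open(int fd)
{
	return fcntl(fd, F_GETFD) != -1;
}

/*
 * Treats the status value as a descriptor and closes it.
 * Returns 1 if stdout was closed by that path, 0 if not, -1 on error.
 */
static int demo_close_status_as_fd(void)
{
	int saved = dup(STDOUT_FILENO);
	int fd, was_closed;

	if (saved < 0)
		return -1;

	fd = probe_returns_status();	/* 1 happens to equal STDOUT_FILENO */
	close(fd);			/* bug: this closes stdout */
	was_closed = !fd_is_open(STDOUT_FILENO);

	dup2(saved, STDOUT_FILENO);	/* restore stdout for the caller */
	close(saved);
	return was_closed;
}
```

Returning an actual descriptor from the loading function, as the renamed libbpf__load_raw_btf does according to the commit message, makes the later close() operate on the intended file.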
  3. 29 May, 2019 4 commits
  4. 24 May, 2019 1 commit
    • bpf: sockmap, fix use after free from sleep in psock backlog workqueue · bd95e678
      John Fastabend authored
      Backlog work for psock (sk_psock_backlog) might sleep while waiting
      for memory to free up when sending packets. However, while sleeping
      the socket may be closed and removed from the map by the user space
      side.
      
      This breaks an assumption in sk_stream_wait_memory, which expects the
      wait queue to be still there when it wakes up resulting in a
      use-after-free shown below. To fix this, mark sendmsg as MSG_DONTWAIT
      to avoid the sleep altogether. We already set the flag for the
      sendpage case but we missed the case where sendmsg is used.
      Sockmap is currently the only user of skb_send_sock_locked() so only
      the sockmap paths should be impacted.
      
      ==================================================================
      BUG: KASAN: use-after-free in remove_wait_queue+0x31/0x70
      Write of size 8 at addr ffff888069a0c4e8 by task kworker/0:2/110
      
      CPU: 0 PID: 110 Comm: kworker/0:2 Not tainted 5.0.0-rc2-00335-g28f9d1a3-dirty #14
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-2.fc27 04/01/2014
      Workqueue: events sk_psock_backlog
      Call Trace:
       print_address_description+0x6e/0x2b0
       ? remove_wait_queue+0x31/0x70
       kasan_report+0xfd/0x177
       ? remove_wait_queue+0x31/0x70
       ? remove_wait_queue+0x31/0x70
       remove_wait_queue+0x31/0x70
       sk_stream_wait_memory+0x4dd/0x5f0
       ? sk_stream_wait_close+0x1b0/0x1b0
       ? wait_woken+0xc0/0xc0
       ? tcp_current_mss+0xc5/0x110
       tcp_sendmsg_locked+0x634/0x15d0
       ? tcp_set_state+0x2e0/0x2e0
       ? __kasan_slab_free+0x1d1/0x230
       ? kmem_cache_free+0x70/0x140
       ? sk_psock_backlog+0x40c/0x4b0
       ? process_one_work+0x40b/0x660
       ? worker_thread+0x82/0x680
       ? kthread+0x1b9/0x1e0
       ? ret_from_fork+0x1f/0x30
       ? check_preempt_curr+0xaf/0x130
       ? iov_iter_kvec+0x5f/0x70
       ? kernel_sendmsg_locked+0xa0/0xe0
       skb_send_sock_locked+0x273/0x3c0
       ? skb_splice_bits+0x180/0x180
       ? start_thread+0xe0/0xe0
       ? update_min_vruntime.constprop.27+0x88/0xc0
       sk_psock_backlog+0xb3/0x4b0
       ? strscpy+0xbf/0x1e0
       process_one_work+0x40b/0x660
       worker_thread+0x82/0x680
       ? process_one_work+0x660/0x660
       kthread+0x1b9/0x1e0
       ? __kthread_create_on_node+0x250/0x250
       ret_from_fork+0x1f/0x30
      
      Fixes: 20bf50de ("skbuff: Function to send an skbuf on a socket")
      Reported-by: Jakub Sitnicki <jakub@cloudflare.com>
      Tested-by: Jakub Sitnicki <jakub@cloudflare.com>
      Signed-off-by: John Fastabend <john.fastabend@gmail.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      bd95e678
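The effect of MSG_DONTWAIT mentioned in the commit above can be observed from user space. The sketch below is only an analogy for the kernel-side change, assuming a Linux-like system: it fills a Unix stream socket whose peer never reads, and `demo_send_dontwait` is a hypothetical helper name.

```c
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Fills a socket whose peer never reads, using MSG_DONTWAIT.
 * Returns 1 if send() eventually failed with EAGAIN/EWOULDBLOCK
 * instead of sleeping, 0 otherwise.
 */
static int demo_send_dontwait(void)
{
	int sv[2];
	char buf[4096];
	int would_block = 0;
	int i;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
		return 0;

	memset(buf, 0, sizeof(buf));
	/* The iteration cap only guards against an unexpectedly large buffer. */
	for (i = 0; i < (1 << 16); i++) {
		ssize_t n = send(sv[0], buf, sizeof(buf), MSG_DONTWAIT);

		if (n < 0) {
			would_block = (errno == EAGAIN || errno == EWOULDBLOCK);
			break;
		}
	}

	close(sv[0]);
	close(sv[1]);
	return would_block;
}
```

Without the flag, the same send() call on a blocking socket would sleep once the buffer is full, which is the kind of wait the fix avoids in the backlog worker.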
  5. 23 May, 2019 3 commits
    • bpf: sockmap, restore sk_write_space when psock gets dropped · 186bcc3d
      Jakub Sitnicki authored
      Once psock gets unlinked from its sock (sk_psock_drop), user-space can
      still trigger a call to sk->sk_write_space by setting TCP_NOTSENT_LOWAT
      socket option. This causes a null-ptr-deref because we try to read
      psock->saved_write_space from sk_psock_write_space:
      
      ==================================================================
      BUG: KASAN: null-ptr-deref in sk_psock_write_space+0x69/0x80
      Read of size 8 at addr 00000000000001a0 by task sockmap-echo/131
      
      CPU: 0 PID: 131 Comm: sockmap-echo Not tainted 5.2.0-rc1-00094-gf49aa1de #81
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
      ?-20180724_192412-buildhw-07.phx2.fedoraproject.org-1.fc29 04/01/2014
      Call Trace:
       ? sk_psock_write_space+0x69/0x80
       __kasan_report.cold.2+0x5/0x3f
       ? sk_psock_write_space+0x69/0x80
       kasan_report+0xe/0x20
       sk_psock_write_space+0x69/0x80
       tcp_setsockopt+0x69a/0xfc0
       ? tcp_shutdown+0x70/0x70
       ? fsnotify+0x5b0/0x5f0
       ? remove_wait_queue+0x90/0x90
       ? __fget_light+0xa5/0xf0
       __sys_setsockopt+0xe6/0x180
       ? sockfd_lookup_light+0xb0/0xb0
       ? vfs_write+0x195/0x210
       ? ksys_write+0xc9/0x150
       ? __x64_sys_read+0x50/0x50
       ? __bpf_trace_x86_fpu+0x10/0x10
       __x64_sys_setsockopt+0x61/0x70
       do_syscall_64+0xc5/0x520
       ? vmacache_find+0xc0/0x110
       ? syscall_return_slowpath+0x110/0x110
       ? handle_mm_fault+0xb4/0x110
       ? entry_SYSCALL_64_after_hwframe+0x3e/0xbe
       ? trace_hardirqs_off_caller+0x4b/0x120
       ? trace_hardirqs_off_thunk+0x1a/0x3a
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      RIP: 0033:0x7f2e5e7cdcce
      Code: d8 64 89 02 48 c7 c0 ff ff ff ff eb b1 66 2e 0f 1f 84 00 00 00 00 00
      0f 1f 44 00 00 f3 0f 1e fa 49 89 ca b8 36 00 00 00 0f 05 <48> 3d 01 f0 ff
      ff 73 01 c3 48 8b 0d 8a 11 0c 00 f7 d8 64 89 01 48
      RSP: 002b:00007ffed011b778 EFLAGS: 00000206 ORIG_RAX: 0000000000000036
      RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007f2e5e7cdcce
      RDX: 0000000000000019 RSI: 0000000000000006 RDI: 0000000000000007
      RBP: 00007ffed011b790 R08: 0000000000000004 R09: 00007f2e5e84ee80
      R10: 00007ffed011b788 R11: 0000000000000206 R12: 00007ffed011b78c
      R13: 00007ffed011b788 R14: 0000000000000007 R15: 0000000000000068
      ==================================================================
      
      Restore the saved sk_write_space callback when psock is being dropped to
      fix the crash.
      Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      186bcc3d
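The save-and-restore pattern behind the fix above can be shown with a toy model. The `toy_*` types and functions are hypothetical and much simpler than the kernel's struct sock and sk_psock; the sketch only illustrates why the callback must be put back before the psock goes away.

```c
#include <stddef.h>

struct toy_sock;
typedef void (*write_space_fn)(struct toy_sock *sk);

struct toy_psock {
	write_space_fn saved_write_space;	/* callback in place before attach */
};

struct toy_sock {
	write_space_fn write_space;
	struct toy_psock *psock;
	int calls_original;
};

static void original_write_space(struct toy_sock *sk)
{
	sk->calls_original++;
}

/* Wrapper installed on attach; it dereferences sk->psock. */
static void psock_write_space(struct toy_sock *sk)
{
	sk->psock->saved_write_space(sk);
}

static void toy_psock_attach(struct toy_sock *sk, struct toy_psock *psock)
{
	psock->saved_write_space = sk->write_space;
	sk->psock = psock;
	sk->write_space = psock_write_space;
}

static void toy_psock_drop(struct toy_sock *sk)
{
	/* Restore the saved callback before unlinking the psock. */
	sk->write_space = sk->psock->saved_write_space;
	sk->psock = NULL;
}
```

If `toy_psock_drop` cleared `sk->psock` without restoring `write_space`, a later call through `sk->write_space` would dereference a NULL psock, which mirrors the null-ptr-deref in the report.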
    • selftests: bpf: add zero extend checks for ALU32 and/or/xor · 00d83045
      Björn Töpel authored
      Add three tests to test_verifier/basic_instr that make sure that the
      high 32 bits of the destination register are cleared after an ALU32
      and/or/xor.
      Signed-off-by: Björn Töpel <bjorn.topel@gmail.com>
      Acked-by: Yonghong Song <yhs@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      00d83045
    • bpf, riscv: clear target register high 32-bits for and/or/xor on ALU32 · fe121ee5
      Björn Töpel authored
      When using 32-bit subregisters (ALU32), the RISC-V JIT would not clear
      the high 32-bits of the target register and therefore generate
      incorrect code.
      
      E.g., in the following code:
      
        $ cat test.c
        unsigned int f(unsigned long long a,
        	       unsigned int b)
        {
        	return (unsigned int)a & b;
        }
      
        $ clang-9 -target bpf -O2 -emit-llvm -S test.c -o - | \
        	llc-9 -mattr=+alu32 -mcpu=v3
        	.text
        	.file	"test.c"
        	.globl	f
        	.p2align	3
        	.type	f,@function
        f:
        	r0 = r1
        	w0 &= w2
        	exit
        .Lfunc_end0:
        	.size	f, .Lfunc_end0-f
      
      The JIT would not clear the high 32-bits of r0 after the
      and-operation, which in this case might give an incorrect return
      value.
      
      After this patch, that is not the case, and the upper 32-bits are
      cleared.
      Reported-by: Jiong Wang <jiong.wang@netronome.com>
      Fixes: 2353ecc6 ("bpf, riscv: add BPF JIT for RV64G")
      Signed-off-by: Björn Töpel <bjorn.topel@gmail.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      fe121ee5
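The `w0 &= w2` case from the commit above can be modelled in C. This is a hedged sketch, assuming the pre-fix JIT performed a plain 64-bit AND on the full registers; the helper names are hypothetical, and the register values in the usage note are chosen only to make the stale high bits visible.

```c
#include <stdint.h>

/* Models a 64-bit AND applied to a 32-bit BPF operation without a fix-up. */
static uint64_t and32_no_zext(uint64_t dst, uint64_t src)
{
	return dst & src;
}

/* Models the expected ALU32 result: only the low 32 bits survive. */
static uint64_t and32_zext(uint64_t dst, uint64_t src)
{
	return (uint32_t)(dst & src);
}
```

For `dst = 0xdeadbeef0000000f` and `src = 0xffffffff00000003`, the first model keeps `0xdeadbeef` in the high half, whereas the zero-extended result is `3`.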
  6. 21 May, 2019 18 commits
  7. 20 May, 2019 6 commits
    • vlan: Mark expected switch fall-through · fa2c52be
      Gustavo A. R. Silva authored
      In preparation to enabling -Wimplicit-fallthrough, mark switch
      cases where we are expecting to fall through.
      
      This patch fixes the following warning:
      
      net/8021q/vlan_dev.c: In function ‘vlan_dev_ioctl’:
      net/8021q/vlan_dev.c:374:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
         if (!net_eq(dev_net(dev), &init_net))
            ^
      net/8021q/vlan_dev.c:376:2: note: here
        case SIOCGMIIPHY:
        ^~~~
      
      Warning level 3 was used: -Wimplicit-fallthrough=3
      
      This patch is part of the ongoing efforts to enable
      -Wimplicit-fallthrough.
      Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      fa2c52be
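The shape of the warning above, and the kind of marking that silences it, can be sketched with a toy switch. The names below are hypothetical and unrelated to the real ioctl handlers; the sketch relies on GCC recognizing a `/* fall through */` comment at -Wimplicit-fallthrough=3.

```c
enum toy_cmd { TOY_CMD_SET, TOY_CMD_GET };

#define TOY_ERR_UNSUPPORTED (-1)
#define TOY_FORWARDED 0

/* Forwards GET always, and SET only when in_init_ns is non-zero. */
static int toy_ioctl(enum toy_cmd cmd, int in_init_ns)
{
	int err = TOY_ERR_UNSUPPORTED;

	switch (cmd) {
	case TOY_CMD_SET:
		if (!in_init_ns)
			break;
		/* fall through */
	case TOY_CMD_GET:
		err = TOY_FORWARDED;
		break;
	}

	return err;
}
```

The comment changes nothing at run time; it only documents that reaching the next case label after the conditional `break` is intentional.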
    • macvlan: Mark expected switch fall-through · 02596252
      Gustavo A. R. Silva authored
      In preparation to enabling -Wimplicit-fallthrough, mark switch
      cases where we are expecting to fall through.
      
      This patch fixes the following warning:
      
      drivers/net/macvlan.c: In function ‘macvlan_do_ioctl’:
      drivers/net/macvlan.c:839:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
         if (!net_eq(dev_net(dev), &init_net))
            ^
      drivers/net/macvlan.c:841:2: note: here
        case SIOCGHWTSTAMP:
        ^~~~
      
      Warning level 3 was used: -Wimplicit-fallthrough=3
      
      This patch is part of the ongoing efforts to enable
      -Wimplicit-fallthrough.
      Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      02596252
    • net/mlx4_en: ethtool, Remove unsupported SFP EEPROM high pages query · 135dd959
      Erez Alfasi authored
      Querying EEPROM high pages data for SFP module is currently
      not supported by our driver but is still tried, resulting in
      invalid FW queries.
      
      Set the EEPROM ethtool data length to 256 for SFP modules to
      limit reading to page 0 only and prevent invalid FW queries.
      
      Fixes: 7202da8b ("ethtool, net/mlx4_en: Cable info, get_module_info/eeprom ethtool support")
      Signed-off-by: Erez Alfasi <ereza@mellanox.com>
      Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      135dd959
    • tipc: fix modprobe tipc failed after switch order of device registration · 526f5b85
      Junwei Hu authored
      Running modprobe tipc prints the following error:
      
      modprobe: ERROR: could not insert 'tipc': Address family not
      supported by protocol.
      
      This happens after commit 7e27e8d6
      ("tipc: switch order of device registration to fix a crash").
      
      tipc_topsrv_create_listener() calls sock_create_kern(net, AF_TIPC, ...)
      during tipc_init_net(), so tipc_socket_init() must be executed before
      that. Meanwhile, tipc_net_id needs to be initialized by the time
      sock_create() is called, and tipc_socket_init() does not need to be
      called for each namespace.
      
      Add a variable tipc_topsrv_net_ops, split the
      register_pernet_subsys() of tipc into two parts, and separate
      tipc_socket_init() from the initialization of the pernet params.
      
      Also fix the resource rollback error when tipc_bcast_init()
      fails in tipc_init_net().
      
      Fixes: 7e27e8d6 ("tipc: switch order of device registration to fix a crash")
      Signed-off-by: Junwei Hu <hujunwei4@huawei.com>
      Reported-by: Wang Wang <wangwang2@huawei.com>
      Reported-by: syzbot+1e8114b61079bfe9cbc5@syzkaller.appspotmail.com
      Reviewed-by: Kang Zhou <zhoukang7@huawei.com>
      Reviewed-by: Suanming Mou <mousuanming@huawei.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      526f5b85
    • Merge tag 'for-5.2-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux · f49aa1de
      Linus Torvalds authored
      Pull btrfs fixes from David Sterba:
       "Notable highlights:
      
         - fixes for some long-standing bugs in fsync that were quite hard to
           catch but now finally fixed
      
         - some fixups to error handling paths that did not properly clean up
           (locking, memory)
      
         - fix to space reservation for inheriting properties"
      
      * tag 'for-5.2-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
        Btrfs: tree-checker: detect file extent items with overlapping ranges
        Btrfs: fix race between ranged fsync and writeback of adjacent ranges
        Btrfs: avoid fallback to transaction commit during fsync of files with holes
        btrfs: extent-tree: Fix a bug that btrfs is unable to add pinned bytes
        btrfs: sysfs: don't leak memory when failing add fsid
        btrfs: sysfs: Fix error path kobject memory leak
        Btrfs: do not abort transaction at btrfs_update_root() after failure to COW path
        btrfs: use the existing reserved items for our first prop for inheritance
        btrfs: don't double unlock on error in btrfs_punch_hole
        btrfs: Check the compression level before getting a workspace
      f49aa1de
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 78e03651
      Linus Torvalds authored
      Pull networking fixes from David Miller:
      
       1) Use after free in __dev_map_entry_free(), from Eric Dumazet.
      
       2) Fix TCP retransmission timestamps on passive Fast Open, from Yuchung
          Cheng.
      
       3) Orphan NFC, we'll take the patches directly into my tree. From
          Johannes Berg.
      
       4) We can't recycle cloned TCP skbs, from Eric Dumazet.
      
       5) Some flow dissector bpf test fixes, from Stanislav Fomichev.
      
       6) Fix RCU marking and warnings in rhashtable, from Herbert Xu.
      
       7) Fix some potential fib6 leaks, from Eric Dumazet.
      
       8) Fix a _decode_session4 uninitialized memory read bug fix that got
          lost in a merge. From Florian Westphal.
      
       9) Fix ipv6 source address routing wrt. exception route entries, from
          Wei Wang.
      
      10) The netdev_xmit_more() conversion was not done 100% properly in mlx5
          driver, fix from Tariq Toukan.
      
      11) Clean up botched merge on netfilter kselftest, from Florian
          Westphal.
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (74 commits)
        of_net: fix of_get_mac_address retval if compiled without CONFIG_OF
        net: fix kernel-doc warnings for socket.c
        net: Treat sock->sk_drops as an unsigned int when printing
        kselftests: netfilter: fix leftover net/net-next merge conflict
        mlxsw: core: Prevent reading unsupported slave address from SFP EEPROM
        mlxsw: core: Prevent QSFP module initialization for old hardware
        vsock/virtio: Initialize core virtio vsock before registering the driver
        net/mlx5e: Fix possible modify header actions memory leak
        net/mlx5e: Fix no rewrite fields with the same match
        net/mlx5e: Additional check for flow destination comparison
        net/mlx5e: Add missing ethtool driver info for representors
        net/mlx5e: Fix number of vports for ingress ACL configuration
        net/mlx5e: Fix ethtool rxfh commands when CONFIG_MLX5_EN_RXNFC is disabled
        net/mlx5e: Fix wrong xmit_more application
        net/mlx5: Fix peer pf disable hca command
        net/mlx5: E-Switch, Correct type to u16 for vport_num and int for vport_index
        net/mlx5: Add meaningful return codes to status_to_err function
        net/mlx5: Imply MLXFW in mlx5_core
        Revert "tipc: fix modprobe tipc failed after switch order of device registration"
        vsock/virtio: free packets during the socket release
        ...
      78e03651
  8. 19 May, 2019 4 commits
    • Linux 5.2-rc1 · a188339c
      Linus Torvalds authored
      a188339c
    • Merge tag 'upstream-5.2-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/ubifs · 2e2c1220
      Linus Torvalds authored
      Pull UBIFS fixes from Richard Weinberger:
      
       - build errors wrt xattrs
      
       - mismerge which led to a wrong Kconfig ifdef
      
       - missing endianness conversion
      
      * tag 'upstream-5.2-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/ubifs:
        ubifs: Convert xattr inum to host order
        ubifs: Use correct config name for encryption
        ubifs: Fix build error without CONFIG_UBIFS_FS_XATTR
      2e2c1220
    • Merge branch 'akpm' (patches from Andrew) · cb6f8739
      Linus Torvalds authored
      Merge yet more updates from Andrew Morton:
       "A few final bits:
      
         - large changes to vmalloc, yielding large performance benefits
      
         - tweak the console-flush-on-panic code
      
         - a few fixes"
      
      * emailed patches from Andrew Morton <akpm@linux-foundation.org>:
        panic: add an option to replay all the printk message in buffer
        initramfs: don't free a non-existent initrd
        fs/writeback.c: use rcu_barrier() to wait for inflight wb switches going into workqueue when umount
        mm/compaction.c: correct zone boundary handling when isolating pages from a pageblock
        mm/vmap: add DEBUG_AUGMENT_LOWEST_MATCH_CHECK macro
        mm/vmap: add DEBUG_AUGMENT_PROPAGATE_CHECK macro
        mm/vmalloc.c: keep track of free blocks for vmap allocation
      cb6f8739
    • Merge tag 'kbuild-v5.2-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild · ff8583d6
      Linus Torvalds authored
      Pull more Kbuild updates from Masahiro Yamada:
      
       - remove unneeded use of cc-option, cc-disable-warning, cc-ldoption
      
       - exclude tracked files from .gitignore
      
       - re-enable -Wint-in-bool-context warning
      
       - refactor samples/Makefile
      
       - stop building immediately if syncconfig fails
      
       - do not sprinkle error messages when $(CC) does not exist
      
       - move arch/alpha/defconfig to the configs subdirectory
      
       - remove crappy header search path manipulation
      
       - add comment lines to .config to clarify the end of menu blocks
      
       - check uniqueness of module names (adding new warnings intentionally)
      
      * tag 'kbuild-v5.2-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (24 commits)
        kconfig: use 'else ifneq' for Makefile to improve readability
        kbuild: check uniqueness of module names
        kconfig: Terminate menu blocks with a comment in the generated config
        kbuild: add LICENSES to KBUILD_ALLDIRS
        kbuild: remove 'addtree' and 'flags' magic for header search paths
        treewide: prefix header search paths with $(srctree)/
        media: prefix header search paths with $(srctree)/
        media: remove unneeded header search paths
        alpha: move arch/alpha/defconfig to arch/alpha/configs/defconfig
        kbuild: terminate Kconfig when $(CC) or $(LD) is missing
        kbuild: turn auto.conf.cmd into a mandatory include file
        .gitignore: exclude .get_maintainer.ignore and .gitattributes
        kbuild: add all Clang-specific flags unconditionally
        kbuild: Don't try to add '-fcatch-undefined-behavior' flag
        kbuild: add some extra warning flags unconditionally
        kbuild: add -Wvla flag unconditionally
        arch: remove dangling asm-generic wrappers
        samples: guard sub-directories with CONFIG options
        kbuild: re-enable int-in-bool-context warning
        MAINTAINERS: kbuild: Add pattern for scripts/*vmlinux*
        ...
      ff8583d6