1. 03 Feb, 2018 34 commits
  2. 31 Jan, 2018 6 commits
    • Linux 4.9.79 · 6c6f924f
      Greg Kroah-Hartman authored
    • nfsd: auth: Fix gid sorting when rootsquash enabled · f12d0602
      Ben Hutchings authored
      commit 19952667 upstream.
      
      Commit bdcf0a42 ("kernel: make groups_sort calling a responsibility
      group_info allocators") appears to break nfsd rootsquash in a pretty
      major way.
      
      It adds a call to groups_sort() inside the loop that copies/squashes
      gids, which means the valid gids are sorted along with the following
      garbage.  The net result is that the highest numbered valid gids are
      replaced with any lower-valued garbage gids, possibly including 0.
      
      We should sort only once, after filling in all the gids.
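
      As a minimal sketch, assuming the copy/squash loop over the request's
      group_info looks roughly like the following (names such as rqgi, gi
      and the squash condition are simplified stand-ins, not verbatim nfsd
      code), the fix is simply to hoist the sort out of the loop:

        /* copy gids, squashing root where requested ... */
        for (i = 0; i < rqgi->ngroups; i++) {
                if (gid_eq(GLOBAL_ROOT_GID, rqgi->gid[i]))
                        gi->gid[i] = exp->ex_anon_gid;  /* rootsquash */
                else
                        gi->gid[i] = rqgi->gid[i];
                /* no groups_sort() here: it would sort the valid gids
                 * together with the not-yet-filled tail of gi->gid[]
                 */
        }
        /* ... and sort exactly once, after all gids are filled in */
        groups_sort(gi);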
      
      Fixes: bdcf0a42 ("kernel: make groups_sort calling a responsibility ...")
      Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
      Acked-by: J. Bruce Fields <bfields@redhat.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Wolfgang Walter <linux@stwm.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • bpf: reject stores into ctx via st and xadd · f531fbb0
      Daniel Borkmann authored
      [ upstream commit f37a8cb8 ]
      
      Alexei found that the verifier does not reject stores into the
      context via BPF_ST instead of BPF_STX. And while looking at it,
      we also should not allow the XADD variant of BPF_STX.
      
      The context rewriter only assumes BPF_LDX_MEM- or BPF_STX_MEM-type
      operations, so reject anything other than these so that the
      rewriter's assumptions properly hold. Also add test cases to the
      BPF selftests.
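
      A rough illustration of what such a selftest exercises (paraphrased
      using the instruction macros from include/linux/filter.h, not a
      verbatim copy of the added tests): a socket-filter style program that
      writes to its context register R1 via BPF_ST, which the verifier
      should now refuse to load instead of handing it to the context
      rewriter.

        struct bpf_insn prog[] = {
                /* BPF_ST store into ctx (R1): must be rejected */
                BPF_ST_MEM(BPF_W, BPF_REG_1,
                           offsetof(struct __sk_buff, mark), 0),
                BPF_MOV64_IMM(BPF_REG_0, 0),
                BPF_EXIT_INSN(),
        };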
      
      Fixes: d691f9e8 ("bpf: allow programs to write to certain skb fields")
      Reported-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • bpf: fix 32-bit divide by zero · 265d7657
      Alexei Starovoitov authored
      [ upstream commit 68fda450 ]
      
      Because some JITs do the if (src_reg == 0) check in 64-bit mode
      for div/mod operations, mask the upper 32 bits of the src register
      before doing the check.
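
      A condensed sketch of that idea (the real change goes through the
      verifier's instruction-rewrite machinery; this is an abbreviation,
      not the actual diff): a 32-bit register move zero-extends, so moving
      the source register onto itself clears its upper 32 bits before the
      JIT's 64-bit zero check runs.

        if (insn->code == (BPF_ALU | BPF_DIV | BPF_X) ||
            insn->code == (BPF_ALU | BPF_MOD | BPF_X)) {
                /* BPF_MOV32_REG zero-extends: the upper 32 bits of
                 * src_reg are cleared before the 32-bit div/mod, so a
                 * JIT's 64-bit "src == 0" check now tests the value
                 * that is actually divided by.
                 */
                struct bpf_insn patch[] = {
                        BPF_MOV32_REG(insn->src_reg, insn->src_reg),
                        *insn,
                };
                /* ... replace the single div/mod insn with this pair */
        }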
      
      Fixes: 62258278 ("net: filter: x86: internal BPF JIT")
      Fixes: 7a12b503 ("sparc64: Add eBPF JIT.")
      Reported-by: syzbot+48340bb518e88849e2e3@syzkaller.appspotmail.com
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • bpf: fix divides by zero · 46060778
      Eric Dumazet authored
      [ upstream commit c366287e ]
      
      Divides by zero are not nice; let's avoid them where possible.

      Also, do_div() does not seem to be needed when dealing with 32-bit
      operands, but that is a minor detail.
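
      The failure mode is a zero divisor reaching the interpreter's
      do_div(); a minimal sketch of the 32-bit divide path with the kind
      of guard this implies (labels and helpers as used by the eBPF
      interpreter; the exact patched code may differ):

        ALU_DIV_X:
                /* the divisor is the truncated (u32) SRC, so that is
                 * what must be tested for zero; checking the full
                 * 64-bit SRC lets e.g. SRC = 1ULL << 32 slip through
                 * and do_div() below then divides by zero
                 */
                if (unlikely((u32) SRC == 0))
                        return 0;
                tmp = (u32) DST;
                do_div(tmp, (u32) SRC);
                DST = (u32) tmp;
                CONT;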
      
      Fixes: bd4cf0ed ("net: filter: rework/optimize internal BPF interpreter's instruction set")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: syzbot <syzkaller@googlegroups.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • bpf: avoid false sharing of map refcount with max_entries · 5cb917aa
      Daniel Borkmann authored
      [ upstream commit be95a845 ]
      
      In addition to commit b2157399 ("bpf: prevent out-of-bounds
      speculation"), also change the layout of struct bpf_map so that
      false sharing of fast-path members like max_entries is avoided
      when the map's reference counter is altered. Therefore enforce
      that they are placed in separate cachelines.
      
      pahole dump after change:
      
        struct bpf_map {
              const struct bpf_map_ops  * ops;                 /*     0     8 */
              struct bpf_map *           inner_map_meta;       /*     8     8 */
              void *                     security;             /*    16     8 */
              enum bpf_map_type          map_type;             /*    24     4 */
              u32                        key_size;             /*    28     4 */
              u32                        value_size;           /*    32     4 */
              u32                        max_entries;          /*    36     4 */
              u32                        map_flags;            /*    40     4 */
              u32                        pages;                /*    44     4 */
              u32                        id;                   /*    48     4 */
              int                        numa_node;            /*    52     4 */
              bool                       unpriv_array;         /*    56     1 */
      
              /* XXX 7 bytes hole, try to pack */
      
              /* --- cacheline 1 boundary (64 bytes) --- */
              struct user_struct *       user;                 /*    64     8 */
              atomic_t                   refcnt;               /*    72     4 */
              atomic_t                   usercnt;              /*    76     4 */
              struct work_struct         work;                 /*    80    32 */
              char                       name[16];             /*   112    16 */
              /* --- cacheline 2 boundary (128 bytes) --- */
      
              /* size: 128, cachelines: 2, members: 17 */
              /* sum members: 121, holes: 1, sum holes: 7 */
        };
      
      Now all entries in the first cacheline are read-only throughout
      the lifetime of the map and are set up once during map creation.
      The overall struct size and number of cachelines don't change
      from the reordering. struct bpf_map is usually the first member
      embedded in the map structs of specific map implementations, so
      also avoid letting those members sit at the end, where they could
      potentially share a cacheline with the first map values, e.g. in
      the array case, since remote CPUs could trigger map updates just
      as well for those (intentionally dirtying members like
      max_entries) while keeping subsequent values in cache.
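
      A sketch of how such a split is typically expressed in the struct
      definition (abbreviated; members as in the pahole dump above, and
      the annotation placement is illustrative rather than a verbatim
      quote of the patch):

        struct bpf_map {
                /* 1st cacheline: read-mostly members, set up once at
                 * map creation and read in the fast path
                 */
                const struct bpf_map_ops *ops ____cacheline_aligned;
                /* ... inner_map_meta, key_size, value_size,
                 *     max_entries, map_flags, ...
                 */

                /* 2nd cacheline: members written at run time, e.g. the
                 * reference counters, kept away from max_entries
                 */
                struct user_struct *user ____cacheline_aligned;
                atomic_t refcnt;
                atomic_t usercnt;
                /* ... work, name[] ... */
        };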
      
      Quoting from Google's Project Zero blog [1]:
      
        Additionally, at least on the Intel machine on which this was
        tested, bouncing modified cache lines between cores is slow,
        apparently because the MESI protocol is used for cache coherence
        [8]. Changing the reference counter of an eBPF array on one
        physical CPU core causes the cache line containing the reference
        counter to be bounced over to that CPU core, making reads of the
        reference counter on all other CPU cores slow until the changed
        reference counter has been written back to memory. Because the
        length and the reference counter of an eBPF array are stored in
        the same cache line, this also means that changing the reference
        counter on one physical CPU core causes reads of the eBPF array's
        length to be slow on other physical CPU cores (intentional false
        sharing).
      
      While this doesn't 'control' the out-of-bounds speculation through
      masking the index as in commit b2157399, triggering a manipulation
      of the map's reference counter is trivial, so let's not allow
      max_entries to be affected easily through it.
      
      Splitting into separate cachelines also generally makes sense from
      a performance perspective anyway, in that the fast path won't take
      a cache miss when the map gets pinned, reused in other progs, etc.
      from the control path; this also avoids unintentional false
      sharing.
      
        [1] https://googleprojectzero.blogspot.ch/2018/01/reading-privileged-memory-with-side.html

      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>