1. 30 Dec, 2015 26 commits
  2. 27 Nov, 2015 14 commits
    • Linux 3.2.74 · 95afdc9f
      Ben Hutchings authored
    • splice: sendfile() at once fails for big files · fcb27817
      Christophe Leroy authored
      commit 0ff28d9f upstream.
      
      Using sendfile() with the small program below to compute the MD5 sums of
      some files, it appears that big files (over 64 kbytes on a system with 4k
      pages) get a wrong MD5 sum while small files get the correct sum.
      This program uses sendfile() to send a file to an AF_ALG socket
      for hashing.
      
      /* md5sum2.c */
      #include <stdio.h>
      #include <stdlib.h>
      #include <unistd.h>
      #include <string.h>
      #include <fcntl.h>
      #include <sys/sendfile.h>
      #include <sys/socket.h>
      #include <sys/stat.h>
      #include <sys/types.h>
      #include <linux/if_alg.h>
      
      int main(int argc, char **argv)
      {
      	int sk = socket(AF_ALG, SOCK_SEQPACKET, 0);
      	struct stat st;
      	struct sockaddr_alg sa = {
      		.salg_family = AF_ALG,
      		.salg_type = "hash",
      		.salg_name = "md5",
      	};
      	int n;
      
      	bind(sk, (struct sockaddr*)&sa, sizeof(sa));
      
      	for (n = 1; n < argc; n++) {
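      		int size;
      		/* sendfile() expects a pointer to off_t */
      		off_t offset = 0;
      		/* unsigned so that "%2.2x" prints exactly one byte */
      		unsigned char buf[4096];
      		int fd;
      		int sko;
      		int i;
      
      		fd = open(argv[n], O_RDONLY);
      		sko = accept(sk, NULL, 0);
      		fstat(fd, &st);
      		size = st.st_size;
      		sendfile(sko, fd, &offset, size);
      		/* reading from the socket returns the digest */
      		size = read(sko, buf, sizeof(buf));
      		for (i = 0; i < size; i++)
      			printf("%2.2x", buf[i]);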
      		printf("  %s\n", argv[n]);
      		close(fd);
      		close(sko);
      	}
      	exit(0);
      }
      
      The test below uses official Linux patch files. The first result is
      from the software-based md5sum; the second is from the program above.
      
      root@vgoip:~# ls -l patch-3.6.*
      -rw-r--r--    1 root     root         64011 Aug 24 12:01 patch-3.6.2.gz
      -rw-r--r--    1 root     root         94131 Aug 24 12:01 patch-3.6.3.gz
      
      root@vgoip:~# md5sum patch-3.6.*
      b3ffb9848196846f31b2ff133d2d6443  patch-3.6.2.gz
      c5e8f687878457db77cb7158c38a7e43  patch-3.6.3.gz
      
      root@vgoip:~# ./md5sum2 patch-3.6.*
      b3ffb9848196846f31b2ff133d2d6443  patch-3.6.2.gz
      5fd77b24e68bb24dcc72d6e57c64790e  patch-3.6.3.gz
      
      After investigation, it appears that sendfile() sends the file in blocks
      of 64 kbytes (16 times PAGE_SIZE). The problem is that at the end of each
      block, the SPLICE_F_MORE flag is missing, so the hashing operation
      is reset as if it were the end of the file.
      
      This patch adds SPLICE_F_MORE to the flags when more data is pending.
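
The block-splitting behaviour described above can be illustrated with a small userspace sketch. This is not the kernel patch: `plan_chunks`, `CHUNK_MAX` and `FLAG_MORE` are hypothetical names that only model how a transfer split into 64 kbyte blocks would carry a "more data pending" flag on every block except the last.

```c
#include <stddef.h>

/* Hypothetical stand-ins: 16 pages of 4 kbytes, and a "more data" flag bit. */
#define CHUNK_MAX (16 * 4096)
#define FLAG_MORE 0x1u

struct chunk {
	size_t len;         /* bytes carried by this block */
	unsigned int flags; /* FLAG_MORE when another block follows */
};

/*
 * Split a transfer of `total` bytes into blocks of at most CHUNK_MAX bytes.
 * Writes up to `max` entries into `out` and returns the number written.
 */
size_t plan_chunks(size_t total, struct chunk *out, size_t max)
{
	size_t n = 0;

	while (total > 0 && n < max) {
		size_t len = total < CHUNK_MAX ? total : CHUNK_MAX;

		total -= len;
		out[n].len = len;
		/* Flag every block that is followed by more data. */
		out[n].flags = total > 0 ? FLAG_MORE : 0;
		n++;
	}
	return n;
}
```

With the file sizes from the listing above, the 64011-byte file fits in a single block, while the 94131-byte file needs two, so only the larger file depends on the flag being set.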
      
      With the patch applied, we get the correct sums:
      
      root@vgoip:~# md5sum patch-3.6.*
      b3ffb9848196846f31b2ff133d2d6443  patch-3.6.2.gz
      c5e8f687878457db77cb7158c38a7e43  patch-3.6.3.gz
      
      root@vgoip:~# ./md5sum2 patch-3.6.*
      b3ffb9848196846f31b2ff133d2d6443  patch-3.6.2.gz
      c5e8f687878457db77cb7158c38a7e43  patch-3.6.3.gz
      
      Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
      Signed-off-by: Jens Axboe <axboe@fb.com>
      Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
    • net: avoid NULL deref in inet_ctl_sock_destroy() · f79c83d6
      Eric Dumazet authored
      [ Upstream commit 8fa677d2 ]
      
      Under low memory conditions, tcp_sk_init() and icmp_sk_init()
      can both iterate over all possible CPUs and call inet_ctl_sock_destroy(),
      possibly with a NULL pointer.
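
One way to make such an error path safe is a destroy helper that tolerates NULL, so a cleanup loop can walk slots that were never allocated. The sketch below is a userspace analogue under that assumption; `ctl_sock`, `ctl_sock_destroy` and `destroy_all` are hypothetical names, not the kernel functions.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a per-CPU control socket. */
struct ctl_sock {
	int cpu;
};

/* Release one socket. NULL is accepted and ignored. Returns 1 if freed. */
int ctl_sock_destroy(struct ctl_sock *sk)
{
	if (!sk)
		return 0;
	free(sk);
	return 1;
}

/* Error path: walk every slot, including those never allocated. */
int destroy_all(struct ctl_sock **socks, size_t ncpus)
{
	int freed = 0;
	size_t i;

	for (i = 0; i < ncpus; i++) {
		freed += ctl_sock_destroy(socks[i]);
		socks[i] = NULL;
	}
	return freed;
}
```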
      
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Reported-by: Dmitry Vyukov <dvyukov@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
    • ipmr: fix possible race resulting from improper usage of IP_INC_STATS_BH() in preemptible context. · 33cf84ba
      Ani Sinha authored
      [ Upstream commit 44f49dd8 ]
      
      Fixes the following kernel BUG:
      
      BUG: using __this_cpu_add() in preemptible [00000000] code: bash/2758
      caller is __this_cpu_preempt_check+0x13/0x15
      CPU: 0 PID: 2758 Comm: bash Tainted: P           O   3.18.19 #2
       ffffffff8170eaca ffff880110d1b788 ffffffff81482b2a 0000000000000000
       0000000000000000 ffff880110d1b7b8 ffffffff812010ae ffff880007cab800
       ffff88001a060800 ffff88013a899108 ffff880108b84240 ffff880110d1b7c8
      Call Trace:
      [<ffffffff81482b2a>] dump_stack+0x52/0x80
      [<ffffffff812010ae>] check_preemption_disabled+0xce/0xe1
      [<ffffffff812010d4>] __this_cpu_preempt_check+0x13/0x15
      [<ffffffff81419d60>] ipmr_queue_xmit+0x647/0x70c
      [<ffffffff8141a154>] ip_mr_forward+0x32f/0x34e
      [<ffffffff8141af76>] ip_mroute_setsockopt+0xe03/0x108c
      [<ffffffff810553fc>] ? get_parent_ip+0x11/0x42
      [<ffffffff810e6974>] ? pollwake+0x4d/0x51
      [<ffffffff81058ac0>] ? default_wake_function+0x0/0xf
      [<ffffffff810553fc>] ? get_parent_ip+0x11/0x42
      [<ffffffff810613d9>] ? __wake_up_common+0x45/0x77
      [<ffffffff81486ea9>] ? _raw_spin_unlock_irqrestore+0x1d/0x32
      [<ffffffff810618bc>] ? __wake_up_sync_key+0x4a/0x53
      [<ffffffff8139a519>] ? sock_def_readable+0x71/0x75
      [<ffffffff813dd226>] do_ip_setsockopt+0x9d/0xb55
      [<ffffffff81429818>] ? unix_seqpacket_sendmsg+0x3f/0x41
      [<ffffffff813963fe>] ? sock_sendmsg+0x6d/0x86
      [<ffffffff813959d4>] ? sockfd_lookup_light+0x12/0x5d
      [<ffffffff8139650a>] ? SyS_sendto+0xf3/0x11b
      [<ffffffff810d5738>] ? new_sync_read+0x82/0xaa
      [<ffffffff813ddd19>] compat_ip_setsockopt+0x3b/0x99
      [<ffffffff813fb24a>] compat_raw_setsockopt+0x11/0x32
      [<ffffffff81399052>] compat_sock_common_setsockopt+0x18/0x1f
      [<ffffffff813c4d05>] compat_SyS_setsockopt+0x1a9/0x1cf
      [<ffffffff813c4149>] compat_SyS_socketcall+0x180/0x1e3
      [<ffffffff81488ea1>] cstar_dispatch+0x7/0x1e
      
      Signed-off-by: Ani Sinha <ani@arista.com>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      [bwh: Backported to 3.2: ipmr doesn't implement IPSTATS_MIB_OUTOCTETS]
      Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
    • RDS-TCP: Recover correctly from pskb_pull()/pksb_trim() failure in rds_tcp_data_recv · f114d937
      Sowmini Varadhan authored
      [ Upstream commit 8ce675ff ]
      
      Either of pskb_pull() or pskb_trim() may fail under low memory conditions.
      If rds_tcp_data_recv() ignores such failures, the application will
      receive corrupted data because the skb has not been correctly
      carved to the RDS datagram size.
      
      Avoid this by handling pskb_pull/pskb_trim failure in the same
      manner as the skb_clone failure: bail out of rds_tcp_data_recv(), and
      retry via the deferred call to rds_send_worker() that gets set up on
      ENOMEM from rds_tcp_read_sock()
      
      Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
      Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
    • irda: precedence bug in irlmp_seq_hb_idx() · a8ab3e63
      Dan Carpenter authored
      [ Upstream commit 50010c20 ]
      
      This is decrementing the pointer, instead of the value stored in the
      pointer.  KASan detects it as an out of bounds reference.
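
The precedence issue can be shown in isolation. In the sketch below, `dec_buggy` and `dec_fixed` are illustrative names rather than the irda code: postfix `--` binds tighter than unary `*`, so the first form steps the local pointer instead of the value it points to.

```c
/* Buggy: parsed as *(off--). The local pointer moves; the value is untouched. */
void dec_buggy(long *off)
{
	(void)*off--;
}

/* Fixed: parentheses make the decrement apply to the stored value. */
void dec_fixed(long *off)
{
	(*off)--;
}
```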
      
      Reported-by: "Berry Cheng 程君(成淼)" <chengmiao.cj@alibaba-inc.com>
      Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
    • fs: if a coredump already exists, unlink and recreate with O_EXCL · b208d27d
      Jann Horn authored
      commit fbb18169 upstream.
      
      It was possible for an attacking user to trick root (or another user) into
      writing his coredumps into an attacker-readable, pre-existing file using
      rename() or link(), causing the disclosure of secret data from the victim
      process' virtual memory.  Depending on the configuration, it was also
      possible to trick root into overwriting system files with coredumps.  Fix
      that issue by never writing coredumps into existing files.
      
      Requirements for the attack:
       - The attack only applies if the victim's process has a nonzero
         RLIMIT_CORE and is dumpable.
       - The attacker can trick the victim into coredumping into an
         attacker-writable directory D, either because the core_pattern is
         relative and the victim's cwd is attacker-writable or because an
         absolute core_pattern pointing to a world-writable directory is used.
       - The attacker has one of these:
        A: on a system with protected_hardlinks=0:
           execute access to a folder containing a victim-owned,
           attacker-readable file on the same partition as D, and the
           victim-owned file will be deleted before the main part of the attack
           takes place. (In practice, there are lots of files that fulfill
           this condition, e.g. entries in Debian's /var/lib/dpkg/info/.)
           This does not apply to most Linux systems because most distros set
           protected_hardlinks=1.
        B: on a system with protected_hardlinks=1:
           execute access to a folder containing a victim-owned,
           attacker-readable and attacker-writable file on the same partition
           as D, and the victim-owned file will be deleted before the main part
           of the attack takes place.
           (This seems to be uncommon.)
        C: on any system, independent of protected_hardlinks:
           write access to a non-sticky folder containing a victim-owned,
           attacker-readable file on the same partition as D
           (This seems to be uncommon.)
      
      The basic idea is that the attacker moves the victim-owned file to where
      he expects the victim process to dump its core.  The victim process dumps
      its core into the existing file, and the attacker reads the coredump from
      it.
      
      If the attacker can't move the file because he does not have write access
      to the containing directory, he can instead link the file to a directory
      he controls, then wait for the original link to the file to be deleted
      (because the kernel checks that the link count of the corefile is 1).
      
      A less reliable variant that requires D to be non-sticky works with link()
      and does not require deletion of the original link: link() the file into
      D, but then unlink() it directly before the kernel performs the link count
      check.
      
      On systems with protected_hardlinks=0, this variant allows an attacker to
      not only gain information from coredumps, but also clobber existing,
      victim-writable files with coredumps.  (This could theoretically lead to a
      privilege escalation.)
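
The "never write into an existing file" rule can be sketched with plain POSIX calls. This is a userspace analogue, not the kernel change itself; `open_fresh` is a hypothetical helper name. It removes any existing name first and then insists on creating a new file, so a file planted by another user is never reused.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Userspace analogue: remove any existing name first, then insist on
 * creating a brand-new file. Returns a file descriptor, or -1 on error.
 */
int open_fresh(const char *path)
{
	if (unlink(path) < 0 && errno != ENOENT)
		return -1;
	/* O_EXCL fails with EEXIST if the name reappeared in between. */
	return open(path, O_WRONLY | O_CREAT | O_EXCL, 0600);
}
```

If an attacker recreates the name between the unlink() and the open(), the open() fails instead of writing into the attacker's file.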
      
      Signed-off-by: Jann Horn <jann@thejh.net>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      [bwh: Backported to 3.2: adjust filename, context]
      Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
    • fs: make dumpable=2 require fully qualified path · 0677d4e0
      Kees Cook authored
      commit 9520628e upstream.
      
      When the suid_dumpable sysctl is set to "2", and there is no core dump
      pipe defined in the core_pattern sysctl, a local user can cause core files
      to be written to root-writable directories, potentially with
      user-controlled content.
      
      This means an admin can unknowingly reintroduce a variation of
      CVE-2006-2451, allowing local users to gain root privileges.
      
        $ cat /proc/sys/fs/suid_dumpable
        2
        $ cat /proc/sys/kernel/core_pattern
        core
        $ ulimit -c unlimited
        $ cd /
        $ ls -l core
        ls: cannot access core: No such file or directory
        $ touch core
        touch: cannot touch `core': Permission denied
        $ OHAI="evil-string-here" ping localhost >/dev/null 2>&1 &
        $ pid=$!
        $ sleep 1
        $ kill -SEGV $pid
        $ ls -l core
        -rw------- 1 root kees 458752 Jun 21 11:35 core
        $ sudo strings core | grep evil
        OHAI=evil-string-here
      
      While cron has been fixed to abort reading a file when there is any
      parse error, there are still other sensitive directories that will read
      any file present and skip unparsable lines.
      
      Instead of introducing a suid_dumpable=3 mode and breaking all users of
      mode 2, this only disables the unsafe portion of mode 2 (writing to disk
      via relative path).  Most users of mode 2 (e.g.  Chrome OS) already use
      a core dump pipe handler, so this change will not break them.  For the
      situations where a pipe handler is not defined but mode 2 is still
      active, crash dumps will only be written to fully qualified paths.  If a
      relative path is defined (e.g.  the default "core" pattern), dump
      attempts will trigger a printk yelling about the lack of a fully
      qualified path.
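
The policy described above reduces to a small decision function. The sketch below is a simplified model, not the kernel code: `core_path_allowed` is a hypothetical name, `dumpable_mode` stands in for the mode in effect for the dumping process, and a pattern starting with `|` is treated as a pipe handler.

```c
#include <stdbool.h>

/*
 * Decide whether a dump may proceed for the given core pattern.
 * A leading '|' denotes a pipe handler, which is not affected.
 * In mode 2, only fully qualified (absolute) paths may be written.
 */
bool core_path_allowed(int dumpable_mode, const char *pattern)
{
	if (pattern[0] == '|')
		return true;
	if (dumpable_mode == 2 && pattern[0] != '/')
		return false; /* relative path refused in mode 2 */
	return true;
}
```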
      
      Signed-off-by: Kees Cook <keescook@chromium.org>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Alan Cox <alan@linux.intel.com>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Doug Ledford <dledford@redhat.com>
      Cc: Serge Hallyn <serge.hallyn@canonical.com>
      Cc: James Morris <james.l.morris@oracle.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      [bwh: Backported to 3.2: adjust context]
      Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
      Reviewed-by: James Morris <james.l.morris@oracle.com>
    • binfmt_elf: Don't clobber passed executable's file header · beebd9fa
      Maciej W. Rozycki authored
      commit b582ef5c upstream.
      
      Do not clobber the buffer space passed from `search_binary_handler' and
      originally preloaded by `prepare_binprm' with the executable's file
      header by overwriting it with its interpreter's file header.  Instead
      keep the buffer space intact and directly use the data structure locally
      allocated for the interpreter's file header, fixing a bug introduced in
      2.1.14 with loadable module support (linux-mips.org commit beb11695
      [Import of Linux/MIPS 2.1.14], predating kernel.org repo's history).
      Adjust the amount of data read from the interpreter's file accordingly.
      
      This was not an issue before loadable module support, because back then
      `load_elf_binary' was executed only once for a given ELF executable,
      whether the function succeeded or failed.
      
      With loadable module support enabled, upon a failure of
      `load_elf_binary' -- which may for example be caused by architecture
      code rejecting an executable due to a missing hardware feature requested
      in the file header -- a module load is attempted and then the function
      reexecuted by `search_binary_handler'.  With the executable's file
      header replaced with its interpreter's file header the executable can
      then be erroneously accepted in this subsequent attempt.
      
      Signed-off-by: Maciej W. Rozycki <macro@imgtec.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
    • FS-Cache: Handle a write to the page immediately beyond the EOF marker · cf185723
      David Howells authored
      commit 102f4d90 upstream.
      
      Handle a write being requested to the page immediately beyond the EOF
      marker on a cache object.  Currently this gets an assertion failure in
      CacheFiles because the EOF marker is used there to encode information about
      a partial page at the EOF - which could lead to an unknown blank spot in
      the file if we extend the file over it.
      
      The problem is actually in fscache where we check the index of the page
      being written against store_limit.  store_limit is set to the number of
      pages that we're allowed to store by fscache_set_store_limit() - which
      means it's one more than the index of the last page we're allowed to store.
      The problem is that we permit writing to a page with an index _equal_ to
      the store limit - when we should reject that case.
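
The off-by-one can be stated as two comparisons. In this sketch, `may_store_buggy` and `may_store_fixed` are illustrative names, not fscache functions. Since the limit is a page count, valid indices run from 0 to store_limit - 1.

```c
#include <stdbool.h>

/* Buggy check: wrongly admits index == store_limit. */
bool may_store_buggy(unsigned long index, unsigned long store_limit)
{
	return index <= store_limit;
}

/* Fixed check: store_limit is a count, so the last valid index is one less. */
bool may_store_fixed(unsigned long index, unsigned long store_limit)
{
	return index < store_limit;
}
```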
      
      Whilst we're at it, change the triggered assertion in CacheFiles to just
      return -ENOBUFS instead.
      
      The assertion failure looks something like this:
      
      CacheFiles: Assertion failed
      1000 < 7b1 is false
      ------------[ cut here ]------------
      kernel BUG at fs/cachefiles/rdwr.c:962!
      ...
      RIP: 0010:[<ffffffffa02c9e83>]  [<ffffffffa02c9e83>] cachefiles_write_page+0x273/0x2d0 [cachefiles]
      
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      [bwh: Backported to 3.2: we don't have __kernel_write() so keep using the
       open-coded equivalent]
      Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
    • FS-Cache: Don't override netfs's primary_index if registering failed · 8f2746bb
      Kinglong Mee authored
      commit b130ed59 upstream.
      
      Only override netfs->primary_index when registration succeeds.
      
      Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      [bwh: Backported to 3.2: no n_active or flags fields in fscache_cookie]
      Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
    • FS-Cache: Increase reference of parent after registering, netfs success · bdb28a40
      Kinglong Mee authored
      commit 86108c2e upstream.
      
      If the netfs already exists, fscache should not increase the parent's
      usage and n_children counts; otherwise they are never decreased.
      
      v2: thanks to David's suggestion,
       move the parent reference increase to the success path
       use kmem_cache_free() to free primary_index directly
      
      v3: don't move "netfs->primary_index->parent = &fscache_fsdef_index;"
      
      Signed-off-by: Kinglong Mee <kinglongmee@gmail.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
    • KVM: svm: unconditionally intercept #DB · b42506c6
      Paolo Bonzini authored
      commit cbdb967a upstream.
      
      This is needed to avoid the possibility that the guest triggers
      an infinite stream of #DB exceptions (CVE-2015-8104).
      
      VMX is not affected: because it does not save DR6 in the VMCS,
      it already intercepts #DB unconditionally.
      
      Reported-by: Jan Beulich <jbeulich@suse.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      [bwh: Backported to 3.2, with thanks to Paolo:
       - update_db_bp_intercept() was called update_db_intercept()
       - The remaining call is in svm_guest_debug() rather than through svm_x86_ops]
      Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
    • net: fix a race in dst_release() · 1a513170
      Eric Dumazet authored
      commit d69bbf88 upstream.
      
      Only the CPU that sees the dst refcount drop to 0 can safely
      dereference dst->flags.
      
      Otherwise another CPU might already have freed the dst.
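
The rule above can be modelled in userspace with C11 atomics. This is a sketch under assumptions, not the kernel code: `struct obj`, `OBJ_NOCACHE` and `obj_release` are hypothetical names. The point it illustrates is ordering: the flags field is read only by the caller whose decrement brought the count to zero.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define OBJ_NOCACHE 0x1u /* hypothetical flag bit */

struct obj {
	atomic_int refcnt;
	unsigned int flags;
};

/*
 * Drop one reference. Returns true when the caller released the last
 * reference and the object carries OBJ_NOCACHE, i.e. when the caller
 * is the one that should free it.
 */
bool obj_release(struct obj *o)
{
	/* atomic_fetch_sub returns the value held before the subtraction. */
	int newref = atomic_fetch_sub(&o->refcnt, 1) - 1;

	if (newref != 0)
		return false; /* another holder may free `o` at any moment */
	/* Only the caller that saw the count reach zero reads flags. */
	return (o->flags & OBJ_NOCACHE) != 0;
}
```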
      
      Fixes: 27b75c95 ("net: avoid RCU for NOCACHE dst")
      Reported-by: Greg Thelen <gthelen@google.com>
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      [bwh: Backported to 3.2: adjust context]
      Signed-off-by: Ben Hutchings <ben@decadent.org.uk>