1. 27 Nov, 2015 40 commits
    • Maciej W. Rozycki's avatar
      binfmt_elf: Don't clobber passed executable's file header · beebd9fa
      Maciej W. Rozycki authored
      commit b582ef5c upstream.
      
      Do not clobber the buffer space passed from `search_binary_handler' and
      originally preloaded by `prepare_binprm' with the executable's file
      header by overwriting it with its interpreter's file header.  Instead
      keep the buffer space intact and directly use the data structure locally
      allocated for the interpreter's file header, fixing a bug introduced in
      2.1.14 with loadable module support (linux-mips.org commit beb11695
      [Import of Linux/MIPS 2.1.14], predating kernel.org repo's history).
      Adjust the amount of data read from the interpreter's file accordingly.
      
      This was not an issue before loadable module support, because back then
      `load_elf_binary' was executed only once for a given ELF executable,
      whether the function succeeded or failed.
      
      With loadable module support supported and enabled, upon a failure of
      `load_elf_binary' -- which may for example be caused by architecture
      code rejecting an executable due to a missing hardware feature requested
      in the file header -- a module load is attempted and then the function
      reexecuted by `search_binary_handler'.  With the executable's file
      header replaced with its interpreter's file header the executable can
      then be erroneously accepted in this subsequent attempt.
      Signed-off-by: default avatarMaciej W. Rozycki <macro@imgtec.com>
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      beebd9fa
    • David Howells's avatar
      FS-Cache: Handle a write to the page immediately beyond the EOF marker · cf185723
      David Howells authored
      commit 102f4d90 upstream.
      
      Handle a write being requested to the page immediately beyond the EOF
      marker on a cache object.  Currently this gets an assertion failure in
      CacheFiles because the EOF marker is used there to encode information about
      a partial page at the EOF - which could lead to an unknown blank spot in
      the file if we extend the file over it.
      
      The problem is actually in fscache where we check the index of the page
      being written against store_limit.  store_limit is set to the number of
      pages that we're allowed to store by fscache_set_store_limit() - which
      means it's one more than the index of the last page we're allowed to store.
      The problem is that we permit writing to a page with an index _equal_ to
      the store limit - when we should reject that case.
      
      Whilst we're at it, change the triggered assertion in CacheFiles to just
      return -ENOBUFS instead.
      
      The assertion failure looks something like this:
      
      CacheFiles: Assertion failed
      1000 < 7b1 is false
      ------------[ cut here ]------------
      kernel BUG at fs/cachefiles/rdwr.c:962!
      ...
      RIP: 0010:[<ffffffffa02c9e83>]  [<ffffffffa02c9e83>] cachefiles_write_page+0x273/0x2d0 [cachefiles]
      Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      [bwh: Backported to 3.2: we don't have __kernel_write() so keep using the
       open-coded equivalent]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      cf185723
    • Kinglong Mee's avatar
      FS-Cache: Don't override netfs's primary_index if registering failed · 8f2746bb
      Kinglong Mee authored
      commit b130ed59 upstream.
      
      Only override netfs->primary_index when registering success.
      Signed-off-by: default avatarKinglong Mee <kinglongmee@gmail.com>
      Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      [bwh: Backported to 3.2: no n_active or flags fields in fscache_cookie]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      8f2746bb
    • Kinglong Mee's avatar
      FS-Cache: Increase reference of parent after registering, netfs success · bdb28a40
      Kinglong Mee authored
      commit 86108c2e upstream.
      
      If netfs exist, fscache should not increase the reference of parent's
      usage and n_children, otherwise, never be decreased.
      
      v2: thanks David's suggest,
       move increasing reference of parent if success
       use kmem_cache_free() freeing primary_index directly
      
      v3: don't move "netfs->primary_index->parent = &fscache_fsdef_index;"
      Signed-off-by: default avatarKinglong Mee <kinglongmee@gmail.com>
      Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      bdb28a40
    • Paolo Bonzini's avatar
      KVM: svm: unconditionally intercept #DB · b42506c6
      Paolo Bonzini authored
      commit cbdb967a upstream.
      
      This is needed to avoid the possibility that the guest triggers
      an infinite stream of #DB exceptions (CVE-2015-8104).
      
      VMX is not affected: because it does not save DR6 in the VMCS,
      it already intercepts #DB unconditionally.
      Reported-by: default avatarJan Beulich <jbeulich@suse.com>
      Signed-off-by: default avatarPaolo Bonzini <pbonzini@redhat.com>
      [bwh: Backported to 3.2, with thanks to Paolo:
       - update_db_bp_intercept() was called update_db_intercept()
       - The remaining call is in svm_guest_debug() rather than through svm_x86_ops]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      b42506c6
    • Eric Dumazet's avatar
      net: fix a race in dst_release() · 1a513170
      Eric Dumazet authored
      commit d69bbf88 upstream.
      
      Only cpu seeing dst refcount going to 0 can safely
      dereference dst->flags.
      
      Otherwise an other cpu might already have freed the dst.
      
      Fixes: 27b75c95 ("net: avoid RCU for NOCACHE dst")
      Reported-by: default avatarGreg Thelen <gthelen@google.com>
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      [bwh: Backported to 3.2: adjust context]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      1a513170
    • Filipe Manana's avatar
      Btrfs: fix race when listing an inode's xattrs · 7abbc81b
      Filipe Manana authored
      commit f1cd1f0b upstream.
      
      When listing a inode's xattrs we have a time window where we race against
      a concurrent operation for adding a new hard link for our inode that makes
      us not return any xattr to user space. In order for this to happen, the
      first xattr of our inode needs to be at slot 0 of a leaf and the previous
      leaf must still have room for an inode ref (or extref) item, and this can
      happen because an inode's listxattrs callback does not lock the inode's
      i_mutex (nor does the VFS does it for us), but adding a hard link to an
      inode makes the VFS lock the inode's i_mutex before calling the inode's
      link callback.
      
      If we have the following leafs:
      
                     Leaf X (has N items)                    Leaf Y
      
       [ ... (257 INODE_ITEM 0) (257 INODE_REF 256) ]  [ (257 XATTR_ITEM 12345), ... ]
                 slot N - 2         slot N - 1              slot 0
      
      The race illustrated by the following sequence diagram is possible:
      
             CPU 1                                               CPU 2
      
        btrfs_listxattr()
      
          searches for key (257 XATTR_ITEM 0)
      
          gets path with path->nodes[0] == leaf X
          and path->slots[0] == N
      
          because path->slots[0] is >=
          btrfs_header_nritems(leaf X), it calls
          btrfs_next_leaf()
      
          btrfs_next_leaf()
            releases the path
      
                                                         adds key (257 INODE_REF 666)
                                                         to the end of leaf X (slot N),
                                                         and leaf X now has N + 1 items
      
            searches for the key (257 INODE_REF 256),
            with path->keep_locks == 1, because that
            is the last key it saw in leaf X before
            releasing the path
      
            ends up at leaf X again and it verifies
            that the key (257 INODE_REF 256) is no
            longer the last key in leaf X, so it
            returns with path->nodes[0] == leaf X
            and path->slots[0] == N, pointing to
            the new item with key (257 INODE_REF 666)
      
          btrfs_listxattr's loop iteration sees that
          the type of the key pointed by the path is
          different from the type BTRFS_XATTR_ITEM_KEY
          and so it breaks the loop and stops looking
          for more xattr items
            --> the application doesn't get any xattr
                listed for our inode
      
      So fix this by breaking the loop only if the key's type is greater than
      BTRFS_XATTR_ITEM_KEY and skip the current key if its type is smaller.
      Signed-off-by: default avatarFilipe Manana <fdmanana@suse.com>
      [bwh: Backported to 3.2: old code used the trivial accessor btrfs_key_type()]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      7abbc81b
    • Peter Oberparleiter's avatar
      scsi_sysfs: Fix queue_ramp_up_period return code · 74d26be3
      Peter Oberparleiter authored
      commit 863e02d0 upstream.
      
      Writing a number to /sys/bus/scsi/devices/<sdev>/queue_ramp_up_period
      returns the value of that number instead of the number of bytes written.
      This behavior can confuse programs expecting POSIX write() semantics.
      Fix this by returning the number of bytes written instead.
      Signed-off-by: default avatarPeter Oberparleiter <oberpar@linux.vnet.ibm.com>
      Reviewed-by: default avatarHannes Reinecke <hare@suse.de>
      Reviewed-by: default avatarMatthew R. Ochs <mrochs@linux.vnet.ibm.com>
      Reviewed-by: default avatarEwan D. Milne <emilne@redhat.com>
      Signed-off-by: default avatarMartin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      74d26be3
    • Peter Zijlstra's avatar
      perf: Fix inherited events vs. tracepoint filters · 4edb9551
      Peter Zijlstra authored
      commit b71b437e upstream.
      
      Arnaldo reported that tracepoint filters seem to misbehave (ie. not
      apply) on inherited events.
      
      The fix is obvious; filters are only set on the actual (parent)
      event, use the normal pattern of using this parent event for filters.
      This is safe because each child event has a reference to it.
      Reported-by: default avatarArnaldo Carvalho de Melo <acme@kernel.org>
      Tested-by: default avatarArnaldo Carvalho de Melo <acme@kernel.org>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frédéric Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Wang Nan <wangnan0@huawei.com>
      Link: http://lkml.kernel.org/r/20151102095051.GN17308@twins.programming.kicks-ass.netSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      4edb9551
    • Filipe Manana's avatar
      Btrfs: fix race leading to BUG_ON when running delalloc for nodatacow · f5daa8cd
      Filipe Manana authored
      commit 1d512cb7 upstream.
      
      If we are using the NO_HOLES feature, we have a tiny time window when
      running delalloc for a nodatacow inode where we can race with a concurrent
      link or xattr add operation leading to a BUG_ON.
      
      This happens because at run_delalloc_nocow() we end up casting a leaf item
      of type BTRFS_INODE_[REF|EXTREF]_KEY or of type BTRFS_XATTR_ITEM_KEY to a
      file extent item (struct btrfs_file_extent_item) and then analyse its
      extent type field, which won't match any of the expected extent types
      (values BTRFS_FILE_EXTENT_[REG|PREALLOC|INLINE]) and therefore trigger an
      explicit BUG_ON(1).
      
      The following sequence diagram shows how the race happens when running a
      no-cow dellaloc range [4K, 8K[ for inode 257 and we have the following
      neighbour leafs:
      
                   Leaf X (has N items)                    Leaf Y
      
       [ ... (257 INODE_ITEM 0) (257 INODE_REF 256) ]  [ (257 EXTENT_DATA 8192), ... ]
                    slot N - 2         slot N - 1              slot 0
      
       (Note the implicit hole for inode 257 regarding the [0, 8K[ range)
      
             CPU 1                                         CPU 2
      
       run_dealloc_nocow()
         btrfs_lookup_file_extent()
           --> searches for a key with value
               (257 EXTENT_DATA 4096) in the
               fs/subvol tree
           --> returns us a path with
               path->nodes[0] == leaf X and
               path->slots[0] == N
      
         because path->slots[0] is >=
         btrfs_header_nritems(leaf X), it
         calls btrfs_next_leaf()
      
         btrfs_next_leaf()
           --> releases the path
      
                                                    hard link added to our inode,
                                                    with key (257 INODE_REF 500)
                                                    added to the end of leaf X,
                                                    so leaf X now has N + 1 keys
      
           --> searches for the key
               (257 INODE_REF 256), because
               it was the last key in leaf X
               before it released the path,
               with path->keep_locks set to 1
      
           --> ends up at leaf X again and
               it verifies that the key
               (257 INODE_REF 256) is no longer
               the last key in the leaf, so it
               returns with path->nodes[0] ==
               leaf X and path->slots[0] == N,
               pointing to the new item with
               key (257 INODE_REF 500)
      
         the loop iteration of run_dealloc_nocow()
         does not break out the loop and continues
         because the key referenced in the path
         at path->nodes[0] and path->slots[0] is
         for inode 257, its type is < BTRFS_EXTENT_DATA_KEY
         and its offset (500) is less then our delalloc
         range's end (8192)
      
         the item pointed by the path, an inode reference item,
         is (incorrectly) interpreted as a file extent item and
         we get an invalid extent type, leading to the BUG_ON(1):
      
         if (extent_type == BTRFS_FILE_EXTENT_REG ||
            extent_type == BTRFS_FILE_EXTENT_PREALLOC) {
             (...)
         } else if (extent_type == BTRFS_FILE_EXTENT_INLINE) {
             (...)
         } else {
             BUG_ON(1)
         }
      
      The same can happen if a xattr is added concurrently and ends up having
      a key with an offset smaller then the delalloc's range end.
      
      So fix this by skipping keys with a type smaller than
      BTRFS_EXTENT_DATA_KEY.
      Signed-off-by: default avatarFilipe Manana <fdmanana@suse.com>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      f5daa8cd
    • Filipe Manana's avatar
      Btrfs: fix race leading to incorrect item deletion when dropping extents · b4f5eab6
      Filipe Manana authored
      commit aeafbf84 upstream.
      
      While running a stress test I got the following warning triggered:
      
        [191627.672810] ------------[ cut here ]------------
        [191627.673949] WARNING: CPU: 8 PID: 8447 at fs/btrfs/file.c:779 __btrfs_drop_extents+0x391/0xa50 [btrfs]()
        (...)
        [191627.701485] Call Trace:
        [191627.702037]  [<ffffffff8145f077>] dump_stack+0x4f/0x7b
        [191627.702992]  [<ffffffff81095de5>] ? console_unlock+0x356/0x3a2
        [191627.704091]  [<ffffffff8104b3b0>] warn_slowpath_common+0xa1/0xbb
        [191627.705380]  [<ffffffffa0664499>] ? __btrfs_drop_extents+0x391/0xa50 [btrfs]
        [191627.706637]  [<ffffffff8104b46d>] warn_slowpath_null+0x1a/0x1c
        [191627.707789]  [<ffffffffa0664499>] __btrfs_drop_extents+0x391/0xa50 [btrfs]
        [191627.709155]  [<ffffffff8115663c>] ? cache_alloc_debugcheck_after.isra.32+0x171/0x1d0
        [191627.712444]  [<ffffffff81155007>] ? kmemleak_alloc_recursive.constprop.40+0x16/0x18
        [191627.714162]  [<ffffffffa06570c9>] insert_reserved_file_extent.constprop.40+0x83/0x24e [btrfs]
        [191627.715887]  [<ffffffffa065422b>] ? start_transaction+0x3bb/0x610 [btrfs]
        [191627.717287]  [<ffffffffa065b604>] btrfs_finish_ordered_io+0x273/0x4e2 [btrfs]
        [191627.728865]  [<ffffffffa065b888>] finish_ordered_fn+0x15/0x17 [btrfs]
        [191627.730045]  [<ffffffffa067d688>] normal_work_helper+0x14c/0x32c [btrfs]
        [191627.731256]  [<ffffffffa067d96a>] btrfs_endio_write_helper+0x12/0x14 [btrfs]
        [191627.732661]  [<ffffffff81061119>] process_one_work+0x24c/0x4ae
        [191627.733822]  [<ffffffff810615b0>] worker_thread+0x206/0x2c2
        [191627.734857]  [<ffffffff810613aa>] ? process_scheduled_works+0x2f/0x2f
        [191627.736052]  [<ffffffff810613aa>] ? process_scheduled_works+0x2f/0x2f
        [191627.737349]  [<ffffffff810669a6>] kthread+0xef/0xf7
        [191627.738267]  [<ffffffff810f3b3a>] ? time_hardirqs_on+0x15/0x28
        [191627.739330]  [<ffffffff810668b7>] ? __kthread_parkme+0xad/0xad
        [191627.741976]  [<ffffffff81465592>] ret_from_fork+0x42/0x70
        [191627.743080]  [<ffffffff810668b7>] ? __kthread_parkme+0xad/0xad
        [191627.744206] ---[ end trace bbfddacb7aaada8d ]---
      
        $ cat -n fs/btrfs/file.c
        691  int __btrfs_drop_extents(struct btrfs_trans_handle *trans,
        (...)
        758                  btrfs_item_key_to_cpu(leaf, &key, path->slots[0]);
        759                  if (key.objectid > ino ||
        760                      key.type > BTRFS_EXTENT_DATA_KEY || key.offset >= end)
        761                          break;
        762
        763                  fi = btrfs_item_ptr(leaf, path->slots[0],
        764                                      struct btrfs_file_extent_item);
        765                  extent_type = btrfs_file_extent_type(leaf, fi);
        766
        767                  if (extent_type == BTRFS_FILE_EXTENT_REG ||
        768                      extent_type == BTRFS_FILE_EXTENT_PREALLOC) {
        (...)
        774                  } else if (extent_type == BTRFS_FILE_EXTENT_INLINE) {
        (...)
        778                  } else {
        779                          WARN_ON(1);
        780                          extent_end = search_start;
        781                  }
        (...)
      
      This happened because the item we were processing did not match a file
      extent item (its key type != BTRFS_EXTENT_DATA_KEY), and even on this
      case we cast the item to a struct btrfs_file_extent_item pointer and
      then find a type field value that does not match any of the expected
      values (BTRFS_FILE_EXTENT_[REG|PREALLOC|INLINE]). This scenario happens
      due to a tiny time window where a race can happen as exemplified below.
      For example, consider the following scenario where we're using the
      NO_HOLES feature and we have the following two neighbour leafs:
      
                     Leaf X (has N items)                    Leaf Y
      
      [ ... (257 INODE_ITEM 0) (257 INODE_REF 256) ]  [ (257 EXTENT_DATA 8192), ... ]
                slot N - 2         slot N - 1              slot 0
      
      Our inode 257 has an implicit hole in the range [0, 8K[ (implicit rather
      than explicit because NO_HOLES is enabled). Now if our inode has an
      ordered extent for the range [4K, 8K[ that is finishing, the following
      can happen:
      
                CPU 1                                       CPU 2
      
        btrfs_finish_ordered_io()
          insert_reserved_file_extent()
            __btrfs_drop_extents()
               Searches for the key
                (257 EXTENT_DATA 4096) through
                btrfs_lookup_file_extent()
      
               Key not found and we get a path where
               path->nodes[0] == leaf X and
               path->slots[0] == N
      
               Because path->slots[0] is >=
               btrfs_header_nritems(leaf X), we call
               btrfs_next_leaf()
      
               btrfs_next_leaf() releases the path
      
                                                        inserts key
                                                        (257 INODE_REF 4096)
                                                        at the end of leaf X,
                                                        leaf X now has N + 1 keys,
                                                        and the new key is at
                                                        slot N
      
               btrfs_next_leaf() searches for
               key (257 INODE_REF 256), with
               path->keep_locks set to 1,
               because it was the last key it
               saw in leaf X
      
                 finds it in leaf X again and
                 notices it's no longer the last
                 key of the leaf, so it returns 0
                 with path->nodes[0] == leaf X and
                 path->slots[0] == N (which is now
                 < btrfs_header_nritems(leaf X)),
                 pointing to the new key
                 (257 INODE_REF 4096)
      
               __btrfs_drop_extents() casts the
               item at path->nodes[0], slot
               path->slots[0], to a struct
               btrfs_file_extent_item - it does
               not skip keys for the target
               inode with a type less than
               BTRFS_EXTENT_DATA_KEY
               (BTRFS_INODE_REF_KEY < BTRFS_EXTENT_DATA_KEY)
      
               sees a bogus value for the type
               field triggering the WARN_ON in
               the trace shown above, and sets
               extent_end = search_start (4096)
      
               does the if-then-else logic to
               fixup 0 length extent items created
               by a past bug from hole punching:
      
                 if (extent_end == key.offset &&
                     extent_end >= search_start)
                     goto delete_extent_item;
      
               that evaluates to true and it ends
               up deleting the key pointed to by
               path->slots[0], (257 INODE_REF 4096),
               from leaf X
      
      The same could happen for example for a xattr that ends up having a key
      with an offset value that matches search_start (very unlikely but not
      impossible).
      
      So fix this by ensuring that keys smaller than BTRFS_EXTENT_DATA_KEY are
      skipped, never casted to struct btrfs_file_extent_item and never deleted
      by accident. Also protect against the unexpected case of getting a key
      for a lower inode number by skipping that key and issuing a warning.
      Signed-off-by: default avatarFilipe Manana <fdmanana@suse.com>
      [bwh: Backported to 3.2: drop use of ASSERT()]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      b4f5eab6
    • Borislav Petkov's avatar
      x86/cpu: Call verify_cpu() after having entered long mode too · 7f28be32
      Borislav Petkov authored
      commit 04633df0 upstream.
      
      When we get loaded by a 64-bit bootloader, kernel entry point is
      startup_64 in head_64.S. We don't trust any and all bootloaders because
      some will fiddle with CPU configuration so we go ahead and massage each
      CPU into sanity again.
      
      For example, some dell BIOSes have this XD disable feature which set
      IA32_MISC_ENABLE[34] and disable NX. This might be some dumb workaround
      for other OSes but Linux sure doesn't need it.
      
      A similar thing is present in the Surface 3 firmware - see
      https://bugzilla.kernel.org/show_bug.cgi?id=106051 - which sets this bit
      only on the BSP:
      
        # rdmsr -a 0x1a0
        400850089
        850089
        850089
        850089
      
      I know, right?!
      
      There's not even an off switch in there.
      
      So fix all those cases by sanitizing the 64-bit entry point too. For
      that, make verify_cpu() callable in 64-bit mode also.
      Requested-and-debugged-by: default avatar"H. Peter Anvin" <hpa@zytor.com>
      Reported-and-tested-by: default avatarBastien Nocera <bugzilla@hadess.net>
      Signed-off-by: default avatarBorislav Petkov <bp@suse.de>
      Cc: Matt Fleming <matt@codeblueprint.co.uk>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1446739076-21303-1-git-send-email-bp@alien8.deSigned-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      [bwh: Backported to 3.2: adjust context]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      7f28be32
    • Christoph Hellwig's avatar
      scsi: restart list search after unlock in scsi_remove_target · 67022ef1
      Christoph Hellwig authored
      commit 40998193 upstream.
      
      When dropping a lock while iterating a list we must restart the search
      as other threads could have manipulated the list under us.  Without this
      we can get stuck in an endless loop.  This bug was introduced by
      
      commit bc3f02a7
      Author: Dan Williams <djbw@fb.com>
      Date:   Tue Aug 28 22:12:10 2012 -0700
      
          [SCSI] scsi_remove_target: fix softlockup regression on hot remove
      
      Which was itself trying to fix a reported soft lockup issue
      
      http://thread.gmane.org/gmane.linux.kernel/1348679
      
      However, we believe even with this revert of the original patch, the soft
      lockup problem has been fixed by
      
      commit f2495e22
      Author: James Bottomley <JBottomley@Parallels.com>
      Date:   Tue Jan 21 07:01:41 2014 -0800
      
          [SCSI] dual scan thread bug fix
      
      Thanks go to Dan Williams <dan.j.williams@intel.com> for tracking all this
      prior history down.
      Reported-by: default avatarJohannes Thumshirn <jthumshirn@suse.de>
      Signed-off-by: default avatarChristoph Hellwig <hch@lst.de>
      Tested-by: default avatarJohannes Thumshirn <jthumshirn@suse.de>
      Reviewed-by: default avatarJohannes Thumshirn <jthumshirn@suse.de>
      Fixes: bc3f02a7Signed-off-by: default avatarJames Bottomley <JBottomley@Odin.com>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      67022ef1
    • Stefan Richter's avatar
      firewire: ohci: fix JMicron JMB38x IT context discovery · 785381e9
      Stefan Richter authored
      commit 100ceb66 upstream.
      
      Reported by Clifford and Craig for JMicron OHCI-1394 + SDHCI combo
      controllers:  Often or even most of the time, the controller is
      initialized with the message "added OHCI v1.10 device as card 0, 4 IR +
      0 IT contexts, quirks 0x10".  With 0 isochronous transmit DMA contexts
      (IT contexts), applications like audio output are impossible.
      
      However, OHCI-1394 demands that at least 4 IT contexts are implemented
      by the link layer controller, and indeed JMicron JMB38x do implement
      four of them.  Only their IsoXmitIntMask register is unreliable at early
      access.
      
      With my own JMB381 single function controller I found:
        - I can reproduce the problem with a lower probability than Craig's.
        - If I put a loop around the section which clears and reads
          IsoXmitIntMask, then either the first or the second attempt will
          return the correct initial mask of 0x0000000f.  I never encountered
          a case of needing more than a second attempt.
        - Consequently, if I put a dummy reg_read(...IsoXmitIntMaskSet)
          before the first write, the subsequent read will return the correct
          result.
        - If I merely ignore a wrong read result and force the known real
          result, later isochronous transmit DMA usage works just fine.
      
      So let's just fix this chip bug up by the latter method.  Tested with
      JMB381 on kernel 3.13 and 4.3.
      
      Since OHCI-1394 generally requires 4 IT contexts at a minium, this
      workaround is simply applied whenever the initial read of IsoXmitIntMask
      returns 0, regardless whether it's a JMicron chip or not.  I never heard
      of this issue together with any other chip though.
      
      I am not 100% sure that this fix works on the OHCI-1394 part of JMB380
      and JMB388 combo controllers exactly the same as on the JMB381 single-
      function controller, but so far I haven't had a chance to let an owner
      of a combo chip run a patched kernel.
      
      Strangely enough, IsoRecvIntMask is always reported correctly, even
      though it is probed right before IsoXmitIntMask.
      
      Reported-by: Clifford Dunn
      Reported-by: default avatarCraig Moore <craig.moore@qenos.com>
      Signed-off-by: default avatarStefan Richter <stefanr@s5r6.in-berlin.de>
      [bwh: Backported to 3.2: log with fw_notify() instead of ohci_notice()]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      785381e9
    • Takashi Iwai's avatar
      ALSA: hda - Apply pin fixup for HP ProBook 6550b · 4405adbd
      Takashi Iwai authored
      commit c932b98c upstream.
      
      HP ProBook 6550b needs the same pin fixup applied to other HP B-series
      laptops with docks for making its headphone and dock headphone jacks
      working properly.  We just need to add the codec SSID to the list.
      
      Bugzilla: https://bugzilla.kernel.org/attachment.cgi?id=191971Signed-off-by: default avatarTakashi Iwai <tiwai@suse.de>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      4405adbd
    • Michal Kubeček's avatar
      ipv6: fix tunnel error handling · 29fbb788
      Michal Kubeček authored
      commit ebac62fe upstream.
      
      Both tunnel6_protocol and tunnel46_protocol share the same error
      handler, tunnel6_err(), which traverses through tunnel6_handlers list.
      For ipip6 tunnels, we need to traverse tunnel46_handlers as we do e.g.
      in tunnel46_rcv(). Current code can generate an ICMPv6 error message
      with an IPv4 packet embedded in it.
      
      Fixes: 73d605d1 ("[IPSEC]: changing API of xfrm6_tunnel_register")
      Signed-off-by: default avatarMichal Kubecek <mkubecek@suse.cz>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      29fbb788
    • libin's avatar
      recordmcount: Fix endianness handling bug for nop_mcount · 6b8120db
      libin authored
      commit c84da8b9 upstream.
      
      In nop_mcount, shdr->sh_offset and welp->r_offset should handle
      endianness properly, otherwise it will trigger Segmentation fault
      if the recordmcount main and file.o have different endianness.
      
      Link: http://lkml.kernel.org/r/563806C7.7070606@huawei.comSigned-off-by: default avatarLi Bin <huawei.libin@huawei.com>
      Signed-off-by: default avatarSteven Rostedt <rostedt@goodmis.org>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      6b8120db
    • sumit.saxena@avagotech.com's avatar
      megaraid_sas : SMAP restriction--do not access user memory from IOCTL code · f7e70bad
      sumit.saxena@avagotech.com authored
      commit 323c4a02 upstream.
      
      This is an issue on SMAP enabled CPUs and 32 bit apps running on 64 bit
      OS. Do not access user memory from kernel code. The SMAP bit restricts
      accessing user memory from kernel code.
      Signed-off-by: default avatarSumit Saxena <sumit.saxena@avagotech.com>
      Signed-off-by: default avatarKashyap Desai <kashyap.desai@avagotech.com>
      Reviewed-by: default avatarTomas Henzl <thenzl@redhat.com>
      Signed-off-by: default avatarMartin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      f7e70bad
    • Herbert Xu's avatar
      crypto: algif_hash - Only export and import on sockets with data · bd65107f
      Herbert Xu authored
      commit 4afa5f96 upstream.
      
      The hash_accept call fails to work on sockets that have not received
      any data.  For some algorithm implementations it may cause crashes.
      
      This patch fixes this by ensuring that we only export and import on
      sockets that have received data.
      Reported-by: default avatarHarsh Jain <harshjain.prof@gmail.com>
      Signed-off-by: default avatarHerbert Xu <herbert@gondor.apana.org.au>
      Tested-by: default avatarStephan Mueller <smueller@chronox.de>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      bd65107f
    • Brian Norris's avatar
      mtd: blkdevs: fix potential deadlock + lockdep warnings · 419608b9
      Brian Norris authored
      commit f3c63795 upstream.
      
      Commit 073db4a5 ("mtd: fix: avoid race condition when accessing
      mtd->usecount") fixed a race condition but due to poor ordering of the
      mutex acquisition, introduced a potential deadlock.
      
      The deadlock can occur, for example, when rmmod'ing the m25p80 module, which
      will delete one or more MTDs, along with any corresponding mtdblock
      devices. This could potentially race with an acquisition of the block
      device as follows.
      
       -> blktrans_open()
          ->  mutex_lock(&dev->lock);
          ->  mutex_lock(&mtd_table_mutex);
      
       -> del_mtd_device()
          ->  mutex_lock(&mtd_table_mutex);
          ->  blktrans_notify_remove() -> del_mtd_blktrans_dev()
             ->  mutex_lock(&dev->lock);
      
      This is a classic (potential) ABBA deadlock, which can be fixed by
      making the A->B ordering consistent everywhere. There was no real
      purpose to the ordering in the original patch, AFAIR, so this shouldn't
      be a problem. This ordering was actually already present in
      del_mtd_blktrans_dev(), for one, where the function tried to ensure that
      its caller already held mtd_table_mutex before it acquired &dev->lock:
      
              if (mutex_trylock(&mtd_table_mutex)) {
                      mutex_unlock(&mtd_table_mutex);
                      BUG();
              }
      
      So, reverse the ordering of acquisition of &dev->lock and &mtd_table_mutex so
      we always acquire mtd_table_mutex first.
      
      Snippets of the lockdep output follow:
      
        # modprobe -r m25p80
        [   53.419251]
        [   53.420838] ======================================================
        [   53.427300] [ INFO: possible circular locking dependency detected ]
        [   53.433865] 4.3.0-rc6 #96 Not tainted
        [   53.437686] -------------------------------------------------------
        [   53.444220] modprobe/372 is trying to acquire lock:
        [   53.449320]  (&new->lock){+.+...}, at: [<c043fe4c>] del_mtd_blktrans_dev+0x80/0xdc
        [   53.457271]
        [   53.457271] but task is already holding lock:
        [   53.463372]  (mtd_table_mutex){+.+.+.}, at: [<c0439994>] del_mtd_device+0x18/0x100
        [   53.471321]
        [   53.471321] which lock already depends on the new lock.
        [   53.471321]
        [   53.479856]
        [   53.479856] the existing dependency chain (in reverse order) is:
        [   53.487660]
        -> #1 (mtd_table_mutex){+.+.+.}:
        [   53.492331]        [<c043fc5c>] blktrans_open+0x34/0x1a4
        [   53.497879]        [<c01afce0>] __blkdev_get+0xc4/0x3b0
        [   53.503364]        [<c01b0bb8>] blkdev_get+0x108/0x320
        [   53.508743]        [<c01713c0>] do_dentry_open+0x218/0x314
        [   53.514496]        [<c0180454>] path_openat+0x4c0/0xf9c
        [   53.519959]        [<c0182044>] do_filp_open+0x5c/0xc0
        [   53.525336]        [<c0172758>] do_sys_open+0xfc/0x1cc
        [   53.530716]        [<c000f740>] ret_fast_syscall+0x0/0x1c
        [   53.536375]
        -> #0 (&new->lock){+.+...}:
        [   53.540587]        [<c063f124>] mutex_lock_nested+0x38/0x3cc
        [   53.546504]        [<c043fe4c>] del_mtd_blktrans_dev+0x80/0xdc
        [   53.552606]        [<c043f164>] blktrans_notify_remove+0x7c/0x84
        [   53.558891]        [<c04399f0>] del_mtd_device+0x74/0x100
        [   53.564544]        [<c043c670>] del_mtd_partitions+0x80/0xc8
        [   53.570451]        [<c0439aa0>] mtd_device_unregister+0x24/0x48
        [   53.576637]        [<c046ce6c>] spi_drv_remove+0x1c/0x34
        [   53.582207]        [<c03de0f0>] __device_release_driver+0x88/0x114
        [   53.588663]        [<c03de19c>] device_release_driver+0x20/0x2c
        [   53.594843]        [<c03dd9e8>] bus_remove_device+0xd8/0x108
        [   53.600748]        [<c03dacc0>] device_del+0x10c/0x210
        [   53.606127]        [<c03dadd0>] device_unregister+0xc/0x20
        [   53.611849]        [<c046d878>] __unregister+0x10/0x20
        [   53.617211]        [<c03da868>] device_for_each_child+0x50/0x7c
        [   53.623387]        [<c046eae8>] spi_unregister_master+0x58/0x8c
        [   53.629578]        [<c03e12f0>] release_nodes+0x15c/0x1c8
        [   53.635223]        [<c03de0f8>] __device_release_driver+0x90/0x114
        [   53.641689]        [<c03de900>] driver_detach+0xb4/0xb8
        [   53.647147]        [<c03ddc78>] bus_remove_driver+0x4c/0xa0
        [   53.652970]        [<c00cab50>] SyS_delete_module+0x11c/0x1e4
        [   53.658976]        [<c000f740>] ret_fast_syscall+0x0/0x1c
        [   53.664621]
        [   53.664621] other info that might help us debug this:
        [   53.664621]
        [   53.672979]  Possible unsafe locking scenario:
        [   53.672979]
        [   53.679169]        CPU0                    CPU1
        [   53.683900]        ----                    ----
        [   53.688633]   lock(mtd_table_mutex);
        [   53.692383]                                lock(&new->lock);
        [   53.698306]                                lock(mtd_table_mutex);
        [   53.704658]   lock(&new->lock);
        [   53.707946]
        [   53.707946]  *** DEADLOCK ***
      
      Fixes: 073db4a5 ("mtd: fix: avoid race condition when accessing mtd->usecount")
      Reported-by: default avatarFelipe Balbi <balbi@ti.com>
      Tested-by: default avatarFelipe Balbi <balbi@ti.com>
      Signed-off-by: default avatarBrian Norris <computersforpeace@gmail.com>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      419608b9
    • Marek Vasut's avatar
      can: Use correct type in sizeof() in nla_put() · 78f78e5a
      Marek Vasut authored
      commit 562b103a upstream.
      
      The sizeof() is invoked on an incorrect variable, likely due to some
      copy-paste error, and this might result in memory corruption. Fix this.
      Signed-off-by: default avatarMarek Vasut <marex@denx.de>
      Cc: Wolfgang Grandegger <wg@grandegger.com>
      Cc: netdev@vger.kernel.org
      Signed-off-by: default avatarMarc Kleine-Budde <mkl@pengutronix.de>
      [bwh: Backported to 3.2:
       - Keep using the old NLA_PUT macro
       - Adjust context]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      78f78e5a
    • sumit.saxena@avagotech.com's avatar
      megaraid_sas: Do not use PAGE_SIZE for max_sectors · 5e9c1dca
      sumit.saxena@avagotech.com authored
      commit 357ae967 upstream.
      
      Do not use PAGE_SIZE marco to calculate max_sectors per I/O
      request. Driver code assumes PAGE_SIZE will be always 4096 which can
      lead to wrongly calculated value if PAGE_SIZE is not 4096. This issue
      was reported in Ubuntu Bugzilla Bug #1475166.
      Signed-off-by: default avatarSumit Saxena <sumit.saxena@avagotech.com>
      Signed-off-by: default avatarKashyap Desai <kashyap.desai@avagotech.com>
      Reviewed-by: default avatarTomas Henzl <thenzl@redhat.com>
      Reviewed-by: default avatarMartin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: default avatarMartin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      5e9c1dca
    • Takashi Iwai's avatar
      ALSA: hda - Disable 64bit address for Creative HDA controllers · f3b2d51d
      Takashi Iwai authored
      commit cadd16ea upstream.
      
      We've had many reports that some Creative sound cards with CA0132
      don't work well.  Some reported that it starts working after reloading
      the module, while some reported it starts working when a 32bit kernel
      is used.  All these facts seem implying that the chip fails to
      communicate when the buffer is located in 64bit address.
      
      This patch addresses these issues by just adding AZX_DCAPS_NO_64BIT
      flag to the corresponding PCI entries.  I casually had a chance to
      test an SB Recon3D board, and indeed this seems helping.
      
      Although this hasn't been tested on all Creative devices, it's safer
      to assume that this restriction applies to the rest of them, too.  So
      the flag is applied to all Creative entries.
      Signed-off-by: default avatarTakashi Iwai <tiwai@suse.de>
      [bwh: Backported to 3.2: drop the change to AZX_DCAPS_PRESET_CTHDA]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      f3b2d51d
    • Ralf Baechle's avatar
      MIPS: atomic: Fix comment describing atomic64_add_unless's return value. · 3ce1b9be
      Ralf Baechle authored
      commit f25319d2 upstream.
      Signed-off-by: default avatarRalf Baechle <ralf@linux-mips.org>
      Fixes: f24219b4
      (cherry picked from commit f0a232cde7be18a207fd057dd79bbac8a0a45dec)
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      3ce1b9be
    • Chen Yu's avatar
      ACPI: Use correct IRQ when uninstalling ACPI interrupt handler · bf177657
      Chen Yu authored
      commit 49e4b843 upstream.
      
      Currently when the system is trying to uninstall the ACPI interrupt
      handler, it uses acpi_gbl_FADT.sci_interrupt as the IRQ number.
      However, the IRQ number that the ACPI interrupt handled is installed
      for comes from acpi_gsi_to_irq() and that is the number that should
      be used for the handler removal.
      
      Fix this problem by using the mapped IRQ returned from acpi_gsi_to_irq()
      as appropriate.
      Acked-by: default avatarLv Zheng <lv.zheng@intel.com>
      Signed-off-by: default avatarChen Yu <yu.c.chen@intel.com>
      Signed-off-by: default avatarRafael J. Wysocki <rafael.j.wysocki@intel.com>
      [bwh: Backported to 3.2: adjust context]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      bf177657
    • Larry Finger's avatar
      staging: rtl8712: Add device ID for Sitecom WLA2100 · e936d80d
      Larry Finger authored
      commit 1e6e6328 upstream.
      
      This adds the USB ID for the Sitecom WLA2100. The Windows 10 inf file
      was checked to verify that the addition is correct.
      Reported-by: default avatarFrans van de Wiel <fvdw@fvdw.eu>
      Signed-off-by: default avatarLarry Finger <Larry.Finger@lwfinger.net>
      Cc: Frans van de Wiel <fvdw@fvdw.eu>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      e936d80d
    • Dmitry Tunin's avatar
      Bluetooth: ath3k: Add support of AR3012 0cf3:817b device · 2b42ac30
      Dmitry Tunin authored
      commit 18e0afab upstream.
      
      T: Bus=04 Lev=02 Prnt=02 Port=04 Cnt=01 Dev#= 3 Spd=12 MxCh= 0
      D: Ver= 1.10 Cls=e0(wlcon) Sub=01 Prot=01 MxPS=64 #Cfgs= 1
      P: Vendor=0cf3 ProdID=817b Rev=00.02
      C: #Ifs= 2 Cfg#= 1 Atr=e0 MxPwr=100mA
      I: If#= 0 Alt= 0 #EPs= 3 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
      I: If#= 1 Alt= 0 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
      
      BugLink: https://bugs.launchpad.net/bugs/1506615Signed-off-by: default avatarDmitry Tunin <hanipouspilot@gmail.com>
      Signed-off-by: default avatarMarcel Holtmann <marcel@holtmann.org>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      2b42ac30
    • Dmitry Tunin's avatar
      Bluetooth: ath3k: Add new AR3012 0930:021c id · 467b9ca3
      Dmitry Tunin authored
      commit cd355ff0 upstream.
      
      This adapter works with the existing linux-firmware.
      
      T:  Bus=01 Lev=01 Prnt=01 Port=03 Cnt=02 Dev#=  3 Spd=12  MxCh= 0
      D:  Ver= 1.10 Cls=e0(wlcon) Sub=01 Prot=01 MxPS=64 #Cfgs=  1
      P:  Vendor=0930 ProdID=021c Rev=00.01
      C:  #Ifs= 2 Cfg#= 1 Atr=e0 MxPwr=100mA
      I:  If#= 0 Alt= 0 #EPs= 3 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
      I:  If#= 1 Alt= 0 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb
      
      BugLink: https://bugs.launchpad.net/bugs/1502781Signed-off-by: default avatarDmitry Tunin <hanipouspilot@gmail.com>
      Signed-off-by: default avatarMarcel Holtmann <marcel@holtmann.org>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      467b9ca3
    • Daeho Jeong's avatar
      ext4, jbd2: ensure entering into panic after recording an error in superblock · 08c89a61
      Daeho Jeong authored
      commit 4327ba52 upstream.
      
      If a EXT4 filesystem utilizes JBD2 journaling and an error occurs, the
      journaling will be aborted first and the error number will be recorded
      into JBD2 superblock and, finally, the system will enter into the
      panic state in "errors=panic" option.  But, in the rare case, this
      sequence is little twisted like the below figure and it will happen
      that the system enters into panic state, which means the system reset
      in mobile environment, before completion of recording an error in the
      journal superblock. In this case, e2fsck cannot recognize that the
      filesystem failure occurred in the previous run and the corruption
      wouldn't be fixed.
      
      Task A                        Task B
      ext4_handle_error()
      -> jbd2_journal_abort()
        -> __journal_abort_soft()
          -> __jbd2_journal_abort_hard()
          | -> journal->j_flags |= JBD2_ABORT;
          |
          |                         __ext4_abort()
          |                         -> jbd2_journal_abort()
          |                         | -> __journal_abort_soft()
          |                         |   -> if (journal->j_flags & JBD2_ABORT)
          |                         |           return;
          |                         -> panic()
          |
          -> jbd2_journal_update_sb_errno()
      Tested-by: default avatarHobin Woo <hobin.woo@samsung.com>
      Signed-off-by: default avatarDaeho Jeong <daeho.jeong@samsung.com>
      Signed-off-by: default avatarTheodore Ts'o <tytso@mit.edu>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      08c89a61
    • Filipe Manana's avatar
      Btrfs: fix truncation of compressed and inlined extents · 2a97932f
      Filipe Manana authored
      commit 0305cd5f upstream.
      
      When truncating a file to a smaller size which consists of an inline
      extent that is compressed, we did not discard (or made unusable) the
      data between the new file size and the old file size, wasting metadata
      space and allowing for the truncated data to be leaked and the data
      corruption/loss mentioned below.
      We were also not correctly decrementing the number of bytes used by the
      inode, we were setting it to zero, giving a wrong report for callers of
      the stat(2) syscall. The fsck tool also reported an error about a mismatch
      between the nbytes of the file versus the real space used by the file.
      
      Now because we weren't discarding the truncated region of the file, it
      was possible for a caller of the clone ioctl to actually read the data
      that was truncated, allowing for a security breach without requiring root
      access to the system, using only standard filesystem operations. The
      scenario is the following:
      
         1) User A creates a file which consists of an inline and compressed
            extent with a size of 2000 bytes - the file is not accessible to
            any other users (no read, write or execution permission for anyone
            else);
      
         2) The user truncates the file to a size of 1000 bytes;
      
         3) User A makes the file world readable;
      
         4) User B creates a file consisting of an inline extent of 2000 bytes;
      
         5) User B issues a clone operation from user A's file into its own
            file (using a length argument of 0, clone the whole range);
      
         6) User B now gets to see the 1000 bytes that user A truncated from
            its file before it made its file world readbale. User B also lost
            the bytes in the range [1000, 2000[ bytes from its own file, but
            that might be ok if his/her intention was reading stale data from
            user A that was never supposed to be public.
      
      Note that this contrasts with the case where we truncate a file from 2000
      bytes to 1000 bytes and then truncate it back from 1000 to 2000 bytes. In
      this case reading any byte from the range [1000, 2000[ will return a value
      of 0x00, instead of the original data.
      
      This problem exists since the clone ioctl was added and happens both with
      and without my recent data loss and file corruption fixes for the clone
      ioctl (patch "Btrfs: fix file corruption and data loss after cloning
      inline extents").
      
      So fix this by truncating the compressed inline extents as we do for the
      non-compressed case, which involves decompressing, if the data isn't already
      in the page cache, compressing the truncated version of the extent, writing
      the compressed content into the inline extent and then truncate it.
      
      The following test case for fstests reproduces the problem. In order for
      the test to pass both this fix and my previous fix for the clone ioctl
      that forbids cloning a smaller inline extent into a larger one,
      which is titled "Btrfs: fix file corruption and data loss after cloning
      inline extents", are needed. Without that other fix the test fails in a
      different way that does not leak the truncated data, instead part of
      destination file gets replaced with zeroes (because the destination file
      has a larger inline extent than the source).
      
        seq=`basename $0`
        seqres=$RESULT_DIR/$seq
        echo "QA output created by $seq"
        tmp=/tmp/$$
        status=1	# failure is the default!
        trap "_cleanup; exit \$status" 0 1 2 3 15
      
        _cleanup()
        {
            rm -f $tmp.*
        }
      
        # get standard environment, filters and checks
        . ./common/rc
        . ./common/filter
      
        # real QA test starts here
        _need_to_be_root
        _supported_fs btrfs
        _supported_os Linux
        _require_scratch
        _require_cloner
      
        rm -f $seqres.full
      
        _scratch_mkfs >>$seqres.full 2>&1
        _scratch_mount "-o compress"
      
        # Create our test files. File foo is going to be the source of a clone operation
        # and consists of a single inline extent with an uncompressed size of 512 bytes,
        # while file bar consists of a single inline extent with an uncompressed size of
        # 256 bytes. For our test's purpose, it's important that file bar has an inline
        # extent with a size smaller than foo's inline extent.
        $XFS_IO_PROG -f -c "pwrite -S 0xa1 0 128"   \
                -c "pwrite -S 0x2a 128 384" \
                $SCRATCH_MNT/foo | _filter_xfs_io
        $XFS_IO_PROG -f -c "pwrite -S 0xbb 0 256" $SCRATCH_MNT/bar | _filter_xfs_io
      
        # Now durably persist all metadata and data. We do this to make sure that we get
        # on disk an inline extent with a size of 512 bytes for file foo.
        sync
      
        # Now truncate our file foo to a smaller size. Because it consists of a
        # compressed and inline extent, btrfs did not shrink the inline extent to the
        # new size (if the extent was not compressed, btrfs would shrink it to 128
        # bytes), it only updates the inode's i_size to 128 bytes.
        $XFS_IO_PROG -c "truncate 128" $SCRATCH_MNT/foo
      
        # Now clone foo's inline extent into bar.
        # This clone operation should fail with errno EOPNOTSUPP because the source
        # file consists only of an inline extent and the file's size is smaller than
        # the inline extent of the destination (128 bytes < 256 bytes). However the
        # clone ioctl was not prepared to deal with a file that has a size smaller
        # than the size of its inline extent (something that happens only for compressed
        # inline extents), resulting in copying the full inline extent from the source
        # file into the destination file.
        #
        # Note that btrfs' clone operation for inline extents consists of removing the
        # inline extent from the destination inode and copy the inline extent from the
        # source inode into the destination inode, meaning that if the destination
        # inode's inline extent is larger (N bytes) than the source inode's inline
        # extent (M bytes), some bytes (N - M bytes) will be lost from the destination
        # file. Btrfs could copy the source inline extent's data into the destination's
        # inline extent so that we would not lose any data, but that's currently not
        # done due to the complexity that would be needed to deal with such cases
        # (specially when one or both extents are compressed), returning EOPNOTSUPP, as
        # it's normally not a very common case to clone very small files (only case
        # where we get inline extents) and copying inline extents does not save any
        # space (unlike for normal, non-inlined extents).
        $CLONER_PROG -s 0 -d 0 -l 0 $SCRATCH_MNT/foo $SCRATCH_MNT/bar
      
        # Now because the above clone operation used to succeed, and due to foo's inline
        # extent not being shinked by the truncate operation, our file bar got the whole
        # inline extent copied from foo, making us lose the last 128 bytes from bar
        # which got replaced by the bytes in range [128, 256[ from foo before foo was
        # truncated - in other words, data loss from bar and being able to read old and
        # stale data from foo that should not be possible to read anymore through normal
        # filesystem operations. Contrast with the case where we truncate a file from a
        # size N to a smaller size M, truncate it back to size N and then read the range
        # [M, N[, we should always get the value 0x00 for all the bytes in that range.
      
        # We expected the clone operation to fail with errno EOPNOTSUPP and therefore
        # not modify our file's bar data/metadata. So its content should be 256 bytes
        # long with all bytes having the value 0xbb.
        #
        # Without the btrfs bug fix, the clone operation succeeded and resulted in
        # leaking truncated data from foo, the bytes that belonged to its range
        # [128, 256[, and losing data from bar in that same range. So reading the
        # file gave us the following content:
        #
        # 0000000 a1 a1 a1 a1 a1 a1 a1 a1 a1 a1 a1 a1 a1 a1 a1 a1
        # *
        # 0000200 2a 2a 2a 2a 2a 2a 2a 2a 2a 2a 2a 2a 2a 2a 2a 2a
        # *
        # 0000400
        echo "File bar's content after the clone operation:"
        od -t x1 $SCRATCH_MNT/bar
      
        # Also because the foo's inline extent was not shrunk by the truncate
        # operation, btrfs' fsck, which is run by the fstests framework everytime a
        # test completes, failed reporting the following error:
        #
        #  root 5 inode 257 errors 400, nbytes wrong
      
        status=0
        exit
      Signed-off-by: default avatarFilipe Manana <fdmanana@suse.com>
      [bwh: Backported to 3.2:
       - Adjust parameters to btrfs_truncate_page() and btrfs_truncate_item()
       - Pass transaction pointer into truncate_inline_extent()
       - Add prototype of btrfs_truncate_page()
       - s/test_bit(BTRFS_ROOT_REF_COWS, &root->state)/root->ref_cows/
       - Keep using BUG_ON() for other error cases, as there is no
         btrfs_abort_transaction()
       - Adjust context]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      2a97932f
    • Chris Mason's avatar
      Btrfs: don't use ram_bytes for uncompressed inline items · 85c36cd4
      Chris Mason authored
      commit 514ac8ad upstream.
      
      If we truncate an uncompressed inline item, ram_bytes isn't updated to reflect
      the new size.  The fixe uses the size directly from the item header when
      reading uncompressed inlines, and also fixes truncate to update the
      size as it goes.
      Reported-by: default avatarJens Axboe <axboe@fb.com>
      Signed-off-by: default avatarChris Mason <clm@fb.com>
      [bwh: Backported to 3.2:
       - Don't use btrfs_map_token API
       - There are fewer callers of btrfs_file_extent_inline_len() to change
       - Adjust context]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      85c36cd4
    • Arnd Bergmann's avatar
      ARM: pxa: remove incorrect __init annotation on pxa27x_set_pwrmode · df2ef64a
      Arnd Bergmann authored
      commit 54c09889 upstream.
      
      The z2 machine calls pxa27x_set_pwrmode() in order to power off
      the machine, but this function gets discarded early at boot because
      it is marked __init, as pointed out by kbuild:
      
      WARNING: vmlinux.o(.text+0x145c4): Section mismatch in reference from the function z2_power_off() to the function .init.text:pxa27x_set_pwrmode()
      The function z2_power_off() references
      the function __init pxa27x_set_pwrmode().
      This is often because z2_power_off lacks a __init
      annotation or the annotation of pxa27x_set_pwrmode is wrong.
      
      This removes the __init section modifier to fix rebooting and the
      build error.
      Signed-off-by: default avatarArnd Bergmann <arnd@arndb.de>
      Fixes: ba4a90a6 ("ARM: pxa/z2: fix building error of pxa27x_cpu_suspend() no longer available")
      Signed-off-by: default avatarRobert Jarzmik <robert.jarzmik@free.fr>
      [bwh: Backported to 3.2: adjust context]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      df2ef64a
    • David Woodhouse's avatar
      iommu/vt-d: Fix ATSR handling for Root-Complex integrated endpoints · a9f83a62
      David Woodhouse authored
      commit d14053b3 upstream.
      
      The VT-d specification says that "Software must enable ATS on endpoint
      devices behind a Root Port only if the Root Port is reported as
      supporting ATS transactions."
      
      We walk up the tree to find a Root Port, but for integrated devices we
      don't find one — we get to the host bridge. In that case we *should*
      allow ATS. Currently we don't, which means that we are incorrectly
      failing to use ATS for the integrated graphics. Fix that.
      
      We should never break out of this loop "naturally" with bus==NULL,
      since we'll always find bridge==NULL in that case (and now return 1).
      
      So remove the check for (!bridge) after the loop, since it can never
      happen. If it did, it would be worthy of a BUG_ON(!bridge). But since
      it'll oops anyway in that case, that'll do just as well.
      Signed-off-by: default avatarDavid Woodhouse <David.Woodhouse@intel.com>
      [bwh: Backported to 3.2:
       - Adjust context
       - There's no (!bridge) check to remove]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      a9f83a62
    • Filipe Manana's avatar
      Btrfs: fix file corruption and data loss after cloning inline extents · 16f61ceb
      Filipe Manana authored
      commit 8039d87d upstream.
      
      Currently the clone ioctl allows to clone an inline extent from one file
      to another that already has other (non-inlined) extents. This is a problem
      because btrfs is not designed to deal with files having inline and regular
      extents, if a file has an inline extent then it must be the only extent
      in the file and must start at file offset 0. Having a file with an inline
      extent followed by regular extents results in EIO errors when doing reads
      or writes against the first 4K of the file.
      
      Also, the clone ioctl allows one to lose data if the source file consists
      of a single inline extent, with a size of N bytes, and the destination
      file consists of a single inline extent with a size of M bytes, where we
      have M > N. In this case the clone operation removes the inline extent
      from the destination file and then copies the inline extent from the
      source file into the destination file - we lose the M - N bytes from the
      destination file, a read operation will get the value 0x00 for any bytes
      in the the range [N, M] (the destination inode's i_size remained as M,
      that's why we can read past N bytes).
      
      So fix this by not allowing such destructive operations to happen and
      return errno EOPNOTSUPP to user space.
      
      Currently the fstest btrfs/035 tests the data loss case but it totally
      ignores this - i.e. expects the operation to succeed and does not check
      the we got data loss.
      
      The following test case for fstests exercises all these cases that result
      in file corruption and data loss:
      
        seq=`basename $0`
        seqres=$RESULT_DIR/$seq
        echo "QA output created by $seq"
        tmp=/tmp/$$
        status=1	# failure is the default!
        trap "_cleanup; exit \$status" 0 1 2 3 15
      
        _cleanup()
        {
            rm -f $tmp.*
        }
      
        # get standard environment, filters and checks
        . ./common/rc
        . ./common/filter
      
        # real QA test starts here
        _need_to_be_root
        _supported_fs btrfs
        _supported_os Linux
        _require_scratch
        _require_cloner
        _require_btrfs_fs_feature "no_holes"
        _require_btrfs_mkfs_feature "no-holes"
      
        rm -f $seqres.full
      
        test_cloning_inline_extents()
        {
            local mkfs_opts=$1
            local mount_opts=$2
      
            _scratch_mkfs $mkfs_opts >>$seqres.full 2>&1
            _scratch_mount $mount_opts
      
            # File bar, the source for all the following clone operations, consists
            # of a single inline extent (50 bytes).
            $XFS_IO_PROG -f -c "pwrite -S 0xbb 0 50" $SCRATCH_MNT/bar \
                | _filter_xfs_io
      
            # Test cloning into a file with an extent (non-inlined) where the
            # destination offset overlaps that extent. It should not be possible to
            # clone the inline extent from file bar into this file.
            $XFS_IO_PROG -f -c "pwrite -S 0xaa 0K 16K" $SCRATCH_MNT/foo \
                | _filter_xfs_io
            $CLONER_PROG -s 0 -d 0 -l 0 $SCRATCH_MNT/bar $SCRATCH_MNT/foo
      
            # Doing IO against any range in the first 4K of the file should work.
            # Due to a past clone ioctl bug which allowed cloning the inline extent,
            # these operations resulted in EIO errors.
            echo "File foo data after clone operation:"
            # All bytes should have the value 0xaa (clone operation failed and did
            # not modify our file).
            od -t x1 $SCRATCH_MNT/foo
            $XFS_IO_PROG -c "pwrite -S 0xcc 0 100" $SCRATCH_MNT/foo | _filter_xfs_io
      
            # Test cloning the inline extent against a file which has a hole in its
            # first 4K followed by a non-inlined extent. It should not be possible
            # as well to clone the inline extent from file bar into this file.
            $XFS_IO_PROG -f -c "pwrite -S 0xdd 4K 12K" $SCRATCH_MNT/foo2 \
                | _filter_xfs_io
            $CLONER_PROG -s 0 -d 0 -l 0 $SCRATCH_MNT/bar $SCRATCH_MNT/foo2
      
            # Doing IO against any range in the first 4K of the file should work.
            # Due to a past clone ioctl bug which allowed cloning the inline extent,
            # these operations resulted in EIO errors.
            echo "File foo2 data after clone operation:"
            # All bytes should have the value 0x00 (clone operation failed and did
            # not modify our file).
            od -t x1 $SCRATCH_MNT/foo2
            $XFS_IO_PROG -c "pwrite -S 0xee 0 90" $SCRATCH_MNT/foo2 | _filter_xfs_io
      
            # Test cloning the inline extent against a file which has a size of zero
            # but has a prealloc extent. It should not be possible as well to clone
            # the inline extent from file bar into this file.
            $XFS_IO_PROG -f -c "falloc -k 0 1M" $SCRATCH_MNT/foo3 | _filter_xfs_io
            $CLONER_PROG -s 0 -d 0 -l 0 $SCRATCH_MNT/bar $SCRATCH_MNT/foo3
      
            # Doing IO against any range in the first 4K of the file should work.
            # Due to a past clone ioctl bug which allowed cloning the inline extent,
            # these operations resulted in EIO errors.
            echo "First 50 bytes of foo3 after clone operation:"
            # Should not be able to read any bytes, file has 0 bytes i_size (the
            # clone operation failed and did not modify our file).
            od -t x1 $SCRATCH_MNT/foo3
            $XFS_IO_PROG -c "pwrite -S 0xff 0 90" $SCRATCH_MNT/foo3 | _filter_xfs_io
      
            # Test cloning the inline extent against a file which consists of a
            # single inline extent that has a size not greater than the size of
            # bar's inline extent (40 < 50).
            # It should be possible to do the extent cloning from bar to this file.
            $XFS_IO_PROG -f -c "pwrite -S 0x01 0 40" $SCRATCH_MNT/foo4 \
                | _filter_xfs_io
            $CLONER_PROG -s 0 -d 0 -l 0 $SCRATCH_MNT/bar $SCRATCH_MNT/foo4
      
            # Doing IO against any range in the first 4K of the file should work.
            echo "File foo4 data after clone operation:"
            # Must match file bar's content.
            od -t x1 $SCRATCH_MNT/foo4
            $XFS_IO_PROG -c "pwrite -S 0x02 0 90" $SCRATCH_MNT/foo4 | _filter_xfs_io
      
            # Test cloning the inline extent against a file which consists of a
            # single inline extent that has a size greater than the size of bar's
            # inline extent (60 > 50).
            # It should not be possible to clone the inline extent from file bar
            # into this file.
            $XFS_IO_PROG -f -c "pwrite -S 0x03 0 60" $SCRATCH_MNT/foo5 \
                | _filter_xfs_io
            $CLONER_PROG -s 0 -d 0 -l 0 $SCRATCH_MNT/bar $SCRATCH_MNT/foo5
      
            # Reading the file should not fail.
            echo "File foo5 data after clone operation:"
            # Must have a size of 60 bytes, with all bytes having a value of 0x03
            # (the clone operation failed and did not modify our file).
            od -t x1 $SCRATCH_MNT/foo5
      
            # Test cloning the inline extent against a file which has no extents but
            # has a size greater than bar's inline extent (16K > 50).
            # It should not be possible to clone the inline extent from file bar
            # into this file.
            $XFS_IO_PROG -f -c "truncate 16K" $SCRATCH_MNT/foo6 | _filter_xfs_io
            $CLONER_PROG -s 0 -d 0 -l 0 $SCRATCH_MNT/bar $SCRATCH_MNT/foo6
      
            # Reading the file should not fail.
            echo "File foo6 data after clone operation:"
            # Must have a size of 16K, with all bytes having a value of 0x00 (the
            # clone operation failed and did not modify our file).
            od -t x1 $SCRATCH_MNT/foo6
      
            # Test cloning the inline extent against a file which has no extents but
            # has a size not greater than bar's inline extent (30 < 50).
            # It should be possible to clone the inline extent from file bar into
            # this file.
            $XFS_IO_PROG -f -c "truncate 30" $SCRATCH_MNT/foo7 | _filter_xfs_io
            $CLONER_PROG -s 0 -d 0 -l 0 $SCRATCH_MNT/bar $SCRATCH_MNT/foo7
      
            # Reading the file should not fail.
            echo "File foo7 data after clone operation:"
            # Must have a size of 50 bytes, with all bytes having a value of 0xbb.
            od -t x1 $SCRATCH_MNT/foo7
      
            # Test cloning the inline extent against a file which has a size not
            # greater than the size of bar's inline extent (20 < 50) but has
            # a prealloc extent that goes beyond the file's size. It should not be
            # possible to clone the inline extent from bar into this file.
            $XFS_IO_PROG -f -c "falloc -k 0 1M" \
                            -c "pwrite -S 0x88 0 20" \
                            $SCRATCH_MNT/foo8 | _filter_xfs_io
            $CLONER_PROG -s 0 -d 0 -l 0 $SCRATCH_MNT/bar $SCRATCH_MNT/foo8
      
            echo "File foo8 data after clone operation:"
            # Must have a size of 20 bytes, with all bytes having a value of 0x88
            # (the clone operation did not modify our file).
            od -t x1 $SCRATCH_MNT/foo8
      
            _scratch_unmount
        }
      
        echo -e "\nTesting without compression and without the no-holes feature...\n"
        test_cloning_inline_extents
      
        echo -e "\nTesting with compression and without the no-holes feature...\n"
        test_cloning_inline_extents "" "-o compress"
      
        echo -e "\nTesting without compression and with the no-holes feature...\n"
        test_cloning_inline_extents "-O no-holes" ""
      
        echo -e "\nTesting with compression and with the no-holes feature...\n"
        test_cloning_inline_extents "-O no-holes" "-o compress"
      
        status=0
        exit
      Signed-off-by: default avatarFilipe Manana <fdmanana@suse.com>
      [bwh: Backported to 3.2:
       - Adjust parameters to btrfs_drop_extents()
       - Drop use of ASSERT()
       - Keep using BUG_ON() for other error cases, as there is no
         btrfs_abort_transaction()
       - Adjust context]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      16f61ceb
    • Jan Schmidt's avatar
      Btrfs: added helper btrfs_next_item() · e83c48cb
      Jan Schmidt authored
      commit c7d22a3c upstream.
      
      btrfs_next_item() makes the btrfs path point to the next item, crossing leaf
      boundaries if needed.
      Signed-off-by: default avatarArne Jansen <sensille@gmx.net>
      Signed-off-by: default avatarJan Schmidt <list.btrfs@jan-o-sch.net>
      [bwh: Dependency of the following fix]
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      e83c48cb
    • Eric Dumazet's avatar
      packet: fix match_fanout_group() · 9914546b
      Eric Dumazet authored
      commit 161642e2 upstream.
      
      Recent TCP listener patches exposed a prior af_packet bug :
      match_fanout_group() blindly assumes it is always safe
      to cast sk to a packet socket to compare fanout with af_packet_priv
      
      But SYNACK packets can be sent while attached to request_sock, which
      are smaller than a "struct sock".
      
      We can read non existent memory and crash.
      
      Fixes: c0de08d0 ("af_packet: don't emit packet on orig fanout group")
      Fixes: ca6fb065 ("tcp: attach SYNACK messages to request sockets instead of listener")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Cc: Willem de Bruijn <willemb@google.com>
      Cc: Eric Leblond <eric@regit.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      9914546b
    • Dan Carpenter's avatar
      devres: fix a for loop bounds check · e7102453
      Dan Carpenter authored
      commit 1f35d04a upstream.
      
      The iomap[] array has PCIM_IOMAP_MAX (6) elements and not
      DEVICE_COUNT_RESOURCE (16).  This bug was found using a static checker.
      It may be that the "if (!(mask & (1 << i)))" check means we never
      actually go past the end of the array in real life.
      
      Fixes: ec04b075 ('iomap: implement pcim_iounmap_regions()')
      Signed-off-by: default avatarDan Carpenter <dan.carpenter@oracle.com>
      Acked-by: default avatarTejun Heo <tj@kernel.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      e7102453
    • Boris BREZILLON's avatar
      mtd: mtdpart: fix add_mtd_partitions error path · f9ac3882
      Boris BREZILLON authored
      commit e5bae867 upstream.
      
      If we fail to allocate a partition structure in the middle of the partition
      creation process, the already allocated partitions are never removed, which
      means they are still present in the partition list and their resources are
      never freed.
      Signed-off-by: default avatarBoris Brezillon <boris.brezillon@free-electrons.com>
      Signed-off-by: default avatarBrian Norris <computersforpeace@gmail.com>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      f9ac3882
    • Dan Carpenter's avatar
      mwifiex: fix mwifiex_rdeeprom_read() · a6a8977e
      Dan Carpenter authored
      commit 1f9c6e1b upstream.
      
      There were several bugs here.
      
      1)  The done label was in the wrong place so we didn't copy any
          information out when there was no command given.
      
      2)  We were using PAGE_SIZE as the size of the buffer instead of
          "PAGE_SIZE - pos".
      
      3)  snprintf() returns the number of characters that would have been
          printed if there were enough space.  If there was not enough space
          (and we had fixed the memory corruption bug #2) then it would result
          in an information leak when we do simple_read_from_buffer().  I've
          changed it to use scnprintf() instead.
      
      I also removed the initialization at the start of the function, because
      I thought it made the code a little more clear.
      
      Fixes: 5e6e3a92 ('wireless: mwifiex: initial commit for Marvell mwifiex driver')
      Signed-off-by: default avatarDan Carpenter <dan.carpenter@oracle.com>
      Acked-by: default avatarAmitkumar Karwar <akarwar@marvell.com>
      Signed-off-by: default avatarKalle Valo <kvalo@codeaurora.org>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      a6a8977e
    • Valentin Rothberg's avatar
      wm831x_power: Use IRQF_ONESHOT to request threaded IRQs · 7293a7e1
      Valentin Rothberg authored
      commit 90adf98d upstream.
      
      Since commit 1c6c6952 ("genirq: Reject bogus threaded irq requests")
      threaded IRQs without a primary handler need to be requested with
      IRQF_ONESHOT, otherwise the request will fail.
      
      scripts/coccinelle/misc/irqf_oneshot.cocci detected this issue.
      
      Fixes: b5874f33 ("wm831x_power: Use genirq")
      Signed-off-by: default avatarValentin Rothberg <valentinrothberg@gmail.com>
      Signed-off-by: default avatarSebastian Reichel <sre@kernel.org>
      Signed-off-by: default avatarBen Hutchings <ben@decadent.org.uk>
      7293a7e1