Commits · a6d4b786ebfbbb9330b6915b590547d348ec148c · Kirill Smelkov / linux

01 Jul, 2014 40 commits

efi-pstore: Fix an overflow on 32-bit builds · a6d4b786

Andrzej Zaborowski authored Jun 09, 2014

commit 783ee431 upstream.

In generic_id the long int timestamp is multiplied by 100000 and needs
an explicit cast to u64.

Without that the id in the resulting pstore filename is wrong and
userspace may have problems parsing it, but more importantly files in
pstore can never be deleted and may fill the EFI flash (brick device?).
This happens because when generic pstore code wants to delete a file,
it passes the id to the EFI backend which reinterpretes it and a wrong
variable name is attempted to be deleted.  There's no error message but
after remounting pstore, deleted files would reappear.
Signed-off-by: Andrew Zaborowski <andrew.zaborowski@intel.com>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

a6d4b786

builddeb: use $OBJCOPY variable instead of objcopy · 7f922b19

Fathi Boudra authored Apr 12, 2014

commit 6b4a144a upstream.

In cross-build environment, we expect to use the cross-compiler objcopy
instead of the host objcopy.

It fixes following build failures:
objcopy --only-keep-debug lib/modules/3.14/kernel/net/ipv6/xfrm6_mode_tunnel.ko /srv/build/linux/debian/dbgtmp/usr/lib/debug/lib/modules/3.14/kernel/net/ipv6/xfrm6_mode_tunnel.ko
objcopy: Unable to recognise the format of the input file `lib/modules/3.14/kernel/net/ipv6/xfrm6_mode_tunnel.ko'
Signed-off-by: Fathi Boudra <fathi.boudra@linaro.org>
Fixes: 810e8437 ('deb-pkg: split debug symbols in their own package')
Reviewed-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Michal Marek <mmarek@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

7f922b19

random: fix nasty entropy accounting bug · 02bbff72

Theodore Ts'o authored Jun 15, 2014

commit e33ba5fa upstream.

Commit 0fb7a01a "random: simplify accounting code", introduced in
v3.15, has a very nasty accounting problem when the entropy pool has
has fewer bytes of entropy than the number of requested reserved
bytes.  In that case, "have_bytes - reserved" goes negative, and since
size_t is unsigned, the expression:

       ibytes = min_t(size_t, ibytes, have_bytes - reserved);

... does not do the right thing.  This is rather bad, because it
defeats the catastrophic reseeding feature in the
xfer_secondary_pool() path.

It also can cause the "BUG: spinlock trylock failure on UP" for some
kernel configurations when prandom_reseed() calls get_random_bytes()
in the early init, since when the entropy count gets corrupted,
credit_entropy_bits() erroneously believes that the nonblocking pool
has been fully initialized (when in fact it is not), and so it calls
prandom_reseed(true) recursively leading to the spinlock BUG.

The logic is *not* the same it was originally, but in the cases where
it matters, the behavior is the same, and the resulting code is
hopefully easier to read and understand.

Fixes: 0fb7a01a "random: simplify accounting code"
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: Greg Price <price@mit.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

02bbff72

epoll: fix use-after-free in eventpoll_release_file · faf067b3

Konstantin Khlebnikov authored Jun 17, 2014

commit ebe06187 upstream.

This fixes use-after-free of epi->fllink.next inside list loop macro.
This loop actually releases elements in the body.  The list is
rcu-protected but here we cannot hold rcu_read_lock because we need to
lock mutex inside.

The obvious solution is to use list_for_each_entry_safe().  RCU-ness
isn't essential because nobody can change this list under us, it's final
fput for this file.

The bug was introduced by ae10b2b4 ("epoll: optimize EPOLL_CTL_DEL
using rcu")
Signed-off-by: Konstantin Khlebnikov <koct9i@gmail.com>
Reported-by: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Jason Baron <jbaron@akamai.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

faf067b3

x86_32, entry: Do syscall exit work on badsys (CVE-2014-4508) · a8e9e3d0

Andy Lutomirski authored Jun 23, 2014

commit 554086d8 upstream.

The bad syscall nr paths are their own incomprehensible route
through the entry control flow.  Rearrange them to work just like
syscalls that return -ENOSYS.

This fixes an OOPS in the audit code when fast-path auditing is
enabled and sysenter gets a bad syscall nr (CVE-2014-4508).

This has probably been broken since Linux 2.6.27:
af0575bb i386 syscall audit fast-path

Cc: Roland McGrath <roland@redhat.com>
Reported-by: Toralf Förster <toralf.foerster@gmx.de>
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Link: http://lkml.kernel.org/r/e09c499eade6fc321266dd6b54da7beb28d6991c.1403558229.git.luto@amacapital.netSigned-off-by: H. Peter Anvin <hpa@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

a8e9e3d0

lz4: fix another possible overrun · 96ae603f

Greg Kroah-Hartman authored Jun 24, 2014

commit 4148c1f6 upstream.

There is one other possible overrun in the lz4 code as implemented by
Linux at this point in time (which differs from the upstream lz4
codebase, but will get synced at in a future kernel release.)  As
pointed out by Don, we also need to check the overflow in the data
itself.

While we are at it, replace the odd error return value with just a
"simple" -1 value as the return value is never used for anything other
than a basic "did this work or not" check.
Reported-by: "Don A. Bailey" <donb@securitymouse.com>
Reported-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

96ae603f

Bluetooth: Fix properly ignoring LTKs of unknown types · 01233a9c

Johan Hedberg authored May 29, 2014

commit 61b43357 upstream.

In case there are new LTK types in the future we shouldn't just blindly
assume that != MGMT_LTK_UNAUTHENTICATED means that the key is
authenticated. This patch adds explicit checks for each allowed key type
in the form of a switch statement and skips any key which has an unknown
value.
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

01233a9c

Bluetooth: Clearly distinguish mgmt LTK type from authenticated property · 85568768

Johan Hedberg authored May 23, 2014

commit d7b25450 upstream.

On the mgmt level we have a key type parameter which currently accepts
two possible values: 0x00 for unauthenticated and 0x01 for
authenticated. However, in the internal struct smp_ltk representation we
have an explicit "authenticated" boolean value.

To make this distinction clear, add defines for the possible mgmt values
and do conversion to and from the internal authenticated value.
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

85568768

btrfs: fix use of uninit "ret" in end_extent_writepage() · 28084dba

Eric Sandeen authored Jun 12, 2014

commit 3e2426bd upstream.

If this condition in end_extent_writepage() is false:

	if (tree->ops && tree->ops->writepage_end_io_hook)

we will then test an uninitialized "ret" at:

	ret = ret < 0 ? ret : -EIO;

The test for ret is for the case where ->writepage_end_io_hook
failed, and we'd choose that ret as the error; but if
there is no ->writepage_end_io_hook, nothing sets ret.

Initializing ret to 0 should be sufficient; if
writepage_end_io_hook wasn't set, (!uptodate) means
non-zero err was passed in, so we choose -EIO in that case.
Signed-of-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Chris Mason <clm@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

28084dba

Btrfs: fix scrub_print_warning to handle skinny metadata extents · eed31172

Liu Bo authored Jun 09, 2014

commit 6eda71d0 upstream.

The skinny extents are intepreted incorrectly in scrub_print_warning(),
and end up hitting the BUG() in btrfs_extent_inline_ref_size.
Reported-by: Konstantinos Skarlatos <k.skarlatos@gmail.com>
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Chris Mason <clm@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

eed31172

Btrfs: use right type to get real comparison · 6d2a63f3

Liu Bo authored Jun 08, 2014

commit cd857dd6 upstream.

We want to make sure the point is still within the extent item, not to verify
the memory it's pointing to.
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Chris Mason <clm@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

6d2a63f3

Btrfs: don't check nodes for extent items · efb57277

Josef Bacik authored Jun 05, 2014

commit 8a56457f upstream.

The backref code was looking at nodes as well as leaves when we tried to
populate extent item entries.  This is not good, and although we go away with it
for the most part because we'd skip where disk_bytenr != random_memory,
sometimes random_memory would match and suddenly boom.  This fixes that problem.
Thanks,
Signed-off-by: Josef Bacik <jbacik@fb.com>
Signed-off-by: Chris Mason <clm@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

efb57277

fs: btrfs: volumes.c: Fix for possible null pointer dereference · e5171550

Rickard Strandqvist authored May 22, 2014

commit 8321cf25 upstream.

There is otherwise a risk of a possible null pointer dereference.

Was largely found by using a static code analysis program called cppcheck.
Signed-off-by: Rickard Strandqvist <rickard_strandqvist@spectrumdigital.se>
Signed-off-by: Chris Mason <clm@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

e5171550

btrfs: allocate raid type kobjects dynamically · 80693a87

Jeff Mahoney authored May 27, 2014

commit c1895442 upstream.

We are currently allocating space_info objects in an array when we
allocate space_info. When a user does something like:

# btrfs balance start -mconvert=raid1 -dconvert=raid1 /mnt
# btrfs balance start -mconvert=single -dconvert=single /mnt -f
# btrfs balance start -mconvert=raid1 -dconvert=raid1 /

We can end up with memory corruption since the kobject hasn't
been reinitialized properly and the name pointer was left set.

The rationale behind allocating them statically was to avoid
creating a separate kobject container that just contained the
raid type. It used the index in the array to determine the index.

Ultimately, though, this wastes more memory than it saves in all
but the most complex scenarios and introduces kobject lifetime
questions.

This patch allocates the kobjects dynamically instead. Note that
we also remove the kobject_get/put of the parent kobject since
kobject_add and kobject_del do that internally.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Reported-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <clm@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

80693a87

Btrfs: send, use the right limits for xattr names and values · 69a109b2

Filipe Manana authored May 23, 2014

commit 7e3ae33e upstream.

We were limiting the sum of the xattr name and value lengths to PATH_MAX,
which is not correct, specially on filesystems created with btrfs-progs
v3.12 or higher, where the default leaf size is max(16384, PAGE_SIZE), or
systems with page sizes larger than 4096 bytes.

Xattrs have their own specific maximum name and value lengths, which depend
on the leaf size, therefore use these limits to be able to send xattrs with
sizes larger than PATH_MAX.

A test case for xfstests follows.
Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com>
Signed-off-by: Chris Mason <clm@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

69a109b2

Btrfs: send, don't error in the presence of subvols/snapshots · 1a7fc2e6

Filipe Manana authored May 25, 2014

commit 1af56070 upstream.

If we are doing an incremental send and the base snapshot has a
directory with name X that doesn't exist anymore in the second
snapshot and a new subvolume/snapshot exists in the second snapshot
that has the same name as the directory (name X), the incremental
send would fail with -ENOENT error. This is because it attempts
to lookup for an inode with a number matching the objectid of a
root, which doesn't exist.

Steps to reproduce:

    mkfs.btrfs -f /dev/sdd
    mount /dev/sdd /mnt

    mkdir /mnt/testdir
    btrfs subvolume snapshot -r /mnt /mnt/mysnap1

    rmdir /mnt/testdir
    btrfs subvolume create /mnt/testdir
    btrfs subvolume snapshot -r /mnt /mnt/mysnap2

    btrfs send -p /mnt/mysnap1 /mnt/mysnap2 -f /tmp/send.data

A test case for xfstests follows.
Reported-by: Robert White <rwhite@pobox.com>
Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com>
Signed-off-by: Chris Mason <clm@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

1a7fc2e6

Btrfs: set right total device count for seeding support · 1b524a36

Wang Shilong authored May 13, 2014

commit 29865841 upstream.

Seeding device support allows us to create a new filesystem
based on existed filesystem.

However newly created filesystem's @total_devices should include seed
devices. This patch fix the following problem:

 # mkfs.btrfs -f /dev/sdb
 # btrfstune -S 1 /dev/sdb
 # mount /dev/sdb /mnt
 # btrfs device add -f /dev/sdc /mnt --->fs_devices->total_devices = 1
 # umount /mnt
 # mount /dev/sdc /mnt               --->fs_devices->total_devices = 2

This is because we record right @total_devices in superblock, but
@fs_devices->total_devices is reset to be 0 in btrfs_prepare_sprout().

Fix this problem by not resetting @fs_devices->total_devices.
Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com>
Signed-off-by: Chris Mason <clm@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

1b524a36

Btrfs: mark mapping with error flag to report errors to userspace · 8d729345

Liu Bo authored May 12, 2014

commit 5dca6eea upstream.

According to commit 865ffef3
(fs: fix fsync() error reporting),
it's not stable to just check error pages because pages can be
truncated or invalidated, we should also mark mapping with error
flag so that a later fsync can catch the error.
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Chris Mason <clm@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

8d729345

Btrfs: fix NULL pointer crash of deleting a seed device · c6b4616c

Liu Bo authored May 11, 2014

commit 29cc83f6 upstream.

Same as normal devices, seed devices should be initialized with
fs_info->dev_root as well, otherwise we'll get a NULL pointer crash.

Cc: Chris Murphy <lists@colorremedies.com>
Reported-by: Chris Murphy <lists@colorremedies.com>
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Signed-off-by: Chris Mason <clm@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

c6b4616c

Btrfs: make sure there are not any read requests before stopping workers · add81375

Wang Shilong authored Apr 09, 2014

commit de348ee0 upstream.

In close_ctree(), after we have stopped all workers,there maybe still
some read requests(for example readahead) to submit and this *maybe* trigger
an oops that user reported before:

kernel BUG at fs/btrfs/async-thread.c:619!

By hacking codes, i can reproduce this problem with one cpu available.
We fix this potential problem by invalidating all btree inode pages before
stopping all workers.

Thanks to Miao for pointing out this problem.
Signed-off-by: Wang Shilong <wangsl.fnst@cn.fujitsu.com>
Reviewed-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Chris Mason <clm@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

add81375

Btrfs: send, account for orphan directories when building path strings · 7df26ca1

Filipe Manana authored Mar 22, 2014

commit c992ec94 upstream.

If we have directories with a pending move/rename operation, we must take into
account any orphan directories that got created before executing the pending
move/rename. Those orphan directories are directories with an inode number higher
then the current send progress and that don't exist in the parent snapshot, they
are created before current progress reaches their inode number, with a generated
name of the form oN-M-I and at the root of the filesystem tree, and later when
progress matches their inode number, moved/renamed to their final location.

Reproducer:

          $ mkfs.btrfs -f /dev/sdd
          $ mount /dev/sdd /mnt

          $ mkdir -p /mnt/a/b/c/d
          $ mkdir /mnt/a/b/e
          $ mv /mnt/a/b/c /mnt/a/b/e/CC
          $ mkdir /mnt/a/b/e/CC/d/f
	  $ mkdir /mnt/a/g

          $ btrfs subvolume snapshot -r /mnt /mnt/snap1
          $ btrfs send /mnt/snap1 -f /tmp/base.send

          $ mkdir /mnt/a/g/h
	  $ mv /mnt/a/b/e /mnt/a/g/h/EE
          $ mv /mnt/a/g/h/EE/CC/d /mnt/a/g/h/EE/DD

          $ btrfs subvolume snapshot -r /mnt /mnt/snap2
          $ btrfs send -p /mnt/snap1 /mnt/snap2 -f /tmp/incremental.send

The second receive command failed with the following error:

    ERROR: rename a/b/e/CC/d -> o264-7-0/EE/DD failed. No such file or directory

A test case for xfstests follows soon.
Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com>
Signed-off-by: Chris Mason <clm@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

7df26ca1

Btrfs: output warning instead of error when loading free space cache failed · 580b0278

Miao Xie authored Apr 24, 2014

commit 32d6b47f upstream.

If we fail to load a free space cache, we can rebuild it from the extent tree,
so it is not a serious error, we should not output a error message that
would make the users uncomfortable. This patch uses warning message instead
of it.
Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
Signed-off-by: Chris Mason <clm@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

580b0278

btrfs: Add ctime/mtime update for btrfs device add/remove. · 2e9e3143

Qu Wenruo authored Apr 16, 2014

commit 5a1972bd upstream.

Btrfs will send uevent to udev inform the device change,
but ctime/mtime for the block device inode is not udpated, which cause
libblkid used by btrfs-progs unable to detect device change and use old
cache, causing 'btrfs dev scan; btrfs dev rmove; btrfs dev scan' give an
error message.
Reported-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com>
Cc: Karel Zak <kzak@redhat.com>
Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: Chris Mason <clm@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

2e9e3143

Btrfs: read inode size after acquiring the mutex when punching a hole · 2501c4a0

Filipe Manana authored Apr 26, 2014

commit a1a50f60 upstream.

In a previous change, commit 12870f1c,
I accidentally moved the roundup of inode->i_size to outside of the
critical section delimited by the inode mutex, which is not atomic and
not correct since the size can be changed by other task before we acquire
the mutex. Therefore fix it.
Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com>
Signed-off-by: Chris Mason <clm@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

2501c4a0

Btrfs: fix double free in find_lock_delalloc_range · bddf0fac

Chris Mason authored May 21, 2014

commit 7d788742 upstream.

We need to NULL the cached_state after freeing it, otherwise
we might free it again if find_delalloc_range doesn't find anything.
Signed-off-by: Chris Mason <clm@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

bddf0fac

Btrfs: fix leaf corruption caused by ENOSPC while hole punching · 938bdab6

Filipe Manana authored Apr 29, 2014

commit fc19c5e7 upstream.

While running a stress test with multiple threads writing to the same btrfs
file system, I ended up with a situation where a leaf was corrupted in that
it had 2 file extent item keys that had the same exact key. I was able to
detect this quickly thanks to the following patch which triggers an assertion
as soon as a leaf is marked dirty if there are duplicated keys or out of order
keys:

    Btrfs: check if items are ordered when a leaf is marked dirty
    (https://patchwork.kernel.org/patch/3955431/)

Basically while running the test, I got the following in dmesg:

    [28877.415877] WARNING: CPU: 2 PID: 10706 at fs/btrfs/file.c:553 btrfs_drop_extent_cache+0x435/0x440 [btrfs]()
    (...)
    [28877.415917] Call Trace:
    [28877.415922]  [<ffffffff816f1189>] dump_stack+0x4e/0x68
    [28877.415926]  [<ffffffff8104a32c>] warn_slowpath_common+0x8c/0xc0
    [28877.415929]  [<ffffffff8104a37a>] warn_slowpath_null+0x1a/0x20
    [28877.415944]  [<ffffffffa03775a5>] btrfs_drop_extent_cache+0x435/0x440 [btrfs]
    [28877.415949]  [<ffffffff8118e7be>] ? kmem_cache_alloc+0xfe/0x1c0
    [28877.415962]  [<ffffffffa03777d9>] fill_holes+0x229/0x3e0 [btrfs]
    [28877.415972]  [<ffffffffa0345865>] ? block_rsv_add_bytes+0x55/0x80 [btrfs]
    [28877.415984]  [<ffffffffa03792cb>] btrfs_fallocate+0xb6b/0xc20 [btrfs]
    (...)
    [29854.132560] BTRFS critical (device sdc): corrupt leaf, bad key order: block=955232256,root=1, slot=24
    [29854.132565] BTRFS info (device sdc): leaf 955232256 total ptrs 40 free space 778
    (...)
    [29854.132637] 	item 23 key (3486 108 667648) itemoff 2694 itemsize 53
    [29854.132638] 		extent data disk bytenr 14574411776 nr 286720
    [29854.132639] 		extent data offset 0 nr 286720 ram 286720
    [29854.132640] 	item 24 key (3486 108 954368) itemoff 2641 itemsize 53
    [29854.132641] 		extent data disk bytenr 0 nr 0
    [29854.132643] 		extent data offset 0 nr 0 ram 0
    [29854.132644] 	item 25 key (3486 108 954368) itemoff 2588 itemsize 53
    [29854.132645] 		extent data disk bytenr 8699670528 nr 77824
    [29854.132646] 		extent data offset 0 nr 77824 ram 77824
    [29854.132647] 	item 26 key (3486 108 1146880) itemoff 2535 itemsize 53
    [29854.132648] 		extent data disk bytenr 8699670528 nr 77824
    [29854.132649] 		extent data offset 0 nr 77824 ram 77824
    (...)
    [29854.132707] kernel BUG at fs/btrfs/ctree.h:3901!
    (...)
    [29854.132771] Call Trace:
    [29854.132779]  [<ffffffffa0342b5c>] setup_items_for_insert+0x2dc/0x400 [btrfs]
    [29854.132791]  [<ffffffffa0378537>] __btrfs_drop_extents+0xba7/0xdd0 [btrfs]
    [29854.132794]  [<ffffffff8109c0d6>] ? trace_hardirqs_on_caller+0x16/0x1d0
    [29854.132797]  [<ffffffff8109c29d>] ? trace_hardirqs_on+0xd/0x10
    [29854.132800]  [<ffffffff8118e7be>] ? kmem_cache_alloc+0xfe/0x1c0
    [29854.132810]  [<ffffffffa036783b>] insert_reserved_file_extent.constprop.66+0xab/0x310 [btrfs]
    [29854.132820]  [<ffffffffa036a6c6>] __btrfs_prealloc_file_range+0x116/0x340 [btrfs]
    [29854.132830]  [<ffffffffa0374d53>] btrfs_prealloc_file_range+0x23/0x30 [btrfs]
    (...)

So this is caused by getting an -ENOSPC error while punching a file hole, more
specifically, we get -ENOSPC error from __btrfs_drop_extents in the while loop
of file.c:btrfs_punch_hole() when it's unable to modify the btree to delete one
or more file extent items due to lack of enough free space. When this happens,
in btrfs_punch_hole(), we attempt to reclaim free space by switching our transaction
block reservation object to root->fs_info->trans_block_rsv, end our transaction and
start a new transaction basically - and, we keep increasing our current offset
(cur_offset) as long as it's smaller than the end of the target range (lockend) -
this makes use leave the loop with cur_offset == drop_end which in turn makes us
call fill_holes() for inserting a file extent item that represents a 0 bytes range
hole (and this insertion succeeds, as in the meanwhile more space became available).

This 0 bytes file hole extent item is a problem because any subsequent caller of
__btrfs_drop_extents (regular file writes, or fallocate calls for e.g.), with a
start file offset that is equal to the offset of the hole, will not remove this
extent item due to the following conditional in the while loop of
__btrfs_drop_extents:

    if (extent_end <= search_start) {
            path->slots[0]++;
            goto next_slot;
    }

This later makes the call to setup_items_for_insert() (at the very end of
__btrfs_drop_extents), insert a new file extent item with the same offset as
the 0 bytes file hole extent item that follows it. Needless is to say that this
causes chaos, either when reading the leaf from disk (btree_readpage_end_io_hook),
where we perform leaf sanity checks or in subsequent operations that manipulate
file extent items, as in the fallocate call as shown by the dmesg trace above.

Without my other patch to perform the leaf sanity checks once a leaf is marked
as dirty (if the integrity checker is enabled), it would have been much harder
to debug this issue.

This change might fix a few similar issues reported by users in the mailing
list regarding assertion failures in btrfs_set_item_key_safe calls performed
by __btrfs_drop_extents, such as the following report:

    http://comments.gmane.org/gmane.comp.file-systems.btrfs/32938

Asking fill_holes() to create a 0 bytes wide file hole item also produced the
first warning in the trace above, as we passed a range to btrfs_drop_extent_cache
that has an end smaller (by -1) than its start.

On 3.14 kernels this issue manifests itself through leaf corruption, as we get
duplicated file extent item keys in a leaf when calling setup_items_for_insert(),
but on older kernels, setup_items_for_insert() isn't called by __btrfs_drop_extents(),
instead we have callers of __btrfs_drop_extents(), namely the functions
inode.c:insert_inline_extent() and inode.c:insert_reserved_file_extent(), calling
btrfs_insert_empty_item() to insert the new file extent item, which would fail with
error -EEXIST, instead of inserting a duplicated key - which is still a serious
issue as it would make all similar file extent item replace operations keep
failing if they target the same file range.
Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com>
Signed-off-by: Chris Mason <clm@fb.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

938bdab6

CIFS: Fix memory leaks in SMB2_open · 0d81fcdd

Pavel Shilovsky authored May 24, 2014

commit 663a9621 upstream.
Signed-off-by: Pavel Shilovsky <pshilovsky@samba.org>
Reviewed-by: Shirish Pargaonkar <spargaonkar@suse.com>
Signed-off-by: Steve French <smfrench@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

0d81fcdd

aio: fix kernel memory disclosure in io_getevents() introduced in v3.10 · 0ea9bb58

Benjamin LaHaise authored Jun 24, 2014

commit edfbbf38 upstream.

A kernel memory disclosure was introduced in aio_read_events_ring() in v3.10
by commit a31ad380.  The changes made to
aio_read_events_ring() failed to correctly limit the index into
ctx->ring_pages[], allowing an attacked to cause the subsequent kmap() of
an arbitrary page with a copy_to_user() to copy the contents into userspace.
This vulnerability has been assigned CVE-2014-0206.  Thanks to Mateusz and
Petr for disclosing this issue.

This patch applies to v3.12+.  A separate backport is needed for 3.10/3.11.
Signed-off-by: Benjamin LaHaise <bcrl@kvack.org>
Cc: Mateusz Guzik <mguzik@redhat.com>
Cc: Petr Matousek <pmatouse@redhat.com>
Cc: Kent Overstreet <kmo@daterainc.com>
Cc: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

0ea9bb58

aio: fix aio request leak when events are reaped by userspace · b34e0e13

Benjamin LaHaise authored Jun 24, 2014

commit f8567a38 upstream.

The aio cleanups and optimizations by kmo that were merged into the 3.10
tree added a regression for userspace event reaping.  Specifically, the
reference counts are not decremented if the event is reaped in userspace,
leading to the application being unable to submit further aio requests.
This patch applies to 3.12+.  A separate backport is required for 3.10/3.11.
This issue was uncovered as part of CVE-2014-0206.
Signed-off-by: Benjamin LaHaise <bcrl@kvack.org>
Cc: Kent Overstreet <kmo@daterainc.com>
Cc: Mateusz Guzik <mguzik@redhat.com>
Cc: Petr Matousek <pmatouse@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

b34e0e13

genirq: Sanitize spurious interrupt detection of threaded irqs · a23f9667

Thomas Gleixner authored Mar 07, 2013

commit 1e77d0a1 upstream.

Till reported that the spurious interrupt detection of threaded
interrupts is broken in two ways:

- note_interrupt() is called for each action thread of a shared
  interrupt line. That's wrong as we are only interested whether none
  of the device drivers felt responsible for the interrupt, but by
  calling multiple times for a single interrupt line we account
  IRQ_NONE even if one of the drivers felt responsible.

- note_interrupt() when called from the thread handler is not
  serialized. That leaves the members of irq_desc which are used for
  the spurious detection unprotected.

To solve this we need to defer the spurious detection of a threaded
interrupt to the next hardware interrupt context where we have
implicit serialization.

If note_interrupt is called with action_ret == IRQ_WAKE_THREAD, we
check whether the previous interrupt requested a deferred check. If
not, we request a deferred check for the next hardware interrupt and
return.

If set, we check whether one of the interrupt threads signaled
success. Depending on this information we feed the result into the
spurious detector.

If one primary handler of a shared interrupt returns IRQ_HANDLED we
disable the deferred check of irq threads on the same line, as we have
found at least one device driver who cared.
Reported-by: Till Straumann <strauman@slac.stanford.edu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Austin Schuh <austin@peloton-tech.com>
Cc: Oliver Hartkopp <socketcan@hartkopp.net>
Cc: Wolfgang Grandegger <wg@grandegger.com>
Cc: Pavel Pisa <pisa@cmp.felk.cvut.cz>
Cc: Marc Kleine-Budde <mkl@pengutronix.de>
Cc: linux-can@vger.kernel.org
Link: http://lkml.kernel.org/r/alpine.LFD.2.02.1303071450130.22263@ionosSigned-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

a23f9667

Revert "offb: Add palette hack for little endian" · 9f6900b6

Benjamin Herrenschmidt authored Jun 16, 2014

commit 68986c9f upstream.

This reverts commit e1edf18b.

This patch was a misguided attempt at fixing offb for LE ppc64
kernels on BE qemu but is just wrong ... it breaks real LE/LE
setups, LE with real HW, and existing mixed endian systems
that did the fight thing with the appropriate device-tree
property. Bad reviewing on my part, sorry.

The right fix is to either make qemu change its endian when
the guest changes endian (working on that) or to use the
existing foreign endian support.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

9f6900b6

Revert "drm/radeon: use variable UVD clocks" · bc862d29

Alex Deucher authored Jun 07, 2014

commit 0690a229 upstream.

This caused reduced performance for some users with advanced post
processing enabled.  We need a better method to pick the
UVD state based on the amount of post processing required or tune
the advanced post processing to fit within the lower power state
envelope.

This reverts commit 14a9579d.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

bc862d29

x86, x32: Use compat shims for io_{setup,submit} · 60ee64fa

Mike Frysinger authored May 04, 2014

commit 7fd44dac upstream.

The io_setup takes a pointer to a context id of type aio_context_t.
This in turn is typed to a __kernel_ulong_t.  We could tweak the
exported headers to define this as a 64bit quantity for specific
ABIs, but since we already have a 32bit compat shim for the x86 ABI,
let's just re-use that logic.  The libaio package is also written to
expect this as a pointer type, so a compat shim would simplify that.

The io_submit func operates on an array of pointers to iocb structs.
Padding out the array to be 64bit aligned is a huge pain, so convert
it over to the existing compat shim too.

We don't convert io_getevents to the compat func as its only purpose
is to handle the timespec struct, and the x32 ABI uses 64bit times.

With this change, the libaio package can now pass its testsuite when
built for the x32 ABI.
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Link: http://lkml.kernel.org/r/1399250595-5005-1-git-send-email-vapier@gentoo.org
Cc: H.J. Lu <hjl.tools@gmail.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

60ee64fa

x86-32, espfix: Remove filter for espfix32 due to race · 0d8ab48e

H. Peter Anvin authored Apr 30, 2014

commit 246f2d2e upstream.

It is not safe to use LAR to filter when to go down the espfix path,
because the LDT is per-process (rather than per-thread) and another
thread might change the descriptors behind our back. Fortunately it
is always *safe* (if a bit slow) to go down the espfix path, and a
32-bit LDT stack segment is extremely rare.
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Link: http://lkml.kernel.org/r/1398816946-3351-1-git-send-email-hpa@linux.intel.comSigned-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

0d8ab48e

arm64: mm: remove broken &= operator from pmd_mknotpresent · 6b4e1805

Will Deacon authored Jun 18, 2014

commit e3a920af upstream.

This should be a plain old '&' and could easily lead to undefined
behaviour if the target of a pmd_mknotpresent invocation was the same
as the parameter.

Fixes: 9c7e535f (arm64: mm: Route pmd thp functions through pte equivalents)
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

6b4e1805

arm64/dma: Removing ARCH_HAS_DMA_GET_REQUIRED_MASK macro · 92abb0be

Suravee Suthikulpanit authored Jun 06, 2014

commit f3a183cb upstream.

Arm64 does not define dma_get_required_mask() function.
Therefore, it should not define the ARCH_HAS_DMA_GET_REQUIRED_MASK.
This causes build errors in some device drivers (e.g. mpt2sas)
Signed-off-by: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

92abb0be

arm64: uid16: fix __kernel_old_{gid,uid}_t definitions · 04c4b55d

Will Deacon authored Jun 02, 2014

commit 34c65c43 upstream.

Whilst native arm64 applications don't have the 16-bit UID/GID syscalls
wired up, compat tasks can still access them. The 16-bit wrappers for
these syscalls use __kernel_old_uid_t and __kernel_old_gid_t, which must
be 16-bit data types to maintain compatibility with the 16-bit UIDs used
by compat applications.

This patch defines 16-bit __kernel_old_{gid,uid}_t types for arm64
instead of using the 32-bit types provided by asm-generic.
Signed-off-by: Will Deacon <will.deacon@arm.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

04c4b55d

ARM: mvebu: DT: fix OpenBlocks AX3-4 RAM size · a1980574

Jason Cooper authored Jun 04, 2014

commit e47043ae upstream.

The OpenBlocks AX3-4 has a non-DT bootloader.  It also comes with 1GB of
soldered on RAM, and a DIMM slot for expansion.

Unfortunately, atags_to_fdt() doesn't work in big-endian mode, so we see
the following failure when attempting to boot a big-endian kernel:

  686 slab pages
  17 pages shared
  0 pages swap cached
  [ pid ]   uid  tgid total_vm      rss nr_ptes swapents oom_score_adj name
  Kernel panic - not syncing: Out of memory and no killable processes...

  CPU: 1 PID: 351 Comm: kworker/u4:0 Not tainted 3.15.0-rc8-next-20140603 #1
  [<c0215a54>] (unwind_backtrace) from [<c021160c>] (show_stack+0x10/0x14)
  [<c021160c>] (show_stack) from [<c0802500>] (dump_stack+0x78/0x94)
  [<c0802500>] (dump_stack) from [<c0800068>] (panic+0x90/0x21c)
  [<c0800068>] (panic) from [<c02b5704>] (out_of_memory+0x320/0x340)
  [<c02b5704>] (out_of_memory) from [<c02b93a0>] (__alloc_pages_nodemask+0x874/0x930)
  [<c02b93a0>] (__alloc_pages_nodemask) from [<c02d446c>] (handle_mm_fault+0x744/0x96c)
  [<c02d446c>] (handle_mm_fault) from [<c02cf250>] (__get_user_pages+0xd0/0x4c0)
  [<c02cf250>] (__get_user_pages) from [<c02f3598>] (get_arg_page+0x54/0xbc)
  [<c02f3598>] (get_arg_page) from [<c02f3878>] (copy_strings+0x278/0x29c)
  [<c02f3878>] (copy_strings) from [<c02f38bc>] (copy_strings_kernel+0x20/0x28)
  [<c02f38bc>] (copy_strings_kernel) from [<c02f4f1c>] (do_execve+0x3a8/0x4c8)
  [<c02f4f1c>] (do_execve) from [<c025ac10>] (____call_usermodehelper+0x15c/0x194)
  [<c025ac10>] (____call_usermodehelper) from [<c020e9b8>] (ret_from_fork+0x14/0x3c)
  CPU0: stopping
  CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.15.0-rc8-next-20140603 #1
  [<c0215a54>] (unwind_backtrace) from [<c021160c>] (show_stack+0x10/0x14)
  [<c021160c>] (show_stack) from [<c0802500>] (dump_stack+0x78/0x94)
  [<c0802500>] (dump_stack) from [<c021429c>] (handle_IPI+0x138/0x174)
  [<c021429c>] (handle_IPI) from [<c02087f0>] (armada_370_xp_handle_irq+0xb0/0xcc)
  [<c02087f0>] (armada_370_xp_handle_irq) from [<c0212100>] (__irq_svc+0x40/0x50)
  Exception stack(0xc0b6bf68 to 0xc0b6bfb0)
  bf60:                   e9fad598 00000000 00f509a3 00000000 c0b6a000 c0b724c4
  bf80: c0b72458 c0b6a000 00000000 00000000 c0b66da0 c0b6a000 00000000 c0b6bfb0
  bfa0: c027bb94 c027bb24 60000313 ffffffff
  [<c0212100>] (__irq_svc) from [<c027bb24>] (cpu_startup_entry+0x54/0x214)
  [<c027bb24>] (cpu_startup_entry) from [<c0ac5b30>] (start_kernel+0x318/0x37c)
  [<c0ac5b30>] (start_kernel) from [<00208078>] (0x208078)
  ---[ end Kernel panic - not syncing: Out of memory and no killable processes...

A similar failure will also occur if ARM_ATAG_DTB_COMPAT isn't selected.

Fix this by setting a sane default (1 GB) in the dts file.
Signed-off-by: Jason Cooper <jason@lakedaemon.net>
Tested-by: Kevin Hilman <khilman@linaro.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

a1980574

f2fs: submit bio at the reclaim path · fecb45f7

Jaegeuk Kim authored Apr 24, 2014

commit 2aea39ec upstream.

If f2fs_write_data_page is called through the reclaim path, we should submit
the bio right away.

This patch resolves the following issue that Marc Dietrich reported.
"It took me a while to bisect a problem which causes my ARM (tegra2) netbook to
frequently stall for 5-10 seconds when I enable EXA acceleration (opentegra
experimental ddx)."
And this patch fixes that.
Reported-by: Marc Dietrich <marvin24@gmx.de>
Signed-off-by: Jaegeuk Kim <jaegeuk.kim@samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

fecb45f7

TARGET/sbc,loopback: Adjust command data length in case pi exists on the wire · 7d971038

Sagi Grimberg authored Jun 11, 2014

commit e2a4f55c upstream.

In various areas of the code, it is assumed that
se_cmd->data_length describes pure data. In case
that protection information exists over the wire
(protect bits is are on) the target core re-calculates
the data length from the CDB and the backed device
block size (instead of each transport peeking in the cdb).

Modify loopback device to include protection information
in the transferred data length (like other scsi transports).
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

7d971038