- 22 Oct, 2016 40 commits
-
-
Greg Kroah-Hartman authored
-
Glauber Costa authored
commit 3932a86b upstream. While debugging timeouts happening in my application workload (ScyllaDB), I have observed calls to open() taking a long time, ranging everywhere from 2 seconds - the first ones that are enough to time out my application - to more than 30 seconds. The problem seems to happen because XFS may block on pending metadata updates under certain circumnstances, and that's confirmed with the following backtrace taken by the offcputime tool (iovisor/bcc): ffffffffb90c57b1 finish_task_switch ffffffffb97dffb5 schedule ffffffffb97e310c schedule_timeout ffffffffb97e1f12 __down ffffffffb90ea821 down ffffffffc046a9dc xfs_buf_lock ffffffffc046abfb _xfs_buf_find ffffffffc046ae4a xfs_buf_get_map ffffffffc046babd xfs_buf_read_map ffffffffc0499931 xfs_trans_read_buf_map ffffffffc044a561 xfs_da_read_buf ffffffffc0451390 xfs_dir3_leaf_read.constprop.16 ffffffffc0452b90 xfs_dir2_leaf_lookup_int ffffffffc0452e0f xfs_dir2_leaf_lookup ffffffffc044d9d3 xfs_dir_lookup ffffffffc047d1d9 xfs_lookup ffffffffc0479e53 xfs_vn_lookup ffffffffb925347a path_openat ffffffffb9254a71 do_filp_open ffffffffb9242a94 do_sys_open ffffffffb9242b9e sys_open ffffffffb97e42b2 entry_SYSCALL_64_fastpath 00007fb0698162ed [unknown] Inspecting my run with blktrace, I can see that the xfsaild kthread exhibit very high "Dispatch wait" times, on the dozens of seconds range and consistent with the open() times I have saw in that run. Still from the blktrace output, we can after searching a bit, identify the request that wasn't dispatched: 8,0 11 152 81.092472813 804 A WM 141698288 + 8 <- (8,1) 141696240 8,0 11 153 81.092472889 804 Q WM 141698288 + 8 [xfsaild/sda1] 8,0 11 154 81.092473207 804 G WM 141698288 + 8 [xfsaild/sda1] 8,0 11 206 81.092496118 804 I WM 141698288 + 8 ( 22911) [xfsaild/sda1] <==== 'I' means Inserted (into the IO scheduler) ===================================> 8,0 0 289372 96.718761435 0 D WM 141698288 + 8 (15626265317) [swapper/0] <==== Only 15s later the CFQ scheduler dispatches the request ======================> As we can see above, in this particular example CFQ took 15 seconds to dispatch this request. Going back to the full trace, we can see that the xfsaild queue had plenty of opportunity to run, and it was selected as the active queue many times. It would just always be preempted by something else (example): 8,0 1 0 81.117912979 0 m N cfq1618SN / insert_request 8,0 1 0 81.117913419 0 m N cfq1618SN / add_to_rr 8,0 1 0 81.117914044 0 m N cfq1618SN / preempt 8,0 1 0 81.117914398 0 m N cfq767A / slice expired t=1 8,0 1 0 81.117914755 0 m N cfq767A / resid=40 8,0 1 0 81.117915340 0 m N / served: vt=1948520448 min_vt=1948520448 8,0 1 0 81.117915858 0 m N cfq767A / sl_used=1 disp=0 charge=0 iops=1 sect=0 where cfq767 is the xfsaild queue and cfq1618 corresponds to one of the ScyllaDB IO dispatchers. The requests preempting the xfsaild queue are synchronous requests. That's a characteristic of ScyllaDB workloads, as we only ever issue O_DIRECT requests. While it can be argued that preempting ASYNC requests in favor of SYNC is part of the CFQ logic, I don't believe that doing so for 15+ seconds is anyone's goal. Moreover, unless I am misunderstanding something, that breaks the expectation set by the "fifo_expire_async" tunable, which in my system is set to the default. Looking at the code, it seems to me that the issue is that after we make an async queue active, there is no guarantee that it will execute any request. When the queue itself tests if it cfq_may_dispatch() it can bail if it sees SYNC requests in flight. An incoming request from another queue can also preempt it in such situation before we have the chance to execute anything (as seen in the trace above). This patch sets the must_dispatch flag if we notice that we have requests that are already fifo_expired. This flag is always cleared after cfq_dispatch_request() returns from cfq_dispatch_requests(), so it won't pin the queue for subsequent requests (unless they are themselves expired) Care is taken during preempt to still allow rt requests to preempt us regardless. Testing my workload with this patch applied produces much better results. From the application side I see no timeouts, and the open() latency histogram generated by systemtap looks much better, with the worst outlier at 131ms: Latency histogram of xfs_buf_lock acquisition (microseconds): value |-------------------------------------------------- count 0 | 11 1 |@@@@ 161 2 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 1966 4 |@ 54 8 | 36 16 | 7 32 | 0 64 | 0 ~ 1024 | 0 2048 | 0 4096 | 1 8192 | 1 16384 | 2 32768 | 0 65536 | 0 131072 | 1 262144 | 0 524288 | 0 Signed-off-by: Glauber Costa <glauber@scylladb.com> CC: Jens Axboe <axboe@kernel.dk> CC: linux-block@vger.kernel.org CC: linux-kernel@vger.kernel.org Signed-off-by: Glauber Costa <glauber@scylladb.com> Signed-off-by: Jens Axboe <axboe@fb.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
David Howells authored
commit a818101d upstream. An NULL-pointer dereference happens in cachefiles_mark_object_inactive() when it tries to read i_blocks so that it can tell the cachefilesd daemon how much space it's making available. The problem is that cachefiles_drop_object() calls cachefiles_mark_object_inactive() after calling cachefiles_delete_object() because the object being marked active staves off attempts to (re-)use the file at that filename until after it has been deleted. This means that d_inode is NULL by the time we come to try to access it. To fix the problem, have the caller of cachefiles_mark_object_inactive() supply the number of blocks freed up. Without this, the following oops may occur: BUG: unable to handle kernel NULL pointer dereference at 0000000000000098 IP: [<ffffffffa06c5cc1>] cachefiles_mark_object_inactive+0x61/0xb0 [cachefiles] ... CPU: 11 PID: 527 Comm: kworker/u64:4 Tainted: G I ------------ 3.10.0-470.el7.x86_64 #1 Hardware name: Hewlett-Packard HP Z600 Workstation/0B54h, BIOS 786G4 v03.19 03/11/2011 Workqueue: fscache_object fscache_object_work_func [fscache] task: ffff880035edaf10 ti: ffff8800b77c0000 task.ti: ffff8800b77c0000 RIP: 0010:[<ffffffffa06c5cc1>] cachefiles_mark_object_inactive+0x61/0xb0 [cachefiles] RSP: 0018:ffff8800b77c3d70 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff8800bf6cc400 RCX: 0000000000000034 RDX: 0000000000000000 RSI: ffff880090ffc710 RDI: ffff8800bf761ef8 RBP: ffff8800b77c3d88 R08: 2000000000000000 R09: 0090ffc710000000 R10: ff51005d2ff1c400 R11: 0000000000000000 R12: ffff880090ffc600 R13: ffff8800bf6cc520 R14: ffff8800bf6cc400 R15: ffff8800bf6cc498 FS: 0000000000000000(0000) GS:ffff8800bb8c0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 0000000000000098 CR3: 00000000019ba000 CR4: 00000000000007e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Stack: ffff880090ffc600 ffff8800bf6cc400 ffff8800867df140 ffff8800b77c3db0 ffffffffa06c48cb ffff880090ffc600 ffff880090ffc180 ffff880090ffc658 ffff8800b77c3df0 ffffffffa085d846 ffff8800a96b8150 ffff880090ffc600 Call Trace: [<ffffffffa06c48cb>] cachefiles_drop_object+0x6b/0xf0 [cachefiles] [<ffffffffa085d846>] fscache_drop_object+0xd6/0x1e0 [fscache] [<ffffffffa085d615>] fscache_object_work_func+0xa5/0x200 [fscache] [<ffffffff810a605b>] process_one_work+0x17b/0x470 [<ffffffff810a6e96>] worker_thread+0x126/0x410 [<ffffffff810a6d70>] ? rescuer_thread+0x460/0x460 [<ffffffff810ae64f>] kthread+0xcf/0xe0 [<ffffffff810ae580>] ? kthread_create_on_node+0x140/0x140 [<ffffffff81695418>] ret_from_fork+0x58/0x90 [<ffffffff810ae580>] ? kthread_create_on_node+0x140/0x140 The oopsing code shows: callq 0xffffffff810af6a0 <wake_up_bit> mov 0xf8(%r12),%rax mov 0x30(%rax),%rax mov 0x98(%rax),%rax <---- oops here lock add %rax,0x130(%rbx) where this is: d_backing_inode(object->dentry)->i_blocks Fixes: a5b3a80b (CacheFiles: Provide read-and-reset release counters for cachefilesd) Reported-by: Jianhong Yin <jiyin@redhat.com> Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jeff Layton <jlayton@redhat.com> Reviewed-by: Steve Dickson <steved@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Miklos Szeredi authored
commit f2b20f6e upstream. This fixes a bug where the permission was not properly checked in overlayfs. The testcase is ltp/utimensat01. It is also cleaner and safer to do the permission checking in the vfs helper instead of the caller. This patch introduces an additional ia_valid flag ATTR_TOUCH (since touch(1) is the most obvious user of utimes(NULL)) that is passed into notify_change whenever the conditions for this special permission checking mode are met. Reported-by: Aihua Zhang <zhangaihua1@huawei.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Tested-by: Aihua Zhang <zhangaihua1@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Marcelo Ricardo Leitner authored
commit 3a8db798 upstream. After backporting commit ee44b4bc ("dlm: use sctp 1-to-1 API") series to a kernel with an older workqueue which didn't use RCU yet, it was noticed that we are freeing the workqueues in dlm_lowcomms_stop() too early as free_conn() will try to access that memory for canceling the queued works if any. This issue was introduced by commit 0d737a8c as before it such attempt to cancel the queued works wasn't performed, so the issue was not present. This patch fixes it by simply inverting the free order. Fixes: 0d737a8c ("dlm: fix race while closing connections") Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David Teigland <teigland@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Marcelo Cerri authored
commit 80da44c2 upstream. This patch changes the p8_ghash driver to use ghash-generic as a fixed fallback implementation. This allows the correct value of descsize to be defined directly in its shash_alg structure and avoids problems with incorrect buffer sizes when its state is exported or imported. Reported-by: Jan Stancek <jstancek@redhat.com> Fixes: cc333cd6 ("crypto: vmx - Adding GHASH routines for VMX module") Signed-off-by: Marcelo Cerri <marcelo.cerri@canonical.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Marcelo Cerri authored
commit a397ba82 upstream. Move common values and types used by ghash-generic to a new header file so drivers can directly use ghash-generic as a fallback implementation. Fixes: cc333cd6 ("crypto: vmx - Adding GHASH routines for VMX module") Signed-off-by: Marcelo Cerri <marcelo.cerri@canonical.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Jan Kara authored
commit 9b623df6 upstream. When zeroing blocks for DAX allocations, we also have to unmap aliases in the block device mappings. Otherwise writeback can overwrite zeros with stale data from block device page cache. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
gmail authored
commit e81d4477 upstream. The commit 6050d47a: "ext4: bail out from make_indexed_dir() on first error" could end up leaking bh2 in the error path. [ Also avoid renaming bh2 to bh, which just confuses things --tytso ] Signed-off-by: yangsheng <yngsion@gmail.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Ross Zwisler authored
commit cca32b7e upstream. Currently when doing a DAX hole punch with ext4 we fail to do a writeback. This is because the logic around filemap_write_and_wait_range() in ext4_punch_hole() only looks for dirty page cache pages in the radix tree, not for dirty DAX exceptional entries. Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Fabian Frederick authored
commit edf15aa1 upstream. Running xfstests generic/013 with kmemleak gives the following: unreferenced object 0xffff8801d3d27de0 (size 96): comm "fsstress", pid 4941, jiffies 4294860168 (age 53.485s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 01 00 00 00 00 00 ................ 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<ffffffff818eaaf3>] kmemleak_alloc+0x23/0x40 [<ffffffff81179805>] __kmalloc+0xf5/0x1d0 [<ffffffff8122ef5c>] ext4_find_extent+0x1ec/0x2f0 [<ffffffff8123530c>] ext4_insert_range+0x34c/0x4a0 [<ffffffff81235942>] ext4_fallocate+0x4e2/0x8b0 [<ffffffff81181334>] vfs_fallocate+0x134/0x210 [<ffffffff8118203f>] SyS_fallocate+0x3f/0x60 [<ffffffff818efa9b>] entry_SYSCALL_64_fastpath+0x13/0x8f [<ffffffffffffffff>] 0xffffffffffffffff Problem seems mitigated by dropping refs and freeing path when there's no path[depth].p_ext Signed-off-by: Fabian Frederick <fabf@skynet.be> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
wangguang authored
commit 4e800c03 upstream. Pages clear buffers after ext4 delayed block allocation failed, However, it does not clean its pte_dirty flag. if the pages unmap ,in cording to the pte_dirty , unmap_page_range may try to call __set_page_dirty, which may lead to the bugon at mpage_prepare_extent_to_map:head = page_buffers(page);. This patch just call clear_page_dirty_for_io to clean pte_dirty at mpage_release_unused_pages for pages mmaped. Steps to reproduce the bug: (1) mmap a file in ext4 addr = (char *)mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0); memset(addr, 'i', 4096); (2) return EIO at ext4_writepages->mpage_map_and_submit_extent->mpage_map_one_extent which causes this log message to be print: ext4_msg(sb, KERN_CRIT, "Delayed block allocation failed for " "inode %lu at logical offset %llu with" " max blocks %u with error %d", inode->i_ino, (unsigned long long)map->m_lblk, (unsigned)map->m_len, -err); (3)Unmap the addr cause warning at __set_page_dirty:WARN_ON_ONCE(warn && !PageUptodate(page)); (4) wait for a minute,then bugon happen. Signed-off-by: wangguang <wangguang03@zte.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Daeho Jeong authored
commit 93e3b4e6 upstream. Now, ext4_do_update_inode() clears high 16-bit fields of uid/gid of deleted and evicted inode to fix up interoperability with old kernels. However, it checks only i_dtime of an inode to determine whether the inode was deleted and evicted, and this is very risky, because i_dtime can be used for the pointer maintaining orphan inode list, too. We need to further check whether the i_dtime is being used for the orphan inode list even if the i_dtime is not NULL. We found that high 16-bit fields of uid/gid of inode are unintentionally and permanently cleared when the inode truncation is just triggered, but not finished, and the inode metadata, whose high uid/gid bits are cleared, is written on disk, and the sudden power-off follows that in order. Signed-off-by: Daeho Jeong <daeho.jeong@samsung.com> Signed-off-by: Hobin Woo <hobin.woo@samsung.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Eric Whitney authored
commit 14fbd4aa upstream. Online defragging of encrypted files is not currently implemented. However, the move extent ioctl can still return successfully when called. For example, this occurs when xfstest ext4/020 is run on an encrypted file system, resulting in a corrupted test file and a corresponding test failure. Until the proper functionality is implemented, fail the move extent ioctl if either the original or donor file is encrypted. Signed-off-by: Eric Whitney <enwlinux@gmail.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Wei Fang authored
commit c2a9737f upstream. We triggered a deadloop in truncate_inode_pages_range() on 32 bits architecture with the test case bellow: ... fd = open(); write(fd, buf, 4096); preadv64(fd, &iovec, 1, 0xffffffff000); ftruncate(fd, 0); ... Then ftruncate() will not return forever. The filesystem used in this case is ubifs, but it can be triggered on many other filesystems. When preadv64() is called with offset=0xffffffff000, a page with index=0xffffffff will be added to the radix tree of ->mapping. Then this page can be found in ->mapping with pagevec_lookup(). After that, truncate_inode_pages_range(), which is called in ftruncate(), will fall into an infinite loop: - find a page with index=0xffffffff, since index>=end, this page won't be truncated - index++, and index become 0 - the page with index=0xffffffff will be found again The data type of index is unsigned long, so index won't overflow to 0 on 64 bits architecture in this case, and the dead loop won't happen. Since truncate_inode_pages_range() is executed with holding lock of inode->i_rwsem, any operation related with this lock will be blocked, and a hung task will happen, e.g.: INFO: task truncate_test:3364 blocked for more than 120 seconds. ... call_rwsem_down_write_failed+0x17/0x30 generic_file_write_iter+0x32/0x1c0 ubifs_write_iter+0xcc/0x170 __vfs_write+0xc4/0x120 vfs_write+0xb2/0x1b0 SyS_write+0x46/0xa0 The page with index=0xffffffff added to ->mapping is useless. Fix this by checking the read position before allocating pages. Link: http://lkml.kernel.org/r/1475151010-40166-1-git-send-email-fangwei1@huawei.comSigned-off-by: Wei Fang <fangwei1@huawei.com> Cc: Christoph Hellwig <hch@infradead.org> Cc: Dave Chinner <david@fromorbit.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Brian King authored
commit 07d0e9a8 upstream. If a VFC port gets unmapped in the VIOS, it may not respond with a CRQ init complete following H_REG_CRQ. If this occurs, we can end up having called scsi_block_requests and not a resulting unblock until the init complete happens, which may never occur, and we end up hanging I/O requests. This patch ensures the host action stay set to IBMVFC_HOST_ACTION_TGT_DEL so we move all rports into devloss state and unblock unless we receive an init complete. Signed-off-by: Brian King <brking@linux.vnet.ibm.com> Acked-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Borislav Petkov authored
commit 4bd173c3 upstream. Do the user_len check first and then the ver_addr allocation so that we can save us the kfree() on the error path when user_len is > ARCMSR_API_DATA_BUFLEN. Signed-off-by: Borislav Petkov <bp@suse.de> Cc: Marco Grassi <marco.gra@gmail.com> Cc: Dan Carpenter <dan.carpenter@oracle.com> Cc: Tomas Henzl <thenzl@redhat.com> Cc: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Tomas Henzl <thenzl@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Dan Carpenter authored
commit 7bc2b55a upstream. We need to put an upper bound on "user_len" so the memcpy() doesn't overflow. Reported-by: Marco Grassi <marco.gra@gmail.com> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Tomas Henzl <thenzl@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Justin Maggard authored
commit c8475090 upstream. Add missing dmaengine_unmap_put(), so we don't OOM during RAID6 sync. Fixes: 1786b943 ("async_pq_val: convert to dmaengine_unmap_data") Signed-off-by: Justin Maggard <jmaggard@netgear.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Vinod Koul <vinod.koul@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Mike Galbraith authored
commit 420902c9 upstream. If we hold the superblock lock while calling reiserfs_quota_on_mount(), we can deadlock our own worker - mount blocks kworker/3:2, sleeps forever more. crash> ps|grep UN 715 2 3 ffff880220734d30 UN 0.0 0 0 [kworker/3:2] 9369 9341 2 ffff88021ffb7560 UN 1.3 493404 123184 Xorg 9665 9664 3 ffff880225b92ab0 UN 0.0 47368 812 udisks-daemon 10635 10403 3 ffff880222f22c70 UN 0.0 14904 936 mount crash> bt ffff880220734d30 PID: 715 TASK: ffff880220734d30 CPU: 3 COMMAND: "kworker/3:2" #0 [ffff8802244c3c20] schedule at ffffffff8144584b #1 [ffff8802244c3cc8] __rt_mutex_slowlock at ffffffff814472b3 #2 [ffff8802244c3d28] rt_mutex_slowlock at ffffffff814473f5 #3 [ffff8802244c3dc8] reiserfs_write_lock at ffffffffa05f28fd [reiserfs] #4 [ffff8802244c3de8] flush_async_commits at ffffffffa05ec91d [reiserfs] #5 [ffff8802244c3e08] process_one_work at ffffffff81073726 #6 [ffff8802244c3e68] worker_thread at ffffffff81073eba #7 [ffff8802244c3ec8] kthread at ffffffff810782e0 #8 [ffff8802244c3f48] kernel_thread_helper at ffffffff81450064 crash> rd ffff8802244c3cc8 10 ffff8802244c3cc8: ffffffff814472b3 ffff880222f23250 .rD.....P2.".... ffff8802244c3cd8: 0000000000000000 0000000000000286 ................ ffff8802244c3ce8: ffff8802244c3d30 ffff880220734d80 0=L$.....Ms .... ffff8802244c3cf8: ffff880222e8f628 0000000000000000 (.."............ ffff8802244c3d08: 0000000000000000 0000000000000002 ................ crash> struct rt_mutex ffff880222e8f628 struct rt_mutex { wait_lock = { raw_lock = { slock = 65537 } }, wait_list = { node_list = { next = 0xffff8802244c3d48, prev = 0xffff8802244c3d48 } }, owner = 0xffff880222f22c71, save_state = 0 } crash> bt 0xffff880222f22c70 PID: 10635 TASK: ffff880222f22c70 CPU: 3 COMMAND: "mount" #0 [ffff8802216a9868] schedule at ffffffff8144584b #1 [ffff8802216a9910] schedule_timeout at ffffffff81446865 #2 [ffff8802216a99a0] wait_for_common at ffffffff81445f74 #3 [ffff8802216a9a30] flush_work at ffffffff810712d3 #4 [ffff8802216a9ab0] schedule_on_each_cpu at ffffffff81074463 #5 [ffff8802216a9ae0] invalidate_bdev at ffffffff81178aba #6 [ffff8802216a9af0] vfs_load_quota_inode at ffffffff811a3632 #7 [ffff8802216a9b50] dquot_quota_on_mount at ffffffff811a375c #8 [ffff8802216a9b80] finish_unfinished at ffffffffa05dd8b0 [reiserfs] #9 [ffff8802216a9cc0] reiserfs_fill_super at ffffffffa05de825 [reiserfs] RIP: 00007f7b9303997a RSP: 00007ffff443c7a8 RFLAGS: 00010202 RAX: 00000000000000a5 RBX: ffffffff8144ef12 RCX: 00007f7b932e9ee0 RDX: 00007f7b93d9a400 RSI: 00007f7b93d9a3e0 RDI: 00007f7b93d9a3c0 RBP: 00007f7b93d9a2c0 R8: 00007f7b93d9a550 R9: 0000000000000001 R10: ffffffffc0ed040e R11: 0000000000000202 R12: 000000000000040e R13: 0000000000000000 R14: 00000000c0ed040e R15: 00007ffff443ca20 ORIG_RAX: 00000000000000a5 CS: 0033 SS: 002b Signed-off-by: Mike Galbraith <efault@gmx.de> Acked-by: Frederic Weisbecker <fweisbec@gmail.com> Acked-by: Mike Galbraith <mgalbraith@suse.de> Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Nicolas Iooss authored
commit 61ab0d40 upstream. In sst_prepare_and_post_msg(), when a response is received in "block", the following code gets executed: *data = kzalloc(block->size, GFP_KERNEL); memcpy(data, (void *) block->data, block->size); The memcpy() call overwrites the content of the *data pointer instead of filling the newly-allocated memory (which pointer is hold by *data). Fix this by merging kzalloc+memcpy into a single kmemdup() call. Thanks Joe Perches for suggesting using kmemdup() Fixes: 60dc8dba ("ASoC: Intel: sst: Add some helper functions") Signed-off-by: Nicolas Iooss <nicolas.iooss_linux@m4x.org> Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Rafał Miłecki authored
commit 7f00ee2b upstream. Flowrings contain skbs waiting for transmission that were passed to us by netif. It means we checked every one of them looking for 802.1x Ethernet type. When deleting flowring we have to use freeing function that will check for 802.1x type as well. Freeing skbs without a proper check was leading to counter not being properly decreased. This was triggering a WARNING every time brcmf_netdev_wait_pend8021x was called. Signed-off-by: Rafał Miłecki <rafal@milecki.pl> Acked-by: Arend van Spriel <arend@broadcom.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Rafał Miłecki authored
commit 23e9c128 upstream. This function is called from get_station callback which means that every time user space was getting/dumping station(s) we were leaking 2 KiB. Signed-off-by: Rafał Miłecki <rafal@milecki.pl> Fixes: 1f0dc59a ("brcmfmac: rework .get_station() callback") Acked-by: Arend van Spriel <arend.vanspriel@broadcom.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Nicolas Iooss authored
commit 7703773e upstream. The struct cfg80211_pmksa defines its bssid field as: const u8 *bssid; contrary to struct brcmf_pmksa, which uses: u8 bssid[ETH_ALEN]; Therefore in brcmf_cfg80211_del_pmksa(), &pmksa->bssid takes the address of this field (of type u8**), not the one of its content (which would be u8*). Remove the & operator to make brcmf_dbg("%pM") and memcmp() behave as expected. This bug have been found using a custom static checker (which checks the usage of %p... attributes at build time). It has been introduced in commit 6c404f34 ("brcmfmac: Cleanup pmksa cache handling code"), which replaced pmksa->bssid by &pmksa->bssid while refactoring the code, without modifying struct cfg80211_pmksa definition. Replace &pmk[i].bssid with pmk[i].bssid too to make the code clearer, this change does not affect the semantic. Fixes: 6c404f34 ("brcmfmac: Cleanup pmksa cache handling code") Signed-off-by: Nicolas Iooss <nicolas.iooss_linux@m4x.org> Signed-off-by: Kalle Valo <kvalo@codeaurora.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Dave Chinner authored
commit 541d48f0 upstream. oss.sgi.com is going away, move contact details over to vger. Signed-off-by: Dave Chinner <david@fromorbit.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Guilherme G Piccoli authored
commit edfc23ee upstream. Although rare, it's possible to hit PCI error early on device probe, meaning possibly some structs are not entirely initialized, and some might even be completely uninitialized, leading to NULL pointer dereference. The i40e driver currently presents a "bad" behavior if device hits such early PCI error: firstly, the struct i40e_pf might not be attached to pci_dev yet, leading to a NULL pointer dereference on access to pf->state. Even checking if the struct is NULL and avoiding the access in that case isn't enough, since the driver cannot recover from PCI error that early; in our experiments we saw multiple failures on kernel log, like: [549.664] i40e 0007:01:00.1: Initial pf_reset failed: -15 [549.664] i40e: probe of 0007:01:00.1 failed with error -15 [...] [871.644] i40e 0007:01:00.1: The driver for the device stopped because the device firmware failed to init. Try updating your NVM image. [871.644] i40e: probe of 0007:01:00.1 failed with error -32 [...] [872.516] i40e 0007:01:00.0: ARQ: Unknown event 0x0000 ignored Between the first probe failure (error -15) and the second (error -32) another PCI error happened due to the first bad probe. Also, driver started to flood console with those ARQ event messages. This patch will prevent these issues by allowing error recovery mechanism to remove the failed device from the system instead of trying to recover from early PCI errors during device probe. Signed-off-by: Guilherme G Piccoli <gpiccoli@linux.vnet.ibm.com> Acked-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Miklos Szeredi authored
commit a09f99ed upstream. Fuse allowed VFS to set mode in setattr in order to clear suid/sgid on chown and truncate, and (since writeback_cache) write. The problem with this is that it'll potentially restore a stale mode. The poper fix would be to let the filesystems do the suid/sgid clearing on the relevant operations. Possibly some are already doing it but there's no way we can detect this. So fix this by refreshing and recalculating the mode. Do this only if ATTR_KILL_S[UG]ID is set to not destroy performance for writes. This is still racy but the size of the window is reduced. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Miklos Szeredi authored
commit 5e2b8828 upstream. Without "default_permissions" the userspace filesystem's lookup operation needs to perform the check for search permission on the directory. If directory does not allow search for everyone (this is quite rare) then userspace filesystem has to set entry timeout to zero to make sure permissions are always performed. Changing the mode bits of the directory should also invalidate the (previously cached) dentry to make sure the next lookup will have a chance of updating the timeout, if needed. Reported-by: Jean-Pierre André <jean-pierre.andre@wanadoo.fr> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Miklos Szeredi authored
commit cb3ae6d2 upstream. Make sure userspace filesystem is returning a well formed list of xattr names (zero or more nonzero length, null terminated strings). [Michael Theall: only verify in the nonzero size case] Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Marcin Wojtas authored
commit a0245eb7 upstream. Original commit, which added support for Armada CP110 system controller used global variables for storing all clock information. It worked fine for Armada 7k SoC, with single CP110 block. After dual-CP110 Armada 8k was introduced, the data got overwritten and corrupted. This patch fixes the issue by allocating resources dynamically in the driver probe and storing it as platform drvdata. Fixes: d3da3eae ("clk: mvebu: new driver for Armada CP110 system ...") Signed-off-by: Marcin Wojtas <mw@semihalf.com> Reviewed-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: Stephen Boyd <sboyd@codeaurora.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Marcin Wojtas authored
commit ad715b26 upstream. Armada CP110 system controller comprises its own routine responsble for registering gate clocks. Among others 'flags' field in struct clk_init_data was not set, using a random values, which may cause an unpredicted behavior. This patch fixes the problem by resetting all fields of clk_init_data before assigning values for all gated clocks of Armada 7k/8k SoCs family. Fixes: d3da3eae ("clk: mvebu: new driver for Armada CP110 system ...") Signed-off-by: Marcin Wojtas <mw@semihalf.com> Signed-off-by: Stephen Boyd <sboyd@codeaurora.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Mike Marciniszyn authored
commit 72f53af2 upstream. There is a a bug in defered ack stuff that causes a race with the destroy of a QP. A packet causes a defered ack to be pended by putting the QP into an rcd queue. A return from the driver interrupt processing will process that rcd queue of QPs and attempt to do a direct send of the ack. At this point no locks are held and the above QP could now be put in the reset state in the qp destroy logic. A refcount protects the QP while it is in the rcd queue so it isn't going anywhere yet. If the direct send fails to allocate a pio buffer, hfi1_schedule_send() is called to trigger sending an ack from the send engine. There is no state test in that code path. The refcount is then dropped from the driver.c caller potentially allowing the qp destroy to continue from its refcount wait in parallel with the workqueue scheduling of the qp. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Peng Fan authored
commit 85714108 upstream. When dma_common_free_remap, the input parameter 'size' may not be page aligned. And, met kernel warning when doing iommu dma for usb on i.MX8 platform: " WARNING: CPU: 0 PID: 869 at mm/vmalloc.c:70 vunmap_page_range+0x1cc/0x1d0() Modules linked in: CPU: 0 PID: 869 Comm: kworker/u8:2 Not tainted 4.1.12-00444-gc5f9d1d-dirty #147 Hardware name: Freescale i.MX8DV Sabreauto (DT) Workqueue: ci_otg ci_otg_work Call trace: [<ffffffc000089920>] dump_backtrace+0x0/0x124 [<ffffffc000089a54>] show_stack+0x10/0x1c [<ffffffc0006d1e6c>] dump_stack+0x84/0xc8 [<ffffffc0000b4568>] warn_slowpath_common+0x98/0xd0 [<ffffffc0000b4664>] warn_slowpath_null+0x14/0x20 [<ffffffc000170348>] vunmap_page_range+0x1c8/0x1d0 [<ffffffc000170388>] unmap_kernel_range+0x20/0x88 [<ffffffc000460ad0>] dma_common_free_remap+0x74/0x84 [<ffffffc0000940d8>] __iommu_free_attrs+0x9c/0x178 [<ffffffc0005032bc>] ehci_mem_cleanup+0x140/0x194 [<ffffffc000503548>] ehci_stop+0x8c/0xdc [<ffffffc0004e8258>] usb_remove_hcd+0xf0/0x1cc [<ffffffc000516bc0>] host_stop+0x1c/0x58 [<ffffffc000514240>] ci_otg_work+0xdc/0x120 [<ffffffc0000c9c34>] process_one_work+0x134/0x33c [<ffffffc0000c9f78>] worker_thread+0x13c/0x47c [<ffffffc0000cf43c>] kthread+0xd8/0xf0 " For dma_common_pages_remap: dma_common_pages_remap |->get_vm_area_caller |->__get_vm_area_node |->size = PAGE_ALIGN(size); Round up to page aligned So, in dma_common_free_remap, we also need a page aligned size, pass 'PAGE_ALIGN(size)' to unmap_kernel_range. Signed-off-by: Peng Fan <van.freenix@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Alexander Usyskin authored
commit e728ae27 upstream. The device lock was unnecessary obtained in bus rescan work before the amthif client search. That causes incorrect lock ordering and task hang: ... [88004.613213] INFO: task kworker/1:14:21832 blocked for more than 120 seconds. ... [88004.645934] Workqueue: events mei_cl_bus_rescan_work ... The correct lock order is cl_bus_lock device_lock me_clients_rwsem Move device_lock into amthif init function that called after me_clients_rwsem is released. This fixes regression introduced by commit: commit 025fb792 ("mei: split amthif client init from end of clients enumeration") Signed-off-by: Alexander Usyskin <alexander.usyskin@intel.com> Signed-off-by: Tomas Winkler <tomas.winkler@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Junjie Mao authored
commit 14155caf upstream. Fixes: 4246a0b6 ("block: add a bi_error field to struct bio") Signed-off-by: Junjie Mao <junjie.mao@enight.me> Acked-by: David Sterba <dsterba@suse.cz> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Omar Sandoval authored
commit 6675df31 upstream. There are two separate issues that can lead to corrupted free space trees. 1. The free space tree bitmaps had an endianness issue on big-endian systems which is fixed by an earlier patch in this series. 2. btrfs-progs before v4.7.3 modified filesystems without updating the free space tree. To catch both of these issues at once, we need to force the free space tree to be rebuilt. To do so, add a FREE_SPACE_TREE_VALID compat_ro bit. If the bit isn't set, we know that it was either produced by a broken big-endian kernel or may have been corrupted by btrfs-progs. This also provides us with a way to add rudimentary read-write support for the free space tree to btrfs-progs: it can just clear this bit and have the kernel rebuild the free space tree. Tested-by: Holger Hoffstätte <holger@applied-asynchrony.com> Tested-by: Chandan Rajendra <chandan@linux.vnet.ibm.com> Signed-off-by: Omar Sandoval <osandov@fb.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Omar Sandoval authored
commit 2fe1d551 upstream. In convert_free_space_to_{bitmaps,extents}(), we buffer the free space bitmaps in memory and copy them directly to/from the extent buffers with {read,write}_extent_buffer(). The extent buffer bitmap helpers use byte granularity, which is equivalent to a little-endian bitmap. This means that on big-endian systems, the in-memory bitmaps will be written to disk byte-swapped. To fix this, use byte-granularity for the bitmaps in memory. Fixes: a5ed9182 ("Btrfs: implement the free space B-tree") Tested-by: Holger Hoffstätte <holger@applied-asynchrony.com> Tested-by: Chandan Rajendra <chandan@linux.vnet.ibm.com> Signed-off-by: Omar Sandoval <osandov@fb.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Christian Lamparter authored
commit 6ee6d1cb upstream. Ben Greear reported: > I see lots of instability as soon as I load up the carl9710 NIC. > My application is going to be poking at it's debugfs files... > > BUG: KASAN: slab-out-of-bounds in carl9170_debugfs_read+0xd5/0x2a0 > [carl9170] at addr 0xffff8801bc1208b0 > Read of size 8 by task btserver/5888 > ======================================================================= > BUG kmalloc-256 (Tainted: G W ): kasan: bad access detected > ----------------------------------------------------------------------- > > INFO: Allocated in seq_open+0x50/0x100 age=2690 cpu=2 pid=772 >... This breakage was caused by the introduction of intermediate fops in debugfs by commit 9fd4dcec ("debugfs: prevent access to possibly dead file_operations at file open") Thankfully, the original/real fops are still available in d_fsdata. Reported-by: Ben Greear <greearb@candelatech.com> Signed-off-by: Christian Lamparter <chunkeey@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Christian Lamparter authored
commit 9c4a45b1 upstream. This patch fixes a crash that happens because b43legacy's debugfs code expects file->f_op to be a pointer to its own b43legacy_debugfs_fops struct. This is no longer the case since commit 9fd4dcec ("debugfs: prevent access to possibly dead file_operations at file open") Reviewed-by: Nicolai Stange <nicstange@gmail.com> Signed-off-by: Christian Lamparter <chunkeey@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Christian Lamparter authored
commit 51b275a6 upstream. This patch fixes a crash that happens because b43's debugfs code expects file->f_op to be a pointer to its own b43_debugfs_fops struct. This is no longer the case since commit 9fd4dcec ("debugfs: prevent access to possibly dead file_operations at file open") Reviewed-by: Nicolai Stange <nicstange@gmail.com> Signed-off-by: Christian Lamparter <chunkeey@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-