- 12 Sep, 2020 40 commits
-
-
Alex Williamson authored
commit abafbc55 upstream. Accessing the disabled memory space of a PCI device would typically result in a master abort response on conventional PCI, or an unsupported request on PCI express. The user would generally see these as a -1 response for the read return data and the write would be silently discarded, possibly with an uncorrected, non-fatal AER error triggered on the host. Some systems however take it upon themselves to bring down the entire system when they see something that might indicate a loss of data, such as this discarded write to a disabled memory space. To avoid this, we want to try to block the user from accessing memory spaces while they're disabled. We start with a semaphore around the memory enable bit, where writers modify the memory enable state and must be serialized, while readers make use of the memory region and can access in parallel. Writers include both direct manipulation via the command register, as well as any reset path where the internal mechanics of the reset may both explicitly and implicitly disable memory access, and manipulation of the MSI-X configuration, where the MSI-X vector table resides in MMIO space of the device. Readers include the read and write file ops to access the vfio device fd offsets as well as memory mapped access. In the latter case, we make use of our new vma list support to zap, or invalidate, those memory mappings in order to force them to be faulted back in on access. Our semaphore usage will stall user access to MMIO spaces across internal operations like reset, but the user might experience new behavior when trying to access the MMIO space while disabled via the PCI command register. Access via read or write while disabled will return -EIO and access via memory maps will result in a SIGBUS. This is expected to be compatible with known use cases and potentially provides better error handling capabilities than present in the hardware, while avoiding the more readily accessible and severe platform error responses that might otherwise occur. Fixes: CVE-2020-12888 Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com> [Ajay: Regenerated the patch for v4.9] Signed-off-by: Ajay Kaher <akaher@vmware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Alex Williamson authored
commit 11c4cd07 upstream. Rather than calling remap_pfn_range() when a region is mmap'd, setup a vm_ops handler to support dynamic faulting of the range on access. This allows us to manage a list of vmas actively mapping the area that we can later use to invalidate those mappings. The open callback invalidates the vma range so that all tracking is inserted in the fault handler and removed in the close handler. Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com> [Ajay: Regenerated the patch for v4.9] Signed-off-by: Ajay Kaher <akaher@vmware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Alex Williamson authored
commit 41311242 upstream. With conversion to follow_pfn(), DMA mapping a PFNMAP range depends on the range being faulted into the vma. Add support to manually provide that, in the same way as done on KVM with hva_to_pfn_remapped(). Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com> [Ajay: Regenerated the patch for v4.9] Signed-off-by: Ajay Kaher <akaher@vmware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Eugeniu Rosca authored
commit dc07a728 upstream. Commit 52f23478 ("mm/slub.c: fix corrupted freechain in deactivate_slab()") suffered an update when picked up from LKML [1]. Specifically, relocating 'freelist = NULL' into 'freelist_corrupted()' created a no-op statement. Fix it by sticking to the behavior intended in the original patch [1]. In addition, make freelist_corrupted() immune to passing NULL instead of &freelist. The issue has been spotted via static analysis and code review. [1] https://lore.kernel.org/linux-mm/20200331031450.12182-1-dongli.zhang@oracle.com/ Fixes: 52f23478 ("mm/slub.c: fix corrupted freechain in deactivate_slab()") Signed-off-by: Eugeniu Rosca <erosca@de.adit-jv.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Cc: Dongli Zhang <dongli.zhang@oracle.com> Cc: Joe Jin <joe.jin@oracle.com> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: <stable@vger.kernel.org> Link: https://lkml.kernel.org/r/20200824130643.10291-1-erosca@de.adit-jv.comSigned-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Ye Bin authored
commit 219403d7 upstream. Maybe __create_persistent_data_objects() caller will use PTR_ERR as a pointer, it will lead to some strange things. Signed-off-by: Ye Bin <yebin10@huawei.com> Cc: stable@vger.kernel.org Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Ye Bin authored
commit d16ff19e upstream. Maybe __create_persistent_data_objects() caller will use PTR_ERR as a pointer, it will lead to some strange things. Signed-off-by: Ye Bin <yebin10@huawei.com> Cc: stable@vger.kernel.org Signed-off-by: Mike Snitzer <snitzer@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Tejun Heo authored
commit 3b545563 upstream. All three generations of Sandisk SSDs lock up hard intermittently. Experiments showed that disabling NCQ lowered the failure rate significantly and the kernel has been disabling NCQ for some models of SD7's and 8's, which is obviously undesirable. Karthik worked with Sandisk to root cause the hard lockups to trim commands larger than 128M. This patch implements ATA_HORKAGE_MAX_TRIM_128M which limits max trim size to 128M and applies it to all three generations of Sandisk SSDs. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Karthik Shivaram <karthikgs@fb.com> Cc: stable@vger.kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Bart Van Assche authored
commit 233bde21 upstream. It happens often while I'm preparing a patch for a block driver that I'm wondering: is a definition of SECTOR_SIZE and/or SECTOR_SHIFT available for this driver? Do I have to introduce definitions of these constants before I can use these constants? To avoid this confusion, move the existing definitions of SECTOR_SIZE and SECTOR_SHIFT into the <linux/blkdev.h> header file such that these become available for all block drivers. Make the SECTOR_SIZE definition in the uapi msdos_fs.h header file conditional to avoid that including that header file after <linux/blkdev.h> causes the compiler to complain about a SECTOR_SIZE redefinition. Note: the SECTOR_SIZE / SECTOR_SHIFT / SECTOR_BITS definitions have not been removed from uapi header files nor from NAND drivers in which these constants are used for another purpose than converting block layer offsets and sizes into a number of sectors. Cc: David S. Miller <davem@davemloft.net> Cc: Mike Snitzer <snitzer@redhat.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Nitin Gupta <ngupta@vflare.org> Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Ming Lei authored
commit 7e249690 upstream. Block layer usually doesn't support or allow zero-length bvec. Since commit 1bdc76ae ("iov_iter: use bvec iterator to implement iterate_bvec()"), iterate_bvec() switches to bvec iterator. However, Al mentioned that 'Zero-length segments are not disallowed' in iov_iter. Fixes for_each_bvec() so that it can move on after seeing one zero length bvec. Fixes: 1bdc76ae ("iov_iter: use bvec iterator to implement iterate_bvec()") Reported-by: syzbot <syzbot+61acc40a49a3e46e25ea@syzkaller.appspotmail.com> Signed-off-by: Ming Lei <ming.lei@redhat.com> Tested-by: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Matthew Wilcox <willy@infradead.org> Cc: <stable@vger.kernel.org> Link: https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg2262077.htmlSigned-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Takashi Sakamoto authored
commit acd46a6b upstream. Avid Adrenaline is reported that ALSA firewire-digi00x driver is bound to. However, as long as he investigated, the design of this model is hardly similar to the one of Digi 00x family. It's better to exclude the model from modalias of ALSA firewire-digi00x driver. This commit changes device entries so that the model is excluded. $ python3 crpp < ~/git/am-config-rom/misc/avid-adrenaline.img ROM header and bus information block ----------------------------------------------------------------- 400 04203a9c bus_info_length 4, crc_length 32, crc 15004 404 31333934 bus_name "1394" 408 e064a002 irmc 1, cmc 1, isc 1, bmc 0, cyc_clk_acc 100, max_rec 10 (2048) 40c 00a07e01 company_id 00a07e | 410 00085257 device_id 0100085257 | EUI-64 00a07e0100085257 root directory ----------------------------------------------------------------- 414 0005d08c directory_length 5, crc 53388 418 0300a07e vendor 41c 8100000c --> descriptor leaf at 44c 420 0c008380 node capabilities 424 8d000002 --> eui-64 leaf at 42c 428 d1000004 --> unit directory at 438 eui-64 leaf at 42c ----------------------------------------------------------------- 42c 0002410f leaf_length 2, crc 16655 430 00a07e01 company_id 00a07e | 434 00085257 device_id 0100085257 | EUI-64 00a07e0100085257 unit directory at 438 ----------------------------------------------------------------- 438 0004d6c9 directory_length 4, crc 54985 43c 1200a02d specifier id: 1394 TA 440 13014001 version: Vender Unique and AV/C 444 17000001 model 448 81000009 --> descriptor leaf at 46c descriptor leaf at 44c ----------------------------------------------------------------- 44c 00077205 leaf_length 7, crc 29189 450 00000000 textual descriptor 454 00000000 minimal ASCII 458 41766964 "Avid" 45c 20546563 " Tec" 460 686e6f6c "hnol" 464 6f677900 "ogy" 468 00000000 descriptor leaf at 46c ----------------------------------------------------------------- 46c 000599a5 leaf_length 5, crc 39333 470 00000000 textual descriptor 474 00000000 minimal ASCII 478 41647265 "Adre" 47c 6e616c69 "nali" 480 6e650000 "ne" Reported-by: Simon Wood <simon@mungewell.org> Fixes: 9edf723f ("ALSA: firewire-digi00x: add skeleton for Digi 002/003 family") Cc: <stable@vger.kernel.org> # 4.4+ Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp> Link: https://lore.kernel.org/r/20200823075545.56305-1-o-takashi@sakamocchi.jpSigned-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Takashi Iwai authored
commit 949a1ebe upstream. The PCM OSS mulaw plugin has a check of the format of the counter part whether it's a linear format. The check is with snd_BUG_ON() that emits WARN_ON() when the debug config is set, and it confuses syzkaller as if it were a serious issue. Let's drop snd_BUG_ON() for avoiding that. While we're at it, correct the error code to a more suitable, EINVAL. Reported-by: syzbot+23b22dc2e0b81cbfcc95@syzkaller.appspotmail.com Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20200901131802.18157-1-tiwai@suse.deSigned-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Tong Zhang authored
commit ee0761d1 upstream. snd_ca0106_spi_write() returns 1 on error, snd_ca0106_pcm_power_dac() is returning the error code directly, and the caller is expecting an negative error code Signed-off-by: Tong Zhang <ztong0001@gmail.com> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20200824224541.1260307-1-ztong0001@gmail.comSigned-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Rogan Dawes authored
[ Upstream commit 7d605309 ] Signed-off-by: Rogan Dawes <rogan@dawes.za.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Bjørn Mork authored
[ Upstream commit 60cfe1ea ] A new Sierra Wireless EM7305 device ID used in a Toshiba laptop, and two Longcheer device IDs entries used by Telewell TW-3G HSPA+ branded modems. Reported-by: Petr Kloc <petr_kloc@yahoo.com> Reported-by: Teemu Likonen <tlikonen@iki.fi> Signed-off-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Daniele Palmas authored
[ Upstream commit 14cf4a77 ] Telit LE920A4 uses the same pid 0x1201 of LE920, but modem implementation is different, since it requires DTR to be set for answering to qmi messages. This patch replaces QMI_FIXED_INTF with QMI_QUIRK_SET_DTR: tests on LE920 have been performed in order to verify backward compatibility. Signed-off-by: Daniele Palmas <dnlplm@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Daniele Palmas authored
[ Upstream commit e0ae2c57 ] This patch adds support for Telit FN980 0x1050 composition 0x1050: tty, adb, rmnet, tty, tty, tty, tty Signed-off-by: Daniele Palmas <dnlplm@gmail.com> Acked-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Josef Bacik authored
[ Upstream commit a48b73ec ] With the conversion of the tree locks to rwsem I got the following lockdep splat: ====================================================== WARNING: possible circular locking dependency detected 5.8.0-rc7-00165-g04ec4da5f45f-dirty #922 Not tainted ------------------------------------------------------ compsize/11122 is trying to acquire lock: ffff889fabca8768 (&mm->mmap_lock#2){++++}-{3:3}, at: __might_fault+0x3e/0x90 but task is already holding lock: ffff889fe720fe40 (btrfs-fs-00){++++}-{3:3}, at: __btrfs_tree_read_lock+0x39/0x180 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #2 (btrfs-fs-00){++++}-{3:3}: down_write_nested+0x3b/0x70 __btrfs_tree_lock+0x24/0x120 btrfs_search_slot+0x756/0x990 btrfs_lookup_inode+0x3a/0xb4 __btrfs_update_delayed_inode+0x93/0x270 btrfs_async_run_delayed_root+0x168/0x230 btrfs_work_helper+0xd4/0x570 process_one_work+0x2ad/0x5f0 worker_thread+0x3a/0x3d0 kthread+0x133/0x150 ret_from_fork+0x1f/0x30 -> #1 (&delayed_node->mutex){+.+.}-{3:3}: __mutex_lock+0x9f/0x930 btrfs_delayed_update_inode+0x50/0x440 btrfs_update_inode+0x8a/0xf0 btrfs_dirty_inode+0x5b/0xd0 touch_atime+0xa1/0xd0 btrfs_file_mmap+0x3f/0x60 mmap_region+0x3a4/0x640 do_mmap+0x376/0x580 vm_mmap_pgoff+0xd5/0x120 ksys_mmap_pgoff+0x193/0x230 do_syscall_64+0x50/0x90 entry_SYSCALL_64_after_hwframe+0x44/0xa9 -> #0 (&mm->mmap_lock#2){++++}-{3:3}: __lock_acquire+0x1272/0x2310 lock_acquire+0x9e/0x360 __might_fault+0x68/0x90 _copy_to_user+0x1e/0x80 copy_to_sk.isra.32+0x121/0x300 search_ioctl+0x106/0x200 btrfs_ioctl_tree_search_v2+0x7b/0xf0 btrfs_ioctl+0x106f/0x30a0 ksys_ioctl+0x83/0xc0 __x64_sys_ioctl+0x16/0x20 do_syscall_64+0x50/0x90 entry_SYSCALL_64_after_hwframe+0x44/0xa9 other info that might help us debug this: Chain exists of: &mm->mmap_lock#2 --> &delayed_node->mutex --> btrfs-fs-00 Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(btrfs-fs-00); lock(&delayed_node->mutex); lock(btrfs-fs-00); lock(&mm->mmap_lock#2); *** DEADLOCK *** 1 lock held by compsize/11122: #0: ffff889fe720fe40 (btrfs-fs-00){++++}-{3:3}, at: __btrfs_tree_read_lock+0x39/0x180 stack backtrace: CPU: 17 PID: 11122 Comm: compsize Kdump: loaded Not tainted 5.8.0-rc7-00165-g04ec4da5f45f-dirty #922 Hardware name: Quanta Tioga Pass Single Side 01-0030993006/Tioga Pass Single Side, BIOS F08_3A18 12/20/2018 Call Trace: dump_stack+0x78/0xa0 check_noncircular+0x165/0x180 __lock_acquire+0x1272/0x2310 lock_acquire+0x9e/0x360 ? __might_fault+0x3e/0x90 ? find_held_lock+0x72/0x90 __might_fault+0x68/0x90 ? __might_fault+0x3e/0x90 _copy_to_user+0x1e/0x80 copy_to_sk.isra.32+0x121/0x300 ? btrfs_search_forward+0x2a6/0x360 search_ioctl+0x106/0x200 btrfs_ioctl_tree_search_v2+0x7b/0xf0 btrfs_ioctl+0x106f/0x30a0 ? __do_sys_newfstat+0x5a/0x70 ? ksys_ioctl+0x83/0xc0 ksys_ioctl+0x83/0xc0 __x64_sys_ioctl+0x16/0x20 do_syscall_64+0x50/0x90 entry_SYSCALL_64_after_hwframe+0x44/0xa9 The problem is we're doing a copy_to_user() while holding tree locks, which can deadlock if we have to do a page fault for the copy_to_user(). This exists even without my locking changes, so it needs to be fixed. Rework the search ioctl to do the pre-fault and then copy_to_user_nofault for the copying. CC: stable@vger.kernel.org # 4.4+ Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Daniel Borkmann authored
[ Upstream commit 1d1585ca ] Commit 3d708182 ("uaccess: Add non-pagefault user-space read functions") missed to add probe write function, therefore factor out a probe_write_common() helper with most logic of probe_kernel_write() except setting KERNEL_DS, and add a new probe_user_write() helper so it can be used from BPF side. Again, on some archs, the user address space and kernel address space can co-exist and be overlapping, so in such case, setting KERNEL_DS would mean that the given address is treated as being in kernel address space. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Andrii Nakryiko <andriin@fb.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Link: https://lore.kernel.org/bpf/9df2542e68141bfa3addde631441ee45503856a8.1572649915.git.daniel@iogearbox.netSigned-off-by: Sasha Levin <sashal@kernel.org>
-
Masami Hiramatsu authored
[ Upstream commit 3d708182 ] Add probe_user_read(), strncpy_from_unsafe_user() and strnlen_unsafe_user() which allows caller to access user-space in IRQ context. Current probe_kernel_read() and strncpy_from_unsafe() are not available for user-space memory, because it sets KERNEL_DS while accessing data. On some arch, user address space and kernel address space can be co-exist, but others can not. In that case, setting KERNEL_DS means given address is treated as a kernel address space. Also strnlen_user() is only available from user context since it can sleep if pagefault is enabled. To access user-space memory without pagefault, we need these new functions which sets USER_DS while accessing the data. Link: http://lkml.kernel.org/r/155789869802.26965.4940338412595759063.stgit@devnote2Acked-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Josef Bacik authored
[ Upstream commit d3beaa25 ] These are special extent buffers that get rewound in order to lookup the state of the tree at a specific point in time. As such they do not go through the normal initialization paths that set their lockdep class, so handle them appropriately when they are created and before they are locked. CC: stable@vger.kernel.org # 4.4+ Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Nikolay Borisov authored
[ Upstream commit 24cee18a ] When a rewound buffer is created it already has a ref count of 1 and the dummy flag set. Then another ref is taken bumping the count to 2. Finally when this buffer is released from btrfs_release_path the extra reference is decremented by the special handling code in free_extent_buffer. However, this special code is in fact redundant sinca ref count of 1 is still correct since the buffer is only accessed via btrfs_path struct. This paves the way forward of removing the special handling in free_extent_buffer. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Nikolay Borisov authored
[ Upstream commit 6c122e2a ] get_old_root used used only by btrfs_search_old_slot to initialise the path structure. The old root is always a cloned buffer (either via alloc dummy or via btrfs_clone_extent_buffer) and its reference count is 2: 1 from allocation, 1 from extent_buffer_get call in get_old_root. This latter explicit ref count acquire operation is in fact unnecessary since the semantic is such that the newly allocated buffer is handed over to the btrfs_path for lifetime management. Considering this just remove the extra extent_buffer_get in get_old_root. Signed-off-by: Nikolay Borisov <nborisov@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Josef Bacik authored
commit 9771a5cf upstream. With the conversion of the tree locks to rwsem I got the following lockdep splat: ====================================================== WARNING: possible circular locking dependency detected 5.8.0-rc7-00167-g0d7ba0c5b375-dirty #925 Not tainted ------------------------------------------------------ btrfs-uuid/7955 is trying to acquire lock: ffff88bfbafec0f8 (btrfs-root-00){++++}-{3:3}, at: __btrfs_tree_read_lock+0x39/0x180 but task is already holding lock: ffff88bfbafef2a8 (btrfs-uuid-00){++++}-{3:3}, at: __btrfs_tree_read_lock+0x39/0x180 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (btrfs-uuid-00){++++}-{3:3}: down_read_nested+0x3e/0x140 __btrfs_tree_read_lock+0x39/0x180 __btrfs_read_lock_root_node+0x3a/0x50 btrfs_search_slot+0x4bd/0x990 btrfs_uuid_tree_add+0x89/0x2d0 btrfs_uuid_scan_kthread+0x330/0x390 kthread+0x133/0x150 ret_from_fork+0x1f/0x30 -> #0 (btrfs-root-00){++++}-{3:3}: __lock_acquire+0x1272/0x2310 lock_acquire+0x9e/0x360 down_read_nested+0x3e/0x140 __btrfs_tree_read_lock+0x39/0x180 __btrfs_read_lock_root_node+0x3a/0x50 btrfs_search_slot+0x4bd/0x990 btrfs_find_root+0x45/0x1b0 btrfs_read_tree_root+0x61/0x100 btrfs_get_root_ref.part.50+0x143/0x630 btrfs_uuid_tree_iterate+0x207/0x314 btrfs_uuid_rescan_kthread+0x12/0x50 kthread+0x133/0x150 ret_from_fork+0x1f/0x30 other info that might help us debug this: Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(btrfs-uuid-00); lock(btrfs-root-00); lock(btrfs-uuid-00); lock(btrfs-root-00); *** DEADLOCK *** 1 lock held by btrfs-uuid/7955: #0: ffff88bfbafef2a8 (btrfs-uuid-00){++++}-{3:3}, at: __btrfs_tree_read_lock+0x39/0x180 stack backtrace: CPU: 73 PID: 7955 Comm: btrfs-uuid Kdump: loaded Not tainted 5.8.0-rc7-00167-g0d7ba0c5b375-dirty #925 Hardware name: Quanta Tioga Pass Single Side 01-0030993006/Tioga Pass Single Side, BIOS F08_3A18 12/20/2018 Call Trace: dump_stack+0x78/0xa0 check_noncircular+0x165/0x180 __lock_acquire+0x1272/0x2310 lock_acquire+0x9e/0x360 ? __btrfs_tree_read_lock+0x39/0x180 ? btrfs_root_node+0x1c/0x1d0 down_read_nested+0x3e/0x140 ? __btrfs_tree_read_lock+0x39/0x180 __btrfs_tree_read_lock+0x39/0x180 __btrfs_read_lock_root_node+0x3a/0x50 btrfs_search_slot+0x4bd/0x990 btrfs_find_root+0x45/0x1b0 btrfs_read_tree_root+0x61/0x100 btrfs_get_root_ref.part.50+0x143/0x630 btrfs_uuid_tree_iterate+0x207/0x314 ? btree_readpage+0x20/0x20 btrfs_uuid_rescan_kthread+0x12/0x50 kthread+0x133/0x150 ? kthread_create_on_node+0x60/0x60 ret_from_fork+0x1f/0x30 This problem exists because we have two different rescan threads, btrfs_uuid_scan_kthread which creates the uuid tree, and btrfs_uuid_tree_iterate that goes through and updates or deletes any out of date roots. The problem is they both do things in different order. btrfs_uuid_scan_kthread() reads the tree_root, and then inserts entries into the uuid_root. btrfs_uuid_tree_iterate() scans the uuid_root, but then does a btrfs_get_fs_root() which can read from the tree_root. It's actually easy enough to not be holding the path in btrfs_uuid_scan_kthread() when we add a uuid entry, as we already drop it further down and re-start the search when we loop. So simply move the path release before we add our entry to the uuid tree. This also fixes a problem where we're holding a path open after we do btrfs_end_transaction(), which has it's own problems. CC: stable@vger.kernel.org # 4.4+ Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Jason Gunthorpe authored
[ Upstream commit 428fc0af ] Otherwise gcc generates warnings if the expression is complicated. Fixes: 312a0c17 ("[PATCH] LOG2: Alter roundup_pow_of_two() so that it can use a ilog2() on a constant") Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Link: https://lkml.kernel.org/r/0-v1-8a2697e3c003+41165-log_brackets_jgg@nvidia.comSigned-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Tony Lindgren authored
[ Upstream commit 30d24fab ] We can sometimes get bogus thermal shutdowns on omap4430 at least with droid4 running idle with a battery charger connected: thermal thermal_zone0: critical temperature reached (143 C), shutting down Dumping out the register values shows we can occasionally get a 0x7f value that is outside the TRM listed values in the ADC conversion table. And then we get a normal value when reading again after that. Reading the register multiple times does not seem help avoiding the bogus values as they stay until the next sample is ready. Looking at the TRM chapter "18.4.10.2.3 ADC Codes Versus Temperature", we should have values from 13 to 107 listed with a total of 95 values. But looking at the omap4430_adc_to_temp array, the values are off, and the end values are missing. And it seems that the 4430 ADC table is similar to omap3630 rather than omap4460. Let's fix the issue by using values based on the omap3630 table and just ignoring invalid values. Compared to the 4430 TRM, the omap3630 table has the missing values added while the TRM table only shows every second value. Note that sometimes the ADC register values within the valid table can also be way off for about 1 out of 10 values. But it seems that those just show about 25 C too low values rather than too high values. So those do not cause a bogus thermal shutdown. Fixes: 1a31270e ("staging: omap-thermal: add OMAP4 data structures") Cc: Merlijn Wajer <merlijn@wizzup.org> Cc: Pavel Machek <pavel@ucw.cz> Cc: Sebastian Reichel <sebastian.reichel@collabora.com> Signed-off-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Link: https://lore.kernel.org/r/20200706183338.25622-1-tony@atomide.comSigned-off-by: Sasha Levin <sashal@kernel.org>
-
Lu Baolu authored
[ Upstream commit 6e4e9ec6 ] The VT-d spec requires (10.4.4 Global Command Register, GCMD_REG General Description) that: If multiple control fields in this register need to be modified, software must serialize the modifications through multiple writes to this register. However, in irq_remapping.c, modifications of IRE and CFI are done in one write. We need to do two separate writes with STS checking after each. It also checks the status register before writing command register to avoid unnecessary register write. Fixes: af8d102f ("x86/intel/irq_remapping: Clean up x2apic opt-out security warning mess") Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Jacob Pan <jacob.jun.pan@linux.intel.com> Cc: Kevin Tian <kevin.tian@intel.com> Cc: Ashok Raj <ashok.raj@intel.com> Link: https://lore.kernel.org/r/20200828000615.8281-1-baolu.lu@linux.intel.comSigned-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Michael Chan authored
[ Upstream commit 55669934 ] If tg3_reset_task() fails, the device state is left in an inconsistent state with IFF_RUNNING still set but NAPI state not enabled. A subsequent operation, such as ifdown or AER error can cause it to soft lock up when it tries to disable NAPI state. Fix it by bringing down the device to !IFF_RUNNING state when tg3_reset_task() fails. tg3_reset_task() running from workqueue will now call tg3_close() when the reset fails. We need to modify tg3_reset_task_cancel() slightly to avoid tg3_close() calling cancel_work_sync() to cancel tg3_reset_task(). Otherwise cancel_work_sync() will wait forever for tg3_reset_task() to finish. Reported-by: David Christensen <drc@linux.vnet.ibm.com> Reported-by: Baptiste Covolato <baptiste@arista.com> Fixes: db219973 ("tg3: Schedule at most one tg3_reset_task run") Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Al Viro authored
[ Upstream commit 77f4689d ] epoll_loop_check_proc() can run into a file already committed to destruction; we can't grab a reference on those and don't need to add them to the set for reverse path check anyway. Tested-by: Marc Zyngier <maz@kernel.org> Fixes: a9ed4a65 ("epoll: Keep a reference on files added to the check list") Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Vasundhara Volam authored
[ Upstream commit df3875ec ] When a PCI error is detected the PCI state could be corrupt, save the PCI state after initialization and restore it after the slot reset. Fixes: 6316ea6d ("bnxt_en: Enable AER support.") Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Vasundhara Volam authored
[ Upstream commit dbbfa96a ] If firmware goes into unstable state, HWRM_NVM_GET_DIR_INFO firmware command may return zero dir entries. Return error in such case to avoid zero length dma buffer request. Fixes: c0c050c5 ("bnxt_en: New Broadcom ethernet driver.") Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Marek Szyprowski authored
[ Upstream commit 0661cef6 ] Move the burst len fixup after setting the generic value for it. This finally enables the fixup introduced by commit 137bd110 ("dmaengine: pl330: Align DMA memcpy operations to MFIFO width"), which otherwise was overwritten by the generic value. Reported-by: kernel test robot <lkp@intel.com> Fixes: 137bd110 ("dmaengine: pl330: Align DMA memcpy operations to MFIFO width") Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Link: https://lore.kernel.org/r/20200825064617.16193-1-m.szyprowski@samsung.comSigned-off-by: Vinod Koul <vkoul@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Dinghao Liu authored
[ Upstream commit e2d79cd8 ] When devm_gpiod_get_optional() fails, bus should be freed just like when of_mdiobus_register() fails. Fixes: 1bddd96c ("net: arc_emac: support the phy reset for emac driver") Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Yuusuke Ashizuka authored
[ Upstream commit 1838d6c6 ] When this driver is built as a module, I cannot rmmod it after insmoding it. This is because that this driver calls ravb_mdio_init() at the time of probe, and module->refcnt is incremented by alloc_mdio_bitbang() called after that. Therefore, even if ifup is not performed, the driver is in use and rmmod cannot be performed. $ lsmod Module Size Used by ravb 40960 1 $ rmmod ravb rmmod: ERROR: Module ravb is in use Call ravb_mdio_init() at open and free_mdio_bitbang() at close, thereby rmmod is possible in the ifdown state. Fixes: c156633f ("Renesas Ethernet AVB driver proper") Signed-off-by: Yuusuke Ashizuka <ashiduka@fujitsu.com> Reviewed-by: Sergei Shtylyov <sergei.shtylyov@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Dinghao Liu authored
[ Upstream commit 100e3345 ] hns_nic_dev_probe allocates ndev, but not free it on two error handling paths, which may lead to memleak. Fixes: 63434888 ("net: hns: net: hns: enet adds support of acpi") Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Florian Westphal authored
[ Upstream commit 1e105e6a ] Following bug was reported via irc: nft list ruleset set knock_candidates_ipv4 { type ipv4_addr . inet_service size 65535 elements = { 127.0.0.1 . 123, 127.0.0.1 . 123 } } .. udp dport 123 add @knock_candidates_ipv4 { ip saddr . 123 } udp dport 123 add @knock_candidates_ipv4 { ip saddr . udp dport } It should not have been possible to add a duplicate set entry. After some debugging it turned out that the problem is the immediate value (123) in the second-to-last rule. Concatenations use 32bit registers, i.e. the elements are 8 bytes each, not 6 and it turns out the kernel inserted inet firewall @knock_candidates_ipv4 element 0100007f ffff7b00 : 0 [end] element 0100007f 00007b00 : 0 [end] Note the non-zero upper bits of the first element. It turns out that nft_immediate doesn't zero the destination register, but this is needed when the length isn't a multiple of 4. Furthermore, the zeroing in nft_payload is broken. We can't use [len / 4] = 0 -- if len is a multiple of 4, index is off by one. Skip zeroing in this case and use a conditional instead of (len -1) / 4. Fixes: 49499c3e ("netfilter: nf_tables: switch registers to 32 bit addressing") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Pablo Neira Ayuso authored
[ Upstream commit da9125df ] This should be NFTA_LIST_UNSPEC instead of NFTA_LIST_UNPEC, all other similar attribute definitions are postfixed with _UNSPEC. Fixes: 96518518 ("netfilter: add nftables") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Pablo Neira Ayuso authored
[ Upstream commit 6f03bf43 ] Kernel sends an empty NFTA_SET_USERDATA attribute with no value if userspace adds a set with no NFTA_SET_USERDATA attribute. Fixes: e6d8ecac ("netfilter: nf_tables: Add new attributes into nft_set to store user data.") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Florian Fainelli authored
[ Upstream commit e14f633b ] The initialization done by bmips_cpu_setup() typically affects both threads of a given core, on 7435 which supports 2 cores and 2 threads, logical CPU number 2 and 3 would not run this initialization. Fixes: 738a3f79 ("MIPS: BMIPS: Add early CPU initialization code") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Florian Fainelli authored
[ Upstream commit dbfc95f9 ] When the BMIPS generic cpu-feature-overrides.h file was introduced, cpu_has_inclusive_caches/MIPS_CPU_INCLUSIVE_CACHES was not set for BMIPS5000 CPUs. Correct this when we have initialized the MIPS secondary cache successfully. Fixes: f337967d ("MIPS: BMIPS: Add cpu-feature-overrides.h") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
-
Yu Kuai authored
[ Upstream commit 0cef8e2c ] The reurn value of of_find_device_by_node() is not checked, thus null pointer dereference will be triggered if of_find_device_by_node() failed. Fixes: bbe89c8e ("at_hdmac: move to generic DMA binding") Signed-off-by: Yu Kuai <yukuai3@huawei.com> Link: https://lore.kernel.org/r/20200817115728.1706719-2-yukuai3@huawei.comSigned-off-by: Vinod Koul <vkoul@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
-