- 26 Feb, 2016 18 commits
-
-
Andy Whitcroft authored
The plan here is for linux, gcc, binutils, and eglibc to all depends on each other and to all have a rebuild test. That way the entire set is rebuild tested for any one in the set being uploaded. BugLink: http://bugs.launchpad.net/bugs/1081500Signed-off-by: Andy Whitcroft <apw@canonical.com>
-
Andy Whitcroft authored
The new Debian autopkgtest system takes ownership of the debian/tests directory, in such a way that is incompatible with our usage. Move our tests to debian/tests-build as they are build tests. BugLink: http://bugs.launchpad.net/bugs/1081500Signed-off-by: Andy Whitcroft <apw@canonical.com>
-
Andy Whitcroft authored
Add support for the DEB_BUILD_OPTIONS rebuild-test which indicates this is not a full build but a quick smoke test. For us short circuit the build and only make the first flavour on the assumption it is representative of the others. BugLink: http://bugs.launchpad.net/bugs/1081500Signed-off-by: Andy Whitcroft <apw@canonical.com>
-
Tim Gardner authored
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
-
Ben Collins authored
[apw@canonical.com: forward ported to new cleaned up indep stack.] Signed-off-by: Ben Collins <ben.c@servergy.com> Signed-off-by: Andy Whitcroft <apw@canonical.com>
-
Tim Gardner authored
Move some code around to directly reflect the dependency chain ordering. Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
-
Tim Gardner authored
dh_prep needs to be run only once at the root of the binary-indep dependency chain, i.e., install-headers. Similarly, dh_testdir and dh_testroot only need to be run once at the root of the binary-indep dependency chain. Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
-
Andy Whitcroft authored
Fix up some confusingly wrong debugging output in the config checker. Signed-off-by: Andy Whitcroft <apw@canonical.com>
-
Ben Collins authored
On PowerPC, the flavours have different make targets and installable images. For example, e500/e500mc use a target of uImage, since that is the native format for U-Boot systems. Signed-off-by: Ben Collins <ben.c@servergy.com> Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
-
Andy Whitcroft authored
Signed-off-by: Andy Whitcroft <apw@canonical.com>
-
Andy Whitcroft authored
Signed-off-by: Andy Whitcroft <apw@canonical.com>
-
Andy Whitcroft authored
Signed-off-by: Andy Whitcroft <apw@canonical.com>
-
Andy Whitcroft authored
Signed-off-by: Andy Whitcroft <apw@canonical.com>
-
Andy Whitcroft authored
Signed-off-by: Andy Whitcroft <apw@canonical.com>
-
Andy Whitcroft authored
Pick out the kernel binaries and add them to a custom upload. This upload will trigger signing of the contained files which will later be pulled into linux-*-signed packages. Only include amd64 kernels as we only support EFI signed packages there. Also ensure the kernel has a high enough interface version >= 0x020b otherwise we may end up with an unsafe kernel loaded. Signed-off-by: Andy Whitcroft <apw@canonical.com>
-
Tim Gardner authored
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
-
Leann Ogasawara authored
Ignore: yes Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
-
- 25 Feb, 2016 22 commits
-
-
Greg Kroah-Hartman authored
-
Luis R. Rodriguez authored
commit 4355efbd upstream. Commit f2411da7 ("driver-core: add driver module asynchronous probe support") added async probe support, in two forms: * in-kernel driver specification annotation * generic async_probe module parameter (modprobe foo async_probe) To support the generic kernel parameter parse_args() was extended via commit ecc86170 ("module: add extra argument for parse_params() callback") however commit failed to f2411da7 failed to add the required argument. This causes a crash then whenever async_probe generic module parameter is used. This was overlooked when the form in which in-kernel async probe support was reworked a bit... Fix this as originally intended. Cc: Hannes Reinecke <hare@suse.de> Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Luis R. Rodriguez <mcgrof@suse.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> [minimized] Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Rusty Russell authored
commit 2e7bac53 upstream. This trivial wrapper adds clarity and makes the following patch smaller. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Thomas Gleixner authored
commit 51cbb524 upstream. As Helge reported for timerfd we have the same issue in itimers. We return remaining time larger than the programmed relative time to user space in case of CONFIG_TIME_LOW_RES=y. Use the proper function to adjust the extra time added in hrtimer_start_range_ns(). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Helge Deller <deller@gmx.de> Cc: John Stultz <john.stultz@linaro.org> Cc: linux-m68k@lists.linux-m68k.org Cc: dhowells@redhat.com Link: http://lkml.kernel.org/r/20160114164159.528222587@linutronix.deSigned-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Thomas Gleixner authored
commit 572c3917 upstream. As Helge reported for timerfd we have the same issue in posix timers. We return remaining time larger than the programmed relative time to user space in case of CONFIG_TIME_LOW_RES=y. Use the proper function to adjust the extra time added in hrtimer_start_range_ns(). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Helge Deller <deller@gmx.de> Cc: John Stultz <john.stultz@linaro.org> Cc: linux-m68k@lists.linux-m68k.org Cc: dhowells@redhat.com Link: http://lkml.kernel.org/r/20160114164159.450510905@linutronix.deSigned-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Thomas Gleixner authored
commit b62526ed upstream. Helge reported that a relative timer can return a remaining time larger than the programmed relative time on parisc and other architectures which have CONFIG_TIME_LOW_RES set. This happens because we add a jiffie to the resulting expiry time to prevent short timeouts. Use the new function hrtimer_expires_remaining_adjusted() to calculate the remaining time. It takes that extra added time into account for relative timers. Reported-and-tested-by: Helge Deller <deller@gmx.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: John Stultz <john.stultz@linaro.org> Cc: linux-m68k@lists.linux-m68k.org Cc: dhowells@redhat.com Link: http://lkml.kernel.org/r/20160114164159.354500742@linutronix.deSigned-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Mateusz Guzik authored
commit ddf1d398 upstream. An unprivileged user can trigger an oops on a kernel with CONFIG_CHECKPOINT_RESTORE. proc_pid_cmdline_read takes mmap_sem for reading and obtains args + env start/end values. These get sanity checked as follows: BUG_ON(arg_start > arg_end); BUG_ON(env_start > env_end); These can be changed by prctl_set_mm. Turns out also takes the semaphore for reading, effectively rendering it useless. This results in: kernel BUG at fs/proc/base.c:240! invalid opcode: 0000 [#1] SMP Modules linked in: virtio_net CPU: 0 PID: 925 Comm: a.out Not tainted 4.4.0-rc8-next-20160105dupa+ #71 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 task: ffff880077a68000 ti: ffff8800784d0000 task.ti: ffff8800784d0000 RIP: proc_pid_cmdline_read+0x520/0x530 RSP: 0018:ffff8800784d3db8 EFLAGS: 00010206 RAX: ffff880077c5b6b0 RBX: ffff8800784d3f18 RCX: 0000000000000000 RDX: 0000000000000002 RSI: 00007f78e8857000 RDI: 0000000000000246 RBP: ffff8800784d3e40 R08: 0000000000000008 R09: 0000000000000001 R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000050 R13: 00007f78e8857800 R14: ffff88006fcef000 R15: ffff880077c5b600 FS: 00007f78e884a740(0000) GS:ffff88007b200000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 00007f78e8361770 CR3: 00000000790a5000 CR4: 00000000000006f0 Call Trace: __vfs_read+0x37/0x100 vfs_read+0x82/0x130 SyS_read+0x58/0xd0 entry_SYSCALL_64_fastpath+0x12/0x76 Code: 4c 8b 7d a8 eb e9 48 8b 9d 78 ff ff ff 4c 8b 7d 90 48 8b 03 48 39 45 a8 0f 87 f0 fe ff ff e9 d1 fe ff ff 4c 8b 7d 90 eb c6 0f 0b <0f> 0b 0f 0b 66 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 RIP proc_pid_cmdline_read+0x520/0x530 ---[ end trace 97882617ae9c6818 ]--- Turns out there are instances where the code just reads aformentioned values without locking whatsoever - namely environ_read and get_cmdline. Interestingly these functions look quite resilient against bogus values, but I don't believe this should be relied upon. The first patch gets rid of the oops bug by grabbing mmap_sem for writing. The second patch is optional and puts locking around aformentioned consumers for safety. Consumers of other fields don't seem to benefit from similar treatment and are left untouched. This patch (of 2): The code was taking the semaphore for reading, which does not protect against readers nor concurrent modifications. The problem could cause a sanity checks to fail in procfs's cmdline reader, resulting in an OOPS. Note that some functions perform an unlocked read of various mm fields, but they seem to be fine despite possible modificaton. Signed-off-by: Mateusz Guzik <mguzik@redhat.com> Acked-by: Cyrill Gorcunov <gorcunov@openvz.org> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Jarod Wilson <jarod@redhat.com> Cc: Jan Stancek <jstancek@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Anshuman Khandual <anshuman.linux@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Dave Chinner authored
commit 85bec546 upstream. Recently I've been seeing xfs/051 fail on 1k block size filesystems. Trying to trace the events during the test lead to the problem going away, indicating that it was a race condition that lead to this ASSERT failure: XFS: Assertion failed: atomic_read(&pag->pag_ref) == 0, file: fs/xfs/xfs_mount.c, line: 156 ..... [<ffffffff814e1257>] xfs_free_perag+0x87/0xb0 [<ffffffff814e21b9>] xfs_mountfs+0x4d9/0x900 [<ffffffff814e5dff>] xfs_fs_fill_super+0x3bf/0x4d0 [<ffffffff811d8800>] mount_bdev+0x180/0x1b0 [<ffffffff814e3ff5>] xfs_fs_mount+0x15/0x20 [<ffffffff811d90a8>] mount_fs+0x38/0x170 [<ffffffff811f4347>] vfs_kern_mount+0x67/0x120 [<ffffffff811f7018>] do_mount+0x218/0xd60 [<ffffffff811f7e5b>] SyS_mount+0x8b/0xd0 When I finally caught it with tracing enabled, I saw that AG 2 had an elevated reference count and a buffer was responsible for it. I tracked down the specific buffer, and found that it was missing the final reference count release that would put it back on the LRU and hence be found by xfs_wait_buftarg() calls in the log mount failure handling. The last four traces for the buffer before the assert were (trimmed for relevance) kworker/0:1-5259 xfs_buf_iodone: hold 2 lock 0 flags ASYNC kworker/0:1-5259 xfs_buf_ioerror: hold 2 lock 0 error -5 mount-7163 xfs_buf_lock_done: hold 2 lock 0 flags ASYNC mount-7163 xfs_buf_unlock: hold 2 lock 1 flags ASYNC This is an async write that is completing, so there's nobody waiting for it directly. Hence we call xfs_buf_relse() once all the processing is complete. That does: static inline void xfs_buf_relse(xfs_buf_t *bp) { xfs_buf_unlock(bp); xfs_buf_rele(bp); } Now, it's clear that mount is waiting on the buffer lock, and that it has been released by xfs_buf_relse() and gained by mount. This is expected, because at this point the mount process is in xfs_buf_delwri_submit() waiting for all the IO it submitted to complete. The mount process, however, is waiting on the lock for the buffer because it is in xfs_buf_delwri_submit(). This waits for IO completion, but it doesn't wait for the buffer reference owned by the IO to go away. The mount process collects all the completions, fails the log recovery, and the higher level code then calls xfs_wait_buftarg() to free all the remaining buffers in the filesystem. The issue is that on unlocking the buffer, the scheduler has decided that the mount process has higher priority than the the kworker thread that is running the IO completion, and so immediately switched contexts to the mount process from the semaphore unlock code, hence preventing the kworker thread from finishing the IO completion and releasing the IO reference to the buffer. Hence by the time that xfs_wait_buftarg() is run, the buffer still has an active reference and so isn't on the LRU list that the function walks to free the remaining buffers. Hence we miss that buffer and continue onwards to tear down the mount structures, at which time we get find a stray reference count on the perag structure. On a non-debug kernel, this will be ignored and the structure torn down and freed. Hence when the kworker thread is then rescheduled and the buffer released and freed, it will access a freed perag structure. The problem here is that when the log mount fails, we still need to quiesce the log to ensure that the IO workqueues have returned to idle before we run xfs_wait_buftarg(). By synchronising the workqueues, we ensure that all IO completions are fully processed, not just to the point where buffers have been unlocked. This ensures we don't end up in the situation above. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Dave Chinner authored
commit 3e85286e upstream. This reverts commit 24ba16bb as it prevents machines from suspending. This regression occurs when the xfsaild is idle on entry to suspend, and so there s no activity to wake it from it's idle sleep and hence see that it is supposed to freeze. Hence the freezer times out waiting for it and suspend is cancelled. There is no obvious fix for this short of freezing the filesystem properly, so revert this change for now. Signed-off-by: Dave Chinner <david@fromorbit.com> Acked-by: Jiri Kosina <jkosina@suse.cz> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Dave Chinner authored
commit b79f4a1c upstream. When we do inode readahead in log recovery, we do can do the readahead before we've replayed the icreate transaction that stamps the buffer with inode cores. The inode readahead verifier catches this and marks the buffer as !done to indicate that it doesn't yet contain valid inodes. In adding buffer error notification (i.e. setting b_error = -EIO at the same time as as we clear the done flag) to such a readahead verifier failure, we can then get subsequent inode recovery failing with this error: XFS (dm-0): metadata I/O error: block 0xa00060 ("xlog_recover_do..(read#2)") error 5 numblks 32 This occurs when readahead completion races with icreate item replay such as: inode readahead find buffer lock buffer submit RA io .... icreate recovery xfs_trans_get_buffer find buffer lock buffer <blocks on RA completion> ..... <ra completion> fails verifier clear XBF_DONE set bp->b_error = -EIO release and unlock buffer <icreate gains lock> icreate initialises buffer marks buffer as done adds buffer to delayed write queue releases buffer At this point, we have an initialised inode buffer that is up to date but has an -EIO state registered against it. When we finally get to recovering an inode in that buffer: inode item recovery xfs_trans_read_buffer find buffer lock buffer sees XBF_DONE is set, returns buffer sees bp->b_error is set fail log recovery! Essentially, we need xfs_trans_get_buf_map() to clear the error status of the buffer when doing a lookup. This function returns uninitialised buffers, so the buffer returned can not be in an error state and none of the code that uses this function expects b_error to be set on return. Indeed, there is an ASSERT(!bp->b_error); in the transaction case in xfs_trans_get_buf_map() that would have caught this if log recovery used transactions.... This patch firstly changes the inode readahead failure to set -EIO on the buffer, and secondly changes xfs_buf_get_map() to never return a buffer with an error state set so this first change doesn't cause unexpected log recovery failures. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Brian Foster <bfoster@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Darrick J. Wong authored
commit 96f859d5 upstream. Because struct xfs_agfl is 36 bytes long and has a 64-bit integer inside it, gcc will quietly round the structure size up to the nearest 64 bits -- in this case, 40 bytes. This results in the XFS_AGFL_SIZE macro returning incorrect results for v5 filesystems on 64-bit machines (118 items instead of 119). As a result, a 32-bit xfs_repair will see garbage in AGFL item 119 and complain. Therefore, tell gcc not to pad the structure so that the AGFL size calculation is correct. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Dave Chinner <david@fromorbit.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Miklos Szeredi authored
commit cf9a6784 upstream. Without this copy-up of a file can be forced, even without actually being allowed to do anything on the file. [Arnd Bergmann] include <linux/pagemap.h> for PAGE_CACHE_SIZE (used by MAX_LFS_FILESIZE definition). Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Miklos Szeredi authored
commit ed06e069 upstream. We copy i_uid and i_gid of underlying inode into overlayfs inode. Except for the root inode. Fix this omission. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Konstantin Khlebnikov authored
commit 84889d49 upstream. This patch fixes kernel crash at removing directory which contains whiteouts from lower layers. Cache of directory content passed as "list" contains entries from all layers, including whiteouts from lower layers. So, lookup in upper dir (moved into work at this stage) will return negative entry. Plus this cache is filled long before and we can race with external removal. Example: mkdir -p lower0/dir lower1/dir upper work overlay touch lower0/dir/a lower0/dir/b mknod lower1/dir/a c 0 0 mount -t overlay none overlay -o lowerdir=lower1:lower0,upperdir=upper,workdir=work rm -fr overlay/dir Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Vito Caputo authored
commit e4ad29fa upstream. Rather than always allocating the high-order XATTR_SIZE_MAX buffer which is costly and prone to failure, only allocate what is needed and realloc if necessary. Fixes https://github.com/coreos/bugs/issues/489Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Miklos Szeredi authored
commit 97daf8b9 upstream. When ovl_copy_xattr() encountered a zero size xattr no more xattrs were copied and the function returned success. This is clearly not the desired behavior. Signed-off-by: Miklos Szeredi <miklos@szeredi.hu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Thomas Gleixner authored
commit fb75a428 upstream. If the proxy lock in the requeue loop acquires the rtmutex for a waiter then it acquired also refcount on the pi_state related to the futex, but the waiter side does not drop the reference count. Add the missing free_pi_state() call. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Darren Hart <darren@dvhart.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Bhuvanesh_Surachari@mentor.com Cc: Andy Lowe <Andy_Lowe@mentor.com> Link: http://lkml.kernel.org/r/20151219200607.178132067@linutronix.deSigned-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Toshi Kani authored
commit 9273a8bb upstream. The pmem driver calls devm_memremap() to map a persistent memory range. When the pmem driver is unloaded, this memremap'd range is not released so the kernel will leak a vma. Fix devm_memremap_release() to handle a given memremap'd address properly. Signed-off-by: Toshi Kani <toshi.kani@hpe.com> Acked-by: Dan Williams <dan.j.williams@intel.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: Matthew Wilcox <willy@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Kirill A. Shutemov authored
commit 1ac0b6de upstream. remap_file_pages(2) emulation can reach file which represents removed IPC ID as long as a memory segment is mapped. It breaks expectations of IPC subsystem. Test case (rewritten to be more human readable, originally autogenerated by syzkaller[1]): #define _GNU_SOURCE #include <stdlib.h> #include <sys/ipc.h> #include <sys/mman.h> #include <sys/shm.h> #define PAGE_SIZE 4096 int main() { int id; void *p; id = shmget(IPC_PRIVATE, 3 * PAGE_SIZE, 0); p = shmat(id, NULL, 0); shmctl(id, IPC_RMID, NULL); remap_file_pages(p, 3 * PAGE_SIZE, 0, 7, 0); return 0; } The patch changes shm_mmap() and code around shm_lock() to propagate locking error back to caller of shm_mmap(). [1] http://github.com/google/syzkallerSigned-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Reported-by: Dmitry Vyukov <dvyukov@google.com> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Manfred Spraul <manfred@colorfullife.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Dan Carpenter authored
commit b1d353ad upstream. "count" is controlled by the user and it can be negative. Let's prevent that by making it unsigned. You have to have CAP_SYS_RAWIO to call this function so the bug is not as serious as it could be. Fixes: 5369c02d ('intel_scu_ipc: Utility driver for intel scu ipc') Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Darren Hart <dvhart@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Vineet Gupta authored
commit 6a6ac72f upstream. This showed up on ARC when running LMBench bw_mem tests as Overlapping TLB Machine Check Exception triggered due to STLB entry (2M pages) overlapping some NTLB entry (regular 8K page). bw_mem 2m touches a large chunk of vaddr creating NTLB entries. In the interim khugepaged kicks in, collapsing the contiguous ptes into a single pmd. pmdp_collapse_flush()->flush_pmd_tlb_range() is called to flush out NTLB entries for the ptes. This for ARC (by design) can only shootdown STLB entries (for pmd). The stray NTLB entries cause the overlap with the subsequent STLB entry for collapsed page. So make pmdp_collapse_flush() call pte flush interface not pmd flush. Note that originally all thp flush call sites in generic code called flush_tlb_range() leaving it to architecture to implement the flush for pte and/or pmd. Commit 12ebc158 changed this by calling a new opt-in API flush_pmd_tlb_range() which made the semantics more explicit but failed to distinguish the pte vs pmd flush in generic code, which is what this patch fixes. Note that ARC can fixed w/o touching the generic pmdp_collapse_flush() by defining a ARC version, but that defeats the purpose of generic version, plus sementically this is the right thing to do. Fixes STAR 9000961194: LMBench on AXS103 triggering duplicate TLB exceptions with super pages Fixes: 12ebc158 ("mm,thp: introduce flush_pmd_tlb_range") Signed-off-by: Vineet Gupta <vgupta@synopsys.com> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Eric Dumazet authored
commit d7ce3692 upstream. Some servers experienced fatal deadlocks because of a combination of bugs, leading to multiple cpus calling dump_stack(). The checksumming bug was fixed in commit 34ae6a1a ("ipv6: update skb->csum when CE mark is propagated"). The second problem is a faulty locking in dump_stack() CPU1 runs in process context and calls dump_stack(), grabs dump_lock. CPU2 receives a TCP packet under softirq, grabs socket spinlock, and call dump_stack() from netdev_rx_csum_fault(). dump_stack() spins on atomic_cmpxchg(&dump_lock, -1, 2), since dump_lock is owned by CPU1 While dumping its stack, CPU1 is interrupted by a softirq, and happens to process a packet for the TCP socket locked by CPU2. CPU1 spins forever in spin_lock() : deadlock Stack trace on CPU1 looked like : NMI backtrace for cpu 1 RIP: _raw_spin_lock+0x25/0x30 ... Call Trace: <IRQ> tcp_v6_rcv+0x243/0x620 ip6_input_finish+0x11f/0x330 ip6_input+0x38/0x40 ip6_rcv_finish+0x3c/0x90 ipv6_rcv+0x2a9/0x500 process_backlog+0x461/0xaa0 net_rx_action+0x147/0x430 __do_softirq+0x167/0x2d0 call_softirq+0x1c/0x30 do_softirq+0x3f/0x80 irq_exit+0x6e/0xc0 smp_call_function_single_interrupt+0x35/0x40 call_function_single_interrupt+0x6a/0x70 <EOI> printk+0x4d/0x4f printk_address+0x31/0x33 print_trace_address+0x33/0x3c print_context_stack+0x7f/0x119 dump_trace+0x26b/0x28e show_trace_log_lvl+0x4f/0x5c show_stack_log_lvl+0x104/0x113 show_stack+0x42/0x44 dump_stack+0x46/0x58 netdev_rx_csum_fault+0x38/0x3c __skb_checksum_complete_head+0x6e/0x80 __skb_checksum_complete+0x11/0x20 tcp_rcv_established+0x2bd5/0x2fd0 tcp_v6_do_rcv+0x13c/0x620 sk_backlog_rcv+0x15/0x30 release_sock+0xd2/0x150 tcp_recvmsg+0x1c1/0xfc0 inet_recvmsg+0x7d/0x90 sock_recvmsg+0xaf/0xe0 ___sys_recvmsg+0x111/0x3b0 SyS_recvmsg+0x5c/0xb0 system_call_fastpath+0x16/0x1b Fixes: b58d9774 ("dump_stack: serialize the output from dump_stack()") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Alex Thorlton <athorlton@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-