- 07 Mar, 2016 40 commits
-
-
Takashi Iwai authored
commit 67ec1072 upstream. A non-atomic PCM stream may take snd_pcm_link_rwsem rw semaphore twice in the same code path, e.g. one in snd_pcm_action_nonatomic() and another in snd_pcm_stream_lock(). Usually this is OK, but when a write lock is issued between these two read locks, the problem happens: the write lock is blocked due to the first reade lock, and the second read lock is also blocked by the write lock. This eventually deadlocks. The reason is the way rwsem manages waiters; it's queued like FIFO, so even if the writer itself doesn't take the lock yet, it blocks all the waiters (including reads) queued after it. As a workaround, in this patch, we replace the standard down_write() with an spinning loop. This is far from optimal, but it's good enough, as the spinning time is supposed to be relatively short for normal PCM operations, and the code paths requiring the write lock aren't called so often. Reported-by: Vinod Koul <vinod.koul@intel.com> Tested-by: Ramesh Babu <ramesh.babu@intel.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Toshi Kani authored
commit f4eafd8b upstream. A kernel page fault oops with the callstack below was observed when a read syscall was made to a pmem device after a huge amount (>512GB) of vmalloc ranges was allocated by ioremap() on a x86_64 system: BUG: unable to handle kernel paging request at ffff880840000ff8 IP: vmalloc_fault+0x1be/0x300 PGD c7f03a067 PUD 0 Oops: 0000 [#1] SM Call Trace: __do_page_fault+0x285/0x3e0 do_page_fault+0x2f/0x80 ? put_prev_entity+0x35/0x7a0 page_fault+0x28/0x30 ? memcpy_erms+0x6/0x10 ? schedule+0x35/0x80 ? pmem_rw_bytes+0x6a/0x190 [nd_pmem] ? schedule_timeout+0x183/0x240 btt_log_read+0x63/0x140 [nd_btt] : ? __symbol_put+0x60/0x60 ? kernel_read+0x50/0x80 SyS_finit_module+0xb9/0xf0 entry_SYSCALL_64_fastpath+0x1a/0xa4 Since v4.1, ioremap() supports large page (pud/pmd) mappings in x86_64 and PAE. vmalloc_fault() however assumes that the vmalloc range is limited to pte mappings. vmalloc faults do not normally happen in ioremap'd ranges since ioremap() sets up the kernel page tables, which are shared by user processes. pgd_ctor() sets the kernel's PGD entries to user's during fork(). When allocation of the vmalloc ranges crosses a 512GB boundary, ioremap() allocates a new pud table and updates the kernel PGD entry to point it. If user process's PGD entry does not have this update yet, a read/write syscall to the range will cause a vmalloc fault, which hits the Oops above as it does not handle a large page properly. Following changes are made to vmalloc_fault(). 64-bit: - No change for the PGD sync operation as it handles large pages already. - Add pud_huge() and pmd_huge() to the validation code to handle large pages. - Change pud_page_vaddr() to pud_pfn() since an ioremap range is not directly mapped (while the if-statement still works with a bogus addr). - Change pmd_page() to pmd_pfn() since an ioremap range is not backed by struct page (while the if-statement still works with a bogus addr). 32-bit: - No change for the sync operation since the index3 PGD entry covers the entire vmalloc range, which is always valid. (A separate change to sync PGD entry is necessary if this memory layout is changed regardless of the page size.) - Add pmd_huge() to the validation code to handle large pages. This is for completeness since vmalloc_fault() won't happen in ioremap'd ranges as its PGD entry is always valid. Reported-by: Henning Schild <henning.schild@siemens.com> Signed-off-by: Toshi Kani <toshi.kani@hpe.com> Acked-by: Borislav Petkov <bp@alien8.de> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Luis R. Rodriguez <mcgrof@suse.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Toshi Kani <toshi.kani@hp.com> Cc: linux-mm@kvack.org Cc: linux-nvdimm@lists.01.org Link: http://lkml.kernel.org/r/1455758214-24623-1-git-send-email-toshi.kani@hpe.comSigned-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Benjamin Coddington authored
commit d9dfd8d7 upstream. In the case where d_add_unique() finds an appropriate alias to use it will have already incremented the reference count. An additional dget() to swap the open context's dentry is unnecessary and will leak a reference. Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Fixes: 275bb307 ("NFSv4: Move dentry instantiation into the NFSv4-...") Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Alexey Kardashevskiy authored
commit 6ecad912 upstream. Quite often drivers set only "write" permission assuming that this includes "read" permission as well and this works on plenty of platforms. However IODA2 is strict about this and produces an EEH when "read" permission is not set and reading happens. This adds a workaround in the IODA code to always add the "read" bit when the "write" bit is set. Fixes: 10b35b2b ("powerpc/powernv: Do not set "read" flag if direction==DMA_NONE") Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Tested-by: Douglas Miller <dougmill@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
John Youn authored
commit c4509601 upstream. The assignement of EP transfer resources was not handled properly in the dwc3 driver. Commit aebda618 ("usb: dwc3: Reset the transfer resource index on SET_INTERFACE") previously fixed one aspect of this where resources may be exhausted with multiple calls to SET_INTERFACE. However, it introduced an issue where composite devices with multiple interfaces can be assigned the same transfer resources for different endpoints. This patch solves both issues. The assignment of transfer resources cannot perfectly follow the data book due to the fact that the controller driver does not have all knowledge of the configuration in advance. It is given this information piecemeal by the composite gadget framework after every SET_CONFIGURATION and SET_INTERFACE. Trying to follow the databook programming model in this scenario can cause errors. For two reasons: 1) The databook says to do DEPSTARTCFG for every SET_CONFIGURATION and SET_INTERFACE (8.1.5). This is incorrect in the scenario of multiple interfaces. 2) The databook does not mention doing more DEPXFERCFG for new endpoint on alt setting (8.1.6). The following simplified method is used instead: All hardware endpoints can be assigned a transfer resource and this setting will stay persistent until either a core reset or hibernation. So whenever we do a DEPSTARTCFG(0) we can go ahead and do DEPXFERCFG for every hardware endpoint as well. We are guaranteed that there are as many transfer resources as endpoints. This patch triggers off of the calling dwc3_gadget_start_config() for EP0-out, which always happens first, and which should only happen in one of the above conditions. Fixes: aebda618 ("usb: dwc3: Reset the transfer resource index on SET_INTERFACE") Reported-by: Ravi Babu <ravibabu@ti.com> Signed-off-by: John Youn <johnyoun@synopsys.com> Signed-off-by: Felipe Balbi <balbi@kernel.org> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Toshi Kani authored
commit a82eee74 upstream. Data corruption issues were observed in tests which initiated a system crash/reset while accessing BTT devices. This problem is reproducible. The BTT driver calls pmem_rw_bytes() to update data in pmem devices. This interface calls __copy_user_nocache(), which uses non-temporal stores so that the stores to pmem are persistent. __copy_user_nocache() uses non-temporal stores when a request size is 8 bytes or larger (and is aligned by 8 bytes). The BTT driver updates the BTT map table, which entry size is 4 bytes. Therefore, updates to the map table entries remain cached, and are not written to pmem after a crash. Change __copy_user_nocache() to use non-temporal store when a request size is 4 bytes. The change extends the current byte-copy path for a less-than-8-bytes request, and does not add any overhead to the regular path. Reported-and-tested-by: Micah Parrish <micah.parrish@hpe.com> Reported-and-tested-by: Brian Boylston <brian.boylston@hpe.com> Signed-off-by: Toshi Kani <toshi.kani@hpe.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Borislav Petkov <bp@suse.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Luis R. Rodriguez <mcgrof@suse.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Toshi Kani <toshi.kani@hp.com> Cc: Vishal Verma <vishal.l.verma@intel.com> Cc: linux-nvdimm@lists.01.org Link: http://lkml.kernel.org/r/1455225857-12039-3-git-send-email-toshi.kani@hpe.com [ Small readability edits. ] Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Toshi Kani authored
commit ee9737c9 upstream. Add comments to __copy_user_nocache() to clarify its procedures and alignment requirements. Also change numeric branch target labels to named local labels. No code changed: arch/x86/lib/copy_user_64.o: text data bss dec hex filename 1239 0 0 1239 4d7 copy_user_64.o.before 1239 0 0 1239 4d7 copy_user_64.o.after md5: 58bed94c2db98c1ca9a2d46d0680aaae copy_user_64.o.before.asm 58bed94c2db98c1ca9a2d46d0680aaae copy_user_64.o.after.asm Signed-off-by: Toshi Kani <toshi.kani@hpe.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Borislav Petkov <bp@suse.de> Cc: Brian Gerst <brgerst@gmail.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Luis R. Rodriguez <mcgrof@suse.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Toshi Kani <toshi.kani@hp.com> Cc: brian.boylston@hpe.com Cc: dan.j.williams@intel.com Cc: linux-nvdimm@lists.01.org Cc: micah.parrish@hpe.com Cc: ross.zwisler@linux.intel.com Cc: vishal.l.verma@intel.com Link: http://lkml.kernel.org/r/1455225857-12039-2-git-send-email-toshi.kani@hpe.com [ Small readability edits and added object file comparison. ] Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Mario Kleiner authored
commit bb74fc1b upstream. drm_vblank_offdelay can have three different types of values: < 0 is to be always treated the same as dev->vblank_disable_immediate = 0 is to be treated as "never disable vblanks" > 0 is to be treated as disable immediate if kms driver wants it that way via dev->vblank_disable_immediate. Otherwise it is a disable timeout in msecs. This got broken in Linux 3.18+ for the implementation of drm_vblank_on. If the user specified a value of zero which should always reenable vblank irqs in this function, a kms driver could override the users choice by setting vblank_disable_immediate to true. This patch fixes the regression and keeps the user in control. v2: Only reenable vblank if there are clients left or the user requested to "never disable vblanks" via offdelay 0. Enabling vblanks even in the "delayed disable" case (offdelay > 0) was specifically added by Ville in commit cd19e52a ("drm: Kick start vblank interrupts at drm_vblank_on()"), but after discussion it turns out that this was done by accident. Citing Ville: "I think it just ended up as a mess due to changing some of the semantics of offdelay<0 vs. offdelay==0 vs. disable_immediate during the review of the series. So yeah, given how drm_vblank_put() works now, I'd just make this check for offdelay==0." Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: michel@daenzer.net Cc: vbabka@suse.cz Cc: ville.syrjala@linux.intel.com Cc: daniel.vetter@ffwll.ch Cc: dri-devel@lists.freedesktop.org Cc: alexander.deucher@amd.com Cc: christian.koenig@amd.com Signed-off-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Gerd Hoffmann authored
commit 34855706 upstream. This avoids integer overflows on 32bit machines when calculating reloc_info size, as reported by Alan Cox. Cc: gnomes@lxorguk.ukuu.org.uk Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Rasmus Villemoes authored
commit bc3f5d8c upstream. We need to use post-decrement to get the pci_map_page undone also for i==0, and to avoid some very unpleasant behaviour if pci_map_page failed already at i==0. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Rasmus Villemoes authored
commit 09ccbb74 upstream. We need to use post-decrement to get the pci_map_page undone also for i==0, and to avoid some very unpleasant behaviour if pci_map_page failed already at i==0. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Takashi Iwai authored
commit 13d5e5d4 upstream. The commit [7f0973e9: ALSA: seq: Fix lockdep warnings due to double mutex locks] split the management of two linked lists (source and destination) into two individual calls for avoiding the AB/BA deadlock. However, this may leave the possible double deletion of one of two lists when the counterpart is being deleted concurrently. It ends up with a list corruption, as revealed by syzkaller fuzzer. This patch fixes it by checking the list emptiness and skipping the deletion and the following process. BugLink: http://lkml.kernel.org/r/CACT4Y+bay9qsrz6dQu31EcGaH9XwfW7o3oBzSQUG9fMszoh=Sg@mail.gmail.com Fixes: 7f0973e9 ('ALSA: seq: Fix lockdep warnings due to 'double mutex locks) Reported-by: Dmitry Vyukov <dvyukov@google.com> Tested-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Arnd Bergmann authored
commit b33c8ff4 upstream. In my randconfig tests, I came across a bug that involves several components: * gcc-4.9 through at least 5.3 * CONFIG_GCOV_PROFILE_ALL enabling -fprofile-arcs for all files * CONFIG_PROFILE_ALL_BRANCHES overriding every if() * The optimized implementation of do_div() that tries to replace a library call with an division by multiplication * code in drivers/media/dvb-frontends/zl10353.c doing u32 adc_clock = 450560; /* 45.056 MHz */ if (state->config.adc_clock) adc_clock = state->config.adc_clock; do_div(value, adc_clock); In this case, gcc fails to determine whether the divisor in do_div() is __builtin_constant_p(). In particular, it concludes that __builtin_constant_p(adc_clock) is false, while __builtin_constant_p(!!adc_clock) is true. That in turn throws off the logic in do_div() that also uses __builtin_constant_p(), and instead of picking either the constant- optimized division, and the code in ilog2() that uses __builtin_constant_p() to figure out whether it knows the answer at compile time. The result is a link error from failing to find multiple symbols that should never have been called based on the __builtin_constant_p(): dvb-frontends/zl10353.c:138: undefined reference to `____ilog2_NaN' dvb-frontends/zl10353.c:138: undefined reference to `__aeabi_uldivmod' ERROR: "____ilog2_NaN" [drivers/media/dvb-frontends/zl10353.ko] undefined! ERROR: "__aeabi_uldivmod" [drivers/media/dvb-frontends/zl10353.ko] undefined! This patch avoids the problem by changing __trace_if() to check whether the condition is known at compile-time to be nonzero, rather than checking whether it is actually a constant. I see this one link error in roughly one out of 1600 randconfig builds on ARM, and the patch fixes all known instances. Link: http://lkml.kernel.org/r/1455312410-1058841-1-git-send-email-arnd@arndb.deAcked-by: Nicolas Pitre <nico@linaro.org> Fixes: ab3c9c68 ("branch tracer, intel-iommu: fix build with CONFIG_BRANCH_TRACER=y") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Steven Rostedt (Red Hat) authored
commit f3775549 upstream. The tracepoint infrastructure uses RCU sched protection to enable and disable tracepoints safely. There are some instances where tracepoints are used in infrastructure code (like kfree()) that get called after a CPU is going offline, and perhaps when it is coming back online but hasn't been registered yet. This can probuce the following warning: [ INFO: suspicious RCU usage. ] 4.4.0-00006-g0fe53e8-dirty #34 Tainted: G S ------------------------------- include/trace/events/kmem.h:141 suspicious rcu_dereference_check() usage! other info that might help us debug this: RCU used illegally from offline CPU! rcu_scheduler_active = 1, debug_locks = 1 no locks held by swapper/8/0. stack backtrace: CPU: 8 PID: 0 Comm: swapper/8 Tainted: G S 4.4.0-00006-g0fe53e8-dirty #34 Call Trace: [c0000005b76c78d0] [c0000000008b9540] .dump_stack+0x98/0xd4 (unreliable) [c0000005b76c7950] [c00000000010c898] .lockdep_rcu_suspicious+0x108/0x170 [c0000005b76c79e0] [c00000000029adc0] .kfree+0x390/0x440 [c0000005b76c7a80] [c000000000055f74] .destroy_context+0x44/0x100 [c0000005b76c7b00] [c0000000000934a0] .__mmdrop+0x60/0x150 [c0000005b76c7b90] [c0000000000e3ff0] .idle_task_exit+0x130/0x140 [c0000005b76c7c20] [c000000000075804] .pseries_mach_cpu_die+0x64/0x310 [c0000005b76c7cd0] [c000000000043e7c] .cpu_die+0x3c/0x60 [c0000005b76c7d40] [c0000000000188d8] .arch_cpu_idle_dead+0x28/0x40 [c0000005b76c7db0] [c000000000101e6c] .cpu_startup_entry+0x50c/0x560 [c0000005b76c7ed0] [c000000000043bd8] .start_secondary+0x328/0x360 [c0000005b76c7f90] [c000000000008a6c] start_secondary_prolog+0x10/0x14 This warning is not a false positive either. RCU is not protecting code that is being executed while the CPU is offline. Instead of playing "whack-a-mole(TM)" and adding conditional statements to the tracepoints we find that are used in this instance, simply add a cpu_online() test to the tracepoint code where the tracepoint will be ignored if the CPU is offline. Use of raw_smp_processor_id() is fine, as there should never be a case where the tracepoint code goes from running on a CPU that is online and suddenly gets migrated to a CPU that is offline. Link: http://lkml.kernel.org/r/1455387773-4245-1-git-send-email-kda@linux-powerpc.orgReported-by: Denis Kirjanov <kda@linux-powerpc.org> Fixes: 97e1c18e ("tracing: Kernel Tracepoints") Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Andy Shevchenko authored
commit ee1cdcda upstream. The commit 2895b2ca ("dmaengine: dw: fix cyclic transfer callbacks") re-enabled BLOCK interrupts with regard to make cyclic transfers work. However, this change becomes a regression for non-cyclic transfers as interrupt counters under stress test had been grown enormously (approximately per 4-5 bytes in the UART loop back test). Taking into consideration above enable BLOCK interrupts if and only if channel is programmed to perform cyclic transfer. Fixes: 2895b2ca ("dmaengine: dw: fix cyclic transfer callbacks") Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Acked-by: Mans Rullgard <mans@mansr.com> Tested-by: Mans Rullgard <mans@mansr.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Vinod Koul <vinod.koul@intel.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Takashi Iwai authored
commit 0b8c8219 upstream. The commit [991f86d7: ALSA: hda - Flush the pending probe work at remove] introduced the sync of async probe work at remove for fixing the race. However, this may lead to another hangup when the module removal is performed quickly before starting the probe work, because it issues flush_work() and it's blocked forever. The workaround is to use cancel_work_sync() instead of flush_work() there. Fixes: 991f86d7 ('ALSA: hda - Flush the pending probe work at remove') Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Takashi Iwai authored
commit d99a36f4 upstream. When multiple concurrent writes happen on the ALSA sequencer device right after the open, it may try to allocate vmalloc buffer for each write and leak some of them. It's because the presence check and the assignment of the buffer is done outside the spinlock for the pool. The fix is to move the check and the assignment into the spinlock. (The current implementation is suboptimal, as there can be multiple unnecessary vmallocs because the allocation is done before the check in the spinlock. But the pool size is already checked beforehand, so this isn't a big problem; that is, the only possible path is the multiple writes before any pool assignment, and practically seen, the current coverage should be "good enough".) The issue was triggered by syzkaller fuzzer. BugLink: http://lkml.kernel.org/r/CACT4Y+bSzazpXNvtAr=WXaL8hptqjHwqEyFA+VN2AWEx=aurkg@mail.gmail.comReported-by: Dmitry Vyukov <dvyukov@google.com> Tested-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Konrad Rzeszutek Wilk authored
commit 4d8c8bd6 upstream. Occasionaly PV guests would crash with: pciback 0000:00:00.1: Xen PCI mapped GSI0 to IRQ16 BUG: unable to handle kernel paging request at 0000000d1a8c0be0 .. snip.. <ffffffff8139ce1b>] find_next_bit+0xb/0x10 [<ffffffff81387f22>] cpumask_next_and+0x22/0x40 [<ffffffff813c1ef8>] pci_device_probe+0xb8/0x120 [<ffffffff81529097>] ? driver_sysfs_add+0x77/0xa0 [<ffffffff815293e4>] driver_probe_device+0x1a4/0x2d0 [<ffffffff813c1ddd>] ? pci_match_device+0xdd/0x110 [<ffffffff81529657>] __device_attach_driver+0xa7/0xb0 [<ffffffff815295b0>] ? __driver_attach+0xa0/0xa0 [<ffffffff81527622>] bus_for_each_drv+0x62/0x90 [<ffffffff8152978d>] __device_attach+0xbd/0x110 [<ffffffff815297fb>] device_attach+0xb/0x10 [<ffffffff813b75ac>] pci_bus_add_device+0x3c/0x70 [<ffffffff813b7618>] pci_bus_add_devices+0x38/0x80 [<ffffffff813dc34e>] pcifront_scan_root+0x13e/0x1a0 [<ffffffff817a0692>] pcifront_backend_changed+0x262/0x60b [<ffffffff814644c6>] ? xenbus_gather+0xd6/0x160 [<ffffffff8120900f>] ? put_object+0x2f/0x50 [<ffffffff81465c1d>] xenbus_otherend_changed+0x9d/0xa0 [<ffffffff814678ee>] backend_changed+0xe/0x10 [<ffffffff81463a28>] xenwatch_thread+0xc8/0x190 [<ffffffff810f22f0>] ? woken_wake_function+0x10/0x10 which was the result of two things: When we call pci_scan_root_bus we would pass in 'sd' (sysdata) pointer which was an 'pcifront_sd' structure. However in the pci_device_add it expects that the 'sd' is 'struct sysdata' and sets the dev->node to what is in sd->node (offset 4): set_dev_node(&dev->dev, pcibus_to_node(bus)); __pcibus_to_node(const struct pci_bus *bus) { const struct pci_sysdata *sd = bus->sysdata; return sd->node; } However our structure was pcifront_sd which had nothing at that offset: struct pcifront_sd { int domain; /* 0 4 */ /* XXX 4 bytes hole, try to pack */ struct pcifront_device * pdev; /* 8 8 */ } That is an hole - filled with garbage as we used kmalloc instead of kzalloc (the second problem). This patch fixes the issue by: 1) Use kzalloc to initialize to a well known state. 2) Put 'struct pci_sysdata' at the start of 'pcifront_sd'. That way access to the 'node' will access the right offset. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Konrad Rzeszutek Wilk authored
commit d159457b upstream. Commit 8135cf8b (xen/pciback: Save xen_pci_op commands before processing it) broke enabling MSI-X because it would never copy the resulting vectors into the response. The number of vectors requested was being overwritten by the return value (typically zero for success). Save the number of vectors before processing the op, so the correct number of vectors are copied afterwards. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Konrad Rzeszutek Wilk authored
commit 8d47065f upstream. Commit 408fb0e5 (xen/pciback: Don't allow MSI-X ops if PCI_COMMAND_MEMORY is not set) prevented enabling MSI-X on passed-through virtual functions, because it checked the VF for PCI_COMMAND_MEMORY but this is not a valid bit for VFs. Instead, check the physical function for PCI_COMMAND_MEMORY. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Gavin Shan authored
commit 1bc74f1c upstream. When PCI bus is unplugged during full hotplug for EEH recovery, the platform PE instance (struct pnv_ioda_pe) isn't released and it dereferences the stale PCI bus that has been released. It leads to kernel crash when referring to the stale PCI bus. This fixes the issue by correcting the PE's primary bus when it's oneline at plugging time, in pnv_pci_dma_bus_setup() which is to be called by pcibios_fixup_bus(). Reported-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Reported-by: Pradipta Ghosh <pradghos@in.ibm.com> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Tested-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Gavin Shan authored
commit 05ba75f8 upstream. When PE is created, its primary bus is cached to pe->bus. At later point, the cached primary bus is returned from eeh_pe_bus_get(). However, we could get stale cached primary bus and run into kernel crash in one case: full hotplug as part of fenced PHB error recovery releases all PCI busses under the PHB at unplugging time and recreate them at plugging time. pe->bus is still dereferencing the PCI bus that was released. This adds another PE flag (EEH_PE_PRI_BUS) to represent the validity of pe->bus. pe->bus is updated when its first child EEH device is online and the flag is set. Before unplugging in full hotplug for error recovery, the flag is cleared. Fixes: 8cdb2833 ("powerpc/eeh: Trace PCI bus from PE") Reported-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Reported-by: Pradipta Ghosh <pradghos@in.ibm.com> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> Tested-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Luca Coelho authored
commit 5e56276e upstream. The firmware can perform a scheduled scan with not matchsets passed, but it can't send notification that results were found. Since the userspace then cannot know when we got new results and the firmware wouldn't trigger a wake in case we are sleeping, it's better not to allow scans without matchsets. This fixes https://bugzilla.kernel.org/show_bug.cgi?id=110831Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Hannes Reinecke authored
commit 2d99b55d upstream. Commit 35dc2483 introduced a check for current->mm to see if we have a user space context and only copies data if we do. Now if an IO gets interrupted by a signal data isn't copied into user space any more (as we don't have a user space context) but user space isn't notified about it. This patch modifies the behaviour to return -EINTR from bio_uncopy_user() to notify userland that a signal has interrupted the syscall, otherwise it could lead to a situation where the caller may get a buffer with no data returned. This can be reproduced by issuing SG_IO ioctl()s in one thread while constantly sending signals to it. Fixes: 35dc2483 [SCSI] sg: Fix user memory corruption when SG_IO is interrupted by a signal Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Jens Axboe <axboe@fb.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Eryu Guan authored
commit bcff2488 upstream. I notice ext4/307 fails occasionally on ppc64 host, reporting md5 checksum mismatch after moving data from original file to donor file. The reason is that move_extent_per_page() calls __block_write_begin() and block_commit_write() to write saved data from original inode blocks to donor inode blocks, but __block_write_begin() not only maps buffer heads but also reads block content from disk if the size is not block size aligned. At this time the physical block number in mapped buffer head is pointing to the donor file not the original file, and that results in reading wrong data to page, which get written to disk in following block_commit_write call. This also can be reproduced by the following script on 1k block size ext4 on x86_64 host: mnt=/mnt/ext4 donorfile=$mnt/donor testfile=$mnt/testfile e4compact=~/xfstests/src/e4compact rm -f $donorfile $testfile # reserve space for donor file, written by 0xaa and sync to disk to # avoid EBUSY on EXT4_IOC_MOVE_EXT xfs_io -fc "pwrite -S 0xaa 0 1m" -c "fsync" $donorfile # create test file written by 0xbb xfs_io -fc "pwrite -S 0xbb 0 1023" -c "fsync" $testfile # compute initial md5sum md5sum $testfile | tee md5sum.txt # drop cache, force e4compact to read data from disk echo 3 > /proc/sys/vm/drop_caches # test defrag echo "$testfile" | $e4compact -i -v -f $donorfile # check md5sum md5sum -c md5sum.txt Fix it by creating & mapping buffer heads only but not reading blocks from disk, because all the data in page is guaranteed to be up-to-date in mext_page_mkuptodate(). Signed-off-by: Eryu Guan <guaneryu@gmail.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Insu Yun authored
commit 46901760 upstream. Since sizeof(ext_new_group_data) > sizeof(ext_new_flex_group_data), integer overflow could be happened. Therefore, need to fix integer overflow sanitization. Signed-off-by: Insu Yun <wuninsu@gmail.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
James Bottomley authored
commit 90a88d6e upstream. This softlockup is currently happening: [ 444.088002] NMI watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [kworker/1:1:29] [ 444.088002] Modules linked in: lpfc(-) qla2x00tgt(O) qla2xxx_scst(O) scst_vdisk(O) scsi_transport_fc libcrc32c scst(O) dlm configfs nfsd lockd grace nfs_acl auth_rpcgss sunrpc ed d snd_pcm_oss snd_mixer_oss snd_seq snd_seq_device dm_mod iTCO_wdt snd_hda_codec_realtek snd_hda_codec_generic gpio_ich iTCO_vendor_support ppdev snd_hda_intel snd_hda_codec snd_hda _core snd_hwdep tg3 snd_pcm snd_timer libphy lpc_ich parport_pc ptp acpi_cpufreq snd pps_core fjes parport i2c_i801 ehci_pci tpm_tis tpm sr_mod cdrom soundcore floppy hwmon sg 8250_ fintek pcspkr i915 drm_kms_helper uhci_hcd ehci_hcd drm fb_sys_fops sysimgblt sysfillrect syscopyarea i2c_algo_bit usbcore button video usb_common fan ata_generic ata_piix libata th ermal [ 444.088002] CPU: 1 PID: 29 Comm: kworker/1:1 Tainted: G O 4.4.0-rc5-2.g1e923a3-default #1 [ 444.088002] Hardware name: FUJITSU SIEMENS ESPRIMO E /D2164-A1, BIOS 5.00 R1.10.2164.A1 05/08/2006 [ 444.088002] Workqueue: fc_wq_4 fc_rport_final_delete [scsi_transport_fc] [ 444.088002] task: f6266ec0 ti: f6268000 task.ti: f6268000 [ 444.088002] EIP: 0060:[<c07e7044>] EFLAGS: 00000286 CPU: 1 [ 444.088002] EIP is at _raw_spin_unlock_irqrestore+0x14/0x20 [ 444.088002] EAX: 00000286 EBX: f20d3800 ECX: 00000002 EDX: 00000286 [ 444.088002] ESI: f50ba800 EDI: f2146848 EBP: f6269ec8 ESP: f6269ec8 [ 444.088002] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 [ 444.088002] CR0: 8005003b CR2: 08f96600 CR3: 363ae000 CR4: 000006d0 [ 444.088002] Stack: [ 444.088002] f6269eec c066b0f7 00000286 f2146848 f50ba808 f50ba800 f50ba800 f2146a90 [ 444.088002] f2146848 f6269f08 f8f0a4ed f3141000 f2146800 f2146a90 f619fa00 00000040 [ 444.088002] f6269f40 c026cb25 00000001 166c6392 00000061 f6757140 f6136340 00000004 [ 444.088002] Call Trace: [ 444.088002] [<c066b0f7>] scsi_remove_target+0x167/0x1c0 [ 444.088002] [<f8f0a4ed>] fc_rport_final_delete+0x9d/0x1e0 [scsi_transport_fc] [ 444.088002] [<c026cb25>] process_one_work+0x155/0x3e0 [ 444.088002] [<c026cde7>] worker_thread+0x37/0x490 [ 444.088002] [<c027214b>] kthread+0x9b/0xb0 [ 444.088002] [<c07e72c1>] ret_from_kernel_thread+0x21/0x40 What appears to be happening is that something has pinned the target so it can't go into STARGET_DEL via final release and the loop in scsi_remove_target spins endlessly until that happens. The fix for this soft lockup is to not keep looping over a device that we've called remove on but which hasn't gone into DEL state. This patch will retain a simplistic memory of the last target and not keep looping over it. Reported-by: Sebastian Herbszt <herbszt@gmx.de> Tested-by: Sebastian Herbszt <herbszt@gmx.de> Fixes: 40998193Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Ashok Kumar authored
commit 004fa08d upstream. When the GIC is using EOImode==1, the EOI is done immediately, leaving the deactivation to be performed when the EOI was previously done. Unfortunately, the ITS is not aware of the EOImode at all, and blindly EOIs the interrupt again. On most systems, this is ignored (despite being a programming error), but some others do raise a SError exception as there is no priority drop to perform for this interrupt. The fix is to stop trying to be clever, and always call into the underlying GIC to perform the right access, irrespective of the more we're in. [Marc: Reworked commit message] Fixes: 0b996fd3 ("irqchip/GICv3: Convert to EOImode == 1") Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Ashok Kumar <ashoks@broadcom.com> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
David Sterba authored
commit bc4ef759 upstream. The value of ctx->pos in the last readdir call is supposed to be set to INT_MAX due to 32bit compatibility, unless 'pos' is intentially set to a larger value, then it's LLONG_MAX. There's a report from PaX SIZE_OVERFLOW plugin that "ctx->pos++" overflows (https://forums.grsecurity.net/viewtopic.php?f=1&t=4284), on a 64bit arch, where the value is 0x7fffffffffffffff ie. LLONG_MAX before the increment. We can get to that situation like that: * emit all regular readdir entries * still in the same call to readdir, bump the last pos to INT_MAX * next call to readdir will not emit any entries, but will reach the bump code again, finds pos to be INT_MAX and sets it to LLONG_MAX Normally this is not a problem, but if we call readdir again, we'll find 'pos' set to LLONG_MAX and the unconditional increment will overflow. The report from Victor at (http://thread.gmane.org/gmane.comp.file-systems.btrfs/49500) with debugging print shows that pattern: Overflow: e Overflow: 7fffffff Overflow: 7fffffffffffffff PAX: size overflow detected in function btrfs_real_readdir fs/btrfs/inode.c:5760 cicus.935_282 max, count: 9, decl: pos; num: 0; context: dir_context; CPU: 0 PID: 2630 Comm: polkitd Not tainted 4.2.3-grsec #1 Hardware name: Gigabyte Technology Co., Ltd. H81ND2H/H81ND2H, BIOS F3 08/11/2015 ffffffff81901608 0000000000000000 ffffffff819015e6 ffffc90004973d48 ffffffff81742f0f 0000000000000007 ffffffff81901608 ffffc90004973d78 ffffffff811cb706 0000000000000000 ffff8800d47359e0 ffffc90004973ed8 Call Trace: [<ffffffff81742f0f>] dump_stack+0x4c/0x7f [<ffffffff811cb706>] report_size_overflow+0x36/0x40 [<ffffffff812ef0bc>] btrfs_real_readdir+0x69c/0x6d0 [<ffffffff811dafc8>] iterate_dir+0xa8/0x150 [<ffffffff811e6d8d>] ? __fget_light+0x2d/0x70 [<ffffffff811dba3a>] SyS_getdents+0xba/0x1c0 Overflow: 1a [<ffffffff811db070>] ? iterate_dir+0x150/0x150 [<ffffffff81749b69>] entry_SYSCALL_64_fastpath+0x12/0x83 The jump from 7fffffff to 7fffffffffffffff happens when new dir entries are not yet synced and are processed from the delayed list. Then the code could go to the bump section again even though it might not emit any new dir entries from the delayed list. The fix avoids entering the "bump" section again once we've finished emitting the entries, both for synced and delayed entries. References: https://forums.grsecurity.net/viewtopic.php?f=1&t=4284Reported-by: Victor <services@swwu.com> Signed-off-by: David Sterba <dsterba@suse.com> Tested-by: Holger Hoffstätte <holger.hoffstaette@googlemail.com> Signed-off-by: Chris Mason <clm@fb.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Linus Walleij authored
commit e972c374 upstream. Since the dawn of time the ICST code has only supported divide by one or hang in an eternal loop. Luckily we were always dividing by one because the reference frequency for the systems using the ICSTs is 24MHz and the [min,max] values for the PLL input if [10,320] MHz for ICST307 and [6,200] for ICST525, so the loop will always terminate immediately without assigning any divisor for the reference frequency. But for the code to make sense, let's insert the missing i++ Reported-by: David Binderman <dcb314@hotmail.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Stefan Haberland authored
commit 9d862aba upstream. Add refcount to the DASD device when a summary unit check worker is scheduled. This prevents that the device is set offline with worker in place. Signed-off-by: Stefan Haberland <stefan.haberland@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Stefan Haberland authored
commit 020bf042 upstream. The channel checks the specified length and the provided amount of data for CCWs and provides an incorrect length error if the size does not match. Under z/VM with simulation activated the length may get changed. Having the suppress length indication bit set is stated as good CCW coding practice and avoids errors under z/VM. Signed-off-by: Stefan Haberland <stefan.haberland@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Anton Protopopov authored
commit 4b550af5 upstream. The setup_ntlmv2_rsp() function may return positive value ENOMEM instead of -ENOMEM in case of kmalloc failure. Signed-off-by: Anton Protopopov <a.s.protopopov@gmail.com> Signed-off-by: Steve French <smfrench@gmail.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Christian König authored
commit cc1de6e8 upstream. Otherwise we could try to evict overlapping userptr BOs in get_user_pages(), leading to a possible circular locking dependency. Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Nicolai Hähnle authored
commit f6ff4f67 upstream. An arbitrary amount of time can pass between spin_unlock and radeon_fence_wait_any, so we need to ensure that nobody frees the fences from under us. Based on the analogous fix for amdgpu. Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Nicolai Hähnle authored
commit b19763d0 upstream. Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Flora Cui authored
commit ca198528 upstream. No need to re-init asic if it's already been initialized. Skip IB tests since kernel processes are frozen in thaw. Signed-off-by: Flora Cui <Flora.Cui@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> [ kamal: backport to 4.2-stable: context ] Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Tejun Heo authored
commit d6e022f1 upstream. When looking up the pool_workqueue to use for an unbound workqueue, workqueue assumes that the target CPU is always bound to a valid NUMA node. However, currently, when a CPU goes offline, the mapping is destroyed and cpu_to_node() returns NUMA_NO_NODE. This has always been broken but hasn't triggered often enough before 874bbfe6 ("workqueue: make sure delayed work run in local cpu"). After the commit, workqueue forcifully assigns the local CPU for delayed work items without explicit target CPU to fix a different issue. This widens the window where CPU can go offline while a delayed work item is pending causing delayed work items dispatched with target CPU set to an already offlined CPU. The resulting NUMA_NO_NODE mapping makes workqueue try to queue the work item on a NULL pool_workqueue and thus crash. While 874bbfe6 has been reverted for a different reason making the bug less visible again, it can still happen. Fix it by mapping NUMA_NO_NODE to the default pool_workqueue from unbound_pwq_by_node(). This is a temporary workaround. The long term solution is keeping CPU -> NODE mapping stable across CPU off/online cycles which is being worked on. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Mike Galbraith <umgwanakikbuti@gmail.com> Cc: Tang Chen <tangchen@cn.fujitsu.com> Cc: Rafael J. Wysocki <rafael@kernel.org> Cc: Len Brown <len.brown@intel.com> Link: http://lkml.kernel.org/g/1454424264.11183.46.camel@gmail.com Link: http://lkml.kernel.org/g/1453702100-2597-1-git-send-email-tangchen@cn.fujitsu.comSigned-off-by: Kamal Mostafa <kamal@canonical.com>
-
Alexandra Yates authored
commit 342decff upstream. Adding Intel codename DNV platform device IDs for SATA. Signed-off-by: Alexandra Yates <alexandra.yates@linux.intel.com> Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-
Rasmus Villemoes authored
commit ed3f9fd1 upstream. This fails to undo the setup for pin==0; moreover, something interesting happens if the setup failed already at pin==0. Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Fixes: f899fc64 ("drm/i915: use GMBUS to manage i2c links") Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/1455048677-19882-3-git-send-email-linux@rasmusvillemoes.dk (cherry picked from commit 2417c8c0) Signed-off-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Kamal Mostafa <kamal@canonical.com>
-