- 07 Nov, 2014 40 commits
-
-
Al Viro authored
Signed-off-by:
Al Viro <viro@zeniv.linux.org.uk> (cherry picked from commit b61625d2) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Sanjeev Sharma authored
on some architecture spin_is_locked() always return false in uniprocessor configuration and therefore it would be advise to replace with lockdep_assert_held(). Signed-off-by:
Sanjeev Sharma <Sanjeev_Sharma@mentor.com> Acked-by:
Hans de Goede <hdegoede@redhat.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit ab945eff) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Mark Knibbs authored
Castlewood Systems supplied various models of USB-SCSI converter with their ORB external removable-media drive. The ORB Windows and Macintosh drivers support six USB IDs: 084B:A001 [VID 084B is Castlewood Systems] 04E6:0002 (*) ORB USB Smart Cable P/N 88205-001 (generic SCM ID) 2027:A001 Double-H Technology DH-2000SC 1822:0001 (*) Ariston iConnect/iSCSI 07AF:0004 (*) Microtech XpressSCSI (25-pin) 07AF:0005 (*) Microtech XpressSCSI (50-pin) *: quirk already in unusual-devs.h [Apparently the official VID for Double-H Technology is 0x07EB = 2027 decimal. That's another hex/decimal mix-up with these SCM-based products (in addition to the Ariston and Entrega ones). Perhaps the USB-IF informed companies of their allocated VID in decimal, but they assumed it was hex? It seems all Entrega products used VID 0x1645, not just the USB-SCSI converter.] Double-H Technology Co., Ltd. produced a USB-SCSI converter, model DH-2000SC, which is probably the one supported by the ORB drivers. Perhaps the Castlewood-bundled product had a different label or PID though? Castlewood mentioned Conmate as being one type of USB-SCSI converter. Conmate and Double-H seem related somehow; both company addresses in the same road, and at one point the Conmate web site mentioned DH-2000H4, DH-200D4/DH-2000C4 as models of USB hub (DH short for Double-H presumably). Conmate did show a USB-SCSI converter model CM-660 on their web site at one point. My guess is that was identical to the DH-2000SC. Mention of the Double-H product: http://web.archive.org/web/20010221010141/http://www.doubleh.com.tw/dh-2000sc.htm The only picture I could find is at http://jp.acesuppliers.com/catalog/j64/component/page03.html The casing design looks the same as my ORB USB Smart Cable which has ID 04E6:0002. Anyway, that's enough rambling. Here's the patch. storage: Add quirks for Castlewood and Double-H USB-SCSI converters Add quirks for two SCM-based USB-SCSI converters which were bundled with some Castlewood ORB removable drives. Without the quirk only the (single) drive with SCSI ID 0 can be accessed. Signed-off-by:
Mark Knibbs <markk@clara.co.uk> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit 57cde01a) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Mark Knibbs authored
There is apparently another SCM USB-SCSI converter with ID 04E6:000F. It is listed along with 04E6:000B in the Windows INF file for the Startech ICUSBSCSI2 as "eUSB SCSI Adapter (Bus Powered)". The quirk allows devices with SCSI ID other than 0 to be accessed. Also make a couple of existing SCM product IDs lower case to be consistent with other entries. Signed-off-by:
Mark Knibbs <markk@clara.co.uk> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit 3512e7bf) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Lu Baolu authored
This full-speed USB device generates spurious remote wakeup event as soon as USB_DEVICE_REMOTE_WAKEUP feature is set. As the result, Linux can't enter system suspend and S0ix power saving modes once this keyboard is used. This patch tries to introduce USB_QUIRK_IGNORE_REMOTE_WAKEUP quirk. With this quirk set, wakeup capability will be ignored during device configure. This patch could be back-ported to kernels as old as 2.6.39. Signed-off-by:
Lu Baolu <baolu.lu@linux.intel.com> Acked-by:
Alan Stern <stern@rowland.harvard.edu> Cc: stable <stable@vger.kernel.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit ddbe1fca) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Eliad Peller authored
Upstream commit a5fe8e76 (regulatory: add NUL to alpha2) contained a hunk that was supposed to be applied to struct ieee80211_reg_rule. However in stable 3.12 (3.12.31 in particular), it ended up in struct regulatory_request. Fix that now. Signed-off-by:
Jiri Slaby <jslaby@suse.cz> Cc: Eliad Peller <eliadx.peller@intel.com> Cc: Johannes Berg <johannes.berg@intel.com> (cherry picked from commit 84197d64) (cherry picked from commit HEAD) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Pali Rohár authored
Without this patch, dell-wmi is trying to access elements of dynamically allocated array without checking the array size. This can lead to memory corruption or a kernel panic. This patch adds the missing checks for array size. Signed-off-by:
Pali Rohár <pali.rohar@gmail.com> Signed-off-by:
Darren Hart <dvhart@linux.intel.com> (cherry picked from commit a666b6ff) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Benjamin Tissoires authored
Commit "HID: logitech: perform bounds checking on device_id early enough" unfortunately leaks some errors to dmesg which are not real ones: - if the report is not a DJ one, then there is not point in checking the device_id - the receiver (index 0) can also receive some notifications which can be safely ignored given the current implementation Move out the test regarding the report_id and also discards printing errors when the receiver got notified. Fixes: ad3e14d7 Cc: stable@vger.kernel.org Reported-and-tested-by:
Markus Trippelsdorf <markus@trippelsdorf.de> Signed-off-by:
Benjamin Tissoires <benjamin.tissoires@redhat.com> Signed-off-by:
Jiri Kosina <jkosina@suse.cz> (cherry picked from commit 5abfe85c) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Jiri Kosina authored
device_index is a char type and the size of paired_dj_deivces is 7 elements, therefore proper bounds checking has to be applied to device_index before it is used. We are currently performing the bounds checking in logi_dj_recv_add_djhid_device(), which is too late, as malicious device could send REPORT_TYPE_NOTIF_DEVICE_UNPAIRED early enough and trigger the problem in one of the report forwarding functions called from logi_dj_raw_event(). Fix this by performing the check at the earliest possible ocasion in logi_dj_raw_event(). Cc: stable@vger.kernel.org Reported-by:
Ben Hawkes <hawkes@google.com> Reviewed-by:
Benjamin Tissoires <benjamin.tissoires@redhat.com> Signed-off-by:
Jiri Kosina <jkosina@suse.cz> (cherry picked from commit ad3e14d7) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
David S. Miller authored
Meelis Roos reported that kernels built with gcc-4.9 do not boot, we eventually narrowed this down to only impacting machines using UltraSPARC-III and derivitive cpus. The crash happens right when the first user process is spawned: [ 54.451346] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000004 [ 54.451346] [ 54.571516] CPU: 1 PID: 1 Comm: init Not tainted 3.16.0-rc2-00211-gd7933ab7 #96 [ 54.666431] Call Trace: [ 54.698453] [0000000000762f8c] panic+0xb0/0x224 [ 54.759071] [000000000045cf68] do_exit+0x948/0x960 [ 54.823123] [000000000042cbc0] fault_in_user_windows+0xe0/0x100 [ 54.902036] [0000000000404ad0] __handle_user_windows+0x0/0x10 [ 54.978662] Press Stop-A (L1-A) to return to the boot prom [ 55.050713] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000004 Further investigation showed that compiling only per_cpu_patch() with an older compiler fixes the boot. Detailed analysis showed that the function is not being miscompiled by gcc-4.9, but it is using a different register allocation ordering. With the gcc-4.9 compiled function, something during the code patching causes some of the %i* input registers to get corrupted. Perhaps we have a TLB miss path into the firmware that is deep enough to cause a register window spill and subsequent restore when we get back from the TLB miss trap. Let's plug this up by doing two things: 1) Stop using the firmware stack for client interface calls into the firmware. Just use the kernel's stack. 2) As soon as we can, call into a new function "start_early_boot()" to put a one-register-window buffer between the firmware's deepest stack frame and the top-most initial kernel one. Reported-by:
Meelis Roos <mroos@linux.ee> Tested-by:
Meelis Roos <mroos@linux.ee> Signed-off-by:
David S. Miller <davem@davemloft.net> (cherry picked from commit ef3e035c) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
bob picco authored
This patch attempts to do a few things. The highlights are: 1) enable SPARSE_IRQ unconditionally, 2) kills off !SPARSE_IRQ code 3) allocates ivector_table at boot time and 4) default to cookie only VIRQ mechanism for supported firmware. The first firmware with cookie only support for me appears on T5. You can optionally force the HV firmware to not cookie only mode which is the sysino support. The sysino is a deprecated HV mechanism according to the most recent SPARC Virtual Machine Specification. HV_GRP_INTR is what controls the cookie/sysino firmware versioning. The history of this interface is: 1) Major version 1.0 only supported sysino based interrupt interfaces. 2) Major version 2.0 added cookie based VIRQs, however due to the fact that OSs were using the VIRQs without negoatiating major version 2.0 (Linux and Solaris are both guilty), the VIRQs calls were allowed even with major version 1.0 To complicate things even further, the VIRQ interfaces were only actually hooked up in the hypervisor for LDC interrupt sources. VIRQ calls on other device types would result in HV_EINVAL errors. So effectively, major version 2.0 is unusable. 3) Major version 3.0 was created to signal use of VIRQs and the fact that the hypervisor has these calls hooked up for all interrupt sources, not just those for LDC devices. A new boot option is provided should cookie only HV support have issues. hvirq - this is the version for HV_GRP_INTR. This is related to HV API versioning. The code attempts major=3 first by default. The option can be used to override this default. I've tested with SPARSE_IRQ on T5-8, M7-4 and T4-X and Jalap?no. Signed-off-by:
Bob Picco <bob.picco@oracle.com> Signed-off-by:
David S. Miller <davem@davemloft.net> (cherry picked from commit ee6a9333) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Bryan O'Donoghue authored
This patch is to enable the USB gadget device for Intel Quark X1000 Signed-off-by:
Bryan O'Donoghue <bryan.odonoghue@intel.com> Signed-off-by:
Bing Niu <bing.niu@intel.com> Signed-off-by:
Alvin (Weike) Chen <alvin.chen@intel.com> Signed-off-by:
Felipe Balbi <balbi@ti.com> (cherry picked from commit a68df706) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Steffen Klassert authored
Currently we genarate a blackhole route route whenever we have matching policies but can not resolve the states. Here we assume that dst_output() is called to kill the balckholed packets. Unfortunately this assumption is not true in all cases, so it is possible that these packets leave the system unwanted. We fix this by generating blackhole routes only from the route lookup functions, here we can guarantee a call to dst_output() afterwards. Fixes: 2774c131 ("xfrm: Handle blackhole route creation via afinfo.") Reported-by:
Konstantinos Kolelis <k.kolelis@sirrix.com> Signed-off-by:
Steffen Klassert <steffen.klassert@secunet.com> (cherry picked from commit f92ee619) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Felipe Balbi authored
Currently, we disable pm_runtime before all register accesses are done, this is dangerous and might lead to abort exceptions due to the driver trying to access a register which is clocked by a clock which was long gated. Fix that by moving pm_runtime_put_sync() and pm_runtime_disable() as the last thing we do before returning from our ->remove() method. Fixes: 72246da4 (usb: Introduce DesignWare USB3 DRD Driver) Cc: <stable@vger.kernel.org> # v3.2+ Signed-off-by:
Felipe Balbi <balbi@ti.com> (cherry picked from commit fed33afc) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
bob picco authored
The T5 (niagara5) has different PCR related HV fast trap values and a new HV API Group. This patch utilizes these and shares when possible with niagara4. We use the same sparc_pmu niagara4_pmu. Should there be new effort to obtain the MCU perf statistics then this would have to be changed. Cc: sparclinux@vger.kernel.org Signed-off-by:
Bob Picco <bob.picco@oracle.com> Signed-off-by:
David S. Miller <davem@davemloft.net> (cherry picked from commit 05aa1651) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Alexei Starovoitov authored
- fix BPF_LD|ABS|IND from negative offsets: make sure to sign extend lower 32 bits in 64-bit register before calling C helpers from JITed code, otherwise 'int k' argument of bpf_internal_load_pointer_neg_helper() function will be added as large unsigned integer, causing packet size check to trigger and abort the program. It's worth noting that JITed code for 'A = A op K' will affect upper 32 bits differently depending whether K is simm13 or not. Since small constants are sign extended, whereas large constants are stored in temp register and zero extended. That is ok and we don't have to pay a penalty of sign extension for every sethi, since all classic BPF instructions have 32-bit semantics and we only need to set correct upper bits when transitioning from JITed code into C. - though instructions 'A &= 0' and 'A *= 0' are odd, JIT compiler should not optimize them out Signed-off-by:
Alexei Starovoitov <ast@plumgrid.com> Signed-off-by:
David S. Miller <davem@davemloft.net> (cherry picked from commit 35607b02) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
David S. Miller authored
Christopher reports that perf_event_print_debug() can crash in uniprocessor builds. The crash is due to pcr_ops being NULL. This happens because pcr_arch_init() is only invoked by smp_cpus_done() which only executes in SMP builds. init_hw_perf_events() is closely intertwined with pcr_ops being setup properly, therefore: 1) Call pcr_arch_init() early on from init_hw_perf_events(), instead of from smp_cpus_done(). 2) Do not hook up a PMU type if pcr_ops is NULL after pcr_arch_init(). 3) Move init_hw_perf_events to a later initcall so that it we will be sure to invoke pcr_arch_init() after all cpus are brought up. Finally, guard the one naked sequence of pcr_ops dereferences in __global_pmu_self() with an appropriate NULL check. Reported-by:
Christopher Alexander Tobias Schulze <cat.schulze@alice-dsl.net> Signed-off-by:
David S. Miller <davem@davemloft.net> (cherry picked from commit 8bccf5b3) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
David S. Miller authored
nmi_cpu_busy() is a SMP function call that just makes sure that all of the cpus are spinning using cpu cycles while the NMI test runs. It does not need to disable IRQs because we just care about NMIs executing which will even with 'normal' IRQs disabled. It is not legal to enable hard IRQs in a SMP cross call, in fact this bug triggers the BUG check in irq_work_run_list(): BUG_ON(!irqs_disabled()); Because now irq_work_run() is invoked from the tail of generic_smp_call_function_single_interrupt(). Signed-off-by:
David S. Miller <davem@davemloft.net> (cherry picked from commit 58556104) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Bryan O'Donoghue authored
This patch is to enable the USB gadget device for Intel Quark X1000 Signed-off-by:
Bryan O'Donoghue <bryan.odonoghue@intel.com> Signed-off-by:
Bing Niu <bing.niu@intel.com> Signed-off-by:
Alvin (Weike) Chen <alvin.chen@intel.com> Signed-off-by:
Felipe Balbi <balbi@ti.com> (cherry picked from commit a68df706) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Steffen Klassert authored
Currently we genarate a blackhole route route whenever we have matching policies but can not resolve the states. Here we assume that dst_output() is called to kill the balckholed packets. Unfortunately this assumption is not true in all cases, so it is possible that these packets leave the system unwanted. We fix this by generating blackhole routes only from the route lookup functions, here we can guarantee a call to dst_output() afterwards. Fixes: 2774c131 ("xfrm: Handle blackhole route creation via afinfo.") Reported-by:
Konstantinos Kolelis <k.kolelis@sirrix.com> Signed-off-by:
Steffen Klassert <steffen.klassert@secunet.com> (cherry picked from commit f92ee619) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Vlad Yasevich authored
TG3 appears to have an issue performing TSO and checksum offloading correclty when the frame has been vlan encapsulated (non-accelrated). In these cases, tcp checksum is not correctly updated. This patch attempts to work around this issue. After the patch, 802.1ad vlans start working correctly over tg3 devices. CC: Prashant Sreedharan <prashant@broadcom.com> CC: Michael Chan <mchan@broadcom.com> Signed-off-by:
Vladislav Yasevich <vyasevic@redhat.com> Signed-off-by:
David S. Miller <davem@davemloft.net> (cherry picked from commit 476c1885) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
David S. Miller authored
It is not sufficient to only implement get_user_pages_fast(), you must also implement the atomic version __get_user_pages_fast() otherwise you end up using the weak symbol fallback implementation which simply returns zero. This is dangerous, because it causes the futex code to loop forever if transparent hugepages are supported (see get_futex_key()). Signed-off-by:
David S. Miller <davem@davemloft.net> (cherry picked from commit 06090e8e) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Dave Kleikamp authored
This is the longest boot string that silo supports. Signed-off-by:
Dave Kleikamp <dave.kleikamp@oracle.com> Cc: Bob Picco <bob.picco@oracle.com> Cc: David S. Miller <davem@davemloft.net> Cc: sparclinux@vger.kernel.org Signed-off-by:
David S. Miller <davem@davemloft.net> (cherry picked from commit 1cef94c3) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
David S. Miller authored
As currently coded the KTSB accesses in the kernel only support up to 47 bits of physical addressing. Adjust the instruction and patching sequence in order to support arbitrary 64 bits addresses. Signed-off-by:
David S. Miller <davem@davemloft.net> Acked-by:
Bob Picco <bob.picco@oracle.com> (cherry picked from commit 8c82dc0e) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
bob picco authored
The T5 (niagara5) has different PCR related HV fast trap values and a new HV API Group. This patch utilizes these and shares when possible with niagara4. We use the same sparc_pmu niagara4_pmu. Should there be new effort to obtain the MCU perf statistics then this would have to be changed. Cc: sparclinux@vger.kernel.org Signed-off-by:
Bob Picco <bob.picco@oracle.com> Signed-off-by:
David S. Miller <davem@davemloft.net> (cherry picked from commit 05aa1651) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
David S. Miller authored
This breaks the stack end corruption detection facility. What that facility does it write a magic value to "end_of_stack()" and checking to see if it gets overwritten. "end_of_stack()" is "task_thread_info(p) + 1", which for sparc64 is the beginning of the FPU register save area. So once the user uses the FPU, the magic value is overwritten and the debug checks trigger. Fix this by making the size explicit. Due to the size we use for the fpsaved[], gsr[], and xfsr[] arrays we are limited to 7 levels of FPU state saves. So each FPU register set is 256 bytes, allocate 256 * 7 for the fpregs area. Reported-by:
Meelis Roos <mroos@linux.ee> Signed-off-by:
David S. Miller <davem@davemloft.net> (cherry picked from commit e2653143) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
David S. Miller authored
The AES loops in arch/sparc/crypto/aes_glue.c use a scheme where the key material is preloaded into the FPU registers, and then we loop over and over doing the crypt operation, reusing those pre-cooked key registers. There are intervening blkcipher*() calls between the crypt operation calls. And those might perform memcpy() and thus also try to use the FPU. The sparc64 kernel FPU usage mechanism is designed to allow such recursive uses, but with a catch. There has to be a trap between the two FPU using threads of control. The mechanism works by, when the FPU is already in use by the kernel, allocating a slot for FPU saving at trap time. Then if, within the trap handler, we try to use the FPU registers, the pre-trap FPU register state is saved into the slot. Then at trap return time we notice this and restore the pre-trap FPU state. Over the long term there are various more involved ways we can make this work, but for a quick fix let's take advantage of the fact that the situation where this happens is very limited. All sparc64 chips that support the crypto instructiosn also are using the Niagara4 memcpy routine, and that routine only uses the FPU for large copies where we can't get the source aligned properly to a multiple of 8 bytes. We look to see if the FPU is already in use in this context, and if so we use the non-large copy path which only uses integer registers. Furthermore, we also limit this special logic to when we are doing kernel copy, rather than a user copy. Signed-off-by:
David S. Miller <davem@davemloft.net> (cherry picked from commit f4da3628) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
David S. Miller authored
Inconsistently, the raw_* IRQ routines do not interact with and update the irqflags tracing and lockdep state, whereas the raw_* spinlock interfaces do. This causes problems in p1275_cmd_direct() because we disable hardirqs by hand using raw_local_irq_restore() and then do a raw_spin_lock() which triggers a lockdep trace because the CPU's hw IRQ state doesn't match IRQ tracing's internal software copy of that state. The CPU's irqs are disabled, yet current->hardirqs_enabled is true. ==================== reboot: Restarting system ------------[ cut here ]------------ WARNING: CPU: 0 PID: 1 at kernel/locking/lockdep.c:3536 check_flags+0x7c/0x240() DEBUG_LOCKS_WARN_ON(current->hardirqs_enabled) Modules linked in: openpromfs CPU: 0 PID: 1 Comm: systemd-shutdow Tainted: G W 3.17.0-dirty #145 Call Trace: [000000000045919c] warn_slowpath_common+0x5c/0xa0 [0000000000459210] warn_slowpath_fmt+0x30/0x40 [000000000048f41c] check_flags+0x7c/0x240 [0000000000493280] lock_acquire+0x20/0x1c0 [0000000000832b70] _raw_spin_lock+0x30/0x60 [000000000068f2fc] p1275_cmd_direct+0x1c/0x60 [000000000068ed28] prom_reboot+0x28/0x40 [000000000043610c] machine_restart+0x4c/0x80 [000000000047d2d4] kernel_restart+0x54/0x80 [000000000047d618] SyS_reboot+0x138/0x200 [00000000004060b4] linux_sparc_syscall32+0x34/0x60 ---[ end trace 5c439fe81c05a100 ]--- possible reason: unannotated irqs-off. irq event stamp: 2010267 hardirqs last enabled at (2010267): [<000000000049a358>] vprintk_emit+0x4b8/0x580 hardirqs last disabled at (2010266): [<0000000000499f08>] vprintk_emit+0x68/0x580 softirqs last enabled at (2010046): [<000000000045d278>] __do_softirq+0x378/0x4a0 softirqs last disabled at (2010039): [<000000000042bf08>] do_softirq_own_stack+0x28/0x40 Resetting ... ==================== Use local_* variables of the hw IRQ interfaces so that IRQ tracing sees all of our changes. Reported-by:
Meelis Roos <mroos@linux.ee> Tested-by:
Meelis Roos <mroos@linux.ee> Signed-off-by:
David S. Miller <davem@davemloft.net> (cherry picked from commit bdcf81b6) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
David S. Miller authored
When we have to split up a flush request into multiple pieces (in order to avoid the firmware range) we don't specify the arguments in the right order for the second piece. Fix the order, or else we get hangs as the code tries to flush "a lot" of entries and we get lockups like this: [ 4422.981276] NMI watchdog: BUG: soft lockup - CPU#12 stuck for 23s! [expect:117032] [ 4422.996130] Modules linked in: ipv6 loop usb_storage igb ptp sg sr_mod ehci_pci ehci_hcd pps_core n2_rng rng_core [ 4423.016617] CPU: 12 PID: 117032 Comm: expect Not tainted 3.17.0-rc4+ #1608 [ 4423.030331] task: fff8003cc730e220 ti: fff8003d99d54000 task.ti: fff8003d99d54000 [ 4423.045282] TSTATE: 0000000011001602 TPC: 00000000004521e8 TNPC: 00000000004521ec Y: 00000000 Not tainted [ 4423.064905] TPC: <__flush_tlb_kernel_range+0x28/0x40> [ 4423.074964] g0: 000000000052fd10 g1: 00000001295a8000 g2: ffffff7176ffc000 g3: 0000000000002000 [ 4423.092324] g4: fff8003cc730e220 g5: fff8003dfedcc000 g6: fff8003d99d54000 g7: 0000000000000006 [ 4423.109687] o0: 0000000000000000 o1: 0000000000000000 o2: 0000000000000003 o3: 00000000f0000000 [ 4423.127058] o4: 0000000000000080 o5: 00000001295a8000 sp: fff8003d99d56d01 ret_pc: 000000000052ff54 [ 4423.145121] RPC: <__purge_vmap_area_lazy+0x314/0x3a0> [ 4423.155185] l0: 0000000000000000 l1: 0000000000000000 l2: 0000000000a38040 l3: 0000000000000000 [ 4423.172559] l4: fff8003dae8965e0 l5: ffffffffffffffff l6: 0000000000000000 l7: 00000000f7e2b138 [ 4423.189913] i0: fff8003d99d576a0 i1: fff8003d99d576a8 i2: fff8003d99d575e8 i3: 0000000000000000 [ 4423.207284] i4: 0000000000008008 i5: fff8003d99d575c8 i6: fff8003d99d56df1 i7: 0000000000530c24 [ 4423.224640] I7: <free_vmap_area_noflush+0x64/0x80> [ 4423.234193] Call Trace: [ 4423.239051] [0000000000530c24] free_vmap_area_noflush+0x64/0x80 [ 4423.251029] [0000000000531a7c] remove_vm_area+0x5c/0x80 [ 4423.261628] [0000000000531b80] __vunmap+0x20/0x120 [ 4423.271352] [000000000071cf18] n_tty_close+0x18/0x40 [ 4423.281423] [00000000007222b0] tty_ldisc_close+0x30/0x60 [ 4423.292183] [00000000007225a4] tty_ldisc_reinit+0x24/0xa0 [ 4423.303120] [0000000000722ab4] tty_ldisc_hangup+0xd4/0x1e0 [ 4423.314232] [0000000000719aa0] __tty_hangup+0x280/0x3c0 [ 4423.324835] [0000000000724cb4] pty_close+0x134/0x1a0 [ 4423.334905] [000000000071aa24] tty_release+0x104/0x500 [ 4423.345316] [00000000005511d0] __fput+0x90/0x1e0 [ 4423.354701] [000000000047fa54] task_work_run+0x94/0xe0 [ 4423.365126] [0000000000404b44] __handle_signal+0xc/0x2c Fixes: 4ca9a237 ("sparc64: Guard against flushing openfirmware mappings.") Signed-off-by:
David S. Miller <davem@davemloft.net> (cherry picked from commit 473ad7f4) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Andreas Larsson authored
This makes memset follow the standard (instead of returning 0 on success). This is needed when certain versions of gcc optimizes around memset calls and assume that the address argument is preserved in %o0. Signed-off-by:
Andreas Larsson <andreas@gaisler.com> Signed-off-by:
David S. Miller <davem@davemloft.net> (cherry picked from commit 74cad25c) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
bob picco authored
We have seen an issue with guest boot into LDOM that causes early boot failures because of no matching rules for node identitity of the memory. I analyzed this on my T4 and concluded there might not be a solution. I saw the issue in mainline too when booting into the control/primary domain - with guests configured. Note, this could be a firmware bug on some older machines. I'll provide a full explanation of the issues below. Should we not find a matching BEST latency group for a real address (RA) then we will assume node 0. On the T4-2 here with the information provided I can't see an alternative. Technically the LDOM shown below should match the MBLOCK to the favorable latency group. However other factors must be considered too. Were the memory controllers configured "fine" grained interleave or "coarse" grain interleaved - T4. Also should a "group" MD node be considered a NUMA node? There has to be at least one Machine Description (MD) "group" and hence one NUMA node. The group can have one or more latency groups (lg) - more than one memory controller. The current code chooses the smallest latency as the most favorable per group. The latency and lg information is in MLGROUP below. MBLOCK is the base and size of the RAs for the machine as fetched from OBP /memory "available" property. My machine has one MBLOCK but more would be possible - with holes? For a T4-2 the following information has been gathered: with LDOM guest MEMBLOCK configuration: memory size = 0x27f870000 memory.cnt = 0x3 memory[0x0] [0x00000020400000-0x0000029fc67fff], 0x27f868000 bytes memory[0x1] [0x0000029fd8a000-0x0000029fd8bfff], 0x2000 bytes memory[0x2] [0x0000029fd92000-0x0000029fd97fff], 0x6000 bytes reserved.cnt = 0x2 reserved[0x0] [0x00000020800000-0x000000216c15c0], 0xec15c1 bytes reserved[0x1] [0x00000024800000-0x0000002c180c1e], 0x7980c1f bytes MBLOCK[0]: base[20000000] size[280000000] offset[0] (note: "base" and "size" reported in "MBLOCK" encompass the "memory[X]" values) (note: (RA + offset) & mask = val is the formula to detect a match for the memory controller. should there be no match for find_node node, a return value of -1 resulted for the node - BAD) There is one group. It has these forward links MLGROUP[1]: node[545] latency[1f7e8] match[200000000] mask[200000000] MLGROUP[2]: node[54d] latency[2de60] match[0] mask[200000000] NUMA NODE[0]: node[545] mask[200000000] val[200000000] (latency[1f7e8]) (note: "val" is the best lg's (smallest latency) "match") no LDOM guest - bare metal MEMBLOCK configuration: memory size = 0xfdf2d0000 memory.cnt = 0x3 memory[0x0] [0x00000020400000-0x00000fff6adfff], 0xfdf2ae000 bytes memory[0x1] [0x00000fff6d2000-0x00000fff6e7fff], 0x16000 bytes memory[0x2] [0x00000fff766000-0x00000fff771fff], 0xc000 bytes reserved.cnt = 0x2 reserved[0x0] [0x00000020800000-0x00000021a04580], 0x1204581 bytes reserved[0x1] [0x00000024800000-0x0000002c7d29fc], 0x7fd29fd bytes MBLOCK[0]: base[20000000] size[fe0000000] offset[0] there are two groups group node[16d5] MLGROUP[0]: node[1765] latency[1f7e8] match[0] mask[200000000] MLGROUP[3]: node[177d] latency[2de60] match[200000000] mask[200000000] NUMA NODE[0]: node[1765] mask[200000000] val[0] (latency[1f7e8]) group node[171d] MLGROUP[2]: node[1775] latency[2de60] match[0] mask[200000000] MLGROUP[1]: node[176d] latency[1f7e8] match[200000000] mask[200000000] NUMA NODE[1]: node[176d] mask[200000000] val[200000000] (latency[1f7e8]) (note: for this two "group" bare metal machine, 1/2 memory is in group one's lg and 1/2 memory is in group two's lg). Cc: sparclinux@vger.kernel.org Signed-off-by:
Bob Picco <bob.picco@oracle.com> Signed-off-by:
David S. Miller <davem@davemloft.net> (cherry picked from commit 3dee9df5) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
David S. Miller authored
Every path that ends up at do_sparc64_fault() must install a valid FAULT_CODE_* bitmask in the per-thread fault code byte. Two paths leading to the label winfix_trampoline (which expects the FAULT_CODE_* mask in register %g4) were not doing so: 1) For pre-hypervisor TLB protection violation traps, if we took the 'winfix_trampoline' path we wouldn't have %g4 initialized with the FAULT_CODE_* value yet. Resulting in using the TLB_TAG_ACCESS register address value instead. 2) In the TSB miss path, when we notice that we are going to use a hugepage mapping, but we haven't allocated the hugepage TSB yet, we still have to take the window fixup case into consideration and in that particular path we leave %g4 not setup properly. Errors on this sort were largely invisible previously, but after commit 4ccb9272 ("sparc64: sun4v TLB error power off events") we now have a fault_code mask bit (FAULT_CODE_BAD_RA) that triggers due to this bug. FAULT_CODE_BAD_RA triggers because this bit is set in TLB_TAG_ACCESS (see #1 above) and thus we get seemingly random bus errors triggered for user processes. Fixes: 4ccb9272 ("sparc64: sun4v TLB error power off events") Reported-by:
Meelis Roos <mroos@linux.ee> Signed-off-by:
David S. Miller <davem@davemloft.net> (cherry picked from commit 84bd6d8b) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
bob picco authored
[ Upstream commit 4ccb9272 ] We've witnessed a few TLB events causing the machine to power off because of prom_halt. In one case it was some nfs related area during rmmod. Another was an mmapper of /dev/mem. A more recent one is an ITLB issue with a bad pagesize which could be a hardware bug. Bugs happen but we should attempt to not power off the machine and/or hang it when possible. This is a DTLB error from an mmapper of /dev/mem: [root@sparcie ~]# SUN4V-DTLB: Error at TPC[fffff80100903e6c], tl 1 SUN4V-DTLB: TPC<0xfffff80100903e6c> SUN4V-DTLB: O7[fffff801081979d0] SUN4V-DTLB: O7<0xfffff801081979d0> SUN4V-DTLB: vaddr[fffff80100000000] ctx[1250] pte[98000000000f0610] error[2] . This is recent mainline for ITLB: [ 3708.179864] SUN4V-ITLB: TPC<0xfffffc010071cefc> [ 3708.188866] SUN4V-ITLB: O7[fffffc010071cee8] [ 3708.197377] SUN4V-ITLB: O7<0xfffffc010071cee8> [ 3708.206539] SUN4V-ITLB: vaddr[e0003] ctx[1a3c] pte[2900000dcc800eeb] error[4] . Normally sun4v_itlb_error_report() and sun4v_dtlb_error_report() would call prom_halt() and drop us to OF command prompt "ok". This isn't the case for LDOMs and the machine powers off. For the HV reported error of HV_ENORADDR for HV HV_MMU_MAP_ADDR_TRAP we cause a SIGBUS error by qualifying it within do_sparc64_fault() for fault code mask of FAULT_CODE_BAD_RA. This is done when trap level (%tl) is less or equal one("1"). Otherwise, for %tl > 1, we proceed eventually to die_if_kernel(). The logic of this patch was partially inspired by David Miller's feedback. Power off of large sparc64 machines is painful. Plus die_if_kernel provides more context. A reset sequence isn't a brief period on large sparc64 but better than power-off/power-on sequence. Cc: sparclinux@vger.kernel.org Signed-off-by:
Bob Picco <bob.picco@oracle.com> Signed-off-by:
David S. Miller <davem@davemloft.net> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit ebc43a92)
-
Daniel Hellstrom authored
dma_zalloc_coherent() calls dma_alloc_coherent(__GFP_ZERO) but the sparc32 implementations sbus_alloc_coherent() and pci32_alloc_coherent() doesn't take the gfp flags into account. Tested on the SPARC32/LEON GRETH Ethernet driver which fails due to dma_alloc_coherent(__GFP_ZERO) returns non zeroed pages. Signed-off-by:
Daniel Hellstrom <daniel@gaisler.com> Signed-off-by:
David S. Miller <davem@davemloft.net> (cherry picked from commit d1105287) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Dmitry Kasatkin authored
3.16 commit aad4f8bb 'switch simple generic_file_aio_read() users to ->read_iter()' replaced ->aio_read with ->read_iter in most of the file systems and introduced new_sync_read() as a replacement for do_sync_read(). Most of file systems set '->read' and ima_kernel_read is not affected. When ->read is not set, this patch adopts fallback call changes from the vfs_read. Signed-off-by:
Dmitry Kasatkin <d.kasatkin@samsung.com> Signed-off-by:
Mimi Zohar <zohar@linux.vnet.ibm.com> Cc: <stable@vger.kernel.org> 3.16+ (cherry picked from commit 27cd1fc3) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Catalin Marinas authored
Commit b0c29f79 (futexes: Avoid taking the hb->lock if there's nothing to wake up) changes the futex code to avoid taking a lock when there are no waiters. This code has been subsequently fixed in commit 11d4616b (futex: revert back to the explicit waiter counting code). Both the original commit and the fix-up rely on get_futex_key_refs() to always imply a barrier. However, for private futexes, none of the cases in the switch statement of get_futex_key_refs() would be hit and the function completes without a memory barrier as required before checking the "waiters" in futex_wake() -> hb_waiters_pending(). The consequence is a race with a thread waiting on a futex on another CPU, allowing the waker thread to read "waiters == 0" while the waiter thread to have read "futex_val == locked" (in kernel). Without this fix, the problem (user space deadlocks) can be seen with Android bionic's mutex implementation on an arm64 multi-cluster system. Signed-off-by:
Catalin Marinas <catalin.marinas@arm.com> Reported-by:
Matteo Franchin <Matteo.Franchin@arm.com> Fixes: b0c29f79 (futexes: Avoid taking the hb->lock if there's nothing to wake up) Acked-by:
Davidlohr Bueso <dave@stgolabs.net> Tested-by:
Mike Galbraith <umgwanakikbuti@gmail.com> Cc: <stable@vger.kernel.org> Cc: Darren Hart <dvhart@linux.intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> (cherry picked from commit 76835b0e) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Sasha Levin authored
We're missing include/linux/compiler-gcc5.h which is required now because gcc branched off to v5 in trunk. Just copy the relevant bits out of include/linux/compiler-gcc4.h, no new code is added as of now. This fixes a build error when using gcc 5. Signed-off-by:
Sasha Levin <sasha.levin@oracle.com> Cc: <stable@vger.kernel.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> (cherry picked from commit 71458cfc) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Yann Droneaud authored
According to commit 80af2588 ("fanotify: groups can specify their f_flags for new fd"), file descriptors created as part of file access notification events inherit flags from the event_f_flags argument passed to syscall fanotify_init(2)[1]. Unfortunately O_CLOEXEC is currently silently ignored. Indeed, event_f_flags are only given to dentry_open(), which only seems to care about O_ACCMODE and O_PATH in do_dentry_open(), O_DIRECT in open_check_o_direct() and O_LARGEFILE in generic_file_open(). It's a pity, since, according to some lookup on various search engines and http://codesearch.debian.net/, there's already some userspace code which use O_CLOEXEC: - in systemd's readahead[2]: fanotify_fd = fanotify_init(FAN_CLOEXEC|FAN_NONBLOCK, O_RDONLY|O_LARGEFILE|O_CLOEXEC|O_NOATIME); - in clsync[3]: #define FANOTIFY_EVFLAGS (O_LARGEFILE|O_RDONLY|O_CLOEXEC) int fanotify_d = fanotify_init(FANOTIFY_FLAGS, FANOTIFY_EVFLAGS); - in examples [4] from "Filesystem monitoring in the Linux kernel" article[5] by Aleksander Morgado: if ((fanotify_fd = fanotify_init (FAN_CLOEXEC, O_RDONLY | O_CLOEXEC | O_LARGEFILE)) < 0) Additionally, since commit 48149e9d ("fanotify: check file flags passed in fanotify_init"). having O_CLOEXEC as part of fanotify_init() second argument is expressly allowed. So it seems expected to set close-on-exec flag on the file descriptors if userspace is allowed to request it with O_CLOEXEC. But Andrew Morton raised[6] the concern that enabling now close-on-exec might break existing applications which ask for O_CLOEXEC but expect the file descriptor to be inherited across exec(). In the other hand, as reported by Mihai Dontu[7] close-on-exec on the file descriptor returned as part of file access notify can break applications due to deadlock. So close-on-exec is needed for most applications. More, applications asking for close-on-exec are likely expecting it to be enabled, relying on O_CLOEXEC being effective. If not, it might weaken their security, as noted by Jan Kara[8]. So this patch replaces call to macro get_unused_fd() by a call to function get_unused_fd_flags() with event_f_flags value as argument. This way O_CLOEXEC flag in the second argument of fanotify_init(2) syscall is interpreted and close-on-exec get enabled when requested. [1] http://man7.org/linux/man-pages/man2/fanotify_init.2.html [2] http://cgit.freedesktop.org/systemd/systemd/tree/src/readahead/readahead-collect.c?id=v208#n294 [3] https://github.com/xaionaro/clsync/blob/v0.2.1/sync.c#L1631 https://github.com/xaionaro/clsync/blob/v0.2.1/configuration.h#L38 [4] http://www.lanedo.com/~aleksander/fanotify/fanotify-example.c [5] http://www.lanedo.com/2013/filesystem-monitoring-linux-kernel/ [6] http://lkml.kernel.org/r/20141001153621.65e9258e65a6167bf2e4cb50@linux-foundation.org [7] http://lkml.kernel.org/r/20141002095046.3715eb69@mdontu-l [8] http://lkml.kernel.org/r/20141002104410.GB19748@quack.suse.cz Link: http://lkml.kernel.org/r/cover.1411562410.git.ydroneaud@opteya.comSigned-off-by:
Yann Droneaud <ydroneaud@opteya.com> Reviewed-by:
Jan Kara <jack@suse.cz> Reviewed by: Heinrich Schuchardt <xypron.glpk@gmx.de> Tested-by:
Heinrich Schuchardt <xypron.glpk@gmx.de> Cc: Mihai Don\u021bu <mihai.dontu@gmail.com> Cc: Pádraig Brady <P@draigBrady.com> Cc: Heinrich Schuchardt <xypron.glpk@gmx.de> Cc: Jan Kara <jack@suse.cz> Cc: Valdis Kletnieks <Valdis.Kletnieks@vt.edu> Cc: Michael Kerrisk-manpages <mtk.manpages@gmail.com> Cc: Lino Sanfilippo <LinoSanfilippo@gmx.de> Cc: Richard Guy Briggs <rgb@redhat.com> Cc: Eric Paris <eparis@redhat.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> (cherry picked from commit 0b37e097) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Junxiao Bi authored
Fix a deadlock problem caused by direct memory reclaim in o2net_wq. The situation is as follows: 1) Receive a connect message from another node, node queues a work_struct o2net_listen_work. 2) o2net_wq processes this work and call the following functions: o2net_wq -> o2net_accept_one -> sock_create_lite -> sock_alloc() -> kmem_cache_alloc with GFP_KERNEL -> ____cache_alloc_node ->__alloc_pages_nodemask -> do_try_to_free_pages -> shrink_slab -> evict -> ocfs2_evict_inode -> ocfs2_drop_lock -> dlmunlock -> o2net_send_message_vec then o2net_wq wait for the unlock reply from master. 3) tcp layer received the reply, call o2net_data_ready() and queue sc_rx_work, waiting o2net_wq to process this work. 4) o2net_wq is a single thread workqueue, it process the work one by one. Right now it is still doing o2net_listen_work and cannot handle sc_rx_work. so we deadlock. Junxiao Bi's patch "mm: clear __GFP_FS when PF_MEMALLOC_NOIO is set" (http://ozlabs.org/~akpm/mmots/broken-out/mm-clear-__gfp_fs-when-pf_memalloc_noio-is-set.patch) clears __GFP_FS in memalloc_noio_flags() besides __GFP_IO. We use memalloc_noio_save() to set process flag PF_MEMALLOC_NOIO so that all allocations done by this process are done as if GFP_NOIO was specified. We are not reentering filesystem while doing memory reclaim. Signed-off-by:
joyce.xue <xuejiufei@huawei.com> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Mark Fasheh <mfasheh@suse.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> mm: clear __GFP_FS when PF_MEMALLOC_NOIO is set commit 21caf2fc ("mm: teach mm by current context info to not do I/O during memory allocation") introduces PF_MEMALLOC_NOIO flag to avoid doing I/O inside memory allocation, __GFP_IO is cleared when this flag is set, but __GFP_FS implies __GFP_IO, it should also be cleared. Or it may still run into I/O, like in superblock shrinker. And this will make the kernel run into the deadlock case described in that commit. See Dave Chinner's comment about io in superblock shrinker: Filesystem shrinkers do indeed perform IO from the superblock shrinker and have for years. Even clean inodes can require IO before they can be freed - e.g. on an orphan list, need truncation of post-eof blocks, need to wait for ordered operations to complete before it can be freed, etc. IOWs, Ext4, btrfs and XFS all can issue and/or block on arbitrary amounts of IO in the superblock shrinker context. XFS, in particular, has been doing transactions and IO from the VFS inode cache shrinker since it was first introduced.... Fix this by clearing __GFP_FS in memalloc_noio_flags(), this function has masked all the gfp_mask that will be passed into fs for the processes setting PF_MEMALLOC_NOIO in the direct reclaim path. v1 thread at: https://lkml.org/lkml/2014/9/3/32Signed-off-by:
Junxiao Bi <junxiao.bi@oracle.com> Cc: Dave Chinner <david@fromorbit.com> Cc: joyce.xue <xuejiufei@huawei.com> Cc: Ming Lei <ming.lei@canonical.com> Cc: Trond Myklebust <trond.myklebust@primarydata.com> Cc: <stable@vger.kernel.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> (cherry picked from commit b246d3d1 934f3072) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Champion Chen authored
Suspend could fail for some platforms because btusb_suspend==> btusb_stop_traffic ==> usb_kill_anchored_urbs. When btusb_bulk_complete returns before system suspend and resubmits an URB, the system cannot enter suspend state. Signed-off-by:
Champion Chen <champion_chen@realsil.com.cn> Signed-off-by:
Larry Finger <Larry.Finger@lwfinger.net> Signed-off-by:
Marcel Holtmann <marcel@holtmann.org> Cc: stable@vger.kernel.org (cherry picked from commit 85560c4a) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-