- 12 Aug, 2014 40 commits
-
-
Thomas Richter authored
Remove the bonding debug_fs entries when the module initialization fails. The debug_fs entries should be removed together with all other already allocated resources. Signed-off-by:
Thomas Richter <tmricht@linux.vnet.ibm.com> Signed-off-by:
Jay Vosburgh <j.vosburgh@gmail.com> Signed-off-by:
David S. Miller <davem@davemloft.net> (cherry picked from commit db298686) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Florian Westphal authored
In case of tcp, gso_size contains the tcpmss. For UFO (udp fragmentation offloading) skbs, gso_size is the fragment payload size, i.e. we must not account for udp header size. Otherwise, when using virtio drivers, a to-be-forwarded UFO GSO packet will be needlessly fragmented in the forward path, because we think its individual segments are too large for the outgoing link. Fixes: fe6cc55f ("net: ip, ipv6: handle gso skbs in forwarding path") Cc: Eric Dumazet <eric.dumazet@gmail.com> Reported-by:
Tobias Brunner <tobias@strongswan.org> Signed-off-by:
Florian Westphal <fw@strlen.de> Signed-off-by:
David S. Miller <davem@davemloft.net> (cherry picked from commit 6d39d589) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Dmitry Petukhov authored
When l2tp driver tries to get PMTU for the tunnel destination, it uses the pointer to struct sock that represents PPPoX socket, while it should use the pointer that represents UDP socket of the tunnel. Signed-off-by:
Dmitry Petukhov <dmgenp@gmail.com> Signed-off-by:
David S. Miller <davem@davemloft.net> (cherry picked from commit f34c4a35) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Daniel Borkmann authored
In function sctp_wake_up_waiters(), we need to involve a test if the association is declared dead. If so, we don't have any reference to a possible sibling association anymore and need to invoke sctp_write_space() instead, and normally walk the socket's associations and notify them of new wmem space. The reason for special casing is that otherwise, we could run into the following issue when a sctp_primitive_SEND() call from sctp_sendmsg() fails, and tries to flush an association's outq, i.e. in the following way: sctp_association_free() `-> list_del(&asoc->asocs) <-- poisons list pointer asoc->base.dead = true sctp_outq_free(&asoc->outqueue) `-> __sctp_outq_teardown() `-> sctp_chunk_free() `-> consume_skb() `-> sctp_wfree() `-> sctp_wake_up_waiters() <-- dereferences poisoned pointers if asoc->ep->sndbuf_policy=0 Therefore, only walk the list in an 'optimized' way if we find that the current association is still active. We could also use list_del_init() in addition when we call sctp_association_free(), but as Vlad suggests, we want to trap such bugs and thus leave it poisoned as is. Why is it safe to resolve the issue by testing for asoc->base.dead? Parallel calls to sctp_sendmsg() are protected under socket lock, that is lock_sock()/release_sock(). Only within that path under lock held, we're setting skb/chunk owner via sctp_set_owner_w(). Eventually, chunks are freed directly by an association still under that lock. So when traversing association list on destruction time from sctp_wake_up_waiters() via sctp_wfree(), a different CPU can't be running sctp_wfree() while another one calls sctp_association_free() as both happens under the same lock. Therefore, this can also not race with setting/testing against asoc->base.dead as we are guaranteed for this to happen in order, under lock. Further, Vlad says: the times we check asoc->base.dead is when we've cached an association pointer for later processing. In between cache and processing, the association may have been freed and is simply still around due to reference counts. We check asoc->base.dead under a lock, so it should always be safe to check and not race against sctp_association_free(). Stress-testing seems fine now, too. Fixes: cd253f9f357d ("net: sctp: wake up all assocs if sndbuf policy is per socket") Signed-off-by:
Daniel Borkmann <dborkman@redhat.com> Cc: Vlad Yasevich <vyasevic@redhat.com> Acked-by:
Neil Horman <nhorman@tuxdriver.com> Acked-by:
Vlad Yasevich <vyasevic@redhat.com> Signed-off-by:
David S. Miller <davem@davemloft.net> (cherry picked from commit 1e1cdf8a) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Daniel Borkmann authored
SCTP charges chunks for wmem accounting via skb->truesize in sctp_set_owner_w(), and sctp_wfree() respectively as the reverse operation. If a sender runs out of wmem, it needs to wait via sctp_wait_for_sndbuf(), and gets woken up by a call to __sctp_write_space() mostly via sctp_wfree(). __sctp_write_space() is being called per association. Although we assign sk->sk_write_space() to sctp_write_space(), which is then being done per socket, it is only used if send space is increased per socket option (SO_SNDBUF), as SOCK_USE_WRITE_QUEUE is set and therefore not invoked in sock_wfree(). Commit 4c3a5bda ("sctp: Don't charge for data in sndbuf again when transmitting packet") fixed an issue where in case sctp_packet_transmit() manages to queue up more than sndbuf bytes, sctp_wait_for_sndbuf() will never be woken up again unless it is interrupted by a signal. However, a still remaining issue is that if net.sctp.sndbuf_policy=0, that is accounting per socket, and one-to-many sockets are in use, the reclaimed write space from sctp_wfree() is 'unfairly' handed back on the server to the association that is the lucky one to be woken up again via __sctp_write_space(), while the remaining associations are never be woken up again (unless by a signal). The effect disappears with net.sctp.sndbuf_policy=1, that is wmem accounting per association, as it guarantees a fair share of wmem among associations. Therefore, if we have reclaimed memory in case of per socket accounting, wake all related associations to a socket in a fair manner, that is, traverse the socket association list starting from the current neighbour of the association and issue a __sctp_write_space() to everyone until we end up waking ourselves. This guarantees that no association is preferred over another and even if more associations are taken into the one-to-many session, all receivers will get messages from the server and are not stalled forever on high load. This setting still leaves the advantage of per socket accounting in touch as an association can still use up global limits if unused by others. Fixes: 4eb701df ("[SCTP] Fix SCTP sendbuffer accouting.") Signed-off-by:
Daniel Borkmann <dborkman@redhat.com> Cc: Thomas Graf <tgraf@suug.ch> Cc: Neil Horman <nhorman@tuxdriver.com> Cc: Vlad Yasevich <vyasevic@redhat.com> Acked-by:
Vlad Yasevich <vyasevic@redhat.com> Acked-by:
Neil Horman <nhorman@tuxdriver.com> Signed-off-by:
David S. Miller <davem@davemloft.net> (cherry picked from commit 52c35bef) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Oleg Nesterov authored
Add two trivial helpers list_next_entry() and list_prev_entry(), they can have a lot of users including list.h itself. In fact the 1st one is already defined in events/core.c and bnx2x_sp.c, so the patch simply moves the definition to list.h. Signed-off-by:
Oleg Nesterov <oleg@redhat.com> Cc: Eilon Greenstein <eilong@broadcom.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> (cherry picked from commit 008208c6) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Bjørn Mork authored
A number of older CMOTech modems are based on Qualcomm chips. The blacklisted interfaces are QMI/wwan. Reported-by:
Lars Melin <larsm17@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by:
Bjørn Mork <bjorn@mork.no> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit 34f972d6) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Bjørn Mork authored
Device interface layout: 0: ff/ff/ff - serial 1: ff/00/00 - serial AT+PPP 2: ff/ff/ff - QMI/wwan 3: 08/06/50 - storage Cc: <stable@vger.kernel.org> Signed-off-by:
Bjørn Mork <bjorn@mork.no> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit dd6b48ec) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Bjørn Mork authored
Device interface layout: 0: ff/ff/ff - serial 1: ff/ff/ff - serial AT+PPP 2: 08/06/50 - storage 3: ff/ff/ff - serial 4: ff/ff/ff - QMI/wwan Cc: <stable@vger.kernel.org> Reported-by:
Julio Araujo <julio.araujo@wllctel.com.br> Signed-off-by:
Bjørn Mork <bjorn@mork.no> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit 533b3994) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Bjørn Mork authored
Cc: <stable@vger.kernel.org> Signed-off-by:
Bjørn Mork <bjorn@mork.no> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit bce4f588) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Bjørn Mork authored
Cc: <stable@vger.kernel.org> Signed-off-by:
Bjørn Mork <bjorn@mork.no> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit 70a3615f) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Bjørn Mork authored
Cc: <stable@vger.kernel.org> Signed-off-by:
Bjørn Mork <bjorn@mork.no> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit a00986f8) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Johan Hovold authored
During firmware download the device expects memory addresses in big-endian byte order. As the wIndex parameter which hold the address is sent in little-endian byte order regardless of host byte order, we need to use swab16 rather than cpu_to_be16. Also make sure to handle the struct ti_i2c_desc size parameter which is returned in little-endian byte order. Reported-by:
Ludovic Drolez <ldrolez@debian.org> Tested-by:
Ludovic Drolez <ldrolez@debian.org> Cc: stable@vger.kernel.org Signed-off-by:
Johan Hovold <jhovold@gmail.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit 5509076d) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Denis Turischev authored
The same issue like with Panther Point chipsets. If the USB ports are switched to xHCI on shutdown, the xHCI host will send a spurious interrupt, which will wake the system. Some BIOS have work around for this, but not all. One example is Compulab's mini-desktop, the Intense-PC2. The bug can be avoided if the USB ports are switched back to EHCI on shutdown. This patch should be backported to stable kernels as old as 3.12, that contain the commit 638298dc "xhci: Fix spurious wakeups after S5 on Haswell" Signed-off-by:
Denis Turischev <denis@compulab.co.il> Cc: stable@vger.kernel.org Signed-off-by:
Mathias Nyman <mathias.nyman@linux.intel.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit c09ec25d) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Miao Xie authored
Currently, with inode cache enabled, we will reuse its inode id immediately after unlinking file, we may hit something like following: |->iput inode |->return inode id into inode cache |->create dir,fsync |->power off An easy way to reproduce this problem is: mkfs.btrfs -f /dev/sdb mount /dev/sdb /mnt -o inode_cache,commit=100 dd if=/dev/zero of=/mnt/data bs=1M count=10 oflag=sync inode_id=`ls -i /mnt/data | awk '{print $1}'` rm -f /mnt/data i=1 while [ 1 ] do mkdir /mnt/dir_$i test1=`stat /mnt/dir_$i | grep Inode: | awk '{print $4}'` if [ $test1 -eq $inode_id ] then dd if=/dev/zero of=/mnt/dir_$i/data bs=1M count=1 oflag=sync echo b > /proc/sysrq-trigger fi sleep 1 i=$(($i+1)) done mount /dev/sdb /mnt umount /dev/sdb btrfs check /dev/sdb We fix this problem by adding unlinked inode's id into pinned tree, and we can not reuse them until committing transaction. Cc: stable@vger.kernel.org Signed-off-by:
Miao Xie <miaox@cn.fujitsu.com> Signed-off-by:
Wang Shilong <wangsl.fnst@cn.fujitsu.com> Signed-off-by:
Chris Mason <clm@fb.com> (cherry picked from commit 1c70d8fb) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Johan Hovold authored
Fix driver new_id sysfs-attribute removal deadlock by making sure to not hold any locks that the attribute operations grab when removing the attribute. Specifically, usb_serial_deregister holds the table mutex when deregistering the driver, which includes removing the new_id attribute. This can lead to a deadlock as writing to new_id increments the attribute's active count before trying to grab the same mutex in usb_serial_probe. The deadlock can easily be triggered by inserting a sleep in usb_serial_deregister and writing the id of an unbound device to new_id during module unload. As the table mutex (in this case) is used to prevent subdriver unload during probe, it should be sufficient to only hold the lock while manipulating the usb-serial driver list during deregister. A racing probe will then either fail to find a matching subdriver or fail to get the corresponding module reference. Since v3.15-rc1 this also triggers the following lockdep warning: ====================================================== [ INFO: possible circular locking dependency detected ] 3.15.0-rc2 #123 Tainted: G W ------------------------------------------------------- modprobe/190 is trying to acquire lock: (s_active#4){++++.+}, at: [<c0167aa0>] kernfs_remove_by_name_ns+0x4c/0x94 but task is already holding lock: (table_lock){+.+.+.}, at: [<bf004d84>] usb_serial_deregister+0x3c/0x78 [usbserial] which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (table_lock){+.+.+.}: [<c0075f84>] __lock_acquire+0x1694/0x1ce4 [<c0076de8>] lock_acquire+0xb4/0x154 [<c03af3cc>] _raw_spin_lock+0x4c/0x5c [<c02bbc24>] usb_store_new_id+0x14c/0x1ac [<bf007eb4>] new_id_store+0x68/0x70 [usbserial] [<c025f568>] drv_attr_store+0x30/0x3c [<c01690e0>] sysfs_kf_write+0x5c/0x60 [<c01682c0>] kernfs_fop_write+0xd4/0x194 [<c010881c>] vfs_write+0xbc/0x198 [<c0108e4c>] SyS_write+0x4c/0xa0 [<c000f880>] ret_fast_syscall+0x0/0x48 -> #0 (s_active#4){++++.+}: [<c03a7a28>] print_circular_bug+0x68/0x2f8 [<c0076218>] __lock_acquire+0x1928/0x1ce4 [<c0076de8>] lock_acquire+0xb4/0x154 [<c0166b70>] __kernfs_remove+0x254/0x310 [<c0167aa0>] kernfs_remove_by_name_ns+0x4c/0x94 [<c0169fb8>] remove_files.isra.1+0x48/0x84 [<c016a2fc>] sysfs_remove_group+0x58/0xac [<c016a414>] sysfs_remove_groups+0x34/0x44 [<c02623b8>] driver_remove_groups+0x1c/0x20 [<c0260e9c>] bus_remove_driver+0x3c/0xe4 [<c026235c>] driver_unregister+0x38/0x58 [<bf007fb4>] usb_serial_bus_deregister+0x84/0x88 [usbserial] [<bf004db4>] usb_serial_deregister+0x6c/0x78 [usbserial] [<bf005330>] usb_serial_deregister_drivers+0x2c/0x4c [usbserial] [<bf016618>] usb_serial_module_exit+0x14/0x1c [sierra] [<c009d6cc>] SyS_delete_module+0x184/0x210 [<c000f880>] ret_fast_syscall+0x0/0x48 other info that might help us debug this: Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(table_lock); lock(s_active#4); lock(table_lock); lock(s_active#4); *** DEADLOCK *** 1 lock held by modprobe/190: #0: (table_lock){+.+.+.}, at: [<bf004d84>] usb_serial_deregister+0x3c/0x78 [usbserial] stack backtrace: CPU: 0 PID: 190 Comm: modprobe Tainted: G W 3.15.0-rc2 #123 [<c0015e10>] (unwind_backtrace) from [<c0013728>] (show_stack+0x20/0x24) [<c0013728>] (show_stack) from [<c03a9a54>] (dump_stack+0x24/0x28) [<c03a9a54>] (dump_stack) from [<c03a7cac>] (print_circular_bug+0x2ec/0x2f8) [<c03a7cac>] (print_circular_bug) from [<c0076218>] (__lock_acquire+0x1928/0x1ce4) [<c0076218>] (__lock_acquire) from [<c0076de8>] (lock_acquire+0xb4/0x154) [<c0076de8>] (lock_acquire) from [<c0166b70>] (__kernfs_remove+0x254/0x310) [<c0166b70>] (__kernfs_remove) from [<c0167aa0>] (kernfs_remove_by_name_ns+0x4c/0x94) [<c0167aa0>] (kernfs_remove_by_name_ns) from [<c0169fb8>] (remove_files.isra.1+0x48/0x84) [<c0169fb8>] (remove_files.isra.1) from [<c016a2fc>] (sysfs_remove_group+0x58/0xac) [<c016a2fc>] (sysfs_remove_group) from [<c016a414>] (sysfs_remove_groups+0x34/0x44) [<c016a414>] (sysfs_remove_groups) from [<c02623b8>] (driver_remove_groups+0x1c/0x20) [<c02623b8>] (driver_remove_groups) from [<c0260e9c>] (bus_remove_driver+0x3c/0xe4) [<c0260e9c>] (bus_remove_driver) from [<c026235c>] (driver_unregister+0x38/0x58) [<c026235c>] (driver_unregister) from [<bf007fb4>] (usb_serial_bus_deregister+0x84/0x88 [usbserial]) [<bf007fb4>] (usb_serial_bus_deregister [usbserial]) from [<bf004db4>] (usb_serial_deregister+0x6c/0x78 [usbserial]) [<bf004db4>] (usb_serial_deregister [usbserial]) from [<bf005330>] (usb_serial_deregister_drivers+0x2c/0x4c [usbserial]) [<bf005330>] (usb_serial_deregister_drivers [usbserial]) from [<bf016618>] (usb_serial_module_exit+0x14/0x1c [sierra]) [<bf016618>] (usb_serial_module_exit [sierra]) from [<c009d6cc>] (SyS_delete_module+0x184/0x210) [<c009d6cc>] (SyS_delete_module) from [<c000f880>] (ret_fast_syscall+0x0/0x48) Signed-off-by:
Johan Hovold <jhovold@gmail.com> Cc: stable <stable@vger.kernel.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit 10164c2a) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Liu Hua authored
For vmcore generated by LPAE enabled kernel, user space utility such as crash needs additional infomation to parse. So this patch add arch_crash_save_vmcoreinfo as what PAE enabled i386 linux does. Cc: <stable@vger.kernel.org> Reviewed-by:
Will Deacon <will.deacon@arm.com> Signed-off-by:
Liu Hua <sdu.liu@huawei.com> Signed-off-by:
Russell King <rmk+kernel@arm.linux.org.uk> (cherry picked from commit 56b700fd) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Xiangyu Lu authored
In big-endian systems, "%1" get the most significant part of the value, cause the instruction to get the wrong result. When viewing ftrace record in big-endian ARM systems, we found that the timestamp errors: swapper-0 [001] 1325.970000: 0:120:R ==> [001] 16:120:R events/1 events/1-16 [001] 1325.970000: 16:120:S ==> [001] 0:120:R swapper swapper-0 [000] 1325.1000000: 0:120:R + [000] 15:120:R events/0 swapper-0 [000] 1325.1000000: 0:120:R ==> [000] 15:120:R events/0 swapper-0 [000] 1326.030000: 0:120:R + [000] 1150:120:R sshd swapper-0 [000] 1326.030000: 0:120:R ==> [000] 1150:120:R sshd When viewed ftrace records, it will call the do_div(n, base) function, which achieved arch/arm/include/asm/div64.h in. When n = 10000000, base = 1000000, in do_div(n, base) will execute "umull %Q0, %R0, %1, %Q2". Reviewed-by:
Dave Martin <Dave.Martin@arm.com> Reviewed-by:
Nicolas Pitre <nico@linaro.org> Cc: <stable@vger.kernel.org> # 2.6.20+ Signed-off-by:
Alex Wu <wuquanming@huawei.com> Signed-off-by:
Xiangyu Lu <luxiangyu@huawei.com> Signed-off-by:
Russell King <rmk+kernel@arm.linux.org.uk> (cherry picked from commit 80bb3ef1) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Linus Torvalds authored
fixup_user_fault() is used by the futex code when the direct user access fails, and the futex code wants it to either map in the page in a usable form or return an error. It relied on handle_mm_fault() to map the page, and correctly checked the error return from that, but while that does map the page, it doesn't actually guarantee that the page will be mapped with sufficient permissions to be then accessed. So do the appropriate tests of the vma access rights by hand. [ Side note: arguably handle_mm_fault() could just do that itself, but we have traditionally done it in the caller, because some callers - notably get_user_pages() - have been able to access pages even when they are mapped with PROT_NONE. Maybe we should re-visit that design decision, but in the meantime this is the minimal patch. ] Found by Dave Jones running his trinity tool. Reported-by:
Dave Jones <davej@redhat.com> Acked-by:
Hugh Dickins <hughd@google.com> Cc: stable@vger.kernel.org Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> (cherry picked from commit 1b17844b) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Alex Deucher authored
Some newer PX laptops have the pci device class set to DISPLAY_OTHER rather than DISPLAY_VGA. This properly detects ATPX on those laptops. Based on a patch from: Pali Rohár <pali.rohar@gmail.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Cc: airlied@gmail.com (cherry picked from commit e9a4099a) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Hans de Goede authored
We expect that all the Haswell series will need such quirks, sigh. The T431s seems to be T430 hardware in a T440s case, using the T440s touchpad, with the same min/max issue. The X1 Carbon 3rd generation name says 2nd while it is a 3rd generation. The X1 and T431s share a PnPID with the T540p, but the reported ranges are closer to those of the T440s. HdG: Squashed 5 quirk patches into one. T431s + L440 + L540 are written by me, S1 Yoga and X1 are written by Benjamin Tissoires. Hdg: Standardized S1 Yoga and X1 values, Yoga uses the same touchpad as the X240, X1 uses the same touchpad as the T440. Cc: stable@vger.kernel.org Signed-off-by:
Benjamin Tissoires <benjamin.tissoires@redhat.com> Signed-off-by:
Hans de Goede <hdegoede@redhat.com> Signed-off-by:
Dmitry Torokhov <dmitry.torokhov@gmail.com> (cherry picked from commit 46a2986e) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Dan Williams authored
The AHCI spec allows implementations to issue commands in tag order rather than FIFO order: 5.3.2.12 P:SelectCmd HBA sets pSlotLoc = (pSlotLoc + 1) mod (CAP.NCS + 1) or HBA selects the command to issue that has had the PxCI bit set to '1' longer than any other command pending to be issued. The result is that commands posted sequentially (time-wise) may play out of sequence when issued by hardware. This behavior has likely been hidden by drives that arrange for commands to complete in issue order. However, it appears recent drives (two from different vendors that we have found so far) inflict out-of-order completions as a matter of course. So, we need to take care to maintain ordered submission, otherwise we risk triggering a drive to fall out of sequential-io automation and back to random-io processing, which incurs large latency and degrades throughput. This issue was found in simple benchmarks where QD=2 seq-write performance was 30-50% *greater* than QD=32 seq-write performance. Tagging for -stable and making the change globally since it has a low risk-to-reward ratio. Also, word is that recent versions of an unnamed OS also does it this way now. So, drives in the field are already experienced with this tag ordering scheme. Cc: <stable@vger.kernel.org> Cc: Dave Jiang <dave.jiang@intel.com> Cc: Ed Ciechanowski <ed.ciechanowski@intel.com> Reviewed-by:
Matthew Wilcox <matthew.r.wilcox@intel.com> Signed-off-by:
Dan Williams <dan.j.williams@intel.com> Signed-off-by:
Tejun Heo <tj@kernel.org> (cherry picked from commit 8a4aeec8) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Jeff Layton authored
...otherwise the logic in the timeout handling doesn't work correctly. Spotted-by:
Trond Myklebust <trond.myklebust@primarydata.com> Cc: stable@vger.kernel.org Signed-off-by:
Jeff Layton <jlayton@redhat.com> Signed-off-by:
J. Bruce Fields <bfields@redhat.com> (cherry picked from commit 3758cf7e) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Thomas Gleixner authored
The current implementation of irq_set_affinity() refuses rightfully to route an interrupt to an offline cpu. But there is a special case, where this is actually desired. Some of the ARM SoCs have per cpu timers which require setting the affinity during cpu startup where the cpu is not yet in the online mask. If we can't do that, then the local timer interrupt for the about to become online cpu is routed to some random online cpu. The developers of the affected machines tried to work around that issue, but that results in a massive mess in that timer code. We have a yet unused argument in the set_affinity callbacks of the irq chips, which I added back then for a similar reason. It was never required so it got not used. But I'm happy that I never removed it. That allows us to implement a sane handling of the above scenario. So the affected SoC drivers can add the required force handling to their interrupt chip, switch the timer code to irq_force_affinity() and things just work. This does not affect any existing user of irq_set_affinity(). Tagged for stable to allow a simple fix of the affected SoC clock event drivers. Reported-and-tested-by:
Krzysztof Kozlowski <k.kozlowski@samsung.com> Signed-off-by:
Thomas Gleixner <tglx@linutronix.de> Cc: Kyungmin Park <kyungmin.park@samsung.com> Cc: Marek Szyprowski <m.szyprowski@samsung.com> Cc: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Cc: Tomasz Figa <t.figa@samsung.com>, Cc: Daniel Lezcano <daniel.lezcano@linaro.org>, Cc: Kukjin Kim <kgene.kim@samsung.com> Cc: linux-arm-kernel@lists.infradead.org, Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/20140416143315.717251504@linutronix.deSigned-off-by:
Thomas Gleixner <tglx@linutronix.de> (cherry picked from commit 01f8fa4f) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Jeff Layton authored
A fl->fl_break_time of 0 has a special meaning to the lease break code that basically means "never break the lease". knfsd uses this to ensure that leases don't disappear out from under it. Unfortunately, the code in __break_lease can end up passing this value to wait_event_interruptible as a timeout, which prevents it from going to sleep at all. This makes __break_lease to spin in a tight loop and causes soft lockups. Fix this by ensuring that we pass a minimum value of 1 as a timeout instead. Cc: <stable@vger.kernel.org> Cc: J. Bruce Fields <bfields@fieldses.org> Reported-by:
Terry Barnaby <terry1@beam.ltd.uk> Signed-off-by:
Jeff Layton <jlayton@redhat.com> (cherry picked from commit f1c6bb2c) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Theodore Ts'o authored
We haven't taken i_mutex yet, so we need to use i_size_read(). Signed-off-by:
"Theodore Ts'o" <tytso@mit.edu> Cc: stable@vger.kernel.org (cherry picked from commit 6e6358fc) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Jan Kara authored
When heavily exercising xattr code the assertion that jbd2_journal_dirty_metadata() shouldn't return error was triggered: WARNING: at /srv/autobuild-ceph/gitbuilder.git/build/fs/jbd2/transaction.c:1237 jbd2_journal_dirty_metadata+0x1ba/0x260() CPU: 0 PID: 8877 Comm: ceph-osd Tainted: G W 3.10.0-ceph-00049-g68d04c9 #1 Hardware name: Dell Inc. PowerEdge R410/01V648, BIOS 1.6.3 02/07/2011 ffffffff81a1d3c8 ffff880214469928 ffffffff816311b0 ffff880214469968 ffffffff8103fae0 ffff880214469958 ffff880170a9dc30 ffff8802240fbe80 0000000000000000 ffff88020b366000 ffff8802256e7510 ffff880214469978 Call Trace: [<ffffffff816311b0>] dump_stack+0x19/0x1b [<ffffffff8103fae0>] warn_slowpath_common+0x70/0xa0 [<ffffffff8103fb2a>] warn_slowpath_null+0x1a/0x20 [<ffffffff81267c2a>] jbd2_journal_dirty_metadata+0x1ba/0x260 [<ffffffff81245093>] __ext4_handle_dirty_metadata+0xa3/0x140 [<ffffffff812561f3>] ext4_xattr_release_block+0x103/0x1f0 [<ffffffff81256680>] ext4_xattr_block_set+0x1e0/0x910 [<ffffffff8125795b>] ext4_xattr_set_handle+0x38b/0x4a0 [<ffffffff810a319d>] ? trace_hardirqs_on+0xd/0x10 [<ffffffff81257b32>] ext4_xattr_set+0xc2/0x140 [<ffffffff81258547>] ext4_xattr_user_set+0x47/0x50 [<ffffffff811935ce>] generic_setxattr+0x6e/0x90 [<ffffffff81193ecb>] __vfs_setxattr_noperm+0x7b/0x1c0 [<ffffffff811940d4>] vfs_setxattr+0xc4/0xd0 [<ffffffff8119421e>] setxattr+0x13e/0x1e0 [<ffffffff811719c7>] ? __sb_start_write+0xe7/0x1b0 [<ffffffff8118f2e8>] ? mnt_want_write_file+0x28/0x60 [<ffffffff8118c65c>] ? fget_light+0x3c/0x130 [<ffffffff8118f2e8>] ? mnt_want_write_file+0x28/0x60 [<ffffffff8118f1f8>] ? __mnt_want_write+0x58/0x70 [<ffffffff811946be>] SyS_fsetxattr+0xbe/0x100 [<ffffffff816407c2>] system_call_fastpath+0x16/0x1b The reason for the warning is that buffer_head passed into jbd2_journal_dirty_metadata() didn't have journal_head attached. This is caused by the following race of two ext4_xattr_release_block() calls: CPU1 CPU2 ext4_xattr_release_block() ext4_xattr_release_block() lock_buffer(bh); /* False */ if (BHDR(bh)->h_refcount == cpu_to_le32(1)) } else { le32_add_cpu(&BHDR(bh)->h_refcount, -1); unlock_buffer(bh); lock_buffer(bh); /* True */ if (BHDR(bh)->h_refcount == cpu_to_le32(1)) get_bh(bh); ext4_free_blocks() ... jbd2_journal_forget() jbd2_journal_unfile_buffer() -> JH is gone error = ext4_handle_dirty_xattr_block(handle, inode, bh); -> triggers the warning We fix the problem by moving ext4_handle_dirty_xattr_block() under the buffer lock. Sadly this cannot be done in nojournal mode as that function can call sync_dirty_buffer() which would deadlock. Luckily in nojournal mode the race is harmless (we only dirty already freed buffer) and thus for nojournal mode we leave the dirtying outside of the buffer lock. Reported-by:
Sage Weil <sage@inktank.com> Signed-off-by:
Jan Kara <jack@suse.cz> Signed-off-by:
"Theodore Ts'o" <tytso@mit.edu> Cc: stable@vger.kernel.org (cherry picked from commit ec4cb1aa) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Matthew Wilcox authored
ext4_end_bio() currently throws away the error that it receives. Chances are this is part of a spate of errors, one of which will end up getting the error returned to userspace somehow, but we shouldn't take that risk. Also print out the errno to aid in debug. Signed-off-by:
Matthew Wilcox <matthew.r.wilcox@intel.com> Signed-off-by:
"Theodore Ts'o" <tytso@mit.edu> Reviewed-by:
Jan Kara <jack@suse.cz> Cc: stable@vger.kernel.org (cherry picked from commit 9503c67c) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Bartlomiej Zolnierkiewicz authored
Add missing clk_put() call to ata_host_activate() failure path. Sergei says, "Hm, I have once fixed that (see that *if* (!ret)) but looks like a later commit 477c87e9 (ARM: at91/pata: use gpio_is_valid to check the gpio) broke it again. :-( Would be good if the changelog did mention that..." Cc: Andrew Victor <linux@maxim.org.za> Cc: Nicolas Ferre <nicolas.ferre@atmel.com> Cc: Jean-Christophe Plagniol-Villard <plagnioj@jcrosoft.com> Cc: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by:
Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Signed-off-by:
Tejun Heo <tj@kernel.org> Cc: stable@vger.kernel.org (cherry picked from commit 27aa64b9) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Alec Berg authored
Ensure that querying the IIO buffer scan_mask returns a value of 0 or 1. Currently querying the scan mask has the value returned by test_bit(), which returns either true or false. For some architectures test_bit() may return -1 for true, which will appear to return an error when returning from iio_scan_mask_query(). Additionally, it's important for the sysfs interface to consistently return the same thing when querying the scan_mask. Signed-off-by:
Alec Berg <alecaberg@chromium.org> Cc: stable@vger.kernel.org Signed-off-by:
Jonathan Cameron <jic23@kernel.org> (cherry picked from commit 2076a20f) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Mel Gorman authored
David Vrabel identified a regression when using automatic NUMA balancing under Xen whereby page table entries were getting corrupted due to the use of native PTE operations. Quoting him Xen PV guest page tables require that their entries use machine addresses if the preset bit (_PAGE_PRESENT) is set, and (for successful migration) non-present PTEs must use pseudo-physical addresses. This is because on migration MFNs in present PTEs are translated to PFNs (canonicalised) so they may be translated back to the new MFN in the destination domain (uncanonicalised). pte_mknonnuma(), pmd_mknonnuma(), pte_mknuma() and pmd_mknuma() set and clear the _PAGE_PRESENT bit using pte_set_flags(), pte_clear_flags(), etc. In a Xen PV guest, these functions must translate MFNs to PFNs when clearing _PAGE_PRESENT and translate PFNs to MFNs when setting _PAGE_PRESENT. His suggested fix converted p[te|md]_[set|clear]_flags to using paravirt-friendly ops but this is overkill. He suggested an alternative of using p[te|md]_modify in the NUMA page table operations but this is does more work than necessary and would require looking up a VMA for protections. This patch modifies the NUMA page table operations to use paravirt friendly operations to set/clear the flags of interest. Unfortunately this will take a performance hit when updating the PTEs on CONFIG_PARAVIRT but I do not see a way around it that does not break Xen. Signed-off-by:
Mel Gorman <mgorman@suse.de> Acked-by:
David Vrabel <david.vrabel@citrix.com> Tested-by:
David Vrabel <david.vrabel@citrix.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Peter Anvin <hpa@zytor.com> Cc: Fengguang Wu <fengguang.wu@intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Steven Noonan <steven@uplinklabs.net> Cc: Rik van Riel <riel@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: Cyrill Gorcunov <gorcunov@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> (cherry picked from commit 29c77870) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Mizuma, Masayoshi authored
soft lockup in freeing gigantic hugepage fixed in commit 55f67141 "mm: hugetlb: fix softlockup when a large number of hugepages are freed." can happen in return_unused_surplus_pages(), so let's fix it. Signed-off-by:
Masayoshi Mizuma <m.mizuma@jp.fujitsu.com> Signed-off-by:
Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Michal Hocko <mhocko@suse.cz> Cc: Aneesh Kumar <aneesh.kumar@linux.vnet.ibm.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: <stable@vger.kernel.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> (cherry picked from commit 7848a4bf) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Alex Deucher authored
Avoid a possible segfault. Noticed-by:
Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org (cherry picked from commit 8c79bae6) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Quentin Casasnovas authored
On bo reservation failure, we end up leaking fpriv. v2 (chk): rebased and added missing free on vm failure as well Fixes: 5e386b57 ("drm/radeon: fix missing bo reservation") Cc: stable@vger.kernel.org Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by:
Quentin Casasnovas <quentin.casasnovas@oracle.com> Signed-off-by:
Christian König <christian.koenig@amd.com> (cherry picked from commit 74073c9d) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Michael Ulbricht authored
By specifying NO_UNION_NORMAL the ACM driver does only use the first two USB interfaces (modem data & control). The AT Port, Diagnostic and NMEA interfaces are left to the USB serial driver. Signed-off-by:
Michael Ulbricht <michael.ulbricht@systec-electronic.com> Signed-off-by:
Alexander Stein <alexander.stein@systec-electronic.com> Signed-off-by:
Oliver Neukum <oliver@neukum.org> Cc: stable <stable@vger.kernel.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit 895d240d) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Alan Stern authored
The code in hcd-pci.c that matches up EHCI controllers with their companion UHCI or OHCI controllers assumes that the private drvdata fields don't get set too early. However, it turns out that this field gets set by usb_create_hcd(), before hcd-pci expects it, and this can result in a crash when two controllers are probed in parallel (as can happen when a new controller card is hotplugged). The companions_rwsem lock was supposed to prevent this sort of thing, but usb_create_hcd() is called outside the scope of the rwsem. A simple solution is to check that the root-hub pointer has been initialized as well as the drvdata field. This doesn't happen until usb_add_hcd() is called; that call and the check are both protected by the rwsem. This patch should be applied to stable kernels from 3.10 onward. Signed-off-by:
Alan Stern <stern@rowland.harvard.edu> Reported-by:
Stefani Seibold <stefani@seibold.net> Tested-by:
Stefani Seibold <stefani@seibold.net> CC: <stable@vger.kernel.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit a2ff864b) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Johan Hovold authored
Fix regression introduced by commit 8e493ca1 ("USB: usb_wwan: fix bulk-urb allocation") by making sure to require both bulk-in and out endpoints during port probe. The original option driver (which usb_wwan is based on) was written under the assumption that either endpoint could be missing, but evidently this cannot have been tested properly. Specifically, it would handle opening a device without bulk-in (but would blow up during resume which was implemented later), but not a missing bulk-out in write() (although it is handled in some places such as write_room()). Fortunately (?), the driver also got the test for missing endpoints wrong so the urbs were in fact always allocated, although they would be initialised using the wrong endpoint address (0) and any submission of such an urb would fail. The commit mentioned above fixed the test for missing endpoints but thereby exposed the other bugs which would now generate null-pointer exceptions rather than failed urb submissions. The regression was introduced in v3.7, but the offending commit was also marked for stable. Reported-by:
Rafał Miłecki <zajec5@gmail.com> Cc: stable <stable@vger.kernel.org> Signed-off-by:
Johan Hovold <jhovold@gmail.com> Tested-by:
Rafał Miłecki <zajec5@gmail.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit bd73bd88) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Aaron Sanders authored
Add device ids to pl2303 for the Hewlett-Packard HP POS pole displays: LD960: 03f0:0B39 LCM220: 03f0:3139 LCM960: 03f0:3239 [ Johan: fix indentation and sort PIDs numerically ] Signed-off-by:
Aaron Sanders <aaron.sanders@hp.com> Cc: stable@vger.kernel.org Signed-off-by:
Johan Hovold <jhovold@gmail.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit b16c02fb) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Tristan Bruns authored
Signed-off-by:
Tristan Bruns <tristan@tristanbruns.de> Cc: stable <stable@vger.kernel.org> Signed-off-by:
Johan Hovold <jhovold@gmail.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit 72b30079) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Daniele Palmas authored
option driver, added VID/PID for Telit UE910v2 modem Signed-off-by:
Daniele Palmas <dnlplm@gmail.com> Cc: stable <stable@vger.kernel.org> Signed-off-by:
Johan Hovold <jhovold@gmail.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit d6de486b) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-