- 10 Oct, 2014 40 commits
-
-
Takashi Iwai authored
ALC269 & co have many vendor-specific setups with COEF verbs. However, some verbs seem specific to some codec versions and they result in the codec stalling. Typically, such a case can be avoided by checking the return value from reading a COEF. If the return value is -1, it implies that the COEF is invalid, thus it shouldn't be written. This patch adds the invalid COEF checks in appropriate places accessing ALC269 and its variants. The patch actually fixes the resume problem on Acer AO725 laptop. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=52181Tested-by:
Francesco Muzio <muziofg@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by:
Takashi Iwai <tiwai@suse.de> (cherry picked from commit f3ee07d8) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Jiri Kosina authored
device_index is a char type and the size of paired_dj_deivces is 7 elements, therefore proper bounds checking has to be applied to device_index before it is used. We are currently performing the bounds checking in logi_dj_recv_add_djhid_device(), which is too late, as malicious device could send REPORT_TYPE_NOTIF_DEVICE_UNPAIRED early enough and trigger the problem in one of the report forwarding functions called from logi_dj_raw_event(). Fix this by performing the check at the earliest possible ocasion in logi_dj_raw_event(). Cc: stable@vger.kernel.org Reported-by:
Ben Hawkes <hawkes@google.com> Reviewed-by:
Benjamin Tissoires <benjamin.tissoires@redhat.com> Signed-off-by:
Jiri Kosina <jkosina@suse.cz> (cherry picked from commit ad3e14d7) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
David S. Miller authored
pte_ERROR() is not used anywhere, delete it. For pgd_ERROR() and pmd_ERROR(), output something similar to x86, giving the address of the pgd/pmd as well as it's value. Also provide the caller, since these macros are invoked from pgd_clear_bad() and pmd_clear_bad() which provides little context as to what high level operation was occuring when the BAD state was detected. Signed-off-by:
David S. Miller <davem@davemloft.net> (cherry picked from commit fe866433) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
David S. Miller authored
Signed-off-by:
David S. Miller <davem@davemloft.net> (cherry picked from commit 0eef331a) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
David S. Miller authored
In commit b2d43834 ("sparc64: Make PAGE_OFFSET variable."), the MAX_PHYS_ADDRESS_BITS value was increased (to 47). This constant reference to '41UL' was missed. Signed-off-by:
David S. Miller <davem@davemloft.net> (cherry picked from commit ee73887e) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
David S. Miller authored
When _PAGE_SPECIAL and _PAGE_PMD_HUGE were added to the mask, the comment was not updated. Signed-off-by:
David S. Miller <davem@davemloft.net> (cherry picked from commit c2e4e676) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Tobias Brunner authored
The SPI check introduced in ea9884b3 was intended for IPComp SAs but actually prevented AH SAs from getting installed (depending on the SPI). Fixes: ea9884b3 ("xfrm: check user specified spi for IPComp") Cc: Fan Du <fan.du@windriver.com> Signed-off-by:
Tobias Brunner <tobias@strongswan.org> Signed-off-by:
Steffen Klassert <steffen.klassert@secunet.com> (cherry picked from commit a0e5ef53) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Andrew Hunter authored
timeval_to_jiffies tried to round a timeval up to an integral number of jiffies, but the logic for doing so was incorrect: intervals corresponding to exactly N jiffies would become N+1. This manifested itself particularly repeatedly stopping/starting an itimer: setitimer(ITIMER_PROF, &val, NULL); setitimer(ITIMER_PROF, NULL, &val); would add a full tick to val, _even if it was exactly representable in terms of jiffies_ (say, the result of a previous rounding.) Doing this repeatedly would cause unbounded growth in val. So fix the math. Here's what was wrong with the conversion: we essentially computed (eliding seconds) jiffies = usec * (NSEC_PER_USEC/TICK_NSEC) by using scaling arithmetic, which took the best approximation of NSEC_PER_USEC/TICK_NSEC with denominator of 2^USEC_JIFFIE_SC = x/(2^USEC_JIFFIE_SC), and computed: jiffies = (usec * x) >> USEC_JIFFIE_SC and rounded this calculation up in the intermediate form (since we can't necessarily exactly represent TICK_NSEC in usec.) But the scaling arithmetic is a (very slight) *over*approximation of the true value; that is, instead of dividing by (1 usec/ 1 jiffie), we effectively divided by (1 usec/1 jiffie)-epsilon (rounding down). This would normally be fine, but we want to round timeouts up, and we did so by adding 2^USEC_JIFFIE_SC - 1 before the shift; this would be fine if our division was exact, but dividing this by the slightly smaller factor was equivalent to adding just _over_ 1 to the final result (instead of just _under_ 1, as desired.) In particular, with HZ=1000, we consistently computed that 10000 usec was 11 jiffies; the same was true for any exact multiple of TICK_NSEC. We could possibly still round in the intermediate form, adding something less than 2^USEC_JIFFIE_SC - 1, but easier still is to convert usec->nsec, round in nanoseconds, and then convert using time*spec*_to_jiffies. This adds one constant multiplication, and is not observably slower in microbenchmarks on recent x86 hardware. Tested: the following program: int main() { struct itimerval zero = {{0, 0}, {0, 0}}; /* Initially set to 10 ms. */ struct itimerval initial = zero; initial.it_interval.tv_usec = 10000; setitimer(ITIMER_PROF, &initial, NULL); /* Save and restore several times. */ for (size_t i = 0; i < 10; ++i) { struct itimerval prev; setitimer(ITIMER_PROF, &zero, &prev); /* on old kernels, this goes up by TICK_USEC every iteration */ printf("previous value: %ld %ld %ld %ld\n", prev.it_interval.tv_sec, prev.it_interval.tv_usec, prev.it_value.tv_sec, prev.it_value.tv_usec); setitimer(ITIMER_PROF, &prev, NULL); } return 0; } Cc: stable@vger.kernel.org Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Paul Turner <pjt@google.com> Cc: Richard Cochran <richardcochran@gmail.com> Cc: Prarit Bhargava <prarit@redhat.com> Reviewed-by:
Paul Turner <pjt@google.com> Reported-by:
Aaron Jacobs <jacobsa@google.com> Signed-off-by:
Andrew Hunter <ahh@google.com> [jstultz: Tweaked to apply to 3.17-rc] Signed-off-by:
John Stultz <john.stultz@linaro.org> (cherry picked from commit d78c9300) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Mel Gorman authored
This patch reverts 1ba6e0b5 ("mm: numa: split_huge_page: transfer the NUMA type from the pmd to the pte"). If a huge page is being split due a protection change and the tail will be in a PROT_NONE vma then NUMA hinting PTEs are temporarily created in the protected VMA. VM_RW|VM_PROTNONE |-----------------| ^ split here In the specific case above, it should get fixed up by change_pte_range() but there is a window of opportunity for weirdness to happen. Similarly, if a huge page is shrunk and split during a protection update but before pmd_numa is cleared then a pte_numa can be left behind. Instead of adding complexity trying to deal with the case, this patch will not mark PTEs NUMA when splitting a huge page. NUMA hinting faults will not be triggered which is marginal in comparison to the complexity in dealing with the corner cases during THP split. Cc: stable@vger.kernel.org Signed-off-by:
Mel Gorman <mgorman@suse.de> Acked-by:
Rik van Riel <riel@redhat.com> Acked-by:
Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> (cherry picked from commit abc40bd2) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Waiman Long authored
In __split_huge_page_map(), the check for page_mapcount(page) is invariant within the for loop. Because of the fact that the macro is implemented using atomic_read(), the redundant check cannot be optimized away by the compiler leading to unnecessary read to the page structure. This patch moves the invariant bug check out of the loop so that it will be done only once. On a 3.16-rc1 based kernel, the execution time of a microbenchmark that broke up 1000 transparent huge pages using munmap() had an execution time of 38,245us and 38,548us with and without the patch respectively. The performance gain is about 1%. Signed-off-by:
Waiman Long <Waiman.Long@hp.com> Acked-by:
Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Rik van Riel <riel@redhat.com> Cc: Scott J Norton <scott.norton@hp.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> (cherry picked from commit f8303c25) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Steven Rostedt (Red Hat) authored
Commit 651e22f2 "ring-buffer: Always reset iterator to reader page" fixed one bug but in the process caused another one. The reset is to update the header page, but that fix also changed the way the cached reads were updated. The cache reads are used to test if an iterator needs to be updated or not. A ring buffer iterator, when created, disables writes to the ring buffer but does not stop other readers or consuming reads from happening. Although all readers are synchronized via a lock, they are only synchronized when in the ring buffer functions. Those functions may be called by any number of readers. The iterator continues down when its not interrupted by a consuming reader. If a consuming read occurs, the iterator starts from the beginning of the buffer. The way the iterator sees that a consuming read has happened since its last read is by checking the reader "cache". The cache holds the last counts of the read and the reader page itself. Commit 651e22f2 changed what was saved by the cache_read when the rb_iter_reset() occurred, making the iterator never match the cache. Then if the iterator calls rb_iter_reset(), it will go into an infinite loop by checking if the cache doesn't match, doing the reset and retrying, just to see that the cache still doesn't match! Which should never happen as the reset is suppose to set the cache to the current value and there's locks that keep a consuming reader from having access to the data. Fixes: 651e22f2 "ring-buffer: Always reset iterator to reader page" Signed-off-by:
Steven Rostedt <rostedt@goodmis.org> (cherry picked from commit 24607f11) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Peter Zijlstra authored
Oleg noticed that a cleanup by Sylvain actually uncovered a bug; by calling perf_event_free_task() when failing sched_fork() we will not yet have done the memset() on ->perf_event_ctxp[] and will therefore try and 'free' the inherited contexts, which are still in use by the parent process. This is bad.. Suggested-by:
Oleg Nesterov <oleg@redhat.com> Reported-by:
Oleg Nesterov <oleg@redhat.com> Reported-by:
Sylvain 'ythier' Hitier <sylvain.hitier@gmail.com> Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> (cherry picked from commit 6c72e350) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Jan Kara authored
We did not implement any bound on number of indirect ICBs we follow when loading inode. Thus corrupted medium could cause kernel to go into an infinite loop, possibly causing a stack overflow. Fix the possible stack overflow by removing recursion from __udf_read_inode() and limit number of indirect ICBs we follow to avoid infinite loops. Signed-off-by:
Jan Kara <jack@suse.cz> (cherry picked from commit c03aa9f6) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Julian Anastasov authored
commit fc604767 ("ipvs: changes for local real server") from 2.6.37 introduced DNAT support to local real server but the IPv6 LOCAL_OUT handler ip_vs_local_reply6() is registered incorrectly as IPv4 hook causing any outgoing IPv4 traffic to be dropped depending on the IP header values. Chris tracked down the problem to CONFIG_IP_VS_IPV6=y Bug report: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1349768Reported-by:
Chris J Arges <chris.j.arges@canonical.com> Tested-by:
Chris J Arges <chris.j.arges@canonical.com> Signed-off-by:
Julian Anastasov <ja@ssi.bg> Signed-off-by:
Simon Horman <horms@verge.net.au> (cherry picked from commit eb90b0c7) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Alex Gartrell authored
Previously, only the four high bits of the tclass were maintained in the ipv6 case. This matches the behavior of ipv4, though whether or not we should reflect ECN bits may be up for debate. Signed-off-by:
Alex Gartrell <agartrell@fb.com> Acked-by:
Julian Anastasov <ja@ssi.bg> Signed-off-by:
Simon Horman <horms@verge.net.au> (cherry picked from commit 76f084bc) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
NeilBrown authored
r1_bio->start_next_window is not initialised in the READ case, so allow_barrier may incorrectly decrement conf->current_window_requests which can cause raise_barrier() to block forever. Fixes: 79ef3a8a cc: stable@vger.kernel.org (v3.13+) Reported-by:
Brassow Jonathan <jbrassow@redhat.com> Signed-off-by:
NeilBrown <neilb@suse.de> (cherry picked from commit f0cc9a05) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
NeilBrown authored
If a devices is being recovered it is not InSync and is not Faulty. If a read error is experienced on that device, fix_read_error() will be called, but it ignores non-InSync devices. So it will neither fix the error nor fail the device. It is incorrect that fix_read_error() ignores non-InSync devices. It should only ignore Faulty devices. So fix it. This became a bug when we allowed reading from a device that was being recovered. It is suitable for any subsequent -stable kernel. Fixes: da8840a7 Cc: stable@vger.kernel.org (v3.5+) Reported-by:
Alexander Lyakas <alex.bolshoy@gmail.com> Tested-by:
Alexander Lyakas <alex.bolshoy@gmail.com> Signed-off-by:
NeilBrown <neilb@suse.de> (cherry picked from commit b8cb6b4c) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
NeilBrown authored
Both normal IO and resync IO can be retried with reschedule_retry() and so be counted into ->nr_queued, but only normal IO gets counted in ->nr_pending. Before the recent improvement to RAID1 resync there could only possibly have been one or the other on the queue. When handling a read failure it could only be normal IO. So when handle_read_error() called freeze_array() the fact that freeze_array only compares ->nr_queued against ->nr_pending was safe. But now that these two types can interleave, we can have both normal and resync IO requests queued, so we need to count them both in nr_pending. This error can lead to freeze_array() hanging if there is a read error, so it is suitable for -stable. Fixes: 79ef3a8a cc: stable@vger.kernel.org (v3.13+) Reported-by:
Brassow Jonathan <jbrassow@redhat.com> Signed-off-by:
NeilBrown <neilb@suse.de> (cherry picked from commit 34e97f17) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
NeilBrown authored
commit c6d119cf upstream. commit 79ef3a8a made it possible for reads to happen concurrently with resync. This means that we need to be more careful where read_balancing is allowed during resync - we can no longer be sure that any resync that has already started will definitely finish. So keep read_balancing to before recovery_cp, which is conservative but safe. This bug makes it possible to read from a device that doesn't have up-to-date data, so it can cause data corruption. So it is suitable for any kernel since 3.11. Fixes: 79ef3a8aSigned-off-by:
NeilBrown <neilb@suse.de> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit dd777df4) (cherry picked from commit HEAD) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Hans Verkuil authored
commit 6a03dc92 upstream. This was caused by an uninitialized setup.config field. Based on a suggestion from Devin Heitmueller. Signed-off-by:
Hans Verkuil <hans.verkuil@cisco.com> Thanks-to: Devin Heitmueller <dheitmueller@kernellabs.com> Reported-by:
Scott Robinson <scott.robinson55@gmail.com> Tested-by:
Hans Verkuil <hans.verkuil@cisco.com> Signed-off-by:
Mauro Carvalho Chehab <m.chehab@samsung.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit 94146fe3) (cherry picked from commit HEAD) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Antti Palosaari authored
commit 9dc0f3fe upstream. IT9135 RF tuner clock is coming from demodulator. We need enable it early in demod init, before any tuner I/O. Currently it is enabled by tuner driver itself, but it is too late and performance will be reduced as some registers are not updated correctly. Clock is disabled automatically when demod is put onto sleep. Cc: Bimow Chen <Bimow.Chen@ite.com.tw> Signed-off-by:
Antti Palosaari <crope@iki.fi> Signed-off-by:
Mauro Carvalho Chehab <m.chehab@samsung.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit 994e79d2) (cherry picked from commit HEAD) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Malcolm Priestley authored
commit a04646c0 upstream. add the following IDs USB_PID_PCTV_78E (0x025a) for PCTV 78e USB_PID_PCTV_79E (0x0262) for PCTV 79e For these it9135 devices. Signed-off-by:
Malcolm Priestley <tvboxspy@gmail.com> Cc: Antti Palosaari <crope@iki.fi> Signed-off-by:
Antti Palosaari <crope@iki.fi> Signed-off-by:
Mauro Carvalho Chehab <m.chehab@samsung.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit 5e80de30) (cherry picked from commit HEAD) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Anton Altaparmakov authored
On 32-bit architectures, the legacy buffer_head functions are not always handling the sector number with the proper 64-bit types, and will thus fail on 4TB+ disks. Any code that uses __getblk() (and thus bread(), breadahead(), sb_bread(), sb_breadahead(), sb_getblk()), and calls it using a 64-bit block on a 32-bit arch (where "long" is 32-bit) causes an inifinite loop in __getblk_slow() with an infinite stream of errors logged to dmesg like this: __find_get_block_slow() failed. block=6740375944, b_blocknr=2445408648 b_state=0x00000020, b_size=512 device sda1 blocksize: 512 Note how in hex block is 0x191C1F988 and b_blocknr is 0x91C1F988 i.e. the top 32-bits are missing (in this case the 0x1 at the top). This is because grow_dev_page() is broken and has a 32-bit overflow due to shifting the page index value (a pgoff_t - which is just 32 bits on 32-bit architectures) left-shifted as the block number. But the top bits to get lost as the pgoff_t is not type cast to sector_t / 64-bit before the shift. This patch fixes this issue by type casting "index" to sector_t before doing the left shift. Note this is not a theoretical bug but has been seen in the field on a 4TiB hard drive with logical sector size 512 bytes. This patch has been verified to fix the infinite loop problem on 3.17-rc5 kernel using a 4TB disk image mounted using "-o loop". Without this patch doing a "find /nt" where /nt is an NTFS volume causes the inifinite loop 100% reproducibly whilst with the patch it works fine as expected. Signed-off-by:
Anton Altaparmakov <aia21@cantab.net> Cc: stable@vger.kernel.org Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> (cherry picked from commit f2d5a944) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Alex Deucher authored
Use the new vga_switcheroo_fini_domain_pm_ops function to unregister the pm ops. Based on a patch from: Pali Rohár <pali.rohar@gmail.com> bug: https://bugzilla.kernel.org/show_bug.cgi?id=84431Reviewed-by:
Ben Skeggs <bskeggs@redhat.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by:
Pali Rohár <pali.rohar@gmail.com> Cc: Ben Skeggs <bskeggs@redhat.com> (cherry picked from commit 2e97140d) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Alex Deucher authored
Use the new vga_switcheroo_fini_domain_pm_ops function to unregister the pm ops. Based on a patch from: Pali Rohár <pali.rohar@gmail.com> bug: https://bugzilla.kernel.org/show_bug.cgi?id=84431Reviewed-by:
Ben Skeggs <bskeggs@redhat.com> Signed-off-by:
Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Cc: Ben Skeggs <bskeggs@redhat.com> (cherry picked from commit 53beaa01) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Zhiqiang Zhang authored
ref-cycles event is specially to Intel core, but can still used in arm architecture with the wrong return value with 3.10 stable. this patch fix the bug and make it return NOT SUPPORTED distinctly. In upstream this bug has been fixed by other way, which changes more than one file and more than 1000 lines. the primary commit is 6b7658ec. besides we can not simply cherry-pick. Signed-off-by:
Zhiqiang Zhang <zhangzhiqiang.zhang@huawei.com> Cc: Mark Rutland <mark.rutland@arm.com Cc: Will Deacon <will.deacon@arm.com> Cc: Christopher Covington <cov@codeaurora.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit e7a0374e) (cherry picked from commit HEAD) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Cong Wang authored
We saw a kernel soft lockup in perf_remove_from_context(), it looks like the `perf` process, when exiting, could not go out of the retry loop. Meanwhile, the target process was forking a child. So either the target process should execute the smp function call to deactive the event (if it was running) or it should do a context switch which deactives the event. It seems we optimize out a context switch in perf_event_context_sched_out(), and what's more important, we still test an obsolete task pointer when retrying, so no one actually would deactive that event in this situation. Fix it directly by reloading the task pointer in perf_remove_from_context(). This should cure the above soft lockup. Signed-off-by:
Cong Wang <cwang@twopensource.com> Signed-off-by:
Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by:
Peter Zijlstra <peterz@infradead.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: <stable@vger.kernel.org> Link: http://lkml.kernel.org/r/1409696840-843-1-git-send-email-xiyou.wangcong@gmail.comSigned-off-by:
Ingo Molnar <mingo@kernel.org> (cherry picked from commit 3577af70) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Matan Barak authored
When marsheling a user path to the kernel struct ib_sa_path, need to zero smac, dmac and set the vlan id to the "no vlan" value. Fixes: dd5f03be ("IB/core: Ethernet L2 attributes in verbs/cm structures") Reported-by:
Aleksey Senin <alekseys@mellanox.com> Signed-off-by:
Matan Barak <matanb@mellanox.com> Signed-off-by:
Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by:
Roland Dreier <roland@purestorage.com> (cherry picked from commit a59c5850) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Richard Larocque authored
Locks the k_itimer's it_lock member when handling the alarm timer's expiry callback. The regular posix timers defined in posix-timers.c have this lock held during timout processing because their callbacks are routed through posix_timer_fn(). The alarm timers follow a different path, so they ought to grab the lock somewhere else. Cc: stable@vger.kernel.org Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@kernel.org> Cc: Richard Cochran <richardcochran@gmail.com> Cc: Prarit Bhargava <prarit@redhat.com> Cc: Sharvil Nanavati <sharvil@google.com> Signed-off-by:
Richard Larocque <rlarocque@google.com> Signed-off-by:
John Stultz <john.stultz@linaro.org> (cherry picked from commit 474e941b) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Richard Larocque authored
Avoids sending a signal to alarm timers created with sigev_notify set to SIGEV_NONE by checking for that special case in the timeout callback. The regular posix timers avoid sending signals to SIGEV_NONE timers by not scheduling any callbacks for them in the first place. Although it would be possible to do something similar for alarm timers, it's simpler to handle this as a special case in the timeout. Prior to this patch, the alarm timer would ignore the sigev_notify value and try to deliver signals to the process anyway. Even worse, the sanity check for the value of sigev_signo is skipped when SIGEV_NONE was specified, so the signal number could be bogus. If sigev_signo was an unitialized value (as it often would be if SIGEV_NONE is used), then it's hard to predict which signal will be sent. Cc: stable@vger.kernel.org Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@kernel.org> Cc: Richard Cochran <richardcochran@gmail.com> Cc: Prarit Bhargava <prarit@redhat.com> Cc: Sharvil Nanavati <sharvil@google.com> Signed-off-by:
Richard Larocque <rlarocque@google.com> Signed-off-by:
John Stultz <john.stultz@linaro.org> (cherry picked from commit 265b81d2) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Richard Larocque authored
Returns the time remaining for an alarm timer, rather than the time at which it is scheduled to expire. If the timer has already expired or it is not currently scheduled, the it_value's members are set to zero. This new behavior matches that of the other posix-timers and the POSIX specifications. This is a change in user-visible behavior, and may break existing applications. Hopefully, few users rely on the old incorrect behavior. Cc: stable@vger.kernel.org Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@kernel.org> Cc: Richard Cochran <richardcochran@gmail.com> Cc: Prarit Bhargava <prarit@redhat.com> Cc: Sharvil Nanavati <sharvil@google.com> Signed-off-by:
Richard Larocque <rlarocque@google.com> [jstultz: minor style tweak] Signed-off-by:
John Stultz <john.stultz@linaro.org> (cherry picked from commit e86fea76) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
John David Anglin authored
In spite of what the GCC manual says, the -mfast-indirect-calls has never been supported in the 64-bit parisc compiler. Indirect calls have always been done using function descriptors irrespective of the -mfast-indirect-calls option. Recently, it was noticed that a function descriptor was always requested when the -mfast-indirect-calls option was specified. This caused problems when the option was used in application code and doesn't make any sense because the whole point of the option is to avoid using a function descriptor for indirect calls. Fixing this broke 64-bit kernel builds. I will fix GCC but for now we need the attached change. This results in the same kernel code as before. Signed-off-by:
John David Anglin <dave.anglin@bell.net> Cc: stable@vger.kernel.org # v3.0+ Signed-off-by:
Helge Deller <deller@gmx.de> (cherry picked from commit d26a7730) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Guy Martin authored
The current LWS cas only works correctly for 32bit. The new LWS allows for CAS operations of variable size. Signed-off-by:
Guy Martin <gmsoft@tuxicoman.be> Cc: <stable@vger.kernel.org> # 3.13+ Signed-off-by:
Helge Deller <deller@gmx.de> (cherry picked from commit 89206491) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Richard Genoud authored
In set_termios(), interrupts where not disabled if UART_ENABLE_MS() was false. Tested on at91sam9g35. Signed-off-by:
Richard Genoud <richard.genoud@gmail.com> Cc: stable <stable@vger.kernel.org> # >= 3.16 Reviewed-by:
Peter Hurley <peter@hurleysoftware.com> Acked-by:
Nicolas Ferre <nicolas.ferre@atmel.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit 35b675b9) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Michael Ellerman authored
Similar to the previous commit which described why we need to add a barrier to arch_spin_is_locked(), we have a similar problem with spin_unlock_wait(). We need a barrier on entry to ensure any spinlock we have previously taken is visibly locked prior to the load of lock->slock. It's also not clear if spin_unlock_wait() is intended to have ACQUIRE semantics. For now be conservative and add a barrier on exit to give it ACQUIRE semantics. Signed-off-by:
Michael Ellerman <mpe@ellerman.id.au> Signed-off-by:
Benjamin Herrenschmidt <benh@kernel.crashing.org> (cherry picked from commit 78e05b14) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Anton Blanchard authored
ABIv2 kernels are failing to backtrace through the kernel. An example: 39.30% readseek2_proce [kernel.kallsyms] [k] find_get_entry | --- find_get_entry __GI___libc_read The problem is in valid_next_sp() where we check that the new stack pointer is at least STACK_FRAME_OVERHEAD below the previous one. ABIv1 has a minimum stack frame size of 112 bytes consisting of 48 bytes and 64 bytes of parameter save area. ABIv2 changes that to 32 bytes with no paramter save area. STACK_FRAME_OVERHEAD is in theory the minimum stack frame size, but we over 240 uses of it, some of which assume that it includes space for the parameter area. We need to work through all our stack defines and rationalise them but let's fix perf now by creating STACK_FRAME_MIN_SIZE and using in valid_next_sp(). This fixes the issue: 30.64% readseek2_proce [kernel.kallsyms] [k] find_get_entry | --- find_get_entry pagecache_get_page generic_file_read_iter new_sync_read vfs_read sys_read syscall_exit __GI___libc_read Cc: stable@vger.kernel.org # 3.16+ Reported-by:
Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by:
Anton Blanchard <anton@samba.org> (cherry picked from commit 85101af1) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Wanpeng Li authored
The following bug can be triggered by hot adding and removing a large number of xen domain0's vcpus repeatedly: BUG: unable to handle kernel NULL pointer dereference at 0000000000000004 IP: [..] find_busiest_group PGD 5a9d5067 PUD 13067 PMD 0 Oops: 0000 [#3] SMP [...] Call Trace: load_balance ? _raw_spin_unlock_irqrestore idle_balance __schedule schedule schedule_timeout ? lock_timer_base schedule_timeout_uninterruptible msleep lock_device_hotplug_sysfs online_store dev_attr_store sysfs_write_file vfs_write SyS_write system_call_fastpath Last level cache shared mask is built during CPU up and the build_sched_domain() routine takes advantage of it to setup the sched domain CPU topology. However, llc_shared_mask is not released during CPU disable, which leads to an invalid sched domainCPU topology. This patch fix it by releasing the llc_shared_mask correctly during CPU disable. Yasuaki also reported that this can happen on real hardware: https://lkml.org/lkml/2014/7/22/1018 His case is here: == Here is an example on my system. My system has 4 sockets and each socket has 15 cores and HT is enabled. In this case, each core of sockes is numbered as follows: | CPU# Socket#0 | 0-14 , 60-74 Socket#1 | 15-29, 75-89 Socket#2 | 30-44, 90-104 Socket#3 | 45-59, 105-119 Then llc_shared_mask of CPU#30 has 0x3fff80000001fffc0000000. It means that last level cache of Socket#2 is shared with CPU#30-44 and 90-104. When hot-removing socket#2 and #3, each core of sockets is numbered as follows: | CPU# Socket#0 | 0-14 , 60-74 Socket#1 | 15-29, 75-89 But llc_shared_mask is not cleared. So llc_shared_mask of CPU#30 remains having 0x3fff80000001fffc0000000. After that, when hot-adding socket#2 and #3, each core of sockets is numbered as follows: | CPU# Socket#0 | 0-14 , 60-74 Socket#1 | 15-29, 75-89 Socket#2 | 30-59 Socket#3 | 90-119 Then llc_shared_mask of CPU#30 becomes 0x3fff8000fffffffc0000000. It means that last level cache of Socket#2 is shared with CPU#30-59 and 90-104. So the mask has the wrong value. Signed-off-by:
Wanpeng Li <wanpeng.li@linux.intel.com> Tested-by:
Linn Crosetto <linn@hp.com> Reviewed-by:
Borislav Petkov <bp@suse.de> Reviewed-by:
Toshi Kani <toshi.kani@hp.com> Reviewed-by:
Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com> Cc: <stable@vger.kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Prarit Bhargava <prarit@redhat.com> Cc: Steven Rostedt <srostedt@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1411547885-48165-1-git-send-email-wanpeng.li@linux.intel.comSigned-off-by:
Ingo Molnar <mingo@kernel.org> (cherry picked from commit 03bd4e1f) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Joseph Qi authored
There is a deadlock case which reported by Guozhonghua: https://oss.oracle.com/pipermail/ocfs2-devel/2014-September/010079.html This case is caused by &res->spinlock and &dlm->master_lock misordering in different threads. It was introduced by commit 8d400b81 ("ocfs2/dlm: Clean up refmap helpers"). Since lockres is new, it doesn't not require the &res->spinlock. So remove it. Fixes: 8d400b81 ("ocfs2/dlm: Clean up refmap helpers") Signed-off-by:
Joseph Qi <joseph.qi@huawei.com> Reviewed-by:
joyce.xue <xuejiufei@huawei.com> Reported-by:
Guozhonghua <guozhonghua@h3c.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Mark Fasheh <mfasheh@suse.com> Cc: <stable@vger.kernel.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> (cherry picked from commit 5760a97c) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Andreas Rohner authored
This bug leads to reproducible silent data loss, despite the use of msync(), sync() and a clean unmount of the file system. It is easily reproducible with the following script: ----------------[BEGIN SCRIPT]-------------------- mkfs.nilfs2 -f /dev/sdb mount /dev/sdb /mnt dd if=/dev/zero bs=1M count=30 of=/mnt/testfile umount /mnt mount /dev/sdb /mnt CHECKSUM_BEFORE="$(md5sum /mnt/testfile)" /root/mmaptest/mmaptest /mnt/testfile 30 10 5 sync CHECKSUM_AFTER="$(md5sum /mnt/testfile)" umount /mnt mount /dev/sdb /mnt CHECKSUM_AFTER_REMOUNT="$(md5sum /mnt/testfile)" umount /mnt echo "BEFORE MMAP:\t$CHECKSUM_BEFORE" echo "AFTER MMAP:\t$CHECKSUM_AFTER" echo "AFTER REMOUNT:\t$CHECKSUM_AFTER_REMOUNT" ----------------[END SCRIPT]-------------------- The mmaptest tool looks something like this (very simplified, with error checking removed): ----------------[BEGIN mmaptest]-------------------- data = mmap(NULL, file_size - file_offset, PROT_READ | PROT_WRITE, MAP_SHARED, fd, file_offset); for (i = 0; i < write_count; ++i) { memcpy(data + i * 4096, buf, sizeof(buf)); msync(data, file_size - file_offset, MS_SYNC)) } ----------------[END mmaptest]-------------------- The output of the script looks something like this: BEFORE MMAP: 281ed1d5ae50e8419f9b978aab16de83 /mnt/testfile AFTER MMAP: 6604a1c31f10780331a6850371b3a313 /mnt/testfile AFTER REMOUNT: 281ed1d5ae50e8419f9b978aab16de83 /mnt/testfile So it is clear, that the changes done using mmap() do not survive a remount. This can be reproduced a 100% of the time. The problem was introduced in commit 136e8770 ("nilfs2: fix issue of nilfs_set_page_dirty() for page at EOF boundary"). If the page was read with mpage_readpage() or mpage_readpages() for example, then it has no buffers attached to it. In that case page_has_buffers(page) in nilfs_set_page_dirty() will be false. Therefore nilfs_set_file_dirty() is never called and the pages are never collected and never written to disk. This patch fixes the problem by also calling nilfs_set_file_dirty() if the page has no buffers attached to it. [akpm@linux-foundation.org: s/PAGE_SHIFT/PAGE_CACHE_SHIFT/] Signed-off-by:
Andreas Rohner <andreas.rohner@gmx.net> Tested-by:
Andreas Rohner <andreas.rohner@gmx.net> Signed-off-by:
Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> Cc: <stable@vger.kernel.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> (cherry picked from commit 56d7acc7) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Andrey Vagin authored
Currently we handle only ENOSPC. In case of other errors the file_handle variable isn't filled properly and we will show a part of stack. Signed-off-by:
Andrey Vagin <avagin@openvz.org> Acked-by:
Cyrill Gorcunov <gorcunov@openvz.org> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: <stable@vger.kernel.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> (cherry picked from commit 7e882481) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-