- 21 Nov, 2014 40 commits
-
-
Mathias Krause authored
If RFCOMM_DEFER_SETUP is set in the flags, rfcomm_sock_recvmsg() returns early with 0 without updating the possibly set msg_namelen member. This, in turn, leads to a 128 byte kernel stack leak in net/socket.c. Fix this by updating msg_namelen in this case. For all other cases it will be handled in bt_sock_stream_recvmsg(). Cc: Marcel Holtmann <marcel@holtmann.org> Cc: Gustavo Padovan <gustavo@padovan.org> Cc: Johan Hedberg <johan.hedberg@gmail.com> Signed-off-by:
Mathias Krause <minipli@googlemail.com> Signed-off-by:
David S. Miller <davem@davemloft.net> (cherry picked from commit e11e0455) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
James Bottomley authored
USB surprise removal of sr is triggering an oops in scsi_dispatch_command(). What seems to be happening is that USB is hanging on to a queue reference until the last close of the upper device, so the crash is caused by surprise remove of a mounted CD followed by attempted unmount. The problem is that USB doesn't issue its final commands as part of the SCSI teardown path, but on last close when the block queue is long gone. The long term fix is probably to make sr do the teardown in the same way as sd (so remove all the lower bits on ejection, but keep the upper disk alive until last close of user space). However, the current oops can be simply fixed by not allowing any commands to be sent to a dead queue. Cc: stable@kernel.org Signed-off-by:
James Bottomley <JBottomley@Parallels.com> (cherry picked from commit bfe159a5) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Ian Abbott authored
`s626_enc_insn_config()` is incorrectly dereferencing `insn->data` which is a pointer to user memory. It should be dereferencing the separate `data` parameter that points to a copy of the data in kernel memory. Signed-off-by:
Ian Abbott <abbotti@mev.co.uk> Reviewed-by:
H Hartley Sweeten <hsweeten@visionengravers.com> Cc: stable <stable@vger.kernel.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit b655c2c4) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Romain Francoise authored
Before v2.6.38 CONFIG_EXPERT was known as CONFIG_EMBEDDED but the Kconfig entry was not changed to match when upstream commit 628c6246 ("x86, random: Architectural inlines to get random integers with RDRAND") was backported. Signed-off-by:
Romain Francoise <romain@orebokech.com> Signed-off-by:
Willy Tarreau <w@1wt.eu> (cherry picked from commit 119274d6) (cherry picked from commit HEAD) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Hugh Dickins authored
In fuzzing with trinity, lockdep protested "possible irq lock inversion dependency detected" when isolate_lru_page() reenabled interrupts while still holding the supposedly irq-safe tree_lock: invalidate_inode_pages2 invalidate_complete_page2 spin_lock_irq(&mapping->tree_lock) clear_page_mlock isolate_lru_page spin_unlock_irq(&zone->lru_lock) isolate_lru_page() is correct to enable interrupts unconditionally: invalidate_complete_page2() is incorrect to call clear_page_mlock() while holding tree_lock, which is supposed to nest inside lru_lock. Both truncate_complete_page() and invalidate_complete_page() call clear_page_mlock() before taking tree_lock to remove page from radix_tree. I guess invalidate_complete_page2() preferred to test PageDirty (again) under tree_lock before committing to the munlock; but since the page has already been unmapped, its state is already somewhat inconsistent, and no worse if clear_page_mlock() moved up. Reported-by:
Sasha Levin <levinsasha928@gmail.com> Deciphered-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Hugh Dickins <hughd@google.com> Acked-by:
Mel Gorman <mel@csn.ul.ie> Cc: Rik van Riel <riel@redhat.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Michel Lespinasse <walken@google.com> Cc: Ying Han <yinghan@google.com> Cc: <stable@vger.kernel.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> (cherry picked from commit ec4d9f62) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Oleg Nesterov authored
Minor cleanup. ____call_usermodehelper() can simply return, no need to call do_exit() explicitely. Signed-off-by:
Oleg Nesterov <oleg@redhat.com> Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Tejun Heo <tj@kernel.org> Cc: David Rientjes <rientjes@google.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> (cherry picked from commit 5b9bd473) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Kees Cook authored
Fix possible overflow of the buffer used for expanding environment variables when building file list. In the extremely unlikely case of an attacker having control over the environment variables visible to gen_init_cpio, control over the contents of the file gen_init_cpio parses, and gen_init_cpio was built without compiler hardening, the attacker can gain arbitrary execution control via a stack buffer overflow. $ cat usr/crash.list file foo ${BIG}${BIG}${BIG}${BIG}${BIG}${BIG} 0755 0 0 $ BIG=$(perl -e 'print "A" x 4096;') ./usr/gen_init_cpio usr/crash.list *** buffer overflow detected ***: ./usr/gen_init_cpio terminated This also replaces the space-indenting with tabs. Patch based on existing fix extracted from grsecurity. Signed-off-by:
Kees Cook <keescook@chromium.org> Cc: Michal Marek <mmarek@suse.cz> Cc: Brad Spengler <spender@grsecurity.net> Cc: PaX Team <pageexec@freemail.hu> Cc: <stable@vger.kernel.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> (cherry picked from commit 20f1de65) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Ben Hutchings authored
This reverts commit 2af3af56, which was commit 6c4088ac upstream. This broke compilation of the driver in 2.6.32.y as the early_io{remap,unmap}() functions are not defined for ia64. The driver can *only* be built for ia64 (even in current mainline), so a fix for x86_64 is pointless. Signed-off-by:
Ben Hutchings <ben@decadent.org.uk> Signed-off-by:
Willy Tarreau <w@1wt.eu> (cherry picked from commit 01ab25d5) (cherry picked from commit HEAD) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Linus Torvalds authored
Add a new interface, add_device_randomness() for adding data to the random pool that is likely to differ between two devices (or possibly even per boot). This would be things like MAC addresses or serial numbers, or the read-out of the RTC. This does *not* add any actual entropy to the pool, but it initializes the pool to different values for devices that might otherwise be identical and have very little entropy available to them (particularly common in the embedded world). [ Modified by tytso to mix in a timestamp, since there may be some variability caused by the time needed to detect/configure the hardware in question. ] Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
"Theodore Ts'o" <tytso@mit.edu> Cc: stable@vger.kernel.org (cherry picked from commit a2080a67) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
bjschuma@gmail.com authored
This allows distros to remove the line from their modprobe configuration. Signed-off-by:
Bryan Schumaker <bjschuma@netapp.com> Cc: stable@vger.kernel.org Signed-off-by:
Trond Myklebust <Trond.Myklebust@netapp.com> (cherry picked from commit 425e776d) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Thomas Jarosch authored
Some BIOS implementations leave the Intel GPU interrupts enabled, even though no one is handling them (f.e. i915 driver is never loaded). Additionally the interrupt destination is not set up properly and the interrupt ends up -somewhere-. These spurious interrupts are "sticky" and the kernel disables the (shared) interrupt line after 100.000+ generated interrupts. Fix it by disabling the still enabled interrupts. This resolves crashes often seen on monitor unplug. Tested on the following boards: - Intel DH61CR: Affected - Intel DH67BL: Affected - Intel S1200KP server board: Affected - Asus P8H61-M LE: Affected, but system does not crash. Probably the IRQ ends up somewhere unnoticed. According to reports on the net, the Intel DH61WW board is also affected. Many thanks to Jesse Barnes from Intel for helping with the register configuration and to Intel in general for providing public hardware documentation. Signed-off-by:
Thomas Jarosch <thomas.jarosch@intra2net.com> Tested-by:
Charlie Suffin <charlie.suffin@stratus.com> Signed-off-by:
Jesse Barnes <jbarnes@virtuousgeek.org> (cherry picked from commit f67fd55f) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Salman Qazi authored
When a machine boots up, the TSC generally gets reset. However, when kexec is used to boot into a kernel, the TSC value would be carried over from the previous kernel. The computation of cycns_offset in set_cyc2ns_scale is prone to an overflow, if the machine has been up more than 208 days prior to the kexec. The overflow happens when we multiply *scale, even though there is enough room to store the final answer. We fix this issue by decomposing tsc_now into the quotient and remainder of division by CYC2NS_SCALE_FACTOR and then performing the multiplication separately on the two components. Refactor code to share the calculation with the previous fix in __cycles_2_ns(). Signed-off-by:
Salman Qazi <sqazi@google.com> Acked-by:
John Stultz <john.stultz@linaro.org> Acked-by:
Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Paul Turner <pjt@google.com> Cc: john stultz <johnstul@us.ibm.com> Link: http://lkml.kernel.org/r/20120310004027.19291.88460.stgit@dungbeetle.mtv.corp.google.comSigned-off-by:
Ingo Molnar <mingo@elte.hu> (cherry picked from commit 9993bc63) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Sasha Levin authored
'long secs' is passed as divisor to div_s64, which accepts a 32bit divisor. On 64bit machines that value is trimmed back from 8 bytes back to 4, causing a divide by zero when the number is bigger than (1 << 32) - 1 and all 32 lower bits are 0. Use div64_long() instead. Signed-off-by:
Sasha Levin <levinsasha928@gmail.com> Cc: johnstul@us.ibm.com Link: http://lkml.kernel.org/r/1331829374-31543-2-git-send-email-levinsasha928@gmail.com Cc: stable@vger.kernel.org Signed-off-by:
Thomas Gleixner <tglx@linutronix.de> (cherry picked from commit a078c6d0) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Willy Tarreau authored
Initial stable commit : 2215d910 This patch backported into 2.6.32.55 is enabled when CONFIG_AMD_NB is set, but this config option does not exist in 2.6.32, it was called CONFIG_K8_NB, so the fix was never applied. Some other changes were needed to make it work. first, the correct include file name was asm/k8.h and not asm/amd_nb.h, and second, amd_get_mmconfig_range() is needed and was merged by previous patch. Thanks to Jiri Slabi who reported the issue and diagnosed all the dependencies. Signed-off-by:
Willy Tarreau <w@1wt.eu> Cc: Jiri Slaby <jslaby@suse.cz> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Jesse Barnes <jbarnes@virtuousgeek.org> Signed-off-by:
Willy Tarreau <w@1wt.eu> (cherry picked from commit 46e8a56a) (cherry picked from commit HEAD) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Ben Hutchings authored
commit 32974ad4 upstream This just changes Kconfig rather than touching all the other files the original commit did. Patch description from the original commit : | [IA64] Remove COMPAT_IA32 support | | This has been broken since May 2008 when Al Viro killed altroot support. | Since nobody has complained, it would appear that there are no users of | this code (A plausible theory since the main OSVs that support ia64 prefer | to use the IA32-EL software emulation). | | Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by:
Ben Hutchings <ben@decadent.org.uk> Signed-off-by:
Willy Tarreau <w@1wt.eu> (cherry picked from commit d9a25c03) (cherry picked from commit HEAD) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Heiko Carstens authored
For kernels < 3.0 the backport of 048cd4e5 "compat: fix compile breakage on s390" will break compilation... Re-add a single #include <asm/compat.h> in order to fix this. This patch is _not_ necessary for upstream, only for stable kernels which include the "build fix" mentioned above. Reported-by:
Jiri Slaby <jslaby@suse.cz> Signed-off-by:
Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by:
Willy Tarreau <w@1wt.eu> (cherry picked from commit ee116431) (cherry picked from commit HEAD) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Ian Kent authored
commit a32744d4 upstream. When the autofs protocol version 5 packet type was added in commit 5c0a32fc ("autofs4: add new packet type for v5 communications"), it obvously tried quite hard to be word-size agnostic, and uses explicitly sized fields that are all correctly aligned. However, with the final "char name[NAME_MAX+1]" array at the end, the actual size of the structure ends up being not very well defined: because the struct isn't marked 'packed', doing a "sizeof()" on it will align the size of the struct up to the biggest alignment of the members it has. And despite all the members being the same, the alignment of them is different: a "__u64" has 4-byte alignment on x86-32, but native 8-byte alignment on x86-64. And while 'NAME_MAX+1' ends up being a nice round number (256), the name[] array starts out a 4-byte aligned. End result: the "packed" size of the structure is 300 bytes: 4-byte, but not 8-byte aligned. As a result, despite all the fields being in the same place on all architectures, sizeof() will round up that size to 304 bytes on architectures that have 8-byte alignment for u64. Note that this is *not* a problem for 32-bit compat mode on POWER, since there __u64 is 8-byte aligned even in 32-bit mode. But on x86, 32-bit and 64-bit alignment is different for 64-bit entities, and as a result the structure that has exactly the same layout has different sizes. So on x86-64, but no other architecture, we will just subtract 4 from the size of the structure when running in a compat task. That way we will write the properly sized packet that user mode expects. Not pretty. Sadly, this very subtle, and unnecessary, size difference has been encoded in user space that wants to read packets of *exactly* the right size, and will refuse to touch anything else. Reported-and-tested-by:
Thomas Meyer <thomas@m3y3r.de> Signed-off-by:
Ian Kent <raven@themaw.net> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Cc: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit 82e43e2a)
-
OGAWA Hirofumi authored
ratelimit_state initialization of printk_ratelimited() seems broken. This fixes it by using DEFINE_RATELIMIT_STATE() to initialize spinlock properly. Signed-off-by:
OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Cc: Joe Perches <joe@perches.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> (cherry picked from commit d8521fcc) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Yong Zhang authored
When __ratelimit() returns 1 this means that we can go ahead. Signed-off-by:
Yong Zhang <yong.zhang@windriver.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Joe Perches <joe@perches.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> (cherry picked from commit bb1dc0ba) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Jan Kara authored
When we hit EIO while writing LVID, the buffer uptodate bit is cleared. This then results in an anoying warning from mark_buffer_dirty() when we write the buffer again. So just set uptodate flag unconditionally. Reviewed-by:
Namjae Jeon <linkinjeon@gmail.com> Signed-off-by:
Jan Kara <jack@suse.cz> (cherry picked from commit 853a0c25) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Bjørn Mork authored
wdm_in_callback() will also touch this field, so we cannot change it without locking Cc: stable@vger.kernel.org Signed-off-by:
Bjørn Mork <bjorn@mork.no> Acked-by:
Oliver Neukum <oneukum@suse.de> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de> (cherry picked from commit c428b70c) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Herton Ronaldo Krzesinski authored
This reverts commit c8cdf3f9, applied on linux 2.6.32.53 stable release, as it can introduce the following build error while building 2.6.32.y on armel: linux-2.6.32/drivers/mmc/host/mmci.c: In function 'mmci_cmd_irq': linux-2.6.32/drivers/mmc/host/mmci.c:237: error: implicit declaration of function 'dma_inprogress' linux-2.6.32/drivers/mmc/host/mmci.c:238: error: implicit declaration of function 'mmci_dma_data_error' Aparently the commit was wrongly pushed into 2.6.32, since it depends on commit c8ebae37 ("ARM: mmci: add dmaengine-based DMA support"), not present on 2.6.32. Signed-off-by:
Herton Ronaldo Krzesinski <herton.krzesinski@canonical.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit 80375fc4) (cherry picked from commit HEAD) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Tyler Hicks authored
ecryptfs_write() handles the truncation of eCryptfs inodes. It grabs a page, zeroes out the appropriate portions, and then encrypts the page before writing it to the lower filesystem. It was unkillable and due to the lack of sparse file support could result in tying up a large portion of system resources, while encrypting pages of zeros, with no way for the truncate operation to be stopped from userspace. This patch adds the ability for ecryptfs_write() to detect a pending fatal signal and return as gracefully as possible. The intent is to leave the lower file in a useable state, while still allowing a user to break out of the encryption loop. If a pending fatal signal is detected, the eCryptfs inode size is updated to reflect the modified inode size and then -EINTR is returned. Signed-off-by:
Tyler Hicks <tyhicks@canonical.com> Cc: <stable@vger.kernel.org> (cherry picked from commit 5e6f0d76) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Joe Perches authored
Add a printk_ratelimited statement expression macro that uses a per-call ratelimit_state so that multiple subsystems output messages are not suppressed by a global __ratelimit state. [akpm@linux-foundation.org: coding-style fixes] [akpm@linux-foundation.org: s/_rl/_ratelimited/g] Signed-off-by:
Joe Perches <joe@perches.com> Cc: Naohiro Ooiwa <nooiwa@miraclelinux.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Hiroshi Shimamoto <h-shimamoto@ct.jp.nec.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> (cherry picked from commit 8a64f336) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Huajun Li authored
usb: usb-storage doesn't support dynamic id currently, the patch disables the feature to fix an oops Echo vendor and product number of a non usb-storage device to usb-storage driver's new_id, then plug in the device to host and you will find following oops msg, the root cause is usb_stor_probe1() refers invalid id entry if giving a dynamic id, so just disable the feature. [ 3105.018012] general protection fault: 0000 [#1] SMP DEBUG_PAGEALLOC [ 3105.018062] CPU 0 [ 3105.018075] Modules linked in: usb_storage usb_libusual bluetooth dm_crypt binfmt_misc snd_hda_codec_analog snd_hda_intel snd_hda_codec snd_hwdep hp_wmi ppdev sparse_keymap snd_pcm snd_seq_midi snd_rawmidi snd_seq_midi_event snd_seq snd_timer snd_seq_device psmouse snd serio_raw tpm_infineon soundcore i915 snd_page_alloc tpm_tis parport_pc tpm tpm_bios drm_kms_helper drm i2c_algo_bit video lp parport usbhid hid sg sr_mod sd_mod ehci_hcd uhci_hcd usbcore e1000e usb_common floppy [ 3105.018408] [ 3105.018419] Pid: 189, comm: khubd Tainted: G I 3.2.0-rc7+ #29 Hewlett-Packard HP Compaq dc7800p Convertible Minitower/0AACh [ 3105.018481] RIP: 0010:[<ffffffffa045830d>] [<ffffffffa045830d>] usb_stor_probe1+0x2fd/0xc20 [usb_storage] [ 3105.018536] RSP: 0018:ffff880056a3d830 EFLAGS: 00010286 [ 3105.018562] RAX: ffff880065f4e648 RBX: ffff88006bb28000 RCX: 0000000000000000 [ 3105.018597] RDX: ffff88006f23c7b0 RSI: 0000000000000001 RDI: 0000000000000206 [ 3105.018632] RBP: ffff880056a3d900 R08: 0000000000000000 R09: ffff880067365000 [ 3105.018665] R10: 00000000000002ac R11: 0000000000000010 R12: ffff6000b41a7340 [ 3105.018698] R13: ffff880065f4ef60 R14: ffff88006bb28b88 R15: ffff88006f23d270 [ 3105.018733] FS: 0000000000000000(0000) GS:ffff88007a200000(0000) knlGS:0000000000000000 [ 3105.018773] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [ 3105.018801] CR2: 00007fc99c8c4650 CR3: 0000000001e05000 CR4: 00000000000006f0 [ 3105.018835] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 3105.018870] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [ 3105.018906] Process khubd (pid: 189, threadinfo ffff880056a3c000, task ffff88005677a400) [ 3105.018945] Stack: [ 3105.018959] 0000000000000000 0000000000000000 ffff880056a3d8d0 0000000000000002 [ 3105.019011] 0000000000000000 ffff880056a3d918 ffff880000000000 0000000000000002 [ 3105.019058] ffff880056a3d8d0 0000000000000012 ffff880056a3d8d0 0000000000000006 [ 3105.019105] Call Trace: [ 3105.019128] [<ffffffffa0458cd4>] storage_probe+0xa4/0xe0 [usb_storage] [ 3105.019173] [<ffffffffa0097822>] usb_probe_interface+0x172/0x330 [usbcore] [ 3105.019211] [<ffffffff815fda67>] driver_probe_device+0x257/0x3b0 [ 3105.019243] [<ffffffff815fdd43>] __device_attach+0x73/0x90 [ 3105.019272] [<ffffffff815fdcd0>] ? __driver_attach+0x110/0x110 [ 3105.019303] [<ffffffff815fb93c>] bus_for_each_drv+0x9c/0xf0 [ 3105.019334] [<ffffffff815fd6c7>] device_attach+0xf7/0x120 [ 3105.019364] [<ffffffff815fc905>] bus_probe_device+0x45/0x80 [ 3105.019396] [<ffffffff815f98a6>] device_add+0x876/0x990 [ 3105.019434] [<ffffffffa0094e42>] usb_set_configuration+0x822/0x9e0 [usbcore] [ 3105.019479] [<ffffffffa00a3492>] generic_probe+0x62/0xf0 [usbcore] [ 3105.019518] [<ffffffffa0097a46>] usb_probe_device+0x66/0xb0 [usbcore] [ 3105.019555] [<ffffffff815fda67>] driver_probe_device+0x257/0x3b0 [ 3105.019589] [<ffffffff815fdd43>] __device_attach+0x73/0x90 [ 3105.019617] [<ffffffff815fdcd0>] ? __driver_attach+0x110/0x110 [ 3105.019648] [<ffffffff815fb93c>] bus_for_each_drv+0x9c/0xf0 [ 3105.019680] [<ffffffff815fd6c7>] device_attach+0xf7/0x120 [ 3105.019709] [<ffffffff815fc905>] bus_probe_device+0x45/0x80 [ 3105.021040] usb usb6: usb auto-resume [ 3105.021045] usb usb6: wakeup_rh [ 3105.024849] [<ffffffff815f98a6>] device_add+0x876/0x990 [ 3105.025086] [<ffffffffa0088987>] usb_new_device+0x1e7/0x2b0 [usbcore] [ 3105.025086] [<ffffffffa008a4d7>] hub_thread+0xb27/0x1ec0 [usbcore] [ 3105.025086] [<ffffffff810d5200>] ? wake_up_bit+0x50/0x50 [ 3105.025086] [<ffffffffa00899b0>] ? usb_remote_wakeup+0xa0/0xa0 [usbcore] [ 3105.025086] [<ffffffff810d49b8>] kthread+0xd8/0xf0 [ 3105.025086] [<ffffffff81939884>] kernel_thread_helper+0x4/0x10 [ 3105.025086] [<ffffffff8192a8c0>] ? _raw_spin_unlock_irq+0x50/0x80 [ 3105.025086] [<ffffffff8192b1b4>] ? retint_restore_args+0x13/0x13 [ 3105.025086] [<ffffffff810d48e0>] ? __init_kthread_worker+0x80/0x80 [ 3105.025086] [<ffffffff81939880>] ? gs_change+0x13/0x13 [ 3105.025086] Code: 00 48 83 05 cd ad 00 00 01 48 83 05 cd ad 00 00 01 4c 8b ab 30 0c 00 00 48 8b 50 08 48 83 c0 30 48 89 45 a0 4c 89 a3 40 0c 00 00 <41> 0f b6 44 24 10 48 89 55 a8 3c ff 0f 84 b8 04 00 00 48 83 05 [ 3105.025086] RIP [<ffffffffa045830d>] usb_stor_probe1+0x2fd/0xc20 [usb_storage] [ 3105.025086] RSP <ffff880056a3d830> [ 3105.060037] hub 6-0:1.0: hub_resume [ 3105.062616] usb usb5: usb auto-resume [ 3105.064317] ehci_hcd 0000:00:1d.7: resume root hub [ 3105.094809] ---[ end trace a7919e7f17c0a727 ]--- [ 3105.130069] hub 5-0:1.0: hub_resume [ 3105.132131] usb usb4: usb auto-resume [ 3105.132136] usb usb4: wakeup_rh [ 3105.180059] hub 4-0:1.0: hub_resume [ 3106.290052] usb usb6: suspend_rh (auto-stop) [ 3106.290077] usb usb4: suspend_rh (auto-stop) Signed-off-by:
Huajun Li <huajun.li.lee@gmail.com> Cc: stable <stable@vger.kernel.org> Signed-off-by:
Greg Kroah-Hartman <gregkh@suse.de> (cherry picked from commit 1a3a026b) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Mohammed Shafi Shajakhan authored
don't do aggregation related stuff for 'AP mode client power save handling' if aggregation is not enabled in the driver, otherwise it will lead to panic because those data structures won't be never intialized in 'ath_tx_node_init' if aggregation is disabled EIP is at ath_tx_aggr_wakeup+0x37/0x80 [ath9k] EAX: e8c09a20 EBX: f2a304e8 ECX: 00000001 EDX: 00000000 ESI: e8c085e0 EDI: f2a304ac EBP: f40e1ca4 ESP: f40e1c8c DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 Process swapper/1 (pid: 0, ti=f40e0000 task=f408e860 task.ti=f40dc000) Stack: 0001e966 e8c09a20 00000000 f2a304ac e8c085e0 f2a304ac f40e1cb0 f8186741 f8186700 f40e1d2c f922988d f2a304ac 00000202 00000001 c0b4ba43 00000000 0000000f e8eb75c0 e8c085e0 205b0001 34383220 f2a304ac f2a30000 00010020 Call Trace: [<f8186741>] ath9k_sta_notify+0x41/0x50 [ath9k] [<f8186700>] ? ath9k_get_survey+0x110/0x110 [ath9k] [<f922988d>] ieee80211_sta_ps_deliver_wakeup+0x9d/0x350 [mac80211] [<c018dc75>] ? __module_address+0x95/0xb0 [<f92465b3>] ap_sta_ps_end+0x63/0xa0 [mac80211] [<f9246746>] ieee80211_rx_h_sta_process+0x156/0x2b0 [mac80211] [<f9247d1e>] ieee80211_rx_handlers+0xce/0x510 [mac80211] [<c018440b>] ? trace_hardirqs_on+0xb/0x10 [<c056936e>] ? skb_queue_tail+0x3e/0x50 [<f9248271>] ieee80211_prepare_and_rx_handle+0x111/0x750 [mac80211] [<f9248bf9>] ieee80211_rx+0x349/0xb20 [mac80211] [<f9248949>] ? ieee80211_rx+0x99/0xb20 [mac80211] [<f818b0b8>] ath_rx_tasklet+0x818/0x1d00 [ath9k] [<f8187a75>] ? ath9k_tasklet+0x35/0x1c0 [ath9k] [<f8187a75>] ? ath9k_tasklet+0x35/0x1c0 [ath9k] [<f8187b33>] ath9k_tasklet+0xf3/0x1c0 [ath9k] [<c0151b7e>] tasklet_action+0xbe/0x180 Cc: stable@kernel.org Cc: Senthil Balasubramanian <senthilb@qca.qualcomm.com> Cc: Rajkumar Manoharan <rmanohar@qca.qualcomm.com> Reported-by:
Ashwin Mendonca <ashwinloyal@gmail.com> Tested-by:
Ashwin Mendonca <ashwinloyal@gmail.com> Signed-off-by:
Mohammed Shafi Shajakhan <mohammed@qca.qualcomm.com> Signed-off-by:
John W. Linville <linville@tuxdriver.com> (cherry picked from commit b25bfda3) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Johannes Weiner authored
Commit 3812c8c8 ("mm: memcg: do not trap chargers with full callstack on OOM") assumed that only a few places that can trigger a memcg OOM situation do not return VM_FAULT_OOM, like optional page cache readahead. But there are many more and it's impractical to annotate them all. First of all, we don't want to invoke the OOM killer when the failed allocation is gracefully handled, so defer the actual kill to the end of the fault handling as well. This simplifies the code quite a bit for added bonus. Second, since a failed allocation might not be the abrupt end of the fault, the memcg OOM handler needs to be re-entrant until the fault finishes for subsequent allocation attempts. If an allocation is attempted after the task already OOMed, allow it to bypass the limit so that it can quickly finish the fault and invoke the OOM killer. Reported-by:
azurIt <azurit@pobox.sk> Signed-off-by:
Johannes Weiner <hannes@cmpxchg.org> Cc: Michal Hocko <mhocko@suse.cz> Cc: <stable@kernel.org> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> (cherry picked from commit 49426420) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Johannes Weiner authored
The memcg OOM handling is incredibly fragile and can deadlock. When a task fails to charge memory, it invokes the OOM killer and loops right there in the charge code until it succeeds. Comparably, any other task that enters the charge path at this point will go to a waitqueue right then and there and sleep until the OOM situation is resolved. The problem is that these tasks may hold filesystem locks and the mmap_sem; locks that the selected OOM victim may need to exit. For example, in one reported case, the task invoking the OOM killer was about to charge a page cache page during a write(), which holds the i_mutex. The OOM killer selected a task that was just entering truncate() and trying to acquire the i_mutex: OOM invoking task: mem_cgroup_handle_oom+0x241/0x3b0 mem_cgroup_cache_charge+0xbe/0xe0 add_to_page_cache_locked+0x4c/0x140 add_to_page_cache_lru+0x22/0x50 grab_cache_page_write_begin+0x8b/0xe0 ext3_write_begin+0x88/0x270 generic_file_buffered_write+0x116/0x290 __generic_file_aio_write+0x27c/0x480 generic_file_aio_write+0x76/0xf0 # takes ->i_mutex do_sync_write+0xea/0x130 vfs_write+0xf3/0x1f0 sys_write+0x51/0x90 system_call_fastpath+0x18/0x1d OOM kill victim: do_truncate+0x58/0xa0 # takes i_mutex do_last+0x250/0xa30 path_openat+0xd7/0x440 do_filp_open+0x49/0xa0 do_sys_open+0x106/0x240 sys_open+0x20/0x30 system_call_fastpath+0x18/0x1d The OOM handling task will retry the charge indefinitely while the OOM killed task is not releasing any resources. A similar scenario can happen when the kernel OOM killer for a memcg is disabled and a userspace task is in charge of resolving OOM situations. In this case, ALL tasks that enter the OOM path will be made to sleep on the OOM waitqueue and wait for userspace to free resources or increase the group's limit. But a userspace OOM handler is prone to deadlock itself on the locks held by the waiting tasks. For example one of the sleeping tasks may be stuck in a brk() call with the mmap_sem held for writing but the userspace handler, in order to pick an optimal victim, may need to read files from /proc/<pid>, which tries to acquire the same mmap_sem for reading and deadlocks. This patch changes the way tasks behave after detecting a memcg OOM and makes sure nobody loops or sleeps with locks held: 1. When OOMing in a user fault, invoke the OOM killer and restart the fault instead of looping on the charge attempt. This way, the OOM victim can not get stuck on locks the looping task may hold. 2. When OOMing in a user fault but somebody else is handling it (either the kernel OOM killer or a userspace handler), don't go to sleep in the charge context. Instead, remember the OOMing memcg in the task struct and then fully unwind the page fault stack with -ENOMEM. pagefault_out_of_memory() will then call back into the memcg code to check if the -ENOMEM came from the memcg, and then either put the task to sleep on the memcg's OOM waitqueue or just restart the fault. The OOM victim can no longer get stuck on any lock a sleeping task may hold. Debugged by Michal Hocko. Signed-off-by:
Johannes Weiner <hannes@cmpxchg.org> Reported-by:
azurIt <azurit@pobox.sk> Acked-by:
Michal Hocko <mhocko@suse.cz> Cc: David Rientjes <rientjes@google.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> (cherry picked from commit 3812c8c8) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Johannes Weiner authored
commit fb2a6fc5 upstream. The memcg OOM handler open-codes a sleeping lock for OOM serialization (trylock, wait, repeat) because the required locking is so specific to memcg hierarchies. However, it would be nice if this construct would be clearly recognizable and not be as obfuscated as it is right now. Clean up as follows: 1. Remove the return value of mem_cgroup_oom_unlock() 2. Rename mem_cgroup_oom_lock() to mem_cgroup_oom_trylock(). 3. Pull the prepare_to_wait() out of the memcg_oom_lock scope. This makes it more obvious that the task has to be on the waitqueue before attempting to OOM-trylock the hierarchy, to not miss any wakeups before going to sleep. It just didn't matter until now because it was all lumped together into the global memcg_oom_lock spinlock section. 4. Pull the mem_cgroup_oom_notify() out of the memcg_oom_lock scope. It is proctected by the hierarchical OOM-lock. 5. The memcg_oom_lock spinlock is only required to propagate the OOM lock in any given hierarchy atomically. Restrict its scope to mem_cgroup_oom_(trylock|unlock). 6. Do not wake up the waitqueue unconditionally at the end of the function. Only the lockholder has to wake up the next in line after releasing the lock. Note that the lockholder kicks off the OOM-killer, which in turn leads to wakeups from the uncharges of the exiting task. But a contender is not guaranteed to see them if it enters the OOM path after the OOM kills but before the lockholder releases the lock. Thus there has to be an explicit wakeup after releasing the lock. 7. Put the OOM task on the waitqueue before marking the hierarchy as under OOM as that is the point where we start to receive wakeups. No point in listening before being on the waitqueue. 8. Likewise, unmark the hierarchy before finishing the sleep, for symmetry. Signed-off-by:
Johannes Weiner <hannes@cmpxchg.org> Acked-by:
Michal Hocko <mhocko@suse.cz> Cc: David Rientjes <rientjes@google.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: azurIt <azurit@pobox.sk> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit 7a147e0c)
-
Johannes Weiner authored
System calls and kernel faults (uaccess, gup) can handle an out of memory situation gracefully and just return -ENOMEM. Enable the memcg OOM killer only for user faults, where it's really the only option available. Signed-off-by:
Johannes Weiner <hannes@cmpxchg.org> Acked-by:
Michal Hocko <mhocko@suse.cz> Cc: David Rientjes <rientjes@google.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: azurIt <azurit@pobox.sk> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> (cherry picked from commit 519e5247) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Johannes Weiner authored
The x86 fault handler bails in the middle of error handling when the task has a fatal signal pending. For a subsequent patch this is a problem in OOM situations because it relies on pagefault_out_of_memory() being called even when the task has been killed, to perform proper per-task OOM state unwinding. Shortcutting the fault like this is a rather minor optimization that saves a few instructions in rare cases. Just remove it for user-triggered faults. Use the opportunity to split the fault retry handling from actual fault errors and add locking documentation that reads suprisingly similar to ARM's. Signed-off-by:
Johannes Weiner <hannes@cmpxchg.org> Reviewed-by:
Michal Hocko <mhocko@suse.cz> Acked-by:
KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: David Rientjes <rientjes@google.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: azurIt <azurit@pobox.sk> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> (cherry picked from commit 3a13c4d7) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Johannes Weiner authored
Kernel faults are expected to handle OOM conditions gracefully (gup, uaccess etc.), so they should never invoke the OOM killer. Reserve this for faults triggered in user context when it is the only option. Most architectures already do this, fix up the remaining few. Signed-off-by:
Johannes Weiner <hannes@cmpxchg.org> Reviewed-by:
Michal Hocko <mhocko@suse.cz> Acked-by:
KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: David Rientjes <rientjes@google.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: azurIt <azurit@pobox.sk> Signed-off-by:
Andrew Morton <akpm@linux-foundation.org> Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> (cherry picked from commit 87134102) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Tomas Henzl authored
commit 2cc5bfaf upstream. When the driver calls scsi_done and after that frees it's internal preallocated memory it can happen that a new job is enqueud before the memory is freed. The allocation fails and the message "cmd_alloc returned NULL" is shown. Patch below fixes it by moving cmd->scsi_done after cmd_free. Signed-off-by:
Tomas Henzl <thenzl@redhat.com> Acked-by:
Stephen M. Cameron <scameron@beardog.cce.hp.com> Signed-off-by:
James Bottomley <JBottomley@Parallels.com> Cc: Masoud Sharbiani <msharbiani@twitter.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> (cherry picked from commit 2e4ce498) (cherry picked from commit HEAD) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Ben Dooks authored
Currently BUG() uses .word or .hword to create the necessary illegal instructions. However if we are building BE8 then these get swapped by the linker into different illegal instructions in the text. This means that the BUG() macro does not get trapped properly. Change to using <asm/opcodes.h> to provide the necessary ARM instruction building as we cannot rely on gcc/gas having the `.inst` instructions which where added to try and resolve this issue (reported by Dave Martin <Dave.Martin@arm.com>). Signed-off-by:
Ben Dooks <ben.dooks@codethink.co.uk> Reviewed-by:
Dave Martin <Dave.Martin@arm.com> (cherry picked from commit 63328070) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Yoichi Yuasa authored
[ 1.904000] BUG: scheduling while atomic: swapper/1/0x00000002 [ 1.908000] Modules linked in: [ 1.916000] CPU: 0 PID: 1 Comm: swapper Not tainted 3.12.0-rc2-lemote-los.git-5318619-dirty #1 [ 1.920000] Stack : 0000000031aac000 ffffffff810d0000 0000000000000052 ffffffff802730a4 0000000000000000 0000000000000001 ffffffff810cdf90 ffffffff810d0000 ffffffff8068b968 ffffffff806f5537 ffffffff810cdf90 980000009f0782e8 0000000000000001 ffffffff80720000 ffffffff806b0000 980000009f078000 980000009f290000 ffffffff805f312c 980000009f05b5d8 ffffffff80233518 980000009f05b5e8 ffffffff80274b7c 980000009f078000 ffffffff8068b968 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 980000009f05b520 0000000000000000 ffffffff805f2f6c 0000000000000000 ffffffff80700000 ffffffff80700000 ffffffff806fc758 ffffffff80700000 ffffffff8020be98 ffffffff806fceb0 ffffffff805f2f6c ... [ 2.028000] Call Trace: [ 2.032000] [<ffffffff8020be98>] show_stack+0x80/0x98 [ 2.036000] [<ffffffff805f2f6c>] __schedule_bug+0x44/0x6c [ 2.040000] [<ffffffff805fac58>] __schedule+0x518/0x5b0 [ 2.044000] [<ffffffff805f8a58>] schedule_timeout+0x128/0x1f0 [ 2.048000] [<ffffffff80240314>] msleep+0x3c/0x60 [ 2.052000] [<ffffffff80495400>] do_probe+0x238/0x3a8 [ 2.056000] [<ffffffff804958b0>] ide_probe_port+0x340/0x7e8 [ 2.060000] [<ffffffff80496028>] ide_host_register+0x2d0/0x7a8 [ 2.064000] [<ffffffff8049c65c>] ide_pci_init_two+0x4e4/0x790 [ 2.068000] [<ffffffff8049f9b8>] amd74xx_probe+0x148/0x2c8 [ 2.072000] [<ffffffff803f571c>] pci_device_probe+0xc4/0x130 [ 2.076000] [<ffffffff80478f60>] driver_probe_device+0x98/0x270 [ 2.080000] [<ffffffff80479298>] __driver_attach+0xe0/0xe8 [ 2.084000] [<ffffffff80476ab0>] bus_for_each_dev+0x78/0xe0 [ 2.088000] [<ffffffff80478468>] bus_add_driver+0x230/0x310 [ 2.092000] [<ffffffff80479b44>] driver_register+0x84/0x158 [ 2.096000] [<ffffffff80200504>] do_one_initcall+0x104/0x160 Signed-off-by:
Yoichi Yuasa <yuasa@linux-mips.org> Reported-by:
Aaro Koskinen <aaro.koskinen@iki.fi> Tested-by:
Aaro Koskinen <aaro.koskinen@iki.fi> Cc: linux-mips@linux-mips.org Cc: Linux Kernel Mailing List <linux-kernel@vger.kernel.org> Patchwork: https://patchwork.linux-mips.org/patch/5941/Signed-off-by:
Ralf Baechle <ralf@linux-mips.org> (cherry picked from commit 5596b0b2) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Jiri Pirko authored
br_stp_rcv() is reached by non-rx_handler path. That means there is no guarantee that dev is bridge port and therefore simple NULL check of ->rx_handler_data is not enough. There is need to check if dev is really bridge port and since only rcu read lock is held here, do it by checking ->rx_handler pointer. Note that synchronize_net() in netdev_rx_handler_unregister() ensures this approach as valid. Introduced originally by: commit f350a0a8 "bridge: use rx_handler_data pointer to store net_bridge_port pointer" Fixed but not in the best way by: commit b5ed54e9 "bridge: fix RCU races with bridge port" Reintroduced by: commit 716ec052 "bridge: fix NULL pointer deref of br_port_get_rcu" Please apply to stable trees as well. Thanks. RH bugzilla reference: https://bugzilla.redhat.com/show_bug.cgi?id=1025770Reported-by:
Laine Stump <laine@redhat.com> Debugged-by:
Michael S. Tsirkin <mst@redhat.com> Signed-off-by:
Michael S. Tsirkin <mst@redhat.com> Signed-off-by:
Jiri Pirko <jiri@resnulli.us> Acked-by:
Michael S. Tsirkin <mst@redhat.com> Acked-by:
Eric Dumazet <edumazet@google.com> Signed-off-by:
David S. Miller <davem@davemloft.net> (cherry picked from commit 859828c0) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Steffen Klassert authored
Otherwise it gets overwritten by register_netdev(). Signed-off-by:
Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by:
David S. Miller <davem@davemloft.net> (cherry picked from commit f03eb128) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Felipe Balbi authored
According to our Gadget Framework API documentation, ->set_halt() *must* return -EAGAIN if we have pending transfers (on either direction) or FIFO isn't empty (on TX endpoints). Fix this bug so that the mass storage gadget can be used without stall=0 parameter. This patch should be backported to all kernels since v3.2. Cc: <stable@vger.kernel.org> # v3.2+ Suggested-by:
Alan Stern <stern@rowland.harvard.edu> Signed-off-by:
Felipe Balbi <balbi@ti.com> (cherry picked from commit 7a608559) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Jan Kara authored
The WARN_ON checking whether i_mutex is held in pagecache_isize_extended() was wrong because some filesystems (e.g. XFS) use different locks for serialization of truncates / writes. So just remove the check. Signed-off-by:
Jan Kara <jack@suse.cz> Reviewed-by:
Dave Chinner <dchinner@redhat.com> Signed-off-by:
Dave Chinner <david@fromorbit.com> (cherry picked from commit f55fefd1) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-
Jan Kara authored
->page_mkwrite() is used by filesystems to allocate blocks under a page which is becoming writeably mmapped in some process' address space. This allows a filesystem to return a page fault if there is not enough space available, user exceeds quota or similar problem happens, rather than silently discarding data later when writepage is called. However VFS fails to call ->page_mkwrite() in all the cases where filesystems need it when blocksize < pagesize. For example when blocksize = 1024, pagesize = 4096 the following is problematic: ftruncate(fd, 0); pwrite(fd, buf, 1024, 0); map = mmap(NULL, 1024, PROT_WRITE, MAP_SHARED, fd, 0); map[0] = 'a'; ----> page_mkwrite() for index 0 is called ftruncate(fd, 10000); /* or even pwrite(fd, buf, 1, 10000) */ mremap(map, 1024, 10000, 0); map[4095] = 'a'; ----> no page_mkwrite() called At the moment ->page_mkwrite() is called, filesystem can allocate only one block for the page because i_size == 1024. Otherwise it would create blocks beyond i_size which is generally undesirable. But later at ->writepage() time, we also need to store data at offset 4095 but we don't have block allocated for it. This patch introduces a helper function filesystems can use to have ->page_mkwrite() called at all the necessary moments. Signed-off-by:
Jan Kara <jack@suse.cz> Signed-off-by:
Theodore Ts'o <tytso@mit.edu> Cc: stable@vger.kernel.org (cherry picked from commit 90a80202) Signed-off-by:
Sasha Levin <sasha.levin@oracle.com>
-