- 28 Feb, 2013 40 commits
-
-
Tejun Heo authored
idr is silly in quite a few ways, one of which is how it's supposed to be destroyed - idr_destroy() doesn't release IDs and doesn't even whine if the idr isn't empty. If the caller forgets idr_remove_all(), it simply leaks memory. Even ida gets this wrong and leaks memory on destruction. There is absoltely no reason not to call idr_remove_all() from idr_destroy(). Nobody is abusing idr_destroy() for shrinking free layer buffer and continues to use idr after idr_destroy(), so it's safe to do remove_all from destroy. In the whole kernel, there is only one place where idr_remove_all() is legitimiately used without following idr_destroy() while there are quite a few places where the caller forgets either idr_remove_all() or idr_destroy() leaking memory. This patch makes idr_destroy() call idr_destroy_all() and updates the function description accordingly. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Tejun Heo authored
The iteration logic of idr_get_next() is borrowed mostly verbatim from idr_for_each(). It walks down the tree looking for the slot matching the current ID. If the matching slot is not found, the ID is incremented by the distance of single slot at the given level and repeats. The implementation assumes that during the whole iteration id is aligned to the layer boundaries of the level closest to the leaf, which is true for all iterations starting from zero or an existing element and thus is fine for idr_for_each(). However, idr_get_next() may be given any point and if the starting id hits in the middle of a non-existent layer, increment to the next layer will end up skipping the same offset into it. For example, an IDR with IDs filled between [64, 127] would look like the following. [ 0 64 ... ] /----/ | | | NULL [ 64 ... 127 ] If idr_get_next() is called with 63 as the starting point, it will try to follow down the pointer from 0. As it is NULL, it will then try to proceed to the next slot in the same level by adding the slot distance at that level which is 64 - making the next try 127. It goes around the loop and finds and returns 127 skipping [64, 126]. Note that this bug also triggers in idr_for_each_entry() loop which deletes during iteration as deletions can make layers go away leaving the iteration with unaligned ID into missing layers. Fix it by ensuring proceeding to the next slot doesn't carry over the unaligned offset - ie. use round_up(id + 1, slot_distance) instead of id += slot_distance. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: David Teigland <teigland@redhat.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Tomas Henzl authored
While adding and removing a lot of disks disks and partitions this sometimes shows up: WARNING: at fs/sysfs/dir.c:512 sysfs_add_one+0xc9/0x130() (Not tainted) Hardware name: sysfs: cannot create duplicate filename '/dev/block/259:751' Modules linked in: raid1 autofs4 bnx2fc cnic uio fcoe libfcoe libfc 8021q scsi_transport_fc scsi_tgt garp stp llc sunrpc cpufreq_ondemand powernow_k8 freq_table mperf ipv6 dm_mirror dm_region_hash dm_log power_meter microcode dcdbas serio_raw amd64_edac_mod edac_core edac_mce_amd i2c_piix4 i2c_core k10temp bnx2 sg ixgbe dca mdio ext4 mbcache jbd2 dm_round_robin sr_mod cdrom sd_mod crc_t10dif ata_generic pata_acpi pata_atiixp ahci mptsas mptscsih mptbase scsi_transport_sas dm_multipath dm_mod [last unloaded: scsi_wait_scan] Pid: 44103, comm: async/16 Not tainted 2.6.32-195.el6.x86_64 #1 Call Trace: warn_slowpath_common+0x87/0xc0 warn_slowpath_fmt+0x46/0x50 sysfs_add_one+0xc9/0x130 sysfs_do_create_link+0x12b/0x170 sysfs_create_link+0x13/0x20 device_add+0x317/0x650 idr_get_new+0x13/0x50 add_partition+0x21c/0x390 rescan_partitions+0x32b/0x470 sd_open+0x81/0x1f0 [sd_mod] __blkdev_get+0x1b6/0x3c0 blkdev_get+0x10/0x20 register_disk+0x155/0x170 add_disk+0xa6/0x160 sd_probe_async+0x13b/0x210 [sd_mod] add_wait_queue+0x46/0x60 async_thread+0x102/0x250 default_wake_function+0x0/0x20 async_thread+0x0/0x250 kthread+0x96/0xa0 child_rip+0xa/0x20 kthread+0x0/0xa0 child_rip+0x0/0x20 This most likely happens because dev_t is freed while the number is still used and idr_get_new() is not protected on every use. The fix adds a mutex where it wasn't before and moves the dev_t free function so it is called after device del. Signed-off-by: Tomas Henzl <thenzl@redhat.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Zhang Yanfei authored
Though there is no error if we free a NULL pointer, I think we could avoid this behaviour. Change the code a little in kimage_crash_alloc() could avoid this kind of unnecessary free. Signed-off-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Sasha Levin <sasha.levin@oracle.com> Reviewed-by: Simon Horman <horms@verge.net.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Zhang Yanfei authored
If kimage_normal_alloc() fails to alloc pages for image->swap_page, it should call kimage_free_page_list() to free allocated pages in image->control_pages list before it frees image. Signed-off-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Sasha Levin <sasha.levin@oracle.com> Reviewed-by: Simon Horman <horms@verge.net.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Sasha Levin authored
If kimage_normal_alloc() fails to initialize an allocated kimage, it will free the image but would still set 'rimage', as a result kexec_load will try to free it again. This would explode as part of the freeing process is accessing internal members which point to uninitialized memory. Signed-off-by: Sasha Levin <sasha.levin@oracle.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Zhang Yanfei <zhangyanfei@cn.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Mitsuhiro Tanino authored
This patch exports a PG_hwpoison into vmcoreinfo when CONFIG_MEMORY_FAILURE is defined. "makedumpfile" needs to read information of memory, such as 'mem_section', 'zone', 'pageflags' from vmcore. We introduce a function into "makedumpfile" to exclude hwpoison page from vmcore dump. In order to introduce this function, PG_hwpoison flag have to export into vmcoreinfo. Signed-off-by: Mitsuhiro Tanino <mitsuhiro.tanino.gm@hitachi.com> Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Mitsuhiro Tanino <mitsuhiro.tanino.gm@hitachi.com> Cc: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Zhang Yanfei authored
hole_end has been checked to make sure it is <= crash_res.end in the while condition check, so the if condition check is duplicate. Signed-off-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com> Reviewed-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Atsushi Kumagai authored
tAdd adds the values related to buddy system to vmcoreinfo data so that makedumpfile (dump filtering command) can filter out all free pages with the new logic. It's faster than the current logic because it can distinguish free page by analyzing page structure at the same time as filtering for other unnecessary pages (e.g. anonymous page). OTOH, the current logic has to trace free_list to distinguish free pages while analyzing page structure to filter out other unnecessary pages. The new logic uses the fact that buddy page is marked by _mapcount == PAGE_BUDDY_MAPCOUNT_VALUE. But, _mapcount shares its memory with other fields for SLAB/SLUB when PG_slab is set, so we need to check if PG_slab is set or not before looking up _mapcount value. And we can get the order of buddy system from private field. To sum it up, the values below are required for this logic. Required values: - OFFSET(page._mapcount) - OFFSET(page.private) - NUMBER(PG_slab) - NUMBER(PAGE_BUDDY_MAPCOUNT_VALUE) Changelog from v1 to v2: 1. remove SIZE(pageflags) The new logic was changed after I sent v1 patch. Accordingly, SIZE(pageflags) has been unnecessary for makedumpfile. What's makedumpfile: makedumpfile creates a small dumpfile by excluding unnecessary pages for the analysis. To distinguish unnecessary pages, makedumpfile gets the vmcoreinfo data which has the minimum debugging information only for dump filtering. Signed-off-by: Atsushi Kumagai <kumagai-atsushi@mxc.nes.nec.co.jp> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Acked-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Alan Cox authored
If new_nsproxy is set we will always call switch_task_namespaces and then set new_nsproxy back to NULL so the reassignment and fall through check are redundant Signed-off-by: Alan Cox <alan@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Andrew Morton authored
[akpm@linux-foundation.org: checkpatch fixes] Cc: Cyrill Gorcunov <gorcunov@openvz.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Cyrill Gorcunov authored
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Mandeep Singh Baines authored
Prevents hung_task detector from panicing the machine. This is also needed to prevent this wait from blocking suspend. (It doesnt' currently block suspend but it would once the next patch in this series is applied.) [yongjun_wei@trendmicro.com.cn: kernel/exit.c: remove duplicated include] Signed-off-by: Mandeep Singh Baines <msb@chromium.org> Cc: Ben Chan <benchan@chromium.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Tejun Heo <tj@kernel.org> Cc: Rafael J. Wysocki <rjw@sisk.pl> Cc: Ingo Molnar <mingo@redhat.com> Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Mandeep Singh Baines authored
We shouldn't try_to_freeze if locks are held. Holding a lock can cause a deadlock if the lock is later acquired in the suspend or hibernate path (e.g. by dpm). Holding a lock can also cause a deadlock in the case of cgroup_freezer if a lock is held inside a frozen cgroup that is later acquired by a process outside that group. [akpm@linux-foundation.org: export debug_check_no_locks_held] Signed-off-by: Mandeep Singh Baines <msb@chromium.org> Cc: Ben Chan <benchan@chromium.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Tejun Heo <tj@kernel.org> Cc: Rafael J. Wysocki <rjw@sisk.pl> Cc: Ingo Molnar <mingo@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Zhang Yanfei authored
In read_vmcore() two `if' tests are duplicated. Change the position of them could reduce the duplication. This change does not affect the behaviour of the function. [akpm@linux-foundation.org: avoid `if (foo = bar)' thing, use min_t()] [akpm@linux-foundation.org: s/max_t/min_t/] Signed-off-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Andrew Morton authored
- use pr_foo() throughout - remove a couple of duplicated KERN_WARNINGs, via WARN(KERN_WARNING "...") - nuke a few warnings which I've never seen happen, ever. Cc: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Kees Cook authored
The existing SUID_DUMP_* defines duplicate the newer SUID_DUMPABLE_* defines introduced in 54b50199 ("coredump: warn about unsafe suid_dumpable / core_pattern combo"). Remove the new ones, and use the prior values instead. Signed-off-by: Kees Cook <keescook@chromium.org> Reported-by: Chen Gang <gang.chen@asianux.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Alan Cox <alan@linux.intel.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Doug Ledford <dledford@redhat.com> Cc: Serge Hallyn <serge.hallyn@canonical.com> Cc: James Morris <james.l.morris@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Valdis Kletnieks authored
Several printk's were missing KERN_INFO and KERN_CONT flags. In addition, a printk that was outside a #if/#endif should have been inside, which would result in stray blank line on non-x86 boxes. Signed-off-by: Valdis Kletnieks <valdis.kletnieks@vt.edu> Cc: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Andrey Vagin authored
The idea is simple. We need to get the siginfo for each signal on checkpointing dump, and then return it back on restore. The first problem is that the kernel doesn't report complete siginfos to userspace. In a signal handler the kernel strips SI_CODE from siginfo. When a siginfo is received from signalfd, it has a different format with fixed sizes of fields. The interface of signalfd was extended. If a signalfd is created with the flag SFD_RAW, it returns siginfo in a raw format. rt_sigqueueinfo looks suitable for restoring signals, but it can't send siginfo with a positive si_code, because these codes are reserved for the kernel. In the real world each person has right to do anything with himself, so I think a process should able to send any siginfo to itself. This patch: The kernel prevents sending of siginfo with positive si_code, because these codes are reserved for kernel. I think we can allow a task to send such a siginfo to itself. This operation should not be dangerous. This functionality is required for restoring signals in checkpoint/restart. Signed-off-by: Andrey Vagin <avagin@openvz.org> Cc: Serge Hallyn <serge.hallyn@canonical.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: Pavel Emelyanov <xemul@parallels.com> Cc: Cyrill Gorcunov <gorcunov@openvz.org> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Reviewed-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Warren Turkal authored
Signed-off-by: Warren Turkal <wt@ooyala.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Shuah Khan authored
Signed-off-by: Shuah Khan <shuah.khan@hp.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Oleksij Rempel authored
There is no documented methods to mark FAT as dirty. Unofficially MS started to use reserved Byte in boot sector for this purpose, at least since Win 2000. With Win 7 user is warned if fs is dirty and asked to clean it. Different versions of Win, handle it in different ways, but always have same meaning: - Win 2000 and XP, set it on write operations and remove it after operation was finnished - Win 7, set dirty flag on first write and remove it on umount. We will do it as follows: - set dirty flag on mount. If fs was initially dirty, warn user, remember it and do not do any changes to boot sector. - clean it on umount. If fs was initially dirty, leave it dirty. - do not do any thing if fs mounted read-only. - TODO: leave fs dirty if we found some error after mount. Signed-off-by: Oleksij Rempel <bug-track@fisher-privat.net> Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Oleksij Rempel authored
Later we will need "state" field to check if volume was cleanly unmounted. Signed-off-by: Oleksij Rempel <bug-track@fisher-privat.net> Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Vyacheslav Dubeyko authored
The fsck_hfs (under MacOS X) complains about unzeroed unused b-tree nodes after deletion of folders' tree under Linux. SYMPTOMS: Running Disk Utiltiy's "Verify Disk" on "test" gives the following: Verifying volume “Test” Checking file systemChecking Journaled HFS Plus volume. Checking extents overflow file. Checking catalog file. Unused node is not erased (node = 3111) Checking multi-linked files. Checking catalog hierarchy. Checking extended attributes file. Checking volume bitmap. Checking volume information. The volume Test was found corrupt and needs to be repaired. Error: This disk needs to be repaired. Click Repair Disk. REPRODUCING PATH: 1. Prepare HFS+ (non-case sensitive) partition (for example, 5GB) under MacOS X. 2. Copy linux kernel source tree (for example, 3.7-rc6 version) on this partition under MacOS X. 3. Then switch to Linux and mount this prepared partition. 4. Execute `sudo rm -r` under prepared directory with linux kernel source tree. 5. Unmount and boot back into OS X. 6. Open up Disk Utility and verify partition. REPRODUCIBILITY: 100% FIX: It is added code of node clearing in hfs_bnode_put() method for the case when node has flag HFS_BNODE_DELETED. Signed-off-by: Vyacheslav Dubeyko <slava@dubeyko.com> Reported-by: Kyle Laracey <kalaracey@gmail.com> Acked-by: Hin-Tak Leung <htl10@users.sourceforge.net> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christoph Hellwig <hch@lst.de> Cc: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Vyacheslav Dubeyko authored
Add support of manipulation by attributes file. Signed-off-by: Vyacheslav Dubeyko <slava@dubeyko.com> Reported-by: Hin-Tak Leung <htl10@users.sourceforge.net> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christoph Hellwig <hch@lst.de> Cc: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Vyacheslav Dubeyko authored
Rework functionality of getting, setting and deleting of extended attributes. Signed-off-by: Vyacheslav Dubeyko <slava@dubeyko.com> Reported-by: Hin-Tak Leung <htl10@users.sourceforge.net> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christoph Hellwig <hch@lst.de> Cc: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Vyacheslav Dubeyko authored
Add functionality of manipulating by records in attributes tree. Signed-off-by: Vyacheslav Dubeyko <slava@dubeyko.com> Reported-by: Hin-Tak Leung <htl10@users.sourceforge.net> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christoph Hellwig <hch@lst.de> Cc: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Vyacheslav Dubeyko authored
Add all necessary on-disk layout declarations related to attributes file. Signed-off-by: Vyacheslav Dubeyko <slava@dubeyko.com> Reported-by: Hin-Tak Leung <htl10@users.sourceforge.net> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christoph Hellwig <hch@lst.de> Cc: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Vyacheslav Dubeyko authored
hfsplus: reworked support of extended attributes. Current mainline implementation of hfsplus file system driver treats as extended attributes only two fields (fdType and fdCreator) of user_info field in file description record (struct hfsplus_cat_file). It is possible to get or set only these two fields as extended attributes. But HFS+ treats as com.apple.FinderInfo extended attribute an union of user_info and finder_info fields as for file (struct hfsplus_cat_file) as for folder (struct hfsplus_cat_folder). Moreover, current mainline implementation of hfsplus file system driver doesn't support special metadata file - attributes tree. Mac OS X 10.4 and later support extended attributes by making use of the HFS+ filesystem Attributes file B*-tree feature which allows for named forks. Mac OS X supports only inline extended attributes, limiting their size to 3802 bytes. Any regular file may have a list of extended attributes. HFS+ supports an arbitrary number of named forks. Each attribute is denoted by a name and the associated data. The name is a null-terminated Unicode string. It is possible to list, to get, to set, and to remove extended attributes from files or directories. It exists some peculiarity during getting of extended attributes list by means of getfattr utility. The getfattr utility expects prefix "user." before any extended attribute's name. So, it ignores any names that don't contained such prefix. Such behavior of getfattr utility results in unexpected empty output of extended attributes list even in the case when file (or folder) contains extended attributes. It needs to use empty string as regular expression pattern for names matching (getfattr --match=""). For support of extended attributes in HFS+: 1. It was added necessary on-disk layout declarations related to Attributes tree into hfsplus_raw.h file. 2. It was added attributes.c file with implementation of functionality of manipulation by records in Attributes tree. 3. It was reworked hfsplus_listxattr, hfsplus_getxattr, hfsplus_setxattr functions in ioctl.c. Moreover, it was added hfsplus_removexattr method. This patch: Add osx.* prefix for handling namespace of Mac OS X extended attributes. [akpm@linux-foundation.org: checkpatch fixes] Signed-off-by: Vyacheslav Dubeyko <slava@dubeyko.com> Reported-by: Hin-Tak Leung <htl10@users.sourceforge.net> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christoph Hellwig <hch@lst.de> Cc: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Imre Deak authored
For better code reuse use the newly added page iterator to iterate through the pages. The offset, length within the page is still calculated by the mapping iterator as well as the actual mapping. Idea from Tejun Heo. Signed-off-by: Imre Deak <imre.deak@intel.com> Cc: Maxim Levitsky <maximlevitsky@gmail.com> Cc: Tejun Heo <tj@kernel.org> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: James Hogan <james.hogan@imgtec.com> Cc: Stephen Warren <swarren@wwwdotorg.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Imre Deak authored
Add an iterator to walk through a scatter list a page at a time starting at a specific page offset. As opposed to the mapping iterator this is meant to be small, performing well even in simple loops like collecting all pages on the scatterlist into an array or setting up an iommu table based on the pages' DMA address. Signed-off-by: Imre Deak <imre.deak@intel.com> Cc: Maxim Levitsky <maximlevitsky@gmail.com> Cc: Tejun Heo <tj@kernel.org> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Tested-by: Stephen Warren <swarren@wwwdotorg.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Stephen Warren authored
The intent is to ensure that all Tegra-related patches are sent to the linux-tegra@ mailing list, so people can keep up-to-date on all misc driver changes. Doing this with a keyword is far simpler and more compact than listing all Tegra-related drivers, even if wildcards were used. Words such as integrate or integrator are common. Ensure the character right before "tegra" isn't a-z (case-insensitive), to make sure the keyword doesn't match those. The only files that the keyword doesn't match are the NVEC driver. Add the linux-tegra mailing list to the NVEC entry to solve this. Signed-off-by: Stephen Warren <swarren@nvidia.com> Cc: Joe Perches <joe@perches.com> Cc: Julian Andres Klode <jak@jak-linux.org> Cc: Marc Dietrich <marvin24@gmx.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Stephen Warren authored
Allow K: entries in MAINTAINERS to match directly against filenames; either those extracted from patch +++ or --- lines, or those specified on the command-line using the -f option. This potentially allows fewer lines in a MAINTAINERS entry, if all the relevant files are scattered throughout the whole kernel tree, yet contain some common keyword. An example would be using an ARM SoC name as the keyword to catch all related drivers. I don't think setting exact_pattern_match_hash would be appropriate here; at least for intended Tegra use case, this feature is to ensure that all Tegra-related driver changes get Cc'd to the Tegra mailing list. Setting exact_pattern_match_hash would prevent git history parsing for e.g. S-o-b tags, which still seems like it would be useful. Hence, this flag isn't set. Signed-off-by: Stephen Warren <swarren@nvidia.com> Acked-by: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Oleg Nesterov authored
__orderly_poweroff() does argv_free() if call_usermodehelper_fns() returns -ENOMEM. As Lucas pointed out, this can be wrong if -ENOMEM was not triggered by the failing call_usermodehelper_setup(), in this case both __orderly_poweroff() and argv_cleanup() can do kfree(). Kill argv_cleanup() and change __orderly_poweroff() to call argv_free() unconditionally like do_coredump() does. This info->cleanup() is not needed (and wrong) since 6c0c0d4d "fix bug in orderly_poweroff() which did the UMH_NO_WAIT => UMH_WAIT_EXEC change, we can rely on the fact that CLONE_VFORK can't return until do_execve() succeeds/fails. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Reported-by: Lucas De Marchi <lucas.demarchi@profusion.mobi> Cc: David Howells <dhowells@redhat.com> Cc: James Morris <james.l.morris@oracle.com> Cc: hongfeng <hongfeng@marvell.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Michel Lespinasse authored
Update the frv arch_get_unmapped_area function to make use of vm_unmapped_area() instead of implementing a brute force search. Signed-off-by: Michel Lespinasse <walken@google.com> Acked-by: Rik van Riel <riel@redhat.com> Acked-by: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Xiaowei.Hu authored
ocfs2_block_group_alloc_discontig() disables chain relink by setting ac->ac_allow_chain_relink = 0 because it grabs clusters from multiple cluster groups. It doesn't keep the credits for all chain relink,but ocfs2_claim_suballoc_bits overrides this in this call trace: ocfs2_block_group_claim_bits()->ocfs2_claim_clusters()-> __ocfs2_claim_clusters()->ocfs2_claim_suballoc_bits() ocfs2_claim_suballoc_bits set ac->ac_allow_chain_relink = 1; then call ocfs2_search_chain() one time and disable it again, and then we run out of credits. Fix is to allow relink by default and disable it in ocfs2_block_group_alloc_discontig. Without this patch, End-users will run into a crash due to run out of credits, backtrace like this: RIP: 0010:[<ffffffffa0808b14>] [<ffffffffa0808b14>] jbd2_journal_dirty_metadata+0x164/0x170 [jbd2] RSP: 0018:ffff8801b919b5b8 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff88022139ddc0 RCX: ffff880159f652d0 RDX: ffff880178aa3000 RSI: ffff880159f652d0 RDI: ffff880087f09bf8 RBP: ffff8801b919b5e8 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000001e00 R11: 00000000000150b0 R12: ffff880159f652d0 R13: ffff8801a0cae908 R14: ffff880087f09bf8 R15: ffff88018d177800 FS: 00007fc9b0b6b6e0(0000) GS:ffff88022fd40000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 000000000040819c CR3: 0000000184017000 CR4: 00000000000006e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process dd (pid: 9945, threadinfo ffff8801b919a000, task ffff880149a264c0) Call Trace: ocfs2_journal_dirty+0x2f/0x70 [ocfs2] ocfs2_relink_block_group+0x111/0x480 [ocfs2] ocfs2_search_chain+0x455/0x9a0 [ocfs2] ... Signed-off-by: Xiaowei.Hu <xiaowei.hu@oracle.com> Reviewed-by: Srinivas Eeda <srinivas.eeda@oracle.com> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Jeff Liu authored
We need to re-initialize the security for a new reflinked inode with its parent dirs if it isn't specified to be preserved for ocfs2_reflink(). However, the code logic is broken at ocfs2_init_security_and_acl() although ocfs2_init_security_get() succeed. As a result, ocfs2_acl_init() does not involked and therefore the default ACL of parent dir was missing on the new inode. Note this was introduced by 9d8f13ba ("security: new security_inode_init_security API adds function callback") To reproduce: set default ACL for the parent dir(ocfs2 in this case): $ setfacl -m default:user:jeff:rwx ../ocfs2/ $ getfacl ../ocfs2/ # file: ../ocfs2/ # owner: jeff # group: jeff user::rwx group::r-x other::r-x default:user::rwx default:user:jeff:rwx default:group::r-x default:mask::rwx default:other::r-x $ touch a $ getfacl a # file: a # owner: jeff # group: jeff user::rw- group::rw- other::r-- Before patching, create reflink file b from a, the user default ACL entry(user:jeff:rwx)was missing: $ ./ocfs2_reflink a b $ getfacl b # file: b # owner: jeff # group: jeff user::rw- group::rw- other::r-- In this case, the end user can also observed an error message at syslog: (ocfs2_reflink,3229,2):ocfs2_init_security_and_acl:7193 ERROR: status = 0 After applying this patch, create reflink file c from a: $ ./ocfs2_reflink a c $ getfacl c # file: c # owner: jeff # group: jeff user::rw- user:jeff:rwx #effective:rw- group::r-x #effective:r-- mask::rw- other::r-- Test program: /* Usage: reflink <source> <dest> */ #include <stdio.h> #include <stdint.h> #include <stdbool.h> #include <string.h> #include <errno.h> #include <sys/types.h> #include <sys/stat.h> #include <fcntl.h> #include <sys/ioctl.h> static int reflink_file(char const *src_name, char const *dst_name, bool preserve_attrs) { int fd; #ifndef REFLINK_ATTR_NONE # define REFLINK_ATTR_NONE 0 #endif #ifndef REFLINK_ATTR_PRESERVE # define REFLINK_ATTR_PRESERVE 1 #endif #ifndef OCFS2_IOC_REFLINK struct reflink_arguments { uint64_t old_path; uint64_t new_path; uint64_t preserve; }; # define OCFS2_IOC_REFLINK _IOW ('o', 4, struct reflink_arguments) #endif struct reflink_arguments args = { .old_path = (unsigned long) src_name, .new_path = (unsigned long) dst_name, .preserve = preserve_attrs ? REFLINK_ATTR_PRESERVE : REFLINK_ATTR_NONE, }; fd = open(src_name, O_RDONLY); if (fd < 0) { fprintf(stderr, "Failed to open %s: %s\n", src_name, strerror(errno)); return -1; } if (ioctl(fd, OCFS2_IOC_REFLINK, &args) < 0) { fprintf(stderr, "Failed to reflink %s to %s: %s\n", src_name, dst_name, strerror(errno)); return -1; } } int main(int argc, char *argv[]) { if (argc != 3) { fprintf(stdout, "Usage: %s source dest\n", argv[0]); return 1; } return reflink_file(argv[1], argv[2], 0); } Signed-off-by: Jie Liu <jeff.liu@oracle.com> Reviewed-by: Tao Ma <boyu.mt@taobao.com> Cc: Mimi Zohar <zohar@linux.vnet.ibm.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Mark Fasheh <mfasheh@suse.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Nishanth Menon authored
Commit ef5da59f ("scripts/kernel-doc: handle struct member __aligned") permits "char something [123] __aligned(8);". However, by using \d we constraint ourselves with integers. This is not always the case. In fact, it might be better to do char something[123] __aligned(sizeof(u16)); For example, With wireless_dev defining: u8 address[ETH_ALEN] __aligned(sizeof(u16)); With \d, scripts/kernel-doc erroneously says: Warning(include/net/cfg80211.h:2618): Excess struct/union/enum/typedef member 'address' description in 'wireless_dev' This is because the regex __aligned\s*\(\d+\) fails match at \d as sizeof is used. So replace \d with . to indicate "something" in kernel-doc to ignore __aligned(SOMETHING) in structs. With this change, we can use integers OR sizeof() or macros as we please. Signed-off-by: Nishanth Menon <nm@ti.com> Cc: Fengguang Wu <fengguang.wu@intel.com> Cc: Johannes Berg <johannes.berg@intel.com> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Michal Marek <mmarek@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Michel Lespinasse authored
munlock_vma_pages_range() was always incrementing addresses by PAGE_SIZE at a time. When munlocking THP pages (or the huge zero page), this resulted in taking the mm->page_table_lock 512 times in a row. We can do better by making use of the page_mask returned by follow_page_mask (for the huge zero page case), or the size of the page munlock_vma_page() operated on (for the true THP page case). Signed-off-by: Michel Lespinasse <walken@google.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Rik van Riel <riel@redhat.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Hugh Dickins <hughd@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Kim, Milo authored
TI LP8788 PMU supports regulators, battery charger, RTC, ADC, backlight dri= ver and current sinks. This patch enables LP8788 backlight module. (Brightness mode) The brightness is controlled by PWM input or I2C register. All modes are supported in the driver. (Platform data) Configurable data can be defined in the platform side. name : backlight driver name. (default: "lcd-backlight") initial_brightness : initial value of backlight brightness bl_mode : brightness control by PWM or lp8788 register dim_mode : dimming mode selection full_scale : full scale current setting rise_time : brightness ramp up step time fall_time : brightness ramp down step time pwm_pol : PWM polarity setting when bl_mode is PWM based period_ns : platform specific PWM period value. unit is nano. The default values are set in case no platform data is defined. [akpm@linux-foundation.org: checkpatch fixes] Signed-off-by: Milo(Woogyom) Kim <milo.kim@ti.com> Cc: Richard Purdie <rpurdie@rpsys.net> Cc: Samuel Ortiz <sameo@linux.intel.com> Cc: Thierry Reding <thierry.reding@avionic-design.de> Cc: "devendra.aaru" <devendra.aaru@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-