- 06 May, 2013 22 commits
-
-
Wang Shilong authored
The reason that BUG_ON() happens in these places is just because of ENOMEM. We try ro return ENOMEM rather than trigger BUG_ON(), the caller will abort the transaction thus avoiding the kernel panic. Signed-off-by: Wang Shilong <wangsl-fnst@cn.fujitsu.com> Reviewed-by: Miao Xie <miaox@cn.fujitsu.com> Reviewed-by: Jan Schmidt <list.btrfs@jan-o-sch.net> Signed-off-by: Josef Bacik <jbacik@fusionio.com>
-
Josef Bacik authored
A user reported a panic while running a balance. What was happening was he was relocating a block, which added the reference to the relocation tree. Then relocation would walk through the relocation tree and drop that reference and free that block, and then it would walk down a snapshot which referenced the same block and add another ref to the block. The problem is this was all happening in the same transaction, so the parent block was free'ed up when we drop our reference which was immediately available for allocation, and then it was used _again_ to add a reference for the same block from a different snapshot. This resulted in something like this in the delayed ref tree add ref to 90234880, parent=2067398656, ref_root 1766, level 1 del ref to 90234880, parent=2067398656, ref_root 18446744073709551608, level 1 add ref to 90234880, parent=2067398656, ref_root 1767, level 1 as you can see the ref_root's don't match, because when we inc the ref we use the header owner, which is the original tree the block belonged to, instead of the data reloc tree. Then when we remove the extent we use the reloc tree objectid. But none of this matters, since it is a shared reference which means only the parent matters. When the delayed ref stuff runs it adds all the increments first, and then does all the drops, to make sure that we don't delete the ref if we net a positive ref count. But tree blocks aren't allowed to have multiple refs from the same block, so this panics when it tries to add the second ref. We need the add and the drop to cancel each other out in memory so we only do the final add. So to fix this we need to adjust how the delayed refs are added to the tree. Only the ref_root matters when it is a normal backref, and only the parent matters when it is a shared backref. So make our decision based on what ref type we have. This allows us to keep the ref_root in memory in case anybody wants to use it for something else, and it allows the delayed refs to be merged properly so we don't end up with this panic. With this patch the users image no longer panics on mount, and it has a clean fsck after a normal mount/umount cycle. Thanks, Cc: stable@vger.kernel.org Reported-by: Roman Mamedov <rm@romanrm.ru> Signed-off-by: Josef Bacik <jbacik@fusionio.com>
-
Josef Bacik authored
Testing my enospc log code I managed to abort a transaction during mount, which put me into an infinite loop. This is because of two things, first we don't reset trans_no_join if we abort during transaction commit, which will force anybody trying to start a transaction to just loop endlessly waiting for it to be set to 0. But this is still just a symptom, the second issue is we don't set the fs state to error during errors on mount. This is because we don't want to do the flip read only thing during mount, but we still really want to set the fs state to an error to keep us from even getting to the trans_no_join check. So fix both of these things, make sure to reset trans_no_join if we abort during a commit, and make sure we set the fs state to error no matter if we're mounting or not. This should keep us from getting into this infinite loop again. Thanks, Signed-off-by: Josef Bacik <jbacik@fusionio.com>
-
Wang Shilong authored
Steps to reproduce: mkfs.btrfs <disk> mount <disk> <mnt> btrfs quota enable <mnt> btrfs sub create <mnt>/subv i=1 while [ $i -le 10000 ] do dd if=/dev/zero of=<mnt>/subv/data_$i bs=1K count=1 i=$(($i+1)) if [ $i -eq 500 ] then btrfs quota disable $mnt fi done dmesg Obviously, this warn_on() is unnecessary, and it will be easily triggered. Just remove it. Signed-off-by: Wang Shilong <wangsl-fnst@cn.fujitsu.com> Signed-off-by: Josef Bacik <jbacik@fusionio.com>
-
Liu Bo authored
set_extent_bit()'s (u64 *failed_start) expects NULL not 0. Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: Josef Bacik <jbacik@fusionio.com>
-
Eric Sandeen authored
Document all current btrfs mount options. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Josef Bacik <jbacik@fusionio.com>
-
David Sterba authored
The subvolume ioctls block on the parent directory mutex that can be held by other concurrent snapshot activity for a long time. Give the user at least some chance to get out of this situation by allowing to send a kill signal. Signed-off-by: David Sterba <dsterba@suse.cz> Signed-off-by: Josef Bacik <jbacik@fusionio.com>
-
David Sterba authored
Signed-off-by: David Sterba <dsterba@suse.cz> Signed-off-by: Josef Bacik <jbacik@fusionio.com>
-
David Sterba authored
The messages btrfs: unlinked 123 orphans btrfs: truncated 456 orphans are not useful to regular users and raise questions whether there are problems with the filesystem. Signed-off-by: David Sterba <dsterba@suse.cz> Signed-off-by: Josef Bacik <jbacik@fusionio.com>
-
David Sterba authored
This mount option was a workaround when subvol= assumed path relative to the default subvolume, not the toplevel one. This was fixed long time ago and subvolrootid has no effect. Signed-off-by: David Sterba <dsterba@suse.cz> Signed-off-by: Josef Bacik <jbacik@fusionio.com>
-
Simon Kirby authored
With more than one btrfs volume mounted, it can be very difficult to find out which volume is hitting an error. btrfs_error() will print this, but it is currently rigged as more of a fatal error handler, while many of the printk()s are currently for debugging and yet-unhandled cases. This patch just changes the functions where the device information is already available. Some cases remain where the root or fs_info is not passed to the function emitting the error. This may introduce some confusion with volumes backed by multiple devices emitting errors referring to the primary device in the set instead of the one on which the error occurred. Use btrfs_printk(fs_info, format, ...) rather than writing the device string every time, and introduce macro wrappers ala XFS for brevity. Since the function already cannot be used for continuations, print a newline as part of the btrfs_printk() message rather than at each caller. Signed-off-by: Simon Kirby <sim@hostway.ca> Reviewed-by: David Sterba <dsterba@suse.cz> Signed-off-by: Josef Bacik <jbacik@fusionio.com>
-
David Sterba authored
The Kconfig title does not make much sense after the cleanup of CONFIG_EXPERIMENTAL option, align the wording with other filesystems. Signed-off-by: David Sterba <dsterba@suse.cz> Signed-off-by: Josef Bacik <jbacik@fusionio.com>
-
David Sterba authored
Each time pick one dead root from the list and let the caller know if it's needed to continue. This should improve responsiveness during umount and balance which at some point waits for cleaning all currently queued dead roots. A new dead root is added to the end of the list, so the snapshots disappear in the order of deletion. The snapshot cleaning work is now done only from the cleaner thread and the others wake it if needed. Signed-off-by: David Sterba <dsterba@suse.cz> Signed-off-by: Josef Bacik <jbacik@fusionio.com>
-
Zhi Yong Wu authored
Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com> Signed-off-by: Josef Bacik <jbacik@fusionio.com>
-
Zhi Yong Wu authored
Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com> Signed-off-by: Josef Bacik <jbacik@fusionio.com>
-
Liu Bo authored
Share the exactly same code of stopping workers. Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: Josef Bacik <jbacik@fusionio.com>
-
Josef Bacik authored
We currently store the first key of the tree block inside the reference for the tree block in the extent tree. This takes up quite a bit of space. Make a new key type for metadata which holds the level as the offset and completely removes storing the btrfs_tree_block_info inside the extent ref. This reduces the size from 51 bytes to 33 bytes per extent reference for each tree block. In practice this results in a 30-35% decrease in the size of our extent tree, which means we COW less and can keep more of the extent tree in memory which makes our heavy metadata operations go much faster. This is not an automatic format change, you must enable it at mkfs time or with btrfstune. This patch deals with having metadata stored as either the old format or the new format so it is easy to convert. Thanks, Signed-off-by: Josef Bacik <jbacik@fusionio.com>
-
Liu Bo authored
free_root_pointers() has been introduced to cleanup all of tree roots, so just use it instead. Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Reviewed-by: David Sterba <dsterba@suse.cz> Signed-off-by: Josef Bacik <jbacik@fusionio.com>
-
Liu Bo authored
Argument 'root' is no more used in btrfs_csum_data(). Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: Josef Bacik <jbacik@fusionio.com>
-
David Sterba authored
The transaction abort stacktrace is printed only once per module lifetime, but we'd like to see it each time it happens per mounted filesystem. Introduce a fs_state flag that records it. Tweak the messages around abort: * add error number to the first abort * print the exact negative errno from btrfs_decode_error * clean up btrfs_decode_error and callers * no dots at the end of the messages Signed-off-by: David Sterba <dsterba@suse.cz> Signed-off-by: Josef Bacik <jbacik@fusionio.com>
-
David Sterba authored
Signed-off-by: David Sterba <dsterba@suse.cz> Signed-off-by: Josef Bacik <jbacik@fusionio.com>
-
Josef Bacik authored
We keep hitting bugs in the tree log replay because btrfs_remove_free_space doesn't account for some corner case. So add a bunch of tests to try and fully test btrfs_remove_free_space since the only time it is called is during tree log replay. These tests all finish successfully, so as we find more of these bugs we need to add to these tests to make sure we don't regress in fixing things. I've hidden the tests behind a Kconfig option, but they take no time to run so all btrfs developers should have this turned on all the time. Thanks, Signed-off-by: Josef Bacik <jbacik@fusionio.com>
-
- 29 Apr, 2013 2 commits
-
-
Liu Bo authored
btrfs_abort_devices() is no more used. Signed-off-by: Liu Bo <bo.li.liu@oracle.com> Signed-off-by: Josef Bacik <jbacik@fusionio.com>
-
Linus Torvalds authored
-
- 27 Apr, 2013 4 commits
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-socLinus Torvalds authored
Pull ARM SoC fix from Olof Johansson: "A late-arriving fix for musb on OMAP4, resolving an issue where the musb IP won't be clocked and thus not functional. Small in scope, most of the lines changed is a longish comment." * tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: ARM: OMAP4: hwmod data: make 'ocp2scp_usb_phy_phy_48m" as the main clock
-
Linus Torvalds authored
I think we could just move the full vm_iomap_memory() function into util.h or similar, but I didn't get any reply from anybody actually using nommu even to this trivial patch, so I'm not going to touch it any more than required. Here's the fairly minimal stub to make the nommu case at least potentially work. It doesn't seem like anybody cares, though. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull perf fix from Ingo Molnar: "This fix adds missing RCU read protection" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: events: Protect access via task_subsys_state_check()
-
Olof Johansson authored
Merge tag 'omap-for-v3.9-rc6/fixes-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into fixes From Tony Lindgren: One MUSB regression fix that I forgot to send earlier. Without this MUSB no longer works on omap4 based devices. * tag 'omap-for-v3.9-rc6/fixes-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap: ARM: OMAP4: hwmod data: make 'ocp2scp_usb_phy_phy_48m" as the main clock Signed-off-by: Olof Johansson <olof@lixom.net>
-
- 26 Apr, 2013 5 commits
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-mediaLinus Torvalds authored
Pull media fixes from Mauro Carvalho Chehab: "Two driver fixes. One avoids reading any file at a system with a cx25821 board (fortunately, this is not a common device). The other one prevents reading after a buffer with ISDB-T devices based on mb86a20s." * 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: [media] cx25821: do not expose broken video output streams [media] mb86a20s: Fix estimate_rate setting
-
git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linuxLinus Torvalds authored
Pull late parisc fixes from Helge Deller: "I know it's *very* late in the 3.9 release cycle, but since there aren't that many people testing the parisc linux kernel, a few (for our port) critical issues just showed up a few days back for the first time. What's in it? - add missing __ucmpdi2 symbol, which is required for btrfs on 32bit kernel. - change kunmap() macro to static inline function. This fixes a debian/gcc-4.4 build error. - add locking when doing PTE updates. This fixes random userspace crashes. - disable (optional) -mlong-calls compiler option for modules, else modules can't be loaded at runtime. - a smart patch by Will Deacon which fixes 64bit put_user() warnings on 32bit kernel." * 'fixes-3.9-late' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux: parisc: use spin_lock_irqsave/spin_unlock_irqrestore for PTE updates parisc: disable -mlong-calls compiler option for kernel modules parisc: uaccess: fix compiler warnings caused by __put_user casting parisc: Change kunmap macro to static inline function parisc: Provide __ucmpdi2 to resolve undefined references in 32 bit builds.
-
Matt Fleming authored
variable_is_present() accesses '__efivars' directly, but when called via gsmi_init() Michel reports observing the following crash, BUG: unable to handle kernel NULL pointer dereference at (null) IP: variable_is_present+0x55/0x170 Call Trace: register_efivars+0x106/0x370 gsmi_init+0x2ad/0x3da do_one_initcall+0x3f/0x170 The reason for the crash is that '__efivars' hasn't been initialised nor has it been registered with register_efivars() by the time the google EFI SMI driver runs. The gsmi code uses its own struct efivars, and therefore, a different variable list. Fix the above crash by passing the registered struct efivars to variable_is_present(), so that we traverse the correct list. Reported-by: Michel Lespinasse <walken@google.com> Tested-by: Michel Lespinasse <walken@google.com> Cc: Mike Waychison <mikew@google.com> Cc: Matthew Garrett <matthew.garrett@nebula.com> Cc: Seiji Aguchi <seiji.aguchi@hds.com> Signed-off-by: Matt Fleming <matt.fleming@intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Jiri Slaby authored
In commit b0de59b5 ("TTY: do not update atime/mtime on read/write") we removed timestamps from tty inodes to fix a security issue and waited if something breaks. Well, 'w', the utility to find out logged users and their inactivity time broke. It shows that users are inactive since the time they logged in. To revert to the old behaviour while still preventing attackers to guess the password length, we update the timestamps in one-minute intervals by this patch. Signed-off-by: Jiri Slaby <jslaby@suse.cz> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Zhao Hongjiang authored
dprintk() shouldn't access @ring after it's unmapped. Signed-off-by: Zhao Hongjiang <zhaohongjiang@huawei.com> Cc: stable@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 25 Apr, 2013 7 commits
-
-
H. Peter Anvin authored
* The EFI variable anti-bricking algorithm merged in -rc8 broke booting on some Apple machines because they implement EFI spec 1.10, which doesn't provide a QueryVariableInfo() runtime function and the logic used to check for the existence of that function was insufficient. Fix from Josh Boyer. * The anti-bricking algorithm also introduced a compiler warning on 32-bit. Fix from Borislav Petkov. Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
-
John David Anglin authored
User applications running on SMP kernels have long suffered from instability and random segmentation faults. This patch improves the situation although there is more work to be done. One of the problems is the various routines in pgtable.h that update page table entries use different locking mechanisms, or no lock at all (set_pte_at). This change modifies the routines to all use the same lock pa_dbit_lock. This lock is used for dirty bit updates in the interruption code. The patch also purges the TLB entries associated with the PTE to ensure that inconsistent values are not used after the page table entry is updated. The UP and SMP code are now identical. The change also includes a minor update to the purge_tlb_entries function in cache.c to improve its efficiency. Signed-off-by: John David Anglin <dave.anglin@bell.net> Cc: Helge Deller <deller@gmx.de> Signed-off-by: Helge Deller <deller@gmx.de>
-
Helge Deller authored
CONFIG_MLONGCALLS was introduced in commit ec758f98 to overcome linker issues when linking huge linux kernels, e.g. with many modules linked in. But in the kernel module loader there is no support yet for the new relocation types, which is why modules built with -mlong-calls can't be loaded. Furthermore, for modules long calls are not really necessary, since we already use stub sections which resolve long distance calls. So, let's just disable this compiler option when compiling kernel modules. Signed-off-by: Helge Deller <deller@gmx.de>
-
Will Deacon authored
When targetting 32-bit processors, __put_user emits a pair of stw instructions for the 8-byte case. If the type of __val is a pointer, the marshalling code casts it to the wider integer type of u64, resulting in the following compiler warnings: kernel/signal.c: In function 'copy_siginfo_to_user': kernel/signal.c:2752:11: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast] kernel/signal.c:2752:11: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast] [...] This patch fixes the warnings by removing the marshalling code and using the correct output modifiers in the __put_{user,kernel}_asm64 macros so that GCC will allocate the right registers without the need to extract the two words explicitly. Cc: Helge Deller <deller@gmx.de> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Helge Deller <deller@gmx.de>
-
John David Anglin authored
Change kunmap macro to static inline function to fix build error compiling drivers/base/dma-buf.c. Without the change, the following error can occur: CC drivers/base/dma-buf.o drivers/base/dma-buf.c: In function 'dma_buf_kunmap': drivers/base/dma-buf.c:427:46: error: macro "kunmap" passed 3 arguments, but takes just 1 I believe parisc is the only arch to implement kunmap using a macro. Signed-off-by: John David Anglin <dave.anglin@bell.net> Cc: "James E.J. Bottomley" <jejb@parisc-linux.org> Cc: Helge Deller <deller@gmx.de> Signed-off-by: Helge Deller <deller@gmx.de>
-
John David Anglin authored
The Debian experimental linux source package (3.8.5-1) build fails with the following errors: ... MODPOST 2016 modules ERROR: "__ucmpdi2" [fs/btrfs/btrfs.ko] undefined! ERROR: "__ucmpdi2" [drivers/md/dm-verity.ko] undefined! The attached patch resolves this problem. It is based on the s390 implementation of ucmpdi2.c. Signed-off-by: John David Anglin <dave.anglin@bell.net> Cc: "James E.J. Bottomley" <jejb@parisc-linux.org> Signed-off-by: Helge Deller <deller@gmx.de>
-
git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparcLinus Torvalds authored
Pull sparc fix from David Miller: "Brown paper bag fix for sparc64" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc: sparc64: Fix missing put_cpu_var() in tlb_batch_add_one() when not batching.
-