- 28 May, 2015 7 commits
-
-
NeilBrown authored
When a batch of stripes is broken up, we keep some of the flags that were per-stripe, and copy other flags from the head to all others. This only happens while a stripe is being handled, so many of the flags are irrelevant. The "SYNC_FLAGS" (which I've renamed to make it clear there are several) and STRIPE_DEGRADED are set per-stripe and so need to be preserved. STRIPE_INSYNC is the only flag that is set on the head that needs to be propagated to all others. For safety, add a WARN_ON if others are set, except: STRIPE_HANDLE - this is safe and per-stripe and we are going to set in several cases anyway STRIPE_INSYNC STRIPE_IO_STARTED - this is just a hint and doesn't hurt. STRIPE_ON_PLUG_LIST STRIPE_ON_RELEASE_LIST - It is a point pointless for a batched stripe to be on one of these lists, but it can happen as can be safely ignored. Signed-off-by: NeilBrown <neilb@suse.de>
-
NeilBrown authored
When we break a stripe_batch_list we sometimes want to set STRIPE_HANDLE on the individual stripes, and sometimes not. So pass a 'handle_flags' arg. If it is zero, always set STRIPE_HANDLE (on non-head stripes). If not zero, only set it if any of the given flags are present. Signed-off-by: NeilBrown <neilb@suse.de>
-
NeilBrown authored
break_stripe_batch list didn't clear head_sh->batch_head. This was probably a bug. Also clear all R5_Overlap flags and if any were cleared, wake up 'wait_for_overlap'. This isn't always necessary but the worst effect is a little extra checking for code that is waiting on wait_for_overlap. Also, don't use wake_up_nr() because that does the wrong thing if 'nr' is zero, and it number of flags cleared doesn't strongly correlate with the number of threads to wake. Signed-off-by: NeilBrown <neilb@suse.de>
-
NeilBrown authored
handle_stripe_clean_event() contains a chunk of code very similar to check_break_stripe_batch_list(). If we make the latter more like the former, we can end up with just one copy of this code. This first step removed the condition (and the 'check_') part of the name. This has the added advantage of making it clear what check is being performed at the point where the function is called. Signed-off-by: NeilBrown <neilb@suse.de>
-
NeilBrown authored
If a stripe is a member of a batch, but not the head, it must not be handled separately from the rest of the batch. 'clear_batch_ready()' handles this requirement to some extent but not completely. If a member is passed to handle_stripe() a second time it returns '0' indicating the stripe can be handled, which is wrong. So add an extra test. Signed-off-by: NeilBrown <neilb@suse.de>
-
NeilBrown authored
When we add a write to a stripe we need to make sure the bitmap bit is set. While doing that the stripe is not locked so it could be added to a batch after which further changes to STRIPE_BIT_DELAY and ->bm_seq are ineffective. So we need to hold off adding to a stripe until bitmap_startwrite has completed at least once, and we need to avoid further changes to STRIPE_BIT_DELAY once the stripe has been added to a batch. If a bitmap_startwrite() completes after the stripe was added to a batch, it will not have set the bit, only incremented a counter, so no extra delay of the stripe is needed. Reported-by: Shaohua Li <shli@kernel.org> Signed-off-by: NeilBrown <neilb@suse.de>
-
NeilBrown authored
When we add a stripe to a batch, we need to be sure that head stripe will wait for the bitmap update required for the new stripe. Signed-off-by: NeilBrown <neilb@suse.de>
-
- 20 May, 2015 3 commits
-
-
NeilBrown authored
Evaluating "&mddev->disks" is simple pointer arithmetic, so it does not need 'rcu' annotations - no dereferencing is happening. Also enhance the comment to explain that 'rdev' in that case is not actually a pointer to an rdev. Reported-by: Patrick Marlier <patrick.marlier@gmail.com> Signed-off-by: NeilBrown <neilb@suse.de>
-
Eric Work authored
The variable "sector" in "raid0_make_request()" was improperly updated by a call to "sector_div()" which modifies its first argument in place. Commit 47d68979 restored this variable after the call for later re-use. Unfortunetly the restore was done after the referenced variable "bio" was advanced. This lead to the original value and the restored value being different. Here we move this line to the proper place. One observed side effect of this bug was discarding a file though unlinking would cause an unrelated file's contents to be discarded. Signed-off-by: NeilBrown <neilb@suse.de> Fixes: 47d68979 ("md/raid0: fix bug with chunksize not a power of 2.") Cc: stable@vger.kernel.org (any that received above backport) URL: https://bugzilla.kernel.org/show_bug.cgi?id=98501
-
Shaohua Li authored
ops_run_reconstruct6() doesn't correctly chain asyn operations. The tx returned by async_gen_syndrome should be added as the dependent tx of next stripe. The issue is introduced by commit 59fc630b RAID5: batch adjacent full stripe write Reported-and-tested-by: Maxime Ripard <maxime.ripard@free-electrons.com> Signed-off-by: Shaohua Li <shli@fb.com> Signed-off-by: NeilBrown <neilb@suse.de>
-
- 08 May, 2015 7 commits
-
-
NeilBrown authored
There is no need for special handling of stripe-batches when the array is degraded. There may be if there is a failure in the batch, but STRIPE_DEGRADED does not imply an error. So don't set STRIPE_BATCH_ERR in ops_run_io just because the array is degraded. This actually causes a bug: the STRIPE_DEGRADED flag gets cleared in check_break_stripe_batch_list() and so the bitmap bit gets cleared when it shouldn't. So in check_break_stripe_batch_list(), split the batch up completely - again STRIPE_DEGRADED isn't meaningful. Also don't set STRIPE_BATCH_ERR when there is a write error to a replacement device. This simply removes the replacement device and requires no extra handling. Signed-off-by: NeilBrown <neilb@suse.de>
-
NeilBrown authored
As the new 'scribble' array is sized based on chunk size, we need to make sure the size matches the largest of 'old' and 'new' chunk sizes when the array is undergoing reshape. We also potentially need to resize it even when not resizing the stripe cache, as chunk size can change without changing number of devices. So move the 'resize' code into a separate function, and consider old and new sizes when allocating. Signed-off-by: NeilBrown <neilb@suse.de> Fixes: 46d5b785 ("raid5: use flex_array for scribble data")
-
NeilBrown authored
If any memory allocation in resize_stripes fails we will return -ENOMEM, but in some cases we update conf->pool_size anyway. This means that if we try again, the allocations will be assumed to be larger than they are, and badness results. So only update pool_size if there is no error. This bug was introduced in 2.6.17 and the patch is suitable for -stable. Fixes: ad01c9e3 ("[PATCH] md: Allow stripes to be expanded in preparation for expanding an array") Cc: stable@vger.kernel.org (v2.6.17+) Signed-off-by: NeilBrown <neilb@suse.de>
-
NeilBrown authored
When performing a reconstruct write, we need to read all blocks that are not being over-written .. except the parity (P and Q) blocks. The code currently reads these (as they are not being over-written!) unnecessarily. Signed-off-by: NeilBrown <neilb@suse.de> Fixes: ea664c82 ("md/raid5: need_this_block: tidy/fix last condition.")
-
NeilBrown authored
It is not incorrect to call handle_stripe_fill() when a batch of full-stripe writes is active. It is, however, a BUG if fetch_block() then decides it needs to actually fetch anything. So move the 'BUG_ON' to where it belongs. Signed-off-by: NeilBrown <neilb@suse.de> Fixes: 59fc630b ("RAID5: batch adjacent full stripe write")
-
NeilBrown authored
The new batch_lock and batch_list fields are being initialized in grow_one_stripe() but not in resize_stripes(). This causes a crash on resize. So separate the core initialization into a new function and call it from both allocation sites. Signed-off-by: NeilBrown <neilb@suse.de> Fixes: 59fc630b ("RAID5: batch adjacent full stripe write")
-
Heinz Mauelshagen authored
This patch is a prerequisite for dm-raid "raid0" support to allow dm-raid to access the MD RAID0 personality doing unconditional accesses to mddev->queue, which is NULL in case of dm-raid stacked on top of MD. Most of the conditional mddev->queue accesses made it to upstream but this missing one, which prohibits md raid0 to set disk stack limits (being done in dm core in case of md underneath dm). Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com> Tested-by: Heinz Mauelshagen <heinzm@redhat.com> Signed-off-by: NeilBrown <neilb@suse.de>
-
- 04 May, 2015 3 commits
-
-
Linus Torvalds authored
-
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4Linus Torvalds authored
Pull ext4 fixes from Ted Ts'o: "Some miscellaneous bug fixes and some final on-disk and ABI changes for ext4 encryption which provide better security and performance" * tag 'for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: ext4: fix growing of tiny filesystems ext4: move check under lock scope to close a race. ext4: fix data corruption caused by unwritten and delayed extents ext4 crypto: remove duplicated encryption mode definitions ext4 crypto: do not select from EXT4_FS_ENCRYPTION ext4 crypto: add padding to filenames before encrypting ext4 crypto: simplify and speed up filename encryption
-
git://people.freedesktop.org/~airlied/linuxLinus Torvalds authored
Pull drm fixes from Dave Airlie: "One intel fix, one rockchip fix, and a bunch of radeon fixes for some regressions from audio rework and vm stability" * 'drm-fixes' of git://people.freedesktop.org/~airlied/linux: drm/i915/chv: Implement WaDisableShadowRegForCpd drm/radeon: fix userptr return value checking (v2) drm/radeon: check new address before removing old one drm/radeon: reset BOs address after clearing it. drm/radeon: fix lockup when BOs aren't part of the VM on release drm/radeon: add SI DPM quirk for Sapphire R9 270 Dual-X 2G GDDR5 drm/radeon: adjust pll when audio is not enabled drm/radeon: only enable audio streams if the monitor supports it drm/radeon: only mark audio as connected if the monitor supports it (v3) drm/radeon/audio: don't enable packets until the end drm/radeon: drop dce6_dp_enable drm/radeon: fix ordering of AVI packet setup drm/radeon: Use drm_calloc_ab for CS relocs drm/rockchip: fix error check when getting irq MAINTAINERS: add entry for Rockchip drm drivers
-
- 03 May, 2015 8 commits
-
-
git://anongit.freedesktop.org/drm-intelDave Airlie authored
Just a single intel fix * tag 'drm-intel-fixes-2015-04-30' of git://anongit.freedesktop.org/drm-intel: drm/i915/chv: Implement WaDisableShadowRegForCpd
-
https://github.com/markyzq/kernel-drm-rockchipDave Airlie authored
one fix and maintainers update * 'drm-next0420' of https://github.com/markyzq/kernel-drm-rockchip: drm/rockchip: fix error check when getting irq MAINTAINERS: add entry for Rockchip drm drivers
-
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsiLinus Torvalds authored
Pull SCSI fixes from James Bottomley: "This is three logical fixes (as 5 patches). The 3ware class of drivers were causing an oops with multiqueue by tearing down the command mappings after completing the command (where the variables in the command used to tear down the mapping were no-longer valid). There's also a fix for the qnap iscsi target which was choking on us sending it commands that were too long and a fix for the reworked aha1542 allocating GFP_KERNEL under a lock" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: 3w-9xxx: fix command completion race 3w-xxxx: fix command completion race 3w-sas: fix command completion race aha1542: Allocate memory before taking a lock SCSI: add 1024 max sectors black list flag
-
git://git.infradead.org/users/vkoul/slave-dmaLinus Torvalds authored
Pull slave dmaengine fixes from Vinod Koul: "Here are the fixes in dmaengine subsystem for rc2: - privatecnt fix for slave dma request API by Christopher - warn fix for PM ifdef in usb-dmac by Geert - fix hardware dependency for xgene by Jean" * 'next' of git://git.infradead.org/users/vkoul/slave-dma: dmaengine: increment privatecnt when using dma_get_any_slave_channel dmaengine: xgene: Set hardware dependency dmaengine: usb-dmac: Protect PM-only functions to kill warning
-
git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linuxLinus Torvalds authored
Pull powerpc fixes from Michael Ellerman: - build fix for SMP=n in book3s_xics.c - fix for Daniel's pci_controller_ops on powernv. - revert the TM syscall abort patch for now. - CPU affinity fix from Nathan. - two EEH fixes from Gavin. - fix for CR corruption from Sam. - selftest build fix. * tag 'powerpc-4.1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/mpe/linux: powerpc/powernv: Restore non-volatile CRs after nap powerpc/eeh: Delay probing EEH device during hotplug powerpc/eeh: Fix race condition in pcibios_set_pcie_reset_state() powerpc/pseries: Correct cpu affinity for dlpar added cpus selftests/powerpc: Fix the pmu install rule Revert "powerpc/tm: Abort syscalls in active transactions" powerpc/powernv: Fix early pci_controller_ops loading. powerpc/kvm: Fix SMP=n build error in book3s_xics.c
-
Jan Kara authored
The estimate of necessary transaction credits in ext4_flex_group_add() is too pessimistic. It reserves credit for sb, resize inode, and resize inode dindirect block for each group added in a flex group although they are always the same block and thus it is enough to account them only once. Also the number of modified GDT block is overestimated since we fit EXT4_DESC_PER_BLOCK(sb) descriptors in one block. Make the estimation more precise. That reduces number of requested credits enough that we can grow 20 MB filesystem (which has 1 MB journal, 79 reserved GDT blocks, and flex group size 16 by default). Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Eric Sandeen <sandeen@redhat.com>
-
Davide Italiano authored
fallocate() checks that the file is extent-based and returns EOPNOTSUPP in case is not. Other tasks can convert from and to indirect and extent so it's safe to check only after grabbing the inode mutex. Signed-off-by: Davide Italiano <dccitaliano@gmail.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@vger.kernel.org
-
Lukas Czerner authored
Currently it is possible to lose whole file system block worth of data when we hit the specific interaction with unwritten and delayed extents in status extent tree. The problem is that when we insert delayed extent into extent status tree the only way to get rid of it is when we write out delayed buffer. However there is a limitation in the extent status tree implementation so that when inserting unwritten extent should there be even a single delayed block the whole unwritten extent would be marked as delayed. At this point, there is no way to get rid of the delayed extents, because there are no delayed buffers to write out. So when a we write into said unwritten extent we will convert it to written, but it still remains delayed. When we try to write into that block later ext4_da_map_blocks() will set the buffer new and delayed and map it to invalid block which causes the rest of the block to be zeroed loosing already written data. For now we can fix this by simply not allowing to set delayed status on written extent in the extent status tree. Also add WARN_ON() to make sure that we notice if this happens in the future. This problem can be easily reproduced by running the following xfs_io. xfs_io -f -c "pwrite -S 0xaa 4096 2048" \ -c "falloc 0 131072" \ -c "pwrite -S 0xbb 65536 2048" \ -c "fsync" /mnt/test/fff echo 3 > /proc/sys/vm/drop_caches xfs_io -c "pwrite -S 0xdd 67584 2048" /mnt/test/fff This can be theoretically also reproduced by at random by running fsx, but it's not very reliable, though on machines with bigger page size (like ppc) this can be seen more often (especially xfstest generic/127) Signed-off-by: Lukas Czerner <lczerner@redhat.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@vger.kernel.org
-
- 02 May, 2015 7 commits
-
-
Chanho Park authored
This patch removes duplicated encryption modes which were already in ext4.h. They were duplicated from commit 3edc18d8 and commit f542fb. Cc: Theodore Ts'o <tytso@mit.edu> Cc: Michael Halcrow <mhalcrow@google.com> Cc: Andreas Dilger <adilger.kernel@dilger.ca> Signed-off-by: Chanho Park <chanho61.park@samsung.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
-
Herbert Xu authored
This patch adds a tristate EXT4_ENCRYPTION to do the selections for EXT4_FS_ENCRYPTION because selecting from a bool causes all the selected options to be built-in, even if EXT4 itself is a module. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Theodore Ts'o <tytso@mit.edu>
-
git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds authored
Pull networking fixes from David Miller: 1) Receive packet length needs to be adjust by 2 on RX to accomodate the two padding bytes in altera_tse driver. From Vlastimil Setka. 2) If rx frame is dropped due to out of memory in macb driver, we leave the receive ring descriptors in an undefined state. From Punnaiah Choudary Kalluri 3) Some netlink subsystems erroneously signal NLM_F_MULTI. That is only for dumps. Fix from Nicolas Dichtel. 4) Fix mis-use of raw rt->rt_pmtu value in ipv4, one must always go via the ipv4_mtu() helper. From Herbert Xu. 5) Fix null deref in bridge netfilter, and miscalculated lengths in jump/goto nf_tables verdicts. From Florian Westphal. 6) Unhash ping sockets properly. 7) Software implementation of BPF divide did 64/32 rather than 64/64 bit divide. The JITs got it right. Fix from Alexei Starovoitov. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (30 commits) ipv4: Missing sk_nulls_node_init() in ping_unhash(). net: fec: Fix RGMII-ID mode net/mlx4_en: Schedule napi when RX buffers allocation fails netxen_nic: use spin_[un]lock_bh around tx_clean_lock net/mlx4_core: Fix unaligned accesses mlx4_en: Use correct loop cursor in error path. cxgb4: Fix MC1 memory offset calculation bnx2x: Delay during kdump load net: Fix Kernel Panic in bonding driver debugfs file: rlb_hash_table net: dsa: Fix scope of eeprom-length property net: macb: Fix race condition in driver when Rx frame is dropped hv_netvsc: Fix a bug in netvsc_start_xmit() altera_tse: Correct rx packet length mlx4: Fix tx ring affinity_mask creation tipc: fix problem with parallel link synchronization mechanism tipc: remove wrong use of NLM_F_MULTI bridge/nl: remove wrong use of NLM_F_MULTI bridge/mdb: remove wrong use of NLM_F_MULTI net: sched: act_connmark: don't zap skb->nfct trivial: net: systemport: bcmsysport.h: fix 0x0x prefix ...
-
Stefan Hajnoczi authored
Here the "other side" refers to the guest or host. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Rusty Russell authored
With my job change kernel work will be "own time"; I'm keeping lguest and modules (and the virtio standards work), but virtio kernel has to go. This makes it clear that Michael is in charge. He's good, but having me watch over his shoulder won't help. Good luck Michael! Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-clientLinus Torvalds authored
Pull Ceph RBD fix from Sage Weil. * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: rbd: end I/O the entire obj_request on error
-
David S. Miller authored
If we don't do that, then the poison value is left in the ->pprev backlink. This can cause crashes if we do a disconnect, followed by a connect(). Tested-by: Linus Torvalds <torvalds@linux-foundation.org> Reported-by: Wen Xu <hotdog3645@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
- 01 May, 2015 5 commits
-
-
Ilya Dryomov authored
When we end I/O struct request with error, we need to pass obj_request->length as @nr_bytes so that the entire obj_request worth of bytes is completed. Otherwise block layer ends up confused and we trip on rbd_assert(more ^ (which == img_request->obj_request_count)); in rbd_img_obj_callback() due to more being true no matter what. We already do it in most cases but we are missing some, in particular those where we don't even get a chance to submit any obj_requests, due to an early -ENOMEM for example. A number of obj_request->xferred assignments seem to be redundant but I haven't touched any of obj_request->xferred stuff to keep this small and isolated. Cc: Alex Elder <elder@linaro.org> Cc: stable@vger.kernel.org # 3.10+ Reported-by: Shawn Edwards <lesser.evil@gmail.com> Reviewed-by: Sage Weil <sage@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
-
Theodore Ts'o authored
This obscures the length of the filenames, to decrease the amount of information leakage. By default, we pad the filenames to the next 4 byte boundaries. This costs nothing, since the directory entries are aligned to 4 byte boundaries anyway. Filenames can also be padded to 8, 16, or 32 bytes, which will consume more directory space. Change-Id: Ibb7a0fb76d2c48e2061240a709358ff40b14f322 Signed-off-by: Theodore Ts'o <tytso@mit.edu>
-
Theodore Ts'o authored
Avoid using SHA-1 when calculating the user-visible filename when the encryption key is available, and avoid decrypting lots of filenames when searching for a directory entry in a directory block. Change-Id: If4655f144784978ba0305b597bfa1c8d7bb69e63 Signed-off-by: Theodore Ts'o <tytso@mit.edu>
-
git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfsLinus Torvalds authored
Pull btrfs fixes from Chris Mason: "A few more btrfs fixes. These range from corners Filipe found in the new free space cache writeback to a grab bag of fixes from the list" * 'for-linus-4.1' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: Btrfs: btrfs_release_extent_buffer_page didn't free pages of dummy extent Btrfs: fill ->last_trans for delayed inode in btrfs_fill_inode. btrfs: unlock i_mutex after attempting to delete subvolume during send btrfs: check io_ctl_prepare_pages return in __btrfs_write_out_cache btrfs: fix race on ENOMEM in alloc_extent_buffer btrfs: handle ENOMEM in btrfs_alloc_tree_block Btrfs: fix find_free_dev_extent() malfunction in case device tree has hole Btrfs: don't check for delalloc_bytes in cache_save_setup Btrfs: fix deadlock when starting writeback of bg caches Btrfs: fix race between start dirty bg cache writeout and bg deletion
-
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linuxLinus Torvalds authored
Pull arm64 fixes from Will Deacon: "Not too much here, but we've addressed a couple of nasty issues in the dma-mapping code as well as adding the halfword and byte variants of load_acquire/store_release following on from the CSD locking bug that you fixed in the core. - fix perf devicetree warnings at probe time - fix memory leak in __dma_free() - ensure DMA buffers are always zeroed - show IRQ trigger in /proc/interrupts (for parity with ARM) - implement byte and halfword access for smp_{load_acquire,store_release}" * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64: perf: Fix the pmu node name in warning message arm64: perf: don't warn about missing interrupt-affinity property for PPIs arm64: add missing PAGE_ALIGN() to __dma_free() arm64: dma-mapping: always clear allocated buffers ARM64: Enable CONFIG_GENERIC_IRQ_SHOW_LEVEL arm64: add missing data types in smp_load_acquire/smp_store_release
-