- 12 Oct, 2015 17 commits
-
-
Guoqing Jiang authored
For cluster raid, we should not kick the disk from the array if it can't be removed from the array successfully. Signed-off-by: Guoqing Jiang <gqjiang@suse.com> Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
-
Guoqing Jiang authored
Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
-
Guoqing Jiang authored
Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
-
Guoqing Jiang authored
During past testing, a node occasionally received a message that it had sent itself. This should not happen in theory, but it is better to guard against it in case something goes wrong. Signed-off-by: Guoqing Jiang <gqjiang@suse.com> Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
-
Guoqing Jiang authored
Since slot will be set within _sendmsg, we can remove the redundant code in resync_info_update. Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
-
Guoqing Jiang authored
Signed-off-by: Guoqing Jiang <gqjiang@suse.com>
-
Goldwyn Rodrigues authored
The receive daemon prints kernel messages for every network message received. This would fill the kernel message log with unnecessary messages. Remove the pr_info() messages. Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
-
Goldwyn Rodrigues authored
Adding the disk worked incorrectly with the new reload code. Fix it: - No operation should be performed on rdev marked as Candidate - After a metadata update operation, kick disk if role is 0xfffe else clear Candidate bit and continue with the regular change check. - Saving the mode of the lock resource to check if token lock is already locked, because it can be called twice while adding a disk. However, unlock_comm() must be called only once. - add_new_disk() is called by the node initiating the --add operation. If it needs to be canceled, call add_new_disk_cancel(). The operation is completed by md_update_sb() which will write and unlock the communication. Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
-
Goldwyn Rodrigues authored
Resync or recovery must be performed by only one node at a time. A DLM lock resource, resync_lockres, provides the mutual exclusion so that only one node performs the recovery/resync at a time. If a node is unable to get the resync_lockres because recovery is being performed by another node, it sets MD_RECOVERY_NEEDED so as to schedule recovery in the future. Remove the debug message in resync_info_update() used during development. Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
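The mutual exclusion described in this commit can be illustrated with a toy model. This is only a sketch under stated assumptions: `Node`, `try_resync()` and `finish_resync()` are hypothetical names, a `threading.Lock` stands in for the DLM lock resource, and a boolean stands in for the recovery-needed flag. None of this is the md-cluster API.

```python
import threading


class Node:
    """Toy stand-in for one cluster node (hypothetical, not the md-cluster API)."""

    def __init__(self, name, resync_lock):
        self.name = name
        self.resync_lock = resync_lock  # models the shared resync lock resource
        self.recovery_needed = False    # models the "recovery needed" flag
        self.resyncing = False

    def try_resync(self):
        # Non-blocking attempt: only one node may hold the lock at a time.
        if self.resync_lock.acquire(blocking=False):
            self.resyncing = True
            self.recovery_needed = False
            return True
        # Another node is resyncing: record that recovery must be retried later.
        self.recovery_needed = True
        return False

    def finish_resync(self):
        # Release the lock so that another node can take over.
        self.resyncing = False
        self.resync_lock.release()
```

A node that loses the race does not block; it only records that recovery is still pending and tries again once the lock is free.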
-
Goldwyn Rodrigues authored
In a clustered environment, a change such as marking a device faulty can be recorded by any of the nodes. This is communicated to all the nodes, and re-recording such a change is unnecessary and quite often pretty disruptive. With this patch, just before the update, we check for the changes and, if they are already in the superblock, abort the update after clearing all the flags. Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
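The "skip the write if nothing changed" idea can be sketched as follows. This is a minimal illustration with assumed data shapes: the superblock is a plain dict, the pending change flags are a set, and `maybe_update_sb()` and `write` are hypothetical names rather than kernel functions.

```python
def maybe_update_sb(superblock, pending, flags, write):
    """Write the superblock only if `pending` differs from what is recorded.

    superblock: dict modelling the on-disk state (assumption)
    pending:    dict of changes this node wants to record
    flags:      set of pending-change flags, cleared in both cases
    write:      callback that persists the superblock (hypothetical)
    """
    already_recorded = all(superblock.get(k) == v for k, v in pending.items())
    if already_recorded:
        # Another node has already recorded the change: abort the update.
        flags.clear()
        return False
    superblock.update(pending)
    write(superblock)
    flags.clear()
    return True
```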
-
Goldwyn Rodrigues authored
md_reload_sb is too simplistic and it explicitly needs to determine the changes made by the writing node. However, there are multiple areas where a simple reload could fail. Instead, read the superblock of one of the "good" rdevs and update the necessary information: - read the superblock into a newly allocated page, by temporarily swapping out rdev->sb_page and calling ->load_super. - if that fails, return - if it succeeds, call check_sb_changes 1. iterate over the list of active devices and check the matching dev_roles[] value. If that is 'faulty', the device must be marked as faulty - call md_error to mark the device as faulty. Make sure not to set CHANGE_DEVS and wake up mddev->thread or else it would initiate a resync process, which is the responsibility of the "primary" node. - clear the Blocked bit - Call remove_and_add_spares() to hot remove the device. If the device is 'spare': - call remove_and_add_spares() to get the number of spares added in this operation. - Reduce mddev->degraded to mark the array as not degraded. 2. reset recovery_cp - read the rest of the rdevs to update recovery_offset. If recovery_offset is equal to MaxSector, call spare_active() to set it In_sync. This required that recovery_offset be initialized to MaxSector, as opposed to zero, so as to communicate the end of sync for an rdev. Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
-
Goldwyn Rodrigues authored
remove_and_add_spares() checks all devices to activate spares. Change it to activate a specific device if a non-null rdev argument is passed. remove_and_add_spares() can be used to activate spares in slot_store() as well. For hot_remove_disk(), check if rdev->raid_disk == -1 before calling remove_and_add_spares(). Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
-
Goldwyn Rodrigues authored
When the suspended_area is deleted, the suspended processes must be woken up in order to complete their I/O. Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
-
Guoqing Jiang authored
Previously, the BITMAP_NEEDS_SYNC message was sent whenever the resync aborted, but it could abort for different reasons, and not all of them require another node to take over resync ownership. It is better to send the BITMAP_NEEDS_SYNC message only when the node is leaving the cluster with a dirty bitmap. We also need to ensure that the dlm connection is ok. Signed-off-by: Guoqing Jiang <gqjiang@suse.com> Signed-off-by: NeilBrown <neilb@suse.com>
-
Goldwyn Rodrigues authored
Suspending the entire device for resync could take too long. Resync in small chunks. The cluster's resync window (32M) is maintained in r1conf as cluster_sync_low and cluster_sync_high and processed in raid1's sync_request(). If the current resync is outside the cluster resync window: 1. Set the cluster_sync_low to curr_resync_completed. 2. Check if the sync will fit in the new window; if not, issue a wait_barrier() and set cluster_sync_low to sector_nr. 3. Set cluster_sync_high to cluster_sync_low + resync_window. 4. Send a message to all nodes so they may add it to their suspension list. bitmap_cond_end_sync is modified to allow forcing a sync in order to get curr_resync_completed up to date with the sector passed. Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com> Signed-off-by: NeilBrown <neilb@suse.de>
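The four window-advance steps listed in this commit can be sketched as below. This is a toy model and not the raid1 code: `Conf`, `advance_window()` and the `notify` callback are hypothetical names, the wait_barrier() call is only noted in a comment, and the 32M window is assumed to mean 32 MiB expressed in 512-byte sectors.

```python
# Assumption: "32M" is 32 MiB, counted here in 512-byte sectors.
RESYNC_WINDOW_SECTORS = 32 * 1024 * 1024 // 512


class Conf:
    """Holds the two window bounds named in the commit message."""

    def __init__(self):
        self.cluster_sync_low = 0
        self.cluster_sync_high = 0


def advance_window(conf, sector_nr, nr_sectors, curr_resync_completed, notify):
    """Move the window if [sector_nr, sector_nr + nr_sectors) is outside it.

    Returns True if the window moved (and `notify` was called).
    """
    end = sector_nr + nr_sectors
    if conf.cluster_sync_low <= sector_nr and end <= conf.cluster_sync_high:
        return False  # still inside the current window, nothing to do

    # Step 1: start the new window where the completed resync ends.
    conf.cluster_sync_low = curr_resync_completed
    # Step 2: if the request does not fit, restart the window at sector_nr
    # (the commit issues wait_barrier() at this point; omitted here).
    if end > conf.cluster_sync_low + RESYNC_WINDOW_SECTORS:
        conf.cluster_sync_low = sector_nr
    # Step 3: the window has a fixed size.
    conf.cluster_sync_high = conf.cluster_sync_low + RESYNC_WINDOW_SECTORS
    # Step 4: tell the other nodes which range to suspend.
    notify(conf.cluster_sync_low, conf.cluster_sync_high)
    return True
```

Only the range inside the window needs to be suspended on the other nodes, which is why the window is kept small compared with the whole device.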
-
Goldwyn Rodrigues authored
Add BITMAP_MAJOR_CLUSTERED as 5, in order to prevent older kernels from assembling a clustered device. In order to maximize compatibility, the major version is set to BITMAP_MAJOR_CLUSTERED *only* if the bitmap is clustered. Added MD_FEATURE_CLUSTERED in order to return an error for older kernels which would assemble MD even if the bitmap is corrupted. Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com> Signed-off-by: NeilBrown <neilb@suse.com>
-
Goldwyn Rodrigues authored
process_suspend_info - which handles the RESYNCING request - must not reply until all writes which were initiated before the request arrived have completed. As a by-product, all process_* functions now take mddev as their first argument, making it uniform. Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com> Signed-off-by: NeilBrown <neilb@suse.com>
-
- 02 Oct, 2015 8 commits
-
-
NeilBrown authored
Passing -1 to bitmap_storage_alloc() causes page->index to be set to -1, which is quite problematic. So only pass ->cluster_slot if mddev_is_clustered(). Fixes: b97e9257 ("Use separate bitmaps for each nodes in the cluster") Cc: stable@vger.kernel.org (v4.1+) Signed-off-by: NeilBrown <neilb@suse.com>
-
Jes Sorensen authored
close_sync() needs to set conf->next_resync to a large, but safe value below MaxSector and use it to determine whether or not to set start_next_window in wait_barrier(). Solution suggested by Neil Brown. Reported-by: Nate Dailey <nate.dailey@stratus.com> Tested-by: Xiao Ni <xni@redhat.com> Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com> Signed-off-by: NeilBrown <neilb@suse.com>
-
Julia Lawall authored
Remove unneeded NULL test. The semantic patch that makes this change is as follows: (http://coccinelle.lip6.fr/) // <smpl> @@ expression x; @@ -if (x != NULL) \(kmem_cache_destroy\|mempool_destroy\|dma_pool_destroy\)(x); // </smpl> Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr> Signed-off-by: NeilBrown <neilb@suse.com>
-
Shaohua Li authored
If an array has more faulty disks than the allowed degraded number, the array enters error handling. It will be marked as read-only with MD_CHANGE_PENDING/RECOVERY_NEEDED set. But currently recovery doesn't clear the CHANGE_PENDING bit for a read-only array. If MD_CHANGE_PENDING is set for a raid5 array, all returned IO will be held on a list till the bit is clear. But recovery never clears this bit, so the IO is always in pending state and never finishes. This has bad effects: the upper layer can't get an IO error and the array can't be stopped. Fixes: c3cce6cd ("md/raid5: ensure device failure recorded before write request returns.") Signed-off-by: Shaohua Li <shli@fb.com> Signed-off-by: NeilBrown <neilb@suse.com>
-
NeilBrown authored
Calling e.g. blk_queue_max_hw_sectors() after calls to disk_stack_limits() discards the settings determined by disk_stack_limits(). So we need to make those calls first. Fixes: 199dc6ed ("md/raid0: update queue parameter in a safer location.") Cc: stable@vger.kernel.org (v2.6.35+ - please apply with 199dc6ed). Reported-by: Jes Sorensen <Jes.Sorensen@redhat.com> Signed-off-by: NeilBrown <neilb@suse.com>
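The ordering problem described in this commit can be shown with a toy model. It is only a sketch: `Limits`, `set_max_hw_sectors()`, `stack_limits()` and `configure()` are hypothetical stand-ins, assuming that an explicit setter overwrites the limit while stacking keeps the most restrictive non-zero value.

```python
class Limits:
    """Toy queue limits; 0 means "not set yet" in this model."""

    def __init__(self):
        self.max_hw_sectors = 0


def set_max_hw_sectors(lim, value):
    # Models an explicit setter: it overwrites whatever was there.
    lim.max_hw_sectors = value


def stack_limits(lim, member_max):
    # Models stacking a member device: keep the smallest non-zero value.
    if lim.max_hw_sectors == 0 or member_max < lim.max_hw_sectors:
        lim.max_hw_sectors = member_max


def configure(order, array_max, member_maxes):
    """Return the final limit for the given call order."""
    lim = Limits()
    if order == "set_then_stack":
        set_max_hw_sectors(lim, array_max)
        for m in member_maxes:
            stack_limits(lim, m)
    else:  # "stack_then_set": the setter discards the stacked result
        for m in member_maxes:
            stack_limits(lim, m)
        set_max_hw_sectors(lim, array_max)
    return lim.max_hw_sectors
```

With members limited to 256 and 512 sectors and an array-level value of 1024, setting first and stacking afterwards yields 256, whereas stacking first and setting afterwards yields 1024 and loses the member restriction.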
-
NeilBrown authored
While need_this_block probably shouldn't be called when there are more than 2 failed devices, we really don't want it to try indexing beyond the end of the failed_num[] or fdev[] arrays. So limit the loops to at most 2 iterations. Reported-by: Shaohua Li <shli@fb.com> Signed-off-by: NeilBrown <neilb@suse.de>
-
Shaohua Li authored
handle_failed_stripe() makes the stripe fail, i.e. all IO will return with a failure, but it doesn't update stripe_head_state. Later, handle_stripe() has special handling for raid6 for handle_stripe_fill(). The check before handle_stripe_fill() doesn't skip the failed stripe, and we get a kernel crash in need_this_block. This patch clears the analysis state to make sure no functions are wrongly called after handle_failed_stripe(). Signed-off-by: Shaohua Li <shli@fb.com> Signed-off-by: NeilBrown <neilb@suse.com>
-
NeilBrown authored
If a superblock update is pending, wait for it to complete before letting md_set_readonly() switch to readonly. Otherwise we might lose important information about a device having failed. For external arrays, waiting for superblock updates can wait on user-space, so in that case, just return an error. Reported-and-tested-by: Shaohua Li <shli@fb.com> Signed-off-by: NeilBrown <neilb@suse.com>
-
- 22 Sep, 2015 1 commit
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup Linus Torvalds authored
Pull cgroup fixes from Tejun Heo: "The threadgroup locking changes which went in during 4.2 devel cycle added write locking of a percpu_rwsem in cgroup task migration path; unfortunately, that involved expedited rcu syncing which turned out to be too slow and heavy for certain workloads. The patchset which is dependent on this one didn't get committed during that devel cycle, so these two patches can be reverted safely. Oleg reworked percpu_rwsem for 4.4 so that the writer path is a lot lighter. The reported issue goes away with Oleg's reworked percpu_rwsem and I'll reapply these patches on the for-4.4 branch so that they can land together with Oleg's changes" * 'for-4.3-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: Revert "sched, cgroup: replace signal_struct->group_rwsem with a global percpu_rwsem" Revert "cgroup: simplify threadgroup locking"
-
- 21 Sep, 2015 3 commits
-
-
Linus Torvalds authored
Merge tag 'renesas-sh-drivers-for-v4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas Pull SH drivers updates from Simon Horman: "I am sending this change after v4.3-rc1 has been released as it depends on SoC changes which are present in that rc release. Summary: - disable PM runtime for multi-platform ARM with genpd - disable legacy default PM Domain on emev2" * tag 'renesas-sh-drivers-for-v4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas: drivers: sh: Disable PM runtime for multi-platform ARM with genpd drivers: sh: Disable legacy default PM Domain on emev2
-
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Linus Torvalds authored
Pull s390 fixes from Martin Schwidefsky: "A couple of system call updates. The two new system calls userfaultfd and membarrier have been added, as well as the 17 direct calls for the multiplexed socket system calls. In addition the system call compat wrappers have been flagged as notrace functions and a few wrappers could be removed. And bug fixes for the vector register handling, cpu_mf, suspend/resume, compat signals, SMT cputime accounting and the zfcp dumper" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390: wire up separate socketcalls system calls s390/compat: remove superfluous compat wrappers s390/compat: do not trace compat wrapper functions s390/s390x: allocate sys_membarrier system call number s390/configs//zfcpdump_defconfig: Remove CONFIG_MEMSTICK s390: wire up userfaultfd system call s390/vtime: correct scaled cputime for SMT s390/cpum_cf: Corrected return code for unauthorized counter sets s390/compat: correct uc_sigmask of the compat signal frame s390: fix floating point register corruption s390/hibernate: fix save and restore of vector registers
-
Jann Horn authored
Signed-off-by: Jann Horn <jann@thejh.net> Reviewed-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 20 Sep, 2015 10 commits
-
-
Linus Torvalds authored
-
git://ftp.arm.linux.org.uk/~rmk/linux-arm Linus Torvalds authored
Pull ARM fixes from Russell King: "Three fixes and a resulting cleanup for -rc2: - Andre Przywara reported that he was seeing a warning with the new cast inside DMA_ERROR_CODE's definition, and fixed the incorrect use. - Doug Anderson noticed that kgdb causes a "scheduling while atomic" bug. - OMAP5 folk noticed that their Thumb-2 compiled X servers crashed when enabling support to cover ARMv6 CPUs due to a kernel bug leaking some conditional context into the signal handler" * 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm: ARM: 8425/1: kgdb: Don't try to stop the machine when setting breakpoints ARM: 8437/1: dma-mapping: fix build warning with new DMA_ERROR_CODE definition ARM: get rid of needless #if in signal handling code ARM: fix Thumb2 signal handling when ARMv6 is enabled
-
Linus Torvalds authored
Merge tag 'linux-kselftest-4.3-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest Pull kselftest fixes from Shuah Khan: "This update contains 7 fixes for problems ranging from build failures to incorrect error reporting" * tag 'linux-kselftest-4.3-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: selftests: exec: revert to default emit rule selftests: change install command to rsync selftests: mqueue: simplify the Makefile selftests: mqueue: allow extra cflags selftests: rename jump label to static_keys selftests/seccomp: add support for s390 seltests/zram: fix syntax error
-
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Linus Torvalds authored
Pull power management and ACPI updates from Rafael Wysocki: "Included are: a somewhat late devfreq update which however is mostly fixes and cleanups with one new thing only (the PPMUv2 support on Exynos5433), an ACPI cpufreq driver fixup and two ACPI core cleanups related to preprocessor directives. Specifics: - Fix a memory allocation size in the devfreq core (Xiaolong Ye). - Fix a mistake in the exynos-ppmu DT binding (Javier Martinez Canillas). - Add support for PPMUv2 (Platform Performance Monitoring Unit version 2.0) on the Exynos5433 SoCs (Chanwoo Choi). - Fix a type casting bug in the Exynos PPMU code (MyungJoo Ham). - Assorted devfreq code cleanups and optimizations (Javi Merino, MyungJoo Ham, Viresh Kumar). - Fix up the ACPI cpufreq driver to use a more lightweight way to get to its private data in the ->get() callback (Rafael J Wysocki). - Fix a CONFIG_ prefix bug in one of the ACPI drivers and make the ACPI subsystem use IS_ENABLED() instead of #ifdefs in function bodies (Sudeep Holla)" * tag 'pm+acpi-4.3-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: cpufreq: acpi-cpufreq: Use cpufreq_cpu_get_raw() in ->get() ACPI: Eliminate CONFIG_.*{, _MODULE} #ifdef in favor of IS_ENABLED() ACPI: int340x_thermal: add missing CONFIG_ prefix PM / devfreq: Fix incorrect type issue. PM / devfreq: tegra: Update governor to use devfreq_update_stats() PM / devfreq: comments for get_dev_status usage updated PM / devfreq: drop comment about thermal setting max_freq PM / devfreq: cache the last call to get_dev_status() PM / devfreq: Drop unlikely before IS_ERR(_OR_NULL) PM / devfreq: exynos-ppmu: bit-wise operation bugfix. PM / devfreq: exynos-ppmu: Update documentation to support PPMUv2 PM / devfreq: exynos-ppmu: Add the support of PPMUv2 for Exynos5433 PM / devfreq: event: Remove incorrect property in exynos-ppmu DT binding
-
git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux Linus Torvalds authored
Pull clk fixes from Stephen Boyd: "A few driver fixes for tegra, rockchip, and st SoCs and a two-liner in the framework to avoid oops when get_parent ops return out of range values on tegra platforms" * tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: drivers: clk: st: Rename st_pll3200c32_407_c0_x into st_pll3200c32_cx_x clk: check for invalid parent index of orphans in __clk_init() clk: tegra: dfll: Properly protect OPP list clk: rockchip: add critical clock for rk3368
-
Linus Torvalds authored
Merge tag 'led-fixes-for-v4.3-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/j.anaszewski/linux-leds Pull LED fixes from Jacek Anaszewski: - fix module autoload for six OF platform drivers (aat1290, bcm6328, bcm6358, ktd2692, max77693, ns2) - aat1290: add missing static modifier - ipaq-micro: add missing LEDS_CLASS dependency - lp55xx: correct Kconfig dependency for f/w user helper * tag 'led-fixes-for-v4.3-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/j.anaszewski/linux-leds: leds:lp55xx: Correct Kconfig dependency for f/w user helper leds: leds-ipaq-micro: Add LEDS_CLASS dependency leds: aat1290: add 'static' modifier to init_mm_current_scale leds: leds-ns2: Fix module autoload for OF platform driver leds: max77693: Fix module autoload for OF platform driver leds: ktd2692: Fix module autoload for OF platform driver leds: bcm6358: Fix module autoload for OF platform driver leds: bcm6328: Fix module autoload for OF platform driver leds: aat1290: Fix module autoload for OF platform driver
-
git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma Linus Torvalds authored
Pull rdma fixes from Doug Ledford: "The new hfi1 driver in staging/rdma has had a number of fixup patches since being added to the tree. This is the first batch of those fixes" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: IB/hfi: Properly set permissions for user device files IB/hfi1: mask vs shift confusion IB/hfi1: clean up some defines IB/hfi1: info leak in get_ctxt_info() IB/hfi1: fix a locking bug IB/hfi1: checking for NULL instead of IS_ERR IB/hfi1: fix sdma_descq_cnt parameter parsing IB/hfi1: fix copy_to/from_user() error handling IB/hfi1: fix pstateinfo from returning improperly byteswapped value
-
git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm Linus Torvalds authored
Pull libnvdimm fixes from Dan Williams: - a boot regression (since v4.2) fix for some ARM configurations from Tyler - regression (since v4.1) fixes for mkfs.xfs on a DAX enabled device from Jeff. These are tagged for -stable. - a pair of locking fixes from Axel that are hidden from lockdep since they involve device_lock(). The "btt" one is tagged for -stable, the other only applies to the new "pfn" mechanism in v4.3. - a fix for the pmem ->rw_page() path to use wmb_pmem() from Ross. * 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: mm: fix type cast in __pfn_to_phys() pmem: add proper fencing to pmem_rw_page() libnvdimm: pfn_devs: Fix locking in namespace_store libnvdimm: btt_devs: Fix locking in namespace_store blockdev: don't set S_DAX for misaligned partitions dax: fix O_DIRECT I/O to the last block of a blockdev
-
git://git.kernel.dk/linux-block Linus Torvalds authored
Pull block updates from Jens Axboe: "This is a bit bigger than it should be, but I could (did) not want to send it off last week due to both wanting extra testing, and expecting a fix for the bounce regression as well. In any case, this contains: - Fix for the blk-merge.c compilation warning on gcc 5.x from me. - A set of back/front SG gap merge fixes, from me and from Sagi. This ensures that we honor SG gapping for integrity payloads as well. - Two small fixes for null_blk from Matias, fixing a leak and a capacity propagation issue. - A blkcg fix from Tejun, fixing a NULL dereference. - A fast clone optimization from Ming, fixing a performance regression since the arbitrarily sized bio's were introduced. - Also from Ming, a regression fix for bouncing IOs" * 'for-linus' of git://git.kernel.dk/linux-block: block: fix bounce_end_io block: blk-merge: fast-clone bio when splitting rw bios block: blkg_destroy_all() should clear q->root_blkg and ->root_rl.blkg block: Copy a user iovec if it includes gaps block: Refuse adding appending a gapped integrity page to a bio block: Refuse request/bio merges with gaps in the integrity payload block: Check for gaps on front and back merges null_blk: fix wrong capacity when bs is not 512 bytes null_blk: fix memory leak on cleanup block: fix bogus compiler warnings in blk-merge.c
-
Chris Mason authored
Commit 505a666e ("writeback: plug writeback in wb_writeback() and writeback_inodes_wb()") has us holding a plug during writeback_sb_inodes, which increases the merge rate when relatively contiguous small files are written by the filesystem. It helps both on flash and spindles. For an fs_mark workload creating 4K files in parallel across 8 drives, this commit improves performance ~9% more by unplugging before calling cond_resched(). cond_resched() doesn't trigger an implicit unplug, so explicitly getting the IO down to the device before scheduling reduces latencies for anyone waiting on clean pages. It also cuts down on how often we use kblockd to unplug, which means less work bouncing from one workqueue to another. Many more details about how we got here: https://lkml.org/lkml/2015/9/11/570 Signed-off-by: Chris Mason <clm@fb.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 19 Sep, 2015 1 commit
-
-
Tyler Baker authored
The various definitions of __pfn_to_phys() have been consolidated to use a generic macro in include/asm-generic/memory_model.h. This hit mainline in the form of 012dcef3 "mm: move __phys_to_pfn and __pfn_to_phys to asm/generic/memory_model.h". When the generic macro was implemented, the type cast to phys_addr_t was dropped, which caused boot regressions on ARM platforms with more than 4GB of memory and LPAE enabled. It was suggested to use PFN_PHYS() defined in include/linux/pfn.h, as it provides the correct logic and avoids further duplication. Reported-by: kernelci.org bot <bot@kernelci.org> Suggested-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Tyler Baker <tyler.baker@linaro.org> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
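Why the missing cast matters on a 32-bit platform with a 64-bit phys_addr_t can be illustrated with explicit bit masks. This is only a sketch under stated assumptions: 4 KiB pages (PAGE_SHIFT of 12), a 32-bit unsigned long for the page frame number, and hypothetical helper names. Python integers do not overflow, so the masks emulate the C integer widths.

```python
PAGE_SHIFT = 12            # assumption: 4 KiB pages
MASK32 = (1 << 32) - 1     # emulates a 32-bit unsigned long
MASK64 = (1 << 64) - 1     # emulates a 64-bit phys_addr_t


def pfn_to_phys_no_cast(pfn):
    # The shift is evaluated in 32 bits, so address bits above bit 31 are lost.
    return ((pfn & MASK32) << PAGE_SHIFT) & MASK32


def pfn_to_phys_with_cast(pfn):
    # The value is widened to 64 bits before the shift, so nothing is lost.
    return ((pfn & MASK64) << PAGE_SHIFT) & MASK64
```

For a page frame at the 4 GiB boundary (pfn 0x100000), the uncast version wraps around to address 0, while the widened version gives 0x100000000. Below 4 GiB both versions agree, which is why the problem only showed up on machines with more memory.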
-