- 09 Sep, 2015 4 commits
-
-
Linus Torvalds authored
Merge second patch-bomb from Andrew Morton: "Almost all of the rest of MM. There was an unusually large amount of MM material this time" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (141 commits) zpool: remove no-op module init/exit mm: zbud: constify the zbud_ops mm: zpool: constify the zpool_ops mm: swap: zswap: maybe_preload & refactoring zram: unify error reporting zsmalloc: remove null check from destroy_handle_cache() zsmalloc: do not take class lock in zs_shrinker_count() zsmalloc: use class->pages_per_zspage zsmalloc: consider ZS_ALMOST_FULL as migrate source zsmalloc: partial page ordering within a fullness_list zsmalloc: use shrinker to trigger auto-compaction zsmalloc: account the number of compacted pages zsmalloc/zram: introduce zs_pool_stats api zsmalloc: cosmetic compaction code adjustments zsmalloc: introduce zs_can_compact() function zsmalloc: always keep per-class stats zsmalloc: drop unused variable `nr_to_migrate' mm/memblock.c: fix comment in __next_mem_range() mm/page_alloc.c: fix type information of memoryless node memory-hotplug: fix comments in zone_spanned_pages_in_node() and zone_spanned_pages_in_node() ...
-
git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linuxLinus Torvalds authored
Pull parisc updates from Helge Deller: "The most important changes in this patchset are: - re-enable 64bit PCI bus addresses which were temporarily disabled for PA-RISC in kernel 4.2 - fix the 64bit CAS operation in the LWS path which now enables us to enable the 64bit gcc atomic builtins even on 32bit userspace with 64bit kernel - fix a long-standing bug which sometimes crashed kernel at bootup while serial interrupt wasn't registered yet" * 'parisc-4.3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux: parisc: Use platform_device_register_simple("rtc-generic") parisc: Drop CONFIG_SMP around update_cr16_clocksource() parisc: Use double word condition in 64bit CAS operation parisc: Filter out spurious interrupts in PA-RISC irq handler parisc: Additionally check for in_atomic() in page fault handler PCI,parisc: Enable 64-bit bus addresses on PA-RISC parisc: Define ioremap_uc and ioremap_wc
-
Linus Torvalds authored
Merge tag 'linux-kselftest-4.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest Pull kselftest update from Shuah Khan: "This update adds new zram test and fixes to problems found during testing this new zram test. In addition, there are a few bug fixes and ksefltest improvement patches from Linaro developers. I will send another update later on this week to fix kselftest breakage due to commit 2bf9e0ab ("locking/static_keys: Provide a selftest") after the fix soaks in next for a couple of days" * tag 'linux-kselftest-4.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: selftests/zram: Makefile fix selftests/zram: must be run as root selftests: breakpoints: fix installing error on the architecture except x86 selftests: check before install selftests/zram: Adding zram tests
-
git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommuLinus Torvalds authored
Pull iommu updates for from Joerg Roedel: "This time the IOMMU updates are mostly cleanups or fixes. No big new features or drivers this time. In particular the changes include: - Bigger cleanup of the Domain<->IOMMU data structures and the code that manages them in the Intel VT-d driver. This makes the code easier to understand and maintain, and also easier to keep the data structures in sync. It is also a preparation step to make use of default domains from the IOMMU core in the Intel VT-d driver. - Fixes for a couple of DMA-API misuses in ARM IOMMU drivers, namely in the ARM and Tegra SMMU drivers. - Fix for a potential buffer overflow in the OMAP iommu driver's debug code - A couple of smaller fixes and cleanups in various drivers - One small new feature: Report domain-id usage in the Intel VT-d driver to easier detect bugs where these are leaked" * tag 'iommu-updates-v4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (83 commits) iommu/vt-d: Really use upper context table when necessary x86/vt-d: Fix documentation of DRHD iommu/fsl: Really fix init section(s) content iommu/io-pgtable-arm: Unmap and free table when overwriting with block iommu/io-pgtable-arm: Move init-fn declarations to io-pgtable.h iommu/msm: Use BUG_ON instead of if () BUG() iommu/vt-d: Access iomem correctly iommu/vt-d: Make two functions static iommu/vt-d: Use BUG_ON instead of if () BUG() iommu/vt-d: Return false instead of 0 in irq_remapping_cap() iommu/amd: Use BUG_ON instead of if () BUG() iommu/amd: Make a symbol static iommu/amd: Simplify allocation in irq_remapping_alloc() iommu/tegra-smmu: Parameterize number of TLB lines iommu/tegra-smmu: Factor out tegra_smmu_set_pde() iommu/tegra-smmu: Extract tegra_smmu_pte_get_use() iommu/tegra-smmu: Use __GFP_ZERO to allocate zeroed pages iommu/tegra-smmu: Remove PageReserved manipulation iommu/tegra-smmu: Convert to use DMA API iommu/tegra-smmu: smmu_flush_ptc() wants device addresses ...
-
- 08 Sep, 2015 36 commits
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmapLinus Torvalds authored
Pull regmap updates from Mark Brown: "This has been a busy release for regmap. By far the biggest set of changes here are those from Markus Pargmann which implement support for block transfers in smbus devices. This required quite a bit of refactoring but leaves us better able to handle odd restrictions that controllers may have and with better performance on smbus. Other new features include: - Fix interactions with lockdep for nested regmaps (eg, when a device using regmap is connected to a bus where the bus controller has a separate regmap). Lockdep's default class identification is too crude to work without help. - Support for must write bitfield operations, useful for operations which require writing a bit to trigger them from Kuniori Morimoto. - Support for delaying during register patch application from Nariman Poushin. - Support for overriding cache state via the debugfs implementation from Richard Fitzgerald" * tag 'regmap-v4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap: (25 commits) regmap: fix a NULL pointer dereference in __regmap_init regmap: Support bulk reads for devices without raw formatting regmap-i2c: Add smbus i2c block support regmap: Add raw_write/read checks for max_raw_write/read sizes regmap: regmap max_raw_read/write getter functions regmap: Introduce max_raw_read/write for regmap_bulk_read/write regmap: Add missing comments about struct regmap_bus regmap: No multi_write support if bus->write does not exist regmap: Split use_single_rw internally into use_single_read/write regmap: Fix regmap_bulk_write for bus writes regmap: regmap_raw_read return error on !bus->read regulator: core: Print at debug level on debugfs creation failure regmap: Fix regmap_can_raw_write check regmap: fix typos in regmap.c regmap: Fix integertypes for register address and value regmap: Move documentation to regmap.h regmap: Use different lockdep class for each regmap init call thermal: sti: Add parentheses around bridge->ops->regmap_init call mfd: vexpress: Add parentheses around bridge->ops->regmap_init call regmap: debugfs: Fix misuse of IS_ENABLED ...
-
git://git.kernel.org/pub/scm/linux/kernel/git/tomba/linuxLinus Torvalds authored
Pull fbdev updates from Tomi Valkeinen: "Minor fixes and cleanups" * tag 'fbdev-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/tomba/linux: video: fbdev: atmel_lcdfb: remove useless include video: fbdev: pxa168fb: Use devm_clk_get fbdev: ssd1307fb: fix error return code fbdev: fix snprintf() limit in show_bl_curve() video: fbdev: s3c-fb: Constify platform_device_id video: fbdev: atmel: fix warning for const return value video: fbdev: Drop owner assignment from platform_driver video: fbdev: Drop owner assignment from i2c_driver fbdev: remove unnecessary memset in vfb framebuffer: disable vgacon on microblaze arch fbdev: udlfb: remove unneeded initialization in few places fbdev: Allow compile test of GPIO consumers if !GPIOLIB fbdev: fix cea_modes array size
-
git://git.linaro.org/people/ulf.hansson/mmcLinus Torvalds authored
Pull MMC updates from Ulf Hansson: "MMC core: - Fix a race condition in the request handling - Skip trim commands for some buggy kingston eMMCs - An optimization and a correction for erase groups - Set CMD23 quirk for some Sandisk cards MMC host: - sdhci: Give GPIO CD higher precedence and don't poll when it's used - sdhci: Fix DMA memory leakage - sdhci: Some updates for clock management - sdhci-of-at91: introduce driver for the Atmel SDMMC - sdhci-of-arasan: Add support for sdhci-5.1 - sdhci-esdhc-imx: Add support for imx7d which also supports HS400 - sdhci: A collection of fixes and improvements for various sdhci hosts - omap_hsmmc: Modernization of the regulator code - dw_mmc: A couple of fixes for DMA and PIO mode - usdhi6rol0: A few fixes and support probe deferral for regulators - pxamci: Convert to use dmaengine - sh_mmcif: Fix the suspend process in a short term solution - tmio: Adjust timeout for commands - sunxi: Fix timeout while gating/ungating clock" * tag 'mmc-v4.3' of git://git.linaro.org/people/ulf.hansson/mmc: (67 commits) mmc: android-goldfish: remove incorrect __iomem annotation mmc: core: fix race condition in mmc_wait_data_done mmc: host: omap_hsmmc: remove CONFIG_REGULATOR check mmc: host: omap_hsmmc: use ios->vdd for setting vmmc voltage mmc: host: omap_hsmmc: use regulator_is_enabled to find pbias status mmc: host: omap_hsmmc: enable/disable vmmc_aux regulator based on previous state mmc: host: omap_hsmmc: don't use ->set_power to set initial regulator state mmc: host: omap_hsmmc: avoid pbias regulator enable on power off mmc: host: omap_hsmmc: add separate function to set pbias mmc: host: omap_hsmmc: add separate functions for enable/disable supply mmc: host: omap_hsmmc: return error if any of the regulator APIs fail mmc: host: omap_hsmmc: remove unnecessary pbias set_voltage mmc: host: omap_hsmmc: use mmc_host's vmmc and vqmmc mmc: host: omap_hsmmc: use the ocrmask provided by the vmmc regulator mmc: host: omap_hsmmc: cleanup omap_hsmmc_reg_get() mmc: host: omap_hsmmc: return on fatal errors from omap_hsmmc_reg_get mmc: host: omap_hsmmc: use devm_regulator_get_optional() for vmmc mmc: sdhci-of-at91: fix platform_no_drv_owner.cocci warnings mmc: sh_mmcif: Fix suspend process mmc: usdhi6rol0: fix error return code ...
-
Linus Torvalds authored
Merge tag 'platform-drivers-x86-v4.3-1' of git://git.infradead.org/users/dvhart/linux-platform-drivers-x86 Pull x86 platform driver updates from Darren Hart: "Significant work on toshiba_acpi, including new hardware support, refactoring, and cleanups. Extend device support for asus, ideapad, and acer systems. New surface pro 3 buttons driver. Misc minor cleanups for thinkpad and hp-wireless. acer-wmi: - No rfkill on HP Omen 15 wifi thinkpad_acpi: - Remove side effects from vdbg_printk -> no_printk macro surface pro 3: - Add support driver for Surface Pro 3 buttons hp-wireless: - remove unneeded goto/label in hpwl_init ideapad-laptop: - add alternative representation for Yoga 2 to DMI table - Add Lenovo Yoga 3 14 to no_hw_rfkill dmi list asus-laptop: - Add key found on Asus F3M MAINTAINERS: - Remove Toshiba Linux mailing list address toshiba_acpi: - Bump driver version to 0.23 - Remove unnecessary checks and returns in HCI/SCI functions - Refactor *{get, set} functions return value - Remove "*not supported" feature prints - Change *available functions return type - Add set_fan_status function - Change some variables to avoid warnings from ninja-check - Reorder toshiba_acpi_alt_keymap entries - Remove unused wireless defines - Transflective backlight updates - Avoid registering input device on WMI event laptops - Add /dev/toshiba_acpi device - Adapt /proc/acpi/toshiba/keys to TOS1900 devices" * tag 'platform-drivers-x86-v4.3-1' of git://git.infradead.org/users/dvhart/linux-platform-drivers-x86: (21 commits) acer-wmi: No rfkill on HP Omen 15 wifi thinkpad_acpi: Remove side effects from vdbg_printk -> no_printk macro surface pro 3: Add support driver for Surface Pro 3 buttons hp-wireless: remove unneeded goto/label in hpwl_init ideapad-laptop: add alternative representation for Yoga 2 to DMI table asus-laptop: Add key found on Asus F3M MAINTAINERS: Remove Toshiba Linux mailing list address ideapad-laptop: Add Lenovo Yoga 3 14 to no_hw_rfkill dmi list toshiba_acpi: Bump driver version to 0.23 toshiba_acpi: Remove unnecessary checks and returns in HCI/SCI functions toshiba_acpi: Refactor *{get, set} functions return value toshiba_acpi: Remove "*not supported" feature prints toshiba_acpi: Change *available functions return type toshiba_acpi: Add set_fan_status function toshiba_acpi: Change some variables to avoid warnings from ninja-check toshiba_acpi: Reorder toshiba_acpi_alt_keymap entries toshiba_acpi: Remove unused wireless defines toshiba_acpi: Transflective backlight updates toshiba_acpi: Avoid registering input device on WMI event laptops toshiba_acpi: Add /dev/toshiba_acpi device ...
-
git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linuxLinus Torvalds authored
Pull i2c updates from Wolfram Sang: "Features: - new drivers: Renesas EMEV2, register based MUX, NXP LPC2xxx - core: scans DT and assigns wakeup interrupts. no driver changes needed. - core: some refcouting issues fixed and better API for that - core: new helper function for best effort block read emulation - slave framework: proper DT bindings and userspace instantiation - some bigger work for xiic, pxa, omap drivers .. and quite a number of smaller driver fixes, cleanups, improvements" * 'i2c/for-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: (65 commits) i2c: mux: reg Change ioread endianness for readback i2c: mux: reg: fix compilation warnings i2c: mux: reg: simplify register size checking i2c: muxes: fix leaked i2c adapter device node references i2c: allow specifying separate wakeup interrupt in device tree of/irq: export of_get_irq_byname() i2c: xgene-slimpro: dma_mapping_error() doesn't return an error code i2c: Replace I2C_CROS_EC_TUNNEL dependency eeprom: at24: use i2c_smbus_read_i2c_block_data_or_emulated i2c: core: Add support for best effort block read emulation i2c: lpc2k: add driver i2c: mux: Add register-based mux i2c-mux-reg i2c: dt: describe generic bindings i2c: slave: print warning if slave flag not set i2c: support 10 bit and slave addresses in sysfs 'new_device' i2c: take address space into account when checking for used addresses i2c: apply DT flags when probing i2c: make address check indpendent from client struct i2c: rename address check functions i2c: apply address offset for slaves, too ...
-
git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linuxLinus Torvalds authored
Pull RTC updates from Alexandre Belloni: "Core: - use is_visible() to control sysfs attributes - switch wakealarm attribute to DEVICE_ATTR_RW - make rtc_does_wakealarm() return boolean - properly manage lifetime of dev and cdev in rtc device - remove unnecessary device_get() in rtc_device_unregister - fix double free in rtc_register_device() error path New drivers: - NXP LPC24xx - Xilinx Zynq MP - Dialog DA9062 Subsystem wide cleanups: - fix drivers that consider 0 as a valid IRQ in client->irq - Drop (un)likely before IS_ERR(_OR_NULL) - drop the remaining owner assignment for i2c_driver and platform_driver - module autoload fixes Drivers: - 88pm80x: add device tree support - abx80x: fix RTC write bit - ab8500: Add a sentinel to ab85xx_rtc_ids[] - armada38x: Align RTC set time procedure with the official errata - as3722: correct month value - at91sam9: cleanups - at91rm9200: get and use slow clock and cleanups - bq32k: remove redundant check - cmos: century support, proper fix for the spurious wakeup - ds1307: cleanups and wakeup irq support - ds1374: Remove unused variable - ds1685: Use module_platform_driver - ds3232: fix WARNING trace in resume function - gemini: fix ptr_ret.cocci warnings - mt6397: implement suspend/resume - omap: support internal and external clock enabling - opal: Enable alarms only when opal supports tpo - pcf2127: use OFS flag to detect unreliable date and warn the user - pl031: fix typo for author email - rx8025: huge cleanup and fixes - sa1100/pxa: share common code - s5m: fix to update ctrl register - s3c: fix clocks and wakeup, cleanup - sirfsoc: use regmap - nvram_read()/nvram_write() functions for cmos, ds1305, ds1307, ds1343, ds1511, ds1553, ds1742, m48t59, rp5c01, stk17ta8, tx4939 - use rtc_valid_tm() error code when reading date/time instead of 0 for isl12022, pcf2123, pcf2127" * tag 'rtc-v4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/abelloni/linux: (90 commits) rtc: abx80x: fix RTC write bit rtc: ab8500: Add a sentinel to ab85xx_rtc_ids[] rtc: ds1374: Remove unused variable rtc: Fix module autoload for OF platform drivers rtc: Fix module autoload for rtc-{ab8500,max8997,s5m} drivers rtc: omap: Add external clock enabling support rtc: omap: Add internal clock enabling support ARM: dts: AM437x: Add the internal and external clock nodes for rtc rtc: s5m: fix to update ctrl register rtc: add xilinx zynqmp rtc driver devicetree: bindings: rtc: add bindings for xilinx zynqmp rtc rtc: as3722: correct month value ARM: config: Switch PXA27x platforms to use PXA RTC driver ARM: mmp: remove unused RTC register definitions ARM: sa1100: remove unused RTC register definitions rtc: sa1100/pxa: convert to run-time register mapping ARM: pxa: add memory resource to SA1100 RTC device rtc: pxa: convert to use shared sa1100 functions rtc: sa1100: prepare to share sa1100_rtc_ops rtc: ds3232: fix WARNING trace in resume function ...
-
Dan Streetman authored
Remove zpool_init() and zpool_exit(); they do nothing other than print "loaded" and "unloaded". Signed-off-by: Dan Streetman <ddstreet@ieee.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Krzysztof Kozlowski authored
The structure zbud_ops is not modified so make the pointer to it a pointer to const. Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com> Acked-by: Dan Streetman <ddstreet@ieee.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Krzysztof Kozlowski authored
The structure zpool_ops is not modified so make the pointer to it a pointer to const. Signed-off-by: Krzysztof Kozlowski <k.kozlowski@samsung.com> Acked-by: Dan Streetman <ddstreet@ieee.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Dmitry Safonov authored
zswap_get_swap_cache_page and read_swap_cache_async have pretty much the same code with only significant difference in return value and usage of swap_readpage. I a helper __read_swap_cache_async() with the common code. Behavior change: now zswap_get_swap_cache_page will use radix_tree_maybe_preload instead radix_tree_preload. Looks like, this wasn't changed only by the reason of code duplication. Signed-off-by: Dmitry Safonov <0x7f454c46@gmail.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Vladimir Davydov <vdavydov@parallels.com> Cc: Michal Hocko <mhocko@suse.cz> Cc: Hugh Dickins <hughd@google.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Tejun Heo <tj@kernel.org> Cc: Jens Axboe <axboe@fb.com> Cc: Christoph Hellwig <hch@lst.de> Cc: David Herrmann <dh.herrmann@gmail.com> Cc: Seth Jennings <sjennings@variantweb.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Sergey Senozhatsky authored
Make zram syslog error reporting more consistent. We have random error levels in some places. For example, critical errors like "Error allocating memory for compressed page" and "Unable to allocate temp memory" are reported as KERN_INFO messages. a) Reassign error levels Error messages that directly affect zram functionality -- pr_err(): Error allocating zram address table Error creating memory pool Decompression failed! err=%d, page=%u Unable to allocate temp memory Compression failed! err=%d Error allocating memory for compressed page: %u, size=%zu Cannot initialise %s compressing backend Error allocating disk queue for device %d Error allocating disk structure for device %d Error creating sysfs group for device %d Unable to register zram-control class Unable to get major number Messages that do not affect functionality, but user must be warned (because sysfs attrs will be removed in this particular case) -- pr_warn(): %d (%s) Attribute %s (and others) will be removed. %s Messages that do not affect functionality and mostly are informative -- pr_info(): Cannot change max compression streams Can't change algorithm for initialized device Cannot change disksize for initialized device Added device: %s Removed device: %s b) Update sysfs_create_group() error message First, it lacks a trailing new line; add it. Second, every error message in zram_add() has a "for device %d" part, which makes errors more informative. Add missing part to "Error creating sysfs group" message. Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Cc: Minchan Kim <minchan@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Sergey Senozhatsky authored
We can pass a NULL cache pointer to kmem_cache_destroy(), because it NULL-checks its argument now. Remove redundant test from destroy_handle_cache(). Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Acked-by: Minchan Kim <minchan@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Sergey Senozhatsky authored
We can avoid taking class ->lock around zs_can_compact() in zs_shrinker_count(), because the number that we return back is outdated in general case, by design. We have different sources that are able to change class's state right after we return from zs_can_compact() -- ongoing I/O operations, manually triggered compaction, or two of them happening simultaneously. We re-do this calculations during compaction on a per class basis anyway. zs_unregister_shrinker() will not return until we have an active shrinker, so classes won't unexpectedly disappear while zs_shrinker_count() iterates them. Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Acked-by: Minchan Kim <minchan@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Minchan Kim authored
There is no need to recalcurate pages_per_zspage in runtime. Just use class->pages_per_zspage to avoid unnecessary runtime overhead. Signed-off-by: Minchan Kim <minchan@kernel.org> Acked-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Minchan Kim authored
There is no reason to prevent select ZS_ALMOST_FULL as migration source if we cannot find source from ZS_ALMOST_EMPTY. With this patch, zs_can_compact will return more exact result. Signed-off-by: Minchan Kim <minchan.kim@lge.com> Acked-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Sergey Senozhatsky authored
We want to see more ZS_FULL pages and less ZS_ALMOST_{FULL, EMPTY} pages. Put a page with higher ->inuse count first within its ->fullness_list, which will give us better chances to fill up this page with new objects (find_get_zspage() return ->fullness_list head for new object allocation), so some zspages will become ZS_ALMOST_FULL/ZS_FULL quicker. It performs a trivial and cheap ->inuse compare which does not slow down zsmalloc and in the worst case keeps the list pages in no particular order. A more expensive solution could sort fullness_list by ->inuse count. [minchan@kernel.org: code adjustments] Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Cc: Minchan Kim <minchan@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Sergey Senozhatsky authored
Perform automatic pool compaction by a shrinker when system is getting tight on memory. User-space has a very little knowledge regarding zsmalloc fragmentation and basically has no mechanism to tell whether compaction will result in any memory gain. Another issue is that user space is not always aware of the fact that system is getting tight on memory. Which leads to very uncomfortable scenarios when user space may start issuing compaction 'randomly' or from crontab (for example). Fragmentation is not always necessarily bad, allocated and unused objects, after all, may be filled with the data later, w/o the need of allocating a new zspage. On the other hand, we obviously don't want to waste memory when the system needs it. Compaction now has a relatively quick pool scan so we are able to estimate the number of pages that will be freed easily, which makes it possible to call this function from a shrinker->count_objects() callback. We also abort compaction as soon as we detect that we can't free any pages any more, preventing wasteful objects migrations. Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Suggested-by: Minchan Kim <minchan@kernel.org> Acked-by: Minchan Kim <minchan@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Sergey Senozhatsky authored
Compaction returns back to zram the number of migrated objects, which is quite uninformative -- we have objects of different sizes so user space cannot obtain any valuable data from that number. Change compaction to operate in terms of pages and return back to compaction issuer the number of pages that were freed during compaction. So from now on we will export more meaningful value in zram<id>/mm_stat -- the number of freed (compacted) pages. This requires: (a) a rename of `num_migrated' to 'pages_compacted' (b) a internal API change -- return first_page's fullness_group from putback_zspage(), so we know when putback_zspage() did free_zspage(). It helps us to account compaction stats correctly. Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Acked-by: Minchan Kim <minchan@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Sergey Senozhatsky authored
`zs_compact_control' accounts the number of migrated objects but it has a limited lifespan -- we lose it as soon as zs_compaction() returns back to zram. It worked fine, because (a) zram had it's own counter of migrated objects and (b) only zram could trigger compaction. However, this does not work for automatic pool compaction (not issued by zram). To account objects migrated during auto-compaction (issued by the shrinker) we need to store this number in zs_pool. Define a new `struct zs_pool_stats' structure to keep zs_pool's stats there. It provides only `num_migrated', as of this writing, but it surely can be extended. A new zsmalloc zs_pool_stats() symbol exports zs_pool's stats back to caller. Use zs_pool_stats() in zram and remove `num_migrated' from zram_stats. Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Suggested-by: Minchan Kim <minchan@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Sergey Senozhatsky authored
Change zs_object_copy() argument order to be (DST, SRC) rather than (SRC, DST). copy/move functions usually have (to, from) arguments order. Rename alloc_target_page() to isolate_target_page(). This function doesn't allocate anything, it isolates target page, pretty much like isolate_source_page(). Tweak __zs_compact() comment. Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Acked-by: Minchan Kim <minchan@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Sergey Senozhatsky authored
This function checks if class compaction will free any pages. Rephrasing -- do we have enough unused objects to form at least one ZS_EMPTY page and free it. It aborts compaction if class compaction will not result in any (further) savings. EXAMPLE (this debug output is not part of this patch set): - class size - number of allocated objects - number of used objects - max objects per zspage - pages per zspage - estimated number of pages that will be freed [..] class-512 objs:544 inuse:540 maxobj-per-zspage:8 pages-per-zspage:1 zspages-to-free:0 ... class-512 compaction is useless. break class-496 objs:660 inuse:570 maxobj-per-zspage:33 pages-per-zspage:4 zspages-to-free:2 class-496 objs:627 inuse:570 maxobj-per-zspage:33 pages-per-zspage:4 zspages-to-free:1 class-496 objs:594 inuse:570 maxobj-per-zspage:33 pages-per-zspage:4 zspages-to-free:0 ... class-496 compaction is useless. break class-448 objs:657 inuse:617 maxobj-per-zspage:9 pages-per-zspage:1 zspages-to-free:4 class-448 objs:648 inuse:617 maxobj-per-zspage:9 pages-per-zspage:1 zspages-to-free:3 class-448 objs:639 inuse:617 maxobj-per-zspage:9 pages-per-zspage:1 zspages-to-free:2 class-448 objs:630 inuse:617 maxobj-per-zspage:9 pages-per-zspage:1 zspages-to-free:1 class-448 objs:621 inuse:617 maxobj-per-zspage:9 pages-per-zspage:1 zspages-to-free:0 ... class-448 compaction is useless. break class-432 objs:728 inuse:685 maxobj-per-zspage:28 pages-per-zspage:3 zspages-to-free:1 class-432 objs:700 inuse:685 maxobj-per-zspage:28 pages-per-zspage:3 zspages-to-free:0 ... class-432 compaction is useless. break class-416 objs:819 inuse:705 maxobj-per-zspage:39 pages-per-zspage:4 zspages-to-free:2 class-416 objs:780 inuse:705 maxobj-per-zspage:39 pages-per-zspage:4 zspages-to-free:1 class-416 objs:741 inuse:705 maxobj-per-zspage:39 pages-per-zspage:4 zspages-to-free:0 ... class-416 compaction is useless. break class-400 objs:690 inuse:674 maxobj-per-zspage:10 pages-per-zspage:1 zspages-to-free:1 class-400 objs:680 inuse:674 maxobj-per-zspage:10 pages-per-zspage:1 zspages-to-free:0 ... class-400 compaction is useless. break class-384 objs:736 inuse:709 maxobj-per-zspage:32 pages-per-zspage:3 zspages-to-free:0 ... class-384 compaction is useless. break [..] Every "compaction is useless" indicates that we saved CPU cycles. class-512 has 544 object allocated 540 objects used 8 objects per-page Even if we have a ALMOST_EMPTY zspage, we still don't have enough room to migrate all of its objects and free this zspage; so compaction will not make a lot of sense, it's better to just leave it as is. Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Acked-by: Minchan Kim <minchan@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Sergey Senozhatsky authored
Always account per-class `zs_size_stat' stats. This data will help us make better decisions during compaction. We are especially interested in OBJ_ALLOCATED and OBJ_USED, which can tell us if class compaction will result in any memory gain. For instance, we know the number of allocated objects in the class, the number of objects being used (so we also know how many objects are not used) and the number of objects per-page. So we can ensure if we have enough unused objects to form at least one ZS_EMPTY zspage during compaction. We calculate this value on per-class basis so we can calculate a total number of zspages that can be released. Which is exactly what a shrinker wants to know. Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Acked-by: Minchan Kim <minchan@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Sergey Senozhatsky authored
This patchset tweaks compaction and makes it possible to trigger pool compaction automatically when system is getting low on memory. zsmalloc in some cases can suffer from a notable fragmentation and compaction can release some considerable amount of memory. The problem here is that currently we fully rely on user space to perform compaction when needed. However, performing zsmalloc compaction is not always an obvious thing to do. For example, suppose we have a `idle' fragmented (compaction was never performed) zram device and system is getting low on memory due to some 3rd party user processes (gcc LTO, or firefox, etc.). It's quite unlikely that user space will issue zpool compaction in this case. Besides, user space cannot tell for sure how badly pool is fragmented; however, this info is known to zsmalloc and, hence, to a shrinker. This patch (of 7): __zs_compact() does not use `nr_to_migrate', drop it. Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com> Acked-by: Minchan Kim <minchan@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Alexander Kuleshov authored
Signed-off-by: Alexander Kuleshov <kuleshovmail@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Zhen Lei authored
For a memoryless node, the output of get_pfn_range_for_nid are all zero. It will display mem from 0 to -1. Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com> Acked-by: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Xishi Qiu authored
When hot adding a node from add_memory(), we will add memblock first, so the node is not empty. But when called from cpu_up(), the node should be empty. Signed-off-by: Xishi Qiu <qiuxishi@huawei.com> Cc: Tang Chen <tangchen@cn.fujitsu.com> Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Taku Izumi <izumi.taku@jp.fujitsu.com>\ Acked-by: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Yaowei Bai authored
We use sysctl_lowmem_reserve_ratio rather than sysctl_lower_zone_reserve_ratio to determine how aggressive the kernel is in defending lowmem from the possibility of being captured into pinned user memory. To avoid misleading, correct it in some comments. Signed-off-by: Yaowei Bai <bywxiaobai@163.com> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Yaowei Bai authored
The comment says that the per-cpu batchsize and zone watermarks are determined by present_pages which is definitely wrong, they are both calculated from managed_pages. Fix it. Signed-off-by: Yaowei Bai <bywxiaobai@163.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Chen Gang authored
There's no point in initializing vma->vm_pgoff if the insertion attempt will be failing anyway. Run the checks before performing the initialization. Signed-off-by: Chen Gang <gang.chen.5i5j@gmail.com> Cc: Michal Hocko <mhocko@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Petr Mladek authored
Commit 1dfb059b ("thp: reduce khugepaged freezing latency") fixed khugepaged to do not block a system suspend. But the result is that it could not get interrupted before the given timeout because the condition for the wait event is "false". This patch puts back the original approach but it uses freezable_schedule_timeout_interruptible() instead of schedule_timeout_interruptible(). It does the right thing. I am pretty sure that the freezable variant was not used in the original fix only because it was not available at that time. The regression has been there for ages. It was not critical. It just did the allocation throttling a little bit more aggressively. I found this problem when converting the kthread to kthread worker API and trying to understand the code. This bug is thought to have minimal userspace-visible impact. Somebody could set a high alloc_sleep value by mistake, and then try to fix it back, but khugepaged would keep sleeping until the high value expires. Signed-off-by: Petr Mladek <pmladek@suse.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: David Rientjes <rientjes@google.com> Cc: Ebru Akagunduz <ebru.akagunduz@gmail.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Alexander Kuleshov authored
s/succees/success/ Signed-off-by: Alexander Kuleshov <kuleshovmail@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Joonsoo Kim authored
We cache isolate_start_pfn before entering isolate_migratepages(). If pageblock is skipped in isolate_migratepages() due to whatever reason, cc->migrate_pfn can be far from isolate_start_pfn hence we flush pages that were freed. For example, the following scenario can be possible: - assume order-9 compaction, pageblock order is 9 - start_isolate_pfn is 0x200 - isolate_migratepages() - skip a number of pageblocks - start to isolate from pfn 0x600 - cc->migrate_pfn = 0x620 - return - last_migrated_pfn is set to 0x200 - check flushing condition - current_block_start is set to 0x600 - last_migrated_pfn < current_block_start then do useless flush This wrong flush would not help the performance and success rate so this patch tries to fix it. One simple way to know the exact position where we start to isolate migratable pages is that we cache it in isolate_migratepages() before entering actual isolation. This patch implements that and fixes the problem. Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Mel Gorman <mgorman@suse.de> Cc: Rik van Riel <riel@redhat.com> Cc: David Rientjes <rientjes@google.com> Cc: Minchan Kim <minchan@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Vlastimil Babka authored
alloc_pages_node() might fail when called with NUMA_NO_NODE and __GFP_THISNODE on a CPU belonging to a memoryless node. To make the local-node fallback more robust and prevent such situations, use numa_mem_id(), which was introduced for similar scenarios in the slab context. Suggested-by: Christoph Lameter <cl@linux.com> Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: David Rientjes <rientjes@google.com> Acked-by: Mel Gorman <mgorman@techsingularity.net> Acked-by: Christoph Lameter <cl@linux.com> Cc: Michal Hocko <mhocko@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Vlastimil Babka authored
Perform the same debug checks in alloc_pages_node() as are done in __alloc_pages_node(), by making the former function a wrapper of the latter one. In addition to better diagnostics in DEBUG_VM builds for situations which have been already fatal (e.g. out-of-bounds node id), there are two visible changes for potential existing buggy callers of alloc_pages_node(): - calling alloc_pages_node() with any negative nid (e.g. due to arithmetic overflow) was treated as passing NUMA_NO_NODE and fallback to local node was applied. This will now be fatal. - calling alloc_pages_node() with an offline node will now be checked for DEBUG_VM builds. Since it's not fatal if the node has been previously online, and this patch may expose some existing buggy callers, change the VM_BUG_ON in __alloc_pages_node() to VM_WARN_ON. Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: David Rientjes <rientjes@google.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Christoph Lameter <cl@linux.com> Acked-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Vlastimil Babka authored
alloc_pages_exact_node() was introduced in commit 6484eb3e ("page allocator: do not check NUMA node ID when the caller knows the node is valid") as an optimized variant of alloc_pages_node(), that doesn't fallback to current node for nid == NUMA_NO_NODE. Unfortunately the name of the function can easily suggest that the allocation is restricted to the given node and fails otherwise. In truth, the node is only preferred, unless __GFP_THISNODE is passed among the gfp flags. The misleading name has lead to mistakes in the past, see for example commits 5265047a ("mm, thp: really limit transparent hugepage allocation to local node") and b360edb4 ("mm, mempolicy: migrate_to_node should only migrate to node"). Another issue with the name is that there's a family of alloc_pages_exact*() functions where 'exact' means exact size (instead of page order), which leads to more confusion. To prevent further mistakes, this patch effectively renames alloc_pages_exact_node() to __alloc_pages_node() to better convey that it's an optimized variant of alloc_pages_node() not intended for general usage. Both functions get described in comments. It has been also considered to really provide a convenience function for allocations restricted to a node, but the major opinion seems to be that __GFP_THISNODE already provides that functionality and we shouldn't duplicate the API needlessly. The number of users would be small anyway. Existing callers of alloc_pages_exact_node() are simply converted to call __alloc_pages_node(), with the exception of sba_alloc_coherent() which open-codes the check for NUMA_NO_NODE, so it is converted to use alloc_pages_node() instead. This means it no longer performs some VM_BUG_ON checks, and since the current check for nid in alloc_pages_node() uses a 'nid < 0' comparison (which includes NUMA_NO_NODE), it may hide wrong values which would be previously exposed. Both differences will be rectified by the next patch. To sum up, this patch makes no functional changes, except temporarily hiding potentially buggy callers. Restricting the checks in alloc_pages_node() is left for the next patch which can in turn expose more existing buggy callers. Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Robin Holt <robinmholt@gmail.com> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Christoph Lameter <cl@linux.com> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Cc: Mel Gorman <mgorman@suse.de> Cc: David Rientjes <rientjes@google.com> Cc: Greg Thelen <gthelen@google.com> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Gleb Natapov <gleb@kernel.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Cliff Whickman <cpw@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Hugh Dickins authored
This is merely a politeness: I've not found that shrink_page_list() leads to deadlock with the page it holds locked across wait_on_page_writeback(); but nevertheless, why hold others off by keeping the page locked there? And while we're at it: remove the mistaken "not " from the commentary on this Case 3 (and a distracting blank line from Case 2, if I may). Signed-off-by: Hugh Dickins <hughd@google.com> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Mel Gorman <mgorman@techsingularity.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-