1. 27 Sep, 2018 4 commits
  2. 26 Sep, 2018 6 commits
    • Thomas Zimmermann's avatar
      drm/hisilicon: Replace ttm_bo_unref with ttm_bo_put · c932c4f8
      Thomas Zimmermann authored
      The function ttm_bo_put releases a reference to a TTM buffer object. The
      function's name is more aligned to the Linux kernel convention of naming
      ref-counting function _get and _put.
      
      A call to ttm_bo_unref takes the address of the TTM BO object's pointer and
      clears the pointer's value to NULL. This is not necessary in most cases and
      sometimes even worked around by the calling code. A call to ttm_bo_put only
      releases the reference without clearing the pointer.
      
      The current behaviour of cleaning the pointer is kept in the calling code,
      but should be removed if not required in a later patch.
      Signed-off-by: default avatarThomas Zimmermann <tzimmermann@suse.de>
      Reviewed-by: default avatarXinliang Liu <z.liuxinliang@hisilicon.com>
      Signed-off-by: default avatarXinliang Liu <z.liuxinliang@hisilicon.com>
      c932c4f8
    • Thomas Zimmermann's avatar
      drm/hisilicon: Replace drm_dev_unref with drm_dev_put · 45fcedae
      Thomas Zimmermann authored
      This patch unifies the naming of DRM functions for reference counting
      of struct drm_device. The resulting code is more aligned with the rest
      of the Linux kernel interfaces.
      Signed-off-by: default avatarThomas Zimmermann <tzimmermann@suse.de>
      Reviewed-by: default avatarXinliang Liu <z.liuxinliang@hisilicon.com>
      Signed-off-by: default avatarXinliang Liu <z.liuxinliang@hisilicon.com>
      45fcedae
    • Souptick Joarder's avatar
      gpu/drm/hisilicon: Convert drm_atomic_helper_suspend/resume() · 081d0571
      Souptick Joarder authored
      convert drm_atomic_helper_suspend/resume() to use
      drm_mode_config_helper_suspend/resume().
      
      Fixed one sparse warning by making hibmc_drm_interrupt
      static.
      Signed-off-by: default avatarAjit Negi <ajitn.linux@gmail.com>
      Signed-off-by: default avatarSouptick Joarder <jrdr.linux@gmail.com>
      Reviewed-by: default avatarXinliang Liu <z.liuxinliang@hisilicon.com>
      Signed-off-by: default avatarXinliang Liu <z.liuxinliang@hisilicon.com>
      081d0571
    • John Garry's avatar
      drm/hisilicon: hibmc: Use HUAWEI PCI vendor ID macro · a66dae3a
      John Garry authored
      Switch to use Huawei PCI vendor ID macro from pci_ids.h file.
      
      In addition, switch to use PCI_VDEVICE() instead of open coding.
      Signed-off-by: default avatarJohn Garry <john.garry@huawei.com>
      Reviewed-by: default avatarXinliang Liu <z.liuxinliang@hisilicon.com>
      Signed-off-by: default avatarXinliang Liu <z.liuxinliang@hisilicon.com>
      a66dae3a
    • John Garry's avatar
      drm/hisilicon: hibmc: Don't overwrite fb helper surface depth · 0ff9f496
      John Garry authored
      Currently the driver overwrites the surface depth provided by the fb
      helper to give an invalid bpp/surface depth combination.
      
      This has been exposed by commit 70109354 ("drm: Reject unknown legacy
      bpp and depth for drm_mode_addfb ioctl"), which now causes the driver to
      fail to probe.
      
      Fix by not overwriting the surface depth.
      
      Fixes: d1667b86 ("drm/hisilicon/hibmc: Add support for frame buffer")
      Signed-off-by: default avatarJohn Garry <john.garry@huawei.com>
      Reviewed-by: default avatarXinliang Liu <z.liuxinliang@hisilicon.com>
      Signed-off-by: default avatarXinliang Liu <z.liuxinliang@hisilicon.com>
      0ff9f496
    • John Garry's avatar
      drm/hisilicon: hibmc: Do not carry error code in HiBMC framebuffer pointer · 331d880b
      John Garry authored
      In hibmc_drm_fb_create(), when the call to hibmc_framebuffer_init() fails
      with error, do not store the error code in the HiBMC device frame-buffer
      pointer, as this will be later checked for non-zero value in
      hibmc_fbdev_destroy() when our intention is to check for a valid function
      pointer.
      
      This fixes the following crash:
      [    9.699791] Unable to handle kernel NULL pointer dereference at virtual address 000000000000001a
      [    9.708672] Mem abort info:
      [    9.711489]   ESR = 0x96000004
      [    9.714570]   Exception class = DABT (current EL), IL = 32 bits
      [    9.720551]   SET = 0, FnV = 0
      [    9.723631]   EA = 0, S1PTW = 0
      [    9.726799] Data abort info:
      [    9.729702]   ISV = 0, ISS = 0x00000004
      [    9.733573]   CM = 0, WnR = 0
      [    9.736566] [000000000000001a] user address but active_mm is swapper
      [    9.742987] Internal error: Oops: 96000004 [#1] PREEMPT SMP
      [    9.748614] Modules linked in:
      [    9.751694] CPU: 16 PID: 293 Comm: kworker/16:1 Tainted: G        W         4.19.0-rc4-next-20180920-00001-g9b0012c #322
      [    9.762681] Hardware name: Huawei Taishan 2280 /D05, BIOS Hisilicon D05 IT21 Nemo 2.0 RC0 04/18/2018
      [    9.771915] Workqueue: events work_for_cpu_fn
      [    9.776312] pstate: 60000005 (nZCv daif -PAN -UAO)
      [    9.781150] pc : drm_mode_object_put+0x0/0x20
      [    9.785547] lr : hibmc_fbdev_fini+0x40/0x58
      [    9.789767] sp : ffff00000af1bcf0
      [    9.793108] x29: ffff00000af1bcf0 x28: 0000000000000000
      [    9.798473] x27: 0000000000000000 x26: ffff000008f66630
      [    9.803838] x25: 0000000000000000 x24: ffff0000095abb98
      [    9.809203] x23: ffff8017db92fe00 x22: ffff8017d2b13000
      [    9.814568] x21: ffffffffffffffea x20: ffff8017d2f80018
      [    9.819933] x19: ffff8017d28a0018 x18: ffffffffffffffff
      [    9.825297] x17: 0000000000000000 x16: 0000000000000000
      [    9.830662] x15: ffff0000092296c8 x14: ffff00008939970f
      [    9.836026] x13: ffff00000939971d x12: ffff000009229940
      [    9.841391] x11: ffff0000085f8fc0 x10: ffff00000af1b9a0
      [    9.846756] x9 : 000000000000000d x8 : 6620657a696c6169
      [    9.852121] x7 : ffff8017d3340580 x6 : ffff8017d4168000
      [    9.857486] x5 : 0000000000000000 x4 : ffff8017db92fb20
      [    9.862850] x3 : 0000000000002690 x2 : ffff8017d3340480
      [    9.868214] x1 : 0000000000000028 x0 : 0000000000000002
      [    9.873580] Process kworker/16:1 (pid: 293, stack limit = 0x(____ptrval____))
      [    9.880788] Call trace:
      [    9.883252]  drm_mode_object_put+0x0/0x20
      [    9.887297]  hibmc_unload+0x1c/0x80
      [    9.890815]  hibmc_pci_probe+0x170/0x3c8
      [    9.894773]  local_pci_probe+0x3c/0xb0
      [    9.898555]  work_for_cpu_fn+0x18/0x28
      [    9.902337]  process_one_work+0x1e0/0x318
      [    9.906382]  worker_thread+0x228/0x450
      [    9.910164]  kthread+0x128/0x130
      [    9.913418]  ret_from_fork+0x10/0x18
      [    9.917024] Code: a94153f3 a8c27bfd d65f03c0 d503201f (f9400c01)
      [    9.923180] ---[ end trace 2695ffa0af5be375 ]---
      
      Fixes: d1667b86 ("drm/hisilicon/hibmc: Add support for frame buffer")
      Signed-off-by: default avatarJohn Garry <john.garry@huawei.com>
      Reviewed-by: default avatarXinliang Liu <z.liuxinliang@hisilicon.com>
      Signed-off-by: default avatarXinliang Liu <z.liuxinliang@hisilicon.com>
      331d880b
  3. 24 Sep, 2018 11 commits
  4. 23 Sep, 2018 7 commits
  5. 22 Sep, 2018 1 commit
    • Omar Sandoval's avatar
      block: use nanosecond resolution for iostat · b57e99b4
      Omar Sandoval authored
      Klaus Kusche reported that the I/O busy time in /proc/diskstats was not
      updating properly on 4.18. This is because we started using ktime to
      track elapsed time, and we convert nanoseconds to jiffies when we update
      the partition counter. However, this gets rounded down, so any I/Os that
      take less than a jiffy are not accounted for. Previously in this case,
      the value of jiffies would sometimes increment while we were doing I/O,
      so at least some I/Os were accounted for.
      
      Let's convert the stats to use nanoseconds internally. We still report
      milliseconds as before, now more accurately than ever. The value is
      still truncated to 32 bits for backwards compatibility.
      
      Fixes: 522a7775 ("block: consolidate struct request timestamp fields")
      Cc: stable@vger.kernel.org
      Reported-by: default avatarKlaus Kusche <klaus.kusche@computerix.info>
      Signed-off-by: default avatarOmar Sandoval <osandov@fb.com>
      Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
      b57e99b4
  6. 21 Sep, 2018 5 commits
    • Greg Kroah-Hartman's avatar
      Merge tag 'pinctrl-v4.19-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl · 10dc890d
      Greg Kroah-Hartman authored
      Linus writes:
        "Pin control fixes for v4.19:
         - Two fixes for the Intel pin controllers than cause
           problems on laptops."
      
      * tag 'pinctrl-v4.19-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
        pinctrl: intel: Do pin translation in other GPIO operations as well
        pinctrl: cannonlake: Fix gpio base for GPP-E
      10dc890d
    • Greg Kroah-Hartman's avatar
      Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm · a27fb6d9
      Greg Kroah-Hartman authored
      Paolo writes:
        "It's mostly small bugfixes and cleanups, mostly around x86 nested
         virtualization.  One important change, not related to nested
         virtualization, is that the ability for the guest kernel to trap
         CPUID instructions (in Linux that's the ARCH_SET_CPUID arch_prctl) is
         now masked by default.  This is because the feature is detected
         through an MSR; a very bad idea that Intel seems to like more and
         more.  Some applications choke if the other fields of that MSR are
         not initialized as on real hardware, hence we have to disable the
         whole MSR by default, as was the case before Linux 4.12."
      
      * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (23 commits)
        KVM: nVMX: Fix bad cleanup on error of get/set nested state IOCTLs
        kvm: selftests: Add platform_info_test
        KVM: x86: Control guest reads of MSR_PLATFORM_INFO
        KVM: x86: Turbo bits in MSR_PLATFORM_INFO
        nVMX x86: Check VPID value on vmentry of L2 guests
        nVMX x86: check posted-interrupt descriptor addresss on vmentry of L2
        KVM: nVMX: Wake blocked vCPU in guest-mode if pending interrupt in virtual APICv
        KVM: VMX: check nested state and CR4.VMXE against SMM
        kvm: x86: make kvm_{load|put}_guest_fpu() static
        x86/hyper-v: rename ipi_arg_{ex,non_ex} structures
        KVM: VMX: use preemption timer to force immediate VMExit
        KVM: VMX: modify preemption timer bit only when arming timer
        KVM: VMX: immediately mark preemption timer expired only for zero value
        KVM: SVM: Switch to bitmap_zalloc()
        KVM/MMU: Fix comment in walk_shadow_page_lockless_end()
        kvm: selftests: use -pthread instead of -lpthread
        KVM: x86: don't reset root in kvm_mmu_setup()
        kvm: mmu: Don't read PDPTEs when paging is not enabled
        x86/kvm/lapic: always disable MMIO interface in x2APIC mode
        KVM: s390: Make huge pages unavailable in ucontrol VMs
        ...
      a27fb6d9
    • Greg Kroah-Hartman's avatar
      Merge tag 'upstream-4.19-rc4' of git://git.infradead.org/linux-ubifs · 0eba8697
      Greg Kroah-Hartman authored
      Richard writes:
        "This pull request contains fixes for UBIFS:
         - A wrong UBIFS assertion in mount code
         - Fix for a NULL pointer deref in mount code
         - Revert of a bad fix for xattrs"
      
      * tag 'upstream-4.19-rc4' of git://git.infradead.org/linux-ubifs:
        Revert "ubifs: xattr: Don't operate on deleted inodes"
        ubifs: drop false positive assertion
        ubifs: Check for name being NULL while mounting
      0eba8697
    • Greg Kroah-Hartman's avatar
      Merge tag 'for-linus-20180920' of git://git.kernel.dk/linux-block · 211b100a
      Greg Kroah-Hartman authored
      Jens writes:
        "Storage fixes for 4.19-rc5
      
        - Fix for leaking kernel pointer in floppy ioctl (Andy Whitcroft)
      
        - NVMe pull request from Christoph, and a single ANA log page fix
          (Hannes)
      
        - Regression fix for libata qd32 support, where we trigger an illegal
          active command transition. This fixes a CD-ROM detection issue that
          was reported, but could also trigger premature completion of the
          internal tag (me)"
      
      * tag 'for-linus-20180920' of git://git.kernel.dk/linux-block:
        floppy: Do not copy a kernel pointer to user memory in FDGETPRM ioctl
        libata: mask swap internal and hardware tag
        nvme: count all ANA groups for ANA Log page
      211b100a
    • Greg Kroah-Hartman's avatar
      Merge tag 'drm-fixes-2018-09-21' of git://anongit.freedesktop.org/drm/drm · a38fd7d8
      Greg Kroah-Hartman authored
      David writes:
        "drm fixes for 4.19-rc5:
      
         - core: fix debugfs for atomic, fix the check for atomic for
           non-modesetting drivers
         - amdgpu: adds a new PCI id, some kfd fixes and a sdma fix
         - i915: a bunch of GVT fixes.
         - vc4: scaling fix
         - vmwgfx: modesetting fixes and a old buffer eviction fix
         - udl: framebuffer destruction fix
         - sun4i: disable on R40 fix until next kernel
         - pl111: NULL termination on table fix"
      
      * tag 'drm-fixes-2018-09-21' of git://anongit.freedesktop.org/drm/drm: (21 commits)
        drm/amdkfd: Fix ATS capablity was not reported correctly on some APUs
        drm/amdkfd: Change the control stack MTYPE from UC to NC on GFX9
        drm/amdgpu: Fix SDMA HQD destroy error on gfx_v7
        drm/vmwgfx: Fix buffer object eviction
        drm/vmwgfx: Don't impose STDU limits on framebuffer size
        drm/vmwgfx: limit mode size for all display unit to texture_max
        drm/vmwgfx: limit screen size to stdu_max during check_modeset
        drm/vmwgfx: don't check for old_crtc_state enable status
        drm/amdgpu: add new polaris pci id
        drm: sun4i: drop second PLL from A64 HDMI PHY
        drm: fix drm_drv_uses_atomic_modeset on non modesetting drivers.
        drm/i915/gvt: clear ggtt entries when destroy vgpu
        drm/i915/gvt: request srcu_read_lock before checking if one gfn is valid
        drm/i915/gvt: Add GEN9_CLKGATE_DIS_4 to default BXT mmio handler
        drm/i915/gvt: Init PHY related registers for BXT
        drm/atomic: Use drm_drv_uses_atomic_modeset() for debugfs creation
        drm/fb-helper: Remove set but not used variable 'connector_funcs'
        drm: udl: Destroy framebuffer only if it was initialized
        drm/sun4i: Remove R40 display pipeline compatibles
        drm/pl111: Make sure of_device_id tables are NULL terminated
        ...
      a38fd7d8
  7. 20 Sep, 2018 6 commits
    • Dave Airlie's avatar
      Merge branch 'drm-next-4.20' of git://people.freedesktop.org/~agd5f/linux into drm-next · 36c9c3c9
      Dave Airlie authored
      This is a new pull for drm-next on top of last weeks with the following
      changes:
      - Fixed 64 bit divide
      - Fixed vram type on vega20
      - Misc vega20 fixes
      - Misc DC fixes
      - Fix GDS/GWS/OA domain handling
      
      Previous changes from last week:
      amdgpu/kfd:
      - Picasso (new APU) support
      - Raven2 (new APU) support
      - Vega20 enablement
      - ACP powergating improvements
      - Add ABGR/XBGR display support
      - VCN JPEG engine support
      - Initial xGMI support
      - Use load balancing for engine scheduling
      - Lots of new documentation
      - Rework and clean up i2c and aux handling in DC
      - Add DP YCbCr 4:2:0 support in DC
      - Add DMCU firmware loading for Raven (used for ABM and PSR)
      - New debugfs features in DC
      - LVDS support in DC
      - Implement wave kill for gfx/compute (light weight reset for shaders)
      - Use AGP aperture to avoid gart mappings when possible
      - GPUVM performance improvements
      - Bulk moves for more efficient GPUVM LRU handling
      - Merge amdgpu and amdkfd into one module
      - Enable gfxoff and stutter mode on Raven
      - Misc cleanups
      
      Scheduler:
      - Load balancing support
      - Bug fixes
      
      ttm:
      - Bulk move functionality
      - Bug fixes
      
      radeon:
      - Misc cleanups
      Signed-off-by: default avatarDave Airlie <airlied@redhat.com>
      
      From: Alex Deucher <alexdeucher@gmail.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20180920150438.12693-1-alexander.deucher@amd.com
      36c9c3c9
    • Dave Airlie's avatar
      Merge branch 'drm-fixes-4.19' of git://people.freedesktop.org/~agd5f/linux into drm-fixes · 4fcb7f8b
      Dave Airlie authored
      A few fixes for 4.19:
      - Add a new polaris pci id
      - KFD fixes for raven and gfx7
      Signed-off-by: default avatarDave Airlie <airlied@redhat.com>
      From: Alex Deucher <alexdeucher@gmail.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20180920155850.5455-1-alexander.deucher@amd.com
      4fcb7f8b
    • Dave Airlie's avatar
      Merge branch 'vmwgfx-fixes-4.19' of git://people.freedesktop.org/~thomash/linux into drm-fixes · 618cc151
      Dave Airlie authored
      A couple of modesetting fixes and a fix for a long-standing buffer-eviction
      problem cc'd stable.
      Signed-off-by: default avatarDave Airlie <airlied@redhat.com>
      From: Thomas Hellstrom <thellstrom@vmware.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20180920063935.35492-1-thellstrom@vmware.com
      618cc151
    • Feng Tang's avatar
      x86/mm: Expand static page table for fixmap space · 05ab1d8a
      Feng Tang authored
      We met a kernel panic when enabling earlycon, which is due to the fixmap
      address of earlycon is not statically setup.
      
      Currently the static fixmap setup in head_64.S only covers 2M virtual
      address space, while it actually could be in 4M space with different
      kernel configurations, e.g. when VSYSCALL emulation is disabled.
      
      So increase the static space to 4M for now by defining FIXMAP_PMD_NUM to 2,
      and add a build time check to ensure that the fixmap is covered by the
      initial static page tables.
      
      Fixes: 1ad83c85 ("x86_64,vsyscall: Make vsyscall emulation configurable")
      Suggested-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: default avatarFeng Tang <feng.tang@intel.com>
      Signed-off-by: default avatarThomas Gleixner <tglx@linutronix.de>
      Tested-by: default avatarkernel test robot <rong.a.chen@intel.com>
      Reviewed-by: Juergen Gross <jgross@suse.com> (Xen parts)
      Cc: H Peter Anvin <hpa@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Cc: Dave Hansen <dave.hansen@intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Andy Lutomirsky <luto@kernel.org>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/20180920025828.23699-1-feng.tang@intel.com
      05ab1d8a
    • Junxiao Bi's avatar
      ocfs2: fix ocfs2 read block panic · 234b69e3
      Junxiao Bi authored
      While reading block, it is possible that io error return due to underlying
      storage issue, in this case, BH_NeedsValidate was left in the buffer head.
      Then when reading the very block next time, if it was already linked into
      journal, that will trigger the following panic.
      
      [203748.702517] kernel BUG at fs/ocfs2/buffer_head_io.c:342!
      [203748.702533] invalid opcode: 0000 [#1] SMP
      [203748.702561] Modules linked in: ocfs2 ocfs2_dlmfs ocfs2_stack_o2cb ocfs2_dlm ocfs2_nodemanager ocfs2_stackglue configfs sunrpc dm_switch dm_queue_length dm_multipath bonding be2iscsi iscsi_boot_sysfs bnx2i cnic uio cxgb4i iw_cxgb4 cxgb4 cxgb3i libcxgbi iw_cxgb3 cxgb3 mdio ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr ipv6 iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ipmi_devintf iTCO_wdt iTCO_vendor_support dcdbas ipmi_ssif i2c_core ipmi_si ipmi_msghandler acpi_pad pcspkr sb_edac edac_core lpc_ich mfd_core shpchp sg tg3 ptp pps_core ext4 jbd2 mbcache2 sr_mod cdrom sd_mod ahci libahci megaraid_sas wmi dm_mirror dm_region_hash dm_log dm_mod
      [203748.703024] CPU: 7 PID: 38369 Comm: touch Not tainted 4.1.12-124.18.6.el6uek.x86_64 #2
      [203748.703045] Hardware name: Dell Inc. PowerEdge R620/0PXXHP, BIOS 2.5.2 01/28/2015
      [203748.703067] task: ffff880768139c00 ti: ffff88006ff48000 task.ti: ffff88006ff48000
      [203748.703088] RIP: 0010:[<ffffffffa05e9f09>]  [<ffffffffa05e9f09>] ocfs2_read_blocks+0x669/0x7f0 [ocfs2]
      [203748.703130] RSP: 0018:ffff88006ff4b818  EFLAGS: 00010206
      [203748.703389] RAX: 0000000008620029 RBX: ffff88006ff4b910 RCX: 0000000000000000
      [203748.703885] RDX: 0000000000000001 RSI: 0000000000000000 RDI: 00000000023079fe
      [203748.704382] RBP: ffff88006ff4b8d8 R08: 0000000000000000 R09: ffff8807578c25b0
      [203748.704877] R10: 000000000f637376 R11: 000000003030322e R12: 0000000000000000
      [203748.705373] R13: ffff88006ff4b910 R14: ffff880732fe38f0 R15: 0000000000000000
      [203748.705871] FS:  00007f401992c700(0000) GS:ffff880bfebc0000(0000) knlGS:0000000000000000
      [203748.706370] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [203748.706627] CR2: 00007f4019252440 CR3: 00000000a621e000 CR4: 0000000000060670
      [203748.707124] Stack:
      [203748.707371]  ffff88006ff4b828 ffffffffa0609f52 ffff88006ff4b838 0000000000000001
      [203748.707885]  0000000000000000 0000000000000000 ffff880bf67c3800 ffffffffa05eca00
      [203748.708399]  00000000023079ff ffffffff81c58b80 0000000000000000 0000000000000000
      [203748.708915] Call Trace:
      [203748.709175]  [<ffffffffa0609f52>] ? ocfs2_inode_cache_io_unlock+0x12/0x20 [ocfs2]
      [203748.709680]  [<ffffffffa05eca00>] ? ocfs2_empty_dir_filldir+0x80/0x80 [ocfs2]
      [203748.710185]  [<ffffffffa05ec0cb>] ocfs2_read_dir_block_direct+0x3b/0x200 [ocfs2]
      [203748.710691]  [<ffffffffa05f0fbf>] ocfs2_prepare_dx_dir_for_insert.isra.57+0x19f/0xf60 [ocfs2]
      [203748.711204]  [<ffffffffa065660f>] ? ocfs2_metadata_cache_io_unlock+0x1f/0x30 [ocfs2]
      [203748.711716]  [<ffffffffa05f4f3a>] ocfs2_prepare_dir_for_insert+0x13a/0x890 [ocfs2]
      [203748.712227]  [<ffffffffa05f442e>] ? ocfs2_check_dir_for_entry+0x8e/0x140 [ocfs2]
      [203748.712737]  [<ffffffffa061b2f2>] ocfs2_mknod+0x4b2/0x1370 [ocfs2]
      [203748.713003]  [<ffffffffa061c385>] ocfs2_create+0x65/0x170 [ocfs2]
      [203748.713263]  [<ffffffff8121714b>] vfs_create+0xdb/0x150
      [203748.713518]  [<ffffffff8121b225>] do_last+0x815/0x1210
      [203748.713772]  [<ffffffff812192e9>] ? path_init+0xb9/0x450
      [203748.714123]  [<ffffffff8121bca0>] path_openat+0x80/0x600
      [203748.714378]  [<ffffffff811bcd45>] ? handle_pte_fault+0xd15/0x1620
      [203748.714634]  [<ffffffff8121d7ba>] do_filp_open+0x3a/0xb0
      [203748.714888]  [<ffffffff8122a767>] ? __alloc_fd+0xa7/0x130
      [203748.715143]  [<ffffffff81209ffc>] do_sys_open+0x12c/0x220
      [203748.715403]  [<ffffffff81026ddb>] ? syscall_trace_enter_phase1+0x11b/0x180
      [203748.715668]  [<ffffffff816f0c9f>] ? system_call_after_swapgs+0xe9/0x190
      [203748.715928]  [<ffffffff8120a10e>] SyS_open+0x1e/0x20
      [203748.716184]  [<ffffffff816f0d5e>] system_call_fastpath+0x18/0xd7
      [203748.716440] Code: 00 00 48 8b 7b 08 48 83 c3 10 45 89 f8 44 89 e1 44 89 f2 4c 89 ee e8 07 06 11 e1 48 8b 03 48 85 c0 75 df 8b 5d c8 e9 4d fa ff ff <0f> 0b 48 8b 7d a0 e8 dc c6 06 00 48 b8 00 00 00 00 00 00 00 10
      [203748.717505] RIP  [<ffffffffa05e9f09>] ocfs2_read_blocks+0x669/0x7f0 [ocfs2]
      [203748.717775]  RSP <ffff88006ff4b818>
      
      Joesph ever reported a similar panic.
      Link: https://oss.oracle.com/pipermail/ocfs2-devel/2013-May/008931.html
      
      Link: http://lkml.kernel.org/r/20180912063207.29484-1-junxiao.bi@oracle.comSigned-off-by: default avatarJunxiao Bi <junxiao.bi@oracle.com>
      Cc: Joseph Qi <jiangqi903@gmail.com>
      Cc: Mark Fasheh <mark@fasheh.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Changwei Ge <ge.changwei@h3c.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      234b69e3
    • Roman Gushchin's avatar
      mm: slowly shrink slabs with a relatively small number of objects · 172b06c3
      Roman Gushchin authored
      9092c71b ("mm: use sc->priority for slab shrink targets") changed the
      way that the target slab pressure is calculated and made it
      priority-based:
      
          delta = freeable >> priority;
          delta *= 4;
          do_div(delta, shrinker->seeks);
      
      The problem is that on a default priority (which is 12) no pressure is
      applied at all, if the number of potentially reclaimable objects is less
      than 4096 (1<<12).
      
      This causes the last objects on slab caches of no longer used cgroups to
      (almost) never get reclaimed.  It's obviously a waste of memory.
      
      It can be especially painful, if these stale objects are holding a
      reference to a dying cgroup.  Slab LRU lists are reparented on memcg
      offlining, but corresponding objects are still holding a reference to the
      dying cgroup.  If we don't scan these objects, the dying cgroup can't go
      away.  Most likely, the parent cgroup hasn't any directly charged objects,
      only remaining objects from dying children cgroups.  So it can easily hold
      a reference to hundreds of dying cgroups.
      
      If there are no big spikes in memory pressure, and new memory cgroups are
      created and destroyed periodically, this causes the number of dying
      cgroups grow steadily, causing a slow-ish and hard-to-detect memory
      "leak".  It's not a real leak, as the memory can be eventually reclaimed,
      but it could not happen in a real life at all.  I've seen hosts with a
      steadily climbing number of dying cgroups, which doesn't show any signs of
      a decline in months, despite the host is loaded with a production
      workload.
      
      It is an obvious waste of memory, and to prevent it, let's apply a minimal
      pressure even on small shrinker lists.  E.g.  if there are freeable
      objects, let's scan at least min(freeable, scan_batch) objects.
      
      This fix significantly improves a chance of a dying cgroup to be
      reclaimed, and together with some previous patches stops the steady growth
      of the dying cgroups number on some of our hosts.
      
      Link: http://lkml.kernel.org/r/20180905230759.12236-1-guro@fb.com
      Fixes: 9092c71b ("mm: use sc->priority for slab shrink targets")
      Signed-off-by: default avatarRoman Gushchin <guro@fb.com>
      Acked-by: default avatarRik van Riel <riel@surriel.com>
      Cc: Josef Bacik <jbacik@fb.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Shakeel Butt <shakeelb@google.com>
      Cc: Michal Hocko <mhocko@kernel.org>
      Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      172b06c3