Commits · b1f378ab5334d96e661dd74586210a476b498802 · Kirill Smelkov / linux

08 Oct, 2018 18 commits

mmc: sdhci-of-esdhc: add erratum A008171 support · b1f378ab

Yinbo Zhu authored Aug 23, 2018

In tuning mode of operation, when TBCTL[TB_EN] is set, eSDHC may report
one of the following errors :
1)Tuning error while running tuning operation where SYSCTL2[SAMPCLKSEL]
will not get set even when SYSCTL2[EXTN] is reset. OR
2)Data transaction error (e.g. IRQSTAT[DCE], IRQSTAT[DEBE]) during data
transaction errors.
This issue occurs when the data window sampled within eSDHC is in full
cycle. So, in that case, eSDHC is not able to find out the start and
end points of the data window and sets the sampling pointer at default
location (which is middle of the internal SD clock). If this sampling
point coincides with the data eye boundary, then it can result in the
above mentioned errors. Impact: Tuning mode of operation for SDR50,
SDR104 or HS200 speed modes may not work properly
Workaround: In case eSDHC reports tuning error or data errors in tuning
mode of operation, by add the erratum A008171 support to fix the issue.
Signed-off-by: Yinbo Zhu <yinbo.zhu@nxp.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>

b1f378ab

mmc: sdhci: add tuning error codes · 7d8bb1f4

Yinbo Zhu authored Aug 23, 2018

This patch is to add tuning error codes to
judge tuning state
Signed-off-by: Yinbo Zhu <yinbo.zhu@nxp.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>

7d8bb1f4

mmc: uniphier-sd: add UniPhier SD/eMMC controller driver · 3fd784f7

Masahiro Yamada authored Aug 23, 2018

Here is another TMIO MMC variant found in Socionext UniPhier SoCs.

As commit b6147490 ("mmc: tmio: split core functionality, DMA and
MFD glue") said, these MMC controllers use the IP from Panasonic.

However, the MMC controller in the TMIO (Toshiba Mobile IO) MFD chip
was the first upstreamed user of this IP.  The common driver code
for this IP is now called 'tmio-mmc-core' in Linux although it is a
historical misnomer.

Anyway, this driver select's MMC_TMIO_CORE to borrow the common code
from tmio-mmc-core.c

Older UniPhier SoCs (LD4, Pro4, sLD8) support the external DMA engine
like renesas_sdhi_sys_dmac.c.  The difference is UniPhier SoCs use a
single DMA channel whereas Renesas chips request separate channels for
RX and TX.

Newer UniPhier SoCs (Pro5 and later) support the internal DMA engine
like renesas_sdhi_internal_dmac.c  The register map is almost the same,
so I guess Renesas and Socionext use the same internal DMA hardware.
The main difference is, the register offsets are doubled for Renesas.

                        Renesas      Socionext
                        SDHI         UniPhier
  DM_CM_DTRAN_MODE      0x820        0x410
  DM_CM_DTRAN_CTRL      0x828        0x414
  DM_CM_RST             0x830        0x418
  DM_CM_INFO1           0x840        0x420
  DM_CM_INFO1_MASK      0x848        0x424
  DM_CM_INFO2           0x850        0x428
  DM_CM_INFO2_MASK      0x858        0x42c
  DM_DTRAN_ADDR         0x880        0x440
  DM_DTRAN_ADDREX        ---         0x444

This comes from the difference of host->bus_shift; 2 for Renesas SoCs,
and 1 for UniPhier SoCs.  Also, the datasheet for UniPhier SoCs defines
DM_DTRAN_ADDR and DM_DTRAN_ADDREX as two separate registers.

It could be possible to factor out the DMA common code by introducing
some hooks to cope with platform quirks, but this patch does not touch
that for now.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Acked-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>

3fd784f7

dt-bindings: mmc: add DT binding for UniPhier SD/eMMC controller · fb19fdf4

Masahiro Yamada authored Aug 23, 2018

This SD/eMMC controller is used for UniPhier SoC family.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Acked-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>

fb19fdf4

mmc: tmio: move tmio_mmc_set_clock() to platform hook · 0196c8db

Masahiro Yamada authored Aug 23, 2018

tmio_mmc_set_clock() is full of quirks because different SoC vendors
extended this in different ways.

The original IP defines the divisor range 1/2 ... 1/512.

 bit 7 is set:    1/512
 bit 6 is set:    1/256
   ...
 bit 0 is set:    1/4
 all bits clear:  1/2

It is platform-dependent how to achieve the 1/1 clock.

I guess the TMIO-MFD variant uses the clock selector outside of this IP,
as far as I see tmio_core_mmc_clk_div() in drivers/mfd/tmio_core.c

I guess bit[7:0]=0xff is Renesas-specific extension.

Socionext (and Panasonic) uses bit 10 (CLKSEL) for 1/1.  Also, newer
versions of UniPhier SoC variants use bit 16 for 1/1024.

host->clk_update() is only used by the Renesas variants, whereas
host->set_clk_div() is only used by the TMIO-MFD variants.

To cope with this mess, promote tmio_mmc_set_clock() to a new
platform hook ->set_clock(), and melt the old two hooks into it.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Reviewed-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>

0196c8db

mmc: tmio: replace tmio_mmc_clk_stop() calls with tmio_mmc_set_clock() · 74005a01

Masahiro Yamada authored Aug 23, 2018

tmio_mmc_clk_stop(host) is equivalent to tmio_mmc_set_clock(host, 0).
This replacement is needed for the next commit.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Reviewed-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>

74005a01

mmc: jz4740: Add support for the JZ4725B · a0c938b5

Paul Cercueil authored Aug 21, 2018

The JZ4725B is the first JZ SoC version that introduced a 32-bit IMASK
register, not the JZ4750.
Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>

a0c938b5

mmc: use SPDX identifier for Renesas drivers · f707079d

Wolfram Sang authored Aug 22, 2018

Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Simon Horman <horms+renesas@verge.net.au>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>

f707079d

dt-bindings: mmc: tmio_mmc: document Renesas R8A77970 bindings · 00c6527b

Sergei Shtylyov authored Aug 21, 2018

Document the R-Car V3M (R8A77970) SoC in the R-Car SDHI bindings -- it's
the usual R-Car gen3 compatible controller with the internal DMA engine.
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Reviewed-by: Simon Horman <horms+renesas@verge.net.au>

00c6527b

mmc: renesas_sdhi_internal_dmac: add R8A77970 to whitelist · 16a129b3

Sergei Shtylyov authored Aug 18, 2018

I've successfully tested eMMC on the V3H Starter Kit board and since the
R8A77970 SoC has a single SDHI core, it can't be a subject to the known RX
DMA errata.
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Reviewed-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Reviewed-by: Simon Horman <horms+renesas@verge.net.au>

16a129b3

mmc: renesas_sdhi_internal_dmac: Fix a few typos · c1ec8f86

Sergei Shtylyov authored Aug 17, 2018

Remove the stray underscore in the DM_CM_DTRAN_MODE.BUS_WIDTH register
field name and fix the typo in the comment of the #define
DTRAN_MODE_CH_NUM_CH1.
Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Reviewed-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Simon Horman <horms+renesas@verge.net.au>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>

c1ec8f86

mmc: jz4740: Drop dependency on MACH_JZ4740/80 · 685bc885

Paul Cercueil authored Aug 21, 2018

Depending on MACH_JZ4740 | MACH_JZ4780 prevent us from creating a generic
kernel that works on more than one MIPS board. Instead, we just depend on
MIPS being set.
Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>

685bc885

mmc: dw_mmc: hi3798cv200: add MMC_CAP_CMD23 cap · ed3ae724

Igor Opaniuk authored Aug 20, 2018

Enable access to the RPMB on the on-board eMMC of the
Poplar board.
Signed-off-by: Igor Opaniuk <igor.opaniuk@linaro.org>
Acked-by: Shawn Guo <shawn.guo@linaro.org>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>

ed3ae724

mmc: renesas_sdhi: Add r8a774a1 support · 722c68a5

Fabrizio Castro authored Aug 14, 2018

Document RZ/G2M (R8A774A1) SoC bindings.
Signed-off-by: Fabrizio Castro <fabrizio.castro@bp.renesas.com>
Reviewed-by: Biju Das <biju.das@bp.renesas.com>
Reviewed-by: Rob Herring <robh@kernel.org>
Reviewed-by: Simon Horman <horms+renesas@verge.net.au>
Reviewed-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>

722c68a5

mmc: renesas_sdhi_internal_dmac: Whitelist r8a774a1 · 2e1501a8

Fabrizio Castro authored Aug 14, 2018

We need r8a774a1 to be whitelisted for SDHI to work on the RZ/G2M,
but we don't care about the revision of the SoC, so just whitelist
the generic part number.
Signed-off-by: Fabrizio Castro <fabrizio.castro@bp.renesas.com>
Reviewed-by: Biju Das <biju.das@bp.renesas.com>
Reviewed-by: Simon Horman <horms+renesas@verge.net.au>
Reviewed-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>

2e1501a8

mmc: sdhci-of-arasan: Do now show error message in case of deffered probe · 60208a26

Michal Simek authored Aug 06, 2018

When mmc-pwrseq property is passed mmc_pwrseq_alloc() can return
-EPROBE_DEFER because driver for power sequence provider is not probed
yet. Do not show error message when this situation happens.
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>

60208a26

mmc: sdhci-iproc: Add ACPI support · 7c7ba433

Srinath Mannam authored Aug 05, 2018

Add ACPI support to all IPROC SDHCI variants.
Signed-off-by: Srinath Mannam <srinath.mannam@broadcom.com>
Reviewed-by: Ray Jui <ray.jui@broadcom.com>
Reviewed-by: Scott Branden <scott.branden@broadcom.com>
Reviewed-by: Vladimir Olovyannikov <vladimir.olovyannikov@broadcom.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>

7c7ba433

mmc: sdhci-pltfm: Convert DT properties to generic device properties · 8199d312

Adrian Hunter authored Aug 05, 2018

Convert DT properties to generic device properties
so that drivers can get properties from DT or ACPI.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Srinath Mannam <srinath.mannam@broadcom.com>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>

8199d312

07 Oct, 2018 7 commits

Linux 4.19-rc7 · 0238df64
Greg Kroah-Hartman authored Oct 07, 2018

0238df64

Merge tag 'char-misc-4.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc · fb1c592c

Greg Kroah-Hartman authored Oct 07, 2018

I wrote:
  "Char/Misc fixes for 4.19-rc7

   Here are 8 small fixes for some char/misc driver issues

   Included here are:
	- fpga driver fixes
	- thunderbolt bugfixes
	- firmware core revert/fix
	- hv core fix
	- hv tool fix

   All of these have been in linux-next with no reported issues."

* tag 'char-misc-4.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
  thunderbolt: Initialize after IOMMUs
  thunderbolt: Do not handle ICM events after domain is stopped
  firmware: Always initialize the fw_priv list object
  docs: fpga: document fpga manager flags
  fpga: bridge: fix obvious function documentation error
  tools: hv: fcopy: set 'error' in case an unknown operation was requested
  fpga: do not access region struct after fpga_region_unregister
  Drivers: hv: vmbus: Use get/put_cpu() in vmbus_connect()

fb1c592c

Merge tag 'tty-4.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty · 4ebaf075

Greg Kroah-Hartman authored Oct 07, 2018

I wrote:
  "Serial driver fixes for 4.19-rc7

   Here are 3 small serial driver fixes for 4.19-rc7
    - 2 sh-sci bugfixes for reported issues
    - a revert of the PM handling for the 8250_dw code

   All of these have been in linux-next with no reported issues."

* tag 'tty-4.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
  Revert "serial: sh-sci: Allow for compressed SCIF address"
  Revert "serial: sh-sci: Remove SCIx_RZ_SCIFA_REGTYPE"
  Revert "serial: 8250_dw: Fix runtime PM handling"

4ebaf075

Merge tag 'usb-4.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb · cc02f852

Greg Kroah-Hartman authored Oct 07, 2018

I wrote:
  "USB fixes for 4.19-rc7

   Here are some small USB fixes for 4.19-rc7

   These include:
     - the usual xhci bugfixes for reported issues
     - some new serial driver device ids
     - bugfix for the option serial driver for some devices
     - bugfix for the cdc_acm driver that has been there for a long time.

   All of these have been in linux-next for a while with no reported
   issues."

* tag 'usb-4.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
  usb: xhci-mtk: resume USB3 roothub first
  xhci: Add missing CAS workaround for Intel Sunrise Point xHCI
  usb: cdc_acm: Do not leak URB buffers
  USB: serial: simple: add Motorola Tetra MTP6550 id
  USB: serial: option: add two-endpoints device-id flag
  USB: serial: option: improve Quectel EP06 detection

cc02f852

Merge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux · 055d8d9e

Greg Kroah-Hartman authored Oct 07, 2018

Wolfram writes:
  "i2c for 4.19

   I2C has three driver bugfixes and a fix for a typo for you."

* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
  i2c: designware: Call i2c_dw_clk_rate() only when calculating timings
  i2c: i2c-scmi: fix for i2c_smbus_write_block_data
  i2c: i2c-isch: fix spelling mistake "unitialized" -> "uninitialized"
  i2c: i2c-qcom-geni: Properly handle DMA safe buffers

055d8d9e

Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · 40fa9167

Greg Kroah-Hartman authored Oct 07, 2018

James writes:
  "SCSI fixes on 20181006

   Small fix for an unititialized mutex in the qedi driver."

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: qedi: Initialize the stats mutex lock

40fa9167

Merge tag 'powerpc-4.19-4' of https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux · cd2093cb

Greg Kroah-Hartman authored Oct 07, 2018

Michael writes:
  "powerpc fixes for 4.19 #4

   Four regression fixes.

   A fix for a change to lib/xz which broke our zImage loader when
   building with XZ compression. OK'ed by Herbert who merged the
   original patch.

   The recent fix we did to avoid patching __init text broke some 32-bit
   machines, fix that.

   Our show_user_instructions() could be tricked into printing kernel
   memory, add a check to avoid that.

   And a fix for a change to our NUMA initialisation logic, which causes
   crashes in some kdump configurations.

   Thanks to:
     Christophe Leroy, Hari Bathini, Jann Horn, Joel Stanley, Meelis
     Roos, Murilo Opsfelder Araujo, Srikar Dronamraju."

* tag 'powerpc-4.19-4' of https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/numa: Skip onlining a offline node in kdump path
  powerpc: Don't print kernel instructions in show_user_instructions()
  powerpc/lib: fix book3s/32 boot failure due to code patching
  lib/xz: Put CRC32_POLY_LE in xz_private.h

cd2093cb

06 Oct, 2018 1 commit

Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · c1d84a1b

Greg Kroah-Hartman authored Oct 06, 2018

Dave writes:
  "Networking fixes:

  1) Fix truncation of 32-bit right shift in bpf, from Jann Horn.

  2) Fix memory leak in wireless wext compat, from Stefan Seyfried.

  3) Use after free in cfg80211's reg_process_hint(), from Yu Zhao.

  4) Need to cancel pending work when unbinding in smsc75xx otherwise
     we oops, also from Yu Zhao.

  5) Don't allow enslaving a team device to itself, from Ido Schimmel.

  6) Fix backwards compat with older userspace for rtnetlink FDB dumps.
     From Mauricio Faria.

  7) Add validation of tc policy netlink attributes, from David Ahern.

  8) Fix RCU locking in rawv6_send_hdrinc(), from Wei Wang."

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (26 commits)
  net: mvpp2: Extract the correct ethtype from the skb for tx csum offload
  ipv6: take rcu lock in rawv6_send_hdrinc()
  net: sched: Add policy validation for tc attributes
  rtnetlink: fix rtnl_fdb_dump() for ndmsg header
  yam: fix a missing-check bug
  net: bpfilter: Fix type cast and pointer warnings
  net: cxgb3_main: fix a missing-check bug
  bpf: 32-bit RSH verification must truncate input before the ALU op
  net: phy: phylink: fix SFP interface autodetection
  be2net: don't flip hw_features when VXLANs are added/deleted
  net/packet: fix packet drop as of virtio gso
  net: dsa: b53: Keep CPU port as tagged in all VLANs
  openvswitch: load NAT helper
  bnxt_en: get the reduced max_irqs by the ones used by RDMA
  bnxt_en: free hwrm resources, if driver probe fails.
  bnxt_en: Fix enables field in HWRM_QUEUE_COS2BW_CFG request
  bnxt_en: Fix VNIC reservations on the PF.
  team: Forbid enslaving team device to itself
  net/usb: cancel pending work when unbinding smsc75xx
  mlxsw: spectrum: Delete RIF when VLAN device is removed
  ...

c1d84a1b

05 Oct, 2018 14 commits

Merge branch 'akpm' · 091a1eaa

Greg Kroah-Hartman authored Oct 05, 2018

* akpm:
  mm: madvise(MADV_DODUMP): allow hugetlbfs pages
  ocfs2: fix locking for res->tracking and dlm->tracking_list
  mm/vmscan.c: fix int overflow in callers of do_shrink_slab()
  mm/vmstat.c: skip NR_TLB_REMOTE_FLUSH* properly
  mm/vmstat.c: fix outdated vmstat_text
  proc: restrict kernel stack dumps to root
  mm/hugetlb: add mmap() encodings for 32MB and 512MB page sizes
  mm/migrate.c: split only transparent huge pages when allocation fails
  ipc/shm.c: use ERR_CAST() for shm_lock() error return
  mm/gup_benchmark: fix unsigned comparison to zero in __gup_benchmark_ioctl
  mm, thp: fix mlocking THP page with migration enabled
  ocfs2: fix crash in ocfs2_duplicate_clusters_by_page()
  hugetlb: take PMD sharing into account when flushing tlb/caches
  mm: migration: fix migration of huge PMD shared pages

091a1eaa

mm: madvise(MADV_DODUMP): allow hugetlbfs pages · d41aa525

Daniel Black authored Oct 05, 2018

Reproducer, assuming 2M of hugetlbfs available:

Hugetlbfs mounted, size=2M and option user=testuser

  # mount | grep ^hugetlbfs
  hugetlbfs on /dev/hugepages type hugetlbfs (rw,pagesize=2M,user=dan)
  # sysctl vm.nr_hugepages=1
  vm.nr_hugepages = 1
  # grep Huge /proc/meminfo
  AnonHugePages:         0 kB
  ShmemHugePages:        0 kB
  HugePages_Total:       1
  HugePages_Free:        1
  HugePages_Rsvd:        0
  HugePages_Surp:        0
  Hugepagesize:       2048 kB
  Hugetlb:            2048 kB

Code:

  #include <sys/mman.h>
  #include <stddef.h>
  #define SIZE 2*1024*1024
  int main()
  {
    void *ptr;
    ptr = mmap(NULL, SIZE, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_HUGETLB | MAP_ANONYMOUS, -1, 0);
    madvise(ptr, SIZE, MADV_DONTDUMP);
    madvise(ptr, SIZE, MADV_DODUMP);
  }

Compile and strace:

  mmap(NULL, 2097152, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_HUGETLB, -1, 0) = 0x7ff7c9200000
  madvise(0x7ff7c9200000, 2097152, MADV_DONTDUMP) = 0
  madvise(0x7ff7c9200000, 2097152, MADV_DODUMP) = -1 EINVAL (Invalid argument)

hugetlbfs pages have VM_DONTEXPAND in the VmFlags driver pages based on
author testing with analysis from Florian Weimer[1].

The inclusion of VM_DONTEXPAND into the VM_SPECIAL defination was a
consequence of the large useage of VM_DONTEXPAND in device drivers.

A consequence of [2] is that VM_DONTEXPAND marked pages are unable to be
marked DODUMP.

A user could quite legitimately madvise(MADV_DONTDUMP) their hugetlbfs
memory for a while and later request that madvise(MADV_DODUMP) on the same
memory.  We correct this omission by allowing madvice(MADV_DODUMP) on
hugetlbfs pages.

[1] https://stackoverflow.com/questions/52548260/madvisedodump-on-the-same-ptr-size-as-a-successful-madvisedontdump-fails-wit
[2] commit 0103bd16 ("mm: prepare VM_DONTDUMP for using in drivers")

Link: http://lkml.kernel.org/r/20180930054629.29150-1-daniel@linux.ibm.com
Link: https://lists.launchpad.net/maria-discuss/msg05245.html
Fixes: 0103bd16 ("mm: prepare VM_DONTDUMP for using in drivers")
Reported-by: Kenneth Penza <kpenza@gmail.com>
Signed-off-by: Daniel Black <daniel@linux.ibm.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Konstantin Khlebnikov <khlebnikov@openvz.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

d41aa525

ocfs2: fix locking for res->tracking and dlm->tracking_list · cbe355f5

Ashish Samant authored Oct 05, 2018

In dlm_init_lockres() we access and modify res->tracking and
dlm->tracking_list without holding dlm->track_lock.  This can cause list
corruptions and can end up in kernel panic.

Fix this by locking res->tracking and dlm->tracking_list with
dlm->track_lock instead of dlm->spinlock.

Link: http://lkml.kernel.org/r/1529951192-4686-1-git-send-email-ashish.samant@oracle.comSigned-off-by: Ashish Samant <ashish.samant@oracle.com>
Reviewed-by: Changwei Ge <ge.changwei@h3c.com>
Acked-by: Joseph Qi <jiangqi903@gmail.com>
Acked-by: Jun Piao <piaojun@huawei.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <ge.changwei@h3c.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

cbe355f5

mm/vmscan.c: fix int overflow in callers of do_shrink_slab() · b8e57efa

Kirill Tkhai authored Oct 05, 2018

do_shrink_slab() returns unsigned long value, and the placing into int
variable cuts high bytes off.  Then we compare ret and 0xfffffffe (since
SHRINK_EMPTY is converted to ret type).

Thus a large number of objects returned by do_shrink_slab() may be
interpreted as SHRINK_EMPTY, if low bytes of their value are equal to
0xfffffffe.  Fix that by declaration ret as unsigned long in these
functions.

Link: http://lkml.kernel.org/r/153813407177.17544.14888305435570723973.stgit@localhost.localdomainSigned-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Reported-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: Shakeel Butt <shakeelb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

b8e57efa

mm/vmstat.c: skip NR_TLB_REMOTE_FLUSH* properly · 58bc4c34

Jann Horn authored Oct 05, 2018

5dd0b16c ("mm/vmstat: Make NR_TLB_REMOTE_FLUSH_RECEIVED available even
on UP") made the availability of the NR_TLB_REMOTE_FLUSH* counters inside
the kernel unconditional to reduce #ifdef soup, but (either to avoid
showing dummy zero counters to userspace, or because that code was missed)
didn't update the vmstat_array, meaning that all following counters would
be shown with incorrect values.

This only affects kernel builds with
CONFIG_VM_EVENT_COUNTERS=y && CONFIG_DEBUG_TLBFLUSH=y && CONFIG_SMP=n.

Link: http://lkml.kernel.org/r/20181001143138.95119-2-jannh@google.com
Fixes: 5dd0b16c ("mm/vmstat: Make NR_TLB_REMOTE_FLUSH_RECEIVED available even on UP")
Signed-off-by: Jann Horn <jannh@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Roman Gushchin <guro@fb.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Christoph Lameter <clameter@sgi.com>
Cc: Kemi Wang <kemi.wang@intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

58bc4c34

mm/vmstat.c: fix outdated vmstat_text · 28e2c4bb

Jann Horn authored Oct 05, 2018

7a9cdebd ("mm: get rid of vmacache_flush_all() entirely") removed the
VMACACHE_FULL_FLUSHES statistics, but didn't remove the corresponding
entry in vmstat_text.  This causes an out-of-bounds access in
vmstat_show().

Luckily this only affects kernels with CONFIG_DEBUG_VM_VMACACHE=y, which
is probably very rare.

Link: http://lkml.kernel.org/r/20181001143138.95119-1-jannh@google.com
Fixes: 7a9cdebd ("mm: get rid of vmacache_flush_all() entirely")
Signed-off-by: Jann Horn <jannh@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Roman Gushchin <guro@fb.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Christoph Lameter <clameter@sgi.com>
Cc: Kemi Wang <kemi.wang@intel.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

28e2c4bb

proc: restrict kernel stack dumps to root · f8a00cef

Jann Horn authored Oct 05, 2018

Currently, you can use /proc/self/task/*/stack to cause a stack walk on
a task you control while it is running on another CPU.  That means that
the stack can change under the stack walker.  The stack walker does
have guards against going completely off the rails and into random
kernel memory, but it can interpret random data from your kernel stack
as instruction pointers and stack pointers.  This can cause exposure of
kernel stack contents to userspace.

Restrict the ability to inspect kernel stacks of arbitrary tasks to root
in order to prevent a local attacker from exploiting racy stack unwinding
to leak kernel task stack contents.  See the added comment for a longer
rationale.

There don't seem to be any users of this userspace API that can't
gracefully bail out if reading from the file fails.  Therefore, I believe
that this change is unlikely to break things.  In the case that this patch
does end up needing a revert, the next-best solution might be to fake a
single-entry stack based on wchan.

Link: http://lkml.kernel.org/r/20180927153316.200286-1-jannh@google.com
Fixes: 2ec220e2 ("proc: add /proc/*/stack")
Signed-off-by: Jann Horn <jannh@google.com>
Acked-by: Kees Cook <keescook@chromium.org>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Ken Chen <kenchen@google.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Laura Abbott <labbott@redhat.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H . Peter Anvin" <hpa@zytor.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

f8a00cef

mm/hugetlb: add mmap() encodings for 32MB and 512MB page sizes · 20916d46

Anshuman Khandual authored Oct 05, 2018

ARM64 architecture also supports 32MB and 512MB HugeTLB page sizes.  This
just adds mmap() system call argument encoding for them.

Link: http://lkml.kernel.org/r/1537841300-6979-1-git-send-email-anshuman.khandual@arm.comSigned-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Acked-by: Punit Agrawal <punit.agrawal@arm.com>
Acked-by: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

20916d46

mm/migrate.c: split only transparent huge pages when allocation fails · e6112fc3

Anshuman Khandual authored Oct 05, 2018

split_huge_page_to_list() fails on HugeTLB pages.  I was experimenting
with moving 32MB contig HugeTLB pages on arm64 (with a debug patch
applied) and hit the following stack trace when the kernel crashed.

[ 3732.462797] Call trace:
[ 3732.462835]  split_huge_page_to_list+0x3b0/0x858
[ 3732.462913]  migrate_pages+0x728/0xc20
[ 3732.462999]  soft_offline_page+0x448/0x8b0
[ 3732.463097]  __arm64_sys_madvise+0x724/0x850
[ 3732.463197]  el0_svc_handler+0x74/0x110
[ 3732.463297]  el0_svc+0x8/0xc
[ 3732.463347] Code: d1000400 f90b0e60 f2fbd5a2 a94982a1 (f9000420)

When unmap_and_move[_huge_page]() fails due to lack of memory, the
splitting should happen only for transparent huge pages not for HugeTLB
pages.  PageTransHuge() returns true for both THP and HugeTLB pages.
Hence the conditonal check should test PagesHuge() flag to make sure that
given pages is not a HugeTLB one.

Link: http://lkml.kernel.org/r/1537798495-4996-1-git-send-email-anshuman.khandual@arm.com
Fixes: 94723aaf ("mm: unclutter THP migration")
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Zi Yan <zi.yan@cs.rutgers.edu>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

e6112fc3

ipc/shm.c: use ERR_CAST() for shm_lock() error return · 59cf0a93

Kees Cook authored Oct 05, 2018

This uses ERR_CAST() instead of an open-coded cast, as it is casting
across structure pointers, which upsets __randomize_layout:

ipc/shm.c: In function `shm_lock':
ipc/shm.c:209:9: note: randstruct: casting between randomized structure pointer types (ssa): `struct shmid_kernel' and `struct kern_ipc_perm'

  return (void *)ipcp;
         ^~~~~~~~~~~~

Link: http://lkml.kernel.org/r/20180919180722.GA15073@beast
Fixes: 82061c57 ("ipc: drop ipc_lock()")
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Davidlohr Bueso <dbueso@suse.de>
Cc: Manfred Spraul <manfred@colorfullife.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

59cf0a93

mm/gup_benchmark: fix unsigned comparison to zero in __gup_benchmark_ioctl · 51896864

YueHaibing authored Oct 05, 2018

get_user_pages_fast() will return negative value if no pages were pinned,
then be converted to a unsigned, which is compared to zero, giving the
wrong result.

Link: http://lkml.kernel.org/r/20180921095015.26088-1-yuehaibing@huawei.com
Fixes: 09e35a4a ("mm/gup_benchmark: handle gup failures")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

51896864

mm, thp: fix mlocking THP page with migration enabled · e125fe40

Kirill A. Shutemov authored Oct 05, 2018

A transparent huge page is represented by a single entry on an LRU list.
Therefore, we can only make unevictable an entire compound page, not
individual subpages.

If a user tries to mlock() part of a huge page, we want the rest of the
page to be reclaimable.

We handle this by keeping PTE-mapped huge pages on normal LRU lists: the
PMD on border of VM_LOCKED VMA will be split into PTE table.

Introduction of THP migration breaks[1] the rules around mlocking THP
pages.  If we had a single PMD mapping of the page in mlocked VMA, the
page will get mlocked, regardless of PTE mappings of the page.

For tmpfs/shmem it's easy to fix by checking PageDoubleMap() in
remove_migration_pmd().

Anon THP pages can only be shared between processes via fork().  Mlocked
page can only be shared if parent mlocked it before forking, otherwise CoW
will be triggered on mlock().

For Anon-THP, we can fix the issue by munlocking the page on removing PTE
migration entry for the page.  PTEs for the page will always come after
mlocked PMD: rmap walks VMAs from oldest to newest.

Test-case:

	#include <unistd.h>
	#include <sys/mman.h>
	#include <sys/wait.h>
	#include <linux/mempolicy.h>
	#include <numaif.h>

	int main(void)
	{
	        unsigned long nodemask = 4;
	        void *addr;

		addr = mmap((void *)0x20000000UL, 2UL << 20, PROT_READ | PROT_WRITE,
			MAP_PRIVATE | MAP_ANONYMOUS | MAP_LOCKED, -1, 0);

	        if (fork()) {
			wait(NULL);
			return 0;
	        }

	        mlock(addr, 4UL << 10);
	        mbind(addr, 2UL << 20, MPOL_PREFERRED | MPOL_F_RELATIVE_NODES,
	                &nodemask, 4, MPOL_MF_MOVE);

	        return 0;
	}

[1] https://lkml.kernel.org/r/CAOMGZ=G52R-30rZvhGxEbkTw7rLLwBGadVYeo--iizcD3upL3A@mail.gmail.com

Link: http://lkml.kernel.org/r/20180917133816.43995-1-kirill.shutemov@linux.intel.com
Fixes: 616b8371 ("mm: thp: enable thp migration in generic path")
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reported-by: Vegard Nossum <vegard.nossum@oracle.com>
Reviewed-by: Zi Yan <zi.yan@cs.rutgers.edu>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: <stable@vger.kernel.org>	[4.14+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

e125fe40

ocfs2: fix crash in ocfs2_duplicate_clusters_by_page() · 69eb7765

Larry Chen authored Oct 05, 2018

ocfs2_duplicate_clusters_by_page() may crash if one of the extent's pages
is dirty.  When a page has not been written back, it is still in dirty
state.  If ocfs2_duplicate_clusters_by_page() is called against the dirty
page, the crash happens.

To fix this bug, we can just unlock the page and wait until the page until
its not dirty.

The following is the backtrace:

kernel BUG at /root/code/ocfs2/refcounttree.c:2961!
[exception RIP: ocfs2_duplicate_clusters_by_page+822]
__ocfs2_move_extent+0x80/0x450 [ocfs2]
? __ocfs2_claim_clusters+0x130/0x250 [ocfs2]
ocfs2_defrag_extent+0x5b8/0x5e0 [ocfs2]
__ocfs2_move_extents_range+0x2a4/0x470 [ocfs2]
ocfs2_move_extents+0x180/0x3b0 [ocfs2]
? ocfs2_wait_for_recovery+0x13/0x70 [ocfs2]
ocfs2_ioctl_move_extents+0x133/0x2d0 [ocfs2]
ocfs2_ioctl+0x253/0x640 [ocfs2]
do_vfs_ioctl+0x90/0x5f0
SyS_ioctl+0x74/0x80
do_syscall_64+0x74/0x140
entry_SYSCALL_64_after_hwframe+0x3d/0xa2

Once we find the page is dirty, we do not wait until it's clean, rather we
use write_one_page() to write it back

Link: http://lkml.kernel.org/r/20180829074740.9438-1-lchen@suse.com
[lchen@suse.com: update comments]
  Link: http://lkml.kernel.org/r/20180830075041.14879-1-lchen@suse.com
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Larry Chen <lchen@suse.com>
Acked-by: Changwei Ge <ge.changwei@h3c.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Joseph Qi <jiangqi903@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

69eb7765

hugetlb: take PMD sharing into account when flushing tlb/caches · dff11abe

Mike Kravetz authored Oct 05, 2018

When fixing an issue with PMD sharing and migration, it was discovered via
code inspection that other callers of huge_pmd_unshare potentially have an
issue with cache and tlb flushing.

Use the routine adjust_range_if_pmd_sharing_possible() to calculate worst
case ranges for mmu notifiers.  Ensure that this range is flushed if
huge_pmd_unshare succeeds and unmaps a PUD_SUZE area.

Link: http://lkml.kernel.org/r/20180823205917.16297-3-mike.kravetz@oracle.comSigned-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

dff11abe