Commits · 872e192fab643887f143106eb56443d87e5e87c1 · nexedi / linux

29 Mar, 2019 1 commit

scsi: qedi: remove declaration of nvm_image from stack · 872e192f

Colin Ian King authored Mar 27, 2019

The nvm_image is a large struct qedi_nvm_iscsi_image object of over 24K so
don't declare it on the stack just for a sizeof requirement; use sizeof on
struct qedi_nvm_iscsi_image instead.

Fixes: c77a2fa3 ("scsi: qedi: Add the CRC size within iSCSI NVM image")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Manish Rangankar <mrangankar@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

872e192f

28 Mar, 2019 10 commits

scsi: ibmvfc: Clean up transport events · d6e2635b

Tyrel Datwyler authored Mar 20, 2019

No change to functionality. Simply make transport event messages a little
clearer, and rework CRQ format enums such that we have separate enums for
INIT messages and XPORT events.

[mkp: typo]
Signed-off-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

d6e2635b

scsi: ibmvfc: Byte swap status and error codes when logging · 3e6f7de4

Tyrel Datwyler authored Mar 20, 2019

Status and error codes are returned in big endian from the VIOS. The values
are translated into a human readable format when logged, but the values are
also logged. This patch byte swaps those values so that they are consistent
between BE and LE platforms.
Signed-off-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

3e6f7de4

scsi: ibmvfc: Add failed PRLI to cmd_status lookup array · 95237c25

Tyrel Datwyler authored Mar 20, 2019

The VIOS uses the SCSI_ERROR class to report PRLI failures. These errors
are indicated with the combination of a IBMVFC_FC_SCSI_ERROR return status
and 0x8000 error code. Add these codes to cmd_status[] with appropriate
human readable error message.
Signed-off-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

95237c25

scsi: ibmvfc: Remove "failed" from logged errors · 6dc6a944

Tyrel Datwyler authored Mar 20, 2019

The text of messages logged with ibmvfc_log_error() always contain the term
"failed". In the case of cancelled commands during EH they are reported
back by the VIOS using error codes. This can be confusing to somebody
looking at these log messages as to whether a command was successfully
cancelled. The following real log message for example it is unclear if the
transaction was actaully cancelled.

<6>sd 0:0:1:1: Cancelling outstanding commands.
<3>sd 0:0:1:1: [sde] Command (28) failed: transaction cancelled (2:6) flags: 0 fcp_rsp: 0, resid=0, scsi_status: 0

Remove prefixing of "failed" to all error logged messages. The
ibmvfc_log_error() function translates the returned error/status codes to a
human readable message already.
Signed-off-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

6dc6a944

scsi: zfcp: reduce flood of fcrscn1 trace records on multi-element RSCN · c8206579

Steffen Maier authored Mar 26, 2019

If an incoming ELS of type RSCN contains more than one element, zfcp
suboptimally causes repeated erp trigger NOP trace records for each
previously failed port. These could be ports that went away.  It loops over
each RSCN element, and for each of those in an inner loop over all
zfcp_ports.

The trigger to recover failed ports should be just the reception of some
RSCN, no matter how many elements it has. So we can loop over failed ports
separately, and only then loop over each RSCN element to handle the
non-failed ports.

The call chain was:

  zfcp_fc_incoming_rscn
    for (i = 1; i < no_entries; i++)
      _zfcp_fc_incoming_rscn
        list_for_each_entry(port, &adapter->port_list, list)
          if (masked port->d_id match) zfcp_fc_test_link
          if (!port->d_id) zfcp_erp_port_reopen "fcrscn1"   <===

In order the reduce the "flooding" of the REC trace area in such cases, we
factor out handling the failed ports to be outside of the entries loop:

  zfcp_fc_incoming_rscn
    if (no_entries > 1)                                     <===
      list_for_each_entry(port, &adapter->port_list, list)  <===
        if (!port->d_id) zfcp_erp_port_reopen "fcrscn1"     <===
    for (i = 1; i < no_entries; i++)
      _zfcp_fc_incoming_rscn
        list_for_each_entry(port, &adapter->port_list, list)
          if (masked port->d_id match) zfcp_fc_test_link

Abbreviated example trace records before this code change:

Tag            : fcrscn1
WWPN           : 0x500507630310d327
ERP want       : 0x02
ERP need       : 0x02

Tag            : fcrscn1
WWPN           : 0x500507630310d327
ERP want       : 0x02
ERP need       : 0x00                 NOP => superfluous trace record

The last trace entry repeats if there are more than 2 RSCN elements.
Signed-off-by: Steffen Maier <maier@linux.ibm.com>
Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
Reviewed-by: Jens Remus <jremus@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

c8206579

scsi: zfcp: fix scsi_eh host reset with port_forced ERP for non-NPIV FCP devices · 242ec145

Steffen Maier authored Mar 26, 2019

Suppose more than one non-NPIV FCP device is active on the same channel.
Send I/O to storage and have some of the pending I/O run into a SCSI
command timeout, e.g. due to bit errors on the fibre. Now the error
situation stops. However, we saw FCP requests continue to timeout in the
channel. The abort will be successful, but the subsequent TUR fails.
Scsi_eh starts. The LUN reset fails. The target reset fails. The host
reset only did an FCP device recovery. However, for non-NPIV FCP devices,
this does not close and reopen ports on the SAN-side if other non-NPIV FCP
device(s) share the same open ports.

In order to resolve the continuing FCP request timeouts, we need to
explicitly close and reopen ports on the SAN-side.

This was missing since the beginning of zfcp in v2.6.0 history commit
ea127f97 ("[PATCH] s390 (7/7): zfcp host adapter.").

Note: The FSF requests for forced port reopen could run into FSF request
timeouts due to other reasons. This would trigger an internal FCP device
recovery. Pending forced port reopen recoveries would get dismissed. So
some ports might not get fully reopened during this host reset handler.
However, subsequent I/O would trigger the above described escalation and
eventually all ports would be forced reopen to resolve any continuing FCP
request timeouts due to earlier bit errors.
Signed-off-by: Steffen Maier <maier@linux.ibm.com>
Fixes: 1da177e4 ("Linux-2.6.12-rc2")
Cc: <stable@vger.kernel.org> #3.0+
Reviewed-by: Jens Remus <jremus@linux.ibm.com>
Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

242ec145

scsi: zfcp: fix rport unblock if deleted SCSI devices on Scsi_Host · fe67888f

Steffen Maier authored Mar 26, 2019

An already deleted SCSI device can exist on the Scsi_Host and remain there
because something still holds a reference.  A new SCSI device with the same
H:C:T:L and FCP device, target port WWPN, and FCP LUN can be created.  When
we try to unblock an rport, we still find the deleted SCSI device and
return early because the zfcp_scsi_dev of that SCSI device is not
ZFCP_STATUS_COMMON_UNBLOCKED. Hence we miss to unblock the rport, even if
the new proper SCSI device would be in good state.

Therefore, skip deleted SCSI devices when iterating the sdevs of the shost.
[cf. __scsi_device_lookup{_by_target}() or scsi_device_get()]

The following abbreviated trace sequence can indicate such problem:

Area           : REC
Tag            : ersfs_3
LUN            : 0x4045400300000000
WWPN           : 0x50050763031bd327
LUN status     : 0x40000000     not ZFCP_STATUS_COMMON_UNBLOCKED
Ready count    : n		not incremented yet
Running count  : 0x00000000
ERP want       : 0x01
ERP need       : 0xc1		ZFCP_ERP_ACTION_NONE

Area           : REC
Tag            : ersfs_3
LUN            : 0x4045400300000000
WWPN           : 0x50050763031bd327
LUN status     : 0x41000000
Ready count    : n+1
Running count  : 0x00000000
ERP want       : 0x01
ERP need       : 0x01

...

Area           : REC
Level          : 4		only with increased trace level
Tag            : ertru_l
LUN            : 0x4045400300000000
WWPN           : 0x50050763031bd327
LUN status     : 0x40000000
Request ID     : 0x0000000000000000
ERP status     : 0x01800000
ERP step       : 0x1000
ERP action     : 0x01
ERP count      : 0x00

NOT followed by a trace record with tag "scpaddy"
for WWPN 0x50050763031bd327.
Signed-off-by: Steffen Maier <maier@linux.ibm.com>
Fixes: 6f2ce1c6 ("scsi: zfcp: fix rport unblock race with LUN recovery")
Cc: <stable@vger.kernel.org> #2.6.32+
Reviewed-by: Jens Remus <jremus@linux.ibm.com>
Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

fe67888f

scsi: sd: Quiesce warning if device does not report optimal I/O size · 1d5de5bd

Martin K. Petersen authored Mar 27, 2019

Commit a83da8a4 ("scsi: sd: Optimal I/O size should be a multiple
of physical block size") split one conditional into several separate
statements in an effort to provide more accurate warning messages when
a device reports a nonsensical value. However, this reorganization
accidentally dropped the precondition of the reported value being
larger than zero. This lead to a warning getting emitted on devices
that do not report an optimal I/O size at all.

Remain silent if a device does not report an optimal I/O size.

Fixes: a83da8a4 ("scsi: sd: Optimal I/O size should be a multiple of physical block size")
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: <stable@vger.kernel.org>
Reported-by: Hussam Al-Tayeb <ht990332@gmx.com>
Tested-by: Hussam Al-Tayeb <ht990332@gmx.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

1d5de5bd

scsi: sd: Fix a race between closing an sd device and sd I/O · c14a5726

Bart Van Assche authored Mar 25, 2019

The scsi_end_request() function calls scsi_cmd_to_driver() indirectly and
hence needs the disk->private_data pointer. Avoid that that pointer is
cleared before all affected I/O requests have finished. This patch avoids
that the following crash occurs:

Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
Call trace:
 scsi_mq_uninit_cmd+0x1c/0x30
 scsi_end_request+0x7c/0x1b8
 scsi_io_completion+0x464/0x668
 scsi_finish_command+0xbc/0x160
 scsi_eh_flush_done_q+0x10c/0x170
 sas_scsi_recover_host+0x84c/0xa98 [libsas]
 scsi_error_handler+0x140/0x5b0
 kthread+0x100/0x12c
 ret_from_fork+0x10/0x18

Cc: Christoph Hellwig <hch@lst.de>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Johannes Thumshirn <jthumshirn@suse.de>
Cc: Jason Yan <yanaijie@huawei.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Reported-by: Jason Yan <yanaijie@huawei.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

c14a5726

scsi: core: Run queue when state is set to running after being blocked · 70fc085c

zhengbin authored Mar 22, 2019

Use dd to test a SCSI device:

  1. echo "blocked" >/sys/block/sda/device/state
  2. dd if=/dev/sda of=/mnt/t.log bs=1M count=10
  3. echo "running" >/sys/block/sda/device/state

dd should finish this work after step 3, but it hangs.

After step2, the call chain is this:

blk_mq_dispatch_rq_list-->scsi_queue_rq-->prep_to_mq

prep_to_mq will return BLK_STS_RESOURCE, and scsi_queue_rq will
transition it to BLK_STS_DEV_RESOURCE which means that driver can
guarantee that IO dispatch will be triggered in future when the
resource is available.  Need to follow the rule if we set the device
state to running.

[mkp: tweaked commit description and code comment as suggested by Bart]
Signed-off-by: zhengbin <zhengbin13@huawei.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

70fc085c

26 Mar, 2019 3 commits

scsi: qla4xxx: fix a potential NULL pointer dereference · fba1bdd2

Kangjie Lu authored Mar 14, 2019

In case iscsi_lookup_endpoint fails, the fix returns -EINVAL to avoid NULL
pointer dereference.
Signed-off-by: Kangjie Lu <kjlu@umn.edu>
Acked-by: Manish Rangankar <mrangankar@marvell.com>
Reviewed-by: Mukesh Ojha <mojha@codeaurora.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

fba1bdd2

scsi: aacraid: Insure we don't access PCIe space during AER/EEH · b6554cfe

Dave Carroll authored Mar 22, 2019

There are a few windows during AER/EEH when we can access PCIe I/O mapped
registers. This will harden the access to insure we do not allow PCIe
access during errors
Signed-off-by: Dave Carroll <david.carroll@microsemi.com>
Reviewed-by: Sagar Biradar <sagar.biradar@microchip.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

b6554cfe

scsi: mpt3sas: Fix kernel panic during expander reset · c2fe742f

Sreekanth Reddy authored Mar 04, 2019

During expander reset handling, the driver invokes kernel function
scsi_host_find_tag() to obtain outstanding requests associated with the
scsi host managed by the driver. Driver loops from tag value zero to hba
queue depth to obtain the outstanding scmds. But when blk-mq is enabled,
the block layer may return stale entry for one or more requests. This may
lead to kernel panic if the returned value is inaccessible or the memory
pointed by the returned value is reused.

Reference of upstream discussion:

	https://patchwork.kernel.org/patch/10734933/

Instead of calling scsi_host_find_tag() API for each and every smid (smid
is tag +1) from one to shost->can_queue, now driver will call this API (to
obtain the outstanding scmd) only for those smid's which are outstanding at
the driver level.

Driver will determine whether this smid is outstanding at driver level by
looking into it's corresponding MPI request frame, if its MPI request frame
is empty, then it means that this smid is free and does not need to call
scsi_host_find_tag() for it.  By doing this, driver will invoke
scsi_host_find_tag() for only those tags which are outstanding at the
driver level.

Driver will check whether particular MPI request frame is empty or not by
looking into the "DevHandle" field. If this field is zero then it means
that this MPI request is empty. For active MPI request DevHandle must be
non-zero.

Also driver will memset the MPI request frame once the corresponding scmd
is processed (i.e. just before calling
scmd->done function).
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

c2fe742f

21 Mar, 2019 2 commits

scsi: ibmvscsi: Fix empty event pool access during host removal · 7f5203c1

Tyrel Datwyler authored Mar 20, 2019

The event pool used for queueing commands is destroyed fairly early in the
ibmvscsi_remove() code path. Since, this happens prior to the call so
scsi_remove_host() it is possible for further calls to queuecommand to be
processed which manifest as a panic due to a NULL pointer dereference as
seen here:

PANIC: "Unable to handle kernel paging request for data at address
0x00000000"

Context process backtrace:

DSISR: 0000000042000000 ????Syscall Result: 0000000000000000
4 [c000000002cb3820] memcpy_power7 at c000000000064204
[Link Register] [c000000002cb3820] ibmvscsi_send_srp_event at d000000003ed14a4
5 [c000000002cb3920] ibmvscsi_send_srp_event at d000000003ed14a4 [ibmvscsi] ?(unreliable)
6 [c000000002cb39c0] ibmvscsi_queuecommand at d000000003ed2388 [ibmvscsi]
7 [c000000002cb3a70] scsi_dispatch_cmd at d00000000395c2d8 [scsi_mod]
8 [c000000002cb3af0] scsi_request_fn at d00000000395ef88 [scsi_mod]
9 [c000000002cb3be0] __blk_run_queue at c000000000429860
10 [c000000002cb3c10] blk_delay_work at c00000000042a0ec
11 [c000000002cb3c40] process_one_work at c0000000000dac30
12 [c000000002cb3cd0] worker_thread at c0000000000db110
13 [c000000002cb3d80] kthread at c0000000000e3378
14 [c000000002cb3e30] ret_from_kernel_thread at c00000000000982c

The kernel buffer log is overfilled with this log:

[11261.952732] ibmvscsi: found no event struct in pool!

This patch reorders the operations during host teardown. Start by calling
the SRP transport and Scsi_Host remove functions to flush any outstanding
work and set the host offline. LLDD teardown follows including destruction
of the event pool, freeing the Command Response Queue (CRQ), and unmapping
any persistent buffers. The event pool destruction is protected by the
scsi_host lock, and the pool is purged prior of any requests for which we
never received a response. Finally, move the removal of the scsi host from
our global list to the end so that the host is easily locatable for
debugging purposes during teardown.

Cc: <stable@vger.kernel.org> # v2.6.12+
Signed-off-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

7f5203c1

scsi: ibmvscsi: Protect ibmvscsi_head from concurrent modificaiton · 7205981e

Tyrel Datwyler authored Mar 20, 2019

For each ibmvscsi host created during a probe or destroyed during a remove
we either add or remove that host to/from the global ibmvscsi_head
list. This runs the risk of concurrent modification.

This patch adds a simple spinlock around the list modification calls to
prevent concurrent updates as is done similarly in the ibmvfc driver and
ipr driver.

Fixes: 32d6e4b6 ("scsi: ibmvscsi: add vscsi hosts to global list_head")
Cc: <stable@vger.kernel.org> # v4.10+
Signed-off-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

7205981e

20 Mar, 2019 1 commit

scsi: hisi_sas: Add softreset in hisi_sas_I_T_nexus_reset() · 0e83fc61

Luo Jiaxing authored Mar 20, 2019

We found out that for v2 hw, a SATA disk can not be written to after the
system comes up.

In commit ffb1c820 ("scsi: hisi_sas: remove the check of sas_dev status
in hisi_sas_I_T_nexus_reset()"), we introduced a path where we may issue an
internal abort for a SATA device, but without following it with a
softreset.

We need to always follow an internal abort with a software reset, as per HW
programming flow, so add this.

Fixes: ffb1c820 ("scsi: hisi_sas: remove the check of sas_dev status in hisi_sas_I_T_nexus_reset()")
Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

0e83fc61

19 Mar, 2019 4 commits

scsi: qla2xxx: Fix NULL pointer crash due to stale CPUID · ac444b4f

Himanshu Madhani authored Mar 15, 2019

This patch fixes crash due to NULL pointer derefrence because CPU pointer
is not set and used by driver.  Instead, driver is passes CPU as tag via
ha->isp_ops->{lun_reset|target_reset}

[   30.160780] qla2xxx [0000:a0:00.1]-8038:9: Cable is unplugged...
[   69.984045] qla2xxx [0000:a0:00.0]-8009:8: DEVICE RESET ISSUED nexus=8:0:0 cmd=00000000b0d62f46.
[   69.992849] BUG: unable to handle kernel NULL pointer dereference at 0000000000000040
[   70.000680] PGD 0 P4D 0
[   70.003232] Oops: 0000 [#1] SMP PTI
[   70.006727] CPU: 2 PID: 6714 Comm: sg_reset Kdump: loaded Not tainted 4.18.0-67.el8.x86_64 #1
[   70.015258] Hardware name: NEC Express5800/T110j [N8100-2758Y]/MX32-PH0-NJ, BIOS F11 02/13/2019
[   70.024016] RIP: 0010:blk_mq_rq_cpu+0x9/0x10
[   70.028315] Code: 01 58 01 00 00 48 83 c0 28 48 3d 80 02 00 00 75 ab c3 0f 1f 44 00 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48
 8b 47 08 <8b> 40 40 c3 0f 1f 00 0f 1f 44 00 00 48 83 ec 10 48 c7 c6 20 6e 7c
[   70.047087] RSP: 0018:ffff99a481487d58 EFLAGS: 00010246
[   70.052322] RAX: 0000000000000000 RBX: ffffffffc041b08b RCX: 0000000000000000
[   70.059466] RDX: 0000000000000000 RSI: ffff8d10b6b16898 RDI: ffff8d10b341e400
[   70.066615] RBP: ffffffffc03a6bd0 R08: 0000000000000415 R09: 0000000000aaaaaa
[   70.073765] R10: 0000000000000001 R11: 0000000000000001 R12: ffff8d10b341e528
[   70.080914] R13: ffff8d10aadefc00 R14: ffff8d0f64efa998 R15: ffff8d0f64efa000
[   70.088083] FS:  00007f90a201e540(0000) GS:ffff8d10b6b00000(0000) knlGS:0000000000000000
[   70.096188] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   70.101959] CR2: 0000000000000040 CR3: 0000000268886005 CR4: 00000000003606e0
[   70.109127] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   70.116277] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   70.123425] Call Trace:
[   70.125896]  __qla2xxx_eh_generic_reset+0xb1/0x220 [qla2xxx]
[   70.131572]  scsi_ioctl_reset+0x1f5/0x2a0
[   70.135600]  scsi_ioctl+0x18e/0x397
[   70.139099]  ? sd_ioctl+0x7c/0x100 [sd_mod]
[   70.143287]  blkdev_ioctl+0x32b/0x9f0
[   70.146954]  ? __check_object_size+0xa3/0x181
[   70.151323]  block_ioctl+0x39/0x40
[   70.154735]  do_vfs_ioctl+0xa4/0x630
[   70.158322]  ? syscall_trace_enter+0x1d3/0x2c0
[   70.162769]  ksys_ioctl+0x60/0x90
[   70.166104]  __x64_sys_ioctl+0x16/0x20
[   70.169859]  do_syscall_64+0x5b/0x1b0
[   70.173532]  entry_SYSCALL_64_after_hwframe+0x65/0xca
[   70.178587] RIP: 0033:0x7f90a1b3445b
[   70.182183] Code: 0f 1e fa 48 8b 05 2d aa 2c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00
 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d fd a9 2c 00 f7 d8 64 89 01 48
[   70.200956] RSP: 002b:00007fffdca88b68 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[   70.208535] RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007f90a1b3445b
[   70.215684] RDX: 00007fffdca88b84 RSI: 0000000000002284 RDI: 0000000000000003
[   70.222833] RBP: 00007fffdca88ca8 R08: 00007fffdca88b84 R09: 0000000000000000
[   70.229981] R10: 0000000000000000 R11: 0000000000000246 R12: 00007fffdca88b84
[   70.237131] R13: 0000000000000000 R14: 000055ab09b0bd28 R15: 0000000000000000
[   70.244284] Modules linked in: nft_chain_route_ipv4 xt_CHECKSUM nft_chain_nat_ipv4 ipt_MASQUERADE nf_nat_ipv4 nf_nat nf_conntrack_ipv4
 nf_defrag_ipv4 xt_conntrack nf_conntrack libcrc32c ipt_REJECT nf_reject_ipv4 nft_counter nft_compat tun bridge stp llc nf_tables nfnetli
nk devlink sunrpc vfat fat intel_rapl intel_pmc_core x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm wmi_bmof iTCO_wdt iTCO_
vendor_support irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel ipmi_ssif intel_cstate intel_uncore intel_rapl_perf ipmi_si jo
ydev pcspkr ipmi_devintf sg wmi ipmi_msghandler video acpi_power_meter acpi_pad mei_me i2c_i801 mei ip_tables ext4 mbcache jbd2 sr_mod cd
rom sd_mod qla2xxx ast i2c_algo_bit drm_kms_helper nvme_fc syscopyarea sysfillrect uas sysimgblt fb_sys_fops nvme_fabrics ttm
[   70.314805]  usb_storage nvme_core crc32c_intel scsi_transport_fc ahci drm libahci tg3 libata megaraid_sas pinctrl_cannonlake pinctrl_
intel
[   70.327335] CR2: 0000000000000040

Fixes: 9cf2bab6 ("block: kill request ->cpu member")
Signed-off-by: Himanshu Madhani <hmadhani@marvell.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

ac444b4f

scsi: qla2xxx: Fix FC-AL connection target discovery · 4705f10e

Quinn Tran authored Mar 15, 2019

Commit 7f147f9b ("scsi: qla2xxx: Fix N2N target discovery with Local
loop") fixed N2N target discovery for local loop.  However, same code is
used for FC-AL discovery as well. Added check to make sure we are bypassing
area and domain check only in N2N topology for target discovery.

Fixes: 7f147f9b ("scsi: qla2xxx: Fix N2N target discovery with Local loop")
Cc: stable@vger.kernel.org # 5.0+
Signed-off-by: Quinn Tran <qtran@marvell.com>
Signed-off-by: Himanshu Madhani <hmadhani@marvell.com>
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

4705f10e

scsi: core: Avoid that a kernel warning appears during system resume · 17605afa

Bart Van Assche authored Mar 15, 2019

Since scsi_device_quiesce() skips SCSI devices that have another state than
RUNNING, OFFLINE or TRANSPORT_OFFLINE, scsi_device_resume() should not
complain about SCSI devices that have been skipped. Hence this patch.  This
patch avoids that the following warning appears during resume:

WARNING: CPU: 3 PID: 1039 at blk_clear_pm_only+0x2a/0x30
CPU: 3 PID: 1039 Comm: kworker/u8:49 Not tainted 5.0.0+ #1
Hardware name: LENOVO 4180F42/4180F42, BIOS 83ET75WW (1.45 ) 05/10/2013
Workqueue: events_unbound async_run_entry_fn
RIP: 0010:blk_clear_pm_only+0x2a/0x30
Call Trace:
 ? scsi_device_resume+0x28/0x50
 ? scsi_dev_type_resume+0x2b/0x80
 ? async_run_entry_fn+0x2c/0xd0
 ? process_one_work+0x1f0/0x3f0
 ? worker_thread+0x28/0x3c0
 ? process_one_work+0x3f0/0x3f0
 ? kthread+0x10c/0x130
 ? __kthread_create_on_node+0x150/0x150
 ? ret_from_fork+0x1f/0x30

Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Johannes Thumshirn <jthumshirn@suse.de>
Cc: Oleksandr Natalenko <oleksandr@natalenko.name>
Cc: Martin Steigerwald <martin@lichtvoll.de>
Cc: <stable@vger.kernel.org>
Reported-by: Jisheng Zhang <Jisheng.Zhang@synaptics.com>
Tested-by: Jisheng Zhang <Jisheng.Zhang@synaptics.com>
Fixes: 3a0a5299 ("block, scsi: Make SCSI quiesce and resume work reliably") # v4.15
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

17605afa

scsi: core: Also call destroy_rcu_head() for passthrough requests · db983f6e

Bart Van Assche authored Mar 18, 2019

cmd->rcu is initialized by scsi_initialize_rq(). For passthrough
requests, blk_get_request() calls scsi_initialize_rq(). For filesystem
requests, scsi_init_command() calls scsi_initialize_rq(). Make sure
that destroy_rcu_head() is called for passthrough requests.

Cc: Christoph Hellwig <hch@lst.de>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Ewan D. Milne <emilne@redhat.com>
Cc: Johannes Thumshirn <jthumshirn@suse.de>
Reported-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

db983f6e

18 Mar, 2019 1 commit

scsi: iscsi: flush running unbind operations when removing a session · 165aa2bf

Maurizio Lombardi authored Jan 28, 2019

In some cases, the iscsi_remove_session() function is called while an
unbind_work operation is still running.  This may cause a situation where
sysfs objects are removed in an incorrect order, triggering a kernel
warning.

[  605.249442] ------------[ cut here ]------------
[  605.259180] sysfs group 'power' not found for kobject 'target2:0:0'
[  605.321371] WARNING: CPU: 1 PID: 26794 at fs/sysfs/group.c:235 sysfs_remove_group+0x76/0x80
[  605.341266] Modules linked in: dm_service_time target_core_user target_core_pscsi target_core_file target_core_iblock iscsi_target_mod target_core_mod nls_utf8 isofs ppdev bochs_drm nfit ttm libnvdimm drm_kms_helper syscopyarea sysfillrect sysimgblt joydev pcspkr fb_sys_fops drm i2c_piix4 sg parport_pc parport xfs libcrc32c dm_multipath sr_mod sd_mod cdrom ata_generic 8021q garp mrp ata_piix stp crct10dif_pclmul crc32_pclmul llc libata crc32c_intel virtio_net net_failover ghash_clmulni_intel serio_raw failover sunrpc dm_mirror dm_region_hash dm_log dm_mod be2iscsi bnx2i cnic uio cxgb4i cxgb4 libcxgbi libcxgb qla4xxx iscsi_boot_sysfs iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi
[  605.627479] CPU: 1 PID: 26794 Comm: kworker/u32:2 Not tainted 4.18.0-60.el8.x86_64 #1
[  605.721401] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20180724_192412-buildhw-07.phx2.fedoraproject.org-1.fc29 04/01/2014
[  605.823651] Workqueue: scsi_wq_2 __iscsi_unbind_session [scsi_transport_iscsi]
[  605.830940] RIP: 0010:sysfs_remove_group+0x76/0x80
[  605.922907] Code: 48 89 df 5b 5d 41 5c e9 38 c4 ff ff 48 89 df e8 e0 bf ff ff eb cb 49 8b 14 24 48 8b 75 00 48 c7 c7 38 73 cb a7 e8 24 77 d7 ff <0f> 0b 5b 5d 41 5c c3 0f 1f 00 0f 1f 44 00 00 41 56 41 55 41 54 55
[  606.122304] RSP: 0018:ffffbadcc8d1bda8 EFLAGS: 00010286
[  606.218492] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
[  606.326381] RDX: ffff98bdfe85eb40 RSI: ffff98bdfe856818 RDI: ffff98bdfe856818
[  606.514498] RBP: ffffffffa7ab73e0 R08: 0000000000000268 R09: 0000000000000007
[  606.529469] R10: 0000000000000000 R11: ffffffffa860d9ad R12: ffff98bdf978e838
[  606.630535] R13: ffff98bdc2cd4010 R14: ffff98bdc2cd3ff0 R15: ffff98bdc2cd4000
[  606.824707] FS:  0000000000000000(0000) GS:ffff98bdfe840000(0000) knlGS:0000000000000000
[  607.018333] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  607.117844] CR2: 00007f84b78ac024 CR3: 000000002c00a003 CR4: 00000000003606e0
[  607.117844] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  607.420926] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  607.524236] Call Trace:
[  607.530591]  device_del+0x56/0x350
[  607.624393]  ? ata_tlink_match+0x30/0x30 [libata]
[  607.727805]  ? attribute_container_device_trigger+0xb4/0xf0
[  607.829911]  scsi_target_reap_ref_release+0x39/0x50
[  607.928572]  scsi_remove_target+0x1a2/0x1d0
[  608.017350]  __iscsi_unbind_session+0xb3/0x160 [scsi_transport_iscsi]
[  608.117435]  process_one_work+0x1a7/0x360
[  608.132917]  worker_thread+0x30/0x390
[  608.222900]  ? pwq_unbound_release_workfn+0xd0/0xd0
[  608.323989]  kthread+0x112/0x130
[  608.418318]  ? kthread_bind+0x30/0x30
[  608.513821]  ret_from_fork+0x35/0x40
[  608.613909] ---[ end trace 0b98c310c8a6138c ]---
Signed-off-by: Maurizio Lombardi <mlombard@redhat.com>
Acked-by: Chris Leech <cleech@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

165aa2bf

17 Mar, 2019 14 commits

Linux 5.1-rc1 · 9e98c678
Linus Torvalds authored Mar 17, 2019

9e98c678

Merge tag 'kbuild-v5.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild · 28d747f2

Linus Torvalds authored Mar 17, 2019

Pull more Kbuild updates from Masahiro Yamada:

 - add more Build-Depends to Debian source package

 - prefix header search paths with $(srctree)/

 - make modpost show verbose section mismatch warnings

 - avoid hard-coded CROSS_COMPILE for h8300

 - fix regression for Debian make-kpkg command

 - add semantic patch to detect missing put_device()

 - fix some warnings of 'make deb-pkg'

 - optimize NOSTDINC_FLAGS evaluation

 - add warnings about redundant generic-y

 - clean up Makefiles and scripts

* tag 'kbuild-v5.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
  kconfig: remove stale lxdialog/.gitignore
  kbuild: force all architectures except um to include mandatory-y
  kbuild: warn redundant generic-y
  Revert "modsign: Abort modules_install when signing fails"
  kbuild: Make NOSTDINC_FLAGS a simply expanded variable
  kbuild: deb-pkg: avoid implicit effects
  coccinelle: semantic code search for missing put_device()
  kbuild: pkg: grep include/config/auto.conf instead of $KCONFIG_CONFIG
  kbuild: deb-pkg: introduce is_enabled and if_enabled_echo to builddeb
  kbuild: deb-pkg: add CONFIG_ prefix to kernel config options
  kbuild: add workaround for Debian make-kpkg
  kbuild: source include/config/auto.conf instead of ${KCONFIG_CONFIG}
  unicore32: simplify linker script generation for decompressor
  h8300: use cc-cross-prefix instead of hardcoding h8300-unknown-linux-
  kbuild: move archive command to scripts/Makefile.lib
  modpost: always show verbose warning for section mismatch
  ia64: prefix header search path with $(srctree)/
  libfdt: prefix header search paths with $(srctree)/
  deb-pkg: generate correct build dependencies

28d747f2

Merge branch 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 80b98e92

Linus Torvalds authored Mar 17, 2019

Pull x86 asm updates from Thomas Gleixner:
 "Two cleanup patches removing dead conditionals and unused code"

* 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/asm: Remove unused __constant_c_x_memset() macro and inlines
  x86/asm: Remove dead __GNUC__ conditionals

80b98e92

Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · 69ebf9a1

Linus Torvalds authored Mar 17, 2019

Pull perf fixes from Thomas Gleixner:
 "Three fixes for the fallout from the TSX errata workaround:

   - Prevent memory corruption caused by a unchecked out of bound array
     index.

   - Two trivial fixes to address compiler warnings"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/x86/intel: Make dev_attr_allow_tsx_force_abort static
  perf/x86: Fixup typo in stub functions
  perf/x86/intel: Fix memory corruption

69ebf9a1

Merge tag 'for-linus-5.1b-rc1b-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip · c5b5138c

Linus Torvalds authored Mar 17, 2019

Pull xen fix from Juergen Gross:
 "A fix for a Xen bug introduced by David's series for excluding
  ballooned pages in vmcores"

* tag 'for-linus-5.1b-rc1b-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  xen/balloon: Fix mapping PG_offline pages to user space

c5b5138c

Merge tag '9p-for-5.1' of git://github.com/martinetd/linux · db77bef5

Linus Torvalds authored Mar 17, 2019

Pull 9p updates from Dominique Martinet:
 "Here is a 9p update for 5.1; there honestly hasn't been much.

  Two fixes (leak on invalid mount argument and possible deadlock on
  i_size update on 32bit smp) and a fall-through warning cleanup"

* tag '9p-for-5.1' of git://github.com/martinetd/linux:
  9p/net: fix memory leak in p9_client_create
  9p: use inode->i_lock to protect i_size_write() under 32-bit
  9p: mark expected switch fall-through

db77bef5

perf/x86/intel: Make dev_attr_allow_tsx_force_abort static · c634dc6b

kbuild test robot authored Mar 14, 2019

Fixes: 400816f6 ("perf/x86/intel: Implement support for TSX Force Abort")
Signed-off-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
Cc: kbuild-all@01.org
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20190313184243.GA10820@lkp-sb-ep06

c634dc6b

kconfig: remove stale lxdialog/.gitignore · c71bb9f8

Masahiro Yamada authored Mar 17, 2019

When this .gitignore was added, lxdialog was an independent hostprogs-y.

Now that all objects in lxdialog/ are directly linked to mconf, the
lxdialog is no longer generated.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>

c71bb9f8

kbuild: force all architectures except um to include mandatory-y · 037fc336

Masahiro Yamada authored Mar 17, 2019

Currently, every arch/*/include/uapi/asm/Kbuild explicitly includes
the common Kbuild.asm file. Factor out the duplicated include directives
to scripts/Makefile.asm-generic so that no architecture would opt out
of the mandatory-y mechanism.

um is not forced to include mandatory-y since it is a very exceptional
case which does not support UAPI.
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>

037fc336

kbuild: warn redundant generic-y · 7cbbbb8b

Masahiro Yamada authored Mar 17, 2019

The generic-y is redundant under the following condition:

 - arch has its own implementation

 - the same header is added to generated-y

 - the same header is added to mandatory-y

If a redundant generic-y is found, the warning like follows is displayed:

  scripts/Makefile.asm-generic:20: redundant generic-y found in arch/arm/include/asm/Kbuild: timex.h

I fixed up arch Kbuild files found by this.
Suggested-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>

7cbbbb8b

Revert "modsign: Abort modules_install when signing fails" · f84dde10

Douglas Anderson authored Mar 15, 2019

This reverts commit caf6fe91.

The commit was fine but is no longer needed as of commit 3a2429e1
("kbuild: change if_changed_rule for multi-line recipe").  Let's go
back to using ";" to be consistent.

For some discussion, see:

https://lkml.kernel.org/r/CAK7LNASde0Q9S5GKeQiWhArfER4S4wL1=R_FW8q0++_X3T5=hQ@mail.gmail.comSigned-off-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>

f84dde10

kbuild: Make NOSTDINC_FLAGS a simply expanded variable · 0c22be07

Douglas Anderson authored Mar 14, 2019

During a simple no-op (nothing changed) build I saw 39 invocations of
the C compiler with the argument "-print-file-name=include".  We don't
need to call the C compiler 39 times for this--one time will suffice.

Let's change NOSTDINC_FLAGS to a simply expanded variable to avoid
this since there doesn't appear to be any reason it should be
recursively expanded.

On my build this shaved ~400 ms off my "no-op" build.

Note that the recursive expansion seems to date back to the (really
old) commit e8f5bdb0 ("[PATCH] Makefile include path ordering").
It's a little unclear to me if the point of that patch was to switch
the variable to be recursively expanded (which it did) or to avoid
directly assigning to NOSTDINC_FLAGS (AKA to switch to +=) because
someone else (out of tree?) was setting it.  I presume later since if
the only goal was to switch to recursive expansion the patch would
have just removed the ":".
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>

0c22be07

kbuild: deb-pkg: avoid implicit effects · f6d9db63

Arseny Maslennikov authored Mar 09, 2019

* The man page for dpkg-source(1) notes:

>      -b, --build directory [format-specific-parameters]
>             Build  a  source  package  (--build since dpkg 1.17.14).
>             <...>
>
>             dpkg-source will build the source package with the first
>             format found in this ordered list: the format  indicated
>             with  the  --format  command  line  option,  the  format
>             indicated in debian/source/format, “1.0”.  The  fallback
>             to “1.0” is deprecated and will be removed at some point
>             in the future, you should always  document  the  desired
>             source   format  in  debian/source/format.  See  section
>             SOURCE PACKAGE FORMATS for an extensive  description  of
>             the various source package formats.

  Thus it would be more foolproof to explicitly use 1.0 (as we always
  did) than to rely on dpkg-source's defaults.

* In a similar vein, debian/rules is not made executable by mkdebian,
  and dpkg-source warns about that but still silently fixes the file.
  Let's be explicit once again.
Signed-off-by: Arseny Maslennikov <ar@cs.msu.ru>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>

f6d9db63

coccinelle: semantic code search for missing put_device() · da9cfb87

Wen Yang authored Feb 15, 2019

The of_find_device_by_node() takes a reference to the underlying device
structure, we should release that reference.
The implementation of this semantic code search is:
In a function, for a local variable returned by calling
of_find_device_by_node(),
a, if it is released by a function such as
   put_device()/of_dev_put()/platform_device_put() after the last use,
   it is considered that there is no reference leak;
b, if it is passed back to the caller via
   dev_get_drvdata()/platform_get_drvdata()/get_device(), etc., the
   reference will be released in other functions, and the current function
   also considers that there is no reference leak;
c, for the rest of the situation, the current function should release the
   reference by calling put_device, this code search will report the
   corresponding error message.

By using this semantic code search, we have found some object reference leaks,
such as:
commit 11907e9d ("ASoC: fsl-asoc-card: fix object reference leaks in
fsl_asoc_card_probe")
commit a12085d1 ("mtd: rawnand: atmel: fix possible object reference leak")
commit 11493f26 ("mtd: rawnand: jz4780: fix possible object reference leak")

There are still dozens of reference leaks in the current kernel code.

Further, for the case of b, the object returned to other functions may also
have a reference leak, we will continue to develop other cocci scripts to
further check the reference leak.
Signed-off-by: Wen Yang <wen.yang99@zte.com.cn>
Reviewed-by: Julia Lawall <Julia.Lawall@lip6.fr>
Reviewed-by: Markus Elfring <Markus.Elfring@web.de>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>

da9cfb87

16 Mar, 2019 4 commits

Merge tag 'pidfd-v5.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux · a9dce667

Linus Torvalds authored Mar 16, 2019

Pull pidfd system call from Christian Brauner:
 "This introduces the ability to use file descriptors from /proc/<pid>/
  as stable handles on struct pid. Even if a pid is recycled the handle
  will not change. For a start these fds can be used to send signals to
  the processes they refer to.

  With the ability to use /proc/<pid> fds as stable handles on struct
  pid we can fix a long-standing issue where after a process has exited
  its pid can be reused by another process. If a caller sends a signal
  to a reused pid it will end up signaling the wrong process.

  With this patchset we enable a variety of use cases. One obvious
  example is that we can now safely delegate an important part of
  process management - sending signals - to processes other than the
  parent of a given process by sending file descriptors around via scm
  rights and not fearing that the given process will have been recycled
  in the meantime. It also allows for easy testing whether a given
  process is still alive or not by sending signal 0 to a pidfd which is
  quite handy.

  There has been some interest in this feature e.g. from systems
  management (systemd, glibc) and container managers. I have requested
  and gotten comments from glibc to make sure that this syscall is
  suitable for their needs as well. In the future I expect it to take on
  most other pid-based signal syscalls. But such features are left for
  the future once they are needed.

  This has been sitting in linux-next for quite a while and has not
  caused any issues. It comes with selftests which verify basic
  functionality and also test that a recycled pid cannot be signaled via
  a pidfd.

  Jon has written about a prior version of this patchset. It should
  cover the basic functionality since not a lot has changed since then:

      https://lwn.net/Articles/773459/

  The commit message for the syscall itself is extensively documenting
  the syscall, including it's functionality and extensibility"

* tag 'pidfd-v5.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
  selftests: add tests for pidfd_send_signal()
  signal: add pidfd_send_signal() syscall

a9dce667

Merge tag 'devdax-for-5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm · f67e3fb4

Linus Torvalds authored Mar 16, 2019

Pull device-dax updates from Dan Williams:
 "New device-dax infrastructure to allow persistent memory and other
  "reserved" / performance differentiated memories, to be assigned to
  the core-mm as "System RAM".

  Some users want to use persistent memory as additional volatile
  memory. They are willing to cope with potential performance
  differences, for example between DRAM and 3D Xpoint, and want to use
  typical Linux memory management apis rather than a userspace memory
  allocator layered over an mmap() of a dax file. The administration
  model is to decide how much Persistent Memory (pmem) to use as System
  RAM, create a device-dax-mode namespace of that size, and then assign
  it to the core-mm. The rationale for device-dax is that it is a
  generic memory-mapping driver that can be layered over any "special
  purpose" memory, not just pmem. On subsequent boots udev rules can be
  used to restore the memory assignment.

  One implication of using pmem as RAM is that mlock() no longer keeps
  data off persistent media. For this reason it is recommended to enable
  NVDIMM Security (previously merged for 5.0) to encrypt pmem contents
  at rest. We considered making this recommendation an actively enforced
  requirement, but in the end decided to leave it as a distribution /
  administrator policy to allow for emulation and test environments that
  lack security capable NVDIMMs.

  Summary:

   - Replace the /sys/class/dax device model with /sys/bus/dax, and
     include a compat driver so distributions can opt-in to the new ABI.

   - Allow for an alternative driver for the device-dax address-range

   - Introduce the 'kmem' driver to hotplug / assign a device-dax
     address-range to the core-mm.

   - Arrange for the device-dax target-node to be onlined so that the
     newly added memory range can be uniquely referenced by numa apis"

NOTE! I'm not entirely happy with the whole "PMEM as RAM" model because
we currently have special - and very annoying rules in the kernel about
accessing PMEM only with the "MC safe" accessors, because machine checks
inside the regular repeat string copy functions can be fatal in some
(not described) circumstances.

And apparently the PMEM modules can cause that a lot more than regular
RAM.  The argument is that this happens because PMEM doesn't necessarily
get scrubbed at boot like RAM does, but that is planned to be added for
the user space tooling.

Quoting Dan from another email:
 "The exposure can be reduced in the volatile-RAM case by scanning for
  and clearing errors before it is onlined as RAM. The userspace tooling
  for that can be in place before v5.1-final. There's also runtime
  notifications of errors via acpi_nfit_uc_error_notify() from
  background scrubbers on the DIMM devices. With that mechanism the
  kernel could proactively clear newly discovered poison in the volatile
  case, but that would be additional development more suitable for v5.2.

  I understand the concern, and the need to highlight this issue by
  tapping the brakes on feature development, but I don't see PMEM as RAM
  making the situation worse when the exposure is also there via DAX in
  the PMEM case. Volatile-RAM is arguably a safer use case since it's
  possible to repair pages where the persistent case needs active
  application coordination"

* tag 'devdax-for-5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
  device-dax: "Hotplug" persistent memory for use like normal RAM
  mm/resource: Let walk_system_ram_range() search child resources
  mm/memory-hotplug: Allow memory resources to be children
  mm/resource: Move HMM pr_debug() deeper into resource code
  mm/resource: Return real error codes from walk failures
  device-dax: Add a 'modalias' attribute to DAX 'bus' devices
  device-dax: Add a 'target_node' attribute
  device-dax: Auto-bind device after successful new_id
  acpi/nfit, device-dax: Identify differentiated memory with a unique numa-node
  device-dax: Add /sys/class/dax backwards compatibility
  device-dax: Add support for a dax override driver
  device-dax: Move resource pinning+mapping into the common driver
  device-dax: Introduce bus + driver model
  device-dax: Start defining a dax bus model
  device-dax: Remove multi-resource infrastructure
  device-dax: Kill dax_region base
  device-dax: Kill dax_region ida

f67e3fb4

Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · 477558d7

Linus Torvalds authored Mar 16, 2019

Pull more SCSI updates from James Bottomley:
 "This is the final round of mostly small fixes and performance
  improvements to our initial submit.

  The main regression fix is the ia64 simscsi build failure which was
  missed in the serial number elimination conversion"

* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (24 commits)
  scsi: ia64: simscsi: use request tag instead of serial_number
  scsi: aacraid: Fix performance issue on logical drives
  scsi: lpfc: Fix error codes in lpfc_sli4_pci_mem_setup()
  scsi: libiscsi: Hold back_lock when calling iscsi_complete_task
  scsi: hisi_sas: Change SERDES_CFG init value to increase reliability of HiLink
  scsi: hisi_sas: Send HARD RESET to clear the previous affiliation of STP target port
  scsi: hisi_sas: Set PHY linkrate when disconnected
  scsi: hisi_sas: print PHY RX errors count for later revision of v3 hw
  scsi: hisi_sas: Fix a timeout race of driver internal and SMP IO
  scsi: hisi_sas: Change return variable type in phy_up_v3_hw()
  scsi: qla2xxx: check for kstrtol() failure
  scsi: lpfc: fix 32-bit format string warning
  scsi: lpfc: fix unused variable warning
  scsi: target: tcmu: Switch to bitmap_zalloc()
  scsi: libiscsi: fall back to sendmsg for slab pages
  scsi: qla2xxx: avoid printf format warning
  scsi: lpfc: resolve static checker warning in lpfc_sli4_hba_unset
  scsi: lpfc: Correct __lpfc_sli_issue_iocb_s4 lockdep check
  scsi: ufs: hisi: fix ufs_hba_variant_ops passing
  scsi: qla2xxx: Fix panic in qla_dfs_tgt_counters_show
  ...

477558d7

Merge tag 'for-5.1/block-post-20190315' of git://git.kernel.dk/linux-block · 11efae35

Linus Torvalds authored Mar 16, 2019

Pull more block layer changes from Jens Axboe:
 "This is a collection of both stragglers, and fixes that came in after
  I finalized the initial pull. This contains:

   - An MD pull request from Song, with a few minor fixes

   - Set of NVMe patches via Christoph

   - Pull request from Konrad, with a few fixes for xen/blkback

   - pblk fix IO calculation fix (Javier)

   - Segment calculation fix for pass-through (Ming)

   - Fallthrough annotation for blkcg (Mathieu)"

* tag 'for-5.1/block-post-20190315' of git://git.kernel.dk/linux-block: (25 commits)
  blkcg: annotate implicit fall through
  nvme-tcp: support C2HData with SUCCESS flag
  nvmet: ignore EOPNOTSUPP for discard
  nvme: add proper write zeroes setup for the multipath device
  nvme: add proper discard setup for the multipath device
  nvme: remove nvme_ns_config_oncs
  nvme: disable Write Zeroes for qemu controllers
  nvmet-fc: bring Disconnect into compliance with FC-NVME spec
  nvmet-fc: fix issues with targetport assoc_list list walking
  nvme-fc: reject reconnect if io queue count is reduced to zero
  nvme-fc: fix numa_node when dev is null
  nvme-fc: use nr_phys_segments to determine existence of sgl
  nvme-loop: init nvmet_ctrl fatal_err_work when allocate
  nvme: update comment to make the code easier to read
  nvme: put ns_head ref if namespace fails allocation
  nvme-trace: fix cdw10 buffer overrun
  nvme: don't warn on block content change effects
  nvme: add get-feature to admin cmds tracer
  md: Fix failed allocation of md_register_thread
  It's wrong to add len to sector_nr in raid10 reshape twice
  ...

11efae35