Commits · 0d5efb8a3c62c598d092bedb579debacd3f746eb · nexedi / linux

09 Feb, 2017 18 commits

target/iscsi: Fix spelling of "perform" · 0d5efb8a

Bart Van Assche authored Dec 23, 2016

Change two occurrences of "preform" into "perform".
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Nicholas A. Bellinger <nab@linux-iscsi.org>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>

0d5efb8a

target/iscsi: Fix indentation in iscsi_target_start_negotiation() · 1efaa949

Bart Van Assche authored Dec 23, 2016

This patch avoids that smatch complains about inconsistent
indentation in iscsi_target_start_negotiation().
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Nicholas A. Bellinger <nab@linux-iscsi.org>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>

1efaa949

target/tcm_fc: Remove a set-but-not-used variable · 664a722b

Bart Van Assche authored Jan 04, 2017

This was detected by building with W=1.
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Johannes Thumshirn <jth@kernel.org>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>

664a722b

target/cxgbit: Use T6 specific macro to set the force bit · bdec5188

Varun Prakash authored Jan 24, 2017

For T6 adapters use T6 specific macro to set the force bit.
Signed-off-by: Varun Prakash <varun@chelsio.com>
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>

bdec5188

target/cxgbit: Fix endianness annotations · 5cadafb2

Bart Van Assche authored Jan 13, 2017

This patch does not change any functionality but avoids that sparse
complains about endianness.
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Acked-by: Varun Prakash <varun@chelsio.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>

5cadafb2

qla2xxx: Avoid using variable-length arrays · 2fdbc65e

Bart Van Assche authored Jan 20, 2017

This patch does not change any functionality but avoids that sparse
complains about using variable-length arrays.
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Acked-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Cc: Quinn Tran <quinn.tran@cavium.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>

2fdbc65e

qla2xxx: Simplify usage of SRB structure in driver · 25ff6af1

Joe Carnuccio authored Jan 19, 2017

This patch simplifies SRB structure usage in driver.

- Simplify sp->done() and sp->free() interfaces.
- Remove sp->fcport->vha to use vha pointer from sp.
- Use sp->vha context in qla2x00_rel_sp().
Signed-off-by: Joe Carnuccio <joe.carnuccio@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>

25ff6af1

qla2xxx: Improve RSCN handling in driver · 41dc529a

Quinn Tran authored Jan 19, 2017

Current code blindly does State Change Registration when
the link is up. Move SCR behind fabric scan, so that arbitrated
loop scan would not get erroneous error message.

Some of the other improvements are as follows

- Add session deletion for TPRLO and send acknowledgment for TPRLO.
- Enable FW option to move ABTS, RIDA & PUREX from RSPQ to ATIOQ.
- Save NPort ID early in link init.
- Move ABTS & RIDA to ATIOQ helps in keeping command ordering and
  link up sequence ordering.
- Save Nport ID and update VP map so that SCSI CMD/ATIO won't be dropped.
- fcport alloc does the initializes memory to zero. Remove memset to
  zero since It might corrupt link list.
- Turn off Registration for State Change MB in loop mode.
Signed-off-by: Quinn Tran <quinn.tran@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>

41dc529a

qla2xxx: Remove unused reverse_ini_mode · 0ca55938

Himanshu Madhani authored Jan 19, 2017

With support for dual mode in the driver, this mode becomes
dead code. Remove reverse_ini_mode from code.
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>

0ca55938

qla2xxx: Add Dual mode support in the driver · ead03855

Quinn Tran authored Jan 19, 2017

Add switch to allow both Initiator Mode & Target
mode to operate at the same time.
Signed-off-by: Quinn Tran <quinn.tran@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>

ead03855

qla2xxx: Add framework for async fabric discovery · 726b8548

Quinn Tran authored Jan 19, 2017

Currently code performs a full scan of the fabric for
every RSCN. Its an expensive process in a noisy large SAN.

This patch optimizes expensive fabric discovery process by
scanning switch for the affected port when RSCN is received.

Currently Initiator Mode code makes login/logout decision without
knowledge of target mode. This causes driver and firmware to go
out-of-sync. This framework synchronizes both initiator mode
personality and target mode personality in making login/logout
decision.

This patch adds following capabilities in the driver

- Send Notification Acknowledgement asynchronously.
- Update session/fcport state asynchronously.
- Create a session or fcport struct asynchronously.
- Send GNL asynchronously. The command will ask FW to
  provide a list of FC Port entries FW knows about.
- Send GPDB asynchronously. The command will ask FW to
  provide detail data of an FC Port FW knows about or
  perform ADISC to verify the state of the session.
- Send GPNID asynchronously. The command will ask switch
  to provide WWPN for provided NPort ID.
- Send GPSC asynchronously. The command will ask switch
  to provide registered port speed for provided WWPN.
- Send GIDPN asynchronously. The command will ask the
  switch to provide Nport ID for provided WWPN.
- In driver unload path, schedule all session for deletion
  and wait for deletion to complete before allowing driver
  unload to proceed.
Signed-off-by: Quinn Tran <quinn.tran@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
[ bvanassche: fixed spelling in patch description ]
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>

726b8548

qla2xxx: Track I-T nexus as single fc_port struct · 5d964837

Quinn Tran authored Jan 19, 2017

Current code merges qla_tgt_sess and fc_port structure
into single fc_port structure representing same I-T nexus.
Signed-off-by: Quinn Tran <quinn.tran@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
[ bvanassche: fixed spelling of patch description ]
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>

5d964837

qla2xxx: Use d_id instead of s_id for more clarity · 37cacc0a

Quinn Tran authored Jan 19, 2017

Updated code with d_id from s_id for better readability and
clarity.
Signed-off-by: Quinn Tran <quinn.tran@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
[ bvanassche: fixed spelling of patch description ]
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>

37cacc0a

qla2xxx: Fix wrong argument in sp done callback · 5e4deaf6

Quinn Tran authored Jan 19, 2017

Callback for sp->done expects scsi_qla_host is passed in as argument,
Instead qla_hw_data is passed in.
Signed-off-by: Quinn Tran <quinn.tran@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>

5e4deaf6

qla2xxx: Remove SRR code · 2c39b5ca

Himanshu Madhani authored Jan 19, 2017

During initial implementation, tape support was included but not
enabled by default on target. So far, we don't see any target
customer requesting this support. Since this code is not being
used actively, we want to remove it and we will add back if there
are any request in future for SRR support.
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Giridhar Malavali <giridhar.malavali@cavium.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>

2c39b5ca

qla2xxx: Make trace flags more readable · 1eb42f96

Quinn Tran authored Jan 19, 2017

Trace flags are useful during debugging crash dumps
using crash utility. These trace flags makes it easier
to understand various states a command has successfully
completed.
Signed-off-by: Quinn Tran <quinn.tran@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>

1eb42f96

qla2xxx: Cleanup TMF code translation from qla_target · be92fc3f

Quinn Tran authored Jan 19, 2017

Move code code which converts Task Mgmt Command flags for
ATIO to TCM #defines, from qla2xxx driver to tcm_qla2xxx
driver.
Signed-off-by: Quinn Tran <quinn.tran@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>

be92fc3f

qla2xxx: Remove direct access of scsi_status field in se_cmd · df2e32c5

Quinn Tran authored Jan 19, 2017

Signed-off-by: Quinn Tran <quinn.tran@cavium.com>
Signed-off-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>

df2e32c5

08 Feb, 2017 6 commits

ibmvscsis: Add SGL limit · b22bc278

Bryant G. Ly authored Feb 06, 2017

This patch adds internal LIO sgl limit since the driver already
sets a max transfer limit on transport layer of 1MB to the client.

Cc: stable@vger.kernel.org
Tested-by: Steven Royer <seroyer@linux.vnet.ibm.com>
Signed-off-by: Bryant G. Ly <bryantly@linux.vnet.ibm.com>
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>

b22bc278

target: Fix COMPARE_AND_WRITE ref leak for non GOOD status · 9b2792c3

Nicholas Bellinger authored Feb 06, 2017

This patch addresses a long standing bug where the commit phase
of COMPARE_AND_WRITE would result in a se_cmd->cmd_kref reference
leak if se_cmd->scsi_status returned non SAM_STAT_GOOD.

This would manifest first as a lost SCSI response, and eventual
hung task during fabric driver logout or re-login, as existing
shutdown logic waited for the COMPARE_AND_WRITE se_cmd->cmd_kref
to reach zero.

To address this bug, compare_and_write_post() has been changed
to drop the incorrect !cmd->scsi_status conditional that was
preventing *post_ret = 1 for being set during non SAM_STAT_GOOD
status.

This patch has been tested with SAM_STAT_CHECK_CONDITION status
from normal target_complete_cmd() callback path, as well as the
incoming __target_execute_cmd() submission failure path when
se_cmd->execute_cmd() returns non zero status.
Reported-by: Donald White <dew@datera.io>
Cc: Donald White <dew@datera.io>
Tested-by: Gary Guo <ghg@datera.io>
Cc: Gary Guo <ghg@datera.io>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: <stable@vger.kernel.org> # v3.12+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>

9b2792c3

target: Fix multi-session dynamic se_node_acl double free OOPs · 01d4d673

Nicholas Bellinger authored Dec 07, 2016

This patch addresses a long-standing bug with multi-session
(eg: iscsi-target + iser-target) se_node_acl dynamic free
withini transport_deregister_session().

This bug is caused when a storage endpoint is configured with
demo-mode (generate_node_acls = 1 + cache_dynamic_acls = 1)
initiators, and initiator login creates a new dynamic node acl
and attaches two sessions to it.

After that, demo-mode for the storage instance is disabled via
configfs (generate_node_acls = 0 + cache_dynamic_acls = 0) and
the existing dynamic acl is never converted to an explicit ACL.

The end result is dynamic acl resources are released twice when
the sessions are shutdown in transport_deregister_session().

If the storage instance is not changed to disable demo-mode,
or the dynamic acl is converted to an explict ACL, or there
is only a single session associated with the dynamic ACL,
the bug is not triggered.

To address this big, move the release of dynamic se_node_acl
memory into target_complete_nacl() so it's only freed once
when se_node_acl->acl_kref reaches zero.

(Drop unnecessary list_del_init usage - HCH)
Reported-by: Rob Millner <rlm@daterainc.com>
Tested-by: Rob Millner <rlm@daterainc.com>
Cc: Rob Millner <rlm@daterainc.com>
Cc: stable@vger.kernel.org # 4.1+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>

01d4d673

target: Fix early transport_generic_handle_tmr abort scenario · c54eeffb

Nicholas Bellinger authored Dec 06, 2016

This patch fixes a bug where incoming task management requests
can be explicitly aborted during an active LUN_RESET, but who's
struct work_struct are canceled in-flight before execution.

This occurs when core_tmr_drain_tmr_list() invokes cancel_work_sync()
for the incoming se_tmr_req->task_cmd->work, resulting in cmd->work
for target_tmr_work() never getting invoked and the aborted TMR
waiting indefinately within transport_wait_for_tasks().

To address this case, perform a CMD_T_ABORTED check early in
transport_generic_handle_tmr(), and invoke the normal path via
transport_cmd_check_stop_to_fabric() to complete any TMR kthreads
blocked waiting for CMD_T_STOP in transport_wait_for_tasks().

Also, move the TRANSPORT_ISTATE_PROCESSING assignment earlier
into transport_generic_handle_tmr() so the existing check in
core_tmr_drain_tmr_list() avoids attempting abort the incoming
se_tmr_req->task_cmd->work if it has already been queued into
se_device->tmr_wq.
Reported-by: Rob Millner <rlm@daterainc.com>
Tested-by: Rob Millner <rlm@daterainc.com>
Cc: Rob Millner <rlm@daterainc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: stable@vger.kernel.org # 3.14+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>

c54eeffb

target: Use correct SCSI status during EXTENDED_COPY exception · 0583c261

Nicholas Bellinger authored Oct 31, 2016

This patch adds the missing target_complete_cmd() SCSI status
parameter change in target_xcopy_do_work(), that was originally
missing in commit 926317de.

It correctly propigates up the correct SCSI status during
EXTENDED_COPY exception cases, instead of always using the
hardcoded SAM_STAT_CHECK_CONDITION from original code.

This is required for ESX host environments that expect to
hit SAM_STAT_RESERVATION_CONFLICT for certain scenarios,
and SAM_STAT_CHECK_CONDITION results in non-retriable
status for these cases.
Reported-by: Nixon Vincent <nixon.vincent@calsoftinc.com>
Tested-by: Nixon Vincent <nixon.vincent@calsoftinc.com>
Cc: Nixon Vincent <nixon.vincent@calsoftinc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: stable@vger.kernel.org # 3.14+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>

0583c261

target: Don't BUG_ON during NodeACL dynamic -> explicit conversion · 391e2a6d

Nicholas Bellinger authored Oct 23, 2016

After the v4.2+ RCU conversion to se_node_acl->lun_entry_hlist,
a BUG_ON() was added in core_enable_device_list_for_node() to
detect when the located orig->se_lun_acl contains an existing
se_lun_acl pointer reference.

However, this scenario can happen when a dynamically generated
NodeACL is being converted to an explicit NodeACL, when the
explicit NodeACL contains a different LUN mapping than the
default provided by the WWN endpoint.

So instead of triggering BUG_ON(), go ahead and fail instead
following the original pre RCU conversion logic.
Reported-by: Benjamin ESTRABAUD <ben.estrabaud@mpstor.com>
Cc: Benjamin ESTRABAUD <ben.estrabaud@mpstor.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: stable@vger.kernel.org # 4.2+
Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>

391e2a6d

05 Feb, 2017 1 commit
- Linux 4.10-rc7 · d5adbfcd
  Linus Torvalds authored Feb 05, 2017
  
  d5adbfcd
04 Feb, 2017 6 commits

Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip · a572a1b9

Linus Torvalds authored Feb 04, 2017

Pull irq fixes from Thomas Gleixner:

 - Prevent double activation of interrupt lines, which causes problems
   on certain interrupt controllers

 - Handle the fallout of the above because x86 (ab)uses the activation
   function to reconfigure interrupts under the hood.

* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/irq: Make irq activate operations symmetric
  irqdomain: Avoid activating interrupts more than once

a572a1b9

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm · 24bc5fe7

Linus Torvalds authored Feb 04, 2017

Pull KVM fix from Radim Krčmář:
 "Fix a regression that prevented migration between hosts with different
  XSAVE features even if the missing features were not used by the guest
  (for stable)"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: x86: do not save guest-unsupported XSAVE state

24bc5fe7

Merge tag 'char-misc-4.10-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc · 412e6d3f

Linus Torvalds authored Feb 04, 2017

Pull char/misc driver fixes from Greg KH:
 "Here are two bugfixes that resolve some reported issues. One in the
  firmware loader, that should fix the much-reported problem of crashes
  with it. The other is a hyperv fix for a reported regression.

  Both have been in linux-next for a week or so with no reported issues"

* tag 'char-misc-4.10-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
  Drivers: hv: vmbus: finally fix hv_need_to_signal_on_read()
  firmware: fix NULL pointer dereference in __fw_load_abort()

412e6d3f

Merge tag 'staging-4.10-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging · 252bf9f4

Linus Torvalds authored Feb 04, 2017

Pull staging/IIO fixes from Greg KH:
 "Here are a few small IIO and one staging driver fix for 4.10-rc7. They
  fix some reported issues with the drivers.

  All of them have been in linux-next for a week or so with no reported
  issues"

* tag 'staging-4.10-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging:
  staging: greybus: timesync: validate platform state callback
  iio: dht11: Use usleep_range instead of msleep for start signal
  iio: adc: palmas_gpadc: retrieve a valid iio_dev in suspend/resume
  iio: health: max30100: fixed parenthesis around FIFO count check
  iio: health: afe4404: retrieve a valid iio_dev in suspend/resume
  iio: health: afe4403: retrieve a valid iio_dev in suspend/resume

252bf9f4

Merge tag 'usb-4.10-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb · 8fcdcc42

Linus Torvalds authored Feb 04, 2017

Pull USB fixes from Greg KH:
 "Here are some small USB fixes for some reported issues, and the usual
  number of new device ids for 4.10-rc7.

  All of these, except the last new device id, have been in linux-next
  for a while with no reported issues"

* tag 'usb-4.10-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
  USB: serial: pl2303: add ATEN device ID
  usb: gadget: f_fs: Assorted buffer overflow checks.
  USB: Add quirk for WORLDE easykey.25 MIDI keyboard
  usb: musb: Fix external abort on non-linefetch for musb_irq_work()
  usb: musb: Fix host mode error -71 regression
  USB: serial: option: add device ID for HP lt2523 (Novatel E371)
  USB: serial: qcserial: add Dell DW5570 QDL

8fcdcc42

Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · a0a28644

Linus Torvalds authored Feb 03, 2017

Pull SCSI fix from James Bottomley:
 "A single fix this time: a fix for a virtqueue removal bug which only
  appears to affect S390, but which results in the queue hanging forever
  thus causing the machine to fail shutdown"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: virtio_scsi: Reject commands when virtqueue is broken

a0a28644

03 Feb, 2017 9 commits

Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost · a49e6f58

Linus Torvalds authored Feb 03, 2017

Pull virtio/vhost fixes from Michael S. Tsirkin:
 "Last minute fixes:

   - ARM DMA fix revert

   - vhost endian-ness fix

   - MAINTAINERS: email address change for Amit"

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
  MAINTAINERS: update email address for Amit Shah
  vhost: fix initialization for vq->is_le
  Revert "vring: Force use of DMA API for ARM-based systems with legacy devices"

a49e6f58

Merge tag 'vfio-v4.10-rc7' of git://github.com/awilliam/linux-vfio · e9f7f17d

Linus Torvalds authored Feb 03, 2017

Pull VFIO fix from Alex Williamson:
 "Fix an error path in SPAPR IOMMU backend (Alexey Kardashevskiy)"

* tag 'vfio-v4.10-rc7' of git://github.com/awilliam/linux-vfio:
  vfio/spapr: Fix missing mutex unlock when creating a window

e9f7f17d

Merge branch 'akpm' (patches from Andrew) · 7a92cc6b

Linus Torvalds authored Feb 03, 2017

Merge fixes from Andrew Morton:
 "8 fixes"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  mm, fs: check for fatal signals in do_generic_file_read()
  fs: break out of iomap_file_buffered_write on fatal signals
  base/memory, hotplug: fix a kernel oops in show_valid_zones()
  mm/memory_hotplug.c: check start_pfn in test_pages_in_a_zone()
  jump label: pass kbuild_cflags when checking for asm goto support
  shmem: fix sleeping from atomic context
  kasan: respect /proc/sys/kernel/traceoff_on_warning
  zswap: disable changing params if init fails

7a92cc6b

mm, fs: check for fatal signals in do_generic_file_read() · 5abf186a

Michal Hocko authored Feb 03, 2017

do_generic_file_read() can be told to perform a large request from
userspace.  If the system is under OOM and the reading task is the OOM
victim then it has an access to memory reserves and finishing the full
request can lead to the full memory depletion which is dangerous.  Make
sure we rather go with a short read and allow the killed task to
terminate.

Link: http://lkml.kernel.org/r/20170201092706.9966-3-mhocko@kernel.orgSigned-off-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

5abf186a

fs: break out of iomap_file_buffered_write on fatal signals · d1908f52

Michal Hocko authored Feb 03, 2017

Tetsuo has noticed that an OOM stress test which performs large write
requests can cause the full memory reserves depletion.  He has tracked
this down to the following path

	__alloc_pages_nodemask+0x436/0x4d0
	alloc_pages_current+0x97/0x1b0
	__page_cache_alloc+0x15d/0x1a0          mm/filemap.c:728
	pagecache_get_page+0x5a/0x2b0           mm/filemap.c:1331
	grab_cache_page_write_begin+0x23/0x40   mm/filemap.c:2773
	iomap_write_begin+0x50/0xd0             fs/iomap.c:118
	iomap_write_actor+0xb5/0x1a0            fs/iomap.c:190
	? iomap_write_end+0x80/0x80             fs/iomap.c:150
	iomap_apply+0xb3/0x130                  fs/iomap.c:79
	iomap_file_buffered_write+0x68/0xa0     fs/iomap.c:243
	? iomap_write_end+0x80/0x80
	xfs_file_buffered_aio_write+0x132/0x390 [xfs]
	? remove_wait_queue+0x59/0x60
	xfs_file_write_iter+0x90/0x130 [xfs]
	__vfs_write+0xe5/0x140
	vfs_write+0xc7/0x1f0
	? syscall_trace_enter+0x1d0/0x380
	SyS_write+0x58/0xc0
	do_syscall_64+0x6c/0x200
	entry_SYSCALL64_slow_path+0x25/0x25

the oom victim has access to all memory reserves to make a forward
progress to exit easier.  But iomap_file_buffered_write and other
callers of iomap_apply loop to complete the full request.  We need to
check for fatal signals and back off with a short write instead.

As the iomap_apply delegates all the work down to the actor we have to
hook into those.  All callers that work with the page cache are calling
iomap_write_begin so we will check for signals there.  dax_iomap_actor
has to handle the situation explicitly because it copies data to the
userspace directly.  Other callers like iomap_page_mkwrite work on a
single page or iomap_fiemap_actor do not allocate memory based on the
given len.

Fixes: 68a9f5e7 ("xfs: implement iomap based buffered write path")
Link: http://lkml.kernel.org/r/20170201092706.9966-2-mhocko@kernel.orgSigned-off-by: Michal Hocko <mhocko@suse.com>
Reported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: <stable@vger.kernel.org>	[4.8+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

d1908f52

base/memory, hotplug: fix a kernel oops in show_valid_zones() · a96dfddb

Toshi Kani authored Feb 03, 2017

Reading a sysfs "memoryN/valid_zones" file leads to the following oops
when the first page of a range is not backed by struct page.
show_valid_zones() assumes that 'start_pfn' is always valid for
page_zone().

 BUG: unable to handle kernel paging request at ffffea017a000000
 IP: show_valid_zones+0x6f/0x160

This issue may happen on x86-64 systems with 64GiB or more memory since
their memory block size is bumped up to 2GiB.  [1] An example of such
systems is desribed below.  0x3240000000 is only aligned by 1GiB and
this memory block starts from 0x3200000000, which is not backed by
struct page.

 BIOS-e820: [mem 0x0000003240000000-0x000000603fffffff] usable

Since test_pages_in_a_zone() already checks holes, fix this issue by
extending this function to return 'valid_start' and 'valid_end' for a
given range.  show_valid_zones() then proceeds with the valid range.

[1] 'Commit bdee237c ("x86: mm: Use 2GB memory block size on
    large-memory x86-64 systems")'

Link: http://lkml.kernel.org/r/20170127222149.30893-3-toshi.kani@hpe.comSigned-off-by: Toshi Kani <toshi.kani@hpe.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Zhang Zhen <zhenzhang.zhang@huawei.com>
Cc: Reza Arbab <arbab@linux.vnet.ibm.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: <stable@vger.kernel.org>	[4.4+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

a96dfddb

mm/memory_hotplug.c: check start_pfn in test_pages_in_a_zone() · deb88a2a

Toshi Kani authored Feb 03, 2017

Patch series "fix a kernel oops when reading sysfs valid_zones", v2.

A sysfs memory file is created for each 2GiB memory block on x86-64 when
the system has 64GiB or more memory.  [1] When the start address of a
memory block is not backed by struct page, i.e.  a memory range is not
aligned by 2GiB, reading its 'valid_zones' attribute file leads to a
kernel oops.  This issue was observed on multiple x86-64 systems with
more than 64GiB of memory.  This patch-set fixes this issue.

Patch 1 first fixes an issue in test_pages_in_a_zone(), which does not
test the start section.

Patch 2 then fixes the kernel oops by extending test_pages_in_a_zone()
to return valid [start, end).

Note for stable kernels: The memory block size change was made by commit
bdee237c ("x86: mm: Use 2GB memory block size on large-memory x86-64
systems"), which was accepted to 3.9.  However, this patch-set depends
on (and fixes) the change to test_pages_in_a_zone() made by commit
5f0f2887 ("mm/memory_hotplug.c: check for missing sections in
test_pages_in_a_zone()"), which was accepted to 4.4.

So, I recommend that we backport it up to 4.4.

[1] 'Commit bdee237c ("x86: mm: Use 2GB memory block size on
    large-memory x86-64 systems")'

This patch (of 2):

test_pages_in_a_zone() does not check 'start_pfn' when it is aligned by
section since 'sec_end_pfn' is set equal to 'pfn'.  Since this function
is called for testing the range of a sysfs memory file, 'start_pfn' is
always aligned by section.

Fix it by properly setting 'sec_end_pfn' to the next section pfn.

Also make sure that this function returns 1 only when the range belongs
to a zone.

Link: http://lkml.kernel.org/r/20170127222149.30893-2-toshi.kani@hpe.comSigned-off-by: Toshi Kani <toshi.kani@hpe.com>
Cc: Andrew Banman <abanman@sgi.com>
Cc: Reza Arbab <arbab@linux.vnet.ibm.com>
Cc: Greg KH <greg@kroah.com>
Cc: <stable@vger.kernel.org>	[4.4+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

deb88a2a

jump label: pass kbuild_cflags when checking for asm goto support · 35f860f9

David Lin authored Feb 03, 2017

Some versions of ARM GCC compiler such as Android toolchain throws in a
'-fpic' flag by default.  This causes the gcc-goto check script to fail
although some config would have '-fno-pic' flag in the KBUILD_CFLAGS.

This patch passes the KBUILD_CFLAGS to the check script so that the
script does not rely on the default config from different compilers.

Link: http://lkml.kernel.org/r/20170120234329.78868-1-dtwlin@google.comSigned-off-by: David Lin <dtwlin@google.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Michal Marek <mmarek@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

35f860f9

shmem: fix sleeping from atomic context · 253fd0f0

Kirill A. Shutemov authored Feb 03, 2017

Syzkaller fuzzer managed to trigger this:

    BUG: sleeping function called from invalid context at mm/shmem.c:852
    in_atomic(): 1, irqs_disabled(): 0, pid: 529, name: khugepaged
    3 locks held by khugepaged/529:
     #0:  (shrinker_rwsem){++++..}, at: [<ffffffff818d7ef1>] shrink_slab.part.59+0x121/0xd30 mm/vmscan.c:451
     #1:  (&type->s_umount_key#29){++++..}, at: [<ffffffff81a63630>] trylock_super+0x20/0x100 fs/super.c:392
     #2:  (&(&sbinfo->shrinklist_lock)->rlock){+.+.-.}, at: [<ffffffff818fd83e>] spin_lock include/linux/spinlock.h:302 [inline]
     #2:  (&(&sbinfo->shrinklist_lock)->rlock){+.+.-.}, at: [<ffffffff818fd83e>] shmem_unused_huge_shrink+0x28e/0x1490 mm/shmem.c:427
    CPU: 2 PID: 529 Comm: khugepaged Not tainted 4.10.0-rc5+ #201
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
    Call Trace:
       shmem_undo_range+0xb20/0x2710 mm/shmem.c:852
       shmem_truncate_range+0x27/0xa0 mm/shmem.c:939
       shmem_evict_inode+0x35f/0xca0 mm/shmem.c:1030
       evict+0x46e/0x980 fs/inode.c:553
       iput_final fs/inode.c:1515 [inline]
       iput+0x589/0xb20 fs/inode.c:1542
       shmem_unused_huge_shrink+0xbad/0x1490 mm/shmem.c:446
       shmem_unused_huge_scan+0x10c/0x170 mm/shmem.c:512
       super_cache_scan+0x376/0x450 fs/super.c:106
       do_shrink_slab mm/vmscan.c:378 [inline]
       shrink_slab.part.59+0x543/0xd30 mm/vmscan.c:481
       shrink_slab mm/vmscan.c:2592 [inline]
       shrink_node+0x2c7/0x870 mm/vmscan.c:2592
       shrink_zones mm/vmscan.c:2734 [inline]
       do_try_to_free_pages+0x369/0xc80 mm/vmscan.c:2776
       try_to_free_pages+0x3c6/0x900 mm/vmscan.c:2982
       __perform_reclaim mm/page_alloc.c:3301 [inline]
       __alloc_pages_direct_reclaim mm/page_alloc.c:3322 [inline]
       __alloc_pages_slowpath+0xa24/0x1c30 mm/page_alloc.c:3683
       __alloc_pages_nodemask+0x544/0xae0 mm/page_alloc.c:3848
       __alloc_pages include/linux/gfp.h:426 [inline]
       __alloc_pages_node include/linux/gfp.h:439 [inline]
       khugepaged_alloc_page+0xc2/0x1b0 mm/khugepaged.c:750
       collapse_huge_page+0x182/0x1fe0 mm/khugepaged.c:955
       khugepaged_scan_pmd+0xfdf/0x12a0 mm/khugepaged.c:1208
       khugepaged_scan_mm_slot mm/khugepaged.c:1727 [inline]
       khugepaged_do_scan mm/khugepaged.c:1808 [inline]
       khugepaged+0xe9b/0x1590 mm/khugepaged.c:1853
       kthread+0x326/0x3f0 kernel/kthread.c:227
       ret_from_fork+0x31/0x40 arch/x86/entry/entry_64.S:430

The iput() from atomic context was a bad idea: if after igrab() somebody
else calls iput() and we left with the last inode reference, our iput()
would lead to inode eviction and therefore sleeping.

This patch should fix the situation.

Link: http://lkml.kernel.org/r/20170131093141.GA15899@node.shutemov.nameSigned-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

253fd0f0