1. 28 Aug, 2015 6 commits
  2. 27 Aug, 2015 11 commits
  3. 12 Aug, 2015 1 commit
    • md: use kzalloc() when bitmap is disabled · 74df5a75
      Benjamin Randazzo authored
      commit b6878d9e upstream.
      
      In drivers/md/md.c get_bitmap_file() uses kmalloc() for creating a
      mdu_bitmap_file_t called "file".
      
      5769         file = kmalloc(sizeof(*file), GFP_NOIO);
      5770         if (!file)
      5771                 return -ENOMEM;
      
      This structure is copied to user space at the end of the function.
      
      5786         if (err == 0 &&
      5787             copy_to_user(arg, file, sizeof(*file)))
      5788                 err = -EFAULT;
      
      But if bitmap is disabled only the first byte of "file" is initialized
      with zero, so it's possible to read some bytes (up to 4095) of kernel
      space memory from user space. This is an information leak.
      
      5775         /* bitmap disabled, zero the first byte and copy out */
      5776         if (!mddev->bitmap_info.file)
      5777                 file->pathname[0] = '\0';
      Signed-off-by: Benjamin Randazzo <benjamin@randazzo.fr>
      Signed-off-by: NeilBrown <neilb@suse.com>
      Reference: CVE-2015-5697
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
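The leak described in this commit message can be illustrated in user space. The following is a minimal sketch, not the kernel code: `struct bitmap_file_reply`, `alloc_reply_zeroed()` and `count_nonzero_bytes()` are hypothetical names, and `calloc()` is used as a stand-in for `kzalloc()`.

```c
#include <stdlib.h>
#include <stddef.h>

/* Hypothetical stand-in for mdu_bitmap_file_t: one fixed-size path buffer. */
struct bitmap_file_reply {
    char pathname[4096];
};

/*
 * Allocate a reply for the "bitmap disabled" case.
 * calloc() plays the role of kzalloc(): every byte is zero, so copying the
 * whole structure out cannot expose stale heap contents.
 */
struct bitmap_file_reply *alloc_reply_zeroed(void)
{
    struct bitmap_file_reply *reply = calloc(1, sizeof(*reply));
    if (!reply)
        return NULL;
    reply->pathname[0] = '\0'; /* redundant after calloc(), kept for clarity */
    return reply;
}

/* Count bytes that are not zero, i.e. bytes that could carry old data. */
size_t count_nonzero_bytes(const struct bitmap_file_reply *reply)
{
    const unsigned char *p = (const unsigned char *)reply;
    size_t n = 0;

    for (size_t i = 0; i < sizeof(*reply); i++)
        if (p[i] != 0)
            n++;
    return n;
}
```

With a plain `malloc()` only `pathname[0]` would be defined, and the remaining bytes could hold whatever the allocator last stored there.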
  4. 11 Aug, 2015 5 commits
    • Linux 3.19.8-ckt5 · 4b186bdc
      Kamal Mostafa authored
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
    • rds: rds_ib_device.refcount overflow · a6a33d67
      Wengang Wang authored
      commit 4fabb594 upstream.
      
      Fixes: 3e0249f9 ("RDS/IB: add refcount tracking to struct rds_ib_device")
      
      The refcount on rds_ib_device is not dropped when rds_ib_alloc_fmr()
      fails (for example when the MR pool runs out). This leads to a
      refcount overflow.
      
      The BUG_ON() at line 117 (see below) fires. From the vmcore,
      s_ib_rdma_mr_pool_depleted is 2147485544 and rds_ibdev->refcount is
      -2147475448. This shows that the MR pool is used up, so
      rds_ib_alloc_fmr() is very likely to return ERR_PTR(-EAGAIN).
      
      115 void rds_ib_dev_put(struct rds_ib_device *rds_ibdev)
      116 {
      117         BUG_ON(atomic_read(&rds_ibdev->refcount) <= 0);
      118         if (atomic_dec_and_test(&rds_ibdev->refcount))
      119                 queue_work(rds_wq, &rds_ibdev->free_work);
      120 }
      
      The fix is to drop the refcount when rds_ib_alloc_fmr() fails.
      Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com>
      Reviewed-by: Haggai Eran <haggaie@mellanox.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
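The pattern behind this fix, releasing a reference on the error path, can be sketched in user space. This is only an illustration under assumptions: `struct device_ref`, `dev_get()`, `dev_put()`, `alloc_from_pool()` and `register_region()` are hypothetical names, and C11 atomics stand in for the kernel's `atomic_t`.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for struct rds_ib_device. */
struct device_ref {
    atomic_int refcount;
};

void dev_get(struct device_ref *dev) { atomic_fetch_add(&dev->refcount, 1); }
void dev_put(struct device_ref *dev) { atomic_fetch_sub(&dev->refcount, 1); }

/* Pretend allocator: fails when the pool has no free entries. */
static int alloc_from_pool(int pool_free)
{
    return pool_free > 0 ? 0 : -1;
}

/*
 * Take a reference, try to allocate, and drop the reference again if the
 * allocation fails. Without the dev_put() on the error path every failed
 * call would leak one reference.
 */
int register_region(struct device_ref *dev, int pool_free)
{
    dev_get(dev);
    if (alloc_from_pool(pool_free) != 0) {
        dev_put(dev);   /* the step the fix adds */
        return -1;
    }
    return 0;           /* success: the caller now owns the reference */
}
```

Repeated failures then leave the count unchanged, so it can no longer wrap around the way the vmcore values suggest it did.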
    • net: dsa: Test array index before use · 336fe934
      Florian Fainelli authored
      commit 8f5063e9 upstream.
      
      port_index is used as an index into an array, and this information
      comes from Device Tree. Make sure that port_index is not equal to the
      array size before using it, by moving the check against port_index
      earlier in the loop.
      
      Fixes: 5e95329b ("dsa: add device tree bindings to register DSA switches")
      Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
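A minimal sketch of "check the index first, then use it" follows. It is not the DSA code: `MAX_PORTS` and `set_port_name()` are hypothetical, and the array size of 12 is an assumption chosen for the example.

```c
#include <stddef.h>

#define MAX_PORTS 12  /* assumed array size for the sketch */

/*
 * Store a name for one port. The index comes from untrusted configuration
 * (Device Tree in the original), so it is validated before the array is
 * touched, not afterwards.
 */
int set_port_name(const char *names[MAX_PORTS], unsigned int port_index,
                  const char *name)
{
    if (port_index >= MAX_PORTS)
        return -1;              /* reject before any array access */
    names[port_index] = name;
    return 0;
}
```

An index equal to `MAX_PORTS` is rejected, which is the case the commit message singles out.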
    • perf hists browser: Take the --comm, --dsos, etc filters into account · 8401d7fe
      Arnaldo Carvalho de Melo authored
      commit 9c0fa8dd upstream.
      
      At some point:
      
        commit 2c86c7ca
        Author: Namhyung Kim <namhyung@kernel.org>
        Date:   Mon Mar 17 18:18:54 2014 -0300
      
          perf report: Merge al->filtered with hist_entry->filtered
      
      We stopped dropping samples for things filtered via the --comms, --dsos,
      --symbols, etc, i.e. things marked as filtered in the symbol resolution
      routines (thread__find_addr_map(), perf_event__preprocess_sample(),
      etc).
      
      But then, in:
      
        commit 268397cb
        Author: Namhyung Kim <namhyung@kernel.org>
        Date:   Tue Apr 22 14:49:31 2014 +0900
      
          perf top/tui: Update nr_entries properly after a filter is applied
      
      We don't take into account entries that were filtered in
      perf_event__preprocess_sample() and friends, which leads to
      inconsistency in the browser seek routines, which expect the number
      of hist_entry->filtered entries to match what they think is the
      number of unfiltered, browsable entries.
      
      So, for instance, when we do:
      
        perf top --symbols ___non_existent_symbol___
      
      the hist_browser__nr_entries() routine thinks there are no filters in
      place, uses the hists->nr_entries but all entries are filtered, leading
      to a segfault.
      
      Tested with:
      
         perf top --symbols malloc,free --percentage=relative
      
      Freezing, by pressing 'f', at any time and doing the math on the
      percentages ends up with 100%, ditto for:
      
         perf top --dsos libpthread-2.20.so,libxul.so --percentage=relative
      
      Both were segfaulting, all fixed now.
      
      More work needed to do away with checking if filters are in place, we
      should just use the nr_non_filtered_samples counter, no need to
      conditionally use it or hists.nr_filter, as what the browser does is
      just show unfiltered stuff. An audit of how it is being accounted is
      needed, this is the minimal fix.
      Reported-by: Michael Petlan <mpetlan@redhat.com>
      Fixes: 268397cb ("perf top/tui: Update nr_entries properly after a filter is applied")
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-6w01d5q97qk0d64kuojme5in@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
    • perf symbols: Store if there is a filter in place · 7e447b5f
      Arnaldo Carvalho de Melo authored
      commit 0bc2f2f7 upstream.
      
      When setting up the symbols library we set up several filter lists,
      for dsos, comms, symbols, etc, and there is code that, if there are
      filters, does certain operations, like recalculating the number of
      non-filtered histogram entries in the top/report TUI.
      
      But they were considering just the "Zoom" filters, when they need to
      take into account as well the above mentioned filters (perf top --comms,
      --dsos, etc).
      
      So store in symbol_conf.has_filter true if any of those filters is in
      place.
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: David Ahern <dsahern@gmail.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Stephane Eranian <eranian@google.com>
      Link: http://lkml.kernel.org/n/tip-f5edfmhq69vfvs1kmikq1wep@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      [ kamal: 3.19-stable prereq for
        9c0fa8dd perf hists browser: Take the --comm, --dsos, etc filters into account ]
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
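The idea shared by the two perf commits above, recording whether any list filter is active and using that when counting browsable entries, might look roughly like this. It is a simplified sketch, not the perf source: `struct filter_conf`, `filter_conf_init()` and `browsable_entries()` are hypothetical names introduced for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of the filter configuration. */
struct filter_conf {
    const char *dso_list;   /* --dsos */
    const char *comm_list;  /* --comms */
    const char *sym_list;   /* --symbols */
    bool has_filter;
};

/* Record once, at setup time, whether any list filter was requested. */
void filter_conf_init(struct filter_conf *conf)
{
    conf->has_filter = conf->dso_list != NULL ||
                       conf->comm_list != NULL ||
                       conf->sym_list != NULL;
}

/*
 * Pick the entry count a browser should use: the non-filtered count when
 * any filter (list or zoom) is active, the total otherwise.
 */
size_t browsable_entries(const struct filter_conf *conf, bool zoom_filter,
                         size_t nr_entries, size_t nr_non_filtered)
{
    return (conf->has_filter || zoom_filter) ? nr_non_filtered : nr_entries;
}
```

With a symbol filter that matches nothing, the count is 0 rather than the total, so a browser would not try to seek through entries that are all filtered out.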
  5. 06 Aug, 2015 17 commits
    • net: dsa: Fix off-by-one in switch address parsing · 843d28ea
      Florian Fainelli authored
      commit c8cf89f7 upstream.
      
      cd->sw_addr is used as a MDIO bus address, which cannot exceed
      PHY_MAX_ADDR (32), our check was off-by-one.
      
      Fixes: 5e95329b ("dsa: add device tree bindings to register DSA switches")
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
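The off-by-one can be made concrete with a small range check. This is a sketch only: `sw_addr_is_valid()` is a hypothetical helper, and the value 32 for `PHY_MAX_ADDR` is taken from the commit message.

```c
#define PHY_MAX_ADDR 32  /* value quoted in the commit message */

/* Valid MDIO bus addresses are 0 .. PHY_MAX_ADDR - 1. */
int sw_addr_is_valid(unsigned int sw_addr)
{
    /* Rejecting only "sw_addr > PHY_MAX_ADDR" would wrongly accept 32. */
    return sw_addr < PHY_MAX_ADDR;
}
```

Address 31 is the last one accepted, and 32 is rejected.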
    • net: graceful exit from netif_alloc_netdev_queues() · c79fd258
      Eric Dumazet authored
      commit d339727c upstream.
      
      User space can crash kernel with
      
      ip link add ifb10 numtxqueues 100000 type ifb
      
      We must replace a BUG_ON() by proper test and return -EINVAL for
      crazy values.
      
      Fixes: 60877a32 ("net: allow large number of tx queues")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
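Replacing an assertion with an error return could be sketched as follows. The names `struct tx_queue` and `alloc_tx_queues()` are hypothetical, and the upper bound `MAX_TX_QUEUES` is an assumption for the example rather than a value stated in the commit message.

```c
#include <errno.h>
#include <stdlib.h>

#define MAX_TX_QUEUES 0xffff  /* assumed upper bound for the sketch */

struct tx_queue { int id; };

/*
 * Allocate `count` queues. Out-of-range requests return -EINVAL to the
 * caller instead of aborting, which is the behaviour the commit asks for.
 */
int alloc_tx_queues(unsigned int count, struct tx_queue **out)
{
    if (count < 1 || count > MAX_TX_QUEUES)
        return -EINVAL;

    struct tx_queue *q = calloc(count, sizeof(*q));
    if (!q)
        return -ENOMEM;

    *out = q;
    return 0;
}
```

A request for 100000 queues, as in the `ip link add` command quoted above, then fails cleanly with an error code.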
    • net: do not process device backlog during unregistration · 997721b5
      Julian Anastasov authored
      commit e9e4dd32 upstream.
      
      commit 381c759d ("ipv4: Avoid crashing in ip_error")
      fixes a problem where processed packet comes from device
      with destroyed inetdev (dev->ip_ptr). This is not expected
      because inetdev_destroy is called in NETDEV_UNREGISTER
      phase and packets should not be processed after
      dev_close_many() and synchronize_net(). Above fix is still
      required because inetdev_destroy can be called for other
      reasons. But it shows the real problem: backlog can keep
      packets for long time and they do not hold reference to
      device. Such packets are then delivered to upper levels
      at the same time when device is unregistered.
      Calling flush_backlog after NETDEV_UNREGISTER_FINAL still
      accounts all packets from backlog but before that some packets
      continue to be delivered to upper levels long after the
      synchronize_net call which is supposed to wait the last
      ones. Also, as Eric pointed out, processed packets, mostly
      from other devices, can continue to add new packets to backlog.
      
      Fix the problem by moving flush_backlog early, after the
      device driver is stopped and before the synchronize_net() call.
      Then use netif_running check to make sure we do not add more
      packets to backlog. We have to do it in enqueue_to_backlog
      context when the local IRQ is disabled. As result, after the
      flush_backlog and synchronize_net sequence all packets
      should be accounted.
      
      Thanks to Eric W. Biederman for the test script and his
      valuable feedback!
      Reported-by: Vittorio Gambaletta <linuxbugs@vittgam.net>
      Fixes: 6e583ce5 ("net: eliminate refcounting in backlog queue")
      Cc: Eric W. Biederman <ebiederm@xmission.com>
      Cc: Stephen Hemminger <stephen@networkplumber.org>
      Signed-off-by: Julian Anastasov <ja@ssi.bg>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
    • bridge: mdb: zero out the local br_ip variable before use · d420426f
      Nikolay Aleksandrov authored
      commit f1158b74 upstream.
      
      Since commit b0e9a30d ("bridge: Add vlan id to multicast groups")
      there's a check in br_ip_equal() for a matching vlan id, but the mdb
      functions were not modified to use (or at least zero it) so when an
      entry was added it would have a garbage vlan id (from the local br_ip
      variable in __br_mdb_add/del) and this would prevent it from being
      matched and also deleted. So zero out the whole local ip var to protect
      ourselves from future changes and also to fix the current bug, since
      there's no vlan id support in the mdb uapi - use always vlan id 0.
      Example before patch:
      root@debian:~# bridge mdb add dev br0 port eth1 grp 239.0.0.1 permanent
      root@debian:~# bridge mdb
      dev br0 port eth1 grp 239.0.0.1 permanent
      root@debian:~# bridge mdb del dev br0 port eth1 grp 239.0.0.1 permanent
      RTNETLINK answers: Invalid argument
      
      After patch:
      root@debian:~# bridge mdb add dev br0 port eth1 grp 239.0.0.1 permanent
      root@debian:~# bridge mdb
      dev br0 port eth1 grp 239.0.0.1 permanent
      root@debian:~# bridge mdb del dev br0 port eth1 grp 239.0.0.1 permanent
      root@debian:~# bridge mdb
      Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org>
      Fixes: b0e9a30d ("bridge: Add vlan id to multicast groups")
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
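Why an uninitialized field breaks matching can be shown with a reduced example. This is a sketch under assumptions: `struct group_key`, `group_key_equal()` and `group_key_from_addr()` are hypothetical simplifications, not the bridge code.

```c
#include <string.h>
#include <stdint.h>

/* Hypothetical, simplified version of a group key with a VLAN id. */
struct group_key {
    uint32_t addr;
    uint16_t vid;
};

/* Two keys match only when both the address and the VLAN id agree. */
int group_key_equal(const struct group_key *a, const struct group_key *b)
{
    return a->addr == b->addr && a->vid == b->vid;
}

/*
 * Build a lookup key from an address only. memset() clears every field
 * first, so `vid` is 0 rather than whatever was left on the stack.
 */
void group_key_from_addr(struct group_key *key, uint32_t addr)
{
    memset(key, 0, sizeof(*key));
    key->addr = addr;
}
```

Clearing the whole structure also covers any field added later, which is the "protect ourselves from future changes" point made in the message.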
    • bridge: mdb: start delete timer for temp static entries · 150017cd
      Satish Ashok authored
      commit f7e2965d upstream.
      
      Start the delete timer when adding temp static entries so they can expire.
      Signed-off-by: Satish Ashok <sashok@cumulusnetworks.com>
      Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
      Fixes: ccb1c31a ("bridge: add flags to distinguish permanent mdb entires")
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
    • ip_tunnel: fix ipv4 pmtu check to honor inner ip header df · a15d7002
      Timo Teräs authored
      commit fc24f2b2 upstream.
      
      Frag needed should be sent only if the inner header asked
      to not fragment. Currently fragmentation is broken if the
      tunnel has df set, but df was not asked in the original
      packet. The tunnel's df needs to be still checked to update
      internally the pmtu cache.
      
      Commit 23a3647b broke it, and this commit fixes
      the ipv4 df check back to the way it was.
      
      Fixes: 23a3647b ("ip_tunnels: Use skb-len to PMTU check.")
      Cc: Pravin B Shelar <pshelar@nicira.com>
      Signed-off-by: Timo Teräs <timo.teras@iki.fi>
      Acked-by: Pravin B Shelar <pshelar@nicira.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
    • net: call rcu_read_lock early in process_backlog · b7047fc7
      Julian Anastasov authored
      commit 2c17d27c upstream.
      
      Incoming packet should be either in backlog queue or
      in RCU read-side section. Otherwise, the final sequence of
      flush_backlog() and synchronize_net() may miss packets
      that can run without device reference:
      
      CPU 1                  CPU 2
                             skb->dev: no reference
                             process_backlog:__skb_dequeue
                             process_backlog:local_irq_enable
      
      on_each_cpu for
      flush_backlog =>       IPI(hardirq): flush_backlog
                             - packet not found in backlog
      
                             CPU delayed ...
      synchronize_net
      - no ongoing RCU
      read-side sections
      
      netdev_run_todo,
      rcu_barrier: no
      ongoing callbacks
                             __netif_receive_skb_core:rcu_read_lock
                             - too late
      free dev
                             process packet for freed dev
      
      Fixes: 6e583ce5 ("net: eliminate refcounting in backlog queue")
      Cc: Eric W. Biederman <ebiederm@xmission.com>
      Cc: Stephen Hemminger <stephen@networkplumber.org>
      Signed-off-by: Julian Anastasov <ja@ssi.bg>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      [ kamal: backport to 3.19-stable: no CONFIG_NET_INGRESS; context ]
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
    • ipv6: Make MLD packets to only be processed locally · 6b0a0675
      Angga authored
      commit 4c938d22 upstream.
      
      Before commit daad1512 ("ipv6: Make ipv6_is_mld() inline and use it
      from ip6_mc_input().") MLD packets were only processed locally. After the
      change, a copy of MLD packet goes through ip6_mr_input, causing
      MRT6MSG_NOCACHE message to be generated to user space.
      
      Make MLD packet only processed locally.
      
      Fixes: daad1512 ("ipv6: Make ipv6_is_mld() inline and use it from ip6_mc_input().")
      Signed-off-by: Hermin Anggawijaya <hermin.anggawijaya@alliedtelesis.co.nz>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
    • dma-debug: skip debug_dma_assert_idle() when disabled · be2a25be
      Haggai Eran authored
      commit c9d120b0 upstream.
      
      If dma-debug is disabled due to a memory error, DMA unmaps do not affect
      the dma_active_cacheline radix tree anymore, and debug_dma_assert_idle()
      can print false warnings.
      
      Disable debug_dma_assert_idle() when dma_debug_disabled() is true.
      Signed-off-by: Haggai Eran <haggaie@mellanox.com>
      Fixes: 0abdd7a8 ("dma-debug: introduce debug_dma_assert_idle()")
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: Vinod Koul <vinod.koul@intel.com>
      Cc: Russell King <rmk+kernel@arm.linux.org.uk>
      Cc: James Bottomley <JBottomley@Parallels.com>
      Cc: Florian Fainelli <f.fainelli@gmail.com>
      Cc: Sebastian Ott <sebott@linux.vnet.ibm.com>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Horia Geanta <horia.geanta@freescale.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
    • ARM: 8404/1: dma-mapping: fix off-by-one error in bitmap size check · 3767511c
      Marek Szyprowski authored
      commit 462859aa upstream.
      
      nr_bitmaps member of mapping structure stores the number of already
      allocated bitmaps and it is interpreted as loop iterator (it starts from
      0 not from 1), so a comparison against number of possible bitmap
      extensions should include this fact. This patch fixes this by changing
      the extension failure condition. This issue has been introduced by
      commit 4d852ef8 ("arm: dma-mapping: Add
      support to extend DMA IOMMU mappings").
      Reported-by: Hyungwon Hwang <human.hwang@samsung.com>
      Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
      Reviewed-by: Hyungwon Hwang <human.hwang@samsung.com>
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
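The boundary condition described in this message can be sketched with a small fixed-size table. The names `struct mapping`, `extend_mapping()` and `MAX_BITMAPS` are hypothetical, and the capacity of 4 is an assumption for the example.

```c
#include <stdlib.h>

#define MAX_BITMAPS 4  /* assumed capacity for the sketch */

struct mapping {
    unsigned long *bitmaps[MAX_BITMAPS];
    int nr_bitmaps;   /* bitmaps already allocated; also the next free slot */
    int extensions;   /* total number of slots that may be used */
};

/*
 * Allocate one more bitmap. nr_bitmaps is a zero-based index of the next
 * slot, so the mapping is full once nr_bitmaps == extensions; testing with
 * ">" instead of ">=" would write one slot past the end.
 */
int extend_mapping(struct mapping *m)
{
    if (m->nr_bitmaps >= m->extensions)
        return -1;

    m->bitmaps[m->nr_bitmaps] = calloc(1, sizeof(unsigned long));
    if (!m->bitmaps[m->nr_bitmaps])
        return -1;

    m->nr_bitmaps++;
    return 0;
}
```

Exactly `extensions` calls succeed, and the next call fails without touching memory outside the array.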
    • genirq: Prevent resend to interrupts marked IRQ_NESTED_THREAD · ec397083
      Thomas Gleixner authored
      commit 75a06189 upstream.
      
      The resend mechanism happily calls the interrupt handler of interrupts
      which are marked IRQ_NESTED_THREAD from softirq context. This can
      result in crashes because the interrupt handler is not the proper way
      to invoke the device handlers. They must be invoked via
      handle_nested_irq.
      
      Prevent the resend even if the interrupt has no valid parent irq
      set. It's better to have a lost interrupt than a crashing machine.
      Reported-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
    • drm/radeon/ci: silence a harmless PCC warning · 66554a00
      Alex Deucher authored
      commit bda5e3e9 upstream.
      
      This has been a source of confusion.  Make it debug only.
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
    • drm/radeon: fix user ptr race condition · 971e37f7
      Christian König authored
      commit 12f1384d upstream.
      
      Port of amdgpu patch 9298e52f.
      Signed-off-by: Christian König <christian.koenig@amd.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
    • drm/radeon: Don't flush the GART TLB if rdev->gart.ptr == NULL · dd77c755
      Michel Dänzer authored
      commit 233709d2 upstream.
      
      This can be the case when the GPU is powered off, e.g. via vgaswitcheroo
      or runpm. When the GPU is powered up again, radeon_gart_table_vram_pin
      flushes the TLB after setting rdev->gart.ptr to non-NULL.
      
      Fixes panic on powering off R7xx GPUs.
      
      Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=61529
      Reviewed-by: Christian König <christian.koenig@amd.com>
      Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
      Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
    • scsi: fix host max depth checking for the 'queue_depth' sysfs interface · 1ed9f752
      Jens Axboe authored
      commit 1278dd68 upstream.
      
      Commit 1e6f2416 changed the scsi sysfs 'queue_depth' code to
      reject depths higher than the scsi host template setting. But lots
      of hosts set this to 1, and update the settings in the scsi host
      when the controller/devices probing happens.
      
      This breaks (at least) mpt2sas and mpt3sas runtime setting of queue
      depth, returning EINVAL for all settings but '1'. And once it's set to
      1, there's no way to go back up.
      
      Fixes: 1e6f2416 "scsi: don't allow setting of queue_depth bigger than can_queue"
      Signed-off-by: Jens Axboe <axboe@fb.com>
      Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: James Bottomley <JBottomley@Odin.com>
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
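The distinction between the static template value and the value set after probing can be sketched like this. It is an illustration only: `struct host_limits` and `check_queue_depth()` are hypothetical, and checking against the probed host value is an assumption about the shape of the fix, based on the description above.

```c
#include <errno.h>

/* Hypothetical, simplified host description. */
struct host_limits {
    int template_can_queue; /* static default, often 1 */
    int can_queue;          /* value updated after probing */
};

/*
 * Validate a requested queue depth against the limit the host reports
 * after probing, not against the static template default.
 */
int check_queue_depth(const struct host_limits *host, int depth)
{
    if (depth < 1 || depth > host->can_queue)
        return -EINVAL;
    return 0;
}
```

A host whose template says 1 but whose probed limit is higher can then be set to any depth up to that probed limit, and back again.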
    • st: null pointer dereference panic caused by use after kref_put by st_open · 7c4479ca
      Seymour, Shane M authored
      commit e7ac6c66 upstream.
      
      Two SLES11 SP3 servers encountered similar crashes simultaneously
      following some kind of SAN/tape target issue:
      
      ...
      qla2xxx [0000:81:00.0]-801c:3: Abort command issued nexus=3:0:2 --  1 2002.
      qla2xxx [0000:81:00.0]-801c:3: Abort command issued nexus=3:0:2 --  1 2002.
      qla2xxx [0000:81:00.0]-8009:3: DEVICE RESET ISSUED nexus=3:0:2 cmd=ffff882f89c2c7c0.
      qla2xxx [0000:81:00.0]-800c:3: do_reset failed for cmd=ffff882f89c2c7c0.
      qla2xxx [0000:81:00.0]-800f:3: DEVICE RESET FAILED: Task management failed nexus=3:0:2 cmd=ffff882f89c2c7c0.
      qla2xxx [0000:81:00.0]-8009:3: TARGET RESET ISSUED nexus=3:0:2 cmd=ffff882f89c2c7c0.
      qla2xxx [0000:81:00.0]-800c:3: do_reset failed for cmd=ffff882f89c2c7c0.
      qla2xxx [0000:81:00.0]-800f:3: TARGET RESET FAILED: Task management failed nexus=3:0:2 cmd=ffff882f89c2c7c0.
      qla2xxx [0000:81:00.0]-8012:3: BUS RESET ISSUED nexus=3:0:2.
      qla2xxx [0000:81:00.0]-802b:3: BUS RESET SUCCEEDED nexus=3:0:2.
      qla2xxx [0000:81:00.0]-505f:3: Link is operational (8 Gbps).
      qla2xxx [0000:81:00.0]-8018:3: ADAPTER RESET ISSUED nexus=3:0:2.
      qla2xxx [0000:81:00.0]-00af:3: Performing ISP error recovery - ha=ffff88bf04d18000.
       rport-3:0-0: blocked FC remote port time out: removing target and saving binding
      qla2xxx [0000:81:00.0]-505f:3: Link is operational (8 Gbps).
      qla2xxx [0000:81:00.0]-8017:3: ADAPTER RESET SUCCEEDED nexus=3:0:2.
       rport-2:0-0: blocked FC remote port time out: removing target and saving binding
      sg_rq_end_io: device detached
      BUG: unable to handle kernel NULL pointer dereference at 00000000000002a8
      IP: [<ffffffff8133b268>] __pm_runtime_idle+0x28/0x90
      PGD 7e6586f067 PUD 7e5af06067 PMD 0 [1739975.390354] Oops: 0002 [#1] SMP
      CPU 0
      ...
      Supported: No, Proprietary modules are loaded [1739975.390463]
      Pid: 27965, comm: ABCD Tainted: PF           X 3.0.101-0.29-default #1 HP ProLiant DL580 Gen8
      RIP: 0010:[<ffffffff8133b268>]  [<ffffffff8133b268>] __pm_runtime_idle+0x28/0x90
      RSP: 0018:ffff8839dc1e7c68  EFLAGS: 00010202
      RAX: 0000000000000000 RBX: ffff883f0592fc00 RCX: 0000000000000090
      RDX: 0000000000000000 RSI: 0000000000000004 RDI: 0000000000000138
      RBP: 0000000000000138 R08: 0000000000000010 R09: ffffffff81bd39d0
      R10: 00000000000009c0 R11: ffffffff81025790 R12: 0000000000000001
      R13: ffff883022212b80 R14: 0000000000000004 R15: ffff883022212b80
      FS:  00007f8e54560720(0000) GS:ffff88407f800000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      CR2: 00000000000002a8 CR3: 0000007e6ced6000 CR4: 00000000001407f0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      Process ABCD (pid: 27965, threadinfo ffff8839dc1e6000, task ffff883592e0c640)
      Stack:
       ffff883f0592fc00 00000000fffffffa 0000000000000001 ffff883022212b80
       ffff883eff772400 ffffffffa03fa309 0000000000000000 0000000000000000
       ffffffffa04003a0 ffff883f063196c0 ffff887f0379a930 ffffffff8115ea1e
      Call Trace:
       [<ffffffffa03fa309>] st_open+0x129/0x240 [st]
       [<ffffffff8115ea1e>] chrdev_open+0x13e/0x200
       [<ffffffff811588a8>] __dentry_open+0x198/0x310
       [<ffffffff81167d74>] do_last+0x1f4/0x800
       [<ffffffff81168fe9>] path_openat+0xd9/0x420
       [<ffffffff8116946c>] do_filp_open+0x4c/0xc0
       [<ffffffff8115a00f>] do_sys_open+0x17f/0x250
       [<ffffffff81468d92>] system_call_fastpath+0x16/0x1b
       [<00007f8e4f617fd0>] 0x7f8e4f617fcf
      Code: eb d3 90 48 83 ec 28 40 f6 c6 04 48 89 6c 24 08 4c 89 74 24 20 48 89 fd 48 89 1c 24 4c 89 64 24 10 41 89 f6 4c 89 6c 24 18 74 11 <f0> ff 8f 70 01 00 00 0f 94 c0 45 31 ed 84 c0 74 2b 4c 8d a5 a0
      RIP  [<ffffffff8133b268>] __pm_runtime_idle+0x28/0x90
       RSP <ffff8839dc1e7c68>
      CR2: 00000000000002a8
      
      Analysis reveals the cause of the crash to be due to STp->device
      being NULL. The pointer was NULLed via scsi_tape_put(STp) when it
      calls scsi_tape_release(). In st_open() we jump to err_out after
      scsi_block_when_processing_errors() completes and returns the
      device as offline (sdev_state was SDEV_DEL):
      
      1180 /* Open the device. Needs to take the BKL only because of incrementing the SCSI host
      1181    module count. */
      1182 static int st_open(struct inode *inode, struct file *filp)
      1183 {
      1184         int i, retval = (-EIO);
      1185         int resumed = 0;
      1186         struct scsi_tape *STp;
      1187         struct st_partstat *STps;
      1188         int dev = TAPE_NR(inode);
      1189         char *name;
      ...
      1217         if (scsi_autopm_get_device(STp->device) < 0) {
      1218                 retval = -EIO;
      1219                 goto err_out;
      1220         }
      1221         resumed = 1;
      1222         if (!scsi_block_when_processing_errors(STp->device)) {
      1223                 retval = (-ENXIO);
      1224                 goto err_out;
      1225         }
      ...
      1264  err_out:
      1265         normalize_buffer(STp->buffer);
      1266         spin_lock(&st_use_lock);
      1267         STp->in_use = 0;
      1268         spin_unlock(&st_use_lock);
      1269         scsi_tape_put(STp); <-- STp->device = 0 after this
      1270         if (resumed)
      1271                 scsi_autopm_put_device(STp->device);
      1272         return retval;
      
      The ref count for the struct scsi_tape had already been reduced
      to 1 when the .remove method of the st module had been called.
      The kref_put() in scsi_tape_put() caused scsi_tape_release()
      to be called:
      
      0266 static void scsi_tape_put(struct scsi_tape *STp)
      0267 {
      0268         struct scsi_device *sdev = STp->device;
      0269
      0270         mutex_lock(&st_ref_mutex);
      0271         kref_put(&STp->kref, scsi_tape_release); <-- calls this
      0272         scsi_device_put(sdev);
      0273         mutex_unlock(&st_ref_mutex);
      0274 }
      
      In scsi_tape_release() the struct scsi_device in the struct
      scsi_tape gets set to NULL:
      
      4273 static void scsi_tape_release(struct kref *kref)
      4274 {
      4275         struct scsi_tape *tpnt = to_scsi_tape(kref);
      4276         struct gendisk *disk = tpnt->disk;
      4277
      4278         tpnt->device = NULL; <<<---- where the dev is nulled
      4279
      4280         if (tpnt->buffer) {
      4281                 normalize_buffer(tpnt->buffer);
      4282                 kfree(tpnt->buffer->reserved_pages);
      4283                 kfree(tpnt->buffer);
      4284         }
      4285
      4286         disk->private_data = NULL;
      4287         put_disk(disk);
      4288         kfree(tpnt);
      4289         return;
      4290 }
      
      Although the problem was reported on SLES11.3 the problem appears
      in linux-next as well.
      
      The crash is fixed by reordering the code so we no longer access
      the struct scsi_tape after the kref_put() is done on it in st_open().
      Signed-off-by: Shane Seymour <shane.seymour@hp.com>
      Signed-off-by: Darren Lavender <darren.lavender@hp.com>
      Reviewed-by: Johannes Thumshirn <jthumshirn@suse.com>
      Acked-by: Kai Mäkisara <kai.makisara@kolumbus.fi>
      Signed-off-by: James Bottomley <JBottomley@Odin.com>
      Signed-off-by: Kamal Mostafa <kamal@canonical.com>
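The reordering described in this message, finishing all accesses to an object before dropping what may be the last reference, can be sketched in user space. The types and functions here (`struct device`, `struct tape`, `tape_put()`, `open_error_path()`) are hypothetical stand-ins, and a plain integer replaces the kernel's kref.

```c
#include <stdlib.h>

struct device { int pm_usage; };

/* Hypothetical refcounted object that owns a pointer to a device. */
struct tape {
    int refcount;
    struct device *device;
};

/* Drop one reference; the object is freed when the count reaches zero. */
void tape_put(struct tape *t)
{
    if (--t->refcount == 0) {
        t->device = NULL;
        free(t);
    }
}

/*
 * Error path of an open routine. Everything that needs `t` happens before
 * tape_put(); after the put, `t` may already be freed and must not be read.
 */
int open_error_path(struct tape *t, int resumed)
{
    struct device *dev = t->device;  /* read while the reference is held */

    if (resumed)
        dev->pm_usage--;             /* undo the earlier runtime-PM get */
    tape_put(t);                     /* last use of t */
    return -1;
}
```

The original ordering read `STp->device` after the put, which is the access that faulted in the trace above.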