1. 22 Feb, 2018 19 commits
    • powerpc/radix: Remove trace_tlbie call from radix__flush_tlb_all · 5b98d314
      Mahesh Salgaonkar authored
      commit 8d81296c upstream.
      
      radix__flush_tlb_all() is called only in the kexec path in real mode, and
      any tracepoint enabled at this stage will make kexec fail.

      To verify, enable the tlbie tracepoint before kexec:
      
      $ echo 1 > /sys/kernel/debug/tracing/events/powerpc/tlbie/enable
      == kexec into new kernel and kexec fails.
      
      Fix this by not calling trace_tlbie from radix__flush_tlb_all().
      
      Fixes: 0428491c ("powerpc/mm: Trace tlbie(l) instructions")
      Cc: stable@vger.kernel.org # v4.13+
      Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
      Acked-by: Balbir Singh <bsingharora@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ocfs2: try a blocking lock before return AOP_TRUNCATED_PAGE · 2e7e8bd8
      Gang He authored
      commit ff26cc10 upstream.
      
      If we can't get the inode lock immediately in
      ocfs2_inode_lock_with_page() when reading a page, we should not return
      directly, since that leads to a softlockup when the kernel is built
      without CONFIG_PREEMPT.  Instead, take a blocking lock and unlock it
      immediately before returning.  This avoids wasting CPU on repeated
      retries, improves fairness in acquiring the lock among multiple nodes,
      and increases efficiency when the same file is modified frequently from
      multiple nodes.
      
      The softlockup crash (with /proc/sys/kernel/softlockup_panic set to 1)
      looks like:
      
        Kernel panic - not syncing: softlockup: hung tasks
        CPU: 0 PID: 885 Comm: multi_mmap Tainted: G L 4.12.14-6.1-default #1
        Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
        Call Trace:
          <IRQ>
          dump_stack+0x5c/0x82
          panic+0xd5/0x21e
          watchdog_timer_fn+0x208/0x210
          __hrtimer_run_queues+0xcc/0x200
          hrtimer_interrupt+0xa6/0x1f0
          smp_apic_timer_interrupt+0x34/0x50
          apic_timer_interrupt+0x96/0xa0
          </IRQ>
         RIP: 0010:unlock_page+0x17/0x30
         RSP: 0000:ffffaf154080bc88 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff10
         RAX: dead000000000100 RBX: fffff21e009f5300 RCX: 0000000000000004
         RDX: dead0000000000ff RSI: 0000000000000202 RDI: fffff21e009f5300
         RBP: 0000000000000000 R08: 0000000000000000 R09: ffffaf154080bb00
         R10: ffffaf154080bc30 R11: 0000000000000040 R12: ffff993749a39518
         R13: 0000000000000000 R14: fffff21e009f5300 R15: fffff21e009f5300
          ocfs2_inode_lock_with_page+0x25/0x30 [ocfs2]
          ocfs2_readpage+0x41/0x2d0 [ocfs2]
          filemap_fault+0x12b/0x5c0
          ocfs2_fault+0x29/0xb0 [ocfs2]
          __do_fault+0x1a/0xa0
          __handle_mm_fault+0xbe8/0x1090
          handle_mm_fault+0xaa/0x1f0
          __do_page_fault+0x235/0x4b0
          trace_do_page_fault+0x3c/0x110
          async_page_fault+0x28/0x30
         RIP: 0033:0x7fa75ded638e
         RSP: 002b:00007ffd6657db18 EFLAGS: 00010287
         RAX: 000055c7662fb700 RBX: 0000000000000001 RCX: 000055c7662fb700
         RDX: 0000000000001770 RSI: 00007fa75e909000 RDI: 000055c7662fb700
         RBP: 0000000000000003 R08: 000000000000000e R09: 0000000000000000
         R10: 0000000000000483 R11: 00007fa75ded61b0 R12: 00007fa75e90a770
         R13: 000000000000000e R14: 0000000000001770 R15: 0000000000000000
      
      As for performance, both the test run time and the CPU utilization
      decrease; the detailed data follows.  I ran the multi_mmap test case
      from the ocfs2-test package on a three-node cluster.
      
      Before applying this patch:
          PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
         2754 ocfs2te+  20   0  170248   6980   4856 D 80.73 0.341   0:18.71 multi_mmap
         1505 root      rt   0  222236 123060  97224 S 2.658 6.015   0:01.44 corosync
            5 root      20   0       0      0      0 S 1.329 0.000   0:00.19 kworker/u8:0
           95 root      20   0       0      0      0 S 1.329 0.000   0:00.25 kworker/u8:1
         2728 root      20   0       0      0      0 S 0.997 0.000   0:00.24 jbd2/sda1-33
         2721 root      20   0       0      0      0 S 0.664 0.000   0:00.07 ocfs2dc-3C8CFD4
         2750 ocfs2te+  20   0  142976   4652   3532 S 0.664 0.227   0:00.28 mpirun
      
        ocfs2test@tb-node2:~>multiple_run.sh -i ens3 -k ~/linux-4.4.21-69.tar.gz -o ~/ocfs2mullog -C hacluster -s pcmk -n tb-node2,tb-node1,tb-node3 -d /dev/sda1 -b 4096 -c 32768 -t multi_mmap /mnt/shared
        Tests with "-b 4096 -C 32768"
        Thu Dec 28 14:44:52 CST 2017
        multi_mmap..................................................Passed.
        Runtime 783 seconds.
      
      After applying this patch:
      
          PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
         2508 ocfs2te+  20   0  170248   6804   4680 R 54.00 0.333   0:55.37 multi_mmap
          155 root      20   0       0      0      0 S 2.667 0.000   0:01.20 kworker/u8:3
           95 root      20   0       0      0      0 S 2.000 0.000   0:01.58 kworker/u8:1
         2504 ocfs2te+  20   0  142976   4604   3480 R 1.667 0.225   0:01.65 mpirun
            5 root      20   0       0      0      0 S 1.000 0.000   0:01.36 kworker/u8:0
         2482 root      20   0       0      0      0 S 1.000 0.000   0:00.86 jbd2/sda1-33
          299 root       0 -20       0      0      0 S 0.333 0.000   0:00.13 kworker/2:1H
          335 root       0 -20       0      0      0 S 0.333 0.000   0:00.17 kworker/1:1H
          535 root      20   0   12140   7268   1456 S 0.333 0.355   0:00.34 haveged
         1282 root      rt   0  222284 123108  97224 S 0.333 6.017   0:01.33 corosync
      
        ocfs2test@tb-node2:~>multiple_run.sh -i ens3 -k ~/linux-4.4.21-69.tar.gz -o ~/ocfs2mullog -C hacluster -s pcmk -n tb-node2,tb-node1,tb-node3 -d /dev/sda1 -b 4096 -c 32768 -t multi_mmap /mnt/shared
        Tests with "-b 4096 -C 32768"
        Thu Dec 28 15:04:12 CST 2017
        multi_mmap..................................................Passed.
        Runtime 487 seconds.
      
      Link: http://lkml.kernel.org/r/1514447305-30814-1-git-send-email-ghe@suse.com
      Fixes: 1cce4df0 ("ocfs2: do not lock/unlock() inode DLM lock")
      Signed-off-by: Gang He <ghe@suse.com>
      Reviewed-by: Eric Ren <zren@suse.com>
      Acked-by: alex chen <alex.chen@huawei.com>
      Acked-by: piaojun <piaojun@huawei.com>
      Cc: Mark Fasheh <mfasheh@versity.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Junxiao Bi <junxiao.bi@oracle.com>
      Cc: Joseph Qi <jiangqi903@gmail.com>
      Cc: Changwei Ge <ge.changwei@h3c.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • mwifiex: resolve reset vs. remove()/shutdown() deadlocks · 1ec4c78e
      Brian Norris authored
      commit a64e7a79 upstream.
      
      Commit b014e96d ("PCI: Protect pci_error_handlers->reset_notify()
      usage with device_lock()") resolves races between driver reset and
      removal, but it introduces some new deadlock problems. If we see a
      timeout while we've already started suspending, removing, or shutting
      down the driver, we might see:
      
      (a) a worker thread, running mwifiex_pcie_work() ->
          mwifiex_pcie_card_reset_work() -> pci_reset_function()
      (b) a removal thread, running mwifiex_pcie_remove() ->
          mwifiex_free_adapter() -> mwifiex_unregister() ->
          mwifiex_cleanup_pcie() -> cancel_work_sync(&card->work)
      
      Unfortunately, mwifiex_pcie_remove() already holds the device lock that
      pci_reset_function() is now requesting, and so we see a deadlock.
      
      It's necessary to cancel and synchronize our outstanding work before
      tearing down the driver, so we can't have this work wait indefinitely
      for the lock.
      
      It's reasonable to only "try" to reset here, since this will mostly
      happen for cases where it's already difficult to reset the firmware
      anyway (e.g., while we're suspending or powering off the system). And if
      reset *really* needs to happen, we can always try again later.
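
      The "try" approach can be sketched in user space as follows. This is a
      minimal illustration under stated assumptions, not the driver's code:
      device_lock_flag is a toy stand-in for the device lock, and try_reset()
      and resets_done are hypothetical names.

      ```c
      #include <errno.h>
      #include <stdatomic.h>

      /* Toy stand-in for the device lock; not the kernel's device lock. */
      static atomic_flag device_lock_flag = ATOMIC_FLAG_INIT;
      static int resets_done;

      /*
       * Reset only if the lock is free. If the removal path already holds
       * it, return -EBUSY instead of blocking, so a remover that waits for
       * this work item to finish cannot deadlock against it.
       */
      static int try_reset(void)
      {
          if (atomic_flag_test_and_set(&device_lock_flag))
              return -EBUSY;    /* lock held elsewhere: skip this reset */
          resets_done++;        /* the real reset would happen here */
          atomic_flag_clear(&device_lock_flag);
          return 0;
      }
      ```

      A caller that gets -EBUSY simply gives up for now and may retry later,
      which matches the "try again later" reasoning above.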
      
      Fixes: b014e96d ("PCI: Protect pci_error_handlers->reset_notify() usage with device_lock()")
      Cc: <stable@vger.kernel.org>
      Cc: Xinming Hu <huxm@marvell.com>
      Signed-off-by: Brian Norris <briannorris@chromium.org>
      Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • PM / devfreq: Propagate error from devfreq_add_device() · 62def1d6
      Bjorn Andersson authored
      commit d1bf2d30 upstream.
      
      Propagate the error of devfreq_add_device() in devm_devfreq_add_device()
      rather than statically returning ENOMEM. This makes it slightly faster
      to pinpoint the cause of a returned error.
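
      The difference can be sketched as follows. This is a simplified
      illustration, not the devfreq code: the kernel functions return
      pointers, while this sketch uses plain int return values, and
      add_device(), managed_add_static() and managed_add_propagating() are
      hypothetical names.

      ```c
      #include <errno.h>

      /* Hypothetical stand-in for the inner call: 0 or a negative errno. */
      static int add_device(int fail_with)
      {
          return fail_with ? -fail_with : 0;
      }

      /* Before: every failure collapses into -ENOMEM, hiding the cause. */
      static int managed_add_static(int fail_with)
      {
          if (add_device(fail_with) < 0)
              return -ENOMEM;
          return 0;
      }

      /* After: the inner error code is passed through unchanged. */
      static int managed_add_propagating(int fail_with)
      {
          int ret = add_device(fail_with);

          if (ret < 0)
              return ret;
          return 0;
      }
      ```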
      
      Fixes: 8cd84092 ("PM / devfreq: Add resource-managed function for devfreq device")
      Cc: stable@vger.kernel.org
      Acked-by: Chanwoo Choi <cw00.choi@samsung.com>
      Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
      Signed-off-by: MyungJoo Ham <myungjoo.ham@samsung.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • swiotlb: suppress warning when __GFP_NOWARN is set · 37efa60e
      Christian König authored
      commit d0bc0c2a upstream.
      
      TTM tries to allocate coherent memory in chunks of 2MB first to improve
      TLB efficiency and falls back to allocating 4K pages if that fails.
      
      Suppress the warning when the 2MB allocation fails, since there is a
      valid fallback path.
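
      The idea can be sketched as below. This is an illustration only, not
      the swiotlb code: ALLOC_NOWARN is a hypothetical stand-in for
      __GFP_NOWARN, and report_alloc_failure() and warnings_printed are
      invented for the example.

      ```c
      #include <stdio.h>

      #define ALLOC_NOWARN 0x1u   /* hypothetical stand-in for __GFP_NOWARN */

      static int warnings_printed;

      /* Warn about a failed allocation unless the caller opted out. */
      static void report_alloc_failure(unsigned int flags, size_t size)
      {
          if (flags & ALLOC_NOWARN)
              return;   /* caller has a fallback path, stay quiet */
          fprintf(stderr, "allocation of %zu bytes failed\n", size);
          warnings_printed++;
      }
      ```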

      Signed-off-by: Christian König <christian.koenig@amd.com>
      Reported-by: Mike Galbraith <efault@gmx.de>
      Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Bug: https://bugs.freedesktop.org/show_bug.cgi?id=104082
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • cpufreq: powernv: Dont assume distinct pstate values for nominal and pmin · 8e56a935
      Shilpasri G Bhat authored
      commit 3fa4680b upstream.
      
      Some OpenPOWER boxes can have the same pstate value for the nominal and
      pmin pstates. On these boxes the current code does not initialize the
      'powernv_pstate_info.min' variable, which results in erroneous CPU
      frequency reporting. This patch fixes the problem.
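
      A minimal sketch of the pitfall, assuming a table scan that records
      the index of each named pstate. The struct and function names here are
      hypothetical and the driver's real code differs; the point is only
      that the matches must be tested independently rather than as mutually
      exclusive branches.

      ```c
      struct pstate_info {
          int min;
          int nominal;
          int max;
      };

      /*
       * Record the table index of each named pstate. Independent `if`
       * statements let one entry fill several roles; an else-if chain
       * would leave `min` unset when it equals `nominal`.
       */
      static void find_pstates(const int *ids, int n, int id_min,
                               int id_nominal, int id_max,
                               struct pstate_info *info)
      {
          info->min = info->nominal = info->max = -1;
          for (int i = 0; i < n; i++) {
              if (ids[i] == id_max)
                  info->max = i;
              if (ids[i] == id_nominal)
                  info->nominal = i;
              if (ids[i] == id_min)
                  info->min = i;
          }
      }
      ```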
      
      Fixes: 09ca4c9b (cpufreq: powernv: Replacing pstate_id with frequency table index)
      Reported-by: Alvin Wang <wangat@tw.ibm.com>
      Signed-off-by: Shilpasri G Bhat <shilpa.bhat@linux.vnet.ibm.com>
      Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
      Cc: 4.8+ <stable@vger.kernel.org> # 4.8+
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • RDMA/rxe: Fix rxe_qp_cleanup() · 75a3f11c
      Bart Van Assche authored
      commit bb3ffb7a upstream.
      
      rxe_qp_cleanup() can sleep, so it must run in thread context and not
      in atomic context. This patch prevents the following bug from being
      triggered:
      
      Kernel BUG at 00000000560033f3 [verbose debug info unavailable]
      BUG: sleeping function called from invalid context at net/core/sock.c:2761
      in_atomic(): 1, irqs_disabled(): 0, pid: 7, name: ksoftirqd/0
      INFO: lockdep is turned off.
      Preemption disabled at:
      [<00000000b6e69628>] __do_softirq+0x4e/0x540
      CPU: 0 PID: 7 Comm: ksoftirqd/0 Not tainted 4.15.0-rc7-dbg+ #4
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.0.0-prebuilt.qemu-project.org 04/01/2014
      Call Trace:
       dump_stack+0x85/0xbf
       ___might_sleep+0x177/0x260
       lock_sock_nested+0x1d/0x90
       inet_shutdown+0x2e/0xd0
       rxe_qp_cleanup+0x107/0x140 [rdma_rxe]
       rxe_elem_release+0x18/0x80 [rdma_rxe]
       rxe_requester+0x1cf/0x11b0 [rdma_rxe]
       rxe_do_task+0x78/0xf0 [rdma_rxe]
       tasklet_action+0x99/0x270
       __do_softirq+0xc0/0x540
       run_ksoftirqd+0x1c/0x70
       smpboot_thread_fn+0x1be/0x270
       kthread+0x117/0x130
       ret_from_fork+0x24/0x30

      Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
      Cc: Moni Shoua <monis@mellanox.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • RDMA/rxe: Fix a race condition in rxe_requester() · 571cb36f
      Bart Van Assche authored
      commit 65567e41 upstream.
      
      The rxe driver works as follows:
      * The send queue, receive queue and completion queues are implemented as
        circular buffers.
      * ib_post_send() and ib_post_recv() calls are serialized through a spinlock.
      * Removing elements from various queues happens from tasklet
        context. Tasklets are guaranteed to run on at most one CPU. This serializes
        access to these queues. See also rxe_completer(), rxe_requester() and
        rxe_responder().
      * rxe_completer() processes the skbs queued onto qp->resp_pkts.
      * rxe_requester() handles the send queue (qp->sq.queue).
      * rxe_responder() processes the skbs queued onto qp->req_pkts.
      
      Since rxe_drain_req_pkts() processes qp->req_pkts, calling
      rxe_drain_req_pkts() from rxe_requester() is racy. Hence this patch.

      Reported-by: Moni Shoua <monis@mellanox.com>
      Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Doug Ledford <dledford@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • RDMA/rxe: Fix a race condition related to the QP error state · 7b4e8a46
      Bart Van Assche authored
      commit 6f301e06 upstream.
      
      The following sequence:
      * Change queue pair state into IB_QPS_ERR.
      * Post a work request on the queue pair.
      
      Triggers the following race condition in the rdma_rxe driver:
      * rxe_qp_error() triggers an asynchronous call of rxe_completer(), the function
        that examines the QP send queue.
      * rxe_post_send() posts a work request on the QP send queue.
      
      If rxe_completer() runs prior to rxe_post_send(), it will drain the send
      queue and the driver will assume no further action is necessary.
      However, once we post the send to the send queue, because the queue is
      in error, no send completion will ever happen and the send will get
      stuck.  In order to process the send, we need to make sure that
      rxe_completer() gets run after a send is posted to a queue pair in an
      error state.  This patch ensures that happens.

      Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
      Cc: Moni Shoua <monis@mellanox.com>
      Cc: <stable@vger.kernel.org> # v4.8
      Signed-off-by: Doug Ledford <dledford@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • kselftest: fix OOM in memory compaction test · 7dd2dbdd
      Arnd Bergmann authored
      commit 4c1baad2 upstream.
      
      Running the compaction_test sometimes results in out-of-memory
      failures. When I debugged this, it turned out that the code to
      reset the number of hugepages to the initial value is simply
      broken since we write into an open sysctl file descriptor
      multiple times without seeking back to the start.
      
      Adding the lseek here fixes the problem.
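
      The pattern can be sketched as follows. rewrite_value() is a
      hypothetical helper written for illustration, not the selftest's
      actual code; it assumes a descriptor that is kept open and written
      more than once.

      ```c
      #include <string.h>
      #include <unistd.h>

      /*
       * Rewrite the value held in an already-open file from offset 0.
       * Without the lseek(), a second write() would continue at the
       * offset left behind by the first one instead of starting over.
       * Returns 0 on success, -1 on failure.
       */
      static int rewrite_value(int fd, const char *value)
      {
          size_t len = strlen(value);

          if (lseek(fd, 0, SEEK_SET) < 0)
              return -1;
          if (write(fd, value, len) != (ssize_t)len)
              return -1;
          return 0;
      }
      ```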
      
      Cc: stable@vger.kernel.org
      Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
      Link: https://bugs.linaro.org/show_bug.cgi?id=3145
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • selftests: seccomp: fix compile error seccomp_bpf · 9c2e7a04
      Anders Roxell authored
      commit 912ec316 upstream.
      
      aarch64-linux-gnu-gcc -Wl,-no-as-needed -Wall
          -lpthread seccomp_bpf.c -o seccomp_bpf
      seccomp_bpf.c: In function 'tracer_ptrace':
      seccomp_bpf.c:1720:12: error: '__NR_open' undeclared
          (first use in this function)
        if (nr == __NR_open)
                  ^~~~~~~~~
      seccomp_bpf.c:1720:12: note: each undeclared identifier is reported
          only once for each function it appears in
      In file included from seccomp_bpf.c:48:0:
      seccomp_bpf.c: In function 'TRACE_syscall_ptrace_syscall_dropped':
      seccomp_bpf.c:1795:39: error: '__NR_open' undeclared
          (first use in this function)
        EXPECT_SYSCALL_RETURN(EPERM, syscall(__NR_open));
                                             ^
      open(2) is a legacy syscall, replaced with openat(2) since 2.6.16.
      Thus new architectures in the kernel, such as arm64, don't implement
      these legacy syscalls.
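
      One portable option is to go through openat(2) instead. The sketch
      below assumes a Linux target; raw_open() is a hypothetical helper and
      is not part of seccomp_bpf.c.

      ```c
      #define _GNU_SOURCE
      #include <fcntl.h>
      #include <sys/syscall.h>
      #include <unistd.h>

      /*
       * Open a file through the raw openat(2) syscall, which is also
       * available on architectures without the legacy __NR_open.
       * AT_FDCWD makes relative paths behave as they do with open(2).
       */
      static int raw_open(const char *path, int flags)
      {
          return (int)syscall(__NR_openat, AT_FDCWD, path, flags);
      }
      ```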
      
      Fixes: a33b2d03 ("selftests/seccomp: Add tests for basic ptrace actions")
      Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
      Tested-by: Naresh Kamboju <naresh.kamboju@linaro.org>
      Cc: stable@vger.kernel.org
      Acked-by: Kees Cook <keescook@chromium.org>
      Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • IB/core: Avoid a potential OOPs for an unused optional parameter · 1d6eb826
      Michael J. Ruhl authored
      commit 2ff124d5 upstream.
      
      The ev_file is an optional parameter for CQ creation. If the parameter
      is not passed, the ev_file pointer will be NULL.  Using that pointer
      to set the cq_context will result in an OOPs.
      
      Verify that ev_file is not NULL before using.
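
      A minimal sketch of such a guard, using simplified hypothetical types
      rather than the real uverbs structures:

      ```c
      #include <stddef.h>

      /* Simplified, hypothetical stand-ins for the real structures. */
      struct event_queue { int pending; };
      struct event_file { int fd; struct event_queue ev_queue; };
      struct cq { void *cq_context; };

      /* Only derive the context from ev_file when one was supplied. */
      static void set_cq_context(struct cq *cq, struct event_file *ev_file)
      {
          cq->cq_context = ev_file ? &ev_file->ev_queue : NULL;
      }
      ```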
      
      Cc: <stable@vger.kernel.org> # 4.14.x
      Fixes: 9ee79fce ("IB/core: Add completion queue (cq) object actions")
      Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
      Reviewed-by: Ira Weiny <ira.weiny@intel.com>
      Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
      Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • IB/core: Fix ib_wc structure size to remain in 64 bytes boundary · d40ad865
      Bodong Wang authored
      commit cd2a6e7d upstream.
      
      Changing slid from u16 to u32 made sizeof(struct ib_wc) cross the
      64B boundary, which causes more cache misses. This patch rearranges
      the fields so that the size remains 64B.
      
      Pahole output before this change:
      
      struct ib_wc {
              union {
                      u64                wr_id;                /*           8 */
                      struct ib_cqe *    wr_cqe;               /*           8 */
              };                                               /*     0     8 */
              enum ib_wc_status          status;               /*     8     4 */
              enum ib_wc_opcode          opcode;               /*    12     4 */
              u32                        vendor_err;           /*    16     4 */
              u32                        byte_len;             /*    20     4 */
              struct ib_qp *             qp;                   /*    24     8 */
              union {
                      __be32             imm_data;             /*           4 */
                      u32                invalidate_rkey;      /*           4 */
              } ex;                                            /*    32     4 */
              u32                        src_qp;               /*    36     4 */
              int                        wc_flags;             /*    40     4 */
              u16                        pkey_index;           /*    44     2 */
      
              /* XXX 2 bytes hole, try to pack */
      
              u32                        slid;                 /*    48     4 */
              u8                         sl;                   /*    52     1 */
              u8                         dlid_path_bits;       /*    53     1 */
              u8                         port_num;             /*    54     1 */
              u8                         smac[6];              /*    55     6 */
      
              /* XXX 1 byte hole, try to pack */
      
              u16                        vlan_id;              /*    62     2 */
              /* --- cacheline 1 boundary (64 bytes) --- */
              u8                         network_hdr_type;     /*    64     1 */
      
              /* size: 72, cachelines: 2, members: 17 */
              /* sum members: 62, holes: 2, sum holes: 3 */
              /* padding: 7 */
              /* last cacheline: 8 bytes */
      };
      
      Pahole output after this change:
      
      struct ib_wc {
              union {
                      u64                wr_id;                /*           8 */
                      struct ib_cqe *    wr_cqe;               /*           8 */
              };                                               /*     0     8 */
              enum ib_wc_status          status;               /*     8     4 */
              enum ib_wc_opcode          opcode;               /*    12     4 */
              u32                        vendor_err;           /*    16     4 */
              u32                        byte_len;             /*    20     4 */
              struct ib_qp *             qp;                   /*    24     8 */
              union {
                      __be32             imm_data;             /*           4 */
                      u32                invalidate_rkey;      /*           4 */
              } ex;                                            /*    32     4 */
              u32                        src_qp;               /*    36     4 */
              u32                        slid;                 /*    40     4 */
              int                        wc_flags;             /*    44     4 */
              u16                        pkey_index;           /*    48     2 */
              u8                         sl;                   /*    50     1 */
              u8                         dlid_path_bits;       /*    51     1 */
              u8                         port_num;             /*    52     1 */
              u8                         smac[6];              /*    53     6 */
      
              /* XXX 1 byte hole, try to pack */
      
              u16                        vlan_id;              /*    60     2 */
              u8                         network_hdr_type;     /*    62     1 */
      
              /* size: 64, cachelines: 1, members: 17 */
              /* sum members: 62, holes: 1, sum holes: 1 */
              /* padding: 1 */
      };
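
      The two layouts can be reproduced in user space with the sketch
      below. It assumes an LP64 target such as x86-64 (8-byte pointers,
      4-byte enums); the struct and enum names are stand-ins, and kernel
      types are replaced by their stdint equivalents.

      ```c
      #include <stddef.h>
      #include <stdint.h>

      enum wc_status { WC_STATUS_OK };    /* stand-in for enum ib_wc_status */
      enum wc_opcode { WC_OPCODE_SEND };  /* stand-in for enum ib_wc_opcode */

      /* Field order before the change. */
      struct wc_before {
          union { uint64_t wr_id; void *wr_cqe; };
          enum wc_status status;
          enum wc_opcode opcode;
          uint32_t vendor_err;
          uint32_t byte_len;
          void *qp;
          union { uint32_t imm_data; uint32_t invalidate_rkey; } ex;
          uint32_t src_qp;
          int wc_flags;
          uint16_t pkey_index;  /* a 2-byte hole follows: slid is 4-aligned */
          uint32_t slid;
          uint8_t sl;
          uint8_t dlid_path_bits;
          uint8_t port_num;
          uint8_t smac[6];
          uint16_t vlan_id;
          uint8_t network_hdr_type;
      };

      /* Field order after the change: slid moved next to src_qp. */
      struct wc_after {
          union { uint64_t wr_id; void *wr_cqe; };
          enum wc_status status;
          enum wc_opcode opcode;
          uint32_t vendor_err;
          uint32_t byte_len;
          void *qp;
          union { uint32_t imm_data; uint32_t invalidate_rkey; } ex;
          uint32_t src_qp;
          uint32_t slid;
          int wc_flags;
          uint16_t pkey_index;
          uint8_t sl;
          uint8_t dlid_path_bits;
          uint8_t port_num;
          uint8_t smac[6];
          uint16_t vlan_id;
          uint8_t network_hdr_type;
      };
      ```

      Under those assumptions, sizeof(struct wc_before) is 72 and
      sizeof(struct wc_after) is 64, matching the pahole output above.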
      
      Fixes: 7db20ecd ("IB/core: Change wc.slid from 16 to 32 bits")
      Signed-off-by: Bodong Wang <bodong@mellanox.com>
      Reviewed-by: Parav Pandit <parav@mellanox.com>
      Signed-off-by: Leon Romanovsky <leon@kernel.org>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • IB/core: Fix two kernel warnings triggered by rxe registration · 18c0ee90
      Bart Van Assche authored
      commit 02ee9da3 upstream.
      
      Eliminate the WARN_ONs that produce the following two warnings when
      registering an rxe device:
      
      WARNING: CPU: 2 PID: 1005 at drivers/infiniband/core/device.c:449 ib_register_device+0x591/0x640 [ib_core]
      CPU: 2 PID: 1005 Comm: run_tests Not tainted 4.15.0-rc4-dbg+ #2
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.0.0-prebuilt.qemu-project.org 04/01/2014
      RIP: 0010:ib_register_device+0x591/0x640 [ib_core]
      Call Trace:
       rxe_register_device+0x3c6/0x470 [rdma_rxe]
       rxe_add+0x543/0x5e0 [rdma_rxe]
       rxe_net_add+0x37/0xb0 [rdma_rxe]
       rxe_param_set_add+0x5a/0x120 [rdma_rxe]
       param_attr_store+0x5e/0xc0
       module_attr_store+0x19/0x30
       sysfs_kf_write+0x3d/0x50
       kernfs_fop_write+0x116/0x1a0
       __vfs_write+0x23/0x120
       vfs_write+0xbe/0x1b0
       SyS_write+0x44/0xa0
       entry_SYSCALL_64_fastpath+0x23/0x9a
      
      WARNING: CPU: 2 PID: 1005 at drivers/infiniband/core/sysfs.c:1279 ib_device_register_sysfs+0x11d/0x160 [ib_core]
      CPU: 2 PID: 1005 Comm: run_tests Tainted: G        W        4.15.0-rc4-dbg+ #2
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.0.0-prebuilt.qemu-project.org 04/01/2014
      RIP: 0010:ib_device_register_sysfs+0x11d/0x160 [ib_core]
      Call Trace:
       ib_register_device+0x3f7/0x640 [ib_core]
       rxe_register_device+0x3c6/0x470 [rdma_rxe]
       rxe_add+0x543/0x5e0 [rdma_rxe]
       rxe_net_add+0x37/0xb0 [rdma_rxe]
       rxe_param_set_add+0x5a/0x120 [rdma_rxe]
       param_attr_store+0x5e/0xc0
       module_attr_store+0x19/0x30
       sysfs_kf_write+0x3d/0x50
       kernfs_fop_write+0x116/0x1a0
       __vfs_write+0x23/0x120
       vfs_write+0xbe/0x1b0
       SyS_write+0x44/0xa0
       entry_SYSCALL_64_fastpath+0x23/0x9a
      
      The code should accept either a parent pointer or a fully specified DMA
      specification without producing warnings.
      
      Fixes: 99db9494 ("IB/core: Remove ib_device.dma_device")
      Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
      Cc: Leon Romanovsky <leon@kernel.org>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • IB/mlx4: Fix incorrectly releasing steerable UD QPs when have only ETH ports · ade57e90
      Jack Morgenstein authored
      commit 852f6927 upstream.
      
      Allocating steerable UD QPs depends on having at least one IB port,
      while releasing those QPs does not.
      
      As a result, when there are only ETH ports, the IB (RoCE) driver
      requests releasing a qp range whose base qp is zero, with
      qp count zero.
      
      When SR-IOV is enabled, and the VF driver is running on a VM over
      a hypervisor which treats such qp release calls as errors
      (rather than NOPs), we see lines in the VM message log like:
      
       mlx4_core 0002:00:02.0: Failed to release qp range base:0 cnt:0
      
      Fix this by adding a check for a zero count in mlx4_release_qp_range()
      (which thus treats releasing 0 qps as a nop), and eliminating the
      check for device managed flow steering when releasing steerable UD QPs.
      (Freeing ib_uc_qpns_bitmap unconditionally is also OK, since it
      remains NULL when steerable UD QPs are not allocated).
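
      The zero-count guard can be sketched as follows. This is an
      illustration only: release_qp_range() and issue_release_cmd() are
      hypothetical stand-ins, and the counter exists just to make the
      effect observable.

      ```c
      /* Counts commands that would reach the firmware or hypervisor. */
      static int release_cmds_issued;

      static void issue_release_cmd(int base_qpn, int cnt)
      {
          (void)base_qpn;
          (void)cnt;
          release_cmds_issued++;
      }

      /* Releasing zero QPs is a no-op, so no command is issued at all. */
      static void release_qp_range(int base_qpn, int cnt)
      {
          if (cnt == 0)
              return;
          issue_release_cmd(base_qpn, cnt);
      }
      ```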
      
      Fixes: 4196670b ("IB/mlx4: Don't allocate range of steerable UD QPs for Ethernet-only device")
      Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
      Signed-off-by: Leon Romanovsky <leon@kernel.org>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • IB/qib: Fix comparison error with qperf compare/swap test · 5a425546
      Mike Marciniszyn authored
      commit 87b3524c upstream.
      
      This failure exists with qib:
      
      ver_rc_compare_swap:
      mismatch, sequence 2, expected 123456789abcdef, got 0
      
      The request builder was using the incorrect inlines to
      build the request header resulting in incorrect data
      in the atomic header.
      
      Fix by using the appropriate inlines to create the request.
      
      Fixes: 261a4351 ("IB/qib,IB/hfi: Use core common header file")
      Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
      Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
      Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • IB/umad: Fix use of unprotected device pointer · 7a748f0b
      Jack Morgenstein authored
      commit f23a5350 upstream.
      
      The ib_write_umad() is protected by taking the umad file mutex.
      However, it accesses file->port->ib_dev -- which is protected only by the
      port's mutex (field file_mutex).
      
      The ib_umad_remove_one() calls ib_umad_kill_port() which sets
      port->ib_dev to NULL under the port mutex (NOT the file mutex).
      It then sets the mad agent to "dead" under the umad file mutex.
      
      This is a race condition -- because there is a window where
      port->ib_dev is NULL, while the agent is not "dead".
      
      As a result, we saw stack traces like:
      
      [16490.678059] BUG: unable to handle kernel NULL pointer dereference at 00000000000000b0
      [16490.678246] IP: ib_umad_write+0x29c/0xa3a [ib_umad]
      [16490.678333] PGD 0 P4D 0
      [16490.678404] Oops: 0000 [#1] SMP PTI
      [16490.678466] Modules linked in: rdma_ucm(OE) ib_ucm(OE) rdma_cm(OE) iw_cm(OE) ib_ipoib(OE) ib_cm(OE) ib_uverbs(OE) ib_umad(OE) mlx4_en(OE) ptp pps_core mlx4_ib(OE-) ib_core(OE) mlx4_core(OE) mlx_compat
      (OE) memtrack(OE) devlink mst_pciconf(OE) mst_pci(OE) netconsole nfsv3 nfs_acl nfs lockd grace fscache cfg80211 rfkill esp6_offload esp6 esp4_offload esp4 sunrpc kvm_intel kvm ppdev parport_pc irqbypass
      parport joydev i2c_piix4 virtio_balloon cirrus drm_kms_helper ttm drm e1000 serio_raw virtio_pci virtio_ring virtio ata_generic pata_acpi qemu_fw_cfg [last unloaded: mlxfw]
      [16490.679202] CPU: 4 PID: 3115 Comm: sminfo Tainted: G           OE   4.14.13-300.fc27.x86_64 #1
      [16490.679339] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu2 04/01/2014
      [16490.679477] task: ffff9cf753890000 task.stack: ffffaf70c26b0000
      [16490.679571] RIP: 0010:ib_umad_write+0x29c/0xa3a [ib_umad]
      [16490.679664] RSP: 0018:ffffaf70c26b3d90 EFLAGS: 00010202
      [16490.679747] RAX: 0000000000000010 RBX: ffff9cf75610fd80 RCX: 0000000000000000
      [16490.679856] RDX: 0000000000000001 RSI: 00007ffdf2bfd714 RDI: ffff9cf6bb2a9c00
      
      In the above trace, ib_umad_write is trying to dereference the NULL
      file->port->ib_dev pointer.
      
      Fix this by using the agent's device pointer (the device field
      in struct ib_mad_agent) -- which IS protected by the umad file mutex.
      
      Fixes: 44c58487 ("IB/core: Define 'ib' and 'roce' rdma_ah_attr types")
      Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
      Signed-off-by: Leon Romanovsky <leon@kernel.org>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • scsi: smartpqi: allow static build ("built-in") · e99306bb
      Steffen Weber authored
      commit dc2db1dc upstream.
      
      If CONFIG_SCSI_SMARTPQI=y then don't build this driver as a module.

      Signed-off-by: Steffen Weber <steffen.weber@gmail.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • tracing: Prevent PROFILE_ALL_BRANCHES when FORTIFY_SOURCE=y · b6f2efb8
      Randy Dunlap authored
      commit 68e76e03 upstream.
      
      I regularly get 50 MB - 60 MB files during kernel randconfig builds.
      These large files mostly contain (many repeats of; e.g., 124,594):
      
      In file included from ../include/linux/string.h:6:0,
                       from ../include/linux/uuid.h:20,
                       from ../include/linux/mod_devicetable.h:13,
                       from ../scripts/mod/devicetable-offsets.c:3:
      ../include/linux/compiler.h:64:4: warning: '______f' is static but declared in inline function 'strcpy' which is not static [enabled by default]
          ______f = {     \
          ^
      ../include/linux/compiler.h:56:23: note: in expansion of macro '__trace_if'
                             ^
      ../include/linux/string.h:425:2: note: in expansion of macro 'if'
        if (p_size == (size_t)-1 && q_size == (size_t)-1)
        ^
      
      This only happens when CONFIG_FORTIFY_SOURCE=y and
      CONFIG_PROFILE_ALL_BRANCHES=y, so prevent PROFILE_ALL_BRANCHES if
      FORTIFY_SOURCE=y.
      
      Link: http://lkml.kernel.org/r/9199446b-a141-c0c3-9678-a3f9107f2750@infradead.org
      Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  2. 16 Feb, 2018 21 commits