1. 23 Feb, 2018 5 commits
  2. 15 Feb, 2018 7 commits
    • Yonatan Cohen's avatar
      IB/mlx5: Implement fragmented completion queue (CQ) · 388ca8be
      Yonatan Cohen authored
      
      The current implementation of create CQ requires contiguous
      memory, such requirement is problematic once the memory is
      fragmented or the system is low in memory, it causes for
      failures in dma_zalloc_coherent().
      
      This patch implements new scheme of fragmented CQ to overcome
      this issue by introducing new type: 'struct mlx5_frag_buf_ctrl'
      to allocate fragmented buffers, rather than contiguous ones.
      
      Base the Completion Queues (CQs) on this new fragmented buffer.
      
      It fixes following crashes:
      kworker/29:0: page allocation failure: order:6, mode:0x80d0
      CPU: 29 PID: 8374 Comm: kworker/29:0 Tainted: G OE 3.10.0
      Workqueue: ib_cm cm_work_handler [ib_cm]
      Call Trace:
      [<>] dump_stack+0x19/0x1b
      [<>] warn_alloc_failed+0x110/0x180
      [<>] __alloc_pages_slowpath+0x6b7/0x725
      [<>] __alloc_pages_nodemask+0x405/0x420
      [<>] dma_generic_alloc_coherent+0x8f/0x140
      [<>] x86_swiotlb_alloc_coherent+0x21/0x50
      [<>] mlx5_dma_zalloc_coherent_node+0xad/0x110 [mlx5_core]
      [<>] ? mlx5_db_alloc_node+0x69/0x1b0 [mlx5_core]
      [<>] mlx5_buf_alloc_node+0x3e/0xa0 [mlx5_core]
      [<>] mlx5_buf_alloc+0x14/0x20 [mlx5_core]
      [<>] create_cq_kernel+0x90/0x1f0 [mlx5_ib]
      [<>] mlx5_ib_create_cq+0x3b0/0x4e0 [mlx5_ib]
      Signed-off-by: default avatarYonatan Cohen <yonatanc@mellanox.com>
      Reviewed-by: default avatarTariq Toukan <tariqt@mellanox.com>
      Signed-off-by: default avatarLeon Romanovsky <leon@kernel.org>
      Signed-off-by: default avatarSaeed Mahameed <saeedm@mellanox.com>
      388ca8be
    • Saeed Mahameed's avatar
      net/mlx5: Remove redundant EQ API exports · 3ec5693b
      Saeed Mahameed authored
      
      EQ structure and API is private to mlx5_core driver only, external
      drivers should not have access or the means to manipulate EQ objects.
      
      Remove redundant exports and move API functions out of the linux/mlx5
      include directory into the driver's mlx5_core.h private include file.
      Signed-off-by: default avatarSaeed Mahameed <saeedm@mellanox.com>
      Reviewed-by: default avatarGal Pressman <galp@mellanox.com>
      3ec5693b
    • Saeed Mahameed's avatar
      net/mlx5: Move CQ completion and event forwarding logic to eq.c · 3ac7afdb
      Saeed Mahameed authored
      
      Since CQ tree is now per EQ, CQ completion and event forwarding became
      specific implementation of EQ logic, this patch moves that logic to eq.c
      and makes those functions static.
      Signed-off-by: default avatarSaeed Mahameed <saeedm@mellanox.com>
      Reviewed-by: default avatarGal Pressman <galp@mellanox.com>
      3ac7afdb
    • Saeed Mahameed's avatar
      net/mlx5: CQ hold/put API · f105b45b
      Saeed Mahameed authored
      
      Now as the CQ table is per EQ, add an API to hold/put CQ to be used from
      eq.c in downstream patch.
      Signed-off-by: default avatarSaeed Mahameed <saeedm@mellanox.com>
      Reviewed-by: default avatarGal Pressman <galp@mellanox.com>
      f105b45b
    • Saeed Mahameed's avatar
      net/mlx5: EQ add/del CQ API · d5c07157
      Saeed Mahameed authored
      
      Add API to add/del CQ to/from EQs CQ table to be used in cq.c upon CQ
      creation/destruction, as CQ table is now private to eq.c.
      Signed-off-by: default avatarSaeed Mahameed <saeedm@mellanox.com>
      Reviewed-by: default avatarGal Pressman <galp@mellanox.com>
      d5c07157
    • Saeed Mahameed's avatar
      net/mlx5: Add missing likely/unlikely hints to cq events · d2ff4fa5
      Saeed Mahameed authored
      
      If a hardware event is targeting a CQ, that CQ should exist.
      Add unlikely to error handling flows.
      Signed-off-by: default avatarSaeed Mahameed <saeedm@mellanox.com>
      Reviewed-by: default avatarGal Pressman <galp@mellanox.com>
      d2ff4fa5
    • Saeed Mahameed's avatar
      net/mlx5: CQ Database per EQ · 02d92f79
      Saeed Mahameed authored
      
      Before this patch the driver had one CQ database protected via one
      spinlock, this spinlock is meant to synchronize between CQ
      adding/removing and CQ IRQ interrupt handling.
      
      On a system with large number of CPUs and on a work load that requires
      lots of interrupts, this global spinlock becomes a very nasty hotspot
      and introduces a contention between the active cores, which will
      significantly hurt performance and becomes a bottleneck that prevents
      seamless cpu scaling.
      
      To solve this we simply move the CQ database and its spinlock to be per
      EQ (IRQ), thus per core.
      
      Tested with:
      system: 2 sockets, 14 cores per socket, hyperthreading, 2x14x2=56 cores
      netperf command: ./super_netperf 200 -P 0 -t TCP_RR  -H <server> -l 30 -- -r 300,300 -o -s 1M,1M -S 1M,1M
      
      WITHOUT THIS PATCH:
      Average:     CPU    %usr   %nice    %sys %iowait    %irq   %soft %steal  %guest  %gnice   %idle
      Average:     all    4.32    0.00   36.15    0.09    0.00   34.02   0.00    0.00    0.00   25.41
      
      Samples: 2M of event 'cycles:pp', Event count (approx.): 1554616897271
      Overhead  Command          Shared Object                 Symbol
      +   14.28%  swapper          [kernel.vmlinux]              [k] intel_idle
      +   12.25%  swapper          [kernel.vmlinux]              [k] queued_spin_lock_slowpath
      +   10.29%  netserver        [kernel.vmlinux]              [k] queued_spin_lock_slowpath
      +    1.32%  netserver        [kernel.vmlinux]              [k] mlx5e_xmit
      
      WITH THIS PATCH:
      Average:     CPU    %usr   %nice    %sys %iowait    %irq   %soft  %steal  %guest  %gnice   %idle
      Average:     all    4.27    0.00   34.31    0.01    0.00   18.71    0.00    0.00    0.00   42.69
      
      Samples: 2M of event 'cycles:pp', Event count (approx.): 1498132937483
      Overhead  Command          Shared Object             Symbol
      +   23.33%  swapper          [kernel.vmlinux]          [k] intel_idle
      +    1.69%  netserver        [kernel.vmlinux]          [k] mlx5e_xmit
      Tested-by: default avatarSong Liu <songliubraving@fb.com>
      Signed-off-by: default avatarSaeed Mahameed <saeedm@mellanox.com>
      Reviewed-by: default avatarGal Pressman <galp@mellanox.com>
      02d92f79
  3. 11 Feb, 2018 9 commits
    • Linus Torvalds's avatar
      Linux 4.16-rc1 · 7928b2cb
      Linus Torvalds authored
      7928b2cb
    • Al Viro's avatar
      unify {de,}mangle_poll(), get rid of kernel-side POLL... · 7a163b21
      Al Viro authored
      
      except, again, POLLFREE and POLL_BUSY_LOOP.
      
      With this, we finally get to the promised end result:
      
       - POLL{IN,OUT,...} are plain integers and *not* in __poll_t, so any
         stray instances of ->poll() still using those will be caught by
         sparse.
      
       - eventpoll.c and select.c warning-free wrt __poll_t
      
       - no more kernel-side definitions of POLL... - userland ones are
         visible through the entire kernel (and used pretty much only for
         mangle/demangle)
      
       - same behavior as after the first series (i.e. sparc et.al. epoll(2)
         working correctly).
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      7a163b21
    • Linus Torvalds's avatar
      vfs: do bulk POLL* -> EPOLL* replacement · a9a08845
      Linus Torvalds authored
      
      This is the mindless scripted replacement of kernel use of POLL*
      variables as described by Al, done by this script:
      
          for V in IN OUT PRI ERR RDNORM RDBAND WRNORM WRBAND HUP RDHUP NVAL MSG; do
              L=`git grep -l -w POLL$V | grep -v '^t' | grep -v /um/ | grep -v '^sa' | grep -v '/poll.h$'|grep -v '^D'`
              for f in $L; do sed -i "-es/^\([^\"]*\)\(\<POLL$V\>\)/\\1E\\2/" $f; done
          done
      
      with de-mangling cleanups yet to come.
      
      NOTE! On almost all architectures, the EPOLL* constants have the same
      values as the POLL* constants do.  But they keyword here is "almost".
      For various bad reasons they aren't the same, and epoll() doesn't
      actually work quite correctly in some cases due to this on Sparc et al.
      
      The next patch from Al will sort out the final differences, and we
      should be all done.
      Scripted-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      a9a08845
    • Linus Torvalds's avatar
      Merge branch 'work.poll2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs · ee5daa13
      Linus Torvalds authored
      Pull more poll annotation updates from Al Viro:
       "This is preparation to solving the problems you've mentioned in the
        original poll series.
      
        After this series, the kernel is ready for running
      
            for V in IN OUT PRI ERR RDNORM RDBAND WRNORM WRBAND HUP RDHUP NVAL MSG; do
                  L=`git grep -l -w POLL$V | grep -v '^t' | grep -v /um/ | grep -v '^sa' | grep -v '/poll.h$'|grep -v '^D'`
                  for f in $L; do sed -i "-es/^\([^\"]*\)\(\<POLL$V\>\)/\\1E\\2/" $f; done
            done
      
        as a for bulk search-and-replace.
      
        After that, the kernel is ready to apply the patch to unify
        {de,}mangle_poll(), and then get rid of kernel-side POLL... uses
        entirely, and we should be all done with that stuff.
      
        Basically, that's what you suggested wrt KPOLL..., except that we can
        use EPOLL... instead - they already are arch-independent (and equal to
        what is currently kernel-side POLL...).
      
        After the preparations (in this series) switch to returning EPOLL...
        from ->poll() instances is completely mechanical and kernel-side
        POLL... can go away. The last step (killing kernel-side POLL... and
        unifying {de,}mangle_poll() has to be done after the
        search-and-replace job, since we need userland-side POLL... for
        unified {de,}mangle_poll(), thus the cherry-pick at the last step.
      
        After that we will have:
      
         - POLL{IN,OUT,...} *not* in __poll_t, so any stray instances of
           ->poll() still using those will be caught by sparse.
      
         - eventpoll.c and select.c warning-free wrt __poll_t
      
         - no more kernel-side definitions of POLL... - userland ones are
           visible through the entire kernel (and used pretty much only for
           mangle/demangle)
      
         - same behavior as after the first series (i.e. sparc et.al. epoll(2)
           working correctly)"
      
      * 'work.poll2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
        annotate ep_scan_ready_list()
        ep_send_events_proc(): return result via esed->res
        preparation to switching ->poll() to returning EPOLL...
        add EPOLLNVAL, annotate EPOLL... and event_poll->event
        use linux/poll.h instead of asm/poll.h
        xen: fix poll misannotation
        smc: missing poll annotations
      ee5daa13
    • Linus Torvalds's avatar
      Merge tag 'xtensa-20180211' of git://github.com/jcmvbkbc/linux-xtensa · 3fc928dc
      Linus Torvalds authored
      Pull xtense fix from Max Filippov:
       "Build fix for xtensa architecture with KASAN enabled"
      
      * tag 'xtensa-20180211' of git://github.com/jcmvbkbc/linux-xtensa:
        xtensa: fix build with KASAN
      3fc928dc
    • Linus Torvalds's avatar
      Merge tag 'nios2-v4.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/lftan/nios2 · 60d7a21a
      Linus Torvalds authored
      Pull nios2 update from Ley Foon Tan:
      
       - clean up old Kconfig options from defconfig
      
       - remove leading 0x and 0s from bindings notation in dts files
      
      * tag 'nios2-v4.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/lftan/nios2:
        nios2: defconfig: Cleanup from old Kconfig options
        nios2: dts: Remove leading 0x and 0s from bindings notation
      60d7a21a
    • Max Filippov's avatar
      xtensa: fix build with KASAN · f8d0cbf2
      Max Filippov authored
      The commit 917538e2 ("kasan: clean up KASAN_SHADOW_SCALE_SHIFT
      usage") removed KASAN_SHADOW_SCALE_SHIFT definition from
      include/linux/kasan.h and added it to architecture-specific headers,
      except for xtensa. This broke the xtensa build with KASAN enabled.
      Define KASAN_SHADOW_SCALE_SHIFT in arch/xtensa/include/asm/kasan.h
      
      Reported by: kbuild test robot <fengguang.wu@intel.com>
      Fixes: 917538e2
      
       ("kasan: clean up KASAN_SHADOW_SCALE_SHIFT usage")
      Acked-by: default avatarAndrey Konovalov <andreyknvl@google.com>
      Signed-off-by: default avatarMax Filippov <jcmvbkbc@gmail.com>
      f8d0cbf2
    • Krzysztof Kozlowski's avatar
      nios2: defconfig: Cleanup from old Kconfig options · e0691ebb
      Krzysztof Kozlowski authored
      Remove old, dead Kconfig option INET_LRO. It is gone since
      commit 7bbf3cae
      
       ("ipv4: Remove inet_lro library").
      Signed-off-by: default avatarKrzysztof Kozlowski <krzk@kernel.org>
      Acked-by: default avatarLey Foon Tan <ley.foon.tan@intel.com>
      e0691ebb
    • Mathieu Malaterre's avatar
      nios2: dts: Remove leading 0x and 0s from bindings notation · 5d13c731
      Mathieu Malaterre authored
      Improve the DTS files by removing all the leading "0x" and zeros to fix the
      following dtc warnings:
      
      Warning (unit_address_format): Node /XXX unit name should not have leading "0x"
      
      and
      
      Warning (unit_address_format): Node /XXX unit name should not have leading 0s
      
      Converted using the following command:
      
      find . -type f \( -iname *.dts -o -iname *.dtsi \) -exec sed -E -i -e "s/@0x([0-9a-fA-F\.]+)\s?\{/@\L\1 \{/g" -e "s/@0+([0-9a-fA-F\.]+)\s?\{/@\L\1 \{/g" {} +
      
      For simplicity, two sed expressions were used to solve each warnings separately.
      
      To make the regex expression more robust a few other issues were resolved,
      namely setting unit-address to lower case, and adding a whitespace before the
      the opening curly brace:
      
      https://elinux.org/Device_Tree_Linux#Linux_conventions
      
      This is a follow up to commit 4c9847b7
      
       ("dt-bindings: Remove leading 0x from bindings notation")
      Reported-by: default avatarDavid Daney <ddaney@caviumnetworks.com>
      Suggested-by: default avatarRob Herring <robh@kernel.org>
      Signed-off-by: default avatarMathieu Malaterre <malat@debian.org>
      Acked-by: default avatarLey Foon Tan <ley.foon.tan@intel.com>
      5d13c731
  4. 10 Feb, 2018 14 commits
  5. 09 Feb, 2018 5 commits
    • Linus Torvalds's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · c839682c
      Linus Torvalds authored
      Pull networking fixes from David Miller:
      
       1) Make allocations less aggressive in x_tables, from Minchal Hocko.
      
       2) Fix netfilter flowtable Kconfig deps, from Pablo Neira Ayuso.
      
       3) Fix connection loss problems in rtlwifi, from Larry Finger.
      
       4) Correct DRAM dump length for some chips in ath10k driver, from Yu
          Wang.
      
       5) Fix ABORT handling in rxrpc, from David Howells.
      
       6) Add SPDX tags to Sun networking drivers, from Shannon Nelson.
      
       7) Some ipv6 onlink handling fixes, from David Ahern.
      
       8) Netem packet scheduler interval calcualtion fix from Md. Islam.
      
       9) Don't put crypto buffers on-stack in rxrpc, from David Howells.
      
      10) Fix handling of error non-delivery status in netlink multicast
          delivery over multiple namespaces, from Nicolas Dichtel.
      
      11) Missing xdp flush in tuntap driver, from Jason Wang.
      
      12) Synchonize RDS protocol netns/module teardown with rds object
          management, from Sowini Varadhan.
      
      13) Add nospec annotation...
      c839682c
    • Linus Torvalds's avatar
      Merge tag 'nfs-for-4.16-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs · 82f0a41e
      Linus Torvalds authored
      Pull more NFS client updates from Trond Myklebust:
       "A few bugfixes and some small sunrpc latency/performance improvements
        before the merge window closes:
      
        Stable fixes:
      
         - fix an incorrect calculation of the RDMA send scatter gather
           element limit
      
         - fix an Oops when attempting to free resources after RDMA device
           removal
      
        Bugfixes:
      
         - SUNRPC: Ensure we always release the TCP socket in a timely fashion
           when the connection is shut down.
      
         - SUNRPC: Don't call __UDPX_INC_STATS() from a preemptible context
      
        Latency/Performance:
      
         - SUNRPC: Queue latency sensitive socket tasks to the less contended
           xprtiod queue
      
         - SUNRPC: Make the xprtiod workqueue unbounded.
      
         - SUNRPC: Make the rpciod workqueue unbounded"
      
      * tag 'nfs-for-4.16-2' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
        SUNRPC: Don't call __UDPX_INC_STATS() from a preemptible context
        fix parallelism for rpc tasks
        Make the xprtiod workqueue unbounded.
        SUNRPC: Queue latency-sensitive socket tasks to xprtiod
        SUNRPC: Ensure we always close the socket after a connection shuts down
        xprtrdma: Fix BUG after a device removal
        xprtrdma: Fix calculation of ri_max_send_sges
      82f0a41e
    • Linus Torvalds's avatar
      Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending · 858f45bf
      Linus Torvalds authored
      Pull SCSI target updates from Nicholas Bellinger:
       "The highlights include:
      
         - numerous target-core-user improvements related to queue full and
           timeout handling. (MNC)
      
         - prevent target-core-user corruption when invalid data page is
           requested. (MNC)
      
         - add target-core device action configfs attributes to allow
           user-space to trigger events separate from existing attributes
           exposed to end-users. (MNC)
      
         - fix iscsi-target NULL pointer dereference 4.6+ regression in CHAP
           error path. (David Disseldorp)
      
         - avoid target-core backend UNMAP callbacks if range is zero. (Andrei
           Vagin)
      
         - fix a iscsi-target 4.14+ regression related multiple PDU logins,
           that was exposed due to removal of TCP prequeue support. (Florian
           Westphal + MNC)
      
        Also, there is a iser-target bug still being worked on for post -rc1
        code to address a long standing issue resulting in persistent
        ib_post_send() failures, for RNICs with small max_send_sge"
      
      * 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (36 commits)
        iscsi-target: make sure to wake up sleeping login worker
        tcmu: Fix trailing semicolon
        tcmu: fix cmd user after free
        target: fix destroy device in target_configure_device
        tcmu: allow userspace to reset ring
        target core: add device action configfs files
        tcmu: fix error return code in tcmu_configure_device()
        target_core_user: add cmd id to broken ring message
        target: add SAM_STAT_BUSY sense reason
        tcmu: prevent corruption when invalid data page requested
        target: don't call an unmap callback if a range length is zero
        target/iscsi: avoid NULL dereference in CHAP auth error path
        cxgbit: call neigh_event_send() to update MAC address
        target: tcm_loop: Use seq_puts() in tcm_loop_show_info()
        target: tcm_loop: Delete an unnecessary return statement in tcm_loop_submission_work()
        target: tcm_loop: Delete two unnecessary variable initialisations in tcm_loop_issue_tmr()
        target: tcm_loop: Combine substrings for 26 messages
        target: tcm_loop: Improve a size determination in two functions
        target: tcm_loop: Delete an error message for a failed memory allocation in four functions
        sbp-target: Delete an error message for a failed memory allocation in three functions
        ...
      858f45bf
    • Linus Torvalds's avatar
      Merge tag 'trace-v4.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace · 8158c2ff
      Linus Torvalds authored
      Pull tracing fixes from Steven Rostedt:
       "Al Viro discovered some breakage with the parsing of the
        set_ftrace_filter as well as the removing of function probes.
      
        This fixes the code with Al's suggestions. I also added a few
        selftests to test the broken cases such that they wont happen
        again"
      
      * tag 'trace-v4.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
        selftests/ftrace: Add more tests for removing of function probes
        selftests/ftrace: Add some missing glob checks
        selftests/ftrace: Have reset_ftrace_filter handle multiple instances
        selftests/ftrace: Have reset_ftrace_filter handle modules
        tracing: Fix parsing of globs with a wildcard at the beginning
        ftrace: Remove incorrect setting of glob search field
      8158c2ff
    • Linus Torvalds's avatar
      Merge tag '4.16-minor-rc-SMB3-fixes' of git://git.samba.org/sfrench/cifs-2.6 · a2834832
      Linus Torvalds authored
      Pull cifs fixes from Steve French:
       "There are a couple additional security fixes that are still being
        tested that are not in this set."
      
      * tag '4.16-minor-rc-SMB3-fixes' of git://git.samba.org/sfrench/cifs-2.6:
        Add missing structs and defines from recent SMB3.1.1 documentation
        address lock imbalance warnings in smbdirect.c
        cifs: silence compiler warnings showing up with gcc-8.0.0
        Add some missing debug fields in server and tcon structs
      a2834832