1. 05 Jan, 2016 20 commits
    • Merge branch 'sctp-transport-rhashtable' · 33c15297
      David S. Miller authored
      Xin Long says:
      
      ====================
      sctp: use transport hashtable to replace association's with rhashtable
      
      in telecom deployments, a server is typically connected to by thousands
      of clients. but if a server with only one endpoint (udp style) uses the
      same sport and dport to communicate with every client, every assoc on
      the server is hashed into the same chain of the global assoc hashtable,
      because we currently use only dport and sport as the hash key.
      
      when a packet is received, sctp_rcv tries to find the assoc by sport and
      dport. since that chain is too long to search quickly, performance drops
      sharply. some test data follows:
      
      in server:
      $./ss [start a udp style server there]
      in client:
      $./cc [start 2500 sockets to connect server with same port and different ip,
             and use one of them to send data to server]
      
      ===== test on net-next
      -- perf top
      server:
        55.73%  [kernel]             [k] sctp_assoc_is_match
         6.80%  [kernel]             [k] sctp_assoc_lookup_paddr
         4.81%  [kernel]             [k] sctp_v4_cmp_addr
         3.12%  [kernel]             [k] _raw_spin_unlock_irqrestore
         1.94%  [kernel]             [k] sctp_cmp_addr_exact
      
      client:
        46.01%  [kernel]                    [k] sctp_endpoint_lookup_assoc
         5.55%  libc-2.17.so                [.] __libc_calloc
         5.39%  libc-2.17.so                [.] _int_free
         3.92%  libc-2.17.so                [.] _int_malloc
         3.23%  [kernel]                    [k] __memset
      
      -- spent time
      time is 487s, send pkt is 10000000
      
      we need to change how the hash key is calculated: using lport +
      rport + paddr as the hash key avoids this issue.
      
      besides, this patchset replaces the association hashtable with a
      transport hashtable and does lookups through the rhashtable api: get
      the transport first, then get the association by t->asoc. it also
      makes tcp style work better.
      
      ===== test with this patchset:
      -- perf top
      server:
        15.98%  [kernel]                 [k] _raw_spin_unlock_irqrestore
         9.92%  [kernel]                 [k] __pv_queued_spin_lock_slowpath
         7.22%  [kernel]                 [k] copy_user_generic_string
         2.38%  libpthread-2.17.so       [.] __recvmsg_nocancel
         1.88%  [kernel]                 [k] sctp_recvmsg
      
      client:
        11.90%  [kernel]                   [k] sctp_hash_cmp
         8.52%  [kernel]                   [k] rht_deferred_worker
         4.94%  [kernel]                   [k] __pv_queued_spin_lock_slowpath
         3.95%  [kernel]                   [k] sctp_bind_addr_match
         2.49%  [kernel]                   [k] __memset
      
      -- spent time
      time is 22s, send pkt is 10000000
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sctp: remove the local_bh_disable/enable in sctp_endpoint_lookup_assoc · c79c0666
      Xin Long authored
      sctp_endpoint_lookup_assoc is called under the protection of the sock
      lock, so there is no need to call local_bh_disable in this function.
      remove the calls.
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sctp: drop the old assoc hashtable of sctp · b5eff712
      Xin Long authored
      the transport hashtable replaces the association hashtable, so the
      association hashtable is no longer used in sctp. drop the code for it.
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sctp: apply rhashtable api to sctp procfs · 39f66a7d
      Xin Long authored
      Traverse the transport rhashtable, and get each association only once
      by skipping any transport for which assoc->peer.primary_path != transport.
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sctp: apply rhashtable api to send/recv path · 4f008781
      Xin Long authored
      apply the lookup apis to two functions, __sctp_endpoint_lookup_assoc
      and __sctp_lookup_association. where the lookup is invoked under the
      protection of the sock lock it is safe, but sctp_lookup_association
      needs to call rcu_read_lock() and check t->dead to protect it.
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • sctp: add the rhashtable apis for sctp global transport hashtable · d6c0256a
      Xin Long authored
      the transport hashtable will replace the association hashtable for
      lookups: look up the transport, then get the association by t->asoc.
      the rhashtable apis are used because rhashtable is resizable, scalable
      and uses rcu.
      
      lport + rport + paddr is the base hash key used to locate the chain,
      with net used to separate one netns from another; laddr is then
      compared to find the target.
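      
      a minimal user-space sketch of why the key matters. this is a toy
      model, not the kernel code: struct flow, the two bucket functions and
      the xor/mask mixing are illustrative assumptions, and the kernel hashes
      differently. with peers that differ only in address, a key built from
      the ports alone puts every entry in one chain, while adding paddr
      spreads the entries across the table.

```c
#include <assert.h>
#include <stdint.h>

#define NBUCKETS 1024u /* power of two, so masking selects a bucket */

/* Hypothetical flow key: one peer address plus local and remote ports. */
struct flow {
    uint32_t paddr; /* peer IPv4 address, host byte order */
    uint16_t lport; /* local port */
    uint16_t rport; /* remote port */
};

typedef uint32_t (*bucket_fn)(const struct flow *);

/* Old scheme in this model: only the two ports feed the key. */
static uint32_t bucket_ports_only(const struct flow *f)
{
    uint32_t key = ((uint32_t)f->lport << 16) | f->rport;
    return key & (NBUCKETS - 1);
}

/* New scheme in this model: the peer address is mixed into the key. */
static uint32_t bucket_ports_and_paddr(const struct flow *f)
{
    uint32_t key = (((uint32_t)f->lport << 16) | f->rport) ^ f->paddr;
    return key & (NBUCKETS - 1);
}

/* Count how many buckets are hit by n_peers flows that share both ports. */
static unsigned int buckets_used(bucket_fn fn, unsigned int n_peers)
{
    unsigned char seen[NBUCKETS] = {0};
    unsigned int used = 0;

    for (unsigned int i = 0; i < n_peers; i++) {
        /* Consecutive peer addresses, identical ports. */
        struct flow f = { .paddr = 0x0a000001u + i, .lport = 5000, .rport = 5000 };
        uint32_t b = fn(&f);

        assert(b < NBUCKETS);
        if (!seen[b]) {
            seen[b] = 1;
            used++;
        }
    }
    return used;
}
```

      in this model, buckets_used(bucket_ports_only, 2500) reports a single
      bucket (one chain of 2500 entries), while
      buckets_used(bucket_ports_and_paddr, 2500) touches all 1024 buckets.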
      
      this patch provides the lookup functions:
      - sctp_epaddr_lookup_transport
      - sctp_addrs_lookup_transport
      
      hash/unhash functions:
      - sctp_hash_transport
      - sctp_unhash_transport
      
      init/destroy functions:
      - sctp_transport_hashtable_init
      - sctp_transport_hashtable_destroy
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'faster-soreuseport' · 6a5ef90c
      David S. Miller authored
      Craig Gallek says:
      
      ====================
      Faster SO_REUSEPORT
      
      This series contains two optimizations for the SO_REUSEPORT feature:
      Faster lookup when selecting a socket for an incoming packet and
      the ability to select the socket from the group using a BPF program.
      
      This series only includes the UDP path.  I plan to submit a follow-up
      including the TCP path if the implementation in this series is
      acceptable.
      
      Changes in v4:
      - pskb_may_pull is unnecessary with pskb_pull (per Alexei Starovoitov)
      
      Changes in v3:
      - skb_pull_inline -> pskb_pull (per Alexei Starovoitov)
      - reuseport_attach* -> sk_reuseport_attach* and simple return statement
        syntax change (per Daniel Borkmann)
      
      Changes in v2:
      - Fix ARM build; remove unnecessary include.
      - Handle case where protocol header is not in linear section (per
        Alexei Starovoitov).
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • soreuseport: BPF selection functional test · 3ca8e402
      Craig Gallek authored
      This program will build classic and extended BPF programs and
      validate the socket selection logic when used with
      SO_ATTACH_REUSEPORT_CBPF and SO_ATTACH_REUSEPORT_EBPF.
      
      It also validates the re-programming flow and several edge cases.
      Signed-off-by: Craig Gallek <kraig@google.com>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • soreuseport: setsockopt SO_ATTACH_REUSEPORT_[CE]BPF · 538950a1
      Craig Gallek authored
      Expose socket options for setting a classic or extended BPF program
      for use when selecting sockets in an SO_REUSEPORT group.  These options
      can be used on the first socket to belong to a group before bind or
      on any socket in the group after bind.
      
      This change includes refactoring of the existing sk_filter code to
      allow reuse of the existing BPF filter validation checks.
      Signed-off-by: Craig Gallek <kraig@google.com>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • soreuseport: fast reuseport UDP socket selection · e32ea7e7
      Craig Gallek authored
      Include a struct sock_reuseport instance when a UDP socket binds to
      a specific address for the first time with the reuseport flag set.
      When selecting a socket for an incoming UDP packet, use the information
      available in sock_reuseport if present.
      
      This required adding an additional field to the UDP source address
      equality function to differentiate between exact and wildcard matches.
      The original use case allowed wildcard matches when checking for
      existing port uses during bind.  The new use case of adding a socket
      to a reuseport group requires exact address matching.
      
      Performance test (using a machine with 2 CPU sockets and a total of
      48 cores):  Create reuseport groups of varying size.  Use one socket
      from this group per user thread (pinning each thread to a different
      core) calling recvmmsg in a tight loop.  Record number of messages
      received per second while saturating a 10G link.
        10 sockets: 18% increase (~2.8M -> 3.3M pkts/s)
        20 sockets: 14% increase (~2.9M -> 3.3M pkts/s)
        40 sockets: 13% increase (~3.0M -> 3.4M pkts/s)
      
      This work is based off a similar implementation written by
      Ying Cai <ycai@google.com> for implementing policy-based reuseport
      selection.
      Signed-off-by: Craig Gallek <kraig@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • soreuseport: define reuseport groups · ef456144
      Craig Gallek authored
      struct sock_reuseport is an optional shared structure referenced by each
      socket belonging to a reuseport group.  When a socket is bound to an
      address/port not yet in use and the reuseport flag has been set, the
      structure will be allocated and attached to the newly bound socket.
      When subsequent calls to bind are made for the same address/port, the
      shared structure will be updated to include the new socket and the
      newly bound socket will reference the group structure.
      
      Previously, when an incoming packet was destined for a reuseport group,
      all sockets in the same group needed to be considered before a
      dispatching decision was made.  With this structure, an appropriate
      socket can be found after looking up just one socket in the group.
      
      This shared structure will also allow for more complicated decisions to
      be made when selecting a socket (eg a BPF filter).
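      
      A simplified user-space model of the idea, not the kernel's
      struct sock_reuseport: the field names, the MAX_SOCKS limit, the use of
      plain ints to stand in for sockets, and the multiply-shift scaling of
      the flow hash are all assumptions made for illustration. Once any
      member of the group has been found, one array index picks the
      receiving socket without visiting the other members.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_SOCKS 128u /* hypothetical fixed capacity for this model */

/* Shared state referenced by every socket bound to the same address/port. */
struct reuseport_group {
    unsigned int num_socks;
    int socks[MAX_SOCKS]; /* ints stand in for socket pointers */
};

/* Called on bind: append the newly bound socket to the group. */
static int group_add(struct reuseport_group *g, int sock)
{
    if (g->num_socks == MAX_SOCKS)
        return -1; /* group is full */
    g->socks[g->num_socks++] = sock;
    return 0;
}

/* Map a 32-bit hash onto [0, n) without a division. */
static uint32_t scale_hash(uint32_t hash, uint32_t n)
{
    return (uint32_t)(((uint64_t)hash * n) >> 32);
}

/* Pick the receiving socket for a packet from its flow hash alone. */
static int group_select(const struct reuseport_group *g, uint32_t flow_hash)
{
    assert(g->num_socks > 0);
    return g->socks[scale_hash(flow_hash, g->num_socks)];
}
```

      A policy such as a BPF program would replace scale_hash() as the source
      of the index, which is the kind of more complicated decision the shared
      structure makes possible.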
      
      This work is based off a similar implementation written by
      Ying Cai <ycai@google.com> for implementing policy-based reuseport
      selection.
      Signed-off-by: Craig Gallek <kraig@google.com>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge branch 'mlxsw-fixes' · ebb3cf41
      David S. Miller authored
      Jiri Pirko says:
      
      ====================
      mlxsw: couple of fixes
      
      Couple of fixes from Ido.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mlxsw: spectrum: Change bridge port attributes only when bridged · 6c72a3d0
      Ido Schimmel authored
      Bridge port attributes are offloaded to hardware when invoked with SELF
      flag set, but it really makes no sense to reflect them when port is not
      bridged.
      
      Allow a user to change these attributes only when port is bridged and
      initialize them correctly when joining or leaving a bridge.
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mlxsw: spectrum: Set bridge status in appropriate functions · 5a8f4525
      Ido Schimmel authored
      Set the bridge status of physical ports in the appropriate functions, to
      be consistent with LAG join/leave and vPorts joining/leaving bridge.
      
      Also, remove the error messages in these two functions, as we already
      emit errors in both the single functions they call.
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mlxsw: spectrum: Return NOTIFY_BAD on bridge failure · 78124078
      Ido Schimmel authored
      It is possible for us to fail when joining or leaving a bridge, so let
      the user know about that by returning NOTIFY_BAD, as already done for
      LAG join/leave and 802.1D bridges.
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mlxsw: spectrum: Initialize PVID only once · 7b31abe7
      Ido Schimmel authored
      We set PVID to 1 in mlxsw_sp_port_vlan_init(), so we can remove this
      statement.
      Signed-off-by: Ido Schimmel <idosch@mellanox.com>
      Signed-off-by: Jiri Pirko <jiri@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • chelsio: constify cphy_ops structures · 46f85a92
      Julia Lawall authored
      The cphy_ops structures are never modified, so declare them as const.
      
      Done with the help of Coccinelle.
      Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • fsl/fman: allow modular build · 46678612
      Arnd Bergmann authored
      ARM allmodconfig fails because of the addition of the FMAN driver:
      
      drivers/built-in.o: In function `dtsec_restart_autoneg':
      binder.c:(.text+0x173328): undefined reference to `mdiobus_read'
      binder.c:(.text+0x173348): undefined reference to `mdiobus_write'
      drivers/built-in.o: In function `dtsec_config':
      binder.c:(.text+0x173d24): undefined reference to `of_phy_find_device'
      drivers/built-in.o: In function `init_phy':
      binder.c:(.text+0x1763b0): undefined reference to `of_phy_connect'
      drivers/built-in.o: In function `stop':
      binder.c:(.text+0x176014): undefined reference to `phy_stop'
      drivers/built-in.o: In function `start':
      binder.c:(.text+0x176078): undefined reference to `phy_start'
      
      The reason is that the driver uses PHYLIB, but that is a loadable
      module here, and fman itself is built-in.
      
      This patch makes it possible to configure fman as a module as well
      so we don't change the status of PHYLIB in an allmodconfig kernel,
      and it adds a 'select PHYLIB' statement to ensure that phylib is
      always built-in when fman is.
      
      The driver uses "builtin_platform_driver(fman_driver);", which means
      it cannot be unloaded, but it's still possible to have it as a loadable
      module that gets loaded once and never removed.
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Fixes: 5adae51a ("fsl/fman: Add FMan MURAM support")
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: make ip6tunnel_xmit definition conditional · 0efeff29
      Arnd Bergmann authored
      Moving the caller of iptunnel_xmit_stats causes a build error in
      randconfig builds that disable CONFIG_INET:
      
      In file included from ../net/xfrm/xfrm_input.c:17:0:
      ../include/net/ip6_tunnel.h: In function 'ip6tunnel_xmit':
      ../include/net/ip6_tunnel.h:93:2: error: implicit declaration of function 'iptunnel_xmit_stats' [-Werror=implicit-function-declaration]
        iptunnel_xmit_stats(dev, pkt_len);
      
      The reason is that the iptunnel_xmit_stats definition is hidden
      inside #ifdef CONFIG_INET but the caller is not. We can change
      one or the other to fix it, and this patch adds a second #ifdef
      around ip6tunnel_xmit() to avoid seeing the invalid call.
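      
      The pattern can be sketched in a few lines of standalone C. This is an
      illustration only: the *_model names, the tx_bytes counter and
      tunnel_xmit_available() are hypothetical stand-ins, not the kernel
      functions. The point is that the caller sits under the same guard as
      its callee, so a build without CONFIG_INET never sees the call.

```c
#include <assert.h>

/* CONFIG_INET is deliberately left undefined, as in the failing randconfig. */

#ifdef CONFIG_INET
/* Stand-in for the helper whose definition only exists with CONFIG_INET. */
static long tx_bytes;

static void iptunnel_xmit_stats_model(int pkt_len)
{
    assert(pkt_len >= 0);
    tx_bytes += pkt_len;
}
#endif

#ifdef CONFIG_INET
/* The caller gets the same guard, so it is never compiled without its callee. */
static void ip6tunnel_xmit_model(int pkt_len)
{
    iptunnel_xmit_stats_model(pkt_len);
}
#endif

/* Reports whether the guarded pair was compiled in. */
static int tunnel_xmit_available(void)
{
#ifdef CONFIG_INET
    return 1;
#else
    return 0;
#endif
}
```

      Removing only the second #ifdef while CONFIG_INET is undefined
      reproduces the implicit-declaration error quoted above.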
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Fixes: 039f5062 ("ip_tunnel: Move stats update to iptunnel_xmit()")
      Acked-by: Pravin B Shelar <pshelar@nicira.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge tag 'nfc-next-4.5-1' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/nfc-next · 15ab90f4
      David S. Miller authored
      Samuel Ortiz says:
      
      ====================
      NFC 4.5 pull request
      
      This is the first NFC pull request for 4.5 and it brings:
      
      - A new driver for the STMicroelectronics ST95HF NFC chipset.
        The ST95HF is an NFC digital transceiver with an embedded analog
        front-end and as such relies on the Linux NFC digital
        implementation. This is the 3rd user of the NFC digital stack.
      
      - ACPI support for the ST st-nci and st21nfca drivers.
      
      - A small improvement for the nfcsim driver, as we can now tune
        the Rx delay through sysfs.
      
      - A bunch of minor cleanups and small fixes from Christophe Ricard,
        for a few drivers and the NFC core code.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
  2. 04 Jan, 2016 18 commits
  3. 31 Dec, 2015 2 commits