1. 19 May, 2015 13 commits
  2. 15 May, 2015 4 commits
    • Frederic Danis's avatar
      Bluetooth: btbcm: Fix calls to __hci_cmd_sync() · 43b79209
      Frederic Danis authored
      Remove test of command reply status as it is already performed by
      __hci_cmd_sync().
      
      __hci_cmd_sync_ev() function already returns an error if it got a
      non-zero status either through a Command Complete or a Command
      Status event.
      
      For both of these events the status is collected up in the event
      handlers called by hci_event_packet() and then passed as the second
      parameter to req_complete_skb(). The req_complete_skb() callback in
      turn is hci_req_sync_complete() for __hci_cmd_sync_ev() which stores
      the status in hdev->req_result. The hdev->req_result is then further
      converted through bt_to_errno() back in __hci_cmd_sync_ev().
      Signed-off-by: default avatarFrederic Danis <frederic.danis@linux.intel.com>
      Signed-off-by: default avatarMarcel Holtmann <marcel@holtmann.org>
      43b79209
    • Frederic Danis's avatar
      Bluetooth: btintel: Fix calls to __hci_cmd_sync() · b1f5cf0c
      Frederic Danis authored
      Remove test of command reply status as it is already performed by
      __hci_cmd_sync().
      
      __hci_cmd_sync_ev() function already returns an error if it got a
      non-zero status either through a Command Complete or a Command
      Status event.
      
      For both of these events the status is collected up in the event
      handlers called by hci_event_packet() and then passed as the second
      parameter to req_complete_skb(). The req_complete_skb() callback in
      turn is hci_req_sync_complete() for __hci_cmd_sync_ev() which stores
      the status in hdev->req_result. The hdev->req_result is then further
      converted through bt_to_errno() back in __hci_cmd_sync_ev().
      Signed-off-by: default avatarFrederic Danis <frederic.danis@linux.intel.com>
      Signed-off-by: default avatarMarcel Holtmann <marcel@holtmann.org>
      b1f5cf0c
    • Frederic Danis's avatar
      Bluetooth: btusb: Fix calls to __hci_cmd_sync() · 5e13441c
      Frederic Danis authored
      Remove test of command reply status as it is already performed by
      __hci_cmd_sync().
      
      __hci_cmd_sync_ev() function already returns an error if it got a
      non-zero status either through a Command Complete or a Command
      Status event.
      
      For both of these events the status is collected up in the event
      handlers called by hci_event_packet() and then passed as the second
      parameter to req_complete_skb(). The req_complete_skb() callback in
      turn is hci_req_sync_complete() for __hci_cmd_sync_ev() which stores
      the status in hdev->req_result. The hdev->req_result is then further
      converted through bt_to_errno() back in __hci_cmd_sync_ev().
      Signed-off-by: default avatarFrederic Danis <frederic.danis@linux.intel.com>
      Signed-off-by: default avatarMarcel Holtmann <marcel@holtmann.org>
      5e13441c
    • Frederic Danis's avatar
      Bluetooth: Fix calls to __hci_cmd_sync() · cffd2eed
      Frederic Danis authored
      Remove test of command reply status as it is already performed by
      __hci_cmd_sync().
      
      __hci_cmd_sync_ev() function already returns an error if it got a
      non-zero status either through a Command Complete or a Command
      Status event.
      
      For both of these events the status is collected up in the event
      handlers called by hci_event_packet() and then passed as the second
      parameter to req_complete_skb(). The req_complete_skb() callback in
      turn is hci_req_sync_complete() for __hci_cmd_sync_ev() which stores
      the status in hdev->req_result. The hdev->req_result is then further
      converted through bt_to_errno() back in __hci_cmd_sync_ev().
      Signed-off-by: default avatarFrederic Danis <frederic.danis@linux.intel.com>
      Signed-off-by: default avatarMarcel Holtmann <marcel@holtmann.org>
      cffd2eed
  3. 14 May, 2015 1 commit
  4. 13 May, 2015 22 commits
    • Xinming Hu's avatar
      Bluetooth: btmrvl: fix compilation warning · a1e85f04
      Xinming Hu authored
      This patch fixes a compile warnning "dump_num maybe used uninitialized in
      this function".
      Signed-off-by: default avatarXinming Hu <huxm@marvell.com>
      Signed-off-by: default avatarAmitkumar Karwar <akarwar@marvell.com>
      Signed-off-by: default avatarMarcel Holtmann <marcel@holtmann.org>
      a1e85f04
    • Leo Yan's avatar
      Bluetooth: btwilink: remove DEBUG define · 4541c561
      Leo Yan authored
      Remove the DEBUG define as the debug code; so can remove mass debug info
      from log buffer when using dmesg.
      Signed-off-by: default avatarLeo Yan <leo.yan@linaro.org>
      Signed-off-by: default avatarMarcel Holtmann <marcel@holtmann.org>
      4541c561
    • Pablo Neira's avatar
      net: kill useless net_*_ingress_queue() definitions when NET_CLS_ACT is unset · f0b5e8a4
      Pablo Neira authored
      This fixes 4577139b ("net: use jump label patching for ingress qdisc in
      __netif_receive_skb_core").
      
      The only client of this is sch_ingress and it depends on NET_CLS_ACT. So
      there is no way these definition can be of any help.
      
      Cc: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      Acked-by: default avatarDaniel Borkmann <daniel@iogearbox.net>
      Acked-by: default avatarAlexei Starovoitov <ast@plumgrid.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      f0b5e8a4
    • David S. Miller's avatar
      Merge branch 'packet_rollover' · 9f0a74d7
      David S. Miller authored
      Willem de Bruijn says:
      
      ====================
      refine packet socket rollover:
      
      1. mitigate a case of lock contention
      2. avoid exporting resource exhaustion to other sockets,
         by migrating only to a victim socket that has ample room
      3. avoid reordering of most flows on the socket,
         by migrating first the flow responsible for load imbalance
      4. help processes detect load imbalance,
         by exporting rollover counters
      
      Context: rollover implements flow migration in packet socket fanout
      groups in case of extreme load imbalance. It is a specific
      implementation of migration that minimizes reordering by selecting
      the same victim socket when possible (and by selecting subsequent
      victims in a round robin fashion, from which its name derives).
      
      Changes:
        v2 -> v3:
          - statistics: replace unsigned long with __aligned_u64
        v1 -> v2:
          - huge flow detection: run lockless
          - huge flow detection: replace stored index with random
          - contention avoidance: test in packet_poll while lock held
          - contention avoidance: clear pressure sooner
      
                packet_poll and packet_recvmsg would clear only if the sock
                is empty to avoid taking the necessary lock. But,
                * packet_poll already holds this lock, so a lockless variant
                  __packet_rcv_has_room is cheap.
                * packet_recvmsg is usually called only for non-ring sockets,
                  which also runs lockless.
      
          - preparation: drop "single return" patch
      
                packet_rcv_has_room is now a locked wrapper around
                __packet_rcv_has_room, achieving the same (single footer).
      
      The benchmark mentioned in the patches is at
      https://github.com/wdebruij/kerneltools/blob/master/tests/bench_rollover.c
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      9f0a74d7
    • Willem de Bruijn's avatar
      packet: rollover statistics · a9b63918
      Willem de Bruijn authored
      Rollover indicates exceptional conditions. Export a counter to inform
      socket owners of this state.
      
      If no socket with sufficient room is found, rollover fails. Also count
      these events.
      
      Finally, also count when flows are rolled over early thanks to huge
      flow detection, to validate its correctness.
      
      Tested:
        Read counters in bench_rollover on all other tests in the patchset
      Signed-off-by: default avatarWillem de Bruijn <willemb@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      a9b63918
    • Willem de Bruijn's avatar
      packet: rollover huge flows before small flows · 3b3a5b0a
      Willem de Bruijn authored
      Migrate flows from a socket to another socket in the fanout group not
      only when the socket is full. Start migrating huge flows early, to
      divert possible 4-tuple attacks without affecting normal traffic.
      
      Introduce fanout_flow_is_huge(). This detects huge flows, which are
      defined as taking up more than half the load. It does so cheaply, by
      storing the rxhashes of the N most recent packets. If over half of
      these are the same rxhash as the current packet, then drop it. This
      only protects against 4-tuple attacks. N is chosen to fit all data in
      a single cache line.
      
      Tested:
        Ran bench_rollover for 10 sec with 1.5 Mpps of single flow input.
      
          lpbb5:/export/hda3/willemb# ./bench_rollover -l 1000 -r -s
          cpu         rx       rx.k     drop.k   rollover     r.huge   r.failed
            0         14         14          0          0          0          0
            1         20         20          0          0          0          0
            2         16         16          0          0          0          0
            3    6168824    6168824          0    4867721    4867721          0
            4    4867741    4867741          0          0          0          0
            5         12         12          0          0          0          0
            6         15         15          0          0          0          0
            7         17         17          0          0          0          0
      Signed-off-by: default avatarWillem de Bruijn <willemb@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      3b3a5b0a
    • Willem de Bruijn's avatar
      packet: rollover lock contention avoidance · 2ccdbaa6
      Willem de Bruijn authored
      Rollover has to call packet_rcv_has_room on sockets in the fanout
      group to find a socket to migrate to. This operation is expensive
      especially if the packet sockets use rings, when a lock has to be
      acquired.
      
      Avoid pounding on the lock by all sockets by temporarily marking a
      socket as "under memory pressure" when such pressure is detected.
      While set, only the socket owner may call packet_rcv_has_room on the
      socket. Once it detects normal conditions, it clears the flag. The
      socket is not used as a victim by any other socket in the meantime.
      
      Under reasonably balanced load, each socket writer frequently calls
      packet_rcv_has_room and clears its own pressure field. As a backup
      for when the socket is rarely written to, also clear the flag on
      reading (packet_recvmsg, packet_poll) if this can be done cheaply
      (i.e., without calling packet_rcv_has_room). This is only for
      edge cases.
      
      Tested:
        Ran bench_rollover: a process with 8 sockets in a single fanout
        group, each pinned to a single cpu that receives one nic recv
        interrupt. RPS and RFS are disabled. The benchmark uses packet
        rx_ring, which has to take a lock when determining whether a
        socket has room.
      
        Sent 3.5 Mpps of UDP traffic with sufficient entropy to spread
        uniformly across the packet sockets (and inserted an iptables
        rule to drop in PREROUTING to avoid protocol stack processing).
      
        Without this patch, all sockets try to migrate traffic to
        neighbors, causing lock contention when searching for a non-
        empty neighbor. The lock is the top 9 entries.
      
          perf record -a -g sleep 5
      
          -  17.82%   bench_rollover  [kernel.kallsyms]    [k] _raw_spin_lock
             - _raw_spin_lock
                - 99.00% spin_lock
          	 + 81.77% packet_rcv_has_room.isra.41
          	 + 18.23% tpacket_rcv
                + 0.84% packet_rcv_has_room.isra.41
          +   5.20%      ksoftirqd/6  [kernel.kallsyms]    [k] _raw_spin_lock
          +   5.15%      ksoftirqd/1  [kernel.kallsyms]    [k] _raw_spin_lock
          +   5.14%      ksoftirqd/2  [kernel.kallsyms]    [k] _raw_spin_lock
          +   5.12%      ksoftirqd/7  [kernel.kallsyms]    [k] _raw_spin_lock
          +   5.12%      ksoftirqd/5  [kernel.kallsyms]    [k] _raw_spin_lock
          +   5.10%      ksoftirqd/4  [kernel.kallsyms]    [k] _raw_spin_lock
          +   4.66%      ksoftirqd/0  [kernel.kallsyms]    [k] _raw_spin_lock
          +   4.45%      ksoftirqd/3  [kernel.kallsyms]    [k] _raw_spin_lock
          +   1.55%   bench_rollover  [kernel.kallsyms]    [k] packet_rcv_has_room.isra.41
      
        On net-next with this patch, this lock contention is no longer a
        top entry. Most time is spent in the actual read function. Next up
        are other locks:
      
          +  15.52%  bench_rollover  bench_rollover     [.] reader
          +   4.68%         swapper  [kernel.kallsyms]  [k] memcpy_erms
          +   2.77%         swapper  [kernel.kallsyms]  [k] packet_lookup_frame.isra.51
          +   2.56%     ksoftirqd/1  [kernel.kallsyms]  [k] memcpy_erms
          +   2.16%         swapper  [kernel.kallsyms]  [k] tpacket_rcv
          +   1.93%         swapper  [kernel.kallsyms]  [k] mlx4_en_process_rx_cq
      
        Looking closer at the remaining _raw_spin_lock, the cost of probing
        in rollover is now comparable to the cost of taking the lock later
        in tpacket_rcv.
      
          -   1.51%         swapper  [kernel.kallsyms]  [k] _raw_spin_lock
             - _raw_spin_lock
                + 33.41% packet_rcv_has_room
                + 28.15% tpacket_rcv
                + 19.54% enqueue_to_backlog
                + 6.45% __free_pages_ok
                + 2.78% packet_rcv_fanout
                + 2.13% fanout_demux_rollover
                + 2.01% netif_receive_skb_internal
      Signed-off-by: default avatarWillem de Bruijn <willemb@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      2ccdbaa6
    • Willem de Bruijn's avatar
      packet: rollover only to socket with headroom · 9954729b
      Willem de Bruijn authored
      Only migrate flows to sockets that have sufficient headroom, where
      sufficient is defined as having at least 25% empty space.
      
      The kernel has three different buffer types: a regular socket, a ring
      with frames (TPACKET_V[12]) or a ring with blocks (TPACKET_V3). The
      latter two do not expose a read pointer to the kernel, so headroom is
      not computed easily. All three needs a different implementation to
      estimate free space.
      
      Tested:
        Ran bench_rollover for 10 sec with 1.5 Mpps of single flow input.
      
        bench_rollover has as many sockets as there are NIC receive queues
        in the system. Each socket is owned by a process that is pinned to
        one of the receive cpus. RFS is disabled. RPS is enabled with an
        identity mapping (cpu x -> cpu x), to count drops with softnettop.
      
          lpbb5:/export/hda3/willemb# ./bench_rollover -r -l 1000 -s
          Press [Enter] to exit
      
          cpu         rx       rx.k     drop.k   rollover     r.huge   r.failed
            0         16         16          0          0          0          0
            1         21         21          0          0          0          0
            2    5227502    5227502          0          0          0          0
            3         18         18          0          0          0          0
            4    6083289    6083289          0    5227496          0          0
            5         22         22          0          0          0          0
            6         21         21          0          0          0          0
            7          9          9          0          0          0          0
      Signed-off-by: default avatarWillem de Bruijn <willemb@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      9954729b
    • Willem de Bruijn's avatar
      packet: rollover prepare: per-socket state · 0648ab70
      Willem de Bruijn authored
      Replace rollover state per fanout group with state per socket. Future
      patches will add fields to the new structure.
      Signed-off-by: default avatarWillem de Bruijn <willemb@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      0648ab70
    • Willem de Bruijn's avatar
      packet: rollover prepare: move code out of callsites · ad377cab
      Willem de Bruijn authored
      packet_rcv_fanout calls fanout_demux_rollover twice. Move all rollover
      logic into the callee to simplify these callsites, especially with
      upcoming changes.
      
      The main differences between the two callsites is that the FLAG
      variant tests whether the socket previously selected by another
      mode (RR, RND, HASH, ..) has room before migrating flows, whereas the
      rollover mode has no original socket to test.
      Signed-off-by: default avatarWillem de Bruijn <willemb@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      ad377cab
    • Eric Dumazet's avatar
      ipv4: __ip_local_out_sk() is static · 7d771aaa
      Eric Dumazet authored
      __ip_local_out_sk() is only used from net/ipv4/ip_output.c
      
      net/ipv4/ip_output.c:94:5: warning: symbol '__ip_local_out_sk' was not
      declared. Should it be static?
      
      Fixes: 7026b1dd ("netfilter: Pass socket pointer down through okfn().")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      7d771aaa
    • Eric Dumazet's avatar
      tcp/dccp: tw_timer_handler() is static · 216f8bb9
      Eric Dumazet authored
      tw_timer_handler() is only used from net/ipv4/inet_timewait_sock.c
      
      Fixes: 789f558c ("tcp/dccp: get rid of central timewait timer")
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      216f8bb9
    • David S. Miller's avatar
      Merge branch 'cls_flower' · dd58c635
      David S. Miller authored
      Jiri Pirko says:
      
      ====================
      introduce programable flow dissector and cls_flower
      
      Per Davem's request, I prepared this patchset which introduces programmable
      flow dissector. For current users of flow_keys, there is a wrapper
      skb_flow_dissect_flow_keys which maintains the previous behaviour.
      For purposes of cls_flower, couple of new dissection keys were introduced.
      
      Note that this dissector can be also eventually used by openvswitch code.
      
      Also, as a next step, I plan to get rid of *skb_flow_get_ports(export)
      and *__skb_get_poff as their functionality can be now implemented by
      skb_flow_dissect as well.
      
      v2->v3:
      - remove TCA_FLOWER_POLICE attr suggested by Jamal
      
      v1->v2:
      - move __skb_tx_hash rather to dev.c as suggested by Alex
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      dd58c635
    • Jiri Pirko's avatar
      tc: introduce Flower classifier · 77b9900e
      Jiri Pirko authored
      This patch introduces a flow-based filter. So far, the very essential
      packet fields are supported.
      
      This patch is only the first step. There is a lot of potential performance
      improvements possible to implement. Also a lot of features are missing
      now. They will be addressed in follow-up patches.
      Signed-off-by: default avatarJiri Pirko <jiri@resnulli.us>
      Acked-by: default avatarJamal Hadi Salim <jhs@mojatatu.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      77b9900e
    • Jiri Pirko's avatar
      59346afe
    • Jiri Pirko's avatar
      67a900cc
    • Jiri Pirko's avatar
      flow_dissector: introduce support for ipv6 addressses · b924933c
      Jiri Pirko authored
      So far, only hashes made out of ipv6 addresses could be dissected. This
      patch introduces support for dissection of full ipv6 addresses.
      Signed-off-by: default avatarJiri Pirko <jiri@resnulli.us>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      b924933c
    • Jiri Pirko's avatar
      c3f8eaeb
    • Jiri Pirko's avatar
    • Jiri Pirko's avatar
      flow_dissector: introduce programable flow_dissector · fbff949e
      Jiri Pirko authored
      Introduce dissector infrastructure which allows user to specify which
      parts of skb he wants to dissect.
      Signed-off-by: default avatarJiri Pirko <jiri@resnulli.us>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      fbff949e
    • Jiri Pirko's avatar
      flow_dissector: fix doc for skb_get_poff · 0db89b8b
      Jiri Pirko authored
      Signed-off-by: default avatarJiri Pirko <jiri@resnulli.us>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      0db89b8b
    • Jiri Pirko's avatar
      net: move netdev_pick_tx and dependencies to net/core/dev.c · 638b2a69
      Jiri Pirko authored
      next to its user. No relation to flow_dissector so it makes no sense to
      have it in flow_dissector.c
      Signed-off-by: default avatarJiri Pirko <jiri@resnulli.us>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      638b2a69