1. 03 Sep, 2014 2 commits
    • netfilter: nft_rbtree: no need for spinlock from set destroy path · d99407f4
      Pablo Neira Ayuso authored
      The sets are released from the rcu callback, after the rule is removed
      from the chain list, which implies that nfnetlink cannot update the
      rbtree and no packets are walking on the set anymore. Thus, we can get
      rid of the spinlock in the set destroy path there.
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Reviewed-by: Thomas Graf <tgraf@suug.ch>
    • netfilter: nft_hash: no need for rcu in the hash set destroy path · 39f39016
      Pablo Neira Ayuso authored
      The sets are released from the rcu callback, after the rule is removed
      from the chain list, which implies that nfnetlink cannot update the
      hashes (thus, no resizing may occur) and no packets are walking on the
      set anymore.
      
      This resolves a lockdep splat in the nft_hash_destroy() path since the
      nfnl mutex is not held there.
      
      ===============================
      [ INFO: suspicious RCU usage. ]
      3.16.0-rc2+ #168 Not tainted
      -------------------------------
      net/netfilter/nft_hash.c:362 suspicious rcu_dereference_protected() usage!
      
      other info that might help us debug this:
      
      rcu_scheduler_active = 1, debug_locks = 1
      1 lock held by ksoftirqd/0/3:
       #0:  (rcu_callback){......}, at: [<ffffffff81096393>] rcu_process_callbacks+0x27e/0x4c7
      
      stack backtrace:
      CPU: 0 PID: 3 Comm: ksoftirqd/0 Not tainted 3.16.0-rc2+ #168
      Hardware name: LENOVO 23259H1/23259H1, BIOS G2ET32WW (1.12 ) 05/30/2012
       0000000000000001 ffff88011769bb98 ffffffff8142c922 0000000000000006
       ffff880117694090 ffff88011769bbc8 ffffffff8107c3ff ffff8800cba52400
       ffff8800c476bea8 ffff8800c476bea8 ffff8800cba52400 ffff88011769bc08
      Call Trace:
       [<ffffffff8142c922>] dump_stack+0x4e/0x68
       [<ffffffff8107c3ff>] lockdep_rcu_suspicious+0xfa/0x103
       [<ffffffffa079931e>] nft_hash_destroy+0x50/0x137 [nft_hash]
       [<ffffffffa078cd57>] nft_set_destroy+0x11/0x2a [nf_tables]
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Acked-by: Thomas Graf <tgraf@suug.ch>
  2. 02 Sep, 2014 26 commits
  3. 01 Sep, 2014 6 commits
  4. 30 Aug, 2014 6 commits
    • net: sctp: fix ABI mismatch through sctp_assoc_to_state helper · 38ab1fa9
      Daniel Borkmann authored
      Since SCTP day 1, that is, 19b55a2af145 ("Initial commit") from lksctp
      tree, the official <netinet/sctp.h> header carries a copy of enum
      sctp_sstat_state that looks like (compared to the current in-kernel
      enumeration):
      
        User definition:                     Kernel definition:
      
        enum sctp_sstat_state {              typedef enum {
          SCTP_EMPTY             = 0,          <removed>
          SCTP_CLOSED            = 1,          SCTP_STATE_CLOSED            = 0,
          SCTP_COOKIE_WAIT       = 2,          SCTP_STATE_COOKIE_WAIT       = 1,
          SCTP_COOKIE_ECHOED     = 3,          SCTP_STATE_COOKIE_ECHOED     = 2,
          SCTP_ESTABLISHED       = 4,          SCTP_STATE_ESTABLISHED       = 3,
          SCTP_SHUTDOWN_PENDING  = 5,          SCTP_STATE_SHUTDOWN_PENDING  = 4,
          SCTP_SHUTDOWN_SENT     = 6,          SCTP_STATE_SHUTDOWN_SENT     = 5,
          SCTP_SHUTDOWN_RECEIVED = 7,          SCTP_STATE_SHUTDOWN_RECEIVED = 6,
          SCTP_SHUTDOWN_ACK_SENT = 8,          SCTP_STATE_SHUTDOWN_ACK_SENT = 7,
        };                                   } sctp_state_t;
      
      This header was later also placed into the uapi, so that user-space
      programs can compile without <netinet/sctp.h>, using the shipped
      <linux/sctp.h> instead.
      
      While RFC6458, under 8.2.1. Association Status (SCTP_STATUS), says that
      sstat_state can range from SCTP_CLOSED to SCTP_SHUTDOWN_ACK_SENT, we
      nevertheless have what appears to be a dummy SCTP_EMPTY state from
      the very early days.
      
      While it seems to do just nothing, commit 0b8f9e25 ("sctp: remove
      completely unsed EMPTY state") did the right thing and removed this dead
      code. That, however, causes an off-by-one when the user queries the SCTP
      stack via the SCTP_STATUS API and checks the current socket state, thus
      possibly yielding undefined behaviour in applications, as they expect
      the kernel to report the right state.
      
      The enumeration had to be changed, however, because it is used to index
      a function pointer lookup table based on the current socket state.
      Therefore, I think the best way to deal with this is just to add a helper
      function sctp_assoc_to_state() to encapsulate the off-by-one quirk.
      Reported-by: Tristan Su <sooqing@gmail.com>
      Fixes: 0b8f9e25 ("sctp: remove completely unsed EMPTY state")
      Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
      Acked-by: Vlad Yasevich <vyasevich@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
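The off-by-one described in the sctp commit above can be illustrated with a small user-space sketch. The two enumerations are copied from the table in the commit message. `struct assoc_sketch` is a hypothetical stand-in for the kernel's association object, and the helper body is an assumption about how such a translation could look, not the code from the patch itself.

```c
#include <assert.h>

/* User-space (uapi) numbering, as listed in the commit message. */
enum sctp_sstat_state {
	SCTP_EMPTY             = 0,
	SCTP_CLOSED            = 1,
	SCTP_COOKIE_WAIT       = 2,
	SCTP_COOKIE_ECHOED     = 3,
	SCTP_ESTABLISHED       = 4,
	SCTP_SHUTDOWN_PENDING  = 5,
	SCTP_SHUTDOWN_SENT     = 6,
	SCTP_SHUTDOWN_RECEIVED = 7,
	SCTP_SHUTDOWN_ACK_SENT = 8,
};

/* In-kernel numbering after the EMPTY state was removed. */
typedef enum {
	SCTP_STATE_CLOSED            = 0,
	SCTP_STATE_COOKIE_WAIT       = 1,
	SCTP_STATE_COOKIE_ECHOED     = 2,
	SCTP_STATE_ESTABLISHED       = 3,
	SCTP_STATE_SHUTDOWN_PENDING  = 4,
	SCTP_STATE_SHUTDOWN_SENT     = 5,
	SCTP_STATE_SHUTDOWN_RECEIVED = 6,
	SCTP_STATE_SHUTDOWN_ACK_SENT = 7,
} sctp_state_t;

/* Hypothetical stand-in for the kernel's association object. */
struct assoc_sketch {
	sctp_state_t state;
};

/* Translate the in-kernel state into the value user space expects. */
static int sctp_assoc_to_state(const struct assoc_sketch *asoc)
{
	/* The translation only holds while both enums stay exactly one apart. */
	assert((int)SCTP_STATE_SHUTDOWN_ACK_SENT + 1 == (int)SCTP_SHUTDOWN_ACK_SENT);
	return (int)asoc->state + 1;
}
```

Keeping the shift in one helper means the in-kernel enum can still index the lookup table from zero, while every value reported to user space goes through a single translation point.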
    • net: attempt a single high order allocation · d9b2938a
      Eric Dumazet authored
      In commit ed98df33 ("net: use __GFP_NORETRY for high order
      allocations") we tried to address one issue caused by order-3
      allocations.
      
      We still observe high latencies and system overhead in situations where
      compaction is not successful.
      
      Instead of trying order-3, order-2, and order-1, do a single order-3
      best effort and immediately fall back to plain order-0.
      
      This mimics the slub strategy of falling back to the slab min order if
      the high order allocation used for performance failed.
      
      Order-3 allocations give a performance boost only if they can be done
      without recurring and expensive memory scans.
      
      Quoting David:
      
      The page allocator relies on synchronous (sync light) memory compaction
      after direct reclaim for allocations that don't retry and deferred
      compaction doesn't work with this strategy because the allocation order
      is always decreasing from the previous failed attempt.
      
      This means sync light compaction will always be encountered if memory
      cannot be defragmented or reclaimed several times during the
      skb_page_frag_refill() iteration.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Acked-by: David Rientjes <rientjes@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
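The allocation strategy described in the commit above (one best-effort order-3 attempt, then straight to order-0) can be sketched in user space as follows. Everything here is illustrative: `try_alloc_fn`, `struct frag_sketch`, the stub allocators, and the 4096-byte page size are assumptions made for this sketch, not the kernel's page allocator API or the actual patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE  4096u /* assumed page size */
#define SKETCH_HIGH_ORDER 3u    /* order-3 = 8 contiguous pages */

/* Hypothetical allocator hook: true if 2^order contiguous pages are available. */
typedef bool (*try_alloc_fn)(unsigned int order);

struct frag_sketch {
	unsigned int order; /* order of the allocation that succeeded */
	size_t size;        /* usable bytes in the fragment */
};

static unsigned int attempts; /* counts allocator calls, for illustration */

/* Stub allocators that model different degrees of memory fragmentation. */
static bool alloc_any_order(unsigned int order) { (void)order; attempts++; return true; }
static bool alloc_order0_only(unsigned int order) { attempts++; return order == 0; }
static bool alloc_nothing(unsigned int order) { (void)order; attempts++; return false; }

/* One high-order attempt, then plain order-0; no order-2 or order-1 retries. */
static bool frag_refill_sketch(struct frag_sketch *frag, try_alloc_fn try_alloc)
{
	assert(frag != NULL && try_alloc != NULL);

	if (try_alloc(SKETCH_HIGH_ORDER)) {
		frag->order = SKETCH_HIGH_ORDER;
		frag->size = (size_t)SKETCH_PAGE_SIZE << SKETCH_HIGH_ORDER;
		return true;
	}
	if (try_alloc(0)) {
		frag->order = 0;
		frag->size = SKETCH_PAGE_SIZE;
		return true;
	}
	return false;
}
```

With a fragmented allocator such as `alloc_order0_only`, this makes two allocator calls instead of the four that a 3, 2, 1, 0 descent would need, which is the latency saving the commit message argues for.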
    • Merge branch 'mlx4-net' · bcc73547
      David S. Miller authored
      Or Gerlitz says:
      
      ====================
      Setup mlx4 user space Ethernet QPs to properly handle VXLAN
      
      This short series fixes the mlx4 driver setting of user space Ethernet QPs
      (e.g. those opened by DPDK applications) such that they will properly handle
      VXLAN traffic/offloads.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • mlx4: Set user-space raw Ethernet QPs to properly handle VXLAN traffic · d2fce8a9
      Or Gerlitz authored
      Raw Ethernet QPs opened from user-space lack the proper setup to
      receive/handle VXLAN traffic when VXLAN offloads are enabled.
      
      Fix that by adding a tunnel steering rule on top of the normal unicast
      steering rule and setting the tunnel_type field in the QP context.
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net/mlx4: Move the tunnel steering helper function to mlx4_core · b95089d0
      Or Gerlitz authored
      Move the function which we use to set VXLAN DMFS (flow-steering) rules
      from mlx4_en to mlx4_core. This refactoring will allow the mlx4_ib driver
      to call the helper for the use case of user-space RAW Ethernet QPs, such
      that they can serve VXLAN traffic too.
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • stmmac: fix dma api misuse · 362b37be
      Giuseppe CAVALLARO authored
      With DMA_API_DEBUG enabled, warnings are reported at runtime
      because the device driver frees DMA memory with the wrong functions
      and does not call dma_mapping_error after mapping DMA memory.
      
      The first problem is fixed by introducing a flag that helps us
      keep track of which mapping technique was used, so that we can use
      the right API for unmap.
      This approach was inspired by the e1000 driver, which uses a similar
      technique.
      Signed-off-by: Andre Draszik <andre.draszik@st.com>
      Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
      Reviewed-by: Denis Kirjanov <kda@linux-powerpc.org>
      Cc: Hans de Goede <hdegoede@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
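The bookkeeping idea from the stmmac commit above (remember how each buffer was mapped, check the mapping result, and unmap with the matching call) can be sketched in user space. All names below are hypothetical: the `sketch_*` functions are stubs standing in for the kernel DMA API, the use of 0 as an error address is an assumption of the stub only, and the three-valued enum is this sketch's choice rather than the flag the driver actually introduced.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* How a slot was mapped; recorded so the matching unmap can be chosen later. */
enum map_kind_sketch { MAP_NONE, MAP_SINGLE, MAP_PAGE };

struct tx_slot_sketch {
	unsigned long dma_addr;
	enum map_kind_sketch kind;
};

static int unmap_single_calls, unmap_page_calls; /* counters for illustration */

/* Stubs standing in for the DMA API; address 0 signals failure here only. */
static unsigned long sketch_map_single(bool fail) { return fail ? 0 : 0x1000; }
static unsigned long sketch_map_page(bool fail) { return fail ? 0 : 0x2000; }
static bool sketch_mapping_error(unsigned long addr) { return addr == 0; }
static void sketch_unmap_single(unsigned long addr) { (void)addr; unmap_single_calls++; }
static void sketch_unmap_page(unsigned long addr) { (void)addr; unmap_page_calls++; }

/* Map a buffer, check for a mapping error, and record the technique used. */
static bool slot_map(struct tx_slot_sketch *slot, bool is_page, bool fail)
{
	unsigned long addr;

	assert(slot != NULL);
	addr = is_page ? sketch_map_page(fail) : sketch_map_single(fail);
	if (sketch_mapping_error(addr))
		return false; /* slot is left untouched on failure */

	slot->dma_addr = addr;
	slot->kind = is_page ? MAP_PAGE : MAP_SINGLE;
	return true;
}

/* Unmap with the call that matches the recorded mapping technique. */
static void slot_unmap(struct tx_slot_sketch *slot)
{
	assert(slot != NULL);
	switch (slot->kind) {
	case MAP_PAGE:
		sketch_unmap_page(slot->dma_addr);
		break;
	case MAP_SINGLE:
		sketch_unmap_single(slot->dma_addr);
		break;
	case MAP_NONE:
		return; /* nothing was mapped */
	}
	slot->dma_addr = 0;
	slot->kind = MAP_NONE;
}
```

Recording the technique next to the address keeps the unmap path from having to guess, which is the kind of mismatch the DMA debugging warnings in the commit message point at.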