1. 03 Sep, 2012 7 commits
    • Jan Beulich's avatar
      netfilter: properly annotate ipv4_netfilter_{init,fini}() · ce9f3f31
      Jan Beulich authored
      Despite being just a few bytes of code, they should still have proper
      annotations.
      Signed-off-by: default avatarJan Beulich <jbeulich@suse.com>
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      ce9f3f31
    • Michael Wang's avatar
      netfilter: pass 'nf_hook_ops' instead of 'list_head' to nf_queue() · 1c15b677
      Michael Wang authored
      Since 'list_for_each_continue_rcu' has already been replaced by
      'list_for_each_entry_continue_rcu', pass 'list_head' to nf_queue() as a
      parameter can not benefit us any more.
      
      This patch will replace 'list_head' with 'nf_hook_ops' as the parameter of
      nf_queue() and __nf_queue() to save code.
      Signed-off-by: default avatarMichael Wang <wangyun@linux.vnet.ibm.com>
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      1c15b677
    • Michael Wang's avatar
      netfilter: pass 'nf_hook_ops' instead of 'list_head' to nf_iterate() · 2a6decfd
      Michael Wang authored
      Since 'list_for_each_continue_rcu' has already been replaced by
      'list_for_each_entry_continue_rcu', pass 'list_head' to nf_iterate() as a
      parameter can not benefit us any more.
      
      This patch will replace 'list_head' with 'nf_hook_ops' as the parameter of
      nf_iterate() to save code.
      Signed-off-by: default avatarMichael Wang <wangyun@linux.vnet.ibm.com>
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      2a6decfd
    • Cong Wang's avatar
      netfilter: remove xt_NOTRACK · 96550501
      Cong Wang authored
      It was scheduled to be removed for a long time.
      
      Cc: Pablo Neira Ayuso <pablo@netfilter.org>
      Cc: Patrick McHardy <kaber@trash.net>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: netfilter@vger.kernel.org
      Signed-off-by: default avatarCong Wang <xiyou.wangcong@gmail.com>
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      96550501
    • Pablo Neira Ayuso's avatar
      netfilter: nf_conntrack: add nf_ct_timeout_lookup · 84b5ee93
      Pablo Neira Ayuso authored
      This patch adds the new nf_ct_timeout_lookup function to encapsulate
      the timeout policy attachment that is called in the nf_conntrack_in
      path.
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      84b5ee93
    • Pablo Neira Ayuso's avatar
      netfilter: xt_CT: refactorize xt_ct_tg_check · 236df005
      Pablo Neira Ayuso authored
      This patch adds xt_ct_set_helper and xt_ct_set_timeout to reduce
      the size of xt_ct_tg_check.
      
      This aims to improve code mantainability by splitting xt_ct_tg_check
      in smaller chunks.
      
      Suggested by Eric Dumazet.
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      236df005
    • Pablo Neira Ayuso's avatar
      netfilter: xt_socket: fix compilation warnings with gcc 4.7 · 6703aa74
      Pablo Neira Ayuso authored
      This patch fixes compilation warnings in xt_socket with gcc-4.7.
      
      In file included from net/netfilter/xt_socket.c:22:0:
      net/netfilter/xt_socket.c: In function ‘socket_mt6_v1’:
      include/net/netfilter/nf_tproxy_core.h:175:23: warning: ‘sport’ may be used uninitialized in this function [-Wmaybe-uninitialized]
      net/netfilter/xt_socket.c:265:16: note: ‘sport’ was declared here
      In file included from net/netfilter/xt_socket.c:22:0:
      include/net/netfilter/nf_tproxy_core.h:175:23: warning: ‘dport’ may be used uninitialized in this function [-Wmaybe-uninitialized]
      net/netfilter/xt_socket.c:265:9: note: ‘dport’ was declared here
      In file included from net/netfilter/xt_socket.c:22:0:
      include/net/netfilter/nf_tproxy_core.h:175:6: warning: ‘saddr’ may be used uninitialized in this function [-Wmaybe-uninitialized]
      net/netfilter/xt_socket.c:264:27: note: ‘saddr’ was declared here
      In file included from net/netfilter/xt_socket.c:22:0:
      include/net/netfilter/nf_tproxy_core.h:175:6: warning: ‘daddr’ may be used uninitialized in this function [-Wmaybe-uninitialized]
      net/netfilter/xt_socket.c:264:19: note: ‘daddr’ was declared here
      In file included from net/netfilter/xt_socket.c:22:0:
      net/netfilter/xt_socket.c: In function ‘socket_match.isra.4’:
      include/net/netfilter/nf_tproxy_core.h:75:2: warning: ‘protocol’ may be used uninitialized in this function [-Wmaybe-uninitialized]
      net/netfilter/xt_socket.c:113:5: note: ‘protocol’ was declared here
      In file included from include/net/tcp.h:37:0,
                       from net/netfilter/xt_socket.c:17:
      include/net/inet_hashtables.h:356:45: warning: ‘sport’ may be used uninitialized in this function [-Wmaybe-uninitialized]
      net/netfilter/xt_socket.c:112:16: note: ‘sport’ was declared here
      In file included from net/netfilter/xt_socket.c:22:0:
      include/net/netfilter/nf_tproxy_core.h:106:23: warning: ‘dport’ may be used uninitialized in this function [-Wmaybe-uninitialized]
      net/netfilter/xt_socket.c:112:9: note: ‘dport’ was declared here
      In file included from include/net/tcp.h:37:0,
                       from net/netfilter/xt_socket.c:17:
      include/net/inet_hashtables.h:356:15: warning: ‘saddr’ may be used uninitialized in this function [-Wmaybe-uninitialized]
      net/netfilter/xt_socket.c:111:16: note: ‘saddr’ was declared here
      In file included from include/net/tcp.h:37:0,
                       from net/netfilter/xt_socket.c:17:
      include/net/inet_hashtables.h:356:15: warning: ‘daddr’ may be used uninitialized in this function [-Wmaybe-uninitialized]
      net/netfilter/xt_socket.c:111:9: note: ‘daddr’ was declared here
      In file included from net/netfilter/xt_socket.c:22:0:
      net/netfilter/xt_socket.c: In function ‘socket_mt6_v1’:
      include/net/netfilter/nf_tproxy_core.h:175:23: warning: ‘sport’ may be used uninitialized in this function [-Wmaybe-uninitialized]
      net/netfilter/xt_socket.c:268:16: note: ‘sport’ was declared here
      In file included from net/netfilter/xt_socket.c:22:0:
      include/net/netfilter/nf_tproxy_core.h:175:23: warning: ‘dport’ may be used uninitialized in this function [-Wmaybe-uninitialized]
      net/netfilter/xt_socket.c:268:9: note: ‘dport’ was declared here
      In file included from net/netfilter/xt_socket.c:22:0:
      include/net/netfilter/nf_tproxy_core.h:175:6: warning: ‘saddr’ may be used uninitialized in this function [-Wmaybe-uninitialized]
      net/netfilter/xt_socket.c:267:27: note: ‘saddr’ was declared here
      In file included from net/netfilter/xt_socket.c:22:0:
      include/net/netfilter/nf_tproxy_core.h:175:6: warning: ‘daddr’ may be used uninitialized in this function [-Wmaybe-uninitialized]
      net/netfilter/xt_socket.c:267:19: note: ‘daddr’ was declared here
      Signed-off-by: default avatarPablo Neira Ayuso <pablo@netfilter.org>
      6703aa74
  2. 30 Aug, 2012 18 commits
  3. 26 Aug, 2012 1 commit
  4. 23 Aug, 2012 9 commits
    • Pavel Emelyanov's avatar
      packet: Protect packet sk list with mutex (v2) · 0fa7fa98
      Pavel Emelyanov authored
      Change since v1:
      
      * Fixed inuse counters access spotted by Eric
      
      In patch eea68e2f (packet: Report socket mclist info via diag module) I've
      introduced a "scheduling in atomic" problem in packet diag module -- the
      socket list is traversed under rcu_read_lock() while performed under it sk
      mclist access requires rtnl lock (i.e. -- mutex) to be taken.
      
      [152363.820563] BUG: scheduling while atomic: crtools/12517/0x10000002
      [152363.820573] 4 locks held by crtools/12517:
      [152363.820581]  #0:  (sock_diag_mutex){+.+.+.}, at: [<ffffffff81a2dcb5>] sock_diag_rcv+0x1f/0x3e
      [152363.820613]  #1:  (sock_diag_table_mutex){+.+.+.}, at: [<ffffffff81a2de70>] sock_diag_rcv_msg+0xdb/0x11a
      [152363.820644]  #2:  (nlk->cb_mutex){+.+.+.}, at: [<ffffffff81a67d01>] netlink_dump+0x23/0x1ab
      [152363.820693]  #3:  (rcu_read_lock){.+.+..}, at: [<ffffffff81b6a049>] packet_diag_dump+0x0/0x1af
      
      Similar thing was then re-introduced by further packet diag patches (fanount
      mutex and pgvec mutex for rings) :(
      
      Apart from being terribly sorry for the above, I propose to change the packet
      sk list protection from spinlock to mutex. This lock currently protects two
      modifications:
      
      * sklist
      * prot inuse counters
      
      The sklist modifications can be just reprotected with mutex since they already
      occur in a sleeping context. The inuse counters modifications are trickier -- the
      __this_cpu_-s are used inside, thus requiring the caller to handle the potential
      issues with contexts himself. Since packet sockets' counters are modified in two
      places only (packet_create and packet_release) we only need to protect the context
      from being preempted. BH disabling is not required in this case.
      Signed-off-by: default avatarPavel Emelyanov <xemul@parallels.com>
      Acked-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      0fa7fa98
    • Allan, Bruce W's avatar
      mdio: translation of MMD EEE registers to/from ethtool settings · b32607dd
      Allan, Bruce W authored
      The helper functions which translate IEEE MDIO Manageable Device (MMD)
      Energy-Efficient Ethernet (EEE) registers 3.20, 7.60 and 7.61 to and from
      the comparable ethtool supported/advertised settings will be needed by
      drivers other than those in PHYLIB (e.g. e1000e in a follow-on patch).
      
      In the same fashion as similar translation functions in linux/mii.h, move
      these functions from the PHYLIB core to the linux/mdio.h header file so the
      code will not have to be duplicated in each driver needing MMD-to-ethtool
      (and vice-versa) translations.  The function and some variable names have
      been renamed to be more descriptive.
      
      Not tested on the only hardware that currently calls the related functions,
      stmmac, because I don't have access to any.  Has been compile tested and
      the translations have been tested on a locally modified version of e1000e.
      Signed-off-by: default avatarBruce Allan <bruce.w.allan@intel.com>
      Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      b32607dd
    • danborkmann@iogearbox.net's avatar
      af_packet: use define instead of constant · 9e67030a
      danborkmann@iogearbox.net authored
      Instead of using a hard-coded value for the status variable, it would make
      the code more readable to use its destined define from linux/if_packet.h.
      
      Signed-off-by: daniel.borkmann@tik.ee.ethz.ch
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      9e67030a
    • Ying Xue's avatar
      rds: Don't disable BH on BH context · bfdc587c
      Ying Xue authored
      Since we have already in BH context when *_write_space(),
      *_data_ready() as well as *_state_change() are called, it's
      unnecessary to disable BH.
      Signed-off-by: default avatarYing Xue <ying.xue@windriver.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      bfdc587c
    • John Eaglesham's avatar
      bonding: support for IPv6 transmit hashing · 6b923cb7
      John Eaglesham authored
      Currently the "bonding" driver does not support load balancing outgoing
      traffic in LACP mode for IPv6 traffic. IPv4 (and TCP or UDP over IPv4)
      are currently supported; this patch adds transmit hashing for IPv6 (and
      TCP or UDP over IPv6), bringing IPv6 up to par with IPv4 support in the
      bonding driver. In addition, bounds checking has been added to all
      transmit hashing functions.
      
      The algorithm chosen (xor'ing the bottom three quads of the source and
      destination addresses together, then xor'ing each byte of that result into
      the bottom byte, finally xor'ing with the last bytes of the MAC addresses)
      was selected after testing almost 400,000 unique IPv6 addresses harvested
      from server logs. This algorithm had the most even distribution for both
      big- and little-endian architectures while still using few instructions. Its
      behavior also attempts to closely match that of the IPv4 algorithm.
      
      The IPv6 flow label was intentionally not included in the hash as it appears
      to be unset in the vast majority of IPv6 traffic sampled, and the current
      algorithm not using the flow label already offers a very even distribution.
      
      Fragmented IPv6 packets are handled the same way as fragmented IPv4 packets,
      ie, they are not balanced based on layer 4 information. Additionally,
      IPv6 packets with intermediate headers are not balanced based on layer
      4 information. In practice these intermediate headers are not common and
      this should not cause any problems, and the alternative (a packet-parsing
      loop and look-up table) seemed slow and complicated for little gain.
      Tested-by: default avatarJohn Eaglesham <linux@8192.net>
      Signed-off-by: default avatarJohn Eaglesham <linux@8192.net>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      6b923cb7
    • Eric Dumazet's avatar
      ipv6: gre: fix ip6gre_err() · b87fb39e
      Eric Dumazet authored
      ip6gre_err() miscomputes grehlen (sizeof(ipv6h) is 4 or 8,
      not 40 as expected), and should take into account 'offset' parameter.
      
      Also uses pskb_may_pull() to cope with some fragged skbs
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Cc: Dmitry Kozlov <xeb@mail.ru>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      b87fb39e
    • Eric Dumazet's avatar
      xfrm: fix RCU bugs · ef8531b6
      Eric Dumazet authored
      This patch reverts commit 56892261 (xfrm: Use rcu_dereference_bh to
      deference pointer protected by rcu_read_lock_bh), and fixes bugs
      introduced in commit 418a99ac ( Replace rwlock on xfrm_policy_afinfo
      with rcu )
      
      1) We properly use RCU variant in this file, not a mix of RCU/RCU_BH
      
      2) We must defer some writes after the synchronize_rcu() call or a reader
       can crash dereferencing NULL pointer.
      
      3) Now we use the xfrm_policy_afinfo_lock spinlock only from process
      context, we no longer need to block BH in xfrm_policy_register_afinfo()
      and xfrm_policy_unregister_afinfo()
      
      4) Can use RCU_INIT_POINTER() instead of rcu_assign_pointer() in
      xfrm_policy_unregister_afinfo()
      
      5) Remove a forward inline declaration (xfrm_policy_put_afinfo()),
        and also move xfrm_policy_get_afinfo() declaration.
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Cc: Fan Du <fan.du@windriver.com>
      Cc: Priyanka Jain <Priyanka.Jain@freescale.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      ef8531b6
    • Eric Dumazet's avatar
      net: remove delay at device dismantle · 0115e8e3
      Eric Dumazet authored
      I noticed extra one second delay in device dismantle, tracked down to
      a call to dst_dev_event() while some call_rcu() are still in RCU queues.
      
      These call_rcu() were posted by rt_free(struct rtable *rt) calls.
      
      We then wait a little (but one second) in netdev_wait_allrefs() before
      kicking again NETDEV_UNREGISTER.
      
      As the call_rcu() are now completed, dst_dev_event() can do the needed
      device swap on busy dst.
      
      To solve this problem, add a new NETDEV_UNREGISTER_FINAL, called
      after a rcu_barrier(), but outside of RTNL lock.
      
      Use NETDEV_UNREGISTER_FINAL with care !
      
      Change dst_dev_event() handler to react to NETDEV_UNREGISTER_FINAL
      
      Also remove NETDEV_UNREGISTER_BATCH, as its not used anymore after
      IP cache removal.
      
      With help from Gao feng
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Cc: Tom Herbert <therbert@google.com>
      Cc: Mahesh Bandewar <maheshb@google.com>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Gao feng <gaofeng@cn.fujitsu.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      0115e8e3
    • David S. Miller's avatar
      Merge git://1984.lsi.us.es/nf-next · bf277b0c
      David S. Miller authored
      Pablo Neira Ayuso says:
      
      ====================
      This is the first batch of Netfilter and IPVS updates for your
      net-next tree. Mostly cleanups for the Netfilter side. They are:
      
      * Remove unnecessary RTNL locking now that we have support
        for namespace in nf_conntrack, from Patrick McHardy.
      
      * Cleanup to eliminate unnecessary goto in the initialization
        path of several Netfilter tables, from Jean Sacren.
      
      * Another cleanup from Wu Fengguang, this time to PTR_RET instead
        of if IS_ERR then return PTR_ERR.
      
      * Use list_for_each_entry_continue_rcu in nf_iterate, from
        Michael Wang.
      
      * Add pmtu_disc sysctl option to disable PMTU in their tunneling
        transmitter, from Julian Anastasov.
      
      * Generalize application protocol registration in IPVS and modify
        IPVS FTP helper to use it, from Julian Anastasov.
      
      * update Kconfig. The IPVS FTP helper depends on the Netfilter FTP
        helper for NAT support, from Julian Anastasov.
      
      * Add logic to update PMTU for IPIP packets in IPVS, again
        from Julian Anastasov.
      
      * A couple of sparse warning fixes for IPVS and Netfilter from
        Claudiu Ghioc and Patrick McHardy respectively.
      
      Patrick's IPv6 NAT changes will follow after this batch, I need
      to flush this batch first before refreshing my tree.
      ====================
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      bf277b0c
  5. 22 Aug, 2012 5 commits