1. 14 Nov, 2011 1 commit
    • Eric Dumazet's avatar
      ipv6: reduce percpu needs for icmpv6msg mibs · 2a24444f
      Eric Dumazet authored
      Reading /proc/net/snmp6 on a machine with a lot of cpus is very
      expensive (can be ~88000 us).
      
      This is because ICMPV6MSG MIB uses 4096 bytes per cpu, and folding
      values for all possible cpus can read 16 Mbytes of memory (32MBytes on
      non x86 arches)
      
      ICMP messages are not considered as fast path on a typical server, and
      eventually few cpus handle them anyway. We can afford an atomic
      operation instead of using percpu data.
      
      This saves 4096 bytes per cpu and per network namespace.
      Signed-off-by: default avatarEric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      2a24444f
  2. 13 Nov, 2011 13 commits
  3. 12 Nov, 2011 4 commits
  4. 09 Nov, 2011 2 commits
    • Eric Dumazet's avatar
      ipv4: PKTINFO doesnt need dst reference · d826eb14
      Eric Dumazet authored
      Le lundi 07 novembre 2011 à 15:33 +0100, Eric Dumazet a écrit :
      
      > At least, in recent kernels we dont change dst->refcnt in forwarding
      > patch (usinf NOREF skb->dst)
      >
      > One particular point is the atomic_inc(dst->refcnt) we have to perform
      > when queuing an UDP packet if socket asked PKTINFO stuff (for example a
      > typical DNS server has to setup this option)
      >
      > I have one patch somewhere that stores the information in skb->cb[] and
      > avoid the atomic_{inc|dec}(dst->refcnt).
      >
      
      OK I found it, I did some extra tests and believe its ready.
      
      [PATCH net-next] ipv4: IP_PKTINFO doesnt need dst reference
      
      When a socket uses IP_PKTINFO notifications, we currently force a dst
      reference for each received skb. Reader has to access dst to get needed
      information (rt_iif & rt_spec_dst) and must release dst reference.
      
      We also forced a dst reference if skb was put in socket backlog, even
      without IP_PKTINFO handling. This happens under stress/load.
      
      We can instead store the needed information in skb->cb[], so that only
      softirq handler really access dst, improving cache hit ratios.
      
      This removes two atomic operations per packet, and false sharing as
      well.
      
      On a benchmark using a mono threaded receiver (doing only recvmsg()
      calls), I can reach 720.000 pps instead of 570.000 pps.
      
      IP_PKTINFO is typically used by DNS servers, and any multihomed aware
      UDP application.
      Signed-off-by: default avatarEric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      d826eb14
    • Eric Dumazet's avatar
      ipv4: reduce percpu needs for icmpmsg mibs · acb32ba3
      Eric Dumazet authored
      Reading /proc/net/snmp on a machine with a lot of cpus is very expensive
      (can be ~88000 us).
      
      This is because ICMPMSG MIB uses 4096 bytes per cpu, and folding values
      for all possible cpus can read 16 Mbytes of memory.
      
      ICMP messages are not considered as fast path on a typical server, and
      eventually few cpus handle them anyway. We can afford an atomic
      operation instead of using percpu data.
      
      This saves 4096 bytes per cpu and per network namespace.
      Signed-off-by: default avatarEric Dumazet <eric.dumazet@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      acb32ba3
  5. 08 Nov, 2011 19 commits
  6. 07 Nov, 2011 1 commit
    • Al Viro's avatar
      VFS: we need to set LOOKUP_JUMPED on mountpoint crossing · a3fbbde7
      Al Viro authored
      Mountpoint crossing is similar to following procfs symlinks - we do
      not get ->d_revalidate() called for dentry we have arrived at, with
      unpleasant consequences for NFS4.
      
      Simple way to reproduce the problem in mainline:
      
          cat >/tmp/a.c <<'EOF'
          #include <unistd.h>
          #include <fcntl.h>
          #include <stdio.h>
          main()
          {
                  struct flock fl = {.l_type = F_RDLCK, .l_whence = SEEK_SET, .l_len = 1};
                  if (fcntl(0, F_SETLK, &fl))
                          perror("setlk");
          }
          EOF
          cc /tmp/a.c -o /tmp/test
      
      then on nfs4:
      
          mount --bind file1 file2
          /tmp/test < file1		# ok
          /tmp/test < file2		# spews "setlk: No locks available"...
      
      What happens is the missing call of ->d_revalidate() after mountpoint
      crossing and that's where NFS4 would issue OPEN request to server.
      
      The fix is simple - treat mountpoint crossing the same way we deal with
      following procfs-style symlinks.  I.e.  set LOOKUP_JUMPED...
      
      Cc: stable@kernel.org
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      a3fbbde7