1. 06 Aug, 2012 10 commits
  2. 31 Jul, 2012 4 commits
    • ipv4: Properly purge netdev references on uncached routes. · caacf05e
      David S. Miller authored
      When a device is unregistered, we have to purge all of the
      references to it that may exist in the entire system.
      
      If a route is uncached, we currently have no way of accomplishing
      this.
      
      So create a global list that is scanned when a network device goes
      down.  This mirrors the logic in net/core/dst.c's dst_ifdown().
      Signed-off-by: David S. Miller <davem@davemloft.net>
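The idea in the commit above can be sketched in userspace C. This is only a minimal model, not the kernel code: `struct device`, `struct route`, `route_add_uncached()` and `routes_flush_dev()` are hypothetical names, the real code needs locking around the list, and repointing references at a fallback device is an assumption modelled on what dst_ifdown() does with the loopback device.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's net_device and rtable. */
struct device {
    const char *name;
    int refcnt;              /* number of routes holding this device */
};

struct route {
    struct device *dev;      /* device this route holds a reference on */
    struct route *next;      /* link in the global uncached list */
};

/* Global list of routes that no cache points to (locking omitted). */
static struct route *uncached_list;

/* Take a device reference and link the route into the global list. */
void route_add_uncached(struct route *rt, struct device *dev)
{
    assert(rt != NULL && dev != NULL);
    dev->refcnt++;
    rt->dev = dev;
    rt->next = uncached_list;
    uncached_list = rt;
}

/*
 * When `dev` goes away, scan the list and move every reference on it
 * to `fallback`. Returns the number of routes that were repointed.
 */
int routes_flush_dev(struct device *dev, struct device *fallback)
{
    int moved = 0;

    for (struct route *rt = uncached_list; rt != NULL; rt = rt->next) {
        if (rt->dev != dev)
            continue;
        rt->dev = fallback;
        fallback->refcnt++;
        dev->refcnt--;
        moved++;
    }
    return moved;
}
```

After `routes_flush_dev()` returns, no route on the list refers to the departing device, so its reference count can drop to zero and the unregister can complete.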
    • David S. Miller's avatar
      c5038a83
    • ipv4: percpu nh_rth_output cache · d26b3a7c
      Eric Dumazet authored
      The input path is mostly run under RCU and doesn't touch the dst refcnt.
      
      But the output path on forwarding or UDP workloads hits the dst
      refcount badly, and we have a lot of false sharing, for example
      in ipv4_mtu() when reading rt->rt_pmtu.
      
      Using a percpu cache for nh_rth_output gives a nice performance
      increase at a small cost.
      
      udpflood test on my 24-CPU machine (dummy0 output device)
      (24 processes are started, each sending 1,000,000 UDP frames)
      
      before : 5.24 s
      after : 2.06 s
      For reference, time on linux-3.5 : 6.60 s
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Tested-by: Alexander Duyck <alexander.h.duyck@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
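A rough userspace illustration of the per-CPU cache idea follows. It is a sketch under stated assumptions, not the kernel implementation: the kernel allocates real per-CPU storage, whereas this model pads each slot to an assumed 64-byte cache line, and `struct out_route`, `struct nexthop_model`, `nh_cache_get()` and `nh_cache_set()` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS    24   /* illustrative: matches the 24-CPU test machine */
#define CACHE_LINE 64   /* assumed cache line size in bytes */

struct out_route {
    int id;
};

/*
 * One slot per CPU. Aligning the member to a cache line makes each
 * slot occupy its own line, so a store by one CPU does not invalidate
 * the line another CPU is reading (no false sharing between slots).
 */
struct percpu_slot {
    _Alignas(CACHE_LINE) struct out_route *rt;
};

struct nexthop_model {
    struct percpu_slot output_cache[NR_CPUS];
};

/* Return the output route cached for `cpu`, or NULL if none. */
struct out_route *nh_cache_get(const struct nexthop_model *nh, int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    return nh->output_cache[cpu].rt;
}

/* Cache `rt` in the slot that belongs to `cpu` only. */
void nh_cache_set(struct nexthop_model *nh, int cpu, struct out_route *rt)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    nh->output_cache[cpu].rt = rt;
}
```

The trade-off is memory: every next hop carries one slot per CPU instead of a single shared pointer, which is the "small cost" the commit message mentions.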
    • ipv4: Restore old dst_free() behavior. · 54764bb6
      Eric Dumazet authored
      Commit 404e0a8b (net: ipv4: fix RCU races on dst refcounts) tried
      to solve a race but added a problem at device/fib dismantle time:
      
      We really want to call dst_free() as soon as possible, even if sockets
      still have dst in their cache.
      dst_release() calls in free_fib_info_rcu() are not welcome.
      
      The root of the problem was that, now that we also cache output routes
      (in nh_rth_output), we must use call_rcu() instead of call_rcu_bh() in
      rt_free(), because output route lookups are done in process context.
      
      Based on feedback and an initial patch from David Miller (adding another
      call_rcu_bh() call in fib, but it appears it was not the right fix).
      
      I left the inet_sk_rx_dst_set() helper and added __rcu attributes
      to nh_rth_output and nh_rth_input to better document what is going on in
      this code.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  3. 30 Jul, 2012 18 commits
  4. 28 Jul, 2012 2 commits
    • Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · f7da9cdf
      Linus Torvalds authored
      Pull networking fixes from David Miller:
       "Several bug fixes, some to new features appearing in this merge
        window, some that have been around for a while.
      
        I have a short list of known problems that need to be sorted out, but
        all of them can be solved easily during the run up to 3.6-final.
      
        I'll be offline until Sunday afternoon, but nothing need hold up
        3.6-rc1 and the close of the merge window, networking wise, at this
        point.
      
        1) Fix interface check in ipv4 TCP early demux, from Eric Dumazet.
      
        2) Fix a long standing bug in TCP DMA to userspace offload that can
           hang applications using MSG_TRUNC, from Jiri Kosina.
      
        3) Don't allow TCP_USER_TIMEOUT to be negative, from Hangbin Liu.
      
        4) Don't use GFP_KERNEL under spinlock in kaweth driver, from Dan
           Carpenter"
      
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
        tcp: perform DMA to userspace only if there is a task waiting for it
        Revert "openvswitch: potential NULL deref in sample()"
        ipv4: fix TCP early demux
        net: fix rtnetlink IFF_PROMISC and IFF_ALLMULTI handling
        USB: kaweth.c: use GFP_ATOMIC under spin_lock
        tcp: Add TCP_USER_TIMEOUT negative value check
        bcma: add missing iounmap on error path
        bcma: fix regression in interrupt assignment on mips
        mac80211_hwsim: fix possible race condition in usage of info->control.sta & control.vif
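Item 3 of the pull request above (rejecting a negative TCP_USER_TIMEOUT) amounts to a range check before the value is stored. The following is a minimal model of such a check, assuming the option arrives as a signed int; `set_user_timeout()` is a hypothetical helper, not the kernel function.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * Validate a user-supplied timeout in milliseconds before storing it.
 * A negative value is rejected with -EINVAL and the stored timeout is
 * left untouched; zero and positive values are accepted.
 */
int set_user_timeout(unsigned int *timeout_ms, int val)
{
    assert(timeout_ms != NULL);
    if (val < 0)
        return -EINVAL;
    *timeout_ms = (unsigned int)val;
    return 0;
}
```

Without the check, a negative int would be converted to a very large unsigned timeout rather than being reported as an error.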
    • Merge tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 · 173f8654
      Linus Torvalds authored
      Pull ext4 updates from Ted Ts'o:
       "The usual collection of bug fixes and optimizations.  Perhaps of
        greatest note is a speed up for parallel, non-allocating DIO writes,
        since we no longer take the i_mutex lock in that case.
      
        For bug fixes, we fix an incorrect overhead calculation which caused
        slightly incorrect results for df(1) and statfs(2).  We also fixed
        bugs in the metadata checksum feature."
      
      * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (23 commits)
        ext4: undo ext4_calc_metadata_amount if we fail to claim space
        ext4: don't let i_reserved_meta_blocks go negative
        ext4: fix hole punch failure when depth is greater than 0
        ext4: remove unnecessary argument from __ext4_handle_dirty_metadata()
        ext4: weed out ext4_write_super
        ext4: remove unnecessary superblock dirtying
        ext4: convert last user of ext4_mark_super_dirty() to ext4_handle_dirty_super()
        ext4: remove useless marking of superblock dirty
        ext4: fix ext4 mismerge back in January
        ext4: remove dynamic array size in ext4_chksum()
        ext4: remove unused variable in ext4_update_super()
        ext4: make quota as first class supported feature
        ext4: don't take the i_mutex lock when doing DIO overwrites
        ext4: add a new nolock flag in ext4_map_blocks
        ext4: split ext4_file_write into buffered IO and direct IO
        ext4: remove an unused statement in ext4_mb_get_buddy_page_lock()
        ext4: fix out-of-date comments in extents.c
        ext4: use s_csum_seed instead of i_csum_seed for xattr block
        ext4: use proper csum calculation in ext4_rename
        ext4: fix overhead calculation used by ext4_statfs()
        ...
  5. 27 Jul, 2012 6 commits
    • Merge branch 'for-linus' of git://git.linaro.org/people/rmk/linux-arm · cea8f46c
      Linus Torvalds authored
      Pull ARM updates from Russell King:
       "First ARM push of this merge window, post me coming back from holiday.
        This is what has been in linux-next for the last few weeks.  Not much
        to say which isn't described by the commit summaries."
      
      * 'for-linus' of git://git.linaro.org/people/rmk/linux-arm: (32 commits)
        ARM: 7463/1: topology: Update cpu_power according to DT information
        ARM: 7462/1: topology: factorize the update of sibling masks
        ARM: 7461/1: topology: Add arch_scale_freq_power function
        ARM: 7456/1: ptrace: provide separate functions for tracing syscall {entry,exit}
        ARM: 7455/1: audit: move syscall auditing until after ptrace SIGTRAP handling
        ARM: 7454/1: entry: don't bother with syscall tracing on ret_from_fork path
        ARM: 7453/1: audit: only allow syscall auditing for pure EABI userspace
        ARM: 7452/1: delay: allow timer-based delay implementation to be selected
        ARM: 7451/1: arch timer: implement read_current_timer and get_cycles
        ARM: 7450/1: dcache: select DCACHE_WORD_ACCESS for little-endian ARMv6+ CPUs
        ARM: 7449/1: use generic strnlen_user and strncpy_from_user functions
        ARM: 7448/1: perf: remove arm_perf_pmu_ids global enumeration
        ARM: 7447/1: rwlocks: remove unused branch labels from trylock routines
        ARM: 7446/1: spinlock: use ticket algorithm for ARMv6+ locking implementation
        ARM: 7445/1: mm: update CONTEXTIDR register to contain PID of current process
        ARM: 7444/1: kernel: add arch-timer C3STOP feature
        ARM: 7460/1: remove asm/locks.h
        ARM: 7439/1: head.S: simplify initial page table mapping
        ARM: 7437/1: zImage: Allow DTB command line concatenation with ATAG_CMDLINE
        ARM: 7436/1: Do not map the vectors page as write-through on UP systems
        ...
    • Russell King's avatar
    • Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless · 7b9b04fb
      David S. Miller authored
      John W. Linville says:
      
      ====================
      These fixes are intended for the 3.6 stream.
      
      Hauke Mehrtens provides a pair of bcma fixes, one to fix a build
      regression on mips and another to correct a pair of missing iounmap
      calls.
      
      Thomas Huehn offers a mac80211_hwsim fix to avoid a possible
      use-after-free bug.
      ====================
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • tcp: perform DMA to userspace only if there is a task waiting for it · 59ea33a6
      Jiri Kosina authored
      Back in 2006, commit 1a2449a8 ("[I/OAT]: TCP recv offload to I/OAT")
      added support for receive offloading to IOAT dma engine if available.
      
      The code in tcp_rcv_established() tries to perform early DMA copy if
      applicable. It however does so without checking whether the userspace
      task is actually expecting the data in the buffer.
      
      This is not a problem under normal circumstances, but there is a corner
      case where this doesn't work -- and that's when MSG_TRUNC flag to
      recvmsg() is used.
      
      If the IOAT dma engine is not used, the code properly checks whether
      there is a valid ucopy.task and the socket is owned by userspace, but
      misses the check in the dmaengine case.
      
      This problem is trivial to observe in practice -- for example 'tbench' is a
      good reproducer, as it makes heavy use of MSG_TRUNC. On systems utilizing
      IOAT, you will soon find tbench waiting indefinitely in sk_wait_data() for
      data that has already been early-copied in tcp_rcv_established() using the
      DMA engine.
      
      This patch introduces the same check we are performing in the simple
      iovec copy case to the IOAT case as well. It fixes the indefinite
      recvmsg(MSG_TRUNC) hangs.
      Signed-off-by: Jiri Kosina <jkosina@suse.cz>
      Signed-off-by: David S. Miller <davem@davemloft.net>
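The condition described in this commit can be modelled as a small predicate. This is a sketch only: `struct task_model`, `struct rx_sock` and `may_copy_early()` are hypothetical names, and the fields stand in for the ucopy.task pointer and the "socket owned by userspace" state named in the message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct task_model {
    int pid;
};

struct rx_sock {
    struct task_model *ucopy_task;  /* task waiting in recvmsg(), or NULL */
    bool owned_by_user;             /* socket is held by process context */
};

/*
 * An early copy to userspace, whether by plain iovec copy or by a DMA
 * engine, is allowed only when the running task is the one waiting for
 * the data and it currently owns the socket.
 */
bool may_copy_early(const struct rx_sock *sk, const struct task_model *current)
{
    assert(sk != NULL);
    return sk->ucopy_task != NULL &&
           sk->ucopy_task == current &&
           sk->owned_by_user;
}
```

Routing both copy paths through one predicate like this is what keeps the DMA case from consuming data that no task is positioned to receive.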
    • Revert "openvswitch: potential NULL deref in sample()" · 60810307
      Jesse Gross authored
      This reverts commit 5b3e7e6c.
      
      The problem that the original commit was attempting to fix can
      never happen in practice because validation is done on a per-flow
      basis rather than a per-packet basis.  Adding additional checks at
      runtime is unnecessary and inconsistent with the rest of the code.
      
      CC: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: Jesse Gross <jesse@nicira.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • ipv4: fix TCP early demux · 505fbcf0
      Eric Dumazet authored
      commit 92101b3b (ipv4: Prepare for change of rt->rt_iif encoding.)
      invalidated TCP early demux, because rx_dst_ifindex is not properly
      initialized and checked.
      
      Also remove the use of inet_iif(skb) in favor of skb->skb_iif.
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
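The interface check this fix refers to can be pictured as follows. It is a simplified model, not the kernel code: `struct demux_sock`, `struct rx_packet` and `early_demux_dst()` are hypothetical, and only the field names rx_dst_ifindex and skb_iif come from the commit message.

```c
#include <assert.h>
#include <stddef.h>

struct cached_dst {
    int id;
};

struct demux_sock {
    struct cached_dst *rx_dst;  /* route cached on the socket, or NULL */
    int rx_dst_ifindex;         /* interface the cached route was learned on */
};

struct rx_packet {
    int skb_iif;                /* interface the packet arrived on */
};

/*
 * Reuse the socket's cached input route only if the packet arrived on
 * the interface recorded with it; otherwise return NULL so the caller
 * falls back to a full route lookup.
 */
struct cached_dst *early_demux_dst(const struct demux_sock *sk,
                                   const struct rx_packet *skb)
{
    assert(sk != NULL && skb != NULL);
    if (sk->rx_dst != NULL && sk->rx_dst_ifindex == skb->skb_iif)
        return sk->rx_dst;
    return NULL;
}
```

The check is only meaningful if rx_dst_ifindex is set whenever rx_dst is, which is the initialization half of the fix.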