1. 21 Oct, 2011 7 commits
    • Steven Whitehouse's avatar
      GFS2: Use ->dirty_inode() · ab9bbda0
      Steven Whitehouse authored
      The aim of this patch is to use the newly enhanced ->dirty_inode()
      super block operation to deal with atime updates, rather than
      piggy backing that code into ->write_inode() as is currently
      done.
      
      The net result is a simplification of the code in various places
      and a reduction of the number of gfs2_dinode_out() calls since
      this is now implied by ->dirty_inode().
      
      Some of the mark_inode_dirty() calls have been moved under glocks
      in order to take advantage of then being able to avoid locking in
      ->dirty_inode() when we already have suitable locks.
      
      One consequence is that generic_write_end() now correctly deals
      with file size updates, so that we do not need a separate check
      for that afterwards. This also, indirectly, means that fdatasync
      should work correctly on GFS2 - the current code always syncs the
      metadata whether it needs to or not.
      
      Has survived testing with postmark (with and without atime) and
      also fsx.
      Signed-off-by: default avatarSteven Whitehouse <swhiteho@redhat.com>
      ab9bbda0
    • Steven Whitehouse's avatar
      GFS2: Fix bug trap and journaled data fsync · f1818529
      Steven Whitehouse authored
      Journaled data requires that a complete flush of all dirty data for
      the file is done, in order that the ail flush which comes after
      will succeed.
      
      Also the recently enhanced bug trap can trigger falsely in case
      an ail flush from fsync races with a page read. This updates the
      bug trap such that it will ignore buffers which are locked and
      only trigger on dirty and/or pinned buffers when the ail flush
      is run from fsync. The original bug trap is retained when ail
      flush is run from ->go_sync()
      Signed-off-by: default avatarSteven Whitehouse <swhiteho@redhat.com>
      f1818529
    • Steven Whitehouse's avatar
      GFS2: Fix inode allocation error path · 40ac218f
      Steven Whitehouse authored
      If we have got far enough through the inode allocation code
      path that an inode has already been allocated, then we must
      call iput to dispose of it, if an error occurs during a
      later part of the process. This will always be the final iput
      since there will be no other references to the inode.
      
      Unlike when the inode has been unlinked, its block state will
      be GFS2_BLKST_INODE rather than GFS2_BLKST_UNLINKED so we need
      to skip the test in ->evict_inode() for this one case in order
      to ensure that it will be deallocated correctly. This patch adds
      a new flag in order to ensure that this will happen correctly.
      Signed-off-by: default avatarSteven Whitehouse <swhiteho@redhat.com>
      40ac218f
    • Steven Whitehouse's avatar
      GFS2: Make atime checks more efficient · 1d4ec642
      Steven Whitehouse authored
      We do not need to start a transaction unless the atime
      check has proved positive. Also if we are going to flush
      the complete ail list anyway, we might as well skip the
      writeback for this specific inode's metadata, since that
      will be done as part of the ail writeback process in an
      order offering potentially more efficient I/O.
      Signed-off-by: default avatarSteven Whitehouse <swhiteho@redhat.com>
      1d4ec642
    • Steven Whitehouse's avatar
      GFS2: Fix bug-trap in ail flush code · 75549186
      Steven Whitehouse authored
      The assert was being tested under the wrong lock, a
      legacy of the original code. Also, if it does trigger,
      the resulting information was not always a lot of help.
      
      This moves the patch under the correct lock and also
      prints out more useful information in tacking down the
      source of the problem.
      Signed-off-by: default avatarSteven Whitehouse <swhiteho@redhat.com>
      75549186
    • Steven Whitehouse's avatar
      GFS2: Split data write & wait in fsync · 2f0264d5
      Steven Whitehouse authored
      Now that the data writing is part of fsync proper, we can split
      the waiting part out and do it later on. This reduces the
      number of waits that we do during fsync on average.
      
      There is also no need to take the i_mutex unless we are flushing
      metadata to disk, so we can move that to within the metadata
      flushing code.
      Signed-off-by: default avatarSteven Whitehouse <swhiteho@redhat.com>
      2f0264d5
    • Steven Whitehouse's avatar
      GFS2: Clean up dir hash table reading · 4c28d338
      Steven Whitehouse authored
      Since there is now only a single caller to gfs2_dir_read_data()
      and it has a number of constant arguments, we can factor
      those out. Also some tests relating to the inode size were
      being done twice.
      Signed-off-by: default avatarSteven Whitehouse <swhiteho@redhat.com>
      4c28d338
  2. 20 Oct, 2011 3 commits
    • Linus Torvalds's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc · fd11e153
      Linus Torvalds authored
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
        sparc: Add alignment flag to PCI expansion resources
        sparc: Avoid calling sigprocmask()
        sparc: Use set_current_blocked()
        sparc32,leon: SRMMU MMU Table probe fix
      fd11e153
    • Linus Torvalds's avatar
      Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net · 505f48b5
      Linus Torvalds authored
      * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
        fib_rules: fix unresolved_rules counting
        r8169: fix wrong eee setting for rlt8111evl
        r8169: fix driver shutdown WoL regression.
        ehea: Change maintainer to me
        pptp: pptp_rcv_core() misses pskb_may_pull() call
        tproxy: copy transparent flag when creating a time wait
        pptp: fix skb leak in pptp_xmit()
        bonding: use local function pointer of bond->recv_probe in bond_handle_frame
        smsc911x: Add support for SMSC LAN89218
        tg3: negate USE_PHYLIB flag check
        netconsole: enable netconsole can make net_device refcnt incorrent
        bluetooth: Properly clone LSM attributes to newly created child connections
        l2tp: fix a potential skb leak in l2tp_xmit_skb()
        bridge: fix hang on removal of bridge via netlink
        x25: Prevent skb overreads when checking call user data
        x25: Handle undersized/fragmented skbs
        x25: Validate incoming call user data lengths
        udplite: fast-path computation of checksum coverage
        IPVS netns shutdown/startup dead-lock
        netfilter: nf_conntrack: fix event flooding in GRE protocol tracker
      505f48b5
    • Hugh Dickins's avatar
      mm: fix race between mremap and removing migration entry · 486cf46f
      Hugh Dickins authored
      I don't usually pay much attention to the stale "? " addresses in
      stack backtraces, but this lucky report from Pawel Sikora hints that
      mremap's move_ptes() has inadequate locking against page migration.
      
       3.0 BUG_ON(!PageLocked(p)) in migration_entry_to_page():
       kernel BUG at include/linux/swapops.h:105!
       RIP: 0010:[<ffffffff81127b76>]  [<ffffffff81127b76>]
                             migration_entry_wait+0x156/0x160
        [<ffffffff811016a1>] handle_pte_fault+0xae1/0xaf0
        [<ffffffff810feee2>] ? __pte_alloc+0x42/0x120
        [<ffffffff8112c26b>] ? do_huge_pmd_anonymous_page+0xab/0x310
        [<ffffffff81102a31>] handle_mm_fault+0x181/0x310
        [<ffffffff81106097>] ? vma_adjust+0x537/0x570
        [<ffffffff81424bed>] do_page_fault+0x11d/0x4e0
        [<ffffffff81109a05>] ? do_mremap+0x2d5/0x570
        [<ffffffff81421d5f>] page_fault+0x1f/0x30
      
      mremap's down_write of mmap_sem, together with i_mmap_mutex or lock,
      and pagetable locks, were good enough before page migration (with its
      requirement that every migration entry be found) came in, and enough
      while migration always held mmap_sem; but not enough nowadays, when
      there's memory hotremove and compaction.
      
      The danger is that move_ptes() lets a migration entry dodge around
      behind remove_migration_pte()'s back, so it's in the old location when
      looking at the new, then in the new location when looking at the old.
      
      Either mremap's move_ptes() must additionally take anon_vma lock(), or
      migration's remove_migration_pte() must stop peeking for is_swap_entry()
      before it takes pagetable lock.
      
      Consensus chooses the latter: we prefer to add overhead to migration
      than to mremapping, which gets used by JVMs and by exec stack setup.
      Reported-and-tested-by: default avatarPaweł Sikora <pluto@agmk.net>
      Signed-off-by: default avatarHugh Dickins <hughd@google.com>
      Acked-by: default avatarAndrea Arcangeli <aarcange@redhat.com>
      Acked-by: default avatarMel Gorman <mgorman@suse.de>
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      486cf46f
  3. 19 Oct, 2011 19 commits
  4. 18 Oct, 2011 6 commits
  5. 17 Oct, 2011 5 commits