1. 23 Oct, 2020 2 commits
    • gfs2: Recover statfs info in journal head · bedb0f05
      Abhi Das authored
      Apply the outstanding statfs changes in the journal head to the
      master statfs file. Zero out the local statfs file for good measure.
      
      Previously, statfs updates would be read in from the local statfs inode and
      synced to the master statfs inode during recovery.
      
      We now use the statfs updates in the journal head to update the master statfs
      inode instead of reading in from the local statfs inode. To preserve backward
      compatibility with kernels that can't do this, we still need to keep the
      local statfs inode up to date by writing changes to it. At some point in the
      future, we can do away with the local statfs inodes altogether and keep the
      statfs changes solely in the journal.
      Signed-off-by: Abhi Das <adas@redhat.com>
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
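      A minimal sketch of the recovery step described above, assuming heavily
      simplified types: the outstanding (total, free, dinodes) deltas found in
      the journal head are folded into the master counters and the local copy
      is zeroed. The struct and function names here are illustrative, not the
      actual gfs2 ones, and the on-disk encoding and inode handling are omitted.

      #include <stdint.h>
      #include <string.h>

      /* Simplified stand-in for gfs2's statfs change counters. */
      struct statfs_change {
              int64_t total;
              int64_t free;
              int64_t dinodes;
      };

      /*
       * Fold the outstanding deltas recovered from the journal head into the
       * master counters, then zero the per-node (local) copy, as the commit
       * message describes. Only the arithmetic is shown here.
       */
      static void recover_statfs(struct statfs_change *master,
                                 struct statfs_change *local)
      {
              master->total   += local->total;
              master->free    += local->free;
              master->dinodes += local->dinodes;
              memset(local, 0, sizeof(*local));
      }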
    • gfs2: lookup local statfs inodes prior to journal recovery · 97fd734b
      Abhi Das authored
      We need to look up the master statfs inode and the local statfs
      inodes earlier in the mount process (in init_journal) so that journal
      recovery can use them when it attempts to recover the statfs info.
      We look up all the local statfs inodes and store them in a linked
      list so that a node can recover the statfs info for other nodes in
      the cluster.
      Signed-off-by: Abhi Das <adas@redhat.com>
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
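      A minimal sketch, with made-up types, of the idea behind keeping the
      looked-up local statfs inodes on a list: the list is built at mount time
      (in init_journal per the commit), and recovery of any journal can then
      find the matching entry. Names such as local_statfs_entry and jid are
      illustrative only.

      #include <stddef.h>

      /* Illustrative entry for one journal's local statfs inode. */
      struct local_statfs_entry {
              struct local_statfs_entry *next;
              unsigned int jid;     /* journal id the inode belongs to */
              void *inode;          /* stand-in for the looked-up inode */
      };

      /*
       * Walk the list built during mount and return the local statfs entry
       * for the journal being recovered.
       */
      static struct local_statfs_entry *
      find_local_statfs(struct local_statfs_entry *head, unsigned int jid)
      {
              for (struct local_statfs_entry *e = head; e; e = e->next)
                      if (e->jid == jid)
                              return e;
              return NULL;
      }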
  2. 20 Oct, 2020 5 commits
  3. 15 Oct, 2020 11 commits
    • gfs2: eliminate GLF_QUEUED flag in favor of list_empty(gl_holders) · e2c6c8a7
      Bob Peterson authored
      Before this patch, glock.c maintained a flag, GLF_QUEUED, which indicated
      when a glock had a holder queued. It was only checked for inode glocks,
      although set and cleared by all glocks, and it was only used to determine
      whether the glock should be held for the minimum hold time before releasing.
      
      The problem is that the flag was not accurate at all. When a process
      queued a holder for the glock, the flag was set, but dequeuing the
      holder only cleared the flag when the glock state actually changed.
      So if the state didn't change, the flag could remain set even though
      nothing was queued anymore.
      
      This often happens with iopen glocks: they get held in SH, then the file
      is closed, but the glock remains in SH mode.
      
      We don't need a special flag to indicate this: we can simply check whether
      the glock has any items queued on its holders list. Maintaining the flag
      is a waste of CPU time.
      
      This patch eliminates the flag in favor of simply checking list_empty
      on the glock holders.
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
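      A minimal, self-contained sketch of the substitution this commit makes:
      instead of maintaining a GLF_QUEUED-style flag, ask the holders list
      directly. list_head and list_empty are re-declared here so the example
      stands alone; the glock struct is heavily simplified, locking is omitted,
      and the helper name is made up.

      #include <stdbool.h>

      struct list_head {
              struct list_head *next, *prev;
      };

      static inline bool list_empty(const struct list_head *head)
      {
              return head->next == head;
      }

      /* Heavily simplified glock: no flags word is needed for this check. */
      struct glock {
              struct list_head gl_holders;   /* queued holders */
      };

      /* Before: test_bit(GLF_QUEUED, &gl->gl_flags).  After: */
      static bool glock_has_queued_holders(const struct glock *gl)
      {
              return !list_empty(&gl->gl_holders);
      }

      The list check is always accurate by construction, so there is no flag to
      keep in sync with queue and dequeue operations.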
    • gfs2: Ignore journal log writes for jdata holes · b2a846db
      Bob Peterson authored
      When flushing out its ail1 list, gfs2_write_jdata_page calls function
      __block_write_full_page passing in function gfs2_get_block_noalloc.
      But there was a problem when a process wrote to a jdata file and then
      truncated it or punched a hole, leaving references to the blocks within
      the new hole on its ail list, blocks that were still due to be written
      to the journal log.
      
      In writing them to the journal, after calling gfs2_block_map, function
      gfs2_get_block_noalloc determined that the (hole-punched) block was not
      mapped, so it returned -EIO to generic_writepages, which passed it back
      to gfs2_ail1_start_one. This, in turn, performed a withdraw, assuming
      there was a real IO error writing to the journal.
      
      This might be a valid error when writing metadata to the journal, but for
      journaled data writes, it does not warrant a withdraw.
      
      This patch adds a check to function gfs2_block_map that makes an exception
      for journaled data writes that correspond to jdata holes: if the iomap
      get function returns a block type of IOMAP_HOLE, it returns -ENODATA
      instead, which does not cause the withdraw. Other errors are returned as
      before.
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
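      A minimal sketch of the error-mapping rule the patch introduces, using
      stand-in names: a hole encountered while writing journaled data is
      reported as -ENODATA (skip the block, no withdraw) rather than -EIO. The
      real code goes through gfs2_block_map and iomap's IOMAP_HOLE; only the
      decision itself is shown here.

      #include <errno.h>
      #include <stdbool.h>

      /* Stand-in for the iomap block types the real code inspects. */
      enum blk_type { BLK_MAPPED, BLK_HOLE };

      /*
       * Map a block for a journal write.  For journaled data, a hole is not
       * an I/O error: return -ENODATA so the caller skips the block instead
       * of withdrawing from the filesystem.
       */
      static int map_block_for_journal_write(enum blk_type type, bool jdata_write)
      {
              if (type == BLK_HOLE)
                      return jdata_write ? -ENODATA : -EIO;
              return 0;       /* mapped: proceed with the write */
      }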
    • gfs2: simplify gfs2_block_map · a6645745
      Bob Peterson authored
      Function gfs2_block_map had a lot of redundancy between its create and
      no_create paths. This patch simplifies the code to eliminate the redundancy.
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
    • gfs2: Only set PageChecked if we have a transaction · 6302d6f4
      Bob Peterson authored
      With jdata writes, we frequently got into situations where gfs2 deadlocked
      because of this calling sequence:
      
      gfs2_ail1_start
         gfs2_ail1_flush - for every tr on the sd_ail1_list:
            gfs2_ail1_start_one - for every bd on the tr's tr_ail1_list:
               generic_writepages
                  write_cache_pages passing __writepage()
                     calls clear_page_dirty_for_io which calls set_page_dirty,
                        which calls jdata_set_page_dirty which sets PageChecked.
                     __writepage() calls
                        mapping->a_ops->writepage AKA gfs2_jdata_writepage
      
      However, gfs2_jdata_writepage checks if PageChecked is set, and if so, it
      ignores the write and redirties the page. The problem is that write_cache_pages
      calls clear_page_dirty_for_io, which often calls set_page_dirty(). See comments
      in page-writeback.c starting with "Yes, Virginia". If it's jdata,
      set_page_dirty will call jdata_set_page_dirty which will set PageChecked.
      That causes a conflict because it makes it look like the page has been
      redirtied by another writer, in which case we need to skip writing it and
      redirty the page. That ends up in a deadlock because it isn't a "real" writer
      and nothing will ever clear PageChecked.
      
      If we do have a real writer, it will have started a transaction. So this
      patch checks if a transaction is in use, and if not, it skips setting
      PageChecked. That way, the page will be dirtied, cleaned, and written
      appropriately.
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
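      A minimal sketch of the fix, with simplified stand-in types: the
      "checked" marker is only set when a transaction is in progress (in the
      kernel this is the current->journal_info test), so writeback started
      without a transaction can clean and write the page normally instead of
      deadlocking on a marker nobody will clear.

      #include <stdbool.h>

      /* Simplified page state used only for this illustration. */
      struct page_state {
              bool dirty;
              bool checked;     /* stands in for PageChecked */
      };

      /*
       * Only mark the page "checked" when there is a real writer, i.e. a
       * transaction is in use; otherwise just dirty it so the normal
       * writeback path can clean and write it.
       */
      static void jdata_set_page_dirty(struct page_state *page, bool in_transaction)
      {
              if (in_transaction)
                      page->checked = true;
              page->dirty = true;
      }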
    • gfs2: don't lock sd_ail_lock in gfs2_releasepage · 249ffe18
      Bob Peterson authored
      Patch 380f7c65 changed gfs2_releasepage so that it held the sd_ail_lock
      spinlock for most of its processing. It did this to guard against some
      mysterious, undocumented bug somewhere in the evict code path. But in
      the nine years since, evict has been reworked and fixed many times, and
      so have the transactions and ail list.
      I can't see a reason to hold the sd_ail_lock unless it's protecting
      the actual ail lists hung off the transactions. Therefore, this patch
      removes the locking to increase speed and efficiency, and to further help
      us rework the log flush code to be more concurrent with transactions.
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
    • gfs2: make gfs2_ail1_empty_one return the count of active items · 36c78309
      Bob Peterson authored
      This patch is one baby step toward simplifying the journal management.
      It simply changes function gfs2_ail1_empty_one from returning void to
      returning an int count of active items. This allows the caller to check
      the return value rather than calling list_empty on the tr_ail1_list, so
      that a later patch can combine the transaction ail1 and ail2 lists.
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
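      A minimal sketch, with made-up types, of the interface change: the list
      walker now returns how many items are still active so the caller can
      branch on the return value instead of re-checking list_empty on
      tr_ail1_list afterwards. The struct and field names are illustrative.

      #include <stddef.h>

      /* Simplified stand-in for a buffer descriptor on tr_ail1_list. */
      struct buf_desc {
              struct buf_desc *next;
              int still_in_use;     /* not yet safely written back */
      };

      /* Formerly void; now reports the number of items still active. */
      static int ail1_empty_one(struct buf_desc *tr_ail1_list)
      {
              int active = 0;

              for (struct buf_desc *bd = tr_ail1_list; bd; bd = bd->next)
                      if (bd->still_in_use)
                              active++;
              return active;
      }

      A caller can then write "if (ail1_empty_one(list))" directly, which is
      what makes it possible to merge the two ail lists later on.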
    • gfs2: Wipe jdata and ail1 in gfs2_journal_wipe, formerly gfs2_meta_wipe · 68942870
      Bob Peterson authored
      Before this patch, when blocks were freed, gfs2 called gfs2_meta_wipe to
      take the metadata out of the pending journal blocks. It did this mostly
      by calling another function, gfs2_remove_from_journal. This was
      shortsighted because it did nothing with jdata blocks, which may also be
      in the journal.
      
      This patch expands the function so that it wipes out jdata blocks from
      the journal as well, and removes them from the ail1 list if they haven't
      been written back yet. Since it now processes jdata blocks too, the
      function has been renamed from gfs2_meta_wipe to gfs2_journal_wipe.
      
      New function gfs2_ail1_wipe wants a static view of the ail list, so it
      locks the sd_ail_lock when removing items. To accomplish this, function
      gfs2_remove_from_journal no longer locks the sd_ail_lock, and it's now
      the caller's responsibility to do so.
      
      I was going to make sd_ail_lock locking conditional, but the practice is
      generally frowned upon. For details, see: https://lwn.net/Articles/109066/
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
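      A minimal userspace sketch of the locking convention this commit adopts:
      the removal helper no longer takes the list lock itself; the caller that
      wants a stable view of the whole list (here, a wipe routine) holds it
      around the loop. A pthread mutex stands in for the kernel's sd_ail_lock
      spinlock, and the item type and function names are illustrative.

      #include <pthread.h>
      #include <stddef.h>

      struct item {
              struct item *next;
      };

      static pthread_mutex_t ail_lock = PTHREAD_MUTEX_INITIALIZER;

      /* Caller must hold ail_lock (it is no longer taken here). */
      static void remove_from_journal(struct item **head, struct item *victim)
      {
              for (struct item **pp = head; *pp; pp = &(*pp)->next) {
                      if (*pp == victim) {
                              *pp = victim->next;
                              break;
                      }
              }
      }

      /* The wipe routine takes the lock once for a stable view of the list. */
      static void ail1_wipe(struct item **head)
      {
              pthread_mutex_lock(&ail_lock);
              while (*head)
                      remove_from_journal(head, *head);
              pthread_mutex_unlock(&ail_lock);
      }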
    • gfs2: enhance log_blocks trace point to show log blocks free · 97c5e43d
      Bob Peterson authored
      This patch enhances the log_blocks trace point so that it also reports
      the number of free log blocks. This makes the trace point much more
      useful, especially for debugging performance problems, because we can
      tell when the journal gets full and has to wait for flushes.
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
    • gfs2: add missing log_blocks trace points in gfs2_write_revokes · 77650bdb
      Bob Peterson authored
      Function gfs2_write_revokes was incrementing and decrementing the number
      of log blocks free, but there was never a log_blocks trace point for it.
      Thus, the free blocks from a log_blocks trace would jump around
      mysteriously.
      
      This patch adds the missing trace points so the trace makes more sense.
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
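      A minimal sketch of the principle behind the two trace-point commits
      above: every adjustment of the free-log-blocks counter is paired with a
      trace event, so the value never appears to jump between events.
      trace_log_blocks() below is just a printf stand-in for the real
      log_blocks trace point, and the counter and helper are illustrative.

      #include <stdio.h>

      static int log_blocks_free;

      /* Stand-in for the gfs2 log_blocks trace point. */
      static void trace_log_blocks(int free_blocks)
      {
              printf("log_blocks: free=%d\n", free_blocks);
      }

      /* Adjust the counter and always emit the matching trace event. */
      static void adjust_log_blocks_free(int delta)
      {
              log_blocks_free += delta;
              trace_log_blocks(log_blocks_free);   /* the previously missing step */
      }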
    • gfs2: rename gfs2_write_full_page to gfs2_write_jdata_page, remove parm · 21b6924b
      Bob Peterson authored
      Since the function is only used for writing jdata pages, this patch
      simply renames function gfs2_write_full_page to a more appropriate
      name: gfs2_write_jdata_page. This makes the code easier to understand.
      
      The function was only called from one place, which always passed in a
      pointer to function gfs2_get_block_noalloc, so there is no need to pass
      that function in at all. This patch therefore also eliminates the
      unnecessary parameter, which is slightly more efficient.
      
      I also took the liberty of cleaning up the function comments.
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
    • gfs2: add validation checks for size of superblock · 0ddc5154
      Anant Thazhemadam authored
      In gfs2_check_sb(), no validation checks were performed on the size
      (block size) of the superblock.
      syzkaller detected a slab-out-of-bounds bug that was primarily caused
      by the block size of a superblock being set to zero.
      A valid size is a power of 2 between 512 and PAGE_SIZE.
      Performing validation checks to ensure that the superblock size is
      valid fixes this bug.
      
      Reported-by: syzbot+af90d47a37376844e731@syzkaller.appspotmail.com
      Tested-by: syzbot+af90d47a37376844e731@syzkaller.appspotmail.com
      Suggested-by: Andrew Price <anprice@redhat.com>
      Signed-off-by: Anant Thazhemadam <anant.thazhemadam@gmail.com>
      [Minor code reordering.]
      Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
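      A minimal sketch of the validity rule stated in the message: the block
      size must be a power of two between 512 and PAGE_SIZE. PAGE_SIZE is
      hard-coded here for illustration, and the helper name is made up; the
      real check lives in gfs2_check_sb().

      #include <stdbool.h>
      #include <stdint.h>

      #define SECTOR_SIZE 512u
      #define PAGE_SIZE   4096u   /* illustrative; the real value is architecture-dependent */

      /* Valid block size: a power of two in [512, PAGE_SIZE]. */
      static bool sb_bsize_valid(uint32_t bsize)
      {
              if (bsize < SECTOR_SIZE || bsize > PAGE_SIZE)
                      return false;
              return (bsize & (bsize - 1)) == 0;   /* power-of-two test */
      }

      Note that a block size of zero, the case syzkaller hit, already fails
      the lower-bound check.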
  4. 14 Oct, 2020 12 commits
  5. 11 Oct, 2020 10 commits