Commit babbcc02 authored by Linus Torvalds

Merge tag 'xfs-6.9-merge-8' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux

Pull xfs updates from Chandan Babu:

 - Online repair updates:
    - More ondisk structures being repaired:
        - Inode's mode field by trying to obtain the file type value from
          a directory entry
        - Quota counters
        - Link counts of inodes
        - FS summary counters
        - Support for in-memory btrees has been added to support repair
          of rmap btrees
    - Misc changes:
        - Report corruption of metadata to the health tracking subsystem
        - Enable indirect health reporting when resources are scarce
        - Reduce memory usage while repairing refcount btree
        - Extend "Bmap update" intent item to support atomic extent
          swapping on the realtime device
        - Extend "Bmap update" intent item to support extended attribute
          fork and unwritten extents
    - Code cleanups:
        - Bmap log intent
        - Btree block pointer checking
        - Btree readahead
        - Buffer target
        - Symbolic link code

 - Remove mrlock wrapper around the rwsem

 - Convert all the GFP_NOFS flag usages to use the scoped
   memalloc_nofs_save() API instead of direct calls with the GFP_NOFS
   flag (a sketch of the scoped idiom follows this list)

 - Refactor and simplify xfile abstraction. Lower level APIs in shmem.c
   are required to be exported in order to achieve this

 - Skip checking alignment constraints for inode chunk allocations when
   block size is larger than inode chunk size

 - Do not submit delwri buffers collected during log recovery when an
   error has been encountered

 - Fix SEEK_HOLE/DATA for file regions which have active COW extents

 - Fix lock order inversion when executing the error handling path while
   shrinking a filesystem

 - Remove duplicate ifdefs
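
As context for the GFP_NOFS conversion bullet above, here is a minimal sketch
of the scoped-allocation idiom; the wrapper function is hypothetical, while
memalloc_nofs_save()/memalloc_nofs_restore() are the real kernel API:

	#include <linux/sched/mm.h>
	#include <linux/slab.h>

	/* Hypothetical helper: allocate while filesystem reclaim is unsafe. */
	static void *example_alloc_in_fs_context(size_t size)
	{
		unsigned int nofs_flags;
		void *ptr;

		/*
		 * Every allocation between save and restore implicitly
		 * behaves as GFP_NOFS, so callees can pass plain GFP_KERNEL
		 * without risking recursion into filesystem reclaim.
		 */
		nofs_flags = memalloc_nofs_save();
		ptr = kmalloc(size, GFP_KERNEL);
		memalloc_nofs_restore(nofs_flags);
		return ptr;
	}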

* tag 'xfs-6.9-merge-8' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: (183 commits)
  xfs: shrink failure needs to hold AGI buffer
  mm/shmem.c: Use new form of *@param in kernel-doc
  kernel-doc: Add unary operator * to $type_param_ref
  xfs: use kvfree() in xlog_cil_free_logvec()
  xfs: xfs_btree_bload_prep_block() should use __GFP_NOFAIL
  xfs: fix scrub stats file permissions
  xfs: fix log recovery erroring out on refcount recovery failure
  xfs: move symlink target write function to libxfs
  xfs: move remote symlink target read function to libxfs
  xfs: move xfs_symlink_remote.c declarations to xfs_symlink_remote.h
  xfs: xfs_bmap_finish_one should map unwritten extents properly
  xfs: support deferred bmap updates on the attr fork
  xfs: support recovering bmap intent items targetting realtime extents
  xfs: add a realtime flag to the bmap update log redo items
  xfs: add a xattr_entry helper
  xfs: fix xfs_bunmapi to allow unmapping of partial rt extents
  xfs: move xfs_bmap_defer_add to xfs_bmap_item.c
  xfs: reuse xfs_bmap_update_cancel_item
  xfs: add a bi_entry helper
  xfs: remove xfs_trans_set_bmap_flags
  ...
parents 279d44ce 75bcffbb
......@@ -1915,19 +1915,13 @@ four of those five higher level data structures.
The fifth use case is discussed in the :ref:`realtime summary <rtsummary>` case
study.
-The most general storage interface supported by the xfile enables the reading
-and writing of arbitrary quantities of data at arbitrary offsets in the xfile.
-This capability is provided by ``xfile_pread`` and ``xfile_pwrite`` functions,
-which behave similarly to their userspace counterparts.
XFS is very record-based, which suggests that the ability to load and store
complete records is important.
-To support these cases, a pair of ``xfile_obj_load`` and ``xfile_obj_store``
-functions are provided to read and persist objects into an xfile.
-They are internally the same as pread and pwrite, except that they treat any
-error as an out of memory error.
-For online repair, squashing error conditions in this manner is an acceptable
-behavior because the only reaction is to abort the operation back to userspace.
-All five xfile usecases can be serviced by these four functions.
+To support these cases, a pair of ``xfile_load`` and ``xfile_store``
+functions are provided to read and persist objects into an xfile that treat any
+error as an out of memory error. For online repair, squashing error conditions
+in this manner is an acceptable behavior because the only reaction is to abort
+the operation back to userspace.
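
For context, a hedged sketch of the renamed load/store idiom, assuming the
``xfile_load(xf, buf, count, pos)`` calling convention and a hypothetical
fixed-size record type::

	struct xexample_rec	rec;	/* hypothetical record type */
	int			error;

	/* Persist record nr; any failure is reported as -ENOMEM. */
	error = xfile_store(xf, &rec, sizeof(rec), nr * sizeof(rec));
	if (error)
		return error;

	/* Read it back; short reads also come back as -ENOMEM. */
	error = xfile_load(xf, &rec, sizeof(rec), nr * sizeof(rec));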
However, no discussion of file access idioms is complete without answering the
question, "But what about mmap?"
......@@ -1939,15 +1933,14 @@ tmpfs can only push a pagecache folio to the swap cache if the folio is neither
pinned nor locked, which means the xfile must not pin too many folios.
Short term direct access to xfile contents is done by locking the pagecache
-folio and mapping it into kernel address space.
-Programmatic access (e.g. pread and pwrite) uses this mechanism.
-Folio locks are not supposed to be held for long periods of time, so long
-term direct access to xfile contents is done by bumping the folio refcount,
+folio and mapping it into kernel address space. Object load and store uses this
+mechanism. Folio locks are not supposed to be held for long periods of time, so
+long term direct access to xfile contents is done by bumping the folio refcount,
mapping it into kernel address space, and dropping the folio lock.
These long term users *must* be responsive to memory reclaim by hooking into
the shrinker infrastructure to know when to release folios.
-The ``xfile_get_page`` and ``xfile_put_page`` functions are provided to
+The ``xfile_get_folio`` and ``xfile_put_folio`` functions are provided to
retrieve the (locked) folio that backs part of an xfile and to release it.
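
For context, a hedged sketch of the short-term folio lease pattern; the
``XFILE_ALLOC`` flag and the exact signatures are assumptions based on this
series::

	struct folio	*folio;

	/* Grab and lock the folio backing this range, allocating if needed. */
	folio = xfile_get_folio(xf, pos, len, XFILE_ALLOC);
	if (IS_ERR(folio))
		return PTR_ERR(folio);

	memcpy(folio_address(folio) + offset_in_folio(folio, pos), src, len);

	/* Short term users unlock and release promptly. */
	xfile_put_folio(xf, folio);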
The only code to use these folio lease functions are the xfarray
:ref:`sorting<xfarray_sort>` algorithms and the :ref:`in-memory
......@@ -2277,13 +2270,12 @@ follows:
pointing to the xfile.
3. Pass the buffer cache target, buffer ops, and other information to
-``xfbtree_create`` to write an initial tree header and root block to the
-xfile.
+``xfbtree_init`` to initialize the passed in ``struct xfbtree`` and write an
+initial root block to the xfile.
Each btree type should define a wrapper that passes necessary arguments to
the creation function.
For example, rmap btrees define ``xfs_rmapbt_mem_create`` to take care of
all the necessary details for callers.
-A ``struct xfbtree`` object will be returned.
4. Pass the xfbtree object to the btree cursor creation function for the
btree type.
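
A hedged sketch of steps 3-4 above, using the ``xfbtree_init`` signature from
xfs_btree_mem.h; the rmap ops pointer and cursor wrapper names are
illustrative::

	struct xfbtree		xfbt;
	struct xfs_btree_cur	*cur;
	int			error;

	/* Step 3: initialize the xfbtree and write its root block to the
	 * xfile-backed buffer target (owner setup elided). */
	error = xfbtree_init(mp, &xfbt, btp, &xfs_rmapbt_mem_ops);
	if (error)
		return error;

	/* Step 4: open a btree cursor backed by the in-memory btree. */
	cur = xfs_rmapbt_mem_cursor(pag, tp, &xfbt);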
......
......@@ -124,12 +124,24 @@ config XFS_DRAIN_INTENTS
bool
select JUMP_LABEL if HAVE_ARCH_JUMP_LABEL
config XFS_LIVE_HOOKS
bool
select JUMP_LABEL if HAVE_ARCH_JUMP_LABEL
config XFS_MEMORY_BUFS
bool
config XFS_BTREE_IN_MEM
bool
config XFS_ONLINE_SCRUB
bool "XFS online metadata check support"
default n
depends on XFS_FS
depends on TMPFS && SHMEM
select XFS_LIVE_HOOKS
select XFS_DRAIN_INTENTS
select XFS_MEMORY_BUFS
help
If you say Y here you will be able to check metadata on a
mounted XFS filesystem. This feature is intended to reduce
......@@ -164,6 +176,7 @@ config XFS_ONLINE_REPAIR
bool "XFS online metadata repair support"
default n
depends on XFS_FS && XFS_ONLINE_SCRUB
select XFS_BTREE_IN_MEM
help
If you say Y here you will be able to repair metadata on a
mounted XFS filesystem. This feature is intended to reduce
......
......@@ -92,8 +92,7 @@ xfs-y += xfs_aops.o \
xfs_symlink.o \
xfs_sysfs.o \
xfs_trans.o \
xfs_xattr.o \
kmem.o
xfs_xattr.o
# low-level transaction/log code
xfs-y += xfs_log.o \
......@@ -137,6 +136,9 @@ xfs-$(CONFIG_FS_DAX) += xfs_notify_failure.o
endif
xfs-$(CONFIG_XFS_DRAIN_INTENTS) += xfs_drain.o
xfs-$(CONFIG_XFS_LIVE_HOOKS) += xfs_hooks.o
xfs-$(CONFIG_XFS_MEMORY_BUFS) += xfs_buf_mem.o
xfs-$(CONFIG_XFS_BTREE_IN_MEM) += libxfs/xfs_btree_mem.o
# online scrub/repair
ifeq ($(CONFIG_XFS_ONLINE_SCRUB),y)
......@@ -159,6 +161,8 @@ xfs-y += $(addprefix scrub/, \
health.o \
ialloc.o \
inode.o \
iscan.o \
nlinks.o \
parent.o \
readdir.o \
refcount.o \
......@@ -179,6 +183,7 @@ xfs-$(CONFIG_XFS_RT) += $(addprefix scrub/, \
xfs-$(CONFIG_XFS_QUOTA) += $(addprefix scrub/, \
dqiterate.o \
quota.o \
quotacheck.o \
)
# online repair
......@@ -188,12 +193,17 @@ xfs-y += $(addprefix scrub/, \
alloc_repair.o \
bmap_repair.o \
cow_repair.o \
fscounters_repair.o \
ialloc_repair.o \
inode_repair.o \
newbt.o \
nlinks_repair.o \
rcbag_btree.o \
rcbag.o \
reap.o \
refcount_repair.o \
repair.o \
rmap_repair.o \
)
xfs-$(CONFIG_XFS_RT) += $(addprefix scrub/, \
......@@ -202,6 +212,7 @@ xfs-$(CONFIG_XFS_RT) += $(addprefix scrub/, \
xfs-$(CONFIG_XFS_QUOTA) += $(addprefix scrub/, \
quota_repair.o \
quotacheck_repair.o \
)
endif
endif
// SPDX-License-Identifier: GPL-2.0
/*
* Copyright (c) 2000-2005 Silicon Graphics, Inc.
* All Rights Reserved.
*/
#include "xfs.h"
#include "xfs_message.h"
#include "xfs_trace.h"
void *
kmem_alloc(size_t size, xfs_km_flags_t flags)
{
int retries = 0;
gfp_t lflags = kmem_flags_convert(flags);
void *ptr;
trace_kmem_alloc(size, flags, _RET_IP_);
do {
ptr = kmalloc(size, lflags);
if (ptr || (flags & KM_MAYFAIL))
return ptr;
if (!(++retries % 100))
xfs_err(NULL,
"%s(%u) possible memory allocation deadlock size %u in %s (mode:0x%x)",
current->comm, current->pid,
(unsigned int)size, __func__, lflags);
memalloc_retry_wait(lflags);
} while (1);
}
/* SPDX-License-Identifier: GPL-2.0 */
/*
* Copyright (c) 2000-2005 Silicon Graphics, Inc.
* All Rights Reserved.
*/
#ifndef __XFS_SUPPORT_KMEM_H__
#define __XFS_SUPPORT_KMEM_H__
#include <linux/slab.h>
#include <linux/sched.h>
#include <linux/mm.h>
#include <linux/vmalloc.h>
/*
* General memory allocation interfaces
*/
typedef unsigned __bitwise xfs_km_flags_t;
#define KM_NOFS ((__force xfs_km_flags_t)0x0004u)
#define KM_MAYFAIL ((__force xfs_km_flags_t)0x0008u)
#define KM_ZERO ((__force xfs_km_flags_t)0x0010u)
#define KM_NOLOCKDEP ((__force xfs_km_flags_t)0x0020u)
/*
* We use a special process flag to avoid recursive callbacks into
* the filesystem during transactions. We will also issue our own
* warnings, so we explicitly skip any generic ones (silly of us).
*/
static inline gfp_t
kmem_flags_convert(xfs_km_flags_t flags)
{
gfp_t lflags;
BUG_ON(flags & ~(KM_NOFS | KM_MAYFAIL | KM_ZERO | KM_NOLOCKDEP));
lflags = GFP_KERNEL | __GFP_NOWARN;
if (flags & KM_NOFS)
lflags &= ~__GFP_FS;
/*
* Default page/slab allocator behavior is to retry for ever
* for small allocations. We can override this behavior by using
* __GFP_RETRY_MAYFAIL which will tell the allocator to retry as long
* as it is feasible but rather fail than retry forever for all
* request sizes.
*/
if (flags & KM_MAYFAIL)
lflags |= __GFP_RETRY_MAYFAIL;
if (flags & KM_ZERO)
lflags |= __GFP_ZERO;
if (flags & KM_NOLOCKDEP)
lflags |= __GFP_NOLOCKDEP;
return lflags;
}
extern void *kmem_alloc(size_t, xfs_km_flags_t);
static inline void kmem_free(const void *ptr)
{
kvfree(ptr);
}
static inline void *
kmem_zalloc(size_t size, xfs_km_flags_t flags)
{
return kmem_alloc(size, flags | KM_ZERO);
}
/*
* Zone interfaces
*/
static inline struct page *
kmem_to_page(void *addr)
{
if (is_vmalloc_addr(addr))
return vmalloc_to_page(addr);
return virt_to_page(addr);
}
#endif /* __XFS_SUPPORT_KMEM_H__ */
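
For orientation, the conversions later in this merge replace these legacy
wrappers with direct slab calls; a sketch of the mapping, mirroring hunks that
appear below:

	/* Before: retried forever on failure. */
	ptr = kmem_alloc(size, 0);
	/* After: */
	ptr = kmalloc(size, GFP_KERNEL | __GFP_NOFAIL);

	/* Before: zeroed allocation that is allowed to fail. */
	pag = kmem_zalloc(sizeof(*pag), KM_MAYFAIL);
	/* After: */
	pag = kzalloc(sizeof(*pag), GFP_KERNEL | __GFP_RETRY_MAYFAIL);

	/* Before: kmem_free() accepted kmalloc and vmalloc memory alike. */
	kmem_free(ptr);
	/* After: kfree() for slab memory, kvfree() where it may be vmalloc. */
	kfree(ptr);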
......@@ -217,6 +217,7 @@ xfs_initialize_perag_data(
*/
if (fdblocks > sbp->sb_dblocks || ifree > ialloc) {
xfs_alert(mp, "AGF corruption. Please run xfs_repair.");
xfs_fs_mark_sick(mp, XFS_SICK_FS_COUNTERS);
error = -EFSCORRUPTED;
goto out;
}
......@@ -241,7 +242,7 @@ __xfs_free_perag(
struct xfs_perag *pag = container_of(head, struct xfs_perag, rcu_head);
ASSERT(!delayed_work_pending(&pag->pag_blockgc_work));
kmem_free(pag);
kfree(pag);
}
/*
......@@ -263,7 +264,7 @@ xfs_free_perag(
xfs_defer_drain_free(&pag->pag_intents_drain);
cancel_delayed_work_sync(&pag->pag_blockgc_work);
xfs_buf_hash_destroy(pag);
xfs_buf_cache_destroy(&pag->pag_bcache);
/* drop the mount's active reference */
xfs_perag_rele(pag);
......@@ -351,9 +352,9 @@ xfs_free_unused_perag_range(
spin_unlock(&mp->m_perag_lock);
if (!pag)
break;
xfs_buf_hash_destroy(pag);
xfs_buf_cache_destroy(&pag->pag_bcache);
xfs_defer_drain_free(&pag->pag_intents_drain);
kmem_free(pag);
kfree(pag);
}
}
......@@ -381,7 +382,7 @@ xfs_initialize_perag(
continue;
}
pag = kmem_zalloc(sizeof(*pag), KM_MAYFAIL);
pag = kzalloc(sizeof(*pag), GFP_KERNEL | __GFP_RETRY_MAYFAIL);
if (!pag) {
error = -ENOMEM;
goto out_unwind_new_pags;
......@@ -389,7 +390,7 @@ xfs_initialize_perag(
pag->pag_agno = index;
pag->pag_mount = mp;
error = radix_tree_preload(GFP_NOFS);
error = radix_tree_preload(GFP_KERNEL | __GFP_RETRY_MAYFAIL);
if (error)
goto out_free_pag;
......@@ -416,9 +417,10 @@ xfs_initialize_perag(
init_waitqueue_head(&pag->pag_active_wq);
pag->pagb_count = 0;
pag->pagb_tree = RB_ROOT;
xfs_hooks_init(&pag->pag_rmap_update_hooks);
#endif /* __KERNEL__ */
error = xfs_buf_hash_init(pag);
error = xfs_buf_cache_init(&pag->pag_bcache);
if (error)
goto out_remove_pag;
......@@ -453,7 +455,7 @@ xfs_initialize_perag(
radix_tree_delete(&mp->m_perag_tree, index);
spin_unlock(&mp->m_perag_lock);
out_free_pag:
kmem_free(pag);
kfree(pag);
out_unwind_new_pags:
/* unwind any prior newly initialized pags */
xfs_free_unused_perag_range(mp, first_initialised, agcount);
......@@ -491,7 +493,7 @@ xfs_btroot_init(
struct xfs_buf *bp,
struct aghdr_init_data *id)
{
xfs_btree_init_block(mp, bp, id->type, 0, 0, id->agno);
xfs_btree_init_buf(mp, bp, id->bc_ops, 0, 0, id->agno);
}
/* Finish initializing a free space btree. */
......@@ -549,7 +551,7 @@ xfs_freesp_init_recs(
}
/*
* Alloc btree root block init functions
* bnobt/cntbt btree root block init functions
*/
static void
xfs_bnoroot_init(
......@@ -557,17 +559,7 @@ xfs_bnoroot_init(
struct xfs_buf *bp,
struct aghdr_init_data *id)
{
xfs_btree_init_block(mp, bp, XFS_BTNUM_BNO, 0, 0, id->agno);
xfs_freesp_init_recs(mp, bp, id);
}
static void
xfs_cntroot_init(
struct xfs_mount *mp,
struct xfs_buf *bp,
struct aghdr_init_data *id)
{
xfs_btree_init_block(mp, bp, XFS_BTNUM_CNT, 0, 0, id->agno);
xfs_btree_init_buf(mp, bp, id->bc_ops, 0, 0, id->agno);
xfs_freesp_init_recs(mp, bp, id);
}
......@@ -583,7 +575,7 @@ xfs_rmaproot_init(
struct xfs_btree_block *block = XFS_BUF_TO_BLOCK(bp);
struct xfs_rmap_rec *rrec;
xfs_btree_init_block(mp, bp, XFS_BTNUM_RMAP, 0, 4, id->agno);
xfs_btree_init_buf(mp, bp, id->bc_ops, 0, 4, id->agno);
/*
* mark the AG header regions as static metadata The BNO
......@@ -678,14 +670,13 @@ xfs_agfblock_init(
agf->agf_versionnum = cpu_to_be32(XFS_AGF_VERSION);
agf->agf_seqno = cpu_to_be32(id->agno);
agf->agf_length = cpu_to_be32(id->agsize);
agf->agf_roots[XFS_BTNUM_BNOi] = cpu_to_be32(XFS_BNO_BLOCK(mp));
agf->agf_roots[XFS_BTNUM_CNTi] = cpu_to_be32(XFS_CNT_BLOCK(mp));
agf->agf_levels[XFS_BTNUM_BNOi] = cpu_to_be32(1);
agf->agf_levels[XFS_BTNUM_CNTi] = cpu_to_be32(1);
agf->agf_bno_root = cpu_to_be32(XFS_BNO_BLOCK(mp));
agf->agf_cnt_root = cpu_to_be32(XFS_CNT_BLOCK(mp));
agf->agf_bno_level = cpu_to_be32(1);
agf->agf_cnt_level = cpu_to_be32(1);
if (xfs_has_rmapbt(mp)) {
agf->agf_roots[XFS_BTNUM_RMAPi] =
cpu_to_be32(XFS_RMAP_BLOCK(mp));
agf->agf_levels[XFS_BTNUM_RMAPi] = cpu_to_be32(1);
agf->agf_rmap_root = cpu_to_be32(XFS_RMAP_BLOCK(mp));
agf->agf_rmap_level = cpu_to_be32(1);
agf->agf_rmap_blocks = cpu_to_be32(1);
}
......@@ -796,7 +787,7 @@ struct xfs_aghdr_grow_data {
size_t numblks;
const struct xfs_buf_ops *ops;
aghdr_init_work_f work;
xfs_btnum_t type;
const struct xfs_btree_ops *bc_ops;
bool need_init;
};
......@@ -850,13 +841,15 @@ xfs_ag_init_headers(
.numblks = BTOBB(mp->m_sb.sb_blocksize),
.ops = &xfs_bnobt_buf_ops,
.work = &xfs_bnoroot_init,
.bc_ops = &xfs_bnobt_ops,
.need_init = true
},
{ /* CNT root block */
.daddr = XFS_AGB_TO_DADDR(mp, id->agno, XFS_CNT_BLOCK(mp)),
.numblks = BTOBB(mp->m_sb.sb_blocksize),
.ops = &xfs_cntbt_buf_ops,
.work = &xfs_cntroot_init,
.work = &xfs_bnoroot_init,
.bc_ops = &xfs_cntbt_ops,
.need_init = true
},
{ /* INO root block */
......@@ -864,7 +857,7 @@ xfs_ag_init_headers(
.numblks = BTOBB(mp->m_sb.sb_blocksize),
.ops = &xfs_inobt_buf_ops,
.work = &xfs_btroot_init,
.type = XFS_BTNUM_INO,
.bc_ops = &xfs_inobt_ops,
.need_init = true
},
{ /* FINO root block */
......@@ -872,7 +865,7 @@ xfs_ag_init_headers(
.numblks = BTOBB(mp->m_sb.sb_blocksize),
.ops = &xfs_finobt_buf_ops,
.work = &xfs_btroot_init,
.type = XFS_BTNUM_FINO,
.bc_ops = &xfs_finobt_ops,
.need_init = xfs_has_finobt(mp)
},
{ /* RMAP root block */
......@@ -880,6 +873,7 @@ xfs_ag_init_headers(
.numblks = BTOBB(mp->m_sb.sb_blocksize),
.ops = &xfs_rmapbt_buf_ops,
.work = &xfs_rmaproot_init,
.bc_ops = &xfs_rmapbt_ops,
.need_init = xfs_has_rmapbt(mp)
},
{ /* REFC root block */
......@@ -887,7 +881,7 @@ xfs_ag_init_headers(
.numblks = BTOBB(mp->m_sb.sb_blocksize),
.ops = &xfs_refcountbt_buf_ops,
.work = &xfs_btroot_init,
.type = XFS_BTNUM_REFC,
.bc_ops = &xfs_refcountbt_ops,
.need_init = xfs_has_reflink(mp)
},
{ /* NULL terminating block */
......@@ -905,7 +899,7 @@ xfs_ag_init_headers(
id->daddr = dp->daddr;
id->numblks = dp->numblks;
id->type = dp->type;
id->bc_ops = dp->bc_ops;
error = xfs_ag_init_hdr(mp, id, dp->work, dp->ops);
if (error)
break;
......@@ -950,8 +944,10 @@ xfs_ag_shrink_space(
agf = agfbp->b_addr;
aglen = be32_to_cpu(agi->agi_length);
/* some extra paranoid checks before we shrink the ag */
if (XFS_IS_CORRUPT(mp, agf->agf_length != agi->agi_length))
if (XFS_IS_CORRUPT(mp, agf->agf_length != agi->agi_length)) {
xfs_ag_mark_sick(pag, XFS_SICK_AG_AGF);
return -EFSCORRUPTED;
}
if (delta >= aglen)
return -EINVAL;
......@@ -979,14 +975,23 @@ xfs_ag_shrink_space(
if (error) {
/*
* if extent allocation fails, need to roll the transaction to
* If extent allocation fails, need to roll the transaction to
* ensure that the AGFL fixup has been committed anyway.
*
* We need to hold the AGF across the roll to ensure nothing can
* access the AG for allocation until the shrink is fully
* cleaned up. And due to the resetting of the AG block
* reservation space needing to lock the AGI, we also have to
* hold that so we don't get AGI/AGF lock order inversions in
* the error handling path.
*/
xfs_trans_bhold(*tpp, agfbp);
xfs_trans_bhold(*tpp, agibp);
err2 = xfs_trans_roll(tpp);
if (err2)
return err2;
xfs_trans_bjoin(*tpp, agfbp);
xfs_trans_bjoin(*tpp, agibp);
goto resv_init_out;
}
......
......@@ -36,8 +36,9 @@ struct xfs_perag {
atomic_t pag_active_ref; /* active reference count */
wait_queue_head_t pag_active_wq;/* woken active_ref falls to zero */
unsigned long pag_opstate;
uint8_t pagf_levels[XFS_BTNUM_AGF];
/* # of levels in bno & cnt btree */
uint8_t pagf_bno_level; /* # of levels in bno btree */
uint8_t pagf_cnt_level; /* # of levels in cnt btree */
uint8_t pagf_rmap_level;/* # of levels in rmap btree */
uint32_t pagf_flcount; /* count of blocks in freelist */
xfs_extlen_t pagf_freeblks; /* total free blocks */
xfs_extlen_t pagf_longest; /* longest free space */
......@@ -86,8 +87,10 @@ struct xfs_perag {
* Alternate btree heights so that online repair won't trip the write
* verifiers while rebuilding the AG btrees.
*/
uint8_t pagf_repair_levels[XFS_BTNUM_AGF];
uint8_t pagf_repair_bno_level;
uint8_t pagf_repair_cnt_level;
uint8_t pagf_repair_refcount_level;
uint8_t pagf_repair_rmap_level;
#endif
spinlock_t pag_state_lock;
......@@ -104,9 +107,7 @@ struct xfs_perag {
int pag_ici_reclaimable; /* reclaimable inodes */
unsigned long pag_ici_reclaim_cursor; /* reclaim restart point */
/* buffer cache index */
spinlock_t pag_buf_lock; /* lock for pag_buf_hash */
struct rhashtable pag_buf_hash;
struct xfs_buf_cache pag_bcache;
/* background prealloc block trimming */
struct delayed_work pag_blockgc_work;
......@@ -119,6 +120,9 @@ struct xfs_perag {
* inconsistencies.
*/
struct xfs_defer_drain pag_intents_drain;
/* Hook to feed rmapbt updates to an active online repair. */
struct xfs_hooks pag_rmap_update_hooks;
#endif /* __KERNEL__ */
};
......@@ -331,7 +335,7 @@ struct aghdr_init_data {
/* per header data */
xfs_daddr_t daddr; /* header location */
size_t numblks; /* size of header */
xfs_btnum_t type; /* type of btree root block */
const struct xfs_btree_ops *bc_ops; /* btree ops */
};
int xfs_ag_init_headers(struct xfs_mount *mp, struct aghdr_init_data *id);
......
......@@ -16,6 +16,7 @@
#include "xfs_alloc.h"
#include "xfs_extent_busy.h"
#include "xfs_error.h"
#include "xfs_health.h"
#include "xfs_trace.h"
#include "xfs_trans.h"
#include "xfs_ag.h"
......@@ -23,13 +24,22 @@
static struct kmem_cache *xfs_allocbt_cur_cache;
STATIC struct xfs_btree_cur *
xfs_allocbt_dup_cursor(
xfs_bnobt_dup_cursor(
struct xfs_btree_cur *cur)
{
return xfs_allocbt_init_cursor(cur->bc_mp, cur->bc_tp,
cur->bc_ag.agbp, cur->bc_ag.pag, cur->bc_btnum);
return xfs_bnobt_init_cursor(cur->bc_mp, cur->bc_tp, cur->bc_ag.agbp,
cur->bc_ag.pag);
}
STATIC struct xfs_btree_cur *
xfs_cntbt_dup_cursor(
struct xfs_btree_cur *cur)
{
return xfs_cntbt_init_cursor(cur->bc_mp, cur->bc_tp, cur->bc_ag.agbp,
cur->bc_ag.pag);
}
STATIC void
xfs_allocbt_set_root(
struct xfs_btree_cur *cur,
......@@ -38,13 +48,18 @@ xfs_allocbt_set_root(
{
struct xfs_buf *agbp = cur->bc_ag.agbp;
struct xfs_agf *agf = agbp->b_addr;
int btnum = cur->bc_btnum;
ASSERT(ptr->s != 0);
agf->agf_roots[btnum] = ptr->s;
be32_add_cpu(&agf->agf_levels[btnum], inc);
cur->bc_ag.pag->pagf_levels[btnum] += inc;
if (xfs_btree_is_bno(cur->bc_ops)) {
agf->agf_bno_root = ptr->s;
be32_add_cpu(&agf->agf_bno_level, inc);
cur->bc_ag.pag->pagf_bno_level += inc;
} else {
agf->agf_cnt_root = ptr->s;
be32_add_cpu(&agf->agf_cnt_level, inc);
cur->bc_ag.pag->pagf_cnt_level += inc;
}
xfs_alloc_log_agf(cur->bc_tp, agbp, XFS_AGF_ROOTS | XFS_AGF_LEVELS);
}
......@@ -116,7 +131,7 @@ xfs_allocbt_update_lastrec(
__be32 len;
int numrecs;
ASSERT(cur->bc_btnum == XFS_BTNUM_CNT);
ASSERT(!xfs_btree_is_bno(cur->bc_ops));
switch (reason) {
case LASTREC_UPDATE:
......@@ -226,7 +241,10 @@ xfs_allocbt_init_ptr_from_cur(
ASSERT(cur->bc_ag.pag->pag_agno == be32_to_cpu(agf->agf_seqno));
ptr->s = agf->agf_roots[cur->bc_btnum];
if (xfs_btree_is_bno(cur->bc_ops))
ptr->s = agf->agf_bno_root;
else
ptr->s = agf->agf_cnt_root;
}
STATIC int64_t
......@@ -299,13 +317,12 @@ xfs_allocbt_verify(
struct xfs_perag *pag = bp->b_pag;
xfs_failaddr_t fa;
unsigned int level;
xfs_btnum_t btnum = XFS_BTNUM_BNOi;
if (!xfs_verify_magic(bp, block->bb_magic))
return __this_address;
if (xfs_has_crc(mp)) {
fa = xfs_btree_sblock_v5hdr_verify(bp);
fa = xfs_btree_agblock_v5hdr_verify(bp);
if (fa)
return fa;
}
......@@ -320,26 +337,32 @@ xfs_allocbt_verify(
* against.
*/
level = be16_to_cpu(block->bb_level);
if (bp->b_ops->magic[0] == cpu_to_be32(XFS_ABTC_MAGIC))
btnum = XFS_BTNUM_CNTi;
if (pag && xfs_perag_initialised_agf(pag)) {
unsigned int maxlevel = pag->pagf_levels[btnum];
unsigned int maxlevel, repair_maxlevel = 0;
#ifdef CONFIG_XFS_ONLINE_REPAIR
/*
* Online repair could be rewriting the free space btrees, so
* we'll validate against the larger of either tree while this
* is going on.
*/
maxlevel = max_t(unsigned int, maxlevel,
pag->pagf_repair_levels[btnum]);
if (bp->b_ops->magic[0] == cpu_to_be32(XFS_ABTC_MAGIC)) {
maxlevel = pag->pagf_cnt_level;
#ifdef CONFIG_XFS_ONLINE_REPAIR
repair_maxlevel = pag->pagf_repair_cnt_level;
#endif
} else {
maxlevel = pag->pagf_bno_level;
#ifdef CONFIG_XFS_ONLINE_REPAIR
repair_maxlevel = pag->pagf_repair_bno_level;
#endif
if (level >= maxlevel)
}
if (level >= max(maxlevel, repair_maxlevel))
return __this_address;
} else if (level >= mp->m_alloc_maxlevels)
return __this_address;
return xfs_btree_sblock_verify(bp, mp->m_alloc_mxr[level != 0]);
return xfs_btree_agblock_verify(bp, mp->m_alloc_mxr[level != 0]);
}
static void
......@@ -348,7 +371,7 @@ xfs_allocbt_read_verify(
{
xfs_failaddr_t fa;
if (!xfs_btree_sblock_verify_crc(bp))
if (!xfs_btree_agblock_verify_crc(bp))
xfs_verifier_error(bp, -EFSBADCRC, __this_address);
else {
fa = xfs_allocbt_verify(bp);
......@@ -372,7 +395,7 @@ xfs_allocbt_write_verify(
xfs_verifier_error(bp, -EFSCORRUPTED, fa);
return;
}
xfs_btree_sblock_calc_crc(bp);
xfs_btree_agblock_calc_crc(bp);
}
......@@ -454,11 +477,19 @@ xfs_allocbt_keys_contiguous(
be32_to_cpu(key2->alloc.ar_startblock));
}
static const struct xfs_btree_ops xfs_bnobt_ops = {
const struct xfs_btree_ops xfs_bnobt_ops = {
.name = "bno",
.type = XFS_BTREE_TYPE_AG,
.rec_len = sizeof(xfs_alloc_rec_t),
.key_len = sizeof(xfs_alloc_key_t),
.ptr_len = XFS_BTREE_SHORT_PTR_LEN,
.dup_cursor = xfs_allocbt_dup_cursor,
.lru_refs = XFS_ALLOC_BTREE_REF,
.statoff = XFS_STATS_CALC_INDEX(xs_abtb_2),
.sick_mask = XFS_SICK_AG_BNOBT,
.dup_cursor = xfs_bnobt_dup_cursor,
.set_root = xfs_allocbt_set_root,
.alloc_block = xfs_allocbt_alloc_block,
.free_block = xfs_allocbt_free_block,
......@@ -477,11 +508,20 @@ static const struct xfs_btree_ops xfs_bnobt_ops = {
.keys_contiguous = xfs_allocbt_keys_contiguous,
};
static const struct xfs_btree_ops xfs_cntbt_ops = {
const struct xfs_btree_ops xfs_cntbt_ops = {
.name = "cnt",
.type = XFS_BTREE_TYPE_AG,
.geom_flags = XFS_BTGEO_LASTREC_UPDATE,
.rec_len = sizeof(xfs_alloc_rec_t),
.key_len = sizeof(xfs_alloc_key_t),
.ptr_len = XFS_BTREE_SHORT_PTR_LEN,
.lru_refs = XFS_ALLOC_BTREE_REF,
.statoff = XFS_STATS_CALC_INDEX(xs_abtc_2),
.sick_mask = XFS_SICK_AG_CNTBT,
.dup_cursor = xfs_allocbt_dup_cursor,
.dup_cursor = xfs_cntbt_dup_cursor,
.set_root = xfs_allocbt_set_root,
.alloc_block = xfs_allocbt_alloc_block,
.free_block = xfs_allocbt_free_block,
......@@ -500,76 +540,55 @@ static const struct xfs_btree_ops xfs_cntbt_ops = {
.keys_contiguous = NULL, /* not needed right now */
};
/* Allocate most of a new allocation btree cursor. */
STATIC struct xfs_btree_cur *
xfs_allocbt_init_common(
/*
* Allocate a new bnobt cursor.
*
* For staging cursors tp and agbp are NULL.
*/
struct xfs_btree_cur *
xfs_bnobt_init_cursor(
struct xfs_mount *mp,
struct xfs_trans *tp,
struct xfs_perag *pag,
xfs_btnum_t btnum)
struct xfs_buf *agbp,
struct xfs_perag *pag)
{
struct xfs_btree_cur *cur;
ASSERT(btnum == XFS_BTNUM_BNO || btnum == XFS_BTNUM_CNT);
cur = xfs_btree_alloc_cursor(mp, tp, btnum, mp->m_alloc_maxlevels,
xfs_allocbt_cur_cache);
cur->bc_ag.abt.active = false;
if (btnum == XFS_BTNUM_CNT) {
cur->bc_ops = &xfs_cntbt_ops;
cur->bc_statoff = XFS_STATS_CALC_INDEX(xs_abtc_2);
cur->bc_flags = XFS_BTREE_LASTREC_UPDATE;
} else {
cur->bc_ops = &xfs_bnobt_ops;
cur->bc_statoff = XFS_STATS_CALC_INDEX(xs_abtb_2);
}
cur = xfs_btree_alloc_cursor(mp, tp, &xfs_bnobt_ops,
mp->m_alloc_maxlevels, xfs_allocbt_cur_cache);
cur->bc_ag.pag = xfs_perag_hold(pag);
cur->bc_ag.agbp = agbp;
if (agbp) {
struct xfs_agf *agf = agbp->b_addr;
if (xfs_has_crc(mp))
cur->bc_flags |= XFS_BTREE_CRC_BLOCKS;
cur->bc_nlevels = be32_to_cpu(agf->agf_bno_level);
}
return cur;
}
/*
* Allocate a new allocation btree cursor.
* Allocate a new cntbt cursor.
*
* For staging cursors tp and agbp are NULL.
*/
struct xfs_btree_cur * /* new alloc btree cursor */
xfs_allocbt_init_cursor(
struct xfs_mount *mp, /* file system mount point */
struct xfs_trans *tp, /* transaction pointer */
struct xfs_buf *agbp, /* buffer for agf structure */
struct xfs_perag *pag,
xfs_btnum_t btnum) /* btree identifier */
{
struct xfs_agf *agf = agbp->b_addr;
struct xfs_btree_cur *cur;
cur = xfs_allocbt_init_common(mp, tp, pag, btnum);
if (btnum == XFS_BTNUM_CNT)
cur->bc_nlevels = be32_to_cpu(agf->agf_levels[XFS_BTNUM_CNT]);
else
cur->bc_nlevels = be32_to_cpu(agf->agf_levels[XFS_BTNUM_BNO]);
cur->bc_ag.agbp = agbp;
return cur;
}
/* Create a free space btree cursor with a fake root for staging. */
struct xfs_btree_cur *
xfs_allocbt_stage_cursor(
xfs_cntbt_init_cursor(
struct xfs_mount *mp,
struct xbtree_afakeroot *afake,
struct xfs_perag *pag,
xfs_btnum_t btnum)
struct xfs_trans *tp,
struct xfs_buf *agbp,
struct xfs_perag *pag)
{
struct xfs_btree_cur *cur;
cur = xfs_allocbt_init_common(mp, NULL, pag, btnum);
xfs_btree_stage_afakeroot(cur, afake);
cur = xfs_btree_alloc_cursor(mp, tp, &xfs_cntbt_ops,
mp->m_alloc_maxlevels, xfs_allocbt_cur_cache);
cur->bc_ag.pag = xfs_perag_hold(pag);
cur->bc_ag.agbp = agbp;
if (agbp) {
struct xfs_agf *agf = agbp->b_addr;
cur->bc_nlevels = be32_to_cpu(agf->agf_cnt_level);
}
return cur;
}
......@@ -588,16 +607,16 @@ xfs_allocbt_commit_staged_btree(
ASSERT(cur->bc_flags & XFS_BTREE_STAGING);
agf->agf_roots[cur->bc_btnum] = cpu_to_be32(afake->af_root);
agf->agf_levels[cur->bc_btnum] = cpu_to_be32(afake->af_levels);
xfs_alloc_log_agf(tp, agbp, XFS_AGF_ROOTS | XFS_AGF_LEVELS);
if (cur->bc_btnum == XFS_BTNUM_BNO) {
xfs_btree_commit_afakeroot(cur, tp, agbp, &xfs_bnobt_ops);
if (xfs_btree_is_bno(cur->bc_ops)) {
agf->agf_bno_root = cpu_to_be32(afake->af_root);
agf->agf_bno_level = cpu_to_be32(afake->af_levels);
} else {
cur->bc_flags |= XFS_BTREE_LASTREC_UPDATE;
xfs_btree_commit_afakeroot(cur, tp, agbp, &xfs_cntbt_ops);
agf->agf_cnt_root = cpu_to_be32(afake->af_root);
agf->agf_cnt_level = cpu_to_be32(afake->af_levels);
}
xfs_alloc_log_agf(tp, agbp, XFS_AGF_ROOTS | XFS_AGF_LEVELS);
xfs_btree_commit_afakeroot(cur, tp, agbp);
}
/* Calculate number of records in an alloc btree block. */
......
......@@ -47,12 +47,12 @@ struct xbtree_afakeroot;
(maxrecs) * sizeof(xfs_alloc_key_t) + \
((index) - 1) * sizeof(xfs_alloc_ptr_t)))
extern struct xfs_btree_cur *xfs_allocbt_init_cursor(struct xfs_mount *mp,
struct xfs_btree_cur *xfs_bnobt_init_cursor(struct xfs_mount *mp,
struct xfs_trans *tp, struct xfs_buf *bp,
struct xfs_perag *pag, xfs_btnum_t btnum);
struct xfs_btree_cur *xfs_allocbt_stage_cursor(struct xfs_mount *mp,
struct xbtree_afakeroot *afake, struct xfs_perag *pag,
xfs_btnum_t btnum);
struct xfs_perag *pag);
struct xfs_btree_cur *xfs_cntbt_init_cursor(struct xfs_mount *mp,
struct xfs_trans *tp, struct xfs_buf *bp,
struct xfs_perag *pag);
extern int xfs_allocbt_maxrecs(struct xfs_mount *, int, int);
extern xfs_extlen_t xfs_allocbt_calc_size(struct xfs_mount *mp,
unsigned long long len);
......
......@@ -224,7 +224,7 @@ int
xfs_attr_get_ilocked(
struct xfs_da_args *args)
{
ASSERT(xfs_isilocked(args->dp, XFS_ILOCK_SHARED | XFS_ILOCK_EXCL));
xfs_assert_ilocked(args->dp, XFS_ILOCK_SHARED | XFS_ILOCK_EXCL);
if (!xfs_inode_hasattr(args->dp))
return -ENOATTR;
......@@ -891,7 +891,8 @@ xfs_attr_defer_add(
struct xfs_attr_intent *new;
new = kmem_cache_zalloc(xfs_attr_intent_cache, GFP_NOFS | __GFP_NOFAIL);
new = kmem_cache_zalloc(xfs_attr_intent_cache,
GFP_KERNEL | __GFP_NOFAIL);
new->xattri_op_flags = op_flags;
new->xattri_da_args = args;
......
......@@ -29,6 +29,7 @@
#include "xfs_log.h"
#include "xfs_ag.h"
#include "xfs_errortag.h"
#include "xfs_health.h"
/*
......@@ -879,8 +880,7 @@ xfs_attr_shortform_to_leaf(
trace_xfs_attr_sf_to_leaf(args);
tmpbuffer = kmem_alloc(size, 0);
ASSERT(tmpbuffer != NULL);
tmpbuffer = kmalloc(size, GFP_KERNEL | __GFP_NOFAIL);
memcpy(tmpbuffer, ifp->if_data, size);
sf = (struct xfs_attr_sf_hdr *)tmpbuffer;
......@@ -924,7 +924,7 @@ xfs_attr_shortform_to_leaf(
}
error = 0;
out:
kmem_free(tmpbuffer);
kfree(tmpbuffer);
return error;
}
......@@ -1059,7 +1059,7 @@ xfs_attr3_leaf_to_shortform(
trace_xfs_attr_leaf_to_sf(args);
tmpbuffer = kmem_alloc(args->geo->blksize, 0);
tmpbuffer = kmalloc(args->geo->blksize, GFP_KERNEL | __GFP_NOFAIL);
if (!tmpbuffer)
return -ENOMEM;
......@@ -1125,7 +1125,7 @@ xfs_attr3_leaf_to_shortform(
error = 0;
out:
kmem_free(tmpbuffer);
kfree(tmpbuffer);
return error;
}
......@@ -1533,7 +1533,7 @@ xfs_attr3_leaf_compact(
trace_xfs_attr_leaf_compact(args);
tmpbuffer = kmem_alloc(args->geo->blksize, 0);
tmpbuffer = kmalloc(args->geo->blksize, GFP_KERNEL | __GFP_NOFAIL);
memcpy(tmpbuffer, bp->b_addr, args->geo->blksize);
memset(bp->b_addr, 0, args->geo->blksize);
leaf_src = (xfs_attr_leafblock_t *)tmpbuffer;
......@@ -1571,7 +1571,7 @@ xfs_attr3_leaf_compact(
*/
xfs_trans_log_buf(trans, bp, 0, args->geo->blksize - 1);
kmem_free(tmpbuffer);
kfree(tmpbuffer);
}
/*
......@@ -2250,7 +2250,8 @@ xfs_attr3_leaf_unbalance(
struct xfs_attr_leafblock *tmp_leaf;
struct xfs_attr3_icleaf_hdr tmphdr;
tmp_leaf = kmem_zalloc(state->args->geo->blksize, 0);
tmp_leaf = kzalloc(state->args->geo->blksize,
GFP_KERNEL | __GFP_NOFAIL);
/*
* Copy the header into the temp leaf so that all the stuff
......@@ -2290,7 +2291,7 @@ xfs_attr3_leaf_unbalance(
}
memcpy(save_leaf, tmp_leaf, state->args->geo->blksize);
savehdr = tmphdr; /* struct copy */
kmem_free(tmp_leaf);
kfree(tmp_leaf);
}
xfs_attr3_leaf_hdr_to_disk(state->args->geo, save_leaf, &savehdr);
......@@ -2343,6 +2344,7 @@ xfs_attr3_leaf_lookup_int(
entries = xfs_attr3_leaf_entryp(leaf);
if (ichdr.count >= args->geo->blksize / 8) {
xfs_buf_mark_corrupt(bp);
xfs_da_mark_sick(args);
return -EFSCORRUPTED;
}
......@@ -2362,10 +2364,12 @@ xfs_attr3_leaf_lookup_int(
}
if (!(probe >= 0 && (!ichdr.count || probe < ichdr.count))) {
xfs_buf_mark_corrupt(bp);
xfs_da_mark_sick(args);
return -EFSCORRUPTED;
}
if (!(span <= 4 || be32_to_cpu(entry->hashval) == hashval)) {
xfs_buf_mark_corrupt(bp);
xfs_da_mark_sick(args);
return -EFSCORRUPTED;
}
......
......@@ -22,6 +22,7 @@
#include "xfs_attr_remote.h"
#include "xfs_trace.h"
#include "xfs_error.h"
#include "xfs_health.h"
#define ATTR_RMTVALUE_MAPSIZE 1 /* # of map entries at once */
......@@ -276,17 +277,18 @@ xfs_attr3_rmt_hdr_set(
*/
STATIC int
xfs_attr_rmtval_copyout(
struct xfs_mount *mp,
struct xfs_buf *bp,
xfs_ino_t ino,
int *offset,
int *valuelen,
uint8_t **dst)
struct xfs_mount *mp,
struct xfs_buf *bp,
struct xfs_inode *dp,
int *offset,
int *valuelen,
uint8_t **dst)
{
char *src = bp->b_addr;
xfs_daddr_t bno = xfs_buf_daddr(bp);
int len = BBTOB(bp->b_length);
int blksize = mp->m_attr_geo->blksize;
char *src = bp->b_addr;
xfs_ino_t ino = dp->i_ino;
xfs_daddr_t bno = xfs_buf_daddr(bp);
int len = BBTOB(bp->b_length);
int blksize = mp->m_attr_geo->blksize;
ASSERT(len >= blksize);
......@@ -302,6 +304,7 @@ xfs_attr_rmtval_copyout(
xfs_alert(mp,
"remote attribute header mismatch bno/off/len/owner (0x%llx/0x%x/Ox%x/0x%llx)",
bno, *offset, byte_cnt, ino);
xfs_dirattr_mark_sick(dp, XFS_ATTR_FORK);
return -EFSCORRUPTED;
}
hdr_size = sizeof(struct xfs_attr3_rmt_hdr);
......@@ -418,10 +421,12 @@ xfs_attr_rmtval_get(
dblkcnt = XFS_FSB_TO_BB(mp, map[i].br_blockcount);
error = xfs_buf_read(mp->m_ddev_targp, dblkno, dblkcnt,
0, &bp, &xfs_attr3_rmt_buf_ops);
if (xfs_metadata_is_sick(error))
xfs_dirattr_mark_sick(args->dp, XFS_ATTR_FORK);
if (error)
return error;
error = xfs_attr_rmtval_copyout(mp, bp, args->dp->i_ino,
error = xfs_attr_rmtval_copyout(mp, bp, args->dp,
&offset, &valuelen,
&dst);
xfs_buf_relse(bp);
......@@ -545,11 +550,13 @@ xfs_attr_rmtval_stale(
struct xfs_buf *bp;
int error;
ASSERT(xfs_isilocked(ip, XFS_ILOCK_EXCL));
xfs_assert_ilocked(ip, XFS_ILOCK_EXCL);
if (XFS_IS_CORRUPT(mp, map->br_startblock == DELAYSTARTBLOCK) ||
XFS_IS_CORRUPT(mp, map->br_startblock == HOLESTARTBLOCK))
XFS_IS_CORRUPT(mp, map->br_startblock == HOLESTARTBLOCK)) {
xfs_bmap_mark_sick(ip, XFS_ATTR_FORK);
return -EFSCORRUPTED;
}
error = xfs_buf_incore(mp->m_ddev_targp,
XFS_FSB_TO_DADDR(mp, map->br_startblock),
......@@ -659,8 +666,10 @@ xfs_attr_rmtval_invalidate(
blkcnt, &map, &nmap, XFS_BMAPI_ATTRFORK);
if (error)
return error;
if (XFS_IS_CORRUPT(args->dp->i_mount, nmap != 1))
if (XFS_IS_CORRUPT(args->dp->i_mount, nmap != 1)) {
xfs_bmap_mark_sick(args->dp, XFS_ATTR_FORK);
return -EFSCORRUPTED;
}
error = xfs_attr_rmtval_stale(args->dp, &map, XBF_TRYLOCK);
if (error)
return error;
......
......@@ -232,6 +232,10 @@ enum xfs_bmap_intent_type {
XFS_BMAP_UNMAP,
};
#define XFS_BMAP_INTENT_STRINGS \
{ XFS_BMAP_MAP, "map" }, \
{ XFS_BMAP_UNMAP, "unmap" }
struct xfs_bmap_intent {
struct list_head bi_list;
enum xfs_bmap_intent_type bi_type;
......@@ -241,14 +245,11 @@ struct xfs_bmap_intent {
struct xfs_bmbt_irec bi_bmap;
};
void xfs_bmap_update_get_group(struct xfs_mount *mp,
struct xfs_bmap_intent *bi);
int xfs_bmap_finish_one(struct xfs_trans *tp, struct xfs_bmap_intent *bi);
void xfs_bmap_map_extent(struct xfs_trans *tp, struct xfs_inode *ip,
struct xfs_bmbt_irec *imap);
int whichfork, struct xfs_bmbt_irec *imap);
void xfs_bmap_unmap_extent(struct xfs_trans *tp, struct xfs_inode *ip,
struct xfs_bmbt_irec *imap);
int whichfork, struct xfs_bmbt_irec *imap);
static inline uint32_t xfs_bmap_fork_to_state(int whichfork)
{
......@@ -280,4 +281,12 @@ extern struct kmem_cache *xfs_bmap_intent_cache;
int __init xfs_bmap_intent_init_cache(void);
void xfs_bmap_intent_destroy_cache(void);
typedef int (*xfs_bmap_query_range_fn)(
struct xfs_btree_cur *cur,
struct xfs_bmbt_irec *rec,
void *priv);
int xfs_bmap_query_all(struct xfs_btree_cur *cur, xfs_bmap_query_range_fn fn,
void *priv);
#endif /* __XFS_BMAP_H__ */
......@@ -26,6 +26,22 @@
static struct kmem_cache *xfs_bmbt_cur_cache;
void
xfs_bmbt_init_block(
struct xfs_inode *ip,
struct xfs_btree_block *buf,
struct xfs_buf *bp,
__u16 level,
__u16 numrecs)
{
if (bp)
xfs_btree_init_buf(ip->i_mount, bp, &xfs_bmbt_ops, level,
numrecs, ip->i_ino);
else
xfs_btree_init_block(ip->i_mount, buf, &xfs_bmbt_ops, level,
numrecs, ip->i_ino);
}
/*
* Convert on-disk form of btree root to in-memory form.
*/
......@@ -44,9 +60,7 @@ xfs_bmdr_to_bmbt(
xfs_bmbt_key_t *tkp;
__be64 *tpp;
xfs_btree_init_block_int(mp, rblock, XFS_BUF_DADDR_NULL,
XFS_BTNUM_BMAP, 0, 0, ip->i_ino,
XFS_BTREE_LONG_PTRS);
xfs_bmbt_init_block(ip, rblock, NULL, 0, 0);
rblock->bb_level = dblock->bb_level;
ASSERT(be16_to_cpu(rblock->bb_level) > 0);
rblock->bb_numrecs = dblock->bb_numrecs;
......@@ -171,13 +185,8 @@ xfs_bmbt_dup_cursor(
new = xfs_bmbt_init_cursor(cur->bc_mp, cur->bc_tp,
cur->bc_ino.ip, cur->bc_ino.whichfork);
/*
* Copy the firstblock, dfops, and flags values,
* since init cursor doesn't get them.
*/
new->bc_ino.flags = cur->bc_ino.flags;
new->bc_flags |= (cur->bc_flags &
(XFS_BTREE_BMBT_INVALID_OWNER | XFS_BTREE_BMBT_WASDEL));
return new;
}
......@@ -189,10 +198,10 @@ xfs_bmbt_update_cursor(
ASSERT((dst->bc_tp->t_highest_agno != NULLAGNUMBER) ||
(dst->bc_ino.ip->i_diflags & XFS_DIFLAG_REALTIME));
dst->bc_ino.allocated += src->bc_ino.allocated;
dst->bc_bmap.allocated += src->bc_bmap.allocated;
dst->bc_tp->t_highest_agno = src->bc_tp->t_highest_agno;
src->bc_ino.allocated = 0;
src->bc_bmap.allocated = 0;
}
STATIC int
......@@ -211,7 +220,7 @@ xfs_bmbt_alloc_block(
xfs_rmap_ino_bmbt_owner(&args.oinfo, cur->bc_ino.ip->i_ino,
cur->bc_ino.whichfork);
args.minlen = args.maxlen = args.prod = 1;
args.wasdel = cur->bc_ino.flags & XFS_BTCUR_BMBT_WASDEL;
args.wasdel = cur->bc_flags & XFS_BTREE_BMBT_WASDEL;
if (!args.wasdel && args.tp->t_blk_res == 0)
return -ENOSPC;
......@@ -247,7 +256,7 @@ xfs_bmbt_alloc_block(
}
ASSERT(args.len == 1);
cur->bc_ino.allocated++;
cur->bc_bmap.allocated++;
cur->bc_ino.ip->i_nblocks++;
xfs_trans_log_inode(args.tp, cur->bc_ino.ip, XFS_ILOG_CORE);
xfs_trans_mod_dquot_byino(args.tp, cur->bc_ino.ip,
......@@ -360,14 +369,6 @@ xfs_bmbt_init_rec_from_cur(
xfs_bmbt_disk_set_all(&rec->bmbt, &cur->bc_rec.b);
}
STATIC void
xfs_bmbt_init_ptr_from_cur(
struct xfs_btree_cur *cur,
union xfs_btree_ptr *ptr)
{
ptr->l = 0;
}
STATIC int64_t
xfs_bmbt_key_diff(
struct xfs_btree_cur *cur,
......@@ -419,7 +420,7 @@ xfs_bmbt_verify(
* XXX: need a better way of verifying the owner here. Right now
* just make sure there has been one set.
*/
fa = xfs_btree_lblock_v5hdr_verify(bp, XFS_RMAP_OWN_UNKNOWN);
fa = xfs_btree_fsblock_v5hdr_verify(bp, XFS_RMAP_OWN_UNKNOWN);
if (fa)
return fa;
}
......@@ -435,7 +436,7 @@ xfs_bmbt_verify(
if (level > max(mp->m_bm_maxlevels[0], mp->m_bm_maxlevels[1]))
return __this_address;
return xfs_btree_lblock_verify(bp, mp->m_bmap_dmxr[level != 0]);
return xfs_btree_fsblock_verify(bp, mp->m_bmap_dmxr[level != 0]);
}
static void
......@@ -444,7 +445,7 @@ xfs_bmbt_read_verify(
{
xfs_failaddr_t fa;
if (!xfs_btree_lblock_verify_crc(bp))
if (!xfs_btree_fsblock_verify_crc(bp))
xfs_verifier_error(bp, -EFSBADCRC, __this_address);
else {
fa = xfs_bmbt_verify(bp);
......@@ -468,7 +469,7 @@ xfs_bmbt_write_verify(
xfs_verifier_error(bp, -EFSCORRUPTED, fa);
return;
}
xfs_btree_lblock_calc_crc(bp);
xfs_btree_fsblock_calc_crc(bp);
}
const struct xfs_buf_ops xfs_bmbt_buf_ops = {
......@@ -515,9 +516,16 @@ xfs_bmbt_keys_contiguous(
be64_to_cpu(key2->bmbt.br_startoff));
}
static const struct xfs_btree_ops xfs_bmbt_ops = {
const struct xfs_btree_ops xfs_bmbt_ops = {
.name = "bmap",
.type = XFS_BTREE_TYPE_INODE,
.rec_len = sizeof(xfs_bmbt_rec_t),
.key_len = sizeof(xfs_bmbt_key_t),
.ptr_len = XFS_BTREE_LONG_PTR_LEN,
.lru_refs = XFS_BMAP_BTREE_REF,
.statoff = XFS_STATS_CALC_INDEX(xs_bmbt_2),
.dup_cursor = xfs_bmbt_dup_cursor,
.update_cursor = xfs_bmbt_update_cursor,
......@@ -529,7 +537,6 @@ static const struct xfs_btree_ops xfs_bmbt_ops = {
.init_key_from_rec = xfs_bmbt_init_key_from_rec,
.init_high_key_from_rec = xfs_bmbt_init_high_key_from_rec,
.init_rec_from_cur = xfs_bmbt_init_rec_from_cur,
.init_ptr_from_cur = xfs_bmbt_init_ptr_from_cur,
.key_diff = xfs_bmbt_key_diff,
.diff_two_keys = xfs_bmbt_diff_two_keys,
.buf_ops = &xfs_bmbt_buf_ops,
......@@ -538,35 +545,10 @@ static const struct xfs_btree_ops xfs_bmbt_ops = {
.keys_contiguous = xfs_bmbt_keys_contiguous,
};
static struct xfs_btree_cur *
xfs_bmbt_init_common(
struct xfs_mount *mp,
struct xfs_trans *tp,
struct xfs_inode *ip,
int whichfork)
{
struct xfs_btree_cur *cur;
ASSERT(whichfork != XFS_COW_FORK);
cur = xfs_btree_alloc_cursor(mp, tp, XFS_BTNUM_BMAP,
mp->m_bm_maxlevels[whichfork], xfs_bmbt_cur_cache);
cur->bc_statoff = XFS_STATS_CALC_INDEX(xs_bmbt_2);
cur->bc_ops = &xfs_bmbt_ops;
cur->bc_flags = XFS_BTREE_LONG_PTRS | XFS_BTREE_ROOT_IN_INODE;
if (xfs_has_crc(mp))
cur->bc_flags |= XFS_BTREE_CRC_BLOCKS;
cur->bc_ino.ip = ip;
cur->bc_ino.allocated = 0;
cur->bc_ino.flags = 0;
return cur;
}
/*
* Allocate a new bmap btree cursor.
* Create a new bmap btree cursor.
*
* For staging cursors -1 in passed in whichfork.
*/
struct xfs_btree_cur *
xfs_bmbt_init_cursor(
......@@ -575,15 +557,34 @@ xfs_bmbt_init_cursor(
struct xfs_inode *ip,
int whichfork)
{
struct xfs_ifork *ifp = xfs_ifork_ptr(ip, whichfork);
struct xfs_btree_cur *cur;
unsigned int maxlevels;
cur = xfs_bmbt_init_common(mp, tp, ip, whichfork);
ASSERT(whichfork != XFS_COW_FORK);
cur->bc_nlevels = be16_to_cpu(ifp->if_broot->bb_level) + 1;
cur->bc_ino.forksize = xfs_inode_fork_size(ip, whichfork);
/*
* The Data fork always has larger maxlevel, so use that for staging
* cursors.
*/
switch (whichfork) {
case XFS_STAGING_FORK:
maxlevels = mp->m_bm_maxlevels[XFS_DATA_FORK];
break;
default:
maxlevels = mp->m_bm_maxlevels[whichfork];
break;
}
cur = xfs_btree_alloc_cursor(mp, tp, &xfs_bmbt_ops, maxlevels,
xfs_bmbt_cur_cache);
cur->bc_ino.ip = ip;
cur->bc_ino.whichfork = whichfork;
cur->bc_bmap.allocated = 0;
if (whichfork != XFS_STAGING_FORK) {
struct xfs_ifork *ifp = xfs_ifork_ptr(ip, whichfork);
cur->bc_nlevels = be16_to_cpu(ifp->if_broot->bb_level) + 1;
cur->bc_ino.forksize = xfs_inode_fork_size(ip, whichfork);
}
return cur;
}
......@@ -598,33 +599,6 @@ xfs_bmbt_block_maxrecs(
return blocklen / (sizeof(xfs_bmbt_key_t) + sizeof(xfs_bmbt_ptr_t));
}
/*
* Allocate a new bmap btree cursor for reloading an inode block mapping data
* structure. Note that callers can use the staged cursor to reload extents
* format inode forks if they rebuild the iext tree and commit the staged
* cursor immediately.
*/
struct xfs_btree_cur *
xfs_bmbt_stage_cursor(
struct xfs_mount *mp,
struct xfs_inode *ip,
struct xbtree_ifakeroot *ifake)
{
struct xfs_btree_cur *cur;
struct xfs_btree_ops *ops;
/* data fork always has larger maxheight */
cur = xfs_bmbt_init_common(mp, NULL, ip, XFS_DATA_FORK);
cur->bc_nlevels = ifake->if_levels;
cur->bc_ino.forksize = ifake->if_fork_size;
/* Don't let anyone think we're attached to the real fork yet. */
cur->bc_ino.whichfork = -1;
xfs_btree_stage_ifakeroot(cur, ifake, &ops);
ops->update_cursor = NULL;
return cur;
}
/*
* Swap in the new inode fork root. Once we pass this point the newly rebuilt
* mappings are in place and we have to kill off any old btree blocks.
......@@ -665,7 +639,7 @@ xfs_bmbt_commit_staged_btree(
break;
}
xfs_trans_log_inode(tp, cur->bc_ino.ip, flags);
xfs_btree_commit_ifakeroot(cur, tp, whichfork, &xfs_bmbt_ops);
xfs_btree_commit_ifakeroot(cur, tp, whichfork);
}
/*
......@@ -751,7 +725,7 @@ xfs_bmbt_change_owner(
ASSERT(xfs_ifork_ptr(ip, whichfork)->if_format == XFS_DINODE_FMT_BTREE);
cur = xfs_bmbt_init_cursor(ip->i_mount, tp, ip, whichfork);
cur->bc_ino.flags |= XFS_BTCUR_BMBT_INVALID_OWNER;
cur->bc_flags |= XFS_BTREE_BMBT_INVALID_OWNER;
error = xfs_btree_change_owner(cur, new_owner, buffer_list);
xfs_btree_del_cursor(cur, error);
......
......@@ -107,8 +107,6 @@ extern int xfs_bmbt_change_owner(struct xfs_trans *tp, struct xfs_inode *ip,
extern struct xfs_btree_cur *xfs_bmbt_init_cursor(struct xfs_mount *,
struct xfs_trans *, struct xfs_inode *, int);
struct xfs_btree_cur *xfs_bmbt_stage_cursor(struct xfs_mount *mp,
struct xfs_inode *ip, struct xbtree_ifakeroot *ifake);
void xfs_bmbt_commit_staged_btree(struct xfs_btree_cur *cur,
struct xfs_trans *tp, int whichfork);
......@@ -120,4 +118,7 @@ unsigned int xfs_bmbt_maxlevels_ondisk(void);
int __init xfs_bmbt_init_cur_cache(void);
void xfs_bmbt_destroy_cur_cache(void);
void xfs_bmbt_init_block(struct xfs_inode *ip, struct xfs_btree_block *buf,
struct xfs_buf *bp, __u16 level, __u16 numrecs);
#endif /* __XFS_BMAP_BTREE_H__ */
/* SPDX-License-Identifier: GPL-2.0 */
/*
* Copyright (c) 2021-2024 Oracle. All Rights Reserved.
* Author: Darrick J. Wong <djwong@kernel.org>
*/
#include "xfs.h"
#include "xfs_fs.h"
#include "xfs_shared.h"
#include "xfs_format.h"
#include "xfs_log_format.h"
#include "xfs_trans_resv.h"
#include "xfs_mount.h"
#include "xfs_trans.h"
#include "xfs_btree.h"
#include "xfs_error.h"
#include "xfs_buf_mem.h"
#include "xfs_btree_mem.h"
#include "xfs_ag.h"
#include "xfs_buf_item.h"
#include "xfs_trace.h"
/* Set the root of an in-memory btree. */
void
xfbtree_set_root(
struct xfs_btree_cur *cur,
const union xfs_btree_ptr *ptr,
int inc)
{
ASSERT(cur->bc_ops->type == XFS_BTREE_TYPE_MEM);
cur->bc_mem.xfbtree->root = *ptr;
cur->bc_mem.xfbtree->nlevels += inc;
}
/* Initialize a pointer from the in-memory btree header. */
void
xfbtree_init_ptr_from_cur(
struct xfs_btree_cur *cur,
union xfs_btree_ptr *ptr)
{
ASSERT(cur->bc_ops->type == XFS_BTREE_TYPE_MEM);
*ptr = cur->bc_mem.xfbtree->root;
}
/* Duplicate an in-memory btree cursor. */
struct xfs_btree_cur *
xfbtree_dup_cursor(
struct xfs_btree_cur *cur)
{
struct xfs_btree_cur *ncur;
ASSERT(cur->bc_ops->type == XFS_BTREE_TYPE_MEM);
ncur = xfs_btree_alloc_cursor(cur->bc_mp, cur->bc_tp, cur->bc_ops,
cur->bc_maxlevels, cur->bc_cache);
ncur->bc_flags = cur->bc_flags;
ncur->bc_nlevels = cur->bc_nlevels;
ncur->bc_mem.xfbtree = cur->bc_mem.xfbtree;
if (cur->bc_mem.pag)
ncur->bc_mem.pag = xfs_perag_hold(cur->bc_mem.pag);
return ncur;
}
/* Close the btree xfile and release all resources. */
void
xfbtree_destroy(
struct xfbtree *xfbt)
{
xfs_buftarg_drain(xfbt->target);
}
/* Compute the number of bytes available for records. */
static inline unsigned int
xfbtree_rec_bytes(
struct xfs_mount *mp,
const struct xfs_btree_ops *ops)
{
return XMBUF_BLOCKSIZE - XFS_BTREE_LBLOCK_CRC_LEN;
}
/* Initialize an empty leaf block as the btree root. */
STATIC int
xfbtree_init_leaf_block(
struct xfs_mount *mp,
struct xfbtree *xfbt,
const struct xfs_btree_ops *ops)
{
struct xfs_buf *bp;
xfbno_t bno = xfbt->highest_bno++;
int error;
error = xfs_buf_get(xfbt->target, xfbno_to_daddr(bno), XFBNO_BBSIZE,
&bp);
if (error)
return error;
trace_xfbtree_create_root_buf(xfbt, bp);
bp->b_ops = ops->buf_ops;
xfs_btree_init_buf(mp, bp, ops, 0, 0, xfbt->owner);
xfs_buf_relse(bp);
xfbt->root.l = cpu_to_be64(bno);
return 0;
}
/*
* Create an in-memory btree root that can be used with the given xmbuf.
* Callers must set xfbt->owner.
*/
int
xfbtree_init(
struct xfs_mount *mp,
struct xfbtree *xfbt,
struct xfs_buftarg *btp,
const struct xfs_btree_ops *ops)
{
unsigned int blocklen = xfbtree_rec_bytes(mp, ops);
unsigned int keyptr_len;
int error;
/* Requires a long-format CRC-format btree */
if (!xfs_has_crc(mp)) {
ASSERT(xfs_has_crc(mp));
return -EINVAL;
}
if (ops->ptr_len != XFS_BTREE_LONG_PTR_LEN) {
ASSERT(ops->ptr_len == XFS_BTREE_LONG_PTR_LEN);
return -EINVAL;
}
memset(xfbt, 0, sizeof(*xfbt));
xfbt->target = btp;
/* Set up min/maxrecs for this btree. */
keyptr_len = ops->key_len + sizeof(__be64);
xfbt->maxrecs[0] = blocklen / ops->rec_len;
xfbt->maxrecs[1] = blocklen / keyptr_len;
xfbt->minrecs[0] = xfbt->maxrecs[0] / 2;
xfbt->minrecs[1] = xfbt->maxrecs[1] / 2;
xfbt->highest_bno = 0;
xfbt->nlevels = 1;
/* Initialize the empty btree. */
error = xfbtree_init_leaf_block(mp, xfbt, ops);
if (error)
goto err_freesp;
trace_xfbtree_init(mp, xfbt, ops);
return 0;
err_freesp:
xfs_buftarg_drain(xfbt->target);
return error;
}
/* Allocate a block to our in-memory btree. */
int
xfbtree_alloc_block(
struct xfs_btree_cur *cur,
const union xfs_btree_ptr *start,
union xfs_btree_ptr *new,
int *stat)
{
struct xfbtree *xfbt = cur->bc_mem.xfbtree;
xfbno_t bno = xfbt->highest_bno++;
ASSERT(cur->bc_ops->type == XFS_BTREE_TYPE_MEM);
trace_xfbtree_alloc_block(xfbt, cur, bno);
/* Fail if the block address exceeds the maximum for the buftarg. */
if (!xfbtree_verify_bno(xfbt, bno)) {
ASSERT(xfbtree_verify_bno(xfbt, bno));
*stat = 0;
return 0;
}
new->l = cpu_to_be64(bno);
*stat = 1;
return 0;
}
/* Free a block from our in-memory btree. */
int
xfbtree_free_block(
struct xfs_btree_cur *cur,
struct xfs_buf *bp)
{
struct xfbtree *xfbt = cur->bc_mem.xfbtree;
xfs_daddr_t daddr = xfs_buf_daddr(bp);
xfbno_t bno = xfs_daddr_to_xfbno(daddr);
ASSERT(cur->bc_ops->type == XFS_BTREE_TYPE_MEM);
trace_xfbtree_free_block(xfbt, cur, bno);
if (bno + 1 == xfbt->highest_bno)
xfbt->highest_bno--;
return 0;
}
/* Return the minimum number of records for a btree block. */
int
xfbtree_get_minrecs(
struct xfs_btree_cur *cur,
int level)
{
struct xfbtree *xfbt = cur->bc_mem.xfbtree;
return xfbt->minrecs[level != 0];
}
/* Return the maximum number of records for a btree block. */
int
xfbtree_get_maxrecs(
struct xfs_btree_cur *cur,
int level)
{
struct xfbtree *xfbt = cur->bc_mem.xfbtree;
return xfbt->maxrecs[level != 0];
}
/* If this log item is a buffer item that came from the xfbtree, return it. */
static inline struct xfs_buf *
xfbtree_buf_match(
struct xfbtree *xfbt,
const struct xfs_log_item *lip)
{
const struct xfs_buf_log_item *bli;
struct xfs_buf *bp;
if (lip->li_type != XFS_LI_BUF)
return NULL;
bli = container_of(lip, struct xfs_buf_log_item, bli_item);
bp = bli->bli_buf;
if (bp->b_target != xfbt->target)
return NULL;
return bp;
}
/*
* Commit changes to the incore btree immediately by writing all dirty xfbtree
* buffers to the backing xfile. This detaches all xfbtree buffers from the
* transaction, even on failure. The buffer locks are dropped between the
* delwri queue and submit, so the caller must synchronize btree access.
*
* Normally we'd let the buffers commit with the transaction and get written to
* the xfile via the log, but online repair stages ephemeral btrees in memory
* and uses the btree_staging functions to write new btrees to disk atomically.
* The in-memory btree (and its backing store) are discarded at the end of the
* repair phase, which means that xfbtree buffers cannot commit with the rest
* of a transaction.
*
* In other words, online repair only needs the transaction to collect buffer
* pointers and to avoid buffer deadlocks, not to guarantee consistency of
* updates.
*/
int
xfbtree_trans_commit(
struct xfbtree *xfbt,
struct xfs_trans *tp)
{
struct xfs_log_item *lip, *n;
bool tp_dirty = false;
int error = 0;
/*
* For each xfbtree buffer attached to the transaction, write the dirty
* buffers to the xfile and release them.
*/
list_for_each_entry_safe(lip, n, &tp->t_items, li_trans) {
struct xfs_buf *bp = xfbtree_buf_match(xfbt, lip);
if (!bp) {
if (test_bit(XFS_LI_DIRTY, &lip->li_flags))
tp_dirty |= true;
continue;
}
trace_xfbtree_trans_commit_buf(xfbt, bp);
xmbuf_trans_bdetach(tp, bp);
/*
* If the buffer fails verification, note the failure but
* continue walking the transaction items so that we remove all
* ephemeral btree buffers.
*/
if (!error)
error = xmbuf_finalize(bp);
xfs_buf_relse(bp);
}
/*
* Reset the transaction's dirty flag to reflect the dirty state of the
* log items that are still attached.
*/
tp->t_flags = (tp->t_flags & ~XFS_TRANS_DIRTY) |
(tp_dirty ? XFS_TRANS_DIRTY : 0);
return error;
}
/*
* Cancel changes to the incore btree by detaching all the xfbtree buffers.
* Changes are not undone, so callers must not access the btree ever again.
*/
void
xfbtree_trans_cancel(
struct xfbtree *xfbt,
struct xfs_trans *tp)
{
struct xfs_log_item *lip, *n;
bool tp_dirty = false;
list_for_each_entry_safe(lip, n, &tp->t_items, li_trans) {
struct xfs_buf *bp = xfbtree_buf_match(xfbt, lip);
if (!bp) {
if (test_bit(XFS_LI_DIRTY, &lip->li_flags))
tp_dirty |= true;
continue;
}
trace_xfbtree_trans_cancel_buf(xfbt, bp);
xmbuf_trans_bdetach(tp, bp);
xfs_buf_relse(bp);
}
/*
* Reset the transaction's dirty flag to reflect the dirty state of the
* log items that are still attached.
*/
tp->t_flags = (tp->t_flags & ~XFS_TRANS_DIRTY) |
(tp_dirty ? XFS_TRANS_DIRTY : 0);
}
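
For context, a hedged sketch of the calling pattern the two functions above
imply for online repair; the staging step is illustrative:

	int	error, stat;

	/* Stage updates; this dirties xfbtree buffers attached to tp. */
	error = xfs_btree_insert(mem_cur, &stat);
	if (error)
		/* Detach the buffers; the in-memory btree is now unusable. */
		xfbtree_trans_cancel(xfbt, tp);
	else
		/* Flush the dirty xfbtree buffers to the backing xfile. */
		error = xfbtree_trans_commit(xfbt, tp);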
/* SPDX-License-Identifier: GPL-2.0 */
/*
* Copyright (c) 2021-2024 Oracle. All Rights Reserved.
* Author: Darrick J. Wong <djwong@kernel.org>
*/
#ifndef __XFS_BTREE_MEM_H__
#define __XFS_BTREE_MEM_H__
typedef uint64_t xfbno_t;
#define XFBNO_BLOCKSIZE (XMBUF_BLOCKSIZE)
#define XFBNO_BBSHIFT (XMBUF_BLOCKSHIFT - BBSHIFT)
#define XFBNO_BBSIZE (XFBNO_BLOCKSIZE >> BBSHIFT)
static inline xfs_daddr_t xfbno_to_daddr(xfbno_t blkno)
{
return blkno << XFBNO_BBSHIFT;
}
static inline xfbno_t xfs_daddr_to_xfbno(xfs_daddr_t daddr)
{
return daddr >> XFBNO_BBSHIFT;
}
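
/*
 * Worked example, assuming 4096-byte in-memory blocks (XMBUF_BLOCKSHIFT == 12,
 * BBSHIFT == 9, hence XFBNO_BBSHIFT == 3): block 2 maps to daddr 2 << 3 = 16,
 * i.e. byte offset 16 * 512 = 8192 = 2 * 4096, and back via 16 >> 3 = 2.
 */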
struct xfbtree {
/* buffer cache target for this in-memory btree */
struct xfs_buftarg *target;
/* Highest block number that has been written to. */
xfbno_t highest_bno;
/* Owner of this btree. */
unsigned long long owner;
/* Btree header */
union xfs_btree_ptr root;
unsigned int nlevels;
/* Minimum and maximum records per block. */
unsigned int maxrecs[2];
unsigned int minrecs[2];
};
#ifdef CONFIG_XFS_BTREE_IN_MEM
static inline bool xfbtree_verify_bno(struct xfbtree *xfbt, xfbno_t bno)
{
return xmbuf_verify_daddr(xfbt->target, xfbno_to_daddr(bno));
}
void xfbtree_set_root(struct xfs_btree_cur *cur,
const union xfs_btree_ptr *ptr, int inc);
void xfbtree_init_ptr_from_cur(struct xfs_btree_cur *cur,
union xfs_btree_ptr *ptr);
struct xfs_btree_cur *xfbtree_dup_cursor(struct xfs_btree_cur *cur);
int xfbtree_get_minrecs(struct xfs_btree_cur *cur, int level);
int xfbtree_get_maxrecs(struct xfs_btree_cur *cur, int level);
int xfbtree_alloc_block(struct xfs_btree_cur *cur,
const union xfs_btree_ptr *start, union xfs_btree_ptr *ptr,
int *stat);
int xfbtree_free_block(struct xfs_btree_cur *cur, struct xfs_buf *bp);
/* Callers must set xfbt->target and xfbt->owner before calling this */
int xfbtree_init(struct xfs_mount *mp, struct xfbtree *xfbt,
struct xfs_buftarg *btp, const struct xfs_btree_ops *ops);
void xfbtree_destroy(struct xfbtree *xfbt);
int xfbtree_trans_commit(struct xfbtree *xfbt, struct xfs_trans *tp);
void xfbtree_trans_cancel(struct xfbtree *xfbt, struct xfs_trans *tp);
#else
# define xfbtree_verify_bno(...) (false)
#endif /* CONFIG_XFS_BTREE_IN_MEM */
#endif /* __XFS_BTREE_MEM_H__ */
......@@ -38,63 +38,6 @@
* specific btree type to commit the new btree into the filesystem.
*/
/*
* Don't allow staging cursors to be duplicated because they're supposed to be
* kept private to a single thread.
*/
STATIC struct xfs_btree_cur *
xfs_btree_fakeroot_dup_cursor(
struct xfs_btree_cur *cur)
{
ASSERT(0);
return NULL;
}
/*
* Don't allow block allocation for a staging cursor, because staging cursors
* do not support regular btree modifications.
*
* Bulk loading uses a separate callback to obtain new blocks from a
* preallocated list, which prevents ENOSPC failures during loading.
*/
STATIC int
xfs_btree_fakeroot_alloc_block(
struct xfs_btree_cur *cur,
const union xfs_btree_ptr *start_bno,
union xfs_btree_ptr *new_bno,
int *stat)
{
ASSERT(0);
return -EFSCORRUPTED;
}
/*
* Don't allow block freeing for a staging cursor, because staging cursors
* do not support regular btree modifications.
*/
STATIC int
xfs_btree_fakeroot_free_block(
struct xfs_btree_cur *cur,
struct xfs_buf *bp)
{
ASSERT(0);
return -EFSCORRUPTED;
}
/* Initialize a pointer to the root block from the fakeroot. */
STATIC void
xfs_btree_fakeroot_init_ptr_from_cur(
struct xfs_btree_cur *cur,
union xfs_btree_ptr *ptr)
{
struct xbtree_afakeroot *afake;
ASSERT(cur->bc_flags & XFS_BTREE_STAGING);
afake = cur->bc_ag.afake;
ptr->s = cpu_to_be32(afake->af_root);
}
/*
* Bulk Loading for AG Btrees
* ==========================
......@@ -109,47 +52,20 @@ xfs_btree_fakeroot_init_ptr_from_cur(
* cursor into a regular btree cursor.
*/
/* Update the btree root information for a per-AG fake root. */
STATIC void
xfs_btree_afakeroot_set_root(
struct xfs_btree_cur *cur,
const union xfs_btree_ptr *ptr,
int inc)
{
struct xbtree_afakeroot *afake = cur->bc_ag.afake;
ASSERT(cur->bc_flags & XFS_BTREE_STAGING);
afake->af_root = be32_to_cpu(ptr->s);
afake->af_levels += inc;
}
/*
* Initialize an AG-rooted btree cursor with the given AG btree fake root.
* The btree cursor's bc_ops will be overridden as needed to make the staging
* functionality work.
*/
void
xfs_btree_stage_afakeroot(
struct xfs_btree_cur *cur,
struct xbtree_afakeroot *afake)
{
struct xfs_btree_ops *nops;
ASSERT(!(cur->bc_flags & XFS_BTREE_STAGING));
ASSERT(!(cur->bc_flags & XFS_BTREE_ROOT_IN_INODE));
ASSERT(cur->bc_ops->type != XFS_BTREE_TYPE_INODE);
ASSERT(cur->bc_tp == NULL);
nops = kmem_alloc(sizeof(struct xfs_btree_ops), KM_NOFS);
memcpy(nops, cur->bc_ops, sizeof(struct xfs_btree_ops));
nops->alloc_block = xfs_btree_fakeroot_alloc_block;
nops->free_block = xfs_btree_fakeroot_free_block;
nops->init_ptr_from_cur = xfs_btree_fakeroot_init_ptr_from_cur;
nops->set_root = xfs_btree_afakeroot_set_root;
nops->dup_cursor = xfs_btree_fakeroot_dup_cursor;
cur->bc_ag.afake = afake;
cur->bc_nlevels = afake->af_levels;
cur->bc_ops = nops;
cur->bc_flags |= XFS_BTREE_STAGING;
}
......@@ -163,17 +79,15 @@ void
xfs_btree_commit_afakeroot(
struct xfs_btree_cur *cur,
struct xfs_trans *tp,
struct xfs_buf *agbp,
const struct xfs_btree_ops *ops)
struct xfs_buf *agbp)
{
ASSERT(cur->bc_flags & XFS_BTREE_STAGING);
ASSERT(cur->bc_tp == NULL);
trace_xfs_btree_commit_afakeroot(cur);
kmem_free((void *)cur->bc_ops);
cur->bc_ag.afake = NULL;
cur->bc_ag.agbp = agbp;
cur->bc_ops = ops;
cur->bc_flags &= ~XFS_BTREE_STAGING;
cur->bc_tp = tp;
}
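With the ops pointer now owned by the generic btree code rather than duplicated per staging cursor, rebuilding an AG-rooted btree reduces to stage, load, commit. A hedged sketch of that flow (the bload callback wiring and the preallocation of new blocks are elided):

/* Illustrative flow only; bbl->get_records and bbl->claim_block setup,
 * and the reservation of bbl->nr_blocks new blocks, are elided. */
static int example_rebuild_ag_btree(struct xfs_btree_cur *cur,
		struct xfs_trans *tp, struct xfs_buf *agbp,
		struct xfs_btree_bload *bbl, uint64_t nr_records, void *priv)
{
	struct xbtree_afakeroot afake = { };
	int error;

	xfs_btree_stage_afakeroot(cur, &afake);

	error = xfs_btree_bload_compute_geometry(cur, bbl, nr_records);
	if (error)
		return error;

	/* ...reserve bbl->nr_blocks blocks for bbl->claim_block here... */

	error = xfs_btree_bload(cur, bbl, priv);
	if (error)
		return error;

	/* Flip the AG header to the new root; cur becomes a live cursor. */
	xfs_btree_commit_afakeroot(cur, tp, agbp);
	return 0;
}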
......@@ -211,29 +125,16 @@ xfs_btree_commit_afakeroot(
void
xfs_btree_stage_ifakeroot(
struct xfs_btree_cur *cur,
struct xbtree_ifakeroot *ifake,
struct xfs_btree_ops **new_ops)
struct xbtree_ifakeroot *ifake)
{
struct xfs_btree_ops *nops;
ASSERT(!(cur->bc_flags & XFS_BTREE_STAGING));
ASSERT(cur->bc_flags & XFS_BTREE_ROOT_IN_INODE);
ASSERT(cur->bc_ops->type == XFS_BTREE_TYPE_INODE);
ASSERT(cur->bc_tp == NULL);
nops = kmem_alloc(sizeof(struct xfs_btree_ops), KM_NOFS);
memcpy(nops, cur->bc_ops, sizeof(struct xfs_btree_ops));
nops->alloc_block = xfs_btree_fakeroot_alloc_block;
nops->free_block = xfs_btree_fakeroot_free_block;
nops->init_ptr_from_cur = xfs_btree_fakeroot_init_ptr_from_cur;
nops->dup_cursor = xfs_btree_fakeroot_dup_cursor;
cur->bc_ino.ifake = ifake;
cur->bc_nlevels = ifake->if_levels;
cur->bc_ops = nops;
cur->bc_ino.forksize = ifake->if_fork_size;
cur->bc_flags |= XFS_BTREE_STAGING;
if (new_ops)
*new_ops = nops;
}
/*
......@@ -246,18 +147,15 @@ void
xfs_btree_commit_ifakeroot(
struct xfs_btree_cur *cur,
struct xfs_trans *tp,
int whichfork,
const struct xfs_btree_ops *ops)
int whichfork)
{
ASSERT(cur->bc_flags & XFS_BTREE_STAGING);
ASSERT(cur->bc_tp == NULL);
trace_xfs_btree_commit_ifakeroot(cur);
kmem_free((void *)cur->bc_ops);
cur->bc_ino.ifake = NULL;
cur->bc_ino.whichfork = whichfork;
cur->bc_ops = ops;
cur->bc_flags &= ~XFS_BTREE_STAGING;
cur->bc_tp = tp;
}
......@@ -397,8 +295,7 @@ xfs_btree_bload_prep_block(
struct xfs_btree_block *new_block;
int ret;
if ((cur->bc_flags & XFS_BTREE_ROOT_IN_INODE) &&
level == cur->bc_nlevels - 1) {
if (xfs_btree_at_iroot(cur, level)) {
struct xfs_ifork *ifp = xfs_btree_ifork_ptr(cur);
size_t new_size;
......@@ -406,14 +303,12 @@ xfs_btree_bload_prep_block(
/* Allocate a new incore btree root block. */
new_size = bbl->iroot_size(cur, level, nr_this_block, priv);
ifp->if_broot = kmem_zalloc(new_size, 0);
ifp->if_broot = kzalloc(new_size, GFP_KERNEL | __GFP_NOFAIL);
ifp->if_broot_bytes = (int)new_size;
/* Initialize it and send it out. */
xfs_btree_init_block_int(cur->bc_mp, ifp->if_broot,
XFS_BUF_DADDR_NULL, cur->bc_btnum, level,
nr_this_block, cur->bc_ino.ip->i_ino,
cur->bc_flags);
xfs_btree_init_block(cur->bc_mp, ifp->if_broot, cur->bc_ops,
level, nr_this_block, cur->bc_ino.ip->i_ino);
*bpp = NULL;
*blockp = ifp->if_broot;
......@@ -704,7 +599,7 @@ xfs_btree_bload_compute_geometry(
xfs_btree_bload_level_geometry(cur, bbl, level, nr_this_level,
&avg_per_block, &level_blocks, &dontcare64);
if (cur->bc_flags & XFS_BTREE_ROOT_IN_INODE) {
if (cur->bc_ops->type == XFS_BTREE_TYPE_INODE) {
/*
* If all the items we want to store at this level
* would fit in the inode root block, then we have our
......@@ -763,7 +658,7 @@ xfs_btree_bload_compute_geometry(
return -EOVERFLOW;
bbl->btree_height = cur->bc_nlevels;
if (cur->bc_flags & XFS_BTREE_ROOT_IN_INODE)
if (cur->bc_ops->type == XFS_BTREE_TYPE_INODE)
bbl->nr_blocks = nr_blocks - 1;
else
bbl->nr_blocks = nr_blocks;
......@@ -890,7 +785,7 @@ xfs_btree_bload(
}
/* Initialize the new root. */
if (cur->bc_flags & XFS_BTREE_ROOT_IN_INODE) {
if (cur->bc_ops->type == XFS_BTREE_TYPE_INODE) {
ASSERT(xfs_btree_ptr_is_null(cur, &ptr));
cur->bc_ino.ifake->if_levels = cur->bc_nlevels;
cur->bc_ino.ifake->if_blocks = total_blocks - 1;
......
......@@ -22,7 +22,7 @@ struct xbtree_afakeroot {
void xfs_btree_stage_afakeroot(struct xfs_btree_cur *cur,
struct xbtree_afakeroot *afake);
void xfs_btree_commit_afakeroot(struct xfs_btree_cur *cur, struct xfs_trans *tp,
struct xfs_buf *agbp, const struct xfs_btree_ops *ops);
struct xfs_buf *agbp);
/* Fake root for an inode-rooted btree. */
struct xbtree_ifakeroot {
......@@ -41,10 +41,9 @@ struct xbtree_ifakeroot {
/* Cursor interactions with fake roots for inode-rooted btrees. */
void xfs_btree_stage_ifakeroot(struct xfs_btree_cur *cur,
struct xbtree_ifakeroot *ifake,
struct xfs_btree_ops **new_ops);
struct xbtree_ifakeroot *ifake);
void xfs_btree_commit_ifakeroot(struct xfs_btree_cur *cur, struct xfs_trans *tp,
int whichfork, const struct xfs_btree_ops *ops);
int whichfork);
/* Bulk loading of staged btrees. */
typedef int (*xfs_btree_bload_get_records_fn)(struct xfs_btree_cur *cur,
......@@ -76,8 +75,7 @@ struct xfs_btree_bload {
/*
* This function should return the size of the in-core btree root
* block. It is only necessary for XFS_BTREE_ROOT_IN_INODE btree
* types.
* block. It is only necessary for XFS_BTREE_TYPE_INODE btrees.
*/
xfs_btree_bload_iroot_size_fn iroot_size;
......
......@@ -23,6 +23,7 @@
#include "xfs_buf_item.h"
#include "xfs_log.h"
#include "xfs_errortag.h"
#include "xfs_health.h"
/*
* xfs_da_btree.c
......@@ -85,7 +86,8 @@ xfs_da_state_alloc(
{
struct xfs_da_state *state;
state = kmem_cache_zalloc(xfs_da_state_cache, GFP_NOFS | __GFP_NOFAIL);
state = kmem_cache_zalloc(xfs_da_state_cache,
GFP_KERNEL | __GFP_NOLOCKDEP | __GFP_NOFAIL);
state->args = args;
state->mp = args->dp->i_mount;
return state;
......@@ -352,6 +354,8 @@ const struct xfs_buf_ops xfs_da3_node_buf_ops = {
static int
xfs_da3_node_set_type(
struct xfs_trans *tp,
struct xfs_inode *dp,
int whichfork,
struct xfs_buf *bp)
{
struct xfs_da_blkinfo *info = bp->b_addr;
......@@ -373,6 +377,7 @@ xfs_da3_node_set_type(
XFS_CORRUPTION_ERROR(__func__, XFS_ERRLEVEL_LOW, tp->t_mountp,
info, sizeof(*info));
xfs_trans_brelse(tp, bp);
xfs_dirattr_mark_sick(dp, whichfork);
return -EFSCORRUPTED;
}
}
......@@ -391,7 +396,7 @@ xfs_da3_node_read(
&xfs_da3_node_buf_ops);
if (error || !*bpp || !tp)
return error;
return xfs_da3_node_set_type(tp, *bpp);
return xfs_da3_node_set_type(tp, dp, whichfork, *bpp);
}
int
......@@ -408,6 +413,8 @@ xfs_da3_node_read_mapped(
error = xfs_trans_read_buf(mp, tp, mp->m_ddev_targp, mappedbno,
XFS_FSB_TO_BB(mp, xfs_dabuf_nfsb(mp, whichfork)), 0,
bpp, &xfs_da3_node_buf_ops);
if (xfs_metadata_is_sick(error))
xfs_dirattr_mark_sick(dp, whichfork);
if (error || !*bpp)
return error;
......@@ -418,7 +425,7 @@ xfs_da3_node_read_mapped(
if (!tp)
return 0;
return xfs_da3_node_set_type(tp, *bpp);
return xfs_da3_node_set_type(tp, dp, whichfork, *bpp);
}
/*
......@@ -631,6 +638,7 @@ xfs_da3_split(
if (node->hdr.info.forw) {
if (be32_to_cpu(node->hdr.info.forw) != addblk->blkno) {
xfs_buf_mark_corrupt(oldblk->bp);
xfs_da_mark_sick(state->args);
error = -EFSCORRUPTED;
goto out;
}
......@@ -644,6 +652,7 @@ xfs_da3_split(
if (node->hdr.info.back) {
if (be32_to_cpu(node->hdr.info.back) != addblk->blkno) {
xfs_buf_mark_corrupt(oldblk->bp);
xfs_da_mark_sick(state->args);
error = -EFSCORRUPTED;
goto out;
}
......@@ -1635,6 +1644,7 @@ xfs_da3_node_lookup_int(
if (magic != XFS_DA_NODE_MAGIC && magic != XFS_DA3_NODE_MAGIC) {
xfs_buf_mark_corrupt(blk->bp);
xfs_da_mark_sick(args);
return -EFSCORRUPTED;
}
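This mark-corrupt / mark-sick / -EFSCORRUPTED triple recurs throughout the rest of this file; conceptually, each instance is an expansion of the following hypothetical helper:

/* Hypothetical condensation of the idiom repeated below: flag the
 * buffer, report the damage to the health tracking subsystem, fail. */
static int example_da_corrupt(struct xfs_da_args *args, struct xfs_buf *bp)
{
	xfs_buf_mark_corrupt(bp);
	xfs_da_mark_sick(args);
	return -EFSCORRUPTED;
}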
......@@ -1650,6 +1660,7 @@ xfs_da3_node_lookup_int(
/* Tree taller than we can handle; bail out! */
if (nodehdr.level >= XFS_DA_NODE_MAXDEPTH) {
xfs_buf_mark_corrupt(blk->bp);
xfs_da_mark_sick(args);
return -EFSCORRUPTED;
}
......@@ -1658,6 +1669,7 @@ xfs_da3_node_lookup_int(
expected_level = nodehdr.level - 1;
else if (expected_level != nodehdr.level) {
xfs_buf_mark_corrupt(blk->bp);
xfs_da_mark_sick(args);
return -EFSCORRUPTED;
} else
expected_level--;
......@@ -1709,12 +1721,16 @@ xfs_da3_node_lookup_int(
}
/* We can't point back to the root. */
if (XFS_IS_CORRUPT(dp->i_mount, blkno == args->geo->leafblk))
if (XFS_IS_CORRUPT(dp->i_mount, blkno == args->geo->leafblk)) {
xfs_da_mark_sick(args);
return -EFSCORRUPTED;
}
}
if (XFS_IS_CORRUPT(dp->i_mount, expected_level != 0))
if (XFS_IS_CORRUPT(dp->i_mount, expected_level != 0)) {
xfs_da_mark_sick(args);
return -EFSCORRUPTED;
}
/*
* A leaf block that ends in the hashval that we are interested in
......@@ -1732,6 +1748,7 @@ xfs_da3_node_lookup_int(
args->blkno = blk->blkno;
} else {
ASSERT(0);
xfs_da_mark_sick(args);
return -EFSCORRUPTED;
}
if (((retval == -ENOENT) || (retval == -ENOATTR)) &&
......@@ -2182,7 +2199,8 @@ xfs_da_grow_inode_int(
* If we didn't get it and the block might work if fragmented,
* try without the CONTIG flag. Loop until we get it all.
*/
mapp = kmem_alloc(sizeof(*mapp) * count, 0);
mapp = kmalloc(sizeof(*mapp) * count,
GFP_KERNEL | __GFP_NOFAIL);
for (b = *bno, mapi = 0; b < *bno + count; ) {
c = (int)(*bno + count - b);
nmap = min(XFS_BMAP_MAX_NMAP, c);
......@@ -2219,7 +2237,7 @@ xfs_da_grow_inode_int(
out_free_map:
if (mapp != &map)
kmem_free(mapp);
kfree(mapp);
return error;
}
......@@ -2297,8 +2315,10 @@ xfs_da3_swap_lastblock(
error = xfs_bmap_last_before(tp, dp, &lastoff, w);
if (error)
return error;
if (XFS_IS_CORRUPT(mp, lastoff == 0))
if (XFS_IS_CORRUPT(mp, lastoff == 0)) {
xfs_da_mark_sick(args);
return -EFSCORRUPTED;
}
/*
* Read the last block in the btree space.
*/
......@@ -2348,6 +2368,7 @@ xfs_da3_swap_lastblock(
if (XFS_IS_CORRUPT(mp,
be32_to_cpu(sib_info->forw) != last_blkno ||
sib_info->magic != dead_info->magic)) {
xfs_da_mark_sick(args);
error = -EFSCORRUPTED;
goto done;
}
......@@ -2368,6 +2389,7 @@ xfs_da3_swap_lastblock(
if (XFS_IS_CORRUPT(mp,
be32_to_cpu(sib_info->back) != last_blkno ||
sib_info->magic != dead_info->magic)) {
xfs_da_mark_sick(args);
error = -EFSCORRUPTED;
goto done;
}
......@@ -2390,6 +2412,7 @@ xfs_da3_swap_lastblock(
xfs_da3_node_hdr_from_disk(dp->i_mount, &par_hdr, par_node);
if (XFS_IS_CORRUPT(mp,
level >= 0 && level != par_hdr.level + 1)) {
xfs_da_mark_sick(args);
error = -EFSCORRUPTED;
goto done;
}
......@@ -2401,6 +2424,7 @@ xfs_da3_swap_lastblock(
entno++)
continue;
if (XFS_IS_CORRUPT(mp, entno == par_hdr.count)) {
xfs_da_mark_sick(args);
error = -EFSCORRUPTED;
goto done;
}
......@@ -2426,6 +2450,7 @@ xfs_da3_swap_lastblock(
xfs_trans_brelse(tp, par_buf);
par_buf = NULL;
if (XFS_IS_CORRUPT(mp, par_blkno == 0)) {
xfs_da_mark_sick(args);
error = -EFSCORRUPTED;
goto done;
}
......@@ -2435,6 +2460,7 @@ xfs_da3_swap_lastblock(
par_node = par_buf->b_addr;
xfs_da3_node_hdr_from_disk(dp->i_mount, &par_hdr, par_node);
if (XFS_IS_CORRUPT(mp, par_hdr.level != level)) {
xfs_da_mark_sick(args);
error = -EFSCORRUPTED;
goto done;
}
......@@ -2518,7 +2544,8 @@ xfs_dabuf_map(
int error = 0, nirecs, i;
if (nfsb > 1)
irecs = kmem_zalloc(sizeof(irec) * nfsb, KM_NOFS);
irecs = kzalloc(sizeof(irec) * nfsb,
GFP_KERNEL | __GFP_NOLOCKDEP | __GFP_NOFAIL);
nirecs = nfsb;
error = xfs_bmapi_read(dp, bno, nfsb, irecs, &nirecs,
......@@ -2531,7 +2558,8 @@ xfs_dabuf_map(
* larger one that needs to be freed by the caller.
*/
if (nirecs > 1) {
map = kmem_zalloc(nirecs * sizeof(struct xfs_buf_map), KM_NOFS);
map = kzalloc(nirecs * sizeof(struct xfs_buf_map),
GFP_KERNEL | __GFP_NOLOCKDEP | __GFP_NOFAIL);
if (!map) {
error = -ENOMEM;
goto out_free_irecs;
......@@ -2557,12 +2585,13 @@ xfs_dabuf_map(
*nmaps = nirecs;
out_free_irecs:
if (irecs != &irec)
kmem_free(irecs);
kfree(irecs);
return error;
invalid_mapping:
/* Caller ok with no mapping. */
if (XFS_IS_CORRUPT(mp, !(flags & XFS_DABUF_MAP_HOLE_OK))) {
xfs_dirattr_mark_sick(dp, whichfork);
error = -EFSCORRUPTED;
if (xfs_error_level >= XFS_ERRLEVEL_LOW) {
xfs_alert(mp, "%s: bno %u inode %llu",
......@@ -2613,7 +2642,7 @@ xfs_da_get_buf(
out_free:
if (mapp != &map)
kmem_free(mapp);
kfree(mapp);
return error;
}
......@@ -2644,6 +2673,8 @@ xfs_da_read_buf(
error = xfs_trans_read_buf_map(mp, tp, mp->m_ddev_targp, mapp, nmap, 0,
&bp, ops);
if (xfs_metadata_is_sick(error))
xfs_dirattr_mark_sick(dp, whichfork);
if (error)
goto out_free;
......@@ -2654,7 +2685,7 @@ xfs_da_read_buf(
*bpp = bp;
out_free:
if (mapp != &map)
kmem_free(mapp);
kfree(mapp);
return error;
}
......@@ -2685,7 +2716,7 @@ xfs_da_reada_buf(
out_free:
if (mapp != &map)
kmem_free(mapp);
kfree(mapp);
return error;
}
......@@ -159,6 +159,17 @@ struct xfs_da3_intnode {
#define XFS_DIR3_FT_MAX 9
#define XFS_DIR3_FTYPE_STR \
{ XFS_DIR3_FT_UNKNOWN, "unknown" }, \
{ XFS_DIR3_FT_REG_FILE, "file" }, \
{ XFS_DIR3_FT_DIR, "directory" }, \
{ XFS_DIR3_FT_CHRDEV, "char" }, \
{ XFS_DIR3_FT_BLKDEV, "block" }, \
{ XFS_DIR3_FT_FIFO, "fifo" }, \
{ XFS_DIR3_FT_SOCK, "sock" }, \
{ XFS_DIR3_FT_SYMLINK, "symlink" }, \
{ XFS_DIR3_FT_WHT, "whiteout" }
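The name/value pairs above are laid out in the form consumed by __print_symbolic() in trace events; for illustration, the same table drives a plain lookup like the following sketch (the helper name and array are hypothetical):

/* Hypothetical helper over the same ftype/name pairs. */
static const char *example_ftype_name(unsigned int ftype)
{
	static const char * const names[] = {
		"unknown", "file", "directory", "char", "block",
		"fifo", "sock", "symlink", "whiteout",
	};

	if (ftype >= XFS_DIR3_FT_MAX)
		return "?";
	return names[ftype];
}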
/*
* Byte offset in data block and shortform entry.
*/
......
......@@ -819,16 +819,16 @@ xfs_defer_can_append(
/* Create a new pending item at the end of the transaction list. */
static inline struct xfs_defer_pending *
xfs_defer_alloc(
struct xfs_trans *tp,
struct list_head *dfops,
const struct xfs_defer_op_type *ops)
{
struct xfs_defer_pending *dfp;
dfp = kmem_cache_zalloc(xfs_defer_pending_cache,
GFP_NOFS | __GFP_NOFAIL);
GFP_KERNEL | __GFP_NOFAIL);
dfp->dfp_ops = ops;
INIT_LIST_HEAD(&dfp->dfp_work);
list_add_tail(&dfp->dfp_list, &tp->t_dfops);
list_add_tail(&dfp->dfp_list, dfops);
return dfp;
}
......@@ -846,7 +846,7 @@ xfs_defer_add(
dfp = xfs_defer_find_last(tp, ops);
if (!dfp || !xfs_defer_can_append(dfp, ops))
dfp = xfs_defer_alloc(tp, ops);
dfp = xfs_defer_alloc(&tp->t_dfops, ops);
xfs_defer_add_item(dfp, li);
trace_xfs_defer_add_item(tp->t_mountp, dfp, li);
......@@ -870,7 +870,7 @@ xfs_defer_add_barrier(
if (dfp)
return;
xfs_defer_alloc(tp, &xfs_barrier_defer_type);
xfs_defer_alloc(&tp->t_dfops, &xfs_barrier_defer_type);
trace_xfs_defer_add_item(tp->t_mountp, dfp, NULL);
}
......@@ -885,14 +885,9 @@ xfs_defer_start_recovery(
struct list_head *r_dfops,
const struct xfs_defer_op_type *ops)
{
struct xfs_defer_pending *dfp;
struct xfs_defer_pending *dfp = xfs_defer_alloc(r_dfops, ops);
dfp = kmem_cache_zalloc(xfs_defer_pending_cache,
GFP_NOFS | __GFP_NOFAIL);
dfp->dfp_ops = ops;
dfp->dfp_intent = lip;
INIT_LIST_HEAD(&dfp->dfp_work);
list_add_tail(&dfp->dfp_list, r_dfops);
}
/*
......@@ -979,7 +974,7 @@ xfs_defer_ops_capture(
return ERR_PTR(error);
/* Create an object to capture the defer ops. */
dfc = kmem_zalloc(sizeof(*dfc), KM_NOFS);
dfc = kzalloc(sizeof(*dfc), GFP_KERNEL | __GFP_NOFAIL);
INIT_LIST_HEAD(&dfc->dfc_list);
INIT_LIST_HEAD(&dfc->dfc_dfops);
......@@ -1011,7 +1006,7 @@ xfs_defer_ops_capture(
* transaction.
*/
for (i = 0; i < dfc->dfc_held.dr_inos; i++) {
ASSERT(xfs_isilocked(dfc->dfc_held.dr_ip[i], XFS_ILOCK_EXCL));
xfs_assert_ilocked(dfc->dfc_held.dr_ip[i], XFS_ILOCK_EXCL);
ihold(VFS_I(dfc->dfc_held.dr_ip[i]));
}
......@@ -1038,7 +1033,7 @@ xfs_defer_ops_capture_abort(
for (i = 0; i < dfc->dfc_held.dr_inos; i++)
xfs_irele(dfc->dfc_held.dr_ip[i]);
kmem_free(dfc);
kfree(dfc);
}
/*
......@@ -1114,7 +1109,7 @@ xfs_defer_ops_continue(
list_splice_init(&dfc->dfc_dfops, &tp->t_dfops);
tp->t_flags |= dfc->dfc_tpflags;
kmem_free(dfc);
kfree(dfc);
}
/* Release the resources captured and continued during recovery. */
......
......@@ -18,6 +18,7 @@
#include "xfs_errortag.h"
#include "xfs_error.h"
#include "xfs_trace.h"
#include "xfs_health.h"
const struct xfs_name xfs_name_dotdot = {
.name = (const unsigned char *)"..",
......@@ -25,6 +26,12 @@ const struct xfs_name xfs_name_dotdot = {
.type = XFS_DIR3_FT_DIR,
};
const struct xfs_name xfs_name_dot = {
.name = (const unsigned char *)".",
.len = 1,
.type = XFS_DIR3_FT_DIR,
};
/*
* Convert inode mode to directory entry filetype
*/
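The body of that conversion is elided by this hunk; for context, it is essentially a switch on the S_IFMT mode bits, along these lines (a sketch, not the literal diff content):

/* Sketch of the mode -> on-disk ftype mapping described above. */
static unsigned char example_mode_to_ftype(int mode)
{
	switch (mode & S_IFMT) {
	case S_IFREG:	return XFS_DIR3_FT_REG_FILE;
	case S_IFDIR:	return XFS_DIR3_FT_DIR;
	case S_IFCHR:	return XFS_DIR3_FT_CHRDEV;
	case S_IFBLK:	return XFS_DIR3_FT_BLKDEV;
	case S_IFIFO:	return XFS_DIR3_FT_FIFO;
	case S_IFSOCK:	return XFS_DIR3_FT_SOCK;
	case S_IFLNK:	return XFS_DIR3_FT_SYMLINK;
	default:	return XFS_DIR3_FT_UNKNOWN;
	}
}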
......@@ -104,13 +111,13 @@ xfs_da_mount(
ASSERT(mp->m_sb.sb_versionnum & XFS_SB_VERSION_DIRV2BIT);
ASSERT(xfs_dir2_dirblock_bytes(&mp->m_sb) <= XFS_MAX_BLOCKSIZE);
mp->m_dir_geo = kmem_zalloc(sizeof(struct xfs_da_geometry),
KM_MAYFAIL);
mp->m_attr_geo = kmem_zalloc(sizeof(struct xfs_da_geometry),
KM_MAYFAIL);
mp->m_dir_geo = kzalloc(sizeof(struct xfs_da_geometry),
GFP_KERNEL | __GFP_RETRY_MAYFAIL);
mp->m_attr_geo = kzalloc(sizeof(struct xfs_da_geometry),
GFP_KERNEL | __GFP_RETRY_MAYFAIL);
if (!mp->m_dir_geo || !mp->m_attr_geo) {
kmem_free(mp->m_dir_geo);
kmem_free(mp->m_attr_geo);
kfree(mp->m_dir_geo);
kfree(mp->m_attr_geo);
return -ENOMEM;
}
......@@ -178,8 +185,8 @@ void
xfs_da_unmount(
struct xfs_mount *mp)
{
kmem_free(mp->m_dir_geo);
kmem_free(mp->m_attr_geo);
kfree(mp->m_dir_geo);
kfree(mp->m_attr_geo);
}
/*
......@@ -236,7 +243,7 @@ xfs_dir_init(
if (error)
return error;
args = kmem_zalloc(sizeof(*args), KM_NOFS);
args = kzalloc(sizeof(*args), GFP_KERNEL | __GFP_NOFAIL);
if (!args)
return -ENOMEM;
......@@ -244,7 +251,7 @@ xfs_dir_init(
args->dp = dp;
args->trans = tp;
error = xfs_dir2_sf_create(args, pdp->i_ino);
kmem_free(args);
kfree(args);
return error;
}
......@@ -273,7 +280,7 @@ xfs_dir_createname(
XFS_STATS_INC(dp->i_mount, xs_dir_create);
}
args = kmem_zalloc(sizeof(*args), KM_NOFS);
args = kzalloc(sizeof(*args), GFP_KERNEL | __GFP_NOFAIL);
if (!args)
return -ENOMEM;
......@@ -313,7 +320,7 @@ xfs_dir_createname(
rval = xfs_dir2_node_addname(args);
out_free:
kmem_free(args);
kfree(args);
return rval;
}
......@@ -333,7 +340,8 @@ xfs_dir_cilookup_result(
!(args->op_flags & XFS_DA_OP_CILOOKUP))
return -EEXIST;
args->value = kmem_alloc(len, KM_NOFS | KM_MAYFAIL);
args->value = kmalloc(len,
GFP_KERNEL | __GFP_NOLOCKDEP | __GFP_RETRY_MAYFAIL);
if (!args->value)
return -ENOMEM;
......@@ -364,15 +372,8 @@ xfs_dir_lookup(
ASSERT(S_ISDIR(VFS_I(dp)->i_mode));
XFS_STATS_INC(dp->i_mount, xs_dir_lookup);
/*
* We need to use KM_NOFS here so that lockdep will not throw false
* positive deadlock warnings on a non-transactional lookup path. It is
* safe to recurse into inode reclaim in that case, but lockdep can't
* easily be taught about it. Hence KM_NOFS avoids having to add a
* bunch of lockdep class
* annotations into the reclaim path for the ilock.
*/
args = kmem_zalloc(sizeof(*args), KM_NOFS);
args = kzalloc(sizeof(*args),
GFP_KERNEL | __GFP_NOLOCKDEP | __GFP_NOFAIL);
args->geo = dp->i_mount->m_dir_geo;
args->name = name->name;
args->namelen = name->len;
......@@ -419,7 +420,7 @@ xfs_dir_lookup(
}
out_free:
xfs_iunlock(dp, lock_mode);
kmem_free(args);
kfree(args);
return rval;
}
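These KM_NOFS-to-GFP_KERNEL conversions rely on transaction contexts already holding the scoped NOFS state set by memalloc_nofs_save(), so individual call sites can pass plain GFP_KERNEL (with __GFP_NOLOCKDEP where the old KM_NOFS existed only to silence lockdep false positives). A minimal sketch of the scoped API, assuming a caller that is not already inside such a window:

/* Minimal sketch: inside the save/restore window, every allocation
 * behaves as if GFP_NOFS had been passed explicitly. */
#include <linux/sched/mm.h>
#include <linux/slab.h>

static void *example_alloc_in_fs_context(size_t size)
{
	unsigned int nofs_flag = memalloc_nofs_save();
	void *p = kzalloc(size, GFP_KERNEL);

	memalloc_nofs_restore(nofs_flag);
	return p;
}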
......@@ -441,7 +442,7 @@ xfs_dir_removename(
ASSERT(S_ISDIR(VFS_I(dp)->i_mode));
XFS_STATS_INC(dp->i_mount, xs_dir_remove);
args = kmem_zalloc(sizeof(*args), KM_NOFS);
args = kzalloc(sizeof(*args), GFP_KERNEL | __GFP_NOFAIL);
if (!args)
return -ENOMEM;
......@@ -477,7 +478,7 @@ xfs_dir_removename(
else
rval = xfs_dir2_node_removename(args);
out_free:
kmem_free(args);
kfree(args);
return rval;
}
......@@ -502,7 +503,7 @@ xfs_dir_replace(
if (rval)
return rval;
args = kmem_zalloc(sizeof(*args), KM_NOFS);
args = kzalloc(sizeof(*args), GFP_KERNEL | __GFP_NOFAIL);
if (!args)
return -ENOMEM;
......@@ -538,7 +539,7 @@ xfs_dir_replace(
else
rval = xfs_dir2_node_replace(args);
out_free:
kmem_free(args);
kfree(args);
return rval;
}
......@@ -626,8 +627,10 @@ xfs_dir2_isblock(
return 0;
*isblock = true;
if (XFS_IS_CORRUPT(mp, args->dp->i_disk_size != args->geo->blksize))
if (XFS_IS_CORRUPT(mp, args->dp->i_disk_size != args->geo->blksize)) {
xfs_da_mark_sick(args);
return -EFSCORRUPTED;
}
return 0;
}
......
......@@ -22,6 +22,19 @@ struct xfs_dir3_icfree_hdr;
struct xfs_dir3_icleaf_hdr;
extern const struct xfs_name xfs_name_dotdot;
extern const struct xfs_name xfs_name_dot;
static inline bool
xfs_dir2_samename(
const struct xfs_name *n1,
const struct xfs_name *n2)
{
if (n1 == n2)
return true;
if (n1->len != n2->len)
return false;
return !memcmp(n1->name, n2->name, n1->len);
}
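The pointer-equality short-circuit matters because callers can pass the shared xfs_name_dot / xfs_name_dotdot singletons declared above; a hypothetical caller:

/* Hypothetical caller: detect "." and ".." entries by name. */
static bool example_is_dot_entry(const struct xfs_name *name)
{
	return xfs_dir2_samename(name, &xfs_name_dot) ||
	       xfs_dir2_samename(name, &xfs_name_dotdot);
}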
/*
* Convert inode mode to directory entry filetype
......
......@@ -20,6 +20,7 @@
#include "xfs_error.h"
#include "xfs_trace.h"
#include "xfs_log.h"
#include "xfs_health.h"
/*
* Local function prototypes.
......@@ -152,6 +153,7 @@ xfs_dir3_block_read(
__xfs_buf_mark_corrupt(*bpp, fa);
xfs_trans_brelse(tp, *bpp);
*bpp = NULL;
xfs_dirattr_mark_sick(dp, XFS_DATA_FORK);
return -EFSCORRUPTED;
}
......@@ -1108,7 +1110,7 @@ xfs_dir2_sf_to_block(
* Copy the directory into a temporary buffer.
* Then pitch the incore inode data so we can make extents.
*/
sfp = kmem_alloc(ifp->if_bytes, 0);
sfp = kmalloc(ifp->if_bytes, GFP_KERNEL | __GFP_NOFAIL);
memcpy(sfp, oldsfp, ifp->if_bytes);
xfs_idata_realloc(dp, -ifp->if_bytes, XFS_DATA_FORK);
......@@ -1253,7 +1255,7 @@ xfs_dir2_sf_to_block(
sfep = xfs_dir2_sf_nextentry(mp, sfp, sfep);
}
/* Done with the temporary buffer */
kmem_free(sfp);
kfree(sfp);
/*
* Sort the leaf entries by hash value.
*/
......@@ -1268,6 +1270,6 @@ xfs_dir2_sf_to_block(
xfs_dir3_data_check(dp, bp);
return 0;
out_free:
kmem_free(sfp);
kfree(sfp);
return error;
}
......@@ -18,6 +18,7 @@
#include "xfs_trans.h"
#include "xfs_buf_item.h"
#include "xfs_log.h"
#include "xfs_health.h"
static xfs_failaddr_t xfs_dir2_data_freefind_verify(
struct xfs_dir2_data_hdr *hdr, struct xfs_dir2_data_free *bf,
......@@ -433,6 +434,7 @@ xfs_dir3_data_read(
__xfs_buf_mark_corrupt(*bpp, fa);
xfs_trans_brelse(tp, *bpp);
*bpp = NULL;
xfs_dirattr_mark_sick(dp, XFS_DATA_FORK);
return -EFSCORRUPTED;
}
......@@ -1198,6 +1200,7 @@ xfs_dir2_data_use_free(
corrupt:
xfs_corruption_error(__func__, XFS_ERRLEVEL_LOW, args->dp->i_mount,
hdr, sizeof(*hdr), __FILE__, __LINE__, fa);
xfs_da_mark_sick(args);
return -EFSCORRUPTED;
}
......
......@@ -19,6 +19,7 @@
#include "xfs_trace.h"
#include "xfs_trans.h"
#include "xfs_buf_item.h"
#include "xfs_health.h"
/*
* Local function declarations.
......@@ -1393,8 +1394,10 @@ xfs_dir2_leaf_removename(
bestsp = xfs_dir2_leaf_bests_p(ltp);
if (be16_to_cpu(bestsp[db]) != oldbest) {
xfs_buf_mark_corrupt(lbp);
xfs_da_mark_sick(args);
return -EFSCORRUPTED;
}
/*
* Mark the former data entry unused.
*/
......
......@@ -20,6 +20,7 @@
#include "xfs_trans.h"
#include "xfs_buf_item.h"
#include "xfs_log.h"
#include "xfs_health.h"
/*
* Function declarations.
......@@ -231,6 +232,7 @@ __xfs_dir3_free_read(
__xfs_buf_mark_corrupt(*bpp, fa);
xfs_trans_brelse(tp, *bpp);
*bpp = NULL;
xfs_dirattr_mark_sick(dp, XFS_DATA_FORK);
return -EFSCORRUPTED;
}
......@@ -443,6 +445,7 @@ xfs_dir2_leaf_to_node(
if (be32_to_cpu(ltp->bestcount) >
(uint)dp->i_disk_size / args->geo->blksize) {
xfs_buf_mark_corrupt(lbp);
xfs_da_mark_sick(args);
return -EFSCORRUPTED;
}
......@@ -517,6 +520,7 @@ xfs_dir2_leafn_add(
*/
if (index < 0) {
xfs_buf_mark_corrupt(bp);
xfs_da_mark_sick(args);
return -EFSCORRUPTED;
}
......@@ -736,6 +740,7 @@ xfs_dir2_leafn_lookup_for_addname(
cpu_to_be16(NULLDATAOFF))) {
if (curfdb != newfdb)
xfs_trans_brelse(tp, curbp);
xfs_da_mark_sick(args);
return -EFSCORRUPTED;
}
curfdb = newfdb;
......@@ -804,6 +809,7 @@ xfs_dir2_leafn_lookup_for_entry(
xfs_dir3_leaf_check(dp, bp);
if (leafhdr.count <= 0) {
xfs_buf_mark_corrupt(bp);
xfs_da_mark_sick(args);
return -EFSCORRUPTED;
}
......@@ -1739,6 +1745,7 @@ xfs_dir2_node_add_datablk(
} else {
xfs_alert(mp, " ... fblk is NULL");
}
xfs_da_mark_sick(args);
return -EFSCORRUPTED;
}
......
......@@ -260,6 +260,7 @@ int xfs_iext_count_may_overflow(struct xfs_inode *ip, int whichfork,
int nr_to_add);
int xfs_iext_count_upgrade(struct xfs_trans *tp, struct xfs_inode *ip,
uint nr_to_add);
bool xfs_ifork_is_realtime(struct xfs_inode *ip, int whichfork);
/* returns true if the fork has extents but they are not read in yet. */
static inline bool xfs_need_iread_extents(const struct xfs_ifork *ifp)
......
......@@ -838,10 +838,12 @@ struct xfs_cud_log_format {
#define XFS_BMAP_EXTENT_ATTR_FORK (1U << 31)
#define XFS_BMAP_EXTENT_UNWRITTEN (1U << 30)
#define XFS_BMAP_EXTENT_REALTIME (1U << 29)
#define XFS_BMAP_EXTENT_FLAGS (XFS_BMAP_EXTENT_TYPE_MASK | \
XFS_BMAP_EXTENT_ATTR_FORK | \
XFS_BMAP_EXTENT_UNWRITTEN)
XFS_BMAP_EXTENT_UNWRITTEN | \
XFS_BMAP_EXTENT_REALTIME)
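Folding the new realtime bit into XFS_BMAP_EXTENT_FLAGS keeps recovery-time validation a single mask test; a hedged sketch of that check:

/* Sketch: any bit set outside XFS_BMAP_EXTENT_FLAGS marks a
 * recovered bmap intent as corrupt. */
static bool example_bmap_extent_flags_valid(uint32_t flags)
{
	return !(flags & ~XFS_BMAP_EXTENT_FLAGS);
}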
/*
* This is the structure used to lay out a bui log item in the
......