- 29 May, 2011 26 commits
-
-
Benny Halevy authored
Add a layout driver method to encode the layout type specific opaque part of layout commit in-line in the xdr stream. Currently, the pnfs-objects layout driver uses it to encode metadata hints to the MDS and the blocks layout driver to commit provisionally allocated extents to the file. Signed-off-by: Benny Halevy <bhalevy@panasas.com>
-
Boaz Harrosh authored
An io_state pre-allocates an error information structure for each possible osd-device that might error during IO. When IO is done if all was well the io_state is freed. (as today). If the I/O has ended with an error, the io_state is queued on a per-layout err_list. When eventually encode_layoutreturn() is called, each error is properly encoded on the XDR buffer and only then the io_state is removed from err_list and de-allocated. It is up to the io_engine to fill in the segment that fault and the type of osd_error that occurred. By calling objlayout_io_set_result() for each failing device. In objio_osd: * Allocate io-error descriptors space as part of io_state * Use generic objlayout error reporting at end of io. Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com>
-
Andy Adamson authored
Add a layout driver method to encode the layout type specific opaque part of layout return in-line in the xdr stream. Currently the pnfs-objects layout driver uses it to encode i/o error information on LAYOUTRETURN. Signed-off-by: Andy Adamson <andros@netapp.com> [fixup layout header pointer for encode_layoutreturn] Signed-off-by: Benny Halevy <bhalevy@panasas.com>
-
Benny Halevy authored
With the objects layout security model, we have object capabilities that are associated with the layout and we anticipate that the server will issue a cb_layoutrecall for any setattr that changes security related attributes (user/group/mode/acl) or truncates the file. Therefore, the layout is returned before issuing the setattr to avoid the anticipated cb_layoutrecall. Signed-off-by: Benny Halevy <bhalevy@panasas.com>
-
Benny Halevy authored
NFSv4.1 LAYOUTRETURN implementation Currently, does not support layout-type payload encoding. Signed-off-by: Alexandros Batsakis <batsakis@netapp.com> Signed-off-by: Andy Adamson <andros@citi.umich.edu> Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Dean Hildebrand <dhildeb@us.ibm.com> Signed-off-by: Fred Isaman <iisaman@citi.umich.edu> Signed-off-by: Fred Isaman <iisaman@netapp.com> Signed-off-by: Marc Eshel <eshel@almaden.ibm.com> Signed-off-by: Zhang Jingwang <zhangjingwang@nrchpc.ac.cn> [call pnfs_return_layout right before pnfs_destroy_layout] [remove assert_spin_locked from pnfs_clear_lseg_list] [remove wait parameter from the layoutreturn path.] [remove return_type field from nfs4_layoutreturn_args] [remove range from nfs4_layoutreturn_args] [no need to send layoutcommit from _pnfs_return_layout] [don't wait on sync layoutreturn] [fix layout stateid in layoutreturn args] [fixed NULL deref in _pnfs_return_layout] [removed recaim member of nfs4_layoutreturn_args] Signed-off-by: Benny Halevy <bhalevy@panasas.com>
-
Boaz Harrosh authored
With the use of the in-kernel osd library. Implement read/write of data from/to osd-objects according to information specified in the objects-layout. Support for stripping over mirrors with a received stripe_unit. There are however a few constrains which are not supported: 1. Stripe Unit must be a multiple of PAGE_SIZE 2. stripe length (stripe_unit * number_of_stripes) can not be bigger then 32bit. Also support raid-groups and partial-layout. Partial-layout is when not all the groups are received on the line, addressing only a partial range of the file. TODO: Only raid0! raid 4/5/6 support will come at later stage A none supported layout will send IO through the MDS [Important fallout from the last rebase] Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> [gfp_flags] Signed-off-by: Benny Halevy <bhalevy@panasas.com>
-
Benny Halevy authored
Non-rpc layout driver such as for objects and blocks implement their own I/O path and error handling logic. Therefore bypass NFS-based error handling for these layout drivers. [fix lseg ref-count bugs, and null de-refs] [Fall out from: non-rpc layout drivers] Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> [get rid of PNFS_USE_RPC_CODE] [get rid of __nfs4_write_done_cb] [revert useless change in nfs4_write_done_cb] Signed-off-by: Benny Halevy <bhalevy@panasas.com>
-
Benny Halevy authored
allocate and deallocate per-inode private pnfs_layout_hdr in preparation for I/O implementation. Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com>
-
Benny Halevy authored
[gfp_flags] Signed-off-by: Benny Halevy <bhalevy@panasas.com>
-
Boaz Harrosh authored
When a new layout is received in objio_alloc_lseg all device_ids referenced are retrieved. The device information is queried for from MDS and then the osd_device is looked-up from the osd-initiator library. The devices are cached in a per-mount-point list, for later use. At unmount all devices are "put" back to the library. objlayout_get_deviceinfo(), objlayout_put_deviceinfo() middleware API for retrieving device information given a device_id. TODO: The device cache can get big. Cap its size. Keep an LRU and start to return devices which were not used, when list gets to big, or when new entries allocation fail. [pnfs-obj: Bugs in new global-device-cache code] Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> [gfp_flags] [use global device cache] [use layout driver in global device cache] Signed-off-by: Benny Halevy <bhalevy@panasas.com>
-
Boaz Harrosh authored
objlayout_alloc_lseg prepares an xdr_stream and calls the raid engins objio_alloc_lseg() to allocate a private pnfs_layout_segment. objio_osd.c::objio_alloc_lseg() uses passed xdr_stream to decode and store the layout_segment information in an objio_segment struct, using the pnfs_osd_xdr.h API for the actual parsing the layout xdr. objlayout_free_lseg calls objio_free_lseg() to free the allocated space. Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> [gfp_flags] [removed "extern" from function definitions] Signed-off-by: Benny Halevy <bhalevy@panasas.com>
-
Boaz Harrosh authored
* Add the fs/nfs/objlayout/pnfs_osd_xdr_cli.c file, which will include the XDR encode/decode implementations for the pNFS client objlayout driver. [Wrong type in comments] Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com>
-
Benny Halevy authored
* Add the pnfs_osd_xdr.h header * defintions the pnfs_osd_layout structure including all it's sub-types and constants. * Declare the pnfs_osd_xdr_decode_layout API + all needed inline helpers. * Define the pnfs_osd_deviceaddr structure and all its subtypes and constants. * Declare API for decoding of a pnfs_osd_deviceaddr from XDR stream. * Define the pnfs_osd_ioerr structure, its substructures and constants. * Declare API for encoding of a pnfs_osd_ioerr into XDR stream. * Define the pnfs_osd_layoutupdate structure and its substructures. * Declare API for encoding of a pnfs_osd_layoutupdate into XDR stream. [Remove server definitions] Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com>
-
Benny Halevy authored
* Define the PNFS_OBJLAYOUT Kconfig option in the nfs master Kconfig file. * Add the objlayout driver to the Kernel's Kbuild system. * Add the fs/nfs/objlayout/Kbuild file for building the objlayoutdriver.ko driver * Define fs/nfs/objlayout/objio_osd.c, register the driver on module initialization and unregister on exit. [pnfs-obj: remove of CONFIG_PNFS fallout] Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> [added "unsure" clause] [depend on NFS_V4_1] Signed-off-by: Benny Halevy <bhalevy@panasas.com>
-
J. Bruce Fields authored
A pNFS client auto-negotiates a lot of features (minorversion level, pNFS layout type, etc.). This is convenient, but makes certain kinds of failures hard for a user to detect. For example, if the client falls back on 4.0, or falls back to MDS IO because the user didn't connect to the right iscsi disks before mounting, the only symptoms may be reduced performance, which may not be noticed till long after the actual failure, and may be difficult for a user to diagnose. However, such "failures" may also be perfectly normal in some cases, so we don't want to spam the system logs with them. One approach would be to put some more information into /proc/self/mountstats. Signed-off-by: J. Bruce Fields <bfields@fieldses.org> Signed-off-by: Benny Halevy <bhalevy@panasas.com> [pnfs: add commit client stats] [fixup data types for "ret" variables in pnfs_try_to* inline funcs.] Signed-off-by: Benny Halevy <bhalevy@panasas.com> [fix definition of show_pnfs for !CONFIG_PNFS] Signed-off-by: Benny Halevy <bhalevy@panasas.com> [nfs41: Fix show_sessions in the not CONFIG_NFS_V4_1 case] There is a build error when CONFIG_NFS_V4 is set but CONFIG_NFS_V4_1 is *not* set. show_sessions() prototype was unbalanced between the two cases. Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> [pnfs: super.c remove CONFIG_PNFS] Signed-off-by: Andy Adamson <andros@netapp.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com>
-
Benny Halevy authored
Use recalled range to invalidate particular layout segments in the layout cache. Signed-off-by: Benny Halevy <bhalevy@panasas.com>
-
Benny Halevy authored
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
-
Benny Halevy authored
Add offset and count parameters to pnfs_update_layout and use them to get the layout in the pageio path. Order cache layout segments in the following order: * offset (ascending) * length (descending) * iomode (RW before READ) Test byte range against the layout segment in use in pnfs_{read,write}_pg_test so not to coalesce pages not using the same layout segment. [fix lseg ordering] [clean up pnfs_find_lseg lseg arg] [remove unnecessary FIXME] [fix ordering in pnfs_insert_layout] [clean up pnfs_insert_layout] Signed-off-by: Benny Halevy <bhalevy@panasas.com>
-
Benny Halevy authored
Initialize xdr_stream and xdr_buf using an array of page pointers and length of buffer. Signed-off-by: Benny Halevy <bhalevy@panasas.com>
-
Benny Halevy authored
pnfs deviceids are unique per server, per layout type. struct nfs_client is currently used to distinguish deviceids from different nfs servers, yet these may clash between different layout types on the same server. Therefore, use the layout driver associated with each deviceid at insertion time to look it up, unhash, or delete it. Signed-off-by: Benny Halevy <bhalevy@panasas.com>
-
Marc Eshel authored
Note: This functionlaity is incomplete as all layout segments referring to the 'to be removed device id' need to be reaped, and all in flight I/O drained. [use be32 res in nfs4_callback_devicenotify] [use nfs_client to qualify deviceid for cb_notify_deviceid] [use global deviceid cache for CB_NOTIFY_DEVICEID] [refactor device cache _lookup_deviceid] [refactor device cache _find_get_deviceid] Signed-off-by: Benny Halevy <bhalevy@panasas.com> [Bug in new global-device-cache code] [layout_driver MUST set free_deviceid_node if using dev-cache] Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: Benny Halevy <bhalevy@panasas.com>
-
Benny Halevy authored
Use the pnfs_layoutdriver_type both as a qualifier for the deviceid, distinguishing deviceid from different layout types on the server, and for freeing the layout-driver allocated structure containing the nfs4_deviceid_node. [BUG in _deviceid_purge_client] [layout_driver MUST set free_deviceid_node if using dev-cache] [let ver < 4.1 compile] Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> [removed EXPORT_SYMBOL_GPL(nfs4_deviceid_purge_client)] Signed-off-by: Benny Halevy <bhalevy@panasas.com>
-
Benny Halevy authored
Move deviceid cache from the pnfs files layout driver to the generic layer in preparation for the objects layout driver. Signed-off-by: Benny Halevy <bhalevy@panasas.com>
-
Benny Halevy authored
Some definitions in the header file depend on nfs_fs.h so pnfs.h can't be included independently. Signed-off-by: Benny Halevy <bhalevy@panasas.com>
-
Benny Halevy authored
deviceids are unique per server, per layout type. Therefore, in the global cache in the files layout driver deviceids from different servers may clash so we need to qualify them with a struct nfs_client that represents the nfs server that returned the deviceid. Introduced in 2.6.39 commit ea8eecdd "NFSv4.1 move deviceid cache to filelayout driver" Signed-off-by: Benny Halevy <bhalevy@panasas.com>
-
Jim Rees authored
Signed-off-by: Jim Rees <rees@umich.edu> Signed-off-by: Benny Halevy <bhalevy@panasas.com>
-
- 19 May, 2011 1 commit
-
-
Linus Torvalds authored
-
- 18 May, 2011 13 commits
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2Linus Torvalds authored
* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jlbec/ocfs2: configfs: Fix race between configfs_readdir() and configfs_d_iput() configfs: Don't try to d_delete() negative dentries. ocfs2/dlm: Target node death during resource migration leads to thread spin ocfs2: Skip mount recovery for hard-ro mounts ocfs2/cluster: Heartbeat mismatch message improved ocfs2/cluster: Increase the live threshold for global heartbeat ocfs2/dlm: Use negotiated o2dlm protocol version ocfs2: skip existing hole when removing the last extent_rec in punching-hole codes. ocfs2: Initialize data_ac (might be used uninitialized)
-
git://git.secretlab.ca/git/linux-2.6Linus Torvalds authored
* 'devicetree/merge' of git://git.secretlab.ca/git/linux-2.6: drivercore: revert addition of of_match to struct device of: fix race when matching drivers
-
git://git.linux-mips.org/pub/scm/upstream-linusLinus Torvalds authored
* 'upstream' of git://git.linux-mips.org/pub/scm/upstream-linus: MIPS: Kludge IP27 build for 2.6.39. MIPS: AR7: Fix GPIO register size for Titan variant. MIPS: Fix duplicate invocation of notify_die. MIPS: RB532: Fix iomap resource size miscalculation.
-
Grant Likely authored
Commit b826291c, "drivercore/dt: add a match table pointer to struct device" added an of_match pointer to struct device to cache the of_match_table entry discovered at driver match time. This was unsafe because matching is not an atomic operation with probing a driver. If two or more drivers are attempted to be matched to a driver at the same time, then the cached matching entry pointer could get overwritten. This patch reverts the of_match cache pointer and reworks all users to call of_match_device() directly instead. Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
-
Milton Miller authored
If two drivers are probing devices at the same time, both will write their match table result to the dev->of_match cache at the same time. Only write the result if the device matches. In a thread titled "SBus devices sometimes detected, sometimes not", Meelis reported his SBus hme was not detected about 50% of the time. From the debug suggested by Grant it was obvious another driver matched some devices between the call to match the hme and the hme discovery failling. Reported-by: Meelis Roos <mroos@linux.ee> Signed-off-by: Milton Miller <miltonm@bga.com> [grant.likely: modified to only call of_match_device() once] Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
-
git://git.kernel.dk/linux-2.6-blockLinus Torvalds authored
* 'for-linus' of git://git.kernel.dk/linux-2.6-block: block: don't delay blk_run_queue_async scsi: remove performance regression due to async queue run blk-throttle: Use task_subsys_state() to determine a task's blkio_cgroup block: rescan partitions on invalidated devices on -ENOMEDIA too cdrom: always check_disk_change() on open block: unexport DISK_EVENT_MEDIA_CHANGE for legacy/fringe drivers
-
Ralf Baechle authored
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
-
Florian Fainelli authored
The 'size' variable contains the correct register size for both AR7 and Titan, but we never used it to ioremap the correct register size. This problem only shows up on Titan. [ralf@linux-mips.org: Fixed the fix. The original patch as in patchwork recognizes the problem correctly then fails to fix it ...] Reported-by: Alexander Clouter <alex@digriz.org.uk> Signed-off-by: Florian Fainelli <florian@openwrt.org> Patchwork: https://patchwork.linux-mips.org/patch/2380/Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
-
Ralf Baechle authored
Initial patch by Yury Polyanskiy <ypolyans@princeton.edu>. Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Patchwork: https://patchwork.linux-mips.org/patch/2373/
-
Ralf Baechle authored
This is the MIPS portion of Joe Perches <joe@perches.com>'s https://patchwork.linux-mips.org/patch/2172/ which seems to have been lost in time and space. Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
-
Joel Becker authored
configfs_readdir() will use the existing inode numbers of inodes in the dcache, but it makes them up for attribute files that aren't currently instantiated. There is a race where a closing attribute file can be tearing down at the same time as configfs_readdir() is trying to get its inode number. We want to get the inode number of open attribute files, because they should match while instantiated. We can't lock down the transition where dentry->d_inode is set to NULL, so we just check for NULL there. We can, however, ensure that an inode we find isn't iput() in configfs_d_iput() until after we've accessed it. Signed-off-by: Joel Becker <jlbec@evilplan.org>
-
Joel Becker authored
When configfs is faking mkdir() on its subsystem or default group objects, it starts by adding a negative dentry. It then tries to instantiate the group. If that should fail, it must clean up after itself. I was using d_delete() here, but configfs_attach_group() promises to return an empty dentry on error. d_delete() explodes with the entry dentry. Let's try d_drop() instead. The unhashing is what we want for our dentry. Signed-off-by: Joel Becker <jlbec@evilplan.org>
-
Shaohua Li authored
Let's check a scenario: 1. blk_delay_queue(q, SCSI_QUEUE_DELAY); 2. blk_run_queue_async(); the second one will became a noop, because q->delay_work already has WORK_STRUCT_PENDING_BIT set, so the delayed work will still run after SCSI_QUEUE_DELAY. But blk_run_queue_async actually hopes the delayed work runs immediately. Fix this by doing a cancel on potentially pending delayed work before queuing an immediate run of the workqueue. Signed-off-by: Shaohua Li <shaohua.li@intel.com> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
-