- 08 May, 2013 9 commits
-
-
Alex Elder authored
Defer setting the size and features fields of a mapped image until after the Linux disk structure is set up. Set the capacity of the disk after that. Rearrange the definition of rbd_image_header, separating the fields that are set only once from those that can be updated. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
-
Alex Elder authored
Hold off setting the read-only flag in rbd_add() for an image being mapped until we have successfully probed the image. At that point we know whether it's a snapshot mapping or not, so we can set the read-only flag in that one place rather than doing so (for snapshots) in rbd_dev_mapping_set(). To do this, pass a flag to the image probe routine indicating whether we want a read-only mapping. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
-
Alex Elder authored
This function is a duplicate of rbd_dev_mapping_clear(), and was added by mistake. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
-
Alex Elder authored
Currently rbd_dev_mapping_set() looks up the snapshot id for the snapshot whose name is found in the rbd device's spec structure. That function gets called by rbd_dev_device_setup(), which is called by rbd_add() *after* rbd_dev_image_probe(). If the image probe succeeds, the rbd device's spec will already have been updated to include names and ids for all fields. Therefore there's no need to look up the snapshot id in rbd_dev_mapping_set(). Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
-
Alex Elder authored
The presence of the LAYERING bit in an rbd image's feature mask does not guarantee the image actually has a parent image. Currently that bit is set only when a clone (i.e., image with a parent) is created, but it is (currently) not cleared if that clone gets flattened back into a "normal" image. A "parent_id" query will leave the parent_spec for the image being mapped a null pointer, but will not return an error. Currently, whenever an image with the LAYERED feature gets mapped, a warning about the use of layered images gets printed. But we don't want to do this for a flattened image, so print the warning only if we find there is a parent spec after the probe. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
-
Alex Elder authored
Since rbd_update_mapping_size() is now a trivial wrapper, just open code it in its two callers. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
-
Alex Elder authored
When a mapped image changes size, we change the capacity recorded for the Linux disk associated with it, in rbd_update_mapping_size(). That function is called in two places--the format 1 and format 2 refresh routines. There is no need to set the capacity while holding the header semaphore. Instead, do it in the common rbd_dev_refresh(), using the logic that's already there to initiate disk revalidation. Add handling in the request function, just in case a request that exceeds the capacity of the device comes in (perhaps one that was started before a refresh shrunk the device). Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
-
Alex Elder authored
This commit: d98df63e rbd: revalidate_disk upon rbd resize instituted a call to revalidate_disk() to notify interested parties that a mapped image has changed size. This works well, as long as the the rbd device doesn't map a snapshot. A snapshot will never change size. However, the base image the snapshot is associated with can, and it can do so while the snapshot is mapped. The problem is that the test for the size is looking at the size of the base image, not the size of the mapped snapshot. This patch corrects that. Update the warning message shown in the event of error, and move it into the callers. This resolves: http://tracker.ceph.com/issues/4911Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
-
Alex Elder authored
When rbd_dev_v2_refresh() is called, the rbd device already has a snapshot context associated with it. But that never gets freed, the pointer just gets overwritten. Fix this by dropping the rbd device's reference to the snapshot context before overwriting the pointer. Because ceph_put_snap_context() already handles for a null pointer we don't need to check for that (for the probe case, where no context has yet been assigned). This resolves: http://tracker.ceph.com/issues/4912Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
-
- 02 May, 2013 31 commits
-
-
Alex Elder authored
When a read for a layered image object finds the target object doesn't exist, a read image request for the parent image is created and submitted. When that completes, the callback routine was not releasing that parent image request. Fix that. The slab allocation stuff just added has greatly simplified the search for the source of this memory leak. This resolves: http://tracker.ceph.com/issues/4803Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
-
Alex Elder authored
Create a slab cache to manage allocation of ceph_osdc_request structures. This resolves: http://tracker.ceph.com/issues/3926Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
-
Alex Elder authored
Create a slab cache to manage ceph_msg_data structure allocation. This is part of: http://tracker.ceph.com/issues/3926Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
-
Alex Elder authored
Create a slab cache to manage ceph_msg structure allocation. This is part of: http://tracker.ceph.com/issues/3926Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
-
Alex Elder authored
The names of objects used for image object requests are always fixed size. So create a slab cache to manage them. Define a new function rbd_segment_name_free() to match rbd_segment_name() (which is what supplies the dynamically-allocated name buffer). This is part of: http://tracker.ceph.com/issues/3926Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
-
Alex Elder authored
Create a slab cache to manage rbd_obj_request allocation. We aren't using a constructor, and we'll zero-fill object request structures when they're allocated. This is part of: http://tracker.ceph.com/issues/3926Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
-
Alex Elder authored
The next patch will define a slab allocator for a object requests. To use that we'll need to allocate the name of an object separate from the request structure itself. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
-
Alex Elder authored
Create a slab cache to manage rbd_img_request allocation. Nothing too fancy at this point--we'll still initialize everything at allocation time (no constructor) This is part of: http://tracker.ceph.com/issues/3926Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
-
Alex Elder authored
Use bsearch(3) to make snapshot lookup by id more efficient. (There could be thousands of snapshots, and conceivably many more.) Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
-
Alex Elder authored
This functionality inadvertently disappeared in the last patch. Image snapshots can get removed at just about any time. In particular it can disappear even if it is in use by an rbd client as a mapped image. The rbd client deals with such a disappearance by responding to new requests with ENXIO. This is implemented by each rbd device maintaining an EXISTS flag, which is normally set but cleared if a snapshot disappears. This patch (re-)implements the clearing of that flag. Whenever mapped image header information is refreshed, if the mapping is for a snapshot, verify the mapped snapshot is still present in the updated snapshot context. If it is not, clear the flag. It is not necessary to check this in the initial probe, because the probe will not succeed if the snapshot doesn't exist. This resolves: http://tracker.ceph.com/issues/4880Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
-
Alex Elder authored
We no longer use the snapshot list for anything. When we need to look up a snapshot name, id, size, or feature mask, we just do it directly rather than relying on this list being updated with every refresh. The main reason it existed was for the benefit of the device/sysfs entries that previously were associated with snapshots. So get rid of the snapshot list, and struct rbd_snap, and the hundreds of lines of code that supported them. This resolves: http://tracker.ceph.com/issues/4868Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
-
Alex Elder authored
This patch defines a handful of new functions that will allow us to get rid of the rbd device structure's list of snapshots. Define rbd_snap_id_by_name() to look up a snapshot id given its name. This is efficient for format 1 images but not for format 2. Fortunately it only gets called at mapping time so it's not that critical. Use rbd_snap_id_by_name() to find out the id for a snapshot getting mapped, and pass that id to new functions rbd_snap_size() and rbd_snap_features() to look up information about a given snapshot's size and feature mask given its snapshot id. All this gets done in rbd_dev_mapping_set(). As a result, snap_by_name() is no longer needed, so get rid of it. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
-
Alex Elder authored
In order to align with what was needed for format 1 rbd images, rbd_dev_v2_snap_info() was set up to take as argument an index into the array of snapshot ids in a rbd device's snapshot context. This switches that around, so we pass the snapshot id instead. In doing this, rbd_snap_name() now returns a dynamically-allocated string rather than a fixed one, so there's no need to make a duplicate in its caller, rbd_dev_spec_update(). This means the following functions take a snapshot id where they previously used an index value: rbd_dev_snap_info() rbd_dev_v1_snap_info() rbd_dev_v2_snap_info() A new function, rbd_dev_snap_index(), determines the snap index for format 1 images and uses it to look up the name. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
-
Alex Elder authored
Rather than scanning the list of snapshot structures for it, scan the snapshot context buffer containing snapshot names in order to determine for a format 1 image the name associated with a given snapshot id. Pull out the part of rbd_dev_v1_snap_info() that does this scan into a new function, _rbd_dev_v1_snap_name(). Have that function return a dynamically-allocated copy of the name, and don't duplicate it in rbd_dev_v1_snap_info(). Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
-
Alex Elder authored
Nothing ever uses the version field maintained in the object request structure any more, so get rid of it. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
-
Alex Elder authored
Only NULL is passed as the version argument to rbd_obj_method_sync(), so get rid of it. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
-
Alex Elder authored
Continued from the last patch, more parameters that can go away because we no longer have a need to track object versions. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
-
Alex Elder authored
Several functions in rbd have parameters meant to allow the version of an object to be passed in or out. The purpose of those was to allow the version of a header object to be maintained, but we no longer do that. As a result, these parameters are never actually needed or used, so get rid of them. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
-
Alex Elder authored
The rbd code takes care to maintain the version of the header object. This was done in hopes of using it to detect a change in the object between reading it and setting up a watch request to be notified of changes. The mechanism was never fully implemented, however. And we now avoid the original problem by setting up the watch request before ever reading the content of the header. The osd doesn't interpret the object version supplied with a WATCH osd op, nor does it use the version supplied with a NOTIFY_ACK op (we can just supply 0 for both). There is therefore no need to maintain the header's object version any more, so stop doing so. We'll be able to simplify some more rbd code in the next few patches as a result of this. This resolves: http://tracker.ceph.com/issues/3952Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
-
Alex Elder authored
Make explicit that snapshot names don't change by making functions return and take parameters that that point to const qualified data. This resolves: http://tracker.ceph.com/issues/4867Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
-
Alex Elder authored
Whenever a header object event causes a mapped rbd image to refresh its header information, revalidate_disk() is being called. This was done in rbd_dev_refresh() outside the control mutex in order to avoid a lock inversion. Although a an event like this *might* indicate the image has changed size, most of the time it does not. Record the image size before and after the refresh, and only call revalidate_disk() if it changes. This resolves: http://tracker.ceph.com/issues/4867Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
-
Alex Elder authored
A warning gets spewed for any image being probed, including parent images. Set up a condition such that the warning message only gets printed for the image being mapped, not any of its parents. Also, I didn't like the way the warning ended up being so long. Make it a terse warning instead. People experimenting with layering will know what the message means. This is part of: http://tracker.ceph.com/issues/4867Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
-
Alex Elder authored
Now that we have a library routine to create snap contexts, use it. This is part of: http://tracker.ceph.com/issues/4857Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
-
Alex Elder authored
This creates a new source file "net/ceph/snapshot.c" to contain utility routines related to ceph snapshot contexts. The main motivation was to define ceph_create_snap_context() as a common way to create these structures, but I've moved the definitions of ceph_get_snap_context() and ceph_put_snap_context() there too. (The benefit of inlining those is very small, and I'd rather keep this collection of functions together.) Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
-
Alex Elder authored
Stop setting up Linux devices during the image probe operation. Instead, set up the devices as a separate step after the image probe, in rbd_add(). A consequence of this is that only mapped images get devices assigned to them, which is pretty sweet. This resolves: http://tracker.ceph.com/issues/4774Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
-
Alex Elder authored
Currently an rbd_device structure gets destroyed from the release routine for the device embedded within it. Stop doing that, instead calling rbd_dev_image_release() right after rbd_bus_del_dev() wherever the latter is called. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
-
Alex Elder authored
Define a new function rbd_dev_unprobe() which undoes state changes that occur from calling rbd_dev_v1_probe() or rbd_dev_v2_probe(). Note that this is a superset of rbd_header_free(), which is now getting removed (it seems to have been used improperly anyway). Flesh out rbd_dev_image_release() so it undoes exactly what rbd_dev_image_probe() does. This means that: - rbd_dev_device_release() gets called when the last device reference gets dropped; - that undoes everything done by the rbd_dev_device_setup() call at the end of rbd_dev_image_probe() (and nothing more), ending by calling rbd_dev_image_release(); and - rbd_dev_image_release() undoes everything else done by rbd_dev_image_probe() (and this includes a call to rbd_dev_unprobe(). This means the image and device portions of an rbd device are fairly cleanly separated now, so error paths should be a little easier to verify than they used to be. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
-
Alex Elder authored
Rename rbd_dev_probe_finish() to be rbd_dev_device_setup(). Its purpose is to set up the Linux side of an rbd device mapping. Rename rbd_dev_release() to be rbd_dev_device_release(), making it more obvious it serves as the inverse of the setup function (or it will). Encapsulate some of what was done in rbd_dev_release() into a new function rbd_dev_image_release(), which serves as the inverse of setting up the ceph side of the mapped rbd image. Define a new helper rbd_dev_clear_mapping() to simply zero out the fields of a mapping structure--the inverse of rbd_dev_set_mapping(). Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
-
Alex Elder authored
Drop the module reference at the end of rbd_remove() for symmetry with adding a reference at the top of rbd_add(). Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
-
Alex Elder authored
Move setting up the watch request for an image so it's done in rbd_dev_image_probe() rather than rbd_dev_probe_finish(). Move it all the way up to before doing the initial probe. This avoids a potential race condition, in which we get (and use) the initial snapshot context for an image, and it gets changed between that time and the time we get the watch set up. This resolves: http://tracker.ceph.com/issues/3871Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
-
Alex Elder authored
When a format 2 image is refreshed, code is in place to verify that the object order never changes from what it was originally. This relies on the fact that the refresh will occur *after* an initial load of information about the image. An upcoming patch makes it possible for the refresh to occur first, so we can no longer make this order check. The order really can't ever change anyway--this was just a sanity check. So get rid of it. Signed-off-by: Alex Elder <elder@inktank.com> Reviewed-by: Josh Durgin <josh.durgin@inktank.com>
-