- 29 Aug, 2019 1 commit
-
-
Jeff Layton authored
commit 28a28261 upstream. When ceph_mdsc_do_request returns an error, we can't assume that the filelock_reply pointer will be set. Only try to fetch fields out of the r_reply_info when it returns success. Cc: stable@vger.kernel.org Reported-by:
Hector Martin <hector@marcansoft.com> Signed-off-by:
Jeff Layton <jlayton@kernel.org> Reviewed-by:
"Yan, Zheng" <zyan@redhat.com> Signed-off-by:
Ilya Dryomov <idryomov@gmail.com> Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
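The rule in the commit above, that reply fields may only be read when the request succeeded, can be sketched in plain userspace C. This is a minimal sketch, not the kernel code: `struct request`, `struct filelock_reply`, `do_request()` and `get_lock_range()` are hypothetical stand-ins introduced for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; these are not the kernel's structures. */
struct filelock_reply {
    uint64_t start;
    uint64_t length;
};

struct request {
    struct filelock_reply *filelock_reply; /* only set on success */
};

/* Hypothetical request function: on failure the reply pointer stays NULL. */
static int do_request(struct request *req, int simulated_err,
                      struct filelock_reply *reply)
{
    req->filelock_reply = NULL;
    if (simulated_err)
        return simulated_err;
    req->filelock_reply = reply;
    return 0;
}

/* Copy the reply fields out only when the request returned success. */
static int get_lock_range(struct request *req, int simulated_err,
                          struct filelock_reply *reply,
                          uint64_t *start, uint64_t *length)
{
    int err;

    assert(req && start && length);
    err = do_request(req, simulated_err, reply);
    if (!err) {
        *start = req->filelock_reply->start;
        *length = req->filelock_reply->length;
    }
    return err;
}
```

On the error path the output parameters are left untouched, so a caller never sees values read through an unset pointer.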
-
- 02 Apr, 2018 1 commit
-
-
Chengguang Xu authored
Some of the dout formats do not include a newline at the end. Fix this for the files in the fs/ceph and net/ceph directories, and change printk to dout for printing debug info in super.c. Signed-off-by:
Chengguang Xu <cgxu519@icloud.com> Reviewed-by:
"Yan, Zheng" <zyan@redhat.com> Signed-off-by:
Ilya Dryomov <idryomov@gmail.com>
-
- 13 Nov, 2017 4 commits
-
-
Yan, Zheng authored
When a session gets evicted, all file locks associated with the session get released remotely by the MDS, and the file locks tracked by the kernel become stale. In this situation, set an error flag on the inode. The flag makes further file lock requests return -EIO. Another option would be to clean up the file locks tracked by the kernel. I did not choose it because it is inconvenient to notify the user program about the error. Signed-off-by:
"Yan, Zheng" <zyan@redhat.com> Acked-by:
Jeff Layton <jlayton@redhat.com> Signed-off-by:
Ilya Dryomov <idryomov@gmail.com>
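The approach described in this commit, flagging the inode so that later lock requests fail, can be illustrated as follows. This is a minimal userspace sketch under stated assumptions: `INODE_ERROR_FILELOCK`, `struct inode_info`, `mark_locks_stale()` and `try_file_lock()` are hypothetical names, not the identifiers used in the kernel.

```c
#include <assert.h>
#include <errno.h>

#define INODE_ERROR_FILELOCK 0x1u /* hypothetical flag name */

/* Hypothetical per-inode state; only the flags word matters here. */
struct inode_info {
    unsigned int flags;
};

/* Called when the session is evicted: the tracked locks are now stale. */
static void mark_locks_stale(struct inode_info *ci)
{
    assert(ci);
    ci->flags |= INODE_ERROR_FILELOCK;
}

/* Every later lock request on a flagged inode fails with -EIO. */
static int try_file_lock(const struct inode_info *ci)
{
    assert(ci);
    if (ci->flags & INODE_ERROR_FILELOCK)
        return -EIO;
    return 0; /* a real client would forward the request to the MDS here */
}
```

Returning an error from the next lock call gives the user program an explicit signal, which is the convenience the commit message cites over silently cleaning up the tracked locks.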
-
Yan, Zheng authored
Don't malloc if there is no flock. Signed-off-by:
"Yan, Zheng" <zyan@redhat.com> Reviewed-by:
Jeff Layton <jlayton@redhat.com> Signed-off-by:
Ilya Dryomov <idryomov@gmail.com>
-
Yan, Zheng authored
Signed-off-by:
"Yan, Zheng" <zyan@redhat.com> Reviewed-by:
Jeff Layton <jlayton@redhat.com> Signed-off-by:
Ilya Dryomov <idryomov@gmail.com>
-
Yan, Zheng authored
File locks are tracked by the inode's auth MDS. Dropping auth caps is equivalent to releasing all file locks. Signed-off-by:
"Yan, Zheng" <zyan@redhat.com> Acked-by:
Jeff Layton <jlayton@redhat.com> Signed-off-by:
Ilya Dryomov <idryomov@gmail.com>
-
- 02 Nov, 2017 1 commit
-
-
Greg Kroah-Hartman authored
Many source files in the tree are missing licensing information, which makes it harder for compliance tools to determine the correct license. By default all files without license information are under the default license of the kernel, which is GPL version 2. Update the files which contain no license information with the 'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boilerplate text. This patch is based on work done by Thomas Gleixner and Kate Stewart and Philippe Ombredanne. How this work was done: Patches were generated and checked against linux-4.14-rc6 for a subset of the use cases: - file had no licensing information in it, - file was a */uapi/* one with no licensing information in it, - file was a */uapi/* one with existing licensing information, Further patches will be generated in subsequent months to fix up cases where non-standard...
-
- 16 Jul, 2017 1 commit
-
-
Benjamin Coddington authored
Since commit c69899a1 "NFSv4: Update of VFS byte range lock must be atomic with the stateid update", NFSv4 has been inserting locks in rpciod worker context. The result is that the file_lock's fl_nspid is the kworker's pid instead of the original userspace pid. The fl_nspid is only used to represent the namespaced virtual pid number when displaying locks or returning from F_GETLK. There's no reason to set it for every inserted lock, since we can usually just look it up from fl_pid. So, instead of looking up and holding struct pid for every lock, let's just look up the virtual pid number from fl_pid when it is needed. That means we can remove fl_nspid entirely. The translation and presentation of fl_pid should handle the following four cases: 1 - F_GETLK on a remote file with a remote lock: In this case, the filesystem should determine the l_pid to return here. Filesystems should indicate that the fl_pid represents a non-local pid value that should not be translated by returning an fl_pid <= 0. 2 - F_GETLK on a local file with a remote lock: This should be the l_pid of the lock manager process, and translated. 3 - F_GETLK on a remote file with a local lock, and 4 - F_GETLK on a local file with a local lock: These should be the translated l_pid of the local locking process. Fuse was already doing the correct thing by translating the pid into the caller's namespace. With this change we must update fuse to translate to init's pid namespace, so that the locks API can then translate from init's pid namespace into the pid namespace of the caller. With this change, the locks API will expect that if a filesystem returns a remote pid as opposed to a local pid for F_GETLK, that remote pid will be <= 0. This signifies that the pid is remote, and the locks API will forego translating that pid into the pid namespace of the local calling process. Finally, we convert remote filesystems to present remote pids using negative numbers. 
Have lustre, 9p, ceph, cifs, and dlm negate the remote pid returned for F_GETLK lock requests. Since local pids will never be larger than PID_MAX_LIMIT (which is currently defined as <= 4 million), and pid_t is a signed int, we should have plenty of room to represent remote pids with negative numbers if we assume that remote pid numbers are similarly limited. If this is not the case, then we run the risk of having a remote pid returned for which there is also a corresponding local pid. This is a problem we have now, but this patch should reduce the chances of that occurring, while also returning those remote pid numbers, for whatever that may be worth. Signed-off-by:
Benjamin Coddington <bcodding@redhat.com> Signed-off-by:
Jeff Layton <jlayton@redhat.com>
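The convention this commit describes (remote pids are stored as values <= 0 and are passed through untranslated, while local pids are translated into the caller's namespace) can be sketched like this. It is a toy userspace model: all function names are hypothetical, and the subtraction in `translate_local_pid()` is only a placeholder for real pid namespace translation.

```c
#include <assert.h>
#include <stdint.h>

/* A filesystem stores a remote pid negated, so that fl_pid <= 0.
 * Assumes remote pids are limited in size like local ones. */
static int remote_pid_to_fl_pid(uint32_t remote_pid)
{
    assert(remote_pid <= INT32_MAX);
    return -(int)remote_pid;
}

/* The locks layer treats any non-positive fl_pid as remote. */
static int fl_pid_is_remote(int fl_pid)
{
    return fl_pid <= 0;
}

/* Placeholder for namespace translation (hypothetical, not the kernel's). */
static int translate_local_pid(int init_ns_pid, int ns_offset)
{
    return init_ns_pid - ns_offset;
}

/* Value reported as l_pid for F_GETLK under the convention above. */
static int pid_for_getlk(int fl_pid, int ns_offset)
{
    if (fl_pid_is_remote(fl_pid))
        return fl_pid; /* remote: report as-is, no translation */
    return translate_local_pid(fl_pid, ns_offset);
}
```

Using the sign as the marker avoids adding a separate "remote" field to the lock structure, at the cost of assuming remote pids fit in the positive range of an int.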
-
- 07 Jul, 2017 1 commit
-
-
Yan, Zheng authored
Don't re-send an interrupted flock request in cases of MDS failover or a received request forward. The corresponding 'lock intr' request may already have finished, in which case it won't get re-sent. Link: http://tracker.ceph.com/issues/20170 Signed-off-by:
"Yan, Zheng" <zyan@redhat.com> Signed-off-by:
Ilya Dryomov <idryomov@gmail.com>
-
- 03 Oct, 2016 1 commit
-
-
Yan, Zheng authored
Signed-off-by:
Yan, Zheng <zyan@redhat.com>
-
- 22 Oct, 2015 1 commit
-
-
Benjamin Coddington authored
Instead of having users check for FL_POSIX or FL_FLOCK to call the correct locks API function, use the check within locks_lock_inode_wait(). This allows for some later cleanup. Signed-off-by:
Benjamin Coddington <bcodding@redhat.com> Signed-off-by:
Jeff Layton <jeff.layton@primarydata.com>
-
- 31 Jul, 2015 1 commit
-
-
Yan, Zheng authored
posix locks should be in ctx->flc_posix list Signed-off-by:
Yan, Zheng <zyan@redhat.com> Signed-off-by:
Ilya Dryomov <idryomov@gmail.com>
-
- 16 Feb, 2015 1 commit
-
-
Jeff Layton authored
This reverts commit 9bd0f45b. Linus rightly pointed out that I failed to initialize the counters when adding them, so they don't work as expected. Just revert this patch for now. Reported-by:
Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by:
Jeff Layton <jeff.layton@primarydata.com>
-
- 16 Jan, 2015 5 commits
-
-
Jeff Layton authored
This makes things a bit more efficient in the cifs and ceph lock pushing code. Signed-off-by:
Jeff Layton <jlayton@primarydata.com> Acked-by:
Christoph Hellwig <hch@lst.de>
-
Jeff Layton authored
We can now add a dedicated spinlock without expanding struct inode. Change to using that to protect the various i_flctx lists. Signed-off-by:
Jeff Layton <jlayton@primarydata.com> Acked-by:
Christoph Hellwig <hch@lst.de>
-
Jeff Layton authored
Signed-off-by:
Jeff Layton <jlayton@primarydata.com> Acked-by:
Christoph Hellwig <hch@lst.de>
-
Jeff Layton authored
Signed-off-by:
Jeff Layton <jlayton@primarydata.com> Acked-by:
Christoph Hellwig <hch@lst.de>
-
Jeff Layton authored
There is only a single call site for each of these functions, and the caller takes the i_lock prior to calling them and drops it just afterward. Move the spinlocking into the functions instead. Signed-off-by:
Jeff Layton <jlayton@primarydata.com> Acked-by:
Christoph Hellwig <hch@lst.de>
-
- 17 Dec, 2014 1 commit
-
-
Yan, Zheng authored
When a lock operation is interrupted, the current code sends an unlock request to the MDS to undo the lock operation. This method does not work as expected because the unlock request can drop locks that have already been acquired. The fix is to use the newly introduced CEPH_LOCK_FCNTL_INTR/CEPH_LOCK_FLOCK_INTR requests to interrupt the blocked file lock request. These requests do not drop locks that have already been acquired; they only interrupt the blocked file lock request. Signed-off-by:
Yan, Zheng <zyan@redhat.com>
-
- 02 Jun, 2014 1 commit
-
-
Jeff Layton authored
Currently, the fl_owner isn't set for flock locks. Some filesystems use byte-range locks to simulate flock locks and there is a common idiom in those that does: fl->fl_owner = (fl_owner_t)filp; fl->fl_start = 0; fl->fl_end = OFFSET_MAX; Since flock locks are generally "owned" by the open file description, move this into the common flock lock setup code. The fl_start and fl_end fields are already set appropriately, so remove the unneeded setting of that in flock ops in those filesystems as well. Finally, the lease code also sets the fl_owner as if they were owned by the process and not the open file description. This is incorrect as leases have the same ownership semantics as flock locks. Set them the same way. The lease code doesn't actually use the fl_owner value for anything, so this is more for consistency's sake than a bugfix. Reported-by:
Trond Myklebust <trond.myklebust@primarydata.com> Signed-off-by:
Jeff Layton <jlayton@poochiereds.net> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> (Staging portion) Acked-by:
J. Bruce Fields <bfields@fieldses.org>
-
- 28 Apr, 2014 1 commit
-
-
Yan, Zheng authored
Signed-off-by:
Yan, Zheng <zheng.z.yan@intel.com> Reviewed-by:
Sage Weil <sage@inktank.com>
-
- 05 Apr, 2014 3 commits
-
-
Yan, Zheng authored
flock and POSIX locks should use fl->fl_file instead of the process ID as the owner identifier. (A POSIX lock uses fl->fl_owner. fl->fl_owner is usually equal to fl->fl_file, but it can also be a customized value.) The process ID of the lock holder is only for the F_GETLK fcntl(2). The fix is to rename the 'pid' fields of struct ceph_mds_request_args and struct ceph_filelock to 'owner', and to rename the 'pid_namespace' fields to 'pid'. Assign fl->fl_file to the 'owner' field of lock messages. We also set the most significant bit of the 'owner' field. The MDS can use that bit to distinguish between old and new clients. The MDS counterpart of this patch modifies the flock code to not take the 'pid_namespace' into consideration when checking for conflicting locks. Signed-off-by:
Yan, Zheng <zheng.z.yan@intel.com> Reviewed-by:
Sage Weil <sage@inktank.com>
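The owner encoding described above (a pointer-derived 64-bit owner value with the most significant bit set so the server can tell new clients from old ones) might look roughly like this in userspace C. This is a sketch under assumptions: `OWNER_NEW_CLIENT_BIT`, `make_lock_owner()` and `owner_from_new_client()` are hypothetical names, and the real client and MDS code may differ.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical name for the marker bit: the top bit of the 64-bit owner. */
#define OWNER_NEW_CLIENT_BIT (UINT64_C(1) << 63)

/* Build an owner value from an open-file pointer and tag it as new-style.
 * The pointer is first cast to an integer of pointer size, then widened. */
static uint64_t make_lock_owner(const void *file)
{
    assert(file);
    return (uint64_t)(uintptr_t)file | OWNER_NEW_CLIENT_BIT;
}

/* Server-side check: was this owner produced by a new-style client? */
static int owner_from_new_client(uint64_t owner)
{
    return (owner & OWNER_NEW_CLIENT_BIT) != 0;
}
```

An old client that still sends a process ID in this field would normally leave the top bit clear, which is what makes the single bit usable as a version marker.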
-
Yan, Zheng authored
Signed-off-by:
Yan, Zheng <zheng.z.yan@intel.com>
-
Yan, Zheng authored
VFS does not directly pass flock's operation code to the filesystem's flock callback. It translates the operation code into the form in which POSIX lock parameters are presented. Signed-off-by:
Yan, Zheng <zheng.z.yan@intel.com>
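The translation mentioned in this commit can be illustrated with the userspace constants from `<sys/file.h>` and `<fcntl.h>`. This is a minimal sketch of the idea, with a hypothetical function name; it is not the VFS implementation itself.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/file.h>

/* Map a flock(2) operation code to the POSIX lock type a filesystem's
 * flock callback would see. LOCK_NB only affects blocking behaviour,
 * so it is masked off before the mapping. */
static int flock_op_to_lock_type(int op)
{
    switch (op & ~LOCK_NB) {
    case LOCK_SH:
        return F_RDLCK; /* shared lock    -> read lock  */
    case LOCK_EX:
        return F_WRLCK; /* exclusive lock -> write lock */
    case LOCK_UN:
        return F_UNLCK; /* unlock */
    default:
        return -EINVAL; /* not a valid flock operation */
    }
}
```

A callback written against this convention should therefore compare the lock type against `F_RDLCK`, `F_WRLCK` and `F_UNLCK` rather than against `LOCK_SH`, `LOCK_EX` and `LOCK_UN`.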
-
- 01 Jul, 2013 1 commit
-
-
Jim Schutt authored
Signed-off-by:
Jim Schutt <jaschut@sandia.gov> Reviewed-by:
Alex Elder <elder@inktank.com>
-
- 29 Jun, 2013 1 commit
-
-
Jeff Layton authored
Having a global lock that protects all of this code is a clear scalability problem. Instead of doing that, move most of the code to be protected by the i_lock instead. The exceptions are the global lists that the ->fl_link sits on, and the ->fl_block list. ->fl_link is what connects these structures to the global lists, so we must ensure that we hold those locks when iterating over or updating these lists. Furthermore, sound deadlock detection requires that we hold the blocked_list state steady while checking for loops. We also must ensure that the search and update to the list are atomic. For the checking and insertion side of the blocked_list, push the acquisition of the global lock into __posix_lock_file and ensure that checking and update of the blocked_list is done without dropping the lock in between. On the removal side, when waking up blocked lock waiters, take the global lock before walking the blocked list and dequeue the waiters from the global list prior to removal from the fl_block list. With this, deadlock detection should be race free while we minimize excessive file_lock_lock thrashing. Finally, in order to avoid a lock inversion problem when handling /proc/locks output we must ensure that manipulations of the fl_block list are also protected by the file_lock_lock. Signed-off-by:
Jeff Layton <jlayton@redhat.com> Signed-off-by:
Al Viro <viro@zeniv.linux.org.uk>
-
- 17 May, 2013 2 commits
-
-
Jim Schutt authored
Ceph's encode_caps_cb() worked hard to not call __page_cache_alloc() while holding a lock, but it's spoiled because ceph_pagelist_addpage() always calls kmap(), which might sleep. Here's the result: [13439.295457] ceph: mds0 reconnect start [13439.300572] BUG: sleeping function called from invalid context at include/linux/highmem.h:58 [13439.309243] in_atomic(): 1, irqs_disabled(): 0, pid: 12059, name: kworker/1:1 . . . [13439.376225] Call Trace: [13439.378757] [<ffffffff81076f4c>] __might_sleep+0xfc/0x110 [13439.384353] [<ffffffffa03f4ce0>] ceph_pagelist_append+0x120/0x1b0 [libceph] [13439.391491] [<ffffffffa0448fe9>] ceph_encode_locks+0x89/0x190 [ceph] [13439.398035] [<ffffffff814ee849>] ? _raw_spin_lock+0x49/0x50 [13439.403775] [<ffffffff811cadf5>] ? lock_flocks+0x15/0x20 [13439.409277] [<ffffffffa045e2af>] encode_caps_cb+0x41f/0x4a0 [ceph] [13439.415622] [<ffffffff81196748>] ? igrab+0x28/0x70 [13439.420610] [<ffffffffa045e9f8>] ? iterate_session_caps+0xe8/0x250 [ceph] [13439.427584] [<ffffffffa045ea25>] iterate_session_caps+0x115/0x250 [ceph] [13439.434499] [<ffffffffa045de90>] ? set_request_path_attr+0x2d0/0x2d0 [ceph] [13439.441646] [<ffffffffa0462888>] send_mds_reconnect+0x238/0x450 [ceph] [13439.448363] [<ffffffffa0464542>] ? ceph_mdsmap_decode+0x5e2/0x770 [ceph] [13439.455250] [<ffffffffa0462e42>] check_new_map+0x352/0x500 [ceph] [13439.461534] [<ffffffffa04631ad>] ceph_mdsc_handle_map+0x1bd/0x260 [ceph] [13439.468432] [<ffffffff814ebc7e>] ? mutex_unlock+0xe/0x10 [13439.473934] [<ffffffffa043c612>] extra_mon_dispatch+0x22/0x30 [ceph] [13439.480464] [<ffffffffa03f6c2c>] dispatch+0xbc/0x110 [libceph] [13439.486492] [<ffffffffa03eec3d>] process_message+0x1ad/0x1d0 [libceph] [13439.493190] [<ffffffffa03f1498>] ? read_partial_message+0x3e8/0x520 [libceph] . . . 
[13439.587132] ceph: mds0 reconnect success [13490.720032] ceph: mds0 caps stale [13501.235257] ceph: mds0 recovery completed [13501.300419] ceph: mds0 caps renewed Fix it up by encoding locks into a buffer first, and when the number of encoded locks is stable, copy that into a ceph_pagelist. [elder@inktank.com: abbreviated the stack info a bit.] Cc: stable@vger.kernel.org # 3.4+ Signed-off-by:
Jim Schutt <jaschut@sandia.gov> Reviewed-by:
Alex Elder <elder@inktank.com>
-
Jim Schutt authored
In his review, Alex Elder mentioned that he hadn't checked that num_fcntl_locks and num_flock_locks were properly decoded on the server side, from a le32 over-the-wire type to a cpu type. I checked, and AFAICS it is done; those interested can consult Locker::_do_cap_update() in src/mds/Locker.cc and src/include/encoding.h in the Ceph server code (git://github.com/ceph/ceph). I also checked the server side for flock_len decoding, and I believe that also happens correctly, by virtue of having been declared __le32 in struct ceph_mds_cap_reconnect, in src/include/ceph_fs.h. Cc: stable@vger.kernel.org # 3.4+ Signed-off-by:
Jim Schutt <jaschut@sandia.gov> Reviewed-by:
Alex Elder <elder@inktank.com>
-
- 23 Feb, 2013 1 commit
-
-
Al Viro authored
Signed-off-by:
Al Viro <viro@zeniv.linux.org.uk>
-
- 08 Jun, 2011 2 commits
-
-
Sage Weil authored
If we request a lock and then abort (e.g., ^C), we need to send a matching unlock request to the MDS to unwind our lock attempt to avoid indefinitely blocking other clients. Reported-by:
Brian Chrisman <brchrisman@gmail.com> Signed-off-by:
Sage Weil <sage@newdream.net>
-
Sage Weil authored
We should use ihold whenever we already have a stable inode ref, even when we aren't holding i_lock. This avoids adding new and unnecessary locking dependencies. Signed-off-by:
Sage Weil <sage@newdream.net>
-
- 01 Dec, 2010 2 commits
-
-
Herb Shiu authored
Fill in the local lock with response data if appropriate, and don't call posix_lock_file when reading locks. Signed-off-by:
Herb Shiu <herb_shiu@tcloudcomputing.com> Acked-by:
Greg Farnum <gregf@hq.newdream.net> Signed-off-by:
Sage Weil <sage@newdream.net>
-
Herb Shiu authored
Signed-off-by:
Herb Shiu <herb_shiu@tcloudcomputing.com> Acked-by:
Greg Farnum <gregf@hq.newdream.net> Signed-off-by:
Sage Weil <sage@newdream.net>
-
- 20 Oct, 2010 2 commits
-
-
Greg Farnum authored
When the lock_kernel() turns into lock_flocks() and a spinlock, we won't be able to do allocations with the lock held. Preallocate space without the lock, and retry if the lock state changes out from underneath us. Signed-off-by:
Greg Farnum <gregf@hq.newdream.net> Signed-off-by:
Sage Weil <sage@newdream.net>
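The pattern this commit describes (count under the lock, allocate with the lock dropped, then re-check and retry if the state changed) can be sketched in userspace C. This is an illustrative model, not the ceph code: `struct lock_table`, the C11 `atomic_flag` used as a stand-in spinlock, and `snapshot_locks()` are all assumptions made for the example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical table of lock records guarded by a spinlock stand-in. */
struct lock_table {
    atomic_flag guard;
    int count;
    int *entries;
};

static void table_lock(struct lock_table *t)
{
    while (atomic_flag_test_and_set(&t->guard))
        ; /* spin */
}

static void table_unlock(struct lock_table *t)
{
    atomic_flag_clear(&t->guard);
}

/* Return a malloc'ed copy of the entries, or NULL on allocation failure. */
static int *snapshot_locks(struct lock_table *t, int *out_count)
{
    assert(t && out_count);
    for (;;) {
        int want;
        int *buf;

        table_lock(t);
        want = t->count;              /* size the buffer under the lock */
        table_unlock(t);

        /* Allocate with the lock dropped: allocation may sleep. */
        buf = malloc((size_t)(want ? want : 1) * sizeof(*buf));
        if (!buf)
            return NULL;

        table_lock(t);
        if (t->count == want) {       /* state unchanged: safe to copy */
            if (want)
                memcpy(buf, t->entries, (size_t)want * sizeof(*buf));
            table_unlock(t);
            *out_count = want;
            return buf;
        }
        table_unlock(t);              /* state changed: retry */
        free(buf);
    }
}
```

The retry loop trades a possible repeated allocation for never allocating while the lock is held, which is the constraint a spinlock imposes.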
-
Yehuda Sadeh authored
This factors out protocol and low-level storage parts of ceph into a separate libceph module living in net/ceph and include/linux/ceph. This is mostly a matter of moving files around. However, a few key pieces of the interface change as well: - ceph_client becomes ceph_fs_client and ceph_client, where the latter captures the mon and osd clients, and the fs_client gets the mds client and file system specific pieces. - Mount option parsing and debugfs setup is correspondingly broken into two pieces. - The mon client gets a generic handler callback for otherwise unknown messages (mds map, in this case). - The basic supported/required feature bits can be expanded (and are by ceph_fs_client). No functional change, aside from some subtle error handling cases that got cleaned up in the refactoring process. Signed-off-by:
Sage Weil <sage@newdream.net>
-
- 25 Aug, 2010 1 commit
-
-
Alan Cox authored
Just scrubbing some warnings so I can see real problem ones in the build noise. For 32-bit we need to coax gcc politely into believing we really honestly intend the casts. Using (u64)(unsigned long) means we cast from a pointer to a type of the right size and then extend it. This stops the warning spew. Signed-off-by:
Alan Cox <alan@linux.intel.com> Signed-off-by:
Sage Weil <sage@newdream.net>
-
- 02 Aug, 2010 1 commit
-
-
Greg Farnum authored
Implement flock inode operation to support advisory file locking. All lock/unlock operations are synchronous with the MDS. Lock state is sent when reconnecting to a recovering MDS to restore the shared lock state. Signed-off-by:
Greg Farnum <gregf@hq.newdream.net> Signed-off-by:
Sage Weil <sage@newdream.net>
-