Commit 8c4aa7be authored by Kiran Kumar Modukuri's avatar Kiran Kumar Modukuri Committed by Khalid Elmously

cachefiles: Fix missing clear of the CACHEFILES_OBJECT_ACTIVE flag

BugLink: https://bugs.launchpad.net/bugs/1776254

In cachefiles_mark_object_active(), the new object is marked active and
then we try to add it to the active object tree.  If a conflicting object
is already present, we want to wait for that to go away.  After the wait,
we go round again and try to re-mark the object as being active - but it's
already marked active from the first time we went through and a BUG is
issued.

Fix this by clearing the CACHEFILES_OBJECT_ACTIVE flag before we try again.

Analysis from Kiran Kumar Modukuri:

[Impact]
Oops during heavy NFS + FSCache + Cachefiles

CacheFiles: Error: Overlong wait for old active object to go away.

BUG: unable to handle kernel NULL pointer dereference at 0000000000000002

CacheFiles: Error: Object already active kernel BUG at
fs/cachefiles/namei.c:163!

[Cause]
In a heavily loaded system with big files being read and truncated, an
fscache object for a cookie is being dropped and a new object being
looked. The new object being looked for has to wait for the old object
to go away before the new object is moved to active state.

[Fix]
Clear the flag 'CACHEFILES_OBJECT_ACTIVE' for the new object when
retrying the object lookup.

[Testcase]
Have run ~100 hours of NFS stress tests and have not seen this bug recur.

[Regression Potential]
 - Limited to fscache/cachefiles.

Fixes: 9ae326a6 ("CacheFiles: A cache that backs onto a mounted filesystem")
Signed-off-by: default avatarKiran Kumar Modukuri <kiran.modukuri@gmail.com>
Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
(backported from commit 5ce83d4b)
Signed-off-by: default avatarDaniel Axtens <daniel.axtens@canonical.com>
Acked-by: default avatarKhalid Elmously <khalid.elmously@canonical.com>
Acked-by: default avatarKamal Mostafa <kamal@canonical.com>
Signed-off-by: default avatarKhalid Elmously <khalid.elmously@canonical.com>
parent 6a36dd9e
......@@ -190,6 +190,8 @@ static int cachefiles_mark_object_active(struct cachefiles_cache *cache,
/* an old object from a previous incarnation is hogging the slot - we
* need to wait for it to be destroyed */
wait_for_old_object:
clear_bit(CACHEFILES_OBJECT_ACTIVE, &object->flags);
if (fscache_object_is_live(&xobject->fscache)) {
pr_err("\n");
pr_err("Error: Unexpected object collision\n");
......@@ -251,7 +253,6 @@ static int cachefiles_mark_object_active(struct cachefiles_cache *cache,
goto try_again;
requeue:
clear_bit(CACHEFILES_OBJECT_ACTIVE, &object->flags);
cache->cache.ops->put_object(&xobject->fscache);
_leave(" = -ETIMEDOUT");
return -ETIMEDOUT;
......
Markdown is supported
0%
or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment