From: Andrew Deason
Date: Thu, 26 Apr 2018 17:01:57 +0000 (-0500)
Subject: afs: Avoid GetDCache delays on screwy cache
X-Git-Tag: upstream/1.8.1_pre2^2~12
X-Git-Url: https://git.michaelhowe.org/gitweb/?a=commitdiff_plain;h=56ce248751d804efe664c9af3b62f5e15a026afe;p=packages%2Fo%2Fopenafs.git

afs: Avoid GetDCache delays on screwy cache

Currently, if our afs_AllocDCache call fails in afs_GetDCache, we
retry once per second for 5 minutes. The reasoning is that we're out
of dcache slots, and so if we wait a little while, maybe something
will become freeable and we can continue.

However, afs_AllocDCache can also fail if we have plenty of free
dslots, but we are unable to successfully call afs_GetUnusedDSlot() on
any of them. This can happen if our disk cache is screwed up, and so
waiting and retrying will not make things better (but we'll spew a ton
of "disk cache read error in CacheItems slot" errors in the log each
time, and do so 300 times).

So instead, only do our sleep/retry loop if we actually appear to be
out of free or discarded dslots. Otherwise, just return an error
immediately, since sleeping and retrying will not make anything
better.

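The decision described above can be sketched in isolation as a small
userspace C function. This is only an illustration, not the kernel
code: every sketch_* / SKETCH_* name is hypothetical, the sentinel
value chosen for SKETCH_NULLIDX is an assumption, and the two list
heads simply stand in for afs_freeDCList and afs_discardDCList.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the NULLIDX "empty list" sentinel;
 * the value used here is an assumption made for this sketch. */
#define SKETCH_NULLIDX (-1)

/* Retry budget from the commit message: once per second for 5 minutes. */
#define SKETCH_MAX_RETRIES 300

enum sketch_action {
    SKETCH_RETRY,   /* caller sleeps one second, then retries the lookup */
    SKETCH_FAIL     /* caller gives up and returns an i/o error */
};

/*
 * Decide what to do after a failed slot allocation.
 *
 * free_head and discard_head model afs_freeDCList and afs_discardDCList.
 * *retries models downDCount; it is only bumped on the out-of-slots
 * path, mirroring the patched code.
 */
static enum sketch_action
sketch_on_alloc_failure(int free_head, int discard_head, int *retries)
{
    bool out_of_slots = (free_head == SKETCH_NULLIDX &&
                         discard_head == SKETCH_NULLIDX);

    if (!out_of_slots) {
        /* Slots exist, but none could be used: waiting will not help. */
        return SKETCH_FAIL;
    }
    if (++*retries > SKETCH_MAX_RETRIES) {
        /* We have already waited the full 5 minutes. */
        return SKETCH_FAIL;
    }
    return SKETCH_RETRY;
}
```

In this model, a caller would sleep and retry only on SKETCH_RETRY, so
a damaged cache produces a single failure instead of 300 retries.
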
Reviewed-on: https://gerrit.openafs.org/13033
Reviewed-by: Mark Vitale
Reviewed-by: Michael Meffie
Reviewed-by: Marcio Brito Barbosa
Reviewed-by: Benjamin Kaduk
Tested-by: BuildBot
(cherry picked from commit bec329c1c81d96b5933527f7cdb3638f24833087)

Change-Id: Iaee53eca133985ad5964b61b3641cd8ad2802014
Reviewed-on: https://gerrit.openafs.org/13190
Tested-by: BuildBot
Reviewed-by: Andrew Deason
Reviewed-by: Joe Gorse
Reviewed-by: Marcio Brito Barbosa
Reviewed-by: Benjamin Kaduk
---

diff --git a/src/afs/afs_dcache.c b/src/afs/afs_dcache.c
index 659e5ef31..6dd79a58b 100644
--- a/src/afs/afs_dcache.c
+++ b/src/afs/afs_dcache.c
@@ -1972,25 +1972,34 @@ afs_GetDCache(struct vcache *avc, afs_size_t abyte,
             tdc = afs_AllocDCache(avc, chunk, aflags, NULL);
             if (!tdc) {
                 ReleaseWriteLock(&afs_xdcache);
-
-                /* If we can't get space for 5 mins we give up and bail out */
-                if (++downDCount > 300) {
-                    afs_warn("afs: Unable to get free cache space for file "
-                             "%u:%u.%u.%u for 5 minutes; failing with an i/o error\n",
-                             avc->f.fid.Cell,
-                             avc->f.fid.Fid.Volume,
-                             avc->f.fid.Fid.Vnode,
-                             avc->f.fid.Fid.Unique);
-                    goto done;
+                if (afs_discardDCList == NULLIDX && afs_freeDCList == NULLIDX) {
+                    /* It looks like afs_AllocDCache failed because we don't
+                     * have any free dslots to use. Maybe if we wait a little
+                     * while, we'll be able to free up some slots, so try for 5
+                     * minutes, then bail out. */
+                    if (++downDCount > 300) {
+                        afs_warn("afs: Unable to get free cache space for file "
+                                 "%u:%u.%u.%u for 5 minutes; failing with an i/o error\n",
+                                 avc->f.fid.Cell,
+                                 avc->f.fid.Fid.Volume,
+                                 avc->f.fid.Fid.Vnode,
+                                 avc->f.fid.Fid.Unique);
+                        goto done;
+                    }
+                    afs_osi_Wait(1000, 0, 0);
+                    goto RetryLookup;
                 }
 
-                /*
-                 * Locks held:
-                 * avc->lock(R) if setLocks
-                 * avc->lock(W) if !setLocks
-                 */
-                afs_osi_Wait(1000, 0, 0);
-                goto RetryLookup;
+                /* afs_AllocDCache failed, but not because we're out of free
+                 * dslots. Something must be screwy with the cache, so bail out
+                 * immediately without waiting. */
+                afs_warn("afs: Error while alloc'ing cache slot for file "
+                         "%u:%u.%u.%u; failing with an i/o error\n",
+                         avc->f.fid.Cell,
+                         avc->f.fid.Fid.Volume,
+                         avc->f.fid.Fid.Vnode,
+                         avc->f.fid.Fid.Unique);
+                goto done;
             }
 
             /*