Andrew Deason [Thu, 31 May 2012 22:45:56 +0000 (17:45 -0500)]
vol: Avoid getting stuck in ATTACHING in attach2
Since commit 5fc2365f, a VNOVOL error early in attach2 meant that we
skipped changing the volume state to anything, and just returned
instead. When we do this, the volume is in VOL_STATE_ATTACHING for
DAFS, and so if we return, the volume will forever be in
VOL_STATE_ATTACHING. The next thing that tries to access the volume
will wait forever for the volume to come out of that state.
So, revert half of 5fc2365f, and transition to ERROR state instead.
This code path should not be hit during normal usage, since a
nonexistent volume access for the fileserver will be detected earlier.
If the volume does not appear to exist at this stage of attachment,
something is wrong with the volume, so this warrants the ERROR state.
For the volserver and other volume utilities, we may hit this when a
request just plain references a nonexistent volume for whatever
reason, but in that case the vp should go away soon. For non-DAFS,
this commit does not change much, since the difference between
error_notbroken and unlocked_error is very small.
The other half of 5fc2365f is not changed, since it is correct. For
VOFFLINE errors at this point, the volume has already been
transitioned to VOL_STATE_UNATTACHED, so it is okay to return. Add a
comment to help make this more explicit.
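The failure mode can be illustrated with a minimal sketch. It is not the real volume package: the names (toy_vol, toy_attach2, the TOY_* states and errors) are hypothetical, and the real DAFS state set is much larger. The point is only that an early return must first move the volume out of the transitional state, otherwise anything waiting on that state blocks forever.

```c
#include <assert.h>

/* Hypothetical, simplified volume states (assumption, not the DAFS set). */
enum toy_state { TOY_UNATTACHED, TOY_ATTACHING, TOY_ATTACHED, TOY_ERROR };
enum toy_err { TOY_OK = 0, TOY_VNOVOL, TOY_VOFFLINE };

struct toy_vol {
    enum toy_state state;
};

/* Returns nonzero if a waiter would block on this volume. */
int toy_waiter_would_block(const struct toy_vol *vp)
{
    return vp->state == TOY_ATTACHING;
}

/*
 * Sketch of one attach step. 'err' stands in for the result of reading the
 * volume header early in attachment.
 */
enum toy_err toy_attach2(struct toy_vol *vp, enum toy_err err)
{
    assert(vp->state == TOY_ATTACHING);
    switch (err) {
    case TOY_VNOVOL:
        /* Do not just return: leave the transitional state first. */
        vp->state = TOY_ERROR;
        return err;
    case TOY_VOFFLINE:
        /* The commit says the real code is already UNATTACHED here;
         * the sketch makes that transition explicit. */
        vp->state = TOY_UNATTACHED;
        return err;
    default:
        vp->state = TOY_ATTACHED;
        return TOY_OK;
    }
}
```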
Anders Kaseorg [Tue, 23 Jul 2013 18:30:20 +0000 (14:30 -0400)]
Do not expose afs_assert.h from other public headers
afs_assert.h redefines the standard assert macro, which is evil and
breaks some applications that might want to include our public headers
(e.g. some versions of Cython). This was fixed on master by commit
cac74242728ad97e3ce9cef0a949d58c237250f6, which removes afs_assert.h
entirely and adds opr_Assert. Since that patch may be too invasive
for 1.6.x, here’s a minimal patch that just stops exposing
afs_assert.h from our other public headers.
Change-Id: I39a7b9ae8d43cfe0059e10e47ce4b1c22e01c544 Signed-off-by: Anders Kaseorg <andersk@mit.edu>
Reviewed-on: http://gerrit.openafs.org/10096 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Andrew Deason <adeason@sinenomine.net> Reviewed-by: Michael Meffie <mmeffie@sinenomine.net> Reviewed-by: Jeffrey Altman <jaltman@your-file-system.com> Reviewed-by: Derrick Brashear <shadow@your-file-system.com> Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de>
Derrick Brashear [Mon, 21 Nov 2011 17:06:59 +0000 (12:06 -0500)]
ihandle: don't keep reallyclosing future fds
Given that we can mark an fd invalid for any future use, once we have
done so for all fds, the ih_reallyclose is done. Do not persist the
setting to the detriment of new fds.
If a callback race has been lost cm_MergeStatus is not executed.
In that case either the activeRPC count should not be incremented
or must be decremented to indicate that the current call has been
completed.
If the CcPurge operation fails or cannot be performed, in addition
to setting the purge on close flag, set the verify data flag. This
ensures that the next attempt to access the file will retry the
purge.
If the redirector is using Direct IO servicing there are no extents
in use. Skip the AFSFlushExtents, AFSTearDownExtents, and related
calls unless extent processing is in use. This will reduce lock
contention and reduce cpu processing.
Andrew Deason [Thu, 31 May 2012 21:41:15 +0000 (16:41 -0500)]
DAFS: Preattach, not attach, in FSYNC_Drop
FSYNC_Drop currently attaches volumes that were checked out by the
dropped fssync handler, but not checked back in, in order to make the
volume available again. For DAFS, however, a full attachment is
unnecessary; just preattach instead.
Eliminate pointless changes between nbsd50/60 params
param.nbsd60.h removed AFS_64BIT_ENV and added an empty expression
to an #if.
AFS_64BIT_ENV is required for any platform to build (on 1.6)
This is a 1.6-only change. On master, AFS_64BIT_ENV was removed entirely
in commit fc9aa428.
Change-Id: I57d7e2a2ef5ab4bf787a99083d35ba70e240710c
Reviewed-on: http://gerrit.openafs.org/10137 Reviewed-by: Andrew Deason <adeason@sinenomine.net> Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Jonathan A. Kollasch <jakllsch@kollasch.net> Tested-by: Jonathan A. Kollasch <jakllsch@kollasch.net> Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de>
Andrew Deason [Wed, 2 Nov 2011 21:55:49 +0000 (16:55 -0500)]
afs: Do not use separate array for srvAddrs
The array of srvAddr structs we use in afs_LoopServers has indices
unrelated to the indices of conns, rxconns, etc. Several places were
assuming that addr[i] corresponded to conn[i], which is not
necessarily true. So instead, do not use the separate addr array
(except when populating the conn and rxconn arrays), and just get the
srvAddr structure by going through the relevant conn[i].
Reviewed-on: http://gerrit.openafs.org/5790 Reviewed-by: Marc Dionne <marc.c.dionne@gmail.com> Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Derrick Brashear <shadow@dementix.org>
(cherry picked from commit f199ac666195771a02e3ebb040c6e5fe47c58c58)
Change-Id: I70be3c518d2b1ccd51e050532d966a27cf22090f
Reviewed-on: http://gerrit.openafs.org/9434 Reviewed-by: Andrew Deason <adeason@sinenomine.net> Reviewed-by: Derrick Brashear <shadow@your-file-system.com> Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de>
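As an illustration of the srvAddrs change above, here is a minimal sketch of why indexing two arrays with the same i can go wrong, and how looking the address up through the connection avoids it. The struct and function names are hypothetical stand-ins, not the cache manager's real types, and the "negative id means no usable connection" rule is an assumption made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the cache manager's structures. */
struct toy_srvaddr { int id; };
struct toy_conn { struct toy_srvaddr *sa; };

/*
 * Fill conns[] from addrs[], skipping addresses without a usable connection
 * (modelled here by a negative id). Afterwards conns[i] need not correspond
 * to addrs[i].
 */
size_t toy_build_conns(struct toy_srvaddr *addrs, size_t naddrs,
                       struct toy_conn *conns)
{
    size_t i, n = 0;
    for (i = 0; i < naddrs; i++) {
        if (addrs[i].id < 0)
            continue;
        conns[n++].sa = &addrs[i];
    }
    return n;
}

/* Look up the address through the connection, not through addrs[i]. */
struct toy_srvaddr *toy_addr_for(const struct toy_conn *conns, size_t nconns,
                                 size_t i)
{
    assert(i < nconns);
    return conns[i].sa;
}
```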
Michael Meffie [Tue, 9 Apr 2013 08:00:16 +0000 (04:00 -0400)]
libafs: initialize hard mount last errors
Initialize the values of the server last errors
introduced in commit 94a8ce970d57498583e249ea61725fce1ee53a50
to avoid logging garbage for the last error codes.
Jeffrey Altman [Mon, 26 Aug 2013 00:07:44 +0000 (20:07 -0400)]
Windows: Hold Fcb Resource across CcPurgeSection
Now that the Fcb Resource and SectionObjectResource are held in
the FastIo pathway and the Trend Micro deadlock has been addressed
by holding a reference on the FileObject it is time to fix the
lock acquisition ordering. For each CcPurgeSection call the
Fcb Resource will be held exclusive before the SectionObjectResource.
Rod Widdowson [Sun, 25 Aug 2013 20:20:28 +0000 (13:20 -0700)]
Windows: Strip out unused ModWriter Fastio code
The code is no longer used (the fcb->PagingIO resource is taken for
us by the modwriter) so we strip it out to save others from making changes
and then remembering/discovering that this code isn't being used.
Rod Widdowson [Sun, 25 Aug 2013 16:16:39 +0000 (09:16 -0700)]
Windows: Pin the Cc FileObject during section create.
This means that if we purge the data cache while the section is being
created then the MJ_CLOSE will not happen until we unpin the FO.
Thus we can drop any embarrassing locks prior to the close and
meddling antivirus products can do odd stuff in the close path.
Note that there may not be a file object, but in that case there
will be no close on the purge since any CcInitialize operations
will wait on us dropping the SOP lock exe - hence the SOP cannot
be set up.
Also note that this only applies to the data section,
but we do not purge the image section.
Refactor AFSPerformObjectInvalidate so that all of the non-DIRECT_IO
processing variables are in the Extents processing section. Remove
all references to Extents processing from the DIRECT_IO block.
Jeffrey Altman [Thu, 22 Aug 2013 21:50:39 +0000 (17:50 -0400)]
Windows: Refactor AFSVerifyEntry AFSValidateEntry
Inside a big switch statement it is hard to follow when there
are multiple 'break' exits within a 'case'. Reorganize the code
so that there is only a single exit for the FILE type. Unnecessary
blocks are removed as well.
Section Object Resource acquires and releases are lost in the
noise of all of the rest of the locks. Introduce a dedicated
subsystem just for Section Objects.
Jeffrey Altman [Wed, 21 Aug 2013 16:27:35 +0000 (12:27 -0400)]
Windows: Call AFSExceptionFilter for all exceptions
In many cases we capture the exception record and the Exception Code
as ntStatus and move on with life. This patchset changes that.
All exceptions are passed to AFSExceptionFilter so we do not miss
anything.
Andrew Deason [Fri, 15 Jun 2012 21:58:42 +0000 (16:58 -0500)]
viced: Restrict RXAFS_FlushCPS to administrators
RXAFS_FlushCPS currently can be run by anyone, including
unauthenticated users. Forcing CPS calculation can be a relatively
resource-intensive operation, though, if done frequently enough, and
should only need to be done by administrators. Thus, only let
administrators use it.
Simon Wilkinson [Sat, 31 Mar 2012 23:21:04 +0000 (19:21 -0400)]
viced: Do error translation for InlineBulkStatus
When a host has requested universal errors, error code conversion
is performed in the CallPostamble. However, the InlineBulkStatus
errorcodes are passed as part of the data set, not as RX errors,
so this translation is not performed.
Fix this so that we also translate error codes that are part of
the InlineBulkStatus response.
Andrew Deason [Thu, 12 May 2011 15:57:09 +0000 (10:57 -0500)]
viced: Enable NAT ping on hosts
Turn on NAT ping on the Rx connection for the callback channel for
hosts. This should help improve behavior for clients behind NATs and
stateful firewalls, even for clients that predate NAT ping
functionality.
Reviewed-on: http://gerrit.openafs.org/4646 Tested-by: Andrew Deason <adeason@sinenomine.net> Reviewed-by: Derrick Brashear <shadow@dementia.org>
(cherry picked from commit aafdc08cfc49da4c23ecd91f9e690fd70e95df55)
Change-Id: I428b6648276ec49fd4f003b3cf2d88a07c8aa1d9
Reviewed-on: http://gerrit.openafs.org/9420 Reviewed-by: Derrick Brashear <shadow@your-file-system.com> Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de> Reviewed-by: Andrew Deason <adeason@sinenomine.net> Tested-by: BuildBot <buildbot@rampaginggeek.com>
Mark Vitale [Wed, 13 Mar 2013 02:13:20 +0000 (22:13 -0400)]
dafs: prevent corruption in large fsstate.dat files
If a write to the fsstate.dat file would exceed the current
size of the file (multiples of FS_STATE_INIT_FILESIZE (8MiB)),
we call fs_stateResizeFile. This un-mmaps, truncates, and
re-mmaps the file. Unfortunately, fs_stateMapFile() resets the
state->mmap.offset and .cursor, so any writes in flight over
the resize will overwrite the first bytes of the file (and leave
zeros in the file where the data should have been written).
Upon return from the write that caused a file resize, the offset
is eventually corrected and the state dump continues with a
silent failure. Eventually the state dump completes and the
file header is rewritten; this may conceal some or all of
the overwrite damage at offset 0. However, any zeros near the 8MiB
offset (and its multiples) remain as corruption.
Add a flag to fs_stateMapFile() to allow the caller to specify if
the offset and cursor should be preserved. Modify fs_stateResizeFile()
to use this capability.
testing note: temporarily reduced FS_STATE_INIT_FILESIZE to 256 bytes
during testing in order to make the problem easier to reproduce.
This problem would normally occur only on relatively large/active
DAFS fileservers.
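The resize problem can be sketched without mmap at all. In the example below a heap buffer plays the role of the mapping, and all names (toy_map, toy_map_file, toy_write, the preserve flag) are hypothetical; the real fs_stateMapFile() signature is not shown in this log. The sketch only demonstrates the idea of a flag that tells the (re)mapping routine to keep the write cursor instead of resetting it.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the mapped state file. */
struct toy_map {
    char *base;
    size_t size;
    size_t cursor;   /* next write position */
};

/* (Re)establish the mapping. If preserve is zero the cursor is reset to 0. */
int toy_map_file(struct toy_map *m, size_t new_size, int preserve)
{
    char *p = realloc(m->base, new_size);
    if (p == NULL)
        return -1;
    if (new_size > m->size)
        memset(p + m->size, 0, new_size - m->size);
    m->base = p;
    m->size = new_size;
    if (!preserve)
        m->cursor = 0;
    return 0;
}

/* Append len bytes, growing the mapping when the write would not fit. */
int toy_write(struct toy_map *m, const void *buf, size_t len)
{
    assert(buf != NULL);
    if (m->cursor + len > m->size) {
        size_t want = m->size ? m->size * 2 : 16;
        while (want < m->cursor + len)
            want *= 2;
        /* preserve = 1: a resize in the middle of a dump keeps its place */
        if (toy_map_file(m, want, 1) != 0)
            return -1;
    }
    memcpy(m->base + m->cursor, buf, len);
    m->cursor += len;
    return 0;
}
```

Calling toy_map_file() with preserve set to 0 from toy_write() reproduces the shape of the bug in this sketch: the bytes of the write that triggered the resize land at offset 0, and the region where they belonged stays zero-filled.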
Simon Wilkinson [Sun, 8 Apr 2012 12:58:25 +0000 (13:58 +0100)]
fileserver: Fix NeverAttach support
Commit 35becabed870d4bfe49abaa499d99a3ffb0a2d31 added support for
the /vicepXX/NeverAttach file. However this code only appears to work on
Linux. It fails build testing on (at least) Mac OS X, FreeBSD, and AIX.
Modify the code so that the NeverAttach call uses the same variable to
locate the path of the partition as the AlwaysAttach call does.
Michael Meffie [Thu, 7 Jun 2012 18:46:04 +0000 (14:46 -0400)]
libafs: fs flushall for unix cm
Implement the fs flushall command on the unix cache manager to flush
all volume data. Uses a new common pioctl code point VIOC_FLUSHALL (14),
registered with the grand.central.org assigned numbers.
Michael Meffie [Thu, 7 Jun 2012 16:58:54 +0000 (12:58 -0400)]
libafs: use afs_ResetVCache in flush volume data
Remove some code duplication by using afs_ResetVCache
in the flush volume data pioctl. Adds a flag to
ResetVCache to avoid unneeded calls to purge dnlc
when resetting all the vcaches in a volume.
Adds freeing of vcache link data in the flush volume
data pioctl.
Andrew Deason [Mon, 1 Nov 2010 20:34:26 +0000 (15:34 -0500)]
Cleanup VOffline log message for non-DAFS
Commit fd592c7674d4aa44dda90998b54d7b56947f6ed8 fixed the 'Volume X
(Y) is now offline' message for DAFS, but the same problem persists
for non-DAFS. Fix the non-DAFS case.
Andrew Deason [Thu, 3 Feb 2011 22:11:38 +0000 (16:11 -0600)]
volser: Do not reset copyDate in ReClone
When we ReClone in the volserver, do not reset the clone's copyDate to
the current time. If we retain the copyDate between ReClone
operations, then we can know when the clone was first created (which
also makes local RO clones more consistent with remote RO sites).
Simon Wilkinson [Thu, 19 May 2011 17:19:29 +0000 (18:19 +0100)]
vlserver: Use correct base value when replacing
When we're removing existing address entries the code calculates
a base and index value for each entry that we're removing an address
from. However, it then _uses_ a previously calculated base value,
with the new index. This works fine if the old base and the new base
match, but if they don't, chaos will ensue.
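A minimal sketch of the pattern, assuming a table split into fixed-size blocks (TOY_PER_BLOCK and the toy_* names are hypothetical, not the vlserver's layout): the base and the index must both come from the same slot number. Reusing a base computed for an earlier slot with the index of the current one writes into the wrong block.

```c
#include <assert.h>

#define TOY_PER_BLOCK 4   /* hypothetical entries per block */

/* Split a flat slot number into a block number (base) and an offset (index). */
void toy_split(int slot, int *base, int *index)
{
    assert(slot >= 0);
    *base = slot / TOY_PER_BLOCK;
    *index = slot % TOY_PER_BLOCK;
}

/* Clear one slot, using the base computed for that same slot. */
void toy_clear(int table[][TOY_PER_BLOCK], int slot)
{
    int base, index;
    toy_split(slot, &base, &index);
    table[base][index] = 0;
}
```

With slots 1 and 6, a stale base of 0 combined with the new index 2 would have cleared table[0][2] instead of table[1][2].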
Andrew Deason [Fri, 21 May 2010 20:54:33 +0000 (15:54 -0500)]
vlserver: Access cache via vl_ctx
The vlserver application-level ubik cache (which consists of
HostAddress, ex_addr, and cheader) is currently being accessed via
global variables everywhere. Instead, access these via the new vl_ctx
struct that is passed to functions during a transaction, so we have
the ability to modify the cache without making all changes visible as
we change it.
Andrew Deason [Fri, 21 May 2010 16:12:50 +0000 (11:12 -0500)]
vlserver: Add a struct for trans-specific data
Instead of passing a ubik_trans pointer to many functions inside the
vlserver, pass a vlserver-defined vl_ctx struct, so we can add new
things to keep track of in a transaction that are not part of ubik.
Andrew Deason [Wed, 12 Dec 2012 22:14:55 +0000 (16:14 -0600)]
LINUX: Avoid multiple d_invalidate loops
Currently, in afs_linux_lookup, we put an artificial limit on how many
times we loop through all dentry aliases, trying to d_invalidate all
of them. Instead of using an arbitrary limit, we can just go through
all of them once, by using d_prune_aliases. This should be faster, and
removes some of the logic required here.
Note that this does remove our check for DCACHE_DISCONNECTED in each
alias' d_flags. This should not be a problem, since we will still use
any remaining DCACHE_DISCONNECTED dentry via d_splice_alias if one
still exists.
Reviewed-on: http://gerrit.openafs.org/8751 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Chas Williams - CONTRACTOR <chas@cmf.nrl.navy.mil> Reviewed-by: Marc Dionne <marc.c.dionne@gmail.com> Reviewed-by: Derrick Brashear <shadow@your-file-system.com>
(cherry picked from commit 370aaaeafa43f804b0a5286d92b4ec5f1ccb62be)
Change-Id: I1aa70afe8268852c676f241e0189bc010ad757aa
Reviewed-on: http://gerrit.openafs.org/9288 Reviewed-by: Andrew Deason <adeason@sinenomine.net> Reviewed-by: Marc Dionne <marc.c.dionne@gmail.com> Reviewed-by: Derrick Brashear <shadow@your-file-system.com> Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de> Tested-by: BuildBot <buildbot@rampaginggeek.com>
Andrew Deason [Thu, 4 Oct 2012 20:49:56 +0000 (15:49 -0500)]
DAFS: VRS_r with VOL_SALVAGE_NO_OFFLINE in attach2
One caller of VRequestSalvage_r in attach2 was not passing the
VOL_SALVAGE_NO_OFFLINE flag. This really should be passed for every
place that manually sets vp->nUsers = 0, since then the VPutVolume_r
handlers will never fire.
Anders Kaseorg [Tue, 23 Jul 2013 18:37:26 +0000 (14:37 -0400)]
volume_inline.h: Down with assert, again
Commit 34767c6a0f914960c9a1efabe69dd9c312a2b400 replaced all assert
calls in this file with osi_Assert, but shortly thereafter, commit
db6ee95864a8fc5f33b7e95c19c8ff5058d37e92 added VTimedWaitStateChange_r
with two new assert calls. These are precarious in a public header;
fix them to osi_Assert like the ones in VWaitStateChange_r.
Signed-off-by: Anders Kaseorg <andersk@mit.edu>
Reviewed-on: http://gerrit.openafs.org/10094 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Derrick Brashear <shadow@your-file-system.com>
(cherry picked from commit 30fa9480dd99ed93fa642dd8ce9746760fb42180)
Change-Id: Id0bc0e75de000cf3e4133aaf31f52d9a565c8d9f
Reviewed-on: http://gerrit.openafs.org/10095 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Andrew Deason <adeason@sinenomine.net> Reviewed-by: Mark Vitale <mvitale@sinenomine.net> Reviewed-by: Michael Meffie <mmeffie@sinenomine.net> Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de>
Andrew Deason [Thu, 1 Nov 2012 16:51:42 +0000 (11:51 -0500)]
afs: Handle easy GetValidDSlot errors
Many callers of GetValidDSlot currently assume they will always get
back a valid dcache, and will panic on getting NULL. However, for many
of these callers, handling the NULL case is quite easy, since the
failure to get a dcache can just result in an error directly, or
obtaining the dcache is best-effort or just an optimization.
This commit just handles the "easy" cases; some other callers require
more complex handling.
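A minimal sketch of an "easy" case, with hypothetical names (toy_dcache, toy_get_slot, toy_read_chunk) and EIO chosen as the error code purely for illustration: the caller checks for NULL and reports an error instead of assuming the lookup always succeeds.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct toy_dcache { int chunk; };

/* Hypothetical slot lookup that may fail, e.g. on a disk cache read error. */
struct toy_dcache *toy_get_slot(struct toy_dcache *slots, int nslots, int i)
{
    if (i < 0 || i >= nslots)
        return NULL;
    return &slots[i];
}

/* Caller that turns a missing slot into an error code instead of aborting. */
int toy_read_chunk(struct toy_dcache *slots, int nslots, int i, int *chunk)
{
    struct toy_dcache *tdc;

    assert(chunk != NULL);
    tdc = toy_get_slot(slots, nslots, i);
    if (tdc == NULL)
        return EIO;      /* assumption: any suitable errno value would do */
    *chunk = tdc->chunk;
    return 0;
}
```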
Andrew Deason [Wed, 31 Oct 2012 20:04:55 +0000 (15:04 -0500)]
afs: Make last_error always useful
Currently we record last_error as the last getuerror() we got when
failing to read in a slot in UFSGetDSlot. For kernels that do not have
getuerror(), this variable is currently useless, and we do not record
anywhere what the last error received was (besides logging it via
afs_warn).
So, for non-uerror, just record what 'code' we got, so we at least
have something.
Andrew Deason [Thu, 22 Mar 2012 22:54:12 +0000 (17:54 -0500)]
salvager: Trust inode-based special data over OGM
Currently the salvaging code looks for special inodes, and infers the
volume id and inode type from the OGM data in each special inode file.
However, we can already derive this information from the inode number
itself for the special inode, so if they disagree, use the values
based off of the inode number and correct the OGM data.
The inode number should be more likely to be correct, since that is
how we look up the special inode from the header when attaching the
volume. It is also impossible to get special inode files with the same
name, so this ensures we don't get duplicates. And for people that go
snooping around /vicepX/AFSIDat even though we tell them not to, it
seems more likely that they go around 'chmod'ing or 'chown'ing rather
than 'mv'ing.
This change avoids an abort in the salvaging code when the OGM data is
wrong. If we trust the OGM data when it is incorrect, we assume the
special inode file is for a different volume. So when we go to
recreate one of the special files for the volume we're actually
working with, the IH_CREATE fails (from EEXIST) and so we abort.
Andrew Deason [Fri, 23 Mar 2012 18:02:22 +0000 (13:02 -0500)]
namei: Abstract out OGM functions a bit more
Add GetWinOGM and SetWinOGM for getting and setting the
Windows-equivalent of the Unix OGM data. Make those and CheckOGM use
GetFileTime/SetFileTime so we can operate just via an FD_t, without
needing the full pathname. Modify the NT namei_icreate to use
SetWinOGM.
Andrew Deason [Wed, 31 Jul 2013 20:58:41 +0000 (15:58 -0500)]
budb: Do not use garbage cellinfo
If the -servers option is given, we never initialize cellinfo or the
clones array. So, don't give the cellinfo structure or the clones
array to ubik in that case, or we may crash or do other weird things.
This issue appears to have been introduced in commit fc4ab52e.
Michael Meffie [Mon, 10 Dec 2012 23:00:25 +0000 (18:00 -0500)]
xstat: length check cm call info
Define the cm xstat function call counters with an xmacro to avoid
duplicating the list of cm function names. This obviates the need
to update xstat_cm_test.c when new function names are added to the
cm xstat collection id 0.
Check the number of returned records when printing the function call
counts to avoid over-running when a newer xstat_cm_test client
receives data from an older cm.
Reviewed-on: http://gerrit.openafs.org/8741 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Chas Williams - CONTRACTOR <chas@cmf.nrl.navy.mil> Reviewed-by: Derrick Brashear <shadow@your-file-system.com>
(cherry picked from commit 09c0484fd8878797957f7ff5936c542a0f6332c4)
Change-Id: I622a4f16cbb102962199f26e5431b04ea381d5fe
Reviewed-on: http://gerrit.openafs.org/9065 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Andrew Deason <adeason@sinenomine.net> Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de>
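The xmacro technique mentioned in the xstat change above can be sketched as follows. The list contents (fetch, store, lookup) and all TOY_/toy_ names are hypothetical; the real collection has many more entries. A single list generates both the enum and the name table, and the print count is clamped to what the peer actually returned.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical function list; adding a name here updates everything below. */
#define TOY_CM_CALLS \
    X(fetch) \
    X(store) \
    X(lookup)

/* One enum value per name, plus a trailing count. */
#define X(name) TOY_##name,
enum toy_call { TOY_CM_CALLS TOY_NCALLS };
#undef X

/* Matching name strings, generated from the same list. */
#define X(name) #name,
const char *toy_call_names[] = { TOY_CM_CALLS };
#undef X

/*
 * Number of counters that are safe to print: never more than the peer sent
 * (nreturned) and never more than this client has names for.
 */
size_t toy_printable(size_t nreturned)
{
    assert(sizeof(toy_call_names) / sizeof(toy_call_names[0])
           == (size_t)TOY_NCALLS);
    return nreturned < (size_t)TOY_NCALLS ? nreturned : (size_t)TOY_NCALLS;
}
```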
Andrew Deason [Tue, 2 Oct 2012 19:38:20 +0000 (14:38 -0500)]
afs: Avoid tracking file locks for RO volumes
Advisory file locks for RO volumes don't make a lot of sense, since
there are no possible writes to worry about. The fileserver already
does not track these, so don't even bother processing them in the
client.
Simon Wilkinson [Tue, 19 Feb 2013 17:53:11 +0000 (17:53 +0000)]
libafscp: Actually return callback from FindCallback
Fix FindCallback so that it actually returns the callback that it
found. This requires changing the function prototype so that the
third parameter is passed by reference, and updating the single
call site.
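One plausible shape of such a bug, shown as a sketch with hypothetical names (the real FindCallback prototype is not reproduced in this log): a pointer parameter passed by value cannot carry a result back to the caller, whereas a pointer to the caller's pointer can.

```c
#include <assert.h>
#include <stddef.h>

struct toy_cb { int fid; int expires; };

/*
 * Buggy shape: 'ret' is a copy of the caller's pointer, so assigning to it
 * has no effect outside this function.
 */
int toy_find_by_value(struct toy_cb *tab, size_t n, int fid, struct toy_cb *ret)
{
    size_t i;
    for (i = 0; i < n; i++) {
        if (tab[i].fid == fid) {
            ret = &tab[i];   /* lost on return */
            (void)ret;
            return 0;
        }
    }
    return -1;
}

/* Fixed shape: the caller passes the address of its pointer. */
int toy_find_by_ref(struct toy_cb *tab, size_t n, int fid, struct toy_cb **ret)
{
    size_t i;
    assert(ret != NULL);
    for (i = 0; i < n; i++) {
        if (tab[i].fid == fid) {
            *ret = &tab[i];
            return 0;
        }
    }
    return -1;
}
```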
Mark Vitale [Fri, 21 Dec 2012 22:56:14 +0000 (17:56 -0500)]
dafs: preattach should wait for exclusive states
In rare circumstances an FSYNC_VOL_ON operation may fail silently,
leaving the volume in its previous state. The only clue is a FileLog
message "volume <nnnn> not in quiescent state".
This is caused by a race condition in the volume package: an
FSYNC_VOL_ON operation is attempting to preattach a volume
(in VPreAttachVolumeByVp_r()) at the same time a fileserver RPC
(e.g. FetchStatus) is detaching the volume (in VReleaseVolumeHandles_r())
at the conclusion of attach2() logic.
The fix calls VWaitExclusiveState_r() before calling
VPreAttachVolumeByVp_r().
Change-Id: Ib66859381d29311fda3e08984dcb740eadafb340
Reviewed-on: http://gerrit.openafs.org/8814 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Andrew Deason <adeason@sinenomine.net> Reviewed-by: Derrick Brashear <shadow@your-file-system.com>
(cherry picked from commit 1f891b622e9b32a068082087eae9d787057f7f00)
Reviewed-on: http://gerrit.openafs.org/9070 Reviewed-by: Mark Vitale <mvitale@sinenomine.net> Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de>
Andrew Deason [Wed, 21 Aug 2013 22:07:14 +0000 (17:07 -0500)]
viced: Clarify comment explaining cba sorting
The current comment here is very brief; it may not be immediately
clear to a reader why we are sorting these, and so why we need the
given CBAs in an array. Expand on it a bit.
Note that it seems like it might be possible to refactor multi_Rx to
not require all calls to be created before any packets are sent. If
multi_Rx were changed to send data as we create calls, it may be
possible to eliminate this sorting, and allow for slightly more
efficient callback traversal when breaking callbacks.
Jeffrey Altman [Sat, 17 Aug 2013 14:18:53 +0000 (10:18 -0400)]
Windows: Cap Cache Size on X86
Since we know the cache size cannot be an arbitrary size because it
must fit into contiguous process memory and because it is difficult
to compute the actual size limit, cap the size to 716800KB.
Jeffrey Altman [Fri, 16 Aug 2013 19:36:32 +0000 (15:36 -0400)]
Windows: Do not recycle deleted scache on refcnt 0
If the scache object with CM_SCACHEFLAG_DELETED set is recycled
then the deleted state is lost and the cache manager cannot prevent
unnecessary FetchStatus queries to the file server.
Jeffrey Altman [Fri, 16 Aug 2013 16:01:55 +0000 (12:01 -0400)]
Windows: Do not remove scp from hash table on deletion
If the CM_SCACHEFLAG_DELETED flag is going to have any benefit, the
cm_scache object must not be removed from the hash table in response
to a VNOVNODE error. Otherwise, a new cm_scache object is allocated,
the CM_SCACHEFLAG_DELETED is not found, and a new callback request
is issued to the file server which in response returns VNOVNODE.
Do this enough times and the abort threshold is triggered and then
the application becomes very unhappy with performance.
Change-Id: I8570370905fa4c3bbdd72f5535329cfab5bebf1a
Reviewed-on: http://gerrit.openafs.org/10121 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Derrick Brashear <shadow@your-file-system.com> Reviewed-by: Andrew Deason <adeason@sinenomine.net> Tested-by: Jonathan A. Kollasch <jakllsch@kollasch.net> Reviewed-by: Jonathan A. Kollasch <jakllsch@kollasch.net> Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de>
Andrew Deason [Thu, 27 Jan 2011 19:13:21 +0000 (13:13 -0600)]
afscp: Fix -s option for writes
When writing to AFS with afscp, the -s option was sleeping before any
StoreData RPCs actually got issued to the fileserver. Move the sleep
to after we have done one rx_Read/rx_Write, so we sleep after starting
to contact the fileserver, to make sleeping while writing more
consistent with sleeping while reading.
Andrew Deason [Wed, 10 Nov 2010 21:35:17 +0000 (15:35 -0600)]
afscp: Add -s option
Add an -s option to afscp, to specify an amount of time to sleep in
the middle of a read or write operation. This can be helpful in
simulating a slow client.
Michael Laß [Sun, 14 Jul 2013 19:31:27 +0000 (21:31 +0200)]
Use -nofork when starting bosserver via systemd
Systemd does not expect the started process to fork unless
"Type=forking" is given. Use -nofork to run BOS in foreground and allow
systemd to track its state.
Reviewed-on: http://gerrit.openafs.org/10087 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Derrick Brashear <shadow@your-file-system.com> Reviewed-by: Marc Dionne <marc.c.dionne@gmail.com> Reviewed-by: Mark Vitale <mvitale@sinenomine.net> Tested-by: Michael Laß <lass@mail.uni-paderborn.de> Tested-by: Ken Dreyer <ktdreyer@ktdreyer.com> Reviewed-by: Ken Dreyer <ktdreyer@ktdreyer.com>
(cherry picked from commit e2d458c11956af6fe721f7151487cb19f07ac16f)
Change-Id: I2b66ca126dbda6c2c616d74b571908c57d1e86e4
Reviewed-on: http://gerrit.openafs.org/10093 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Ken Dreyer <ktdreyer@ktdreyer.com> Reviewed-by: Andrew Deason <adeason@sinenomine.net> Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de>
Andrew Deason [Fri, 8 Feb 2013 23:24:28 +0000 (17:24 -0600)]
afs: Avoid SetupVolume panic
Currently SetupVolume panics if it cannot successfully read a
volumeinfo entry from disk. Try to return an error instead, so we
don't panic the machine.
Reviewed-on: http://gerrit.openafs.org/9094 Reviewed-by: Chas Williams - CONTRACTOR <chas@cmf.nrl.navy.mil> Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Derrick Brashear <shadow@your-file-system.com>
(cherry picked from commit 6f7ae535bbac2a5376358801b7f2c9e072f2d141)
Change-Id: Ib8ea06192bfcd6c2111444db325abc4a90190bbc
Reviewed-on: http://gerrit.openafs.org/9131 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Andrew Deason <adeason@sinenomine.net> Reviewed-by: Chas Williams - CONTRACTOR <chas@cmf.nrl.navy.mil> Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de>
Andrew Deason [Fri, 8 Feb 2013 23:26:32 +0000 (17:26 -0600)]
afs: Move SetupVolume tv initialization after loop
The fields in tv are not used by the loop looking for the given volume
on disk. If we wait until after that loop to initialize the fields in
tv, it is easier to handle errors encountered in the loop.
This should incur no functional change.
Reviewed-on: http://gerrit.openafs.org/9093 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Chas Williams - CONTRACTOR <chas@cmf.nrl.navy.mil> Reviewed-by: Derrick Brashear <shadow@your-file-system.com>
(cherry picked from commit 8f95dc9eb92cb31f9d29eb87daac747f53b5a1cc)
Change-Id: I65f3b647017aebacf28026a648c75b2d279c768e
Reviewed-on: http://gerrit.openafs.org/9130 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Andrew Deason <adeason@sinenomine.net> Reviewed-by: Derrick Brashear <shadow@your-file-system.com> Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de>
Andrew Deason [Wed, 2 Jan 2013 19:09:06 +0000 (14:09 -0500)]
afs: Check dv against localhero aincr
For operations that modify directories, we call afs_LocalHero to
determine if we can perform the directory modification in our local
cache, and avoid fetching the dir blob from the fileserver. Currently,
afs_LocalHero assumes that the DV received from the fileserver is
correct, and will update the cache DV as long as we have a valid
callback on the file.
If for any reason the client cache falls out of sync with what's on
the fileserver, this can cause the client to incorrectly believe its
cache is up to date, since the cached data will be marked with the
newest DV, even if the DV on the server has jumped to be larger than
we expected.
While the client cache should never fall out of sync with the
fileserver, in the past this has been possible due to other bugs
(fileserver idle dead processing and client VNOSERVICE handling).
Assuming that the given DV is correct is also just unnecessarily
fragile, since we can always check if it is correct, so just check it,
and add some comments helping explain what's going on here. Note that
regular file writes effectively already check this.
Note that this change makes use of the 'aincr' argument to
afs_LocalHero, which was previously unused. aincr appears to have been
used for a purpose similar to this before OpenAFS 1.0, but was
removed, possibly accidentally.
It is possible this change negatively affects, or even breaks
(unlikely), functionality with the AFS<->DFS translator. Although
nothing of the sort has been seen, it is difficult to know one way or
the other, due to the lack of available DFS translators.
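The check can be sketched as below. This is a simplified model under stated assumptions, not the afs_LocalHero code: toy_dir and toy_local_hero are hypothetical, and 'aincr' is taken to mean "how far this operation is expected to advance the DV". The cached copy is kept only when the server's DV is exactly the cached DV plus that increment.

```c
#include <assert.h>

/* Hypothetical cached directory state. */
struct toy_dir {
    long dv;        /* data version of the cached contents */
    int valid;      /* nonzero if the cached contents may be used */
};

/*
 * After a directory-modifying RPC, decide whether the local cache can be
 * updated in place. Returns 1 if the local copy may be kept, 0 if it must
 * be refetched from the fileserver.
 */
int toy_local_hero(struct toy_dir *d, long server_dv, long aincr)
{
    assert(aincr > 0);
    if (d->valid && server_dv == d->dv + aincr) {
        d->dv = server_dv;   /* only our own change happened */
        return 1;
    }
    d->valid = 0;            /* the DV jumped unexpectedly; refetch */
    return 0;
}
```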
Marc Dionne [Mon, 8 Jul 2013 14:53:00 +0000 (10:53 -0400)]
Linux 3.11: Convert from readdir to iterate file operation
Convert the readdir function so that it can be used as the new
"iterate" file operation. This new operation is passed a context
that contains a pointer to the filldir function and the offset.
The context is passed into the new dir_emit function that will
call the function specified by the context.
The new dir_emit function returns true on success, so we must be
careful about how we check for failure since this is different
behaviour from what filldir currently does.
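The inverted return convention is easy to get wrong, so here is a userspace sketch of it. These are toy functions, not the kernel API: toy_filldir returns 0 on success as the old callback does, toy_dir_emit returns true on success as the new helper does, and 'room' is a hypothetical stand-in for remaining buffer space.

```c
#include <assert.h>
#include <stdbool.h>

/* Old-style callback: returns 0 on success, nonzero when the buffer is full. */
int toy_filldir(int *room)
{
    if (*room == 0)
        return -1;
    (*room)--;
    return 0;
}

/* New-style helper: returns true on success, false when the buffer is full. */
bool toy_dir_emit(int *room)
{
    return toy_filldir(room) == 0;
}

/* Emit up to n entries; returns how many were accepted. */
int toy_iterate(int n, int room)
{
    int emitted = 0;
    assert(n >= 0 && room >= 0);
    while (emitted < n) {
        if (!toy_dir_emit(&room))   /* note the inverted sense vs. filldir */
            break;
        emitted++;
    }
    return emitted;
}
```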
Ben Kaduk [Wed, 17 Jul 2013 00:39:56 +0000 (20:39 -0400)]
Check for over/underflow while allocating PTS ids
The behavior of signed integer over/underflow is undefined in C,
but even if the compiler is nice and just wraps around, we could get
ourselves into trouble later on.
Ben Kaduk [Wed, 31 Jul 2013 00:17:01 +0000 (20:17 -0400)]
Do not use a non-literal format string
Now that UKERNEL's panic() is a proper varargs function (gerrit 9877),
we can use a literal format string "%s" to print the panic message.
clang warns about a non-literal format string, and in some build
environments the warning becomes fatal via -Werror.
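The pattern can be sketched with a hypothetical varargs reporter (toy_panic is not UKERNEL's panic(); it formats into a caller-supplied buffer so the result can be inspected). Passing the message as an argument to "%s" means any '%' characters in it are copied literally rather than interpreted.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical varargs reporter that formats into a caller-supplied buffer. */
int toy_panic(char *out, size_t outlen, const char *fmt, ...)
{
    va_list ap;
    int n;

    assert(out != NULL && fmt != NULL);
    va_start(ap, fmt);
    n = vsnprintf(out, outlen, fmt, ap);
    va_end(ap);
    return n;
}

/* Pass the message as an argument to "%s", never as the format itself. */
int toy_report(char *out, size_t outlen, const char *msg)
{
    return toy_panic(out, outlen, "%s", msg);
}
```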
Andrew Deason [Thu, 1 Aug 2013 19:06:52 +0000 (14:06 -0500)]
DAFS: Remove AFS_DEMAND_ATTACH_UTIL
Currently we have two DAFS-related preprocessor defines in the
codebase: AFS_DEMAND_ATTACH_FS and AFS_DEMAND_ATTACH_UTIL. DAFS_FS is
the symbol for enabling DAFS code, and turns on demand attachment and
all of the related complicated volume handling; it requires pthreads.
DAFS_UTIL is supposed to be used for utilities that interact with DAFS
but do not have pthreads, and so cannot build the relevant threads for
e.g. the VLRU; they therefore do not support demand attachment or many
of the more advanced volume handling techniques.
Having both of these exist is confusing. For example, currently in
partition.c we only initialize dp->volLockFile for DAFS_FS, even
though the structure exists if _either_ DAFS_FS or DAFS_UTIL is
defined. This means when only DAFS_UTIL is defined, volLockFile will
exist in the partition structure, but will be uninitialized!
Amongst other possible issues, this means right now that DAFS_UTIL
users (dasalvager is the only one right now) will try to use an
uninitialized volLockFile whenever they try to use a volume that needs
locking. Since the partition struct is usually initialized to all
zeroes, this means we'll try to issue a lock request for FD 0,
whatever FD 0 is. If FD 0 is not open, we'll fail with EBADF and bail
out. But if FD 0 is open to some random file, the lock will probably
succeed, and we'll proceed without actually locking the volume lock
file. While the fssync volume checkout mechanism still works, the
on-disk locking mechanism protects against race conditions the fssync
volume checkout mechanism cannot protect against, and so handling
volumes in this way is not safe.
This is just one example; there are other issues with the partition
headerLockFile and probably many other things; most instances of
DAFS_FS really should be enabled for DAFS_UTIL as well.
So, instead of trying to account for and fix all of these problems
individually, get rid of AFS_DEMAND_ATTACH_UTIL, and just use
AFS_DEMAND_ATTACH_FS. This means that all relevant code must be
pthreaded, but since the only relevant code is for the dasalvager, we
can just make dasalvager pthreaded. Salvaging does not make use of any
threads or LWPs, so this should not have any side-effects.
Thanks to Ralf Brunckhorst for reporting the issue where we encounter
EBADF when FD 0 is not open, leading to the discovery of this.
Anders Kaseorg [Tue, 23 Jul 2013 18:37:26 +0000 (14:37 -0400)]
volume_inline.h: Down with assert, again
Commit 34767c6a0f914960c9a1efabe69dd9c312a2b400 replaced all assert
calls in this file with osi_Assert (now opr_Assert), but shortly
thereafter, commit db6ee95864a8fc5f33b7e95c19c8ff5058d37e92 added
VTimedWaitStateChange_r with two new assert calls. These are
precarious in a public header; fix them to opr_Assert like the ones in
VWaitStateChange_r.
der-protos.h was generated from Heimdal headers which in turn were
auto-generated. They included a large number of function prototypes
of the form
ret-type func(parm-list, type */* comment */);
where the combination of */* is ambiguous. Does it mean an end comment
followed by a pointer declaration, or a pointer declaration followed by
a begin comment? This combination generates warnings on Windows. The
bug was fixed in Heimdal's code generator. Fix it here by editing
the code.