Simon Wilkinson [Fri, 11 May 2012 20:14:38 +0000 (21:14 +0100)]
c-tap-harness: Fix import paths
Somehow or another, the file list committed as 098e6f141f2234dcd0196096ab6f739db678f746 is missing the tests/
prefix for a number of object files. Reinstate this prefix.
Jeffrey Altman [Thu, 10 May 2012 12:36:33 +0000 (08:36 -0400)]
Windows: Avoid deadlock during "fs memdump"
When the afs redirector is in use, it is possible that "fs memdump"
could be executed while all of the pages in the Windows page cache
are dirty with data that must be purged and flushed to \\afs. In
such a situation it is not safe for afsd_service.exe to hold
global locks such as buf_globalLock, cm_scacheLock, etc. while
performing WriteFile() calls against %TEMP%\afsd_alloc.log if
afsd_alloc.log was opened without the FILE_FLAG_NO_BUFFERING flag.
Doing so can result in a deadlock as it can become impossible for
the Windows page cache to purge data to complete the WriteFile()
as all extent operations block waiting for the global lock to
be cleared.
The correct long term approach would be to use the FILE_FLAG_NO_BUFFERING
flag when opening %TEMP%\afsd_alloc.log. However, this requires that
all writes to the file be performed using buffers that are consistent
with the drive geometry. Such an approach would be incompatible with
the _CrtMemDumpAllObjectsSince() operation and would require a redesign
of the current interfaces. See
The short term fix is to dump the contents without holding the
global locks. This can result in an inconsistent view of the world
but will ensure that deadlocks are avoided. This patchset makes
such a change when the afs redirector is in use.
Andrew Deason [Wed, 2 May 2012 17:11:01 +0000 (12:11 -0500)]
vol: Remove redundant vop check in GetVolume
VAttachVolumeByVp_r (specifically attach_check_vop in attach2) already
handles checking for conflicting vol ops, and gives us VOFFLINE
appropriately. We don't need to check again in GetVolume.
Change-Id: Ibb93d423d3c856dd957a2569412a85698180ff8e
Reviewed-on: http://gerrit.openafs.org/7304
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Tom Keiser <tkeiser@sinenomine.net>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
Simon Wilkinson [Sat, 8 Oct 2011 22:33:37 +0000 (23:33 +0100)]
tests: Use enum rather than #defines for tests
Change the command test so that it uses an enum, rather than #defines
for offsets into the parms array. This is mainly a cosmetic change, but
brings the test suite into line with the way that we're doing stuff in
the "real" code.
Change-Id: Ia9d72e13230edd4fe13af52ba6816cf775693c36
Reviewed-on: http://gerrit.openafs.org/7133
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Andrew Deason <adeason@sinenomine.net>
Reviewed-by: Tom Keiser <tkeiser@sinenomine.net>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
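As an illustration of the pattern described above, here is a minimal sketch of indexing a parameter table through an enum instead of a set of #defines. The enum, its members and the parm_name() helper are hypothetical names chosen for this example; they are not the identifiers used in the actual command test.

```c
#include <stddef.h>

/* Before (sketch): offsets as unrelated preprocessor constants.
 *   #define OPT_FIRST  0
 *   #define OPT_SECOND 1
 *   #define OPT_FLAG   2
 * After: one enum, so the offsets are a named, related set. */
enum cmd_test_parm {        /* hypothetical names */
    PARM_FIRST,
    PARM_SECOND,
    PARM_FLAG,
    PARM_COUNT              /* number of entries, always last */
};

/* Stand-in for a parms array indexed by the enum. */
static const char *parm_names[PARM_COUNT] = {
    [PARM_FIRST]  = "first",
    [PARM_SECOND] = "second",
    [PARM_FLAG]   = "flag",
};

/* Returns the name for an offset, or NULL if it is out of range. */
const char *parm_name(enum cmd_test_parm p)
{
    return (p < PARM_COUNT) ? parm_names[p] : NULL;
}
```

The trailing PARM_COUNT member sizes the array, so adding a parameter cannot silently leave the table too short.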
Andrew Deason [Wed, 2 May 2012 17:07:49 +0000 (12:07 -0500)]
vol: Pay attention to specialStatus after VAVByVp
attach2/VAttachVolumeByVp_r do not alter the yielded error code
according to specialStatus. So, pay attention to specialStatus after
receiving an error from VAttachVolumeByVp_r, to ensure we respond with
the correct error code.
Change-Id: I59e977dd1f0949f8fe5670c7a52429acbfb7d7e9
Reviewed-on: http://gerrit.openafs.org/7303
Reviewed-by: Tom Keiser <tkeiser@sinenomine.net>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
Andrew Deason [Wed, 2 May 2012 16:38:57 +0000 (11:38 -0500)]
vol: Avoid VBUSY/VRESTARTING trick for offline vop
Currently, if GetVolume() finds that the volume we're trying to attach
has a vol op that leaves the volume offline, we do the
VBUSY/VRESTARTING trick as described in CheckVnode(). This doesn't
make any sense for a couple of reasons.
For one, VBUSY/VRESTARTING is not the correct error code to return to
the client when an offline vol op is in progress and vp->specialStatus
is not set; everywhere else we yield VOFFLINE.
Additionally, this block of code is only hit once for a particular vol
op. Once we reach this section, the volume is in UNATTACHED state, and
so on the next iteration of GetVolume we will immediately return
VOFFLINE (or specialStatus). So the CheckVnode-like situation is not
applicable, since we are not returning VBUSY to the same client for 15
minutes; we would return VBUSY once and then return VOFFLINE.
Change-Id: I0e8376df7937fd6bd01f9998371b9289c4ad2618
Reviewed-on: http://gerrit.openafs.org/7302
Reviewed-by: Tom Keiser <tkeiser@sinenomine.net>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
Andrew Deason [Mon, 7 May 2012 20:49:34 +0000 (15:49 -0500)]
fs: Report default storebehind when errors exist
After 904c9fbe, we no longer print out the default store asynchrony
when any of the supplied paths results in a pioctl error. However, if
just one (or a few) of the paths supplied results in an error (for
example, the path does not exist), this need not prevent us from
reporting the default value.
Instead, keep track of whether or not we have a valid value, and try
to determine the default if we haven't already by the end of
StoreBehindCmd, and print it out.
Jeffrey Altman [Sun, 6 May 2012 23:31:03 +0000 (19:31 -0400)]
Windows: Checksum server lists on Volume Errors
For VMOVED, VNOVOL and VOFFLINE checksum the server lists for
the current volume. If the server list changes as a result of
the forced volume location update, do not set the updated flag
which prevents subsequent volume location updates for the current
cm_req object.
This combined with the previous patchset to filter volume locations
based upon the VLSF_NEWREPSITE flag will avoid outages during
vos release operations.
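A minimal sketch of the before/after comparison described above. The checksum function, the use of 32-bit server identifiers and the helper names are assumptions made for illustration; the actual cache manager may summarize its server lists quite differently.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Order-sensitive checksum over a list of server identifiers.
 * The FNV-style mixing is used purely for illustration. */
uint32_t server_list_checksum(const uint32_t *addrs, size_t n)
{
    uint32_t sum = 2166136261u;

    for (size_t i = 0; i < n; i++) {
        sum ^= addrs[i];
        sum *= 16777619u;
    }
    return sum;
}

/* After a forced volume location update: the "updated" flag (which
 * blocks further location updates for this request) is only set when
 * the server list did not change. */
bool may_set_updated_flag(uint32_t before, uint32_t after)
{
    return before == after;
}
```

The caller would checksum the list, force the location update, checksum again, and consult may_set_updated_flag() with the two values.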
Jeffrey Altman [Sun, 6 May 2012 01:37:00 +0000 (21:37 -0400)]
Windows: Track Mixed RO Volume Release State
If the volume location information indicates that a replica site
is VLSF_NEWREPSITE then it implies that some of the replicas are
out of date. Ignore the out of date replicas when constructing
the list and force a volume location list reset every five minutes
while the replica site info is mixed.
Jeffrey Altman [Sun, 6 May 2012 00:46:08 +0000 (20:46 -0400)]
Windows: Make CM resilient to transient VNOVOL
The 1.6.0 and 1.6.1 file servers send transient VNOVOL errors which
are not indicative of the volume not being present. For example,
VNOVOL can be sent during a transition to a VBUSY state prior to
salvaging or when cloning a .backup volume instance. As a result
the cache manager must attempt at least one retry when a VNOVOL is
received but there are no changes to the volume location information.
This patchset records the VNOVOL error in the cm_req_t structure.
If the volume is replicated, the volume's server reference is put into a busy state.
If the volume is not replicated, the thread is paused for two seconds.
In both cases, the request is retried. If the VNOVOL error is received
a second time from the same server, the volume server reference is
deleted as before. This is done to prevent repeated requests to the
VLDB server and the file server that are expected to fail. The server
reference will be restored to the volume on the next volume location
update.
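The retry policy above can be sketched as a small decision function. The struct req_state type is a hypothetical stand-in for the state recorded in cm_req_t, and the action names are invented for this example; only the policy itself comes from the description.

```c
#include <stdbool.h>

enum vnovol_action {
    VNOVOL_MARK_BUSY_RETRY,   /* replicated: mark server ref busy, retry */
    VNOVOL_SLEEP_RETRY,       /* not replicated: pause two seconds, retry */
    VNOVOL_DROP_SERVER        /* second VNOVOL from the same server */
};

struct req_state {            /* hypothetical stand-in for cm_req_t */
    bool saw_vnovol;          /* a VNOVOL was already seen on this request */
    int  vnovol_server;       /* which server sent it */
};

/* Decide what to do when a server answers VNOVOL and the volume
 * location information is unchanged. */
enum vnovol_action on_vnovol(struct req_state *req, int server, bool replicated)
{
    if (req->saw_vnovol && req->vnovol_server == server)
        return VNOVOL_DROP_SERVER;

    req->saw_vnovol = true;
    req->vnovol_server = server;
    return replicated ? VNOVOL_MARK_BUSY_RETRY : VNOVOL_SLEEP_RETRY;
}
```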
Jeffrey Altman [Sat, 5 May 2012 23:11:07 +0000 (19:11 -0400)]
Windows: cm_GetNewSCache drop lock to permit change
In cm_GetNewSCache the entire LRU queue is searched for a
cm_scache_t object that is safe to recycle. If none were found, the LRU
queue was immediately searched again without dropping the
cm_scacheLock or taking a pause. As a result it is quite possible
that a thread about to release a cm_scache_t was blocked from
doing so.
This patchset factors some of the logic a bit differently to
improve readability and adds new log messages to help diagnose
the cause of a problem if no cm_scache_t ever becomes available.
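A minimal, portable sketch of the idea of releasing the lock between scans, using POSIX threads rather than the Windows cache manager's own lock primitives. The struct lru and struct entry types, the attempt limit and the use of sched_yield() as the pause are all assumptions made for illustration.

```c
#include <pthread.h>
#include <sched.h>
#include <stddef.h>

struct entry {
    int in_use;
    struct entry *next;
};

struct lru {
    pthread_mutex_t lock;
    struct entry *head;
};

/* Caller must hold q->lock. */
static struct entry *find_recyclable(struct lru *q)
{
    for (struct entry *e = q->head; e != NULL; e = e->next)
        if (!e->in_use)
            return e;
    return NULL;
}

/* Returns a claimed entry, or NULL after max_attempts empty scans. */
struct entry *get_recyclable(struct lru *q, int max_attempts)
{
    struct entry *e = NULL;

    pthread_mutex_lock(&q->lock);
    for (int attempt = 0; attempt < max_attempts; attempt++) {
        e = find_recyclable(q);
        if (e != NULL) {
            e->in_use = 1;
            break;
        }
        /* Nothing free: drop the lock and yield so that a thread trying
         * to release an entry can make progress before the next scan. */
        pthread_mutex_unlock(&q->lock);
        sched_yield();
        pthread_mutex_lock(&q->lock);
    }
    pthread_mutex_unlock(&q->lock);
    return e;
}
```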
Andrew Deason [Fri, 4 May 2012 22:13:32 +0000 (17:13 -0500)]
ubik: Initialize ubik_callPortal earlier
As of 7caf4143, we call ubeacon_InitServerList* before ubik_callPortal
is set, causing Rx connections to be created to port 0, causing
various problems with communicating with other sites. Initialize
ubik_callPortal to the correct value before calling any such
functions, so we create connections to the right port.
Jeffrey Altman [Fri, 4 May 2012 00:01:22 +0000 (20:01 -0400)]
Windows: FCB cleanup must be done before ObjectInfo
When processing the cleanup and destruction of a File Control Block
the related ObjectInfoCB is required for proper cleanup. Reorganize
the AFSPrimaryVolumeWorkerThread logic to ensure that this is true.
This involves dropping the VolumeCB->ObjectInfoTree.TreeLock around
the AFSCleanupFcb() call. While the lock is released it is possible
for the ObjectInfoCB->OpenReferenceCount to change. Therefore, new
checks must be added after the lock is re-acquired to ensure that
an in-use object is not destroyed.
Jeffrey Altman [Thu, 3 May 2012 23:58:31 +0000 (19:58 -0400)]
Windows: AFSInitFcb STATUS_REPARSE cleanup
If a race is detected when creating a new File Control Block in
AFSInitFcb() the Fcb Header must be torn down and the ExtentsResource
and DirtyExtentsListLock must be deleted prior to freeing the pool
memory.
Jeffrey Altman [Wed, 2 May 2012 22:20:45 +0000 (18:20 -0400)]
Windows: cm_BkgFetch do not impose arbitrary timeout
The afs redirector will queue extent requests for the entire file
if it is being copied to local disk as long as there is enough
page cache space to store it. If the file is 8GB and the bandwidth
from the file server is 100K/second it may take a while to get to
the end of the request queue. Do not arbitrarily time out the
requests.
Jeffrey Altman [Wed, 2 May 2012 22:05:26 +0000 (18:05 -0400)]
Windows: Treat all cached writes as write-through
Treat all writes that are cached in the windows page cache as
write-through requests so that they are delivered immediately to
the AFS cache.
The upside is that the afsd service can begin to store data to the
file server immediately which can be of significant importance when
the AFSCache is larger than the file size and the file size is large
and the bandwidth to the file server is slow. In that situation
the entire file can be written into the windows page cache and
will only be flushed to disk at the last handle close on the file.
The downside is that all data will be written to the file server
including that for files that will later have the delete pending
flag applied.
Jeffrey Altman [Wed, 2 May 2012 21:58:39 +0000 (17:58 -0400)]
Windows: RDR_RequestFileExtentsAsync set current DV
If the buffer returned from cm_GetBuffer() has an offset that is
beyond the serverLength and it has a "bad" data version, set the
data version to the current value. This is for debugging clarity.
Jeffrey Altman [Wed, 2 May 2012 21:52:44 +0000 (17:52 -0400)]
Windows: refactor cm_GetBuffer avoid BIOD construction
Constructing a BIOD is a very expensive operation as it requires
obtaining exclusive locks on each and every buffer in the
collection. The prior code would construct a BIOD for a chunk
worth of buffers and then check to see if the current buffer is
beyond the serverLength or the truncation position. If so, the
buffer is cleared and the buffer is returned as current after
releasing the BIOD. This is very wasteful. Instead, check every
buffer in the BIOD to see if it should be made current or not.
If yes, do so before releasing the BIOD. This permits the construction
of the BIOD to be avoided for the rest of the buffers in the chunk.
Jeffrey Altman [Wed, 2 May 2012 21:42:59 +0000 (17:42 -0400)]
Windows: cm_QueueBKGRequest improvements
Do not add duplicate requests into the queue. Outstanding extent requests
will be re-issued by the afs redirector on a periodic basis while
waiting for them to be satisfied. If they are pending there is no
need to remember them a second time.
Use separate queues for Fetch and Store operations. Store operations
might be blocked on the file server but a Fetch operation might be served
from the cache.
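A rough sketch of the two ideas above: rejecting duplicate requests, and keeping Fetch and Store requests on separate queues. The fixed-size array queue, the request key (file id plus offset) and all names are hypothetical; the real queue structure is not described here.

```c
#include <stdbool.h>
#include <stddef.h>

#define QUEUE_MAX 64

struct bkg_req {              /* hypothetical request key */
    int  file_id;
    long offset;
};

struct bkg_queue {
    struct bkg_req items[QUEUE_MAX];
    size_t count;
};

/* Separate queues, so a blocked Store cannot delay a Fetch. */
struct bkg_queues {
    struct bkg_queue fetch;
    struct bkg_queue store;
};

/* Returns true if the request was queued, false if an identical
 * request is already pending or the queue is full. */
bool bkg_enqueue(struct bkg_queue *q, struct bkg_req r)
{
    for (size_t i = 0; i < q->count; i++)
        if (q->items[i].file_id == r.file_id && q->items[i].offset == r.offset)
            return false;
    if (q->count == QUEUE_MAX)
        return false;
    q->items[q->count++] = r;
    return true;
}
```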
If AFSInitializeProcessCB() fails in AFSProcessCreate() it can
lead to a recursive loop of AFSValidateProcessEntry() ->
AFSProcessCreate() calls. Only call AFSValidateProcessEntry()
if AFSInitializeProcessCB() succeeds. On failure, log an error
to the trace log.
Windows: reorg RDR_CleanupFile to prevent lock leak
RDR_CleanupFile could fail to drop a file lock if the user does
not have write permission on last handle close even if the file
is readonly or there were no dirty extents to be stored. The
error handling would return the error immediately and skip the
file lock release. This patchset changes the logic so that the
user permissions are not tested if the file is located on a readonly
volume or if there are no dirty extents or metadata changes to store.
In addition, if there is an error, skip to unlock processing and
not to function exit processing.
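The control-flow change can be sketched as follows. The struct file_state fields and the -1 error value are hypothetical stand-ins; the point is only that the error path jumps to the unlock step instead of returning early, and that permissions are checked only when there is something to store.

```c
#include <stdbool.h>

struct file_state {           /* hypothetical stand-in */
    bool readonly_volume;
    bool has_dirty_data;
    bool user_can_write;
    bool lock_held;
};

int cleanup_file(struct file_state *f)
{
    int code = 0;

    /* Only test permissions when there is actually data to store. */
    if (!f->readonly_volume && f->has_dirty_data) {
        if (!f->user_can_write) {
            code = -1;        /* hypothetical error value */
            goto unlock;      /* not an early return */
        }
        f->has_dirty_data = false;    /* stand-in for the store */
    }

unlock:
    f->lock_held = false;     /* the file lock is always released */
    return code;
}
```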
Andrew Deason [Thu, 3 May 2012 19:57:08 +0000 (14:57 -0500)]
vos: Default to server confdir for -localauth
For -localauth, we traditionally default to using the server
configuration directory, since that's usually the dir that has the
KeyFile in it. Keep doing that with the new ubik client interface.
Change-Id: I0f7e1ed180874f52c2b91b1ea3f74e763c26cd0c
Reviewed-on: http://gerrit.openafs.org/7324
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Chas Williams - CONTRACTOR <chas@cmf.nrl.navy.mil>
Reviewed-by: Tom Keiser <tkeiser@sinenomine.net>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
Andrew Deason [Thu, 3 May 2012 17:40:40 +0000 (12:40 -0500)]
vos setaddrs: notice unexpected errors
Currently 'vos setaddrs' only prints a message and errors out if the
VL_RegisterAddrs call fails with certain error codes (VL_MULTIPADDR
and RXGEN_OPCODE). But if we get something else like an access error,
we should of course print that out, instead of reporting success.
Change-Id: Id90c65604289651d9f20fb1ab2c706446162f324
Reviewed-on: http://gerrit.openafs.org/7322
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Chas Williams - CONTRACTOR <chas@cmf.nrl.navy.mil>
Reviewed-by: Tom Keiser <tkeiser@sinenomine.net>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
Michael Meffie [Tue, 17 Apr 2012 02:29:24 +0000 (22:29 -0400)]
bozo: increase salvage instance poll rate
Increase the bos client poll rate of the salvager temporary bnode
instance status, from every 5 seconds to 1 second. This reduces the
minimum time bos salvage takes, from 5 seconds to 1 second, which
can add up when doing a large number of volume salvages.
Change-Id: Ia0f48bfabae9442ab0f1b4a6f43df34699892f66
Reviewed-on: http://gerrit.openafs.org/7231
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Andrew Deason <adeason@sinenomine.net>
Reviewed-by: Tom Keiser <tkeiser@sinenomine.net>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
High security mode for integrated logon never was high security.
Its use was deprecated in the 1.5 series and it has no use at all
in the afs redirector world. Remove it.
The get cache params output is supposed to include two values:
. the size of the cache
. the size of the cache in use
Windows no longer has a concept of an unused cache buffer. All
buffers are inserted onto the freelist and are available for
recycling when the AFSCache file is created. Instead of reporting
the used cache space as 0K, report it as the full cache in use.
It is likely to disturb users less.
Andrew Deason [Fri, 27 Apr 2012 17:59:25 +0000 (12:59 -0500)]
vol: A GOING_OFFLINE volume should yield VOFFLINE
Currently, GetVolume treats a volume in the VOL_STATE_GOING_OFFLINE
state the same as VOL_STATE_SHUTTING_DOWN, and so returns VNOVOL for a
GOING_OFFLINE volume, but these states are very different.
GOING_OFFLINE indicates that a volume should soon be in the UNATTACHED
state, so we should treat GOING_OFFLINE the same as UNATTACHED for
returning errors to the user. For UNATTACHED, we return specialStatus
if it's set, or VOFFLINE otherwise; so, just do the same for
GOING_OFFLINE.
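The resulting state-to-error mapping can be sketched as below. The enum members are simplified stand-ins for the VOL_STATE_* values, and the numeric error codes are illustrative only; the real GetVolume logic handles many more states and conditions.

```c
/* Simplified stand-ins for the volume states discussed above. */
enum vol_state {
    ST_ATTACHED,
    ST_UNATTACHED,
    ST_GOING_OFFLINE,
    ST_SHUTTING_DOWN
};

/* Illustrative values for this sketch. */
#define ERR_VNOVOL   103
#define ERR_VOFFLINE 106

/* Error to return to the client for a volume in the given state;
 * 0 means "no error from this check". */
int error_for_state(enum vol_state st, int special_status)
{
    switch (st) {
    case ST_GOING_OFFLINE:    /* will soon be UNATTACHED: treat alike */
    case ST_UNATTACHED:
        return special_status ? special_status : ERR_VOFFLINE;
    case ST_SHUTTING_DOWN:
        return ERR_VNOVOL;
    default:
        return 0;
    }
}
```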
The variable bPurgeExtents was not being set when a DV change
was detected in AFSValidateEntry(). This resulted in the purge
being skipped and old data being left in the cache.
Windows: Directory validation should purge data changes immediately
During AFSEnumerateDirectory() and AFSVerifyDirectoryContent() calls
use AFSPerformObjectInvalidate() instead of AFSInvalidateObject()
to trigger the data purge. This is necessary to avoid a race as
AFSInvalidateObject() will queue a work request that will be performed
after the metadata is updated.
Windows: Flag purge on close if CcPurgeCacheSection fails
CcPurgeCacheSection can fail. If it does, remember that the
purge still needs to be performed by setting the
AFS_FCB_FLAG_PURGE_ON_CLOSE flag on the File Control Block.
Simon Wilkinson [Sun, 22 Apr 2012 17:19:07 +0000 (18:19 +0100)]
tests: More fixes for the vos test
The vos test wasn't running correctly from runtests, as it contained
a relative path which assumed that the CWD was tests/volser, rather
than tests/.
Modify this to use the BUILD environment variable when invoked from
runtests, and also add an exit after the exec(), so that if we do
fail to launch the binary we don't have two processes both running
the same code.
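The reason for the exit after exec() can be shown with a small POSIX sketch. The run_child() helper and the 127 exit status are choices made for this example, not taken from the test itself.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run a program in a child process. Returns its exit status,
 * or -1 if it could not be run or did not exit normally. */
int run_child(const char *path)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;

    if (pid == 0) {
        execl(path, path, (char *)NULL);
        /* Only reached if exec failed. Leave immediately, otherwise
         * the child would carry on running the parent's code. */
        _exit(127);
    }

    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Without the _exit() call, a failed exec would fall through and both processes would continue executing the rest of the test.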
Windows: Add global root to name array if share name
If the share name was resolved by querying the service instead
of finding the entry in the root.afs root directory, construct
a name array in AFSParseName() that includes the AFSGlobalRoot
above the resolved share root directory.
In AFSBackupEntry, check for the case where two volume root entries
appear in sequence without an intervening mount point.
Simon Wilkinson [Sat, 21 Apr 2012 06:43:59 +0000 (07:43 +0100)]
ptserver: Refactor per-call ubik initialisation
The way in which the ubik database is initialised is identical for
all read transactions, and for all write transactions. Rather than
duplicating this code in each call handler, pull it out into two
helper functions - ReadPreamble and WritePreamble.
Simon Wilkinson [Sat, 21 Apr 2012 19:55:23 +0000 (20:55 +0100)]
util: Completely remove get_krbrlm
Commit d85ece0977e043154b7d8f5aef5f4cd972771e8e added a new
mechanism for determining whether a realm is local or not, and
subsequent commits removed all in-tree calls to the now-legacy
functions in get_krbrlm.c.
To avoid confusion, just remove all of these legacy functions, as
we don't want to end up supporting two ways of doing this
operation.
This change is not suitable for pullup to a stable release.
Simon Wilkinson [Wed, 18 Apr 2012 11:46:31 +0000 (12:46 +0100)]
tests: Add a RX functionality test
Use the rxperf performance testing tools to add a couple of simple
RX tests. The first moves 1MByte of data backwards and forwards 30
times. The second starts 30 threads, which each move 1MByte of data
once.
This is by no means an exhaustive test of RX, but the single and
multi-threaded invocations should provide a useful smoke test if
things get very broken.
Simon Wilkinson [Tue, 17 Apr 2012 22:19:17 +0000 (23:19 +0100)]
rxperf: Move into the tools directory
Move the 'rxperf' RX performance testing utility out of the
src/rx/test directory, and into the slightly more visible top level
src/tools/ directory
As this is the first time that rxperf has been built as part of the
default build, make a number of changes so that it will build on all
of our supported platforms.
Simon Wilkinson [Wed, 18 Apr 2012 11:44:43 +0000 (12:44 +0100)]
tests: Explicitly include DES in superuser test
When the hcrypto/des header was removed from our installed headers, it
wasn't added back in to the superuser test. Add it now, so that the test
can build.
Simon Wilkinson [Wed, 18 Apr 2012 11:35:10 +0000 (12:35 +0100)]
Mac OS: Fixed shared library symbol issues
Some of our shared libraries (in particular, roken) build with different
symbols in them depending on the exact configuration options for a
particular platform. This means that not all of the symbols in the map
file may be present within the library. On Mac OS X we have been working
around this by using the "-flat_namespace,-undefined,suppress" linker
options.
However, with Lion this no longer works, as the linker still expects to
find the symbol in the library whose mapfile indicated that it was
present. So, for example, we end up with errors like:
dyld: Symbol not found: _errx
Referenced from: openafs.git/tests/rx/../../src/tools/rxperf/rxperf
Expected in: openafs.git/lib/librokenafs.dylib.1.1
... despite errx actually being provided by the system libraries.
The fix to this is to use the default two level namespace, and change
our behaviour for undefined symbols to 'dynamic_lookup', rather than
'suppress'
Michael Meffie [Mon, 5 Mar 2012 15:47:45 +0000 (10:47 -0500)]
audit: remove static local realms
Remove the static list of local realms and use the
auth interface to do the local realm check. A callback
function is registered by the servers to avoid a circular
dependency between audit and auth.
Simon Wilkinson [Fri, 13 Apr 2012 13:49:59 +0000 (14:49 +0100)]
rx: Use native 64bit data counters
Modify the peer, call and rpc_stats structures to use native 64 bit
types for the bytesSent and bytesRcvd data counters. All of our
platforms support native 64bit quantities now, so there's absolutely
no value in rolling our own.
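To illustrate what "rolling our own" 64-bit counter costs, here is a sketch comparing a hand-built two-word counter with a native uint64_t one. The struct split64 layout and helper names are hypothetical and are not the types that were removed; only the bytesSent and bytesRcvd field names come from the description above.

```c
#include <stdint.h>

/* Hand-rolled 64-bit counter as two 32-bit halves (hypothetical layout). */
struct split64 {
    uint32_t high;
    uint32_t low;
};

void split64_add(struct split64 *c, uint32_t n)
{
    uint32_t old = c->low;

    c->low += n;
    if (c->low < old)         /* low word wrapped: carry into high */
        c->high++;
}

uint64_t split64_value(const struct split64 *c)
{
    return ((uint64_t)c->high << 32) | c->low;
}

/* Native counters: the compiler handles the carry. */
struct peer_stats {
    uint64_t bytesSent;
    uint64_t bytesRcvd;
};

void peer_sent(struct peer_stats *p, uint32_t n)
{
    p->bytesSent += n;
}
```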
Windows: Drop Fcb Resource across SetEOF and SetAllocation
If the file size or allocation is being altered, we must hold
the PagingResource and drop the Fcb Resource. Dropping the
Fcb resource is necessary to avoid a deadlock with TrendMicro's
filter if the size is set to zero and acquiring the PagingResource
is necessary to prevent races now that the Fcb Resource is no
longer held.
Instead of calling CcPurgeCacheSection() in AFSProcessOverwriteSupersede()
as part of the file length truncation to zero, call CcSetFileSizes().
Wait to call CcSetFileSizes() until after the Fcb->Resource has been
dropped but while the Fcb->Header.PagingIoResource is still held.
Make sure that file sizes are restored in the Fcb->Header if the
afsd_service rejects the file update.
Michael Meffie [Tue, 28 Feb 2012 13:50:33 +0000 (08:50 -0500)]
auth: local realms configuration
Add krb.conf and krb.excl support to the auth cell configuration
library. Provide a function to determine if the user is local to the
cell. Provide a function to set the local realms during application
initialization. These changes are intended to replace the functions
afs_krb_get_lrealm and afs_is_foreign_ticket_name.
Simon Wilkinson [Fri, 13 Apr 2012 18:14:44 +0000 (19:14 +0100)]
rx: Remove surplus call to FindPeer
When stats are enabled, rxi_ReadPacket calls FindPeer immediately
the packet is received from the wire. The peer structure that it
gets is used solely to increment a counter, and then thrown away.
Given that FindPeer requires a lock, and a hash lookup, this is
really inefficient.
Instead, delay the compilation of statistics until rxi_ReceivePacket.
Call FindPeer for version and debug packets, which have no associated
connection; otherwise wait until we have found the packet's connection,
and use the peer which is linked from there.
Andrew Deason [Thu, 29 Mar 2012 15:30:47 +0000 (10:30 -0500)]
rx: dec rx_nWaiting on clearing RX_CALL_WAIT_PROC
Currently, a couple of callers (rxi_ResetCall, and
rxi_AttachServerProc) will decrement rx_nWaiting only if
RX_CALL_WAIT_PROC is set for a call, and the call is on a queue
(presumably rx_incomingCallQueue). This can cause an imbalance in
rx_nWaiting if these code paths are reached when, in another thread,
rx_GetCall has removed the call from its queue, but it has not yet
cleared RX_CALL_WAIT_PROC (this can happen while it is waiting for
call->lock). In this situation, rx_GetCall will remove the call from
its queue, wait, and e.g. rxi_ResetCall will clear RX_CALL_WAIT_PROC;
neither will decrement rx_nWaiting.
This is possible if a new call is started on a call channel with an
extant call that is waiting for a thread; we will rxi_ResetCall in
rxi_ReceivePacket, but rx_GetCall may be running at the same time.
This race may also be possible via rxi_AttachServerProc via
rxi_UpdatePeerReach -> TryAttach -> rxi_AttachServerProc while
rx_GetCall is running, but I'm not sure.
To avoid this, decrement rx_nWaiting based on RX_CALL_WAIT_PROC alone,
regardless of whether or not the call is on a queue. This mirrors the
incrementing rx_nWaiting behavior, where rx_nWaiting is only
incremented if RX_CALL_WAIT_PROC is unset for a call, so this should
guarantee that rx_nWaiting does not become unbalanced.
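The balancing rule can be sketched in a few lines. This is a single-threaded illustration with hypothetical names (struct call, CALL_WAIT_PROC, an explicit counter argument); the real code manipulates rx_nWaiting under the appropriate locks.

```c
#define CALL_WAIT_PROC 0x1u   /* stand-in for RX_CALL_WAIT_PROC */

struct call {
    unsigned flags;
    int on_queue;
};

/* Increment only when the flag was not already set. */
void call_start_waiting(struct call *c, int *n_waiting)
{
    if (!(c->flags & CALL_WAIT_PROC)) {
        c->flags |= CALL_WAIT_PROC;
        (*n_waiting)++;
    }
    c->on_queue = 1;
}

/* Decrement based on the flag alone. on_queue is deliberately not
 * consulted, since another thread may already have dequeued the call. */
void call_stop_waiting(struct call *c, int *n_waiting)
{
    if (c->flags & CALL_WAIT_PROC) {
        c->flags &= ~CALL_WAIT_PROC;
        (*n_waiting)--;
    }
}
```

Because both directions key off the same flag, each increment is matched by exactly one decrement regardless of queue membership.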
In rxi_ReceivePacket, if the packet is for a client connection
and there is no call allocated, the conn->conn_call_lock was
leaked. Introduced by 95c38dff3740d7e24971ceb5875c06e7abfce102.