Rod Widdowson [Fri, 28 Dec 2012 14:40:40 +0000 (14:40 +0000)]
Windows: Only allow the local system account to speak to the redirector
When we get the IOCTL_AFS_INITIALIZE_CONTROL_DEVICE IOCTL we check to
see whether the calling process is the LOCAL_SYSTEM_SID (the one that
services run at if they are not running as a specified SID). If we
are not then the initialize fails ACCESS_DENIED.
If the debug build ONLY, setting the AFS_DBG_DISABLE_SYSTEM_SID_CHECK
bit in OpenAFSDebugFlags circumvents this check, allowing interactive
debugging.
Existing code stops two processes (or even handles) from trying to
initialize the system.
Right-shifting a signed int by 24 bits can produce a value outside of
0..0xff due to sign-extension. As a result, in AddressMatch(), the
first bPattern!=255 check can never succeed. Fix by masking with 255
before comparison.
Andrew Deason [Fri, 4 Jan 2013 19:18:40 +0000 (14:18 -0500)]
SOLARIS: Look for ncurses in ncurses/ncurses.h
Solaris 11+ has ncurses.h in ncurses/ncurses.h. Look for it there.
Without this, on Solaris 11.1 we will detect libncurses automatically
(because it lives in /usr/lib), but not ncurses.h (since it is in
ncurses/ncurses.h, not ncurses.h). So, we will fall back to curses.h,
but will try to link to libncurses, which, as you might guess, fails
with various undefined symbols.
Andrew Deason [Fri, 4 Jan 2013 18:47:35 +0000 (13:47 -0500)]
SOLARIS: Avoid areq and auid conflict
On new Solaris (11.1), nfs/auth.h #defines areq and auid to access
some elements inside the nfsauth_arg structure more easily. We have a
lot of functions that use those names as parameters, so the compiler
throws an error (since we have a decl like "struct vrequest
*areq_u.areq").
We cannot avoid including that header, since we need some NFS-related
headers for the NFS xlator, and they pull in nfs/auth.h
unconditionally. So, work around this by undefining areq and auid
afterwards.
Andrew Deason [Wed, 2 Jan 2013 19:09:06 +0000 (14:09 -0500)]
afs: Check dv against localhero aincr
For operations that modify directories, we call afs_LocalHero to
determine if we can perform the directory modification in our local
cache, and avoid fetching the dir blob from the fileserver. Currently,
afs_LocalHero assumes that the DV received from the fileserver is
correct, and will update the cache DV as long as we have a valid
callback on the file.
If for any reason the client cache falls out of sync with what's on
the fileserver, this can cause the client to incorrectly believe its
cache is up to date. Since, the cached data will be marked with the
newest DV, even if the DV on the server has jumped to be larger than
we expected.
While the client cache should never fall out of sync with the
fileserver, in the past this has been possible due to other bugs
(fileserver idle dead processing and client VNOSERVICE handling).
Assuming that the given DV is correct is also just unnecesarily
fragile, since we can always check if it is correct, so just check it,
and add some comments helping explain what's going on here. Note that
regular file writes effectively already check this.
Note that this change makes use of the 'aincr' argument to
afs_LocalHero, which was previously unused. aincr appears to have been
used for a purpose similar to this before OpenAFS 1.0, but was
removed, possibly accidentally.
It is possible this change negatively affects, or even breaks
(unlikely), functionality with the AFS<->DFS translator. Although
nothing of the sort has been seen, it is difficult to know one way or
the other, due to the lack of available DFS translators.
Mark Vitale [Fri, 21 Dec 2012 23:26:18 +0000 (18:26 -0500)]
vol: correct old conditional for IH_CONDSYNC
Two places in the vol package performed IH_CONDSYNC(vp->linkHandle)
only if AFS_NT40_ENV. This was correct when the namei implementation
was windows only; however, this ifdef was apparently overlooked
when namei was implemented for UNIX.
Russ Allbery [Thu, 3 Jan 2013 22:09:02 +0000 (14:09 -0800)]
Make MIN/MAX code in rx/rx_packet.h more readable
Eventually all MIN/MAX code in the tree should be handled uniformly,
but until that day, make this chunk of it more readable and
document the odd exception case for Linux kernel builds.
Russ Allbery [Thu, 3 Jan 2013 21:57:02 +0000 (13:57 -0800)]
Ensure MIN/MAX are defined in userspace builds of rx
The include of <sys/param.h> was removed from rx_packet.h on
Linux 2.6 and later to fix kernel builds with 3.7, which doesn't
have that header in kernel space. However, while kernel space
always provides MIN/MAX defines, userspace relied on the header.
On at least powerpc, no other include chain includes sys/param.h,
so MIN/MAX were left undefined.
Fix this by only skipping the include of <sys/param.h> on Linux
if building in kernel mode.
Andrew Deason [Fri, 28 Dec 2012 18:16:49 +0000 (13:16 -0500)]
viced: initInterfaceAddr_r regardless of ICBS code
Currently we only call initInterfaceAddr_r for a host if a call to
RXAFS_InitCallBackState3 succeeds. However, this leaves the host
without a host->interface structure, which indicates that the host
does not support UUIDs, and is represented by just a single host,port
pair.
But this is not correct; the host probably does have the relevant UUID
associated with it, but it is just not responding. So, with the
current code, we create a uuid-less host structure for a host that
probably has a uuid; that host structure will probably never be used,
and will just get deleted later.
So instead, always call initInterfaceAdd_r. Do it before the ICBS
call, so the host will be findable via UUID as early as possible. If
the ICBS call fails, the host will be marked as 'down' later on.
Andrew Deason [Fri, 28 Dec 2012 17:58:33 +0000 (12:58 -0500)]
viced: Avoid dangling uuid hash table entry
Currently we add a given host to the uuid hash table, then call
RXAFS_InitCallBackState3, and then only initialize the host->interface
structure if the ICBS3 call succeeded.
If the ICBS3 call fails, we have added a host to the uuid hash table,
but the host structure does not contain that uuid. If the host is then
deleted, we will not remove the host from the uuid hash table (since
host->interface is NULL), and so the uuid hash table entry will still
point to the freed host. If that host is then later looked up via that
uuid, we can reference a freed host, which can cause all kinds of
undefined behavior.
So instead, add the host to the uuid hash table at the same time that
we initialize the host->interface structure, inside
initInterfaceAddr_r.
Andrew Deason [Fri, 28 Dec 2012 21:39:15 +0000 (16:39 -0500)]
afs: Avoid unnecessary panic in ShakeLooseVCaches
afs_vcount can change as we traverse the loop. If we successfully
evict something from the cache, afs_vcount goes down, but our loop
variable 'i' stays incremented. For example, if afs_vcount was 100 at
the start of the loop and we kicked out 50 things, by the time we
traverse the entire VLRU, we could have iterated over the loop 100
times, but afs_vcount would still be just at 50.
So, remember what afs_vcount was at the start of the loop, and use
that for our loop limit. Note that vcaches cannot be added to the VLRU
during the execution of this loop, since we're just kicking stuff out.
And nobody else can modify the VLRU but us, since we're holding
afs_xvcache, and if we drop afs_xvcache, we restart the whole eviction
process.
The bug here was introduced by commit bc6dd950, but usually did not
affect Linux until commit 696db866.
Andrew Deason [Wed, 12 Dec 2012 22:14:55 +0000 (16:14 -0600)]
LINUX: Avoid infinite d_invalidate loop
In afs_linux_lookup, we try to invalidate as many dentry aliases as we
can. However, a successful d_invalidate() against a dentry does not
necessarily mean that the dentry will not show up in a later
d_find_alias() call. This is because d_invalidate() does not remove
the dentry from the d_alias list, but just removes it from the hash
chain. dput() is what removes it from the d_alias list when all of the
references go away. If a reference is grabbed between our
d_invalidate() and dput() calls, the dentry will stay on the d_alias
list.
We will then retry the loop, and we will get the same dentry back from
d_find_alias(). Running d_invalidate() on an unhashed dentry is a
no-op, so we don't change anything in the loop. We will retry again
and again, looping forever and spinning the CPU.
To avoid this, just call d_prune_aliases instead, instead of
repeatedly looping through the alias list ourselves. Note that this
does remove our check for DCACHE_DISCONNECTED in each alias' d_flags.
This should not be a problem, since we will still use any remaining
DCACHE_DISCONNECTED dentry via d_splice_alias if one still exists.
Ben Kaduk [Thu, 8 Nov 2012 23:40:57 +0000 (18:40 -0500)]
Add configure option to not install kauth
Per http://www.openafs.org/pages/no-more-des.html the kaserver
suite of utilities is deprecated and is not supposed to be built
anymore in this post-1.6 world.
Not building them at all requires some effort, but not installing
them is pretty easy. Do the easy part for now, and leave the hard
parts for a follow-up commit.
Andrew Deason [Wed, 19 Dec 2012 00:49:49 +0000 (18:49 -0600)]
viced: Sanity check file link count during CoW
A few ihandle bugs in the past have caused the CopyOnWrite code to
open cached file handles for files which have been deleted. When we
CoW, both of the files we're dealing with had better actually be on
disk, so bail out and flag an error if either of them appear unlinked.
Andrew Deason [Fri, 21 Dec 2012 18:30:24 +0000 (12:30 -0600)]
ihandle: Add FDH_ISUNLINKED
Add the FDH_ISUNLINKED functionality to ihandle. This lets the caller
know if the file for the underlying file descriptor has been deleted
out from under us. This is useful for sanity checks in some callers.
Andrew Deason [Fri, 14 Dec 2012 21:05:53 +0000 (15:05 -0600)]
volser: Check vnode length on dump
Commit aadf69eabb1962496fa93745ab560a5b48cacd61 adds length checks on
vnodes during fileserver read/write operations. Do the same thing when
we dump volume data from the volserver, to ensure that we don't
transmit incorrect data e.g. to other RO sites when releasing.
Michael Laß [Sat, 22 Dec 2012 21:54:20 +0000 (22:54 +0100)]
Remove AFSLore from wiki URLs
The URL of the openafs wiki doesn't contain "AFSLore" anymore. Although
these old URLs still work, replace them to point users to the correct
address in the first place. Also be consistent and always use a
trailing /.
Rod Widdowson [Sun, 23 Dec 2012 14:48:14 +0000 (14:48 +0000)]
Windows: more warnings on kernel builds
This is the same sort of changes a per git commit 8a4094e9ffa5d0f96501817c8ffd3cc8dc7ec62b
but this time for the fs tree. Again most of the work is UNREFERENCED_PARAMETER, initialize
variables where the compiler lacks the smarts and remove unused locals.
Jeffrey Altman [Wed, 19 Dec 2012 16:45:56 +0000 (11:45 -0500)]
Windows: Remove unused AFSVolumeWorkerThread
The currently implmentation does all maintenance work in the
AFSPrimaryVolumeWorkerThread which is associated with the
AFSGlobalRoot volume object. Remove AFSVolumeWorkerThread as
it is unused.
Jeffrey Altman [Wed, 19 Dec 2012 14:31:06 +0000 (09:31 -0500)]
Windows: Wait for all worker threads to exit
The signalling mechanism for waking and shutting down worker threads
relies upon a per-queue event. Therefore it is not guaranteed that
the worker thread that AFSShutdown*Thread() is attempting to wait
for is in fact the thread that will be woken and exit. Modify the
code to loop waking threads until the one that is being waited for
does in fact exit.
Subsequent calls to AFSShutdown*Thread() will bypass the wait if
the thread has already exited.
Rod Widdowson [Fri, 21 Dec 2012 16:17:34 +0000 (16:17 +0000)]
Windows: warnings on kernel builds
I turned some new warning (by virtue of a more modern compiler) and
threw up a few hundred. This checked supresses them:
- Mostly remove variables which are never used
- Make unused parameters UNREFERENCED_PARAMETER
- Initialize a couple of variables which were either forgotton or
the compiler wasn't smart enough to notice were initialized.
Also strip out some extraneous tabs which had crept in.
Michael Laß [Fri, 14 Dec 2012 16:06:30 +0000 (17:06 +0100)]
Update configure help msg to match actual defaults
The defaults for LINUX_KERNEL_PATH and LINUX_KERNEL_BUILD in
acinclude.m4 were changed in 2cfd611, 94ff565 and 3f9d982 without updating
the output of ./configure --help. Change the description of
linux-kernel-headers and linux-kernel-build to show the correct defaults.
Stephan Wiesand [Sat, 15 Dec 2012 14:36:24 +0000 (15:36 +0100)]
Linux: Restructure kernel header detection
As of kernel 3.7, version.h has moved, and hence utsrelease.h was
no longer found. Loop over candidate directories and locations
within, and look for the files we're actually after.
FIXES 131525
Change-Id: I686212a283b9e0ce769b1351e3cb75e08f4b110c
Reviewed-on: http://gerrit.openafs.org/8761 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Michael Laß <lass@mail.uni-paderborn.de> Reviewed-by: Chas Williams - CONTRACTOR <chas@cmf.nrl.navy.mil> Reviewed-by: Andrew Deason <adeason@sinenomine.net> Reviewed-by: Derrick Brashear <shadow@your-file-system.com>
Andrew Deason [Tue, 20 Nov 2012 21:00:15 +0000 (15:00 -0600)]
ubik: Do not count votes from error'd connections
If the given connection has a connection-wide error on it, the vote we
got from that site is probably not valid, and we could easily be
interpreting an error code as a vote time. So instead, treat the host
as if we got a network error from it.
Instead of validating the symlink target if both the TargetFID
and the Target Name fields are undefined, perform the validating
if either of them are undefined.
During AFSLocateNameEntry processing of an absolute symlink the
component being searched for in the AFSGlobalRoot volume may not
be found. In this case, use AFSCheckCell() to query the service
to resolve the name. If the AFSGlobalRoot happens to be the
Freelance Root Volume then the service can resolve it even though
it is not present in the directory listing.
Jeffrey Altman [Sat, 15 Dec 2012 17:19:31 +0000 (12:19 -0500)]
Windows: cm_NameI Freelance Eval of Absolute Symlinks
In cm_NameI() it is possible that a symlink to an absolute path
is reached for which the component after the 'mountRoot' is not
present in the mountRoot directory. If the mountRoot is the
Freelance root.volume then it is appropriate to attempt automatic
cell resolution.
Jeffrey Altman [Sat, 15 Dec 2012 17:18:52 +0000 (12:18 -0500)]
Windows: Set Symlink mpDV after reading target string
When a cm_scache object, symlink, has its mountPointString field
successfully populated by cm_GetData(), the mpDV field must be
assigned the current dataVersion value in order to prevent unnecessary
queries of the mountPointString from the file server.
Jeffrey Altman [Fri, 14 Dec 2012 04:33:54 +0000 (23:33 -0500)]
Windows: If no inlinebulkstat, set the flag correctly
If RXAFS_InlineBulkStatus fails with RXGEN_OPCODE,
cm_SetServerNoInlineBulk must be called with the 'no' parameter
set to True. Otherwise, thE cm_server object will not remember
that the RPc is not supported.
This is important for avoiding unnecessary timeouts on IBM AFS 3.6
servers.
Arne Wiebalck [Thu, 6 Dec 2012 15:23:05 +0000 (16:23 +0100)]
Update 'vos shadow' man page
Change the 'vos shadow' man page to say that updating the
VLDB with shadow volumes does only work if the VLDB entries
for the corresponding source volumes are deleted first.
Change-Id: I2764776b7a03346b5b2809f796d1deed0c32933b
Reviewed-on: http://gerrit.openafs.org/8652 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Dan van der Ster <daniel.vanderster@cern.ch> Tested-by: Dan van der Ster <daniel.vanderster@cern.ch> Reviewed-by: Ken Dreyer <ktdreyer@ktdreyer.com>
Jeffrey Altman [Sun, 9 Dec 2012 23:13:01 +0000 (18:13 -0500)]
Windows: Correct RDR Subsystem value overlap
AFS_SUBSYSTEM_LOAD_LIBRARY and AFS_SUBSYSTEM_PROCESS_PROCESSING
were both assigned value 0x00010000. Ensure that all values are
unique and match the documentation.
Jeffrey Altman [Sun, 9 Dec 2012 06:18:03 +0000 (01:18 -0500)]
Windows: buf_usedCount can be 64-bit
If buf_usedCount is 64-bit, must use 64-bit Interlocked operations.
Define buf_IncrementUsedCount() and buf_DecrementUsedCount() macros
to wrap the 32-bit and 64-bit Interlocked operations as appropriate.
Jeffrey Altman [Sun, 9 Dec 2012 05:31:58 +0000 (00:31 -0500)]
Windows: RDR Dynamic root Freelance only
commit 6c708d1415b27bf8f2804f3407e4fbe2f7bf1009 does not restrict
the AFS redirector dynamic root detection only to the Freelance
volume root directory. This logic should not apply to arbitrary
mount point target paths.
Restructure the code around the MmFlushImageSection() call in
AFSProcessOpen() to ensure that the SectionObjectResource is
released even when the flush fails.
Jeffrey Altman [Fri, 7 Dec 2012 04:24:44 +0000 (23:24 -0500)]
Windows: AFSCachedWrite reset LastServerFlush when Forced
If the ForceFlush parameter to AFSCachedWrite is set to TRUE,
the Fcb->Specific.File.LastServerFlush value must be reset to
ensure that the AFSPrimaryVolumeWorkerThread will trigger an
extent flush on its next pass. The LastServerFlush value will
not be reset by AFSNonCachedWrite() when called as a result of
a CcFlushCache() call as it appears to be PagingIo.
Jeffrey Altman [Fri, 7 Dec 2012 04:20:57 +0000 (23:20 -0500)]
Windows: Periodic Worker CleanupFcb to Flush Dirty Extents
The AFSPrimaryVolumeWorkerThread must not check the ObjectInformationCB
reference count when determining whether or not to call AFSCleanupFcb().
One of the tasks of AFSCleanupFcb() is to flush dirty extents to the
service.
Jeffrey Altman [Thu, 6 Dec 2012 13:16:01 +0000 (08:16 -0500)]
Windows: SetVolumeState is not an invalidation
Volume state notifications (online, offline, unknown) from the
afsd_service.exe to the afs redirector are not invalidation
events. The verify flag should not be set, the extents should
not be purged, etc. Just set or clear the AFS_VOLUME_FLAGS_OFFLINE.
Jeffrey Altman [Wed, 5 Dec 2012 18:07:08 +0000 (13:07 -0500)]
Windows: AFSSetRenameInfo Drop TreeLocks MmForceSectionClosed
If the Target directory TreeLock is held across the
MmForceSectionClosed() call in AFSSetRenameInfo() Trend Micro's
TmPreFlt!TmpQueryFullName call can deadlock in its worker thread
when AFSCommonCreate() attempts to AFSLocateNameEntry() which
requires shared access to the TreeLock for the search.
Reorganize the Target Entry Exists case to call MmForceSectionClosed()
after the directory entry has been deleted. That should throw
a monkey wrench into Trend Micro's attempt to scan it.
Jeffrey Altman [Tue, 4 Dec 2012 21:57:36 +0000 (16:57 -0500)]
Windows: Prevent lock inversion SetFileRenameInfo
SetFileRenameInfo calls MmForceSectionClosed() which can call
back to the afs redirector via the Cleanup processing. AFSCleanup()
requires an exclusive hold of Fcb->Resource so we must obtain the
lock first in AFSSetFileRenameInfo() prior to obtaining the
SectionObjectResource.
Jeffrey Altman [Sun, 25 Nov 2012 00:47:01 +0000 (19:47 -0500)]
rx: RX_INVALID_OPERATION abort unknown service only
Patchset 1fbe83f9aacfc36a9c426ba1fd18ad7c72869dc1 introduced the
sending of RX_INVALID_OPERATION aborts for connection attempts
requesting a service not offered by the rx peer. By sending aborts
for all failures of rxi_FindConnection() the set of incoming packets
that are responded to is broader than simply those with non-matching
serviceIds. This patchset restricts the transmission of
RX_INVALID_OPERATION aborts only to the explicit case in which
rxi_FindConnection() attempted to find a service and either failed
to find a match or couldn't apply the requested security class/level
to that service.
Andrew Deason [Tue, 13 Nov 2012 20:54:10 +0000 (14:54 -0600)]
rx: Add rx_GetNetworkError
Add the function rx_GetNetworkError to rx, to allow callers to
retrieve some information about the last ICMP error received for a
connection's peer. This can be useful if a call on that connection was
recently ended due to ICMP errors.
Andrew Deason [Wed, 28 Nov 2012 21:12:12 +0000 (15:12 -0600)]
afs: Apply VLRU safety check for Linux too
This invariant should apply to all platforms, not just those with
dynamic vcaches. Since this prevents an infinite loop if the list os
corrupt or something, having this around everywhere seems useful. So,
drop the check for afsd_dynamic_vcaches.
Andrew Deason [Fri, 16 Nov 2012 20:18:32 +0000 (14:18 -0600)]
afs: Fix VLRU traversal sanity check
On non-Linux, the number of vcaches in the VLRU can easily exceed
afs_maxvcount, since we allocate new vcaches when we run out. So,
assume we only have afs_vcount vcaches on the VLRU, instead of
assuming we have at most afs_maxvcount vcaches.
Andrew Deason [Tue, 20 Nov 2012 20:18:47 +0000 (14:18 -0600)]
ubik: Try to detect VOTE_Beacon errors
Currently the way ubik dbsites vote for each other is via the "return
value" of the Beacon VOTE RPC. Since this is really an Rx abort, this
can easily collide with actual errors on the wire, such as rxkad
errors.
Try to detect these by detecting vote times that are very different
than the current timestamp (more than an hour in the future or past),
and treat it like a network error.
If we do not do this, a single site reporting an error can cause us to
never reach quorum, since we calculate our sync site expiration based
on the oldest 'yes' vote, which for most known Rx aborts will be far
in the past.
Derrick Brashear [Mon, 19 Nov 2012 20:54:35 +0000 (15:54 -0500)]
macos: decode mountain lion panics
security hardening in the mountain lion kernel
screws up our decoder. apple doesn't bother to document
what to do, but after some head smashing, here we are.
AFSLocateNameEntry() can return STATUS_REPARSE in addition to
NTSTATUS failure codes. As in the case of an NTSTATUS code other
than STATUS_OBJECT_NAME_NOT_FOUND, the reference counts are decremented in
AFSLocateNameEntry() if STATUS_REPARSE is returned.
Jeffrey Altman [Sat, 1 Dec 2012 12:54:59 +0000 (07:54 -0500)]
Windows: Decr. used buf count flush and deleted files
When flushing the cache (fs flush*) or when deleting files or
when a file server reported error indicates that the buffer is
bad, decrement the used cache count so "fs getcacheparms" now
indicates a floating value based upon the number of buffers
containing potentially valid data.
Jeffrey Altman [Sat, 1 Dec 2012 12:30:57 +0000 (07:30 -0500)]
Windows: Restore "fs getcacheparms" used space
buf_Init() adds all of the available buffers to the free queue
at startup and cm_data.buf_freeCount tracks the number of items
in the free queue. So it cannot be used as a method of reporting
how much of the cache space has been used. Add a new buf_usedCount
parameter to the cm_memmap data to track the number of cache blocks
that have been used up to the total number of allocated blocks.
buf_usedCount can then be used in cm_IoctlGetCacheParms to restore
the original behavior.
Jeffrey Altman [Thu, 29 Nov 2012 08:14:42 +0000 (03:14 -0500)]
Windows: QFileInfo only Verify Entry when necessary
During a QueryFileInformation request only call AFSVerifyEntry()
when the AFS_OBJECT_FLAGS_VERIFY flag is set on the ObjectInformationCB.
The AFS_OBJECT_FLAGS_VERIFY flag is set in response to an invalidation
event from the cache manager. Let the cache manager decide when our
data is no longer consistent with the file server.
Jeffrey Altman [Thu, 29 Nov 2012 08:13:11 +0000 (03:13 -0500)]
Windows: Dir Enum only validate when necessary
During a directory enumeration, do not call AFSValidateEntry()
for every DirectoryCB. Instead only do so when the
AFS_OBJECT_FLAGS_VERIFY is set on the ObjectInformationCB.
Fcb->Specific.File.ExtentsCount cannot be used directly
in the for loop conditional because each call to AFSFreeExtent()
decrements the value. Instead, save the original value to
'lFcbExtentCount' and use that in the loop conditional.
Jeffrey Altman [Thu, 29 Nov 2012 08:00:48 +0000 (03:00 -0500)]
Windows: AFSCleanup calls AFSDeleteFcbExtents
When the hard link count of the file drops to zero, call
AFSDeleteFcbExtents() instead of AFSTearDownFcbExtents()
because the file has been deleted and the extents have been
implicitly released.
Jeffrey Altman [Thu, 29 Nov 2012 07:58:46 +0000 (02:58 -0500)]
Windows: AFSDeleteFcbExtents()
Similar to AFSTearDownFcbExtents() but does not release the
extents to afsd_service.exe. It is intended for use when the
FCB's extents are implicitly released as the result of file
deletion.
Jeffrey Altman [Wed, 28 Nov 2012 07:48:48 +0000 (02:48 -0500)]
Windows: Fcb sectionObjectResource
Add a SectionObjectResource to the AFS_FCB structure. This lock
replaces the Fcb.Resource in protecting the SectionObjectPointers.
The new resource is being added to assist in avoiding deadlocks
caused by Trend Micro and perhaps other AV products when
CcPurgeCacheSection() is called while holding the Fcb.Resource
which is required in AFSProcessOpen().
Jeffrey Altman [Tue, 27 Nov 2012 19:26:21 +0000 (14:26 -0500)]
Windows: Implement dynamic cell detection for RDR
RDR_EvaluateNodeByName knew how to parse \\afs\foo#bar\ notation
but couldn't perform a lookup for a cell that wasn't already
in the root directory. Add support for autorecognition.
Revert commit cecd99abd3837ef820d78fb15e450c8688b0f39b. Failing
to garbage collect the FCB from the ObjectInformationCB at the
earliest opportunity opens the door to a deadlock with Trend Micro's
anti-virus driver. Trend Micro attempts to make a copy of the
file data each time a CcPurgeCacheSection() is performed on the
FCB. If during AFSValidateEntry or AFSVerifyEntry a DirectoryCB->
ObjectInformationCB->FCB is discovered which has a non-NULL
SectionObjectPointers.DataSectionObject and the data version in
the DirEnumEntryCB differs from the ObjectInformationCB, a
CcPurgeCacheSection() call is performed while holding the
FCB->NPFcb->Resource exclusively. Trend Micro will deadlock the
thread making the CcPurgeCacheSection() call when it attempts
to open the file in one of its worker threads.
Jeffrey Altman [Sun, 25 Nov 2012 01:10:49 +0000 (20:10 -0500)]
rx: set abort client_initiated flag to match direction
In a recent incident involving packet reflection back to the
file server, aborts were being sent by the file server in response
to a server sent packet. The aborts sent in response also failed
to set the CLIENT_INITIATED flag in the header which permitted the
the actual client to confuse the Abort as applying to its client
initiated connection.
in rxi_SendRawAbort, set the CLIENT_INITIATED flag to the opposite
of the packet the abort is being sent in response to.
Simon Wilkinson [Sun, 4 Nov 2012 17:06:41 +0000 (17:06 +0000)]
rx: Make rxevent_Put NULL the event ptr being put
Change rxevent_Put so that it takes a pointer to the event being
put, and NULLs that pointer. This removes a lot of duplicate code
in callers, as well as making it harder to reuse a discarded event.
Simon Wilkinson [Thu, 22 Nov 2012 08:41:51 +0000 (08:41 +0000)]
rx: Return success value when cancelling an event
When cancelling an event that holds reference counts, we need
to know whether the attempt to cancel the event was successful
or not, otherwise references can be double put when the
cancellation races with the event firing. Take the follwing
Holding obj->lock doesn't control whether the event is fired or
not. Only putting the reference if the event being fired matches
that in obj doesn't help - it just leaks a reference in a different
race.
So, make rxevent_Cancel return true if it did actually cancel the
event, and false if the event has already been fired. This means
that Thread A can become
MUTEX_ENTER(&obj->lock);
if (rxevent_Cancel(&obj->event)
obj_put(&obj->refcnt);
MUTEX_EXIT(&obj->lock)
In the example above, rxevent_Cancel would return false.
Simon Wilkinson [Sat, 3 Nov 2012 23:15:50 +0000 (23:15 +0000)]
rx: Don't treat calls specially in event package
Many different structures can be passed to the rxevent package as
data. Don't give calls special treatment by making rxevent aware of
how to release their reference counts when an event is cancelled.
Update all of the callers of rxevent_Cancel to use the new arguments,
and where they were cancelling functions with calls as parameters add
the appropriate CALL_RELE directives. In many cases, this has led to
new helper functions to cancel particular call-based events.
Simon Wilkinson [Thu, 1 Nov 2012 17:44:07 +0000 (17:44 +0000)]
rx: Remove unused origPeer parameter to FindPeer
rxi_FindPeer took an 'origPeer' parameter, which was originally
there as an optimisation to decrement a reference count when replacing
a peer on a connection structure. However, we don't do that any more,
and the origPeer parameter is never used.
Simon Wilkinson [Fri, 9 Nov 2012 23:20:42 +0000 (23:20 +0000)]
rx: CheckBusy doesn't drop conn_call_lock
As rxi_CheckBusy doesn't drop the conn_call_lock when it checks for
busy call slots, it doesn't need to deal with someone replacing a
call behind its back.
Jeffrey Altman [Mon, 26 Nov 2012 16:25:43 +0000 (11:25 -0500)]
Windows: RefCounts, Asserts, and Trace Logging
Rename DirectoryCB.OpenReferenceCount to DirOpenReferenceCount
to distinguish it from the FCB.OpenReferenceCount. This makes
it easier to search for instances within an editor or debugger.
Ensure that all InterlockedIncrement and InterlockedDecrement
calls on a reference count field assign their value to a local
'lCount' variable. Ensure that 'lCount' is used within any
trace log messages and conditionals.
Add ASSERT( lCount >= 0) after all reference count decrements
in order to catch underflows.
Change conditionals from (RefCount == 0) to (RefCount <= 0) so
that object destruction can occur when there has been an underflow.
This is important in release builds for which ASSERT() is a no-op.
Jeffrey Altman [Mon, 26 Nov 2012 16:16:56 +0000 (11:16 -0500)]
Windows: AFSClose File FCB/CCB cleanup before DirCB
Move the processing of FCB and CCB cleanup ahead of the
DirectoryCB cleanup. It is not safe to dereference the
Ccb->DirectoryCB until after the CCB has been destroyed.
Jeffrey Altman [Sun, 25 Nov 2012 23:13:53 +0000 (18:13 -0500)]
Windows: AFSDeleteObjectInfo not on volume roots
The VolumeCB embeds an ObjectInformationCB structure which must
not be freed by calling AFSDeleteObjectInfo(). Add an assert
in the checked build and return without destroying the object
in the free build.