Jeffrey Altman [Mon, 6 May 2013 19:12:54 +0000 (15:12 -0400)]
Windows: AFSLibExFreePool*() macros
Introduce the AFSLibExFreePool() and AFSLibExFreePoolWithTag() macros
which simply call ExFreePool() and ExFreePoolWithTag().
The prefix AFSLib indicates that memory allocated by
AFSLibExAllocatePoolWithTag() must be freed before unloading.
AFSExFreePool*() cannot be used because that is a pointer to a
function provided by AFSRedir.sys which may not be assigned when
memory must be freed.
The only time that ExFreePool() should be used is if the memory was
allocated by a system function.
Jeffrey Altman [Mon, 6 May 2013 19:05:10 +0000 (15:05 -0400)]
Windows: Use AFSLibExAllocatePool for library local
If the memory allocation is for an object that must be freed before
the afsredirlib.sys driver unloads, use the AFSLibExAllocatePoolWithTag
interface. AFSExAllocatePoolWithTag allocates the memory from
afsredir.sys which prevents Verifier from being used to detect leaks.
Jeffrey Altman [Tue, 7 May 2013 22:36:16 +0000 (18:36 -0400)]
Windows: RDR_Initialize must cleanup threads on failure
If RDR_Initialize() fails after instantiating the worker thread
pool it must call RDR_ShutdownFinal() to destroy the pool before
exiting. Otherwise, the threads will spin endlessly as each
DeviceIoControl call to the redirector fails.
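A minimal sketch of the cleanup-on-failure pattern described above, in portable C rather than the actual service code. All sketch_* names are hypothetical; sketch_shutdown_final only plays the role of RDR_ShutdownFinal() and sketch_initialize the role of RDR_Initialize().

```c
#include <stdbool.h>

/* Hypothetical stand-in for the worker thread pool state. */
static bool sketch_pool_running = false;

/* Stand-in for instantiating the worker thread pool. */
static void sketch_pool_start(void)
{
    sketch_pool_running = true;
}

/* Stand-in for RDR_ShutdownFinal(): destroys the pool. */
static void sketch_shutdown_final(void)
{
    sketch_pool_running = false;
}

/* Lets a caller observe whether worker threads were left behind. */
bool sketch_pool_is_running(void)
{
    return sketch_pool_running;
}

/*
 * Stand-in for RDR_Initialize(). `later_step_fails` simulates a failure
 * that happens after the pool exists. Returns 0 on success, -1 on failure.
 */
int sketch_initialize(bool later_step_fails)
{
    sketch_pool_start();

    if (later_step_fails) {
        /* Tear the pool down before reporting the error. */
        sketch_shutdown_final();
        return -1;
    }
    return 0;
}
```

The point is only the ordering: once the pool exists, every failure exit has to pass through the shutdown call.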
Jeffrey Altman [Mon, 4 Mar 2013 04:10:51 +0000 (23:10 -0500)]
Windows: CreateFile Reparse Point to File as File
Apply the Reparse Point to File as File Policy to CreateFile. If the
FILE_OPEN_REPARSE_POINT flag is specified to the CreateFile operation
and AFSIgnoreReparsePointToFile() returns TRUE, evaluate the target
object (if possible) and if the object is a FILE, then ignore the
FILE_OPEN_REPARSE_POINT flag. Otherwise, re-evaluate the request to
attempt to open a reparse point if it exists.
AFSIgnoreReparsePointToFile() is a helper routine that uses the
global reparse point policy to decide whether or not a reparse point
whose target is a file should be reported to applications as a file.
When per-AuthGroup or per-Process policy is supported, this function
should be modified.
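The decision described above can be sketched as a small pure function. This is an illustration only: the flag value, the enum, and the function name are hypothetical stand-ins, and `ignore_policy` represents the result of AFSIgnoreReparsePointToFile().

```c
#include <stdbool.h>

/* Placeholder bit standing in for FILE_OPEN_REPARSE_POINT. */
#define SKETCH_OPEN_REPARSE_POINT 0x1u

/* Hypothetical result of evaluating the reparse point target. */
enum sketch_target {
    SKETCH_TARGET_FILE,
    SKETCH_TARGET_DIRECTORY,
    SKETCH_TARGET_UNKNOWN       /* target could not be evaluated */
};

/*
 * Return the create options that should actually be honored.
 * The reparse flag is dropped only when the caller asked for it,
 * the policy says to ignore it, and the target is known to be a file.
 */
unsigned sketch_effective_options(unsigned create_options,
                                  bool ignore_policy,
                                  enum sketch_target target)
{
    if ((create_options & SKETCH_OPEN_REPARSE_POINT) != 0 &&
        ignore_policy &&
        target == SKETCH_TARGET_FILE) {
        return create_options & ~SKETCH_OPEN_REPARSE_POINT;
    }
    /* Otherwise the request proceeds as an open of the reparse point. */
    return create_options;
}
```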
pete scott [Wed, 27 Feb 2013 15:51:44 +0000 (08:51 -0700)]
Windows: IOCTL_AFS_SET_REPARSE_POLICY
IOCTL_AFS_SET_REPARSE_POLICY is a new ioctl that can be executed
by anyone to alter the behavior of AFS Symlink-to-File reparse point
processing. Policy can be set for a global default or for the active
authentication group. If the AFS_REPARSE_POINT_TO_FILE_AS_FILE policy is
active, afs symlinks will not be reported as reparse points if the symlink
target is known to be a file.
This patchset implements the ioctl but not the "reparse point to file as
file" functionality. Per authgroup policy setting is permitted by the
ioctl but is not supported at this time.
Jeffrey Altman [Sat, 4 May 2013 15:56:30 +0000 (11:56 -0400)]
Windows: Report Case Sensitive Search
Return the FILE_CASE_SENSITIVE_SEARCH volume flag as part of the AFS
volume properties. NTFS does the same, and our search algorithm is case
sensitive first, then case insensitive.
Jeffrey Altman [Fri, 3 May 2013 15:23:31 +0000 (11:23 -0400)]
Windows: Introduce CM_CONN_FLAG_NEW
The new CM_CONN_FLAG_NEW flag is set on the cm_conn object whenever
a new rx_connection has been created. The flag is cleared in cm_Analyze
if the call succeeded or if the error is one that is generated as a
result of communicating with the peer. If no communication with the
peer has taken place the connection is considered "new".
For errors that would result in forcing a new connection, check whether
the existing connection is already "new". This avoids an extra
RX_CALL_DEAD timeout period in the case where a "new" connection was
already in use.
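A rough sketch of the flag lifecycle described above. The struct, the flag value, and the function names are hypothetical; only the idea (set on creation, clear after real contact with the peer, test before forcing a replacement) comes from the commit message.

```c
#include <stdbool.h>

/* Placeholder bit standing in for CM_CONN_FLAG_NEW. */
#define SKETCH_CONN_FLAG_NEW 0x1u

/* Hypothetical, reduced connection object. */
struct sketch_conn {
    unsigned flags;
};

/* Called whenever a fresh rx connection has been created. */
void sketch_conn_created(struct sketch_conn *c)
{
    c->flags |= SKETCH_CONN_FLAG_NEW;
}

/*
 * Called when a call succeeded, or failed with an error that can only
 * be produced by actually communicating with the peer.
 */
void sketch_conn_talked_to_peer(struct sketch_conn *c)
{
    c->flags &= ~SKETCH_CONN_FLAG_NEW;
}

/*
 * For an error that would normally force a new connection: replacing a
 * connection that is still "new" gains nothing, so only force when the
 * flag is clear.
 */
bool sketch_should_force_new(const struct sketch_conn *c)
{
    return (c->flags & SKETCH_CONN_FLAG_NEW) == 0;
}
```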
If you are rebuilding from pt_util, data sanitization should
not randomly chown and/or rename your groups. Likewise,
an admin should have the ability to do this.
Ken Dreyer [Wed, 1 May 2013 03:59:32 +0000 (21:59 -0600)]
doc: quote list items in POD
Recent versions of Pod::Simple complain if we use integers or other
special characters in an =item list. We have a couple bulleted lists
that happen to have integers or other special characters as the list
values. Quote the items with C<> so that Pod::Simple can correctly parse
them again.
Michael Meffie [Tue, 30 Apr 2013 15:30:15 +0000 (11:30 -0400)]
pt_util: fix group line check for input files
Fix the check for requiring group lines before any membership lines. Do
not clear flag indicating the presence of a group after reading each
line. (This error was caught by the pt_util-t unit test.)
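The bug class described above (a "seen a group" flag cleared on every line) can be illustrated with a small checker. This is a sketch, not the pt_util parser: it assumes that membership lines begin with whitespace and group lines do not, and the function name is hypothetical.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Return 0 if every membership line is preceded by some group line,
 * -1 otherwise. Assumption for this sketch: a line starting with
 * whitespace is a membership line, anything else is a group line.
 */
int sketch_check_group_lines(const char *const *lines, size_t count)
{
    int seen_group = 0;     /* set once; deliberately not reset per line */
    size_t i;

    for (i = 0; i < count; i++) {
        const char *line = lines[i];

        if (line[0] == '\0')
            continue;       /* ignore empty lines */

        if (isspace((unsigned char)line[0])) {
            if (!seen_group)
                return -1;  /* membership line with no group before it */
        } else {
            seen_group = 1;
        }
    }
    return 0;
}
```

Declaring the flag outside the loop is the whole fix; resetting it inside the loop would reject the second membership line of any group.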
Michael Meffie [Tue, 30 Apr 2013 19:38:24 +0000 (15:38 -0400)]
tests: make a plan for man page checks
Split the man page check routine into two routines; one to get the list
of sub-commands for a command, and another to verify a man page exists
for each sub-command. Use the list of sub-commands to set up the
Test::More plan before running the tests.
Setting the plan before running the tests allows the man page tests
to run on systems which ship older versions of the Test::More module.
Andrew Deason [Tue, 30 Apr 2013 19:37:54 +0000 (14:37 -0500)]
afs: Do not invalidate all dcaches on startup
Commit 20b0c65a289e2b55fb6922c8f60e873f1f4c6f97 changed
afs_UFSGetDSlot to always treat a dslot entry as invalid if
'datavalid' was 0. This was to force the invalidation of the given
dslot if we were reading in a dslot from the free or discard list,
since the data in that dslot is not valid.
However, 'datavalid' is also 0 when we read in dcache entries from
disk on startup. So, this means that we invalidated all cache entries
when the client started up, effectively making our persistent cache
worthless.
Fix this by only forcing this invalidation when we are reading from a
free or discarded dcache, and not during the initial cache scan. That
is, when 'indexvalid' is 1, and 'datavalid' is 0.
The parameters for these Get*DSlot variants should maybe be changed to
be a little more clear, but for now, this is a targeted fix for this
specific issue.
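The condition stated above can be written down as a one-line predicate. The function name is hypothetical; the real check lives inside afs_UFSGetDSlot and takes more context than two integers.

```c
#include <stdbool.h>

/*
 * Decide whether a dslot that was just read must be force-invalidated.
 * Only the combination indexvalid == 1 and datavalid == 0 (a slot taken
 * from the free or discard list) forces invalidation. Every other
 * combination, including the initial cache scan, keeps the entry.
 */
bool sketch_must_force_invalidate(int indexvalid, int datavalid)
{
    return indexvalid == 1 && datavalid == 0;
}
```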
Windows: pSrcObject instead of pSrcFcb->ObjectInformation
In AFSSetFileLinkInfo and AFSSetRenameInfo consistently use the
variable pSrcObject instead of pSrcFcb->ObjectInformation. pSrcObject
is a local alias. Mixing both forms in the same function is confusing.
pCurrentObject is supposed to be an alias for pDirEntry->ObjectInformation
but it was not always being updated when pDirEntry was replaced. As a
result several tests were being performed incorrectly and the wrong data
was being logged.
Windows: AFSExamineVolume drop TreeLock if waiters
After each call to AFSExamineObject drop the ObjectInfoTree.TreeLock
if there are threads waiting for access. The garbage collection process
should not delay real work.
Each time the ObjectInformationCB object is looked up
from the ObjectInfoTree the LastAccessCount field should be updated
except in cases of invalidation, garbage collection, and extent
processing. This is particularly important when an ObjectInfoCB
is attached to DirectoryCB in AFSInitDirEntry and when constructing
directory snapshots or validating directory content.
Windows: AFSFindObjectInfo update last access time
Add a boolean parameter to AFSFindObjectInfo() which is used
to indicate whether or not the last access time for the found
ObjectInfoCB should be updated.
Set the new parameter in all calls to AFSFindObjectInfo().
In AFSInvalidateVolume a reference count is obtained in order to
ensure that the object is valid throughout the invalidation request.
Although the refcnt is obtained while holding the TreeLock the refcnt
was not released while holding the TreeLock which could open the door
for another thread to race.
In AFSInitDirEntry the pattern was to find or allocate an
ObjectInfoCB then destroy it if the DirectoryCB creation fails
for some reason. The problem with this approach is that once the
VolumeCB ObjectInfoTree.TreeLock is dropped the ObjectInfoCB is findable.
That means that the contents of the ObjectInfoCB must be valid.
This patchset makes three changes. First, in the case where the
ObjectInfoCB is allocated, the fields of the ObjectInfoCB are populated
from the DirEnumEntry before the TreeLock is dropped. Second, if the
DirectoryCB allocation fails the ObjectInfoCB is not deleted. It is
perfectly valid and can be used by a subsequent AFSInitDirEntry call.
Perhaps one that is racing with this thread. It will eventually be
cleaned up by the AFSPrimaryVolumeWorkerThread. Finally, when the
ObjectInfoCB reference count is decremented the TreeLock is held shared in
order to prevent races with other threads that might be incrementing it
themselves.
The CM_VOLUMEFLAG_RO_SIZE_VALID flag was being reset using the
wrong field which resulted in the flag never being cleared and
the correct volume size not being reported.
Windows: fail if pSrcParentObject cannot be resolved
In AFSSetFileLinkInfo and AFSSetRenameInfo return STATUS_INVALID_PARAMETER
if pSrcParentObject cannot be determined. Otherwise, a NULL pointer
dereference will occur.
Change-Id: I0e265433aa85066005e90b3584f8e865c5be79c8
Reviewed-on: http://gerrit.openafs.org/9807
Reviewed-by: Peter Scott <pscott@kerneldrivers.com>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@your-file-system.com>
Windows: SetFileRenameInfo Do not replace pSrcParentObject
If pSrcParentObject is replaced by pTargetParentObject then the
reference count obtained by the AFSFindObjectInfo() call at the
start of AFSFileRenameInfo will be released on the wrong object.
This will result in a reference leak on pSrcParentObject and an
undercount on pTargetParentObject. pTargetParentObject can then
be garbage collected while it is in use.
Change-Id: Id10db257afbd4996a31eb98ad7eca69343297274
Reviewed-on: http://gerrit.openafs.org/9806
Reviewed-by: Peter Scott <pscott@kerneldrivers.com>
Reviewed-by: Rod Widdowson <rdw@steadingsoftware.com>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@your-file-system.com>
Andrew Deason [Wed, 17 Apr 2013 23:04:58 +0000 (18:04 -0500)]
LINUX: Sometimes let dentry_open handle refcounts
When Linux changed dentry_open to use a 'path' argument, they also
changed it so dentry_open handles incrementing the relevant ref
counts. So now, sometimes we need to inc the dentry and vfsmount
refcounts ourselves, and sometimes we need to leave them alone.
To accommodate this, change afs_dentry_open to also handle refcounting
itself, and 'get' the given dentry and vfsmount if necessary.
Also note that currently, afs_linux_raw_open can call afs_dentry_open
twice in the case of an error, but it does not dget(dp). This means
that dp could be undercounted, since dentry_open on older kernels will
dec the refcount on the given dentry in the case of an error. This
change should also fix this so dp is not undercounted in that case.
FIXES 131613
Change-Id: I0e9deb7ce57633ff65b76d2444a0416ecbe329fd
Reviewed-on: http://gerrit.openafs.org/9801
Reviewed-by: Derrick Brashear <shadow@your-file-system.com>
Reviewed-by: Marc Dionne <marc.c.dionne@gmail.com>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Andrew Deason <adeason@sinenomine.net>
dentry_open, at least on older kernels, decs the refcount on its
arguments in the case of an error. So calling mntget for each
dentry_open invocation actually is the correct thing to do.
This code may need to be further fixed in order to work for newer
kernels, but for now, at least put it back the way it was so we don't
undercount ref counts on older kernels.
Windows: RDR_DeleteFileEntry test for empty directory
RDR_DeleteFileEntry should check to see that a directory entry
that is a directory is in fact empty. The most frequent use of
RDR_DeleteFileEntry is to check whether the object can be deleted
prior to setting the DeletePending state which in turn results in
the object being deleted during Cleanup. If the directory is not
empty during Cleanup it is too late for the error to be seen by
the application.
If the file server is asked to remove a directory that is not empty
one might expect it to return UAENOTEMPTY but instead it returns UAEEXIST.
The error translation function cm_MapRPCErrorRmdir did not include
EEXIST in the list of errors that convert to CM_ERROR_NOTEMPTY.
Prior to IBM AFS 3.5 the file server did return ENOTEMPTY and if a
particular platform did not define ENOTEMPTY, ENOTEMPTY was defined to
be EEXIST. Too late to change things back now.
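A sketch of the error translation described above, using the host's errno values as stand-ins for the UAE* wire codes. The function name and the numeric value of the "not empty" result are placeholders; the real mapping is done by cm_MapRPCErrorRmdir.

```c
#include <errno.h>

/* Placeholder value; the real CM_ERROR_NOTEMPTY is defined by OpenAFS. */
#define SKETCH_CM_ERROR_NOTEMPTY 1001

/*
 * Translate an rmdir failure. Both ENOTEMPTY and EEXIST mean "directory
 * not empty" here. An if-chain is used instead of a switch because some
 * platforms define ENOTEMPTY and EEXIST to the same number, which would
 * produce duplicate case labels.
 */
int sketch_map_rmdir_error(int rpc_error)
{
    if (rpc_error == ENOTEMPTY || rpc_error == EEXIST)
        return SKETCH_CM_ERROR_NOTEMPTY;
    return rpc_error;       /* everything else passes through unchanged */
}
```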
Andrew Deason [Fri, 29 Mar 2013 18:40:41 +0000 (13:40 -0500)]
Make ihandle sync behavior runtime-configurable
The actual behavior of FDH_SYNC has changed a bit over the years, and
some people want one behavior, and some want another. Make it possible
to make this choice at runtime with the new -sync option, instead of
making this decision by running with different patches.
Note that FDH_SYNC is not a macro anymore, nor is it an inline
function. While it could be a macro, it would look a bit complex, and
there are some oddities with trying to use vol_io_params inside the
FDH_SYNC expansion (vol_io_params is not declared for LWP, for
example). And having it be an inline function causes problems with
some odd linking dependencies. For example, vlib.a contains volume.o,
but does not contain a definition for DFlushVolume (dir/buffer.c),
which is referenced in volume.o. 'vos' uses vlib.a, but does not
bring in anything that defines DFlushVolume. Currently this appears to
not cause a problem because 'vos' uses nothing from volume.o, so the
dependencies of volume.o don't matter. Adding an inline FDH_SYNC for
platforms that don't support 'static inline' would add a dependency to
volume.o (via vol_io_params), which causes an error for the lack of a
DFlushVolume.
Those are possibly just some problems, and may not be all. So instead,
make it so we don't have to deal with that and just have a normal
function. While FDH_SYNC may be called in a performance-critical
section, the overhead of a real function call is nowhere near the
delay of an actual fsync(), so presumably any overhead doesn't matter.
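To illustrate the "plain function that consults a runtime setting" approach argued for above, here is a user-space sketch. The mode names, the handle struct, and the counters are hypothetical; the commit message only says that the behavior is chosen with the new -sync option, and no real fsync() is issued here.

```c
/* Hypothetical modes for this sketch; not taken from the commit. */
enum sketch_sync_mode {
    SKETCH_SYNC_IMMEDIATE,  /* flush on every request */
    SKETCH_SYNC_DEFERRED,   /* only remember that a flush is owed */
    SKETCH_SYNC_NONE        /* never flush */
};

/* Reduced stand-in for a file descriptor handle. */
struct sketch_handle {
    int flushes;            /* number of simulated fsync() calls */
    int flush_pending;      /* set when a deferred flush is owed */
};

/* Runtime setting, as it might be filled in from a command-line option. */
static enum sketch_sync_mode sketch_mode = SKETCH_SYNC_IMMEDIATE;

void sketch_set_sync_mode(enum sketch_sync_mode mode)
{
    sketch_mode = mode;
}

/* An ordinary function: the behavior is picked at run time, not build time. */
int sketch_fdh_sync(struct sketch_handle *h)
{
    switch (sketch_mode) {
    case SKETCH_SYNC_IMMEDIATE:
        h->flushes++;       /* a real implementation would fsync() here */
        return 0;
    case SKETCH_SYNC_DEFERRED:
        h->flush_pending = 1;
        return 0;
    case SKETCH_SYNC_NONE:
        return 0;
    }
    return -1;              /* unknown mode */
}
```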
Andrew Deason [Wed, 17 Apr 2013 06:33:07 +0000 (01:33 -0500)]
LINUX: Avoid duplicate mntget in afs_linux_raw_open
In the unlikely event that our afs_dentry_open call fails with
cache_creds, we call afs_dentry_open again with the current creds as a
fallback. However, we call mntget on afs_cacheMnt for each call. So if
we actually hit the second call, we'll have added 2 refs to
afs_cacheMnt, but we only actually opened one file, causing a slight
overcount on afs_cacheMnt refs.
To avoid this, just call mntget once, before any of the
dentry_open-related calls.
cm_Analyze forces new rx connections in response to VICECONNBAD and
VICETOKENDEAD errors but failed to mark the cm_req_t with
CM_REQ_NEW_CONN_FORCED and failed to set 'forcing_new' to true ensuring
that a retry would take place even if the cm_req_t included the no retry
flag.
cm_Analyze invalidated the credentials for the cell upon receiving an
RXKADEXPIRED error from a server but failed to force the establishment of
a new rx connection to the server. As a result, the expired credentials
would continue to be used until the credential expires.
Add a comment reminding the reader that CcSetFileSizes only needs
to be called on a ValidDataLength change if the VDL value has decreased.
A write operation cannot result in a decrease therefore CcSetFileSizes
does not need to be called from within AFSCommonWrite().
Change-Id: Iaf867ec876a6265dc2c8a7ba2319fdf67503a467
Reviewed-on: http://gerrit.openafs.org/9757
Reviewed-by: Rod Widdowson <rdw@steadingsoftware.com>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Peter Scott <pscott@kerneldrivers.com>
Reviewed-by: Jeffrey Altman <jaltman@your-file-system.com>
Windows: CcPurge range modified by non-cached write
When a non-cached non-paging write occurs, the update bypasses the
Windows cache. As a result any cached data in the modified range is
now invalid and must be purged.
CcPurgeCacheSection is known to trigger some filter drivers to open
the file from a worker thread. To avoid a deadlock on the
Fcb->NPFcb->Resource that resource must be dropped. Holding the
SectionObjectResource exclusive is sufficient to protect against races
with other writes, reads and SetEndOfFile operations. While purging the
cache prior to calling the service might be more desirable, it cannot be
done safely without violating the lock hierarchy. Therefore, the purge is
performed after any call to the service completes.
Change-Id: I953a74a0675875eb6be85f85ce924473deb3347f
Reviewed-on: http://gerrit.openafs.org/9756
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Rod Widdowson <rdw@steadingsoftware.com>
Reviewed-by: Peter Scott <pscott@kerneldrivers.com>
Reviewed-by: Jeffrey Altman <jaltman@your-file-system.com>
The following race was identified by Rod Widdowson.
A. File is complete up to 1000 Eof=1000, VDL=1000
B. File Eof is set to 2000. Eof=2000, VDL=1000 (SetInfo doesn't move VDL)
C. Locks dropped.
Thread1) Write comes in for 1000 for 500. This is not extending.
Locks taken shared.
Thread1) Data Written to Server. Thread stalls.
Thread2) Read comes in for 1000 for 1000. Locks taken shared
so it proceeds.
Thread2) CcRead calls CcZero and so the cache gets zeros from 1000 to 2000
Thread1) VDL moves forward.
The Windows cache is now poisoned between 1000 and 1500 and protected by
the VDL. Any future read gets the wrong data and any write to that part
will cause an overwrite of zeros.
Instead of holding the Fcb->NPFcb->Resource and
Fcb->NPFcb->SectionObjectResource shared during a NonCached write, hold
them exclusive because the write is occurring behind the back of the
Windows cache.
Change-Id: I2244e1247dcee2c3ca0d95e6ee11de3187d491c5
Reviewed-on: http://gerrit.openafs.org/9754
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Rod Widdowson <rdw@steadingsoftware.com>
Reviewed-by: Peter Scott <pscott@kerneldrivers.com>
Reviewed-by: Jeffrey Altman <jaltman@your-file-system.com>
Windows: AFS_INVALIDATE_DATA_VERSION only by service
Let the service make all decisions regarding when a data version
invalidation should be initiated. If during directory enumeration
or entry validation a data version change is noticed, that is an
indication that the meta data should be updated.
Change-Id: I8872fb5500b08ef2c6b64ab5fd13beeee4267aa2
Reviewed-on: http://gerrit.openafs.org/9743
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Peter Scott <pscott@kerneldrivers.com>
Reviewed-by: Jeffrey Altman <jaltman@your-file-system.com>
Windows: Update ValidDataLength on all nonPagingIo
Instead of updating the Fcb->Header.ValidDataLength only when
processing cached writes, update it for all non-PagingIo extending writes.
This ensures that a file that is extended by a mixture of cached and
non-cached (NO_INTERMEDIATE_BUFFERING) writes will properly trigger a
page fault when the Windows cache manager does not have a complete page
cached.
Change-Id: I255bb667e33fadd07eb8961901d33707812a8406
Reviewed-on: http://gerrit.openafs.org/9742
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Peter Scott <pscott@kerneldrivers.com>
Reviewed-by: Jeffrey Altman <jaltman@your-file-system.com>
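A simplified model of the ValidDataLength update from the entry above. The struct and function are hypothetical, and "extending" is interpreted here as "the write ends beyond the current ValidDataLength"; the real code works on Fcb->Header under the appropriate locks.

```c
#include <stdbool.h>
#include <stdint.h>

/* Reduced stand-in for the size fields kept in the FCB header. */
struct sketch_sizes {
    int64_t file_size;          /* EndOfFile */
    int64_t valid_data_length;  /* VDL */
};

/*
 * Record a completed write of `length` bytes at `offset`.
 * Paging I/O never moves the sizes. Any other write, cached or
 * non-cached, advances the sizes it has passed; they never move back.
 */
void sketch_note_write(struct sketch_sizes *s, int64_t offset,
                       int64_t length, bool paging_io)
{
    int64_t end = offset + length;

    if (paging_io)
        return;
    if (end > s->file_size)
        s->file_size = end;
    if (end > s->valid_data_length)
        s->valid_data_length = end;
}
```

With EndOfFile at 2000 and VDL at 1000, a non-cached write of 500 bytes at offset 1000 leaves EndOfFile at 2000 and moves VDL to 1500.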
Writes can alter both the EndOfFile (Fcb FileSize) and the ValidDataLength
which must remain synchronized with the data known to the service.
Dropping the Fcb.Resource and the SectionObjectResource prior to
performing non-cached writes opens the possibility of a race in which
data changes and length updates can be altered independently.
Efforts are made to avoid holding locks across calls to the service
because they can result in deadlocks with object invalidation or extent
management. However, object invalidation for data version changes is
now handled in a worker thread. It should be safe to hold onto the
Fcb Resource and SectionObjectResource across non-cached write processing.
The locks are not held in the paging IO path so paging non-cached
writes (which cannot be extending) will not prevent cached writes from
taking place in parallel.
The reason it is critical for the ValidDataLength and the FileSize to
remain in sync with the data for non-paging non-cached writes is that
these values are used to determine whether the Windows cache manager
should trigger a page fault to read data from the service upon receiving
an extending cached write that doesn't fill the page.
Change-Id: If3edb2a7412623dbec10a6efd2ee8d3b92ac992d
Reviewed-on: http://gerrit.openafs.org/9745
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Rod Widdowson <rdw@steadingsoftware.com>
Reviewed-by: Peter Scott <pscott@kerneldrivers.com>
Reviewed-by: Jeffrey Altman <jaltman@your-file-system.com>
as the volume label in the Volume Information response. For UNC
paths this is fine but for DOS devices on Windows 7 and earlier returning
a volume label that is longer than the NTFS maximum label length (32
characters) results in the Explorer Shell treating the volume as if it
does not support long file names.
From this patchset forward if the FileObject->FileName indicates that
the query is for a DOS Device, only return the AFS volume name and not the
cell information in the Volume Information response.
FIXES 131632
Change-Id: Iee26a00e0042e2594a5e039ee57688b61b10da45
Reviewed-on: http://gerrit.openafs.org/9751
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Rod Widdowson <rdw@steadingsoftware.com>
Reviewed-by: Peter Scott <pscott@kerneldrivers.com>
Reviewed-by: Jeffrey Altman <jaltman@your-file-system.com>
Windows: \\afs\all is not a server for NP enumeration
\\afs\all is a special share name that refers to the global root
which in the AFS redirector is actually \\AFS. However, from the
perspective of the network provider interface \\afs\all is just a
share which refers to a directory. Do not treat attempts to evaluate
it as if they were the same as evaluating \\AFS. One is a global
enumeration (\\AFS) and the other is just a hidden share name.
Change-Id: I24af24ec005c729bb1430c55254f2b68689932ed
Reviewed-on: http://gerrit.openafs.org/9750
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Rod Widdowson <rdw@steadingsoftware.com>
Reviewed-by: Peter Scott <pscott@kerneldrivers.com>
Reviewed-by: Jeffrey Altman <jaltman@your-file-system.com>
Modify the IOCTL_AFS_CONFIG_LIBRARY_TRACE DeviceIoControl message
to pass an AFSDebugTraceConfigCB which is used to toggle the value
of the Library's AFSDebugTraceFnc pointer. When the trace log is
enabled, the AFSDbgLogMsg parameter is non-NULL and when the log is
disabled, the parameter is NULL.
Change-Id: I71b951f244b760487f2ece94409cefaa7a73ea31
Reviewed-on: http://gerrit.openafs.org/9748
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Rod Widdowson <rdw@steadingsoftware.com>
Reviewed-by: Peter Scott <pscott@kerneldrivers.com>
Reviewed-by: Jeffrey Altman <jaltman@your-file-system.com>
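The trace toggle in the entry above amounts to installing or clearing a function pointer. The sketch below shows that shape in user-space C. The struct is a hypothetical stand-in for AFSDebugTraceConfigCB, and all sketch_* names are invented for illustration.

```c
#include <stddef.h>

typedef void (*sketch_trace_fn)(const char *msg);

/* Hypothetical stand-in for the configuration block sent with the ioctl. */
struct sketch_trace_config {
    sketch_trace_fn log_msg;    /* non-NULL enables tracing, NULL disables */
};

/* Stand-in for the library's trace function pointer. */
static sketch_trace_fn sketch_trace = NULL;
static int sketch_messages_logged = 0;

/* A trivial logger that only counts messages. */
void sketch_log_msg(const char *msg)
{
    (void)msg;
    sketch_messages_logged++;
}

int sketch_logged_count(void)
{
    return sketch_messages_logged;
}

/* Handler for the configuration message: just copy the pointer. */
void sketch_config_trace(const struct sketch_trace_config *cfg)
{
    sketch_trace = cfg->log_msg;
}

/* Every call site checks the pointer before logging. */
void sketch_do_work(void)
{
    if (sketch_trace != NULL)
        sketch_trace("doing work");
}
```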
The amount of space allocated for use by the pioctl call to
obtain the ACL for the source directory in the "up" command
is not large enough and the call fails when access lists get
sufficiently large.
This change increases the size of the space provided to
pioctl to the maximum possible. This allows for much larger
access lists and is consistent with a similar call in the
"fs listacl" command.
OpenBSD 5.3: Replace use of copyinstr for setting mount point name.
As a result of a realignment of kernel memory in OpenBSD 5.3,
the copyinstr() routine no longer works for copying the mount
point name into the internal mount table structure. It also
fails silently, so it's not noticed until someone looks at
the mount table and discovers that the mount point name for
AFS is missing.
This patch replaces the use of copyinstr() with strlcpy() for
copying the mount point name in OpenBSD 5.3.
Note that this is consistent with how other similar device
support has addressed the same issue in OpenBSD 5.3.
Andrew Deason [Thu, 28 Mar 2013 21:42:58 +0000 (16:42 -0500)]
aklog: Probe for libasn1 on heimdal
aklog uses encode_EncTicketPart and some other encode_* ASN.1 routines
when we're building against heimdal. Our krb5 autoconf logic from
c-rra-util is not guaranteed to include libasn1 in KRB5_LIBS, since
it's not required for functions in the krb5 API. So, specifically test
for it.
In almost all cases where an AFSCcb is present the associated AFSFcb
is also present. The AFSFcb has a direct pointer to the AFSObjectInfoCB.
This patchset replaces the Ccb->DirectoryCB->ObjectInformation references
with Fcb->ObjectInformation. This avoids one level of pointer indirection
and will make it easier to remove the DirectoryCB ObjectInformation
pointer in the future.
Change-Id: I2a6f5d2ed8ef1ad85691f07f425f99e3fb6cce31
Reviewed-on: http://gerrit.openafs.org/9724
Reviewed-by: Peter Scott <pscott@kerneldrivers.com>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@your-file-system.com>
Tested-by: Jeffrey Altman <jaltman@your-file-system.com>
AFSDeleteDirEntry() frees the memory allocated to the DirectoryCB.
To ensure that an invalid memory pointer is not accidentally used
by the caller after the memory is freed, use
InterlockedCompareExchangePointer() to set the input parameter to
NULL prior to destroying the DirectoryCB.
Change-Id: I2e92d4277d1f9baee164bfb941821aa11a1ad738
Reviewed-on: http://gerrit.openafs.org/9721
Reviewed-by: Peter Scott <pscott@kerneldrivers.com>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@your-file-system.com>
Tested-by: Jeffrey Altman <jaltman@your-file-system.com>
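The "clear the caller's pointer before freeing" idea from the entry above can be sketched portably with C11 atomics. Note the substitution: the driver uses InterlockedCompareExchangePointer(), while this sketch uses atomic_exchange() from <stdatomic.h> as a user-space analogue. The struct and function names are hypothetical.

```c
#include <stdatomic.h>
#include <stdlib.h>

/* Reduced stand-in for a DirectoryCB. */
struct sketch_dir_entry {
    int payload;
};

/*
 * Detach the entry from the caller's slot, then free it.
 * The slot is set to NULL before the memory is released, so the caller
 * cannot keep using a dangling pointer through that variable.
 * Returns 1 if an entry was freed, 0 if the slot was already NULL.
 */
int sketch_delete_entry(struct sketch_dir_entry *_Atomic *slot)
{
    struct sketch_dir_entry *victim = atomic_exchange(slot, NULL);

    if (victim == NULL)
        return 0;
    free(victim);
    return 1;
}
```

A second call on the same slot is harmless because the first call already left NULL behind.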
Periodically there is a lost race which results in a valid DirectoryCB
with a non-NULL ObjectInformation pointer that refers to freed memory.
This major reorganization simplifies the logic and attempts to close
potential loopholes.
First, the AFSExamineDirectory() function is removed and replaced by
a call to AFSDeleteDirEntry(). The AFSExamineDirectory() function
examined all of the children AFSObjectInfoCB objects which in turn
duplicated much of the logic of AFSExamineObjInfo at the cost of
increased complexity due to the additional layer of locked objects.
Once the AFSDirectoryCB is removed a subsequent pass of the worker
thread will free the AFSObjectInfoCBs.
Second, the AFS_OBJECT_REFERENCE_DIRENTRY category had been used for
both DirectoryCB references and the Pioctl references. A new
AFS_OBJECT_REFERENCE_PIOCTL category has been created to improve the
ability to track the allocations and releases.
Third, the AFSPrimaryVolumeWorker thread now attempts to hold onto the
VolumeCB TreeLock exclusively. Previously the lock was held shared.
However, it is not safe for the garbage collection and the find
routines to both hold the lock shared. One has to be exclusive. Although holding
the TreeLock exclusively in the garbage collection processing will result
in the lock being held for extended periods of time, it is more likely
that there will be benefits from parallel access during AFSFindObjectInfo()
calls.
Attempts to obtain most other locks are non-blocking. If the lock cannot
be obtained, the object must be in use. Therefore, it should not be
garbage collected.
AFSFindObjectInfo performed the search of Volume object tree protected by
the TreeLock but dropped the lock before incrementing the reference count.
This behavior contributed to a race with the AFSPrimaryVolumeWorkerThread
which has to drop the VolumeCB TreeLock periodically in order to safely
cleanup FCBs.
Change-Id: I0cba4a118e4835edee7702db97846567618e0adf
Reviewed-on: http://gerrit.openafs.org/9719
Reviewed-by: Peter Scott <pscott@kerneldrivers.com>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@your-file-system.com>
Tested-by: Jeffrey Altman <jaltman@your-file-system.com>
Jeffrey Altman [Tue, 26 Mar 2013 12:52:59 +0000 (08:52 -0400)]
Windows: AFSExamineObject() refcnt underflows
Now that the reference counting is likely to be correct, do not
garbage collect objects with negative reference counts. If the
reference counts are wrong the objects will never be destroyed but
that is now a safer choice than freeing memory that might be in use.
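Expressed as a predicate, the rule above looks roughly like this. The function name is hypothetical, and the real AFSExamineObject() checks considerably more than a single counter.

```c
#include <stdbool.h>

/*
 * Only an object whose reference count is exactly zero may be collected.
 * A negative count signals an accounting error, so the object is left
 * alone: leaking it is safer than freeing memory that may be in use.
 */
bool sketch_can_collect(long ref_count)
{
    return ref_count == 0;
}
```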
Andrew Deason [Wed, 3 Apr 2013 21:39:07 +0000 (16:39 -0500)]
vos: Restore some VNOVOL error messages
Many places in vos/vsprocs have code to delete a volume. Commit
f4e73067cdef990262c69c38ac98761620a63f25 tried to refactor them by
consolidating the common "delete" code into DoVolDelete. However, not
all of the removed code had exactly the same behavior, and some of
these variants were not handled by DoVolDelete.
One such variation is that DoVolDelete always printed an error message
if the target volume did not exist. But for some call sites this
condition is not an error, and prior to the refactoring they did not
print such an error message. Commit
1092cbe34fc8519826b3fa0565505b7bd81bc922 tried to correct this by
suppressing the error message if the target volume does not exist.
However, this means that all DoVolDelete calls do not print such an
error, where some should and some should not print an error. This
means that in some edge cases when we encounter an unexpected VNOVOL
error, we now skip printing the specific error we got and instead go
right to cleanup/recovery/exit. For a few other cases, we used to
print an error and continue (because it is a non-fatal error or a
warning), but now we print nothing when we encounter a VNOVOL error.
Fix this by specifically printing an error for the VNOVOL error for
DoVolDelete call sites that used to print such an error. Do this for
all such sites except ones where we obviously print an error
immediately afterwards anyway.
This is just a quick targeted fix. A future more robust fix should
involve altering DoVolDelete to handle all of the different behaviors
expected by its various callers.
Change-Id: Ia79bce3d2fed4acd62d517064db5b6be77f6e987
Reviewed-on: http://gerrit.openafs.org/9704
Reviewed-by: Mark Vitale <mvitale@sinenomine.net>
Reviewed-by: Andrew Deason <adeason@sinenomine.net>
Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@your-file-system.com>
Instead of testing for Characteristics = FILE_READ_ONLY_DEVICE
which applies to the entire device, only return media protected
errors if the volume FileSystemAttributes include FILE_READ_ONLY_VOLUME.
Change-Id: Ice716083c7f0ecb9e80d0ca9e3e143249293d28e
Reviewed-on: http://gerrit.openafs.org/9699
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Peter Scott <pscott@kerneldrivers.com>
Reviewed-by: Jeffrey Altman <jaltman@your-file-system.com>
Tested-by: Jeffrey Altman <jaltman@your-file-system.com>
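A sketch of the per-volume check from the entry above. The constant is a local stand-in for FILE_READ_ONLY_VOLUME (its numeric value here should be treated as a placeholder, not as the SDK definition), and the function name is hypothetical.

```c
#include <stdbool.h>

/* Local stand-in for FILE_READ_ONLY_VOLUME; value is a placeholder. */
#define SKETCH_FILE_READ_ONLY_VOLUME 0x00080000u

/*
 * Decide whether a write-type request should fail as write protected.
 * The decision is based on the attributes of the specific volume, not
 * on a characteristic of the whole device.
 */
bool sketch_volume_is_write_protected(unsigned fs_attributes)
{
    return (fs_attributes & SKETCH_FILE_READ_ONLY_VOLUME) != 0;
}
```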
Andrew Deason [Tue, 26 Mar 2013 22:50:31 +0000 (17:50 -0500)]
volser: Make VolListOneVolume errors consistent
Currently, VolXListOneVolume errors out with ENODEV if any attachment
error occurs with the specified volume. But VolListOneVolume always
returns success if it can find the indicated volume, and any
attachment errors and such are reported in the 'status' field of the
volume info structure.
These two functions do pretty much the same thing; VolXListOneVolume
just provides more info than VolListOneVolume. So make them behave the
same way, and provide more specific information, whether or not
somebody ran 'vos examine' or 'vos examine -extended'.
The 'vos' binary has always handled errors in the 'status' volume info
structure for both "extended" and non-"extended" queries. This
difference appears to just have been a mistake from OpenAFS 1.0.
Andrew Deason [Tue, 26 Mar 2013 22:26:23 +0000 (17:26 -0500)]
volser: Restore Vol*ListOneVolume error handling
In the 1.4 series, the volserver VolListOneVolume function always
returned success if the specified volume was found in any way, and
ENODEV otherwise. The VolXListOneVolume returned ENODEV if the volume
was not found, or if any error occurred.
DAFS (specifically, commit ed25934c1fe96b143715025b49104e75dce9a361)
changed these so they both behave the same way. That is, they both
return success if the volume was found at all, and ENODEV otherwise.
These changes mean that a 'vos examine' for a volume with an existing
volume transaction now indicates that a volume is offline/unattached,
but in the 1.4 series, the volume was indicated as "busy".
So, restore the original 1.4 behavior of these functions, so the
volume status is reported as it always was. This effectively reverts
53cc2ebaea5e5488d5285f0d13ffa47069ee986f, and slightly changes the
post-DAFS code to look more like the 1.4 code. This also removes the
'code' variable from VolListOneVolume and adds an explicit comment
about what's going on, to make this a little more clear.
While changing the behavior of VolXListOneVolume to match that of
VolListOneVolume perhaps makes sense, for now just restore the exact
1.4 behavior, and make the function flow look a little more like the
1.4 code did. A future change may make them the same again.
Jeffrey Altman [Wed, 27 Mar 2013 04:49:56 +0000 (00:49 -0400)]
Windows: cache readonly volume size information
Cache the volume size information for .readonly volumes which can
be reset when the volume callback is broken. This reduces the number
of RXAFS_GetVolumeStatus RPC calls issued on .readonly volumes.
Jeffrey Altman [Tue, 26 Mar 2013 13:08:58 +0000 (09:08 -0400)]
Windows: btree enumeration bulk stats
Each of the btree enumeration bulk stat operations includes the
directory object in the bulk stat list. If the only object in the
list is the directory object, do not perform the bulk stat RPC as
it just wastes time. All of the required objects are already cached
with current callbacks.
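A minimal sketch of the short-circuit described above, assuming a hypothetical request list in which the directory object is always counted as one entry. The structure and function names are illustrative and are not taken from the Windows cache manager sources.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bulk stat request list. The directory being enumerated
 * is always queued, so a count of 1 means "directory only". */
struct ex_bulkstat_list {
    size_t count; /* objects queued, including the directory itself */
};

/* Return true only when the RPC would fetch status for something other
 * than the directory object. */
static bool ex_bulkstat_needed(const struct ex_bulkstat_list *bsp)
{
    return bsp->count > 1;
}
```

A caller would test ex_bulkstat_needed() before issuing the RPC and skip the call when it returns false.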
Andrew Deason [Tue, 26 Mar 2013 22:00:05 +0000 (17:00 -0500)]
volser: Indicate busy volume with VBUSY
Commit 34fc86bcc749f3bd059831b7e5dae03dc09a9393 changed several uses
of VBUSY to VOLSERVOLBUSY in order to detect retriable operations.
However, one such change did not change an Rx abort code, but instead
was used for the 'status' field for a volintInfo or volintXInfo
structure. That is not really a general error code, but a field with a
few specific known values (at least, that is how existing clients
interpret it).
Go back to using VBUSY, so clients indicate the volume as busy,
instead of as offline/unattached.
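To illustrate why the value stored in 'status' matters, here is a rough sketch of how a client might classify that field. It assumes the client recognizes only success and VBUSY (110 in the AFS volume error codes) and treats any other value as offline/unattached. The enum, the function name and the placeholder used for VOLSERVOLBUSY are hypothetical; the real VOLSERVOLBUSY value is not reproduced here.

```c
#include <stdint.h>

#define EX_VBUSY 110 /* VBUSY: volume temporarily unavailable */
/* Placeholder only: NOT the real VOLSERVOLBUSY value. Any value the
 * client does not recognize behaves the same way in this sketch. */
#define EX_VOLSERVOLBUSY_PLACEHOLDER 999999

enum ex_display { EX_DISPLAY_OK, EX_DISPLAY_BUSY, EX_DISPLAY_UNATTACHED };

/* Classify the 'status' field of a volume info structure the way a
 * simple client might: a few known values, everything else unattached. */
static enum ex_display ex_classify_status(int32_t status)
{
    if (status == 0)
        return EX_DISPLAY_OK;
    if (status == EX_VBUSY)
        return EX_DISPLAY_BUSY;
    return EX_DISPLAY_UNATTACHED;
}
```

Under these assumptions, a server that stores an Rx-level code in 'status' causes the volume to be shown as unattached rather than busy.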
Andrew Deason [Tue, 26 Mar 2013 18:27:33 +0000 (13:27 -0500)]
aklog: Only try to use krb5-weak.conf if it exists
The logic we use for using krb5-weak.conf to allow 'weak crypto'
requires us to know where the default krb5.conf is. The default
krb5.conf location can vary significantly depending on the platform, and
we don't have a good way of figuring out what it is, so we guess. We
may guess wrong.
To limit the cases where we guess wrong, only try to do this
workaround if the krb5-weak.conf file actually exists.
Ben Kaduk [Tue, 26 Mar 2013 21:42:38 +0000 (17:42 -0400)]
Fix DARWIN build with clang
In 1d8937b86050 we added a function call to kauth_cred_unref in the
DARWIN100 case (replacing a macro), but added the inclusion of
sys/kauth.h only when using versions older than DARWIN80.
On DARWIN100 and above, clang detects that the now-implicit function
declaration is in conflict with the actual prototype, which is included
later through afs/sysincludes.h when compiling the kernel rx code.
Since including sys/kauth.h seems to have been harmless for old versions,
just include it always.
Andrew Deason [Tue, 26 Mar 2013 18:14:30 +0000 (13:14 -0500)]
aklog: Search for /etc/krb5/krb5.conf
aklog tweaks the KRB5_CONFIG environment var when performing one of
our 'weak crypto' workarounds. We assume that the default krb5.conf is
/etc/krb5.conf, but for Solaris 11 libkrb5, krb5.conf is in
/etc/krb5/krb5.conf. Although this file could be anywhere, try
/etc/krb5/krb5.conf too, so we at least work on stock Solaris.
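The search described above amounts to trying a short list of candidate paths and taking the first one that is readable. The sketch below assumes a POSIX system; the function name and the idea of passing the candidate list as a parameter are illustrative and do not reflect how aklog is actually structured.

```c
#include <stddef.h>
#include <unistd.h>

/* Return the first path in 'paths' that the process can read, or NULL
 * if none of them can be read. */
static const char *ex_first_readable(const char *const *paths, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (access(paths[i], R_OK) == 0)
            return paths[i];
    }
    return NULL;
}
```

A caller could pass { "/etc/krb5.conf", "/etc/krb5/krb5.conf" } as the candidate list; the order shown is an assumption. A NULL result would mean the workaround should be skipped.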
Mark Vitale [Wed, 13 Mar 2013 02:13:20 +0000 (22:13 -0400)]
dafs: prevent corruption in large fsstate.dat files
If a write to the fsstate.dat file would extend past the current
size of the file (a multiple of FS_STATE_INIT_FILESIZE (8MiB)),
we call fs_stateResizeFile. This un-mmaps, truncates, and
re-mmaps the file. Unfortunately, fs_stateMapFile() resets the
state->mmap.offset and .cursor, so any writes in flight over
the resize will overwrite the first bytes of the file (and leave
zeros in the file where the data should have been written).
Upon return from the write that caused a file resize, the offset
is eventually corrected and the state dump continues with a
silent failure. Eventually the state dump completes and the
file header is rewritten; this may conceal some or all of
the overwrite damage at offset 0. However, any zeros near the 8MiB
offset (and its multiples) remain as corruption.
Add a flag to fs_stateMapFile() to allow the caller to specify if
the offset and cursor should be preserved. Modify fs_stateResizeFile()
to use this capability.
testing note: temporarily reduced FS_STATE_INIT_FILESIZE to 256 bytes
during testing in order to make the problem easier to reproduce.
This problem would normally occur only on relatively large/active
DAFS fileservers.
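The fix can be pictured with a simplified model of the mapping. The sketch below uses a heap buffer in place of mmap so that it runs anywhere; all names are hypothetical and the real fs_stateMapFile() signature is not reproduced. The point it shows is that a remap with the preserve flag set keeps the write position, so a write that crosses a resize boundary lands where it should.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-in for the state file mapping. */
struct ex_state_map {
    unsigned char *base;   /* start of the "mapping" (heap buffer here) */
    size_t size;           /* current mapped size */
    size_t offset;         /* current write position */
    unsigned char *cursor; /* always base + offset */
};

/* (Re)establish the mapping at new_size. With preserve == 0 the write
 * position is reset to the start, which is the behavior that caused the
 * overwrite; with preserve != 0 the position survives the remap. */
static int ex_map_file(struct ex_state_map *m, size_t new_size, int preserve)
{
    size_t old_size = m->size;
    unsigned char *p = realloc(m->base, new_size);

    if (p == NULL)
        return -1;
    if (new_size > old_size)
        memset(p + old_size, 0, new_size - old_size); /* new space is zeroed */
    m->base = p;
    m->size = new_size;
    if (!preserve)
        m->offset = 0;
    m->cursor = m->base + m->offset;
    return 0;
}

/* Append len bytes, growing the mapping in grow_by steps as needed. */
static int ex_write(struct ex_state_map *m, const void *buf, size_t len,
                    size_t grow_by)
{
    if (grow_by == 0)
        return -1;
    while (m->offset + len > m->size) {
        if (ex_map_file(m, m->size + grow_by, 1) != 0) /* keep position */
            return -1;
    }
    memcpy(m->cursor, buf, len);
    m->offset += len;
    m->cursor += len;
    return 0;
}
```

Passing 0 instead of 1 in the resize path would reproduce the failure: the write would land at offset 0 and the intended location would be left as zeros.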
Mark Vitale [Fri, 25 Jan 2013 23:47:49 +0000 (18:47 -0500)]
salvager: prevent assertion during -orphans attach
Improve JudgeEntry() detection of orphaned directories to
prevent unintentional deletion of their '.' and '..' entries.
This in turn prevents a later assert (opr_Verify) when we try to
delete and re-add '..' in order to attach the orphan.
In JudgeEntry(), 2 sources of information about a
directory entry are compared for consistency:
- vnodeEssence (unique) from its vnode index entry
- name, vnodeNumber and unique from its dir blob entry
A directory entry may be ignored, deleted, or repaired/replaced,
based upon the results of these and other tests (e.g. dirOrphaned).
The '.' and '..' entries are treated as special cases because
we do not want to delete them at this point if this directory
is orphaned. However, the current test for orphanhood
(vnodeEssence->unique == 0) is not sufficient; it could be
zero for other reasons. This commit now uses the dirOrphaned
flag to test for this.
However, we are still interested in doing the right thing
for '.' and '..' entries with vnodeEssence->unique == 0.
This may indicate that the dir blob entry is pointing at the
wrong vnode, and that vnode has unique==0. The current code
incorrectly ignores (returns 0) this case. This commit now
falls through to the repair/replace code so that we can
find the correct vnode for this entry.
The current code assumes that the 'vnodeEssence->unique == 0 &&
!dirOrphaned' case doesn't exist.
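One way to read the decision described above is sketched below. This is a simplified interpretation of the commit message, not the salvager's JudgeEntry() code: the verdict names, the function signature and the reduction to three outcomes are all assumptions made for illustration.

```c
#include <stdbool.h>
#include <string.h>

enum ex_verdict {
    EX_LEAVE_ALONE,       /* keep the entry untouched for orphan attach */
    EX_REPAIR_OR_REPLACE, /* look up the correct vnode for this entry */
    EX_ACCEPT             /* entry and vnode index agree */
};

static bool ex_is_dot_entry(const char *name)
{
    return strcmp(name, ".") == 0 || strcmp(name, "..") == 0;
}

/* entry_unique comes from the dir blob entry, vnode_unique from the
 * vnode index entry (vnodeEssence->unique in the commit message). */
static enum ex_verdict
ex_judge_entry(const char *name, bool dir_orphaned,
               unsigned entry_unique, unsigned vnode_unique)
{
    /* Orphanhood is decided by the explicit flag, not by unique == 0. */
    if (dir_orphaned && ex_is_dot_entry(name))
        return EX_LEAVE_ALONE;

    /* unique == 0 on a non-orphaned directory is no longer ignored;
     * it falls through to repair like any other mismatch. */
    if (vnode_unique == 0 || entry_unique != vnode_unique)
        return EX_REPAIR_OR_REPLACE;

    return EX_ACCEPT;
}
```

The second branch corresponds to the case the old code assumed could not happen.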
Jeffrey Altman [Fri, 15 Mar 2013 03:27:25 +0000 (23:27 -0400)]
vol: remove duplicate stmp declaration
Patchset 38cf31463e3f3c675de727c1e793e117a90e6d20 added a definition of
afs_ino_str_t stmp which should have replaced the b64_string_t stmp
declaration that was already present.