Ben Kaduk [Wed, 27 Mar 2013 21:02:55 +0000 (17:02 -0400)]
Export heimdal's rand-fortuna PRNG to the kernel
Some systems (e.g., AIX, SGI, DFBSD, HPUX) do not supply a useful
implementation of osi_readRandom(), in some cases because the kernel
does not expose a random-number interface to kernel modules. We want
real random numbers on all systems, because we want to use them for
setting the RX epoch and connection ID in the kernel.
Build hcrypto's rand-fortuna PRNG into the rand-kernel interface we expose,
and implement RAND_bytes using rand-fortuna when osi_readRandom()
is not useful.
Add stub routines to config.h as needed, and add a heim_threads.h
with the necessary locking for rand-fortuna. The rand-fortuna algorithm
requires some measure of time's passage, so provide a stub gettimeofday()
with single-second resolution. We use a single (global) mutex for the
hcrypto kernel code, so that we can statically declare an initializer to
be the address of that mutex. Otherwise the locking is taken essentially
wholesale from rx_kmutex.
rand-fortuna requires the sha256 code for its hashing, and also
requires a stub rand-fortuna to satisfy linker symbol visibility.
Since the rand-fortuna code does not have any actual sources of entropy
available to it during its initialization routines, we must explicitly
seed the in-kernel rand-fortuna using entropy passed in from userland.
(Userland will always have at least /dev/random available, so the
userland hcrypto should always have usable entropy.) Be sure to do so
early in the afsd startup sequence, before any daemons are started, so
that entropy is available to the core rx code for generating the epoch
and cid -- the rand-fortuna code will (erroneously) always claim that
it has startup entropy even though in this case it may not actually
have any entropy. The rand-fortuna code does not consider itself
fully seeded until it has 128 bytes of entropy, so be sure to pass
more than that in from userspace.
It is preferable to always build this code into the kernel, even on
systems where it is not going to be used, to help prevent bitrot. This
also avoids the possibility of a new system being supported that would
attempt to use the rand-fortuna code but fail to supply any seed entropy,
which would not necessarily be readily apparent.
Simon Wilkinson [Mon, 25 Aug 2014 15:25:43 +0000 (16:25 +0100)]
ubik: Don't leak UBIK_VERSION_LOCK if udisk_LogEnd fails
If the call to udisk_LogEnd() fails (probably due to an I/O error)
don't leak the UBIK_VERSION_LOCK.
This is a possible cause of a vlserver deadlock, which had
approximately 4800 threads blocked. Analysis of backtrace of all
of these threads showed that all blocked threads were waiting in
ubik.c:555 (blocked on DBHOLD) with the exception of:
One in beacon.c:388 (blocked on UBIK_VERSION_LOCK)
One in recovery.c:503 (blocked on DBHOLD)
One in ubik.c:125 (blocked on DBHOLD)
One in ubik.c:585 (blocked on UBIK_VERSION_LOCK)
The last of these is the critical one, because it already holds
the lock that DBHOLD waits on - so despite the vast majority of
threads being blocked in DBHOLD, it's actually UBIK_VERSION_LOCK
that we're waiting on.
There is no sign of a thread which is still active which currently
holds UBIK_VERSION_LOCK.
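The pattern behind this fix can be sketched as follows. This is a minimal
illustration, not the ubik code: the toy lock, log_end(), and
end_transaction() are hypothetical names, and the toy lock only records
whether it is held. The point is that the error path jumps to the same
unlock as the success path.

```c
/* Toy lock that only records whether it is held (hypothetical stand-in
 * for UBIK_VERSION_LOCK; not a real synchronization primitive). */
struct toy_lock { int held; };
static struct toy_lock version_lock;

static void toy_lock_acquire(struct toy_lock *l) { l->held = 1; }
static void toy_lock_release(struct toy_lock *l) { l->held = 0; }

/* Hypothetical I/O step: returns 0 on success, -1 on failure. */
static int log_end(int simulate_failure)
{
    return simulate_failure ? -1 : 0;
}

/* Every exit path, including the error path, releases the lock. */
int end_transaction(int simulate_failure)
{
    int code;

    toy_lock_acquire(&version_lock);
    code = log_end(simulate_failure);
    if (code != 0)
        goto done;              /* error: still reaches the release below */
    /* ... further work that needs the lock would go here ... */
done:
    toy_lock_release(&version_lock);
    return code;
}

/* Reports whether the lock was leaked by a previous call. */
int version_lock_held(void)
{
    return version_lock.held;
}
```

Calling end_transaction(1) returns the error and leaves
version_lock_held() at 0; an early "return code;" in place of the goto
would leave it at 1, which is the leak described above.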
Simon Wilkinson [Mon, 25 Aug 2014 15:15:26 +0000 (16:15 +0100)]
ubik: Don't leak UBIK_VERSION_LOCK if setlabel fails
If a call to the setlabel() physical IO function fails, don't
leak the UBIK_VERSION_LOCK.
This is a possible cause of a vlserver deadlock, which had
approximately 4800 threads blocked. Analysis of backtrace of all
of these threads showed that all blocked threads were waiting in
ubik.c:555 (blocked on DBHOLD) with the exception of:
One in beacon.c:388 (blocked on UBIK_VERSION_LOCK)
One in recovery.c:503 (blocked on DBHOLD)
One in ubik.c:125 (blocked on DBHOLD)
One in ubik.c:585 (blocked on UBIK_VERSION_LOCK)
The last of these is the critical one, because it already holds
the lock that DBHOLD waits on - so despite the vast majority of
threads being blocked in DBHOLD, it's actually UBIK_VERSION_LOCK
that we're waiting on.
There is no sign of a thread which is still active which currently
holds UBIK_VERSION_LOCK.
Garrett Wollman [Thu, 28 Aug 2014 07:09:49 +0000 (03:09 -0400)]
config: remove support for old FreeBSD releases
The FreeBSD project no longer supports 5.x, 6.x, or 7.x releases, and
has not done so for a long time. It's unlikely that OpenAFS works
properly on any of them, if it even still builds, since it is not
regularly build-tested on anything older than 8.3. Unclutter
src/config by removing the param.*.h files for these obsolete
releases.
Garrett Wollman [Thu, 28 Aug 2014 07:04:19 +0000 (03:04 -0400)]
README: update for current state of FreeBSD support
The FreeBSD project hasn't supported releases prior to 8.x for a long
time now, and OpenAFS is neither built nor tested regularly on
anything that old. Dedocument support for these releases in
preparation for later removing configuration support.
Windows: Avoid deadlock during pending delete cleanup
Release the Fcb resource and clear the AFS_DIR_ENTRY_PENDING_DELETE
flag prior to the AFSProcessRequest(AFS_REQUEST_TYPE_CLEANUP_PROCESSING)
if a delete is pending during cleanup of the last FCB open handle.
Failure to do so results in an out of order lock acquisition when
the parent object info tree lock is acquired after the AFSProcessRequest()
call to the service completes.
Volume dumps can be created from backup volumes, cloned volumes, or
directly from RW volumes. The beginning and end of the time range
covered by the dump is recorded in the DumpHeader. The end time is
based on the type of the volume. Use backupDate for backup volumes,
use copyDate for cloned volumes, and updateDate for RW volumes.
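The selection rule above might look like the following sketch. The enum
and struct are hypothetical; only the field names backupDate, copyDate,
and updateDate come from the description.

```c
#include <time.h>

/* Hypothetical volume-type tags, for illustration only. */
enum vol_type { VOL_RW, VOL_CLONE, VOL_BACKUP };

/* Hypothetical container for the three timestamps named above. */
struct vol_times {
    time_t updateDate;
    time_t copyDate;
    time_t backupDate;
};

/* Pick the end of the dump's time range according to the volume type. */
time_t dump_end_time(enum vol_type type, const struct vol_times *t)
{
    switch (type) {
    case VOL_BACKUP:
        return t->backupDate;
    case VOL_CLONE:
        return t->copyDate;
    case VOL_RW:
    default:
        return t->updateDate;
    }
}
```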
Jeffrey Altman [Sun, 29 Jun 2014 03:03:45 +0000 (23:03 -0400)]
Windows: Do not sync callbacks when only need locks
Syncing lock operations with callback fetching is unnecessary because
local lock state is not tracked via callbacks. More importantly it
risks blocking the cm_LockDaemon thread which needs to be able to
renew locks without obstruction.
Jeffrey Altman [Mon, 18 Aug 2014 19:28:14 +0000 (15:28 -0400)]
Windows: set hard dead timeout not conn timeout for probes
For the Rx connections used for probing VL and FILE servers set a
hard dead timeout and not a connection timeout. A connection timeout
will not terminate the call as long as the lastReceiveTime continues
to be updated by ping packets. The hard dead timeout will cause the
connection to fail when the 10 second limit expires.
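The difference between the two timeouts can be modelled roughly as below.
This is a simplified sketch, not the Rx implementation: the struct and
both functions are hypothetical, and times are plain seconds.

```c
/* Hypothetical model of a probe call's timestamps, in seconds. */
struct probe_call {
    long startTime;        /* when the call began */
    long lastReceiveTime;  /* refreshed by every packet, including pings */
};

/* Connection timeout: measured from the last packet received, so a
 * steady stream of ping packets keeps it from ever expiring. */
int conn_timeout_expired(const struct probe_call *c, long now, long deadTime)
{
    return now - c->lastReceiveTime > deadTime;
}

/* Hard dead timeout: measured from the start of the call, so it expires
 * no matter how recently a packet arrived. */
int hard_dead_expired(const struct probe_call *c, long now, long hardDeadTime)
{
    return now - c->startTime > hardDeadTime;
}
```

With a call started at t=0 that last received a ping at t=29, at t=30 a
10 second connection timeout has not expired, but a 10 second hard dead
timeout has.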
Jeffrey Altman [Mon, 18 Aug 2014 19:25:50 +0000 (15:25 -0400)]
Windows: Freelance whole volume rdr invalidate
When updating the Freelance directory do not notify the redirector
of individual objects to invalidate since that can lead to race
conditions. Send whole volume invalidations since that is what is
required in any case.
pete scott [Wed, 13 Aug 2014 19:28:49 +0000 (15:28 -0400)]
Windows: Obtain File Attribs for DFS Link target
The AFSRetrieveFileAttributes() function is used to acquire the
attributes for an AFS symlink. The result is either returned directly
to the application or used internally to determine the attributes
to be exposed by reparse points.
If the evaluated symlink crosses a DFS Link the redirector cannot
return the request to IO Manager to evaluate the target. Instead
the redirector must handle the request internally and attempt to
read the attributes of the target object.
Change-Id: If14df8dc41e13fd59b524fdb575c46abab1dfc2f
Reviewed-on: http://gerrit.openafs.org/11399 Reviewed-by: Peter Scott <pscott@kerneldrivers.com> Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Jeffrey Altman <jaltman@your-file-system.com>
pete scott [Mon, 11 Aug 2014 17:18:16 +0000 (13:18 -0400)]
Windows: LocateName skip DFS Link only last component
As with Mount Points and Symlinks, when AFSLocateName() is called to
process a CreateFile with Open_Reparse_Point enabled, DFS Link processing
must be disabled only for the last component in the path. Failure to
do so results in the AFS Redirector succeeding IRP_MJ_CREATE calls that
should be given back to the IO Manager so the path can be evaluated by
another file system.
Change-Id: I1627e7c6582d3a80d99dd2acc5171135a6a7bc4b
Reviewed-on: http://gerrit.openafs.org/11398 Reviewed-by: Peter Scott <pscott@kerneldrivers.com> Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Jeffrey Altman <jaltman@your-file-system.com>
Jeffrey Altman [Mon, 11 Aug 2014 05:07:27 +0000 (01:07 -0400)]
Windows: Reparse Policy vs DFSLinks
When a reparse policy is specified and AFSLocateNameEntry() returns
with STATUS_REPARSE, do not re-evaluate the path with the reparse
policy disabled. STATUS_REPARSE was returned because the FileObject's
FileName was modified and the IO Manager needs to reparse the request.
Jeffrey Altman [Mon, 11 Aug 2014 05:38:54 +0000 (01:38 -0400)]
Windows: AFSParseName always set FileName output
The FileName output parameter is used by the caller even when an
error occurs. In case of error it indicates the path that failed
to parse. Not all of the error paths set FileName.
Start AFSParseName() with FileName referring to
IrpSp->FileObject->FileName. It can be updated as required later.
Jeffrey Altman [Mon, 11 Aug 2014 05:28:12 +0000 (01:28 -0400)]
Windows: Refactor AFSParseName related name parsing
AFSParseName() is a very long, complex function. Extract the parsing
of the RelatedFileObject name to a new function, AFSParseRelatedName().
This removes ~160 lines of source code from AFSParseName().
This changeset is not intended to introduce any functional changes.
Michael Meffie [Tue, 18 Feb 2014 20:23:54 +0000 (15:23 -0500)]
vos: cross-device link error message
Print a better diagnostic message for cross-device link errors, which
happen when a clone volume is not in the same partition as the
parent read-write volume.
Perry Ruiter [Wed, 4 Jun 2014 22:27:32 +0000 (15:27 -0700)]
redhat: Fix minor whitespace errors in openafs-kmodtool
During review of commit c20c01185ed748b2bc823369a8f28cf004b7d1c9
gerrit flagged one of the changed lines as having a trailing whitespace
error. This patch corrects that error and several others that were
in the file.
Change-Id: I3668e67e456322cccdfa76df935951053f9b6a48
Reviewed-on: http://gerrit.openafs.org/11200 Reviewed-by: Ken Dreyer <ktdreyer@ktdreyer.com> Reviewed-by: Chas Williams - CONTRACTOR <chas@cmf.nrl.navy.mil> Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Jeffrey Altman <jaltman@your-file-system.com>
Perry Ruiter [Tue, 27 May 2014 07:07:52 +0000 (00:07 -0700)]
Correct comment typos in a couple files
Correct typos in a couple files. These were noticed while
researching code paths. Comment changes only. No code change.
afs/afs_stats.h has source file names updated on several lines.
Many source file name comments are wrong in this file.
I didn't attempt to correct them all, just the ones I bumped
into. If I bump into others in the future I'll fix them then.
rx/rx_call.h has source of enumerated types corrected.
Ben Kaduk [Mon, 21 Jul 2014 21:30:36 +0000 (17:30 -0400)]
Remove some incomplete struct initializers
C99 requires that objects with static linkage, which includes
global variables, be initialized to zero/NULL.
It is possible that old compilers required a hack of using one
explicit initializer and relying on the requirement from C99 that
the elements of the structure not listed in the initializer be
initialized as if it had static linkage. These incomplete initializers
seem to have been introduced to support old OS X compilers which
are not believed to still be in use.
Using a complete explicit initializer is undesired here, as the
rxkad statistics structures have a great number of elements and
the uuid structure is somewhat complicated.
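A small sketch of the two forms, using a hypothetical structure rather
than the rxkad statistics or uuid types. Both objects have static storage
duration, so both start out zeroed; the second merely spells out one
member.

```c
#include <stddef.h>

/* Hypothetical structure, standing in for the real ones. */
struct stats {
    long packets;
    long bytes;
    void *peer;
};

/* No initializer: zero/NULL-initialized because of static storage. */
static struct stats implicit_zero;

/* The old hack: one explicit initializer, the rest implicitly zeroed. */
static struct stats partial_init = { 0 };

int stats_is_zero(const struct stats *s)
{
    return s->packets == 0 && s->bytes == 0 && s->peer == NULL;
}

const struct stats *get_implicit(void) { return &implicit_zero; }
const struct stats *get_partial(void)  { return &partial_init; }
```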
Ben Kaduk [Mon, 21 Jul 2014 21:50:50 +0000 (17:50 -0400)]
FBSD: avoid unused-variable warning
This variable is passed as an argument to the ma_vn_lock() compat
macro, which ignores the thread argument on some versions of FreeBSD.
Make the variable only be declared in those cases when it will be used.
Ben Kaduk [Mon, 21 Jul 2014 18:13:39 +0000 (14:13 -0400)]
FBSD: initialize 'retval' for afs3_syscall
In the same way as linux_ret.
An ugly hack, but retval is not really used for anything relevant at
the moment, and the compiler will warn about it being used uninitialized
otherwise.
Ben Kaduk [Mon, 21 Jul 2014 15:01:04 +0000 (11:01 -0400)]
Avoid a name conflict in a local variable
Modern compilers will warn when a variable in a nested scope hides
a variable of the same name in an outer scope. One of the arguments
to afs_lhash_remove() is already named 'data'; don't reuse that name
for a local variable.
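As an illustration of the warning (with a hypothetical function, not
afs_lhash_remove() itself): declaring "int data" inside the loop below
would hide the 'data' parameter and draw a -Wshadow warning from gcc or
clang. Using a distinct name avoids it.

```c
/* Sums twice each element. The loop-local variable is deliberately not
 * named 'data', so it does not hide the parameter of that name. */
int sum_doubled(const int *data, int n)
{
    int total = 0;
    int i;

    for (i = 0; i < n; i++) {
        int item = data[i] * 2;   /* distinct name: no shadowing */
        total += item;
    }
    return total;
}
```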
Benjamin Kaduk [Thu, 24 Jul 2014 13:40:21 +0000 (09:40 -0400)]
Make kernel hcrypto calloc return zeroed memory
As far as I can tell, the afs_osi_Alloc contract does not
guarantee zeroed memory. On FreeBSD, with a debug kernel, it
definitely does not currently provide zeroed memory, returning
instead memory initialized with 0xdeadc0de.
Properly speaking, the role of calloc() is to both check for overflow
from the multiplication and to produce zeroed memory. However, since
we do not have a reasonable way to report failure, do not bother
checking for overflow at this time.
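A userspace sketch of the idea, assuming a hypothetical allocator that
returns poisoned memory the way a debug kernel might. Unlike the change
described above, this sketch also includes the overflow check, since
userspace can report failure by returning NULL.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical allocator that does not guarantee zeroed memory: it
 * fills the block with a poison byte, as a debug allocator might. */
static void *poisoned_alloc(size_t size)
{
    void *p = malloc(size);

    if (p != NULL)
        memset(p, 0xde, size);
    return p;
}

/* calloc()-style wrapper: rejects nmemb * size overflow, then zeroes
 * the block explicitly instead of trusting the allocator. */
void *zeroing_calloc(size_t nmemb, size_t size)
{
    void *p;

    if (size != 0 && nmemb > SIZE_MAX / size)
        return NULL;
    p = poisoned_alloc(nmemb * size);
    if (p != NULL)
        memset(p, 0, nmemb * size);
    return p;
}
```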
Garrett Wollman [Wed, 13 Aug 2014 06:32:06 +0000 (02:32 -0400)]
viced: time_t might not be long
Fix a couple of printf format errors that bite on FreeBSD 10 for i386.
Since time_t might be an int, it can't be printed with a long format.
Since time_t might be a long in general, cast it to long when
printing.
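The cast can be sketched like this; format_time() is a hypothetical
helper, not a function from viced.

```c
#include <stdio.h>
#include <time.h>

/* Format a time_t portably: time_t may be int or long depending on the
 * platform, so cast to long to match the "%ld" conversion. */
int format_time(char *buf, size_t len, time_t t)
{
    return snprintf(buf, len, "%ld", (long)t);
}
```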
Garrett Wollman [Wed, 13 Aug 2014 06:20:02 +0000 (02:20 -0400)]
afsd: correct printf format mismatch in debugging printf
On platforms where size_t is unsigned int, the type of
cacheFiles * sizeof(AFSD_INO_T) is not an unsigned long as the format
string requires. Casting cacheFiles to unsigned long ensures that the
result is at least unsigned long, although it will still be wrong if
any architecture makes size_t be long long. Fixes build for FreeBSD
10 on i386.
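A sketch of the same cast in isolation. format_cache_bytes() and its
entrySize parameter are hypothetical; entrySize stands in for
sizeof(AFSD_INO_T).

```c
#include <stdio.h>

/* Casting the count to unsigned long makes the product at least
 * unsigned long, matching "%lu" wherever size_t is no wider than
 * unsigned long. */
int format_cache_bytes(char *buf, size_t len, int cacheFiles, size_t entrySize)
{
    return snprintf(buf, len, "%lu", (unsigned long)cacheFiles * entrySize);
}
```

Where C99 is available, printing a size_t with "%zu" would avoid the
remaining mismatch on platforms whose size_t is wider than unsigned long.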
Mark Vitale [Fri, 6 Jun 2014 23:27:04 +0000 (19:27 -0400)]
opr: opr_AssertionFailed undefined in kernel module
The opr_Assert in opr_rbtree_remove is incompletely defined;
the opr_Assert macro is defined in opr.h, but the definition
for the opr_AssertionFailed routine it invokes is not included.
This allows the kernel module to build successfully even though
it retains a hidden undefined reference for opr_AssertionFailed.
However, the logic in opr_rbtree_remove ensures that this
particular opr_Assert can never fail - it is superfluous.
Some compilers (e.g. gcc for Linux AFS kernel module
builds) are able to recognize this and optimize it out. Others
(e.g. Solaris 5.12) do not, and when this happens the OpenAFS
build appears to succeed but the kernel module will fail to load
due to the undefined symbol.
Andrew Deason [Thu, 17 Jul 2014 15:33:23 +0000 (10:33 -0500)]
LINUX: Avoid premature RO volume lock error
Commit 0fc27471e7da0c5de4addcdec1bfbca5208072cc avoids processing lock
requests for RO volumes, but it did this both in afs_lockctl() and in
the Linux-specific afs_linux_lock(). The changes in afs_linux_lock()
are incorrect, since they also avoid F_GETLK requests (whereas
afs_lockctl() just avoids F_SETLK* requests).
Additionally, the section in afs_linux_lock() incorrectly reports an
error, since it returns a positive EBADF error code, when we are
supposed to return -EBADF.
The result of all of this is that an F_GETLK F_WRLCK request for an RO
volume always fails with fcntl() returning 9 (EBADF), which is an
invalid return code for fcntl() F_GETLK (instead we should return -1
with an errno of 9). But if there are no locks, we should return
success anyway.
Just remove this section, since afs_lockctl() handles this case itself
anyway.
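The sign convention at issue can be modelled as follows. This is a
simplified sketch of how a kernel return value reaches userspace, not
the actual syscall path; to_user_result() is a hypothetical name, and
the 4095 bound mirrors the Linux errno range.

```c
#include <errno.h>

/* Simplified model: a negative value in the errno range becomes -1 with
 * errno set; anything else is handed to the caller unchanged. */
int to_user_result(long kret, int *user_errno)
{
    if (kret < 0 && kret >= -4095) {
        *user_errno = (int)-kret;
        return -1;
    }
    return (int)kret;
}
```

Under this model a handler returning -EBADF yields -1 with errno 9,
while one returning positive EBADF yields a bare 9 with errno untouched,
which matches the symptom described above.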
Thanks to Todd Lewis for reporting this issue.
Change-Id: Ia7f3f0b1bdbb922dca06be9de02a9c2b33f9ffee
Reviewed-on: http://gerrit.openafs.org/11316 Reviewed-by: Marc Dionne <marc.c.dionne@gmail.com> Reviewed-by: D Brashear <shadow@your-file-system.com> Reviewed-by: Chas Williams - CONTRACTOR <chas@cmf.nrl.navy.mil> Tested-by: BuildBot <buildbot@rampaginggeek.com>
Michael Meffie [Fri, 8 Nov 2013 21:22:48 +0000 (16:22 -0500)]
tools: fix unpack in example sysvmq audit reader
Fix the unpack in the example sysvmq audit reader script to
correctly unpack the message type, which is a native long.
From the msgrcv perl documentation:
Note that when a message is received, the message type as a native
long integer will be the first thing in VAR, followed by the actual
message. This packing may be opened with "unpack("l! a*")".
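For comparison, the layout that "l! a*" decodes can be read in C roughly
as below. The helper names are hypothetical; the only assumption taken
from the text is that the buffer begins with a native long followed by
the message bytes.

```c
#include <string.h>

/* Read the leading native long (the message type) from a raw buffer.
 * memcpy avoids any alignment assumptions about the buffer. */
long msg_type(const void *buf)
{
    long type;

    memcpy(&type, buf, sizeof type);
    return type;
}

/* The message text starts immediately after the native long. */
const char *msg_text(const void *buf)
{
    return (const char *)buf + sizeof(long);
}
```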
Change-Id: I5c5480c30d530b384d8057fb071b01e67f1b4ad2
Reviewed-on: http://gerrit.openafs.org/10445 Reviewed-by: D Brashear <shadow@your-file-system.com> Tested-by: D Brashear <shadow@your-file-system.com>
Stephan Wiesand [Wed, 23 Jul 2014 11:57:50 +0000 (13:57 +0200)]
volinfo: fix documenting comments
As pointed out by Andrew Deason during review of the 1.6 pullup of
commit ae27283550dab33704f30e18975722e0ed2c5424, psize is not
a parameter of HandleHeaderFiles, and in function HandleSpecialFile
it is of type inout since the value is first read by the += operation.
Fix this, and try to improve the description of psize too.
Change-Id: Ia728b20475f0c44b6104dc954aaa04d5f0f098b5
Reviewed-on: http://gerrit.openafs.org/11319 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Chas Williams - CONTRACTOR <chas@cmf.nrl.navy.mil> Reviewed-by: D Brashear <shadow@your-file-system.com>
Andrew Deason [Thu, 24 Jul 2014 16:07:45 +0000 (11:07 -0500)]
LINUX: Check afs_lookup return code explicitly
Checking if the returned vcache is NULL or not is a bit of an indirect
way to check if an error occurred. Just check the return code itself,
to make sure we notice if any kind of error is reported.
Suggested by Chas Williams.
Change-Id: I61cc7304e9885ddaaebe96db3b12457cb6224420
Reviewed-on: http://gerrit.openafs.org/11321 Reviewed-by: Chas Williams - CONTRACTOR <chas@cmf.nrl.navy.mil> Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: D Brashear <shadow@your-file-system.com>
Ben Kaduk [Fri, 18 Jul 2014 19:19:24 +0000 (15:19 -0400)]
FBSD: adhere to gop_lookupname() semantics
The current semantics are that gop_lookupname() returns an unlocked
vnode; the previous code was written to a different semantic that
a locked vnode should be returned.
This makes a disk cache more likely to work on FreeBSD, but such
configurations remain poorly tested.
Stephan Wiesand [Thu, 31 Jul 2014 18:50:04 +0000 (20:50 +0200)]
libafs: remove stray "-v 2" argument to afs_compile_et
Commit 4e6b7ab904d38d38da1b80a7342bd815668a8c09 separated the
compile_et rules for creating the source and header files using
the new -emit functionality. During review for inclusion in 1.6,
Chas Williams spotted a stray "-v 2" carried over to the rule
for creating the header file, where it doesn't apply. Remove it.
Change-Id: I554354eae0fa018e56fe7b78df69a43e5b5a0b07
Reviewed-on: http://gerrit.openafs.org/11347 Reviewed-by: Chas Williams - CONTRACTOR <chas@cmf.nrl.navy.mil> Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Reviewed-by: D Brashear <shadow@your-file-system.com>
Michael Meffie [Mon, 28 Jul 2014 21:27:40 +0000 (17:27 -0400)]
libafs: do not allow NULL creds for afs_CreateReq
Do not allow callers to pass a NULL cred to afs_CreateReq. This
avoids setting the uid of zero in the vrequest when no cred is
passed. Update callers to pass afs_osi_credp for an anonymous cred
when no cred is available.
Thanks to Andrew Deason for pointing out afs_osi_credp should be
used.
Change-Id: I05f694026ec72ab701160d9920e47c16cda46cd7
Reviewed-on: http://gerrit.openafs.org/11336 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Reviewed-by: Chas Williams - CONTRACTOR <chas@cmf.nrl.navy.mil> Reviewed-by: D Brashear <shadow@your-file-system.com>
Always define AFS_HAVE_STATVFS. According to the man page, statvfs()
appeared in FreeBSD 5.0. Additionally, this macro is only used for
userspace which eliminates all disables except for FreeBSD 5.0 which
appears to have just been an oversight when the param file was created
from the 4.x param files.
Also fixes the comment so it reflects the actual choice.
Use location number 104, which is the next in the sequence.
The code in this module is compiled when building the
nfs translator, which is only built under linux when
configure detects it is possible.
Thanks to Andrew Deason for spotting this error.
Change-Id: I00c834bc915fa3be7d5f27467895930e4f62aa76
Reviewed-on: http://gerrit.openafs.org/11351 Reviewed-by: Andrew Deason <adeason@sinenomine.net> Tested-by: Andrew Deason <adeason@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Reviewed-by: D Brashear <shadow@your-file-system.com>
Andrew Deason [Wed, 23 Jul 2014 16:54:47 +0000 (11:54 -0500)]
LINUX: Drop dentry if lookup returns new file
Background: when an entry is looked up after its parent changes,
afs_linux_dentry_revalidate re-looks-up the entry name in its parent.
If we get an ENOENT back, we d_drop the dentry, and in any other
situation we just d_invalidate it. As discussed in prior commits
997f7fce437787a45ae0584beaae43affbd37cce and
389473032cf0b200c2c39fd5ace108bdc05c9d97, we cannot simply d_drop the
dentry in all cases, because that would cause legitimate directories
to be reported as "deleted" if we just failed to lookup the entry due
to e.g. transient network errors (this causes, among other things,
'getcwd' to fail with ENOENT).
However, this logic has problems if the dentry name still exists, but
points to a different file; the case where 'tvc != vcp' in
afs_linux_dentry_revalidate. If that case happens, and the dentry is
still held open by some process, we will continue to try to reference
the vcache pointed to by the 'old' dcache entry, which is incorrect.
To maybe more clearly illustrate the issue, consider the following
cases:
$ sleep 9999 < /afs/localcell/testvol.ro/dir1/file1 &
$ rm -rf /afs/localcell/testvol.rw/dir1
$ mkdir /afs/localcell/testvol.rw/dir1
$ vos release testvol
$ ls -l /afs/localcell/testvol.ro
ls: cannot access /afs/localcell/testvol.ro/dir1: No such file or directory
total 0
d????????? ? ? ? ? ? dir1
Here, on the last 'ls', afs_linux_dentry_revalidate will afs_lookup
'dir1', and notice that it points to a different file (tvc != vcp),
and will d_invalidate the dentry. But since the file is still held
open, the dentry doesn't go away, and so we are still pointing to the
vcache for the old, deleted 'dir1'. That file doesn't exist anymore on
the fileserver, so we get an ENOENT when actually trying to stat() it
(we get a VNOVNODE from the fileserver, which gets translated to an
ENOENT).
A possibly more serious case is when the file is just renamed:
In this situation, the same code path applies, but the old file still
exists, so we will continue to use it without error. But since we are
still pointing at the old file, of course the results are incorrect.
Once we kill the process holding the file open, the bad dentry finally
goes away and the results are valid again.
To fix this behavior, d_drop the dentry in all cases, except when we
encounter an error preventing the lookup from being done. This ensures
that the dentry is unhashed from the parent directory in the scenarios
above, and so cannot be used for a subsequent lookup.
With this change, the only afs_lookup response that causes a simple
d_invalidate is when we encounter actual errors during the lookup
(such as transient network failures). This is correct, since in those
cases we don't _know_ that the dentry is wrong. For all other cases,
we do know that the dentry is wrong and so we must force it to be
unhashed.
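The resulting decision can be summarized in a small sketch. This is a
model of the rule as described, not the code in
afs_linux_dentry_revalidate; the enum and function are hypothetical, and
tvc/vcp are treated as opaque pointers.

```c
#include <errno.h>

/* Hypothetical labels for the three possible outcomes. */
enum dentry_action { DENTRY_KEEP, DENTRY_INVALIDATE, DENTRY_DROP };

/* lookup_code is the result of re-looking-up the name in its parent;
 * tvc is the vcache found by that lookup, vcp the one the dentry holds. */
enum dentry_action revalidate_action(int lookup_code,
                                     const void *tvc, const void *vcp)
{
    if (lookup_code == ENOENT)
        return DENTRY_DROP;         /* the name is gone */
    if (lookup_code != 0)
        return DENTRY_INVALIDATE;   /* lookup could not be done: unknown */
    if (tvc != vcp)
        return DENTRY_DROP;         /* name now points at another file */
    return DENTRY_KEEP;             /* same file: dentry is still valid */
}
```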
Change-Id: I11a2db1e05d68a755a77815ec5e8d01ac7b36129
Reviewed-on: http://gerrit.openafs.org/11320 Reviewed-by: Chas Williams - CONTRACTOR <chas@cmf.nrl.navy.mil> Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Marc Dionne <marc.c.dionne@gmail.com> Reviewed-by: Andrew Deason <adeason@sinenomine.net> Reviewed-by: D Brashear <shadow@your-file-system.com>
Andrew Deason [Wed, 30 Jul 2014 16:12:39 +0000 (11:12 -0500)]
ptserver: Fix RemoveFromSGEntry hentry memcpy
In this function, hentry is the "previous" continuation entry that we
looked at, and centry is the "current" continuation entry. We keep
track of the previous continuation entry in case we need to update its
'next' pointer, which we do if we free one of the continuation entries
because it is empty after the removal.
So, this memcpy is supposed to copy the current entry to the previous
one, but the arguments are flipped, so we just copy zeroes to centry
(since hentry is initialized to zeroes early on in the function), and
hentry never gets set to anything besides zeroes.
The effect of this is that whenever a ptdb entry has more than one
continuation entry, and we free up any of them after the first one via
RemoveFromSGEntry, the previous continuation entry becomes blanked
(though the 'next' pointer should still be correct). This means the
membership information for that group is not recorded correctly, as it
loses a chunk of the IDs that it is a member of. The reverse mapping
should still be intact (the parent groups have a pointer to the
sub-group), but the group probably doesn't function correctly.
The reason this happened is because of the confusing conversion from
bcopy to memcpy. Most of the instances of bcopy/bcmp/bzero/etc were
converted (correctly) back in commit c5c521af, but the supergroups
implementation was added afterwards, in 8ab7a909, and contained a
bcopy reference. This bcopy was converted to memcpy in 58d5f38b, but
the argument order was not corrected, causing this bug.
To fix this, just flip the first two arguments of the memcpy. Just get
rid of the casts here, too, to match the code in the non-supergroups
RemoveFromEntry and elsewhere.
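The argument-order trap is easy to show in isolation: bcopy(src, dst, n)
takes the source first, while memcpy(dst, src, n) takes the destination
first. The structure layout and function below are hypothetical, not the
ptserver definitions.

```c
#include <string.h>

/* Hypothetical continuation entry, for illustration only. */
struct cont_entry {
    int ids[4];
    int next;
};

/* Save the current entry as the "previous" one. With memcpy the
 * destination (hentry) comes first; a bcopy() call converted without
 * swapping its arguments would copy in the opposite direction. */
void remember_previous(struct cont_entry *hentry,
                       const struct cont_entry *centry)
{
    memcpy(hentry, centry, sizeof(*hentry));
}
```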
Mark Vitale [Mon, 7 Apr 2014 22:56:26 +0000 (18:56 -0400)]
afs: maintain afs_users buckets in sorted order
Modify afs_GetUser() to insert a new unixuser into an afs_users
hash bucket in sorted order, by uid/PAG. This is in support of
other small optimizations in future commits.
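A minimal sketch of sorted insertion into a singly linked bucket,
assuming a hypothetical node type keyed on uid alone (the real code
sorts by uid/PAG). A sorted bucket lets a search stop as soon as it
passes the key it is looking for.

```c
#include <stddef.h>

/* Hypothetical bucket node; the real unixuser has many more fields. */
struct user {
    int uid;
    struct user *next;
};

/* Insert 'nu' so that the bucket stays in ascending uid order. */
void bucket_insert_sorted(struct user **head, struct user *nu)
{
    struct user **pp = head;

    while (*pp != NULL && (*pp)->uid < nu->uid)
        pp = &(*pp)->next;
    nu->next = *pp;
    *pp = nu;
}

/* Search that gives up once the remaining entries must be larger. */
struct user *bucket_find(struct user *head, int uid)
{
    for (; head != NULL && head->uid <= uid; head = head->next) {
        if (head->uid == uid)
            return head;
    }
    return NULL;
}
```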
Mark Vitale [Thu, 3 Apr 2014 20:37:51 +0000 (16:37 -0400)]
afs: only reset access caches for the matching cell
When an AFS user's tokens change (unlog, aklog) or expire,
afs_ResetAccessCache() is called to reset all the access caches
for that uid/PAG.
However, a user/PAG may have tokens for multiple cells, and they
may expire or be set/reset at different times. Therefore, it is
incorrect to assume that all access caches for a uid/PAG should
be discarded when only one cell's tokens have changed.
Modify afs_ResetAccessCache() to accept a new argument 'cell',
and only reset the access caches for a uid/PAG if the vcache
resides in the specified cell. If the caller really wants to
reset all a user's access caches, specify cell=-1.
For cache managers that are running with multiple PAGs and multiple
cells, this should improve performance because 1) it avoids
scanning access cache chains for vcaches that are not part of the
current cell and 2) it avoids deleting access caches that may still
be good, thus preventing unnecessary FetchStatus calls.
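The cell filter might be sketched as below. This is a simplified model
that ignores the uid/PAG match and represents each vcache's access cache
as a bare count; the struct and function are hypothetical. Only the
"cell == -1 means all cells" convention comes from the description.

```c
/* Hypothetical model of a vcache: its cell and how many access-cache
 * entries it currently holds. */
struct vcache_model {
    int cell;
    int axs_count;
};

/* Reset access caches only for vcaches in 'cell', or for every vcache
 * when cell is -1. Returns the number of vcaches that were reset. */
int reset_access_cache(struct vcache_model *vcs, int n, int cell)
{
    int i;
    int reset = 0;

    for (i = 0; i < n; i++) {
        if (cell != -1 && vcs[i].cell != cell)
            continue;           /* other cell: leave its cache alone */
        vcs[i].axs_count = 0;
        reset++;
    }
    return reset;
}
```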
Michael Meffie [Tue, 3 Jun 2014 03:24:45 +0000 (23:24 -0400)]
linux: dont ignore kmod build errors
Errors from the linux kmod build are not propagated, since make is
run as the first command in a pipeline, and the shell returns the
exit code of the last command in the pipeline. Run the make command
in a subshell to detect errors, and exit afterwards. (This method
is more portable than bash specific pipeline processing options.)
Thanks to Mark Vitale for pointing out this build system defect
to me.
Change-Id: If3e204fe31dbdc9e7416d52fae897f792d27d678
Reviewed-on: http://gerrit.openafs.org/11186 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Mark Vitale <mvitale@sinenomine.net> Tested-by: Mark Vitale <mvitale@sinenomine.net> Reviewed-by: D Brashear <shadow@your-file-system.com>
- That commit makes the RPC fail in situations where it did not
before. But even if we cannot calculate the checksum, we can still
return other information about the key, so this is undesirable.
- It masks the previous 'code' value, returned from stat(). The
return code of stat() is now effectively ignored, except for the
purposes of setting st_mtime, whereas previously a failure caused
the RPC to fail. This is a behavior change.
So, effectively revert c04de52da4e89e15b211b4a19a3d9bc4d612b209.
Explicitly cast the return value of ka_KeyCheckSum to void, to make it
clear that we are intentionally ignoring the return value, so
hopefully this will not be flagged as a warning by code analysis tools
such as coverity.
Change-Id: Iac745d7c88ed7c2d97660e6949caa63580eef6e2
Reviewed-on: http://gerrit.openafs.org/11194 Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Perry Ruiter <pruiter@sinenomine.net> Reviewed-by: D Brashear <shadow@your-file-system.com>
Benjamin Kaduk [Thu, 5 Jun 2014 00:41:57 +0000 (20:41 -0400)]
rx: Do not try to cancel nonexistent events
Unconditionally cancelling the resend event and releasing the
reference it was supposed to have on the call, can cause the
call reference count to go negative.
In particular, the call chain when a new rx_call structure is
allocated would cause its reference count to become negative.
Behave similarly to all the other rxevent_Cancel calls touched
by 20034a815750beff262d49b37fba225c72dd0ab1, and only cancel the
event and drop a reference when the event is present on the call.
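The guard can be sketched with hypothetical types (these are not the rx
structures): the reference is dropped only when there is an event that
actually holds one.

```c
#include <stddef.h>

/* Hypothetical event and call types, for illustration only. */
struct event { int pending; };
struct call {
    int refcount;
    struct event *resendEvent;
};

/* Cancel the resend event and drop its reference, but only if the
 * event exists; a freshly allocated call has none to cancel. */
void cancel_resend(struct call *c)
{
    if (c->resendEvent != NULL) {
        c->resendEvent->pending = 0;
        c->resendEvent = NULL;
        c->refcount--;
    }
}
```

Without the NULL check, calling this on a new call with refcount 0 would
drive the count to -1, which is the failure described above.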
Stephan Wiesand [Mon, 2 Jun 2014 14:15:15 +0000 (16:15 +0200)]
fstrace: Don't read uninitialised data on other platforms either
Commit 908105fe8d51551e45692de4e145022138a0356c fixed an off-by-one
error potentially causing a buffer overread in CheckTypes, but only
in the IRIX/AIX version of the function. Apply the same fix to the
code for the other platforms.
Perry Ruiter [Tue, 27 May 2014 08:26:59 +0000 (01:26 -0700)]
config: Move AFS_LRALLOCSIZ to afs_args.h
AFS_LRALLOCSIZ is currently defined in afs/afs.h. Other memory
related definitions such as AFS_SMALLOCSIZ and AFS_MDALLOCSIZ
are defined in config/afs_args.h. Move AFS_LRALLOCSIZ to
config/afs_args.h for consistency.
Change-Id: Ie1e286c24be6a2def404a54355a2fa4b2c42330d
Reviewed-on: http://gerrit.openafs.org/11174 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Chas Williams - CONTRACTOR <chas@cmf.nrl.navy.mil> Reviewed-by: D Brashear <shadow@your-file-system.com>
Marc Dionne [Tue, 29 Apr 2014 16:48:03 +0000 (12:48 -0400)]
libafs: Speed up afs_CheckTokenCache
On systems with a large number of PAGs and files in use, the
periodic daemon job that checks for expired credentials and
cleans up the axs cache can run for a very long time. This
can lead to kernel soft lockups and eventually hang processes
and file access because of unavailable locks.
Rework the scanning logic in afs_CheckTokenCache to make the
scanning more efficient in most real world cases. On a test
system accessing ~4000 files from processes in 1000 PAGs, this
has been observed to reduce the runtime of afs_CheckTokenCache
from a problematic ~70s down to about 0.7s.
Additionally, this changes the conditions in which an axscache is
discarded. uid+cell (rather than just uid) must now match, and
if no matching unixuser is found, it will also be discarded.
Adapted from code from Jeffrey Altman who provided the original
loop algorithm and code.
Change-Id: I65b275b4244b3b6ab65453623bb8729530a9e1a6
Reviewed-on: http://gerrit.openafs.org/11123 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Chas Williams - CONTRACTOR <chas@cmf.nrl.navy.mil> Reviewed-by: D Brashear <shadow@your-file-system.com>
Ben Kaduk [Fri, 4 Jan 2013 21:16:04 +0000 (16:16 -0500)]
Dummy Makefile for rxgk
Include a libtool export symbol list for the shared library, which
only has the client RPC calls and the NewFooSecurityObject primitives
for now, since that's all that's stubbed out.
Also connect the rxgk directory up to be buildable from the root, but
nothing depends on it yet so it will not be built.
Looking ahead, build a libafsrpc_rxgk.la object.
Change-Id: I12ddefbdaa1ad4845649e3a32efdeaaa21b5e9b7
Reviewed-on: http://gerrit.openafs.org/10563
Reviewed-by: Chas Williams - CONTRACTOR <chas@cmf.nrl.navy.mil>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: D Brashear <shadow@your-file-system.com>
Ben Kaduk [Fri, 6 Dec 2013 20:24:58 +0000 (15:24 -0500)]
Add rxgk boilerplate
Just the skeleton of what needs to be there. The actual import is split
over multiple commits, to make the reviewer's burden more manageable.
Error table, protocol description, and stubs for the security object
routines, with header to declare them.
The public header rxgk.h currently only contains a few typedefs and the
NewSecurityObject prototypes, and includes the RPC interface and com_err
code headers.
Change-Id: I7893f78119bb4aef12112cc1e51e1ec69de326c2
Reviewed-on: http://gerrit.openafs.org/10562
Reviewed-by: Chas Williams - CONTRACTOR <chas@cmf.nrl.navy.mil>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: D Brashear <shadow@your-file-system.com>
Ben Kaduk [Fri, 6 Dec 2013 19:56:25 +0000 (14:56 -0500)]
Add some configure bits for GSS-API
rxgk will require gss_pseudo_random and might want a couple other
krb5-specific bits. We'll also need substvars to tell whether or
not we can try building these things.
Change-Id: Id18eb3f554605875696095eb40c25ec54df1f74b
Reviewed-on: http://gerrit.openafs.org/10561
Reviewed-by: Chas Williams - CONTRACTOR <chas@cmf.nrl.navy.mil>
Reviewed-by: D Brashear <shadow@your-file-system.com>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Marc Dionne [Mon, 5 May 2014 17:33:10 +0000 (13:33 -0400)]
Linux: Drop PageReclaim AOP_WRITEPAGE_ACTIVATE case
The exit case here seems to have been added to avoid recursion into
the writeback code and eventual deadlock (see RT #15239). One issue
is that the PageReclaim check can trigger in code paths that don't
deal with the AOP_WRITEPAGE_ACTIVATE code correctly, leading to EIO
errors when multiple threads are doing large mmap writes and memory
pressure is sufficient to trigger reclaim.
The check could be improved to check wbc.for_reclaim which seems to
indicate more reliably when it is safe to return ACTIVATE, but given
that the CPageWrite flag already provides more targeted recursion
prevention, it seems safer to just drop this special case.
Note that many kernel filesystems used to have a similar check mainly
to prevent excessive stack usage, but as more recent kernels have
moved away from doing any writeback during direct reclaim this is a
case that should no longer occur. Partly as a result of this there
are very few users of AOP_WRITEPAGE_ACTIVATE left in the kernel,
which may be a motivation to find a better mechanism for OpenAFS
eventually.
This has been shown to help avoid EIO errors with multiple processes
doing intensive mmap writing.
Thanks to Yadav Yadavendra for identifying the issue and providing
extensive analysis and testing.
Michael Meffie [Tue, 18 Feb 2014 18:59:59 +0000 (13:59 -0500)]
volser: log message for cross-device link errors
Add a log entry to the volume server to help diagnose those pesky
'Invalid cross-device link' errors returned by vos, which occur when
a clone volume is located in a different partition than the parent
read-write volume, or when a read-only volume is on the incorrect
partition on the server.
With this change, a new log entry is added when the volume server
fails to create a clone or a read-write volume because a volume with
the target volume id already exists on a different partition. For a
clone volume, this would be a different partition than the
read-write volume. For a read-only volume, this would be a different
partition than indicated in the vldb.
Examples:
Volume foobar is on /vicepb, but foobar.backup is incorrectly on
partition /vicepa.
$ vos backup foobar
Failed to clone the volume 536870934
: Invalid cross-device link
VolserLog:
VCreateVolume: volume 536870936 for parent 536870934 found on /vicepa; unable to create volume on /vicepb.
1 Volser: Clone: Couldn't create new volume 536870936 for parent 536870934; clone aborted
...
The vldb indicates a read-only volume should be on /vicepa on a
remote site, but the actual volume is on /vicepb.
$ vos release xyzzy
Failed to create the ro volume: : Input/output error
The volume 536870921 could not be released to the following 1 sites:
mantis /vicepa
VOLSER: release could not be completed
...
VolserLog on mantis:
VCreateVolume: volume 536870922 for parent 536870921 found on /vicepb; unable to create volume on /vicepa.
...
Change-Id: Iaa471c46059d598a5095d59580e3b0b8ac6e1992
Reviewed-on: http://gerrit.openafs.org/10849
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Perry Ruiter <pruiter@sinenomine.net>
Reviewed-by: D Brashear <shadow@your-file-system.com>
Marc Dionne [Wed, 28 May 2014 13:53:58 +0000 (09:53 -0400)]
vol: Fix gcc 4.9 warnings
gcc 4.9 complains here because the trailing 0 in these macros
has no effect, the value having already been set to NULL.
Just remove the offending 0s; nothing uses the return value
anyway, even if there were platforms where 0 != NULL.
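For illustration only, a minimal sketch of the pattern, using a hypothetical macro rather than the actual macros touched in vol. The old form is shown in a comment; when its result is discarded, the final ", 0" is an operand with no effect, which a compiler may flag.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Old form (hypothetical), kept as a comment only:
 *     #define SKETCH_RELEASE(p) (free(p), (p) = NULL, 0)
 *
 * New form: the same side effects without the trailing constant.
 */
#define SKETCH_RELEASE(p) (free(p), (p) = NULL)

/* Returns 1 if the pointer was reset to NULL, -1 if malloc failed. */
int
sketch_release_demo(void)
{
    char *buf = malloc(16);

    if (buf == NULL)
        return -1;

    SKETCH_RELEASE(buf);
    return buf == NULL;
}
```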
Change-Id: Ic9a79d51419726c0c823a9228c21c13dea918dc8
Reviewed-on: http://gerrit.openafs.org/11176
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Reviewed-by: D Brashear <shadow@your-file-system.com>
Stephan Wiesand [Fri, 30 May 2014 13:05:28 +0000 (15:05 +0200)]
libadmin: Remove redundant memset call
Commit bf78bf2c115659b78c34d3bc9d1934bcff21c8cc added initialisation
of the nbulkentries structure to 0, to avoid freeing garbage due to a
goto fail_... before the structure is initialised. As pointed out by
Andrew Deason, there is already an equivalent memset call later in the
code, which is now redundant. Remove the later call.
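The pattern can be sketched as follows. This is a hypothetical example, not the libadmin code: the structure and function names are invented, and it only shows why one early memset is enough when every failure path jumps to a shared cleanup label.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a structure that owns heap memory. */
struct sketch_entries {
    unsigned int len;
    int *val;
};

/*
 * Returns 0 on success, -1 on failure.  A nonzero early_error forces
 * a jump to the cleanup label before anything has been allocated.
 */
int
sketch_fill_entries(int early_error, unsigned int n)
{
    struct sketch_entries e;
    int rc = -1;

    /* Zero once, before any jump to fail, so cleanup never sees garbage. */
    memset(&e, 0, sizeof(e));

    if (early_error)
        goto fail;

    /* A second memset(&e, 0, sizeof(e)) here would be redundant. */
    e.val = calloc(n, sizeof(*e.val));
    if (e.val == NULL)
        goto fail;
    e.len = n;
    assert(e.len == n);
    rc = 0;

fail:
    free(e.val);                /* free(NULL) is a no-op */
    return rc;
}
```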
Perry Ruiter [Fri, 30 May 2014 21:28:53 +0000 (14:28 -0700)]
audit: Delete va_copy kludge
When I developed fix c3d4c109305b2db8a63b754c1894ad37326dc340 I used
va_copy. I was nervous because it required C99, but I had no
problem with any of the buildbots, nor did any reviewer comment.
audit/audit.c contains a local hack to simulate va_copy in the
pre-C99 days. There are no uses of va_copy in audit.c, but
presumably there were at some point. Delete the local va_copy.
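As a reminder of what C99 va_copy is for, here is a minimal sketch. It is not taken from audit.c or from the fix mentioned above; the function names are hypothetical. The argument list is walked twice, once to measure and once to format, so the first pass works on its own copy.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: format into a freshly allocated string. */
char *
sketch_vformat_alloc(const char *fmt, va_list ap)
{
    va_list ap2;
    char *out;
    int need;

    assert(fmt != NULL);

    va_copy(ap2, ap);                       /* independent copy (C99) */
    need = vsnprintf(NULL, 0, fmt, ap2);    /* first pass: measure */
    va_end(ap2);
    if (need < 0)
        return NULL;

    out = malloc((size_t)need + 1);
    if (out == NULL)
        return NULL;

    vsnprintf(out, (size_t)need + 1, fmt, ap);  /* second pass: format */
    return out;
}

/* Variadic front end; the caller frees the returned string. */
char *
sketch_format_alloc(const char *fmt, ...)
{
    va_list ap;
    char *out;

    va_start(ap, fmt);
    out = sketch_vformat_alloc(fmt, ap);
    va_end(ap);
    return out;
}
```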
Change-Id: I5e30c7e3052aeffe56e366888c5a3db3a705fd00
Reviewed-on: http://gerrit.openafs.org/11184
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: D Brashear <shadow@your-file-system.com>
Perry Ruiter [Tue, 27 May 2014 08:16:26 +0000 (01:16 -0700)]
Delete several unused memory management constants
Change 412854593cf368006c18e6c0dc607a9ecd76a0e0 removed the last
uses of the following constants from the code base:
AFS_SALLOC_LOW_WATER (defined in afs/afs.h)
AFS_MALLOC_LOW_WATER (defined in config/afs_args.h)
AFS_MDALLOCSIZ (defined in config/afs_args.h)
This patch deletes these constants.
Change-Id: I1333aed508875e9b13dc3f36f3ff0d5eadfb2cfd
Reviewed-on: http://gerrit.openafs.org/11173
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Chas Williams - CONTRACTOR <chas@cmf.nrl.navy.mil>
Reviewed-by: Michael Meffie <mmeffie@sinenomine.net>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Reviewed-by: Jeffrey Altman <jaltman@your-file-system.com>
Ben Kaduk [Wed, 8 May 2013 16:51:31 +0000 (12:51 -0400)]
Suppress nonliteral format string warning/error
Clang doesn't like a nonliteral format string, and some kernel
builds (e.g., FreeBSD) are done with -Werror. Use the standard
workaround for FreeBSD and UKERNEL builds by calling vsnprintf()
into a fixed buffer.
Remove the !defined(AFS_LINUX26_ENV) check, as it duplicates a
conditional around the entirety of osi_Panic().
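A minimal userspace sketch of the workaround, assuming a hypothetical printf-like sink in place of the kernel's panic routine (none of these names come from the OpenAFS tree). The caller's format is expanded once by vsnprintf() into a fixed buffer, and the result is then passed on as data under a literal "%s".

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical sink standing in for a printf-like panic or log routine. */
static char sketch_sink[256];

void
sketch_sink_printf(const char *fmt, ...)
{
    va_list ap;

    assert(fmt != NULL);

    va_start(ap, fmt);
    vsnprintf(sketch_sink, sizeof(sketch_sink), fmt, ap);
    va_end(ap);
}

/* Hypothetical wrapper showing the fixed-buffer workaround. */
void
sketch_report(const char *msg, ...)
{
    char buf[256];
    va_list ap;

    if (msg == NULL)
        msg = "unknown failure";

    va_start(ap, msg);
    vsnprintf(buf, sizeof(buf), msg, ap);   /* truncates, NUL-terminates */
    va_end(ap);

    sketch_sink_printf("%s", buf);  /* literal format; buf is only data */
}

/* Accessor so callers can inspect what the sink received. */
const char *
sketch_last_message(void)
{
    return sketch_sink;
}
```

Because the buffer is passed as an argument rather than as the format, any '%' left in the expanded message is not interpreted a second time.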
Michael Meffie [Sun, 19 Jan 2014 22:04:08 +0000 (17:04 -0500)]
libafs: separate source and header compile_et rules
Use the new compile_et -emit flag to generate source and header
files separately to support parallel make.
Export afs_trace.h since it is required to build libafs. Before the
compile_et -emit flag was available, the afs_trace.h file was
generated as a side-effect of creating afszcm.cat.
Change-Id: I4e93691dda34ddc8600d6a818503e0c9e75e618a
Reviewed-on: http://gerrit.openafs.org/10729
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Reviewed-by: Chas Williams - CONTRACTOR <chas@cmf.nrl.navy.mil>
Reviewed-by: D Brashear <shadow@your-file-system.com>
Michael Meffie [Sun, 30 Mar 2014 09:53:16 +0000 (11:53 +0200)]
doxygen: make dox
Add an optional make target (make dox) and doxygen configuration to
generate doxygen output files. Auto-detect when the doxygen and
graphviz dot tools are available. When dot is present, configure
doxygen to create dependency graphs.
Since the graph generation can take a very long time, a new
configure option has been added to override the dot tool
auto-detection. To disable the graph generation (even if dot is
installed), run configure with the option: --without-dot
When graph generation is desired, but graphviz dot is not present in
the PATH, specify the path to dot with the configure option
--with-dot=<path-to-dot>.
The configure summary has been updated to show when doxygen
documentation and graph generation are configured.
Thank you Jason Edgecombe for providing the doxygen configuration
for OpenAFS.
Benjamin Kaduk [Fri, 25 Apr 2014 19:23:16 +0000 (15:23 -0400)]
rxgen: use unsigned type for max array length
Plain '0' is of type int, i.e., signed, and therefore so is '~0'.
The length of an XDR array is unsigned, so this constant should
be of an unsigned type.
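The type difference can be seen in a minimal sketch. The macro and function names are hypothetical and are not what rxgen emits.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical names, for illustration only. */
#define SKETCH_MAX_SIGNED   (~0)    /* type int; -1 on two's-complement machines */
#define SKETCH_MAX_UNSIGNED (~0u)   /* type unsigned int; equals UINT_MAX */

/* Returns 1 when the signed constant is negative. */
int
sketch_signed_max_is_negative(void)
{
    return SKETCH_MAX_SIGNED < 0;
}

/* Returns the unsigned constant, suitable as an upper bound on a length. */
unsigned int
sketch_unsigned_max(void)
{
    return SKETCH_MAX_UNSIGNED;
}

/* Returns 1 if an unsigned array length is within the given limit. */
int
sketch_length_allowed(unsigned int len, unsigned int limit)
{
    return len <= limit;
}
```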
Change-Id: I13f5f94b2f54bc0adcdf2ded1696b797b5205057
Reviewed-on: http://gerrit.openafs.org/11107
Reviewed-by: Chas Williams - CONTRACTOR <chas@cmf.nrl.navy.mil>
Reviewed-by: Perry Ruiter <pruiter@sinenomine.net>
Reviewed-by: Jeffrey Altman <jaltman@your-file-system.com>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: D Brashear <shadow@your-file-system.com>
Benjamin Kaduk [Fri, 25 Apr 2014 19:24:22 +0000 (15:24 -0400)]
Some rx type cleanup for signedness
The epoch, Cid, and security header/trailer sizes are all fundamentally
unsigned quantities. Change the types exposed in some API signatures
to match this reality, and also change the global variables for the
epoch and Cid to match. (Per-connection variables were already of
an unsigned type.)
Change-Id: I4a56736ef7d78028d1d0b980cda0b4c37d694388
Reviewed-on: http://gerrit.openafs.org/11106
Reviewed-by: Perry Ruiter <pruiter@sinenomine.net>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: D Brashear <shadow@your-file-system.com>