Andrew Deason [Tue, 26 Apr 2011 19:32:25 +0000 (14:32 -0500)]
Fix --without-krb5
Currently, specifying --without-krb5 causes the AM_CONDITIONAL
KRB5_USES_COM_ERR to not be defined, which makes configure refuse to
run successfully. Fix this by forcing KRB5_USES_COM_ERR to always be
false if we are running explicitly without krb5.
Fixes breakage on FreeBSD caused by a missing malloc.h, reported by
GAWollman. Since roken.h already includes stdlib.h to pull in malloc,
the malloc.h include is no longer necessary.
Marc Dionne [Sat, 23 Apr 2011 02:23:21 +0000 (22:23 -0400)]
ubik: add uvote_HaveSyncAndVersion
Add a new function uvote_HaveSyncAndVersion() that combines the
logic from uvote_GetSyncSite and uvote_eq_dbVersion, without
releasing the vote lock in between. Make use of it in
urecovery_AllBetter.
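
The point of combining the two checks is that nothing can change between them. A minimal sketch of that pattern, assuming a single mutex stands in for the vote lock; vote_lock, sync_host, set_sync and have_sync_and_version are hypothetical names for illustration, not the ubik code:

```c
#include <pthread.h>

/* Hypothetical stand-ins for ubik's vote state; not the real structures. */
static pthread_mutex_t vote_lock = PTHREAD_MUTEX_INITIALIZER;
static int sync_host = 0;        /* 0 means no sync site is known */
static int sync_epoch = 0;
static int sync_counter = 0;

/* Record the sync site and its database version under the vote lock. */
void set_sync(int host, int epoch, int counter)
{
    pthread_mutex_lock(&vote_lock);
    sync_host = host;
    sync_epoch = epoch;
    sync_counter = counter;
    pthread_mutex_unlock(&vote_lock);
}

/* Return 1 only if a sync site is known AND its version matches.
 * Both checks run under one hold of the lock, so the state cannot
 * change between them. */
int have_sync_and_version(int epoch, int counter)
{
    int ok;

    pthread_mutex_lock(&vote_lock);
    ok = sync_host != 0 && sync_epoch == epoch && sync_counter == counter;
    pthread_mutex_unlock(&vote_lock);
    return ok;
}
```

Calling two separate helpers that each take and drop the lock would leave a window in which another thread could change the sync site or the version.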
Marc Dionne [Sat, 23 Apr 2011 01:24:34 +0000 (21:24 -0400)]
ubik: Defer updateUbikNetworkAddress until after RX startup
The beacon package initialization has been moved to precede starting
RX services, but the broadcast of addresses to other servers should
be deferred until after RX is started.
Make updateUbikNetworkAddress an exported function and call it
from the general initialization sequence.
Marc Dionne [Sat, 29 Jan 2011 19:37:23 +0000 (14:37 -0500)]
ubik: locking in recovery.c
Locking changes in recovery.c:
- In urecovery_Initialize, hold the DB lock over ReplayLog
and InitializeDB
- Hold the DB lock over larger portions of urecovery_interact.
Some values which should be protected were examined and modified
without holding any locks.
- In the early part of urecovery_interact, only take the DB lock
when it's really needed, now that some values are protected by other
locks.
- DoProbe is now called without the DB lock, so it doesn't need to
drop and re-acquire it.
Marc Dionne [Sat, 16 Apr 2011 18:19:57 +0000 (14:19 -0400)]
ubik: always hold DB lock for urecovery_ResetState()
urecovery_ResetState requires callers to hold the DB lock, since it modifies
urecovery_state. All callers of ubeacon_AmSyncSite outside of the beacon
package hold the DB lock, but calls from the beacon thread do not, and
can't block on getting the DB lock if we're sync site.
Add a beacon internal version of ubeacon_AmSyncSite that skips the
call to ResetState, and have the callers take the DB lock and call
ResetState themselves if needed. They can take the lock in this case
because we know we're not the sync site. Refactor the exported
ubeacon_AmSyncSite in terms of this new function.
Marc Dionne [Sat, 16 Apr 2011 16:56:05 +0000 (12:56 -0400)]
ubik: set UBIK_RECLABELDB before propagating version
Quoting Jeffrey Hutzelman:
In udisk_commit(), when committing the first write transaction
after becoming sync site, the database is relabelled. In this
case, the UBIK_RECLABELDB recovery state bit should be set before
propagating the label change to other servers, rather than after.
This is because ContactQuorum_DISK_Setversion() will
release the database lock, at which point the recovery state may
be cleared by urecovery_ResetState() running in another thread.
It is important that a relabelling which occurs before recovery
state is cleared not result in the UBIK_RECLABELDB recovery state
bit being set after; otherwise, the server may fail to correctly
relabel the database the next time it becomes sync site.
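
The ordering argument can be sketched in a few lines. This is an illustration only: RECLABELDB, recovery_state, reset_state and propagate_label are hypothetical stand-ins, and the simulate_reset argument models another thread clearing the recovery state while the database lock is dropped.

```c
#define RECLABELDB 0x1u   /* hypothetical stand-in for UBIK_RECLABELDB */

static unsigned recovery_state;

/* Stand-in for urecovery_ResetState(): clears all recovery bits. */
static void reset_state(void)
{
    recovery_state = 0;
}

/* Stand-in for the propagation step, which drops the database lock.
 * simulate_reset models a reset arriving during that window. */
static void propagate_label(int simulate_reset)
{
    if (simulate_reset)
        reset_state();
}

/* Set the bit BEFORE propagating, so a reset that happens during
 * propagation is not overridden afterwards. */
unsigned relabel_and_propagate(int simulate_reset)
{
    recovery_state |= RECLABELDB;
    propagate_label(simulate_reset);
    return recovery_state;
}
```

With the bit set first, a reset that lands during propagation leaves the bit clear, so the next time the server becomes sync site it relabels again. Setting the bit after propagation would leave it set across the reset.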
Marc Dionne [Sat, 16 Apr 2011 15:52:57 +0000 (11:52 -0400)]
ubik: remote: fix DB lock usage
Many of the RPC functions in the remote package have a similar
prologue that makes use of ubik_currentTrans before taking the
DB lock. Take the lock earlier, and rely on the ubik_dbase global
instead of the dbase pointer in ubik_currentTrans.
In GetVersion, take the lock earlier to cover the call to
ubeacon_AmSyncSite.
Ben Kaduk [Sun, 19 Dec 2010 04:52:43 +0000 (23:52 -0500)]
Rename libcom_err to libafscom_err
We no longer provide a compatible libcom_err, and in fact
we renamed the symbols in our libcom_err several years ago
to reflect this fact.
When we build on a system where KRB5_LIBS includes
-lkrb5 -lcom_err , the new Unix build system will pick up
our libcom_err (as $(AFS_LDFLAGS) is the first argument in
AFS_LDRULE and pulls in a linker search path for our libcom_err)
which does not provide all the needed symbols for libkrb5.
Fully rename our libcom_err away to avoid these conflicts.
Marc Dionne [Fri, 22 Apr 2011 19:23:27 +0000 (15:23 -0400)]
Linux: cleanup aio support
Code that called directly into the aio operations (ex: readv/writev)
would bypass the AFS specific operations found in afs_linux_read
and afs_linux_write.
Rework the handlers:
- For newer kernels with aio, let the kernel use its default read
and write operations, and define the aio_read and aio_write operations,
with the AFS specific bits, calling into generic_file_aio_read/write.
The kernel's default read/write operations are just wrappers around the
aio versions.
- For older kernels, leave things as is, pointing read and write to
afs_linux_read/write
Simon Wilkinson [Tue, 19 Apr 2011 10:47:08 +0000 (11:47 +0100)]
cmd: Split up dispatch function
Split up the command line parsing behaviour out of the cmd_Dispatch
function, and into a function of its own - cmd_Parse. This lets servers
which only have a single "syntax" installed just parse, without needing
to go through a dispatch function, and all of the control flow
complexity that requires.
Simon Wilkinson [Mon, 18 Apr 2011 07:25:55 +0000 (08:25 +0100)]
cmd: Add function to disable positional commands
Add a new cmd_DisablePositionalCommands function which can be used
to completely disable positional commands, for functions which have
no desire to make use of them.
Simon Wilkinson [Sat, 23 Apr 2011 15:42:54 +0000 (11:42 -0400)]
cmd: Add some tests to the test suite
Add some tests for the command library to the integrated test
suite in tests. These are far from complete, and are mainly there
to ensure that we don't break any of this functionality when modifying
the library.
Andrew Deason [Mon, 25 Apr 2011 18:58:34 +0000 (13:58 -0500)]
pam: Fix password torching const-ness
In some code branches, the PAM code "torches" a password by zeroing
it. However, it does this through a const pointer which we otherwise
know is not actually const. Make sure we get better type checking by
doing this through a non-const pointer.
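
A minimal sketch of zeroing through a non-const pointer; torch_password is a hypothetical helper name, and the volatile qualifier is an assumption added here to discourage the compiler from dropping the stores, not something the commit describes:

```c
#include <stddef.h>
#include <string.h>

/* Overwrite a NUL-terminated password with zeros. Taking a non-const
 * pointer means the compiler rejects calls on data that is declared
 * const, instead of silently writing through a cast. */
void torch_password(char *password)
{
    volatile char *p = password;
    size_t len;

    if (password == NULL)
        return;
    len = strlen(password);
    while (len-- > 0)
        *p++ = '\0';
}
```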
Andrew Deason [Mon, 25 Apr 2011 18:53:52 +0000 (13:53 -0500)]
pam: Password is const in setcred
afs_setcred.c gets the "password" pointer from pam_get_data, which
always gives a const pointer (unlike pam_get_item used in afs_auth.c
&c, which sometimes gives a const or not-const pointer, depending on
the PAM implementation).
So, declare password const, to get better type checking.
If the Kerberos v5 library cannot be loaded (pkrb5_init_context
equal to NULL) return a reasonable error code instead of
returning success and doing nothing.
Windows: NPLogonNotify provide password in all cases
When calling KFW_AFS_get_cred() from NPLogonNotify()
always provide the user password. Do not count on a
credential cache existing from a previous call.
Marc Dionne [Sat, 16 Apr 2011 15:22:54 +0000 (11:22 -0400)]
pam: Clear up PAM_CONST related warnings on Linux
Commit 78d1f8d8 expanded the use of PAM_CONST and introduced many
new warnings on Linux where pam expects "const" arguments.
This clears up the warnings by doing the following:
- Cast "user" to char * when calling ka* functions
- Change the signature of pam_afs_prompt and pam_afs_printf to use
PAM_CONST
- Use a separate non-const password pointer for pam_afs_prompt
Simon Wilkinson [Thu, 21 Apr 2011 15:07:05 +0000 (16:07 +0100)]
Linux: Restrict # of cbrs we allocate at once
With commit a309e274632993c5aeec04c6e090f5ac95837a40, we changed the
number of CBRs that we allocate in a chunk from 300 to 1024. However,
this change takes the amount of memory required to allocate a chunk
of CBRs above PAGE_SIZE on Linux. This changes the allocator that we
use from kmalloc to vmalloc. Whilst we can, and do, prevent kmalloc
from flushing filesystem pages when we invoke it, we don't have a
similar level of control over vmalloc.
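
One way to keep a chunk within a single page is to derive the record count from the page size. The helper below is only a sketch of that arithmetic; records_per_chunk is a hypothetical name, and the sizes passed to it are not the real CBR or page sizes:

```c
#include <stddef.h>

/* Largest number of fixed-size records that fit in one page.
 * Returns at least 1 for a non-zero record size, and 0 if the
 * record size is 0. */
size_t records_per_chunk(size_t page_size, size_t record_size)
{
    size_t n;

    if (record_size == 0)
        return 0;
    n = page_size / record_size;
    return n > 0 ? n : 1;
}
```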
In one reported case, clients deadlock whilst attempting to allocate
this memory, in a call stack that looks something like:
Andrew Deason [Sat, 23 Apr 2011 21:52:30 +0000 (16:52 -0500)]
viced: Release all hosts in h_Enumerate*
h_Enumerate and h_Enumerate_r were not releasing all of the holds they
obtained when the callback function caused the enumeration to bail
early. Correct them so all host holds are released.
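
A simplified sketch of the rule that every hold must be released, including on the early-exit path. struct host, enumerate_hosts and stop_at_id are hypothetical; the real h_Enumerate keeps its own host list and hold accounting:

```c
#include <stddef.h>

/* Hypothetical host record with a hold (reference) count. */
struct host {
    int holds;
    int id;
};

/* Callback type: a nonzero return value stops the enumeration. */
typedef int (*host_cb)(struct host *h, void *arg);

/* Visit each host, holding it only for the duration of the callback. */
int enumerate_hosts(struct host *hosts, size_t n, host_cb cb, void *arg)
{
    size_t i;

    for (i = 0; i < n; i++) {
        int stop;

        hosts[i].holds++;              /* take a hold */
        stop = cb(&hosts[i], arg);
        hosts[i].holds--;              /* release on every path */
        if (stop)
            return stop;               /* bail early, nothing left held */
    }
    return 0;
}

/* Example callback: stop at the host whose id matches *arg. */
int stop_at_id(struct host *h, void *arg)
{
    return h->id == *(int *)arg;
}
```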
Andrew Deason [Sat, 23 Apr 2011 21:44:41 +0000 (16:44 -0500)]
viced: Print a warning when using a deleted client
We should never get a deleted client back from GetClient. Log a
message if we do, to explain why access may suddenly appear to fail,
and assist in determining why.
Note that we still try to service the request, since the accessing
user may still have enough access to do whatever was requested.
Andrew Deason [Sat, 23 Apr 2011 21:25:00 +0000 (16:25 -0500)]
viced: Fix host enumeration flags
Do not give uninitialized flags values to h_Enumerate callback
functions. In fact, do not give a flags value to h_Enumerate or
h_Enumerate_r callback functions at all, since they are not actually
used.
Fix host enumeration callback functions to just return 0 or the
relevant flags, instead of basing the return value off of the given
flags value. Update MultiBreakVolumeCallBack_r to use the correct
return values, since it currently tries to use the old meanings of the
host enumeration return values.
Simon Wilkinson [Mon, 25 Apr 2011 12:56:38 +0000 (13:56 +0100)]
Windows: Remove duplicate file
The 'Streamfiles.txt' file had been committed with both that name,
and an all lower case name. This makes git very sad on systems with
case insensitive filenames.
Andrew Deason [Thu, 21 Apr 2011 22:10:13 +0000 (17:10 -0500)]
libafs: Initialize _settok_tokenCell primary flag
Always set the *primary flag to something in _settok_tokenCell.
Otherwise, the flag may be unset, as it is not required to be
initialized by all callers.
In cm_ReadMountPoint and cm_HandleLink the variable 'thyper'
represents the 'offset' at which cm_GetData should fetch data.
Rename 'thyper' to 'offset' and fix a coding error caused by
misinterpreting the variable purpose.
Andrew Deason [Thu, 21 Apr 2011 19:24:45 +0000 (14:24 -0500)]
aklog: Return token when performing 524 conversion
We weren't actually returning a token and username from
rxkad_get_converted_token. Do so.
This is a 1.6-specific change. This issue was fixed on master when
aklog was changed to use the new SetTokenEx family of pioctls in
commit 53837416cbed3ba4d11f63015e1f13800519f2ed.
use the relative path for afsio.c
use objdir path for generated files
Change-Id: I3b16108eacd949bcb1ddc2224961e87bce9999bb
Reviewed-on: http://gerrit.openafs.org/4508
Reviewed-by: Jonathan A. Kollasch <jakllsch@kollasch.net>
Tested-by: Jonathan A. Kollasch <jakllsch@kollasch.net>
Reviewed-by: Derrick Brashear <shadow@dementia.org>
Simon Wilkinson [Sat, 16 Apr 2011 13:50:11 +0000 (14:50 +0100)]
FreeBSD: Don't ignore Makefile
The file src/packaging/FreeBSD/Makefile is part of the repository,
and so shouldn't be excluded by .gitignore (the exclusion is inherited
from the top level). So, restore it with .gitignore in this directory.
cm_GetData() drops the cm_scache_t rw lock which permits other
threads to access the data while it is in an inconsistent state.
Avoid the race by using a stack allocated temporary buffer to
receive the data from cm_GetData(). Only copy the data into
the mountPointStringp buffer under the rwlock.
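
The approach can be sketched as follows. This is not the Windows cache manager code: struct scache, fetch_data and read_mount_point are hypothetical, and a plain mutex stands in for the cm_scache_t rw lock.

```c
#include <pthread.h>
#include <stddef.h>
#include <string.h>

#define MP_LEN 64   /* hypothetical buffer size */

struct scache {
    pthread_mutex_t lock;        /* stands in for the rw lock */
    char mount_point[MP_LEN];    /* shared, read by other threads */
};

/* Stand-in for a fetch that has to run without the lock held. */
static void fetch_data(char *out, size_t len)
{
    strncpy(out, "#root.cell.", len - 1);   /* pads the rest with NULs */
    out[len - 1] = '\0';
}

void read_mount_point(struct scache *sc)
{
    char tmp[MP_LEN];                 /* private to this thread */

    fetch_data(tmp, sizeof(tmp));     /* slow part, lock not held */

    pthread_mutex_lock(&sc->lock);
    memcpy(sc->mount_point, tmp, sizeof(tmp));   /* publish under the lock */
    pthread_mutex_unlock(&sc->lock);
}
```

Other threads that take the lock before reading then see either the old string or the complete new one, never a partially written buffer.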
Stephan Wiesand [Sun, 17 Apr 2011 22:37:36 +0000 (23:37 +0100)]
make afsdump_scan get ACLs right
This makes afsdump_scan get the ACLs right on little endian systems.
It also corrects and slightly beautifies some output (indentation,
cut&paste error for negative ACL label).
Andrew Deason [Thu, 14 Apr 2011 19:36:36 +0000 (14:36 -0500)]
auth: Set correct flags in token_extractRxkad
The flags that token_extractRxkad returns are flags that are passed to
ktc_SetToken, not the flags that are passed directly to the PSetTokens
pioctl. So, we should be setting AFS_SETTOK_SETPAG, which is
interpreted by ktc_SetToken.
Simon Wilkinson [Wed, 13 Apr 2011 14:21:46 +0000 (15:21 +0100)]
libafs: Remove afs_write duplication
The afs_write() code for memory and disk cache suffered from exactly
the same duplication problems as the afs_read() code.
Apply a similar fix - unify afs_UFSWrite and afs_MemWrite into a single
afs_write function, place the UFS specific code into afs_UFSWriteUIO,
and make use of the existing afs_MemWriteUIO for the memcache case.
afsio is a utility for file transfer to and from AFS file space
without the help of the AFS client/cache manager. Using libafscp,
this (partially rewritten) version of afsio is able to accomplish
(1) authenticated access to an AFS path or FID (an existing
KerberosV ticket is required), (2) fall back on unauthenticated
("anonymous") access if authentication (token acquisition) fails,
and (3) work independently of the AFS cache manager (afsd need not
be running, though CellServDB and ThisCell are currently required).
issues:
1) libvldbint and libafsint are not compiled pthreaded. we link in
what we need. this should be changed when we are all-pthreaded.
2) venus is not a pthreaded-directory otherwise. same deal:
in an all-pthreaded universe, undo the bodge that we do here.
3) venus is not an all-krb5 directory either. slight ick.
Andrew Deason [Thu, 14 Apr 2011 19:11:22 +0000 (14:11 -0500)]
RX: Remove allocation counters
Remove the osi_alloccnt and osi_allocsize counters, and the associated
osi_alloc_mutex. These counters are pretty useless since nothing looks
at them, and their use of a mutex requires Rx to be initialized before
XDR can be used. Removing them lifts this restriction.
Andrew Deason [Wed, 13 Apr 2011 17:39:19 +0000 (12:39 -0500)]
Suppress cmp component version error messages
When we use cmp to determine whether to replace
AFS_component_version_number.c, suppress stderr in addition to stdout,
to slightly reduce output during the build.
Andrew Deason [Fri, 15 Apr 2011 16:18:37 +0000 (11:18 -0500)]
AIX51: Fix PAGs
On AIX 5.1 and later, we set a process' PAG by using the AIX PAG
mechanism (and not by group ids), but we were determining what PAG a
process was in by the group list. Instead use the PAG identifier.
This effectively reverts 277c37f48c8126ba9cb986ffc7361fcb98e2bbf2, but
it puts the kcred_getpag call in a different place that makes more
sense in the current PAG code organization.
Andrew Deason [Wed, 13 Apr 2011 15:52:50 +0000 (10:52 -0500)]
pam: Use PAM_CONST more often
Some callers of pam_get_item et al were just casting their argument to
a const void **. Some PAM implementations (Linux) want a const void**,
but others (Solaris) do not. Use the PAM_CONST symbol already defined
by autoconf to declare or cast the relevant variable const or not as
appropriate.
Andrew Deason [Wed, 13 Apr 2011 16:10:52 +0000 (11:10 -0500)]
pam: Check for null upwd from getpwnam_r
The POSIX getpwnam_r can yield a NULL struct passwd pointer even when
the returned error code is 0 (in particular, when the requested entry
is not found). Just add a check for a null upwd to make sure we don't
dereference a NULL pointer.
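
A minimal sketch of the check, using the five-argument POSIX getpwnam_r; lookup_uid and the fixed 4096-byte scratch buffer are assumptions made for the example:

```c
#include <pwd.h>
#include <stddef.h>
#include <sys/types.h>

/* Look up a user's uid. Returns 0 on success, -1 if the lookup failed
 * or the user does not exist. */
int lookup_uid(const char *name, uid_t *uid_out)
{
    struct passwd pwd;
    struct passwd *result = NULL;
    char buf[4096];   /* scratch space for the string fields */
    int rc;

    rc = getpwnam_r(name, &pwd, buf, sizeof(buf), &result);
    if (rc != 0)
        return -1;    /* the lookup itself failed */
    if (result == NULL)
        return -1;    /* no error, but no matching entry either */

    *uid_out = result->pw_uid;
    return 0;
}
```

Checking only the return code would let the not-found case fall through to the dereference.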
Andrew Deason [Wed, 13 Apr 2011 16:08:09 +0000 (11:08 -0500)]
pam: Use POSIX getpwnam_r on Solaris
_POSIX_PTHREAD_SEMANTICS is now always defined for Solaris, which
means we get a POSIX-conforming getpwnam_r, which takes 5 arguments.
So, add Solaris to the list of platforms that use a POSIX getpwnam_r.
Andrew Deason [Wed, 13 Apr 2011 17:15:12 +0000 (12:15 -0500)]
vfsck: Fix roken fallout
Including roken.h in vfsck sources pulls in some more modern headers
that vfsck code isn't used to. Accommodate:
- Prevent roken.h from pulling in dirent.h so we don't conflict with
the old-style directory defines for HP-UX. Also move the inclusion
of the old-style directory defines to before roken.h, so we have
the directory types defined in roken.h.
- Remove some prototypes so they don't conflict with the prototypes in
system headers.
- Remove a couple of bizarre vprintf invocations, as they conflict
with the actual vprintf definitions.
Change-Id: Ifd7cd2544e75ed49b93ab491c4acadcb18528315
Reviewed-on: http://gerrit.openafs.org/4472
Reviewed-by: Simon Wilkinson <sxw@inf.ed.ac.uk>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Tested-by: Andrew Deason <adeason@sinenomine.net>
Reviewed-by: Derrick Brashear <shadow@dementia.org>
Andrew Deason [Wed, 13 Apr 2011 15:34:37 +0000 (10:34 -0500)]
Fix some configure header prereqs
On at least Solaris, the configure tests for netinet/if_ether.h and
security/pam_modules.h issued warnings because they existed but were
not compilable. Perform the tests with the prerequisite headers of
net/if.h and security/pam_appl.h, respectively, so autoconf will stop
yelling at us.
Simon Wilkinson [Tue, 12 Apr 2011 18:49:38 +0000 (19:49 +0100)]
libafs: Remove unnecessary parameters to afs_read
We were providing additional buffer and length parameters to
afs_read which are now unused, as the necessary information is
contained within the iovec. Just remove these parameters to tidy the
code up a bit.
Simon Wilkinson [Tue, 12 Apr 2011 18:41:30 +0000 (19:41 +0100)]
libafs: Remove afs_read duplication
The disk cache and memcache afs_read functions are effectively
duplicates of each other. Abstract out the common code into a generic
afs_read() function, and put the cache type specific code into
UFSReadUIO (there is already a MemReadUIO which contains the code
necessary for the memcache).
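
The shape of this refactoring, with common logic in one function and the cache-specific step behind a hook, might look roughly like the sketch below. The names read_uio_fn, mem_read_uio and generic_read are hypothetical, and the real afs_read works on uio structures rather than flat buffers:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical per-cache-type hook: copy len bytes from src to dst
 * and return the number of bytes copied. */
typedef size_t (*read_uio_fn)(const char *src, char *dst, size_t len);

/* Memory cache flavour: a plain copy. */
size_t mem_read_uio(const char *src, char *dst, size_t len)
{
    memcpy(dst, src, len);
    return len;
}

/* Common code shared by all cache types: clamp the request to the
 * file length, then hand the actual copy to the cache-specific hook. */
size_t generic_read(const char *data, size_t file_len, size_t offset,
                    char *dst, size_t want, read_uio_fn read_uio)
{
    if (offset >= file_len)
        return 0;
    if (want > file_len - offset)
        want = file_len - offset;
    return read_uio(data + offset, dst, want);
}
```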
Andrew Deason [Fri, 8 Apr 2011 18:00:15 +0000 (13:00 -0500)]
DAFS: Request salvage on detach for volser
When the volserver notices that a volume needs salvaging, mark
V_needsSalvaged. So when we VDetachVolume the volume, we can then just
request the salvage in the volume package.
Fix the VolClone salvaging code to do this as well, instead of using
the vol-private VRequestSalvage_r interface.
Andrew Deason [Thu, 7 Apr 2011 17:36:19 +0000 (12:36 -0500)]
volser: Avoid assert on ViceCreateRoot failure
If IH_CREATE fails in ViceCreateRoot, it may just be due to an on-disk
inconsistency. So, don't assert, but just return an error and detach
the volume.
Andrew Deason [Thu, 7 Apr 2011 18:51:14 +0000 (13:51 -0500)]
DAFS: Do not give back vol to viced after salvage
If we VRequestSalvage_r a volume successfully, and we are not the
fileserver, we will tell the fileserver to salvage a volume. So, we do
not need to give back the volume afterwards, since telling the
fileserver that a volume needs a salvage effectively gives it back (so
the salvager can take it).
So, clear needsPutBack so we don't try to also give back the volume,
and avoid the fileserver yelling at us for trying to give back a
volume that is checked out by someone else (or is not checked out at
all).
Andrew Deason [Mon, 2 Aug 2010 18:23:34 +0000 (13:23 -0500)]
XDR: decouple from system XDR implementation
Since commit 7293ddf325b149cae60d3abe7199d08f196bd2b9 we have stopped
trying to use the system-provided XDR implementation, but the xdr_ops
structure was still structured to accommodate the old limitations
of the system XDR. Change xdr_ops so it is just always one consistent
structure.
This removes:
- The AFS_XDR_64BITOPS define and all related code, since we never
call the 64-bit versions of getint and putint ourselves
- The rearrangement of getint32/putint32 depending if we are in
Solaris kernel-land or not
Simon Wilkinson [Wed, 23 Mar 2011 16:31:42 +0000 (16:31 +0000)]
ptserver: Add cmdline options for config and log
Make it possible to set the location of the ptserver's configuration
directory, and the file that it logs to, from the command line. This
makes it possible to bring up a ptserver without requiring an
installation on the system for testing purposes.
Andrew Deason [Wed, 6 Apr 2011 21:56:22 +0000 (16:56 -0500)]
afsd: Trim trailing slashes on Linux mntent
When we write a mount entry on Linux when mounting /afs, trim trailing
slashes on the mount path. Otherwise, the umount utility can get
slightly confused, and leave the /afs mount entry in /etc/mtab after
it's been unmounted.
For full correctness we should probably completely canonicalize the
path like the mount utility does, but it's unlikely that anyone will
provide significantly weird paths for cacheMountDir, so don't bother.
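
Trimming trailing slashes can be sketched as below; trim_trailing_slashes is a hypothetical helper, and keeping a lone "/" intact is a choice made for this example:

```c
#include <stddef.h>
#include <string.h>

/* Remove trailing '/' characters in place, but never shorten the
 * path below one character, so "/" stays "/". */
void trim_trailing_slashes(char *path)
{
    size_t len = strlen(path);

    while (len > 1 && path[len - 1] == '/')
        path[--len] = '\0';
}
```

For example, "/afs/" and "/afs///" both become "/afs". This does not canonicalize the path in any other way.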
Simon Wilkinson [Thu, 24 Mar 2011 12:28:10 +0000 (12:28 +0000)]
vlserver: Add options for config, log and db
Make it possible to set the location of the vlserver's configuration
directory, database file, and the file that it logs to, from the
command line. This makes it possible to bring up a vlserver without
requiring an installation on the system for testing purposes.
Andrew Deason [Thu, 31 Mar 2011 17:51:44 +0000 (12:51 -0500)]
salvager: Do not AskDelete on GetInodeSummary fail
GetInodeSummary can fail due to a number of different reasons, not
just because the VG doesn't exist. If, for example, we just fail to
write the temporary inode file, we will return with an error, but we
should not AskDelete the volume in that instance.
GetInodeSummary already has code to delete the volumes in question
when no inodes are found, so remove the extra AskDelete after
GetInodeSummary returns.
Andrew Deason [Thu, 31 Mar 2011 22:22:12 +0000 (17:22 -0500)]
salvager: Error volumes on GetInodeSummary errors
When GetInodeSummary fails due to an internal failure (not from just
failing to find applicable inodes), currently it just returns an
error, and does not return the checked-out singleVolumeNumber back to
the fileserver.
When we fail to gather inodes, we should force the volume to an error
state, since we haven't salvaged the volume. But if we fail to find
any applicable inodes, we just want to VOL_DONE the volume, since the
header has possibly been destroyed, and the volume doesn't exist.
So, issue an FSYNC_VOL_FORCE_ERROR command when we encounter errors in
GetInodeSummary, except when we fail to find applicable inodes.
Marc Dionne [Wed, 6 Apr 2011 01:30:20 +0000 (21:30 -0400)]
ubik: don't rely on timeout value after select()
The value of timeout after a select() call should be considered
undefined; relying on its value is not portable.
Since IOMGR_Select doesn't modify the timeout it is given, the
intention of the code seems to be to wait for gradually increasing
timeout values, starting at 50ms. At least under Linux, the
timeout gets set to 0 by select() if it waited for the full specified
time, resulting in a much shorter maximum possible wait period.
Initialize the timeout value for each loop according to the existing
logic, to get consistent behaviour between the lwp and pthreaded code.
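
A minimal sketch of rebuilding the timeout on every pass rather than reusing whatever select() left in it. backoff_wait and its linear growth are assumptions for illustration; the actual ubik loop and its increments are not reproduced here:

```c
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>

/* Wait `rounds` times, starting at start_ms and growing by step_ms
 * each round. Returns the total number of milliseconds requested. */
long backoff_wait(int rounds, long start_ms, long step_ms)
{
    long total = 0;
    long wait_ms = start_ms;
    int i;

    for (i = 0; i < rounds; i++) {
        struct timeval tv;

        /* Rebuild the timeout every iteration: select() may have
         * modified the previous one (Linux leaves the time remaining). */
        tv.tv_sec = wait_ms / 1000;
        tv.tv_usec = (wait_ms % 1000) * 1000;
        select(0, NULL, NULL, NULL, &tv);

        total += wait_ms;
        wait_ms += step_ms;
    }
    return total;
}
```

The wait length is tracked in wait_ms, which select() never touches, so the behaviour is the same whether or not the platform rewrites the timeval.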
Andrew Deason [Tue, 5 Apr 2011 19:51:26 +0000 (14:51 -0500)]
Correct strftime callers
Some strftime callers were not using the resultant string
appropriately. Correct them to have the same behavior as when we were
using afs_ctime (which included a trailing newline).
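
One way to get a ctime-style string, trailing newline included, out of strftime is sketched below. format_timestamp is a hypothetical helper, and it uses UTC via gmtime_r so the output does not depend on the local time zone, which may differ from what the callers in the tree do:

```c
#include <stddef.h>
#include <time.h>

/* Format t like ctime() does, including the trailing newline.
 * Returns the number of bytes written, or 0 if buf is too small. */
size_t format_timestamp(time_t t, char *buf, size_t size)
{
    struct tm tm;

    if (gmtime_r(&t, &tm) == NULL)
        return 0;
    return strftime(buf, size, "%a %b %e %H:%M:%S %Y\n", &tm);
}
```

strftime returns 0 when the result does not fit, in which case the buffer contents should not be used.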
Marc Dionne [Fri, 4 Feb 2011 01:51:06 +0000 (20:51 -0500)]
ubik: Introduce version lock
The "version" lock is a new lock that protects the database version
information. The goal is to allow the beacon thread to use the
protected values without blocking for an extended period of time,
which could occur if it was using the database lock.
Reading requires holding either lock, while writing requires holding
both locks.
The following values are protected:
ubik_epochTime
db->version
db->flags
db->tidCounter
db->writeTidCounter
Based on analysis and design work from Jeffrey Hutzelman.
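
The locking rule (read with either lock, write with both) can be sketched with two mutexes. db_lock, version_lock, set_version and get_version_quick are hypothetical names; this is not the ubik implementation:

```c
#include <pthread.h>

static pthread_mutex_t db_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t version_lock = PTHREAD_MUTEX_INITIALIZER;
static int db_version_counter;   /* stands in for the protected values */

/* Writers hold both locks, always taken in the same order. */
void set_version(int v)
{
    pthread_mutex_lock(&db_lock);
    pthread_mutex_lock(&version_lock);
    db_version_counter = v;
    pthread_mutex_unlock(&version_lock);
    pthread_mutex_unlock(&db_lock);
}

/* A reader such as the beacon thread takes only the version lock,
 * which is never held across long database operations. */
int get_version_quick(void)
{
    int v;

    pthread_mutex_lock(&version_lock);
    v = db_version_counter;
    pthread_mutex_unlock(&version_lock);
    return v;
}
```

Because every writer holds both locks, a reader holding either one is guaranteed the value cannot change underneath it.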