Andrew Deason [Fri, 17 Feb 2012 22:24:16 +0000 (16:24 -0600)]
viced: Remove extraneous h_AHTAHT_r in h_GetHost_r
We added this address to the host with an addInterfaceAddr_r call just
a few lines before, which adds the host to the address hash table.
Another call to h_AddHostToAddrHashTable_r is pure overhead and
confusing.
Andrew Deason [Fri, 17 Feb 2012 21:46:50 +0000 (15:46 -0600)]
viced: Set h_GetHost_r probefail if MPAA_r fails
Currently, in h_GetHost_r, if we get a connection whose address does
not match an extant host, but the reported uuid does, we ProbeUuid the
old host. If it fails, we call MultiProbeAlternateAddress_r and set
'probefail'. Later on, if 'probefail' is set, we always add the
connection address to the host, and remove the host->host,host->port
address from the host.
However, this is not always correct. Consider the following situation.
We have an existing host that has primary address 1.1.1.1, and also
has addresses 1.1.1.2 and 1.1.1.3 on the interface list but not on the
hash table. Say that the host stops responding on 1.1.1.1, and a
connection comes in from 1.1.1.2. We ProbeUuid 1.1.1.1 and get a
failure, so we call MultiProbeAlternateAddress_r.
MultiProbeAlternateAddress_r probes via rx_Multi the addresses 1.1.1.2
and 1.1.1.3. Say that 1.1.1.3 responds first, and responds
successfully, so MultiProbeAlternateAddress_r sets 1.1.1.3 to be the
primary address for the host.
After MultiProbeAlternateAddress_r returns, 'probefail' is set. A few
lines down, we see that oldHost->host does not match haddr, and
'probefail' is set, so we add 1.1.1.2 to the interface list, and
remove 1.1.1.3, and set 1.1.1.2 to be the primary address, even though
1.1.1.3 is the address we most recently 'know' is correct.
To fix this, only set 'probefail' if MultiProbeAlternateAddress_r also
fails after the failed ProbeUuid call. Conceptually this makes sense,
since if MultiProbeAlternateAddress_r succeeds, it found an address
that responds successfully to ProbeUuid, and it sets that address to
be the primary address. Therefore, after MultiProbeAlternateAddress_r
returns success, the situation is the same as if the 'good' address
was already the primary address, and the ProbeUuid call succeeded, so
'probefail' should be cleared.
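
The rule above can be sketched as a tiny decision function. This is a
simplified model and not the viced code: demo_probefail and its two boolean
inputs (standing in for the results of ProbeUuid and
MultiProbeAlternateAddress_r) are assumptions made for illustration.

```c
#include <stdbool.h>

/*
 * Hypothetical model of the 'probefail' decision.
 * primary_ok: the probe of the old primary address succeeded.
 * alt_ok:     the alternate-address probe found a responsive address
 *             (and, in the real code, made it the primary address).
 */
static bool demo_probefail(bool primary_ok, bool alt_ok)
{
    if (primary_ok)
        return false;   /* primary address answered; nothing failed */

    /* Primary failed: report failure only if no alternate answered. */
    return !alt_ok;
}
```

With this shape, a successful alternate probe leaves the caller in the same
state as a successful primary probe, so the connection address is not forced
onto the host as the new primary.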
Andrew Deason [Fri, 17 Feb 2012 19:14:31 +0000 (13:14 -0600)]
viced: Correctly update addrs on alt addr probe
The functions MultiBreakCallBackAlternateAddress_r and
MultiProbeAlternateAddress_r try to find a valid address in a host's
interface list of addrs. If they find one, they update host->host and
host->port. However, they do so just by changing those fields directly
and by calling h_DeleteHostFromAddrHashTable_r and
h_AddHostToAddrHashTable_r. This leaves the old host->host, host->port
on the interface list, and leaves it marked as 'valid'. Similarly, the
new host and port may still be marked as not 'valid'.
This can result in the host being on the addr hash table via an
address that is not on the host's interface list. After the above
situation occurs, we may call
and then update host->host and host->port, which happens in a variety
of places. Since host->host, host->port is not marked as valid in the
interface list, it is not removed from the addr hash table, but it is
removed from the interface list. Eventually, this can cause the host
to be referenced from the addr hash table even after it has been
freed.
Since this can result in hash table entries pointing to the 'wrong'
host, this can result in FileLog messages such as:
Sun Feb 5 03:16:35 2012 Removing address that does not belong to host 0xdeadbeefdead (1.2.3.4:7001).
To fix this, make MultiBreakCallBackAlternateAddress_r and
MultiProbeAlternateAddress_r update the address list the same way as
all of the code in host.c does: by adding the new address with
addInterfaceAddr_r, removing the old address with removeInterfaceAddr_r,
and updating host->host and host->port.
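
A minimal sketch of that add/remove/update sequence, using a toy interface
list. All demo_* names, the fixed-size array, and the 'valid' field layout
are assumptions for illustration; the real code also maintains the addr hash
table, which is left out here.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_ADDRS 8

struct demo_addr { uint32_t addr; uint16_t port; int valid; };

struct demo_host {
    uint32_t host;                      /* primary address */
    uint16_t port;                      /* primary port */
    struct demo_addr ifaces[DEMO_MAX_ADDRS];
    size_t n;                           /* entries in use */
};

static struct demo_addr *demo_find(struct demo_host *h, uint32_t a, uint16_t p)
{
    size_t i;
    for (i = 0; i < h->n; i++)
        if (h->ifaces[i].addr == a && h->ifaces[i].port == p)
            return &h->ifaces[i];
    return NULL;
}

/* Add the address, or mark an existing entry valid. */
static int demo_add_addr(struct demo_host *h, uint32_t a, uint16_t p)
{
    struct demo_addr *e = demo_find(h, a, p);
    if (e != NULL) {
        e->valid = 1;
        return 0;
    }
    if (h->n == DEMO_MAX_ADDRS)
        return -1;
    h->ifaces[h->n++] = (struct demo_addr){ a, p, 1 };
    return 0;
}

/* Drop the address from the list, if present. */
static void demo_remove_addr(struct demo_host *h, uint32_t a, uint16_t p)
{
    struct demo_addr *e = demo_find(h, a, p);
    if (e != NULL)
        *e = h->ifaces[--h->n];         /* move last entry into the hole */
}

/* Switch the primary address: add new, remove old, then update fields. */
static int demo_set_primary(struct demo_host *h, uint32_t a, uint16_t p)
{
    uint32_t old_host = h->host;
    uint16_t old_port = h->port;

    if (a == old_host && p == old_port)
        return 0;
    if (demo_add_addr(h, a, p) != 0)
        return -1;
    demo_remove_addr(h, old_host, old_port);
    h->host = a;
    h->port = p;
    return 0;
}
```

The point of the ordering is that the list and the primary fields never
disagree: the new primary is on the list and valid before the old one is
dropped.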
Andrew Deason [Thu, 16 Feb 2012 22:20:16 +0000 (16:20 -0600)]
viced: Delete dup host before probing old host
Currently, when the fileserver gets a new connection from an address
not on the addr hash table, we allocate a new host structure and add
that host to the addr hash table. If we then find that that host's
uuid matches the uuid of an extant host, we do the following:
- probe the old host with the uuid, and MultiProbeAlternateAddress_r
if the probe fails
- mark the duplicate host as HOSTDELETED
- manipulate the interface lists
Consider, for example, that we have an extant host ('oldHost') with
address 1.2.3.4:7001, but with 5.6.7.8:7001 on its alternate interface
list. At some point, the 1.2.3.4:7001 interface goes away or becomes
unreachable. A new connection comes in from that same host on
5.6.7.8:7001.
What will happen is we create a new host for address 5.6.7.8:7001, and
then detect the uuid collision. When we try to probe the old address
of 1.2.3.4:7001, it will fail, and we will try to
MultiProbeAlternateAddress_r. MultiProbeAlternateAddress_r will
determine that the alternate address 5.6.7.8:7001 responds
successfully to the probe, and it tries to set 5.6.7.8:7001 to be the
primary address of 'oldHost', and add 'oldHost' to the addr hash table
under 5.6.7.8:7001.
But the "new" host from the incoming connection is already hashed on
the address hash table under 5.6.7.8:7001, so the
h_AddHostToAddrHashTable_r call in MultiProbeAlternateAddress_r fails.
Since we later delete the new duplicate host, this results in
5.6.7.8:7001 being the primary address for the host, but that address
is not anywhere in the address hash table.
This behavior can be seen by the following pair of FileLog messages:
Wed Feb 1 11:02:38 2012 CB: ProbeUuid for 0xdeadbeefdead (1.2.3.4:7001) failed -01
Wed Feb 1 11:02:38 2012 h_AddHostToAddrHashTable_r: refusing to hash host beefdead, baadcafe (5.6.7.8:7001) already hashed
While those messages do not necessarily indicate this problem, this
problem will result in those messages.
To fix this, mark the duplicate host as HOSTDELETED before we do any
probing on 'oldHost'. This way, if MultiProbeAlternateAddress_r tries
to add 'oldHost' to the addr hash table under 5.6.7.8:7001, it will be
able to do so successfully, since the new duplicate host is already
deleted.
Andrew Deason [Mon, 13 Feb 2012 20:11:36 +0000 (14:11 -0600)]
Rx: Avoid lastBusy/PEER_BUSY discrepancy
If an rx call has the RX_CALL_PEER_BUSY flag set, but the call's
conn->lastBusy is not set, we can easily cause an rx caller to loop
infinitely. rx_NewCall will see that lastBusy for a call channel is
not set, and will use that call channel, but rxi_CheckBusy will note
that the call appears busy and that there are non-busy call channels
on the same conn, and so will return RX_CALL_BUSY.
This can currently happen in rxi_ResetCall, since we set
RX_CALL_PEER_BUSY on the call again if the call had that flag set when
rxi_ResetCall was called. If we are calling rxi_ResetCall with
'newcall' set, the passed in call is unrelated to the new call, since
it was obtained from the free list. Thus, the busy-ness of the call
should be ignored. Fix this by only paying attention to the incoming
RX_CALL_PEER_BUSY flag if 'newcall' is not set.
Also prevent this from happening by clearing RX_CALL_PEER_BUSY in
rx_NewCall when we select a call and clear lastBusy for that call.
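
Both halves of the fix can be modelled in a few lines. This is a sketch with
made-up types: demo_call, demo_conn, DEMO_CALL_PEER_BUSY and the two
functions are stand-ins for illustration, not the rx structures or the
actual rxi_ResetCall/rx_NewCall bodies.

```c
#define DEMO_MAXCALLS 4
#define DEMO_CALL_PEER_BUSY 0x1u

struct demo_call { unsigned int flags; };

struct demo_conn {
    unsigned int lastBusy[DEMO_MAXCALLS];   /* 0 means "not busy" */
    struct demo_call call[DEMO_MAXCALLS];
};

/*
 * Reset a call. The busy flag is carried over only when the call is being
 * reused in place; a call taken from the free list (newcall != 0) says
 * nothing about the channel it is about to be attached to.
 */
static void demo_reset_call(struct demo_call *call, int newcall)
{
    unsigned int was_busy = call->flags & DEMO_CALL_PEER_BUSY;

    call->flags = 0;
    if (!newcall && was_busy)
        call->flags |= DEMO_CALL_PEER_BUSY;
}

/* Claim channel i: clear lastBusy and the per-call flag together. */
static void demo_claim_channel(struct demo_conn *conn, int i)
{
    conn->lastBusy[i] = 0;
    conn->call[i].flags &= ~DEMO_CALL_PEER_BUSY;
}
```

Keeping the two pieces of state in step is what prevents the loop described
above, where one check sees a free channel and the other sees a busy call.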
Derrick Brashear [Tue, 13 Dec 2011 16:24:16 +0000 (11:24 -0500)]
volser: allow clonevol purge id to be new id
effectively the same functionality that reclone already uses, but
for some reason we artificially limit it out of clone despite
the interface being there for it. it used to be there. put it back.
Andrew Deason [Wed, 8 Feb 2012 22:03:29 +0000 (16:03 -0600)]
RedHat: Fail openafs-client 'stop' on rmmod error
Currently, the openafs-client RPM init script ignores any error
reported by rmmod. If 'umount /afs' succeeds but rmmod does not, the
client may panic the machine if the client is started again (from e.g.
running the 'restart' init script method), since afsd will try to
initialize AFS with a libafs that has been shut down.
So, do not ignore errors from 'rmmod', and instead fail the 'stop'
method from the init script if we get an error.
Andrew Deason [Tue, 20 Dec 2011 22:44:42 +0000 (17:44 -0500)]
viced: Keep H_LOCK while locking host in h_Alloc_r
Currently in h_Alloc_r, we h_Lock_r the host, so we have it locked on
return. However, h_Lock_r drops the host glock, which is bad in this
situation since we have already added the host to the global hash
table, so other threads may see it. This can mean that by the time
h_Alloc_r returns, the returned host may have HOSTDELETED set, and/or
the addresses associated with the host may be completely different.
h_Alloc_r's caller, h_GetHost_r, seems to assume that the host is
still associated with the address of the passed-in connection. When
this is not true, the host structure can get into a strange state; for
example, the primary addr/port may not be hashed. The
host may also have HOSTDELETED set, in which case we're not supposed
to be dealing with it at all.
To avoid these problems, lock host->lock directly in h_Alloc_r,
without going through h_Lock_r and dropping H_LOCK. Also do it as one
of the first things we do to initialize the host, just to make sure
that if anybody else happens to see the host, it is locked by us when
they do.
Tom Keiser [Wed, 1 Feb 2012 08:31:23 +0000 (03:31 -0500)]
com_err: correctly deal with lack of libintl
On machines lacking a libintl, _intlize() currently fails to initialize
the output error string--leading to tools (e.g., translate_et) returning
a null string; make afs_com_err fall back to returning the en/US canonical
error text when we don't have any i18n support...
Christof Hanke [Sun, 29 Jan 2012 17:08:57 +0000 (18:08 +0100)]
linux: fix probing for noop_fsync
Commit 267934d0e6910c8d8166a6e78f93c1bab40857b8 introduced
probing code to deal with the renaming of simple_fsync
inside the Linux kernel.
This test does not take into account the different parameter lists
of noop_fsync and simple_fsync.
Fix this.
Reviewed-on: http://gerrit.openafs.org/6628 Reviewed-by: Marc Dionne <marc.c.dionne@gmail.com> Reviewed-by: Derrick Brashear <shadow@dementix.org> Tested-by: Derrick Brashear <shadow@dementix.org>
(cherry picked from commit 20e82cecd9008f9b3467c9a323c5c3abf27f3021)
Andrew Deason [Mon, 6 Feb 2012 19:23:41 +0000 (13:23 -0600)]
Disable kernel opt by default on Solaris 10 and 11
With newer Solaris Studio (sometime in the 12.* series), cc started
adding SSE instructions to optimized x86 code, which is invalid for
kernel code and can generate panics. There appears to be no way to
turn this off currently (-xvector=%none is non-functional), so default
to not optimizing kernel code.
Andrew Deason [Thu, 2 Feb 2012 23:35:52 +0000 (17:35 -0600)]
SOLARIS: Use kcred instead of afs_osi_cred
For many vfs ops to the cache, we currently pass &afs_osi_cred for our
credentials, which is a mostly zeroed-out credential structure. In
some modern versions of Solaris (Solaris 11), at least some parts of
this structure need to not be NULL (cr_zone), or we will panic.
The Solaris kernel provides a 'kcred' credentials structure for the
purpose of using "kernel" credentials for i/o. So just use that
instead for Solaris 8 and beyond, since kcred has existed at least
since Solaris 8.
Andrew Deason [Thu, 22 Dec 2011 20:48:49 +0000 (15:48 -0500)]
afs: Panic on afs_conn refcount imbalance
An undercounted afs_conn can easily cause a panic and/or memory
corruption later on, since we put an rx_connection reference with each
afs_conn reference. Panic as soon as we detect this, as this indicates
a serious bug.
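
A userspace sketch of the idea, with abort() standing in for a kernel panic.
The demo_conn type and demo_put_conn function are hypothetical and only
illustrate checking the count before decrementing it.

```c
#include <stdio.h>
#include <stdlib.h>

struct demo_conn { int refCount; };

/* Drop one reference; stop immediately if the count is already exhausted. */
static void demo_put_conn(struct demo_conn *c)
{
    if (c->refCount <= 0) {
        /* An undercount here means some caller released twice. */
        fprintf(stderr, "demo_put_conn: refcount imbalance (%d)\n",
                c->refCount);
        abort();
    }
    c->refCount--;
}
```

Failing at the point of detection keeps the evidence close to the bug,
instead of surfacing later as unrelated memory corruption.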
Michael Meffie [Wed, 14 Dec 2011 17:52:51 +0000 (12:52 -0500)]
Unix CM: reset blacklist on hard-mount retry
Reset black-listed servers on a request when retrying due to a
hard-mount retry. When hard-mounts are in effect, a request may
retry indefinitely. If all the servers have been black-listed
due to a transient error, the request may never complete.
Reviewed-on: http://gerrit.openafs.org/6330 Reviewed-by: Andrew Deason <adeason@sinenomine.net> Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Derrick Brashear <shadow@dementix.org>
(cherry picked from commit faa58c9f60a158481bdfee27e175a37c5fcd64aa)
Andrew Deason [Thu, 10 Nov 2011 21:18:41 +0000 (15:18 -0600)]
SOLARIS: Do not build x86 kernel module on 5.11
Oracle Solaris 11 no longer supports x86 (amd64 is required). If we
try to build the x86 module, /usr/include/sys/kobj.h complains that
the ISA is unsupported, and refuses to go on. So, just remove
MODLOAD32 from the libafs directories to build on sunx86_511.
when we are going to hit the backend storage, disable keepalives.
the net effect of this is that no idle dead time is needed; instead,
the normal dead time will result in a connection with no activity
simply dying naturally if i/o blocks forever.
it's important that keepalives be enabled during callback breaks,
so that is done.
Jeffrey Altman [Mon, 28 Nov 2011 17:58:02 +0000 (12:58 -0500)]
rx: RX_CALL_IDLE and RX_CALL_BUSY
Allocate new Rx error codes for Idle and Busy calls but do not
send these errors on the wire. They are only intended for local
use.
RX_CALL_IDLE is an indication to an application that requests it
that the rx peer is maintaining an open call channel but has not
sent any actual data for the length of the registered idle dead
timeout.
RX_CALL_BUSY is an indication to an application that requests it
that the rx peer believes the selected call channel is in use by
a pre-existing call.
When either RX_CALL_IDLE or RX_CALL_BUSY is assigned as the call
error and an abort must be sent to the rx peer, the errors are
translated to RX_CALL_TIMEOUT. This is necessary because it is
not possible to add new Rx error values in a method that is safe
for peers that are not expecting them.
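
The translation step might look like the sketch below. The DEMO_* constants
use placeholder values chosen for this example and are not the values in
rx.h; demo_wire_error is a hypothetical helper name.

```c
/* Placeholder error values for illustration only (not taken from rx.h). */
enum {
    DEMO_RX_CALL_TIMEOUT = -3,
    DEMO_RX_CALL_IDLE    = -100,   /* local-only */
    DEMO_RX_CALL_BUSY    = -101    /* local-only */
};

/* Map a call error to the value that may be sent in an abort packet. */
static int demo_wire_error(int local_error)
{
    switch (local_error) {
    case DEMO_RX_CALL_IDLE:
    case DEMO_RX_CALL_BUSY:
        /* Never put local-only codes on the wire. */
        return DEMO_RX_CALL_TIMEOUT;
    default:
        return local_error;
    }
}
```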
This patchset also documents which Rx errors defined in rx.h are
used on the wire and which are not.
The Unix and Windows cache managers are updated to build with
these new error codes.
Andrew Deason [Mon, 7 Mar 2011 17:08:26 +0000 (11:08 -0600)]
RX: Avoid timing out non-kernel busy channels
When we encounter a "busy" call channel (indicated by receiving
RX_PACKET_TYPE_BUSY packets), we can error out a call with
RX_CALL_TIMEOUT to try and get the application code to retry the call.
However, many RX applications are not aware of this, and will just
fail with an error upon receiving a single busy packet.
So instead, make this behavior optional, and only do it if the
application tells us what specific error it expects to receive when a
busy call channel is detected. Enable this behavior for the Unix cache
manager, as it can cope with receiving an RX_CALL_TIMEOUT error in
this scenario.
Andrew Deason [Fri, 13 Jan 2012 18:43:16 +0000 (13:43 -0500)]
vol: remove SYNC fatal_error processing
Currently SYNC clients will "disable" themselves on certain error
patterns. For example, if the server end closes its file descriptor
too many times, or takes too long and then closes the fd, the SYNC
client will return an error and set fatal_error. On any subsequent
SYNC requests, the request will immediately fail without contacting
the server, often making SYNC client programs effectively useless
until they are restarted.
There isn't really any reason to cause future requests to fail.
Transient problems in the fileserver can easily make this situation
possible (e.g. a fileserver can crash but still take several minutes
to close the SYNC fd while the core is written to disk), and so while
we may return an error for a specific problematic request, future
requests may be fine.
So, just remove everything related to fatal_error, so future SYNC
requests can continue to be attempted. Adjust some log messages to
reflect the new behavior.
Marc Dionne [Wed, 18 Jan 2012 01:19:54 +0000 (20:19 -0500)]
rx: Correctly test for end of call queue
The intention of this condition is to check if the current call
being considered is the last one on the queue, but the test is
incorrect. A null next pointer indicates a removed item, not
the end of the queue.
Use the queue_IsLast macro instead to correctly determine that
this is the last item in the queue and that a call has to be
selected, either the current one or a previously seen good choice.
This can cause calls to get permanently stuck in the call queue
and never get assigned to a thread, even when all threads are
idle.
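
The difference between the two tests is easy to see on a toy circular queue.
The demo_queue type and helpers below are a simplified model written for
this note, assuming a sentinel head and a remove operation that nulls the
item's links; they are not the rx queue macros themselves.

```c
#include <stddef.h>

struct demo_queue { struct demo_queue *prev, *next; };

static void demo_init(struct demo_queue *head)
{
    head->prev = head->next = head;     /* empty queue points at itself */
}

static void demo_append(struct demo_queue *head, struct demo_queue *item)
{
    item->prev = head->prev;
    item->next = head;
    head->prev->next = item;
    head->prev = item;
}

/* Unlink the item and null its links to mark it as removed. */
static void demo_remove(struct demo_queue *item)
{
    item->prev->next = item->next;
    item->next->prev = item->prev;
    item->prev = item->next = NULL;
}

/* The last item is the one whose next pointer is the head, never NULL. */
static int demo_is_last(const struct demo_queue *head,
                        const struct demo_queue *item)
{
    return item->next == head;
}
```

In a queue shaped like this, a NULL next pointer is only ever seen on an item
that is no longer queued, so testing for NULL can never detect the tail.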
Andrew Deason [Wed, 11 Jan 2012 15:00:35 +0000 (10:00 -0500)]
vol: Fix VCreateVolume special inode cleanup
In order to dec the relevant special inodes, we need to know the
parent vol id in addition to the vol id itself. Use the appropriate
volume IDs when IH_DEC'ing special inodes after we fail to create the
volume, so we don't leave behind special inodes.
Marc Dionne [Fri, 6 Jan 2012 22:22:35 +0000 (17:22 -0500)]
libuafs: only rebuild h directory when needed
A few changes to allow a "make all ; sudo make install ; make all..."
workflow to work without manually removing files in between.
Make the rebuilding of the h directory dependent on the source
files scanned to build it. This prevents it from being rebuilt
for every "make install".
While we're here, use -f when removing linktest for the clean target.
This allows "make clean" to remove it without prompting when the user
doesn't have write access to the file, as is the case when make install
rebuilds it as root.
afs: discard cached state when we are unsure of validity
in the event we got a network error, we don't know if the server
completed (or will complete) our operation. we can assume nothing.
a more complicated version of this could attempt to verify that the
state is what we expect it to be, but in extended callbacks universe
this is potentially easier to solve anyway. for now, return the
error to the caller, and mark the vcache unstat'd.
Andrew Deason [Fri, 18 Nov 2011 16:25:08 +0000 (10:25 -0600)]
DAFS: Atomically re-hash vnode in VGetFreeVnode_r
VGetFreeVnode_r pulls a vnode off of the vnode LRU, and removes the
vnode from the vnode hash table. In DAFS, we may drop the volume glock
immediately afterwards in order to close the ihandle for the old vnode
structure.
While we have the glock dropped, another thread may try to
VLookupVnode for the new vnode we are creating, find that it is not
hashed, and call VGetFreeVnode_r itself. This can result in two
threads having two separate copies of the same vnode, which bypasses
any mutual exclusion ensured by per-vnode locks, since they will lock
their own version of the vnode. This can result in a variety of
different problems where two threads try to write to the same vnode at
the same time. One example is calling CopyOnWrite on the same file in
parallel, which can cause link undercounts, writes to the wrong vnode
tag, and other CoW-related errors.
To prevent all this, make VGetFreeVnode_r atomically remove the old
vnode structure from the relevant hashes, and add it to the new hashes
before dropping the glock. This ensures that any other thread trying
to load the same vnode will see the new vnode in the hash table,
though it will not yet be valid until the vnode is loaded.
Note that this only solves this race for DAFS. For non-DAFS, the vol
glock is held over the ihandle close, so this race does not exist.
The comments around the callers of VGetFreeVnode_r indicate that
similar extant races exist here for non-DAFS, but they are unsolvable
without significant DAFS-like changes to the vnode package.
Andrew Deason [Tue, 27 Dec 2011 02:22:08 +0000 (21:22 -0500)]
afs: Grab a reference to setp in afs_icl_Event4
We can drop GLOCK in several places in afs_icl_Event4 and the
afs_icl_AppendRecord callee. To ensure that the given afs_icl_set does
not get freed while we have GLOCK dropped, grab a reference to the
set.
Thanks to Ryan C. Underwood for reporting an issue triggered by this.
Michael Meffie [Fri, 12 Aug 2011 18:29:48 +0000 (14:29 -0400)]
xstat: cm xstat time values are 32 bit
The kernel space cm xstat time structures are implemented as 32
bit values in memory and on the wire. Define the client side
xstat userspace structures as 32 bit time values as well to avoid
size mismatches on systems with native 64 bit time values.
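
As a rough illustration, a fixed-width time structure keeps the same size no
matter how wide the native time_t is. The demo_cm_time name and its field
names are assumptions for this sketch and not the xstat definitions.

```c
#include <stdint.h>
#include <time.h>

/* Fixed 32-bit layout, independent of the platform's time_t width. */
struct demo_cm_time {
    int32_t tv_sec;
    int32_t tv_usec;
};

/* Narrow native values into the fixed-width structure. */
static struct demo_cm_time demo_from_native(time_t sec, long usec)
{
    struct demo_cm_time t;

    t.tv_sec = (int32_t)sec;
    t.tv_usec = (int32_t)usec;
    return t;
}
```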
it's actually important this be more than the rx call dead time
so timing out server callbacks to clients don't result in us idle deading
a call to the server when callbacks need to be broken
Geoffrey Thomas [Sun, 1 Jan 2012 00:51:29 +0000 (19:51 -0500)]
linux: fsync on a directory should return 0, not EINVAL
Directory writes are synchronous, so this is fine. There's a
mostly-convenient function in fs/libfs.c that returns 0 that we can use
to do what we want ("mostly" because it was renamed in 2.6.35).
FIXES 130425
Reviewed-on: http://gerrit.openafs.org/6491 Reviewed-by: Simon Wilkinson <sxw@inf.ed.ac.uk> Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Derrick Brashear <shadow@dementix.org>
(cherry picked from commit 267934d0e6910c8d8166a6e78f93c1bab40857b8)
Geoffrey Thomas [Sun, 11 Dec 2011 10:06:24 +0000 (05:06 -0500)]
rpm: Don't attempt to restart on upgrade when using systemd
systemd is actually rather capable of leaving the OpenAFS client in an
incredibly broken state, thanks to its willingness to track services and
kill their processes. We should not attempt to restart the client on
upgrade, whether a normal upgrade or a migration from SysV initscripts.
In the former case, it's fine (and correct) for the old AFS to keep
running; in the latter case, the unit file is capable of correctly
shutting down an initscript-launched client. The same is true for the
OpenAFS server.
This brings the packaging in line with the SysV initscript code in the
specfile, which does not attempt to restart the service, as well as with
e.g. Debian's packaging, which uses --no-restart-on-upgrade.
While we're here, clean up a redundant BuildRequires on systemd-units.
Andrew Deason [Thu, 9 Jun 2011 03:50:27 +0000 (22:50 -0500)]
libafs: memset dirHeader->hashTable
Clear dirHeader->hashTable via memset instead of via a loop. This is
more efficient, and avoids the loop getting optimized into an unusable
_memset call on recent versions of Solaris Studio when building for
the kernel.
Thanks to Jeff Blaine for reporting the issue with Solaris Studio.
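
For reference, the memset form looks like this. The structure, the table
size, and the element type are assumptions made for a self-contained
example; only the technique (one memset over the whole array) is the point.

```c
#include <string.h>

#define DEMO_NHASHENT 128           /* assumed table size for this sketch */

struct demo_dir_header {
    unsigned short hashTable[DEMO_NHASHENT];
};

/* Zero the whole table in one call instead of looping over entries. */
static void demo_clear_hash(struct demo_dir_header *dh)
{
    memset(dh->hashTable, 0, sizeof(dh->hashTable));
}
```

Using sizeof on the array itself keeps the length correct if the table size
or element type ever changes.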
Andrew Deason [Mon, 19 Dec 2011 22:11:31 +0000 (17:11 -0500)]
Include afsconfig.h before anything else
afsconfig.h can define various preprocessor symbols that can affect
how system headers behave. For example, the presence of the
_POSIX_PTHREAD_SEMANTICS symbol changes the number of arguments to
getpwnam_r on at least Solaris 8. So, we must include afsconfig.h
before including anything else, to ensure consistency.
Andrew Deason [Sun, 18 Dec 2011 21:20:42 +0000 (15:20 -0600)]
aklog: Add replacement setenv/unsetenv
aklog makes use of the setenv and unsetenv functions, which do not
exist (at least) on HP-UX earlier than around 11i v3, and do not exist
on Solaris earlier than Solaris 10. Add replacement functions for
setenv and unsetenv when they are not present. Note that these
implementations are copied from libroken, and setenv was modified to
not use asprintf.
This is 1.6-specific. On the master branch, libroken takes care of
these for us. On the master branch, setenv and unsetenv from libroken
were added in 70e8451acd0426024c152073e53bc6606e0189e1.
Andrew Deason [Wed, 14 Dec 2011 20:42:08 +0000 (14:42 -0600)]
afs: Clear VHardMount on ResetVolumeInfo
afs_Analyze sets VHardMount on a volume struct when a hard-mount
scenario is encountered, and clears it after sleeping. However, if the
volume struct has VRecheck set, or if it's not in memory, afs_Analyze
cannot retrieve the volume struct in order to clear VHardMount again.
For the VRecheck case, this can result in VHardMount never getting
cleared, and so hard-mount messages for the volume seem to disappear.
So, clear VHardMount when we set VRecheck so this does not occur.
For the case where the volume struct is not in memory, this is not a
problem, since when we allocate a volume struct again, the VHardMount
state will not be retained.
Andrew Deason [Wed, 20 Jul 2011 21:50:52 +0000 (16:50 -0500)]
libafs: Rate-limit hard-mount waiting messages
Limit how often we log "hard-mount waiting for XXX" messages. Without
this, it is possible for a client with hard-mounts enabled to spam the
kernel log rather excessively (in extreme cases this can even panic
the machine on at least some Linux).
To keep things simple, just log approximately one message per volume
per hard-mount interval.
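
One simple way to get "about one message per volume per interval" is to keep
a timestamp on each volume. The sketch below assumes a hypothetical
demo_volume structure and plain integer seconds; it is not the libafs
implementation.

```c
struct demo_volume {
    long last_msg_time;     /* 0 means no message has been logged yet */
};

/*
 * Return 1 if a hard-mount message should be logged for this volume now,
 * and record the time; return 0 if one was logged too recently.
 */
static int demo_should_log(struct demo_volume *v, long now, long interval)
{
    if (v->last_msg_time != 0 && now - v->last_msg_time < interval)
        return 0;
    v->last_msg_time = now;
    return 1;
}
```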
Andrew Deason [Wed, 4 May 2011 17:34:20 +0000 (12:34 -0500)]
libafs: Avoid using changing unixuser ticket data
PSetTokens was afs_osi_Alloc'ing after afs_osi_Free'ing the previous
token data. This can sleep, causing tu->stp to be pointing to garbage
while we wait to alloc. Additionally, rxkad_NewClientSecurityObject
can sleep while waiting to alloc memory, and so the given tu->stp
pointer given to it by afs_ConnBySA may be invalid by the time it
actually uses the data.
To fix this, we could implement unixuser locking to ensure mutual
exclusion of these events. However, this implements a more
conservative change for the 1.4 and 1.6 branches. In PSetTokens we
alloc the new memory before we change anything, and in afs_ConnBySA we
make copies of the ticket data before giving it to rxkad. With these
changes, the glock gives us enough serialization to avoid issues with
tu->stp changing underneath us.
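
The two changes follow a common pattern: allocate before touching shared
state, and hand out copies rather than the shared pointer. A userspace
sketch, with malloc/free standing in for afs_osi_Alloc/afs_osi_Free and a
hypothetical demo_user structure:

```c
#include <stdlib.h>
#include <string.h>

struct demo_user {
    char *stp;          /* ticket data */
    size_t stLen;       /* ticket length */
};

/* Replace the ticket: allocate first, swap, and only then free the old. */
static int demo_set_ticket(struct demo_user *u, const char *data, size_t len)
{
    char *fresh = malloc(len ? len : 1);
    char *old;

    if (fresh == NULL)
        return -1;              /* old ticket is left untouched */
    if (len > 0)
        memcpy(fresh, data, len);

    old = u->stp;
    u->stp = fresh;
    u->stLen = len;
    free(old);
    return 0;
}

/* Give the caller a private copy, so later replacements cannot affect it. */
static char *demo_copy_ticket(const struct demo_user *u, size_t *len_out)
{
    char *copy = malloc(u->stLen ? u->stLen : 1);

    if (copy == NULL)
        return NULL;
    if (u->stLen > 0)
        memcpy(copy, u->stp, u->stLen);
    *len_out = u->stLen;
    return copy;
}
```

At no point does u->stp refer to freed memory, even if an allocation blocks,
and a consumer working from a copy is unaffected by a later replacement.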
This change is specific to 1.4 and 1.6. On the master branch, this
issue is fixed by implementing unixuser locks in change
Idd66d72f716b7e7dc08faa31ae43e9a23639bae3.
Andrew Deason [Mon, 25 Apr 2011 18:58:34 +0000 (13:58 -0500)]
pam: Fix password torching const-ness
In some code branches, the PAM code "torches" a password by zeroing
it. However, it does this through a const pointer which we otherwise
know is not actually const. Make sure we get better type checking by
doing this through a non-const pointer.
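
In outline, the torching helper takes a plain char pointer, so the compiler
will reject an attempt to pass it a const string. demo_torch is a
hypothetical name for this sketch, not a function from the PAM module.

```c
#include <string.h>

/* Overwrite a password in place; requires a writable buffer. */
static void demo_torch(char *password)
{
    if (password != NULL)
        memset(password, 0, strlen(password));
}
```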
Marc Dionne [Sat, 16 Apr 2011 15:22:54 +0000 (11:22 -0400)]
pam: Clear up PAM_CONST related warnings on Linux
Commit 78d1f8d8 expanded the use of PAM_CONST and introduced many
new warnings on Linux where pam expects "const" arguments.
This clears up the warnings by doing the following:
- Cast "user" to char * when calling ka* functions
- Change the signature of pam_afs_prompt and pam_afs_printf to use
PAM_CONST
- Use a separate non-const password pointer for pam_afs_prompt
Reviewed-on: http://gerrit.openafs.org/4487 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Andrew Deason <adeason@sinenomine.net> Reviewed-by: Derrick Brashear <shadow@dementia.org>
(cherry picked from commit 3ea39166d64d2e66cddef015734c2f91548423af)
Stephan Wiesand [Sun, 17 Apr 2011 22:37:36 +0000 (23:37 +0100)]
make afsdump_scan get ACLs right
This makes afsdump_scan get the ACLs right on little endian systems.
It also corrects and slightly beautifies some output (indentation,
cut&paste error for negative ACL label).
Andrew Deason [Mon, 19 Sep 2011 15:05:59 +0000 (11:05 -0400)]
ntohs ubik header size
The 'size' field in the ubik header is only 16-bits wide, so we should
be using ntohs to read it, not ntohl. The database checking utilities
for the prdb and kadb were still using ntohl (vldb was fixed by 591f9b6de9ab3dc5c17ad41af0241527f7f04b31).
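
A short sketch of reading a 16-bit network-order field correctly. The
demo_read_size helper and the raw byte buffer are assumptions for
illustration; the ubik header layout itself is not reproduced here.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* Read a 16-bit big-endian field from a raw buffer. */
static uint16_t demo_read_size(const unsigned char *buf)
{
    uint16_t raw;

    memcpy(&raw, buf, sizeof(raw));     /* avoid unaligned access */
    return ntohs(raw);                  /* 16-bit field, so ntohs */
}
```

Applying ntohl to a value that was only ever 16 bits wide moves the
meaningful bytes to the wrong end of the result on little-endian machines,
which is why the field width has to match the conversion.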
Michael Meffie [Fri, 16 Sep 2011 15:23:18 +0000 (11:23 -0400)]
solaris: libafs depends on fs/ufs
The Solaris afs module depends on symbols exported by fs/ufs.
Set this dependency in the afs module so the kernel loader
will automatically load the fs/ufs driver if it is not already
loaded, such as on zfs-only systems.
Andrew Deason [Fri, 2 Dec 2011 20:36:59 +0000 (14:36 -0600)]
salvager: Create link table with volume group id
The link table needs to be created with the VG id or RW vol id, not
the non-RW vol id. Unlike other special inodes, this goes for both the
'parent' and 'volume' volume ids, not just the 'parent' id, since
there is only one link table per VG.
Without this, the salvager can generate invalid linktable special
inodes if it encounters a VG with no inodes for the RW vol.
Andrew Deason [Wed, 30 Nov 2011 23:41:53 +0000 (17:41 -0600)]
DAFS: Ensure logging on attach2 errors
The attach2 error path transitions a volume to VOL_STATE_ERROR, in
case whatever got us to that error path did not already put the volume
in an appropriate state. Log when we do this, to make sure we do not
end up with a volume in VOL_STATE_ERROR state silently.
Andrew Deason [Wed, 30 Nov 2011 23:35:56 +0000 (17:35 -0600)]
DAFS: Avoid unnecessary preattach on FSYNC_VOL_ON
FSYNC_VOL_ON/FSYNC_VOL_ATTACH can be called to "online" a volume that
was actually kept online for the duration of the volume operation.
Avoid calling VPreAttachVolumeByVp_r for such a volume if it's already
attached, in order to avoid an unnecessary log message and to save a
tiny bit of processing.
Andrew Deason [Wed, 30 Nov 2011 23:21:32 +0000 (17:21 -0600)]
DAFS: Log more for VPreAttachVolumeByVp odd states
When we encounter "odd" states in VPreAttachVolumeByVp_r, say what the
actual state we encountered was, along with the attach flags, so we
have a better idea of what's going on.
Andrew Deason [Wed, 30 Nov 2011 23:08:57 +0000 (17:08 -0600)]
DAFS: Ensure GetVolume errors on ERROR volumes
In GetVolume, after we call VAttachVolumeByVp_r, there is no explicit
check to see if vp is in VOL_STATE_ERROR state. Make sure we don't try
to use such a volume, or blindly transition the volume away from that
state.
Andrew Deason [Wed, 30 Nov 2011 20:36:06 +0000 (14:36 -0600)]
DAFS: Do not transition to ERROR on trivial errors
attach2 can result in many different errors; some indicate that the
volume is in an inconsistent state, but many others just indicate that
the volume cannot be attached for benign reasons (such as VNOVOL if
the volume doesn't exist, or VOFFLINE if the volume is being used by a
volume utility). Currently, for DAFS, attach2 transitions the relevant
volume to the VOL_STATE_ERROR state for almost all errors encountered,
even the benign ones. Instead, skip the error state transition for
error handling paths that do not reflect a "broken" volume.
rx: arrange for Finalize to really stop running calls
previously rxi_ServerProc would happily error a call once
rx_tranquil was set, but keep calling ExecuteRequest.
Reorder code so kernel shutdown attempts are processed first;
then arrange if we are tranquil to not process the call further.
Todd Lewis [Sun, 11 Sep 2011 11:42:47 +0000 (12:42 +0100)]
RPM: Fix dkms support on Fedora 15
Newer dkms no longer uses or supplies a $kernelver_array variable;
instead it uses $kernelver. The attached patch uses both, one of
which will be empty, so the test will do the Right Thing regardless
of your dkms version.
Further, the "mv" command at the end of the MAKE[0]= line needs
lots of back-slashes on each of its parms. We need three to make it
all the way to the final dkms.conf file -- so that's six -- plus one
more to escape the '$'; that's seven in all.
In case there's any question (and with all the back-slashes involved,
there should be) about the intent here, the whole point of this
patch is to make the final dkms.conf MAKE[0]= line look like this
(modulo line breaks):
Andrew Deason [Mon, 29 Aug 2011 18:07:01 +0000 (13:07 -0500)]
ihandle: OPEN fdPs are not counted in ihP refcount
Just add a comment explaining that an OPEN FdHandle_t does not count
against the ref count for its parent IHandle_t. Recently I've seen
some confusion about this when discussing ihandle internals, and this
should make this abundantly clear.
Install*Volume is careful to protect against recursing into the volume
lock via ResetVolumeInfo. Unfortunately, GetServer acquires xserver,
and then if it needs to call GetCapabilities, it drops and reacquires
xserver.
turns out the volume locks weren't protecting much. they also aren't
grabbed before xvolume is dropped. fine, so, restructure to do all the
work, then merge the result.
Andrew Deason [Wed, 24 Aug 2011 17:30:00 +0000 (12:30 -0500)]
ihandle: Actually assert active fdPs are not AVAIL
FdHandle_t's that are on the linked list for an associated IHandle_t
should not be in the state FD_HANDLE_AVAIL. For the non-PIO case, we
assert that this is the case in ih_open (since we assert that if the
FdHandle_t is not in INUSE state, then it must be in OPEN state).
However, for the PIO case, we were just skipping over any FdHandle_t's
that were in the AVAIL state. These should never exist while on that
linked list, so assert for the PIO case, as well.
In the absence of bugs, there is no functional change here, but it
perhaps makes the ih_open loop easier to understand.
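A minimal sketch of the invariant described above. The enum, struct, and
function names are hypothetical stand-ins, not the real ihandle
definitions; the point is only that a handle found on the list is asserted
to be non-AVAIL rather than silently skipped.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the FdHandle_t states named in the commit. */
enum fd_state { FD_AVAIL, FD_OPEN, FD_INUSE };

struct fd_handle {
    enum fd_state state;
    struct fd_handle *next;
};

/*
 * Return the first OPEN handle on the list, or NULL if none can be reused.
 * An AVAIL handle should never be on this list, so assert instead of
 * skipping over it.
 */
struct fd_handle *
find_open_fd(struct fd_handle *head)
{
    struct fd_handle *fdP;

    for (fdP = head; fdP != NULL; fdP = fdP->next) {
        assert(fdP->state != FD_AVAIL);
        if (fdP->state == FD_OPEN)
            return fdP;
    }
    return NULL;
}
```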
Matt Benjamin [Wed, 24 Aug 2011 20:23:37 +0000 (16:23 -0400)]
LINUX vcache lock ordering in afs_linux_readdir
Normalize shared and exclusive lock operations. Take the lock
exclusive immediately, since the code assumes a write lock if
the vcache state is in flux or the entry is being fetched. Release
the write lock rather than a shared lock, since we do not hold a shared lock.
Reviewed-on: http://gerrit.openafs.org/5309
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Simon Wilkinson <sxw@inf.ed.ac.uk>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
(cherry picked from commit fa97579a08cdf23fcff3c50a5845d72a785feeaf)
Garrett Wollman [Sun, 7 Aug 2011 03:36:14 +0000 (23:36 -0400)]
butc: initialize startTime before it is used
In some unusual error situations, startTime may be used uninitialized.
Move the initialization up above the first such error condition.
(None of the intervening code can take measurably long to execute
so this should not make any difference in the non-error case.)
Andrew Deason [Wed, 8 Jun 2011 18:19:59 +0000 (13:19 -0500)]
afsd: Fail gracefully on mtab open failure
On Linux and IRIX, fail gracefully when we fail to open /etc/mtab,
instead of segfaulting. Move strdup'ing cacheMountDir until after
opening /etc/mtab, to simplify the error handling.
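The ordering described above might look roughly like the following. This
is a sketch only: record_mount and the line it writes are hypothetical,
and plain stdio is used in place of the mntent routines afsd would use.
Opening the file first means the failure path has nothing to free.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: append an illustrative line for mount_dir to the
 * table at mtab_path. Returns 0 on success and -1 on failure; it never
 * dereferences a NULL FILE pointer.
 */
int
record_mount(const char *mtab_path, const char *mount_dir)
{
    FILE *tab;
    char *dir;

    tab = fopen(mtab_path, "a");
    if (tab == NULL) {
        /* Nothing has been allocated yet, so just report and bail out. */
        perror(mtab_path);
        return -1;
    }

    /* Duplicate the path only after the open has succeeded. */
    dir = strdup(mount_dir);
    if (dir == NULL) {
        fclose(tab);
        return -1;
    }

    fprintf(tab, "%s\n", dir);   /* illustrative format, not a real mtab entry */
    free(dir);
    fclose(tab);
    return 0;
}
```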
Simon Wilkinson [Tue, 31 May 2011 07:28:51 +0000 (08:28 +0100)]
vos: Don't leak/overflow bulkaddrs
The vos listaddrs command repeatedly reuses a bulkaddrs array. It
zeros it once (without freeing the allocated memory), and then
repeatedly uses it without zeroing in a loop. As a result, the XDR
library assumes that a sufficiently large block is already allocated,
and neither reallocates for the incoming data nor checks limits.
This means that if the first call to VL_GetAddrsU returns a set of
addresses smaller than subsequent calls, we'll write past the end
of the array, causing memory corruption.
Fix this by freeing the arrays correctly with each pass of the call.
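To illustrate the failure mode and the fix, here is a small sketch with a
hypothetical decoder that follows the convention described above: it only
allocates when the destination pointer is NULL. None of these names come
from vos or the XDR library. Freeing and resetting the array between
passes forces a fresh, correctly sized allocation each time.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a decoded variable-length address array. */
struct addr_list {
    unsigned int len;
    unsigned int *val;
};

/*
 * Mimics the decoding convention from the commit message: when val is
 * already non-NULL, the buffer is assumed to be large enough and is
 * reused without any size check.
 */
int
decode_addrs(struct addr_list *out, const unsigned int *src, unsigned int n)
{
    if (out->val == NULL) {
        out->val = malloc((n ? n : 1) * sizeof(*out->val));
        if (out->val == NULL)
            return -1;
    }
    if (n > 0)
        memcpy(out->val, src, n * sizeof(*out->val));
    out->len = n;
    return 0;
}

/* Release the buffer and reset the list so the next decode reallocates. */
void
free_addrs(struct addr_list *list)
{
    free(list->val);
    list->val = NULL;
    list->len = 0;
}
```

Calling free_addrs after every decode_addrs, rather than reusing the list
as-is, is what keeps a larger second result from overrunning a smaller
first allocation.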
Andrew Deason [Mon, 16 May 2011 18:45:49 +0000 (13:45 -0500)]
libafs: Always use anonymous VL connections
afs_NewVolumeByName was using the areq given by the caller for
afs_SetupVolume, which may represent authenticated credentials. Give
afs_SetupVolume &treq instead, which will be anonymous, so we don't
have to deal with rxkad for VL lookups.
Andrew Deason [Mon, 16 May 2011 20:02:14 +0000 (15:02 -0500)]
viced: Check vnode length on Rename and Link
Commit 2578555d7e08131bf2fe4cdd0aa4b32567a76eb2 added vnode length
checks when we create or remove vnodes, but not during Rename and Link
operations (when vnodes are neither created nor destroyed). Add the
check in Rename and Link.
Andrew Deason [Wed, 27 Apr 2011 20:36:44 +0000 (15:36 -0500)]
viced: Do not try to reuse deleted client
When h_FindClient_r encounters a deleted client structure, it does not
try to find a different client structure to use. Force it to use a new
client structure by setting client to NULL when it detects a deleted
client.
This arguably reverts part of 4e55e30f5b2c149b350b6d6875793adf722fdc21, but the code paths in
h_FindClient_r are very different now, so that commit is probably not
too relevant.
Andrew Deason [Mon, 25 Apr 2011 18:53:52 +0000 (13:53 -0500)]
pam: Password is const in setcred
afs_setcred.c gets the "password" pointer from pam_get_data, which
always gives a const pointer (unlike pam_get_item, used in afs_auth.c
and elsewhere, which gives a const or non-const pointer depending on
the PAM implementation).
So, declare password const, to get better type checking.
Andrew Deason [Wed, 13 Apr 2011 15:52:50 +0000 (10:52 -0500)]
pam: Use PAM_CONST more often
Some callers of pam_get_item et al were just casting their argument to
a const void **. Some PAM implementations (Linux) want a const void**,
but others (Solaris) do not. Use the PAM_CONST symbol already defined
by autoconf to declare or cast the relevant variable const or not as
appropriate.
Andrew Deason [Wed, 13 Apr 2011 16:10:52 +0000 (11:10 -0500)]
pam: Check for null upwd from getpwnam_r
The POSIX getpwnam_r can yield a NULL struct passwd pointer even when
the returned error code is 0 (in particular, when the requested entry
is not found). Just add a check for a null upwd to make sure we don't
dereference a NULL pointer.
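A minimal sketch of the check, using the five-argument POSIX getpwnam_r.
The helper name lookup_uid and the fixed 4096-byte buffer are assumptions
made for illustration; this is not the code from the PAM module.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pwd.h>
#include <stddef.h>
#include <sys/types.h>

/*
 * Hypothetical helper: look up a user's uid by name.
 * Returns 0 and stores the uid on success; returns -1 if the lookup
 * failed or if no matching entry exists.
 */
int
lookup_uid(const char *name, uid_t *uid_out)
{
    struct passwd pwd;
    struct passwd *upwd = NULL;
    char buf[4096];              /* assumed large enough for this sketch */
    int code;

    code = getpwnam_r(name, &pwd, buf, sizeof(buf), &upwd);

    /* A return code of 0 with a NULL result means "entry not found". */
    if (code != 0 || upwd == NULL)
        return -1;

    *uid_out = upwd->pw_uid;
    return 0;
}
```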
Andrew Deason [Wed, 13 Apr 2011 16:08:09 +0000 (11:08 -0500)]
pam: Use POSIX getpwnam_r on Solaris
_POSIX_PTHREAD_SEMANTICS is now always defined for Solaris, which
means we get a POSIX-conforming getpwnam_r, which takes 5 arguments.
So, add Solaris to the list of platforms that use a POSIX getpwnam_r.
Andrew Deason [Fri, 8 Apr 2011 18:00:15 +0000 (13:00 -0500)]
DAFS: Request salvage on detach for volser
When the volserver notices that a volume needs salvaging, set
V_needsSalvaged, so that when we VDetachVolume the volume, we can just
request the salvage in the volume package.
Fix the VolClone salvaging code to do this as well, instead of using
the vol-private VRequestSalvage_r interface.
Andrew Deason [Thu, 7 Apr 2011 17:36:19 +0000 (12:36 -0500)]
volser: Avoid assert on ViceCreateRoot failure
If IH_CREATE fails in ViceCreateRoot, it may just be due to an on-disk
inconsistency. So, don't assert, but just return an error and detach
the volume.
Andrew Deason [Thu, 7 Apr 2011 18:51:14 +0000 (13:51 -0500)]
DAFS: Do not give back vol to viced after salvage
If we VRequestSalvage_r a volume successfully, and we are not the
fileserver, we will tell the fileserver to salvage a volume. So, we do
not need to give back the volume afterwards, since telling the
fileserver that a volume needs a salvage effectively gives it back (so
the salvager can take it).
So, clear needsPutBack so we don't try to also give back the volume,
and avoid the fileserver yelling at us for trying to give back a
volume that is checked out by someone else (or is not checked out at
all).
Andrew Deason [Wed, 6 Apr 2011 21:56:22 +0000 (16:56 -0500)]
afsd: Trim trailing slashes on Linux mntent
When we write a mount entry on Linux when mounting /afs, trim trailing
slashes on the mount path. Otherwise, the umount utility can get
slightly confused, and leave the /afs mount entry in /etc/mtab after
it's been unmounted.
For full correctness we should probably completely canonicalize the
path like the mount utility does, but it's unlikely that anyone will
provide significantly weird paths for cacheMountDir, so don't bother.
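The trimming itself can be sketched as below. The function name is
hypothetical, and keeping a lone "/" intact is an assumption of this
sketch rather than something the commit message states.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helper: strip trailing '/' characters from path in place.
 * A path consisting only of slashes is reduced to a single "/".
 */
void
trim_trailing_slashes(char *path)
{
    size_t len = strlen(path);

    while (len > 1 && path[len - 1] == '/') {
        path[--len] = '\0';
    }
}
```

This does no further canonicalization (no handling of "//" in the middle
of the path, "." or ".." components, or symlinks).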
Marc Dionne [Wed, 6 Apr 2011 01:30:20 +0000 (21:30 -0400)]
ubik: don't rely on timeout value after select()
The value of timeout after a select() call should be considered
undefined; relying on its value is not portable.
Since IOMGR_Select doesn't modify the timeout it is given, the
intention of the code seems to be to wait for gradually increasing
timeout values, starting at 50ms. At least under Linux, the
timeout gets set to 0 by select() if it waited for the full specified
time, resulting in a much shorter maximum possible wait period.
Initialize the timeout value for each loop according to the existing
logic, to get consistent behaviour between the lwp and pthreaded code.
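A sketch of the portable pattern: set the struct timeval inside the loop
on every pass, instead of relying on whatever select() left in it. The
function name and the linear growth schedule are assumptions for
illustration; the actual schedule used by ubik is not given here.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>

/*
 * Hypothetical helper: sleep for `rounds` passes, where pass i waits
 * (i + 1) * step_ms milliseconds. Returns the total number of
 * milliseconds requested across all passes.
 */
long
backoff_wait(int rounds, long step_ms)
{
    long total = 0;
    int i;

    for (i = 0; i < rounds; i++) {
        struct timeval tv;
        long ms = (i + 1) * step_ms;

        /* Re-initialize on every pass; select() may have modified tv. */
        tv.tv_sec = ms / 1000;
        tv.tv_usec = (ms % 1000) * 1000;

        /* No descriptors: select() is used purely as a timed wait. */
        (void)select(0, NULL, NULL, NULL, &tv);
        total += ms;
    }
    return total;
}
```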