Marc Dionne [Sat, 29 Oct 2011 23:23:07 +0000 (19:23 -0400)]
Linux: 3.1: update RCU path walking detection in permission i_op
The permission() inode operation changed again with kernel 3.1,
back to the form it had before 2.6.38. This compiles fine,
but is missing the new way of detecting when we get called in
RCU path walking mode, resulting in system hangs.
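A minimal user-space sketch of the detection, assuming the 3.1 convention in which the RCU-walk indication arrives as a MAY_NOT_BLOCK bit in the mask and the operation answers -ECHILD to ask for a retry outside RCU path walking. The struct, the function name, and the flag values below are local stand-ins for illustration, not the kernel headers or the actual OpenAFS change.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Stand-in values for illustration; a real module takes these from the kernel headers. */
#define SKETCH_MAY_READ      0x04
#define SKETCH_MAY_NOT_BLOCK 0x80

struct sketch_inode { int readable; };

/* 3.1-style shape: no separate flags argument, the RCU-walk hint is in the mask. */
static int sketch_permission(struct sketch_inode *ip, int mask)
{
    assert(ip != NULL);
    if (mask & SKETCH_MAY_NOT_BLOCK) {
        /* We might have to sleep (locks, RPCs), so ask the caller to
         * retry outside of RCU path walking rather than block here. */
        return -ECHILD;
    }
    if ((mask & SKETCH_MAY_READ) && !ip->readable)
        return -EACCES;
    return 0;
}
```

Missing the first test is what produces the hang described above: the operation goes on to block in a context where blocking is not allowed.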
Simon Wilkinson [Tue, 21 Jun 2011 17:32:02 +0000 (18:32 +0100)]
rx: Remove the ADAPT_WINDOW code
RX still has the remnants of an old mechanism for doing RTT and
congestion window detection. This code is #ifdef'd out using
the ADAPT_WINDOW define, but is pretty much unserviceable these days,
as it collides with the TCP style implementation (with ADAPT_WINDOW
enabled, both will attempt to manipulate a connection's RTT and
window size).
As the current TCP-style RTT and window calculation seem to work
much better in deployment, and there isn't much hope for us being
able to maintain two different congestion mechanisms, just remove
ADAPT_WINDOW. It is in git, if we ever want it back (not that I
think we ever would).
Jeffrey Altman [Fri, 28 Oct 2011 15:36:10 +0000 (11:36 -0400)]
Windows: out of date version not in current chunk
In buf_GetNewLocked(), the comparison to decide whether a
cm_buf_t is a member of the current chunk must take the data
version into account. If the data version is out of date, it
is not part of the current chunk and is an object that can be
safely recycled.
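The membership test can be sketched roughly as follows. The record layout and names are hypothetical and heavily trimmed; the real cm_buf_t and buf_GetNewLocked() carry far more state than shown here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, trimmed-down buffer record for illustration only. */
struct sketch_buf {
    unsigned long fileId;       /* which file the buffer caches */
    unsigned long chunkIndex;   /* which chunk of that file */
    unsigned long dataVersion;  /* data version the contents were read at */
};

/* A buffer belongs to the current chunk only if its data version is current as well. */
static int sketch_in_current_chunk(const struct sketch_buf *bp, unsigned long fileId,
                                   unsigned long chunkIndex, unsigned long currentDV)
{
    assert(bp != NULL);
    return bp->fileId == fileId &&
           bp->chunkIndex == chunkIndex &&
           bp->dataVersion == currentDV;
}
```

A buffer for which this returns 0 because of a stale data version is a candidate for recycling.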
Edward Z. Yang [Tue, 18 Oct 2011 03:16:15 +0000 (23:16 -0400)]
linux: Update Packaging to build OpenAFS services for Fedora's systemd
Fedora 15 now uses systemd (see http://fedoraproject.org/wiki/Systemd)
for the OS init system. While it currently has backwards
compatibility with older SysV-style init scripts, future versions of
Fedora may no longer support them, and OS startup tends to be faster
with the systemd service units. Also, systemd runs all the service's
processes within a Linux kernel cgroup.
(see http://www.kernel.org/doc/Documentation/cgroups/cgroups.txt)
This change includes openafs-client.service and
openafs-server.service unit files for the client and server packages,
respectively.
Client
- Loading the openafs module was moved into
/etc/sysconfig/modules/openafs-client.modules. This causes the OS to
load the module on boot. This is the preferred way for modules to be
loaded with Fedora. (See
http://docs.fedoraproject.org/en-US/Fedora/15/html/Deployment_Guide/sec-Persistent_Module_Loading.html
for more details)
- The CellServDB file is generated with sed rather than cat.
This change was made because systemd doesn't execute as a shell
script, but rather executes processes directly. Rather than invoking
a shell to concatenate the CellServDB.* files, they're written to the
CellServDB file using a sed one-liner.
- Do all of the proper kernel module loading and unloading.
Server
- Since systemd uses cgroups, when the service is shut down, all
processes in the openafs-server.service cgroup will be terminated.
The other changes are standard as per:
http://fedoraproject.org/wiki/Packaging:ScriptletSnippets#Systemd
Original version by Jonathan Billings <jsbillin@umich.edu>.
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
Change-Id: Ifb41790ffe107b319097b9750273aecfe82c3349
Reviewed-on: http://gerrit.openafs.org/5637
Reviewed-by: Derrick Brashear <shadow@dementix.org>
Reviewed-by: Alex Chernyakhovsky <achernya@mit.edu>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Jeffrey Altman [Thu, 27 Oct 2011 21:57:25 +0000 (17:57 -0400)]
Windows: only flush buffers on shutdown if running
If a service shutdown message is received prior to the
service entering the running state, do not attempt to
buf_CleanAndReset() because the required data structures
and locks are not initialized.
Jeffrey Altman [Tue, 25 Oct 2011 19:32:11 +0000 (15:32 -0400)]
Windows: Do not EEXIST exact match during rename
AFS Rename operations on the file server will delete a
target file if it exists. Do not prevent renames because
an exact match of the target name exists in the target
directory.
Rod Widdowson [Sat, 22 Oct 2011 15:46:26 +0000 (16:46 +0100)]
Windows: Look for 8.3 name when doing a rename
If we are doing a destructive rename we need to find whether the
target file exists. This is done in the usual way (case sensitive,
case insensitive), but the short name is not looked for.
This means that the rename of a file to a short name will not
supersede correctly; rather the service refuses the rename since
the target existed already.
This patch looks the target name up in the shortname tree if the
target name is short and all else has failed.
Jeffrey Altman [Mon, 17 Oct 2011 13:28:11 +0000 (09:28 -0400)]
Windows: free pointer after last reference
This is a superficial change but is being done for readability.
If given the choice of freeing memory and then testing the pointer
value or vice-versa, test the pointer value first.
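A small illustration of the ordering, using a hypothetical helper rather than the code touched by this change. In ISO C the value of a pointer becomes indeterminate once the object it points to has been freed, so reading the pointer first is also the stricter choice.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Returns 1 if a name buffer was allocated and released, 0 otherwise. */
static int sketch_use_and_release(const char *name)
{
    size_t len;
    char *copy;
    int had_buffer;

    assert(name != NULL);
    len = strlen(name) + 1;
    copy = malloc(len);
    if (copy != NULL)
        memcpy(copy, name, len);

    /* The last reference to the pointer value comes first ... */
    had_buffer = (copy != NULL);
    /* ... and only then is the memory released. */
    free(copy);
    return had_buffer;
}
```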
Jeffrey Altman [Mon, 17 Oct 2011 13:22:53 +0000 (09:22 -0400)]
Windows: AFSEvaluateTargetByName free buffer if no return
For consistency with other functions in AFSCommSupport
modify AFSEvaluateTargetByName to free the DirEntry on
completion if the caller has not provided an out parameter
to accept it.
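The ownership rule reads roughly like this sketch. All names are hypothetical, and the allocation counter exists only to make the behaviour observable; it is not part of AFSCommSupport.

```c
#include <assert.h>
#include <stdlib.h>

struct sketch_dir_entry { unsigned long fileId; };

static int sketch_live_entries;  /* allocation counter, for the demonstration only */

/* Evaluate a target; hand the entry over only if an out parameter was supplied. */
static int sketch_evaluate(unsigned long fileId, struct sketch_dir_entry **out)
{
    struct sketch_dir_entry *entry = malloc(sizeof(*entry));
    if (entry == NULL)
        return -1;
    entry->fileId = fileId;
    sketch_live_entries++;

    if (out != NULL) {
        *out = entry;            /* the caller now owns the entry */
    } else {
        free(entry);             /* nobody to receive it: release on completion */
        sketch_live_entries--;
    }
    return 0;
}
```

A caller that only wants the status passes NULL and has nothing to clean up afterwards.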
Rod Widdowson [Sat, 22 Oct 2011 14:00:03 +0000 (15:00 +0100)]
Windows: Defer deref of a directoryEntry
During the handling of SL_OPEN_TARGET opens (usually associated
with a rename) a directory entry was dereferenced prior to its
contents being used (to set up a seconding information field).
This change just holds on to the reference until after that processing.
Rod Widdowson [Fri, 21 Oct 2011 15:57:02 +0000 (16:57 +0100)]
Windows: Set new file index correctly during rename
Directory entries are required to have a file index which is used during
directory enumeration. When inserting into a new directory we have to
update this from the target directory.
This code fixes a bug whereby it was being set from the source FCB, rather
than the target one. On failure we now also reset the value to its old value.
allow cloning of any volume to any volume with same parent ID
remove checks to disallow cloning of ro volumes to rw volumes,
which allows cloning of any volume within the same parent ID
grouping, including allowing destruction of newer version of the
volumes.
remove check for disallowing clones of backup or ro volumes
removes the if-statement ensuring that the volume being cloned is
not a backup volume, nor a read-only volume. This allows clones
from any type of volume to a given volume. Parent volume meta-data
is maintained, only the cloneId value changes.
Andrew Deason [Mon, 29 Aug 2011 22:41:31 +0000 (17:41 -0500)]
DAFS: Remove VOL_SALVAGE_INVALIDATE_HEADER
Currently VRequestSalvage_r takes a flag,
VOL_SALVAGE_INVALIDATE_HEADER, which causes the header for the
specified volume to be freed (via FreeVolumeHeader). This is almost
never safe to do, since there may be other users of the specified
volume that can be accessing the volume header at the same time.
There is also no reason to invalidate the header at the time of the
VRequestSalvage_r call, since the header must be invalidated when we
detach the volume (other utilities may change header information). So,
if there are any problems in the future because we do not invalidate
the header at the time of VRequestSalvage_r, it is the fault of the
detachment/offlining logic.
So, remove VOL_SALVAGE_INVALIDATE_HEADER and all of its users. Take
this opportunity to correctly document the VRequestSalvage_r flags
in the VRequestSalvage_r comment, as it was previously missing the
VOL_SALVAGE_NO_OFFLINE flag.
Michael Meffie [Thu, 13 Oct 2011 16:23:35 +0000 (12:23 -0400)]
DAFS: fssync online requires a partition name argument
fssync-debug online silently fails when run without a partition name.
Check for the required partition name on the server side and the client
side. Report errors back to the client when the server side fails to
pre-attach the volume.
Andrew Deason [Tue, 11 Oct 2011 15:51:14 +0000 (10:51 -0500)]
volser: Remove ExtractVolId
volser was using its own function to extract a volume ID from a
filename string, and was using atol to do so. The ato* family of
functions can have problems with larger volume IDs, not to mention a
lack of error checking, so don't use it. Since we already have the
function VolumeNumber in the vol package to do the very same thing,
just use that instead.
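To illustrate the kind of checking that atol cannot provide, here is a hypothetical helper (not the VolumeNumber function, whose internals are not shown here) that parses a decimal 32-bit volume ID and reports malformed or out-of-range input instead of returning a guess.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: returns 0 and stores the ID in *id, or -1 on any error. */
static int sketch_parse_volume_id(const char *s, uint32_t *id)
{
    char *end;
    unsigned long long v;

    assert(s != NULL && id != NULL);
    if (*s < '0' || *s > '9')
        return -1;                 /* reject empty input, signs, and whitespace */
    errno = 0;
    v = strtoull(s, &end, 10);
    if (errno != 0 || *end != '\0' || v > UINT32_MAX)
        return -1;                 /* overflow, trailing junk, or too large for 32 bits */
    *id = (uint32_t)v;
    return 0;
}
```

IDs above 2^31 - 1 are the interesting case: they do not fit in a 32-bit long, which is what atol returns on some platforms.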
Andrew Deason [Mon, 3 Oct 2011 18:10:44 +0000 (13:10 -0500)]
viced: Check for HOSTDELETED in stillborn check
h_FindClient_r checks the connection rock for a client object twice.
First it sees if we already have a client object, and if we don't, we
effectively create one (or find a suitable one). Then we check again,
to see if someone else set the rock while we were creating a client
structure.
Currently, the first check checks if client->host->hostFlags has
HOSTDELETED set, but the second check does not. So, if the host
associated with the client has been deleted by someone else, currently
we will unnecessarily log a "stillborn client" message, and we will
continue to use the deleted host. If the host continues to be held by
someone, we will run into the same situation repeatedly on future
requests until all of the host references go away.
To fix this, also ignore HOSTDELETED clients when performing the
stillborn race check.
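A simplified model of the second check, with hypothetical types and an illustrative flag value; the real h_FindClient_r also deals with locking and reference counts, which are left out here.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_HOSTDELETED 0x10   /* illustrative bit value only */

struct sketch_host   { int hostFlags; };
struct sketch_client { struct sketch_host *host; };

/* A client found on the connection rock is usable only if its host is not deleted. */
static int sketch_client_usable(const struct sketch_client *client)
{
    return client != NULL &&
           client->host != NULL &&
           !(client->host->hostFlags & SKETCH_HOSTDELETED);
}

/* Race check: did someone else install a usable client while we built ours?
 * Returns the client to keep. */
static struct sketch_client *
sketch_resolve_race(struct sketch_client *rock, struct sketch_client *ours)
{
    assert(ours != NULL);
    if (rock != ours && sketch_client_usable(rock))
        return rock;    /* lost the race to a live client: ours is stillborn */
    return ours;        /* rock empty, or its host was deleted: ignore it */
}
```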
Andrew Deason [Fri, 14 Oct 2011 16:32:34 +0000 (11:32 -0500)]
vos offline: Bring volume back online for -busy
vos offline is supposed to bring a volume back online from "busy"
status before exiting, as volumes should not be in "busy" status for
extended periods of time. This was being enforced by requiring that
-sleep be specified; however, -sleep only results in the volume being
brought back online if a non-zero sleep time was specified. So, make
sure the volume is brought back online if -busy was specified.
Rod Widdowson [Sat, 22 Oct 2011 13:27:41 +0000 (14:27 +0100)]
Windows: Remove unused cleanup flag
In AFSOpenTargetDirectory the flag bRemoveShare was initialized
FALSE and never set TRUE. In teardown after failure some code
did listen to the flag, but the operation (IoRemoveShareAccess)
was not protected by the FCB mainlock which it should have been.
Rather than get the locking correct, just remove the flag entirely.
do set errors when we bomb out early
do not unlock and return early when we happen to do a correct zero
length read
do set errors the kernel can deal with if we're feeding a page routine
Marc Dionne [Mon, 24 Oct 2011 02:45:21 +0000 (22:45 -0400)]
dir: add missing return in DRead
A missing return in the kernel version of DRead causes the code to
think that no entry exists for a dir and proceed to allocate a new
one, if the entry is the third one in the hash chain.
If the existing entry is dirty, its contents are never written back,
and the pending changes to the directory are not seen by the client.
Simon Wilkinson [Sun, 23 Oct 2011 23:07:33 +0000 (19:07 -0400)]
rpm: Turn on debugging
Now that we build with a blank CFLAGS line, we need to make sure to
actually turn on debugging in the build system, so that our debuginfo
files are vaguely useful.
Simon Wilkinson [Sun, 23 Oct 2011 20:23:34 +0000 (21:23 +0100)]
rx: Define afs_kmutex_t for LWP too
afs_kmutex_t is used for lock definitions in the kernel, and in
pthreaded builds. LWP doesn't have any equivalent, and all structure
members using this type have to be protected with RX_ENABLE_LOCKS, which
starts to become untidy.
Just make afs_kmutex_t an int for LWP, so that we can simplify our
headers, at the expense of some additional storage on LWP builds (which
are going away at some point, anyway)
Simon Wilkinson [Sun, 23 Oct 2011 15:38:13 +0000 (16:38 +0100)]
dir: Don't leak a buffer on a failed Enumerate
If, for some reason, Enumerate encounters a hash object with a NULL
buffer pointer, that's no reason to leak the hash object. Make sure
that we DRelease it before failing.
Simon Wilkinson [Wed, 12 Oct 2011 13:50:18 +0000 (09:50 -0400)]
rx: ackall handling
If we ACKALL a stream, then we're sending a hard ACK for all of the
packets in the stream. We shouldn't send that hard ACK, and then a
load of soft ACKs for packets that don't actually exist.
Andrew Deason [Fri, 12 Aug 2011 19:50:26 +0000 (14:50 -0500)]
LINUX: Revert group changes on keyring failure
On Linux kernels that support keyrings, when we setpag we try to add
the PAG to the session keyring and to the supplemental group list.
Currently, if we fail to add the PAG to the keyring (which may happen
due to key quotas, or possibly other reasons), we return failure but
the group list is still modified with the new PAG in it.
Therefore, if the keyring-based approach fails, the new PAG may still
be in use, but there are no keyring keys associated with that PAG, so
the PAG may never get destroyed. This can cause a large number of PAGs
to accumulate over time, causing performance problems.
So, change this so that, in the event that keyring installation fails,
we revert the group list back to what it was before we touched it.
Also mark all unixusers with the new PAG as expired, in case one got
created during processing. Thus, the new PAG never gets used.
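The rollback can be sketched in user space as below. The credential struct, the fixed-size group array, and the failing "keyring" stand-in are all assumptions made for illustration, and the step that expires unixusers is omitted.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_GROUPS 8

struct sketch_cred {
    unsigned int groups[SKETCH_MAX_GROUPS];
    int ngroups;
};

/* Stand-in for keyring installation; fails on request (think: key quota exceeded). */
static int sketch_install_key(unsigned int pag, int should_fail)
{
    (void)pag;
    return should_fail ? -1 : 0;
}

static int sketch_setpag(struct sketch_cred *cred, unsigned int pag, int keyring_fails)
{
    struct sketch_cred saved;
    int code;

    assert(cred != NULL && cred->ngroups < SKETCH_MAX_GROUPS);
    saved = *cred;                           /* remember the list before touching it */
    cred->groups[cred->ngroups++] = pag;     /* add the PAG to the group list */

    code = sketch_install_key(pag, keyring_fails);
    if (code != 0)
        *cred = saved;                       /* keyring failed: undo the group change */
    return code;
}
```

On failure the caller sees both the error code and an untouched group list, so no orphaned PAG is left behind.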
Andrew Deason [Thu, 20 Oct 2011 21:57:14 +0000 (16:57 -0500)]
viced: Do not swallow errors on StoreData recovery
When we encounter any error in the StoreData fetch/store loop, we
reset the disk usage to ensure it remains correct, even in the face of
unexpected errors. However, when we do so, we use the errorCode from
VAdjustDiskUsage as our return value; if it is 0, we return success,
ignoring the error that got us in this code path in the first place.
Instead, keep track of a temporary errorCode for the disk usage
adjustment, and do not override our return value if there was no error
in the disk usage numbers.
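In outline, the recovery path now behaves like this sketch. The function names are hypothetical and the disk usage adjustment is reduced to a stand-in that returns whatever result it is given.

```c
#include <assert.h>

/* Stand-in for the disk usage adjustment; returns the code it is handed. */
static int sketch_adjust_disk_usage(int result)
{
    return result;
}

/* errorCode is the failure that sent us into the recovery path. */
static int sketch_recover(int errorCode, int adjustResult)
{
    int tmpCode;

    assert(errorCode != 0);
    tmpCode = sketch_adjust_disk_usage(adjustResult);
    if (tmpCode != 0)
        errorCode = tmpCode;   /* the adjustment itself failed: report that */
    return errorCode;          /* otherwise keep the original error */
}
```

Before the fix, the equivalent of sketch_recover(5, 0) would have returned 0, hiding the original failure from the caller.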
Simon Wilkinson [Sat, 22 Oct 2011 08:43:41 +0000 (09:43 +0100)]
opr: Move queue header out of util
Move the header which is installed as opr/queues.h out of util/ and
into the new, top level, opr/ directory. Similarly move the tests out
of the util/ test suite, and into the opr/ tests
Simon Wilkinson [Tue, 11 Oct 2011 00:01:26 +0000 (19:01 -0500)]
dir: Remove double release in FindBlobs
When DRead() fails, we DRelease the entrybuf, then break. However,
this break takes us to the end of the function, where we promptly
DRelease again, causing a double free.
Simon Wilkinson [Wed, 12 Oct 2011 17:04:28 +0000 (13:04 -0400)]
ukernel: add morepackets check in listener
Make the listener loop actually check for more packets needed,
like kernel, pthreads and lwp. Only checking for new packets every
20 seconds isn't sufficient on today's networks!
Simon Wilkinson [Wed, 12 Oct 2011 13:47:14 +0000 (09:47 -0400)]
rx: Don't clear the receive queue when out of packets
We can end up discarding a receive queue that's been soft acked,
effectively taking back soft acks we sent. Whilst the RX
documentation says that a client can drop soft acked packets at
will, our RX implementation assumes that if the final packet in
a call has been soft acked, we won't clear the queue. If a client
clears the queue in this situation, the call will hang.
What *should* happen is that we should take necessary locks,
confirm that we have not soft-acked all of the packets in a flow,
and then discard, or, if we're just going to discard, error the
call.
Andrew Deason [Thu, 14 Apr 2011 20:36:50 +0000 (15:36 -0500)]
auth: Get correct viceid in legacy GetToken
When ktc_GetTokenEx needs to get tokens via the legacy ktc_GetToken
interface, it was not extracting the viceid. Make it set the viceid so
the caller gets the correct id.
Normally this would require parsing the given client name. To reduce
the number of times we store and extract the viceid from the "AFS ID
%d" string, create a helper GetToken function that can store the
viceid directly, without storing it in a string.
Andrew Deason [Thu, 14 Apr 2011 20:05:37 +0000 (15:05 -0500)]
auth: Force correct evenness on rxkad tokens
Rxkad tokens historically have forced odd lifetimes when the given
viceid is actually an AFS ID, and even lifetimes when it is not. Force
this when the new token-handling functions are used (so the viceid is
correctly interpreted by users of the old token format), by creating
rxkad tokens with token_importRxkadViceId.
Slightly reworked by Simon Wilkinson to provide a generic token
destructor function.
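The parity convention can be illustrated with a small sketch. The token struct and helper are hypothetical, and which timestamp gets nudged is an arbitrary choice made here; this is not how token_importRxkadViceId is necessarily implemented.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct sketch_token { uint32_t startTime, endTime; };

/* Odd lifetime: the viceid is an AFS ID. Even lifetime: it is not. */
static void sketch_force_evenness(struct sketch_token *t, int viceid_is_afs_id)
{
    uint32_t lifetime;

    assert(t != NULL && t->endTime >= t->startTime && t->endTime < UINT32_MAX);
    lifetime = t->endTime - t->startTime;
    if ((lifetime & 1u) != (viceid_is_afs_id ? 1u : 0u))
        t->endTime++;   /* adjusting the end time is an assumption of this sketch */
}
```

A consumer of the old token format then only has to test the low bit of the lifetime to decide how to interpret the viceid.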
Simon Wilkinson [Mon, 10 Oct 2011 22:19:13 +0000 (17:19 -0500)]
docs: Refer to dafs binaries by their real names
(Most of) the dafs binaries are called da(something). Update the
example in the dafileserver documentation so that we call the binaries
by the names that they are actually installed with on the system.
Simon Wilkinson [Mon, 10 Oct 2011 21:09:40 +0000 (22:09 +0100)]
ptserver: Don't check for noauth before rebuilding
The ptserver database building scripts would check to see if the server
was running from a bosserver with the noauth flag set before performing
a database rebuild.
This means that you can't start ptserver normally, and then configure
the database using pts -localauth, which is the preferred method for
configuring new cells.
Remove the check for noauth. This is slightly risky, as it means that a
corrupt database could be completely erased upon restart. However, we
already check that the dbheader (65k) is entirely blank - which will
protect us against any single page corruption errors.
Ben Kaduk [Sun, 23 Oct 2011 15:22:07 +0000 (11:22 -0400)]
FBSD: typo fix
Gerrit/5572 added conditionals on __FreeBSD_version >= 900044, which
is (approximately) when a bunch of kernel API renames happened.
(There has since been a dedicated version bump to 900045 a month
or two post-facto, but 900044 should be fine for now.)
However, 900044 is not 90004.
Rod Widdowson [Wed, 12 Oct 2011 10:04:33 +0000 (11:04 +0100)]
Windows AFSRDR: Log before decrementing refcount
The library support package keeps count of the number of times
the library code is active. When this goes to zero this means
that unload of the library can continue.
Although I cannot see it in the code, it seems reasonable to assume
that at that stage the device object might go away (and if it
doesn't now, it may in the future). This potentially renders it
unsafe to do anything after InflightLibraryEvent has been signalled.
This patch moves the logging up to above the decrement of the refcount.
Hartmut Reuter [Wed, 5 Oct 2011 14:06:05 +0000 (10:06 -0400)]
vol_split: avoid using stale open directory vnodes
We could, in case of multiple splits, end up using a stale open
vnode for a directory; attempt to close and thus force-reopen
any fdhandles backing ihandles.
Ben Kaduk [Sat, 8 Oct 2011 21:16:26 +0000 (17:16 -0400)]
FBSD: deal with kernel API rename
Upstream decided to rename the kernel functions that implement
syscalls to have a sys_ prefix (including afs3_syscall!).
We use a couple of them, so we need to conditionalize accordingly.
Unfortunately, __FreeBSD_version was not bumped with the change,
so we use something close to it and hope it's close enough.
Jeffrey Altman [Sat, 8 Oct 2011 08:01:07 +0000 (10:01 +0200)]
Correct Heimdal conversion of libadmin/adminutil
Patchset 4251e386aa25bb3fc02fa255e92327fffc8b954d converts to
using Heimdal. The conversion undid the introduction of the
abstraction function fetch_krb5_error_message() which is
implemented in src/util. Restore the use of fetch_krb5_error_message()
and modify src/util/krb5_nt.c to use the Kerberos Compat SDK
interface.
Andrew Deason [Tue, 12 Apr 2011 22:47:51 +0000 (17:47 -0500)]
tsm41: Add options for uidpag and localuid
Add runtime options to aklog_dynamic_auth. Commit
3a541eb11d1bc7bd05b85635315214218d3b5d6f changed the behavior of
aklog_dynamic_auth to be more friendly to the CDE screenlocker, but
forced the use of UID-based PAGs.
Since some users like to use real PAGs and don't care about the CDE
screenlocker, make this behavior a runtime decision instead.
Jeffrey Altman [Sat, 1 Oct 2011 18:05:31 +0000 (14:05 -0400)]
Windows: Explorer Shell Extension enhancements
Redesign the AFS Volume Tab to report:
. Volume name
. Volume ID
. Cell
. Server
. Availability
. Quota
. Partition Info
. Replica Server List
Properly handle multiple selections to report the volume info
of the parent object and not the actively selected object.
When a mount point is selected, display the volume information
for the target volume.
Remove file server from AFS tab.
Modify the AFS tab to better handle multiple selections including
mount points.
Extend many gui2fs functions to implement a poor man's "follow"
option. This really should be done with the pioctl 'literal'
capability but this is an improvement. The pioctl modifications
will require a major redesign of gui2fs.c and all of the dialogs.
Andrew Deason [Thu, 29 Sep 2011 17:14:15 +0000 (12:14 -0500)]
Remove a few extra trailing backslashes
In a few different places, moving libutil before libafshcrypto_lwp
caused a variable definition to have a trailing \ on the last line of
the definition. This can confuse make (at least, the HP-UX make) into
thinking the following definition is also part of the current
definition. Remove the trailing "\"s.
Andrew Deason [Wed, 28 Sep 2011 20:02:48 +0000 (15:02 -0500)]
vol: Only check "logging" on vice partitions
We don't care about non-vicepX partitions, so move part of the UFS
"logging" check into VCheckPartition. This API should probably be redone
so the "am I a vicepX partition" check is done completely separately,
but for now, this will do.
Rod Widdowson [Thu, 29 Sep 2011 14:34:48 +0000 (15:34 +0100)]
FSSYNC-Client: Consistent use of partition name
Over time the FSSYNC code has collected examples where the partition
path is passed rather than the partition name. On Unix these are the
same (/vicepX), but on Windows the path is the DOS device (C:).
This checkin changes FSSYNC client code to always use the partition
name.
This checkin does not address FSSYNC server or SALVSYNC.
Andrew Deason [Thu, 29 Sep 2011 19:49:53 +0000 (14:49 -0500)]
DAFS: Do not serialize state for invalid hosts
When we serialize host information for DAFS during shutdown, we have
no guarantee that the host is in a valid state when we look at it.
This can result in a host being saved to disk when we are waiting for
the host to respond to an RPC, and so the information about the host
is invalid. For example, we can save a host that has the
HWHO_INPROGRESS flag set, and when it is restored later, this can
cause odd behavior since the flag is set but no thread is actually
waiting for the host to respond.
So instead, during state serialization, try to determine if a host may
be in an invalid state, and simply skip the host if it may.
Andrew Deason [Thu, 29 Sep 2011 21:04:54 +0000 (16:04 -0500)]
DAFS: Skip hosts with invalid flags on restore
Host entries with HWHO_INPROGRESS set or ALTADDR unset do not have
valid state, since those flags indicate that the fileserver was in the
middle of identifying the host when the host struct was serialized.
Skip entries from the on-disk host data that have such invalid flags
set when restoring state, so we do not load invalid data.
Andrew Deason [Thu, 29 Sep 2011 20:22:35 +0000 (15:22 -0500)]
DAFS: Add explicit 'valid' field for index maps
The CB, FE, and host serialization structures were just using the
relevant indices to determine whether or not an entry mapping an old
index to a new index was populated with actual data. For host
structures, this really isn't sufficient, since our index can be 0,
and the structure is calloc'd, so the index in the structure could
also be 0.
Add a flag explicitly stating whether or not the structure has been
filled in, to make this unambiguous.
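The ambiguity and its fix can be shown with a hypothetical map entry; the real serialization structures differ. With a calloc'd table, a stored index of 0 is indistinguishable from "never filled in" unless a separate flag records it.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical map entry from an old (on-disk) index to a new (in-memory) index. */
struct sketch_idx_map {
    unsigned int new_idx;
    unsigned int valid;    /* nonzero only once the entry has been filled in */
};

static void sketch_map_set(struct sketch_idx_map *map, size_t old_idx, unsigned int new_idx)
{
    assert(map != NULL);
    map[old_idx].new_idx = new_idx;
    map[old_idx].valid = 1;
}

/* Returns 0 and stores the new index, or -1 if the entry was never populated. */
static int sketch_map_get(const struct sketch_idx_map *map, size_t old_idx,
                          unsigned int *new_idx)
{
    assert(map != NULL && new_idx != NULL);
    if (!map[old_idx].valid)
        return -1;
    *new_idx = map[old_idx].new_idx;
    return 0;
}
```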
Marc Dionne [Thu, 29 Sep 2011 01:15:32 +0000 (21:15 -0400)]
rx: add post RPC procedure capability
Add the ability to specify a procedure that will be called after
the end of each RPC for a service. This is similar to the
existing afterProc, except that it gets called after the RPC
has ended (after EndCall).
rx_SetPostProc and rx_GetPostProc are provided to set and retrieve
a postProc for a specified service.
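The intended ordering can be modelled as follows. This is a toy model with hypothetical names and signatures; it does not reproduce the actual rx_SetPostProc or rx_GetPostProc prototypes, which are not given here.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*sketch_proc)(int call_id, int code);

struct sketch_service {
    sketch_proc afterProc;  /* runs before the call is ended */
    sketch_proc postProc;   /* runs after the call has ended */
};

static char sketch_trace[8];      /* records the order of events */
static size_t sketch_trace_len;

static void sketch_note(char c)
{
    assert(sketch_trace_len < sizeof(sketch_trace) - 1);
    sketch_trace[sketch_trace_len++] = c;
}

static void sketch_after(int call_id, int code) { (void)call_id; (void)code; sketch_note('A'); }
static void sketch_post(int call_id, int code)  { (void)call_id; (void)code; sketch_note('P'); }

static void sketch_set_post_proc(struct sketch_service *svc, sketch_proc proc)
{
    svc->postProc = proc;
}

static sketch_proc sketch_get_post_proc(const struct sketch_service *svc)
{
    return svc->postProc;
}

/* One RPC: execute, afterProc, end the call, then postProc. */
static void sketch_run_call(struct sketch_service *svc, int call_id)
{
    int code = 0;                 /* pretend the request executed successfully */
    sketch_note('X');             /* X = request executed */
    if (svc->afterProc)
        svc->afterProc(call_id, code);
    sketch_note('E');             /* E = stand-in for ending the call */
    if (svc->postProc)
        svc->postProc(call_id, code);
}
```

Running one call with both procedures installed leaves the trace X, A, E, P, which is the ordering the description above asks for.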
Unlike the afs_set_acl_dlg the PropACL sheet only uses a single
ComboList to maintain both the positive and negative ACEs but
uses two CStringArrays to separately store the positive and
negative ACEs. Two entries in each array are used to store
an ACE. The %2==0 entry is the pts name and the %2==1 entry is
the permission list. This needs to be taken into account when
manipulating the negative entries since the array count for the
normal entries is twice the number of ACEs.
Negative entries were prefixed with '=' instead of '-'.
The Remove button was not hooked up and was not enabled or disabled
under all appropriate conditions.
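The paired-array layout described in this change can be illustrated in plain C (the dialog itself uses CStringArrays). The helper names are hypothetical; the point is only that the number of ACEs is half the array count.

```c
#include <assert.h>
#include <stddef.h>

/* Entries come in pairs: even slot = pts name, odd slot = permission list. */
static size_t sketch_ace_count(size_t array_len)
{
    assert(array_len % 2 == 0);
    return array_len / 2;
}

static const char *sketch_ace_name(const char *const *arr, size_t ace)
{
    return arr[2 * ace];
}

static const char *sketch_ace_rights(const char *const *arr, size_t ace)
{
    return arr[2 * ace + 1];
}
```

Code that locates a negative entry in the combined list has to offset by sketch_ace_count() of the normal array, not by its raw element count.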