Andrew Deason [Fri, 6 Jul 2012 21:37:39 +0000 (16:37 -0500)]
Linux: Make dir dentry aliases act like symlinks
Currently, we try to invalidate other dentries that exist for a
particular dir inode when we look up a dentry. This is so we try to
avoid duplicate dentries for a directory, which Linux does not like
(you cannot have hardlinks to a dir).
If we cannot invalidate the other aliases (because they are being
used), right now we just return the alias. This can make it very easy
to panic the client, due to the sanity checks Linux performs when doing
things like 'rmdir'. If we do something like this:
For the 'rmdir', we will lookup 'mtpt2'. Since 'mtpt' and 'mtpt2'
are mountpoints for the same volume, their dentries point to the same
directory inode. So when we lookup 'mtpt2', we will try to invalidate
the other dentry, but we cannot do that since it is the cwd. So we
return the alias dentry (for 'mtpt'). The Linux VFS layer then does a
sanity check for the rmdir operation, checking that the child dentry's
parent inode is the same as the inode we're performing the rmdir for.
Since the dentry we returned was for 'mtpt', whose parent is 'dir1',
and the actual dir we're performing the rmdir for is 'dir2', this
sanity check fails and we BUG.
To avoid this, make the dentry alias act like a symlink when we
encounter an uninvalidateable dentry alias. That is, we allow multiple
dentry aliases for a directory; however, when the dentry aliases are
actually used, we redirect to a common dentry (via d_automount where
possible, and follow_link elsewhere).
This means that such mountpoints will behave similarly to symlinks, in
that we 'point' to a specific mountpoint dentry. This means that if we
have multiple different ways to get to the same volume, and all are
accessed at the same time, all but one of those mountpoints will
behave like symlinks, pointing to the same mountpoint. So, the '..'
entries for each path will all point to the parent dir of one
mountpoint, meaning that the '..' entry will be "wrong", but for most
cases it will still be correct.
In order to try to make the 'target', pointed-to directory consistent,
we add a new field to struct vcache: target_link. This points to the
dentry we should redirect to, whenever that vcache is referenced. To
avoid (possibly not-feasibly-solvable) problems with refcounting, this
pointer is not actually a reference to the target dentry, but just
serves as a pointer to compare to.
afs_server: delete code that has been ifdef'ed out for years
The comments in afs_SetServerPrefs() said "clean up, delete this".
The oldest one is a decade old. Removing these #ifdefs will make
following the rest of the spaghetti #ifdefs a bit easier.
Garrett Wollman [Tue, 9 Aug 2011 04:28:27 +0000 (00:28 -0400)]
libafs: afs_CacheFetchProc can't be called without a dcache pointer
An inspection of the only call site suggests that afs_CacheFetchProc()
can't be called with a null dcache pointer, and code further down
in this function dereferences adc unconditionally (assuming
rxfs_fetchInit() doesn't crash first) so remove the conditional
here.
Probably more of these parameters can and should be included in the
AFS_NONNULL.
OpenAFS does not have separate distributions for the United States
and the rest of the world. Nor are there any restrictions on the
capabilities of the Update Server.
volser: restructure GetNextVol and clients to remove duplicate code
There are several odd-looking but stylized loops involving GetNextVol()
which can be radically simplified if only GetNextVol() would return
a meaningful value. Move all of the code that skips non-volume-header
files in the directory into GetNextVol and have it return a truth value
(instead of always returning zero) that indicates whether it saw
something that looks like a volume header. Then all the odd while
loops and strcmps just collapse into while(GetNextVol(...)).
GetNextVol() had external scope, but there are no callers in the
tree that use it outside of volprocs.c, and it's not part of a
public library interface, so make it static.
While here, don't strcmp() past the end of a filename that begins with
'V' but is too short to be a valid volume name.
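The loop shape described above can be sketched as follows. This is a
minimal illustration, not the volprocs.c code: the names
looks_like_vol_header, next_vol_sketch and count_vol_headers are
hypothetical, the directory is modelled as an array of names, and the
header name layout ('V', ten digits, ".vol") is an assumption made
for the sketch.

```c
#include <stddef.h>
#include <string.h>

/* Assumed layout for this sketch only: 'V' + 10 digits + ".vol". */
#define SKETCH_PREFIX_LEN 11          /* 'V' plus ten digits */
#define SKETCH_HDR_EXT    ".vol"

/* Return 1 if name looks like a volume header file, else 0.
 * The length is checked first so we never read past a short name. */
static int
looks_like_vol_header(const char *name)
{
    if (name[0] != 'V')
        return 0;
    if (strlen(name) < SKETCH_PREFIX_LEN)
        return 0;
    return strcmp(name + SKETCH_PREFIX_LEN, SKETCH_HDR_EXT) == 0;
}

/* Hypothetical stand-in for GetNextVol(): advance *pos through names[]
 * until a header-like entry is found. Returns 1 and sets *out on
 * success, 0 when the listing is exhausted. */
static int
next_vol_sketch(const char *const *names, size_t n, size_t *pos,
                const char **out)
{
    while (*pos < n) {
        const char *name = names[(*pos)++];

        if (looks_like_vol_header(name)) {
            *out = name;
            return 1;
        }
    }
    return 0;
}

/* Callers collapse to a plain while loop. */
static size_t
count_vol_headers(const char *const *names, size_t n)
{
    size_t pos = 0, count = 0;
    const char *name;

    while (next_vol_sketch(names, n, &pos, &name))
        count++;
    return count;
}
```

Because the skipping logic lives in one place, each caller only has
to handle entries that already passed the name check.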
afscp: avoid null dereference in _GetSecurityObject error case
Handle the possible error return from krb5_get_host_realm in the
same way as the other error cases (using an anonymous security
object); otherwise "realm" would be left null.
Andrew Deason [Thu, 26 Jul 2012 21:40:03 +0000 (16:40 -0500)]
LINUX: Hold GLOCK for proc traversal
The functions that traverse unixuser structures for display via /proc
(uu_start et al) call various libafs functions, hold and release locks,
etc. To do any of that, we need GLOCK. Amongst other issues, we can
panic if we try to acquire a contested lock without GLOCK, since we
assert glock is held when we sleep for the lock or try to wake other
waiters. The same goes for the legacy CellServDB proc file.
rx: protect against ACKs with serial as prevPacket
patchset 4e71409fe1305cde4b9b341247ba658d8d24f4d0 introduced a
check in rxi_ReceiveAckPacket for out of order ack packets which
relied upon the value of the previousPacket field. Unfortunately,
some versions of RX store the previous packet's serial number in
the field instead of previous packet's sequence number. Modify
the check to only discard out of order ACKs if the previousPacket
sequence number is within the valid window.
patchset 1f0cf8b2b4bb6e36d8d82323a15ced72d91db0ec tested for
an empty queue but what is really required is a test for end of
queue after the queue_Scan(). If the queue_Scan() completes
at the end of the queue, in other words, pointing at the list
head, then return NULL because no match was found.
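A minimal sketch of the end-of-queue test, assuming a simplified
circular queue with a sentinel head. The types q_sketch and
stat_sketch and the function find_stat_sketch are hypothetical
stand-ins; the real rx queue macros and rxi_FindRpcStat() differ.

```c
#include <stddef.h>

/* Minimal circular doubly-linked queue with a sentinel head; a
 * simplified stand-in for the rx queue macros, not the real ones. */
struct q_sketch {
    struct q_sketch *next;
    struct q_sketch *prev;
};

struct stat_sketch {
    struct q_sketch q;      /* must be first so the cast below is valid */
    int id;
};

static void
q_init(struct q_sketch *head)
{
    head->next = head->prev = head;
}

static void
q_append(struct q_sketch *head, struct q_sketch *elem)
{
    elem->prev = head->prev;
    elem->next = head;
    head->prev->next = elem;
    head->prev = elem;
}

/* Scan for id. When the loop runs off the end, the cursor points at
 * the head (never NULL), so test for that explicitly. This also
 * covers the empty queue, where the cursor starts at the head. */
static struct stat_sketch *
find_stat_sketch(struct q_sketch *head, int id)
{
    struct q_sketch *cur;

    for (cur = head->next; cur != head; cur = cur->next) {
        if (((struct stat_sketch *)cur)->id == id)
            break;
    }
    if (cur == head)
        return NULL;        /* end of queue: no match */
    return (struct stat_sketch *)cur;
}
```

Testing the cursor after the scan handles both the empty queue and
the non-empty queue with no match, which a test for emptiness alone
does not.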
Andrew Deason [Wed, 9 May 2012 23:45:51 +0000 (18:45 -0500)]
vos: Minimize release impact for new RO sites
Currently, if a new RO site is added with 'vos addsite', the only way
to populate the new site with data is a 'vos release' (excepting hacks
using 'vos restore' and 'vos addsite -live', etc). Due to safeguards
in 'vos' ensuring that RO sites always all contain the same data when
marked as up-to-date in the VLDB, such a release always incurs some
amount of data to be transmitted to all sites, as well as remote sites
being brought offline briefly, even when the RW data has not changed
in a very long time.
To alleviate this situation, make 'vos release' detect if new,
unpopulated RO sites have been added, and if the RW volume has not
changed since the release of any existing RO sites. If both of these
conditions are true, do not update any of the existing sites, but only
transmit volume data to the sites that did not already contain RO
volumes.
tabular_output: don't leak table struct on error exit
The caller is almost certainly going to exit when we return, but
all the same, don't leak the table description structure in the
error exit. Makes the static analyzer happier.
afsdump_extract: clarify logic to avoid freeing local buffer
Sometimes vnodepath is set to a local buffer. Sometimes it is set
to malloc'ed storage. Simplify the logic for freeing vnodepath
by checking explicitly for this condition rather than the state
of other variables. As a bonus, avoids a false (?) positive from
the static analyzer.
Michael Meffie [Fri, 22 Jun 2012 03:44:31 +0000 (23:44 -0400)]
vlserver: always use the hostaddress table in GetAddrsU
Use the hostaddress (IpMappedAddr) table when looking up hosts by IP
address and when listing addresses by index, instead of accessing
the multi-homed extensions directly.
The existing vos client calls the old GetAddrs rpc to first retrieve
a count of the number of addresses expected. This count is the
number of addresses in the hostaddress table. If there are
unreferenced entries in the mh extension blocks, then vos can return
an incorrect or incomplete list of addresses.
To be consistent with the rest of the host address processing, use
the hostaddress table in GetAddrsU to lookup hosts by index or by IP
address.
The hostaddress table is already used when looking up addresses by
UUID.
afs_conn: make release_conns_vector() actually work
release_conns_vector must never have been called before with
a non-null parameter, because it could not possibly work.
The first line of the loop is a null pointer dereference, and
if that were fixed, there's also a modify-after-free bug.
It's not clear what the old version was trying to do; this
version makes a stab at doing something sensible but might be
less than required. (Note that this would be much simpler if
converted to queue(3) macros or a similar standard linked-list
data structure.)
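For illustration, the two hazards named above (dereferencing a null
pointer and touching a node after it has been freed) are avoided by
the usual pattern below. This is a generic sketch with a hypothetical
node_sketch type, not the afs_conn structures or the code in this
change.

```c
#include <stdlib.h>

/* Hypothetical singly linked node; the real structure differs. */
struct node_sketch {
    struct node_sketch *next;
    int payload;
};

/* Free every node in the list and return how many were freed.
 * A NULL list is fine, and the next pointer is saved before free()
 * so freed memory is never read or written. */
static int
release_list_sketch(struct node_sketch *head)
{
    int freed = 0;

    while (head != NULL) {
        struct node_sketch *next = head->next;  /* save before free */

        free(head);
        head = next;
        freed++;
    }
    return freed;
}

/* Helper for trying the sketch: push a new node on the front. */
static struct node_sketch *
push_sketch(struct node_sketch *head, int payload)
{
    struct node_sketch *n = malloc(sizeof(*n));

    if (n == NULL)
        return head;
    n->next = head;
    n->payload = payload;
    return n;
}
```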
kauth: ka_CellToRealm's "realm" parameter cannot be null
Annotate ka_CellToRealm with AFS_NONNULL to indicate that its
"realm" parameter cannot be null; it does not make sense to call
this routine without this parameter. (The static analyzer inlines
the call to ka_ExpandCell and concludes that "realm" might be null;
the annotation will prevent that and avoid a false positive.)
Andrew Deason [Wed, 25 Jul 2012 20:48:34 +0000 (15:48 -0500)]
crypto: Use our strcasecmp in kernel
A few pieces of heimdal we use in the kernel call strcasecmp
(hcrypto/evp.c, krb5/crypto.c). The strcasecmp function does not exist
in all kernels (specifically, it does not exist in at least Linux 2.4,
2.6.9, and probably not on Solaris pre-10). Since we have our own copy
of strcasecmp (called afs_strcasecmp), just use that for now.
Ideally we would have some kind of configure test for detecting the
presence of the function in the kernel, and use the roken
implementation when it is absent. We do not have the framework for
either of those in place at the moment, though, so just get by with
this for now.
Change-Id: Ia96b17596da6cb168c80c92486fa049c05205da4
Reviewed-on: http://gerrit.openafs.org/7881
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Michael Meffie <mmeffie@sinenomine.net>
Tested-by: Michael Meffie <mmeffie@sinenomine.net>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
ptuser: avoid implementation-defined behavior in CreateIdList()
CreateIdList() is an internal subroutine of pr_IDListExpandedMembers(),
used to flatten a hash table of protection IDs into an array that can
be passed to pr_IdToName(). If for some reason the hash table had no
entries, it would call malloc(0) and, depending on how the
implementation defines this, either return a PRNOMEM error (wrong!)
or else allocate a minimum-sized buffer which pr_IdListExpandedMembers
would then promptly leak. Compromise between the two behaviors by
not allocating any memory in this case but returning success, and in
the caller check for an empty list and avoid the pointless RPC to
translate no IDs into no names. pr_IDListExpandedMembers() will return
success, as it previously did in the non-PRNOMEM case.
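The empty-list handling can be sketched roughly as below. The type
idlist_sketch and the function create_idlist_sketch are hypothetical
and take a plain array instead of a hash table; the point is only
that the zero-entry case returns success without calling malloc().

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical ID list for this sketch. */
struct idlist_sketch {
    size_t len;
    int *val;
};

/* Copy n IDs into out. Returns 0 on success, -1 on allocation
 * failure. For n == 0 no memory is requested at all, so the result
 * does not depend on what malloc(0) returns. */
static int
create_idlist_sketch(const int *ids, size_t n, struct idlist_sketch *out)
{
    out->len = 0;
    out->val = NULL;
    if (n == 0)
        return 0;               /* empty list is success, not an error */

    out->val = malloc(n * sizeof(*out->val));
    if (out->val == NULL)
        return -1;
    memcpy(out->val, ids, n * sizeof(*out->val));
    out->len = n;
    return 0;
}
```

A caller can then test len == 0 and skip the name lookup entirely,
with nothing to free afterwards.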
kauth: don't call lcstring with a null source argument
This code was probably never executed, but now that lcstring() has
an AFS_NONNULL annotation, the static analyzer indicates the
potential null-pointer-dereference.
Andrew Deason [Wed, 25 Jul 2012 15:45:16 +0000 (10:45 -0500)]
rx: Raise minimum Linux atomics version to 2.6
Linux 2.4 does not have atomic_dec_return. If we switch to a
dec_and_test-like API, then we could use the Linux 2.4 atomics. But
for now, just raise the minimum to 2.6, and for 2.4 and below just use
the generic atomics implementation so we can build.
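To illustrate the difference between the two API shapes mentioned
above, here is a portable C11 analogue. It uses <stdatomic.h>, not
the Linux kernel atomics, and the function names are hypothetical.

```c
#include <stdatomic.h>

/* Portable C11 analogue, not the Linux kernel API. */

/* "dec_return" style: decrement and hand back the new value. */
static int
dec_return_sketch(atomic_int *v)
{
    return atomic_fetch_sub(v, 1) - 1;
}

/* "dec_and_test" style: decrement and report only whether the
 * counter reached zero. */
static int
dec_and_test_sketch(atomic_int *v)
{
    return atomic_fetch_sub(v, 1) == 1;
}
```

Code that only needs to know when a reference count hits zero can be
written against the second shape, which asks less of the platform
primitive.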
afs_bypasscache: parameters of afs_ReadNoCache can't be null
The first two parameters of afs_ReadNoCache() are unconditionally
indirected through, and all existing callers appear to guarantee
that these parameters are in fact non-null, so annotate the function
declaration to so indicate, and remove the one test that checks
whether avc (the first parameter) is null. I suspect that acred
cannot be null either, but this code does not appear to depend on
that, so it's not included in the non-null annotation.
It is important that down servers be detected as soon as possible.
When it is not possible to perform a blocking probe, perform a
probe in a background thread.
ptuser: pr_SNameToId/SIdToName: if RPC response empty, force error
If the prserver returns an empty response to ubik_PR_NameToID
or ubik_PR_IDToName, but doesn't otherwise give an error,
force a PRINTERNAL error return so that the client knows that the
return parameter was not updated. Existing callers seem to
expect this, as pr_SNameToId is often called without initializing
the variable which receives the result and checking only for the
error code.
Simon Wilkinson [Mon, 16 Jul 2012 19:09:04 +0000 (20:09 +0100)]
auth: Fix GetTokenEx with NULL cellName
If GetTokenEx is called with a NULL cellName, it means use the
local cell. To do this with the legacy interface, a 0 length string
must be used for the cell instance of the ktc_principal passed to
GetToken. Fix this so that we do so, rather than attempting to
strcpy(..., NULL) which never ends well.
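A minimal sketch of the NULL handling, using a hypothetical
principal_sketch structure and an illustrative buffer size rather
than the real ktc_principal definition:

```c
#include <string.h>

#define SKETCH_MAXLEN 64    /* illustrative size, not the real limit */

/* Hypothetical stand-in for the legacy principal structure. */
struct principal_sketch {
    char cell[SKETCH_MAXLEN];
};

/* Fill in the cell field. A NULL cell_name means "local cell", which
 * the legacy interface expresses as a zero-length string. */
static void
set_cell_sketch(struct principal_sketch *p, const char *cell_name)
{
    if (cell_name == NULL) {
        p->cell[0] = '\0';
        return;
    }
    strncpy(p->cell, cell_name, sizeof(p->cell) - 1);
    p->cell[sizeof(p->cell) - 1] = '\0';
}
```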
When queue_Scan is executed on an empty queue the queue element
variable, in this case 'rpc_stat' is the queue head, _RXQ(q),
and not NULL. Callers of rxi_FindRpcStat() expect NULL on failure
to find or create an rx_interface_stat object. Correct the behavior
by testing for an empty queue and return NULL immediately if the
queue is empty and the caller is not requesting creation.
Simon Wilkinson [Wed, 7 Sep 2011 17:31:32 +0000 (18:31 +0100)]
afsd: Tidy up system calls
Tidy up the way that we do system calls from afsd, by making
afsd_syscall a va_arg function, using a structure to pass system call
information around, and simplifying the #ifdef ladder that converts our
platform independent system calls into something platform specific.
This fixes all of the warnings in afsd which required the -Wno-error
option; the only warnings remaining are related to daemon being
deprecated on Darwin.
There's no need for type-punning here; usr_getspecific() is a macro
that just assigns to the variable whose address we provide, so the
cast was just unnecessary (and erroneous) obfuscation. This is the
only caller of usr_getspecific(), so if it needs to be more complex
in the future, it should probably just be open-coded here.
strcompose: NULL must always be cast when passed to a variadic function
The C standard allows NULL to be defined as a bare "0", which will
be passed to variadic functions as an int. If the function expects
a pointer type, demons fly out of your nose. strcompose() is such
a function, so make sure that all of its callers cast NULL appropriately.
(None of them did.) This may be an opportune time to change all of
the callers to spell it opr_strcompose() as well, and avoid using a
reserved identifier, but this change does not do so.
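The calling convention at issue can be sketched with a small variadic
helper. compose_sketch is a hypothetical function written for this
illustration, not the strcompose() implementation; it reads char
pointers until it sees a null one, which is why the terminator must
be passed with pointer type.

```c
#include <stdarg.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical variadic concatenation helper in the spirit of
 * strcompose(); the argument list ends with a null char pointer.
 * Returns buf, or NULL if the result would not fit. */
static char *
compose_sketch(char *buf, size_t len, ...)
{
    va_list ap;
    const char *s;
    size_t used = 0;

    if (buf == NULL || len == 0)
        return NULL;
    buf[0] = '\0';

    va_start(ap, len);
    while ((s = va_arg(ap, char *)) != NULL) {
        size_t n = strlen(s);

        if (n >= len - used) {      /* no room for s plus the NUL */
            va_end(ap);
            return NULL;
        }
        memcpy(buf + used, s, n + 1);
        used += n;
    }
    va_end(ap);
    return buf;
}
```

A correct call looks like
compose_sketch(buf, sizeof(buf), "ab", "cd", (char *)NULL).
With a bare NULL that expands to 0, va_arg would be asked to read a
pointer where an int was passed.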
opr: constify various string functions and mark them AFS_NONNULL()
All of these string functions require at least one non-null argument.
Mark them as AFS_NONNULL() so that the compiler and static checker can
find erroneous uses. The "source" arguments of lcstring and ucstring
can be const, so do so. (This doesn't affect anything in the tree
right now.) While here, note a few unfixed issues with these interfaces.
xdr: fix two old FIXMEs related to signed/unsigned arithmetic
It's implementation-defined whether the C '>>' operator, when
applied to a signed integer, is sign-extending or zero-filling.
If you want unsigned arithmetic, you have to ask for it explicitly.
One assumes the reason for the shift is to avoid overflow if the
returned size/count is later converted to a signed int, in which
case maybe it would be better to use INT_MAX here. This is the
minimal change necessary for correctness.
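As a small illustration of asking for unsigned arithmetic explicitly
(the function names are hypothetical and this is not the xdr code
itself):

```c
#include <limits.h>

/* Half of the unsigned range, computed with unsigned arithmetic so
 * the shift is guaranteed to zero-fill. Written as (~0 >> 1), the
 * operand would be a negative signed int and the result would be
 * implementation-defined. */
static unsigned int
max_count_sketch(void)
{
    return ~0u >> 1;
}

/* The alternative floated above: compare against INT_MAX directly. */
static int
fits_in_int_sketch(unsigned int n)
{
    return n <= (unsigned int)INT_MAX;
}
```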
If there are extents in the list with a non-zero ActiveCount,
those extents will be skipped and the list 'le' will never
become empty. Add an additional condition to ensure that the
loop is only executed once for each extent in the list.
Replace AFSExFreePool() with AFSExFreePoolWithTag() which is
a wrapper around both ExFreePool() and ExFreePoolWithTag().
If a 'Tag' value is provided, ExFreePoolWithTag() is used.
Otherwise, ExFreePool() is used.
Specify allocation tag values wherever possible. Path name buffer
tags are not specified because they are allocated using multiple
tags. The same is true for network provider string buffers.
This is being done in order to debug a memory corruption issue.
Warning: this is a change to the AFSRedir->AFSRedirLib interface
and therefore both drivers must be updated with a reboot and
not simply restarting the service.
rx: rxi_ReceiveDataPacket do not set rprev on drop
In KERNEL builds if there are no available packet buffers the
new packet is dropped on the floor. In that case, the call's
rprev field should not be updated because the packet was never
"received" for delivery to the application.
Remove a dead comment from the same block of code.
Windows: avoid memory overrun during extent release
While tearing down extents, if an extent is found to be in use
it will be skipped. Must use 'ulReleaseCount' as the index
into the released extent array.
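The indexing rule can be sketched in portable C as below. The
extent_sketch type and collect_released_sketch function are
hypothetical and stand in for the driver's structures; only the use
of a separate release counter as the output index reflects the text
above.

```c
#include <stddef.h>

/* Hypothetical extent record for this sketch. */
struct extent_sketch {
    int id;
    int active_count;   /* non-zero means the extent is in use */
};

/* Copy the ids of releasable extents into released[] and return how
 * many were written. The output is indexed by its own counter, not by
 * the input index, so skipped extents leave no gaps and the write
 * never runs past the number of extents actually released. */
static size_t
collect_released_sketch(const struct extent_sketch *ext, size_t n,
                        int *released, size_t max_released)
{
    size_t i, release_count = 0;

    for (i = 0; i < n && release_count < max_released; i++) {
        if (ext[i].active_count != 0)
            continue;                   /* in use: skip it */
        released[release_count++] = ext[i].id;
    }
    return release_count;
}
```

Indexing the output by the input position instead would write past
the end of an array sized for the released extents as soon as one
in-use extent had been skipped.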
Originally, the first store to "code" was dead here. Refactor the
error exits to follow the non-error exit path, which has the effect
of making the store to "code" live again (and also makes it less
likely that any new cleanup code will be unintentionally omitted).
In the ubik_ClientInit recovery case, handle the possibility that
aproc() returned zero and return UINTERNAL rather than letting the
caller think that this operation succeeded.
the whole of the api used for icon handling when you steal it
from a resource fork is deprecated in new macos. fine. we'll just make
an app bundle by cheating, move andy into a standalone icns file,
install him into the "bundle" and open it the macos way.
afs_server: remove 3 dead assignments in LoopServers, move live one
The assignments to conns, rxconns, and conntimer are all dead, so
remove them. The assignment to nconns is live, but rather far from
the for loop that actually uses it; move it to just before the loop.
Michael Meffie [Thu, 7 Jun 2012 16:58:54 +0000 (12:58 -0400)]
libafs: use afs_ResetVCache in flush volume data
Remove some code duplication by using afs_ResetVCache
in the flush volume data pioctl. Adds a flag to
ResetVCache to avoid unneeded calls to purge dnlc
when resetting all the vcaches in a volume.
Adds freeing of vcache link data in the flush volume
data pioctl.
util: LogCommandLine: argc is an int, so assert that it's positive
In practice, argc should never be negative, but by convention it's
a signed int, so change the assertion to require it to be positive
rather than merely nonzero to give some help to the static analyzer.
budb: don't malloc(0) on error condition in GetText()
malloc(0) is non-portable (may return a pointer to no space, or it
may return NULL). Just set the result to NULL without bothering to
call malloc(), as is done earlier in this function.
rx: don't leak a connection hash table in unlikely error condition
If getsockname() returns an error (which shouldn't be possible),
rx_InitHost would leak a connection hash table (which probably
doesn't matter because the caller will just exit anyway). Make
the analyzer happy by freeing the memory anyway.
The AFS file server had always performed a PRSFS_READ permission
check on the volume's root directory (1.1) vnode before responding
successfully to the client. A successful response contains the
following volume state information:
Message of the day (if any)
Offline message (if any)
Online flag
InService flag
Blessed flag
NeedsSalvage flag
Type
MinQuota
MaxQuota
BlocksInUse
PartBlocksAvail
PartMaxBlocks
All of this information is publicly available to anonymous users
via other services so it is odd that it is hidden from anonymous
cache managers.
As sites begin to tighten the ACLs on volumes due to privacy
and security concerns this READ permission check is beginning to
cause problems for Windows clients that rely upon the quota and
block counts to determine whether or not it is likely to be safe
to perform an extending write. In many environments volumes are
being configured such that the root directory is 'l' for all and
only the subdirectories provide for 'ridw'. Under these situations
the user is able to read/write the data but cannot determine how
much free space is available. Since all of the data returned by
RXAFS_GetVolumeStatus is publicly available, the patchset removes
the access check entirely.