Jeffrey Altman [Thu, 19 Nov 2009 23:11:06 +0000 (18:11 -0500)]
viced: set volume sync data in bulk status rpcs
The bulkstatus and inlinebulkstatus rpcs have a bug
that prevents the volume sync data from being set.
Currently the data is being set within the for loop
only when i == nfiles. The conditional of the loop
is i < nfiles so the SetVolumeSync call is never
performed. This patch changes the test for performing
SetVolumeSync to i == 0.
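A minimal sketch of the loop bug described above, not the viced code itself. The function names and the counter are hypothetical stand-ins for SetVolumeSync and the RPC handlers.

```c
#include <assert.h>

/* Hypothetical stand-in for SetVolumeSync(): counts how often it runs. */
static int sync_calls;

static void set_volume_sync(void)
{
    sync_calls++;
}

/* Buggy form: inside "for (i = 0; i < nfiles; i++)" the test
 * "i == nfiles" can never be true, so the sync data is never set. */
int bulk_status_buggy(int nfiles)
{
    int i;

    sync_calls = 0;
    for (i = 0; i < nfiles; i++) {
        if (i == nfiles)
            set_volume_sync();
    }
    return sync_calls;
}

/* Fixed form: set the sync data once, on the first iteration. */
int bulk_status_fixed(int nfiles)
{
    int i;

    sync_calls = 0;
    for (i = 0; i < nfiles; i++) {
        if (i == 0)
            set_volume_sync();
    }
    return sync_calls;
}
```

With any positive nfiles the buggy form makes no call, while the fixed form makes exactly one.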
Marc Dionne [Sun, 25 Oct 2009 02:10:46 +0000 (22:10 -0400)]
Linux: Keyrings PAG handling changes
We can take advantage of the fact that PagInCred now receives
a kernel credentials structure as an argument (including any session
keyring) to make some improvements in the handling of PAGs
when keyrings are in use.
These changes are effective only if keyrings are in use and we
have a recent enough kernel where we can use the kernel
credentials structure.
1 - Search the session keyring of the passed credentials instead of
the current process' to determine the PAG, if any. This was never
really correct, and now we're able to do the right thing.
In some situations such as background writeback and pre-fetching,
this means that we'll now do it with the right credentials, even when
in a PAG.
2 - Don't use groups at all to determine PAG membership. Doing so
can lead to some inconsistent situations such as the one described
in RT 125198, where a process gets access through a soon to be
deleted PAG. Make PagInCred look exclusively at the keyrings.
Groups are still updated to try to reflect the current PAG for now,
if the passed credentials belong to the current process.
Note that a process can no longer get a PAG's privileges simply by
adding the corresponding groups to its group list.
Marc Dionne [Sun, 22 Nov 2009 19:17:19 +0000 (14:17 -0500)]
Remove "unused" warnings from lex generated files
Some (f)lex generated source files produce warnings because of unused
labels or variables.
Since there is limited control of the source itself, just be more
permissive in this particular case with -Wno-unused.
UFSOpen shares a prototype with MemCacheOpen because of the
afs_cacheOps structure. This is why a void * is used.
Revert until a more complete fix can be submitted that addresses
the memcache case as well.
Simon Wilkinson [Wed, 18 Nov 2009 20:07:04 +0000 (20:07 +0000)]
Remove inode hinting for dcaches
The VNOP read code has always contained incomplete support for inode
hinting. In theory this would let us attach open cache files to dcache
structures, so that we don't have the overhead of opening the file
with every read that we do.
However, this has been ifdef'd off ever since the first release, and
is fundamentally broken - it relied upon structure elements that just
don't exist, and has no mechanism for throttling the number of inode
hints that are maintained. Inode hinting also required that we store
an inode number within the osi_file structure (so hint validity could
be checked), which causes a problem on some modern OS's.
Simplify all of this, by just removing the partial hinting support.
If we want to revisit this in the future, then the code is in git,
but if we _do_ feel we want to keep open cache files around, it's
probably better to start from scratch!
Simon Wilkinson [Mon, 26 Oct 2009 19:58:53 +0000 (19:58 +0000)]
Fix prepare and commit_write to do the right thing
Even when we're doing synchronous writeback, as we currently do
for write() operations, it's important to correctly fill, and flag
the pages we're writing to. Not doing so has a huge performance
penalty, as it means even when we've just written a page, we have to
pull it back from the backing store for a read.
This code fixes prepare_write and commit_write (for RHEL5) and
write_begin and write_end (for Fedora) to correctly populate and
flag pages which are being written.
Simon Wilkinson [Sat, 24 Oct 2009 14:08:52 +0000 (15:08 +0100)]
Linux: Use atomics for credential reference counts
The reference count maintained as part of the afs_cred structure
wasn't being maintained atomically, requiring that crfree and
crhold always be called with the GLOCK held.
This patch just switches to using Linux's inbuilt atomic types to
maintain the reference count.
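A user-space sketch of an atomically maintained reference count. It uses C11 <stdatomic.h> as an analogue of the kernel's atomic types; the structure and function names are hypothetical and are not the afs_cred code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical credential object; not the real afs_cred layout. */
struct demo_cred {
    atomic_int ref;   /* reference count, updated without a global lock */
    int uid;
};

/* Allocates a credential holding one reference. */
struct demo_cred *demo_crget(int uid)
{
    struct demo_cred *cr = malloc(sizeof(*cr));

    if (cr == NULL)
        return NULL;
    atomic_init(&cr->ref, 1);
    cr->uid = uid;
    return cr;
}

/* Takes an additional reference. */
void demo_crhold(struct demo_cred *cr)
{
    atomic_fetch_add(&cr->ref, 1);
}

/* Drops a reference. Returns 1 if this call dropped the last one and
 * freed the object, 0 otherwise. */
int demo_crfree(struct demo_cred *cr)
{
    /* atomic_fetch_sub returns the value held before the decrement. */
    if (atomic_fetch_sub(&cr->ref, 1) == 1) {
        free(cr);
        return 1;
    }
    return 0;
}
```

Because each update is a single atomic operation, hold and free no longer need an outer lock to keep the count consistent.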
Andrew Deason [Wed, 18 Nov 2009 21:43:17 +0000 (15:43 -0600)]
Define WCOREDUMP in salvsync-server.c
Some platforms do not define WCOREDUMP. Conditionally define WCOREDUMP
in salvsync-server.c, and make all of the similar WCOREDUMP defines in
the tree consistent.
Change-Id: I197979881ade20f6e790bf41523938089379dbe3
Reviewed-on: http://gerrit.openafs.org/846
Reviewed-by: Russ Allbery <rra@stanford.edu>
Reviewed-by: Tom Keiser <tkeiser@sinenomine.net>
Reviewed-by: Derrick Brashear <shadow@dementia.org>
Tested-by: Derrick Brashear <shadow@dementia.org>
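A sketch of a conditional WCOREDUMP definition. The 0200 fallback is the traditional Unix core-dump bit in a wait status and is an assumption here, not necessarily the exact definition used in the tree; child_dumped_core is a hypothetical helper.

```c
#include <assert.h>
#include <sys/wait.h>

/* Fallback for platforms whose <sys/wait.h> lacks WCOREDUMP. The 0200
 * bit is an assumption; check the platform before relying on it. */
#ifndef WCOREDUMP
# define WCOREDUMP(status) ((status) & 0200)
#endif

/* Returns nonzero if the child was killed by a signal and dumped core. */
int child_dumped_core(int status)
{
    return WIFSIGNALED(status) && WCOREDUMP(status);
}
```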
Mickey Lane [Wed, 18 Nov 2009 17:23:15 +0000 (12:23 -0500)]
Fix 2 errors in Windows release Notes
Description of registry key HKLM\ SOFTWARE\ OpenAFS\
Client\ Server Preferences\ File (and \ VLDB) states
"256" - should be 15 - and "ServerPreferences" should
have a space between the words.
Jeffrey Altman [Fri, 13 Nov 2009 18:56:20 +0000 (13:56 -0500)]
Windows: cm_BkgDaemon requeuing only applies to BkgStore
cm_BkgDaemon currently requeues failed requests for a variety
of errors. It only applies to cm_BkgStore requests. The current
code only supports cm_BkgStore and cm_BkgPrefetch operations.
Additional background operations may be added in the future.
If requeues are meant to apply to the new operations, they should
be explicitly specified. Specify cm_BkgStore explicitly now.
Jeffrey Altman [Sat, 14 Nov 2009 21:33:31 +0000 (16:33 -0500)]
Windows: Improvements to background fetch processing
Log offset and length in cm_BkgPrefetch()
Convert mxheld to rwheld in cm_BkgPrefetch() now that cm_scache_t
objects use rwlocks.
Do not clear CM_SCACHEFLAG_PREFETCHING from within the error
returns from cm_CheckFetchRange(). Let the caller decide if
that is appropriate.
Add CM_BUF_CMBKGFETCH cm_buf_t cmFlag to make it possible to
quickly detect if a background fetch operation has already
been queued for a particular cm_buf_t data range.
Andrew Deason [Wed, 18 Nov 2009 20:08:49 +0000 (14:08 -0600)]
Define T_SRV when not defined for us
Define T_SRV when we don't have a usable arpa/nameser_compat.h, just
like we do with T_AFSDB. Some platforms like AIX do not have an
easily-usable arpa/nameser_compat.h.
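A sketch of the fallback definitions. The values are the DNS resource record type codes (AFSDB is 18 per RFC 1183, SRV is 33 per RFC 2782); rr_type_name is a hypothetical helper used only to exercise them.

```c
#include <assert.h>

/* Define the record type codes only when the system headers did not. */
#ifndef T_AFSDB
# define T_AFSDB 18
#endif
#ifndef T_SRV
# define T_SRV 33
#endif

/* Returns a printable name for the record types used in server lookup. */
const char *rr_type_name(int type)
{
    switch (type) {
    case T_AFSDB:
        return "AFSDB";
    case T_SRV:
        return "SRV";
    default:
        return "other";
    }
}
```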
Make ihandle file descriptor cache parameters tunable, and accommodate
platforms where max open files is large. Expand the fd cache hash table
to 2048 entries. Raise fd cache size automatically to match configured
number of lwps.
NOTE: This code has been tested on Centos 5.3 x86_64, on VMWare, 2 physical,
2 logical CPUs (in tandem with viced_more_threads).
Simon Wilkinson [Fri, 13 Nov 2009 09:50:29 +0000 (09:50 +0000)]
Rationalise our include paths
Our include paths are a bit of a mess. Fix these so that they're
more rational, and more in line with normal coding style.
In particular:
*) Don't include all of the subdirectories of our top level
include directory. If a file wants afs/file.h, it should
include that, not "file.h"
*) Try to avoid including '.' in the search path (although
objdir builds make this harder)
*) Don't blindly include other directories from the code tree
in the search path. If a package wants another package's header,
then it should get it from the include directory
*) Use the convention that quoted includes ("") pick up local
headers. Bracketed includes (<>) pick up ones from the top level
include dir
*) In directories which pull in files from multiple packages, don't
blindly put all of the package directories in the search path.
Specifically include the file's package directory when required
The big change here is that it's no longer possible to hide a system
include by placing a header of the same name in include/afs. The most
common case where this was happening was for 'assert.h'
Simon Wilkinson [Fri, 13 Nov 2009 16:33:52 +0000 (16:33 +0000)]
Better errors from aklog
Since the great com_err fracture, aklog has only returned decent
error messages from AFS, leaving Kerberos errors untranslated.
Needless to say, this causes user confusion and distress.
This patch uses the error display proc hook to call out to the real
com_err in situations where AFS can't supply an error message, giving
clearer errors for Kerberos problems.
Jeffrey Altman [Sat, 14 Nov 2009 21:27:37 +0000 (16:27 -0500)]
Windows: Error mapping for VBUSY and VRESTARTING
Add error mapping for VBUSY and VRESTARTING to
cm_MapRPCError(). Return CM_ERROR_ALLBUSY.
This prevents an unknown error from being returned
to the SMB redirector.
Jeffrey Altman [Sat, 31 Oct 2009 14:33:00 +0000 (10:33 -0400)]
Windows: Use STATUS_IO_TIMEOUT where STATUS_TIMEOUT was returned
STATUS_TIMEOUT causes the smb redirector to drop the connection.
STATUS_RETRY is interpreted by the smb redirector as if the error was
generated by the transport stack and not the smb server.
STATUS_IO_TIMEOUT is listed in the SNIA CIFS 1.0 spec as a valid
return code for the smb server. Let's use that.
Jeffrey Altman [Sun, 15 Nov 2009 06:01:23 +0000 (01:01 -0500)]
Windows: Fix port assignment to use network byte order
Service port numbers are stored within sockaddr* structures
and returned by afsconf_FindService() in network byte order.
getAFSServer() and afsconf_GetAfsdbInfo() accept and return
service port numbers in network byte order.
When processing the special case for 7002 and 7003 in
afsconf_GetAfsdbInfo(), the comparisons must consistently
use network byte order.
When assigning port numbers for AFSDB lookups, getAFSServer()
must use network byte order.
Document the use of network byte order for each variable.
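A sketch of keeping both sides of a port comparison in network byte order. The function names are hypothetical and this is not the Windows client code; only htons() and ntohs() are real library calls.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <stdint.h>

/* Returns nonzero if a port held in network byte order is 7002 or 7003.
 * The constants are converted, so both operands share one byte order. */
int is_special_port(uint16_t port_net)
{
    return port_net == htons(7002) || port_net == htons(7003);
}

/* Converts a host-order port number for storage in a sockaddr-style field. */
uint16_t port_to_net(uint16_t port_host)
{
    return htons(port_host);
}
```

Comparing a network-order value against a bare 7002 would only work on big-endian hosts, which is the class of bug the commit describes.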
Andrew Deason [Wed, 11 Nov 2009 17:23:49 +0000 (11:23 -0600)]
Make ktc_curpag also detect ONEGROUP PAG gids
ktc_curpag falls back to looking at the group list if the VIOC_GETPAG
pioctl fails. If we're in AFS_LINUX26_ONEGROUP_ENV in the kernel,
though, ktc_curpag still looks for two groups, instead of the one
combined group. Add a check for the big one group in the fallback if
we're on LINUX26.
Change-Id: I28e5eda5c62f13a6fb466c8a2b04d2628706498f
Reviewed-on: http://gerrit.openafs.org/815
Tested-by: Andrew Deason <adeason@sinenomine.net>
Reviewed-by: Marc Dionne <marc.c.dionne@gmail.com>
Reviewed-by: Derrick Brashear <shadow@dementia.org>
Simon Wilkinson [Wed, 21 Oct 2009 22:17:15 +0000 (23:17 +0100)]
Use set_page_writeback and end_page_writeback
Calling set_page_writeback and end_page_writeback is necessary to
ensure that the dirty page radix tree and the page dirty flags
tally. The results of end_page_writeback are also used by the
bdi code to prioritise writeback. The Linux kernel
documentation contains further warnings of doom for what may
happen due to not calling them.
Adding set_page_writeback and end_page_writeback also allows us to
unlock the page earlier (the page can be unlocked any time after the
writeback flag is set). This means that we're not calling the
backing filesystem's ->write function with our pages locked, and
should help reduce contention and the potential for deadlocks there.
Change-Id: I9130b2ad9a09c6b9b16a0f63d7b4a614a93de8d3
Reviewed-on: http://gerrit.openafs.org/819
Reviewed-by: Marc Dionne <marc.c.dionne@gmail.com>
Tested-by: Marc Dionne <marc.c.dionne@gmail.com>
Reviewed-by: Derrick Brashear <shadow@dementia.org>
Marc Dionne [Thu, 29 Oct 2009 23:58:00 +0000 (19:58 -0400)]
Linux: Use the kernel's credentials structure
Recent kernels (2.6.29 and above) have a separate ref-counted
structure for holding credentials. Use it directly instead of
keeping a separate afs specific structure that shadows the same
information.
Also adapt Linux for the change from cr_xxx to afs_cr_xxx wrappers.
Reference counting is done with the appropriate get/put calls.
Andrew Deason [Wed, 11 Nov 2009 16:51:19 +0000 (10:51 -0600)]
Do not check *aoutSize in PGetPAG
*aoutSize is always zero in pioctls, since afs_HandlePioctl handles
checking the output buffer size, and sets outSize to 0 before calling
the pioctl. So, PGetPAG was always returning E2BIG; remove the check to
make it work.
Simon Wilkinson [Wed, 11 Nov 2009 10:34:30 +0000 (10:34 +0000)]
Update warning inhibition
A number of recent changes haven't caught all of the locations where
warning inhibition can be removed. This patch updates all of the
inhibitions to reflect the current state of the tree when built with
gcc4.2
Simon Wilkinson [Wed, 11 Nov 2009 10:28:29 +0000 (10:28 +0000)]
const char paths for ubik_ServerInit
ubik_ServerInit* take a pathname, which should really be a const.
It already is in many of the callers, some of which remove the
const by casting, the others throw errors.
Make pathName const for all of ubik_ServerInitByInfo, ubik_ServerInit
and ubik_ServerInitCommon.
Update all of our callers to remove the now unnecessary casting.
Remove the now unnecessary warning inhibition on vlserver/vlserver.c
Simon Wilkinson [Wed, 11 Nov 2009 10:19:07 +0000 (10:19 +0000)]
Fix des key type issue in bosoprocs
The call to afsconf_AddKey was using 'akey' rather than 'akey->data'.
As data is the first element of the akey structure, these are actually
identical, but the compiler sees it as a type error. Fix to use the
correct name, and remove the warning inhibition.
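A sketch of why a pointer to a structure and a pointer to its first member compare equal yet differ in type. struct demo_key is hypothetical and only mirrors the "data is the first element" layout described above.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical key type whose first member is the raw key bytes. */
struct demo_key {
    char data[8];
};

/* Consumer that expects a pointer to the raw key bytes. */
int first_key_byte(const char *keybytes)
{
    return keybytes[0];
}

/* Same address, different types: passing "key" where "const char *" is
 * expected is a type error, while "key->data" has the expected type. */
int same_address(const struct demo_key *key)
{
    return (const void *)key == (const void *)key->data;
}
```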
Simon Wilkinson [Wed, 11 Nov 2009 10:13:57 +0000 (10:13 +0000)]
Prototype UV_Bind
Publicly prototype UV_Bind in volser_prototypes.h
Make dump.c use the public prototype, instead of an incomplete
private copy, and remove the warning inhibition that was required to
support the private copy.
Simon Wilkinson [Wed, 11 Nov 2009 08:32:48 +0000 (08:32 +0000)]
Remove 'M' variants of lock macros
Since the beginning, we've had M variants of the lock macros, which
are identical to the normal form. Dispose of these variants, to make
it clearer what's going on.
Simon Wilkinson [Wed, 11 Nov 2009 09:10:36 +0000 (09:10 +0000)]
Fix warnings from afsconf_SetExtendedCellInfo
If a is declared as an array, then a == &a. However, the compiler
still gives a type warning when using the & form, as the types no
longer match. 5f720faab920a1007327de415ceaf187c16fdbe6 fixed this
problem for calls to GetExtendedCellInfo - do the same for the
corresponding Set calls.
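A sketch of the array case. The names are hypothetical; the point is only that an array and its address share a location but not a type.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_NHOSTS 8

/* Hypothetical callee taking a pointer to the first element. */
size_t count_nonzero(const char *flags, size_t n)
{
    size_t i, count = 0;

    for (i = 0; i < n; i++) {
        if (flags[i] != 0)
            count++;
    }
    return count;
}

/* "flags" decays to char *, matching the prototype. "&flags" has type
 * char (*)[DEMO_NHOSTS]: the same address, but passing it to
 * count_nonzero() draws an incompatible-pointer-type warning. */
size_t demo_count(void)
{
    char flags[DEMO_NHOSTS] = { 1, 0, 1, 0, 0, 0, 0, 1 };

    assert((void *)flags == (void *)&flags);    /* same address */
    return count_nonzero(flags, DEMO_NHOSTS);   /* correct type */
}
```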
Simon Wilkinson [Wed, 11 Nov 2009 08:28:32 +0000 (08:28 +0000)]
Include signal.h for sigfillset
f6ce2af008feb615e94d924fc9f81e2098e73e7c added a call to
AFS_SIGSET_CLEAR to vol/volume.c. However, it didn't add signal.h
to this file. As AFS_SIGSET_CLEAR calls sigfillset(), this broke
checked builds.
Add signal.h to the list of headers to fix the build warning.
Marc Dionne [Tue, 10 Nov 2009 23:36:55 +0000 (18:36 -0500)]
krb_udp.c warning fix
This file generates a warning because the left side of a variable
assignment is commented out. Keep the effect of the line
(incrementing packet) but remove the unused casting and
reference, and remove the comments that date from the original
IBM source.
Leave a new comment in place in case the information is useful.
Adjust the Makefile and README.WARNINGS to account for the change.
Marc Dionne [Tue, 10 Nov 2009 23:16:45 +0000 (18:16 -0500)]
src/pam/afs_auth.c warning fix
ka_UserAuthenticateGeneral expects an afs_int32 pointer for the
password_expires argument. A (long *) was used in afs_auth.c,
generating a few warnings.
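A sketch of matching the variable type to the callee's pointer argument. demo_int32 stands in for afs_int32 and get_password_expires is a hypothetical callee, not ka_UserAuthenticateGeneral.

```c
#include <assert.h>
#include <stdint.h>

typedef int32_t demo_int32;   /* stand-in for afs_int32 */

/* Hypothetical callee that writes a 32-bit value through its argument. */
void get_password_expires(demo_int32 *password_expires)
{
    *password_expires = 30;
}

/* Declaring the variable with the callee's type avoids the warning and
 * stays correct on platforms where long is wider than 32 bits. */
long expires_as_long(void)
{
    demo_int32 expires = 0;

    get_password_expires(&expires);
    return (long)expires;
}
```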
Simon Wilkinson [Wed, 11 Nov 2009 08:12:51 +0000 (08:12 +0000)]
cr_gid is already used by Darwin
Commit eb8e55bba7740a87e07ef07bb4b789e6d4e36f0d introduced a variety
of functions for accessing members of the credentials structure in a
platform independent way. Sadly, cr_gid is already defined by the
Darwin platform headers (on Darwin, the GID is just the first of
the user's groups)
Turn cr_gid() into afs_cr_gid() to avoid this problem, and for
consistency, also rename cr_uid, cr_ruid, cr_rgid, and the
corresponding set_* functions.
Russ Allbery [Mon, 9 Nov 2009 01:31:25 +0000 (17:31 -0800)]
Update afsd cache and firewall details
Cache parameters are discussed in two locations in the afsd man page,
and the first copy had not been updated for the new auto-tuning of
the chunk size and the stat parameter. Fix both.
Note that the firewall requirements for klog only apply if you're using
kaserver and klog. Kerberos v5 has its own requirements, but this is not
the place to talk about them.
Marc Dionne [Thu, 29 Oct 2009 23:23:28 +0000 (19:23 -0400)]
Unix client: wrappers for credentials structure access
This patch introduces and makes use of wrappers for access
to credentials structure members:
cr_uid (afs_ucred_t *)
cr_ruid(afs_ucred_t *)
cr_gid (afs_ucred_t *)
cr_rgid(afs_ucred_t *)
cr_group_info(afs_ucred_t *)
Inline functions are also introduced to set values:
set_cr_uid (afs_ucred_t *, uid_t)
set_cr_ruid(afs_ucred_t *, uid_t)
set_cr_gid (afs_ucred_t *, gid_t)
set_cr_rgid(afs_ucred_t *, gid_t)
set_cr_group_info(afs_ucred_t *, struct group_info *)
This will allow an architecture to make use of an alternate
structure to hold credentials. In particular it will allow
the linux client to be modified to use the kernel credentials
structure directly instead of shadowing it into our own local
structure.
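A sketch of the accessor pattern described above. The demo_ names and the structure layout are hypothetical; a platform could swap in a different structure without changing any caller.

```c
#include <assert.h>
#include <sys/types.h>

/* Hypothetical credentials layout; a platform could substitute its own. */
typedef struct demo_ucred {
    uid_t uid;
    gid_t gid;
} demo_ucred_t;

/* Accessors: callers never touch the structure members directly. */
static inline uid_t demo_cr_uid(const demo_ucred_t *cred)
{
    return cred->uid;
}

static inline gid_t demo_cr_gid(const demo_ucred_t *cred)
{
    return cred->gid;
}

/* Setters mirror the accessors. */
static inline void demo_set_cr_uid(demo_ucred_t *cred, uid_t uid)
{
    cred->uid = uid;
}

static inline void demo_set_cr_gid(demo_ucred_t *cred, gid_t gid)
{
    cred->gid = gid;
}
```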
Michael Meffie [Thu, 5 Nov 2009 16:08:08 +0000 (11:08 -0500)]
viced: avoid useless core if shutdown during initialization
Avoid leaving an unnecessary core file when the fileserver is
shut down while still attaching volumes. The bosserver issues
SIGQUIT to shut down the fileserver, which leaves a core file by
default.
Register the fileserver shutdown signal handler earlier in the
fileserver initialization, before the long running volume
attachment is started. The volume package shutdown has been
changed to allow the VShutdown to gracefully abort the volume
attachment (or pre-attachment for DAFS).
Simon Wilkinson [Wed, 4 Nov 2009 20:15:36 +0000 (20:15 +0000)]
Complete removal of DUX client code
With commit cfce015ead18c72ee921f480c73e9247a98838fc (in 2006) all
of the files specific to the DUX cache manager were removed.
However, the DUX code within general files remained untouched.
This patch completes the removal of the (entirely non-functional)
DUX client, by removing all cache manager code which is for
AFS_DUX*_ENV and AFS_OSF_ENV platforms.
It also takes the advantage of this removal to simplify some #ifdef
ladders, and indents others (purely because I needed the indentation
to work out what on earth was going on!)
Simon Wilkinson [Wed, 4 Nov 2009 18:09:51 +0000 (18:09 +0000)]
Move vnode macros to their own directories
The tree is inconsistent whether macros for access to vnodes are
provided by the OS directories, or in afs_osi.h. This makes things
very confusing, especially in the Linux case where macros are
provided in afs_osi.h, and then promptly redefined in
LINUX/osi_machdep.h
Adopt a convention where default macros are conditionally provided
by osi_machdep.h. Where these aren't wanted, they should be disabled
in osi_machdep.h, and OS specific versions provided in the individual
OS's directory.
Marc Dionne [Sat, 7 Nov 2009 15:51:52 +0000 (10:51 -0500)]
Linux: always use afs_maybe_unlock_kernel
In one error case in afs_linux_lookup unlock_kernel() is called
directly instead of using the conditional "maybe" form.
If the config is such that the BKL is not taken, this can result
in an attempt to unlock when the lock has not been taken, and
can cause an oops.
Simon Wilkinson [Wed, 4 Nov 2009 23:40:39 +0000 (23:40 +0000)]
Prevent VLRUQ race in ShakeLooseVCaches
When ShakeLooseVCaches is called from afs_Daemon, the xvcache lock
is not held. This means that if the GLOCK is dropped for any reason
(for example, whilst purging the dentry cache), then
ShakeLooseVCaches can be raced, and we can end up attempting to
flush the same vcache twice.
The symptoms of this in Linux are that we oops in clear_inode.
Get the xvcache lock in afs_Daemon(), before calling
ShakeLooseVCaches. Also, remove the conditional GLOCK code from
that function. If we don't have the GLOCK on entry, then we're really
in trouble (and both code paths - afs_Daemon and afs_NewVCache should
get the GLOCK for us, anyway)
Rainer Toebbicke [Fri, 30 Oct 2009 11:10:21 +0000 (12:10 +0100)]
Correct diskused and files when cloning a volume
Recalculates a volume's disk space used and number of files upon
every clone where it is effortless. Even though tracked mostly
correctly, bugs and accidents leave their traces which only a
salvage would correct.
Marc Dionne [Wed, 28 Oct 2009 21:54:32 +0000 (17:54 -0400)]
Linux - Fix disk cache access for selinux/AppArmor constrained processes
Preserve the credentials used for cache initialisation and use them
whenever disk cache files are opened. This takes advantage of the
credentials separation work from David Howells available in kernels
2.6.29 and above.
Access to cache files was done under the security context of the
user process, causing processes constrained by selinux or AppArmor to
fail to access AFS cache files and causing the cache manager to panic.
Besides the RT tickets, should also fix the following Ubuntu bugs:
415766 429260 457779 459299
Jeffrey Altman [Fri, 23 Oct 2009 14:54:35 +0000 (09:54 -0500)]
Check for (hostFlags & HOSTDELETED) after h_Lock_r
Many callers of h_Lock_r do not check if the HOSTDELETED flag is set,
even though it could have been set while waiting for the host lock. Add
checks for it everywhere we call h_Lock_r and we care if the host has
been deleted.
Andrew Deason [Mon, 2 Nov 2009 18:19:45 +0000 (12:19 -0600)]
DAFS: Avoid SALVSYNC communication during shutdown
Avoid trying to contact the salvageserver for any reason while we are
shutting down. During shutdown the salvageserver may not be around
anymore, so any SALVSYNC communication will appear to hang.
Just set a global flag to indicate 'no-SALVSYNC' on shutdown, in
addition to the thread-local flag we already have.
Andrew Deason [Mon, 2 Nov 2009 23:18:19 +0000 (17:18 -0600)]
DAFS: Wait for exclusive ops in FSYNC_VOL_OFF
In the FSYNC_VOL_OFF handler, fssync-server.c errors out if the call to
VGetVolumeByVp_r fails. However, this can fail if the volume is in an
error state such as SALVAGING. Normally we don't even call GetVolume
when the volume is salvaging, but the volume state can change to
SALVAGING inside GetVolume. This is particularly likely to happen on a
demand salvage, since we switch to the SALVSYNC_REQ state when
scheduling the salvage, and if we are still in that state when the
salvaged child requests a VOL_OFF, we will fail to get the heavyweight
ref.
Fix this in two ways. First, we VWaitExclusiveState_r before examining
states for the short-circuit logic so our view of the volume state is
more accurate. Second, re-examine the volume state after the call to
GetVolume, and perform the same short-circuit logic, since the volume
state may have changed during GetVolume.
Dan Hyde [Thu, 29 Oct 2009 16:07:47 +0000 (12:07 -0400)]
Add array bounds checking in h_Enumerate
When hostList is not properly NULL-terminated, the current code does
not protect from buffer overflow. The following patch prevents buffer
overflow, prints a message, and asserts.
On our Linux hosts, we never reached the original assert, as there is
a problem handling the segfault the buffer overflow causes.
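A sketch of bounding a list walk by the size of the destination array. The names are hypothetical and this is not the h_Enumerate code; the caller would log and assert on the error return.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical host record. */
struct demo_host {
    struct demo_host *next;
};

/* Copies up to "capacity" hosts from the list into "out". Returns the
 * number copied, or -1 if the list holds more entries than the array,
 * for example because it is not properly NULL-terminated. */
int collect_hosts(struct demo_host *head, struct demo_host **out, int capacity)
{
    int count = 0;
    struct demo_host *h;

    for (h = head; h != NULL; h = h->next) {
        if (count >= capacity) {
            fprintf(stderr, "collect_hosts: list exceeds %d entries\n",
                    capacity);
            return -1;
        }
        out[count++] = h;
    }
    return count;
}
```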
Marc Dionne [Sat, 31 Oct 2009 17:27:18 +0000 (13:27 -0400)]
Linux: Fix write_begin configure test for recent RHEL kernels
Recent RHEL kernels now define simple_write_begin, which was used as
a test for the write_begin address_space op. This makes the test
succeed when it shouldn't, and breaks the build.
Rewrite the test to actually check the address_space operation.
Change-Id: Idac9b318ff716b61bf8ca4508d2dbdbfbad5b50d
Reviewed-on: http://gerrit.openafs.org/759
Reviewed-by: Simon Wilkinson <sxw@inf.ed.ac.uk>
Tested-by: Marc Dionne <marc.c.dionne@gmail.com>
Reviewed-by: Derrick Brashear <shadow@dementia.org>
Marc Dionne [Sat, 31 Oct 2009 12:54:52 +0000 (08:54 -0400)]
Fix memory allocation warnings at shutdown
At shutdown we check for unfreed memory allocated with AllocSmallSpace
and AllocLargeSpace and complain in the syslog if there are dangling
pieces. This patch takes care of a few cases that always showed up
as warnings, even after a simple start-stop of the client.
- The cacheInode file needs to be closed before the checks, since it
uses a large piece for its struct file.
- The ICL logging code allocates 6 small pieces that are never freed.
Add a shutdown_icl() function that releases everything. While we're
at it, correct one place where we allocated with afs_osi_Alloc but
freed with osi_FreeSmallSpace, confusing our accounting.
Simon Wilkinson [Thu, 29 Oct 2009 18:53:30 +0000 (18:53 +0000)]
Cleanup cache bypass
This patch cleans up the cache bypass code so that it uses a
consistent form of indentation throughout the file.
It also changes the do { } while(0); macros to omit the trailing
semicolon, as macro definitions with trailing semicolons break
normal coding conventions.
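A sketch of why the macro definition should leave the semicolon to the caller. DEMO_SWAP and order_pair are hypothetical and unrelated to the cache bypass macros.

```c
#include <assert.h>

/* Multi-statement macro wrapped in do { } while (0), with no trailing
 * semicolon in the definition; the caller supplies it. */
#define DEMO_SWAP(a, b) \
    do {                \
        int tmp_ = (a); \
        (a) = (b);      \
        (b) = tmp_;     \
    } while (0)

/* Orders *lo and *hi. With a trailing semicolon in the macro definition,
 * "DEMO_SWAP(...);" would expand to two statements and the "else" below
 * would no longer compile. */
int order_pair(int *lo, int *hi)
{
    if (*lo > *hi)
        DEMO_SWAP(*lo, *hi);
    else
        return 0;   /* already ordered */
    return 1;       /* swapped */
}
```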
Andrew Deason [Wed, 28 Oct 2009 16:06:47 +0000 (11:06 -0500)]
Avoid using released hosts
Since h_Release_r has the possibility of freeing a host, we should not
be using a host after it has been released. A few places can still use a
released host, potentially causing heap corruption, double frees, and
generally weird behavior.
So either move calls of h_Release_r until after we finish using a host,
or make sure to set the pointer to NULL after it has been released.
Change-Id: I3d5275c3862003e372d3c19a5462e62bf9cb269e
Reviewed-on: http://gerrit.openafs.org/747
Tested-by: Andrew Deason <adeason@sinenomine.net>
Reviewed-by: Dan Hyde <drh@umich.edu>
Reviewed-by: Derrick Brashear <shadow@dementia.org>
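A sketch of the two remedies named above: release only after the last use, and clear the pointer on release. The reference-counted structure and function names are hypothetical, not the viced host package.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical reference-counted host. */
struct demo_host {
    int refs;
    int index;
};

/* Allocates a host holding one reference. */
struct demo_host *demo_host_new(int index)
{
    struct demo_host *h = malloc(sizeof(*h));

    if (h != NULL) {
        h->refs = 1;
        h->index = index;
    }
    return h;
}

/* Drops one reference and clears the caller's pointer, so a later use
 * is an obvious NULL dereference rather than a silent use-after-free. */
void demo_host_release(struct demo_host **hp)
{
    struct demo_host *h = *hp;

    *hp = NULL;
    if (h != NULL && --h->refs == 0)
        free(h);
}

/* Reads everything needed from the host before releasing it. */
int demo_use_then_release(struct demo_host **hp)
{
    int index = (*hp)->index;   /* last use of the host */

    demo_host_release(hp);      /* release only after the last use */
    return index;
}
```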
Simon Wilkinson [Thu, 29 Oct 2009 18:42:41 +0000 (18:42 +0000)]
Coding style cleanup
Our style for function definitions has the name of the function as
the first item on a new line - this means you can find a definition
by using grep ^functionName. Fix the disconnected code to follow this
style.
Simon Wilkinson [Wed, 28 Oct 2009 11:12:18 +0000 (11:12 +0000)]
Make afsd.pod reflect reality
9d396c4916fdac64fcface30e6637ca6e2911203 (from 2005) introduced
autotuning for afsd, and changed some of the defaults which aren't
autotuned. Update the afsd man page to reflect the autotuning, and
the new defaults.
Simon Wilkinson [Wed, 28 Oct 2009 18:24:33 +0000 (18:24 +0000)]
Move PMTU header block to top of file
1206e7538be86f073b21cd289266286b60a95d0a added linux/errqueue.h to
rx_user.c, but added the include in the middle of a function - which
means that the new structure is out of scope for the rest of the file,
which breaks the build on Linux.
Put the header include at the start with all of its other friends.
Simon Wilkinson [Mon, 26 Oct 2009 19:52:48 +0000 (19:52 +0000)]
Use fewer #ifdefs for dynamic vcaches
When we're not in AFS_MAXVCOUNT_ENV, make afsd_dynamic_vcaches a
static 0, which allows the removal of a scattering of #ifdef's in
the middle of conditionals in afs_vcache.c, and generally improves
the code browsing experience.
Also, move the externs for this variable to afs.h, where they belong,
and fix related formatting.
Simon Wilkinson [Mon, 26 Oct 2009 18:52:52 +0000 (18:52 +0000)]
Remove hardcoded maximum time
When iterating across the buffer list, afs_newslot used a hardcoded
maximum time to find the oldest. Instead of using this, just use the
accesstime of the first unused buffer that we find as the oldest, and
continue as normal.
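A sketch of seeding the "oldest" comparison from the first unused buffer instead of a hardcoded maximum. The structure and function are hypothetical and are not the afs_newslot code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical buffer descriptor. */
struct demo_buf {
    int lockers;              /* nonzero while the buffer is in use */
    unsigned int accesstime;  /* last-access stamp; smaller is older */
};

/* Returns the index of the least recently used unused buffer, or -1 if
 * every buffer is in use. The first unused buffer seeds the comparison,
 * so no "maximum time" sentinel is needed. */
int find_oldest_unused(const struct demo_buf *bufs, size_t n)
{
    int oldest = -1;
    size_t i;

    for (i = 0; i < n; i++) {
        if (bufs[i].lockers != 0)
            continue;
        if (oldest < 0 || bufs[i].accesstime < bufs[oldest].accesstime)
            oldest = (int)i;
    }
    return oldest;
}
```

This form keeps working however large the access stamps grow, which a fixed sentinel cannot guarantee.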
Simon Wilkinson [Mon, 26 Oct 2009 19:36:53 +0000 (19:36 +0000)]
Fix dynamic vcache / rxmaxmtu cmd id collision
Both dynamic vcaches and rxmaxmtu had been committed as using the
35th command entry. Fix this according to the order they are in
the command list (35 and 36, respectively). Tidy up the command list
so it's easier to read, and remove the #ifdef notdef entry from it,
as adding it back in would just cause chaos.
Sadly, similar changes were never made to afs/afs_buffer.c, so the
same problems remain in the cache manager.
The issue here is with two processes racing in afs_newslot. Calls to
afs_newslot protect buffers with a zero reference count using
afs_bufferLock. If we release afs_bufferLock, before we increase the
reference count of the vcache, then we can end up with newslot
picking the same buffer for two different purposes.
The GLOCK actually protects us from the worst of this, but this fix
is necessary both for correctness, and for symmetry with the file
server buffer code.
Simon Wilkinson [Mon, 26 Oct 2009 19:46:09 +0000 (19:46 +0000)]
Remove pininodes
The pininodes option has been commented out of afsd since the
original OpenAFS commit. Enabling it now would cause chaos, due to
the way that cmd orders its arguments. Just remove the sections
of code to avoid this danger.
Andrew Deason [Mon, 26 Oct 2009 19:04:48 +0000 (14:04 -0500)]
Dec old special inodes in inode convertROtoRW
The convertROtoRW code for the inode fileserver makes copies of the
volume's special inodes, but leaves the old (RO) inodes around. If the
RO is created again, this will result in duplicate special inodes for
the same volume, which freaks out the salvager (and possibly other
things).
So IH_DEC the old RO special inodes after converting, so they go away.
Jeffrey Altman [Mon, 26 Oct 2009 14:13:00 +0000 (07:13 -0700)]
ubik_VL_GetAddrsU does not accept a VLCallBack parameter
ubik_VL_GetAddrsU accepts a pointer to a uniqifier and not
a pointer to a VLCallBack structure. Remove an incorrect
cast and provide the correct parameter in src/volser/vos.c.
Andrew Deason [Fri, 23 Oct 2009 20:02:12 +0000 (15:02 -0500)]
Avoid 'salvageserver -client -showlog' segfault
Running salvageserver with the -client and -showlog options will
currently segfault, since -client does not open logFile, and -showlog
will attempt to rewind logFile on exit.
Fix this by not allowing -client and -showlog together (since it won't
work anyway, as -showlog tries to read SalvageLog), and by making
showlog() check logFile for NULL-ness.
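A sketch of the NULL check on the log handle. demo_showlog is a hypothetical stand-in for showlog(), with the file handles passed in as parameters rather than held in globals.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical log dumper: copies the log to "out" and returns the number
 * of bytes copied, or -1 if no log file was ever opened. */
long demo_showlog(FILE *log_file, FILE *out)
{
    int c;
    long copied = 0;

    if (log_file == NULL)   /* e.g. a mode that never opened the log */
        return -1;

    rewind(log_file);
    while ((c = fgetc(log_file)) != EOF) {
        fputc(c, out);
        copied++;
    }
    return copied;
}
```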
This adds the functions cm_RankUpServers() and cm_RankServer() to
the Windows cache manager. cm_RankUpServers() steps through the
list of servers, and calls cm_RankServer(), which in turn re-ranks
the servers that are currently up based on rx peer statistics as
exposed by rx_GetLocalPeers().
cm_RankUpServers() is called every 10 minutes by the cache manager
daemon, so as to allow re-ranking of the servers.
Also added is the struct server->adminRank field, which records the
rank that the admin has set, so that the rank can be modified
while still basing the modification on the admin-set rank.
Simon Wilkinson [Fri, 23 Oct 2009 15:34:33 +0000 (16:34 +0100)]
Don't return AOP_WRITEPAGE_ACTIVATE to write()
When we're called from write(), we don't have the option
of deferring the writing of a page by returning AOP_WRITEPAGE_ACTIVATE.
Instead, write() simply sees this as the output of 0x8000 bytes of data.
So, whilst we can mark a vcache as being output, we can't defer the
processing of one which is already being written (by, for example, an
earlier writepage()).
This problem only affects files which have mmap() and write()
called in quick succession, but it does break the fsx utility.
Change-Id: I750a186de38da9873665a862f5b584a78e6979ad
Reviewed-on: http://gerrit.openafs.org/725
Reviewed-by: Marc Dionne <marc.c.dionne@gmail.com>
Tested-by: Marc Dionne <marc.c.dionne@gmail.com>
Tested-by: Derrick Brashear <shadow@dementia.org>
Reviewed-by: Derrick Brashear <shadow@dementia.org>
Simon Wilkinson [Sat, 24 Oct 2009 09:54:32 +0000 (10:54 +0100)]
Use user credentials for Linux writepage()
We have no control over the context in which the kernel calls our
writepage routine. It may be from the process which originally wrote the
page, from any other process on the system which is writing and goes
over the dirty page threshold, or from the flush thread (pdflush /
flush-afs). Therefore, we cannot use the credentials of the current
process to perform the writeback. This is an issue both for afs_write
(which, in our current MM model, may need to contact the fileserver
to read missing chunks), and for DoPartialWrite (which needs to be
able to store chunks when the local cache is getting full)
This patch stores the credentials of the first process to open a file in
the vcache structure. Whenever writepage() is used to writeback pages
for this file, the cached credentials are used rather than those of the
current context.
Thanks to Marc Dionne for his work in testing and refining this patch.
Michael Meffie [Thu, 22 Oct 2009 19:51:33 +0000 (15:51 -0400)]
volser transaction object race conditions
Fix the transaction object races between VolMonitor and the
volume operation procedures which can cause the volume
server to crash.
Add a per transaction object mutex to safely set the
transaction call pointer and name. Fix VolMonitor to safely
traverse the transaction list and to access the call pointer
and last proc name while copying info to send to the vos
client. Fix the sleep thread to safely access the last proc
name.
FIXES 125479
Change-Id: I59595b93522d111b6a771d3d93c246bfc2ce65de
Reviewed-on: http://gerrit.openafs.org/718
Reviewed-by: Andrew Deason <adeason@sinenomine.net>
Tested-by: Andrew Deason <adeason@sinenomine.net>
Reviewed-by: Derrick Brashear <shadow@dementia.org>
Tested-by: Derrick Brashear <shadow@dementia.org>
Andrew Deason [Thu, 22 Oct 2009 16:12:30 +0000 (11:12 -0500)]
Avoid prematurely destroying callback_rxcon
Currently, h_GetHost_r and removeAddress_r can destroy the
callback_rxcon of a host. Having a NULL callback_rxcon can cause
segfaults in code that does not properly check if a host has been
HOSTDELETED before trying to use it.
Although such code is incorrect and should be fixed, we can still avoid
a segfault in those situations by not destroying callback_rxcon until we
destroy the host itself. This just prevents destroying callback_rxcon in
h_GetHost_r and removeAddress_r, leaving it to h_TossStuff_r to destroy
when it destroys the host.
Simon Wilkinson [Fri, 23 Oct 2009 11:42:19 +0000 (12:42 +0100)]
Resolve error return issues in writepage
The writepage_sync changes get error returns wrong in a couple of places. In
particular, they return a 0 code from dopartialwrite in preference to the
length return from page_writeback
Change-Id: I34a848fed5f799aa6844e9ef0339321f91c7e59b
Reviewed-on: http://gerrit.openafs.org/721
Reviewed-by: Marc Dionne <marc.c.dionne@gmail.com>
Tested-by: Simon Wilkinson <sxw@inf.ed.ac.uk>
Tested-by: Derrick Brashear <shadow@dementia.org>
Reviewed-by: Derrick Brashear <shadow@dementia.org>