Always use kbuild for all Linux kernel configure probes
Some Linux kernel probes for the existence of header files were done
with file existence checks (test -f). This breaks if the kernel build
system is stacking multiple directories of headers together with
compile-time -I include path options, as is the case for the current
Debian Linux header packages. Instead, always use kbuild to check
whether a kernel header is available.
Similarly, use AC_TRY_KBUILD instead of AC_TRY_COMPILE when checking
for an SELinux kernel, since AC_TRY_COMPILE doesn't call into kbuild
and won't get the correct kernel header paths.
This is part of the fix for Debian Bug#521745 and has been included in
the Debian package since 1.4.10+dfsg1-1.
Marc Dionne [Wed, 28 Oct 2009 21:54:32 +0000 (17:54 -0400)]
Linux - Fix disk cache access for selinux/AppArmor constrained processes
Preserve the credentials used for cache initialisation and use them
whenever disk cache files are opened. This takes advantage of the
credentials separation work from David Howells available in kernels
2.6.29 and above.
Access to cache files was done under the security context of the
user process, causing processes constrained by selinux or AppArmor to
fail to access AFS cache files and causing the cache manager to panic.
Besides the RT tickets, should also fix the following Ubuntu bugs:
415766 429260 457779 459299
The current kernel module build infrastructure relies on the ability to
create symlinks from known directory names used in the AFS code to the
actual locations of the kernel header files. This breaks if there is no
single kernel header tree and instead multiple trees are layered together
by kbuild using compile-time -I include paths.
Attempt to detect this case by seeing if linux/types.h is in the kernel
header directory where we expect it. If not, rather than creating
symlinks for h, sys, and netinet, create directories and populate them
with single-line headers that just include the corresponding linux/*.h
header. The list of headers for which to do this is generated dynamically
by analyzing the AFS kernel source code and looking for relevant #include
directives.
This patch has been part of the Debian OpenAFS package since
1.4.10+dfsg1-1. The check for whether we have layered kernel header trees
may be specific to Debian and may require modification later if other
Linux distributions do something similar.
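The header-generation step described above can be illustrated with a small Python model. This is only a sketch under stated assumptions, not the build system's actual implementation: the regular expression, the function names, and the idea of returning a path-to-contents mapping are all hypothetical, chosen to show how a list of wrapper headers could be derived from #include directives.

```python
import re

# Hypothetical pattern: matches includes such as #include "h/types.h" or
# #include <sys/param.h>, limited to the h, sys, and netinet directories
# named in the commit message.
INCLUDE_RE = re.compile(
    r'^\s*#\s*include\s*[<"](h|sys|netinet)/([\w.-]+\.h)[>"]',
    re.MULTILINE,
)

def find_wrapped_headers(source_text):
    """Return the sorted, de-duplicated (directory, header) pairs included."""
    return sorted(set(INCLUDE_RE.findall(source_text)))

def wrapper_contents(header):
    """Single-line header that forwards to the corresponding linux/ header."""
    return "#include <linux/%s>\n" % header

def plan_wrappers(source_text):
    """Map each wrapper path (e.g. 'h/types.h') to the text it should hold."""
    return {
        "%s/%s" % (directory, header): wrapper_contents(header)
        for directory, header in find_wrapped_headers(source_text)
    }
```

Given source text that includes "h/types.h", the plan would contain an entry for h/types.h whose only content is an include of linux/types.h; includes outside the three directories are ignored.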
Simon Wilkinson [Wed, 28 Oct 2009 11:12:18 +0000 (11:12 +0000)]
Make afsd.pod reflect reality
9d396c4916fdac64fcface30e6637ca6e2911203 (from 2005) introduced
autotuning for afsd, and changed some of the defaults which aren't
autotuned. Update the afsd man page to reflect the autotuning, and
the new defaults.
Andrew Deason [Wed, 28 Oct 2009 16:06:47 +0000 (11:06 -0500)]
Avoid using released hosts
Since h_Release_r has the possibility of freeing a host, we should not
be using a host after it has been released. A few places can still use a
released host, potentially causing heap corruption, double frees, and
generally weird behavior.
So either move calls of h_Release_r until after we finish using a host,
or make sure to set the pointer to NULL after it has been released.
Reviewed-on: http://gerrit.openafs.org/747
Tested-by: Andrew Deason <adeason@sinenomine.net>
Reviewed-by: Dan Hyde <drh@umich.edu>
Reviewed-by: Derrick Brashear <shadow@dementia.org>
(cherry picked from commit 416e2f11c35f5d55f91090b30b4db1a9bf6d6e07)
Andrew Deason [Mon, 26 Oct 2009 19:04:48 +0000 (14:04 -0500)]
Dec old special inodes in inode convertROtoRW
The convertROtoRW code for the inode fileserver makes copies of the
volume's special inodes, but leaves the old (RO) inodes around. If the
RO is created again, this will result in duplicate special inodes for
the same volume, which freaks out the salvager (and possibly other
things).
So IH_DEC the old RO special inodes after converting, so they go away.
Reviewed-on: http://gerrit.openafs.org/735
Tested-by: Andrew Deason <adeason@sinenomine.net>
Tested-by: Derrick Brashear <shadow@dementia.org>
Reviewed-by: Derrick Brashear <shadow@dementia.org>
(cherry picked from commit dbe3b7b8eeb4a010f82248befc6167b3b5ed9606)
Marc Dionne [Sun, 1 Nov 2009 21:03:17 +0000 (16:03 -0500)]
Linux: Avoid deadlock in readdir - release GLOCK for filldir
The GLOCK is held while calling the filldir function in afs_linux_readdir().
If this function causes a page fault, and in particular if this fault
involves AFS, we're in trouble as we'll eventually deadlock in the
readpage code.
A simple test case for this is to call the getdents syscall on an
AFS directory with a buffer that is part of an mmaped AFS file.
This is already the case in the master branch; the change was part of
the merge of the NFS translator code.
FIXES 125555
Change-Id: I829838e45f94921d22335154587216f7842e3955
Reviewed-on: http://gerrit.openafs.org/760
Tested-by: Marc Dionne <marc.c.dionne@gmail.com>
Reviewed-by: Simon Wilkinson <sxw@inf.ed.ac.uk>
Reviewed-by: Derrick Brashear <shadow@dementia.org>
Marc Dionne [Thu, 17 Sep 2009 20:57:52 +0000 (16:57 -0400)]
Linux: 2.6.32 - Adapt to writeback changes
Adapt to the writeback changes in kernel 2.6.32
- Since we define our own backing_dev, it needs to be registered with
the writeback code and attached to the super_block. Otherwise it
might get ignored when writeback is needed.
- Each backing_dev now gets its own kernel thread. The name of the
thread is based on the registered name - the openafs one will appear
as "flush-afs".
Andrew Deason [Thu, 22 Oct 2009 16:12:30 +0000 (11:12 -0500)]
Avoid prematurely destroying callback_rxcon
Currently, h_GetHost_r and removeAddress_r can destroy the
callback_rxcon of a host. Having a NULL callback_rxcon can cause
segfaults in code that does not properly check if a host has been
HOSTDELETED before trying to use it.
Although such code is incorrect and should be fixed, we can still avoid
a segfault in those situations by not destroying callback_rxcon until we
destroy the host itself. This just prevents destroying callback_rxcon in
h_GetHost_r and removeAddress_r, leaving it to h_TossStuff_r to destroy
when it destroys the host.
Andrew Deason [Tue, 20 Oct 2009 17:43:42 +0000 (12:43 -0500)]
HPUX: Do not sigwait on critical signals
On HPUX, it is possible for 'critical' signals such as SEGV, ABRT, etc
to be delivered to the softsig thread when we sigwait(). The current
code marks these as 'fatal' and just exit(0)s when they are received,
preventing us from getting cores in the case of a SEGV, ABRT, etc.
To work around this and keep behavior on other platforms the same, just
do not wait on 'critical' signals on HPUX in the softsig thread.
Reviewed-on: http://gerrit.openafs.org/693
Tested-by: Andrew Deason <adeason@sinenomine.net>
Reviewed-by: Derrick Brashear <shadow@dementia.org>
(cherry picked from commit bf9c51a4e13b3e621b99866e9be53c8fe35a39fe)
Reviewed-on: http://gerrit.openafs.org/702
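The workaround in the HPUX commit above amounts to leaving the 'critical' signals out of the set that the softsig thread waits on. A minimal Python model of that selection follows. It is a sketch, not the OpenAFS code: the function name and the is_hpux flag are hypothetical, and the critical set lists only SEGV and ABRT because those are the only signals the commit message names.

```python
import signal

# Assumption: the commit message names only SEGV and ABRT ("etc."), so the
# real list of critical signals may well be longer than this.
CRITICAL_SIGNALS = frozenset({signal.SIGSEGV, signal.SIGABRT})

def softsig_wait_set(handled, is_hpux):
    """Return the set of signals the softsig thread should wait on.

    On HP-UX the critical signals are removed so that they keep their
    default action (and can still produce a core); elsewhere the set is
    returned unchanged.
    """
    handled = set(handled)
    if is_hpux:
        return handled - CRITICAL_SIGNALS
    return handled
```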
Andrew Deason [Thu, 24 Sep 2009 17:02:55 +0000 (12:02 -0500)]
Use f_bsize for ZFS afs_fsfragsize
On ZFS, the disk space a file uses can be rounded up to the next
recordsize boundary if the file has been truncated. This can cause the
Unix CM to misestimate cache usage, since it truncates files fairly
often and assumes the disk space used is the file length rounded up to
the next f_frsize.
Since the ZFS recordsize is available via the statvfs f_bsize, just
round up to that instead. There is still some additional file metadata
that takes up some additional space on disk, but according to ZFS people
I've spoken to about this, it cannot be known in advance. In practice,
the additional metadata storage doesn't appear to exceed about 10% of
the data storage, which should be acceptable.
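The rounding described above can be sketched in Python using os.statvfs, which exposes the same f_bsize field. This is an illustrative model only: round_up and estimated_usage are hypothetical names, and the real cache manager does this in C inside the kernel module.

```python
import os

def round_up(length, unit):
    """Round length up to the next multiple of unit (unit must be > 0)."""
    return ((length + unit - 1) // unit) * unit

def estimated_usage(length, path="."):
    """Estimate on-disk usage of a file of the given length.

    Rounds up to the filesystem's f_bsize, as the commit does for ZFS.
    Per-file metadata overhead is deliberately not modelled here.
    """
    st = os.statvfs(path)
    return round_up(length, st.f_bsize)
```

With a 128 KiB record size, for example, a 1-byte file would be counted as 131072 bytes rather than as one f_frsize fragment.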
Andrew Deason [Tue, 13 Oct 2009 16:20:51 +0000 (11:20 -0500)]
Remove extra arguments to afs_syscall_call
Someone appears to have mistaken afs_syscall_call for
afs_syscall_pioctl, and passed rvp and CRED to afs_syscall_call on
Solaris (afs_syscall_call always takes 6 arguments; afs_syscall_pioctl
takes various arguments depending on the platform). After
b1ff4a0b1115f5739c0365cc963189b1f931971f, this breaks the client build
on AFS_SUN5_ENV, but only because that commit added prototypes for
afs_syscall_call; the call appears to have always been wrong, but it
was never noticed.
Andrew Deason [Wed, 7 Oct 2009 17:14:11 +0000 (12:14 -0500)]
Make namei convertROtoRW'd volumes usable
Right now, if you convertROtoRW a volume on namei, the converted volume
appears to need a salvage before it is usable, and the header of the old
(now empty) RO volume is kept around. Fix this:
-- Set inUse = 0 on the converted volume, so the fileserver will be
able to attach the volume when we give it back
-- Unlink the RO header file, instead of trying to unlink the
VI_VOLINFO file twice
-- Log the actual error code (errno) in the error message for the last
unlink
Andrew Deason [Mon, 21 Sep 2009 21:57:01 +0000 (16:57 -0500)]
Implement _PC_FILESIZEBITS for solaris pathconf
Using recent NFS clients and servers with the translator under Solaris
causes AFS to be queried for the _PC_FILESIZEBITS pathconf value. Right
now we don't implement it and return EINVAL, causing at least some
modern NFS clients to be unable to mount AFS via the translator on at
least some modern NFS servers.
So, return _PC_FILESIZEBITS as either 32 or 64, depending on whether we
are a 64-bit client or not.
Russ Allbery [Sat, 22 Aug 2009 01:37:41 +0000 (18:37 -0700)]
Add automatic sysname detection for ARM Linux
Add arm*-linux* to the case statement that attempts to automatically
determine the AFS sysname, similar to the other Linux sysname
determination cases.
afs_InitReq fails to initialize the "flags" field of the vrequest structure.
Consequently the logic involving (flags & O_NONBLOCK) in afs_Analyze leads to
unpredictable results. afs_InitReq should initialize the complete vrequest
structure.
Andrew Deason [Thu, 3 Sep 2009 19:43:28 +0000 (14:43 -0500)]
Update accessDate on volume access
Right now accessDate is simply never updated, so the last access time
for a volume is never reported. Simply update the field in
VBumpVolumeUsage_r, so we track the last time the volume was accessed.
Note that this does not increase disk writes to the volume header; the
performance impact is effectively nil.
Andrew Deason [Mon, 20 Jul 2009 17:31:44 +0000 (12:31 -0500)]
Add additional vlprocs safety checks
This adds additional safety checks to the vlserver's implementation of
the VL_CreateEntry, VL_ReplaceEntry, and VL_UpdateEntry RPCs. Now in all
three of these, any new volume ID that would be added to the VLDB or
that would be newly referenced in a VLDB entry is checked against
duplication in other entries. Additionally, any new volume names added
to the VLDB (either by creation, or modifying an existing volume) are
checked against duplication. This should make it impossible for clients
to make a volume ID or volume name correspond to multiple volume groups
(either conceptually or literally in the vldb).
This also alters the vlserver's implementation of the VL_GetNewVolumeId
RPC such that the vlserver increments maxvolid until the range of volume
IDs [*newvolumeid, *newvolumeid+bumpcount) is unused. 'vos' is modified
to only allocate one new volume id at a time, so we don't skip over
potentially-usable vol ids.
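The revised VL_GetNewVolumeId behaviour can be modelled in a few lines of Python. This is a sketch under assumptions, not the vlserver code: used_ids stands in for whatever lookup the VLDB actually performs, and the function name and return shape are hypothetical.

```python
def get_new_volume_id(maxvolid, bumpcount, used_ids):
    """Find an unused block of bumpcount consecutive volume IDs.

    Starting at maxvolid, advance one ID at a time until every ID in
    [newvolumeid, newvolumeid + bumpcount) is absent from used_ids.
    Returns (newvolumeid, new_maxvolid).
    """
    newvolumeid = maxvolid
    while any((newvolumeid + i) in used_ids for i in range(bumpcount)):
        newvolumeid += 1
    return newvolumeid, newvolumeid + bumpcount
```

The model also shows why allocating one ID at a time skips fewer usable IDs: a request for a larger block has to step past any gap that is too small to hold the whole range.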
Andrew Deason [Mon, 6 Jul 2009 15:29:20 +0000 (10:29 -0500)]
Allow specifying vos create/addsite volume IDs
This adds the -id option to 'vos create', and the -roid option to 'vos
create' and 'vos addsite'. This allows the user to manually specify the
volume IDs that a new RW or RO volume will get (or explicitly specify
that an RO volume ID should be unset), instead of always relying on the
volume IDs retrieved from the vlserver.
Andrew Deason [Tue, 14 Jul 2009 16:29:01 +0000 (11:29 -0500)]
Ignore SIGSYS when issuing pioctl syscall
Ignore SIGSYS when we issue the pioctl syscall, so we don't dump core
when the kernel module hasn't yet been installed on several platforms.
Also, restore the old SIGSYS signal handler afterwards, so we don't
cause any side-effects.
Reviewed-on: http://gerrit.openafs.org/81
Verified-by: Andrew Deason <adeason@sinenomine.net>
Verified-by: Derrick Brashear <shadow@dementia.org>
Reviewed-by: Russ Allbery <rra@stanford.edu>
(cherry picked from commit 4f36dd089a9c7187f94f77516a486245c057f7f4)
Reviewed-on: http://gerrit.openafs.org/274
Tested-by: Andrew Deason <adeason@sinenomine.net>
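The SIGSYS commit above follows an ignore-then-restore pattern, which can be sketched in Python with the standard signal module. This is a model of the pattern only, assuming a Unix platform and the main thread; sigsys_ignored is a hypothetical name, and the real change is made in C around the pioctl syscall.

```python
import signal
from contextlib import contextmanager

@contextmanager
def sigsys_ignored():
    """Ignore SIGSYS inside the block, then restore the previous handler."""
    old_handler = signal.signal(signal.SIGSYS, signal.SIG_IGN)
    try:
        yield
    finally:
        # signal.signal() returns None when the previous handler was not
        # installed from Python; in that case there is nothing to restore.
        if old_handler is not None:
            signal.signal(signal.SIGSYS, old_handler)
```

A caller would wrap only the syscall attempt in "with sigsys_ignored():", so that the previous disposition is back in place as soon as the call returns.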
ktc_curpag isn't specific to a Kerberos v4 environment, so move it outside
the AFS_KERBEROS_ENV #ifdef. Add it to the auth.h header and to the
exports from the shared libafsauthent.
Make afs_GetVolume() correctly handle requests for fids which are in
the dynroot cell but are not the root of the dynamic root volume. This
is necessary to allow dynamic root mount points to be looked up and
followed in situations where the dynroot volume is not in the volcache,
but its root vnode is in the vcache.
Simon Wilkinson [Tue, 21 Jul 2009 23:04:24 +0000 (00:04 +0100)]
Remove the RCSID macro
The move to git means that we can no longer populate the RCSID
macro in the way that it was used with CVS. This patch simply
removes the macro from every file, except where it contains
information from upstream (and it's in a comment).
Simon Wilkinson [Mon, 6 Jul 2009 13:38:42 +0000 (14:38 +0100)]
Revise git ignore files
Revise our git ignores to match the current state of the tree, and include
entries in the top level for all of the 'dest' directories for all of the
architectures we claim to support.
Rework all Linux vnode ops so that the vulnerability we previously had
can't recur later just because someone makes a change that would leak a
negative error.
Call the inode's setattr op, when one is available, instead of just
inode_setattr. This is needed for XFS; notably, it will also cause truncates
to be journalled for ext3, which may solve some existing issues.
curpag needs to know about kernel constructs (getpagvalue on AIX, one group
versus two groups on Linux), and on AIX 5.1 it simply can't work. Add a new
pioctl and use it to simply ask the kernel what the current PAG is.
Updates to chapter one of the Admin Guide. Remove references to the
Authentication Server, add references to a Kerberos server, revise ntpd
parts to reflect the fact that OpenAFS doesn't ship ntpd, and remove
the distinction between the US and non-US versions of the Update Server.
rxi_Findcbi, rxi_FindIfnet, and rxi_FindIfMTU "failure" cases end up
returning RX_REMOTE_PACKET_SIZE as the MTU to use unless we allow our
override to apply, so do that. Then add an afsd switch to allow setting it.
The required afsd man page update will follow.
====================
This delta was composed from multiple commits as part of the CVS->Git migration.
The checkin message with each commit was inconsistent.
The following are the additional commit messages.
====================
LICENSE IPL10
FIXES 124880
Avoid either reopening closed vnodes and leaving cached descriptors around,
or discarding a reference we're not holding. Instead, sync changes when the
fd is closed, and note that this has been done. Otherwise, no changes from
the older code.
Replace version info in the DocBook files with a new ENTITY "version"
associated with a local "version.xml" file which contains a <revision>
tag for the current release.
The version.xml file should be autogenerated by the Makefile system.
Standardize the UNIX Makefiles for all of the DocBook guides. Remove the
rest of the generated files and switch to xsltproc and dblatex for the
document generation in all cases. Fix a few DocBook errors by removing
the contents of the <index> tag and removing the unknown <pubsnumber> tag
in the <revision> field.
Use dblatex to build PDF documentation instead of docbook2html, and use
xsltproc to build HTML instead of docbook2html. Remove all the index
generation logic, since dblatex and xsltproc handle that automatically.
Remove the contents of the <index> tag in the source, since neither program
requires it to contain anything.
Remove the style sheets and configuration that were used for docbook2*.