Russ Allbery [Wed, 5 May 2010 04:53:44 +0000 (21:53 -0700)]
Fix handling of the afsd.fuse.8.gz man page link
The automatic fixing of the link to point to the compressed page didn't
work when it was in a separate package, so manually create the link
rather than installing the link created by upstream.
Russ Allbery [Wed, 5 May 2010 03:55:45 +0000 (20:55 -0700)]
Move the FUSE client into a separate package
* Move the experimental afsd.fuse AFS FUSE client into openafs-fuse to
avoid adding a FUSE dependency in openafs-client. Document its
current limitations in the package description.
Russ Allbery [Tue, 4 May 2010 23:26:15 +0000 (16:26 -0700)]
Check if AFS is mounted before killing processes with open files
* Skip killing processes with files open in AFS if AFS does not appear
to be mounted according to /etc/mtab. Otherwise, we may call lsof
without a specific mount point and kill far more processes than we
intend to. (This code is disabled by default, so this problem would
only be seen by people who enabled it.)
Russ Allbery [Tue, 4 May 2010 22:06:52 +0000 (15:06 -0700)]
In the openafs-client init script, don't assume AFS is mounted at /afs
* In the openafs-client init script, don't assume that AFS is mounted on
/afs when unmounting it or killing processes with AFS files open.
Instead, parse the output from mount to find the AFS mount point.
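The two init script changes above (find the real mount point, and do nothing if AFS is not mounted) can be sketched as follows. This is an illustration in Python rather than the shell of the actual init script; the line shape of the `mount` output and the function names are assumptions, not taken from the package.

```python
import re

# Assumed shape of a Linux `mount` output line:
#   "<device> on <dir> type <fstype> (<options>)"
_MOUNT_LINE = re.compile(r"^(?P<dev>\S+) on (?P<dir>.+) type (?P<fstype>\S+) \(")


def find_afs_mount_point(mount_output):
    """Return the directory where a filesystem of type 'afs' is mounted, or None."""
    for line in mount_output.splitlines():
        match = _MOUNT_LINE.match(line)
        if match and match.group("fstype") == "afs":
            return match.group("dir")
    return None


def lsof_command(mount_output):
    """Return the lsof argument list, or None when AFS does not appear mounted."""
    mount_point = find_afs_mount_point(mount_output)
    if mount_point is None:
        # Without a specific mount point, lsof would report far more than AFS files.
        return None
    return ["lsof", mount_point]
```

Returning None instead of a default path keeps the caller from ever running lsof without a mount point argument.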
Russ Allbery [Tue, 4 May 2010 21:27:36 +0000 (14:27 -0700)]
Preserve AFS mount point and cache directory in cacheinfo
* Preserve the AFS mount point and cache directory set in
/etc/openafs/cacheinfo if the file already exists rather than
overwriting them with the defaults. Thanks, Liam Healy.
(Closes: #580077)
Simon Wilkinson [Thu, 22 Apr 2010 21:24:11 +0000 (22:24 +0100)]
Unix: Modify disk cache versioning
This change increments the disk cache version number, and adds a
structure size record to the disk cache header. All old disk caches
will be replaced when the client is started.
With the various changes made to unify our file handles, and to
support large file handles on Linux, the size of the 'fcache'
structure was modified earlier in the 1.5 series. However, fcache
is also the building block of the CacheItems file, so these changes
inadvertently broke users upgrading from 1.4. In addition, as the
disk cache inode is now a union of many different structures, the
structure size is now potentially volatile across both kernel and
OpenAFS revisions.
Up the version number so old disk caches are invalidated and won't
crash users who are upgrading. Also take the opportunity to add an
item to the header which stores the size of struct fcache used
by the disk cache. If the size on disk doesn't match that expected
by the kernel module, truncate the cache and start again.
Change-Id: I2ee8863d0bfaaaba34272c9e139638e17669a53e
Reviewed-on: http://gerrit.openafs.org/1811
Reviewed-by: Andrew Deason <adeason@sinenomine.net>
Reviewed-by: Marc Dionne <marc.c.dionne@gmail.com>
Reviewed-by: Derrick Brashear <shadow@dementia.org>
Tested-by: Derrick Brashear <shadow@dementia.org>
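The header check described in this commit might look roughly like the sketch below. It is written in Python for brevity; the header layout, the magic number, the version value, and all names are placeholders invented for illustration, not the real CacheItems format.

```python
import struct

# Hypothetical header layout: magic, cache version, size of one cache entry.
HEADER = struct.Struct("<III")
MAGIC = 0x12345678   # placeholder, not the real OpenAFS magic number
CACHE_VERSION = 2    # placeholder; incremented whenever the entry layout changes


def make_header(entry_size):
    """Build a header recording the entry size the writer was compiled with."""
    return HEADER.pack(MAGIC, CACHE_VERSION, entry_size)


def cache_is_usable(header_bytes, expected_entry_size):
    """True only if magic, version, and recorded entry size all match."""
    if len(header_bytes) < HEADER.size:
        return False
    magic, version, entry_size = HEADER.unpack_from(header_bytes)
    return (magic == MAGIC
            and version == CACHE_VERSION
            and entry_size == expected_entry_size)
```

A caller that gets False would truncate the cache and start again rather than try to interpret entries of the wrong size.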
Andrew Deason [Thu, 22 Apr 2010 16:54:06 +0000 (11:54 -0500)]
Update nextVnodeUnique before checking inUse
When attaching a volume, update the nextVnodeUnique field for the
volume, before we do any checks on the volume; for example, checking
inUse, which may result in a demand-salvage if we are running DAFS.
If we do not do this, we can schedule a demand-salvage without setting
nextVnodeUnique, and VUpdateVolume_r will update the volume header
uniquifier to nextVnodeUnique+200, when nextVnodeUnique is not set.
So, we will always set the uniquifier to 200. Fortunately, the salvage
should usually fix the uniquifier anyway.
So, set nextVnodeUnique before doing any of those checks, to avoid
screwing up the uniquifier when taking the volume offline.
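A toy model of the ordering problem, assuming the in-memory field starts at zero and is normally seeded from the on-disk uniquifier (both assumptions; the function names are hypothetical and this is Python, not the volume package's C):

```python
UNIQUIFIER_SLACK = 200  # the "+200" mentioned in the commit message


def header_uniquifier(next_vnode_unique):
    """Value written to the volume header when the volume is updated."""
    return next_vnode_unique + UNIQUIFIER_SLACK


def attach_then_offline(disk_uniquifier, set_before_checks):
    """Return the uniquifier written back when an early check takes the volume offline."""
    next_vnode_unique = 0  # in-memory field is unset at the start of attach
    if set_before_checks:
        next_vnode_unique = disk_uniquifier
    # ... the inUse check fails here and a demand-salvage is scheduled ...
    return header_uniquifier(next_vnode_unique)
```

With the old ordering, `attach_then_offline(5000, False)` writes 200; with the field set first, it writes 5200.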
Andrew Deason [Thu, 22 Apr 2010 18:21:52 +0000 (13:21 -0500)]
Prefer EndCall errors in StoreMini
Partially revert b1eb6a7a3f80500f0187cc6a1dd2013e1a5e154a, so we do
not mask the rx_EndCall error with a EndRXAFS_StoreData error (for
example, if EndRXAFS_StoreData returns RXGEN_CC_UNMARSHAL, and
rx_EndCall returns VBUSY). We need to agree on how to do this
throughout the tree, but for now, just fix StoreMini.
Change-Id: I4913946089fd0857506d9186f85c5c8115a5b95d
Reviewed-on: http://gerrit.openafs.org/1808
Tested-by: Andrew Deason <adeason@sinenomine.net>
Reviewed-by: Simon Wilkinson <sxw@inf.ed.ac.uk>
Reviewed-by: Derrick Brashear <shadow@dementia.org>
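The precedence this commit asks for can be stated as a small function. This is only a sketch of the rule in Python; the function name is hypothetical and the two numeric constants are included purely so the example has distinct values to work with.

```python
# Illustrative values only; consult the rx and volume headers for the real definitions.
RXGEN_CC_UNMARSHAL = -451
VBUSY = 110


def final_error(end_store_code, end_call_code):
    """Prefer the rx_EndCall result; fall back to the EndRXAFS_StoreData result."""
    if end_call_code != 0:
        return end_call_code
    return end_store_code
```

In the example from the message, `final_error(RXGEN_CC_UNMARSHAL, VBUSY)` yields VBUSY, so the caller sees the busy volume rather than the unmarshalling failure it caused.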
Simon Wilkinson [Thu, 22 Apr 2010 16:56:25 +0000 (17:56 +0100)]
Linux: RedHat packaging updates for RHEL6
Update our bundled spec file and related tools so they can be used
to build OpenAFS on the RHEL6 beta.
- Make kmodtool recognise el6 as having "modern" kernel naming
conventions
- Replace %{PACKAGE_VERSION} (which seems to have disappeared)
with the standard %{version} macro
Thanks to billings and phalenor on IRC for their testing efforts.
Fix UCONTEXT detection on ppc_linux26 via include order
param.linux26.h defines USE_UCONTEXT for all Linux platforms for
glibc 2.4 and higher, but it does this by testing __GLIBC__ and
__GLIBC_MINOR__. These are defined by features.h, which is included
by any system header. At least one system header must be included
before those are defined. lwp/process.c was including <afsconfig.h>
and <afs/param.h> before any other headers, leading to those macros
being undefined. Most of the Linux architectures either have their
own implementation or were explicitly defining USE_UCONTEXT in the
per-architecture param file, but ppc_linux26 was relying on the
default.
Fix this by reordering the includes to include the various system
headers before <afs/param.h> and add a comment explaining why.
This previously worked in earlier versions because the old
param.ppc_linux26.h file included <afs/afs_sysnames.h>, which
included "stds.h", which included a system header prior to the check
for ucontext. The new generic param file reverses that order.
Andrew Deason [Wed, 21 Apr 2010 17:41:21 +0000 (12:41 -0500)]
Recover from afs_GetVolSlot errors
afs_GetVolSlot can panic in a few different ways, such as failing to
read from or write to VolumeInfo. Instead of panicking, return an
error to the application. Adjust callers to deal with getting a NULL
volume returned.
Add RFC 5864 to the protocol documentation directory
Add a copy of RFC 5864 (DNS SRV Resource Records for AFS) to the
protocol documentation directory for reference. As permitted by
the IETF Trust License Policy section 3(e), I release this document
under the MIT/X Consortium license included in this copy of the
document.
If --enable-fuse-client is passed to configure and afsd.fuse is built,
install it into the same directory as afsd and install afsd.fuse.8 as a
symlink to the afsd.8 man page. Add documentation of afsd.fuse to the
afsd man page.
Andrew Deason [Sun, 18 Apr 2010 23:49:18 +0000 (18:49 -0500)]
Add documentation for fs callback xstats
Change I572ff682de4cc7ef27bb46dd028d3d797b873841 added the fileserver
callback xstats collection to afsmonitor. Provide some documentation
for these fields, along with the other fields displayed by afsmonitor.
Remove special-case call sequence for KAM_SetPassword on s390
For Linux s390 (but not s390x), an additional argument was passed
to KAM_SetPassword between the kvno and the encryption key. This
doesn't seem to match the rest of the code and is now, with stricter
prototyping, preventing the code from compiling. Remove it and use
the same call sequence on s390 as everywhere else.
Add a caution explaining how the file server addresses are registered
and pointing users at NetInfo and NetRestrict plus restarting the file
server for the normal case.
Mention what version of OpenAFS introduced this command. Drop the note
about the version of OpenAFS that added the -encrypt flag, since the
whole command is newer than that.
Reference vos listaddrs -printuuid specifically to get the UUID.
General formatting and wording cleanup: use terminology more consistently,
continue a long example line, wrap long lines, fix a spelling error, and
add cross-references to NetInfo and NetRestrict.
* Removed the logic to set $SMP based on CONFIG_SMP from
/boot/config-$kernelver
* When using --with-linux-kernel-packaging in the configure line, dkms
no longer needs MPS=$SMP in the make line.
Andrew Deason [Mon, 19 Apr 2010 19:48:14 +0000 (14:48 -0500)]
Use AC_PREREQ
We use AC_USE_SYSTEM_EXTENSIONS, which was introduced in autoconf
2.60. To allow for less confusion and perhaps a clearer error
message, specify that we require using at least that version.
Remove newly imported upstream directory in import-upstream
import-upstream was leaving behind a directory from unpacking the
upstream source, which under some circumstances was getting imported
into the upstream branch (if, for instance, the script was run twice
in succession). Make sure it's gone with rm -rf.
* Add build dependency on libfuse-dev so that the new FUSE afsd is
built. Install afsd.fuse into the openafs-client package for the time
being. There is, as yet, no documentation or init script support for
the FUSE implementation.
Windows: Preserve volume location info in case of comm fail
The cache manager refreshes volume location information every
two hours. If during a refresh the communication with the
vldb server fails, the previously known volume location information
should continue to be used.
The previous behavior in which the volume location information
is discarded first and then the update is performed can result
in unnecessary client failures when a temporary disruption in
communication with the vldb server occurs. Instead, wait until
we have a successful response from the vldb server before the
previous server list is discarded.
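The "update first, discard afterwards" ordering can be sketched as below. This is a Python illustration, not the Windows cache manager code; the class, the `lookup` callable, and the use of OSError to model a communication failure are all assumptions.

```python
class VolumeLocation:
    """Holds the last known server list for one volume."""

    def __init__(self, servers):
        self.servers = list(servers)

    def refresh(self, lookup):
        """Replace the server list only after `lookup` returns successfully."""
        try:
            new_servers = lookup()
        except OSError:
            # Communication failure: keep using the previously known list.
            return False
        self.servers = list(new_servers)
        return True


def unreachable_vldb():
    """Hypothetical lookup that models a vldb server that cannot be reached."""
    raise OSError("vldb server unreachable")
```

After `VolumeLocation(["fs1", "fs2"]).refresh(unreachable_vldb)` the object still lists fs1 and fs2, which is the behavior the commit describes.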
The idle error value (if any) is stored in the cm_req_t object.
Since the value is never cleared, the same value can be returned
for all requests that make use of the same cm_req_t object.
Change the behavior to only return an idle error once and then
clear it.
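A minimal sketch of "return once, then clear", in Python with a stand-in class (the method name is hypothetical; only the idle error field of cm_req_t is modeled):

```python
class Request:
    """Stand-in for cm_req_t; only the idle error field is modeled."""

    def __init__(self):
        self.idle_error = 0

    def take_idle_error(self):
        """Return the stored idle error and clear it so it is reported only once."""
        code, self.idle_error = self.idle_error, 0
        return code
```

A second call on the same object returns 0, so later requests that reuse the object do not inherit a stale error.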
Rx: make conn_call_lock and conn_data_lock usage consistent
The rx_connection.flags field is protected by the conn_data_lock
but the conn_data_lock is not held everywhere the conn flags
field is altered. This produces a race that can result in a
deadlock when waiter flags are inadvertently prevented from being
cleared.
The conn_call_lock usage in rx_EndCall which was removed in
Change e169708681eb1bbbb31951b95f68e861a4b01c7e must be restored.
If rx_EndCall never obtains the conn_call_lock it can't ensure
that the thread in rx_NewCall actively checking the calls will
not end up blocking when there is now a call channel that can
be reused. This usage of conn_call_lock can be removed only
if a true producer/consumer model is implemented.
Windows: cm_UpdateCell must hold cell lock across server random
cm_UpdateCell fails to hold the cell lock across the server
randomization. As a result the vlserver list can be destroyed
while randomization is taking place.
Windows: CM_SCACHESYNC_STOREDATA for non-files have no buffers
Do not add QData objects with null cm_buf_t pointers to the
cm_scache_t bufWritep queue when synchronizing directory changes.
If a callback is required while the directory change is being
pushed it can result in a deadlock.
Windows: define new event log messages for cm_Analyze VBUSY, VRESTARTING, etc.
Add MSG_SERVER_REPORTS_VBUSY, MSG_SERVER_REPORTS_VRESTARTING,
MSG_ALL_SERVERS_BUSY, MSG_ALL_SERVERS_OFFLINE,
and MSG_ALL_SERVERS_DOWN.
Add event message throttling. Only permit one copy of a message
to be generated every five seconds if the message will duplicate
the prior message. This often occurs when a server or volume becomes
inaccessible and there were a large number of requests queued on it.
Integrate these new messages into cm_Analyze processing for VBUSY,
VRESTARTING, ALLDOWN, ALLOFFLINE, and ALLBUSY errors.
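The throttling rule (at most one copy every five seconds when a message duplicates the prior one) could be sketched like this. It is Python rather than the Windows client's C, and the class and parameter names are invented for the example; the clock is injectable so the behavior can be checked without sleeping.

```python
import time


class EventThrottle:
    """Suppress a message that repeats the prior one within `interval` seconds."""

    def __init__(self, interval=5.0, clock=time.monotonic):
        self.interval = interval
        self.clock = clock
        self.last_message = None
        self.last_time = None

    def should_log(self, message):
        now = self.clock()
        duplicate = (message == self.last_message
                     and self.last_time is not None
                     and now - self.last_time < self.interval)
        if duplicate:
            return False
        # Remember only messages that were actually emitted.
        self.last_message = message
        self.last_time = now
        return True
```

Because suppressed duplicates do not reset the timer, a burst of identical messages still produces one event per interval instead of none at all.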
Windows: wait for I/O on buffers to complete in cm_SetupStoreBIOD
cm_SetupStoreBIOD constructs a list of dirty buffers for a file
that are to be written to the file server. If, while determining
the first dirty buffer for the list, we come across a buffer that is
already actively involved in an I/O operation, call buf_WaitIO() to
wait until the buffer is no longer busy before continuing. This
reduces lock contention and synchronization conflicts.
Windows: split cm_buf_t.flags field to ensure proper locking
It turns out that for all these years the locks protecting
the cm_buf_t flags field have been racy. Some of the flags
were protected by the cm_buf_t mutex and others by the
buf_globalLock. This patchset splits the flags field so that
the appropriate lock will be used exclusively to protect a
common set of flags.
We could mask the mode setting on symlinks; however, it would be nice
to change the fileserver to allow mode setting on symlinks in some
(safe) cases. Preserve our ability to do so.
Harald Barth [Fri, 16 Apr 2010 05:45:35 +0000 (01:45 -0400)]
Add vos setaddrs command and man page
The vos setaddrs command sets the IP addresses for a server entry
in the Volume Location Database (VLDB). Specify one or several hosts.
All existing hosts in the VLDB entry are replaced with the new entries
on the command line.
The src/config/param.rs_aix61.h source file was stored
in the repo with CR-LF end of line. This is causing
problems for Windows Git which converts CR-LF to LF
for storage in the repo.
Simon Wilkinson [Thu, 15 Apr 2010 19:52:11 +0000 (20:52 +0100)]
Tidy up UKERNEL includes
UKERNEL is just another userspace build - there's no need to
maintain completely separate header file lists in each object file
for "userspace" and "ukernel". Tidy this up to improve the
readability of these sections of code.
Change-Id: I69f476a0b8aae1204cd4207c7c656ec7e07184df
Reviewed-on: http://gerrit.openafs.org/1758
Tested-by: Andrew Deason <adeason@sinenomine.net>
Reviewed-by: Andrew Deason <adeason@sinenomine.net>
Reviewed-by: Derrick Brashear <shadow@dementia.org>
Tested-by: Derrick Brashear <shadow@dementia.org>
Thread safety in rx_NewCall requires that only one thread be
actively allocating or recycling a call at a time. Since we are
no longer holding the conn_call_lock across the entire transaction
we need to have another synchronization mechanism. Add a new
rx_connection flag, RX_CONN_MAKECALL_ACTIVE, which when set indicates
that a thread is actively obtaining a call. If any other threads
see this flag set, they will wait until being signalled that the
thread has completed its activity.
In addition, because the call->lock may be dropped when processing
rxi_ResetCall(), we must hold a reference to the call once we
begin using it. Otherwise, the call may be garbage collected
behind our back.
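The flag-plus-wait scheme described above resembles a one-slot gate. The sketch below shows that pattern in Python with a single condition variable; it is a simplification that does not model rx's separate conn_call_lock and conn_data_lock or the call reference counting, and the class and method names are hypothetical.

```python
import threading


class Connection:
    """Toy connection that lets only one thread obtain a call at a time."""

    def __init__(self):
        self._cond = threading.Condition()
        self.makecall_active = False  # stands in for RX_CONN_MAKECALL_ACTIVE

    def begin_new_call(self):
        with self._cond:
            while self.makecall_active:
                self._cond.wait()     # another thread is obtaining a call
            self.makecall_active = True

    def end_new_call(self):
        with self._cond:
            self.makecall_active = False
            self._cond.notify_all()   # signal waiters that the activity completed
```

The flag rather than the lock marks the critical activity, so the lock itself is held only briefly while the flag is tested or changed.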
A new contact to a fileserver can trigger an InitCallBackStateN RPC
to us, which our agent will need afs_xserver to handle. Don't hold it;
we only need it to fill in capabilities. Racing here is OK.
Change-Id: Ie0aaea3ab462e421bd31ba3b703d8cd0cb0d61df
Reviewed-on: http://gerrit.openafs.org/1754
Reviewed-by: Marc Dionne <marc.c.dionne@gmail.com>
Tested-by: Marc Dionne <marc.c.dionne@gmail.com>
Reviewed-by: Andrew Deason <adeason@sinenomine.net>
Tested-by: Andrew Deason <adeason@sinenomine.net>
Reviewed-by: Derrick Brashear <shadow@dementia.org>
Tested-by: Derrick Brashear <shadow@dementia.org>
Autogenerate a Debian changelog for correct package versioning
The Debian packaging infrastructure takes the package version from the
most recent entry in the changelog file. Change the changelog file to
a template and add an entry to the top that will be set to the current
version of OpenAFS, with a Debian revision that will sort before any
official package.
Andrew Deason [Tue, 6 Apr 2010 22:07:33 +0000 (17:07 -0500)]
Use afsd code in libuafs
Share the same CM code for the kernel client as in libuafs, so we
don't duplicate code for initializing the cache and other things. In
order to do this:
- Remove some libuafs global variables that share name and
functionality with those in afsd, and declare some static
- Remove uafs_Init(), and move the ukernel-specific code in it to
osi_Init(); replace with uafs_Setup(), uafs_ParseArgs(), and
uafs_Run(), which just call into afsd functions
- Remove libuafs' cache initialization code (CreateCacheFile,
SweepAFSCache, etc); instead just use afsd's
- Add uafs_mount(), to perform the 'mount'ing step that takes place
in the normal kernel CM
- Add afsd_uafs.c for the glue between afsd and libuafs
Marc Dionne [Tue, 13 Apr 2010 22:58:11 +0000 (18:58 -0400)]
Fix new UKERNEL warnings on 64-bit
Commit 830cb48c enabled new warnings when building UKERNEL, which
causes builds with --enable-checking to fail. These are 64-bit
specific warnings from int to pointer conversions and one printf
warning.
Changes:
- cast printf argument to (int) in afs_usrops.c
- use (iparmtype)(uintptrsz) to convert 32-bit integers to
pointers
- move the definition of uintptrsz to src/afs/afs.h so it's
available to other source files, and remove the original definition
in afs_syscall.c
Michael Meffie [Tue, 12 Jan 2010 02:16:06 +0000 (21:16 -0500)]
DAFS: avoid volume lock contention during initialization
Avoid the excessive volume lock contention during startup to
improve the time to pre-attach a very large number of volumes.
The parallel attach worker threads avoid the volume lock
while scanning the partitions for volumes and send batches of
volume ids to the main thread to be preattached under the
volume lock.
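The division of labor described here (workers scan without the lock, the main thread preattaches batches under it) can be sketched with a queue. This is a Python illustration under assumed names; the batch size, the sentinel convention, and the list used as "preattached" state are all inventions for the example.

```python
import queue
import threading


def scan_partition(volume_ids, batch_size, out):
    """Worker: collect ids without the volume lock and hand them over in batches."""
    for start in range(0, len(volume_ids), batch_size):
        out.put(volume_ids[start:start + batch_size])
    out.put(None)  # sentinel: this partition is finished


def preattach_all(partitions, batch_size=2):
    """Main thread: preattach each batch while holding the volume lock."""
    out = queue.Queue()
    volume_lock = threading.Lock()
    preattached = []
    workers = [threading.Thread(target=scan_partition, args=(ids, batch_size, out))
               for ids in partitions]
    for worker in workers:
        worker.start()
    remaining = len(workers)
    while remaining:
        batch = out.get()
        if batch is None:
            remaining -= 1
            continue
        with volume_lock:  # taken once per batch, and only by the main thread
            preattached.extend(batch)
    for worker in workers:
        worker.join()
    return sorted(preattached)
```

The lock is acquired once per batch instead of once per volume, and never by the scanning threads, which is where the contention saving comes from.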
Felix Frank [Thu, 4 Mar 2010 03:41:15 +0000 (22:41 -0500)]
Fileserver capabilities support for the UNIX client
The attached patch has the client perform a GetCapabilities RPC
on fileservers it encounters.
It uses an additional server flag bit to keep track of the servers that
have been queried already.
In the case of afs_CheckServers(), GetTime RPCs are largely replaced by
GetCapabilities. GetTime is performed on a server if and only if
afs_setTime is nonzero and either
(a) no setTimeHost has yet been determined or
(b) the server in question has been designated as setTimeHost
The GetServers() function could thus be simplified even further wrt. the
setTime mechanism, but doing so would imply more rewriting (violating
the KISS principle; a followup patch should deal with that).
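The GetTime condition stated above reduces to a small predicate. The sketch is in Python; the function name and the use of None for "no setTimeHost determined yet" are assumptions made for the illustration.

```python
def should_get_time(afs_set_time, set_time_host, server):
    """GetTime is issued only if time setting is on and (a) or (b) holds."""
    if not afs_set_time:
        return False
    # (a) no setTimeHost has been determined yet, or
    # (b) this server has been designated as setTimeHost.
    return set_time_host is None or set_time_host == server
```

Every other server gets only the GetCapabilities probe.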
When a client is asked to reset callback states, it also resets the
"capabilities known" bit.
Thanks go to Simon Wilkinson, Jeffrey Altman and Jeffrey Hutzelman for
input regarding logic and implementation details.
Rx: avoid out of order lock acquisition in rx_NewCall
Sha-1 33010ef25e716f2ec2df17cc113f4ef8f67e3a74 broke the lock order
conventions between the conn->conn_call_lock and the call->lock.
This patchset corrects the ordering and handles the synchronization
issues that might occur when the call->lock is dropped within
rx_NewCall.
Andrew Deason [Mon, 12 Apr 2010 17:39:00 +0000 (12:39 -0500)]
Do not turn off AFS_HAVE_STATVFS for UKERNEL
Many param files turn off AFS_HAVE_STATVFS for UKERNEL. We obviously
still have statvfs() available whether we are running with UKERNEL or
not, so modify param files to enable it for UKERNEL if it was enabled
for non-UKERNEL.
The only places using this define are afsd and vol/partition.c, the
latter of which will not be affected.
Matt Smith [Sat, 10 Apr 2010 06:36:59 +0000 (01:36 -0500)]
Fix problems from afs_osi_gcpags reorganization
Corrections to mistakes made during the reorganization
of afs_osi_gcpags.c to per-OS directories. Includes fixes to
LINUX24 and whitespace corrections.
Michael Meffie [Sat, 10 Apr 2010 01:03:09 +0000 (21:03 -0400)]
afsmonitor: show busy counts
Update afsmonitor to display rx_nBusies, fs_nBusies,
sysname_ID, and fs_GetCapabilities, which were claimed from
spare fields long ago. Add a new group name called
Busies_group to show just the busy fields.