Michael Meffie [Thu, 20 Jul 2017 08:13:04 +0000 (04:13 -0400)]
redhat: specify man pages without wildcards
Currently, some of the man pages are specified with the full name and
some are specified with a wildcard for the filename extension. Instead,
specify all the man pages without wildcards to be more consistent and
to avoid putting incorrect man pages in packages.
This change removes a stray copy of the klog.krb5.1 man page from
openafs-kauth-client subpackage and moves the AuthLog/AuthLog.dir man
pages to the optional openafs-kauth-server subpackage.
Reviewed-on: https://gerrit.openafs.org/12731 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
(cherry picked from commit 671db4ca5a76625d9b7133510cc1cbdda8a5d9b9)
Mark Vitale [Fri, 1 Dec 2017 01:26:46 +0000 (20:26 -0500)]
LINUX: Avoid d_invalidate() during afs_ShakeLooseVCaches()
With recent changes to d_invalidate's semantics (it returns void in Linux 3.11,
and always returns success in RHEL 7.4), it has become increasingly clear that
d_invalidate() is not the best function for use in our best-effort
(nondisruptive) attempt to free up vcaches that is afs_ShakeLooseVCaches().
The new d_invalidate() semantics always force the invalidation of a directory
dentry, which contradicts our desire to be nondisruptive, especially when
that directory is being used as the current working directory for a process.
Our call to d_invalidate(), intended to merely probe for whether a dentry
can be discarded without affecting other consumers, instead would cause
processes using that dentry as a CWD to receive ENOENT errors from getcwd().
A previous commit (c3bbf0b4444db88192eea4580ac9e9ca3de0d286) tried to address
this issue by calling d_prune_aliases() instead of d_invalidate(), but
d_prune_aliases() does not recursively descend into children of the given
dentry while pruning, leaving it an incomplete solution for our use-case.
To address these issues, modify the shakeloose routine TryEvictDentries() to
call shrink_dcache_parent() and maybe __d_drop() for directories, and
d_prune_aliases() for non-directories, instead of d_invalidate(). (Calls to
d_prune_aliases() for directories have already been removed by reverting commit c3bbf0b4444db88192eea4580ac9e9ca3de0d286.)
Just like d_invalidate(), shrink_dcache_parent() has been around "forever"
(since pre-git v2.6.12). Also like d_invalidate(), it "walks" the parent
dentry's subdirectories and "shrinks" (unhashes) unused dentries. But unlike
d_invalidate(), shrink_dcache_parent() will not unhash an in-use dentry, and
has never changed its signature or semantics.
d_prune_aliases() has also been available "forever", and has also never changed
its signature or semantics. The lack of recursive descent is not an issue for
non-directories, which cannot have such children.
[kaduk@mit.edu: apply review feedback to fix locking and avoid extraneous
changes, and reword commit message]
Reviewed-on: https://gerrit.openafs.org/12830 Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Tested-by: BuildBot <buildbot@rampaginggeek.com>
(cherry picked from commit afbc199f152cc06edc877333f229604c28638d07)
Change-Id: I6d37e5584b57dcbb056385a79f67b92a363e08d2
Reviewed-on: https://gerrit.openafs.org/12851 Tested-by: BuildBot <buildbot@rampaginggeek.com> Tested-by: Benjamin Kaduk <kaduk@mit.edu> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Mark Vitale [Thu, 30 Nov 2017 22:56:13 +0000 (17:56 -0500)]
LINUX: consolidate duplicate code in osi_TryEvictDentries
The two stanzas for HAVE_DCACHE_LOCK are now functionally identical;
remove the preprocessor conditionals and duplicate code.
A minor functional change is incurred for very old (before 2.6.38) Linux
versions that have dcache_lock; we are now obtaining the d_lock as well.
This is safe because d_lock is also quite old (pre-git, 2.6.12), and it
is a spinlock that's only held for checking d_unhashed. Therefore, it
should have negligible performance impact. It cannot cause deadlocks or
violate locking order, because spinlocks can't be held across sleeps.
Reviewed-on: https://gerrit.openafs.org/12792 Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Reviewed-by: Andrew Deason <adeason@dson.org> Tested-by: BuildBot <buildbot@rampaginggeek.com>
(cherry picked from commit 5076dfc14b980aed310f3862875d5e9919fa199d)
Mark Vitale [Thu, 30 Nov 2017 21:08:38 +0000 (16:08 -0500)]
LINUX: create afs_linux_dget() compat wrapper
For dentry operations that cover multiple dentry aliases of
a single inode, create a compatibility wrapper to hide differences
between the older dget_locked() and the current dget().
No functional change should be incurred by this commit.
Reviewed-on: https://gerrit.openafs.org/12789 Reviewed-by: Andrew Deason <adeason@dson.org> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Tested-by: BuildBot <buildbot@rampaginggeek.com>
(cherry picked from commit 74f4bfc627c836c12bb7c188b86d570d2afdcae8)
However, since that commit, several things have happened:
- RHEL 7.4 changed the semantics of d_invalidate() such that it
invalidates the cwd, but did NOT change the return type to void.
This broke our autoconf test for detecting the new semantics.
- Further research reveals that d_prune_aliases() was not the best
choice for replacing d_invalidate(). This is because for directories,
d_prune_aliases() doesn't invalidate dentries when they are referenced
by its children, and it doesn't walk the tree trying to invalidate
child dentries. So it can leave dentries dangling, if the only
references to those dentries are via children.
Stephan Wiesand [Fri, 22 Dec 2017 13:40:32 +0000 (14:40 +0100)]
Linux 4.15: check for 2nd argument to pagevec_init
Linux 4.15 removes the distinction between "hot" and "cold" cache
pages, and pagevec_init() no longer takes a "cold" flag as the
second argument. Add a configure test and use it in osi_vnodeops.c.
Stephan Wiesand [Fri, 22 Dec 2017 13:17:09 +0000 (14:17 +0100)]
Linux: use plain page_cache_alloc
Linux 4.15 removes the distinction between "hot" and "cold" cache
pages, and no longer provides page_cache_alloc_cold(). Simply use
page_cache_alloc() instead, rather than adding yet another test.
Marcio Barbosa [Thu, 12 Oct 2017 15:42:40 +0000 (12:42 -0300)]
macos: make the OpenAFS client aware of APFS
Apple has introduced a new file system called APFS. Starting from High
Sierra, APFS replaces Mac OS Extended (HFS+) as the default file system
for solid-state drives and other flash storage devices.
The current OpenAFS client is not aware of APFS. As a result, the
installation of the current client into an APFS volume will panic the
machine.
To fix this problem, make the OpenAFS client aware of APFS.
Reviewed-on: https://gerrit.openafs.org/12743 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Mark Vitale <mvitale@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
(cherry picked from commit 6e57b22642bafb177e0931b8fb24042707d6d62f)
Benjamin Kaduk [Fri, 15 Dec 2017 01:54:57 +0000 (19:54 -0600)]
Fix macro used to check kernel_read() argument order
The m4 macro implementing the configure check is called
LINUX_KERNEL_READ_OFFSET_IS_LAST, but it defines a preprocessor symbol
that is just KERNEL_READ_OFFSET_IS_LAST. Our code needs to check
for the latter being defined, not the former.
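The mismatch can be illustrated with a small sketch. It assumes, purely for illustration, that configure has defined KERNEL_READ_OFFSET_IS_LAST; the two helper functions are hypothetical and are not part of the OpenAFS tree.

```c
/* Pretend that configure detected the newer argument order (assumption). */
#define KERNEL_READ_OFFSET_IS_LAST 1

/* Tests the symbol that the configure check actually defines. */
static int
sketch_right_check(void)
{
#if defined(KERNEL_READ_OFFSET_IS_LAST)
    return 1;
#else
    return 0;
#endif
}

/* Tests the name of the m4 macro, which is never a preprocessor symbol. */
static int
sketch_wrong_check(void)
{
#if defined(LINUX_KERNEL_READ_OFFSET_IS_LAST)
    return 1;
#else
    return 0;
#endif
}
```

With the wrong name, the conditional silently takes the "not defined" branch even though configure detected the feature.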
Reported by Aaron Ucko.
Reviewed-on: https://gerrit.openafs.org/12808 Reviewed-by: Anders Kaseorg <andersk@mit.edu> Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
(cherry picked from commit edc5463f3db4b6af2307741d9f4ee8f2c81cd98e)
Benjamin Kaduk [Mon, 4 Dec 2017 23:20:57 +0000 (17:20 -0600)]
OPENAFS-SA-2017-001: rx: Sanity-check received MTU and twind values
Rather than blindly trusting the values received in the
(unauthenticated) ack packet trailer, apply some minimal sanity checks
to received values. natMTU and regular MTU values are subject to
Rx minimum/maximum packet sizes, and the transmit window cannot drop
below one without risk of deadlock.
The maxDgramPackets value that can also be present in the trailer
already has sufficient sanity checking.
Extremely low MTU values (less than 28 == RX_HEADER_SIZE) can cause us
to set a negative "maximum usable data" size that gets used as an
(unsigned) packet length for subsequent allocation and computation,
triggering an assertion when the connection is used to transmit data.
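A minimal sketch of this kind of clamping follows. The limits and function names are hypothetical stand-ins chosen for illustration (only the header size of 28 comes from the text above); the real Rx constants and code paths may differ.

```c
#include <stdint.h>

/* Hypothetical limits for illustration; the real Rx constants may differ. */
#define SKETCH_HEADER_SIZE     28u     /* header size quoted in the text above */
#define SKETCH_MIN_PACKET_SIZE 520u    /* assumed lower bound on packet size */
#define SKETCH_MAX_PACKET_SIZE 16384u  /* assumed upper bound on packet size */

/* Clamp a peer-supplied MTU into the range [min, max]. */
static uint32_t
sketch_clamp_mtu(uint32_t received)
{
    if (received < SKETCH_MIN_PACKET_SIZE)
        return SKETCH_MIN_PACKET_SIZE;
    if (received > SKETCH_MAX_PACKET_SIZE)
        return SKETCH_MAX_PACKET_SIZE;
    return received;
}

/* Never let the transmit window fall below one packet. */
static uint32_t
sketch_clamp_twind(uint32_t received)
{
    return received < 1 ? 1 : received;
}

/* Usable payload after the header; cannot wrap because the MTU is clamped first. */
static uint32_t
sketch_max_data(uint32_t received_mtu)
{
    return sketch_clamp_mtu(received_mtu) - SKETCH_HEADER_SIZE;
}
```

Because the clamp runs before the subtraction, a hostile MTU of 0 yields a small but positive payload size rather than a wrapped-around unsigned value.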
Benjamin Kaduk [Tue, 28 Nov 2017 04:17:28 +0000 (22:17 -0600)]
afs: Fix bounds check in PNewCell
Reported by the opensuse buildbot:
CC [M] /home/buildbot/opensuse-tumbleweed-i386-builder/build/src/libafs/MODLOAD-4.13.12-1-default-MP/rx_packet.o
/home/buildbot/opensuse-tumbleweed-i386-builder/build/src/afs/afs_pioctl.c: In function ‘PNewCell’:
/home/buildbot/opensuse-tumbleweed-i386-builder/build/src/afs/afs_pioctl.c:3075:55: error: ‘*’ in boolean context, suggest ‘&&’ instead [-Werror=int-in-bool-context]
if ((afs_pd_remaining(ain) < AFS_MAXCELLHOSTS +3) * sizeof(afs_int32))
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~
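The effect of the misplaced parenthesis can be seen in a small standalone sketch. SKETCH_MAXCELLHOSTS is a hypothetical stand-in for AFS_MAXCELLHOSTS, and the "fixed" form shows the presumed intent of the check, not necessarily the exact code that was committed.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAXCELLHOSTS 8  /* hypothetical stand-in for AFS_MAXCELLHOSTS */

/* Misplaced parenthesis: the comparison yields 0 or 1, which is then scaled. */
static int
sketch_too_short_buggy(size_t remaining)
{
    return ((remaining < SKETCH_MAXCELLHOSTS + 3) * sizeof(int32_t)) != 0;
}

/* Presumed intent: compare the remaining bytes against the full byte count. */
static int
sketch_too_short_fixed(size_t remaining)
{
    return remaining < (SKETCH_MAXCELLHOSTS + 3) * sizeof(int32_t);
}
```

With 20 bytes remaining, the buggy form accepts the input (20 is not less than 11), while the fixed form rejects it (20 is less than 44 bytes).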
Benjamin Kaduk [Tue, 28 Nov 2017 04:07:53 +0000 (22:07 -0600)]
rx: fix call refcount leak in error case
The recent event handling normalization in commit 304d758983b499dc568d6ca57b6e92df24b69de8 had event handlers switch
to dropping their reference on the associated connection/call just
before return. An early return case was missed in the conversion,
leading to a refcount leak in an error case.
Reviewed-on: https://gerrit.openafs.org/12781 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Michael Meffie <mmeffie@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
(cherry picked from commit 66b74e78ba5fea6a8236dcd3b8b46e1dfa6a0ac7)
Change-Id: I532c49b2ef6ec95dd26a99c02e12ea53348f9690
Reviewed-on: https://gerrit.openafs.org/12783 Reviewed-by: Michael Meffie <mmeffie@sinenomine.net> Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Marcio Barbosa [Thu, 16 Nov 2017 22:24:03 +0000 (17:24 -0500)]
afs: fix kernel_write / kernel_read arguments
The order / content of the arguments passed to kernel_write and
kernel_read are not right. As a result, the kernel will panic if one of
the functions in question is called.
Michael Meffie [Mon, 6 Nov 2017 22:37:46 +0000 (17:37 -0500)]
tests: fix out of bounds access in the rx-event test
Use the NUMEVENTS symbol which defines the array size instead of an
incorrect hard coded number when checking if a second event can be added
to be fired at the same time. This fixes a potential out of bounds
access of the event test array.
Also update the comment, which mentions the wrong number
of events in the test.
Reviewed-on: https://gerrit.openafs.org/12762 Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Tested-by: BuildBot <buildbot@rampaginggeek.com>
(cherry picked from commit 50a3eb7b7ee94bffaadc98429bd404164e89ec7f)
Change-Id: I7a975e7498c1c7416a800c9294c97ee4de4fd57a
Reviewed-on: https://gerrit.openafs.org/12779 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Michael Meffie <mmeffie@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Benjamin Kaduk [Thu, 16 Nov 2017 10:49:49 +0000 (04:49 -0600)]
Sprinkle rx_GetConnection() for concision
Instead of inlining the body (taking the lock, incrementing the
refcount, and dropping the lock), use the convenience function
designed for this purpose.
Reviewed-on: https://gerrit.openafs.org/12772 Reviewed-by: Mark Vitale <mvitale@sinenomine.net> Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
(cherry picked from commit 2ae84bf053fe66b73a2c77b5d71305bae2c17587)
Change-Id: I60794d877a76fbb7c8ba59207e710a20641cc8f1
Reviewed-on: https://gerrit.openafs.org/12778 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Michael Meffie <mmeffie@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Benjamin Kaduk [Sun, 8 Oct 2017 03:42:38 +0000 (22:42 -0500)]
Standardize rx_event usage
Go over all consumers of the rx event framework and normalize its usage
according to the following principles:
rxevent_Post() is used to create an event, and it returns an event
handle (with a reference on the event structure) that can be used
to cancel the event before its timeout fires. (There is also an
additional reference on the event held by the global event tree.)
In all(*) usage within the tree, that event handle is stored within
either an rx_connection or an rx_call. Reads/writes to the member variable
that holds the event handle require either the conn_data_lock or call
lock, respectively -- that means that in most cases, callers of
rxevent_Post() and rxevent_Cancel() will be holding one of those
aforementioned locks. The event handlers themselves will need to
modify the call/connection object according to the nature of the
event, which requires holding those same locks, and also a guarantee
that the call/connection is still a live object and has not been
deallocated! Whether or not rxevent_Cancel() succeeds in cancelling
the event before it fires, whenever passed a non-NULL event structure
it will NULL out the supplied pointer and drop a reference on the
event structure. This is the correct behavior, since the caller
has asked to cancel the event and has no further use for the event
handle or its reference on the event structure. The caller of
rxevent_Cancel() must check its return value to know whether or
not the event was cancelled before its handler was able to run.
The interaction between the call/connection lock and the lock
protecting the red/black tree of pending events opens up a somewhat
problematic race window. Because the application thread is expected
to hold the call/connection lock around rxevent_Cancel() (to protect
the write to the field in the call/connection structure that holds
an event handle), and rxevent_Cancel() must take the lock protecting
the red/black tree of events, this establishes a lock order with the
call/connection lock taken before the eventTree lock. This is in
conflict with the event handler thread, which must take the eventTree
lock first, in order to select an event to run (and thus know what
additional lock would need to be taken, by virtue of what handler
function is to be run). The conflict is easy to resolve in the
standard way, by having a local pointer to the event that is obtained
while the event is removed from the red/black tree under the eventTree
lock, and then the eventTree lock can be dropped and the event run
based on the local variable referring to it. The race window occurs
when the caller of rxevent_Cancel() holds the call/connection lock,
and rxevent_Cancel() obtains the eventTree lock just after the event
handler thread drops it in order to run the event. The event handler
function begins to execute, and immediately blocks trying to obtain
the call/connection lock. Now that rxevent_Cancel() has the eventTree
lock it can proceed to search the tree, fail to find the indicated event
in the tree, clear out the event pointer from the call/connection
data structure, drop its caller's reference to the event structure,
and return failure (the event was not cancelled). Only then does the
caller of rxevent_Cancel() drop the call/connection lock and allow
the event handler to make progress.
This race is not necessarily problematic if appropriate care is taken,
but in the previous code such was not the case. In particular, it
is a common idiom for the firing event to call rxevent_Put() on itself,
to release the handle stored in the call/connection that could have
been used to cancel the event before it fired. Failing to do so would
result in a memory leak of event structures; however, rxevent_Put() does
not check for a NULL argument, so a segfault (NULL dereference) was
observed in the test suite when the race occurred and the event handler
tried to rxevent_Put() the reference that had already been released by
the unsuccessful rxevent_Cancel() call. Upon inspection, many (but not
all) of the uses in rx.c were susceptible to a similar race condition
and crash.
The test suite also papers over a related issue in that the event handler
in the test suite always knows that the data structure containing the
event handle will remain live, since it is a global array that is allocated
for the entire scope of the test. In rx.c, events are associated with
calls and connections that have a finite lifetime, so we need to take care
to ensure that the call/connection pointer stored in the event remains
valid for the duration of the event's lifecycle. In particular, even an
attempt to take the call/connection lock to check whether the corresponding
event field is NULL is fraught with risk, as it could crash if the lock
(and containing call/connection) has already been destroyed! There are
several potential ways to ensure the liveness of the associated
call/connection while the event handler runs, most notably to take care
in the call/connection destruction path to ensure that all associated
events are either successfully cancelled or run to completion before
tearing down the call/connection structure, and to give the pending event
its own reference on the associated call/connection. Here, we opt for
the latter, acknowledging that this may result in the event handler thread
doing the full call/connection teardown and delay the firing of subsequent
events. This is deemed acceptable, as pending events are for intentionally
delayed tasks, and some extra delay is probably acceptable. (The various
keepalive events and the challenge event could delay the user experience
and/or security properties if significantly delayed, but I do not believe
that this change admits completely unbounded delay in the event handler
thread, so the practical risk seems minimal.)
Accordingly, this commit attempts to ensure that:
* Each event holds a formal reference on its associated call/connection.
* The appropriate lock is held for all accesses to event pointers in
call/connection structures.
* Each event handler (after taking the appropriate lock) checks whether
it raced with rxevent_Cancel() and only drops the call/connection's
reference to the event if the race did not occur.
* Each event handler drops its reference to the associated call/connection
*after* doing any actions that might access/modify the call/connection.
* The per-event reference on the associated call/connection is dropped by
the thread that removes the event from the red/black tree. That is,
the event handler function if the event runs, or by the caller of
rxevent_Cancel() when the cancellation succeeds.
* No non-NULL event handles remain in a call/connection being destroyed,
which would indicate a refcounting error.
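The reference handling listed above can be sketched in a single-threaded model. Everything here is hypothetical: the struct and function names are invented, locking is omitted (the handler is assumed to run with the connection lock held), the tree's own reference on the event is not modeled, and detecting the race by comparing the stored handle with the firing event is an assumption about how such a check could look.

```c
#include <stddef.h>

struct sketch_event {
    int refs;                     /* references on the event structure */
};

struct sketch_conn {
    int refs;                     /* references on the connection */
    struct sketch_event *handle;  /* stored event handle, NULL when none */
};

/*
 * Cancel the event whose handle is stored in conn. The handle is always
 * cleared and its reference dropped. Returns 1 if the event was still in
 * the tree (cancelled before firing), 0 otherwise.
 */
static int
sketch_cancel(struct sketch_conn *conn, int still_in_tree)
{
    if (conn->handle == NULL)
        return 0;
    conn->handle->refs--;
    conn->handle = NULL;
    if (!still_in_tree)
        return 0;       /* the handler will run and drop the conn reference */
    conn->refs--;       /* we removed the event, so we drop its conn reference */
    return 1;
}

/* Event handler body; assumed to be called with the connection lock held. */
static void
sketch_handler(struct sketch_conn *conn, struct sketch_event *ev)
{
    if (conn->handle == ev) {
        /* No race with cancel: release the handle held by the connection. */
        ev->refs--;
        conn->handle = NULL;
    }
    /* Event-specific work on conn would happen here. */
    conn->refs--;       /* last: drop the event's reference on the connection */
}
```

In the racing case (cancel fails, then the handler runs), the handler sees a NULL handle and skips the put, so the handle's reference is released exactly once.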
(*) There is an additional event used in practice, to reap old connections,
but it is effectively a background task that reschedules itself
periodically, with no handle to the event retained so as to be able
to cancel it. As such, it is unaffected by the concerns raised here.
While here, standardize on the rx_GetConnection() function for incrementing
the reference count on a connection object, instead of inlining the
corresponding mutex lock/unlock and variable access.
In contrast to what was done on master, for the 1.8 branch we do not
force-enable refcount checking.
Reviewed-on: https://gerrit.openafs.org/12756 Reviewed-by: Mark Vitale <mvitale@sinenomine.net> Reviewed-by: Michael Meffie <mmeffie@sinenomine.net> Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
(cherry picked from commit 304d758983b499dc568d6ca57b6e92df24b69de8)
Benjamin Kaduk [Thu, 5 Oct 2017 04:03:44 +0000 (23:03 -0500)]
Adjust rx-event test to exercise cancel/fire race
We currently do not properly handle the case where a thread runs
rxevent_Cancel() in parallel with the event-handler thread attempting
to fire that event, but the test suite only picked up on this issue
in a handful of the Debian automated builds (somewhat less-resourced
ones, perhaps).
Modify the event scheduling algorithm in the test so as to create a
larger chunk of events scheduled to fire "right away" and thereby
exercise the race condition more often when we proceed to cancel
a quarter of events "right away".
Reviewed-on: https://gerrit.openafs.org/12755 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Mark Vitale <mvitale@sinenomine.net> Reviewed-by: Michael Meffie <mmeffie@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
(cherry picked from commit bdb509fb1d8e0fdca05dffecdbcbf60a95ea502e)
Change-Id: I27cebed3c2c3daff10b8d3f5f6f949e667791a72
Reviewed-on: https://gerrit.openafs.org/12774 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Michael Meffie <mmeffie@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Michael Laß [Thu, 2 Nov 2017 20:16:49 +0000 (21:16 +0100)]
gtx: link against libtinfo if termlib is separated
If ncurses is built with "./configure --with-termlib=tinfo", gtx fails
to link because of an undefined reference to the LINES symbol which is
then provided by libtinfo.so and not libncurses.so.
If ncurses is present, additionally check whether LINES is provided by
ncurses or tinfo and set $LIB_curses accordingly.
This change is based on a patch provided by Bastian Beischer.
FIXES 134420
Reviewed-on: https://gerrit.openafs.org/12760 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
(cherry picked from commit 311f1d28a2f626350b33ad432e674055b62511bd)
Change-Id: I2f69fe51bbefeeb2a17145a88aa9c891644f2f61
Reviewed-on: https://gerrit.openafs.org/12763 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Michael Laß <lass@mail.uni-paderborn.de> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Linux: Use kernel_read/kernel_write when __vfs variants are unavailable
We hide the uses of set_fs/get_fs behind a macro, as those functions
are likely to soon become unavailable:
> Christoph Hellwig suggested removing all calls outside of the core
> filesystem and architecture code; Andy Lutomirski went one step
> further and said they should all go.
https://lwn.net/Articles/722267/
Reviewed-on: https://gerrit.openafs.org/12729 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Mark Vitale <mvitale@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
(cherry picked from commit 5ee516b3789d3545f3d78fb3aba2480308359945)
Change-Id: I28a7126bf6ab048f8d949f190e557a3fa44f3f46
Reviewed-on: https://gerrit.openafs.org/12737 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Michael Meffie <mmeffie@sinenomine.net> Reviewed-by: Mark Vitale <mvitale@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
unexports both __vfs_read and __vfs_write, but keeps the former in
fs.h--as it is still being used by another part of the tree.
This situation results in a false positive in our Autoconf check,
which does not see the export statements, and ends up marking the
corresponding API as available.
That, in turn, causes some code which assumes symmetry with
__vfs_write to fail to compile.
Switch to testing for __vfs_write, which correctly marks the API as
unavailable.
Reviewed-on: https://gerrit.openafs.org/12728 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
(cherry picked from commit 929e77a886fc9853ee292ba1aa52a920c454e94b)
Change-Id: I03e3c8222360a6b04b45b45a8f56b5df054f6783
Reviewed-on: https://gerrit.openafs.org/12736 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Mark Vitale <mvitale@sinenomine.net> Reviewed-by: Michael Meffie <mmeffie@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Anders Kaseorg [Sat, 2 Sep 2017 03:37:07 +0000 (23:37 -0400)]
vol: Fix two buffers being one char too short
Fixes these warnings:
namei_ops.c: In function 'namei_copy_on_write':
namei_ops.c:1328:31: warning: 'snprintf' output may be truncated before the last format character [-Wformat-truncation=]
snprintf(path, sizeof(path), "%s-tmp", name.n_path);
^~~~~~~~
namei_ops.c:1328:2: note: 'snprintf' output between 5 and 260 bytes into a destination of size 259
snprintf(path, sizeof(path), "%s-tmp", name.n_path);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
vol_split.c: In function 'split_volume':
vol_split.c:576:22: warning: 'sprintf' may write a terminating nul past the end of the destination [-Wformat-overflow=]
sprintf(symlink, "#%s", V_name(newvol));
^~~~~
vol_split.c:576:5: note: 'sprintf' output between 2 and 33 bytes into a destination of size 32
sprintf(symlink, "#%s", V_name(newvol));
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
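The second warning boils down to a destination that has no room for the added prefix plus the terminating NUL. A small sketch, using a hypothetical helper and a hypothetical 32-byte name limit, shows how the return value of snprintf() reveals the shortfall:

```c
#include <stdio.h>
#include <string.h>

#define SKETCH_NAME_MAX 32  /* hypothetical name buffer size, including NUL */

/* Build "#<name>" into out; returns 0 on success, -1 if it would not fit. */
static int
sketch_make_link(char *out, size_t outlen, const char *name)
{
    int n = snprintf(out, outlen, "#%s", name);

    /* snprintf returns the length it wanted to write, excluding the NUL. */
    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

A 31-character name needs 33 bytes once the "#" and the NUL are counted, so a destination of SKETCH_NAME_MAX bytes is one byte too short and SKETCH_NAME_MAX + 1 is enough.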
Reviewed-on: https://gerrit.openafs.org/12722 Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Tested-by: BuildBot <buildbot@rampaginggeek.com>
(cherry picked from commit 0a9a6b57ce6e1c97fcc651c8cb74e66fc8422a1e)
Change-Id: Ia60439aed7925b786a0213d96a7afb413579e01f
Reviewed-on: https://gerrit.openafs.org/12723 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Michael Meffie <mmeffie@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Seth Forshee [Tue, 22 Aug 2017 12:59:11 +0000 (07:59 -0500)]
Linux: Include linux/uaccess.h rather than asm/uaccess.h if present
Starting with Linux 4.12 there is a module build error on s390
due to asm/uaccess.h using a macro defined in the common header.
The common header has been around since 2.6.18 and has always
included asm/uaccess.h, so switch to using the common header
whenever it is present.
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Reviewed-on: https://gerrit.openafs.org/12714 Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Tested-by: BuildBot <buildbot@rampaginggeek.com>
(cherry picked from commit 962f4838dc461567d896304f617a0923745d13d5)
Change-Id: I5a7834b982458159804bc4d940e39ef283253299
Reviewed-on: https://gerrit.openafs.org/12718 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Mark Vitale <mvitale@sinenomine.net> Reviewed-by: Michael Meffie <mmeffie@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Benjamin Kaduk [Wed, 2 Aug 2017 01:57:52 +0000 (20:57 -0500)]
Remove src/mcas
This lock-free library toolkit is intriguing and may be the subject
of future work, but such development will occur on the master branch,
and these files are just clutter on openafs-stable-1_8_x. Remove
them to give the tree a cleaner start.
Remove src/mcas and stop mentioning it in SOURCE-MAP; don't reference
it in the rpctests, either.
Change-Id: I21b1b6b64a709fe40aa53aaf3470d128c0dc2f86
Reviewed-on: https://gerrit.openafs.org/12682 Tested-by: Benjamin Kaduk <kaduk@mit.edu> Reviewed-by: Mark Vitale <mvitale@sinenomine.net> Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Benjamin Kaduk [Wed, 2 Aug 2017 01:55:52 +0000 (20:55 -0500)]
Remove src/rxgk
These files were committed slightly prematurely to the tree; rxgk
support is intended for the 2.0 release, and will not appear in the
1.8.x release series.
Remove src/rxgk and drop mentions of rxgk from configure/Makefile.in/etc.
Change-Id: Ib7d40eaac85b05d920781b61f73dbdf8fedfcc2b
Reviewed-on: https://gerrit.openafs.org/12681 Tested-by: Benjamin Kaduk <kaduk@mit.edu> Reviewed-by: Mark Vitale <mvitale@sinenomine.net> Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Michael Meffie [Fri, 14 Apr 2017 01:48:06 +0000 (21:48 -0400)]
redhat: kauth client and server sub-packages
Move the kaserver and kauth client programs to conditionally built
packages called openafs-kauth-server and openafs-kauth-client.
Packagers can build these by specifying '--with kauth'. They are not
built by default to discourage use.
This commit subsumes the openafs-kpasswd package into the
openafs-kauth-client package.
Change-Id: I1322f05d7fe11d466c9ed71a5059c21b759d95ab
Reviewed-on: https://gerrit.openafs.org/12600 Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Tested-by: Benjamin Kaduk <kaduk@mit.edu>
Michael Meffie [Mon, 10 Apr 2017 19:06:02 +0000 (15:06 -0400)]
redhat: do not package kauth by default
Do not package kaserver and related programs by default to discourage
use. Add the '--with kauth' rpmbuild option to allow packagers to
continue to include the kauth programs for compatibility.
Change-Id: I8bf9f6dc221afc22ed6c9a33cf101d705e6c4920
Reviewed-on: https://gerrit.openafs.org/12597 Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Tested-by: Benjamin Kaduk <kaduk@mit.edu>
Benjamin Kaduk [Mon, 31 Jul 2017 01:57:05 +0000 (20:57 -0500)]
Default to crypt mode for unix clients
Though the protection offered by rxkad, even with rxkad-k5 and rxkad-kdf, is
insufficient to protect traffic from a determined attacker, it remains the
case that the internet is not a safe place for user data to travel in the
clear, and has not been for a long time. The Windows client encrypts by
default, and all or nearly all the Unix client packaging scripts set crypt
mode by default. Catch up to reality and default to crypt mode in the
Unix cache manager.
The current version does not have a corresponding LWP_WaitProcess call
for the beacon_globals.ubik_amSyncSite global. As a result, the
LWP_NoYieldSignal(&beacon_globals.ubik_amSyncSite) signal call can be
safely removed.
Change-Id: I72c4ccfe8e68551673dc728dd699ba8c561d76d1
Reviewed-on: https://gerrit.openafs.org/12673 Reviewed-by: Mark Vitale <mvitale@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Tested-by: BuildBot <buildbot@rampaginggeek.com>
Michael Meffie [Wed, 2 Aug 2017 00:10:32 +0000 (20:10 -0400)]
doc: relocate notes from arch to txt
The doc/txt directory has become the de facto home for text-based
technical notes. Relocate the contents of the doc/arch directory to
doc/txt. Relocate doc/examples to doc/txt/examples.
Update the doc/README file to be more current and remove old work in
progress comments.
ubik: update epoch as soon as sync-site is elected
The ubik_epochTime represents the time at which the coordinator first
received its coordinator mandate. However, this global is currently not
updated at the moment when a new sync-site is elected. Instead,
ubik_epochTime is only updated at the very end of the first write
transaction, when a new database label is written (in udisk_commit).
This causes at least 2 different issues:
For one, this means that we change ubik_epochTime while a remote
transaction is in progress. If VOTE_Beacon is called after
ubik_epochTime is updated, but before the remote transaction ends, the
remote sites will detect that the transaction id in ubik_currentTrans is
wrong (via urecovery_CheckTid(), since the epoch doesn't match), and
they will abort the transaction. This means the transaction will fail,
and it may cause a loss of quorum until another election is completed.
Another issue is that ubik_epochTime can be 0 at the beginning of a
write transaction, if this is the first election that this site has won.
Since ubik_epochTime is used to construct transaction ids, this means
that we can have different transactions that originate from different
sites at different times, but they have the same epoch in their tid.
For example, say a write transaction starts with epoch 0, but the
originating site is killed/interrupted before finishing. That write
transaction will linger on remote sites in ubik_currentTrans with an
epoch of 0 (since the originating site will never call
DISK_ReleaseLocks, or DISK_Abort, etc). Normally the sync site will kill
such a lingering transaction via urecovery_CheckTid, but since the epoch
is 0, and the election winner's epoch is also 0, the transaction looks
valid and may never be killed. If that transaction is holding a lock on
the database, this means that the database will forever remain locked,
effectively preventing any access to the db on that site.
To fix both of these issues, update ubik_epochTime with the current
time as soon as we win the election. This ensures that the epoch is not
updated in the middle of a transaction, and it ensures that all
transactions are created with a unique epoch: the epoch of the election
that we won.
Note that with this commit, we do not ever set ubik_epochTime to the
magic value of '2' during database init. The special '2' epoch only
needs to be set in the database itself, and it is never an actual epoch
that represents a real quorum that went through the election process.
The database will be labelled with a 'real' epoch after the first write,
like normal.
[kaduk@mit.edu: comment the locking strategy in ubeacon_Interact()]
Joe Gorse [Thu, 6 Jul 2017 19:47:24 +0000 (15:47 -0400)]
LINUX: afs_create infinite fetchStatus loop
For a file in a directory with the CStatd bit cleared, we can get
an infinite fetchStatus loop.
In afs_create(), afs_GetDCache() may return NULL due to an error.
If this goes unchecked, afs_create() loops, which may produce repeated
fetchStatus() calls to the fileserver.
Credit: Yadav Yadavendra for identifying and analysing this issue.
Michael Meffie [Tue, 1 Aug 2017 21:21:13 +0000 (17:21 -0400)]
volser: preserve volume stats by default
Commit dfceff1d3a66e76246537738720f411330808d64 added the
-preserve-vol-stats flag to the volume server. This enabled a change in
the volume server to preserve volume usage statistics during reclone and
restore operations. Otherwise, volume usage counters of read-only
volumes are cleared when volumes are released, making it difficult to
track usage with the volume stats.
Make this feature the default behavior of the volume server and provide
the option -clear-vol-stats to use the old behavior if so desired. The
-preserve-vol-stats flag is kept as a hidden, accepted option for sites
which may already have that flag set in the BosConfig.
Since this changes a default behavior of the volume server, this change
is only appropriate on a major or minor release boundary, not in the
middle of a stable series.
Marcio Barbosa [Mon, 22 May 2017 16:55:32 +0000 (12:55 -0400)]
ubik: avoid early DISK_Begin calls we know will fail
Currently, we can start a write transaction on a site immediately after
it is elected as the sync site. However, after commit d47beca1,
SDISK_Begin on remote sites will fail right after an election occurs
(since lastYesState is not set, and so urecovery_AllBetter will fail).
And after commit fac0b742, this error is always noticed and propagated
back to the application.
As a result, when we try to write immediately after a sync site is
elected, the transaction will fail with UNOQUORUM, the remote sites will
be marked as down, and we may lose quorum and require another election
to be performed. This can easily happen repeatedly for a site that
frequently tries to make changes to a ubik database.
To avoid marking other sites down and going through another election
process, do not allow write transactions until we know that lastYesState
is set on the remote sites. We do this by waiting until the next wave of
beacons are sent, which tell the remote sites that we are the sync site.
In other words, only allow write transactions after the sync site knows
that the remote sites also know that the sync site has been elected.
With this commit, a write transaction immediately after an election
will still fail with UNOQUORUM, but we avoid triggering an error on the
remote sites, and avoid losing quorum in this situation.
Change-Id: I9e1a76b4022e6d734af1165d94c12e90af04974d
Reviewed-on: https://gerrit.openafs.org/12592 Reviewed-by: Andrew Deason <adeason@dson.org> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Tested-by: Benjamin Kaduk <kaduk@mit.edu>
Marcio Barbosa [Wed, 21 Jun 2017 20:42:37 +0000 (17:42 -0300)]
ubik: allow remote dbase relabel if up to date
When a site is elected the sync-site, its database is not immediately
relabeled. The database in question will be relabeled at the end of the
first write transaction (in udisk_commit). To do so, the dbase->version
is updated on the sync-site first (1) and then the versions of the
remote sites are updated through SDISK_SetVersion() (2).
In order to make sure that the remote site holds the same database as
the sync-site, the SDISK_SetVersion() function checks if the current
version held by the remote site (ubik_dbVersion) is equal to the
original version stored by the sync-site (oldversionp). If
ubik_dbVersion is not equal to oldversionp, SDISK_SetVersion() will
fail with USYNC.
However, ubik_dbVersion can be updated by the vote thread at any time.
That is, if the sync site calls VOTE_Beacon() on the remote site between
events (1) and (2), the remote site will set ubik_dbVersion to the new
version, while ubik_dbase->version is still set to the old version. As
a result, ubik_dbVersion will not be equal to oldversionp and
SDISK_SetVersion() will fail with USYNC. This failure may cause a loss
of quorum until another election is completed.
To fix this problem, let SDISK_SetVersion() relabel the database when
ubik_dbase->version is equal to oldversionp. In order to try to only
affect the scenario described above, also check if ubik_dbVersion is
equal to newversionp.
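A minimal sketch of the relaxed acceptance test, assuming a version is an (epoch, counter) pair. The names and the SKETCH_USYNC value are placeholders, not the real ubik definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct sketch_version {
    int32_t epoch;
    int32_t counter;
};

#define SKETCH_OK    0
#define SKETCH_USYNC 1   /* placeholder, not the real USYNC value */

static bool
sketch_version_eq(const struct sketch_version *a,
                  const struct sketch_version *b)
{
    return a->epoch == b->epoch && a->counter == b->counter;
}

/*
 * db_version models ubik_dbase->version, vote_version models ubik_dbVersion.
 * Accept the relabel in the original case (vote_version == oldv) and in the
 * race case (db_version == oldv while the vote thread already moved
 * vote_version to newv).
 */
static int
sketch_set_version(struct sketch_version *db_version,
                   const struct sketch_version *vote_version,
                   const struct sketch_version *oldv,
                   const struct sketch_version *newv)
{
    assert(db_version && vote_version && oldv && newv);
    if (sketch_version_eq(vote_version, oldv) ||
        (sketch_version_eq(db_version, oldv) &&
         sketch_version_eq(vote_version, newv))) {
        *db_version = *newv;
        return SKETCH_OK;
    }
    return SKETCH_USYNC;
}
```

Any other combination, such as a vote version that matches neither the old nor the new label, is still rejected in this model.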
Joe Gorse [Wed, 10 May 2017 15:38:25 +0000 (11:38 -0400)]
afs: fix repeated BulkStatus calls for directories.
There is a filetype comparison check in afs_DoBulkStat just after the
BulkFetch RPC. This check will fail for directories even though
bulkStatus was done for directories.
This code is apparently necessary for Darwin, but it causes this problem
otherwise. Thus it is now compiled only for Darwin, guarded by the
AFS_DARWIN_ENV preprocessor symbol.
Credit: Yadav Yadavendra for identifying and analysing this issue.
Andrew Deason [Thu, 15 Jun 2017 20:32:41 +0000 (15:32 -0500)]
LINUX: Workaround d_splice_alias/d_lookup race
Before Linux kernel commit 4919c5e45a91b5db5a41695fe0357fbdff0d5767,
d_splice_alias in some cases can d_rehash the given dentry without
attaching it to the given inode, right before the dentry is unhashed
again. This means that for a few moments, that negative dentry is
visible to __d_lookup, and thus is visible to path lookup and can be
given to afs_linux_dentry_revalidate.
Currently, afs_linux_dentry_revalidate will say that the dentry is
valid, because d_time and other fields are set; it's just not attached
to an inode. This causes an ENOENT error on lookup, even though the
file is there (and no OpenAFS code said otherwise).
Normally this race is rare, but it can be frequently exercised if
we access the same directory via different names at the same time.
This can happen with multiple mountpoints to the same volume, or by
accessing an @sys directory via its abbreviated and expanded forms.
To get around this, make afs_linux_dentry_revalidate check negative
dentries to see if they are unhashed. We also lock the parent inode,
in order to guarantee that a problematic d_splice_alias call isn't
running at the same time (and thus, we know the dentry will not be
unhashed immediately afterwards). This slows down
afs_linux_dentry_revalidate for valid negative dentries a little, but
it allows us to use negative dentries at all.
Stephan Wiesand [Mon, 24 Jul 2017 09:37:54 +0000 (11:37 +0200)]
Linux 4.13: use designated initializers where required
struct path is declared with the "designated_init" attribute,
and module builds now use -Werror=designated-init. Cope.
And as pointed out by Michael Meffie, struct ctl_table has
the same requirement now, so use a designated initializer
for the final element of the sysctl table too.
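For illustration, the difference between positional and designated initialization looks roughly like this. The structures below are hypothetical stand-ins, not the kernel's struct path or struct ctl_table, and "example_knob" is an invented name.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for a kernel structure such as struct path; not the real layout. */
struct sketch_path {
    const void *mnt;
    const void *dentry;
};

/*
 * Positional form, which -Werror=designated-init rejects for structures
 * carrying the attribute:
 *     struct sketch_path p = { mnt, dentry };
 * Designated form, which names each member explicitly:
 */
static struct sketch_path
sketch_make_path(const void *mnt, const void *dentry)
{
    struct sketch_path p = { .mnt = mnt, .dentry = dentry };
    return p;
}

/* Stand-in for a sysctl table entry. */
struct sketch_ctl_entry {
    const char *procname;
    int mode;
};

static const struct sketch_ctl_entry sketch_table[] = {
    { .procname = "example_knob", .mode = 0644 },
    { .procname = NULL }   /* final element, designated instead of { 0 } */
};
```

Members not named in a designated initializer are zero-initialized, so the terminating entry stays all-zero.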
Change-Id: I0ec45aac961dcefa0856a15ee218085626a357c7
Reviewed-on: https://gerrit.openafs.org/12663 Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Tested-by: Benjamin Kaduk <kaduk@mit.edu>
Michael Meffie [Fri, 7 Jul 2017 15:11:12 +0000 (11:11 -0400)]
afs: fix afs_xserver deadlock in afsdb refresh
When setting up a new volume, the cache manager calls afs_GetServer() to
set up the server object for each fileserver associated with the volume.
The afs_GetServer() function locks afs_xserver and then, among other
things, calls afs_GetCell() to look up the cell info by cell number.
When the cache manager is running in afsdb mode, afs_GetCell() will
attempt to refresh the cell info if the time-to-live has been exceeded
since the last call to afs_GetCell(). During this refresh the AFSDB
calls afs_GetServer() to update the vlserver information. The afsdb
handler thread and the thread processing the volume setup become
deadlocked since the afs_xserver lock is already held at this point.
This bug will manifest when the DNS SRV record TTL is shorter than the
time the fileservers take to respond to the GetCapabilities RPC within
afs_GetServer() and there are multiple read-only servers for a volume.
Avoid the deadlock by using the afs_GetCellStale() variant within
afs_GetServer(). This variant returns the memory resident cell info
without the afsdb upcall and the subsequent afs_GetServer() call.
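The re-entry problem can be modelled in a single thread, assuming a non-recursive lock represented by a flag. Everything here is a hypothetical simplification: the real deadlock involves two threads, and a real lock would block rather than count.

```c
#include <assert.h>
#include <stdbool.h>

static bool sketch_xserver_held;   /* models a non-recursive afs_xserver */
static int sketch_deadlocks;       /* attempts to take the lock while held */

static void sketch_get_server(bool use_stale_cell);

/* Refreshing variant: an expired TTL triggers a refresh that re-enters
 * sketch_get_server(). */
static void
sketch_get_cell(bool ttl_expired)
{
    if (ttl_expired)
        sketch_get_server(false);
}

/* Stale variant: use the in-memory cell info and never refresh. */
static void
sketch_get_cell_stale(void)
{
}

static void
sketch_get_server(bool use_stale_cell)
{
    if (sketch_xserver_held) {     /* a real lock would block forever here */
        sketch_deadlocks++;
        return;
    }
    sketch_xserver_held = true;
    if (use_stale_cell)
        sketch_get_cell_stale();
    else
        sketch_get_cell(true);
    sketch_xserver_held = false;
}
```

With the stale variant the lock is never requested a second time, so the counter stays at zero.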
Michael Meffie [Tue, 11 Jul 2017 12:51:08 +0000 (08:51 -0400)]
afs: restore force_if_down check when getting connections
Commit cb9e029255420608308127b0609179a46d9983ad removed the
force_if_down check in afs_ConnBySA, which effectively turned on the
force_if_down flag for every call to afs_ConnBySA. This caused
afs_ConnBySA to always return connections, even for server addresses
marked down with force_if_down set to 0.
One serious consequence of this bug is the cache manager will retry the
preferred vlserver indefinitely when it is unreachable. This is because
the loop in afs_ConnMHosts always tries hosts in preferred order and
expects afs_ConnBySA to return a NULL if the server address has no
connections because it is marked down.
Restore the check for server addresses marked down to honor the
force_if_down flag again so we do not get connections for down servers
unless requested.
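A minimal sketch of the restored check and of a caller that relies on it. The types and function names are hypothetical and much simpler than the real afs_ConnBySA and afs_ConnMHosts.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct sketch_conn {
    int id;
};

struct sketch_srvaddr {
    bool marked_down;
    struct sketch_conn conn;
};

/* Return a connection, or NULL when the address is down and not forced. */
static struct sketch_conn *
sketch_conn_by_sa(struct sketch_srvaddr *sa, bool force_if_down)
{
    assert(sa != NULL);
    if (sa->marked_down && !force_if_down)
        return NULL;
    return &sa->conn;
}

/* Try hosts in preferred order; skip the ones that yield no connection. */
static struct sketch_conn *
sketch_conn_first_up(struct sketch_srvaddr *hosts, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        struct sketch_conn *c = sketch_conn_by_sa(&hosts[i], false);
        if (c != NULL)
            return c;
    }
    return NULL;
}
```

Without the NULL return for down addresses, the loop would always pick the first (preferred) host, which is the retry behavior described above.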
Michael Meffie [Mon, 10 Apr 2017 18:23:12 +0000 (14:23 -0400)]
redhat: fix rpmbuild command line option defaults
Fix the handling of default values for the various rpmbuild options
which can be given. These have been broken as code was shuffled around
over the years.
Remove obsolete comments about detecting what to build based on the
architecture.
Provide the '--without authlibs' option to disable the openafs-authlibs
package.
Change-Id: I6c8db1f3163ee241f9a4d1282345a0ddeabd284c
Reviewed-on: https://gerrit.openafs.org/12596 Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de> Reviewed-by: Michael Meffie <mmeffie@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Tested-by: Benjamin Kaduk <kaduk@mit.edu>
The space allocated for outputFileBuf is only 2 bytes larger than
sizeof(VERS_FILE). But we add potentially 4 extra bytes like
".txt" or ".xml". Just allocate enough space for all file suffixes.
Andrew Deason [Thu, 15 Jun 2017 20:29:17 +0000 (15:29 -0500)]
afs_linux_lookup: Avoid d_add on afs_lookup error
Currently, afs_linux_lookup looks roughly like this pseudocode:
    {
        code = afs_lookup(&vcp);
        if (!code) {
            ip = AFSTOV(vcp);
            error = process_ip(ip);
            if (error) {
                goto done;
            }
        }
        process_dp(dp);
        newdp = d_splice_alias(ip, dp);
     done:
        cleanup();
    }
Note that if there is an error while processing the looked-up inode
(ip), we jump over d_splice_alias. But if we encounter an error from
afs_lookup itself, we do not jump over d_splice_alias. This means that
if afs_lookup encounters any error, we initialize the given dentry
(dp) as a negative entry, effectively telling the Linux kernel that
the requested name does not exist.
This is correct for ENOENT errors, of course, but is incorrect for any
other error. For non-ENOENT errors we later return an error from the
function, but this does not invalidate the generated dentry. The
result is that when afs_lookup encounters an error, that error will be
propagated to userspace, but subsequent lookups for the same name will
yield an ENOENT error (until the dentry is invalidated). This can
easily cause a file to seem to mysteriously disappear, if a transient
error like network problems caused the afs_lookup call to fail.
To fix this, treat ENOENT as a non-error, like the comments already
suggest. In our case, ENOENT is not really an error; it just means we
populate the given dentry differently. So if we get ENOENT from
afs_lookup, set our vcache to NULL and clear the error, and continue.
This also has the side effect of not treating ENOENT errors from
afs_CreateAttr identically to ENOENT errors from afs_lookup. That
shouldn't happen, but there have been abuses of the ENOENT error code
in the past, so it is probably better to be cautious.
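The intended control flow can be sketched in userspace as follows. The result structure and function name are hypothetical, and positive errno values are used for simplicity; only ENOENT from the lookup step leads to a negative dentry.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

struct sketch_result {
    bool dentry_instantiated;  /* the d_splice_alias-like step ran */
    bool negative;             /* dentry populated without an inode */
    int error;                 /* value handed back to the caller */
};

/* lookup_code models the return of afs_lookup(); 0 means an inode was found. */
static struct sketch_result
sketch_lookup(int lookup_code)
{
    struct sketch_result r = { false, false, 0 };
    bool have_inode = (lookup_code == 0);

    if (lookup_code == ENOENT) {
        /* Not really an error: continue and create a negative dentry. */
        have_inode = false;
        lookup_code = 0;
    }
    if (lookup_code != 0) {
        /* Any other failure: report it and leave the dentry untouched. */
        r.error = lookup_code;
        return r;
    }
    r.dentry_instantiated = true;
    r.negative = !have_inode;
    return r;
}
```

A transient failure such as ETIMEDOUT therefore no longer leaves behind a negative dentry that would turn later lookups into ENOENT.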
Many thanks to Gaja Sophie Peters for assistance in tracking down and
testing fixes for this issue, including providing access to test systems
experiencing the buggy behavior.
FIXES 133654
Change-Id: Ia9aab289d5c041557ab6b00f1d41de2edfc97a89
Reviewed-on: https://gerrit.openafs.org/12637 Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Joe Gorse <jhgorse@gmail.com> Reviewed-by: Michael Meffie <mmeffie@sinenomine.net> Tested-by: Michael Meffie <mmeffie@sinenomine.net>
Andrew Deason [Thu, 15 Jun 2017 20:29:48 +0000 (15:29 -0500)]
LINUX: Rearrange afs_linux_lookup cleanup
Currently, the cleanup and error handling in afs_linux_lookup is
structured similar to this pseudocode:
    if (!code) {
        if (!IS_ERR(newdp)) {
            return no_error;
        } else {
            return newdp_error;
        }
    } else {
        return code_error;
    }
The multiple different nested error cases make this a little complex.
To make this easier to follow for subsequent changes, alter this
structure to be more like this:
    if (IS_ERR(newdp)) {
        return newdp_error;
    }
    if (code) {
        return code_error;
    }
    return no_error;
There should be no functional change in this commit; it is just code
reorganization.
Technically the ordering of these checks is changed, but there is no
combination of conditions that actually results in different code
being hit. That is, if 'code' is nonzero and IS_ERR(newdp) is true,
then we would go through a different path. But that cannot happen,
since if 'code' is nonzero, we have no inode and so IS_ERR(newdp)
cannot be true (d_splice_alias cannot return an error for a NULL
inode). So there is no functional change.
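The equivalence argument can be checked mechanically with a small sketch. The enum and function names are hypothetical; 'code' and the IS_ERR(newdp) result are reduced to plain values.

```c
#include <assert.h>
#include <stdbool.h>

enum sketch_outcome {
    SKETCH_NO_ERROR,
    SKETCH_NEWDP_ERROR,
    SKETCH_CODE_ERROR
};

/* The nested structure used before this commit. */
static enum sketch_outcome
sketch_old_cleanup(int code, bool newdp_is_err)
{
    if (!code) {
        if (!newdp_is_err)
            return SKETCH_NO_ERROR;
        else
            return SKETCH_NEWDP_ERROR;
    } else {
        return SKETCH_CODE_ERROR;
    }
}

/* The flattened structure introduced by this commit. */
static enum sketch_outcome
sketch_new_cleanup(int code, bool newdp_is_err)
{
    if (newdp_is_err)
        return SKETCH_NEWDP_ERROR;
    if (code)
        return SKETCH_CODE_ERROR;
    return SKETCH_NO_ERROR;
}

/* Count reachable input combinations on which the two structures disagree. */
static int
sketch_count_differences(void)
{
    int diffs = 0;

    for (int code = 0; code <= 1; code++) {
        for (int err = 0; err <= 1; err++) {
            if (code && err)
                continue;   /* unreachable per the reasoning above */
            if (sketch_old_cleanup(code, err) != sketch_new_cleanup(code, err))
                diffs++;
        }
    }
    return diffs;
}
```

The two versions only diverge on the combination excluded above (nonzero 'code' together with an error in newdp).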
Change-Id: I94a3aef5239358c3d13fe5314044dcc85914d0a4
Reviewed-on: https://gerrit.openafs.org/12636 Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Joe Gorse <jhgorse@gmail.com> Reviewed-by: Michael Meffie <mmeffie@sinenomine.net> Tested-by: Michael Meffie <mmeffie@sinenomine.net>
Andrew Deason [Fri, 23 Jun 2017 22:20:11 +0000 (17:20 -0500)]
afs: Improve "Corrupt directory" warning
This warning is a bit confusing to see, since it doesn't say anything
about AFS (making it unclear where it's coming from), and it lacks a
trailing newline (making it ugly). Fix both of these.
Michael Meffie [Fri, 2 Jun 2017 19:19:26 +0000 (15:19 -0400)]
bozo: do not fail silently on unknown bosserver options
Instead of failing silently when the bosserver is started with an
unknown option, print an error message and exit with a non-zero value.
Continue to exit with 0 when the -help option is given to request the
usage message.
This change should help make bosserver startup failures more obvious
when an unsupported option is specified. Example systemd status message:
    systemd[1]: Starting OpenAFS Server Service...
    bosserver[32308]: Unrecognized option: -bogus
    bosserver[32308]: Usage: bosserver [-noauth] ....
    systemd[1]: openafs-server.service: main process exited,
                code=exited, status=1/FAILURE
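The exit-status policy can be sketched like this. The function is hypothetical and recognizes only the -noauth option shown in the usage line above; the real bosserver option parsing is more extensive.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Returns -1 when startup should continue, otherwise the exit status the
 * process should use.
 */
static int
sketch_check_option(const char *arg)
{
    assert(arg != NULL);
    if (strcmp(arg, "-help") == 0) {
        printf("Usage: bosserver [-noauth] ....\n");
        return 0;               /* usage was requested: success */
    }
    if (strcmp(arg, "-noauth") == 0)
        return -1;              /* recognized: keep going */
    fprintf(stderr, "Unrecognized option: %s\n", arg);
    fprintf(stderr, "Usage: bosserver [-noauth] ....\n");
    return 1;                   /* unknown option: fail visibly */
}
```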
Jeffrey Altman [Sat, 27 May 2017 18:59:04 +0000 (14:59 -0400)]
rx: wake up send after 'twind' has been updated
Beginning in AFS 3.4 and 3.5 the ack trailer includes the size of the
peer's receive window. This value is used to update the sender's
transmit window (twind). When the twind is increased the application
thread is signaled to indicate that more packets can be sent.
This change wakes the application thread after twind is updated by
the peer's receive window instead of beforehand. Failure to do so
can result in 100ms transmit delays when the receive window transitions
from closed to open.
Michael Meffie [Fri, 7 Apr 2017 02:50:41 +0000 (22:50 -0400)]
redhat: update rpm spec file
Update the spec file to keep up with accumulated changes.
* Correct installation location of db check programs.
* Install afsd to the legacy location to avoid breaking
init scripts and systemd configs.
* Exclude yet another duplicated copy of kpwvalid.
* libubik_pthread.a is gone.
* Install the kpwvalid man page.
* Continue to remove the obsolete kdb program.
* Update the names of the pam_afs symlinks.
* Add libkopenafs to authlibs.
* Package dafssync-debug man pages.
* Package opr/queue.h in devel.
* Package akeyconvert and man page.
* Do not package fuse version of afsd. A separate sub-package
for afsd.fuse is warranted, since it adds new libfuse
dependencies.
* Package new server man pages, including dafssync-* pages.
* Package libafsrfc3961.a as a devel lib.
* Continue to package kauth programs.
Change-Id: I875c3b8dee53abbc67b0f05f8b291bb58abf41a5
Reviewed-on: https://gerrit.openafs.org/12595 Reviewed-by: Michael Meffie <mmeffie@sinenomine.net> Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Tested-by: Benjamin Kaduk <kaduk@mit.edu>
Tim Creech [Sun, 5 Mar 2017 23:13:45 +0000 (18:13 -0500)]
FBSD: build fix for FreeBSD 11
r285819 eliminated b_saveaddr from struct buf, while r292373 changed the
arguments to VOP_GETPAGES. The approach used by this patch to address
these changes was inspired by FreeBSD's nfs and samba clients.
Michael Meffie [Wed, 5 Apr 2017 20:48:36 +0000 (16:48 -0400)]
redhat: convert rpm spec file to make install
Convert the build and install from the deprecated 'make dest' to the
modern 'make install' method.
* Clarify the install section by unrolling the shell scripts,
reorganizing, and improving the comments.
* Remove the gzip glob of the man pages; rpmbuild automatically
compresses the man pages and will handle symlinks correctly.
* Remove the generated temporary list file and specify files directly.
* Remove the extra tar commands to install the man pages out of the doc
directory; 'make install..' installs the man pages.
* Remove code in the install section which determines the sysname. This is
no longer needed during the install.
* Update the kernel module install commands to accommodate the
conversion from 'make dest'.
Change-Id: I97ec80185a2b11704b27ea74941b50ff4a5aca8c
Reviewed-on: https://gerrit.openafs.org/12594 Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Tested-by: Benjamin Kaduk <kaduk@mit.edu>
Stephan Wiesand [Tue, 11 Apr 2017 09:58:55 +0000 (11:58 +0200)]
Linux: only include cred.h if it exists
Commit c89fd17df1032ec2eacc0d0c9b73e19c5e8db7d2 introduced an explicit
include of linux/cred.h since the latest kernel no longer includes it
implicitly in sched.h. Alas, older kernels (like 2.6.18) don't have this
file. Add a configure test for the existence of cred.h and only include
it if actually present.
This breaks existing OpenAFS autoconf tests for recalc_sigpending() and
task_struct.signal->rlim, so that the OpenAFS kernel module can no
longer build.
Modify OpenAFS autoconf tests to cope.
Change-Id: Ic9f174b92704eabcbd374feffe5fbeb92c8987ce
Reviewed-on: https://gerrit.openafs.org/12573 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Joe Gorse <jhgorse@gmail.com> Tested-by: Joe Gorse <jhgorse@gmail.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Reviewed-by: Michael Meffie <mmeffie@sinenomine.net>
statx: Add a system call to make enhanced file info available
The Linux getattr inode operation is altered to take two additional
arguments: a u32 request_mask and an unsigned int flags that indicates
the synchronisation mode. This change is propagated to the
vfs_getattr*() functions.
- int (*getattr) (struct vfsmount *, struct dentry *, struct kstat *);
+ int (*getattr) (const struct path *, struct kstat *,
+ u32 request_mask, unsigned int sync_mode);
The first argument, request_mask, indicates which fields of the statx
structure are of interest to the userland call. The second argument,
flags (named sync_mode in the prototype above), may currently take the
values defined in include/uapi/linux/fcntl.h and is optionally used for
cache coherence:
(1) AT_STATX_SYNC_AS_STAT tells statx() to behave as stat() does.
(2) AT_STATX_FORCE_SYNC will require a network filesystem to
synchronise its attributes with the server - which might require
data writeback to occur to get the timestamps correct.
(3) AT_STATX_DONT_SYNC will suppress synchronisation with the server in
a network filesystem. The resulting values should be considered
approximate.
This patch provides a new autoconf test and conditional compilation to
cope with the changes in our getattr implementation.
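As an illustration of the three synchronisation modes, a handler might classify the flags argument as below. The SKETCH_ constants mirror the AT_STATX_* values from include/uapi/linux/fcntl.h as best understood here, and the enum and function are hypothetical, not part of the OpenAFS getattr implementation.

```c
#include <assert.h>

/* Local copies of the AT_STATX_* sync-type values (assumed, see above). */
#define SKETCH_AT_STATX_SYNC_TYPE    0x6000
#define SKETCH_AT_STATX_SYNC_AS_STAT 0x0000
#define SKETCH_AT_STATX_FORCE_SYNC   0x2000
#define SKETCH_AT_STATX_DONT_SYNC    0x4000

enum sketch_sync {
    SKETCH_AS_STAT,   /* behave as stat() does */
    SKETCH_FORCE,     /* synchronise attributes with the server */
    SKETCH_DONT,      /* use cached attributes, possibly approximate */
    SKETCH_INVALID    /* both sync bits set */
};

/* Mask out unrelated flag bits and map the sync type to a mode. */
static enum sketch_sync
sketch_classify_sync(unsigned int flags)
{
    switch (flags & SKETCH_AT_STATX_SYNC_TYPE) {
    case SKETCH_AT_STATX_SYNC_AS_STAT:
        return SKETCH_AS_STAT;
    case SKETCH_AT_STATX_FORCE_SYNC:
        return SKETCH_FORCE;
    case SKETCH_AT_STATX_DONT_SYNC:
        return SKETCH_DONT;
    default:
        return SKETCH_INVALID;
    }
}
```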
Change-Id: Ie4206140ae249c00a8906331c57da359c4a372c4
Reviewed-on: https://gerrit.openafs.org/12572 Reviewed-by: Joe Gorse <jhgorse@gmail.com> Tested-by: Joe Gorse <jhgorse@gmail.com> Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Michael Meffie <mmeffie@sinenomine.net> Reviewed-by: Mark Vitale <mvitale@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Jonathon Weiss [Thu, 10 Nov 2016 22:06:18 +0000 (17:06 -0500)]
Prevent double-starting client on RHEL7
On RHEL7 if the AFS client is stopped with 'service openafs-client
stop', but that fails for some reason (most commonly because some
process has a file or directory in AFS open) systemd will decide that
the openafs-client is in a failed state when it is actually running.
If one then runs 'service openafs-client start' systemd will start a
new AFS client. At this point AFS access will continue to work until
the functional AFS client is (successfully) stopped, at which point a
reboot is required to recover.
Have systemd check the status of 'fs sysname' before starting the
AFS client, so we avoid getting into a state that requires a reboot.
Change-Id: I28a5cca746823d69183ea5ce65c10e1725009c5c
Reviewed-on: https://gerrit.openafs.org/12443 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Michael Meffie <mmeffie@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Benjamin Kaduk [Tue, 21 Feb 2017 04:18:09 +0000 (22:18 -0600)]
XBSD: do not claim AFS_VM_RDWR_ENV
The AFS_VM_RDWR_ENV configuration parameter (defined or not defined
in each platform's param.h) is undocumented, but appears to be an
indication of a property of the platform OS's VFS layer, or perhaps just
of the complexity of the read/write vnops that we implement for it.
That is, AFS_VM_RDWR_ENV is defined when the read/write vnops implement
partial write logic (and presumably when they interact with the OS VM
layer in ways not expressed by the afs_write() abstraction); systems
that do not define AFS_VM_RDWR_ENV can use the afs_write() function
fairly directly as the vnode operation. Use of AFS_VM_RDWR_ENV
evolved over time, with the original (AFS 3.2/3.3-era) code using a
simple scheme that handled partial writes directly in afs_write()
and avoided complexity in callers. In AFS 3.4, sunos and solaris
gained a more complicated read/write vnop that incorporated the
afs_DoPartialWrite() call itself, and eventually in 3.6 we see the
behavior established at the original IBM import, with all the (Unix)
OSes supported at that time defining AFS_VM_RDWR_ENV.
When DARWIN support was brought in in commit
a41175cfbbf4d06ccfe14ae54bef8b7464ecd80b, its param.h properly did
not define AFS_VM_RDWR_ENV, as OS X uses a VFS interface that shares
some level of abstraction with the traditional BSD VFS and its
read/write/getpages/putpages operations, so the afs_write() behavior
was natural and no extra complications were needed for integration with
the VM layer or other optimizations.
However, when the initial FreeBSD support came in a few months later,
it seems to have taken inspiration from the OSes that were supported
in the initial IBM import, and kept the AFS_VM_RDWR_ENV definition.
This was then propagated to all the later BSDs as they were added.
Perhaps the most noticeable consequence of this definition is that
the calls to afs_DoPartialWrite() from afs_write() are bypassed, with
a comment that "[i]f write is implemented via VM, afs_DoPartialWrite()
is called from the high-level write op" (and calls to afs_FakeOpen()
and afs_FakeClose() are similarly skipped). This means that attempting
to write a file larger than the local cache will hang waiting for
more space to be freed, which will never happen as the vcache remains
locked and will not be written out in the background.
In the absence of a documented meaning for AFS_VM_RDWR_ENV, this
also gives us a proxy that can be used to indicate whether a given
OS's support intended to claim the AFS_VM_RDWR_ENV -- such platforms
will actually contain the call to afs_DoPartialWrite() in the
appropriate vnode operation. This can be used to sanity-check the
places where AFS_VM_RDWR_ENV is removed by this commit. Interestingly,
HP-UX does not call afs_DoPartialWrite() but also is clearly in a VFS
that uses a rdwr()-based approach, as the corresponding vnode operation
is implemented by mp_afs_rdwr(), so leave it unchanged for now.
Tim Creech is responsible for noting the lack of calls to
afs_DoPartialWrite() on FreeBSD, and Chaskiel Grundman for the
historical research into pre-OpenAFS AFS behavior.
Designing and implementing more complicated BSD read/write vnops that
incorporate afs_DoPartialWrite() and other improvements is left for
future work.
Change-Id: I8e89855ac31303934f97d0753b64899fb7e3867c
Reviewed-on: https://gerrit.openafs.org/12520 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Antoine Verheijen <apv@ualberta.ca> Reviewed-by: Tim Creech <tcreech@tcreech.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Marcio Barbosa [Tue, 31 Jan 2017 14:43:18 +0000 (11:43 -0300)]
vol: detach offline volumes on dafs
Taking a volume offline always clears the inService bit. Taking a
volume out of service also takes it offline. Therefore, if the
inService flag is false, the volume in question should be offline.
On dafs, an offline volume should be unattached.
The attach2() function does not change the state of the volume received
as an argument to unattached when the inService flag is false. Instead,
this function changes the state of the volume in question to
pre-attached and returns VNOVOL to the client. As a result, subsequent
accesses to this volume will make the server try and fail to attach
this offline volume over and over again, writing to the FileLog each
time.
To fix this problem, detach the volume received as an argument if the
inService flag is false. Since the new state of this volume will be
unattached, subsequent accesses will not hit attach2().
This situation where a volume is not offline but is also not in service
can occur if a volume is taken offline with vos offline and some time
later the DAFS fileserver is shut down and restarted; the volume is
placed into the preattach state by default when the server restarts.
Each access to the volume by clients then causes the fileserver to
attempt to attach the volume, which fails, since the in-service flag in
the volume header is false from the previous vos offline. The
fileserver will log a warning to the FileLog on each attempt to attach
the volume, and this will fill the FileLog with duplicate messages
corresponding to the number of attempted accesses.
Mark Vitale [Tue, 28 Feb 2017 23:02:39 +0000 (18:02 -0500)]
SOLARIS: prevent BAD TRAP panic with Studio 12.5
Starting with Solaris Studio 12.3, it is documented that Solaris kernel
modules (such as libafs) must not use any floating point, vector, or
SIMD/SSE instructions on x86 hardware. However, each new Studio
compiler release (12.4 and especially 12.5) is more likely to use these
types of instructions by default.
If the libafs kernel module includes any forbidden kernel instructions,
Solaris will panic the system with:
BAD TRAP: type=7 (#nm Device not available)
Provide a new autoconf test to specify the required compiler options
(-xvector=%none -xregs=no%float) when building the OpenAFS kernel module
for Solaris, so that no invalid x86 instructions are used.
In addition, reinstate default kernel module optimization for Solaris.
It had been disabled in commit 80592c53cbb0bce782eb39a5e64860786654be9f
to address this same issue in Studio 12.3 and 12.4. However, Studio
12.5 started using some SSE instructions even with no optimization.
This commit has been tested with OpenAFS master and Studio 12.5 at all
optimization levels (none, -xO1 through -xO5) and verified to contain no
XMM register instructions via the following command:
$ gobjdump -dlr libafs64.o | grep xmm | wc -l
Mark Vitale [Tue, 21 Feb 2017 01:16:47 +0000 (20:16 -0500)]
DAFS: do not save or restore host state if CPS in progress
If a fileserver is shut down while one or more PR_GetHostCPS calls
are in progress, this state is saved in the fsstate.dat file as
hostFlags HCPS_WAITING, HCPS_INPROGRESS. Other hosts that are
merely waiting will have HCPS_WAITING recorded.
However, it makes no sense to restore host structs in this state,
because the GetCPS calls will no longer be in progress. Once these
hosts become active, they will block server threads and quickly cause
all server threads to be exhausted as other CPS requests are blocked
behind them.
Instead, exclude these states from both save and restore.
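A minimal sketch of such a filter, assuming the two states are bit flags. The bit values and the function name are placeholders, not the real host.h definitions.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_HCPS_INPROGRESS 0x01  /* placeholder bit values */
#define SKETCH_HCPS_WAITING    0x02

/* A host is saved or restored only if no CPS request is pending on it. */
static bool
sketch_host_state_is_persistable(unsigned int host_flags)
{
    return (host_flags &
            (SKETCH_HCPS_INPROGRESS | SKETCH_HCPS_WAITING)) == 0;
}
```

Applying the same predicate on both the save and the restore path also protects against state files written by an older fileserver.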
Marcio Barbosa [Thu, 2 Mar 2017 21:01:48 +0000 (18:01 -0300)]
osx: build afscell only for active architecture
The InstallerPlugins framework provided by the MacOSX10.12.sdk does not
define symbols for architecture i386. As a result, the OpenAFS code
cannot be built on OS X 10.12.
To fix this problem, build the afscell xcode project only for the active
architecture.
Michael Meffie [Thu, 11 Jun 2015 17:14:27 +0000 (13:14 -0400)]
libafs: vldb cache timeout option (-volume-ttl)
The unix cache manager caches VLDB information for read-only volumes as
long as a volume callback is held for a read-only volume. The volume
callback may be held as long as files in the read-only volume are being
accessed. The cache manager caches VLDB information for read/write
volumes as long as volume level errors (such as VMOVED) are not returned
by a fileserver while accessing files within the volume.
Add a new option to set the maximum amount of time VLDB information will
be cached, even if a callback is still held for a read-only volume, or
no volume errors have been encountered while accessing files in
read/write volumes.
This avoids situations where the vldb information is cached indefinitely
for read-only and read/write volumes. Instead, the VL servers will be
periodically probed for volume information.
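The expiry rule can be sketched as follows. The structure, field, and function names are hypothetical, and treating a TTL of 0 as "no limit" is an assumption made for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>

struct sketch_vol_entry {
    time_t setup_time;   /* when the VLDB info was last fetched */
};

/* ttl == 0 is assumed to mean "no limit", i.e. the previous behavior. */
static bool
sketch_vldb_expired(const struct sketch_vol_entry *e, time_t now, time_t ttl)
{
    assert(e != NULL);
    if (ttl == 0)
        return false;
    return now - e->setup_time >= ttl;
}
```

An expired entry would be refreshed from the VL servers on next use, regardless of callbacks or the absence of volume errors.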
Change-Id: I5f2a57cdaf5cbe7b1bc0440ed6408226cc988fed
Reviewed-on: https://gerrit.openafs.org/11898 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Michael Meffie <mmeffie@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Sergio Gelato [Wed, 22 Feb 2017 21:55:33 +0000 (13:55 -0800)]
LINUX: Debian/Ubuntu build regression on kernel 3.16.39
Now that kernel 4.9 has hit jessie-backports, it becomes desirable to
also backport the associated openafs patches.
Unfortunately, Linux-4.9-inode_change_ok-becomes-setattr_prepare.patch
causes a build failure against jessie's current default kernel,
3.16.39-1, due to the fact that setattr_prepare() is available (it was
cherrypicked to address CVE-2015-1350) but file_dentry() is not (it was
introduced in kernel 4.6).
This makes it difficult to have a version of openafs for jessie that
supports both kernels.
To deal with this, follow the implementation of file_dentry() in 4.6,
and simplify it to account for the lack of d_real() support in older
kernels.
Note that inode_change_ok() has been added back to 3.16.39-1 to avoid
ABI changes. That means the current openafs packages in jessie continue
to work with kernel 3.16.39-1 since they do not include
Linux-4.9-inode_change_ok-becomes-setattr_prepare.patch.
Originally reported at
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=855366
FIXES RT134158
Change-Id: I157aa2ff25945c1c6e3b8e4a600557125711a681
Reviewed-on: https://gerrit.openafs.org/12523 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Michael Meffie <mmeffie@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Mark Vitale [Wed, 7 Dec 2016 16:11:45 +0000 (11:11 -0500)]
Linux 4.10: have_submounts is gone
Linux commit f74e7b33c37e ("vfs: remove unused have_submounts() function",
v4.10-rc2) removes have_submounts from the tree after providing a
replacement (path_has_submounts) for its last in-tree caller, autofs.
However, it turns out that OpenAFS is better off not using the new
path_has_submounts. Instead, OpenAFS could/should have stopped using
have_submounts() much earlier, back in Linux v3.18 when d_invalidate
became void. At that time, most in-tree callers of have_submounts had
already been converted to use check_submounts_and_drop back in v3.12.
At v3.18, a series of commits modified check_submounts_and_drop to
automatically remove child submounts (instead of returning -EBUSY if a
submount was detected), then subsumed it into d_invalidate. The end
result was that VFS now implicitly handles much of the housekeeping
previously called explicitly by the various filesystem d_revalidate
routines:
- shrink_dcache_parent
- check_submounts_and_drop
- d_drop
- d_invalidate
All in-tree filesystem d_revalidate routines were updated to take
advantage of this new VFS support.
Modify afs_linux_dentry_revalidate to no longer perform any special
handling for invalid dentries when D_INVALIDATE_IS_VOID. Instead, allow
our VFS caller to properly clean up any invalid dentry when we return 0.
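The resulting control flow can be sketched as below, using a userspace mock. This is not the real afs_linux_dentry_revalidate: the struct, the legacy_drop() helper, and the function name are hypothetical, and D_INVALIDATE_IS_VOID is defined by hand here to stand in for the configure result on newer kernels.

```c
#include <stdbool.h>

/* Assume a kernel where d_invalidate() returns void. */
#define D_INVALIDATE_IS_VOID 1

/* Mock dentry: tracks validity and how often explicit cleanup was done. */
struct dentry {
    bool valid;
    int  drop_calls;
};

/* Stand-in for the explicit housekeeping that older kernels required. */
void
legacy_drop(struct dentry *dp)
{
    dp->drop_calls++;
}

/* Return 1 if the dentry is still valid, 0 if the caller should discard it. */
int
dentry_revalidate_sketch(struct dentry *dp)
{
    if (dp->valid)
        return 1;

#ifndef D_INVALIDATE_IS_VOID
    /* Older kernels: the filesystem cleans up the invalid dentry itself. */
    legacy_drop(dp);
#endif

    /* Newer kernels: just report "invalid" and let the VFS clean up. */
    return 0;
}
```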
Change-Id: I0c4d777e6d445857c395a7b5f9a43c9024b098e9
Reviewed-on: https://gerrit.openafs.org/12506 Reviewed-by: Michael Meffie <mmeffie@sinenomine.net> Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Joe Gorse [Thu, 16 Feb 2017 23:01:50 +0000 (18:01 -0500)]
LINUX: Bring debug symbols back to the Linux kernel module.
Starting with Linux 4.8 kernels, our existing build script
generator, make_kbuild_makefile.pl, does not pass the debugging
symbols CFLAGS that were present when building for previous kernels.
This fix appends the $(KERN_DBG) variable which will only be defined
when the configuration includes the --enable-debug-kernel option.
Change-Id: I9a85dc0311a3a706239bc9e471b2d7197ebe1946
Reviewed-on: https://gerrit.openafs.org/12519 Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Tested-by: Benjamin Kaduk <kaduk@mit.edu>
Michael Meffie [Fri, 10 Feb 2017 15:39:09 +0000 (10:39 -0500)]
build: add --without-swig to override swig check
Add the --without-swig option to disable the automatic swig detection
and disable the optional features which depend on swig. This allows
builders to avoid swig even if present on the build system.
Also, add the --with-swig option to force the check and fail if not
detected. This allows builders to declare the swig features are
mandatory.
The default continues to be to check for swig, and if present, build the
optional features which require swig.
To disable the automatic check for swig and disable the features which
depend on swig:
    ./configure --without-swig # or --with-swig=no
To force the check and fail if swig is not present on the system:
    ./configure --with-swig # or --with-swig=yes
If --with-swig is given and swig is not detected, then configure will
fail with the message:
    configure: error: swig requested but not found
The Perl 5 bindings for libuafs are the only feature which requires swig
at this time.
Andrew Deason [Fri, 10 Feb 2017 07:29:28 +0000 (01:29 -0600)]
PERLUAFS: Modernize lang-specific swig typemaps
Currently, our swig bindings for PERLUAFS define a couple of typemaps
like so:
    %typemap(in, numinputs=1, perl5) (char *READBUF, int LENGTH) {
        [...]
    }
Embedding the target language name in the typemap arguments is a very
old way of specifying what language the typemap is for; this syntax was
removed after swig 1.1. With swig 3.0.x releases (and possibly
others), the specific combination of this deprecated syntax and some
other features we're using causes a segfault. That's clearly a bug in
swig, but we shouldn't be using the deprecated syntax anyway.
Update this to instead use preprocessor symbols to specify
language-specific typemaps (#ifdef SWIGPERL). We only actually define
these for perl right now, so make sure to throw an error if we're not
running for perl.
FIXES 134103
Change-Id: I14264a2dfada53d99413808ed5d60b79b1ee44f3
Reviewed-on: https://gerrit.openafs.org/12517 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Michael Meffie <mmeffie@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>