git.michaelhowe.org Git - packages/o/openafs.git/log

New upstream version 1.8.0~pre3

Update NEWS for rx security fix

Change-Id: I30282ac8f51a7b16dd851fdbd41464f8fdafc279

OPENAFS-SA-2017-001: rx: Sanity-check received MTU and twind values

Rather than blindly trusting the values received in the
(unauthenticated) ack packet trailer, apply some minimal sanity checks
to received values. natMTU and regular MTU values are subject to
Rx minimum/maximum packet sizes, and the transmit window cannot drop
below one without risk of deadlock.

The maxDgramPackets value that can also be present in the trailer
already has sufficient sanity checking.

Extremely low MTU values (less than 28 == RX_HEADER_SIZE) can cause us
to set a negative "maximum usable data" size that gets used as an
(unsigned) packet length for subsequent allocation and computation,
triggering an assertion when the connection is used to transmit data.
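
The checks described above amount to clamping untrusted values into a
safe range. A minimal sketch follows, assuming hypothetical bounds; the
names and the numeric limits are illustrative stand-ins and are not taken
from the Rx headers.

```c
#include <stdint.h>

/* Illustrative bounds only; the real limits are defined by Rx. */
#define SKETCH_MIN_PACKET_SIZE 576u    /* assumed minimum packet size */
#define SKETCH_MAX_PACKET_SIZE 16384u  /* assumed maximum packet size */

/* Clamp an MTU value taken from an untrusted ack trailer. */
static uint32_t
sketch_clamp_mtu(uint32_t received)
{
    if (received < SKETCH_MIN_PACKET_SIZE)
        return SKETCH_MIN_PACKET_SIZE;
    if (received > SKETCH_MAX_PACKET_SIZE)
        return SKETCH_MAX_PACKET_SIZE;
    return received;
}

/* A transmit window of zero risks deadlock, so never go below one. */
static uint32_t
sketch_clamp_twind(uint32_t received)
{
    return received < 1 ? 1 : received;
}
```

With a lower bound above RX_HEADER_SIZE, subtracting the header size from
the clamped MTU can no longer produce a negative "usable data" size.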

FIXES 134450

(cherry picked from commit 894555f93a2571146cb9ca07140eb98c7a424b01)

Change-Id: I98e2a65d1aa291a73e8cfed9c9eaac71c6af00dc

Make OpenAFS 1.8.0pre3

Update the version strings for the third 1.8.0 prerelease.

Change-Id: I25a4eee4de04e57ffcf9055f69ae9a3d683b8d64
Reviewed-on: https://gerrit.openafs.org/12765
Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Michael Meffie <mmeffie@sinenomine.net>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>

Update NEWS for 1.8.0pre3

Change-Id: I38110825cbe8b5c4ca18d86e4542374ae26f6fd4
Reviewed-on: https://gerrit.openafs.org/12764
Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Reviewed-by: Michael Meffie <mmeffie@sinenomine.net>

afs: Fix bounds check in PNewCell

Reported by the opensuse buildbot:

  CC [M]  /home/buildbot/opensuse-tumbleweed-i386-builder/build/src/libafs/MODLOAD-4.13.12-1-default-MP/rx_packet.o
/home/buildbot/opensuse-tumbleweed-i386-builder/build/src/afs/afs_pioctl.c: In function ‘PNewCell’:
/home/buildbot/opensuse-tumbleweed-i386-builder/build/src/afs/afs_pioctl.c:3075:55: error: ‘*’ in boolean context, suggest ‘&&’ instead [-Werror=int-in-bool-context]
     if ((afs_pd_remaining(ain) < AFS_MAXCELLHOSTS +3) * sizeof(afs_int32))
         ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~

The bug was introduced in commit 718f85a8b6.
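
To see why the misplaced parenthesis matters, here is a small sketch that
contrasts the reported expression with what appears to be the intended
one. The helper names and the SKETCH_MAXCELLHOSTS value are hypothetical,
and the "fixed" form is a reading of the intent, not a copy of the actual
patch.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAXCELLHOSTS 8  /* stand-in for AFS_MAXCELLHOSTS; value assumed */

/* Reported form: the comparison yields 0 or 1 and is then multiplied by a
 * nonzero size, so the check only fires when fewer than MAXCELLHOSTS + 3
 * bytes remain, rather than fewer than that many 32-bit words. */
static int
sketch_too_short_buggy(size_t remaining)
{
    return ((remaining < (size_t)(SKETCH_MAXCELLHOSTS + 3))
            * sizeof(int32_t)) != 0;
}

/* Presumed intent: compare against the full byte count. */
static int
sketch_too_short_fixed(size_t remaining)
{
    return remaining < (size_t)(SKETCH_MAXCELLHOSTS + 3) * sizeof(int32_t);
}
```

A buffer with 20 bytes remaining passes the buggy check even though 44
bytes would be needed under these assumed values.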

Reviewed-on: https://gerrit.openafs.org/12782
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Michael Meffie <mmeffie@sinenomine.net>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
(cherry picked from commit 4fa0ee620cfb9991ca9748b5ee116cc8e1e6c505)

Change-Id: I0963403846a62dddf2d13ce3c03d772a6d869119
Reviewed-on: https://gerrit.openafs.org/12784
Reviewed-by: Michael Meffie <mmeffie@sinenomine.net>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>

rx: fix call refcount leak in error case

The recent event handling normalization in commit
304d758983b499dc568d6ca57b6e92df24b69de8 had event handlers switch
to dropping their reference on the associated connection/call just
before return. An early return case was missed in the conversion,
leading to a refcount leak in an error case.

Reviewed-on: https://gerrit.openafs.org/12781
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Michael Meffie <mmeffie@sinenomine.net>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
(cherry picked from commit 66b74e78ba5fea6a8236dcd3b8b46e1dfa6a0ac7)

Change-Id: I532c49b2ef6ec95dd26a99c02e12ea53348f9690
Reviewed-on: https://gerrit.openafs.org/12783
Reviewed-by: Michael Meffie <mmeffie@sinenomine.net>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>

afs: fix kernel_write / kernel_read arguments

The order / content of the arguments passed to kernel_write and
kernel_read are not right. As a result, the kernel will panic if one of
the functions in question is called.

[kaduk@mit.edu: include configure check for multiple kernel_read()
variants, per linux commits bdd1d2d3d251c65b74ac4493e08db18971c09240
and e13ec939e96b13e664bb6cee361cc976a0ee621a]

FIXES 134440

Reviewed-on: https://gerrit.openafs.org/12769
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Tested-by: Marcio Brito Barbosa <mbarbosa@sinenomine.net>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
(cherry picked from commit 3ce55426ee6912b78460465bcaa1428333ad1fbc)

Change-Id: I28f04f7625a471c37f98515d5186f80082bf6a43
Reviewed-on: https://gerrit.openafs.org/12780
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>

tests: fix out of bounds access in the rx-event test

Use the NUMEVENTS symbol which defines the array size instead of an
incorrect hard coded number when checking if a second event can be added
to be fired at the same time. This fixes a potential out of bounds
access of the event test array.
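
A sketch of the idea, not the test's actual code: the bound is derived
from the symbol that sizes the array, so the two cannot drift apart. The
struct, the helper name, and the NUMEVENTS value used here are
illustrative assumptions.

```c
#include <stddef.h>

#define NUMEVENTS 16  /* array size; the value here is illustrative */

struct sketch_event { int scheduled; };
static struct sketch_event sketch_events[NUMEVENTS];

/* Schedule slot i and, only if it exists, slot i + 1 for the same time.
 * Returns the number of slots scheduled. */
static int
sketch_schedule_pair(size_t i)
{
    int count = 0;

    if (i < NUMEVENTS) {
        sketch_events[i].scheduled = 1;
        count++;
        /* Bound check uses the array-size symbol, not a literal. */
        if (i + 1 < NUMEVENTS) {
            sketch_events[i + 1].scheduled = 1;
            count++;
        }
    }
    return count;
}
```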

Also update the comment, which states the wrong number of events in
the test.

Reviewed-on: https://gerrit.openafs.org/12762
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
(cherry picked from commit 50a3eb7b7ee94bffaadc98429bd404164e89ec7f)

Change-Id: I7a975e7498c1c7416a800c9294c97ee4de4fd57a
Reviewed-on: https://gerrit.openafs.org/12779
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Michael Meffie <mmeffie@sinenomine.net>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>

Sprinkle rx_GetConnection() for concision

Instead of inlining the body (taking the lock, incrementing the
refcount, and dropping the lock), use the convenience function
designed for this purpose.
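
The shape of such a convenience function can be sketched as below. This
is an illustration using a POSIX mutex and a hypothetical struct; it is
not the definition of rx_GetConnection(), which uses Rx's own locks and
types.

```c
#include <pthread.h>

/* Hypothetical stand-in for a connection object. */
struct sketch_conn {
    pthread_mutex_t lock;  /* protects refCount in this sketch */
    int refCount;
};

/* Take the lock, increment the refcount, drop the lock. */
static void
sketch_get_connection(struct sketch_conn *conn)
{
    pthread_mutex_lock(&conn->lock);
    conn->refCount++;
    pthread_mutex_unlock(&conn->lock);
}
```

Callers then write a single call instead of repeating the three-step
sequence at each site.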

Reviewed-on: https://gerrit.openafs.org/12772
Reviewed-by: Mark Vitale <mvitale@sinenomine.net>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
(cherry picked from commit 2ae84bf053fe66b73a2c77b5d71305bae2c17587)

Change-Id: I60794d877a76fbb7c8ba59207e710a20641cc8f1
Reviewed-on: https://gerrit.openafs.org/12778
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Michael Meffie <mmeffie@sinenomine.net>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>

rx: fix mutex leak in error case

Reported by Mark Vitale

Reviewed-on: https://gerrit.openafs.org/12771
Reviewed-by: Mark Vitale <mvitale@sinenomine.net>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
(cherry picked from commit 01bcfd3e14f6ee1faa4b8ce5a7932de37d585fd3)

Change-Id: I4384d6813a5cfb053e6991eb3c157fa59ecfa11b
Reviewed-on: https://gerrit.openafs.org/12777
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Michael Meffie <mmeffie@sinenomine.net>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>

Add event-related mutex assertions

In utility functions that access fields of type struct rxevent *,
assert that the appropriate lock is held for the access in question.

These assertions are only compiled in when built with -DOPR_DEBUG_LOCKS,
which can be enabled by --debug-locks at configure time.

Reviewed-on: https://gerrit.openafs.org/12757
Reviewed-by: Mark Vitale <mvitale@sinenomine.net>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
(cherry picked from commit a7a3108e602c83176c5578c9f28b6312f71aba78)

Change-Id: I147a2e475feffb1b75a08ac5b08614bd6d8f46a5
Reviewed-on: https://gerrit.openafs.org/12776
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>

Standardize rx_event usage

Go over all consumers of the rx event framework and normalize its usage
according to the following principles:

rxevent_Post() is used to create an event, and it returns an event
handle (with a reference on the event structure) that can be used
to cancel the event before its timeout fires.  (There is also an
additional reference on the event held by the global event tree.)
In all(*) usage within the tree, that event handle is stored within
either an rx_connection or an rx_call.  Reads/writes to the member variable
that holds the event handle require either the conn_data_lock or call
lock, respectively -- that means that in most cases, callers of
rxevent_Post() and rxevent_Cancel() will be holding one of those
aforementioned locks.  The event handlers themselves will need to
modify the call/connection object according to the nature of the
event, which requires holding those same locks, and also a guarantee
that the call/connection is still a live object and has not been
deallocated!  Whether or not rxevent_Cancel() succeeds in cancelling
the event before it fires, whenever passed a non-NULL event structure
it will NULL out the supplied pointer and drop a reference on the
event structure.  This is the correct behavior, since the caller
has asked to cancel the event and has no further use for the event
handle or its reference on the event structure.  The caller of
rxevent_Cancel() must check its return value to know whether or
not the event was cancelled before its handler was able to run.

The interaction between the call/connection lock and the lock
protecting the red/black tree of pending events opens up a somewhat
problematic race window.  Because the application thread is expected
to hold the call/connection lock around rxevent_Cancel() (to protect
the write to the field in the call/connection structure that holds
an event handle), and rxevent_Cancel() must take the lock protecting
the red/black tree of events, this establishes a lock order with the
call/connection lock taken before the eventTree lock.  This is in
conflict with the event handler thread, which must take the eventTree
lock first, in order to select an event to run (and thus know what
additional lock would need to be taken, by virtue of what handler
function is to be run).  The conflict is easy to resolve in the
standard way, by having a local pointer to the event that is obtained
while the event is removed from the red/black tree under the eventTree
lock, and then the eventTree lock can be dropped and the event run
based on the local variable referring to it.  The race window occurs
when the caller of rxevent_Cancel() holds the call/connection lock,
and rxevent_Cancel() obtains the eventTree lock just after the event
handler thread drops it in order to run the event.  The event handler
function begins to execute, and immediately blocks trying to obtain
the call/connection lock.  Now that rxevent_Cancel() has the eventTree
lock it can proceed to search the tree, fail to find the indicated event
in the tree, clear out the event pointer from the call/connection
data structure, drop its caller's reference to the event structure,
and return failure (the event was not cancelled).  Only then does the
caller of rxevent_Cancel() drop the call/connection lock and allow
the event handler to make progress.

This race is not necessarily problematic if appropriate care is taken,
but in the previous code such was not the case.  In particular, it
is a common idiom for the firing event to call rxevent_Put() on itself,
to release the handle stored in the call/connection that could have
been used to cancel the event before it fired.  Failing to do so would
result in a memory leak of event structures; however, rxevent_Put() does
not check for a NULL argument, so a segfault (NULL dereference) was
observed in the test suite when the race occurred and the event handler
tried to rxevent_Put() the reference that had already been released by
the unsuccessful rxevent_Cancel() call.  Upon inspection, many (but not
all) of the uses in rx.c were susceptible to a similar race condition
and crash.

The test suite also papers over a related issue in that the event handler
in the test suite always knows that the data structure containing the
event handle will remain live, since it is a global array that is allocated
for the entire scope of the test.  In rx.c, events are associated with
calls and connections that have a finite lifetime, so we need to take care
to ensure that the call/connection pointer stored in the event remains
valid for the duration of the event's lifecycle.  In particular, even an
attempt to take the call/connection lock to check whether the corresponding
event field is NULL is fraught with risk, as it could crash if the lock
(and containing call/connection) has already been destroyed!  There are
several potential ways to ensure the liveness of the associated
call/connection while the event handler runs, most notably to take care
in the call/connection destruction path to ensure that all associated
events are either successfully cancelled or run to completion before
tearing down the call/connection structure, and to give the pending event
its own reference on the associated call/connection.  Here, we opt for
the latter, acknowledging that this may result in the event handler thread
doing the full call/connection teardown and delay the firing of subsequent
events.  This is deemed acceptable, as pending events are for intentionally
delayed tasks, and some extra delay is probably acceptable.  (The various
keepalive events and the challenge event could delay the user experience
and/or security properties if significantly delayed, but I do not believe
that this change admits completely unbounded delay in the event handler
thread, so the practical risk seems minimal.)

Accordingly, this commit attempts to ensure that:

* Each event holds a formal reference on its associated call/connection.
* The appropriate lock is held for all accesses to event pointers in
  call/connection structures.
* Each event handler (after taking the appropriate lock) checks whether
  it raced with rxevent_Cancel() and only drops the call/connection's
  reference to the event if the race did not occur.
* Each event handler drops its reference to the associated call/connection
  *after* doing any actions that might access/modify the call/connection.
* The per-event reference on the associated call/connection is dropped by
  the thread that removes the event from the red/black tree.  That is,
  the event handler function if the event runs, or by the caller of
  rxevent_Cancel() when the cancellation succeeds.
* No non-NULL event handles remain in a call/connection being destroyed,
  which would indicate a refcounting error.

(*) There is an additional event used in practice, to reap old connections,
    but it is effectively a background task that reschedules itself
    periodically, with no handle to the event retained so as to be able
    to cancel it.  As such, it is unaffected by the concerns raised here.
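
The handler-side rules listed above can be sketched roughly as follows.
Every name here is hypothetical, POSIX mutexes stand in for the Rx locks,
and a separate refcount mutex is assumed; this illustrates the pattern
and is not code from rx.c.

```c
#include <pthread.h>
#include <stddef.h>

struct sketch_event { int refs; };

struct sketch_call {
    pthread_mutex_t lock;          /* protects pending and work_done */
    int refs;                      /* references held on the call */
    struct sketch_event *pending;  /* handle usable to cancel the event */
    int work_done;
};

/* Assumed separate lock for call refcounts, so that the call lock is not
 * touched after the last reference may have gone away. */
static pthread_mutex_t sketch_refcnt_lock = PTHREAD_MUTEX_INITIALIZER;

static void
sketch_event_put(struct sketch_event *ev)
{
    ev->refs--;  /* a real implementation would free the event at zero */
}

static void
sketch_call_put(struct sketch_call *call)
{
    pthread_mutex_lock(&sketch_refcnt_lock);
    call->refs--;  /* a real implementation would tear down at zero */
    pthread_mutex_unlock(&sketch_refcnt_lock);
}

/* Event handler: runs after the event was removed from the event tree. */
static void
sketch_handler(struct sketch_event *ev, struct sketch_call *call)
{
    pthread_mutex_lock(&call->lock);
    if (call->pending == ev) {
        /* No race with a cancel: release the call's handle ourselves. */
        call->pending = NULL;
        sketch_event_put(ev);
    }
    /* Otherwise a concurrent cancel already cleared the handle and
     * dropped that reference, so it must not be dropped again. */

    call->work_done++;  /* the actual work of the event */
    pthread_mutex_unlock(&call->lock);

    /* Drop the event's reference on the call only after all accesses. */
    sketch_call_put(call);
}
```

The key points are the comparison of the stored handle against the
running event before releasing it, and dropping the call reference last.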

While here, standardize on the rx_GetConnection() function for incrementing
the reference count on a connection object, instead of inlining the
corresponding mutex lock/unlock and variable access.

In contrast to what was done on master, for the 1.8 branch we do not
force-enable refcount checking.

Reviewed-on: https://gerrit.openafs.org/12756
Reviewed-by: Mark Vitale <mvitale@sinenomine.net>
Reviewed-by: Michael Meffie <mmeffie@sinenomine.net>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
(cherry picked from commit 304d758983b499dc568d6ca57b6e92df24b69de8)

Change-Id: I68e6cc162a148b6ebbabe037a7bc3cccd648423c
Reviewed-on: https://gerrit.openafs.org/12775
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Tested-by: BuildBot <buildbot@rampaginggeek.com>

Adjust rx-event test to exercise cancel/fire race

We currently do not properly handle the case where a thread runs
rxevent_Cancel() in parallel with the event-handler thread attempting
to fire that event, but the test suite only picked up on this issue
in a handful of the Debian automated builds (somewhat less-resourced
ones, perhaps).

Modify the event scheduling algorithm in the test so as to create a
larger chunk of events scheduled to fire "right away" and thereby
exercise the race condition more often when we proceed to cancel
a quarter of events "right away".

Reviewed-on: https://gerrit.openafs.org/12755
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Mark Vitale <mvitale@sinenomine.net>
Reviewed-by: Michael Meffie <mmeffie@sinenomine.net>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
(cherry picked from commit bdb509fb1d8e0fdca05dffecdbcbf60a95ea502e)

Change-Id: I27cebed3c2c3daff10b8d3f5f6f949e667791a72
Reviewed-on: https://gerrit.openafs.org/12774
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Michael Meffie <mmeffie@sinenomine.net>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>

gtx: link against libtinfo if termlib is separated

If ncurses is built with "./configure --with-termlib=tinfo", gtx fails
to link because of an undefined reference to the LINES symbol which is
then provided by libtinfo.so and not libncurses.so.

If ncurses is present, additionally check whether LINES is provided by
ncurses or tinfo and set $LIB_curses accordingly.

This change is based on a patch provided by Bastian Beischer.

FIXES 134420

Reviewed-on: https://gerrit.openafs.org/12760
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
(cherry picked from commit 311f1d28a2f626350b33ad432e674055b62511bd)

Change-Id: I2f69fe51bbefeeb2a17145a88aa9c891644f2f61
Reviewed-on: https://gerrit.openafs.org/12763
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Michael Laß <lass@mail.uni-paderborn.de>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>

Linux: Use kernel_read/kernel_write when __vfs variants are unavailable

We hide the uses of set_fs/get_fs behind a macro, as those functions
are likely to soon become unavailable:

> Christoph Hellwig suggested removing all calls outside of the core
> filesystem and architecture code; Andy Lutomirski went one step
> further and said they should all go.

https://lwn.net/Articles/722267/

Reviewed-on: https://gerrit.openafs.org/12729
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Mark Vitale <mvitale@sinenomine.net>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
(cherry picked from commit 5ee516b3789d3545f3d78fb3aba2480308359945)

Change-Id: I28a7126bf6ab048f8d949f190e557a3fa44f3f46
Reviewed-on: https://gerrit.openafs.org/12737
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Michael Meffie <mmeffie@sinenomine.net>
Reviewed-by: Mark Vitale <mvitale@sinenomine.net>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>

Linux: Test for __vfs_write rather than __vfs_read

The following commit:

    commit eb031849d52e61d24ba54e9d27553189ff328174
    Author: Christoph Hellwig <hch@lst.de>
    Date:   Fri Sep 1 17:39:23 2017 +0200

        fs: unexport __vfs_read/__vfs_write

unexports both __vfs_read and __vfs_write, but keeps the former in
fs.h--as it is still being used by another part of the tree.

This situation results in a false positive in our Autoconf check,
which does not see the export statements, and ends up marking the
corresponding API as available.

That, in turn, causes some code which assumes symmetry with
__vfs_write to fail to compile.

Switch to testing for __vfs_write, which correctly marks the API as
unavailable.

Reviewed-on: https://gerrit.openafs.org/12728
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
(cherry picked from commit 929e77a886fc9853ee292ba1aa52a920c454e94b)

Change-Id: I03e3c8222360a6b04b45b45a8f56b5df054f6783
Reviewed-on: https://gerrit.openafs.org/12736
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Mark Vitale <mvitale@sinenomine.net>
Reviewed-by: Michael Meffie <mmeffie@sinenomine.net>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>

Correct m4 conditionals in curses.m4

AS_IF does not invoke the test(1) shell builtin for us, so we must
take care to consistently use it ourselves.

While here, sprinkle some missing double-quotes around variable
expansions in AS_IF statements in this file.

Submitted by Bastian Beischer.

FIXES 134414

Change-Id: Iccfe311011f17de6317cf64abdc58b0812b81b8c
Reviewed-on: https://gerrit.openafs.org/12738
Reviewed-by: Michael Meffie <mmeffie@sinenomine.net>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Tested-by: Benjamin Kaduk <kaduk@mit.edu>
(cherry picked from commit e0c5ada214596d5adb6798682d5e280cc99f447c)
Reviewed-on: https://gerrit.openafs.org/12739

vol: Fix two buffers being one char too short

Fixes these warnings:

namei_ops.c: In function 'namei_copy_on_write':
namei_ops.c:1328:31: warning: 'snprintf' output may be truncated before the last format character [-Wformat-truncation=]
  snprintf(path, sizeof(path), "%s-tmp", name.n_path);
                               ^~~~~~~~
namei_ops.c:1328:2: note: 'snprintf' output between 5 and 260 bytes into a destination of size 259
  snprintf(path, sizeof(path), "%s-tmp", name.n_path);
  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

vol_split.c: In function 'split_volume':
vol_split.c:576:22: warning: 'sprintf' may write a terminating nul past the end of the destination [-Wformat-overflow=]
     sprintf(symlink, "#%s", V_name(newvol));
                      ^~~~~
vol_split.c:576:5: note: 'sprintf' output between 2 and 33 bytes into a destination of size 32
     sprintf(symlink, "#%s", V_name(newvol));
     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
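
Both warnings come down to the destination needing room for the source
string, the added characters, and the terminating NUL. As a hedged
illustration (the helper name is hypothetical, and the commit itself
addressed the warnings by enlarging the buffers), truncation can be
detected from the snprintf return value:

```c
#include <stdio.h>
#include <string.h>

/* Build "<name>-tmp" in dst. Returns 0 on success, -1 if dst is too
 * small to hold the full result including the terminating NUL. */
static int
sketch_tmp_path(char *dst, size_t dstlen, const char *name)
{
    int n = snprintf(dst, dstlen, "%s-tmp", name);

    /* snprintf returns the length it wanted to write, excluding the NUL. */
    if (n < 0 || (size_t)n >= dstlen)
        return -1;
    return 0;
}
```

An 8-byte buffer holds "abc-tmp" exactly (7 characters plus NUL), while
a name one character longer no longer fits.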

Reviewed-on: https://gerrit.openafs.org/12722
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
(cherry picked from commit 0a9a6b57ce6e1c97fcc651c8cb74e66fc8422a1e)

Change-Id: Ia60439aed7925b786a0213d96a7afb413579e01f
Reviewed-on: https://gerrit.openafs.org/12723
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Michael Meffie <mmeffie@sinenomine.net>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>

New upstream version 1.8.0~pre2

Linux: Include linux/uaccess.h rather than asm/uaccess.h if present

Starting with Linux 4.12 there is a module build error on s390
due to asm/uaccess.h using a macro defined in the common header.
The common header has been around since 2.6.18 and has always
included asm/uaccess.h, so switch to using the common header
whenever it is present.

Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Reviewed-on: https://gerrit.openafs.org/12714
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
(cherry picked from commit 962f4838dc461567d896304f617a0923745d13d5)

Change-Id: I5a7834b982458159804bc4d940e39ef283253299
Reviewed-on: https://gerrit.openafs.org/12718
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Mark Vitale <mvitale@sinenomine.net>
Reviewed-by: Michael Meffie <mmeffie@sinenomine.net>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>

Make OpenAFS 1.8.0pre2

Update the version strings for the second 1.8.0 prerelease.

Change-Id: I3e3f950d0565b877a4da4f8843a015ac392484d5
Reviewed-on: https://gerrit.openafs.org/12683
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Mark Vitale <mvitale@sinenomine.net>
Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>

Remove src/mcas

This lock-free library toolkit is intriguing and may be the subject
of future work, but such development will occur on the master branch,
and these files are just clutter on openafs-stable-1_8_x. Remove
them to give the tree a cleaner start.

Remove src/mcas and stop mentioning it in SOURCE-MAP; don't reference
it in the rpctests, either.

Change-Id: I21b1b6b64a709fe40aa53aaf3470d128c0dc2f86
Reviewed-on: https://gerrit.openafs.org/12682
Tested-by: Benjamin Kaduk <kaduk@mit.edu>
Reviewed-by: Mark Vitale <mvitale@sinenomine.net>
Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>

Remove src/rxgk

These files were committed slightly prematurely to the tree; rxgk
support is intended for the 2.0 release, and will not appear in the
1.8.x release series.

Remove src/rxgk and drop mentions of rxgk from configure/Makefile.in/etc.

Change-Id: Ib7d40eaac85b05d920781b61f73dbdf8fedfcc2b
Reviewed-on: https://gerrit.openafs.org/12681
Tested-by: Benjamin Kaduk <kaduk@mit.edu>
Reviewed-by: Mark Vitale <mvitale@sinenomine.net>
Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>

redhat: move bosserver and fssync-debug man pages

Move the bosserver and fssync-debug/dafssync-debug man pages to the
openafs-server package, which distributes those programs.

Change-Id: I9c84ad485834177fd43b28acd444d3d54c648cc8
Reviewed-on: https://gerrit.openafs.org/12601
Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Tested-by: Benjamin Kaduk <kaduk@mit.edu>

redhat: kauth client and server sub-packages

Move the kaserver and kauth client programs to conditionally built
packages called openafs-kauth-server and openafs-kauth-client.
Packagers can build these by specifying '--with kauth'. They are not
built by default to discourage use.

This commit subsumes the openafs-kpasswd package into the
openafs-kauth-client package.

Change-Id: I1322f05d7fe11d466c9ed71a5059c21b759d95ab
Reviewed-on: https://gerrit.openafs.org/12600
Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Tested-by: Benjamin Kaduk <kaduk@mit.edu>

redhat: do not package kauth by default

Do not package kaserver and related programs by default to discourage
use. Add the '--with kauth' rpmbuild option to allow packagers to
continue to include the kauth programs for compatibility.

Change-Id: I8bf9f6dc221afc22ed6c9a33cf101d705e6c4920
Reviewed-on: https://gerrit.openafs.org/12597
Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Tested-by: Benjamin Kaduk <kaduk@mit.edu>

Default to crypt mode for unix clients

Though the protection offered by rxkad, even with rxkad-k5 and rxkad-kdf, is
insufficient to protect traffic from a determined attacker, it remains the
case that the internet is not a safe place for user data to travel in the
clear, and has not been for a long time. The Windows client encrypts by
default, and all or nearly all the Unix client packaging scripts set crypt
mode by default. Catch up to reality and default to crypt mode in the
Unix cache manager.

Change-Id: If0061ddca3bedf0df1ade8cb61ccb710ec1181d4
Reviewed-on: https://gerrit.openafs.org/12668
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Tested-by: BuildBot <buildbot@rampaginggeek.com>

ubik: remove useless signal call

The current version does not have a corresponding LWP_WaitProcess call
for the beacon_globals.ubik_amSyncSite global. As a result, the
LWP_NoYieldSignal(&beacon_globals.ubik_amSyncSite) signal call can be
safely removed.

Change-Id: I72c4ccfe8e68551673dc728dd699ba8c561d76d1
Reviewed-on: https://gerrit.openafs.org/12673
Reviewed-by: Mark Vitale <mvitale@sinenomine.net>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Tested-by: BuildBot <buildbot@rampaginggeek.com>

doc: add a document to describe rx debug packets

This document gives a basic description of Rx debug packets, the
protocol to exchange debug packets, and the version history.

Change-Id: Ic040d336c1e463f7da145f1a292c20c5d5f215df
Reviewed-on: https://gerrit.openafs.org/12677
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>

doc: add kolya's rx-spec to doc/txt

Add rx protocol spec and rx debug spec written by Nickolai Zeldovich.

Rx protocol specification draft (2002)
Nickolai Zeldovich, kolya@MIT.EDU

Change-Id: I65a9a83a8889503f3a82c8fde7a87f84d2736c8d
Reviewed-on: https://gerrit.openafs.org/12676
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>

doc: relocate notes from arch to txt

The doc/txt directory has become the de facto home for text-based
technical notes. Relocate the contents of the doc/arch directory to
doc/txt. Relocate doc/examples to doc/txt/examples.

Update the doc/README file to be more current and remove old work in
progress comments.

Change-Id: Iaa53e77eb1f7019d22af8380fa147305ac79d055
Reviewed-on: https://gerrit.openafs.org/12675
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>

Add NEWS entry for recent ubik changes

Of the ubik-fix-write-after-recovery topic, this seems like the most
noteworthy portion, with the other bits wrapped up in the preface.

Change-Id: Icc1afb9f851ef2d7ade49c2382cc023997f1bf26
Reviewed-on: https://gerrit.openafs.org/12679
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>

ubik: update epoch as soon as sync-site is elected

The ubik_epochTime represents the time at which the coordinator first
received its coordinator mandate. However, this global is currently not
updated at the moment when a new sync-site is elected. Instead,
ubik_epochTime is only updated at the very end of the first write
transaction, when a new database label is written (in udisk_commit).
This causes at least 2 different issues:

For one, this means that we change ubik_epochTime while a remote
transaction is in progress. If VOTE_Beacon is called after
ubik_epochTime is updated, but before the remote transaction ends, the
remote sites will detect that the transaction id in ubik_currentTrans is
wrong (via urecovery_CheckTid(), since the epoch doesn't match), and
they will abort the transaction. This means the transaction will fail,
and it may cause a loss of quorum until another election is completed.

Another issue is that ubik_epochTime can be 0 at the beginning of a
write transaction, if this is the first election that this site has won.
Since ubik_epochTime is used to construct transaction ids, this means
that we can have different transactions that originate from different
sites at different times, but they have the same epoch in their tid.
For example, say a write transaction starts with epoch 0, but the
originating site is killed/interrupted before finishing. That write
transaction will linger on remote sites in ubik_currentTrans with an
epoch of 0 (since the originating site will never call
DISK_ReleaseLocks, or DISK_Abort, etc). Normally the sync site will kill
such a lingering transaction via urecovery_CheckTid, but since the epoch
is 0, and the election winner's epoch is also 0, the transaction looks
valid and may never be killed. If that transaction is holding a lock on
the database, this means that the database will forever remain locked,
effectively preventing any access to the db on that site.

To fix both of these issues, update ubik_epochTime with the current
time as soon as we win the election. This ensures that the epoch is not
updated in the middle of a transaction, and it ensures that all
transactions are created with a unique epoch: the epoch of the election
that we won.
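
The effect on transaction ids can be illustrated with a simplified
model. All names below are hypothetical, and the staleness test is a
deliberately reduced stand-in for what urecovery_CheckTid() does; it only
shows why an epoch of 0 on both sides hides a lingering transaction.

```c
#include <stdint.h>

struct sketch_tid {
    int32_t epoch;
    int32_t counter;
};

struct sketch_site {
    int32_t epoch_time;    /* plays the role of ubik_epochTime */
    int32_t next_counter;
    int am_sync_site;
};

/* With the fix: stamp the epoch at the moment the election is won. */
static void
sketch_win_election(struct sketch_site *s, int32_t now)
{
    s->am_sync_site = 1;
    s->epoch_time = now;
}

/* Transaction ids are built from the site's current epoch. */
static struct sketch_tid
sketch_begin_write(struct sketch_site *s)
{
    struct sketch_tid t;

    t.epoch = s->epoch_time;
    t.counter = ++s->next_counter;
    return t;
}

/* Simplified check: a tid is stale if its epoch differs from the
 * current sync site's epoch. */
static int
sketch_tid_is_stale(const struct sketch_tid *t, int32_t sync_epoch)
{
    return t->epoch != sync_epoch;
}
```

In this model a transaction begun with epoch 0 is indistinguishable from
one belonging to a new winner whose epoch is also 0, whereas epochs
stamped at election time differ between elections.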

Note that with this commit, we do not ever set ubik_epochTime to the
magic value of '2' during database init. The special '2' epoch only
needs to be set in the database itself, and it is never an actual epoch
that represents a real quorum that went through the election process.
The database will be labelled with a 'real' epoch after the first write,
like normal.

[kaduk@mit.edu: comment the locking strategy in ubeacon_Interact()]

Change-Id: I6cdcf5a73c1ea564622bfc8ab7024d9901af4bc8
Reviewed-on: https://gerrit.openafs.org/12609
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Marcio Brito Barbosa <mbarbosa@sinenomine.net>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>

LINUX: afs_create infinite fetchStatus loop

For a file in a directory with the CStatd bit cleared, we can get
an infinite fetchStatus loop.

In afs_create(), afs_getDCache() may return NULL due to an error.
If that return value is not checked, afs_create() keeps looping, which
may produce multiple fetchStatus() calls to the fileserver.

Credit: Yadav Yadavendra for identifying and analysing this issue.

Change-Id: Iecd77d49a5f3e8bb629396c57246736b39aa935f
Reviewed-on: https://gerrit.openafs.org/12651
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>

Update NEWS for volume stats default change

Change-Id: I1a184bf638609866f6f7f1d11c224dfee1113eef
Reviewed-on: https://gerrit.openafs.org/12678
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Michael Meffie <mmeffie@sinenomine.net>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>

volser: preserve volume stats by default

Commit dfceff1d3a66e76246537738720f411330808d64 added the
-preserve-vol-stats flag to the volume server. This enabled a change in
the volume server to preserve volume usage statistics during reclone and
restore operations. Otherwise, volume usage counters of read-only
volumes are cleared when volumes are released, making it difficult to
track usage with the volume stats.

Make this feature the default behavior of the volume server and provide
the option -clear-vol-stats to use the old behavior if so desired. This
change makes the -preserve-vol-stats the default, and keeps it as a
hidden flag for sites which may already have that flag set in the
BosConfig.

Since this changes a default behavior of the volume server, this change
is only appropriate on a major or minor release boundary, not in the
middle of a stable series.

Change-Id: I3706ede64b7b18a80b39ebd55f2e1824bb7dbc57
Reviewed-on: https://gerrit.openafs.org/12674
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>

ubik: avoid early DISK_Begin calls we know will fail

Currently, we can start a write transaction on a site immediately after
it is elected as the sync site. However, after commit d47beca1,
SDISK_Begin on remote sites will fail right after an election occurs
(since lastYesState is not set, and so urecovery_AllBetter will fail).
And after commit fac0b742, this error is always noticed and propagated
back to the application.

As a result, when we try to write immediately after a sync site is
elected, the transaction will fail with UNOQUORUM, the remote sites will
be marked as down, and we may lose quorum and require another election
to be performed. This can easily happen repeatedly for a site that
frequently tries to make changes to a ubik database.

To avoid marking other sites down and going through another election
process, do not allow write transactions until we know that lastYesState
is set on the remote sites. We do this by waiting until the next wave of
beacons are sent, which tell the remote sites that we are the sync site.
In other words, only allow write transactions after the sync site knows
that the remote sites also know that the sync site has been elected.

With this commit, a write transaction immediately after an election
will still fail with UNOQUORUM, but we avoid triggering an error on the
remote sites, and avoid losing quorum in this situation.
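
The gating described above can be sketched as a small state machine. This
is only an illustration; the struct and function names are hypothetical
and do not correspond to the actual ubik code.

```c
#include <stdbool.h>

/* Hypothetical per-site state; not the real ubik data structures. */
struct site_state {
    bool is_sync_site;        /* we won the most recent election */
    bool beacon_sent_as_sync; /* a wave of beacons went out after the win */
};

/* Winning an election resets the beacon flag: remote sites do not yet
 * know that we are the sync site. */
static void
on_election_won(struct site_state *s)
{
    s->is_sync_site = true;
    s->beacon_sent_as_sync = false;
}

/* Called after a wave of beacons has been sent to the remote sites. */
static void
on_beacons_sent(struct site_state *s)
{
    if (s->is_sync_site)
        s->beacon_sent_as_sync = true;
}

/* Writes are refused locally until the remote sites have been told. */
static bool
may_begin_write(const struct site_state *s)
{
    return s->is_sync_site && s->beacon_sent_as_sync;
}
```

In this sketch a write attempted between on_election_won() and
on_beacons_sent() is refused locally, so no remote site is contacted.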

Change-Id: I9e1a76b4022e6d734af1165d94c12e90af04974d
Reviewed-on: https://gerrit.openafs.org/12592
Reviewed-by: Andrew Deason <adeason@dson.org>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Tested-by: Benjamin Kaduk <kaduk@mit.edu>

ubik: allow remote dbase relabel if up to date

When a site is elected the sync-site, its database is not immediately
relabeled. The database in question will be relabeled at the end of the
first write transaction (in udisk_commit). To do so, the dbase->version
is updated on the sync-site first (1) and then the versions of the
remote sites are updated through SDISK_SetVersion() (2).

In order to make sure that the remote site holds the same database as
the sync-site, the SDISK_SetVersion() function checks if the current
version held by the remote site (ubik_dbVersion) is equal to the
original version stored by the sync-site (oldversionp). If
ubik_dbVersion is not equal to oldversionp, SDISK_SetVersion() will
fail with USYNC.

However, ubik_dbVersion can be updated by the vote thread at any time.
That is, if the sync site calls VOTE_Beacon() on the remote site between
events (1) and (2), the remote site will set ubik_dbVersion to the new
version, while ubik_dbase->version is still set to the old version. As
a result, ubik_dbVersion will not be equal to oldversionp and
SDISK_SetVersion() will fail with USYNC. This failure may cause a loss
of quorum until another election is completed.

To fix this problem, let SDISK_SetVersion() relabel the database when
ubik_dbase->version is equal to oldversionp. In order to try to only
affect the scenario described above, also check if ubik_dbVersion is
equal to newversionp.
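
A minimal sketch of the resulting check follows. The struct and helper
names are hypothetical, and the sketch assumes the original comparison
(ubik_dbVersion equal to oldversionp) is kept alongside the new one; the
real SDISK_SetVersion() may be structured differently.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a ubik database version label. */
struct version {
    int32_t epoch;
    int32_t counter;
};

static bool
version_eq(const struct version *a, const struct version *b)
{
    return a->epoch == b->epoch && a->counter == b->counter;
}

/*
 * Decide whether a remote site may relabel its database.
 *   vote_version: version recorded by the vote thread (ubik_dbVersion)
 *   db_version:   label on the local database (ubik_dbase->version)
 */
static bool
may_relabel(const struct version *vote_version,
            const struct version *db_version,
            const struct version *oldversionp,
            const struct version *newversionp)
{
    /* Assumed original rule: the voted version still matches the old label. */
    if (version_eq(vote_version, oldversionp))
        return true;

    /* Added rule: a beacon already advanced the voted version to the new
     * label, but the database itself is still at the old label. */
    return version_eq(db_version, oldversionp)
        && version_eq(vote_version, newversionp);
}
```

Any other combination is still refused, which corresponds to the USYNC
failure described above.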

Change-Id: I97e6f8cacd1c9bca0b4c72374c058c5fe5b638b3
Reviewed-on: https://gerrit.openafs.org/12613
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>

afs: fix repeated BulkStatus calls for directories.

There is a filetype comparison check in afs_DoBulkStat just after
BulkFetch RPC. This check will fail for directories even though
bulkStatus was done for directories.

This code is apparently necessary for Darwin, but it causes this problem
elsewhere. Restrict it to Darwin builds by guarding it with the
AFS_DARWIN_ENV preprocessor variable.

Credit: Yadav Yadavendra for identifying and analysing this issue.

Change-Id: I9645f0e7a3327cb5f20cdf3ba2bf1cc5b1509bb5
Reviewed-on: https://gerrit.openafs.org/12610
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Tested-by: BuildBot <buildbot@rampaginggeek.com>

relocate old afs docs to doc/txt

Move the afs/DOC files to the top-level doc/txt directory, since this has
become the home for developer oriented documentation.

Change-Id: I128d338c69534b4ee6043105a7cfd390b280afe3
Reviewed-on: https://gerrit.openafs.org/12662
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>

Incorporate old release notes into NEWS

Clean up the doc/txt directory by incorporating the old release
notes into the NEWS file.

Change-Id: I63911fc5cb0b476e201148c6d3fa3441f4746ab7
Reviewed-on: https://gerrit.openafs.org/12661
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>

Update NEWS for 1.8.0pre2

Change-Id: I5f83e81f25177bde1ea691e756359563e80ee3f2
Reviewed-on: https://gerrit.openafs.org/12660
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>

Import NEWS from openafs-stable-1_6_x

Import change descriptions for 1.6.20.1, 1.6.20.2, 1.6.21.

Change-Id: Ib4f06c7046eb6e1bb0a1ccfb9f6c45191154fe0e
Reviewed-on: https://gerrit.openafs.org/12659
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>

Linux: fix whitespace in osi_sysctl.c

Remove dozens of trailing spaces and make consistent use of tabs
for indentation throughout the file.

Change-Id: Ibbd17d2b9828590ffd84b76aac70646e9fe9cb2c
Reviewed-on: https://gerrit.openafs.org/12665
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Tested-by: BuildBot <buildbot@rampaginggeek.com>

LINUX: Workaround d_splice_alias/d_lookup race

Before Linux kernel commit 4919c5e45a91b5db5a41695fe0357fbdff0d5767,
d_splice_alias in some cases can d_rehash the given dentry without
attaching it to the given inode, right before the dentry is unhashed
again. This means that for a few moments, that negative dentry is
visible to __d_lookup, and thus is visible to path lookup and can be
given to afs_linux_dentry_revalidate.

Currently, afs_linux_dentry_revalidate will say that the dentry is
valid, because d_time and other fields are set; it's just not attached
to an inode. This causes an ENOENT error on lookup, even though the
file is there (and no OpenAFS code said otherwise).

Normally this race is rare, but it can be frequently exercised if
we access the same directory via different names at the same time.
This can happen with multiple mountpoints to the same volume, or by
accessing an @sys directory via its abbreviated and expanded forms.

To get around this, make afs_linux_dentry_revalidate check negative
'dentry's to see if they are unhashed. We also lock the parent inode,
in order to guarantee that a problematic d_splice_alias call isn't
running at the same time (and thus, we know the dentry will not be
unhashed immediately afterwards). This slows down
afs_linux_dentry_revalidate for valid negative 'dentry's a little, but
it allows us to use negative dentries at all.

Linux kernel commit 4919c5e45a91b5db5a41695fe0357fbdff0d5767 fixes
this issue, which was included in 2.6.34, so don't do this workaround
for 2.6.34 and on.

Change-Id: I8e58ebed4441151832054b1ef3f1aa5af1c4a9b5
Reviewed-on: https://gerrit.openafs.org/12638
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>

Linux 4.13: use designated initializers where required

struct path is declared with the "designated_init" attribute,
and module builds now use -Werror=designated-init. Cope.

And as pointed out by Michael Meffie, struct ctl_table has
the same requirement now, so use a designated initializer
for the final element of the sysctl table too.
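
For illustration, the difference between positional and designated
initializers looks roughly like this. The struct definitions and the
"example_knob" entry are hypothetical stand-ins, not the kernel's
struct path or struct ctl_table.

```c
#include <stddef.h>

/* Hypothetical stand-ins; the real kernel structs have more members. */
struct demo_path {
    void *mnt;
    void *dentry;
};

struct demo_ctl_table {
    const char *procname;
    int maxlen;
};

/*
 * The positional form
 *     struct demo_path p = { mnt, dentry };
 * is what -Werror=designated-init rejects for structs carrying the
 * designated_init attribute. Naming each member avoids the error.
 */
static struct demo_path
make_path(void *mnt, void *dentry)
{
    struct demo_path p = { .mnt = mnt, .dentry = dentry };
    return p;
}

static struct demo_ctl_table demo_table[] = {
    { .procname = "example_knob", .maxlen = sizeof(int) },
    { .procname = NULL }  /* terminator, also written with a designator */
};
```

Members that are not named in a designated initializer are zeroed, so the
terminating entry stays all-zero as before.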

Change-Id: I0ec45aac961dcefa0856a15ee218085626a357c7
Reviewed-on: https://gerrit.openafs.org/12663
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Tested-by: Benjamin Kaduk <kaduk@mit.edu>

afs: fix afs_xserver deadlock in afsdb refresh

When setting up a new volume, the cache manager calls afs_GetServer() to
set up the server object for each fileserver associated with the volume.
The afs_GetServer() function locks afs_xserver and then, among other
things, calls afs_GetCell() to look up the cell info by cell number.

When the cache manager is running in afsdb mode, afs_GetCell() will
attempt to refresh the cell info if the time-to-live has been exceeded
since the last call to afs_GetCell(). During this refresh the AFSDB
calls afs_GetServer() to update the vlserver information. The afsdb
handler thread and the thread processing the volume setup become
deadlocked since the afs_xserver lock is already held at this point.

This bug will manifest when the DNS SRV record TTL is shorter than the
time the fileservers take to respond to the GetCapabilities RPC within
afs_GetServer() and there are multiple read-only servers for a volume.

Avoid the deadlock by using the afs_GetCellStale() variant within
afs_GetServer(). This variant returns the memory resident cell info
without the afsdb upcall and the subsequent afs_GetServer() call.

Change-Id: Iad57870f84c5e542a5ee20f00ea03b3fc87683a1
Reviewed-on: https://gerrit.openafs.org/12652
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>

afs: restore force_if_down check when getting connections

Commit cb9e029255420608308127b0609179a46d9983ad removed the
force_if_down check in afs_ConnBySA, which effectively turned on the
force_if_down flag for every call to afs_ConnBySA. This caused
afs_ConnBySA to always return connections, even for server addresses
marked down and force_if_down set to 0.

One serious consequence of this bug is the cache manager will retry the
preferred vlserver indefinitely when it is unreachable. This is because
the loop in afs_ConnMHosts always tries hosts in preferred order and
expects afs_ConnBySA to return a NULL if the server address has no
connections because it is marked down.

Restore the check for server addresses marked down to honor the
force_if_down flag again so we do not get connections for down servers
unless requested.
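
The restored behavior can be sketched as follows. The types and function
names here are simplified, hypothetical stand-ins for afs_ConnBySA and
the loop in afs_ConnMHosts, not the actual cache manager code.

```c
#include <stdbool.h>
#include <stddef.h>

struct demo_conn { int id; };

/* Hypothetical stand-in for a server address entry. */
struct demo_srvaddr {
    bool marked_down;
    struct demo_conn *conn;
};

/* Return a connection, or NULL if the address is marked down and the
 * caller did not ask to force the attempt. */
static struct demo_conn *
conn_by_sa(struct demo_srvaddr *sa, bool force_if_down)
{
    if (sa->marked_down && !force_if_down)
        return NULL;
    return sa->conn;
}

/* Try addresses in preferred order; a NULL result moves on to the next. */
static struct demo_conn *
first_usable(struct demo_srvaddr *addrs, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        struct demo_conn *c = conn_by_sa(&addrs[i], false);
        if (c != NULL)
            return c;
    }
    return NULL;
}
```

Without the marked_down test, first_usable() would always return the
first address's connection, which is the retry behavior described above.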

Change-Id: Ia117354929a62b0cedc218040649e9e0b8d8ed23
Reviewed-on: https://gerrit.openafs.org/12653
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>

redhat: fix rpmbuild command line option defaults

Fix the handling of default values for the various rpmbuild options
which can be given. These have been broken as code was shuffled around
over the years.

Remove obsolete comments about detecting what to build based on the
architecture.

Provide the '--without authlibs' option to disable the openafs-authlibs
package.

Change-Id: I6c8db1f3163ee241f9a4d1282345a0ddeabd284c
Reviewed-on: https://gerrit.openafs.org/12596
Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de>
Reviewed-by: Michael Meffie <mmeffie@sinenomine.net>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Tested-by: Benjamin Kaduk <kaduk@mit.edu>

mkvers: fix potential buffer overflow

The space allocated for outputFileBuf is only 2 bytes larger than
sizeof(VERS_FILE). But we add potentially 4 extra bytes like
".txt" or ".xml". Just allocate enough space for all file suffixes.

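One way to size such a buffer is to derive it from the longest suffix.
The sketch below is an assumption about the general approach, not the
mkvers code; the suffix list and function names are hypothetical.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical list of suffixes that may be appended to the base name. */
static const char *const suffixes[] = { ".c", ".h", ".txt", ".xml" };

static size_t
longest_suffix_len(void)
{
    size_t max = 0;
    for (size_t i = 0; i < sizeof(suffixes) / sizeof(suffixes[0]); i++) {
        size_t len = strlen(suffixes[i]);
        if (len > max)
            max = len;
    }
    return max;
}

/* Build "base" + "suffix" in a buffer sized for the longest suffix.
 * Returns a malloc'd string (caller frees) or NULL on failure. */
static char *
make_output_name(const char *base, const char *suffix)
{
    size_t cap = strlen(base) + longest_suffix_len() + 1; /* +1 for NUL */
    char *buf;

    if (strlen(suffix) > longest_suffix_len())
        return NULL;    /* would not fit; refuse rather than overflow */
    buf = malloc(cap);
    if (buf == NULL)
        return NULL;
    snprintf(buf, cap, "%s%s", base, suffix);
    return buf;
}
```
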
Change-Id: Ic0f97590be208deaf9c4a5c25e21056ea9d2cd6f
Reviewed-on: https://gerrit.openafs.org/12657
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>

New upstream version 1.6.21

afs_linux_lookup: Avoid d_add on afs_lookup error

Currently, afs_linux_lookup looks roughly like this pseudocode:

{
    code = afs_lookup(&vcp);
    if (!code) {
        ip = AFSTOV(vcp);
        error = process_ip(ip);
        if (error) {
            goto done;
        }
    }
    process_dp(dp);
    newdp = d_splice_alias(ip, dp);
done:
    cleanup();
}

Note that if there is an error while processing the looked-up inode
(ip), we jump over d_splice_alias. But if we encounter an error from
afs_lookup itself, we do not jump over d_splice_alias. This means that
if afs_lookup encounters any error, we initialize the given dentry
(dp) as a negative entry, effectively telling the Linux kernel that
the requested name does not exist.

This is correct for ENOENT errors, of course, but is incorrect for any
other error. For non-ENOENT errors we later return an error from the
function, but this does not invalidate the generated dentry. The
result is that when afs_lookup encounters an error, that error will be
propagated to userspace, but subsequent lookups for the same name will
yield an ENOENT error (until the dentry is invalidated). This can
easily cause a file to seem to mysteriously disappear, if a transient
error like network problems caused the afs_lookup call to fail.

To fix this, treat ENOENT as a non-error, like the comments already
suggest. In our case, ENOENT is not really an error; it just means we
populate the given dentry differently. So if we get ENOENT from
afs_lookup, set our vcache to NULL and clear the error, and continue.
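
The decision can be sketched in isolation like this. The enum and
function are hypothetical and only model which dentry action follows
from the afs_lookup result; they are not the afs_linux_lookup code.

```c
#include <errno.h>
#include <stddef.h>

enum lookup_action {
    ADD_POSITIVE,  /* attach the inode to the dentry */
    ADD_NEGATIVE,  /* record that the name does not exist */
    FAIL_NO_ADD    /* return the error and leave the dentry alone */
};

/* Model of the fixed flow: ENOENT is folded into "no vcache, no error";
 * every other nonzero code skips populating the dentry. */
static enum lookup_action
classify_lookup(int code, void **vcp)
{
    if (code == ENOENT) {
        *vcp = NULL;
        code = 0;
    }
    if (code != 0)
        return FAIL_NO_ADD;
    return (*vcp != NULL) ? ADD_POSITIVE : ADD_NEGATIVE;
}
```

With this shape, a transient error such as EIO never produces a negative
dentry, so a later lookup of the same name is retried.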

This also has the side effect of not treating ENOENT errors from
afs_CreateAttr identically to ENOENT errors from afs_lookup. That
shouldn't happen, but there have been abuses of the ENOENT error code
in the past, so it is probably better to be cautious.

Many thanks to Gaja Sophie Peters for assistance in tracking down and
testing fixes for this issue, including providing access to test systems
experiencing the buggy behavior.

FIXES 133654

Change-Id: Ia9aab289d5c041557ab6b00f1d41de2edfc97a89
Reviewed-on: https://gerrit.openafs.org/12637
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Joe Gorse <jhgorse@gmail.com>
Reviewed-by: Michael Meffie <mmeffie@sinenomine.net>
Tested-by: Michael Meffie <mmeffie@sinenomine.net>

LINUX: Rearrange afs_linux_lookup cleanup

Currently, the cleanup and error handling in afs_linux_lookup is
structured similar to this pseudocode:

    if (!code) {
        if (!IS_ERR(newdp)) {
            return no_error;
        } else {
            return newdp_error;
        }
    } else {
        return code_error;
    }

The multiple different nested error cases make this a little complex.
To make this easier to follow for subsequent changes, alter this
structure to be more like this:

    if (IS_ERR(newdp)) {
        return newdp_error;
    }

    if (code) {
        return code_error;
    }

    return no_error;

There should be no functional change in this commit; it is just code
reorganization.

Technically the ordering of these checks is changed, but there is no
combination of conditions that actually results in different code
being hit. That is, if 'code' is nonzero and IS_ERR(newdp) is true,
then we would go through a different path. But that cannot happen,
since if 'code' is nonzero, we have no inode and so IS_ERR(newdp)
cannot be true (d_splice_alias cannot return an error for a NULL
inode). So there is no functional change.
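
The equivalence argument can be checked mechanically with a small model.
The enum and function names below are hypothetical; the two functions
mirror the two pseudocode structures above.

```c
#include <stdbool.h>

enum lookup_result { RES_OK, RES_NEWDP_ERROR, RES_CODE_ERROR };

/* Structure before the change: nested checks, 'code' first. */
static enum lookup_result
old_order(int code, bool newdp_is_err)
{
    if (!code) {
        if (!newdp_is_err)
            return RES_OK;
        else
            return RES_NEWDP_ERROR;
    } else {
        return RES_CODE_ERROR;
    }
}

/* Structure after the change: flat checks, IS_ERR(newdp) first. */
static enum lookup_result
new_order(int code, bool newdp_is_err)
{
    if (newdp_is_err)
        return RES_NEWDP_ERROR;
    if (code)
        return RES_CODE_ERROR;
    return RES_OK;
}

/* Count disagreements over all reachable inputs. */
static int
count_reachable_mismatches(void)
{
    int mismatches = 0;
    for (int code = 0; code <= 1; code++) {
        for (int err = 0; err <= 1; err++) {
            if (code && err)
                continue;  /* cannot happen, per the reasoning above */
            if (old_order(code, err != 0) != new_order(code, err != 0))
                mismatches++;
        }
    }
    return mismatches;
}
```

The two orderings differ only for the excluded input (nonzero 'code'
together with an error newdp), which is the unreachable case.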

Change-Id: I94a3aef5239358c3d13fe5314044dcc85914d0a4
Reviewed-on: https://gerrit.openafs.org/12636
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Joe Gorse <jhgorse@gmail.com>
Reviewed-by: Michael Meffie <mmeffie@sinenomine.net>
Tested-by: Michael Meffie <mmeffie@sinenomine.net>

Make OpenAFS 1.6.21

Update version strings for the 1.6.21 release.

Change-Id: I27569473ad9b988829bb517419d3d04f4cfa8c0f
Reviewed-on: https://gerrit.openafs.org/12649
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de>

Update NEWS for 1.6.21

Finalize the 1.6.21 release notes

Change-Id: I09974201c8155dc697abbf29079e5ceb2a74e629
Reviewed-on: https://gerrit.openafs.org/12635
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de>

doc: Add introduction and credits to ubik.txt

Credit where it's due. And the remainder of the introduction may
provide some useful context too.

Change-Id: I99c7e599363126c581ae1ac00da67c33acc3687f
Reviewed-on: https://gerrit.openafs.org/12644
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>

Put jhutz's ubik analysis in doc/txt

A file in the source tree is much easier to locate than an old
mailing list post; it's quite handy to have this at hand as a
reference.

Change-Id: I5267a2f86b36e92b05249364085bdd33aeb28d1b
Reviewed-on: https://gerrit.openafs.org/12642
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Michael Meffie <mmeffie@sinenomine.net>
Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>

afs: Improve "Corrupt directory" warning

This warning is a bit confusing to see, since it doesn't say anything
about AFS (making it unclear where it's coming from), and it lacks a
trailing newline (making it ugly). Fix both of these.

Change-Id: I92a3d07fd193bf99b545aef9b21f52d23c356a2d
Reviewed-on: https://gerrit.openafs.org/12641
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>

Make OpenAFS 1.6.21pre1

Update version strings for the first 1.6.21 prerelease.

Change-Id: I700f0b110373e47f2f471f30ba8eefe9a3b6cf4f
Reviewed-on: https://gerrit.openafs.org/12603
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Mark Vitale <mvitale@sinenomine.net>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Reviewed-by: Michael Meffie <mmeffie@sinenomine.net>
Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de>

Update NEWS for 1.6.21pre1

Release notes for the first OpenAFS 1.6.21 prerelease

Change-Id: I9d01bd7856574e2c3da872854a5bffeac2119f3e
Reviewed-on: https://gerrit.openafs.org/12634
Reviewed-by: Mark Vitale <mvitale@sinenomine.net>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de>

vol: modify volume updateDate upon salvage change

If the salvager changed the volume, set the VolumeDiskData.updateDate
field so that

1. the change is visible via "vos examine"

2. backup services will back up the corrected volume

Teradactyl pointed out the problem which forces cell administrators
to manually trigger a backup for each volume that has been salvaged.

Reviewed-on: https://gerrit.openafs.org/12629
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
(cherry picked from commit cdb92f94598e5b25fbcdfc6fb1650218ec05d63f)

Change-Id: I0ecf0bf52a78cd6e1de4e79fc4a33cb509a816f5
Reviewed-on: https://gerrit.openafs.org/12633
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Michael Meffie <mmeffie@sinenomine.net>
Tested-by: Michael Meffie <mmeffie@sinenomine.net>
Reviewed-by: Mark Vitale <mvitale@sinenomine.net>
Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de>

libafs: remove linux conditionals for md5 inode number calculation

Remove the conditionals which hide the md5 digest calculation for inode
numbers on non-linux platforms. This feature was originally added to
support sites running on linux, but is generally useful and the
implementation is not specific to linux.

Reviewed-on: http://gerrit.openafs.org/11854
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Perry Ruiter <pruiter@sinenomine.net>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
(cherry picked from commit ac05e8ceebd05c2d8496759e70cf7b1b92541134)

Change-Id: I8fd613c436120a6436f48920ce4f33570dfb1fb8
Reviewed-on: https://gerrit.openafs.org/12632
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Reviewed-by: Michael Meffie <mmeffie@sinenomine.net>
Reviewed-by: Mark Vitale <mvitale@sinenomine.net>
Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de>

bozo: do not fail silently on unknown bosserver options

Instead of failing silently when the bosserver is started with an
unknown option, print an error message and exit with a non-zero value.
Continue to exit with 0 when the -help option is given to request the
usage message.

This change should help make bosserver startup failures more obvious
when an unsupported option is specified. Example systemd status message:

   systemd[1]: Starting OpenAFS Server Service...
   bosserver[32308]: Unrecognized option: -bogus
   bosserver[32308]: Usage: bosserver [-noauth] ....
   systemd[1]: openafs-server.service: main process exited,
               code=exited, status=1/FAILURE
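
The exit-status rule can be sketched as below. This is not the bosserver
argument parser; the function name, the usage text, and the reduced
option set (-noauth and -help only) are assumptions for illustration.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical result meaning "options are fine, continue startup". */
#define OPT_CONTINUE (-1)

/* Returns OPT_CONTINUE, or the exit status the process should use:
 * 0 when usage was requested, 1 for an unrecognized option. */
static int
check_options(int argc, char **argv, FILE *err)
{
    for (int i = 1; i < argc; i++) {
        if (strcmp(argv[i], "-help") == 0) {
            fprintf(err, "Usage: demo [-noauth] [-help]\n");
            return 0;
        }
        if (strcmp(argv[i], "-noauth") == 0)
            continue;
        fprintf(err, "Unrecognized option: %s\n", argv[i]);
        fprintf(err, "Usage: demo [-noauth] [-help]\n");
        return 1;
    }
    return OPT_CONTINUE;
}
```

A caller would exit with the returned status whenever it is not
OPT_CONTINUE, so a service manager sees the failure.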

Reviewed-on: https://gerrit.openafs.org/12630
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
(cherry picked from commit f5491119ff7d422b1c0c311a50e30bec1c15296c)

Change-Id: I5c3ffbb21915fd0a2773873e360cee85504796f8
Reviewed-on: https://gerrit.openafs.org/12631
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Michael Meffie <mmeffie@sinenomine.net>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Reviewed-by: Mark Vitale <mvitale@sinenomine.net>
Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de>

vol: modify volume updateDate upon salvage change

If the salvager changed the volume, set the VolumeDiskData.updateDate
field so that

1. the change is visible via "vos examine"

2. backup services will back up the corrected volume

Teradactyl pointed out the problem which forces cell administrators
to manually trigger a backup for each volume that has been salvaged.

Change-Id: I9a35b92e8abbe3b54b08e64ac13de44442736c72
Reviewed-on: https://gerrit.openafs.org/12629
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>

bozo: do not fail silently on unknown bosserver options

Instead of failing silently when the bosserver is started with an
unknown option, print an error message and exit with a non-zero value.
Continue to exit with 0 when the -help option is given to request the
usage message.

This change should help make bosserver startup failures more obvious
when an unsupported option is specified. Example systemd status message:

   systemd[1]: Starting OpenAFS Server Service...
   bosserver[32308]: Unrecognized option: -bogus
   bosserver[32308]: Usage: bosserver [-noauth] ....
   systemd[1]: openafs-server.service: main process exited,
               code=exited, status=1/FAILURE

Change-Id: I8717fb4a788fbcc3d1e2d271dd03511c5b504f10
Reviewed-on: https://gerrit.openafs.org/12630
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>

LINUX: Switch to new bdi api for 4.12.

super_setup_bdi() dynamically allocates backing_dev_info structures
for filesystems and cleans them up on superblock destruction.

Appears with Linux commit fca39346a55bb7196888ffc77d9e3557340d1d0b
Author: Jan Kara <jack@suse.cz>
Date: Wed Apr 12 12:24:28 2017 +0200

Reviewed-on: https://gerrit.openafs.org/12614
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
(cherry picked from commit 63e530e7df0b8013bcc4421b0bba558d4f1d2d57)

Change-Id: I48a49ee8852bf842c24e7df0609fe2184bf45d90
Reviewed-on: https://gerrit.openafs.org/12626
Tested-by: Stephan Wiesand <stephan.wiesand@desy.de>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de>

rx: wake up send after 'twind' has been updated

Beginning in AFS 3.4 and 3.5 the ack trailer includes the size of the
peer's receive window.  This value is used to update the sender's
transmit window (twind).  When the twind is increased the application
thread is signaled to indicate that more packets can be sent.

This change wakes the application thread after twind is updated by
the peer's receive window instead of beforehand.  Failure to do so
can result in 100ms transmit delays when the receive window transitions
from closed to open.
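
The ordering issue can be illustrated with a simplified model. The
struct, its fields, and the wake-up condition below are assumptions made
for the sketch and are much simpler than the real rx ack processing.

```c
#include <stdbool.h>

/* Hypothetical, heavily simplified view of a call's send state. */
struct demo_call {
    int twind;      /* transmit window, in packets */
    int in_flight;  /* packets sent but not yet acknowledged */
};

/* Process an ack; returns true if the sending thread should be woken. */
static bool
process_ack(struct demo_call *call, int newly_acked, int peer_rwind)
{
    if (newly_acked > call->in_flight)
        newly_acked = call->in_flight;
    call->in_flight -= newly_acked;

    /* Update the transmit window from the ack trailer first ... */
    call->twind = peer_rwind;

    /* ... and only then decide whether the sender has room to proceed. */
    return call->in_flight < call->twind;
}
```

If the decision were made before the assignment to twind, an ack that
reopens the window would be evaluated against the old, closed window and
the sender would stay asleep until something else woke it.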

Reviewed-on: https://gerrit.openafs.org/12625
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
(cherry picked from commit aaa47dc1077f0dd5b0040006c831f64cc8a303b5)

Change-Id: Icfbe10f93a34adfb14f5c34198f78b67aa043c53
Reviewed-on: https://gerrit.openafs.org/12627
Tested-by: Stephan Wiesand <stephan.wiesand@desy.de>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Reviewed-by: Mark Vitale <mvitale@sinenomine.net>
Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de>

LINUX: CURRENT_TIME macro goes away.

Check if the macro exists, define it if it does not.

Reviewed-on: https://gerrit.openafs.org/12611
Reviewed-by: Mark Vitale <mvitale@sinenomine.net>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
(cherry picked from commit b47dc5482da614742b01dcc62d5e11d766a9432f)

Change-Id: I1ed3706e830b98436a5a22d99fa338b01fd5b997
Reviewed-on: https://gerrit.openafs.org/12624
Tested-by: Stephan Wiesand <stephan.wiesand@desy.de>
Reviewed-by: Mark Vitale <mvitale@sinenomine.net>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de>

rx: wake up send after 'twind' has been updated

Beginning in AFS 3.4 and 3.5 the ack trailer includes the size of the
peer's receive window.  This value is used to update the sender's
transmit window (twind).  When the twind is increased the application
thread is signaled to indicate that more packets can be sent.

This change wakes the application thread after twind is updated by
the peer's receive window instead of beforehand.  Failure to do so
can result in 100ms transmit delays when the receive window transitions
from closed to open.

Change-Id: Id129ea93e94612a4b8cce9f8cbddde9c779ff26b
Reviewed-on: https://gerrit.openafs.org/12625
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Tested-by: BuildBot <buildbot@rampaginggeek.com>

LINUX: Switch to new bdi api for 4.12.

super_setup_bdi() dynamically allocates backing_dev_info structures
for filesystems and cleans them up on superblock destruction.

Appears with Linux commit fca39346a55bb7196888ffc77d9e3557340d1d0b
Author: Jan Kara <jack@suse.cz>
Date: Wed Apr 12 12:24:28 2017 +0200

Change-Id: I67eed0fcb8c96733390579847db57fb8a4f0df3e
Reviewed-on: https://gerrit.openafs.org/12614
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>

afs: add afsd -inumcalc option

This commit adds the afsd -inumcalc command line switch to specify the
inode number calculation method in a platform neutral way.

Inode numbers reported for files within the AFS filesystem are generated
by the cache manager using a calculation which derives a number from a
FID. Long ago, a new type of calculation was added which generates inode
numbers using a MD5 message digest of the FID.  The MD5 inode number
calculation variant is computationally more expensive but greatly
reduces the chances for inode number collisions.

The MD5 calculation can be enabled on the Linux cache manager using the
Linux sysctl interface.  Other than the sysctl method of selecting the
inode calculation type, the MD5 inode number calculation method is not
specific to Linux.

This change introduces a command-line option which accepts a value to
indicate the calculation method, instead of a simple flag to enable MD5
inode numbers.  This should allow for new inode calculation methods
in the future without the need for additional afsd command-line flags.

Two values are currently accepted for -inumcalc. The value of 'compat'
specifies the legacy inode number calculation. The value 'md5' indicates
that the new MD5 calculation is to be used.
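
Parsing a value rather than a flag might look like the following sketch.
The enum and function names are hypothetical and are not taken from
afsd; only the accepted strings 'compat' and 'md5' come from the text
above.

```c
#include <string.h>

/* Hypothetical identifiers for the two calculation methods. */
enum demo_inumcalc { DEMO_INUMCALC_COMPAT = 0, DEMO_INUMCALC_MD5 = 1 };

/* Map the option value to a method. Returns 0 on success, -1 if the
 * value is not recognized (in which case *out is left untouched). */
static int
parse_inumcalc(const char *arg, enum demo_inumcalc *out)
{
    if (strcmp(arg, "compat") == 0) {
        *out = DEMO_INUMCALC_COMPAT;
        return 0;
    }
    if (strcmp(arg, "md5") == 0) {
        *out = DEMO_INUMCALC_MD5;
        return 0;
    }
    return -1;
}
```

A further method would then need only another string comparison and enum
value, not another command-line flag.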

Reviewed-on: https://gerrit.openafs.org/11855
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
(cherry picked from commit 0028ea92ad3e7aac6a4c51f63703a4d9d7b9dcd6)

Change-Id: I9021eea9f64c754157061d039f63b6f744ec2ec5
Reviewed-on: https://gerrit.openafs.org/12608
Reviewed-by: Mark Vitale <mvitale@sinenomine.net>
Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de>
Tested-by: Stephan Wiesand <stephan.wiesand@desy.de>

client: flag in cachemanager if rmtsys is enabled

When processing "fs sysname" on a client, rmtsys-related checks are
executed by default. These prevent a user with gid 2748 or 2750
(0xabc or 0xabe) from executing this command. Add a new flag inside
the cache manager for the rmtsys functionality. This flag is set
through a new ioctl by afsd on startup.

Reviewed-on: http://gerrit.openafs.org/10245
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@your-file-system.com>
(cherry picked from commit db1de98ecf6fd22b9c36b3ba284984f03cb0ae35)

Change-Id: Ia2a367e4675782a681b4f6efd6365da482adfab8
Reviewed-on: https://gerrit.openafs.org/12607
Reviewed-by: Mark Vitale <mvitale@sinenomine.net>
Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de>
Tested-by: Stephan Wiesand <stephan.wiesand@desy.de>

afs: release the packets used by rx on shutdown

When the OpenAFS client is unmounted on DARWIN, the blocks of packets
allocated by RX are released. Historically, the memory used by those
packets was never properly released.

Before 230dcebcd61064cc9aab6d20d34ff866a5c575ea, only the last block of
packets used to be released:

...
struct rx_packet *rx_mallocedP = 0;
...
void
rxi_MorePackets(int apackets)
{
    ...
    getme = apackets * sizeof(struct rx_packet);
    p = rx_mallocedP = (struct rx_packet *)osi_Alloc(getme);
    ...
}
...
void
rxi_FreeAllPackets(void)
{
    ...
    osi_Free(rx_mallocedP, ...);
    ...
}
...

As we can see, ‘rx_mallocedP’ is a global pointer that stores the
first address of the last allocated block of packets. As a result, when
‘rxi_FreeAllPackets’ is called, only the last block is released.

However, 230dcebcd61064cc9aab6d20d34ff866a5c575ea moved the global
pointer in question to the end of the last block. As a result, when the
OpenAFS client is unmounted on DARWIN, the ‘rxi_FreeAllPackets’
function releases the wrong block of memory. This problem was exposed
on OS X 10.12 Sierra where the system crashes when the OpenAFS client
is unmounted.

To fix this problem, store the address of every single block of packets
in a queue and release them one by one when the OpenAFS client is
unmounted.
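
A user-space sketch of that idea follows. It uses malloc/free and a
simple singly linked list in place of osi_Alloc/osi_Free and the rx
queue macros; all names are hypothetical.

```c
#include <stdlib.h>

struct demo_packet { char data[64]; };

/* One node per allocated block, so every block can be found again. */
struct demo_packet_block {
    struct demo_packet_block *next;
    struct demo_packet *packets;
    size_t count;
};

static struct demo_packet_block *all_blocks = NULL;

/* Allocate a block of n packets and remember where it starts. */
static struct demo_packet *
more_packets(size_t n)
{
    struct demo_packet_block *blk = malloc(sizeof(*blk));
    if (blk == NULL)
        return NULL;
    blk->packets = calloc(n, sizeof(struct demo_packet));
    if (blk->packets == NULL) {
        free(blk);
        return NULL;
    }
    blk->count = n;
    blk->next = all_blocks;
    all_blocks = blk;
    return blk->packets;
}

/* Release every recorded block; returns how many blocks were freed. */
static size_t
free_all_packets(void)
{
    size_t freed = 0;
    while (all_blocks != NULL) {
        struct demo_packet_block *blk = all_blocks;
        all_blocks = blk->next;
        free(blk->packets);
        free(blk);
        freed++;
    }
    return freed;
}
```

Because each block's start address is recorded at allocation time, the
release path never depends on a single global pointer that later code
may move.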

Reviewed-on: https://gerrit.openafs.org/12427
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
(cherry picked from commit 5b28061fb593f5f48df549b07f0ccd848348b93c)

Change-Id: Id8606b1c1444861df69ed4af8169e343964a691d
Reviewed-on: https://gerrit.openafs.org/12602
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Mark Vitale <mvitale@sinenomine.net>
Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de>

vol: detach offline volumes on dafs

Taking a volume offline always clears the inService bit. Taking a
volume out of service also takes it offline. Therefore, if the
inService flag is false, the volume in question should be offline.
On dafs, an offline volume should be unattached.

The attach2() function does not change the state of the volume received
as an argument to unattached when the inService flag is false. Instead,
this function changes the state of the volume in question to
pre-attached and returns VNOVOL to the client. As a result, subsequent
accesses to this volume will make the server try and fail to attach
this offline volume over and over again, writing to the FileLog each
time.

To fix this problem, detach the volume received as an argument if the
inService flag is false. Since the new state of this volume will be
unattached, subsequent accesses will not hit attach2().
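
The effect can be modelled with a toy state machine. This is only an
illustration of the reasoning above, not the fileserver code: the enum,
the struct layout, attach2_sketch and access_volume are hypothetical,
and log_lines merely counts simulated FileLog warnings.

```c
#include <assert.h>
#include <stddef.h>

#define VNOVOL 103   /* AFS "no such volume" error code */

enum vol_state { VOL_UNATTACHED, VOL_PREATTACHED, VOL_ATTACHED };

struct volume {
    enum vol_state state;
    int inService;
    int log_lines;   /* simulated FileLog warnings */
};

/* Model of an attach attempt.  When detach_if_not_in_service is zero the
 * old behaviour is modelled (volume goes back to pre-attached); otherwise
 * the volume becomes unattached. */
static int
attach2_sketch(struct volume *vp, int detach_if_not_in_service)
{
    assert(vp != NULL);
    if (!vp->inService) {
        vp->log_lines++;
        vp->state = detach_if_not_in_service ? VOL_UNATTACHED
                                             : VOL_PREATTACHED;
        return VNOVOL;
    }
    vp->state = VOL_ATTACHED;
    return 0;
}

/* Model of a client access: only pre-attached volumes trigger an attach. */
static int
access_volume(struct volume *vp, int detach_if_not_in_service)
{
    assert(vp != NULL);
    if (vp->state == VOL_PREATTACHED)
        return attach2_sketch(vp, detach_if_not_in_service);
    return vp->state == VOL_ATTACHED ? 0 : VNOVOL;
}
```

In this model, repeated accesses to an out-of-service volume log once per
access under the old behaviour and only once in total after the change.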

This situation where a volume is not offline but is also not in service
can occur if a volume is taken offline with vos offline and some time
later the DAFS fileserver is shut down and restarted; the volume is
placed into the preattach state by default when the server restarts.
Each access to the volume by clients then causes the fileserver to
attempt to attach the volume, which fails, since the in-service flag in
the volume header is false from the previous vos offline. The
fileserver will log a warning to the FileLog on each attempt to attach
the volume, and this will fill the FileLog with duplicate messages
corresponding to the number of attempted accesses.

Reviewed-on: https://gerrit.openafs.org/12515
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
(cherry picked from commit 2421da2bf327525216ec7e79b9aa81fa2c4f77d5)

Change-Id: I95cffb6a91797341d9202cbbef3b205c11348d5e
Reviewed-on: https://gerrit.openafs.org/12569
Reviewed-by: Marcio Brito Barbosa <mbarbosa@sinenomine.net>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Mark Vitale <mvitale@sinenomine.net>
Reviewed-by: Michael Meffie <mmeffie@sinenomine.net>
Tested-by: Michael Meffie <mmeffie@sinenomine.net>
Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de>

DAFS: do not save or restore host state if CPS in progress

If a fileserver is shut down while one or more PR_GetHostCPS calls
are in progress, this state is saved in the fsstate.dat file as
hostFlags HCPS_WAITING, HCPS_INPROGRESS. Other hosts that are
merely waiting will have HCPS_WAITING recorded.

However, it makes no sense to restore host structs in this state,
because the GetCPS calls will no longer be in progress. Once these
hosts become active, they will block server threads and quickly cause
all server threads to be exhausted as other CPS requests are blocked
behind them.

Instead, exclude these states from both save and restore.
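
One way to picture the check is a predicate applied to each host on both
the save and the restore path. The sketch below is an assumption about
the shape of such a check, not the patch itself: the flag values, the
host_entry struct and both helper names are hypothetical, and it assumes
"exclude" means skipping the affected hosts entirely.

```c
#include <assert.h>
#include <stddef.h>

/* Flag values are assumed for illustration only. */
#define HCPS_INPROGRESS 0x08
#define HCPS_WAITING    0x10

/* Hypothetical, heavily reduced host record. */
struct host_entry {
    unsigned int hostFlags;
};

/* A host may be saved or restored only if no CPS call was pending. */
static int
host_state_is_saveable(const struct host_entry *h)
{
    assert(h != NULL);
    return (h->hostFlags & (HCPS_INPROGRESS | HCPS_WAITING)) == 0;
}

/* Count the hosts that would survive the filter. */
static int
count_saveable(const struct host_entry *hosts, int n)
{
    int i, count = 0;

    for (i = 0; i < n; i++) {
        if (host_state_is_saveable(&hosts[i]))
            count++;
    }
    return count;
}
```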

Reviewed-on: https://gerrit.openafs.org/12561
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
(cherry picked from commit 38a3f51fb8b3910ecdd7cacb06f35ec681990aea)

Change-Id: I0e02543fd2e547fcc9f95db0973f09e5951a1da1
Reviewed-on: https://gerrit.openafs.org/12568
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Mark Vitale <mvitale@sinenomine.net>
Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de>

LINUX: CURRENT_TIME macro goes away.

Check if the macro exists, define it if it does not.
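
The "define it only if it is missing" pattern can be shown in user space.
This is an analogue, not the kernel code: fallback_now and struct
sketch_time are hypothetical, and the real definition would use a kernel
time source rather than time().

```c
#include <assert.h>
#include <time.h>

/* Hypothetical stand-in for the kernel's time structure. */
struct sketch_time {
    long tv_sec;
    long tv_nsec;
};

/* Fallback time source used only when CURRENT_TIME is not provided. */
static struct sketch_time
fallback_now(void)
{
    struct sketch_time ts;
    time_t t = time(NULL);

    assert(t != (time_t)-1);
    ts.tv_sec = (long)t;
    ts.tv_nsec = 0;
    return ts;
}

/* Keep an existing definition; supply one only when it has gone away. */
#ifndef CURRENT_TIME
# define CURRENT_TIME (fallback_now())
#endif
```

Callers keep using CURRENT_TIME unchanged whether or not the headers
define it.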

Change-Id: I9990579f94bfba0804e60fa6ddcc077984cc46c3
Reviewed-on: https://gerrit.openafs.org/12611
Reviewed-by: Mark Vitale <mvitale@sinenomine.net>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Tested-by: BuildBot <buildbot@rampaginggeek.com>

redhat: update rpm spec file

Update the spec file to keep up with accumulated changes.

* Correct installation location of db check programs.
* Install afsd to the legacy location to avoid breaking
  init scripts and systemd configs.
* Exclude yet another duplicated copy of kpwvalid.
* libubik_pthread.a is gone.
* Install the kpwvalid man page.
* Continue to remove the obsolete kdb program.
* Update the names of the pam_afs symlinks.
* Add libkopenafs to authlibs.
* Package dafssync-debug man pages.
* Package opr/queue.h in devel.
* Package akeyconvert and man page.
* Do not package fuse version of afsd. A separate sub-package
  for afsd.fuse is warranted, since it adds new libfuse
  dependencies.
* Package new server man pages, including dafssync-* pages.
* Package libafsrfc3961.a as a devel lib.
* Continue to package kauth programs.

Change-Id: I875c3b8dee53abbc67b0f05f8b291bb58abf41a5
Reviewed-on: https://gerrit.openafs.org/12595
Reviewed-by: Michael Meffie <mmeffie@sinenomine.net>
Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Tested-by: Benjamin Kaduk <kaduk@mit.edu>

FBSD: build fix for FreeBSD 11

r285819 eliminated b_saveaddr from struct buf, while r292373 changed the
arguments to VOP_GETPAGES. The approach used by this patch to address
these changes was inspired by FreeBSD's nfs and samba clients.

Change-Id: Ibcf6b6fde6c86f96aa814af2bca08f1a8b286740
Reviewed-on: https://gerrit.openafs.org/12575
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Tested-by: BuildBot <buildbot@rampaginggeek.com>

redhat: convert rpm spec file to make install

Convert the build and install from the deprecated 'make dest' to the
modern 'make install' method.

* Clarify the install section by unrolling the shell scripts,
  reorganizing, and improving the comments.
* Remove the gzip glob of the man pages; rpmbuild automatically
  compresses the man pages and will handle symlinks correctly.
* Remove the generated temporary list file and specify files directly.
* Remove the extra tar commands to install the man pages out of the doc
  directory; 'make install..' installs the man pages.
* Remove code in the install section which determines the sysname. This is
  no longer needed during the install.
* Update the kernel module install commands to accommodate the
  conversion from 'make dest'.

Change-Id: I97ec80185a2b11704b27ea74941b50ff4a5aca8c
Reviewed-on: https://gerrit.openafs.org/12594
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Tested-by: Benjamin Kaduk <kaduk@mit.edu>

redhat: fix whitespace errors in the rpm spec file

Remove trailing whitespace characters that have crept into
the rpm spec file over the years.

Change-Id: I08c7ad926ddb524d6938b26513963c28c70b4195
Reviewed-on: https://gerrit.openafs.org/12606
Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Tested-by: Benjamin Kaduk <kaduk@mit.edu>

afs: fs getcacheparms miscounts dcaches for large files

fs getcacheparms issued with the -excessive option tabulates in-memory
dcaches ("DCentries") by size.  However, any dcache with validPos > 2^31
is miscounted in the 4k-16k bucket.  This is caused by a type mismatch
between 'validPos' (afs_size_t) and 'size' (int) which leads to a
negative value for size by sign-extension.  The size comparison "sieve"
fails for negative numbers; it skips the first bucket (0-4K) and dumps
them in the second one (4k-16k).

Move the declaration of 'size' closer to its use, and declare it with
the same type as 'validPos' (afs_size_t) so the comparison sieve
correctly places these dcaches in the last (>=1M) bucket.
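
The miscount can be reproduced in user space. The sieve below is a
hypothetical reconstruction: only the 0-4k, 4k-16k and >=1M buckets are
named above, so the intermediate thresholds, the function names and the
use of int64_t in place of afs_size_t are assumptions. It also assumes a
32-bit int and a compiler that wraps out-of-range conversions (the result
is implementation-defined in C).

```c
#include <assert.h>
#include <stdint.h>

/* Bucket indices; the middle ones are assumed for illustration. */
enum { B_0_4K, B_4K_16K, B_16K_64K, B_64K_256K, B_256K_1M, B_GE_1M };

/* Hypothetical comparison sieve. */
static int
sieve(int64_t size)
{
    if (size >= 0 && size <= 4096)
        return B_0_4K;
    if (size <= 16384)
        return B_4K_16K;      /* any negative size ends up here */
    if (size <= 65536)
        return B_16K_64K;
    if (size <= 262144)
        return B_64K_256K;
    if (size < 1048576)
        return B_256K_1M;
    return B_GE_1M;
}

/* Old behaviour: the 64-bit position is squeezed into an int first. */
static int
bucket_with_int(int64_t validPos)
{
    int size;

    assert(validPos >= 0);
    size = (int)validPos;     /* may wrap to a negative value */
    return sieve(size);
}

/* New behaviour: size has the same width as validPos. */
static int
bucket_with_wide(int64_t validPos)
{
    int64_t size;

    assert(validPos >= 0);
    size = validPos;
    return sieve(size);
}
```

With a 3 GiB validPos, the narrow variant sees a negative size and falls
into the second bucket, while the wide variant reaches the last one.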

Reviewed-on: https://gerrit.openafs.org/12347
Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
(cherry picked from commit b5e4e8c14130f601bbf43dee5927222ebf7613fa)

Change-Id: I659fd86f05b29c1eac1a262d340bcc1ce2640797
Reviewed-on: https://gerrit.openafs.org/12605
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Mark Vitale <mvitale@sinenomine.net>
Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de>

afs: fs getcacheparms miscounts zero-length dcaches

When fs getcacheparms is issued with the -excessive option, it
tabulates all in-memory dcaches ("DCentries") by size.

dcaches with validPos == 0 were being tabulated in the 4k-16k bucket.

Fix the first comparison in the 'sieve' so these dcaches will be counted
in the correct 0-4k bucket instead.

Introduced by commit 176c2fddb95ced6c13e04e7492fc09b5551f273c

Reviewed-on: https://gerrit.openafs.org/12346
Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
(cherry picked from commit c966c0b8414ef0a041b1a8d5261c9eccd4d39d99)

Change-Id: I53a20644f549550cef85f0cc6f3551ed5dbe1e23
Reviewed-on: https://gerrit.openafs.org/12604
Reviewed-by: Mark Vitale <mvitale@sinenomine.net>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de>

doc: clarify the fs wscell manpage

What's displayed by fs wscell is not necessarily the current content
of ThisCell, but that at the time of starting the client. Say so.

FIXES 133339

Reviewed-on: https://gerrit.openafs.org/12537
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
(cherry picked from commit bd15a5f56fde98983464acf5fd4cdd731d206d9f)

Change-Id: I47d7b92488b1166934a1704765c0f1e914a178a8
Reviewed-on: https://gerrit.openafs.org/12559
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de>

ubik: SVOTE_Beacon should hold the DB lock for CheckTid

Reviewed-on: https://gerrit.openafs.org/4262
Reviewed-by: Jeffrey Altman <jaltman@openafs.org>
Reviewed-by: Derrick Brashear <shadow@dementia.org>
Tested-by: Derrick Brashear <shadow@dementia.org>
(cherry picked from commit 5548f6540557795ded65a52c7066839c5eef468f)

Change-Id: I0d4a4d5e796bc6cb731f00db34cc0776f746ca85
Reviewed-on: https://gerrit.openafs.org/12516
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Mark Vitale <mvitale@sinenomine.net>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de>

doc: update information about vlserver logging

Mention the vlserver -d option can be used to set the initial logging
level.

Thanks to Mark Vitale for the suggestion.

Reviewed-on: https://gerrit.openafs.org/12324
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
(cherry picked from commit f5f057ce8198480fb9c67f2a8c8eee906f8a7c4a)

Change-Id: Iaa0f10d020d3993fe92690c860cdad03605d31ec
Reviewed-on: https://gerrit.openafs.org/12477
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de>

LINUX: eliminate unused variable warning

Commit c3bbf0b4444db88192eea4580ac9e9ca3de0d286 added routine
osi_TryEvictDentries and included new logic for D_INVALIDATE_IS_VOID.
Unfortunately, this new code path no longer uses dentry; it also should
have been made conditional at that time.

Wrap the declaration of dentry in #ifndef D_INVALIDATE_IS_VOID to
eliminate the unused variable warning.

Reviewed-on: https://gerrit.openafs.org/12505
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
(cherry picked from commit 19599b5ef5f7dff2741e13974692fe4a84721b59)

Change-Id: Ic15df733fcbccfaf9870ecd335bb2d549ab0d43d
Reviewed-on: https://gerrit.openafs.org/12513
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Michael Meffie <mmeffie@sinenomine.net>
Reviewed-by: Mark Vitale <mvitale@sinenomine.net>
Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de>

afs: shake harder in shake-loose-vcaches

Linux based cache managers will allocate vcaches on demand and
deallocate batches of vcaches in the background. This feature is called
dynamic vcaches.

Vcaches to be deallocated are found by traversing the vcache LRU list
(VLRU) from the oldest vcache to the newest. Up to a target number of
vcaches are attempted to be evicted.  The afs_xvcache lock protecting
the VLRU may be dropped and re-acquired while attempting to evict a
vcache. When this happens, it is possible the VLRU may have changed, so
the traversal of the VLRU is restarted.  This restarting of the VLRU
traversal is limited to 100 iterations to avoid looping indefinitely.

Vcaches which are busy cannot be evicted and remain in the VLRU. When a
busy vcache was not evicted and the afs_xvcache lock was dropped, the VLRU
traversal is restarted from the end of the VLRU. When the busy vcache is
encountered on the retry, it will trigger additional retries until the
loop limit is reached, at which point the target number of vcaches will
not be deallocated.

This can leave a very large number of unbusy vcaches which are never
deallocated.  On a busy machine, tens of millions of unused vcaches can
remain in memory. When the busy vcache at the end of the VLRU is finally
evicted, the log jam is broken, and the background daemon will hold the
afs_xvcache lock for an excessively long time, hanging the system.

Fix this by moving busy vcaches to the head of the VLRU before
restarting the VLRU traversal. These busy vcaches will be skipped when
retrying the VLRU traversal, allowing the cache manager to make progress
deallocating vcaches down to the target level.

This was already done on the Mac OS X platform while attempting to evict
vcaches. Move the code that moves busy vcaches to the head of the VLRU up
to the platform-agnostic caller.
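
The stall and the fix can be simulated with a small array model. This is
a simplified sketch, not the cache manager code: struct vlru, remove_at
and shake are hypothetical, every encounter with a busy entry is assumed
to drop the lock and force a restart, and index 0 plays the role of the
oldest end of the VLRU.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define VLRU_MAX      64
#define RESTART_LIMIT 100

/* busy[0] is the oldest entry, busy[n - 1] the newest (the head). */
struct vlru {
    int busy[VLRU_MAX];
    int n;
};

/* Remove entry i, sliding newer entries down by one slot. */
static void
remove_at(struct vlru *l, int i)
{
    memmove(&l->busy[i], &l->busy[i + 1],
            (size_t)(l->n - i - 1) * sizeof(l->busy[0]));
    l->n--;
}

/* Try to evict up to target entries, oldest first.  If move_busy is
 * nonzero, a busy entry is moved to the head before restarting. */
static int
shake(struct vlru *l, int target, int move_busy)
{
    int evicted = 0, restarts = 0, i = 0;

    assert(l != NULL && l->n >= 0 && l->n <= VLRU_MAX);
    while (i < l->n && evicted < target) {
        if (!l->busy[i]) {
            remove_at(l, i);          /* next-oldest slides into slot i */
            evicted++;
            continue;
        }
        /* Busy entry: the list may have changed, so restart. */
        if (restarts++ >= RESTART_LIMIT)
            break;
        if (move_busy) {
            remove_at(l, i);
            l->busy[l->n++] = 1;      /* re-insert at the head */
        }
        i = 0;
    }
    return evicted;
}
```

With one busy entry at the oldest end followed by ten idle ones and a
target of five, the model evicts nothing when move_busy is 0 and reaches
the target when move_busy is 1.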

Thanks to Andrew Deason for the initial version of this patch.

Reviewed-on: https://gerrit.openafs.org/11654
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Andrew Deason <adeason@dson.org>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
(cherry picked from commit 5c136c7d93ed97166f39bf716cc7f5d579b70677)

Change-Id: If60b1889d012a739aa5b43e842abb80a6ebfdb6a
Reviewed-on: https://gerrit.openafs.org/12451
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Reviewed-by: Mark Vitale <mvitale@sinenomine.net>
Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de>

LINUX: do not use d_invalidate to evict dentries

When working within the AFS filespace, commands which access large
numbers of OpenAFS files (e.g., git operations and builds) may result in
active files (e.g., the current working directory) being evicted from the
dentry cache.  One symptom of this is the following message upon return
to the shell prompt:

"fatal: unable to get current working directory: No such file or
directory"

Starting with Linux 3.18, d_invalidate returns void because it always
succeeds.  Commit a42f01d5ebb13da575b3123800ee6990743155ab adapted
OpenAFS to cope with the new return type, but not with the changed
semantics of d_invalidate.  Because d_invalidate can no longer fail with
-EBUSY when invoked on an in-use dentry, OpenAFS must no longer trust it
to preserve in-use dentries.

Modify the dentry eviction code to use a method (d_prune_aliases) that
does not evict in-use dentries.

Reviewed-on: https://gerrit.openafs.org/12363
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Tested-by: Benjamin Kaduk <kaduk@mit.edu>
(cherry picked from commit c3bbf0b4444db88192eea4580ac9e9ca3de0d286)

Change-Id: Ic72a280f136cc414b54d4b8ec280f225290df122
Reviewed-on: https://gerrit.openafs.org/12450
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Reviewed-by: Mark Vitale <mvitale@sinenomine.net>
Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de>

Reformat src/afs/LINUX/osi_vcache.c

Apply the GNU indent options from CODING, with manual adjustments
to leave jump labels in column zero.

Also rename and mark static a function-local helper function.

Reviewed-on: https://gerrit.openafs.org/12422
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
(cherry picked from commit 22933e02e2510f25b79230964f135571c7bfe710)

Change-Id: I9fb2886ae2213218ae80ea9d5b80540b9c79077b
Reviewed-on: https://gerrit.openafs.org/12449
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Reviewed-by: Mark Vitale <mvitale@sinenomine.net>
Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de>

LINUX: split dentry eviction from osi_TryEvictVCache

To make osi_TryEvictVCache clearer, and to prepare for a future change
in dentry eviction, split the dentry eviction logic into its own routine
osi_TryEvictDentries.

No functional difference should be incurred by this commit.

Reviewed-on: https://gerrit.openafs.org/12362
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Joe Gorse <jhgorse@gmail.com>
(cherry picked from commit 742643e306929ac979ab69515a33ee2a3f2fa3fa)

Change-Id: I750fc7606ca56e784a60bdbc13a32d21fe307429
Reviewed-on: https://gerrit.openafs.org/12448
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Reviewed-by: Mark Vitale <mvitale@sinenomine.net>
Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de>

doc: correct help for 'bos getlog' -restricted mode

Commit f085951d39c0d6c1e6a626177c30235704317600 introduced an error in
the bos getlog helpfile.

Modify the helpfile to describe the actual restrictions imposed by
-restricted mode.

Reviewed-on: https://gerrit.openafs.org/12454
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Tested-by: Benjamin Kaduk <kaduk@mit.edu>
(cherry picked from commit 3af0460a4a6d7bf22e1789fd9e375659e20c3a55)

Change-Id: Ifa544c322e67da712a0bc96b3797e51786e4d399
Reviewed-on: https://gerrit.openafs.org/12476
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Mark Vitale <mvitale@sinenomine.net>
Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de>

Update NEWS again for 1.6.20.2

Finalize the 1.6.20.2 release notes, including a few late additions.

Change-Id: I32a394e4af700d52f487e0db528ed261e4c2131b
Reviewed-on: https://gerrit.openafs.org/12591
Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de>
Tested-by: Stephan Wiesand <stephan.wiesand@desy.de>

Linux: only include cred.h if it exists

Commit c89fd17df1032ec2eacc0d0c9b73e19c5e8db7d2 introduced an explicit
include of linux/cred.h since the latest kernel no longer includes it
implicitly in sched.h. Alas, older kernels (like 2.6.18) don't have this
file. Add a configure test for the existence of cred.h and only include
it if actually present.

Reviewed-on: https://gerrit.openafs.org/12593
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
(cherry picked from commit 6b7b4239ab22fbb301e3b50e2ca4072445ba4e9e)

Change-Id: I64970ba471180d32fa5af5445e7604bbe8511b32
Reviewed-on: https://gerrit.openafs.org/12598
Reviewed-by: Michael Meffie <mmeffie@sinenomine.net>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de>

Linux v4.11: cred.h is no longer included in sched.h

With Linux commit e26512fea5bcd6602dbf02a551ed073cd4529449, cred.h is no
longer included in sched.h.

Several components of libafs which require cred.h were picking it up by
including sched.h.

Instead, explicitly add an include for cred.h. cred.h begins with a
customary one-shot to prevent multiple loads:

#ifndef _LINUX_CRED_H
#define _LINUX_CRED_H

Therefore we don't need a new autoconf test or preprocessor conditional
to prevent redundant includes on older Linux releases.

Reviewed-on: https://gerrit.openafs.org/12574
Tested-by: Mark Vitale <mvitale@sinenomine.net>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Joe Gorse <jhgorse@gmail.com>
Tested-by: Joe Gorse <jhgorse@gmail.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Reviewed-by: Michael Meffie <mmeffie@sinenomine.net>
(cherry picked from commit c89fd17df1032ec2eacc0d0c9b73e19c5e8db7d2)

Change-Id: I235a6272c55a8f734be07b578bbb1a324cf34e2e
Reviewed-on: https://gerrit.openafs.org/12590
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Mark Vitale <mvitale@sinenomine.net>
Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de>

Linux v4.11: signal stuff moved to sched/signal.h

In Linux commit c3edc4010e9d102eb7b8f17d15c2ebc425fed63c, signal_struct
and other signal handling declarations were moved from sched.h to
sched/signal.h.

This breaks existing OpenAFS autoconf tests for recalc_sigpending() and
task_struct.signal->rlim, so that the OpenAFS kernel module can no
longer build.

Modify OpenAFS autoconf tests to cope.

Reviewed-on: https://gerrit.openafs.org/12573
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Joe Gorse <jhgorse@gmail.com>
Tested-by: Joe Gorse <jhgorse@gmail.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Reviewed-by: Michael Meffie <mmeffie@sinenomine.net>
(cherry picked from commit ad001550949b612ff6b4899fa8da50ee58f87533)

Change-Id: I491208d77e45d45cc0089b8033892a6408da431c
Reviewed-on: https://gerrit.openafs.org/12589
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Mark Vitale <mvitale@sinenomine.net>
Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de>

Linux v4.11: getattr takes struct path

With Linux commit a528d35e8bfcc521d7cb70aaf03e1bd296c8493f

    statx: Add a system call to make enhanced file info available

the Linux getattr inode operation is altered to take two additional
arguments: a u32 request_mask and an unsigned int flags that indicate
the synchronisation mode.  This change is propagated to the
vfs_getattr*() function.

-   int (*getattr) (struct vfsmount *, struct dentry *, struct kstat *);
+   int (*getattr) (const struct path *, struct kstat *,
+                     u32 request_mask, unsigned int sync_mode);

The first of these, request_mask, indicates which fields of the statx
structure are of interest to the userland call. The second, flags, may
currently take the values defined in include/uapi/linux/fcntl.h, which
are optionally used for cache coherence:

(1) AT_STATX_SYNC_AS_STAT tells statx() to behave as stat() does.

(2) AT_STATX_FORCE_SYNC will require a network filesystem to
     synchronise its attributes with the server - which might require
     data writeback to occur to get the timestamps correct.

(3) AT_STATX_DONT_SYNC will suppress synchronisation with the server in
     a network filesystem.  The resulting values should be considered
     approximate.

This patch provides a new autoconf test and conditional compilation to
cope with the changes in our getattr implementation.
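
The conditional compilation can be sketched in user space with stub
types. Nothing here is the actual OpenAFS code: the macro name
GETATTR_TAKES_PATH_STRUCT, the stub structs, getattr_common and
sketch_getattr are all hypothetical, and a real build would take the
types from the kernel headers and the macro from the autoconf test.

```c
#include <assert.h>
#include <stddef.h>

/* Stub types standing in for the kernel definitions. */
typedef unsigned int u32;
struct dentry { int ino; };
struct vfsmount { int unused; };
struct path { struct vfsmount *mnt; struct dentry *dentry; };
struct kstat { int ino; };

/* Shared body, independent of which signature the kernel expects. */
static int
getattr_common(struct dentry *dp, struct kstat *stat)
{
    assert(dp != NULL && stat != NULL);
    stat->ino = dp->ino;
    return 0;
}

#if defined(GETATTR_TAKES_PATH_STRUCT)
/* Newer signature: path plus request mask and sync mode. */
static int
sketch_getattr(const struct path *path, struct kstat *stat,
               u32 request_mask, unsigned int sync_mode)
{
    (void)request_mask;       /* ignored in this sketch */
    (void)sync_mode;
    return getattr_common(path->dentry, stat);
}
#else
/* Older signature: vfsmount and dentry passed separately. */
static int
sketch_getattr(struct vfsmount *mnt, struct dentry *dp, struct kstat *stat)
{
    (void)mnt;
    return getattr_common(dp, stat);
}
#endif
```

Keeping the real work in one common helper means only the thin wrapper
differs between kernel versions.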

Reviewed-on: https://gerrit.openafs.org/12572
Reviewed-by: Joe Gorse <jhgorse@gmail.com>
Tested-by: Joe Gorse <jhgorse@gmail.com>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Michael Meffie <mmeffie@sinenomine.net>
Reviewed-by: Mark Vitale <mvitale@sinenomine.net>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
(cherry picked from commit de5ee1a67d1c3284d65dc69bbbf89664af70b357)

Change-Id: I41ff134e1e71944f0629c9837d38cfbc495264c8
Reviewed-on: https://gerrit.openafs.org/12588
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Mark Vitale <mvitale@sinenomine.net>
Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de>

Linux: only include cred.h if it exists

Commit c89fd17df1032ec2eacc0d0c9b73e19c5e8db7d2 introduced an explicit
include of linux/cred.h since the latest kernel no longer includes it
implicitly in sched.h. Alas, older kernels (like 2.6.18) don't have this
file. Add a configure test for the existence of cred.h and only include
it if actually present.

Change-Id: Ia7e38160492b1e03cdb257e4b2bef4d18c4a28fb
Reviewed-on: https://gerrit.openafs.org/12593
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Tested-by: BuildBot <buildbot@rampaginggeek.com>

Prevent double-starting client on RHEL7

On RHEL7 if the AFS client is stopped with 'service openafs-client
stop', but that fails for some reason (most commonly because some
process has a file or directory in AFS open) systemd will decide that
the openafs-client is in a failed state when it is actually running.
If one then runs 'service openafs-client start' systemd will start a
new AFS client. At this point AFS access will continue to work until
the functional AFS client is (successfully) stopped, at which point a
reboot is required to recover.

Have systemd check the status of 'fs sysname' before starting the
AFS client, so we avoid getting into a state that requires a reboot.

Reviewed-on: https://gerrit.openafs.org/12443
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Michael Meffie <mmeffie@sinenomine.net>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
(cherry picked from commit c666bfee8848183ccbc566c9e0fa019088e56505)

Change-Id: I2e7bf69ec5d1ae344d38b86fc3caace25b2da135
Reviewed-on: https://gerrit.openafs.org/12587
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de>

Linux v4.11: cred.h is no longer included in sched.h

With Linux commit e26512fea5bcd6602dbf02a551ed073cd4529449, cred.h is no
longer included in sched.h.

Several components of libafs which require cred.h were picking it up by
including sched.h.

Instead, explicitly add an include for cred.h. cred.h begins with a
customary one-shot to prevent multiple loads:

#ifndef _LINUX_CRED_H
#define _LINUX_CRED_H

Therefore we don't need a new autoconf test or preprocessor conditional
to prevent redundant includes on older Linux releases.

Change-Id: Ifc496c83141d2cfbd417133feb6d87c1146e5014
Reviewed-on: https://gerrit.openafs.org/12574
Tested-by: Mark Vitale <mvitale@sinenomine.net>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Joe Gorse <jhgorse@gmail.com>
Tested-by: Joe Gorse <jhgorse@gmail.com>
Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Reviewed-by: Michael Meffie <mmeffie@sinenomine.net>