Recent versions of rpm (the issue was found on Fedora 27) extract
the debuginfo data from a copy of the original files having the
package version-release as a suffix. This broke the original
change since the regular expression passed to find-debuginfo.sh
no longer matched the name of the openafs.ko file. The file list
for the -debuginfo package remained empty, which caused rpmbuild
to fail.
Relax the regex to match the previous and current file names we
are after. It is possible but unlikely that .*openafs\.ko.* will
ever match any file not being a kernel module.
Reviewed-on: https://gerrit.openafs.org/13030 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Michael Meffie <mmeffie@sinenomine.net> Tested-by: Michael Meffie <mmeffie@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
(cherry picked from commit 076b73e06df8240f209470ea6ee19b66eb4166c3)
Change-Id: Ib8a683d586ad3bd5237f27546a95ce92dd9de04f
Reviewed-on: https://gerrit.openafs.org/13036 Reviewed-by: Michael Meffie <mmeffie@sinenomine.net> Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Stephan Wiesand [Thu, 26 Apr 2018 17:50:06 +0000 (19:50 +0200)]
redhat: PACKAGE_VERSION macro no longer exists
Commit 0d0e7699c9f789214205fe6837cded1a4c95f9c0 replaced all uses
of the %PACKAGE_VERSION macro in the spec with the %version one, but
missed an instance in the kmodtool script. Fix this, to avoid a
warning during rpmbuild.
Reviewed-on: https://gerrit.openafs.org/13031 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Michael Meffie <mmeffie@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
(cherry picked from commit cfa74883e4996dfee2bd6ffaa3b967e5a7941e0b)
Change-Id: I2d57ddc3700f509da3255df1f952f55d8cd7f0e8
Reviewed-on: https://gerrit.openafs.org/13037 Reviewed-by: Michael Meffie <mmeffie@sinenomine.net> Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Michael Meffie [Fri, 6 Apr 2018 03:43:34 +0000 (23:43 -0400)]
redhat: remove the openafs-kernel-version.sh script
Commit ec706b21530240d7fb66bad2f08513eff8f7c335 (Remove Linux 2.4 compat
from RedHat packaging) removed the use of the script
openafs-kernel-version.sh, which was used in the linux 2.4 days to look
up the current kernel version. Nowadays, we use the openafs-kmodtool
script to determine the kernel version.
Remove the unused openafs-kernel-version.sh script from the package
sources.
Reviewed-on: https://gerrit.openafs.org/12996 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
(cherry picked from commit 28ea20d03f8abd8109547d6825edad159748397a)
Michael Meffie [Fri, 6 Apr 2018 02:56:50 +0000 (22:56 -0400)]
redhat: remove extra kernel version check
Commit a1c072ac562ccf74e5afb8449db1bcef86aef362 (redhat: fix rpmbuild command
line option defaults) added logic to set the default value of the kernvers
variable when not specified as an rpmbuild command line option.
This default value is not necessary, since 'kmodtool verrel' already returns
the current running kernel version by default. The result of 'kmodtool verrel'
sets the kverrel variable, which holds the value of the kernel version we are
building. The kernvers variable is only used as an argument to 'kmodtool
verrel' and may be empty by default to indicate the current version should be
returned.
Remove the unnecessary setting of the default value of kernvers.
Also update the information banner to show the value of kverrel, which is the
actual version we are building, instead of kernvers, which is empty be default.
Reviewed-on: https://gerrit.openafs.org/12995 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
(cherry picked from commit 9f0164f4254da39c3c31e0268da58ce7a6ccda1d)
Stephan Wiesand [Wed, 4 Apr 2018 15:09:39 +0000 (17:09 +0200)]
FBSD: param.h consistency
Commit 88dc4d93f5ef080da8f56fac453f095e6c79d4a0 ("Add param.h
files for recent FreeBSD") introduced an inconsistency between
the i386 and amd64 param.h files for 11.1 and 12.0 regarding
the *_FBSD101_ENV #defines.
Citing Benjamin Kaduk: "Traditionally we have the param.h for
a FreeBSD N.0 release include the (N-1).Y values that existed
at the time of the N.0 release, and freeze that set of (N-1).Y
values for the lifetime of FreeBSD N.x, if that makes sense."
Given that FreeBSD 11.0 was released shortly after 10.3, and
12.0 is not yet released, consistently #define
*_FBSD10{1..3}_ENV for 11.1 and *_FBSD10{1..4}_ENV for 12.0
Reviewed-on: https://gerrit.openafs.org/12990 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
(cherry picked from commit 154512831966d12c1e32e6271d4ab1440a25b96e)
Stephan Wiesand [Mon, 26 Mar 2018 18:21:19 +0000 (20:21 +0200)]
redhat: Create unique debuginfo packages for kmods
Commit 443dd5367e0cd9050ad39a6594c5be521271b4e9 ("redhat:
separate debuginfo package for kmod rpm") introduced the
creation of separate debuginfo packages for the kmod packages.
As such, this is useful, but all debuginfo packages for a given
OpenAFS release ended up with the same name/version/release for
the kmod debuginfo package, no matter which kernel release or
variant the kmod was built for.
Move the additional black magic from the spec into the kmodtool
script where we have the means to do better: Use the same naming
and versioning conventions as for the kmod-openafs packages
themselves.
Reviewed-on: https://gerrit.openafs.org/12977 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Andrew Deason <adeason@sinenomine.net> Reviewed-by: Michael Meffie <mmeffie@sinenomine.net> Tested-by: Michael Meffie <mmeffie@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
(cherry picked from commit 387ae9536888419d7b101513e04e1c644e3218d6)
Change-Id: I220408eacd0c39449843240f225cfced163cbff7
Reviewed-on: https://gerrit.openafs.org/12986 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Michael Meffie <mmeffie@sinenomine.net> Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Andrew Deason [Wed, 7 Mar 2018 17:32:43 +0000 (11:32 -0600)]
ubik: Log sync site for SDISK_SendFile USYNC error
In SDISK_SendFile, we return a USYNC error if the caller is not the
sync site. Say who the sync site is when we do this, to possibly help
post-mortem debugging.
Reviewed-on: https://gerrit.openafs.org/12943 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Michael Meffie <mmeffie@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
(cherry picked from commit c44f6f7a8052bdd1fb021e07bb6ae142b61e6b5b)
Change-Id: I398780c98ee5eade75e06a42d54637c169bc250a
Reviewed-on: https://gerrit.openafs.org/12948 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Andrew Deason <adeason@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
The idea is that for a .sym file consisting of, for example:
foo
bar
We then generate a regex like (^foo$|^bar$). However, since the 'tr'
removes all newlines, the line given to the last 'sed' in the pipeline
has no trailing newline. On some systems, such as Solaris, this causes
sed to not output anything at all, resulting in a regex pattern of
just "()".
For example:
# on Debian
$ echo -n foo | sed -e 's/foo/bar/'
bar$
# on Solaris
$ echo -n foo | sed -e 's/foo/bar/'
$
To avoid this, we can change the sed pipeline to not remove the
newlines until the very end. Change the way we construct our regex to
this instead:
Marcio Barbosa [Thu, 22 Feb 2018 22:53:23 +0000 (17:53 -0500)]
ubik: don't set database epoch to 0 if not needed
If our attempt to receive a fresh database from a peer fails, we will
overwrite the version.epoch field of our current local copy of the
database with an invalid value, "0". The idea behind this approach is
to make sure that this database will not be seen as a legit copy if the
transfer is not completed properly. Although it is questionable if this
approach is still necessary (since the current version writes the data
into a temporary file), it is undisputed that the database version does
not have to be invalidated if the transfer fails in a early stage where
no data has been written and we could safely continue to reuse the local
copy for read-only queries. Early failures may happen if:
1. The peer sending the database to us is not the peer we believe to be
the sync site;
2. The sender is not authorized to call DISK_SendFile;
In both cases, the database epoch is invalidated. As a result of that,
we may have the following consequences:
1. Reads may not be allowed
Once the on disk epoch is invalidated, if the server in question is
rebooted, the invalid on disk epoch will be used to initialize the in
memory epoch. At this point, reads may not be allowed since
urecovery_AllBetter checks if the in memory epoch is greater than 1.
Reads should not be blocked forever since the sync-site will send a new
database to this remote and, as a result of that, the invalid version
will be corrected.
2. Data can be lost
If the site with the invalid epoch is the one with the most recent
database, the database can be rolled back to an earlier version during a
new quorum establishment. Consider the following scenario where we have
three sites:
Site A (up - database up to date) (sync-site)
Site B (up - database up to date)
Site C (down - old database)
The epoch of B is invalidated due to the problem fixed by this patch.
Then, A is turned off and C is turned on. In this scenario, the new
sync-site will distribute the old database held by C since its epoch is
greater than 0.
To fix the problem in question, do not set the database epoch to 0
if the local database was not modified.
Acknowledgements:
Hartmut Reuter <hartmut.reuter@gmx.de>
- found the problem;
- suggested a possible solution;
Benjamin Kaduk <kaduk@mit.edu>
- submitted the first version;
Andrew Deason <adeason@sinenomine.net>
- suggested changes;
Reviewed-on: https://gerrit.openafs.org/12924 Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Michael Meffie <mmeffie@sinenomine.net>
(cherry picked from commit bd6a2484011dad6298c4ce97dd0cd68e0834baa5)
Michael Meffie [Tue, 20 Feb 2018 16:51:01 +0000 (11:51 -0500)]
afs: improve -volume-ttl error messages
Change the afs call which sets the volume ttl value to return EFAULT
instead of EINVAL when given an out of range value for the volume ttl
parameter. This is more consistent with the other op codes, which
return EFAULT when given an out of range parameter and allows the caller
to distinguish between an invalid opcode and a bad parameter.
Move the volume ttl range constants to afs_args.h, which is where
constants related to the op codes are supposed to be defined. This makes
the constants available to the caller in afsd.c as well as the
implementation in afs_call.c.
Update afsd to print a more sensible error message when the volume ttl
set calls fails due to an out of range parameter.
Reviewed-on: https://gerrit.openafs.org/12918 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
(cherry picked from commit 6d74e3d6a1becf86cec30efc2d01a5692167afe1)
Change-Id: I2cd86b6fbba31f74862bb902ac94b0874de8afac
Reviewed-on: https://gerrit.openafs.org/12936 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Mark Vitale <mvitale@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Benjamin Kaduk [Fri, 2 Mar 2018 02:28:23 +0000 (20:28 -0600)]
afs_pioctl: avoid -Wpointer-sign
Change the declaration of 'addr' to be a signed int, to match
RXAFS_CallBackRxConnAddr() and the afsd_pd_GetInt() used with it.
This was detected by clang 4.0 in FreeBSD 11.1, via -Wpointer-sign.
Reviewed-on: https://gerrit.openafs.org/12934 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
(cherry picked from commit 79f33b859aeb3c91f2cce7597fdc138978c4e1d9)
Change-Id: Iee85059bebfc8d6fbda3409b720576bd4f6c5f8f
Reviewed-on: https://gerrit.openafs.org/12938 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Mark Vitale <mvitale@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Mark Vitale [Fri, 2 Mar 2018 04:16:56 +0000 (23:16 -0500)]
LINUX: fix RedHat 7.5 ENOTDIR issues
Red Hat Linux 7.5 beta introduces a new file->f_mode flag
FMODE_KABI_ITERATE as a means for certain in-tree filesystems to
indicate that they have implemented file operation iterate() instead of
readdir(). The kernel routine iterate_dir() tests this flag to decide
whether to invoke the file operation iterate() or readdir().
The OpenAFS configure script detects that the file operation iterate()
is available under RH7.5 and so implements iterate() as
afs_linux_readdir(). However, since OpenAFS does not set
FMODE_KABI_ITERATE on any of its files, the kernel's iterate_dir() will
not invoke iterate() for any OpenAFS files. OpenAFS has also not
implemented readdir(), so iterate_dir() must return -ENOTDIR.
Instead, modify OpenAFS to fall back to readdir() in this case.
Reviewed-on: https://gerrit.openafs.org/12935 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
(cherry picked from commit c818f86b79a636532d396887d4f22cc196c86288)
Change-Id: I71386b17f0c751b69c86ef0f5766a5baf3dc36bd
Reviewed-on: https://gerrit.openafs.org/12950 Reviewed-by: Mark Vitale <mvitale@sinenomine.net> Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Andrew Deason [Thu, 15 Feb 2018 22:41:33 +0000 (16:41 -0600)]
rxdebug: NUL-terminate version before printing
Currently, 'rxdebug -version' never initializes the buffer we read the
version string into. Usually this is not noticeable, since all OpenAFS
binaries tend to pad the Rx version response packet with NULs, so we
get back several NULs to terminate the string. However, this is not
guaranteed, and if we do not get back a NUL-terminated string, we can
easily read beyond the end of the buffer.
To avoid this, initialize the 'version' buffer with NULs before we do
anything, and set the last byte to NUL, in case we exactly filled the
buffer.
Reviewed-on: https://gerrit.openafs.org/12908 Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Tested-by: Andrew Deason <adeason@sinenomine.net>
(cherry picked from commit a66629eac4dda4eea37b4f06e0850641cb2a7387)
Andrew Deason [Thu, 15 Feb 2018 22:53:57 +0000 (16:53 -0600)]
doc: Edits to the 'afsd -volume-ttl' manpage
Make a few misc changes to the text for the new -volume-ttl option:
- Minor grammatical/typo fixes
- Emphasize a little more that the default behavior allows for vldb
info to be cached _forever_
- Provide some info on the effects of changing this value
- Provide a suggested "typical" value, to give some clue as to what
should be set here, so a curious user doesn't just set this to the
first value they see (10 minutes)
Reviewed-on: https://gerrit.openafs.org/12909 Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Tested-by: Andrew Deason <adeason@sinenomine.net>
(cherry picked from commit e6c2624249a6ab96053c1d1134aec8e3f6bcee9e)
Michael Meffie [Wed, 21 Feb 2018 01:31:11 +0000 (20:31 -0500)]
redhat: package libuafs perl bindings
Require the swig package as a build dependency. Build and package the
libuafs perl bindings. Place these libraries in the openafs-devel
package, along with the man page (moved from the openfs-client package).
This fixes an rpm build error when the swig package is present on the
build system,
Jeffrey Altman [Sat, 10 Feb 2018 15:47:24 +0000 (10:47 -0500)]
rx: Do not count RXGEN_OPCODE towards abort threshold
An RXGEN_OPCODE is returned for opcodes that are not implemented by the
rx service. These opcodes might be deprecated opcodes that are no
longer supported or more recently registered opcodes that have yet to
be implemented. Clients should not be punished for issuing unsupported
calls. The clients might be old and are issuing no longer supported
calls or they might be newer and are issuing yet to be implemented calls
as part of a feature test and fallback strategy.
This change ignores RXGEN_OPCODE errors when deciding how to adjust the
rx_call.abortCount. When an RXGEN_OPCODE abort is sent the
rx_call.abortCount and rx_call.abortError are left unchanged which
preserves the state for the next failing call.
Note that this change intentionlly prevents the incrementing of the
abortCount for client connections as they never send delay aborts.
Reviewed-on: https://gerrit.openafs.org/12906 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
(cherry picked from commit f82d1c7d5aeae148305e867c1f79c6ea2f9e0a2a)
Marcio Barbosa [Wed, 21 Jun 2017 20:24:05 +0000 (16:24 -0400)]
ubik: check if epoch is sane before db relabel
The sync-site relabels its database at the end of the first write
transaction. The new label will be equal to the time at which the
sync-site in question first received its coordinator mandate. This time
is stored by a global called ubik_epochTime. In order to make sure that
the new database label is sane, only relabel the database if
ubik_epochTime is within a specific range.
Reviewed-on: https://gerrit.openafs.org/12640 Reviewed-by: Mark Vitale <mvitale@sinenomine.net> Reviewed-by: Andrew Deason <adeason@dson.org> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Tested-by: BuildBot <buildbot@rampaginggeek.com>
(cherry picked from commit f5c289d00aaf7c5525b477da5b89f6675456c211)
Change-Id: I78ebd2b8aeae01ef5e3b826ad6f1de5a5c1db79e
Reviewed-on: https://gerrit.openafs.org/12886 Reviewed-by: Marcio Brito Barbosa <mbarbosa@sinenomine.net> Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Mark Vitale <mvitale@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Benjamin Kaduk [Sat, 9 Dec 2017 17:37:59 +0000 (11:37 -0600)]
Replace <rpc/types.h> with <rx/xdr.h>
Our in-tree xdr.h appears to have started life as a concatenation of
rpc/types.h and rpc/xdr.h, and should include all the needed functionality.
Indeed, commit 7293ddf325b149cae60d3abe7199d08f196bd2b9 even indicates
that we expect to be using our in-tree XDR everywhere anyway, so the
system XDR is superfluous.
Note that afs/sysincludes.h (not afsincludes.h!) already includes
rx/xdr.h ifndef AFS_LINUX22_ENV.
This change should help systems running glibc 2.26 or newer, which has
stopped providing the Sun RPC headers by default.
While here remove some duplicate includes of rpc/types.h in the
AIX-specific sources.
The Solaris NFS translator bits cannot really be changed, since the system
headers are used and have tight interdependencies.
Update rxgen to not emit rpc/types.h inclusion.
[mmeffie: squash 12801 to not emit rpc/types.h from rxgen]
Reviewed-on: https://gerrit.openafs.org/12800 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Michael Meffie <mmeffie@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
(cherry picked from commit e443a9fb67dbc29e6cc36661a4ac6e91af113f23)
Change-Id: I351e5c1e1223c49ca76e3d68c264ac1625abae60
Reviewed-on: https://gerrit.openafs.org/12894 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Mark Vitale <mvitale@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Reviewed-on: https://gerrit.openafs.org/12884 Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Tested-by: Benjamin Kaduk <kaduk@mit.edu>
(cherry picked from commit c7c71d2429cf685f3ffad6b2e6d102d900edc197)
Change-Id: I271cfeb6aea888ae40539e248a18131b0affeda8
Reviewed-on: https://gerrit.openafs.org/12901 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Mark Vitale <mvitale@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Mark Vitale [Tue, 30 Jun 2015 05:54:21 +0000 (01:54 -0400)]
SOLARIS: Avoid vcache locks when flushing pages for RO vnodes
We have multiple code paths that hold the following locks at the same
time:
- avc->lock for a vcache
- The page lock for a page in 'avc'
In order to avoid deadlocks, we need a consistent ordering for obtaining
these two locks. The code in afs_putpage() currently obtains avc->lock
before the page lock (Obtain*Lock is called before pvn_vplist_dirty).
The code in afs_getpages() also obtains avc->lock before the page lock,
but it does so in a loop for all requested pages (via pvn_getpages()).
On the second iteration of that loop, it obtains avc->lock, and the page
from the first iteration of the loop is still locked. Thus, it obtains a
page lock before locking avc->lock in some cases.
Since we have two code paths that obtain those two locks in a different
order, a deadlock can occur. Fixing this properly requires changing at
least one of those code paths, so the locks are taken in a consistent
order. However, doing so is complex and will be done in a separate
future commit.
For this commit, we can avoid the deadlock for RO volumes by simply
avoiding taking avc->lock in afs_putpages() at all while the pages are
locked. Normally, we lock avc->lock because pvn_vplist_dirty() will call
afs_putapage() for each dirty page (and afs_putapage() requires
avc->lock held). But for RO volumes, we will have no dirty pages
(because RO volumes cannot be written to from a client), and so
afs_putapage() will never be called.
So to avoid this deadlock issue for RO volumes, avoid taking avc->lock
across the pvn_vplist_dirty() call in afs_putpage(). We now pass a dummy
pageout callback function to pvn_vplist_dirty() instead, which should
never be called, and which panics if it ever is.
We still need to hold avc->lock a few other times during afs_putpage()
for other minor reasons, but none of these hold page locks at the same
time, so the deadlock issue is still avoided.
Benjamin Kaduk [Fri, 5 Jan 2018 04:00:15 +0000 (22:00 -0600)]
rx: remove trailing semicolons from FBSD mutex operations
Since the first introduction of FreeBSD support, the macros
(MUTEX_ENTER, etc.) for kernel mutex operations have included
trailing semicolons, unique among all the platforms.
This did not cause problems until the recent work on rx event
handlers, which put a MUTEX_ENTER() in the body of an 'if' clause
with no brackets, and attempted to follow it with an 'else' clause.
This results in the following (rather obtuse) compiler error:
Christof Hanke [Mon, 18 Dec 2017 15:58:39 +0000 (16:58 +0100)]
Avoid gcc warning
When using the configure option --enable-checking with gcc 7.2.1,
the compilation fails with
vutil.c:860:20: error: ‘%s’ directive writing up to 255 bytes into \
a region of size 63 [-Werror=format-overflow=]
This can be seen in the logs of the openSUSE Tumbleweed builder
for e.g. build 2368.
Avoid this warning by using snprintf which is provided by libroken
for all platforms.
Reviewed-on: https://gerrit.openafs.org/12813 Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Tested-by: BuildBot <buildbot@rampaginggeek.com>
(cherry picked from commit fd4eaebb60dbefc27be98015fee23a3cf5d9752d)
Marcio Barbosa [Mon, 21 Aug 2017 18:21:54 +0000 (14:21 -0400)]
ubik: avoid DISK_Begin on sites that didn't vote for sync
As already described on 7c708506, SDISK_Begin fails on remotes if
lastYesState is not set. To fix this problem, 7c708506 does not allow
write transactions until we know that lastYesState is set on at least
quorum (ubik_syncSiteAdvertised == 1). In other words, if enough sites
received a beacon packet informing that a sync-site was elected, write
transactions will be allowed. This means that ubik_syncSiteAdvertised
can be true while lastYesState is not set in a few sites.
Consider the following scenario in a cell with frequent write
transactions:
Site A => Sync-site (up)
Site B => Remote 1 (up)
Site C => Remote 2 (down - unreachable)
Since A and B are up, we have quorum. After the second wave of beacons,
ubik_syncSiteAdvertised will be true and write transactions will be
allowed. At some point, C is not unreachable anymore. Site A sends a
copy of its database to C, but C did not vote for A yet (lastYesState ==
0). A new write transaction is initialized and, since lastYesState is
not set on C, DISK_Begin fails on this remote site and C is marked as
down. Since C is reachable, A will mark this remote site as up. The
sync-site will send its database to C, but C did not vote for A yet. A
new write transaction is initialized and, since lastYesState is not set
on C, DISK_Begin fails on this remote site and C is marked as down. In a
cell with frequent write transactions, this cycle will repeat forever.
As a result, the sync-site will be constantly sending its database to C
and quorum will be operating with less sites, increasing the chances
of re-elections.
To fix this problem, do not call DISK_Begin on remotes that did not
vote for the sync-site yet.
Reviewed-on: https://gerrit.openafs.org/12715 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Mark Vitale <mvitale@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
(cherry picked from commit 68ec78950a6e39dc1bf15012d4b889728086d0b7)
Marcio Barbosa [Mon, 21 Aug 2017 19:50:14 +0000 (15:50 -0400)]
ubik: update ubik_dbVersion during SDISK_SendFile
The ubik_dbVersion global represents the sync site's database version
and it is mostly used by the remote sites for sanity checks. Currently,
this global is updated when database changes are made on the sync site
(SDISK_Commit or SDISK_SetVersion), as well as every time we vote "yes"
for the sync-site in a beacon reply. Unfortunately, ubik_dbVersion is
not updated when a copy of the sync site's database is received via
DISK_SendFile, and it won't get updated until our next "yes" vote.
During this window, the current database version will not match
ubik_dbVersion. As a result, any write transaction during this time
frame will fail on the remote site in question.
To fix this problem, do not wait for the next beacon packet to update
ubik_dbVersion when the sync site's database is received; just update
it when we get the new database. Since no write transactions are
allowed while the db is transferring, ubik_dbVersion can be safely
updated.
Reviewed-on: https://gerrit.openafs.org/12716 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Mark Vitale <mvitale@sinenomine.net> Reviewed-by: Andrew Deason <adeason@dson.org> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
(cherry picked from commit 50c1d1088d2adcbb37b6a9d23fdd63617b1267be)
Andrew Deason [Fri, 12 Jan 2018 03:27:28 +0000 (21:27 -0600)]
LINUX: Avoid locking inode in check_dentry_race
Currently, check_dentry_race locks the parent inode in order to ensure
it is not running in parallel with d_splice_alias for the same inode.
(For old Linux kernel versions; see commit b0461f2d: "LINUX:
Workaround d_splice_alias/d_lookup race".)
However, it is possible to hit this area of code when the parent inode
is already locked. When someone tries to create a file, directory, or
symlink, Linux tries to lookup the dentry for the target path, to see
if it already exists. While looking up the last component of the path,
Linux locks the directory, and if it finds a dentry for the target
name, it calls d_invalidate on it while the parent directory is
locked.
For a dentry with a NULL inode, we'll then try to lock the parent
inode in check_dentry_race. But since the inode is already locked, we
will deadlock.
From a user's point of view, the hang can be reproduced by doing
something similar to:
$ mkdir dir # succeeds
$ rmdir dir
$ ls -l dir
ls: cannot access dir: No such file or directory
$ mkdir dir # hangs
To avoid this, we can just change which lock we're using to avoid
check_dentry_race/d_splice_alias from running in parallel. Instead of
locking the parent inode, introduce a new global lock (called
dentry_race_sem), and lock that in check_dentry_race and around our
d_splice_alias call. We know that those are the only two users of this
new lock, so this should avoid any such deadlocks.
This does potentially reduce performance, since all tasks that hit
check_dentry_race or d_splice_alias will take the same global lock.
However, this at least still allows us to make use of negative
dentries, and this entire code path only applies to older Linux
kernels. It could be possible to add a new lock into struct vcache
instead, but using a global lock like this commit does is much
simpler.
Reviewed-on: https://gerrit.openafs.org/12868 Tested-by: Benjamin Kaduk <kaduk@mit.edu> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
(cherry picked from commit ef1d4c8d328e9b9affc9864fd084257e9fa08445)
Caitlyn Marko [Thu, 9 Feb 2017 14:16:17 +0000 (09:16 -0500)]
SOLARIS: save kernel module function arguments for debugging
Add the -Wu,-save_args compiler option when building kernel modules
under Solaris 10 and 11 for the amd64 architecture.
Binaries generated with this option save function arguments on the stack
during function entry for debugging purposes. Up to six integer
arguments are saved on function entry, and are not modified during the
execution of the function.
[mmeffie: commit message update]
Reviewed-on: https://gerrit.openafs.org/12798 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
(cherry picked from commit 32d0493a7e4f74f5e5efdfde5eca29ed7d1bf3ec)
Marcio Barbosa [Mon, 5 Feb 2018 21:16:17 +0000 (21:16 +0000)]
autoconf: detect ctf-tools and add ctf to libafs
CTF is a reduced form of debug information similar to DWARF and stab. It
describes types and function prototypes. The principal objective of the
format is to shrink the data size as much as possible so that it could
be included in a production environment. MDB, DTrace, and other tools
use CTF debug information to read and display structures correctly.
This commit introduces a new configure option called --with-ctf-tools.
This option can be used to specify an alternative path where the tools
can be found. If the path is not provided, the tools will be searched
in a set of default directories (including $PATH). The CTF debugging
information will only be included if the corresponding --enable-debug /
--enable-debug-kernel is specified.
Note: at the moment, the Solaris kernel module is the only module
benefited by this commit.
Reviewed-on: https://gerrit.openafs.org/12680 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
(cherry picked from commit 88cb536f99dc58fdbeb9fa6c47c26774241a0cb6)
Michael Meffie [Sat, 30 Dec 2017 22:59:38 +0000 (17:59 -0500)]
autoconf: refactor linux-checks.m4
Further refactoring of the autoconf macros. Divy up the linux kernel
checks into smaller files.
This is a non-functional change. Care has been taken preserve the
ordering of the autoconf tests. Except for whitespace, the generated
configure file has not been changed by this refactoring. This has been
verified with a 'diff -u -w -B' comparison of the generated configure
file before and after applying this commit.
Reviewed-on: https://gerrit.openafs.org/12844 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
(cherry picked from commit 6a2b85cd4c00a08e165cb96d2cb56bf87c6324bc)
Michael Meffie [Sat, 30 Dec 2017 17:12:59 +0000 (12:12 -0500)]
autoconf: refactor ostype.m4
Further refactoring of the autoconf macros. Move more linux and solaris
specific checks into their own files.
This is a non-functional change. Care has been taken preserve the
ordering of the autoconf tests. Except for whitespace, the generated
configure file has not been changed by this refactoring. This has been
verified with a 'diff -u -w -B' comparison of the generated configure
file before and after applying this commit.
Reviewed-on: https://gerrit.openafs.org/12843 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
(cherry picked from commit 3c2e39bab7d927aa5f20d02a5e327927a4b2b553)
Michael Meffie [Fri, 29 Dec 2017 19:24:28 +0000 (14:24 -0500)]
autoconf: refactor acinclude.m4
The acinclude.m4 is very large and often requires to be changed for
unrelated commits. Divy up the large acinclude.m4 into a number of
smaller files to avoid so many contentions and to make the autoconf
system easier to maintain.
This is a non-functional change. Care has been taken preserve the
ordering of the autoconf tests. Except for whitespace, the generated
configure file has not been changed by this refactoring. This has been
verified with a 'diff -u -w -B' comparison of the generated configure
file before and after applying this commit.
Reviewed-on: https://gerrit.openafs.org/12842 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
(cherry picked from commit c72622a244e561173e86ffe88ee3c9a8c823a76a)
Michael Meffie [Wed, 17 Jan 2018 22:33:50 +0000 (17:33 -0500)]
redhat: fix conditional for kernel-debuginfo files directive
Commit 443dd5367e0cd9050ad39a6594c5be521271b4e9 added support for a
separate debuginfo package for the kernel module. Unfortunately, the
%files directive for the kernel module debuginfo package was incorrectly
placed in the %if stanza of the build_userspace condition, so the
rpmbuild fails when attempting to build just the kernel module.
Fix this by moving the new %files directive out of the build_userspace
conditional.
Reviewed-on: https://gerrit.openafs.org/12874 Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Tested-by: Benjamin Kaduk <kaduk@mit.edu>
(cherry picked from commit f599e1ce6354c42a9c0c8f7205ba8a03c35ea72b)
Michael Meffie [Sat, 22 Jul 2017 02:30:43 +0000 (22:30 -0400)]
redhat: avoid rpmbuild exclude directives
Older versions of rpmbuild do not support the files exclude directive,
so fall back to the old way in which we remove the files to be excluded
and list the files to be included.
Reviewed-on: https://gerrit.openafs.org/12733 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
(cherry picked from commit a71288a387095ccb4be83c1abae34ada80f53185)
Michael Meffie [Thu, 20 Jul 2017 08:13:04 +0000 (04:13 -0400)]
redhat: specify man pages without wildcards
Currently, some of the man pages are specified with the full name and
some are specified with a wildcard for the filename extension. Instead,
specify all the man pages without a wildcards to be more consistent and
to avoid putting incorrect man pages in packages.
This change removes a stray copy the klog.krb5.1 man page from
openafs-kauth-client subpackage and moves the AuthLog/AuthLog.dir man
pages to the optional openafs-kauth-server subpackage.
Reviewed-on: https://gerrit.openafs.org/12731 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
(cherry picked from commit 671db4ca5a76625d9b7133510cc1cbdda8a5d9b9)
Mark Vitale [Fri, 1 Dec 2017 01:26:46 +0000 (20:26 -0500)]
LINUX: Avoid d_invalidate() during afs_ShakeLooseVCaches()
With recent changes to d_invalidate's semantics (it returns void in Linux 3.11,
and always returns success in RHEL 7.4), it has become increasingly clear that
d_invalidate() is not the best function for use in our best-effort
(nondisruptive) attempt to free up vcaches that is afs_ShakeLooseVCaches().
The new d_invalidate() semantics always force the invalidation of a directory
dentry, which contradicts our desire to be nondisruptive, especially when
that directory is being used as the current working directory for a process.
Our call to d_invalidate(), intended to merely probe for whether a dentry
can be discarded without affecting other consumers, instead would cause
processes using that dentry as a CWD to receive ENOENT errors from getcwd().
A previous commit (c3bbf0b4444db88192eea4580ac9e9ca3de0d286) tried to address
this issue by calling d_prune_aliases() instead of d_invalidate(), but
d_prune_aliases() does not recursively descend into children of the given
dentry while pruning, leaving it an incomplete solution for our use-case.
To address these issues, modify the shakeloose routine TryEvictDentries() to
call shrink_dcache_parent() and maybe __d_drop() for directories, and
d_prune_aliases() for non-directories, instead of d_invalidate(). (Calls to
d_prune_aliases() for directories have already been removed by reverting commit c3bbf0b4444db88192eea4580ac9e9ca3de0d286.)
Just like d_invalidate(), shrink_dcache_parent() has been around "forever"
(since pre-git v2.6.12). Also like d_invalidate(), it "walks" the parent
dentry's subdirectories and "shrinks" (unhashes) unused dentries. But unlike
d_invalidate(), shrink_dcache_parent() will not unhash an in-use dentry, and
has never changed its signature or semantics.
d_prune_aliases() has also been available "forever", and has also never changed
its signature or semantics. The lack of recursive descent is not an issue for
non-directories, which cannot have such children.
[kaduk@mit.edu: apply review feedback to fix locking and avoid extraneous
changes, and reword commit message]
Reviewed-on: https://gerrit.openafs.org/12830 Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Tested-by: BuildBot <buildbot@rampaginggeek.com>
(cherry picked from commit afbc199f152cc06edc877333f229604c28638d07)
Change-Id: I6d37e5584b57dcbb056385a79f67b92a363e08d2
Reviewed-on: https://gerrit.openafs.org/12851 Tested-by: BuildBot <buildbot@rampaginggeek.com> Tested-by: Benjamin Kaduk <kaduk@mit.edu> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Mark Vitale [Thu, 30 Nov 2017 22:56:13 +0000 (17:56 -0500)]
LINUX: consolidate duplicate code in osi_TryEvictDentries
The two stanzas for HAVE_DCACHE_LOCK are now functionally identical;
remove the preprocessor conditionals and duplicate code.
Minor functional change is incurrred for very old (before 2.6.38) Linux
versions that have dcache_lock; we are now obtaining the d_lock as well.
This is safe because d_lock is also quite old (pre-git, 2.6.12), and it
is a spinlock that's only held for checking d_unhashed. Therefore, it
should have negligible performance impact. It cannot cause deadlocks or
violate locking order, because spinlocks can't be held across sleeps.
Reviewed-on: https://gerrit.openafs.org/12792 Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Reviewed-by: Andrew Deason <adeason@dson.org> Tested-by: BuildBot <buildbot@rampaginggeek.com>
(cherry picked from commit 5076dfc14b980aed310f3862875d5e9919fa199d)
Mark Vitale [Thu, 30 Nov 2017 21:08:38 +0000 (16:08 -0500)]
LINUX: create afs_linux_dget() compat wrapper
For dentry operations that cover multiple dentry aliases of
a single inode, create a compatibility wrapper to hide differences
between the older dget_locked() and the current dget().
No functional change should be incurred by this commit.
Reviewed-on: https://gerrit.openafs.org/12789 Reviewed-by: Andrew Deason <adeason@dson.org> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Tested-by: BuildBot <buildbot@rampaginggeek.com>
(cherry picked from commit 74f4bfc627c836c12bb7c188b86d570d2afdcae8)
However, since that commit, several things have happened:
- RHEL 7.4 changed the semantics of d_invalidate() such that it
invalidates the cwd, but did NOT change the return type to void.
This broke our autoconf test for detecting the new semantics.
- Further research reveals that d_prune_aliases() was not the best
choice for replacing d_invalidate(). This is because for directories,
d_prune_aliases() doesn't invalidate dentries when they are referenced
by its children, and it doesn't walk the tree trying to invalidate
child dentries. So it can leave dentries dangling, if the only
references to thos dentries are via children.
Stephan Wiesand [Fri, 22 Dec 2017 13:40:32 +0000 (14:40 +0100)]
Linux 4.15: check for 2nd argument to pagevec_init
Linux 4.15 removes the distinction between "hot" and "cold" cache
pages, and pagevec_init() no longer takes a "cold" flag as the
second argument. Add a configure test and use it in osi_vnodeops.c .
Stephan Wiesand [Fri, 22 Dec 2017 13:17:09 +0000 (14:17 +0100)]
Linux: use plain page_cache_alloc
Linux 4.15 removes the distinction between "hot" and "cold" cache
pages, and no longer provides page_cache_alloc_cold(). Simply use
page_cache_alloc() instead, rather than adding yet another test.
Marcio Barbosa [Thu, 12 Oct 2017 15:42:40 +0000 (12:42 -0300)]
macos: make the OpenAFS client aware of APFS
Apple has introduced a new file system called APFS. Starting from High
Sierra, APFS replaces Mac OS Extended (HFS+) as the default file system
for solid-state drives and other flash storage devices.
The current OpenAFS client is not aware of APFS. As a result, the
installation of the current client into an APFS volume will panic the
machine.
To fix this problem, make the OpenAFS client aware of APFS.
Reviewed-on: https://gerrit.openafs.org/12743 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Mark Vitale <mvitale@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
(cherry picked from commit 6e57b22642bafb177e0931b8fb24042707d6d62f)
Benjamin Kaduk [Fri, 15 Dec 2017 01:54:57 +0000 (19:54 -0600)]
Fix macro used to check kernel_read() argument order
The m4 macro implementing the configure check is called
LINUX_KERNEL_READ_OFFSET_IS_LAST, but it defines a preprocessor symbol
that is just KERNEL_READ_OFFSET_IS_LAST. Our code needs to check
for the latter being defined, not the former.
Reported by Aaron Ucko.
Reviewed-on: https://gerrit.openafs.org/12808 Reviewed-by: Anders Kaseorg <andersk@mit.edu> Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
(cherry picked from commit edc5463f3db4b6af2307741d9f4ee8f2c81cd98e)
Benjamin Kaduk [Mon, 4 Dec 2017 23:20:57 +0000 (17:20 -0600)]
OPENAFS-SA-2017-001: rx: Sanity-check received MTU and twind values
Rather than blindly trusting the values received in the
(unauthenticated) ack packet trailer, apply some minmial sanity checks
to received values. natMTU and regular MTU values are subject to
Rx minmium/maximum packet sizes, and the transmit window cannot drop
below one without risk of deadlock.
The maxDgramPackets value that can also be present in the trailer
already has sufficient sanity checking.
Extremely low MTU values (less than 28 == RX_HEADER_SIZE) can cause us
to set a negative "maximum usable data" size that gets used as an
(unsigned) packet length for subsequent allocation and computation,
triggering an assertion when the connection is used to transmit data.
Benjamin Kaduk [Tue, 28 Nov 2017 04:17:28 +0000 (22:17 -0600)]
afs: Fix bounds check in PNewCell
Reported by the opensuse buildbot:
CC [M] /home/buildbot/opensuse-tumbleweed-i386-builder/build/src/libafs/MODLOAD-4.13.12-1-default-MP/rx_packet.o
/home/buildbot/opensuse-tumbleweed-i386-builder/build/src/afs/afs_pioctl.c: In function ‘PNewCell’:
/home/buildbot/opensuse-tumbleweed-i386-builder/build/src/afs/afs_pioctl.c:3075:55: error: ‘*’ in boolean context, suggest ‘&&’ instead [-Werror=int-in-bool-context]
if ((afs_pd_remaining(ain) < AFS_MAXCELLHOSTS +3) * sizeof(afs_int32))
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~
Benjamin Kaduk [Tue, 28 Nov 2017 04:07:53 +0000 (22:07 -0600)]
rx: fix call refcount leak in error case
The recent event handling normalization in commit 304d758983b499dc568d6ca57b6e92df24b69de8 had event handlers switch
to dropping their reference on the associated connection/call just
before return. An early return case was missed in the conversion,
leading to a refcount leak in an error case.
Reviewed-on: https://gerrit.openafs.org/12781 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Michael Meffie <mmeffie@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
(cherry picked from commit 66b74e78ba5fea6a8236dcd3b8b46e1dfa6a0ac7)
Change-Id: I532c49b2ef6ec95dd26a99c02e12ea53348f9690
Reviewed-on: https://gerrit.openafs.org/12783 Reviewed-by: Michael Meffie <mmeffie@sinenomine.net> Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Marcio Barbosa [Thu, 16 Nov 2017 22:24:03 +0000 (17:24 -0500)]
afs: fix kernel_write / kernel_read arguments
The order / content of the arguments passed to kernel_write and
kernel_read are not right. As a result, the kernel will panic if one of
the functions in question is called.
Michael Meffie [Mon, 6 Nov 2017 22:37:46 +0000 (17:37 -0500)]
tests: fix out of bounds access in the rx-event test
Use the NUMEVENTS symbol which defines the array size instead of an
incorrect hard coded number when checking if a second event can be added
to be fired at the same time. This fixes a potential out of bounds
access of the event test array.
Also update the comment which incorrectly mentions the incorrect number
of events in the test.
Reviewed-on: https://gerrit.openafs.org/12762 Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Tested-by: BuildBot <buildbot@rampaginggeek.com>
(cherry picked from commit 50a3eb7b7ee94bffaadc98429bd404164e89ec7f)
Change-Id: I7a975e7498c1c7416a800c9294c97ee4de4fd57a
Reviewed-on: https://gerrit.openafs.org/12779 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Michael Meffie <mmeffie@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Benjamin Kaduk [Thu, 16 Nov 2017 10:49:49 +0000 (04:49 -0600)]
Sprinkle rx_GetConnection() for concision
Instead of inlining the body (taking the lock, incrementing the
refcount, and dropping the lock), use the convenience function
designed for this purpose.
Reviewed-on: https://gerrit.openafs.org/12772 Reviewed-by: Mark Vitale <mvitale@sinenomine.net> Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
(cherry picked from commit 2ae84bf053fe66b73a2c77b5d71305bae2c17587)
Change-Id: I60794d877a76fbb7c8ba59207e710a20641cc8f1
Reviewed-on: https://gerrit.openafs.org/12778 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Michael Meffie <mmeffie@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Benjamin Kaduk [Sun, 8 Oct 2017 03:42:38 +0000 (22:42 -0500)]
Standardize rx_event usage
Go over all consumers of the rx event framework and normalize its usage
according to the following principles:
rxevent_Post() is used to create an event, and it returns an event
handle (with a reference on the event structure) that can be used
to cancel the event before its timeout fires. (There is also an
additional reference on the event held by the global event tree.)
In all(*) usage within the tree, that event handle is stored within
either an rx_connection or an rx_call. Reads/writes to the member variable
that holds the event handle require either the conn_data_lock or call
lock, respectively -- that means that in most cases, callers of
rxevent_Post() and rxevent_Cancel() will be holding one of those
aforementioned locks. The event handlers themselves will need to
modify the call/connection object according to the nature of the
event, which requires holding those same locks, and also a guarantee
that the call/connection is still a live object and has not been
deallocated! Whether or not rxevent_Cancel() succeeds in cancelling
the event before it fires, whenever passed a non-NULL event structure
it will NULL out the supplied pointer and drop a reference on the
event structure. This is the correct behavior, since the caller
has asked to cancel the event and has no further use for the event
handle or its reference on the event structure. The caller of
rxevent_Cancel() must check its return value to know whether or
not the event was cancelled before its handler was able to run.
The interaction window between the call/connection lock and the lock
protecting the red/black tree of pending events opens up a somewhat
problematic race window. Because the application thread is expected
to hold the call/connection lock around rxevent_Cancel() (to protect
the write to the field in the call/connection structure that holds
an event handle), and rxevent_Cancel() must take the lock protecting
the red/black tree of events, this establishes a lock order with the
call/connection lock taken before the eventTree lock. This is in
conflict with the event handler thread, which must take the eventTree
lock first, in order to select an event to run (and thus know what
additional lock would need to be taken, by virtue of what handler
function is to be run). The conflict is easy to resolve in the
standard way, by having a local pointer to the event that is obtained
while the event is removed from the red/black tree under the eventTree
lock, and then the eventTree lock can be dropped and the event run
based on the local variable referring to it. The race window occurs
when the caller of rxevent_Cancel() holds the call/connection lock,
and rxevent_Cancel() obtains the eventTree lock just after the event
handler thread drops it in order to run the event. The event handler
function begins to execute, and immediately blocks trying to obtain
the call/connection lock. Now that rxevent_Cancel() has the eventTree
lock it can proceed to search the tree, fail to find the indicated event
in the tree, clear out the event pointer from the call/connection
data structure, drop its caller's reference to the event structure,
and return failure (the event was not cancelled). Only then does the
caller of rxevent_Cancel() drop the call/connection lock and allow
the event handler to make progress.
This race is not necessarily problematic if appropriate care is taken,
but in the previous code such was not the case. In particular, it
is a common idiom for the firing event to call rxevent_Put() on itself,
to release the handle stored in the call/connection that could have
been used to cancel the event before it fired. Failing to do so would
result in a memory leak of event structures; however, rxevent_Put() does
not check for a NULL argument, so a segfault (NULL dereference) was
observed in the test suite when the race occurred and the event handler
tried to rxevent_Put() the reference that had already been released by
the unsuccessful rxevent_Cancel() call. Upon inspection, many (but not
all) of the uses in rx.c were susceptible to a similar race condition
and crash.
The test suite also papers over a related issue in that the event handler
in the test suite always knows that the data structure containing the
event handle will remain live, since it is a global array that is allocated
for the entire scope of the test. In rx.c, events are associated with
calls and connections that have a finite lifetime, so we need to take care
to ensure that the call/connection pointer stored in the event remains
valid for the duration of the event's lifecycle. In particular, even an
attempt to take the call/connection lock to check whether the corresponding
event field is NULL is fraught with risk, as it could crash if the lock
(and containing call/connection) has already been destroyed! There are
several potential ways to ensure the liveness of the associated
call/connection while the event handler runs, most notably to take care
in the call/connection destruction path to ensure that all associated
events are either successfully cancelled or run to completion before
tearing down the call/connection structure, and to give the pending event
its own reference on the associated call/connection. Here, we opt for
the latter, acknowledging that this may result in the event handler thread
doing the full call/connection teardown and delay the firing of subsequent
events. This is deemed acceptable, as pending events are for intentionally
delayed tasks, and some extra delay is probably acceptable. (The various
keepalive events and the challenge event could delay the user experience
and/or security properties if significantly delayed, but I do not believe
that this change admits completely unbounded delay in the event handler
thread, so the practical risk seems minimal.)
Accordingly, this commit attempts to ensure that:
* Each event holds a formal reference on its associated call/connection.
* The appropriate lock is held for all accesses to event pointers in
call/connection structures.
* Each event handler (after taking the appropriate lock) checks whether
it raced with rxevent_Cancel() and only drops the call/connection's
reference to the event if the race did not occur.
* Each event handler drops its reference to the associated call/connection
*after* doing any actions that might access/modify the call/connection.
* The per-event reference on the associated call/connection is dropped by
the thread that removes the event from the red/black tree. That is,
the event handler function if the event runs, or by the caller of
rxevent_Cancel() when the cancellation succeed.
* No non-NULL event handles remain in a call/connection being destroyed,
which would indicate a refcounting error.
(*) There is an additional event used in practice, to reap old connections,
but it is effectively a background task that reschedules itself
periodically, with no handle to the event retained so as to be able
to cancel it. As such, it is unaffected by the concerns raised here.
While here, standardize on the rx_GetConnection() function for incrementing
the reference count on a connection object, instead of inlining the
corresponding mutex lock/unlock and variable access.
In contrast to what was done on master, for the 1.8 branch we do not
force-enable refcount checking.
Reviewed-on: https://gerrit.openafs.org/12756 Reviewed-by: Mark Vitale <mvitale@sinenomine.net> Reviewed-by: Michael Meffie <mmeffie@sinenomine.net> Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
(cherry picked from commit 304d758983b499dc568d6ca57b6e92df24b69de8)
Benjamin Kaduk [Thu, 5 Oct 2017 04:03:44 +0000 (23:03 -0500)]
Adjust rx-event test to exercise cancel/fire race
We currently do not properly handle the case where a thread runs
rxevent_Cancel() in parallel with the event-handler thread attempting
to fire that event, but the test suite only picked up on this issue
in a handful of the Debian automated builds (somewhat less-resourced
ones, perhaps).
Modify the event scheduling algorithm in the test so as to create a
larger chunk of events scheduled to fire "right away" and thereby
exercise the race condition more often when we proceed to cancel
a quarter of events "right away".
Reviewed-on: https://gerrit.openafs.org/12755 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Mark Vitale <mvitale@sinenomine.net> Reviewed-by: Michael Meffie <mmeffie@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
(cherry picked from commit bdb509fb1d8e0fdca05dffecdbcbf60a95ea502e)
Change-Id: I27cebed3c2c3daff10b8d3f5f6f949e667791a72
Reviewed-on: https://gerrit.openafs.org/12774 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Michael Meffie <mmeffie@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Michael Laß [Thu, 2 Nov 2017 20:16:49 +0000 (21:16 +0100)]
gtx: link against libtinfo if termlib is seperated
If ncurses is built with "./configure --with-termlib=tinfo", gtx fails
to link because of an undefined reference to the LINES symbol which is
then provided by libtinfo.so and not libncurses.so.
If ncurses is present, additionally check whether LINES is provided by
ncurses or tinfo and set $LIB_curses accordingly.
This change is based on a patch provided by Bastian Beischer.
FIXES 134420
Reviewed-on: https://gerrit.openafs.org/12760 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
(cherry picked from commit 311f1d28a2f626350b33ad432e674055b62511bd)
Change-Id: I2f69fe51bbefeeb2a17145a88aa9c891644f2f61
Reviewed-on: https://gerrit.openafs.org/12763 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Michael Laß <lass@mail.uni-paderborn.de> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Linux: Use kernel_read/kernel_write when __vfs variants are unavailable
We hide the uses of set_fs/get_fs behind a macro, as those functions
are likely to soon become unavailable:
> Christoph Hellwig suggested removing all calls outside of the core
> filesystem and architecture code; Andy Lutomirski went one step
> further and said they should all go.
https://lwn.net/Articles/722267/
Reviewed-on: https://gerrit.openafs.org/12729 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Mark Vitale <mvitale@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
(cherry picked from commit 5ee516b3789d3545f3d78fb3aba2480308359945)
Change-Id: I28a7126bf6ab048f8d949f190e557a3fa44f3f46
Reviewed-on: https://gerrit.openafs.org/12737 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Michael Meffie <mmeffie@sinenomine.net> Reviewed-by: Mark Vitale <mvitale@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
unexports both __vfs_read and __vfs_write, but keeps the former in
fs.h--as it is is still being used by another part of the tree.
This situation results in a false positive in our Autoconf check,
which does not see the export statements, and ends up marking the
corresponding API as available.
That, in turn, causes some code which assumes symmetry with
__vfs_write to fail to compile.
Switch to testing for __vfs_write, which correctly marks the API as
unavailable.
Reviewed-on: https://gerrit.openafs.org/12728 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
(cherry picked from commit 929e77a886fc9853ee292ba1aa52a920c454e94b)
Change-Id: I03e3c8222360a6b04b45b45a8f56b5df054f6783
Reviewed-on: https://gerrit.openafs.org/12736 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Mark Vitale <mvitale@sinenomine.net> Reviewed-by: Michael Meffie <mmeffie@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Anders Kaseorg [Sat, 2 Sep 2017 03:37:07 +0000 (23:37 -0400)]
vol: Fix two buffers being one char too short
Fixes these warnings:
namei_ops.c: In function 'namei_copy_on_write':
namei_ops.c:1328:31: warning: 'snprintf' output may be truncated before the last format character [-Wformat-truncation=]
snprintf(path, sizeof(path), "%s-tmp", name.n_path);
^~~~~~~~
namei_ops.c:1328:2: note: 'snprintf' output between 5 and 260 bytes into a destination of size 259
snprintf(path, sizeof(path), "%s-tmp", name.n_path);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
vol_split.c: In function 'split_volume':
vol_split.c:576:22: warning: 'sprintf' may write a terminating nul past the end of the destination [-Wformat-overflow=]
sprintf(symlink, "#%s", V_name(newvol));
^~~~~
vol_split.c:576:5: note: 'sprintf' output between 2 and 33 bytes into a destination of size 32
sprintf(symlink, "#%s", V_name(newvol));
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Reviewed-on: https://gerrit.openafs.org/12722 Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Tested-by: BuildBot <buildbot@rampaginggeek.com>
(cherry picked from commit 0a9a6b57ce6e1c97fcc651c8cb74e66fc8422a1e)
Change-Id: Ia60439aed7925b786a0213d96a7afb413579e01f
Reviewed-on: https://gerrit.openafs.org/12723 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Michael Meffie <mmeffie@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Seth Forshee [Tue, 22 Aug 2017 12:59:11 +0000 (07:59 -0500)]
Linux: Include linux/uaccess.h rather than asm/uaccess.h if present
Starting with Linux 4.12 there is a module build error on s390
due to asm/uaccess.h using a macro defined in the common header.
The common header has been around since 2.6.18 and has always
included asm/uaccess.h, so switch to using the common header
whenever it is present.
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Reviewed-on: https://gerrit.openafs.org/12714 Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Tested-by: BuildBot <buildbot@rampaginggeek.com>
(cherry picked from commit 962f4838dc461567d896304f617a0923745d13d5)
Change-Id: I5a7834b982458159804bc4d940e39ef283253299
Reviewed-on: https://gerrit.openafs.org/12718 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Mark Vitale <mvitale@sinenomine.net> Reviewed-by: Michael Meffie <mmeffie@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Benjamin Kaduk [Wed, 2 Aug 2017 01:57:52 +0000 (20:57 -0500)]
Remove src/mcas
This lock-free library toolkit is intriguing and may be the subject
of future work, but such development will occur on the master branch,
and these files are just clutter on openafs-stable-1_8_x. Remove
them to give the tree a more clean start.
Remove src/mcas and stop mentioning it in SOURCE-MAP; don't reference
it in the rpctests, either.
Change-Id: I21b1b6b64a709fe40aa53aaf3470d128c0dc2f86
Reviewed-on: https://gerrit.openafs.org/12682 Tested-by: Benjamin Kaduk <kaduk@mit.edu> Reviewed-by: Mark Vitale <mvitale@sinenomine.net> Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Benjamin Kaduk [Wed, 2 Aug 2017 01:55:52 +0000 (20:55 -0500)]
Remove src/rxgk
These files were commited slightly prematurely to the tree; rxgk
support is intended for the 2.0 release, and will not appear in the
1.8.x release series.
Remove src/rxgk and drop mentions of rxgk from configure/Makefile.in/etc.
Change-Id: Ib7d40eaac85b05d920781b61f73dbdf8fedfcc2b
Reviewed-on: https://gerrit.openafs.org/12681 Tested-by: Benjamin Kaduk <kaduk@mit.edu> Reviewed-by: Mark Vitale <mvitale@sinenomine.net> Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
Michael Meffie [Fri, 14 Apr 2017 01:48:06 +0000 (21:48 -0400)]
redhat: kauth client and server sub-packages
Move the kaserver and kauth client programs to conditionally built
packages called openafs-kauth-server and openafs-kauth-client.
Packagers can build these by specifying '--with kauth'. They are not
built by default to discourage use.
This commit subsumes the openafs-kpasswd package into the
openafs-kauth-client package.
Change-Id: I1322f05d7fe11d466c9ed71a5059c21b759d95ab
Reviewed-on: https://gerrit.openafs.org/12600 Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Tested-by: Benjamin Kaduk <kaduk@mit.edu>
Michael Meffie [Mon, 10 Apr 2017 19:06:02 +0000 (15:06 -0400)]
redhat: do not package kauth by default
Do not package kaserver and related programs by default to discourage
use. Add the '--with kauth' rpmbuild option to allow packagers to
continue include the kauth programs for compatibility.
Change-Id: I8bf9f6dc221afc22ed6c9a33cf101d705e6c4920
Reviewed-on: https://gerrit.openafs.org/12597 Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Tested-by: Benjamin Kaduk <kaduk@mit.edu>
Benjamin Kaduk [Mon, 31 Jul 2017 01:57:05 +0000 (20:57 -0500)]
Default to crypt mode for unix clients
Though the protection offered by rxkad, even with rxkad-k5 and rxkad-kdf, is
insufficient to protect traffic from a determined attacker, it remains the
case that the internet is not a safe place for user data to travel in the
clear, and has not been for a long time. The Windows client encrypts by
default, and all or nearly all the Unix client packaging scripts set crypt
mode by default. Catch up to reality and default to crypt mode in the
Unix cache manager.
The current version does not have a corresponding LWP_WaitProcess call
for the beacon_globals.ubik_amSyncSite global. As a result, the
LWP_NoYieldSignal(&beacon_globals.ubik_amSyncSite) signal call can be
safely removed.
Change-Id: I72c4ccfe8e68551673dc728dd699ba8c561d76d1
Reviewed-on: https://gerrit.openafs.org/12673 Reviewed-by: Mark Vitale <mvitale@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Tested-by: BuildBot <buildbot@rampaginggeek.com>
Michael Meffie [Wed, 2 Aug 2017 00:10:32 +0000 (20:10 -0400)]
doc: relocate notes from arch to txt
The doc/txt directory has become the de facto home for text-based
technical notes. Relocate the contents of the doc/arch directory to
doc/txt. Relocate doc/examples to doc/txt/examples.
Update the doc/README file to be more current and remove old work in
progress comments.
ubik: update epoch as soon as sync-site is elected
The ubik_epochTime represents the time at which the coordinator first
received its coordinator mandate. However, this global is currently not
updated at the moment when a new sync-site is elected. Instead,
ubik_epochTime is only updated at the very end of the first write
transaction, when a new database label is written (in udisk_commit).
This causes at least 2 different issues:
For one, this means that we change ubik_epochTime while a remote
transaction is in progress. If VOTE_Beacon is called after
ubik_epochTime is updated, but before the remote transaction ends, the
remote sites will detect that the transaction id in ubik_currentTrans is
wrong (via urecovery_CheckTid(), since the epoch doesn't match), and
they will abort the transaction. This means the transaction will fail,
and it may cause a loss of quorum until another election is completed.
Another issue is that ubik_epochTime can be 0 at the beginning of a
write transaction, if this is the first election that this site has won.
Since ubik_epochTime is used to construct transaction ids, this means
that we can have different transactions that originate from different
sites at different times, but they have the same epoch in their tid.
For example, say a write transaction starts with epoch 0, but the
originating site is killed/interrupted before finishing. That write
transaction will linger on remote sites in ubik_currentTrans with an
epoch of 0 (since the originating site will never call
DISK_ReleaseLocks, or DISK_Abort, etc). Normally the sync site will kill
such a lingering transaction via urecovery_CheckTid, but since the epoch
is 0, and the election winner's epoch is also 0, the transaction looks
valid and may never be killed. If that transaction is holding a lock on
the database, this means that the database will forever remain locked,
effectively preventing any access to the db on that site.
To fix both of these issues, update ubik_epochTime with the current
time as soon as we win the election. This ensures that the epoch is not
updated in the middle of a transaction, and it ensures that all
transactions are created with a unique epoch: the epoch of the election
that we won.
Note that with this commit, we do not ever set ubik_epochTime to the
magic value of '2' during database init. The special '2' epoch only
needs to be set in the database itself, and it is never an actual epoch
that represents a real quorum that went through the election process.
The database will be labelled with a 'real' epoch after the first write,
like normal.
[kaduk@mit.edu: comment the locking strategy in ubeacon_Interact()]
Joe Gorse [Thu, 6 Jul 2017 19:47:24 +0000 (15:47 -0400)]
LINUX: afs_create infinite fetchStatus loop
For a file in a directory with the CStatd bit cleared, we can get
an infinite fetchStatus loop.
In afs_create(), afs_getDCache() may return NULL due to an error.
If unchecked it will loop which may produce multiple fetchStatus()
calls to the fileserver.
Credit: Yadav Yadavendra for identifying and analysing this issue.
Michael Meffie [Tue, 1 Aug 2017 21:21:13 +0000 (17:21 -0400)]
volser: preserve volume stats by default
Commit dfceff1d3a66e76246537738720f411330808d64 added the
-preserve-vol-stats flag to the volume server. This enabled a change in
the volume server to preserve volume usage statistics during reclone and
restore operations. Otherwise, volume usage counters of read-only
volumes are cleared when volumes are released, making it difficult to
track usage with the volume stats.
Make this feature the default behavior of the volume server and provide
the option -clear-vol-stats to use the old behavior if so desired. This
change makes the -preserve-vol-stats the default, and keeps it as a
hidden flag for sites which may already have that flag set in the
BosConfig.
Since this changes a default behavior of the volume server, this change
is only appropriate on a major or minor release boundary, not in the
middle of a stable series.
Marcio Barbosa [Mon, 22 May 2017 16:55:32 +0000 (12:55 -0400)]
ubik: avoid early DISK_Begin calls we know will fail
Currently, we can start a write transaction on a site immediately after
it is elected as the sync site. However, after commit d47beca1,
SDISK_Begin on remote sites will fail right after an election occurs
(since lastYesState is not set, and so urecovery_AllBetter will fail).
And after commit fac0b742, this error is always noticed and propagated
back to the application.
As a result, when we try to write immediately after a sync site is
elected, the transaction will fail with UNOQUORUM, the remote sites will
be marked as down, and we may lose quorum and require another election
to be performed. This can easily happen repeatedly for a site that
frequently tries to make changes to a ubik database.
To avoid marking other sites down and going through another election
process, do not allow write transactions until we know that lastYesState
is set on the remote sites. We do this by waiting until the next wave of
beacons are sent, which tell the remote sites that we are the sync site.
In other words, only allow write transactions after the sync site knows
that the remote sites also know that the sync site has been elected.
With this commit, a write transaction immediately after an election
will still fail with UNOQUORUM, but we avoid triggering an error on the
remote sites, and avoid losing quorum in this situation.
Change-Id: I9e1a76b4022e6d734af1165d94c12e90af04974d
Reviewed-on: https://gerrit.openafs.org/12592 Reviewed-by: Andrew Deason <adeason@dson.org> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Tested-by: Benjamin Kaduk <kaduk@mit.edu>
Marcio Barbosa [Wed, 21 Jun 2017 20:42:37 +0000 (17:42 -0300)]
ubik: allow remote dbase relabel if up to date
When a site is elected the sync-site, its database is not immediately
relabeled. The database in question will be relabeled at the end of the
first write transaction (in udisk_commit). To do so, the dbase->version
is updated on the sync-site first (1) and then the versions of the
remote sites are updated through SDISK_SetVersion() (2).
In order to make sure that the remote site holds the same database as
the sync-site, the SDISK_SetVersion() function checks if the current
version held by the remote site (ubik_dbVersion) is equal to the
original version stored by the sync-site (oldversionp). If
ubik_dbVersion is not equal to oldversionp, SDISK_SetVersion() will
fail with USYNC.
However, ubik_dbVersion can be updated by the vote thread at any time.
That is, if the sync site calls VOTE_Beacon() on the remote site between
events (1) and (2), the remote site will set ubik_dbVersion to the new
version, while ubik_dbase->version is still set to the old version. As
a result, ubik_dbVersion will not be equal to oldversionp and
SDISK_SetVersion() will fail with USYNC. This failure may cause a loss
of quorum until another election is completed.
To fix this problem, let SDISK_SetVersion() relabel the database when
ubik_dbase->version is equal to oldversionp. In order to try to only
affect the scenario described above, also check if ubik_dbVersion is
equal to newversionp.
Joe Gorse [Wed, 10 May 2017 15:38:25 +0000 (11:38 -0400)]
afs: fix repeated BulkStatus calls for directories.
There is a filetype comparison check in afs_DoBulkStat just after
BulkFetch RPC. This check will fail for directories even though
bulkStatus was done for directories.
This code is apparently necessary for Darwin, but it causes this problem
otherwise. Thus it is removed from the rest of the builds using the
AFS_DARWIN_ENV preprocessor variable.
Credit: Yadav Yadavendra for identifying and analysing this issue.