Marc Dionne [Tue, 3 Dec 2013 19:10:00 +0000 (14:10 -0500)]
Linux 3.13: Check return value from bdi_init
The use of the bdi_init function now gets a warning because the
return value is unused and the function is now defined with
the warn_unused_result attribute.
Assign and check the return value.
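As a rough userspace illustration of the pattern (not the actual kernel code), the sketch below marks a hypothetical demo_bdi_init() with the warn_unused_result attribute and then assigns and checks its return value. The function names and the error value are assumptions made for the example.

```c
#include <stdio.h>

/* Hypothetical stand-in for bdi_init(); the attribute makes GCC/Clang
 * warn whenever a caller discards the return value. */
__attribute__((warn_unused_result))
static int demo_bdi_init(int fail)
{
    return fail ? -12 : 0;   /* -12 mimics -ENOMEM; illustrative only */
}

/* Assign and check the result instead of ignoring it. */
static int demo_setup(int fail)
{
    int code = demo_bdi_init(fail);

    if (code != 0) {
        fprintf(stderr, "demo_bdi_init failed: %d\n", code);
        return code;
    }
    return 0;
}
```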
Reviewed-on: http://gerrit.openafs.org/10530 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Andrew Deason <adeason@sinenomine.net> Reviewed-by: Derrick Brashear <shadow@your-file-system.com>
(cherry picked from commit ccc5d3f7adceda4d8cf41f04fe02d5cfe376befd)
Change-Id: I2ccd9bbdce396a003030e3e09f9f6d75a1c4fa7c
Reviewed-on: https://gerrit.openafs.org/12274 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Mark Vitale <mvitale@sinenomine.net> Reviewed-by: Michael Meffie <mmeffie@sinenomine.net> Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de> Tested-by: Stephan Wiesand <stephan.wiesand@desy.de>
Benjamin Kaduk [Sun, 1 May 2016 23:48:40 +0000 (19:48 -0400)]
Linux 4.5: don't access i_mutex directly
Linux commit 5955102c, in preparation for future work, introduced
wrapper functions to lock/unlock inode mutexes. This is to
prepare for converting it to a read-write semaphore, so that
lookup can be done with only the shared lock held.
Adapt the afs_linux_*lock_inode() functions accordingly, and
convert afs_linux_fsync() to using those wrappers, since the
FOP_FSYNC_TAKES_RANGE case appears to be the current case.
Amusingly, afs_linux_*lock_inode() already have a branch to
handle the case when inode serialization is protected by a
semaphore; it seems that this is going to come full-circle.
Reviewed-on: https://gerrit.openafs.org/12268 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Joe Gorse <jhgorse@gmail.com> Tested-by: Joe Gorse <jhgorse@gmail.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
(cherry picked from commit 360f4ef53c454494cd5212a5ea46c658bdb2879c)
Change-Id: I52f29cdb6f0bf85bcbb6624ed62e071b1f3807c9
Reviewed-on: https://gerrit.openafs.org/12302 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de> Tested-by: Stephan Wiesand <stephan.wiesand@desy.de>
Linux 4.5: get_link instead of follow_link+put_link
In linux commit 6b255391, the follow_link inode operation was
replaced by the get_link operation, which is basically the same
but takes the inode and dentry separately, allowing for the
possibility of staying in RCU mode.
For now, only support this if page_get_link is available and we are
using USABLE_KERNEL_PAGE_SYMLINK_CACHE.
The previous test for USABLE_KERNEL_PAGE_SYMLINK_CACHE used a bogus,
undefined configure variable (ac_cv_linux_kernel_page_follow_link).
Remove it, as it was not needed.
Reviewed-on: https://gerrit.openafs.org/12265 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Joe Gorse <jhgorse@gmail.com> Tested-by: Joe Gorse <jhgorse@gmail.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
(cherry picked from commit 2ef27ea1bb032cee8d26980e60e02b52a0805763)
Change-Id: I828823ad16f24bae583de9cf436844565217918d
Reviewed-on: https://gerrit.openafs.org/12301 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de> Tested-by: Stephan Wiesand <stephan.wiesand@desy.de>
Michael Meffie [Thu, 27 Aug 2015 17:06:05 +0000 (13:06 -0400)]
afs: shake harder in shake-loose-vcaches
Linux based cache managers will allocate vcaches on demand and
deallocate batches of vcaches in the background. This feature is called
dynamic vcaches.
Vcaches to be deallocated are found by traversing the vcache LRU list
(VLRU) from the oldest vcache to the newest. The traversal attempts to
evict up to a target number of vcaches. The afs_xvcache lock protecting
the VLRU may be dropped and re-acquired while attempting to evict a
vcache. When this happens, it is possible the VLRU may have changed, so
the traversal of the VLRU is restarted. This restarting of the VLRU
traversal is limited to 100 iterations to avoid looping indefinitely.
Vcaches which are busy cannot be evicted and remain in the VLRU. When a
busy vcache was not evicted and the afs_xvcache lock was dropped, the VLRU
traversal is restarted from the end of the VLRU. When the busy vcache is
encountered on the retry, it will trigger additional retries until the
loop limit is reached, at which point the target number of vcaches will
not be deallocated.
This can leave a very large number of unbusy vcaches which are never
deallocated. On a busy machine, tens of millions of unused vcaches can
remain in memory. When the busy vcache at the end of the VLRU is finally
evicted, the log jam is broken, and the background daemon will hold the
afs_xvcache lock for an excessively long time, hanging the system.
Fix this by moving busy vcaches to the head of the VLRU before
restarting the VLRU traversal. These busy vcaches will be skipped when
retrying the VLRU traversal, allowing the cache manager to make progress
deallocating vcaches down to the target level.
This was already done on the Mac OS X platform while attempting to evict
vcaches. Move the code to move busy vcaches to the head of the VLRU up
to the platform-agnostic caller.
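The effect of the fix can be sketched with a small userspace model. Everything here is hypothetical (struct demo_vc, demo_shake, and the assumption that every busy vcache forces a restart); it only shows why moving a busy entry to the head of the list lets the traversal make progress before the 100-iteration limit is hit.

```c
#include <stddef.h>

struct demo_vc {
    struct demo_vc *prev, *next;  /* list links; lru->next is the newest */
    int busy;                     /* busy entries cannot be evicted */
    int evicted;
};

static void demo_remove(struct demo_vc *vc)
{
    vc->prev->next = vc->next;
    vc->next->prev = vc->prev;
}

static void demo_add_head(struct demo_vc *lru, struct demo_vc *vc)
{
    vc->next = lru->next;
    vc->prev = lru;
    lru->next->prev = vc;
    lru->next = vc;
}

/* Walk from the oldest entry toward the newest and return how many
 * entries were evicted. 'lru' is a sentinel node. */
static int demo_shake(struct demo_vc *lru, int target, int move_busy)
{
    int evicted = 0, loops = 0;
    struct demo_vc *vc, *prev;

 restart:
    for (vc = lru->prev; vc != lru && evicted < target; vc = prev) {
        prev = vc->prev;
        if (!vc->busy) {
            demo_remove(vc);
            vc->evicted = 1;
            evicted++;
            continue;
        }
        /* Assumption: the lock was dropped for this busy entry, so the
         * list may have changed and the walk must start over. */
        if (move_busy) {
            demo_remove(vc);
            demo_add_head(lru, vc);   /* skipped on the next pass */
        }
        if (++loops > 100)
            break;                    /* same loop limit as described above */
        goto restart;
    }
    return evicted;
}
```

With a single busy entry at the tail, calling demo_shake() with move_busy set to 0 burns all of its retries on that entry and evicts nothing, while move_busy set to 1 reaches the target.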
Thanks to Andrew Deason for the initial version of this patch.
Reviewed-on: https://gerrit.openafs.org/11654 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Andrew Deason <adeason@dson.org> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
(cherry picked from commit 5c136c7d93ed97166f39bf716cc7f5d579b70677)
Michael Meffie [Thu, 25 Feb 2016 23:49:20 +0000 (18:49 -0500)]
LINUX: hold vcache while dropping dcache refs
Hold a reference on a vcache while attempting to evict the inode from
the dcache. Since the afs_xvcache lock is dropped, it could be possible
for the vcache to be flushed during this time, making it unsafe to use
the vcache after the eviction attempt.
Reviewed-on: https://gerrit.openafs.org/12206 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Andrew Deason <adeason@dson.org> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
(cherry picked from commit 961875cbedc2c91cdba6dc34a43c6136ea9797fb)
Andrew Deason [Sun, 12 Apr 2015 01:51:09 +0000 (20:51 -0500)]
afs: Log abnormally large chunk files
Any chunk in our cache for a regular file should be smaller than or
equal to our configured chunksize. If someone sets a chunk to be
larger than that, it is very strange and may cause other confusing
issues. Specifically, afs_DoPartialWrite determines if our cache is
"too full" by counting the number of dirty chunks. If we have a dirty
chunk that is much larger than the chunksize, it can throw off the
afs_DoPartialWrite calculation.
This is only true for dcaches backing regular files, though. For
directories, we fetch the entire directory into a single chunk file,
and the size of a directory blob can easily exceed the chunksize
without issues. The aforementioned issue with afs_DoPartialWrite does
not apply, since directory chunks cannot be dirty (we only locally
modify the chunk if we modify the dir on the server, and the DVs
match).
Anyway, it should not be possible to get a chunk for a regular file
larger than the chunksize. Log a message if it does occur, to help
anyone tracking down issues when this happens.
[mmeffie@sinenomine.net remove unnecessary casts in afs_warn args.]
Reviewed-on: http://gerrit.openafs.org/11831 Reviewed-by: Michael Meffie <mmeffie@sinenomine.net> Reviewed-by: Chas Williams <3chas3@gmail.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Tested-by: BuildBot <buildbot@rampaginggeek.com>
(cherry picked from commit 11845765c75a2f15404ac55a882358c3f88595b9)
Change-Id: I7c9f4aa147ba63e51bb805484bac5785259847cb
Reviewed-on: https://gerrit.openafs.org/12216 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Michael Meffie <mmeffie@sinenomine.net> Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de>
Andrew Deason [Fri, 10 Apr 2015 02:26:25 +0000 (21:26 -0500)]
afs: Log weird 'size' fetchdata errors
There are a couple of situations that should never happen when issuing
a fetchdata, but cause errors when they do:
- The fileserver responds with more than 2^32 bytes of data
- The fileserver responds with more data than requested (but still
smaller than 2^32)
While these should normally never be encountered, it can be very
confusing when they are, since they cause file fetches to fail. To give
the user or investigating developer some hope of figuring out what is
going on, log a warning in these situations, to at least indicate
that this is the area in which something is breaking.
Only log these once, in case something causes these conditions to be
hit, e.g., every fetch. Once is at least enough to say this is
happening.
[mmeffie@sinenomine.net remove unneeded casts in afs_warn args and
explicit static initializers.]
Reviewed-on: http://gerrit.openafs.org/11830 Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Tested-by: BuildBot <buildbot@rampaginggeek.com>
(cherry picked from commit 5fbf45b56298aa5a93cf9015f2d6346c7a0f615c)
Change-Id: I2f15255f33f44bef038ac9926d1ed47eca73d89a
Reviewed-on: https://gerrit.openafs.org/12215 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Michael Meffie <mmeffie@sinenomine.net> Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de>
Andrew Deason [Wed, 8 Apr 2015 03:10:53 +0000 (22:10 -0500)]
afs: Fix fetchInit for negative/large lengths
Currently, the 'length64' variable in rxfs_fetchInit is almost
completely unused (it just goes into an icl logging function). For the
length that we actually use ('*alength'), we just take the lower 32
bits of the length that the fileserver told us. This method is
incorrect in at least the following cases:
- If the fileserver returns a length that is larger than 2^32-1,
we'll just take the lower 32 bits of the 64-bit length the
fileserver told us about. The client currently never requests a
fetch larger than 2^32-1, so this would be an error, but if this
occurred, we would not detect it until much later in the fetch.
- If the fileserver returns a length that is larger than 2^31-1, but
smaller than 2^32, we'll interpret the length as negative (which we
assume is just 0, due to bugs in older fileservers). This is also
incorrect.
- If the fileserver returns a negative length smaller than -2^31+1,
we may interpret the given length as a positive value instead of a
negative one. Older fileservers can do this if we fetch data beyond
the file's EOF (this was fixed in the fileserver in commit 529d487d65d8561f5d0a43a4dc71f72b86efd975). This positive length
will cause an error (usually), instead of proceeding without error
(which is what would happen if we correctly interpreted the length
as negative).
On Solaris, this can manifest as a failed write, when writing to a
location far beyond the file's EOF from the fileserver's point of
view, because Solaris writes can trigger a fetch for the same area.
Seeking to a location far beyond the file's EOF and writing can
trigger this, as can a normal copy into AFS, if the file is large
enough and the cache is large enough. To explain in more detail:
When copying a file into AFS, the cache manager will buffer the dirty
data in the disk cache until the file is synced/closed, or we run out
of cache space. While this data is buffering, the application will
write into an offset, say, 3GiB into the file. On Solaris, this can
trigger a read for the same region, which will trigger a fetch from
the fileserver at the offset 3GiB into the file. If the fileserver
does not contain the fix in commit 529d487d65d8561f5d0a43a4dc71f72b86efd975, it will respond with a large
negative number, which we interpret as a large positive number; much
larger than the requested length. This will cause the fetch to fail,
which then causes the whole write() call to fail. Specifically this
will fail with EINVAL on Solaris, since that is the error code we
return from afs_GetOnePage when we fail to acquire a dcache. If the
cache is small enough, this will not happen, since we will flush data
to the fileserver before we have a large amount of dirty data,
e.g., 3GiB. (The actual error occurs closer to 2GiB, but this is just
for illustrative purposes.)
To fix this, detect the various ranges of values mentioned above, and
handle them specially. Lengths that are too large will yield an error,
since we cannot handle values over 2^31-1 in the rxfs_* framework
currently.
For lengths that are negative, just act as if we received a length of
0. Do this for both the 64-bit codepath and the non-64-bit codepath,
just so they remain identical.
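The range handling described above might look roughly like the following sketch. demo_fetch_length() and DEMO_MAX_LEN are hypothetical names, and the return convention is an assumption; the point is only that the comparisons are done on the full 64-bit value before anything is narrowed to 32 bits.

```c
#include <stdint.h>

#define DEMO_MAX_LEN 0x7fffffffLL   /* 2^31-1, the largest length handled */

/* Returns 0 and sets *alength on success, or -1 if the fileserver's
 * length is too large to handle. */
static int demo_fetch_length(int64_t length64, int32_t *alength)
{
    if (length64 < 0) {
        *alength = 0;       /* negative length: act as if we received 0 */
        return 0;
    }
    if (length64 > DEMO_MAX_LEN)
        return -1;          /* too large: yield an error */
    *alength = (int32_t)length64;
    return 0;
}
```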
[mmeffie@sinenomine.net: directly use 64 bit comparisons, don't mask
end call error code, commit nits.]
Reviewed-on: http://gerrit.openafs.org/11829 Reviewed-by: Chas Williams <3chas3@gmail.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Tested-by: BuildBot <buildbot@rampaginggeek.com>
(cherry picked from commit c0f52c3a3d76059c9d8b2df3374df844d8d6861b)
Change-Id: If6b9debe3f6381634b15be4529931422d908c2aa
Reviewed-on: https://gerrit.openafs.org/12214 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Michael Meffie <mmeffie@sinenomine.net> Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de>
Most of the time, this is fine. However, if 'position' is more than
2GiB greater than file_length, 'size' will be calculated to be smaller
than -2GiB. Since 'size' in this code is a signed 32-bit integer, this
can cause 'size' to underflow, and result in a value closer to
(positive) 2GiB.
This has two potential effects:
The afs_AdjustSize call in afs_GetDCache will cause the underlying
cache file for this dcache to be very large (if our offset is around
2GiB larger than the file size). This can confuse other parts of the
client, since our cache usage reporting will be incorrect (and can be
even way larger than the max configured cache size).
This will also cause a read request to the fileserver that is larger
than necessary. Although 'size' will be capped at our chunksize, it
should be 0 in this situation, since we know there is no data to
fetch. At worst, this currently can just result in worse performance
in rare situations, but it can also just be very confusing.
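The wraparound can be reproduced in a small standalone sketch. The helper names are hypothetical, and demo_size_clamped() is just one possible guard, not necessarily what this change does. The narrowing is written out by hand so that the example does not rely on implementation-defined conversions.

```c
#include <stdint.h>

/* Reduce a 64-bit value to 32-bit two's complement, the way a signed
 * 32-bit 'size' would end up holding it. */
static int32_t demo_wrap32(int64_t v)
{
    uint32_t low = (uint32_t)v;     /* well defined: v modulo 2^32 */

    if (low <= (uint32_t)INT32_MAX)
        return (int32_t)low;
    return (int32_t)(low - 0x80000000u) + INT32_MIN;
}

/* file_length - position, squeezed into 32 bits. */
static int32_t demo_size32(int64_t file_length, int64_t position)
{
    return demo_wrap32(file_length - position);
}

/* Possible guard: do the arithmetic in 64 bits, treat "beyond EOF" as 0,
 * and cap at the chunksize. */
static int32_t demo_size_clamped(int64_t file_length, int64_t position,
                                 int32_t chunksize)
{
    int64_t size64 = file_length - position;

    if (size64 < 0)
        size64 = 0;
    if (size64 > chunksize)
        size64 = chunksize;
    return (int32_t)size64;
}
```

For an empty file and a position of 2GiB plus 1MiB, demo_size32() yields 2146435072 (just under 2GiB), whereas demo_size_clamped() yields 0.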
Note that an afs_GetDCache request beyond EOF can currently happen in
non-race conditions on at least Solaris when performing a file write.
For example, with a chunksize of 256KiB, something like this will
trigger the overflow in 'size' in most cases:
Michael Meffie [Tue, 16 Dec 2014 21:13:01 +0000 (16:13 -0500)]
vlserver: do not perform ChangeAddr on mh entries, except for removal
Fix a long-standing bug in the ChangeAddr RPC which damages the vldb.
When vos changeaddr is run with -oldaddr and -newaddr, and the -oldaddr
is present in a multi-homed entry, instead of changing the address in
the mh entry, the server slot is "downgraded" to a single homed entry
and the mh entry is orphaned in the vldb.
Instead, if the -oldaddr is in a multi-home entry, refuse to change the
address with a VL entry not found error and log the event.
Multi-homed addresses can be changed manually using the vos setaddrs
command which calls the RegisterAddrs() RPC.
Reviewed-on: http://gerrit.openafs.org/11639 Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Reviewed-by: Daria Brashear <shadow@your-file-system.com> Tested-by: BuildBot <buildbot@rampaginggeek.com>
(cherry picked from commit 1cc77cd43732cca1c617db329a71693903d2b699)
Change-Id: I14a77317d582dd1cb8490e643b8fdfc86f4942c0
Reviewed-on: https://gerrit.openafs.org/12089 Reviewed-by: Mark Vitale <mvitale@sinenomine.net> Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Michael Meffie <mmeffie@sinenomine.net> Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de>
Stephan Wiesand [Thu, 7 Apr 2016 08:58:30 +0000 (10:58 +0200)]
Linux: Fix misleading indentation and other whitespace
Commit 7edc6694e7632c9736bd1516935604a638165313 introduced a
misleading indentation of a line in afs_linux_prefetch. Correct
it, and while here remove trailing whitespace throughout the file.
Reviewed-on: https://gerrit.openafs.org/12253 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
(cherry picked from commit 3609ebcfa3f70ca7612364c0cc2345b1d7f1096b)
Change-Id: I0d42c6751b835308c692c0ebb7d217f56ad5cf2a
Reviewed-on: https://gerrit.openafs.org/12254 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Michael Meffie <mmeffie@sinenomine.net> Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de>
Jeff Blaine [Thu, 19 May 2011 01:46:52 +0000 (21:46 -0400)]
Hide -noexecute in favor of -dryrun
Makes all previous -noexecute arguments hidden (still callable)
and replaces them with -dryrun whose help text has been made
common where appropriate instead of the 3 previous ways the
argument was explained.
Marcio Barbosa [Tue, 29 Dec 2015 13:31:43 +0000 (10:31 -0300)]
afs: do not allow two shutdown sequences in parallel
Often, ‘afsd -shutdown’ is called right after ‘umount’.
Both commands hold the glock before calling ‘afs_shutdown’.
However, one of the functions called by 'afs_shutdown', namely,
‘afs_FlushVCBs’, might drop the glock when the global
'afs_shuttingdown' is still equal to 0. As a result, a scenario
with two shutdown sequences proceeding in parallel is possible.
To fix the problem, the global ‘afs_shuttingdown’ is used as an
enumerated type to make sure that the second thread will not run
‘afs_shutdown’ while the first one is stuck inside ‘afs_FlushVCBs’.
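A minimal sketch of the idea, with hypothetical state names and a counter standing in for the real shutdown work: the state is advanced past "running" before any step that might drop the lock, so a second caller sees that a shutdown is already in progress and backs off.

```c
/* Hypothetical states; the real values used by the cache manager may differ. */
enum demo_shutdown_state {
    DEMO_RUNNING = 0,
    DEMO_SHUTTING_DOWN,
    DEMO_SHUTDOWN_DONE
};

static enum demo_shutdown_state demo_shuttingdown = DEMO_RUNNING;
static int demo_sequences_run;

/* Assumed to be called with the global lock held. Returns 1 if this
 * caller ran the shutdown sequence, 0 if another one already owns it. */
static int demo_shutdown(void)
{
    if (demo_shuttingdown != DEMO_RUNNING)
        return 0;

    /* Mark the state before any step that may drop the lock. */
    demo_shuttingdown = DEMO_SHUTTING_DOWN;
    demo_sequences_run++;       /* stand-in for the flush/teardown work */
    demo_shuttingdown = DEMO_SHUTDOWN_DONE;
    return 1;
}
```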
Reviewed-on: http://gerrit.openafs.org/12016 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Chas Williams <3chas3@gmail.com> Reviewed-by: Michael Meffie <mmeffie@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
(cherry picked from commit 70fd9bc6dcc79cb25e98cdcfd0f085c4bf4f310a)
Change-Id: I073d1914a7daa858a78305ff154074f2a51a9f5f
Reviewed-on: https://gerrit.openafs.org/12179 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Marcio Brito Barbosa <mbarbosa@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de>
"Local" links to section heads inside the same pod page should be written
L</OPTIONS> instead of L<OPTIONS>. The other broken links are assorted
typos and capitalization changes.
Note: This crash was exposed by other bugs (to be addressed in future
commits) in OpenAFS large volume support. However, there may
be other failure paths (unrelated to large volumes) that expose
this error as well.
When VAllocVnode() must allocate a new vnode but fails while
updating the vnode index file (e.g. an "addled bitmap" due to other
bugs in working with a vnode index larger than 2^31 bytes), it branches
to common recovery logic at label error_encountered:.
Part of this recovery is to call VFreeBitmapEntry_r(). Commit 08ffe3e81d875b58ae5fe4c5733845d5132913a0 added a VOL_FREE_BITMAP_WAIT
flag to VFreeBitmapEntry() in order to prevent races with VAllocBitmapEntry().
If the caller specifies VOL_FREE_BITMAP_WAIT, VFreeBitmapEntry_r will
call VCreateReservation_r() and VWaitExclusiveState_r(). However, the
exit from VFreeBitmapEntry_r() calls VCancelReservation_r() unconditionally.
This works correctly with the majority of callers to VFreeBitmapEntry_r,
which do specify the VOL_FREE_BITMAP_WAIT flag.
However, the VAllocVnode() error_encountered logic must specify 0 for
this flag because the thread is already in an exclusive state
(VOL_STATE_VNODE_ALLOC). This correctly causes VFreeBitmapEntry_r() to
forgo both the reservation and wait-for-exclusive-state. However, before
exit it erroneously calls VCancelReservation_r(). We now have unbalanced
reservations (nWaiters); this causes an assert when the VAllocVnode()
error_encountered recovery code later calls VCancelReservation_r()
for what it believes is its own prior reservation.
Modify VFreeBitmapEntry_r() to make its final VCancelReservation_r()
conditional on flag VOL_FREE_BITMAP_WAIT.
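The balancing rule can be shown with a toy model. struct demo_volume, DEMO_FREE_BITMAP_WAIT and the bare counter are stand-ins invented for the example; the real functions do considerably more than adjust nWaiters.

```c
#define DEMO_FREE_BITMAP_WAIT 0x1   /* stand-in for VOL_FREE_BITMAP_WAIT */

struct demo_volume {
    int nWaiters;                   /* outstanding reservations */
};

static void demo_free_bitmap_entry(struct demo_volume *vp, int flags)
{
    if (flags & DEMO_FREE_BITMAP_WAIT)
        vp->nWaiters++;             /* create a reservation (and wait) */

    /* ... the bitmap bit would be cleared here ... */

    if (flags & DEMO_FREE_BITMAP_WAIT)
        vp->nWaiters--;             /* cancel only what was created above */
}
```

A caller that already holds its own reservation and passes 0 for the flags now finds nWaiters unchanged on return.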
Reviewed-on: http://gerrit.openafs.org/11983 Reviewed-by: Perry Ruiter <pruiter@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Reviewed-by: Michael Meffie <mmeffie@sinenomine.net> Reviewed-by: Jeffrey Altman <jaltman@auristor.com> Tested-by: BuildBot <buildbot@rampaginggeek.com>
(cherry picked from commit d833ba768064a32a19c6b0b94ffb0d8a3a40a089)
Change-Id: Ia146ca55b1c0497d475357e61eaeb061a11bd597
Reviewed-on: https://gerrit.openafs.org/12209 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Mark Vitale <mvitale@sinenomine.net> Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de>
Commit a14e791541bf19c6c377e68bc2f978fba34f94b1
refactored and corrected the counting of requests and aborts.
However, it inadvertently introduced a new undercount for
VL_GetEntryByName* requests, counting them only if
NameIsId(volname), e.g. volname="536870911".
Ensure that the normal case of a non-"numeric" volname is
also counted.
Discovered during review of pullup to 1.6.x.
Reviewed-on: http://gerrit.openafs.org/12106 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
(cherry picked from commit 670381aa5d3a7bc91ad74c7499605cca2c33d612)
Change-Id: Ic41f8775e4897efe5f6280b56d06d733865556a2
Reviewed-on: https://gerrit.openafs.org/12113 Reviewed-by: Mark Vitale <mvitale@sinenomine.net> Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de>
Simon Wilkinson [Thu, 19 May 2011 14:06:15 +0000 (15:06 +0100)]
vlserver: Tidy up request counting
Tidy up the counting of requests and aborts in the vlserver. Don't
hide a variable allocation within a macro, convert macros to inline
functions, and make it possible to not count particular operations
by passing in an opcode of 0.
Michael Meffie [Fri, 30 Jan 2015 17:20:10 +0000 (12:20 -0500)]
volser: detect eof in dump stream while reading acl
Detect an EOF condition while reading the ACL in a dump stream
and return a restore error, instead of filling the ACL with
0xFF and then failing the restore due to an invalid tag.
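As an illustration only (using stdio rather than the volume server's own dump stream routines), a reader that checks for EOF on every byte might look like this. demo_read_exact() is a hypothetical helper.

```c
#include <stdio.h>

/* Read exactly n bytes into buf. Returns 0 on success, or -1 if the
 * stream ends (or fails) first. */
static int demo_read_exact(FILE *in, unsigned char *buf, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++) {
        int c = getc(in);

        if (c == EOF)
            return -1;  /* report it; do not store EOF as a 0xFF byte */
        buf[i] = (unsigned char)c;
    }
    return 0;
}
```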
Reviewed-on: http://gerrit.openafs.org/11703 Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Tested-by: BuildBot <buildbot@rampaginggeek.com>
(cherry picked from commit ed52d65fe98549e13023e0a8997da479b626085a)
Change-Id: I9aacd635b8bbf89923db0121639d5112ab775c19
Reviewed-on: https://gerrit.openafs.org/12185 Reviewed-by: Mark Vitale <mvitale@sinenomine.net> Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de>
Benjamin Kaduk [Sun, 22 Nov 2015 20:23:49 +0000 (14:23 -0600)]
cellconfig: check for invalid dotted quads
IP addresses entered into the CellServDB with components larger
than 255 would silently be truncated down to 8-bit unsigned integer
representations. This could cause confusing behavior with
occasional hangs.
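A check of this kind could be sketched as below. demo_parse_quad() is a hypothetical helper built on sscanf(), not the cellconfig parser itself, and it assumes each component is short enough to fit in an unsigned int.

```c
#include <stdio.h>
#include <stdint.h>

/* Parse "a.b.c.d" into a host-order 32-bit address. Returns 0 on
 * success, or -1 if the string is malformed or a component exceeds 255. */
static int demo_parse_quad(const char *s, uint32_t *addr)
{
    unsigned int a, b, c, d;
    char extra;

    /* Exactly four numbers and no trailing characters. */
    if (sscanf(s, "%u.%u.%u.%u%c", &a, &b, &c, &d, &extra) != 4)
        return -1;
    if (a > 255 || b > 255 || c > 255 || d > 255)
        return -1;      /* would otherwise be cut down to 8 bits */

    *addr = ((uint32_t)a << 24) | (b << 16) | (c << 8) | d;
    return 0;
}
```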
FIXES 131794
Reviewed-on: http://gerrit.openafs.org/12109 Reviewed-by: Chas Williams <3chas3@gmail.com> Reviewed-by: Michael Meffie <mmeffie@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Tested-by: BuildBot <buildbot@rampaginggeek.com>
(cherry picked from commit 97150150e6d12cbbc0c4a5af3424c9bf1e56918c)
Change-Id: I4e628ab7e12e33b23cc513a268879de115ddec2e
Reviewed-on: https://gerrit.openafs.org/12210 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Mark Vitale <mvitale@sinenomine.net> Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de>
Michael Meffie [Fri, 30 Jan 2015 17:12:03 +0000 (12:12 -0500)]
volser: range check acl header fields during dumps and restores
Perform range checks on the acl header fields when reading an
acl from a dump stream and when writing an acl to a dump
stream.
Before this change, a bogus value in the total, positive, or
negative acl fields from a dump stream could cause an out of
bounds access of the acl entries table, crashing the volume
server.
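A simplified version of such a check is sketched here. The struct layout and the DEMO_ACL_MAXENTRIES limit are assumptions made for the example, not the volume server's actual definitions.

```c
#define DEMO_ACL_MAXENTRIES 20      /* hypothetical size of the entries table */

struct demo_acl_header {
    int total;
    int positive;
    int negative;
};

static int demo_in_range(int v)
{
    return v >= 0 && v <= DEMO_ACL_MAXENTRIES;
}

/* Returns 1 if every header field could safely index the entries table. */
static int demo_acl_header_ok(const struct demo_acl_header *h)
{
    return demo_in_range(h->total)
        && demo_in_range(h->positive)
        && demo_in_range(h->negative);
}
```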
Reviewed-on: http://gerrit.openafs.org/11702 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Mark Vitale <mvitale@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
(cherry picked from commit 0bf9fba458b39035a09f45c1b63f1e65672d4c00)
Change-Id: Icebeb1d62900a7978f02177627a30e41de49a182
Reviewed-on: https://gerrit.openafs.org/12127 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Michael Meffie <mmeffie@sinenomine.net> Tested-by: Michael Meffie <mmeffie@sinenomine.net> Reviewed-by: Mark Vitale <mvitale@sinenomine.net> Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de>
Michael Meffie [Wed, 24 Feb 2016 21:57:11 +0000 (16:57 -0500)]
LINUX: ifconfig is deprecated
ifconfig is deprecated and is no longer installed by default on RHEL 7 and
CentOS 7. Use the replacement ip command in the init script for Linux.
Fall back to ifconfig in the event the ip command is not available.
Thanks to Ben Kaduk for pointing out the hash built-in command.
Reviewed-on: http://gerrit.openafs.org/12192 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Chas Williams <3chas3@gmail.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
(cherry picked from commit b702ab5da216976ed01ad3b1c474ecd4cc522ff2)
Change-Id: I9ffdfee233555f1e06bc4f980e2905851224ecc9
Reviewed-on: https://gerrit.openafs.org/12193 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Mark Vitale <mvitale@sinenomine.net> Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de>
Benjamin Kaduk [Sun, 22 Nov 2015 19:24:43 +0000 (13:24 -0600)]
volser: set error, not code, before rfail
The rfail cleanup handler overwrites 'code' ~unconditionally, but
does use an existing 'error' value if present. Since the intent
is to return failure to the caller, preserve the code in the error
variable and do so.
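The pattern can be illustrated with a reduced sketch. demo_op() and demo_end_transaction() are hypothetical; the cleanup label mirrors the described behavior, where 'code' is overwritten by the cleanup call but a previously set 'error' wins.

```c
static int demo_cleanup_calls;

/* Stand-in for the cleanup RPC; it succeeds, which is what would hide
 * a failure stored only in 'code'. */
static int demo_end_transaction(void)
{
    demo_cleanup_calls++;
    return 0;
}

static int demo_op(int fail)
{
    int code, error = 0;

    if (fail) {
        error = 5;      /* set 'error', not 'code', before jumping */
        goto rfail;
    }
    return 0;

 rfail:
    code = demo_end_transaction();  /* overwrites 'code' */
    if (error)
        code = error;               /* an existing 'error' is preserved */
    return code;
}
```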
Benjamin Kaduk [Mon, 23 Nov 2015 00:22:58 +0000 (18:22 -0600)]
Fix optimized IRIX kernel module builds
Commit 9f94892f8d996a522e7801ef6088a13769bee7c2 (from 2006)
introduced per-file CFLAGS, using $(CFLAGS-$@); this construct
is not parsed well by IRIX make, which ends up attempting to
expand '$@)' and finding mismatched parentheses.
Commit 5987e2923a2670a27a801461dc9668ec88ed7d2a (from 2007) followed,
fixing the IRIX build but only for the NOOPT case. This left the
problematic expression in CFLAGS_OPT until 2013, when another RT
ticket was filed reporting the continued breakage. That ticket
was then ignored until 2015 (now) with no particular cries of
outrage on the mailing lists. Perhaps this gives some indication
of the size and/or mindset of the IRIX userbase. (There have
been successful IRIX installations during this time period, so
presumably it was discovered that disabling optimizations helped
the build along.)
FIXES 131621
Reviewed-on: http://gerrit.openafs.org/12111 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Chas Williams <3chas3@gmail.com> Reviewed-by: Michael Meffie <mmeffie@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
(cherry picked from commit 767694d9ec86fc9451f5a4ba2ec7405c29986a21)
Change-Id: Ie5d349b1e9f8a768efcb461d7367d2d7deac31f6
Reviewed-on: https://gerrit.openafs.org/12198 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Reviewed-by: Michael Meffie <mmeffie@sinenomine.net> Reviewed-by: Mark Vitale <mvitale@sinenomine.net> Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de>
Stephan Wiesand [Tue, 8 Mar 2016 13:15:17 +0000 (14:15 +0100)]
Linux 4.4: Do not use splice()
splice() may return -ERESTARTSYS if there are pending signals, and
it's not even clear how this should be dealt with. This potential
problem has been present for a long time, but as of Linux 4.4
(commit c725bfce7968009756ed2836a8cd7ba4dc163011) seems much more
likely to happen.
Until resources are available to fix the code to handle such errors,
avoid the riskier uses of splice().
If there is a default implementation of file_splice_{write,read},
use that; on somewhat older kernels where it is not available,
use the generic version instead.
[kaduk@mit.edu: add test for default_file_splice_write]
Reviewed-on: https://gerrit.openafs.org/12217 Reviewed-by: Chas Williams <3chas3@gmail.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Tested-by: BuildBot <buildbot@rampaginggeek.com>
(cherry picked from commit ae5f411c3b374367ab8ae69488f78f8e0484ce48)
Change-Id: I40dd0d60caece6379a62674defb8d46a2bfadad6
Reviewed-on: https://gerrit.openafs.org/12228 Reviewed-by: Mark Vitale <mvitale@sinenomine.net> Reviewed-by: Michael Meffie <mmeffie@sinenomine.net> Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de>
Michael Laß [Mon, 18 Jan 2016 17:29:00 +0000 (18:29 +0100)]
Linux 4.4: key_payload has no member 'value'
In Linux 4.4 (146aa8b1453bd8f1ff2304ffb71b4ee0eb9acdcc) type-specific and
payload data have been merged. The payload is now accessed directly and has
no 'value' member anymore.
FIXES 132677
Reviewed-on: https://gerrit.openafs.org/12169 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Mark Vitale <mvitale@sinenomine.net> Tested-by: Mark Vitale <mvitale@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
(cherry picked from commit 5067ee3ae11932a3f1c972c8f88b20afbd9e1d88)
Change-Id: I5a3e89b2676b463935e9a77042cbcd8ab812dc68
Reviewed-on: https://gerrit.openafs.org/12226 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Michael Meffie <mmeffie@sinenomine.net> Tested-by: Michael Meffie <mmeffie@sinenomine.net> Reviewed-by: Mark Vitale <mvitale@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de>
Michael Meffie [Fri, 18 Mar 2016 14:22:33 +0000 (10:22 -0400)]
doc: fs examine no longer requires read rights on the volume root vnode
Update the man page to reflect the current access rights required for fs
examine. Historically, fs examine required read access on the root
vnode of the volume housing the directory or file being examined. This
access check was relaxed in commit d2d591caf2c9b4cf2ebae708cc9b4c8b78ca5a5a,
since the information returned by the file server is already available
anonymously by other means.
Reviewed-on: https://gerrit.openafs.org/12223 Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Tested-by: Benjamin Kaduk <kaduk@mit.edu>
(cherry picked from commit f99c1ec32bb6e8d31ac517173ff7502dbd85aa05)
Change-Id: I580d1e0cab7f823ac1932f99066495cef9e2410a
Reviewed-on: https://gerrit.openafs.org/12224 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de>
Benjamin Kaduk [Wed, 16 Mar 2016 21:16:49 +0000 (16:16 -0500)]
Add param files for FreeBSD 10.2, 10.3
FreeBSD 10.3 is in the beta stage now; better get ready for it.
Reviewed-on: https://gerrit.openafs.org/12222 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
(cherry picked from commit 02a393de6b30a500b77f276011c70d41eff363b5)
[updated to match the FreeBSD param.h files on openafs-stable-1_6_x]
Change-Id: Iae290edd29b34aa849f7422b48c765f81eb802fe
Reviewed-on: https://gerrit.openafs.org/12232 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Michael Meffie <mmeffie@sinenomine.net> Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de>
Benjamin Kaduk [Tue, 15 Mar 2016 04:15:20 +0000 (23:15 -0500)]
OPENAFS-SA-2016-002 ListAddrByAttributes information leak
The ListAddrByAttributes structure is used as an input to the GetAddrsU
RPC; it contains a Mask field that controls which of the other fields
will actually be read by the server during the RPC processing.
Unfortunately, the client only wrote to the fields indicated by the
mask, leaving the other fields uninitialized for transmission on the
wire, leaking some contents of client memory.
Plug the information leak by zeroing the entire structure before use.
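In outline, the fix follows the pattern below. struct demo_attrs and its fields are invented for the example and do not match the real RPC structure; what matters is that memset() runs before the masked fields are filled in, so nothing left over from earlier memory use goes out on the wire.

```c
#include <string.h>
#include <stdint.h>

#define DEMO_MASK_INDEX 0x1     /* hypothetical mask bit */

struct demo_attrs {
    int32_t Mask;               /* says which fields the server will read */
    int32_t index;
    int32_t ipaddr;
    int32_t spare[2];
};

static void demo_fill_attrs(struct demo_attrs *a, int32_t index)
{
    memset(a, 0, sizeof(*a));   /* zero everything, including unused fields */
    a->Mask = DEMO_MASK_INDEX;
    a->index = index;
}
```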
Benjamin Kaduk [Tue, 15 Mar 2016 04:15:20 +0000 (23:15 -0500)]
OPENAFS-SA-2016-002 VldbListByAttributes information leak
The VldbListByAttributes structure is used as an input to several
RPCs; it contains a Mask field that controls
which of the other fields will actually be read by the server
during the RPC processing. Unfortunately, the client only
wrote to the fields indicated by the mask, leaving the other
fields uninitialized for transmission on the wire, leaking
some contents of client memory.
Plug the information leak by zeroing the entire structure before use.
Benjamin Kaduk [Tue, 15 Mar 2016 04:15:20 +0000 (23:15 -0500)]
OPENAFS-SA-2016-002 AFSStoreVolumeStatus information leak
The AFSStoreVolumeStatus structure is used as an input to the
RXAFS_SetVolumeStatus RPC; it contains a Mask field that controls
which of the other fields will actually be read by the server
during the RPC processing. Unfortunately, the client only
wrote to the fields indicated by the mask, leaving the other
fields uninitialized for transmission on the wire, leaking
some contents of kernel memory.
Plug the information leak by zeroing the entire structure before use.
Benjamin Kaduk [Sun, 13 Mar 2016 17:56:24 +0000 (12:56 -0500)]
OPENAFS-SA-2016-002 AFSStoreStatus information leak
Marc Dionne reported that portions of the AFSStoreStatus structure
were not written to before being sent over the network for
operations such as create, symlink, etc., leaking the contents
of the kernel stack to observers. A flags field controls which fields
in the request are used, and so if a field was
not going to be used by the server, it was sometimes left
uninitialized.
Fix the information leak by zeroing out the structure before use.
Benjamin Kaduk [Thu, 10 Mar 2016 01:30:20 +0000 (19:30 -0600)]
OPENAFS-SA-2016-001 group creation by foreign users
CVE-2016-2860:
The ptserver permits foreign-cell users to create groups as if they were
system:administrators. In particular, groups in the user namespace
(with no colon) and the system: namespace can be created. No group
quota is enforced for the creation of these groups, but they will be
owned by system:administrators and cannot be changed by the user that
created them. When processing requests from foreign users, the
creator ID is overwritten with the ID of system:administrators, and
that field is later used for access control checks in
CorrectGroupName(), called from CreateEntry().
The access-control bypass is not possible for creating user entries,
since there is an early check in CreateOK() that only permits
administrators to create users, using a correct test for whether
the call is being made by an administrator.
Brian Torbich [Thu, 21 Jan 2016 15:08:27 +0000 (10:08 -0500)]
redhat: Correct permissions on systemd unit files
Change the systemd unit file permissions created via
openafs.spec to be 0644 instead of 0755. Having the
systemd unit files be executable will trigger a systemd
warning.
FIXES 132662
Reviewed-on: http://gerrit.openafs.org/12174 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
(cherry picked from commit a4c4b786059ac7d5f9ecc5ec07727f000b62c13f)
Stephan Wiesand [Mon, 22 Jun 2015 08:44:11 +0000 (10:44 +0200)]
redhat: Avoid bogus dependencies when building the srpm
By default the spec defines that both userland and kernel module
packages should be built. This results in a dependency of the form
"kernel-devel-`uname -m` = `uname -r`" being added to the source
package created by makesrpm.pl, which is bogus because the uname
values are from the system on which the srpm is built and needn't
apply to the system where it is used. While rpm and rpmbuild ignore
such dependencies of source packages, other tools don't and may fail.
Some versions of rpmbuild will also enforce those requirements when
building the srpm itself, which is pointless too.
Avoid both problems by pretending not to attempt building modules
and ignoring any dependencies when makesrpm.pl invokes rpmbuild -bs.
Reviewed-on: http://gerrit.openafs.org/11903 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de> Tested-by: Stephan Wiesand <stephan.wiesand@desy.de> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
(cherry picked from commit 9ee5fa152b7b7de6a6ddc6ed87bbf9f76da6e3e4)
Change-Id: I76aac20b8dcad2105f8d20a3e169b2f5526ef956
Reviewed-on: http://gerrit.openafs.org/12195 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de> Tested-by: Stephan Wiesand <stephan.wiesand@desy.de>
Mark Vitale [Mon, 9 Feb 2015 23:16:16 +0000 (18:16 -0500)]
pioctl.c: restore required result variable
Commit b9fb9c62a6779aa997259ddf2a83a90b08e04d5f refactored lpioctl()
so that LINUX would have its own implementation. This also simplified
the other lpioctl() implementations by removing superfluous variable
'rval'.
Unfortunately, 'rval' was actually required for both DARWIN and SUN511.
On both of these platforms, the address of 'errcode' is passed
to the respective ioctl_*() routine so its value may be passed back
to lpioctl(). Therefore, 'errcode' must not also be used for the
return value from these functions; doing so results in the return
value from the function overwriting the intended value of 'errcode' upon
return to lpioctl().
In the case of Solaris 11, ioctl_sun_afs_syscall() always returns zero
(as long as the ioctl device 'dev/afs' opened successfully).
So 'errcode' was always being set to zero, even if the pioctl had
actually failed. For example, without this fix, 'fs listcells'
loops forever on Solaris 11, listing an infinite number of "cells",
because it will never "see" the EDOM that informs it of the last defined
cell.
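The overwrite described above can be reproduced in a few lines. This is a sketch under assumptions: ex_ioctl_helper() is a hypothetical stand-in for the platform ioctl_*() routines, and the way the fixed variant treats a non-zero 'rval' is a guess, not the actual pioctl.c logic.

```c
#include <errno.h>

/* Hypothetical helper: reports the pioctl result through *errcode and
 * returns 0 whenever the device itself could be used. */
int
ex_ioctl_helper(int fail, int *errcode)
{
    *errcode = fail ? EDOM : 0;
    return 0;
}

/* Buggy shape: the helper's return value overwrites the result that
 * was just stored through the pointer. */
int
ex_lpioctl_buggy(int fail)
{
    int errcode = 0;
    errcode = ex_ioctl_helper(fail, &errcode);
    return errcode;
}

/* Fixed shape: a separate 'rval' holds the helper's return value. */
int
ex_lpioctl_fixed(int fail)
{
    int errcode = 0;
    int rval = ex_ioctl_helper(fail, &errcode);
    if (rval != 0)
        return rval;    /* assumption: pass helper failures through */
    return errcode;
}
```

In the buggy variant the caller never sees EDOM, which matches the endless 'fs listcells' loop described above.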
Benjamin Kaduk [Thu, 6 Feb 2014 21:11:49 +0000 (16:11 -0500)]
pioctl.c: removed unused variable
The 'rval' variable is only actually used in the LINUX20 case;
adding another conditional block is making the LINUX20 case
different enough that it should get split out entirely.
Doing so lets the 'else' clause be simpler.
Found by clang on FreeBSD 10.0.
Reviewed-on: http://gerrit.openafs.org/10819 Tested-by: Benjamin Kaduk <kaduk@mit.edu> Reviewed-by: Perry Ruiter <pruiter@sinenomine.net> Reviewed-by: D Brashear <shadow@your-file-system.com>
(cherry picked from commit b9fb9c62a6779aa997259ddf2a83a90b08e04d5f)
Change-Id: I47f781bc13d54ad5a1b34365fcb9680793b206d1
Reviewed-on: http://gerrit.openafs.org/11778 Reviewed-by: Mark Vitale <mvitale@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de>
Benjamin Kaduk [Thu, 6 Feb 2014 22:01:19 +0000 (17:01 -0500)]
FBSD: Switch the dummy 'data' for mount(2)
The mount(2) API takes a void*, but 'rn' is const char*, which
is const-incorrect. Our vfs_cmount implementation ignores the 'data'
parameter, but upstream's kernel mount(2) implementation did
have a NULL check until r158611 (in the 6.1 or 7.0 timeframe),
so leave that comment for now.
Arguably we should be using nmount(2) instead of mount(2) anyway,
but leave that for a separate patch.
Michael Meffie [Thu, 7 Jan 2016 19:15:53 +0000 (14:15 -0500)]
Linux: Fix crash when the afs root volume is not found
Commit 602130f1de65eefeb4e31e114070d544eb9edd40 changed the allocation of the
backing device info to directly use the kernel memory allocator. Unfortunately,
one of the deallocations was not converted to the kernel memory deallocator
in the backport to the 1.6.x branch.
The code path is triggered when the afs root volume is not found (for example,
when -dynroot is not in use and the root.afs volume is not available). This
causes the system to crash instead of just failing to mount /afs.
This is a 1.6.x change only. This bug was introduced in version 1.6.14.1.
FIXES 132653
Change-Id: Ifc991be5f914b4a4e1a797b7e2178dc03436b8e6
Reviewed-on: http://gerrit.openafs.org/12166 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Reviewed-by: Chas Williams <3chas3@gmail.com> Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de>
Stephan Wiesand [Mon, 14 Dec 2015 14:11:37 +0000 (15:11 +0100)]
Update NEWS for 1.6.16
Release notes for OpenAFS 1.6.16
Change-Id: I5c1676b2bad4e94039691fb17f33fb5e278fadbf
Reviewed-on: http://gerrit.openafs.org/12131 Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Tested-by: Benjamin Kaduk <kaduk@mit.edu> Reviewed-by: Mark Vitale <mvitale@sinenomine.net> Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de>
Mark Vitale [Fri, 7 Aug 2015 15:56:16 +0000 (11:56 -0400)]
afs: pioctl kernel memory overrun
CVE-2015-8312:
Any pioctl with an input buffer size (ViceIoctl->in_size)
exactly equal to AFS_LRALLOCSIZ (4096 bytes) will cause
a one-byte overwrite of its kernel memory working buffer.
This may crash the operating system or cause other
undefined behavior.
The attacking pioctl must be a valid AFS pioctl code.
However, it need not specify valid arguments (in the ViceIoctl),
since only rudimentary checking is done in afs_HandlePioctl.
Most argument validation occurs later in the individual
pioctl handlers.
Nor does the issuer need to be authenticated or authorized
in any way, since authorization checks also occur much later,
in the individual pioctl handlers. An unauthorized user
may therefore trigger the overrun by either crafting his
own malicious pioctl, or by issuing a privileged
command, e.g. 'fs newalias', with appropriately sized but
otherwise arbitrary arguments. In the latter case, the
attacker will see the expected error message:
"fs: You do not have the required rights to do this operation"
but in either case the damage has been done.
Pioctls are not logged or audited in any way (except those
that cause loggable or auditable events as side effects).
root cause:
afs_HandlePioctl() calls afs_pd_alloc() to allocate
two afs_pdata structs, one for input and one for output.
The memory for these buffers is based on the requested
size, plus at least one extra byte for the null terminator
to be set later:
requested size       allocated
=================    ==================================
>  AFS_LRALLOCSIZ    osi_Alloc(size+1)
<= AFS_LRALLOCSIZ    afs_AllocLargeSize(AFS_LRALLOCSIZ)
afs_HandlePioctl then adds a null terminator to each buffer,
one byte past the requested size. This is safe in all cases
except one: if the requested in_size was _exactly_
AFS_LRALLOCSIZ (4096 bytes), this null is one byte beyond
the allocated storage, zeroing a byte of kernel memory.
Commit 6260cbecd0795c4795341bdcf98671de6b9a43fb introduced
the null terminators and they were correct at that time.
But the commit message warns:
"note that this works because PIGGYSIZE is always less than
AFS_LRALLOCSIZ"
Commit f8ed1111d76bbf36a466036ff74b44e1425be8bd introduced
the bug by increasing the maximum size of the buffers but
failing to account correctly for the null terminator in
the case of input buffer size == AFS_LRALLOCSIZ.
Commit 592a99d6e693bc640e2bdfc2e7e5243fcedc8f93 (master
version of one of the fixes in the recent 1.6.13 security
release) is the fix that drew my attention to this new
bug. Ironically, 592a99 (combined with this commit), will
make it possible to eliminate the "offending" null termination
line altogether since it will now be performed automatically by
afs_pd_alloc().
[kaduk@mit.edu: adjust commit message for CVE number assignment,
reduce unneeded churn in the diff.]
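The boundary case in the root-cause analysis above can be checked with a little arithmetic. This is a sketch only: it models the allocation rule from the table and uses the 4096-byte value quoted in the message. The EX_/ex_ names are hypothetical and nothing here is the kernel code itself.

```c
#include <stddef.h>

#define EX_LRALLOCSIZ 4096  /* value quoted in the message above */

/* Bytes backing a buffer under the allocation rule in the table:
 * large requests get size+1, everything else a fixed-size block. */
size_t
ex_allocated(size_t requested)
{
    if (requested > EX_LRALLOCSIZ)
        return requested + 1;
    return EX_LRALLOCSIZ;
}

/* The terminator is written at index 'requested'; that is in bounds
 * only if the index is strictly below the allocated size. */
int
ex_terminator_in_bounds(size_t requested)
{
    return requested < ex_allocated(requested);
}
```

Under this model a request of 4095 or 4097 bytes is safe, while exactly 4096 puts the terminator one byte past the block.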
Chas Williams [Wed, 2 Dec 2015 15:38:42 +0000 (10:38 -0500)]
Open syscall emulation file O_RDONLY
As reported on the -info mailing list, docker is now exporting the
/proc filesystem as read only. ioctl() doesn't need write permissions
to do its work, so change O_RDWR to O_RDONLY.
Reviewed-on: http://gerrit.openafs.org/12122 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Jeffrey Altman <jaltman@your-file-system.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
(cherry picked from commit 359e1f2a25d242984229edfb378c0b95c3ee8570)
configure now checks for the standard getmaxyx() macro; failing that,
it looks for the older but pre-standardization getmaxx() and getmaxy(),
then falls back to the 4.2BSD curses _maxx and _maxy fields; if all
else fails, gtx building is disabled.
gtx now defines getmaxyx() itself if necessary, based on the above.
This also fixes a bug in gtx with all ncurses versions > 1.8.0 on
platforms other than NetBSD and OS X: gtx was using the _maxx and
_maxy fields, which starting with ncurses 1.8.1 were off by 1 from
the expected values. As such, behavior of scout and/or afsmonitor
may change on most ncurses-using platforms.
Reviewed-on: http://gerrit.openafs.org/12107 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Michael Meffie <mmeffie@sinenomine.net> Reviewed-by: Chas Williams <3chas3@gmail.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
(cherry picked from commit b800f7d9bd5ea390ab330c1c0c38ac8277eb9998)
Change-Id: Ia42eb33a963aa15131511c07ef4823f3f061a762
Reviewed-on: http://gerrit.openafs.org/12125 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Chas Williams <3chas3@gmail.com> Reviewed-by: Michael Meffie <mmeffie@sinenomine.net> Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de>
Stephan Wiesand [Tue, 8 Dec 2015 12:13:47 +0000 (13:13 +0100)]
redhat: exclude kpasswd from debuginfo processing
While kpasswd was in the separate openafs-kpasswd package to avoid
clashing with the krb5 executable, openafs-debuginfo still conflicted
with krb5-debuginfo. Remove the x-bits from kpasswd in %install to
make debuginfo processing ignore it, and add them back in the %files
list. Make kapasswd a copy rather than a hard link to have it processed
in the usual way.
This is a 1.6-only change. On the master branch, this issue is fixed
by commit 4e3ceaccd9dc2b6e6a20e938d82af1ebaa2c43c8 which however
removes kpasswd altogether and is thus considered inappropriate for the
stable release series.
FIXES 131771
Change-Id: Icd940e3f5da133a98401c7a28ed6ee0c637bf602
Reviewed-on: http://gerrit.openafs.org/12128 Reviewed-by: Michael Meffie <mmeffie@sinenomine.net> Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Tested-by: Stephan Wiesand <stephan.wiesand@desy.de> Reviewed-by: Marcio Brito Barbosa <mbarbosa@sinenomine.net> Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de>
Michael Meffie [Wed, 18 Feb 2015 02:54:46 +0000 (21:54 -0500)]
prdb_check: fix out of bounds array access in continuation entries
A continuation entry (struct contentry) contains 39 id elements, however
a regular entry (struct prentry) contains only 10 id elements.
Attempting to access more than 10 elements of a regular entry is
undefined behavior.
Use a struct contentry when processing continuation entries in
prdb_check. This is done to safely traverse the id arrays of the
continuation entries. Use the new pr_PrintContEntry to print
continuation entries.
The undefined behavior manifests as a segmentation violation in
WalkNextChain() when built with GCC 4.8 with optimization enabled.
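A sketch of the safe traversal, under stated assumptions: the two structures below are hypothetical layouts that only reproduce the 10-versus-39 id counts from the message (the real struct prentry and struct contentry have other fields), and ex_count_cont_ids() is an illustrative helper, not part of prdb_check.

```c
#include <stdint.h>
#include <string.h>

#define EX_PR_IDS   10  /* ids in a regular entry, per the message */
#define EX_CONT_IDS 39  /* ids in a continuation entry, per the message */

/* Hypothetical layouts: same record size, different id array lengths. */
struct ex_prentry   { int32_t header[38]; int32_t entries[EX_PR_IDS]; };
struct ex_contentry { int32_t header[9];  int32_t entries[EX_CONT_IDS]; };

/* Count the non-zero ids of a raw record known to be a continuation
 * entry. Copying into the matching type keeps every index inside the
 * declared array bounds. */
int
ex_count_cont_ids(const void *raw_record)
{
    struct ex_contentry c;
    int i, n = 0;

    memcpy(&c, raw_record, sizeof(c));
    for (i = 0; i < EX_CONT_IDS; i++) {
        if (c.entries[i] != 0)
            n++;
    }
    return n;
}
```

Indexing entries[10] and beyond through a struct ex_prentry would be out of bounds for the declared array even though the bytes exist in the record, which is the kind of access an optimizing compiler is free to miscompile.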
Reviewed-on: http://gerrit.openafs.org/11742 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
(cherry picked from commit 15e8678661ec49f5eac3954defad84c06b3e0164)
Change-Id: Ifc0682cd2b6b1590b10c44ccdda181fd4227c1c2
Reviewed-on: http://gerrit.openafs.org/12104 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Michael Meffie <mmeffie@sinenomine.net> Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de>
Michael Meffie [Wed, 18 Feb 2015 01:58:27 +0000 (20:58 -0500)]
prdb_check: check for continuation entries in owner chains
Continuation entries may not be in owner chains. Fix the
comments in WalkOwnerChain (which were probably copied from
WalkNextChain) and add a check and error message for
continuation entries found on owner chains.
Reviewed-on: http://gerrit.openafs.org/11751 Reviewed-by: Daria Phoebe Brashear <shadow@your-file-system.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Tested-by: BuildBot <buildbot@rampaginggeek.com>
(cherry picked from commit 3e9e244d1004972f202490faa0375768959f7690)
Change-Id: I8da044e32e6ade0d8d3050ccebf46d1e735e333a
Reviewed-on: http://gerrit.openafs.org/12103 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Michael Meffie <mmeffie@sinenomine.net> Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de>
Michael Meffie [Wed, 18 Feb 2015 02:11:50 +0000 (21:11 -0500)]
libprot: add pr_PrintContEntry function
A continuation entry (struct contentry) contains 39 id elements, however
a regular entry (struct prentry) contains only 10 id elements. Attempting
to access more than 10 elements of a regular entry is undefined
behavior.
Add a new function to safely print continuation entries and change
pr_PrintEntry to avoid accessing the entries array out of bounds.
The pr_PrintEntry function is at this time only used by the prdb_check
and ptclient debugging utilities.
Reviewed-on: http://gerrit.openafs.org/11750 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
(cherry picked from commit 121ac2d939e19741986ddfbd387b5310c40edd0d)
Change-Id: Ifaa5ba1df0e40ae03e5a80fa7f0490196e7e4369
Reviewed-on: http://gerrit.openafs.org/12102 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Michael Meffie <mmeffie@sinenomine.net> Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de>
Stephan Wiesand [Tue, 17 Nov 2015 14:03:03 +0000 (15:03 +0100)]
writeconfig: emit error messages again in VerifyEntries
Before commit e4a8a7a38dbf29e89bc1a7b6b017447a6aa0c764 an error message
was printed if looking up a server hostname failed. Restore this, and
also print a message in the now detected case that the lookup returns
loopback addresses only.
Reviewed-on: http://gerrit.openafs.org/12097 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Michael Meffie <mmeffie@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
(cherry picked from commit f6247f90c9644d7a396531c219c585f705e0c251)
Change-Id: I6edc433cbbc8f2d8528501aa30b0aceafb85dbb6
Reviewed-on: http://gerrit.openafs.org/12105 Reviewed-by: Michael Meffie <mmeffie@sinenomine.net> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de>
Michael Meffie [Tue, 4 Nov 2014 00:06:15 +0000 (19:06 -0500)]
avoid writing loopback addresses into CellServDB
Do not use loopback addresses for the server side CellServDB file. Use
getaddrinfo() instead of gethostbyname() to look up a list of IPv4
addresses for a given hostname, and take the first non-loopback address.
This avoids writing a loopback address into the CellServDB on systems
such as Debian, which map the address 127.0.1.1 to the hostname in the
/etc/hosts file.
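The lookup described above might look roughly like the following. This is a minimal sketch using the standard getaddrinfo() interface; the ex_* function names and the choice of hints are assumptions for illustration, not the code in the patch.

```c
#include <stdint.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <netdb.h>

/* True if the IPv4 address (network byte order) is in 127.0.0.0/8. */
int
ex_is_loopback(uint32_t addr_net)
{
    return (ntohl(addr_net) >> 24) == 127;
}

/* Walk a getaddrinfo() result list and store the first non-loopback
 * IPv4 address (network byte order) in *out. Returns 0 on success,
 * -1 if the list holds no usable address. */
int
ex_first_usable(const struct addrinfo *list, uint32_t *out)
{
    const struct addrinfo *ai;

    for (ai = list; ai != NULL; ai = ai->ai_next) {
        const struct sockaddr_in *sin;

        if (ai->ai_family != AF_INET || ai->ai_addr == NULL)
            continue;
        sin = (const struct sockaddr_in *)ai->ai_addr;
        if (ex_is_loopback(sin->sin_addr.s_addr))
            continue;
        *out = sin->sin_addr.s_addr;
        return 0;
    }
    return -1;
}

/* Resolve a hostname, restricted to IPv4, and apply the filter. */
int
ex_lookup(const char *host, uint32_t *out)
{
    struct addrinfo hints, *res = NULL;
    int code;

    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_INET;
    hints.ai_socktype = SOCK_DGRAM;
    if (getaddrinfo(host, NULL, &hints, &res) != 0)
        return -1;
    code = ex_first_usable(res, out);
    freeaddrinfo(res);
    return code;
}
```

With a Debian-style /etc/hosts, a list of 127.0.1.1 followed by the real address yields the real address; a list containing only loopback addresses yields -1, which the caller can report as an error.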
Reviewed-on: http://gerrit.openafs.org/11585 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: D Brashear <shadow@your-file-system.com>
(cherry picked from commit e4a8a7a38dbf29e89bc1a7b6b017447a6aa0c764)
Change-Id: Ib53b924b49c4c959c2228f953227e37fb94030a9
Reviewed-on: http://gerrit.openafs.org/12083 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Reviewed-by: Mark Vitale <mvitale@sinenomine.net> Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de>
Michael Meffie [Wed, 21 May 2014 21:27:47 +0000 (17:27 -0400)]
doc: document the version subcommand
Document the built-in version sub-command which displays
the OpenAFS version string. This sub-command is provided
by the cmd library.
Document the switch style -version option provided by the cmd
library for the initcmd based commands: afsmonitor, scout,
xstat_fs_test, and xstat_cm_test.
Reviewed-on: http://gerrit.openafs.org/11161 Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Tested-by: BuildBot <buildbot@rampaginggeek.com>
(cherry picked from commit ed52ea68c661a7428baeddeca2d95972fe3fe618)
Change-Id: Ie7a5194b8c407c8899ae71f168dfbaf5b47a3ae5
Reviewed-on: http://gerrit.openafs.org/12096 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Mark Vitale <mvitale@sinenomine.net> Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de>
Michael Meffie [Fri, 6 Nov 2015 16:56:31 +0000 (11:56 -0500)]
vos: reinstate the -localauth option for vos setaddrs
Commit d1d411576cf39c4bc55918df0eb64327718d566c added the vos remaddrs
subcommand, but unfortunately stole the common parameters from
setaddrs. Fix this bug and remove the extra blank line between
the subcommand syntax and the common params macro.
Reviewed-on: http://gerrit.openafs.org/12093 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
(cherry picked from commit 69d11fd5ee556bb375967d7c41dab39b9c1befbe)
Change-Id: I99e6586c8d2b5e2a20bfb404099f6aed950356e7
Reviewed-on: http://gerrit.openafs.org/12094 Reviewed-by: Mark Vitale <mvitale@sinenomine.net> Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de>
Michael Meffie [Mon, 17 Nov 2014 16:23:38 +0000 (11:23 -0500)]
vos: remaddrs sub-command
Introduce the vos remaddrs sub-command for removing multi-homed server
entries from the vldb. The remaddrs sub-command completes the listaddrs
and setaddrs command suite and allows vos changeaddr to be deprecated
completely.
Michael Meffie [Fri, 14 Nov 2014 21:57:53 +0000 (16:57 -0500)]
fix byte ordering in check_sysid
Several uuid fields as well as the ip addresses in the sysid file are in
network byte order. Fix the check_sysid utility to decode these fields
properly. In addition, print the server uuid in the common string
format used to display uuids, instead of by individual uuid fields.
Note: Although this fix is marked as a "cherry-pick", this patch was
rewritten for the 1.6 branch since the opr uuid handling functions are
not available in the 1.6 branch.
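Decoding a network-byte-order field before printing it can be sketched as follows. The helper name is hypothetical and the sysid file layout is not reproduced here; the point is only the ntohl() conversion.

```c
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>

/* Read a 32-bit network-byte-order (big-endian) value from a raw
 * buffer and return it in host byte order. memcpy() avoids any
 * alignment assumptions about the buffer. */
uint32_t
ex_read_net32(const unsigned char *p)
{
    uint32_t v;

    memcpy(&v, p, sizeof(v));
    return ntohl(v);
}
```

For the bytes 0x0a 0x00 0x00 0x05 this returns 0x0a000005 on both little-endian and big-endian hosts, whereas using the raw value directly would print a byte-swapped number on little-endian machines.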
Change-Id: I52e74fc28b30f06a8180ff65a8006c9281162fe9
Reviewed-on: http://gerrit.openafs.org/12090 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Mark Vitale <mvitale@sinenomine.net> Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de>
Michael Meffie [Mon, 9 Feb 2015 20:04:19 +0000 (15:04 -0500)]
bozo: fix -pidfiles default
Fix the default value for the -pidfiles argument. The pidfiles
should be stored in the local state directory, not the server
configuration directory when using modern paths.
Michael Meffie [Sat, 8 Nov 2014 18:14:27 +0000 (13:14 -0500)]
vldb_check: rebuild free list with -fix
Rebuild the vldb free chain in addition to the hash chains when
vldb_check is run with the -fix option. Print a FIX: message for
entries added to the free chain.
Example vldb with a broken free chain.
$ vldb_check vldb.broken
address 199364 (offset 0x30b04): Free vlentry not on free chain
address 223192 (offset 0x36818): Free vlentry not on free chain
address 235180 (offset 0x396ec): Free vlentry not on free chain
Scanning 1707 entries for possible repairs
$ vldb_check -fix vldb.broken
Rebuilding 1707 entries
FIX: Putting free entry on the free chain: addr=199364 (offset 0x30b04)
FIX: Putting free entry on the free chain: addr=223192 (offset 0x36818)
FIX: Putting free entry on the free chain: addr=235180 (offset 0x396ec)
Thanks to Kostas Liakakis for reporting this bug.
Reviewed-on: http://gerrit.openafs.org/11598 Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Tested-by: BuildBot <buildbot@rampaginggeek.com>
(cherry picked from commit 3b9d52b2e8020cce65d55516db36580d58a51f0b)
Change-Id: I01987451857b26fb9e87984da85976196145e1dd
Reviewed-on: http://gerrit.openafs.org/12084 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Michael Meffie <mmeffie@sinenomine.net> Reviewed-by: Mark Vitale <mvitale@sinenomine.net> Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de>
Michael Meffie [Mon, 29 Sep 2014 16:14:24 +0000 (12:14 -0400)]
vos: preserve cloneId and backupId when restoring
Preserve the volume clone and backup ids in the volume header when
restoring over an existing volume, instead of always setting the clone
and backup ids to zero.
For example, before this change, restoring over a volume resets the
ROnly and Backup ids reported in the volume header section of vos
examine.
$ vos examine xyzzy
xyzzy 536871023 RW 3 K On-line
myhost /vicepa
RWrite 536871023 ROnly 536871024 Backup 536871025
...
RWrite: 536871023 ROnly: 536871024 Backup: 536871025
number of sites -> 2
server myhost partition /vicepa RW Site
server myhost partition /vicepa RO Site
$ cat /tmp/xyzzy.dump | vos restore myhost a xyzzy -overwrite incremental
Restoring volume xyzzy Id 536871023 on server myhost partition /vicepa .. done
Restored volume xyzzy on myhost /vicepa
$ vos examine xyzzy
xyzzy 536871023 RW 3 K On-line
myhost /vicepa
RWrite 536871023 ROnly 0 Backup 0
...
RWrite: 536871023 ROnly: 536871024 Backup: 536871025
number of sites -> 2
server myhost partition /vicepa RW Site
server myhost partition /vicepa RO Site
Michael Meffie [Thu, 13 Nov 2014 17:12:12 +0000 (12:12 -0500)]
redhat: do not overwrite the server CellServDB
The bosserver creates a pair of symlinks in the client's configuration
directory (/usr/vice/etc) during startup, if the configuration files are
not present:
Due to a bug in the bosserver (which is not fixed on 1.6.x), the
symlinks are only created when the /usr/vice/etc directory already
exists when the bosserver is started.
If the bosserver is started before the client is installed (and the
/usr/vice/etc directory is present), then the packaging script will
write to the symlink CellServDB, overwriting the server's CellServDB with
the contents of the client's CellServDB.local and CellServDB.dist files.
Also, if the client is started after the bosserver creates the symlinks,
the client init script will overwrite the server's CellServDB with the
contents of the client's CellServDB.local and CellServDB.dist files.
Update the packaging and the client init script to delete this symlink
if present, since it is only intended to provide stub configuration
for the client utilities while setting up an initial server. Then,
the updating of the CellServDB will create a local file, instead of
following the symlink and overwriting the server CellServDB.
While here, adjust the indentation whitespace to match the tabs below.
Reviewed-on: http://gerrit.openafs.org/11601 Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Tested-by: BuildBot <buildbot@rampaginggeek.com>
(cherry picked from commit 75d67780b42c1a7bfa506fcd230b28a6f293fcbd)
Change-Id: I7f899c7ea35d5df6a2e846a0354717fd51e2eba4
Reviewed-on: http://gerrit.openafs.org/12081 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Michael Meffie <mmeffie@sinenomine.net> Reviewed-by: Mark Vitale <mvitale@sinenomine.net> Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de>
Benjamin Kaduk [Wed, 17 Sep 2014 16:07:02 +0000 (12:07 -0400)]
Fix disk name initialization in scout
Scout needs to initialize names in scout_disk structures to prevent
the use of uninitialized data. However, '\0' is a NUL character
constant, i.e., the integer value 0, which is interpreted as NULL
(the pointer constant) in a pointer context, such as when assigned to
a variable of type char*. Since the name field in these structs is
passed to printing routines, the safe initialization value is the
empty string constant "", not a zero value.
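The distinction can be shown in a few lines. The structure here is a hypothetical stand-in for scout_disk with only a name field.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a disk record with a printable name. */
struct ex_disk {
    const char *name;
};

/* Initialize the name to the empty string. Writing
 *     d->name = '\0';
 * would also compile, but '\0' is an integer constant with value 0,
 * so the pointer would become NULL rather than point at "". */
void
ex_disk_init(struct ex_disk *d)
{
    d->name = "";
}

/* Safe only because the name is never NULL after ex_disk_init(). */
size_t
ex_name_len(const struct ex_disk *d)
{
    return strlen(d->name);
}
```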
Reviewed-on: http://gerrit.openafs.org/11469 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Chas Williams - CONTRACTOR <chas@cmf.nrl.navy.mil> Reviewed-by: Perry Ruiter <pruiter@sinenomine.net> Reviewed-by: Jeffrey Altman <jaltman@your-file-system.com>
(cherry picked from commit 57ca77786eb6c04519f9358f1456fdf5b8006757)
Change-Id: I970e19c698cc26255cd244671908a631ef959c30
Reviewed-on: http://gerrit.openafs.org/12078 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Reviewed-by: Michael Meffie <mmeffie@sinenomine.net> Reviewed-by: Mark Vitale <mvitale@sinenomine.net> Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de>
Anders Kaseorg [Sat, 1 Aug 2015 09:52:59 +0000 (05:52 -0400)]
src/kauth/krb_udp.c: Remove redundant NULL check for array address
Resolves this warning with clang:
krb_udp.c:302:13: warning: address of array 'tentry.misc_auth_bytes' will always evaluate to 'true' [-Wpointer-bool-conversion]
if (tentry.misc_auth_bytes) {
~~ ~~~~~~~^~~~~~~~~~~~~~~
Reviewed-on: http://gerrit.openafs.org/11964 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
(cherry picked from commit 09bf3ebb26a3d8a4bd10571b394a59207a7f6980)
Change-Id: I94850d438902c358239142d696fae7206cef55a6
Reviewed-on: http://gerrit.openafs.org/12077 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Reviewed-by: Mark Vitale <mvitale@sinenomine.net> Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de>
Several functions in src/auth/userok.c construct pathnames in fixed
size buffers on their stacks. Those buffers are simultaneously too
small for the purpose for which they are used and too large to be
placed on the stack. This change replaces these fixed-size buffers
with dynamically-allocated buffers which are either exactly the right
size (due to asprintf) or have size AFSDIR_PATH_MAX.
This file has diverged quite substantially between master and 1.6.x,
so though it is marked as a "cherry-pick", this patch was substantially
rewritten for the 1.6 branch. In particular, we must use afs_asprintf()
since asprintf() is not available everywhere.
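Where asprintf() is unavailable, an exactly sized buffer can still be obtained with standard C99 calls. The sketch below is not afs_asprintf() and makes no claim about how that function is implemented; ex_join_path() is a hypothetical helper.

```c
#include <stdio.h>
#include <stdlib.h>

/* Build "dir/file" in a heap buffer of exactly the needed size.
 * Returns NULL on failure; the caller frees the result. */
char *
ex_join_path(const char *dir, const char *file)
{
    /* With a NULL buffer and size 0, snprintf() only reports the
     * length the formatted string would have. */
    int len = snprintf(NULL, 0, "%s/%s", dir, file);
    char *buf;

    if (len < 0)
        return NULL;
    buf = malloc((size_t)len + 1);
    if (buf == NULL)
        return NULL;
    snprintf(buf, (size_t)len + 1, "%s/%s", dir, file);
    return buf;
}
```

This keeps large buffers off the stack and removes the risk of truncating a long pathname.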
Change-Id: Iac62cb8293e7b28b422e7401eccb1f26841aff66
Reviewed-on: http://gerrit.openafs.org/11436 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Chas Williams <3chas3@gmail.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Reviewed-by: Daria Phoebe Brashear <shadow@your-file-system.com> Reviewed-by: Mark Vitale <mvitale@sinenomine.net> Reviewed-by: Michael Meffie <mmeffie@sinenomine.net> Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de>
Stephan Wiesand [Tue, 4 Aug 2015 11:28:35 +0000 (13:28 +0200)]
vlserver: Use the right variable for error code in SVL_GetStats
Commit 6c9fe7f80e4b5d9fb21609ee6743470d39dfb8f5 missed one instance
of "code" (as used on the master branch) that should have been changed
to "errorcode" (as used on the 1.6 branch) as part of the cherry-pick.
Fix this so that the right value is returned.
This is a 1.6-only change.
Change-Id: I97d9ac5961836843b617bab007d0c4d8bed82fef
Reviewed-on: http://gerrit.openafs.org/11970 Reviewed-by: Daria Phoebe Brashear <shadow@your-file-system.com> Reviewed-by: Mark Vitale <mvitale@sinenomine.net> Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de>
Benjamin Kaduk [Wed, 10 Dec 2014 19:07:14 +0000 (14:07 -0500)]
Handle backupDate of zero
In older versions of OpenAFS (prior to 2001), the backupDate was
never set. Provide somewhat more reasonable behavior by using a
different date in that case.
This patch modifies a patch committed as 1e6fb1b7b7: dumpTimes.to is now
set to creationDate for R/O volumes. The old value, copyDate, is wrong if the
R/O volume is re-cloned. This does not happen with "vos dump -clone", but
may happen when dumping a R/O volume directly: "vos dump <R/O volume>".
Volume dumps can be created from backup volumes, cloned volumes, or
directly from RW volumes. The beginning and end of the time range
covered by the dump is recorded in the DumpHeader. The end time is
based on the type of the volume. Use backupDate for backup volumes,
use copyDate for cloned volumes, and updateDate for RW volumes.
Reviewed-on: http://gerrit.openafs.org/11389 Reviewed-by: Jeffrey Altman <jaltman@your-file-system.com> Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: D Brashear <shadow@your-file-system.com>
(cherry picked from commit 1e6fb1b7b7ed32e2035452db9fc221f38a8b4956)
Jeffrey Altman [Fri, 9 Oct 2015 02:22:12 +0000 (22:22 -0400)]
rx: OPENAFS-SA-2015-007 "Tattletale"
CVE-2015-7762:
The CMU/Transarc/IBM definition of rx_AckDataSize(nAcks) was mistakenly
computed from sizeof(struct rx_ackPacket) and inadvertently added three
octets to the computed ack data size due to C language alignment rules.
When constructing ack packets these three octets are not assigned a
value before writing them to the network.
Beginning with AFS 3.3, IBM extended the ACK packet with the "maxMTU" ack
trailer value which was appended to the packet according to the
rx_AckDataSize() computation. As a result the three unassigned octets
were unintentionally cemented into the ACK packet format.
In OpenAFS commit 4916d4b4221213bb6950e76dbe464a09d7a51cc3 Nickolai
Zeldovich <kolya@mit.edu> noticed that the size produced by the
rx_AckDataSize(nAcks) macro was dependent upon the compiler and processor
architecture. The rx_AckDataSize() macro was altered to explicitly
expose the three octets that are included in the computation.
Unfortunately, the failure to initialize the three octets went unnoticed.
The Rx implementation maintains a pool of packet buffers that are reused
during the lifetime of the process. When an ACK packet is constructed
three octets from a previously received or transmitted packet will be
leaked onto the network. These octets can include data from a
received packet that was encrypted on the wire and then decrypted.
If the received encrypted packet is a duplicate or if it is outside the
valid window, the decrypted packet will be used immediately to construct
an ACK packet.
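The size discrepancy can be illustrated with a small C sketch. The struct below is a hypothetical layout chosen for illustration (it is not copied from the Rx headers), and all `example_` names are assumptions. On common ABIs where a 32-bit integer is 4-byte aligned, the sizeof-based computation comes out three octets larger than the offsetof-based one; clearing that gap before sending is one way to keep stale buffer contents off the wire.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define EXAMPLE_MAXACKS 255

/* Hypothetical ACK layout: 18 octets of fixed fields, then the ack array. */
struct example_ack {
    uint16_t bufferSpace;
    uint16_t maxSkew;
    uint32_t firstPacket;
    uint32_t previousPacket;
    uint32_t serial;
    uint8_t reason;
    uint8_t nAcks;
    uint8_t acks[EXAMPLE_MAXACKS];
};

/* Size derived from sizeof: includes any tail padding the compiler adds. */
size_t example_size_from_sizeof(size_t nAcks)
{
    return sizeof(struct example_ack) - EXAMPLE_MAXACKS + nAcks;
}

/* Size derived from offsetof: exactly the fixed fields plus nAcks octets. */
size_t example_size_exact(size_t nAcks)
{
    return offsetof(struct example_ack, acks) + nAcks;
}

/* Zero the octets between the two sizes so a reused buffer leaks nothing. */
void example_clear_ack_padding(uint8_t *buf, size_t nAcks)
{
    size_t start = example_size_exact(nAcks);
    size_t end = example_size_from_sizeof(nAcks);

    memset(buf + start, 0, end - start);
}
```

Because the extra octets are part of the established packet format, the sketch keeps them in the computed size and only makes sure they carry a defined value.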
CVE-2015-7763:
In OpenAFS commit c7f9307c35c0c89f7ec8ada315c81ebc47517f86 the ACK packet
was further extended in an attempt to detect the path MTU between two
peers. When the ACK reason is RX_ACK_PING a variable number of octets is
appended to the ACK following the ACK trailers.
The implementation failed to initialize all of the padding region.
A variable amount of data from previous packets can be leaked onto the
network. The padding region can include data from a received packet
that was encrypted on the wire and then decrypted.
OpenAFS 1.5.75 through 1.5.78 and all 1.6.x releases (including release
candidates) are vulnerable.
Credits:
Thanks to John Stumpo for identifying both vulnerabilities.
Thanks to Simon Wilkinson for patch development.
Thanks to Ben Kaduk for managing the security release cycle.
Benjamin Kaduk [Mon, 8 Sep 2014 17:47:33 +0000 (13:47 -0400)]
Tweak AFSDIR_PATH_MAX definition
On recent Debian, we run into runtime errors in the test suite
because _POSIX_PATH_MAX is only 256, and that buffer is too small
for a call to realpath(). Use PATH_MAX if it's available and larger
than _POSIX_PATH_MAX, in a way that should be safe even when PATH_MAX
is not defined.
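A preprocessor check of this kind might look like the sketch below. `EXAMPLE_PATH_MAX` and `struct example_path_buf` are hypothetical names used only for illustration, and the fallback definition of `_POSIX_PATH_MAX` is an assumption for strict build modes in which `<limits.h>` does not expose it.

```c
#include <limits.h>

/* Assumed fallback for build modes where <limits.h> hides _POSIX_PATH_MAX. */
#ifndef _POSIX_PATH_MAX
# define _POSIX_PATH_MAX 256
#endif

/*
 * An undefined PATH_MAX is never compared: defined() is tested first, and
 * an undefined identifier in #if evaluates to 0 in any case.
 */
#if defined(PATH_MAX) && (PATH_MAX > _POSIX_PATH_MAX)
# define EXAMPLE_PATH_MAX PATH_MAX
#else
# define EXAMPLE_PATH_MAX _POSIX_PATH_MAX
#endif

/* A buffer sized this way is at least as large as the POSIX minimum. */
struct example_path_buf {
    char path[EXAMPLE_PATH_MAX];
};
```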
Reviewed-on: http://gerrit.openafs.org/11453 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Jeffrey Altman <jaltman@your-file-system.com> Reviewed-by: Chas Williams - CONTRACTOR <chas@cmf.nrl.navy.mil> Reviewed-by: Perry Ruiter <pruiter@sinenomine.net> Reviewed-by: D Brashear <shadow@your-file-system.com> Tested-by: D Brashear <shadow@your-file-system.com>
(cherry picked from commit ec2382e060753dfdcaf84b9ac03e1534c65fcdbc)
Andrew Deason [Mon, 27 Oct 2014 21:39:34 +0000 (16:39 -0500)]
rx: Reset lastSendData when resetting call
Currently we use call->lastSendData to attempt to detect a stalled
call, if it's been too long since the last time the call sent any
data. However, we never initialize lastSendData to anything when
creating a new call.
This means that when rx_NewCall (or rxi_NewCall) returns, lastSendData
can be nonzero. This can happen if we reuse a DALLY call, or if we
pull a call off of rx_freeCallQueue. This can be a time very far in
the past, since the lastSendData time has not changed since the last
time the call was used; it will remain unchanged until a user of the
new call writes something to the call stream.
This can be a problem between the time when a caller creates a new
call with rx_NewCall and when the caller actually writes something to
the stream. Between those two times, if lastSendData happens to be set
to a time in the past, we may call rxi_CheckCall on that call, and
abort the call for being idle. The call will thus be aborted before it
even sent any data on the wire.
This is of particular concern for multi_Rx calls, since those can
create a large number of call structures, possibly introducing a delay
between calling rx_NewCall and writing anything to the stream (if one
of the later rx_NewCall invocations blocks waiting for an open call
channel, for instance, all of the previously allocated calls will stick
around unused for potentially a long time).
One such multi_Rx call is done by the cache manager, where it
periodically uses multi_Rx to call RXAFS_GetCapabilities to probe
fileservers for reachability. If this issue occurs during that
operation you can see a large number of servers get marked down for
code -9 (RX_CALL_IDLE), and then get marked as coming back up.
To fix this, set lastSendData to 0 when resetting a call, along with
most of the other fields in a call, to indicate that the call has
never sent any data. As long as lastSendData is 0, the call will never
get aborted with RX_CALL_IDLE, and this situation will be avoided.
This ensures that this issue cannot happen, since rxi_ResetCall is
guaranteed to be called at some point whenever we reuse a call
structure for any reason.
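The effect of the reset can be shown with a simplified model. The struct and functions below are hypothetical stand-ins, not the Rx data structures; only the field name lastSendData and the error value -9 come from the description above, and the idle check is reduced to a single comparison.

```c
#define EXAMPLE_CALL_IDLE (-9)

/* Hypothetical, stripped-down call structure. */
struct example_call {
    long lastSendData;  /* seconds; 0 means the call has never sent data */
    int error;
};

/* Reset a call for reuse, clearing the stale send timestamp as well. */
void example_reset_call(struct example_call *call)
{
    call->lastSendData = 0;
    call->error = 0;
}

/* Abort the call as idle only if it has sent data and then gone quiet. */
int example_check_call(struct example_call *call, long now, long idleDeadTime)
{
    if (call->lastSendData != 0 && now - call->lastSendData > idleDeadTime) {
        call->error = EXAMPLE_CALL_IDLE;
        return -1;
    }
    return 0;
}
```

In this model a reused call that still carries an old timestamp is aborted by the first check, while the same call passes the check once it has been reset.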
Reviewed-on: http://gerrit.openafs.org/11557 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu> Reviewed-by: Jeffrey Altman <jaltman@your-file-system.com>
(cherry picked from commit 8c78a44cf5197ceee6907e947074973138c442f0)
Russ Allbery [Sat, 29 Jun 2013 21:27:55 +0000 (14:27 -0700)]
Fix restorevol crash on corrupt nDumpTimes value
If the number of dump times claimed in the volume header was greater
than MAXDUMPTIMES, restorevol would happily write over random stack
memory and crash. Sanity-check the loaded value and cap it to
MAXDUMPTIMES with a warning.
Bug found by Mayhem and reported by Alexandre Rebert.
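A sanity check of this shape can be sketched as follows. The function name, the warning text, and the limit of 50 are assumptions made for illustration; only the idea of capping the loaded count to MAXDUMPTIMES with a warning comes from the description above.

```c
#include <stdio.h>

/* Assumed limit for illustration; the real MAXDUMPTIMES may differ. */
#define EXAMPLE_MAXDUMPTIMES 50

/* Return a count that is safe to use as a bound on the dump-times array. */
unsigned int example_cap_dump_times(unsigned int claimed)
{
    if (claimed > EXAMPLE_MAXDUMPTIMES) {
        fprintf(stderr, "warning: nDumpTimes %u exceeds %d, capping\n",
                claimed, EXAMPLE_MAXDUMPTIMES);
        return EXAMPLE_MAXDUMPTIMES;
    }
    return claimed;
}
```

Capping before the read loop means a corrupt header can no longer drive writes past the end of a fixed-size array.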
In various places where we intentionally ignore the return values of system
calls and standard library routines, this changes the way in which we do so,
to avoid compiler warnings when building on Ubuntu 12.10, with gcc 4.7.2 and
eglibc 2.15-0ubuntu20.1.
Reviewed-on: http://gerrit.openafs.org/9980 Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Benjamin Kaduk <kaduk@mit.edu>
(cherry picked from commit 73cad3be0a3489237ab7e66d3b12c52ffb0b67d0)
Michael Meffie [Tue, 18 Feb 2014 20:23:54 +0000 (15:23 -0500)]
vos: cross-device link error message
Print a better diagnostic message for cross-device link errors, which
happen when a clone volume is not in the same partition as the
parent read-write volume.
Reviewed-on: http://gerrit.openafs.org/10850 Reviewed-by: D Brashear <shadow@your-file-system.com> Reviewed-by: Chas Williams - CONTRACTOR <chas@cmf.nrl.navy.mil> Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Jeffrey Altman <jaltman@your-file-system.com>
(cherry picked from commit da1597d74a0f56e35a156ec27df231f965934910)
Change-Id: I30cb0e87612732bfbce2c001831324d1a9e54409
Reviewed-on: http://gerrit.openafs.org/11587 Reviewed-by: Daria Phoebe Brashear <shadow@your-file-system.com> Tested-by: BuildBot <buildbot@rampaginggeek.com> Reviewed-by: Mark Vitale <mvitale@sinenomine.net> Reviewed-by: Stephan Wiesand <stephan.wiesand@desy.de>