Ben Kaduk [Sat, 3 Dec 2011 19:37:09 +0000 (14:37 -0500)]
FBSD: switch afsi_SetServerIPRank implementation
Upstream has removed the ia_net{,mask} elements from
struct in_ifaddr, so we can no longer use them directly.
Switch to passing an rx_ifaddr_t (i.e. struct ifaddr*) in instead,
as that uses a slightly different codepath which still works
for our purposes.
We compile the kernel module with -Werror, so storing a pointer
(memcpy return value) in an int is forbidden, hence the conditional
declaration of 't'.
Ben Kaduk [Sun, 13 Nov 2011 18:12:50 +0000 (13:12 -0500)]
FBSD: cleanup dvp locking for ISDOTDOT
This is a more correct version of c2ed2577f9c16df3088158fb593d7aab6e8690d0, which was reverted since
it caused build issues on some versions and kernel panics on others.
We do want to always unlock dvp before calling over the network
in the ISDOTDOT case, but be sure to use the proper spelling
for this operation (as the syntax has changed between FreeBSD versions).
This requires not unlocking dvp right after the afs_lookup() call if
it succeeds, letting us just lock the "child" vp (which is actually
the parent starting from '/') first, and then re-lock dvp.
The error case of afs_lookup() is handled correctly by this
logic. Before this change it was handled incorrectly: the code
attempted to recursively lock dvp, which causes a panic.
Ben Kaduk [Sun, 23 Oct 2011 15:22:07 +0000 (11:22 -0400)]
FBSD: typo fix
Gerrit/5572 added conditionals on __FreeBSD_version >= 900044, which
is (approximately) when a bunch of kernel API renames happened.
(There has since been a dedicated version bump to 900045 a month
or two post-facto, but 900044 should be fine for now.)
However, 900044 is not 90004.
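The effect of the dropped digit can be sketched as follows. This is an illustrative Python model of the version comparison, not the kernel code; the helper name and the sample version number for an older kernel are assumptions chosen for the example.

```python
# Illustrative model of a preprocessor-style version check.
# 900044 is the threshold named in the commit message;
# 90004 is the mistyped value that this commit corrects.
INTENDED_THRESHOLD = 900044
MISTYPED_THRESHOLD = 90004

# Hypothetical __FreeBSD_version for a kernel that predates the renames.
OLDER_KERNEL = 802000


def uses_renamed_api(freebsd_version, threshold):
    """Return True if the build would select the renamed kernel API."""
    return freebsd_version >= threshold
```

With the mistyped threshold, the older kernel also satisfies the comparison, so the conditional selects the renamed API on versions that do not have it.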
Andrew Deason [Tue, 15 Nov 2011 19:18:48 +0000 (13:18 -0600)]
afs: Leave cellnum alone for explicit mtpt cell
When a mountpoint is given an explicit cell, don't alter cellnum.
Cellnum represents the cell for the parent, and is used for
determining whether or not we're crossing a cell boundary.
Previously, this code forced the mount point to always be treated as
foreign (for a mountpoint prefixed with a cell name), or to always be
treated as local (for a mountpoint prefixed with a cell number).
Reviewed-on: http://gerrit.openafs.org/6051
Reviewed-by: Marc Dionne <marc.c.dionne@gmail.com>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
(cherry picked from commit e14dec55e6600edb60ce5184b4ab1f646c68947b)
Edward Z. Yang [Sun, 27 Nov 2011 00:32:51 +0000 (19:32 -0500)]
Linux: 3: Update specfile to know about 3.* kernels.
Update spec file to be consistent with acinclude.m4 with regard to
sysnames. We don't bother updating the code inside the legacy kernel
build section, as it doesn't get triggered for 3.* kernels (it should
probably get cleaned up at some point).
Also, fix a bug in the error message printed for an unrecognized kernel.
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
Reviewed-on: http://gerrit.openafs.org/6120
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Simon Wilkinson <sxw@inf.ed.ac.uk>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
(cherry picked from commit 0f9214164ad56bfe74d0f2cec8775a312f5128dd)
Marc Dionne [Tue, 22 Nov 2011 02:27:06 +0000 (21:27 -0500)]
Linux: make sure backing_dev_info is zeroed
The afs backing_dev_info structure is allocated dynamically
without zeroing out the contents. In particular there's no
guarantee that congested_fn is NULL, causing spurious oopses
when bdi_congested in the kernel tries to call it.
Edward Z. Yang [Tue, 18 Oct 2011 03:16:15 +0000 (23:16 -0400)]
linux: Update Packaging to build OpenAFS services for Fedora's systemd
Fedora 15 now uses systemd (see http://fedoraproject.org/wiki/Systemd)
for the OS init system. While it currently has backwards
compatibility with older SysV-style init scripts, future versions of
Fedora may no longer support them, and OS startup tends to be faster
with the systemd service units. Also, systemd runs all of a service's
processes within a Linux kernel cgroup
(see http://www.kernel.org/doc/Documentation/cgroups/cgroups.txt).
This change includes openafs-client.service and
openafs-server.service unit files for the client and server packages
respectively.
Client
- Loading the openafs module was moved into
/etc/sysconfig/modules/openafs-client.modules. This causes the OS to
load the module on boot. This is the preferred way for modules to be
loaded with Fedora. (See
http://docs.fedoraproject.org/en-US/Fedora/15/html/Deployment_Guide/sec-Persistent_Module_Loading.html
for more details)
- The CellServDB file is generated with sed rather than cat.
This change was made because systemd doesn't execute as a shell
script, but rather executes processes directly. Rather than invoking
a shell to concatenate the CellServDB.* files, they're written to the
CellServDB file using a sed one-liner.
- Do all of the proper kernel module loading and unloading.
Server
- Since systemd uses cgroups, when the service is shut down, all
processes in the openafs-server.service cgroup will be terminated.
The other changes are standard as per:
http://fedoraproject.org/wiki/Packaging:ScriptletSnippets#Systemd
Original version by Jonathan Billings <jsbillin@umich.edu>.
Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
Reviewed-on: http://gerrit.openafs.org/5637
Reviewed-by: Derrick Brashear <shadow@dementix.org>
Reviewed-by: Alex Chernyakhovsky <achernya@mit.edu>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
(cherry picked from commit 286ffa0d7c4d594ff107b70f9e930271c027a79e)
Marc Dionne [Sat, 29 Oct 2011 23:23:07 +0000 (19:23 -0400)]
Linux: 3.1: update RCU path walking detection in permission i_op
The permission() inode operation changed again with kernel 3.1,
back to the form it had before 2.6.38. This compiles fine,
but is missing the new way of detecting when we get called in
RCU path walking mode, resulting in system hangs.
Jeffrey Altman [Fri, 14 Oct 2011 13:10:19 +0000 (08:10 -0500)]
klog.krb5: enforce DES for rxkad
0. Always request a TGT regardless of the state of
writeTicketFile.
1. request des-cbc-crc when requesting a ticket for an
rxkad service principal
2. check the returned key length to ensure that it matches
the permitted length of an rxkad key. If not, generate
an error instead of overwriting memory and continuing.
FIXES 130278
Reviewed-on: http://gerrit.openafs.org/5619
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Tested-by: Andrew Deason <adeason@sinenomine.net>
Reviewed-by: Andrew Deason <adeason@sinenomine.net>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
(cherry picked from commit 3a9a5783cd1fd73902655f0876e2069b42688c94)
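The key length check in the klog.krb5 change above (item 2) can be sketched as follows. This is a minimal Python model, not the klog.krb5 code; the function and constant names are hypothetical. The only fact relied on is that a single-DES key, as used by rxkad, is 8 bytes long.

```python
# A single-DES key is 8 bytes long.
RXKAD_KEY_LENGTH = 8


def copy_rxkad_key(session_key):
    """Copy a session key into a fixed-size rxkad key buffer.

    Raise an error instead of writing past the buffer when the
    returned key has the wrong length (for example a longer key
    of some other encryption type).
    """
    if len(session_key) != RXKAD_KEY_LENGTH:
        raise ValueError(
            "key length %d does not match rxkad key length %d"
            % (len(session_key), RXKAD_KEY_LENGTH)
        )
    dest = bytearray(RXKAD_KEY_LENGTH)
    dest[:] = session_key
    return dest
```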
Andrew Deason [Wed, 2 Nov 2011 16:35:42 +0000 (11:35 -0500)]
Solaris: Specify ARCHFLAGS in CFLAGS
Various autoconf tests which use the C compiler may yield different
results depending on whether or not we are compiling for x86 or amd64
on Solaris (different libraries are available, structures may be
different, etc.). So, set CFLAGS depending on which arch we are
targeting, so the autoconf results are more consistent with the actual
compilation during the build.
Andrew Deason [Fri, 4 Nov 2011 17:42:33 +0000 (12:42 -0500)]
DAFS: Deal with exclusive-state volume headers
GetVolumeHeader assumes that headers on the LRU are not associated
with a volume in an exclusive state. This is known to not be true for
some cases when salvage requests are received over FSSYNC, and may
also not hold in other scenarios. It's easy to just skip such headers,
so skip them.
Andrew Deason [Thu, 3 Nov 2011 18:17:33 +0000 (13:17 -0500)]
salvager: Implement AskDAFS via SYNC flags
Instead of probing the DAFS-ness of the fileserver by probing which
FSSYNC opcodes it supports, detect DAFS-ness by looking at the SYNC
response header flags, which explicitly state whether or not the
endpoint is DAFS. This avoids unnecessary "protocol mismatch" log
messages when the endpoint is not DAFS.
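Detecting DAFS from a response flag rather than by probing opcodes can be sketched as a single bit test. This is a Python illustration only; the flag name and bit value below are assumptions for the example and are not taken from the SYNC protocol headers.

```python
# Hypothetical flag bit: assumed for this sketch, not the real constant.
SYNC_FLAG_DAFS = 0x01


def endpoint_is_dafs(response_flags):
    """Decide DAFS-ness from the flags word of a SYNC response header."""
    return bool(response_flags & SYNC_FLAG_DAFS)
```

Because the answer comes from a response the endpoint already sends, no unsupported opcode has to be issued, which is what avoids the "protocol mismatch" log messages.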
Andrew Deason [Wed, 9 Nov 2011 23:04:09 +0000 (17:04 -0600)]
volser: Preserve needsSalvaged during restore
Some of the routines during a volume restore may set needsSalvaged, if
an inconsistency is detected while writing the given volume data.
However, after the data is read, we set the volume header information
to what was found in the dump stream, ignoring any needsSalvaged that
may have been set.
To ensure that inconsistent volumes in this situation actually get
demand-salvaged (for DAFS) or offlined (non-DAFS), keep the value of
needsSalvaged in the header, if it was set.
Andrew Deason [Thu, 10 Nov 2011 17:58:12 +0000 (11:58 -0600)]
namei: Remove extraneous rmdir
We just unlinked the file, so we know we won't be able to rmdir() the
same thing. Give a path one level higher to
namei_RemoveDataDirectories, so we start rmdir()ing at the parent dir.
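The idea of starting the rmdir() walk at the parent directory can be sketched as follows. This is a Python model using the standard os module, not the namei code; the function names and the directory layout in demo() are hypothetical.

```python
import os
import shutil
import tempfile


def unlink_and_prune(path, stop_at):
    """Unlink `path`, then rmdir empty ancestors, starting at its parent.

    Stops at `stop_at` (which is never removed) or at the first
    directory that cannot be removed, e.g. because it is not empty.
    """
    os.unlink(path)
    removed = []
    current = os.path.dirname(path)  # start one level above the file
    while current != stop_at:
        try:
            os.rmdir(current)
        except OSError:
            break
        removed.append(current)
        current = os.path.dirname(current)
    return removed


def demo():
    """Build root/a/b/c/file plus root/a/keep, then prune from the file."""
    root = tempfile.mkdtemp()
    try:
        deep = os.path.join(root, "a", "b", "c")
        os.makedirs(deep)
        open(os.path.join(root, "a", "keep"), "w").close()
        target = os.path.join(deep, "file")
        open(target, "w").close()
        removed = unlink_and_prune(target, root)
        return [os.path.relpath(p, root) for p in removed]
    finally:
        shutil.rmtree(root)
```

In demo(), the walk removes the two empty directories and stops at the first one that still has an entry in it.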
Jeffrey Altman [Sat, 12 Nov 2011 18:45:08 +0000 (13:45 -0500)]
Windows: Track active RPCs per scache_t
It has been noticed that multiple RPCs can be active on
a cm_scache_t object at the same time. This is especially
true of directory objects with the redirector. Track the
number of active RPCs and use that number in cm_MergeStatus
when deciding whether or not to discard the cached data for
the object.
Jeffrey Altman [Fri, 28 Oct 2011 15:36:10 +0000 (11:36 -0400)]
Windows: out of date version not in current chunk
In buf_GetNewLocked(), the comparison to decide whether a
cm_buf_t is a member of the current chunk must take the data
version into account. If the data version is out of date, it
is not part of the current chunk and is an object that can be
safely recycled.
Jeffrey Altman [Thu, 27 Oct 2011 21:57:25 +0000 (17:57 -0400)]
Windows: only flush buffers on shutdown if running
If a service shutdown message is received prior to the
service entering the running state, do not attempt to
buf_CleanAndReset() because the required data structures
and locks are not initialized.
Jeffrey Altman [Tue, 25 Oct 2011 19:32:11 +0000 (15:32 -0400)]
Windows: Do not EEXIST exact match during rename
AFS Rename operations on the file server will delete a
target file if it exists. Do not prevent renames because
an exact match of the target name exists in the target
directory.
Instead of dropping the lock for read and reacquiring for write,
use lock_ConvertRToW(), which will make the change atomically if
it is possible or place the thread into the wait list if not.
The buffer free list least recently used queue has both
head and tail pointers. Use the proper versions of the queue
mgmt functions and do not handle edge cases as special cases.
The Windows cache manager tracks volumes by volume group.
Up to this point all volume location updates have been performed
by the volume name. What if the volume name was altered? In this
case the volume location information for the in-use volume IDs will
fail until a mount point to the new name is queried. Before
marking the volume group as non-existent, attempt to perform a
lookup using the volume ID for either the readwrite or readonly
volume.
Jeffrey Altman [Mon, 14 Nov 2011 15:23:53 +0000 (10:23 -0500)]
Windows: netidmgr krb5_cc_get_principal can fail
Do not dereference a NULL pointer if krb5_cc_get_principal fails.
On master this bug is fixed by e55d1774b1b5b27a3617467b5e2a24ee2be3a38c
but that change is after the conversion to the Kerberos Compatibility
SDK and cannot be applied to openafs-stable-1_6_x.
Andrew Deason [Fri, 4 Nov 2011 22:19:28 +0000 (17:19 -0500)]
volser: Remove debugging log messages
While the -log option to volserver is supposed to print additional log
information, it shouldn't spam the log with useless data. Remove some
of the log lines that are really more "debug" information, so we log
the same amount of information as in the 1.4 series.
Simon Wilkinson [Wed, 12 Oct 2011 13:47:14 +0000 (09:47 -0400)]
rx: Don't clear the receive queue when out of packets
We can end up discarding a receive queue that's been soft acked,
effectively taking back soft acks we sent. Whilst the RX
documentation says that a client can drop soft acked packets at
will, our RX implementation assumes that if the final packet in
a call has been soft acked, we won't clear the queue. If a client
clears the queue in this situation, the call will hang.
What *should* happen is that we should take necessary locks,
confirm that we have not soft-acked all of the packets in a flow,
and then discard, or, if we're just going to discard, error the
call.
Andrew Deason [Wed, 2 Nov 2011 15:28:35 +0000 (10:28 -0500)]
afs: Only use actual connections for GetTime calls
The for() loop that makes an RXAFS_GetTime call in afs_CheckServers
was iterating over conns and rxconns from 0 to j. However, 'j' here is
just the size of the allocated array, whereas 'nconns' is the number
of structures in the array actually initialized. So, just go up to
nconns to avoid using uninitialized connections and Rx connections.
This is a 1.6-only change. On master, the -settime code has been
completely removed in change
Id291f5f88b1ad84594706f2a1a02a933dddd0cb9.
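The difference between the two loop bounds in the GetTime change above can be sketched as follows. This is a Python model, not the afs_CheckServers code; the function names and the use of None for slots that were allocated but never set up are assumptions for the example.

```python
def get_times(conns, nconns, get_time):
    """Query only the first `nconns` slots; the rest were never set up."""
    return [get_time(conns[i]) for i in range(nconns)]


def fake_get_time(conn):
    """Stand-in for the RPC; fails loudly on an uninitialized slot."""
    if conn is None:
        raise RuntimeError("uninitialized connection used")
    return conn["time"]
```

Passing the allocated size of the array as the bound, instead of the number of initialized entries, makes the loop touch the unused slots.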
Adam Megacz [Fri, 23 Mar 2007 19:14:41 +0000 (12:14 -0700)]
make bozo honor -rxbind correctly
Bozo needs to call rxInitHost() rather than rxInit() when -rxbind is
present. This patch causes it to read NetInfo/NetRestrict earlier in
the startup process so it can make that decision.
Andrew Deason [Wed, 13 Apr 2011 17:39:19 +0000 (12:39 -0500)]
Suppress cmp component version error messages
When we use cmp to determine whether to replace
AFS_component_version_number.c, suppress stderr in addition to stdout,
to slightly reduce output during the build.
Reviewed-on: http://gerrit.openafs.org/4471
Reviewed-by: Simon Wilkinson <sxw@inf.ed.ac.uk>
Reviewed-by: Derrick Brashear <shadow@dementia.org>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
(cherry picked from commit 21578144e08d46eeec9a2944e92e8d0d7a6dba57)
Andrew Deason [Tue, 15 Mar 2011 19:24:01 +0000 (14:24 -0500)]
viced: Check vnode length on dir ops
The commit aadf69eabb1962496fa93745ab560a5b48cacd61 added checks on
vnode length whenever we read or write from a vnode. Add the same
check on directory vnodes when we modify the directory (whenever
entries are added or deleted).
Andrew Deason [Thu, 3 Mar 2011 22:02:47 +0000 (16:02 -0600)]
viced: Check vnode length on read and write
When reading or writing a file vnode, check that the length of the
vnode in the vnode index matches the size of the on-disk file
containing the data for the file. If it does not match, take the
volume offline (and for DAFS, demand-salvage it).
Andrew Deason [Wed, 16 Mar 2011 16:48:08 +0000 (11:48 -0500)]
DAFS: DFlushVolume outside of vol glock
DFlushVolume may traverse a long list of directory objects, and can
even hit the disk, so we should drop the glock for it. This should be
safe in DAFS, since we already transition the volume to an exclusive
state before doing this, and DFlushVolume only deals with structures
internal to the directory package and maintains its own locking.
Andrew Deason [Fri, 1 Apr 2011 18:43:13 +0000 (13:43 -0500)]
afs: Retry unlock after afs_StoreAllSegments
HandleFlock calls afs_StoreAllSegments when unlocking an exclusive
flock lock. This can drop the write lock on avc, so we must
effectively retry the entire lock operation again, since the world may
have changed while we were waiting to reacquire the lock on avc. So,
retry once all of the lock checks up to that point, to ensure that a
lock on the file actually still exists.
Andrew Deason [Fri, 1 Apr 2011 21:43:24 +0000 (16:43 -0500)]
afs: Avoid memory leak on recursive write flock
When a process requests an exclusive lock on a file on which it
already holds an exclusive lock, we basically perform a no-op. However,
HandleFlock was allocating a new SimpleLocks and attaching it to
avc->slocks, without freeing the old SimpleLocks structure.
Since we don't need to do anything if we already hold an exclusive
lock, just break out of the loop right away when we detect that
scenario. Thus we avoid adding a new structure to avc->slocks, and we
avoid a memory leak.
Andrew Deason [Thu, 24 Mar 2011 15:22:52 +0000 (10:22 -0500)]
DAFS: Correct FSYNC_VOL_QUERY_VOP checks
Check that the given partition matches the vp partition, and ensure
the vp is not in an exclusive state when we check the state.
Otherwise, we may return pending vol ops for a volume on a different
partition, or we may incorrectly return that there is no pending vol
op when in fact the volume does not exist at all.
Andrew Deason [Wed, 23 Mar 2011 22:25:03 +0000 (17:25 -0500)]
salvager: Give back volumes when exiting early
Sometimes the salvager exits a bit earlier than normal. For instance,
when no applicable inodes are found for a volume group, or if the
-inodes command line option was given. In these cases, we have already
checked out singleVolumeNumber from the fileserver (if we're salvaging
a single VG), so we need to give it back. So, give it back in those
instances.
Andrew Deason [Wed, 23 Mar 2011 21:46:47 +0000 (16:46 -0500)]
DAFS: Do not record vol ops for DELETED vols
When a volume is VOL_STATE_DELETED, it effectively does not exist, so
there is little point in recording a vp->pending_vol_op structure for
it. Just let callers checkout the volume as they would a nonexistent
volume: without recording anything about the operation.
This just reduces some edge cases and confusing debugging info, so we
don't have to worry about cleaning up pending_vol_op structures for
nonexistent volumes.
Andrew Deason [Wed, 23 Mar 2011 21:12:20 +0000 (16:12 -0500)]
salvager: Do not AskOnline nonexistent volumes
If singleVolumeNumber is not in our volume summary list, then the
singleVolumeNumber volume does not exist. So, don't try to bring it
back online. Still try to make sure we don't have the volume
checked out, though: issue an AskDelete to ensure that it's not
checked out and that the fileserver does not think it exists.
Change AskDelete so we don't care if we tried to delete a volume that
the fileserver thinks already doesn't exist. Change the FSYNC_VOL_DONE
handler so it does not complain about already-deleted volumes.
Andrew Deason [Thu, 10 Mar 2011 23:59:39 +0000 (17:59 -0600)]
vol: Handle large volume IDs in VLockFile
VLockVolumeByIdNB currently cannot handle volume IDs larger than
2^31-1. Fix this by using struct flock64, F_SETLKW64, and F_SETLK64 in
the VLockFile functions where possible.
Thanks to Simon Wilkinson for pointing out F_SETLK64.
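Why a 32-bit signed value cannot carry a large volume ID can be sketched with ctypes. This is an illustration in Python, not the VLockFile code. It assumes that the volume ID ends up in a signed offset field of the lock request, which the commit message implies but does not spell out; the helper names are hypothetical.

```python
import ctypes


def as_signed_32(value):
    """Value as seen through a 32-bit signed field (wraps silently)."""
    return ctypes.c_int32(value).value


def as_signed_64(value):
    """Value as seen through a 64-bit signed field."""
    return ctypes.c_int64(value).value


# Smallest volume ID that no longer fits in 31 bits.
FIRST_LARGE_ID = 2**31
```

A 64-bit field, as in struct flock64, holds every 32-bit volume ID unchanged, while the 32-bit view of FIRST_LARGE_ID is negative.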
Andrew Deason [Wed, 2 Mar 2011 19:12:25 +0000 (13:12 -0600)]
Give a default reason in *sync-debug
If no -reason is given for fssync-debug calls, we currently just
transmit garbage to the fileserver or salvageserver. Instead, give a
default (the *_WHATEVER constant), so we do something consistent.
Andrew Deason [Thu, 3 Feb 2011 21:40:48 +0000 (15:40 -0600)]
ConvertROtoRW: Use old copyDate for creationDate
When we convert an RO volume to an RW, currently we just copy the
copyDate and creationDate from the RO metadata into the RW. But the
copyDate and creationDate fields have different meanings for RW and RO
volumes: for ROs, the creationDate is merely the last time the data
was updated from the RW during a release operation.
So, if the copyDate is older than the creationDate, use the copyDate
as the new RW creationDate instead. This will probably not match the
creationDate of the original RW, but it will be closer to it, and it
will more accurately represent the conceptual "created time" of the
new RW.
Doing this can avoid forcing an unnecessary full dump on a subsequent
release of the resultant RW volume, since the creationDate is more
accurate.
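The date selection described above amounts to taking the older of the two timestamps. A minimal Python sketch, with a hypothetical function name and timestamps treated as plain integers:

```python
def rw_creation_date(ro_copy_date, ro_creation_date):
    """Pick the creationDate for an RW volume converted from an RO.

    If the RO's copyDate is older than its creationDate, use the
    copyDate; otherwise keep the creationDate.
    """
    if ro_copy_date < ro_creation_date:
        return ro_copy_date
    return ro_creation_date
```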
Andrew Deason [Thu, 12 Aug 2010 19:38:55 +0000 (14:38 -0500)]
libafs: Set tvcp->callback before BulkStatus
When we call InlineBulkStatus or BulkStatus, we currently do not touch
tvcp->callback for any of the vcaches before making the call. This can
cause us to not notice an InitCallBackState issued by the fileserver
before the BulkStatus call returns, since the InitCallBackState
handler looks at tvcp->callback to determine what vcaches to clear
callbacks for. In turn, this can cause us to think we have a callback
agreement with the fileserver on one of the BulkStatus'd files, when
the fileserver does not actually have such a callback agreement.
So, set tvcp->callback to the server we are contacting, so if we get
an InitCallBackState call from that fileserver, the CBulkFetching
state will be cleared, and we will correctly discard the callback
information for that vcache.
Andrew Deason [Wed, 24 Nov 2010 15:03:19 +0000 (10:03 -0500)]
ubik: Log a message when we replay the trans log
It can be helpful to know that an interrupted transaction was replayed
on startup, and this should be rare. So log a message when that
happens, indicating what db version we replayed to.
Andrew Deason [Wed, 24 Nov 2010 14:36:05 +0000 (09:36 -0500)]
ubik: Replay the transaction log label correctly
Commit eec0d94f519b3e27f255b9b7a637df043951424e fixed the transaction
replay log code to correctly identify valid transaction logs on
little-endian systems, but missed ntohl'ing the database label read in
a LOGEND opcode. Fix that, so the database is labelled correctly when
replayed from a transaction log.
And while we're here, actually pass a struct ubik_version* to
adbase->setlabel, to make it a little more clear what's happening.
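What a missing ntohl looks like on a little-endian host can be sketched with the struct module. This is a Python illustration, not the ubik code. It assumes the label is a pair of 32-bit integers (epoch, counter) stored in network byte order; the function names are hypothetical.

```python
import struct


def parse_label_network_order(raw):
    """Decode (epoch, counter) from network (big-endian) byte order."""
    return struct.unpack("!ii", raw)


def parse_label_without_swap(raw):
    """Decode the same bytes as a little-endian host would without ntohl."""
    return struct.unpack("<ii", raw)


def make_label(epoch, counter):
    """Encode a label the way it is assumed to be stored in the log."""
    return struct.pack("!ii", epoch, counter)
```

Reading the bytes without the swap yields a label whose counter no longer matches what was written, so the replayed database would be labelled with the wrong version.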
Andrew Deason [Wed, 1 Sep 2010 20:10:56 +0000 (15:10 -0500)]
ubik: Record the last write tid in writeTidCounter
ubik is currently tracking writeTidCounter for write transactions
separately from regular transactions (assigned from tidCounter).
Specifically, tidCounter is incremented twice for each transaction,
but writeTidCounter is incremented twice only for write transactions.
As a result, writeTidCounter and tidCounter tend to drift far apart.
This is a problem, since the tid for DISK_* calls uses the transaction
id of the current transaction (based on tidCounter), and VOTE_Beacon
uses writeTidCounter for its transaction id. So, in effect, the tid in
VOTE_Beacon is completely bogus and unrelated to the transaction id of
the actual current write transaction. This can cause valid write
transactions to become invalidated when tidCounter becomes negative,
since VOTE_Beacon will send a positive tid, and if there is a current
in-flight write transaction with a negative tid, SVOTE_Beacon will
deem the transactions unequal and will abort the write transaction.
So instead, record the transaction id counter for the last write
transaction in writeTidCounter. This way, when we call VOTE_Beacon, we
will use the correct transaction id counter for the current write
transaction, and SVOTE_Beacon on the remote site will not invalidate
the transaction.
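The drift between the two counters, and the effect of recording the last write tid instead, can be sketched with a small simulation. This is a Python model of the behaviour described above, not the ubik code; the class, its attribute names, and the 32-bit wraparound helper are assumptions for the example.

```python
import ctypes


def wrap32(value):
    """Model a signed 32-bit counter that wraps on overflow."""
    return ctypes.c_int32(value).value


class TidCounters:
    def __init__(self, record_last_write, start=0):
        self.tid_counter = start
        self.write_tid_counter = 0
        self.record_last_write = record_last_write

    def begin(self, is_write):
        """Start a transaction and return its tid counter."""
        self.tid_counter = wrap32(self.tid_counter + 2)
        if is_write:
            if self.record_last_write:
                # New behaviour: remember the tid of this write.
                self.write_tid_counter = self.tid_counter
            else:
                # Old behaviour: a separate counter for writes only.
                self.write_tid_counter = wrap32(self.write_tid_counter + 2)
        return self.tid_counter
```

After a few read transactions followed by a write, the old scheme reports a write tid unrelated to the one actually in flight, while the new scheme reports the same value.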
Andrew Deason [Wed, 8 Sep 2010 19:32:35 +0000 (14:32 -0500)]
DAFS: raise vhashsize limit
Raise the maximum specifiable vhashsize to 28 (from 14). Specifying a
vhashsize over 14 can be reasonable if you expect to have a few
million volumes on a fileserver.
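The arithmetic behind the new limit can be sketched as follows, assuming that vhashsize is the base-2 logarithm of the number of volume hash buckets (the commit message does not state this). The function names and the volume count are hypothetical.

```python
def bucket_count(vhashsize):
    """Number of hash buckets, assuming vhashsize is log2 of the count."""
    return 1 << vhashsize


def average_chain_length(volumes, vhashsize):
    """Average number of volumes per bucket for an even distribution."""
    return volumes / bucket_count(vhashsize)
```

Under that assumption, a vhashsize of 14 gives 16384 buckets, so a few million volumes means hundreds of volumes per bucket on average, while a larger vhashsize brings the average below one.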
Andrew Deason [Tue, 8 Mar 2011 22:59:32 +0000 (16:59 -0600)]
SOLARIS: Perform daemon syscalls as kernel threads
Add AFS_SUN5_ENV to the list of platforms where AFS_DAEMONOP_ENV is
defined. Implement the necessary functionality so we spawn kernel
threads when a daemon syscall is called. Remove the rxk_Listener
wrapper, since it will be called in a separate thread via the
afs_DaemonOp interface.
Andrew Deason [Tue, 8 Mar 2011 21:37:17 +0000 (15:37 -0600)]
libafs: Consolidate afs_DaemonOp code
Create the AFS_DAEMONOP_ENV define to simplify the logic of when we
perform afs_DaemonOp-y code paths. Also create the daemonOp_common
function, to perform the pre-fork operations that are common
between platforms.
Ben Kaduk [Sat, 8 Oct 2011 21:16:26 +0000 (17:16 -0400)]
FBSD: deal with kernel API rename
Upstream decided to rename the kernel functions that implement
syscalls to have a sys_ prefix (including afs3_syscall!).
We use a couple of them, so we need to conditionalize accordingly.
Unfortunately, __FreeBSD_version was not bumped with the change,
so we use something close to it and hope it's close enough.
allow cloning of any volume to any volume with same parent ID
remove checks to disallow cloning of ro volumes to rw volumes,
which allows cloning of any volume within the same parent ID
grouping, including allowing destruction of newer version of the
volumes.
remove check for disallowing clones of backup or ro volumes
removes the if-statement ensuring that the volume being cloned is
not a backup volume, nor a read-only volume. This allows clones
from any type of volume to a given volume. Parent volume meta-data
is maintained, only the cloneId value changes.
Andrew Deason [Mon, 29 Aug 2011 22:41:31 +0000 (17:41 -0500)]
DAFS: Remove VOL_SALVAGE_INVALIDATE_HEADER
Currently VRequestSalvage_r takes a flag,
VOL_SALVAGE_INVALIDATE_HEADER, which causes the header for the
specified volume to be freed (via FreeVolumeHeader). This is almost
never safe to do, since there may be other users of the specified
volume that can be accessing the volume header at the same time.
There is also no reason to invalidate the header at the time of the
VRequestSalvage_r call, since the header must be invalidated when we
detach the volume (other utilities may change header information). So,
if there are any problems in the future because we do not invalidate
the header at the time of VRequestSalvage_r, it is the fault of the
detachment/offlining logic.
So, remove VOL_SALVAGE_INVALIDATE_HEADER and all of its users. Take
this opportunity to correctly document the VRequestSalvage_r flags
in the VRequestSalvage_r comment, as it was previously missing the
VOL_SALVAGE_NO_OFFLINE flag.
Michael Meffie [Thu, 13 Oct 2011 16:23:35 +0000 (12:23 -0400)]
DAFS: fssync online requires a partition name argument
fssync-debug online silently fails when run without a partition name.
Check for the required partition name on the server side and the client
side. Report errors back to the client when the server side fails to
pre-attach the volume.
Andrew Deason [Tue, 11 Oct 2011 15:51:14 +0000 (10:51 -0500)]
volser: Remove ExtractVolId
volser was using its own function to extract a volume ID from a
filename string, and was using atol to do so. The ato* family of
functions can have problems with larger volume IDs, not to mention a
lack of error checking, so don't use it. Since we already have the
function VolumeNumber in the vol package to do the very same thing,
just use that instead.
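The kind of checked parsing that atol does not provide can be sketched as follows. This is a Python illustration, not the VolumeNumber function from the vol package. The filename pattern (a "V", ten decimal digits, then ".vol") and the unsigned 32-bit ID range are assumptions for the example, and the function name is hypothetical.

```python
import re

# Assumed header filename pattern: "V" + ten digits + ".vol".
_VOL_HEADER_RE = re.compile(r"V(\d{10})\.vol")

# Assumed upper bound: volume IDs are unsigned 32-bit values.
MAX_VOLUME_ID = 2**32 - 1


def parse_volume_id(name):
    """Extract a volume ID from a header filename, with error checking."""
    match = _VOL_HEADER_RE.fullmatch(name)
    if match is None:
        raise ValueError("not a volume header name: %r" % name)
    volume_id = int(match.group(1))
    if volume_id > MAX_VOLUME_ID:
        raise ValueError("volume ID out of range: %d" % volume_id)
    return volume_id
```

Unlike an atol-style conversion, malformed names and out-of-range values are reported as errors, and IDs above 2^31-1 are returned unchanged.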
Andrew Deason [Mon, 3 Oct 2011 18:10:44 +0000 (13:10 -0500)]
viced: Check for HOSTDELETED in stillborn check
h_FindClient_r checks the connection rock for a client object twice.
First it sees if we already have a client object, and if we don't, we
effectively create one (or find a suitable one). Then we check again,
to see if someone else set the rock while we were creating a client
structure.
Currently, the first check checks if client->host->hostFlags has
HOSTDELETED set, but the second check does not. So, if the host
associated with the client has been deleted by someone else, currently
we will unnecessarily log a "stillborn client" message, and we will
continue to use the deleted host. If the host continues to be held by
someone, we will run into the same situation repeatedly on future
requests until all of the host references go away.
To fix this, also ignore HOSTDELETED clients when performing the
stillborn race check.
Andrew Deason [Fri, 14 Oct 2011 16:32:34 +0000 (11:32 -0500)]
vos offline: Bring volume back online for -busy
vos offline is supposed to bring a volume back online from "busy"
status before exiting, as volumes should not be in "busy" status for
extended periods of time. This was being enforced by requiring that
-sleep be specified; however, -sleep only results in the volume being
brought back online if a non-zero sleep time was specified. So, make
sure the volume is brought back online if -busy was specified.
do set errors when we bomb out early
do not unlock and return early when we happen to do a correct zero
length read
do set errors the kernel can deal with if we're feeding a page routine
Simon Wilkinson [Sun, 23 Oct 2011 23:07:33 +0000 (19:07 -0400)]
rpm: Turn on debugging
Now that we build with a blank CFLAGS line, we need to make sure to
actually turn on debugging in the build system, so that our debuginfo
files are vaguely useful.
Simon Wilkinson [Wed, 12 Oct 2011 13:50:18 +0000 (09:50 -0400)]
rx: ackall handling
If we ACKALL a stream, then we're sending a hard ACK for all of the
packets in the stream. We shouldn't send that hard ACK, and then a
load of soft ACKs for packets that don't actually exist.