git.michaelhowe.org Git - packages/o/openafs.git/log

viced: Remove extraneous h_AHTAHT_r in h_GetHost_r

We added this address to the host with an addInterfaceAddr_r call just
a few lines before, which adds the host to the address hash table.
Another call to h_AddHostToAddrHashTable_r is pure overhead and
confusing.

Reviewed-on: http://gerrit.openafs.org/6729
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Reviewed-by: Alistair Ferguson <alistair.ferguson@mac.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
(cherry picked from commit f52c33ea10de8d1d07a9c4805366283e6ca635dc)

Change-Id: Ib97718a42f9997a1fa257533296c62f3d618e1a7
Reviewed-on: http://gerrit.openafs.org/6769
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Alistair Ferguson <alistair.ferguson@mac.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

viced: Set h_GetHost_r probefail if MPAA_r fails

Currently, in h_GetHost_r, if we get a connection whose address does
not match an extant host, but the reported uuid does, we ProbeUuid the
old host. If it fails, we call MultiProbeAlternateAddress_r and set
'probefail'. Later on, if 'probefail' is set, we always add the
connection address to the host, and remove the host->host,host->port
address from the host.

However, this is not always correct. Consider the following situation.

We have an existing host A that has primary address 1.1.1.1, and also
has addresses 1.1.1.2 and 1.1.1.3 on the interface list but not on the
hash table. Say that host A stops responding on 1.1.1.1, and a
connection comes in from 1.1.1.2. We ProbeUuid 1.1.1.1 and get a
failure, so we call MultiProbeAlternateAddress_r.
MultiProbeAlternateAddress_r probes via rx_Multi the addresses 1.1.1.2
and 1.1.1.3. Say that 1.1.1.3 responds first, and responds
successfully, so MultiProbeAlternateAddress_r sets 1.1.1.3 to be the
primary address for the host.

After MultiProbeAlternateAddress_r returns, 'probefail' is set. A few
lines down, we see that oldHost->host does not match haddr, and
'probefail' is set, so we add 1.1.1.2 to the interface list, and
remove 1.1.1.3, and set 1.1.1.2 to be the primary address, even though
1.1.1.3 is the address we most recently 'know' is correct.

To fix this, only set 'probefail' if MultiProbeAlternateAddress_r also
fails after the failed ProbeUuid call. Conceptually this makes sense,
since if MultiProbeAlternateAddress_r succeeds, it found an address
that responds successfully to ProbeUuid, and it sets that address to
be the primary address. Therefore, after MultiProbeAlternateAddress_r
returns success, the situation is the same as if the 'good' address
was already the primary address, and the ProbeUuid call succeeded, so
'probefail' should be cleared.
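
As a rough illustration only (this is not the viced code), the rule
above can be sketched as a small predicate. The name sketch_probefail
and its boolean parameters are hypothetical; they stand in for the
results of ProbeUuid and MultiProbeAlternateAddress_r.

```c
#include <stdbool.h>

/* Hypothetical helper: should 'probefail' be set after probing the old
 * host? probe_uuid_ok is the result of probing the primary address;
 * multi_probe_ok is the result of probing the alternate addresses and
 * is only meaningful when probe_uuid_ok is false. */
bool sketch_probefail(bool probe_uuid_ok, bool multi_probe_ok)
{
    if (probe_uuid_ok)
        return false;   /* primary address answered, nothing failed */

    /* The primary address failed. If an alternate address answered, it
     * has become the primary address, so treat that as success too. */
    return !multi_probe_ok;
}
```

With this rule, 'probefail' is set only when neither the primary
address nor any alternate address answers.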

Reviewed-on: http://gerrit.openafs.org/6728
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
Reviewed-by: Alistair Ferguson <alistair.ferguson@mac.com>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
(cherry picked from commit 3c803580bb503c7650f7b138c1b3f2eafd92b985)

Change-Id: I6554688447e7e62874e45a00a4c1faf957e29cb6
Reviewed-on: http://gerrit.openafs.org/6768
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Alistair Ferguson <alistair.ferguson@mac.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

viced: Correctly update addrs on alt addr probe

The functions MultiBreakCallBackAlternateAddress_r and
MultiProbeAlternateAddress_r try to find a valid address in a host's
interface list of addrs. If they find one, they update host->host and
host->port. However, they do so just by changing those fields directly
and by calling h_DeleteHostFromAddrHashTable_r and
h_AddHostToAddrHashTable_r. This leaves the old host->host, host->port
on the interface list, and leaves it marked as 'valid'. Similarly, the
new host and port may still be marked as not 'valid'.

This can result in the host being on the addr hash table via an
address that is not on the host's interface list. After the above
situation occurs, we may call

  removeInterfaceAddr_r(host, host->host, host->port);

and then update host->host and host->port, which happens in a variety
of places. Since host->host, host->port is not marked as valid in the
interface list, it is not removed from the addr hash table, but it is
removed from the interface list. Eventually, this can cause the host
to be referenced from the addr hash table even after it has been
freed.

Since this can result in hash table entries pointing to the 'wrong'
host, this can result in FileLog messages such as:

Sun Feb  5 03:16:35 2012 Removing address that does not belong to host 0xdeadbeefdead (1.2.3.4:7001).

And bogus instances of the message:

Sun Feb  5 03:16:36 2012 CB: new identity for host 0xdeadbeefdead (1.2.3.4:7001), deleting(1 baadcafe 12345678-9abc-def0-12-34-456789abcdef fedcba98-76543210f-ed-cb-a9876543210f)

To fix this, make MultiBreakCallBackAlternateAddress_r and
MultiProbeAlternateAddress_r update the address list the same way as
all of the other code in host.c does: by adding the new address with
addInterfaceAddr_r, removing the old one with removeInterfaceAddr_r,
and updating host->host and host->port.
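
A minimal sketch of that add/remove/update sequence, using simplified
stand-in structures rather than the real host.c ones. Every sketch_*
name below is hypothetical, and the address hash table is left out
entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_MAX_ADDRS 4

/* Hypothetical, simplified stand-ins for the viced host structures. */
struct sketch_addr {
    unsigned int ip;
    unsigned short port;
    bool valid;
};

struct sketch_host {
    unsigned int host;      /* primary address */
    unsigned short port;    /* primary port */
    struct sketch_addr addrs[SKETCH_MAX_ADDRS];
    size_t naddrs;
};

/* Add ip:port to the interface list, or mark it valid if it is already
 * there. Returns 0 on success, -1 if the list is full. */
int sketch_add_addr(struct sketch_host *h, unsigned int ip,
                    unsigned short port)
{
    size_t i;

    assert(h != NULL);
    for (i = 0; i < h->naddrs; i++) {
        if (h->addrs[i].ip == ip && h->addrs[i].port == port) {
            h->addrs[i].valid = true;
            return 0;
        }
    }
    if (h->naddrs == SKETCH_MAX_ADDRS)
        return -1;
    h->addrs[h->naddrs].ip = ip;
    h->addrs[h->naddrs].port = port;
    h->addrs[h->naddrs].valid = true;
    h->naddrs++;
    return 0;
}

/* Drop ip:port from the interface list, if present. */
void sketch_remove_addr(struct sketch_host *h, unsigned int ip,
                        unsigned short port)
{
    size_t i;

    assert(h != NULL);
    for (i = 0; i < h->naddrs; i++) {
        if (h->addrs[i].ip == ip && h->addrs[i].port == port) {
            h->addrs[i] = h->addrs[h->naddrs - 1];
            h->naddrs--;
            return;
        }
    }
}

/* Switch the primary address: add the new one, remove the old one, and
 * only then update host/port, so the list and the primary agree. */
int sketch_set_primary(struct sketch_host *h, unsigned int ip,
                       unsigned short port)
{
    unsigned int old_ip = h->host;
    unsigned short old_port = h->port;

    if (ip == old_ip && port == old_port)
        return 0;
    if (sketch_add_addr(h, ip, port) != 0)
        return -1;
    sketch_remove_addr(h, old_ip, old_port);
    h->host = ip;
    h->port = port;
    return 0;
}
```

After sketch_set_primary returns 0, the old primary is gone from the
list and the new primary is present and marked valid.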

Reviewed-on: http://gerrit.openafs.org/6727
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Reviewed-by: Alistair Ferguson <alistair.ferguson@mac.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
(cherry picked from commit 7a6efc9bfcd955901d19274cc96f9a1b67f54f95)

Change-Id: I3bf82f116bc2dd979e1e93cea58a4c74b0a2023d
Reviewed-on: http://gerrit.openafs.org/6767
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Alistair Ferguson <alistair.ferguson@mac.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

viced: Delete dup host before probing old host

Currently, when the fileserver gets a new connection from an address
not on the addr hash table, we allocate a new host structure and add
that host to the addr hash table. If we then find that that host's
uuid matches the uuid of an extant host, we do the following:

- probe the old host with the uuid, and MultiProbeAlternateAddress_r
   if the probe fails

- mark the duplicate host as HOSTDELETED

- manipulate the interface lists

Consider, for example, that we have an extant host ('oldHost') with
address 1.2.3.4:7001, but with 5.6.7.8:7001 on its alternate interface
list. At some point, the 1.2.3.4:7001 interface goes away or becomes
unreachable. A new connection comes in from that same host on
5.6.7.8:7001.

What will happen is we create a new host for address 5.6.7.8:7001, and
then detect the uuid collision. When we try to probe the old address
of 1.2.3.4:7001, it will fail, and we will try to
MultiProbeAlternateAddress_r. MultiProbeAlternateAddress_r will
determine that the alternate address 5.6.7.8:7001 responds
successfully to the probe, and it tries to set 5.6.7.8:7001 to be the
primary address of 'oldHost', and add 'oldHost' to the addr hash table
under 5.6.7.8:7001.

But the "new" host from the incoming connection is already hashed on
the address hash table under 5.6.7.8:7001, so the
h_AddHostToAddrHashTable_r call in MultiProbeAlternateAddress_r fails.
Since we later delete the new duplicate host, this results in
5.6.7.8:7001 being the primary address for the host, but that address
is not anywhere in the address hash table.

This behavior can be seen by the following pair of FileLog messages:

Wed Feb  1 11:02:38 2012 CB: ProbeUuid for 0xdeadbeefdead (1.2.3.4:7001) failed -01
Wed Feb  1 11:02:38 2012 h_AddHostToAddrHashTable_r: refusing to hash host beefdead, baadcafe (5.6.7.8:7001) already hashed

While those messages do not necessarily indicate this problem, this
problem will result in those messages.

To fix this, mark the duplicate host as HOSTDELETED before we do any
probing on 'oldHost'. This way, if MultiProbeAlternateAddress_r tries
to add 'oldHost' to the addr hash table under 5.6.7.8:7001, it will be
able to do so successfully, since the old duplicate host is deleted.
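
To illustrate why the ordering matters, here is a toy model of an
address table. It is only a sketch: the sketch_* names are
hypothetical, and it assumes (as the description above implies) that
an entry owned by a deleted host does not block hashing another host
under the same address.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_SLOTS 8

/* Hypothetical stand-ins; not the viced structures. */
struct sketch_host {
    bool deleted;           /* plays the role of HOSTDELETED */
};

struct sketch_slot {
    unsigned int ip;
    unsigned short port;
    struct sketch_host *owner;
};

struct sketch_table {
    struct sketch_slot slots[SKETCH_SLOTS];
    size_t used;
};

/* Try to hash 'host' under ip:port. Returns true if 'host' owns the
 * address afterwards, false if a live host already owns it or the
 * table is full. */
bool sketch_hash_host(struct sketch_table *t, unsigned int ip,
                      unsigned short port, struct sketch_host *host)
{
    size_t i;

    assert(t != NULL && host != NULL);
    for (i = 0; i < t->used; i++) {
        struct sketch_slot *s = &t->slots[i];

        if (s->ip != ip || s->port != port)
            continue;
        if (s->owner == host)
            return true;        /* already hashed by this host */
        if (!s->owner->deleted)
            return false;       /* refuse: a live host owns it */
        s->owner = host;        /* take over from the deleted host */
        return true;
    }
    if (t->used == SKETCH_SLOTS)
        return false;
    t->slots[t->used].ip = ip;
    t->slots[t->used].port = port;
    t->slots[t->used].owner = host;
    t->used++;
    return true;
}
```

In this model, hashing 'oldHost' under the duplicate's address fails
while the duplicate is live, and succeeds once the duplicate has been
marked deleted first.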

Reviewed-on: http://gerrit.openafs.org/6726
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
Reviewed-by: Alistair Ferguson <alistair.ferguson@mac.com>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
(cherry picked from commit 9754c4e15fb9073ed9f95d5d4242d311eb65d717)

Change-Id: I35d41c91e496086377065f862021a5bb3fd221ef
Reviewed-on: http://gerrit.openafs.org/6766
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Alistair Ferguson <alistair.ferguson@mac.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

vos: allow releases without offline time

allow releases using dumps to clones to avoid offline time

Reviewed-on: http://gerrit.openafs.org/6254
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Alistair Ferguson <alistair.ferguson@mac.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
(cherry picked from commit 13a4f2b18bb84d05773529a794371d29f64570ab)

Change-Id: Iec0f2d882dc2ac9a11ed4ca282cb2424db052803
Reviewed-on: http://gerrit.openafs.org/6765
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Alistair Ferguson <alistair.ferguson@mac.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

vos: refactor code

change vos to remove lots of duplicated code for volume deletes and clones

Reviewed-on: http://gerrit.openafs.org/6253
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Alistair Ferguson <alistair.ferguson@mac.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
(cherry picked from commit 8d618dceeefacbeb37c4ef3b1f9a8e80552311aa)

Change-Id: I2c26dce796f93c8c993148a94d21dce8608e8c43
Reviewed-on: http://gerrit.openafs.org/6764
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Alistair Ferguson <alistair.ferguson@mac.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

Rx: Avoid lastBusy/PEER_BUSY discrepancy

If an rx call has the RX_CALL_PEER_BUSY flag set, but the call's
conn->lastBusy is not set, we can easily cause an rx caller to loop
infinitely. rx_NewCall will see that lastBusy for a call channel is
not set, and will use that call channel, but rxi_CheckBusy will note
that the call appears busy and that there are non-busy call channels
on the same conn, and so will return RX_CALL_BUSY.

This can currently happen in rxi_ResetCall, since we set
RX_CALL_PEER_BUSY on the call again if the call had that flag set when
rxi_ResetCall was called. If we are calling rxi_ResetCall with
'newcall' set, the passed in call is unrelated to the new call, since
it was obtained from the free list. Thus, the busy-ness of the call
should be ignored. Fix this by only paying attention to the incoming
RX_CALL_PEER_BUSY flag if 'newcall' is not set.

Also prevent this from happening by clearing RX_CALL_PEER_BUSY in
rx_NewCall when we select a call and clear lastBusy for that call.
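
The two changes can be sketched roughly as follows. This is an
illustration under assumptions, not the rx code: the flag value, the
sketch_channel structure and both function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_CALL_PEER_BUSY 0x1u  /* hypothetical flag bit */

/* Hypothetical stand-in for one call channel on a connection. */
struct sketch_channel {
    unsigned int flags;
    unsigned int last_busy;     /* 0 means "not known to be busy" */
};

/* Flags a call should carry after a reset. The busy flag survives only
 * when the same call is being reset; a call taken from the free list
 * ('newcall' set) is unrelated to the new call, so its old busy state
 * is dropped. */
unsigned int sketch_flags_after_reset(unsigned int old_flags, int newcall)
{
    unsigned int flags = 0;

    if (!newcall && (old_flags & SKETCH_CALL_PEER_BUSY))
        flags |= SKETCH_CALL_PEER_BUSY;
    return flags;
}

/* Selecting a channel for a new call clears both pieces of busy state
 * together, so they cannot disagree. */
void sketch_select_channel(struct sketch_channel *ch)
{
    assert(ch != NULL);
    ch->last_busy = 0;
    ch->flags &= ~SKETCH_CALL_PEER_BUSY;
}
```

The point of the sketch is the invariant: the busy flag and the
last-busy marker are cleared together whenever a channel is selected.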

Reviewed-on: http://gerrit.openafs.org/6707
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Alistair Ferguson <alistair.ferguson@mac.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
(cherry picked from commit 2a4c6c3b9e1dc30d5599e67e02237a1aeef8a0f0)

Change-Id: I60d76469bc3dcf764e67524f39b3c55894e7ce99
Reviewed-on: http://gerrit.openafs.org/6763
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Alistair Ferguson <alistair.ferguson@mac.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

vol: allow clones of readonly volumes

allow writing of data where it's not user data we're changing
(e.g. allow a vnode to be marked cloned in the vnode index)

Reviewed-on: http://gerrit.openafs.org/6251
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Alistair Ferguson <alistair.ferguson@mac.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
(cherry picked from commit 4b93c42513785d1094c5336b5c9cc4add1b89c5e)

Change-Id: I9849897ae69a426026f6d030ca4e50e8cd7066b2
Reviewed-on: http://gerrit.openafs.org/6762
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Alistair Ferguson <alistair.ferguson@mac.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

volser: allow clonevol purge id to be new id

effectively the same functionality that reclone already uses, but
for some reason we artificially limit it out of clone despite
the interface being there for it. it used to be there. put it back.

Reviewed-on: http://gerrit.openafs.org/6250
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Alistair Ferguson <alistair.ferguson@mac.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
(cherry picked from commit 641c67473615e80cfb8cf1e67636a82e42e5c899)

Change-Id: I31df948a21639bd93c573c77207f0f6c9e43deed
Reviewed-on: http://gerrit.openafs.org/6761
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Alistair Ferguson <alistair.ferguson@mac.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

volser: allow cloning non-rw volumes

remove EROFS error which is the only thing preventing a working clone
on a non-RW.

Reviewed-on: http://gerrit.openafs.org/6249
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Reviewed-by: Alistair Ferguson <alistair.ferguson@mac.com>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Tested-by: Derrick Brashear <shadow@dementix.org>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
(cherry picked from commit f1de04f3b35e91923efddca57e744b2138619223)

Change-Id: Ieb02a2d2c4d59681f5d6f372c7cd77a181d214dd
Reviewed-on: http://gerrit.openafs.org/6760
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Alistair Ferguson <alistair.ferguson@mac.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

libafs: kill rxevent daemon even in upcall mode

the switch from rxk listener env to upcall env could leave the event
daemon running. fix that.

Reviewed-on: http://gerrit.openafs.org/6713
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
(cherry picked from commit a4d9fbaa8036cc78ae0119330314f6deab159c90)

Change-Id: I2e87c692ee2003a24590f700accc30704899db8b
Reviewed-on: http://gerrit.openafs.org/6759
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Alistair Ferguson <alistair.ferguson@mac.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

doc: refer to aklog instead of klog

klog (and kaserver) is deprecated. In generic examples, refer to the Kerberos
5 equivalents.

Reviewed-on: http://gerrit.openafs.org/6721
Reviewed-by: Derrick Brashear <shadow@dementix.org>
Tested-by: Derrick Brashear <shadow@dementix.org>
(cherry picked from commit 07d9b18e36fff6fc96c629ac2bebe8bb43f6b9dd)

Change-Id: I3e00b5d6acbdae35ac9ea645f094ebe46d391776
Reviewed-on: http://gerrit.openafs.org/6758
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

RedHat: Fail openafs-client 'stop' on rmmod error

Currently, the openafs-client RPM init script ignores any error
reported by rmmod. If 'umount /afs' succeeds but rmmod does not, the
client may panic the machine if the client is started again (from e.g.
running the 'restart' init script method), since afsd will try to
initialize AFS with a libafs that has been shut down.

So, do not ignore errors from 'rmmod', and instead fail the 'stop'
method from the init script if we get an error.

Reviewed-on: http://gerrit.openafs.org/6709
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
(cherry picked from commit 12e2a3abe7ca640a7cef2630039c06964f779f17)

Change-Id: I31256abac839c9011754445efa09960f061fdbb0
Reviewed-on: http://gerrit.openafs.org/6757
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

doc: fix AdminGuide

The AdminGuide was broken by e99224f2fe049bc339e87c8b6c195de67dca2f08.

Reviewed-on: http://gerrit.openafs.org/6703
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>
(cherry picked from commit aaab21e7a123ce701a8d5b2144032739fe177d6f)

Change-Id: I350186c617b3b39829c9af1ff6a4aa2835abbdc2
Reviewed-on: http://gerrit.openafs.org/6756
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

doc: add section on direct volume access

Provide examples of the direct volume access syntax, using the
fictitious example.com cell.

Reviewed-on: http://gerrit.openafs.org/6691
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>
(cherry picked from commit e99224f2fe049bc339e87c8b6c195de67dca2f08)

Change-Id: I5b2ac3b6f255d5918eeea4a63d4c7bb6164961d5
Reviewed-on: http://gerrit.openafs.org/6755
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

viced: Keep H_LOCK while locking host in h_Alloc_r

Currently in h_Alloc_r, we h_Lock_r the host, so we have it locked on
return. However, h_Lock_r drops the host glock, which is bad in this
situation since we have already added the host to the global hash
table, so other threads may see it. This can mean that by the time
h_Alloc_r returns, the returned host may have HOSTDELETED set, and/or
the addresses associated with the host may be completely different.

h_Alloc_r's caller, h_GetHost_r, seems to assume that the host is
still associated with the address of the passed-in connection. When
this is not true, this can result in the host structure getting into a
strange state, such as the primary addr/port not being hashed. The
host may also have HOSTDELETED set, in which case we're not supposed
to be dealing with it at all.

To avoid these problems, lock host->lock directly in h_Alloc_r,
without going through h_Lock_r and dropping H_LOCK. Also do it as one
of the first things we do to initialize the host, just to make sure
that if anybody else happens to see the host, it is locked by us when
they do.

Reviewed-on: http://gerrit.openafs.org/6389
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
(cherry picked from commit d6f977830c164ee079c68101595c28ff1847f88f)

Change-Id: Ib0916f3a92c4a34555ee3fa2880dec10041bf047
Reviewed-on: http://gerrit.openafs.org/6754
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

viced: Allow null host for BreakCallBack

For replication writes at the remote site, we will want to call
this without a host structure.

Reviewed-on: http://gerrit.openafs.org/6674
Reviewed-by: Simon Wilkinson <simonxwilkinson@gmail.com>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
(cherry picked from commit 01301d0a5323a836efaae30cac325c25f6a7577a)

Change-Id: I1fb0dff655515fedd7dfb41139f1fb6c85599377
Reviewed-on: http://gerrit.openafs.org/6753
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

com_err: correctly deal with lack of libintl

On machines lacking libintl, _intlize() currently fails to initialize
the output error string, leading tools (e.g., translate_et) to return
a null string. Make afs_com_err fall back to returning the en/US
canonical error text when we don't have any i18n support.
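
The fallback idea can be sketched like this. It is an illustration
only: sketch_error_text, sketch_translate_fn and sketch_no_translation
are hypothetical names, not part of com_err.

```c
#include <stddef.h>

/* Hypothetical translation hook: returns a translated message, or NULL
 * when no translation (or no i18n support at all) is available. */
typedef const char *(*sketch_translate_fn)(const char *canonical);

/* Stand-in for a build without libintl: never translates anything. */
const char *sketch_no_translation(const char *canonical)
{
    (void)canonical;
    return NULL;
}

/* Return the translated text when there is one, and the canonical text
 * otherwise, so callers never see a null or uninitialized string. */
const char *sketch_error_text(const char *canonical,
                              sketch_translate_fn translate)
{
    const char *msg = NULL;

    if (translate != NULL)
        msg = translate(canonical);
    return (msg != NULL) ? msg : canonical;
}
```

Whether the hook is missing or simply has no translation, the caller
gets the canonical text back.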

Reviewed-on: http://gerrit.openafs.org/6638
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
(cherry picked from commit ef63547e955edc60e2d074ef825b091e1c43882e)

Change-Id: Id138e48826aa855bd87e47f201ed6840399aa640
Reviewed-on: http://gerrit.openafs.org/6752
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

linux: fix probing for noop_fsync

Commit 267934d0e6910c8d8166a6e78f93c1bab40857b8 introduced
probing code to deal with the renaming of simple_fsync
inside the Linux kernel.
This test does not take the different parameter lists
of noop_fsync and simple_fsync into account.
Fix this.

Reviewed-on: http://gerrit.openafs.org/6628
Reviewed-by: Marc Dionne <marc.c.dionne@gmail.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
Tested-by: Derrick Brashear <shadow@dementix.org>
(cherry picked from commit 20e82cecd9008f9b3467c9a323c5c3abf27f3021)

Change-Id: I478a1ea15150ca120c8f85e9696d8bdedfc974d1
Reviewed-on: http://gerrit.openafs.org/6751
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

viced: lockcount only valid if not expired

locks are issued on a lease. If the lock is expired, the lock
count is zero.
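
A minimal sketch of that rule, with a hypothetical sketch_lock
structure in place of the fileserver's own lock record. Treating a
lease that expires exactly at 'now' as already expired is a choice
made for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical lock record: a count that is only meaningful while the
 * lease has not expired. */
struct sketch_lock {
    int count;
    time_t expires;
};

/* Effective lock count at time 'now': zero once the lease has run out,
 * whatever the stored count says. */
int sketch_lock_count(const struct sketch_lock *l, time_t now)
{
    assert(l != NULL);
    if (now >= l->expires)
        return 0;
    return l->count;
}
```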

Reviewed-on: http://gerrit.openafs.org/6740
Reviewed-by: Derrick Brashear <shadow@dementix.org>
Reviewed-by: Alistair Ferguson <alistair.ferguson@mac.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>
(cherry picked from commit 4603057d99a1501275f14f6d5aba089364785e09)

Change-Id: I784bdccae6d5fb01c76590ccd34fb9efa417747e
Reviewed-on: http://gerrit.openafs.org/6750
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

Disable kernel opt by default on Solaris 10 and 11

With newer Solaris Studio (sometime in the 12.* series), cc started
adding SSE instructions to optimized x86 code, which is invalid for
kernel code and can generate panics. There appears to be no way to
turn this off currently (-xvector=%none is non-functional), so default
to not optimizing kernel code.

Reviewed-on: http://gerrit.openafs.org/6671
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
(cherry picked from commit 80592c53cbb0bce782eb39a5e64860786654be9f)

Change-Id: If1539dd88d4d28771a7eafcdaff30a75cb230917
Reviewed-on: http://gerrit.openafs.org/6683
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

SOLARIS: Use kcred instead of afs_osi_cred

For many vfs ops to the cache, we currently pass &afs_osi_cred for our
credentials, which is a mostly zeroed-out credential structure. In
some modern versions of Solaris (Solaris 11), at least some parts of
this structure need to not be NULL (cr_zone), or we will panic.

The Solaris kernel provides a 'kcred' credentials structure for the
purpose of using "kernel" credentials for i/o. So just use that
instead for Solaris 8 and beyond, since kcred has existed at least
since Solaris 8.

Reviewed-on: http://gerrit.openafs.org/6669
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
(cherry picked from commit dc6beb3ea29a64bcf59807fd451a573aa54e1122)

Change-Id: I6fd0ce4a890c2e6d9377cad39f47303aa1687a6b
Reviewed-on: http://gerrit.openafs.org/6682
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

afs: Panic on afs_conn refcount imbalance

An undercounted afs_conn can easily cause a panic and/or memory
corruption later on, since we put an rx_connection reference with each
afs_conn reference. Panic as soon as we detect this, as this indicates
a serious bug.

Reviewed-on: http://gerrit.openafs.org/6413
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
(cherry picked from commit 8a574ba16a80fc2b8b703ddcfc99486b977e6071)

Change-Id: Ibd60dafdf1a800349b73754dae18666fa0edd300
Reviewed-on: http://gerrit.openafs.org/6642
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

Unix CM: reset blacklist on hard-mount retry

Reset black-listed servers on a request when retrying due to a
hard-mount retry. When hard-mounts are in effect, a request may
retry indefinitely. If all the servers have been black-listed
due to a transient error, the request may never complete.
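
Roughly, the retry behaves like the sketch below. This is not the
cache manager code; sketch_request and sketch_pick_server are
hypothetical, and the per-request blacklist is modelled as a plain
array of flags.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_MAX_SERVERS 4

/* Hypothetical per-request state. */
struct sketch_request {
    bool hard_mount;
    bool skip[SKETCH_MAX_SERVERS];  /* true = black-listed for this request */
};

/* Pick the next server to try. Returns its index, or -1 if every
 * server is black-listed and the request is not hard-mounted. On a
 * hard-mount retry the blacklist is cleared and the search runs once
 * more, so a transient error cannot block the request forever. */
int sketch_pick_server(struct sketch_request *req, int nservers)
{
    int pass, i;

    assert(req != NULL);
    assert(nservers >= 0 && nservers <= SKETCH_MAX_SERVERS);
    for (pass = 0; pass < 2; pass++) {
        for (i = 0; i < nservers; i++) {
            if (!req->skip[i])
                return i;
        }
        if (!req->hard_mount)
            break;
        for (i = 0; i < nservers; i++)
            req->skip[i] = false;   /* hard-mount retry: start over */
    }
    return -1;
}
```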

Reviewed-on: http://gerrit.openafs.org/6330
Reviewed-by: Andrew Deason <adeason@sinenomine.net>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
(cherry picked from commit faa58c9f60a158481bdfee27e175a37c5fcd64aa)

Change-Id: I1ecc3fa78c064c46849dec47c77f2fc405f2ee7f
Reviewed-on: http://gerrit.openafs.org/6641
Reviewed-by: Derrick Brashear <shadow@dementix.org>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Tested-by: Derrick Brashear <shadow@dementix.org>

Linux: rpm: Update openafs.spec.in to include changes to installed files

Pull up some more of 3f7d8ec219e1aa04b6c0417ecf5e730d40b4f149 to
handle changes that have made it into 1.6 since the last pullup:

* Exclude the aklog_dynamic_auth man page, since it is AIX-only
* Add new files that have appeared in the distribution, such as the
  'afsio' binary.

Reviewed-on: http://gerrit.openafs.org/4814
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementia.org>
(cherry-picked from 3f7d8ec219e1aa04b6c0417ecf5e730d40b4f149)

Change-Id: Ib702f39d930057d92eca4d157fddb633cccf9fae
Reviewed-on: http://gerrit.openafs.org/6640
Reviewed-by: Derrick Brashear <shadow@dementix.org>
Tested-by: BuildBot <buildbot@rampaginggeek.com>

SOLARIS: Do not build x86 kernel module on 5.11

Oracle Solaris 11 no longer supports x86 (amd64 is required). If we
try to build the x86 module, /usr/include/sys/kobj.h complains that
the ISA is unsupported, and refuses to go on. So, just remove
MODLOAD32 from the libafs directories to build on sunx86_511.

Reviewed-on: http://gerrit.openafs.org/5835
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
(cherry picked from commit c6a22d67ff9787ace2249d528eb9db99c5b19427)

Change-Id: I00f9f19653a2f98276c236d7e2331bc81f7c4f13
Reviewed-on: http://gerrit.openafs.org/6643
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>

make openafs 1.6.1pre2

prerelease for 1.6.1

Change-Id: I3dbef9e4d360314cd4c789268d7b0d5c5011f6fc
Reviewed-on: http://gerrit.openafs.org/6614
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

viced: disable rx keepalives during disk io

when we are going to hit the backend storage, disable keepalives.
the net effect of this is that no idle dead time is needed; instead,
the normal dead time will result in a connection with no activity
simply dying naturally if i/o blocks forever.

it's important that keepalives be enabled during callback breaks,
so that is done.

(cherry picked from commit 05f3a0d1e0359f604cc6162708f3f381eabcd1d7)
Reviewed-on: http://gerrit.openafs.org/6515
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Change-Id: If2ee7f3ad7f2dc835dd350bb9558fde0aa179240
Reviewed-on: http://gerrit.openafs.org/6613
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

rx: RX_CALL_IDLE and RX_CALL_BUSY

Allocate new Rx error codes for Idle and Busy calls but do not
send these errors on the wire. They are only intended for local
use.

RX_CALL_IDLE is an indication to an application that requests it
that the rx peer is maintaining an open call channel but has not
sent any actual data for the length of the registered idle dead
timeout.

RX_CALL_BUSY is an indication to an application that requests it
that the rx peer believes the selected call channel is in use by
a pre-existing call.

When either RX_CALL_IDLE or RX_CALL_BUSY is assigned as the call
error and an abort must be sent to the rx peer, the errors are
translated to RX_CALL_TIMEOUT. This is necessary because it is
not possible to add new Rx error values in a method that is safe
for peers that are not expecting them.
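
The translation step can be sketched as below. The enum and function
names are hypothetical, and the numeric values are placeholders chosen
for the illustration; they should not be read as the values in rx.h.

```c
/* Placeholder error codes for the sketch only. */
enum sketch_rx_error {
    SKETCH_CALL_TIMEOUT = -3,
    SKETCH_CALL_IDLE = -19,
    SKETCH_CALL_BUSY = -20
};

/* Map a local call error to the code that may be sent in an abort.
 * The two local-only codes become the timeout code; everything else
 * passes through unchanged. */
int sketch_wire_error(int call_error)
{
    switch (call_error) {
    case SKETCH_CALL_IDLE:
    case SKETCH_CALL_BUSY:
        return SKETCH_CALL_TIMEOUT;
    default:
        return call_error;
    }
}
```

The application still sees the specific local code; only the value
placed in the abort sent to the peer is translated.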

This patchset also documents which Rx errors defined in rx.h are
used on the wire and which are not.

The Unix and Windows cache managers are updated to build with
these new error codes.

Reviewed-on: http://gerrit.openafs.org/6128
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>
(cherry picked from commit c7673f4fad8e8b9390564e3cbfa11d5f1b52ba2f)
Change-Id: I4c7d6733ddae03bda5a31fe4486ada090dcfd0b3
Reviewed-on: http://gerrit.openafs.org/6612
Reviewed-by: Derrick Brashear <shadow@dementix.org>

RX: Avoid timing out non-kernel busy channels

When we encounter a "busy" call channel (indicated by receiving
RX_PACKET_TYPE_BUSY packets), we can error out a call with
RX_CALL_TIMEOUT to try and get the application code to retry the call.
However, many RX applications are not aware of this, and will just
fail with an error upon receiving a single busy packet.

So instead, make this behavior optional, and only do it if the
application tells us what specific error it expects to receive when a
busy call channel is detected. Enable this behavior for the Unix cache
manager, as it can cope with receiving an RX_CALL_TIMEOUT error in
this scenario.

(cherry picked from commit eddcee3ad518dff9fbfda790640c5bfd2e97ef5a)
Reviewed-on: http://gerrit.openafs.org/4159
Reviewed-by: Jeffrey Altman <jaltman@openafs.org>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementia.org>
Change-Id: I3938e79ab009f14f5421a4a45e2a099276c49f24
Reviewed-on: http://gerrit.openafs.org/6611
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

libafs: add replicated connection pool

keep pool of connections to use for replicated volumes,
so we can have a separate idle time setting

(cherry picked from commit cd1f72649650404581cfcdcf3beeeaf2bb960bd6)
Reviewed-on: http://gerrit.openafs.org/6546
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
Change-Id: I056ba28d11313c9925df63869e0c55a1a4f132da
Reviewed-on: http://gerrit.openafs.org/6610
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

vol: remove SYNC fatal_error processing

Currently SYNC clients will "disable" themselves on certain error
patterns. For example, if the server end closes its file descriptor
too many times, or takes too long and then closes the fd, the SYNC
client will return an error and set fatal_error. On any subsequent
SYNC requests, the request will immediately fail without contacting
the server, often making SYNC client programs effectively useless
until they are restarted.

There isn't really any reason to cause future requests to fail.
Transient problems in the fileserver can easily make this situation
possible (e.g. a fileserver can crash but still take several minutes
to close the SYNC fd while the core is written to disk), and so while
we may return an error for a specific problematic request, future
requests may be fine.

So, just remove everything related to fatal_error, so future SYNC
requests can continue to be attempted. Adjust some log messages to
reflect the new behavior.

Reviewed-on: http://gerrit.openafs.org/6548
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
(cherry picked from commit 40bf6dee2409197f7494c3d09bf2dea7c248d185)

Change-Id: I0f7a1792afd1ace3beabe238107d0a5069ccbb44
Reviewed-on: http://gerrit.openafs.org/6609
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

rx: Correctly test for end of call queue

The intention of this condition is to check if the current call
being considered is the last one on the queue, but the test is
incorrect. A null next pointer indicates a removed item, not
the end of the queue.

Use the queue_IsLast macro instead to correctly determine that
this is the last item in the queue and that a call has to be
selected, either the current one or a previously seen good choice.

Without this fix, calls can get permanently stuck in the call queue
and never be assigned to a thread, even when all threads are
idle.
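
The distinction between "removed" and "last" can be sketched with a small circular queue. The types and helpers below are hypothetical miniatures written for illustration, not the rx queue macros; the assumption is a circular doubly linked list with a head node, where removal clears the item's links.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature of a circular, doubly linked queue with a head node. */
struct q_node {
    struct q_node *prev;
    struct q_node *next;
};

void q_init(struct q_node *head) { head->prev = head->next = head; }

void q_append(struct q_node *head, struct q_node *item)
{
    item->prev = head->prev;
    item->next = head;
    head->prev->next = item;
    head->prev = item;
}

/* Removal clears the links, so a NULL next means "not on any queue". */
void q_remove(struct q_node *item)
{
    assert(item->next != NULL && item->prev != NULL);
    item->prev->next = item->next;
    item->next->prev = item->prev;
    item->prev = item->next = NULL;
}

/* The last item is the one whose next pointer leads back to the head. */
int q_is_last(const struct q_node *head, const struct q_node *item)
{
    return item->next == head;
}
```

In this layout the last item never has a NULL next pointer, so a NULL test cannot detect the end of the queue.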

Reviewed-on: http://gerrit.openafs.org/6564
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
(cherry picked from commit 6ad3d646e62801cb81a3c9efeac320daa44936e1)

Change-Id: Ic9d0ff51c79115960ebb4634fc35a5e9da21c380
Reviewed-on: http://gerrit.openafs.org/6570
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

Linux: use standard macro for set_nlink configure test

A generic macro exists to test for functions in the kernel; use
it for set_nlink.

Reviewed-on: http://gerrit.openafs.org/6566
Reviewed-by: Derrick Brashear <shadow@dementix.org>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
(cherry picked from commit 64bd0b728ca95ba7bb4f1fdd909386fde3ce81e1)

Change-Id: I93d169bec8f476d5e692f7f5a7fe31002af7ce1e
Reviewed-on: http://gerrit.openafs.org/6569
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

vol: Fix VCreateVolume special inode cleanup

In order to dec the relevant special inodes, we need to know the
parent vol id in addition to the vol id itself. Use the appropriate
volume IDs when IH_DEC'ing special inodes after we fail to create the
volume, so we don't leave behind special inodes.

(cherry picked from commit 627cfb1d4e7b32b4342c59b162a36ba9beb8a066)
Reviewed-on: http://gerrit.openafs.org/6529
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
Change-Id: I9f40f170cd6a0fffe2e17fc199af99e087066902
Reviewed-on: http://gerrit.openafs.org/6550
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

DAFS: Fix SYNC_FAILED VScheduleSalvage_r log

SYNC_FAILED is not an unknown protocol code, so stop saying it is.

Reviewed-on: http://gerrit.openafs.org/6530
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
(cherry picked from commit fda2fe8538e00baddcd7fcf2c669162634b9d14e)

Change-Id: Ibd70b9f95031baf4955d503d7eb8b5f3a50febf7
Reviewed-on: http://gerrit.openafs.org/6549
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

rx: add and export a public keepalive toggle

make enabling and disabling keepalives a public function,
and export it.

(cherry picked from commit 2a31f35936698c504c863702ebb675ac9dfe47e1)
Reviewed-on: http://gerrit.openafs.org/6517
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
Change-Id: If7bd2b72980dd92771614a6d73a04441222a8314
Reviewed-on: http://gerrit.openafs.org/6522
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

afs: put back conn if not using in checkserver loop

we get a conn, check it for eligibility, and if it is not eligible,
just abandon it without putting it back. "oops"

(cherry picked from commit 26fc0cda94c24a1c5f0bef109bca920456c25265)

Change-Id: I8e4f762b5170f07d6abc3508e88f001ca147c3a7
Reviewed-on: http://gerrit.openafs.org/6521
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

Make libjafs buildable again

libjafs is surprisingly close to being buildable. Fix a few misc
things which have bitrotted over the years so it is possible to
actually build:

- Add -I$SRC/config to the cflags, so we can include afsconfig.h

- Remove references to the nonexistent rxkstats.o

- Do not link with UAFS' AFS_component_version_number.o, since this
gives us duplicate version number symbols

- Include afs_vosAdmin.h in Group.c, to satisfy some missing symbols

Reviewed-on: http://gerrit.openafs.org/6524
Reviewed-by: Steven Jenkins <steven@synaptian.com>
Tested-by: Steven Jenkins <steven@synaptian.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
(cherry picked from commit 967d7201ee5c27db6d75d5efafcad9458e2b5167)

Change-Id: I0cb510e3f115c2c35f06cf9cbddaf31835704eea
Reviewed-on: http://gerrit.openafs.org/6527
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

libuafs: only rebuild h directory when needed

A few changes to allow a "make all ; sudo make install ; make all..."
workflow to work without manually removing files in between.

Make the rebuilding of the h directory dependent on the source
files scanned to build it. This prevents it from being rebuilt
for every "make install".

While we're here, use -f when removing linktest for the clean target.
This allows "make clean" to remove it without prompting when the user
doesn't have write access to the file, as is the case when make install
rebuilds it as root.

Reviewed-on: http://gerrit.openafs.org/6519
Reviewed-by: Derrick Brashear <shadow@dementix.org>
Tested-by: Derrick Brashear <shadow@dementix.org>
(cherry picked from commit 2caf0778ddeb6eeb854360cac20c6b3f0894f3eb)

Change-Id: Id4ccad953669538072b834a6aa49b8beaeeeed35
Reviewed-on: http://gerrit.openafs.org/6526
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

afs: discard cached state when we are unsure of validity

in the event we got a network error, we don't know if the server
completed (or will complete) our operation. we can assume nothing.
a more complicated version of this could attempt to verify that the
state is what we expect it to be, but in extended callbacks universe
this is potentially easier to solve anyway. for now, return the
error to the caller, and mark the vcache unstat'd.

(cherry picked from commit c2fc7e0f66621fc97f5b4dc389d379260638315c)

Change-Id: Ic38cf16e47664e6f36ad614735b42d3f4e5a6ce2
Reviewed-on: http://gerrit.openafs.org/6520
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

DAFS: Atomically re-hash vnode in VGetFreeVnode_r

VGetFreeVnode_r pulls a vnode off of the vnode LRU, and removes the
vnode from the vnode hash table. In DAFS, we may drop the volume glock
immediately afterwards in order to close the ihandle for the old vnode
structure.

While we have the glock dropped, another thread may try to
VLookupVnode for the new vnode we are creating, find that it is not
hashed, and call VGetFreeVnode_r itself. This can result in two
threads having two separate copies of the same vnode, which bypasses
any mutual exclusion ensured by per-vnode locks, since they will lock
their own version of the vnode. This can result in a variety of
different problems where two threads try to write to the same vnode at
the same time. One example is calling CopyOnWrite on the same file in
parallel, which can cause link undercounts, writes to the wrong vnode
tag, and other CoW-related errors.

To prevent all this, make VGetFreeVnode_r atomically remove the old
vnode structure from the relevant hashes, and add it to the new hashes
before dropping the glock. This ensures that any other thread trying
to load the same vnode will see the new vnode in the hash table,
though it will not yet be valid until the vnode is loaded.

Note that this only solves this race for DAFS. For non-DAFS, the vol
glock is held over the ihandle close, so this race does not exist.
The comments around the callers of VGetFreeVnode_r indicate that
similar extant races exist here for non-DAFS, but they are unsolvable
without significant DAFS-like changes to the vnode package.

Reviewed-on: http://gerrit.openafs.org/6385
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
(cherry picked from commit 8e15e16c9e6a5768f31976cc21b48d5bb10617b7)

Change-Id: I915d18c4252b698f39fdf65793311a39381096b4
Reviewed-on: http://gerrit.openafs.org/6495
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

afs: Grab a reference to setp in afs_icl_Event4

We can drop GLOCK in several places in afs_icl_Event4 and the
afs_icl_AppendRecord callee. To ensure that the given afs_icl_set does
not get freed while we have GLOCK dropped, grab a reference to the
set.

Thanks to Ryan C. Underwood for reporting an issue triggered by this.

Reviewed-on: http://gerrit.openafs.org/6431
Reviewed-by: Derrick Brashear <shadow@dementix.org>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
(cherry picked from commit 7461fa11939556d3b6f3ea38da7ff65607805579)

Change-Id: I7a33cf96d2031dd1798f7598918396eb8fbde611
Reviewed-on: http://gerrit.openafs.org/6494
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

xstat: cm xstat time values are 32 bit

The kernel space cm xstat time structures are implemented as 32
bit values in memory and on the wire. Define the client side
xstat userspace structures as 32 bit time values as well to avoid
size mismatches on systems with native 64 bit time values.
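
A minimal sketch of the idea follows. The struct and function names are hypothetical and are not the xstat definitions; the point is only that the record is decoded at a fixed 32-bit width and widened to the native time_t afterwards.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <time.h>

/* Hypothetical record layout: two 32-bit fields, as produced by the kernel side. */
struct wire_timeval32 {
    int32_t tv_sec;
    int32_t tv_usec;
};

/* Decode a record from a raw buffer (assumed to already be in host byte order). */
struct wire_timeval32 decode_time(const unsigned char *buf)
{
    struct wire_timeval32 t;

    assert(buf != NULL);
    memcpy(&t, buf, sizeof(t));
    return t;
}

/* Widen to the native time_t only after decoding at the fixed width. */
time_t to_native_seconds(struct wire_timeval32 t)
{
    return (time_t)t.tv_sec;
}
```

Had the userspace struct used a native struct timeval on a system with 64-bit time values, its size would no longer match the 8-byte record.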

Reviewed-on: http://gerrit.openafs.org/5237
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@openafs.org>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
(cherry picked from commit 130144850c6d05bc69e06257a5d7219eb98697d8)

Change-Id: I8726efdd7123e9a1e0e4536bf2766c441964475d
Reviewed-on: http://gerrit.openafs.org/6386
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

afs: increase idledead time

it's actually important that this be more than the rx call dead time,
so that a server's callbacks to clients timing out does not cause us to
idle-dead a call to the server when callbacks need to be broken

FIXES 130327

Reviewed-on: http://gerrit.openafs.org/6497
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
(cherry picked from commit 0f4da13137612a9b0c0c3b57aa939d6661fb67f8)

Change-Id: I181d89c36175be93ed59226b401d48903fb5f584
Reviewed-on: http://gerrit.openafs.org/6518
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

linux: fsync on a directory should return 0, not EINVAL

Directory writes are synchronous, so this is fine. There's a
mostly-convenient function in fs/libfs.c that returns 0 that we can use
to do what we want ("mostly" because it was renamed in 2.6.35).

FIXES 130425

Reviewed-on: http://gerrit.openafs.org/6491
Reviewed-by: Simon Wilkinson <sxw@inf.ed.ac.uk>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
(cherry picked from commit 267934d0e6910c8d8166a6e78f93c1bab40857b8)

Change-Id: Iaeb8a699673b6144c186b470f6d877fb54f1e319
Reviewed-on: http://gerrit.openafs.org/6493
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

rpm: Don't attempt to restart on upgrade when using systemd

systemd is actually rather capable of leaving the OpenAFS client in an
incredibly broken state, thanks to its willingness to track services and
kill their processes. We should not attempt to restart the client on
upgrade, whether a normal upgrade or a migration from SysV initscripts.
In the former case, it's fine (and correct) for the old AFS to keep
running; in the latter case, the unit file is capable of correctly
shutting down an initscript-launched client. The same is true for the
OpenAFS server.

This brings the packaging in line with the SysV initscript code in the
specfile, which does not attempt to restart the service, as well as with
e.g. Debian's packaging, which uses --no-restart-on-upgrade.

While we're here, clean up a redundant BuildRequires on systemd-units.

Reviewed-on: http://gerrit.openafs.org/6247
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
(cherry picked from commit dee93ff1d114da711df345e06b5e1a682c877315)

Change-Id: I4ecf3b2f307a81549e0bd568ab5e4585a2ef1f2d
Reviewed-on: http://gerrit.openafs.org/6492
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

libafs: memset dirHeader->hashTable

Clear dirHeader->hashTable via memset instead of via a loop. This is
more efficient, and avoids the loop getting optimized into an unusable
_memset call on recent versions of Solaris Studio when building for
the kernel.
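
As a rough illustration, with a hypothetical struct and table size standing in for the real directory header:

```c
#include <assert.h>
#include <string.h>

#define NHASHENT 128   /* hypothetical table size for this sketch */

/* Hypothetical stand-in for a directory header with a hash table of shorts. */
struct dir_header {
    unsigned short hashTable[NHASHENT];
};

/* Clear the whole table with one call instead of an element-by-element loop. */
void clear_hash_table(struct dir_header *dh)
{
    assert(dh != NULL);
    memset(dh->hashTable, 0, sizeof(dh->hashTable));
}
```

Using sizeof on the array member keeps the byte count tied to the declaration, so the call stays correct if the table size changes.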

Thanks to Jeff Blaine for reporting the issue with Solaris Studio.

Reviewed-on: http://gerrit.openafs.org/4829
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementia.org>
(cherry picked from commit f091ace32e3045da396d577055dafd67888ff7ea)

Change-Id: Ia098730c3e83429ce4f886b1427159d13eff4c4e
Reviewed-on: http://gerrit.openafs.org/6414
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

Include afsconfig.h before anything else

afsconfig.h can define various preprocessor symbols that can affect
how system headers behave. For example, the presence of the
_POSIX_PTHREAD_SEMANTICS symbol changes the number of arguments to
getpwnam_r on at least Solaris 8. So, we must include afsconfig.h
before including anything else, to ensure consistency.

FIXES 130413

Reviewed-on: http://gerrit.openafs.org/6387
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
(cherry picked from commit 37f537a21db6d560dd16a53ff5e0d2f0456d4c48)

Change-Id: I64970fd06af9a13d91acaf03b80a2baf224754ff
Reviewed-on: http://gerrit.openafs.org/6388
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>

aklog: Add replacement setenv/unsetenv

aklog makes use of the setenv and unsetenv functions, which do not
exist (at least) on HP-UX earlier than around 11i v3, and do not exist
on Solaris earlier than Solaris 10. Add replacement functions for
setenv and unsetenv when they are not present. Note that these
implementations are copied from libroken, and setenv was modified to
not use asprintf.
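
For readers unfamiliar with the technique, a setenv replacement without asprintf can be built on putenv roughly as follows. This is an illustrative sketch under the assumption of a POSIX environment that provides putenv; the name compat_setenv is hypothetical, and this is not the libroken code referred to above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical replacement in the spirit of setenv(3), built on putenv(3).
 * The "NAME=value" string is assembled by hand rather than with asprintf.
 */
int compat_setenv(const char *name, const char *value, int overwrite)
{
    size_t nlen, vlen;
    char *entry;

    assert(name != NULL && value != NULL);
    if (!overwrite && getenv(name) != NULL)
        return 0;               /* keep the existing value */

    nlen = strlen(name);
    vlen = strlen(value);
    entry = malloc(nlen + 1 + vlen + 1);
    if (entry == NULL)
        return -1;

    memcpy(entry, name, nlen);
    entry[nlen] = '=';
    memcpy(entry + nlen + 1, value, vlen + 1);  /* copies the trailing NUL */

    /* putenv keeps the pointer, so the buffer must not be freed on success. */
    if (putenv(entry) != 0) {
        free(entry);
        return -1;
    }
    return 0;
}
```

One consequence of building on putenv is that a string replaced by a later call is never freed by this sketch, which is usually acceptable for a short-lived program.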

This is 1.6-specific. On the master branch, libroken takes care of
these for us. On the master branch, setenv and unsetenv from libroken
were added in 70e8451acd0426024c152073e53bc6606e0189e1.

Change-Id: I35546f1add7f4f87c6ffc484059057825887499f
Reviewed-on: http://gerrit.openafs.org/6376
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Tested-by: Andrew Deason <adeason@sinenomine.net>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

make openafs 1.6.1pre1

prerelease for 1.6.1

Change-Id: Ia54b5c304791ebfc33b7043af9ea3688442e4b81
Reviewed-on: http://gerrit.openafs.org/5809
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

afs: Clear VHardMount on ResetVolumeInfo

afs_Analyze sets VHardMount on a volume struct when a hard-mount
scenario is encountered, and clears it after sleeping. However, if the
volume struct has VRecheck set, or if it's not in memory, afs_Analyze
cannot retrieve the volume struct in order to clear VHardMount again.

For the VRecheck case, this can result in VHardMount never getting
cleared, and so hard-mount messages for the volume seem to disappear.
So, clear VHardMount when we set VRecheck so this does not occur.

For the case where the volume struct is not in memory, this is not a
problem, since when we allocate a volume struct again, the VHardMount
state will not be retained.

Reviewed-on: http://gerrit.openafs.org/6335
Reviewed-by: Derrick Brashear <shadow@dementix.org>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
(cherry picked from commit f469be407789e696c0b9e9a431b4879798a00e2a)

Change-Id: If13769445f20336dfba9755f3af0a1499ce16a6d
Reviewed-on: http://gerrit.openafs.org/6348
Reviewed-by: Derrick Brashear <shadow@dementix.org>
Tested-by: Derrick Brashear <shadow@dementix.org>

libafs: Rate-limit hard-mount waiting messages

Limit how often we log "hard-mount waiting for XXX" messages. Without
this, it is possible for a client with hard-mounts enabled to spam the
kernel log rather excessively (in extreme cases this can even panic
the machine on at least some Linux).

To keep things simple, just log approximately one message per volume
per hard-mount interval.
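
A per-volume limiter of this kind can be sketched as follows. The struct, its field, and the function name are assumptions made for illustration and do not reflect how libafs stores this state.

```c
#include <assert.h>
#include <time.h>

/* Hypothetical per-volume state; the field name is an assumption of this sketch. */
struct vol_log_state {
    time_t last_logged;   /* 0 means no message has been logged yet */
};

/*
 * Return 1 if a "hard-mount waiting" message should be logged now, and
 * record the time; return 0 if one was logged within the last interval.
 */
int should_log_hard_mount(struct vol_log_state *v, time_t now, time_t interval)
{
    assert(v != NULL);
    if (v->last_logged != 0 && now - v->last_logged < interval)
        return 0;
    v->last_logged = now;
    return 1;
}
```

Because the timestamp lives with the volume, a noisy volume does not suppress messages for other volumes.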

Reviewed-on: http://gerrit.openafs.org/5060
Tested-by: Derrick Brashear <shadow@dementia.org>
Reviewed-by: Derrick Brashear <shadow@dementia.org>
(cherry picked from commit 530b5ecac51cc7ce61ccddd50868c632c4a47298)

Change-Id: I566aa3d411ff100ccc6afa9a5273fb84e6172dd0
Reviewed-on: http://gerrit.openafs.org/6347
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

libafs: Remove unused volume "states" flags

VResort and VMoreReps are not referenced anywhere in the tree, so
remove their definitions.

Reviewed-on: http://gerrit.openafs.org/5059
Tested-by: Derrick Brashear <shadow@dementia.org>
Reviewed-by: Derrick Brashear <shadow@dementia.org>
(cherry picked from commit 6cae7c554e917a26b197167e177bd3eb22bce71a)

Change-Id: I0a282dac3a9e31bff4ff37c61275cc7c08456cad
Reviewed-on: http://gerrit.openafs.org/6346
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

libafs: Avoid using changing unixuser ticket data

PSetTokens was afs_osi_Alloc'ing after afs_osi_Free'ing the previous
token data. This can sleep, causing tu->stp to be pointing to garbage
while we wait to alloc. Additionally, rxkad_NewClientSecurityObject
can sleep while waiting to alloc memory, and so the given tu->stp
pointer given to it by afs_ConnBySA may be invalid by the time it
actually uses the data.

To fix this, we could implement unixuser locking to ensure mutual
exclusion of these events. However, this implements a more
conservative change for the 1.4 and 1.6 branches. In PSetTokens we
alloc the new memory before we change anything, and in afs_ConnBySA we
make copies of the ticket data before giving it to rxkad. With these
changes, the glock gives us enough serialization to avoid issues with
tu->stp changing underneath us.
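
The allocate-before-free ordering can be illustrated in plain C. The struct and function below are hypothetical and use malloc in place of the kernel allocator; they show only the ordering, not the actual PSetTokens code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a user record holding ticket data. */
struct user_rec {
    char  *stp;      /* ticket bytes */
    size_t stLen;    /* ticket length */
};

/*
 * Replace the ticket. The new buffer is allocated and filled before the old
 * one is released, so stp never points at freed memory, even if the
 * allocation blocks or fails.
 */
int set_ticket(struct user_rec *tu, const char *ticket, size_t len)
{
    char *fresh, *old;

    assert(tu != NULL && ticket != NULL);
    fresh = malloc(len ? len : 1);
    if (fresh == NULL)
        return -1;              /* old ticket left untouched */
    memcpy(fresh, ticket, len);

    old = tu->stp;
    tu->stp = fresh;
    tu->stLen = len;
    free(old);
    return 0;
}
```

The pointer swap and the free happen only after every step that could sleep has completed.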

This change is specific to 1.4 and 1.6. On the master branch, this
issue is fixed by implementing unixuser locks in change
Idd66d72f716b7e7dc08faa31ae43e9a23639bae3.

Reviewed-on: http://gerrit.openafs.org/4649
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementia.org>
(cherry picked from commit 1465946bb6863430bf0efebd024d394549a8775f)

Change-Id: Icab5176bf685c408447f0f32ad65c5b003299d3d
Reviewed-on: http://gerrit.openafs.org/6345
Reviewed-by: Derrick Brashear <shadow@dementix.org>
Tested-by: BuildBot <buildbot@rampaginggeek.com>

pam: Fix password torching const-ness

In some code branches, the PAM code "torches" a password by zeroing
it. However, it does this through a const pointer which we otherwise
know is not actually const. Make sure we get better type checking by
doing this through a non-const pointer.
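
A minimal sketch of the pattern, with a hypothetical function name:

```c
#include <assert.h>
#include <string.h>

/*
 * Zero a password buffer in place. The parameter is deliberately non-const,
 * so the compiler flags attempts to torch storage declared const.
 */
void torch_password(char *password)
{
    assert(password != NULL);
    memset(password, 0, strlen(password));
}
```

Note that a compiler may elide a memset on a buffer that is never read again; whether that matters depends on the caller.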

Reviewed-on: http://gerrit.openafs.org/4554
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementia.org>
(cherry picked from commit 5cd4282758317b24d2f63408ab4c62551bbebc03)

Change-Id: I94b22a31884dc9b184ec094e5cca4b6b0098cb15
Reviewed-on: http://gerrit.openafs.org/6295
Tested-by: Derrick Brashear <shadow@dementix.org>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

pam: Clear up PAM_CONST related warnings on Linux

Commit 78d1f8d8 expanded the use of PAM_CONST and introduced many
new warnings on Linux where pam expects "const" arguments.

This clears up the warnings by doing the following:
- Cast "user" to char * when calling ka* functions
- Change the signature of pam_afs_prompt and pam_afs_printf to use
PAM_CONST
- Use a separate non-const password pointer for pam_afs_prompt

Reviewed-on: http://gerrit.openafs.org/4487
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Andrew Deason <adeason@sinenomine.net>
Reviewed-by: Derrick Brashear <shadow@dementia.org>
(cherry picked from commit 3ea39166d64d2e66cddef015734c2f91548423af)

Change-Id: I16179a1c8b9d0e53c90b54733d1c5130f1d23153
Reviewed-on: http://gerrit.openafs.org/6293
Reviewed-by: Derrick Brashear <shadow@dementix.org>
Tested-by: Derrick Brashear <shadow@dementix.org>

make afsdump_scan get ACLs right

This makes afsdump_scan get the ACLs right on little endian systems.
It also corrects and slightly beautifies some output (indentation,
cut&paste error for negative ACL label).

Reviewed-on: http://gerrit.openafs.org/4494
Reviewed-by: Derrick Brashear <shadow@dementia.org>
Tested-by: Derrick Brashear <shadow@dementia.org>
(cherry picked from commit 1105d63ddf5a32b9381ff47e8101c3f141366fa6)

Change-Id: Iec0fa5bc9673bdce616611f422d74e55b0aa90f1
Reviewed-on: http://gerrit.openafs.org/6292
Reviewed-by: Derrick Brashear <shadow@dementix.org>
Tested-by: Derrick Brashear <shadow@dementix.org>

ntohs ubik header size

The 'size' field in the ubik header is only 16 bits wide, so we should
be using ntohs to read it, not ntohl. The database checking utilities
for the prdb and kadb were still using ntohl (vldb was fixed by
591f9b6de9ab3dc5c17ad41af0241527f7f04b31).
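
The difference can be seen with a small sketch. The struct below is a hypothetical fragment holding only a 16-bit field in network byte order; it is not the ubik header definition.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical header fragment with a 16-bit size field in network order. */
struct hdr_fragment {
    uint16_t size;
};

/* Read the 16-bit field with the 16-bit conversion. */
unsigned int read_header_size(const unsigned char *buf)
{
    struct hdr_fragment h;

    assert(buf != NULL);
    memcpy(&h, buf, sizeof(h));
    return ntohs(h.size);
}
```

On a little-endian host, applying ntohl to the same field would byte-swap across 32 bits and yield a different value, while on a big-endian host both calls happen to agree.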

Reviewed-on: http://gerrit.openafs.org/5466
Reviewed-by: Derrick Brashear <shadow@dementix.org>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
(cherry picked from commit e69714739f64475d71633fd4cb3523bc1ae143bb)

Change-Id: Id4f677cddcedba3008d349bcf9740168129f8496
Reviewed-on: http://gerrit.openafs.org/6314
Reviewed-by: Derrick Brashear <shadow@dementix.org>
Tested-by: Derrick Brashear <shadow@dementix.org>

solaris: libafs depends on fs/ufs

The solaris afs module depends on symbols exported by fs/ufs.
Set this dependency in the afs module so the kernel loader
will automatically load the fs/ufs driver if it is not already
loaded, such as on ZFS-only systems.

Reviewed-on: http://gerrit.openafs.org/5456
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
(cherry picked from commit 0cb10104f5af73614e6b7673d3711ddbc3f3a866)

Change-Id: Ifcb5e2725bbd2de44218109aac9c20439dadf41e
Reviewed-on: http://gerrit.openafs.org/6315
Reviewed-by: Michael Meffie <mmeffie@sinenomine.net>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
Tested-by: Derrick Brashear <shadow@dementix.org>

vos: fix code to not triple-negate

!!! is !. just write it that way.

Reviewed-on: http://gerrit.openafs.org/6252
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
(cherry picked from commit 44045af35a6ae44880655115685e0755d6a0c828)

Change-Id: I646387f30c178ad512decd507925408183f83894
Reviewed-on: http://gerrit.openafs.org/6329
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

vol: log error reason on header read failure

Log the error reason instead of just VSALVAGE when
ReadHeader() fails.

Reviewed-on: http://gerrit.openafs.org/6108
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Andrew Deason <adeason@sinenomine.net>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
(cherry picked from commit 0d0a8288c1cdd05bbf5717ac45638cf6760ee7a8)

Change-Id: Ie49c9ee3ea23873f8d71c80fda45b763bcd8e466
Reviewed-on: http://gerrit.openafs.org/6328
Reviewed-by: Michael Meffie <mmeffie@sinenomine.net>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
Tested-by: Derrick Brashear <shadow@dementix.org>

libafs: disable mtu discovery

we need to rework this to use lack of soft acks instead of this
method, which is too fragile

Reviewed-on: http://gerrit.openafs.org/6256
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
Tested-by: Derrick Brashear <shadow@dementix.org>
(cherry picked from commit 4d76b19b600aad461ee1231eeadb9b7a27b7f117)

Change-Id: Iba3f3d9d475959f99759db9e81c05c300aa6cd02
Reviewed-on: http://gerrit.openafs.org/6327
Reviewed-by: Derrick Brashear <shadow@dementix.org>
Tested-by: Derrick Brashear <shadow@dementix.org>

libafs: only do pings for default conn with root uid

instead of doing it for potentially every unauth user, just do it for
root.

Reviewed-on: http://gerrit.openafs.org/6255
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
Tested-by: Derrick Brashear <shadow@dementix.org>
(cherry picked from commit 78885611ac8aa6602a4a1f42379c9d78ef226100)

Change-Id: Id54f6608b8807289242d094f48e394f0341782da
Reviewed-on: http://gerrit.openafs.org/6326
Reviewed-by: Derrick Brashear <shadow@dementix.org>
Tested-by: Derrick Brashear <shadow@dementix.org>

salvager: Create link table with volume group id

The link table needs to be created with the VG id or RW vol id, not
the non-RW vol id. Unlike other special inodes, this goes for both the
'parent' and 'volume' volume ids, not just the 'parent' id, since
there is only one link table per VG.

Without this, the salvager can generate invalid linktable special
inodes if it encounters a VG with no inodes for the RW vol.

Reviewed-on: http://gerrit.openafs.org/6179
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
(cherry picked from commit ae227049fca2519e1f5ae1e8b68efbff10ebb665)

Change-Id: Ia8089cae6cb5ab97ef9d4ea306f3c48bead59914
Reviewed-on: http://gerrit.openafs.org/6325
Reviewed-by: Derrick Brashear <shadow@dementix.org>
Tested-by: Derrick Brashear <shadow@dementix.org>

DAFS: Ensure logging on attach2 errors

The attach2 error path transitions a volume to VOL_STATE_ERROR, in
case whatever got us to that error path did not already put the volume
in an appropriate state. Log when we do this, to make sure we do not
end up with a volume in VOL_STATE_ERROR state silently.

Reviewed-on: http://gerrit.openafs.org/6168
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
(cherry picked from commit 53230846a202a50f6c3a61b38d62ccba8876f89d)

Change-Id: I4dbe5c6f8be8820620e7a68c7f42b426211dbb82
Reviewed-on: http://gerrit.openafs.org/6324
Reviewed-by: Derrick Brashear <shadow@dementix.org>
Tested-by: Derrick Brashear <shadow@dementix.org>

DAFS: Avoid unnecessary preattach on FSYNC_VOL_ON

FSYNC_VOL_ON/FSYNC_VOL_ATTACH can be called to "online" a volume that
was actually kept online for the duration of the volume operation.
Avoid calling VPreAttachVolumeByVp_r for such a volume if it's already
attached, in order to avoid an unnecessary log message and to save a
tiny bit of processing.

Reviewed-on: http://gerrit.openafs.org/6167
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
(cherry picked from commit d5d2d00a47cf53054bd18d7404be26bea34cba6f)

Change-Id: I2a7f4b214176570e787978dbe0aa2eb8dc57730f
Reviewed-on: http://gerrit.openafs.org/6323
Reviewed-by: Derrick Brashear <shadow@dementix.org>
Tested-by: Derrick Brashear <shadow@dementix.org>

DAFS: Log more for VPreAttachVolumeByVp odd states

When we encounter "odd" states in VPreAttachVolumeByVp_r, say what the
actual state we encountered was, along with the attach flags, so we
have a better idea of what's going on.

Reviewed-on: http://gerrit.openafs.org/6166
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
(cherry picked from commit 4fd8347e842af61681c1718e456500b92c5b6ea9)

Change-Id: If1c6fdba7b097a4bfb9e8e3e972ee56dee43bf2d
Reviewed-on: http://gerrit.openafs.org/6322
Reviewed-by: Derrick Brashear <shadow@dementix.org>
Tested-by: Derrick Brashear <shadow@dementix.org>

DAFS: Ensure GetVolume errors on ERROR volumes

In GetVolume, after we call VAttachVolumeByVp_r, there is no explicit
check to see if vp is in VOL_STATE_ERROR state. Make sure we don't try
to use such a volume, or blindly transition the volume away from that
state.

Reviewed-on: http://gerrit.openafs.org/6165
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
(cherry picked from commit f59312c0aee1a5376b29262efc6e6ea71264305a)

Change-Id: Ibdd5cb5c475409918cdad1e73e2d7ed4ef57bd13
Reviewed-on: http://gerrit.openafs.org/6321
Reviewed-by: Derrick Brashear <shadow@dementix.org>
Tested-by: Derrick Brashear <shadow@dementix.org>

DAFS: Do not transition to ERROR on trivial errors

attach2 can result in many different errors; some indicate that the
volume is in an inconsistent state, but many others just indicate that
the volume cannot be attached for benign reasons (such as VNOVOL if
the volume doesn't exist, or VOFFLINE if the volume is being used by a
volume utility). Currently, for DAFS, attach2 transitions the relevant
volume to the VOL_STATE_ERROR state for almost all errors encountered,
even the benign ones. Instead, skip the error state transition for
error handling paths that do not reflect a "broken" volume.
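
The intended split can be sketched as a small classification step. The enums and function names are hypothetical, and which errors count as benign beyond the two named above is an assumption of this sketch.

```c
/* Hypothetical error codes and states for this sketch; values are arbitrary. */
enum attach_err { ERR_NONE, ERR_NOVOL, ERR_OFFLINE, ERR_IO };
enum vol_state  { STATE_UNATTACHED, STATE_ATTACHED, STATE_ERROR };

/* Benign errors say the volume cannot be attached now, not that it is broken. */
int error_is_benign(enum attach_err e)
{
    return e == ERR_NOVOL || e == ERR_OFFLINE;
}

/* Pick the state to leave the volume in after an attach attempt. */
enum vol_state state_after_attach(enum attach_err e, enum vol_state current)
{
    if (e == ERR_NONE)
        return STATE_ATTACHED;
    if (error_is_benign(e))
        return current;         /* skip the error-state transition */
    return STATE_ERROR;
}
```

Here ERR_IO stands in for any failure that does reflect a "broken" volume.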

Reviewed-on: http://gerrit.openafs.org/6164
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
(cherry picked from commit 5fc2365f5dff7f193781093ecb886b4c7391d5a3)

Change-Id: Ia3d732781c98fcda4db7b41cd744db860781594f
Reviewed-on: http://gerrit.openafs.org/6320
Reviewed-by: Derrick Brashear <shadow@dementix.org>
Tested-by: Derrick Brashear <shadow@dementix.org>

SOLARIS: Define BSD_COMP for non-UKERNEL on 5.11

We were defining BSD_COMP twice for UKERNEL. Move one of the #define's
up to the !UKERNEL section.

Reviewed-on: http://gerrit.openafs.org/6162
Reviewed-by: Derrick Brashear <shadow@dementix.org>
Tested-by: Derrick Brashear <shadow@dementix.org>
(cherry picked from commit 0ee7fcc0a49494ad66778012b7808f80ee3af8d3)

Change-Id: I683e1be2141c0cecac3f60ac4928d3e84a96bef8
Reviewed-on: http://gerrit.openafs.org/6319
Reviewed-by: Derrick Brashear <shadow@dementix.org>
Tested-by: Derrick Brashear <shadow@dementix.org>

vlserver: Avoid atoi for vol ids
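
The commit carries no description. One plausible motivation, stated here as an assumption, is that volume IDs are unsigned 32-bit values while atoi returns a signed int and reports no errors. A hypothetical helper (not the code from this change) that parses such an ID with strtoul might look like this:

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse a decimal volume ID into an unsigned 32-bit
 * value. Returns 0 on success, -1 on malformed or out-of-range input.
 */
int parse_vol_id(const char *s, uint32_t *out)
{
    char *end;
    unsigned long v;

    assert(s != NULL && out != NULL);
    if (*s < '0' || *s > '9')
        return -1;              /* reject empty input, signs, and whitespace */

    errno = 0;
    v = strtoul(s, &end, 10);
    if (errno != 0 || *end != '\0' || v > UINT32_MAX)
        return -1;

    *out = (uint32_t)v;
    return 0;
}
```

Unlike atoi, this reports trailing garbage and overflow instead of silently returning a number.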

Reviewed-on: http://gerrit.openafs.org/6050
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@secure-endpoints.com>
Tested-by: Jeffrey Altman <jaltman@secure-endpoints.com>
(cherry picked from commit d113c0eb8ac4717cafd7747a78c5aa3b649b8e68)

Change-Id: If965a7442262048048be9eca3e643c01d7b5c277
Reviewed-on: http://gerrit.openafs.org/6318
Reviewed-by: Derrick Brashear <shadow@dementix.org>
Tested-by: Derrick Brashear <shadow@dementix.org>

cache bypass: don't define iovecp for UKERNEL

iovecp is defined but not used for UKERNEL. Define it conditionally
to avoid gcc warnings and --enable-checking failure.

Reviewed-on: http://gerrit.openafs.org/5650
Reviewed-by: Derrick Brashear <shadow@dementix.org>
Tested-by: Derrick Brashear <shadow@dementix.org>
(cherry picked from commit 547d94edd3affb5f377cd1b3db39c46ca0cf5aec)

Change-Id: I700b82173b5c2435a716aaf10541e1583f2655f5
Reviewed-on: http://gerrit.openafs.org/6316
Reviewed-by: Derrick Brashear <shadow@dementix.org>
Tested-by: Derrick Brashear <shadow@dementix.org>

rx: arrange for Finalize to really stop running calls

previously rxi_ServerProc would happily error a call once
rx_tranquil was set, but keep calling ExecuteRequest.
Reorder the code so kernel shutdown attempts are processed first;
then, if we are tranquil, do not process the call further.

Issue discovered by Chaskiel Grundman.

Reviewed-on: http://gerrit.openafs.org/5447
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
(cherry picked from commit 6196abf3c864f8cc6ab1efc6e5625a5cc68158bd)

Change-Id: I00fad117ee8386fc29cd2423aa9fb7d89af55160
Reviewed-on: http://gerrit.openafs.org/6313
Reviewed-by: Derrick Brashear <shadow@dementix.org>
Tested-by: Derrick Brashear <shadow@dementix.org>

RPM: Fix dkms support on Fedora 15

Newer dkms no longer uses or supplies a $kernelver_array variable;
instead it uses $kernelver. The attached patch uses both, one of
which will be empty, so the test will do the Right Thing regardless
of your dkms version.

Further, the "mv" command at the end of the MAKE[0]= line needs
lots of back-slashes on each of its parms. We need three to make it
all the way to the final dkms.conf file -- so that's six -- plus one
more to escape the '$'; that's seven in all.

In case there's any question (and with all the back-slashes involved,
there should be) about the intent here, the whole point of this
patch is to make the final dkms.conf MAKE[0]= line look like this
(modulo line breaks):

MAKE[0]="KMODNAME=openafs.ko; DSTKMOD=\".\"; [ \"\`echo
\"${kernelver_array[0]}${kernelver[0]}\" | sed -e
's/^\([0-9]*\.[0-9]*\)\..*/\1/'\`\" = \"2.4\" ] && KMODNAME=\"libafs-*\"
&& DSTKMOD=openafs.o; ./configure
--with-linux-kernel-headers=${kernel_source_dir}
--with-linux-kernel-packaging; make; mv src/libafs/MODLOAD-*/\\\$KMODNAME
\\\$DSTKMOD"

This is what was required to get "dkms build ..." to work on Fedora 15,
and as near as I can tell it shouldn't break 2.4 or other builds.

FIXES 130211

Reviewed-on: http://gerrit.openafs.org/5393
Tested-by: Derrick Brashear <shadow@dementix.org>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
(cherry picked from commit 8e0aaae076f4cccfd2d6ed81ede4e355235b578e)

Change-Id: I47b0e24a0cbbd8402d5dd902e7e2af59ca1c30b7
Reviewed-on: http://gerrit.openafs.org/6312
Reviewed-by: Derrick Brashear <shadow@dementix.org>
Tested-by: Derrick Brashear <shadow@dementix.org>

aklog: strlen(NULL) doesn't work

strlen(filepath) when !filepath isn't going to work very well. I believe
this to be the intent of the author of the original patch.
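
A minimal sketch of the kind of guard this implies. The helper name
path_is_set is hypothetical and is not taken from aklog; the point is only
that the NULL test has to run before strlen does.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: returns 1 when filepath is a non-empty string.
 * The NULL test comes first because strlen(NULL) is undefined behavior.
 */
int
path_is_set(const char *filepath)
{
    return filepath != NULL && strlen(filepath) > 0;
}
```

With this ordering, path_is_set(NULL) simply evaluates to 0 and strlen is
never reached.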

Reviewed-on: http://gerrit.openafs.org/5328
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Chas Williams - CONTRACTOR <chas@cmf.nrl.navy.mil>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
(cherry picked from commit c3e82ee98bf66058636f11d7a98d3bebe3bac955)

Change-Id: I89911d2da314059db633c00c69c9c9ec2050bb86
Reviewed-on: http://gerrit.openafs.org/6311
Reviewed-by: Derrick Brashear <shadow@dementix.org>
Tested-by: Derrick Brashear <shadow@dementix.org>

ihandle: OPEN fdPs are not counted in ihP refcount

Just add a comment explaining that an OPEN FdHandle_t does not count
against the ref count for its parent IHandle_t. Recently I've seen
some confusion about this when discussing ihandle internals, and this
should make this abundantly clear.

Reviewed-on: http://gerrit.openafs.org/5317
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
(cherry picked from commit 0f4dfaed6b25ae4282298cc2ba4908ce9f36f043)

Change-Id: Icd0d5b368ccc679967e14b2460f47c814598c797
Reviewed-on: http://gerrit.openafs.org/6310
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

xserver lock order violation

Individual volume locks are pretty far down, well after afs_xserver.

afs_SetupVolume (with tv->lock)-> InstallUVolumeEntry-> afs_GetServer.

Install*Volume is careful to protect against recursing into the volume
lock via ResetVolumeInfo. Unfortunately, GetServer acquires xserver,
and then if it needs to call GetCapabilities, it drops and reacquires
xserver.

Turns out the volume locks weren't protecting much. They also aren't
grabbed before xvolume is dropped. Fine, so, restructure to do all the
work, then merge the result.

Reviewed-on: http://gerrit.openafs.org/5303
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
(cherry picked from commit 16dff61e148ce6893a68dda6e05e84f96fa753ac)

Change-Id: I7ca73fe9cf76e9a47cdccfc6cf0e9188fce9f5a6
Reviewed-on: http://gerrit.openafs.org/6309
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

xvcb lock order violation

afs_FlushVCBs(1) = xvcb, xserver (in that order)

afs_GetServer = xserver, xsrvAddr, (call afs_RemoveSrvAddr which calls
afs_FlushServer, which gets xvcb)

"Nope". Do a little dance to get xvcb, searching for a struct server to reuse
again if we had to block.

if you're curious:
Lock afs_xserver status: (reader_waitingwriter_waiting, write_locked(pid:1589 at:36), 3 waiters)
Lock afs_xvcb status: (none_waiting, write_locked(pid:0 at:273))
Lock afs_xsrvAddr status: (none_waiting, write_locked(pid:1589 at:116))
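
One common way to do this kind of dance, sketched with pthread mutexes
rather than the AFS lock package. The two mutexes are hypothetical
stand-ins for afs_xvcb and afs_xserver, and the function name is made up
for illustration; this is not the code from the patch.

```c
#include <pthread.h>

/* Hypothetical stand-ins for afs_xvcb and afs_xserver.
 * Required order: xvcb first, then xserver. */
pthread_mutex_t xvcb = PTHREAD_MUTEX_INITIALIZER;
pthread_mutex_t xserver = PTHREAD_MUTEX_INITIALIZER;

/*
 * Called with xserver held; returns with both locks held.
 * Returns 1 if xserver had to be dropped along the way, in which case
 * anything the caller found under xserver must be searched for again.
 */
int
get_xvcb_holding_xserver(void)
{
    if (pthread_mutex_trylock(&xvcb) == 0)
        return 0;               /* got it without blocking */

    /* Would have blocked: back out and retake both in the right order. */
    pthread_mutex_unlock(&xserver);
    pthread_mutex_lock(&xvcb);
    pthread_mutex_lock(&xserver);
    return 1;
}
```

A caller that gets 1 back cannot trust any struct server pointer it picked
before the call, since xserver was released in between.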

Reviewed-on: http://gerrit.openafs.org/5294
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Tested-by: Derrick Brashear <shadow@dementix.org>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
(cherry picked from commit 12fa5b859b857aaf0ab6975ebac0d4867d0ae0ff)

Change-Id: Ifee367fef4da44bcfd92cea6d26612977d6653a1
Reviewed-on: http://gerrit.openafs.org/6308
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

aklog: work around lion kerberos disaster

Fine, so, instead of needing weak crypto enabled, use the krb5 config
paths trick and ship a config to deal.

Reviewed-on: http://gerrit.openafs.org/5310
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Tested-by: Derrick Brashear <shadow@dementix.org>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
(cherry picked from commit 33bb5218ba8d6c5b5c5c4839fd31824ca90c062b)

Change-Id: I91a8a02638cadf6f55814763b16cc50d3c7334c5
Reviewed-on: http://gerrit.openafs.org/6307
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Tested-by: Derrick Brashear <shadow@dementix.org>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

ihandle: Actually assert active fdPs are not AVAIL

FdHandle_t's that are on the linked list for an associated IHandle_t
should not be in the state FD_HANDLE_AVAIL. For the non-PIO case, we
assert that this is the case in ih_open (since we assert that if the
FdHandle_t is not in INUSE state, then it must be in OPEN state).
However, for the PIO case, we were just skipping over any FdHandle_t's
that were in the AVAIL state. These should never exist while on that
linked list, so assert for the PIO case, as well.

In the absence of bugs, there is no functional change here, but it
perhaps makes the ih_open loop easier to understand.
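
A simplified sketch of the loop shape described above. The enum and struct
are cut-down stand-ins, not the real ihandle types, and find_open_handle is
a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the FdHandle_t states and list linkage. */
enum fd_state { FD_HANDLE_AVAIL, FD_HANDLE_OPEN, FD_HANDLE_INUSE };

struct fd_handle {
    enum fd_state state;
    struct fd_handle *next;
};

/* Return the first reusable (OPEN) handle on the list, or NULL. */
struct fd_handle *
find_open_handle(struct fd_handle *head)
{
    struct fd_handle *fdP;

    for (fdP = head; fdP != NULL; fdP = fdP->next) {
        /* Handles on this list are never AVAIL; fail loudly if one is. */
        assert(fdP->state != FD_HANDLE_AVAIL);
        if (fdP->state == FD_HANDLE_OPEN)
            return fdP;
    }
    return NULL;
}
```

Asserting instead of silently skipping means a stray AVAIL handle shows up
as a crash at the point of corruption rather than as odd behavior later.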

Reviewed-on: http://gerrit.openafs.org/5307
Reviewed-by: Derrick Brashear <shadow@dementix.org>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
(cherry picked from commit 6d79cfb36165c33dd1fd9c4d7ca8436d9a78f7ff)

Change-Id: If9e74f6120b007368128aead8787d715a1b1f093
Reviewed-on: http://gerrit.openafs.org/6306
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

LINUX vcache lock ordering in afs_linux_readdir

Normalize shared and exclusive lock operations. Take the lock
exclusive immediately, since the code assumes a write lock if
the vcache state is in flux or the entry is being fetched, releasing
-write- rather than shared, since we do not hold a shared lock.

Reviewed-on: http://gerrit.openafs.org/5309
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Simon Wilkinson <sxw@inf.ed.ac.uk>
Reviewed-by: Derrick Brashear <shadow@dementix.org>
(cherry picked from commit fa97579a08cdf23fcff3c50a5845d72a785feeaf)

Change-Id: I282913fead10791751ebaf3c7c6b33e3fbd9a1f7
Reviewed-on: http://gerrit.openafs.org/6305
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

butc: initialize startTime before it is used

In some unusual error situations, startTime may be used uninitialized.
Move the initialization up above the first such error condition.
(None of the intervening code can take measurably long to execute
so this should not make any difference in the non-error case.)

Found-by: clang static analyzer
Reviewed-on: http://gerrit.openafs.org/5165
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Simon Wilkinson <sxw@inf.ed.ac.uk>
Reviewed-by: Derrick Brashear <shadow@dementia.org>
(cherry picked from commit 34cc26a1b11bc8cf8f91996a019ac4b7d21dccd8)

Change-Id: I70e08b61fbc33857da88224a0577330a0d68d9a7
Reviewed-on: http://gerrit.openafs.org/6304
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Tested-by: Derrick Brashear <shadow@dementix.org>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

afsd: Fail gracefully on mtab open failure

On Linux and IRIX, fail gracefully when we fail to open /etc/mtab,
instead of segfaulting. Move strdup'ing cacheMountDir until after
opening /etc/mtab, to simplify the error handling.

Reviewed-on: http://gerrit.openafs.org/4825
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementia.org>
(cherry picked from commit b1f0bb472e237f5a6f88449db44f030c08a5a324)

Change-Id: Id12f6190eac15593dd32fd46db354e169d19dc2f
Reviewed-on: http://gerrit.openafs.org/6303
Tested-by: Derrick Brashear <shadow@dementix.org>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

vos: Don't leak/overflow bulkaddrs

The vos listaddrs command repeatedly reuses a bulkaddrs array. It
zeros it once (without freeing the allocated memory), and then
repeatedly uses it without zeroing in a loop. This means that the XDR
library assumes that a sufficiently large block is already allocated,
doesn't reallocate for the incoming data, or check limits.

This means that if the first call to VL_GetAddrsU returns a set of
addresses smaller than subsequent calls, we'll write past the end
of the array, causing memory corruption.

Fix this by freeing the arrays correctly with each pass of the call.
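
The pattern can be sketched without rx or XDR at all. Everything below is
hypothetical: decode_addrs only mimics the "allocate when the pointer is
NULL, otherwise trust the caller's buffer" convention described above, and
none of the names come from vos.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an XDR-decoded array such as bulkaddrs. */
struct addr_list {
    unsigned int len;
    unsigned int *val;
};

/*
 * Hypothetical decoder: allocates only when out->val is NULL, otherwise
 * assumes the existing buffer is large enough (the dangerous part).
 */
int
decode_addrs(struct addr_list *out, const unsigned int *wire, unsigned int n)
{
    if (out->val == NULL) {
        out->val = malloc((n ? n : 1) * sizeof(*out->val));
        if (out->val == NULL)
            return -1;
    }
    memcpy(out->val, wire, n * sizeof(*out->val));
    out->len = n;
    return 0;
}

/* Release the array and reset it so the next decode allocates afresh. */
void
free_addrs(struct addr_list *l)
{
    free(l->val);
    l->val = NULL;
    l->len = 0;
}

/*
 * Safe loop: start from a zeroed struct and free after every pass.
 * Returns the total number of addresses seen, or -1 on failure.
 */
int
list_all(const unsigned int *wire[], const unsigned int counts[], int ncalls)
{
    struct addr_list addrs;
    int i, total = 0;

    memset(&addrs, 0, sizeof(addrs));
    for (i = 0; i < ncalls; i++) {
        if (decode_addrs(&addrs, wire[i], counts[i]) != 0)
            return -1;
        total += (int)addrs.len;
        free_addrs(&addrs);     /* without this, a larger reply overflows */
    }
    return total;
}
```

If the free_addrs call were dropped, a one-entry first reply followed by a
three-entry reply would write past the one-entry allocation.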

Reviewed-on: http://gerrit.openafs.org/4756
Reviewed-by: Jeffrey Altman <jaltman@openafs.org>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementia.org>
(cherry picked from commit b6add117ad210665a811213fe17a30fabbda3a3c)

Change-Id: Ic3ae8f506e87d18fdc121ff21221f61c359e38aa
Reviewed-on: http://gerrit.openafs.org/6302
Tested-by: Derrick Brashear <shadow@dementix.org>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

volinfo: fix size totals when saving inodes

Fix the volume size calculation when volinfo is invoked with
both -sizeOnly and -saveinodes at the same time.

Reviewed-on: http://gerrit.openafs.org/4691
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementia.org>
(cherry picked from commit ababc1ba4412ae94b29f8ba0832eac087a024af2)

Change-Id: I371a983078c12e09474051ba71f63cdeb57c3631
Reviewed-on: http://gerrit.openafs.org/6301
Reviewed-by: Michael Meffie <mmeffie@sinenomine.net>
Tested-by: Derrick Brashear <shadow@dementix.org>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

libafs: Always use anonymous VL connections

afs_NewVolumeByName was using the areq given by the caller for
afs_SetupVolume, which may represent authenticated credentials. Give
afs_SetupVolume &treq instead, which will be anonymous, so we don't
have to deal with rxkad for VL lookups.

Reviewed-on: http://gerrit.openafs.org/4666
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementia.org>
(cherry picked from commit 4a82c0cc4167b729108813965bd39bf86ea15e6b)

Change-Id: Ic10e85b925176719c6c5dc708a1d1a315409d295
Reviewed-on: http://gerrit.openafs.org/6300
Tested-by: Derrick Brashear <shadow@dementix.org>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

viced: Don't VTakeOffline_r without glock

We don't have the volume glock, so don't call _r functions.

Reviewed-on: http://gerrit.openafs.org/4669
Reviewed-by: Derrick Brashear <shadow@dementia.org>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
(cherry picked from commit ef0ac2fbb026385f2306189230c2cff8706dff06)

Change-Id: I3d7c2ca8a514d50c01d4830640e806cefac32af1
Reviewed-on: http://gerrit.openafs.org/6299
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

viced: Check vnode length on Rename and Link

Commit 2578555d7e08131bf2fe4cdd0aa4b32567a76eb2 added vnode length
checks when we create or remove vnodes, but not during Rename and Link
operations (when vnodes are neither created nor destroyed). Add the
check in Rename and Link.

Reviewed-on: http://gerrit.openafs.org/4668
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementia.org>
(cherry picked from commit 6df5547a7b93af74bc49ec8d4678aafd646dda1b)

Change-Id: I795407a143a56f26c0679b929763ebdc9c633e7a
Reviewed-on: http://gerrit.openafs.org/6298
Tested-by: Derrick Brashear <shadow@dementix.org>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

viced: Do not try to reuse deleted client

When h_FindClient_r encounters a deleted client structure, it does not
try to find a different client structure to use. Force it to use a new
client structure by setting client to NULL when it detects a deleted
client.

This arguably reverts part of
4e55e30f5b2c149b350b6d6875793adf722fdc21, but the code paths in
h_FindClient_r are very different now, so that commit is probably not
too relevant.

Reviewed-on: http://gerrit.openafs.org/4582
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementia.org>
(cherry picked from commit b2c6a850738437256626e0dfe743a09224879ad4)

Change-Id: I5e3a12ee79847a915edeec732946b43270a35697
Reviewed-on: http://gerrit.openafs.org/6296
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

pam: Password is const in setcred

afs_setcred.c gets the "password" pointer from pam_get_data, which
always gives a const pointer (unlike pam_get_item used in afs_auth.c
&c, which sometimes gives a const or not-const pointer, depending on
the PAM implementation).

So, declare password const, to get better type checking.

Reviewed-on: http://gerrit.openafs.org/4553
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementia.org>
(cherry picked from commit 94a9b2afd82b6729ddceb7ef736ddeb039e0ae1b)

Change-Id: I3171babfbdf29e7aa543a17f7dd543deedc9b30c
Reviewed-on: http://gerrit.openafs.org/6294
Tested-by: Derrick Brashear <shadow@dementix.org>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

pam: Use PAM_CONST more often

Some callers of pam_get_item et al were just casting their argument to
a const void **. Some PAM implementations (Linux) want a const void**,
but others (Solaris) do not. Use the PAM_CONST symbol already defined
by autoconf to declare or cast the relevant variable const or not as
appropriate.

Reviewed-on: http://gerrit.openafs.org/4470
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementia.org>
(cherry picked from commit 78d1f8d88334f711eaaf6555d3a962a504d3e80e)

Change-Id: I831fa52c238a6cf7ef211e8198815c4420ae7dce
Reviewed-on: http://gerrit.openafs.org/6291
Tested-by: Derrick Brashear <shadow@dementix.org>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

pam: Check for null upwd from getpwnam_r

The POSIX getpwnam_r can yield a NULL struct passwd pointer even when
the returned error code is 0 (in particular, when the requested entry
is not found). Just add a check for a null upwd to make sure we don't
dereference a NULL pointer.
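
A small sketch of the check against the 5-argument POSIX getpwnam_r. The
function name lookup_uid and the fixed buffer size are assumptions made for
illustration and are not taken from the PAM module.

```c
#include <pwd.h>
#include <stddef.h>
#include <sys/types.h>

/*
 * Look up a user's uid with the POSIX getpwnam_r.
 * Returns 1 if found, 0 if there is no such user, -1 on error.
 */
int
lookup_uid(const char *name, uid_t *uid)
{
    struct passwd pwd;
    struct passwd *upwd = NULL;
    char buf[4096];
    int code;

    code = getpwnam_r(name, &pwd, buf, sizeof(buf), &upwd);
    if (upwd == NULL) {
        /* code == 0 here means "not found"; nonzero is an error number. */
        return (code == 0) ? 0 : -1;
    }
    *uid = upwd->pw_uid;
    return 1;
}
```

Testing only the return code would let the "not found" case through with
upwd still NULL, which is exactly the dereference this change guards
against.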

Reviewed-on: http://gerrit.openafs.org/4469
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementia.org>
(cherry picked from commit fbb4c6115b9af9c52ee06fa9c979a3f4195ad342)

Change-Id: I9a8bccba7b6ecbce393ea149270e5c61ebadd05c
Reviewed-on: http://gerrit.openafs.org/6290
Tested-by: Derrick Brashear <shadow@dementix.org>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

pam: Use POSIX getpwnam_r on Solaris

_POSIX_PTHREAD_SEMANTICS is now always defined for Solaris, which
means we get a POSIX-conforming getpwnam_r, which takes 5 arguments.
So, add Solaris to the list of platforms that use a POSIX getpwnam_r.

Reviewed-on: http://gerrit.openafs.org/4468
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementia.org>
(cherry picked from commit a7d4fbd36a120b16caaddcd9d1c7f550cb14aae5)

Change-Id: I2ce885da5018b250052852cb70c70eaecd521cc5
Reviewed-on: http://gerrit.openafs.org/6289
Tested-by: Derrick Brashear <shadow@dementix.org>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

kernel upcall rx env should shut down event daemon

Also shut down the event daemon in the upcall environment.

Reviewed-on: http://gerrit.openafs.org/4473
Reviewed-by: Derrick Brashear <shadow@dementia.org>
Tested-by: Derrick Brashear <shadow@dementia.org>
(cherry picked from commit f4776f0a4d51472ee6f2406174b074c03213f7da)

Change-Id: I7b362e0e0d1ac5f028718b522e56101f2bed297e
Reviewed-on: http://gerrit.openafs.org/6288
Tested-by: Derrick Brashear <shadow@dementix.org>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

DAFS: Request salvage on detach for volser

When the volserver notices that a volume needs salvaging, mark
V_needsSalvaged. So when we VDetachVolume the volume, we can then just
request the salvage in the volume package.

Fix the VolClone salvaging code to do this as well, instead of using
the vol-private VRequestSalvage_r interface.

Reviewed-on: http://gerrit.openafs.org/4452
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementia.org>
(cherry picked from commit de0c72bf7c7d284f4d15d99c79b39e0c97f1a122)

Change-Id: Id6f86368386a5e113a00aa0a496649d69875d283
Reviewed-on: http://gerrit.openafs.org/6286
Tested-by: Derrick Brashear <shadow@dementix.org>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

volser: Avoid assert on ViceCreateRoot failure

If IH_CREATE fails in ViceCreateRoot, it may just be due to an on-disk
inconsistency. So, don't assert, but just return an error and detach
the volume.

Reviewed-on: http://gerrit.openafs.org/4444
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementia.org>
(cherry picked from commit 399655e3df3bf30d7878dec70402fc0021cae752)

Change-Id: Icbc934bfe59f6468771f37e5721341dae49ba460
Reviewed-on: http://gerrit.openafs.org/6285
Tested-by: Derrick Brashear <shadow@dementix.org>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

DAFS: Do not give back vol to viced after salvage

If we VRequestSalvage_r a volume successfully, and we are not the
fileserver, we will tell the fileserver to salvage a volume. So, we do
not need to give back the volume afterwards, since telling the
fileserver that a volume needs a salvage effectively gives it back (so
the salvager can take it).

So, clear needsPutBack so we don't try to also give back the volume,
and avoid the fileserver yelling at us for trying to give back a
volume that is checked out by someone else (or is not checked out at
all).

Reviewed-on: http://gerrit.openafs.org/4445
Reviewed-by: Derrick Brashear <shadow@dementia.org>
Tested-by: BuildBot <buildbot@rampaginggeek.com>
(cherry picked from commit 316b0421a27a4a76298f60ecd62b1236c971e512)

Change-Id: I432abb4d65a738e0e1936a7ff2fff2eccf45834a
Reviewed-on: http://gerrit.openafs.org/6284
Tested-by: Derrick Brashear <shadow@dementix.org>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

afsd: Trim trailing slashes on Linux mntent

When we write a mount entry on Linux when mounting /afs, trim trailing
slashes on the mount path. Otherwise, the umount utility can get
slightly confused, and leave the /afs mount entry in /etc/mtab after
it's been unmounted.

For full correctness we should probably completely canonicalize the
path like the mount utility does, but it's unlikely that anyone will
provide significantly weird paths for cacheMountDir, so don't bother.
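
A minimal sketch of the trimming step, without any canonicalization. The
helper name trim_trailing_slashes is hypothetical and is not the function
used in afsd.

```c
#include <string.h>

/*
 * Hypothetical helper: strip trailing '/' characters from path in place,
 * but leave a lone "/" intact. Returns path for convenience.
 */
char *
trim_trailing_slashes(char *path)
{
    size_t len = strlen(path);

    while (len > 1 && path[len - 1] == '/')
        path[--len] = '\0';
    return path;
}
```

Given "/afs/" or "/afs///" this leaves "/afs", and a path that is already
"/afs" is left unchanged.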

Reviewed-on: http://gerrit.openafs.org/4442
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Derrick Brashear <shadow@dementia.org>
(cherry picked from commit 325443e6178f9dcdba7326bdb675447ac72bd540)

Change-Id: I9832fad8a43278c5eb618e4148c71f8a9ef81e87
Reviewed-on: http://gerrit.openafs.org/6283
Tested-by: Derrick Brashear <shadow@dementix.org>
Reviewed-by: Derrick Brashear <shadow@dementix.org>

ubik: don't rely on timeout value after select()

The value of timeout after a select() call should be considered
undefined; relying on its value is not portable.
Since IOMGR_Select doesn't modify the timeout it is given, the
intention of the code seems to be to wait for gradually increasing
timeout values, starting at 50ms. At least under Linux, the
timeout gets set to 0 by select() if it waited for the full specified
time, resulting in a much shorter maximum possible wait period.

Initialize the timeout value for each loop according to the existing
logic, to get consistent behaviour between the lwp and pthreaded code.
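
A sketch of the portable pattern, assuming a doubling schedule purely for
illustration (the actual progression in ubik is not shown here). The
function names are hypothetical.

```c
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>

/* Fill tv with a timeout of ms milliseconds. */
void
set_timeout_ms(struct timeval *tv, long ms)
{
    tv->tv_sec = ms / 1000;
    tv->tv_usec = (ms % 1000) * 1000;
}

/*
 * Sleep in `rounds` steps, doubling the wait each time from start_ms.
 * The timeval is rebuilt before every select() because select() may
 * modify it. Returns the total number of milliseconds requested.
 */
long
backoff_wait(long start_ms, int rounds)
{
    long ms = start_ms, total = 0;
    int i;

    for (i = 0; i < rounds; i++) {
        struct timeval tv;

        set_timeout_ms(&tv, ms);    /* fresh value every iteration */
        select(0, NULL, NULL, NULL, &tv);
        total += ms;
        ms *= 2;
    }
    return total;
}
```

Keeping the intended wait in a separate variable (ms) and deriving the
timeval from it each time means the schedule no longer depends on what
select() leaves behind in tv.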

Reviewed-on: http://gerrit.openafs.org/4441
Tested-by: BuildBot <buildbot@rampaginggeek.com>
Reviewed-by: Jeffrey Altman <jaltman@openafs.org>
Reviewed-by: Derrick Brashear <shadow@dementia.org>
(cherry picked from commit 0b510fe30afb34202342364e96bd9030052e1567)

Change-Id: I24eb4d4b1f758f33e3517671cb576ff23e641fb3
Reviewed-on: http://gerrit.openafs.org/6282
Tested-by: Derrick Brashear <shadow@dementix.org>
Reviewed-by: Derrick Brashear <shadow@dementix.org>