and osi_Invisible cleanup so ifdef logic would be more clear
====================
This delta was composed from multiple commits as part of the CVS->Git migration.
The checkin message with each commit was inconsistent.
The following are the additional commit messages.
====================
"1. The default OpenAFS installation is set to normal security (it doesn't
generate random user names).
If you are installing over a previous version (before 1.2.2b), its default is
high security; therefore, if you want normal security, you should
uninstall the previous version (1.2.2a or earlier) and select 'Not
Preserve previous settings'.
To manually change security you need to set the following registry keys:
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\TransarcAFSDaemond\NetworkProvider
LogonOptions = 1 - Integrated Logon
LogonOptions = 2 - High Security options, Random User name generation
LogonOptions = 3 - both
3. Windows 2000/NT, Win9x - First-time installations will create the necessary
directories when the user decides to download CellServDB
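The LogonOptions values listed under item 1 read like a two-bit mask (1 and 2 combine to 3). A minimal C sketch of decoding the value under that assumption; the macro and function names are hypothetical and are not taken from the OpenAFS source, only the numeric values come from the text above:

```c
#include <stdbool.h>

/* Hypothetical names; only the values 1, 2 and 3 come from the text above. */
#define LOGON_OPTION_INTEGRATED 0x1u /* LogonOptions = 1: integrated logon */
#define LOGON_OPTION_HIGHSEC    0x2u /* LogonOptions = 2: high security, random user names */

/* True when integrated logon is enabled (values 1 and 3). */
static bool logon_is_integrated(unsigned int logon_options)
{
    return (logon_options & LOGON_OPTION_INTEGRATED) != 0;
}

/* True when high security is enabled (values 2 and 3). */
static bool logon_is_high_security(unsigned int logon_options)
{
    return (logon_options & LOGON_OPTION_HIGHSEC) != 0;
}
```

With this reading, a value of 3 enables both behaviours, which matches the "both" entry in the list.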
based on suggestion from Ted Anderson
"the changes make more sense
than the code as it currently exists. the only thing i am nervous
about is the dontSleep delete. while it makes more sense to just
not wake up sleepers if none exist, i suppose it's possible that
some bit of afs code wants acausal (wake before sleep) events.
that does seem quite unlikely. just looking at the sleep on
solaris, it checks the seq number to get the next event not
a previous event.
i imported the changes and made the fixup in osi_stoplistener().
i dropped some of the silly syntax changes that junked up the
diff -- this makes it a bit easier to see what was changed.
i just added an assert in afs_addevent for quality assurance
purposes."
====================
fix for osi_StopListener so it does the right thing
"The first is to change the gfp_mask passed to kmalloc(). Using GFP_KERNEL,
it is possible that the VM will call back to the filesystem to free up
memory to satisfy the kmalloc request. GFP_NOFS will prevent this possible
recursion. I believe GFP_NOFS first appeared in the 2.4.6 kernel.
The second change involves the call to schedule() when vmalloc() fails. This
can also cause a hang. The schedule() call could be replaced with:
"This fixes a livelock condition introduced in my earlier
resource starvation patch; apparently I had erred too far
on the side of "wake up just in case". The livelock bug
is exhibited when running 10 fsstress processes at once;
if many processes are waiting for a new Rx call, they get
stuck in an uninterruptible kernel loop waking each other
up."
This patch makes sure that in-kernel aliases to nonexistent names aren't
accidentally created due to case mismatch (e.g. "athena" being created as
a symlink to "athena.MIT.EDU", while "athena.mit.edu" is the real cell
that already exists). It also lowercases cell names in AFSDB lookups,
otherwise the same problem appears in userspace (e.g. "aklog athena" tries
to obtain tokens for cell "athena.MIT.EDU").
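The case normalization described above can be sketched in C as follows. This is an illustration only: lowercase_cell_name is a hypothetical helper, not the function the patch adds.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper, not the function used in the patch:
 * lowercase a NUL-terminated cell name in place. */
static void lowercase_cell_name(char *name)
{
    for (size_t i = 0; name[i] != '\0'; i++) {
        /* Cast to unsigned char: passing a negative char to tolower() is undefined. */
        name[i] = (char)tolower((unsigned char)name[i]);
    }
}
```

Applied to "athena.MIT.EDU", the helper yields "athena.mit.edu", so a lookup or alias creation that follows sees the same spelling as the existing cell.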
"My theory of what happened is roughly as follows:
Process tries to read data from AFS (as part of a page fault);
issues a new Rx call on an Rx connection to the fileserver.
The server transmits some data back to the client, but some packet
is lost.
Something tries to garbage-collect/destroy the connection; since
there is an active call, it can't do so, but issues an rx_AckAll
anyway, which acknowledges all packets transmitted by the server
as having been received. Server flushes its retransmit queue.
Client waits forever for the lost packet to arrive, but since the
server has already flushed the transmit queue, it cannot possibly
retransmit it.
All this is happening while the client has read-locked its address
space (since the read is part of a page fault). /proc accesses that
try to poke into that process's address space hang waiting for said
lock, causing the lossage we actually observed."
"This fix deals with the following lose case:
Client starts a call that, for some reason, takes a long time on the
server. While the client waits for the server to finish, client and
server usually send each other keep alive packets. If something
causes those packets to be delayed or dropped, then the client will
conclude that the call has failed or finished (usually failed), while
the server is still *busy* doing the call.
In this circumstance, the client will initiate another call and the
server will correctly respond that it is busy. Unfortunately, if the
callNumber of a received packet doesn't match the callNumber of the
outstanding call, then the client never sees that the server says it's
busy. Instead the server appears as a black hole to the client.
This fix ensures that the client sees the busy packets when its
callNumber is reasonably out of sync with the server."
This patch fixes a resource starvation condition in Rx. The
problem arises, for instance, when more than 4 daemons try to
prefetch chunks of the same file at once. The fifth daemon is
stuck in MAKECALL_WAITING state, never getting a chance to run,
because the other 4 daemons never yield to the scheduler after
releasing the call, and just grab the call back again.
Currently it's possible to give StoreData negative Pos/Length/FileLength
arguments and thereby set the volume quota usage to arbitrary values.
This patch makes these values unsigned, since negative file positions
and lengths don't make sense anyway.
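A small C sketch of why the signedness matters. It assumes a simple upper-bound check, which is an illustration and not the fileserver's actual validation; MAX_FILE_LENGTH and both function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical upper bound, used only for this illustration. */
#define MAX_FILE_LENGTH 0x7fffffffu

/* Signed variant: a negative length slips past an upper-bound check. */
static bool length_ok_signed(int32_t file_length)
{
    return file_length <= (int32_t)MAX_FILE_LENGTH;
}

/* Unsigned variant: the same bit pattern is a huge value and is rejected. */
static bool length_ok_unsigned(uint32_t file_length)
{
    return file_length <= MAX_FILE_LENGTH;
}
```

Here length_ok_signed(-1) accepts the value, while length_ok_unsigned((uint32_t)-1) rejects it; ordinary lengths such as 4096 pass both checks.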
no reason the server etcdir needs to be forced world-readable; nothing needs
to default to those cellconfig files except in the localauth case, and then
you need to be able to read the KeyFile anyway
afs_RemoveCellEntry holds afs_xcell; setserverprefs modified the same
structure but did not hold the lock, which was problematic if something
changed out from under it
this caused a call to pdflush to happen at the wrong time; fixing it should
resolve the zero filled files problem and the osi_assert(cred) problem, and
make the execsorwriters == 0 warnings go away
if you're not using ufs logging it's ok to replace solaris fsck with vfsck,
except sometimes it exits with 40 and that's not a failure to the solaris
scripts.
Currently nothing clears the CLIENTDELETED flag in hosts, so once
a client has been deleted, h_TossStuff_r() will keep getting called
with every host release. This patch clears the CLIENTDELETED flag
every time we take care of deleted clients.
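A minimal C sketch of the flag handling described above. The struct, the flag value and release_host are hypothetical; only the names CLIENTDELETED and h_TossStuff_r come from the text, and the counter merely stands in for the real cleanup call.

```c
/* Hypothetical flag value; the real definition lives in the fileserver headers. */
#define CLIENTDELETED 0x20u

/* Hypothetical, stripped-down host record for illustration. */
struct host_sketch {
    unsigned int hostFlags;
    int toss_calls; /* counts how often the cleanup path ran */
};

/* Release path: run the cleanup only while the flag is set, then clear it. */
static void release_host(struct host_sketch *h)
{
    if (h->hostFlags & CLIENTDELETED) {
        h->toss_calls++;                /* stands in for h_TossStuff_r() */
        h->hostFlags &= ~CLIENTDELETED; /* the fix: clear the flag once handled */
    }
}
```

Without the clearing line, every later release of the same host would take the cleanup path again; with it, a second release is a no-op.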
"apparently the rev 1 r5000 chips implement 'cvt' incorrectly. the irix
kernel works around this problem by checking each text page mapped into
memory and doing a fixup on the cvt instructions. it tries to maintain
a hash of these pages using fid2() or fid() if fid2() returns ENOSYS.
afs, in an effort to prevent people from doing checkpoints on an afs
filesystem, makes fid2() return EINVAL. this also keeps the kernel from
mapping executables that are in afs space on the broken r5000's.
this is the patch i have been using for the past couple years while
waiting for an official fix. it makes fid2() return ENOSYS, so you
now need to have v_ckpt. however i disabled the rest of the
CKPT code since i have no idea how well that code actually works.
additionally, this behavior is only functional on machines with the
'broken' r5000 h/w. i can't think of a better way to fix this problem
since i can't change the irix kernel."
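The fallback described in the quoted message can be sketched in C as follows. The function-pointer signature and all function names here are hypothetical stand-ins (the real IRIX vnode operations take different arguments); only the error codes and the fid2()/fid() ordering come from the text.

```c
#include <errno.h>

/* Hypothetical signature; the real IRIX vnode operations differ. */
typedef int (*fid_op)(void *vnode, void *fid_out);

/* Stand-ins for the behaviours described above. */
static int afs_fid2_before(void *vnode, void *fid_out) { (void)vnode; (void)fid_out; return EINVAL; }
static int afs_fid2_after(void *vnode, void *fid_out)  { (void)vnode; (void)fid_out; return ENOSYS; }
static int afs_fid(void *vnode, void *fid_out)         { (void)vnode; (void)fid_out; return 0; }

/* Fallback as described: try fid2() first, use fid() only on ENOSYS. */
static int lookup_fid(fid_op fid2, fid_op fid, void *vnode, void *fid_out)
{
    int code = fid2(vnode, fid_out);
    if (code == ENOSYS)
        code = fid(vnode, fid_out);
    return code;
}
```

With the pre-patch stand-in, lookup_fid returns EINVAL and the fallback is never reached; with the patched stand-in, ENOSYS triggers the fid() call and the lookup succeeds.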