Windows uses case-insensitive file name pattern matching,
but AFS is a case-sensitive file system. The AFS3 directory
format is block-based, uses network byte order, and
includes a hash table for fast case-sensitive lookups.
This causes several problems for the Windows AFS client.
(1) Traversing the directory blocks is CPU-intensive.
(2) A hash table miss does not indicate that the desired
entry does not exist.
(3) Determining whether there is a non-ambiguous inexact match
or the entry does not exist at all requires a linear
traversal of the entire directory.
These issues often result in 100% CPU utilization.
These issues are addressed by building a modified B+ tree for
each directory and then using the B+ tree for searches.
Further improvements can be made by using the B+ tree leaf
nodes for directory enumeration.
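
The lookup outcomes described above (exact match, non-ambiguous inexact
match, ambiguous match, no entry) can be illustrated with a minimal
sketch. It assumes a sorted list searched with bisect as a stand-in for
the B+ tree; it is not the AFS client code, and the names DirIndex and
lookup are hypothetical.

```python
import bisect


class DirIndex:
    """Sorted, case-folded index of directory entries.

    A sorted list plus binary search stands in for the B+ tree here;
    this is an illustration, not the structure used by the client.
    """

    def __init__(self, names):
        # Sort by (folded name, exact name) so that entries differing
        # only in case end up next to each other.
        self._keys = sorted((n.casefold(), n) for n in names)

    def lookup(self, name):
        """Return (status, match); status is one of
        'exact', 'inexact', 'ambiguous', 'missing'."""
        folded = name.casefold()
        # "" sorts before every non-empty name, so this finds the first
        # entry whose folded name equals `folded`.
        i = bisect.bisect_left(self._keys, (folded, ""))
        candidates = []
        while i < len(self._keys) and self._keys[i][0] == folded:
            candidates.append(self._keys[i][1])
            i += 1
        if name in candidates:
            return ("exact", name)
        if len(candidates) == 1:
            return ("inexact", candidates[0])
        if candidates:
            return ("ambiguous", None)
        return ("missing", None)


idx = DirIndex(["Makefile", "README", "readme", "src"])
print(idx.lookup("makefile"))
print(idx.lookup("ReadMe"))
```

Because entries that differ only in case are adjacent in the sorted
order, all four outcomes are decided by one search plus a short scan,
without walking the whole directory.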
For . and .., find the last time we saw the fid in the list
instead of moving back a fixed count, since the parent might
be a symlink, a mount point, or both.
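
One possible reading of this change, as a sketch only: keep a list of
the fids visited while walking a path, and when . or .. resolves to a
fid, cut the list back to the most recent occurrence of that fid rather
than popping a fixed number of entries. The function name and the string
fids are hypothetical and only serve the illustration.

```python
def truncate_to_last_seen(visited, fid):
    """Return `visited` cut back to the most recent occurrence of `fid`.

    If `fid` has not been seen yet, it is appended instead.
    """
    for i in range(len(visited) - 1, -1, -1):
        if visited[i] == fid:
            return visited[: i + 1]
    return visited + [fid]


# Crossing a mount point adds two entries (the mount point and the
# volume root), so ".." from the volume root has to step back two
# entries, not one.
walk = ["root", "mountpoint", "volroot", "dir"]
walk = truncate_to_last_seen(walk, "volroot")  # ".." from dir
walk = truncate_to_last_seen(walk, "root")     # ".." from volroot
print(walk)
```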
A new registry value, "BlockSize", can be used to specify an alternative
block size. The default is 4K. A larger block size will be needed if
you want to support a 6TB cache.
Also extend the service startup timeout hint to two minutes to give
the AFS client service more time to start up successfully when the
cache is really large.
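
To put the block size remark in numbers, the sketch below counts how
many blocks a 6TB cache needs at different block sizes. It assumes
binary units (TiB, KiB). The message does not say which limit makes 4K
blocks insufficient, so the code shows only how the block count shrinks
as the block size grows.

```python
TIB = 1024 ** 4
KIB = 1024


def blocks_needed(cache_bytes, block_size):
    """Number of blocks required to cover cache_bytes, rounded up."""
    return -(-cache_bytes // block_size)


for block_kib in (4, 16, 64):
    count = blocks_needed(6 * TIB, block_kib * KIB)
    print(f"{block_kib:>2} KiB blocks: {count:,}")
```

At the 4K default a 6 TiB cache works out to 1,610,612,736 blocks; at
64K it is 100,663,296.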
(1) fixes a bug that could cause a 'host' structure to not be removed
from the global host list if the 'host' did not possess an interface
list. This would happen with older AFS clients that do not support the
WhoAreYou family of RPCs, namely Windows clients older than 1.3.80 and
old Transarc UNIX clients.
(2) fixes a bug which could result in ViceLog being called with an
uninitialized 'hoststr' buffer as a parameter.
(3) ensures that only addresses known to belong to the 'host' are
added to the address hash table. The list of addresses provided by
the client is stored as alternates and is only used when searching
for a client that is no longer accessible on the primary address.
These addresses are not stored in the address hash table within
initInterfaceAddr_r().
The addresses provided by the client should not be added to the hash
table because they have not been verified as belonging to the 'host'
that provided them. The contents of the list may in fact be completely
unreliable. Consider the existing UNIX clients that generate the list
at startup and never alter it even after the client has migrated to a
different network. If two clients both claim the same address,
lookups by address may fail to find the correct one.
a. The client list might contain private address ranges which
are likely to be re-used by many clients allocated addresses
by a NAT.
b. The client list will not include any public addresses that
are hidden by a NAT.
c. Private address ranges that are exposed to the server will
be obtained from the rx connections that use them.
d. Lists provided by the client are not necessarily truthful.
Many existing clients (UNIX) do not refresh the IP address
list as the actual assigned addresses change. The end result
is that they report the initial address list for the lifetime
of the process. In other words, a client can report addresses
that it is in fact not using. Adding these addresses to
the host interface list without verification is not only
pointless, it is downright dangerous.
e. The reported addresses do not include port numbers and
guessing that the port number is 7001 does not work when
port mapping devices such as NATs or some VPNs are in
use.
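
The separation argued for above can be sketched as follows: only the
address and port actually observed on the connection go into the
address hash table, while client-reported addresses are kept as
unverified alternates on the host. This is a simplified illustration,
not the fileserver's data structures; Host, HostTable and their methods
are hypothetical names.

```python
class Host:
    def __init__(self, uuid, primary):
        self.uuid = uuid
        self.primary = primary   # (ip, port) observed on the connection
        self.alternates = []     # client-reported IPs, unverified


class HostTable:
    def __init__(self):
        self.by_addr = {}        # verified (ip, port) -> Host
        self.by_uuid = {}        # uuid -> Host

    def add_host(self, uuid, primary, reported_ips):
        host = Host(uuid, primary)
        self.by_addr[primary] = host
        self.by_uuid[uuid] = host
        # Reported addresses are remembered but never hashed: they carry
        # no port and may be stale, private, or shared behind a NAT.
        host.alternates = [ip for ip in reported_ips if ip != primary[0]]
        return host

    def lookup(self, addr):
        return self.by_addr.get(addr)


table = HostTable()
# Two clients behind different NATs both report the same private address.
a = table.add_host("uuid-a", ("198.51.100.7", 40001), ["192.168.1.10"])
b = table.add_host("uuid-b", ("203.0.113.9", 7001), ["192.168.1.10"])
print(table.lookup(("198.51.100.7", 40001)).uuid)
print(table.lookup(("192.168.1.10", 7001)))
```

Since the shared private address never enters by_addr, the two hosts
cannot shadow each other in lookups by address.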
(4) improves logging to ensure that all references to a 'host' structure
report both a memory address and the IP address/port. This will avoid
confusion *if* more than one 'host' structure is assigned the same
primary address.
(5) logs the UUID along with the client addresses when initializing the
host's interface list. (level 125)
(6) saves memory by using a smaller structure for the UUID hash table.
MultiProbeAlternateAddress_r incorrectly indexes the list of interfaces
for clients with multiple IP interfaces, causing peers with IP
address 0 and port 0 to be created. This in turn results in rxi_sendmsg
errors (on systems where they are caught early, such as Linux; on others
the problem may pass unnoticed).
Add a new fs newalias man page. Add -help to the synopsis and options of
the other new man pages. Add additional missing links in the fs man page.
Fix some wording in the CellAlias man page.
Complete the documentation of the afsd flags and update a few things like
-settime and -nosettime. Add man pages for fs setcrypt, fs getcrypt, and
CellAlias. Based on work by Jason Edgecombe and then extensively edited,
so any errors were probably introduced by me.
The Windows cache manager has suffered from poor performance as a result
of Create, Rename, and Delete operations because they invalidate the
contents of the directory pages in the cache thereby forcing them to be
reloaded from the file server. As the directory size increases, the clock
time necessary to perform the reload increases.
This delta adds support for parsing and updating the AFS3 directory buffers
to cm_dir.c. It then uses that functionality to perform local updates to
the directory buffers whenever the following conditions are met:
1. the change incremented the data version of the directory
by exactly one.
2. all of the directory buffers required for the update are in
the cache.
If these conditions are not met, the directory is reloaded from the file
server.
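
The two conditions can be expressed as a small decision function. This
is a sketch under assumptions, not the code in cm_dir.c: the function
name and parameters are hypothetical, and directory buffers are
represented by page numbers.

```python
def can_update_locally(old_dv, new_dv, needed_pages, cached_pages):
    """Decide whether a directory change may be applied to cached buffers.

    old_dv / new_dv: data version before and after the operation.
    needed_pages: directory pages the update has to touch.
    cached_pages: directory pages currently held in the cache.
    """
    # A jump of more than one means some other change was made to the
    # directory as well, so the cached contents cannot be trusted.
    version_ok = new_dv == old_dv + 1
    # Every page that must be modified has to be present locally.
    pages_ok = set(needed_pages) <= set(cached_pages)
    return version_ok and pages_ok


print(can_update_locally(41, 42, [0, 3], [0, 1, 2, 3]))
print(can_update_locally(41, 43, [0, 3], [0, 1, 2, 3]))
```

When the function returns False, the fallback is the one described
above: reload the directory from the file server.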
If all of the servers are down when a callback is due to expire,
delay the expiration until at least one server is available.
This prevents some applications that are running while the CM
is off the network from failing if their pages are swapped out.
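
A minimal sketch of this rule, assuming the check runs periodically: an
overdue callback is only expired when at least one server is reachable,
otherwise the expiration is pushed back and checked again later. The
function name and the retry_interval parameter are hypothetical.

```python
def check_callback(now, expires_at, servers_up, retry_interval=60):
    """Return (expired, expires_at) for one callback.

    servers_up: iterable of booleans, one per server for the volume.
    retry_interval: assumed delay in seconds before re-checking.
    """
    if now < expires_at:
        return (False, expires_at)          # not due yet
    if not any(servers_up):
        # Every server is down: keep the cached data usable and look
        # again after retry_interval seconds.
        return (False, now + retry_interval)
    return (True, expires_at)               # due and a server is reachable


print(check_callback(100, 50, [False, False]))
print(check_callback(100, 50, [False, True]))
```

An empty server list is treated like "all down" in this sketch, so the
callback is kept rather than dropped.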
Mention aklog and kinit in klog's man page, add -dynroot to the afsd man
page, and mention that -skipauth tells uss not to create any Kerberos
principal and that this has to be done separately.
This delta adds an interface to an optional volume status handler.
The handler (if provided) receives status updates when volumes
change state between online, offline, busy, and alldown.
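
The shape of such an optional handler can be sketched as below. The four
states come from the description above; the class and method names are
hypothetical and do not reflect the actual interface added by the delta.

```python
from enum import Enum


class VolStatus(Enum):
    ONLINE = "online"
    OFFLINE = "offline"
    BUSY = "busy"
    ALLDOWN = "alldown"


class VolumeTracker:
    def __init__(self, handler=None):
        self._handler = handler   # optional callable(volume_id, old, new)
        self._status = {}         # volume_id -> VolStatus

    def set_status(self, volume_id, new_status):
        old = self._status.get(volume_id)
        if old is new_status:
            return                # no state change, nothing to report
        self._status[volume_id] = new_status
        if self._handler is not None:
            self._handler(volume_id, old, new_status)


events = []
tracker = VolumeTracker(handler=lambda vol, old, new: events.append((vol, old, new)))
tracker.set_status(536870915, VolStatus.ONLINE)
tracker.set_status(536870915, VolStatus.ONLINE)   # ignored, unchanged
tracker.set_status(536870915, VolStatus.BUSY)
print(len(events))
```

Without a handler the tracker only records state, which matches the
"if provided" wording above.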