From 66adaf326861ae8a544650928bc6934d26a91d1d Mon Sep 17 00:00:00 2001
From: Jeffrey Altman
Date: Sun, 10 Jun 2007 17:58:51 +0000
Subject: [PATCH] windows-volume-status-tracking-20070610

* Changed the enum values for cm_serverRef_t state info to use a
  private name space (srv_) to avoid collisions.

* Added a srv_deleted state for cm_serverRef_t objects. This state is
  set when cm_FreeServerList() is called with the
  CM_FREESERVERLIST_DELETE flag set. cm_FreeServerList() may not always
  delete the cm_serverRef_t from the list if it is still in use by
  another thread. The srv_deleted state means the object's contents are
  no longer valid and it must be skipped. It will be deleted the next
  time the object is freed and the refcount hits zero.

* When a file server reports either VNOVOL or VMOVED, the
  cm_serverRef_t is now marked srv_deleted instead of offline. This is
  done to prevent additional usage of the stale VLDB data while waiting
  for the update volume request to complete.

* Added a state field to the cm_volume_t object (enum volstate, vl_
  name space) that maintains the state of the volume based upon the
  states of all of the cm_serverRef_t and cm_server_t objects.

* Modified cm_UpdateVolume() to set the state of the cm_volume_t RW,
  RO, and BK to either vl_alldown or vl_online. There can't be any
  other states because cm_UpdateVolume() destroys any previous
  knowledge we might have had regarding busy or offline volume status.

* Modified cm_UpdateVolume() to update the volume name in the
  cm_volume_t to the volume base name if the previous value was a
  volume ID.

* Modified cm_FollowMountPoint() to check whether the volume name is a
  volume ID and, if so, call cm_GetVolumeByID() instead of
  cm_GetVolumeByName(). This ensures that volume IDs are always looked
  up as numeric values. There is no longer a need to maintain a
  separate cm_volume_t containing the string representation of the ID
  value.

* Added a flags parameter to cm_GetVolumeByName() and
  cm_GetVolumeByID(). The first flag is a "CREATE" flag which is set by
  all existing calls. The flag is not set by calls to
  cm_GetVolumeByID() from the server probe code when volume status is
  being updated. We do not want the server probe operation to result in
  additional turnover in the cached volume data. The second flag is
  NO_LRU_UPDATE, which is set when the server probe code updates the
  volume status. This flag will be used to prevent the server probe
  operation from changing the order of the least recently used queue.

* Modified cm_GetVolumeByName() to ensure that only one cm_volume_t is
  allocated for a given set of normal, readonly, and backup volumes
  regardless of whether the volume is accessed via name or ID number.
  The cm_volume_t namep field is always the base name of the volume.

* Added a new volume state, vl_unknown. This state is used as the
  initial state for all cm_volume_t when the cache manager starts, for
  each cm_volume_t at creation, and for each cm_volume_t when
  recycling. The cache manager does not know the state of all volumes
  in the world, only those that are in the cache and for which it has
  queried the VLDB and hosting file servers.

* Modified cm_GetVolumeByName() to initialize the state of a volume to
  vl_unknown. The actual state will be set when a cm_UpdateVolume()
  call completes successfully.

* Changed the names of the scache hash table variables to avoid
  ambiguity when adding hash tables for volumes.

* Fixed a buffer overrun in sys\pioctl_nt.c pioctl(). (thanks Asanka)

* Modified cm_UpdateVolume() to handle the case in which there is no RW
  volume but there are RO volumes for a given base name. This is done
  by querying for the ".readonly" volume name if the base name does not
  exist in the VLDB. We never query for the .backup name because under
  the current usage model a .backup volume may only exist on the server
  where the read-write volume is located. If there is no RW volume,
  there can be no .backup.

* Added four hash tables for cm_volume_t objects to improve the search
  time of cm_GetVolumeByID(), cm_GetVolumeByName() and
  cm_ForceUpdateVolume(). One each for Name, RWID, ROID, and BKID.
  Three ID hash tables are necessary as long as it is desirable to
  maintain a single cm_volume_t containing all of the related RW, RO,
  and BK volume data. Having the RW and RO volume data in the same
  object is necessary for the implementation of cm_GetROVolumeID(),
  which returns either the RO or RW ID depending upon the existence of
  RO volume instances.

* Added a volume LRU queue so that volume reuse becomes fairer. This
  does not replace the all-volumes list, which is used when it is
  desirable to walk a list of all the volumes whose order will not
  change out from underneath you; that stable order makes it safe to
  drop the cm_volumeLock.

* Handles volume hash table updates when the mapping from volume name
  to volume ID number changes. The volume name remains constant in the
  cm_volume_t. If a vos rename is performed, the name of the volume
  will change and the volume IDs will be updated. Subsequent access to
  the old volume ID will create a new cm_volume_t with the new name.

* Added a daemon thread operation to query the state of volumes listed
  as busy or offline. cm_CheckBusyVolumes() calls
  RXAFS_GetVolumeStatus() for each volume ID that is marked vl_busy or
  vl_offline. If the volume is now online, the status on the volume is
  updated. The default period is 600 seconds. This can be configured
  with the BusyVolumeCheckInterval registry value.

* Added a prototype for smb_IoctlPrepareRead(), which was missing a
  return type in the function definition.

* Added volume ID lists to the cm_server_t. These lists are allocated
  in blocks of ~32 IDs. When a cm_PingServer() detects a change in
  server state, the state of the cm_volume_t is updated.

* Added volID to the cm_serverRef_t object. volID is used to identify
  the volume for which the object is a referral. cm_FreeServerList()
  uses the volID to remove the volume from the cm_server_t.

* In cm_Analyze(), when VNOVOL or VMOVED is received, call
  cm_ForceUpdateVolume() to force a refresh of the volume location
  data.

* Added cm_VolumeStatusNotification(), which is used at the moment to
  log volume status changes to the trace log. It will also be used as
  the access point to the File System Filter driver notification
  engine.

* Added an all cm_scache_t list to cm_data. This replaces the use of
  the stat cache LRU queue when we need to enumerate all entries. The
  LRU list order is not static, so enumerating all entries through it
  can result in items being missed or processed more than once.

* Modified cm_Analyze(). Instead of resetting the busy or offline state
  of a volume and forcing a retry of the operation, cm_Analyze() will
  defer to the background daemon thread that will update the state once
  every 600 seconds.

* Added the automatic generation of a Freelance ".root" read-write
  mountpoint that refers to the root.afs volume of the workstation
  cellname at the time the mountpoint is created.
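
For readers of this message, the way a volume state can be derived from
its server references may be easier to follow as code. What follows is a
minimal sketch only, not code from this patch: the enumerator names come
from the text above, while struct server_ref, its server_down field
(standing in for CM_SERVERFLAG_DOWN) and summarize_volume() are
hypothetical. Returning vl_unknown when no valid referral remains, and
vl_offline for a busy/offline mix, are assumptions as well.

```c
#include <stddef.h>

/* Hypothetical stand-ins: enumerator names follow the commit message,
 * but the layout and this helper are assumptions, not patch code. */
enum srv_state { srv_not_busy, srv_busy, srv_offline, srv_deleted };
enum vol_state { vl_online, vl_busy, vl_offline, vl_alldown, vl_unknown };

struct server_ref {
    enum srv_state status;  /* per-volume state of this referral */
    int server_down;        /* stands in for CM_SERVERFLAG_DOWN */
};

/* Fold the states of all valid referrals into one volume state. */
static enum vol_state summarize_volume(const struct server_ref *refs, size_t n)
{
    int live = 0, allDown = 1, allBusy = 1;
    size_t i;

    for (i = 0; i < n; i++) {
        if (refs[i].status == srv_deleted)
            continue;               /* contents no longer valid: skip */
        live++;
        if (refs[i].server_down)
            continue;               /* server is not reachable */
        allDown = 0;
        if (refs[i].status == srv_busy)
            continue;               /* busy instance, keep looking */
        allBusy = 0;
        if (refs[i].status != srv_offline)
            return vl_online;       /* at least one usable instance */
    }
    if (live == 0)
        return vl_unknown;          /* assumption: nothing left to judge */
    if (allDown)
        return vl_alldown;
    if (allBusy)
        return vl_busy;
    return vl_offline;              /* all offline, or a busy/offline mix */
}
```

The point of the sketch is the first test in the loop: a referral in
the srv_deleted state contributes nothing to the result, which is how
stale VLDB data is kept out of the decision until the volume update
completes.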
---
 src/WINNT/afsd/afsd_init.c    |   2 +-
 src/WINNT/afsd/cm.h           |   5 +
 src/WINNT/afsd/cm_buf.c       |  18 +-
 src/WINNT/afsd/cm_callback.c  |  24 +-
 src/WINNT/afsd/cm_cell.c      |   4 +-
 src/WINNT/afsd/cm_conn.c      | 113 ++--
 src/WINNT/afsd/cm_conn.h      |   6 +-
 src/WINNT/afsd/cm_daemon.c    |  15 +
 src/WINNT/afsd/cm_dcache.c    |   6 +-
 src/WINNT/afsd/cm_freelance.c |  14 +-
 src/WINNT/afsd/cm_ioctl.c     |  43 +-
 src/WINNT/afsd/cm_memmap.c    |  35 +-
 src/WINNT/afsd/cm_memmap.h    |  15 +-
 src/WINNT/afsd/cm_scache.c    | 123 ++---
 src/WINNT/afsd/cm_scache.h    |   3 +-
 src/WINNT/afsd/cm_server.c    | 178 ++++++-
 src/WINNT/afsd/cm_server.h    |  17 +-
 src/WINNT/afsd/cm_vnodeops.c  |  50 +-
 src/WINNT/afsd/cm_volume.c    | 944 ++++++++++++++++++++++++++++++----
 src/WINNT/afsd/cm_volume.h    |  87 +++-
 src/WINNT/afsd/smb_ioctl.c    |   2 +-
 src/WINNT/afsd/smb_ioctl.h    |   2 +
 22 files changed, 1374 insertions(+), 332 deletions(-)

diff --git a/src/WINNT/afsd/afsd_init.c b/src/WINNT/afsd/afsd_init.c
index 3110a8614..dc9389193 100644
--- a/src/WINNT/afsd/afsd_init.c
+++ b/src/WINNT/afsd/afsd_init.c
@@ -1241,7 +1241,7 @@ int afsd_InitDaemons(char **reasonP)
         osi_Log0(afsd_logp, "Loading Root Volume from cell");
         do {
             code = cm_GetVolumeByName(cm_data.rootCellp, cm_rootVolumeName, cm_rootUserp,
-                                      &req, CM_FLAG_CREATE, &cm_data.rootVolumep);
+                                      &req, CM_GETVOL_FLAG_CREATE, &cm_data.rootVolumep);
             afsi_log("cm_GetVolumeByName code %x root vol %x", code,
                      (code ? 
(cm_volume_t *)-1 : cm_data.rootVolumep)); } while (code && --attempts); diff --git a/src/WINNT/afsd/cm.h b/src/WINNT/afsd/cm.h index 20cf55416..c193d5f46 100644 --- a/src/WINNT/afsd/cm.h +++ b/src/WINNT/afsd/cm.h @@ -296,4 +296,9 @@ int RXAFS_Lookup (struct rx_connection *, #define CM_ERROR_TOOFEWBUFS (CM_ERROR_BASE+50) #define CM_ERROR_TOOMANYBUFS (CM_ERROR_BASE+51) #define CM_ERROR_BAD_LEVEL (CM_ERROR_BASE+52) + +/* Used by cm_FollowMountPoint and cm_GetVolumeByName */ +#define RWVOL 0 +#define ROVOL 1 +#define BACKVOL 2 #endif /* __CM_H_ENV__ */ diff --git a/src/WINNT/afsd/cm_buf.c b/src/WINNT/afsd/cm_buf.c index 11439f229..0931f0fc0 100644 --- a/src/WINNT/afsd/cm_buf.c +++ b/src/WINNT/afsd/cm_buf.c @@ -310,7 +310,7 @@ long buf_Init(int newFile, cm_buf_ops_t *opsp, afs_uint64 nbuffers) cm_data.buf_hashSize = osi_PrimeLessThan((afs_uint32)(cm_data.buf_nbuffers/7 + 1)); /* create hash table */ - memset((void *)cm_data.buf_hashTablepp, 0, cm_data.buf_hashSize * sizeof(cm_buf_t *)); + memset((void *)cm_data.buf_scacheHashTablepp, 0, cm_data.buf_hashSize * sizeof(cm_buf_t *)); /* another hash table */ memset((void *)cm_data.buf_fileHashTablepp, 0, cm_data.buf_hashSize * sizeof(cm_buf_t *)); @@ -506,7 +506,7 @@ cm_buf_t *buf_FindLocked(struct cm_scache *scp, osi_hyper_t *offsetp) cm_buf_t *bp; i = BUF_HASH(&scp->fid, offsetp); - for(bp = cm_data.buf_hashTablepp[i]; bp; bp=bp->hashp) { + for(bp = cm_data.buf_scacheHashTablepp[i]; bp; bp=bp->hashp) { if (cm_FidCmp(&scp->fid, &bp->fid) == 0 && offsetp->LowPart == bp->offset.LowPart && offsetp->HighPart == bp->offset.HighPart) { @@ -645,7 +645,7 @@ void buf_Recycle(cm_buf_t *bp) /* Remove from hash */ i = BUF_HASH(&bp->fid, &bp->offset); - lbpp = &(cm_data.buf_hashTablepp[i]); + lbpp = &(cm_data.buf_scacheHashTablepp[i]); for(tbp = *lbpp; tbp; lbpp = &tbp->hashp, tbp = *lbpp) { if (tbp == bp) break; } @@ -818,8 +818,8 @@ long buf_GetNewLocked(struct cm_scache *scp, osi_hyper_t *offsetp, cm_buf_t **bu #endif bp->offset = 
*offsetp; i = BUF_HASH(&scp->fid, offsetp); - bp->hashp = cm_data.buf_hashTablepp[i]; - cm_data.buf_hashTablepp[i] = bp; + bp->hashp = cm_data.buf_scacheHashTablepp[i]; + cm_data.buf_scacheHashTablepp[i] = bp; i = BUF_FILEHASH(&scp->fid); nextBp = cm_data.buf_fileHashTablepp[i]; bp->fileHashp = nextBp; @@ -1190,7 +1190,7 @@ long buf_CleanAndReset(void) lock_ObtainWrite(&buf_globalLock); for(i=0; i < cm_data.buf_hashSize; i++) { - for(bp = cm_data.buf_hashTablepp[i]; bp; bp = bp->hashp) { + for(bp = cm_data.buf_scacheHashTablepp[i]; bp; bp = bp->hashp) { if ((bp->flags & CM_BUF_DIRTY) == CM_BUF_DIRTY) { buf_HoldLocked(bp); lock_ReleaseWrite(&buf_globalLock); @@ -1550,7 +1550,7 @@ buf_ValidateBufQueues(void) } #endif /* TESTING */ -/* dump the contents of the buf_hashTablepp. */ +/* dump the contents of the buf_scacheHashTablepp. */ int cm_DumpBufHashTable(FILE *outputFile, char *cookie, int lock) { int zilch; @@ -1558,7 +1558,7 @@ int cm_DumpBufHashTable(FILE *outputFile, char *cookie, int lock) char output[1024]; afs_uint32 i; - if (cm_data.buf_hashTablepp == NULL) + if (cm_data.buf_scacheHashTablepp == NULL) return -1; if (lock) @@ -1570,7 +1570,7 @@ int cm_DumpBufHashTable(FILE *outputFile, char *cookie, int lock) for (i = 0; i < cm_data.buf_hashSize; i++) { - for (bp = cm_data.buf_hashTablepp[i]; bp; bp=bp->hashp) + for (bp = cm_data.buf_scacheHashTablepp[i]; bp; bp=bp->hashp) { StringCbPrintfA(output, sizeof(output), "%s bp=0x%08X, hash=%d, fid (cell=%d, volume=%d, " diff --git a/src/WINNT/afsd/cm_callback.c b/src/WINNT/afsd/cm_callback.c index 9d6c0a1d1..be93ebe25 100644 --- a/src/WINNT/afsd/cm_callback.c +++ b/src/WINNT/afsd/cm_callback.c @@ -191,7 +191,7 @@ void cm_RevokeCallback(struct rx_call *callp, AFSFid *fidp) /* do all in the hash bucket, since we don't know how many we'll find with * varying cells. 
*/ - for (scp = cm_data.hashTablep[hash]; scp; scp=scp->nextp) { + for (scp = cm_data.scacheHashTablep[hash]; scp; scp=scp->nextp) { if (scp->fid.volume == tfid.volume && scp->fid.vnode == tfid.vnode && scp->fid.unique == tfid.unique && @@ -240,8 +240,8 @@ void cm_RevokeVolumeCallback(struct rx_call *callp, AFSFid *fidp) lock_ObtainWrite(&cm_scacheLock); - for (hash = 0; hash < cm_data.hashTableSize; hash++) { - for(scp=cm_data.hashTablep[hash]; scp; scp=scp->nextp) { + for (hash = 0; hash < cm_data.scacheHashTableSize; hash++) { + for(scp=cm_data.scacheHashTablep[hash]; scp; scp=scp->nextp) { if (scp->fid.volume == fidp->Volume && scp->cbExpires > 0 && scp->cbServerp != NULL) { @@ -446,8 +446,8 @@ SRXAFSCB_InitCallBackState(struct rx_call *callp) * are "rare," hopefully this won't be a problem. */ lock_ObtainWrite(&cm_scacheLock); - for (hash = 0; hash < cm_data.hashTableSize; hash++) { - for (scp=cm_data.hashTablep[hash]; scp; scp=scp->nextp) { + for (hash = 0; hash < cm_data.scacheHashTableSize; hash++) { + for (scp=cm_data.scacheHashTablep[hash]; scp; scp=scp->nextp) { cm_HoldSCacheNoLock(scp); lock_ReleaseWrite(&cm_scacheLock); lock_ObtainMutex(&scp->mx); @@ -689,8 +689,8 @@ SRXAFSCB_GetCE(struct rx_call *callp, long index, AFSDBCacheEntry *cep) ntohl(host), ntohs(port)); lock_ObtainRead(&cm_scacheLock); - for (i = 0; i < cm_data.hashTableSize; i++) { - for (scp = cm_data.hashTablep[i]; scp; scp = scp->nextp) { + for (i = 0; i < cm_data.scacheHashTableSize; i++) { + for (scp = cm_data.scacheHashTablep[i]; scp; scp = scp->nextp) { if (index == 0) goto searchDone; index--; @@ -795,8 +795,8 @@ SRXAFSCB_GetCE64(struct rx_call *callp, long index, AFSDBCacheEntry64 *cep) ntohl(host), ntohs(port)); lock_ObtainRead(&cm_scacheLock); - for (i = 0; i < cm_data.hashTableSize; i++) { - for (scp = cm_data.hashTablep[i]; scp; scp = scp->nextp) { + for (i = 0; i < cm_data.scacheHashTableSize; i++) { + for (scp = cm_data.scacheHashTablep[i]; scp; scp = scp->nextp) { if (index 
== 0) goto searchDone; index--; @@ -1693,7 +1693,7 @@ long cm_GetCallback(cm_scache_t *scp, struct cm_user *userp, osi_Log4(afsd_logp, "CALL FetchStatus scp 0x%p vol %u vn %u uniq %u", scp, sfid.volume, sfid.vnode, sfid.unique); do { - code = cm_Conn(&sfid, userp, reqp, &connp); + code = cm_ConnFromFID(&sfid, userp, reqp, &connp); if (code) continue; @@ -1751,8 +1751,8 @@ void cm_CheckCBExpiration(void) now = osi_Time(); lock_ObtainWrite(&cm_scacheLock); - for (i=0; i < cm_data.hashTableSize; i++) { - for (scp = cm_data.hashTablep[i]; scp; scp=scp->nextp) { + for (i=0; i < cm_data.scacheHashTableSize; i++) { + for (scp = cm_data.scacheHashTablep[i]; scp; scp=scp->nextp) { cm_HoldSCacheNoLock(scp); if (scp->cbExpires > 0 && (scp->cbServerp == NULL || now > scp->cbExpires)) { lock_ReleaseWrite(&cm_scacheLock); diff --git a/src/WINNT/afsd/cm_cell.c b/src/WINNT/afsd/cm_cell.c index 8f76acc31..18742ddf5 100644 --- a/src/WINNT/afsd/cm_cell.c +++ b/src/WINNT/afsd/cm_cell.c @@ -44,7 +44,7 @@ long cm_AddCellProc(void *rockp, struct sockaddr_in *addrp, char *namep) tsp = cm_NewServer(addrp, CM_SERVER_VLDB, cellp); /* Insert the vlserver into a sorted list, sorted by server rank */ - tsrp = cm_NewServerRef(tsp); + tsrp = cm_NewServerRef(tsp, 0); cm_InsertServerList(&cellp->vlServersp, tsrp); /* drop the allocation reference */ lock_ObtainWrite(&cm_serverLock); @@ -77,7 +77,7 @@ cm_cell_t *cm_UpdateCell(cm_cell_t * cp) ) { /* must empty cp->vlServersp */ if (cp->vlServersp) { - cm_FreeServerList(&cp->vlServersp); + cm_FreeServerList(&cp->vlServersp, CM_FREESERVERLIST_DELETE); cp->vlServersp = NULL; } diff --git a/src/WINNT/afsd/cm_conn.c b/src/WINNT/afsd/cm_conn.c index 986e672b2..28fcb7c9f 100644 --- a/src/WINNT/afsd/cm_conn.c +++ b/src/WINNT/afsd/cm_conn.c @@ -117,10 +117,12 @@ static long cm_GetServerList(struct cm_fid *fidp, struct cm_user *userp, } cellp = cm_FindCellByID(fidp->cell); - if (!cellp) return CM_ERROR_NOSUCHCELL; + if (!cellp) + return CM_ERROR_NOSUCHCELL; - code = cm_GetVolumeByID(cellp, fidp->volume, userp, reqp, &volp); - if (code) return code; + code = cm_GetVolumeByID(cellp, fidp->volume, userp, reqp, CM_GETVOL_FLAG_CREATE, 
&volp); + if (code) + return code; *serversppp = cm_GetVolServers(volp, fidp->volume); @@ -133,12 +135,12 @@ static long cm_GetServerList(struct cm_fid *fidp, struct cm_user *userp, * and if we're going to retry, determine whether failover is appropriate, * and whether timed backoff is appropriate. * - * If the error code is from cm_Conn() or friends, it will be a CM_ERROR code. + * If the error code is from cm_ConnFromFID() or friends, it will be a CM_ERROR code. * Otherwise it will be an RPC code. This may be a UNIX code (e.g. EDQUOT), or * it may be an RX code, or it may be a special code (e.g. VNOVOL), or it may * be a security code (e.g. RXKADEXPIRED). * - * If the error code is from cm_Conn() or friends, connp will be NULL. + * If the error code is from cm_ConnFromFID() or friends, connp will be NULL. * * For VLDB calls, fidp will be NULL. * @@ -242,10 +244,14 @@ cm_Analyze(cm_conn_t *connp, cm_user_t *userp, cm_req_t *reqp, } else if (errorCode == CM_ERROR_ALLOFFLINE) { + osi_Log0(afsd_logp, "cm_Analyze passed CM_ERROR_ALLOFFLINE."); + /* Volume instances marked offline will be restored by the + * background daemon thread as they become available + */ +#if 0 if (timeLeft > 7) { - osi_Log0(afsd_logp, "cm_Analyze passed CM_ERROR_ALLOFFLINE."); thrd_Sleep(5000); - + if (fidp) { /* Not a VLDB call */ if (!serversp) { code = cm_GetServerList(fidp, userp, reqp, &serverspp); @@ -256,11 +262,13 @@ cm_Analyze(cm_conn_t *connp, cm_user_t *userp, cm_req_t *reqp, } if (serversp) { lock_ObtainWrite(&cm_serverLock); - for (tsrp = serversp; tsrp; tsrp=tsrp->next) - tsrp->status = not_busy; + for (tsrp = serversp; tsrp; tsrp=tsrp->next) { + /* REDIRECT */ + tsrp->status = srv_not_busy; + } lock_ReleaseWrite(&cm_serverLock); if (free_svr_list) { - cm_FreeServerList(&serversp); + cm_FreeServerList(&serversp, 0); *serverspp = serversp; } retry = 1; @@ -270,21 +278,26 @@ cm_Analyze(cm_conn_t *connp, cm_user_t *userp, cm_req_t *reqp, } else { /* VLDB call */ if (serversp) { 
lock_ObtainWrite(&cm_serverLock); - for (tsrp = serversp; tsrp; tsrp=tsrp->next) - tsrp->status = not_busy; + for (tsrp = serversp; tsrp; tsrp=tsrp->next) { + /* REDIRECT */ + tsrp->status = srv_not_busy; + } lock_ReleaseWrite(&cm_serverLock); if (free_svr_list) { - cm_FreeServerList(&serversp); + cm_FreeServerList(&serversp, 0); *serverspp = serversp; } } } } +#endif } - - /* if all servers are busy, mark them non-busy and start over */ else if (errorCode == CM_ERROR_ALLBUSY) { + /* Volume instances marked busy will be restored by the + * background daemon thread as they become available. + */ osi_Log0(afsd_logp, "cm_Analyze passed CM_ERROR_ALLBUSY."); +#if 0 if (timeLeft > 7) { thrd_Sleep(5000); if (!serversp) { @@ -296,16 +309,19 @@ cm_Analyze(cm_conn_t *connp, cm_user_t *userp, cm_req_t *reqp, } lock_ObtainWrite(&cm_serverLock); for (tsrp = serversp; tsrp; tsrp=tsrp->next) { - if (tsrp->status == busy) - tsrp->status = not_busy; + if (tsrp->status == srv_busy) { + /* REDIRECT */ + tsrp->status = srv_not_busy; + } } lock_ReleaseWrite(&cm_serverLock); if (free_svr_list) { - cm_FreeServerList(&serversp); + cm_FreeServerList(&serversp, 0); *serverspp = serversp; } retry = 1; } +#endif } /* special codes: VBUSY and VRESTARTING */ @@ -319,15 +335,15 @@ cm_Analyze(cm_conn_t *connp, cm_user_t *userp, cm_req_t *reqp, } lock_ObtainWrite(&cm_serverLock); for (tsrp = serversp; tsrp; tsrp=tsrp->next) { - if (tsrp->server == serverp - && tsrp->status == not_busy) { - tsrp->status = busy; + if (tsrp->server == serverp && tsrp->status == srv_not_busy) { + /* REDIRECT */ + tsrp->status = srv_busy; break; } } lock_ReleaseWrite(&cm_serverLock); if (free_svr_list) { - cm_FreeServerList(&serversp); + cm_FreeServerList(&serversp, 0); *serverspp = serversp; } retry = 1; @@ -386,11 +402,19 @@ cm_Analyze(cm_conn_t *connp, cm_user_t *userp, cm_req_t *reqp, } } for (tsrp = serversp; tsrp; tsrp=tsrp->next) { - if (tsrp->server == serverp) - tsrp->status = offline; + if (tsrp->server == 
serverp) { + /* REDIRECT */ + if (errorCode == VNOVOL || errorCode == VMOVED) { + tsrp->status = srv_deleted; + if (fidp) { + cm_ForceUpdateVolume(fidp, userp, reqp); + } + } else + tsrp->status = srv_offline; + } } if (free_svr_list) { - cm_FreeServerList(&serversp); + cm_FreeServerList(&serversp, 0); *serverspp = serversp; } if ( timeLeft > 2 ) @@ -629,10 +653,12 @@ long cm_ConnByMServers(cm_serverRef_t *serversp, cm_user_t *usersp, lock_ReleaseWrite(&cm_serverLock); if (!(tsp->flags & CM_SERVERFLAG_DOWN)) { allDown = 0; - if (tsrp->status == busy) { + if (tsrp->status == srv_deleted) { + /* skip this entry. no longer valid. */; + } else if (tsrp->status == srv_busy) { allOffline = 0; someBusy = 1; - } else if (tsrp->status == offline) { + } else if (tsrp->status == srv_offline) { allBusy = 0; someOffline = 1; } else { @@ -833,10 +859,10 @@ long cm_ServerAvailable(struct cm_fid *fidp, struct cm_user *userp) cm_GetServerNoLock(tsp); if (!(tsp->flags & CM_SERVERFLAG_DOWN)) { allDown = 0; - if (tsrp->status == busy) { + if (tsrp->status == srv_busy) { allOffline = 0; someBusy = 1; - } else if (tsrp->status == offline) { + } else if (tsrp->status == srv_offline) { allBusy = 0; someOffline = 1; } else { @@ -847,7 +873,7 @@ long cm_ServerAvailable(struct cm_fid *fidp, struct cm_user *userp) cm_PutServerNoLock(tsp); } lock_ReleaseWrite(&cm_serverLock); - cm_FreeServerList(serverspp); + cm_FreeServerList(serverspp, 0); if (allDown) return 0; @@ -859,8 +885,12 @@ long cm_ServerAvailable(struct cm_fid *fidp, struct cm_user *userp) return 1; } -long cm_Conn(struct cm_fid *fidp, struct cm_user *userp, cm_req_t *reqp, - cm_conn_t **connpp) +/* + * The returned cm_conn_t ** object is released in the subsequent call + * to cm_Analyze(). 
+ */ +long cm_ConnFromFID(struct cm_fid *fidp, struct cm_user *userp, cm_req_t *reqp, + cm_conn_t **connpp) { long code; cm_serverRef_t **serverspp; @@ -872,11 +902,26 @@ long cm_Conn(struct cm_fid *fidp, struct cm_user *userp, cm_req_t *reqp, } code = cm_ConnByMServers(*serverspp, userp, reqp, connpp); - cm_FreeServerList(serverspp); + cm_FreeServerList(serverspp, 0); + return code; +} + + +long cm_ConnFromVolume(struct cm_volume *volp, unsigned long volid, struct cm_user *userp, cm_req_t *reqp, + cm_conn_t **connpp) +{ + long code; + cm_serverRef_t **serverspp; + + serverspp = cm_GetVolServers(volp, volid); + + code = cm_ConnByMServers(*serverspp, userp, reqp, connpp); + cm_FreeServerList(serverspp, 0); return code; } -extern struct rx_connection * + +extern struct rx_connection * cm_GetRxConn(cm_conn_t *connp) { struct rx_connection * rxconn; diff --git a/src/WINNT/afsd/cm_conn.h b/src/WINNT/afsd/cm_conn.h index 5f27504a4..f127254ca 100644 --- a/src/WINNT/afsd/cm_conn.h +++ b/src/WINNT/afsd/cm_conn.h @@ -113,9 +113,13 @@ extern long cm_ConnByMServers(struct cm_serverRef *, struct cm_user *, extern long cm_ConnByServer(struct cm_server *, struct cm_user *, cm_conn_t **); -extern long cm_Conn(struct cm_fid *, struct cm_user *, struct cm_req *, +extern long cm_ConnFromFID(struct cm_fid *, struct cm_user *, struct cm_req *, cm_conn_t **); +extern long cm_ConnFromVolume(struct cm_volume *volp, unsigned long volid, + struct cm_user *userp, cm_req_t *reqp, + cm_conn_t **connpp); + extern void cm_PutConn(cm_conn_t *connp); extern void cm_GCConnections(cm_server_t *serverp); diff --git a/src/WINNT/afsd/cm_daemon.c b/src/WINNT/afsd/cm_daemon.c index d95f13888..70ac61fbe 100644 --- a/src/WINNT/afsd/cm_daemon.c +++ b/src/WINNT/afsd/cm_daemon.c @@ -32,6 +32,7 @@ long cm_daemonCheckVolInterval = 3600; long cm_daemonCheckCBInterval = 60; long cm_daemonCheckLockInterval = 60; long cm_daemonTokenCheckInterval = 180; +long cm_daemonCheckBusyVolInterval = 600; osi_rwlock_t 
cm_daemonLock; @@ -273,6 +274,12 @@ cm_DaemonCheckInit(void) if (code == ERROR_SUCCESS) cm_daemonTokenCheckInterval = dummy; + dummyLen = sizeof(DWORD); + code = RegQueryValueEx(parmKey, "BusyVolumeCheckInterval", NULL, NULL, + (BYTE *) &dummy, &dummyLen); + if (code == ERROR_SUCCESS) + cm_daemonCheckBusyVolInterval = dummy; + RegCloseKey(parmKey); } @@ -286,6 +293,7 @@ void cm_Daemon(long parm) time_t lastDownServerCheck; time_t lastUpServerCheck; time_t lastTokenCacheCheck; + time_t lastBusyVolCheck; char thostName[200]; unsigned long code; struct hostent *thp; @@ -325,6 +333,7 @@ void cm_Daemon(long parm) lastDownServerCheck = now - cm_daemonCheckDownInterval/2 + (rand() % cm_daemonCheckDownInterval); lastUpServerCheck = now - cm_daemonCheckUpInterval/2 + (rand() % cm_daemonCheckUpInterval); lastTokenCacheCheck = now - cm_daemonTokenCheckInterval/2 + (rand() % cm_daemonTokenCheckInterval); + lastBusyVolCheck = now - cm_daemonCheckBusyVolInterval/2 + (rand() % cm_daemonCheckBusyVolInterval); while (daemon_ShutdownFlag == 0) { /* check to see if the listener threads halted due to network @@ -378,6 +387,12 @@ void cm_Daemon(long parm) now = osi_Time(); } + if (now > lastBusyVolCheck + cm_daemonCheckBusyVolInterval) { + lastBusyVolCheck = now; + cm_CheckBusyVolumes(); + now = osi_Time(); + } + if (now > lastCBExpirationCheck + cm_daemonCheckCBInterval) { lastCBExpirationCheck = now; cm_CheckCBExpiration(); diff --git a/src/WINNT/afsd/cm_dcache.c b/src/WINNT/afsd/cm_dcache.c index 9feaf1b1a..a8ee555fc 100644 --- a/src/WINNT/afsd/cm_dcache.c +++ b/src/WINNT/afsd/cm_dcache.c @@ -147,7 +147,7 @@ long cm_BufWrite(void *vscp, osi_hyper_t *offsetp, long length, long flags, /* now we're ready to do the store operation */ do { - code = cm_Conn(&scp->fid, userp, reqp, &connp); + code = cm_ConnFromFID(&scp->fid, userp, reqp, &connp); if (code) continue; @@ -354,7 +354,7 @@ long cm_StoreMini(cm_scache_t *scp, cm_user_t *userp, cm_req_t *reqp) /* now we're ready to do the store 
operation */ do { - code = cm_Conn(&scp->fid, userp, reqp, &connp); + code = cm_ConnFromFID(&scp->fid, userp, reqp, &connp); if (code) continue; @@ -1427,7 +1427,7 @@ long cm_GetBuffer(cm_scache_t *scp, cm_buf_t *bufp, int *cpffp, cm_user_t *up, /* now make the call */ do { - code = cm_Conn(&scp->fid, up, reqp, &connp); + code = cm_ConnFromFID(&scp->fid, up, reqp, &connp); if (code) continue; diff --git a/src/WINNT/afsd/cm_freelance.c b/src/WINNT/afsd/cm_freelance.c index 2ad9c8170..caa74248d 100644 --- a/src/WINNT/afsd/cm_freelance.c +++ b/src/WINNT/afsd/cm_freelance.c @@ -145,7 +145,7 @@ void cm_InitFreelance() { cm_data.fakeDirVersion = cm_data.rootSCachep->dataVersion; // yj: first we make a call to cm_initLocalMountPoints - // to read all the local mount points from an ini file + // to read all the local mount points from the registry cm_InitLocalMountPoints(); // then we make a call to InitFakeRootDir to create @@ -344,7 +344,7 @@ void cm_InitFakeRootDir() { int cm_FakeRootFid(cm_fid_t *fidp) { fidp->cell = AFS_FAKE_ROOT_CELL_ID; /* root cell */ - fidp->volume = AFS_FAKE_ROOT_VOL_ID; /* root.afs ? */ + fidp->volume = AFS_FAKE_ROOT_VOL_ID; /* root.afs ? 
*/ fidp->vnode = 0x1; fidp->unique = 0x1; return 0; @@ -390,7 +390,7 @@ int cm_reInitLocalMountPoints() { lock_ObtainMutex(&cm_Freelance_Lock); /* always scache then freelance lock */ for (i=0; inextp) { + for (scp=cm_data.scacheHashTablep[hash]; scp; scp=scp->nextp) { if (scp->fid.volume == aFid.volume && scp->fid.vnode == aFid.vnode && scp->fid.unique == aFid.unique @@ -407,7 +407,7 @@ int cm_reInitLocalMountPoints() { cm_ReleaseSCacheNoLock(scp); // take the scp out of the hash - for (lscpp = &cm_data.hashTablep[hash], tscp = cm_data.hashTablep[hash]; + for (lscpp = &cm_data.scacheHashTablep[hash], tscp = cm_data.scacheHashTablep[hash]; tscp; lscpp = &tscp->nextp, tscp = tscp->nextp) { if (tscp == scp) { @@ -453,7 +453,7 @@ int cm_reInitLocalMountPoints() { } -// yj: open up the ini file and read all the local mount +// yj: open up the registry and read all the local mount // points that are stored there. Part of the initialization // process for the freelance client. /* to be called while holding freelance lock unless during init. 
*/ @@ -501,7 +501,8 @@ long cm_InitLocalMountPoints() { if (code == 0) { cm_FreelanceAddMount(&rootCellName[1], &rootCellName[1], "root.cell.", 0, NULL); cm_FreelanceAddMount(rootCellName, &rootCellName[1], "root.cell.", 1, NULL); - dwMountPoints = 2; + cm_FreelanceAddMount(".root", &rootCellName[1], "root.afs.", 1, NULL); + dwMountPoints = 3; } } @@ -697,6 +698,7 @@ long cm_InitLocalMountPoints() { if (code == 0) { cm_FreelanceAddMount(&rootCellName[1], &rootCellName[1], "root.cell.", 0, NULL); cm_FreelanceAddMount(rootCellName, &rootCellName[1], "root.cell.", 1, NULL); + cm_FreelanceAddMount(".root", &rootCellName[1], "root.afs.", 1, NULL); } return 0; } diff --git a/src/WINNT/afsd/cm_ioctl.c b/src/WINNT/afsd/cm_ioctl.c index 4d314e4ab..afa0b6bef 100644 --- a/src/WINNT/afsd/cm_ioctl.c +++ b/src/WINNT/afsd/cm_ioctl.c @@ -128,8 +128,8 @@ long cm_FlushVolume(cm_user_t *userp, cm_req_t *reqp, afs_uint32 cell, afs_uint3 #endif lock_ObtainWrite(&cm_scacheLock); - for (i=0; i < cm_data.hashTableSize; i++) { - for (scp = cm_data.hashTablep[i]; scp; scp = scp->nextp) { + for (i=0; i < cm_data.scacheHashTableSize; i++) { + for (scp = cm_data.scacheHashTablep[i]; scp; scp = scp->nextp) { if (scp->fid.volume == volume && scp->fid.cell == cell) { cm_HoldSCacheNoLock(scp); lock_ReleaseWrite(&cm_scacheLock); @@ -156,8 +156,8 @@ void cm_ResetACLCache(cm_user_t *userp) int hash; lock_ObtainWrite(&cm_scacheLock); - for (hash=0; hash < cm_data.hashTableSize; hash++) { - for (scp=cm_data.hashTablep[hash]; scp; scp=scp->nextp) { + for (hash=0; hash < cm_data.scacheHashTableSize; hash++) { + for (scp=cm_data.scacheHashTablep[hash]; scp; scp=scp->nextp) { cm_HoldSCacheNoLock(scp); lock_ReleaseWrite(&cm_scacheLock); lock_ObtainMutex(&scp->mx); @@ -525,7 +525,7 @@ long cm_IoctlGetACL(smb_ioctl_t *ioctlp, cm_user_t *userp) do { acl.AFSOpaque_val = ioctlp->outDatap; acl.AFSOpaque_len = 0; - code = cm_Conn(&scp->fid, userp, &req, &connp); + code = cm_ConnFromFID(&scp->fid, userp, &req, &connp); if (code) continue; callp = cm_GetRxConn(connp); @@ -607,7 +607,7 @@ long cm_IoctlSetACL(struct smb_ioctl *ioctlp, struct cm_user *userp) do { acl.AFSOpaque_val = 
ioctlp->inDatap; acl.AFSOpaque_len = (u_int)strlen(ioctlp->inDatap)+1; - code = cm_Conn(&scp->fid, userp, &req, &connp); + code = cm_ConnFromFID(&scp->fid, userp, &req, &connp); if (code) continue; callp = cm_GetRxConn(connp); @@ -639,8 +639,8 @@ long cm_IoctlFlushAllVolumes(struct smb_ioctl *ioctlp, struct cm_user *userp) cm_InitReq(&req); lock_ObtainWrite(&cm_scacheLock); - for (i=0; i < cm_data.hashTableSize; i++) { - for (scp = cm_data.hashTablep[i]; scp; scp = scp->nextp) { + for (i=0; i < cm_data.scacheHashTableSize; i++) { + for (scp = cm_data.scacheHashTablep[i]; scp; scp = scp->nextp) { cm_HoldSCacheNoLock(scp); lock_ReleaseWrite(&cm_scacheLock); @@ -723,7 +723,8 @@ long cm_IoctlSetVolumeStatus(struct smb_ioctl *ioctlp, struct cm_user *userp) return CM_ERROR_READONLY; } - code = cm_GetVolumeByID(cellp, scp->fid.volume, userp, &req, &tvp); + code = cm_GetVolumeByID(cellp, scp->fid.volume, userp, &req, + CM_GETVOL_FLAG_CREATE, &tvp); if (code) { cm_ReleaseSCache(scp); return code; @@ -750,7 +751,7 @@ long cm_IoctlSetVolumeStatus(struct smb_ioctl *ioctlp, struct cm_user *userp) } do { - code = cm_Conn(&scp->fid, userp, &req, &tcp); + code = cm_ConnFromFID(&scp->fid, userp, &req, &tcp); if (code) continue; callp = cm_GetRxConn(tcp); @@ -793,7 +794,7 @@ long cm_IoctlGetVolumeStatus(struct smb_ioctl *ioctlp, struct cm_user *userp) cm_scache_t *scp; char offLineMsg[256]; char motd[256]; - cm_conn_t *tcp; + cm_conn_t *connp; register long code; AFSFetchVolumeStatus volStat; register char *cp; @@ -812,15 +813,15 @@ long cm_IoctlGetVolumeStatus(struct smb_ioctl *ioctlp, struct cm_user *userp) OfflineMsg = offLineMsg; MOTD = motd; do { - code = cm_Conn(&scp->fid, userp, &req, &tcp); + code = cm_ConnFromFID(&scp->fid, userp, &req, &connp); if (code) continue; - callp = cm_GetRxConn(tcp); + callp = cm_GetRxConn(connp); code = RXAFS_GetVolumeStatus(callp, scp->fid.volume, &volStat, &Name, &OfflineMsg, &MOTD); rx_PutConnection(callp); - } while (cm_Analyze(tcp, userp, &req, &scp->fid, NULL, NULL, NULL, code)); + } while (cm_Analyze(connp, userp, &req, &scp->fid, NULL, NULL, NULL, code)); code = cm_MapRPCError(code, &req); cm_ReleaseSCache(scp); @@ 
-927,7 +928,7 @@ long cm_IoctlWhereIs(struct smb_ioctl *ioctlp, struct cm_user *userp) if (!cellp) return CM_ERROR_NOSUCHCELL; - code = cm_GetVolumeByID(cellp, volume, userp, &req, &tvp); + code = cm_GetVolumeByID(cellp, volume, userp, &req, CM_GETVOL_FLAG_CREATE, &tvp); if (code) return code; cp = ioctlp->outDatap; @@ -941,7 +942,7 @@ long cm_IoctlWhereIs(struct smb_ioctl *ioctlp, struct cm_user *userp) cp += sizeof(long); } lock_ReleaseRead(&cm_serverLock); - cm_FreeServerList(tsrpp); + cm_FreeServerList(tsrpp, 0); lock_ReleaseMutex(&tvp->mx); /* still room for terminating NULL, add it on */ @@ -1311,7 +1312,7 @@ long cm_IoctlNewCell(struct smb_ioctl *ioctlp, struct cm_user *userp) long code; lock_ObtainMutex(&cp->mx); /* delete all previous server lists - cm_FreeServerList will ask for write on cm_ServerLock*/ - cm_FreeServerList(&cp->vlServersp); + cm_FreeServerList(&cp->vlServersp, CM_FREESERVERLIST_DELETE); cp->vlServersp = NULL; code = cm_SearchCellFile(cp->name, cp->name, cm_AddCellProc, cp); #ifdef AFS_AFSDB_ENV @@ -2740,10 +2741,10 @@ cm_CheckServersStatus(cm_serverRef_t *serversp) lock_ReleaseRead(&cm_serverLock); if (!(tsp->flags & CM_SERVERFLAG_DOWN)) { allDown = 0; - if (tsrp->status == busy) { + if (tsrp->status == srv_busy) { allOffline = 0; someBusy = 1; - } else if (tsrp->status == offline) { + } else if (tsrp->status == srv_offline) { allBusy = 0; someOffline = 1; } else { @@ -2797,14 +2798,14 @@ long cm_IoctlPathAvailability(struct smb_ioctl *ioctlp, struct cm_user *userp) if (!cellp) return CM_ERROR_NOSUCHCELL; - code = cm_GetVolumeByID(cellp, volume, userp, &req, &tvp); + code = cm_GetVolumeByID(cellp, volume, userp, &req, CM_GETVOL_FLAG_CREATE, &tvp); if (code) return code; lock_ObtainMutex(&tvp->mx); tsrpp = cm_GetVolServers(tvp, volume); code = cm_CheckServersStatus(*tsrpp); - cm_FreeServerList(tsrpp); + cm_FreeServerList(tsrpp, 0); lock_ReleaseMutex(&tvp->mx); cm_PutVolume(tvp); return 0; diff --git a/src/WINNT/afsd/cm_memmap.c 
b/src/WINNT/afsd/cm_memmap.c index 606901dda..5a83ee21d 100644 --- a/src/WINNT/afsd/cm_memmap.c +++ b/src/WINNT/afsd/cm_memmap.c @@ -38,6 +38,14 @@ ComputeSizeOfVolumes(DWORD maxvols) return size; } +afs_uint64 +ComputeSizeOfVolumeHT(DWORD maxvols) +{ + afs_uint64 size; + size = osi_PrimeLessThan((afs_uint32)(maxvols/7 + 1)) * sizeof(cm_volume_t *); + return size; +} + afs_uint64 ComputeSizeOfCells(DWORD maxcells) { @@ -66,7 +74,7 @@ afs_uint64 ComputeSizeOfSCacheHT(DWORD stats) { afs_uint64 size; - size = (stats + 10) / 2 * sizeof(cm_scache_t *);; + size = osi_PrimeLessThan(stats / 2 + 1) * sizeof(cm_scache_t *);; return size; } @@ -109,6 +117,7 @@ ComputeSizeOfMappingFile(DWORD stats, DWORD maxVols, DWORD maxCells, DWORD chunk size = ComputeSizeOfConfigData() + ComputeSizeOfVolumes(maxVols) + + 4 * ComputeSizeOfVolumeHT(maxVols) + ComputeSizeOfCells(maxCells) + ComputeSizeOfACLCache(stats) + ComputeSizeOfSCache(stats) @@ -221,11 +230,12 @@ cm_ShutdownMappedMemory(void) afsi_log(" blockSize = %d", cm_data.blockSize); afsi_log(" bufferSize = %d", cm_data.bufferSize); afsi_log(" cacheType = %d", cm_data.cacheType); + afsi_log(" volumeHashTableSize = %d", cm_data.volumeHashTableSize); afsi_log(" currentVolumes = %d", cm_data.currentVolumes); afsi_log(" maxVolumes = %d", cm_data.maxVolumes); afsi_log(" currentCells = %d", cm_data.currentCells); afsi_log(" maxCells = %d", cm_data.maxCells); - afsi_log(" hashTableSize = %d", cm_data.hashTableSize); + afsi_log(" scacheHashTableSize = %d", cm_data.scacheHashTableSize); afsi_log(" currentSCaches = %d", cm_data.currentSCaches); afsi_log(" maxSCaches = %d", cm_data.maxSCaches); @@ -404,11 +414,12 @@ cm_ValidateMappedMemory(char * cachePath) fprintf(stderr," blockSize = %d\n", config_data_p->blockSize); fprintf(stderr," bufferSize = %d\n", config_data_p->bufferSize); fprintf(stderr," cacheType = %d\n", config_data_p->cacheType); + fprintf(stderr," volumeHashTableSize = %d", config_data_p->volumeHashTableSize); 
fprintf(stderr," currentVolumes = %d\n", config_data_p->currentVolumes); fprintf(stderr," maxVolumes = %d\n", config_data_p->maxVolumes); fprintf(stderr," currentCells = %d\n", config_data_p->currentCells); fprintf(stderr," maxCells = %d\n", config_data_p->maxCells); - fprintf(stderr," hashTableSize = %d\n", config_data_p->hashTableSize); + fprintf(stderr," scacheHashTableSize = %d\n", config_data_p->scacheHashTableSize); fprintf(stderr," currentSCaches = %d\n", config_data_p->currentSCaches); fprintf(stderr," maxSCaches = %d\n", config_data_p->maxSCaches); cm_data = *config_data_p; @@ -802,11 +813,12 @@ cm_InitMappedMemory(DWORD virtualCache, char * cachePath, DWORD stats, DWORD chu afsi_log(" blockSize = %d", config_data_p->blockSize); afsi_log(" bufferSize = %d", config_data_p->bufferSize); afsi_log(" cacheType = %d", config_data_p->cacheType); + afsi_log(" volumeHashTableSize = %d", config_data_p->volumeHashTableSize); afsi_log(" currentVolumes = %d", config_data_p->currentVolumes); afsi_log(" maxVolumes = %d", config_data_p->maxVolumes); afsi_log(" currentCells = %d", config_data_p->currentCells); afsi_log(" maxCells = %d", config_data_p->maxCells); - afsi_log(" hashTableSize = %d", config_data_p->hashTableSize); + afsi_log(" scacheHashTableSize = %d", config_data_p->scacheHashTableSize); afsi_log(" currentSCaches = %d", config_data_p->currentSCaches); afsi_log(" maxSCaches = %d", config_data_p->maxSCaches); @@ -827,7 +839,8 @@ cm_InitMappedMemory(DWORD virtualCache, char * cachePath, DWORD stats, DWORD chu cm_data.chunkSize = chunkSize; cm_data.blockSize = CM_CONFIGDEFAULT_BLOCKSIZE; cm_data.bufferSize = mappingSize; - cm_data.hashTableSize = osi_PrimeLessThan(stats / 2 + 1); + cm_data.scacheHashTableSize = osi_PrimeLessThan(stats / 2 + 1); + cm_data.volumeHashTableSize = osi_PrimeLessThan((afs_uint32)(maxVols/7 + 1)); if (virtualCache) { cm_data.cacheType = CM_BUF_CACHETYPE_VIRTUAL; } else { @@ -844,17 +857,25 @@ cm_InitMappedMemory(DWORD virtualCache, char 
* cachePath, DWORD stats, DWORD chu baseAddress += ComputeSizeOfConfigData(); cm_data.volumeBaseAddress = (cm_volume_t *) baseAddress; baseAddress += ComputeSizeOfVolumes(maxVols); + cm_data.volumeNameHashTablep = (cm_volume_t **)baseAddress; + baseAddress += ComputeSizeOfVolumeHT(maxVols); + cm_data.volumeRWIDHashTablep = (cm_volume_t **)baseAddress; + baseAddress += ComputeSizeOfVolumeHT(maxVols); + cm_data.volumeROIDHashTablep = (cm_volume_t **)baseAddress; + baseAddress += ComputeSizeOfVolumeHT(maxVols); + cm_data.volumeBKIDHashTablep = (cm_volume_t **)baseAddress; + baseAddress += ComputeSizeOfVolumeHT(maxVols); cm_data.cellBaseAddress = (cm_cell_t *) baseAddress; baseAddress += ComputeSizeOfCells(maxCells); cm_data.aclBaseAddress = (cm_aclent_t *) baseAddress; baseAddress += ComputeSizeOfACLCache(stats); cm_data.scacheBaseAddress = (cm_scache_t *) baseAddress; baseAddress += ComputeSizeOfSCache(stats); - cm_data.hashTablep = (cm_scache_t **) baseAddress; + cm_data.scacheHashTablep = (cm_scache_t **) baseAddress; baseAddress += ComputeSizeOfSCacheHT(stats); cm_data.dnlcBaseAddress = (cm_nc_t *) baseAddress; baseAddress += ComputeSizeOfDNLCache(); - cm_data.buf_hashTablepp = (cm_buf_t **) baseAddress; + cm_data.buf_scacheHashTablepp = (cm_buf_t **) baseAddress; baseAddress += ComputeSizeOfDataHT(cacheBlocks); cm_data.buf_fileHashTablepp = (cm_buf_t **) baseAddress; baseAddress += ComputeSizeOfDataHT(cacheBlocks); diff --git a/src/WINNT/afsd/cm_memmap.h b/src/WINNT/afsd/cm_memmap.h index d5442fc08..5a6775a5a 100644 --- a/src/WINNT/afsd/cm_memmap.h +++ b/src/WINNT/afsd/cm_memmap.h @@ -51,14 +51,23 @@ typedef struct cm_config_data { cm_aclent_t * aclLRUp; cm_aclent_t * aclLRUEndp; - cm_scache_t ** hashTablep; - afs_uint32 hashTableSize; + cm_scache_t ** scacheHashTablep; + afs_uint32 scacheHashTableSize; + cm_scache_t * allSCachesp; afs_uint32 currentSCaches; afs_uint32 maxSCaches; cm_scache_t * scacheLRUFirstp; cm_scache_t * scacheLRULastp; + cm_volume_t ** 
volumeNameHashTablep; + cm_volume_t ** volumeRWIDHashTablep; + cm_volume_t ** volumeROIDHashTablep; + cm_volume_t ** volumeBKIDHashTablep; + afs_uint32 volumeHashTableSize; + cm_volume_t * volumeLRUFirstp; + cm_volume_t * volumeLRULastp; + cm_nc_t * ncfreelist; cm_nc_t * nameCache; cm_nc_t ** nameHash; @@ -67,7 +76,7 @@ typedef struct cm_config_data { cm_buf_t * buf_freeListEndp; cm_buf_t * buf_dirtyListp; cm_buf_t * buf_dirtyListEndp; - cm_buf_t ** buf_hashTablepp; + cm_buf_t ** buf_scacheHashTablepp; cm_buf_t ** buf_fileHashTablepp; cm_buf_t * buf_allp; afs_uint64 buf_nbuffers; diff --git a/src/WINNT/afsd/cm_scache.c b/src/WINNT/afsd/cm_scache.c index ed21e8a8a..d3646547a 100644 --- a/src/WINNT/afsd/cm_scache.c +++ b/src/WINNT/afsd/cm_scache.c @@ -40,7 +40,7 @@ extern osi_mutex_t cm_Freelance_Lock; #endif /* must be called with cm_scacheLock write-locked! */ -void cm_AdjustLRU(cm_scache_t *scp) +void cm_AdjustScacheLRU(cm_scache_t *scp) { if (scp == cm_data.scacheLRULastp) cm_data.scacheLRULastp = (cm_scache_t *) osi_QPrev(&scp->q); @@ -60,7 +60,7 @@ void cm_RemoveSCacheFromHashTable(cm_scache_t *scp) if (scp->flags & CM_SCACHEFLAG_INHASH) { /* hash it out first */ i = CM_SCACHE_HASH(&scp->fid); - for (lscpp = &cm_data.hashTablep[i], tscp = cm_data.hashTablep[i]; + for (lscpp = &cm_data.scacheHashTablep[i], tscp = cm_data.scacheHashTablep[i]; tscp; lscpp = &tscp->nextp, tscp = tscp->nextp) { if (tscp == scp) { @@ -220,7 +220,7 @@ cm_scache_t *cm_GetNewSCache(void) scp; scp = (cm_scache_t *) osi_QPrev(&scp->q)) { - osi_assert(scp >= cm_data.scacheBaseAddress && scp < (cm_scache_t *)cm_data.hashTablep); + osi_assert(scp >= cm_data.scacheBaseAddress && scp < (cm_scache_t *)cm_data.scacheHashTablep); if (scp->refCount == 0) { if (scp->flags & CM_SCACHEFLAG_DELETED) { @@ -231,7 +231,7 @@ cm_scache_t *cm_GetNewSCache(void) /* now remove from the LRU queue and put it back at the * head of the LRU queue. 
*/ - cm_AdjustLRU(scp); + cm_AdjustScacheLRU(scp); /* and we're done */ return scp; @@ -242,7 +242,7 @@ cm_scache_t *cm_GetNewSCache(void) /* now remove from the LRU queue and put it back at the * head of the LRU queue. */ - cm_AdjustLRU(scp); + cm_AdjustScacheLRU(scp); /* and we're done */ return scp; @@ -256,40 +256,40 @@ cm_scache_t *cm_GetNewSCache(void) /* There were no deleted scache objects that we could use. Try to find * one that simply hasn't been used in a while. */ - for ( scp = cm_data.scacheLRULastp; - scp; - scp = (cm_scache_t *) osi_QPrev(&scp->q)) - { - /* It is possible for the refCount to be zero and for there still - * to be outstanding dirty buffers. If there are dirty buffers, - * we must not recycle the scp. */ - if (scp->refCount == 0 && scp->bufReadsp == NULL && scp->bufWritesp == NULL) { - if (!buf_DirtyBuffersExist(&scp->fid)) { - if (!cm_RecycleSCache(scp, 0)) { - /* we found an entry, so return it */ - /* now remove from the LRU queue and put it back at the - * head of the LRU queue. - */ - cm_AdjustLRU(scp); - - /* and we're done */ - return scp; - } - } else { - osi_Log1(afsd_logp,"GetNewSCache dirty buffers exist scp 0x%x", scp); - } - } - } - osi_Log1(afsd_logp, "GetNewSCache all scache entries in use (retry = %d)", retry); - - return NULL; + for ( scp = cm_data.scacheLRULastp; + scp; + scp = (cm_scache_t *) osi_QPrev(&scp->q)) + { + /* It is possible for the refCount to be zero and for there still + * to be outstanding dirty buffers. If there are dirty buffers, + * we must not recycle the scp. */ + if (scp->refCount == 0 && scp->bufReadsp == NULL && scp->bufWritesp == NULL) { + if (!buf_DirtyBuffersExist(&scp->fid)) { + if (!cm_RecycleSCache(scp, 0)) { + /* we found an entry, so return it */ + /* now remove from the LRU queue and put it back at the + * head of the LRU queue. 
+ */ + cm_AdjustScacheLRU(scp); + + /* and we're done */ + return scp; + } + } else { + osi_Log1(afsd_logp,"GetNewSCache dirty buffers exist scp 0x%x", scp); + } + } + } + osi_Log1(afsd_logp, "GetNewSCache all scache entries in use (retry = %d)", retry); + + return NULL; } /* if we get here, we should allocate a new scache entry. We either are below * quota or we have a leak and need to allocate a new one to avoid panicing. */ scp = cm_data.scacheBaseAddress + cm_data.currentSCaches; - osi_assert(scp >= cm_data.scacheBaseAddress && scp < (cm_scache_t *)cm_data.hashTablep); + osi_assert(scp >= cm_data.scacheBaseAddress && scp < (cm_scache_t *)cm_data.scacheHashTablep); memset(scp, 0, sizeof(cm_scache_t)); scp->magic = CM_SCACHE_MAGIC; lock_InitializeMutex(&scp->mx, "cm_scache_t mutex"); @@ -303,6 +303,8 @@ cm_scache_t *cm_GetNewSCache(void) cm_data.currentSCaches++; cm_dnlcPurgedp(scp); /* make doubly sure that this is not in dnlc */ cm_dnlcPurgevp(scp); + scp->allNextp = cm_data.allSCachesp; + cm_data.allSCachesp = scp; return scp; } @@ -417,8 +419,8 @@ cm_ValidateSCache(void) } } - for ( i=0; i < cm_data.hashTableSize; i++ ) { - for ( scp = cm_data.hashTablep[i]; scp; scp = scp->nextp ) { + for ( i=0; i < cm_data.scacheHashTableSize; i++ ) { + for ( scp = cm_data.scacheHashTablep[i]; scp; scp = scp->nextp ) { if (scp->magic != CM_SCACHE_MAGIC) { afsi_log("cm_ValidateSCache failure: scp->magic != CM_SCACHE_MAGIC"); fprintf(stderr, "cm_ValidateSCache failure: scp->magic != CM_SCACHE_MAGIC\n"); @@ -450,8 +452,8 @@ cm_ShutdownSCache(void) { cm_scache_t * scp; - for ( scp = cm_data.scacheLRULastp; scp; - scp = (cm_scache_t *) osi_QPrev(&scp->q) ) { + for ( scp = cm_data.allSCachesp; scp; + scp = scp->allNextp ) { if (scp->randomACLp) { lock_ObtainMutex(&scp->mx); cm_FreeAllACLEnts(scp); @@ -471,15 +473,16 @@ void cm_InitSCache(int newFile, long maxSCaches) if (osi_Once(&once)) { lock_InitializeRWLock(&cm_scacheLock, "cm_scacheLock"); if ( newFile ) { - 
memset(cm_data.hashTablep, 0, sizeof(cm_scache_t *) * cm_data.hashTableSize); + memset(cm_data.scacheHashTablep, 0, sizeof(cm_scache_t *) * cm_data.scacheHashTableSize); + cm_data.allSCachesp = NULL; cm_data.currentSCaches = 0; cm_data.maxSCaches = maxSCaches; cm_data.scacheLRUFirstp = cm_data.scacheLRULastp = NULL; } else { cm_scache_t * scp; - for ( scp = cm_data.scacheLRULastp; scp; - scp = (cm_scache_t *) osi_QPrev(&scp->q) ) { + for ( scp = cm_data.allSCachesp; scp; + scp = scp->allNextp ) { lock_InitializeMutex(&scp->mx, "cm_scache_t mutex"); lock_InitializeRWLock(&scp->bufCreateLock, "cm_scache_t bufCreateLock"); @@ -521,10 +524,10 @@ cm_scache_t *cm_FindSCache(cm_fid_t *fidp) } lock_ObtainWrite(&cm_scacheLock); - for (scp=cm_data.hashTablep[hash]; scp; scp=scp->nextp) { + for (scp=cm_data.scacheHashTablep[hash]; scp; scp=scp->nextp) { if (cm_FidCmp(fidp, &scp->fid) == 0) { cm_HoldSCacheNoLock(scp); - cm_AdjustLRU(scp); + cm_AdjustScacheLRU(scp); lock_ReleaseWrite(&cm_scacheLock); return scp; } @@ -565,7 +568,7 @@ long cm_GetSCache(cm_fid_t *fidp, cm_scache_t **outScpp, cm_user_t *userp, // yj: check if we have the scp, if so, we don't need // to do anything else lock_ObtainWrite(&cm_scacheLock); - for (scp=cm_data.hashTablep[hash]; scp; scp=scp->nextp) { + for (scp=cm_data.scacheHashTablep[hash]; scp; scp=scp->nextp) { if (cm_FidCmp(fidp, &scp->fid) == 0) { #ifdef DEBUG_REFCOUNT afsi_log("%s:%d cm_GetSCache (1) outScpp 0x%p ref %d", file, line, scp, scp->refCount); @@ -573,7 +576,7 @@ long cm_GetSCache(cm_fid_t *fidp, cm_scache_t **outScpp, cm_user_t *userp, #endif cm_HoldSCacheNoLock(scp); *outScpp = scp; - cm_AdjustLRU(scp); + cm_AdjustScacheLRU(scp); lock_ReleaseWrite(&cm_scacheLock); return 0; } @@ -636,8 +639,8 @@ long cm_GetSCache(cm_fid_t *fidp, cm_scache_t **outScpp, cm_user_t *userp, scp->dotdotFid.unique=1; scp->dotdotFid.vnode=1; scp->flags |= (CM_SCACHEFLAG_PURERO | CM_SCACHEFLAG_RO); - scp->nextp=cm_data.hashTablep[hash]; - 
cm_data.hashTablep[hash]=scp; + scp->nextp=cm_data.scacheHashTablep[hash]; + cm_data.scacheHashTablep[hash]=scp; scp->flags |= CM_SCACHEFLAG_INHASH; scp->refCount = 1; osi_Log1(afsd_logp,"cm_GetSCache (freelance) sets refCount to 1 scp 0x%x", scp); @@ -683,7 +686,7 @@ long cm_GetSCache(cm_fid_t *fidp, cm_scache_t **outScpp, cm_user_t *userp, if (!cellp) return CM_ERROR_NOSUCHCELL; - code = cm_GetVolumeByID(cellp, fidp->volume, userp, reqp, &volp); + code = cm_GetVolumeByID(cellp, fidp->volume, userp, reqp, CM_GETVOL_FLAG_CREATE, &volp); if (code) return code; lock_ObtainWrite(&cm_scacheLock); @@ -692,7 +695,7 @@ long cm_GetSCache(cm_fid_t *fidp, cm_scache_t **outScpp, cm_user_t *userp, /* otherwise, we have the volume, now reverify that the scp doesn't * exist, and proceed. */ - for (scp=cm_data.hashTablep[hash]; scp; scp=scp->nextp) { + for (scp=cm_data.scacheHashTablep[hash]; scp; scp=scp->nextp) { if (cm_FidCmp(fidp, &scp->fid) == 0) { #ifdef DEBUG_REFCOUNT afsi_log("%s:%d cm_GetSCache (3) outScpp 0x%p ref %d", file, line, scp, scp->refCount); @@ -700,7 +703,7 @@ long cm_GetSCache(cm_fid_t *fidp, cm_scache_t **outScpp, cm_user_t *userp, #endif cm_HoldSCacheNoLock(scp); osi_assert(scp->volp == volp); - cm_AdjustLRU(scp); + cm_AdjustScacheLRU(scp); lock_ReleaseWrite(&cm_scacheLock); if (volp) cm_PutVolume(volp); @@ -744,13 +747,13 @@ long cm_GetSCache(cm_fid_t *fidp, cm_scache_t **outScpp, cm_user_t *userp, scp->dotdotFid = volp->dotdotFid; } - if (volp->roID == fidp->volume) + if (volp->ro.ID == fidp->volume) scp->flags |= (CM_SCACHEFLAG_PURERO | CM_SCACHEFLAG_RO); - else if (volp->bkID == fidp->volume) + else if (volp->bk.ID == fidp->volume) scp->flags |= CM_SCACHEFLAG_RO; } - scp->nextp = cm_data.hashTablep[hash]; - cm_data.hashTablep[hash] = scp; + scp->nextp = cm_data.scacheHashTablep[hash]; + cm_data.scacheHashTablep[hash] = scp; scp->flags |= CM_SCACHEFLAG_INHASH; scp->refCount = 1; osi_Log1(afsd_logp,"cm_GetSCache sets refCount to 1 scp 0x%x", scp); @@ 
-791,7 +794,7 @@ cm_scache_t * cm_FindSCacheParent(cm_scache_t * scp) if (cm_FidCmp(&scp->fid, &parent_fid)) { i = CM_SCACHE_HASH(&parent_fid); - for (pscp = cm_data.hashTablep[i]; pscp; pscp = pscp->nextp) { + for (pscp = cm_data.scacheHashTablep[i]; pscp; pscp = pscp->nextp) { if (!cm_FidCmp(&pscp->fid, &parent_fid)) { cm_HoldSCacheNoLock(pscp); break; @@ -1353,7 +1356,7 @@ void cm_MergeStatus(cm_scache_t *dscp, struct cm_volume *volp = NULL; cm_GetVolumeByID(cellp, scp->fid.volume, userp, - (cm_req_t *) NULL, &volp); + (cm_req_t *) NULL, CM_GETVOL_FLAG_CREATE, &volp); osi_Log2(afsd_logp, "old data from server %x volume %s", scp->cbServerp->addr.sin_addr.s_addr, volp ? volp->namep : "(unknown)"); @@ -1562,7 +1565,7 @@ int cm_FindFileType(cm_fid_t *fidp) osi_assert(fidp->cell != 0); lock_ObtainWrite(&cm_scacheLock); - for (scp=cm_data.hashTablep[hash]; scp; scp=scp->nextp) { + for (scp=cm_data.scacheHashTablep[hash]; scp; scp=scp->nextp) { if (cm_FidCmp(fidp, &scp->fid) == 0) { lock_ReleaseWrite(&cm_scacheLock); return scp->fileType; @@ -1589,7 +1592,7 @@ int cm_DumpSCache(FILE *outputFile, char *cookie, int lock) sprintf(output, "%s - dumping scache - cm_data.currentSCaches=%d, cm_data.maxSCaches=%d\r\n", cookie, cm_data.currentSCaches, cm_data.maxSCaches); WriteFile(outputFile, output, (DWORD)strlen(output), &zilch, NULL); - for (scp = cm_data.scacheLRULastp; scp; scp = (cm_scache_t *) osi_QPrev(&scp->q)) + for (scp = cm_data.allSCachesp; scp; scp = scp->allNextp) { if (scp->refCount != 0) { @@ -1600,12 +1603,12 @@ int cm_DumpSCache(FILE *outputFile, char *cookie, int lock) } } - sprintf(output, "%s - dumping cm_data.hashTable - cm_data.hashTableSize=%d\r\n", cookie, cm_data.hashTableSize); + sprintf(output, "%s - dumping cm_data.hashTable - cm_data.scacheHashTableSize=%d\r\n", cookie, cm_data.scacheHashTableSize); WriteFile(outputFile, output, (DWORD)strlen(output), &zilch, NULL); - for (i = 0; i < cm_data.hashTableSize; i++) + for (i = 0; i < 
cm_data.scacheHashTableSize; i++) { - for(scp = cm_data.hashTablep[i]; scp; scp=scp->nextp) + for(scp = cm_data.scacheHashTablep[i]; scp; scp=scp->nextp) { if (scp->refCount != 0) { diff --git a/src/WINNT/afsd/cm_scache.h b/src/WINNT/afsd/cm_scache.h index 8d9d07829..d04514317 100644 --- a/src/WINNT/afsd/cm_scache.h +++ b/src/WINNT/afsd/cm_scache.h @@ -85,6 +85,7 @@ typedef struct cm_scache { osi_queue_t q; /* lru queue; cm_scacheLock */ afs_uint32 magic; struct cm_scache *nextp; /* next in hash; cm_scacheLock */ + struct cm_scache *allNextp; /* next in all scache list; cm_scacheLock */ cm_fid_t fid; afs_uint32 flags; /* flags; locked by mx */ @@ -298,7 +299,7 @@ typedef struct cm_scache { ((fidp)->volume + \ (fidp)->vnode + \ (fidp)->unique)) \ - % cm_data.hashTableSize) + % cm_data.scacheHashTableSize) #include "cm_conn.h" #include "cm_buf.h" diff --git a/src/WINNT/afsd/cm_server.c b/src/WINNT/afsd/cm_server.c index e687cc499..b64cab594 100644 --- a/src/WINNT/afsd/cm_server.c +++ b/src/WINNT/afsd/cm_server.c @@ -50,6 +50,7 @@ cm_PingServer(cm_server_t *tsp) long usecs; Capabilities caps = {0, 0}; char hoststr[16]; + cm_req_t req; lock_ObtainMutex(&tsp->mx); if (tsp->flags & CM_SERVERFLAG_PINGING) { @@ -119,6 +120,28 @@ cm_PingServer(cm_server_t *tsp) osi_LogSaveString(afsd_logp, hoststr), tsp->type == CM_SERVER_VLDB ?
"vldb" : "file", tsp->capabilities); + + /* Now update the volume status if necessary */ + if (wasDown) { + cm_server_vols_t * tsrvp; + cm_volume_t * volp; + int i; + + for (tsrvp = tsp->vols; tsrvp; tsrvp = tsrvp->nextp) { + for (i=0; i<NUM_SERVER_VOLS; i++) { + if (tsrvp->ids[i] != 0) { + cm_InitReq(&req); + + code = cm_GetVolumeByID(tsp->cellp, tsrvp->ids[i], cm_rootUserp, + &req, CM_GETVOL_FLAG_NO_LRU_UPDATE, &volp); + if (code == 0) { + cm_UpdateVolumeStatus(volp, tsrvp->ids[i]); + cm_PutVolume(volp); + } + } + } + } + } } else { /* mark server as down */ tsp->flags |= CM_SERVERFLAG_DOWN; @@ -129,6 +152,28 @@ cm_PingServer(cm_server_t *tsp) osi_LogSaveString(afsd_logp, hoststr), tsp->type == CM_SERVER_VLDB ? "vldb" : "file", tsp->capabilities); + + /* Now update the volume status if necessary */ + if (!wasDown) { + cm_server_vols_t * tsrvp; + cm_volume_t * volp; + int i; + + for (tsrvp = tsp->vols; tsrvp; tsrvp = tsrvp->nextp) { + for (i=0; i<NUM_SERVER_VOLS; i++) { + if (tsrvp->ids[i] != 0) { + cm_InitReq(&req); + + code = cm_GetVolumeByID(tsp->cellp, tsrvp->ids[i], cm_rootUserp, + &req, CM_GETVOL_FLAG_NO_LRU_UPDATE, &volp); + if (code == 0) { + cm_UpdateVolumeStatus(volp, tsrvp->ids[i]); + cm_PutVolume(volp); + } + } + } + } + } } if (tsp->waitCount == 0) @@ -307,22 +352,24 @@ cm_server_t *cm_NewServer(struct sockaddr_in *socketp, int type, cm_cell_t *cell osi_assert(socketp->sin_family == AF_INET); tsp = malloc(sizeof(*tsp)); - memset(tsp, 0, sizeof(*tsp)); - tsp->type = type; - tsp->cellp = cellp; - tsp->refCount = 1; - lock_InitializeMutex(&tsp->mx, "cm_server_t mutex"); - tsp->addr = *socketp; - tsp->flags = CM_SERVERFLAG_DOWN; /* assume down; ping will mark up if available */ - - cm_SetServerPrefs(tsp); - - lock_ObtainWrite(&cm_serverLock); /* get server lock */ - tsp->allNextp = cm_allServersp; - cm_allServersp = tsp; - lock_ReleaseWrite(&cm_serverLock); /* release server lock */ - - cm_PingServer(tsp); /* Obtain Capabilities and check up/down state */ + if (tsp) { + memset(tsp, 0, sizeof(*tsp)); + tsp->type = type; +
tsp->cellp = cellp; + tsp->refCount = 1; + lock_InitializeMutex(&tsp->mx, "cm_server_t mutex"); + tsp->addr = *socketp; + tsp->flags = CM_SERVERFLAG_DOWN; /* assume down; ping will mark up if available */ + + cm_SetServerPrefs(tsp); + + lock_ObtainWrite(&cm_serverLock); /* get server lock */ + tsp->allNextp = cm_allServersp; + cm_allServersp = tsp; + lock_ReleaseWrite(&cm_serverLock); /* release server lock */ + + cm_PingServer(tsp); /* Obtain Capabilities and check up/down state */ + } return tsp; } @@ -351,17 +398,73 @@ cm_server_t *cm_FindServer(struct sockaddr_in *addrp, int type) return tsp; } -cm_serverRef_t *cm_NewServerRef(cm_server_t *serverp) +cm_server_vols_t *cm_NewServerVols(void) { + cm_server_vols_t *tsvp; + + tsvp = malloc(sizeof(*tsvp)); + if (tsvp) + memset(tsvp, 0, sizeof(*tsvp)); + + return tsvp; +} + +cm_serverRef_t *cm_NewServerRef(cm_server_t *serverp, afs_uint32 volID) { cm_serverRef_t *tsrp; + cm_server_vols_t **tsrvpp = NULL; + afs_uint32 *slotp = NULL; + int found = 0; cm_GetServer(serverp); tsrp = malloc(sizeof(*tsrp)); tsrp->server = serverp; - tsrp->status = not_busy; + tsrp->status = srv_not_busy; tsrp->next = NULL; + tsrp->volID = volID; tsrp->refCount = 1; + /* if we have a non-zero volID, we need to add it to the list + * of volumes maintained by the server. There are two phases: + * (1) see if the volID is already in the list and (2) insert + * it into the first empty slot if it is not. 
+ if (volID) { + lock_ObtainMutex(&serverp->mx); + + tsrvpp = &serverp->vols; + while (*tsrvpp) { + int i; + + for (i=0; i<NUM_SERVER_VOLS; i++) { + if ((*tsrvpp)->ids[i] == volID) { + found = 1; + break; + } else if (!slotp && (*tsrvpp)->ids[i] == 0) { + slotp = &(*tsrvpp)->ids[i]; + } + } + + if (found) + break; + + tsrvpp = &(*tsrvpp)->nextp; + } + + if (!found) { + if (slotp) { + *slotp = volID; + } else { + /* if we didn't find an empty slot in a current + * page we must need a new page */ + *tsrvpp = cm_NewServerVols(); + if (*tsrvpp) + (*tsrvpp)->ids[0] = volID; + } + } + + lock_ReleaseMutex(&serverp->mx); + } + return tsrp; } @@ -511,6 +614,8 @@ void cm_RandomizeServer(cm_serverRef_t** list) /* call cm_FreeServer while holding a write lock on cm_serverLock */ void cm_FreeServer(cm_server_t* serverp) { + cm_server_vols_t * tsrvp, *nextp; + cm_PutServerNoLock(serverp); if (serverp->refCount == 0) { @@ -534,12 +639,37 @@ void cm_FreeServer(cm_server_t* serverp) } } } + + /* free the volid list */ + for ( tsrvp = serverp->vols; tsrvp; tsrvp = nextp) { + nextp = tsrvp->nextp; + free(tsrvp); + } + free(serverp); } } } -void cm_FreeServerList(cm_serverRef_t** list) +void cm_RemoveVolumeFromServer(cm_server_t * serverp, afs_uint32 volID) +{ + cm_server_vols_t * tsrvp; + int i; + + if (volID == 0) + return; + + for (tsrvp = serverp->vols; tsrvp; tsrvp = tsrvp->nextp) { + for (i=0; i<NUM_SERVER_VOLS; i++) { + if (tsrvp->ids[i] == volID) { + tsrvp->ids[i] = 0;; + break; + } + } + } +} + +void cm_FreeServerList(cm_serverRef_t** list, afs_uint32 flags) { cm_serverRef_t **current = list; cm_serverRef_t **nextp = 0; @@ -552,11 +682,19 @@ void cm_FreeServerList(cm_serverRef_t** list) nextp = &(*current)->next; if (--((*current)->refCount) == 0) { next = *nextp; + + if ((*current)->volID) + cm_RemoveVolumeFromServer((*current)->server, (*current)->volID); cm_FreeServer((*current)->server); free(*current); *current = next; } else { - current = nextp; + if (flags & CM_FREESERVERLIST_DELETE) { + (*current)->status = srv_deleted; + if
((*current)->volID) + cm_RemoveVolumeFromServer((*current)->server, (*current)->volID); + } + current = nextp; } } diff --git a/src/WINNT/afsd/cm_server.h b/src/WINNT/afsd/cm_server.h index c636108d8..de8aef4f9 100644 --- a/src/WINNT/afsd/cm_server.h +++ b/src/WINNT/afsd/cm_server.h @@ -13,6 +13,13 @@ #include #include +/* this value is set to 1022 in order to */ +#define NUM_SERVER_VOLS (32 - sizeof(void *) / 4) +typedef struct cm_server_vols { + afs_uint32 ids[NUM_SERVER_VOLS]; + struct cm_server_vols *nextp; +} cm_server_vols_t; + /* pointed to by volumes and cells without holds; cm_serverLock is obtained * at the appropriate times to change the pointers to these servers. */ @@ -28,15 +35,17 @@ typedef struct cm_server { unsigned long refCount; /* locked by cm_serverLock */ osi_mutex_t mx; unsigned short ipRank; /* server priority */ + cm_server_vols_t * vols; /* by mx */ } cm_server_t; -enum repstate {not_busy, busy, offline}; +enum repstate {srv_not_busy, srv_busy, srv_offline, srv_deleted}; typedef struct cm_serverRef { struct cm_serverRef *next; /* locked by cm_serverLock */ struct cm_server *server; /* locked by cm_serverLock */ enum repstate status; /* locked by cm_serverLock */ unsigned long refCount; /* locked by cm_serverLock */ + afs_uint32 volID; /* locked by cm_serverLock */ } cm_serverRef_t; /* types */ @@ -68,7 +77,7 @@ typedef struct cm_serverRef { extern cm_server_t *cm_NewServer(struct sockaddr_in *addrp, int type, struct cm_cell *cellp); -extern cm_serverRef_t *cm_NewServerRef(struct cm_server *serverp); +extern cm_serverRef_t *cm_NewServerRef(struct cm_server *serverp, afs_uint32 volID); extern LONG_PTR cm_ChecksumServerList(cm_serverRef_t *serversp); @@ -100,7 +109,9 @@ extern void cm_RandomizeServer(cm_serverRef_t** list); extern void cm_FreeServer(cm_server_t* server); -extern void cm_FreeServerList(cm_serverRef_t** list); +#define CM_FREESERVERLIST_DELETE 1 + +extern void cm_FreeServerList(cm_serverRef_t** list, afs_uint32 flags); extern 
void cm_ForceNewConnectionsAllServers(void); diff --git a/src/WINNT/afsd/cm_vnodeops.c b/src/WINNT/afsd/cm_vnodeops.c index e265b800d..4b3921e4e 100644 --- a/src/WINNT/afsd/cm_vnodeops.c +++ b/src/WINNT/afsd/cm_vnodeops.c @@ -22,11 +22,6 @@ #include "afsd.h" -/* Used by cm_FollowMountPoint */ -#define RWVOL 0 -#define ROVOL 1 -#define BACKVOL 2 - #ifdef DEBUG extern void afsi_log(char *pattern, ...); #endif @@ -989,6 +984,7 @@ long cm_ReadMountPoint(cm_scache_t *scp, cm_user_t *userp, cm_req_t *reqp) return code; } + /* called with a locked scp and chases the mount point, yielding outScpp. * scp remains locked, just for simplicity of describing the interface. */ @@ -1065,7 +1061,13 @@ long cm_FollowMountPoint(cm_scache_t *scp, cm_scache_t *dscp, cm_user_t *userp, /* now we need to get the volume */ lock_ReleaseMutex(&scp->mx); - code = cm_GetVolumeByName(cellp, volNamep, userp, reqp, 0, &volp); + if (cm_VolNameIsID(volNamep)) { + code = cm_GetVolumeByID(cellp, atoi(volNamep), userp, reqp, + CM_GETVOL_FLAG_CREATE, &volp); + } else { + code = cm_GetVolumeByName(cellp, volNamep, userp, reqp, + CM_GETVOL_FLAG_CREATE, &volp); + } lock_ObtainMutex(&scp->mx); if (code == 0) { @@ -1086,14 +1088,14 @@ long cm_FollowMountPoint(cm_scache_t *scp, cm_scache_t *dscp, cm_user_t *userp, * the read-only, otherwise use the one specified. 
*/ if (mtType == '#' && (scp->flags & CM_SCACHEFLAG_PURERO) - && volp->roID != 0 && type == RWVOL) + && volp->ro.ID != 0 && type == RWVOL) type = ROVOL; if (type == ROVOL) - scp->mountRootFid.volume = volp->roID; + scp->mountRootFid.volume = volp->ro.ID; else if (type == BACKVOL) - scp->mountRootFid.volume = volp->bkID; + scp->mountRootFid.volume = volp->bk.ID; else - scp->mountRootFid.volume = volp->rwID; + scp->mountRootFid.volume = volp->rw.ID; /* the rest of the fid is a magic number */ scp->mountRootFid.vnode = 1; @@ -1383,7 +1385,7 @@ long cm_Unlink(cm_scache_t *dscp, char *namep, cm_user_t *userp, cm_req_t *reqp) osi_Log1(afsd_logp, "CALL RemoveFile scp 0x%p", dscp); do { - code = cm_Conn(&dscp->fid, userp, reqp, &connp); + code = cm_ConnFromFID(&dscp->fid, userp, reqp, &connp); if (code) continue; @@ -2024,7 +2026,7 @@ cm_TryBulkStat(cm_scache_t *dscp, osi_hyper_t *offsetp, cm_user_t *userp, cm_StartCallbackGrantingCall(NULL, &cbReq); osi_Log1(afsd_logp, "CALL BulkStatus, %d entries", filesThisCall); do { - code = cm_Conn(&dscp->fid, userp, reqp, &connp); + code = cm_ConnFromFID(&dscp->fid, userp, reqp, &connp); if (code) continue; @@ -2292,7 +2294,7 @@ long cm_SetAttr(cm_scache_t *scp, cm_attr_t *attrp, cm_user_t *userp, /* now make the RPC */ osi_Log1(afsd_logp, "CALL StoreStatus scp 0x%p", scp); do { - code = cm_Conn(&scp->fid, userp, reqp, &connp); + code = cm_ConnFromFID(&scp->fid, userp, reqp, &connp); if (code) continue; @@ -2370,7 +2372,7 @@ long cm_Create(cm_scache_t *dscp, char *namep, long flags, cm_attr_t *attrp, /* try the RPC now */ osi_Log1(afsd_logp, "CALL CreateFile scp 0x%p", dscp); do { - code = cm_Conn(&dscp->fid, userp, reqp, &connp); + code = cm_ConnFromFID(&dscp->fid, userp, reqp, &connp); if (code) continue; @@ -2504,7 +2506,7 @@ long cm_MakeDir(cm_scache_t *dscp, char *namep, long flags, cm_attr_t *attrp, /* try the RPC now */ osi_Log1(afsd_logp, "CALL MakeDir scp 0x%p", dscp); do { - code = cm_Conn(&dscp->fid, userp, reqp, &connp); 
+ code = cm_ConnFromFID(&dscp->fid, userp, reqp, &connp); if (code) continue; @@ -2595,7 +2597,7 @@ long cm_Link(cm_scache_t *dscp, char *namep, cm_scache_t *sscp, long flags, /* try the RPC now */ osi_Log1(afsd_logp, "CALL Link scp 0x%p", dscp); do { - code = cm_Conn(&dscp->fid, userp, reqp, &connp); + code = cm_ConnFromFID(&dscp->fid, userp, reqp, &connp); if (code) continue; dirAFSFid.Volume = dscp->fid.volume; @@ -2663,7 +2665,7 @@ long cm_SymLink(cm_scache_t *dscp, char *namep, char *contentsp, long flags, /* try the RPC now */ osi_Log1(afsd_logp, "CALL Symlink scp 0x%p", dscp); do { - code = cm_Conn(&dscp->fid, userp, reqp, &connp); + code = cm_ConnFromFID(&dscp->fid, userp, reqp, &connp); if (code) continue; @@ -2745,7 +2747,7 @@ long cm_RemoveDir(cm_scache_t *dscp, char *namep, cm_user_t *userp, /* try the RPC now */ osi_Log1(afsd_logp, "CALL RemoveDir scp 0x%p", dscp); do { - code = cm_Conn(&dscp->fid, userp, reqp, &connp); + code = cm_ConnFromFID(&dscp->fid, userp, reqp, &connp); if (code) continue; @@ -2897,7 +2899,7 @@ long cm_Rename(cm_scache_t *oldDscp, char *oldNamep, cm_scache_t *newDscp, osi_Log2(afsd_logp, "CALL Rename old scp 0x%p new scp 0x%p", oldDscp, newDscp); do { - code = cm_Conn(&oldDscp->fid, userp, reqp, &connp); + code = cm_ConnFromFID(&oldDscp->fid, userp, reqp, &connp); if (code) continue; @@ -3481,7 +3483,7 @@ long cm_IntSetLock(cm_scache_t * scp, cm_user_t * userp, int lockType, lock_ReleaseMutex(&scp->mx); do { - code = cm_Conn(&cfid, userp, reqp, &connp); + code = cm_ConnFromFID(&cfid, userp, reqp, &connp); if (code) break; @@ -3525,7 +3527,7 @@ long cm_IntReleaseLock(cm_scache_t * scp, cm_user_t * userp, osi_Log1(afsd_logp, "CALL ReleaseLock scp 0x%p", scp); do { - code = cm_Conn(&cfid, userp, reqp, &connp); + code = cm_ConnFromFID(&cfid, userp, reqp, &connp); if (code) break; @@ -4525,7 +4527,7 @@ void cm_CheckLocks() lock_ReleaseMutex(&scp->mx); do { - code = cm_Conn(&cfid, userp, + code = cm_ConnFromFID(&cfid, userp, &req, 
&connp); if (code) break; @@ -4965,9 +4967,9 @@ void cm_ReleaseAllLocks(void) cm_file_lock_t *fileLock; unsigned int i; - for (i = 0; i < cm_data.hashTableSize; i++) + for (i = 0; i < cm_data.scacheHashTableSize; i++) { - for ( scp = cm_data.hashTablep[i]; scp; scp = scp->nextp ) { + for ( scp = cm_data.scacheHashTablep[i]; scp; scp = scp->nextp ) { while (scp->fileLocksH != NULL) { lock_ObtainMutex(&scp->mx); lock_ObtainWrite(&cm_scacheLock); diff --git a/src/WINNT/afsd/cm_volume.c b/src/WINNT/afsd/cm_volume.c index 4167ccbf7..eada0891b 100644 --- a/src/WINNT/afsd/cm_volume.c +++ b/src/WINNT/afsd/cm_volume.c @@ -27,7 +27,7 @@ cm_ValidateVolume(void) cm_volume_t * volp; afs_uint32 count; - for (volp = cm_data.allVolumesp, count = 0; volp; volp=volp->nextp, count++) { + for (volp = cm_data.allVolumesp, count = 0; volp; volp=volp->allNextp, count++) { if ( volp->magic != CM_VOLUME_MAGIC ) { afsi_log("cm_ValidateVolume failure: volp->magic != CM_VOLUME_MAGIC"); fprintf(stderr, "cm_ValidateVolume failure: volp->magic != CM_VOLUME_MAGIC\n"); @@ -38,9 +38,9 @@ cm_ValidateVolume(void) fprintf(stderr, "cm_ValidateVolume failure: volp->cellp->magic != CM_CELL_MAGIC\n"); return -2; } - if ( volp->nextp && volp->nextp->magic != CM_VOLUME_MAGIC ) { - afsi_log("cm_ValidateVolume failure: volp->nextp->magic != CM_VOLUME_MAGIC"); - fprintf(stderr, "cm_ValidateVolume failure: volp->nextp->magic != CM_VOLUME_MAGIC\n"); + if ( volp->allNextp && volp->allNextp->magic != CM_VOLUME_MAGIC ) { + afsi_log("cm_ValidateVolume failure: volp->allNextp->magic != CM_VOLUME_MAGIC"); + fprintf(stderr, "cm_ValidateVolume failure: volp->allNextp->magic != CM_VOLUME_MAGIC\n"); return -3; } if ( count != 0 && volp == cm_data.allVolumesp || @@ -65,8 +65,17 @@ cm_ShutdownVolume(void) { cm_volume_t * volp; - for (volp = cm_data.allVolumesp; volp; volp=volp->nextp) + for (volp = cm_data.allVolumesp; volp; volp=volp->allNextp) { + + if (volp->rw.ID) + cm_VolumeStatusNotification(volp, volp->rw.ID, 
volp->rw.state, vl_alldown); + if (volp->ro.ID) + cm_VolumeStatusNotification(volp, volp->ro.ID, volp->ro.state, vl_alldown); + if (volp->bk.ID) + cm_VolumeStatusNotification(volp, volp->bk.ID, volp->bk.state, vl_alldown); + lock_FinalizeMutex(&volp->mx); + } return 0; } @@ -82,21 +91,51 @@ void cm_InitVolume(int newFile, long maxVols) cm_data.allVolumesp = NULL; cm_data.currentVolumes = 0; cm_data.maxVolumes = maxVols; + memset(cm_data.volumeNameHashTablep, 0, sizeof(cm_volume_t *) * cm_data.volumeHashTableSize); + memset(cm_data.volumeRWIDHashTablep, 0, sizeof(cm_volume_t *) * cm_data.volumeHashTableSize); + memset(cm_data.volumeROIDHashTablep, 0, sizeof(cm_volume_t *) * cm_data.volumeHashTableSize); + memset(cm_data.volumeBKIDHashTablep, 0, sizeof(cm_volume_t *) * cm_data.volumeHashTableSize); + cm_data.volumeLRUFirstp = cm_data.volumeLRULastp = NULL; } else { cm_volume_t * volp; - for (volp = cm_data.allVolumesp; volp; volp=volp->nextp) { + for (volp = cm_data.allVolumesp; volp; volp=volp->allNextp) { lock_InitializeMutex(&volp->mx, "cm_volume_t mutex"); volp->flags |= CM_VOLUMEFLAG_RESET; - volp->rwServersp = NULL; - volp->roServersp = NULL; - volp->bkServersp = NULL; + volp->rw.state = vl_unknown; + volp->rw.serversp = NULL; + volp->ro.state = vl_unknown; + volp->ro.serversp = NULL; + volp->bk.state = vl_unknown; + volp->bk.serversp = NULL; + if (volp->rw.ID) + cm_VolumeStatusNotification(volp, volp->rw.ID, vl_alldown, volp->rw.state); + if (volp->ro.ID) + cm_VolumeStatusNotification(volp, volp->ro.ID, vl_alldown, volp->ro.state); + if (volp->bk.ID) + cm_VolumeStatusNotification(volp, volp->bk.ID, vl_alldown, volp->bk.state); } } osi_EndOnce(&once); } } + +/* returns true if the id is a decimal integer, in which case we interpret it + * as an id. make the cache manager much simpler. 
+ * Stolen from src/volser/vlprocs.c */ +int +cm_VolNameIsID(char *aname) +{ + int tc; + while (tc = *aname++) { + if (tc > '9' || tc < '0') + return 0; + } + return 1; +} + + /* * Update a volume. Caller holds volume's lock (volp->mx). * @@ -139,17 +178,20 @@ long cm_UpdateVolume(struct cm_cell *cellp, cm_user_t *userp, cm_req_t *reqp, #ifdef MULTIHOMED struct uvldbentry uvldbEntry; #endif - int type = -1; + int method = -1; int ROcount = 0; long code; + enum volstatus rwNewstate = vl_online; + enum volstatus roNewstate = vl_online; + enum volstatus bkNewstate = vl_online; /* clear out old bindings */ - if (volp->rwServersp) - cm_FreeServerList(&volp->rwServersp); - if (volp->roServersp) - cm_FreeServerList(&volp->roServersp); - if (volp->bkServersp) - cm_FreeServerList(&volp->bkServersp); + if (volp->rw.serversp) + cm_FreeServerList(&volp->rw.serversp, CM_FREESERVERLIST_DELETE); + if (volp->ro.serversp) + cm_FreeServerList(&volp->ro.serversp, CM_FREESERVERLIST_DELETE); + if (volp->bk.serversp) + cm_FreeServerList(&volp->bk.serversp, CM_FREESERVERLIST_DELETE); #ifdef AFS_FREELANCE_CLIENT if ( cellp->cellID == AFS_FAKE_ROOT_CELL_ID && atoi(volp->namep)==AFS_FAKE_ROOT_VOL_ID ) @@ -158,7 +200,7 @@ long cm_UpdateVolume(struct cm_cell *cellp, cm_user_t *userp, cm_req_t *reqp, vldbEntry.flags |= VLF_RWEXISTS; vldbEntry.volumeId[0] = AFS_FAKE_ROOT_VOL_ID; code = 0; - type = 0; + method = 0; } else #endif { @@ -170,16 +212,16 @@ long cm_UpdateVolume(struct cm_cell *cellp, cm_user_t *userp, cm_req_t *reqp, continue; #ifdef MULTIHOMED code = VL_GetEntryByNameU(connp->callp, volp->namep, &uvldbEntry); - type = 2; + method = 2; if ( code == RXGEN_OPCODE ) #endif { code = VL_GetEntryByNameN(connp->callp, volp->namep, &nvldbEntry); - type = 1; + method = 1; } if ( code == RXGEN_OPCODE ) { code = VL_GetEntryByNameO(connp->callp, volp->namep, &vldbEntry); - type = 0; + method = 0; } } while (cm_Analyze(connp, userp, reqp, NULL, NULL, cellp->vlServersp, NULL, code)); code = 
cm_MapVLRPCError(code, reqp); @@ -190,6 +232,46 @@ long cm_UpdateVolume(struct cm_cell *cellp, cm_user_t *userp, cm_req_t *reqp, osi_Log2(afsd_logp, "CALL VL_GetEntryByName{UNO} name %s:%s SUCCESS", volp->cellp->name, volp->namep); } + + /* We can end up here with code == CM_ERROR_NOSUCHVOLUME if the base volume name + * does not exist but there might exist a .readonly volume. If the base name + * doesn't exist we will not care about the .backup that might be left behind + * since there should be no method to access it. + */ + if (code == CM_ERROR_NOSUCHVOLUME && volp->rw.ID == 0 && strlen(volp->namep) < (VL_MAXNAMELEN - 9)) { + char name[VL_MAXNAMELEN]; + + snprintf(name, VL_MAXNAMELEN, "%s.readonly", volp->namep); + + /* now we have volume structure locked and held; make RPC to fill it */ + osi_Log2(afsd_logp, "CALL VL_GetEntryByName{UNO} name %s:%s", volp->cellp->name, name); + do { + code = cm_ConnByMServers(cellp->vlServersp, userp, reqp, &connp); + if (code) + continue; +#ifdef MULTIHOMED + code = VL_GetEntryByNameU(connp->callp, name, &uvldbEntry); + method = 2; + if ( code == RXGEN_OPCODE ) +#endif + { + code = VL_GetEntryByNameN(connp->callp, name, &nvldbEntry); + method = 1; + } + if ( code == RXGEN_OPCODE ) { + code = VL_GetEntryByNameO(connp->callp, name, &vldbEntry); + method = 0; + } + } while (cm_Analyze(connp, userp, reqp, NULL, NULL, cellp->vlServersp, NULL, code)); + code = cm_MapVLRPCError(code, reqp); + if ( code ) + osi_Log3(afsd_logp, "CALL VL_GetEntryByName{UNO} name %s:%s FAILURE, code 0x%x", + volp->cellp->name, name, code); + else + osi_Log2(afsd_logp, "CALL VL_GetEntryByName{UNO} name %s:%s SUCCESS", + volp->cellp->name, name); + } + if (code == 0) { afs_int32 flags; afs_int32 nServers; @@ -198,8 +280,12 @@ long cm_UpdateVolume(struct cm_cell *cellp, cm_user_t *userp, cm_req_t *reqp, afs_int32 bkID; afs_int32 serverNumber[NMAXNSERVERS]; afs_int32 serverFlags[NMAXNSERVERS]; + afs_int32 rwServers_alldown = 1; + afs_int32 roServers_alldown = 
1; + afs_int32 bkServers_alldown = 1; + char name[VL_MAXNAMELEN]; - switch ( type ) { + switch ( method ) { case 0: flags = vldbEntry.flags; nServers = vldbEntry.nServers; @@ -210,6 +296,8 @@ long cm_UpdateVolume(struct cm_cell *cellp, cm_user_t *userp, cm_req_t *reqp, serverFlags[i] = vldbEntry.serverFlags[i]; serverNumber[i] = vldbEntry.serverNumber[i]; } + strncpy(name, vldbEntry.name, VL_MAXNAMELEN); + name[VL_MAXNAMELEN - 1] = '\0'; break; case 1: flags = nvldbEntry.flags; @@ -221,6 +309,8 @@ long cm_UpdateVolume(struct cm_cell *cellp, cm_user_t *userp, cm_req_t *reqp, serverFlags[i] = nvldbEntry.serverFlags[i]; serverNumber[i] = nvldbEntry.serverNumber[i]; } + strncpy(name, nvldbEntry.name, VL_MAXNAMELEN); + name[VL_MAXNAMELEN - 1] = '\0'; break; #ifdef MULTIHOMED case 2: @@ -270,24 +360,71 @@ long cm_UpdateVolume(struct cm_cell *cellp, cm_user_t *userp, cm_req_t *reqp, } } nServers = j; /* update the server count */ + strncpy(name, uvldbEntry.name, VL_MAXNAMELEN); + name[VL_MAXNAMELEN - 1] = '\0'; break; #endif } /* decode the response */ lock_ObtainWrite(&cm_volumeLock); - if (flags & VLF_RWEXISTS) - volp->rwID = rwID; - else - volp->rwID = 0; - if (flags & VLF_ROEXISTS) - volp->roID = roID; - else - volp->roID = 0; - if (flags & VLF_BACKEXISTS) - volp->bkID = bkID; - else - volp->bkID = 0; + if (cm_VolNameIsID(volp->namep)) { + size_t len; + + len = strlen(name); + + if (len >= 8 && strcmp(name + len - 7, ".backup") == 0) { + name[len - 7] = '\0'; + } else if (len >= 10 && strcmp(name + len - 9, ".readonly") == 0) { + name[len - 9] = '\0'; + } + + osi_Log2(afsd_logp, "cm_UpdateVolume name %s -> %s", volp->namep, name); + + if (volp->flags & CM_VOLUMEFLAG_IN_HASH) + cm_RemoveVolumeFromNameHashTable(volp); + + strcpy(volp->namep, name); + + cm_AddVolumeToNameHashTable(volp); + } + + if (flags & VLF_RWEXISTS) { + if (volp->rw.ID != rwID) { + if (volp->rw.flags & CM_VOLUMEFLAG_IN_HASH) + cm_RemoveVolumeFromIDHashTable(volp, RWVOL); + volp->rw.ID = rwID; + 
cm_AddVolumeToIDHashTable(volp, RWVOL); + } + } else { + if (volp->rw.flags & CM_VOLUMEFLAG_IN_HASH) + cm_RemoveVolumeFromIDHashTable(volp, RWVOL); + volp->rw.ID = 0; + } + if (flags & VLF_ROEXISTS) { + if (volp->ro.ID != roID) { + if (volp->ro.flags & CM_VOLUMEFLAG_IN_HASH) + cm_RemoveVolumeFromIDHashTable(volp, ROVOL); + volp->ro.ID = roID; + cm_AddVolumeToIDHashTable(volp, ROVOL); + } + } else { + if (volp->ro.flags & CM_VOLUMEFLAG_IN_HASH) + cm_RemoveVolumeFromIDHashTable(volp, ROVOL); + volp->ro.ID = 0; + } + if (flags & VLF_BACKEXISTS) { + if (volp->bk.ID != bkID) { + if (volp->bk.flags & CM_VOLUMEFLAG_IN_HASH) + cm_RemoveVolumeFromIDHashTable(volp, BACKVOL); + volp->bk.ID = bkID; + cm_AddVolumeToIDHashTable(volp, BACKVOL); + } + } else { + if (volp->bk.flags & CM_VOLUMEFLAG_IN_HASH) + cm_RemoveVolumeFromIDHashTable(volp, BACKVOL); + volp->bk.ID = 0; + } lock_ReleaseWrite(&cm_volumeLock); for (i=0; irwServersp, tsrp); + tsrp = cm_NewServerRef(tsp, rwID); + cm_InsertServerList(&volp->rw.serversp, tsrp); + lock_ObtainWrite(&cm_serverLock); tsrp->refCount--; /* drop allocation reference */ lock_ReleaseWrite(&cm_serverLock); + + if (!(tsp->flags & CM_SERVERFLAG_DOWN)) + rwServers_alldown = 0; } if ((tflags & VLSF_ROVOL) && (flags & VLF_ROEXISTS)) { - tsrp = cm_NewServerRef(tsp); - cm_InsertServerList(&volp->roServersp, tsrp); + tsrp = cm_NewServerRef(tsp, roID); + cm_InsertServerList(&volp->ro.serversp, tsrp); lock_ObtainWrite(&cm_serverLock); tsrp->refCount--; /* drop allocation reference */ lock_ReleaseWrite(&cm_serverLock); ROcount++; + + if (!(tsp->flags & CM_SERVERFLAG_DOWN)) + roServers_alldown = 0; } /* We don't use VLSF_BACKVOL !?! */ + /* Because only the backup on the server holding the RW + * volume can be valid. This check prevents errors if a + * RW is moved but the old backup is not removed. 
+ */ if ((tflags & VLSF_RWVOL) && (flags & VLF_BACKEXISTS)) { - tsrp = cm_NewServerRef(tsp); - cm_InsertServerList(&volp->bkServersp, tsrp); + tsrp = cm_NewServerRef(tsp, bkID); + cm_InsertServerList(&volp->bk.serversp, tsrp); lock_ObtainWrite(&cm_serverLock); tsrp->refCount--; /* drop allocation reference */ lock_ReleaseWrite(&cm_serverLock); + + if (!(tsp->flags & CM_SERVERFLAG_DOWN)) + bkServers_alldown = 0; } /* Drop the reference obtained by cm_FindServer() */ cm_PutServer(tsp); @@ -353,9 +504,32 @@ long cm_UpdateVolume(struct cm_cell *cellp, cm_user_t *userp, cm_req_t *reqp, * lists are length 1. */ if (ROcount > 1) { - cm_RandomizeServer(&volp->roServersp); + cm_RandomizeServer(&volp->ro.serversp); } + + rwNewstate = rwServers_alldown ? vl_alldown : vl_online; + roNewstate = roServers_alldown ? vl_alldown : vl_online; + bkNewstate = bkServers_alldown ? vl_alldown : vl_online; + } else { + rwNewstate = roNewstate = bkNewstate = vl_alldown; + } + + if (volp->rw.state != rwNewstate) { + if (volp->rw.ID) + cm_VolumeStatusNotification(volp, volp->rw.ID, volp->rw.state, rwNewstate); + volp->rw.state = rwNewstate; + } + if (volp->ro.state != roNewstate) { + if (volp->ro.ID) + cm_VolumeStatusNotification(volp, volp->ro.ID, volp->ro.state, roNewstate); + volp->ro.state = roNewstate; } + if (volp->bk.state != bkNewstate) { + if (volp->bk.ID) + cm_VolumeStatusNotification(volp, volp->bk.ID, volp->bk.state, bkNewstate); + volp->bk.state = bkNewstate; + } + return code; } @@ -368,26 +542,64 @@ void cm_GetVolume(cm_volume_t *volp) } } -long cm_GetVolumeByID(cm_cell_t *cellp, long volumeID, cm_user_t *userp, - cm_req_t *reqp, cm_volume_t **outVolpp) + +long cm_GetVolumeByID(cm_cell_t *cellp, afs_uint32 volumeID, cm_user_t *userp, + cm_req_t *reqp, afs_uint32 flags, cm_volume_t **outVolpp) { cm_volume_t *volp; - char volNameString[100]; +#ifdef SEARCH_ALL_VOLUMES + cm_volume_t *volp2; +#endif + char volNameString[VL_MAXNAMELEN]; + afs_uint32 hash; long code = 0; - 
lock_ObtainWrite(&cm_volumeLock); - for(volp = cm_data.allVolumesp; volp; volp=volp->nextp) { + lock_ObtainRead(&cm_volumeLock); +#ifdef SEARCH_ALL_VOLUMES + for(volp = cm_data.allVolumesp; volp; volp=volp->allNextp) { if (cellp == volp->cellp && - ((unsigned) volumeID == volp->rwID || - (unsigned) volumeID == volp->roID || - (unsigned) volumeID == volp->bkID)) + ((unsigned) volumeID == volp->rw.ID || + (unsigned) volumeID == volp->ro.ID || + (unsigned) volumeID == volp->bk.ID)) break; } + volp2 = volp; +#endif /* SEARCH_ALL_VOLUMES */ + + hash = CM_VOLUME_ID_HASH(volumeID); + /* The volumeID can be any one of the three types. So we must + * search the hash table for all three types until we find it. + * We will search in the order of RO, RW, BK. + */ + for ( volp = cm_data.volumeROIDHashTablep[hash]; volp; volp = volp->ro.nextp) { + if ( cellp == volp->cellp && volumeID == volp->ro.ID ) + break; + } + if (!volp) { + /* try RW volumes */ + for ( volp = cm_data.volumeRWIDHashTablep[hash]; volp; volp = volp->rw.nextp) { + if ( cellp == volp->cellp && volumeID == volp->rw.ID ) + break; + } + } + if (!volp) { + /* try BK volumes */ + for ( volp = cm_data.volumeBKIDHashTablep[hash]; volp; volp = volp->bk.nextp) { + if ( cellp == volp->cellp && volumeID == volp->bk.ID ) + break; + } + } + +#ifdef SEARCH_ALL_VOLUMES + assert(volp == volp2); +#endif + + lock_ReleaseRead(&cm_volumeLock); + /* hold the volume if we found it */ if (volp) - volp->refCount++; - lock_ReleaseWrite(&cm_volumeLock); + cm_GetVolume(volp); /* return it held */ if (volp) { @@ -400,9 +612,13 @@ long cm_GetVolumeByID(cm_cell_t *cellp, long volumeID, cm_user_t *userp, volp->flags &= ~CM_VOLUMEFLAG_RESET; } lock_ReleaseMutex(&volp->mx); - if (code == 0) + if (code == 0) { *outVolpp = volp; - else + + lock_ObtainWrite(&cm_volumeLock); + cm_AdjustVolumeLRU(volp); + lock_ReleaseWrite(&cm_volumeLock); + } else cm_PutVolume(volp); return code; @@ -411,39 +627,111 @@ long cm_GetVolumeByID(cm_cell_t *cellp, long 
volumeID, cm_user_t *userp, /* otherwise, we didn't find it so consult the VLDB */ sprintf(volNameString, "%u", volumeID); code = cm_GetVolumeByName(cellp, volNameString, userp, reqp, - 0, outVolpp); + flags, outVolpp); return code; } + long cm_GetVolumeByName(struct cm_cell *cellp, char *volumeNamep, - struct cm_user *userp, struct cm_req *reqp, - long flags, cm_volume_t **outVolpp) + struct cm_user *userp, struct cm_req *reqp, + afs_uint32 flags, cm_volume_t **outVolpp) { cm_volume_t *volp; - long code = 0; - - lock_ObtainWrite(&cm_volumeLock); - for (volp = cm_data.allVolumesp; volp; volp=volp->nextp) { - if (cellp == volp->cellp && strcmp(volumeNamep, volp->namep) == 0) { +#ifdef SEARCH_ALL_VOLUMES + cm_volume_t *volp2; +#endif + long code = 0; + char name[VL_MAXNAMELEN]; + size_t len; + int type; + afs_uint32 hash; + + strncpy(name, volumeNamep, VL_MAXNAMELEN); + name[VL_MAXNAMELEN-1] = '\0'; + len = strlen(name); + + if (len >= 8 && strcmp(name + len - 7, ".backup") == 0) { + type = BACKVOL; + name[len - 7] = '\0'; + } else if (len >= 10 && strcmp(name + len - 9, ".readonly") == 0) { + type = ROVOL; + name[len - 9] = '\0'; + } else { + type = RWVOL; + } + + lock_ObtainRead(&cm_volumeLock); +#ifdef SEARCH_ALL_VOLUMES + for (volp = cm_data.allVolumesp; volp; volp=volp->allNextp) { + if (cellp == volp->cellp && strcmp(name, volp->namep) == 0) { break; } } - - /* otherwise, get from VLDB */ - if (!volp) { + volp2 = volp; +#endif /* SEARCH_ALL_VOLUMES */ + + hash = CM_VOLUME_NAME_HASH(name); + for (volp = cm_data.volumeNameHashTablep[hash]; volp; volp = volp->nameNextp) { + if (cellp == volp->cellp && strcmp(name, volp->namep) == 0) + break; + } + +#ifdef SEARCH_ALL_VOLUMES + assert(volp2 == volp); +#endif + + if (!volp && (flags & CM_GETVOL_FLAG_CREATE)) { + /* otherwise, get from VLDB */ + if ( cm_data.currentVolumes >= cm_data.maxVolumes ) { - for (volp = cm_data.allVolumesp; volp; volp=volp->nextp) { + +#ifdef RECYCLE_FROM_ALL_VOLUMES_LIST + for (volp = 
cm_data.allVolumesp; volp; volp=volp->allNextp) { if ( volp->refCount == 0 ) { /* There is one we can re-use */ break; } } +#else + for ( volp = cm_data.volumeLRULastp; + volp; + volp = (cm_volume_t *) osi_QPrev(&volp->q)) + { + if ( volp->refCount == 0 ) { + /* There is one we can re-use */ + break; + } + } +#endif if (!volp) osi_panic("Exceeded Max Volumes", __FILE__, __LINE__); - } - if (volp) { - volp->rwID = volp->roID = volp->bkID = 0; + lock_ReleaseRead(&cm_volumeLock); + lock_ObtainMutex(&volp->mx); + lock_ObtainWrite(&cm_volumeLock); + + osi_Log2(afsd_logp, "Recycling Volume %s:%s", + volp->cellp->name, volp->namep); + + if (volp->flags & CM_VOLUMEFLAG_IN_LRU_QUEUE) + cm_RemoveVolumeFromLRU(volp); + if (volp->flags & CM_VOLUMEFLAG_IN_HASH) + cm_RemoveVolumeFromNameHashTable(volp); + if (volp->rw.flags & CM_VOLUMEFLAG_IN_HASH) + cm_RemoveVolumeFromIDHashTable(volp, RWVOL); + if (volp->ro.flags & CM_VOLUMEFLAG_IN_HASH) + cm_RemoveVolumeFromIDHashTable(volp, ROVOL); + if (volp->bk.flags & CM_VOLUMEFLAG_IN_HASH) + cm_RemoveVolumeFromIDHashTable(volp, BACKVOL); + + if (volp->rw.ID) + cm_VolumeStatusNotification(volp, volp->rw.ID, volp->rw.state, vl_unknown); + if (volp->ro.ID) + cm_VolumeStatusNotification(volp, volp->ro.ID, volp->ro.state, vl_unknown); + if (volp->bk.ID) + cm_VolumeStatusNotification(volp, volp->bk.ID, volp->bk.state, vl_unknown); + + volp->rw.ID = volp->ro.ID = volp->bk.ID = 0; volp->dotdotFid.cell = 0; volp->dotdotFid.volume = 0; volp->dotdotFid.unique = 0; @@ -452,36 +740,55 @@ long cm_GetVolumeByName(struct cm_cell *cellp, char *volumeNamep, volp = &cm_data.volumeBaseAddress[cm_data.currentVolumes++]; memset(volp, 0, sizeof(cm_volume_t)); volp->magic = CM_VOLUME_MAGIC; - volp->nextp = cm_data.allVolumesp; + volp->allNextp = cm_data.allVolumesp; cm_data.allVolumesp = volp; lock_InitializeMutex(&volp->mx, "cm_volume_t mutex"); - } + lock_ReleaseRead(&cm_volumeLock); + lock_ObtainMutex(&volp->mx); + lock_ObtainWrite(&cm_volumeLock); + } 
volp->cellp = cellp; - strncpy(volp->namep, volumeNamep, VL_MAXNAMELEN); + strncpy(volp->namep, name, VL_MAXNAMELEN); volp->namep[VL_MAXNAMELEN-1] = '\0'; - volp->refCount = 1; /* starts off held */ + volp->refCount = 1; /* starts off held */ volp->flags = CM_VOLUMEFLAG_RESET; + volp->rw.state = volp->ro.state = volp->bk.state = vl_unknown; + volp->rw.nextp = volp->ro.nextp = volp->bk.nextp = NULL; + volp->rw.flags = volp->ro.flags = volp->bk.flags = 0; + cm_AddVolumeToNameHashTable(volp); + lock_ReleaseWrite(&cm_volumeLock); } - else { - volp->refCount++; + else if (volp) { + lock_ReleaseRead(&cm_volumeLock); + cm_GetVolume(volp); + lock_ObtainMutex(&volp->mx); } - /* next should work since no one could have gotten ptr to this structure yet */ - lock_ReleaseWrite(&cm_volumeLock); - lock_ObtainMutex(&volp->mx); + /* if we don't have a volp structure return no such volume */ + if (!volp) + return CM_ERROR_NOSUCHVOLUME; + /* if we get here we are holding the mutex */ if (volp->flags & CM_VOLUMEFLAG_RESET) { code = cm_UpdateVolume(cellp, userp, reqp, volp); if (code == 0) volp->flags &= ~CM_VOLUMEFLAG_RESET; } + lock_ReleaseMutex(&volp->mx); - if (code == 0) + if (code == 0 && (type == BACKVOL && volp->bk.ID == 0 || + type == ROVOL && volp->ro.ID == 0)) + code = CM_ERROR_NOSUCHVOLUME; + + if (code == 0) { *outVolpp = volp; - else + + lock_ObtainWrite(&cm_volumeLock); + cm_AdjustVolumeLRU(volp); + lock_ReleaseWrite(&cm_volumeLock); + } else cm_PutVolume(volp); - lock_ReleaseMutex(&volp->mx); return code; } @@ -489,6 +796,10 @@ void cm_ForceUpdateVolume(cm_fid_t *fidp, cm_user_t *userp, cm_req_t *reqp) { cm_cell_t *cellp; cm_volume_t *volp; +#ifdef SEARCH_ALL_VOLUMES + cm_volume_t *volp2; +#endif + afs_uint32 hash; if (!fidp) return; @@ -496,19 +807,50 @@ void cm_ForceUpdateVolume(cm_fid_t *fidp, cm_user_t *userp, cm_req_t *reqp) if (!cellp) return; /* search for the volume */ - lock_ObtainWrite(&cm_volumeLock); - for(volp = cm_data.allVolumesp; volp; volp=volp->nextp) 
{ + lock_ObtainRead(&cm_volumeLock); +#ifdef SEARCH_ALL_VOLUMES + for(volp = cm_data.allVolumesp; volp; volp=volp->allNextp) { if (cellp == volp->cellp && - (fidp->volume == volp->rwID || - fidp->volume == volp->roID || - fidp->volume == volp->bkID)) + (fidp->volume == volp->rw.ID || + fidp->volume == volp->ro.ID || + fidp->volume == volp->bk.ID)) break; } +#endif /* SEARCH_ALL_VOLUMES */ + + hash = CM_VOLUME_ID_HASH(fidp->volume); + /* The volumeID can be any one of the three types. So we must + * search the hash table for all three types until we find it. + * We will search in the order of RO, RW, BK. + */ + for ( volp = cm_data.volumeROIDHashTablep[hash]; volp; volp = volp->ro.nextp) { + if ( cellp == volp->cellp && fidp->volume == volp->ro.ID ) + break; + } + if (!volp) { + /* try RW volumes */ + for ( volp = cm_data.volumeRWIDHashTablep[hash]; volp; volp = volp->rw.nextp) { + if ( cellp == volp->cellp && fidp->volume == volp->rw.ID ) + break; + } + } + if (!volp) { + /* try BK volumes */ + for ( volp = cm_data.volumeBKIDHashTablep[hash]; volp; volp = volp->bk.nextp) { + if ( cellp == volp->cellp && fidp->volume == volp->bk.ID ) + break; + } + } + +#ifdef SEARCH_ALL_VOLUMES + assert(volp == volp2); +#endif + + lock_ReleaseRead(&cm_volumeLock); /* hold the volume if we found it */ if (volp) - volp->refCount++; - lock_ReleaseWrite(&cm_volumeLock); + cm_GetVolume(volp); /* update it */ cm_data.mountRootGen = time(NULL); @@ -535,19 +877,19 @@ void cm_ForceUpdateVolume(cm_fid_t *fidp, cm_user_t *userp, cm_req_t *reqp) } /* find the appropriate servers from a volume */ -cm_serverRef_t **cm_GetVolServers(cm_volume_t *volp, unsigned long volume) +cm_serverRef_t **cm_GetVolServers(cm_volume_t *volp, afs_uint32 volume) { cm_serverRef_t **serverspp; cm_serverRef_t *current;; lock_ObtainWrite(&cm_serverLock); - if (volume == volp->rwID) - serverspp = &volp->rwServersp; - else if (volume == volp->roID) - serverspp = &volp->roServersp; - else if (volume == volp->bkID) - 
serverspp = &volp->bkServersp; + if (volume == volp->rw.ID) + serverspp = &volp->rw.serversp; + else if (volume == volp->ro.ID) + serverspp = &volp->ro.serversp; + else if (volume == volp->bk.ID) + serverspp = &volp->bk.serversp; else osi_panic("bad volume ID in cm_GetVolServers", __FILE__, __LINE__); @@ -574,10 +916,10 @@ long cm_GetROVolumeID(cm_volume_t *volp) long id; lock_ObtainMutex(&volp->mx); - if (volp->roID && volp->roServersp) - id = volp->roID; + if (volp->ro.ID && volp->ro.serversp) + id = volp->ro.ID; else - id = volp->rwID; + id = volp->rw.ID; lock_ReleaseMutex(&volp->mx); return id; @@ -592,15 +934,15 @@ void cm_RefreshVolumes(void) /* force a re-loading of volume data from the vldb */ lock_ObtainWrite(&cm_volumeLock); - for (volp = cm_data.allVolumesp; volp; volp=volp->nextp) { + for (volp = cm_data.allVolumesp; volp; volp=volp->allNextp) { volp->refCount++; lock_ReleaseWrite(&cm_volumeLock); - lock_ObtainMutex(&volp->mx); + lock_ObtainMutex(&volp->mx); volp->flags |= CM_VOLUMEFLAG_RESET; - lock_ReleaseMutex(&volp->mx); - lock_ObtainWrite(&cm_volumeLock); + + lock_ObtainWrite(&cm_volumeLock); osi_assert(volp->refCount-- > 0); } lock_ReleaseWrite(&cm_volumeLock); @@ -625,6 +967,166 @@ void cm_RefreshVolumes(void) } +/* called from the Daemon thread */ +void cm_CheckBusyVolumes(void) +{ + cm_volume_t *volp; + cm_conn_t *connp; + register long code; + AFSFetchVolumeStatus volStat; + char *Name; + char *OfflineMsg; + char *MOTD; + cm_req_t req; + struct rx_connection * callp; + char volName[32]; + char offLineMsg[256]; + char motd[256]; + + Name = volName; + OfflineMsg = offLineMsg; + MOTD = motd; + + lock_ObtainWrite(&cm_volumeLock); + for (volp = cm_data.allVolumesp; volp; volp=volp->allNextp) { + volp->refCount++; + lock_ReleaseWrite(&cm_volumeLock); + lock_ObtainMutex(&volp->mx); + + if (volp->rw.ID != 0 && (volp->rw.state == vl_busy || volp->rw.state == vl_offline)) { + cm_InitReq(&req); + + do { + code = cm_ConnFromVolume(volp, volp->rw.ID, 
cm_rootUserp, &req, &connp); + if (code) + continue; + + callp = cm_GetRxConn(connp); + code = RXAFS_GetVolumeStatus(callp, volp->rw.ID, + &volStat, &Name, &OfflineMsg, &MOTD); + rx_PutConnection(callp); + + } while (cm_Analyze(connp, cm_rootUserp, &req, NULL, NULL, NULL, NULL, code)); + code = cm_MapRPCError(code, &req); + + if (code == 0 && volStat.Online) { + cm_VolumeStatusNotification(volp, volp->rw.ID, volp->rw.state, vl_online); + volp->rw.state = vl_online; + } + } + + if (volp->ro.ID != 0 && (volp->ro.state == vl_busy || volp->ro.state == vl_offline)) { + cm_InitReq(&req); + + do { + code = cm_ConnFromVolume(volp, volp->ro.ID, cm_rootUserp, &req, &connp); + if (code) + continue; + + callp = cm_GetRxConn(connp); + code = RXAFS_GetVolumeStatus(callp, volp->ro.ID, + &volStat, &Name, &OfflineMsg, &MOTD); + rx_PutConnection(callp); + + } while (cm_Analyze(connp, cm_rootUserp, &req, NULL, NULL, NULL, NULL, code)); + code = cm_MapRPCError(code, &req); + + if (code == 0 && volStat.Online) { + cm_VolumeStatusNotification(volp, volp->ro.ID, volp->ro.state, vl_online); + volp->ro.state = vl_online; + } + } + + if (volp->bk.ID != 0 && (volp->bk.state == vl_busy || volp->bk.state == vl_offline)) { + cm_InitReq(&req); + + do { + code = cm_ConnFromVolume(volp, volp->bk.ID, cm_rootUserp, &req, &connp); + if (code) + continue; + + callp = cm_GetRxConn(connp); + code = RXAFS_GetVolumeStatus(callp, volp->bk.ID, + &volStat, &Name, &OfflineMsg, &MOTD); + rx_PutConnection(callp); + + } while (cm_Analyze(connp, cm_rootUserp, &req, NULL, NULL, NULL, NULL, code)); + code = cm_MapRPCError(code, &req); + + if (code == 0 && volStat.Online) { + cm_VolumeStatusNotification(volp, volp->bk.ID, volp->bk.state, vl_online); + volp->bk.state = vl_online; + } + } + + lock_ReleaseMutex(&volp->mx); + lock_ObtainWrite(&cm_volumeLock); + osi_assert(volp->refCount-- > 0); + } + lock_ReleaseWrite(&cm_volumeLock); +} + +void +cm_UpdateVolumeStatus(cm_volume_t *volp, afs_uint32 volID) +{ + struct 
cm_vol_state * statep = NULL; + enum volstatus newStatus; + cm_serverRef_t *tsrp; + cm_server_t *tsp; + int someBusy = 0, someOffline = 0, allOffline = 1, allBusy = 1, allDown = 1; + + if (volp->rw.ID == volID) { + statep = &volp->rw; + } else if (volp->ro.ID == volID) { + statep = &volp->ro; + } else if (volp->bk.ID == volID) { + statep = &volp->bk; + } + + if (!statep) { +#ifdef DEBUG + DebugBreak(); +#endif + return; + } + + lock_ObtainWrite(&cm_serverLock); + for (tsrp = statep->serversp; tsrp; tsrp=tsrp->next) { + tsp = tsrp->server; + cm_GetServerNoLock(tsp); + if (!(tsp->flags & CM_SERVERFLAG_DOWN)) { + allDown = 0; + if (tsrp->status == srv_busy) { + allOffline = 0; + someBusy = 1; + } else if (tsrp->status == srv_offline) { + allBusy = 0; + someOffline = 1; + } else { + allOffline = 0; + allBusy = 0; + } + } + cm_PutServerNoLock(tsp); + } + lock_ReleaseWrite(&cm_serverLock); + + if (allDown) + newStatus = vl_alldown; + else if (allBusy || (someBusy && someOffline)) + newStatus = vl_busy; + else if (allOffline) + newStatus = vl_offline; + else + newStatus = vl_online; + + + if (statep->ID && statep->state != newStatus) + cm_VolumeStatusNotification(volp, statep->ID, statep->state, newStatus); + + statep->state = newStatus; +} + /* ** Finds all volumes that reside on this server and reorders their ** RO list according to the changed rank of server. 
@@ -636,19 +1138,19 @@ void cm_ChangeRankVolume(cm_server_t *tsp) /* find volumes which might have RO copy on server*/ lock_ObtainWrite(&cm_volumeLock); - for(volp = cm_data.allVolumesp; volp; volp=volp->nextp) + for(volp = cm_data.allVolumesp; volp; volp=volp->allNextp) { code = 1 ; /* assume that list is unchanged */ volp->refCount++; lock_ReleaseWrite(&cm_volumeLock); lock_ObtainMutex(&volp->mx); - if ((tsp->cellp==volp->cellp) && (volp->roServersp)) - code =cm_ChangeRankServer(&volp->roServersp, tsp); + if ((tsp->cellp==volp->cellp) && (volp->ro.serversp)) + code =cm_ChangeRankServer(&volp->ro.serversp, tsp); /* this volume list was changed */ if ( !code ) - cm_RandomizeServer(&volp->roServersp); + cm_RandomizeServer(&volp->ro.serversp); lock_ReleaseMutex(&volp->mx); lock_ObtainWrite(&cm_volumeLock); @@ -657,7 +1159,7 @@ void cm_ChangeRankVolume(cm_server_t *tsp) lock_ReleaseWrite(&cm_volumeLock); } -/* dump all scp's that have reference count > 0 to a file. +/* dump all volumes that have reference count > 0 to a file. 
  * cookie is used to identify this batch for easy parsing,
  * and it a string provided by a caller
  */
@@ -675,21 +1177,21 @@ int cm_DumpVolumes(FILE *outputFile, char *cookie, int lock)
     sprintf(output, "%s - dumping volumes - cm_data.currentVolumes=%d, cm_data.maxVolumes=%d\r\n", cookie, cm_data.currentVolumes, cm_data.maxVolumes);
     WriteFile(outputFile, output, (DWORD)strlen(output), &zilch, NULL);
-    for (volp = cm_data.allVolumesp; volp; volp=volp->nextp)
+    for (volp = cm_data.allVolumesp; volp; volp=volp->allNextp)
     {
         if (volp->refCount != 0)
         {
             cm_scache_t *scp;
             int scprefs = 0;
-            for (scp = cm_data.scacheLRULastp; scp; scp = (cm_scache_t *) osi_QPrev(&scp->q))
+            for (scp = cm_data.allSCachesp; scp; scp = scp->allNextp)
             {
                 if (scp->volp == volp)
                     scprefs++;
             }
             sprintf(output, "%s cell=%s name=%s rwID=%u roID=%u bkID=%u flags=0x%x fid (cell=%d, volume=%d, vnode=%d, unique=%d) refCount=%u scpRefs=%u\r\n",
-                    cookie, volp->cellp->name, volp->namep, volp->rwID, volp->roID, volp->bkID, volp->flags,
+                    cookie, volp->cellp->name, volp->namep, volp->rw.ID, volp->ro.ID, volp->bk.ID, volp->flags,
                     volp->dotdotFid.cell, volp->dotdotFid.volume, volp->dotdotFid.vnode, volp->dotdotFid.unique, volp->refCount, scprefs);
             WriteFile(outputFile, output, (DWORD)strlen(output), &zilch, NULL);
@@ -706,3 +1208,229 @@ int cm_DumpVolumes(FILE *outputFile, char *cookie, int lock)
 }
+/*
+ * String hash function used by SDBM project.
+ * It was chosen because it is fast and provides
+ * decent coverage.
+ */
+afs_uint32 SDBMHash(const char * str)
+{
+    afs_uint32 hash = 0;
+    size_t i, len;
+
+    if (str == NULL)
+        return 0;
+
+    for(i = 0, len = strlen(str); i < len; i++)
+    {
+        hash = str[i] + (hash << 6) + (hash << 16) - hash;
+    }
+
+    return (hash & 0x7FFFFFFF);
+}
+
+/* call with volume write-locked and mutex held */
+void cm_AddVolumeToNameHashTable(cm_volume_t *volp)
+{
+    int i;
+
+    if (volp->flags & CM_VOLUMEFLAG_IN_HASH)
+        return;
+
+    i = CM_VOLUME_NAME_HASH(volp->namep);
+
+    volp->nameNextp = cm_data.volumeNameHashTablep[i];
+    cm_data.volumeNameHashTablep[i] = volp;
+    volp->flags |= CM_VOLUMEFLAG_IN_HASH;
+}
+
+/* call with volume write-locked and mutex held */
+void cm_RemoveVolumeFromNameHashTable(cm_volume_t *volp)
+{
+    cm_volume_t **lvolpp;
+    cm_volume_t *tvolp;
+    int i;
+
+    if (volp->flags & CM_VOLUMEFLAG_IN_HASH) {
+        /* hash it out first */
+        i = CM_VOLUME_NAME_HASH(volp->namep);
+        for (lvolpp = &cm_data.volumeNameHashTablep[i], tvolp = cm_data.volumeNameHashTablep[i];
+             tvolp;
+             lvolpp = &tvolp->nameNextp, tvolp = tvolp->nameNextp) {
+            if (tvolp == volp) {
+                *lvolpp = volp->nameNextp;
+                volp->flags &= ~CM_VOLUMEFLAG_IN_HASH;
+                volp->nameNextp = NULL;
+                break;
+            }
+        }
+    }
+}
+
+/* call with volume write-locked and mutex held */
+void cm_AddVolumeToIDHashTable(cm_volume_t *volp, afs_uint32 volType)
+{
+    int i;
+    struct cm_vol_state * statep;
+
+    switch (volType) {
+    case RWVOL:
+        statep = &volp->rw;
+        break;
+    case ROVOL:
+        statep = &volp->ro;
+        break;
+    case BACKVOL:
+        statep = &volp->bk;
+        break;
+    default:
+        return;
+    }
+
+    if (statep->flags & CM_VOLUMEFLAG_IN_HASH)
+        return;
+
+    i = CM_VOLUME_ID_HASH(statep->ID);
+
+    switch (volType) {
+    case RWVOL:
+        statep->nextp = cm_data.volumeRWIDHashTablep[i];
+        cm_data.volumeRWIDHashTablep[i] = volp;
+        break;
+    case ROVOL:
+        statep->nextp = cm_data.volumeROIDHashTablep[i];
+        cm_data.volumeROIDHashTablep[i] = volp;
+        break;
+    case BACKVOL:
+        statep->nextp = cm_data.volumeBKIDHashTablep[i];
+        cm_data.volumeBKIDHashTablep[i] = volp;
+        break;
+    }
+    statep->flags |= CM_VOLUMEFLAG_IN_HASH;
+}
+
+
+/* call with volume write-locked and mutex held */
+void cm_RemoveVolumeFromIDHashTable(cm_volume_t *volp, afs_uint32 volType)
+{
+    cm_volume_t **lvolpp;
+    cm_volume_t *tvolp;
+    struct cm_vol_state * statep;
+    int i;
+
+    switch (volType) {
+    case RWVOL:
+        statep = &volp->rw;
+        break;
+    case ROVOL:
+        statep = &volp->ro;
+        break;
+    case BACKVOL:
+        statep = &volp->bk;
+        break;
+    default:
+        return;
+    }
+
+    if (statep->flags & CM_VOLUMEFLAG_IN_HASH) {
+        /* hash it out first */
+        i = CM_VOLUME_ID_HASH(statep->ID);
+
+        switch (volType) {
+        case RWVOL:
+            lvolpp = &cm_data.volumeRWIDHashTablep[i];
+            tvolp = cm_data.volumeRWIDHashTablep[i];
+            break;
+        case ROVOL:
+            lvolpp = &cm_data.volumeROIDHashTablep[i];
+            tvolp = cm_data.volumeROIDHashTablep[i];
+            break;
+        case BACKVOL:
+            lvolpp = &cm_data.volumeBKIDHashTablep[i];
+            tvolp = cm_data.volumeBKIDHashTablep[i];
+            break;
+        }
+        do {
+            if (tvolp == volp) {
+                *lvolpp = statep->nextp;
+                statep->flags &= ~CM_VOLUMEFLAG_IN_HASH;
+                statep->nextp = NULL;
+                break;
+            }
+
+            switch (volType) {
+            case RWVOL:
+                lvolpp = &tvolp->rw.nextp;
+                tvolp = tvolp->rw.nextp;
+                break;
+            case ROVOL:
+                lvolpp = &tvolp->ro.nextp;
+                tvolp = tvolp->ro.nextp;
+                break;
+            case BACKVOL:
+                lvolpp = &tvolp->bk.nextp;
+                tvolp = tvolp->bk.nextp;
+                break;
+            }
+        } while(tvolp);
+    }
+}
+
+/* must be called with cm_volumeLock write-locked! */
+void cm_AdjustVolumeLRU(cm_volume_t *volp)
+{
+    if (volp == cm_data.volumeLRULastp)
+        cm_data.volumeLRULastp = (cm_volume_t *) osi_QPrev(&volp->q);
+    if (volp->flags & CM_VOLUMEFLAG_IN_LRU_QUEUE)
+        osi_QRemoveHT((osi_queue_t **) &cm_data.volumeLRUFirstp, (osi_queue_t **) &cm_data.volumeLRULastp, &volp->q);
+    osi_QAdd((osi_queue_t **) &cm_data.volumeLRUFirstp, &volp->q);
+    volp->flags |= CM_VOLUMEFLAG_IN_LRU_QUEUE;
+    if (!cm_data.volumeLRULastp)
+        cm_data.volumeLRULastp = volp;
+}
+
+/* must be called with cm_volumeLock write-locked! */
+void cm_RemoveVolumeFromLRU(cm_volume_t *volp)
+{
+    if (volp->flags & CM_VOLUMEFLAG_IN_LRU_QUEUE) {
+        if (volp == cm_data.volumeLRULastp)
+            cm_data.volumeLRULastp = (cm_volume_t *) osi_QPrev(&volp->q);
+        osi_QRemoveHT((osi_queue_t **) &cm_data.volumeLRUFirstp, (osi_queue_t **) &cm_data.volumeLRULastp, &volp->q);
+        volp->flags &= ~CM_VOLUMEFLAG_IN_LRU_QUEUE;
+    }
+}
+
+static char * volstatus_str(enum volstatus vs)
+{
+    switch (vs) {
+    case vl_online:
+        return "online";
+    case vl_busy:
+        return "busy";
+    case vl_offline:
+        return "offline";
+    case vl_alldown:
+        return "alldown";
+    default:
+        return "unknown";
+    }
+}
+
+void cm_VolumeStatusNotification(cm_volume_t * volp, afs_uint32 volID, enum volstatus old, enum volstatus new)
+{
+    char volstr[CELL_MAXNAMELEN + VL_MAXNAMELEN];
+    char *ext = "";
+
+    if (volID == volp->rw.ID)
+        ext = "";
+    else if (volID == volp->ro.ID)
+        ext = ".readonly";
+    else if (volID == volp->bk.ID)
+        ext = ".backup";
+    else
+        ext = ".nomatch";
+    snprintf(volstr, sizeof(volstr), "%s:%s%s", volp->cellp->name, volp->namep, ext);
+
+    osi_Log4(afsd_logp, "VolumeStatusNotification: %-48s [%10u] (%s -> %s)",
+             volstr, volID, volstatus_str(old), volstatus_str(new));
+}
diff --git a/src/WINNT/afsd/cm_volume.h b/src/WINNT/afsd/cm_volume.h
index 06cd1acef..1f89ddeab 100644
--- a/src/WINNT/afsd/cm_volume.h
+++ b/src/WINNT/afsd/cm_volume.h
@@ -14,32 +14,66 @@
 #define CM_VOLUME_MAGIC ('V' | 'O' <<8 | 'L'<<16 | 'M'<<24)
+enum volstatus {vl_online, vl_busy, vl_offline, vl_alldown, vl_unknown};
+
+struct cm_vol_state {
+    afs_uint32 ID;                      /* by mx */
+    struct cm_volume *nextp;            /* volumeIDHashTable; by cm_volumeLock */
+    cm_serverRef_t *serversp;           /* by mx */
+    enum volstatus state;               /* by mx */
+    afs_uint32 flags;                   /* by mx */
+};
+
 typedef struct cm_volume {
+    osi_queue_t q;                      /* LRU queue; cm_volumeLock */
     afs_uint32 magic;
+    struct cm_volume *allNextp;         /* allVolumes; by cm_volumeLock */
+    struct cm_volume *nameNextp;        /* volumeNameHashTable; by cm_volumeLock */
     cm_cell_t *cellp;                   /* never changes */
-    char namep[VL_MAXNAMELEN];          /* by cm_volumeLock */
-    unsigned long rwID;                 /* by cm_volumeLock */
-    unsigned long roID;                 /* by cm_volumeLock */
-    unsigned long bkID;                 /* by cm_volumeLock */
-    struct cm_volume *nextp;            /* by cm_volumeLock */
+    char namep[VL_MAXNAMELEN];          /* name of the normal volume - assigned during allocation; */
+                                        /* by cm_volumeLock */
+    struct cm_vol_state rw;             /* by cm_volumeLock */
+    struct cm_vol_state ro;             /* by cm_volumeLock */
+    struct cm_vol_state bk;             /* by cm_volumeLock */
     struct cm_fid dotdotFid;            /* parent of volume root */
     osi_mutex_t mx;
-    long flags;                         /* by mx */
-    unsigned long refCount;             /* by cm_volumeLock */
-    cm_serverRef_t *rwServersp;         /* by mx */
-    cm_serverRef_t *roServersp;         /* by mx */
-    cm_serverRef_t *bkServersp;         /* by mx */
+    afs_uint32 flags;                   /* by mx */
+    afs_uint32 refCount;                /* by cm_volumeLock */
 } cm_volume_t;
-#define CM_VOLUMEFLAG_RESET 1 /* reload this info on next use */
+#define CM_VOLUMEFLAG_RESET        1   /* reload this info on next use */
+#define CM_VOLUMEFLAG_IN_HASH      2
+#define CM_VOLUMEFLAG_IN_LRU_QUEUE 4
+
+
+typedef struct cm_volumeRef {
+    struct cm_volumeRef * next;
+    afs_uint32 volID;
+} cm_volumeRef_t;
 extern void cm_InitVolume(int newFile, long maxVols);
-extern long cm_GetVolumeByName(struct cm_cell *, char *, struct cm_user *,
-                               struct cm_req *, long, cm_volume_t **);
+extern long cm_GetVolumeByName(struct cm_cell *cellp, char *volNamep,
+                               struct cm_user *userp, struct cm_req *reqp,
+                               afs_uint32 flags, cm_volume_t **outVolpp);
+
+extern long cm_GetVolumeByID(struct cm_cell *cellp, afs_uint32 volumeID,
+                             cm_user_t *userp, cm_req_t *reqp,
+                             afs_uint32 flags, cm_volume_t **outVolpp);
-extern long cm_GetVolumeByID(struct cm_cell *cellp, long volumeID,
-                             cm_user_t *userp, cm_req_t *reqp, cm_volume_t **outVolpp);
+#define CM_GETVOL_FLAG_CREATE               1
+#define CM_GETVOL_FLAG_NO_LRU_UPDATE        2
+
+/* hash define.  Must not include the cell, since the callback revocation code
+ * doesn't necessarily know the cell in the case of a multihomed server
+ * contacting us from a mystery address.
+ */
+#define CM_VOLUME_ID_HASH(volid)   ((unsigned long) volid \
+                                        % cm_data.volumeHashTableSize)
+
+#define CM_VOLUME_NAME_HASH(name)  (SDBMHash(name) % cm_data.volumeHashTableSize)
+
+extern afs_uint32 SDBMHash(const char *);
 extern void cm_GetVolume(cm_volume_t *volp);
@@ -50,7 +84,7 @@ extern long cm_GetROVolumeID(cm_volume_t *volp);
 extern void cm_ForceUpdateVolume(struct cm_fid *fidp, cm_user_t *userp, cm_req_t *reqp);
-extern cm_serverRef_t **cm_GetVolServers(cm_volume_t *volp, unsigned long volume);
+extern cm_serverRef_t **cm_GetVolServers(cm_volume_t *volp, afs_uint32 volume);
 extern void cm_ChangeRankVolume(cm_server_t *tsp);
@@ -61,4 +95,25 @@ extern long cm_ValidateVolume(void);
 extern long cm_ShutdownVolume(void);
 extern int cm_DumpVolumes(FILE *outputFile, char *cookie, int lock);
+
+extern int cm_VolNameIsID(char *aname);
+
+extern void cm_RemoveVolumeFromNameHashTable(cm_volume_t * volp);
+
+extern void cm_RemoveVolumeFromIDHashTable(cm_volume_t * volp, afs_uint32 volType);
+
+extern void cm_AddVolumeToNameHashTable(cm_volume_t * volp);
+
+extern void cm_AddVolumeToIDHashTable(cm_volume_t * volp, afs_uint32 volType);
+
+extern void cm_AdjustVolumeLRU(cm_volume_t *volp);
+
+extern void cm_RemoveVolumeFromLRU(cm_volume_t *volp);
+
+extern void cm_CheckBusyVolumes(void);
+
+extern void cm_UpdateVolumeStatus(cm_volume_t *volp, afs_uint32 volID);
+
+extern void cm_VolumeStatusNotification(cm_volume_t * volp, afs_uint32 volID, enum volstatus old, enum volstatus new);
+
 #endif /* __CM_VOLUME_H_ENV__ */
diff --git a/src/WINNT/afsd/smb_ioctl.c b/src/WINNT/afsd/smb_ioctl.c
index 24bfda60e..abe3660cc 100644
--- a/src/WINNT/afsd/smb_ioctl.c
+++ b/src/WINNT/afsd/smb_ioctl.c
@@ -110,7 +110,7 @@ void smb_SetupIoctlFid(smb_fid_t *fidp, cm_space_t *prefix)
  * this is the first read call.  This is the function that actually makes the
  * call to the ioctl code.
  */
-smb_IoctlPrepareRead(smb_fid_t *fidp, smb_ioctl_t *ioctlp, cm_user_t *userp)
+long smb_IoctlPrepareRead(smb_fid_t *fidp, smb_ioctl_t *ioctlp, cm_user_t *userp)
 {
     long opcode;
     smb_ioctlProc_t *procp = NULL;
diff --git a/src/WINNT/afsd/smb_ioctl.h b/src/WINNT/afsd/smb_ioctl.h
index 3e7abdaca..ce7f3b1b9 100644
--- a/src/WINNT/afsd/smb_ioctl.h
+++ b/src/WINNT/afsd/smb_ioctl.h
@@ -37,4 +37,6 @@ extern long smb_IoctlV3Read(smb_fid_t *fidp, smb_vc_t *vcp, smb_packet_t *inp, s
 extern long smb_IoctlReadRaw(smb_fid_t *fidp, smb_vc_t *vcp, smb_packet_t *inp, smb_packet_t *outp);
+extern long smb_IoctlPrepareRead(smb_fid_t *fidp, smb_ioctl_t *ioctlp, cm_user_t *userp);
+
 #endif /* __SMB_IOCTL_H_ENV__ */
-- 
2.39.5
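Note (not part of the patch): the name-hash bookkeeping added to cm_volume.c above can be tried outside afsd with a small stand-alone sketch. It reuses the SDBM recurrence from SDBMHash() and the head-insert and pointer-to-pointer unlink pattern of cm_AddVolumeToNameHashTable() and cm_RemoveVolumeFromNameHashTable(). Everything else is assumed for illustration only: vol_t, TABLE_SIZE, sdbm_hash(), table_add(), table_remove() and table_count() are hypothetical names, uint32_t stands in for afs_uint32, and no locking is modeled, whereas the real functions expect the volume write-locked and the mutex held.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bucket count; stands in for cm_data.volumeHashTableSize. */
#define TABLE_SIZE 8u

/* Hypothetical stand-in for cm_volume_t: name, chain link, membership flag. */
typedef struct vol {
    const char *name;
    struct vol *name_next;   /* plays the role of nameNextp */
    unsigned in_hash;        /* plays the role of CM_VOLUMEFLAG_IN_HASH */
} vol_t;

/* Same recurrence as SDBMHash() in the patch, masked to 31 bits. */
uint32_t sdbm_hash(const char *str)
{
    uint32_t hash = 0;

    if (str == NULL)
        return 0;
    for (; *str != '\0'; str++)
        hash = (uint32_t)*str + (hash << 6) + (hash << 16) - hash;
    return hash & 0x7FFFFFFF;
}

/* Push v onto the head of its bucket; a second call is a no-op. */
void table_add(vol_t **table, vol_t *v)
{
    uint32_t i;

    assert(table != NULL && v != NULL);
    if (v->in_hash)
        return;
    i = sdbm_hash(v->name) % TABLE_SIZE;
    v->name_next = table[i];
    table[i] = v;
    v->in_hash = 1;
}

/* Unlink v by walking the links themselves (pointer to pointer),
 * so the bucket head and interior nodes need no separate cases. */
void table_remove(vol_t **table, vol_t *v)
{
    vol_t **linkp;

    assert(table != NULL && v != NULL);
    if (!v->in_hash)
        return;
    for (linkp = &table[sdbm_hash(v->name) % TABLE_SIZE];
         *linkp != NULL;
         linkp = &(*linkp)->name_next) {
        if (*linkp == v) {
            *linkp = v->name_next;
            v->name_next = NULL;
            v->in_hash = 0;
            return;
        }
    }
}

/* Count every entry reachable from the table (for checking only). */
size_t table_count(vol_t *const *table)
{
    size_t n = 0;
    size_t i;
    const vol_t *v;

    for (i = 0; i < TABLE_SIZE; i++)
        for (v = table[i]; v != NULL; v = v->name_next)
            n++;
    return n;
}
```

In this sketch a caller would zero-initialize an array of TABLE_SIZE bucket heads, call table_add() once per volume, and call table_remove() before the name changes, since the bucket index is recomputed from the name at removal time just as the patch recomputes it from volp->namep.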