2014-01-15 00:21:18.453: [ CSSD][1109969216]clssnmvDiskVerify: Successful discovery for disk /dev/oracleasm/disks/SAVOL1, UID 8cccacc5-d4eb4ff7-bf541125-3ef008ae, Pending CIN 0:1389748088:0, Committed CIN 0:1389748088:0
2014-01-15 00:21:18.453: [ CSSD][1109969216]clssnmvDiskVerify: Pending CIN of the potential voting file for CIN 0:1389748088:0
2014-01-15 00:21:18.465: [ CSSD][1109969216] misscount 30 reboot latency 3
2014-01-15 00:21:18.465: [ CSSD][1109969216] long I/O timeout 200 short I/O timeout 27
2014-01-15 00:21:18.465: [ CSSD][1109969216] rim hub timeout 30 grace period 0
2014-01-15 00:21:18.465: [ CSSD][1109969216] hub size 32 active version 12.1.0.1.0
2014-01-15 00:21:18.465: [ CSSD][1109969216] Listing unique IDs for 1 voting files:
2014-01-15 00:21:18.465: [ CSSD][1109969216] voting file 1: 8cccacc5-d4eb4ff7-bf541125-3ef008ae
2014-01-15 00:21:18.465: [ CSSD][1109969216]clssnmvDiskVerify: Committed CIN of the potential voting file for CIN 0:1389748088:0
2014-01-15 00:21:18.465: [ CSSD][1109969216] misscount 30 reboot latency 3
2014-01-15 00:21:18.465: [ CSSD][1109969216] long I/O timeout 200 short I/O timeout 27
2014-01-15 00:21:18.465: [ CSSD][1109969216] rim hub timeout 30 grace period 0
2014-01-15 00:21:18.465: [ CSSD][1109969216] hub size 32 active version 12.1.0.1.0
2014-01-15 00:21:18.465: [ CSSD][1109969216] Listing unique IDs for 1 voting files:
2014-01-15 00:21:18.465: [ CSSD][1109969216] voting file 1: 8cccacc5-d4eb4ff7-bf541125-3ef008ae
2014-01-15 00:21:18.465: [ CSSD][1109969216]clssnmvCloseDiskHandle: Closing handle (0x25068b0)
2014-01-15 00:21:18.465: [ SKGFD][1109969216]Lib :UFS:: closing handle 0x2506c50 for disk :/dev/oracleasm/disks/SAVOL1:
2014-01-15 00:21:18.465: [ CSSD][1109969216]clssnmvDiskVerify: Successful discovery of 1 disks
2014-01-15 00:21:18.466: [ CSSD][1109969216]clssnmvDiskVerify: exit
The preceding output shows several important parameters. Misscount is the maximum time, in seconds, that the
network heartbeat (NHB) can be missed before the clusterware decides to evict the member from the cluster. If the
heartbeat is restored before the 30-second value is reached, the member continues to be part of the cluster.
Similar to the misscount value used by the NHB mechanism, the disk heartbeat (DHB) mechanism uses the long I/O
timeout to determine the health of the storage subsystem. If an I/O to a specific voting file cannot complete within
200 seconds, the voting file is considered unhealthy and is taken offline. The clusterware references each voting
file by a unique ID. This ID can also be obtained by querying the voting disk with the crsctl command-line utility
(crsctl query css votedisk).
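The timeout values discussed above can be pulled out of a CSSD trace with a few lines of Python. This is a minimal sketch, not an Oracle-provided tool: the function name parse_css_params and the list of parameter names are assumptions made for illustration, and the sample text simply repeats two lines from the trace shown earlier.

```python
import re

# Parameter names as they appear in the CSSD trace excerpt.
# The list and the parsing approach are illustrative assumptions.
PARAM_NAMES = [
    "misscount",
    "reboot latency",
    "long I/O timeout",
    "short I/O timeout",
]

def parse_css_params(log_text):
    """Return a dict mapping each known parameter name to its integer value."""
    params = {}
    for name in PARAM_NAMES:
        # Match the literal name followed by whitespace and a number.
        match = re.search(re.escape(name) + r"\s+(\d+)", log_text)
        if match:
            params[name] = int(match.group(1))
    return params

sample = (
    "[ CSSD][1109969216] misscount 30 reboot latency 3\n"
    "[ CSSD][1109969216] long I/O timeout 200 short I/O timeout 27\n"
)
params = parse_css_params(sample)
print(params)
```

Only the first occurrence of each parameter is read, which is enough here because the trace repeats the same values for the pending and committed CIN.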
By increasing the logging level of the CSSD process, the details of the NHB activity can be tracked. For
example, as seen in the following output, the first failure to connect to remote node 2 happened at 22:44:05.427.
The biggest misstime recorded in the cssd.log file at that point was 1980 milliseconds (1.98 seconds). The NHB
continues between the nodes every second, and at 22:44:33.659 the biggest misstime was 29990 milliseconds
(29.99 seconds). If a successful heartbeat had been received from node 2 at this point, the node would have
remained in the cluster. Unfortunately, in this specific example the node is evicted from the cluster.
2014-01-15 22:44:05.427: [ CSSD][1090533696]clssgmConnectToNode: Failed to connect to remote node(2)
2014-01-15 22:44:05.427: [ CSSD][1090533696]clssgmPeerListener: connected to 1 of 2
2014-01-15 22:44:05.635: [ CSSD][1115601216]clssnmWaitThread: thrd(1), timeout(1000), elapsed 1000
2014-01-15 22:44:05.635: [ CSSD][1115601216]clssscAllocAsyncMsg: msg(0x7f142c0aceb8), len(124), asqhd(0x7f142c0ace90), flags(0x083)
2014-01-15 22:44:05.635: [ CSSD][1115601216]clssnmsendmsg: sending msg type 3 size 124 to node 2 endp 0x7f140007525e
2014-01-15 22:44:05.635: [ CSSD][1115601216]clssnmSendGIPC: cookie 0x7f1434027340 - endp 0x7525e type 3 size 124 dst 0
2014-01-15 22:44:05.635: [ CSSD][1115601216]clssnmsendmsg: msg type 3 sent to node 2
2014-01-15 22:44:05.635: [ CSSD][1115601216]clssnmHBInfo: css timestmp 1389843845 635 slgtime 8312654 DTO 27790 (index=0) biggest misstime 1980 NTO 26520
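
To follow how close a node is to eviction, the "biggest misstime" value in each clssnmHBInfo line can be compared with the misscount. The following Python sketch does this under a simplifying assumption taken from the text: that the remaining budget is the misscount (30 seconds) minus the biggest misstime. The function names are hypothetical, and the arithmetic is not a description of the actual CSSD algorithm.

```python
import re

MISSCOUNT_SECONDS = 30  # misscount value reported in the CSSD trace

def biggest_misstime_ms(line):
    """Extract the 'biggest misstime' value (milliseconds) from a trace line."""
    match = re.search(r"biggest misstime (\d+)", line)
    return int(match.group(1)) if match else None

def remaining_ms(misstime_ms, misscount_s=MISSCOUNT_SECONDS):
    """Milliseconds left before the misscount budget is used up (never negative)."""
    return max(misscount_s * 1000 - misstime_ms, 0)

line = (
    "clssnmHBInfo: css timestmp 1389843845 635 slgtime 8312654 "
    "DTO 27790 (index=0) biggest misstime 1980 NTO 26520"
)
misstime = biggest_misstime_ms(line)
print(misstime, remaining_ms(misstime))
```

With the values quoted in the text, a misstime of 1980 milliseconds leaves most of the 30-second budget, while 29990 milliseconds leaves only 10 milliseconds.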
 