2013-03-17 05:25:08.821: [ CSSD][1]clssgmCleanupNodeContexts(): successful cleanup of nodes rcfg(286)
2013-03-17 05:25:09.724: [ CSSD][56]clssnmDeactivateNode: node 04, state 5
2013-03-17 05:25:09.724: [ CSSD][56]clssnmDeactivateNode: node 04 (node04) left cluster
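Entries like the clssnmDeactivateNode lines above can be picked out of a CSSD log with a small script. The following is a minimal sketch in Python, assuming the log lines follow exactly the layout shown in this excerpt; the function name find_evictions and the regular expression are illustrative and are not part of any Oracle tooling.

```python
import re

# Pattern modelled only on the "left cluster" line in the excerpt above;
# other Clusterware versions may format this message differently.
LEFT_CLUSTER = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}): "
    r"\[\s*CSSD\]\[\d+\]clssnmDeactivateNode: "
    r"node (?P<num>\d+) \((?P<name>[^)]+)\) left cluster"
)

def find_evictions(lines):
    """Return (timestamp, node number, node name) for each 'left cluster' line."""
    events = []
    for line in lines:
        match = LEFT_CLUSTER.match(line.strip())
        if match:
            events.append((match.group("ts"), match.group("num"), match.group("name")))
    return events

# The two clssnmDeactivateNode lines from the excerpt above.
sample = [
    "2013-03-17 05:25:09.724: [ CSSD][56]clssnmDeactivateNode: node 04, state 5",
    "2013-03-17 05:25:09.724: [ CSSD][56]clssnmDeactivateNode: node 04 (node04) left cluster",
]
print(find_evictions(sample))
```

Only the second sample line matches, because the first reports a state change rather than the node leaving the cluster.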
Node Evictions—Top/Common Causes and Factors
The following are some of the most common symptoms and factors that lead to node evictions, sudden death of the
cluster stack, reboots, and an unhealthy cluster status:
Network disruption, latency, or missing network heartbeats
Delayed or missing disk heartbeats
Corrupted network packets, which may also cause CSS reboots on certain platforms
A slow interconnect or interconnect failures
Known Oracle Clusterware bugs
Inability to read, write, or access the majority of the voting disks (files)
Insufficient resources (CPU/memory starvation) on the node for the OS to schedule key CRS daemon processes
Manual termination of the critical cluster stack daemon background processes (css, cssdagent, cssdmonitor)
No space left on the device for the GI or /var file system
Sudden death or hang of the CSSD process
Excessive resource (CPU, memory, swap) consumption by ORAAGENT/ORAROOTAGENT, resulting in node eviction on specific OS platforms
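One of the causes listed above, a full GI or /var file system, is easy to check ahead of time. The sketch below uses Python's standard shutil.disk_usage; the function name low_space and the 10 percent threshold are assumptions chosen for illustration, not values recommended by Oracle.

```python
import shutil

def low_space(paths, min_free_fraction=0.10):
    """Return {path: free_fraction} for paths whose free space is below the threshold.

    min_free_fraction is an illustrative default, not an Oracle recommendation.
    """
    flagged = {}
    for path in paths:
        usage = shutil.disk_usage(path)
        # A file system that reports zero total size is treated as having no free space.
        free_fraction = usage.free / usage.total if usage.total else 0.0
        if free_fraction < min_free_fraction:
            flagged[path] = round(free_fraction, 3)
    return flagged
```

On a cluster node this might be called as low_space(["/var", gi_home]), where gi_home is the Grid Infrastructure home directory on that system.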
Gather Crucial Information
Consult the following trace and log files to gather the information needed to diagnose the real symptoms of a node
eviction:
alert.log: To determine which process actually caused the reboot, refer to the cluster alert.log
under the $GI_HOME/log/nodename location. The alert log provides first-hand information
for debugging the root cause of the issue. Pay close attention to the component named in the
messages. If the component shows cssmonit or cssdagent, then the node was
evicted because resources were unavailable for OS scheduling: either the CPU ran at 100%
for a long period, or too much swapping/paging took place due to insufficient
memory. If it shows cssagent, then the eviction could be due to network issues.
ocssd.log: If the node eviction happens due to network failure or latency, or voting disk issues,
refer to the ocssd.log file under the $GI_HOME/log/nodename/cssd location.
cssmonit/cssagent_nodename.lgl: Depending on the OS you are on, located in either /etc/oracle/lastgasp
or /var/adm/oracle/lastgasp.
oracssdmonitor/oracssdagent_root: Located under $GI_HOME/log/nodename/agent/ohasd.
In addition to the preceding, refer to OS-specific logs
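The locations and the component-based triage described above can be summarized in a short Python sketch. The function names candidate_logs and likely_cause are hypothetical helpers, the component spellings follow the text above, and the example path /u01/app/grid is only a placeholder for $GI_HOME. The sketch builds path strings and does not check that anything exists on disk.

```python
import posixpath

def candidate_logs(gi_home, nodename):
    """Build the directories named in the list above (existence is not checked)."""
    base = posixpath.join(gi_home, "log", nodename)
    return [
        base,                                    # cluster alert log
        posixpath.join(base, "cssd"),            # CSSD log
        posixpath.join(base, "agent", "ohasd"),  # oracssdmonitor/oracssdagent_root logs
        "/etc/oracle/lastgasp",                  # lastgasp location on some platforms
        "/var/adm/oracle/lastgasp",              # lastgasp location on other platforms
    ]

def likely_cause(component):
    """Map an alert log component name to the cause suggested in the text above."""
    name = component.strip().lower()
    if name in ("cssmonit", "cssdagent"):
        return "resource starvation (CPU or memory) affecting OS scheduling"
    if name == "cssagent":
        return "possible network issue"
    return "not covered by the guidance above"

# Placeholder GI home and node name, for illustration only.
for path in candidate_logs("/u01/app/grid", "node04"):
    print(path)
print(likely_cause("cssdagent"))
```

The mapping is a first-pass hint only; the alert log and CSSD log entries around the eviction time still need to be read in full.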
 