Example 1 - Server Down
After logging into my main server, I check that the cluster is running OK.
If OEM is running, log in and click on <Cluster> (top right tab). The Hosts item (2nd in the list) shows 1/1, indicating that one is down. Click on the 2 (2 nodes in my setup) and the next page shows which node is unavailable.
If OEM isn't running (and even if it is), it's quicker to start a terminal session and run crsctl.
This gives a general view of things.
grid_env
crsctl check crs
CRS-4638: Oracle High Availability Services is online
CRS-4537: Cluster Ready Services is online
CRS-4529: Cluster Synchronization Services is online
CRS-4533: Event Manager is online
But it isn't very helpful here, because it only checks the stack on the local node.
crsctl status resource -t
--------------------------------------------------------------------------------
NAME                      TARGET   STATE         SERVER        STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.DATA.dg               ONLINE   ONLINE        ol5-112-rac1
ora.LISTENER.lsnr         ONLINE   ONLINE        ol5-112-rac1
ora.asm                   ONLINE   ONLINE        ol5-112-rac1  Started
ora.eons                  ONLINE   ONLINE        ol5-112-rac1
ora.gsd                   OFFLINE  OFFLINE       ol5-112-rac1
ora.net1.network          ONLINE   ONLINE        ol5-112-rac1
ora.ons                   ONLINE   ONLINE        ol5-112-rac1
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr 1 ONLINE   ONLINE        ol5-112-rac1
ora.oc4j                1 OFFLINE  OFFLINE
ora.ol5-112-rac1.vip    1 ONLINE   ONLINE        ol5-112-rac1
ora.ol5-112-rac2.vip    1 ONLINE   INTERMEDIATE  ol5-112-rac1  FAILED OVER
ora.orcl.db             1 ONLINE   ONLINE        ol5-112-rac1  Open
                        2 ONLINE   OFFLINE
ora.scan1.vip           1 ONLINE   ONLINE        ol5-112-rac1
I've jiggled the output slightly to make it more readable.
GSD and OC4J are normally down in this installation, so no worries there.
We can also see that instance 2 of ora.orcl.db (orcl is my database) is offline, and the rac2 VIP has failed over to rac1, indicating that the rac2 server is unavailable.
You can also use crs_stat, but the output is less friendly and the failover is less obvious.
So the cluster is running and access to the database is possible, but the high availability provided by the 2nd instance is gone. I restart the other server and the status is corrected.
ora.orcl.db             1 ONLINE   ONLINE        ol5-112-rac1  Open
                        2 ONLINE   ONLINE        ol5-112-rac2  Open
ora.scan1.vip           1 ONLINE   ONLINE        ol5-112-rac1
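Reading the resource listing by eye is fine on a 2 node setup, but the same check (a resource whose TARGET is ONLINE while its STATE is something else) can be roughly scripted. This is only a sketch: flag_problem_resources is a name I made up, it assumes TARGET and STATE are the first two status words next to each other on a line, and the UNKNOWN state it allows for doesn't appear in my output above.

```shell
#!/bin/sh
# Hypothetical helper: reads "crsctl status resource -t" output on stdin and
# prints every resource whose TARGET is ONLINE but whose STATE is not ONLINE.
flag_problem_resources() {
    awk '
        # Remember the most recent resource name (a first field such as ora.orcl.db).
        $1 ~ /^ora\./ { name = $1 }
        {
            # TARGET and STATE are taken to be the first two adjacent status words.
            for (i = 1; i < NF; i++) {
                if ($i ~ /^(ONLINE|OFFLINE)$/ &&
                    $(i + 1) ~ /^(ONLINE|OFFLINE|INTERMEDIATE|UNKNOWN)$/) {
                    if ($i == "ONLINE" && $(i + 1) != "ONLINE")
                        print name ": target " $i ", state " $(i + 1)
                    break
                }
            }
        }
    '
}

# Usage on a cluster node (after setting the grid environment):
#   crsctl status resource -t | flag_problem_resources
```

Resources that are meant to be down (GSD and OC4J here) have a TARGET of OFFLINE, so they are not reported.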
<u>Clusterware ALERT Log (alert<nodename>.log)</u>
RAC has its own alert log, and there are logs for everything else, as you would expect. These are in $GRID_HOME/log/<nodename> (or whatever you call your Grid home).
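To save typing the node name twice, the path can be built from the two pieces mentioned above. A small sketch only: alert_log_path is my own helper name, and the usage line assumes GRID_HOME is set in your environment and that the node name matches the short hostname.

```shell
#!/bin/sh
# Build the Clusterware alert log path from the layout described above:
#   <grid home>/log/<nodename>/alert<nodename>.log
# alert_log_path is a hypothetical helper name, not an Oracle tool.
alert_log_path() {
    grid_home=$1
    node=$2
    printf '%s/log/%s/alert%s.log\n' "$grid_home" "$node" "$node"
}

# Usage (assumes GRID_HOME is set and the node name is the short hostname):
#   tail -n 50 "$(alert_log_path "$GRID_HOME" "$(hostname -s)")"
```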
<u>crsctl options</u>
crsctl check crs - checks the viability of the CRS stack
crsctl check cssd - checks the viability of CSS
crsctl check crsd - checks the viability of CRS
crsctl check evmd - checks the viability of EVM
crsctl set css <parm> <value> - sets a parameter override
crsctl get css <parm> - gets the value of a CSS parameter
crsctl unset css <parm> - sets CSS parameter to its default
crsctl query css votedisk - lists the voting disks used by CSS
crsctl add css votedisk <path> - adds a new voting disk
crsctl delete css votedisk <path> - removes a voting disk
crsctl enable crs - enables startup for all CRS daemons
crsctl disable crs - disables startup for all CRS daemons
crsctl start crs - starts all CRS daemons
crsctl stop crs - stops all CRS daemons
crsctl start resources - starts CRS resources
crsctl stop resources - stops CRS resources
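The four check commands at the top of the list can be run in one go. This is a rough sketch, with run_crs_checks and the CRSCTL override being names of my own; the component names are taken straight from the list above, and whether your Clusterware version still accepts all of them is something to verify.

```shell
#!/bin/sh
# Run the four "crsctl check" commands from the list above, one after another.
# CRSCTL can be overridden (for example CRSCTL=echo) for a dry run on a machine
# without Clusterware. run_crs_checks is a hypothetical helper name.
run_crs_checks() {
    for component in crs cssd crsd evmd; do
        echo "== crsctl check $component"
        ${CRSCTL:-crsctl} check "$component"
    done
}

# Usage on a cluster node (after setting the grid environment):
#   run_crs_checks
```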
To be continued...
Happyjohn