Monday, February 23, 2015

RMAN-20011: target database incarnation is not current in recovery catalog

Change NID for Restored Database (dbid)

Problem:
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of backup command at 02/23/2015 08:42:50
RMAN-03014: implicit resync of recovery catalog failed
RMAN-06004: ORACLE error from recovery catalog database: RMAN-20011: target database incarnation is not current in recovery catalog

Solution:
. oraenv
shutdown immediate
startup mount

nid target=sys as sysdba
supply password

shutdown immediate
startup mount
alter database open resetlogs;

rman target / catalog user/password@catdb
register database;
perform a full backup plus archivelog backup



Happyjohn

Friday, February 20, 2015

Oracle RAC - crs_stat/crsctl improvements

I can heartily recommend the following link. In fact I wish I’d written it myself. It is a work in progress I guess, but its great. I vastly improves the look of the crsctl stat res –t /crs_stat –t output results.

http://blog.enkitec.com/2011/10/my-crsstat-script-improved-formatting-of-crs_stat-on-10g-and-11g/

 

 

happyjohn

 

Thursday, February 19, 2015

RAC Tips (1)

This is a simple thing but one I like. It slightly reformats the output of crsctl into a readable state.

alias crsstat_jcdm="crsctl stat res -t | sed 's/^      /                          /'| sed 's/TARGET/                    TARGET/'"

All it does is move the detail lines across a bit so they don't overlap with the NAME above.


For example

ora.LISTENER.lnsr
             ONLINE ONLINE ol5-112-rac1
             ONLINE ONLINE ol5-112-rac2

becomes

ora.LISTENER.lsnr
                                 ONLINE ONLINE ol5-112-rac1
                                 ONLINE ONLINE ol5-112-rac2


Its nothing in the great scope of things, but in the heat of the moment, it can make a difference.


Happyjohn






RAC - Error Diagnosis

Checking your RAC isnt difficult but there are so many commands.

Example 1 - Server Down

Logging into my main server I check that the cluster is running ok.

If OEM is running, log in and click on <Cluster> (top right tab) and the Hosts item (2nd in the list) shows 1/1 indicating that one is down. Click on the 2 (2 nodes in my setup) and you can see on the next page which node is unavailable.

If OEM isnt running (and even if it is), its quicker to start a terminal session and run crsctl.

This gives an overall general view of things.

     grid_env
     crsctl check crs

     CRS-4638: Oracle High Availability Services is online
     CRS-4537: Cluster Ready Services is online
     CRS-4529: Cluster Synchronization Services is online
     CRS-4533: Event Manager is online

but it isnt helpful.

     crsctl status resource -t
     --------------------------------------------------------------------------------
     NAME TARGET STATE SERVER STATE_DETAILS
     --------------------------------------------------------------------------------
     Local Resources
--------------------------------------------------------------------------------
ora.DATA.dg ONLINE ONLINE ol5-112-rac1
ora.LISTENER.lsnr ONLINE ONLINE ol5-112-rac1
ora.asm ONLINE ONLINE ol5-112-rac1 Started
ora.eons ONLINE ONLINE ol5-112-rac1
ora.gsd OFFLINE OFFLINE ol5-112-rac1
ora.net1.network ONLINE ONLINE ol5-112-rac1
ora.ons ONLINE ONLINE ol5-112-rac1
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr 1 ONLINE ONLINE ol5-112-rac1
ora.oc4j 1 OFFLINE OFFLINE
ora.ol5-112-rac1.vip 1 ONLINE ONLINE ol5-112-rac1
ora.ol5-112-rac2.vip 1 ONLINE INTERMEDIATE ol5-112-rac1 FAILED OVER
ora.orcl.db 1 ONLINE ONLINE ol5-112-rac1 Open
2 ONLINE OFFLINE
ora.scan1.vip 1 ONLINE ONLINE ol5-112-rac1

I've jiggled the output slightly to make it more readable.

GSD and OC4J are normally down in this installation so no worries there.

We can also see that ora.orcl.db (orcl is my database) 2 is offline, and rac2 vip is failed over to rac1 indicating that the server is unavailable.

You can also use crs_stat but the output is less friendly and the FAILOVER is obvious.

So the cluster is running and access to the database is possible but the high availability option of the 2nd instance is gone. I restart the other server and the status is corrected.

ora.ol5-112-rac1.vip 1 ONLINE ONLINE ol5-112-rac1
ora.ol5-112-rac2.vip 1 ONLINE ONLINE ol5-112-rac2
ora.orcl.db 1 ONLINE ONLINE ol5-112-rac1 Open
2 ONLINE ONLINE ol5-112-rac2 Open
ora.scan1.vip 1 ONLINE ONLINE ol5-112-rac1





<u>Clusterware ALERT Log (alert<nodename>.log)</u>



RAC has got an alert log. Its has logs for everything as you would expect. These are in $GRID_HOME (or whatever you call it)/log/<nodename>





<u>crsctl options</u>



crsctl check crs - checks the viability of the CRS stack

crsctl check cssd - checks the viability of CSS

crsctl check crsd - checks the viability of CRS

crsctl check evmd - checks the viability of EVM

crsctl set css <parm> <value> - sets a parameter override

crsctl get css <parm> - gets the value of a CSS parameter

crsctl unset css <parm> - sets CSS parameter to its default

crsctl query css votedisk - lists the voting disks used by CSS

crsctl add css votedisk <path> - adds a new voting disk

crsctl delete css votedisk <path> - removes a voting disk

crsctl enable crs - enables startup for all CRS daemons

crsctl disable crs - disables startup for all CRS daemons

crsctl start crs - starts all CRS daemons

crsctl stop crs - stops all CRS daemons

crsctl start resources - starts CRS resources

crsctl stop resources - stops CRS resources



to be continued......



Happyjohn

Friday, February 13, 2015

Virtualbox Networking (1) - NAT

I had a bunch of problems trying to understand the networking options on VIRTUALBOX. There is lots written about it, but very little to say... do this...

So when you what to get on the network to use Firefox....

1. Create a NAT Network. Before starting the VM, goto Settings, Network, then enable one of the Adaptors and attached to NAT.
2. Start the VM
3. Log in as root
4. goto System > Administration > Network
5. Create or edit one of the Devices eth0, eth1 etc. Ethernet, the eth0 etc should match the adaptor you chose in step 1
6. Choose DCHP. This is the key. Don't try to manually set any of the ip values.

This should be all you need to piggy back your PCs internet.

Happyjohn

Wednesday, February 4, 2015

Oracle Data Guard Troubleshooting

Just a couple of SQL scripts that help with Data Guard problems.
This first is for the Primary and shows details of the current logfiles.

SELECT
 (SELECT name FROM V$DATABASE
 ) name,
 (SELECT MAX (sequence#) FROM v$archived_log WHERE dest_id = 1
 ) Current_primary_seq,
 (SELECT MAX (sequence#)
 FROM v$archived_log
 WHERE TRUNC(next_time) > SYSDATE - 1
 AND dest_id = 2
 ) max_stby,
 (SELECT NVL (
 (SELECT MAX (sequence#) - MIN (sequence#)
 FROM v$archived_log
 WHERE TRUNC(next_time) > SYSDATE - 1
 AND dest_id = 2
 AND applied = 'NO'
 ), 0)
 FROM DUAL
 ) "To be applied",
 (
 (SELECT MAX (sequence#) FROM v$archived_log WHERE dest_id = 1
 ) -
 (SELECT MAX (sequence#) FROM v$archived_log WHERE dest_id = 2
 )) "To be Shipped"
FROM DUAL;

Next is for the Standby database and shows apply progress.

SELECT 'Last Applied : ' Logs,
TO_CHAR(next_time,'DD-MON-YY:HH24:MI:SS') TIME,thread#,sequence#
FROM v$archived_log
WHERE sequence# =
(SELECT MAX(sequence#) FROM v$archived_log WHERE applied='YES'
)
UNION
SELECT 'Last Received : ' Logs,
TO_CHAR(next_time,'DD-MON-YY:HH24:MI:SS') TIME,thread#,sequence#
FROM v$archived_log
WHERE sequence# =
(SELECT MAX(sequence#) FROM v$archived_log )
/

And lastly one to tell you if there are errors. I used rman duplicate to create a standby which of course didn't work. When I ran this
I immediately saw that I had an error on the LOG_ARCHIVE_DEST_2 parameter. Typo of course.

col archive_dest for a30
set lines 132
SELECT DEST_ID "ID",
STATUS "DB_status",
DESTINATION "Archive_dest",
ERROR "Error"
FROM V$ARCHIVE_DEST WHERE DEST_ID <=5;

This is a good guide too.



Happyjohn.

Monday, February 2, 2015

RMAN - Recovering a data file loss

If you lose a data file, it will be reported. To recover it try these steps.

rman target / catalog rman/rman@rmandb
list failure;
(this displays details of the problem unless you've just deleted the file yourself when it might not yet have been reported)

advise failure;
(this is really useful because not only does it give you possible strategies for recovery, but it creates the script to)

In the test I did, I removed a datafile manually (users01.dbf). Then I tried to create a table in that tablespace which resulted in an error.
Then turning to RMAN, list failure showed 'one or more datafiles missing'. Advise failure told me to restore and recover datafile 6.

So I did this in rman

sql 'alter database datafile 6 offline';
recover datafile 6;
restore datafile 6;
sql 'alter database datafile 6 online';

But that's exactly what the 'hm' script, produced by advise failure, did. So just run it.


Happyjohn