Thursday, September 17, 2015

RMAN BACKUP AS COPY failing with ORA-00600 [kfioTranslateIO03] [17090]

Recently, I was migrating a standalone database, 'rnwphoto', from the file system to ASM. I decided to go with the BACKUP AS COPY strategy to save on downtime. But I was stuck for a long time: every attempt to make the copy failed.

Let's get to the background information

Database Standalone Home: /u01/app/oracle/product/11.2.0/dbhome_1 --> This is the Oracle Home used by rnwphoto database.

To convert the database to ASM-based RAC, we installed Grid Infrastructure (GI) in the GRID_HOME (/u01/app/11.2.0/grid) and a new RAC DB Home (/u01/app/oracle/product/11.2.0/dbhome_2).

Issue:

RMAN> BACKUP AS COPY DATABASE FORMAT '+DATA_DG'; 

Starting backup at 09-SEP-15 
allocated channel: ORA_DISK_1 
channel ORA_DISK_1: SID=8 device type=DISK 
allocated channel: ORA_DISK_2 
channel ORA_DISK_2: SID=69 device type=DISK 
allocated channel: ORA_DISK_3 
channel ORA_DISK_3: SID=132 device type=DISK 
allocated channel: ORA_DISK_4 
channel ORA_DISK_4: SID=72 device type=DISK 
channel ORA_DISK_1: starting datafile copy 
input datafile file number=00001 name=/u01/app/oracle/rnwphoto/system01.dbf 
channel ORA_DISK_2: starting datafile copy 
input datafile file number=00002 name=/u01/app/oracle/rnwphoto/sysaux01.dbf 
channel ORA_DISK_3: starting datafile copy 
input datafile file number=00005 name=/u01/app/oracle/rnwphoto/example01.dbf 
channel ORA_DISK_4: starting datafile copy 
input datafile file number=00003 name=/u01/app/oracle/rnwphoto/undotbs01.dbf 
RMAN-03009: failure of backup command on ORA_DISK_1 channel at 09/09/2015 11:12:22 
RMAN-10038: database session for channel ORA_DISK_1 terminated unexpectedly 
channel ORA_DISK_1 disabled, job failed on it will be run on another channel 
RMAN-03009: failure of backup command on ORA_DISK_2 channel at 09/09/2015 11:12:22 
RMAN-10038: database session for channel ORA_DISK_2 terminated unexpectedly 
channel ORA_DISK_2 disabled, job failed on it will be run on another channel 
RMAN-03009: failure of backup command on ORA_DISK_3 channel at 09/09/2015 11:12:22 
RMAN-10038: database session for channel ORA_DISK_3 terminated unexpectedly 
channel ORA_DISK_3 disabled, job failed on it will be run on another channel 
RMAN-00571: =========================================================== 
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS =============== 
RMAN-00571: =========================================================== 
RMAN-03009: failure of backup command on ORA_DISK_4 channel at 09/09/2015 11:12:22 
RMAN-10038: database session for channel ORA_DISK_4 terminated unexpectedly 

The ASM alert log didn't show any errors or warnings.
In the database alert log, you will see that each failure of the RMAN command comes with  --> ORA-00600 [kfioTranslateIO03] [17090]

A similar issue and its resolution are described in Doc ID 1336846.1.
As per this document, the issue can occur for multiple reasons, and one of them is a permission issue...

1. Check which group must own the 'oracle' binary file for it to access ASM successfully:

[grid@node1-riz lib]$ cat $ORACLE_HOME/lib/config.c | grep ASM 
#define SS_ASM_GRP "asmadmin"     >>>> 
char *ss_dba_grp[] = {SS_DBA_GRP, SS_OPER_GRP, SS_ASM_GRP}; 
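
If you want to script this check, here is a minimal sketch. It assumes config.c contains a line of the form #define SS_ASM_GRP "<group>", as in the output above. The function name asm_group_from_config is my own invention and not an Oracle tool.

```shell
# Hypothetical helper (not an Oracle utility): print the group name that a
# config.c file defines as SS_ASM_GRP.
# Assumes a line of the form:  #define SS_ASM_GRP "asmadmin"
asm_group_from_config() {
    config_file=$1
    # Capture the text between the double quotes on the SS_ASM_GRP line.
    sed -n 's/^#define[[:space:]]\{1,\}SS_ASM_GRP[[:space:]]\{1,\}"\([^"]*\)".*/\1/p' \
        "$config_file" | head -n 1
}
```

Call it with the path to the config.c you want to inspect. It prints nothing if the file has no SS_ASM_GRP definition.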

2. Let's check the group owner of the bin/oracle binary in each ORACLE_HOME:

Grid Home:
[grid@node1-riz ~]$ ls -latr $GRID_HOME/bin/oracle 
-rwsr-s--x 1 grid oinstall 169802036 Sep 3 12:22 /u01/app/11.2.0/grid/bin/oracle 

RAC Database Home:
[oracle@node1-riz ~]$ ls -lart /u01/app/oracle/product/11.2.0/dbhome_2/bin/oracle 
-rwsr-s--x 1 oracle asmadmin 192296439 Sep 3 13:30 /u01/app/oracle/product/11.2.0/dbhome_2/bin/oracle

rnwphoto Standalone Database Home: 
[oracle@node1-riz ~]$ ls -lart $ORACLE_HOME/bin/oracle 
-rwsr-s--x 1 oracle oinstall 192296337 Sep 2 19:41 /u01/app/oracle/product/11.2.0/dbhome_1/bin/oracle 
>>>>> *****************************************************  <<<<<

Do you see it? If not, read ahead...

Point #1 states that 'asmadmin' should be the group owner of the $ORACLE_HOME/bin/oracle binary.

Let's ignore the Grid Home findings, as that's not a part of the problem. It's just for your information.

The RAC Database Home works fine with ASM, as the group owner of its 'oracle' binary is 'asmadmin'.

The Standalone Database Home was created long before there was a Grid Home, so it still has 'oinstall' as the group owner of the 'oracle' binary file.
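
The comparison above can be sketched as a small script. The function names group_owner and check_oracle_group are hypothetical, and the group is read from the fourth column of ls -ld, the same column shown in the listings above.

```shell
# Print the group owner of a file (4th column of `ls -ld`).
group_owner() {
    ls -ld -- "$1" | awk '{print $4}'
}

# Hypothetical helper: compare the group owner of a binary with the expected
# group. Returns 0 on match, 1 on mismatch, 2 if the file does not exist.
check_oracle_group() {
    binary=$1
    expected=$2
    if [ ! -e "$binary" ]; then
        echo "MISSING: $binary"
        return 2
    fi
    actual=$(group_owner "$binary")
    if [ "$actual" = "$expected" ]; then
        echo "OK: $binary has group $actual"
    else
        echo "MISMATCH: $binary has group $actual, expected $expected"
        return 1
    fi
}
```

For example, check_oracle_group /u01/app/oracle/product/11.2.0/dbhome_1/bin/oracle asmadmin would have flagged the standalone home in my case.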

Solution:
In the Standalone Oracle Home, change the ownership of the 'oracle' binary from oracle:oinstall to oracle:asmadmin, and voilà!!!

WARNING: You need downtime to make this change. If you change the ownership without shutting down the database(s), expect all the databases running on that ORACLE_HOME to CRASH!!!
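
For reference, here is a hedged sketch of the change itself, not a definitive procedure. The function name fix_oracle_group and its dry-run default are my own assumptions. On Linux, chgrp usually clears the setuid/setgid bits, so the sketch also restores mode 6751, which corresponds to the -rwsr-s--x shown in the listings above. Run it only with the databases on that home shut down, and as a user allowed to change the group (root, or an owner who is a member of the target group).

```shell
# Hypothetical helper: switch the group owner of an 'oracle' binary.
# By default it only prints the commands (dry run); pass "run" as the third
# argument to execute them.
fix_oracle_group() {
    binary=$1
    group=$2
    mode=${3:-dry-run}
    if [ "$mode" = "run" ]; then
        # chgrp may drop the setuid/setgid bits, so restore rwsr-s--x (6751).
        chgrp "$group" "$binary" && chmod 6751 "$binary"
    else
        echo "chgrp $group $binary"
        echo "chmod 6751 $binary"
    fi
}
```

Running fix_oracle_group "$ORACLE_HOME/bin/oracle" asmadmin prints the two commands so you can review them first. Grid Infrastructure also ships a setasmgidwrap utility meant for this job, so check the Oracle documentation before choosing between the two.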
