1) How to recover contiguous corrupted data blocks using 'recover datafile' RMAN syntax;
2) How to recover sparse corrupted data blocks using 'recover datafile' RMAN syntax;
3) How to recover corrupted data blocks using 'recover datafile' RMAN syntax on a system critical datafile;
Reading the documentation there is a case where it isn't possible to use the block recover feature: it happens when the corrupt block is the first, I mean data block number 1 containing the header of the datafile.
Let's simulate this scenario using the following dd command:
[oracle@localhost trace]$ dd if=/dev/zero of=/home/oracle/app/oracle/oradata/orcl/marcov01.dbf bs=8k conv=notrunc seek=1 count=1 1+0 records in 1+0 records out 8192 bytes (8.2 kB) copied, 0.000432187 seconds, 19.0 MB/sTo check and have a completed list of corrupt blocks as we have already seen we should issue a backup validate command:
RMAN> backup validate tablespace marcov; Starting backup at 19-03-2013 23:07:11 using channel ORA_DISK_1 RMAN-06169: could not read file header for datafile 11 error reason 7 RMAN-00571: =========================================================== RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS =============== RMAN-00571: =========================================================== RMAN-03002: failure of backup command at 03/19/2013 23:07:11 RMAN-06056: could not access datafile 11The V$DATABASE_BLOCK_CORRUPTION view as usual should contain data about which blocks are corrupt, but it doesn't show any rows:
SQL> select * from v$database_block_corruption; no rows selectedWhat does it happen if I try to recover that first block ? It seems to work fine...
RMAN> recover datafile 11 block 1; Starting recover at 19-03-2013 23:08:09 using channel ORA_DISK_1 starting media recovery media recovery complete, elapsed time: 00:00:01 Finished recover at 19-03-2013 23:08:10... but trying again a backup validate command I obtain the same error.
RMAN> backup validate tablespace marcov; Starting backup at 19-03-2013 23:08:30 using channel ORA_DISK_1 RMAN-06169: could not read file header for datafile 11 error reason 7 RMAN-00571: =========================================================== RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS =============== RMAN-00571: =========================================================== RMAN-03002: failure of backup command at 03/19/2013 23:08:30 RMAN-06056: could not access datafile 11Ok, it's time to perform a more familiar restore/recover tablespace operation.
So a typical sequence is put that tablespace offline, restore it, recover it and then put it back online:
RMAN> sql 'alter tablespace marcov offline immediate'; sql statement: alter tablespace marcov offline immediate RMAN> restore tablespace marcov; Starting restore at 19-03-2013 23:13:30 using channel ORA_DISK_1 RMAN-00571: =========================================================== RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS =============== RMAN-00571: =========================================================== RMAN-03002: failure of restore command at 03/19/2013 23:13:30 ORA-01135: file 11 accessed for DML/query is offline ORA-01110: data file 11: '/home/oracle/app/oracle/oradata/orcl/marcov01.dbf' RMAN-06019: could not translate tablespace name "MARCOV"Ops... the familiar restore/recover sequence is not working at all.
And I can't even put the tablespace online again.
RMAN> sql 'alter tablespace marcov online'; sql statement: alter tablespace marcov online RMAN-00571: =========================================================== RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS =============== RMAN-00571: =========================================================== RMAN-03009: failure of sql command on default channel at 03/19/2013 23:19:30 RMAN-11003: failure during parse/execution of SQL statement: alter tablespace marcov online ORA-01122: database file 11 failed verification check ORA-01110: data file 11: '/home/oracle/app/oracle/oradata/orcl/marcov01.dbf' ORA-01210: data file header is media corruptIt seems the familiar restore/recover operation cannot start because of the corrupt data block number 1.
So how can I proceed ? I put again the tablespace offline ...
RMAN> sql 'alter tablespace marcov offline immediate'; sql statement: alter tablespace marcov offline immediate... and simply removed the entire datafile with the corrupt data block number 1.
[oracle@localhost trace]$ rm /home/oracle/app/oracle/oradata/orcl/marcov01.dbfWithout the entire datafile, the restore tablespace command is able to complete successfully.
RMAN> restore tablespace marcov; Starting restore at 19-03-2013 23:20:29 using channel ORA_DISK_1 channel ORA_DISK_1: starting datafile backup set restore channel ORA_DISK_1: specifying datafile(s) to restore from backup set channel ORA_DISK_1: restoring datafile 00011 to /home/oracle/app/oracle/oradata/orcl/marcov01.dbf channel ORA_DISK_1: reading from backup piece /home/oracle/app/oracle/flash_recovery_area/ORCL/backupset/2013_03_18/o1_mf_nnndf_TAG20130318T224952_8nhz414b_.bkp channel ORA_DISK_1: piece handle=/home/oracle/app/oracle/flash_recovery_area/ORCL/backupset/2013_03_18/o1_mf_nnndf_TAG20130318T224952_8nhz414b_.bkp tag=TAG20130318T224952 channel ORA_DISK_1: restored backup piece 1 channel ORA_DISK_1: restore complete, elapsed time: 00:00:02 Finished restore at 19-03-2013 23:20:32Also the recover tablespace command is able to finish its operation.
RMAN> recover tablespace marcov; Starting recover at 19-03-2013 23:20:45 using channel ORA_DISK_1 starting media recovery media recovery complete, elapsed time: 00:00:00 Finished recover at 19-03-2013 23:20:47The tablespace should be now available again to all the users.
RMAN> sql 'alter tablespace marcov online'; sql statement: alter tablespace marcov onlineOk... It's not true and I know why. During the restore operation RMAN simply used the first available backup piece: o1_mf_nnndf_TAG20130318T224952_8nhz414b_.bkp. But that backup piece contains a datafile with few corrupt data blocks because I forced to skip them during the backup operation. You can verify what I'm saying here in one of my previous posts.
SQL> select count(*) from marcov.t1; select count(*) from marcov.t1 * ERROR at line 1: ORA-01578: ORACLE data block corrupted (file # 11, block # 10) ORA-01110: data file 11: '/home/oracle/app/oracle/oradata/orcl/marcov01.dbf'Let's update the V$DATABASE_BLOCK_CORRUPTION view issuing a backup validate command.
RMAN> backup validate tablespace marcov; Starting backup at 19-03-2013 23:21:53 using channel ORA_DISK_1 channel ORA_DISK_1: starting compressed full datafile backup set channel ORA_DISK_1: specifying datafile(s) in backup set input datafile file number=00011 name=/home/oracle/app/oracle/oradata/orcl/marcov01.dbf channel ORA_DISK_1: backup set complete, elapsed time: 00:00:02 List of Datafiles ================= File Status Marked Corrupt Empty Blocks Blocks Examined High SCN ---- ------ -------------- ------------ --------------- ---------- 11 OK 6 553 768 14877710 File Name: /home/oracle/app/oracle/oradata/orcl/marcov01.dbf Block Type Blocks Failing Blocks Processed ---------- -------------- ---------------- Data 0 196 Index 0 0 Other 0 19 Finished backup at 19-03-2013 23:21:56As you can see datafile number 11 has few corrupt blocks.
SQL> select * from v$database_block_corruption; FILE# BLOCK# BLOCKS CORRUPTION_CHANGE# CORRUPTIO ---------- ---------- ---------- ------------------ --------- 11 8 1 14877710 CORRUPT 11 7 1 0 CORRUPT 11 10 1 0 CORRUPT 11 3 3 0 CORRUPTUsing the recover corruption list we are able to correct the content of those corrupt blocks.
Just note also how RMAN "failover to previous backup" (skipping the already mentioned o1_mf_nnndf_TAG20130318T224952_8nhz414b_.bkp backup piece and also the next available one, o1_mf_nnndf_TAG20130318T090132_8ngglx76_.bkp) because using the current backup piece is not able to perform and complete the recovery operation.
RMAN> recover corruption list; Starting recover at 19-03-2013 23:22:36 using channel ORA_DISK_1 channel ORA_DISK_1: restoring block(s) channel ORA_DISK_1: specifying block(s) to restore from backup set restoring blocks of datafile 00011 channel ORA_DISK_1: reading from backup piece /home/oracle/app/oracle/flash_recovery_area/ORCL/backupset/2013_03_18/o1_mf_nnndf_TAG20130318T224952_8nhz414b_.bkp channel ORA_DISK_1: piece handle=/home/oracle/app/oracle/flash_recovery_area/ORCL/backupset/2013_03_18/o1_mf_nnndf_TAG20130318T224952_8nhz414b_.bkp tag=TAG20130318T224952 channel ORA_DISK_1: restored block(s) from backup piece 1 channel ORA_DISK_1: block restore complete, elapsed time: 00:00:00 failover to previous backup channel ORA_DISK_1: restoring block(s) channel ORA_DISK_1: specifying block(s) to restore from backup set restoring blocks of datafile 00011 channel ORA_DISK_1: reading from backup piece /home/oracle/app/oracle/flash_recovery_area/ORCL/backupset/2013_03_18/o1_mf_nnndf_TAG20130318T090132_8ngglx76_.bkp channel ORA_DISK_1: piece handle=/home/oracle/app/oracle/flash_recovery_area/ORCL/backupset/2013_03_18/o1_mf_nnndf_TAG20130318T090132_8ngglx76_.bkp tag=TAG20130318T090132 channel ORA_DISK_1: restored block(s) from backup piece 1 channel ORA_DISK_1: block restore complete, elapsed time: 00:00:01 failover to previous backup channel ORA_DISK_1: restoring block(s) channel ORA_DISK_1: specifying block(s) to restore from backup set restoring blocks of datafile 00011 channel ORA_DISK_1: reading from backup piece /home/oracle/app/oracle/flash_recovery_area/ORCL/backupset/2013_03_18/o1_mf_nnndf_TAG20130318T072836_8ng94o0k_.bkp channel ORA_DISK_1: piece handle=/home/oracle/app/oracle/flash_recovery_area/ORCL/backupset/2013_03_18/o1_mf_nnndf_TAG20130318T072836_8ng94o0k_.bkp tag=TAG20130318T072836 channel ORA_DISK_1: restored block(s) from backup piece 1 channel ORA_DISK_1: block restore complete, elapsed time: 00:00:01 starting media recovery media recovery complete, elapsed time: 00:00:04 Finished recover at 19-03-2013 23:22:43RMAN after looking into o1_mf_nnndf_TAG20130318T072836_8ng94o0k_.bkp backup piece completes the recovery operation and the T1 table is now really available to all the users.
SQL> select count(*) from marcov.t1; COUNT(*) ---------- 1000Next step will be about how to recover corrupt block using 'recover tablespace' and data block address (dba) RMAN syntax.
That's all.