Pages

Friday, April 5, 2013

How to recover corrupted data blocks using 'recover tablespace' RMAN syntax

This post finishes a series of previous posts related to the recovery process of corrupt data blocks:
1) How to recover contiguous corrupted data blocks using 'recover datafile' RMAN syntax;
2) How to recover sparse corrupted data blocks using 'recover datafile' RMAN syntax;
3) How to recover corrupted data blocks using 'recover datafile' RMAN syntax on a system critical datafile;
4) How to perform a recovery when the first block of a datafile containing the datafile header becomes corrupt;

Reading the documentation there was only another syntax method to recover corrupt data blocks: the recover tablespace ... dba ... command.

I don't find it very useful compared with the recover corruption list command but I would like only to describe this possibility.

So let's simulate this scenario backing up MARCOV tablespace first:
RMAN> backup tablespace marcov;

Starting backup at 20-03-2013 02:51:45
using channel ORA_DISK_1
channel ORA_DISK_1: starting compressed full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00011 name=/home/oracle/app/oracle/oradata/orcl/marcov01.dbf
channel ORA_DISK_1: starting piece 1 at 20-03-2013 02:51:45
channel ORA_DISK_1: finished piece 1 at 20-03-2013 02:51:47
piece handle=/home/oracle/app/oracle/flash_recovery_area/ORCL/backupset/2013_03_20/o1_mf_nnndf_TAG20130320T025145_8nm1ol0w_.bkp tag=TAG20130320T025145 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:02
Finished backup at 20-03-2013 02:51:47

Starting Control File and SPFILE Autobackup at 20-03-2013 02:51:47
piece handle=/home/oracle/app/oracle/flash_recovery_area/ORCL/autobackup/2013_03_20/o1_mf_s_810528707_8nm1omhz_.bkp comment=NONE
Finished Control File and SPFILE Autobackup at 20-03-2013 02:51:48
First of all let's wipe out some blocks of MARCOV tablespace's datafile, in particular some blocks of the header.
[oracle@localhost trace]$ dd if=/dev/zero of=/home/oracle/app/oracle/oradata/orcl/marcov01.dbf bs=8k conv=notrunc seek=3 count=3
3+0 records in
3+0 records out
24576 bytes (25 kB) copied, 0.000349459 seconds, 70.3 MB/s
DBVERIFY is able to find the three corrupt blocks on the datafile:
[oracle@localhost trace]$ dbv file=/home/oracle/app/oracle/oradata/orcl/marcov01.dbf blocksize=8192

DBVERIFY: Release 11.2.0.2.0 - Production on Wed Mar 20 02:53:27 2013

Copyright (c) 1982, 2009, Oracle and/or its affiliates.  All rights reserved.

DBVERIFY - Verification starting : FILE = /home/oracle/app/oracle/oradata/orcl/marcov01.dbf
Page 3 is marked corrupt
Corrupt block relative dba: 0x02c00003 (file 11, block 3)
Completely zero block found during dbv: 

Page 4 is marked corrupt
Corrupt block relative dba: 0x02c00004 (file 11, block 4)
Completely zero block found during dbv: 

Page 5 is marked corrupt
Corrupt block relative dba: 0x02c00005 (file 11, block 5)
Completely zero block found during dbv: 

DBVERIFY - Verification complete

Total Pages Examined         : 768
Total Pages Processed (Data) : 196
Total Pages Failing   (Data) : 0
Total Pages Processed (Index): 0
Total Pages Failing   (Index): 0
Total Pages Processed (Other): 16
Total Pages Processed (Seg)  : 0
Total Pages Failing   (Seg)  : 0
Total Pages Empty            : 553
Total Pages Marked Corrupt   : 3
Total Pages Influx           : 0
Total Pages Encrypted        : 0
Highest block SCN            : 14877710 (0.14877710)
Even flushing the buffer cache I'm able to successfully execute query like the following if some header blocks are corrupt:
SQL> select count(*) from marcov.t1;

  COUNT(*)
----------
      1000

SQL> alter system flush buffer_cache;

System altered.

SQL> select count(*) from marcov.t1;

  COUNT(*)
----------
      1000

SQL> select header_block, blocks, extents from dba_segments where segment_name='T1';

HEADER_BLOCK     BLOCKS    EXTENTS
------------ ---------- ----------
          10        256         17
Now I'm going to reset other two blocks after the end of the latest header block, two data blocks:
[oracle@localhost trace]$ dd if=/dev/zero of=/home/oracle/app/oracle/oradata/orcl/marcov01.dbf bs=8k conv=notrunc seek=11 count=2
2+0 records in
2+0 records out
16384 bytes (16 kB) copied, 0.00193982 seconds, 8.4 MB/s
I'm not able to count again the rows in T1 table after I flush the buffer cache:
SQL> select count(*) from marcov.t1;

  COUNT(*)
----------
      1000

SQL> alter system flush buffer_cache;

System altered.

SQL> select count(*) from marcov.t1;
select count(*) from marcov.t1
*
ERROR at line 1:
ORA-01578: ORACLE data block corrupted (file # 11, block # 11)
ORA-01110: data file 11: '/home/oracle/app/oracle/oradata/orcl/marcov01.dbf'
The database currently doesn't know there are few header and data corrupt blocks on datafile 11:
SQL> select * from v$database_block_corruption;

no rows selected
To check and have a completed list of corrupt blocks you should issue a backup validate command:
RMAN> backup validate tablespace marcov;

Starting backup at 20-03-2013 03:00:12
using channel ORA_DISK_1
channel ORA_DISK_1: starting compressed full datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00011 name=/home/oracle/app/oracle/oradata/orcl/marcov01.dbf
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:01
List of Datafiles
=================
File Status Marked Corrupt Empty Blocks Blocks Examined High SCN
---- ------ -------------- ------------ --------------- ----------
11   FAILED 0              553          768             14877710  
  File Name: /home/oracle/app/oracle/oradata/orcl/marcov01.dbf
  Block Type Blocks Failing Blocks Processed
  ---------- -------------- ----------------
  Data       0              194             
  Index      0              0               
  Other      5              21              

validate found one or more corrupt blocks
See trace file /home/oracle/app/oracle/diag/rdbms/orcl/orcl/trace/orcl_ora_9582.trc for details
Finished backup at 20-03-2013 03:00:13
The backup validate command is able to finish its job and identify the corrupt blocks, filling the v$database_block_corruption view.
SQL> select * from v$database_block_corruption;

     FILE#     BLOCK#     BLOCKS CORRUPTION_CHANGE# CORRUPTIO
---------- ---------- ---------- ------------------ ---------
        11         11          2                  0 ALL ZERO
        11          3          3                  0 ALL ZERO
Let's try to start the recovery process using the recover tablespace ... dba syntax where dba stands for data block address.
How can you know the data block address of block 3,4 and 5 of datafile 11 ? You can use the Oracle DBMS_UTILITY package and obtain the data block address like in the following example:
SQL> select DBMS_UTILITY.MAKE_DATA_BLOCK_ADDRESS(11,3) dba1, DBMS_UTILITY.MAKE_DATA_BLOCK_ADDRESS(11,3+1) dba2, DBMS_UTILITY.MAKE_DATA_BLOCK_ADDRESS(11,3+2) dba3 from dual;

      DBA1       DBA2       DBA3
---------- ---------- ----------
  46137347   46137348   46137349
You can follow the same procedure to know the data block address of data block 11 and 12:
SQL> select DBMS_UTILITY.MAKE_DATA_BLOCK_ADDRESS(11,11) dba4, DBMS_UTILITY.MAKE_DATA_BLOCK_ADDRESS(11,11+1) dba5 from dual;

      DBA4       DBA5
---------- ----------
  46137355   46137356
Now that you know every data block address you can specify them in one unique RMAN command:
RMAN> recover tablespace marcov dba 46137347,46137348,46137349,46137355,46137356;

Starting recover at 20-03-2013 03:08:49
using channel ORA_DISK_1

channel ORA_DISK_1: restoring block(s)
channel ORA_DISK_1: specifying block(s) to restore from backup set
restoring blocks of datafile 00011
channel ORA_DISK_1: reading from backup piece /home/oracle/app/oracle/flash_recovery_area/ORCL/backupset/2013_03_20/o1_mf_nnndf_TAG20130320T025145_8nm1ol0w_.bkp
channel ORA_DISK_1: piece handle=/home/oracle/app/oracle/flash_recovery_area/ORCL/backupset/2013_03_20/o1_mf_nnndf_TAG20130320T025145_8nm1ol0w_.bkp tag=TAG20130320T025145
channel ORA_DISK_1: restored block(s) from backup piece 1
channel ORA_DISK_1: block restore complete, elapsed time: 00:00:01

starting media recovery
media recovery complete, elapsed time: 00:00:02

Finished recover at 20-03-2013 03:08:52
The v$database_block_corruption view doesn't contain any information:
SQL> select * from v$database_block_corruption;

no rows selected
It's again possible to query T1 table
SQL> select count(*) from marcov.t1;

  COUNT(*)
----------
      1000
If you try to dump the three header blocks...
SQL> alter system dump datafile '/home/oracle/app/oracle/oradata/orcl/marcov01.dbf' block min 3 block max 5;

System altered.
... you can find that beside rdba there is an hexadecimal number:
...
buffer tsn: 40 rdba: 0x02c00003 (11/3)
scn: 0x0000.00e2a62a seq: 0x01 flg: 0x04 tail: 0xa62a1e01
frmt: 0x02 chkval: 0x4dcc type: 0x1e=KTFB Bitmapped File Space Bitmap
Hex dump of block: st=0, typ_found=1
...
buffer tsn: 40 rdba: 0x02c00004 (11/4)
scn: 0x0000.00e2a047 seq: 0x01 flg: 0x04 tail: 0xa0471e01
frmt: 0x02 chkval: 0x823c type: 0x1e=KTFB Bitmapped File Space Bitmap
Hex dump of block: st=0, typ_found=1
...
buffer tsn: 40 rdba: 0x02c00005 (11/5)
scn: 0x0000.00e2a049 seq: 0x01 flg: 0x04 tail: 0xa0491e01
frmt: 0x02 chkval: 0xc235 type: 0x1e=KTFB Bitmapped File Space Bitmap
Hex dump of block: st=0, typ_found=1
...
If I try to convert those hexadecimal numbers to decimal I find the same number returned by the previous DBMS_UTILITY.MAKE_DATA_BLOCK_ADDRESS procedures:
[oracle@localhost trace]$ printf '%d %d %d\n' 0x02c00003 0x02c00004 0x02c00005
46137347 46137348 46137349
The same thing happens for the two data blocks:
SQL> alter system dump datafile '/home/oracle/app/oracle/oradata/orcl/marcov01.dbf' block min 11 block max 12;

System altered.
The two hexadecimal numbers...
...
buffer tsn: 40 rdba: 0x02c0000b (11/11)
scn: 0x0000.00e3040e seq: 0x07 flg: 0x04 tail: 0x040e0607
frmt: 0x02 chkval: 0x0691 type: 0x06=trans data
Hex dump of block: st=0, typ_found=1
...
buffer tsn: 40 rdba: 0x02c0000c (11/12)
scn: 0x0000.00e3040e seq: 0x07 flg: 0x04 tail: 0x040e0607
frmt: 0x02 chkval: 0x6c94 type: 0x06=trans data
Hex dump of block: st=0, typ_found=1
...
are converted and equal to the data block address returned by the DBMS_UTILITY.MAKE_DATA_BLOCK_ADDRESS procedure.
[oracle@localhost trace]$ printf '%d %d\n' 0x02c0000b 0x02c0000c
46137355 46137356

That's all.