Monday, March 11, 2013

On maximizing data loading speeds

Few days ago my datawarehouse division asked to check and eventually speed up their nightly loading jobs:
they said already in the past different solutions were implemented by other collegues and would like to know if it was possible to implement or adopt new feature to solve their problems.
I asked which were the most important jobs to speed up and which Oracle schemas they used.
It was clear and evident those hundred of jobs were created years ago and no one wanted to manage them. Several shell scripts scheduled by Sun Grid Engine (now Oracle Grid Engine) using different sql scripts, several schemas and logging in many different ways:
of course any kind of documentation was available or ever written.
To be concise using find and grep utilities I was able to find those jobs and begin my analisys:
first of all I would like to know which tables are used and started to analyse them.
In the analysed schema no index was present and I also verified all tables used by those jobs were created using the NOLOGGING storage option and that was good.
At the same time I asked if they knew what NOLOGGING meant from a backup point of view:
those tables were dropped and recreated every night, but just because they were based on the NOLOGGING option, those operations were not saved in any redo log, any archived redo log, any backup and so there was no way to recover them in case of a media failure.
It was time to view inside the shell script... which called a sql script in which there were the following lines code:
DROP TABLE table_name PURGE;
CREATE TABLE table_name (...) ... NOLOGGING;
exec schema_name.INSERT_INTO_TABLE_GENERIC_PROCEDURE ('TABLE_NAME');
The extents allocated for that table were:
SQL> select count(*) from dba_extents
  2  where segment_name = 'TABLE_NAME'
  3  and owner = 'SCHEMA_NAME';

  COUNT(*)
----------
       256

SQL> select segment_name, sum(bytes)/(1024*1024) MB
  2  from dba_extents
  3  where segment_name = 'TABLE_NAME'
  4  and owner = 'SCHEMA_NAME'
  5  group by segment_name;

SEGMENT_NAME      MB
----------------------------------------- ----------
TABLE_NAME           2312.5
The above procedure created a dynamic insert getting all the necessary columns from a table properly configured. For my table the dynamic sql executed was something like:
INSERT INTO TABLE_NAME (col1, col2, ... , coln)
SELECT ... FROM TABLE_NAME@REMOTE_MACHINE;
COMMIT;
It was evident that:
1) all inserts were still logged: 
When you set the NOLOGGING storage option during the creation of your table minimal redo information are generated during this creation whereas you can suppress some kinds of redo operations performed on it. If the database or tablespace were created using FORCE LOGGING mode, then any kind of direct path is logged, regardless of the logging settings of your tables. Every DML statement is not affected by your NOLOGGING setting: it will be always logged, generating it's usual redo information. It is possible to minimize redo information for direct path operations performed on the tables with NOLOGGING option: so the old script issued a common DML statement and it was naturally logged;
2) all the extents allocated for the tables needed to be recreated as well:
Every time a table is dropped and recreated Oracle has to de-allocate every allocated extents and then reallocate them one by one during the next massive INSERT statement;
3) I had to ask again whether or not datawarehouse division had completely understood the meaning and all the considerations of using NOLOGGING option in their backup strategies:
In NOLOGGING mode, data is modified with minimal logging (to mark new extents INVALID and to record dictionary changes). When applied during media recovery, the extent invalidation records mark a range of blocks as logically corrupt, because the redo data is not fully logged. Therefore, if you cannot afford to lose the database object and their new data, then you should take a backup after the NOLOGGING operation and avoid errors such as:
ORA-01578: ORACLE data block corrupted (file # %s, block # %s)
ORA-26040: Data block was loaded using the NOLOGGING option
To test all these considerations I added few lines to the old and new scripts. Let's consider just one of them:
-- OLD SCRIPT
set timi on
set serveroutput on
spool MV_ALL_020.log
declare
 l_redo01 number;
 l_redo02 number;
 begin

select b.value
 into l_redo01
 from v$statname a, v$mystat b
 where a.statistic# = b.statistic#
 and a.name = 'redo size';

execute immediate 'DROP TABLE SCHEMA_NAME.TABLE_NAME PURGE';
execute immediate 'CREATE TABLE SCHEMA_NAME.TABLE_NAME (...)';
INSERT INTO SCHEMA_NAME.TABLE_NAME
(COL1, COL2, ... , COLn) 
SELECT COL1, COL2, ... , COLn 
FROM REMOTE_SCHEMA_NAME.REMOTE_TABLE_NAME@REMOTE_DATABASE;

select b.value
 into l_redo02
 from v$statname a, v$mystat b
 where a.statistic# = b.statistic#
 and a.name = 'redo size';

dbms_output.put_line
 ( 'Redo generated: ' || to_char( (l_redo02-l_redo01), '999,999,999,999' ) );
end;
/
spool off
quit;
-- NEW SCRIPT
set timi on
set serveroutput on
spool MV_ALL_020_APPEND.log
declare
 l_redo01 number;
 l_redo02 number;
 begin

select b.value
 into l_redo01
 from v$statname a, v$mystat b
 where a.statistic# = b.statistic#
 and a.name = 'redo size';

execute immediate 'TRUNCATE TABLE SCHEMA_NAME.TABLE_NAME REUSE STORAGE';
INSERT /*+ APPEND */ INTO SCHEMA_NAME.TABLE_NAME
(COL1, COL2, ... , COLn) 
SELECT COL1, COL2, ... , COLn 
FROM REMOTE_SCHEMA_NAME.REMOTE_TABLE_NAME@REMOTE_DATABASE;

select b.value
 into l_redo02
 from v$statname a, v$mystat b
 where a.statistic# = b.statistic#
 and a.name = 'redo size';

dbms_output.put_line
 ( 'Redo generated: ' || to_char( (l_redo02-l_redo01), '999,999,999,999' ) );
end;
/
spool off
quit;
I had several scripts to test so I reported the spool log of just few of them here. The following is the command line executed to run one of them:
[sgeuser@sge02prdpom TEST]$ jobs
[1]+  Running                 sqlplus MARCOV/MARCOV@MYDB @MV_ALL_021.sql &
The following were the results of elapsed time of just 8 OLD scripts:
SPOOL_MV_ALL_020.log:Elapsed: 00:08:36.56
SPOOL_MV_ALL_021.log:Elapsed: 00:06:15.52
SPOOL_MV_ALL_022.log:Elapsed: 00:12:39.15
SPOOL_MV_ALL_023.log:Elapsed: 00:10:57.40
SPOOL_MV_ALL_024.log:Elapsed: 00:14:39.42
SPOOL_MV_ALL_025.log:Elapsed: 00:08:17.77
SPOOL_MV_ALL_026.log:Elapsed: 00:02:36.08
SPOOL_MV_ALL_027.log:Elapsed: 00:06:04.93
While the following are the results of elapsed time of NEW scripts:
SPOOL_MV_ALL_020_APPEND.log:Elapsed: 00:03:31.45
SPOOL_MV_ALL_021_APPEND.log:Elapsed: 00:02:49.26
SPOOL_MV_ALL_022_APPEND.log:Elapsed: 00:05:53.73
SPOOL_MV_ALL_023_APPEND.log:Elapsed: 00:05:47.56
SPOOL_MV_ALL_024_APPEND.log:Elapsed: 00:07:10.47
SPOOL_MV_ALL_025_APPEND.log:Elapsed: 00:04:38.26
SPOOL_MV_ALL_026_APPEND.log:Elapsed: 00:01:20.38
SPOOL_MV_ALL_027_APPEND.log:Elapsed: 00:02:45.18
The following were the results of redo size generated of the 8 OLD scripts:
SPOOL_MV_ALL_020.log:Redo generated:    2,388,254,036
SPOOL_MV_ALL_021.log:Redo generated:    2,024,760,520
SPOOL_MV_ALL_022.log:Redo generated:    4,192,424,864
SPOOL_MV_ALL_023.log:Redo generated:    3,504,667,421
SPOOL_MV_ALL_024.log:Redo generated:    4,825,084,628
SPOOL_MV_ALL_025.log:Redo generated:    2,784,744,144
SPOOL_MV_ALL_026.log:Redo generated:      727,602,864
SPOOL_MV_ALL_027.log:Redo generated:    1,881,049,084
While the following are the results of redo size generated of NEW scripts:
SPOOL_MV_ALL_020_APPEND.log:Redo generated:        1,356,192
SPOOL_MV_ALL_021_APPEND.log:Redo generated:        1,040,544
SPOOL_MV_ALL_022_APPEND.log:Redo generated:        1,805,952
SPOOL_MV_ALL_023_APPEND.log:Redo generated:        1,712,002
SPOOL_MV_ALL_024_APPEND.log:Redo generated:        1,996,384
SPOOL_MV_ALL_025_APPEND.log:Redo generated:        1,295,352
SPOOL_MV_ALL_026_APPEND.log:Redo generated:          599,672
SPOOL_MV_ALL_027_APPEND.log:Redo generated:          986,140
And finally the percentage comparison between OLD and NEW scripts in terms of elapsed time and redo size:
job 020's elapsed time was reduced by 59.1 %
job 021's elapsed time was reduced by 54.9 %
job 022's elapsed time was reduced by 53.5 %
job 023's elapsed time was reduced by 47.2 %
job 024's elapsed time was reduced by 51.1 %
job 025's elapsed time was reduced by 44.1 %
job 026's elapsed time was reduced by 48.7 %
job 027's elapsed time was reduced by 54.7 %

job 020's redo size was reduced by 99.943 %
job 021's redo size was reduced by 99.949 %
job 022's redo size was reduced by 99.957 %
job 023's redo size was reduced by 99.951 %
job 024's redo size was reduced by 99.959 %
job 025's redo size was reduced by 99.953 %
job 026's redo size was reduced by 99.918 %
job 027's redo size was reduced by 99.948 %

With these few expedients I was able to successfully run the nightly jobs inside a suitable window and adequately maximize the speed of insert statements.
That's all.

2 comments:

Anonymous said...

Hi there,
I have found your blog, which I really like it. I enjoyed to read your posts.
I was wondering if you would be interested in sharing your posts and ideas on Glipho? It's a quite new social publishing platform, where you can connect to every social network accounts (Facebook, Twitter, Instagram, YouTube, etc). Really easy to use it and communicate with your followers. In additional, you are able to import your posts too.
Please, have a look and take a tour to know more about Glipho.
If you would like to set up your account, please do not hesitate to ask me for further information.
We are always delighted to get any feedback about the things which do or don’t work for you.
I hope you will join to Glipho community soon. It's so much fun! :)
All the best,
Monika

FG said...

Excellent