Wednesday, October 19, 2011

Index Compression Part 3 (Compressing the indexes online)

My first post on index compression talked about what it is (and isn't).  My 2nd post talked about alter index...validate structure, and offered a very fast way to get the number of columns to compress and estimated compression information into a persistent table.  So now we know what we want to do, and which indexes we want to do it to...now what?

If you ran the procedure from Part 2, you now have a populated table with index owner, index name, the number of columns in the index you should compress (if any) and the expected compression ratio you'll achieve.  The options before you are to:

1. Rebuild the indexes that will benefit from compression, compressing by the number of columns that were indicated from the table populated in Part 2...doing this offline, causing DML locking on the table you're building the index on.

2. Same as above, only doing it online, where you don't block dml.

3. Forget it...leave them the way they are because you don't think the space saved is worth the trouble of compressing them.  This is a completely valid result of this analysis.  If you have abundant storage or if the compression yields were low, don't bother.

Assuming you don't go with 3, and you're in a live database (not your testing database where you're nearly the only user) I would go with #2 above.  Online index rebuilds are similar in some ways to online table redefs.  What happens is that, instead of referencing your table for data to build the index, Oracle references the original index as the source, as of an SCN.  No DDL operations are allowed.  If there's DML on the table, that DML is entered into a journal table.  When the first pass is complete, the journal table is accessed and the changes made during the online create are applied to the new index.  If changes are made during this pass, they're again added to the journal table.  Its possible that high DML activity will cause many passes, which is why Oracle recommends doing this during periods of low DML activity.  Eventually, you'll have a very short pass where all the changes are made to the new index.  At this point, the original index, the new index and the table are all in sync with each other.  The next thing to happen is a nearly instant metadata update that makes the new index the active index, and the metadata for the original index is gone (for all practical purposes).  This means DML is allowed on the table with the online index rebuild for nearly the entire time...there's just one extremely brief lock during the metadata switch-over.  In most environments, this can be done without an outage...only extremely sensitive applications would have a problem with the lock during the metadata update.  Again...test it, so when your management needs to be convinced, you'll have data to prove the outage time is insignificant.

Using the table from part 2 of this series, you can use something like this sql statement to dynamically generate your index rebuilds.  It'll build them parallel 24, and then set them back to whatever parallel setting they had before.  I'm filtering out indexes where compression will save less than 25MB...feel free to tweak it to fit your needs.

set timing off
set heading off
set echo off
set linesize 300
spool comp_idx_usmg.sql
select 'alter index '||owner||'.'||index_name||' rebuild parallel 24 compress '||cmpr_count
||' online; '||chr(10)||'alter index '||owner||'.'||index_name||' parallel (DEGREE '||
(select degree from dba_indexes d2 where d2.owner=d1.owner and d2.index_name=d1.index_name)||');'
from
(
select owner, index_name, cmpr_count,
(select /*+ PARALLEL 24 */ sum(bytes/1024/1024)*(ici.cmpr_pctsave/100)
from dba_segments where owner=ici.owner and segment_name=ici.index_name) saved
from idx_comp_info ici where cmpr_pctsave!=0
) d1 where saved>=25
order by owner, index_name;
spool off
@comp_idx_usmg.sql
exit

I just met with some reps from FusionIO...its clear that as ultra-fast storage options come up, little space-saving techniques like this are even more valuable.  NAND flash technology is becoming cheaper all the time...but it still comes at a premium, and putting indexes on FusionIO is a HUGE performance benefit to your database.  The smaller you can make your indexes, the more you can store on flash, and the faster the FusionIO card performs.  (Yes, as the SSD/Flash card fills, it slows.)  That's an idea for a whole new series, for another day.

I hope all this information helps you when you find that storage is at a premium on your database.  Index compression doesn't have the "Wow!" effect that HCC compression has, but with very little, if any, downside with modern processors, the technology is worth at least a consideration.

Just an addendum:

The stored proc doesn't work for local indexes...its a different syntax to analyze them, and the index partition info is stored in a different view (dba_ind_partitions).  This is unfortunate because that's likely where your biggest indexes (and thus, biggest compression gains) will come from.  Until I get around to adding that functionality, you can use this sql to generate a script to find the compression gains from those indexes:

select 'analyze index '||index_owner||'.'||index_name||' partition ('||partition_name||') validate structure;'||CHR(13)||
  'delete from idx_comp_info where owner='''||index_owner||''' and index_name='''||index_name||''';'||chr(13)||
  'insert into idx_comp_info select '''||index_owner||''','''||index_name||'.'||partition_name||''', opt_cmpr_count, opt_cmpr_pctsave, -1  from index_stats;'||chr(13)||
  'commit;'
  from dba_ind_partitions di where subpartition_count=0
  and exists
  (select 'X' from IDX_COMP_INFO_LOG2 cil
     where di.index_owner=cil.owner
       and di.index_name=cil.name)          
union all    
select 'analyze index '||index_owner||'.'||index_name||' subpartition ('||subpartition_name||') validate structure;'||CHR(13)||
  'delete from idx_comp_info where owner='''||index_owner||''' and index_name='''||index_name||''';'||chr(13)||
  'delete from idx_comp_info where owner='''||index_owner||''' and index_name='''||index_name||'.'||partition_name||''';'||chr(13)||
  'insert into idx_comp_info select '''||index_owner||''','''||index_name||'.'||partition_name||''', opt_cmpr_count, opt_cmpr_pctsave, -1  from index_stats;'||chr(13)||
  'commit;'
  from dba_ind_subpartitions di
  where exists
  (select 'X' from IDX_COMP_INFO_LOG2 cil
     where di.index_owner=cil.owner
       and di.index_name=cil.name)          
     order by 1,

Compressed Indexes (Part 1)
Compressed Indexes (Part 2)
Compressed Indexes (Part 3)


No comments:

Post a Comment