Wednesday, July 13, 2011

Be careful when you stop using a v11 GoldenGate extract

Just wanted to give a heads-up on an issue I caught that would have taken down our Exadata environment.  In the old version of GoldenGate, you create extracts and you have .prm files that hold their parameter information.  When you no longer want to use an extract, you can be lazy and just leave it there not running, or you can delete the .prm file and drop the extract.

When you do this...verify that the number of extracts you have running in GGSCI is equal to the number of entries you see in dba_capture.  Make sure the old entries are gone.  There's an enhancement in GG 11 that prevents rman from deleting archived logs that haven't been mined.  What actually happens is rman checks min_required_capture_change# in v$database, which is populated every 6 hrs based on dba_capture.  So, if you stop using an extract but its entry is still in dba_capture, rman won't delete the archivelogs with the usual command:

backup archivelog all delete input;


Instead you'll get RMAN-08137 and you may see this in your backup log:


RMAN-08120: WARNING: archived log not deleted, not yet applied by standby

From MOS note 1079953.1, you can force it to delete them:
delete noprompt force archivelog all completed before 'sysdate-10/1440';

...but in our situation we're also using a standby database.  The enhancement to not delete logs you still need is great...I don't want them deleted before GG can mine them and Data Guard can apply them.  So...forcing the delete seems...rash to me.

I've been told by a consultant from Oracle (formerly a consultant from GG) that if you connect to the db and then delete the extract, it will also remove the streams/dba_capture entry from the database...if you don't connect to the db when you drop the extract, it'll stay in the db until you manually unregister it.  Also, after you create the extract, when you start it, it verifies there's an entry in dba_capture for this process...if there isn't, it creates one.  He said there's very little documentation on this...just a mention in the release notes.  This change is a pretty big deal, given that if you aren't aware of it, and don't notice the errors in your backup logs, it could eventually stop processing in your database.  What would be *much* better is if...instead of just verifying your extract exists, it did something like a sync, and made the list in dba_capture match the list in GGSCI.  It would also be nice if they changed the "drop extract" command to require a db login to minimize the issue.
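Based on that advice, the safe way to retire an extract is to log in to the database from GGSCI before dropping it (the group name below is hypothetical; dblogin, stop extract and delete extract are standard GGSCI commands):

```
GGSCI> dblogin userid ggadmin, password *****
GGSCI> stop extract ext1
GGSCI> delete extract ext1
```

With the dblogin in place, the delete should also clean up the corresponding streams/dba_capture entry, so rman's retention check stops waiting on an extract that no longer exists.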

Regardless of whether you choose the MOS workaround or unregister the extra streams entries in dba_capture, be aware that if you don't do something, your FRA or archivelog destination will eventually fill, which will halt your database.  
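A quick way to audit this (plain SQL, run as a DBA) is to compare the capture processes the database is still tracking against your GGSCI extract list:

```sql
-- Capture processes the database still thinks it must retain logs for;
-- each of these should match a live extract in GGSCI
select capture_name, status, required_checkpoint_scn
from   dba_capture;

-- The SCN rman compares against (refreshed roughly every 6 hours)
select min_required_capture_change# from v$database;
```

If an orphaned entry shows up, the Streams API (dbms_capture_adm.drop_capture) can remove it manually, but dropping the extract from GGSCI with a db login is the cleaner route.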

Monday, July 11, 2011

*FAST* Domain (and normal unq, non-unq) index rebuilds on Exadata

With any migration, one of the key objectives is to minimize downtime.  Even in an MAA migration where you're expecting to move the data and keep it in sync via some form of CDC and finish with a nearly instant "flip-over", you still want the time between the start of the data move and the start of the CDC to be minimized. The longer it takes, the more storage is needed to keep track of the DML changes.  So, as we prepare for another round of database migrations to Exadata, one of my objectives has been to minimize the time it takes to get the data in a usable state on the Exadata target.

As part of the migration, to take advantage of the compression and performance features in Exadata, we've also made a lot of table partitioning changes.  We have several 2TB+ tables that use context indexes (aka domain indexes).  These are a little obscure, so if you don't know what they are...in essence, they bring many advanced search features to large text objects...key among those features is that they make clobs searchable.  They're completely unlike any other index...when you create a domain index, with a single statement you're creating many new objects...tables and indexes to support that index.  It's very messy, so it's common to want to put those "supporting" objects in a different schema.

As you can imagine, rebuilding these multi-terabyte indexes can take an extremely long time, so it's avoided whenever possible.  Due to the migration's new partitioning schemes, it can't be avoided, and these indexes will need to be rebuilt.  The last time it was attempted, one of the indexes took over a week to rebuild.  The reason it took so long is a limitation in how Oracle parallelizes the process.  If you have a domain index with 8 partitions and you issue a rebuild command specifying parallel 80, in v$process for the first few seconds you'll see 80 processes spawn off...then you'll see all but 8 of them die.  Oracle will only allow 1 process per partition during the domain index rebuild.  This makes the rebuild process on large domain indexes extremely slow.
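You can watch this happen yourself...a few seconds after issuing the rebuild, count the parallel slaves (standard v$ views, nothing environment-specific):

```sql
-- Count of active parallel execution slaves on this instance;
-- on RAC, query gv$px_process to see all nodes at once
select count(*) from v$px_process;
```

On an 8-partition domain index you'll see the count briefly spike to whatever degree you asked for, then settle at 8.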

To get around this issue, Oracle created a built-in package that will submit multiple jobs (one job per partition), and allow each of those to be parallelized.  It's called dbms_pclxutil.build_part_index.  The first time I used this I was extremely pleased...in my experimentation instance I was able to recreate the index that had previously taken over a week in about 4 hours.
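The call looks something like this (table and index names here are made up)...jobs_per_batch controls how many partition-level jobs run at once, and procs_per_job sets the parallel degree within each job:

```sql
begin
  dbms_pclxutil.build_part_index(
    jobs_per_batch => 8,     -- one job per partition
    procs_per_job  => 10,    -- parallel 10 inside each partition's rebuild
    tab_name       => 'BIG_DOCS',
    idx_name       => 'BIG_DOCS_IDX',
    force_opt      => false);
end;
/
```

So 8 partitions times parallel 10 gives you 80 processes on the rebuild instead of the 8 you'd get from a plain alter index rebuild.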

The problem was, in my sandbox I had created the table and the index in my schema, and I was the one issuing the rebuild so everything just worked.  During a later test migration when I submitted the same procedure that worked before...it gave me errors.  There are multiple limitations with Oracle's procedure...the big one is that the table owner, index owner and index re-builder all have to be the same user.

That makes no sense, especially for domain indexes, which, like I said, commonly are created in a separate schema.  Is this a technical limitation?  No.  I can still do an "alter index rebuild partition" statement on an index not owned by me, and I can build a domain index on a table owned by somebody else.  I checked around the internet and found a possible reason << HERE >>.  Which is to say...there is no reason.

In the link I referenced, Charles asks, "For starters, the said index has to be partitioned and unusable. Not a biggie, but why?"   I think the answer is...it's not that it's necessary, but if the index isn't partitioned, why not just do a normal alter index rebuild statement?  It wouldn't hurt to not have the limitation...I'm just saying the procedure wouldn't really help you.  In all the situations I've needed to use this, it was a new index, and so creating it unusable was the fastest way...other than that, I don't know what advantage the unusable limitation brings. 

*update*
I changed the procedure below to work for both partitioned and non-partitioned indexes.  Although handling non-partitioned indexes doesn't bring any performance benefit, it means you no longer have to check whether an index is partitioned before calling this from a script.
*/update*


Like Charles, I find the limitations pointless and oppressive, so...since this is something that's going to need to happen over and over for multiple domain indexes and multiple migrations...I created my own package that does the same thing without the limitations.  It will recursively call itself for each partition, submitting jobs to the job queue to perform the index rebuild.

I also added a few features...the big one is that you specify how many RAC nodes you want it to run on...and when the jobs are submitted, they're submitted round-robin to each node.  For one of the indexes, without this package I was only able to go parallel 9 (because I had 9 partitions on the domain index.)  With it, I went parallel 10 times 9 partitions...so I had 90 processes across 4 Exadata compute nodes rebuilding this index.  On a 4-node high-capacity Exadata machine, I was able to build an index in about 4 hours that had previously taken over a week on an IBM P595.

There's nothing technically that specifies "domain index" in this...so I guess it could be used for other large indexes too, although I've only tested it in the scenario I described.

*UPDATE*
I've now tested it with many different types of indexes, both domain and local without any issues.
*/UPDATE*

See the usage info in the comment section in the package body under the "ind" procedure.

Disclaimer: I hope this helps you, but as always, the thoughts described here are my opinions...use at your own risk.  Like most personal relationships, the good thing about this package is potentially the worst thing about it.  It parallelizes aggressively and can consume all the resources on your RAC if you haven't already set limits via your init.ora parameters for parallelism and job_queue_processes.

CREATE OR REPLACE package quickly_rebuild
authid current_user
as
PROCEDURE ind_part
(
  Ind_Owner_in     varchar2,
  Ind_Name_in      varchar2,
  Ind_Part_Name_in varchar2,
  Parallelism_in   number
);

PROCEDURE ind
(
  Ind_Owner_in        varchar2,
  Ind_Name_in         varchar2,
  Parallelism_in      number,
  Instance_Count_in   number default 1
);
  v_Last_Node number; --persistent variable to track the last node that was assigned a job, so you can do many jobs spread across nodes equally.
end;
/
CREATE OR REPLACE package body quickly_rebuild
as
PROCEDURE ind_part
(
ind_owner_in IN varchar2,
ind_name_in IN varchar2,
ind_part_name_in IN varchar2,
parallelism_in IN number
) IS
cursor_name INTEGER;
ret INTEGER;
v_Ind_Owner varchar2(30) := replace(ind_owner_in,';',':');
v_Ind_Name varchar2(30) := replace(ind_name_in,';',':');
v_Ind_Part_Name varchar2(30) := replace(ind_part_name_in,';',':');
v_Parallelism number := parallelism_in;
BEGIN

cursor_name := DBMS_SQL.OPEN_CURSOR;
-- build the SQL from the sanitized local variables, not the raw parameters
if v_Ind_Part_Name='NOT_PARTITIONED' then
DBMS_SQL.PARSE(cursor_name, 'alter index '||v_Ind_Owner||'.'||v_Ind_Name||' rebuild parallel '||v_Parallelism, DBMS_SQL.NATIVE);
else
DBMS_SQL.PARSE(cursor_name, 'alter index '||v_Ind_Owner||'.'||v_Ind_Name||' rebuild partition '||v_Ind_Part_Name||' parallel '||v_Parallelism, DBMS_SQL.NATIVE);
end if;
ret := DBMS_SQL.EXECUTE(cursor_name);
DBMS_SQL.CLOSE_CURSOR(cursor_name);
END;


PROCEDURE ind
(
Ind_Owner_in varchar2,
Ind_Name_in varchar2,
Parallelism_in number,
Instance_Count_in number default 1
)
IS

type r_IP is record (
index_owner varchar2(30),
index_name varchar2(30),
partition_name varchar2(30)
);

type t_IP is table of r_IP index by binary_integer;
c_IP t_IP;
v_Job number;
v_Count number;

Ind_Not_Exist exception;
pragma exception_init(Ind_Not_Exist,-20001);

v_Statement1 varchar2(32767) := 'select count(0) from dba_ind_partitions where index_owner=:a and index_name=:b and status=''UNUSABLE''';
v_Statement2 varchar2(32767) := 'select index_owner, index_name, partition_name from dba_ind_partitions where index_owner=:a and index_name=:b and status=''UNUSABLE''';
v_Statement3 varchar2(32767) := 'select count(0) from dba_indexes where owner=:a and index_name=:b and status=''UNUSABLE''';

begin

/*
Author: Andy Black
For more info see: http://otipstricks.blogspot.com/2011/07/fast-domain-index-rebuild-on-exadata.html

USAGE: Create the index with the UNUSABLE clause and then issue:
exec quickly_rebuild.ind(index_owner, index_name, parallelism_per_partition, instance_count);

Req: The user submitting this work will need alter any index granted directly
You'll need the init.ora parameter job_queue_processes!=0
The index you're rebuilding must exist and must be marked unusable (it may be partitioned or non-partitioned)
Don't forget to commit after executing this procedure

This procedure will submit to the job queue a separate job for each partition of your index, and rebuild each one with the parallelism you specified.
In RAC, when you specify instance_count_in, it will try to evenly distribute the job submissions across instances. So on a 5 node RAC, an index with 10 partitions
and parallel 10 passed in will have 100 processes rebuilding it...2 jobs (2 partitions) and 20 processes running on each node.

This was created to overcome the limitation of a similar Oracle-supplied package that limits the index owner, the table owner and the index rebuilder to all be the same user.
This procedure doesn't have that limitation. I had a multi-TB domain index that took over a week to rebuild the normal way,
because the index rebuild parallelism is limited to the number of partitions of the domain index. I found this limitation unnecessary and impractical,
so I made this. Using this procedure it took about 4 hrs. There's nothing "domain index" specific about the rebuild statement,
so it will likely work for other indexes, although that hasn't been tested.

Any errors, such as ORA-29952 will be found in the alert log of the node that had the error.

*/

execute immediate v_Statement1 into v_Count using Ind_Owner_in, Ind_Name_in;
if v_Count=0 then
dbms_output.put_line('Not partitioned or error...');
execute immediate v_Statement3 into v_Count using Ind_Owner_in, Ind_Name_in;
if v_Count=0 then
raise_application_error(-20001, 'That index does not exist or is not marked UNUSABLE.',FALSE);
else -- not a partitioned index
quickly_rebuild.v_Last_Node := nvl(quickly_rebuild.v_Last_Node,0)+1;
dbms_output.put_line('Launching...');
dbms_job.submit( job => v_Job,
what => 'begin quickly_rebuild.ind_part('''||upper(Ind_Owner_in)||''','''||upper(Ind_Name_in)||''',''NOT_PARTITIONED'','||parallelism_in||'); end;',
next_date =>sysdate-1,
interval =>null,
no_parse => false,
instance => mod(quickly_rebuild.v_Last_Node,Instance_Count_in)+1,
force =>true);
dbms_output.put_line('Launched...');
end if;
else

execute immediate v_Statement2 bulk collect into c_IP using Ind_Owner_in, Ind_Name_in;
for i in 1..c_IP.last loop
quickly_rebuild.v_Last_Node := nvl(quickly_rebuild.v_Last_Node,0)+1;
dbms_job.submit( job => v_Job,
what => 'begin quickly_rebuild.ind_part('''||c_IP(i).index_owner||''','''||c_IP(i).index_name||''','''||c_IP(i).partition_name||''','||parallelism_in||'); end;',
next_date =>sysdate-1,
interval =>null,
no_parse => false,
instance => mod(quickly_rebuild.v_Last_Node,Instance_Count_in)+1,
force =>true);

end loop;
end if;
end;
end;
/
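To tie it together, a typical run looks like this (schema and index names are made up; per the package comments, the index must already exist and be marked UNUSABLE):

```sql
-- create the index unusable first, then let the package rebuild it
create index app.sales_fact_ix on app.sales_fact (sale_date) local unusable;

-- parallel 10 per partition, spread round-robin across 4 RAC nodes
exec quickly_rebuild.ind('APP','SALES_FACT_IX',10,4);
commit;  -- the dbms_job submissions don't start until you commit
```

Watch dba_jobs (or the alert logs, for errors like ORA-29952) while the rebuilds run, and check dba_ind_partitions afterward to confirm every partition has gone from UNUSABLE to USABLE.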