Search This Blog

Friday, April 27, 2018

Poor query performance after stats were gathered?

Although the Oracle optimizer is brilliant, its not infallible.  Its vulnerability is that it depends on good table statistics to determine an optimal plan.  I optimized a query today that was projecting to use over 2PB of temp space.  It was a *horrible* plan...I started to try to rewrite it and thought...why is Oracle doing this...there isn't THAT much data.  The first thing I checked was the accuracy of the table stats...within about 10 minutes I had the query completing in ~5 seconds. 

When it comes to gathering stats, there are 2 schools of thought out there...those that gather stats frequently and those that gather stats that generate plans they're happy with, then lock them...or at least gather them much less frequently and intentionally.

Some people believe stats should be gathered frequently to always have the optimal query performance all the time.  If the table data changes sizes dramatically and frequently...this might be ok.  If the stats were gathered with estimates (which is commonly done), its possible that you'll gather stats based on a subset of your data that doesn't represent the whole table.  So then your stats aren't great...but even then...usually the optimizer gets the right plan or a plan that's "close enough" that it doesn't cause any pain.  Jonathan Lewis points out in "The Cost-Based Optimizer" (great book, by the way), that one of the primary purposes of table statistics are to mathematically create relativity between the table row counts in a join. 

This means, if the largest table in a join today is the largest table if you have 1mil rows today and it grows to 2 mil rows, you probably don't want your plan to change and so you don't need to regather stats. 

Let's say the 1mil row table becomes a 1 row table, you could regather stats and get a new plan and everything would be great.  The next day a data load brings it up to 10 mil rows...suddenly the plan that ran great isn't finishing in its SLA.  A good plan for a query today isn't necessarily a good plan tomorrow.

My opinion is...if your table sizes fluctuate, you probably want to gather stats when the tables are large and lock them, which will cause your plans to be stable.  If the table is much smaller tomorrow, your plan might not be optimal, but it won't be worse than it is you'll have ~ consistency in run time performance (and you meet your SLA's.)

Oracle has made great strides to improve this...with 12.1's OPTIMIZER_ADAPTIVE_FEATURES and 12.2's OPTIMIZER_ADAPTIVE_STATISTICS, Oracle will correct itself with statistics feedback...which will prevent you from "falling off the temp usage cliff."  Although these features are would be better to not have a problem that needs to be corrected in the first place. 

Since these problems are usually on complex views (on views, on views, on views...ets...)  Here's a little query you can run to find the dependencies of the top level view, gather stats on its tables/indexes, and lock them (so the problem doesn't happen again.)  I'm gathering them w/null est % (ie:compute)...adjust that and the degree/method_opt to fit your needs.  This should generate pretty good stats and lock them, allowing the optimizer to make its brilliant decisions once again.

  distinct 'begin'||chr(13)||'
    estimate_percent  => NULL,
    method_opt=> ''FOR ALL INDEXED COLUMNS SIZE AUTO '',
    degree            => 32,
    cascade           => TRUE,
    no_invalidate  => FALSE);'||chr(13)||
    'exec dbms_stats.lock_table_stats('''||o2.owner||''','''||o2.object_name||''');'
from   sys.dba_objects o1,
       sys.dba_objects o2,
      (Select object_id, referenced_object_id
       from   (select object_id, referenced_object_id
               from   public_dependency
               where  referenced_object_id <> object_id) pd
       start with  object_id = (select object_id from dba_objects where object_name='YOUR_TABLE_NAME' and owner='YOUR_TABLE_OWNER')
       connect by nocycle prior referenced_object_id =  object_id) o3
where o1.object_id = o3.object_id
and   o2.object_id = o3.referenced_object_id
and o2.object_type='TABLE'
and   o1.owner not in ('SYS', 'SYSTEM')
and   o2.owner not in ('SYS', 'SYSTEM')
and   o1.object_name <> 'DUAL'
and   o2.object_name <> 'DUAL';