Search This Blog

Tuesday, July 23, 2019

We're using up storage too fast! Where is all the growth coming from?

Technical challenges aside, its difficult to navigate the politics of being an Oracle DBA.  In today's world, data warehouses/data lakes are expected to ingest massive amounts of data and still perform well.  Ok, no problem...this can be done...but when the storage utilization spikes, the storage guys (who write the checks for the expensive storage arrays the data is on)...they want to know if/why they need to buy another storage array.  DBA's are expected to perform storage planning, but are rarely informed when the data owners increase utilization until after the fact.

This question comes up pretty often...why is the database growing so fast?  Coming in to a new environment I'm not familiar with...I have a bunch of scripts I usually run to get a profile of the databases.  Usually a non-technical manager wants to see a summary, so if this data can be presented graphically as part of a presentation, communication flows better, and it looks more professional.

All the storage utilization data is being kept in the history tables...its just a matter of pulling it .  This is a dynamic query that creates a query...the results of that can be put into Excel and graphed.  Once the results are graphed, things pretty clearly jump out at you.  So...

1. Run this query (if you aren't using 8k block size, change 8192 below to w/e you're using):

select stmnt from (
select 1 o1, 0 o2, 'with stacked as
(
select snap_id, tablespace_id, trunc(to_date(rtime,''MM/DD/YYYY HH24:MI:SS'')) rtime, (tablespace_size*8192)/1024/1024/1024 ts_size
from dba_hist_tbspc_space_usage tdu
where snap_id in (select max(snap_id) from dba_hist_tbspc_space_usage tdu2
                    where trunc(to_date(tdu.rtime,''MM/DD/YYYY HH24:MI:SS''))=trunc(to_date(tdu2.rtime,''MM/DD/YYYY HH24:MI:SS'')))
)
select distinct rtime,'  stmnt from dual
union all
select 2, TS#,'(select ts_size from stacked s2 where s2.tablespace_id='||TS#||' and s1.rtime=s2.rtime and rownum=1) as "'||NAME||'"'||decode((tsm-ts#),0,null,',') from v$tablespace ts1 , (select max(TS#) tsm from v$tablespace) ts2
union all
select 3,0,'from stacked s1
 where (select ts_size from stacked s2 where s2.tablespace_id=0 and s1.rtime=s2.rtime) is not null;' from dual
 ) order by o1,o2;

2. Take the output of that query, and run it.  It'll create a very wide data set with daily storage size data for each tablespace.

3. Cut/Paste and put that data into Excel.

4. Select the data columns in Excel and at the top, select insert, then on the "Charts" group, select your favorite chart type..."2-D, stacked area" works well for this data, IMHO.

You should now see something like this (GB on the Y axis):

...so now its clear where the acceleration in growth came from.  The relatively new, orange "troublemaker_data" tablespace had a ~10TB growth spurt until May 1st, where it looks like it leveled off.  This empowers you to go after the data owners and ask them if this will continue, and you can tell the storage guys there was growth but it settled down and they don't need to panic and buy more storage.

I hope this helps you with your storage planning!

No comments:

Post a Comment