Friday, April 27, 2018

Poor query performance after stats were gathered?

Although the Oracle optimizer is brilliant, it's not infallible.  Its vulnerability is that it depends on good table statistics to determine an optimal plan.  I optimized a query today that was projected to use over 2PB of temp space.  It was a *horrible* plan...I started to try to rewrite it and thought...why is Oracle doing this...there isn't THAT much data.  The first thing I checked was the accuracy of the table stats...within about 10 minutes I had the query completing in ~5 seconds.

When it comes to gathering stats, there are 2 schools of thought out there...those that gather stats frequently and those that gather stats that generate plans they're happy with, then lock them...or at least gather them much less frequently and intentionally.

Some people believe stats should be gathered frequently to always have the optimal query performance.  If the table data changes size dramatically and frequently...this might be ok.  If the stats were gathered with estimates (which is commonly done), it's possible that you'll gather stats based on a subset of your data that doesn't represent the whole table.  So then your stats aren't great...but even then...usually the optimizer gets the right plan or a plan that's "close enough" that it doesn't cause any pain.  Jonathan Lewis points out in "Cost-Based Oracle Fundamentals" (great book, by the way) that one of the primary purposes of table statistics is to mathematically establish the relative row counts of the tables in a join.

This means that if the largest table in a join has 1 mil rows today and it's still the largest table when it grows to 2 mil rows, you probably don't want your plan to change, so you don't need to regather stats.

Let's say the 1mil row table becomes a 1 row table, you could regather stats and get a new plan and everything would be great.  The next day a data load brings it up to 10 mil rows...suddenly the plan that ran great isn't finishing in its SLA.  A good plan for a query today isn't necessarily a good plan tomorrow.

My opinion is...if your table sizes fluctuate, you probably want to gather stats when the tables are large and lock them, which will cause your plans to be stable.  If the table is much smaller tomorrow, your plan might not be optimal, but it won't be worse than it is today...you'll have roughly consistent run time performance (and you'll meet your SLA's.)

Oracle has made great strides to improve this...with 12.1's OPTIMIZER_ADAPTIVE_FEATURES and 12.2's OPTIMIZER_ADAPTIVE_STATISTICS, Oracle will correct itself with statistics feedback...which will prevent you from "falling off the temp usage cliff."  Although these features are great, it would be better to not have a problem that needs to be corrected in the first place.

Since these problems are usually on complex views (on views, on views, on views...etc...), here's a little query you can run to find the dependencies of the top level view and generate the statements to gather stats on its tables/indexes and lock them (so the problem doesn't happen again.)  I'm gathering them w/null est % (ie: compute)...adjust that and the degree/method_opt to fit your needs.  This should generate pretty good stats and lock them, allowing the optimizer to make its brilliant decisions once again.

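-- Generates one PL/SQL block per table under the top level view:
-- gather stats, then lock them.  Spool the output and run it.
-- Replace YOUR_TABLE_NAME / YOUR_TABLE_OWNER with the top level view and its owner.
select distinct
       'begin'||chr(10)||
       '  dbms_stats.gather_table_stats('||chr(10)||
       '    ownname          => '''||o2.owner||''','||chr(10)||
       '    tabname          => '''||o2.object_name||''','||chr(10)||
       '    estimate_percent => NULL,'||chr(10)||
       '    method_opt       => ''FOR ALL INDEXED COLUMNS SIZE AUTO'','||chr(10)||
       '    degree           => 32,'||chr(10)||
       '    cascade          => TRUE,'||chr(10)||
       '    no_invalidate    => FALSE);'||chr(10)||
       '  dbms_stats.lock_table_stats('''||o2.owner||''','''||o2.object_name||''');'||chr(10)||
       'end;'||chr(10)||'/' as stats_cmd
from   sys.dba_objects o1,
       sys.dba_objects o2,
      (select object_id, referenced_object_id
       from   (select object_id, referenced_object_id
               from   public_dependency
               where  referenced_object_id <> object_id) pd
       start with  object_id = (select object_id from dba_objects where object_name='YOUR_TABLE_NAME' and owner='YOUR_TABLE_OWNER')
       connect by nocycle prior referenced_object_id = object_id) o3
where  o1.object_id = o3.object_id
and    o2.object_id = o3.referenced_object_id
and    o2.object_type = 'TABLE'
and    o1.owner not in ('SYS', 'SYSTEM')
and    o2.owner not in ('SYS', 'SYSTEM')
and    o1.object_name <> 'DUAL'
and    o2.object_name <> 'DUAL';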

Thursday, August 24, 2017

Effect of MBPS vs Latency

I do *a lot* of performance testing on high performance storage arrays for multiple vendors.  Usually if I'm involved, the client is expecting to put mission critical databases on their new expensive storage, and they need to know it performs well enough to meet their needs. 

So...parsing that out..."meet their needs"...means different things to different people.  Most businesses are cyclical, so the performance they need today is likely not the performance they need at their peak.  For example...Amazon does much more business the day after Thanksgiving than they do on a random day in May.  If you gather the usage stats in May and size the storage for them, you're going to get a call in a few months when the performance shortfall is exposed.

Before I talk about latency, let me just say AWR does a great job of keeping performance data, if you have your data kept long enough...preferably at least 2 business cycles so you can do comparisons and projections.

This statement will keep AWR data for 3 years, capturing it at an aggressive 15 minute interval:

execute dbms_workload_repository.modify_snapshot_settings (interval => 15,retention => 1576800);

On that point, see my other post re: gathering IOPS and throughput requirements.
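Both arguments to modify_snapshot_settings are expressed in minutes, so the retention value is easy to sanity-check.  A quick sketch (the function name is only for illustration, and it assumes 365-day years):

```shell
# Convert a retention period in years to the minutes that
# dbms_workload_repository.modify_snapshot_settings expects.
# Assumption: 365-day years, no leap days.
retention_minutes() {
  local years=$1
  echo $(( years * 365 * 24 * 60 ))
}

retention_minutes 3   # prints 1576800
```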

Anyway, I often have discussions with people who don't understand the effect of latency on OLTP databases.  This is an overly focused serial example, but it's enough to make the point.  Think about this...let's say you have a normal 8K block Oracle database using Netapp or EMC NFS on an active-active 10Gb network.  Let's say your amazing all-SSD storage array is capable of flowing 10Gb between multiple paths.   So...the time to move 8K over a 10Gb pipe is...

An 8K block is 8 kilobytes, or 64Kb, so (64Kb)/(10485760Kb/s)=0.000006103515625 seconds to copy the block over the 10Gb pipe.  By the time it passes through your storage network, gets processed by the storage array, gets retrieved from disk, and makes it back to your server, it can easily exceed a few ms...but for fun let's say we're getting an 8ms response time.  That's .008 seconds.

.008/0.000006103515625=1,310.72...so the effect of latency on your block is ~1,310X greater than the effect of throughput.  If your throughput magically got faster but your latency stayed the same...performance wouldn't really improve very much.  If you went from 8ms to 5ms, on the other hand, this would have a huge effect on your database performance.
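The same arithmetic as a small sketch, counting the 8K block as 8 kilobytes (64 kilobits) and the 10Gb link as 10,485,760 kilobits per second.  The function name and argument order are just for illustration:

```shell
# Ratio of round-trip latency to pure wire time for one block.
# Arguments: latency in ms, block size in kilobits, link speed in kilobits/s.
latency_ratio() {
  awk -v lat_ms="$1" -v block_kb="$2" -v link_kbps="$3" 'BEGIN {
    wire_s = block_kb / link_kbps          # seconds spent on the wire
    printf "%.2f\n", (lat_ms / 1000) / wire_s
  }'
}

# 8 KB block = 64 Kb; 10Gb link = 10 * 1024 * 1024 Kb/s
latency_ratio 8 64 10485760   # 8ms response time -> 1310.72
latency_ratio 5 64 10485760   # 5ms response time -> 819.20
```

Doubling the link speed only shrinks the tiny wire-time term, while shaving 3ms of latency removes more than a third of the total time per block.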

There's a lot that can affect latency...usually the features in use on the storage array play a big part.  CPU utilization on the storage array can become too high.  This is ultra complicated for the storage array guys to diagnose.  On EMC VMAX3's for example, CPU is allocated to "pools" for different features.  So...even though you may not use eNAS, by default, you allocate a lot of your VMAX CPU cores to it.  When your FC traffic pegs its cores and latency spikes...the administrator may think to look at the CPU utilization and not see an issue...there's free CPU available...just not in the pool used for the FC front end cores, so it creates a bottleneck.  Awesome performance improvements are possible by working closely with your storage vendor to reduce latency during testing...about 6 months ago I worked with a team that achieved improvements of over 50% from the standard VMAX3 as delivered by adjusting those allocations.

All this to say...Latency is very important for common OLTP databases.  Don't ignore throughput, but don't focus on it.

The last secret tweak for improving datapump performance (Part 2)

In my previous post, I mentioned some of the common datapump performance tweaks we see.  In this one, I want to talk about one that's never mentioned, and it might be the best of all.

5.  ADOP (Auto Degree of Parallelism) - The last tweak...I don't think I've seen any blog posts or Oracle documentation about ADOP as it applies to datapump.  This can be a *huge* improvement (4X or more) on imports, which is typically where most of your datapump time is spent.  This has been around since 11.2, but until recently (12.1, 12.2) it's been a little difficult to control how parallel things would run.  To enable it, you simply set:


(which means, if the optimizer thinks this statement will take more than 60 seconds, it will consider parallelizing it)

This is nice because quick queries will run without the overhead of parallelism, and long running queries might find value in parallelism.

Today in 12c, we have the parallel_degree_level parameter, but in 11.2 we could tweak the parallelism by adjusting max_pmbps in sys.resource_io_calibrate$.  From my testing, 200=~parallel 2 or 3.  A SMALLER value increases the amount of parallelism (50=~parallel 20.)  Effectively this gives us the same effect as the new 12c feature...which is to make Oracle make rational decisions on how parallel the auto parallelism should be. 

Datapump is a logical copy (as opposed to a physical backup/restore) so it can't copy the original indexes, it has to rebuild them.  If you have a typical import with 10,000 tables and indexes, the last 100 are big, the last 10 are huge.  Datapump's parallelism will rip through the small objects very quickly, with one process per object.  When the time comes to rebuild the indexes on the huge tables, datapump will again assign one process per index rebuild.  When the create index statement is analyzed by the optimizer, it will create it in parallel based on the algorithm derived from max_pmbps (even though the create index statement may be parallel 2 or noparallel).  This will save many hours on a large datapump import.  It's crazy to see a serialized create index statement with 50 busy parallel slaves...but that's what can happen.

When it's done, the indexes all have the original parallel spec they started with.  Nothing is any different than it would have been if you hadn't used ADOP (other than it was done much, much faster.)

One word of caution:  You have to watch your system resources and parallel limits.  If you have datapump running at parallel 50, that means potentially you'll be rebuilding 50 indexes simultaneously.  If they're each "large" and the optimizer thinks they'll take over [parallel_min_time_threshold] seconds to rebuild, each of them could be built parallel and you could have up to 2,500 (datapump 50 * adop 50) parallel processes.  This is a wonderful thing if your system can handle it.  Depending on your parallel limit parameters, ADOP may queue the statements until you have enough parallel processes free, to prevent the system from overloading.  IMHO, that's also a wonderful thing, but it may be unexpected.  The truly unfortunate situation is when you have the limits set too high.  You'll use up your server resources and inefficiently use CPU...and may even swap if you run low on RAM.  So test!

...but that's what dry runs are for.  I hope this last tweak helps you.  I've seen it make miraculous differences meeting otherwise impossible SLA's for datapump export/imports.  Two other posts you may want to read are:
1. Gwen Shapira has the best post on it (IMHO)
2. Kerry Osborne has a nice post on the 12c changes.

Previous post -> The last secret tweak for improving datapump performance (Part 1)
This post -> The last secret tweak for improving datapump performance (Part 2)

The last secret tweak for improving datapump performance (Part 1)

There are 10,000 blog posts and oracle docs on the internets (thank you, Mr Bush) for improving datapump performance.  This is one of the features used very frequently in Oracle shops around the world.  And DBA's are under huge stress to meet impossible downtime SLA's for their export/import. I think they all miss the best tweak (ADOP)...but they basically summarize to one simple fact...outside of parallelism, if you have a well-tuned database on fast storage, there's not a lot more you can do to improve performance by more than a few percent.  The obvious improvements are:

1. Parallelize! To quote Inigo Montoya, "I do not think it means what you think it means."

This will create multiple processes and each will take one object and work with it, each one serialized (usually).  This is great, and if all your objects are equally sized, this is perfect...but that's not typically reality.  Usually you have a few objects that are much bigger than the rest and each of them by default will only get a single process.  This limit really hurts during imports, when you need to rebuild large indexes...more on this later.

Check that it's working as expected by hitting control-c and typing in status.  Ideally, you should see all parallel slaves working.  Check not just that they exist, but that they're working (verify you used %U in your dump filename if they aren't.)  ie: DUMPFILE=exaexp%U.dmp PARALLEL=100

2. If you're importing into a new database, you have the flexibility to make some temporary changes to tweak things.  Verify Disk Asynchronous IO is set to true (DISK_ASYNCH_IO=true) and disable all the block verification (DB_BLOCK_CHECKING=FALSE, DB_BLOCK_CHECKSUM=FALSE).  These aren't "game changers" but they'll give you 10-20% improvements, depending on how you had them set previously.

3. Memory Settings - Datapump parallelization uses some of the streams API's, and so the streams pool is used.  Make sure you have enough memory for the shared_pool, streams_pool and the db_cache_size parameters.  Consult your gv$streams_pool_advice, gv$shared_pool_advice, gv$db_cache_advice and gv$sga_target_advice views.  I like to tune it so that the point where the delta in ESTD_DB_TIME_FACTOR from one row to the row below it approaches zero lands at a size factor close to 1.  (Any more than that is a waste, any less than that is lost performance.)

Sometimes you'll see a huge dropoff and it's clearer than this example...but you get the idea.  If you're importing into a new database, you'll need to run the import dry run and then check these views to make sure this is tuned well.

  SIZE  SIZE_FACTOR  ESTD_DB_TIME  ESTD_DB_TIME_FACTOR   DELTA
158720          0.5      12162648               1.2045
178560       0.5625      11421479               1.1311  0.0734
198400        0.625      10937800               1.0832  0.0479
218240       0.6875      10607608               1.0505  0.0327
238080         0.75      10604578               1.0502  0.0003
257920       0.8125      10375361               1.0275  0.0227
277760        0.875      10222886               1.0124  0.0151
297600       0.9375      10101716               1.0004   0.012
317440            1      10097674                    1  0.0004
337280       1.0625      10017905               0.9921  0.0079
357120        1.125       9955300               0.9859  0.0062
376960       1.1875       9908850               0.9813  0.0046
396800         1.25       9905821                0.981  0.0003
416640       1.3125       9870479               0.9775  0.0035
436480        1.375       9844225               0.9749  0.0026
456320       1.4375       9826049               0.9731  0.0018
476160          1.5       9821001               0.9726  0.0005
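The delta column in the listing above is just the drop in ESTD_DB_TIME_FACTOR from one row to the next.  A minimal sketch of computing it from saved advice output, assuming whitespace-separated rows of size, size factor, estimated DB time and estimated DB time factor (the function name is made up for the example):

```shell
# Print size, size factor and the drop in ESTD_DB_TIME_FACTOR
# relative to the previous row, for every row after the first.
advice_deltas() {
  awk 'NR > 1 { printf "%s %s %.4f\n", $1, $2, prev - $4 }
       { prev = $4 }'
}

# Three rows from around size factor 1 in the listing above.
advice_deltas <<'EOF'
297600 0.9375 10101716 1.0004
317440 1 10097674 1
337280 1.0625 10017905 0.9921
EOF
```

With those three rows it prints `317440 1 0.0004` and `337280 1.0625 0.0079`, matching the last column of the listing.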

4. Something often missed when importing into a new database: size your redo logs to be relatively huge.  The redo logs will work like a cache and cycle around.  Eventually, if you're adding data extremely fast, the current log will fill and can't switch until the next log is cleared.  "Huge" is relative to the size and speed of your database and hardware.  While you're running your import, select * from v$log and make sure you see at least one "inactive" log in front of the current log.

The best, virtually unused tweak is in the next post....

This post:  The last secret tweak for improving datapump performance (Part 1)
Next post: The last secret tweak for improving datapump performance (Part 2)

Wednesday, February 22, 2017

Vertica Architecture (Part 2)

This is continued from my previous post.

In review, I spoke with some very smart and experienced Vertica consultants regarding the DR architecture, and found the most obvious solutions all had huge drawbacks.

1. Dual-Load: Double your license costs(?), there's also the potential to have the two clusters out of sync, which means you need to put logic in your loads to handle the possibility that a load succeeds in datacenter 1 and fails in datacenter 2.
2. Periodic Incremental Backups: Need identical standby system (aka, half the capacity and performance of your hardware because the standby is typically idle)
3. Replication solutions provided by storage vendors: The recommended design uses local storage, not storage arrays, so this is difficult to implement, in addition to the expense and the potential of replicating media failures.

At first, here's what we did instead:

Initially (aka don't do this), we set up 2 failgroups (Vertica's documentation calls them fault groups), 3 nodes in datacenter 1 and 3 in datacenter 2. Failgroups in Vertica are intended for use where you have known dependencies that Vertica can't see...for example, a server rack.  Both failgroups are in the same cluster, and so data that's entered into nodes 1, 2 or 3 gets replicated automatically by Vertica to the other failgroup's nodes 4, 5 and 6.

We were trying to protect ourselves from the possibility of a complete datacenter failure, or a WAN failure.  The WAN is a 10Gb, low latency dark fiber link with a ring design, so highly available.  Although the network is HA, the occasional "blip" happens, where a very brief outage causes a disconnection.  Clusters don't like disconnections.

We were very proud of this design until we tested it...and it completely failed.  It made sense...although logically we had all the data we needed in a single failgroup, if we simulated a network outage we'd see all 6 nodes go down.  This is actually an intentional outcome, and a good thing.  If you've worked with clusters, you know it's much better to have the cluster go down than to have it stay up in a split brain scenario and corrupt all your data.  If the cluster stays up and becomes out of sync, you have to fix whatever the initial issue was, and you compound the problem with the need to restore all your data.

So...intentionally, if you have half your nodes go down, Vertica causes the whole cluster to go down, even if you have all the data you need to stay up in the surviving nodes.  Oracle RAC uses a disk voting mechanism to decide which part of the cluster stays up, but there's no such mechanism in Vertica.

We were back to the 3 original options...all with their drawbacks.  While poring over the documentation looking for an out-of-the-box solution, I noticed Vertica 8 introduced a new type of node called an Execute node.  Again...very little documentation on this, but I was told this was a more official way to deal with huge ingest problems like they had at Facebook (35TB/hr).  Instead of using Ephemeral nodes (nodes in transition between being up and being down) like they did, you could create execute nodes that only store the catalog...they store no other data, but only exist for the purpose of ingestion.

Upon testing, we also found Execute nodes "count" as a node in the quorum.  So instead of having 6 nodes (3 nodes in DC1 and 3 in DC2), we'd add a 7th node in a cloud (we chose Oracle's cloud.)  It's a great use case for a cloud server because it has almost no outgoing data, almost no CPU utilization (only enough to maintain the catalog) and the only IO is for the catalog.  So now, if DC1 went down, we had a quorum of 4 surviving nodes (4,5,6,7)...if DC2 went down, we still have 4 surviving nodes (1,2,3,7).  If all the nodes stayed up, but the WAN between DC1 and DC2 stopped functioning, Vertica would kill one of the failgroups and continue to run...with no risk of a split brain.
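The node counting can be sketched in a few lines, assuming only the rule described above: the cluster stays up while strictly more than half of its nodes are up.  This is a simplification (it ignores whether the surviving nodes actually hold all the data), and the function names are made up for the example:

```shell
# Succeeds (exit status 0) when strictly more than half of the
# cluster's nodes are still up.
has_quorum() {
  local total=$1 surviving=$2
  [ $(( surviving * 2 )) -gt "$total" ]
}

# Print a one-line verdict for a given cluster size and survivor count.
check() {
  if has_quorum "$1" "$2"; then
    echo "$2 of $1 nodes up: cluster stays up"
  else
    echo "$2 of $1 nodes up: cluster shuts down"
  fi
}

check 6 3   # original 3+3 design, one datacenter lost
check 7 4   # 3+3+1 design, one datacenter lost
check 7 3   # 3+3+1 design, a datacenter cut off from the other 4 nodes
```

With 6 nodes, losing either datacenter leaves exactly half, which is not a majority; the 7th node is what breaks the tie.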

We're continuing to test, but at this point, it's performed perfectly.  This has effectively doubled our performance and capacity because we have a 6 node cluster instead of two 3 node clusters.  It's all real time, and there's no complex dual load logic to program in our application.

Next, I'll talk about Vertica backups.

Tuesday, February 21, 2017

Vertica Capacity Architecture

Since nearly the first time I logged in to an Oracle database, I remember finding issues with documentation and the occasional error in documentation.  It's understandable...usually this was due to a change or a new feature that was introduced and the documentation just wasn't updated.  The real problem was in me...I was judging Oracle's documentation vs perfection...I should have lowered my expectations and appreciated what it was instead of being upset for what it wasn't.

While evaluating and designing the architecture for an HP Vertica database for a client, I gained a new appreciation for Oracle's documentation.  I expected to find everything I needed to do a perfect Vertica cluster install across 2 data centers in an active/active configuration for DR.  When the documentation failed and I resorted to Google, I mostly found people with the same questions I had and no solutions.

Soooo...I thought I'd make a few notes on what I learned and landed on. I am by no means a Vertica expert, but I've definitely learned a lot about it, and I've had the opportunity to stand on the shoulders of a few giants recently.

Our requirement is to store 10TB of actual data.  We don't know how well it will compress, so we're ignoring compression for capacity planning purposes.  How much physical storage do you need for that much data?  Vertica licensing is based on data capacity, but that's not the amount of capacity used...it's the amount of data ingested.  Vertica makes "projections" that (in Oracle terms) I think of as self-created materialized views and aggregates that can later be used for query re-write.  Vertica will learn from your queries what it needs to do in the future to improve performance, it'll create projections and these projections use storage.  Since there's columnar compression in Vertica by default, these projections are stored efficiently...and they aren't counted toward your licensed total.  I've heard stories that companies have had so many (200+) that the performance of importing data was hampered...these physically stored objects are updated as data is loaded.  Since projections will take up storage you have to account for that in the early design, but it completely depends on your dataset and access patterns.  Estimates based on other companies I've spoken with are between 0% (everything is deduped) and 50% (their ETL is done in Vertica, so less deduplication), so let's say 35%.

Also, you're strongly recommended to use local storage (raid 1+0...mirrored and striped), and the storage is replicated in multiple nodes for protection.  They call this concept k-safety. The idea is that you can lose "k" nodes, and the database would still continue to run normally.  We would run K=1 (the default).

In order to do a rebalance (needed when you add or remove a node), the documentation suggests you have 40% capacity free.

Also, Vertica expects you to isolate your "catalog" metadata from your actual data, so you need to set up one mirrored raid group with ~150GB for catalog...and OS, etc.  They give an example architecture using HP hardware with servers that have 24 slots for drives.  2 of them are used for mirroring the OS/Catalog, leaving 22 for your actual data.  Knowing SSD's are the future for storage, the systems we worked on are Cisco UCS C-series with 24 slots filled with 100% SSD's.  From the feedback from Vertica, this will help with rebuild times, but not so much with normal processing performance, since so much of Vertica is done in memory.  There's a huge price increase in $/GB between 400 and 800GB drives.

So...if you have 6 nodes with 22 slots, each populated with 400GB SSD's, you have 52,800GB. Half that for raid 1+0=26,400.  If you have an HA architecture, you'd expect to halve that again (3 nodes in datacenter 1, 3 nodes in datacenter 2)...which brings you to 13,200GB.  Since you have to keep at least 40% free for a rebalance operation, that brings you down to 7,920GB.  We have to account for projections...we said they would be 35% of our dataset...which brings us to 5,148GB.  All the data in Vertica is copied to 2 nodes, so halve the storage again....2,574GB.
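The same walk-through as a small sketch, so the numbers can be replayed with a different node count or drive size.  The function name is made up for the example; the percentages are the ones used above:

```shell
# Usable capacity in GB for a cluster, following the steps above.
# Arguments: number of nodes, data drive slots per node, drive size in GB.
usable_gb() {
  local nodes=$1 slots=$2 drive_gb=$3
  local raw=$(( nodes * slots * drive_gb ))     # all data drives
  local raid10=$(( raw / 2 ))                   # raid 1+0 mirroring
  local one_dc=$(( raid10 / 2 ))                # half the nodes per datacenter
  local after_free=$(( one_dc * 60 / 100 ))     # keep 40% free for rebalance
  local after_proj=$(( after_free * 65 / 100 )) # 35% set aside for projections
  echo $(( after_proj / 2 ))                    # every row stored on 2 nodes
}

usable_gb 6 22 400   # prints 2574
usable_gb 6 22 800   # prints 5148
```

Even with 800GB drives this design lands at roughly 5TB, still short of the 10TB requirement.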

Hmmm...2.5TB of storage is less than our 10TB requirement.  I'll show you how we changed the design to double capacity in my next post.

Thursday, January 19, 2017

UDEV updated

In my previous post, I wrote about automating the creation of your Oracle RAC cluster.  One of the more complicated parts to that is using shared VMDK's and configuring UDEV to create aliases so all your RAC nodes have the same name for the same ASM disk, no matter what order udev finds the devices.

Assuming you used the VM create script in the previous post, the prerequisites of:

1.  "disk.EnableUUID"="true"; 
2. Data on SCSI adapter 1, Redo on SCSI adapter 2, and FRA on SCSI adapter 3
3. Node equivalence

...should already be met, and this should just work.  This needs to be executed as root, and should be tested in a sandbox environment until you're confident.  I've used it for years, but like everything on the internet, no warranties or promises implied or otherwise.  It may be found to connect to a secret government computer and play Global Thermonuclear War.

#!/bin/bash
# Name:
# Date: 5/9/2012
# Purpose:  This script will create all the udev rules necessary to support
#        Oracle ASM for RH 5 or RH6.  It will name the aliased devices
#        appropriately for the different failgroups, based on the controller
#        they're assigned to.
# Revisions:
#   5/8/2012  - JAB: Created
#   5/10/2012 - JAB: Will now modify the existing rules to allow the addition of a
#          single new disk.
#   1/8/2013  - JAB: assorted RH6 related issues corrected.
release_test=`lsb_release -r | awk 'BEGIN {FS=" "}{print $2}' | awk 'BEGIN {FS="."}{print $1}'`
echo "Detected RH release ${release_test}"

# Counters for the disks already named in the rules file.
data_disk=0
redo_disk=0
arch_disk=0
new_file="false"

if [ -f "/etc/udev/rules.d/99-oracle-asmdevices.rules" ]; then
  echo -e "Detected a pre-existing asm rules file.  Analyzing...\c"
  for y in {1..50}
  do
    found_data_disk=`cat /etc/udev/rules.d/99-oracle-asmdevices.rules|grep asm-data-disk${y}`
    found_redo_disk=`cat /etc/udev/rules.d/99-oracle-asmdevices.rules|grep asm-redo-disk${y}`
    found_arch_disk=`cat /etc/udev/rules.d/99-oracle-asmdevices.rules|grep asm-arch-disk${y}`
    if [ -n "${found_data_disk}" ]; then
      let "data_disk++"
    fi
    if [ -n "${found_redo_disk}" ]; then
      let "redo_disk++"
    fi
    if [ -n "${found_arch_disk}" ]; then
      let "arch_disk++"
    fi
    echo -e ".\c"
  done
  echo "complete."
  echo "Existing rules file contains:"
  echo " ASM Data Disks: ${data_disk}"
  echo " ASM Redo Disks: ${redo_disk}"
  echo " ASM Arch Disks: ${arch_disk}"
else
  echo "Detected no pre-existing asm udev rules file.  Building..."
  new_file="true"
fi

for x in {a..z}
do
  if [ -n "`ls /dev/sd* | grep sd${x}1 `" ] ; then
    asm_test1=`file -s /dev/sd${x}1 |grep "/dev/sd${x}1: data" `
    asm_test2=`file -s /dev/sd${x}1 |grep "Oracle ASM" `
    #echo "Testing for sd${x}1 complete."
    if [[ -n "${asm_test1}" || -n "${asm_test2}" ]] ; then
      # ie: scsi_device:1:0:1:0
      if [ "${release_test}" = "5" ]; then
        controller=`ls /sys/block/sd${x}/device|grep scsi_device | awk 'BEGIN {FS=":"}{print $2}'`
        result=`/sbin/scsi_id -g -u -s /block/sd${x}`
      elif [ "${release_test}" = "6" ]; then
        controller=`ls /sys/block/sd${x}/device/scsi_device | awk 'BEGIN {FS=":"}{print $1}'`
        result=`/sbin/scsi_id -g -u -d /dev/sd${x}`
      fi
      # Skip disks whose UUID is already in the rules file.
      found_uuid=""
      if [ "${controller}" = "3" ]; then
        if [ -f "/etc/udev/rules.d/99-oracle-asmdevices.rules" ]; then
          found_uuid=`cat /etc/udev/rules.d/99-oracle-asmdevices.rules|grep "$result"`
        fi
        if [[ -z "${found_uuid}" || "${new_file}" = "true" ]]; then
          echo "Detected a new data disk.  Adding rule to /etc/udev/rules.d/99-oracle-asmdevices.rules"
          let "data_disk++"
          if [ "${release_test}" = "5" ]; then
            echo "KERNEL==\"sd?1\", BUS==\"scsi\", PROGRAM==\"/sbin/scsi_id -g -u -s /block/\$parent\", RESULT==\"${result}\", NAME=\"asm-data-disk${data_disk}\", OWNER=\"oracle\", GROUP=\"dba\", MODE=\"0660\"" >> /etc/udev/rules.d/99-oracle-asmdevices.rules
          elif [ "${release_test}" = "6" ]; then
            echo "KERNEL==\"sd?1\", BUS==\"scsi\", PROGRAM==\"/sbin/scsi_id -g -u -d /dev/\$parent\", RESULT==\"${result}\", NAME=\"asm-data-disk${data_disk}\", OWNER=\"oracle\", GROUP=\"dba\", MODE=\"0660\"" >> /etc/udev/rules.d/99-oracle-asmdevices.rules
          fi
        fi
      elif [ "${controller}" = "4" ]; then
        if [ -f "/etc/udev/rules.d/99-oracle-asmdevices.rules" ]; then
          found_uuid=`cat /etc/udev/rules.d/99-oracle-asmdevices.rules|grep "$result"`
        fi
        if [[ -z "${found_uuid}" || "${new_file}" = "true" ]]; then
          echo "Detected a new Redo disk.  Adding rule to /etc/udev/rules.d/99-oracle-asmdevices.rules"
          let "redo_disk++"
          if [ "${release_test}" = "5" ]; then
            echo "KERNEL==\"sd?1\", BUS==\"scsi\", PROGRAM==\"/sbin/scsi_id -g -u -s /block/\$parent\", RESULT==\"${result}\", NAME=\"asm-redo-disk${redo_disk}\", OWNER=\"oracle\", GROUP=\"dba\", MODE=\"0660\"" >> /etc/udev/rules.d/99-oracle-asmdevices.rules
          elif [ "${release_test}" = "6" ]; then
            echo "KERNEL==\"sd?1\", BUS==\"scsi\", PROGRAM==\"/sbin/scsi_id -g -u -d /dev/\$parent\", RESULT==\"${result}\", NAME=\"asm-redo-disk${redo_disk}\", OWNER=\"oracle\", GROUP=\"dba\", MODE=\"0660\"" >> /etc/udev/rules.d/99-oracle-asmdevices.rules
          fi
        fi
      elif [ "${controller}" = "5" ]; then
        if [ -f "/etc/udev/rules.d/99-oracle-asmdevices.rules" ]; then
          found_uuid=`cat /etc/udev/rules.d/99-oracle-asmdevices.rules|grep "$result"`
        fi
        if [[ -z "${found_uuid}" || "${new_file}" = "true" ]]; then
          echo "Detected a new Arch disk.  Adding rule to /etc/udev/rules.d/99-oracle-asmdevices.rules"
          let "arch_disk++"
          if [ "${release_test}" = "5" ]; then
            echo "KERNEL==\"sd?1\", BUS==\"scsi\", PROGRAM==\"/sbin/scsi_id -g -u -s /block/\$parent\", RESULT==\"${result}\", NAME=\"asm-arch-disk${arch_disk}\", OWNER=\"oracle\", GROUP=\"dba\", MODE=\"0660\"" >> /etc/udev/rules.d/99-oracle-asmdevices.rules
          elif [ "${release_test}" = "6" ]; then
            echo "KERNEL==\"sd?1\", BUS==\"scsi\", PROGRAM==\"/sbin/scsi_id -g -u -d /dev/\$parent\", RESULT==\"${result}\", NAME=\"asm-arch-disk${arch_disk}\", OWNER=\"oracle\", GROUP=\"dba\", MODE=\"0660\"" >> /etc/udev/rules.d/99-oracle-asmdevices.rules
          fi
        fi
      fi
    else
      echo "/dev/sd${x} is not an asm disk."
    fi
  fi
done
echo "Complete."

echo "To see the ASM UDEV rules: cat /etc/udev/rules.d/99-oracle-asmdevices.rules"