Friday, June 28, 2013

Adding a disk to ASM and using UDEV?

The project I've been writing about, migrating many single-instance IBM P5 595 AIX databases to 11.2.0.3 RAC on EMC vBlocks, is coming to a close.  I thought there might be value in sharing some of the lessons learned from the experience.  There have been quite a few....

As I was sitting with the DBA team, discussing how well everything had gone and how stable the new environment was, alerts started going off that services, a VIP and a SCAN listener on one of the production RAC nodes had failed over.  Hmm...that's strange.  About 45 seconds later...more alerts came in that the same thing happened on the next node...that happened over and over.  We pored through the clusterware/listener/alert logs and found nothing helpful...only that there was a network issue and clusterware took measures after the fact...nothing to point at the root cause.

Eventually we looked in the OS message log, and found this incident:

May 30 18:05:14 SCOOBY kernel: udev: starting version 147
May 30 18:05:16 SCOOBY ntpd[4312]: Deleting interface #11 eth0:1, 115.18.28.17#123, interface stats: received=0, sent=0, dropped=0, active_time=896510 secs
May 30 18:05:16 SCOOBY ntpd[4312]: Deleting interface #12 eth0:2, 115.18.28.30#123, interface stats: received=0, sent=0, dropped=0, active_time=896510 secs
May 30 18:05:18 SCOOBY ntpd[4312]: Listening on interface #13 eth0:1, 115.18.28.30#123 Enabled
May 30 18:08:21 SCOOBY kernel: ata1: soft resetting link
May 30 18:08:22 SCOOBY kernel: ata1.00: configured for UDMA/33
May 30 18:08:22 SCOOBY kernel: ata1: EH complete
May 30 18:09:55 SCOOBY kernel: sdab: sdab1
May 30 18:10:13 SCOOBY kernel: sdac: sdac1
May 30 18:10:27 SCOOBY kernel: udev: starting version 147

Udev started, ntpd reported the network issue, then udev finished.  Hmm...why did Udev start?  It turns out that the unix team added a disk (which has always been considered safe during business hours) and as part of Oracle's procedure to create the udev rule, they needed to run start_udev.  The first reaction was to declare "adding storage" an "after-hours practice only" from now on...and that would usually be ok...but there are times that emergencies come up and adding storage can't wait until after hours, and must be done online...so we needed a better answer.

The analysis of the issue showed that when the Unix team followed their procedure and ran start_udev, udev deleted the public network interface and re-created it within a few seconds which caused the listener to crash...and of course, clusterware wasn't ok with this.  All the scan listeners and services fled from that node to other nodes.  Without noticing an issue, the unix team proceeded to add the storage to the other nodes causing failovers over and over.  
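
If you want to check whether a node has been hit by the same thing, one option is to look for the pattern from the log excerpt above: a udev restart followed by ntpd dropping an interface.  This is only a rough sketch.  The function name is made up, and /var/log/messages as the default log location is my assumption for a stock Redhat install.

```shell
# Hypothetical helper: list udev restarts and ntpd interface drops in a syslog file.
# The default path /var/log/messages is an assumption; pass another file as $1.
find_udev_net_bounce() {
    log="${1:-/var/log/messages}"
    grep -E 'udev: starting version|ntpd\[[0-9]+\]: Deleting interface' "$log"
}
```

If "Deleting interface" lines show up within a few seconds of a "udev: starting version" line, the node most likely lost its public interface when start_udev ran.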

We opened tickets with Oracle (since we followed their documented process per multiple MOS notes) and Redhat (since they support Udev).  The Oracle ticket didn't really go anywhere...the Redhat ticket said this is normal, expected behavior, which I thought was strange...I've done this probably hundreds of times and never noticed a problem, and I found nothing on MOS that mentions a problem.   RH eventually suggested we add HOTPLUG="NO" to the network configuration files.  After that, when we run start_udev, we don't have the problem, the message log doesn't show the network interface getting dropped and re-created...and everything is good.  We're able to add storage w/o an outage again.
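
For reference, here is roughly what that change looks like.  I'm assuming the public interface is eth0 (as in the log above) and that its settings live in the usual /etc/sysconfig/network-scripts/ifcfg-eth0 file; your interface names and paths may differ, and the rest of the file stays as it is.

```shell
# /etc/sysconfig/network-scripts/ifcfg-eth0   (assumed path and interface name)
# ...existing settings for the interface stay unchanged...
HOTPLUG="NO"
```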


I updated the MOS SR w/Redhat's resolution.  Hopefully this will be mentioned in a future note, or added to RACCHECK, for those of us running Oracle on Redhat 6+, where asmlib is unavailable.

-- UPDATE -- (Thanks for finding this, Dave Jones)

From Oracle, per notes 414897.1, 1528148.1, 371814.1, etc., we're told to use start_udev to activate a new rule and add storage.  From Redhat (https://access.redhat.com/site/solutions/154183) we're told to never manually run start_udev.

Redhat has a better suggestion...you can trigger the udev event without losing your network configuration, and only affect the specific device you're working with, via:

echo change > /sys/block/sdg/sdg1/uevent

I think this is a better option...so...do this instead of start_udev.  I would expect this to become a bigger issue as more people migrate to RH 6+, where asmlib isn't an option.
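
If you add disks often, the echo above is easy to wrap so that a typo in the device name fails loudly instead of silently doing nothing.  A minimal sketch: the function name and the SYSFS_ROOT override are my own additions (the override only exists so you can dry-run it against a scratch directory), not anything from Oracle or Redhat.

```shell
# Hypothetical helper: send a "change" uevent to a single partition only.
# SYSFS_ROOT is an assumed override for dry runs; it defaults to the real /sys.
trigger_change() {
    disk="$1"   # whole-disk name, e.g. sdg
    part="$2"   # partition name, e.g. sdg1
    uevent="${SYSFS_ROOT:-/sys}/block/$disk/$part/uevent"
    if [ ! -e "$uevent" ]; then
        echo "no such uevent file: $uevent" >&2
        return 1
    fi
    # Same write as the one-liner above, just with the path built from the arguments.
    echo change > "$uevent"
}
```

Run as root, trigger_change sdg sdg1 does the same thing as the one-liner, and it has to be repeated on every node that sees the new disk.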