What's going on in Mesiols work day

Wednesday, December 3, 2008

Get rid of ORA-04031 "cannot allocate XXXX Bytes ..."

Today a customer called telling me he was unable to use his software, because at login he receivces an
ORA-04031 "cannot allocate XXX Bytes ..."

First mind "aah out of shared pool".

First i tried an

But this doesn't solve the problem.

So i decided to dive into the database and take a look what's going on.

The following query determines the available memory for SHARED_POOL_SIZE in Oracle SGA

SQL> select sum (bytes)/1024/1024 from v$sgastat where pool=’shared pool’;

The above query provides the shared pool size in mega bytes.

The following query determines the total used memory by shared_pool in Oracle SGA.

SQL> select sum (bytes)/1024/1024 from v$sgastat where pool=’shared pool’ and name not in (’free memory’);

The total allocated memory for shared_pool in the init.ora or spfile can be detemined by

SQL> select value from v$parameter where name=’shared_pool_size’

I found a customer package consuming 400 megs of shared pool, i stopped the execution by killing the session and the problem was gone.

Tuesday, December 2, 2008

VxVM drop plex/mirror from volume and rename plex

Today i had to do some work on a storage system where new storages where installed and the old one should be unconfigured from the system.

By using

sh#> vxprint -g $DISKGROUP -rtLv $VOLUMENAME

i checked what mirrors (plexes) are available for the volume.

I selected the mirrors located on the old storage an dropped them from the volume with

sh#> vxplex -o rm dis $PLEXNAME

I noticed that the naming convention, after deleting the older plexes, does not look really pretty, because when the plex was created VxVM added a sequentiell number add the end of the plexname, so i decided to rename the plexes.

This was done with:

sh#> vxedit -g $DISKGROUP -p rename $OLD_PLEX_NAME $NEW_PLEX_NAME

After that all looks pretty well.

Rename Oracle temporary tablespace

Today i run into the problem, that a customer has created a temporary file for temp tablespace called ".dbf".

Because it is not possible to rename the file i did the following steps to get a usefull filename:

First i created a new temporary tablespace called temp2

TEMPFILE '/u05/oradata/db_1/data/TEMP2_01.dbf'
SIZE 1024M;

After that i switched the system to the new temp tablespace:


Than i dropped the old temp tablespace


and recreated it:

TEMPFILE '/u05/oradata/db_1/data/TEMP_01.dbf'
SIZE 1024M;

and changed the database back to tablespace named TEMP


and deleted TEMP2


This all works online.

Thursday, November 27, 2008

Linux native multipathing and EMC CX300


a few days ago i got 2 new machines, which should be connected to EMC CX300.

Dell 2950 hardware with QLogic 2432 HBAs.

To retrieve the HBA WWN i did 'cat /sys/class/fc_host/host$1/port_name'

To use failover for more than one HBA use the following entry depending on the hardware vendor:

For QLogic hostbus adapter:
options qla2xxx qlport_down_retry=1 ql2xfailover=0 ql2xretrycounts=5

For Emulex hostbus adapter:
options lpfc lpfc_nodev_tmo=14 lpfc_lun_queue_depth=30 lpfc_discovery_threads=1

in you /etc/modprobe.conf.local

I went to EMC web gui Navisphere and configured a new RAID group.
After that two host groups where created, where the LUN was assigned. This produces and error message in Navisphere telling you about not to add a LUN to different host groups. It can be safely ignored.

I setup the SAN switches zoning and installed Naviagent on the two machines and started the agent. They registered sucessfull and could be found in Navisphere. I assigned each host to it's host group.

I had to setup failover mode on CX300 to 1, otherwise the Linux dmp driver tells the path is faulty.

On the Linux host i created a multipath.conf file and added the local SCSI controller to be blacklisted.
For SuSE 10.1 i used the following:

--------------------/etc/multipath.conf SuSE 10.1 ------------------------------------
## This is the /etc/multipath.conf file recommended for
## EMC storage devices.
## OS : SLES 9 SP3/ SLES 10 SP1
## Arrays : CLARiiON and SYMMETRIX
## The blacklist is the enumeration of all devices that are to be
## excluded from multipath control
devnode_blacklist {
## Replace the wwid with the output of the command
## 'scsi_id -g -u -s /block/[internal scsi disk name]'
## Enumerate the wwid for all internal scsi disks.
## PLEASE CHANGE WWID to fit you needs
wwid 36001ec90e72e4c0010402ffc04201bbc
devnode "^(ram|raw|loop|fd|md|dm-|sr|scd|st)[0-9]*"
devnode "^hd[a-z][[0-9]*]"
devnode "^cciss!c[0-9]d[0-9]*[p[0-9]*]"
devices {
## Device attributes for EMC SYMMETRIX
device {
vendor "EMC "
product "SYMMETRIX"
path_grouping_policy multibus
getuid_callout "/sbin/scsi_id -g -u -s /block/%n"
path_selector "round-robin 0"
features "0"
hardware_handler "0"
failback immediate
## Device attributes for EMC CLARiiON
## Device attributes for EMC CLARiiON
device {
vendor "DGC"
product "*"
path_grouping_policy group_by_prio
getuid_callout "/sbin/scsi_id -g -u -s /block/%n"
prio_callout "/sbin/mpath_prio_emc /dev/%n"
hardware_handler "1 emc"
features "1 queue_if_no_path"
no_path_retry 300
path_checker emc_clariion
failback immediate
------------------------------end /etc/multipath.conf SuSE 10.1---------------------------------
In the blacklist section the 'wwid' entry should be configure to fit your needs, in my case this does exclude internal controllers from being multipathed.

For SuSE 10.2 you may not define the device sections, they are obsolete. Other entries are the same.

After running 'chkconfig boot.multipath on' and 'chkconfig multipathd on' i run 'multipath -d -v2' and found the ouput showing all paths to the devices. 'multipath -v2 ' shows the newly created device /dev/dm-0.

You can view your setup using 'dmsetup -v status ' and 'dmsetup -v ls'.

'dmsetup -v status' shows:

Name: 360060160c11219002cac56035689dd11
Tables present: LIVE
Open count: 4
Event number: 1
Major, minor: 253, 0
Number of targets: 1
UUID: mpath-360060160c11219002cac56035689dd11

0 209715200 multipath 1 0 0 2 1 A 0 2 0 8:32 A 0 8:48 A 0 E 0 2 0 8:16 A 0 8:64
A 0

Also you can view the path status and device with 'multipath -ll'

360060160c11219002cac56035689dd11dm-0 DGC,RAID 5
[size=100G][features=1 queue_if_no_path][hwhandler=1 emc]
\_ round-robin 0 [prio=2][active]
\_ 1:0:1:0 sdc 8:32 [active][ready]
\_ 2:0:0:0 sdd 8:48 [active][ready]
\_ round-robin 0 [prio=0][enabled]
\_ 1:0:0:0 sdb 8:16 [active][ready]
\_ 2:0:1:0 sde 8:64 [active][ready]

This can from this point used like a normal harddisk device which can be partitioned and so on.

Today's Storage Day

Everytime a new thing ;)

I like competitions.

So today i had to do some storage stuff. Setting up two new StorTeks in a SUN Cluster.

Two SUN StorTek 6140 (to replace 2 old T3's and 2 old 6120)
Two E2900 Solaris 9, SUN Cluster 3.0,VxVM 3.5

Both host directly connected to both storages via two HBAs. QLA26xx

Ater running 'devfsadm -Cv', to get the device entries updated, i called 'luxadm probe' to if the LUNs are found correctly on the host.

'vxdmpadm listctlr all' show the new connected controllers.

I enabled the new path with 'vxdmpadm enable $CONTROLLER'.

I labeled the new disks with format and found them shown state online in 'vxdisk list' output.

By using 'vxdisetup -i $DEVICE' i setup the private/public region on the disk.

I added them to the different diskgroups with 'vxdg -g $DISKGROUP adddisk $LOGICALNAME=$DEVICE'.

I began mirroring the old T3/4's 'vxassist -g $DISKGROUP mirror $VOLUME $LOGICALNAME'.
The performance was real okay, the old storages hosts 6 Oracle databases, with 450GB amount of data.
After around 5 hours all works was done.

I had to update the SUN Cluster configuration to recognize changes in diskgroups.
This can be done the following way 'scconf -c -D name=$DISKGROUP,sync' so the volume information of VxVM will be syncronized with SUN Cluster configuration.

After that all was fine.