unixadmin.free.fr Handy Unix Plumbing Tips and Tricks

12 Sep 2016

Oracle RAC 11gR2 needs multicast

After migrating an AIX / Oracle RAC cluster to a new datacenter, the DBA could not start Oracle RAC and saw network heartbeat errors.
Root cause: the network team had dropped multicast traffic between the datacenters.

cat $GRID_HOME/log/grac2/cssd/ocssd.log
2016-09-11 21:15:55.296: [    CSSD][382113536]clssnmPollingThread: node 2, orac002 (1) at 90% heartbeat fatal, removal in 2.950 seconds, seedhbimpd 1
2016-09-11 21:15:56.126: [    CSSD][388421376]clssnmvDHBValidateNCopy: node 2, orac002, has a disk HB, but no network HB, DHB has rcfg 269544449, wrtcnt, 547793, LATS 2298
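On a long ocssd.log, a small filter helps to see which nodes still write a disk heartbeat but have lost the network heartbeat. This is only a sketch, not an Oracle tool: the function name nodes_without_network_hb is made up here, and the pattern assumes the message layout of the log line shown above.

```shell
# List each node (number and name) that ocssd.log reports as having a disk
# heartbeat but no network heartbeat. Reads the files given as arguments,
# or standard input when no file is given.
nodes_without_network_hb() {
    grep 'has a disk HB, but no network HB' "$@" |
        sed 's/.*clssnmvDHBValidateNCopy: node \([0-9]*\), \([^,]*\),.*/\1 \2/' |
        sort -u
}
```

Possible usage: nodes_without_network_hb $GRID_HOME/log/grac2/cssd/ocssd.log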

Workaround: run tcpdump on the Oracle interconnect interface, grep for the multicast MAC prefix (01:00:5e), and send the multicast addresses you find to the network team so they can allow them between the datacenters. It worked better after that.

root@orac002:/root #  tcpdump -en -i en0 |  grep "01:00:5e"

21:43:24.678611 76:82:b2:99:ac:0b > 01:00:5e:00:00:fb, ethertype IPv4 (0x0800), length 1202: 192.168.10.217.42424 > 224.0.0.251.42424: UDP, length 1160
21:43:24.678798 76:82:b2:99:ac:0b > 01:00:5e:00:01:00, ethertype IPv4 (0x0800), length 1202: 192.168.10.217.42424 > 230.0.1.0.42424: UDP, length 1160
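The capture above pairs each multicast IP address with a 01:00:5e MAC address. The standard IPv4 mapping (RFC 1112) copies the low 23 bits of the group address behind that prefix, so the MAC to look for can be predicted from the IP. A minimal sketch; the function name mcast_ip_to_mac is only illustrative.

```shell
# Map an IPv4 multicast address to its Ethernet multicast MAC address:
# 01:00:5e followed by the low 23 bits of the IP address.
mcast_ip_to_mac() {
    # Split the dotted quad into the positional parameters $1..$4
    old_ifs=$IFS
    IFS=.
    set -- $1
    IFS=$old_ifs
    # Only 23 bits are mapped, so the top bit of the second octet is cleared
    printf '01:00:5e:%02x:%02x:%02x\n' $(( $2 & 127 )) "$3" "$4"
}

mcast_ip_to_mac 230.0.1.0     # prints 01:00:5e:00:01:00
mcast_ip_to_mac 224.0.0.251   # prints 01:00:5e:00:00:fb
```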

Oracle Doc ID 1212703.1
Bug 9974223 : GRID INFRASTRUCTURE NEEDS MULTICAST COMMUNICATION ON 230.0.1.0 ADDRESSES WORKING

Filed under: AIX, ORACLE
7 Mar 2012

Server running AIX with Oracle RAC reboots itself

SOURCE: Technote T1011228

Problem (Abstract)

Server running AIX with Oracle RAC reboots itself with no warning

Symptom

AIX server shuts down and/or reboots.

A REBOOT_ID is logged in /var/adm/ras/errlog indicating "SYSTEM SHUTDOWN BY USER" although no shutdown or reboot command was issued by any user.

Example error message:

LABEL: REBOOT_ID
IDENTIFIER: 2BFA76F6

Date/Time: Wed Dec 3 08:19:09 2008
Sequence Number: 1447
Machine Id: 0000ABCD1234
Node Id: nodeA
Class: S
Type: TEMP
Resource Name: SYSPROC

Description
SYSTEM SHUTDOWN BY USER

Probable Causes
SYSTEM SHUTDOWN

Detail Data
USER ID
0
0=SOFT IPL 1=HALT 2=TIME REBOOT
0
TIME TO REBOOT (FOR TIMED REBOOT ONLY)
0

Cause

Oracle Real Application Clusters (RAC) is known to reboot the operating system with no warning, depending on how the oprocd daemon is configured.

Environment

AIX with Oracle RAC

Diagnosing the problem

Oracle Real Application Clusters (RAC) typically runs a process called oprocd.

The idea of oprocd is quite straightforward: its goal is to provide I/O fencing. Basically, oprocd sets a timer and then sleeps. If, when it wakes up and gets scheduled onto a CPU again, it sees that more time has passed than the acceptable margin, oprocd decides to reboot the node.

You can check for the oprocd process with the ps command...

# ps -ef | grep oprocd
root 221672 1 0 08:27:44 - 0:00 /u01/crs/oracle/product/10.2.0/crs_1/bin/oprocd run -t 1000 -m 500 -f

The options mean: -t 1000 (wake up every 1000 ms) and -m 500 (tolerate up to 500 ms of delay in the wake-up before rebooting). In other words, if oprocd wakes up more than 1.5 seconds after going to sleep, it forces a reboot.
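The arithmetic can be checked straight from the ps output. The helper below is a rough sketch (the name oprocd_threshold_ms is made up): it picks the -t and -m values out of an oprocd command line and adds them, assuming both are in milliseconds as described above.

```shell
# Print the delay, in milliseconds, after which oprocd would force a reboot:
# the wake-up interval (-t) plus the allowed margin (-m).
# Call it with the oprocd command line as arguments.
oprocd_threshold_ms() {
    t=0
    m=0
    while [ $# -gt 0 ]; do
        case $1 in
            -t) t=${2:-0} ;;   # value follows the flag
            -m) m=${2:-0} ;;
        esac
        shift
    done
    echo $(( t + m ))
}

oprocd_threshold_ms oprocd run -t 1000 -m 500 -f   # prints 1500
```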

Resolving the problem

The timeout and margin values are computed from the diagwait and reboot time settings. Changing them in the init.cssd file is not recommended; use the command 'crsctl set css diagwait' instead.

There is a formula involved in the calculation of the times. For example, if the reboot time is 3 and you submit a diagwait setting of 13 you will get -t 1000 -m 10000.

# crsctl set css diagwait 13 -force

# ps -ef | grep oprocd
root 221672 1 0 08:27:44 - 0:00 /u01/crs/oracle/product/10.2.0/crs_1/bin/oprocd run -t 1000 -m 10000 -f

You can see that the margin has changed to 10000 ms, that is 10 seconds instead of the default 0.5 seconds. This 20-fold increase gives oprocd more time before it decides that the node needs to be rebooted.
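The technote mentions a formula without spelling it out. One relationship that reproduces its single example (diagwait 13, reboot time 3, margin 10000 ms) is margin = (diagwait - reboot time) x 1000. Treat this as an assumption made here, not as the documented formula, and confirm with Oracle Support before relying on it. The function name oprocd_margin_ms is hypothetical.

```shell
# ASSUMPTION: margin_ms = (diagwait - reboottime) * 1000.
# This matches the one example in the text; it is not taken from Oracle docs.
# $1 = diagwait in seconds, $2 = reboot time in seconds
oprocd_margin_ms() {
    echo $(( ($1 - $2) * 1000 ))
}

oprocd_margin_ms 13 3                          # prints 10000
echo $(( $(oprocd_margin_ms 13 3) / 500 ))     # prints 20 (ratio to the 500 ms default)
```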

IBM recommends the customer contact Oracle Support before modifying this value.

IBM and Oracle came to the agreement that a diagwait value of 13 is a suitable value if the best practices are used...

IBM recommends that customers follow best practices and, if possible, update to AIX 6.1 or AIX 7.1 with current Technology Levels, which include the new non-pageable kernel, as the preferred corrective action.

The Oracle master document can be found here...

Filed under: AIX, ORACLE
22 Apr 2011

Changing the Oracle system password impacts Enterprise Manager login

In Oracle 10g, changing the Oracle system passwords impacts Enterprise Manager login.

There is plenty of information about this on the web, but here a locked SYSMAN account also prevented EM from connecting to the database.

Stop Oracle Enterprise Manager

emctl stop dbconsole
emctl status dbconsole

Connect to the database as sysdba and change the system passwords

sqlplus / as sysdba

alter user sys identified by NewPassword ;
alter user system identified by NewPassword ;
alter user dbsnmp identified by NewPassword ;
alter user sysman identified by NewPassword ;

In the file ${ORACLE_HOME}/`hostname`_${ORACLE_SID}/sysman/config/emoms.properties, replace the lines:

oracle.sysman.eml.mntr.emdRepUser=SYSMAN
oracle.sysman.eml.mntr.emdRepPwd=d0355495a68cd5ae
oracle.sysman.eml.mntr.emdRepPwdEncrypted=TRUE

with

oracle.sysman.eml.mntr.emdRepUser=SYSMAN
oracle.sysman.eml.mntr.emdRepPwd=NewPassword
oracle.sysman.eml.mntr.emdRepPwdEncrypted=FALSE
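The same change can be scripted with sed. This is a sketch only: the function name set_emoms_password is invented here, it works on standard input so nothing is modified in place, and it assumes the password contains no |, & or backslash characters (they would be interpreted by sed).

```shell
# Rewrite the SYSMAN repository password lines of emoms.properties.
# $1 = new clear-text password; the file is read on standard input and the
# modified content is written to standard output.
set_emoms_password() {
    sed -e "s|^oracle\.sysman\.eml\.mntr\.emdRepPwd=.*|oracle.sysman.eml.mntr.emdRepPwd=$1|" \
        -e 's|^oracle\.sysman\.eml\.mntr\.emdRepPwdEncrypted=.*|oracle.sysman.eml.mntr.emdRepPwdEncrypted=FALSE|'
}
```

Possible usage: run set_emoms_password NewPassword < emoms.properties > emoms.properties.new, compare the two files, then move the new one into place while dbconsole is stopped.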

Open the file ${ORACLE_HOME}/`hostname`_${ORACLE_SID}/sysman/emd/targets.xml and replace the lines:

<Property NAME="UserName" VALUE="2ed7f792e30adc89" ENCRYPTED="TRUE"/>
<Property NAME="password" VALUE="c8d4082a472b36ae" ENCRYPTED="TRUE"/>

with

<Property NAME="UserName" VALUE="dbsnmp" ENCRYPTED="FALSE"/>
<Property NAME="password" VALUE="NewPassword" ENCRYPTED="FALSE"/>

Before restarting Oracle Enterprise Manager, check whether the SYSMAN account is locked, as in the example below:

SQL> SELECT username, account_status FROM dba_users WHERE username IN ('SYSMAN','DBSNMP');

USERNAME                       ACCOUNT_STATUS
------------------------------ --------------------------------
DBSNMP                         OPEN
SYSMAN                         LOCKED(TIMED)
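If the check is part of a script, the query output can be filtered so that only accounts needing attention are shown. A small sketch, assuming the two-column layout above; the function name em_accounts_not_open is made up.

```shell
# Read the query output on standard input and print the SYSMAN / DBSNMP
# rows whose ACCOUNT_STATUS is anything other than OPEN.
em_accounts_not_open() {
    awk '($1 == "SYSMAN" || $1 == "DBSNMP") && $2 != "OPEN" { print $1, $2 }'
}
```

No output means both accounts are OPEN.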

Unlock SYSMAN account

SQL> alter user sysman identified by NewPassword account unlock;

SQL> SELECT username, account_status FROM dba_users WHERE username IN ('SYSMAN','DBSNMP');


USERNAME                       ACCOUNT_STATUS
------------------------------ --------------------------------
SYSMAN                         OPEN
DBSNMP                         OPEN

Now you can start Oracle Enterprise Manager

emctl start dbconsole

After restarting Oracle Enterprise Manager, check that the new password is now encrypted in the targets.xml and emoms.properties files.

Source
http://fadace.developpez.com/oracle/pwd/

Filed under: ORACLE
17 Mar 2011

How to recognize an ASM disk on AIX

How to recognize an ASM disk on AIX and avoid creating a VG on it.

# lquerypv -h /dev/hdiskpower22

hdiskpower22
00000000   00820101 00000000 80000003 5EDF6D84  |............^.m.|
00000010   0001EFD7 00000000 00000000 00000000  |................|
00000020   4F52434C 4449534B 00000000 00000000  |ORCLDISK........|
00000030   00000000 00000000 00000000 00000000  |................|
00000040   0A100000 00030104 41534D5F 50524F47  |........ASM_PROG|
00000050   5F47524F 55505F30 30303300 00000000  |_GROUP_0003.....|
00000060   00000000 00000000 41534D5F 50524F47  |........ASM_PROG|
00000070   5F47524F 55500000 00000000 00000000  |_GROUP..........|
00000080   00000000 00000000 41534D5F 50524F47  |........ASM_PROG|
00000090   5F47524F 55505F30 30303300 00000000  |_GROUP_0003.....|
000000A0   00000000 00000000 00000000 00000000  |................|
000000B0   00000000 00000000 00000000 00000000  |................|
000000C0   00000000 00000000 01F61B0C DB9AFC00  |................|
000000D0   01F61B0C DB9B0800 02001000 00100000  |................|
000000E0   0001BC80 00002800 00000002 00000001  |......(.........|
000000F0   00000002 00000000 0003FFFF FFFFFFFF  |................|

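ORCL = Oracle: the ORCLDISK tag at offset 0x20 marks the disk as an Oracle ASM disk. The strings that follow are the ASM disk name (ASM_PROG_GROUP_0003) and the disk group name (ASM_PROG_GROUP).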
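Before running mkvg on a disk, the same test can be automated. A minimal sketch, assuming the lquerypv -h output layout shown above (offset in the first column, hex words after it); the function name is_asm_disk is hypothetical, and a disk without the tag is not necessarily free.

```shell
# Read `lquerypv -h <device>` output on standard input and succeed (exit 0)
# when the line at offset 0x20 starts with the ORCLDISK tag
# (4F52434C 4449534B in hex), fail (exit 1) otherwise.
is_asm_disk() {
    awk '$1 == "00000020" && $2 == "4F52434C" && $3 == "4449534B" { found = 1 }
         END { exit !found }'
}
```

Possible usage: lquerypv -h /dev/hdiskpower22 | is_asm_disk && echo "ASM disk: do not create a VG on it"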

Filed under: ORACLE