unixadmin.free.fr just another IBM blog and technotes backup


AIX SCSI-2 Reservation on INFINIDAT

A customer ran into a problem during a disaster recovery test of AIX booting rootvg from SAN (reserve_policy=single_path) on an Infinidat F6130.

Problem: no boot disk is listed in the SMS menu.

Workaround: on the Infinidat storage, unmap the rootvg LUN from the host and map it again.

Fix: Infinidat corrected a firmware bug and also added an internal script for SCSI-2 reservations.

Before the fix, the Infinidat storage handled only SCSI-3 reservations, whereas AIX uses SCSI-2 reservations.

Filed under: AIX

Oracle RAC 11gR2 needs multicast

After an AIX / Oracle RAC migration to a new datacenter, the DBA could not start Oracle RAC and got network heartbeat errors.
Root cause: the network team had dropped multicast between the datacenters.

cat $GRID_HOME/log/grac2/cssd/ocssd.log
2016-09-11 21:15:55.296: [    CSSD][382113536]clssnmPollingThread: node 2, orac002 (1) at 90% heartbeat fatal, removal in 2.950 seconds, seedhbimpd 1
2016-09-11 21:15:56.126: [    CSSD][388421376]clssnmvDHBValidateNCopy: node 2, orac002, has a disk HB, but no network HB, DHB has rcfg 269544449, wrtcnt, 547793, LATS 2298

Workaround: run tcpdump on the Oracle interconnect interface, grep for the multicast MAC prefix, and send the multicast addresses to the network team. It worked much better afterwards.

root@orac002:/root #  tcpdump -en -i en0 |  grep "01:00:5e"

21:43:24.678611 76:82:b2:99:ac:0b > 01:00:5e:00:00:fb, ethertype IPv4 (0x0800), length 1202: > UDP, length 1160
21:43:24.678798 76:82:b2:99:ac:0b > 01:00:5e:00:01:00, ethertype IPv4 (0x0800), length 1202: > UDP, length 1160
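
The destination MACs in a capture like this one can be turned back into candidate group addresses before they go to the network team. This is a minimal sketch, assuming bash; `mac_to_group` is a hypothetical helper name, not an Oracle or AIX tool. An IPv4 multicast MAC carries only the low 23 bits of the group address, so the first octet (224 by default here) is an assumption the caller has to supply.

```shell
# Map an IPv4 multicast MAC (01:00:5e:xx:xx:xx) to one candidate group address.
# Usage: mac_to_group MAC [FIRST_OCTET]   (FIRST_OCTET defaults to 224)
# The MAC does not carry the first octet or the high bit of the second octet,
# so 32 different groups share each MAC.
mac_to_group() {
    local mac first_octet o1 o2 o3 o4 o5 o6
    mac=$(printf '%s' "$1" | tr 'A-F' 'a-f')
    first_octet=${2:-224}
    case "$mac" in
        01:00:5e:[0-9a-f][0-9a-f]:[0-9a-f][0-9a-f]:[0-9a-f][0-9a-f]) ;;
        *) echo "not an IPv4 multicast MAC: $1" >&2; return 1 ;;
    esac
    IFS=: read -r o1 o2 o3 o4 o5 o6 <<< "$mac"
    # Keep only the 23 bits the MAC really carries: mask the top bit of octet 4.
    printf '%d.%d.%d.%d\n' "$first_octet" $(( 0x$o4 & 0x7f )) $(( 0x$o5 )) $(( 0x$o6 ))
}

# mac_to_group 01:00:5e:00:00:fb        # 224.0.0.251
# mac_to_group 01:00:5e:00:01:00 230    # 230.0.1.0
```

Because several groups map to the same MAC, the result is only a candidate; the groups actually used by Grid Infrastructure should be confirmed against Oracle Doc ID 1212703.1.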

Oracle Doc ID 1212703.1

Filed under: AIX, ORACLE

How to check memory and cores activated via CUoD activation codes

Go to the IBM Capacity on Demand site, enter the machine type and serial number, and check the POD and MOD lines.

Ex 1: type 9117, model MMD
POD 53C1340827291F44AAF4000000040041E4 09/27/2015
AAF4 = CCIN = 4.228 GHz core
04 = 4 cores activated

MOD 2A2A7F64BEEEC606821200000032004187
8212 = Feature code = Activation of 1 GB
32 = 32 GB activated

Ex 2: type 8233
POD 80FF07034C0917FA771400000016004166 09/17/2010
7714 = Feature code = 3.0 GHz core
16 = 16 cores activated
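
The two fields read by eye above sit at the same position in all three sample codes, so the lookup can be scripted. This is a rough sketch, assuming bash; `decode_cuod` is a hypothetical helper, and the offsets (characters 17 to 20 for the CCIN or feature code, characters 27 and 28 for the quantity) are inferred from these three examples only, not from an IBM specification.

```shell
# Split a 34-character POD/MOD activation code into the two fields used above.
# Offsets are inferred from the sample codes in this post (assumption).
decode_cuod() {
    local code=$1
    if [ "${#code}" -ne 34 ]; then
        echo "expected 34 characters, got ${#code}" >&2
        return 1
    fi
    # ${code:16:4} = CCIN or feature code, ${code:26:2} = quantity as decimal digits
    printf 'feature=%s quantity=%d\n' "${code:16:4}" "$(( 10#${code:26:2} ))"
}

# decode_cuod 53C1340827291F44AAF4000000040041E4   # feature=AAF4 quantity=4
# decode_cuod 2A2A7F64BEEEC606821200000032004187   # feature=8212 quantity=32
```

The meaning of the feature value (core speed, 1 GB memory activation, and so on) still has to be looked up in the IBM references listed below.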

Source:
Thanks to Mr Delmas.
For CCIN references, check the IBM Knowledge Center.
For feature code references, check the IBM sales manual.


HMC Save Upgrade Data failed

If you want to upgrade an HMC and the Save Upgrade Data step fails with HSCSAVEUPGRDATA_ERROR, check whether the home directory of hscroot or of another hmcsuperadmin user is filled with Virtual I/O Server ISO images. The filesystem /mnt/upgrade is used to store the save upgrade data backup, and it is too small to contain ISO images.

Fix: remove the VIOS ISO images from the HMC and relaunch the saveupgdata command.

Filed under: HMC

vio_daemon consuming high memory on VIOS

This Saturday I updated two dual-VIOS systems with a combined fix pack update.
One VIOS was using a lot of memory (8 GB of computational memory); svmon showed that vio_daemon was using 12 application stack segments, which is absurd. In fact, the customer had modified /etc/security/limits to set stack, data and rss to unlimited for root. Solved by restoring the default values and rebooting the VIOS. See the IBM technote.

Why is vio_daemon consuming high memory on PowerVM Virtual I/O Server (VIOS)?

There is a known vio_daemon memory leak in certain VIOS levels; it was fixed with IV64508.

To check your VIOS level, as padmin, run:
$ ioslevel

If your VIOS level already includes that fix, the problem may be due to values in /etc/security/limits being set to "unlimited" (-1). In particular, an unlimited "stack" size allows the system to pin as much stack as desired, which causes vio_daemon to consume a lot of memory.

$ oem_setup_env

# vi /etc/security/limits ->check the default stanza

        fsize = -1
        core = -1
        cpu = -1
        data = -1
        rss = -1
        stack = -1
        nofiles = -1

In some cases, the issue with vio_daemon consuming high memory is noticed after a VIOS update to 2.2.3.X. However, a VIOS update will NOT change these settings. It is strongly recommended not to modify these default values as doing so is known to cause unpredictable results. Below is an example of the default values:

        fsize = 2097151
        core = 2097151
        cpu = -1
        data = 262144
        rss = 65536
        stack = 65536
        nofiles = 2000

To correct the problem change the settings back to "default" values. Then reboot the VIOS at your earliest convenience.
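
The comparison between the two listings above can be automated with a small check. This is a minimal sketch, assuming bash and awk; `check_limits` is a hypothetical helper name, and the choice to inspect only the default, root and padmin stanzas is an assumption based on the stanzas named in this technote. `cpu = -1` is skipped because it is part of the default values shown above.

```shell
# Report attributes set to -1 (unlimited) in the default, root and padmin
# stanzas of a limits file. Usage: check_limits [FILE]
check_limits() {
    awk '
        /^[ \t]*\*/ { next }                              # lines starting with * are comments
        /^[^ \t].*:[ \t]*$/ { stanza = $1; sub(/:$/, "", stanza); next }
        {
            line = $0
            gsub(/[ \t]/, "", line)                       # "stack = -1" becomes "stack=-1"
            n = split(line, kv, "=")
            watched = (stanza == "default" || stanza == "root" || stanza == "padmin")
            if (watched && n == 2 && kv[2] == "-1" && kv[1] != "cpu")
                printf "%s: %s is unlimited\n", stanza, kv[1]
        }
    ' "${1:-/etc/security/limits}"
}
```

On a VIOS this would be run as root after oem_setup_env; no output means none of the watched stanzas has an unlimited value other than cpu.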

Note 1
If the stack size was added to the root and/or padmin stanzas with unlimited setting, it should be removed prior to rebooting the VIOS.

Note 2
If the clients are not redundant via a second VIOS, a maintenance window should be scheduled to bring the clients down before rebooting the VIOS.

SOURCE: IBM technote


Drive paths to library client taken offline when server option SANDISCOVERY set to ‘YES’

Technote (troubleshooting)


This message in the activity log of the library manager appears: ANR1772E The path from source to destination is taken offline.


On the library client, these messages are observed in the activity log when a library sharing session is opened to the library manager:

ANR1926W A SCSI inquiry has timed out after 15 seconds.
ANR3626W A check condition occurred during a small computer system interface (SCSI) inquiry at Fibre Channel port WWN=<wwn_number> , KEY=00, ASC=00, ASCQ=00.
ANR1786W HBAAPI not able to get adapter name.
ANR8963E Unable to find path to match the serial number defined for drive <DRIVE_NAME> in library <LIBRARY_NAME>.
ANR8873E The path from source <library_client> to destination <drive> (/dev/rmtXYZ) is taken offline.

On the library manager, you can see this corresponding message showing the path to the drive being taken offline:

ANR1772E The path from source to destination <drive> is taken offline.


The SAN discovery's query of the HBA has timed out, and the path is taken offline. This can occur in SAN environments with a large number of devices.

Diagnosing the problem

Verify that there is not an underlying hardware problem causing the drive paths to go offline.

Check the value of the SANDISCOVERYTIMEOUT option on the library clients and the library manager. The default value is 15 seconds.


Resolving the problem

If the value of the option is at or near the default value of 15 seconds, increase to a greater number. For example:

Filed under: TSM

Why Are Tapes with PRIVATE Status Not Found in QUERY VOLUME Output?

Technote (FAQ)


QUERY LIBVOLUME shows tape volumes with status of PRIVATE, but the same volumes do not show up with the command: Q VOL

Why are these tapes PRIVATE?


QUERY VOLUME will only return information about volumes that belong to stgpools, but there are other types of volumes that can have valid data on them: DB backups, exports, backupsets and remote volumes that belong to a Library Client server.

The volume history will keep a record of all volumes and you can display these other types of non-stgpool volumes with the following commands:

q volh type=dbb
q volh type=dbs
q volh type=export
q volh type=backupset
q volh type=remote

If a PRIVATE volume is not part of a stgpool and does not display in any of the above Q VOLH commands then you can set it to scratch using the command:

UPDATE LIBVOL <library_name> <vol_name> STATUS=SCRATCH

If you do have a library sharing environment it is recommended to run an AUDIT LIBRARY on the Library Client servers prior to changing the status of a volume to scratch on the Library Manager server.

Filed under: TSM

Using ‘dd’ to verify Tivoli Storage Manager tape volume labels

How can I use the Unix 'dd' command to verify a tape volume label?

The first step that may be necessary to verify a tape volume label is to find out the block size in use on that tape volume. This parameter is typically set on the physical tape library console interface and will vary between manufacturers so the ideal place to search is on the manufacturer website.
There is a method to manually find out the block size as follows:

On most Unix systems, the 'dd' command will output a message indicating a read failed from a tape drive (and corresponding tape volume) along with insufficient memory message. For example on AIX:

    bash$ dd if=/dev/rmt1 of=/tmp/test.file ibs=32 count=1
    dd: 0511-051 The read failed.
    : There is not enough memory available now.
    0+0 records in.
    0+0 records out.

The 'if' parameter must reference a valid path to a drive that contains the volume you are seeking information about. This volume may be loaded using a utility such as tapeutil or directly from the physical library's console management. The drive should not be in use by Tivoli Storage Manager at the time the command is run and it is recommended to take the drive offline to Tivoli Storage Manager.

The 'ibs' parameter gives the input block size in bytes, unless a 'k' suffix is specified, in which case the value is read as kilobytes. A value of 32 bytes, as in the example above, is a good starting value. If this command returns a memory-related error message, then the value can be doubled.

    bash$ dd if=/dev/rmt1 of=/tmp/test.file ibs=64 count=1
    dd: 0511-051 The read failed.
    : There is not enough memory available now.
    0+0 records in.
    0+0 records out.

The value specified for the block size is still smaller than what is actually on the volume so another error is generated. The value must be increased (each time doubling it) until no error message is reported:

    bash$ dd if=/dev/rmt1 of=/tmp/test.file ibs=128 count=1
    0+1 records in.
    0+1 records out.
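
The doubling procedure above can be wrapped in a loop. This is a minimal sketch, assuming bash; `try_read` and `find_block_size` are hypothetical helper names, and the 1 MiB upper bound is an arbitrary safety cap rather than a value from the technote. The loop stops at the first power of two that reads without error, which is the same value the manual steps arrive at.

```shell
# try_read DEVICE BLOCKSIZE: succeeds when dd can read one block of that size.
try_read() {
    dd if="$1" of=/dev/null ibs="$2" count=1 2>/dev/null
}

# find_block_size DEVICE [MAX]: start at 32 bytes and double until a read works.
find_block_size() {
    local dev=$1 bs=32 max=${2:-1048576}
    while [ "$bs" -le "$max" ]; do
        if try_read "$dev" "$bs"; then
            echo "$bs"
            return 0
        fi
        bs=$(( bs * 2 ))
    done
    echo "no readable block size up to $max bytes on $dev" >&2
    return 1
}

# find_block_size /dev/rmt1
```

As with the manual commands, the drive should not be in use by Tivoli Storage Manager while this runs.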

Once the correct block size has been discovered, the 'dd' command should not generate a memory error when reading from the volume.
Now that the block size is known, the data on the first block of the tape can be dumped to a file:
    dd bs=<block_size> conv=ascii if=<device> of=/tmp/<name>.out

For example:

    bash$ dd bs=128 conv=ascii skip=0 count=1 if=/dev/rmt1 of=/tmp/block1.out
    0+1 records in.
    0+1 records out.

Once the file '/tmp/block1.out' is written, it may be viewed in any text editor or with the cat command:

    bash$ cat /tmp/block1.out

In this case 'VOL1200312' is the label of the tape volume residing in the drive /dev/rmt1.

Source: IBM Technote

Filed under: TSM

NMON Visualizer

For those who use the nmon performance collection tool developed by Nigel Griffiths and available for IBM AIX/VIOS and Linux (Power, x86, x86_64, Mainframe and now ARM (Raspberry Pi)): as you know, the nmon analyser Excel spreadsheet is needed to visualize nmon collection files.

In addition to that tool, I recommend trying NMONVisualizer, an IBM project started by Hunter Presnall. It is an excellent tool for comparing and analyzing nmon files collected from several AIX / Linux systems or VMs.

NMONVisualizer http://nmonvisualizer.github.io/nmonvisualizer/index.html




Nmon analyser

My thanks to you all for a job very well done. :)


How to add a hardware error in AIX errlog

How to generate hardware errors in the AIX errlog. Useful for testing monitoring software or event handling under PowerHA.
See the file /usr/include/sys/errids.h for the errpt LABELs.

echo "EPOW_SUS\nEMULATE\n1\ntexte1\ntexte2" | /usr/lib/ras/ras_logger
echo "SCSI_ERR1\nEMULATE\n1\ntexte1\ntexte2" | /usr/lib/ras/ras_logger
# errpt
0502F666   1013162315 P H EMULATE        ADAPTER ERROR
74533D1A   1013162115 U H EMULATE        LOSS OF ELECTRICAL POWER
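
When a monitoring test has to confirm that the injected entry is recent, the TIMESTAMP column of errpt (format MMDDhhmmYY) can be converted to a readable date. This is a small sketch, assuming bash; `errpt_time` is a hypothetical helper, and prefixing the two-digit year with "20" is an assumption.

```shell
# Convert an errpt TIMESTAMP (MMDDhhmmYY) to "20YY-MM-DD hh:mm".
errpt_time() {
    local ts=$1
    if [ "${#ts}" -ne 10 ]; then
        echo "expected 10 digits: $ts" >&2
        return 1
    fi
    printf '20%s-%s-%s %s:%s\n' "${ts:8:2}" "${ts:0:2}" "${ts:2:2}" "${ts:4:2}" "${ts:6:2}"
}

# errpt_time 1013162315   # 2015-10-13 16:23
```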
Filed under: AIX, HACMP