Repair an AIX IPL hang at LED value 518
To repair an AIX server stuck on LED 518, you can follow the IBM Technote:
http://www-01.ibm.com/support/docview.wss?uid=isg3T1000131
In my case that was not enough, because the corruption affected hd2 in three places:
- the hd2 LVCB was corrupted
- the /etc/filesystems entry for hd2 was corrupted
- the ODM was corrupted in CuAt
Boot into Maintenance mode from an AIX DVD, NIM, or a mksysb (tape, DVD, ISO).
Choose option 3, then "Access a Root Volume Group", and identify the disk that holds the system LVs (hd4, hd2, ...).
Installation and Maintenance
Type the number of your choice and press Enter. Choice is indicated by >>>.
1 Start Install Now with Default Settings
2 Change/Show Installation Settings and Install
=> 3 Start Maintenance Mode for System Recovery
4 Make Additional Disks Available
5 Select Storage Adapters
Maintenance
Type the number of your choice and press Enter.
=> 1 Access a Root Volume Group
Type the number of your choice and press Enter.
0 Continue
Access a Root Volume Group
Type the number for a volume group to display the logical volume information
and press Enter.
1) Volume Group 00c8502e00004c0000000145e1f142ed contains these disks:
hdisk0 10240 vscsi
Volume Group Information
------------------------------------------------------------------------------
Volume Group ID 00c8502e00004c0000000145e1f142ed includes the following
logical volumes:
hd5 hd6 hd8 hd4 hd2 hd9var
hd3 hd1 hd10opt hd11admin livedump
------------------------------------------------------------------------------
Choose option 2 (Access this Volume Group and start a shell before mounting filesystems).
1) Access this Volume Group and start a shell
=> 2) Access this Volume Group and start a shell before mounting filesystems
While the rootvg VG is being imported, an unusual message appears.
rootvg
Could not find "/" and/or "/usr" filesystems.
Exiting to shell.
Check the filesystems and format the log device.
# fsck -y /dev/hd2
# fsck -y /dev/hd9var
# fsck -y /dev/hd3
# fsck -y /dev/hd1
# logform /dev/hd8
logform: destroy /dev/rhd8 (y)?y
Display the contents of the "Logical Volume Control Block" (LVCB) of the LVs.
The label in the LVCB of hd2 turns out to be corrupted.
# getlvcb hd2 -AT
AIX LVCB
intrapolicy = c
copies = 1
interpolicy = m
lvid = 00c8502e00004c0000000145e1f142ed.5
lvname = hd2
label = /usr/!+or
machine id = 8502E4C00
number lps = 165
relocatable = y
strict = y
stripe width = 0
stripe size in exponent = 0
type = jfs2
upperbound = 32
fs = vfs=jfs2:log=/dev/hd8
time created = Fri May 9 17:04:23 2014
time modified = Tue Nov 4 13:19:18 2014
Fix the hd2 label with the putlvcb command, then verify.
# getlvcb hd2 -AT
AIX LVCB
intrapolicy = c
copies = 1
interpolicy = m
lvid = 00c8502e00004c0000000145e1f142ed.5
lvname = hd2
label = /usr
machine id = 8502E4C00
number lps = 165
relocatable = y
strict = y
stripe width = 0
stripe size in exponent = 0
type = jfs2
upperbound = 32
fs = vfs=jfs2:log=/dev/hd8
time created = Fri May 9 17:04:23 2014
time modified = Tue Nov 4 13:23:00 2014
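The putlvcb call itself does not appear in the listing above. As a minimal sketch, assuming that the -L flag of putlvcb sets the label field of the LVCB, the fix and its verification could look like this. The function name fix_lvcb_label is hypothetical, and on a system without putlvcb it only prints the command it would run.

```shell
# Sketch only. Assumption: `putlvcb -L <label> <lv>` rewrites the LVCB label field.
fix_lvcb_label() {
    lv=$1
    label=$2
    if command -v putlvcb >/dev/null 2>&1; then
        # On AIX: write the label, then display the LVCB again to verify it.
        putlvcb -L "$label" "$lv" && getlvcb "$lv" -AT
    else
        # Not on AIX: just print what would be run.
        echo "putlvcb -L $label $lv"
    fi
}

fix_lvcb_label hd2 /usr
```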
At this point you cannot chroot into the rootvg system disk, because the VG was imported with a corrupted value for hd2 (/usr). You have to reboot into Maintenance mode.
Choose option 2 and check that the previous error no longer appears.
1) Access this Volume Group and start a shell
2) Access this Volume Group and start a shell before mounting filesystems
99) Previous Menu
Choice [99]: 2
Importing Volume Group...
rootvg
Checking the / filesystem.
The current volume is: /dev/hd4
Primary superblock is valid.
Checking the /usr filesystem.
The current volume is: /dev/hd2
Primary superblock is valid.
Exit from this shell to continue the process of accessing the root
volume group.
To chroot into the rootvg disk and mount the filesystems, type "exit".
# df
Filesystem 512-blocks Free %Used Iused %Iused Mounted on
/dev/ram0 720896 319192 56% 11797 22% /
/proc 720896 319192 56% 11797 22% /proc
/dev/cd0 - - - - - /SPOT
/dev/hd4 720896 319192 56% 11797 22% /
/dev/hd2 5406720 638528 89% 54526 39% /usr
/dev/hd3 294912 219096 26% 88 1% /tmp
/dev/hd9var 1015808 303160 71% 8987 17% /var
/dev/hd10opt 1015808 484192 53% 8860 13% /opt
The /usr label still shows up as corrupted:
# lsvg -l rootvg
rootvg:
LV NAME TYPE LPs PPs PVs LV STATE MOUNT POINT
hd5 boot 2 2 1 closed/syncd N/A
hd6 paging 32 32 1 open/syncd N/A
hd8 jfs2log 1 1 1 open/syncd N/A
hd4 jfs2 22 22 1 open/syncd /
hd2 jfs2 165 165 1 open/syncd /usr/!+or
hd9var jfs2 31 31 1 open/syncd /var
hd3 jfs2 9 9 1 open/syncd /tmp
hd1 jfs2 1 1 1 closed/syncd /home
hd10opt jfs2 31 31 1 open/syncd /opt
hd11admin jfs2 8 8 1 closed/syncd /admin
livedump jfs2 16 16 1 closed/syncd /var/adm/ras/livedump
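A corrupted label like this one can also be spotted mechanically. The following is a rough sketch, not part of the original procedure: it filters an LV listing in the format shown above and reports mount points containing characters outside a conservative set. The function name suspicious_mounts and the allowed character set are assumptions.

```shell
# Reads an LV listing (same layout as above) on stdin and prints the LV name
# and mount point of every entry whose mount point looks suspicious.
# Assumption: a sane mount point is N/A or matches ^/[A-Za-z0-9_./-]*$
suspicious_mounts() {
    awk 'NR > 2 && $NF != "N/A" && $NF !~ "^/[A-Za-z0-9_./-]*$" { print $1, $NF }'
}

# Sample input taken from the listing above (first two lines are headers).
printf '%s\n' \
    'rootvg:' \
    'LV NAME TYPE LPs PPs PVs LV STATE MOUNT POINT' \
    'hd5 boot 2 2 1 closed/syncd N/A' \
    'hd4 jfs2 22 22 1 open/syncd /' \
    'hd2 jfs2 165 165 1 open/syncd /usr/!+or' \
    'hd9var jfs2 31 31 1 open/syncd /var' | suspicious_mounts
```

On AIX the same filter could be fed directly from the listing command, for example by piping its output into suspicious_mounts.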
# grep -p hd2 /etc/filesystems
/usr/!+or:
dev = /dev/hd2
vfs = jfs2
log = /dev/hd8
mount = automatic
check = false
type = bootfs
vol = /usr
free = false
# odmget -q 'name=hd2 and attribute=label' CuAt
CuAt:
name = "hd2"
attribute = "label"
value = "/usr/!+or"
type = "R"
generic = "DU"
rep = "s"
nls_index = 640
Fix the corruption in the ODM and in the /etc/filesystems file.
Export the corrupted ODM value (hd2 + label) to a file, then edit and correct the file.
# export TERM=vt320
# export VISUAL=vi
# set -o vi
# vi /tmp/odm
CuAt:
name = "hd2"
attribute = "label"
value = "/usr"
type = "R"
generic = "DU"
rep = "s"
nls_index = 640
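The export and edit steps can also be scripted instead of done by hand in vi. The sketch below is an illustration under assumptions: the export is presumed to be the earlier odmget query redirected to /tmp/odm, and fix_odm_value is a hypothetical helper that rewrites only the value line of the exported stanza.

```shell
# Hypothetical helper: set the value line of an exported ODM stanza file.
# $1 = stanza file, $2 = correct value
fix_odm_value() {
    # Keep the indentation and the "value = " prefix, replace the quoted string.
    sed "s|^\([[:space:]]*value = \)\".*\"|\1\"$2\"|" "$1" > "$1.new" &&
        mv "$1.new" "$1"
}

# On AIX the file would come from the export step (assumed form):
#   odmget -q 'name=hd2 and attribute=label' CuAt > /tmp/odm
#   fix_odm_value /tmp/odm /usr
```

Writing to a temporary file and moving it into place avoids relying on in-place editing options that may not exist in every sed.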
Back up the CuAt ODM class, then delete the corrupted value from it.
# odmdelete -q 'name=hd2 and attribute=label' -o CuAt
0518-307 odmdelete: 1 objects deleted.
Add the new value from the file and check the CuAt ODM class.
# odmget -q 'name=hd2 and attribute=label' CuAt
CuAt:
name = "hd2"
attribute = "label"
value = "/usr"
type = "R"
generic = "DU"
rep = "s"
nls_index = 640
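The backup and the re-insertion commands are not shown in the listing above. As a hedged sketch of the whole sequence: the backup is assumed to be a plain copy of the CuAt files from /etc/objrepos to /tmp, and the re-insertion is assumed to be done with odmadd on the corrected file. The helper names run and replace_cuat_label are hypothetical, and the script is a dry run by default, so it only prints the commands.

```shell
# Dry run by default: print each command instead of executing it.
# Set DRY_RUN=0 on the AIX system to actually run them.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

replace_cuat_label() {
    stanza=$1   # corrected stanza file, e.g. /tmp/odm
    # Assumed backup method: copy the class files before touching them.
    run cp -p /etc/objrepos/CuAt /etc/objrepos/CuAt.vc /tmp/
    # Delete the corrupted object, as in the listing above.
    run odmdelete -q 'name=hd2 and attribute=label' -o CuAt
    # Assumed re-insertion step: load the corrected stanza.
    run odmadd "$stanza"
    # Verify the result.
    run odmget -q 'name=hd2 and attribute=label' CuAt
}

replace_cuat_label /tmp/odm
```

Note that the dry-run output drops the quoting around the -q criteria; the commands themselves receive it as a single argument.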
Save the ODM to the boot logical volume (hd5) with savebase.
# savebase -v
saving to '/dev/hd5'
47 CuDv objects to be saved
120 CuAt objects to be saved
14 CuDep objects to be saved
8 CuVPD objects to be saved
356 CuDvDr objects to be saved
2 CuPath objects to be saved
0 CuPathAt objects to be saved
0 CuData objects to be saved
0 CuAtDef objects to be saved
Number of bytes of data to save = 19005
Compressing data
Compressed data size is = 6850
bi_start = 0x3600
bi_size = 0x1b20000
bd_size = 0x1b00000
ram FS start = 0x9363b0
ram FS size = 0x114bc17
sba_start = 0x1b03600
sba_size = 0x20000
sbd_size = 0x1ac6
Checking boot image size:
new save base byte cnt = 0x1ac6
Wrote 6854 bytes
Successful completion
Edit /etc/filesystems to fix the /usr entry, then check the result.
# vi /etc/filesystems
# grep -p hd2 /etc/filesystems
/usr:
dev = /dev/hd2
vfs = jfs2
log = /dev/hd8
mount = automatic
check = false
type = bootfs
vol = /usr
free = false
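For reference, the same stanza fix can be done without an interactive editor. This is only a sketch: fix_stanza_name is a hypothetical helper that renames one stanza header by exact string match and writes the result to standard output, so the original file is left untouched until the output has been reviewed.

```shell
# Hypothetical helper: rename a stanza header in a filesystems-style file.
# $1 = file, $2 = corrupted stanza name, $3 = correct stanza name
fix_stanza_name() {
    # Exact string comparison, so characters like "+" need no escaping.
    awk -v bad="$2:" -v good="$3:" \
        '$0 == bad { print good; next } { print }' "$1"
}

# Possible usage on the repaired system (review the result before copying it back):
#   fix_stanza_name /etc/filesystems '/usr/!+or' /usr > /tmp/filesystems.new
```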
Finally, run sync to flush memory to the filesystems, then reboot.