TSM « unixadmin.free.fr

16 Aug 2014

Potential problems when the Tivoli Storage Manager server processes files greater than the MAXFRAGMENTSIZE setting

Abstract

Various problems can occur while processing files larger than the MAXFRAGMENTSIZE option with a Tivoli Storage Manager V7.1.0 server. The default value for this option is 10G (10000M). The affected operations include Client Node Restore or Retrieve, Node Replication, Export, LAN-free Restore and Storage Pool Backup.

Content

Releases affected:
Tivoli Storage Manager V7.1.0.0xx server for all platforms
All other releases and levels are not affected

Required Conditions:

CLIENT NODE RESTORE OR RETRIEVE
Client node restore or retrieve can fail if the files are larger than the MAXFRAGMENTSIZE setting and reside in a random-access DISK storage pool on a Tivoli Storage Manager V7.1.0 server. The default value for this option is 10G (10000M).

The Tivoli Storage Manager server reports the following error:
ANR0548W Retrieve or restore failed for session <session number> for node <node name> (<client platform>) processing file space <filespace filespace id> for file <file name> stored as <storage repository> - error detected.

The Tivoli Storage Manager client reports the following error:
ANS4035W File <file name> currently unavailable on server.

Issue the following SQL statements to determine whether any fragmented files are stored in random-access DISK storage pools. Issue the commands as the instance user:

db2 connect to tsmdb1

For backup objects:
db2 "select bfsa.objid,nodename, cast( fsname as varchar(128) for sbcs data) as fsname, cast( hl_name as varchar(128) for sbcs data) as Hl_name, cast( ll_name as varchar(128) for sbcs data) as ll_name from tsmdb1.bf_super_aggregates bfsa left join tsmdb1.df_bitfiles dfbf on ( dfbf.srvid=0 and dfbf.bfid=bfsa.bfid ) left join tsmdb1.backup_objects imbk on ( imbk.objid=bfsa.objid ) left join tsmdb1.nodes nd on ( nd.nodeid=imbk.nodeid ) left join tsmdb1.filespaces fs on ( fs.nodeid=imbk.nodeid and fs.fsid=imbk.fsid ) where bfsa.sequence=0 and imbk.nodeid is not null group by (bfsa.objid,nodename,fsname,imbk.hl_name,ll_name)"

For archive objects:
db2 "select bfsa.objid,nodename, cast( fsname as varchar(128) for sbcs data) as fsname, cast( hl_name as varchar(128) for sbcs data) as Hl_name, cast( ll_name as varchar(128) for sbcs data) as ll_name from tsmdb1.bf_super_aggregates bfsa left join tsmdb1.df_bitfiles dfbf on ( dfbf.srvid=0 and dfbf.bfid=bfsa.bfid ) left join tsmdb1.archive_objects imar on ( imar.objid=bfsa.objid ) left join tsmdb1.nodes nd on ( nd.nodeid=imar.nodeid ) left join tsmdb1.filespaces fs on ( fs.nodeid=imar.nodeid and fs.fsid=imar.fsid ) where bfsa.sequence=0 and imar.nodeid is not null"

For space managed objects:
db2 "select bfsa.objid,nodename, cast( fsname as varchar(128) for sbcs data) as fsname, cast( alias as varchar(128) for sbcs data) as alias from tsmdb1.bf_super_aggregates bfsa left join tsmdb1.df_bitfiles dfbf on ( dfbf.srvid=0 and dfbf.bfid=bfsa.bfid ) left join tsmdb1.spaceman_objects imsm on ( imsm.objid=bfsa.objid ) left join tsmdb1.nodes nd on ( nd.nodeid=imsm.nodeid ) left join tsmdb1.filespaces fs on ( fs.nodeid=imsm.nodeid and fs.fsid=imsm.fsid ) where bfsa.sequence=0 and imsm.nodeid is not null"

Move the file from the random-access DISK storage pool to a sequential access storage pool. You can use the MOVE DATA or MOVE NODEDATA commands. Issue the restore or retrieve again after the move is complete.

This problem is reported by APAR IT03455. Apply the fixing level when available. These problems are fixed in Tivoli Storage Manager server level 7.1.0.100 and above for all server platforms.

NODE REPLICATION
When processing files that are larger than the MAXFRAGMENTSIZE setting, node replication sends only the first fragment of a file that is replicated if all of the following conditions are met:
The source server's storage pool is data-deduplication enabled.
The source object has gone through identify and reclamation processing.
The target server's storage pool is data-deduplication enabled.
The file is larger than the MAXFRAGMENTSIZE setting.

Files that are larger than the MAXFRAGMENTSIZE setting and the corresponding fragments are intact on the source server but are not replicated in their entirety to the target server.

After both the source and target servers are upgraded to V7.1.1.0, the Tivoli Storage Manager servers will automatically fix this problem during the next node replication process. Only the source server is required to be updated to the level that contains the fix; however, we recommend both the source and the target servers be upgraded.

To identify and resolve the issue at the 7.1.0.xxx level, issue the following SQL statements on the source server to determine whether any files are affected by this problem. Issue the commands as the instance user:

db2 connect to tsmdb1

For backup objects:
db2 "select distinct nrro.repl_objid from tsmdb1.bf_super_aggregates bfsa left join tsmdb1.backup_objects imbk on (bfsa.objid = imbk.objid AND bfsa.bfid != bfsa.fragid) left join tsmdb1.replicated_objects nrro on (nrro.nodeid = imbk.nodeid AND nrro.fsid = imbk.fsid AND nrro.objid = imbk.objid) where nrro.repl_objid IS NOT NULL"

For archive objects:
db2 "select distinct nrro.repl_objid from tsmdb1.bf_super_aggregates bfsa left join tsmdb1.archive_objects imbk on (bfsa.objid = imbk.objid AND bfsa.bfid != bfsa.fragid) left join tsmdb1.replicated_objects nrro on (nrro.nodeid = imbk.nodeid AND nrro.fsid = imbk.fsid AND nrro.objid = imbk.objid) where nrro.repl_objid IS NOT NULL"

For space managed objects:
db2 "select distinct nrro.repl_objid from tsmdb1.bf_super_aggregates bfsa left join tsmdb1.spaceman_objects imbk on (bfsa.objid = imbk.objid AND bfsa.bfid != bfsa.fragid) left join tsmdb1.replicated_objects nrro on (nrro.nodeid = imbk.nodeid AND nrro.fsid = imbk.fsid AND nrro.objid = imbk.objid) where nrro.repl_objid IS NOT NULL"

The fix in 7.1.0.100 does not repair objects that are already replicated. To repair them, issue the following command on the target replication server for each object that was returned by the previous commands. The objects are deleted and then replicated again during the next node replication process:

DELETE OBJECT <objectid> FORCE=YES
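
When the queries return many object IDs, typing each DELETE OBJECT command by hand is error-prone. The following shell sketch is not part of the original technote: it reads object IDs, one per line, and prints the matching commands so that they can be reviewed first. The function name gen_delete_object_cmds and the input file name in the usage note are hypothetical, and nothing is sent to the server.

```shell
# Sketch: turn a list of object IDs (one per line on stdin) into
# DELETE OBJECT commands. The commands are only printed, never executed.
gen_delete_object_cmds() {
    while read -r objid; do
        # Skip blank lines that may be left over from copied query output.
        [ -n "$objid" ] || continue
        printf 'DELETE OBJECT %s FORCE=YES\n' "$objid"
    done
}
```

For example, gen_delete_object_cmds < repl_objids.txt prints one command per ID. Review the output before issuing the commands on the target replication server.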

This problem is reported by APAR IT00679.

EXPORT
The EXPORT command will only process the first fragment of a file if all the following conditions are met:
The object is stored in a data-deduplication enabled storage pool.
The object has gone through identify and reclamation processing.
After the reclamation processing, a fragment for the file is smaller than 100M in size and this file is included in an EXPORT process.

If any files that meet the listed conditions are exported and then imported to a target server, only the first fragment is stored. You must delete these files on the target server. Complete the following steps:

1) Issue the following SQL statements on the source server to determine whether there are files that will have this problem if they are exported. Issue the following commands as the instance user:

db2 connect to tsmdb1

For backup objects:
db2 "select bfsa.objid,nodename, cast( fsname as varchar(128) for sbcs data) as fsname, cast( hl_name as varchar(128) for sbcs data) as Hl_name, cast( ll_name as varchar(128) for sbcs data) as ll_name from tsmdb1.bf_super_aggregates bfsa left join tsmdb1.af_bitfiles afbf on ( afbf.srvid=0 and afbf.bfid=bfsa.bfid ) left join tsmdb1.backup_objects imbk on ( imbk.objid=bfsa.objid ) left join tsmdb1.nodes nd on ( nd.nodeid=imbk.nodeid ) left join tsmdb1.filespaces fs on ( fs.nodeid=imbk.nodeid and fs.fsid=imbk.fsid ) where afbf.size < 100000000 and bfsa.sequence=0 and imbk.nodeid is not null group by (bfsa.objid,nodename,fsname,imbk.hl_name,ll_name)"

For archive objects:
db2 "select bfsa.objid,nodename, cast( fsname as varchar(128) for sbcs data) as fsname, cast( hl_name as varchar(128) for sbcs data) as Hl_name, cast( ll_name as varchar(128) for sbcs data) as ll_name from tsmdb1.bf_super_aggregates bfsa left join tsmdb1.af_bitfiles afbf on ( afbf.srvid=0 and afbf.bfid=bfsa.bfid ) left join tsmdb1.archive_objects imar on ( imar.objid=bfsa.objid ) left join tsmdb1.nodes nd on ( nd.nodeid=imar.nodeid ) left join tsmdb1.filespaces fs on ( fs.nodeid=imar.nodeid and fs.fsid=imar.fsid ) where afbf.size < 100000000 and bfsa.sequence=0 and imar.nodeid is not null"

For space managed objects:
db2 "select bfsa.objid,nodename, cast( fsname as varchar(128) for sbcs data) as fsname, cast( alias as varchar(128) for sbcs data) as alias from tsmdb1.bf_super_aggregates bfsa left join tsmdb1.af_bitfiles afbf on ( afbf.srvid=0 and afbf.bfid=bfsa.bfid ) left join tsmdb1.spaceman_objects imsm on ( imsm.objid=bfsa.objid ) left join tsmdb1.nodes nd on ( nd.nodeid=imsm.nodeid ) left join tsmdb1.filespaces fs on ( fs.nodeid=imsm.nodeid and fs.fsid=imsm.fsid ) where afbf.size < 100000000 and bfsa.sequence=0 and imsm.nodeid is not null"

The output displays the objId, nodename, fsname, hl_name and ll_name for each file that can cause a problem if it is exported. Using the ObjId value, issue the following command and replace obj_Id with the ObjId value:

SHOW INVO obj_Id

The output from the SHOW command displays the date that the file was 'Inserted'. Record the date for each objId so that you can use it to ensure that you are deleting the correct file on the target server.

2) Using the output from the SQL statement on the source server, you must determine whether any of the files listed have been improperly exported and then imported to the target server. If so, these files need to be deleted on the target server because they are missing fragments. The following steps are all issued on the target server.

Issue the following commands to determine whether any of the improperly exported files exist on the target server. Complete the following steps for each file that is displayed by the source server SQL statement:

For backup objects:
Issue the following command by using the source server SQL output and replace the nodename, fsname, hl_name, and ll_name with the values that are displayed:

SHOW VERSIONS nodename fsname hl_name ll_name

Note: for Windows objects, add the following parameter: nametype=unicode

If no output is displayed, then the object was not imported or it has expired. If the command produces output, find the entry that matches the 'Inserted' date and note the 'ObjId' value. Issue the following command to delete the file, replacing obj_id with the ObjId value from the SHOW VERSIONS command:

DELETE OBJECT <obj_id> FORCE=YES

For archive objects:
Issue the following command by using the source server SQL output and replace the nodename, fsname, hl_name, and ll_name with the values that are displayed:

SHOW ARCHIVE nodename fsname hl_name ll_name

Note: for Windows objects, add the following parameter: nametype=unicode

DELETE OBJECT <obj_id> FORCE=YES

For space managed objects:
Issue the following command by using the source server SQL output and replace the nodename, fsname with the values that are displayed:

SHOW SPACEMG nodename fsname

DELETE OBJECT <obj_id> FORCE=YES

Replacing deleted objects on the target server
When the fixing level has been applied on the source server (applying it to both servers is recommended), complete the export for the nodes and file spaces of the deleted files. Then complete the import by using the MERGEFILESPACES=YES parameter. The parameter causes files that were deleted, and any new files, to be stored. Files that already exist in the file space are skipped.

It is possible that some of the deleted files will not be available to be exported and imported. Expiration or administrator actions such as DELETE VOLUME can cause the files to be deleted from the source server. If this occurs, the deleted file cannot be recovered.

This problem is reported by APAR IT02849.

LAN-free RESTORE
LAN-free restore restores only the first fragment of a file if the storage agent level is lower than 7.1.0.0. For this problem to occur, the file must have been backed up over the LAN and then restored by using LAN-free data movement.

After you apply the fix for this problem, a no-query restore from a down-level storage agent goes over the LAN for any volume that contains a fragment. For a classic restore over the SAN, a file that contains a fragment causes the restore to fail with the following message:
ANR4629E Attempting to restore a fragmented file through a earlier-level storage agent, the file will not be restored: Node XXXX, Type Backup, File space XXXXXXXX, File name XXXXXX.

This problem is reported by APAR IT02547.

This problem can be avoided by upgrading the storage agent to level 7.1.0.0 or later.

STORAGE POOL BACKUP
Storage pool backup might not copy all of the file data to the copy storage pool. The following SELECT statements can be used to determine the files that are affected. Issue the following commands as the instance user:

db2 connect to tsmdb1

IT02571:
db2 "select distinct objid, fragid, bfid from tsmdb1.BF_Super_Aggregates where objid in (select distinct sa1.objid from tsmdb1.BF_Super_Aggregates sa1 left join tsmdb1.BF_Super_Aggregates sa2 on (sa1.OBJID = sa2.OBJID and sa1.FRAGID = sa2.FRAGID and sa2.POOLID < 0) where sa1.POOLID > 0 and (sa2.OBJID is NULL or sa2.PENDINGID is not NULL)) and poolid < 0 and PENDINGID is NULL"

IT02717:
db2 "select afvs.BFID from tsmdb1.af_backup_optimization afbo, tsmdb1.af_vol_segments afvs where afbo.VOLID=afvs.VOLID and afbo.START > afvs.START and afvs.bfid in (select distinct sa1.bfid from tsmdb1.BF_Super_Aggregates sa1 left join tsmdb1.BF_Super_Aggregates sa2 on (sa1.OBJID = sa2.OBJID and sa1.FRAGID = sa2.FRAGID and sa2.POOLID < 0) where sa1.POOLID > 0 and (sa2.OBJID is NULL or sa2.PENDINGID is not NULL) and sa1.SEQUENCE=0)"

For the problem reported by IT02717 the following process can be used to allow the next storage pool backup attempt to process the files again:

1. Issue the following db2 command to obtain copy pool ids:
db2 "SELECT POOLNAME, POOLID FROM TSMDB1.SS_POOLS"

2. Replace the "copy_pool_id" in the following SQL statement with a copy pool id from the previous SQL output. If you have more than one copy pool, issue this step for each copy pool:

db2 "delete from tsmdb1.af_backup_optimization where COPYPOOLID= copy_pool_id and VOLID in (select afbo.VOLID from tsmdb1.af_backup_optimization afbo, tsmdb1.af_vol_segments afvs where afbo.VOLID=afvs.VOLID and afbo.START > afvs.START and afvs.bfid in (select distinct sa1.bfid from tsmdb1.BF_Super_Aggregates sa1 left join tsmdb1.BF_Super_Aggregates sa2 on (sa1.OBJID = sa2.OBJID and sa1.FRAGID = sa2.FRAGID and sa2.POOLID < 0) where sa1.POOLID > 0 and (sa2.OBJID is NULL or sa2.PENDINGID is not NULL) and sa1.SEQUENCE=0) and afbo.COPYPOOLID= copy_pool_id)"

The next storage pool backup can take longer to complete since all of the files on the affected primary storage pool volumes need to be verified to determine whether the data exists in the copy storage pool.
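
Because the statement in step 2 contains the placeholder copy_pool_id twice, a small helper can reduce copy-and-paste mistakes when several copy pools are involved. The sketch below is an illustration only and is not part of the original technote. It assumes the statement has been saved to a text file, and it prints the statement with every placeholder replaced; it does not run db2. The function name fill_copy_pool_id is hypothetical.

```shell
# Sketch: print the SQL statement stored in file $1 with every
# "copy_pool_id" placeholder replaced by the pool ID given as $2.
fill_copy_pool_id() {
    # Accept only an integer, optionally with a leading minus sign,
    # so that a stray word cannot end up inside the statement.
    case $2 in
        ''|-|*[!0-9-]*|?*-*)
            echo "not an integer pool id: $2" >&2
            return 1
            ;;
    esac
    sed "s/copy_pool_id/$2/g" "$1"
}
```

Calling fill_copy_pool_id once for each copy pool ID from the step 1 output yields one statement per copy pool, which can then be checked before it is issued.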

APARs IT02571 and IT02717 were created for this problem. The fix for these APARs will automatically fix the reported problems on the next storage pool backup.

Problem Resolution:
Apply the fixing level when available. These problems are fixed in Tivoli Storage Manager server level 7.1.0.100 and later for all server platforms.

Circumvention:
It is possible to turn off fragmentation processing for the server; however, this can lead to longer-running transactions, which can cause out-of-space conditions for the database active log. This change does not affect fragments that already exist on the Tivoli Storage Manager server; it changes fragmentation only for new data. Choose one of the following methods to turn off fragmentation.

1) Increase the MAXFRAGMENTSIZE option in the server options file to 999999 which will prevent files smaller than 999999M from being fragmented. Files larger than this size will still be fragmented.

a. Record the current setting by issuing the following command:
QUERY OPTION MAXFRAGMENTSIZE

b. Issue the following command to update the setting to 999999:
SETOPT MAXFRAGMENTSIZE 999999

c. After you upgrade the server to the fixing level, you must return the MAXFRAGMENTSIZE setting to its original value. Issue the following command:
SETOPT MAXFRAGMENTSIZE original_value

2) Update all of the nodes to turn off the SPLITLARGEOBJECTS option. This defaults to YES for all nodes.
a. Issue the following command:
UPDATE NODE * SPLITLARGEOBJECTS=NO

b. After you upgrade the server to the fixing level, you must update the node to turn on SPLITLARGEOBJECTS. Issue the following command:
UPDATE NODE * SPLITLARGEOBJECTS=YES

Filed under: TSM

24 May 2014

Bypass the server and storage agent prerequisites during installation

Question

How do you bypass the Tivoli Storage Manager server and storage agent prerequisites during installation?

Answer

You can bypass the Tivoli Storage Manager server and storage agent prerequisites during the installation or upgrade.
On test servers only: Use the following command to bypass prerequisite checks such as the operating system and the required memory. Do not issue this command on a production server.

AIX, HP-UX, Linux, and Solaris
=========================================================================
For fresh installations, issue the following command:
Installation wizard: ./install.sh -g -vmargs "-DBYPASS_TSM_REQ_CHECKS=true"
Console mode: ./install.sh -c -vmargs "-DBYPASS_TSM_REQ_CHECKS=true"
Silent mode: ./install.sh -s -acceptLicense -vmargs "-DBYPASS_TSM_REQ_CHECKS=true"

Otherwise, update the following file to add the -DBYPASS_TSM_REQ_CHECKS=true flag:
Installation wizard: Installation Manager/eclipse/ibmim.ini
Console mode: Installation Manager/eclipse/tools/imcl.ini
Silent mode: Installation Manager/eclipse/tools/imcl.ini

Important: The dash in -DBYPASS_TSM_REQ_CHECKS=true is required and the flag must be added on a new line after the -vmargs flag.

After the flag has been added, run the installer.
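
The placement rule above (the flag on its own line, directly after -vmargs) can be sketched in shell as follows. This is an illustration under stated assumptions, not an IBM-supplied procedure: it assumes -vmargs appears alone on a line in the .ini file, the function name add_bypass_flag is hypothetical, and the function only prints the result so that the original file stays untouched.

```shell
# Sketch: print an Installation Manager .ini file ($1) with the bypass flag
# inserted on a new line directly after the -vmargs line.
add_bypass_flag() {
    flag='-DBYPASS_TSM_REQ_CHECKS=true'
    if grep -qxF -e "$flag" "$1"; then
        # The flag is already present as a whole line: print the file unchanged.
        cat "$1"
    else
        # Print every line; after the -vmargs line, also print the flag.
        awk -v flag="$flag" '{ print } $0 == "-vmargs" { print flag }' "$1"
    fi
}
```

A cautious way to use it is to write the output to a new file, compare it with the original, and only then replace the original. Quote the path, because the Installation Manager directory name contains a space.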

Windows
=========================================================================
For fresh installations, issue the following command:
Installation wizard: install.bat -g -vmargs "-DBYPASS_TSM_REQ_CHECKS=true"
Console mode: install.bat -c -vmargs "-DBYPASS_TSM_REQ_CHECKS=true"
Silent mode: install.bat -s -acceptLicense -vmargs "-DBYPASS_TSM_REQ_CHECKS=true"

Otherwise, update the following file to add the -DBYPASS_TSM_REQ_CHECKS=true flag:
Installation wizard: Installation Manager\eclipse\ibmim.ini
Console mode: Installation Manager\eclipse\tools\imcl.ini
Silent mode: Installation Manager\eclipse\tools\imcl.ini

Important: The dash in -DBYPASS_TSM_REQ_CHECKS=true is required and the flag must be added on a new line after the -vmargs flag.

After the flag has been added, run the installer.


6 Aug 2013

DSMSERV RESTORE DB FAILED

Environment:
TSM 6.3 - Windows 2008

1) Make a copy of volhist.dat and search it for the affected BACKUPFULL entry:

**************************************************
Operation Date/Time: 2013/07/06 18:57:10

Volume Type: BACKUPFULL
* Location for volume K:\BACKUP_TSMDB\75808230.DBV is: ''
Volume Name: "K:\BACKUP_TSMDB\75808230.DBV"
Backup Series: 125
Backup Op: 0
Volume Seq: 10001
Device Class Name: SEC03
**************************************************

2) Create a hex dump of the TSM database backup volume and check the value of Backup Series:

@ 0000020 => (0000 7f00) = 0x7f = 127

The Backup Series value in the database backup (127) differs from the value in the volume history file (125).

# xxd 75808230.dbv | more
0000000: 0100 0000 2400 0000 4943 4944 0000 189c ....$...ICID....
0000010: 0000 0024 0000 0100 07dd 0806 1239 0a00 ...$.........9..
0000020: 0000 7f00 0000 0000 0000 0101 0002 02ff ................
0000030: ffff ffff ffff ff53 514c 5542 524d 4544 .......SQLUBRMED
0000040: 4845 4144 2000 0054 534d 4442 3100 0000 HEAD ..TSMDB1...
0000050: 3230 3133 3038 3036 3138 3537 3130 0000 20130806185710..
0000060: 0053 4552 5645 5231 0000 0001 0000 00f8 .SERVER1........
0000070: 0100 0000 0d54 534d 4442 3100 0000 0072 .....TSMDB1....r

Workaround:
Edit the volume history file and replace the Backup Series value with the value found in the hex dump of the TSM database backup.
DSMSERV RESTORE DB then works fine.
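
The value can also be read without scanning the dump by eye. The sketch below assumes, based only on the single dump shown above, that Backup Series is stored as a 16-bit little-endian integer at byte offset 0x22 of the backup volume. That layout is an inference from this one example, not a documented format, and the function name backup_series_from_dbv is hypothetical.

```shell
# Sketch: print the Backup Series value stored in a database backup volume.
# ASSUMPTION: the value is a 16-bit little-endian integer at offset 0x22 (34),
# inferred from the one hex dump in this post.
backup_series_from_dbv() {
    # od prints the two bytes at offset 34 as unsigned decimals (low byte first).
    set -- $(od -An -tu1 -j 34 -N 2 "$1")
    echo $(( $1 + $2 * 256 ))
}
```

Compare the printed number with the Backup Series line of the matching BACKUPFULL stanza in the copy of volhist.dat before editing anything.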

Check whether a TSM fix pack is available for this problem.


6 Aug 2013

NAS backup fails with ANR8758W on EMC DATADOMAIN

ANR1069E and ANR8758W failure for a NAS (NDMP) backup, indicating that there are insufficient mount points and that the number of drives does not match the number of paths for the source node.
Symptom

NAS (NDMP) backup fails with:

ANR8758W The number of online drives in the VTL library NASLIB does not match the number of online drive paths for source NASNODE.
ANR1069E NAS Backup process 33 terminated - insufficient number of mount points available for removable media.

The problem was seen in an environment with a Unix server, Protectier VTL and Network Appliance (NetApp) NAS, but may also occur in other environments.
Resolving the problem

In this case, the error was resolved by changing the Library Type from VTL to SCSI.

Use the Tivoli Storage Manager command: UPDATE LIBR <library_name> LIBTYPE=SCSI

Tagged as: NAS

5 Aug 2013

How is Tivoli Storage Manager applying versioning to NAS backups?

To back up a NAS filer using the NDMP protocol, a Tivoli Storage Manager client NAS node must be defined to the Tivoli Storage Manager server. This client node belongs to a policy domain like all other nodes.
Therefore, Tivoli Storage Manager policies (such as versioning) also apply to NAS backups.

Tivoli Storage Manager versioning applies only to the complete NDMP dump, because the Tivoli Storage Manager server is not aware of the individual objects included in the NDMP dump (except when reading the TOC).
To apply versioning to individual objects, each object within the NDMP dump would need its own Tivoli Storage Manager server internal object ID, which is NOT the case.
In addition, if versioning applied to individual objects within the NDMP dump, something similar to aggregate compression would have to be available to "delete" the invalid objects from the NDMP dump, which is also NOT the case.

For a NAS filesystem, full and differential backups are grouped, with the full backup being the peer group leader.

If, for example, VEREXISTS = 4 and you run a full backup followed by 3 differentials, the Tivoli Storage Manager server database will have 4 versions of this backup image.
The next differential backup of the NAS filer will expire the full backup (but the Tivoli Storage Manager server still keeps it internally, because it is needed to restore any of the differential images).

The Tivoli Storage Manager server may store a full backup in excess of the number of versions you specified. When this happens, the full backup will stay in Tivoli Storage Manager database until all dependent backups have expired.

'QUERY NASBACKUP' will not show this extra version.

Use SQL 'SELECT' statements and/or 'SHOW VERSION' Tivoli Storage Manager server commands to see this extra version.

Use the following command to examine the dependency of full image and differential image backups:

'show version nodename filespace_name'

/vol/vol1 : /NAS/ IMAGE (MC: default)
Inactive, Inserted 05/25/05 11:14:57, Deactivated 1900-01-01 00:00:00.000000
ObjId: 0.138114, GroupMap 00050000, objType 0x0b
Attr Group Leader, GroupId: 0.138114
Delta Group Leader, GroupId: 0.138114

This version is already deactivated (Deactivated 1900-01-01 00:00:00.000000) and should have expired, but it stays in the Tivoli Storage Manager server database because it is a delta group leader (GroupId: 0.138114) and the following delta member (GroupId: 0.138114) has not yet expired:

/vol/vol1 : /NAS/ IMAGE (MC: Default)
Inactive, Inserted 07/20/05 20:41:28, Deactivated 07/27/05 22:15:21
ObjId: 0.179387, GroupMap 00040001, objType 0x0c
Delta Group Member, GroupId: 0.138114
Attr Group Leader, GroupId: 0.179387

In the example above, Delta Group Leader represents the full image backup and the Delta Group Member the differential image backup.

Important to understand:
Although the already expired full and differential NAS backups can be seen, it is not possible to do a point in time (PIT) restore from the date of an expired full or differential backup! It is only possible to do a PIT restore from full and differential NAS backups that have not yet expired.

SOURCE: 1200154


9 Jul 2013

Migrating TSM Node Data from one Library to Another

Problem
Procedure to migrate any TSM data that resides in Storage Pools from one library to another library. This procedure can be used to move data from any devclass to any other devclass.

Cause
Data needs to be migrated to alternate library hardware.

Resolving the problem
1. Physically attach the new library to the server and define it in the TSM server.

2. Once it is defined to the O/S and TSM, create a new stgpool (NEWPOOL) using the new library. See the TSM Admin guide / quick start guide for procedure.
DEFINE STG NEWPOOL

3. Disable sessions and events so that backups will not go to the old stgpool.
DISABLE SESS
DISABLE EVENTS

It depends on how much data you have and what daily processing is going on at the time, but it is recommended to mark the OLDPOOL volumes 'read only' at this point.

4. Change the copygroup DESTination setting from your OLDPOOL to point to NEWPOOL. If your backups go to a Disk stgpool and then migrate to OLDPOOL, update the disk stgpool to point to NEWPOOL as well, without changing copygroup setting.

If your copygroups go directly to tape, then change those now as well. Validate and Activate the policy sets after making the changes to the copy destinations.

5. Enable sessions & events.

Now new backups will go to the new library. Next, migrate the data off the old library.

6. Update your old stgpool NEXTstgpool setting to go to the new stgpool.
UPDATE STG OLDPOOL NEXT=NEWPOOL

7. Create the new copy storage pool, if you use them.
DEFINE STG NEWCOPY

8. Do this next step in increments as a large amount of data will be transferred during each step. During the migration you probably want to continue with every day production and you do not want to fill up your database. Migration can be controlled through your Hi / Lo settings, for example:
UPDATE STG OLDPOOL HI=80 LOW=70
UPDATE STG OLDPOOL HI=70 LOW=60 (next day, and so on.)

If migration does not start at this point, check Q STG OLDPOOL to verify your settings.

After each migration increment, bring the new copy pool up to date (if used):
BACKUP STG NEWPOOL NEWCOPY

This will run a backup from the NEWPOOL to the NEWCOPY pool. When this step is complete, do the next incremental migration and another backup stgpool and so on until the entire storage pool is migrated.
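
The stepwise threshold changes in step 8 can be listed in advance. The shell sketch below is an illustration only: it prints one UPDATE STG command per increment, starting from a given high threshold and keeping a fixed gap between HI and LOW. The function name gen_migration_steps and the choice of a fixed gap and step are assumptions made for the example; choose values that suit your own data volume and schedule.

```shell
# Sketch: print the UPDATE STG command for each migration increment.
# Arguments: pool name, starting HI value, gap between HI and LOW,
# and how far both thresholds drop per increment.
gen_migration_steps() {
    pool=$1 hi=$2 gap=$3 step=$4
    # A non-positive step would never terminate, so refuse it.
    [ "$step" -gt 0 ] || return 1
    while [ $(( hi - gap )) -ge 0 ]; do
        printf 'UPDATE STG %s HI=%d LOW=%d\n' "$pool" "$hi" $(( hi - gap ))
        hi=$(( hi - step ))
    done
}
```

For example, gen_migration_steps OLDPOOL 80 10 10 starts with the two commands shown in step 8 and continues down to LOW=0. Issue one command per increment and run BACKUP STG NEWPOOL NEWCOPY after each one, as described above.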

A way to move the data over to the new storage pool with more control is to use the MOVE DATA command. See the ADMIN REFERENCE for the syntax and options for this command.

Notes: Regardless of the method used, the only way to move copypool data to the new devclass is by backing it up again with 'backup stg' specifying a copy stgpool in the new devclass. The data in the copy stgpool in the old devclass will need to be removed by 'delete volume ##### discarddata=yes'. It is recommended to do this after the data has been copied to the new copy stgpool.

9. Once the BACKUP STG is complete, remove the volumes of the old copy pool. For each volume listed by Q VOL STG=OLDCOPY, issue:
DELETE VOL <volume_name> DISCARDDATA=YES

10. When done, delete the old copy stg pool.
DELETE STG OLDCOPY


2 Jul 2013

Replacing a damaged primary storage pool volume

Problem

My primary storage pool volume is physically damaged and cannot be reused. The following steps recover the data on the volume.

Resolving the problem

NOTE: The key factor is that a copy storage pool must exist for the primary storage pool, and the data on the volume(s) has been copied with the 'backup stgpool' command. If the data on the damaged volume(s) has not been copied to the copy storage pool, or if the data on the volume was damaged prior to copy, then it will not be recoverable. Any files that are unable to be restored would be eligible for backup from the client again.

1. The damaged volume must be marked as destroyed to prevent access:
UPDATE VOLUME <volume_name> ACCESS=DESTROYED

2. Check the volume out of the library before proceeding. When the restore commands are executed the destroyed volume is deleted from the TSM database, and cannot be removed within TSM if the volume has not been checked out. To checkout the volume:
CHECKOUT LIBVOLUME <library_name> <volume_name>

3. This command will preview the restore and not move any physical data. This will produce a list of volumes needed for the restore:
RESTORE VOLUME <volume_name> PREVIEW=YES

4. Look in the Activity log for the list of volumes that must be returned from offsite storage and checked into the library.

5. Place the volumes from step #4 in the BULK I/O door and then check them into the library:
CHECKIN LIBVOLUME <library_name> SEARCH=BULK CHECKLABEL=BARCODE STATUS=PRIVATE

If you do not have a bulk I/O door then place them in empty slots inside the library and change the SEARCH parameter:
CHECKIN LIBVOLUME <library_name> SEARCH=YES CHECKLABEL=BARCODE STATUS=PRIVATE

6. The checked-in volumes must be marked READONLY to prevent processes from using them. The important thing to remember is that all volumes listed in the preview need to be in an ACCESS=READONLY state so that the next command can run to completion:
UPDATE VOLUME <volume_name> ACCESS=READONLY WHERESTGPOOL=<copy_pool_name>

7. The next command will start the volume restore process. The 'maximum number of processes' needs to be at least 2, one process to read from the copy pool volume, and one process to write the new primary storage pool volume. Ensure that sufficient scratch volumes are available.
RESTORE VOLUME <volume_name> MAXPROCESS=<number_of_processes>

8. Once the restore has been completed, the copy volumes will need to be sent back offsite. The first step is to change their access back to offsite. By updating all volumes from step #5 to have an ACCESS=OFFSITE, the server will not try to use them during reclamation.
UPDATE VOLUME * ACCESS=OFFSITE WHERESTGPOOL=<copy_pool_name>

9. Check out the copy pool volumes so that they can be delivered back to the vault location:
CHECKOUT LIBVOLUME <library_name> <volume_name> CHECKLABEL=NO

10. This command will need to be run on the damaged primary volume only if the TSM Server reports that files are still located on the volume. This could be because of previous problems with the backup storage pool command or files on that volume had not been copied to the copy storage pool as of yet. These files, as long as they are still present on the client, will be backed up again from the owning client during their next incremental:
DELETE VOLUME <volume_name> DISCARD=YES


20 Jun 2013

Redefine Library and Drives on Windows TSM server

Problem
Drives not working after system, library or cabling reconfiguration.
Cause
There are times, especially when a system has been rebuilt, that it is necessary to remove & redefine the library & the drive(s). Sometimes there are specific errors associated with this as well (see Problem Abstract for example).
Resolving the problem

Common errors that may be encountered after a library reconfiguration or recabling include:
ANR8300E I/O error on library (OP=xx CC=xx KEY=xx ASC=xx ASCQ=xx SENSE=xx)
ANR8441E Initialization failed for SCSI library
ANR8301E I/O Error on library

NOTE: These steps apply to libraries of LIBTYPE=SCSI or (with the library path definition step removed) LIBTYPE=MANUAL. For libraries of LIBTYPE=ACSLS or LIBTYPE=EXTERNAL, refer to the TSM Server Administrator's Guide and the StorageTek (in the case of ACSLS) or External Media Management software vendor (in the case of EXTERNAL, e.g. Gresham) documentation for configuration information. This technote does not account for library client servers or storage agents in a library sharing environment. Additional steps are needed there to redefine the drive paths for the library client servers and/or storage agents, so that they reference the same drive definitions, and point to the same physical devices, as the library manager server's drive paths.

This is specific to Microsoft Windows Operating Systems. These steps should only be used, after attempts to update the drive and paths using autodetect features have been exhausted.

1. Run the 'TSMDLST' command to produce a list of the configured OS devices. This command is located in the following directory of a standard install and should be run from a MS-DOS prompt:
C:\program files\tivoli\tsm\console\

This information is usually (although not always) also viewable from the TSM Management Console's "Device Information" window under TSM Device Driver for the machine on which the TSM server is running.

2. Write down the 'TSM Name' for the library and drives:
Library=lb#.#.#.#
Drives=mt#.#.#.#

3. Gather the output of the following for a reference:
QUERY LIBRARY
QUERY DRIVE F=D
QUERY PATH F=D

4. First, the drive and library paths must be deleted.

Do this for all drives:
DELETE PATH <server name> <drive name> SRCTYPE=SERVER DESTTYPE=DRIVE LIBRARY=<library name>

Do the following for the library:
DELETE PATH <server name> <library name> SRCTYPE=SERVER DESTTYPE=LIBRARY

5. Now the device definitions must be deleted.
Do the following for all drives:
DELETE DRIVE <library name> <drive name>

Then for the library:
DELETE LIBRARY <library name>

6. Now the devices can be redefined to TSM. Use the query outputs from step 3 as a guide for the device names. First, redefine the library:
DEFINE LIBRARY <library name> LIBTYPE=<library type>
(NOTE: If this is a shared library, add SHARED=YES to the above definition)

7. Next, redefine the path to the library. Often, drive and drive path definitions cannot be created until the TSM server is able to communicate with the library in which the drives exist.

Redefine path to library using the library device value from step 2:
DEFINE PATH <server name> <library name> SRCTYPE=SERVER DESTTYPE=LIBRARY DEVICE=lb#.#.#.#

8. Redefine the drives:
DEFINE DRIVE <library name> <drive name>

9. Redefine the paths to the drives. Use the 'tsmdlst' device names recorded in step 2 for the DEVICE parameter, and the query outputs from step 3 as a guide for the drive and library names.

Redefine paths to all drives:
DEFINE PATH <server name> <drive name> SRCTYPE=SERVER DESTTYPE=DRIVE LIBRARY=<library name> DEVICE=mt#.#.#.# AUTODETECT=YES

NOTE: For multivendor libraries (libraries whose drives come from a different vendor than the library itself, e.g. a Dell PowerVault 136T with IBM LTO drives), the autodetect feature may fail to detect the drive's element and serial numbers automatically. In that event, refer to the device's worksheet (often found by going to www.ibm.com/support and searching on the make and model of the library itself) to obtain a list of valid drive element numbers. This information may also be available through vendor-specific library querying utilities (the vendor equivalent of the IBMTape utility "ntutil.exe").

10. Check all of the Scratch volumes back in.
IMPORTANT - you MUST check in the Scratch volumes first or ALL volumes will be marked as "Private."
CHECKIN LIBVOL <library name> SEARCH=YES CHECKLABEL=BARCODE STATUS=SCRATCH

11. Check in the Private volumes as well.
CHECKIN LIBVOL <library name> SEARCH=YES CHECKLABEL=BARCODE STATUS=PRIVATE
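
The delete-and-redefine sequence above is easy to get out of order when there are many drives. The following is a minimal Python sketch, not an IBM tool, that only builds the administrative commands as text so they can be reviewed before being issued. The function name `redefine_commands`, its parameters, and the sample names are hypothetical; the positional server, library and drive names are filled in following the usual TSM command syntax and should be checked against the Administrator's Reference for your server level.

```python
def redefine_commands(server, library, libtype, lib_device, drives):
    """Return TSM admin commands in the order used by the steps above.

    `drives` maps each drive name to its device name (for example the
    mt#.#.#.# value reported by tsmdlst). All names are placeholders.
    """
    cmds = []
    # Step 4: delete the drive paths first, then the library path.
    for drive in drives:
        cmds.append(f"DELETE PATH {server} {drive} SRCTYPE=SERVER "
                    f"DESTTYPE=DRIVE LIBRARY={library}")
    cmds.append(f"DELETE PATH {server} {library} SRCTYPE=SERVER DESTTYPE=LIBRARY")
    # Step 5: delete the drive definitions, then the library definition.
    for drive in drives:
        cmds.append(f"DELETE DRIVE {library} {drive}")
    cmds.append(f"DELETE LIBRARY {library}")
    # Steps 6-7: the library and its path come back before any drive.
    cmds.append(f"DEFINE LIBRARY {library} LIBTYPE={libtype}")
    cmds.append(f"DEFINE PATH {server} {library} SRCTYPE=SERVER "
                f"DESTTYPE=LIBRARY DEVICE={lib_device}")
    # Steps 8-9: all drives, then all drive paths.
    for drive in drives:
        cmds.append(f"DEFINE DRIVE {library} {drive}")
    for drive, device in drives.items():
        cmds.append(f"DEFINE PATH {server} {drive} SRCTYPE=SERVER DESTTYPE=DRIVE "
                    f"LIBRARY={library} DEVICE={device} AUTODETECT=YES")
    # Check-in: scratch volumes first, then private volumes.
    for status in ("SCRATCH", "PRIVATE"):
        cmds.append(f"CHECKIN LIBVOL {library} SEARCH=YES "
                    f"CHECKLABEL=BARCODE STATUS={status}")
    return cmds


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    for line in redefine_commands("TSMSRV", "LIB1", "SCSI", "lb0.1.0.2",
                                  {"DRV1": "mt0.2.0.2", "DRV2": "mt0.3.0.2"}):
        print(line)
```

Printing the commands rather than executing them keeps the sketch safe to run anywhere; the output could be pasted into an administrative session once it has been checked against the query outputs from step 3.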


20 June 2013

Redefining TSM Library and Drives for UNIX OS

Question
When hardware or firmware has changed, it is frequently necessary to remove the tape library and drive definitions from the IBM Tivoli Storage Manager (TSM) server and then re-define them.
Cause
Sometimes there are specific errors, such as:
ANR0523W Transaction failed - error on output storage device
ANR8300E I/O error on library (OP=xx, CC=xx, KEY=xx, ASC=xx, ASCQ=xx, SENSE=xx)
ANR8301E I/O Error on library
ANR8355E I/O error reading label for volume NNNNNN on drive XXXXX
ANR8359E Media fault detected on volume NNNNNN in drive XXXXX
ANR8441E Initialization failed for SCSI library
ANR8779E Unable to open drive XXXXX, error number=ZZZ
ANR8944E Hardware or media error on drive
ANR8963E Unable to find path to match the serial number defined for drive

Frequently the TSM Server can automatically rediscover devices when using "SANDISCOVERY ON" or by using "UPDATE PATH" with "AUTODETECT=YES" to refresh the values.

However, there are times when that may not be successful. For example, if a tape drive, tape library, fibre/SCSI HBA, or SAN has experienced changes (such as hardware, firmware or device drivers) it may require rebuilding the TSM "special files" to re-establish connectivity to the library and drives. To rebuild the "special files," we must delete and re-define the hardware devices to the TSM Server (UPDATE does not rebuild).
Answer
Perform these tasks in this sequence to totally re-define the tape devices to TSM. These steps should be taken only if attempts to update the devices/paths using the autodetect features have failed:

1. Before deleting anything, gather the output from these commands, so you can use the same naming conventions when re-defining the tape devices:
QUERY STATUS (get the "Server Name" value to use as the source name in the PATH commands)
QUERY DEVCLASS
QUERY LIBRARY FORMAT=DETAIL
QUERY DRIVE FORMAT=DETAIL
QUERY PATH FORMAT=DETAIL

2. Run the appropriate OS command to produce a list of the configured HW 'special file' device names.
AIX     ==> lsdev -Cc tape        (-or- 'cfgmgr')
            lsdev -Cc adsmtape    (for TSM devices)
            lsdev -Cc library
Solaris ==> ls -l /dev/rmt/*st    (-or- 'sysdef')
            ls -l /dev/rmt/*smc
HP-UX   ==> /usr/sbin/ioscan -funC tape    (-or- 'ioscan -kfn')
Linux   ==> ls -l /dev/IBM*
            ls -l /dev/tsmscsi/*    (-or- 'more /etc/sysconfig/hwconf')

If the tape devices are not defined to the OS, please work with your OS or SCSI/SAN hardware support to configure them. Until the OS can use the drives (can write to them, for example using 'tar' or 'dd') the tape devices cannot be defined to TSM.

3. From the '/dev' directory, write down the OS-level device definitions for the library and drives:
              AIX     Linux         Solaris    HP-UX
TSM Drives    mt#     tsmscsi/mt#   rmt/#      rmt/tsmmt#
IBM Drives    rmt#    IBMtape#      rmt/#st    rmt/#m
TSM Library   lb#     tsmscsi/lb#   rmt/#lb    tsmchgr#
358x Library  smc#    IBMchanger#   rmt/#smc   rmt/#chng
3494 Library  lmcp#   3494lib       libmgrc#   libmgrc#

4a. First, the drive paths and drives must be deleted. From a TSM Server admin command line, for all the drives:
DELETE PATH <server name> <drive name> SRCTYPE=SERVER DESTTYPE=DRIVE LIBRARY=<library name>

4b. Then delete all the TSM drive definitions:
DELETE DRIVE <library name> <drive name>

5a. Next, delete the path for the tape library:
DELETE PATH <server name> <library name> SRCTYPE=SERVER DESTTYPE=LIBR

5b. And finally delete the TSM library definition:
DELETE LIBRARY <library name>

If the OS cannot access the tape drives at this point, stop. Check hardware, device drivers, update firmware, swap cables; consider power-cycling the tape library then deleting and re-defining to the OS. There is no point attempting to get TSM to write to the devices if they are not recognized by the OS; work with OS and/or hardware vendors to resolve HW issues before proceeding.

6a. Now the tape library and library path can be re-defined. Use the TSM QUERY outputs from "Step 1" as a guide for the library name and LIBTYPE; no additional parameters are necessary in the syntax below. Redefine the library:
DEFINE LIBRARY <library name> LIBTYPE=<library type> SERIAL=AUTODETECT

Note: If this TSM Server is hosting a tape library for other systems, for example any "TSM Server Library Clients" or "TSM Storage Agents" then you also need "SHARED=YES" on the "DEFINE LIBRARY".

6b. Redefine the path to the library. For SCSI libraries, confirm the DEVICE value matches the latest OS-level info gathered from "Step 2". For 3494, ACSLS, and other types of libraries using software configuration files, use the previous values from "Step 1" to redefine the DEVICE or ACSID, and so on:
DEFINE PATH <server name> <library name> SRCTYPE=SERVER DESTTYPE=LIBRARY DEVICE=<device name>

7a. Redefine the drives and drive paths. Redefine all the drives using names from "Step 1" for example:
DEFINE DRIVE <library name> <drive name> SERIAL=AUTODETECT ELEMENT=AUTODETECT

7b. Redefine paths to all drives, using the OS-level info gathered from "Step 2" for the DEVICE values. Keep in mind the OS-level DEVICE values may have changed since they were previously defined.
DEFINE PATH <server name> <drive name> SRCTYPE=SERVER DESTTYPE=DRIVE LIBRARY=<library name> DEVICE=<device name>

Note: If this TSM Server is hosting a tape library for other systems, for example any "TSM Server Library Clients" or "TSM Storage Agents", then in addition to the "TSM Server Library Manager" DRIVE PATH you also need to define a new PATH for each drive for each of those systems. In those commands, use that system's SERVERNAME (shown by "Q SERVER") as the source name of the path, and the local DEVICE value for the drive as seen by that other system.

8. Verify the library, drives, and paths are online:
QUERY LIBRARY FORMAT=DETAIL
QUERY DRIVE * FORMAT=DETAIL
QUERY PATH * * FORMAT=DETAIL

9. Since the library is "new" to TSM, the volumes must be checked in again to re-create the inventory (AUDIT LIBRARY does not CHECKIN). Use *this* sequence, first SCRATCH, then PRIVATE:
CHECKIN LIBVOL <library name> SEARCH=Y STATUS=SCR CHECKL=BARC
CHECKIN LIBVOL <library name> SEARCH=Y STATUS=PRIV CHECKL=BARC

NOTE: For ACSLS libraries, use "CHECKLABEL=NO" on the CHECKIN commands, because "CHECKLABEL=BARCODE" is not supported for an ACSLS Library.
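
In a library sharing environment, step 7b has to be repeated for every system that uses the drives, each with its own local device names. The sketch below is one way to generate those DEFINE PATH commands from a small table; it is an illustration under stated assumptions, not part of the technote. The function name `drive_path_commands`, the source names `LIBMGR` and `STA1`, and the device names are all hypothetical, and the positional source and drive names follow the usual TSM command syntax.

```python
def drive_path_commands(library, devices_by_source):
    """Build one DEFINE PATH command per (source, drive) pair.

    `devices_by_source` maps a source name (the library manager, a library
    client, or a storage agent) to a dict of drive name -> local device name
    as seen by that system.
    """
    cmds = []
    for source, drives in devices_by_source.items():
        for drive, device in drives.items():
            cmds.append(
                f"DEFINE PATH {source} {drive} SRCTYPE=SERVER DESTTYPE=DRIVE "
                f"LIBRARY={library} DEVICE={device}"
            )
    return cmds


if __name__ == "__main__":
    # Hypothetical example: the same physical drive has a different
    # special file name on the library manager and on the storage agent.
    example = {
        "LIBMGR": {"DRV1": "/dev/rmt0"},
        "STA1": {"DRV1": "/dev/rmt3"},
    }
    for line in drive_path_commands("LIB1", example):
        print(line)
```

Keeping the device names in one table per system makes it easier to spot a drive whose special file changed on only one of the hosts.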

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

If that doesn't resolve the issue, the tape drive problem seems beyond the control of the TSM Server (software). Review the output from the OS-level logs for additional hardware error information:
OS        Diagnostics    Remove OS Devices   Install OS Devices
-------   ------------   -----------------   ------------------
AIX       errpt -a       rmdev               cfgmgr
Linux     dmesg                              /dev/MAKEDEV
Solaris   mbin/prtdiag   rem_drv             drvconfig
HP-UX     dmesg          rmsf                insf -e

If you cannot reach HW support immediately, you could take these additional actions, in this order:
1. Power-cycle the tape library.
2. Power-cycle the SAN switch (if any).
3. Consider updating to latest device drivers and/or firmware.
4. Halt TSM and reboot system with TSM Server.
5. Re-define the tape device to the OS (see commands above).
6. If tape device definitions have changed, DELETE & re-DEFINE to TSM.

That is all that can be done from a software perspective. If errors persist, the cause lies at a layer that TSM cannot repair.


10 June 2013

Reclaiming volumes when out of scratch tapes

1. To start expiration issue the following command from a Tivoli Storage Manager command line:

EXPIre Inventory

2. Next, identify what the reclamation threshold is on your tape storage pool(s):

Query STGpool Format=Detail

3. If the threshold is set to 100 then adjust the reclamation threshold value down to a size that will cause reclamation to reclaim the volumes. *

UPDate STGpool <pool name> REClaim=<value>

NOTE: Do not set the REClaim value lower than 50, as this will cause reclamation to run for an extended period of time.

* After expiration is finished you can see how many volumes would be reclaimed if the reclamation threshold were to be changed to a certain percent for your storage pool using the following command:

select count(*) from volumes where stgpool_name='XXXX' and upper(status)='FULL' and pct_utilized < ##

NOTE: Replace XXXX with the storage pool name in UPPERCASE and ## with the appropriate two-digit numeric value.
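
To compare several candidate values at once, the same filter as the SELECT statement above can be applied offline to a list of volumes. This is a minimal Python sketch with invented sample data; the function name `reclaimable_count` and the dictionary keys (chosen to match the column names in the query) are assumptions for illustration, and in practice the rows would come from the server's VOLUMES table.

```python
def reclaimable_count(volumes, stgpool, limit_pct):
    """Count FULL volumes of `stgpool` whose utilization is below limit_pct.

    Mirrors: select count(*) from volumes where stgpool_name=... and
             upper(status)='FULL' and pct_utilized < ...
    """
    return sum(
        1
        for vol in volumes
        if vol["stgpool_name"] == stgpool
        and vol["status"].upper() == "FULL"
        and vol["pct_utilized"] < limit_pct
    )


# Invented sample rows, for illustration only.
SAMPLE_VOLUMES = [
    {"volume_name": "A00001", "stgpool_name": "TAPEPOOL", "status": "Full", "pct_utilized": 35.0},
    {"volume_name": "A00002", "stgpool_name": "TAPEPOOL", "status": "Full", "pct_utilized": 62.5},
    {"volume_name": "A00003", "stgpool_name": "TAPEPOOL", "status": "Filling", "pct_utilized": 10.0},
    {"volume_name": "A00004", "stgpool_name": "COPYPOOL", "status": "Full", "pct_utilized": 20.0},
    {"volume_name": "A00005", "stgpool_name": "TAPEPOOL", "status": "Full", "pct_utilized": 88.0},
]

if __name__ == "__main__":
    for pct in (90, 70, 50):
        count = reclaimable_count(SAMPLE_VOLUMES, "TAPEPOOL", pct)
        print(f"pct_utilized < {pct}: {count} volume(s)")
```

Volumes that are still filling, or that belong to another pool, are excluded in the same way as in the query.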

Tivoli Storage Manager has the ability to run this at regular intervals during the day or at night. You will need to figure out what time would be the best for reclamation to run. After this you will need to check your system to see if it is set up to not run reclamation.

1. Check your DSMSERV.OPT file and make sure that the options NOMIGRRECL and EXPINTERVAL are not set. If they are set, either remove them from the file or comment out the lines. Once this is done, stop and restart the Tivoli Storage Manager server so that it re-reads the file, as these settings are retained in memory.

2. If the NOMIGRRECL and EXPINTERVAL were not in the file, then issue the following command

Query OPTions

Look for the ExpInterval in the first 'server option' column. The number to the right specifies the time, in hours, between automatic inventory expiration processes. You can specify from 0 to 336 (14 days). A value of 0 means that expiration must be manually started with the EXPIRE INVENTORY command. You can update this server option without stopping and restarting the server by using the SETOPT command.

SETOPT EXPINterval 24

In the above example, this command will cause Expiration to run every 24 hours.

3. If you then want to control when reclamation runs you can build some Administrative schedules to raise and lower the reclamation threshold.

DEFINE SCHEDULE ReclaimStart TYPE=ADMIN CMD="UPDATE STG <pool name> REClaim=<value>" ACTIVE=Yes DESC="Lower Reclamation Threshold" STARTT=<hh:mm> DAY=ANY EXP=N

DEFINE SCHEDULE ReclaimStop TYPE=ADMIN CMD="UPDATE STG <pool name> REClaim=100" ACTIVE=Yes DESC="Raise Reclamation Threshold" STARTT=<hh:mm> DAY=ANY EXP=N

NOTE: In the second command set the time for 2 - 3 hours after the time you entered in the ReclaimStart command.
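
The two schedules are a matched pair: the stop time depends on the start time, and the lowered threshold should respect the limit of 50 mentioned earlier. The following Python sketch builds both commands from one start time. It is only an illustration: the function name `reclaim_schedules`, the default window of 3 hours, and the pool name in the example are assumptions, and the commands are returned as text rather than sent to the server.

```python
from datetime import datetime, timedelta


def reclaim_schedules(pool, threshold, start_time, window_hours=3):
    """Return the ReclaimStart and ReclaimStop DEFINE SCHEDULE commands.

    `start_time` is "HH:MM". The stop schedule is placed `window_hours`
    after it (the note above suggests 2 to 3 hours).
    """
    if not 50 <= threshold < 100:
        # Below 50 reclamation runs for a long time; 100 disables it.
        raise ValueError("threshold should be between 50 and 99")
    start = datetime.strptime(start_time, "%H:%M")
    stop = start + timedelta(hours=window_hours)  # may roll past midnight
    start_cmd = (
        f'DEFINE SCHEDULE ReclaimStart TYPE=ADMIN '
        f'CMD="UPDATE STG {pool} REClaim={threshold}" ACTIVE=Yes '
        f'DESC="Lower Reclamation Threshold" STARTT={start:%H:%M} DAY=ANY EXP=N'
    )
    stop_cmd = (
        f'DEFINE SCHEDULE ReclaimStop TYPE=ADMIN '
        f'CMD="UPDATE STG {pool} REClaim=100" ACTIVE=Yes '
        f'DESC="Raise Reclamation Threshold" STARTT={stop:%H:%M} DAY=ANY EXP=N'
    )
    return start_cmd, stop_cmd


if __name__ == "__main__":
    # Hypothetical pool name and times.
    for command in reclaim_schedules("TAPEPOOL", 60, "23:00"):
        print(command)
```

Deriving the stop time from the start time avoids the two schedules drifting apart when the reclamation window is moved.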

After completing these three steps, Tivoli Storage Manager will expire files and reclaim volumes automatically every day at the times you designated. Note that both procedures above need at least two scratch tapes available so that data can be moved off the volumes being reclaimed. If no scratch tapes are available to begin with, perform the following steps.

1. Run the 'select' query from the first set of procedures after expiration is finished.

2. Manually move data from some of the Tape storage pool volumes to clear some space.

MOVe Data <volume name>

This will move data from the volume to other volumes within the same storage pool.

3. Once this command is finished you then need to delete the volume from the storage pool with either one of the two following commands.

DEL V <volume name>

DEL V <volume name> DISCARD=Y

NOTE: Ensure that the move data command finished successfully (by reviewing the Tivoli Storage Manager server activity log) before running this command. If there is any data on the volume it will be deleted and lost from both the primary and copy storage pools.
