Raghu On Tech
Pro tips on managing data at scale
Troubleshoot Db2 Backup Issues With Veritas NetBackup

Problem Statement:
You are a database administrator and have configured your Db2 environment to send its backups to Veritas NetBackup. However, your backups are failing, and the generic error messages Db2 throws at you do not tell you enough. On top of that, you are not familiar with NetBackup. So how do you troubleshoot Db2 backup issues with Veritas NetBackup? If that is your situation, the information I gathered while troubleshooting such issues may be of use to you.
Problem Troubleshooting:
Why Did The Backup Fail?
First of all, remember that your backups might fail for any number of reasons. In most cases your db2diag.log should give you a good indication of why your database backups are failing. However, when the failure is related to a vendor backup solution such as TSM or NetBackup, Db2 throws SQLCODE -2062. A -2062 can itself have a number of causes, stemming from misconfiguration on either the database side or the NetBackup side.
Once you have ruled out all the other possible causes of backup failure, you can look at a few places on the NetBackup side to see if there is a problem.
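Before moving to the NetBackup side, it can help to pull the relevant entries out of db2diag.log. The following is a minimal sketch, not a definitive recipe. It assumes you pass the log path explicitly (its location depends on your instance's diagnostic path) and that the failure shows up as the text SQL2062N or -2062. The helper name find_sql2062 is hypothetical.

```shell
# Hypothetical helper: print db2diag.log lines (with line numbers) that
# mention the vendor-media error. The search patterns are assumptions
# about how the SQLCODE is written in the log.
find_sql2062() {
    diag_log="$1"
    if [ ! -r "$diag_log" ]; then
        echo "cannot read $diag_log" >&2
        return 2
    fi
    # -n: show line numbers; -E: extended regex;
    # -e: keeps the leading dash from being parsed as an option
    grep -n -E -e 'SQL2062N|-2062' "$diag_log"
}

# Example (the path is an assumption; adjust it to your instance):
# find_sql2062 "$HOME/sqllib/db2dump/db2diag.log"
```

An exit status of 1 means grep found no matching lines, which suggests the failure was not a vendor-media error.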
First Things First:
If you are setting up NetBackup for the first time, it is better to start by verifying /usr/openv/netbackup/bp.conf and ${DB2HOME}/db2.conf. Make sure no lines contain extra white space, to rule out issues with the configuration. Once this check is done you can proceed to further checks.
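Trailing white space is hard to spot by eye, so a quick grep can do the checking. This is a small sketch under the assumption that trailing spaces or tabs are the kind of "extra white space" you want to catch. The function name find_trailing_ws is hypothetical.

```shell
# Hypothetical helper: report lines that end in a space, tab, or stray
# carriage return. Output format is "lineno:content"; an exit status of 1
# means no such lines were found.
find_trailing_ws() {
    conf_file="$1"
    grep -n '[[:space:]]$' "$conf_file"
}

# Example: check both files, skipping any that are not readable.
# for f in /usr/openv/netbackup/bp.conf "${DB2HOME}/db2.conf"; do
#     [ -r "$f" ] && find_trailing_ws "$f" && echo "trailing white space in $f"
# done
```

Because [[:space:]] also matches a carriage return, the same check flags files that were edited on Windows and saved with CRLF line endings.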
Check NetBackup Logs:
NetBackup stores the logs related to Db2 backup operations under “/usr/openv/netbackup/logs/user_ops/dbext/logs”. There will be multiple log files, one for each successful or failed backup operation. Below is what a log file looks like after a successful backup, followed by the output from a failed backup operation.
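With one file per backup operation, the directory fills up quickly, and you usually want the most recent file. Here is a rough sketch that picks the newest entry by modification time. The function name latest_dbext_log is hypothetical, and the default directory is simply the path mentioned above.

```shell
# Hypothetical helper: print the path of the most recently modified file
# in the dbext log directory. Returns non-zero if the directory is empty
# or unreadable.
latest_dbext_log() {
    log_dir="${1:-/usr/openv/netbackup/logs/user_ops/dbext/logs}"
    # ls -t sorts newest first; head keeps only the first entry.
    newest=$(ls -t "$log_dir" 2>/dev/null | head -n 1)
    [ -n "$newest" ] && printf '%s/%s\n' "$log_dir" "$newest"
}

# Example: page through the newest log.
# less "$(latest_dbext_log)"
```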
Successful Backup Operation:
Backup started Fri Mar 16 11:38:38 2018
11:38:49 Initiating backup
11:38:49 INF - Starting bpbrm
11:38:50 INF - Data socket = nbmedia5.IPC:/usr/openv/var/tmp/vnet-21187521214730458685000000124-AxH7GM;3d4edbd3f293443f807fd16b2d66b6fe;11;900
11:38:50 INF - Name socket = nbmedia5.IPC:/usr/openv/var/tmp/vnet-21188521214730541690000000124-M1H1TM;da6b08d0ff599cdf19c00ba8000fc67c;11;900
11:38:50 INF - Job id = 11572629
11:38:50 INF - Backup id = yourdb2server.example.com_1521214729
11:38:50 INF - Backup time = 1521214729
11:38:50 INF - Policy name = DB2_MAXIMA_BACKUPS
11:38:50 INF - Snapshot = 0
11:38:50 INF - Frozen image = 0
11:38:50 INF - Backup copy = 0
11:38:50 INF - Master server = nbmaster
11:38:50 INF - Media server = nbmedia5
11:38:50 INF - Multiplexing = 0
11:38:50 INF - New data socket = nbmedia5.IPC:/usr/openv/var/tmp/vnet-21186521214730372664000000124-4SpKtM;ce61048e8788ec55628ce7c182a4c155;11;900
11:38:50 INF - Use shared memory = 0
11:38:50 INF - Compression = 0
11:38:50 INF - Encrypt = 0
11:38:50 INF - Keep logs = 7
11:38:50 INF - Client read timeout = 7200
11:38:50 INF - Media mount timeout = 0
11:38:53 INF - Beginning backup on server nbmedia5 of client yourdb2server.example.com
11:38:54 INF - Server status = 0
11:38:54 INF - Backup by traveler on client yourdb2server.example.com using policy DB2_MAXIMA_BACKUPS, sched Default-Application-Backup:the requested operation was successfully completed
Failed Backup Operation:
As you can see below, the log first records status 25 (cannot connect on socket) at 16:00:59. The backup job then waited in the NetBackup scheduler queue for more than 15 minutes (16:15:46 to 16:33:40). Once its turn arrived, it waited another 15 minutes before bailing out at 16:48:41 with status 54, saying that NetBackup timed out connecting to the client, i.e. your Db2 server. We will see how to troubleshoot connectivity issues in the following sections.
Backup started Fri Mar 16 15:59:48 2018
15:59:59 Initiating backup
16:00:59 INF - Server status = 25
16:00:59 INF - Backup by traveler on client yourdb2server.example.com using policy DB2_MAXIMA_BACKUPS, sched Default-Application-Backup:cannot connect on socket.
16:15:46 INF - Waiting in NetBackup scheduler work queue on server nbmaster
16:17:46 INF - Waiting in NetBackup scheduler work queue on server nbmaster
16:19:46 INF - Waiting in NetBackup scheduler work queue on server nbmaster
16:21:46 INF - Waiting in NetBackup scheduler work queue on server nbmaster
16:23:46 INF - Waiting in NetBackup scheduler work queue on server nbmaster
16:25:46 INF - Waiting in NetBackup scheduler work queue on server nbmaster
16:27:46 INF - Waiting in NetBackup scheduler work queue on server nbmaster
16:29:46 INF - Waiting in NetBackup scheduler work queue on server nbmaster
16:31:46 INF - Waiting in NetBackup scheduler work queue on server nbmaster
16:33:40 INF - Starting bpbrm
16:33:41 INF - Data socket = nbmedia5.IPC:/usr/openv/var/tmp/vnet-21129521232421356667000000135-caxcjo;12630dfc044093977b89ee4930b738d5;11;900
16:33:41 INF - Name socket = nbmedia5.IPC:/usr/openv/var/tmp/vnet-21130521232421437639000000135-Uz9Mvo;0a7adbc53eedd1fecd6e4da163aa5b17;11;900
16:33:41 INF - Job id = 11576278
16:33:41 INF - Backup id = yourdb2server.example.com_1521232420
16:33:41 INF - Backup time = 1521232420
16:33:41 INF - Policy name = DB2_MAXIMA_BACKUPS
16:33:41 INF - Snapshot = 0
16:33:41 INF - Frozen image = 0
16:33:41 INF - Backup copy = 0
16:33:41 INF - Master server = nbmaster
16:33:41 INF - Media server = nbmedia5
16:33:41 INF - Multiplexing = 0
16:33:41 INF - New data socket = nbmedia5.IPC:/usr/openv/var/tmp/vnet-21128521232421275652000000135-KVzB6n;38944abe38005ced648a9adee8998c44;11;900
16:33:41 INF - Use shared memory = 0
16:33:41 INF - Compression = 0
16:33:41 INF - Encrypt = 0
16:33:41 INF - Keep logs = 7
16:33:41 INF - Client read timeout = 7200
16:33:41 INF - Media mount timeout = 0
16:48:41 INF - Server status = 54
16:48:41 INF - Backup by traveler on client yourdb2server.example.com using policy DB2_MAXIMA_BACKUPS, sched Default-Application-Backup:timed out connecting to client
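The quickest way to tell the two logs apart is the "Server status" value: 0 in the successful run, 25 and 54 in the failed one. A small sketch for pulling just those codes out of a log file follows. The function name server_statuses is hypothetical, and it relies only on the "Server status = N" text seen in the logs above.

```shell
# Hypothetical helper: print every server status code recorded in a
# dbext log file, one per line, in the order they appear.
server_statuses() {
    log_file="$1"
    # -o prints only the matched text, so this also works when several
    # log entries ended up on the same physical line.
    grep -o 'Server status = [0-9][0-9]*' "$log_file" | awk '{print $4}'
}

# Example: any value other than 0 deserves a closer look.
# server_statuses /path/to/one/of/the/log/files
```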
Check If Your NetBackup Client Daemons Are Running:
You can easily verify the NetBackup client daemons by querying the process table on Linux as follows, or by using the bpps command.
# ps -ef|grep -i netbackup
root 2519 1 0 Mar14 ? 00:00:01 /usr/openv/netbackup/bin/vnetd -standalone
root 2523 1 0 Mar14 ? 00:00:01 /usr/openv/netbackup/bin/bpcd -standalone
root 2536 1 0 Mar14 ? 00:00:10 /usr/openv/netbackup/bin/nbdisco
or
# /usr/openv/netbackup/bin/bpps -x
NB Processes
------------
root 2519 1 0 Mar14 ? 00:00:01 /usr/openv/netbackup/bin/vnetd -standalone
root 2523 1 0 Mar14 ? 00:00:01 /usr/openv/netbackup/bin/bpcd -standalone
root 2536 1 0 Mar14 ? 00:00:11 /usr/openv/netbackup/bin/nbdisco
Shared Symantec Processes
-------------------------
root 2126 1 0 Mar14 ? 00:00:00 /opt/VRTSpbx/bin/pbx_exchange
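If you want this check in a script rather than reading the process list by eye, something like the sketch below could work. It is an assumption on my part that vnetd and bpcd are the daemons worth insisting on. That list is taken from the sample output above, so adjust it for your installation. The function name missing_daemons is hypothetical.

```shell
# Hypothetical helper: read a process listing on stdin and print the name
# of each expected NetBackup client daemon that does not appear in it.
# Returns 0 when all are present, 1 otherwise.
missing_daemons() {
    listing=$(cat)
    rc=0
    # Daemon list is an assumption based on the sample output.
    for d in vnetd bpcd; do
        if ! printf '%s\n' "$listing" | grep -q "/usr/openv/netbackup/bin/$d"; then
            echo "$d"
            rc=1
        fi
    done
    return $rc
}

# Example:
# ps -ef | missing_daemons || echo "some NetBackup client daemons are down"
```

Matching on the full binary path keeps the grep process itself from being counted as a hit.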
Check visibility between NetBackup and your Db2 server:
You can check whether your Db2 server is able to see the NetBackup master and media servers by running the following commands. As you can see, you can check visibility by hostname, by IP address, or both. If any of these commands fails, you need to address issues with firewalls, DNS, host entries, etc.
# CHECK MASTER SERVER VISIBILITY BY HOSTNAME.
$ /usr/openv/netbackup/bin/bpclntcmd -hn nbmaster.example.com
host nbmaster.example.com: nbmaster.cloud.example.com at 123.123.123.12
host nbmaster.example.com: nbmaster.cloud.example.com at ::123.123.123.12
aliases: nbmaster.example.com nbmaster.cloud.example.com ::123.123.123.12 123.123.123.12
# CHECK MASTER SERVER VISIBILITY BY IP ADDRESS.
$ /usr/openv/netbackup/bin/bpclntcmd -ip 123.123.123.12
host 123.123.123.12: nbmaster.example.com at 123.123.123.12
host 123.123.123.12: nbmaster.example.com at ::123.123.123.12
aliases: nbmaster.example.com ::123.123.123.12 123.123.123.12
# CHECK MEDIA SERVER VISIBILITY BY HOSTNAME.
$ /usr/openv/netbackup/bin/bpclntcmd -hn nbmedia1.example.com
host nbmedia1.example.com: nbmedia1.example.com at 111.111.111.111
aliases: nbmedia1.example.com 111.111.111.111
# CHECK MEDIA SERVER VISIBILITY BY IP ADDRESS.
$ /usr/openv/netbackup/bin/bpclntcmd -ip 111.111.111.111
host 111.111.111.111: nbmedia1.example.com at 111.111.111.111
aliases: nbmedia1.example.com 111.111.111.111
Below is how the output from “/usr/openv/netbackup/bin/bpclntcmd” looks when you run it against a hostname or IP address that you cannot communicate with or that does not exist.
$ /usr/openv/netbackup/bin/bpclntcmd -hn yourdb2server.example.com
client hostname could not be found
alter client: yourdb2server.example.com : not found. (48)
$ /usr/openv/netbackup/bin/bpclntcmd -ip 10.34.354.435
client hostname could not be found
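When you have a master server and several media servers to check, a small loop saves typing. The sketch below runs bpclntcmd -hn for each hostname and flags a failure whenever the output contains "could not be found", which is the message shown above. I am deliberately matching on that text rather than on the exit code, since I have not verified what bpclntcmd returns on failure. The function name check_hosts and the BPCLNTCMD variable are hypothetical.

```shell
# Path to bpclntcmd; overridable so the function can be pointed elsewhere.
BPCLNTCMD="${BPCLNTCMD:-/usr/openv/netbackup/bin/bpclntcmd}"

# Hypothetical helper: check name resolution for each hostname argument.
# Prints "OK" or "FAIL" per host; returns 1 if any lookup failed,
# 2 if bpclntcmd itself is not available.
check_hosts() {
    if ! command -v "$BPCLNTCMD" >/dev/null 2>&1; then
        echo "bpclntcmd not found at $BPCLNTCMD" >&2
        return 2
    fi
    failed=0
    for host in "$@"; do
        out=$("$BPCLNTCMD" -hn "$host" 2>&1) || true
        case "$out" in
            *"could not be found"*) echo "FAIL $host"; failed=1 ;;
            *)                      echo "OK   $host" ;;
        esac
    done
    return $failed
}

# Example, using the hostnames from this post:
# check_hosts nbmaster.example.com nbmedia1.example.com
```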
Clear The NetBackup Cache:
If something changed recently, such as configuration parameters in /usr/openv/netbackup/bp.conf, you may want to clear the NetBackup host cache so that no stale entries are left lurking in memory. You can run the commands below to clear the cache.
# cd /usr/openv/netbackup/bin
# ./bpclntcmd -clear_host_cache
Restart the services:
If clearing the cache and anything else you tried after changing parameters on the NetBackup side did not resolve your issue, you may want to try restarting the NetBackup services. Restarts can be performed in a couple of ways, either cleanly or forcefully.
# Stop and Start NetBackup Client daemons cleanly
$ cd /usr/openv/netbackup/bin/goodies
$ ./netbackup stop     # Stop the NBU daemons
$ ./netbackup start    # Start the NBU daemons
# Stop and Start NetBackup Client daemons forcefully
$ /usr/openv/netbackup/bin/bp.kill_all
$ /usr/openv/netbackup/bin/bp.start_all