Friday, July 23, 2010

Storage RAID Data Collection (Auto Dump)

Storage RAID Data Collection
Output from the RAID "Auto Dump" facility is required to analyze most problems that relate to HDS RAID subsystems.

A dump can only be initiated from the subsystem SVP, and thus most commonly it requires action from an HDS engineer.

Auto Dump merges the separate DUMP and FDCOPY functions into one operation and produces a single compressed file that can be transferred from the SVP. The default location of this file is:
  • C:\DKC200\TMP\hdcp.lzh (for 9900, 9900V)
  • C:\DKC200\TMP\hdcp.tgz (for USP, NSC, USP-V, USP-VM)

For instructions on how to perform an Auto Dump on a RAID subsystem, refer to the SVP section --> 2. Function of the SVP of the Maintenance Manual.

Note: A common mistake is to collect the file called dump.lzh or dump.tgz. This file is included in hdcp.lzh or hdcp.tgz and cannot be used alone without the logs and config info. Always collect hdcp.lzh or hdcp.tgz.
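The file locations and the note above can be captured in a small helper, for example in a script that checks an upload before it is sent. This is only a sketch: the function names and the model keys are illustrative and are not part of any HDS tool. The paths are the defaults listed above.

```python
# Default Auto Dump file locations on the SVP, as listed in this post.
# The dictionary keys and function names are hypothetical, chosen for illustration.
DEFAULT_AUTODUMP_PATHS = {
    "9900": r"C:\DKC200\TMP\hdcp.lzh",
    "9900V": r"C:\DKC200\TMP\hdcp.lzh",
    "USP": r"C:\DKC200\TMP\hdcp.tgz",
    "NSC": r"C:\DKC200\TMP\hdcp.tgz",
    "USP-V": r"C:\DKC200\TMP\hdcp.tgz",
    "USP-VM": r"C:\DKC200\TMP\hdcp.tgz",
}


def default_autodump_path(model: str) -> str:
    """Return the default Auto Dump file path for a subsystem model."""
    key = model.strip().upper()
    if key not in DEFAULT_AUTODUMP_PATHS:
        raise ValueError(f"Unknown subsystem model: {model!r}")
    return DEFAULT_AUTODUMP_PATHS[key]


def is_complete_autodump(filename: str) -> bool:
    """True only for hdcp.lzh / hdcp.tgz; dump.lzh / dump.tgz alone is not enough."""
    # Accept either Windows or POSIX separators and compare the base name only.
    name = filename.replace("\\", "/").rsplit("/", 1)[-1].lower()
    return name in ("hdcp.lzh", "hdcp.tgz")


if __name__ == "__main__":
    print(default_autodump_path("USP-V"))
    print(is_complete_autodump("dump.tgz"))
```

A check like is_complete_autodump() catches the dump.lzh / dump.tgz mistake before the file is uploaded.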

Types of Auto Dump:

  1. RAPID
    Gathers logs, SVP operation history, configuration information. NOT collected: Dump data (Processor, ABEND, WCHK1) and monitor data. This type should only be collected when urgent initial analysis is required and/or slow upload connection is available. This type is typically used for Backend (Hard Disk) related errors and for system healthchecks.
  2. NORMAL
    Performs and collects dumps (all microprocessors, ABEND, WCHK1), in addition to files gathered by RAPID. Monitor data is not collected. This type should be collected by default when no instruction has been given and the problem types referred to in "3. DETAIL" below do not apply.
  3. DETAIL
    Gathers monitor data, in addition to information gathered by RAPID and NORMAL. This type should be collected when there is some suspicion that the problem is related to performance, subsystem workload, sidefile or HUR journal overflow.
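The selection rules for the three types above can be summarized as a short decision function. This is a sketch under stated assumptions: the function and parameter names are hypothetical, and the precedence used when more than one condition applies (explicit instruction first, then performance suspicion, then urgency) is an assumption that the descriptions above do not spell out.

```python
from typing import Optional


def choose_autodump_type(
    performance_suspected: bool = False,
    urgent_or_slow_link: bool = False,
    instructed_type: Optional[str] = None,
) -> str:
    """Pick RAPID, NORMAL or DETAIL following the descriptions in this post.

    Assumed precedence (not stated in the post): an explicit instruction wins,
    then performance suspicion, then urgency or a slow upload link.
    """
    if instructed_type:
        return instructed_type.strip().upper()
    if performance_suspected:
        # Performance, subsystem workload, sidefile or HUR journal overflow.
        return "DETAIL"
    if urgent_or_slow_link:
        # Urgent initial analysis and/or only a slow upload connection.
        return "RAPID"
    # Default when no instruction has been given.
    return "NORMAL"


if __name__ == "__main__":
    print(choose_autodump_type())
    print(choose_autodump_type(performance_suspected=True))
```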

Note: If the performance problem is happening when the dump is about to be taken, then if at all possible set the System Option MODE 31=ON, wait 5 minutes, then take the DETAIL dump. This could save a lot of time later on. Turn MODE 31 OFF when you have taken the dump.

Other Auto Dump types are available with USP, NSC, USP-V, USP-VM and should only be collected if instructed by GSC.

IMPORTANT

Before performing the dump procedure, check maintenance STATUS and VERSION.

In the case of a Mainframe subsystem, also display Main Frame Paths (USP, NSC, USP-V, USP-VM) or Physical Paths (9900, 9900V).

Special Instructions when a DETAIL dump is to be collected for Performance Problem troubleshooting:

  1. For 9900 and 9900V, be sure to start the SVP Monitor (refer to Monitoring function in SVP MM pages). All items of data are collected internally, regardless of display settings. Choose at least ONE display setting. Choose an interval appropriate to the error condition (e.g. a "60 sec" interval can collect data for a 7-hour period). This step is not required for USP, NSC55, USP-V or USP-VM.
  2. Turn on System Mode 31 before the testing starts for the problem you are trying to troubleshoot. Mode 31 captures critical trace data related to Host IO performance. Its buffers can wrap after just 60 seconds on a busy port, so make sure that the DETAIL dump is taken before the testing is complete and while the problem is being experienced.
  3. Open maintenance STATUS and VERSION before starting the dump.
  4. If MODE 31 has been turned on, remember to turn it off after the DETAIL DUMP completes. MODE 31 can have a performance impact on some workload types and should not be left on unless specifically requested by GSC.
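As a worked example of the monitor interval arithmetic in step 1: a 60-second interval covering 7 hours corresponds to 420 samples. The sketch below estimates coverage for other intervals. It assumes the SVP Monitor retains the same number of samples whatever interval is chosen. That is an assumption made for illustration and is not stated in the Maintenance Manual excerpt quoted here, so confirm it before relying on the result.

```python
# From the post: a 60-second interval can collect data for a 7-hour period.
REFERENCE_INTERVAL_SEC = 60
REFERENCE_COVERAGE_HOURS = 7

# Implied number of samples: 7 h * 3600 s/h / 60 s = 420.
SAMPLES = (REFERENCE_COVERAGE_HOURS * 3600) // REFERENCE_INTERVAL_SEC


def coverage_hours(interval_sec: int) -> float:
    """Estimated monitor coverage in hours for a given interval.

    ASSUMPTION: the number of retained samples is fixed (420) for every
    interval setting; verify this against the SVP Maintenance Manual.
    """
    if interval_sec <= 0:
        raise ValueError("interval_sec must be positive")
    return SAMPLES * interval_sec / 3600


if __name__ == "__main__":
    for interval in (30, 60, 120):
        print(interval, "sec ->", coverage_hours(interval), "hours")
```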
