- Overview
- Required and Recommended Patches
- Known Problems
- Removing Diagnostics
- Getting More Information
- EMS Hardware Monitors
This DIAGNOSTICS.readme document covers the September 1999 (IPR 9909) release of Support Plus for S800/S700 systems (all versions of HP-UX).
NOTE: As of the September 1999 release, the name of the Diagnostic/IPR Media has been changed to Support Plus. In addition, the format has changed so that there is a separate CD-ROM for each version of the operating system (HP-UX 10.20 and HP-UX 11.0).
CAUTION: You must install certain patches before loading Online Diagnostics (Support Tools). See Required and Recommended Patches below.
Support Plus, in addition to IPR software, contains a complete build of the following support tools:
- Support Tools Manager (STM) for online diagnostics
- ODE (offline diagnostics) / LIFLOAD
- EMS hardware monitors (HP-UX 10.20 and 11.0 only)
- EMS Kernel resource monitors (HP-UX 11.0 only)
- Predictive Support (S800 only)
- (Support Plus no longer contains File System Recovery. There is now a separate Recovery Media for 10.X releases and the 11.X releases have File System Recovery on the Core/Install Media. These are included in your HP-UX media kits.)
The support tools are all contained in a Software Depot (SD) bundle named "OnlineDiag". This bundle is distributed in two ways:
- Support Plus media
- HP Software Depot website
The Support Tools Manager, ODE/LIFLOAD, and (optionally) Predictive Support must be loaded after the Operating System is installed. The EMS Hardware Monitors are installed automatically when STM is installed.
Support Plus can be:
- Booted to ISL and used to load and run offline diagnostics.
- For 10.20 and 11.X users, mounted and used to install online diagnostics and support tools via swinstall(1m).
For this September 1999 release:
- All diagnostic defect repairs and enhancements as of 9/1/99 are included. Any future patches dated after 9/1/99 must be loaded after Support Plus is loaded.
- Support for the new hardware in all active releases as of 9/1/99.
Required and Recommended Patches
CAUTION: You must install certain patches before loading Online Diagnostics (Support Tools).
This document lists the required and recommended patches at the time of writing. However, these patches may be superseded by the time you do your install.
Patch REQUIRED for HP-UX 11.0 (S800 and S700):
For HP-UX 11.0 (S800 and S700): PHKL_18543: s700_800 11.00 PM/VM/UFS/async/scsi/io/DMAPI/JFS/perf patch
For proper operation of the Online Diagnostics (HP-UX 11.0 version only), you must install the above patch BEFORE installing Online Diagnostics. Otherwise, you may see error messages about the missing patches during the installation of Online Diagnostics; you can get further information by reviewing the swagent.log file.
This is a large patch which can take a while to load (for example 30 minutes). It also has the following dependencies (other patches that must be loaded):
PHCO_17556
Patches for diag drivers no longer required on HP-UX 10.20
HP-UX 10.20 no longer requires you to install the diag driver patches for its operation (for example "diag0", "diag1" and "diag2" patches), as of the September 1999 release (IPR 9909). These patches are automatically installed when the Support Tools are installed; a reboot will take place.
Patch recommended (but not required) for older systems with HP-PB bus (HP-UX 11.0/S800 only):
For HP-UX 11.0 (S800): PHKL_18490: s700_800 11.00 diag0 handle more devices,fix data corruption
Patch required only if you intend to run the EMS hardware monitors for the Fibre Channel Arbitrated Loop Hub Monitor or the Fibre Channel Switch Monitor:
For HP-UX 11.0 (S800 and S700): PHSS_16587: HP aC++ runtime libraries (aCC A.03.13)
For HP-UX 10.20 (S800 and S700): PHSS_17872: s700_800 10.X HP aC++ runtime libraries (aCC A.01.21) [1]
[1] PHSS_17872 has a dependency, PHSS_17225 (dld.sl:Purify:Shared:VTable).
Patch required only if your system includes an HP SureStore E Disk Array FC60. This patch is required to run the EMS hardware monitor (fc60mon) or STM tools for this device.
For HP-UX 11.0 (S800 only): PHCO_18685: s700_800 11.00 HP Array Manager/60 cumulative patch
For HP-UX 10.20 (S800 only): PHCO_18684: s700_800 10.20 HP Array Manager/60 installation patch
Patch STRONGLY RECOMMENDED
For HP-UX 11.00 (S800 and S700): PHSS_20007: s700_800 11.00 STM panic, disk_em,diagmond,tlscsidev
For HP-UX 10.20 (S800): PHSS_20006: s800 10.20 STM panic, disk_em,diagmond,tlscsidev
For HP-UX 10.20 (S700): PHSS_20005: s700 10.20 STM panic, disk_em,diagmond,tlscsidev
These patches fix several problems on the IPR 9909 release, some of which are critical:
- Excessive CPU usage by the diagmond daemon. diagmond can use 10-20% of the CPU time on a system whenever a User Interface (UI) for STM is not running.
- STM User Interface may have too many open file descriptors on V class.
- Enhanced the memory information tool to support J-Class & C-Class for 11.00. Added memory support for A-Class.
- Accumulation of temporary device files in /var/tmp. In some rare cases that were impossible to reproduce, the diag0 device file was removed, so that some tools are non-functional unless an "insf -e" command is subsequently performed.
- System panics were reportedly caused by the disk_em monitor due to a trap 18 in the scsi3 driver; this situation was rare and impossible to reproduce.
- A misleading message was generated by the disk_em monitor when an unknown combination of Sense code, key and qualifier was received.
Patches PHSS_20005, PHSS_20006, and PHSS_20007 are available only from the HP IT Resource Center (http://us.itrc.hp.com). They are not in the HWCR patch bundle on the Support Plus media.
When installed, these patches will "bump" the STM version number to A.17.10. These patches replace patches PHSS_19538, PHSS_19562, and PHSS_19563.
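Before loading the support tools, it can help to confirm which of the patches named above are still missing. The following is a minimal sketch, not an HP-supplied tool: the function name missing_patches is hypothetical, and it assumes that installed patches appear by name in the listing produced by "swlist -l product" on your release.

```shell
# Print each patch name given as an argument that does NOT appear in the
# software listing read from standard input.
# Assumption: installed patches are listed by name (e.g. PHKL_18543) in
# the listing; missing_patches is an illustrative name, not an HP command.
missing_patches() {
    listing=$(cat)                       # capture the whole listing once
    for patch in "$@"; do
        # -F matches the patch name as a literal string
        printf '%s\n' "$listing" | grep -F "$patch" >/dev/null || echo "$patch"
    done
}

# Example use on the target system (not executed here):
#   swlist -l product | missing_patches PHKL_18543 PHCO_17556
```

An empty result means every named patch was found in the listing. Because listed patches may have been superseded, a patch reported as missing may still be covered by a newer one.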
Loading Patches
You can load the patches in one of three different ways:
Method 1: Entire patch bundle. Install the entire HW or HWCR patch bundle for your system. Advantages: simple and tested process. Disadvantages: the bundle can be many megabytes in size.
Choose the Hardware Critical (HWCR) or Hardware (HW) patch bundle appropriate for your system. For example, choose XSW800HWCR1020 for a Series 800 system running HP-UX 10.20.
The patch bundles are distributed in the same way as the OnlineDiag bundle:
- The Support Plus media
- The HP Software Depot website
The procedure for using swinstall to load the patches is described in Chapter 5 of the "Support Plus: Diagnostics Users Manual."
Method 2: Individual patches from bundle. Install ONLY the individual patches required for your system from the HW or HWCR patch bundle described above. Advantages: Small number of patches. Disadvantages: Requires knowledge of SD (swinstall) to select patches (interactive selection or command line selection).
Method 3: Individual patches from website. You can also obtain the patches through the HP IT Resource Center (http://us.itrc.hp.com). A drawback of loading individual patches from this website is that the system must be rebooted after every patch that requires a reboot (patches to the kernel, indicated by "PHKL" in the patch name, all require a reboot).
Known Problems
CAUTION: Problem with FC60 Hardware Monitor
On the September 1999 release (IPR 9909), the FC60 hardware monitor (fc60mon) does not consistently report problems with the FC60 array. This was also the version contained on the CD shipped with the array. This problem was fixed in the December 1999 release (IPR 9912).
To check the version of fc60mon you are running, execute the following command:
# what /usr/bin/stm/uut/bin/tools/monitor/fc60mon
You are running a bad version if the command returns a version number of "A.01.03", for example:
fc60.mon: A.01.03 Tue Jul 20 19:00:41 MDT 1999
To fix the problem, update the support tools with the December 1999 (IPR 9912) release or later. This will cause version A.01.04 of fc60mon to be loaded.
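The version check above can be scripted. This is a minimal sketch under stated assumptions: the function name fc60mon_status is hypothetical, and it assumes the version appears in the "what" output in the A.nn.nn form shown in the example line. It flags only the one version this document names as bad.

```shell
# Read `what` output on standard input and classify the fc60mon version.
# Only A.01.03 is flagged, because it is the only version named as bad
# in this document. fc60mon_status is an illustrative name.
fc60mon_status() {
    # Pull out the first token of the form A.nn.nn, if any
    version=$(sed -n 's/.*\(A\.[0-9][0-9]\.[0-9][0-9]\).*/\1/p' | head -n 1)
    case "$version" in
        "")      echo "unknown" ;;
        A.01.03) echo "affected $version" ;;
        *)       echo "other $version" ;;
    esac
}

# Example use on the target system (not executed here):
#   what /usr/bin/stm/uut/bin/tools/monitor/fc60mon | fc60mon_status
```

A result of "unknown" means no version string was recognized, in which case the output should be read by hand.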
PROBLEM: Offline Diagnostics and 440MHz N4000
Several offline diagnostics (MAPPER, IOTEST, and PERFVER) do not run on the 440MHz N4000 server (Prelude hversion 5CC). On these systems, OnlineDiag installs the 32-bit version of LIFLOAD into the root disk's LIF area instead of the correct 64-bit version. As a result, the 32-bit programs (MAPPER, IOTEST, and PERFVER) in LIFLOAD cannot be run.
This problem does not occur in the Dec 1999 release (IPR 9912).
Workaround for IPR 9909: Users can run the 64-bit offline diagnostic programs by booting the system from the Support Plus Media, then running ODE. From the ODE prompt, you can run the desired 64-bit diagnostic programs, e.g., MAPPER2, IOTEST2, etc.
CAUTION: Monitoring Changes for disc30, sdisk and disk array devices
As of IPR 9902 (Feb 99 release), there has been a change to the way that monitoring is done for disc30, sdisk and the HA Disk Array Models 10, 20, and 30FC.
Formerly, the "diaglogd exec" programs (pdisc30_exec, pharaymon_exec, and psdisk_exec) handled driver error entries for these devices.
As of IPR 9902, these programs have been deleted and their functionality is now provided by the EMS Hardware Monitors.
If you had customized the configuration files for the diaglogd exec programs (disk30_exec.cfg, sdisk_exec.cfg, and haraymon_exec.cfg), you may wish to re-configure the EMS Hardware Monitors to achieve the same results.
Problem: (September 1999 Release): Error Messages While Installing Support Tools
While installing the September 1999 Support Tools, you may see error messages during the analysis phase of the installation of products with a dependency on Event Monitoring Services. For example, you may see these messages when EMS hardware monitors are being installed. Similar messages may appear during the installation of EMS High Availability (HA) Monitors and MC/ServiceGuard, if you are installing those products.
Error messages may begin with a line like:
ERROR: A later revision (one with a higher revision number) of fileset "EMS-Config.EMS-GUI,r=A.03.00" has already been installed. Either remove this fileset or change the "Allow_downdate" option to "true"
Workaround: You can ignore these error messages. The "EMS-Config.EMS-CORE,r=A.03.00" and "EMS-Config.EMS-GUI,r=A.03.00" filesets will be skipped. The "SD Install" screen "Status" field will show "Completed with errors." The products will nevertheless be installed correctly.
Problem: (N-Class Only): PCI Fibre Channel Interface Information tool will exit with an INCOMPLETE status and log the following message:
This card ( PCI Fibre Channel Interface ) is not supported by the Information tool on N class machines.
CAUTION: Compatibility Problem with ServiceGuard and LockManager
From the February 1999 release (IPR 9902) onwards, the Support Tools (diagnostics) include EMS hardware monitors and EMS version A.03.00 on both HP-UX 10.20 and HP-UX 11.00.
This version of EMS is incompatible with ServiceGuard A.10.10, which includes version A.01.00 of EMS. It is also incompatible with ServiceGuard and LockManager versions A.11.01, A.11.02 and A.11.03, which include version A.02.00 of EMS.
If you run these releases of ServiceGuard or LockManager, you must upgrade them before installing the Support Tools on the February 1999 (IPR 9902) or newer releases.
On HP-UX 10.20 you should upgrade ServiceGuard to A.10.11 and on HP-UX 11.00 you should upgrade ServiceGuard or LockManager to release A.11.04 or newer.
If you do not upgrade, EMS will silently be upgraded to version A.03.00 when you install the diagnostics; ServiceGuard and LockManager will fail to work if you have any monitored resources. In this case, if you execute swverify or other SD-UX commands, you will see error messages like:
The corequisite "EMS-Core.EMS-CORE,r=A.01.00,a=HP-UX_B.10.20_800,v=HP" for fileset "Cluster-Monitor.CM-CORE,l=/,r=A.10.10" cannot be successfully resolved.
If you have already loaded the diagnostics and therefore upgraded to EMS A.03.00 and are still running an incompatible release of ServiceGuard or LockManager, you should now upgrade to get your system into a supported and working state.
There is no functional difference between ServiceGuard A.10.10 and ServiceGuard A.10.11, other than support for the new version of EMS and bug fixes. Functional differences for the 11.00 releases of ServiceGuard and LockManager can be found in the release notes.
Older versions of ServiceGuard and LockManager, for example A.10.06 and A.10.07.01, do not provide any support for EMS, and so are not affected by this issue.
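The compatibility rules above can be written as a small lookup. This is a sketch only: the function name sg_ems_advice is hypothetical, and it assumes you pass the ServiceGuard or LockManager revision exactly as written in this document (for example A.11.03). How you obtain that revision on your system is left to you.

```shell
# Map a ServiceGuard/LockManager revision to the advice given in this
# document. Only the revisions named in the text are recognized;
# sg_ems_advice is an illustrative name, not an HP command.
sg_ems_advice() {
    case "$1" in
        A.10.10)
            # Ships EMS A.01.00; HP-UX 10.20
            echo "incompatible: upgrade ServiceGuard to A.10.11" ;;
        A.11.01|A.11.02|A.11.03)
            # Ship EMS A.02.00; HP-UX 11.00
            echo "incompatible: upgrade to A.11.04 or newer" ;;
        *)
            echo "not listed as incompatible" ;;
    esac
}

# Example:
#   sg_ems_advice A.11.02
```

Revisions that fall through to the last branch are simply not named in this document; that result is not a guarantee of compatibility.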
Problem with Tools for SCSI Devices (Older Systems Only)
Support tools for SCSI devices may terminate with a status of INCOMPLETE, if you loaded Diagnostics from the June 99 (IPR 9906) or September 99 (IPR 9909) releases onto an older system, such as T-Class or "Nova" (F,G,H,I Class; xx7 Family). The problem does not occur on newer systems such as D-Class, K-Class, N-Class or V-Class. The problem has been reported on both HP-UX 10.20 and 11.00.
If the system has Predictive Support, you may see "SCSISCAN 500" errors reported.
To fix the problem, issue the following commands (as root):
cd /dev
insf -e    # re-creates the diagnostic device files
The problem may occur when an Information, Expert or Firmware Download Tool is run on SCSI devices on systems with the SIO bus. An entry in the tool's Activity Log will report the error "/dev/diag/diag0 not found."
The problem is caused by an error in a SCSI library which removes /dev/diag/diag0 if a call to get access to the SIO passthru driver fails. The error will be fixed in a future release.
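Because the symptom is a missing /dev/diag/diag0 device file, a quick check can tell you whether the repair is needed. This is a minimal sketch: the function name diag0_status is hypothetical, and it assumes diag0 is a character device file, which is why the test uses -c.

```shell
# Report whether the diagnostic device file exists as a character device.
# The path defaults to the one named in this document; diag0_status is
# an illustrative name, not an HP command.
diag0_status() {
    dev=${1:-/dev/diag/diag0}
    if [ -c "$dev" ]; then
        echo "present: $dev"
    else
        echo "missing: $dev (as root, run: cd /dev; insf -e)"
    fi
}

# Example:
#   diag0_status
```

The function only reports; it leaves running insf to the administrator, since that command must be run as root.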
Removing Diagnostics
If you wish to remove the STM online diagnostic system after it has been installed, type:
swremove OnlineDiag
NOTE: Executing the "swremove OnlineDiag" command also removes the Predictive Support package.
Problem with Removing Diagnostics (HP-UX 10.20): There is a problem removing Diagnostics and associated patches once they have been installed on systems running HP-UX 10.20. For example, this problem occurs if you try to remove an old patch for diag1:
(S800) PHKL_17590: diag1 support PCI with subvendor/subsystem info
(S700) PHKL_17589: diag1 support PCI with subvendor/subsystem info
When you try to remove the diag1 patch, there will be an attempt to rebuild the kernel (required after removing a kernel patch). This kernel rebuild will fail, leaving an entry in the /var/adm/sw/swagent.log file that contains this text (and more):
/usr/ccs/bin/ld: Unsatisfied symbols: diag1_install (code)
This problem will occur even if you remove the Diagnostics first.
FIX: Avoid the problem entirely; DO NOT REMOVE THE PATCHES. Instead, just remove the Diagnostics (if desired) by using swremove.
Removing the diag1 and diag2 patches is not recommended. The patches are small, their functionality is limited to the diagnostics and OS error logging, removal and installation require that the system be rebooted, and they are required for versions of STM starting with A.14.00 (IPR 9902). In addition, one of them corrects a potential system panic and data corruption problem.
If you feel you must remove the patches associated with diagnostics on HP-UX 10.20 (not recommended), here is the procedure:
- Edit the file /stand/system and remove the line containing the word "diag1"
- Remove the Diagnostics using swremove.
- Now you can remove the diag1 and other patches (again, this is not recommended).
Getting More Information
You can get more information on Diagnostics (Support Tools) in the following ways:
- Once you install a specific stream (e.g. HP-UX 10.20), the Release Notes for that stream are available:
Support Tools Manager (STM): /usr/sbin/stm/Rel_NOTES.STM
EMS hardware monitors: /usr/sbin/stm/Rel_NOTES.HWE
Predictive Support: /opt/pred/bin/Rel_NOTES.PRED
- For the latest information on hardware support tools, such as STM and EMS Hardware Monitors, refer to the "Diagnostics" section of Hewlett-Packard's online documentation Web site at:
http://docs.hp.com/hpux/diag/
This site provides manuals, tutorials, FAQs, and other reference material. Two complete manuals ("Support Plus: Diagnostics User's Guide" and "EMS Hardware Monitors User's Guide") appear on the Web site and in the two following locations:
- In the DIAGNOSTICS directory under your mount point for the CD-ROM (e.g. /diagtemp/DIAGNOSTICS). The files are named diag_usr.pdf and ems_usr.pdf and can be read with the Adobe Acrobat viewer, which can be downloaded from the Adobe Web site.
- On the Instant Information CD-ROM.
EMS Hardware Monitors
Included on the Support Plus CD-ROM are the EMS Hardware Monitors which are an important tool for maintaining system availability. The EMS monitors allow you to monitor the operation of a wide variety of hardware products and be alerted immediately if any failure or other unusual event occurs. Hardware event monitoring is available to users running HP-UX 10.20 or 11.X (IPR February 1999 and later).
Hardware event monitoring provides a high level of protection against system hardware failure. By using hardware event monitoring, you can virtually eliminate undetected hardware failures that could interrupt system operation or cause data loss.
For complete information on installing and using EMS hardware event monitors, as well as a list of supported hardware, refer to the documents listed in "Getting More Information" earlier in this file.
The EMS Hardware Monitors are installed at the same time as the Support Tools Manager. Once the monitoring software is installed, monitoring is automatically enabled.
By default, messages regarding major warning, serious and critical events that occur on hardware being monitored will be:
- Written to /var/adm/syslog/syslog.log
- Sent to EMAIL address root
All events will be stored in /var/opt/resmon/log/event.log.
To configure, enable, or disable hardware event monitoring, run the monitoring request manager: /etc/opt/resmon/lbin/monconfig