- Overview
- Configuring Hardware Monitoring
- Changes
- Known Problems
- Monitors Provided
- Monitor Dependencies
- Product Documentation
- SD Product Structure
- Reporting Defects
These release notes describe the changes made to the EMS Hardware Monitors for the HP-UX 11i v3 (11.31), February 2007 release.
Note: On the HP-UX 11i v3 operating system, Online Diagnostics does not support tape drives. Although some of the Support Tools Manager (STM) tools may work with tape drives, they are not supported. The diagnostic tools and utilities that support these devices are HP StorageWorks Library and Tape Tools (L&TT). These tools are available at:
http://www.hp.com/support/tapetools
Overview
Hardware event monitoring enables you to eliminate undetected hardware failures that can interrupt system operation or cause data loss. The EMS Hardware Monitors are an important set of tools for maintaining system availability. The EMS Hardware Monitors enable you to monitor operations of a wide variety of hardware resources, and alert you immediately in the event of a failure or if an unusual event occurs. These monitors are included in the OnlineDiag bundle.
Hardware event monitoring provides a high level of protection against system hardware failure.
The Serial-Attached SCSI (SAS) Mass Storage Adapter Monitor (dm_sas_adapter), a new hardware monitor, is included on Itanium-based systems. It monitors the HP SAS Mass Storage Adapter and generates an event if abnormal activity occurs on the device.
Configuring Hardware Monitoring
The EMS Hardware Monitors are installed along with the Support Tools Manager (STM). Once the monitoring software is installed, monitoring is enabled automatically.
By default, monitors report events with a severity level of Major Warning, Serious, or Critical in the following ways:
- Stored in the /var/opt/resmon/log/event.log file
- Written to the /var/adm/syslog/syslog.log file
- Sent to the root email address
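The default reporting rule above amounts to a severity threshold. The following is a minimal sketch of that comparison, not the monitors' actual implementation. The full list of severity names and their ordering is an assumption made for illustration, as is the function name; only Major Warning, Serious, and Critical come from the text above.

```python
# Severity names in ascending order of urgency.
# ASSUMPTION: this list and its ordering are illustrative; the release notes
# only name Major Warning, Serious, and Critical as reported by default.
SEVERITIES = [
    "INFORMATION",
    "MINOR WARNING",
    "MAJOR WARNING",
    "SERIOUS",
    "CRITICAL",
]

# Lowest severity covered by the default reporting described in the text.
DEFAULT_THRESHOLD = "MAJOR WARNING"


def is_reported_by_default(severity, threshold=DEFAULT_THRESHOLD):
    """Return True if `severity` is at or above `threshold`.

    The comparison is case-insensitive for `severity`.
    """
    rank = {name: index for index, name in enumerate(SEVERITIES)}
    return rank[severity.upper()] >= rank[threshold]


if __name__ == "__main__":
    for name in ("Information", "Major Warning", "Critical"):
        print(name, is_reported_by_default(name))
```

The sketch only answers whether the default reporting described above covers a given severity. It makes no claim about how events below the threshold are handled.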
To configure, enable, or disable hardware event monitoring, run the Monitoring Request Manager, /etc/opt/resmon/lbin/monconfig.
The Peripheral Status Monitor (PSM) is configured using the Event Monitoring Service (EMS). For more information on how to configure PSM, see Configuring Monitors with the EMS GUI at:
http://docs.hp.com/en/diag/ems/ems_gui.htm
Changes
Following are the changes made to the EMS Hardware Monitors for the current release:
- Changes to all Monitors
- Changes to Individual Monitors
- Changes to Configuration Files
- Changes to Monitoring Request Manager
- Changes to all Monitors
- Not applicable
- Changes to Individual Monitors
This section describes the changes made to individual monitors. Monitors are listed in alphabetical order.
- Chassis Code Monitor (dm_chassis)
Of the following changes, one applies to PA-RISC-based systems only and one applies to Itanium-based systems only:
- JAGaf65955
Problem: The event description for Event 1352 is incorrect.
Fix: The event description is changed from "The battery on the SBCH is below the safe threshold. The battery can be replaced online." to "The battery on the SBCH is below the safe threshold."
- JAGaf69452
Problem: The cause and action of Event 1839 are incorrect.
Fix: The cause and action are modified.
- CMC Monitor (cmc_em)
- Not applicable
- Core Hardware (dm_core_hw)
The following changes apply to PA-RISC-based systems only:
- JAGaf80771
Problem: The monitor does not deallocate the file descriptors in certain failure conditions.
Fix: The monitor now deallocates the file descriptors in all conditions.
- If an online Field Replaceable Unit (FRU) addition, replacement, or removal is in progress, an I/O read or write operation may fail. With this fix, the monitor is enhanced to handle this failure.
- Core Hardware for PA-RISC and Itanium-based Intelligent Platform Management Interface (IPMI) systems (ia64_corehw)
The following change applies to PA-RISC and Itanium-based systems:
- JAGag09544
If an error persists on monitored hardware for longer than the interval specified in SENSOR_RESCAN_INTERVAL, reminder events are not generated. This problem is fixed, and the reminder events are now generated.
- Core Hardware Monitor - Asama (ipfcorehw_asama)
- Not applicable
- Core Hardware Monitor - Hitachi (ipfcorehw_hitachi)
- Not applicable
- CPE Monitor (cpe_em)
The following changes apply to Itanium-based systems only:
- JAGag01709 and JAGaf01732
The cpe_em monitor is now supported on the following systems:
- rx2660
- rx3600
- rx6600
- rx7640
- rx8640
- SD16B
- SD32B
- SD64B
- BL860c
- JAGag01044
Problem: When generating Event 100227 and Event 100229, the cpe_em monitor dumps core and aborts.
Fix: This problem is fixed.
- JAGaf90565
Problem: The cell number displayed for events 100212, 100213, and 100214 in the Error Description field is inconsistent with the cell number displayed in the Event Details field.
Fix: This problem is now fixed.
- JAGag19632; JAGag19730
When a memory double chip spare event occurs, the cpe_em monitor incorrectly generates Event 100299. This incorrect behavior is fixed.
- CPU Monitor (lpmc_em)
The following changes apply either to PA-RISC-based systems only or to both PA-RISC and Itanium-based systems:
- The CPU monitor is enabled for Montecito-based systems.
- JAGag09249
Problem: The monitor terminates immediately after starting up, because of a memory buffer overflow.
Fix: This problem is fixed.
- The lpmc_em monitor is shut down when an online migration of a processor starts. On completion of the online migration of the processor, the monitor is restarted.
- CPU Monitor - Hitachi (cmc_em_hitachi)
- Not applicable
- Disk Array FC60 Monitor (fc60mon)
- Not applicable
- Disk Monitor (disk_em)
- Not applicable
- Fibre Channel Adapter (dm_ql_adapter)
- Not applicable
- Fibre Channel Adapter Model A5158 Monitor (dm_TL_adapter)
- Not applicable
- Fibre Channel Switch (dm_fc_sw)
- Not applicable
- Forward Progress Log (FPL) Monitor (fpl_em)
- Not applicable
- High Availability Disk Array Monitor (ha_disk_array)
- Not applicable
- High Availability Storage System (dm_ses_enclosure)
- Not applicable
- iSCSI Driver Subsystem Monitor (dm_iscsi_adapter)
- Not applicable
- Memory (dm_memory)
- JAGaf23521
Problem: No event is generated if the Page De-allocation Table (PDT) is 100% full.
Fix: Event 1400 is generated by the memory monitor when the Page De-allocation Table (PDT) is 100% full. The severity of the event is Critical. The event is generated once every 24 hours.
- JAGaf30728
Problem: Events 3100, 3200, and 3300 are not generated even if all the single bit errors are on the same address.
Fix: Events 3100 to 3300 are disabled by default. However, you can enable these events. These events are generated when the number of single bit errors on the same address exceeds the specified limit. There is no time frame for exceeding this limit.
- Problem: Events 4000 to 4200 are not generated if the number of unique memory addresses on the same DIMM meets the specified threshold within the specified time frame.
Fix: Events 4000 to 4200 are now generated.
- The severity levels of events 3100, 3200, and 3300 are changed to INFORMATION, to disable the events. These events are disabled by default in the default_dm_memory.clcfg file.
- JAGaf48614
Problem: The MC/EXT field in the events generated on rp4440, rp3440, and c8000 systems is blank.
Fix: This problem is fixed.
- The dm_memory monitor is shut down when an online memory migration starts. On completion of the online memory migration, the monitor is restarted.
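The two threshold rules described for the dm_memory monitor differ in one respect: the same-address rule (Events 3100 to 3300) has no time frame, while the unique-address rule (Events 4000 to 4200) counts only errors inside a time frame. The sketch below illustrates that difference under stated assumptions. The class names, the `limit`, `threshold`, and `window_seconds` parameters, and the use of numeric timestamps are all hypothetical and are not taken from the monitor or its configuration files.

```python
from collections import Counter, defaultdict, deque


class SameAddressCounter:
    """Count single bit errors per address, with no time frame."""

    def __init__(self, limit):
        self.limit = limit          # hypothetical configured limit
        self._counts = Counter()    # address -> errors seen so far

    def record(self, address):
        """Record one error; return True once the count exceeds the limit."""
        self._counts[address] += 1
        return self._counts[address] > self.limit


class DimmUniqueAddressTracker:
    """Count unique failing addresses per DIMM inside a sliding window."""

    def __init__(self, threshold, window_seconds):
        self.threshold = threshold            # hypothetical threshold
        self.window_seconds = window_seconds  # hypothetical time frame
        self._errors = defaultdict(deque)     # dimm -> deque of (time, address)

    def record(self, dimm, address, timestamp):
        """Record one error; return True if the DIMM meets the threshold.

        Timestamps are assumed to be supplied in non-decreasing order.
        """
        errors = self._errors[dimm]
        errors.append((timestamp, address))
        # Discard errors that have fallen out of the time frame.
        while errors and timestamp - errors[0][0] > self.window_seconds:
            errors.popleft()
        unique_addresses = {addr for _, addr in errors}
        return len(unique_addresses) >= self.threshold
```

Repeated errors on one address never trip the unique-address tracker on their own, and old errors stop counting once they leave the window. The same-address counter keeps accumulating indefinitely.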
- Memory IA64 (memory_ia64)
- JAGag10908
Events 5000, 5100, 5200, and 5300 are disabled. Event 5400 is enabled.
- The memory_ia64 monitor is shut down when an online memory migration starts. On completion of the online memory migration, the monitor is restarted.
- Memory Monitor -- Hitachi (ipfmemory_hitachi)
- Not applicable
- MSA1000/MSA30 Storage Disk Array Monitor (msamon)
The following change applies to PA-RISC and Itanium-based systems:
- JAGaf00620
The msamon monitor now supports the HP Modular Smart Array (MSA) Controller.
- Peripheral Status Monitor (psmmon)
- Not applicable
- RAID Adapter (dm_raid_adapter)
- Not applicable
- Remote Monitor (RemoteMonitor)
- Not applicable
- SCSI Disk Monitor (scsi_disk)
- Not applicable
- System Status Monitor (sysstat_em)
The following changes apply to PA-RISC and Itanium-based systems:
- JAGaf05354
Problem: The sysstat_em monitor does not log a message in the /var/opt/resmon/log/api.log file to indicate that the monitor has started.
Fix: The monitor is enhanced to log a new message.
- JAGaf87835
Problem: When the monitor generates Event 100008, the status of the ui_host device changes to DOWN. However, the status of this device does not change to UP when the monitor restarts.
Fix: This problem is now fixed.
- UPS Monitor (dm_ups)
- Not applicable
- Changes to Configuration Files
The following change applies to PA-RISC and Itanium-based systems:
- JAGaf61014
Problem: If there is an error in any field in the .clcfg file, the monitors generate an event at the first occurrence after startup even if the flag value is False.
Fix: This problem is now fixed.
- Changes to Monitoring Request Manager
The following changes apply to PA-RISC and Itanium-based systems:
- JAGaf32869
Problem: When users issue a disable monitor request using monconfig, the following error message is displayed:
KILLMON ERROR: could not find a dictionary file containing the monitor name '/storage/events/disk_arrays/AutoRAID'
Fix: The error message is reworded as follows:
KILLMON ERROR: could not find a dictionary file containing the resource path '/storage/events/disk_arrays/AutoRAID'
- JAGaf71186
If the system time is changed to a time that is earlier than the last boot time, all the monitors generate the following errors when the moncheck utility is executed:
>/StorageAreaNetwork/events/SAN_Monitor ... NOT READY.
>/system/events/cpu/cmc ... NOT READY.
>/system/events/cpu_hitachi/cmc ... NOT READY.
>/system/events/cpe ... NOT READY.
>/storage/events/disks/default ... NOT READY.
>/storage/events/disks_hitachi/default ... NOT READY.
>/adapters/events/TL_adapter ... NOT READY.
>/connectivity/events/hubs/FC_hub ... NOT MONITORING. (Possibly there is no hardware to monitor.)
>/connectivity/events/switches/FC_switch ... NOT MONITORING. (Possibly there is no hardware to monitor.)
>/adapters/events/iscsi_adapter ... NOT READY.
>/system/events/dm_memory_asama ... NOT READY.
This indicates that monitoring is not possible in such a scenario.
Fix: The monitors are now enhanced to overcome this limitation. They continue monitoring even if the system time is changed to a time that is earlier than the last boot time.
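When checking many resources, the moncheck output quoted for JAGaf71186 can be summarized per resource. The following sketch parses lines of the form shown in that excerpt. It assumes the real output matches the excerpt (a leading `>`, a resource path, `...`, and an uppercase status ending in a period); the function name and the regular expression are illustrative and are not part of moncheck.

```python
import re

# Matches lines such as:
#   >/system/events/cpe ... NOT READY.
# ASSUMPTION: the line layout is inferred from the excerpt in these notes.
LINE_RE = re.compile(
    r"^>\s*(?P<resource>/\S+)\s+\.\.\.\s+(?P<status>[A-Z ]+?)\.(?:\s|$)"
)


def parse_moncheck(text):
    """Return a dict mapping each resource path to its reported status."""
    statuses = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            statuses[match.group("resource")] = match.group("status")
    return statuses


if __name__ == "__main__":
    sample = (
        ">/system/events/cpe ... NOT READY.\n"
        ">/connectivity/events/hubs/FC_hub ... NOT MONITORING. "
        "(Possibly there is no hardware to monitor.)\n"
    )
    for resource, status in parse_moncheck(sample).items():
        print(resource, "->", status)
```

Lines that do not match the assumed layout are skipped rather than reported as errors, so the result should be read as a partial summary.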
- JAGaf86545 and JAGag03713
For security reasons, the EMS framework restricts text logging to a particular directory path on the system. The modified monconfig module verifies whether the same directory path is provided in the user input while defining the path for logging.
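The kind of check described for JAGaf86545 and JAGag03713 can be sketched as a path containment test. This is an illustration only and does not reproduce the monconfig logic. The release notes do not name the permitted directory, so `/allowed/log/dir` below is a placeholder, and the function name is hypothetical.

```python
import os.path

# PLACEHOLDER: the release notes do not state the actual permitted directory.
ALLOWED_LOG_DIR = "/allowed/log/dir"


def is_within_allowed_dir(candidate, allowed_dir=ALLOWED_LOG_DIR):
    """Return True if `candidate` resolves to a path inside `allowed_dir`.

    Relative candidates are resolved against `allowed_dir`. Symbolic links
    and ".." components are resolved before the comparison.
    """
    allowed = os.path.realpath(allowed_dir)
    target = os.path.realpath(os.path.join(allowed, candidate))
    return os.path.commonpath([allowed, target]) == allowed


if __name__ == "__main__":
    print(is_within_allowed_dir("/allowed/log/dir/events.txt"))
    print(is_within_allowed_dir("../../etc/passwd"))
```

Resolving the path before comparing it prevents a `..` component or a symbolic link from pointing the log file outside the permitted directory.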
Known Problems
- If the maxssiz_64bit kernel parameter is set below the default value of 0x800000, it can cause the lpmc_em monitor to abort.
- Memory Page Deallocation (MPD), which runs on most current HP-UX systems, does not work on rx4610 systems. The activity log for memlogd includes a message that reads "unsupported device".
MPD cannot be implemented on the rx4610 system, because the system's design does not allow the memlogd daemon to run on it.
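For the maxssiz_64bit problem listed above, a quick comparison against the default value can flag a risky setting. The sketch below assumes the current value has already been obtained from the system by some other means; the function name is hypothetical, and only the default value 0x800000 comes from these notes.

```python
# Default value cited in these release notes (0x800000 bytes, that is 8 MB).
DEFAULT_MAXSSIZ_64BIT = 0x800000


def maxssiz_64bit_is_safe(value):
    """Return True if `value` is not below the documented default.

    Accepts an integer, or a string in decimal or 0x-prefixed hexadecimal.
    """
    if isinstance(value, str):
        value = int(value, 0)  # base 0 lets int() honor a "0x" prefix
    return value >= DEFAULT_MAXSSIZ_64BIT


if __name__ == "__main__":
    for setting in ("0x800000", "0x400000", 16777216):
        print(setting, maxssiz_64bit_is_safe(setting))
```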
Monitors Provided
For the February 2007 release of HP-UX 11i v3 (11.31), the following monitors are provided:
- CMC Monitor (cmc_em)
- Core Hardware (dm_core_hw)
- Core Hardware for PA-RISC and Itanium-based Intelligent Platform Management Interface (IPMI) systems (ia64_corehw)
- Core Hardware Monitor -- Asama (ipfcorehw_asama)
- Core Hardware Monitor -- Hitachi (ipfcorehw_hitachi)
- CPE Monitor (cpe_em)
- CPU Monitor (lpmc_em)
- CPU Monitor -- Hitachi (cmc_em_hitachi)
- Disk (disk_em)
- Disk Array FC60 (fc60mon)
- Fibre Channel Adapter (dm_ql_adapter)
- Fibre Channel Adapter Model A5158 (dm_TL_adapter)
- Forward Progress Log (FPL) Monitor (fpl_em)
- High Availability Disk Array (ha_disk_array)
- High Availability Storage System (dm_ses_enclosure)
- iSCSI Driver Subsystem Monitor (dm_iscsi_adapter)
- Memory (dm_memory)
- Memory IA64 (memory_ia64)
- Memory Monitor -- Hitachi (ipfmemory_hitachi)
- MSA1000/MSA30 Storage Disk Array Monitor (msamon)
- Peripheral Status Monitor (psmmon)
- RAID Adapter (dm_raid_adapter)
- Remote Monitor (RemoteMonitor)
- Serial-Attached SCSI (SAS) Mass Storage Adapter Monitor (dm_sas_adapter)
- SCSI Disk Monitor (scsi_disk)
- System Status (sysstat_em)
- UPS (dm_ups)
The following monitors are NOT provided:
- dm_FCMS_adapter
- Fibre Channel SCSI Multiplexer (dm_fc_scsi_mux)
- fw_disk_array: hardware not supported on the system
- scsi123_em: hardware not supported on the system
Monitor Dependencies
For detailed information about the products and the monitors supporting them, and additional dependencies, see the documentation on Online Diagnostics at:
http://docs.hp.com/en/diag/
For a list of the current required patches, see the DIAGNOSTIC.readme file at:
http://docs.hp.com/en/diag/st/st_read.htm
Current monitor requirements are described in the Online Diagnostics Administrator's and User's Guide at:
http://docs.hp.com/en/diag
Product Documentation
The following documents related to EMS Hardware Monitors are available at:
http://docs.hp.com/en/diag/
- Data Sheets
- Online Diagnostics Administrator's and User's Guide
- EMS HW Monitors for Hitachi Systems Running HP-UX
- Event Descriptions
- Release Notes
SD Product Structure
The EMS Hardware Monitors are installed as part of the OnlineDiag bundle (product number B4708AA). In addition, they require the EMS framework (product number B7609BA).
For information on the STM product, see the STM release notes file at:
/usr/sbin/stm/REL_NOTES.STM
Following is the information about the bundle, product, sub-product, and fileset of the OnlineDiag depot:
SD Bundle: OnlineDiag  Description: On-line Diagnostic System (Series 800/700)
SD PRODUCT: Sup-Tool-Mgr  Description: Support Tools Manager for HP-UX Systems
  SD SUB-PRODUCT: Manuals  Description: Support Tools Manager Manual Pages
    FILESET: RELEASE_NOTES  Description: HPUX STM Release Notes
    FILESET: STM-MAN  Description: HPUX STM Manual Pages
  SD SUB-PRODUCT: Runtime  Description: STM Manual Runtime
    FILESET: STM-CATALOGS  Description: HPUX STM Shared Libraries
    FILESET: STM-SHLIBS  Description: HPUX STM Shared Libraries
    FILESET: STM-UI-RUN  Description: HPUX STM User Interface
    FILESET: STM-UUT-RUN  Description: HPUX STM Unit Under Test Runtime
SD PRODUCT: EMS-Config  Description: EMS Config
    FILESET: EMS-GUI  Description: Event Monitoring Service Graphical User Interface
SD PRODUCT: EMS-Core  Description: EMS Core Product
    FILESET: EMS-CORE  Description: Event Monitoring Service Core Files
Reporting Defects
You can report defects related to EMS Hardware Monitors by filing a request on CHART. If you do not have access to CHART, contact your local HP representative to file a defect on your behalf.