Announcement
System Fault Management (SFM) is a collection of tools used to monitor the health of HP servers and receive information about hardware such as memory, CPU, power supplies, and cooling devices. SFM operates in the Web-Based Enterprise Management (WBEM) environment.
SFM includes the following tools:
- SFM Providers
- EVWEB
- EMT
This document contains the following sections:
- SFM Providers
- EVWEB
- EMT
- System Requirements
- Supported Browsers
- Limitations and Workarounds
- Product Documentation
- Software and Documentation Availability in Native Languages
- Product Structure
- Reporting Defects
SFM Providers
SFM providers are tools that gather information related to various hardware devices and report to the Common Information Model Object Manager (CIMOM).
The SFM providers and their respective functions are as follows:
- CPU Instance Provider: Retrieves information about processor inventory and consolidated health of the processor subsystem.
- Memory Instance Provider: Gathers information about memory inventory and consolidated health of the memory subsystem.
- EMS Wrapper Provider: Converts events generated by the EMS Hardware Monitors into indications and reports those indications to the CIMOM.
- Filter Metadata (FMD) Provider: Provides the facility to predefine the important filter in a repository. FMD also ensures that all important or chosen indications are logged to the local event archive. FMD creates HP-advised subscriptions when SFM is installed.
- Environmental Providers: Retrieve information about cooling devices (fans) and power supply (bulk power supply and AC input lines) on HP servers. They also retrieve consolidated health of cooling, power, system temperature, and system voltage subsystems on HP servers.
- Event Manager Common Information Model (EVM CIM) Provider: Converts EVM events into indications and reports those indications to the CIMOM.
- SFMIndicationProvider: Generates WBEM indications when an abnormal activity is detected on the monitored devices and reports these WBEM indications to the EMS framework.
- Firmware Revision Instance Provider: Retrieves information about the firmware revision of system hardware components, such as system firmware version and Management Processor (MP) firmware version.
- Disk Instance Provider: Retrieves information about the consolidated health status and inventory information of direct attached disk drives, such as SCSI drives.
- MP Instance Provider: Retrieves information about the management processor of the system.
- Enclosure Instance Provider: Retrieves information about the Onboard Administrator (OA), such as OA description, OA IP address, OA MAC address, and the URL to launch the OA.
- Record Log Provider: Enables event analysis tools such as Web-Based Enterprise Services (WEBES) to access details of indications generated by the SFMIndicationProvider that are available in the SFM database, for event analysis.
- Temperature Provider: Describes properties such as sensor number, current temperature reading, and temperature sensor status.
The MP Provider is enhanced. The HP SMH property page for MP Provider describes the following new properties:
- Firmware revision of the MP
- Server locator LED color
- Server locator LED blink rate
- MP NIC connection status
A new provider, called the Temperature Provider, is introduced. The HP SMH property page for the Temperature Provider describes the following new properties:
- Sensor number
- Current temperature reading
- Temperature sensor status
SFM supports the PCI Express interface events on the following HP Integrity systems:
- rx7640
- rx8640
- SD16B
- SD32B
- SD64B
SFM is the default monitoring mode. However, it is possible to switch to the OnlineDiag monitoring mode. For information on how to switch to the OnlineDiag mode, see the SFM Administrator’s Guide at:
http://docs.hp.com/en/diag
A new provider, called the Record Log Provider, is introduced. This provider enables event analysis tools such as Web-Based Enterprise Services (WEBES) to access details of indications generated by the SFMIndicationProvider that are available in the SFM database, for event analysis.
Select SFM features support the following non-HP systems:
- ia64hitachiserverBladeSymphony
- ia64NECserveru32000
- ia64NECserveru64000
The features that support the mentioned non-HP systems are as follows:
- EVWEB. However, throttling configuration and Logviewer are not supported.
- FMD Provider is supported.
- The sfmconfig command with select options is supported.
SFM is enhanced to support the Automatic Process Recovery (APR) functionality. Event 100662 is generated when a recoverable Machine Check Abort (MCA) occurs. If a second recoverable MCA occurs within a period of two months, Event 100661 is generated, and Dynamic Processor Resilience (DPR) action is initiated on the faulty processor.
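The APR decision described above can be pictured with a minimal sketch. It is illustrative only and is not SFM code: the function name `classify_mca` is hypothetical, "two months" is approximated as 60 days, and an MCA that occurs after the window has elapsed is assumed to be treated as a first occurrence again (the release notes do not state this).

```python
from datetime import datetime, timedelta

# Assumption: "two months" is approximated as 60 days. The exact window
# that SFM uses is not stated in these release notes.
APR_WINDOW = timedelta(days=60)

EVENT_FIRST_MCA = 100662   # generated for a recoverable MCA
EVENT_REPEAT_MCA = 100661  # generated for a second recoverable MCA within the window


def classify_mca(previous_mca, current_mca):
    """Return (event_id, initiate_dpr) for a recoverable MCA on one processor.

    previous_mca is the time of the last recoverable MCA on the same
    processor, or None if there was none. Hypothetical helper for illustration.
    """
    if previous_mca is not None and current_mca - previous_mca <= APR_WINDOW:
        # Second recoverable MCA inside the window: DPR is initiated.
        return EVENT_REPEAT_MCA, True
    # First MCA, or (by assumption) one that falls outside the window.
    return EVENT_FIRST_MCA, False


first = datetime(2007, 2, 1)
print(classify_mca(None, first))
print(classify_mca(first, first + timedelta(days=30)))
print(classify_mca(first, first + timedelta(days=90)))
```

In this sketch only the second call, 30 days after the first MCA, selects Event 100661 and flags the DPR action.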
Supported EMS Hardware Monitors
The following Event Monitoring Service (EMS) monitors are supported on HP 9000 servers running the HP-UX 11i v3 operating system:
- LPMC (now CPU) (lpmc_em)
- Memory (dm_memory)
- Core HW (dm_core_hw)
- Chassis Code (dm_chassis)
- Disk (disk_em)
- Integrity Core Hardware Monitor (ia64_corehw)
- IPMI Forward Progress Log Monitor (fpl_em)
The following EMS hardware monitors are supported on Itanium-based servers running the HP-UX 11i v3 operating system:
- Corrected Platform Error Monitor (cpe_em)
- IPMI Forward Progress Log Monitor (fpl_em)
- CMC Monitor (cmc_em)
- Itanium Core Hardware Monitor (ia64_corehw)
- SCSI Disk Monitor (disk_em)
- Itanium Memory Monitor (memory_ia64)
Defect Fixes
- QXCR1000587224
Problem: The clock speed reported by SFM on HP Integrity systems based on Dual-Core Intel Itanium processor or Dual-Core Intel® Itanium® processor 9100 series is different from that displayed in the output of the following command:
# /usr/contrib/bin/machinfo
Cause: SFM reports incorrect clock speed.
Resolution: This problem is fixed. SFM reports the correct clock speed.
- QXCR1000750479
Problem: HP 9000 Itanium-Based Intelligent Platform Management Interface (IPMI) systems do not support the dm_chassis monitor. However, the EMS Wrapper Provider attempts to start the monitor. As a result, errors are logged in the api.log file.
Resolution: This problem is fixed. The EMS Wrapper Provider does not attempt to start the dm_chassis monitor on HP 9000 IPMI-based systems.
- QXCR1000569807
Problem: SFM clears event logs when the System Event Log (SEL) is 80% - 95% full. On low-end HP Integrity systems, the LED is switched on when the SEL is 75% full, which is not desirable.
Resolution: The threshold in the ia64_corehw.cfg file is modified. When the SEL is 65% - 70% full, SFM clears event logs to avoid the generation of the event, and the consequent glowing of the LED.
EVWEB
This section describes EVWEB.
EVWEB is a tool that can be used to view and administer WBEM indications generated on the HP-UX 11i v3 system.
The EVWEB tool includes the following components:
- Event Subscription Administrator
Event Subscription Administrator enables users to subscribe to an indication and view it. In addition, users with administrative privileges can modify and delete subscriptions. By subscribing to an indication, users can obtain detailed information about various WBEM indications. Users can also view indications generated by the High Availability Monitors. Indications generated by High Availability Monitors are called HP threshold indications.
As a part of event subscription, users must specify event subscription criteria. Users must also select one or more destinations to receive information about indications.
Users can select one or more destinations from the following list:
- Event Archive: The path to Event Archive is /var/opt/sfmdb/pgsql. Event Archive is the default destination.
- syslog: The path to syslog is /var/adm/syslog/syslog.log.
- Email: Event notifications are emailed to the specified email address. Users can specify multiple email addresses.
- Event Viewer
The Event Viewer enables users to view the indications stored in the Event Archive. In addition, users with administrative privileges can also delete these indications. By default, HP-advised subscriptions are stored in the Event Archive. The Event Viewer also enables users to search for an indication logged in the Event Archive.
- Log Viewer
The Log Viewer enables users to view and search the low-level logs stored in the log database.
Benefits
The benefits of EVWEB are as follows:
- Enables users to manage all WBEM indications that are supported by SFM.
- Provides an option to customize the indication destination to receive information about HP-advised subscriptions.
- Enables users to view the command-line equivalent of an action performed using the GUI, which helps users learn the usage of various commands.
Features
EVWEB offers the following features:
- Provides quick search and advanced search mechanisms to view events from the Event Archive
- Generates a list of events in a printer-friendly format (GUI only)
- Enables users with administrative privileges to delete indications
- Enables users with administrative privileges to manage subscriptions, that is, to create, modify, and delete them
- Enables users to view subscriptions created using EVWEB
- Enables users to view externally created subscriptions. Subscriptions created by using tools other than EVWEB are termed externally created event subscriptions.
- Enables users to view HP-advised subscriptions. HP-advised subscriptions are provided by default by HP.
Note: EVWEB supports these features on browser-based GUI and the CLI.
What Is New in EVWEB
- The Throttling Configuration feature is available in both SFM and EMS modes. However, in the EMS mode, the association of a throttling configuration with a subscription is not effective. By design, the EMS mode associates only the default throttling configuration with a subscription.
- The -y option is added to the evweb subscribe -T command. This option enables you to copy a throttling configuration policy to a new file. The EVWEB GUI has a new link, Copy throttling config to file, which enables you to complete the same task from the HP System Management Homepage (SMH).
- The Copy and Create Throttling Configuration feature on the EVWEB GUI is removed.
- The -e option is added to the evweb logviewer -L command. This option specifies whether the -L option must include the low level details and the summary information of the low level logs.
Defect Fixes
- QXCR1000740925
Problem: The evweb logviewer -E -i -n command requires both log ID and log index to display the low level log details of a single, unique log. Therefore, it is mandatory to use the -n option with the -i option.
Solution: This feature is now modified. The evweb logviewer -E -i command now displays the details of all logs that match the log ID and the corresponding log indexes. Therefore, it is not mandatory to use the -n option with the -i option.
- QXCR1000714243
The EVWEB Event Viewer is enhanced to display the error details about the fpl_em and ia64_corehw monitors.
- QXCR1000745044
The EVWEB Event Viewer is enhanced to display the error details about the cpe monitor.
- QXCR1000589288
Problem: The summary and description of events generated in the EMS mode are incomplete. For example, the following summary is displayed for an event generated in the EMS mode:
" parity errors have been detected in the Instruction or Data Cache Memory (I-Cache or D-Cache) in."
Solution: This issue is now fixed. The summary for an event generated in the EMS mode is now displayed as follows:
"6 parity errors have been detected in the Instruction or Data Cache Memory (I-Cache or D-Cache) in 3 Days."
Limitations
- When an HP-advised subscription is copied to create or modify another subscription, the subscription criteria are not copied. Only the destinations are copied to the new subscription.
- Event details displayed in EVWEB Event Viewer and embedded in the EVWEB email notification may not have similar readability or formatting as provided by the EMS event notification. However, this issue is not applicable to HP_DeviceIndication class indications.
EMT
This section describes EMT.
Error Management Technology (EMT) is a component of SFM. EMT includes Common Error Repository (CER), which is an online, searchable, and updateable error repository. The CER contains error metadata such as error description, error number, error type, severity, cause of the error, and corrective actions for errors generated on the HP-UX 11i v3 system.
Benefits
The benefits of EMT are as follows:
- Enables users to view most errors that can occur on the HP-UX 11i v3 system.
- Provides an option to the administrators to add, modify, and delete custom solutions.
- Enables users to view the command-line equivalent of an action performed using the GUI, which helps users learn the usage of various commands.
Features
EMT offers the following features:
- Provides both quick search and advanced search mechanisms to view error metadata from CER
- Generates a list of errors in a printer-friendly format (GUI only)
- Enables users with administrative privileges to create, modify, and delete custom solutions
- EMT is enhanced to enable querying of system-specific events. This enhancement reduces the query time, and is available in both GUI and CLI modes.
Note: EMT supports these features on browser-based GUI and the CLI.
Following is a limitation of EMT:
- A generic query to the CER retrieves a large amount of data, which may affect the performance of EMT.
System Requirements
SFM is supported on the following systems running the HP-UX 11i v3 operating system:
- HP 9000 servers
  - rp3410
  - rp3440
  - rp4410
  - rp4440
  - rp7405
  - rp7410
  - rp7420
  - rp8400
  - rp8420
  - SD16, SD32, SD64
  - SD16A, SD32A, SD64A
  - SD16B, SD32B, SD64B
- Itanium-based servers
  - cx2600
  - cx2620
  - rx1600
  - rx1620
  - rx2600
  - rx2620
  - rx2660 (supports PCI-X IO backplane only)
  - rx3600
  - rx4640
  - rx5670
  - rx6600
  - rx7620
  - rx7640
  - rx8620
  - rx8640
  - SD16A, SD32A, SD64A
  - SD16B, SD32B, SD64B
  - BL60p HP Server Blade
SFM supports the following systems based on the Intel® Itanium® Processor 9100 series and running the HP-UX 11i v3 operating system:
- rx7640
- rx8640
- SD16B
- SD32B
- SD64B
The software requirements for using SFM are as follows:
- HP-UX 11i v3 February 2007
- OpenSSL Version A.00.09.07e.013 or later
- WBEM Services Version A.02.07 or later
- EVM-EventMgr B.11.31
- SysMgmtBase B.00.02.03
- SysMgmtWeb version A.2.2.4 (HP-UX Web Based System Management User Interface)
- Online Diagnostics B.11.31.02.yy
Notes:
- SysMgmtWeb is optional. However, you will not be able to access EVWEB GUI if SysMgmtWeb is not installed on the system. SysMgmtWeb, WBEMServices, and Online Diagnostics are available on the Operating Environment (OE) media.
- HP recommends that you install HP Systems Insight Manager (HP SIM) version 5.0.01 to remotely administer indications and instances.
- The software versions listed above are the minimum requirements. Later versions of this software also support SFM.
Supported Browsers
SFM supports the following browsers:
- Internet Explorer version 6.0 and above
- Mozilla version 1.5 and above
Limitations and Workarounds
- After the system is rebooted or the CIMOM is restarted, the first request to SFM hardware inventory providers such as the CPU Instance Provider, Memory Provider, and the Environmental Providers may fail with the CIM_ERR_FAILED status code. In addition, the following message is displayed on the client system: "Inventory information is being built currently. Please try after some time". On subsequent requests, the SFM hardware inventory providers respond with the requested information immediately.
- Hardware inventory providers are not supported on HP Virtual Machines.
- QXCR1000748311
Symptom: On HP Integrity systems, indications may not be generated after an OE update.
Cause: On HP Integrity systems, EMS internal subscriptions can be missing after an OE update as the domain name cannot be retrieved.
Workaround: Restart SFMProviderModule. To generate indications, complete the following steps:
- Log in as superuser.
- To disable the SFMProviderModule, enter the following command at the HP-UX prompt:
# /opt/wbem/bin/cimprovider -dm SFMProviderModule
- To enable the SFMProviderModule, enter the following command at the HP-UX prompt:
# /opt/wbem/bin/cimprovider -em SFMProviderModule
Product Documentation
For more information on SFM, see the following documents at:
http://docs.hp.com/en/diag.html
- SFM Frequently Asked Questions (FAQs)
- System Fault Management Administrator's Guide
- SFM Provider Data Sheets
- SFM Tables of Versions
- SFM Patch Descriptions
Software and Documentation Availability in Native Languages
SFM software and documents are available only in the English language.
Product Structure
The SFM product, consisting of SFM providers and EVWEB, is installed as part of the SysFaultMgmt bundle.
Following are the commands you must use to obtain the bundle, product, sub-product, and the fileset information about the SysFaultMgmt depot:
- Bundle
$ swlist -s <SysFaultMgmt Depot Location>
SysFaultMgmt C.03.00.xx.yy HPUX System Fault Management
- Product(s)
$ swlist -l product -s <SysFaultMgmt Depot Location>
SFM-CORE C.03.00.xx HPUX System Fault Management
SFMDB C.03.00.xx HP System Management Database (SFMDB)
- Sub-product(s)
$ swlist -l subproduct -s <SysFaultMgmt Depot Location>
SFM-CORE.HS-PROVIDER HS-PROVIDER
SFM-CORE.ERROR-MGMT Error Management Technology
SFM-CORE.EVWEB
SFM-CORE.FMD-PROVIDER FMD-PROVIDER
SFM-CORE.GS GS
SFM-CORE.SFM-HAS SFM-HAS
SFM-CORE.SFM-PROVIDER SFM-PROVIDER
SFMDB C.03.00.xx HP System Management Database (SFMDB)
- Fileset(s)
$ swlist -l fileset -s <SysFaultMgmt Depot Location>
# SFM-CORE C.03.00.xx HPUX System Fault Management
SFM-CORE.HS_PRO_COREIA C.03.00.xx HealthState Instance Provider Platform Specific Fileset
SFM-CORE.HS_PRO_COREPA C.03.00.xx HealthState Instance Provider Platform Specific Fileset
SFM-CORE.CTR_PRO_COMM C.03.00.xx Control Provider Common Fileset
SFM-CORE.CTR_PRO_COREIA C.03.00.xx Control Provider Platform Specific Fileset
SFM-CORE.CTR_PRO_COREPA C.03.00.xx Control Provider Platform Specific Fileset
SFM-CORE.EMT_COMM C.03.00.xx EMT Common components
SFM-CORE.EMT_COREIA C.03.00.xx EMT core platform specific fileset
SFM-CORE.EMT_COREPA C.03.00.xx EMT core platform specific fileset
SFM-CORE.EVWEB_COMM C.03.00.xx Event Manager (EvWEB) Common components
SFM-CORE.EVWEB_COREIA C.03.00.xx EvWEB core platform specific fileset
SFM-CORE.EVWEB_COREPA C.03.00.xx EvWEB core platform specific fileset
SFM-CORE.EVWEB_DOC C.03.00.xx EvWEB Online help fileset
SFM-CORE.EVWEB_GUI_COMM C.03.00.xx EvWEB GUI common fileset
SFM-CORE.EVWEB_GUI_IA C.03.00.xx EvWEB GUI platform specific fileset
SFM-CORE.EVWEB_GUI_PA C.03.00.xx EvWEB GUI platform specific fileset
SFM-CORE.EVWEB_MAN C.03.00.xx EVWEB Man pages fileset
SFM-CORE.FMD_PRO_COMM C.03.00.xx Filter Metadata Instance Provider Common Fileset
SFM-CORE.FMD_PRO_COREIA C.03.00.xx Filter Metadata Instance Provider Platform Specific Fileset
SFM-CORE.FMD_PRO_COREPA C.03.00.xx Filter Metadata Instance Provider Platform Specific Fileset
SFM-CORE.GS_COMM C.03.00.xx General Services Common Fileset
SFM-CORE.GS_COREIA C.03.00.xx General Services Platform Specific Fileset
SFM-CORE.GS_COREPA C.03.00.xx General Services Platform Specific Fileset
SFM-CORE.HAS-IA C.03.00.xx Hardware Access Services IA
SFM-CORE.HAS-PA C.03.00.xx Hardware Access Services PA
SFM-CORE.MISC_COMM C.03.00.xx MISC Common Fileset
SFM-CORE.MISC_COREIA C.03.00.xx MISC Platform Specific Fileset
SFM-CORE.MISC_COREPA C.03.00.xx MISC Platform Specific Fileset
SFM-CORE.SFM_PRO_COMM C.03.00.xx SysFaultMgmt Provider Module COMMON
SFM-CORE.SFM_PRO_IA C.03.00.xx SysFaultMgmt Provider Module IA
SFM-CORE.SFM_PRO_PA C.03.00.xx SysFaultMgmt Provider Module PA
# SFMDB C.03.00.xx HP System Management Database (SFMDB)
SFMDB.SMPGSQL-DOC C.03.00.xx PostgreSQL (SFMDB) Documentation Files
SFMDB.SMPGSQL-INC C.03.00.xx PostgreSQL (SFMDB) Header Files
SFMDB.SMPGSQL-LIB C.03.00.xx PostgreSQL (SFMDB) Library Files (Architecture dependent)
SFMDB.SMPGSQL-LIB C.03.00.xx PostgreSQL (SFMDB) Library Files (Architecture dependent)
SFMDB.SMPGSQL-MAN C.03.00.xx PostgreSQL (SFMDB) Manual Pages
SFMDB.SMPGSQL-RUN C.03.00.xx PostgreSQL (SFMDB) Executable Files (Architecture dependent)
SFMDB.SMPGSQL-RUN C.03.00.xx PostgreSQL (SFMDB) Executable Files (Architecture dependent)
SFMDB.SMPGSQL-SHA C.03.00.xx PostgreSQL (SFMDB) Share File
SFMDB.SMPGSQL-SRC C.03.00.xx PostgreSQL (SFMDB) Source Files
Reporting Defects
You can report defects related to SFM or EVWEB by filing a request on CHART. The name of the project is diag.sfm. If you do not have access to CHART, contact your local HP representative to file a defect on your behalf.