Announcement
System Fault Management (SFM) is a collection of tools used to monitor the health of HP servers and receive information about hardware such as memory, CPU, power supplies, and cooling devices. SFM operates in the Web-Based Enterprise Management (WBEM) environment.
SFM includes the following tools:
- SFM Providers
- EVWEB
Note: Starting with the HP-UX 11i v2 March 2006 release, EVWEB is included in the SysFaultMgmt bundle.
- EMT
This document contains the following sections:
- SFM Providers
- EVWEB
- EMT
- System Requirements
- Supported Browsers
- Limitations and Workarounds
- Product Documentation
- Software and Documentation Availability in Native Languages
- Product Structure
- Reporting Defects
SFM Providers
SFM providers are tools that gather information about various hardware devices and report it to the Common Information Model Object Manager (CIMOM).
Table 1 lists the SFM providers and their respective functions:
Table 1: Providers and their Functions
| Providers | Functions |
| --- | --- |
| CPU Instance Provider | Retrieves information about processor inventory and consolidated health of the processor subsystem. |
| Memory Instance Provider | Gathers information about memory inventory and consolidated health of the memory subsystem. |
| EMS Wrapper Provider | Converts events generated by the EMS Hardware Monitors into indications and reports those indications to the CIMOM. |
| Filter Metadata (FMD) Provider | Provides the facility to predefine the important filter in a repository. FMD also ensures that all important or chosen indications are logged to the local event archive. FMD creates HP-advised subscriptions when SFM is installed. |
| Environmental Providers | Retrieve information about cooling devices (fans) and power supply (bulk power supply and AC input lines) on HP servers. They also retrieve consolidated health of cooling, power, system temperature, and system voltage subsystems on HP servers. |
| Event Manager Common Information Model (EVM CIM) Provider | Converts EVM events into indications and reports those indications to the CIMOM. |
| SFMIndicationProvider | Generates WBEM indications when an abnormal activity is detected on the monitored devices and reports these WBEM indications to the EMS framework. |
| Firmware Revision Instance Provider | Retrieves information about the firmware revision of system hardware components, such as system firmware version and Management Processor (MP) firmware version. |
| MP Instance Provider | Retrieves information about the management processor of the system. |
| Enclosure Instance Provider | Retrieves information about the Onboard Administrator (OA), such as OA description, OA IP address, OA MAC address, and the URL to launch the OA. |
| Record Log Provider | Enables event analysis tools such as Web-Based Enterprise Services (WEBES) to access details of indications generated by the SFMIndicationProvider that are available in the SFM database, for event analysis. The Record Log Provider also enables event analysis tools to access MCA logs for event analysis. |
| Temperature Provider | Describes properties such as sensor number, current temperature reading, and temperature sensor status. |
| ComputerSystem Chassis Provider | Describes properties such as the serial number, product ID, and virtual Universally Unique ID (UUID). |
| MCA Indication Provider | Generates WBEM indications when MCA logs are present. |
- Starting with the March 2009 release, the SysFaultMgmt and OnlineDiag products have a prerequisite on the ProviderSvcsBase product. SysFaultMgmt no longer has a prerequisite on the OnlineDiag product.
As a part of stand-alone install, ProviderSvcsBase does not install in the following scenarios:
- If OnlineDiag September 2008 or an earlier version is installed or selected as part of installation.
- If SysFaultMgmt September 2008 or an earlier version is installed or selected as part of installation.
Therefore, installing or upgrading to SysFaultMgmt March 2009 or later requires upgrading OnlineDiag to March 2009 or later, and vice versa, so that both products remain fully functional.
- SFM introduces the CIMUtil command. Using CIMUtil, you can enumerate instances related to various devices and create filters, handlers, and subscriptions that SFM supports.
- SFM does not monitor storage disks or disk enclosures. However, the DAS Provider handles inventory details and error monitoring for direct-attached storage disks. For information, see HP-UX WBEM Storage Indication Provider (DAS Provider) at the following location:
http://docs.hp.com/en/netsys.html#HP%20WBEM%20Providers
Additionally, there is no impact on the functionality or the information retrieved.
- Starting with the March 2009 release, the IPMI Event Viewer is delivered as part of SysFaultMgmt. In earlier releases, it was delivered as part of the Online Diagnostics bundle. The IPMI Event Viewer displays low-level system log information.
Supported EMS Hardware Monitors
SFM is the default monitoring mode. However, you can switch to the OnlineDiag monitoring mode. For information on how to switch to the OnlineDiag mode, see the SFM Administrator's Guide at: http://docs.hp.com/en/diag
The Event Monitoring Service (EMS) Wrapper Provider receives events generated by the EMS Hardware Monitors. The following EMS Hardware Monitors are supported on HP 9000 servers running the HP-UX 11i v3 operating system:
- LPMC (now CPU) (lpmc_em)
- Memory (dm_memory)
- Core HW (dm_core_hw)
- Chassis Code (dm_chassis)
- IPMI Forward Progress Log Monitor (fpl_em)
The following EMS Hardware Monitors are supported on HP Integrity servers running the HP-UX 11i v3 operating system:
- Corrected Platform Error Monitor (cpe_em)
- IPMI Forward Progress Log Monitor (fpl_em)
- CMC Monitor (cmc_em)
- Itanium Core Hardware Monitor (ia64_corehw)
- Itanium Memory Monitor (memory_ia64)
NOTE: The EMS Wrapper Provider does not currently support all EMS Hardware Monitors. EMS events generated by monitors other than those listed above are not conveyed to the CIMOM, regardless of the current diagnostics mode (SFM or EMS).
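The support lists above can be read as a simple lookup. The following Python sketch is illustrative only: the monitor names come from the lists in this section, while the dictionary layout and the function name events_reach_cimom are assumptions made for this example and are not part of SFM.

```python
# Monitor names are copied from the lists in this section. The data layout
# and the function name are hypothetical and exist only for illustration.
SUPPORTED_MONITORS = {
    "HP 9000": {"lpmc_em", "dm_memory", "dm_core_hw", "dm_chassis", "fpl_em"},
    "HP Integrity": {"cpe_em", "fpl_em", "cmc_em", "ia64_corehw", "memory_ia64"},
}


def events_reach_cimom(platform: str, monitor: str) -> bool:
    """Return True if the EMS Wrapper Provider is listed as supporting
    this monitor on this platform, so its events are conveyed to the CIMOM."""
    return monitor in SUPPORTED_MONITORS.get(platform, set())


if __name__ == "__main__":
    for platform, monitor in [("HP Integrity", "cmc_em"), ("HP 9000", "cmc_em")]:
        print(platform, monitor, events_reach_cimom(platform, monitor))
```

Any monitor name that is not in the table for a platform is treated as unsupported, which matches the note above.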
Defect Fixes
- QXCR1000866240
Problem: The SFM provider module consumes excessive CPU time, which can reach up to 400% of CPU cycle time.
Cause: This problem occurs due to thread synchronization issues in the handling of a critical section of code.
Resolution: The shared (critical) section of code is now wrapped in synchronization primitives to avoid resource contention.
- QXCR1000888403
Symptoms: After an Operating Environment update from HP-UX 11i v2 March 2006 or earlier to the HP-UX 11i v3 March 2009 release, switching the hardware diagnostics mode using the 'sfmconfig -w' command might fail. This problem occurs because of an outdated ISEE.sapcfg file (delivered by the ISEE v3 client) that contains entries for some deprecated EMS resources.
Workaround: Migrate from ISEE v3 to ISEE(RSP) v5 to avoid this issue. For migration information, see ISEE documentation at: http://h20219.www2.hp.com/services/cache/600312-0-0-225-121.html
EVWEB
This section describes EVWEB.
EVWEB is a tool that can be used to view and administer WBEM indications generated on the HP-UX 11i v3 system.
The EVWEB tool includes the following components:
- Event Subscription Administrator
Event Subscription Administrator enables users to subscribe to an indication and view it. In addition, users with administrative privileges can modify and delete subscriptions. By subscribing to an indication, users can obtain detailed information about various WBEM indications.
As a part of event subscription, users must specify event subscription criteria. Users must also select one or more destinations to receive information about indications.
Users can select one or more destinations from the following list:
- Event Archive: The path of Event Archive is /var/opt/sfmdb/pgsql. Event Archive is the default destination.
- Email: Event notification will be emailed to the specified email address. Users can specify multiple email addresses.
- syslog: The path to syslog is /var/adm/syslog/syslog.log.
- Event Viewer
The Event Viewer enables users to view the indications stored in the Event Archive. In addition, users with administrative privileges can also delete these indications. By default, HP-advised subscriptions are stored in the Event Archive. The Event Viewer also enables users to search for an indication logged in the Event Archive.
- Log Viewer
The Log Viewer enables users to view and search the low-level logs stored in the log database.
The benefits of EVWEB are as follows:
- Enables users to manage all WBEM indications that are supported by SFM.
- Provides an option to customize the indication destination to receive information about HP-advised subscriptions.
- Enables users to view the command-line equivalent of an action performed using the GUI, thereby educating users about the usage of various commands.
EVWEB offers the following features:
- Provides both quick search and advanced search mechanisms to view events from the Event Archive.
- Provides both simple and advanced search mechanisms to search for low-level logs from the Log Viewer.
- Generates a list of events in a printer-friendly format (GUI only).
- Enables users with administrative privileges to create, modify, and delete indications.
- Enables users to view subscriptions created using EVWEB.
- Enables users to view externally created subscriptions.
Subscriptions created by using tools other than EVWEB are termed externally created event subscriptions.
- Enables users to view HP-advised subscriptions. HP-advised subscriptions are provided by default by HP.
Notes:
- EVWEB supports these features on browser-based GUI and the CLI.
- When an HP-advised subscription is copied to create or modify another subscription, the subscription criteria are not copied. Only the destinations are copied to the new subscription.
- Event details displayed in the EVWEB Event Viewer and embedded in the EVWEB email notification may not have the same readability or formatting as the EMS event notification. However, this issue is not applicable to HP_DeviceIndication class indications.
EMT
This section describes EMT.
Error Management Technology (EMT) is a component of SFM. EMT includes Common Error Repository (CER), which is an online, searchable, and updateable error repository. The CER contains error metadata such as error description, error number, error type, severity, cause of the error, and corrective actions for errors generated on the HP-UX 11i v3 system.
The benefits of EMT are as follows:
- Enables users to view most errors that can occur on the HP-UX 11i v3 system.
- Provides administrators with an option to add, modify, and delete custom solutions.
- Enables users to view the command-line equivalent of an action performed using the GUI, thereby educating users about the usage of various commands.
EMT offers the following features:
- Provides both quick search and advanced search mechanisms to view error metadata from CER.
- Generates a list of errors in a printer-friendly format (GUI only).
- Enables users with administrative privileges to create, modify, and delete custom solutions.
Note: EMT supports these features on browser-based GUI and the CLI.
The following is a limitation of EMT:
- A generic query to the CER retrieves a large amount of data, which may affect the performance of EMT.
System Requirements
SFM is supported on the following systems running the HP-UX 11i v3 operating system:
- HP 9000 servers
- rp3410
- rp3440
- rp4410
- rp4440
- rp7405
- rp7410
- rp7420
- rp8400
- rp8420
- SD16, SD32, SD64
- SD16A, SD32A, SD64A
- SD16B, SD32B, SD64B
- HP Integrity servers
- cx2600
- cx2620
- rx1600
- rx1620
- rx2600
- rx2620
- rx2660 (supports PCI-X IO backplane only)
- rx3600
- rx4640
- rx5670
- rx6600
- rx7620
- rx7640
- rx8620
- rx8640
- SD16A, SD32A, SD64A
- SD16B, SD32B, SD64B
- BL60p HP Server Blade
- BL860c HP Server Blade
- BL870c HP Server Blade
SFM supports the following systems based on the Dual-Core Intel® Itanium® Processor 9100 series and running the HP-UX 11i v3 operating system:
- rx7640
- rx8640
- SD16B
- SD32B
- SD64B
The software requirements for using SFM are as follows:
- HP-UX 11i v3 February 2007 or later
- OpenSSL A.00.09.07e.013 or later
- SysMgmtWeb version A.2.2.5 (HP-UX Web-Based System Management User Interface)
- WBEMSvcs A.02.07 or later
- EVM-EventMgr B.11.31
- SysMgmtBase B.00.02.03
- HP Systems Insight Manager (HP SIM) version 5.0.01
- ProviderSvcsBase (any version)
- SysMgmtPlus A.01.00 or later
Notes:
- The ProviderSvcsBase product offers an interface for user space processes (typically, but not limited to, WBEM Indication Providers) to retrieve error logs from the HP-UX kernel. Using these error logs, WBEM Indication Providers generate WBEM indications when an event of interest occurs on the system.
- SysMgmtPlus is an enhancement package for HP-UX System Management Homepage (SMH). SysMgmtPlus enhances the property pages of SMH by adding details and introducing dynamic capability to the web page.
- SysMgmtWeb is optional. However, you cannot access the EVWEB GUI if SysMgmtWeb is not installed on the system. SysMgmtWeb, WBEMServices, and ProviderSvcsBase are available on the Operating Environment (OE) media.
- HP SIM is required only for remote administration of indications and instances. HP SIM version 5.0.01 is the minimum requirement. However, HP recommends you install HP SIM version C.05.02.01.xx.yy.
- The versions listed above are minimum requirements. All later versions support SFM by default.
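Because the versions listed are minimums, a pre-installation check comes down to comparing revision strings. The following Python sketch shows one possible comparison. The minimum revisions are taken from the list above, but the ordering rule (compare dot-separated components, numeric part first, then any letter suffix) is an assumption made for this example and may differ from the rules that the HP-UX software distribution tools apply. The function names are hypothetical.

```python
import re

# Minimum revisions copied from the software requirements list above.
MINIMUMS = {
    "OpenSSL": "A.00.09.07e.013",
    "WBEMSvcs": "A.02.07",
    "SysMgmtPlus": "A.01.00",
}


def version_key(version: str):
    """Split a revision such as 'A.00.09.07e.013' into comparable parts.

    Assumed ordering rule: each dot-separated component is compared by its
    leading number, then by any trailing letters. A component without a
    leading number (such as 'A') gets -1 as its numeric part.
    """
    key = []
    for part in version.split("."):
        match = re.fullmatch(r"(\d*)(.*)", part)
        number, suffix = match.group(1), match.group(2)
        key.append((int(number) if number else -1, suffix))
    return tuple(key)


def meets_minimum(installed: str, minimum: str) -> bool:
    """Return True if the installed revision sorts at or above the minimum."""
    return version_key(installed) >= version_key(minimum)


def unmet_requirements(installed: dict) -> list:
    """Return the product names whose installed revision is missing or too old."""
    return [
        name
        for name, minimum in MINIMUMS.items()
        if not meets_minimum(installed.get(name, ""), minimum)
    ]
```

The installed revisions would typically be read from swlist output. Revisions in a different format, such as the HP SIM versions mentioned above, are outside the scope of this sketch.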
Supported Browsers
SFM supports the following browsers:
- Internet Explorer version 6.0 and above
- Mozilla version 1.5 and above
Limitations and Workarounds
- After the system is rebooted or the CIMOM is restarted, the first request to SFM hardware inventory providers such as the CPU Instance Provider, Memory Provider, and the Environmental Providers may fail with the generation of the CIM_ERR_FAILED status code. Also, a message is displayed on the client system that states "Inventory information is being built currently. Please try after some time". However, on subsequent requests, the SFM hardware inventory providers respond with requested information instantaneously.
- Hardware inventory providers are not supported on HP Virtual Machines. For indication providers, only FPL and ia64_corehw indications are generated on HPVM guests.
- QXCR1000900238
SFMDB does not start when the time zone is GMT0.
Problem: On IA systems, when the time zone is set to GMT0, the postmaster stops and logs messages in sfmdb.log. This is because GMT0 is not a valid time zone recognized by PostgreSQL. As a result, the SFM database does not come up.
Solution: The PHSS_39073 patch is released.
Workaround: Change the time zone from GMT0 to GMT+0 and restart PostgreSQL by executing the following commands:
/sbin/init.d/sfmdb stop
/sbin/init.d/sfmdb start
- QXCR1000909213
The switchdiag operation sometimes fails due to its dependency on EMS monitors.
Problem: The switchdiag operation does not guarantee a switch of the diagnostics mode from EMS to SFM, or vice versa, because it depends heavily on the termination status of the EMS monitors. Sometimes EMS monitors are not disabled when a request to disable monitoring is raised (using either toggle_switch or monconfig). Because the SysFaultMgmt switchdiag script uses the toggle_switch command when switching between EMS and SFM mode, the switch may not succeed when toggle_switch fails to bring the EMS monitors down.
Solution: PHSS_39065 patch will be released.
- QXCR1000905449
FPLWriter is unable to write to fpl.log.xx files.
Problem: After the March 2009 bits are installed, SFM comes up as the default solution, so the EMS fpl_em monitor does not run. As a result, no fpl.log.xx files are created under the /var/stm directories (this directory is created by the EMS fpl_em monitor). This prevents the FPL writer (running in the SFM provider process) from writing any data to these files.
Solution: PHSS_39065 patch will be released.
Workaround: Manually create the directory in /var/stm/logs/os/
- QXCR1000907300
SBE events are not received in the Memory vertical.
Problem: SBE events are not generated for memory because the log type placed in the SAL buffers is incorrect. This causes the SBE SAL buffers to be ignored during SBE event qualification.
Solution: The PHSS_39065 patch, to be delivered during March 2009, covers the solution to this problem.
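For the first limitation in this list (the initial request to an inventory provider failing with CIM_ERR_FAILED after a reboot or CIMOM restart), a client can retry after a short wait. The following Python sketch shows the retry pattern only. It does not use a real WBEM client library: InventoryNotReady is a hypothetical stand-in for however your client reports CIM_ERR_FAILED, and the attempt count and delay are arbitrary assumptions.

```python
import time


class InventoryNotReady(Exception):
    """Hypothetical stand-in for a CIM_ERR_FAILED reply received while
    the provider is still building its inventory information."""


def query_with_retry(query, attempts=5, delay_seconds=2.0, sleep=time.sleep):
    """Call query() and return its result.

    If query() raises InventoryNotReady, wait delay_seconds and try again,
    up to `attempts` calls in total. The last failure is re-raised.
    """
    for attempt in range(1, attempts + 1):
        try:
            return query()
        except InventoryNotReady:
            if attempt == attempts:
                raise
            sleep(delay_seconds)
```

The query argument is any callable that performs the inventory request. Passing the sleep function as a parameter makes the helper easy to exercise without real delays.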
Product Documentation
For more information on SFM, see the following documents at:
http://docs.hp.com/en/diag.html
- SFM Frequently Asked Questions (FAQs)
- System Fault Management Administrator's Guide
- SFM Provider Data Sheets
- SFM Tables of Versions
- SFM Patch Descriptions
- SFM Event Descriptions
Software and Documentation Availability in Native Languages
SFM software and documents are available only in the English language.
Product Structure
The SFM product, consisting of SFM providers and EVWEB, is installed as part of the SysFaultMgmt bundle.
Use the following commands to obtain the bundle, product, sub-product, and fileset information about the SysFaultMgmt depot:
- Bundle
$ swlist -s <SysFaultMgmt Depot Location>
SysFaultMgmt C.05.00.xx.yy HPUX System Fault Management
- Product(s)
$ swlist -l product -s <SysFaultMgmt Depot Location>
SFM-CORE C.05.00.xx HPUX System Fault Management
SFMDB C.05.00.xx HP System Management Database (SFMDB)
- Sub-product(s)
$ swlist -l subproduct -s <SysFaultMgmt Depot Location>
# SFM-CORE C.05.00.xx HPUX System Fault Management
SFM-CORE.ERROR-MGMT Error Management Technology
SFM-CORE.EVMCIM EVMCIM
SFM-CORE.EVWEB EVWEB
SFM-CORE.FMD-PROVIDER FMD-PROVIDER
SFM-CORE.GS GS
SFM-CORE.HS-PROVIDER HS-PROVIDER
SFM-CORE.SFM-HAS SFM-HAS
SFM-CORE.SFM-PROPPAGE SFM PROPERTY PAGE
SFM-CORE.SFM-PROVIDER SFM-PROVIDER
SFMDB C.05.00.xx HP System Management Database (SFMDB)
- Fileset(s)
$ swlist -l fileset -s <SysFaultMgmt Depot Location>
# SFM-CORE C.05.00.xx HPUX System Fault Management
SFM-CORE.CTR_PRO_COMM C.05.00.xx Control Provider Common Fileset
SFM-CORE.CTR_PRO_COREIA C.05.00.xx Control Provider Platform Specific Fileset
SFM-CORE.CTR_PRO_COREPA C.05.00.xx Control Provider Platform Specific Fileset
SFM-CORE.EMT_COMM C.05.00.xx EMT COMMON
SFM-CORE.EMT_COREIA C.05.00.xx EMT CORE IA
SFM-CORE.EMT_COREPA C.05.00.xx EMT CORE PA
SFM-CORE.EMT_DOC C.05.00.xx EMT Online help fileset
SFM-CORE.EMT_MAN C.05.00.xx EMT Man pages fileset
SFM-CORE.EVM_PRO_COMM C.05.00.xx EVM CIM Indication Provider Common Fileset
SFM-CORE.EVM_PRO_COREIA C.05.00.xx EVM CIM Indication Provider Platform Specific Fileset
SFM-CORE.EVM_PRO_COREPA C.05.00.xx EVM CIM Indication Provider Platform Specific Fileset
SFM-CORE.EVWEB_COMM C.05.00.xx Event Manager (EvWEB) Common components
SFM-CORE.EVWEB_COREIA C.05.00.xx EvWEB core platform specific fileset
SFM-CORE.EVWEB_COREPA C.05.00.xx EvWEB core platform specific fileset
SFM-CORE.EVWEB_DOC C.05.00.xx EvWEB Online help fileset
SFM-CORE.EVWEB_GUI_COMM C.05.00.xx EvWEB GUI common fileset
SFM-CORE.EVWEB_GUI_IA C.05.00.xx EvWEB GUI platform specific fileset
SFM-CORE.EVWEB_GUI_PA C.05.00.xx EvWEB GUI platform specific fileset
SFM-CORE.EVWEB_MAN C.05.00.xx EVWEB Man pages fileset
SFM-CORE.FMD_PRO_COMM C.05.00.xx Filter Metadata Instance Provider Common Fileset
SFM-CORE.FMD_PRO_COREIA C.05.00.xx Filter Metadata Instance Provider Platform Specific Fileset
SFM-CORE.FMD_PRO_COREPA C.05.00.xx Filter Metadata Instance Provider Platform Specific Fileset
SFM-CORE.GS_COMM C.05.00.xx General Services Common Fileset
SFM-CORE.GS_COREIA C.05.00.xx General Services Platform Specific Fileset
SFM-CORE.GS_COREPA C.05.00.xx General Services Platform Specific Fileset
SFM-CORE.HAS-IA C.05.00.xx Hardware Access Services IA
SFM-CORE.HAS-PA C.05.00.xx Hardware Access Services PA
SFM-CORE.HS_PRO_COMM C.05.00.xx HealthState Instance Provider Common Fileset
SFM-CORE.HS_PRO_COREIA C.05.00.xx HealthState Instance Provider Platform Specific Fileset
SFM-CORE.HS_PRO_COREPA C.05.00.xx HealthState Instance Provider Platform Specific Fileset
SFM-CORE.MISC_COMM C.05.00.xx MISC Common Fileset
SFM-CORE.MISC_COREIA C.05.00.xx MISC Platform Specific Fileset
SFM-CORE.MISC_COREPA C.05.00.xx MISC Platform Specific Fileset
SFM-CORE.MISC_TOOLS C.05.00.xx MISC Tools Fileset
SFM-CORE.MISC_TOOLS C.05.00.xx MISC Tools Fileset
SFM-CORE.SFMUI-PROPPAGE C.05.00.xx SFM property pages fileset
SFM-CORE.SFM_MAN C.05.00.xx SFM Man pages fileset
SFM-CORE.SFM_PRO_COMM C.05.00.xx SysFaultMgmt Provider Module COMMON
SFM-CORE.SFM_PRO_IA C.05.00.xx SysFaultMgmt Provider Module IA
SFM-CORE.SFM_PRO_PA C.05.00.xx SysFaultMgmt Provider Module PA
# SFMDB C.05.00.xx HP System Management Database (SFMDB)
SFMDB.SMPGSQL-DOC C.05.00.xx PostgreSQL (SFMDB) Documentation Files
SFMDB.SMPGSQL-INC C.05.00.xx PostgreSQL (SFMDB) Header Files
SFMDB.SMPGSQL-LIB C.05.00.xx PostgreSQL (SFMDB) Library Files (Architecture dependent)
SFMDB.SMPGSQL-LIB C.05.00.xx PostgreSQL (SFMDB) Library Files (Architecture dependent)
SFMDB.SMPGSQL-MAN C.05.00.xx PostgreSQL (SFMDB) Manual Pages
SFMDB.SMPGSQL-RUN C.05.00.xx PostgreSQL (SFMDB) Executable Files (Architecture dependent)
SFMDB.SMPGSQL-RUN C.05.00.xx PostgreSQL (SFMDB) Executable Files (Architecture dependent)
SFMDB.SMPGSQL-SHA C.05.00.xx PostgreSQL (SFMDB) Share File
SFMDB.SMPGSQL-SRC C.05.00.xx PostgreSQL (SFMDB) Source Files
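A fileset listing like the one above is easy to post-process, for example to confirm that an expected fileset is present in a depot. The following Python sketch groups fileset lines by product. It assumes the line layout shown in this section (PRODUCT.FILESET, revision, description, separated by whitespace, with product headers starting with "#"). The function name parse_swlist_filesets and the SAMPLE text are made up for this example, and actual swlist output may contain additional header lines.

```python
# SAMPLE is a shortened excerpt of the fileset listing shown in this section.
SAMPLE = """\
# SFM-CORE C.05.00.xx HPUX System Fault Management
SFM-CORE.EMT_COMM C.05.00.xx EMT COMMON
SFM-CORE.EMT_MAN C.05.00.xx EMT Man pages fileset
# SFMDB C.05.00.xx HP System Management Database (SFMDB)
SFMDB.SMPGSQL-DOC C.05.00.xx PostgreSQL (SFMDB) Documentation Files
"""


def parse_swlist_filesets(text: str) -> dict:
    """Map each product name to a list of (fileset, revision, description).

    Lines that are empty, start with '#' (product headers), start with '$'
    (the command itself), or do not have a PRODUCT.FILESET name are skipped.
    """
    filesets = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or line.startswith("$"):
            continue
        fields = line.split(None, 2)
        if len(fields) < 2:
            continue
        product, _, fileset = fields[0].partition(".")
        if not fileset:
            continue
        description = fields[2] if len(fields) > 2 else ""
        filesets.setdefault(product, []).append((fileset, fields[1], description))
    return filesets


if __name__ == "__main__":
    for product, entries in parse_swlist_filesets(SAMPLE).items():
        print(product, [name for name, _, _ in entries])
```

In practice the text would come from the saved output of the swlist -l fileset command shown above rather than from an inline sample.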
Reporting Defects
You can report defects related to SFM by filing a request on QuIX. The name of the project is SysFaultMgmt. If you do not have access to QuIX, contact your local HP representative to file a defect on your behalf.