Announcement
System Fault Management (SFM) is a collection of tools used to monitor the health of HP servers and receive information about hardware such as memory, CPU, power supplies, and cooling devices. SFM operates in the Web-Based Enterprise Management (WBEM) environment.
SFM includes the following tools:
- SFM Providers
- EVWEB
Note:
- HP-UX need not be rebooted upon upgrade/install of this version.
- Starting with the HP-UX 11i v2 March 2006 release, EVWEB is included in the SysFaultMgmt bundle.
This document contains the following sections:
- SFM Providers
- EVWEB
- EMT
- System Requirements
- Supported Browsers
- Limitations and Workarounds
- Product Documentation
- Software and Documentation Availability in Native Languages
- Product Structure
- Reporting Defects
SFM Providers
SFM providers are tools that gather information related to various hardware devices and report to the Common Information Model Object Manager (CIMOM).
Table 1 lists the SFM providers and their respective functions:
Table 1: Providers and their Functions
- BladeProvider and BladeStatusProvider: Available on Blade servers. Provide blade inventory details and the health status of the blades in the current partition.
- EMDProvider: Helps to query the CER (Common Error Repository) for error metadata. Used only in SysFaultMgmt.
- CPU Instance Provider: Retrieves information about processor inventory and consolidated health of the processor subsystem.
- Memory Instance Provider: Gathers information about memory inventory and consolidated health of the memory subsystem.
- EMS Wrapper Provider: Converts events generated by the EMS Hardware Monitors into indications and reports those indications to the CIMOM.
- Filter Metadata (FMD) Provider: Provides a facility to predefine important filters in a repository. FMD also ensures that all important or chosen indications are logged to the local event archive. FMD creates HP-advised subscriptions when SFM is installed.
- Environmental Providers: Retrieve information about cooling devices (fans) and power supply (bulk power supply and AC input lines) on HP servers. They also retrieve consolidated health of cooling, power, system temperature, and system voltage subsystems on HP servers.
- Event Manager Common Information Model (EVM CIM) Provider: Converts EVM events into indications and reports those indications to the CIMOM.
- SFMIndicationProvider: Generates WBEM indications when an abnormal activity is detected on the monitored devices and reports these WBEM indications to the EMS framework.
- Firmware Revision Instance Provider: Retrieves information about the firmware revision of system hardware components, such as system firmware version and Management Processor (MP) firmware version.
- MP Instance Provider: Retrieves information about the management processor of the system.
- Enclosure Instance Provider: Retrieves information about the Onboard Administrator (OA), such as OA description, OA IP address, OA MAC address, and the URL to launch the OA.
- Record Log Provider: Enables event analysis tools such as Web-Based Enterprise Services (WEBES) to access details of indications generated by the SFMIndicationProvider that are available in the SFM database. The Record Log Provider also enables event analysis tools to access MCA logs.
- Temperature Provider: Describes properties such as sensor number, current temperature reading, and temperature sensor status.
- ComputerSystem Chassis Provider: Describes properties such as the serial number, product ID, and virtual Universally Unique ID (UUID).
- MCA Indication Provider: Generates WBEM indications when MCA logs are present.
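As a rough illustration of how an administrator might confirm that the provider module behind these providers is registered, the sketch below filters a module/status listing such as the one printed by cimprovider -l -s (the provider registration command shipped with WBEM Services). The two-column MODULE/STATUS layout and the helper name module_status are assumptions and are not part of SFM; check cimprovider(1M) on your system before relying on it.

```sh
# Hypothetical helper: print the status column for one provider module.
# Reads a "MODULE STATUS" listing on stdin (layout assumed, see above).
# Exits non-zero when the module is not listed.
module_status() {
    awk -v mod="$1" '$1 == mod { print $2; found = 1 } END { exit !found }'
}

# Only query the CIM server where cimprovider is actually installed.
if command -v cimprovider >/dev/null 2>&1; then
    cimprovider -l -s | module_status SFMProviderModule ||
        echo "SFMProviderModule is not registered" >&2
fi
```

The module name SFMProviderModule is the one used in the Limitations and Workarounds section of these notes.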
- Logviewer enhancement: RecordLog Provider interface to return instances of all log records that are available.
- IPMI Event Viewer changes for E1 events: Support for new IPMI event format.
- New Property pages: Enhancements have been made to the property pages in System Management Homepage (see hpsmh(1M)), and the following new pages have been added:
- Complex-wide Info
- Cell Board
- Partition Information
- Blade
- Health Test: The Health Test support is available when the feature is installed by ProviderSvcsBase. For information on installing the Health Test feature, see the ProviderSvcsBase Release Notes. It includes CPU Test and Memory Test features that describe the health test of the CPU and memory.
- OnlineDiag equivalence features
- Syslog functionality: SysFaultMgmt provides a default subscription to syslog, which writes a summary of the event information for critical and serious events to syslog.
- Event.log: SysFaultMgmt provides event details in /var/opt/sfm/log/event.log
- Starting with the September 2009 release, the following components are available in ProviderSvcsBase.
- SFMDB: This product in SysFaultMgmt is obsolete on Itanium-based systems. SFMDB continues to be available on PA-RISC-based systems.
- Indication MOFs: Common indication MOFs across providers are available in ProviderSvcsBase on Itanium-based systems. For more information, see the ProviderSvcsBase Release Notes at http://www.docs.hp.com/en/diag.html#Diagnostics%20%28Support%20Tools%29%3A%20General
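A minimal sketch for inspecting the event details file mentioned in the list above. It assumes only that /var/opt/sfm/log/event.log is a readable plain-text file; the helper name recent_events and the default of 20 lines are illustrative and not part of SFM.

```sh
# Hypothetical helper: print the last N lines of a log file, if readable.
recent_events() {
    log=${1:-/var/opt/sfm/log/event.log}   # path taken from these notes
    count=${2:-20}                         # illustrative default
    if [ -r "$log" ]; then
        tail -n "$count" "$log"
    else
        echo "cannot read $log" >&2
        return 1
    fi
}
```

Calling recent_events with no arguments reads the SFM event log. Passing another path, for example the syslog file /var/adm/syslog/syslog.log, shows the summaries written by the default syslog subscription.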
Supported EMS Hardware Monitors
SFM is the default monitoring mode. However, you can switch to the OnlineDiag monitoring mode. For information on how to switch to the OnlineDiag mode, see the SFM Administrator's Guide at: http://docs.hp.com/en/diag
The Event Monitoring Service (EMS) Wrapper Provider receives events generated by the EMS Hardware Monitors. The following EMS Hardware Monitors are supported on HP 9000 servers running the HP-UX 11i v3 operating system:
- LPMC (now CPU) (lpmc_em)
- Memory (dm_memory)
- Core HW (dm_core_hw)
- Chassis Code (dm_chassis)
- IPMI Forward Progress Log Monitor (fpl_em)
The following EMS Hardware Monitors are supported on HP Integrity servers running the HP-UX 11i v3 operating system:
- Corrected Platform Error Monitor (cpe_em)
- IPMI Forward Progress Log Monitor (fpl_em)
- CMC Monitor (cmc_em)
- Itanium Core Hardware Monitor (ia64_corehw)
- Itanium Memory Monitor (memory_ia64)
NOTE: Not all EMS Hardware Monitors are currently supported by the EMS Wrapper Provider. EMS events generated by the remaining EMS Hardware Monitors are not conveyed to the CIMOM, regardless of the current diagnostic mode (SFM or EMS).
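The two monitor lists above can be summarized as a lookup keyed on the machine type. The sketch below assumes that uname -m reports ia64 on HP Integrity servers and a value beginning with 9000/ on HP 9000 servers; the helper name supported_monitors is hypothetical.

```sh
# Hypothetical helper: list the EMS Hardware Monitors named in these notes
# for a given `uname -m` value (ia64 = Integrity, 9000/* = HP 9000; assumed).
supported_monitors() {
    case "$1" in
        ia64)   printf '%s\n' cpe_em fpl_em cmc_em ia64_corehw memory_ia64 ;;
        9000/*) printf '%s\n' lpmc_em dm_memory dm_core_hw dm_chassis fpl_em ;;
        *)      echo "unknown architecture: $1" >&2; return 1 ;;
    esac
}

# Show the list for the local machine; ignore failure on other platforms.
supported_monitors "$(uname -m)" || true
```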
- QXCR1000891467
Problem: cimprovagt on 11.11 L-class or rp5470 systems hangs if /dev/ipmi is present.
- QXCR1000882967
Problem: cimserver returns an error when WbemWrapperMonitor sends a request. This occurs when cimserver is not running.
- QXCR1000922116
Problem: After installing the QPKBase patch bundle, which requires a system reboot, SFM does not process system events written to SFMDB.
- QXCR1000821045
Problem: The thermal interrupt does not reach HP-UX. Hence, there is no graceful shutdown; the system HPMCs instead.
- QXCR1000899014
Problem: Physical address in EVWEB not decoded correctly.
- QXCR1000912086
Problem: The SFM process uses 100% of the CPU. For the first 3 minutes the SFM process uses more CPU, and the cimprovagt program uses 100% of the CPU for a long time on startup.
- QXCR1000828702
Problem: The FPL provider on a cell system does not report the BPS_REMOVED event.
- QXCR1000900238
SFMDB does not start when the time zone is GMT0.
Problem: On IA systems, when the time zone is set to GMT0, postmaster stops and logs messages in sfmdb.log, because GMT0 is not a valid time zone recognized by PostgreSQL. As a result, the SFM database does not come up.
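On systems affected by QXCR1000900238, the GMT0 condition can be detected before SFMDB is started. The sketch below only inspects the TZ environment variable and does not change the time zone. The helper name tz_ok_for_sfmdb is hypothetical, and these notes do not state which replacement value to use, so none is suggested here.

```sh
# Hypothetical pre-check: flag the TZ value that PostgreSQL rejects
# according to QXCR1000900238.
tz_ok_for_sfmdb() {
    case "$1" in
        GMT0) return 1 ;;   # reported as not recognized by PostgreSQL
        *)    return 0 ;;
    esac
}

if ! tz_ok_for_sfmdb "${TZ:-}"; then
    echo "TZ=$TZ: SFMDB may fail to start (see QXCR1000900238)" >&2
fi
```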
EVWEB
This section describes EVWEB.
EVWEB is a tool that can be used to view and administer WBEM indications generated on the HP-UX 11i v3 system.
The EVWEB tool includes the following components:
- Event Subscription Administrator
Event Subscription Administrator enables users to subscribe to an indication and view it. In addition, users with administrative privileges can modify and delete subscriptions. By subscribing to an indication, users can obtain detailed information about various WBEM indications.
As a part of event subscription, users must specify event subscription criteria. Users must also select one or more destinations to receive information about indications.
Users can select one or more destinations from the following list:
- Event Archive: The path of Event Archive is /var/opt/sfmdb/pgsql for PA-RISC and /var/opt/psb/db/pgsql/ for Itanium. Event Archive is the default destination.
- Email: Event notification will be emailed to the specified email address. Users can specify multiple email addresses.
- syslog: The path to syslog is /var/adm/syslog/syslog.log.
- Event Viewer
The Event Viewer enables users to view the indications stored in the Event Archive. In addition, users with administrative privileges can also delete these indications. By default, HP-advised subscriptions are stored in the Event Archive. The Event Viewer also enables users to search for an indication logged in the Event Archive.
- Log Viewer
The Log Viewer enables users to view and search the low level logs stored in the log database.
Following lists the benefits of EVWEB:
- Enables users to manage all WBEM indications that are supported by SFM.
- Provides an option to customize the indication destination to receive information about HP-advised subscriptions.
- Enables users to view the command-line equivalent of an action performed using the GUI, which helps users learn the corresponding commands.
EVWEB offers the following features:
- Provides both quick search and advanced search mechanisms to view events from the Event Archive.
- Provides both simple and advanced search mechanism to search for low level logs from the Log Viewer.
- Generates a list of events in a printer-friendly format (GUI only).
- Enables users with administrative privileges to create, modify, and delete indications.
- Enables users to view subscriptions created using EVWEB.
- Enables users to view externally created subscriptions.
Subscriptions created by using tools other than EVWEB are termed externally created event subscriptions.
- Enables users to view HP-advised subscriptions. HP-advised subscriptions are provided by default by HP.
Note: EVWEB supports these features on browser-based GUI and the CLI.
- None.
Following are the limitations of EVWEB:
- When an HP-advised subscription is copied to create or modify another subscription, the subscription criteria are not copied. Only the destinations are copied to the new subscription.
- Event details displayed in the EVWEB Event Viewer and embedded in EVWEB email notifications may not match the readability or formatting of EMS event notifications. This issue does not apply to indications of the HP_DeviceIndication class.
EMT
This section describes EMT.
Error Management Technology (EMT) is a component of SFM. EMT includes Common Error Repository (CER), which is an online, searchable, and updateable error repository. The CER contains error metadata such as error description, error number, error type, severity, cause of the error, and corrective actions for errors generated on the HP-UX 11i v3 system.
Following lists the benefits of EMT:
- Enables users to view most errors that can occur on the HP-UX 11i v3 system.
- Provides an option to the administrators to add, modify, and delete custom solutions.
- Enables users to view the command-line equivalent of an action performed using the GUI, which helps users learn the corresponding commands.
EMT offers the following features:
- Provides both quick search and advanced search mechanisms to view error metadata from CER
- Generates a list of errors in a printer-friendly format (GUI only)
- Enables users with administrative privileges to create, modify, and delete custom solutions
Note: EMT supports these features on browser-based GUI and the CLI.
- None.
Following is a limitation of EMT:
- A generic query to the CER retrieves a large amount of data, which may affect the performance of EMT.
System Requirements
SFM is supported on the following systems running the HP-UX 11i v3 operating system:
- HP 9000 servers
- rp3410
- rp3440
- rp4410
- rp4440
- rp7405
- rp7410
- rp7420
- rp8400
- rp8420
- SD16, SD32, SD64
- SD16A, SD32A, SD64A
- SD16B, SD32B, SD64B
- HP Integrity servers
- cx2600
- cx2620
- rx1600
- rx1620
- rx2600
- rx2620
- rx2660
- rx3600
- rx4640
- rx5670
- rx6600
- rx7620
- rx7640
- rx8620
- rx8640
- SD16A, SD32A, SD64A
- SD16B, SD32B, SD64B
- BL60p HP Server Blade
- BL860c HP Server Blade
- BL870c HP Server Blade
SFM supports the following systems based on the Dual-Core Intel® Itanium® Processor 9100 series and running the HP-UX 11i v3 operating system:
- rx7640
- rx8640
- SD16B
- SD32B
- SD64B
Following lists the software requirements for using SFM:
- HP-UX 11i v3 February 2007 release
- OpenSSL Version A.00.09.07e.013 or later
- EVM-EventMgr B.11.31 (September 2007 or later)
- SysMgmtBase B.00.02.03 (Interface)
- WBEMSvcs A.02.09 or later
- SysMgmtWeb (HP-UX Web-Based System Management User Interface) A.3.0 (September 2009, or later)
- HP Systems Insight Manager (HP SIM) version 5.0.01
- ProviderSvcsBase C.02.00.xx (September 2009)
- SysMgmtPlus A.02.00 (September 2009 or later)
- On PA-RISC only, OnlineDiag B.11.31.06.xx (September 2009 or later)
- EMS Version: A.04.20.31.03
- STM Version: D.06.00
Notes:
- The listed versions of the software are the minimum supported requirements. Subsequent versions are compatible with this version of SFM unless otherwise noted.
- WBEM Services, Online Diagnostics, SysMgmtWeb, and HP SIM are available on the Operating Environment (OE) media and can be selected for install during the SFM installation.
- HP System Management Homepage (SMH), bundled in SysMgmtWeb, is an optional install. However, without it you cannot access the EVWEB GUI (Event Viewer, Subscription Management, and Log Viewer interface). The command-line interface for EVWEB remains accessible.
- HP recommends using the latest available HP Systems Insight Manager (HP SIM) to remotely administer indications and instances. The minimum requirements are the April/May 2007 release:
HP SIM 5.1 with Update 1 - HP-UX (C.05.01.00.01).
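Because the listed versions are minimums, an installed revision can be compared field by field against the requirement. The sketch below is one way to do that in POSIX shell with awk. The helper name version_ge is hypothetical, and the assumption that swlist -l bundle prints the bundle name and revision as the first two fields should be checked against swlist(1M).

```sh
# Hypothetical helper: succeed when version $1 is >= version $2.
# Fields are split on "."; all-digit fields compare numerically,
# anything else (for example "A" or "07e") compares as a string.
version_ge() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        na = split(a, x, "."); nb = split(b, y, ".")
        n = (na > nb) ? na : nb
        for (i = 1; i <= n; i++) {
            p = (i <= na) ? x[i] : "0"
            q = (i <= nb) ? y[i] : "0"
            if (p ~ /^[0-9]+$/ && q ~ /^[0-9]+$/) { p += 0; q += 0 }
            if (p > q) exit 0
            if (p < q) exit 1
        }
        exit 0
    }'
}

# Example: check WBEMSvcs against the A.02.09 minimum, where swlist exists.
if command -v swlist >/dev/null 2>&1; then
    have=$(swlist -l bundle WBEMSvcs 2>/dev/null |
           awk '$1 == "WBEMSvcs" { print $2; exit }')
    if version_ge "${have:-0}" A.02.09; then
        echo "WBEMSvcs $have meets the minimum"
    else
        echo "WBEMSvcs is missing or older than A.02.09"
    fi
fi
```

The comparison is deliberately simple and does not interpret suffixes such as the "xx" placeholders used in these notes.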
Supported Browsers
Following lists the browsers supported by SFM:
- Internet Explorer version 6.0 and above
- Mozilla version 1.5 and above
Limitations and Workarounds
- Problem: SFMProviderModule (SysFaultMgmt product) cimprovagt process may show increased memory usage.
Description: After a large number of events is generated in a short time (which does not happen in a customer environment unless there is a catastrophic failure), the memory footprint of the SFMProviderModule cimprovagt process increases.
The increase may be in the order of 100 to 200 MB. Typically, this may happen over a long period of time (several months to years) if the server is not rebooted during this time.
Solution: If the memory footprint has increased, one solution is to disable and then enable SFMProviderModule:
# cimprovider -dm SFMProviderModule
# cimprovider -em SFMProviderModule
- Problem: An empty log file in IPMI Event Viewer displays a warning.
Description: While viewing an empty log file in HP SMH, the warning is displayed as "Internal Server Error" instead of an error message such as "An unknown error has occurs".
Solution: Use the following command line to view the empty log file:
/opt/sfm/bin/slview -d -f <the log file name>
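The disable/enable workaround for SFMProviderModule given earlier in this section can be wrapped so that the two commands always run in the right order. This is a sketch only; the function name restart_provider_module and its dry-run argument are hypothetical, and the cimprovider options are the ones quoted in the workaround.

```sh
# Hypothetical wrapper around the disable/enable workaround.
# Pass "dry-run" as the second argument to print the commands only.
restart_provider_module() {
    module=$1
    mode=${2:-run}
    for flag in -dm -em; do            # disable first, then enable
        if [ "$mode" = dry-run ]; then
            echo "cimprovider $flag $module"
        else
            cimprovider "$flag" "$module" || return 1
        fi
    done
}

# Print what would be executed without touching the CIM server.
restart_provider_module SFMProviderModule dry-run
```

Running the function without dry-run requires the same privileges as the cimprovider commands themselves.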
Product Documentation
For more information on SFM, see the following documents at:
http://docs.hp.com/en/diag.html
- SFM Frequently Asked Questions (FAQs)
- System Fault Management Administrator's Guide
- SFM Release Notes
- SFM Provider Data Sheets
- SFM Tables of Versions
- SFM Patch Descriptions
- SFM Event Descriptions
Software and Documentation Availability in Native Languages
SFM software and documents are available only in the English language.
Product Structure
The SFM product, consisting of SFM providers and EVWEB, is installed as part of the SysFaultMgmt bundle.
Following are the commands you must use to obtain the bundle, product, sub-product, and the fileset information about the SysFaultMgmt depot:
- Bundle
$ swlist -s <SysFaultMgmt Depot Location>
SysFaultMgmt C.06.00.xx.yy HPUX System Fault Management
- Product(s)
$ swlist -l product -s <SysFaultMgmt Depot Location>
SFM-CORE C.06.00.xx HPUX System Fault Management
SFMDB C.06.00.xx HP System Management Database (SFMDB)
- Sub-product(s)
$ swlist -l subproduct -s <SysFaultMgmt Depot Location>
# SFM-CORE C.06.00.xx HPUX System Fault Management
SFM-CORE.ERROR-MGMT Error Management Technology
SFM-CORE.EVMCIM EVMCIM
SFM-CORE.EVWEB EVWEB
SFM-CORE.FMD-PROVIDER FMD-PROVIDER
SFM-CORE.GS GS
SFM-CORE.HS-PROVIDER HS-PROVIDER
SFM-CORE.SFM-HAS SFM-HAS
SFM-CORE.SFM-PROPPAGE SFM PROPERTY PAGE
SFM-CORE.SFM-PROVIDER SFM-PROVIDER
SFMDB C.06.00.xx HP System Management Database (SFMDB)
- Fileset(s)
$ swlist -l fileset -s <SysFaultMgmt Depot Location>
# SFM-CORE C.06.00.xx HPUX System Fault Management
SFM-CORE.CTR_PRO_COMM C.06.00.xx Control Provider Common Fileset
SFM-CORE.CTR_PRO_COREIA C.06.00.xx Control Provider Platform Specific Fileset
SFM-CORE.CTR_PRO_COREPA C.06.00.xx Control Provider Platform Specific Fileset
SFM-CORE.EMT_COMM C.06.00.xx EMT COMMON
SFM-CORE.EMT_COREIA C.06.00.xx EMT CORE IA
SFM-CORE.EMT_COREPA C.06.00.xx EMT CORE PA
SFM-CORE.EMT_DOC C.06.00.xx EMT Online help fileset
SFM-CORE.EMT_MAN C.06.00.xx EMT Man pages fileset
SFM-CORE.EVM_PRO_COMM C.06.00.xx EVM CIM Indication Provider Common Fileset
SFM-CORE.EVM_PRO_COREIA C.06.00.xx EVM CIM Indication Provider Platform Specific Fileset
SFM-CORE.EVM_PRO_COREPA C.06.00.xx EVM CIM Indication Provider Platform Specific Fileset
SFM-CORE.EVWEB_COMM C.06.00.xx Event Manager (EvWEB) Common components
SFM-CORE.EVWEB_COREIA C.06.00.xx EvWEB core platform specific fileset
SFM-CORE.EVWEB_COREPA C.06.00.xx EvWEB core platform specific fileset
SFM-CORE.EVWEB_DOC C.06.00.xx EvWEB Online help fileset
SFM-CORE.EVWEB_GUI_COMM C.06.00.xx EvWEB GUI common fileset
SFM-CORE.EVWEB_GUI_IA C.06.00.xx EvWEB GUI platform specific fileset
SFM-CORE.EVWEB_GUI_PA C.06.00.xx EvWEB GUI platform specific fileset
SFM-CORE.EVWEB_MAN C.06.00.xx EVWEB Man pages fileset
SFM-CORE.FMD_PRO_COMM C.06.00.xx Filter Metadata Instance Provider Common Fileset
SFM-CORE.FMD_PRO_COREIA C.06.00.xx Filter Metadata Instance Provider Platform Specific Fileset
SFM-CORE.FMD_PRO_COREPA C.06.00.xx Filter Metadata Instance Provider Platform Specific Fileset
SFM-CORE.GS_COMM C.06.00.xx General Services Common Fileset
SFM-CORE.GS_COREIA C.06.00.xx General Services Platform Specific Fileset
SFM-CORE.GS_COREPA C.06.00.xx General Services Platform Specific Fileset
SFM-CORE.HAS-IA C.06.00.xx Hardware Access Services IA
SFM-CORE.HAS-PA C.06.00.xx Hardware Access Services PA
SFM-CORE.HS_PRO_COMM C.06.00.xx HealthState Instance Provider Common Fileset
SFM-CORE.HS_PRO_COREIA C.06.00.xx HealthState Instance Provider Platform Specific Fileset
SFM-CORE.HS_PRO_COREPA C.06.00.xx HealthState Instance Provider Platform Specific Fileset
SFM-CORE.MISC_COMM C.06.00.xx MISC Common Fileset
SFM-CORE.MISC_COREIA C.06.00.xx MISC Platform Specific Fileset
SFM-CORE.MISC_COREPA C.06.00.xx MISC Platform Specific Fileset
SFM-CORE.MISC_TOOLS C.06.00.xx MISC Tools Fileset
SFM-CORE.MISC_TOOLS C.06.00.xx MISC Tools Fileset
SFM-CORE.SFMUI-PROPPAGE C.06.00.xx SFM property pages fileset
SFM-CORE.SFM_MAN C.06.00.xx SFM Man pages fileset
SFM-CORE.SFM_PRO_COMM C.06.00.xx SysFaultMgmt Provider Module COMMON
SFM-CORE.SFM_PRO_IA C.06.00.xx SysFaultMgmt Provider Module IA
SFM-CORE.SFM_PRO_PA C.06.00.xx SysFaultMgmt Provider Module PA
# SFMDB C.06.00.xx HP System Management Database (SFMDB)
SFMDB.SMPGSQL-DOC C.06.00.xx PostgreSQL (SFMDB) Documentation Files
SFMDB.SMPGSQL-INC C.06.00.xx PostgreSQL (SFMDB) Header Files
SFMDB.SMPGSQL-LIB C.06.00.xx PostgreSQL (SFMDB) Library Files (Architecture dependent)
SFMDB.SMPGSQL-LIB C.06.00.xx PostgreSQL (SFMDB) Library Files (Architecture dependent)
SFMDB.SMPGSQL-MAN C.06.00.xx PostgreSQL (SFMDB) Manual Pages
SFMDB.SMPGSQL-RUN C.06.00.xx PostgreSQL (SFMDB) Executable Files (Architecture dependent)
SFMDB.SMPGSQL-RUN C.06.00.xx PostgreSQL (SFMDB) Executable Files (Architecture dependent)
SFMDB.SMPGSQL-SHA C.06.00.xx PostgreSQL (SFMDB) Share File
SFMDB.SMPGSQL-SRC C.06.00.xx PostgreSQL (SFMDB) Source Files
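The four listings above differ only in the level passed to swlist, so they can be produced in one pass. The sketch below is illustrative: the function name sfm_depot_report is hypothetical, and it passes -l bundle explicitly for the bundle level, whereas the bundle command shown in these notes omits the -l option. On a machine without swlist it prints the commands instead of running them.

```sh
# Hypothetical helper: run swlist at each level for one depot.
sfm_depot_report() {
    depot=$1
    for level in bundle product subproduct fileset; do
        echo "== $level =="
        if command -v swlist >/dev/null 2>&1; then
            swlist -l "$level" -s "$depot"
        else
            echo "swlist -l $level -s $depot"   # show the command only
        fi
    done
}
```

Invoke it with the depot location as the only argument, for example sfm_depot_report followed by the path to the SysFaultMgmt depot.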
Reporting Defects
You can report defects related to SFM by filing a request on QuIX. The name of the project is SysFaultMgmt. If you do not have access to QuIX, contact your local HP representative to file a defect on your behalf.