HP Capacity Advisor Version 4.0 User's Guide > Chapter 3 Key Capacity Advisor Concepts

Measuring and Analyzing Resource Utilization


In using Capacity Advisor, it is helpful to understand how the tool approaches sampling and data analysis, and the user-provided information that affects these.

Peaks and Sums

Measuring utilization of computing resources is more complex than simply determining the maximum memory or processor utilization.

Sum of Peaks. An old standby in capacity planning is to take the peak of each load and add the peaks together to determine the maximum required capacity; this is the “sum of peaks”. While this definitely provides a robust solution, it does not take into account the timing of the peaks of the loads and may end up planning for more capacity than is actually used.

Peak of Sums. A more efficient planning solution, which is easily accomplished using HP Capacity Advisor, takes into account the timing of the maximum utilization peaks in the individual loads. By adding together utilization at each measured interval and then taking the maximum of the resulting time sequence, a more accurate measure of the required maximum resource can be determined. This can lead to cost savings when planning the resources required to consolidate loads onto new or existing servers.
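The difference between the two approaches can be illustrated with a minimal sketch. The function names and the sample values below are hypothetical and are not part of Capacity Advisor; the sketch assumes that both loads were sampled at the same intervals.

```python
def sum_of_peaks(loads):
    """Add each load's own maximum, ignoring when the peaks occur."""
    return sum(max(load) for load in loads)


def peak_of_sums(loads):
    """Add the loads interval by interval, then take the maximum of the totals."""
    return max(sum(samples) for samples in zip(*loads))


# Hypothetical utilization (in cores) for two loads over four 5-minute intervals.
load_a = [1.0, 3.0, 1.0, 0.5]
load_b = [0.5, 0.5, 2.0, 2.5]

print(sum_of_peaks([load_a, load_b]))  # 5.5
print(peak_of_sums([load_a, load_b]))  # 3.5
```

Because the two loads peak at different times, the peak of sums (3.5 cores) is well below the sum of peaks (5.5 cores).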

Sampling Interval

HP Capacity Advisor utilization providers run on each monitored system to collect information on resource utilization. At the CPU-clock cycle level, a processor is either busy or idle. For Capacity Advisor, the average utilization for each 5-minute interval is stored. Therefore, peaks lasting less than 5 minutes are not visible.

Because each data point is the average of the five preceding minutes of values, this averaging tends to flatten the graphs, particularly when compared with real-time graphs in which each data point is the average of values from the 15 preceding seconds.
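The flattening effect can be seen in a small sketch. The readings are invented for illustration, and the helper assumes that the number of readings is an exact multiple of the group size.

```python
def interval_averages(samples, samples_per_interval):
    """Average consecutive groups of samples, giving one value per group."""
    return [
        sum(samples[i:i + samples_per_interval]) / samples_per_interval
        for i in range(0, len(samples), samples_per_interval)
    ]


# Hypothetical 15-second readings covering 5 minutes (20 readings):
# the processor is 100% busy for one minute and 10% busy otherwise.
readings = [10.0] * 8 + [100.0] * 4 + [10.0] * 8

print(max(readings))                    # 100.0
print(interval_averages(readings, 20))  # [28.0]
```

The one-minute peak at 100% is reduced to a single 5-minute data point of 28%.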

Headroom

Headroom is the difference between the average utilization on a system and the maximum available capacity. Optimum headroom varies depending on the size of the system. While a single-processor system might require 50% headroom to preserve reasonable response times, a 16-way system might have reasonable response times when loaded at 80%.

Adequate headroom can also depend heavily on the characteristics of the loads; highly interactive systems require much more headroom than those that can tolerate delays in response time; batch systems may get by with very little headroom at all.

Headroom Stars Definitions

Various reports and results show headroom star rankings. The headroom of a system is the amount of additional capacity that can be used without violating the utilization limits of the applications running on that system. For example, if you have a system with 4 cores where you never want utilization to exceed 75%, and peak utilization is 1.75 cores, then headroom is 1.25 cores.
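The arithmetic in this example can be written out as a short sketch. The function name is hypothetical; it simply subtracts the peak utilization from the capacity allowed by the utilization limit.

```python
def headroom_cores(total_cores, limit_percent, peak_cores):
    """Capacity allowed by the utilization limit minus the observed peak."""
    usable_cores = total_cores * limit_percent / 100.0
    return usable_cores - peak_cores


# 4 cores, 75% limit (3 usable cores), peak utilization of 1.75 cores.
print(headroom_cores(4, 75, 1.75))  # 1.25
```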

The highlighted number of stars representing headroom can be interpreted as follows:

Table 3-1 Stars Defined

Number of Stars    Meaning
0                  One or more resources do not fit in the system; the utilization limits are violated.
1                  All resources fit in the system, but no or little headroom is available.
2                  All resources fit, and at least 25% headroom for any single workload is available.
3                  All resources fit, and at least 50% headroom for any single workload is available.
4                  All resources fit, and at least 75% headroom for any single workload is available.
5                  Not only do all resources fit, but double the resource usage for any single workload could fit.

 

where

  • resources can be CPU cores, memory, network I/O, and disk I/O. In the case of a virtual machine, the CPU cores considered are those assigned to the VM, not the total number of cores on the VM host. The VM host clock speed, network capacity, and disk capacity are all inherited by the VM guest when it is moved onto the VM host.

  • fit means the utilization limits (see “Utilization Limits”) are met

  • headroom means “room for growth”

Interpreting the Star Rating

Headroom star ratings for a host are a weighted average of all of the star ratings of the workloads on that host. The weighting tends to give the highest weight to the lowest ratings. One low rating can dramatically lower the rating for the entire host.

In the case of a VM host, the star ratings account for how well the workloads fit into their virtual machines, as well as how well the virtual machines fit on the VM host. The rating for the VM host will be low if any of the virtual machines are too small for their workloads.

Interpreting the Star Rating Given by the HP Smart Solver

When using the Smart Solver to find a plan to convert physical systems to virtual machines, consider the following factors that can adversely affect the Smart Solver results.

  • The addition of a virtualization overhead multiplier to a VM will often reduce the number of stars for that workload by 1 or 2 stars.

  • The clock speed of the VM host may be slower than that of the original physical system. The work that was done by 1 core at 2.6 GHz may require 2 cores when placed on a 2.1 GHz VM host.

You can avoid having the Smart Solver produce inaccurate or useless results by resizing your systems before running the Smart Solver. If either of the above conditions exists in your situation, consider increasing the number of cores on your simulated physical systems before running the Smart Solver. (Select What-if Actions->Edit System... on the System tab on the Edit Scenario screen.) If you change the number of cores from 1 to 2 or 2 to 3 before consolidating, the resulting virtual machines will have enough cores to cover the virtualization overhead or a slower VM host.
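The clock-speed effect described above can be approximated with a minimal sketch. It assumes that the work a core can do scales linearly with clock speed and ignores virtualization overhead and platform differences; the function name is hypothetical and this is not the Smart Solver's own calculation.

```python
import math


def cores_needed(source_cores, source_ghz, target_ghz):
    """Whole cores needed on the target to supply the same number of clock cycles."""
    return math.ceil(source_cores * source_ghz / target_ghz)


# 1 core at 2.6 GHz moved to a 2.1 GHz VM host: 2.6 / 2.1 is about 1.24 cores.
print(cores_needed(1, 2.6, 2.1))  # 2
```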

Resizing the virtual machines after running the Smart Solver can be less effort, as you only have to resize the VMs that have fewer stars than your desired goal. After adding more cores to the VMs for which CPU resources are too tight, you can rerun Smart Solver to balance the load on the VM hosts to improve the solution a bit more.

TIP: Use a Scenario Comparison report to compare the headroom stars rating for saved scenarios.

Missing or Invalid Data

Data collected by Capacity Advisor is used in the scenarios you create and manipulate. During an interval when no data was collected, the data is considered missing (data may not have been collected, for example, because a system was down during data collection). Invalidated (or invalid) data is data that you have marked as invalid.

For each metric about a system or workload, if a significant amount of data is missing or invalid, the metric is followed by asterisks with the following meaning:

  • [blank] : 91% to 100% of data is valid.

  • * : 51% to 90% of data is valid.

  • ** : 11% to 50% of data is valid.

  • *** : Less than 10% of data is valid.

  • N/A: There is no valid data.

Thus, metrics without asterisks are considered useful and reliable for analysis.
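The asterisk ranges above can be expressed as a small lookup. This is only a sketch: the function name is hypothetical, and the handling of values that fall between the listed ranges (for example, exactly 10%) is an assumption, because the ranges above do not state it.

```python
def validity_marker(valid_percent):
    """Return the marker shown after a metric for a given percentage of valid data."""
    if valid_percent <= 0:
        return "N/A"   # no valid data
    if valid_percent > 90:
        return ""      # blank: 91% to 100% valid
    if valid_percent > 50:
        return "*"     # 51% to 90% valid
    if valid_percent > 10:
        return "**"    # 11% to 50% valid
    return "***"       # assumption: 10% or less (but not zero) valid


for pct in (95, 75, 30, 5, 0):
    print(pct, repr(validity_marker(pct)))
```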

NOTE: In some situations, where the time or time zone on a server is incorrect, it may appear that only old data has been collected. For more information on this topic, see the section on Handling Old Data in the Capacity Advisor Error Messages appendix in this document.

Utilization Limits

Utilization limits allow you to set specific service level objectives for workloads. Beyond overall system utilization, these utilization limits place service level objectives on one or more specific utilization metrics (CPU, memory, network, or disk utilization) for any given workload. When making automated changes, such as the automated system consolidation done by the HP Smart Solver, these utilization limits are honored in determining a solution for automated load balance of servers and virtual machines and for automated workload stacking.

The default utilization limits used globally across Capacity Advisor in the absence of user-defined limits are the following:

  • CPU utilization cannot exceed 70% of the capacity for more than 15 minutes at a time.

  • Memory utilization cannot exceed 100% of the capacity.

(For information on how utilization is calculated for each resource, see Appendix B.)

Specifying Utilization Limits

There are three building blocks to specifying a utilization limit:

  • The Limit. The maximum percentage or absolute amount of a resource allowed to be used by a workload. For example, a CPU utilization limit might be “not above 90%” utilization.

  • The Resource. Utilization limits are applied to specific resources:

    • CPU cores

    • memory

    • network I/O bandwidth

    • disk I/O bandwidth

  • The Time Criteria. You can specify the time portion of a utilization limit in either of two ways:

    • Sustained (consecutive) time limits

    • Percentage of time limits

For more information about time limits, see “Sustained Time Limits” and “Percentage of Time Limits”.

TIP:

You Can Specify More Than One Utilization Limit for a Resource. Using the Utilization Limits Editor, you can add multiple settings for a resource. For example, you can create multiple different utilization limits for CPU cores by varying percentage and allowed duration for each limit. Multiple limits for CPU cores could look like this:

  • Utilization can exceed 90 percent of assigned cores 0 percent of the time

  • Utilization can exceed 85 percent of assigned cores for a maximum of 5 minutes duration

Not Specifying a Limit Allows HP Smart Solver to Over-provision Systems. To achieve best results with the Smart Solver, it is better to set specific limits, rather than to depend on the default settings for limits to provide the best fit.

Sustained Time Limits

A sustained limit specifies a limit where the resource cannot exceed that utilization limit for X consecutive minutes. For example, if X is 20, this means that the resource cannot exceed the utilization limit for 20 consecutive minutes.

Because the Capacity Advisor collects data samples every 5 minutes, the time X for the sustained limit must be a multiple of 5 minutes; the minimum for X is 0 minutes.
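A sustained limit check over 5-minute samples might look like the following sketch. The names are hypothetical, and the sketch assumes that a violation occurs only when utilization stays above the limit for more than X consecutive minutes, so that X = 0 means the limit may never be exceeded.

```python
SAMPLE_MINUTES = 5  # Capacity Advisor stores one data point per 5-minute interval


def violates_sustained_limit(samples, limit, max_minutes):
    """True if utilization stays above `limit` for more than `max_minutes` in a row."""
    run_minutes = 0
    for value in samples:
        if value > limit:
            run_minutes += SAMPLE_MINUTES
            if run_minutes > max_minutes:
                return True
        else:
            run_minutes = 0  # the run of consecutive samples is broken
    return False


# Hypothetical CPU utilization (%) with a 50% limit and X = 20 minutes.
print(violates_sustained_limit([40, 60, 60, 60, 60, 40], 50, 20))      # False
print(violates_sustained_limit([40, 60, 60, 60, 60, 60, 40], 50, 20))  # True
```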

Percentage of Time Limits

A percentage of time limit specifies that the resource cannot exceed the limit for more than the designated percent of time, where percent of time is related to the percentile utilization ranges in the Capacity Advisor data.

Given that there are about 10,000 minutes in a week, 3% of the time is roughly 300 minutes (3% of 10,000). These 300 minutes total to 5 hours per week. Below is a table relating percentages of time to hours per week, which may help you in specifying percent of time utilization limits.

Table 3-2 Percent of Time Conversions

Percent of Time    Minutes/Week    Hours/Week    Hours/Day (24-hour day)
  1                    100.8           1.68          0.24
  2                    201.6           3.36          0.48
  3                    302.4           5.04          0.72
  5                    504.0           8.40          1.20
 10                   1008.0          16.8           2.40
 15                   1512.0          25.2           3.60
 20                   2016.0          33.6           4.80
 25                   2520.0          42.0           6.00
 30                   3024.0          50.4           7.20
100                  10080.0         168.00         24.00
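The values in the table follow from the exact number of minutes in a week (7 × 24 × 60 = 10,080). A short sketch, with a hypothetical function name, reproduces the conversion for any percentage:

```python
MINUTES_PER_WEEK = 7 * 24 * 60  # 10,080


def percent_of_time(percent):
    """Return (minutes per week, hours per week, hours per day) for a percent of time."""
    minutes_per_week = MINUTES_PER_WEEK * percent / 100.0
    hours_per_week = minutes_per_week / 60.0
    hours_per_day = hours_per_week / 7.0
    return minutes_per_week, hours_per_week, hours_per_day


for p in (1, 3, 10, 100):
    minutes, hours_week, hours_day = percent_of_time(p)
    print(f"{p:>3}%  {minutes:8.1f} min/week  {hours_week:6.2f} h/week  {hours_day:5.2f} h/day")
```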

 

Understanding Utilization Limit Messages

Percentage of Allocation

The utilization limit messages are shown as a percentage of allocation, where allocation is a subset of the given hardware for the system the workload is running on. For example, for a 1-core system, the allocation is 1 CPU. The CPU utilization limit of 50% would mean 50% of 1 core, or 0.5 cores. However, the absolute amount that this percentage represents changes when the hardware (allocation) changes. If 2 additional cores are added (say through dynamic CPU migration with vPars), the CPU utilization limit of 50% would mean 50% of 3 cores, or 1.5 cores.
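The relationship between a percentage limit and the current allocation can be sketched as follows; the function name is hypothetical.

```python
def limit_in_cores(limit_percent, allocated_cores):
    """Convert a percentage-of-allocation limit into an absolute number of cores."""
    return allocated_cores * limit_percent / 100.0


print(limit_in_cores(50, 1))  # 0.5
print(limit_in_cores(50, 3))  # 1.5
```

The same 50% limit corresponds to a different absolute limit as soon as the allocation changes.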

The allocation values for network and disk may be updated each time utilization data is collected from the system using the capcollect command. (See the “Command Reference” in this guide.) If a new high observed value occurs during the time period collected, the network or disk allocation value for the system is increased to reflect it. This increased value then affects any network or disk utilization limits for workloads on that system. The current allocation values for a system are displayed on the Profile Viewer page under Platform Characteristics.

With Sustained Limits

A sustained utilization limit can be set such that CPU utilization cannot exceed 50% of allocation for 20 consecutive minutes, where the allocation of hardware is based upon a 3-core system. The utilization limit message would read:

CPU utilization may not exceed 50% of allocation or 1.5 cores for more than 20 minutes.

With Percentage of Time Limits

A percentage of time utilization limit could be set such that CPU utilization cannot exceed 50% of allocation for more than 10% of the time, where the allocation is based upon a 3-core system. The utilization limit message would read:

CPU utilization may not exceed 50% of allocation or 1.5 cores for more than 10% of the time.
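A simplified way to test a percentage of time limit against collected samples is sketched below. It assumes that every sample covers an equal length of time, so the fraction of samples above the limit equals the fraction of time above the limit. The function name is hypothetical, and the sketch does not reproduce how Capacity Advisor derives its percentile ranges.

```python
def violates_percent_of_time_limit(samples, limit, max_percent):
    """True if more than `max_percent` of the samples are above `limit`."""
    if not samples:
        return False
    over = sum(1 for value in samples if value > limit)
    return 100.0 * over / len(samples) > max_percent


# Hypothetical utilization in cores on a 3-core system; limit is 1.5 cores, 10% of the time.
within = [1.0] * 18 + [2.0] * 2   # 10% of the samples are above 1.5 cores
beyond = [1.0] * 17 + [2.0] * 3   # 15% of the samples are above 1.5 cores

print(violates_percent_of_time_limit(within, 1.5, 10))  # False
print(violates_percent_of_time_limit(beyond, 1.5, 10))  # True
```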

Scope of Utilization Limits

Utilization limits can be set to apply broadly or narrowly within the Capacity Advisor user interface:

  • Globally. These limits apply to every workload, wherever workloads are analyzed.

  • By Workload. These limits apply to one specific workload, wherever that workload is analyzed.

  • Scenario-wide. These limits apply to every workload within one specific scenario.

  • By Scenario Workload. These limits apply to one specific workload within one specific scenario.

When a workload falls within more than one scope, only the most specific one applies, as shown in the table below.

You can disable a more specific scope when you do not want it to apply.

Table 3-3 Scope of Utilization Limits

Scope: More global

Global Utilization Limit

  • Description: Applies to all workloads for which a more specific utilization limit is not provided. Cannot be disabled.

  • Overrides: Nothing

Workload Utilization Limit

  • Description: Applies to a specific workload unless a more specific utilization limit is provided. Can be enabled or disabled.

  • Overrides: Global

Scope: More local

Scenario Utilization Limit

  • Description: Applies to all workloads within a scenario for which a more specific utilization limit is not provided. Can be enabled or disabled.

  • Overrides: Global, Workload

Scenario Workload Utilization Limit

  • Description: Applies to a specific workload within a scenario. Can be enabled or disabled.

  • Overrides: Global, Workload, Scenario
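The override order in Table 3-3 can be modeled as a simple precedence lookup. The data layout and names below are hypothetical and are used only for illustration; the sketch also ignores the fact that limits are set per resource.

```python
# Scopes ordered from most specific to most global, following Table 3-3.
PRECEDENCE = ["scenario_workload", "scenario", "workload", "global"]


def effective_limit(limits):
    """Return (scope, limit) for the most specific scope that is defined and enabled.

    `limits` maps a scope name to a dict with "enabled" and "limit" keys.
    The global limit cannot be disabled, so its "enabled" flag is ignored.
    """
    for scope in PRECEDENCE:
        entry = limits.get(scope)
        if entry is not None and (scope == "global" or entry["enabled"]):
            return scope, entry["limit"]
    raise ValueError("no global utilization limit defined")


# Hypothetical CPU limits (percent of allocation) for one workload in one scenario.
limits = {
    "global": {"enabled": True, "limit": 70},
    "workload": {"enabled": True, "limit": 80},
    "scenario": {"enabled": False, "limit": 60},
}

print(effective_limit(limits))  # ('workload', 80)
```

Because the scenario-wide limit is disabled and no scenario workload limit exists, the workload limit applies.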

 

Scaling Multipliers

Capacity Advisor lets you provide a compensating factor, called a scaling multiplier, that adjusts the resources needed when you analyze a move from one platform to another. The following table lists the scaling multipliers that you can use to simulate a potential change more accurately.

(For information on how utilization is calculated for each resource, see Appendix B.)

Table 3-4 Definitions of Scaling Multipliers Used in Capacity Advisor

Multiplier    Meaning / Default / Example
Cooling Multiplier

Meaning. The ratio of the energy consumed by the air conditioning system to remove heat from the machine room to the energy consumed by the computers in that room.

This ratio varies depending on the climate and the type of air conditioner used. It is generally a value between 0.5 and 1.5.

Default. None.

Example. A value entered of 0.9 would mean that for every 10 kilowatt-hours of energy used by the computers, another 9 kilowatt-hours of energy are needed to cool the machine room.

CPU Multiplier %

Meaning. This represents the percent change in the CPU utilization when you are sizing a workload to simulate a new workload. For example, if you want the workload to be 1/2 of an existing workload, the multiplier would be -50 (a 50% decrease).

When accounting for

  • different CPU clock speeds, you should leave this number as is because Capacity Advisor will account for this automatically.

  • different platforms, such as moving from PA to IPF, use the CPU Platform Multiplier %.

Default. The default is 0% (0% change).

Example. If you want the CPU utilization to increase by 10% of the existing workload, the multiplier would be 10 (a 10% change).

CPU Platform Multiplier %

Meaning. This represents the change in CPU utilization due to using a different platform (e.g., PA or IPF). If you are using the same platform, you can keep the multiplier as is.

Default. The default is 0% (0% change).

Basic Example. If you are moving from:

  • PA to PA, keep the value as 0 (no change).

  • PA to IPF, because of the faster IPF processing, use -10 (10% decrease in utilization).

  • IPF to PA, because of the slower processing, use 10 (10% increase in utilization).

  • IPF to ProLiant (Xeon), use 100 (100% increase in utilization).

  • ProLiant (Xeon) to IPF, use -50 (50% decrease in utilization).

Detailed Example. The CPU Platform Multiplier is harder to compute because CPU times are automatically scaled to the clock speed of the CPU cores. Here the multiplier is the ratio of the CPU seconds multiplied by the clock speed on the new system to the same product on the old system: (100*1600)/(400*550), or about 0.73. These figures imply a benchmark that used 100 CPU seconds at 1600 MHz on the new system and 400 CPU seconds at 550 MHz on the PA system. This multiplier is independent of clock speed and is a reasonable estimate when moving from a 650 MHz PA system, even though the benchmark was taken on a 550 MHz system.

The multiplier of 0.73 would be a –27% change. This takes into account several things:

  • Move from a PA system to an Integrity server (tends to lower the multiplier)

  • Move from a two-way to a one-way system (lowers the multiplier)

  • Move from a standalone system to a virtual machine (tends to raise the multiplier)

  • Change of software release (can raise or lower the multiplier)

CPU Virtualization Overhead %

Meaning. This represents the percent change in CPU utilization due to the overhead of running in a virtual system (that is, the additional processing for virtual systems that does not exist on a standalone system where there is no virtualization software running).

Default. The default is 0% (0% change).

Example: When making a server become a virtual machine. If your virtualization software would cause a 5% increase in CPU utilization due to the virtualization overhead, enter 5 for the CPU Virtualization Overhead % to account for the additional demand on the CPU core(s) when changing a server to a VM.

Example: When making a virtual machine become a server. If your virtualization software causes a 5% increase in CPU utilization due to the virtualization overhead within the virtual machine, enter -5 for the CPU Virtualization Overhead % to account for the gain in CPU availability when changing a VM to a server.

Memory Multiplier %

Meaning. This represents the change in the memory utilization when you are sizing a workload to simulate a new workload. For example, if you want the workload to be 1/2 of an existing workload, the multiplier would be -50 (decrease of 50%).

When accounting for different platforms, such as moving from PA to IPF, use the Platform Memory Multiplier %.

Default. The default is 0% (0% change).

Example. If you want the memory utilization to increase by 10% of the existing workload, the multiplier would be 10 (a 10% change).

Platform Memory Multiplier %

Meaning. This represents the change in memory utilization due to using a different platform (e.g., PA or IPF).

Default. The default is 0% (0% change).

Basic Example. If you are moving from:

  • PA to PA, keep the value as 0 (0% change).

  • PA to IPF, use 50 (50% increase) because of IPF's 64-bit addressing.

Detailed Example. The Platform Memory Multiplier is the ratio of the memory used on each system. Using the Detailed Example in the CPU Platform Multiplier %, if the memory ratio is 600/400 or 1.5, the % change would be 50% (a 50% increase). This change is affected primarily by the move to Integrity, which usually increases the memory multiplier, and by getting a new version of the software which can raise or lower the multiplier. Factors like the number of CPU cores and the use of virtual machines have no effect unless the software tests for these things and changes its behavior.
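The percent-change arithmetic used throughout Table 3-4 can be checked with a short sketch. All function names are hypothetical. The sketch assumes that a multiplier of p percent scales a value by (1 + p/100), which is the natural reading of “% change” but is not a statement of how Capacity Advisor applies the values internally.

```python
def apply_percent_change(utilization, percent_change):
    """Scale a utilization value by a percent change (+10 -> x1.10, -50 -> x0.50)."""
    return utilization * (1 + percent_change / 100.0)


def ratio_to_percent_change(ratio):
    """Convert a new/old ratio into the percent change entered as a multiplier."""
    return (ratio - 1) * 100.0


def cpu_platform_ratio(new_cpu_seconds, new_mhz, old_cpu_seconds, old_mhz):
    """Ratio of (CPU seconds x clock speed) on the new system to the old system."""
    return (new_cpu_seconds * new_mhz) / (old_cpu_seconds * old_mhz)


def cooling_energy_kwh(computer_energy_kwh, cooling_multiplier):
    """Energy needed for cooling, given the energy used by the computers."""
    return computer_energy_kwh * cooling_multiplier


# CPU Platform Multiplier, detailed example: (100*1600)/(400*550)
ratio = cpu_platform_ratio(100, 1600, 400, 550)
print(round(ratio, 2), round(ratio_to_percent_change(ratio)))  # 0.73 -27

# Platform Memory Multiplier, detailed example: 600/400
print(ratio_to_percent_change(600 / 400))                      # 50.0

# CPU Multiplier % of -50 halves a hypothetical 2.0-core workload
print(apply_percent_change(2.0, -50))                          # 1.0

# Cooling Multiplier of 0.9 applied to 10 kWh of computer energy
print(cooling_energy_kwh(10, 0.9))                             # 9.0
```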

 

© 2006-2008 Hewlett-Packard Development Company, L.P.