HP XC System Software: XC Installation Guide > Chapter 3 Configuring and Imaging the System

Task 12: Run the startsys Utility to Start the System and Propagate the Golden Image

The first time the entire system is started with the startsys command, power to each node is turned on, each node boots from its network adapter, and the SystemImager automatic installation environment is downloaded. This environment automatically installs and configures each node from the golden image. On large-scale systems, the startsys command might take several minutes just to power on the nodes.

The number of nodes to be installed influences how long the process takes. After all nodes are installed, they automatically reboot to the login prompt. This process can take two to three hours on a system with 1024 compute nodes.

This release uses multicast file transfer technology to download software to client nodes during their image installation. Multicast file transfer provides a fast and scalable method of installing systems. Multicast imaging sends data simultaneously to many nodes that have previously been set up to listen for a multicast from the designated image server. Compared to other file transfer technologies, multicast imaging places very little load on the image server and therefore allows systems of all sizes to be installed relatively quickly.

Multicast imaging uses the open source udpcast package and the flamethrower functionality of SystemImager. A series of udp-sender daemons runs on the image server, and each client node runs a series of udp-receiver daemons during the imaging operation. The startsys command manages the udp-sender daemons: it starts them when the --image_only or --image_and_boot option is entered on the command line and shuts them down after the imaging operation is complete. Therefore, you must use startsys when performing a full installation through the imaging operation.

Startup Procedure

Follow this procedure to start the system and propagate the golden image to all nodes; the command-line options depend on the number of nodes in the system:

  1. Determine whether you want to override files delivered in the golden image. This step is optional and is typically not necessary during an initial system installation and configuration. However, you can modify the files delivered in the golden image; the HP XC System Software Administration Guide describes how to do so.

    NOTE: If the hardware configuration is not homogeneous (that is, not all nodes are the same hardware model) and contains one or more HP ProLiant DL145 G2 nodes that use InfiniBand PCI-X cards, add the kernel boot option noapic to the grub.conf file as an override to the system image.
  2. Make sure the XC.lic license key file is located in the following directory:

    # ls /opt/hptc/etc/license
    CAUTION: You cannot continue if the license key file is not present in this directory. See “Task 7: Have the License Key File Ready” and “Put the License Key File in the Correct Location (Required)” for more information about obtaining and positioning the license key file if you have not already done so.
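
The license check in this step can be scripted. The following is a minimal sketch, not part of the HP XC software: check_license_file is a hypothetical helper name, and the only facts taken from this guide are the directory /opt/hptc/etc/license and the file name XC.lic.

```shell
#!/bin/sh
# Sketch: confirm that the license key file is in place before running startsys.
# check_license_file is a hypothetical helper, not an HP XC command.
# Usage: check_license_file [DIRECTORY]
check_license_file() {
    # Default to the directory named in this guide.
    license_dir="${1:-/opt/hptc/etc/license}"
    if [ -f "$license_dir/XC.lic" ]; then
        echo "found: $license_dir/XC.lic"
        return 0
    fi
    echo "missing: $license_dir/XC.lic" >&2
    return 1
}
```

A script might call `check_license_file || exit 1` before going on to the remaining steps.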
  3. Ensure that the power is off on all nodes except the head node.

  4. Use the startsys command to turn on power to all nodes, image the nodes, and boot the nodes.

    The command-line options for the initial system image and boot are listed in Table 3-10 and depend on the number of nodes, whether or not the hardware configuration contains HP server blades and enclosures, and the size of the disks.

    Table 3-10 startsys Command-Line Options Based on Hardware Configuration

    Hardware Configuration | startsys Command Line
    Fewer than 300 nodes

    For small-scale hardware configurations, nodes are imaged and rebooted in one operation. The nodes then complete their per-node configuration phase, which completes the installation. This option applies only to nodes that have previously been set up to network boot.

    Enter the following command to image and boot all nodes in one step:

    # startsys --image_and_boot
    More than 300 nodes

    For large-scale hardware configurations, booting nodes while imaging is not recommended. Thus, issue the following commands to image and boot nodes in two separate steps:

    1. Propagate the golden image to all nodes:

      # startsys --image_only 
    2. Boot all nodes:

      # startsys --boot_group_delay=240
    NOTE: Use the --boot_group_delay=240 option only the first time nodes are booted after being imaged. The value 240 specifies the number of seconds to wait between groups of nodes as they are booting.
    Contains HP server blades and enclosures

    If the hardware configuration contains HP server blades, booting nodes while imaging is not recommended, and additional options are required on the command line. Issue the following commands to image and boot all nodes in two separate steps:

    1. Propagate the golden image to all nodes:

      # startsys --image_only --flame_sync_wait=480 \
           --power_control_wait=90 \
           --image_timeout=90
    2. Boot all nodes when the imaging process is complete:

      # startsys --power_control_wait=90 \
           --boot_group_delay=45 \
           --max_at_once=50
    Nodes have disks that are 250 GB or larger in size or have SATA disks

    If the hardware configuration contains disks that are 250 GB or larger in size or contains SATA disks, you might need to increase the 45-minute default that the startsys command allows for imaging.

    For example, on systems with disks larger than 250 GB, increase the image timeout limit to 60 minutes, as follows:

    # startsys --image_and_boot --image_timeout=60

    You might use a different value depending upon the size of the disks in the system.

     

    For more information about startsys command-line options and option values, see startsys(8).
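
The selection logic in Table 3-10 can be summarized as a small helper that prints, but does not run, the matching command lines. This is a sketch under stated assumptions: suggest_startsys is a hypothetical name, the table does not say how to treat exactly 300 nodes (the sketch treats that case as large-scale), and the --image_timeout adjustment for large or SATA disks is left out because the right value depends on the disks.

```shell
#!/bin/sh
# Sketch: print (do not run) the startsys command lines from Table 3-10.
# suggest_startsys is a hypothetical helper, not an HP XC command.
# Usage: suggest_startsys NODE_COUNT [blades]
suggest_startsys() {
    nodes="$1"
    blades="${2:-}"
    if [ "$blades" = "blades" ]; then
        # Configurations with HP server blades and enclosures: two steps.
        echo "startsys --image_only --flame_sync_wait=480 --power_control_wait=90 --image_timeout=90"
        echo "startsys --power_control_wait=90 --boot_group_delay=45 --max_at_once=50"
    elif [ "$nodes" -lt 300 ]; then
        # Small-scale configurations: image and boot in one step.
        echo "startsys --image_only" >/dev/null
        echo "startsys --image_and_boot"
    else
        # Assumption: exactly 300 nodes is handled like a large-scale system.
        echo "startsys --image_only"
        echo "startsys --boot_group_delay=240"
    fi
}
```

For example, `suggest_startsys 16` prints the single --image_and_boot command, and `suggest_startsys 1024` prints the two-step sequence.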

  5. If you want to watch as the startsys command images and turns on power to the nodes, open a second terminal window and issue a tail command to view the following log files:

    • /hptc_cluster/adm/logs/imaging.log

    • /hptc_cluster/adm/logs/startsys.log

    Command output on a small, 16-node configuration is similar to the following:

    Fri Jul 06 08:49:10 2007 Enabled nodes: 16 nodes -> n[1-16]
    Fri Jul 06 08:49:12 2007 Removing the execution node: n16
    Fri Jul 06 08:49:12 2007 Boot hierarchy of specified 
       nodes is: n15 n[1-14]
    Fri Jul 06 08:49:15 2007 Initial power test - please wait.
    Fri Jul 06 08:49:24 2007 Nodes that will image: 
       15 nodes -> n[1-15]
    You must manually power on the following nodes:
    n1
    Press enter after applying power to these nodes.
    
    continuing ........
    Fri Jul 06 08:49:29 2007 Powering on for image: 
       14 nodes -> n[2-15]
    Fri Jul 06 08:50:34 2007 Retrying power --on command: 
       3 nodes -> n[2-3,15]
    
    *** Fri Jul 06 08:52:19 2007 Current statistics:
      Imaging: 15 nodes -> n[1-15]
    
    Progress:
    Flamethrower started: nodes waiting: 15 nodes -> n[1-15]
    
    *** Fri Jul 06 08:55:19 2007 Current statistics:
      Imaging: 15 nodes -> n[1-15]
    
    Progress:
    
    *** Fri Jul 06 08:58:19 2007 Current statistics:
      Imaging: 15 nodes -> n[1-15]
    
    Progress:
    Fri Jul 06 08:58:34 2007 Imaging completed; will be powered off: 
       2 nodes -> n[1-2]
    You must manually power off the following nodes: (1)
    n1
    Press enter after removing power from these nodes.
    
    
    continuing ........
    Fri Jul 06 08:59:02 2007 Powering off: 1 node -> n2
    Fri Jul 06 08:59:48 2007 Imaging completed; will be powered off: 
       9 nodes -> n[4-10,12,14]
    Fri Jul 06 08:59:48 2007 Powering off: 9 nodes -> n[4-10,12,14]
    Fri Jul 06 09:00:04 2007 Imaging completed; will be powered off: 
       3 nodes -> n[11,13,15]
    Fri Jul 06 09:00:04 2007 Powering off: 3 nodes -> n[11,13,15]
    Fri Jul 06 09:00:52 2007 Imaging completed; will be powered off: 
       1 node -> n3
    Fri Jul 06 09:00:52 2007 Powering off: 1 node -> n3
    Fri Jul 06 09:01:07 2007 Retrying power --off command: 1 node -> n15
    
    *** Fri Jul 06 09:01:22 2007 Current statistics:
      Waiting for hierarchy to boot: 15 nodes -> n[1-15]
    Progress:
    Fri Jul 06 09:01:22 2007 Powering on for boot: 1 node -> n15
    Fri Jul 06 09:02:33 2007 Retrying power --on command: 1 node -> n15
    Fri Jul 06 09:04:18 2007 Processing completed for: 1 node -> n15
    
    *** Fri Jul 06 09:04:33 2007 Current statistics:
    Booted and available: 1 node -> n15
    Waiting for hierarchy to boot: 14 nodes -> n[1-14]
    
      Progress:
    You must manually power on the following nodes: (2)
    n1
    Press enter after applying power to these nodes.
    
    continuing ........
    Fri Jul 06 09:04:37 2007 Powering on for boot: 
       13 nodes -> n[2-14]
    Fri Jul 06 09:05:33 2007 Retrying power --on command: 
       12 nodes -> n[2-6,8-14]
    Fri Jul 06 09:06:48 2007 Processing completed for: 1 node -> n1
    Fri Jul 06 09:07:03 2007 Processing completed for: 1 node -> n7
    Fri Jul 06 09:07:18 2007 Processing completed for: 
       9 nodes -> n[4-5,8-14]
    
    *** Fri Jul 06 09:07:33 2007 Current statistics:
    Booted and available: 15 nodes -> n[1-15]
    
    Progress:
    Fri Jul 06 09:07:33 2007 Processing completed for: 
       3 nodes -> n[2-3,6]
    
    *** Fri Jul 06 09:07:33 2007 Current statistics:
    Booted and available: 15 nodes -> n[1-15]
    
    Progress:
    Fri Jul 06 09:07:34 2007 startsys process exiting with code 0  (3)

    The numbered callouts in the preceding output have the following meanings:

    (1) Watch the screen carefully for the message "You must manually power off the following nodes." This message means that you must go to the specific node and press the power button to turn off power to the node. After doing so, return to the screen and press the Enter key.

    (2) Watch the screen carefully for the message "You must manually power on the following nodes." This message means that you must go to the specific node and press the power button to turn on power to the node. After doing so, return to the screen and press the Enter key.

    (3) The message "exiting with code 0" indicates successful completion.
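
The final status check can also be done on saved output. The following sketch assumes that the startsys output was captured to a file and that the file contains a line of the form shown in the sample output ("startsys process exiting with code N"). Whether startsys.log itself contains that line is an assumption, and startsys_exit_code is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: report the exit code recorded in saved startsys output.
# startsys_exit_code is a hypothetical helper, not an HP XC command.
# Usage: startsys_exit_code FILE
startsys_exit_code() {
    # Print N from the last "startsys process exiting with code N" line.
    # Prints nothing if the file has no such line.
    sed -n 's/.*startsys process exiting with code \([0-9][0-9]*\).*/\1/p' "$1" | tail -n 1
}
```

An empty result suggests that startsys has not finished or that its output was not captured; any value other than 0 is a reason to consult the troubleshooting information.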

  6. See “Troubleshooting the Imaging Process” if you encounter problems during the node imaging process.

Proceed to “Task 13: Perform Postconfiguration Tasks for the InfiniBand Interconnect”.

© 2003–2007 Hewlett-Packard Development Company, L.P.