HP XC System Software : Installation Guide > Chapter 7 Upgrading Your HP XC System

Task 7: Configure the System and Propagate the Golden Image

Follow this procedure to configure your upgraded system and propagate the new golden image to all client nodes:

  1. Run the following utility to back up the existing database and migrate existing data to the new release format:

    # upgradesys

    Command output looks similar to the following:

    The upgradesys utility performs all the necessary steps
    to upgrade your cluster.  This script should be run immediately 
    after you have upgraded the head node with the latest XC software 
    and any third party vendor rpms.  
    
    Do you wish to continue? [y/n] y
    Backing up database to 
    /opt/hptc/etc/sysconfig/upgrade/upgradesys.dbbackup-20060103145027.sql ...
    Executing C02database gupdate
    Starting MySQL:                                            [  OK  ]
    Executing C20server_type gupdate
    Executing C30device_names gupdate
    Executing C33etc_hosts gupdate
    Executing C35region gupdate
    Executing C40role_migration gupdate
    Executing C90systemimager gupdate
    Removing XC MLIB RPMs
    
    upgradesys output logged to /var/log/upgradesys/upgradesys.log
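
Before continuing, it can be useful to confirm that the database backup was actually written. The following is a minimal sketch, not part of the XC software: the function name latest_upgradesys_backup is hypothetical, and the default directory and file name pattern are assumptions taken from the sample output above.

```shell
# Hypothetical helper: print the newest upgradesys database backup in a
# directory. The default directory and the file name pattern are assumptions
# based on the sample output; adjust them if your system differs.
latest_upgradesys_backup() {
    dir="${1:-/opt/hptc/etc/sysconfig/upgrade}"
    newest=""
    for f in "$dir"/upgradesys.dbbackup-*.sql; do
        [ -e "$f" ] || continue   # the glob matched nothing
        newest="$f"               # globs expand in sorted order, so the
                                  # last match has the latest timestamp
    done
    [ -n "$newest" ] || return 1  # no backup found
    printf '%s\n' "$newest"
}
```

Calling latest_upgradesys_backup with no argument prints the most recent backup file, or returns a nonzero status if none exists.
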
  2. Run the cluster_config utility to configure your system. Table 7-7 describes the two cluster_config options you can use to reconfigure a system after a software upgrade. Choose the option that matches how you want the upgrade to proceed.

    Table 7-7 Upgrade Options for the cluster_config Utility

    --migrate Option

    Performs a series of known migration steps to bring existing, recognized roles in the database into alignment with the new roles introduced in this release.

    Using this option does not guarantee the correct migration steps for unrecognized (user-created) roles and services in the database.

    Before you decide to use this option, view the /opt/hptc/etc/sysconfig/upgrade/role_migration.ini file to see how your previous role assignments compare to the roles provided in XC Version 3.0.

    --init Option

    Initializes (resets) your existing node role assignments and configures your system with the default node role assignments introduced in XC Version 3.0.

    The default roles and assignments have been optimized for performance in XC Version 3.0, and you may decide that this configuration is better suited for your environment.

    See Chapter 8 for a description of roles and the services provided by them, as well as the default node role assignments.

    Enter one of the following cluster_config options:

    • To migrate your existing system configuration:

      # /opt/hptc/config/sbin/cluster_config --migrate

      Proceed to step 3.

    • To apply new default role assignments to your existing system configuration:

      # /opt/hptc/config/sbin/cluster_config --init

      If you specify the --init option, you must reassign in the next step any roles you previously customized. For example, if your system configuration had login roles on one or more nodes, you must assign a login role to every node on which you want users to be able to log in. In the default configuration, a login role is not assigned to any node.

    Proceed to step 3.
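
The choice between the two options can be sketched as a small helper that prints the matching command line. This is only an illustration: the function name cluster_config_command is hypothetical, and the helper prints the command rather than running it so that you can review it first.

```shell
# Hypothetical helper: print the cluster_config command for an upgrade mode.
# It does not run cluster_config; it only prints the command line.
cluster_config_command() {
    case "$1" in
        migrate)
            echo "/opt/hptc/config/sbin/cluster_config --migrate" ;;
        init)
            # --init resets role assignments, so customized roles
            # (for example, login roles) must be reassigned in step 3.
            echo "reminder: reassign customized roles after --init" >&2
            echo "/opt/hptc/config/sbin/cluster_config --init" ;;
        *)
            echo "usage: cluster_config_command migrate|init" >&2
            return 2 ;;
    esac
}
```

For example, cluster_config_command migrate prints the --migrate command line shown above. Viewing the role_migration.ini file first, as Table 7-7 recommends, is still the basis for the decision.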

  3. The cluster_config utility displays the following menu. Enter the letter p to proceed with the system configuration process; refer to Appendix F for information about using this menu.

    Important:

    If the head node was configured as a NIS slave server in the previous release, you must assign the nis_server role to the head node now, because this role is being introduced in this release. Use the [M]odify option of the cluster_config utility to assign this role to the head node.

    [L]ist Nodes, [M]odify Nodes, [A]nalyze, [H]elp, [P]roceed, [Q]uit: p
    
    Do you want to apply your changes to the cluster configuration? [y/n]  y
    [S]ervices Config, [P]roceed, [Q]uit: p
    
    Do you want to apply your changes to the service configuration? [y/n] y
  4. Monitor the cluster_config utility as it configures your system and, when prompted, provide the answers listed in Table 7-8.

    Table 7-8 Answers to cluster_config Prompts

    Prompt                            Answer

    Regenerate ssh keys?              yes
    Recreate the qsnet database?      yes
      (For systems using a QSnetII interconnect)
    Reconfigure SLURM?                yes
    Create new slurm.conf file?       yes
    Install LSF?                      yes
    Upgrade to new version of LSF?    u (upgrade)
    All other prompts                 Accept the default response

    Output from the cluster_config command looks similar to the following:

    Configuring system wide functions / policies / behaviors 
    Executing C02ssh_config sconfigure
    Root ssh keys for the cluster already exist
    (Warning: you will not be able to ssh/pdsh to other nodes until 
    you reimage them) 
    Would you like to regenerate them? ([n]/y) y
    Executing C10cluster_fstab sconfigure
    Executing C20sysparams sconfigure
    NFS daemon tuning:
    Given that there are 6 nodes in this cluster, enter the number of
    NFS daemons that shall be configured to support them [8] : Enter
     
    Executing C75mpiic sconfigure
    Configuring service specific functions
    Executing C05pdsh gconfigure
    Executing C08ntp gconfigure
    Configuring the following nodes as ntp servers for the cluster:        
             n16
    
    You must now specify the clock source for the server nodes.  
    If the nodes have external connections, you may specify up 
    to 4 external NTP servers.  Otherwise, you must use the node's 
    system clock.
    Enter the IP address or host name of the first external NTP server
    or leave blank to use the system clock on the NTP server node: 
    Renaming previous /etc/ntp.conf to /etc/ntp.conf.bak
    
    Executing C10hptc_cluster_fs gconfigure 
    Executing C20gmmon gconfigure 
    Executing C30swmlogger gconfigure 
    Executing C30syslogng_forward gconfigure 
    Executing C35dhcp gconfigure 
    Executing C50cmf gconfigure 
    Executing C50nagios gconfigure 
    
    Would you like to enable web based monitoring? ([y]/n) y
    Enter the password for the 'nagiosadmin' web user:
    New password:
    Re-type new password:
    Adding password for user nagiosadmin
    Executing C50nat gconfigure
    Executing C50supermond gconfigure
    Executing C51nagios_monitor gconfigure
    Executing C60nis gconfigure
        Network Information Service (NIS) Configuration
    
    This step sets up one or more NIS servers within the XC system
    that are "slaves" to an external NIS "master".  The master NIS
    server provides the slaves with copies of its NIS maps.
    
    In order to successfully complete this configuration step, the NIS
    master must have been previously set to allow slaves to communicate
    with it.  On Linux systems, this is typically accomplished by adding
    the NIS slave hostname(s) to the /var/yp/ypservers file on the NIS
    master, and then running 'make'.
    
    In addition, to complete this configuration, you will need to provide
    
    1) the name or IP address of the NIS master, and
    2) the NIS domain name hosted by the NIS master
    
    Enter the name or IP address of the external NIS master: [] NIS_IP_address
    Enter the NIS domain hosted by the NIS master: [] your_NIS_domain
    
    Executing C90munge gconfigure
    Executing C90slurm gconfigure
    
    Do you want to configure SLURM now? (y/n) [y]:y 
    An existing SLURM configuration file has been detected.
    Do you want to delete this file and generate a new one?
    Answering 'no' means to edit the existing file. (y/n) [n]: y
    
    This SLURM configuration needs a special SLURM user. The SLURM
    controller daemons will be run by this user, and certain SLURM
    runtime files will be owned by this user.
    Enter the SLURM username [slurm]: Enter
    
    n16 is the only node with the Resource Management
    role. Therefore the SLURM Master Controller daemon will be set up
    on this node, and there will be no SLURM Backup Controller.
    The current Compute Node configuration is:
        NodeName=n[11-16] Procs=2
    
    NOTE: The only Partition created by default is the lsf
    partition. If you want additional partitions, configure
    them manually in the /hptc_cluster/slurm/etc/slurm.conf file.
    
    The current Node Partition configuration is:
        PartitionName=lsf RootOnly=YES Shared=FORCE Nodes=n[11-16]
    
    Do you want to enable SLURM-controlled user-access to the
    compute nodes? (y/n) [n]: n 
    
    SLURM configuration complete. Press 'Enter' to continue: Enter
    Executing C95lsf gconfigure 
    
    Do you want to install LSF locally now? (y|n) [y]: y
    
    LSF appears to be already installed. Do you want to upgrade this 
    installation, or delete it and perform a clean install?
    ([u]pgrade or [d]elete) [u]: u
    
    Pre-installation check report saved as text file: 
    /opt/hptc/lsf/files/lsfhpc/install-20051216023643/hpc6.1_hpcinstall/   \
       prechk.rpt.
    
    ... Done LSF pre-installation check.
    
    ... Done installing hpc binary files "linux2.6-glibc2.3-ia32e-slurm".
    
    ... LSF configuration is done.
    
    hpcinstall is done.
    
    To complete your hpc installation and get your 
    cluster "hptclsf" up and running, follow the steps in 
    "/opt/hptc/lsf/files/lsfhpc/install-20051216023643/hpc6.1_hpcinstall/   \
           hpc_getting_started.html".
    
    After setting up your LSF server hosts and verifying 
    your cluster "hptclsf" is running correctly, 
    see "/opt/hptc/lsf/top/6.1/hpc_quick_admin.html" 
    to learn more about your new LSF cluster.
    
    
    ***Begin LSF-HPC Post-Processing***
    
    Created '/hptc_cluster/lsf/tmp'...
    
    Editing /opt/hptc/lsf/top/conf/lsf.cluster.hptclsf...
    Moving /opt/hptc/lsf/top/conf/lsf.cluster.hptclsf
     to /opt/hptc/lsf/top/conf/lsf.cluster.hptclsf.old.6490...
    
    Editing /opt/hptc/lsf/top/conf/lsf.conf...
    Moving /opt/hptc/lsf/top/conf/lsf.conf
     to /opt/hptc/lsf/top/conf/lsf.conf.old.6490...
    
    Editing /opt/hptc/lsf/top/conf/lsbatch/hptclsf/configdir/lsb.params...
    Moving /opt/hptc/lsf/top/conf/lsbatch/hptclsf/configdir/lsb.params
     to /opt/hptc/lsf/top/conf/lsbatch/hptclsf/configdir/lsb.params.old.6490...
    
    Replaced default lsb.queues with a preconfigured lsb.queues.
    
    C95lsf finished
    
    Configuring the image replication environment
        Initializing 172.20.0.16 as golden client
        Creating the golden image (takes approximately 10 minutes)
    
    **Do not interrupt this process or else the golden image will be incomplete**
    
        Setting up the bootserver
        Linking client nodes to their autoinstall script
        Initializing service persistence
        Sanitizing services in the golden image
        Creating golden image 'tar' file (takes approximately 10-15 minutes)
        Verifying integrity of golden image 'tar' file
    
    Image replication environment configuration complete.
    info: nconfig started
    info: Executing on head node
    
    info: Executing C02network nconfigure
    info: Executing C04iptables nconfigure
    info: Executing C06nfs_server nconfigure
    info: Executing C08ntp nconfigure
    info: Executing C10hptc_cluster_fs nconfigure
    info: Executing C10hptc_cluster_fs_client nconfigure
    info: Executing C20gmmon nconfigure
    info: Executing C30swmlogger nconfigure
    info: Executing C30syslogng_forward nconfigure
    info: Executing C40hpasm nconfigure
    info: Executing C50cmf nconfigure
    info: Executing C50collectl nconfigure
    info: Executing C50gather_data nconfigure
    info: Executing C50hptc-lm nconfigure
    info: Executing C50nagios nconfigure
    info: Executing C50nat nconfigure
    info: Executing C50supermond nconfigure
    info: Executing C51nagios_monitor nconfigure
    info: Executing C51nrpe nconfigure
    info: Executing C90munge nconfigure
    info: Executing C90slurm nconfigure
    info: Executing C95lsf nconfigure
    info: Executing C30syslogng_forward cconfigure
    info: Executing C35dhcp cconfigure
    info: Executing C50supermond cconfigure
    info: Executing C90munge cconfigure
    info: Executing C90slurm cconfigure
    info: Executing C95lsf cconfigure
    info: nconfig shut down
    info: nconfig started
    info: Executing on head node
    
    info: Executing C02network nrestart
    info: Executing C04iptables nrestart
    info: Executing C06nfs_server nrestart
    info: Executing C08ntp nrestart
    info: Executing C10hptc_cluster_fs nrestart
    info: Executing C10hptc_cluster_fs_client nrestart
    info: Executing C20gmmon nrestart
    info: Executing C30swmlogger nrestart
    info: Executing C30syslogng_forward nrestart
    info: Executing C40hpasm nrestart
    info: Executing C50cmf nrestart
    info: Executing C50collectl nrestart
    info: Executing C50gather_data nrestart
    info: Executing C50hptc-lm nrestart
    info: Executing C50nagios nrestart
    info: Executing C50nat nrestart
    info: Executing C50supermond nrestart
    info: Executing C51nagios_monitor nrestart
    info: Executing C51nrpe nrestart
    info: Executing C90munge nrestart
    info: Executing C90slurm nrestart
    info: Executing C95lsf nrestart
    info: Executing C30syslogng_forward crestart
    info: Executing C35dhcp crestart
    info: Executing C50supermond crestart
    info: Executing C90munge crestart
    info: Executing C90slurm crestart
    info: Executing C95lsf crestart
    info: nconfig shut down
  5. Review the backup copy of the slurm.conf file, which is located in the /hptc_cluster/slurm/etc/slurm.conf.bak file. If you previously customized the slurm.conf file, you must merge those customizations into the new version of the /hptc_cluster/slurm/etc/slurm.conf file. Otherwise, skip this step.
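
One way to find customizations that need merging is to compare the backup with the regenerated file. The sketch below assumes the standard diff utility is installed; the function name slurm_conf_changes is hypothetical, and the default paths are the ones named in this step.

```shell
# Hypothetical helper: show differences between the backup slurm.conf and
# the regenerated one. Follows diff's exit status convention:
# 0 = identical, 1 = files differ, 2 = trouble (for example, a missing file).
slurm_conf_changes() {
    old="${1:-/hptc_cluster/slurm/etc/slurm.conf.bak}"
    new="${2:-/hptc_cluster/slurm/etc/slurm.conf}"
    diff -u "$old" "$new"
}
```

In the unified diff output, lines that begin with a minus sign exist only in the backup file and are the candidates for merging into the new file.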

  6. If your system is using a QSnetII or Myrinet interconnect, re-enter the monitoring line card entries in the /etc/dhcpd.conf file. See Appendix D for more information about adding these entries to the file.

    Skip this step if your system is using an InfiniBand or Gigabit Ethernet interconnect.

  7. Enter one of the following commands depending upon the size of your system:

    • On systems with fewer than 300 nodes, enter this command to image and boot all client nodes:

      # startsys --image_and_boot
    • On systems with more than 300 nodes, enter this command to image the client nodes. Then, proceed to step 8 to boot the nodes after they are imaged.

      # startsys --image_only
  8. On systems with more than 300 nodes, enter the following command to boot the client nodes, because the nodes were not booted during the imaging operation:

    # startsys --boot_group_delay=240
    Note:

    The --boot_group_delay=240 option is used only the first time nodes are booted after being imaged; the value 240 specifies the number of seconds to wait between groups of nodes as they boot.
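
The size-dependent choice in steps 7 and 8 can be summarized in a short sketch. The function name startsys_commands is hypothetical, and the helper only prints the commands to run. The procedure does not say how to treat a system with exactly 300 nodes; this sketch assumes it is handled like a larger system.

```shell
# Hypothetical helper: print the startsys commands for a given node count.
# Assumption: exactly 300 nodes is treated like a system with more than 300.
startsys_commands() {
    nodes="$1"
    case "$nodes" in
        ''|*[!0-9]*)
            echo "node count must be a non-negative integer" >&2
            return 2 ;;
    esac
    if [ "$nodes" -lt 300 ]; then
        # Smaller systems: image and boot in one operation (step 7).
        echo "startsys --image_and_boot"
    else
        # Larger systems: image first (step 7), then boot in groups (step 8).
        echo "startsys --image_only"
        echo "startsys --boot_group_delay=240"
    fi
}
```

For example, startsys_commands 64 prints the single --image_and_boot command, while startsys_commands 512 prints the two-command sequence.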

  9. Make sure all nodes are up:

    # power --status
  10. If your system is configured with LSF-HPC with SLURM, run the SLURM postconfiguration utility to update the slurm.conf file with compute node names and attributes:

    # spconfig
  11. Set up the LSF environment by sourcing the LSF profile file:

    # . /opt/hptc/lsf/top/conf/profile.lsf
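
After sourcing the profile, a quick check can confirm that the LSF environment is present in the current shell. This is a minimal sketch under one assumption: that the profile.lsf file sets the LSF_ENVDIR variable, as LSF profile files commonly do. The function name lsf_env_loaded is hypothetical.

```shell
# Hypothetical helper: succeed if the LSF environment appears to be loaded.
# Assumption: sourcing profile.lsf sets LSF_ENVDIR in the current shell.
lsf_env_loaded() {
    [ -n "${LSF_ENVDIR:-}" ]
}
```

A nonzero return status from lsf_env_loaded suggests that the profile was not sourced in this shell, for example because it was run as a script instead of being sourced with the dot command.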
© 2003 Hewlett-Packard Development Company, L.P.