rack_tool(1)

Date

2022-03-11

Name

rack_tool - tool for doing operations related to one or more IPU-Machines in a rack

Synopsis

rack_tool.py [––version] [––help] <command> [<args>] [––ipum name bmc_ip gw_ip bmc_username bmc_password gw_username gw_password] [––config-file config_file] [––global-credentials bmc_username bmc_password gw_username gw_password] [––log-dir log_directory]
rack_tool.py bist [––help]
rack_tool.py bmc-factory-reset [––help]
rack_tool.py hostname [––help]
rack_tool.py install-key [––help]
rack_tool.py logging-server [––help] ––address address ––port port ––device device
rack_tool.py power-cycle [––help]
rack_tool.py power-off [––help] [––hard]
rack_tool.py power-on [––help]
rack_tool.py run-command [––help] ––command command -d device
rack_tool.py status [––help] [––no-color] [––show-running-version] [––show-json]
rack_tool.py update-root-overlay [––help] [––overlay overlay_directory]
rack_tool.py upgrade [––help] [––component components] [––golden] [––gw-root-overlay path_to_overlay] [––sftp]
rack_tool.py vipu-test [––help] [––ipu-link-topology topology] [––vipu-path path_to_vipu_binaries]

Description

rack_tool simplifies running operations on all the IPU-Machines in a rack, such as upgrading firmware, controlling power and checking status.

Options

The options ––ipum, ––global-credentials and ––config-file must go after the command parameters.
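The ordering rule above can be illustrated with a minimal Python sketch that assembles an argument list with the machine-selection options placed last. The helper name build_argv is hypothetical and not part of rack_tool; the flags are written with plain ASCII double hyphens, as in the examples in this page.

```python
def build_argv(command, command_args=(), ipums=(), global_credentials=None):
    """Return an argument list with the machine-selection options placed last."""
    # The command and its own parameters come first.
    argv = ["rack_tool.py", command, *command_args]
    for ipum in ipums:
        # Each --ipum takes: name bmc_ip gw_ip, optionally followed by credentials.
        argv += ["--ipum", *ipum]
    if global_credentials is not None:
        # Order: bmc_username bmc_password gw_username gw_password.
        argv += ["--global-credentials", *global_credentials]
    return argv


argv = build_argv(
    "power-off",
    command_args=["--hard"],
    ipums=[("machine1", "10.1.1.1", "10.1.1.2")],
    global_credentials=("root", "password", "itadmin", "password"),
)
print(" ".join(argv))
```

The resulting list could be passed to subprocess.run() unchanged; the machine name, addresses and credentials are the placeholder values used elsewhere in this page.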

-a, ––address address

Set the IP address of the logging server.

-c, ––command command

Command to be executed on the device.

––component components

A list of specific components to be upgraded. If not specified, then all components are upgraded. See also Modes.
The possible values are:
  • bmc: Upgrade the BMC firmware.

  • bmc-all: Upgrade all BMC components: bmc, icu and system-fpga. This flag will not work in RNIC-only mode.

  • gw-all: Upgrade all GW components: gw, vipu-agents and vipu-standalone. This flag will not work in BMC-only mode. In RNIC-only mode this will not upgrade vipu-standalone.

  • gw: Upgrade the GW firmware.

  • icu: Upgrade the ICU firmware.

  • system-fpga: Upgrade the system FPGA.

  • vipu-agents: Upgrade the V-IPU agents for each IPU.

  • vipu-standalone: Upgrade the V-IPU controller and configure it to run as a service.
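As an illustration only, the relationship between the group values and the individual components listed above can be sketched in Python. The function name expand_components and the rnic_only parameter are hypothetical; the sketch does not model the cases in which a group value fails outright.

```python
# Group values and their members, as listed in the option description.
GROUPS = {
    "bmc-all": ["bmc", "icu", "system-fpga"],
    "gw-all": ["gw", "vipu-agents", "vipu-standalone"],
}


def expand_components(requested, rnic_only=False):
    """Expand group values into individual components, preserving order."""
    expanded = []
    for name in requested:
        for component in GROUPS.get(name, [name]):
            # In RNIC-only mode, gw-all does not upgrade vipu-standalone.
            if rnic_only and name == "gw-all" and component == "vipu-standalone":
                continue
            if component not in expanded:
                expanded.append(component)
    return expanded


print(expand_components(["bmc-all", "gw"]))
print(expand_components(["gw-all"], rnic_only=True))
```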

––config-file config_file

Config file with information about IPU-Machines to connect to. The config file will be ignored if the ––ipum parameter is given. See Config file format.

Example:
rack_tool.py upgrade --config-file /home/ipuuser/my_config_file.json

-d, ––device bmc|gw

Specify whether this command applies to the BMC or the GW.

––global-credentials bmc_username bmc_password gw_username gw_password

Set a common set of login details for the IPU-Machines selected with the ––ipum option. If this option is used, the username and password parameters for the ––ipum option can be omitted.

Example:
rack_tool.py upgrade --ipum machine1 10.1.1.1 10.1.1.2 --ipum machine2 10.1.2.1 10.1.2.2 --global-credentials root password itadmin password

-g, ––golden

Force upgrade of the “golden” (known good) partition.

––gw-root-overlay path_to_overlay

Specify a directory containing files to overwrite those in the GW file system.

-H, ––hard

Turn the power off immediately, without shutting down the IPU-Machines.

-h, ––help

Print the synopsis and a list of all available commands. ––help can also be given after a command to show the help for that command.

-t, ––ipu-link-topology topology

Set the IPU-Link topology to either “mesh” or “torus”. The default is “mesh”.

––ipum name bmc_ip gw_ip bmc_username bmc_password gw_username gw_password

Option to manually define which IPU-Machines to perform operations on, instead of using a config file. Several IPU-Machines can be selected by passing the ––ipum option multiple times.

Example:
rack_tool.py upgrade --ipum machine1 10.1.1.1 10.1.1.2 root password itadmin password

––log-dir log_directory

Custom directory to write log files to.

Example:
rack_tool.py upgrade --log-dir /home/ipuuser/logs

––no-color

Don’t use colour when printing status information.

-o, ––overlay overlay_directory

Specify a directory containing files to overwrite those in the IPU-Machine file system.

-p, ––port port

Set the port to use on the logging server.

––sftp

Use sftp instead of scp for file transfers to the BMC.

––show-json

Display the current version information in JSON format.

––show-running-version

Display the versions of the currently running software components on the IPU-Machine.

-v, ––version

Print the version of the tool.

––vipu-path path_to_vipu_binaries

Path to the binaries needed to run the V-IPU test (vipu-server and vipu-admin).

Commands

bist

Run the built-in self test (BIST), which checks that most components on the board are available and functional.

bmc-factory-reset

Perform a factory reset of the BMC on all IPU-Machines.

hostname

Set the hostname on the GW and BMC.

install-key

Install the current user’s public SSH key to all IPU-Machines.

logging-server

Set the logging server on a device (“bmc” or “gw”) on all IPU-Machines.

power-cycle

Power cycle the GW and IPUs on all IPU-Machines.

power-off

Power off the GW and IPUs on all IPU-Machines.

power-on

Power on the GW and IPUs on all IPU-Machines.

run-command

Run a command on a device (“bmc” or “gw”) on all IPU-Machines.

status

Show status of IPU-Machines in a rack.

update-root-overlay

Copy all files in the GW root overlay directory to all IPU-Machines. The overlay directory is set with the ––overlay option or in the config file. See GW root overlay.

upgrade

Start an upgrade of the IPU-Machines. This will only write the new firmware to flash memory. Some systems may require further configuration and will provide an installation script to perform the upgrade and do any necessary configuration. See the documentation for your IPU system for more information.
See Modes.

vipu-test

Run Virtual-IPU tests.

Exit status

0 Successful program execution.
1 Unsuccessful program execution.
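
A script that wraps rack_tool can branch on this exit status. The following is a minimal Python sketch; the helper name run_and_check is hypothetical, and stand-in commands are used so that the sketch runs without rack_tool installed.

```python
import subprocess
import sys


def run_and_check(argv):
    """Run a command and report whether it exited with status 0."""
    result = subprocess.run(argv)
    # 0 means success; rack_tool documents 1 for an unsuccessful run.
    return result.returncode == 0


# Stand-in commands; in practice argv would be for example
# ["rack_tool.py", "status"].
ok = run_and_check([sys.executable, "-c", "raise SystemExit(0)"])
failed = run_and_check([sys.executable, "-c", "raise SystemExit(1)"])
print(ok, failed)
```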

GW root overlay

This is a directory structure that contains files that should be copied to the GW after an upgrade. This makes it possible to provide site-specific files that should be persistent on the GW after upgrade. The contents of this directory are copied over to the GW automatically. You can specify the directory either as an object in the rack config file (see Config file format) or as an argument to the ––overlay command-line option. The directory structure should be the same as the root directory on the GW.

Examples of files that can be useful to have in the GW root overlay are those that relate to the outside world, such as NTP and syslog configuration.
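
Because the overlay mirrors the root directory of the GW, a file intended for a given absolute path on the GW is placed at the same relative path inside the overlay. A minimal Python sketch of that mapping follows; the helper name stage_overlay_file, the file /etc/ntp.conf and its contents are illustrative assumptions, not files that rack_tool requires.

```python
import tempfile
from pathlib import Path


def stage_overlay_file(overlay_root, gw_path, content):
    """Write content to the place in the overlay that mirrors gw_path on the GW."""
    # Strip the leading "/" so the GW path becomes relative to the overlay root.
    target = Path(overlay_root) / Path(gw_path).relative_to("/")
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(content)
    return target


# A temporary directory stands in for the real overlay directory.
with tempfile.TemporaryDirectory() as overlay_root:
    staged = stage_overlay_file(
        overlay_root, "/etc/ntp.conf", "server ntp.example.com\n"
    )
    print(staged.relative_to(overlay_root))
```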

Config file format

rack_config.json is a JSON file that rack_tool uses to connect to all the IPU-Machines in a rack. This file is created when the system is initially set up; see the appropriate IPU-POD build and test guide for details: docs.graphcore.ai/hardware. It consists of one mandatory object “machines” and the optional objects “global_credentials”, “gw_root_overlay” and “log_directory”.

rack_tool will look for a config file in three different places with the following priority:

  1. The file passed with the ––config-file parameter

  2. A file named rack_config.json in the current working directory

  3. A rack_config.json file in the ~/.rack_tool/ directory
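
The lookup order above can be sketched in Python as follows. This is an illustration, not the rack_tool implementation: the function name find_config_file is hypothetical, and falling through to the next location when the file passed on the command line does not exist is an assumption.

```python
import tempfile
from pathlib import Path


def find_config_file(cli_path=None, cwd=None, home=None):
    """Return the first existing config file in the documented priority order."""
    cwd = Path(cwd) if cwd is not None else Path.cwd()
    home = Path(home) if home is not None else Path.home()
    candidates = []
    if cli_path is not None:
        candidates.append(Path(cli_path))  # 1. --config-file
    candidates.append(cwd / "rack_config.json")  # 2. current working directory
    candidates.append(home / ".rack_tool" / "rack_config.json")  # 3. ~/.rack_tool/
    for candidate in candidates:
        # Assumption: a missing candidate is skipped rather than reported.
        if candidate.is_file():
            return candidate
    return None


# Temporary directories stand in for the working and home directories.
with tempfile.TemporaryDirectory() as tmp:
    cwd, home = Path(tmp) / "cwd", Path(tmp) / "home"
    (home / ".rack_tool").mkdir(parents=True)
    cwd.mkdir()
    (home / ".rack_tool" / "rack_config.json").write_text("{}")
    first = find_config_file(cwd=cwd, home=home)
    (cwd / "rack_config.json").write_text("{}")
    second = find_config_file(cwd=cwd, home=home)
    print(first.parent.name, second.parent.name)
```

In the usage above, the file in the working directory takes priority over the one in ~/.rack_tool/ as soon as it exists.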

global_credentials

This is an object that holds the login details of the BMC and GW. The object has the following key/value pairs:

"global_credentials": {
    "bmc_username": "<username>",
    "bmc_passwd": "<password>",
    "gw_username": "<username>",
    "gw_passwd": "<password>"
}

gw_root_overlay

This object is a key/value pair that points to the location of the GW root overlay.

"gw_root_overlay": "/home/ipuuser/.rack_tool/root-overlay",

log_directory

This object is a key/value pair that points to the directory where logs should be stored. If this object is not present in the config file, rack_tool will place logs in the first of the following directories that it has write permission to:
  1. <rack_tool_location>/logs

  2. /var/log/rack_tool

  3. <current_working_directory>/rack_tool_logs
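
The fallback order above can be sketched in Python as follows. The function name choose_log_directory is hypothetical, and the way write permission is tested (on the directory itself, or on its nearest existing parent) is an assumption rather than documented rack_tool behaviour.

```python
import os
import tempfile
from pathlib import Path


def choose_log_directory(config, tool_location, cwd):
    """Return the configured log directory, or the first writable fallback."""
    if "log_directory" in config:
        return Path(config["log_directory"])
    fallbacks = [
        Path(tool_location) / "logs",
        Path("/var/log/rack_tool"),
        Path(cwd) / "rack_tool_logs",
    ]
    for directory in fallbacks:
        # Assumption: if the directory does not exist yet, check whether
        # its nearest existing parent is writable instead.
        probe = directory
        while not probe.exists():
            probe = probe.parent
        if os.access(probe, os.W_OK):
            return directory
    return None


# A temporary directory stands in for the tool location and working directory.
with tempfile.TemporaryDirectory() as tmp:
    configured = choose_log_directory(
        {"log_directory": "/home/ipuuser/rack_tool_logs"}, tmp, tmp
    )
    fallback = choose_log_directory({}, tmp, tmp)
    print(configured.name, fallback.name)
```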

"log_directory": "/home/ipuuser/rack_tool_logs"
machines

This is an array of machine objects that holds information about each IPU-Machine in the rack. Each machine object consists of the following key/value pairs:

"machines": [
    {
        "name": "<machine name>",
        "bmc_ip": "<ip address>",
        "gw_ip": "<ip address>",
        "rnic_ip": "<ip address>"
    }
]
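
As a minimal sketch, a script could load a config file of this shape and check for the mandatory "machines" object before using it. The function name load_rack_config is hypothetical, and the machine name, addresses and credentials in the sample text are placeholders.

```python
import json

# Illustrative config text; names, addresses and credentials are placeholders.
CONFIG_TEXT = """
{
    "global_credentials": {
        "bmc_username": "root",
        "bmc_passwd": "password",
        "gw_username": "itadmin",
        "gw_passwd": "password"
    },
    "machines": [
        {
            "name": "machine1",
            "bmc_ip": "10.1.1.1",
            "gw_ip": "10.1.1.2",
            "rnic_ip": "10.1.1.3"
        }
    ]
}
"""


def load_rack_config(text):
    """Parse rack config JSON and check that the mandatory object is present."""
    config = json.loads(text)
    if not isinstance(config.get("machines"), list):
        raise ValueError('the config file must contain a "machines" array')
    return config


config = load_rack_config(CONFIG_TEXT)
print([machine["name"] for machine in config["machines"]])
```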

Modes

rack_tool modes determine what functions rack_tool can perform on an IPU-Machine, based on information in the rack_config.json file.

BMC-only mode

If the rack_config.json file only contains BMC IP addresses for the IPU-Machine then rack_tool will run in BMC-only mode. The config file must also contain BMC login information used for the BMC IP address. In this mode the only components that can be updated are bmc, icu and system-fpga. In BMC-only mode, rack_tool can only communicate with the BMC. For example, attempting to upgrade GW components in BMC-only mode will fail.

Example:
"global_credentials": {
    "bmc_username": "<username>",
    "bmc_passwd": "<password>"
},
"machines": [
    {
        "name": "<machine name>",
        "bmc_ip": "<ip address>"
    }
]

RNIC-only mode

If the rack_config.json file only contains RNIC IP addresses for the IPU-Machine then rack_tool will run in RNIC-only mode. The config file must also contain GW login information used for the RNIC IP address. In this mode the only components that can be updated are gw and vipu-agents. In RNIC-only mode, rack_tool can only communicate with the GW. For example, attempting to upgrade BMC components in RNIC-only mode will fail.

Example:
"global_credentials": {
    "gw_username": "<username>",
    "gw_passwd": "<password>"
},
"machines": [
    {
        "name": "<machine name>",
        "rnic_ip": "<ip address>"
    }
]

Managed mode

If the rack_config.json file contains both BMC and GW IP addresses for the IPU-Machine, then rack_tool will run in managed mode. The config file must also contain the BMC and GW login credentials. In managed mode, all rack_tool functionality is available because rack_tool can communicate with both the GW and the BMC.

Example:
"global_credentials": {
    "bmc_username": "<username>",
    "bmc_passwd": "<password>",
    "gw_username": "<username>",
    "gw_passwd": "<password>"
},
"machines": [
    {
        "name": "<machine name>",
        "bmc_ip": "<ip address>",
        "gw_ip": "<ip address>",
        "rnic_ip": "<ip address>"
    }
]
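
The three modes can be summarised with a small Python sketch that classifies a machine object by the address keys it contains. This illustrates the rules described above and is not the rack_tool implementation: the function name detect_mode is hypothetical, and treating any other combination of keys as "unknown" is an assumption.

```python
def detect_mode(machine):
    """Classify a machine object by the IP address keys it contains."""
    has_bmc = "bmc_ip" in machine
    has_gw = "gw_ip" in machine
    has_rnic = "rnic_ip" in machine
    if has_bmc and has_gw:
        return "managed"
    if has_bmc and not has_rnic:
        return "bmc-only"
    if has_rnic and not has_bmc and not has_gw:
        return "rnic-only"
    # Assumption: combinations not described in this page are left unclassified.
    return "unknown"


print(detect_mode({"name": "m1", "bmc_ip": "10.1.1.1"}))
print(detect_mode({"name": "m1", "rnic_ip": "10.1.1.3"}))
print(detect_mode({"name": "m1", "bmc_ip": "10.1.1.1", "gw_ip": "10.1.1.2",
                   "rnic_ip": "10.1.1.3"}))
```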

Files

~/.rack_tool/

This directory is the default location for configuration files.

~/.rack_tool/rack_config.json

This is the default configuration file for the rack.

~/.rack_tool/root-overlay/

This is the default directory for GW root overlay files.