Graphcore Command Line Tools
Version: 1.0.55

11. gc-links

This tool displays the status and connectivity of each of the IPU-Links that connect the IPUs. It does this by “training” the links with some data, and then checking that the data can be retrieved across all the links.

To use it, run:

gc-links -j

Note: On IPU-POD systems, the IPU-Links are trained when the partition is created.

On a system where link training is available, the output will look similar to the following example:

{
  "ipu to ipu": {
      "from pcie id": "5",
      "to pcie id": "7",
      "channel": {
          "from": "NLC_E_2A",
          "to": "NLC_E_3A",
          "status": "passed",
          "gen": "4",
          "lanes": "8"
      },
      "channel": {
          "from": "NLC_E_2B",
          "to": "NLC_E_3B",
          "status": "passed",
          "gen": "4",
          "lanes": "8"
      }
  },
  "num ipus": "16",
  "overall result": "passed",
  "training fails": "0"
}

The “status” field shows the training result for each link. The “lanes” field shows the number of lanes trained, and the “gen” field shows the link generation that was tested.

When gc-links finds that a link fails to train, the output looks like this:

{
  "ipu to ipu": {
    "from pcie id": "7",
    "to pcie id": "6",
    "channel": {
        "from": "NLC_W_1B",
        "to": "NLC_W_1B",
        "status": "passed",
        "gen": "4",
        "lanes": "8"
    },
    "channel": {
        "from": "NLC_W_1C",
        "to": "NLC_W_1C",
        "status": "failed",
        "gen": "0",
        "lanes": "0"
    }
  }
}

This output shows that gc-links failed to train the link from device 7 to device 6 on channel NLC_W_1C. The failed channel reports “gen” and “lanes” as 0.
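Note that the JSON output repeats the "channel" key inside a single object, so a parser that keeps only the last value for a duplicate key will silently drop channels. The following Python sketch shows one way to keep every entry and list the channels that did not pass. It assumes the output has the shape shown in the examples above; the function name failed_channels and the embedded sample are illustrative and are not part of gc-links.

```python
import json

# Abbreviated copy of the failure example above (illustrative input only).
SAMPLE = """
{
  "ipu to ipu": {
    "from pcie id": "7",
    "to pcie id": "6",
    "channel": {"from": "NLC_W_1B", "to": "NLC_W_1B",
                "status": "passed", "gen": "4", "lanes": "8"},
    "channel": {"from": "NLC_W_1C", "to": "NLC_W_1C",
                "status": "failed", "gen": "0", "lanes": "0"}
  }
}
"""


def failed_channels(text):
    """Return (from_id, to_id, from_channel, to_channel) for each channel
    whose status is not "passed"."""
    # object_pairs_hook=list turns every JSON object into a list of
    # (key, value) pairs, so repeated keys such as "channel" are preserved.
    top_level = json.loads(text, object_pairs_hook=list)
    failures = []
    for key, link in top_level:
        if key != "ipu to ipu":
            continue  # skip summary fields such as "num ipus"
        ids = dict((k, v) for k, v in link if k != "channel")
        for k, v in link:
            if k != "channel":
                continue
            channel = dict(v)
            if channel.get("status") != "passed":
                failures.append((ids.get("from pcie id"),
                                 ids.get("to pcie id"),
                                 channel.get("from"),
                                 channel.get("to")))
    return failures


print(failed_channels(SAMPLE))  # [('7', '6', 'NLC_W_1C', 'NLC_W_1C')]
```

A plain json.loads(text) call on the same input would return only the last "channel" entry for each link, which is why the pairs hook is used here.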

You can run gc-inventory to show more information on devices 6 and 7:

Device:
  id: 6
  type: C2
  Firmware Major Version: 1
  Firmware Minor Version: 0
  Firmware Patch Version: 4
  IPU: 1
  IPU version: ipu0
  PCI Id: 0000:43:00.0
  link speed: 8 GT/s
  link width: 8
  physical slot: PCIe Slot 12
  serial number: 0020.0004.018471
Device:
  id: 7
  type: C2
  Firmware Major Version: 1
  Firmware Minor Version: 0
  Firmware Patch Version: 4
  IPU: 1
  IPU version: ipu0
  PCI Id: 0000:44:00.0
  link speed: 8 GT/s
  link width: 8
  physical slot: PCIe Slot 13
  serial number: 0027.0004.018481

In this example, the problem is with the link between the cards in PCIe slots 12 and 13.
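The lookup from device ID to physical slot can be sketched in Python as follows. This is a minimal example that assumes the gc-inventory text layout shown above (a "Device:" header followed by "key: value" lines); the helper names parse_inventory and slots_by_id are hypothetical, and the sample is an abbreviated copy of that listing.

```python
# Abbreviated copy of the gc-inventory listing above (illustrative input).
SAMPLE = """\
Device:
  id: 6
  type: C2
  PCI Id: 0000:43:00.0
  physical slot: PCIe Slot 12
Device:
  id: 7
  type: C2
  PCI Id: 0000:44:00.0
  physical slot: PCIe Slot 13
"""


def parse_inventory(text):
    """Split the listing into one dict of string fields per device."""
    devices = []
    for raw in text.splitlines():
        line = raw.strip()
        if line == "Device:":
            devices.append({})  # start a new device record
        elif line and devices:
            # Split on the first colon only, so values such as
            # "0000:43:00.0" stay intact.
            key, sep, value = line.partition(":")
            if sep:
                devices[-1][key.strip()] = value.strip()
    return devices


def slots_by_id(text):
    """Map each device id to its physical slot description."""
    return {d["id"]: d.get("physical slot")
            for d in parse_inventory(text) if "id" in d}


print(slots_by_id(SAMPLE))  # {'6': 'PCIe Slot 12', '7': 'PCIe Slot 13'}
```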

When training a chassis full of IPUs, the tool outputs additional information on IPU-Link failures, for example:

{
  "failures": [
    {
        "cable_id": "IPUL-00"
    },
    {
        "cable_id": "IPUL-24"
    },
    {
        "cable_id": "IPUL-25"
    }
  ]
}

The physical location of these failing cables is shown in Section 17.2, IPU-Link channel mapping.
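To collect the failing cable IDs for lookup in the channel mapping, a short Python sketch might look like the following. It assumes the "failures" object is available as its own JSON document, as in the example above; whether gc-links emits it separately or as part of a larger document is not stated here, and the function name failing_cables is hypothetical.

```python
import json

# Copy of the failures example above (illustrative input only).
SAMPLE = """
{
  "failures": [
    {"cable_id": "IPUL-00"},
    {"cable_id": "IPUL-24"},
    {"cable_id": "IPUL-25"}
  ]
}
"""


def failing_cables(text):
    """Return the cable IDs listed under "failures", in output order."""
    report = json.loads(text)
    return [entry["cable_id"]
            for entry in report.get("failures", [])
            if "cable_id" in entry]


print(failing_cables(SAMPLE))  # ['IPUL-00', 'IPUL-24', 'IPUL-25']
```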

11.1. Usage

11.1.1. Allowed options

-j, --json-output

Emit JSON output

-n {arg}, --num-retries {arg}

Number of link training retries (default: 3)

-d {id}, --device-id {id}

Device ID (default: largest group)

-i {arg}, --num-iterations {arg}

Number of times to train each link (default: 1)

-v, --verbose

Verbose output

-p, --phy-summary

Print PHY summary after all training runs

-h, --help

Produce help message

--until-trained

Run the training until it succeeds

-f, --phy-summary-failure

Print PHY summary after training failures

-l {path}, --load-phy-firmware {path}

Load PHY firmware from specified path

--allow-all-presets

Allow all NLC TX presets when training (default: preset 4 only)

--allow-full-swing

Allow full swing on the transmitter (default: quarter swing). Has no effect if --allow-all-presets is set

--check-fom

Check the FOM and declare a training failure if it is too low

--version

Version number
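When gc-links is driven from a script, the options above can be assembled into an argument list. The Python sketch below uses only flags from the list above; the function name gc_links_command and its parameters are hypothetical, and the block only builds and prints the command rather than running it.

```python
import shlex


def gc_links_command(json_output=True, num_retries=None, num_iterations=None,
                     device_id=None, until_trained=False, verbose=False):
    """Build an argument list for gc-links from the documented options."""
    args = ["gc-links"]
    if json_output:
        args.append("--json-output")
    if num_retries is not None:
        args += ["--num-retries", str(num_retries)]
    if num_iterations is not None:
        args += ["--num-iterations", str(num_iterations)]
    if device_id is not None:
        args += ["--device-id", str(device_id)]
    if until_trained:
        args.append("--until-trained")
    if verbose:
        args.append("--verbose")
    return args


cmd = gc_links_command(num_retries=5, device_id=0)
print(shlex.join(cmd))  # gc-links --json-output --num-retries 5 --device-id 0
```

On a machine where gc-links is installed, the returned list could be passed to subprocess.run(cmd, capture_output=True, text=True) to capture the output for parsing.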


Revision c35f59e9.