3. C600 register map
3.1. API version history
The API version is incremented whenever commands are added to or removed from the C600 firmware. The tables in this chapter indicate the API version at which each command or field was added, so that the user can identify which information to expect from a given firmware. The API version itself is available via command 0x03; see Section 3.3.1.3, API version, for details.
API Version | ICU Version |
---|---|
1 | 2.6.3 |
2 | 2.6.7 |
3.2. Command list
A full listing of all SMBus commands is given in Table 3.1. More details about each of the commands can be found in Section 3.3, Command descriptions.
ID | Name | Bytes (r/w) | Protocol | API |
---|---|---|---|---|
0x01 | Vendor ID | 2 | Read Word | 1 |
0x02 | Product ID | 2 | Read Word | 1 |
0x03 | API version | 2 | Read Word | 1 |
0x04 | ICU major version | 2 | Read Word | 1 |
0x05 | ICU minor version | 2 | Read Word | 1 |
0x06 | ICU patch version | 2 | Read Word | 1 |
0x07 | ICU version string | 31 / 1 | Block Process Call | 1 |
0x08 | Board public ID | 24 | Block Read | 1 |
0x09 | Board revision | 22 | Block Read | 1 |
0x0A | Board PCB information | 2 | Read Word | 2 |
0x10 | System uptime | 8 | Block Read | 1 |
0x12 | POST status | 4 | Block Read | 1 |
0x13 | IPU errors | 24 / 1 | Block Read or Write | 1 |
0x15 | dmesg update | 2 | Write Word | 2 |
0x16 | dmesg read | 30 / 2 | Block Process Call | 2 |
0x17 | IPU status | 2 | Read Word | 2 |
0x20 | Clock frequency | 2 | Read Word | 1 |
0x22 | Frequency limiter | 2 / 2 | Block Read or Write | 1 |
0x23 | Board TDP limit | 2 / 2 | Block Read or Write | 2 |
0x31 | Board power consumption | 2 | Read Word | 2 |
0x40 | Temperatures - max board | 2 | Read Word | 1 |
0x41 | Temperatures - group A | 24 | Block Read | 1 |
0x42 | Max temperatures - group A | 24 | Block Read | 1 |
0x43 | Temperature thresholds | 24 | Block Read | 1 |
0x45 | Thermal status | 6 | Block Read | 1 |
0x80 | Driver error state | 4 | Block Read | 1 |
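The API column can be turned into a simple availability check once the API version has been read with command 0x03. The sketch below is illustrative only: `MIN_API` is transcribed from the command list above, and `is_supported` is a hypothetical helper name, not part of any Graphcore library.

```python
# Minimum API version per command ID, transcribed from the command list.
MIN_API = {
    0x01: 1, 0x02: 1, 0x03: 1, 0x04: 1, 0x05: 1, 0x06: 1, 0x07: 1,
    0x08: 1, 0x09: 1, 0x0A: 2, 0x10: 1, 0x12: 1, 0x13: 1, 0x15: 2,
    0x16: 2, 0x17: 2, 0x20: 1, 0x22: 1, 0x23: 2, 0x31: 2, 0x40: 1,
    0x41: 1, 0x42: 1, 0x43: 1, 0x45: 1, 0x80: 1,
}


def is_supported(command: int, api_version: int) -> bool:
    """Return True if `command` is listed and available at `api_version`."""
    required = MIN_API.get(command)
    return required is not None and api_version >= required
```

A host that reads API version 1 would, for example, skip command 0x31 rather than issue it and misinterpret the reply.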
3.3. Command descriptions
3.3.1. System information
Commands 0x00 - 0x0F are reserved for system information. These values should be constant once the ICU has booted.
ID | Name | Bytes (r/w) | Protocol | API |
---|---|---|---|---|
0x01 | Vendor ID | 2 | Read Word | 1 |
0x02 | Product ID | 2 | Read Word | 1 |
0x03 | API version | 2 | Read Word | 1 |
0x04 | ICU major version | 2 | Read Word | 1 |
0x05 | ICU minor version | 2 | Read Word | 1 |
0x06 | ICU patch version | 2 | Read Word | 1 |
0x07 | ICU version string | 31 / 1 | Block Process Call | 1 |
0x08 | Board public ID | 24 | Block Read | 1 |
0x09 | Board revision | 22 | Block Read | 1 |
0x0A | Board PCB information | 2 | Read Word | 2 |
3.3.1.1. Vendor ID (0x01)
Read Word command 0x01 returns a fixed data word containing the Graphcore PCIe vendor ID. It can be used to verify that the I2C device is a Graphcore ICU.
- Response:
Byte | Name | Format | Value | API |
---|---|---|---|---|
1-2 | Vendor ID | uint16_t | 0x1d95 | 1 |
3.3.1.2. Product ID (0x02)
Read Word command 0x02 returns a fixed data word containing the PCIe device ID of the C600 card. It can be used to verify that the I2C device is a Graphcore C600 PCIe card.
- Response:
Byte | Name | Format | Value | API |
---|---|---|---|---|
1-2 | Product ID | uint16_t | 0x0600 | 1 |
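The identity check described for commands 0x01 and 0x02 can be sketched as follows. This assumes the two bytes of each Read Word response arrive low byte first, as SMBus defines for word transfers; `word_from_bytes` and `is_c600` are illustrative names, and how the raw bytes are obtained from the bus is left to the host.

```python
VENDOR_ID_GRAPHCORE = 0x1D95  # documented response to command 0x01
PRODUCT_ID_C600 = 0x0600      # documented response to command 0x02


def word_from_bytes(raw: bytes) -> int:
    """Decode a 2-byte SMBus word (assumed low byte first)."""
    if len(raw) != 2:
        raise ValueError("expected exactly 2 bytes")
    return int.from_bytes(raw, "little")


def is_c600(vendor_raw: bytes, product_raw: bytes) -> bool:
    """Return True if both ID words match the documented values."""
    return (word_from_bytes(vendor_raw) == VENDOR_ID_GRAPHCORE
            and word_from_bytes(product_raw) == PRODUCT_ID_C600)
```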
3.3.1.3. API version (0x03)
Read Word command 0x03 returns a data word containing the currently supported version of the SMBus command API (see Section 3.1, API version history).
- Response:
Byte | Name | Format | Unit | API |
---|---|---|---|---|
1-2 | API version | uint16_t | | 1 |
3.3.1.4. ICU major version (0x04)
Read Word command 0x04 returns a data word containing the running firmware’s major version number.
- Response:
Byte | Name | Format | Unit | API |
---|---|---|---|---|
1-2 | ICU major version | uint16_t | | 1 |
3.3.1.5. ICU minor version (0x05)
Read Word command 0x05 returns a data word containing the running firmware’s minor version number.
- Response:
Byte | Name | Format | Unit | API |
---|---|---|---|---|
1-2 | ICU minor version | uint16_t | | 1 |
3.3.1.6. ICU patch version (0x06)
Read word command 0x06 returns a data word containing the running firmware’s patch version number.
- Response:
Byte | Name | Format | Unit | API |
---|---|---|---|---|
1-2 | ICU patch version | uint16_t | | 1 |
3.3.1.7. ICU version string (0x07)
Block Process Call command 0x07 returns up to 31 bytes containing the running firmware’s null-terminated full version string, starting at the index supplied. If the returned string is not null-terminated, more characters can be read by repeating the command with a larger Index argument in the block write command; the results are then concatenated to build up the full string.
- Block write command:
Byte | Name | Format | Unit | API |
---|---|---|---|---|
1 | Index | uint8_t | | 1 |
- Block read response:
Byte | Name | Format | Unit | API |
---|---|---|---|---|
1-31 | Version string | ASCII | | 1 |
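The repeat-and-concatenate procedure can be sketched as a loop. The `process_call` argument is a hypothetical transport callable that performs one SMBus Block Process Call and returns the response bytes; it is not a real library function. The sketch also assumes that Index is a character offset, so it is advanced by the number of bytes received.

```python
from typing import Callable

CMD_VERSION_STRING = 0x07


def read_version_string(process_call: Callable[[int, bytes], bytes]) -> str:
    """Read the full version string in chunks until a null byte is seen."""
    text = bytearray()
    index = 0
    while index <= 0xFF:  # Index is a uint8_t
        chunk = process_call(CMD_VERSION_STRING, bytes([index]))
        end = chunk.find(b"\x00")
        if end != -1:
            text += chunk[:end]  # terminator found: string is complete
            break
        if not chunk:
            break  # defensive: nothing returned, stop instead of looping
        text += chunk
        index += len(chunk)  # assumption: Index counts characters
    return text.decode("ascii")
```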
3.3.1.8. Board public ID (0x08)
Block Read command 0x08 returns a 24-byte null-terminated string containing the C600 card’s human-readable board description.
- Response:
Byte | Name | Format | Unit | API |
---|---|---|---|---|
1-24 | Board public ID | ASCII | | 1 |
3.3.1.9. Board revision (0x09)
Block Read command 0x09 returns a 22-byte null-terminated string containing the C600 card’s board identification string, also known as the card serial number.
- Response:
Byte | Name | Format | Unit | API |
---|---|---|---|---|
1-22 | Board revision | ASCII | | 1 |
3.3.1.10. Board PCB information (0x0A)
Read Word command 0x0A returns a data word containing the PCB and BOM identification information for the C600 card.
- Response:
Byte | Name | Format | Unit | API |
---|---|---|---|---|
1 | PCB identifier | uint8_t | | 2 |
2 | BOM identifier | uint8_t | | 2 |
3.3.2. Active system state
Commands 0x10 - 0x1F are reserved for the active system state. These values may change as the system is running.
ID | Name | Bytes (r/w) | Protocol | API |
---|---|---|---|---|
0x10 | System uptime | 8 | Block Read | 1 |
0x12 | POST status | 4 | Block Read | 1 |
0x13 | IPU errors | 24 / 1 | Block Read or Write | 1 |
0x15 | dmesg update | 2 | Write Word | 2 |
0x16 | dmesg read | 30 / 2 | Block Process Call | 2 |
0x17 | IPU status | 2 | Read Word | 2 |
3.3.2.1. System uptime (0x10)
Block Read command 0x10 returns a 64-bit integer containing the ICU’s uptime in milliseconds.
- Response:
Byte | Name | Format | Unit | API |
---|---|---|---|---|
1-8 | System uptime | int64_t | millisec | 1 |
3.3.2.2. POST status (0x12)
Block Read command 0x12 returns the ICU’s Power On Self Test (POST) result. The layout of this result is outlined below; any result other than 0x0 indicates that an issue was encountered during the boot sequence.
- Response:
Byte | Name | Format | Unit | API |
---|---|---|---|---|
1-4 | POST status | int32_t | bitfield | 1 |
- POST status bitfield:
Bit | Description |
---|---|
31 | Failure bit. 1 if failed else 0 |
30 | Warning bit. 1 if multiple warnings else 0 |
29:24 | Reserved |
23:16 | First encountered Zephyr error code from failed/warning code unit (truncated to 8-bit) |
15:8 | First encountered failure/warning number from failed/warning code unit (unit specific) |
7:0 | gc_init_id number associated with the failed/warning code unit |
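As an illustration of the bitfield layout, the sketch below splits an already assembled 32-bit POST status value into its fields. The function name and dictionary keys are illustrative. The byte order of the 4-byte block is not stated in this section, so the function takes the integer value rather than raw bytes.

```python
def decode_post_status(value: int) -> dict:
    """Split a POST status value into the documented bitfields."""
    value &= 0xFFFFFFFF  # treat a negative int32_t as its raw bit pattern
    return {
        "ok": value == 0,                     # any non-zero result is an issue
        "failed": bool((value >> 31) & 1),    # bit 31
        "warning": bool((value >> 30) & 1),   # bit 30
        "zephyr_error": (value >> 16) & 0xFF, # bits 23:16
        "unit_error": (value >> 8) & 0xFF,    # bits 15:8
        "gc_init_id": value & 0xFF,           # bits 7:0
    }
```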
3.3.2.3. IPU errors (0x13)
3.3.2.3.1. Update errors
Block Write command 0x13 triggers the ICU to check for logged errors, according to the provided update request type. Up to 1000 error records can be maintained within the ICU.
- Block write command:
Byte | Name | Format | Unit | API |
---|---|---|---|---|
1 | Update request | uint8_t | enum | 1 |
- Update request enumeration:
Value | Name | Description | API |
---|---|---|---|
0 | Next | Readies the next error log to be read and increments the read index | 1 |
1 | Most recent | Readies the most recent log to be read. Does not update | 1 |
2 | Mark all read | Updates the read index to point to the most recent error log | 1 |
3.3.2.3.2. Read errors
Block Read command 0x13 retrieves 24 bytes of data describing the requested error log. If no more errors are available, this command returns all zeros except for the State field. The ICU needs time to perform the action of a preceding Block Write command; if the read is performed too soon, State may return Updating, and the Block Read command should be repeated after a short delay. The error information consists of three error registers and a timestamp indicating when the error was detected.
- Block Read command:
Byte | Name | Format | Unit | API |
---|---|---|---|---|
1-4 | Timestamp | uint32_t | seconds | 1 |
5-8 | CMGMTEVVR | uint32_t | bitfield | 1 |
9-12 | CICERRVR | uint32_t | bitfield | 1 |
13-16 | CIUERRVR | uint32_t | bitfield | 1 |
17-18 | Index | uint16_t | | 1 |
19-20 | Remaining | uint16_t | | 1 |
21-22 | Bootcount | uint16_t | | 1 |
23 | Source | uint8_t | enum | 1 |
24 | State | uint8_t | enum | 1 |
- Source enumeration:
Value | Name | Description | API |
---|---|---|---|
0 | No records | No (more) errors to report | 1 |
1 | Newman reset | Logged during a Newman reset operation | 1 |
2 | Shutdown | Logged during an ICU shutdown operation | 1 |
3 | Monitor | Logged during a regular monitoring operation | 1 |
4 | Shellcheck | Logged via a development interface | 1 |
- State enumeration:
Value | Name | Description | API |
---|---|---|---|
1 | Updating | Previous request is still being handled | 1 |
2 | Updated | Error log is ready and available | 1 |
3 | No records | No logs have been recorded | 1 |
4 | Error | An error occurred while reading the error logs | 1 |
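The 24-byte error record can be unpacked with a fixed layout, as in the sketch below. It assumes multi-byte fields are little-endian, which this section does not state. `ErrorRecord`, `parse_error_record` and `needs_retry` are illustrative names.

```python
import struct
from typing import NamedTuple

# Assumption: little-endian fields. 4 x uint32, 3 x uint16, 2 x uint8 = 24 bytes.
_LAYOUT = struct.Struct("<IIIIHHHBB")

STATE_UPDATING = 1  # from the State enumeration
STATE_UPDATED = 2


class ErrorRecord(NamedTuple):
    timestamp: int   # seconds
    cmgmtevvr: int
    cicerrvr: int
    ciuerrvr: int
    index: int
    remaining: int
    bootcount: int
    source: int      # see Source enumeration
    state: int       # see State enumeration


def parse_error_record(raw: bytes) -> ErrorRecord:
    """Unpack one 24-byte Block Read response from command 0x13."""
    if len(raw) != _LAYOUT.size:
        raise ValueError(f"expected {_LAYOUT.size} bytes, got {len(raw)}")
    return ErrorRecord(*_LAYOUT.unpack(raw))


def needs_retry(record: ErrorRecord) -> bool:
    """True if the ICU was still handling the preceding Block Write."""
    return record.state == STATE_UPDATING
```

A caller would issue the Block Write, wait briefly, read, and repeat the read while `needs_retry` returns True.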
3.3.2.4. dmesg update (0x15)
Write Word command 0x15 triggers the ICU to cache its dmesg log. Up to 100 log messages can be cached. The most recent messages will be cached if there are more than 100 logs. The update request input can be any numeric value.
- Write Word command:
Byte | Name | Format | Unit | API |
---|---|---|---|---|
1-2 | Update request | uint16_t | | 2 |
3.3.2.5. dmesg read (0x16)
Block Process Call command 0x16 returns up to 29 ASCII bytes containing the ICU’s dmesg, starting at the log and message indexes supplied, along with the update state. If the returned string is not null-terminated, more characters can be read by repeating the command with a larger Message index argument in the Block Write command; the results are concatenated to build up the full string. An array of strings can be built up in a similar fashion by increasing the Log index argument. If the first character of the returned string is a null terminator, the end of the log cache has been reached.
- Block Write command:
Byte | Name | Format | Unit | API |
---|---|---|---|---|
1 | Message index | uint8_t | | 2 |
2 | Log index | uint8_t | | 2 |
- Block Read response:
Byte | Name | Format | Unit | API |
---|---|---|---|---|
1 | State | uint8_t | enum | 2 |
2-30 | dmesg string | ASCII | | 2 |
- State enumeration:
Value | Name | Description | API |
---|---|---|---|
0 | Updating | Previous request is still being handled | 2 |
1 | Updated | dmesg cache is ready and available | 2 |
2 | Index error | Log index out of range | 2 |
3 | Error | An error occurred while reading the dmesg | 2 |
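A sketch of reading the cached dmesg, one message at a time, is shown below. As before, `call` is a hypothetical transport callable that performs one Block Process Call and returns the response bytes. The sketch assumes Message index is a character offset advanced by the number of bytes received, and it raises on any state other than Updated once the retries for Updating are used up. The retry count and delay are arbitrary.

```python
import time
from typing import Callable, List

CMD_DMESG_READ = 0x16
UPDATING, UPDATED = 0, 1  # from the State enumeration


def read_dmesg_line(call: Callable[[int, bytes], bytes], log_index: int,
                    retries: int = 10, delay_s: float = 0.05) -> str:
    """Read one cached log message by walking Message index."""
    text = bytearray()
    msg_index = 0
    while msg_index <= 0xFF:  # Message index is a uint8_t
        reply = call(CMD_DMESG_READ, bytes([msg_index, log_index]))
        state, chunk = reply[0], reply[1:]
        if state == UPDATING and retries > 0:
            retries -= 1
            time.sleep(delay_s)  # give the ICU time, then repeat the call
            continue
        if state != UPDATED:
            raise RuntimeError(f"dmesg read failed with state {state}")
        end = chunk.find(b"\x00")
        if end != -1:
            text += chunk[:end]
            break
        if not chunk:
            break
        text += chunk
        msg_index += len(chunk)  # assumption: index counts characters
    return text.decode("ascii")


def read_dmesg(call: Callable[[int, bytes], bytes],
               max_logs: int = 100) -> List[str]:
    """Collect messages until one starts with a null terminator."""
    lines = []
    for log_index in range(max_logs):
        line = read_dmesg_line(call, log_index)
        if not line:  # empty string: end of the log cache
            break
        lines.append(line)
    return lines
```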
3.3.2.6. IPU status (0x17)
Read Word command 0x17 returns a data word containing a bitfield representation of the IPU’s current status.
- Response:
Byte | Name | Format | Unit | API |
---|---|---|---|---|
1-2 | IPU status | | bitfield | 2 |
- IPU status bitfield:
Bit | Description |
---|---|
15:3 | Reserved |
2 | PCIe Bus Master Enable detected |
1 | PCIe Power Brake activated |
0 | IPU in use |
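The status word maps naturally onto a flag type. The sketch below uses illustrative names (`IpuStatus`, `decode_ipu_status`) and masks off the reserved bits before decoding.

```python
from enum import IntFlag


class IpuStatus(IntFlag):
    IN_USE = 1 << 0             # bit 0: IPU in use
    POWER_BRAKE = 1 << 1        # bit 1: PCIe Power Brake activated
    BUS_MASTER_ENABLE = 1 << 2  # bit 2: PCIe Bus Master Enable detected


def decode_ipu_status(word: int) -> IpuStatus:
    """Decode the status word, ignoring reserved bits 15:3."""
    return IpuStatus(word & 0x0007)
```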
3.3.3. Clock control
Commands 0x20 - 0x2F are reserved for clock control information and operations.
ID | Name | Bytes (r/w) | Protocol | API |
---|---|---|---|---|
0x20 | Clock frequency | 2 | Read Word | 1 |
0x22 | Frequency limiter | 2 / 2 | Block Read or Write | 1 |
0x23 | Board TDP limit | 2 / 2 | Block Read or Write | 2 |
3.3.3.1. Clock frequency (0x20)
Read word command 0x20 returns the status of the IPU’s current upper clock speed.
- Response:
Byte | Name | Format | Unit | API |
---|---|---|---|---|
1-2 | Fast PLL IPU | uint16_t | MHz | 1 |
3.3.3.2. Frequency limiter (0x22)
Block Read or Block Write commands 0x22 are used to read or set a limit to the maximum frequency that the IPU can run at, with an upper bound of the system default maximum frequency. If set to 0xFFFF then the system default will be used.
- Block read:
Byte | Name | Format | Unit | API |
---|---|---|---|---|
1-2 | Frequency limit | uint16_t | MHz | 1 |
- Block write:
Byte | Name | Format | Unit | API |
---|---|---|---|---|
1-2 | Frequency limit | uint16_t | MHz | 1 |
3.3.3.3. Board TDP limit (0x23)
Block Read or Block Write command 0x23 reads or sets the limit for the maximum thermal design power (TDP) that the IPU on the board can consume. This value is rounded down to the nearest multiple of 12. For example, a setting of 203 watts is rounded down to 192 watts.
- Block read:
Byte | Name | Format | Unit | API |
---|---|---|---|---|
1-2 | TDP limit | uint16_t | Watts | 2 |
- Block write:
Byte | Name | Format | Unit | API |
---|---|---|---|---|
1-2 | TDP limit | uint16_t | Watts | 2 |
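The rounding rule can be expressed in one line. The helper below is an illustrative sketch of the arithmetic only; it does not model any minimum or maximum that the firmware may additionally enforce.

```python
def effective_tdp_limit(requested_watts: int) -> int:
    """Round a requested TDP limit down to the nearest multiple of 12 W."""
    if not 0 <= requested_watts <= 0xFFFF:
        raise ValueError("TDP limit must fit in a uint16_t")
    return (requested_watts // 12) * 12
```

With the example from the text, `effective_tdp_limit(203)` gives 192.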
3.3.4. Telemetry - power usage
Commands 0x30 - 0x3F are reserved for IPU and C600 power usage information.
ID | Name | Bytes (r/w) | Protocol | API |
---|---|---|---|---|
0x31 | Board power consumption | 2 | Read Word | 2 |
3.3.4.1. Board power consumption (0x31)
Read Word command 0x31 returns the average board power consumption over the last second. This is a digital average of readings from the on-board sensors, which provide an analogue average of the power usage over 63 ms periods.
- Response:
Byte | Name | Format | Unit | API |
---|---|---|---|---|
1-2 | Board power consumption | Linear11 | Watts | 2 |
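Linear11 is not defined in this section. Assuming it is the PMBus LINEAR11 encoding (a 5-bit two's-complement exponent in bits 15:11 and an 11-bit two's-complement mantissa in bits 10:0), a word can be converted to a real value as in this sketch. `linear11_to_float` is an illustrative name.

```python
def linear11_to_float(word: int) -> float:
    """Convert a 16-bit Linear11 word to mantissa * 2**exponent."""
    exponent = (word >> 11) & 0x1F
    mantissa = word & 0x7FF
    if exponent > 0x0F:   # sign-extend the 5-bit exponent
        exponent -= 0x20
    if mantissa > 0x3FF:  # sign-extend the 11-bit mantissa
        mantissa -= 0x800
    return mantissa * 2.0 ** exponent
```

For example, the word 0xF991 has exponent -1 and mantissa 401, which decodes to 200.5.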
3.3.5. Telemetry - temperatures
Commands 0x40 - 0x4F are reserved for IPU and C600 temperature information.
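ID | Name | Bytes (r/w) | Protocol | API |
---|---|---|---|---|
0x40 | Temperatures - max board | 2 | Read Word | 1 |
0x41 | Temperatures - group A | 24 | Block Read | 1 |
0x42 | Max temperatures - group A | 24 | Block Read | 1 |
0x43 | Temperature thresholds | 24 | Block Read | 1 |
0x45 | Thermal status | 6 | Block Read | 1 |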
3.3.5.1. Temperatures - max board (0x40)
Read Word command 0x40 returns the current maximum temperature reported by any of the temperature sensors on the C600, in whole degrees Celsius. This value should be used to control system cooling.
- Response:
Byte | Name | Format | Unit | API |
---|---|---|---|---|
1-2 | Max board temp (current) | int16_t | Celsius | 1 |
3.3.5.2. Temperatures - group A (0x41)
Block Read command 0x41 reports an aggregated list of readings from all available temperature sensors on the C600 card. All values are reported in a Linear11 representation of degrees Celsius, which allows for fractional degrees. IPU PVT values refer to sensors within the IPU chip itself, while other values come from sensors on the C600 card at the given location.
- Response:
Byte | Name | Format | Unit | API |
---|---|---|---|---|
1-2 | IPU PVT east | Linear11 | Celsius | 1 |
3-4 | IPU PVT west0 | Linear11 | Celsius | 1 |
5-6 | ADC inlet | Linear11 | Celsius | 1 |
7-8 | ADC exhaust | Linear11 | Celsius | 1 |
9-10 | ADC phase0 bottomside | Linear11 | Celsius | 1 |
11-12 | ADC phase1 bottomside | Linear11 | Celsius | 1 |
13-14 | ADC IPU chip bottomside | Linear11 | Celsius | 1 |
15-16 | ADC mid | Linear11 | Celsius | 1 |
17-18 | I2C inlet | Linear11 | Celsius | 1 |
19-20 | I2C IPU chip | Linear11 | Celsius | 1 |
21-22 | I2C exhaust | Linear11 | Celsius | 1 |
23-24 | I2C mid | Linear11 | Celsius | 1 |
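The 24-byte group A response can be turned into named readings as sketched below. Two assumptions apply and are not confirmed by this section: each 2-byte field is little-endian, and Linear11 is the PMBus LINEAR11 encoding. `SENSORS`, `_linear11` and `parse_group_a` are illustrative names; the sensor names are copied from the table above.

```python
import struct

SENSORS = (
    "IPU PVT east", "IPU PVT west0", "ADC inlet", "ADC exhaust",
    "ADC phase0 bottomside", "ADC phase1 bottomside",
    "ADC IPU chip bottomside", "ADC mid", "I2C inlet",
    "I2C IPU chip", "I2C exhaust", "I2C mid",
)


def _linear11(word: int) -> float:
    """Assumed PMBus LINEAR11: 5-bit exponent, 11-bit mantissa, both signed."""
    exponent = (word >> 11) & 0x1F
    mantissa = word & 0x7FF
    if exponent > 0x0F:
        exponent -= 0x20
    if mantissa > 0x3FF:
        mantissa -= 0x800
    return mantissa * 2.0 ** exponent


def parse_group_a(raw: bytes) -> dict:
    """Map each sensor name to its temperature in degrees Celsius."""
    if len(raw) != 24:
        raise ValueError("expected 24 bytes")
    words = struct.unpack("<12H", raw)  # assumption: little-endian words
    return {name: _linear11(w) for name, w in zip(SENSORS, words)}
```

Because command 0x42 uses the same field layout, the same function can be applied to its response.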
3.3.5.3. Max temperatures - group A (0x42)
Block Read command 0x42 reports an aggregated list of the maximum temperatures reported by all available temperature sensors on the C600 card since the last card power cycle.
- Response:
Byte | Name | Format | Unit | API |
---|---|---|---|---|
1-2 | IPU PVT east | Linear11 | Celsius | 1 |
3-4 | IPU PVT west0 | Linear11 | Celsius | 1 |
5-6 | ADC inlet | Linear11 | Celsius | 1 |
7-8 | ADC exhaust | Linear11 | Celsius | 1 |
9-10 | ADC phase0 bottomside | Linear11 | Celsius | 1 |
11-12 | ADC phase1 bottomside | Linear11 | Celsius | 1 |
13-14 | ADC IPU chip bottomside | Linear11 | Celsius | 1 |
15-16 | ADC mid | Linear11 | Celsius | 1 |
17-18 | I2C inlet | Linear11 | Celsius | 1 |
19-20 | I2C IPU chip | Linear11 | Celsius | 1 |
21-22 | I2C exhaust | Linear11 | Celsius | 1 |
23-24 | I2C mid | Linear11 | Celsius | 1 |
3.3.5.4. Temperature thresholds (0x43)
Block Read command 0x43 reports an aggregated list of the temperature thresholds used by the different temperature sensors present on the C600 card.
- Response:
Byte | Name | Format | Unit | API |
---|---|---|---|---|
1-2 | PVT emergency threshold | Linear11 | Celsius | 1 |
3-4 | PVT warning threshold | Linear11 | Celsius | 1 |
5-6 | Thermal control maximum | Linear11 | Celsius | 1 |
7-8 | Thermal control minimum | Linear11 | Celsius | 1 |
9-10 | I2C inlet emergency | Linear11 | Celsius | 1 |
11-12 | I2C inlet warning | Linear11 | Celsius | 1 |
13-14 | I2C IPU chip emergency | Linear11 | Celsius | 1 |
15-16 | I2C IPU chip warning | Linear11 | Celsius | 1 |
17-18 | I2C exhaust emergency | Linear11 | Celsius | 1 |
19-20 | I2C exhaust warning | Linear11 | Celsius | 1 |
21-22 | I2C mid emergency | Linear11 | Celsius | 1 |
23-24 | I2C mid warning | Linear11 | Celsius | 1 |
3.3.5.5. Thermal status (0x45)
Block Read command 0x45 reports information about the thermal history of the card, including active thermal events and incrementing counters for these events since the last card power cycle.
- Response:
Byte | Name | Format | Unit | API |
---|---|---|---|---|
1-2 | Thermal status register | uint16_t | bitfield | 1 |
3-4 | Temperature warning count | uint16_t | x1 | 1 |
5-6 | Temperature excess count | uint16_t | x1 | 1 |
- Thermal status register:
Bit | Name | API |
---|---|---|
0 | Thermal shutdown active | 1 |
1 | Temperature warning active | 1 |
2 | Temperature excess active | 1 |
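The 6-byte thermal status response can be decoded as sketched below. It assumes the three 16-bit fields are little-endian, which this section does not state; the function name and dictionary keys are illustrative.

```python
import struct


def parse_thermal_status(raw: bytes) -> dict:
    """Decode the status register and the two event counters."""
    if len(raw) != 6:
        raise ValueError("expected 6 bytes")
    # Assumption: little-endian uint16_t fields.
    status, warning_count, excess_count = struct.unpack("<HHH", raw)
    return {
        "shutdown_active": bool(status & 0x1),  # bit 0
        "warning_active": bool(status & 0x2),   # bit 1
        "excess_active": bool(status & 0x4),    # bit 2
        "warning_count": warning_count,
        "excess_count": excess_count,
    }
```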
3.3.6. Host messaging
Commands 0x80 - 0x8F are reserved for the PCIe host system to report information.
ID | Name | Bytes (r/w) | Protocol | API |
---|---|---|---|---|
0x80 | Driver error state | 4 | Block Read | 1 |
3.3.6.1. Driver error state (0x80)
Block Read command 0x80 reports the error status for the card as determined by the host driver. This status is non-volatile, and once set requires the card to be RMA’d.
- Response:
Byte | Name | Format | Unit | API |
---|---|---|---|---|
1-4 | Driver error state | uint32_t | | 1 |