6. IPU Builtins

The following IPU-specific builtin functions can be used in C/C++ code. For more information, see the “Tile Worker ISA” document, which is available from Graphcore support on request.

Get SCOUNT_L from CSR

unsigned __builtin_ipu_get_scount_l();

Get the value of the control/status register (CSR) SCOUNT_L, which is the lower 32 bits of the tile cycle counter value.

Get SCOUNT_U from CSR

unsigned __builtin_ipu_get_scount_u();

Get the value of the CSR SCOUNT_U, which is the upper 32 bits of the tile cycle counter value.
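Because the counter is exposed as two 32-bit halves, a 64-bit cycle count has to be assembled from two separate reads, and the lower half may wrap between them. The following is a minimal sketch of one way to handle this, not a definitive implementation. The helper names (read_count_lower, read_count_upper, combine_count, read_cycle_count) and the host-side stand-in counter are hypothetical and not part of the builtin set. The sketch also assumes that the compiler defines the __IPU__ macro when targeting the IPU.

```cpp
#include <cstdint>

#ifdef __IPU__
// On the IPU, read the two halves with the builtins described above.
inline std::uint32_t read_count_lower() { return __builtin_ipu_get_scount_l(); }
inline std::uint32_t read_count_upper() { return __builtin_ipu_get_scount_u(); }
#else
// Hypothetical host-side stand-in so the sketch can be built off-device.
// Every read advances the fake counter by one "cycle".
static std::uint64_t host_fake_count = 0;
inline std::uint32_t read_count_lower() {
  return static_cast<std::uint32_t>(host_fake_count++);
}
inline std::uint32_t read_count_upper() {
  return static_cast<std::uint32_t>(host_fake_count++ >> 32);
}
#endif

// Join the upper and lower 32-bit halves into one 64-bit value.
constexpr std::uint64_t combine_count(std::uint32_t upper, std::uint32_t lower) {
  return (static_cast<std::uint64_t>(upper) << 32) | lower;
}

// Read a consistent 64-bit count. If the upper half changes while the
// lower half is being read, the lower half wrapped, so read again.
inline std::uint64_t read_cycle_count() {
  std::uint32_t upper = read_count_upper();
  for (;;) {
    const std::uint32_t lower = read_count_lower();
    const std::uint32_t check = read_count_upper();
    if (check == upper) {
      return combine_count(upper, lower);
    }
    upper = check;
  }
}
```

Subtracting two values returned by read_cycle_count() gives an elapsed cycle count that includes the cost of the reads themselves.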

Get VERTEX_BASE from CSR

void *__builtin_ipu_get_vertex_base();

Get a pointer to the vertex data structure (the value of the VERTEX_BASE CSR).

Get TILE_ID from CSR

unsigned __builtin_ipu_get_tile_id();

Get the tile ID of the current tile.

Generate random 32-bit integer

unsigned __builtin_ipu_urand32();

Generate a uniformly distributed 32-bit random integer. Targets the urand32 instruction.
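A common use of a 32-bit random value is to pick an index in a smaller range. The sketch below maps a random value onto [0, bound) with a multiply-and-shift step. The names next_random32, scale_to_bound and random_below are hypothetical, the host-side generator is a plain xorshift stand-in that is unrelated to the IPU's generator, and the __IPU__ macro is assumed to be defined when compiling for the IPU.

```cpp
#include <cstdint>

#ifdef __IPU__
// On the IPU, draw the random bits from the builtin.
inline std::uint32_t next_random32() { return __builtin_ipu_urand32(); }
#else
// Hypothetical host-side stand-in: xorshift32, NOT the IPU generator.
static std::uint32_t host_rng_state = 0x12345678u;
inline std::uint32_t next_random32() {
  std::uint32_t x = host_rng_state;
  x ^= x << 13;
  x ^= x >> 17;
  x ^= x << 5;
  host_rng_state = x;
  return x;
}
#endif

// Scale 32 random bits onto [0, bound) by taking the upper 32 bits of
// the 64-bit product. The result is slightly biased when bound does
// not divide 2^32.
constexpr std::uint32_t scale_to_bound(std::uint32_t random_bits,
                                       std::uint32_t bound) {
  return static_cast<std::uint32_t>(
      (static_cast<std::uint64_t>(random_bits) * bound) >> 32);
}

// Draw a random integer in [0, bound).
inline std::uint32_t random_below(std::uint32_t bound) {
  return scale_to_bound(next_random32(), bound);
}
```

For example, random_below(6) returns a value from 0 to 5. The multiply-and-shift form avoids a division, at the cost of the small bias noted in the comment.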

Generate random 64-bit integer

unsigned long long __builtin_ipu_urand64();

Generate a uniformly distributed 64-bit random integer. Targets the urand64 instruction.

Generate random 16-bit float

half __builtin_ipu_urand_f16();

Generate a uniformly distributed 16-bit random float (half).

Generate random 32-bit float

float __builtin_ipu_urand_f32();

Generate a uniformly distributed 32-bit random float.

Classify float

int __builtin_ipu_f32class(float num);

Single-precision floating-point number classifier. Targets the f32class instruction.

The result will be one of the float class identifiers, which can be found in the “TileFloatClass” section in the “Tile Worker ISA”.

Triple-pack three addresses

uint2 __builtin_ipu_tapack(const void * addr1, const void * addr2, const void * addr3);

Convert three absolute addresses to the triple-packed address format. Targets the tapack instruction.

Write to an upper CSR

void __builtin_ipu_uput(unsigned val, unsigned char csr_index);

Write to a control register in the upper CSR address space. Targets the uput instruction. See the section “Control and Status registers” in the “Tile Worker ISA” for detailed documentation on the CSRs.

Example:

void example(unsigned x) {
  __builtin_ipu_uput(x, 2);
}

Writes the value of x to the CSR at index 2 in the upper CSR space.

Write to a CSR

void __builtin_ipu_put(unsigned val, unsigned char csr_index);

Write to a control register. Targets the put instruction. See the section “Control and Status registers” in the “Tile Worker ISA” for detailed documentation on the CSRs.

Example:

void example(unsigned x) {
  __builtin_ipu_put(x, 32);
}

Writes the value of x to the CSR at index 32.

Read from an upper CSR

unsigned __builtin_ipu_uget(unsigned char csr_index);

Read the value of a control/status register in the upper CSR space into a general-purpose register. Targets the uget instruction. See the section “Control and Status registers” in the “Tile Worker ISA” for detailed documentation on the CSRs.

Example:

unsigned example() {
  unsigned res = __builtin_ipu_uget(4);
  return res;
}

Sets res to the value of the CSR at index 4 in the upper CSR space.

Read from a CSR

unsigned __builtin_ipu_get(unsigned char csr_index);

Read the value of a control/status register into a general-purpose register. Targets the get instruction. See the section “Control and Status registers” in the “Tile Worker ISA” for detailed documentation on the CSRs.

Example:

unsigned example() {
  unsigned res = __builtin_ipu_get(1);
  return res;
}

Sets res to the value of the CSR at index 1.
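The read and write builtins can be combined to change only some bits of a register. The following is a sketch under several assumptions: the chosen CSR is both readable and writable, the CSR index has to be a compile-time constant (hence the template parameter), and __IPU__ is defined when compiling for the IPU. The names read_csr, write_csr and update_csr_bits, and the host-side register array, are hypothetical. The actual register indices and bit layouts are described in the “Tile Worker ISA”.

```cpp
#ifdef __IPU__
// On the IPU, forward to the builtins with a constant CSR index.
template <unsigned char Index>
inline unsigned read_csr() { return __builtin_ipu_get(Index); }
template <unsigned char Index>
inline void write_csr(unsigned value) { __builtin_ipu_put(value, Index); }
#else
// Hypothetical host-side stand-in: an array that plays the CSR space.
static unsigned host_fake_csrs[256] = {};
template <unsigned char Index>
inline unsigned read_csr() { return host_fake_csrs[Index]; }
template <unsigned char Index>
inline void write_csr(unsigned value) { host_fake_csrs[Index] = value; }
#endif

// Read-modify-write: replace only the bits selected by mask and leave
// the others unchanged. Returns the previous register value so that
// the caller can restore it later.
template <unsigned char Index>
inline unsigned update_csr_bits(unsigned mask, unsigned bits) {
  const unsigned previous = read_csr<Index>();
  write_csr<Index>((previous & ~mask) | (bits & mask));
  return previous;
}
```

A call such as update_csr_bits<32>(0x0F, 0x05) would set the low four bits of the CSR at index 32 to 0x5. The mask and value here are purely illustrative.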

Check for worker mode

int __builtin_ipu_is_worker_mode();

Check whether the calling code is executing in worker mode.

Example:

int example() {
  int res = __builtin_ipu_is_worker_mode();
  return res;
}

The value of res is 1 in worker mode and 0 otherwise.

Roll-left SIMD permutation

unsigned __builtin_ipu_roll8l(unsigned val1, unsigned val2);

Performs a roll-left SIMD permutation on the 8x8 values of the two inputs and returns the result. Targets the roll8l instruction.

Roll-right SIMD permutation

unsigned __builtin_ipu_roll8r(unsigned val1, unsigned val2);

Performs a roll-right SIMD permutation on the 8x8 values of the two inputs and returns the result. Targets the roll8r instruction.

Check whether floating-point value is finite

int __builtin_ipu_isfinite(float val);

short2 __builtin_ipu_isfinite(half2 val);

int2 __builtin_ipu_isfinite(float2 val);

short4 __builtin_ipu_isfinite(half4 val);

Check whether a floating-point value, scalar or vector, is finite and return the boolean result as an integer type of the same shape and size as the input parameter. This builtin expands to a sequence of instructions; vector floating-point values are handled by vector code. The header ipu_builtins.h must be included for this builtin to be available.

Check whether floating-point value is infinite

int __builtin_ipu_isinf(float val);

short2 __builtin_ipu_isinf(half2 val);

int2 __builtin_ipu_isinf(float2 val);

short4 __builtin_ipu_isinf(half4 val);

Check whether a floating-point value, scalar or vector, is -inf or +inf and return the boolean result as an integer type of the same shape and size as the input parameter. This builtin expands to a sequence of instructions; vector floating-point values are handled by vector code. The header ipu_builtins.h must be included for this builtin to be available.

Check whether floating-point value is NaN

int __builtin_ipu_isnan(float val);

short2 __builtin_ipu_isnan(half2 val);

int2 __builtin_ipu_isnan(float2 val);

short4 __builtin_ipu_isnan(half4 val);

Check whether a floating-point value, scalar or vector, is not a number (NaN) and return the boolean result as an integer type of the same shape and size as the input parameter. This builtin expands to a sequence of instructions; vector floating-point values are handled by vector code. The header ipu_builtins.h must be included for this builtin to be available.
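To make the three classifications concrete, the following host-side sketch shows what the scalar float variants test, expressed on the IEEE 754 single-precision bit pattern. It is a reference for the semantics only and is not how the builtins are implemented. The names float_bits, float_from_bits, ref_isfinite, ref_isinf and ref_isnan are hypothetical, and the use of 1 for "true" is an assumption of this sketch; the text above only states that the result is a boolean held in an integer type.

```cpp
#include <cstdint>
#include <cstring>

// Reinterpret a float as its 32-bit pattern, and back again.
inline std::uint32_t float_bits(float value) {
  std::uint32_t bits;
  std::memcpy(&bits, &value, sizeof bits);
  return bits;
}
inline float float_from_bits(std::uint32_t bits) {
  float value;
  std::memcpy(&value, &bits, sizeof value);
  return value;
}

// Finite: the 8-bit exponent field is not all ones.
inline int ref_isfinite(float value) {
  return (float_bits(value) & 0x7F800000u) != 0x7F800000u;
}

// Infinite: exponent all ones and a zero mantissa, either sign.
inline int ref_isinf(float value) {
  return (float_bits(value) & 0x7FFFFFFFu) == 0x7F800000u;
}

// NaN: exponent all ones and a non-zero mantissa, either sign.
inline int ref_isnan(float value) {
  return (float_bits(value) & 0x7FFFFFFFu) > 0x7F800000u;
}
```

Exactly one of the three functions returns 1 for any input, which is a convenient property to check when comparing against results from the device.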