6. IPU Builtins
The following are IPU-specific builtin functions that can be used in C/C++ code. See the “Tile Worker ISA” document, available from Graphcore support on request, for more information.
Get SCOUNT_L from CSR
unsigned __builtin_ipu_get_scount_l();
Get the value of the control/status register (CSR) SCOUNT_L, which is the lower 32 bits of the tile cycle counter value.
Get SCOUNT_U from CSR
unsigned __builtin_ipu_get_scount_u();
Get the value of the CSR SCOUNT_U, which is the upper 32 bits of the tile cycle counter value.
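The two halves can be combined into a single 64-bit cycle count. The following is a minimal sketch rather than a definitive implementation: the helper names readScountL, readScountU, combineCount and readCycleCount are hypothetical, the __IPU__ guard is assumed to be defined only when compiling for the IPU, and the host-side stand-in exists purely so that the sketch can be compiled and checked off-device. Because the two registers are read by separate calls, the sketch re-reads the upper half to guard against the lower half wrapping between the reads.

```cpp
#include <cstdint>

#ifdef __IPU__
// On the IPU, forward to the builtins described above.
static inline unsigned readScountL() { return __builtin_ipu_get_scount_l(); }
static inline unsigned readScountU() { return __builtin_ipu_get_scount_u(); }
#else
// Host-side stand-in (assumption, for off-device checks only): a fake
// 64-bit counter that advances by one each time the lower half is read.
static std::uint64_t fakeCounter = 0xFFFFFFFEull;
static inline unsigned readScountL() {
  ++fakeCounter;
  return static_cast<unsigned>(fakeCounter & 0xFFFFFFFFull);
}
static inline unsigned readScountU() {
  return static_cast<unsigned>(fakeCounter >> 32);
}
#endif

// Join the upper and lower 32-bit halves into one 64-bit value.
static inline std::uint64_t combineCount(unsigned upper, unsigned lower) {
  return (static_cast<std::uint64_t>(upper) << 32) | lower;
}

// Read a consistent 64-bit cycle count. If the upper half changed while
// the lower half was being read, the lower half wrapped, so read again.
static inline std::uint64_t readCycleCount() {
  unsigned upper = readScountU();
  for (;;) {
    unsigned lower = readScountL();
    unsigned check = readScountU();
    if (check == upper) {
      return combineCount(upper, lower);
    }
    upper = check;
  }
}
```

Subtracting two values returned by readCycleCount gives an elapsed cycle count that stays meaningful even when the lower register wraps during the measured region.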
Get VERTEX_BASE from CSR
void *__builtin_ipu_get_vertex_base();
Get a pointer to the vertex data structure.
Get TILE_ID from CSR
void *__builtin_ipu_get_tile_id();
Get the tile ID of the current tile.
Generate random 32-bit integer
unsigned __builtin_ipu_urand32();
Generate a uniformly distributed 32-bit random integer.
Targets the urand32 instruction.
Generate random 64-bit integer
unsigned long long __builtin_ipu_urand64();
Generate a uniformly distributed 64-bit random integer.
Targets the urand64 instruction.
Generate random 16-bit float
half __builtin_ipu_urand_f16();
Generate a uniformly distributed 16-bit random float (half).
Generate random 32-bit float
float __builtin_ipu_urand_f32();
Generate a uniformly distributed 32-bit random float.
Classify float
int __builtin_ipu_f32class(float num);
Single-precision floating-point number classifier.
Targets the f32class instruction.
The result will be one of the float class identifiers, which can be found in the “TileFloatClass” section in the “Tile Worker ISA”.
Triple-pack three addresses
uint2 __builtin_ipu_tapack(const void * addr1, const void * addr2, const void * addr3);
Convert three absolute addresses to the triple-packed address format.
Targets the tapack instruction.
Write to an upper CSR
void __builtin_ipu_uput(unsigned val, unsigned char csr_index);
Write to a control register in the upper CSR address space.
Targets the uput instruction.
See the section “Control and Status registers” in the “Tile Worker ISA” for detailed documentation on the CSRs.
Example:
void example(unsigned x) {
  __builtin_ipu_uput(x, 2);
}
Writes the value of x to the CSR at index 2 in the upper CSR space.
Write to a CSR
void __builtin_ipu_put(unsigned val, unsigned char csr_index);
Write to a control register.
Targets the put instruction.
See the section “Control and Status registers” in the “Tile Worker ISA” for detailed documentation on the CSRs.
Example:
void example(unsigned x) {
  __builtin_ipu_put(x, 32);
}
Writes the value of x to the CSR at index 32.
Read from an upper CSR
unsigned __builtin_ipu_uget(unsigned char csr_index);
Read the value of a control/status register in the upper CSR space into a general purpose register.
Targets the uget instruction.
See the section “Control and Status registers” in the “Tile Worker ISA” for detailed documentation on the CSRs.
Example:
unsigned example() {
  // The builtin takes only the CSR index, as in the signature above.
  unsigned res = __builtin_ipu_uget(4);
  return res;
}
Sets res to the value of the CSR at index 4 in the upper CSR space.
Read from a CSR
unsigned __builtin_ipu_get(unsigned char csr_index);
Read the value of a control/status register into a general purpose register.
Targets the get instruction.
See the section “Control and Status registers” in the “Tile Worker ISA” for detailed documentation on the CSRs.
Example:
unsigned example() {
  // The builtin takes only the CSR index, as in the signature above.
  unsigned res = __builtin_ipu_get(1);
  return res;
}
Sets res to the value of the CSR at index 1.
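The read and write builtins can be paired to update individual bits of a CSR. The following is a sketch under stated assumptions, not a definitive implementation: the wrapper names readCsr, writeCsr and setCsrBits are hypothetical, the __IPU__ guard is assumed to be defined only when compiling for the IPU, and the host-side register array is a stand-in so that the sketch can be checked off-device. The index is passed as a template parameter so that it is always a compile-time constant, as in the examples above; whether the builtins require this, and whether a given CSR is both readable and writable, should be checked against the “Tile Worker ISA”.

```cpp
#ifdef __IPU__
// On the IPU, forward to the builtins described above.
template <unsigned char Index>
static inline unsigned readCsr() {
  return __builtin_ipu_get(Index);
}
template <unsigned char Index>
static inline void writeCsr(unsigned value) {
  __builtin_ipu_put(value, Index);
}
#else
// Host-side stand-in (assumption, for off-device checks only): a plain
// array with one entry per possible unsigned char index.
static unsigned fakeCsrs[256] = {};
template <unsigned char Index>
static inline unsigned readCsr() {
  return fakeCsrs[Index];
}
template <unsigned char Index>
static inline void writeCsr(unsigned value) {
  fakeCsrs[Index] = value;
}
#endif

// Read-modify-write: set the bits in mask and return the previous value.
template <unsigned char Index>
static inline unsigned setCsrBits(unsigned mask) {
  unsigned previous = readCsr<Index>();
  writeCsr<Index>(previous | mask);
  return previous;
}
```

Returning the previous value lets the caller restore the register afterwards with writeCsr.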
Check for worker mode
int __builtin_ipu_is_worker_mode();
Check whether the calling code is executing in worker mode.
Example:
int example() {
  int res = __builtin_ipu_is_worker_mode();
  return res;
}
The value of res will be 1 in worker mode and 0 otherwise.
Roll-left SIMD permutation
unsigned __builtin_ipu_roll8l(unsigned val1, unsigned val2);
Performs a roll-left SIMD permutation on the 8x8 values of the two inputs and returns the result.
Targets the roll8l instruction.
Roll-right SIMD permutation
unsigned __builtin_ipu_roll8r(unsigned val1, unsigned val2);
Performs a roll-right SIMD permutation on the 8x8 values of the two inputs and returns the result.
Targets the roll8r instruction.
Check whether floating-point value is finite
int __builtin_ipu_isfinite(float val);
short2 __builtin_ipu_isfinite(half2 val);
int2 __builtin_ipu_isfinite(float2 val);
short4 __builtin_ipu_isfinite(half4 val);
Check whether a floating-point value, scalar or vector, is finite and
return the boolean result as an integer type of the same shape and size as
the input parameter. This builtin expands to a sequence of instructions, with
vector floating-point values handled by vector code. The header
ipu_builtins.h must be included for this builtin to be available.
Check whether floating-point value is infinite
int __builtin_ipu_isinf(float val);
short2 __builtin_ipu_isinf(half2 val);
int2 __builtin_ipu_isinf(float2 val);
short4 __builtin_ipu_isinf(half4 val);
Check whether a floating-point value, scalar or vector, is -inf or +inf
and return the boolean result as an integer type of the same shape and size
as the input parameter. This builtin expands to a sequence of instructions, with
vector floating-point values handled by vector code. The header
ipu_builtins.h must be included for this builtin to be available.
Check whether floating-point value is NaN
int __builtin_ipu_isnan(float val);
short2 __builtin_ipu_isnan(half2 val);
int2 __builtin_ipu_isnan(float2 val);
short4 __builtin_ipu_isnan(half4 val);
Check whether a floating-point value, scalar or vector, is not a number
(NaN) and return the boolean result as an integer type of the same shape and
size as the input parameter. This builtin expands to a sequence of instructions,
with vector floating-point values handled by vector code. The header
ipu_builtins.h must be included for this builtin to be available.
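The three floating-point checks can be used together to summarise the contents of a buffer. The following is a minimal sketch, not a definitive implementation: the names isFiniteValue, isInfValue, isNanValue, ValueCounts and countValueKinds are hypothetical, the __IPU__ guard is assumed to be defined only when compiling for the IPU, and the host branch substitutes the standard <cmath> classification functions so that the sketch can be checked off-device. Only the scalar float forms are used, and the result is compared against zero so that the sketch does not depend on which non-zero value represents true.

```cpp
#include <cmath>
#include <cstddef>

#ifdef __IPU__
#include <ipu_builtins.h>
// On the IPU, use the scalar float forms of the builtins described above.
static inline bool isFiniteValue(float v) { return __builtin_ipu_isfinite(v) != 0; }
static inline bool isInfValue(float v) { return __builtin_ipu_isinf(v) != 0; }
static inline bool isNanValue(float v) { return __builtin_ipu_isnan(v) != 0; }
#else
// Host-side stand-ins (assumption, for off-device checks only).
static inline bool isFiniteValue(float v) { return std::isfinite(v); }
static inline bool isInfValue(float v) { return std::isinf(v); }
static inline bool isNanValue(float v) { return std::isnan(v); }
#endif

// Tally of how many values fell into each category.
struct ValueCounts {
  std::size_t finite = 0;
  std::size_t infinite = 0;
  std::size_t nan = 0;
};

// Classify each element as NaN, infinite or finite and count the results.
static inline ValueCounts countValueKinds(const float *data, std::size_t n) {
  ValueCounts counts;
  for (std::size_t i = 0; i < n; ++i) {
    if (isNanValue(data[i])) {
      ++counts.nan;
    } else if (isInfValue(data[i])) {
      ++counts.infinite;
    } else if (isFiniteValue(data[i])) {
      ++counts.finite;
    }
  }
  return counts;
}
```

For larger buffers, the vector forms (for example float2 in, int2 out) would process several elements per call; they are left out here because the sketch is also meant to build with a host compiler.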