# IPU C/C++ builtins

The following IPU-specific builtin functions can be used in C/C++ code. Some of them reference the “Tile Worker Instruction Set Architecture (ISA)”, which is available from Graphcore support on request. Refer to that document for more detailed information on the instructions that these builtins target.

Note

For many of these builtins, it is possible to omit the `__builtin_ipu` prefix by using the corresponding C++ intrinsic.
See `ipu_cpp_intrinsics_api` for more information.

Note

Use `#include <ipudef.h>` for the IPU native types mentioned throughout this section, such as `half`, `half2`, `float2` and more.

## IPU functionality and memory

### Get SCOUNT_L from CSR

`unsigned __builtin_ipu_get_scount_l();`

Get the value of the control/status register (CSR) `SCOUNT_L`, which is the lower 32 bits of the tile cycle counter value.

### Get SCOUNT_U from CSR

`unsigned __builtin_ipu_get_scount_u();`

Get the value of the CSR `SCOUNT_U`, which is the upper 32 bits of the tile cycle counter value.
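Because the counter is exposed as two separate 32-bit reads, a 64-bit timestamp has to be assembled in software. The following is a minimal host-side sketch, not part of the IPU headers: `combine_count` and `read_cycle_count` are hypothetical names, and the retry loop assumes that the two halves are not latched together, so the lower half may wrap between reads. In IPU code the two callables would wrap `__builtin_ipu_get_scount_l()` and `__builtin_ipu_get_scount_u()`.

```cpp
#include <cassert>
#include <cstdint>

// Combine the two 32-bit halves of the counter into one 64-bit value.
constexpr std::uint64_t combine_count(std::uint32_t lower, std::uint32_t upper) {
  return (static_cast<std::uint64_t>(upper) << 32) | lower;
}

// Read both halves, retrying if the upper half changed in between.
// ReadLower/ReadUpper are hypothetical callables; in IPU code they would wrap
// __builtin_ipu_get_scount_l() and __builtin_ipu_get_scount_u().
template <typename ReadLower, typename ReadUpper>
std::uint64_t read_cycle_count(ReadLower read_lower, ReadUpper read_upper) {
  std::uint32_t upper = read_upper();
  for (;;) {
    const std::uint32_t lower = read_lower();
    const std::uint32_t upper_again = read_upper();
    if (upper_again == upper) {
      return combine_count(lower, upper);
    }
    upper = upper_again;  // the lower half wrapped between reads, so retry
  }
}

// Usage with stub readers standing in for the builtins.
inline void example_cycle_count() {
  auto stub_lower = [] { return 0x89ABCDEFu; };
  auto stub_upper = [] { return 0x01234567u; };
  const std::uint64_t t = read_cycle_count(stub_lower, stub_upper);
  assert(t == 0x0123456789ABCDEFull);
}
```

Passing the readers as callables keeps the combining logic testable on a host compiler, where the builtins are not available.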

### Triple-pack three addresses

`uint2 __builtin_ipu_tapack(const void * addr1, const void * addr2, const void * addr3);`

Convert three absolute addresses to the triple-packed address format.
Targets the `tapack` instruction.

### Write to a CSR

`void __builtin_ipu_put(unsigned val, unsigned char csr_index);`

Write to a control/status register.
Targets the `put` instruction.
See section “Control and Status registers” in the “Tile Worker Instruction Set Architecture (ISA)” for detailed documentation on the CSRs.

#### Example:

```
void example(unsigned x) {
    __builtin_ipu_put(x, 32);
}
```

Writes the value of `x` to the CSR at index `32`.

### Write to an upper CSR

`void __builtin_ipu_uput(unsigned val, unsigned char csr_index);`

`void __builtin_ipu_uput(float val, unsigned char csr_index);`

Write to a control register in the upper CSR address space.
Targets the `uput` instruction.
See the section “Control and Status registers” in the “Tile Worker Instruction Set Architecture (ISA)” for detailed documentation on the CSRs.

#### Example:

```
void example(unsigned x) {
    __builtin_ipu_uput(x, 2);
}
```

Writes the value of `x` to the CSR at index `2` in the upper CSR space.

Note

The function prototypes shown above are the overloaded aliases that can be
used by including `<ipu_builtins.h>`. The pure IPU builtins `__builtin_ipu_uput`
and `__builtin_ipu_uputf` are available without this header.

### Read from a CSR

`unsigned __builtin_ipu_get(unsigned char csr_index);`

Read the value of a control/status register into a general purpose register.
Targets the `get` instruction.
See section “Control and Status registers” in the “Tile Worker Instruction Set Architecture (ISA)” for detailed documentation on the CSRs.

#### Example:

```
unsigned example() {
    unsigned res = __builtin_ipu_get(1);
    return res;
}
```

Sets `res` to the value of the CSR at index `1`.

### Read from an upper CSR

`unsigned __builtin_ipu_uget(unsigned char csr_index);`

Read the value of a control/status register in the upper CSR space into a general purpose register.
Targets the `uget` instruction.
See section “Control and Status registers” in the “Tile Worker Instruction Set Architecture (ISA)” for detailed documentation on the CSRs.

#### Example:

```
unsigned example() {
    unsigned res = __builtin_ipu_uget(4);
    return res;
}
```

Sets `res` to the value of the CSR at index `4` in the upper CSR space.

### Read a float from an upper CSR

`float __builtin_ipu_ugetf(unsigned char csr_index);`

Read the value of a control/status register in the upper CSR space and return it as a `float`.
Targets the `uget` instruction.
See section “Control and Status registers” in the “Tile Worker Instruction Set Architecture (ISA)” for detailed documentation on the CSRs.

## Bit operations

### And operation

`int __builtin_ipu_and(int x, int y);`

`float __builtin_ipu_and(float x, float y);`

`float2 __builtin_ipu_and(float2 x, float2 y);`

Get the result of the `and` bit operation of two values.
Targets the `and` instruction.

Note

The function prototypes shown above are the overloaded aliases that can be
used by including `<ipu_builtins.h>`. The pure IPU builtins
`__builtin_ipu_and_i32`, `__builtin_ipu_and_f32` and `__builtin_ipu_and_v2f32`
are available without this header.
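Since this is listed as a bit operation, the `float` overload presumably combines the raw bit patterns of its operands rather than their numeric values. A minimal host-side sketch of that reading, assuming IEEE 754 single precision; `to_bits`, `from_bits` and `model_and_f32` are hypothetical helper names, not part of the IPU headers:

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Reinterpret a float as its 32-bit pattern (hypothetical helper).
inline std::uint32_t to_bits(float f) {
  std::uint32_t u;
  std::memcpy(&u, &f, sizeof u);
  return u;
}

// Reinterpret a 32-bit pattern as a float (hypothetical helper).
inline float from_bits(std::uint32_t u) {
  float f;
  std::memcpy(&f, &u, sizeof f);
  return f;
}

// Host model: bitwise AND of the two float bit patterns.
inline float model_and_f32(float x, float y) {
  return from_bits(to_bits(x) & to_bits(y));
}

// Usage: masking off the sign bit yields the absolute value.
inline void example_and_f32() {
  const float mask = from_bits(0x7FFFFFFFu);
  assert(model_and_f32(-1.5f, mask) == 1.5f);
}
```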

### Andc operation

`int __builtin_ipu_andc(int x, int y);`

`float __builtin_ipu_andc(float x, float y);`

`float2 __builtin_ipu_andc(float2 x, float2 y);`

Get the result of the `andc` bit operation of two values.
Targets the `andc` instruction.

Note

The function prototypes shown above are the overloaded aliases that can be
used by including `<ipu_builtins.h>`. The pure IPU builtins
`__builtin_ipu_andc_i32`, `__builtin_ipu_andc_f32` and `__builtin_ipu_andc_v2f32`
are available without this header.

### Or operation

`int __builtin_ipu_or(int x, int y);`

`float __builtin_ipu_or(float x, float y);`

`float2 __builtin_ipu_or(float2 x, float2 y);`

Get the result of the `or` bit operation of two values.
Targets the `or` instruction.

Note

The function prototypes shown above are the overloaded aliases that can be
used by including `<ipu_builtins.h>`. The pure IPU builtins
`__builtin_ipu_or_i32`, `__builtin_ipu_or_f32` and `__builtin_ipu_or_v2f32`
are available without this header.

### Not operation

`float __builtin_ipu_not(float x);`

`float2 __builtin_ipu_not(float2 x);`

Get the result of the `not` bit operation of a value.
Targets the `not` instruction.

Note

The function prototypes shown above are the overloaded aliases that can be
used by including `<ipu_builtins.h>`. The pure IPU builtins
`__builtin_ipu_not_f32` and `__builtin_ipu_not_v2f32`
are available without this header.

### Reverse bits in each byte

`unsigned __builtin_ipu_bitrev8(unsigned x);`

Reverses the bit order of each byte in `x`.
Targets the `bitrev8` instruction.
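The description above (bit order reversed within each byte, byte positions unchanged) can be modelled on a host compiler with three mask-and-shift steps. This is a sketch of the described behaviour under that reading, with the hypothetical name `model_bitrev8`; it is not the builtin itself.

```cpp
#include <cassert>
#include <cstdint>

// Host model: reverse the bit order inside each of the four bytes of x.
constexpr std::uint32_t model_bitrev8(std::uint32_t x) {
  x = ((x & 0x0F0F0F0Fu) << 4) | ((x >> 4) & 0x0F0F0F0Fu);  // swap nibbles
  x = ((x & 0x33333333u) << 2) | ((x >> 2) & 0x33333333u);  // swap bit pairs
  x = ((x & 0x55555555u) << 1) | ((x >> 1) & 0x55555555u);  // swap single bits
  return x;
}

// Usage: each byte is mirrored in place.
inline void example_bitrev8() {
  assert(model_bitrev8(0x01020408u) == 0x80402010u);
}
```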

### Count matching sign bits

`unsigned __builtin_ipu_cms(int x);`

Calculates the number of higher-order bits that match the sign bit in `x`.
Targets the `cms` instruction.

### SIMD roll permutation on 4x32-bit values

`float2 __builtin_ipu_roll32(float2 x, float2 y);`

Performs SIMD roll permutation on the 4 32-bit values across `x` and `y`.

```
    x           y      ->   Result
| 3 | 2 |   | 1 | 0 |     | 2 | 1 |
```

Targets the `roll32` instruction.

### SIMD roll-left permutation on 8x8-bit values

`unsigned __builtin_ipu_roll8l(unsigned x, unsigned y);`

Performs SIMD roll-left permutation on the 8 8-bit values across `x` and `y`.

```
        x                   y          ->       Result
| 7 | 6 | 5 | 4 |   | 3 | 2 | 1 | 0 |     | 6 | 5 | 4 | 3 |
```

Targets the `roll8l` instruction.

### SIMD roll-right permutation on 8x8-bit values

`unsigned __builtin_ipu_roll8r(unsigned x, unsigned y);`

Performs SIMD roll-right permutation on the 8 8-bit values across `x` and `y`.

```
        x                   y          ->       Result
| 7 | 6 | 5 | 4 |   | 3 | 2 | 1 | 0 |     | 4 | 3 | 2 | 1 |
```

Targets the `roll8r` instruction.
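Reading the two roll diagrams above with `x` holding bytes 7..4 and `y` holding bytes 3..0 (most significant byte on the left), both rolls amount to taking a 32-bit window out of the 64-bit concatenation. The following host-side sketch relies on that reading of the operand order, which is an assumption; `model_roll8l` and `model_roll8r` are hypothetical names, and the ISA document remains the authority.

```cpp
#include <cassert>
#include <cstdint>

// Host model of the roll-left diagram: result bytes | 6 | 5 | 4 | 3 |.
// Assumes x supplies bytes 7..4 and y supplies bytes 3..0.
constexpr std::uint32_t model_roll8l(std::uint32_t x, std::uint32_t y) {
  return (x << 8) | (y >> 24);
}

// Host model of the roll-right diagram: result bytes | 4 | 3 | 2 | 1 |.
constexpr std::uint32_t model_roll8r(std::uint32_t x, std::uint32_t y) {
  return (x << 24) | (y >> 8);
}

// Usage: label each byte with its index from the diagram.
inline void example_roll8() {
  const std::uint32_t x = 0x07060504u;
  const std::uint32_t y = 0x03020100u;
  assert(model_roll8l(x, y) == 0x06050403u);
  assert(model_roll8r(x, y) == 0x04030201u);
}
```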

### Upper half of SIMD shuffle permutation on 8x8-bit values

`unsigned __builtin_ipu_shuf8x8hi(unsigned x, unsigned y);`

Performs SIMD shuffle permutation on the 8 8-bit values across `x` and `y`, and returns the upper word of the result.

```
        x                   y          ->       Result
| 7 | 6 | 5 | 4 |   | 3 | 2 | 1 | 0 |     | 7 | 3 | 6 | 2 |
```

Targets the `shuf8x8hi` instruction.

### Lower half of SIMD shuffle permutation on 8x8-bit values

`unsigned __builtin_ipu_shuf8x8lo(unsigned x, unsigned y);`

Performs SIMD shuffle permutation on the 8 8-bit values across `x` and `y`, and returns the lower word of the result.

```
        x                   y          ->       Result
| 7 | 6 | 5 | 4 |   | 3 | 2 | 1 | 0 |     | 5 | 1 | 4 | 0 |
```

Targets the `shuf8x8lo` instruction.
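Taken together, the two shuffle diagrams above describe an interleave of the bytes of `x` and `y`, split into an upper and a lower word. A host-side sketch of that interleave follows, again assuming that `x` holds bytes 7..4 and `y` holds bytes 3..0 with the most significant byte on the left; all function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Extract byte i (0 = least significant) of a 32-bit word.
constexpr std::uint32_t byte_of(std::uint32_t w, unsigned i) {
  return (w >> (8u * i)) & 0xFFu;
}

// Pack four bytes, most significant first, into one word.
constexpr std::uint32_t pack4(std::uint32_t b3, std::uint32_t b2,
                              std::uint32_t b1, std::uint32_t b0) {
  return (b3 << 24) | (b2 << 16) | (b1 << 8) | b0;
}

// Host model of the upper word: | 7 | 3 | 6 | 2 |.
constexpr std::uint32_t model_shuf8x8hi(std::uint32_t x, std::uint32_t y) {
  return pack4(byte_of(x, 3), byte_of(y, 3), byte_of(x, 2), byte_of(y, 2));
}

// Host model of the lower word: | 5 | 1 | 4 | 0 |.
constexpr std::uint32_t model_shuf8x8lo(std::uint32_t x, std::uint32_t y) {
  return pack4(byte_of(x, 1), byte_of(y, 1), byte_of(x, 0), byte_of(y, 0));
}

// Usage: label each byte with its index from the diagram.
inline void example_shuf8x8() {
  const std::uint32_t x = 0x07060504u;
  const std::uint32_t y = 0x03020100u;
  assert(model_shuf8x8hi(x, y) == 0x07030602u);
  assert(model_shuf8x8lo(x, y) == 0x05010400u);
}
```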

### Upper half of SIMD sort permutation on 4x32-bit values

`float2 __builtin_ipu_sort4x32hi(float2 x, float2 y);`

Performs SIMD sort permutation on the 4 32-bit values across `x` and `y`, and returns the upper two words of the result.

```
    x           y      ->   Result
| 3 | 2 |   | 1 | 0 |     | 3 | 1 |
```

Targets the `sort4x32hi` instruction.

### Lower half of SIMD sort permutation on 4x32-bit values

`float2 __builtin_ipu_sort4x32lo(float2 x, float2 y);`

Performs SIMD sort permutation on the 4 32-bit values across `x` and `y`, and returns the lower two words of the result.

```
    x           y      ->   Result
| 3 | 2 |   | 1 | 0 |     | 2 | 0 |
```

Targets the `sort4x32lo` instruction.

### SIMD sort8 permutation on 4x8-bit values

`unsigned __builtin_ipu_sort8(unsigned x);`

Performs SIMD sort8 permutation on the 4 8-bit values in `x`.

```
        x          ->       Result
| 3 | 2 | 1 | 0 |     | 3 | 1 | 2 | 0 |
```

Targets the `sort8` instruction.

### SIMD swap8 permutation on 4x8-bit values

`unsigned __builtin_ipu_swap8(unsigned x);`

Performs SIMD swap8 permutation on the 4 8-bit values in `x`.

```
        x          ->       Result
| 3 | 2 | 1 | 0 |     | 2 | 3 | 0 | 1 |
```

Targets the `swap8` instruction.
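The two single-operand byte permutations (`sort8` and `swap8`) can be checked against their diagrams with simple masks. This is a host-side sketch that assumes byte 3 is the most significant byte of `x`; `model_sort8` and `model_swap8` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// Host model of sort8: | 3 | 2 | 1 | 0 | -> | 3 | 1 | 2 | 0 |
// (the two middle bytes trade places).
constexpr std::uint32_t model_sort8(std::uint32_t x) {
  return (x & 0xFF0000FFu) | ((x & 0x00FF0000u) >> 8) | ((x & 0x0000FF00u) << 8);
}

// Host model of swap8: | 3 | 2 | 1 | 0 | -> | 2 | 3 | 0 | 1 |
// (the bytes inside each 16-bit half trade places).
constexpr std::uint32_t model_swap8(std::uint32_t x) {
  return ((x & 0x00FF00FFu) << 8) | ((x >> 8) & 0x00FF00FFu);
}

// Usage: label each byte with its index from the diagram.
inline void example_sort8_swap8() {
  assert(model_sort8(0x03020100u) == 0x03010200u);
  assert(model_swap8(0x03020100u) == 0x02030001u);
}
```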

## Float operations

### Absolute addition of two values

`half2 __builtin_ipu_absadd(half2 x, half2 y);`

`half4 __builtin_ipu_absadd(half4 x, half4 y);`

`float __builtin_ipu_absadd(float x, float y);`

`float2 __builtin_ipu_absadd(float2 x, float2 y);`

Sum of two absolute values.

Targets the `f16v2absadd`, `f16v4absadd`, `f32absadd` and `f32v2absadd` instructions.

Note

The function prototypes shown above are the overloaded aliases that can be
used by including `<ipu_builtins.h>`. The pure IPU builtins
`__builtin_ipu_f16v2absadd`, `__builtin_ipu_f16v4absadd`, `__builtin_ipu_f32v2absadd` and `__builtin_ipu_f32absadd`
are available without this header.

### Absolute maximum of two values

`half2 __builtin_ipu_absmax(half2 x, half2 y);`

`half4 __builtin_ipu_absmax(half4 x, half4 y);`

`float __builtin_ipu_absmax(float x, float y);`

`float2 __builtin_ipu_absmax(float2 x, float2 y);`

The maximum of two absolute values.

Targets the `f16v2absmax`, `f16v4absmax`, `f32absmax` and `f32v2absmax` instructions.

Note

The function prototypes shown above are the overloaded aliases that can be
used by including `<ipu_builtins.h>`. The pure IPU builtins
`__builtin_ipu_f16v2absmax`, `__builtin_ipu_f16v4absmax`, `__builtin_ipu_f32v2absmax` and `__builtin_ipu_f32absmax`
are available without this header.
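For the scalar `float` case, the absolute-addition and absolute-maximum operations can be written out with the standard library as a reference. This is a host-side sketch with hypothetical names (`model_absadd`, `model_absmax`); how the hardware treats NaN inputs and rounding is not covered here and may differ.

```cpp
#include <cassert>
#include <cmath>

// Host model: |x| + |y|.
inline float model_absadd(float x, float y) {
  return std::fabs(x) + std::fabs(y);
}

// Host model: max(|x|, |y|).
inline float model_absmax(float x, float y) {
  return std::fmax(std::fabs(x), std::fabs(y));
}

// Usage: the sign of either operand does not affect the result.
inline void example_abs_ops() {
  assert(model_absadd(-1.5f, 2.0f) == 3.5f);
  assert(model_absmax(-3.0f, 2.0f) == 3.0f);
}
```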

### Maximum of two values

`half2 __builtin_ipu_max(half2 x, half2 y);`

`half4 __builtin_ipu_max(half4 x, half4 y);`

`float __builtin_ipu_max(float x, float y);`

`float2 __builtin_ipu_max(float2 x, float2 y);`

The maximum of two values.

Targets the `f16v2max`, `f16v4max`, `f32max` and `f32v2max` instructions.

Note

The function prototypes shown above are the overloaded aliases that can be
used by including `<ipu_builtins.h>`. The pure IPU builtins
`__builtin_ipu_f16v2max`, `__builtin_ipu_f16v4max`, `__builtin_ipu_f32v2max` and `__builtin_ipu_f32max`
are available without this header.

### Lateral maximum of two values

`half2 __builtin_ipu_maxc(half2 x, half2 y);`

`half4 __builtin_ipu_maxc(half4 x, half4 y);`

`float __builtin_ipu_maxc(float x, float y);`

`float2 __builtin_ipu_maxc(float2 x, float2 y);`

The lateral maximum of two values.

Targets the `f16v2maxc`, `f16v4maxc`, `f32maxc` and `f32v2maxc` instructions.

Note

The function prototypes shown above are the overloaded aliases that can be
used by including `<ipu_builtins.h>`. The pure IPU builtins
`__builtin_ipu_f16v2maxc`, `__builtin_ipu_f16v4maxc`, `__builtin_ipu_f32v2maxc` and `__builtin_ipu_f32maxc`
are available without this header.

### Minimum of two values

`half2 __builtin_ipu_min(half2 x, half2 y);`

`half4 __builtin_ipu_min(half4 x, half4 y);`

`float __builtin_ipu_min(float x, float y);`

`float2 __builtin_ipu_min(float2 x, float2 y);`

The minimum of two values.

Targets the `f16v2min`, `f16v4min`, `f32min` and `f32v2min` instructions.

Note

The function prototypes shown above are the overloaded aliases that can be
used by including `<ipu_builtins.h>`. The pure IPU builtins
`__builtin_ipu_f16v2min`, `__builtin_ipu_f16v4min`, `__builtin_ipu_f32v2min` and `__builtin_ipu_f32min`
are available without this header.

### Min-of-maximum of two values

`half2 __builtin_ipu_clamp(half2 x, half2 y);`

`half4 __builtin_ipu_clamp(half4 x, half2 y);`

`float __builtin_ipu_clamp(float x, float2 y);`

`float2 __builtin_ipu_clamp(float2 x, float2 y);`

The min-of-maximum of each of the elements in `x`, compared with the two elements in `y`.

Targets the `f16v2clamp`, `f16v4clamp`, `f32clamp` and `f32v2clamp` instructions.

Note

The function prototypes shown above are the overloaded aliases that can be
used by including `<ipu_builtins.h>`. The pure IPU builtins
`__builtin_ipu_f16v2clamp`, `__builtin_ipu_f16v4clamp`, `__builtin_ipu_f32v2clamp` and `__builtin_ipu_f32clamp`
are available without this header.
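"Min-of-maximum" reads as `min(max(x, lower), upper)`, with the two bounds supplied by `y`. The following host-side sketch covers the scalar case and assumes that the first element of `y` is the lower bound and the second is the upper bound; that ordering is an assumption and should be confirmed against the ISA document. `Bounds` and `model_clamp` are hypothetical names.

```cpp
#include <algorithm>
#include <array>
#include <cassert>

// Stand-in for the two-element bounds operand (assumed order: lower, upper).
using Bounds = std::array<float, 2>;

// Host model: min(max(x, lower), upper).
inline float model_clamp(float x, const Bounds& y) {
  return std::min(std::max(x, y[0]), y[1]);
}

// Usage: values outside [0, 1] are pulled to the nearest bound.
inline void example_clamp() {
  const Bounds unit = {0.0f, 1.0f};
  assert(model_clamp(5.0f, unit) == 1.0f);
  assert(model_clamp(-2.0f, unit) == 0.0f);
  assert(model_clamp(0.25f, unit) == 0.25f);
}
```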

### CMAC operation

`void __builtin_ipu_cmac(half2 x, half2 y);`

`void __builtin_ipu_cmac(half4 x, half4 y);`

Performs the CMAC operation on two values. See “Tile Worker Instruction Set Architecture (ISA)” for more information.

Targets the `f16v2cmac` and `f16v4cmac` instructions.

Note

The function prototypes shown above are the overloaded aliases that can be
used by including `<ipu_builtins.h>`. The pure IPU builtins
`__builtin_ipu_f16v2cmac` and `__builtin_ipu_f16v4cmac`
are available without this header.

### Natural exponential

`half2 __builtin_ipu_exp(half2 x);`

`float __builtin_ipu_exp(float x);`

The natural exponential function.

Targets the `f16v2exp` and `f32exp` instructions.

Note

The function prototypes shown above are the overloaded aliases that can be
used by including `<ipu_builtins.h>`. The pure IPU builtin
`__builtin_ipu_f16v2exp` is available without this header.

### 2-to-the-power-of

`half2 __builtin_ipu_exp2(half2 x);`

`float __builtin_ipu_exp2(float x);`

Calculates `2^x`.

Targets the `f16v2exp2` and `f32exp2` instructions.

Note

The function prototypes shown above are the overloaded aliases that can be
used by including `<ipu_builtins.h>`. The pure IPU builtin
`__builtin_ipu_f16v2exp2` is available without this header.

### Natural logarithm

`half2 __builtin_ipu_ln(half2 x);`

`float __builtin_ipu_ln(float x);`

The natural logarithm function.

Targets the `f16v2ln` and `f32ln` instructions.

Note

The function prototypes shown above are the overloaded aliases that can be
used by including `<ipu_builtins.h>`. The pure IPU builtin
`__builtin_ipu_f16v2ln` is available without this header.

### Base-2 logarithm

`half2 __builtin_ipu_log2(half2 x);`

`float __builtin_ipu_log2(float x);`

Base-2 logarithm function.

Targets the `f16v2log2` and `f32log2` instructions.

Note

The function prototypes shown above are the overloaded aliases that can be
used by including `<ipu_builtins.h>`. The pure IPU builtin
`__builtin_ipu_f16v2log2` is available without this header.

### Probabilistic mask function

`half4 __builtin_ipu_rmask(half4 x, float y);`

`float2 __builtin_ipu_rmask(float2 x, float y);`

Returns a masked version of the first argument. See “Tile Worker Instruction Set Architecture (ISA)” for more information.

Targets the `f16v4rmask` and `f32v2rmask` instructions.

Note

The function prototypes shown above are the overloaded aliases that can be
used by including `<ipu_builtins.h>`. The pure IPU builtins
`__builtin_ipu_f16v4rmask` and `__builtin_ipu_f32v2rmask`
are available without this header.

### Sigmoid function

`half2 __builtin_ipu_sigm(half2 x);`

`float __builtin_ipu_sigm(float x);`

Returns the result of the sigmoid function of a value.

Targets the `f16v2sigm` and `f32sigm` instructions.

Note

The function prototypes shown above are the overloaded aliases that can be
used by including `<ipu_builtins.h>`. The pure IPU builtins
`__builtin_ipu_f16v2sigm` and `__builtin_ipu_f32sigm`
are available without this header.
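For reference, the sigmoid (logistic) function is `1 / (1 + e^-x)`. The host-side sketch below, with the hypothetical name `model_sigm`, can serve as a comparison point when validating results; the hardware instruction may differ from it in precision and in its handling of special values.

```cpp
#include <cassert>
#include <cmath>

// Host model of the logistic sigmoid for a scalar float.
inline float model_sigm(float x) {
  return 1.0f / (1.0f + std::exp(-x));
}

// Usage: the function is 0.5 at zero and saturates towards 0 and 1.
inline void example_sigm() {
  assert(model_sigm(0.0f) == 0.5f);
  assert(model_sigm(100.0f) == 1.0f);
  assert(model_sigm(-100.0f) == 0.0f);
}
```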

### Lateral sum

`float __builtin_ipu_sum(half2 x);`

`float2 __builtin_ipu_sum(half4 x);`

Returns the lateral summation of the elements in `x`.

Targets the `f16v2sum` and `f16v4sum` instructions.

Note

The function prototypes shown above are the overloaded aliases that can be
used by including `<ipu_builtins.h>`. The pure IPU builtins
`__builtin_ipu_f16v2sum` and `__builtin_ipu_f16v4sum`
are available without this header.

### Tanh

`half2 __builtin_ipu_tanh(half2 x);`

`float __builtin_ipu_tanh(float x);`

Returns the result of the hyperbolic tangent function of `x`.

Targets the `f16v2tanh` and `f32tanh` instructions.

Note

The function prototypes shown above are the overloaded aliases that can be
used by including `<ipu_builtins.h>`. The pure IPU builtin
`__builtin_ipu_f16v2tanh` is available without this header.

### Vector product

`void __builtin_ipu_f32v2aop(float2 x, float2 y, unsigned char z);`

Calculates vector product of the first two arguments. See “Tile Worker Instruction Set Architecture (ISA)” for more detail.

Targets the `f32v2aop` instruction.

### Vector sum with scalar multiplicand

`float2 __builtin_ipu_f32v2axpy(float2 x, float2 y);`

Calculates the vector result of `ax + y`, where `a` is the value of the CSR `$TAS`.

Targets the `f32v2axpy` instruction.
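The arithmetic of `ax + y` on a two-element vector can be spelled out on the host as follows. In this sketch the scalar `a` is passed as an ordinary parameter, whereas the builtin takes it from `$TAS`; `Float2` and `model_axpy` are hypothetical stand-ins, and any rounding or precision behaviour specific to the hardware is not modelled.

```cpp
#include <array>
#include <cassert>

// Stand-in for the IPU float2 type.
using Float2 = std::array<float, 2>;

// Host model: element-wise a * x + y with a shared scalar a.
inline Float2 model_axpy(float a, const Float2& x, const Float2& y) {
  return {a * x[0] + y[0], a * x[1] + y[1]};
}

// Usage: scale x by 2 and add y.
inline void example_axpy() {
  const Float2 r = model_axpy(2.0f, {1.0f, 2.0f}, {10.0f, 20.0f});
  assert(r[0] == 12.0f);
  assert(r[1] == 24.0f);
}
```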

### Get and initialise accumulators

`half2 __builtin_ipu_gina(half2 x, unsigned int y);`

`float2 __builtin_ipu_gina(float2 x, unsigned int y);`

Get and initialise accumulators. See “Tile Worker Instruction Set Architecture (ISA)” for more information.

Targets the `f16v2gina` and `f32v2gina` instructions.

Note

The function prototypes shown above are the overloaded aliases that can be
used by including `<ipu_builtins.h>`. The pure IPU builtins
`__builtin_ipu_f16v2gina` and `__builtin_ipu_f32v2gina`
are available without this header.

## Float comparisons

### Equality test

`half2 __builtin_ipu_cmpeq(half2 x, half2 y);`

`half4 __builtin_ipu_cmpeq(half4 x, half4 y);`

`float __builtin_ipu_cmpeq(float x, float y);`

`float2 __builtin_ipu_cmpeq(float2 x, float2 y);`

Element-wise equality comparison of two arguments.

Targets the `f16v2cmpeq`, `f16v4cmpeq`, `f32cmpeq` and `f32v2cmpeq` instructions.

Note

The function prototypes shown above are the overloaded aliases that can be
used by including `<ipu_builtins.h>`. The pure IPU builtins
`__builtin_ipu_f16v2cmpeq`, `__builtin_ipu_f16v4cmpeq`, `__builtin_ipu_f32cmpeq` and `__builtin_ipu_f32v2cmpeq`
are available without this header.

### Greater-than-or-equal-to test

`half2 __builtin_ipu_cmpge(half2 x, half2 y);`

`half4 __builtin_ipu_cmpge(half4 x, half4 y);`

`float __builtin_ipu_cmpge(float x, float y);`

`float2 __builtin_ipu_cmpge(float2 x, float2 y);`

Element-wise greater-than-or-equal-to test of two arguments.

Targets the `f16v2cmpge`, `f16v4cmpge`, `f32cmpge` and `f32v2cmpge` instructions.

Note

The function prototypes shown above are the overloaded aliases that can be
used by including `<ipu_builtins.h>`. The pure IPU builtins
`__builtin_ipu_f16v2cmpge`, `__builtin_ipu_f16v4cmpge`, `__builtin_ipu_f32cmpge` and `__builtin_ipu_f32v2cmpge`
are available without this header.

### Greater-than test

`half2 __builtin_ipu_cmpgt(half2 x, half2 y);`

`half4 __builtin_ipu_cmpgt(half4 x, half4 y);`

`float __builtin_ipu_cmpgt(float x, float y);`

`float2 __builtin_ipu_cmpgt(float2 x, float2 y);`

Element-wise greater-than test of two arguments.

Targets the `f16v2cmpgt`, `f16v4cmpgt`, `f32cmpgt` and `f32v2cmpgt` instructions.

Note

The function prototypes shown above are the overloaded aliases that can be
used by including `<ipu_builtins.h>`. The pure IPU builtins
`__builtin_ipu_f16v2cmpgt`, `__builtin_ipu_f16v4cmpgt`, `__builtin_ipu_f32cmpgt` and `__builtin_ipu_f32v2cmpgt`
are available without this header.

### Less-than-or-equal-to test

`half2 __builtin_ipu_cmple(half2 x, half2 y);`

`half4 __builtin_ipu_cmple(half4 x, half4 y);`

`float __builtin_ipu_cmple(float x, float y);`

`float2 __builtin_ipu_cmple(float2 x, float2 y);`

Element-wise less-than-or-equal-to test of two arguments.

Targets the `f16v2cmple`, `f16v4cmple`, `f32cmple` and `f32v2cmple` instructions.

Note

The function prototypes shown above are the overloaded aliases that can be
used by including `<ipu_builtins.h>`. The pure IPU builtins
`__builtin_ipu_f16v2cmple`, `__builtin_ipu_f16v4cmple`, `__builtin_ipu_f32cmple` and `__builtin_ipu_f32v2cmple`
are available without this header.

### Less-than test

`half2 __builtin_ipu_cmplt(half2 x, half2 y);`

`half4 __builtin_ipu_cmplt(half4 x, half4 y);`

`float __builtin_ipu_cmplt(float x, float y);`

`float2 __builtin_ipu_cmplt(float2 x, float2 y);`

Element-wise less-than test of two arguments.

Targets the `f16v2cmplt`, `f16v4cmplt`, `f32cmplt` and `f32v2cmplt` instructions.

Note

The function prototypes shown above are the overloaded aliases that can be
used by including `<ipu_builtins.h>`. The pure IPU builtins
`__builtin_ipu_f16v2cmplt`, `__builtin_ipu_f16v4cmplt`, `__builtin_ipu_f32cmplt` and `__builtin_ipu_f32v2cmplt`
are available without this header.

### Inequality test

`half2 __builtin_ipu_cmpne(half2 x, half2 y);`

`half4 __builtin_ipu_cmpne(half4 x, half4 y);`

`float __builtin_ipu_cmpne(float x, float y);`

`float2 __builtin_ipu_cmpne(float2 x, float2 y);`

Element-wise inequality test of two arguments.

Targets the `f16v2cmpne`, `f16v4cmpne`, `f32cmpne` and `f32v2cmpne` instructions.

Note

The function prototypes shown above are the overloaded aliases that can be
used by including `<ipu_builtins.h>`. The pure IPU builtins
`__builtin_ipu_f16v2cmpne`, `__builtin_ipu_f16v4cmpne`, `__builtin_ipu_f32cmpne` and `__builtin_ipu_f32v2cmpne`
are available without this header.

## Float classification

### Classify float

`short2 __builtin_ipu_class(half2 num);`

`short4 __builtin_ipu_class(half4 num);`

`int __builtin_ipu_class(float num);`

`short2 __builtin_ipu_class(float2 num);`

Floating-point number classifier. The result will be one of the float class identifiers, which can be found in the “TileFloatClass” section in the “Tile Worker Instruction Set Architecture (ISA)”.

Targets the `f16v2class`, `f16v4class`, `f32class` and `f32v2class` instructions.

Note

The function prototypes shown above are the overloaded aliases that can be
used by including `<ipu_builtins.h>`. The pure IPU builtins
`__builtin_ipu_f16v2class`, `__builtin_ipu_f16v4class`, `__builtin_ipu_f32class` and `__builtin_ipu_f32v2class`
are available without this header.

### Check whether floating-point value is finite

`int __builtin_ipu_isfinite(float val);`

`short2 __builtin_ipu_isfinite(half2 val);`

`int2 __builtin_ipu_isfinite(float2 val);`

`short4 __builtin_ipu_isfinite(half4 val);`

Check whether a floating-point value, whether scalar or vector, is finite and return the boolean result value as an integer type of same shape and size as the input parameter. This builtin expands to a sequence of instructions with vector floating-point values handled by vector code.

Note

The function prototypes shown above are the overloaded aliases that can be
used by including `<ipu_builtins.h>`. The pure IPU builtins
`__builtin_ipu_isfinite_f32`, `__builtin_ipu_isfinite_v2f16`, `__builtin_ipu_isfinite_v2f32` and `__builtin_ipu_isfinite_v4f16`
are available without this header.

### Check whether floating-point value is infinite

`int __builtin_ipu_isinf(float val);`

`short2 __builtin_ipu_isinf(half2 val);`

`int2 __builtin_ipu_isinf(float2 val);`

`short4 __builtin_ipu_isinf(half4 val);`

Check whether a floating-point value, whether scalar or vector, is -inf or +inf and return the boolean result value as an integer type of same shape and size as the input parameter. This builtin expands to a sequence of instructions with vector floating-point values handled by vector code.

Note

The function prototypes shown above are the overloaded aliases that can be
used by including `<ipu_builtins.h>`. The pure IPU builtins
`__builtin_ipu_isinf_f32`, `__builtin_ipu_isinf_v2f16`, `__builtin_ipu_isinf_v2f32` and `__builtin_ipu_isinf_v4f16`
are available without this header.

### Check whether floating-point value is NaN

`int __builtin_ipu_isnan(float val);`

`short2 __builtin_ipu_isnan(half2 val);`

`int2 __builtin_ipu_isnan(float2 val);`

`short4 __builtin_ipu_isnan(half4 val);`

Check whether a floating-point value, whether scalar or vector, is not a number (NaN) and return the boolean result value in an integer type of same shape and size as the input parameter. This builtin expands to a sequence of instructions with vector floating-point values handled by vector code.

Note

The function prototypes shown above are the overloaded aliases that can be
used by including `<ipu_builtins.h>`. The pure IPU builtins
`__builtin_ipu_isnan_f32`, `__builtin_ipu_isnan_v2f16`, `__builtin_ipu_isnan_v2f32` and `__builtin_ipu_isnan_v4f16`
are available without this header.
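The three classification checks above return an integer vector of the same shape as the input. A host-side sketch of the two-element `float` case follows, using the standard `<cmath>` classifiers. It assumes that "true" is reported as a non-zero element and uses `1` for it; the exact value the builtins produce is not stated above. `Float2`, `Int2` and the `model_*` functions are hypothetical stand-ins.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <limits>

using Float2 = std::array<float, 2>;  // stand-in for float2
using Int2 = std::array<int, 2>;      // stand-in for int2

// Host models: one integer flag per element (1 = true, an assumed encoding).
inline Int2 model_isfinite(const Float2& v) {
  return {std::isfinite(v[0]) ? 1 : 0, std::isfinite(v[1]) ? 1 : 0};
}

inline Int2 model_isinf(const Float2& v) {
  return {std::isinf(v[0]) ? 1 : 0, std::isinf(v[1]) ? 1 : 0};
}

inline Int2 model_isnan(const Float2& v) {
  return {std::isnan(v[0]) ? 1 : 0, std::isnan(v[1]) ? 1 : 0};
}

// Usage: classify a vector holding one infinity and one NaN.
inline void example_classify() {
  const float inf = std::numeric_limits<float>::infinity();
  const float nan = std::numeric_limits<float>::quiet_NaN();
  const Float2 v = {-inf, nan};
  assert(model_isfinite(v)[0] == 0 && model_isfinite(v)[1] == 0);
  assert(model_isinf(v)[0] != 0 && model_isinf(v)[1] == 0);
  assert(model_isnan(v)[0] == 0 && model_isnan(v)[1] != 0);
}
```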

## Random number generation

### Generate half2 vector using Gaussian distribution

`half2 __builtin_ipu_f16v2grand();`

Generate a two-element half-precision random vector drawn from a Gaussian distribution.

Targets the `f16v2grand` instruction.

### Generate float2 vector using Gaussian distribution

`float2 __builtin_ipu_f32v2grand();`

Generate a two-element single-precision random vector drawn from a Gaussian distribution.

Targets the `f32v2grand` instruction.

### Generate random 32-bit integer

`unsigned __builtin_ipu_urand32();`

Generate a 32-bit random integer drawn from a uniform distribution.
Targets the `urand32` instruction.
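A common use of 32 uniformly distributed random bits is to map them to a float in `[0, 1)`. The sketch below shows one standard way to do that on the host by keeping the top 24 bits, which fit exactly in a single-precision mantissa. The name `unit_float_from_bits` is hypothetical, and this is not a description of how the IPU's own random float builtins work.

```cpp
#include <cassert>
#include <cstdint>

// Map 32 random bits to a float in [0, 1) using the top 24 bits.
// In IPU code the argument could come from __builtin_ipu_urand32().
constexpr float unit_float_from_bits(std::uint32_t bits) {
  return static_cast<float>(bits >> 8) * (1.0f / 16777216.0f);  // divide by 2^24
}

// Usage: the extremes of the input range stay inside [0, 1).
inline void example_unit_float() {
  assert(unit_float_from_bits(0u) == 0.0f);
  assert(unit_float_from_bits(0x80000000u) == 0.5f);
  assert(unit_float_from_bits(0xFFFFFFFFu) < 1.0f);
}
```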

### Generate random 64-bit integer

`unsigned long long __builtin_ipu_urand64();`

Generate a 64-bit random integer drawn from a uniform distribution.
Targets the `urand64` instruction.

### Generate random 16-bit float

`half __builtin_ipu_urand_f16();`

Generate a 16-bit random float (`half`) drawn from a uniform distribution.

### Generate random 32-bit float

`float __builtin_ipu_urand_f32();`

Generate a 32-bit random `float` drawn from a uniform distribution.