Poplar and PopLibs
ipu Namespace Reference

IPU intrinsic functions. More...

Functions

int andc (int src0, int src1)
 Targets the andc instruction. More...
 
unsigned andc (unsigned src0, unsigned src1)
 Targets the andc instruction. More...
 
float andc (float src0, float src1)
 Targets the andc instruction. More...
 
float2 andc (float2 src0, float2 src1)
 Targets the andc64 instruction. More...
 
int bitrev8 (int src)
 Targets the bitrev8 instruction. More...
 
unsigned cms (int src)
 Targets the cms instruction. More...
 
float2 roll32 (float2 src0, float2 src1)
 Targets the roll32 instruction. More...
 
int roll8l (int src0, int src1)
 Targets the roll8l instruction. More...
 
int roll8r (int src0, int src1)
 Targets the roll8r instruction. More...
 
int shuf8x8hi (int src0, int src1)
 Targets the shuf8x8hi instruction. More...
 
int shuf8x8lo (int src0, int src1)
 Targets the shuf8x8lo instruction. More...
 
float2 sort4x32hi (float2 src0, float2 src1)
 Targets the sort4x32hi instruction. More...
 
float2 sort4x32lo (float2 src0, float2 src1)
 Targets the sort4x32lo instruction. More...
 
int sort8 (int src)
 Targets the sort8 instruction. More...
 
int sort8x8hi (int src0, int src1)
 Targets the sort8x8hi instruction. More...
 
int sort8x8lo (int src0, int src1)
 Targets the sort8x8lo instruction. More...
 
int swap8 (int src)
 Targets the swap8 instruction. More...
 
half2 absadd (half2 src0, half2 src1)
 Targets the f16v2absadd instruction. More...
 
half4 absadd (half4 src0, half4 src1)
 Targets the f16v4absadd instruction. More...
 
float2 absadd (float2 src0, float2 src1)
 Targets the f32v2absadd instruction. More...
 
float absadd (float src0, float src1)
 Targets the f32absadd instruction. More...
 
half2 absmax (half2 src0, half2 src1)
 Targets the f16v2absmax instruction. More...
 
half4 absmax (half4 src0, half4 src1)
 Targets the f16v4absmax instruction. More...
 
float2 absmax (float2 src0, float2 src1)
 Targets the f32v2absmax instruction. More...
 
float absmax (float src0, float src1)
 Targets the f32absmax instruction. More...
 
half2 max (half2 src0, half2 src1)
 Targets the f16v2max instruction. More...
 
half4 max (half4 src0, half4 src1)
 Targets the f16v4max instruction. More...
 
float2 max (float2 src0, float2 src1)
 Targets the f32v2max instruction. More...
 
float max (float src0, float src1)
 Targets the f32max instruction. More...
 
half2 maxc (half4 src)
 Targets the f16v4maxc instruction. More...
 
half2 min (half2 src0, half2 src1)
 Targets the f16v2min instruction. More...
 
half4 min (half4 src0, half4 src1)
 Targets the f16v4min instruction. More...
 
float2 min (float2 src0, float2 src1)
 Targets the f32v2min instruction. More...
 
float min (float src0, float src1)
 Targets the f32min instruction. More...
 
half2 clamp (half2 src0, half2 src1)
 Targets the f16v2clamp instruction. More...
 
half4 clamp (half4 src0, half2 src1)
 Targets the f16v4clamp instruction. More...
 
float2 clamp (float2 src0, float2 src1)
 Targets the f32v2clamp instruction. More...
 
float clamp (float src0, float2 src1)
 Targets the f32clamp instruction. More...
 
void cmac (half2 src0, half2 src1)
 Targets the f16v2cmac instruction. More...
 
void cmac (half4 src0, half4 src1)
 Targets the f16v4cmac instruction. More...
 
half2 exp (half2 src)
 Targets the f16v2exp instruction. More...
 
float exp (float src)
 Targets the f32exp instruction. More...
 
half2 exp2 (half2 src)
 Targets the f16v2exp2 instruction. More...
 
float exp2 (float src)
 Targets the f32exp2 instruction. More...
 
half2 log2 (half2 src)
 Targets the f16v2log2 instruction. More...
 
float log2 (float src)
 Targets the f32log2 instruction. More...
 
half2 tanh (half2 src)
 Targets the f16v2tanh instruction. More...
 
float tanh (float src)
 Targets the f32tanh instruction. More...
 
half2 ln (half2 src)
 Targets the f16v2ln instruction. More...
 
float ln (float src)
 Targets the f32ln instruction. More...
 
float2 axpy (float2 src0, float2 src1)
 Targets the f32v2axpy instruction. More...
 
half2 f16v2grand ()
 Targets the f16v2grand instruction. More...
 
float2 f32v2grand ()
 Targets the f32v2grand instruction. More...
 
half4 rmask (half4 src0, float src1)
 Targets the f16v4rmask instruction. More...
 
float2 rmask (float2 src0, float src1)
 Targets the f32v2rmask instruction. More...
 
half2 sigm (half2 src)
 Targets the f16v2sigm instruction. More...
 
float sigm (float src)
 Targets the f32sigm instruction. More...
 
float sum (half2 src)
 Targets the f16v2sum instruction. More...
 
float2 sum (half4 src)
 Targets the f16v4sum instruction. More...
 
half2 cmpeq (half2 src0, half2 src1)
 Targets the f16v2cmpeq instruction. More...
 
half4 cmpeq (half4 src0, half4 src1)
 Targets the f16v4cmpeq instruction. More...
 
float2 cmpeq (float2 src0, float2 src1)
 Targets the f32v2cmpeq instruction. More...
 
float cmpeq (float src0, float src1)
 Targets the f32cmpeq instruction. More...
 
half2 cmpge (half2 src0, half2 src1)
 Targets the f16v2cmpge instruction. More...
 
half4 cmpge (half4 src0, half4 src1)
 Targets the f16v4cmpge instruction. More...
 
float2 cmpge (float2 src0, float2 src1)
 Targets the f32v2cmpge instruction. More...
 
float cmpge (float src0, float src1)
 Targets the f32cmpge instruction. More...
 
half2 cmpgt (half2 src0, half2 src1)
 Targets the f16v2cmpgt instruction. More...
 
half4 cmpgt (half4 src0, half4 src1)
 Targets the f16v4cmpgt instruction. More...
 
float2 cmpgt (float2 src0, float2 src1)
 Targets the f32v2cmpgt instruction. More...
 
float cmpgt (float src0, float src1)
 Targets the f32cmpgt instruction. More...
 
half2 cmple (half2 src0, half2 src1)
 Targets the f16v2cmple instruction. More...
 
half4 cmple (half4 src0, half4 src1)
 Targets the f16v4cmple instruction. More...
 
float2 cmple (float2 src0, float2 src1)
 Targets the f32v2cmple instruction. More...
 
float cmple (float src0, float src1)
 Targets the f32cmple instruction. More...
 
half2 cmplt (half2 src0, half2 src1)
 Targets the f16v2cmplt instruction. More...
 
half4 cmplt (half4 src0, half4 src1)
 Targets the f16v4cmplt instruction. More...
 
float2 cmplt (float2 src0, float2 src1)
 Targets the f32v2cmplt instruction. More...
 
float cmplt (float src0, float src1)
 Targets the f32cmplt instruction. More...
 
half2 cmpne (half2 src0, half2 src1)
 Targets the f16v2cmpne instruction. More...
 
half4 cmpne (half4 src0, half4 src1)
 Targets the f16v4cmpne instruction. More...
 
float2 cmpne (float2 src0, float2 src1)
 Targets the f32v2cmpne instruction. More...
 
float cmpne (float src0, float src1)
 Targets the f32cmpne instruction. More...
 
unsigned clz (int src)
 Targets the clz instruction. More...
 
unsigned popc (int src)
 Targets the popc instruction. More...
 
short2 roll16 (short2 src0, short2 src1)
 Targets the roll16 instruction. More...
 
ushort2 roll16 (ushort2 src0, ushort2 src1)
 Targets the roll16 instruction. More...
 
half2 roll16 (half2 src0, half2 src1)
 Targets the roll16 instruction. More...
 
short2 sort4x16hi (short2 src0, short2 src1)
 Targets the sort4x16hi instruction. More...
 
ushort2 sort4x16hi (ushort2 src0, ushort2 src1)
 Targets the sort4x16hi instruction. More...
 
half2 sort4x16hi (half2 src0, half2 src1)
 Targets the sort4x16hi instruction. More...
 
short2 sort4x16lo (short2 src0, short2 src1)
 Targets the sort4x16lo instruction. More...
 
ushort2 sort4x16lo (ushort2 src0, ushort2 src1)
 Targets the sort4x16lo instruction. More...
 
half2 sort4x16lo (half2 src0, half2 src1)
 Targets the sort4x16lo instruction. More...
 
half load_postinc (const half **a, int i)
 Post-incrementing load, targeting the ldb16step instruction. More...
 
void store_postinc (half2 **a, half2 v, int i)
 Post-incrementing store, targeting the st32step instruction. More...
 
half2 load_postinc (const half2 **a, int i)
 Post-incrementing load, targeting the ld32step instruction. More...
 
void store_postinc (half4 **a, half4 v, int i)
 Post-incrementing store, targeting the st64step instruction. More...
 
half4 load_postinc (const half4 **a, int i)
 Post-incrementing load, targeting the ld64step instruction. More...
 
void store_postinc (float **a, float v, int i)
 Post-incrementing store, targeting the st32step instruction. More...
 
float load_postinc (const float **a, int i)
 Post-incrementing load, targeting the ld32step instruction. More...
 
void store_postinc (float2 **a, float2 v, int i)
 Post-incrementing store, targeting the st64step instruction. More...
 
float2 load_postinc (const float2 **a, int i)
 Post-incrementing load, targeting the ld64step instruction. More...
 
void store_postinc (int **a, int v, int i)
 Post-incrementing store, targeting the stm32step instruction if i is a variable stride, and st32step otherwise. More...
 
int load_postinc (const int **a, int i)
 Post-incrementing load, targeting the ld32step instruction. More...
 
void store_postinc (unsigned **a, unsigned v, int i)
 Post-incrementing store, targeting the stm32step instruction if i is a variable stride, and st32step otherwise. More...
 
unsigned load_postinc (const unsigned **a, int i)
 Post-incrementing load, targeting the ld32step instruction. More...
 
void store_postinc (int2 **a, int2 v, int i)
 Post-incrementing store. More...
 
int2 load_postinc (const int2 **a, int i)
 Post-incrementing load. More...
 
void store_postinc (uint2 **a, uint2 v, int i)
 Post-incrementing store. More...
 
uint2 load_postinc (const uint2 **a, int i)
 Post-incrementing load. More...
 
void store_postinc (short **a, short v, int i)
 Post-incrementing store. More...
 
short load_postinc (const short **a, int i)
 Post-incrementing load, targeting the lds16step instruction. More...
 
void store_postinc (unsigned short **a, unsigned short v, int i)
 Post-incrementing store. More...
 
unsigned short load_postinc (const unsigned short **a, int i)
 Post-incrementing load, targeting the ldz16step instruction. More...
 
void store_postinc (short2 **a, short2 v, int i)
 Post-incrementing store, targeting the stm32step instruction if i is a variable stride, and st32step otherwise. More...
 
short2 load_postinc (const short2 **a, int i)
 Post-incrementing load, targeting the ld32step instruction. More...
 
void store_postinc (ushort2 **a, ushort2 v, int i)
 Post-incrementing store, targeting the stm32step instruction if i is a variable stride, and st32step otherwise. More...
 
ushort2 load_postinc (const ushort2 **a, int i)
 Post-incrementing load, targeting the ld32step instruction. More...
 
void store_postinc (short4 **a, short4 v, int i)
 Post-incrementing store. More...
 
short4 load_postinc (const short4 **a, int i)
 Post-incrementing load. More...
 
void store_postinc (ushort4 **a, ushort4 v, int i)
 Post-incrementing store. More...
 
ushort4 load_postinc (const ushort4 **a, int i)
 Post-incrementing load. More...
 
void store_postinc (char **a, char v, int i)
 Post-incrementing store. More...
 
char load_postinc (const char **a, int i)
 Post-incrementing load, targeting the lds8step instruction. More...
 
void store_postinc (unsigned char **a, unsigned char v, int i)
 Post-incrementing store. More...
 
unsigned char load_postinc (const unsigned char **a, int i)
 Post-incrementing load, targeting the ldz8step instruction. More...
 
float acos (float x)
 The arccos function, the inverse of cosine. More...
 
float2 acos (float2 x)
 The arccos function, the inverse of cosine. More...
 
half acos (half x)
 The arccos function, the inverse of cosine. More...
 
half2 acos (half2 x)
 The arccos function, the inverse of cosine. More...
 
half4 acos (half4 x)
 The arccos function, the inverse of cosine. More...
 
float acosh (float x)
 The arccosh function, the inverse of the hyperbolic cosine. More...
 
float2 acosh (float2 x)
 The arccosh function, the inverse of the hyperbolic cosine. More...
 
half acosh (half x)
 The arccosh function, the inverse of the hyperbolic cosine. More...
 
half2 acosh (half2 x)
 The arccosh function, the inverse of the hyperbolic cosine. More...
 
half4 acosh (half4 x)
 The arccosh function, the inverse of the hyperbolic cosine. More...
 
float asin (float x)
 The arcsin function, the inverse of sine. More...
 
float2 asin (float2 x)
 The arcsin function, the inverse of sine. More...
 
half asin (half x)
 The arcsin function, the inverse of sine. More...
 
half2 asin (half2 x)
 The arcsin function, the inverse of sine. More...
 
half4 asin (half4 x)
 The arcsin function, the inverse of sine. More...
 
float asinh (float x)
 The arcsinh function, the inverse of the hyperbolic sine. More...
 
float2 asinh (float2 x)
 The arcsinh function, the inverse of the hyperbolic sine. More...
 
half asinh (half x)
 The arcsinh function, the inverse of the hyperbolic sine. More...
 
half2 asinh (half2 x)
 The arcsinh function, the inverse of the hyperbolic sine. More...
 
half4 asinh (half4 x)
 The arcsinh function, the inverse of the hyperbolic sine. More...
 
float atan (float x)
 The arctan function, the inverse of tangent. More...
 
float2 atan (float2 x)
 The arctan function, the inverse of tangent. More...
 
half atan (half x)
 The arctan function, the inverse of tangent. More...
 
half2 atan (half2 x)
 The arctan function, the inverse of tangent. More...
 
half4 atan (half4 x)
 The arctan function, the inverse of tangent. More...
 
float atanh (float x)
 The arctanh function, the inverse of the hyperbolic tangent. More...
 
float2 atanh (float2 x)
 The arctanh function, the inverse of the hyperbolic tangent. More...
 
half atanh (half x)
 The arctanh function, the inverse of the hyperbolic tangent. More...
 
half2 atanh (half2 x)
 The arctanh function, the inverse of the hyperbolic tangent. More...
 
half4 atanh (half4 x)
 The arctanh function, the inverse of the hyperbolic tangent. More...
 
float cbrt (float x)
 The cubic root function. More...
 
float2 cbrt (float2 x)
 The cubic root function. More...
 
half cbrt (half x)
 The cubic root function. More...
 
half2 cbrt (half2 x)
 The cubic root function. More...
 
half4 cbrt (half4 x)
 The cubic root function. More...
 
float ceil (float x)
 Rounds up input to the closest integral value. More...
 
float2 ceil (float2 x)
 Rounds up input to the closest integral value. More...
 
half ceil (half x)
 Rounds up input to the closest integral value. More...
 
half2 ceil (half2 x)
 Rounds up input to the closest integral value. More...
 
half4 ceil (half4 x)
 Rounds up input to the closest integral value. More...
 
float cos (float x)
 The trigonometric cosine function. More...
 
float2 cos (float2 x)
 The trigonometric cosine function. More...
 
half cos (half x)
 The trigonometric cosine function. More...
 
half2 cos (half2 x)
 The trigonometric cosine function. More...
 
half4 cos (half4 x)
 The trigonometric cosine function. More...
 
float cosh (float x)
 The hyperbolic cosine function. More...
 
float2 cosh (float2 x)
 The hyperbolic cosine function. More...
 
half cosh (half x)
 The hyperbolic cosine function. More...
 
half2 cosh (half2 x)
 The hyperbolic cosine function. More...
 
half4 cosh (half4 x)
 The hyperbolic cosine function. More...
 
float erf (float x)
 The error function. More...
 
float2 erf (float2 x)
 The error function. More...
 
half erf (half x)
 The error function. More...
 
half2 erf (half2 x)
 The error function. More...
 
half4 erf (half4 x)
 The error function. More...
 
float erfc (float x)
 The complementary error function. More...
 
float2 erfc (float2 x)
 The complementary error function. More...
 
half erfc (half x)
 The complementary error function. More...
 
half2 erfc (half2 x)
 The complementary error function. More...
 
half4 erfc (half4 x)
 The complementary error function. More...
 
float2 exp (float2 x)
 The base-e exponential function. More...
 
half exp (half x)
 The base-e exponential function. More...
 
half4 exp (half4 x)
 The base-e exponential function. More...
 
float2 exp2 (float2 x)
 The base-2 exponential function. More...
 
half exp2 (half x)
 The base-2 exponential function. More...
 
half4 exp2 (half4 x)
 The base-2 exponential function. More...
 
float expm1 (float x)
 The base-e exponential function, minus one: exp(x) - 1. More...
 
float2 expm1 (float2 x)
 The base-e exponential function, minus one: exp(x) - 1. More...
 
half expm1 (half x)
 The base-e exponential function, minus one: exp(x) - 1. More...
 
half2 expm1 (half2 x)
 The base-e exponential function, minus one: exp(x) - 1. More...
 
half4 expm1 (half4 x)
 The base-e exponential function, minus one: exp(x) - 1. More...
 
float fabs (float x)
 Computes the absolute value of the input. More...
 
float2 fabs (float2 x)
 Computes the absolute value of the input. More...
 
half fabs (half x)
 Computes the absolute value of the input. More...
 
half2 fabs (half2 x)
 Computes the absolute value of the input. More...
 
half4 fabs (half4 x)
 Computes the absolute value of the input. More...
 
float floor (float x)
 Rounds down input to the closest integral value. More...
 
float2 floor (float2 x)
 Rounds down input to the closest integral value. More...
 
half floor (half x)
 Rounds down input to the closest integral value. More...
 
half2 floor (half2 x)
 Rounds down input to the closest integral value. More...
 
half4 floor (half4 x)
 Rounds down input to the closest integral value. More...
 
float log (float x)
 The natural logarithm. More...
 
float2 log (float2 x)
 The natural logarithm. More...
 
half log (half x)
 The natural logarithm. More...
 
half2 log (half2 x)
 The natural logarithm. More...
 
half4 log (half4 x)
 The natural logarithm. More...
 
float log10 (float x)
 The base-10 logarithm. More...
 
float2 log10 (float2 x)
 The base-10 logarithm. More...
 
half log10 (half x)
 The base-10 logarithm. More...
 
half2 log10 (half2 x)
 The base-10 logarithm. More...
 
half4 log10 (half4 x)
 The base-10 logarithm. More...
 
float log1p (float x)
 The natural logarithm of 1 + x. More...
 
float2 log1p (float2 x)
 The natural logarithm of 1 + x. More...
 
half log1p (half x)
 The natural logarithm of 1 + x. More...
 
half2 log1p (half2 x)
 The natural logarithm of 1 + x. More...
 
half4 log1p (half4 x)
 The natural logarithm of 1 + x. More...
 
float2 log2 (float2 x)
 The base-2 logarithm. More...
 
half log2 (half x)
 The base-2 logarithm. More...
 
half4 log2 (half4 x)
 The base-2 logarithm. More...
 
float nearbyint (float x)
 Rounds input to a nearby integral value, using the current rounding mode. More...
 
float2 nearbyint (float2 x)
 Rounds input to a nearby integral value, using the current rounding mode. More...
 
half nearbyint (half x)
 Rounds input to a nearby integral value, using the current rounding mode. More...
 
half2 nearbyint (half2 x)
 Rounds input to a nearby integral value, using the current rounding mode. More...
 
half4 nearbyint (half4 x)
 Rounds input to a nearby integral value, using the current rounding mode. More...
 
float rint (float x)
 Rounds input to a nearby integral value, using the current rounding mode. More...
 
float2 rint (float2 x)
 Rounds input to a nearby integral value, using the current rounding mode. More...
 
half rint (half x)
 Rounds input to a nearby integral value, using the current rounding mode. More...
 
half2 rint (half2 x)
 Rounds input to a nearby integral value, using the current rounding mode. More...
 
half4 rint (half4 x)
 Rounds input to a nearby integral value, using the current rounding mode. More...
 
float round (float x)
 Rounds input to nearest integral value, with halfway cases rounded away from zero. More...
 
float2 round (float2 x)
 Rounds input to nearest integral value, with halfway cases rounded away from zero. More...
 
half round (half x)
 Rounds input to nearest integral value, with halfway cases rounded away from zero. More...
 
half2 round (half2 x)
 Rounds input to nearest integral value, with halfway cases rounded away from zero. More...
 
half4 round (half4 x)
 Rounds input to nearest integral value, with halfway cases rounded away from zero. More...
 
float sin (float x)
 The trigonometric sine function. More...
 
float2 sin (float2 x)
 The trigonometric sine function. More...
 
half sin (half x)
 The trigonometric sine function. More...
 
half2 sin (half2 x)
 The trigonometric sine function. More...
 
half4 sin (half4 x)
 The trigonometric sine function. More...
 
float sinh (float x)
 The hyperbolic sine function. More...
 
float2 sinh (float2 x)
 The hyperbolic sine function. More...
 
half sinh (half x)
 The hyperbolic sine function. More...
 
half2 sinh (half2 x)
 The hyperbolic sine function. More...
 
half4 sinh (half4 x)
 The hyperbolic sine function. More...
 
float sqrt (float x)
 The square root function. More...
 
float2 sqrt (float2 x)
 The square root function. More...
 
half sqrt (half x)
 The square root function. More...
 
half2 sqrt (half2 x)
 The square root function. More...
 
half4 sqrt (half4 x)
 The square root function. More...
 
float rsqrt (float x)
 The reciprocal square root function. More...
 
float2 rsqrt (float2 x)
 The reciprocal square root function. More...
 
half rsqrt (half x)
 The reciprocal square root function. More...
 
half2 rsqrt (half2 x)
 The reciprocal square root function. More...
 
half4 rsqrt (half4 x)
 The reciprocal square root function. More...
 
float tan (float x)
 The trigonometric tangent function. More...
 
float2 tan (float2 x)
 The trigonometric tangent function. More...
 
half tan (half x)
 The trigonometric tangent function. More...
 
half2 tan (half2 x)
 The trigonometric tangent function. More...
 
half4 tan (half4 x)
 The trigonometric tangent function. More...
 
float2 tanh (float2 x)
 The hyperbolic tangent function. More...
 
half tanh (half x)
 The hyperbolic tangent function. More...
 
half4 tanh (half4 x)
 The hyperbolic tangent function. More...
 
float tgamma (float x)
 The gamma function. More...
 
float2 tgamma (float2 x)
 The gamma function. More...
 
half tgamma (half x)
 The gamma function. More...
 
half2 tgamma (half2 x)
 The gamma function. More...
 
half4 tgamma (half4 x)
 The gamma function. More...
 
float trunc (float x)
 Rounds input towards zero to the nearest integral value that is not larger in magnitude than x. More...
 
float2 trunc (float2 x)
 Rounds input towards zero to the nearest integral value that is not larger in magnitude than x. More...
 
half trunc (half x)
 Rounds input towards zero to the nearest integral value that is not larger in magnitude than x. More...
 
half2 trunc (half2 x)
 Rounds input towards zero to the nearest integral value that is not larger in magnitude than x. More...
 
half4 trunc (half4 x)
 Rounds input towards zero to the nearest integral value that is not larger in magnitude than x. More...
 
float sigmoid (float x)
 The sigmoid function, i.e. 1/(1 + exp(-x)). More...
 
float2 sigmoid (float2 x)
 The sigmoid function, i.e. 1/(1 + exp(-x)). More...
 
half sigmoid (half x)
 The sigmoid function, i.e. 1/(1 + exp(-x)). More...
 
half2 sigmoid (half2 x)
 The sigmoid function, i.e. 1/(1 + exp(-x)). More...
 
half4 sigmoid (half4 x)
 The sigmoid function, i.e. 1/(1 + exp(-x)). More...
 
float atan2 (float x, float y)
 The arctangent of y / x, in radians. More...
 
float2 atan2 (float2 x, float2 y)
 The arctangent of y / x, in radians. More...
 
half atan2 (half x, half y)
 The arctangent of y / x, in radians. More...
 
half2 atan2 (half2 x, half2 y)
 The arctangent of y / x, in radians. More...
 
half4 atan2 (half4 x, half4 y)
 The arctangent of y / x, in radians. More...
 
float copysign (float x, float y)
 Composes a value of magnitude x with the sign of y. More...
 
float2 copysign (float2 x, float2 y)
 Composes a value of magnitude x with the sign of y. More...
 
half copysign (half x, half y)
 Composes a value of magnitude x with the sign of y. More...
 
half2 copysign (half2 x, half2 y)
 Composes a value of magnitude x with the sign of y. More...
 
half4 copysign (half4 x, half4 y)
 Composes a value of magnitude x with the sign of y. More...
 
float fdim (float x, float y)
 Calculates the absolute difference between the two inputs. More...
 
float2 fdim (float2 x, float2 y)
 Calculates the absolute difference between the two inputs. More...
 
half fdim (half x, half y)
 Calculates the absolute difference between the two inputs. More...
 
half2 fdim (half2 x, half2 y)
 Calculates the absolute difference between the two inputs. More...
 
half4 fdim (half4 x, half4 y)
 Calculates the absolute difference between the two inputs. More...
 
float fmax (float x, float y)
 Calculates the maximum of the two inputs. More...
 
float2 fmax (float2 x, float2 y)
 Calculates the maximum of the two inputs. More...
 
half fmax (half x, half y)
 Calculates the maximum of the two inputs. More...
 
half2 fmax (half2 x, half2 y)
 Calculates the maximum of the two inputs. More...
 
half4 fmax (half4 x, half4 y)
 Calculates the maximum of the two inputs. More...
 
float fmin (float x, float y)
 Calculates the minimum of the two inputs. More...
 
float2 fmin (float2 x, float2 y)
 Calculates the minimum of the two inputs. More...
 
half fmin (half x, half y)
 Calculates the minimum of the two inputs. More...
 
half2 fmin (half2 x, half2 y)
 Calculates the minimum of the two inputs. More...
 
half4 fmin (half4 x, half4 y)
 Calculates the minimum of the two inputs. More...
 
float fmod (float x, float y)
 Calculates the remainder of the division x / y rounded towards zero. More...
 
float2 fmod (float2 x, float2 y)
 Calculates the remainder of the division x / y rounded towards zero. More...
 
half fmod (half x, half y)
 Calculates the remainder of the division x / y rounded towards zero. More...
 
half2 fmod (half2 x, half2 y)
 Calculates the remainder of the division x / y rounded towards zero. More...
 
half4 fmod (half4 x, half4 y)
 Calculates the remainder of the division x / y rounded towards zero. More...
 
float hypot (float x, float y)
 Calculates the hypotenuse of the right-angled triangle whose two shorter sides are of lengths given by the two inputs. More...
 
float2 hypot (float2 x, float2 y)
 Calculates the hypotenuse of the right-angled triangle whose two shorter sides are of lengths given by the two inputs. More...
 
half hypot (half x, half y)
 Calculates the hypotenuse of the right-angled triangle whose two shorter sides are of lengths given by the two inputs. More...
 
half2 hypot (half2 x, half2 y)
 Calculates the hypotenuse of the right-angled triangle whose two shorter sides are of lengths given by the two inputs. More...
 
half4 hypot (half4 x, half4 y)
 Calculates the hypotenuse of the right-angled triangle whose two shorter sides are of lengths given by the two inputs. More...
 
float pow (float x, float y)
 Calculates x to the power of y. More...
 
float2 pow (float2 x, float2 y)
 Calculates x to the power of y. More...
 
half pow (half x, half y)
 Calculates x to the power of y. More...
 
half2 pow (half2 x, half2 y)
 Calculates x to the power of y. More...
 
half4 pow (half4 x, half4 y)
 Calculates x to the power of y. More...
 
float remainder (float x, float y)
 Calculates the remainder of the division x / y, rounded to the nearest integral value, with halfway cases rounded to the even number. More...
 
float2 remainder (float2 x, float2 y)
 Calculates the remainder of the division x / y, rounded to the nearest integral value, with halfway cases rounded to the even number. More...
 
half remainder (half x, half y)
 Calculates the remainder of the division x / y, rounded to the nearest integral value, with halfway cases rounded to the even number. More...
 
half2 remainder (half2 x, half2 y)
 Calculates the remainder of the division x / y, rounded to the nearest integral value, with halfway cases rounded to the even number. More...
 
half4 remainder (half4 x, half4 y)
 Calculates the remainder of the division x / y, rounded to the nearest integral value, with halfway cases rounded to the even number. More...
 
float fma (float x, float y, float z)
 Computes (x * y) + z. More...
 
float2 fma (float2 x, float2 y, float2 z)
 Computes (x * y) + z. More...
 
half fma (half x, half y, half z)
 Computes (x * y) + z. More...
 
half2 fma (half2 x, half2 y, half2 z)
 Computes (x * y) + z. More...
 
half4 fma (half4 x, half4 y, half4 z)
 Computes (x * y) + z. More...
 
long long llrint (float x)
 Rounds input to a nearby integral value, using the current rounding mode. More...
 
longlong2 llrint (float2 x)
 Rounds input to a nearby integral value, using the current rounding mode. More...
 
long long llrint (half x)
 Rounds input to a nearby integral value, using the current rounding mode. More...
 
longlong2 llrint (half2 x)
 Rounds input to a nearby integral value, using the current rounding mode. More...
 
longlong4 llrint (half4 x)
 Rounds input to a nearby integral value, using the current rounding mode. More...
 
long long llround (float x)
 Rounds input to nearest integral value, with halfway cases rounded away from zero. More...
 
longlong2 llround (float2 x)
 Rounds input to nearest integral value, with halfway cases rounded away from zero. More...
 
long long llround (half x)
 Rounds input to nearest integral value, with halfway cases rounded away from zero. More...
 
longlong2 llround (half2 x)
 Rounds input to nearest integral value, with halfway cases rounded away from zero. More...
 
longlong4 llround (half4 x)
 Rounds input to nearest integral value, with halfway cases rounded away from zero. More...
 
long lrint (float x)
 Rounds input to a nearby integral value, using the current rounding mode. More...
 
long2 lrint (float2 x)
 Rounds input to a nearby integral value, using the current rounding mode. More...
 
long lrint (half x)
 Rounds input to a nearby integral value, using the current rounding mode. More...
 
long2 lrint (half2 x)
 Rounds input to a nearby integral value, using the current rounding mode. More...
 
long4 lrint (half4 x)
 Rounds input to a nearby integral value, using the current rounding mode. More...
 
long lround (float x)
 Rounds input to nearest integral value, with halfway cases rounded away from zero. More...
 
long2 lround (float2 x)
 Rounds input to nearest integral value, with halfway cases rounded away from zero. More...
 
long lround (half x)
 Rounds input to nearest integral value, with halfway cases rounded away from zero. More...
 
long2 lround (half2 x)
 Rounds input to nearest integral value, with halfway cases rounded away from zero. More...
 
long4 lround (half4 x)
 Rounds input to nearest integral value, with halfway cases rounded away from zero. More...
 

Detailed Description

IPU intrinsic functions.
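
The load_postinc and store_postinc overloads all take the address of a pointer plus a stride. A minimal host-side model of that pattern might look like the sketch below. It assumes the access happens at the current address first and that the stride i counts elements rather than bytes; neither point is stated on this page. The names loadPostincModel, storePostincModel and gatherStrided are hypothetical and are not part of the ipu namespace.

```cpp
#include <cassert>

// Hypothetical host-side model of a post-incrementing load.
// Assumption: the stride `i` is measured in elements of T.
template <typename T>
T loadPostincModel(const T **a, int i) {
  assert(a != nullptr && *a != nullptr);
  T value = **a;  // read the element the pointer currently refers to
  *a += i;        // then advance the caller's pointer by the stride
  return value;
}

// Hypothetical host-side model of a post-incrementing store.
template <typename T>
void storePostincModel(T **a, T v, int i) {
  assert(a != nullptr && *a != nullptr);
  **a = v;  // write to the element the pointer currently refers to
  *a += i;  // then advance the caller's pointer by the stride
}

// Copies every `stride`-th element of `src` into consecutive slots of `dst`.
inline void gatherStrided(const float *src, float *dst, int count,
                          int stride) {
  for (int n = 0; n < count; ++n) {
    storePostincModel(&dst, loadPostincModel(&src, stride), 1);
  }
}
```

The point of the pattern is that the pointer update is folded into the access, so a loop such as gatherStrided needs no separate index arithmetic.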

Function Documentation

◆ absadd() [1/4]

float ipu::absadd (float src0, float src1)
inline

Targets the f32absadd instruction.

Parameters
src0  A value of type float.
src1  A value of type float.
Returns
The scalar sum of the absolute values of src0 and src1.

◆ absadd() [2/4]

float2 ipu::absadd (float2 src0, float2 src1)
inline

Targets the f32v2absadd instruction.

Parameters
src0  A value of type float2.
src1  A value of type float2.
Returns
The element-wise sum of the absolute values in src0 and src1.

◆ absadd() [3/4]

half2 ipu::absadd (half2 src0, half2 src1)
inline

Targets the f16v2absadd instruction.

Parameters
src0  A value of type half2.
src1  A value of type half2.
Returns
The element-wise sum of the absolute values in src0 and src1.

◆ absadd() [4/4]

half4 ipu::absadd (half4 src0, half4 src1)
inline

Targets the f16v4absadd instruction.

Parameters
src0  A value of type half4.
src1  A value of type half4.
Returns
The element-wise sum of the absolute values in src0 and src1.
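
As a rough host-side illustration of the absadd overloads, the sketch below reads the Returns text as |src0| + |src1|, applied per lane for the vector forms. That reading is an assumption, as are the names Float2Model, absaddModel and l1NormModel; the real float2 type and the intrinsics come from the IPU toolchain.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for the two-lane device vector type.
struct Float2Model {
  float x, y;
};

// Scalar model: sum of the two magnitudes (assumed reading of the docs).
inline float absaddModel(float a, float b) {
  return std::fabs(a) + std::fabs(b);
}

// Vector model: the scalar operation applied to each lane independently.
inline Float2Model absaddModel(Float2Model a, Float2Model b) {
  return {absaddModel(a.x, b.x), absaddModel(a.y, b.y)};
}

// Because the accumulator never goes negative, folding with absaddModel
// accumulates the sum of magnitudes (an L1 norm) of the buffer.
inline float l1NormModel(const float *data, int n) {
  assert(n >= 0);
  float acc = 0.0f;
  for (int k = 0; k < n; ++k) {
    acc = absaddModel(acc, data[k]);
  }
  return acc;
}
```

Under this model, l1NormModel over {1, -2, 3} gives 6.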

◆ absmax() [1/4]

float ipu::absmax (float src0, float src1)
inline

Targets the f32absmax instruction.

Parameters
src0  A value of type float.
src1  A value of type float.
Returns
The maximum of the absolute values of src0 and src1.

◆ absmax() [2/4]

float2 ipu::absmax ( float2  src0,
float2  src1 
)
inline

Targets the f32v2absmax instruction.

Parameters
src0A value of type float2.
src1A value of type float2.
Returns
The element-wise maximum of absolute values in src0 and src1.

◆ absmax() [3/4]

half2 ipu::absmax ( half2  src0,
half2  src1 
)
inline

Targets the f16v2absmax instruction.

Parameters
src0A value of type half2.
src1A value of type half2.
Returns
The element-wise maximum of absolute values in src0 and src1.

◆ absmax() [4/4]

half4 ipu::absmax ( half4  src0,
half4  src1 
)
inline

Targets the f16v4absmax instruction.

Parameters
src0A value of type half4.
src1A value of type half4.
Returns
The element-wise maximum of absolute values in src0 and src1.
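
As a rough host-side illustration of the absmax descriptions above, the sketch below takes the maximum of the two magnitudes, for a scalar and for a two-element vector. It is a reading of the documented return value, not the IPU implementation. The names `absmax_ref` and `Float2` are hypothetical, and NaN handling is not modelled because the text does not specify it.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cmath>

// Host-side reference model only (assumption: result is max(|src0|, |src1|)).
inline float absmax_ref(float src0, float src1) {
  return std::max(std::fabs(src0), std::fabs(src1));
}

// Hypothetical host stand-in for the float2 vector type.
using Float2 = std::array<float, 2>;

// Element-wise form: each lane is handled independently.
inline Float2 absmax_ref(Float2 src0, Float2 src1) {
  return {absmax_ref(src0[0], src1[0]), absmax_ref(src0[1], src1[1])};
}
```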

◆ acos() [1/5]

float ipu::acos ( float  x)
inline

The arccos function, the inverse of cosine.

Parameters
xA value of type float.
Returns
The result of acos of x.

◆ acos() [2/5]

float2 ipu::acos ( float2  x)
inline

The arccos function, the inverse of cosine.

Parameters
xA value of type float2.
Returns
The element-wise result of acos of x.

◆ acos() [3/5]

half ipu::acos ( half  x)
inline

The arccos function, the inverse of cosine.

Parameters
xA value of type half.
Returns
The result of acos of x.

◆ acos() [4/5]

half2 ipu::acos ( half2  x)
inline

The arccos function, the inverse of cosine.

Parameters
xA value of type half2.
Returns
The element-wise result of acos of x.

◆ acos() [5/5]

half4 ipu::acos ( half4  x)
inline

The arccos function, the inverse of cosine.

Parameters
xA value of type half4.
Returns
The element-wise result of acos of x.

◆ acosh() [1/5]

float ipu::acosh ( float  x)
inline

The arccosh function, the inverse of the hyperbolic cosine.

Parameters
xA value of type float.
Returns
The result of acosh of x.

◆ acosh() [2/5]

float2 ipu::acosh ( float2  x)
inline

The arccosh function, the inverse of the hyperbolic cosine.

Parameters
xA value of type float2.
Returns
The element-wise result of acosh of x.

◆ acosh() [3/5]

half ipu::acosh ( half  x)
inline

The arccosh function, the inverse of the hyperbolic cosine.

Parameters
xA value of type half.
Returns
The result of acosh of x.

◆ acosh() [4/5]

half2 ipu::acosh ( half2  x)
inline

The arccosh function, the inverse of the hyperbolic cosine.

Parameters
xA value of type half2.
Returns
The element-wise result of acosh of x.

◆ acosh() [5/5]

half4 ipu::acosh ( half4  x)
inline

The arccosh function, the inverse of the hyperbolic cosine.

Parameters
xA value of type half4.
Returns
The element-wise result of acosh of x.

◆ andc() [1/4]

float ipu::andc ( float  src0,
float  src1 
)
inline

Targets the andc instruction.

Parameters
src0A value of type float.
src1A value of type float.
Returns
The bitwise AND of src0 and the bitwise complement of src1, computed on the bit patterns, of type float.

◆ andc() [2/4]

float2 ipu::andc ( float2  src0,
float2  src1 
)
inline

Targets the andc64 instruction.

Parameters
src0A value of type float2.
src1A value of type float2.
Returns
The bitwise AND of src0 and the bitwise complement of src1, computed on the bit patterns, of type float2.

◆ andc() [3/4]

int ipu::andc ( int  src0,
int  src1 
)
inline

Targets the andc instruction.

Parameters
src0A value of type int.
src1A value of type int, can be a 12-bit constant.
Returns
The bitwise AND of src0 and the bitwise complement of src1 (src0 & ~src1), of type int.

◆ andc() [4/4]

unsigned ipu::andc ( unsigned  src0,
unsigned  src1 
)
inline

Targets the andc instruction.

Parameters
src0A value of type unsigned.
src1A value of type unsigned, can be a 12-bit constant.
Returns
The bitwise AND of src0 and the bitwise complement of src1 (src0 & ~src1), of type unsigned.
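
The and-with-complement operation described for andc can be sketched on the host as follows. This is a reference model under the assumption that the float overload operates on the raw 32-bit pattern; `andc_ref` is a hypothetical name, not part of the ipu namespace.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Host-side reference model only; not the IPU implementation.
// Integer form: keep the bits of src0 that are clear in src1.
inline std::uint32_t andc_ref(std::uint32_t src0, std::uint32_t src1) {
  return src0 & ~src1;
}

// Float form (assumption): the same operation on the raw bit patterns.
inline float andc_ref(float src0, float src1) {
  std::uint32_t a, b;
  std::memcpy(&a, &src0, sizeof a);
  std::memcpy(&b, &src1, sizeof b);
  const std::uint32_t r = a & ~b;
  float out;
  std::memcpy(&out, &r, sizeof out);
  return out;
}
```

In this model, passing -0.0f as src1 clears only the sign bit of src0, so `andc_ref(-2.5f, -0.0f)` yields 2.5f.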

◆ asin() [1/5]

float ipu::asin ( float  x)
inline

The arcsin function, the inverse of sine.

Parameters
xA value of type float.
Returns
The result of asin of x.

◆ asin() [2/5]

float2 ipu::asin ( float2  x)
inline

The arcsin function, the inverse of sine.

Parameters
xA value of type float2.
Returns
The element-wise result of asin of x.

◆ asin() [3/5]

half ipu::asin ( half  x)
inline

The arcsin function, the inverse of sine.

Parameters
xA value of type half.
Returns
The result of asin of x.

◆ asin() [4/5]

half2 ipu::asin ( half2  x)
inline

The arcsin function, the inverse of sine.

Parameters
xA value of type half2.
Returns
The element-wise result of asin of x.

◆ asin() [5/5]

half4 ipu::asin ( half4  x)
inline

The arcsin function, the inverse of sine.

Parameters
xA value of type half4.
Returns
The element-wise result of asin of x.

◆ asinh() [1/5]

float ipu::asinh ( float  x)
inline

The arcsinh function, the inverse of the hyperbolic sine.

Parameters
xA value of type float.
Returns
The result of asinh of x.

◆ asinh() [2/5]

float2 ipu::asinh ( float2  x)
inline

The arcsinh function, the inverse of the hyperbolic sine.

Parameters
xA value of type float2.
Returns
The element-wise result of asinh of x.

◆ asinh() [3/5]

half ipu::asinh ( half  x)
inline

The arcsinh function, the inverse of the hyperbolic sine.

Parameters
xA value of type half.
Returns
The result of asinh of x.

◆ asinh() [4/5]

half2 ipu::asinh ( half2  x)
inline

The arcsinh function, the inverse of the hyperbolic sine.

Parameters
xA value of type half2.
Returns
The element-wise result of asinh of x.

◆ asinh() [5/5]

half4 ipu::asinh ( half4  x)
inline

The arcsinh function, the inverse of the hyperbolic sine.

Parameters
xA value of type half4.
Returns
The element-wise result of asinh of x.

◆ atan() [1/5]

float ipu::atan ( float  x)
inline

The arctan function, the inverse of tangent.

Parameters
xA value of type float.
Returns
The result of atan of x.

◆ atan() [2/5]

float2 ipu::atan ( float2  x)
inline

The arctan function, the inverse of tangent.

Parameters
xA value of type float2.
Returns
The element-wise result of atan of x.

◆ atan() [3/5]

half ipu::atan ( half  x)
inline

The arctan function, the inverse of tangent.

Parameters
xA value of type half.
Returns
The result of atan of x.

◆ atan() [4/5]

half2 ipu::atan ( half2  x)
inline

The arctan function, the inverse of tangent.

Parameters
xA value of type half2.
Returns
The element-wise result of atan of x.

◆ atan() [5/5]

half4 ipu::atan ( half4  x)
inline

The arctan function, the inverse of tangent.

Parameters
xA value of type half4.
Returns
The element-wise result of atan of x.

◆ atan2() [1/5]

float ipu::atan2 ( float  x,
float  y 
)
inline

The arctangent of ( y )/( x ), in radians.

Parameters
xA value of type float.
yA value of type float.
Returns
The result of atan2 of x and y.

◆ atan2() [2/5]

float2 ipu::atan2 ( float2  x,
float2  y 
)
inline

The arctangent of ( y )/( x ), in radians.

Parameters
xA value of type float2.
yA value of type float2.
Returns
The element-wise result of atan2 of x and y.

◆ atan2() [3/5]

half ipu::atan2 ( half  x,
half  y 
)
inline

The arctangent of ( y )/( x ), in radians.

Parameters
xA value of type half.
yA value of type half.
Returns
The result of atan2 of x and y.

◆ atan2() [4/5]

half2 ipu::atan2 ( half2  x,
half2  y 
)
inline

The arctangent of ( y )/( x ), in radians.

Parameters
xA value of type half2.
yA value of type half2.
Returns
The element-wise result of atan2 of x and y.

◆ atan2() [5/5]

half4 ipu::atan2 ( half4  x,
half4  y 
)
inline

The arctangent of ( y )/( x ), in radians.

Parameters
xA value of type half4.
yA value of type half4.
Returns
The element-wise result of atan2 of x and y.

◆ atanh() [1/5]

float ipu::atanh ( float  x)
inline

The arctanh function, the inverse of the hyperbolic tangent.

Parameters
xA value of type float.
Returns
The result of atanh of x.

◆ atanh() [2/5]

float2 ipu::atanh ( float2  x)
inline

The arctanh function, the inverse of the hyperbolic tangent.

Parameters
xA value of type float2.
Returns
The element-wise result of atanh of x.

◆ atanh() [3/5]

half ipu::atanh ( half  x)
inline

The arctanh function, the inverse of the hyperbolic tangent.

Parameters
xA value of type half.
Returns
The result of atanh of x.

◆ atanh() [4/5]

half2 ipu::atanh ( half2  x)
inline

The arctanh function, the inverse of the hyperbolic tangent.

Parameters
xA value of type half2.
Returns
The element-wise result of atanh of x.

◆ atanh() [5/5]

half4 ipu::atanh ( half4  x)
inline

The arctanh function, the inverse of the hyperbolic tangent.

Parameters
xA value of type half4.
Returns
The element-wise result of atanh of x.

◆ axpy()

float2 ipu::axpy ( float2  src0,
float2  src1 
)
inline

Targets the f32v2axpy instruction.

Parameters
src0A value of type float2.
src1A value of type float2.
Returns
The single precision two-element vector res = a*src0 + src1. The scalar multiplicand a is provided by the internal state element $TAS.
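
The formula res = a*src0 + src1 can be modelled on the host as below. The scalar a lives in the $TAS state on the device; here it is passed explicitly through a hypothetical `AxpyState` struct, which is an assumption made only to keep the sketch self-contained. `Float2` and `axpy_ref` are likewise illustrative names.

```cpp
#include <array>
#include <cassert>

// Hypothetical host stand-in for the float2 vector type.
using Float2 = std::array<float, 2>;

// Hypothetical stand-in for the internal state element $TAS.
struct AxpyState {
  float tas;  // the scalar multiplicand a
};

// Host-side reference model: res[i] = a * src0[i] + src1[i].
inline Float2 axpy_ref(const AxpyState& state, Float2 src0, Float2 src1) {
  return {state.tas * src0[0] + src1[0], state.tas * src0[1] + src1[1]};
}
```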

◆ bitrev8()

int ipu::bitrev8 ( int  src)
inline

Targets the bitrev8 instruction.

Parameters
srcA value of type int.
Returns
A result of type int that is equivalent to the value of src with the bit order of each byte reversed.
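
A host-side sketch of the per-byte bit reversal described above is shown below. Each byte keeps its position in the word while its eight bits are mirrored. The name `bitrev8_ref` is hypothetical, and the sketch uses an unsigned type to avoid sign-related shifts.

```cpp
#include <cassert>
#include <cstdint>

// Host-side reference model only; not the IPU implementation.
// Reverses the bit order inside each byte; byte positions are unchanged.
inline std::uint32_t bitrev8_ref(std::uint32_t src) {
  // Swap adjacent bits, then bit pairs, then nibbles.
  src = ((src & 0x55555555u) << 1) | ((src >> 1) & 0x55555555u);
  src = ((src & 0x33333333u) << 2) | ((src >> 2) & 0x33333333u);
  src = ((src & 0x0F0F0F0Fu) << 4) | ((src >> 4) & 0x0F0F0F0Fu);
  return src;
}
```

For example, in this model 0x01020408 maps to 0x80402010.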

◆ cbrt() [1/5]

float ipu::cbrt ( float  x)
inline

The cube root function.

Parameters
xA value of type float.
Returns
The result of cbrt of x.

◆ cbrt() [2/5]

float2 ipu::cbrt ( float2  x)
inline

The cube root function.

Parameters
xA value of type float2.
Returns
The element-wise result of cbrt of x.

◆ cbrt() [3/5]

half ipu::cbrt ( half  x)
inline

The cube root function.

Parameters
xA value of type half.
Returns
The result of cbrt of x.

◆ cbrt() [4/5]

half2 ipu::cbrt ( half2  x)
inline

The cube root function.

Parameters
xA value of type half2.
Returns
The element-wise result of cbrt of x.

◆ cbrt() [5/5]

half4 ipu::cbrt ( half4  x)
inline

The cube root function.

Parameters
xA value of type half4.
Returns
The element-wise result of cbrt of x.

◆ ceil() [1/5]

float ipu::ceil ( float  x)
inline

Rounds the input up to the nearest integral value.

Parameters
xA value of type float.
Returns
The smallest integral value that is not less than x.

◆ ceil() [2/5]

float2 ipu::ceil ( float2  x)
inline

Rounds the input up to the nearest integral value.

Parameters
xA value of type float2.
Returns
A vector where every ith element is the smallest integral value that is not less than x[i].

◆ ceil() [3/5]

half ipu::ceil ( half  x)
inline

Rounds the input up to the nearest integral value.

Parameters
xA value of type half.
Returns
The smallest integral value that is not less than x.

◆ ceil() [4/5]

half2 ipu::ceil ( half2  x)
inline

Rounds the input up to the nearest integral value.

Parameters
xA value of type half2.
Returns
A vector where every ith element is the smallest integral value that is not less than x[i].

◆ ceil() [5/5]

half4 ipu::ceil ( half4  x)
inline

Rounds the input up to the nearest integral value.

Parameters
xA value of type half4.
Returns
A vector where every ith element is the smallest integral value that is not less than x[i].

◆ clamp() [1/4]

float ipu::clamp ( float  src0,
float2  src1 
)
inline

Targets the f32clamp instruction.

Parameters
src0A value of type float.
src1A value of type float2.
Returns
The median of src0 and the two elements in src1.

◆ clamp() [2/4]

float2 ipu::clamp ( float2  src0,
float2  src1 
)
inline

Targets the f32v2clamp instruction.

Parameters
src0A value of type float2.
src1A value of type float2.
Returns
The min-of-maximum result of src0 and src1, of type float2. The first element is the median of the first element of src0 and the two elements in src1. The second element is the median of the second element of src0 and the two elements in src1.

◆ clamp() [3/4]

half2 ipu::clamp ( half2  src0,
half2  src1 
)
inline

Targets the f16v2clamp instruction.

Parameters
src0A value of type half2.
src1A value of type half2.
Returns
The min-of-maximum result of src0 and src1, of type half2. The first element is the median value of the first element of src0 and the two elements in src1. The second element is the median of the second element of src0 and the two elements in src1.

◆ clamp() [4/4]

half4 ipu::clamp ( half4  src0,
half2  src1 
)
inline

Targets the f16v4clamp instruction.

Parameters
src0A value of type half4.
src1A value of type half2.
Returns
The min-of-maximum result of src0 and src1, of type half4. Each element is the median of the element in src0 at the same index, and the two values in src1.
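
The "median of src0 and the two elements in src1" wording used by the clamp overloads can be sketched on the host as a median-of-three. This is an illustrative reference model with hypothetical names (`median3`, `clamp_ref`, `Float2`). It does not model NaN behaviour, which the text above does not describe.

```cpp
#include <algorithm>
#include <array>
#include <cassert>

// Hypothetical host stand-in for the float2 vector type.
using Float2 = std::array<float, 2>;

// Median of three values, written with min/max only.
inline float median3(float a, float b, float c) {
  return std::max(std::min(a, b), std::min(std::max(a, b), c));
}

// Scalar form: src0 is limited to the range spanned by src1's two elements.
inline float clamp_ref(float src0, Float2 src1) {
  return median3(src0, src1[0], src1[1]);
}

// Vector form: every element of src0 uses the same two bounds from src1.
inline Float2 clamp_ref(Float2 src0, Float2 src1) {
  return {clamp_ref(src0[0], src1), clamp_ref(src0[1], src1)};
}
```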

◆ clz()

unsigned ipu::clz ( int  src)
inline

Targets the clz instruction.

Parameters
srcA value of type int.
Returns
The number of leading zero bits in src, counted from the most significant bit.
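
A host-side sketch of a leading-zero count over a 32-bit word is given below. The name `clz_ref` is hypothetical. Returning 32 for a zero input is an assumption of this sketch; the text above does not state what the instruction returns in that case.

```cpp
#include <cassert>
#include <cstdint>

// Host-side reference model only; not the IPU implementation.
// Counts zero bits from bit 31 downwards until the first set bit.
inline unsigned clz_ref(std::uint32_t src) {
  unsigned count = 0;
  for (std::uint32_t mask = 0x80000000u; mask != 0 && (src & mask) == 0;
       mask >>= 1) {
    ++count;
  }
  return count;  // 32 when src == 0 (assumption, see lead-in)
}
```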

◆ cmac() [1/2]

void ipu::cmac ( half2  src0,
half2  src1 
)
inline

Targets the f16v2cmac instruction.

Parameters
src0A value of type half2.
src1A value of type half2.

◆ cmac() [2/2]

void ipu::cmac ( half4  src0,
half4  src1 
)
inline

Targets the f16v4cmac instruction.

Parameters
src0A value of type half4.
src1A value of type half4.

◆ cmpeq() [1/4]

float ipu::cmpeq ( float  src0,
float  src1 
)
inline

Targets the f32cmpeq instruction.

Parameters
src0A value of type float.
src1A value of type float.
Returns
Equality test of src0 and src1. If src0 == src1 the result will be 0xffff, and 0x0000 otherwise.

◆ cmpeq() [2/4]

float2 ipu::cmpeq ( float2  src0,
float2  src1 
)
inline

Targets the f32v2cmpeq instruction.

Parameters
src0A value of type float2.
src1A value of type float2.
Returns
Element-wise equality test of src0 and src1. If src0[i] == src1[i], the result vector element at index i will be 0xffff, and 0x0000 otherwise.

◆ cmpeq() [3/4]

half2 ipu::cmpeq ( half2  src0,
half2  src1 
)
inline

Targets the f16v2cmpeq instruction.

Parameters
src0A value of type half2.
src1A value of type half2.
Returns
Element-wise equality test of src0 and src1. If src0[i] == src1[i], the result vector element at index i will be 0xffff, and 0x0000 otherwise.

◆ cmpeq() [4/4]

half4 ipu::cmpeq ( half4  src0,
half4  src1 
)
inline

Targets the f16v4cmpeq instruction.

Parameters
src0A value of type half4.
src1A value of type half4.
Returns
Element-wise equality test of src0 and src1. If src0[i] == src1[i], the result vector element at index i will be 0xffff, and 0x0000 otherwise.

◆ cmpge() [1/4]

float ipu::cmpge ( float  src0,
float  src1 
)
inline

Targets the f32cmpge instruction.

Parameters
src0A value of type float.
src1A value of type float.
Returns
Greater-than-or-equal-to test of src0 and src1. If src0 >= src1 the result will be 0xffff, and 0x0000 otherwise.

◆ cmpge() [2/4]

float2 ipu::cmpge ( float2  src0,
float2  src1 
)
inline

Targets the f32v2cmpge instruction.

Parameters
src0A value of type float2.
src1A value of type float2.
Returns
Element-wise greater-than-or-equal-to test of src0 and src1. If src0[i] >= src1[i] the result vector element at index i will be 0xffff, and 0x0000 otherwise.

◆ cmpge() [3/4]

half2 ipu::cmpge ( half2  src0,
half2  src1 
)
inline

Targets the f16v2cmpge instruction.

Parameters
src0A value of type half2.
src1A value of type half2.
Returns
Element-wise greater-than-or-equal-to test of src0 and src1. If src0[i] >= src1[i] the result vector element at index i will be 0xffff, and 0x0000 otherwise.

◆ cmpge() [4/4]

half4 ipu::cmpge ( half4  src0,
half4  src1 
)
inline

Targets the f16v4cmpge instruction.

Parameters
src0A value of type half4.
src1A value of type half4.
Returns
Element-wise greater-than-or-equal-to test of src0 and src1. If src0[i] >= src1[i] the result vector element at index i will be 0xffff, and 0x0000 otherwise.

◆ cmpgt() [1/4]

float ipu::cmpgt ( float  src0,
float  src1 
)
inline

Targets the f32cmpgt instruction.

Parameters
src0A value of type float.
src1A value of type float.
Returns
Greater-than test of src0 and src1. If src0 > src1 the result will be 0xffff, and 0x0000 otherwise.

◆ cmpgt() [2/4]

float2 ipu::cmpgt ( float2  src0,
float2  src1 
)
inline

Targets the f32v2cmpgt instruction.

Parameters
src0A value of type float2.
src1A value of type float2.
Returns
Element-wise greater-than test of src0 and src1. If src0[i] > src1[i], the result vector element at index i will be 0xffff, and 0x0000 otherwise.

◆ cmpgt() [3/4]

half2 ipu::cmpgt ( half2  src0,
half2  src1 
)
inline

Targets the f16v2cmpgt instruction.

Parameters
src0A value of type half2.
src1A value of type half2.
Returns
Element-wise greater-than test of src0 and src1. If src0[i] > src1[i], the result vector element at index i will be 0xffff, and 0x0000 otherwise.

◆ cmpgt() [4/4]

half4 ipu::cmpgt ( half4  src0,
half4  src1 
)
inline

Targets the f16v4cmpgt instruction.

Parameters
src0A value of type half4.
src1A value of type half4.
Returns
Element-wise greater-than test of src0 and src1. If src0[i] > src1[i], the result vector element at index i will be 0xffff, and 0x0000 otherwise.

◆ cmple() [1/4]

float ipu::cmple ( float  src0,
float  src1 
)
inline

Targets the f32cmple instruction.

Parameters
src0A value of type float.
src1A value of type float.
Returns
Less-than-or-equal-to test of src0 and src1. If src0 <= src1 the result will be 0xffff, and 0x0000 otherwise.

◆ cmple() [2/4]

float2 ipu::cmple ( float2  src0,
float2  src1 
)
inline

Targets the f32v2cmple instruction.

Parameters
src0A value of type float2.
src1A value of type float2.
Returns
Element-wise less-than-or-equal-to test of src0 and src1. If src0[i] <= src1[i], the result vector element at index i will be 0xffff, and 0x0000 otherwise.

◆ cmple() [3/4]

half2 ipu::cmple ( half2  src0,
half2  src1 
)
inline

Targets the f16v2cmple instruction.

Parameters
src0A value of type half2.
src1A value of type half2.
Returns
Element-wise less-than-or-equal-to test of src0 and src1. If src0[i] <= src1[i], the result vector element at index i will be 0xffff, and 0x0000 otherwise.

◆ cmple() [4/4]

half4 ipu::cmple ( half4  src0,
half4  src1 
)
inline

Targets the f16v4cmple instruction.

Parameters
src0A value of type half4.
src1A value of type half4.
Returns
Element-wise less-than-or-equal-to test of src0 and src1. If src0[i] <= src1[i], the result vector element at index i will be 0xffff, and 0x0000 otherwise.

◆ cmplt() [1/4]

float ipu::cmplt ( float  src0,
float  src1 
)
inline

Targets the f32cmplt instruction.

Parameters
src0A value of type float.
src1A value of type float.
Returns
Less-than test of src0 and src1. If src0 < src1 the result will be 0xffff, and 0x0000 otherwise.

◆ cmplt() [2/4]

float2 ipu::cmplt ( float2  src0,
float2  src1 
)
inline

Targets the f32v2cmplt instruction.

Parameters
src0A value of type float2.
src1A value of type float2.
Returns
Element-wise less-than test of src0 and src1. If src0[i] < src1[i], the result vector element at index i will be 0xffff, and 0x0000 otherwise.

◆ cmplt() [3/4]

half2 ipu::cmplt ( half2  src0,
half2  src1 
)
inline

Targets the f16v2cmplt instruction.

Parameters
src0A value of type half2.
src1A value of type half2.
Returns
Element-wise less-than test of src0 and src1. If src0[i] < src1[i], the result vector element at index i will be 0xffff, and 0x0000 otherwise.

◆ cmplt() [4/4]

half4 ipu::cmplt ( half4  src0,
half4  src1 
)
inline

Targets the f16v4cmplt instruction.

Parameters
src0A value of type half4.
src1A value of type half4.
Returns
Element-wise less-than test of src0 and src1. If src0[i] < src1[i], the result vector element at index i will be 0xffff, and 0x0000 otherwise.

◆ cmpne() [1/4]

float ipu::cmpne ( float  src0,
float  src1 
)
inline

Targets the f32cmpne instruction.

Parameters
src0A value of type float.
src1A value of type float.
Returns
Inequality test of src0 and src1. If src0 != src1 the result will be 0xffff, and 0x0000 otherwise.

◆ cmpne() [2/4]

float2 ipu::cmpne ( float2  src0,
float2  src1 
)
inline

Targets the f32v2cmpne instruction.

Parameters
src0A value of type float2.
src1A value of type float2.
Returns
Element-wise inequality test of src0 and src1. If src0[i] != src1[i], the result vector element at index i will be 0xffff, and 0x0000 otherwise.

◆ cmpne() [3/4]

half2 ipu::cmpne ( half2  src0,
half2  src1 
)
inline

Targets the f16v2cmpne instruction.

Parameters
src0A value of type half2.
src1A value of type half2.
Returns
Element-wise inequality test of src0 and src1. If src0[i] != src1[i], the result vector element at index i will be 0xffff, and 0x0000 otherwise.

◆ cmpne() [4/4]

half4 ipu::cmpne ( half4  src0,
half4  src1 
)
inline

Targets the f16v4cmpne instruction.

Parameters
src0A value of type half4.
src1A value of type half4.
Returns
Element-wise inequality test of src0 and src1. If src0[i] != src1[i], the result vector element at index i will be 0xffff, and 0x0000 otherwise.
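
The comparison intrinsics above (cmpeq, cmpge, cmpgt, cmple, cmplt, cmpne) all share one shape: each element is compared independently and the result element is a bit mask rather than a boolean. The sketch below models the two-element half-precision case on the host. It is only an illustration: `Lanes2` uses float as a stand-in for half, `Mask2` holds the 16-bit result patterns, and all names are hypothetical. NaN inputs follow ordinary C++ comparison rules here, which is an assumption about the device behaviour.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using Lanes2 = std::array<float, 2>;         // stand-in for half2 values
using Mask2 = std::array<std::uint16_t, 2>;  // per-element result patterns

// Applies a predicate per element and expands true/false to 0xffff/0x0000.
template <typename Pred>
inline Mask2 compare_ref(Lanes2 src0, Lanes2 src1, Pred pred) {
  const std::uint16_t all_ones = 0xffff;
  Mask2 out{};
  for (std::size_t i = 0; i < out.size(); ++i) {
    out[i] = pred(src0[i], src1[i]) ? all_ones : std::uint16_t{0};
  }
  return out;
}

inline Mask2 cmplt_ref(Lanes2 src0, Lanes2 src1) {
  return compare_ref(src0, src1, [](float a, float b) { return a < b; });
}

inline Mask2 cmpne_ref(Lanes2 src0, Lanes2 src1) {
  return compare_ref(src0, src1, [](float a, float b) { return a != b; });
}
```

A mask of this form can be combined with bitwise operations to select between two values without branching.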

◆ cms()

unsigned ipu::cms ( int  src)
inline

Targets the cms instruction.

Parameters
srcA value of type int.
Returns
The number of higher order bits in src that match the sign bit (bit 31), as an unsigned.

◆ copysign() [1/5]

float ipu::copysign ( float  x,
float  y 
)
inline

Composes a value of magnitude x with the sign of y.

Parameters
xA value of type float.
yA value of type float.
Returns
The result of copysign of x and y.

◆ copysign() [2/5]

float2 ipu::copysign ( float2  x,
float2  y 
)
inline

Composes a value of magnitude x with the sign of y.

Parameters
xA value of type float2.
yA value of type float2.
Returns
The element-wise result of copysign of x and y.

◆ copysign() [3/5]

half ipu::copysign ( half  x,
half  y 
)
inline

Composes a value of magnitude x with the sign of y.

Parameters
xA value of type half.
yA value of type half.
Returns
The result of copysign of x and y.

◆ copysign() [4/5]

half2 ipu::copysign ( half2  x,
half2  y 
)
inline

Composes a value of magnitude x with the sign of y.

Parameters
xA value of type half2.
yA value of type half2.
Returns
The element-wise result of copysign of x and y.

◆ copysign() [5/5]

half4 ipu::copysign ( half4  x,
half4  y 
)
inline

Composes a value of magnitude x with the sign of y.

Parameters
xA value of type half4.
yA value of type half4.
Returns
The element-wise result of copysign of x and y.
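
For IEEE 754 single precision, "magnitude of x with the sign of y" amounts to taking the low 31 bits from x and the top bit from y. The host-side sketch below shows this; `copysign_ref` is a hypothetical name, and the sketch says nothing about how the ipu overloads are implemented.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Host-side reference model only; on the host, std::copysign from <cmath>
// gives the same result.
inline float copysign_ref(float x, float y) {
  std::uint32_t xb, yb;
  std::memcpy(&xb, &x, sizeof xb);
  std::memcpy(&yb, &y, sizeof yb);
  // Magnitude bits from x, sign bit from y.
  const std::uint32_t rb = (xb & 0x7fffffffu) | (yb & 0x80000000u);
  float out;
  std::memcpy(&out, &rb, sizeof out);
  return out;
}
```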

◆ cos() [1/5]

float ipu::cos ( float  x)
inline

The trigonometric cosine function.

Parameters
xA value of type float.
Returns
The result of cos of x.

◆ cos() [2/5]

float2 ipu::cos ( float2  x)
inline

The trigonometric cosine function.

Parameters
xA value of type float2.
Returns
The element-wise result of cos of x.

◆ cos() [3/5]

half ipu::cos ( half  x)
inline

The trigonometric cosine function.

Parameters
xA value of type half.
Returns
The result of cos of x.

◆ cos() [4/5]

half2 ipu::cos ( half2  x)
inline

The trigonometric cosine function.

Parameters
xA value of type half2.
Returns
The element-wise result of cos of x.

◆ cos() [5/5]

half4 ipu::cos ( half4  x)
inline

The trigonometric cosine function.

Parameters
xA value of type half4.
Returns
The element-wise result of cos of x.

◆ cosh() [1/5]

float ipu::cosh ( float  x)
inline

The hyperbolic cosine function.

Parameters
xA value of type float.
Returns
The result of cosh of x.

◆ cosh() [2/5]

float2 ipu::cosh ( float2  x)
inline

The hyperbolic cosine function.

Parameters
xA value of type float2.
Returns
The element-wise result of cosh of x.

◆ cosh() [3/5]

half ipu::cosh ( half  x)
inline

The hyperbolic cosine function.

Parameters
xA value of type half.
Returns
The result of cosh of x.

◆ cosh() [4/5]

half2 ipu::cosh ( half2  x)
inline

The hyperbolic cosine function.

Parameters
xA value of type half2.
Returns
The element-wise result of cosh of x.

◆ cosh() [5/5]

half4 ipu::cosh ( half4  x)
inline

The hyperbolic cosine function.

Parameters
xA value of type half4.
Returns
The element-wise result of cosh of x.

◆ erf() [1/5]

float ipu::erf ( float  x)
inline

The error function.

Parameters
xA value of type float.
Returns
The error function value for x.

◆ erf() [2/5]

float2 ipu::erf ( float2  x)
inline

The error function.

Parameters
xA value of type float2.
Returns
The element-wise error function values for x.

◆ erf() [3/5]

half ipu::erf ( half  x)
inline

The error function.

Parameters
xA value of type half.
Returns
The error function value for x.

◆ erf() [4/5]

half2 ipu::erf ( half2  x)
inline

The error function.

Parameters
xA value of type half2.
Returns
The element-wise error function values for x.

◆ erf() [5/5]

half4 ipu::erf ( half4  x)
inline

The error function.

Parameters
xA value of type half4.
Returns
The element-wise error function values for x.

◆ erfc() [1/5]

float ipu::erfc ( float  x)
inline

The complementary error function.

Parameters
xA value of type float.
Returns
The complementary error function value for x.

◆ erfc() [2/5]

float2 ipu::erfc ( float2  x)
inline

The complementary error function.

Parameters
xA value of type float2.
Returns
The element-wise complementary error function values for x.

◆ erfc() [3/5]

half ipu::erfc ( half  x)
inline

The complementary error function.

Parameters
xA value of type half.
Returns
The complementary error function value for x.

◆ erfc() [4/5]

half2 ipu::erfc ( half2  x)
inline

The complementary error function.

Parameters
xA value of type half2.
Returns
The element-wise complementary error function values for x.

◆ erfc() [5/5]

half4 ipu::erfc ( half4  x)
inline

The complementary error function.

Parameters
xA value of type half4.
Returns
The element-wise complementary error function values for x.
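
Mathematically erfc(x) = 1 - erf(x), but a dedicated erfc is useful because the subtraction loses all precision once erf(x) rounds to 1. The host-side sketch below uses the C++ standard library to illustrate the difference; the function names are hypothetical and the code makes no claim about the accuracy of the ipu overloads.

```cpp
#include <cassert>
#include <cmath>

// erfc obtained by subtraction: suffers cancellation for large x.
inline float erfc_by_subtraction(float x) { return 1.0f - std::erf(x); }

// erfc computed directly by the standard library.
inline float erfc_direct(float x) { return std::erfc(x); }
```

At x = 5 the true value is about 1.5e-12, far below the spacing of floats near 1, so the subtraction cannot represent it while the direct form can.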

◆ exp() [1/5]

float ipu::exp ( float  x)
inline

Targets the f32exp instruction.

The base-e exponential function.

Parameters
xA value of type float.
Returns
The result of exp of x.

◆ exp() [2/5]

float2 ipu::exp ( float2  x)
inline

The base-e exponential function.

Parameters
xA value of type float2.
Returns
The element-wise result of exp of x.

◆ exp() [3/5]

half ipu::exp ( half  x)
inline

The base-e exponential function.

Parameters
xA value of type half.
Returns
The result of exp of x.

◆ exp() [4/5]

half2 ipu::exp ( half2  x)
inline

Targets the f16v2exp instruction.

The base-e exponential function.

Parameters
xA value of type half2.
Returns
The element-wise result of exp of x.

◆ exp() [5/5]

half4 ipu::exp ( half4  x)
inline

The base-e exponential function.

Parameters
xA value of type half4.
Returns
The element-wise result of exp of x.

◆ exp2() [1/5]

float ipu::exp2 ( float  x)
inline

Targets the f32exp2 instruction.

The base-2 exponential function.

Parameters
xA value of type float.
Returns
The result of exp2 of x.

◆ exp2() [2/5]

float2 ipu::exp2 ( float2  x)
inline

The base-2 exponential function.

Parameters
xA value of type float2.
Returns
The element-wise result of exp2 of x.

◆ exp2() [3/5]

half ipu::exp2 ( half  x)
inline

The base-2 exponential function.

Parameters
xA value of type half.
Returns
The result of exp2 of x.

◆ exp2() [4/5]

half2 ipu::exp2 ( half2  x)
inline

Targets the f16v2exp2 instruction.

The base-2 exponential function.

Parameters
xA value of type half2.
Returns
The element-wise result of exp2 of x.

◆ exp2() [5/5]

half4 ipu::exp2 ( half4  x)
inline

The base-2 exponential function.

Parameters
xA value of type half4.
Returns
The element-wise result of exp2 of x.

◆ expm1() [1/5]

float ipu::expm1 ( float  x)
inline

The base-e exponential function, minus one: exp(x) - 1.

Parameters
xA value of type float.
Returns
The result of expm1 of x.

◆ expm1() [2/5]

float2 ipu::expm1 ( float2  x)
inline

The base-e exponential function, minus one: exp(x) - 1.

Parameters
xA value of type float2.
Returns
The element-wise result of expm1 of x.

◆ expm1() [3/5]

half ipu::expm1 ( half  x)
inline

The base-e exponential function, minus one: exp(x) - 1.

Parameters
xA value of type half.
Returns
The result of expm1 of x.

◆ expm1() [4/5]

half2 ipu::expm1 ( half2  x)
inline

The base-e exponential function, minus one: exp(x) - 1.

Parameters
xA value of type half2.
Returns
The element-wise result of expm1 of x.

◆ expm1() [5/5]

half4 ipu::expm1 ( half4  x)
inline

The base-e exponential function, minus one: exp(x) - 1.

Parameters
xA value of type half4.
Returns
The element-wise result of expm1 of x.
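
The reason to prefer expm1(x) over writing exp(x) - 1 shows up for inputs close to zero, where exp(x) rounds to exactly 1 and the subtraction returns 0. The host-side sketch below contrasts the two with the C++ standard library. The function names are hypothetical and the code is only an illustration of the numerical point, not of the ipu implementation.

```cpp
#include <cassert>
#include <cmath>

// Naive form: exp(x) is rounded to float before the subtraction.
inline float expm1_naive(float x) { return std::exp(x) - 1.0f; }

// Dedicated form: keeps the small result accurate.
inline float expm1_direct(float x) { return std::expm1(x); }
```

For x = 1e-10f the direct form returns a value very close to x, since exp(x) - 1 is approximately x for small x.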

◆ f16v2grand()

half2 ipu::f16v2grand ( )
inline

Targets the f16v2grand instruction.

Returns
A two-element half-precision vector of random values drawn from a Gaussian distribution.

◆ f32v2grand()

float2 ipu::f32v2grand ( )
inline

Targets the f32v2grand instruction.

Returns
A two-element single-precision vector of random values drawn from a Gaussian distribution.

◆ fabs() [1/5]

float ipu::fabs ( float  x)
inline

Computes the absolute value of the input.

Parameters
xA value of type float.
Returns
The absolute value of x.

◆ fabs() [2/5]

float2 ipu::fabs ( float2  x)
inline

Computes the absolute value of the input.

Parameters
xA value of type float2.
Returns
A vector where every ith element is the absolute value of x[i].

◆ fabs() [3/5]

half ipu::fabs ( half  x)
inline

Computes the absolute value of the input.

Parameters
xA value of type half.
Returns
The absolute value of x.

◆ fabs() [4/5]

half2 ipu::fabs ( half2  x)
inline

Computes the absolute value of the input.

Parameters
xA value of type half2.
Returns
A vector where every ith element is the absolute value of x[i].

◆ fabs() [5/5]

half4 ipu::fabs ( half4  x)
inline

Computes the absolute value of the input.

Parameters
xA value of type half4.
Returns
A vector where every ith element is the absolute value of x[i].

◆ fdim() [1/5]

float ipu::fdim ( float  x,
float  y 
)
inline

Calculates the positive difference between the two inputs: x - y if x > y, and +0 otherwise.

Parameters
xA value of type float.
yA value of type float.
Returns
The result of fdim of x and y.

◆ fdim() [2/5]

float2 ipu::fdim ( float2  x,
float2  y 
)
inline

Calculates the positive difference between the two inputs: x - y if x > y, and +0 otherwise.

Parameters
xA value of type float2.
yA value of type float2.
Returns
The element-wise result of fdim of x and y.

◆ fdim() [3/5]

half ipu::fdim ( half  x,
half  y 
)
inline

Calculates the positive difference between the two inputs: x - y if x > y, and +0 otherwise.

Parameters
xA value of type half.
yA value of type half.
Returns
The result of fdim of x and y.

◆ fdim() [4/5]

half2 ipu::fdim ( half2  x,
half2  y 
)
inline

Calculates the positive difference between the two inputs: x - y if x > y, and +0 otherwise.

Parameters
xA value of type half2.
yA value of type half2.
Returns
The element-wise result of fdim of x and y.

◆ fdim() [5/5]

half4 ipu::fdim ( half4  x,
half4  y 
)
inline

Calculates the positive difference between the two inputs: x - y if x > y, and +0 otherwise.

Parameters
xA value of type half4.
yA value of type half4.
Returns
The element-wise result of fdim of x and y.
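
In the C and C++ standard libraries, fdim is the positive difference, not the absolute difference: it returns x - y when x > y and +0 otherwise. The host-side sketch below spells this out, assuming the ipu overloads follow the standard function of the same name. `fdim_ref` is a hypothetical name.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Host-side reference model of the standard fdim semantics.
inline float fdim_ref(float x, float y) {
  if (std::isnan(x) || std::isnan(y)) {
    return std::numeric_limits<float>::quiet_NaN();  // NaN propagates
  }
  return x > y ? x - y : 0.0f;
}
```

Note that `fdim_ref(3.0f, 5.0f)` is 0, whereas the absolute difference of the same inputs would be 2.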

◆ floor() [1/5]

float ipu::floor ( float  x)
inline

Rounds the input down to the nearest integral value.

Parameters
xA value of type float.
Returns
The largest integral value that is not greater than x.

◆ floor() [2/5]

float2 ipu::floor ( float2  x)
inline

Rounds the input down to the nearest integral value.

Parameters
xA value of type float2.
Returns
A vector where every ith element is the largest integral value that is not greater than x[i].

◆ floor() [3/5]

half ipu::floor ( half  x)
inline

Rounds the input down to the nearest integral value.

Parameters
xA value of type half.
Returns
The largest integral value that is not greater than x.

◆ floor() [4/5]

half2 ipu::floor ( half2  x)
inline

Rounds the input down to the nearest integral value.

Parameters
xA value of type half2.
Returns
A vector where every ith element is the largest integral value that is not greater than x[i].

◆ floor() [5/5]

half4 ipu::floor ( half4  x)
inline

Rounds the input down to the nearest integral value.

Parameters
xA value of type half4.
Returns
A vector where every ith element is the largest integral value that is not greater than x[i].
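
The difference between floor and ceil is easiest to see with negative inputs, where "down" means towards negative infinity rather than towards zero. The host-side sketch below applies the standard library functions per element; `Float2`, `floor_ref` and `ceil_ref` are hypothetical names used only for illustration.

```cpp
#include <array>
#include <cassert>
#include <cmath>

// Hypothetical host stand-in for the float2 vector type.
using Float2 = std::array<float, 2>;

// Largest integral value not greater than each element.
inline Float2 floor_ref(Float2 x) {
  return {std::floor(x[0]), std::floor(x[1])};
}

// Smallest integral value not less than each element.
inline Float2 ceil_ref(Float2 x) {
  return {std::ceil(x[0]), std::ceil(x[1])};
}
```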

◆ fma() [1/5]

float ipu::fma ( float  x,
float  y,
float  z 
)
inline

Computes ( x * y ) + z.

Parameters
xA value of type float.
yA value of type float.
zA value of type float.
Returns
The result of ( x * y ) + z.

◆ fma() [2/5]

float2 ipu::fma ( float2  x,
float2  y,
float2  z 
)
inline

Computes ( x * y ) + z.

Parameters
xA value of type float2.
yA value of type float2.
zA value of type float2.
Returns
A vector where every ith element is the result of ( x[i] * y[i] ) + z[i].

◆ fma() [3/5]

half ipu::fma ( half  x,
half  y,
half  z 
)
inline

Computes ( x * y ) + z.

Parameters
xA value of type half.
yA value of type half.
zA value of type half.
Returns
The result of ( x * y ) + z.

◆ fma() [4/5]

half2 ipu::fma ( half2  x,
half2  y,
half2  z 
)
inline

Computes ( x * y ) + z.

Parameters
xA value of type half2.
yA value of type half2.
zA value of type half2.
Returns
A vector where every ith element is the result of ( x[i] * y[i] ) + z[i].

◆ fma() [5/5]

half4 ipu::fma ( half4  x,
half4  y,
half4  z 
)
inline

Computes ( x * y ) + z.

Parameters
xA value of type half4.
yA value of type half4.
zA value of type half4.
Returns
A vector where every ith element is the result of ( x[i] * y[i] ) + z[i].
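
The element-wise form can be sketched on the host as follows. Float2Model is a hypothetical stand-in for float2, and std::fma is used as the reference. Whether the IPU version rounds once (fused) or after each step is not stated here, so treat the rounding behaviour of this model as an assumption.

```cpp
#include <array>
#include <cmath>
#include <cstddef>

// Hypothetical stand-in for the IPU float2 vector type.
using Float2Model = std::array<float, 2>;

// Element-wise multiply-add: r[i] = (x[i] * y[i]) + z[i].
inline Float2Model fmaModel(const Float2Model &x, const Float2Model &y,
                            const Float2Model &z) {
  Float2Model r{};
  for (std::size_t i = 0; i < r.size(); ++i) {
    r[i] = std::fma(x[i], y[i], z[i]);
  }
  return r;
}
```

For example, x = {1, 2}, y = {3, 4}, z = {5, 6} gives {8, 14}.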

◆ fmax() [1/5]

float ipu::fmax ( float  x,
float  y 
)
inline

Calculates the maximum of the two inputs.

Parameters
xA value of type float.
yA value of type float.
Returns
The larger value of x and y. If one of them is a NaN, returns the other.

◆ fmax() [2/5]

float2 ipu::fmax ( float2  x,
float2  y 
)
inline

Calculates the maximum of the two inputs.

Parameters
xA value of type float2.
yA value of type float2.
Returns
A vector where every ith element is the larger of x[i] and y[i]. If either of them is a NaN, the other is set as the maximum.

◆ fmax() [3/5]

half ipu::fmax ( half  x,
half  y 
)
inline

Calculates the maximum of the two inputs.

Parameters
xA value of type half.
yA value of type half.
Returns
The larger value of x and y. If one of them is a NaN, returns the other.

◆ fmax() [4/5]

half2 ipu::fmax ( half2  x,
half2  y 
)
inline

Calculates the maximum of the two inputs.

Parameters
xA value of type half2.
yA value of type half2.
Returns
A vector where every ith element is the larger of x[i] and y[i]. If either of them is a NaN, the other is set as the maximum.

◆ fmax() [5/5]

half4 ipu::fmax ( half4  x,
half4  y 
)
inline

Calculates the maximum of the two inputs.

Parameters
xA value of type half4.
yA value of type half4.
Returns
A vector where every ith element is the larger of x[i] and y[i]. If either of them is a NaN, the other is set as the maximum.

◆ fmin() [1/5]

float ipu::fmin ( float  x,
float  y 
)
inline

Calculates the minimum of the two inputs.

Parameters
xA value of type float.
yA value of type float.
Returns
The smaller value of x and y. If one of them is a NaN, returns the other.

◆ fmin() [2/5]

float2 ipu::fmin ( float2  x,
float2  y 
)
inline

Calculates the minimum of the two inputs.

Parameters
xA value of type float2.
yA value of type float2.
Returns
A vector where every ith element is the smaller of x[i] and y[i]. If either of them is a NaN, the other is set as the minimum.

◆ fmin() [3/5]

half ipu::fmin ( half  x,
half  y 
)
inline

Calculates the minimum of the two inputs.

Parameters
xA value of type half.
yA value of type half.
Returns
The smaller value of x and y. If one of them is a NaN, returns the other.

◆ fmin() [4/5]

half2 ipu::fmin ( half2  x,
half2  y 
)
inline

Calculates the minimum of the two inputs.

Parameters
xA value of type half2.
yA value of type half2.
Returns
A vector where every ith element is the smaller of x[i] and y[i]. If either of them is a NaN, the other is set as the minimum.

◆ fmin() [5/5]

half4 ipu::fmin ( half4  x,
half4  y 
)
inline

Calculates the minimum of the two inputs.

Parameters
xA value of type half4.
yA value of type half4.
Returns
A vector where every ith element is the smaller of x[i] and y[i]. If either of them is a NaN, the other is set as the minimum.
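
The NaN rule stated for fmax and fmin can be written out explicitly. The following is a scalar host-side sketch of that rule, not the IPU implementation; the names fmaxModel and fminModel are hypothetical.

```cpp
#include <cmath>

// Larger of x and y; if exactly one input is a NaN, the other is returned.
inline float fmaxModel(float x, float y) {
  if (std::isnan(x)) return y; // x is a NaN: return the other input
  if (std::isnan(y)) return x; // y is a NaN: return the other input
  return x > y ? x : y;
}

// Smaller of x and y; if exactly one input is a NaN, the other is returned.
inline float fminModel(float x, float y) {
  if (std::isnan(x)) return y;
  if (std::isnan(y)) return x;
  return x < y ? x : y;
}
```

In this model the result is a NaN only when both inputs are NaNs.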

◆ fmod() [1/5]

float ipu::fmod ( float  x,
float  y 
)
inline

Calculates the remainder of the division x / y, where the quotient is rounded towards zero.

Parameters
xA value of type float.
yA value of type float.
Returns
The result of fmod of x and y.

◆ fmod() [2/5]

float2 ipu::fmod ( float2  x,
float2  y 
)
inline

Calculates the remainder of the division x / y, where the quotient is rounded towards zero.

Parameters
xA value of type float2.
yA value of type float2.
Returns
The element-wise result of fmod of x and y.

◆ fmod() [3/5]

half ipu::fmod ( half  x,
half  y 
)
inline

Calculates the remainder of the division x / y, where the quotient is rounded towards zero.

Parameters
xA value of type half.
yA value of type half.
Returns
The result of fmod of x and y.

◆ fmod() [4/5]

half2 ipu::fmod ( half2  x,
half2  y 
)
inline

Calculates the remainder of the division x / y, where the quotient is rounded towards zero.

Parameters
xA value of type half2.
yA value of type half2.
Returns
The element-wise result of fmod of x and y.

◆ fmod() [5/5]

half4 ipu::fmod ( half4  x,
half4  y 
)
inline

Calculates the remainder of the division x / y, where the quotient is rounded towards zero.

Parameters
xA value of type half4.
yA value of type half4.
Returns
The element-wise result of fmod of x and y.

◆ hypot() [1/5]

float ipu::hypot ( float  x,
float  y 
)
inline

Calculates the hypotenuse of the right-angled triangle whose two shorter sides are of lengths given by the two inputs.

Parameters
xA value of type float.
yA value of type float.
Returns
The result of hypot of x and y.

◆ hypot() [2/5]

float2 ipu::hypot ( float2  x,
float2  y 
)
inline

Calculates the hypotenuse of the right-angled triangle whose two shorter sides are of lengths given by the two inputs.

Parameters
xA value of type float2.
yA value of type float2.
Returns
The element-wise result of hypot of x and y.

◆ hypot() [3/5]

half ipu::hypot ( half  x,
half  y 
)
inline

Calculates the hypotenuse of the right-angled triangle whose two shorter sides are of lengths given by the two inputs.

Parameters
xA value of type half.
yA value of type half.
Returns
The result of hypot of x and y.

◆ hypot() [4/5]

half2 ipu::hypot ( half2  x,
half2  y 
)
inline

Calculates the hypotenuse of the right-angled triangle whose two shorter sides are of lengths given by the two inputs.

Parameters
xA value of type half2.
yA value of type half2.
Returns
The element-wise result of hypot of x and y.

◆ hypot() [5/5]

half4 ipu::hypot ( half4  x,
half4  y 
)
inline

Calculates the hypotenuse of the right-angled triangle whose two shorter sides are of lengths given by the two inputs.

Parameters
xA value of type half4.
yA value of type half4.
Returns
The element-wise result of hypot of x and y.

◆ llrint() [1/5]

long long ipu::llrint ( float  x)
inline

Rounds input to a nearby integral value, using the current rounding mode.

Parameters
xA value of type float.
Returns
The value of x rounded to a nearby integral of type long long.

◆ llrint() [2/5]

longlong2 ipu::llrint ( float2  x)
inline

Rounds input to a nearby integral value, using the current rounding mode.

Parameters
xA value of type float2.
Returns
A vector where every ith element is x[i] rounded to a nearby integral, of type long long.

◆ llrint() [3/5]

long long ipu::llrint ( half  x)
inline

Rounds input to a nearby integral value, using the current rounding mode.

Parameters
xA value of type half.
Returns
The value of x rounded to a nearby integral of type long long.

◆ llrint() [4/5]

longlong2 ipu::llrint ( half2  x)
inline

Rounds input to a nearby integral value, using the current rounding mode.

Parameters
xA value of type half2.
Returns
A vector where every ith element is x[i] rounded to a nearby integral, of type long long.

◆ llrint() [5/5]

longlong4 ipu::llrint ( half4  x)
inline

Rounds input to a nearby integral value, using the current rounding mode.

Parameters
xA value of type half4.
Returns
A vector where every ith element is x[i] rounded to a nearby integral, of type long long.

◆ llround() [1/5]

long long ipu::llround ( float  x)
inline

Rounds input to nearest integral value, with halfway cases rounded away from zero.

Parameters
xA value of type float.
Returns
The nearest integral value to x as a long long, with halfway cases rounded away from zero.

◆ llround() [2/5]

longlong2 ipu::llround ( float2  x)
inline

Rounds input to nearest integral value, with halfway cases rounded away from zero.

Parameters
xA value of type float2.
Returns
A vector where every ith element is the nearest integral value to x[i] as a long long, with halfway cases rounded away from zero.

◆ llround() [3/5]

long long ipu::llround ( half  x)
inline

Rounds input to nearest integral value, with halfway cases rounded away from zero.

Parameters
xA value of type half.
Returns
The nearest integral value to x as a long long, with halfway cases rounded away from zero.

◆ llround() [4/5]

longlong2 ipu::llround ( half2  x)
inline

Rounds input to nearest integral value, with halfway cases rounded away from zero.

Parameters
xA value of type half2.
Returns
A vector where every ith element is the nearest integral value to x[i] as a long long, with halfway cases rounded away from zero.

◆ llround() [5/5]

longlong4 ipu::llround ( half4  x)
inline

Rounds input to nearest integral value, with halfway cases rounded away from zero.

Parameters
xA value of type half4.
Returns
A vector where every ith element is the nearest integral value to x[i] as a long long, with halfway cases rounded away from zero.
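
The difference between llrint and llround shows up on halfway cases. The sketch below uses the C++ standard library on the host and assumes the default rounding mode (round to nearest, ties to even); the rounding mode in effect on the IPU is not stated here. RoundedPair and roundBothWays are hypothetical names.

```cpp
#include <cmath>

// Holds both roundings of the same input for comparison.
struct RoundedPair {
  long long viaRint;  // current rounding mode (ties to even by default)
  long long viaRound; // halfway cases rounded away from zero
};

inline RoundedPair roundBothWays(float x) {
  return {std::llrint(x), std::llround(x)};
}
```

Under that assumption, 2.5 gives 2 via llrint and 3 via llround, while 3.5 gives 4 for both.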

◆ ln() [1/2]

float ipu::ln ( float  src)
inline

Targets the f32ln instruction.

Parameters
srcA value of type float.
Returns
The result of the natural log of src.

◆ ln() [2/2]

half2 ipu::ln ( half2  src)
inline

Targets the f16v2ln instruction.

Parameters
srcA value of type half2.
Returns
A vector of the results of the natural log of the two elements in src.

◆ load_postinc() [1/17]

char ipu::load_postinc ( const char **  a,
int  i 
)
inline

Post-incrementing load, targeting the lds8step instruction.

Parameters
aAddress of the variable holding the address to load from. Gets incremented by i after the load.
iValue by which to increment a after load.
Returns
The value in memory whose address is given by a, as a char.

◆ load_postinc() [2/17]

float ipu::load_postinc ( const float **  a,
int  i 
)
inline

Post-incrementing load, targeting the ld32step instruction.

Parameters
aAddress of the variable holding the address to load from. Gets incremented by i after the load.
iValue by which to increment a after load.
Returns
The value in memory whose address is given by a, as a float.

◆ load_postinc() [3/17]

float2 ipu::load_postinc ( const float2 **  a,
int  i 
)
inline

Post-incrementing load, targeting the ld64step instruction.

Parameters
aAddress of the variable holding the address to load from. Gets incremented by i after the load.
iValue by which to increment a after load.
Returns
The value in memory whose address is given by a, as a float2.

◆ load_postinc() [4/17]

half ipu::load_postinc ( const half **  a,
int  i 
)
inline

Post-incrementing load, targeting the ldb16step instruction.

Parameters
aAddress of the variable holding the address to load from. Gets incremented by i after the load.
iValue by which to increment a after load.
Returns
The value in memory whose address is given by a, as a half.

◆ load_postinc() [5/17]

half2 ipu::load_postinc ( const half2 **  a,
int  i 
)
inline

Post-incrementing load, targeting the ld32step instruction.

Parameters
aAddress of the variable holding the address to load from. Gets incremented by i after the load.
iValue by which to increment a after load.
Returns
The value in memory whose address is given by a, as a half2.

◆ load_postinc() [6/17]

half4 ipu::load_postinc ( const half4 **  a,
int  i 
)
inline

Post-incrementing load, targeting the ld64step instruction.

Parameters
aAddress of the variable holding the address to load from. Gets incremented by i after the load.
iValue by which to increment a after load.
Returns
The value in memory whose address is given by a, as a half4.

◆ load_postinc() [7/17]

int ipu::load_postinc ( const int **  a,
int  i 
)
inline

Post-incrementing load, targeting the ld32step instruction.

Parameters
aAddress of the variable holding the address to load from. Gets incremented by i after the load.
iValue by which to increment a after load.
Returns
The value in memory whose address is given by a, as an int.

◆ load_postinc() [8/17]

int2 ipu::load_postinc ( const int2 **  a,
int  i 
)
inline

Post-incrementing load.

Parameters
aAddress of the variable holding the address to load from. Gets incremented by i after the load.
iValue by which to increment a after load.
Returns
The value in memory whose address is given by a, as an int2.

◆ load_postinc() [9/17]

short ipu::load_postinc ( const short **  a,
int  i 
)
inline

Post-incrementing load, targeting the lds16step instruction.

Parameters
aAddress of the variable holding the address to load from. Gets incremented by i after the load.
iValue by which to increment a after load.
Returns
The value in memory whose address is given by a, as a short.

◆ load_postinc() [10/17]

short2 ipu::load_postinc ( const short2 **  a,
int  i 
)
inline

Post-incrementing load, targeting the ld32step instruction.

Parameters
aAddress of the variable holding the address to load from. Gets incremented by i after the load.
iValue by which to increment a after load.
Returns
The value in memory whose address is given by a, as a short2.

◆ load_postinc() [11/17]

short4 ipu::load_postinc ( const short4 **  a,
int  i 
)
inline

Post-incrementing load.

Parameters
aAddress of the variable holding the address to load from. Gets incremented by i after the load.
iValue by which to increment a after load.
Returns
The value in memory whose address is given by a, as a short4.

◆ load_postinc() [12/17]

uint2 ipu::load_postinc ( const uint2 **  a,
int  i 
)
inline

Post-incrementing load.

Parameters
aAddress of the variable holding the address to load from. Gets incremented by i after the load.
iValue by which to increment a after load.
Returns
The value in memory whose address is given by a, as a uint2.

◆ load_postinc() [13/17]

unsigned ipu::load_postinc ( const unsigned **  a,
int  i 
)
inline

Post-incrementing load, targeting the ld32step instruction.

Parameters
aAddress of the variable holding the address to load from. Gets incremented by i after the load.
iValue by which to increment a after load.
Returns
The value in memory whose address is given by a, as an unsigned.

◆ load_postinc() [14/17]

unsigned char ipu::load_postinc ( const unsigned char **  a,
int  i 
)
inline

Post-incrementing load, targeting the ldz8step instruction.

Parameters
aAddress of the variable holding the address to load from. Gets incremented by i after the load.
iValue by which to increment a after load.
Returns
The value in memory whose address is given by a, as an unsigned char.

◆ load_postinc() [15/17]

unsigned short ipu::load_postinc ( const unsigned short **  a,
int  i 
)
inline

Post-incrementing load, targeting the ldz16step instruction.

Parameters
aAddress of the variable holding the address to load from. Gets incremented by i after the load.
iValue by which to increment a after load.
Returns
The value in memory whose address is given by a, as an unsigned short.

◆ load_postinc() [16/17]

ushort2 ipu::load_postinc ( const ushort2 **  a,
int  i 
)
inline

Post-incrementing load, targeting the ld32step instruction.

Parameters
aAddress of the variable holding the address to load from. Gets incremented by i after the load.
iValue by which to increment a after load.
Returns
The value in memory whose address is given by a, as a ushort2.

◆ load_postinc() [17/17]

ushort4 ipu::load_postinc ( const ushort4 **  a,
int  i 
)
inline

Post-incrementing load.

Parameters
aAddress of the variable holding the address to load from. Gets incremented by i after the load.
iValue by which to increment a after load.
Returns
The value in memory whose address is given by a, as a ushort4.
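
The calling convention shared by the load_postinc overloads (pass the address of the pointer, receive the loaded value, pointer advanced afterwards) can be sketched on the host. This model assumes that i counts elements of the pointed-to type, which is not stated above; loadPostincModel and sumModel are hypothetical names.

```cpp
// Host-side model of a post-incrementing load.
// Assumption: i is measured in elements of T, as in ordinary pointer arithmetic.
template <typename T>
inline T loadPostincModel(const T **a, int i) {
  T value = **a; // load from the current address
  *a += i;       // then advance the caller's pointer
  return value;
}

// Sums n consecutive ints, advancing the pointer one element per load.
inline int sumModel(const int *data, int n) {
  int total = 0;
  for (int k = 0; k < n; ++k) {
    total += loadPostincModel(&data, 1);
  }
  return total;
}
```

Because the pointer is updated in place, a loop needs no separate index variable.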

◆ log() [1/5]

float ipu::log ( float  x)
inline

The natural logarithm.

Parameters
xA value of type float.
Returns
The result of log of x.

◆ log() [2/5]

float2 ipu::log ( float2  x)
inline

The natural logarithm.

Parameters
xA value of type float2.
Returns
The element-wise result of log of x.

◆ log() [3/5]

half ipu::log ( half  x)
inline

The natural logarithm.

Parameters
xA value of type half.
Returns
The result of log of x.

◆ log() [4/5]

half2 ipu::log ( half2  x)
inline

The natural logarithm.

Parameters
xA value of type half2.
Returns
The element-wise result of log of x.

◆ log() [5/5]

half4 ipu::log ( half4  x)
inline

The natural logarithm.

Parameters
xA value of type half4.
Returns
The element-wise result of log of x.

◆ log10() [1/5]

float ipu::log10 ( float  x)
inline

The base-10 logarithm.

Parameters
xA value of type float.
Returns
The result of log10 of x.

◆ log10() [2/5]

float2 ipu::log10 ( float2  x)
inline

The base-10 logarithm.

Parameters
xA value of type float2.
Returns
The element-wise result of log10 of x.

◆ log10() [3/5]

half ipu::log10 ( half  x)
inline

The base-10 logarithm.

Parameters
xA value of type half.
Returns
The result of log10 of x.

◆ log10() [4/5]

half2 ipu::log10 ( half2  x)
inline

The base-10 logarithm.

Parameters
xA value of type half2.
Returns
The element-wise result of log10 of x.

◆ log10() [5/5]

half4 ipu::log10 ( half4  x)
inline

The base-10 logarithm.

Parameters
xA value of type half4.
Returns
The element-wise result of log10 of x.

◆ log1p() [1/5]

float ipu::log1p ( float  x)
inline

The natural logarithm of 1 + x.

Parameters
xA value of type float.
Returns
The result of log1p of x.

◆ log1p() [2/5]

float2 ipu::log1p ( float2  x)
inline

The natural logarithm of 1 + x.

Parameters
xA value of type float2.
Returns
The element-wise result of log1p of x.

◆ log1p() [3/5]

half ipu::log1p ( half  x)
inline

The natural logarithm of 1 + x.

Parameters
xA value of type half.
Returns
The result of log1p of x.

◆ log1p() [4/5]

half2 ipu::log1p ( half2  x)
inline

The natural logarithm of 1 + x.

Parameters
xA value of type half2.
Returns
The element-wise result of log1p of x.

◆ log1p() [5/5]

half4 ipu::log1p ( half4  x)
inline

The natural logarithm of 1 + x.

Parameters
xA value of type half4.
Returns
The element-wise result of log1p of x.
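
A dedicated log1p is usually preferred over log(1 + x) when x is close to zero, because the addition 1 + x discards the low-order bits of x. The host-side sketch below uses std::log1p to show the idea; whether the IPU version offers the same accuracy is not stated here. log1pModel and naiveLog1p are hypothetical names.

```cpp
#include <cmath>

// Natural logarithm of 1 + x, modelled with std::log1p.
inline float log1pModel(float x) { return std::log1p(x); }

// The naive form, shown for comparison only.
inline float naiveLog1p(float x) { return std::log(1.0f + x); }
```

For x = 1e-8f, log1pModel returns a value close to 1e-8, whereas the sum 1.0f + x typically rounds to 1.0f in single precision, so naiveLog1p loses the input entirely.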

◆ log2() [1/5]

float ipu::log2 ( float  x)
inline

Targets the f32log2 instruction.

The base-2 logarithm.

Parameters
xA value of type float.
Returns
The result of log2 of x.

◆ log2() [2/5]

float2 ipu::log2 ( float2  x)
inline

The base-2 logarithm.

Parameters
xA value of type float2.
Returns
The element-wise result of log2 of x.

◆ log2() [3/5]

half ipu::log2 ( half  x)
inline

The base-2 logarithm.

Parameters
xA value of type half.
Returns
The result of log2 of x.

◆ log2() [4/5]

half2 ipu::log2 ( half2  x)
inline

Targets the f16v2log2 instruction.

The base-2 logarithm.

Parameters
xA value of type half2.
Returns
The element-wise result of log2 of x.

◆ log2() [5/5]

half4 ipu::log2 ( half4  x)
inline

The base-2 logarithm.

Parameters
xA value of type half4.
Returns
The element-wise result of log2 of x.

◆ lrint() [1/5]

long ipu::lrint ( float  x)
inline

Rounds input to a nearby integral value, using the current rounding mode.

Parameters
xA value of type float.
Returns
The value of x rounded to a nearby integral of type long.

◆ lrint() [2/5]

long2 ipu::lrint ( float2  x)
inline

Rounds input to a nearby integral value, using the current rounding mode.

Parameters
xA value of type float2.
Returns
A vector where every ith element is x[i] rounded to a nearby integral, of type long.

◆ lrint() [3/5]

long ipu::lrint ( half  x)
inline

Rounds input to a nearby integral value, using the current rounding mode.

Parameters
xA value of type half.
Returns
The value of x rounded to a nearby integral of type long.

◆ lrint() [4/5]

long2 ipu::lrint ( half2  x)
inline

Rounds input to a nearby integral value, using the current rounding mode.

Parameters
xA value of type half2.
Returns
A vector where every ith element is x[i] rounded to a nearby integral, of type long.

◆ lrint() [5/5]

long4 ipu::lrint ( half4  x)
inline

Rounds input to a nearby integral value, using the current rounding mode.

Parameters
xA value of type half4.
Returns
A vector where every ith element is x[i] rounded to a nearby integral, of type long.

◆ lround() [1/5]

long ipu::lround ( float  x)
inline

Rounds input to nearest integral value, with halfway cases rounded away from zero.

Parameters
xA value of type float.
Returns
The nearest integral value to x as a long, with halfway cases rounded away from zero.

◆ lround() [2/5]

long2 ipu::lround ( float2  x)
inline

Rounds input to nearest integral value, with halfway cases rounded away from zero.

Parameters
xA value of type float2.
Returns
A vector where every ith element is the nearest integral value to x[i] as a long, with halfway cases rounded away from zero.

◆ lround() [3/5]

long ipu::lround ( half  x)
inline

Rounds input to nearest integral value, with halfway cases rounded away from zero.

Parameters
xA value of type half.
Returns
The nearest integral value to x as a long, with halfway cases rounded away from zero.

◆ lround() [4/5]

long2 ipu::lround ( half2  x)
inline

Rounds input to nearest integral value, with halfway cases rounded away from zero.

Parameters
xA value of type half2.
Returns
A vector where every ith element is the nearest integral value to x[i] as a long, with halfway cases rounded away from zero.

◆ lround() [5/5]

long4 ipu::lround ( half4  x)
inline

Rounds input to nearest integral value, with halfway cases rounded away from zero.

Parameters
xA value of type half4.
Returns
A vector where every ith element is the nearest integral value to x[i] as a long, with halfway cases rounded away from zero.

◆ max() [1/4]

float ipu::max ( float  src0,
float  src1 
)
inline

Targets the f32max instruction.

Parameters
src0A value of type float.
src1A value of type float.
Returns
The maximum of src0 and src1.

◆ max() [2/4]

float2 ipu::max ( float2  src0,
float2  src1 
)
inline

Targets the f32v2max instruction.

Parameters
src0A value of type float2.
src1A value of type float2.
Returns
The element-wise maximum of src0 and src1.

◆ max() [3/4]

half2 ipu::max ( half2  src0,
half2  src1 
)
inline

Targets the f16v2max instruction.

Parameters
src0A value of type half2.
src1A value of type half2.
Returns
The element-wise maximum of src0 and src1.

◆ max() [4/4]

half4 ipu::max ( half4  src0,
half4  src1 
)
inline

Targets the f16v4max instruction.

Parameters
src0A value of type half4.
src1A value of type half4.
Returns
The element-wise maximum of src0 and src1.

◆ maxc()

half2 ipu::maxc ( half4  src)
inline

Targets the f16v4maxc instruction.

Parameters
srcA value of type half4.
Returns
The 2x2 lateral maximum of src. The 0th element in the result vector is the maximum of src[0] and src[1], and the 1st element is the maximum of src[2] and src[3].
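
The lane pairing described above can be sketched on the host. Half4Model and Half2Model are hypothetical stand-ins that use float lanes in place of half, and NaN handling is not modelled.

```cpp
#include <array>

// Hypothetical stand-ins for half4 and half2, using float lanes on the host.
using Half4Model = std::array<float, 4>;
using Half2Model = std::array<float, 2>;

// Lateral maximum: lane 0 pairs src[0] with src[1], lane 1 pairs src[2] with src[3].
inline Half2Model maxcModel(const Half4Model &src) {
  const float lo = src[0] > src[1] ? src[0] : src[1];
  const float hi = src[2] > src[3] ? src[2] : src[3];
  return {lo, hi};
}
```

For example, the input {1, 5, 7, 3} reduces to {5, 7}.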

◆ min() [1/4]

float ipu::min ( float  src0,
float  src1 
)
inline

Targets the f32min instruction.

Parameters
src0A value of type float.
src1A value of type float.
Returns
The minimum of src0 and src1.

◆ min() [2/4]

float2 ipu::min ( float2  src0,
float2  src1 
)
inline

Targets the f32v2min instruction.

Parameters
src0A value of type float2.
src1A value of type float2.
Returns
The element-wise minimum of src0 and src1.

◆ min() [3/4]

half2 ipu::min ( half2  src0,
half2  src1 
)
inline

Targets the f16v2min instruction.

Parameters
src0A value of type half2.
src1A value of type half2.
Returns
The element-wise minimum of src0 and src1.

◆ min() [4/4]

half4 ipu::min ( half4  src0,
half4  src1 
)
inline

Targets the f16v4min instruction.

Parameters
src0A value of type half4.
src1A value of type half4.
Returns
The element-wise minimum of src0 and src1.

◆ nearbyint() [1/5]

float ipu::nearbyint ( float  x)
inline

Rounds input to a nearby integral value, using the current rounding mode.

Parameters
xA value of type float.
Returns
The value of x rounded to a nearby integral.

◆ nearbyint() [2/5]

float2 ipu::nearbyint ( float2  x)
inline

Rounds input to a nearby integral value, using the current rounding mode.

Parameters
xA value of type float2.
Returns
A vector where every ith element is x[i] rounded to a nearby integral.

◆ nearbyint() [3/5]

half ipu::nearbyint ( half  x)
inline

Rounds input to a nearby integral value, using the current rounding mode.

Parameters
xA value of type half.
Returns
The value of x rounded to a nearby integral.

◆ nearbyint() [4/5]

half2 ipu::nearbyint ( half2  x)
inline

Rounds input to a nearby integral value, using the current rounding mode.

Parameters
xA value of type half2.
Returns
A vector where every ith element is x[i] rounded to a nearby integral.

◆ nearbyint() [5/5]

half4 ipu::nearbyint ( half4  x)
inline

Rounds input to a nearby integral value, using the current rounding mode.

Parameters
xA value of type half4.
Returns
A vector where every ith element is x[i] rounded to a nearby integral.

◆ popc()

unsigned ipu::popc ( int  src)
inline

Targets the popc instruction.

Parameters
srcA value of type int.
Returns
The number of set bits in src.
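
For reference, a portable host-side model of a population count is shown below. It counts the set bits in the bit pattern of src; popcModel is a hypothetical name and the loop is a standard technique, not the IPU implementation.

```cpp
// Counts the set bits in the bit pattern of src.
inline unsigned popcModel(int src) {
  unsigned bits = static_cast<unsigned>(src);
  unsigned count = 0;
  while (bits != 0) {
    bits &= bits - 1; // clear the lowest set bit
    ++count;
  }
  return count;
}
```

A negative input is counted through its unsigned bit pattern, so -1 yields the full width of the type.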

◆ pow() [1/5]

float ipu::pow ( float  x,
float  y 
)
inline

Calculates x to the power of y.

Parameters
xA value of type float.
yA value of type float.
Returns
The result of x raised to the power of y.

◆ pow() [2/5]

float2 ipu::pow ( float2  x,
float2  y 
)
inline

Calculates x to the power of y.

Parameters
xA value of type float2.
yA value of type float2.
Returns
A vector where every ith element is x[i] raised to the power of y[i].

◆ pow() [3/5]

half ipu::pow ( half  x,
half  y 
)
inline

Calculates x to the power of y.

Parameters
xA value of type half.
yA value of type half.
Returns
The result of x raised to the power of y.

◆ pow() [4/5]

half2 ipu::pow ( half2  x,
half2  y 
)
inline

Calculates x to the power of y.

Parameters
xA value of type half2.
yA value of type half2.
Returns
A vector where every ith element is x[i] raised to the power of y[i].

◆ pow() [5/5]

half4 ipu::pow ( half4  x,
half4  y 
)
inline

Calculates x to the power of y.

Parameters
xA value of type half4.
yA value of type half4.
Returns
A vector where every ith element is x[i] raised to the power of y[i].

◆ remainder() [1/5]

float ipu::remainder ( float  x,
float  y 
)
inline

Calculates the remainder of the division x / y, where the quotient is rounded to the nearest integral value, with halfway cases rounded to the even number.

Parameters
xA value of type float.
yA value of type float.
Returns
The result of remainder of x and y.

◆ remainder() [2/5]

float2 ipu::remainder ( float2  x,
float2  y 
)
inline

Calculates the remainder of the division x / y, where the quotient is rounded to the nearest integral value, with halfway cases rounded to the even number.

Parameters
xA value of type float2.
yA value of type float2.
Returns
The element-wise result of remainder of x and y.

◆ remainder() [3/5]

half ipu::remainder ( half  x,
half  y 
)
inline

Calculates the remainder of the division x / y, where the quotient is rounded to the nearest integral value, with halfway cases rounded to the even number.

Parameters
xA value of type half.
yA value of type half.
Returns
The result of remainder of x and y.

◆ remainder() [4/5]

half2 ipu::remainder ( half2  x,
half2  y 
)
inline

Calculates the remainder of the division x / y, where the quotient is rounded to the nearest integral value, with halfway cases rounded to the even number.

Parameters
xA value of type half2.
yA value of type half2.
Returns
The element-wise result of remainder of x and y.

◆ remainder() [5/5]

half4 ipu::remainder ( half4  x,
half4  y 
)
inline

Calculates the remainder of the division x / y, where the quotient is rounded to the nearest integral value, with halfway cases rounded to the even number.

Parameters
xA value of type half4.
yA value of type half4.
Returns
The element-wise result of remainder of x and y.
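
remainder and fmod differ only in how the quotient x / y is rounded before it is multiplied back and subtracted from x. The scalar host-side sketch below uses the C++ standard library functions of the same names as a reference; fmodModel and remainderModel are hypothetical names.

```cpp
#include <cmath>

// Quotient truncated towards zero (the fmod rule).
inline float fmodModel(float x, float y) { return std::fmod(x, y); }

// Quotient rounded to nearest, halfway cases to even (the remainder rule).
inline float remainderModel(float x, float y) { return std::remainder(x, y); }
```

With x = 5.5 and y = 2 the quotient is 2.75, so fmodModel uses 2 and returns 1.5, while remainderModel uses 3 and returns -0.5. For the halfway case x = 5, y = 2, the quotient 2.5 rounds to the even value 2 and remainderModel returns 1.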

◆ rint() [1/5]

float ipu::rint ( float  x)
inline

Rounds input to a nearby integral value, using the current rounding mode.

Parameters
xA value of type float.
Returns
The value of x rounded to a nearby integral.

◆ rint() [2/5]

float2 ipu::rint ( float2  x)
inline

Rounds input to a nearby integral value, using the current rounding mode.

Parameters
xA value of type float2.
Returns
A vector where every ith element is x[i] rounded to a nearby integral.

◆ rint() [3/5]

half ipu::rint ( half  x)
inline

Rounds input to a nearby integral value, using the current rounding mode.

Parameters
xA value of type half.
Returns
The value of x rounded to a nearby integral.

◆ rint() [4/5]

half2 ipu::rint ( half2  x)
inline

Rounds input to a nearby integral value, using the current rounding mode.

Parameters
xA value of type half2.
Returns
A vector where every ith element is x[i] rounded to a nearby integral.

◆ rint() [5/5]

half4 ipu::rint ( half4  x)
inline

Rounds input to a nearby integral value, using the current rounding mode.

Parameters
xA value of type half4.
Returns
A vector where every ith element is x[i] rounded to a nearby integral.

◆ rmask() [1/2]

float2 ipu::rmask ( float2  src0,
float  src1 
)
inline

Targets the f32v2rmask instruction.

Parameters
src0A value of type float2.
src1A value of type float.
Returns
A masked version of src0, with each element of the input individually masked according to the probability specified by the bottom 17 bits of src1:
  • if src1[16] == 1, no masking is applied;
  • if src1[16:0] == 0, the result is a zero vector;
  • otherwise each element is individually unmasked with probability src1[15:0] / 65536.
The instruction uses the PRNG to generate 2 x 16-bit random values from the discrete uniform distribution.

◆ rmask() [2/2]

half4 ipu::rmask ( half4  src0,
float  src1 
)
inline

Targets the f16v4rmask instruction.

Parameters
src0A value of type half4.
src1A value of type float.
Returns
A masked version of src0, with each element of the input individually masked according to the probability specified by the bottom 17 bits of src1:
  • if src1[16] == 1, no masking is applied;
  • if src1[16:0] == 0, the result is a zero vector;
  • otherwise each element is individually unmasked with probability src1[15:0] / 65536.
The instruction uses the PRNG to generate 4 x 16-bit random values from the discrete uniform distribution.
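
The three cases above can be modelled on the host for the two-element variant. This is a sketch under stated assumptions: the bit indices are taken to refer to the raw 32-bit pattern of src1, std::mt19937 stands in for the hardware PRNG, and Float2Model, floatFromBits and rmaskModel are hypothetical names.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <random>

// Hypothetical stand-in for float2.
using Float2Model = std::array<float, 2>;

// Builds a float whose raw bit pattern is the given 32-bit value.
inline float floatFromBits(std::uint32_t bits) {
  float f = 0.0f;
  std::memcpy(&f, &bits, sizeof f);
  return f;
}

// Host model of the masking rule; rng replaces the hardware PRNG (assumption).
inline Float2Model rmaskModel(const Float2Model &src0, float src1,
                              std::mt19937 &rng) {
  std::uint32_t bits = 0;
  std::memcpy(&bits, &src1, sizeof bits); // raw bit pattern of src1
  const std::uint32_t low17 = bits & 0x1FFFFu;
  if (low17 & 0x10000u) return src0;              // src1[16] == 1: no masking
  if (low17 == 0) return Float2Model{0.0f, 0.0f}; // src1[16:0] == 0: zero vector
  Float2Model out{};
  for (std::size_t i = 0; i < out.size(); ++i) {
    // One 16-bit uniform draw per element; kept with probability low17 / 65536.
    const std::uint32_t r = static_cast<std::uint32_t>(rng()) & 0xFFFFu;
    out[i] = (r < low17) ? src0[i] : 0.0f;
  }
  return out;
}
```

Passing floatFromBits(0x8000u) as src1 would, in this model, keep each element with probability 0.5.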

◆ roll16() [1/3]

half2 ipu::roll16 ( half2  src0,
half2  src1 
)
inline

Targets the roll16 instruction.

Parameters
src0A value of type half2.
src1A value of type half2.
Returns
The result of a SIMD roll permutation on the four 16-bit values across src0 and src1, as a half2.
   src0         src1      ->   Result
| 3 | 2 |    | 1 | 0 |       | 2 | 1 |

◆ roll16() [2/3]

short2 ipu::roll16 ( short2  src0,
short2  src1 
)
inline

Targets the roll16 instruction.

Parameters
src0A value of type short2.
src1A value of type short2.
Returns
The result of a SIMD roll permutation on the four 16-bit values across src0 and src1, as a short2.
   src0         src1      ->   Result
| 3 | 2 |    | 1 | 0 |       | 2 | 1 |

◆ roll16() [3/3]

ushort2 ipu::roll16 ( ushort2  src0,
ushort2  src1 
)
inline

Targets the roll16 instruction.

Parameters
src0A value of type ushort2.
src1A value of type ushort2.
Returns
The result of a SIMD roll permutation on the four 16-bit values across src0 and src1, as a ushort2.
   src0         src1      ->   Result
| 3 | 2 |    | 1 | 0 |       | 2 | 1 |

◆ roll32()

float2 ipu::roll32 ( float2  src0,
float2  src1 
)
inline

Targets the roll32 instruction.

Parameters
src0A value of type float2.
src1A value of type float2.
Returns
The result of a SIMD roll permutation on the four 32-bit float values across src0 and src1, as a float2.
   src0         src1      ->   Result
| 3 | 2 |    | 1 | 0 |       | 2 | 1 |

◆ roll8l()

int ipu::roll8l ( int  src0,
int  src1 
)
inline

Targets the roll8l instruction.

Parameters
src0A value of type int.
src1A value of type int.
Returns
The result of a SIMD roll-left permutation on the eight 8-bit values across src0 and src1, as an int.
       src0                 src1          ->        Result
| 7 | 6 | 5 | 4 |    | 3 | 2 | 1 | 0 |       | 6 | 5 | 4 | 3 |

◆ roll8r()

int ipu::roll8r ( int  src0,
int  src1 
)
inline

Targets the roll8r instruction.

Parameters
src0  A value of type int.
src1  A value of type int.
Returns
The result of a SIMD roll-right permutation on the 8 8-bit values across src0 and src1, as an int.

      src0                src1                  Result
| 7 | 6 | 5 | 4 |   | 3 | 2 | 1 | 0 |  ->  | 4 | 3 | 2 | 1 |
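
The roll8l and roll8r diagrams can be read as a 4-byte window sliding over the 8 lanes. The following is a host-side sketch of that reading, not the IPU implementation. It assumes src0 holds lanes 7..4 and src1 holds lanes 3..0, with the highest-numbered lane of each word in its most significant byte. The names concat_lanes, roll8l_model, roll8r_model and roll8_example are hypothetical, and uint32_t stands in for int so that the shifts are well defined.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: src0 supplies lanes 7..4 and src1 supplies lanes 3..0, so lane k
// occupies bits [8k+7 : 8k] of the 64-bit concatenation.
inline std::uint64_t concat_lanes(std::uint32_t src0, std::uint32_t src1) {
  return (static_cast<std::uint64_t>(src0) << 32) | src1;
}

// Lanes | 6 | 5 | 4 | 3 | are bits [55:24] of the concatenation.
inline std::uint32_t roll8l_model(std::uint32_t src0, std::uint32_t src1) {
  return static_cast<std::uint32_t>(concat_lanes(src0, src1) >> 24);
}

// Lanes | 4 | 3 | 2 | 1 | are bits [39:8] of the concatenation.
inline std::uint32_t roll8r_model(std::uint32_t src0, std::uint32_t src1) {
  return static_cast<std::uint32_t>(concat_lanes(src0, src1) >> 8);
}

// Each byte carries its own lane number, so the result can be compared
// directly against the diagrams.
inline void roll8_example() {
  assert(roll8l_model(0x07060504u, 0x03020100u) == 0x06050403u);
  assert(roll8r_model(0x07060504u, 0x03020100u) == 0x04030201u);
}
```

If the hardware numbers the lanes differently from this assumption, only concat_lanes needs to change.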

◆ round() [1/5]

float ipu::round ( float  x)
inline

Rounds input to nearest integral value, with halfway cases rounded away from zero.

Parameters
x  A value of type float.
Returns
The result of round of x.

◆ round() [2/5]

float2 ipu::round ( float2  x)
inline

Rounds input to nearest integral value, with halfway cases rounded away from zero.

Parameters
x  A value of type float2.
Returns
The element-wise result of round of x.

◆ round() [3/5]

half ipu::round ( half  x)
inline

Rounds input to nearest integral value, with halfway cases rounded away from zero.

Parameters
x  A value of type half.
Returns
The result of round of x.

◆ round() [4/5]

half2 ipu::round ( half2  x)
inline

Rounds input to nearest integral value, with halfway cases rounded away from zero.

Parameters
x  A value of type half2.
Returns
The element-wise result of round of x.

◆ round() [5/5]

half4 ipu::round ( half4  x)
inline

Rounds input to nearest integral value, with halfway cases rounded away from zero.

Parameters
x  A value of type half4.
Returns
The element-wise result of round of x.
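
The tie-breaking rule described for round can be illustrated on the host with the C++ standard library. This sketch does not call the ipu:: overloads. It assumes they break ties the same way as std::round, which the description suggests but does not state. The function name round_example is hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Host-side illustration only; assumes ipu::round breaks ties like std::round.
inline void round_example() {
  assert(std::round(2.5f) == 3.0f);    // halfway case, rounded away from zero
  assert(std::round(-2.5f) == -3.0f);  // halfway case, rounded away from zero
  assert(std::round(2.4f) == 2.0f);    // not a halfway case: plain nearest
  // For contrast, std::nearbyint in the default rounding mode sends halfway
  // cases to the nearest even value.
  assert(std::nearbyint(2.5f) == 2.0f);
}
```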

◆ rsqrt() [1/5]

float ipu::rsqrt ( float  x)
inline

The reciprocal square root function.

Parameters
x  A value of type float.
Returns
The reciprocal of the square root of x.

◆ rsqrt() [2/5]

float2 ipu::rsqrt ( float2  x)
inline

The reciprocal square root function.

Parameters
x  A value of type float2.
Returns
A vector where every ith element is the reciprocal of the square root of x[i].

◆ rsqrt() [3/5]

half ipu::rsqrt ( half  x)
inline

The reciprocal square root function.

Parameters
x  A value of type half.
Returns
The reciprocal of the square root of x.

◆ rsqrt() [4/5]

half2 ipu::rsqrt ( half2  x)
inline

The reciprocal square root function.

Parameters
x  A value of type half2.
Returns
A vector where every ith element is the reciprocal of the square root of x[i].

◆ rsqrt() [5/5]

half4 ipu::rsqrt ( half4  x)
inline

The reciprocal square root function.

Parameters
x  A value of type half4.
Returns
A vector where every ith element is the reciprocal of the square root of x[i].

◆ shuf8x8hi()

int ipu::shuf8x8hi ( int  src0,
int  src1 
)
inline

Targets the shuf8x8hi instruction.

Parameters
src0  A value of type int.
src1  A value of type int.
Returns
The upper word of a SIMD shuffle permutation on the 8 8-bit values across src0 and src1, as an int.

      src0                src1                  Result
| 7 | 6 | 5 | 4 |   | 3 | 2 | 1 | 0 |  ->  | 7 | 3 | 6 | 2 |

◆ shuf8x8lo()

int ipu::shuf8x8lo ( int  src0,
int  src1 
)
inline

Targets the shuf8x8lo instruction.

Parameters
src0  A value of type int.
src1  A value of type int.
Returns
The lower word of a SIMD shuffle permutation on the 8 8-bit values across src0 and src1, as an int.

      src0                src1                  Result
| 7 | 6 | 5 | 4 |   | 3 | 2 | 1 | 0 |  ->  | 5 | 1 | 4 | 0 |

◆ sigm() [1/2]

float ipu::sigm ( float  src)
inline

Targets the f32sigm instruction.

Parameters
src  A value of type float.
Returns
The result of applying the sigmoid function to src.

◆ sigm() [2/2]

half2 ipu::sigm ( half2  src)
inline

Targets the f16v2sigm instruction.

Parameters
src  A value of type half2.
Returns
The result of an element-wise application of the sigmoid function on src.

◆ sigmoid() [1/5]

float ipu::sigmoid ( float  x)
inline

The sigmoid function, i.e. 1/(1 + exp(-x)).

Parameters
x  A value of type float.
Returns
The result of sigmoid of x.

◆ sigmoid() [2/5]

float2 ipu::sigmoid ( float2  x)
inline

The sigmoid function, i.e. 1/(1 + exp(-x)).

Parameters
x  A value of type float2.
Returns
The element-wise result of sigmoid of x.

◆ sigmoid() [3/5]

half ipu::sigmoid ( half  x)
inline

The sigmoid function, i.e. 1/(1 + exp(-x)).

Parameters
x  A value of type half.
Returns
The result of sigmoid of x.

◆ sigmoid() [4/5]

half2 ipu::sigmoid ( half2  x)
inline

The sigmoid function, i.e. 1/(1 + exp(-x)).

Parameters
x  A value of type half2.
Returns
The element-wise result of sigmoid of x.

◆ sigmoid() [5/5]

half4 ipu::sigmoid ( half4  x)
inline

The sigmoid function, i.e. 1/(1 + exp(-x)).

Parameters
x  A value of type half4.
Returns
The element-wise result of sigmoid of x.
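
The sigmoid formula above can be written as a scalar host-side reference, for example to check results in unit tests. This is a sketch, not the IPU implementation, and the names sigmoid_ref and sigmoid_example are hypothetical. The vector overloads would apply the same formula to each element.

```cpp
#include <cassert>
#include <cmath>

// Scalar reference for 1 / (1 + exp(-x)), evaluated in single precision.
inline float sigmoid_ref(float x) {
  return 1.0f / (1.0f + std::exp(-x));
}

inline void sigmoid_example() {
  assert(sigmoid_ref(0.0f) == 0.5f);  // exp(0) is exactly 1
  // Symmetry: sigmoid(-x) + sigmoid(x) == 1, up to rounding.
  assert(std::fabs(sigmoid_ref(-2.0f) + sigmoid_ref(2.0f) - 1.0f) < 1e-6f);
  // The function saturates towards 1 for large positive inputs.
  assert(sigmoid_ref(10.0f) > 0.9999f);
}
```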

◆ sin() [1/5]

float ipu::sin ( float  x)
inline

The trigonometric sine function.

Parameters
x  A value of type float.
Returns
The result of sin of x.

◆ sin() [2/5]

float2 ipu::sin ( float2  x)
inline

The trigonometric sine function.

Parameters
x  A value of type float2.
Returns
The element-wise result of sin of x.

◆ sin() [3/5]

half ipu::sin ( half  x)
inline

The trigonometric sine function.

Parameters
x  A value of type half.
Returns
The result of sin of x.

◆ sin() [4/5]

half2 ipu::sin ( half2  x)
inline

The trigonometric sine function.

Parameters
x  A value of type half2.
Returns
The element-wise result of sin of x.

◆ sin() [5/5]

half4 ipu::sin ( half4  x)
inline

The trigonometric sine function.

Parameters
x  A value of type half4.
Returns
The element-wise result of sin of x.

◆ sinh() [1/5]

float ipu::sinh ( float  x)
inline

The hyperbolic sine function.

Parameters
x  A value of type float.
Returns
The result of sinh of x.

◆ sinh() [2/5]

float2 ipu::sinh ( float2  x)
inline

The hyperbolic sine function.

Parameters
x  A value of type float2.
Returns
The element-wise result of sinh of x.

◆ sinh() [3/5]

half ipu::sinh ( half  x)
inline

The hyperbolic sine function.

Parameters
x  A value of type half.
Returns
The result of sinh of x.

◆ sinh() [4/5]

half2 ipu::sinh ( half2  x)
inline

The hyperbolic sine function.

Parameters
x  A value of type half2.
Returns
The element-wise result of sinh of x.

◆ sinh() [5/5]

half4 ipu::sinh ( half4  x)
inline

The hyperbolic sine function.

Parameters
x  A value of type half4.
Returns
The element-wise result of sinh of x.

◆ sort4x16hi() [1/3]

half2 ipu::sort4x16hi ( half2  src0,
half2  src1 
)
inline

Targets the sort4x16hi instruction.

Parameters
src0  A value of type half2.
src1  A value of type half2.
Returns
The result of a SIMD sort permutation on the 4 16-bit values across src0 and src1, as a half2.

  src0        src1          Result
| 3 | 2 |   | 1 | 0 |  ->  | 3 | 1 |

◆ sort4x16hi() [2/3]

short2 ipu::sort4x16hi ( short2  src0,
short2  src1 
)
inline

Targets the sort4x16hi instruction.

Parameters
src0  A value of type short2.
src1  A value of type short2.
Returns
The result of a SIMD sort permutation on the 4 16-bit values across src0 and src1, as a short2.

  src0        src1          Result
| 3 | 2 |   | 1 | 0 |  ->  | 3 | 1 |

◆ sort4x16hi() [3/3]

ushort2 ipu::sort4x16hi ( ushort2  src0,
ushort2  src1 
)
inline

Targets the sort4x16hi instruction.

Parameters
src0  A value of type ushort2.
src1  A value of type ushort2.
Returns
The result of a SIMD sort permutation on the 4 16-bit values across src0 and src1, as a ushort2.

  src0        src1          Result
| 3 | 2 |   | 1 | 0 |  ->  | 3 | 1 |

◆ sort4x16lo() [1/3]

half2 ipu::sort4x16lo ( half2  src0,
half2  src1 
)
inline

Targets the sort4x16lo instruction.

Parameters
src0  A value of type half2.
src1  A value of type half2.
Returns
The result of a SIMD sort permutation on the 4 16-bit values across src0 and src1, as a half2.

  src0        src1          Result
| 3 | 2 |   | 1 | 0 |  ->  | 2 | 0 |

◆ sort4x16lo() [2/3]

short2 ipu::sort4x16lo ( short2  src0,
short2  src1 
)
inline

Targets the sort4x16lo instruction.

Parameters
src0  A value of type short2.
src1  A value of type short2.
Returns
The result of a SIMD sort permutation on the 4 16-bit values across src0 and src1, as a short2.

  src0        src1          Result
| 3 | 2 |   | 1 | 0 |  ->  | 2 | 0 |

◆ sort4x16lo() [3/3]

ushort2 ipu::sort4x16lo ( ushort2  src0,
ushort2  src1 
)
inline

Targets the sort4x16lo instruction.

Parameters
src0  A value of type ushort2.
src1  A value of type ushort2.
Returns
The result of a SIMD sort permutation on the 4 16-bit values across src0 and src1, as a ushort2.

  src0        src1          Result
| 3 | 2 |   | 1 | 0 |  ->  | 2 | 0 |

◆ sort4x32hi()

float2 ipu::sort4x32hi ( float2  src0,
float2  src1 
)
inline

Targets the sort4x32hi instruction.

Parameters
src0  A value of type float2.
src1  A value of type float2.
Returns
The result of a SIMD sort permutation on the 4 32-bit float values across src0 and src1, as a float2.

  src0        src1          Result
| 3 | 2 |   | 1 | 0 |  ->  | 3 | 1 |

◆ sort4x32lo()

float2 ipu::sort4x32lo ( float2  src0,
float2  src1 
)
inline

Targets the sort4x32lo instruction.

Parameters
src0  A value of type float2.
src1  A value of type float2.
Returns
The result of a SIMD sort permutation on the 4 32-bit float values across src0 and src1, as a float2.

  src0        src1          Result
| 3 | 2 |   | 1 | 0 |  ->  | 2 | 0 |

◆ sort8()

int ipu::sort8 ( int  src)
inline

Targets the sort8 instruction.

Parameters
src  A value of type int.
Returns
The result of a SIMD sort8 permutation on the 4 8-bit values in src, as an int.

       src                  Result
| 3 | 2 | 1 | 0 |  ->  | 3 | 1 | 2 | 0 |
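
The sort8 diagram swaps the two middle bytes of the word and leaves the outer two in place. A host-side sketch of that reading follows. It is not the IPU implementation, and it assumes lane 3 is the most significant byte. The names sort8_model and sort8_example are hypothetical, and uint32_t stands in for int so that the shifts are well defined.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: lane k is bits [8k+7 : 8k], so lane 3 is the top byte.
inline std::uint32_t sort8_model(std::uint32_t src) {
  const std::uint32_t outer = src & 0xFF0000FFu;         // lanes 3 and 0 stay put
  const std::uint32_t lane2 = (src & 0x00FF0000u) >> 8;  // lane 2 moves down one byte
  const std::uint32_t lane1 = (src & 0x0000FF00u) << 8;  // lane 1 moves up one byte
  return outer | lane2 | lane1;
}

inline void sort8_example() {
  // Each byte carries its own lane number: | 3 | 2 | 1 | 0 | -> | 3 | 1 | 2 | 0 |
  assert(sort8_model(0x03020100u) == 0x03010200u);
  // Swapping the middle lanes twice restores the input.
  assert(sort8_model(sort8_model(0xDEADBEEFu)) == 0xDEADBEEFu);
}
```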

◆ sort8x8hi()

int ipu::sort8x8hi ( int  src0,
int  src1 
)
inline

Targets the sort8x8hi instruction.

Parameters
src0  A value of type int.
src1  A value of type int.
Returns
The upper word of the result of a SIMD sort8 permutation on the 8 8-bit values across src0 and src1, as an int.

      src0                src1                  Result
| 7 | 6 | 5 | 4 |   | 3 | 2 | 1 | 0 |  ->  | 7 | 5 | 3 | 1 |

◆ sort8x8lo()

int ipu::sort8x8lo ( int  src0,
int  src1 
)
inline

Targets the sort8x8lo instruction.

Parameters
src0  A value of type int.
src1  A value of type int.
Returns
The lower word of the result of a SIMD sort8 permutation on the 8 8-bit values across src0 and src1, as an int.

      src0                src1                  Result
| 7 | 6 | 5 | 4 |   | 3 | 2 | 1 | 0 |  ->  | 6 | 4 | 2 | 0 |
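
The sort8x8hi and sort8x8lo diagrams gather the odd-numbered and even-numbered lanes respectively. The sketch below models that on the host with a small lane-gather helper. It is not the IPU implementation, and it assumes src0 holds lanes 7..4 and src1 holds lanes 3..0, most significant byte first. The names lane, gather8, sort8x8hi_model, sort8x8lo_model and sort8x8_example are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Assumption: lanes 7..4 live in src0 and lanes 3..0 in src1, with the
// highest-numbered lane of each word in its most significant byte.
inline std::uint8_t lane(std::uint32_t src0, std::uint32_t src1, int k) {
  const std::uint32_t word = (k >= 4) ? src0 : src1;
  return static_cast<std::uint8_t>(word >> (8 * (k % 4)));
}

// Builds a 32-bit word from four lane numbers, listed most significant first.
inline std::uint32_t gather8(std::uint32_t src0, std::uint32_t src1,
                             const std::array<int, 4>& lanes) {
  std::uint32_t out = 0;
  for (int k : lanes) {
    out = (out << 8) | lane(src0, src1, k);
  }
  return out;
}

inline std::uint32_t sort8x8hi_model(std::uint32_t src0, std::uint32_t src1) {
  return gather8(src0, src1, {7, 5, 3, 1});  // odd lanes
}

inline std::uint32_t sort8x8lo_model(std::uint32_t src0, std::uint32_t src1) {
  return gather8(src0, src1, {6, 4, 2, 0});  // even lanes
}

inline void sort8x8_example() {
  // Each byte carries its own lane number.
  assert(sort8x8hi_model(0x07060504u, 0x03020100u) == 0x07050301u);
  assert(sort8x8lo_model(0x07060504u, 0x03020100u) == 0x06040200u);
}
```

The same gather8 helper can be pointed at any of the other 8-lane diagrams by changing the lane list.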

◆ sqrt() [1/5]

float ipu::sqrt ( float  x)
inline

The square root function.

Parameters
x  A value of type float.
Returns
The result of sqrt of x.

◆ sqrt() [2/5]

float2 ipu::sqrt ( float2  x)
inline

The square root function.

Parameters
x  A value of type float2.
Returns
The element-wise result of sqrt of x.

◆ sqrt() [3/5]

half ipu::sqrt ( half  x)
inline

The square root function.

Parameters
x  A value of type half.
Returns
The result of sqrt of x.

◆ sqrt() [4/5]

half2 ipu::sqrt ( half2  x)
inline

The square root function.

Parameters
x  A value of type half2.
Returns
The element-wise result of sqrt of x.

◆ sqrt() [5/5]

half4 ipu::sqrt ( half4  x)
inline

The square root function.

Parameters
x  A value of type half4.
Returns
The element-wise result of sqrt of x.

◆ store_postinc() [1/16]

void ipu::store_postinc ( char **  a,
char  v,
int  i 
)
inline

Post-incrementing store.

Parameters
a  Address of the variable holding the address to store to. The held address is incremented by i after the store.
v  Value to store, of type char.
i  Value by which to increment the address held in a after the store.

◆ store_postinc() [2/16]

void ipu::store_postinc ( float **  a,
float  v,
int  i 
)
inline

Post-incrementing store, targeting the st32step instruction.

Parameters
a  Address of the variable holding the address to store to. The held address is incremented by i after the store.
v  Value to store, of type float.
i  Value by which to increment the address held in a after the store.

◆ store_postinc() [3/16]

void ipu::store_postinc ( float2 **  a,
float2  v,
int  i 
)
inline

Post-incrementing store, targeting the st64step instruction.

Parameters
a  Address of the variable holding the address to store to. The held address is incremented by i after the store.
v  Value to store, of type float2.
i  Value by which to increment the address held in a after the store.

◆ store_postinc() [4/16]

void ipu::store_postinc ( half2 **  a,
half2  v,
int  i 
)
inline

Post-incrementing store, targeting the st32step instruction.

Parameters
a  Address of the variable holding the address to store to. The held address is incremented by i after the store.
v  Value to store, of type half2.
i  Value by which to increment the address held in a after the store.

◆ store_postinc() [5/16]

void ipu::store_postinc ( half4 **  a,
half4  v,
int  i 
)
inline

Post-incrementing store, targeting the st64step instruction.

Parameters
a  Address of the variable holding the address to store to. The held address is incremented by i after the store.
v  Value to store, of type half4.
i  Value by which to increment the address held in a after the store.

◆ store_postinc() [6/16]

void ipu::store_postinc ( int **  a,
int  v,
int  i 
)
inline

Post-incrementing store, targeting the stm32step instruction if i is a variable stride, and st32step otherwise.

Parameters
a  Address of the variable holding the address to store to. The held address is incremented by i after the store.
v  Value to store, of type int.
i  Value by which to increment the address held in a after the store.

◆ store_postinc() [7/16]

void ipu::store_postinc ( int2 **  a,
int2  v,
int  i 
)
inline

Post-incrementing store.

Parameters
a  Address of the variable holding the address to store to. The held address is incremented by i after the store.
v  Value to store, of type int2.
i  Value by which to increment the address held in a after the store.

◆ store_postinc() [8/16]

void ipu::store_postinc ( short **  a,
short  v,
int  i 
)
inline

Post-incrementing store.

Parameters
a  Address of the variable holding the address to store to. The held address is incremented by i after the store.
v  Value to store, of type short.
i  Value by which to increment the address held in a after the store.

◆ store_postinc() [9/16]

void ipu::store_postinc ( short2 **  a,
short2  v,
int  i 
)
inline

Post-incrementing store, targeting the stm32step instruction if i is a variable stride, and st32step otherwise.

Parameters
a  Address of the variable holding the address to store to. The held address is incremented by i after the store.
v  Value to store, of type short2.
i  Value by which to increment the address held in a after the store.

◆ store_postinc() [10/16]

void ipu::store_postinc ( short4 **  a,
short4  v,
int  i 
)
inline

Post-incrementing store.

Parameters
a  Address of the variable holding the address to store to. The held address is incremented by i after the store.
v  Value to store, of type short4.
i  Value by which to increment the address held in a after the store.

◆ store_postinc() [11/16]

void ipu::store_postinc ( uint2 **  a,
uint2  v,
int  i 
)
inline

Post-incrementing store.

Parameters
a  Address of the variable holding the address to store to. The held address is incremented by i after the store.
v  Value to store, of type uint2.
i  Value by which to increment the address held in a after the store.

◆ store_postinc() [12/16]

void ipu::store_postinc ( unsigned **  a,
unsigned  v,
int  i 
)
inline

Post-incrementing store, targeting the stm32step instruction if i is a variable stride, and st32step otherwise.

Parameters
a  Address of the variable holding the address to store to. The held address is incremented by i after the store.
v  Value to store, of type unsigned.
i  Value by which to increment the address held in a after the store.

◆ store_postinc() [13/16]

void ipu::store_postinc ( unsigned char **  a,
unsigned char  v,
int  i 
)
inline

Post-incrementing store.

Parameters
a  Address of the variable holding the address to store to. The held address is incremented by i after the store.
v  Value to store, of type unsigned char.
i  Value by which to increment the address held in a after the store.

◆ store_postinc() [14/16]

void ipu::store_postinc ( unsigned short **  a,
unsigned short  v,
int  i 
)
inline

Post-incrementing store.

Parameters
a  Address of the variable holding the address to store to. The held address is incremented by i after the store.
v  Value to store, of type unsigned short.
i  Value by which to increment the address held in a after the store.

◆ store_postinc() [15/16]

void ipu::store_postinc ( ushort2 **  a,
ushort2  v,
int  i 
)
inline

Post-incrementing store, targeting the stm32step instruction if i is a variable stride, and st32step otherwise.

Parameters
a  Address of the variable holding the address to store to. The held address is incremented by i after the store.
v  Value to store, of type ushort2.
i  Value by which to increment the address held in a after the store.

◆ store_postinc() [16/16]

void ipu::store_postinc ( ushort4 **  a,
ushort4  v,
int  i 
)
inline

Post-incrementing store.

Parameters
a  Address of the variable holding the address to store to. The held address is incremented by i after the store.
v  Value to store, of type ushort4.
i  Value by which to increment the address held in a after the store.
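
The store_postinc overloads all follow one pattern: store through the pointer that a refers to, then advance that pointer. A host-side sketch of the pattern follows. It is not the IPU implementation, and it assumes i counts elements of the stored type, as in ordinary C++ pointer arithmetic, which the text above does not state explicitly. The names store_postinc_model and store_postinc_example are hypothetical.

```cpp
#include <cassert>

// Assumption: the increment i is measured in elements of T, not in bytes.
template <typename T>
inline void store_postinc_model(T** a, T v, int i) {
  assert(a != nullptr && *a != nullptr);
  **a = v;  // store through the current address
  *a += i;  // then advance the caller's pointer by i elements
}

inline void store_postinc_example() {
  float buf[4] = {0.0f, 0.0f, 0.0f, 0.0f};
  float* p = buf;
  store_postinc_model(&p, 1.0f, 1);  // writes buf[0], p now points at buf[1]
  store_postinc_model(&p, 2.0f, 2);  // writes buf[1], p now points at buf[3]
  assert(buf[0] == 1.0f && buf[1] == 2.0f && buf[2] == 0.0f);
  assert(p == buf + 3);
}
```

Passing the pointer by address is what lets successive calls walk through a buffer without the caller updating the pointer itself.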

◆ sum() [1/2]

float ipu::sum ( half2  src)
inline

Targets the f16v2sum instruction.

Parameters
src  A value of type half2.
Returns
The sum of the two elements in src as a float.

◆ sum() [2/2]

float2 ipu::sum ( half4  src)
inline

Targets the f16v4sum instruction.

Parameters
src  A value of type half4.
Returns
The 2x2 lateral summation of the elements in src as a float2. The first element is the sum of src[0] and src[1], the second element is the sum of src[2] and src[3].
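
The two sum overloads reduce adjacent pairs of elements. A host-side sketch of the index pattern follows, with float standing in for half because standard C++ has no portable half type. It models only which elements are added together, not the precision of the IPU instructions. The names sum2_model, sum4_model and sum_example are hypothetical.

```cpp
#include <array>
#include <cassert>

// Model of sum(half2): both elements added into a single float.
inline float sum2_model(const std::array<float, 2>& src) {
  return src[0] + src[1];
}

// Model of sum(half4): element 0 of the result is src[0] + src[1],
// element 1 is src[2] + src[3].
inline std::array<float, 2> sum4_model(const std::array<float, 4>& src) {
  return {src[0] + src[1], src[2] + src[3]};
}

inline void sum_example() {
  assert(sum2_model({1.0f, 2.0f}) == 3.0f);
  const std::array<float, 2> r = sum4_model({1.0f, 2.0f, 3.0f, 4.0f});
  assert(r[0] == 3.0f && r[1] == 7.0f);
}
```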

◆ swap8()

int ipu::swap8 ( int  src)
inline

Targets the swap8 instruction.

Parameters
src  A value of type int.
Returns
The result of a SIMD swap permutation on the 4 8-bit values in src, as an int.

       src                  Result
| 3 | 2 | 1 | 0 |  ->  | 2 | 3 | 0 | 1 |
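
The swap8 diagram exchanges the two bytes within each 16-bit half of the word. A host-side sketch using two masks follows. It is not the IPU implementation, and it assumes lane 3 is the most significant byte. The names swap8_model and swap8_example are hypothetical, and uint32_t stands in for int so that the shifts are well defined.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: lane k is bits [8k+7 : 8k], so lane 3 is the top byte.
inline std::uint32_t swap8_model(std::uint32_t src) {
  const std::uint32_t even_up = (src & 0x00FF00FFu) << 8;   // lanes 2 and 0 move up
  const std::uint32_t odd_down = (src >> 8) & 0x00FF00FFu;  // lanes 3 and 1 move down
  return even_up | odd_down;
}

inline void swap8_example() {
  // Each byte carries its own lane number: | 3 | 2 | 1 | 0 | -> | 2 | 3 | 0 | 1 |
  assert(swap8_model(0x03020100u) == 0x02030001u);
  // Swapping twice restores the input.
  assert(swap8_model(swap8_model(0xDEADBEEFu)) == 0xDEADBEEFu);
}
```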

◆ tan() [1/5]

float ipu::tan ( float  x)
inline

The trigonometric tangent function.

Parameters
x  A value of type float.
Returns
The result of tan of x.

◆ tan() [2/5]

float2 ipu::tan ( float2  x)
inline

The trigonometric tangent function.

Parameters
x  A value of type float2.
Returns
The element-wise result of tan of x.

◆ tan() [3/5]

half ipu::tan ( half  x)
inline

The trigonometric tangent function.

Parameters
x  A value of type half.
Returns
The result of tan of x.

◆ tan() [4/5]

half2 ipu::tan ( half2  x)
inline

The trigonometric tangent function.

Parameters
x  A value of type half2.
Returns
The element-wise result of tan of x.

◆ tan() [5/5]

half4 ipu::tan ( half4  x)
inline

The trigonometric tangent function.

Parameters
x  A value of type half4.
Returns
The element-wise result of tan of x.

◆ tanh() [1/5]

float ipu::tanh ( float  x)
inline

Targets the f32tanh instruction.

The hyperbolic tangent function.

Parameters
x  A value of type float.
Returns
The result of tanh of x.

◆ tanh() [2/5]

float2 ipu::tanh ( float2  x)
inline

The hyperbolic tangent function.

Parameters
x  A value of type float2.
Returns
The element-wise result of tanh of x.

◆ tanh() [3/5]

half ipu::tanh ( half  x)
inline

The hyperbolic tangent function.

Parameters
x  A value of type half.
Returns
The result of tanh of x.

◆ tanh() [4/5]

half2 ipu::tanh ( half2  x)
inline

Targets the f16v2tanh instruction.

The hyperbolic tangent function.

Parameters
x  A value of type half2.
Returns
The element-wise result of tanh of x.

◆ tanh() [5/5]

half4 ipu::tanh ( half4  x)
inline

The hyperbolic tangent function.

Parameters
x  A value of type half4.
Returns
The element-wise result of tanh of x.

◆ tgamma() [1/5]

float ipu::tgamma ( float  x)
inline

The gamma function.

Parameters
x  A value of type float.
Returns
The result of tgamma of x.

◆ tgamma() [2/5]

float2 ipu::tgamma ( float2  x)
inline

The gamma function.

Parameters
x  A value of type float2.
Returns
The element-wise result of tgamma of x.

◆ tgamma() [3/5]

half ipu::tgamma ( half  x)
inline

The gamma function.

Parameters
x  A value of type half.
Returns
The result of tgamma of x.

◆ tgamma() [4/5]

half2 ipu::tgamma ( half2  x)
inline

The gamma function.

Parameters
x  A value of type half2.
Returns
The element-wise result of tgamma of x.

◆ tgamma() [5/5]

half4 ipu::tgamma ( half4  x)
inline

The gamma function.

Parameters
x  A value of type half4.
Returns
The element-wise result of tgamma of x.
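
For reference, the gamma function satisfies Gamma(n) = (n - 1)! for positive integers n, and Gamma(1/2) = sqrt(pi). The host-side sketch below checks those identities with std::tgamma. It does not call the ipu:: overloads and says nothing about their accuracy. The names close_to and tgamma_example are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Relative comparison, since library results are only approximate.
inline bool close_to(float a, float b, float tol = 1e-4f) {
  return std::fabs(a - b) <= tol * std::fabs(b);
}

inline void tgamma_example() {
  assert(close_to(std::tgamma(1.0f), 1.0f));   // Gamma(1) = 0! = 1
  assert(close_to(std::tgamma(5.0f), 24.0f));  // Gamma(5) = 4! = 24
  // Gamma(1/2) = sqrt(pi)
  assert(close_to(std::tgamma(0.5f), std::sqrt(3.14159265f)));
}
```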

◆ trunc() [1/5]

float ipu::trunc ( float  x)
inline

Rounds input towards zero to the nearest integral value that is not larger in magnitude than x.

Parameters
x  A value of type float.
Returns
The nearest integral value that is not larger in magnitude than x.

◆ trunc() [2/5]

float2 ipu::trunc ( float2  x)
inline

Rounds input towards zero to the nearest integral value that is not larger in magnitude than x.

Parameters
x  A value of type float2.
Returns
A vector where every ith element is the nearest integral value to x[i] that is not larger in magnitude.

◆ trunc() [3/5]

half ipu::trunc ( half  x)
inline

Rounds input towards zero to the nearest integral value that is not larger in magnitude than x.

Parameters
x  A value of type half.
Returns
The nearest integral value that is not larger in magnitude than x.

◆ trunc() [4/5]

half2 ipu::trunc ( half2  x)
inline

Rounds input towards zero to the nearest integral value that is not larger in magnitude than x.

Parameters
x  A value of type half2.
Returns
A vector where every ith element is the nearest integral value to x[i] that is not larger in magnitude.

◆ trunc() [5/5]

half4 ipu::trunc ( half4  x)
inline

Rounds input towards zero to the nearest integral value that is not larger in magnitude than x.

Parameters
x  A value of type half4.
Returns
A vector where every ith element is the nearest integral value to x[i] that is not larger in magnitude.
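
Rounding towards zero differs from rounding down only for negative inputs. The host-side sketch below shows the difference with std::trunc and std::floor. It does not call the ipu:: overloads, and it assumes they follow the same rule as std::trunc, as the description above suggests. The function name trunc_example is hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Host-side illustration only; assumes ipu::trunc behaves like std::trunc.
inline void trunc_example() {
  assert(std::trunc(2.7f) == 2.0f);    // positive: same as floor
  assert(std::trunc(-2.7f) == -2.0f);  // negative: moves towards zero
  assert(std::floor(-2.7f) == -3.0f);  // floor, by contrast, moves down
  assert(std::trunc(3.0f) == 3.0f);    // integral values are unchanged
}
```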