8. Release notes

8.1. Version 1.1.0

8.1.1. New features

Initial public release of the IPU Operator, consisting of the following containerised components:

  • controller: executes the main control loop of the operator;

  • vipu-proxy: performs IPU partition operations (create, reset, delete). The proxy also tracks IPU usage and exposes endpoints that report usage statistics and check IPU availability;

  • launcher: sends IPU partition creation requests to vipu-proxy and waits for workers to be ready before allowing job execution.

8.1.2. Bug fixes


8.1.3. Other improvements


8.1.4. Known issues

  • IPUJob Pods must run on a Kubernetes node with IPU access, so at least one worker node must be an IPU-POD head node;

  • To access the RDMA network interface on the head node, the IPUJob Pods must use host networking and run in privileged mode;

  • For parallel IPUJobs with more than one worker Pod, you must specify the network interface used for MPI communication with the mpirun --mca btl_tcp_if_include option;

  • IPU partitions larger than 64 IPUs are currently not supported.
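
The constraints above can be sketched as a few small helpers. This is only an illustration under stated assumptions, not part of the IPU Operator: the function names, the container name "worker" and the interface name "eth0" are hypothetical. The hostNetwork and securityContext.privileged fields are standard Kubernetes Pod spec fields, and --mca btl_tcp_if_include is a standard Open MPI option; where exactly these settings belong in an IPUJob manifest is not covered here.

```python
from typing import Any, Dict, List

# Limit taken from the known issues above.
MAX_PARTITION_IPUS = 64


def pod_spec_overrides() -> Dict[str, Any]:
    """Return standard Kubernetes Pod spec fields that enable host
    networking and privileged mode (needed for RDMA interface access)."""
    return {
        "hostNetwork": True,
        "containers": [
            {
                "name": "worker",  # hypothetical container name
                "securityContext": {"privileged": True},
            }
        ],
    }


def mpirun_prefix(interface: str, num_workers: int) -> List[str]:
    """Build the leading mpirun arguments for a job.

    With more than one worker Pod, the MPI network interface is pinned
    using the Open MPI option --mca btl_tcp_if_include.
    """
    if num_workers < 1:
        raise ValueError("num_workers must be at least 1")
    cmd = ["mpirun"]
    if num_workers > 1:
        cmd += ["--mca", "btl_tcp_if_include", interface]
    return cmd


def check_partition_size(num_ipus: int) -> None:
    """Reject partition sizes above the currently supported maximum."""
    if num_ipus < 1:
        raise ValueError("num_ipus must be at least 1")
    if num_ipus > MAX_PARTITION_IPUS:
        raise ValueError(
            f"partitions larger than {MAX_PARTITION_IPUS} IPUs are not supported"
        )


if __name__ == "__main__":
    check_partition_size(16)
    print(pod_spec_overrides())
    print(" ".join(mpirun_prefix("eth0", num_workers=2)))  # "eth0" is a placeholder
```

Checking the partition size before submitting a job makes the 64-IPU limit fail early on the client side rather than at partition creation time.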

8.1.5. Compatibility changes