7. Known limitations

The following limitations currently apply:

  1. By default, IPUs can only be accessed from within the IPU-POD network.

    IPUJob Pods must therefore run on a Kubernetes node that can access the IPUs, which means that at least one of the IPU-POD head nodes has to be a Kubernetes worker node.

  2. For parallel IPUJobs (jobs with more than one Worker Pod), you must specify the network interface used for MPI communication by passing the --mca btl_tcp_if_include option to mpirun.

  3. IPU partitions larger than 64 IPUs are currently not supported.
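
As an illustration of limitations 2 and 3, the following is a minimal Python sketch of how a launcher script might assemble the mpirun command line for a parallel IPUJob. It is not part of the operator: the helper name build_mpirun_command, the interface name "eth0" and the script name "train.py" are hypothetical placeholders, and the interface to use depends on your cluster network. Only the --mca btl_tcp_if_include option itself comes from the text above.

```python
import shlex

# Limit taken from item 3 of the list above.
MAX_PARTITION_IPUS = 64


def build_mpirun_command(interface, program_args, num_ipus):
    """Return an mpirun argument list pinned to one network interface.

    Hypothetical helper for illustration only.
    """
    if not interface:
        raise ValueError("a network interface name is required")
    if num_ipus > MAX_PARTITION_IPUS:
        raise ValueError(
            f"{num_ipus} IPUs requested; partitions larger than "
            f"{MAX_PARTITION_IPUS} IPUs are not supported"
        )
    # --mca btl_tcp_if_include restricts Open MPI's TCP transport to the
    # named interface(s).
    return ["mpirun", "--mca", "btl_tcp_if_include", interface, *program_args]


if __name__ == "__main__":
    # "eth0" and "train.py" are placeholder values, not defaults.
    cmd = build_mpirun_command("eth0", ["python3", "train.py"], num_ipus=16)
    print(shlex.join(cmd))
```

The resulting argument list could then be used as the command of the IPUJob's launcher. Checking the IPU count before submission is a design choice of this sketch, so that an unsupported partition size is reported early rather than at scheduling time.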