6. Integration with Kubernetes
Preview Release
This is an early release of Kubernetes support for the IPU. As such the software is subject to change without notice.
The Kubernetes IPU Operator for V-IPU is available on request from Graphcore support.
Kubernetes (K8s) is an open-source container orchestration and management system. Kubernetes Operators allow the Kubernetes API to be extended with custom objects, and implement the control logic for such custom objects. You can read more about the Operator pattern in the Kubernetes documentation.
The IPU Operator provides a framework to extend the Kubernetes API and to manage IPUs via custom resource definitions (CRDs) and custom controllers. It allows you to specify the number of IPUs required for your Kubernetes workload using annotations.
This chapter outlines the Operator components, installation steps and usage.
Note
Kubernetes uses the word Pod to refer to the smallest deployable units of computing that you can create and manage in Kubernetes. This is not to be confused with the Graphcore IPU-POD, which is a rack-based system of IPUs.
6.1. Components and design
The Operator contains the following components:
- The gc-proxy that communicates with the V-IPU controller (vipu-server)
- The CRD and controller that let you allocate IPUs directly from the Kubernetes cluster.

The gc-proxy is responsible for:

- Managing the IPU resources by communicating with the V-IPU controller
- Running the REST API server to serve requests from init-containers for partition creation
The CRD and custom controller extend the Kubernetes API and manage IPU resources on your behalf. They are responsible for:
- Watching for CRD events
- Creating worker and launcher Pods based on the CRD configuration, which includes:
  - Adding a finaliser to the custom resource to release the IPUs on deletion
  - Setting hostNetwork and securityContext/privileged to true
  - Setting the Pod dnsPolicy to ClusterFirstWithHostNet
- Providing webhook REST endpoints to validate the input CRD specification
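A minimal sketch of the Pod-level settings listed above, as they would appear in a Pod specification (illustrative only; the controller applies these settings for you, and the container name is a placeholder):

spec:
  hostNetwork: true                    # launcher/worker Pods use the host network
  dnsPolicy: ClusterFirstWithHostNet   # required when hostNetwork is true
  containers:
    - name: worker                     # placeholder container name
      securityContext:
        privileged: true               # privileged mode, as set by the controller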
6.2. Package contents
The software is delivered as a single tarball containing the following files and directories:
- The CRD specification: gc-ipu-operator-v1.0.0-alpha-5/CRDs/graphcore.ai_ipujobs.yaml
- Documentation for the Operator: gc-ipu-operator-v1.0.0-alpha-5/docs/
- The Helm Chart: gc-ipu-operator-v1.0.0-alpha-5/gc-ipu-operator-helm-chart-v1.0.0-alpha-5.tgz
- The Operator and gc-proxy images: gc-ipu-operator-v1.0.0-alpha-5/gc-operator-images.tar.gz
- Checksum for the Operator: gc-ipu-operator-v1.0.0-alpha-5/gc-operator.cksm
- Checksum for the Helm Chart: gc-ipu-operator-v1.0.0-alpha-5/gc-ipu-operator-helm-chart.cksm
6.3. Deploying the software
6.3.1. Prerequisites
Before you can use IPUs from your Kubernetes workloads, you need to meet the following conditions:
- Have access to one or more Graphcore IPU-PODs
- Have a compatible version of the V-IPU controller installed on your IPU-PODs
- Create a Kubernetes cluster. At least one of the worker nodes in the cluster must run on the head node of the IPU-POD. See Section 6.9, Known limitations for more information.
- Have the kubectl and Helm (v3.0.0 or later) command-line tools installed on your machine.
6.3.2. Installation
Installing the CRDs
To install the CRDs, run the following command:
$ kubectl apply -f <dir>/CRDs/graphcore.ai_ipujobs.yaml
Installing the Operator
Unzip the Helm package and run the following command:
$ helm install <release-name> <path-to-chart-tar> <custom-parameters>
Where:

- <release-name> is the name you choose for this Helm installation
- <path-to-chart-tar> is the path to the downloaded Helm Chart tar file
- <custom-parameters> is where you customize the installation. You can either use multiple --set key=value arguments, or put your customization in a YAML file and use the --values your-values.yaml argument (an example values file is shown below).

See Section 6.4, Configurations for more information.
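For instance, a values file passed with --values might look like the following sketch; the keys mirror the --set options used in the example below, and all values are placeholders:

# your-values.yaml (sketch only; replace the placeholder values for your installation)
vipuServerAddr: "pod001.example.com"   # V-IPU controller address
vipuServerPort: 8191                   # V-IPU controller port
vipuClusterName: "cluster1"            # V-IPU cluster name
controller:
  image:
    repository: "<controller-image>"
vipuProxy:
  image:
    repository: "<proxy-image>"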
For example, the following command deploys the software to the Kubernetes cluster in the default configuration.
$ cd ipu-proxy
$ helm install [RELEASE_NAME] . --set vipuServerAddr=[host] --set vipuServerPort=[port] --set vipuClusterName=[cluster] \
--set controller.image.repository=[controller-image] --set vipuProxy.image.repository=[proxy-image]
This command installs the following in the same namespace where the Helm release is installed:

- gc-proxy and gc-controller as deployments
- A gc-proxy service of type ClusterIP
- RBAC: a ServiceAccount and a ClusterRole to manage Pods and ConfigMaps
- A partitions tracker ConfigMap
- Configuration objects for the mutation and validation webhooks
You can read more about installing Helm in the Helm documentation.
You can see all the customization options in the README.md
for the Helm Charts.
Multiple V-IPU controller support
The IPU Operator can communicate with multiple V-IPU controllers.
You can specify multiple V-IPU controllers during installation by setting the
vipuControllers
option on the helm install
command line. For example:
--set vipuControllers="pod001:8090:ipunode=node1,pod002:8091:ipunode=node2"
Alternatively, after installation you can edit the ConfigMap, as shown below, and update the value.
$ kubectl edit configmap gc-ipu-operator-vipu-controllers
Each V-IPU controller is specified with a colon-separated list of three values:
- V-IPU controller host address
- V-IPU controller port
- A label defined by key=value
The same label must be added to the node where the containers corresponding to that V-IPU controller will run. Labeling the node is done with the following command:
$ kubectl label nodes <someworkernode> <key>=<value>
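For example, with the vipuControllers value shown above, the node that should run containers for the pod001 controller would carry the matching label. A sketch of the resulting node metadata (the node name is a placeholder):

apiVersion: v1
kind: Node
metadata:
  name: someworkernode        # placeholder node name
  labels:
    ipunode: node1            # matches the label in "pod001:8090:ipunode=node1"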
The ConfigMap can be modified at any time and the IPU Operator automatically adds the new V-IPU controller to its internal list. It can take up to 60 seconds for the new V-IPU controller to be added. When a partition is created, the IPU Operator goes through the list serially until it finds space for the requested number of IPUs.
Verify the installation is successful
When the installation is complete, you can verify that it worked correctly by running the following commands and seeing similar output:
$ kubectl get crd
NAME CREATED AT
ipujobs.graphcore.ai 2021-03-02T12:20:04Z
...
$ helm ls -n <the-namespace-where-you-deployed-the-operator>
NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION
gc default 1 2021-03-18 11:50:31.35861 +0100 CET deployed gc-ipu-operator-helm-chart-v1.0.0 v1.0.0
$ kubectl get pods -n <the-namespace-where-you-deployed-the-operator>
NAME READY STATUS RESTARTS AGE
gc-ipu-operator-controller-manager-54766f7f7b-x5wtr 2/2 Running 0 5d23h
gc-ipu-operator-vipu-proxy-844c7d6b7f-88bqr 1/1 Running 1 5d23h
6.3.3. Uninstall
$ helm uninstall [RELEASE_NAME]
This removes all the Kubernetes components associated with the chart and deletes the release.
See helm uninstall for command documentation.
Note
The partition tracker ConfigMap ipu-partitions-tracker
does not get deleted
when you uninstall the Helm release. This is so that when the ipu-proxy
is deployed
again, it can pick up from where it was uninstalled before (in terms of managing the
created partitions). If you wish to remove that ConfigMap, you can run:
kubectl delete configmap ipu-partitions-tracker -n <namespace>
6.3.4. Upgrading the Helm Chart
$ helm upgrade [RELEASE_NAME] [CHART]
See helm upgrade for command documentation.
6.4. Configurations
The following table lists the configurable parameters of the Helm Chart and their default values.
| Parameter | Description | Default |
|---|---|---|
| global.launcherImage | The container image used for each IPUJob launcher init container | launcher:latest |
| global.imagePullSecrets | A map of image pull secret names | [] |
| nameOverride | Override the name of the chart in the generated chart resource names | "" |
| fullNameOverride | Override the fully qualified app name used in naming the generated chart resources. If this is not set, a default fully qualified app name is generated. | "" |
| controller.hostNetwork | Set the hostNetwork flag for the controller Pod | true |
| controller.dnsPolicy | Set the dnsPolicy to ClusterFirstWithHostNet if hostNetwork is true, otherwise set it to ClusterFirst | ClusterFirstWithHostNet |
| controller.image.repository | Controller image repository | "" |
| controller.image.pullPolicy | Controller image pull policy | Always |
| controller.image.tag | Overrides the controller image tag, whose default is the chart appVersion | "" |
| controller.serviceAccount.create | Set to true to create a service account for the controller | true |
| controller.serviceAccount.annotations | Annotations to add to the service account | {} |
| controller.serviceAccount.name | The name of the service account to use. If not set and create is true, a name is generated using the "fullname" template | "" |
| controller.rbac.create | Set to true to create RBAC ClusterRole and ClusterRoleBinding and attach them to the service account | true |
| vipuServerAddr | V-IPU controller (server) address | example.com |
| vipuServerPort | V-IPU controller (server) port | 8191 |
| vipuClusterName | Name of the V-IPU cluster to use | test |
| proxyPort | | 8080 |
| ipuVisibility.advertiseOnMaster | If true, advertise IPU availability on the master node(s) | false |
| podAnnotations | | {} |
| podSecurityPolicyContext | | {} |
| securityContext | Security context | {} |
| service.type | | clusterIP |
| service.port | | 80 |
| resources | | {} |
| nodeSelector | | {} |
| tolerations | | {} |
| affinity | | {} |
| extendedScheduler.enabled | If true, the Kubernetes default scheduler extension is set up. You must manually restart the Kubernetes default scheduler after the Helm release installation; the installation prints out the instructions to do so. | false |
| admissionWebhooks.scope | Comma-separated list of namespaces where the webhook will perform mutations/validations. Leaving this empty/unset means mutation is performed on all namespaces. | "" |
| admissionWebhooks.timeoutSeconds | Admission webhook timeout | 30 |
| admissionWebhooks.image.repository | Admission webhook image repository | |
| admissionWebhooks.image.tag | Admission webhook image tag | v0.2.0 |
| admissionWebhooks.image.pullPolicy | Admission webhook image pull policy | IfNotPresent |
| admissionWebhooks.failurePolicy | Admission webhook failure policy | Fail |
| admissionWebhooks.port | Admission webhook Pod port | 8443 |
| admissionWebhooks.service.annotations | Admission webhook service annotations | {} |
| admissionWebhooks.service.servicePort | Admission webhook service port | 443 |
| admissionWebhooks.service.type | Admission webhook service type | clusterIP |
| admissionWebhooks.patch.enabled | Create and configure the admission webhook TLS certificate | true |
| admissionWebhooks.patch.image.repository | Admission webhook TLS patch image repository | |
| admissionWebhooks.patch.image.tag | Admission webhook TLS patch image tag | v1.3.0 |
| admissionWebhooks.patch.image.pullPolicy | Admission webhook TLS patch image pull policy | |
| admissionWebhooks.patch.priorityClassName | Admission webhook TLS patch jobs priority class | "" |
| admissionWebhooks.patch.podAnnotations | Admission webhook TLS patch jobs Pod annotations | {} |
| admissionWebhooks.patch.nodeSelector | Admission webhook TLS patch jobs nodeSelector | {} |
| admissionWebhooks.patch.tolerations | Admission webhook TLS patch jobs tolerations | {} |
| admissionWebhooks.patch.runAsUser | Admission webhook TLS patch jobs run-as user | 2000 |
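As an illustration, a values file overriding a few of these parameters might look like the following sketch (values are placeholders; the namespace name is hypothetical):

extendedScheduler:
  enabled: true            # remember to restart the default scheduler afterwards
admissionWebhooks:
  scope: "ml-jobs"         # hypothetical namespace; empty means all namespaces
  timeoutSeconds: 30
ipuVisibility:
  advertiseOnMaster: false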
6.5. Creating an IPUJob
Once the CRDs and the IPU Operator are installed, you can start submitting IPUJobs (MPI-based AI/ML jobs that use IPUs). The following YAML file is an example of a declarative definition of an IPUJob for the ResNet-8 TensorFlow application:
apiVersion: graphcore.ai/v1alpha1 # the API that defined this API object type
kind: IPUJob # the kind of this Kubernetes object
metadata:
name: ipujob-sample # the name of the job
spec:
modelReplicas: "4" # how many replicas should the graph model be split into when being processed
ipusPerModelReplica: "1" # how many IPUs should be assigned to each model replica
launcher:
command: # the command to trigger the job execution
- mpirun
- --allow-run-as-root
- --bind-to
- none
- -np
- "1"
- python3
- /public_examples/applications/tensorflow/cnns/training/train.py
- --dataset=cifar-10
- --synthetic-data
- --model-size=8
- --batch-size=1
- --batches-per-step=10
- --gradient-accumulation-count=10
- --no-validation
- --no-stochastic-rounding
- --iterations=20
workers:
replicas: 1 # how many workers (poplar instances) should participate in this execution
template: # native Kubernetes Pod template. https://kubernetes.io/docs/concepts/workloads/pods/#pod-templates
metadata:
labels:
app: resnet-launcher
spec:
containers: # the containers running inside each worker
- name: resnet
image: resnet:latest
env: # environment variables set on each worker
- name: "IPUOF_LOG_LEVEL"
value: "INFO"
- name: "POPLAR_LOG_LEVEL"
value: "INFO"
Download single-gcd-sample.yaml
Save the above specification file as single-gcd-sample.yaml
then run:
$ kubectl apply -f single-gcd-sample.yaml
ipujob.graphcore.ai/ipujob-sample created
Now you can inspect what happens in the cluster and you should see something similar to:
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
gc-ipu-operator-controller-manager-6ff6b6875d-ncjgp 2/2 Running 0 3d22h
gc-ipu-operator-vipu-proxy-849dbf98df-rg8gh 1/1 Running 0 3d22h
ipujob-sample-launcher 1/1 Running 0 10s
ipujob-sample-worker-0 1/1 Running 0 25s
You can also list the IPUJobs in the cluster and see their status:
$ kubectl get ipujobs.graphcore.ai
NAME STATUS AGE
ipujob-sample Running 40s
And you can inspect more details about a specific IPUJob as follows:
$ kubectl describe ipujobs.graphcore.ai ipujob-sample
Name: ipujob-sample
Namespace: default
Labels: <none>
Annotations: <none>
API Version: graphcore.ai/v1alpha1
Kind: IPUJob
Metadata:
Creation Timestamp: 2021-03-22T10:10:31Z
Finalizers:
ipu.finalizers.graphcore.ai
Generation: 2
Manager: manager
Operation: Update
Time: 2021-03-22T10:10:45Z
Resource Version: 29226482
Self Link: /apis/graphcore.ai/v1alpha1/namespaces/default/ipujobs/ipujob-sample
UID: beb81bbe-2309-494a-9e28-2a75a704be15
Spec:
Clean Pod Policy: None
Ipus Per Model Replica: 1
Launcher:
Command:
mpirun
--allow-run-as-root
--bind-to
none
-np
1
python3
/public_examples/applications/tensorflow/cnns/training/train.py
--dataset=cifar-10
--synthetic-data
--model-size=8
--batch-size=1
--batches-per-step=10
--gradient-accumulation-count=10
--no-validation
--no-stochastic-rounding
--iterations=20
Model Replicas: 4
Restart Policy:
Back Off Limit: 3
Type: Never
Workers:
Replicas: 1
Template:
Metadata:
Spec:
Containers:
Env:
Name: IPUOF_LOG_LEVEL
Value: INFO
Name: POPLAR_LOG_LEVEL
Value: INFO
Image: artifactory-systems.eng.graphcore.ai/vipu-k8s-docker-dev-local/resnet-poplar-2.0:operator
Name: resnet
Resources:
Status:
Conditions:
Last Transition Time: 2021-03-22T10:10:31Z
Last Update Time: 2021-03-22T10:10:31Z
Message: IPUJob default/ipujob-sample is waiting for resources to be ready.
Reason: IPUJobPending
Status: False
Type: Pending
Last Transition Time: 2021-03-22T10:10:45Z
Last Update Time: 2021-03-22T10:10:45Z
Message: IPUJob default/ipujob-sample is running.
Reason: IPUJobRunning
Status: True
Type: Running
IPU Partition Created:  true
Launcher Status: Running
Restart Count: 0
Start Time: 2021-03-22T10:10:31Z
Workers Status:
Active: 1
6.5.1. Interactive mode
You can also run the IPUJob in interactive mode, where it does not execute anything by default:
apiVersion: graphcore.ai/v1alpha1
kind: IPUJob
metadata:
name: interactive-sample-job
spec:
modelReplicas: "4"
ipusPerModelReplica: "1"
interactive:
ttl: 3600 # how long should the interactive session last
workers:
replicas: 1
template:
metadata:
labels:
app: resnet-launcher
spec:
containers:
- name: resnet
image: resnet:latest
imagePullPolicy: Always
env:
- name: "IPUOF_LOG_LEVEL"
value: "INFO"
- name: "POPLAR_LOG_LEVEL"
value: "INFO"
Download interactive-job.yaml
Save the above specification as interactive-job.yaml
then run:
$ kubectl apply -f interactive-job.yaml
ipujob.graphcore.ai/interactive-sample-job created
Then you can get terminal access to the job’s launcher Kubernetes Pod:
$ kubectl exec -it interactive-sample-job-launcher -- bash
root@interactive-sample-job-launcher:/public_examples/applications/tensorflow/cnns/training# <run your mpi programs here>
6.5.2. Mounting data volumes for an IPUJob
Every IPUJob will require an input dataset and will possibly produce output files (for example, checkpoints and trained models). The Kubernetes Pods are ephemeral by nature. This means that all files inside the containers are lost when the containers are removed. For data persistence, the IPU Operator relies on using the native Kubernetes volumes.
By specifying volumes
and volumeMounts
under the IPUJob workers’ Pod
template, all the IPUJob workers and their launcher will have the same volume(s)
mounted at the same path. This means that the workers and the launcher all see
the same file system at certain path(s). One thing to keep in mind, however, is
that you need to use a Persistent Volume type that supports multiple Read/Write
mounts. See the Kubernetes
documentation
for a list of volume types you can use.
Here is an example of the single-gcd-sample.yaml
we used above with volumes added to it:
# native Kubernetes volume which uses NFS
apiVersion: v1
kind: PersistentVolume
metadata:
name: nfs-shared-storage
spec:
capacity:
storage: 5Gi
volumeMode: Filesystem
accessModes:
- ReadWriteMany
persistentVolumeReclaimPolicy: Recycle
storageClassName: slow
mountOptions:
- hard
- nfsvers=4.1
nfs:
server: nfs-server.default.svc.cluster.local # this should be your NFS server endpoint
path: "/"
---
# Persistent Volume Claim
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: nfs-pvc
spec:
accessModes:
- ReadWriteMany
storageClassName: ""
resources:
requests:
storage: 5Gi
---
apiVersion: graphcore.ai/v1alpha1 # the API that defined this API object type
kind: IPUJob # the kind of this Kubernetes object
metadata:
name: ipujob-sample # the name of the job
spec:
modelReplicas: "4" # how many replicas should the graph model be split into when being processed
ipusPerModelReplica: "1" # how many IPUs should be assigned to each model replica
launcher:
command: # the command to trigger the job execution
- mpirun
- --allow-run-as-root
- --bind-to
- none
- -np
- "1"
- python3
- /public_examples/applications/tensorflow/cnns/training/train.py
- --dataset=cifar-10
- --synthetic-data
- --model-size=8
- --batch-size=1
- --batches-per-step=10
- --gradient-accumulation-count=10
- --no-validation
- --no-stochastic-rounding
- --iterations=20
workers:
replicas: 1 # how many workers (poplar instances) should participate in this execution
template: # native Kubernetes Pod template. https://kubernetes.io/docs/concepts/workloads/pods/#pod-templates
metadata:
labels:
app: resnet-launcher
spec:
volumes: # we define here which volumes we want to use with the workers (the same is applied to the launcher too)
- name: mypvc
persistentVolumeClaim:
claimName: nfs-pvc # that is the persistent volume claim we created in the above object
containers: # the containers running inside each worker
- name: resnet
image: resnet:latest
env: # environment variables set on each worker
- name: "IPUOF_LOG_LEVEL"
value: "INFO"
- name: "POPLAR_LOG_LEVEL"
value: "INFO"
volumeMounts:
- name: mypvc # the name of the volume defined in the volumes section
mountPath: /mnt/sample # this is where we mount the volume into both workers and the launcher
---
Download single-gcd-sample-nfs.yaml
The above specification will create an NFS persistent volume (assuming you have an NFS server available), and a persistent volume claim requesting the same amount of storage as the persistent volume.
The IPUJob then mounts that NFS volume at /mnt/sample
in the job’s workers and launcher.
6.5.3. Automatic restarts
You may want your IPUJob to automatically restart in certain cases. Currently,
we support four restart policies which can be defined under the IPUJob
specification: “Never”, “Always”, “OnFailure” and “ExitCode”. You can find more details about
these in the CRD reference documentation in the docs
directory of the release package.
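For example, a restart policy section in the IPUJob specification might look like the following sketch (field names are inferred from the kubectl describe output shown earlier; check the CRD reference for the exact schema):

spec:
  restartPolicy:
    type: OnFailure      # one of Never, Always, OnFailure, ExitCode
    backOffLimit: 3      # assumed to cap the number of restart attempts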
6.5.4. Clean up Kubernetes resources and IPU partitions
Once the job is finished and is no longer going to be restarted, automatic cleanup
can be performed to free the Kubernetes resources that are no longer needed. This is
defined by the cleanPodPolicy under the IPUJob specification. You can explore the
options in the CRD reference documentation.
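For instance (a sketch only; None is the value shown in the kubectl describe output earlier, and the other supported values are listed in the CRD reference):

spec:
  cleanPodPolicy: None   # which Pods to clean up once the job finishes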
The IPU partitions are currently only cleaned up when the IPUJob is deleted.
6.6. Debugging problems
When something does not work as expected, you need to debug the problem to understand how to fix it.
Before we talk about how to debug a failed IPUJob, it is probably good to understand how the IPU Operator executes an IPUJob.
6.6.1. How does the IPU Operator work?
The IPU Operator consists of a few components:
- Controller: the reconcile loop that makes sure the desired state (defined in the IPUJob specification) matches the state of the world.
- Admission webhooks: there are two webhooks:
  - Defaulting (mutation) webhook: adds default values to the submitted IPUJob specification.
  - Validating webhook: validates the IPUJob specification.
- V-IPU proxy: proxies IPU partition operations to the V-IPU controller and keeps track of the partitions created for jobs running inside the cluster.
When an IPUJob is created in the cluster, the IPU Operator gets notified and creates the following Kubernetes resources:
- A ConfigMap that holds two things:
  - A kubeexec script, which is used by MPI to trigger remote execution with the worker Pods from the launcher Pod.
  - A hostfile, which lists the worker Pods that MPI will use for remote execution.
- A Kubernetes RBAC role, role-binding and service account for the launcher Pod, which allow the launcher to list and watch worker Pods and exec into them.
- A set of worker Pods which participate in the job execution. These Pods are placed into a 365-day sleep as the main background process until the launcher triggers the job processes on them.
- A launcher Pod which contains two components:
  - An init-container which runs a small application we provide with the IPU Operator. This program watches the worker Pods until they are all available and in the Ready state. The program also creates the required IPU partition by interacting with the V-IPU proxy. If the V-IPU proxy sees that this partition already exists and is already owned by this IPUJob, it resets the partition. This makes it possible to restart the job with a clean IPU partition.
  - The main container, which uses the image provided by the user and runs the user-defined command to trigger the job execution.
The IPU Operator also sets environment variables on the worker Pods that allow Poplar to see and use the IPUs when running the AI/ML program.
6.6.2. Debugging
There are a few places to look for debug info:
- The status updates for the IPUJob, which you can find by using kubectl to describe the job
- The launcher Pod logs, which can be found by running:

$ kubectl logs <ipujob-name>-launcher -n <the-namespace-where-the-job-was-deployed>

- The controller logs, which can be found by running:

$ kubectl logs <controller-manager-pod-name> -n <the-namespace-where-the-operator-was-deployed>

- The V-IPU proxy logs, which can be found by running:
$ kubectl logs <vipu-proxy-Pod-name> -n <the-namespace-where-the-operator-was-deployed>
6.7. IPU usage statistics
The V-IPU proxy in the IPU Operator keeps track of the IPU partitions used by IPUJobs running inside the cluster. The data is stored in a ConfigMap which links each IPUJob with the partition it is using. On top of that tracker ConfigMap, the V-IPU proxy exposes a couple of read-only REST endpoints that you can utilize.
By default, these endpoints are only exposed within the cluster, so we can use any container with curl to query them, using the following commands:
/stats

# from inside a Kubernetes Pod that has curl
# gc-ipu-operator-vipu-proxy is the Kubernetes service name for the V-IPU proxy.
# This name can be different in your installation
$ curl gc-ipu-operator-vipu-proxy/stats | jq .
{
  "default": {          # the default namespace
    "used": 4,
    "available": 28
  },
  "total": {
    "used": 4,
    "available": 28
  }
}
/query

# from inside a Kubernetes Pod that has curl
# gc-ipu-operator-vipu-proxy is the Kubernetes service name for the V-IPU proxy.
# This name can be different in your installation
$ curl --request POST -H "Content-Type: application/json" --data '{"size":2}' gc-ipu-operator-vipu-proxy/query | jq .
{
  "available": true,
  "numOfPartitions": 14,   # 14 possible partitions are available
  "message": ""
}
$ curl --request POST -H "Content-Type: application/json" --data '{"size":64}' gc-ipu-operator-vipu-proxy/query | jq .
{
  "available": true,
  "numOfPartitions": 0,    # no partitions of the requested size are available
  "message": ""
}
6.8. Operator Metrics
The IPU Operator exposes a set of Prometheus metrics that you can use. However, these metrics are exposed behind a protected endpoint. The IPU Operator creates a ClusterRole that grants the permissions to scrape the metrics. To allow your Prometheus server to scrape those metrics, you need to bind that ClusterRole to the service account that the Prometheus server uses.
# find the ClusterRole you must use. Note that the name can be different in your installation.
$ kubectl get clusterrole -l component=metrics-reader
NAME CREATED AT
gc-metrics-reader 2021-03-24T11:13:07Z
# create a ClusterRoleBinding. The service account must be the one that the Prometheus Server uses.
$ kubectl create clusterrolebinding metrics --clusterrole=gc-metrics-reader --serviceaccount=<namespace>:<service-account-name>
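Equivalently, the binding can be declared as a manifest and applied with kubectl apply (a sketch; substitute the namespace and service account that your Prometheus server uses):

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: metrics
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: gc-metrics-reader      # the name can be different in your installation
subjects:
  - kind: ServiceAccount
    name: <service-account-name>
    namespace: <namespace>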
6.9. Known limitations
There are currently a few limitations:
- IPUs can only be accessed from within the IPU-POD network by default. Therefore, IPUJob Pods must be run on a Kubernetes node that can access the IPUs, which means that at least one of the IPU-POD head nodes has to be a Kubernetes worker node.
- In order to access the RDMA network interface on the head node, the IPUJob Pods run on the host network and in privileged mode.
- For parallel IPUJobs (jobs with more than one worker Pod), you must specify the network interface which will be used for MPI communication using the mpirun --mca btl_tcp_if_include option (see the sketch after this list).
- IPU partitions larger than 64 IPUs are currently not supported.
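A sketch of how the mpirun option could be added to the launcher command of the earlier IPUJob example (the interface name and process count are placeholders):

launcher:
  command:
    - mpirun
    - --allow-run-as-root
    - --mca
    - btl_tcp_if_include
    - ens1f0          # placeholder: the network interface to use for MPI communication
    - --bind-to
    - none
    - -np
    - "2"             # placeholder: number of MPI processes, typically one per worker
    - python3
    - /public_examples/applications/tensorflow/cnns/training/train.py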