3. Configurations

The IPU Operator exposes many configuration parameters. You can set them during installation or upgrade, or change them later through the relevant ConfigMap. The table below lists the configurable parameters of the Helm Chart and their default values.
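Parameter names in the table use Helm's dot notation, where each dot descends one level in the chart's values structure. As an illustration only, the Python sketch below expands dotted names into the nested mapping that a values file would contain. The helper name `expand_dotted` and the override values are hypothetical; only the parameter names come from the table.

```python
def expand_dotted(params):
    """Expand {'a.b.c': v} into {'a': {'b': {'c': v}}}."""
    values = {}
    for dotted_name, value in params.items():
        node = values
        *parents, leaf = dotted_name.split(".")
        # Walk (and create, if needed) one nested dict per parent key.
        for key in parents:
            node = node.setdefault(key, {})
        node[leaf] = value
    return values


# Hypothetical overrides; the values are chosen for illustration only.
overrides = {
    "controller.develLogs": False,
    "controller.resources.limits.cpu": "500m",
    "vipuProxy.logLevel": 3,
}
nested = expand_dotted(overrides)
print(nested)
```

The resulting nested mapping has the same shape as a values file passed to Helm, so `controller.resources.limits.cpu` ends up under `controller`, then `resources`, then `limits`.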

Table 3.1 Helm Chart parameters

Parameter

Description

Default

nameOverride

Override the name of the chart in the generated Chart resource names

“”

fullNameOverride

Override the fully qualified app name used in naming the generated chart resources. If this is not set, the fully qualified app name defaults to: <helm-release-name>-<Chart name, or nameOverride if it is set>

“ipu-operator”

global.imagePullSecrets

A list of image pull secret names (for example, name: "test-secret")

[]

global.kubectlVersion

Kubectl server version

“”

global.launcherImage

The container image used for each IPUJob launcher container

graphcore/ipu-operator-launcher

global.launcherImageTag

The version tag of the IPUJob launcher container image

“”

global.launcherImagePullPolicy

Launcher image pull policy

“IfNotPresent”

global.vipuControllers

A string containing a comma-separated list of V-IPU Controller definitions. Each definition is built from three elements separated by a colon (:): 1. the IP address or DNS name of the V-IPU Controller; 2. the V-IPU Controller listening port (most often 8090); 3. a node selector in the form label=value that selects the nodes that are Poplar servers associated with this V-IPU Controller. In the example “pod005:8090:vipu-ctrl=pod005”, pod005 is the DNS name of the V-IPU Controller, 8090 is the listening port, and vipu-ctrl=pod005 is a label set on the nodes running on Poplar servers associated with this V-IPU Controller.

“”

admissionWebhooks.failurePolicy

The admission webhooks failure policy.

“Fail”

admissionWebhooks.patch.image.pullPolicy

The admission webhooks patch image pull policy.

“IfNotPresent”

admissionWebhooks.patch.image.repository

The admission webhooks patch image repository.

“k8s.gcr.io/ingress-nginx/kube-webhook-certgen”

admissionWebhooks.patch.image.tag

The admission webhooks patch image tag.

“v1.1.1”

admissionWebhooks.patch.nodeSelector

The Kubernetes node selector for the admission webhooks patch jobs.

{}

admissionWebhooks.patch.podAnnotations

The Pod annotations for the admission webhooks patch jobs.

{}

admissionWebhooks.patch.priorityClassName

The name of a priority class to use with the admission webhook patching job.

“”

admissionWebhooks.patch.runAsUser

The user ID to run the admission webhooks patch jobs as.

2000

admissionWebhooks.patch.tolerations

The Kubernetes tolerations for the admission webhooks patch jobs. See Taints and Tolerations on the Kubernetes website.

[]

admissionWebhooks.port

The port at which the admission webhook server is exposed in the Controller container.

9443

admissionWebhooks.service.annotations

The admission webhooks service annotations.

{}

admissionWebhooks.service.servicePort

The admission webhooks service port.

443

admissionWebhooks.service.type

The admission webhooks service type.

“ClusterIP”

admissionWebhooks.timeoutSeconds

The admission webhooks timeout in seconds.

30

controller.affinity

Controller Kubernetes affinity. See Pod Affinity on the Kubernetes website.

{}

controller.develLogs

Specifies whether to enable (true) or disable (false) development logging mode.

true

controller.image.pullPolicy

The Controller image pull policy

“Always”

controller.image.repository

The Controller image repository

“graphcore/ipu-operator-controller”

controller.image.tag

Overrides the Controller image tag, whose default is the chart appVersion.

“”

controller.nodeSelector

Controller Kubernetes node selector.

{}

controller.podAnnotations

Controller Pod annotations.

{}

controller.podSecurityContext

Controller Pod security context.

{“runAsUser”:65532}

controller.rbac.create

Specifies whether to create an RBAC ClusterRole and ClusterRoleBinding and attach them to the service account.

true

controller.resources.limits.cpu

The maximum CPU time for the Controller, in Kubernetes CPU units.

“500m”

controller.resources.limits.memory

The maximum memory for the Controller.

“512Mi”

controller.resources.requests.cpu

The requested CPU for the Controller, in Kubernetes CPU units.

“100m”

controller.resources.requests.memory

The requested memory for the Controller.

“200Mi”

controller.securityContext

Controller security context.

{}

controller.service.port

The port for the Controller service, used to set up kube-rbac-proxy to protect the metrics endpoint.

8443

controller.service.type

The Kubernetes service type for the Controller.

“ClusterIP”

controller.serviceAccount.annotations

Annotations to add to the service account.

{}

controller.serviceAccount.create

Specifies whether a service account should be created.

true

controller.serviceAccount.name

The name of the service account to use. If not set and create is true, a name is generated using the fullname template.

“”

controller.tolerations

Controller Kubernetes tolerations. See Taints and Tolerations on the Kubernetes website.

[]

vipuProxy.affinity

The V-IPU proxy Kubernetes affinity. See Pod Affinity on the Kubernetes website.

{}

vipuProxy.image.pullPolicy

The V-IPU proxy image pull policy.

“Always”

vipuProxy.image.repository

The V-IPU proxy image repository.

“graphcore/ipu-operator-vipu-proxy”

vipuProxy.image.tag

Overrides the V-IPU proxy image tag, whose default is the chart appVersion.

“”

vipuProxy.logLevel

V-IPU proxy log level (minimum 1, maximum 6).

2

vipuProxy.nodeSelector

The V-IPU proxy Kubernetes node selector.

{}

vipuProxy.podAnnotations

The V-IPU proxy Pod annotations.

{}

vipuProxy.podSecurityContext

The V-IPU proxy Pod security context.

{}

vipuProxy.proxyIdleTimeoutSeconds

V-IPU proxy idle timeout in seconds.

60

vipuProxy.proxyPartitionTrackerConfigMap

Name of the ConfigMap used by the V-IPU proxy for partition tracking.

“ipu-partitions-tracker”

vipuProxy.proxyPort

V-IPU proxy port.

8080

vipuProxy.proxyReadTimeoutSeconds

V-IPU proxy read timeout in seconds.

30

vipuProxy.proxyWriteTimeoutSeconds

V-IPU proxy write timeout in seconds.

300

vipuProxy.rbac.create

Specifies whether to create an RBAC ClusterRole and ClusterRoleBinding and attach them to the service account for the V-IPU proxy.

true

vipuProxy.resources

The Kubernetes resource limits and requests for the V-IPU proxy.

{}

vipuProxy.securityContext

The V-IPU proxy security context.

{}

vipuProxy.service.port

The Kubernetes service port for V-IPU proxy.

80

vipuProxy.service.type

The Kubernetes service type for V-IPU proxy.

“ClusterIP”

vipuProxy.serviceAccount.annotations

Annotations to add to the service account for V-IPU proxy.

{}

vipuProxy.serviceAccount.create

Specifies whether a service account should be created for V-IPU proxy.

true

vipuProxy.serviceAccount.name

The name of the service account to use for V-IPU proxy. If not set and create is true, a name is generated using the fullname template.

“”

vipuProxy.tolerations

The V-IPU proxy Kubernetes tolerations. See Taints and Tolerations on the Kubernetes website.

[]

worker.hostNetwork

If true, host networking is enabled in the worker Pod where the workload runs; otherwise host networking is disabled.

true
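
The global.vipuControllers format described in the table (comma-separated definitions, each of the form address:port:label=value) can be illustrated with a short parser. This is a minimal sketch, not the operator's own code: the names `VipuController` and `parse_vipu_controllers` are hypothetical, and the simple colon split assumes the address itself contains no colons (so it would not handle a bare IPv6 address).

```python
from typing import NamedTuple


class VipuController(NamedTuple):
    host: str            # IP address or DNS name of the V-IPU Controller
    port: int            # listening port (most often 8090)
    node_selector: dict  # single label=value pair selecting Poplar server nodes


def parse_vipu_controllers(spec):
    """Parse a comma-separated list of host:port:label=value definitions."""
    controllers = []
    for entry in (part.strip() for part in spec.split(",")):
        if not entry:
            continue  # tolerate an empty string or a trailing comma
        fields = entry.split(":")
        if len(fields) != 3:
            raise ValueError(f"expected host:port:label=value, got {entry!r}")
        host, port, selector = fields
        label, sep, value = selector.partition("=")
        if not sep or not label:
            raise ValueError(f"expected label=value selector, got {selector!r}")
        controllers.append(VipuController(host, int(port), {label: value}))
    return controllers


# The first definition is the example from the table; the second is hypothetical.
parsed = parse_vipu_controllers(
    "pod005:8090:vipu-ctrl=pod005,pod006:8090:vipu-ctrl=pod006"
)
for controller in parsed:
    print(controller.host, controller.port, controller.node_selector)
```

Validating the string this way before passing it to Helm can catch a missing port or selector early, since the default value is an empty string and a malformed entry may otherwise only surface at run time.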