3. Container installation

3.1. Install the Docker image

How you install the Docker image depends on whether you have it saved locally (offline installation) or will pull it from Docker Hub (online installation):

  • Offline installation:

    $ docker load -i <docker_image_save_path>
    
  • Online installation:

    $ docker pull <docker_image_link>
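
The choice between the two methods can be scripted. The following is a minimal sketch, not part of the product: the function name install_cmd and the archive file name in the usage comment are hypothetical, and the image reference is the one used in the DaemonSet in Section 3.2. The function only prints the command it would run, so you can review it first.

```shell
# Sketch: choose the install command depending on whether a saved image
# archive exists. install_cmd is a hypothetical helper name.
install_cmd() {
    image_tar="$1"   # path to a saved image archive (offline case), may be empty
    image_ref="$2"   # image reference on Docker Hub (online case)
    if [ -n "$image_tar" ] && [ -f "$image_tar" ]; then
        echo "docker load -i $image_tar"
    else
        echo "docker pull $image_ref"
    fi
}

# Example (the archive name is an assumption):
#   install_cmd ./ipu-k8s-device-plugin.tar graphcorecn/ipu-k8s-device-plugin:latest
```

Printing the command instead of running it keeps the sketch safe to try on a machine without Docker.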
    

3.2. Create a DaemonSet

Specify the DaemonSet configuration in a YAML file. The following example, ds.yaml, runs the IPU device plugin in the kube-system namespace:

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: ipu-device-plugin-daemonset
  namespace: kube-system
spec:
  selector:
    matchLabels:
      name: ipu-dp-ds
  template:
    metadata:
      labels:
        name: ipu-dp-ds
    spec:
      hostNetwork: true
      containers:
      - image: graphcorecn/ipu-k8s-device-plugin:latest
        name: ipu-k8s-device-plugin
        securityContext:
          privileged: true
        volumeMounts:
        - name: dp
          mountPath: /var/lib/kubelet/device-plugins
        - name: sys
          mountPath: /sys
        - name: hostvolume
          mountPath: /etc/ipuof.conf.d
      volumes:
      - name: dp
        hostPath:
          path: /var/lib/kubelet/device-plugins
      - name: sys
        hostPath:
          path: /sys
      - name: hostvolume
        hostPath:
          path: /etc/ipuof.conf.d

ds.yaml

3.3. Deploy the IPU device plugin

Deploy the IPU device plugin with:

$ kubectl apply -f ds.yaml

Use the following command to check whether the deployment succeeded. The output should list the new DaemonSet:

$ kubectl get ds -n kube-system

NAMESPACE     NAME                          DESIRED  CURRENT  READY  UP-TO-DATE  AVAILABLE  NODE SELECTOR  AGE
kube-system   ipu-device-plugin-daemonset   1        1        1      1           1          <none>
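
The check above can also be automated by comparing the READY and DESIRED columns. This is a minimal sketch under stated assumptions: ds_ready is a hypothetical helper name, and the function finds the columns by their header names, so it works whether or not a NAMESPACE column is present.

```shell
# Sketch: read `kubectl get ds` output on stdin and succeed only if the
# named DaemonSet reports READY equal to DESIRED.
# ds_ready is a hypothetical helper name.
ds_ready() {
    awk -v ds="$1" '
        NR == 1 {
            # Header row: remember the positions of the columns we need.
            for (i = 1; i <= NF; i++) {
                if ($i == "NAME")    name  = i
                if ($i == "DESIRED") want  = i
                if ($i == "READY")   ready = i
            }
            next
        }
        $name == ds { found = 1; ok = ($ready == $want) }
        END { exit !(found && ok) }
    '
}

# Example:
#   kubectl get ds -n kube-system | ds_ready ipu-device-plugin-daemonset \
#       && echo "device plugin is ready"
```

The function exits with a non-zero status if the DaemonSet is missing from the output or if any of its pods are not ready.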

Use the following command to display the device plugin log. The expected output is shown after the command:

$ kubectl logs -f <ipu-device-plugin-daemonset-pod> -n kube-system

I1013 05:02:08.925503       1 main.go:25] Plugin version: dev
I1013 05:02:08.925598       1 main.go:30] Starting FS watcher.
I1013 05:02:08.925685       1 main.go:37] Starting OS watcher.
E1013 05:02:08.926158       1 utils.go:64] stat /etc/vipu/vipu-cli.hcl: no such file or directory
E1013 05:02:08.926237       1 vipuclient.go:33] error creating VIPU client: V-IPU configuration file not found: /etc/vipu/vipu-cli.hcl
W1013 05:02:08.926262       1 devicemanager.go:94] vipu client cannot be created: V-IPU configuration file not found: /etc/vipu/vipu-cli.hcl
W1013 05:02:08.956855       1 storage.go:45] unable to read existing storage file, a new empty one will be created /etc/ipuof.conf.d/storage/storage.json: open /etc/ipuof.conf.d/storage/storage.json: no such file or directory
I1013 05:02:08.957688       1 server.go:94] Starting GRPC server for 'c600.graphcore.ai/ipu'
I1013 05:02:08.958722       1 server.go:68] Started to serve 'c600.graphcore.ai/ipu' on /var/lib/kubelet/device-plugins/ipu.sock
I1013 05:02:08.965747       1 server.go:75] Registered device plugin for 'c600.graphcore.ai/ipu' with Kubelet
I1013 05:02:08.966352       1 server.go:171] Inside list and watch
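
The pod name needed by the log command can be looked up from the pod label name=ipu-dp-ds, which is set in the DaemonSet template in Section 3.2. The following is a minimal sketch: plugin_pods and plugin_logs are hypothetical helper names, and the sketch assumes the DaemonSet was deployed unchanged in the kube-system namespace.

```shell
# Sketch: list the device plugin pods by the label from the DaemonSet
# pod template. plugin_pods and plugin_logs are hypothetical helper names.
plugin_pods() {
    kubectl get pods -n kube-system -l name=ipu-dp-ds -o name
}

# Print the full log of every matching pod (one pod per node).
plugin_logs() {
    for pod in $(plugin_pods); do
        kubectl logs -n kube-system "$pod"
    done
}

# Example:
#   plugin_pods      # prints names of the form pod/<pod-name>
#   plugin_logs
```

With -o name, kubectl prints each pod as pod/<pod-name>, a form that kubectl logs accepts directly.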

Run the following command to confirm that the node advertises the IPU resource:

$ kubectl describe nodes

The output lists a new device type, c600.graphcore.ai/ipu, under Capacity and Allocatable:

Capacity:
   cpu: 48
   ephemeral-storage: 28703652Ki
   c600.graphcore.ai/ipu: 8
   hugepages-1Gi: 0
   hugepages-2Mi: 0
   memory: 758240352Ki
   pods: 110
Allocatable:
   cpu: 48
   ephemeral-storage: 26453285640
   c600.graphcore.ai/ipu: 8
   hugepages-1Gi: 0
   hugepages-2Mi: 0
   memory: 758137952Ki
   pods: 110

...

Allocated resources:
   (Total limits may be over 100 percent, i.e., overcommitted.)
   Resource               Requests    Limits
   --------               --------    ------
   cpu                    1 (2%)      0 (0%)
   memory                 140Mi (0%)  340Mi (0%)
   ephemeral-storage      0 (0%)      0 (0%)
   hugepages-1Gi          0 (0%)      0 (0%)
   hugepages-2Mi          0 (0%)      0 (0%)
   c600.graphcore.ai/ipu  0           0
Events: <none>
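
To pick the IPU count out of this output without reading it by eye, the Allocatable section can be filtered. This is a minimal sketch: allocatable_ipus is a hypothetical helper name, and the function assumes the layout shown above, where section names start in the first column and their entries are indented.

```shell
# Sketch: read `kubectl describe nodes` output on stdin and print the
# c600.graphcore.ai/ipu value from each "Allocatable:" section.
# allocatable_ipus is a hypothetical helper name.
allocatable_ipus() {
    awk '
        /^Allocatable:/ { in_alloc = 1; next }   # entering the section
        /^[^ \t]/       { in_alloc = 0 }         # any other top-level key ends it
        in_alloc && $1 == "c600.graphcore.ai/ipu:" { print $2 }
    '
}

# Example:
#   kubectl describe nodes | allocatable_ipus
```

The function prints one number per node that reports the resource, and prints nothing if the device plugin has not registered with the kubelet.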