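4. Creating a Pod and using IPUs
IPUs can be consumed through whichever Kubernetes resource type you need, such as a Deployment or a Pod.
Prepare test.yaml using one of the two examples below:
Kubernetes Pod example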
apiVersion: v1
kind: Pod
metadata:
  name: ipu-test-1
spec:
  containers:
  - name: demo-ipu-test
    image: graphcore/pytorch:latest
    command: ["/bin/bash", "-c", "--"]
    args: ["sleep infinity & wait"]
    resources:
      limits:
        c600.graphcore.ai/ipu: "1"  # Number of IPUs allocated to the Pod
pod-test.yaml (rename to test.yaml)
Deployment example
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ipu-test
  namespace: default
  labels:
    app: app-test
spec:
  replicas: 1
  selector:
    matchLabels:
      app: app-test
  template:
    metadata:
      labels:
        app: app-test
    spec:
      containers:
      - name: demo-ipu-test
        image: graphcore/pytorch:latest
        command: ["/bin/bash", "-c", "--"]
        args: ["sleep infinity & wait"]
        resources:
          limits:
            c600.graphcore.ai/ipu: "1"
deployment-test.yaml (rename to test.yaml)
Deployment features such as replica scaling and rollback remain supported.
Run the following command to create the Pod or Deployment:
$ kubectl apply -f test.yaml
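The replica scaling and rollback support mentioned above can be exercised with standard kubectl commands once the Deployment variant of test.yaml has been applied. The following is a minimal sketch rather than a prescribed procedure: the function names are hypothetical, and it assumes the Deployment name ipu-test and the namespace default from the example manifest.

```shell
# Sketch only: assumes the Deployment example (name "ipu-test",
# namespace "default") was the manifest applied with kubectl.

# Hypothetical helper: scale the Deployment to the given replica count.
# Each replica requests one IPU, so the cluster needs at least that
# many unallocated IPUs for all replicas to be scheduled.
scale_ipu_deployment() {
  local replicas="$1"
  kubectl scale deployment ipu-test --namespace default --replicas="${replicas}"
}

# Hypothetical helper: roll the Deployment back to its previous revision.
rollback_ipu_deployment() {
  kubectl rollout undo deployment/ipu-test --namespace default
}

# Example usage against a live cluster:
#   scale_ipu_deployment 2
#   rollback_ipu_deployment
```

Replicas that cannot obtain an IPU are expected to stay in the Pending state until a device becomes free.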
Use the following command to check that the Pod is running. The expected output is:
$ kubectl get pod
NAME         READY   STATUS    RESTARTS   AGE
ipu-test-1   1/1     Running   0          4d
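Instead of polling kubectl get pod by hand, a script can block until the Pod reports the Ready condition. This is a small sketch under stated assumptions: the helper name wait_for_ipu_pod and the 120s default timeout are illustrative choices, and the default Pod name ipu-test-1 comes from the Pod example.

```shell
# Hypothetical helper: wait until the Pod is Ready, or fail on timeout.
# Arguments (both optional): Pod name, timeout accepted by kubectl wait.
wait_for_ipu_pod() {
  local pod="${1:-ipu-test-1}"
  local timeout="${2:-120s}"
  kubectl wait --for=condition=Ready "pod/${pod}" --timeout="${timeout}"
}

# Example usage against a live cluster:
#   wait_for_ipu_pod             # waits for ipu-test-1
#   wait_for_ipu_pod my-pod 5m   # custom Pod name and timeout
```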
Run the following command again:
$ kubectl describe nodes
The number of allocated IPUs has changed from 0 to 1:
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource               Requests    Limits
  --------               --------    ------
  cpu                    1 (2%)      0 (0%)
  memory                 140Mi (0%)  340Mi (0%)
  ephemeral-storage      0 (0%)      0 (0%)
  hugepages-1Gi          0 (0%)      0 (0%)
  hugepages-2Mi          0 (0%)      0 (0%)
  c600.graphcore.ai/ipu  1           1
...
This shows that the Kubernetes cluster can now schedule and use IPU devices through the Kubernetes IPU device plugin.
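On a cluster with several nodes, the kubectl describe nodes output is long, so it can help to print only the allocated IPU count. The sketch below is one possible filter, not part of the plugin: allocated_ipus is a hypothetical helper that assumes the "Allocated resources" rows have the three-column layout shown above (resource name, requests, limits).

```shell
# Hypothetical helper: read `kubectl describe nodes` output on stdin and
# print the Requests column of each c600.graphcore.ai/ipu row in the
# "Allocated resources" table (one line per node).
# The Capacity and Allocatable sections print the name with a trailing
# colon, so the exact match on the first field skips them.
allocated_ipus() {
  awk '$1 == "c600.graphcore.ai/ipu" && NF == 3 { print $2 }'
}

# Example usage against a live cluster:
#   kubectl describe nodes | allocated_ipus
```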