These instructions describe how to replace a PSU in the IPU-Machine.
Note
We use the term power supply unit (PSU) to refer to the power supply in the IPU-Machine and power distribution unit (PDU) to refer to the power supply in the Pod.
There are two PSUs in the IPU-Machine as shown in Fig. 4.1. On each PSU, from the left, there is a power socket, a handle and a release tab.
Note
Depending on the vendor, the release tab can be green or orange. Ensure that the replacement PSU is from the same supplier and has the same part number.
Use the BMC command ipum-utils inventory_list to view the status of the PSUs. When the output for a PSU shows true in the Present column and false in the Functional column, this can be due to a problem with the AC cable, the PDU or the PSU itself.
Start by checking that the AC cable between the PSU and PDU is plugged securely into the PSU. Run the BMC command ipum-utils inventory_list and check the output.
If the output is still as described in Section 4.2, PSU maintenance options, check that the AC cable between the PSU and PDU is plugged securely into the PDU. Run the BMC command ipum-utils inventory_list and check the output.
If the output is still as described in Section 4.2, PSU maintenance options, replace the AC cable and ensure that it is plugged securely into both the PSU and the PDU. Run the BMC command ipum-utils inventory_list and check the output.
If you have confirmed that there are no issues with the AC cable or the PDU, then the fault is most likely in the PSU, which needs to be replaced.
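The escalation above can be sketched as a small decision loop. This is only an illustration: the PsuStatus record, the read_status and perform_check callbacks and the returned strings are hypothetical names introduced here, and the sketch assumes the inventory output has already been turned into Present and Functional booleans by some other means.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class PsuStatus:
    """One PSU entry from the inventory listing (hypothetical, already parsed)."""
    name: str
    present: bool
    functional: bool


def needs_attention(status: PsuStatus) -> bool:
    # The fault pattern described above: Present is true, Functional is false.
    return status.present and not status.functional


# The physical checks, in the order the text gives them.
CHECKS = [
    "Check that the AC cable is plugged securely into the PSU",
    "Check that the AC cable is plugged securely into the PDU",
    "Replace the AC cable and plug it securely into the PSU and the PDU",
]


def troubleshoot(read_status: Callable[[], PsuStatus],
                 perform_check: Callable[[str], None]) -> str:
    """Walk the checks in order and re-read the status after each one."""
    if not needs_attention(read_status()):
        return "status does not match the fault pattern"
    for check in CHECKS:
        perform_check(check)  # the operator carries out the physical step
        if not needs_attention(read_status()):
            return f"resolved by: {check}"
    # Cable and PDU have been ruled out, so the PSU itself is the likely fault.
    return "replace the PSU"
```

In practice read_status would be fed by a fresh run of the inventory command after each step, and perform_check would simply prompt the person at the rack.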
When replacing a power supply, use the same model. If you do not, the gateway device will not power up (this is a means of performance protection).
Identify the faulty PSU at the rear of the IPU-Machine based on the output of the ipum-utils inventory_list command.
Disconnect the power cable from the faulty power supply.
Note that the IPU-Machine itself and the Pod it is in can be left powered on, but should not be running any ML jobs.
Remove the faulty power supply.
Hold the handle on the power supply and at the same time press the release tab towards the handle. Keep the release tab pressed and pull the power supply out of the chassis.
Install the replacement power supply.
Place the replacement power supply into the empty power supply slot and slide it in until you hear a click.
Connect the power cable.
Check that the replacement power supply is functional.
Running the ipum-utils inventory_list command should now show true in the Functional column for that power supply.
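If the final check is scripted, it might look like the sketch below. The layout of the listing is an assumption: the source only confirms that Present and Functional columns exist, so the sample text, the item names (psu0, psu1) and the whitespace-separated table with the item name in the first column are all made up for illustration. Adapt the parsing to the real output before relying on it.

```python
from typing import Dict

# Hypothetical listing: the real command output may be laid out differently.
SAMPLE = """
Item  Present  Functional
psu0  true     true
psu1  true     false
"""


def parse_inventory(text: str) -> Dict[str, Dict[str, bool]]:
    """Parse an assumed whitespace-separated table with a header row."""
    lines = [ln.split() for ln in text.strip().splitlines() if ln.strip()]
    header, rows = lines[0], lines[1:]
    # Locate the two columns by name rather than by fixed position.
    present_col = header.index("Present")
    functional_col = header.index("Functional")
    result = {}
    for fields in rows:
        result[fields[0]] = {
            "present": fields[present_col].lower() == "true",
            "functional": fields[functional_col].lower() == "true",
        }
    return result


def replacement_ok(text: str, psu_name: str) -> bool:
    """True only if the named PSU is listed as both present and functional."""
    row = parse_inventory(text).get(psu_name)
    return bool(row and row["present"] and row["functional"])
```

Looking the columns up by header name keeps the sketch tied to the two column names the procedure actually mentions, rather than to a guessed column order.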