Pre-Training and Fine-Tuning BERT for the IPU
This technical note describes the implementation of BERT-Large on Graphcore IPU-POD systems, using both TensorFlow and PyTorch. It is intended to help you understand some of the key optimization techniques for model development on the IPU.
1. Introduction
2. Pre-training BERT on the IPU-POD
3. Scaling BERT to the IPU-POD
4. Training results
5. Trademarks & copyright