Pre-Training and Fine-Tuning BERT for the IPU

5. Trademarks & copyright

Graphcloud®, Graphcore®, Poplar® and PopVision® are registered trademarks of Graphcore Ltd.

Bow™, Bow-2000™, Bow Pod™, Colossus™, In-Processor-Memory™, IPU-Core™, IPU-Exchange™, IPU-Fabric™, IPU-Link™, IPU-M2000™, IPU-Machine™, IPU-POD™, IPU-Tile™, PopART™, PopDist™, PopLibs™, PopRun™, PopTorch™, Streaming Memory™ and Virtual-IPU™ are trademarks of Graphcore Ltd.

All other trademarks are the property of their respective owners.

Copyright © 2016-2020 Graphcore Ltd. All rights reserved.

Revision e22161b1.