PyTorch internals
Buffers are tensors stored on a module that are not considered parameters but are part of the module’s persistent state. They are typically used for values that should not be updated by the optimizer during training, such as the running statistics in batch normalization layers.
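As a minimal sketch of the idea, the hypothetical RunningMean module below (not part of PyTorch) registers a buffer with register_buffer and updates it in place during training. The class name, the momentum value, and the centering behavior are assumptions chosen for illustration.

```python
import torch
from torch import nn


class RunningMean(nn.Module):
    """Hypothetical layer that centers inputs with a running per-feature mean."""

    def __init__(self, num_features: int, momentum: float = 0.1):
        super().__init__()
        self.momentum = momentum
        # A buffer is saved in state_dict() and moved by .to()/.cuda(),
        # but it is not returned by parameters(), so no optimizer updates it.
        self.register_buffer("running_mean", torch.zeros(num_features))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        if self.training:
            with torch.no_grad():
                batch_mean = x.mean(dim=0)
                # In-place exponential moving average keeps the same buffer object.
                self.running_mean.mul_(1 - self.momentum).add_(
                    batch_mean, alpha=self.momentum
                )
        return x - self.running_mean


layer = RunningMean(3)
layer(torch.ones(4, 3))

print(list(layer.state_dict().keys()))  # the buffer appears in the state dict
print(list(layer.parameters()))         # but there are no parameters
```

The buffer is changed by the forward pass itself rather than by gradient descent, which is the same pattern batch normalization uses for its running statistics.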
Hooks are functions that can be registered on a module to run during its forward or backward pass. They let users inspect or modify a module’s inputs, outputs, or gradients, or perform side tasks such as logging.
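A short sketch of both kinds of hook, assuming a small toy nn.Sequential model: a forward hook records each layer's output shape, and a full backward hook records the norm of the gradient arriving at the last layer. The helper names make_shape_logger and record_grad_norm are made up for this example.

```python
import torch
from torch import nn

model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))

activation_shapes = {}
grad_norms = []


def make_shape_logger(name):
    # Forward hooks receive (module, inputs, output).
    # Returning None leaves the output unchanged.
    def hook(module, inputs, output):
        activation_shapes[name] = tuple(output.shape)

    return hook


def record_grad_norm(module, grad_input, grad_output):
    # Full backward hooks receive the gradients w.r.t. the module's
    # inputs and outputs as tuples.
    grad_norms.append(grad_output[0].norm().item())


handles = [
    child.register_forward_hook(make_shape_logger(name))
    for name, child in model.named_children()
]
handles.append(model[2].register_full_backward_hook(record_grad_norm))

model(torch.randn(5, 4)).sum().backward()

# Each register_* call returns a handle; remove hooks once they are not needed.
for handle in handles:
    handle.remove()

print(activation_shapes)
print(grad_norms)
```

Keeping the returned handles and calling remove() avoids leaving logging hooks attached after debugging.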
Best practices
- Wrap a network’s learnable parameters in torch.nn.Parameter instead of manually tracking them with .requires_grad_. This registers them as parameters of the module and ensures they are included in calls to .parameters() and .state_dict().
- PyTorch offers torch.nn (OOP) for building neural networks and torch.nn.functional for stateless functions. Blend the two paradigms: leverage the modularity and state management of OOP for your model architecture, while using functional programming for the stateless operations within your model’s forward pass.
- Use the autograd profiler to profile the performance of the operators used in your model.
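The three practices above can be sketched together, under some assumptions: ScaledMLP is a hypothetical toy model, its layer sizes are arbitrary, and the profiler is used through torch.autograd.profiler.profile on CPU only.

```python
import torch
import torch.nn.functional as F
from torch import nn
from torch.autograd import profiler


class ScaledMLP(nn.Module):
    """Hypothetical model: stateful layers from torch.nn, stateless ops from F."""

    def __init__(self, in_features: int, hidden: int, out_features: int):
        super().__init__()
        self.fc1 = nn.Linear(in_features, hidden)
        self.fc2 = nn.Linear(hidden, out_features)
        # Assigning an nn.Parameter as an attribute registers it with the module.
        self.scale = nn.Parameter(torch.ones(out_features))
        # A plain tensor with requires_grad_() is NOT registered: it is missing
        # from parameters() and state_dict(), so an optimizer never sees it.
        self.untracked = torch.ones(out_features).requires_grad_()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # ReLU holds no state, so the functional form is enough here.
        x = F.relu(self.fc1(x))
        return self.fc2(x) * self.scale


model = ScaledMLP(4, 8, 2)
param_names = [name for name, _ in model.named_parameters()]
print(param_names)  # includes "scale", does not include "untracked"

# Profile one forward and backward pass to see per-operator CPU time.
with profiler.profile() as prof:
    model(torch.randn(16, 4)).sum().backward()

summary = prof.key_averages().table(sort_by="cpu_time_total", row_limit=5)
print(summary)
```

The printed table lists the most expensive operators first, which is usually enough to spot an unexpectedly slow op before reaching for a heavier profiling tool.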