Introduction
In this section of the course, you will build a memory-efficient CNN known as SqueezeNet. Although small, it performs on par with much larger models such as AlexNet, the breakthrough deep CNN that won the 2012 ImageNet Challenge. We will train the model on the CIFAR-10 dataset, which contains 60,000 images across 10 classes.
A. Memory Usage
While we’re normally concerned with a model’s accuracy, the amount of memory it uses is important as well. After training a model, we store its computation graph and its parameters (weights + biases) for future use. Though a model’s computation graph is relatively small (even large models rarely have more than a couple of hundred layers), the number of parameters can easily run into the millions.
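As a rough illustration of what gets stored, here is a minimal sketch using tf.keras; the stand-in model, the file name, and the use of Keras itself are assumptions for demonstration, and the saving mechanism used in this course may differ.

```python
import os
import tensorflow as tf

# A tiny stand-in model; in practice this would be the trained network.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(128, activation='relu', input_shape=(784,)),
    tf.keras.layers.Dense(10),
])

# Saving writes both the computation graph and the parameters to disk.
model.save('saved_model.h5')

# The on-disk size is dominated by the parameters, not the graph structure.
size_mb = os.path.getsize('saved_model.h5') / 1e6
print('Parameters: {:,}'.format(model.count_params()))
print('File size: {:.2f} MB'.format(size_mb))
```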
Most high-performance models require hundreds of MB to store their parameters. The aforementioned AlexNet needs over 200MB to store its 60 million parameters. The SqueezeNet architecture, on the other hand, uses less than 1MB. That’s even less memory than the simple digit recognition model we built in the previous chapter (13MB), yet SqueezeNet performs significantly better than that model and actually matches AlexNet in accuracy.
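As a quick sanity check on these numbers, each parameter stored as a 32-bit float takes 4 bytes, so parameter storage is roughly 4 bytes times the parameter count. The sketch below only assumes 32-bit floats; the 60 million figure comes from the text above.

```python
def param_storage_mb(num_params, bytes_per_param=4):
    """Approximate storage for a model's parameters, assuming 32-bit floats."""
    return num_params * bytes_per_param / 1e6

# AlexNet: ~60 million parameters -> roughly 240 MB, consistent with "over 200MB".
print(param_storage_mb(60_000_000))
```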
When two models achieve roughly equal accuracy, we prefer the one that uses less memory.
B. Calculating Parameters
To understand model sizes, we need to be able to calculate the number of parameters in a model. Let’s take a look at an example convolution layer:
The convolution layer parameters are the kernel weights and biases. Each kernel has 3x3 dimensions, and there are 3 kernels for each of the 2 filters. Therefore, the total number of kernel weights is ...
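The sketch below works through the arithmetic for a layer with the dimensions described above (3x3 kernels, 3 input channels, 2 filters) and cross-checks it with a Keras Conv2D layer; the use of TensorFlow/Keras and the 32x32 input size are assumptions for illustration.

```python
import tensorflow as tf

# Example layer from above: 3x3 kernels, 3 input channels, 2 filters.
kernel_h, kernel_w, in_channels, num_filters = 3, 3, 3, 2

# Kernel weights: one 3x3 kernel per input channel, per filter.
num_weights = kernel_h * kernel_w * in_channels * num_filters
num_biases = num_filters  # one bias per filter
print(num_weights, num_biases, num_weights + num_biases)

# Cross-check with a Keras layer of the same shape.
layer = tf.keras.layers.Conv2D(filters=num_filters,
                               kernel_size=(kernel_h, kernel_w))
layer.build((None, 32, 32, in_channels))  # input height/width don't affect the count
print(layer.count_params())
```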