Overview of the models used for CV in fastai

Computer vision model zoo

The fastai library includes several pretrained models from torchvision, namely:

  • resnet18, resnet34, resnet50, resnet101, resnet152
  • squeezenet1_0, squeezenet1_1
  • densenet121, densenet169, densenet201, densenet161
  • vgg16_bn, vgg19_bn
  • alexnet

On top of the models offered by torchvision, fastai has implementations for the following models:

  • Darknet architecture, which is the backbone of YOLOv3
  • Unet architecture based on a pretrained model. The original U-Net is described here; the model implementation is detailed in models.unet
  • Wide ResNet architectures, as introduced in this article

class Darknet

Darknet(num_blocks:Collection[int], num_classes:int, nf=32) :: PrePostInitMeta :: Module

https://github.com/pjreddie/darknet

Create a Darknet with blocks of sizes given in num_blocks, ending with num_classes and using nf initial features. Darknet53 uses num_blocks = [1,2,8,8,4].
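As a rough check on the name, the sketch below counts the weight layers implied by a given num_blocks list. It assumes a layout that this page does not spell out: one initial convolution, one strided convolution opening each group, two convolutions per residual block, and one final linear layer. The helper name count_weight_layers is hypothetical and is not part of fastai.

```python
def count_weight_layers(num_blocks):
    """Count convolution and linear layers under the assumed group layout."""
    initial_conv = 1                      # stem convolution producing nf features
    downsample_convs = len(num_blocks)    # assumption: one strided conv opens each group
    residual_convs = 2 * sum(num_blocks)  # assumption: two convs per residual block
    classifier = 1                        # final linear layer ending in num_classes
    return initial_conv + downsample_convs + residual_convs + classifier


darknet53_blocks = [1, 2, 8, 8, 4]
print(count_weight_layers(darknet53_blocks))  # 53
```

Under these assumptions the count for [1,2,8,8,4] comes to 53, which matches the Darknet53 name.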

class WideResNet

WideResNet(num_groups:int, N:int, num_classes:int, k:int=1, drop_p:float=0.0, start_nf:int=16, n_in_channels:int=3) :: PrePostInitMeta :: Module

Wide ResNet with num_groups groups and a width of k.

Each group contains N blocks. start_nf is the initial number of features. Dropout of drop_p is applied between the two convolutions in each block. The number of input channels is set by n_in_channels, which defaults to 3.

Structure: initial convolution -> num_groups x N blocks -> final layers of regularization and pooling

The first block of each group joins a path containing 2 convolutions with filter size 3x3 (and various regularizations) with another path containing a single convolution with a filter size of 1x1. All other blocks in each group follow the more traditional res_block style, i.e., the input of the path with two convs is added to the output of that path.

In the first group the stride is 1 for all convolutions. In all subsequent groups the stride in the first convolution of the first block is 2 and then all following convolutions have a stride of 1. Padding is always 1.
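The stride and shortcut rules above can be sketched in plain Python. This is an illustration of the description only, not the library's code: the helper names are hypothetical, the 32x32 input is an assumed example size, and the initial convolution is assumed to leave the spatial size unchanged.

```python
def wrn_block_plan(num_groups, N):
    """Return (group, block, first_conv_stride, second_conv_stride, shortcut) per block."""
    plan = []
    for group in range(num_groups):
        for block in range(N):
            first_block = block == 0
            # Only the first conv of the first block in groups after the first has stride 2.
            stride1 = 2 if (first_block and group > 0) else 1
            # The first block of each group uses a 1x1 conv path; the others add the input.
            shortcut = "conv1x1" if first_block else "identity"
            plan.append((group, block, stride1, 1, shortcut))
    return plan


def conv_out_size(size, stride, kernel=3, padding=1):
    """Spatial output size of a convolution with the given stride, kernel and padding."""
    return (size + 2 * padding - kernel) // stride + 1


def sizes_after_groups(input_size, num_groups, N):
    """Spatial size at the end of each group for a square input (assumed example)."""
    sizes, size = [], input_size
    for group, block, s1, s2, _ in wrn_block_plan(num_groups, N):
        size = conv_out_size(conv_out_size(size, s1), s2)
        if block == N - 1:
            sizes.append(size)
    return sizes


print(sizes_after_groups(32, 3, 3))  # [32, 16, 8]
```

With three groups, the spatial size is preserved in the first group and halved once at the start of each later group.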

wrn_22

wrn_22()

Wide ResNet with 22 layers.

This is a WideResNet with num_groups=3, N=3, k=6 and drop_p=0.0.
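One way to arrive at the figure of 22 is the depth convention commonly used for three-group Wide ResNets, depth = 6N + 4. The sketch below generalizes it as two convolutions per block plus four further layers. Both the extra_layers value and the helper name wrn_depth are assumptions made for illustration, not something this page or the library states.

```python
def wrn_depth(num_groups, N, convs_per_block=2, extra_layers=4):
    """Depth under the assumed convention: block convolutions plus four other layers."""
    return num_groups * N * convs_per_block + extra_layers


print(wrn_depth(num_groups=3, N=3))  # 22
```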