The ResNet architecture, like the VGG architecture, demonstrated that adding more layers helps a network capture more complex features and improves model performance.
Residual units add an identity skip connection to the output of a stack of layers, so the stack only has to learn the residual part. This makes the network easier to optimize and makes it possible to build very deep models with low complexity and high accuracy.
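To make the identity skip connection concrete, here is a minimal sketch of a basic residual unit, assuming PyTorch; the two 3×3 convolutions and fixed channel count are illustrative choices, not the paper's exact configuration:

```python
import torch
import torch.nn as nn

class BasicBlock(nn.Module):
    """Two 3x3 convolutions plus an identity skip connection:
    output = F(x) + x, so the stacked layers only learn the residual F(x)."""
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        identity = x                              # skip connection carries x unchanged
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        out = out + identity                      # add the identity back: F(x) + x
        return self.relu(out)
```

Because the skip path is the identity, the block can fall back to passing its input through unchanged simply by driving the residual F(x) toward zero, which is what makes very deep stacks of these units easy to optimize.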
Unlike the ResNet variants with fewer layers, such as the 18- and 34-layer models, the deeper variants (50 layers and above) use a bottleneck design, which lets them go deeper by using 1×1 convolution layers to first reduce and then restore the channel dimensions.
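A minimal sketch of that bottleneck unit, again assuming PyTorch with illustrative channel counts: the first 1×1 convolution reduces the channel dimension so the 3×3 convolution operates on a smaller tensor, and the final 1×1 convolution restores it before the skip connection is added:

```python
import torch
import torch.nn as nn

class Bottleneck(nn.Module):
    """1x1 reduce -> 3x3 -> 1x1 restore, with an identity skip connection."""
    def __init__(self, channels, reduced):
        super().__init__()
        self.reduce = nn.Conv2d(channels, reduced, kernel_size=1, bias=False)    # 1x1: shrink channels
        self.conv3x3 = nn.Conv2d(reduced, reduced, kernel_size=3, padding=1, bias=False)
        self.expand = nn.Conv2d(reduced, channels, kernel_size=1, bias=False)    # 1x1: restore channels
        self.bn1 = nn.BatchNorm2d(reduced)
        self.bn2 = nn.BatchNorm2d(reduced)
        self.bn3 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        out = self.relu(self.bn1(self.reduce(x)))
        out = self.relu(self.bn2(self.conv3x3(out)))
        out = self.bn3(self.expand(out))
        return self.relu(out + x)                 # identity skip, as in the basic unit
```

The design choice is a compute trade-off: shrinking the channels before the expensive 3×3 convolution keeps the per-block cost low enough that many more blocks can be stacked.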