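The Xception architecture takes its name from "Extreme Inception": it pushes the idea
behind the Inception module to its extreme by replacing Inception modules with
depthwise separable convolutions.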
In a regular convolution, a single kernel captures spatial and cross-channel
correlations together.
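The Inception module partially decouples the mapping of spatial and cross-channel
correlations: 1 x 1 convolution layers first map cross-channel correlations (reducing
or increasing the number of channels), and larger convolutions such as 3 x 3 then
operate on the resulting channels.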
The Xception block completely decouples the two operations: a depthwise convolution
layer captures spatial features within each channel, and the 1 x 1 (pointwise)
convolution that follows captures cross-channel features.
Fig: 1
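
The depthwise-then-pointwise decoupling described above can be illustrated with a
small sketch. This is a minimal pure-Python illustration, not the Xception
implementation itself: it assumes stride 1, no padding, no bias and no activation,
and the function names (depthwise_conv, pointwise_conv, separable_conv,
regular_conv_params, separable_conv_params) are hypothetical names chosen for this
example.

```python
# Sketch of a depthwise separable convolution on nested lists.
# Assumptions: stride 1, no padding, no bias, no non-linearity.


def depthwise_conv(x, dw_kernels):
    """Apply one k x k filter per input channel (spatial correlations only).

    x: list of C channels, each an H x W list of lists.
    dw_kernels: list of C kernels, each a k x k list of lists.
    """
    out = []
    for channel, kernel in zip(x, dw_kernels):
        k = len(kernel)
        h_out = len(channel) - k + 1
        w_out = len(channel[0]) - k + 1
        out.append([
            [
                sum(channel[i + a][j + b] * kernel[a][b]
                    for a in range(k) for b in range(k))
                for j in range(w_out)
            ]
            for i in range(h_out)
        ])
    return out


def pointwise_conv(x, pw_weights):
    """Apply a 1 x 1 convolution (cross-channel correlations only).

    pw_weights: list of C_out rows, each holding C_in weights.
    """
    h, w = len(x[0]), len(x[0][0])
    return [
        [
            [sum(wgt * x[c][i][j] for c, wgt in enumerate(row))
             for j in range(w)]
            for i in range(h)
        ]
        for row in pw_weights
    ]


def separable_conv(x, dw_kernels, pw_weights):
    """Depthwise step followed by the pointwise step."""
    return pointwise_conv(depthwise_conv(x, dw_kernels), pw_weights)


def regular_conv_params(c_in, c_out, k):
    """Weights in a regular k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def separable_conv_params(c_in, c_out, k):
    """Weights in a depthwise k x k plus a pointwise 1 x 1 convolution."""
    return k * k * c_in + c_in * c_out


print(regular_conv_params(64, 128, 3))    # 73728
print(separable_conv_params(64, 128, 3))  # 8768
```

For an illustrative layer with 64 input channels, 128 output channels and 3 x 3
kernels, the separable form needs 8768 weights against 73728 for the regular
convolution, because the spatial filters no longer have to be repeated for every
input-output channel pair.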