
DilatedConvBlock: When Convolutions Learn to Breathe


Convolutional Neural Networks (CNNs) have been the backbone of computer vision for over a decade. Standard convolutions, however, struggle with one major limitation: the receptive field grows slowly with depth, so each output pixel only "sees" a small window of the input.


The Receptive Field Problem

To increase the receptive field in a standard CNN, you typically need to:

  • Increase the kernel size (computationally expensive)
  • Add pooling layers (loses spatial resolution)
  • Stack more layers (risks vanishing gradients)
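To make the third option concrete, here is a quick back-of-the-envelope sketch (plain Python; the helper name `receptive_field` is my own) of how slowly the receptive field grows when you simply stack stride-1 3x3 layers:

```python
def receptive_field(kernel_size, dilations):
    """Receptive field of a stack of stride-1 conv layers.

    Each layer adds (kernel_size - 1) * dilation pixels of context.
    """
    rf = 1
    for d in dilations:
        rf += (kernel_size - 1) * d
    return rf

# Ten stacked 3x3 layers, no dilation: only linear growth
print(receptive_field(3, [1] * 10))  # 21 pixels
```

Ten layers of computation buy you just 21 pixels of context; covering a 512-pixel input this way would take hundreds of layers.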

Enter Dilated Convolutions

Dilated convolutions (also called atrous convolutions) add a "dilation rate" parameter to the kernel. Rather than adding weights, the rate inserts gaps (the "holes") between the kernel's sampling positions: a 3x3 kernel with dilation rate 2 covers a 5x5 area while still using only 9 weights.
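A minimal 1-D sketch (pure Python; `dilated_conv1d` is a name I made up for illustration) makes the holes visible: the kernel taps are simply spaced `dilation` steps apart, so the same three weights span a wider stretch of the input:

```python
def dilated_conv1d(x, w, dilation=1):
    """Valid 1-D convolution with gaps of `dilation` between kernel taps."""
    k = len(w)
    span = (k - 1) * dilation + 1  # effective kernel footprint
    return [
        sum(w[j] * x[i + j * dilation] for j in range(k))
        for i in range(len(x) - span + 1)
    ]

x = [1, 2, 3, 4, 5, 6]
print(dilated_conv1d(x, [1, 1, 1], dilation=1))  # [6, 9, 12, 15]
print(dilated_conv1d(x, [1, 1, 1], dilation=2))  # [9, 12]
```

With dilation 2 the same 3-tap kernel reads positions {i, i+2, i+4}, a 5-wide footprint, without any extra weights.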

Key Benefit: by doubling the dilation rate from layer to layer, the receptive field expands exponentially with depth, with no loss of resolution or coverage.

This technique is pivotal in semantic segmentation tasks (like DeepLab) and audio generation (WaveNet), where context is king.
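To see why the doubling schedule (the one WaveNet uses) gives exponential growth, compare a ten-layer stack with and without dilation. This is a rough sketch, with `receptive_field` being my own helper:

```python
def receptive_field(kernel_size, dilations):
    # Each stride-1 layer widens the field by (kernel_size - 1) * dilation
    return 1 + sum((kernel_size - 1) * d for d in dilations)

# WaveNet-style schedule: dilation doubles every layer (1, 2, 4, ..., 512)
wavenet = [2 ** i for i in range(10)]
print(receptive_field(2, wavenet))   # 1024 samples from just 10 layers
print(receptive_field(2, [1] * 10))  # 11 samples without dilation
```

Ten dilated layers see roughly a hundred times more context than ten standard ones, which is exactly what long-range audio and dense segmentation need.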
