About this chapter

At this point in the course, we are comfortable with the structure of a neural network. All the networks that we’ve seen so far share the same architecture: a sequence of dense layers, where “dense” means that each node in a layer is connected to all the nodes in the neighboring layers. That blueprint is also called a fully connected neural network, and it comes with a drastic limitation. Whether it’s classifying images, parsing a product review, or predicting traffic congestion, a fully connected network treats all data the same way: as an undifferentiated sequence of bytes, with no regard for the structure of the data. That one-size-fits-all approach can only go so far, and it breaks down on complex datasets.
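To make the idea of a dense layer concrete, here is a minimal NumPy sketch of a single fully connected layer. The function name `dense_forward`, the 4x4 image size, and the three output nodes are assumptions chosen for illustration, not code from earlier chapters. The sketch shows that every input value gets its own weight for every output node, and that scrambling the pixels (with the weight rows reordered to match) leaves the output unchanged.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the run is repeatable

def dense_forward(x, weights, bias):
    """Forward pass of one dense layer (activation omitted):
    each output node is a weighted sum over every input value."""
    return x @ weights + bias

# Hypothetical sizes: a 4x4 grayscale "image" feeding 3 output nodes.
image = rng.random((4, 4))
x = image.reshape(-1)                    # flatten to 16 values; the 2D layout is gone
weights = rng.standard_normal((16, 3))   # one weight per (input, output) pair
bias = np.zeros(3)

original = dense_forward(x, weights, bias)

# Scramble the pixels, and reorder the weight rows in the same way.
perm = rng.permutation(x.size)
scrambled = dense_forward(x[perm], weights[perm], bias)

print(weights.shape)                     # (16, 3)
print(np.allclose(original, scrambled))  # True
```

Because the two outputs match, the layer has no built-in notion of which pixels are neighbors. Any order of the inputs is as good as any other, as long as the weights follow along.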

Deep learning is not just about making neural networks deeper; it’s also about making them smarter, and adopting different architectures to deal with different kinds of data. Deep learning researchers have come up with quite a few variations on the basic fully connected network, and more are probably being invented as we read this ...