Deep Learning

Machine learning with many-layered neural networks that learn representations directly from raw data.

Think about how you read a story. Your eyes first catch letters, then group them into words, then words into sentences, and finally the whole plot. Deep learning builds understanding the same way: data flows through many layers, and each layer turns the simple patterns below it into something richer above. Nobody hand-writes the rules; the network discovers useful features by itself from huge piles of examples. The newest networks add a trick called attention: instead of treating every word equally, the model learns which words to focus on, lingering on the ones that carry the meaning and skimming the rest. That idea sits at the heart of today's chatbots and plays a key role in many image generators, letting a machine weigh what matters and combine it into answers that can feel surprisingly thoughtful.
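To make the idea of "focusing on some words more than others" concrete, here is a minimal sketch of scaled dot-product attention for a single query, written in plain Python. The vectors are made-up toy numbers, not learned values, and the function names (`softmax`, `attention`) are chosen for illustration only; real models learn these vectors and run the same arithmetic on many queries at once.

```python
import math

def softmax(scores):
    # Subtract the max before exponentiating for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(query, keys, values):
    """Scaled dot-product attention for one query vector."""
    d = len(query)
    # How similar is the query to each key? Scale by sqrt(d).
    scores = [
        sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
        for key in keys
    ]
    # Turn similarities into weights that are positive and sum to 1.
    weights = softmax(scores)
    # The output is a weighted average of the value vectors.
    dim = len(values[0])
    output = [
        sum(w * v[i] for w, v in zip(weights, values))
        for i in range(dim)
    ]
    return weights, output

# Toy example: three "words", each with a key and a value (hypothetical numbers).
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = [[10.0, 0.0], [0.0, 10.0], [5.0, 5.0]]
query = [2.0, 0.0]

weights, output = attention(query, keys, values)
print(weights)
print(output)
```

In this toy setup the first and third keys line up best with the query, so they receive the largest weights and the second word is mostly skimmed over.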

The main ideas

Neural networks — Layers of weighted connections and non-linear activations, trained by backpropagation.
CNNs — Convolutional networks exploit spatial structure — the workhorse of classic computer vision.
RNNs & LSTMs — Recurrent networks for sequences; LSTMs/GRUs address long-range memory (largely superseded by transformers).
Transformers & attention — Self-attention lets every token attend to every other; the architecture behind modern LLMs and much of vision.
Diffusion models & GANs — Two families of generative models powering image, audio, and video synthesis.
Graph neural networks — Networks that operate on graph-structured data — molecules, social networks, knowledge graphs.
Training at scale — GPUs/TPUs, mixed precision, distributed and parallel training, and the scaling laws that drive frontier models.
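The first item in the list above, layers of weighted connections with non-linear activations trained by backpropagation, can be sketched in a few dozen lines of plain Python. This is an illustrative toy under stated assumptions, not how production frameworks do it: the network has two hidden units, the starting weights and the learning rate are arbitrary choices, the data is a tiny logical-OR table, and all function names are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(params, x):
    w1, b1, w2, b2 = params
    # Hidden layer: weighted sums followed by a non-linear activation.
    h = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(w1, b1)]
    # Output layer: a single sigmoid unit.
    y = sigmoid(sum(w * hj for w, hj in zip(w2, h)) + b2)
    return h, y

def loss(params, data):
    # Mean squared error over the dataset.
    return sum((forward(params, x)[1] - t) ** 2 for x, t in data) / len(data)

def gradients(params, data):
    """Backpropagation: apply the chain rule from the output back to the inputs."""
    w1, b1, w2, b2 = params
    gw1 = [[0.0] * len(row) for row in w1]
    gb1 = [0.0] * len(b1)
    gw2 = [0.0] * len(w2)
    gb2 = 0.0
    for x, t in data:
        h, y = forward(params, x)
        # Gradient of the loss with respect to the output unit's pre-activation.
        dz2 = 2.0 * (y - t) / len(data) * y * (1.0 - y)
        gb2 += dz2
        for j, hj in enumerate(h):
            gw2[j] += dz2 * hj
            # Push the error back through hidden unit j.
            dz1 = dz2 * w2[j] * hj * (1.0 - hj)
            gb1[j] += dz1
            for i, xi in enumerate(x):
                gw1[j][i] += dz1 * xi
    return gw1, gb1, gw2, gb2

def step(params, grads, lr):
    # One gradient-descent update; returns new parameters, leaves the old ones intact.
    w1, b1, w2, b2 = params
    gw1, gb1, gw2, gb2 = grads
    return (
        [[w - lr * g for w, g in zip(row, grow)] for row, grow in zip(w1, gw1)],
        [b - lr * g for b, g in zip(b1, gb1)],
        [w - lr * g for w, g in zip(w2, gw2)],
        b2 - lr * gb2,
    )

# Toy dataset: logical OR of two inputs.
data = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0), ([1.0, 0.0], 1.0), ([1.0, 1.0], 1.0)]
# Arbitrary fixed starting weights (an assumption, chosen for reproducibility).
init_params = ([[0.5, -0.4], [-0.3, 0.6]], [0.1, -0.1], [0.7, -0.5], 0.0)

initial_loss = loss(init_params, data)
params = init_params
for _ in range(200):
    params = step(params, gradients(params, data), lr=0.5)
final_loss = loss(params, data)
print(initial_loss, final_loss)
```

Running it prints the loss before and after training; the second number should be smaller than the first. Frameworks such as PyTorch and JAX compute these gradients automatically, but the chain-rule bookkeeping is the same idea.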

Related areas

Machine Learning · NLP & Large Language Models · Generative AI · Computer Vision

