Neural Networks and Deep Learning

Nielsen, Michael A.

What this book is about

On the exercises and problems

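Using neural nets to recognize handwritten digits

How the backpropagation algorithm works

Improving the way neural networks learn

A visual proof that neural nets can compute any function

Why are deep neural networks hard to train?

Deep learning

Appendix: Is there a simple algorithm for intelligence?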

Acknowledgements

Frequently Asked Questions


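Neural networks are one of the most beautiful programming paradigms ever invented. In the conventional approach to programming, we tell the computer what to do, breaking big problems up into many small, precisely defined tasks that the computer can easily perform. By contrast, in a neural network we don't tell the computer how to solve our problem. Instead, it learns from observational data, figuring out its own solution to the problem at hand.

Automatically learning from data sounds promising. However, until 2006 we didn't know how to train neural networks to surpass more traditional approaches, except for a few specialized problems. What changed in 2006 was the discovery of techniques for learning in so-called deep neural networks. These techniques are now known as deep learning. They've been developed further, and today deep neural networks and deep learning achieve outstanding performance on many important problems in computer vision, speech recognition, and natural language processing. Companies such as Google, Microsoft, and Baidu, the Chinese search giant, are deploying deep learning on a large scale.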
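To make the contrast between spelling out a rule and learning it from data a little more concrete, here is a minimal sketch on a toy problem: the logical AND of two bits. It is not taken from the book's code. The function names, the learning rate, and the number of epochs are all illustrative assumptions, and the learner is a single perceptron, which is far simpler than the networks developed later in the book.

```python
# Toy contrast: a hand-written rule versus a rule learned from examples.
# All names and parameter values here are illustrative assumptions.

def and_by_rule(x1, x2):
    """Conventional programming: we state the rule ourselves."""
    return 1 if (x1 == 1 and x2 == 1) else 0


def predict(weights, bias, x1, x2):
    """Output 1 if the weighted sum of the inputs plus the bias is positive."""
    return 1 if weights[0] * x1 + weights[1] * x2 + bias > 0 else 0


def train_perceptron(examples, epochs=20, learning_rate=1):
    """Learn weights and a bias from (inputs, target) pairs.

    Uses the classic perceptron update: nudge each weight by
    learning_rate * (target - output) * input.
    """
    weights = [0, 0]
    bias = 0
    for _ in range(epochs):
        for (x1, x2), target in examples:
            error = target - predict(weights, bias, x1, x2)
            weights[0] += learning_rate * error * x1
            weights[1] += learning_rate * error * x2
            bias += learning_rate * error
    return weights, bias


# The "observational data": every input pair with its desired output.
examples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

weights, bias = train_perceptron(examples)
learned = [predict(weights, bias, x1, x2) for (x1, x2), _ in examples]
by_rule = [and_by_rule(x1, x2) for (x1, x2), _ in examples]

print(learned == by_rule)  # prints True
```

In `and_by_rule` we tell the computer exactly what to do. In `train_perceptron` we only supply examples, and the program adjusts its own parameters until its outputs match them.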

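The purpose of this book is to help you master the core concepts of neural networks, including modern techniques for deep learning. After working through the book you will have written code that uses neural networks and deep learning to solve complex pattern recognition problems. And you will have a foundation to use neural networks and deep learning to attack problems of your own devising.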

A principle-oriented approach

One conviction underlying the book is that it's better to obtain a solid understanding of the core principles of neural networks and deep learning, rather than a hazy understanding of a long laundry list of ideas. If you've understood the core ideas well, you can rapidly understand other new material. In programming language terms, think of it as mastering the core syntax, libraries and data structures of a new language. You may still only "know" a tiny fraction of the total language - many languages have enormous standard libraries - but new libraries and data structures can be understood quickly and easily.

This means the book is emphatically not a tutorial in how to use some particular neural network library. If you mostly want to learn your way around a library, don't read this book! Find the library you wish to learn, and work through the tutorials and documentation. But be warned. While this has an immediate problem-solving payoff, if you want to understand what's really going on in neural networks, if you want insights that will still be relevant years from now, then it's not enough just to learn some hot library. You need to understand the durable, lasting insights underlying how neural networks work. Technologies come and technologies go, but insight is forever.

A hands-on approach

We'll learn the core principles behind neural networks and deep learning by attacking a concrete problem: the problem of teaching a computer to recognize handwritten digits. This problem is extremely difficult to solve using the conventional approach to programming. And yet, as we'll see, it can be solved pretty well using a simple neural network, with just a few tens of lines of code, and no special libraries. What's more, we'll improve the program through many iterations, gradually incorporating more and more of the core ideas about neural networks and deep learning.

This hands-on approach means that you'll need some programming experience to read the book. But you don't need to be a professional programmer. I've written the code in Python (version 2.7), which, even if you don't program in Python, should be easy to understand with just a little effort. Through the course of the book we will develop a little neural network library, which you can use to experiment and to build understanding. All the code is available for download. Once you've finished the book, or as you read it, you can easily pick up one of the more feature-complete neural network libraries intended for use in production.

On a related note, the mathematical requirements to read the book are modest. There is some mathematics in most chapters, but it's usually just elementary algebra and plots of functions, which I expect most readers will be okay with. I occasionally use more advanced mathematics, but have structured the material so you can follow even if some mathematical details elude you. The one chapter which uses heavier mathematics extensively is Chapter 2, which requires a little multivariable calculus and linear algebra. If those aren't familiar, I begin Chapter 2 with a discussion of how to navigate the mathematics. If you're finding it really heavy going, you can simply skip to the summary of the chapter's main results. In any case, there's no need to worry about this at the outset.

It's rare for a book to aim to be both principle-oriented and hands-on. But I believe you'll learn best if we build out the fundamental ideas of neural networks. We'll develop living code, not just abstract theory, code which you can explore and extend. This way you'll understand the fundamentals, both in theory and practice, and be well set to add further to your knowledge.