On writing and chapter organization
A note on the writing in this book. As you may already have guessed, I have decided to settle upon a more conversational tone. I find this tone to be friendlier to the reader who may be intimidated by machine algorithms. I am also, if you have not yet noticed, quite opinionated in my writing. I strive to make clear, through my writing, what is and what ought to be. Application of Hume's fork is at the discretion at the reader. But as a quick guide, when talking about algorithms and how they work, they are is statements. When talking about what should be done, and organization of code, they are ought statements.
There are two general patterns in the design of the chapters of this book. First, the problems get harder as the chapters go on. Second, one may optionally want to mentally divide the chapters into three different parts. Part 1—Chapters 2, Linear Regression - House Price Prediction, Chapter 3, Classification - Spam Email Detection, Chapter 4, Decomposing CO2 Trends Using Time Series Analysis, Chapter 7, Convolutional Neural Networks - MNIST Handwriting Recognition, Chapter 8, Basic Facial Detection) correspond to readers who have an urgent ML problem. Part 2—Chapters 5, Clean Up Your Personal Twitter Timeline by Clustering Tweets, Chapter 9, Hot Dog or Not Hot Dog - Using External Services, Chapter 6, Neural Networks; MNIST Handwriting Recognition, Chapter 7, Convolutional Neural Networks - MNIST Handwriting Recognition, Chapter 8, Basic Facial Detection) are for those who have ML problems akin to the second example. Part 3, the last two chapters, are for people whose machine learning problems are not as urgent, but still require a solution.
Up to Chapter 8, Basic Facial Detection, for each chapter there will typically be one or two sections dedicated to the explanation of the algorithm itself. I strongly believe that one cannot write any meaningful program without at least a basic understanding of the algorithms they are using. Sure, you may use an algorithm that has been provided by someone else, but, without a proper understanding, it's meaningless. Sometimes meaningless programs may produce results, just as sometimes an elephant may appear to know how to do arithmetic, or a pet dog may appear to do a sophisticated feat, like speak.
This also means that I may sound rather dismissive of some ideas. For example, I am dismissive of using ML algorithms to predict stock prices. I do not believe that doing so will be a productive endeavor, because of an understanding of the basic processes that generate prices in a stock market, and the confounding effects of time.
Time, however, will tell if I am right or wrong. It may well be that one day someone will invent a ML algorithm that will work perfectly well on dynamical systems, but not right now, right here. We are at the dawn of a new era of computation, and we must strive to understand things as best as we can. Often you can learn quite a bit from history. Therefore, I also strive to insert important historical anecdotes on how certain things came to be. These are by no means a comprehensive survey. In fact, it is done without much regularity. However, I do rather hope that it adds to the flavor of the book.