Hands-On Mathematics for Deep Learning

Derivatives

To start with, let's imagine a straight line with the following equation:

$$y = mx + b$$

In the equation, the following aspects apply:

  • y is a function of x, often written simply as f(x) (which is the notation we will be predominantly using in the remainder of the book). In the preceding equation, the output value y is dependent on the input value x.
  • The m value is the gradient, which tells us how steep the straight line is, or what its rate of change is (that is, how much a change in the x value affects the y value).
  • The sign of m tells us whether the line is moving upward (positive m) or downward (negative m) as x increases.
  • The b value tells us by how much the line is above or below the origin; it is the value of y at x = 0.
  • The m and b values in a straight line are constant throughout. 

Now that you know what the equation of a straight line looks like, you're probably wondering how to find it for an arbitrary straight line.

We start by picking two points, (x1, y1) and (x2, y2), that lie on the line, and plugging their values into the formula m = (y2 - y1) / (x2 - x1). Having found m, we find b by substituting m and the coordinates of one (x, y) point on the line into the line equation and solving for b.
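The two-step procedure above can be sketched in a few lines of Python. The helper name `line_through` and the sample points are illustrative assumptions, not something taken from the text, and the sketch assumes the two points have different x values (a vertical line has no finite gradient).

```python
def line_through(p1, p2):
    """Return (m, b) for the line y = m*x + b through points p1 and p2.

    Hypothetical helper for illustration; requires x1 != x2.
    """
    x1, y1 = p1
    x2, y2 = p2
    if x1 == x2:
        raise ValueError("vertical line: the gradient is undefined")
    m = (y2 - y1) / (x2 - x1)  # gradient: change in y over change in x
    b = y1 - m * x1            # solve y1 = m*x1 + b for b
    return m, b


# Sample points chosen purely for illustration
m, b = line_through((1, 3), (3, 7))
print(m, b)  # prints: 2.0 1.0
```

Here the line through (1, 3) and (3, 7) comes out as y = 2x + 1; substituting either point back into that equation is a quick way to check the result.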

Well, that was very simple and straightforward. However, there are far more complex equations out there, namely those that describe curves (nonlinear functions), as illustrated in the following image:

Imagine a picture of a couple of hills or camel humps. If you trace their outline, you will have a curve, and as you have no doubt noticed, it goes up and then down and then back up, and the process repeats itself.

From the preceding image of the curve, you can easily tell that the gradient is not constant, as it was in the previous example with the straight line. We could sketch straight lines along the curve and calculate their slopes to understand how the curve moves. However, there is a simpler method than this tedious one.
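To make the "tedious" approach concrete, the following sketch lays short straight segments along a curve and computes the slope of each. The choice of sin(x) as a stand-in for the hills, the sample positions, and the helper name `secant_slope` are all assumptions made for illustration.

```python
import math


def secant_slope(f, x, dx=0.1):
    """Slope of the straight line joining (x, f(x)) and (x + dx, f(x + dx))."""
    return (f(x + dx) - f(x)) / dx


# sin(x) stands in for the hills: it rises, falls, and rises again
for x in [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0]:
    print(f"x = {x:.1f}  slope = {secant_slope(math.sin, x):+.3f}")
```

The printed slopes change from point to point and switch sign as the curve goes from climbing to descending and back, whereas the same helper applied to a straight line would return the same value everywhere.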

At the very core of calculus are two concepts, as follows:

  • Differentiation helps us understand how much a function's output changes as its input changes.
  • Integration helps us understand the accumulated effect of this change between two chosen input points.

We will begin by taking an in-depth look at differentiation. The primary equation for finding the derivative of a function is shown here:

$$\frac{df}{dx} = \lim_{h \to 0} \frac{f(x+h) - f(x)}{h}$$

I know there are a few new symbols here and it looks complicated, but it's really very simple. What this equation is doing is finding the derivative of the function f with respect to x, the variable in the denominator. This isn't too different from the earlier equation we saw (which we used to calculate the gradient of a straight line): we subtract two output values, f(x+h) and f(x), and divide by the difference between their inputs, h. But what does $\lim_{h \to 0}$ have to do with this? It tells us that we want the two points on the curve to be as close to each other as possible, so that the line through them approaches the tangent, the straight line that just touches the curve at that one point. The gradient of that tangent is the gradient of the curve there. This allows us to better visualize and understand the effect of the change, as the accompanying figure shows.
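The role of the limit can also be checked numerically: as h shrinks, the difference quotient settles toward a single value. This is only a minimal sketch; the function f(x) = x², the point x = 3, and the name `difference_quotient` are assumptions chosen for illustration.

```python
def difference_quotient(f, x, h):
    """Gradient of the line through (x, f(x)) and (x + h, f(x + h))."""
    return (f(x + h) - f(x)) / h


def f(x):
    return x ** 2  # illustrative function; its exact derivative is 2x


x = 3.0
for h in [1.0, 0.1, 0.01, 0.001]:
    print(f"h = {h:<6} quotient = {difference_quotient(f, x, h):.4f}")
```

For this particular function the quotient works out algebraically to 2x + h, so the printed values move toward 6 as h gets smaller. In floating-point arithmetic h cannot be made arbitrarily small without rounding error creeping in, which is one reason the limit is taken symbolically rather than by computation.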

 

See the following example:
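As one illustrative case (the choice of f(x) = x² is an assumption made here to keep the working short), the definition of the derivative can be applied step by step:

```latex
\begin{aligned}
\frac{df}{dx} &= \lim_{h \to 0} \frac{(x+h)^2 - x^2}{h} \\
              &= \lim_{h \to 0} \frac{x^2 + 2xh + h^2 - x^2}{h} \\
              &= \lim_{h \to 0} \frac{2xh + h^2}{h} \\
              &= \lim_{h \to 0} \, (2x + h) \\
              &= 2x
\end{aligned}
```

The h in the denominator cancels before the limit is taken, which is why letting h go to 0 does not lead to a division by zero. The result says that the gradient of this curve at any point x is 2x, so unlike a straight line, the gradient depends on where on the curve we look.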

Now that we understand what a derivative is and how to find it for any function, let's move on to some important rules of differentiation.