
Chain rule
One very important result of the symbolic determination of a function's derivative is the chain rule. This formula, first mentioned in a paper by Leibniz in 1676, made it possible to solve the derivatives of composite functions in a very simple and elegant manner, simplifying the solution for very complex functions.
In order to define the chain rule, if we suppose a function f, which is defined as a function of another function g, f(g(x)) of F, the derivative can be defined as follows:

The formula of the chain rule allows us to differentiate formulas whose input values depend on another function. This is the same as searching the rate of change of a function that is linked to a previous one. The chain rule is one of the main theoretical concepts employed in the training phase of neural networks, because in those layered structures, the output of the first neuron layers will be the inputs of the following, giving, as a result, a composite function that, most of the time, is of more than one nesting level.