Learn Unity ML-Agents: Fundamentals of Unity Machine Learning

A Machine Learning example

In order to demonstrate some of these concepts in a practical manner, let's look at an example scenario where we use ML to solve a game problem. In our game, we have a cannon that shoots a projectile at a specific velocity in a physics-based world. The object of the game is to choose the velocity needed to hit a target at a specific distance. We have already fired the cannon ten times and recorded the results in a table and chart, as shown in the following screenshot:



Record and chart of cannon shots

Since the data is already labelled, this problem is well suited to Supervised Training. We will use a very simple method called linear regression to build a model that can predict the velocity needed to hit a target at a certain distance. Microsoft Excel provides a quick way for us to model linear regression on the chart by adding a trendline, as follows:



Linear Regression applied with a trendline

By using this simple feature in Excel, you can quickly analyze your data and see an equation that best fits that data. Now, this is a rudimentary example of data science, but hopefully you can appreciate how easily this approach can be used to make predictions about complex environments based purely on the data. While the linear regression model can provide us with an answer, it obviously is not very good, and the R² value reflects that. The problem we have with our model is that we are using a linear model to try and solve a nonlinear problem. This is reflected by the arrows to the points, where the distance shows the amount of error from the trendline. Our goal with any ML method will be to minimize these errors in order to find the solution of best fit. In most cases, that is all ML is: finding an equation that best predicts/classifies a value or action.
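If you want to reproduce the same kind of fit outside of Excel, the following sketch shows the idea in Python using NumPy's least-squares polynomial fit. The (velocity, distance) values here are placeholders rather than the recorded shots from the table above, so the resulting coefficients and residuals will not match the chart.

import numpy as np

# Placeholder (velocity, distance) pairs -- substitute the ten recorded
# cannon shots from the table above to reproduce the chart's trendline.
velocity = np.array([10, 20, 30, 40, 50, 60, 70, 80, 90, 100], dtype=float)
distance = np.array([15, 55, 110, 170, 230, 320, 400, 480, 590, 700], dtype=float)

# Fit distance = m * velocity + b by least squares (a degree-1 polynomial),
# which minimizes the squared vertical error between the line and each point.
m, b = np.polyfit(velocity, distance, 1)
print(f"trendline: distance = {m:.2f} * velocity + {b:.2f}")

# Residuals: the gap between each recorded distance and the trendline,
# the same error the arrows on the chart illustrate.
residuals = distance - (m * velocity + b)
print("residuals:", np.round(residuals, 1))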

Getting back to our earlier question, we can now solve for the velocity using some simple algebraic substitution, as shown in the following equation:

Where d = distance and v = velocity:

Our final answer is 56.05, but as we already mentioned, we may still miss, because our model is not entirely accurate. However, if you look at the graph, our error appears to be smallest around a distance of 300. So, in our specific example, our model fits well. Looking closer at the graph, though, you can see that at a distance of around 100, our error gets quite large and it is unlikely that we will hit our target.
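As a minimal sketch of that substitution, assuming the trendline takes the linear form distance = m * velocity + b (the actual slope and intercept come from the chart and are not reproduced here), we can invert the model to solve for the velocity that reaches a target distance:

def velocity_for_distance(target_distance, m, b):
    # Invert a linear trendline d = m * v + b to solve for v = (d - b) / m.
    # m and b are the slope and intercept read from the fitted trendline.
    return (target_distance - b) / m

# Hypothetical coefficients for illustration only -- plugging in the slope
# and intercept from the actual trendline is what yields the 56.05 figure.
print(velocity_for_distance(300, m=6.0, b=-12.0))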

R², or R squared, is a goodness-of-fit value between 0 and 1, with 1 being the highest or best fit. R² attempts to summarize the quality of the fit. In some cases it works well, and in others there are other measures of fit that work better. We will use different measures of quality of fit, but the concepts are similar.
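For reference, R² can be computed directly from the residuals. The following sketch shows the calculation, again using placeholder values rather than the recorded shots.

import numpy as np

def r_squared(actual, predicted):
    # Coefficient of determination: 1 minus the ratio of the residual sum
    # of squares to the total sum of squares around the mean.
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    ss_res = np.sum((actual - predicted) ** 2)
    ss_tot = np.sum((actual - np.mean(actual)) ** 2)
    return 1.0 - ss_res / ss_tot

# Placeholder values only, not the recorded cannon shots.
print(round(r_squared([15, 55, 110, 170, 230], [20, 60, 105, 165, 235]), 3))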

The example we just looked at is quite simple and doesn't take into account many other factors, such as elevation differences, movement speed, and so on. If we wanted to add those inputs, we would just add more columns to our table. Each new column would expand our data space and consequently increase the complexity of the model. As you can see, our model could quickly expand and become impractical. These are essentially the shortcomings the gaming industry experienced when it used ML techniques for game AI at the turn of the century. It is also a shortcoming that any other industry faces when implementing supervision-based models: the need to constantly re-sample and relabel data, and consequently retrain models. This is why Reinforcement Learning and other methods of learning have become so significant. They provide a method of learning whereby autonomous agents, with no previous knowledge of an environment, can successfully explore it.