Assessment
When a data scientist evaluates the performance of a model or data science process, this is referred to as assessment. Performance can be defined in several ways, including the model's ability to improve with learning (that is, to obtain a better score) given additional experience (for example, more rounds of training with additional samples of data), or the accuracy of its results.
One popular method of assessing a model's or process's performance is called bootstrap sampling. This method repeatedly examines performance on subsets of the data, generating results that can be used to calculate an estimate of accuracy (performance).
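As a rough sketch of the idea, the following Python snippet repeatedly resamples a dataset with replacement, refits a model, and scores it on the rows left out of each resample to build up an accuracy estimate. The dataset, model, and number of resamples here are assumptions chosen only for illustration, not a prescribed setup.

```python
# A minimal sketch of bootstrap sampling to estimate model accuracy.
# The dataset, model, and number of resamples are illustrative assumptions.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

X, y = load_iris(return_X_y=True)
rng = np.random.default_rng(seed=42)
n_resamples = 100                 # assumed number of bootstrap rounds
scores = []

for _ in range(n_resamples):
    # Draw a bootstrap sample (rows sampled with replacement)
    idx = rng.integers(0, len(X), size=len(X))
    X_boot, y_boot = X[idx], y[idx]

    # Rows never drawn form the held-out ("out-of-bag") set for this round
    oob = np.setdiff1d(np.arange(len(X)), idx)
    if len(oob) == 0:
        continue

    model = LogisticRegression(max_iter=1000).fit(X_boot, y_boot)
    scores.append(accuracy_score(y[oob], model.predict(X[oob])))

print(f"Estimated accuracy: {np.mean(scores):.3f} +/- {np.std(scores):.3f}")
```

Averaging the scores across many resamples gives the estimate of accuracy; the spread of the scores gives a sense of how stable that estimate is.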
The bootstrap sampling method takes a random sample of the data and splits it into three files: a training file, a testing file, and a validation file. The model or process logic is developed based on the data in the training file and then evaluated (or tested) using the testing file. This tune-and-then-test process is repeated until the data scientist is comfortable with the results of the tests. At that point, the model or process is tested once more, this time using the validation file, and the results should provide a true indication of how it will perform.
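The workflow might look like the sketch below. It assumes a hypothetical samples.csv file with a label column, illustrative split proportions, and a simple decision tree whose depth is the parameter being tuned; none of these specifics come from the text above.

```python
# A minimal sketch of the train / test / validation workflow described above.
# File name, target column, split proportions, and model are illustrative assumptions.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

df = pd.read_csv("samples.csv")                      # assumed random sample of data
X, y = df.drop(columns="label"), df["label"]         # assumed target column name

# Split off the validation file first, then divide the rest into training and testing files
X_rest, X_val, y_rest, y_val = train_test_split(X, y, test_size=0.2, random_state=1)
X_train, X_test, y_train, y_test = train_test_split(X_rest, y_rest, test_size=0.25, random_state=1)

# Tune-and-then-test loop: adjust a parameter, refit on the training file,
# and evaluate on the testing file until the results are acceptable
best_model, best_score = None, 0.0
for depth in (2, 4, 8, 16):
    model = DecisionTreeClassifier(max_depth=depth).fit(X_train, y_train)
    score = accuracy_score(y_test, model.predict(X_test))
    if score > best_score:
        best_model, best_score = model, score

# Final evaluation against the validation file gives the true indication of performance
val_score = accuracy_score(y_val, best_model.predict(X_val))
print(f"Test accuracy: {best_score:.3f}, validation accuracy: {val_score:.3f}")
```

Because the validation file is held back until the very end, its score is not inflated by the repeated tuning against the testing file.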