Cross-validation is a technique used to assess the accuracy of a predictive model, based on training set data. It splits the training sets into test and control sets. The test sets are used to fine-tune the model to increase performance (better classification rate or reduced errors in prediction) and the control sets are used to simulate how the model would perform outside the training set. The control and test sets must be carefully chosen for this method to make sense.
This resource is part of a series on specific topics related to data science: regression, clustering, neural networks, deep learning, Hadoop, decision trees, ensembles, correlation, outliers, regression, Python, R, Tensorflow, SVM, data reduction, feature selection, experimental design, time series, cross-validation, model fitting, dataviz, AI and many more. To keep receiving these articles, sign up on DSC.
Source for picture: article flagged with a +