A study of overfitting in optimization of a manufacturing quality control procedure

Tea Tušar, Klemen Gantar, Valentin Koblar, Bernard Ženko, Bogdan Filipič

2017

PDF ScienceDirect

Abstract

Quality control of the commutator manufacturing process can be automated by means of a machine learning model that can predict the quality of commutators as they are being manufactured. Such a model can be constructed by combining machine vision, machine learning and evolutionary optimization techniques. In this procedure, optimization is used to minimize the model error, which is estimated using single cross-validation. This work exposes the overfitting that emerges in such optimization. Overfitting is shown for three machine learning methods with different sensitivity to it (trees, additionally pruned trees and random forests) and assessed in two ways (repeated cross-validation and validation on a set of unseen instances). Results on two distinct quality control problems show that optimization amplifies overfitting, i.e., the single cross-validation error estimate for the optimized models is overly optimistic. Nevertheless, minimization of the error estimate by single cross-validation in general results in minimization of the other error estimates as well, showing that optimization is indeed beneficial in this context.

Type

Journal article

Publication

Applied Soft Computing