We study statistical learning methods for imbalanced training data sets. Imbalanced training sets are common in industrial machine-vision applications: the minority class contains the defects or anomalies we try to catch, while the majority class contains the "regular" objects. We need a method that performs well on both the false positive and the false negative error rates, and traditional methods such as a single classification tree yield unsatisfactory results. We propose a two-stage classification scheme. In the first stage, a subset selection method removes redundant examples from the majority class, so the training sample becomes more balanced without losing critical boundary information. In the second stage, computation-intensive methods such as boosted classification trees are applied to further improve both error rates.