Dimensionality Reduction Through Classifier Ensembles (Paperback)


In data mining, one often needs to analyze datasets with a very large number of attributes. Performing machine learning directly on such datasets is often impractical because of extensive run times, excessive complexity of the fitted model (often leading to overfitting), and the well-known "curse of dimensionality." In practice, to avoid such problems, feature selection and/or extraction are often used to reduce data dimensionality prior to the learning step. However, existing feature selection/extraction algorithms either evaluate features by their effectiveness across the entire dataset or disregard class information altogether (e.g., principal component analysis). Furthermore, feature extraction algorithms such as principal component analysis create new features that are often meaningless to human users. In this article, we present input decimation, a method that provides feature subsets selected for their ability to discriminate among the classes. These subsets are then used in ensembles of classifiers, yielding results superior, on both real and synthetic datasets, to single classifiers, to ensembles trained on the full feature set, and to ensembles based on principal component analysis.
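The abstract leaves the subset-selection criterion and the base learner unspecified, so the following Python sketch is only one plausible reading of the input-decimation idea: each feature is scored by the absolute correlation between its values and a one-vs-rest indicator for each class, and one classifier is trained per class on that class's top-scoring features. The subset size n_keep, the decision-tree base learner, and the posterior-averaging combiner are illustrative assumptions, not details taken from the paper.

import numpy as np
from sklearn.tree import DecisionTreeClassifier

def input_decimation_ensemble(X, y, n_keep=10):
    # Train one classifier per class, each on the n_keep features whose
    # absolute correlation with that class's membership indicator is highest.
    classes = np.unique(y)
    members = []
    for c in classes:
        indicator = (y == c).astype(float)
        corr = np.array([abs(np.corrcoef(X[:, j], indicator)[0, 1])
                         for j in range(X.shape[1])])
        corr = np.nan_to_num(corr)                 # constant features score 0
        subset = np.argsort(corr)[::-1][:n_keep]   # most class-relevant features
        clf = DecisionTreeClassifier().fit(X[:, subset], y)
        members.append((subset, clf))
    return classes, members

def ensemble_predict(classes, members, X):
    # Average the class-posterior estimates of all members,
    # then pick the most probable class.
    probs = np.mean([clf.predict_proba(X[:, subset])
                     for subset, clf in members], axis=0)
    return classes[np.argmax(probs, axis=1)]

One way to read the reported advantage over full-feature ensembles is that each member sees a different, class-focused slice of the feature space, so the members' errors tend to be less correlated than those of classifiers trained on identical inputs.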


Product Details

General

Imprint: Bibliogov
Country of origin: United States
Release date: August 2013
First published: August 2013
Creators:
Dimensions: 246 x 189 x 2 mm (L x W x T)
Format: Paperback - Trade
Pages: 28
ISBN-13: 978-1-287-28110-8
Barcode: 9781287281108
Categories:
LSN: 1-287-28110-9


