Non-linear Dimensionality Reduction Using Principal Curves and Surfaces

Mr. Kui-yu Chang
Dept. of ECE
UT Austin

Wednesday, April 19th, 2:00 PM, ENS 537

kuiyu@lans.ece.utexas.edu


Abstract

With the advent of next-generation sensors and large-scale data acquisition and storage, huge amounts of higher-definition data are now available for analysis. Ironically, this makes dimensionality reduction even more important. This dissertation takes the feature extraction approach to dimensionality reduction by studying nonlinear manifolds for characterizing high-dimensional data. Existing nonlinear feature extraction methods are evaluated and found to be deficient, with the exception of the principal curve, which is a nonlinear generalization of principal components. Unfortunately, current principal curve formulations suffer from a number of problems. I first develop a new parametric formulation called the probabilistic principal curve. This improved principal curve is then applied to feature extraction and classification of moderate-dimensional data, with results superior to those of classifiers based on k-nearest neighbors and multilayer perceptron neural networks.

I then develop the probabilistic principal surface (PPS), a general parametric model for computing principal surfaces of arbitrary dimensionality and topology. The PPS is free from most of the problems associated with existing principal surface formulations. A generalized expectation maximization algorithm with guaranteed convergence is derived for the PPS. Empirical properties of the PPS are also studied. Next, a spherical PPS is proposed for emulating the sparsity and periphery properties of high-dimensional data. Consequently, the spherical PPS serves as a powerful visualization tool, offering several advantages over the popular visualization methods based on principal component analysis. A template-based classifier using spherical PPSs as reference manifolds is also proposed. The spherical PPS classifier is shown to perform better than traditional classifiers based on k-nearest neighbors and Gaussian mixture models on several problems, including the classification and pose estimation of 3-D objects from 2-D images. Results on simulated aircraft and vehicle data show the proposed approach to be highly accurate and effective.


A list of Telecommunications and Signal Processing Seminars is available from the ECE department Web pages under "Seminars". The Web address for the Telecommunications and Signal Processing Seminars is http://anchovy.ece.utexas.edu/seminars