A Robust Classification of Galaxy Spectra: Dealing with Noisy and Incomplete Data

Abstract
Over the next few years, new spectroscopic surveys (from the optical surveys of the Sloan Digital Sky Survey and the 2 degree Field survey through to space-based ultraviolet satellites such as GALEX) will provide the opportunity and challenge of understanding how galaxies of different spectral type evolve with redshift. Techniques have been developed to classify galaxies based on their continuum and line spectra. Some of the most promising of these have used the Karhunen-Loève transform (or Principal Component Analysis) to separate galaxies into distinct classes. The limitation of these techniques has been that they assume the spectral coverage and quality of the spectra are constant for all galaxies within a given sample. In this paper we develop a general formalism that accounts for missing data within the observed spectra (such as the removal of sky lines, or the effect of sampling different intrinsic rest wavelength ranges due to the redshift of a galaxy). We demonstrate that by correcting for these gaps we can recover an almost redshift-independent classification scheme. From this classification we can derive an optimal interpolation that reconstructs the underlying galaxy spectral energy distributions in the regions of missing data. This provides a simple and effective mechanism for building galaxy spectral energy distributions directly from data that may be noisy, incomplete, or drawn from a number of different sources.
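The abstract does not spell out the formalism, so the following is only a minimal sketch of one common way to do a gap-corrected projection: a weighted least-squares fit of a spectrum onto a fixed eigenbasis, with zero weight on missing pixels, followed by filling the gaps from the fitted model. The function names (`gappy_coefficients`, `fill_gaps`), the synthetic orthonormal basis, and the gap location are all assumptions made for illustration; they are not taken from the paper.

```python
import numpy as np

def gappy_coefficients(flux, weights, basis):
    """Weighted least-squares expansion coefficients (hypothetical helper).

    flux    : (n_pix,) observed fluxes; entries with zero weight are ignored
    weights : (n_pix,) non-negative weights, 0 marks a missing pixel
    basis   : (n_comp, n_pix) basis spectra stored as rows
    """
    # Zero out masked pixels so NaNs in the gaps cannot propagate.
    flux = np.where(weights > 0, flux, 0.0)
    weighted_basis = basis * weights
    # Normal equations M a = F of the weighted fit.
    M = weighted_basis @ basis.T
    F = weighted_basis @ flux
    return np.linalg.solve(M, F)

def fill_gaps(flux, weights, basis):
    """Return the coefficients and the spectrum with gaps filled by the model."""
    coeffs = gappy_coefficients(flux, weights, basis)
    model = coeffs @ basis
    return coeffs, np.where(weights > 0, flux, model)

# Synthetic, noise-free demonstration (assumed data, not real spectra).
rng = np.random.default_rng(0)
n_pix, n_comp = 200, 3
q, _ = np.linalg.qr(rng.normal(size=(n_pix, n_comp)))
basis = q.T                                  # orthonormal rows
true_coeffs = np.array([2.0, -1.0, 0.5])
true_flux = true_coeffs @ basis

weights = np.ones(n_pix)
weights[60:90] = 0.0                         # simulate a masked region
observed = true_flux.copy()
observed[60:90] = np.nan

coeffs, filled = fill_gaps(observed, weights, basis)
```

Solving the weighted normal equations, rather than taking plain dot products with the basis, matters because the basis vectors are no longer orthogonal once pixels are masked. With noisy data the weights would typically be inverse variances rather than a 0/1 mask.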
