Kernel‐based PLS regression; Cross‐validation and applications to spectral data

Abstract
Multivariate images are very large data structures, and any type of regression on them is computationally demanding. Kernel‐based partial least squares (PLS) regression, presented in an earlier paper, makes the calculation faster and less demanding of computer memory. The present paper is a direct continuation of that paper. In this study the kernel PLS algorithm is extended to include cross‐validation for determining the optimal model dimensionality. To show the applicability of the kernel algorithm, two examples from multivariate image analysis are used. The first example is an image from an airborne scanner of size 9 × 512 × 512. Its nine constituent images are regressed against a constructed dependent image to test the accuracy of the kernel algorithm on large data structures. The second example is a satellite image of size 7 × 512 × 512. Several different regression models are presented, together with a comparison of their predictive capabilities. These regression models also serve as examples of the use of cross‐validation.
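As an illustration of why working from kernel matrices helps with data of this shape, the following is a minimal sketch and not the algorithm as published in either paper. It assumes the image has been unfolded so that each pixel is a row of X (for a 9 × 512 × 512 image, 262144 rows and 9 columns), that X and Y are already mean-centred, and that the PLS components are computed from the small matrices X'X and X'Y alone, using a commonly cited kernel formulation in which only X'Y is deflated. The function names `kernel_pls` and `cross_validate_press` are hypothetical, as is the choice to obtain each cross-validation segment's training kernels by subtracting the held-out rows' contribution from the full kernels.

```python
import numpy as np


def kernel_pls(XtX, XtY, n_components):
    """Return a list of coefficient matrices B_1..B_A (one per model size)
    computed only from the kernels X'X (K x K) and X'Y (K x M)."""
    K, M = XtY.shape
    XtY = XtY.copy()                      # deflated in place below
    P = np.zeros((K, n_components))       # X loadings
    R = np.zeros((K, n_components))       # weights giving scores as T = X R
    Q = np.zeros((M, n_components))       # Y loadings
    coefs = []
    for a in range(n_components):
        if M == 1:
            w = XtY[:, 0].copy()
        else:
            # dominant left singular vector of the current X'Y
            u, _, _ = np.linalg.svd(XtY, full_matrices=False)
            w = u[:, 0]
        w = w / np.linalg.norm(w)
        r = w.copy()
        for j in range(a):
            r -= (P[:, j] @ w) * R[:, j]
        tt = r @ XtX @ r                  # t't without ever forming t
        p = XtX @ r / tt
        q = XtY.T @ r / tt
        XtY -= tt * np.outer(p, q)        # deflate X'Y only
        P[:, a], R[:, a], Q[:, a] = p, r, q
        coefs.append(R[:, : a + 1] @ Q[:, : a + 1].T)
    return coefs


def cross_validate_press(X, Y, max_components, n_segments=5):
    """PRESS for 1..max_components components, with contiguous segments.

    Assumption: the training kernels for each segment are the full kernels
    minus the held-out rows' contribution, so the full kernels are built once.
    """
    XtX, XtY = X.T @ X, X.T @ Y
    segments = np.array_split(np.arange(X.shape[0]), n_segments)
    press = np.zeros(max_components)
    for idx in segments:
        Xo, Yo = X[idx], Y[idx]
        coefs = kernel_pls(XtX - Xo.T @ Xo, XtY - Xo.T @ Yo, max_components)
        for a, B in enumerate(coefs):
            press[a] += np.sum((Yo - Xo @ B) ** 2)
    return press
```

In this sketch the model dimensionality would be read off as the number of components at which the returned PRESS values stop decreasing appreciably. Y must be passed as a two-dimensional array, even when there is a single dependent variable.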