Page updated: August 4, 2020
Author: Curtis Mobley

# Thematic Mapping

Thematic mapping refers to the determination and display of a particular type of information (the theme). In terrestrial and oceanic remote sensing, a common theme is the type of surface material. On land, a thematic map might display the land areas covered by forest, grassland, water, crops, bare soil, pavement, etc. In shallow waters, the thematic map might distinguish bottom areas covered by mud, sand, rock, sea grass, coral, etc. Much work has been done recently on mapping bathymetry, bottom type, and water IOPs as extracted from hyperspectral imagery. This page compares the supervised classiﬁcation technique used for terrestrial thematic mapping with spectrum matching techniques (e.g., Mobley et al. (2005); Dekker et al. (2011)) for shallow-water mapping of bottom type.

The simultaneous retrieval of bathymetry, bottom classification, and water IOPs is a much more difficult task than the traditional thematic mapping used in terrestrial remote sensing to determine land surface type. In terrestrial thematic mapping, only the type of land surface must be deduced from an atmospherically corrected image spectrum; there are no confounding influences of water IOPs and depth. We will see that terrestrial techniques for supervised classification are not well suited to the oceanic problem because of the additional complications of bottom depth and water optical properties, neither of which is present in terrestrial remote sensing.

#### Supervised classiﬁcation

In supervised classiﬁcation the object is to associate a given image spectrum with one of several pre-determined classes of spectra. In terrestrial remote sensing these classes are typically deﬁned as soil, grass, trees, water, pavement, etc. A thematic map of earth surface features is then generated by classifying the spectrum from each image pixel into one of the pre-determined classes.

One approach to supervised classification is to compute the mean spectrum for each class and a corresponding covariance matrix that defines the "size" of each class of spectra about its mean. The image spectrum is then compared only with the mean spectrum and size of each class. It is assigned to the class to which it most likely belongs, according to some metric for the distance between the image and mean spectra and to user-specified assumptions about the statistical properties of the class members.

This page considers the terrestrial and oceanic problems in more detail and shows that the standard terrestrial thematic mapping methodology based on supervised classiﬁcation is not easily applied to the ocean remote sensing problem.

#### Covariance and correlation matrices

Consider a collection of $N$ remote sensing reﬂectance spectra ${R}_{rs}$, each with $K$ wavelengths, which we denote by ${R}_{n}\left({\lambda }_{k}\right),n=1,...,N$ and $k=1,...,K$ (dropping the rs subscript on ${R}_{rs}$ for convenience). The spectra can be regarded as column vectors:

 ${\mathbf{\text{R}}}_{n}=\left(\begin{array}{c}\hfill {R}_{n}\left({\lambda }_{1}\right)\hfill \\ \hfill {R}_{n}\left({\lambda }_{2}\right)\hfill \\ \hfill ⋮\hfill \\ \hfill {R}_{n}\left({\lambda }_{K}\right)\hfill \end{array}\right)={\left[{R}_{n}\left({\lambda }_{1}\right),{R}_{n}\left({\lambda }_{2}\right),\dots ,{R}_{n}\left({\lambda }_{K}\right)\right]}^{T}\phantom{\rule{0.3em}{0ex}}\phantom{\rule{0.3em}{0ex}},$ (1)

where bold type indicates a vector or matrix, and superscript T indicates transpose. In the spectrum matching technique described previously, these spectra are the database spectra, $N$ is usually $10^{5}$ or more, and $K$ would be 75 for spectra from 380 to 750 nm with 5 nm resolution. Let

 $\mathbf{\text{I}}={\left[R\left({\lambda }_{1}\right),R\left({\lambda }_{2}\right),\dots ,R\left({\lambda }_{K}\right)\right]}^{T}$

be the image spectrum that is to be classiﬁed.

Now consider subsets of the entire database that deﬁne various classes of spectra. To be speciﬁc in the illustrative computations below, we chose four classes of spectra: ${R}_{rs}$ for 10 sand and sediment spectra seen through 0.01 m of water, 10 coral spectra seen through 0.01 m of water, and the same sand and coral spectra seen through 10 m of the same water. The water IOPs were based on measurements of the very clear water in the Bahamas. The sand and sediment spectra range from clean ooid sand to heavily bioﬁlmed, darker sand. The coral spectra are diﬀerent species of corals. Figure 1 shows the individual spectra in these four classes. To minimize the array sizes for the printout of Table 1 below, we subsampled the spectra to wavelengths of 400, 450, ..., 650, 700 nm, so that $K=7$. The subsampled spectra are shown in Fig. 2.

These spectra are obviously correlated in wavelength. The amount of correlation between one wavelength and another is quantiﬁed by the covariance and correlation matrices, which are computed as follows. Let $m=1,...,M$ label the class, with $M$ being the total number of classes (here 4). Class $m$ contains ${N}_{m}$ spectra (here, ${N}_{m}=10$ for each class). Then the mean or average spectrum for each class is deﬁned by

 ${\overline{R}}_{m}\left({\lambda }_{i}\right)=\frac{1}{{N}_{m}}\sum _{n=1}^{{N}_{m}}{R}_{n}\left({\lambda }_{i}\right)\phantom{\rule{0.3em}{0ex}}\phantom{\rule{0.3em}{0ex}},$ (2)

where the sum is over the spectra belonging to class $m$. In vector notation this is

 ${\overline{\mathbf{\text{R}}}}_{m}=\frac{1}{{N}_{m}}\sum _{n=1}^{{N}_{m}}{\mathbf{\text{R}}}_{n}\phantom{\rule{0.3em}{0ex}}\phantom{\rule{0.3em}{0ex}}.$ (3)

The mean spectra for the example four classes are shown by the heavy lines in Fig. 2.

The elements of the $K×K$ class covariance matrices ${\Sigma }_{m}$ are deﬁned by

 ${\Sigma }_{m}\left(i,j\right)=\frac{1}{{N}_{m}-1}\sum _{n=1}^{{N}_{m}}\left[{R}_{n}\left({\lambda }_{i}\right)-{\overline{R}}_{m}\left({\lambda }_{i}\right)\right]\left[{R}_{n}\left({\lambda }_{j}\right)-{\overline{R}}_{m}\left({\lambda }_{j}\right)\right]\phantom{\rule{0.3em}{0ex}}\phantom{\rule{0.3em}{0ex}}.$ (4)

${\Sigma }_{m}\left(i,j\right)$ expresses the covariance of the class spectra at wavelength ${\lambda }_{i}$ with ${\lambda }_{j}$; ${\Sigma }_{m}\left(i,i\right)$ is the variance of the class spectra at ${\lambda }_{i}$. For remote-sensing reﬂectance spectra ${R}_{rs}$ with units of ${\text{sr}}^{-1}$, the units of ${\Sigma }_{m}\left(i,j\right)$ are ${\text{sr}}^{-2}$. If we arrange the spectrum column vectors for class $m$ in a $K×{N}_{m}$ matrix with the class mean removed,

 ${\mathbf{\text{R}}}_{\left(m\right)}=\left(\begin{array}{ccc}\hfill {R}_{1}\left({\lambda }_{1}\right)-{\overline{R}}_{m}\left({\lambda }_{1}\right)\hfill & \hfill \cdots \phantom{\rule{0.3em}{0ex}}\hfill & \hfill {R}_{{N}_{m}}\left({\lambda }_{1}\right)-{\overline{R}}_{m}\left({\lambda }_{1}\right)\hfill \\ \hfill ⋮\hfill & \hfill \cdots \phantom{\rule{0.3em}{0ex}}\hfill & \hfill ⋮\hfill \\ \hfill {R}_{1}\left({\lambda }_{K}\right)-{\overline{R}}_{m}\left({\lambda }_{K}\right)\hfill & \hfill \cdots \phantom{\rule{0.3em}{0ex}}\hfill & \hfill {R}_{{N}_{m}}\left({\lambda }_{K}\right)-{\overline{R}}_{m}\left({\lambda }_{K}\right)\hfill \end{array}\right)\phantom{\rule{0.3em}{0ex}}\phantom{\rule{0.3em}{0ex}},$ (5)

then the covariance matrix for class $m$ can be compactly written as

 ${\Sigma }_{m}=\frac{1}{{N}_{m}-1}{\mathbf{\text{R}}}_{\left(m\right)}{\mathbf{\text{R}}}_{\left(m\right)}^{T}\phantom{\rule{0.3em}{0ex}}\phantom{\rule{0.3em}{0ex}}.$ (6)

The elements of the $K×K$ correlation matrix ${\rho }_{m}$ for class $m$ are deﬁned from the class covariance matrix ${\Sigma }_{m}$ by

 ${\rho }_{m}\left(i,j\right)=\frac{{\Sigma }_{m}\left(i,j\right)}{\sqrt{{\Sigma }_{m}\left(i,i\right){\Sigma }_{m}\left(j,j\right)}}\phantom{\rule{0.3em}{0ex}}\phantom{\rule{0.3em}{0ex}}.$ (7)
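Equations (2) through (7) translate almost directly into array operations. The following is a minimal NumPy sketch, not the code used to produce the results on this page. The function name `class_statistics` and the synthetic spectra are assumptions made only for illustration; real class spectra would replace the made-up array.

```python
import numpy as np

def class_statistics(spectra):
    """Mean, covariance, and correlation for one class of spectra.

    spectra: array of shape (K, N_m), one column per spectrum.
    """
    spectra = np.asarray(spectra, dtype=float)
    n_m = spectra.shape[1]
    mean = spectra.mean(axis=1)                 # Eq. (3)
    centered = spectra - mean[:, np.newaxis]    # mean-removed matrix, Eq. (5)
    cov = centered @ centered.T / (n_m - 1)     # Eq. (6)
    std = np.sqrt(np.diag(cov))                 # standard deviation at each wavelength
    corr = cov / np.outer(std, std)             # Eq. (7)
    return mean, cov, corr

# Synthetic stand-in for one class: K = 7 wavelengths, N_m = 10 spectra.
# The spectral shape and noise levels are made up for this example.
rng = np.random.default_rng(0)
K, N_m = 7, 10
base = np.linspace(0.01, 0.03, K)                  # arbitrary shape, sr^-1
scale = 1.0 + 0.2 * rng.standard_normal(N_m)       # per-spectrum brightness
spectra = np.outer(base, scale) + 1e-4 * rng.standard_normal((K, N_m))

mean, cov, corr = class_statistics(spectra)
print(np.round(corr, 3))
```

The explicit form is kept here to mirror the equations; `np.cov(spectra)` and `np.corrcoef(spectra)` compute the same quantities with the same $1/({N}_{m}-1)$ normalization.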

Table 1 shows the class covariance and correlation matrices computed by these equations for the four classes of spectra shown in Fig. 2.

Table 1. Covariance and correlation matrices for the four classes of spectra seen in Fig. 2. Wavelength 1 (400 nm) is at the upper left and wavelength 7 (700 nm) is at the lower right of each array. Units for $\Sigma$ are ${\text{sr}}^{-2}$; $\rho$ is non-dimensional.

These speciﬁc examples make it clear that

• For a given class, ${R}_{rs}$ at one wavelength is highly correlated with ${R}_{rs}$ at another wavelength, as expected.
• The covariance and correlation matrices are diﬀerent for each class. These matrices depend not only on bottom type (sand vs coral) but also on bottom depth (and water IOPs, not explicitly shown here). In other words, the wavelength covariances carry information about both bottom type and water depth and IOPs.

#### Spectrum matching vs. statistical classiﬁcation

One metric for comparing two spectra is the simple Euclidean metric, which measures the squared distance (in units of ${\text{sr}}^{-2}$) between an image spectrum I and each ${\mathbf{\text{R}}}_{n}$ in the database:

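 ${D}_{E}^{2}\left(n\right)=\sum _{i=1}^{K}{\left[I\left({\lambda }_{i}\right)-{R}_{n}\left({\lambda }_{i}\right)\right]}^{2}={\left[\mathbf{\text{I}}-{\mathbf{\text{R}}}_{n}\right]}^{T}\left[\mathbf{\text{I}}-{\mathbf{\text{R}}}_{n}\right]\phantom{\rule{0.3em}{0ex}}\phantom{\rule{0.3em}{0ex}}.$ (8)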

The spectrum ${\mathbf{\text{R}}}_{n}$ giving the minimum distance ${D}_{E}^{2}\left(n\right)$ of all $N$ database spectra determines the closest match to the image spectrum I. Note that this is not a statistical estimate in the sense that no probability model is involved. Note also that the image spectrum is being compared with every spectrum in the database, not just with pre-deﬁned class mean spectra.
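As an illustration of Eq. (8), the sketch below searches a database for the spectrum closest to an image spectrum. It is a minimal example under stated assumptions: the function name `closest_match` is hypothetical, and the database is filled with random numbers rather than modeled ${R}_{rs}$ spectra.

```python
import numpy as np

def closest_match(image_spectrum, database):
    """Index of the database spectrum minimizing Eq. (8), and that distance.

    image_spectrum: array of shape (K,)
    database: array of shape (N, K), one row per database spectrum
    """
    diff = database - image_spectrum           # broadcasts over the N rows
    d2 = np.einsum("nk,nk->n", diff, diff)     # squared distance for every n
    n_best = int(np.argmin(d2))
    return n_best, float(d2[n_best])

# Synthetic database: random values standing in for R_rs spectra (sr^-1).
rng = np.random.default_rng(1)
N, K = 1000, 7
database = rng.uniform(0.0, 0.05, size=(N, K))

# Fake "image" spectrum: a database entry plus a little noise.
true_index = 123
image = database[true_index] + 1e-5 * rng.standard_normal(K)

n_best, d2_best = closest_match(image, database)
print(n_best, d2_best)
```

Every one of the $N$ spectra is visited, so the cost grows linearly with the size of the database.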

In traditional thematic classification, an image spectrum I is compared only with the mean spectrum and "size" for each class, as expressed by the class mean ${\overline{\mathbf{\text{R}}}}_{m}$ and covariance ${\Sigma }_{m}$. Here "size" is used in the sense that the variances and covariances in ${\Sigma }_{m}$ are larger when the spread of ${R}_{rs}$ spectra is greater. Compare, for example, the elements of ${\Sigma }_{m}$ for the class of sand at 0.01 m with those for sand at 10 m, for which the spectra are all much closer together (especially at blue and red wavelengths) and thus have smaller covariances. The class covariance matrix defines the size of the "swarm of points" surrounding the centroid (mean class spectrum) representing the class in K-dimensional ${R}_{rs}$ space. The image spectrum is assigned to a particular class according to a statistical model (often based on the assumption of a multivariate normal distribution of the swarm of points) that determines the probability that the image spectrum belongs to the swarm of points defining a given class. The class spectra (K-dimensional swarms of points) generally overlap, so that an unambiguous, non-probabilistic association of I with a given class is not possible.

In maximum likelihood estimation (MLE; see Richards and Jia (1996) for an excellent discussion of this whole business), the distance metric is

 ${D}_{MLE}^{2}\left(m\right)=\ln |{\Sigma }_{m}|+{\left[\mathbf{\text{I}}-{\overline{\mathbf{\text{R}}}}_{m}\right]}^{T}\phantom{\rule{0.3em}{0ex}}{\Sigma }_{m}^{-1}\phantom{\rule{0.3em}{0ex}}\left[\mathbf{\text{I}}-{\overline{\mathbf{\text{R}}}}_{m}\right]\phantom{\rule{0.3em}{0ex}}\phantom{\rule{0.3em}{0ex}},$ (9)

where $|{\Sigma }_{m}|$ denotes the determinant of ${\Sigma }_{m}$ and ${\Sigma }_{m}^{-1}$ denotes the inverse. $|{\Sigma }_{m}|$ and ${\Sigma }_{m}^{-1}$ are of course pre-computed for each class before doing the spectrum matching. The image spectrum I is assigned to the class m having the smallest value of ${D}_{MLE}^{2}\left(m\right)$. Note that now the image spectrum is compared only with the class mean spectra ${\overline{\mathbf{\text{R}}}}_{m}$. The assignment of the image spectrum to a particular class is based on its closeness to the class mean and the spread of the "swarm of points" surrounding the mean, as described by the covariance matrix. This metric involves matrix multiplication, which is computationally expensive, but the number of classes is generally small, so in practice this may not be a problem.
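The MLE metric of Eq. (9) can be sketched as follows. This is an illustrative example only: the function names `mle_distance` and `classify_mle` are hypothetical, and the two classes are synthetic multivariate normal samples with made-up means and covariances, not the sand and coral classes discussed above. The log-determinant and the linear solve are recomputed on each call for brevity, whereas in practice they would be pre-computed once per class.

```python
import numpy as np

def mle_distance(image_spectrum, class_mean, class_cov):
    """Eq. (9): ln|Sigma_m| + (I - Rbar_m)^T Sigma_m^{-1} (I - Rbar_m)."""
    sign, logdet = np.linalg.slogdet(class_cov)
    if sign <= 0:
        raise ValueError("covariance matrix must be positive definite")
    diff = image_spectrum - class_mean
    # solve() applies the inverse without forming it explicitly
    return float(logdet + diff @ np.linalg.solve(class_cov, diff))

def classify_mle(image_spectrum, class_means, class_covs):
    """Return the index of the class with the smallest MLE distance."""
    d2 = [mle_distance(image_spectrum, m, c)
          for m, c in zip(class_means, class_covs)]
    return int(np.argmin(d2)), d2

# Two synthetic classes with K = 3 wavelengths and different spreads.
rng = np.random.default_rng(2)
corr_shape = 0.9 * np.ones((3, 3)) + 0.1 * np.eye(3)   # strong wavelength correlation
true_means = [np.array([0.010, 0.020, 0.030]),
              np.array([0.030, 0.020, 0.010])]
true_covs = [(1e-3) ** 2 * corr_shape, (3e-3) ** 2 * corr_shape]

# Estimate each class mean and covariance from 50 random samples.
class_means, class_covs = [], []
for mu, sigma in zip(true_means, true_covs):
    samples = rng.multivariate_normal(mu, sigma, size=50)   # shape (50, 3)
    class_means.append(samples.mean(axis=0))
    class_covs.append(np.cov(samples, rowvar=False))

image = true_means[1]          # image spectrum at the center of class 1
m_best, d2 = classify_mle(image, class_means, class_covs)
print(m_best, np.round(d2, 1))
```

Note that the covariance estimate is invertible only if the class has more spectra than wavelengths and the spectra are not linearly dependent; the example keeps $K$ small for that reason.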

It is often said that the incorporation of ${\Sigma }_{m}$ into the distance metric "removes the effect of correlations between wavelengths." This interpretation of the effect of ${\Sigma }_{m}$ relates to the fact that covariance matrices are the foundation of principal component analysis (PCA; see Preisendorfer (1988)). In PCA the original independent, physical variables (here, the wavelengths) are transformed to obtain a new set of (generally unphysical) independent variables for which the data are uncorrelated. This transformation can be viewed as a rotation of the axes of the original, physical data space (here the wavelength axes used for plots in K-dimensional space) to generate new (generally unphysical) axes for which the data are uncorrelated.
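The rotation described here can be demonstrated numerically. In the sketch below, which uses synthetic correlated data with made-up mean and covariance, the eigenvectors of the covariance matrix define the new axes, and the covariance of the rotated data comes out diagonal to within rounding error.

```python
import numpy as np

# Synthetic, strongly correlated "spectra": K = 3 wavelengths, 500 samples.
rng = np.random.default_rng(3)
K, n_samples = 3, 500
true_mean = np.array([0.010, 0.020, 0.030])
true_cov = 1e-6 * (0.9 * np.ones((K, K)) + 0.1 * np.eye(K))
spectra = rng.multivariate_normal(true_mean, true_cov, size=n_samples).T  # (K, n_samples)

cov = np.cov(spectra)                        # K x K covariance in wavelength space
eigvals, eigvecs = np.linalg.eigh(cov)       # columns of eigvecs are the new axes

# Rotate the mean-removed data onto the eigenvector axes.
centered = spectra - spectra.mean(axis=1, keepdims=True)
scores = eigvecs.T @ centered

cov_rotated = np.cov(scores)                 # diagonal, with eigvals on the diagonal
print(np.round(cov_rotated / eigvals.max(), 6))
```

The off-diagonal elements vanish in the rotated coordinates, which is the sense in which the data there are uncorrelated.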

If the class covariances are equal (or assumed to be equal), then $\ln |{\Sigma }_{m}|$ is the same for each class and can be ignored. The MLE metric then reduces to the Mahalanobis distance metric,

 ${D}_{M}^{2}\left(m\right)={\left[\mathbf{\text{I}}-{\overline{\mathbf{\text{R}}}}_{m}\right]}^{T}\phantom{\rule{0.3em}{0ex}}{\Sigma }^{-1}\phantom{\rule{0.3em}{0ex}}\left[\mathbf{\text{I}}-{\overline{\mathbf{\text{R}}}}_{m}\right]\phantom{\rule{0.3em}{0ex}}\phantom{\rule{0.3em}{0ex}},$ (10)

where $\Sigma$ is the common value of ${\Sigma }_{m}$. The image spectrum I is then assigned to the class m having the smallest value of ${D}_{M}^{2}\left(m\right)$.
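A short sketch of the Mahalanobis metric of Eq. (10) follows, again under stated assumptions: the function name `mahalanobis_sq`, the common covariance matrix, the class means, and the image spectrum are all made up for illustration. The second half checks numerically that the Mahalanobis distance equals the ordinary Euclidean distance computed after rotating to the eigenvector axes of $\Sigma$ and rescaling each axis to unit variance.

```python
import numpy as np

def mahalanobis_sq(image_spectrum, class_mean, common_cov):
    """Eq. (10): (I - Rbar_m)^T Sigma^{-1} (I - Rbar_m)."""
    diff = image_spectrum - class_mean
    return float(diff @ np.linalg.solve(common_cov, diff))

# Made-up common covariance and class means for K = 3 wavelengths.
common_cov = 1e-6 * (0.9 * np.ones((3, 3)) + 0.1 * np.eye(3))
class_means = [np.array([0.010, 0.020, 0.030]),
               np.array([0.030, 0.020, 0.010])]
image = np.array([0.012, 0.021, 0.027])

d2 = [mahalanobis_sq(image, m, common_cov) for m in class_means]
m_best = int(np.argmin(d2))

# Same distances via a decorrelating transform: rotate onto the eigenvectors
# of the common covariance and divide each new axis by its standard deviation.
eigvals, eigvecs = np.linalg.eigh(common_cov)
whiten = (eigvecs / np.sqrt(eigvals)).T
d2_whitened = [float(np.sum((whiten @ (image - m)) ** 2)) for m in class_means]

print(m_best, d2, d2_whitened)
```

The two lists agree to rounding error, which gives a concrete meaning to the statement that the covariance matrix in the metric removes the effect of wavelength correlations.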

We have seen by the speciﬁc examples of Fig. 2 and Table 1 that the covariance matrices are diﬀerent for diﬀerent classes of the sort that are relevant for ocean-bottom remote sensing. Indeed, Table 1 shows that the elements of the ${\Sigma }_{m}$ can change by orders of magnitude as a function of water depth. This inequality of the ${\Sigma }_{m}$ for diﬀerent classes precludes use of the Mahalanobis metric for classes as deﬁned here. For the retrievals needed for shallow-water mapping of bottom type, MLE (or something else) would have to be used with a diﬀerent covariance matrix for each class.

However, it is not at all clear how meaningful classes should be defined for simultaneous retrievals of bottom type, water column IOPs, and bottom depth. Should one class be "sand spectra at 5.25 m depth with a particular set of water absorption, scattering, and backscatter spectra," and another class be "sand spectra at 5.25 m depth with the same absorption and scattering spectra but a different backscatter fraction," and another class be "sand spectra at 5.50 m with the first set of IOPs," and then another class be "sea grass spectra at 7.50 m with yet another set of IOPs," and so on? If so, then the number of classes quickly becomes as large as the product of the numbers of depths, IOP sets, and pre-chosen classes of bottom type (sand, coral, sea grass, etc.). A database generated as previously described easily could have hundreds or thousands of classes (a database often has 50-100 bottom depths, several dozen to several hundred sets of IOPs, and more than 100 bottom reflectance spectra). With such a large number of classes, the validity of doing traditional thematic mapping becomes uncertain, not to mention the additional computational costs involved with the matrix multiplications.

Clearly spectrum matching for shallow-water applications addresses a much more complicated problem than classic terrestrial thematic mapping, which corresponds to retrieval of bottom type if there were no water present, i.e. no simultaneous retrieval of depth and IOPs. Because of the greater complexity of the oceanographic retrieval problem, and because of the diﬃculty in deﬁning meaningful classes, shallow-water spectrum matching does not use statistical classiﬁcation techniques such as MLE. The spectrum matching approach of Mobley et al. (2005) does not compare an image spectrum to a class mean spectrum. In that technique, an image spectrum is compared to every spectrum in a database to ﬁnd the closest match by the chosen (Euclidean or some other) metric, which is appropriate in this case. In a manner of speaking, each database ${R}_{rs}$ spectrum is a separate class corresponding to a particular depth, bottom reﬂectance spectrum, and set of IOPs. In such a situation (only one member in each class) the covariance matrix is undeﬁned.

Moreover, for the present problem it is not even desirable to remove the eﬀects of wavelength correlations, as can be done with the MLE or Mahalanobis metrics, because the wavelength correlations carry information that is critical to separating depth and IOPs eﬀects from bottom type eﬀects.

The spectrum-matching approach of Mobley et al. (2005) for shallow-water benthic mapping therefore avoids deﬁning predetermined classes and ﬁnds the closest match from the entire database. This gives the highest possible resolution (in depth, bottom type, and water IOPs) of retrievals. This approach retrieves a particular bottom reﬂectance spectrum (which represents a particular bottom type), not just a generic bottom type such as sand or coral. If the user later wishes to group the particular spectra for the retrieved bottom types into broader classes such as corals vs. sediments, or to group the retrieved IOPs into low, medium, and high absorption bins, for example, then that is easily done from the full-resolution retrieval.