Supervised Classification

Supervised classification is a technique for extracting information from image data. The goal is to assign each pixel in an image to a class based on the pixel's features. There are two stages: a training stage and a classification stage. During the training stage, a set of vectors called training samples, each associated with one pixel, is used to train a classifier. Each training sample vector consists of the class the pixel belongs to and the feature values of the pixel. In the classification stage, the trained classifier is used to classify pixels whose feature values are known but whose class is unknown.
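The two stages can be illustrated with a minimal sketch. It uses a nearest-centroid rule purely as a stand-in; it is not the algorithm used by the operator, and the function names train and classify are hypothetical. Each training sample is a pair of a class label and a list of feature values, as described above.

```python
from collections import defaultdict


def train(training_samples):
    """Training stage: compute one mean feature vector (centroid) per class.

    Each sample is a (class_label, [feature values]) pair.
    """
    sums = {}
    counts = defaultdict(int)
    for label, features in training_samples:
        if label not in sums:
            sums[label] = [0.0] * len(features)
        sums[label] = [s + f for s, f in zip(sums[label], features)]
        counts[label] += 1
    return {
        label: [s / counts[label] for s in total]
        for label, total in sums.items()
    }


def classify(centroids, features):
    """Classification stage: return the class with the nearest centroid."""
    def squared_distance(centroid):
        return sum((c - f) ** 2 for c, f in zip(centroid, features))

    return min(centroids, key=lambda label: squared_distance(centroids[label]))


# Illustrative samples only: two features per pixel, two classes.
samples = [
    ("water", [0.1, 0.2]),
    ("water", [0.2, 0.1]),
    ("land", [0.8, 0.9]),
    ("land", [0.9, 0.7]),
]
model = train(samples)
print(classify(model, [0.15, 0.15]))
print(classify(model, [0.85, 0.80]))
```

The pixels passed to classify carry feature values only; the class is what the trained model supplies.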

The user can either train and save a new classifier or load a previously saved classifier to perform classification.

To do the former, type in a new filename.

To do the latter, select a filename from the list.


Train and save a classifier

The user must provide the number of training samples to use.

Note that the classifier is evaluated as well as trained, so twice the requested number of samples is extracted. For example, if the number of training samples is 5000, then 5000 x 2 = 10,000 samples are extracted: the first 5000 are used as training samples and the remaining 5000 as test samples for evaluation.
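The arithmetic above can be sketched as follows. The function name split_samples is hypothetical, and the list of integers merely stands in for extracted sample vectors.

```python
def split_samples(samples, num_training):
    """Split extracted samples into training and test halves.

    The first num_training samples are for training, the next
    num_training samples are for evaluation.
    """
    needed = 2 * num_training
    if len(samples) < needed:
        raise ValueError(f"need {needed} samples, got {len(samples)}")
    return samples[:num_training], samples[num_training:needed]


# Placeholder data: 10,000 extracted samples for 5000 requested.
extracted = list(range(10_000))
training, test = split_samples(extracted, 5000)
print(len(training), len(test))
```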

All bands in the "feature" products are used.

The user has a choice of
  1. using the data in the "class" band as classes (the "class" band is the first band in the "class" product whose name contains neither "lat_band" nor "lon_band"); or
  2. picking the classes from the list of "regions" in the "class" product.
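The band selection rule in option (1) can be sketched as below. The function name find_class_band and the example band names are hypothetical; returning None when no band qualifies is an assumption, since the behaviour in that case is not described here.

```python
def find_class_band(band_names):
    """Return the first band name containing neither 'lat_band' nor 'lon_band'."""
    for name in band_names:
        if "lat_band" not in name and "lon_band" not in name:
            return name
    return None  # assumption: no qualifying band found


# Hypothetical band list of a "class" product.
bands = ["lat_band", "lon_band", "landcover", "biomass"]
print(find_class_band(bands))
```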

To choose (1), deselect all in the "Classes" list.

In this case, if the data in the training samples only assume the values 0 (representing, say, water) or 1 (representing, say, land), then there will only be two classes.

If the data is biomass, there will be as many classes as there are distinct biomass values in the training set. For continuous data values such as biomass, it is recommended to quantize the values so that the number of classes stays manageable.

Remember to uncheck the "Quantize class value" box if the data consists of discrete labels such as land cover classes.
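To see why quantization helps with continuous data, consider the sketch below. It assumes uniform binning over a fixed value range; the quantization scheme actually used by the operator is not described here, and the function name quantize, its parameters, and the biomass values are all hypothetical.

```python
def quantize(value, min_value, max_value, num_levels):
    """Map a continuous value to one of num_levels equal-width bins (assumed scheme)."""
    if max_value <= min_value:
        raise ValueError("max_value must be greater than min_value")
    fraction = (value - min_value) / (max_value - min_value)
    level = int(fraction * num_levels)
    # Clamp so that out-of-range values and value == max_value stay in range.
    return min(max(level, 0), num_levels - 1)


# Made-up biomass values: six distinct values would give six classes.
biomass = [3.2, 47.9, 51.0, 98.6, 120.4, 199.9]
classes = [quantize(v, 0.0, 200.0, 4) for v in biomass]
print(len(set(biomass)), "distinct values ->", len(set(classes)), "classes")
```

Without quantization every distinct biomass value becomes its own class; with four levels the same samples fall into four classes.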

It is up to the user to ensure that the training and test sample sets contain a roughly equal number of samples for each class.

To choose (2), select the "regions" in the "Classes" list that correspond to classes (e.g., a "region" called "water" will become the class with label "water").

"Regions" are represented as "vectors" and can be created using the "New Vector Data Container" tool and other drawing tools such as "Rectangle drawing tool".

A pixel inside a "region" takes the name of that region as its class instead of its data value.

The operator will endeavour to extract the same number of samples for each class when constructing the training and test sample sets.
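One way to picture balanced extraction is the sketch below. It is an illustration under assumptions, not the operator's actual sampling procedure: the function name balanced_sample is hypothetical, samples are drawn at random with a fixed seed, and a class with too few pixels simply contributes all it has.

```python
import random
from collections import Counter, defaultdict


def balanced_sample(labelled_pixels, total, seed=0):
    """Draw up to total // number_of_classes samples from each class."""
    by_class = defaultdict(list)
    for label, features in labelled_pixels:
        by_class[label].append((label, features))
    if not by_class:
        return []
    per_class = total // len(by_class)
    rng = random.Random(seed)
    selected = []
    for label in sorted(by_class):
        pool = by_class[label]
        # A small class cannot supply more samples than it contains.
        selected.extend(rng.sample(pool, min(per_class, len(pool))))
    return selected


# Made-up pixels: "water" is far more common than "land".
pixels = [("water", [i]) for i in range(90)] + [("land", [i]) for i in range(10)]
subset = balanced_sample(pixels, 20)
counts = Counter(label for label, _ in subset)
print(dict(counts))
```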

Two separate files are created: one with no extension and one with extension .xml.


Load a previously saved classifier

At a minimum, the user needs to know the list of features in order to use a saved classifier. This list is contained in the XML file, along with other useful information.

At least one "feature" product is required. However, the user can specify more than one.

For each name in "featureNames" in the XML file, the operator will search for a band in the "feature" products whose name contains it.
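The name matching described above is a substring search, which can be sketched as follows. The function name match_feature_bands and the example names are hypothetical; taking the first matching band and reporting None for an unmatched feature are assumptions.

```python
def match_feature_bands(feature_names, band_names):
    """For each feature name, find a band whose name contains it."""
    matches = {}
    for feature in feature_names:
        # Assumption: the first band containing the feature name wins.
        matches[feature] = next(
            (band for band in band_names if feature in band), None
        )
    return matches


# Hypothetical feature names from the XML file and bands from the products.
features = ["Sigma0_VV", "Sigma0_VH"]
bands = ["Sigma0_VH_db", "Sigma0_VV_db", "elevation"]
print(match_feature_bands(features, bands))
```

Because the test is "contains" rather than "equals", a band named with a suffix or prefix still matches the saved feature name.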

If evaluation of the classifier is desired, then the user must provide a "class" product.

If there is more than one product in the list, the operator determines whether the first one is a "class" product by looking for a "feature" band in it. If the first product does not contain any "feature" band, it is taken as the "class" product.
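The check on the first product can be sketched as below. This assumes that a "feature" band is one whose name contains a saved feature name, as in the matching rule above; the function name split_products and the product and band names are hypothetical.

```python
def split_products(products, feature_names):
    """Separate an optional leading "class" product from the "feature" products.

    products is a list of (product_name, [band names]) pairs in list order.
    Returns (class_product_or_None, feature_products).
    """
    def has_feature_band(band_names):
        return any(f in b for f in feature_names for b in band_names)

    if len(products) > 1 and not has_feature_band(products[0][1]):
        return products[0], products[1:]
    return None, products


# Hypothetical inputs.
features = ["Sigma0_VV", "Sigma0_VH"]
products = [
    ("labels", ["landcover"]),
    ("scene", ["Sigma0_VV_db", "Sigma0_VH_db"]),
]
class_product, feature_products = split_products(products, features)
print(class_product[0], [name for name, _ in feature_products])
```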

If the saved classifier was constructed using "regions" as classes, the "class" product must contain "vectors" with the same names. The names can be found in the XML file under "regionVectorNames".

If "regions" were not used, then the operator will use the first band in the "class" product as the "class" band.