October 16, 2020
Do custom-created classifications outperform global tree cover maps like Hansen’s?
In this problem, we explore some of the challenges of classifying forest pixels based on tree cover thresholds, and compare a custom supervised classification with a threshold-based tree cover classification. We began by creating a set of training points for different land cover classes so that tree cover could be reliably distinguished in our study area (the Rumphi district of Malawi). We then built decision trees from those training points to assign a land cover class to each pixel.
Evaluation of Custom Classification
Based on visual inspection, this custom classification map seems to capture densely forested areas fairly well. Below are two images showing the same regions before and after our custom classification had been run. There seems to be some confusion regarding new tree growth in deforested areas and grassland (Fig. 1a), but overall, intact forest patches are well-identified, as can be seen in Fig. 1b.
Comparing Custom Classification to Hansen’s Classification
Next, we compared our custom tree cover classification to a threshold-based tree cover map made from Hansen’s Global Forest Change dataset at a threshold of 30%. Compared with Fig. 1a, Hansen’s map appears more likely than our custom classification to label deforested areas that are beginning to regrow as “tree cover” (Fig. 2a). Hansen’s dataset at the 30% threshold also appears more likely than ours to classify non-treed areas as forested (Fig. 2b and Fig. 3). Even so, Fig. 3 shows that although Hansen is more likely to produce false positives, the majority of pixels agree on tree cover conditions across the two datasets, with more agreement on true negatives than on true positives.
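To make the agreement categories concrete, here is a minimal Python sketch of how two maps could be compared pixel by pixel. It is not the code used to produce the figures. The function name, the label names "Hansen only" and "Custom only", the toy pixel values, and the choice to treat cover greater than or equal to the threshold as tree are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical labels for the four possible outcomes, keyed by
# (custom class, Hansen class), where 1 = tree and 0 = no tree.
LABELS = {
    (1, 1): "Agree (Tree)",
    (0, 0): "Agree (No Tree)",
    (0, 1): "Hansen only",
    (1, 0): "Custom only",
}


def agreement_counts(custom, hansen_cover, threshold=30):
    """Count pixels in each agreement category.

    custom:       sequence of 0/1 values from the custom classification
    hansen_cover: sequence of percent tree cover values (0-100)
    threshold:    assumption: a pixel is tree when cover >= threshold
    """
    counts = Counter()
    for c, cover in zip(custom, hansen_cover):
        h = 1 if cover >= threshold else 0
        counts[LABELS[(c, h)]] += 1
    return counts


# Toy data, invented for illustration only.
custom = [1, 1, 0, 0, 0, 1]
hansen_cover = [80, 25, 45, 10, 35, 60]
print(agreement_counts(custom, hansen_cover))
```

In this toy example the "Hansen only" pixels play the role of the false positives discussed above, since the custom map is treated as the reference.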
“Truthing” the Hansen Dataset
If we consider our custom classification to be “truth,” we can compare our results to Hansen’s to see at which threshold the Hansen dataset agrees with it best. For supervised classification, Adjognon et al. (2019) report that the true positive rate (correctly classified trees) is highest when the probability threshold for classification as tree cover is low. As the threshold increases, the true positive rate decreases but the true negative rate (pixels correctly classified as not trees) increases, resulting in a higher balanced accuracy between the two rates. In other words, lower thresholds correctly identify more of the tree cover, while higher thresholds correctly identify more of the non-tree area (Adjognon et al., 2019).
To find the threshold with the highest balanced accuracy, we compared the custom classification against the Hansen dataset at a range of thresholds.
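For reference, the rates discussed above can be written out explicitly. The notation is ours rather than that of Adjognon et al., and it assumes the usual definition of balanced accuracy as the mean of the two rates. With the custom classification as the reference, TP and TN count pixels where both maps agree on tree and on no tree, FP counts pixels that only Hansen labels as tree, and FN counts pixels that only the custom map labels as tree.

```latex
\mathrm{TPR} = \frac{TP}{TP + FN}, \qquad
\mathrm{TNR} = \frac{TN}{TN + FP}, \qquad
\text{Balanced accuracy} = \frac{\mathrm{TPR} + \mathrm{TNR}}{2}
```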
Code For Multithreshold Analysis
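The script used for the analysis is not reproduced here. As a stand-in, the following Python sketch shows one way such a threshold sweep could work under stated assumptions: the custom classification is a list of 0/1 values, the Hansen layer is a list of percent tree cover values, a pixel is tree when cover is greater than or equal to the threshold, and balanced accuracy is the mean of the true positive and true negative rates. The function names and the toy data are hypothetical.

```python
def balanced_accuracy(reference, predicted):
    """Mean of true positive rate and true negative rate for 0/1 labels."""
    tp = fp = tn = fn = 0
    for r, p in zip(reference, predicted):
        if r == 1 and p == 1:
            tp += 1
        elif r == 0 and p == 1:
            fp += 1
        elif r == 0 and p == 0:
            tn += 1
        else:
            fn += 1
    # Guard against a reference with no tree or no non-tree pixels.
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    tnr = tn / (tn + fp) if (tn + fp) else 0.0
    return (tpr + tnr) / 2


def sweep_thresholds(custom, hansen_cover, thresholds=range(1, 101)):
    """Score every threshold and return (best_threshold, best_score, scores).

    Ties are resolved in favour of the lowest threshold tested.
    """
    scores = {}
    for t in thresholds:
        predicted = [1 if cover >= t else 0 for cover in hansen_cover]
        scores[t] = balanced_accuracy(custom, predicted)
    best = max(scores, key=scores.get)
    return best, scores[best], scores


# Toy data, invented for illustration only.
custom = [1, 1, 1, 0, 0, 0, 0, 1]
hansen_cover = [85, 40, 35, 28, 5, 12, 45, 60]
best, score, scores = sweep_thresholds(custom, hansen_cover, range(10, 60, 10))
print(best, score)
```

On real rasters the same loop would run over flattened pixel arrays, and the best threshold will depend on the study area and on how well the custom map itself performs.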
A Hansen threshold of 21% appears to give the most accurate classification of tree cover pixels relative to our custom map. At this low threshold, Hansen is more likely than our dataset to classify a pixel as “forested,” but the difference between Agree (Tree) and Agree (No Tree) pixels is minimized and the majority of classified pixels are in agreement (Fig. 4).
References
Adjognon, G.S., Rivera-Ballesteros, A., and van Soest, D. (2019). Satellite-based tree cover mapping for forest conservation in the drylands of Sub Saharan Africa (SSA): Application to Burkina Faso gazetted forests. Development Engineering, 4, 100039. https://doi.org/10.1016/j.deveng.2018.100039