Comparing my classification with the satellite imagery provided by Google shows that I was generally able to identify obvious, unclouded forest cover and categorize it as such. However, that is about the only thing my classification did properly. Compared with the built-in satellite visualization, I drastically over-counted forest cover in the study region. Much of this stems from the cloudiness of the Landsat image used. Even though it was the least cloudy image available, the clouds in the study area strongly affected my results: clouded areas and many cloud shadows were categorized as forest, regardless of the ground truth beneath them. In retrospect, I should have created a separate class for cloud cover to avoid this problem.
If we assume my classification of tree cover accurately depicts the region, we can compare it against the Hansen dataset to see which canopy-cover threshold best resembles it. For each threshold, I computed the true positive rate (TPR) and true negative rate (TNR) of Hansen's classification against mine, then averaged the two into a balanced accuracy. I conducted this analysis at Hansen thresholds of 15%, 30%, 50%, and 75% and summarized the results in Table 1.
Table 1: Agreement between my classification and the Hansen dataset at four canopy-cover thresholds.

|          | 15%   | 30%   | 50%   | 75%   |
|----------|-------|-------|-------|-------|
| TPR      | 0.765 | 0.466 | 0.263 | 0.141 |
| TNR      | 0.437 | 0.801 | 0.940 | 0.983 |
| Balanced | 0.601 | 0.634 | 0.602 | 0.562 |
Source code: https://code.earthengine.google.com/5c0f691990987a5b900cbaacdfed777f
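The balanced rate used in Table 1 is simply the mean of the true positive and true negative rates. As a minimal sketch of that computation (assuming NumPy, and using made-up binary forest masks rather than the actual classification data):

```python
# Balanced accuracy of one binary forest mask against another.
# The pixel values below are hypothetical examples, not the study data.
import numpy as np

def balanced_accuracy(reference, prediction):
    """Return (TPR, TNR, balanced accuracy) of prediction vs. reference."""
    ref = np.asarray(reference, dtype=bool)
    pred = np.asarray(prediction, dtype=bool)
    tpr = np.logical_and(ref, pred).sum() / ref.sum()        # true positive rate
    tnr = np.logical_and(~ref, ~pred).sum() / (~ref).sum()   # true negative rate
    return tpr, tnr, (tpr + tnr) / 2

# Hypothetical labels: 1 = forest, 0 = not forest.
mine   = np.array([1, 1, 0, 0, 1, 0, 1, 0])  # treated as "truth" here
hansen = np.array([1, 0, 0, 0, 1, 1, 1, 0])  # thresholded Hansen cover

tpr, tnr, bal = balanced_accuracy(mine, hansen)
```

In the actual analysis the same arithmetic was applied per pixel across the study region, with my classification in the reference role, as described above.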
Of the four thresholds, 30% had the best balanced accuracy in assessing tree cover. However, the differences in balanced accuracy among the thresholds were slim, implying that while the true negative rate increased at higher thresholds, the true positive rate decreased about as quickly. In addition, none of the thresholds achieved a balanced accuracy as high as even the lowest value reported by Adjognon et al.
Looking closer at my forest classification compared with Hansen's (using the 30% canopy threshold), there are some similarities. Both classifications, for example, found the least forest in the northeastern part of the study area, and both identified similar patches of forest in the south. The most striking difference, however, lies in the forests of the northwestern part of the study area. Hansen and I both identified forest in this region, but the two classifications take wildly different shapes. Visual inspection of the region shows only scattered forest, interspersed with some type of agriculture. The Hansen classification includes both the forest and the agriculture, leading to Type I errors. My classification also had Type I errors, but these stem from falsely identifying cloud cover as forest. The image series below demonstrates this with three example points.
Comparing the example points in the images above demonstrates the shortcomings of both classification methods. The blue point is not forest but one of many fields that Hansen lists as forest. The red point, likewise, is not forest, but my classification identified it as such because of a cloud in the Landsat image I used. Table 2 summarizes these differences.
Table 2: Ground truth, Landsat appearance, and classification of the three example points.

|         | Blue Point               | Green Point | Red Point   |
|---------|--------------------------|-------------|-------------|
| Ground  | Agriculture, near forest | Forest      | Scrubland   |
| Landsat | Agriculture, near forest | Forest      | Cloud cover |
| Hansen  | Forest                   | Forest      | Not forest  |
| Mumper  | Not forest               | Forest      | Forest      |
Source code: https://code.earthengine.google.com/a5d21d2a3626ba14f465ee5d3a9e7cd8