Optical Characterization of Atmospheric Aerosols via Airborne Spectral Imaging and Self-Organizing Map for Climate Change Diagnostics
1. Introduction
The use of univariate techniques in the optical characterization of atmospheric aerosols increases the likelihood that an observation occurs by chance (false-positive results) [1]. With this in mind, a number of studies have illustrated the possibility of using multivariate techniques, among them Principal Component Analysis (PCA), in aerosol characterization [2]. In that study, PCA quantified the influence of rainy and dry spells on aerosol characteristics over the region. To further understand the factors that significantly influence aerosol variability, and to project their future characteristics, it is necessary to use techniques capable of extracting aerosol properties and characteristic patterns of variability from the large volume of spectral data that is currently available.
Monitoring of atmospheric processes in relation to climate change requires pattern detection techniques such as the Self-Organizing Map (SOM), which effectively cluster, classify and perform feature extraction in multispectral data sets [3] [4]. SOM has been used for Aerosol Optical Depth (AOD) and Precipitation Rate (PR) projections over East Africa, as detailed in [5]. As a tool for pattern recognition and classification, SOM analysis is in widespread use across a number of disciplines, among them climate research [6] [7] [8] [9] [10]. The current study explores SOM as a novel technique in the optical characterization of atmospheric aerosols for climate change diagnostics over the region.
Airborne spectroscopic measurements from the Moderate Resolution Imaging Spectroradiometer (MODIS), from which both AOD and the Ångström Exponent (ÅE) are retrieved, were utilized. MODIS is on board the Earth Observing System (EOS) Terra satellite, which operates at an altitude of 705 km [11] [12]. Terra (EOS AM-1) was launched on December 18, 1999 and crosses the Equator at 10:30 am daily (descending mode). MODIS simultaneously images both reflected solar radiance and terrestrial emission in 36 channels (0.41 - 14.4 μm wavelength range) with resolutions varying between 0.25 and 1 km [13]. PR, on the other hand, was retrieved from the Tropical Rainfall Measuring Mission (TRMM), which is jointly supported by NASA and Japan’s National Space Development Agency to monitor and study tropical and subtropical precipitation. The mission started in November 1997 at an altitude of about 402.5 km, with its orbit ranging between 35˚ north and 35˚ south of the equator, which allows it to cover the entire Earth daily [14].
The SOM technique can be applied through its MATLAB toolbox, which is freely available online. The SOM MATLAB Toolbox (version 2.0) uses MATLAB structures, making it convenient to tailor the code to specific user needs, and can be downloaded from the website of the Helsinki University of Technology, Finland: http://www.cis.hut.fi/projects/somtoolbox/.
2. Methodology
2.1. Description of Study Area and Spectral Data Manipulation
The East Africa region covers diverse landforms comprising glaciated mountains, semi-arid, plateau and coastal regions. Details and a map illustrating the study region, together with the specifics of each study site, are given in [15].
The Level-3 MODIS gridded atmosphere monthly global product “MOD08_M3”, at a spatial resolution of 1˚ × 1˚, was used in the current study for optical characterization of AOD (at 550 nm) and Ångström Exponent (ÅE) (at 470 - 660 nm) in relation to Precipitation Rate (PR) over selected sites of East Africa from 2000 to 2014. The MODIS Level-3 monthly data were rearranged into a 2D array with the rows and columns representing the temporal and spatial dimensions respectively. The row vector at each time step was used to update the weights of the SOM via an unsupervised learning algorithm. The resulting weight vectors of the SOM nodes were then reshaped back into characteristic data patterns [4] [6].
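The rearrangement described above can be illustrated with a minimal sketch. It assumes the monthly fields are already loaded as a NumPy array of shape (time, latitude, longitude); the function names, the grid size and the random stand-in data are hypothetical and are not taken from the study.

```python
import numpy as np

def grid_to_matrix(cube):
    """Flatten a (time, lat, lon) cube into a (time, space) matrix.

    Each row is one monthly map, i.e. one input vector for the SOM.
    """
    n_time, n_lat, n_lon = cube.shape
    return cube.reshape(n_time, n_lat * n_lon)

def matrix_to_grid(matrix, n_lat, n_lon):
    """Reshape rows of length n_lat * n_lon (e.g. SOM weight vectors) back into maps."""
    return matrix.reshape(matrix.shape[0], n_lat, n_lon)

# Hypothetical stand-in: 180 monthly fields (15 years x 12 months) on a 4 x 5 grid.
rng = np.random.default_rng(0)
cube = rng.random((180, 4, 5))

data = grid_to_matrix(cube)        # rows = time steps, columns = grid cells
maps = matrix_to_grid(data, 4, 5)  # the inverse operation recovers the maps
```

The same `matrix_to_grid` call can be applied to the trained weight matrix, since each SOM node holds one vector of the same length as a flattened monthly map.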
2.2. Self-Organizing Map and Its Training Rules
SOM affords simultaneous clustering, classification and feature extraction of multidimensional nonlinear spectroscopic data [5] [16] [17]. The SOM technique comprises a two-layered network that organizes the input patterns into a topological structure represented by its neurons while preserving the relations between different patterns. To achieve this, the following topology and training rules of Kohonen mapping are applied in the present study for clustering, classification, and feature extraction in MODIS spectral images from 2000 to 2014.
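1) The training of the network is implemented by presenting data vectors $\mathbf{x}$ to the input layer of the network, whose connection weight vectors $\mathbf{w}_i$ of all competitive neurons $i$ are initialized with random values. If $N$ is the dimension of the satellite spectral data, we choose $N$ input neurons and define the Euclidean distance ($d_i$) between $\mathbf{x}$ and $\mathbf{w}_i$ as:
$$d_i = \lVert \mathbf{x} - \mathbf{w}_i \rVert = \sqrt{\sum_{j=1}^{N} \left( x_j - w_{ij} \right)^2} \tag{1}$$
and determine the activated neuron $c$ with $d_c = \min_i d_i$.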
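2) Updating of the weights $\mathbf{w}_i$ that are associated with the neurons is only performed within the proximity ($N_c(t)$) of $c$, which reduces with the training time $t$, where $c$ is the winning neuron. The process of updating is implemented via Equation (2b), where $\eta(t)$ represents a time dependent learning rate:
$$\mathbf{w}_i(t+1) = \mathbf{w}_i(t) + \Delta\mathbf{w}_i(t) \tag{2a}$$
$$\Delta\mathbf{w}_i(t) = \eta(t)\left[ \mathbf{x} - \mathbf{w}_i(t) \right], \quad i \in N_c(t) \tag{2b}$$
$$\Delta\mathbf{w}_i(t) = 0, \quad i \notin N_c(t) \tag{2c}$$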
The time dependent neighborhood is updated according to:
(2d)
Note that the network performs two functions during training, both of which are strongly related to the clustering and classification of the MODIS spectral data over the region. These are:
1) A separation, i.e. cluster analysis of the presented data by mean vectors $\mathbf{w}_i$ that are associated as weights to the neurons.
2) A topological ordering of the competitive neurons in such a way that neighboring neurons in the layer represent similar clusters in multidimensional space and thus dimensionality reduction.
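The training rules above can be sketched in a few lines of NumPy. This is an illustrative sketch and not the SOM Toolbox implementation used in the study: the rectangular grid, the linear decay of both the learning rate and the neighborhood radius, the parameter values and all function names are assumptions made for the example.

```python
import numpy as np

def train_som(data, n_rows, n_cols, n_iter=2000, eta0=0.5, seed=0):
    """Train a rectangular SOM; returns weights of shape (n_rows * n_cols, n_features)."""
    rng = np.random.default_rng(seed)
    # Random initial weights, one vector per competitive neuron.
    weights = rng.random((n_rows * n_cols, data.shape[1]))
    # Grid coordinates of each neuron, used to measure proximity on the map.
    coords = np.array([(r, c) for r in range(n_rows) for c in range(n_cols)],
                      dtype=float)
    radius0 = max(n_rows, n_cols) / 2.0
    for t in range(n_iter):
        x = data[rng.integers(len(data))]
        # Winning neuron c: smallest Euclidean distance between x and w_i.
        c = np.argmin(np.linalg.norm(weights - x, axis=1))
        # Assumption: learning rate and radius both shrink linearly with time.
        decay = 1.0 - t / n_iter
        eta = eta0 * decay
        radius = radius0 * decay
        # Only neurons within the current neighborhood of c are updated.
        in_hood = np.linalg.norm(coords - coords[c], axis=1) <= radius
        weights[in_hood] += eta * (x - weights[in_hood])
    return weights

def best_matching_unit(weights, x):
    """Index of the neuron whose weight vector is closest to x."""
    return int(np.argmin(np.linalg.norm(weights - x, axis=1)))

# Hypothetical data: two groups of samples standing in for two seasonal regimes.
rng = np.random.default_rng(1)
cluster_a = 0.1 + 0.05 * rng.random((50, 3))
cluster_b = 0.8 + 0.05 * rng.random((50, 3))
data = np.vstack([cluster_a, cluster_b])

weights = train_som(data, n_rows=2, n_cols=3)
labels = [best_matching_unit(weights, x) for x in data]
```

Each entry of `labels` is the node to which a sample is assigned (the separation), while the neighborhood update is what produces the topological ordering of the nodes on the grid.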
Based on [9] , we have formulated a method that separates factors contributing to temporal change in both AOD and PR into:
a) Portion caused by a change in the frequency of occurrence (FO) of monthly AOD and PR maps in a node.
b) Portion due to a change in the node mean value in both AOD and PR.
c) Portion as a result of the combination of the two effects.
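From the stated factors that contribute to the temporal changes in both AOD and PR, we define the following set of equations:
$$\Delta \mathrm{AOD} = \sum_{i=1}^{N} \left[ \left( f_i + \Delta f_i \right)\left( \mathrm{AOD}_i + \Delta \mathrm{AOD}_i \right) - f_i\,\mathrm{AOD}_i \right] \tag{3a}$$
$$\Delta \mathrm{PR} = \sum_{i=1}^{N} \left[ \left( f_i + \Delta f_i \right)\left( \mathrm{PR}_i + \Delta \mathrm{PR}_i \right) - f_i\,\mathrm{PR}_i \right] \tag{3b}$$
where $\Delta \mathrm{AOD}$ and $\Delta \mathrm{PR}$ are the total changes in AOD and PR between two different time periods, and $\mathrm{AOD}_i$ and $\mathrm{PR}_i$ are the node average variables of AOD and PR respectively in the initial period. $f_i$ is the FO of monthly maps in node $i$ during the initial period, while $\Delta f_i$ is the change in FO for node $i$ between the two periods of interest. Additionally, $\Delta \mathrm{AOD}_i$ and $\Delta \mathrm{PR}_i$ are the changes in the AOD and PR node average variables between the two periods, while $N$ is the total number of nodes in each SOM map over each site of study. Expanding Equation (3a) and Equation (3b), we have:
$$\Delta \mathrm{AOD} = \sum_{i=1}^{N} \mathrm{AOD}_i\,\Delta f_i + \sum_{i=1}^{N} f_i\,\Delta \mathrm{AOD}_i + \sum_{i=1}^{N} \Delta f_i\,\Delta \mathrm{AOD}_i \tag{4a}$$
$$\Delta \mathrm{PR} = \sum_{i=1}^{N} \mathrm{PR}_i\,\Delta f_i + \sum_{i=1}^{N} f_i\,\Delta \mathrm{PR}_i + \sum_{i=1}^{N} \Delta f_i\,\Delta \mathrm{PR}_i \tag{4b}$$
The first terms in Equation (4a) and Equation (4b), i.e. $\sum_{i=1}^{N} \mathrm{AOD}_i\,\Delta f_i$ and $\sum_{i=1}^{N} \mathrm{PR}_i\,\Delta f_i$, relate changes in monthly AOD and PR fields respectively to changes in the FO of aerosol optical patterns over each study site. These terms show the portion of the total change owing to shifts in the frequencies with which monthly AOD and PR fields reside in the patterns depicted in the SOM. A change in AOD and PR distribution represents a change in aerosol characteristics, which directly and indirectly alters regional climate [18] [19] and further affects air quality; hence, it is referred to as a dynamic factor. The second terms in Equation (4a) and Equation (4b), i.e. $\sum_{i=1}^{N} f_i\,\Delta \mathrm{AOD}_i$ and $\sum_{i=1}^{N} f_i\,\Delta \mathrm{PR}_i$, relate to the temporal evolution in AOD and PR fields respectively, averaged over all months belonging to a given node. In the case of aerosol optical properties, such changes are caused by thermodynamic effects such as local air circulation and urban heat island effects, among others. The third terms in Equation (4a) and Equation (4b), i.e. $\sum_{i=1}^{N} \Delta f_i\,\Delta \mathrm{AOD}_i$ and $\sum_{i=1}^{N} \Delta f_i\,\Delta \mathrm{PR}_i$, represent the contribution from the interaction of both changing pattern frequency and the node averaged variable for both AOD and PR. This term tends to be small compared to the other two.
For any input data matrix of $n$ variables (spatial variability) and $m$ observations (samples) (temporal variability), the iterative SOM training procedure is as detailed elsewhere [16].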
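As a worked illustration of the three-part separation described above (frequency change, node-mean change, and their combination), the sketch below computes each portion from node frequencies and node means in two periods. The function name and the four-node values are hypothetical and chosen only to show the arithmetic; they are not results from the study.

```python
import numpy as np

def decompose_change(f_init, mean_init, f_final, mean_final):
    """Split the total change of a node-weighted mean into three portions.

    Returns (total, frequency_term, node_mean_term, combined_term).
    """
    f_init = np.asarray(f_init, dtype=float)
    mean_init = np.asarray(mean_init, dtype=float)
    d_f = np.asarray(f_final, dtype=float) - f_init            # change in FO per node
    d_mean = np.asarray(mean_final, dtype=float) - mean_init   # change in node mean
    frequency_term = np.sum(mean_init * d_f)   # portion a): change in FO
    node_mean_term = np.sum(f_init * d_mean)   # portion b): change in node mean
    combined_term = np.sum(d_f * d_mean)       # portion c): combination of both
    total = frequency_term + node_mean_term + combined_term
    return total, frequency_term, node_mean_term, combined_term

# Hypothetical four-node SOM: frequencies of occurrence and node-mean AOD.
f_1, aod_1 = [0.4, 0.3, 0.2, 0.1], [0.10, 0.20, 0.30, 0.40]  # initial period
f_2, aod_2 = [0.3, 0.3, 0.2, 0.2], [0.12, 0.20, 0.28, 0.45]  # later period

total, freq, node, comb = decompose_change(f_1, aod_1, f_2, aod_2)
# Direct difference of the frequency-weighted means, for comparison with `total`.
direct = np.dot(f_2, aod_2) - np.dot(f_1, aod_1)
```

The three portions add up to `direct`, the difference between the frequency-weighted means of the two periods, which is a convenient consistency check when applying the separation to real SOM output.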
3. Results and Discussions
SOM Analysis
1) Aerosol Optical Depth
The SOM is normally colored by the values of the unified distance matrix (U-matrix) elements. The U-matrix is obtained according to the features of the input data and then illustrated as a component plane, as displayed in Figures 1(a)-(c) for AOD, ÅE and PR clustering. SOM performs classification of the three properties and reveals that each of them essentially has at least two clusters at each study site. From Figure 1(a), we note that each site shows a unique AOD depiction, as there is no single site whose AOD variability correlates with that of the others in the study period. This implies that AOD characteristics over the study sites are highly variable. The two clusters in AOD over each study site are attributed to the two dry and wet seasons experienced over the region [21]. Additionally, the aerosol characteristics of the East Africa region are modulated by Monsoon precipitation [22]. In contrast, Kampala is dominated by a single AOD cluster during the study period, which may be attributed to significant vehicular emissions [23]. Likewise, Mbita is dominated by a single AOD cluster over the study period, which may be a result of biomass burning and land preparation for agricultural use [21]. The SOM AOD map for each study site is illustrated in Figure 1.
Figure 1. SOM classification of AOD over the six variables.
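To make the U-matrix coloring concrete, the following sketch computes a simplified, node-averaged U-matrix: for each node, the mean Euclidean distance between its weight vector and those of its directly adjacent grid neighbors. It is an assumption-laden illustration rather than the toolbox routine used for the figures; the function name, the rectangular 4-connected grid and the toy weights are hypothetical.

```python
import numpy as np

def u_matrix(weights, n_rows, n_cols):
    """Node-averaged U-matrix for a rectangular SOM with more than one node.

    weights: array of shape (n_rows * n_cols, n_features), row-major node order.
    Returns an (n_rows, n_cols) array of mean distances to adjacent nodes.
    """
    grid = weights.reshape(n_rows, n_cols, -1)
    umat = np.zeros((n_rows, n_cols))
    for r in range(n_rows):
        for c in range(n_cols):
            dists = []
            # Visit the up, down, left and right neighbors that exist on the grid.
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < n_rows and 0 <= cc < n_cols:
                    dists.append(np.linalg.norm(grid[r, c] - grid[rr, cc]))
            umat[r, c] = np.mean(dists)
    return umat

# Toy 1 x 3 map: the first two nodes are identical, the third one differs.
toy_weights = np.array([[0.0], [0.0], [1.0]])
toy_umat = u_matrix(toy_weights, n_rows=1, n_cols=3)
```

In the toy map the values rise from the first node to the third, marking the boundary between two groups of similar nodes; low values indicate nodes inside a cluster and high values indicate cluster borders.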
2) Ångström Exponent
It has been demonstrated that the ÅE is a good indicator of the fraction of small aerosol particles with radii r = 0.057 - 0.21 μm relative to large particles with radii r = 1.8 - 4 μm for atmospheric aerosols [24]. Additionally, the ÅE is often used as a qualitative indicator of aerosol particle size, with values greater than 2 indicating small particles associated with combustion byproducts, and values less than 1 indicating large particles like sea salt and dust [25]. Figure 2 shows the application of SOM to ÅE over each study site. Specifically, Mbita and Mount Kilimanjaro are highly correlated and dominated by a single cluster. The single cluster can be attributed to the anthropogenic influence over the study sites, i.e. land clearance, deforestation activities and biomass burning that dominate the study sites [21] [26]. This conclusion is reached because, from Figure 3, the two sites experience distinct precipitation rates during the study period. In contrast, Nairobi is characterized by two clear clusters attributed to the dry and wet seasons experienced over the site. The dominant aerosol particles over the site comprise vehicular and industrial emissions, biomass and refuse burning [27], as well as particles transported over long distances from the surrounding regions. These aerosol particles are highly hygroscopic in nature [28]; hence the two clear clusters, which are modulated by the two seasons experienced over Nairobi. Kampala, on the other hand, shows two clusters but with relatively higher ÅE values than the rest of the study sites in the region. These values suggest the dominance of aerosol particles in the λ = 670 nm wavelength; these aerosol particles originate from vehicular emissions [23]. Malindi displays the lowest ÅE values, suggesting the existence of sea spray, sea salt and long distance transport of aerosol particles from the Arabian Peninsula desert via the seasonal Monsoon winds [20].
Figure 2. SOM classification of ÅE over the six variables.
Figure 3. SOM classification of PR over the six variables.
Likewise, SOM displays two clusters over the Mau Forest Complex, which are associated with the two prevailing seasons experienced over the site. Additionally, continual biomass burning and forest clearance for agricultural use over the Mau Forest Complex [29] [30] enhance the region’s aerosol particles, whose ÅE values span the range (0.46 - 0.89 ± 0.07).
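For readers unfamiliar with how an ÅE value relates to AOD, the sketch below applies the standard two-wavelength Ångström relation, in which AOD is taken to vary as a power law of wavelength. In this study the ÅE is taken from the MODIS product rather than computed this way; the function name and the AOD values are hypothetical, and only the 470 nm and 660 nm wavelengths come from the text.

```python
import math

def angstrom_exponent(aod_1, aod_2, wavelength_1_nm, wavelength_2_nm):
    """Two-wavelength Angstrom exponent, assuming AOD ~ wavelength ** (-alpha)."""
    return -math.log(aod_1 / aod_2) / math.log(wavelength_1_nm / wavelength_2_nm)

# Hypothetical AOD values at 470 nm and 660 nm.
alpha_fine = angstrom_exponent(0.45, 0.25, 470.0, 660.0)    # AOD drops quickly with wavelength
alpha_coarse = angstrom_exponent(0.30, 0.27, 470.0, 660.0)  # AOD nearly flat with wavelength
```

A steep decrease of AOD with wavelength yields a larger exponent (smaller particles), while a nearly flat spectrum yields a small exponent (larger particles such as sea salt and dust), consistent with the qualitative interpretation given in this subsection.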
3) Precipitation Rate
From Figure 3, Nairobi, Mbita, Mau Forest Complex and Mount Kilimanjaro all clearly experience both dry and wet seasons during the study period. In contrast, Malindi experiences the lowest PR compared to the rest of the region. SOM displays two clusters over each study site, which is attributed to the prevailing seasonal variation over the region.
4. Conclusion
MODIS Terra monthly AOD and ÅE Level-3 data from 2000 to 2014 were used for the optical characterization of atmospheric aerosols using the SOM algorithm. The SOM classification of both AOD and ÅE is attributed to anthropogenic influences, among them land clearance, deforestation activities and biomass burning, that dominate the study sites. SOM displays two clusters over each study site except Mbita, which is attributed to the prevailing seasonal variability in precipitation rates over the region, thereby emphasizing the role of precipitation in evaluating aerosol effects. In contrast, the single cluster in AOD and ÅE over Mbita indicates that aerosol characteristics over the site depend mainly on biomass burning and local air circulation rather than on monsoon precipitation throughout the study period.
Acknowledgements
This work was supported by the National Council for Science and Technology Grant funded by the Government of Kenya (NCST/ST&I/RCD/4TH call PhD/201). The authors wish to thank the NASA Goddard Earth Science Distributed Active Archive for the MODIS Level 3 and TRMM rainfall data, which served as a complement to the meteorological data from the Kenya Meteorological Department.
Conflicts of Interest
The authors declare no conflicts of interest regarding the publication of this paper.