Principal Component Analysis: A Gap-Filling Method

A gap-filling method was developed based on Principal Component Analysis (PCA). PCA is a widely used dimensionality-reduction method that finds a new set of variables, formed as linear combinations of the original variables, which capture most of the observed variance in the original data (Storch and Zwiers, 2002). It has been used in various climate studies for identifying dominant modes (Storch and Zwiers, 2002) and for time-series gap-filling (Beckers and Rixen, 2003; Kondrashov et al., 2014).
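As a rough illustration of the PCA step only (not the gap-filling algorithm of this study), the sketch below computes principal components through a singular value decomposition of the centered data. The function name `pca`, the synthetic "station" data, and the choice of two components are assumptions made for this example.

```python
import numpy as np

def pca(data, n_components):
    """PCA of an (n_samples, n_variables) array via SVD of the centered data."""
    centered = data - data.mean(axis=0)
    # Rows of vt are orthonormal weight vectors; each defines one new
    # variable as a linear combination of the original variables.
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:n_components]
    scores = centered @ components.T           # data expressed in the new variables
    explained_ratio = s ** 2 / np.sum(s ** 2)  # share of total variance per component
    return components, scores, explained_ratio[:n_components]

# Synthetic example (assumed data): 200 time steps at 5 stations that share
# one common signal plus a small amount of independent noise.
rng = np.random.default_rng(0)
signal = rng.normal(size=(200, 1))
weights = np.array([[1.0, 0.8, 0.6, 0.9, 0.7]])
data = signal @ weights + 0.1 * rng.normal(size=(200, 5))

components, scores, ratio = pca(data, n_components=2)
print("explained variance ratio:", ratio)
```

Because the five synthetic series are driven by one shared signal, the first component should account for nearly all of the variance, which is the property that makes PCA useful for reconstructing missing values from neighboring stations.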
The algorithm developed for gap-filling in this study is described below.

Step 1, initial guess. For each station, its 50 nearest neighbors were selected. For each record with NoData, its neighbors having positive correlation with the station under examination …
At each time point, a T-test statistic is calculated by comparing the mean values before and after that point. The maximum T-test statistic and the time at which it occurs are then selected. Next, the probability of observing a maximum T-test statistic smaller than the observed maximum in a randomly drawn time series is calculated using an approximation function derived from Monte Carlo experiments. If this probability is greater than a given threshold (PROB), the time series is split at that time point. The BGHS method then processes each sub time series in the same way until its length is less than a given minimum (MIN_LEN). When working on a sub time series, additional T-tests (ALPHA) are conducted at each proposed splitting point, comparing the newly separated sub time series with the preceding and following neighboring sub time series separated in the previous recursion. If any of these T-tests is not significant, the proposed split is rejected. The BGHS method can be used to split non-stationary time series, and the formulation of its statistical test makes it stricter than other T-test-based change detection methods, such as the moving-window T-test. It has been widely used to analyze biophysical and physiological data (Fukuda et al., 2004), financial time series (Tóth et al., 2010), and geophysical time series (Feng et al., …
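The first stage described above, scanning every candidate time point for the largest T-test statistic, can be sketched as follows. This is a minimal illustration under stated assumptions, not the BGHS implementation: the function name `max_t_split` and its `min_len` argument are hypothetical, a pooled-variance two-sample statistic is assumed, and the significance check against PROB (which needs the Monte Carlo approximation function), the ALPHA tests, and the recursion are all left out.

```python
import numpy as np

def max_t_split(series, min_len=2):
    """Return (index, t) for the split that maximizes the two-sample t-statistic.

    `index` is the first position of the right-hand segment. Only the scan for
    the maximum is done here; no significance test or recursion is applied.
    """
    x = np.asarray(series, dtype=float)
    n = x.size
    best_index, best_t = None, 0.0
    for i in range(min_len, n - min_len + 1):
        left, right = x[:i], x[i:]
        n1, n2 = left.size, right.size
        # Pooled standard deviation of the two segments (assumed form of the test).
        ss = np.sum((left - left.mean()) ** 2) + np.sum((right - right.mean()) ** 2)
        pooled_sd = np.sqrt(ss / (n1 + n2 - 2))
        se = pooled_sd * np.sqrt(1.0 / n1 + 1.0 / n2)
        if se == 0.0:
            continue  # both segments constant: statistic undefined, skip
        t = abs(left.mean() - right.mean()) / se
        if t > best_t:
            best_index, best_t = i, t
    return best_index, best_t

# Assumed synthetic series: the mean shifts upward halfway through.
rng = np.random.default_rng(1)
demo = np.concatenate([rng.normal(0.0, 1.0, 50), rng.normal(5.0, 1.0, 50)])
index, t_max = max_t_split(demo)
print("candidate split index:", index, "max t:", round(t_max, 2))
```

In a full segmentation, the returned maximum would next be compared with its null distribution, and an accepted split would be applied again to each resulting sub time series.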
The results suggest 1976/1977 as the break year for air temperature in CA: 113 of the 369 stations show a statistically significant shift in mean annual temperature in the 1970s, and 69 of them show this change during 1976/1977 (Figure 6-1). The stations with a significant break in mean air temperature show no specific spatial pattern. By contrast, there is no common break year for precipitation (Figure 6-2). As a result, the research period was split into two sub-periods in the year 1976 / …
