1 Introduction

Temperature analysis is crucial to understanding climate dynamics and the impacts of global warming. This project studies changes in global temperature and shows why time series analysis is well suited to the task. The Earth’s climate system is complex, and rising temperatures have far-reaching consequences for ecosystems, agriculture, and human health.

Time series analysis allows us to uncover patterns, trends, and cycles in temperature data over time. It captures the temporal dimension, enabling the identification of long-term trends and seasonal variations. Time series models also facilitate predictions and forecasting of future temperature trends, aiding in climate planning and adaptation.

2 Data Characteristics

The data are a time series of monthly temperature anomalies starting in 1880. The anomalies are global averages over the whole planet, expressed as monthly deviations from a global average. The series contains 1716 values (143 years of 12 months, from 1880 to 2022), which are presented without units. Figure 1 shows the time series that we study.
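As a quick sanity check on the size of the series, and to make the notion of an anomaly concrete, here is a minimal Python sketch. The baseline of 14.0 and the three readings are made-up numbers for illustration only; they are not taken from the data set.

```python
FIRST_YEAR, LAST_YEAR = 1880, 2022

def n_monthly_values(first_year, last_year):
    """Number of monthly observations over full years, both ends inclusive."""
    return (last_year - first_year + 1) * 12

def anomalies(temperatures, baseline):
    """Deviation of each reading from a fixed reference average."""
    return [round(t - baseline, 2) for t in temperatures]

n_values = n_monthly_values(FIRST_YEAR, LAST_YEAR)
print(n_values)  # 1716

# Hypothetical readings and baseline, for illustration only
example = anomalies([13.8, 14.1, 14.6], baseline=14.0)
print(example)
```

A negative anomaly means the month was cooler than the reference average, a positive one that it was warmer.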

Figure 1: Plot of the monthly global average anomalies from 1880 to 2022.


3 Exploratory Data Analysis

We saw in Figure 1 that the shape of the time series changes after 1960. This is clearer in Figure 2, which shows two linear fits to the time series: one using all values and one using only the values after 1960. The two fits differ, which suggests that the values before and after 1960 do not come from the same process.
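The comparison of the two fits can be sketched as follows. This is only an illustration on a synthetic yearly series (flat until 1960, then rising), not the real anomalies; the helper `linear_fit` is a hypothetical name for a plain ordinary least-squares fit.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares fit y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Synthetic series (NOT the real data): flat until 1960, then rising
years = list(range(1880, 2023))
values = [0.0 if y < 1960 else 0.02 * (y - 1960) for y in years]

# Fit 1: all values. Fit 2: only the values from 1960 onwards.
_, slope_all = linear_fit(years, values)
recent = [(y, v) for y, v in zip(years, values) if y >= 1960]
_, slope_recent = linear_fit([y for y, _ in recent], [v for _, v in recent])

print(f"slope, all years: {slope_all:.4f} per year")
print(f"slope, from 1960: {slope_recent:.4f} per year")
```

When the slope of the recent fit is much steeper than the slope over the whole series, a single straight line is a poor description of the full period.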

Figure 2: Two different linear fits of the time series.

4 Model Selection

Since the values of the time series do not all come from the same process, we do not build a model on the whole series; we use only the values after 1960.

First we determine the shape of the trend by fitting it with generalized least squares. Fitting different candidate models for the trend and comparing them using ANOVA, we find that the best trend is \[\text{value}\sim 1+\text{month}+t+t^2\] Figure 3 shows the trend and the time series after 1960.
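Written out in full, the selected trend formula corresponds to the regression below. This assumes that month enters as a categorical factor with January as the reference level; the text does not state the coding, so this is an assumption. The symbols \(\beta\), \(\gamma\) and \(\varepsilon\) are notation introduced here for illustration.

\[\text{value}_t = \beta_0 + \sum_{m=2}^{12}\gamma_m\,\mathbf{1}\{\text{month}(t)=m\} + \beta_1 t + \beta_2 t^2 + \varepsilon_t\]

Under generalized least squares the errors \(\varepsilon_t\) are allowed to be correlated over time, which is what distinguishes this fit from ordinary least squares.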

We therefore fit a SARIMA model to the period 1960-2022, knowing that the trend is quadratic over this period.

We fit two models: one that includes a seasonal component and one that does not.

To find the parameters of the model with seasonality, we look at the ACF and PACF plots in Figure 4. From the ACF plot (left) we determine that q should be equal to 2 and Q equal to 1. From the PACF plot (right) we see that p should be equal to 2 and P equal to 2. We set d equal to 2 because we assume a quadratic trend. Finally, we choose D=1, and the seasonal period is s=12 since the data are monthly. The model with seasonality is therefore \[SARIMA(2,2,2)\times(2,1,1)_{12}\]
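For reference, a SARIMA\((p,d,q)\times(P,D,Q)_s\) model is conventionally written with the backshift operator \(B\), defined by \(B\,y_t = y_{t-1}\). With the orders chosen above, the model equation can be sketched as follows; the notation is the usual textbook one and is not taken from the fitted output.

\[\phi(B)\,\Phi(B^{12})\,(1-B)^{2}\,(1-B^{12})\,y_t = \theta(B)\,\Theta(B^{12})\,\varepsilon_t\]

Here \(\phi\), \(\Phi\) and \(\theta\) are polynomials of degree 2, \(\Theta\) is a polynomial of degree 1, and \(\varepsilon_t\) is white noise.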

To determine the parameters of the model without seasonality, we look at the ACF and PACF plots in Figure 5. We read that q should be equal to 2 and p equal to 2. As before, d is equal to 2 because we assume a quadratic trend. The model without seasonality is therefore \[ARIMA(2,2,2)\]
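The link between a quadratic trend and d=2 can be checked numerically: differencing a quadratic sequence twice leaves a constant. The sketch below uses synthetic sequences, not the temperature data, and also shows what a seasonal difference (lag 12, as with D=1) does to a purely periodic pattern.

```python
def difference(series, lag=1):
    """Return series[t] - series[t - lag] for every t where both exist."""
    return [series[i] - series[i - lag] for i in range(lag, len(series))]

# Synthetic quadratic trend (coefficients chosen arbitrarily)
trend = [1.0 + 0.5 * t + 0.25 * t**2 for t in range(10)]
once = difference(trend)    # still increases linearly
twice = difference(once)    # constant: the quadratic trend is gone
print(once)
print(twice)  # every entry is 0.5

# Synthetic pattern with period 12: one seasonal difference removes it
seasonal = [float(m % 12) for m in range(36)]
deseasonalized = difference(seasonal, lag=12)
print(deseasonalized)  # every entry is 0.0
```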

The AIC of the SARIMA model is -1141, while the AIC of the ARIMA model is -1202. The better of the two models is therefore the ARIMA model, which does not assume seasonality: it has the lower AIC and is simpler.
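The selection rule is simply "lowest AIC wins", where AIC is defined as 2k minus twice the maximized log-likelihood, with k the number of estimated parameters. A minimal sketch using the two AIC values reported above (the dictionary keys are just labels):

```python
# AIC values as reported in the text
aic = {
    "SARIMA(2,2,2)x(2,1,1)_12": -1141,
    "ARIMA(2,2,2)": -1202,
}

# Lower AIC is better
best = min(aic, key=aic.get)

# Difference of each model's AIC from the best one
delta = {name: value - aic[best] for name, value in aic.items()}

print(best)
print(delta)
```

One caveat worth keeping in mind: AIC values are directly comparable only when the likelihoods are computed on the same data, and models with different orders of differencing do not strictly satisfy this, so the comparison should be read as indicative.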

Figure 3: Trend of the time series after 1960.


Figure 4: ACF and PACF plots to determine the model with seasonality.


Figure 5: ACF and PACF plots to determine the model without seasonality.


5 Residuals and Diagnostics

We now study the residuals of our two models. Figure 6 shows four residual diagnostic plots for the SARIMA\((2,2,2)\times(2,1,1)_{12}\) model (i.e. the one that includes seasonality). The top left plot shows the residuals, which look like white noise. The top right and bottom left plots show the ACF and PACF of the residuals, respectively; we observe no significant autocorrelation or partial autocorrelation. Finally, the bottom right plot is a Q-Q plot of the residuals, which suggests that they are approximately normally distributed and so gives an additional indication that the residuals behave like white noise.

Figure 7 shows four residual diagnostic plots for the ARIMA\((2,2,2)\) model (i.e. the one that does not include seasonality). The top left plot shows the residuals, which look like white noise. The top right and bottom left plots show the ACF and PACF of the residuals, respectively; we observe a significant autocorrelation and partial autocorrelation at lag 24, but it is not too large. Finally, the bottom right plot is a Q-Q plot of the residuals, which suggests that they are approximately normally distributed and so gives an additional indication that the residuals behave like white noise.
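The "significant at lag 24" reading of an ACF plot can be reproduced numerically: a sample autocorrelation is usually flagged when it lies outside the approximate 95% band of plus or minus 1.96 divided by the square root of n. The sketch below applies this to a synthetic series with a period of 24, not to the actual model residuals; the function names are hypothetical.

```python
import math

def sample_acf(x, max_lag):
    """Sample autocorrelations for lags 0..max_lag."""
    n = len(x)
    mean = sum(x) / n
    c0 = sum((v - mean) ** 2 for v in x)
    return [
        sum((x[t] - mean) * (x[t + k] - mean) for t in range(n - k)) / c0
        for k in range(max_lag + 1)
    ]

def significant_lags(x, max_lag):
    """Lags whose autocorrelation falls outside +/- 1.96 / sqrt(n)."""
    bound = 1.96 / math.sqrt(len(x))
    acf = sample_acf(x, max_lag)
    return [k for k in range(1, max_lag + 1) if abs(acf[k]) > bound]

# Synthetic "residuals" with a leftover cycle of period 24 (NOT the real ones)
residuals = [math.sin(2 * math.pi * t / 24) for t in range(240)]
flagged = significant_lags(residuals, 30)
print(24 in flagged)
```

For residuals that are truly white noise, roughly one lag in twenty is expected to cross the band by chance, so a single small exceedance is not alarming on its own.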

Figure 6: Four residual diagnostic plots for the model with seasonality.

Figure 7: Four residual diagnostic plots for the model without seasonality.

6 Prediction

We are now interested in predicting the temperature changes until 2050, which we do using both of our models. Figure 8 and Figure 9 show the predicted temperature anomalies until 2050 using the SARIMA and ARIMA models, respectively. Around the main prediction we plot confidence intervals: the 50% level in dark blue and the 95% level in light blue. We observe that the two models predict almost the same anomaly, about 1.5, but that the model with seasonality gives more variable predictions.
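To illustrate how the 50% and 95% bands relate to each other, the sketch below builds both intervals around a point forecast under a normal approximation. The point forecast of 1.5 comes from the text; the standard error of 0.2 is a made-up value, since the report does not give one.

```python
from statistics import NormalDist

def normal_interval(point, std_error, level):
    """Symmetric interval around a forecast, assuming normal forecast errors."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return point - z * std_error, point + z * std_error

POINT_FORECAST = 1.5   # anomaly predicted in the text
STD_ERROR = 0.2        # hypothetical value, for illustration only

lo50, hi50 = normal_interval(POINT_FORECAST, STD_ERROR, 0.50)
lo95, hi95 = normal_interval(POINT_FORECAST, STD_ERROR, 0.95)

print(f"50% interval: ({lo50:.2f}, {hi50:.2f})")
print(f"95% interval: ({lo95:.2f}, {hi95:.2f})")
```

The 95% band is about 2.9 times as wide as the 50% band (1.96 versus 0.67 standard errors), which is why the light blue region extends well beyond the dark blue one.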

Figure 8: Prediction of the temperature until 2050 using the SARIMA model.


Figure 9: Prediction of the temperature until 2050 using the ARIMA model.


7 Conclusion

In conclusion, global warming since 1960 follows a quadratic trend. Including seasonality in a model to predict the average temperature over the whole planet is not necessary.

We observed that the shape of the series changed after 1960 and that the increase accelerates over time. Possible explanations for this phenomenon are the mid-20th-century baby boom and the development of the automobile industry.