Predicting the Long-Term Behavior of Chaotic Systems

By Dr. Pilar Gómez-Gil and Dr. Rigoberto Fonseca-Delgado, National Institute of Astrophysics, Optics, and Electronics, Mexico


Chaotic systems are highly nonlinear and extremely sensitive to initial conditions, making them notoriously unpredictable: Despite intense interest in the future behavior of financial markets, weather patterns, seismic movements, and similarly chaotic phenomena, researchers have found it difficult to generate accurate long-range predictions from measured time series data.

We and our colleagues have developed a system for improving the accuracy of long-term forecasts of chaotic time series. Our system uses a self-organizing map (SOM) neural network to select and combine predictive models. Designed, tuned, and validated with MATLAB® and Neural Network Toolbox™, this system analyzes the time series data to identify the best predictive models to use for various portions of the data and then uses the SOM to create an ensemble solution that outperforms any individual model.

Why MATLAB?

Dr. Gómez-Gil had been using MATLAB for several years before she began working on chaotic time series prediction, and was well aware of its versatility. This versatility proved to be important for her current work, which involves not only neural networks but also statistical analysis and signal processing. Several other factors contributed to our decision to use MATLAB. We’ve found that students learn MATLAB quickly, which means that even complete beginners can rapidly come up to speed on our research projects. Most importantly, MATLAB makes it easy to experiment with and evaluate new ideas, algorithms, and models.

Preprocessing Time Series Data and Generating Basic Predictive Models

Our primary data preprocessing tasks were noise reduction and data reduction. To filter and reduce the data, we applied a number of signal processing techniques, including fast Fourier transforms, signal smoothing, moving averages, and Gaussian noise filters.
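As a rough illustration of two of these preprocessing steps, the following Python sketch applies a centered moving average for noise reduction and block averaging for data reduction. It is a minimal sketch, not the authors' MATLAB code: the function names, the window size, the reduction factor, and the sample values are all assumptions made for this example.

```python
def moving_average(series, window):
    """Centered moving average (odd `window`); edges use the neighbors that exist."""
    if window < 1:
        raise ValueError("window must be at least 1")
    half = window // 2
    smoothed = []
    for i in range(len(series)):
        lo = max(0, i - half)
        hi = min(len(series), i + half + 1)
        chunk = series[lo:hi]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


def block_average(series, factor):
    """Data reduction: replace each run of `factor` samples with its mean."""
    reduced = []
    for start in range(0, len(series), factor):
        block = series[start:start + factor]
        reduced.append(sum(block) / len(block))
    return reduced


# Hypothetical noisy measurements, used only to exercise the functions.
noisy = [1.0, 9.0, 2.0, 8.0, 3.0, 7.0, 4.0, 6.0]
smooth = moving_average(noisy, window=3)
reduced = block_average(noisy, factor=2)
print(smooth)
print(reduced)
```

A wider window removes more noise but also blurs short-lived features, so the window size is itself a parameter worth validating against the forecasting error.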

We incorporated two basic types of predictive models into our system:

  • Autoregressive integrated moving average (ARIMA) models, which forecast from a weighted combination of past observations and past forecast errors
  • Nonlinear autoregressive exogenous (NARX) models, which use a feed-forward neural network to approximate future time series values based on previous values

We varied the parameters of these models to create a more diverse pool of models for our self-organizing map. For the ARIMA models, we varied the number of autoregressive terms and lagged forecast errors as well as nonseasonal difference and seasonality parameters to generate 54 variants. We generated 27 different NARX models by varying the number of delay neurons, the number of hidden layer neurons, and the training algorithms. We used three training algorithms from Neural Network Toolbox: Bayesian regularization backpropagation, conjugate gradient backpropagation with Fletcher-Reeves updates, and Levenberg-Marquardt backpropagation.
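One way to build such a pool is to enumerate a parameter grid. The sketch below is in Python rather than MATLAB, and the specific parameter values are hypothetical: the article gives only the totals (54 and 27), so the 3 × 3 × 3 × 2 and 3 × 3 × 3 factorizations shown here are assumptions chosen to reproduce those counts. The trainer labels are the Neural Network Toolbox function names for the three algorithms listed above.

```python
from itertools import product

# Hypothetical NARX settings; only the three training algorithms come from the text.
narx_delays = [2, 4, 8]
narx_hidden = [5, 10, 20]
narx_trainers = ["trainbr", "traincgf", "trainlm"]

narx_grid = [
    {"delays": d, "hidden": h, "trainer": t}
    for d, h, t in product(narx_delays, narx_hidden, narx_trainers)
]

# Hypothetical ARIMA settings: AR terms (p), lagged forecast errors (q),
# nonseasonal differences (d), and a seasonal period (0 means none).
arima_p = [1, 2, 3]
arima_q = [0, 1, 2]
arima_d = [0, 1, 2]
arima_season = [0, 7]

arima_grid = [
    {"p": p, "q": q, "d": d, "season": s}
    for p, q, d, s in product(arima_p, arima_q, arima_d, arima_season)
]

print(len(narx_grid), len(arima_grid))
```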

Case Study: Predicting ATM Withdrawals

We validated our system against several chaotic time series data sets. These included measured values, such as airline passengers in transit, births and accidental deaths in the U.S., sunspots, and automatic teller machine (ATM) withdrawals, as well as values generated using well-known chaotic systems, equations, and solutions, such as Mackey-Glass, the Lorenz attractor, and the Hénon map.
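To show what a generated chaotic series looks like, and why such series resist long-range prediction, the following Python sketch iterates the Hénon map with its classical parameters (a = 1.4, b = 0.3) from two starting points that differ by one part in a billion. The series length and the size of the perturbation are arbitrary choices for this illustration.

```python
def henon_series(n, x0=0.0, y0=0.0, a=1.4, b=0.3):
    """Return the x-coordinates of n iterations of the Henon map:
    x[k+1] = 1 - a * x[k]**2 + y[k],  y[k+1] = b * x[k]."""
    xs = []
    x, y = x0, y0
    for _ in range(n):
        x, y = 1.0 - a * x * x + y, b * x
        xs.append(x)
    return xs


reference = henon_series(200)
perturbed = henon_series(200, x0=1e-9)  # tiny change in the initial condition

# Absolute gap between the two trajectories at each step.
gaps = [abs(r - p) for r, p in zip(reference, perturbed)]
print("largest gap in first 3 steps:", max(gaps[:3]))
print("largest gap overall:", max(gaps))
```

The two trajectories are indistinguishable at first and then separate until the gap is comparable to the size of the attractor itself, which is the sensitivity to initial conditions described at the start of this article.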

The ATM data set provides a useful illustration of our approach. The values in this time series represent the total amount withdrawn from a set of individual ATMs. The data includes no information on factors that might influence withdrawal patterns, such as the day of the week, proximate holidays, or the weather at the time of the withdrawal. Our goal was to predict the total daily withdrawals for 56 consecutive days based solely on withdrawal totals from the more than 700 preceding days (Figure 1).

Figure 1. Time series for daily ATM withdrawals, with training values shown in blue and test values in red.

First, our system employs a strategy called temporal validated combination, which splits the prediction horizon into short-term, medium-term, and long-term windows to account for changes in the dynamic behavior of chaotic time series across different time scales. For each of these windows, the system uses a Monte Carlo cross-validation process to evaluate each NARX and ARIMA predictive model. It computes two metrics for each model: performance and representative error. The SOM neural network then automatically organizes clusters of models, grouped by their prediction skills. Finally, the system selects high-performing models from different groups to create a diverse ensemble. The results of the ensemble are shown in Figure 2.

Figure 2. Predicted ATM withdrawals (blue) and actual withdrawals (red) for a 56-day prediction horizon.
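The window-by-window evaluation described above can be sketched in Python as follows. This is a simplified illustration under several assumptions, not the authors' implementation: three trivial forecasters stand in for the ARIMA and NARX models, the window boundaries (7, 28, and 56 days) are hypothetical, mean absolute error stands in for the two metrics named in the text, and the series is synthetic rather than the real ATM data. Monte Carlo cross-validation is adapted here by drawing random forecast origins.

```python
import random


def naive_last(history, steps):
    """Repeat the last observed value."""
    return [history[-1]] * steps


def mean_level(history, steps):
    """Repeat the mean of the training history."""
    level = sum(history) / len(history)
    return [level] * steps


def seasonal_naive(history, steps, period=7):
    """Repeat the last full period (weekly by default)."""
    return [history[len(history) - period + (i % period)] for i in range(steps)]


MODELS = {"naive_last": naive_last, "mean_level": mean_level,
          "seasonal_naive": seasonal_naive}
# Hypothetical window boundaries within the 56-day horizon.
WINDOWS = {"short": (0, 7), "medium": (7, 28), "long": (28, 56)}


def monte_carlo_cv(series, horizon=56, repeats=20, min_train=100, seed=0):
    """Average per-window absolute error over randomly drawn forecast origins."""
    rng = random.Random(seed)
    totals = {name: {w: 0.0 for w in WINDOWS} for name in MODELS}
    for _ in range(repeats):
        cut = rng.randint(min_train, len(series) - horizon)
        history, actual = series[:cut], series[cut:cut + horizon]
        for name, model in MODELS.items():
            forecast = model(history, horizon)
            for w, (start, stop) in WINDOWS.items():
                errors = [abs(f - a) for f, a in
                          zip(forecast[start:stop], actual[start:stop])]
                totals[name][w] += sum(errors) / len(errors)
    return {name: {w: totals[name][w] / repeats for w in WINDOWS}
            for name in MODELS}


def best_per_window(scores):
    """Name of the lowest-error model for each window."""
    return {w: min(scores, key=lambda name: scores[name][w]) for w in WINDOWS}


# Synthetic daily series with a weekly spike plus small noise (not real data).
noise = random.Random(1)
series = [100.0 + (50.0 if t % 7 == 5 else 0.0) + noise.uniform(-1.0, 1.0)
          for t in range(756)]
scores = monte_carlo_cv(series)
best = best_per_window(scores)
print(best)
```

Scoring each window separately lets a model that is reliable a week ahead but poor two months ahead still contribute where it is strongest.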

In addition to predicting ATM withdrawals, the system can be used with a completely different data set from another domain. It will automatically identify and combine the best underlying predictive models for that time series to maximize prediction accuracy.
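The grouping and selection steps can be illustrated with a very small self-organizing map. The Python sketch below is an assumption-laden toy rather than the system itself: it uses a one-dimensional map with three nodes, describes each model by a hypothetical vector of short-, medium-, and long-term errors, and picks the model with the lowest total error from each occupied node. The model names and error values are invented for the example.

```python
import math
import random


def best_matching_unit(nodes, vector):
    """Index of the node whose weights are closest to the vector."""
    distances = [sum((w - x) ** 2 for w, x in zip(node, vector)) for node in nodes]
    return distances.index(min(distances))


def train_som(vectors, n_nodes=3, epochs=200, lr0=0.5, radius0=1.0, seed=0):
    """Train a 1-D SOM with linearly decaying learning rate and radius."""
    rng = random.Random(seed)
    dim = len(vectors[0])
    # Start each node at a randomly chosen input vector.
    nodes = [list(rng.choice(vectors)) for _ in range(n_nodes)]
    for epoch in range(epochs):
        frac = 1.0 - epoch / epochs
        lr = lr0 * frac
        radius = max(radius0 * frac, 1e-9)
        for v in rng.sample(vectors, len(vectors)):
            bmu = best_matching_unit(nodes, v)
            for j, node in enumerate(nodes):
                # Gaussian neighborhood measured along the 1-D node grid.
                influence = math.exp(-((j - bmu) ** 2) / (2 * radius ** 2))
                for k in range(dim):
                    node[k] += lr * influence * (v[k] - node[k])
    return nodes


def select_ensemble(error_profiles, n_nodes=3):
    """Cluster models by error profile; keep the best model from each cluster."""
    names = list(error_profiles)
    vectors = [list(error_profiles[n]) for n in names]
    nodes = train_som(vectors, n_nodes=n_nodes)
    clusters = {}
    for name, vector in zip(names, vectors):
        clusters.setdefault(best_matching_unit(nodes, vector), []).append(name)
    ensemble = [min(members, key=lambda n: sum(error_profiles[n]))
                for members in clusters.values()]
    return clusters, ensemble


# Hypothetical (short, medium, long) errors for six candidate models.
error_profiles = {
    "arima_a": (0.10, 0.40, 0.90),
    "arima_b": (0.12, 0.45, 0.95),
    "narx_a": (0.50, 0.30, 0.35),
    "narx_b": (0.55, 0.28, 0.40),
    "narx_c": (0.90, 0.85, 0.20),
    "narx_d": (0.95, 0.80, 0.25),
}
clusters, ensemble = select_ensemble(error_profiles)
print(clusters)
print(ensemble)
```

Drawing one model from each cluster, instead of simply taking the top few overall, keeps the ensemble diverse: models that err in different ways are more likely to cancel each other's mistakes when combined.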

Current Projects and Future Plans

As we refine and enhance the chaotic time series prediction system, we continue to apply it in new domains. Dr. Gómez-Gil's team is about to publish the results of a study in which we used the system to forecast exchange rates for the U.S. dollar and Mexican peso. To make the system easier for other researchers to use, we plan to turn the MATLAB system we developed into an application that provides an interface to its core capabilities.

The National Institute of Astrophysics, Optics, and Electronics, Mexico, is among the nearly 1000 universities worldwide that provide campus-wide access to MATLAB and Simulink. With the Total Academic Headcount (TAH) License, researchers, faculty, and students have access to a common configuration of products, at the latest release level, for use anywhere: in the classroom, at home, in the lab, or in the field.

About the Authors

Dr. Pilar Gómez-Gil is a researcher at the National Institute of Astrophysics, Optics, and Electronics in Tonantzintla, Puebla, Mexico. Her research interests include artificial neural networks and other machine learning models used for temporal classification, prediction, and signal processing.

Dr. Rigoberto Fonseca-Delgado is a lecturer at Yachay Tech University, Ecuador. His research interests include selection and combination of models, self-organization, time series forecasting, and graph-based data mining.

Published 2018
