TY - JOUR

T1 - Machine Learning-based Prediction of Sunspots using Fourier Transform Analysis of the Time Series

AU - Rodríguez, José Víctor

AU - Rodríguez-Rodríguez, Ignacio

AU - Lok Woo, Wai

PY - 2022/12/19

Y1 - 2022/12/19

N2 - The study of solar activity holds special importance since the changes in our star’s behavior affect both the Earth’s atmosphere and the conditions of the interplanetary environment. They can interfere with air navigation, space flight, satellites, radar, high-frequency communications, and overhead power lines, and can even negatively influence human health. We present here a machine learning-based prediction of the evolution of the current sunspot cycle (solar cycle 25). First, we analyze the Fourier Transform of the total time series (from 1749 to 2022) to find periodicities with which to lag this series and then add attributes (predictors) to the forecasting models to obtain the most accurate result possible. Consequently, we build a trained model of the series considering different starting points (from 1749 to 1940, with 1 yr steps), applying Random Forests, Support Vector Machines, Gaussian Processes, and Linear Regression. We find that the model with the lowest error in the test phase (cycle 24) arises with Random Forest and with 1915 as the start year of the time series (yielding a Root Mean Squared Error of 9.59 sunspots). Finally, for cycle 25 this model predicts that the maximum number of sunspots (90) will occur in 2025 March.

KW - Support vector machine

KW - Gaussian Processes regression

KW - Sunspots

KW - Solar activity

KW - Time series analysis

KW - Linear regression

KW - Random Forests

UR - http://www.scopus.com/inward/record.url?scp=85145288216&partnerID=8YFLogxK

U2 - 10.1088/1538-3873/aca4a3

DO - 10.1088/1538-3873/aca4a3

M3 - Article

AN - SCOPUS:85145288216

SN - 0004-6280

VL - 134

JO - Publications of the Astronomical Society of the Pacific

JF - Publications of the Astronomical Society of the Pacific

IS - 1042

M1 - 124201

ER -