Forecasting Probability Distributions of Financial Returns with Deep Neural Networks

This study evaluates deep neural networks for forecasting probability distributions of financial returns. 1D convolutional neural networks (CNN) and Long Short-Term Memory (LSTM) architectures are used to forecast parameters of three probability distributions: Normal, Student’s t, and skewed Student’s t. Using custom negative log-likelihood loss functions, distribution parameters are optimized directly. The models are tested on six major equity indices (S&P 500, BOVESPA, DAX, WIG, Nikkei 225, and KOSPI) using probabilistic evaluation metrics including Log Predictive Score (LPS), Continuous Ranked Probability Score (CRPS), and Probability Integral Transform (PIT). Results show that deep learning models provide accurate distributional forecasts and perform competitively with classical GARCH models for Value-at-Risk estimation. The LSTM with skewed Student’s t distribution performs best across multiple evaluation […]
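To illustrate what optimizing distribution parameters directly through a negative log-likelihood loss can look like, here is a minimal sketch for the Normal case only. It uses plain NumPy rather than a deep learning framework. The function name `normal_nll` and the choice to have the network output the log of the standard deviation (so that sigma stays positive) are assumptions made for illustration, not details taken from the study.

```python
import numpy as np

def normal_nll(y, mu, log_sigma):
    """Mean negative log-likelihood of y under Normal(mu, exp(log_sigma)**2).

    mu and log_sigma stand in for per-observation network outputs
    (hypothetical parameterization; log_sigma keeps sigma positive).
    """
    y = np.asarray(y, dtype=float)
    mu = np.asarray(mu, dtype=float)
    log_sigma = np.asarray(log_sigma, dtype=float)
    z = (y - mu) / np.exp(log_sigma)
    # -log N(y | mu, sigma^2) = 0.5*log(2*pi) + log(sigma) + 0.5*z^2
    return float(np.mean(0.5 * np.log(2.0 * np.pi) + log_sigma + 0.5 * z**2))

# Example: daily returns scored under a tight and a very wide forecast.
returns = [0.01, -0.02, 0.015]
tight = normal_nll(returns, [0.0] * 3, [np.log(0.02)] * 3)
wide = normal_nll(returns, [0.0] * 3, [0.0] * 3)  # sigma = 1
```

A forecast whose scale matches the data (`tight`) receives a lower loss than an overly diffuse one (`wide`), which is the property that lets the loss drive both location and scale parameters during training.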

Alternative Loss Function in Evaluation of Transformer Models

The proper design and architecture of testing machine learning models, especially in their application to quantitative finance problems, is crucial. The most important aspect of this process is selecting an adequate loss function for training, validation, estimation purposes, and hyperparameter tuning. Therefore, in this research, through empirical experiments on equity and cryptocurrency assets, we apply the Mean Absolute Directional Loss (MADL) function, which is more adequate for optimizing forecast-generating models used in algorithmic investment strategies. The MADL function results are compared between Transformer and LSTM models, and we show that in almost every case, Transformer results are significantly better than those obtained with LSTM.

Combining Deep Learning and GARCH Models for Financial Volatility and Risk Forecasting

In this paper, we develop a hybrid approach to forecasting the volatility and risk of financial instruments by combining econometric GARCH models with deep learning networks. For the latter, we employ Gated Recurrent Unit (GRU) networks, whereas four different specifications are used for GARCH: standard GARCH, EGARCH, GJR-GARCH and APARCH. Models are tested using daily returns on the S&P 500 index and Bitcoin prices. As the main volatility estimator, which also serves as the target function of our hybrid models, we use the modified Garman-Klass estimator. Volatility forecasts resulting from the hybrid models are employed to evaluate the assets’ risk using the Value-at-Risk (VaR) and Expected Shortfall (ES). Gains from combining the GARCH and GRU approaches are discussed in the contexts of both […]
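For orientation, the sketch below computes the classic Garman-Klass variance estimate from open, high, low and close prices. The abstract refers to a modified version whose exact form is not given here, so this is the standard textbook estimator and not necessarily the one used in the paper. The function name `garman_klass_variance` is hypothetical.

```python
import numpy as np

def garman_klass_variance(open_, high, low, close):
    """Classic (unmodified) Garman-Klass per-bar variance estimate.

    sigma^2 = 0.5 * ln(H/L)^2 - (2*ln(2) - 1) * ln(C/O)^2
    """
    o = np.asarray(open_, dtype=float)
    h = np.asarray(high, dtype=float)
    l = np.asarray(low, dtype=float)
    c = np.asarray(close, dtype=float)
    range_term = 0.5 * np.log(h / l) ** 2
    drift_term = (2.0 * np.log(2.0) - 1.0) * np.log(c / o) ** 2
    return range_term - drift_term

# Example: one bar that opens and closes at 100 with a 99-102 range.
var_estimate = garman_klass_variance([100.0], [102.0], [99.0], [100.0])
vol_estimate = np.sqrt(var_estimate)  # per-bar volatility
```

Because the estimator uses the intraday range as well as the open-to-close move, it extracts more information per bar than a squared close-to-close return, which is a common reason for choosing range-based estimators as volatility targets.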

Hedging Properties of Algorithmic Investment Strategies Using Long Short-Term Memory and Time Series Models for Equity Indices

This paper proposes a novel approach to hedging portfolios of risky assets when financial markets are affected by financial turmoil. We introduce diversification at the level of ensemble algorithmic investment strategies (AIS) built on the prices of these assets. We employ four types of diverse models (LSTM, ARIMA-GARCH, momentum, contrarian) to generate price forecasts, which are used to produce investment signals in single and complex AIS. We verify the diversification potential of different types of investment strategies consisting of various asset classes in hedging ensemble AIS built for equity indices (S&P 500). Our conclusion is that LSTM-based strategies outperform the other models and that the best diversifier for the AIS built for the S&P 500 index […]

Mean absolute directional loss as a new loss function for machine learning problems in algorithmic investment strategies

This paper investigates the issue of an adequate loss function in the optimization of machine learning models used in the forecasting of financial time series for the purpose of algorithmic investment strategies (AIS) construction. We propose the Mean Absolute Directional Loss (MADL) function, solving important problems of classical forecast error functions in extracting information from forecasts to create efficient buy/sell signals in algorithmic investment strategies. MADL places appropriate emphasis not only on the quality of the point forecast but also on its impact on the rate of achievement by the investment system based on it. The introduction and detailed description of the theoretical properties of this new MADL loss function are our main contributions to the literature. In the empirical […]
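The abstract does not spell out the formula, so the sketch below assumes the commonly cited form MADL = (1/N) * sum over i of (-1) * sign(R_i * R̂_i) * |R_i|, where R_i is the realized return and R̂_i the forecast return. The function name `madl` is chosen for illustration. Note that the sign function has zero gradient almost everywhere, so this NumPy version suits evaluation; gradient-based training would need additional care.

```python
import numpy as np

def madl(actual_returns, predicted_returns):
    """Mean Absolute Directional Loss under the assumed form above.

    A correctly predicted direction contributes -|R_i| (a gain),
    a wrongly predicted direction contributes +|R_i| (a loss).
    """
    r = np.asarray(actual_returns, dtype=float)
    r_hat = np.asarray(predicted_returns, dtype=float)
    return float(np.mean(-np.sign(r * r_hat) * np.abs(r)))

# Example: forecasts with the right sign but the wrong magnitude are still rewarded.
realized = [0.01, -0.02]
right_direction = madl(realized, [0.005, -0.001])
wrong_direction = madl(realized, [-0.005, 0.001])
```

Lower values are better. Unlike squared or absolute error, the loss ignores how far the forecast magnitude is from the realized return and weights each directional hit or miss by the size of the move it would have captured.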

LSTM in algorithmic investment strategies on BTC and S&P500 index

We use LSTM networks to forecast the values of BTC and the S&P500 index, using data from 2013 to the end of 2020, with the following frequencies: daily, 1 h, and 15 min data. We introduce our innovative loss function, which improves the usefulness of the forecasting ability of the LSTM model in algorithmic investment strategies. Based on the forecasts from the LSTM model, we generate buy and sell investment signals, employ them in algorithmic investment strategies, and create equity lines for our investment. For this purpose, we use various combinations of LSTM models, optimized on the in-sample period and tested on the out-of-sample period, using a rolling window approach. We pay special attention to data preprocessing in the input layer, to avoid […]
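The step from forecasts to signals to an equity line can be sketched as follows. This is a deliberately simplified illustration, assuming a long/short rule based on the sign of the forecast return and no transaction costs. The function name `equity_line` and the signal rule are assumptions, not the paper's exact procedure.

```python
import numpy as np

def equity_line(forecast_returns, actual_returns, initial_capital=1.0):
    """Turn return forecasts into long/short signals and compound the result.

    forecast_returns[i] must have been produced before period i starts,
    otherwise the backtest would look ahead.
    """
    f = np.asarray(forecast_returns, dtype=float)
    r = np.asarray(actual_returns, dtype=float)
    signals = np.sign(f)            # +1 buy, -1 sell, 0 stay flat
    strategy_returns = signals * r  # return earned in each period
    return initial_capital * np.cumprod(1.0 + strategy_returns)

# Example: both forecasts have the correct sign, so equity grows each period.
equity = equity_line([0.01, -0.01], [0.02, -0.01])
```

In a rolling window setup, the forecasts passed to such a function would come only from models fitted on data preceding each out-of-sample segment.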

Deep Learning for Financial Time Series Forecasting

The main goal of this thesis was to investigate how deep neural network models perform in financial time series forecasting. In particular, it focuses on combining deep learning and econometric methodologies. In this regard, the theoretical foundations of ARMA-GARCH econometric models, deep recurrent LSTM networks and convolutional networks were discussed in detail. The thesis proposes innovative solutions in the form of ARMA-GARCH-LSTM hybrid volatility point forecast models, as well as models that enable forecasting the parameters of the probability distribution of future asset returns, using deep learning networks. The empirical research presented in the dissertation was carried out on three levels. First, neural network models using MLP, LSTM and CNN network architectures were examined in the context of point forecasting […]