Papers Under Review / Working Papers

  • Hierarchical Modelling and Forecasting System for Inflation Rate and Volatility
  • With Paul Kattuman
    (Currently under review at the International Journal of Forecasting)

    Abstract:  Using the monthly data that underlie the UK Retail Prices Index, we analyse the dynamics of the inflation rate and its volatility. We examine patterns in the time-varying covariation among product-level inflation rates, which aggregate up to industry-level inflation rates, which in turn aggregate up to the overall inflation rate. Aggregate inflation volatility closely tracks the time path of this covariation, which is driven primarily by the variances of common shocks shared by all products and by the covariances between idiosyncratic product-level shocks. We formulate a forecasting system that comprises models for the mean inflation rate and its variance, and that exploits the index structure of the aggregate inflation rate using the hierarchical time series framework. Using a dynamic model selection approach to forecasting, we obtain forecasts of aggregate inflation volatility that are between 9% and 155% more accurate than those from a SARIMA-GARCH(1,1) model.

    Author Manuscript:  Link
    Presented at:  36th International Symposium on Forecasting, RSS International Conference 2016, Vienna Congress on Mathematical Finance
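
    Illustration:  The SARIMA-GARCH(1,1) benchmark named in the abstract builds on the standard GARCH(1,1) conditional variance recursion. The Python sketch below shows only that recursion. It is not the paper's forecasting system, and the function name and parameter values are hypothetical.

```python
import numpy as np

def garch11_variance(residuals, omega, alpha, beta):
    """Conditional variances from the textbook GARCH(1,1) recursion.

    sigma2[t] = omega + alpha * residuals[t-1]**2 + beta * sigma2[t-1]
    The recursion is started at the unconditional variance, which is an
    assumption of this sketch and requires alpha + beta < 1.
    """
    if alpha + beta >= 1.0:
        raise ValueError("alpha + beta must be below 1 for a finite unconditional variance")
    residuals = np.asarray(residuals, dtype=float)
    sigma2 = np.empty(len(residuals))
    sigma2[0] = omega / (1.0 - alpha - beta)
    for t in range(1, len(residuals)):
        sigma2[t] = omega + alpha * residuals[t - 1] ** 2 + beta * sigma2[t - 1]
    return sigma2
```

    In a SARIMA-GARCH setting the residuals would come from the fitted mean model; here they are simply passed in.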

  • GeomComb: (Geometric) Forecast Combination Methods in R
  • With Gernot R. Roetzer

    Summary:  Combining forecasts from different models can lead to accuracy gains in some circumstances (e.g., when individual models are misspecified or the forecasting environment is unstable). Recent research suggests that regression-based combination methods tend to have a relative advantage when one or more models perform significantly better than the others, while eigenvector-based (geometric) combination methods tend to be superior when the individual forecasts are in the same ballpark. Our R package provides code for several geometric combination methods: the eigenvector approach, the bias-corrected eigenvector approach, the trimmed eigenvector approach, and the trimmed bias-corrected eigenvector approach. We also include other common methods of forecast combination that can be classified as simple methods (simple average, median, trimmed mean, winsorized mean) and regression-based methods (OLS, CLS, LAD). In addition to the combination methods, we provide data manipulation tools that handle missing values and collinearity issues prior to combination, an automated model selection algorithm based on training set fit according to different loss criteria, as well as tools for forecast accuracy evaluation and combination result plots.

    Installable development version (Github):  Link
    Reference manual:  Link
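
    Illustration:  As a rough sketch of the idea behind the standard eigenvector approach, the Python snippet below takes combination weights from an eigenvector of the sample mean squared prediction error (MSPE) matrix and rescales them to sum to one. It is written with NumPy rather than R, is not the package's code, and the function name is hypothetical. The selection rule (smallest eigenvalue divided by the squared sum of the eigenvector's entries) reflects one common description of the method and should be read as an assumption.

```python
import numpy as np

def eigenvector_weights(actual, forecasts):
    """Combination weights from the eigen-decomposition of the MSPE matrix.

    actual:    array of shape (T,), the realised values
    forecasts: array of shape (T, N), one column per individual forecast
    """
    actual = np.asarray(actual, dtype=float)
    forecasts = np.asarray(forecasts, dtype=float)
    errors = actual[:, None] - forecasts            # (T, N) forecast errors
    mspe = errors.T @ errors / len(actual)          # symmetric N x N matrix
    eigvals, eigvecs = np.linalg.eigh(mspe)         # columns are eigenvectors
    d = eigvecs.sum(axis=0)                         # sum of entries per eigenvector
    usable = np.abs(d) > 1e-12                      # cannot rescale if the sum is zero
    criterion = np.full_like(eigvals, np.inf)
    criterion[usable] = eigvals[usable] / d[usable] ** 2
    best = int(np.argmin(criterion))
    return eigvecs[:, best] / d[best]               # weights sum to one

def combine(forecasts, weights):
    """Weighted combination of the individual forecasts."""
    return np.asarray(forecasts, dtype=float) @ weights
```

    With weights `w` scaled this way, the in-sample MSPE of the combined forecast equals the selected eigenvalue divided by the squared entry sum, which is why that ratio serves as the selection criterion in this sketch.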

Work In Progress

  • Hierarchical Healthcare Demand Forecasting & Forecast Combination
  • With Paul Kattuman & Stefan Scholtes

    Outline:  Accurate forecasting of healthcare demand is essential for efficient staffing of temporary workers in hospitals. Extant healthcare forecasting relies heavily on aggregate forecasting. Using a large disaggregated dataset supplied by a hospital, we show that exploiting the patterns (trend, seasonality, comovement with explanatory variables) at disaggregated levels, e.g., divisions or primary specialties, can significantly improve forecasting accuracy, provided the disaggregated forecasts are aggregated optimally through hierarchical time series (HTS) forecasting. Building on the forecasts obtained from the HTS modelling, we explore the value of simple, geometric, and regression-based forecast combination techniques.

    Methods:  HTS, Forecast Combination, Distributed Lag Models
    (Preliminary results available on request)
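
    Illustration:  The outline does not say which HTS aggregation scheme is used, so the Python sketch below shows two generic options on a made-up two-division hierarchy: bottom-up aggregation and OLS reconciliation of base forecasts through a summing matrix. The hierarchy, the numbers, and the function names are hypothetical.

```python
import numpy as np

# Hypothetical hierarchy: hospital total = division A + division B.
# Rows are (total, A, B); columns are the bottom-level series (A, B).
S = np.array([[1.0, 1.0],
              [1.0, 0.0],
              [0.0, 1.0]])

def bottom_up(bottom_forecasts):
    """Build forecasts for every node by summing bottom-level forecasts."""
    return S @ np.asarray(bottom_forecasts, dtype=float)

def ols_reconcile(base_forecasts):
    """Project independent base forecasts for all nodes onto coherent ones.

    Solves the least-squares problem min ||base - S b|| and returns S b,
    so the reconciled total equals the sum of the reconciled divisions.
    """
    base = np.asarray(base_forecasts, dtype=float)
    b, *_ = np.linalg.lstsq(S, base, rcond=None)
    return S @ b

base = np.array([105.0, 60.0, 40.0])    # made-up base forecasts; 60 + 40 != 105
reconciled = ols_reconcile(base)
```

    The base forecasts above are deliberately incoherent; after reconciliation the total and the divisions add up again.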

  • The Effect of Open-Access Publishing on Research Impact: Time Series Feature Extraction
  • With Rupert Gatti, Paul Kattuman & Cameron Neylon

    Outline:  Innovation diffusion trajectories are traditionally modelled using a differential-equation-based approach that fits the long-term behavior of a diffusion curve reasonably well but is not suitable for short-term forecasting or for modelling diffusion features of interest (saddles, takeoff). While the popular Bass model is well suited to describing the long-term trajectory, it cannot extract interesting short-term features. Using a sample of 722 time series that describe the daily evolution of article views for all articles published in ‘Nature Communications’ in the first half of 2013, we analyze the effect of open-access publishing on research impact. Using state-space modelling and time series clustering based on Dynamic Time Warping, we identify distinct view trajectories. In separate analyses of the ‘Open Access’ and ‘Subscription’ subsamples, we find supportive evidence that open-access articles receive far more total views, and that even a change to open access years after publication has a significant, permanent positive effect on article views. We further explore the directional causality between article views and related social media activity, controlling for a wide range of author- and paper-specific explanatory variables.

    Methods:  Unobserved Components Models, Time Series Clustering, Bayesian Poisson VAR, Multinomial Logit
    (Preliminary results available on request)
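
    Illustration:  Dynamic Time Warping yields a dissimilarity between two series that tolerates shifts and stretches in time, and a clustering algorithm can then be run on the resulting pairwise distances. The Python sketch below is the textbook dynamic-programming version with an absolute-difference local cost. It is not the project's code, and the function name is hypothetical.

```python
import numpy as np

def dtw_distance(x, y):
    """Classic O(len(x) * len(y)) Dynamic Time Warping distance."""
    n, m = len(x), len(y)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(x[i - 1] - y[j - 1])
            # Extend the cheapest of the three admissible warping steps.
            cost[i, j] = local + min(cost[i - 1, j],       # repeat y[j-1]
                                     cost[i, j - 1],       # repeat x[i-1]
                                     cost[i - 1, j - 1])   # advance both
    return float(cost[n, m])
```

    Two view trajectories with the same shape but a delayed takeoff would receive a small DTW distance even though their pointwise differences are large.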

  • Time-Varying Correlation in Product-Level Inflation: A Network Analysis
  • With Paul Kattuman

    Outline:  Motivated by the literature on divergent trends in micro and macro volatility (e.g., Comin and Mulani, 2006 – “Divergent Trends in Aggregate and Firm Volatility”), which documents covariances as the main drivers of aggregate volatility, we aim to explore the main linkages across the 85 products that form the UK Retail Prices Index. Using adaptive elastic nets (a sparse estimation method that mitigates the documented weaknesses of LASSO estimation for correlation networks), we compute the contemporaneous partial correlation network as well as the Granger correlation network (which can show lead/lag relationships), and combine the two into a long-run partial correlation measure (a concept based on Barigozzi and Brownlees, 2014 – “Network Estimation for Time Series”). This information allows us to recombine the bottom-level components of the RPI into groups that reflect the input-output structure of the economy and to evaluate the usefulness of this grouping for hierarchical inflation forecasting.

    Methods:  Long-Run Partial Correlation, Granger Causality, (Adaptive) Elastic Net, Eigenvector Centrality
    (Preliminary results available on request)
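
    Illustration:  A contemporaneous partial correlation network can be read off the inverse covariance (precision) matrix. The Python sketch below uses the plain, unregularised sample estimate, which only makes sense when there are many more observations than series. It does not implement the adaptive elastic net or the long-run measure described above, and the function name is hypothetical.

```python
import numpy as np

def partial_correlations(data):
    """Partial correlation matrix from the inverse sample covariance.

    data: array of shape (T, N), one column per series.
    Entry (i, j) is the correlation between series i and j after
    removing the linear effect of all other series.
    """
    cov = np.cov(np.asarray(data, dtype=float), rowvar=False)
    precision = np.linalg.inv(cov)
    scale = np.sqrt(np.diag(precision))
    pcorr = -precision / np.outer(scale, scale)
    np.fill_diagonal(pcorr, 1.0)
    return pcorr
```

    Entries close to zero correspond to missing edges in the network; sparse estimators such as the elastic net push small entries to exactly zero instead of leaving them as sampling noise.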

Graded Degree Papers (Theses)

  • The Use of Time-Series Methods for Diffusion Modelling: An Evaluation

    (submitted as First-Year PhD Progress Report, graded on a pass/fail basis)

    Grade:  Pass (without required corrections) — graded by Prof. Andrew Harvey and Dr. Vincent Mak

    Abstract:  The ability to describe, explain, and predict the diffusion of innovations in a social system is crucial – understanding the dynamic drivers of the diffusion process is a necessity for successful innovation management. This paper sets out to evaluate the extant modelling techniques in the field and introduces state-space modelling as a powerful holistic approach to diffusion modelling. A formal theoretical framework for state-space modelling in a diffusion context is provided. The empirical part of the study suggests that a state-space approach is superior for describing and forecasting diffusion processes (compared with the popular Bass and logistic growth models, as well as ARIMA models), and that it can also explain such processes well by accommodating regressors and intervention variables in the model framework. Furthermore, we introduce a formal systematic test (within the state-space framework) for the saddle effect that is a feature of many diffusion processes.

    Methods:  Unobserved Component Models, Bass Model, Logistic Growth, Gompertz Growth, ARIMA
    Author Manuscript:  Link
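
    Illustration:  For readers unfamiliar with the Bass benchmark mentioned in the abstract, the Python sketch below evaluates its closed-form cumulative adoption curve and the time at which the adoption rate peaks. The parameter values in the usage note are made up and the function names are hypothetical. This is the comparison model, not the state-space approach the paper proposes.

```python
import numpy as np

def bass_cumulative(t, p, q, m):
    """Cumulative adopters at time t under the Bass diffusion model.

    p: coefficient of innovation, q: coefficient of imitation,
    m: market potential (eventual number of adopters).
    """
    t = np.asarray(t, dtype=float)
    decay = np.exp(-(p + q) * t)
    return m * (1.0 - decay) / (1.0 + (q / p) * decay)

def bass_peak_time(p, q):
    """Time of peak adoption rate; meaningful when q > p."""
    return np.log(q / p) / (p + q)
```

    For example, `bass_cumulative(np.arange(0, 30), 0.03, 0.38, 1000.0)` traces a smooth S-curve, which illustrates why a single Bass curve cannot reproduce a saddle (a temporary dip after an early peak).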

  • Disaggregating Stock Index Return Volatility: A Variance Decomposition Study

    (submitted as MSc Dissertation to Department of Statistics, University of Oxford)

    Grade:  Distinction

    Abstract:  Reflecting the growing importance of volatility in economics and finance, there is a large empirical literature in the field that is devoted to estimating and forecasting conditional volatility. The dominant empirical approach hinges on the ARCH/GARCH family of volatility models, numerous extensions of which have been applied to time series data arising in a wide variety of contexts. The success of these models has led to their being applied without discrimination to series such as returns to individual stocks, as well as to returns from stock indices. More generally, the current practice in volatility modelling does not differentiate between models that apply to contemporaneous aggregates of sets of disaggregate variables (such as stock indices, inflation rates, national growth rates) and models that apply to their components (returns to individual stocks, prices of individual goods, growth rates of individual firms).

    There is obvious scope for improving models for aggregate variables by taking note of component volatilities, component weights in the aggregate, and the covariation of the components, all of which are observed. In this dissertation we take note of the generating process for the volatility behavior of aggregates in the hope that it will lead to better explanations of stylized facts about volatility and to more accurate forecasts. To illustrate the value of this approach, we study the volatility behavior of the major German stock index, the DAX, for the period between September 2012 and July 2014. We find that covariance between the returns of its constituent stocks is the dominant driver of index volatility and specify a causal model. In addition, taking note of the fact that the covariation between stock returns is often driven by unobserved bubbles, we estimate unobserved component models and illustrate the value of this approach.

    Methods:  Variance Decomposition, Unobserved Component Models, GARCH Models, Distributed Lag Models
    Author Manuscript:  Link
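
    Illustration:  The decomposition referred to above follows from the variance of a weighted sum. For fixed weights w and a component covariance matrix C, the aggregate variance w'Cw splits into a part due to the components' own variances and a part due to their covariances. The Python sketch below shows this identity with made-up numbers. Treating the weights as fixed is an assumption of the sketch, and the function name is hypothetical.

```python
import numpy as np

def decompose_index_variance(weights, cov):
    """Split the variance of a weighted aggregate into two parts.

    variance_part:   sum_i w_i^2 * Var(r_i)
    covariance_part: sum_{i != j} w_i * w_j * Cov(r_i, r_j)
    """
    w = np.asarray(weights, dtype=float)
    c = np.asarray(cov, dtype=float)
    total = float(w @ c @ w)
    variance_part = float(np.sum(w ** 2 * np.diag(c)))
    return {
        "total": total,
        "variance_part": variance_part,
        "covariance_part": total - variance_part,
    }
```

    With many components the squared weights are small, so the covariance part tends to dominate unless the components are nearly uncorrelated.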