What is random forest regression

Random forest regression and trend time series


I am comparing a random forest model with a GLS model using a univariate time series that has a deterministic linear trend. I will add a linear time trend covariate (among other predictors) to the GLS model to account for the changing trend. To be consistent in my comparison, I was hoping to add this predictor to the random forest regression model as well. I've been looking for literature on the subject and can't find much.

Does anyone know whether adding this type of predictor to a random forest regression is inappropriate for some reason? The random forest regression already contains lagged variables to account for the autocorrelation.

Reply:


RFs can of course identify and model a long-term trend in the data. However, the problem becomes more complicated when you try to predict values never seen before, as you often do with time series data. For example, if you find that activity increases linearly between 1915 and 2015, you would expect it to continue to do so in the future. However, an RF would not make this forecast, because each tree can only return averages of target values it saw during training. It would predict that all future years will have the same activity as 2015.
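
A minimal sketch of such a script, assuming Python with scikit-learn (this is not the answerer's original code). The variable names, the noise-free `activity` series and the `bootstrap=False` setting are illustrative assumptions, chosen so that the output is deterministic:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Hypothetical data: "activity" rises exactly linearly with the year.
years = np.arange(1915, 2016)      # 1915 .. 2015 inclusive
activity = years.astype(float)     # noise-free trend, for a deterministic demo

# Assumption: bootstrap=False lets every tree see all training rows, so the
# forest reproduces the training values exactly and the plateau is easy to see.
rf = RandomForestRegressor(n_estimators=50, bootstrap=False, random_state=0)
rf.fit(years.reshape(-1, 1), activity)

# Predict three years inside the training range and three years beyond it.
future = np.arange(2013, 2019).reshape(-1, 1)   # 2013 .. 2018
predictions = rf.predict(future)
print(", ".join(f"{p:.0f}" for p in predictions))
```

Every year after 2015 falls into the same leaf as 2015 in each tree, so the forecast stays flat no matter how far ahead you go.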

The above script prints 2013, 2014, 2015, 2015, 2015, 2015: once the input moves past the last training year, the prediction stays flat. Adding lagged variables to the RF doesn't help in this regard. So be careful. I'm not sure that adding a trend variable to your RF will do what you expect.







Just change the variable you want to predict to the first difference of the dependent variable.

As the other posts point out, a random forest does not know how to treat time values that come after the training set. For example, suppose your training set contains data from minute 1 to minute 60. The random forest might learn a rule that the dependent variable is 100 after minute 40. Even if there is a trend, when you reach minute 10000 in the test data, the same rule will apply. However, predicting the difference can have the same effect as including a trend.
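
A rough sketch of the differencing idea, assuming Python with scikit-learn. The simulated series, the choice of the previous difference as the only predictor, and the recursive forecasting loop are illustrative assumptions, not details given in the answer:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

# Hypothetical series: minutes 1..60 with a linear trend (slope 5) plus noise.
minutes = np.arange(1, 61)
y = 5.0 * minutes + rng.normal(scale=0.5, size=minutes.size)

# Target: first difference d_t = y_t - y_{t-1}.
# Predictor (an assumption): the previous difference d_{t-1}.
d = np.diff(y)
X_train = d[:-1].reshape(-1, 1)
y_train = d[1:]

rf = RandomForestRegressor(n_estimators=200, random_state=0)
rf.fit(X_train, y_train)

# Recursive forecast: predict the next difference, then add it to the last level.
horizon = 40
level, last_diff = y[-1], d[-1]
forecast = []
for _ in range(horizon):
    next_diff = rf.predict(np.array([[last_diff]]))[0]
    level += next_diff
    forecast.append(level)
    last_diff = next_diff

forecast = np.array(forecast)
print(forecast[[0, 9, 39]])  # levels keep rising past the training maximum
```

The forest still cannot predict a difference outside the range it saw in training, but because the predicted differences are accumulated, the forecast level can continue past the largest training value.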

In terms of whether RFs are decent forecasters, I've had a lot more luck with RFs than with econometric models like VAR, VECM, etc., but mainly for short-term forecasting. However, some other models seem to work better on most data, e.g. well-tuned GBM models.
