Forecasting is predicting the future with the help of present and past data. It uses the concept of Exponential Smoothing to predict the future. The Power BI desktop has a very nifty feature of forecasting. This article will describe the process with practical data.
The data has been collected from Wikipedia using the data scraping feature provided in Power BI. I have described here how you can load the data from the web with this feature.
The data I have collected has several years of information on the monthly and annual rainfall of different regions of India. This data can be used to predict the rainfall of those particular regions for the coming years.
The data
Page contents
I have selected only a single region including West Bengal and Sikkim with rainfall data from 1901 to 2016. The rainfall has been recorded in mm. On the basis of these many years data, lets try to predict what will be the rainfall of next 5 years.
Here is a glimpse of the data.
Creating the line chart
In order to apply the Forecasting feature in the Power BI desktop, we need to create a line chart first. The line chart option is available in the “Visualizations
” pane of the application.
Select the “Line chart
” option from visualizations and then select appropriate variables from “Fields
“.
The next step is selecting the “Year” as the Axis variable and “Annual” rainfall in the Values. Consequently, the line chart will be created.
Forecasting in the Power BI desktop
Now as the line chart is ready we need to create a Forecast for future time points. In “Visualizations
” pane, under “Analytics
” you can get the option for “Forecast
“. But unless your data has at least 60-time points the option will not be available.
Go with the default values of Forecast and click apply. Now forecast for 10 future time points is produced. As in my case each year is individual time point, so forecast for next 10 years will be produced.
The confidence interval is 95% by default. In layman language, out of 100 times the experiment conducted, 95 times the forecast will lie within the interval shown around the forecast values.
But you can see the forecast produced, does not appear to be very realistic. It has no similarity with the historical trend. So, something is wrong here. The seasonality is left to be selected automatically which is not working in the present case.
Seasonality in Forecasting
We need to provide appropriate value to the “Seasonality
“. This parameter is the most important in the case of forecasting. So let’s try to adjust these value to get the most accurate result.
Seasonality in time series forecasting refers to a time period during which the data shows some regular and predictable changes. This period may be weeks, months or years with a cyclic pattern.
This cycle we can identify from the line chart we created. If we closely analyze the line chart and zoom it a little, we can notice the line repeats a pattern every 5-6 years period.
So, I will try to create the forecast with seasonality values close to 5 time points.
Checking accuracy of the forecast
To check the performance of the forecast, the forecast tool of the Power BI desktop has one feature “Ignore last”. It simply help us to produce the forecast leaving the last few points as mentioned in this field.
Which means, for this many time period we have both the observed as well as forecast values. Thus we can compare how precise the forecast is.
If we take seasonality of 4 and 6 time points, the forecast has big differences with the observed ones. See the below images. For example, for the year 2011, the actual rainfall is 2418.70mm and the forecast is 2733.56mm.
Again if we set seasonality as 6, again the forecast is very different from the original value.
But if we provide seasonality as 5 we achieve the best forecast with the closest values to the original rainfall. If we again take the example of the year 2011, with seasonality as 6 time points, the rain forecast is 2337.89mm.
The “Format” option allows us to change the style of the forecast report generated. We can change the confidence interval pattern, line pattern and colour etc.