How to Properly Forecast Your Data

As an owner or manager, you can feel the trends. You can tell that business is growing. Every year, you get a little more guests and/or you are able to raise your price a tad more. When you think into the future, you have an idea of where your numbers will be, but how are others able to forecast so precisely? Where do these numbers come from? The answer is they are either guesstimating—pulling these numbers out of thin air—or they are derived from good old-fashioned statistics. Here, we will show you a solid way to do the math.
First, you need some materials and some tools. Material-wise, you will need at least three years of your occupancy data. You can modify it as you see fit, but there are some requirements to get the model working right. Tool-wise, all you need is a good old-fashioned Microsoft Excel with the Data Analysis Toolpak installed. If you are trying to use Google Docs, you might need some add-ons to get the regression tools. We’ll show you the Excel way and leave a couple of links on how to get regression working in Google Docs.
Now, back to the data. Have your PMS system create a CSV or Excel file of this data going as far back as you can—again, at least three years. You might have to export the data and manipulate it to get it into a format to start, and with all the systems out there, you're on your own here to figure this part out. If you can do that, here is the basic information you need per day: These mandatory data fields are:
Date
Dorm ABR
Private Room ADR
Occupied Dorm Beds
Occupied Private Rooms
Now you can make it more complex by breaking down the prices per channel. Please note abbreviations for titles to make them recognizable only. They have no bearing on the statistics involved:
Average Hostelworld Dorm Bed Rate (AHWbR)
Average Hostelworld Private Room Rate (AHPR)
Average Booking.com Dorm Bed Rate (ABbR)
Average Booking.com Private Room Rate (ABPR)
Other channels, including direct and ‘Other’
You can also do the same with your occupancy numbers. You can break them down to:
Occupied Dorms by Hostelworld (ODHW)
Occupied Private Rooms by Hostelworld (OPHW)
Occupied Dorms by Booking (ODB)
Occupied Private Rooms by Booking (ODP)
Other channels, including direct and ‘Other’
Please note, if you break down by channel, you must include all the channels, or at least lump together the remaining channels into an ‘Other’ field so it reflects the true total of rates and occupancy. If you are breaking it down, you do not need the total occupied numbers because that can be extrapolated later. To keep things simple, we’ll just stick with the most basic of data using those fields first mentioned. Being able to forecast down to the day is overkill and causes too much complexity for little value. Some holidays and events aren’t on the same day each year but happen around the same time, so combining them into weekly stats is the way to go. This will help you forecast by the week. You can then get a cleaner sheet with more actionable data.
How do you condense your data down to the week (or month, quarter)? Pivot Tables, that’s how. We could write an article about how to group data by some kind of time interval, such as a week, month, or quarter, but people already have. Just make sure you are taking an average of your bed rates and occupancy rates, not a sum. So here are some links:
Now we can write a step-by-step guide (and provide links to videos that better explain them below):
Create a graph. It must be a blank Line Plot graph. Label it, and break the year down by quarter (or month if you have little data). You will have to format the Time Axis accordingly. Remember, one cycle is one year. This original column data is also known as Yt, if you really wanted to know.
Create a column and calculate the moving average (title it MA(52)) by writing an average function starting halfway through the cycle (so it starts after the first six months of data, taking that data into account). You also do not get a moving average of the last six months either because there is no data.
Center the moving average in another column (CMA(52)); since 52 weeks is an even number, you create a new column that is an average of the two uncentered moving average values adjacent to it. This way, the average of the two uncentered values would then be centered, and this data can also be referred to as the baseline.
Now select the CMA and add it to the chart on top of the Time Series Data. This gets rid of the seasonality and irregularity components.
Now you extract the Seasonality and Irregularity Components by comparing the initial data with the averaged data. This is the Classical Multiplicative Model for time series. Create a column with St, It. Here, you just divide the original data by the centered average.
Now you get rid of the irregularity and only calculate the seasonal component. You first create an AVERAGEIF formula somewhere else on the sheet to get an average of the season (week number). Then use a VLOOKUP formula to place the data into a column titled St.
Now that you know the seasonality, you deseasonalize the original data by dividing the original data by St. Now you’ve eliminated both the irregularity and seasonality.
Now you calculate the trend component using linear regression, using the deseasonalized data as the Y variable and the T variable as the X. Here you use the Data Analysis Toolpak and input that data somewhere out of the way on the worksheet.
Now you add together the coefficients for the slope and multiply that by the time code. This data will give you the trend over time to help make predictions.
Now use all these values to make predictions. Create a forecast in a new column by multiplying the seasonality trend with the trend.
Now add another year’s worth of time intervals (week, month, quarter) to the time. Extend the seasonal component, the trend component, and the forecast to the end of the forecasted year. The forecast into the future is what you want. Now you can plot the forecast of the future right on top of the original data as a new series in a different color.
Too much? Do you need pictures? How about some quality videos? Here are some videos in a row:
Or view some slides: https://www.slideshare.net/UniSrikandi/chap15-time-series-forecasting-index-number. This data helps you with predictions and can assist you in planning for the future. This is especially useful if you are considering taking on some investment or debt. It is not perfect, and who knows what other factors can influence your business over time?
Now, you can get more complex here, like calculating the average difference between weekend rates and weekday rates. However, please consider that the more complexity you wish for, the better off you might be using a revenue management tool. What revenue management companies do is this, and more. Instead of doing it with spreadsheets, they utilize more complex data science models with a bunch of code. They also employ relevant benchmarking and other publicly accessible hostel data to improve their forecasts, and finally, they add in proxy data to make their data models even more robust. Think of what data outside your business can help your business: local airport statistics, weather predictions, major event schedules? All these data sources already give a good manager a hunch to change prices. This kind of software quantifies the hunch and offers suggestions on how to best act on it.
Pivot Tables. Time Series Forecasting. Wow, you have some business intelligence now. Add that to your resume and start forecasting. If you have any other tips (especially on what systems make it easy to export data like this), feel free to leave them in the comments.
