Sherlock Homes: Investigating Property Growth in Australia
What factors affect property growth? Distance to school? Find out with Holmes - a unique tool investigating property growth in Australia.
Problem Statement: How can we make home buying in Australia (in the states of NSW, VIC and QLD) less emotional and more profitable?
Proposed Solution: Create a tool that 1) predicts the 1-year annualized growth of a property in Australia (in NSW, VIC and QLD), 2) displays a confidence interval for the prediction and 3) displays the factors that are most important to property growth.
Impact: Home buyers will be more confident in their home purchase decisions, which should reduce stress levels and improve quality of life. Home buyers might also see increased profits because they know some of the factors that drive growth.
Customer Segment: Home buyers in the NSW, VIC and QLD states of Australia
Australian Property Market: 70% of homes in Australia are owner-occupied, one of the largest proportions of any country. Increases in owner-occupied property values are not subject to capital gains tax, which helps to make home ownership a smart investment. About a third of Australian homes are owned outright and about another third are owned with a mortgage still being paid off. (The remaining 30% are rented.) Home loan interest rates are at all-time lows of ~4%, encouraging further investment in the property market.
Challenges: In order to gather enough features to make our assertions, we looked to many different data sets. The data we obtained was of varying quality, coverage, and accuracy. All of the data required pre-processing to be made into a coherent form useful for analysis. The data quality issues easily presented the most difficult engineering challenge as well as the biggest risk to the validity of our results.
Example Impact Calculation: The tool found 10% growth for postcode 3129. That is 4 percentage points more than the national average of 6%. The median home value in this postcode is $1.0 million, so an extra 4% growth could mean an extra $40K a year. (As a comparison, term deposits with local banks are currently only returning about 3%.)
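The arithmetic behind this example can be written out as a short sketch. The function name and signature are hypothetical, chosen only for illustration; the inputs are the figures quoted above.

```python
def extra_annual_gain(median_value, predicted_growth_pct, baseline_growth_pct):
    """Extra dollars per year from growing faster than a baseline.

    Hypothetical helper for illustration; growth rates are given in percent.
    """
    excess_pct = predicted_growth_pct - baseline_growth_pct
    return median_value * excess_pct / 100


# Figures from the example: $1.0M median value, 10% predicted, 6% national average
gain = extra_annual_gain(1_000_000, 10, 6)
print(f"Extra growth per year: ${gain:,.0f}")
```

Working in whole percentage points and dividing at the end keeps the result exact for these inputs.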
Modeling Summary: We used Gradient Boosted Regression Trees (GBRT) from sklearn to predict growth at the individual property level, initially using all the features we gathered. We then pruned this feature set using importance scores from GBRT to improve the generalization of the model. The GBRT hyperparameters were tuned to maximize R^2, so that the model explains as much of the growth as possible.
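A minimal sketch of this fit, prune and tune workflow is shown below. It runs on synthetic data rather than the property data set, and the pruning cutoff (keep features with at least mean importance), the parameter grid and the 3-fold cross-validation are all assumptions made for illustration, not the settings used for the tool.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in for the property features and growth target (assumption)
X, y = make_regression(n_samples=300, n_features=10, n_informative=4,
                       noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# Step 1: fit GBRT on all features
full_model = GradientBoostingRegressor(random_state=0).fit(X_train, y_train)

# Step 2: prune using importance scores
# (the "at least mean importance" cutoff is a hypothetical choice)
importances = full_model.feature_importances_
keep = np.flatnonzero(importances >= importances.mean())

# Step 3: tune hyperparameters to maximize cross-validated R^2
# (the grid values are illustrative only)
param_grid = {
    "n_estimators": [50, 100],
    "max_depth": [2, 3],
    "learning_rate": [0.05, 0.1],
}
search = GridSearchCV(GradientBoostingRegressor(random_state=0),
                      param_grid, scoring="r2", cv=3)
search.fit(X_train[:, keep], y_train)

# Held-out R^2 of the tuned model on the pruned feature set
test_r2 = search.score(X_test[:, keep], y_test)
print("kept features:", keep.tolist())
print("best params:", search.best_params_)
print("held-out R^2:", round(test_r2, 3))
```

Scoring on a held-out split, rather than on the training data, is what indicates whether pruning actually helped generalization.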
Performance & Evaluation: We trained the model on growth from 2009 to 2015 and generated predictions of growth for 2016 (prediction window up to 6 months). The following table shows the probability of selecting a property with growth above a given threshold using the model versus random selection.
| Property Growth | Random Probability of Selection | Model Probability of Selection |
|---|---|---|
| > 6% | 54% | 70% |
| > 7% | 44% | 66% |
| > 8% | 35% | 57% |
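One plausible reading of the two columns, stated here as an assumption, is that the random probability is the share of all properties whose actual growth exceeded the threshold, and the model probability is that same share among only the properties the model predicted to exceed it. The sketch below computes both on a tiny made-up sample; the numbers are illustrative and are not the project's data.

```python
def selection_probabilities(actual, predicted, threshold):
    """Return (random_probability, model_probability) for a growth threshold.

    Hypothetical helper: random_probability is the base rate of actual growth
    above the threshold; model_probability is that rate among properties whose
    predicted growth is above the threshold.
    """
    base_rate = sum(a > threshold for a in actual) / len(actual)
    picked = [a for a, p in zip(actual, predicted) if p > threshold]
    if not picked:
        return base_rate, float("nan")  # the model selected nothing
    model_rate = sum(a > threshold for a in picked) / len(picked)
    return base_rate, model_rate


# Made-up growth figures in percent (not from the project)
actual_growth = [4, 9, 7, 5, 10, 3, 8, 6]
predicted_growth = [5, 8, 7.5, 6.5, 9, 4, 5.5, 5]

random_p, model_p = selection_probabilities(actual_growth, predicted_growth, 6)
print(f"random: {random_p:.0%}, model: {model_p:.0%}")
```

Under this reading, the gap between the two numbers measures how much the model improves on picking a property blindly.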
Insights: The supply of properties for sale is an important feature: lower availability is associated with higher growth. Proximity to transport is also highly important.
Explore Holmes to find out more.