This is the repository for the allhomes R package.
Its main function, get_past_sales_data(), extracts past sales data from allhomes.com.au for one or more suburbs and years.
Install the package from CRAN:
install.packages("allhomes")
Or directly from GitHub:
remotes::install_github("mevers/allhomes")
The function get_past_sales_data() takes the following two arguments:
suburb: This is a character vector denoting one or more suburbs. Every entry must follow the form used in the example below, such as "Balmain, NSW" (suburb name, then state or territory abbreviation).
year: This is a numeric or integer vector of the year(s) of the sales history.
Example:
get_past_sales_data("Balmain, NSW", 2019) %>% print(width = 100)
#[2022-07-27 14:52:47] Looking up division ID for suburb='Balmain, NSW'...
#[2022-07-27 14:52:47] URL: https://www.allhomes.com.au/svc/locality/searchallbyname?st=NSW&n=balmain
#[2022-07-27 14:52:47] Finding data for ID=7857, year=2019...
#[2022-07-27 14:52:47] URL: https://www.allhomes.com.au/ah/research/_/120785712/sale-history?year=2019
#[2022-07-27 14:52:48] Found 229 entries.
## A tibble: 229 × 27
# divis…¹ state postc…² value year address bedro…³ bathr…⁴ ensui…⁵ garages carpo…⁶ contr…⁷ trans…⁸
# <chr> <chr> <chr> <int> <dbl> <chr> <dbl> <dbl> <lgl> <dbl> <lgl> <chr> <chr>
# 1 Balmain NSW 2041 7857 2019 1 Long… NA NA NA NA NA 06/12/… 02/04/…
# 2 Balmain NSW 2041 7857 2019 7 Alex… NA NA NA NA NA 30/08/… 16/10/…
# 3 Balmain NSW 2041 7857 2019 29 Bir… NA NA NA NA NA 25/10/… 06/12/…
# 4 Balmain NSW 2041 7857 2019 2 Well… 6 3 NA 4 NA 25/05/… 26/08/…
# 5 Balmain NSW 2041 7857 2019 109 Mo… 4 2 NA 2 NA 25/02/… 08/04/…
# 6 Balmain NSW 2041 7857 2019 10 Tha… 4 2 NA 4 NA 05/10/… 16/12/…
# 7 Balmain NSW 2041 7857 2019 3/100 … NA NA NA NA NA 18/07/… 06/09/…
# 8 Balmain NSW 2041 7857 2019 160 Be… 5 4 NA 1 NA 18/10/… 13/12/…
# 9 Balmain NSW 2041 7857 2019 25 Isa… NA NA NA NA NA 01/05/… 02/09/…
#10 Balmain NSW 2041 7857 2019 71 Mor… 4 2 NA 2 NA 24/05/… 05/07/…
## … with 219 more rows, 14 more variables: list_date <chr>, price <dbl>, block_size <dbl>,
## transfer_type <chr>, full_sale_price <dbl>, days_on_market <dbl>, sale_type <lgl>,
## sale_record_source <chr>, building_size <lgl>, land_type <lgl>, property_type <lgl>,
## purpose <chr>, unimproved_value <lgl>, unimproved_value_ratio <lgl>, and abbreviated variable
## names ¹division, ²postcode, ³bedrooms, ⁴bathrooms, ⁵ensuites, ⁶carports, ⁷contract_date,
## ⁸transfer_date
## ℹ Use `print(n = ...)` to see more rows, and `colnames()` to see all variable names
Under the hood, the function get_past_sales_data() first calls a helper function get_ah_division_ids() that determines for every suburb entry the Allhomes “division” name and ID. The division ID is then used to extract past sales data from the Allhomes website using the low-level function extract_past_sales_data().
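As a rough illustration of the first step, the sketch below rebuilds the division lookup URL that appears in the log output of the example above. The helper build_lookup_url() is hypothetical and not part of the package. The URL pattern is copied from that log line, and the URL-encoding of multi-word suburb names is an assumption. How the package itself builds the request is not documented here.

```r
# Hypothetical helper (not part of allhomes): turn a "Suburb, STATE" string
# into the lookup URL pattern seen in the example log output.
build_lookup_url <- function(suburb) {
  # Split "Balmain, NSW" into name and state, trimming whitespace
  parts <- trimws(strsplit(suburb, ",", fixed = TRUE)[[1]])
  stopifnot(length(parts) == 2)
  name <- tolower(parts[1])
  state <- toupper(parts[2])
  paste0(
    "https://www.allhomes.com.au/svc/locality/searchallbyname",
    "?st=", state,
    # Assumption: names with spaces need to be URL-encoded
    "&n=", utils::URLencode(name, reserved = TRUE)
  )
}

build_lookup_url("Balmain, NSW")
# "https://www.allhomes.com.au/svc/locality/searchallbyname?st=NSW&n=balmain"
```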
Currently, there are only limited sanity checks in place to verify whether past sales data are available for a particular suburb and year. Allhomes does not have data for all suburbs and years (for example, Allhomes past sales data for Victoria are almost entirely absent).
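Because availability is not guaranteed, callers may want to guard their own lookups. The following is a minimal sketch of such a guard. The wrapper safe_past_sales() is hypothetical, and since it is not stated how get_past_sales_data() behaves when no data exist, the sketch treats both an error and an empty result as "no data". The fetch function is passed in as an argument so the wrapper can be tried without network access.

```r
# Hypothetical wrapper (not part of allhomes): returns NULL instead of
# failing when a lookup errors or yields no rows.
safe_past_sales <- function(fetch, suburb, year) {
  result <- tryCatch(
    fetch(suburb, year),
    error = function(e) {
      message("Lookup failed for ", suburb, " (", year, "): ",
              conditionMessage(e))
      NULL
    }
  )
  # Assumption: an empty result also means "no data available"
  if (is.null(result) || NROW(result) == 0) {
    return(NULL)
  }
  result
}

# Intended use with the package installed (not run here):
# safe_past_sales(allhomes::get_past_sales_data, "Balmain, NSW", 2019)
```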
allhomes
also provides two datasets divisions_ACT
and divisions_NSW
that list division names and IDs for all Allhomes divisions (suburbs) in the ACT and NSW, respectively.
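A lookup against such a dataset could look roughly like the sketch below. It uses a one-row toy data frame rather than the real divisions_NSW, with values copied from the first four columns of the example output above. That the datasets use the same column names (division for the name, value for the ID) is an assumption, and lookup_division_id() is a hypothetical helper.

```r
# Toy stand-in for divisions_NSW; the single row is taken from the
# example output (division, state, postcode, value).
# Assumption: the real datasets share these column names.
divisions_toy <- data.frame(
  division = "Balmain",
  state = "NSW",
  postcode = "2041",
  value = 7857L,
  stringsAsFactors = FALSE
)

# Hypothetical helper: case-insensitive lookup of a division ID by name
lookup_division_id <- function(divisions, name) {
  hit <- divisions[tolower(divisions$division) == tolower(name), , drop = FALSE]
  if (nrow(hit) == 0) {
    return(NA_integer_)
  }
  hit$value
}

lookup_division_id(divisions_toy, "balmain")  # 7857
lookup_division_id(divisions_toy, "Nowhere")  # NA
```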
Please report any bugs as GitHub issues. If you would like to get involved, please get in touch and/or submit a PR.
The (unofficial) Allhomes API distinguishes between different types of “localities”; in increasing level of granularity these are: state > region > district > division > street > address. Divisions (roughly) correspond to suburbs. The allhomes
package pulls in past sales data at the division (i.e. suburb) level.
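One way to make this ordering explicit in R is an ordered factor. The sketch below is purely illustrative: the names locality_levels and as_locality() are hypothetical and do not come from the package or from the Allhomes API.

```r
# Locality types from coarsest to finest, as listed above
locality_levels <- c("state", "region", "district",
                     "division", "street", "address")

# Hypothetical helper: encode a locality type as an ordered factor
as_locality <- function(x) {
  factor(x, levels = locality_levels, ordered = TRUE)
}

# A division is more granular than a district ...
as_locality("division") > as_locality("district")  # TRUE
# ... and less granular than a street
as_locality("division") < as_locality("street")    # TRUE
```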
Allhomes (which is part of Domain Group) receives historical past sales data from relevant state departments. Some details on Allhomes’ data retention are given here.
While there seems to be an (unofficial) Allhomes API to query IDs (which are necessary for looking up past sales data), the past sales data themselves need to be scraped from somewhat awkwardly formatted static HTML tables. Data for every sale are stored within a <tbody>
element; within every <tbody>
element, individual values (address, price, dates, block size, etc.) are spread across 3 lines, each contained within a <td>
element; unfortunately, the format of every line is not consistent.
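The structure described above can be sketched with a toy HTML string. Everything in the snippet is a placeholder: the markup is a simplified assumption rather than a copy of an Allhomes page, and the base R regular expressions are used only to keep the example dependency-free. This is not necessarily how the package parses the tables, and a proper HTML parser such as xml2 or rvest would be more robust on real pages.

```r
# Toy markup (assumption): one <tbody> per sale, three <td> lines each
html <- paste0(
  "<table>",
  "<tbody>",
  "<tr><td>sale A, line 1</td></tr>",
  "<tr><td>sale A, line 2</td></tr>",
  "<tr><td>sale A, line 3</td></tr>",
  "</tbody>",
  "<tbody>",
  "<tr><td>sale B, line 1</td></tr>",
  "<tr><td>sale B, line 2</td></tr>",
  "<tr><td>sale B, line 3</td></tr>",
  "</tbody>",
  "</table>"
)

# Return all non-overlapping matches of a (Perl-style) pattern in a string
extract_all <- function(x, pattern) {
  regmatches(x, gregexpr(pattern, x, perl = TRUE))[[1]]
}

# Step 1: one chunk per sale
tbodies <- extract_all(html, "(?s)<tbody>.*?</tbody>")

# Step 2: within each chunk, pull out the text of every <td> line
sales <- lapply(tbodies, function(tb) {
  cells <- extract_all(tb, "(?s)<td>.*?</td>")
  gsub("</?td>", "", cells)
})

length(sales)  # 2
sales[[1]]     # "sale A, line 1" "sale A, line 2" "sale A, line 3"
```

Each element of sales then holds the three raw lines of one sale, which would still need line-specific parsing because the line format is inconsistent.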
This project is neither related to nor endorsed by allhomes.com.au. With changes to how Allhomes (and Domain Group) manage and format data, some or all of the functions might break at any time. There is also no guarantee that historical past sales data won’t change.
All data provided are subject to the Allhomes “Advertising Sales Agreement terms and conditions - All Homes Pty Ltd”.