Goal

This case study shows how to combine functionality from several external packages to gain insight into the variability of the unimproved value (UV) of properties in the ACT, based on Allhomes past sales data. The key packages are

  • allhomes for extracting Allhomes past sales data,
  • strayr for getting ABS geometries of SA2 regions (which – within cities – usually represent suburbs), and
  • leaflet for drawing interactive Leaflet maps in R.

Prerequisites

We load the necessary non-base R packages.
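The original does not show the package list, so the one below is an assumption inferred from the functions called later in this case study. As a minimal sketch, it checks which of the assumed packages are installed and attaches them only if none are missing.

```r
# Packages assumed from the functions used below (not listed in the original).
pkgs <- c(
    "allhomes",   # get_past_sales_data()
    "strayr",     # read_absmap()
    "leaflet",    # leaflet(), addPolygons(), colorNumeric(), addLegend()
    "dplyr",      # filter(), mutate(), inner_join(), summarise(), %>%
    "stringr",    # str_remove_all()
    "tidyr",      # pivot_wider()
    "purrr",      # map()
    "htmltools")  # HTML()

# Identify packages that are not installed.
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- pkgs[!installed]

if (length(missing_pkgs) > 0) {
    message("Please install: ", paste(missing_pkgs, collapse = ", "))
} else {
    # Attach all packages so their functions can be called unqualified.
    invisible(lapply(pkgs, library, character.only = TRUE))
}
```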

Raw data

Geospatial data

We obtain the 2021 Statistical Area Level 2 (SA2) geospatial data provided by the ABS through strayr, and keep only records for areas in the ACT. SA2 names sometimes include the territory name in parentheses, so we clean the names by removing it where present. We then store the sorted, unique cleaned SA2 names for use as search terms in the Allhomes past sales query.

data_spatial <- read_absmap("sa22021") %>%
    filter(state_name_2021 == "Australian Capital Territory") %>%
    mutate(sa2_name_2021 = str_remove_all(sa2_name_2021, "\\s\\(ACT\\).*$"))
sa2_names <- data_spatial %>% pull(sa2_name_2021) %>% unique() %>% sort()
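To make the effect of the cleaning pattern concrete, here is a small sketch that applies the same regular expression with base R's `sub()` instead of `str_remove_all()`, so that it runs without extra packages. The input names are made up for illustration and are not taken from the ABS data.

```r
# Hypothetical SA2-style names, invented for illustration only.
raw_names <- c(
    "Example Suburb (ACT)",
    "Other Suburb",
    "Third Suburb (ACT) - North")

# Same pattern as above: a whitespace character, the literal "(ACT)",
# and everything that follows up to the end of the string.
clean_names <- sub("\\s\\(ACT\\).*$", "", raw_names)
print(clean_names)

# Unique, sorted names, analogous to sa2_names.
sa2_names_demo <- sort(unique(clean_names))
```

Names without the territory suffix pass through unchanged, and any text after the suffix is removed along with it.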

Allhomes past sales data

We now get past sales data for all suburbs in sa2_names for the years 2021 and 2022. Since get_past_sales_data() requires suburbs to be specified in the format “suburb_name, state/territory_abbreviation”, we append ", ACT" to the entries in sa2_names. This process may take a few minutes. SA2 names that do not correspond to an Allhomes suburb are skipped with a warning.

data_allhomes <- get_past_sales_data(sa2_names %>% paste0(", ACT"), 2021L:2022L)
#> Warning: [2022-09-13 06:32:25] Could not find ID for 'ACT - South West, ACT'.
#> Skipping.
#> Warning: [2022-09-13 06:32:31] Could not find ID for 'Arboretum, ACT'. Skipping.
#> Warning: [2022-09-13 06:33:01] Could not find ID for 'Canberra Airport, ACT'.
#> Skipping.
#> Warning: [2022-09-13 06:33:02] Could not find ID for 'Canberra East, ACT'.
#> Skipping.
#> Warning: [2022-09-13 06:33:14] Could not find ID for 'Civic, ACT'. Skipping.
#> Warning: [2022-09-13 06:33:40] Could not find ID for 'Duntroon, ACT'. Skipping.
#> Warning: [2022-09-13 06:34:05] Could not find ID for 'Gooromon, ACT'. Skipping.
#> Warning: [2022-09-13 06:34:22] Could not find ID for 'Gungahlin - East, ACT'.
#> Skipping.
#> Warning: [2022-09-13 06:34:23] Could not find ID for 'Gungahlin - West, ACT'.
#> Skipping.
#> Warning: [2022-09-13 06:34:49] Could not find ID for 'Kenny, ACT'. Skipping.
#> Warning: [2022-09-13 06:34:55] Could not find ID for 'Kowen, ACT'. Skipping.
#> Warning: [2022-09-13 06:34:55] Could not find ID for 'Lake Burley Griffin, ACT'.
#> Skipping.
#> Warning: [2022-09-13 06:35:12] Could not find ID for 'Majura, ACT'. Skipping.
#> Warning: [2022-09-13 06:35:18] Could not find ID for 'Migratory - Offshore -
#> Shipping, ACT'. Skipping.
#> Warning: [2022-09-13 06:35:20] Could not find ID for 'Molonglo, ACT'. Skipping.
#> Warning: [2022-09-13 06:35:20] Could not find ID for 'Molonglo - East, ACT'.
#> Skipping.
#> Warning: [2022-09-13 06:35:21] Could not find ID for 'Molonglo Corridor, ACT'.
#> Skipping.
#> Warning: [2022-09-13 06:35:26] Could not find ID for 'Mount Taylor, ACT'.
#> Skipping.
#> Warning: [2022-09-13 06:35:26] Could not find ID for 'Namadgi, ACT'. Skipping.
#> Warning: [2022-09-13 06:35:34] Could not find ID for 'No usual address, ACT'.
#> Skipping.
#> Warning: [2022-09-13 06:35:59] Could not find ID for 'Scrivener, ACT'. Skipping.
#> Warning: [2022-09-13 06:36:18] Could not find ID for 'Tuggeranong - West, ACT'.
#> Skipping.
#> Warning: [2022-09-13 06:36:30] Could not find ID for 'West Belconnen, ACT'.
#> Skipping.
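The warnings show that a number of SA2 names have no matching Allhomes suburb. As a sketch of how the query strings are built and how skipped names could be listed after the download, the example below uses small stand-in objects; `sa2_names_demo` and `sales_demo` are hypothetical, and only the `division` column name is taken from the code later in this case study.

```r
# Hypothetical subset of names; "Kowen" and "Majura" appear in the warnings above.
sa2_names_demo <- c("Braddon", "Kowen", "Majura", "Turner")

# Build the "suburb_name, state/territory_abbreviation" query strings.
queries <- paste0(sa2_names_demo, ", ACT")
print(queries)

# Stand-in for data_allhomes: only the suburb column is needed here.
sales_demo <- data.frame(division = c("Braddon", "Braddon", "Turner"))

# Names that were queried but returned no sales records.
skipped <- setdiff(sa2_names_demo, unique(sales_demo$division))
print(skipped)
```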

Plotting

We now combine the geospatial and Allhomes past sales data. An inner join drops suburbs without any UV data. For every suburb we summarise the UV per square metre across property sales as the median together with the 2.5% and 97.5% quantiles, which bound the central 95% quantile band. Before aggregating, we keep only entries where the block size is between 100 and 2000 sqm and the unimproved value exceeds $100; this filters out large-scale commercial sales and zero-UV outliers.

data <- data_spatial %>%
    inner_join(
        data_allhomes %>%
            filter(between(block_size, 100, 2000), unimproved_value > 100) %>%
            mutate(UV_per_sqm = unimproved_value / block_size) %>%
            group_by(division) %>%
            summarise(
                UV_per_sqm = quantile(
                    UV_per_sqm, probs = c(0.025, 0.5, 0.975), na.rm = TRUE),
                quant = c("l", "m", "h"),
                .groups = "drop") %>%
            pivot_wider(names_from = "quant", values_from = "UV_per_sqm"),
        by = c("sa2_name_2021" = "division"))
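The filtering and per-suburb summary can be illustrated on a toy table. The sketch below uses base R only, so it runs without dplyr or tidyr; the sales records are invented, and only the column names `division`, `block_size` and `unimproved_value` follow the code above.

```r
# Invented sales records; one oversized block and one zero-UV entry.
sales <- data.frame(
    division = c("A", "A", "A", "B", "B", "B"),
    block_size = c(500, 800, 50000, 400, 600, 1000),
    unimproved_value = c(400000, 480000, 2000000, 0, 300000, 450000))

# Keep blocks between 100 and 2000 sqm with an unimproved value above $100.
kept <- subset(
    sales,
    block_size >= 100 & block_size <= 2000 & unimproved_value > 100)
kept$UV_per_sqm <- kept$unimproved_value / kept$block_size

# One row per suburb: lower (2.5%), median (50%) and upper (97.5%) quantiles.
bands <- do.call(rbind, lapply(split(kept, kept$division), function(d) {
    q <- quantile(d$UV_per_sqm, probs = c(0.025, 0.5, 0.975), na.rm = TRUE)
    data.frame(division = d$division[1], l = q[[1]], m = q[[2]], h = q[[3]])
}))
print(bands)
```

With only two retained sales per suburb the band is of course not meaningful; the point is the shape of the result, one row per suburb with columns `l`, `m` and `h`.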

We can now visualise median UV values per sqm for every suburb in a Leaflet map. The 95% quantile band and median values for every suburb are detailed on mouse hover.

pal <- colorNumeric("YlOrRd", domain = data$m)
leaflet(data = data, height = 1000) %>%
    addTiles() %>%
    addPolygons(
        fillColor = ~pal(m),
        fillOpacity = 0.7,
        color = "white",
        weight = 1,
        smoothFactor = 0.2,
        highlightOptions = highlightOptions(
            weight = 5,
            color = "#666",
            fillOpacity = 0.7,
            bringToFront = TRUE),
        label = sprintf(
            "<strong>%s</strong><br/>UV per m²: %s<br/>95%% quantile band: [%s, %s]",
            data$sa2_name_2021, 
            sprintf("$%.0f", data$m),
            sprintf("$%.0f", data$l), sprintf("$%.0f", data$h)) %>% 
            map(HTML),
        labelOptions = labelOptions(
            style = list("font-weight" = "normal", padding = "3px 8px"),
            textsize = "15px",
            direction = "auto")) %>%
    addLegend(
        pal = pal, 
        values = ~m, 
        opacity = 0.7, 
        title = "Unimproved Value (UV) per m²",
        position = "bottomright")
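The hover labels are plain HTML strings built with nested `sprintf()` calls. The sketch below reproduces that step on its own with a format string of the same shape; the `bands_demo` values are invented, and `dollars()` is a hypothetical helper introduced only to shorten the example.

```r
# Invented per-suburb summary values.
bands_demo <- data.frame(
    name = c("Suburb A", "Suburb B"),
    l = c(605, 451.25),
    m = c(700, 475),
    h = c(795, 498.75))

# Hypothetical helper: format a number as whole dollars.
dollars <- function(x) sprintf("$%.0f", x)

# sprintf() is vectorised, so this yields one label per suburb.
# "%%" produces a literal percent sign in the output.
labels <- sprintf(
    "<strong>%s</strong><br/>UV per m²: %s<br/>95%% band: [%s, %s]",
    bands_demo$name,
    dollars(bands_demo$m),
    dollars(bands_demo$l),
    dollars(bands_demo$h))
print(labels)
```

In the map code, each of these strings is then wrapped with `HTML()` so that Leaflet renders the markup instead of showing it as literal text.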