How researchers are keeping your health data safe

Digital map of Australia
Picture: Getty Images

Researchers developing an Australian heatwave index are using innovative data collection that improves the quality and resolution of personal data while protecting people’s privacy

By Dr Hao Chen and Flavia Barar, University of Melbourne


Published 28 November 2024

Heatwaves are Australia’s deadliest natural disaster, causing more deaths than any other natural hazard. The intensity and frequency of heatwaves are predicted to increase due to climate change, posing great risks to human health – especially in cities where the urban heat island effect can exacerbate these impacts.

For this reason, many cities, including the City of Melbourne, have appointed Chief Heat Officers to assess and mitigate such risks.

Man drinking from a water bottle
Heatwaves cause more deaths than any other natural hazard. Picture: Getty Images

A critical element in understanding these effects – and how to mitigate them – is access to relevant data.

However, this can often be a difficult process due to the disparate sources of such data, the variety of scales at which this information is collected, and the need for privacy when dealing with health data.

Most people would agree that understanding heatwaves at a local scale is important and worthy of research. But most Australians, unfortunately, have personal experience with data breaches, and many would be hesitant if they thought their personal information, including health information, was in any way at risk.

A problem of scale 

Typically, geospatial data is available at different scales based on Statistical Areas, as defined by the Australian Bureau of Statistics. Statistical Area Level 1 (SA1) is one of the smallest possible levels.

Australia is covered by 61,845 SA1 regions with an average population of 400 people.

The same heatwave can affect populations disproportionately depending on their health, demographic and socioeconomic status, as well as the built and natural environment of their local area.

Having access to SA1 level data would allow for detailed, high-resolution analysis which is essential for our research. However, existing indicators primarily focus on heat vulnerability at the broader SA2 level (average population 10,000).

At this scale, population and urban characteristics are averaged out over large geographic areas and so we are potentially missing out on important details, like the specific drivers contributing to vulnerability profiles.

But there’s a flipside.

These larger areas are more often used because of the risks related to accessing and analysing health data. All of the health data used in studies like ours is de-identified, but at the smaller SA1 scale, there is a real possibility that individuals could be re-identified and their personal information compromised.

Map of Sydney divided by many small regions
Australia has 61,845 SA1 regions, each with about 400 residents. Picture: Australian Bureau of Statistics

For example, in an SA1 area there might be only one or two people with a particular health condition, making it possible, when linked to other data, to identify these people.
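
One widely used safeguard against this kind of risk is small-cell suppression: any count below a minimum threshold is withheld before results are released. The sketch below is illustrative only and is not a description of this project's method. The threshold of 5, the function name and the area counts are all assumptions made up for the example.

```python
# Hypothetical minimum cell size; the article does not state a threshold.
MIN_CELL_COUNT = 5


def suppress_small_cells(counts, threshold=MIN_CELL_COUNT):
    """Return a copy of counts with values below threshold replaced by None."""
    return {
        area: (count if count >= threshold else None)
        for area, count in counts.items()
    }


# Made-up counts of people with one health condition in three example areas.
example_counts = {"area_a": 2, "area_b": 17, "area_c": 0}
suppressed = suppress_small_cells(example_counts)

for area, count in suppressed.items():
    print(area, "suppressed" if count is None else count)
```

In this sketch, zero counts are suppressed along with other small counts. That is a simplifying assumption, and real disclosure rules vary between data custodians.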

We are working with government to develop and share ways to use high-resolution health data in a completely safe way.

Heat Health Vulnerability

Currently, we’re developing a Heat Health Vulnerability Indicator (HHVI) as part of the Australian Urban Health Indicators (AusUrb-HI) pilot project, a collaboration across National Collaborative Research Infrastructure facilities – AURIN, PHRN and ARDC – as well as Queensland University of Technology, RMIT University, the University of Western Australia and Cancer Council Queensland.

The HHVI aims to support government agencies in obtaining a comprehensive understanding of heat vulnerability at a local scale, allowing them to direct resources where they are needed (for example, to cope with a spike in hospital presentations during a heatwave), allocate resources for vulnerable populations and take targeted planning measures towards urban heat resilience.

The HHVI identifies spatial patterns of vulnerability, as well as underlying health, socioeconomic and urban environment characteristics with the potential to increase both vulnerability and exposure to heat.

To uncover these factors, we use health data in conjunction with population data (eg age, socioeconomic disadvantage, household composition, education, income), urban data (eg building height and density, public transport, water and green space) and environmental data (eg temperature).

We use this data to create three sub-indicators which together provide an indication of heat health vulnerability: heat exposure, sensitivity and adaptive capacity.
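
To make the idea of combining sub-indicators concrete, here is a minimal sketch of one common way composite vulnerability indices are built: each sub-indicator is rescaled to the range 0 to 1, then exposure and sensitivity are added and adaptive capacity is subtracted. The article does not say how the HHVI combines its sub-indicators, so the formula, the function names and the numbers below are assumptions for illustration only.

```python
def min_max_scale(values):
    """Rescale a list of numbers to the range 0 to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: no variation to rescale.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def vulnerability_scores(exposure, sensitivity, adaptive_capacity):
    """Assumed formula: exposure + sensitivity - adaptive capacity."""
    e = min_max_scale(exposure)
    s = min_max_scale(sensitivity)
    a = min_max_scale(adaptive_capacity)
    return [ei + si - ai for ei, si, ai in zip(e, s, a)]


# Made-up values for three hypothetical areas.
exposure = [30.0, 35.0, 40.0]        # e.g. a temperature measure
sensitivity = [1.0, 2.0, 3.0]        # e.g. a health or age-based score
adaptive_capacity = [8.0, 5.0, 2.0]  # e.g. a green space or income score

scores = vulnerability_scores(exposure, sensitivity, adaptive_capacity)
print(scores)
```

Under this assumed formula, the third area scores highest because it combines the greatest exposure and sensitivity with the least capacity to adapt.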

Using New South Wales (NSW) as a case study, we have identified and compared these sub-indicators over space and time and assessed surges in demand for health and emergency services related to heatwave intensity.

One notable preliminary finding was that inner regional areas experienced lower vulnerability, on average, than major city areas, and this difference appears to be caused almost entirely by a consistent increase in heat exposure in these major city areas.

Cityscape of Sydney on a hot day
Increases in heat exposure make people in major Australian cities more vulnerable to heatwaves. Picture: Getty Images

Risk and reward

An essential part of this project involved finding ways to unlock access to data which can often be difficult to obtain because of privacy concerns.

The steps involved in mitigating privacy risks depend on the type of data and are different for each Australian state.

For example, in NSW, government health data is held by data custodians. Once our research application was approved, data was linked by the Centre for Health Record Linkage, then transferred to a Secure Unified Research Environment (SURE).

We are never handed data directly to store and analyse on our own computers; instead, we access and process data inside the secured environment provided for the project. Data custodians have access to the project space and review and approve any results before they can be extracted from SURE.

Researchers undergo a lengthy and detailed application process that requires specific training and approval from several data custodians, as well as an independent ethics committee.

We undertake a risk assessment to test for potential weak points and determine appropriate analysis methods to prevent individuals from being re-identified.

As you can see, the process of accessing health data is highly complex and requires ongoing communication with all parties involved.

We’ve been documenting this process to create a clear pathway for future researchers to successfully access linked health data, without putting patient data at risk of re-identification.

We hope sharing our process will help the wider research community understand the necessary steps to safely access and analyse highly sensitive data.

We also hope the HHVI itself will allow researchers to refine and adapt the indicator for further environmental health research to support community adaptation and resilience in the face of growing health and climate challenges.
