Disclosure Risk
- Risk that confidential or sensitive information is revealed inadvertently when collecting, processing, or releasing data.
- Can occur through direct identifiers (e.g., names and addresses) or by linking unique combinations of variables.
- Mitigation includes anonymizing data, using statistical masking techniques, and restricting data access.
Definition
Disclosure risk is the risk that confidential or sensitive information is unintentionally revealed during data collection, analysis, or release.
Explanation
Disclosure risk arises when data about individuals can be exposed either directly (through identifiable fields) or indirectly (through combinations of non-identifying variables that together identify someone). Consequences of disclosure may include embarrassment or discrimination for the affected individuals. Reducing disclosure risk requires deliberate actions by data collectors and researchers to protect confidentiality and privacy throughout data handling and release.
Examples
Release of personal information
If a dataset includes personal information such as names and addresses and is released without proper protections, individuals can be identified. For example, a study that collects income levels of individuals in a particular city could reveal who earns specific incomes if the data are not properly anonymized.
Identification via a combination of variables
Even without direct identifiers, a unique combination of variables can identify an individual. For example, a study that collects height and weight of individuals in a particular city might allow someone to identify an individual if their height and weight combination is unique, potentially leading to embarrassment or discrimination.
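One way to make this concrete is to count how often each combination of variables occurs and flag the records whose combination is rare. The sketch below is illustrative only: the function name `risky_records`, the field names `height_cm` and `weight_kg`, the threshold `k`, and the sample data are all assumptions made for this example, not part of any particular tool.

```python
from collections import Counter


def risky_records(records, quasi_identifiers, k=2):
    """Return records whose combination of quasi-identifier values
    occurs fewer than k times in the dataset (hypothetical helper)."""

    def key(record):
        # The combination of values that could single someone out.
        return tuple(record[q] for q in quasi_identifiers)

    counts = Counter(key(r) for r in records)
    return [r for r in records if counts[key(r)] < k]


# Made-up sample data: two people share a combination, one does not.
records = [
    {"height_cm": 170, "weight_kg": 70},
    {"height_cm": 170, "weight_kg": 70},
    {"height_cm": 201, "weight_kg": 130},
]

unique = risky_records(records, ["height_cm", "weight_kg"])
print(unique)
```

With the default `k=2`, only records with a one-of-a-kind combination are flagged. A larger `k` is a stricter choice, since it also flags combinations shared by only a few people.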
Notes or pitfalls
- Common mitigation measures include anonymizing data before release, applying statistical techniques to mask individual identities, and limiting access to data to those with a legitimate need.
- Researchers should assess potential implications and risks to individuals before collecting and analyzing data to help reduce disclosure risk.
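As a rough illustration of the first two measures, the sketch below drops direct identifiers and replaces an exact numeric value with a coarse band. It is a minimal example under stated assumptions: the helper `anonymize`, the field names, the 10,000-wide band, and the sample record are hypothetical, values are assumed to be integers, and real releases usually need a fuller assessment than this.

```python
def anonymize(record, direct_identifiers, band_widths):
    """Drop direct identifiers and coarsen integer fields into bands
    (hypothetical helper for illustration)."""
    # Remove fields that identify a person on their own.
    masked = {k: v for k, v in record.items() if k not in direct_identifiers}
    # Replace each exact value with the band it falls into.
    for field, width in band_widths.items():
        low = (masked[field] // width) * width
        masked[field] = f"{low}-{low + width - 1}"
    return masked


# Made-up record for demonstration only.
record = {"name": "A. Example", "address": "1 Example St", "income": 52300}

released = anonymize(record, {"name", "address"}, {"income": 10000})
print(released)
```

Wider bands hide more detail but make the data less useful for analysis, so the width is a trade-off the data collector has to choose.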
Related terms
- Anonymizing data
- Statistical techniques to mask the identity of individuals
- Limiting access to data (data access controls)