Data Segmentation and Consent: Thoughts on the ONC Proposed Rule
As we reach the deadline for comments on the recent notice of proposed rulemaking (NPRM) by the ONC (see a summary or the full-text), I am sharing some thoughts on patient-requested restrictions and data segmentation for privacy (DS4P), which are among the proposed updates in this NPRM. These are general comments on the subject matter covered by the proposed rules, intended to explain and clarify some of the implications for vendors and implementers.
Patient-Requested Restrictions should be modeled using FHIR Consent
The FHIR Consent resource provides a powerful model for articulating patient-requested restrictions and privacy preferences. Profiles of the Consent resource have been proposed (including the one in the emerging IHE Privacy Consent on FHIR Implementation Guide) and there have been multiple reference implementations (including the ONC LEAP Consent Project). Adding a FHIR Consent profile to the US Core Data for Interoperability (USCDI) seems like a natural next step to support an interoperable mechanism for expressing granular patient-requested restrictions.
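As a rough illustration of what such a restriction could look like, here is a minimal sketch of a FHIR R4-style Consent built as a plain Python dictionary. It permits sharing in general and denies it for one sensitivity category. The function name `build_restriction_consent` is hypothetical, the code system URLs and the LOINC category code are my assumptions and should be checked against the HL7 terminology, and the skeleton omits elements (such as `policyRule` and `dateTime`) that a real profile would likely require.

```python
import json

# Assumed code system URL for HL7 sensitivity codes; verify before relying on it.
ACT_CODE_SYSTEM = "http://terminology.hl7.org/CodeSystem/v3-ActCode"


def build_restriction_consent(patient_id: str, sensitivity_code: str) -> dict:
    """Return an illustrative Consent that denies sharing of one sensitivity category."""
    return {
        "resourceType": "Consent",
        "status": "active",
        "scope": {
            "coding": [{
                "system": "http://terminology.hl7.org/CodeSystem/consentscope",
                "code": "patient-privacy",
            }]
        },
        # Assumed category coding (LOINC code for a consent document).
        "category": [{"coding": [{"system": "http://loinc.org", "code": "59284-0"}]}],
        "patient": {"reference": f"Patient/{patient_id}"},
        "provision": {
            # Base rule: sharing is permitted ...
            "type": "permit",
            "provision": [{
                # ... except for data carrying this sensitivity label.
                "type": "deny",
                "securityLabel": [{"system": ACT_CODE_SYSTEM, "code": sensitivity_code}],
            }],
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_restriction_consent("example", "SEX"), indent=2))
```

The nested `deny` provision refers to a category of data through a security label rather than listing individual resources, which is what ties consent to data segmentation in the sections that follow.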
Granular Consent and Data Segmentation for Privacy (DS4P) are closely linked
Through the use of a Security Labeling Service (SLS), data segmentation identifies the granular segments of data that are noteworthy from the perspective of policies and regulations, or common-sense expectations of privacy. The SLS tags the data to identify and mark those segments that have significance in enforcing privacy rules and preferences; for example, FHIR resources or sections of a CDA document that contain information about reproductive health.
This service is integral in the capability to enforce granular rules and restrictions, including those requested by patients. For example, if a patient consent requires that a provider not share reproductive health data, the provider must be able to identify any data item that falls under this category. Without DS4P, granular rules and restrictions about the data would not be enforceable because the access control system would not be able to distinguish the data items that are subject to these rules.
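To make the role of the SLS more concrete, here is a minimal sketch of a labeling step, assuming a simple lookup from clinical codes to sensitivity codes. The rule table, the `http://example.org/codes` system, and the function name `label_resource` are all hypothetical; a real SLS would rely on curated value sets and would inspect far more than the `code` element. The sketch records labels in `meta.security`, which is where FHIR resources carry security labels.

```python
from copy import deepcopy

# Assumed code system URL for HL7 sensitivity codes; verify before relying on it.
ACT_CODE_SYSTEM = "http://terminology.hl7.org/CodeSystem/v3-ActCode"

# Hypothetical rules: (clinical code system, clinical code) -> sensitivity code.
SENSITIVITY_RULES = {
    ("http://example.org/codes", "opioid-treatment"): "SUD",
    ("http://example.org/codes", "contraception-counseling"): "SEX",
}


def label_resource(resource: dict) -> dict:
    """Return a copy of the resource with sensitivity labels added to meta.security."""
    labeled = deepcopy(resource)
    codings = labeled.get("code", {}).get("coding", [])

    # Find every sensitivity category triggered by the resource's clinical codes.
    found = set()
    for coding in codings:
        key = (coding.get("system"), coding.get("code"))
        if key in SENSITIVITY_RULES:
            found.add(SENSITIVITY_RULES[key])

    if found:
        security = labeled.setdefault("meta", {}).setdefault("security", [])
        existing = {(s.get("system"), s.get("code")) for s in security}
        for code in sorted(found):
            # Avoid duplicate labels if the resource is labeled more than once.
            if (ACT_CODE_SYSTEM, code) not in existing:
                security.append({"system": ACT_CODE_SYSTEM, "code": code})
    return labeled
```

Once resources are labeled this way, an access control system can decide on the basis of labels alone, without having to understand the clinical content itself.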
Minimally, it is possible to enforce patient-requested restrictions while keeping data segmentation and labeling completely internal to the organization. The organization can provide a proprietary interface for patients to request restrictions and express their preferences, and these restrictions can then be enforced through proprietary mechanisms. But this approach has significant limitations, because a common understanding of patient-requested rules across different organizations requires a common understanding of the security labels used in expressing these rules. Moreover, exchanging labeled data between organizations hinges on having an interoperable framework for recording security labels, as well as a common vocabulary and an agreement on the meaning of the labels.
Security Labels are not an alternative to consent
It is important to note that security labels are not an alternative to consent in capturing patient-requested restrictions. While some security labels communicate basic handling instructions, such as “do not re-disclose” or “delete after use”, security labels generally cannot encode complex rules that a consent can model.
In theory, it may be possible (and might be tempting) as an implementation strategy to capture patient-requested restrictions and turn them into security labels persisted on data items. This is often a very poor engineering decision, because patients can change their minds about the requested restrictions, and such changes may be frequent. Each change would require re-labeling a potentially large volume of data belonging to that patient according to the new rules, and this would have to happen again every time the patient's preferences and restrictions change.
In most cases, requested restrictions (which are by nature a type of 'policy') should be persisted and maintained separately from the data, for example, in the form of FHIR Consent resources. This policy should be consulted at the time of sharing or using the data to ensure that the latest version of the policy is enforced.
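The following minimal sketch illustrates this separation, assuming resources already carry sensitivity labels in `meta.security` and that restrictions are expressed as nested `deny` provisions with `securityLabel` entries. The function names `denied_labels` and `filter_for_sharing` are hypothetical, and the logic ignores most of what a real consent decision would consider (actors, purposes, periods, and so on).

```python
def denied_labels(consent: dict) -> set:
    """Collect the label codes denied by the nested provisions of a Consent."""
    denied = set()
    for rule in consent.get("provision", {}).get("provision", []):
        if rule.get("type") == "deny":
            denied.update(label["code"] for label in rule.get("securityLabel", []))
    return denied


def filter_for_sharing(resources: list, consent: dict) -> list:
    """Return only the resources whose labels are not denied by the current consent."""
    blocked = denied_labels(consent)
    shareable = []
    for resource in resources:
        labels = {s["code"] for s in resource.get("meta", {}).get("security", [])}
        if not (labels & blocked):
            shareable.append(resource)
    return shareable


if __name__ == "__main__":
    records = [
        {"id": "a", "meta": {"security": [{"code": "SUD"}]}},
        {"id": "b", "meta": {"security": [{"code": "MH"}]}},
        {"id": "c"},
    ]
    consent = {"provision": {"type": "permit", "provision": [
        {"type": "deny", "securityLabel": [{"code": "SUD"}]}]}}
    print([r["id"] for r in filter_for_sharing(records, consent)])
```

Because the consent is read at the time of sharing, a patient changing the restriction from one category to another only replaces the consent; the labels on the data stay untouched.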
Interoperable Security Labels are the crux of interoperable DS4P and consent enforcement
Standard labels provide a vocabulary for expressing granular patient-requested restrictions and privacy preferences (and also other types of policies). While consent enforcement and segmentation methods can be proprietary, these labels need to be standard and interoperable in order to ensure a consistent interpretation and enforcement of consent rules in different organizations. For example, when a patient expresses a restriction about sharing of information related to substance use treatment, the code for referring to this type of data (which will be used in articulating this rule) must be consistently recognized and understood (both syntactically and semantically) by all the providers who are processing this patient’s data.
Agreement on the meaning of security labels is crucial in establishing an interoperable consent management ecosystem in which patients can author their preferences with a consent management service of their choice and have these preferences applied across different providers, rather than having to record, and perhaps repeat, their requested restrictions with every provider.
We need to start with a small well-defined subset of Security Labels
The FHIR DS4P IG provides the latest value sets for different types of security labels, which are now incorporated into the HL7 terminology. These should supersede other, older value sets defined in previous standards (such as the vocabulary defined in the supplement of HL7 HCS).
However, there is a rather large number of codes in these value sets; some of these codes may not have clear-enough definitions for consistent implementation; some may not be a priority in common use cases, and some may even be outdated or replaced by newer codes. Large-scale implementation of DS4P requires identifying and adopting a well-defined subset of these codes which is most pertinent to common use cases.
The basic confidentiality labels:
- unclassified (U),
- restricted (R), and
- normal (N),
as well as the sensitivity codes for common types of sensitive data recognized in US jurisdictions:
- mental health (MH),
- substance use disorder (SUD),
- sexual and reproductive health (SEX), and
- sexually transmitted disease (STD),
seem to be well-defined and stable and backed by well-understood regulations. I think the value set for general purpose of use also seems to be well-defined and stable.
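One way an implementation might pin itself to such a subset is sketched below: a small table of supported codes and a check that flags any label outside it. The function name `unsupported_labels` is hypothetical, and the two code system URLs are my assumptions about where these codes live in the HL7 terminology, so they should be verified.

```python
# Assumed code system URLs; verify against the HL7 terminology before use.
CONFIDENTIALITY_SYSTEM = "http://terminology.hl7.org/CodeSystem/v3-Confidentiality"
ACT_CODE_SYSTEM = "http://terminology.hl7.org/CodeSystem/v3-ActCode"

# The subset discussed above: basic confidentiality codes and common sensitivity codes.
SUPPORTED_LABELS = {
    CONFIDENTIALITY_SYSTEM: {"U", "R", "N"},
    ACT_CODE_SYSTEM: {"MH", "SUD", "SEX", "STD"},
}


def unsupported_labels(resource: dict) -> list:
    """Return (system, code) pairs on the resource that fall outside the supported subset."""
    flagged = []
    for label in resource.get("meta", {}).get("security", []):
        system, code = label.get("system"), label.get("code")
        if code not in SUPPORTED_LABELS.get(system, set()):
            flagged.append((system, code))
    return flagged
```

A receiving system could run a check like this on incoming data and decide how to handle labels it does not recognize, rather than silently ignoring them.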
Ultimately, this subset should be determined based on feedback from vendors and health IT policy experts and come with more extensive guidance on the precise meaning and use cases for each label to ensure a common and consistent understanding across different implementations.
Some existing specifications (such as the emerging IHE PCF IG) have proposed a subset of the most commonly used security labels. I think, ideally, this subset should be part of a future USCDI profile of the FHIR Consent resource.
Assigning Sensitivity Labels is complex
Sensitivity labels provide an abstract language for the patient to request restrictions about their sensitive data. Naming individual data items in restriction requests can be infeasibly onerous for the average patient who may not be fully knowledgeable about the information contained in each data item and its implications, or have the time and resources to comb through large volumes of data to request restrictions on sensitive information. So, rather than articulating restrictions about individual data items (e.g., “this diagnosis, that observation, and those medications”) the patient can request a broad restriction about an entire category of sensitive information (e.g., anything related to substance use history).
Unlike confidentiality labels, which are a function of the policies and rules applicable to the data, sensitivity labels are inherent to the content of the data. Therefore, it is often reasonable to assume they can be assigned and persisted without the need to update them as a result of a policy change; they only need to be updated if the content of the data item changes, for example, if a new line item is added to a prescription.
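The distinction can be sketched as follows: sensitivity codes are stored with the data, while the confidentiality level is computed from them under whatever policy currently applies. The policy table, the ranking, and the function name `confidentiality_for` are hypothetical simplifications; real policies depend on jurisdiction and context.

```python
# Ordering of the basic confidentiality codes, from least to most restrictive.
RANK = {"U": 0, "N": 1, "R": 2}

# Hypothetical policy: sensitivity code -> confidentiality code it implies.
CURRENT_POLICY = {"SUD": "R", "MH": "R"}


def confidentiality_for(sensitivity_codes: set, policy: dict, default: str = "N") -> str:
    """Derive the most restrictive confidentiality code implied by the sensitivity codes."""
    levels = [policy.get(code, default) for code in sensitivity_codes] or [default]
    return max(levels, key=lambda level: RANK[level])


if __name__ == "__main__":
    persisted = {"SUD"}  # stored with the data; changes only if the content changes
    print(confidentiality_for(persisted, CURRENT_POLICY))
    print(confidentiality_for(persisted, {}))  # same data evaluated under a different policy
```

The persisted sensitivity codes are the same in both calls; only the policy passed in differs, so a policy change does not require touching the stored labels.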
Since sensitivity labels are semantically connected to the clinical content of the data, determining whether a data item belongs to a sensitive category may be somewhat subjective, especially in edge cases. For example, whether a particular medication is indicative of substance use treatment could be a subject of disagreement among experts, which can lead to different systems labeling that data differently.
To ensure a sufficient level of consistency, there should be at least some high-level and broad guidance on the underlying clinical concepts for each sensitivity class. This is necessary to ensure consistent enforcement of restrictions and rules that are based on sensitivity labels. Without such guidance, there is a risk of inconsistent application of sensitivity labels by different implementations and thereby inconsistent enforcement of policies. This could lead to circumstances where one provider chooses to release a data item as not sensitive, while other providers restrict access to the same data item. Such lack of consistency could undermine the trust of patients in the enforcement of their preferences, and in extreme cases, could create loopholes for leaking sensitive data and inference attacks.
Enforcement should be cross-paradigm
Health data, even residing with the same provider, may flow in different formats and via different protocols. So, it is important to ensure that patient-requested restrictions are enforced consistently at all of the gateways where data may flow. Currently, there is guidance for data segmentation for FHIR, for CDA, as well as for HL7 v2.9.
Note that patient-requested restrictions do not need to be captured in different formats for different environments; they can be captured as a FHIR Consent resource once, and then be enforced in different environments through an appropriate consent enforcement service, as demonstrated in the ONC LEAP Consent Project.
A maturity model can pave the way for incremental implementation
Achieving advanced data segmentation and consent enforcement requires implementing and orchestrating a complex set of services and it may not be feasible to do this in one phase. A maturity model can help vendors establish a roadmap for planning an incremental implementation. It also enables different implementers to consistently and accurately communicate the features and the maturity of their implementation.
A maturity model is a roadmap item for future versions of the FHIR DS4P Implementation Guide. The emerging IHE PCF Implementation Guide has proposed a model of maturity (basic, intermediate, and advanced) that is specific to the recording and enforcement of consents. There is also an unpublished draft document within the HL7 Security Workgroup archives that was a first attempt at defining a maturity model for both Security Labeling Services (SLS) and consumers of security labels. I will publish a summary of this maturity model in a future post.