Managing the Evolving Data Landscape in Cancer Care and Research
By Theodora Bakker, Director, Data Stewardship and Integration, and Atti Riazi, SVP & CIO, Memorial Sloan Kettering Cancer Center
With a heavy concentration on translational and clinical research at Memorial Sloan Kettering Cancer Center (MSK), there is an ever-growing need to leverage data across the clinical, research, and education missions. Although, as a cancer center, our organization focuses on a single disease, our data and technology needs are consistent with those of the larger healthcare industry, and our overall outlook reflects the demands of society at large. Atti Riazi says, “We must go through a radical revolution in terms of how we view IT, no longer seeing it as things and products, but instead focusing on the experience, intelligence, and insights from all the technology we deploy.”
Advances in differentiating types of data storage and federation, together with the ability to create an access and delivery layer across disparate data sources, have fostered a different way to think about data: the data fabric. Our data fabric has a few transformative core components, all housed in a strong metadata layer: a concept-driven catalog, data lineage, master and reference data management, and data de-identification. These contribute to the advancement of healthcare by providing clarity and transparency while also protecting sensitive data classes such as PHI (Protected Health Information) and PII (Personally Identifiable Information).
Our catalog and use of standard ontologies in biomedical research and patient care allow our fabric to make the meaning of our data transparent, a clarity previously obscured by the myriad independent transactional systems used in healthcare. Because clinical and administrative data are often spread across multiple systems, it is challenging for users to understand what the data mean and how data in one system connect to data in another. A billing system might provide data about a patient’s diagnosis and comorbidities using standard billing codes, while the impact of drug interactions is housed in the EHR, and outcomes are buried in provider notes. The context of this integrated information is critical in both the clinical and research realms. By extracting the meaning of each of these domains of data and representing them in an integrated catalog, users can find new pathways of care and create new insights for research.
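To make the idea of a concept-driven catalog concrete, here is a minimal sketch in Python. It is an illustration only, not a description of the catalog MSK has built: the system names, field names, and concept labels are all hypothetical, and a real catalog would draw its concepts from standard ontologies rather than a hand-written list.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogEntry:
    system: str   # source system holding the field (hypothetical names)
    field: str    # field name as it appears in that system
    concept: str  # shared concept the field is mapped to


# Hypothetical catalog entries; real systems, fields, and concepts will differ.
CATALOG = [
    CatalogEntry("billing", "dx_code", "diagnosis"),
    CatalogEntry("billing", "secondary_dx_codes", "comorbidity"),
    CatalogEntry("ehr", "interaction_alerts", "drug_interaction"),
    CatalogEntry("provider_notes", "note_text", "outcome"),
    CatalogEntry("ehr", "problem_list", "diagnosis"),
]


def build_concept_index(entries):
    """Group catalog entries by concept so one lookup spans every system."""
    index = defaultdict(list)
    for entry in entries:
        index[entry.concept].append((entry.system, entry.field))
    return dict(index)


def find_sources(index, concept):
    """Return the (system, field) pairs that carry a given concept."""
    return index.get(concept, [])


index = build_concept_index(CATALOG)
print(find_sources(index, "diagnosis"))
```

The point of the sketch is that a user asks for a concept such as "diagnosis" and learns which systems hold it, without needing to know each system's field names in advance.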
While the focus of healthcare must always remain on the provider-patient relationship, the administrative functions of healthcare enable better care. Operations must look at data in the aggregate, which lays bare the inconsistencies and quality issues across a medical center. A data fabric allows data to be selected and managed through its metadata, making it possible to track data through its lifecycle and pinpoint opportunities to improve its quality. With a robust data stewardship program, an organization can use master and reference data management to create a unified picture of data, allowing operations to manage interactions organization-wide. The regulatory and ethical considerations around the privacy of an individual’s data are continuously advancing, and technology is emerging to automatically de-identify data as it moves through systems. Our data fabric transforms our ability to use near real-time data while protecting data privacy. There is no longer a requirement to send each dataset through manual de-identification code, a step that delays the use of data at the moment it is needed.
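As a rough illustration of de-identifying data as it moves through systems, rather than in a separate manual pass, the Python sketch below applies field-level rules to each record in a stream. It assumes a hypothetical classification of fields (which ones are dropped, which are pseudonymized) that in practice would come from the metadata layer, and it uses a keyed hash as a stand-in technique. It is not the method used at MSK and is not, on its own, a compliant de-identification process.

```python
import hashlib
import hmac

# Hypothetical field classification; in practice this would come from metadata.
DIRECT_IDENTIFIERS = {"name", "phone"}  # dropped entirely
PSEUDONYMIZED_FIELDS = {"mrn"}          # replaced with a keyed hash


def pseudonymize(value, key):
    """Return a stable, keyed pseudonym so records can still be linked."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def deidentify(record, key):
    """Apply the field rules to a single record and return a cleaned copy."""
    clean = {}
    for field, value in record.items():
        if field in DIRECT_IDENTIFIERS:
            continue  # never forward direct identifiers
        if field in PSEUDONYMIZED_FIELDS:
            clean[field] = pseudonymize(str(value), key)
        else:
            clean[field] = value
    return clean


def deidentified_stream(records, key):
    """Lazily de-identify records as they pass through the pipeline."""
    for record in records:
        yield deidentify(record, key)


# Illustrative records and key only; not real patient data.
SAMPLE_KEY = b"example-key-not-for-production"
sample_records = [
    {"mrn": "000123", "name": "Jane Doe", "phone": "555-0100", "dx": "C50.9"},
    {"mrn": "000123", "name": "Jane Doe", "phone": "555-0100", "dx": "E11.9"},
]
cleaned = list(deidentified_stream(sample_records, SAMPLE_KEY))
```

Because the same key always produces the same pseudonym, the two cleaned records can still be linked to one patient for analysis even though the original identifier is no longer present.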
Today, the value of data in understanding and controlling infectious disease is at the forefront of many people’s minds. Atti Riazi says, “What is the benefit of knowing about COVID-19 or Ebola a month earlier, or understanding that a few less inches of rain will create drought, food shortages, unrest, and instability in a region the following year? Data has great value in providing insight into so many social, health, and environmental issues; by sharing information freely, we can better predict such disasters and take much more effective action. Tech companies can help governments, NGOs, and civil society with big data projects through funding and providing expertise, tools, and data itself.”
The technology behind our integrated data fabric layer contributes to the transformation of our industry by enhancing the meaning of data and enabling more flexible use, with the data constantly in motion through its lifecycle. As with any transformative program, however, it is not without its challenges. The technology is still taking shape, and a stable, integrated technology has yet to emerge in the industry. Today, organizations that have a fully realized fabric have invested millions of dollars and years of work to get there, and a fabric platform approach to data management remains outside the reach of many medical centers.
With any new technology, the early adopters will suffer the wounds of the ‘bleeding edge’ to enable true transformation of the industry; it requires the foresight and will of those institutions to lead healthcare, and clinical and translational research, into the next era. In addition to the burden of commitment, any transformation bears the delicate challenges of change management. Our approach uses two main components, education through data literacy initiatives and tool training as users bring specific use cases, to transform how we think about an organizational data platform.
Adoption across the organization is varied, with some areas showing reluctance to adopt the change, while others are racing to embrace the new technology. The former group requires a significant, tactical commitment of resources to help users adopt the new approach to thinking about data. The latter group demands changes faster than the technology can be built. To remain on a successful trajectory, our program has adopted a concentrated approach to change management, which we feel is a departure from the traditional ‘build it and they will come’ mentality. Through these efforts, and mindful of the overall benefit of data globally, we are trying a wide variety of outreach, training, and communication strategies, and measuring the success of each so we can continuously optimize not only the technology we are building, but also the enterprise-wide adoption.