What You Don’t Know About Your Data Can Cost You

Do not give your data to a healthcare IT vendor just to be handed back your information with a beautiful wrapper.

The idea of semantic integration sits at the intersection of traditional interface technologies and a confluence of technological trends now shaking the healthcare IT world, including the cloud, the Internet of Things, big data, data lakes, analytics, and mobile interaction. The key strategies for healthcare improvement and value-based contracting all rely heavily on the competent, accurate, high-speed integration of disparate information from all types of sources. However, simple integration and patient information matching are not the entire story when you consider everything required to deliver quality and reduce costs while moving to an at-risk environment. The real task is creating a dynamic structural and semantic homogeneity among the myriad sources of patient information and delivering a clear, concise picture of that individual’s status to the point of care and to any other interaction with the patient. Achieving this requires a simple event model (SEM) approach integrated with a clinical event model (CEM) and supported by ontologies that can mitigate the semantic and syntactic incongruities that healthcare information technology professionals and clinicians face every day.

In recent posts I detailed how using machine learning to build semantic models for matching patient data from disparate systems is far and away superior to traditional methods that rely on deterministic and probabilistic algorithms in database systems. Yet once that data is matched and the "DNA" of that match is stored for future reference, we take a step back if the target organization wants to keep the raw data for analysis and population health work and we approach that persistence with traditional hierarchical data models. Instead of data models, using formal descriptions of the terms within the data and of the relationships among them provides a more ontological approach to overcoming the challenges of storing semantically correct information about a person, patient, physician, and so on. This idea may sound foreign to healthcare professionals, but it is more prevalent than you may recognize. Let’s simplify the concept a bit as we explore this promising approach to data persistence and use.

Consider the manner in which we classify items in our everyday lives according to their use or domain. A card means something totally different to a network engineer than it does to a technology salesperson. The network engineer hears the term “card” and thinks of a network interface card; the technology salesperson thinks of a prospect's business card. However, when we apply a “domain ontology” to the word “card,” the definition of the concept becomes associated with a particular "world." Granular meaning is then applied from the domain to create an associative model that allows additional concepts like network cable, router, and server to be freely associated with the concept of a card in the IT domain.

In healthcare, we have many domains, especially from the clinical perspective. Consider for a moment the concept of “shunt” and what it means to a general surgeon versus a neurosurgeon. While both may operate on a child to place a VP shunt, the anatomy and the definition of a successful outcome will be partly shared and partly different. The neurosurgeon will operate on the patient's cranium to place the shunt, and the general surgeon will operate on the patient's abdomen to place the distal end of the shunt. Both will share the "concept" of success as the draining of CSF from the brain to the abdomen. The neurosurgeon, however, also defines success as the relief of fluid pressure on the brain, while the general surgeon may define it as the successful placement of the distal end of the shunt in the patient's abdomen. This is perhaps an oversimplification, but the example represents well the complexities of merged domains, which have created many of the silos and information rationalization challenges in aggregating information sources to support population health and big data analytics in healthcare.
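To make the "card" and "shunt" examples concrete, here is a minimal Python sketch of a domain ontology as a lookup keyed by domain and term. Everything in it is an assumption made for illustration: the `Concept` class, the `ONTOLOGY` table, the domain names, and the related-concept labels are hypothetical and hand-written, not taken from any real clinical terminology or product.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Concept:
    """What a term means inside one domain, plus the concepts tied to it."""
    label: str
    related: frozenset


# Hypothetical, hand-written toy ontology keyed by (domain, term).
# A real system would learn or import these associations.
ONTOLOGY = {
    ("it", "card"): Concept(
        "network interface card",
        frozenset({"network cable", "router", "server"}),
    ),
    ("sales", "card"): Concept(
        "business card",
        frozenset({"prospect"}),
    ),
    ("neurosurgery", "shunt"): Concept(
        "VP shunt",
        frozenset({"cranium", "relief of pressure on the brain", "CSF drainage"}),
    ),
    ("general surgery", "shunt"): Concept(
        "VP shunt",
        frozenset({"abdomen", "distal end placement", "CSF drainage"}),
    ),
}


def resolve(term: str, domain: str) -> Optional[Concept]:
    """Return the meaning of `term` within `domain`, or None if unknown."""
    return ONTOLOGY.get((domain.lower(), term.lower()))


def shared_concepts(term: str, domain_a: str, domain_b: str) -> frozenset:
    """Return the related concepts both domains attach to the same term."""
    a, b = resolve(term, domain_a), resolve(term, domain_b)
    if a is None or b is None:
        return frozenset()
    return a.related & b.related


if __name__ == "__main__":
    print(resolve("card", "IT").label)
    print(sorted(shared_concepts("shunt", "neurosurgery", "general surgery")))
```

The same term resolves to a different concept in each domain, and the intersection of the two surgeons' related concepts is the shared definition of success, while the remaining concepts stay specific to each specialty.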

The persistence of information from many sources has to be a learning process in which the creation of ontologies, including the extraction of a domain's terms and logic from the information source, is close to 100% automated. Today, much of the data warehouse and big data effort in healthcare is bound by manual, extremely labor-intensive, and time-consuming work to build data models with hierarchical structures, all of which is quite possibly unnecessary. Information extraction and mining methods should automatically link ontologies to documents, data, and sources in context, using an extensible conceptual data modeling language. Today this is frequently done in the field of engineering, where concepts, domains, and context are as complex as in healthcare, if not more so. Extracting domain-specific knowledge from the data sources automatically provides the semantic modeling of the different data as a byproduct of the methodologies in this discipline of data persistence. This type of discipline is intended to express "facts," "answers," and/or "statements" about the domains and the data consumed in an easy, understandable, natural-language format. For example, in the preceding healthcare example, this discipline makes possible the complete, unambiguous consumption of clinical process data, business process data, and resource consumption information to build a performance view of quality metrics, financial performance, and contractual viability in an extremely IT system-diverse environment. Once in the learning data model, the learned concepts can be easily presented in an interpretable, language-independent fashion, for example through SQL or OWL (the Web Ontology Language), in less than half the time it would take to use traditional techniques of data modeling and hierarchical structure creation.
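As a rough illustration of what storing "facts" and "statements" rather than hierarchical tables can look like, the Python sketch below flattens records from two differently shaped sources into subject-predicate-object triples and renders them as plain sentences. It only shows the target representation, not the automated learning of an ontology. The `Triple` type, the helper functions, the field names, and the sample records are all hypothetical.

```python
from typing import Iterable, Iterator, List, NamedTuple, Optional


class Triple(NamedTuple):
    """One statement: subject, predicate, object."""
    subject: str
    predicate: str
    obj: str


def record_to_triples(record: dict, id_field: str) -> Iterator[Triple]:
    """Turn one flat record into triples about the entity named by id_field."""
    subject = str(record[id_field])
    for key, value in record.items():
        if key == id_field or value is None:
            continue
        yield Triple(subject, key, str(value))


def match(triples: Iterable[Triple],
          subject: Optional[str] = None,
          predicate: Optional[str] = None,
          obj: Optional[str] = None) -> List[Triple]:
    """Return triples matching every field that is not None."""
    return [
        t for t in triples
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.obj == obj)
    ]


def to_statement(triple: Triple) -> str:
    """Render a triple as a simple sentence."""
    return f"{triple.subject} {triple.predicate.replace('_', ' ')} {triple.obj}."


# Hypothetical records from two source systems with different shapes.
clinical = {"patient": "patient-001", "has_procedure": "VP shunt placement",
            "treated_by": "neurosurgeon"}
billing = {"patient": "patient-001", "billed_under": "contract-A",
           "denial_reason": None}

triples = [
    *record_to_triples(clinical, "patient"),
    *record_to_triples(billing, "patient"),
]

if __name__ == "__main__":
    for t in match(triples, subject="patient-001"):
        print(to_statement(t))
```

Because both sources land in the same three-part shape, one `match` call can answer a question that spans clinical and billing data without a shared hierarchical schema being designed first.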

Semantic integration of heterogeneous information is a time-consuming effort that is overcomplicated by the massive application portfolios of healthcare delivery systems, and it is proving to be the rate-limiting factor in advancing the market's volume-to-value shift. Leveraging the way humans associate information, coupled with the event models and the automated ontological data persistence approach, makes the complexity and cost of the big data challenge manageable. Reducing the technical expertise needed to query the information stored in this model increases access to more information, closer to real time, with correlations that are unobtainable in the older traditional approaches. There is a large body of research and successful projects in healthcare to support the movement of technology away from traditional models toward a high-velocity semantic integration strategy with an associative data persistence layer, as outlined in this blog post. The ability to rapidly explore patterns in both streaming and persisted data using the same methodology allows for the expedited discovery of patterns that support proactive, personalized delivery of care at a significantly lower cost.