Note 1: This post is part 2 of a three-part series on healthcare, knowledge graphs, and lessons for other industries. Part 1, “What Is a Knowledge Graph — and Why It Matters” is available here.
Note 2: All images by author
In Part 1, we described how structured knowledge enabled healthcare’s progress. This article examines why healthcare, more than any other industry, was able to build that structure at scale.
Healthcare is the most mature industry in the use of knowledge graphs for a few fundamental reasons. At its core, medicine is grounded in empirical science (biology, chemistry, pharmacology), which makes it possible to establish a shared understanding of the types of things that exist, how they interact, and what causes what. In other words, healthcare lends itself naturally to ontology.
The industry also benefits from a deep culture of shared controlled vocabularies. Scientists and clinicians are natural librarians. By necessity, they meticulously list and categorize everything they can find, from genes to diseases. This emphasis on classification is reinforced by a commitment to empirical, reproducible observation, where data must be comparable across institutions, studies, and time.
Finally, there are structural forces that have accelerated maturity: strict regulation; strong pre-competitive collaboration; sustained public funding; and open data standards. All of these factors incentivize shared standards and reusable knowledge rather than isolated, proprietary models.
Together, these factors created the conditions for healthcare to build durable, shared semantic infrastructure, allowing knowledge to accumulate across institutions, generations, and technologies.
Ontologies
Humans have always tried to understand how the world works. When we observe and record the same thing repeatedly, and agree that it is true, we develop a shared understanding of reality. This process is formalized in science through the scientific method. Scientists develop a hypothesis, conduct an experiment, and evaluate the results empirically. In this way, humans have been building an implicit medical ontology for hundreds of years.
Ötzi, the caveman discovered in 1991, who lived 5,300 years ago, was found with an antibacterial fungus in his leggings, likely to treat his whipworm infection (Kirsch and Ogas 4). Even cavemen had some understanding that plants could be used to treat ailments.
Eventually, scientists learned that it wasn’t the plant itself that was treating the ailment, but compounds inside the plant, and that they could alter the molecular structure of these compounds in the lab to make them stronger or more effective. This was the beginning of organic chemistry and how Bayer invented Aspirin (by tweaking willow bark) and Heroin (by tweaking opium from poppies) (Hager 75; Kirsch and Ogas 69). This added a new class to the ontology: compounds. With each new scientific breakthrough, our understanding of the natural world evolved, and we updated our ontology accordingly.

Over time, medicine developed a layered ontology, where each new class didn’t replace the previous one but extended it. The ontology grew to include pathogens after scientists Fritz Schaudinn and Erich Hoffmann discovered that the underlying cause of syphilis was a bacterium called Treponema pallidum. We learned that microbes could be found almost everywhere and that some of them, like the Penicillium mold that produces penicillin, could kill bacteria, so microbes were added to our ontology.

We learned that DNA contains genes, which encode proteins, which interact with biological processes and risk factors. Every major advance in medicine added new classes of things to our shared understanding of reality and forced us to reason about how those classes interact. Long before computers, healthcare had already built a layered ontology. Knowledge graphs didn’t introduce this way of thinking; they simply gave it a formal, computational substrate.
Today, we have ontologies for anatomy (Uberon), genes (Gene Ontology), chemical compounds (ChEBI), and hundreds of other domains. Repositories such as BioPortal and the OBO Foundry provide access to well over a thousand biomedical ontologies.
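To make the idea of a layered ontology concrete, here is a minimal sketch in plain Python. The class names, relation names, and the grouping into layers are illustrative assumptions based on the history above; they are not taken from Uberon, Gene Ontology, ChEBI, or any other published ontology.

```python
# Hypothetical class-level layers: each entry is (subject class, relation, object class).
# Every layer is an assumption made for illustration, loosely following the text.
LAYERS = [
    [("Plant", "treats", "Ailment")],
    [("Plant", "contains", "Compound"), ("Compound", "treats", "Ailment")],
    [("Pathogen", "causes", "Ailment")],
    [
        ("DNA", "contains", "Gene"),
        ("Gene", "encodes", "Protein"),
        ("Protein", "participates_in", "BiologicalProcess"),
    ],
]


def build_schema(layers):
    """Merge layers in order; later layers extend earlier ones and never replace them."""
    schema = []
    for layer in layers:
        schema.extend(layer)
    return schema


def reachable(start, schema):
    """Return every class reachable from `start` by following relations."""
    seen, frontier = set(), [start]
    while frontier:
        node = frontier.pop()
        for subj, _relation, obj in schema:
            if subj == node and obj not in seen:
                seen.add(obj)
                frontier.append(obj)
    return seen


SCHEMA = build_schema(LAYERS)
```

Calling `reachable("DNA", SCHEMA)` walks the chain from DNA through genes and proteins to biological processes, which is the kind of cross-class reasoning a knowledge graph makes computational.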
Controlled vocabularies
Once a class of things was defined, medicine immediately began naming and cataloging every instance it could find. Scientists are great at cataloging and defining instances of classes. De materia medica, the first pharmacopoeia, was completed in 70 CE. It was a book of about 600 plants and about 1000 medicines. When chemists began working with organic compounds in the lab, they created thousands of new molecules that needed to be cataloged. In response, the first volume of the Beilstein Handbook of Organic Chemistry was released in 1881. This handbook cataloged all known organic compounds, their reactions and properties, and grew to contain millions of entries.

This pattern repeats throughout the history of medicine. Every time our understanding of the natural world improved, and a new class was added to the ontology, scientists began cataloging all of the instances of that class. Following Louis Pasteur’s finding in 1861 that germs cause disease, people began cataloging all the pathogens they could find. In 1923, the first edition of Bergey’s Manual of Determinative Bacteriology was published, which contained about a thousand unique bacteria species.

The same pattern repeated with the discovery of genes, proteins, risk factors, and adverse effects. Today, we have rich controlled vocabularies for conditions and procedures (SNOMED CT), diseases (ICD-11), adverse effects (MedDRA), drugs (RxNorm), compounds (ChEBI and PubChem), proteins (UniProt), and genes (NCBI Gene). Most large pharma companies work with dozens of these third-party controlled vocabularies.
Somewhat confusingly, ontologies and controlled vocabularies are often blended in practice. Large controlled vocabularies frequently contain instances from multiple classes along with a lightweight semantic model (ontology) that relates them. SNOMED CT, for example, includes instances of diseases, symptoms, procedures, and clinical findings, as well as formally defined relationships such as has intent and due to. In doing so, it combines a controlled vocabulary with ontological structure, effectively functioning as a knowledge graph in its own right.
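The blend described above can be sketched in a few lines of plain Python: a vocabulary of typed terms, a small semantic model saying which classes each relationship may connect, and edges between terms. All identifiers and labels here are hypothetical placeholders, not real SNOMED CT codes or clinical assertions; only the relationship names due to and has intent come from the text.

```python
# Controlled vocabulary: hypothetical term IDs mapped to a label and a class.
TERMS = {
    "ex:1": {"label": "Bacterial pneumonia", "type": "Disease"},
    "ex:2": {"label": "Bacterial infection", "type": "Disease"},
    "ex:3": {"label": "Chest X-ray", "type": "Procedure"},
    "ex:4": {"label": "Diagnostic intent", "type": "Intent"},
}

# Lightweight ontology: relationship -> (allowed subject class, allowed object class).
# These domain/range pairs are assumptions made for this sketch.
ALLOWED = {
    "due_to": ("Disease", "Disease"),
    "has_intent": ("Procedure", "Intent"),
}

# Instance-level edges; vocabulary plus ontology plus edges form a small knowledge graph.
EDGES = [
    ("ex:1", "due_to", "ex:2"),
    ("ex:3", "has_intent", "ex:4"),
]


def is_valid(edge):
    """Check that an edge connects terms of the classes its relationship allows."""
    subj, relation, obj = edge
    domain, range_ = ALLOWED[relation]
    return TERMS[subj]["type"] == domain and TERMS[obj]["type"] == range_


def describe(term_id):
    """List human-readable statements whose subject is `term_id`."""
    return [
        f"{TERMS[s]['label']} {r} {TERMS[o]['label']}"
        for s, r, o in EDGES
        if s == term_id
    ]
```

In this sketch, `is_valid` rejects an edge such as a procedure that is "due to" a disease, which is the kind of constraint the semantic model adds on top of a plain list of terms.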
Regulations
Following a mass poisoning that killed 107 people due to an improperly prepared “elixir” in 1937, the US government gave the Food and Drug Administration (FDA) increased regulatory powers (Kirsch and Ogas 97). The Federal Food, Drug, and Cosmetic Act of 1938 set requirements for how drugs should be labeled and required drug manufacturers to submit safety data and a statement of “intended use” to the FDA. This helped the US largely avoid the thalidomide tragedy of the late 1950s in Europe, where a tranquilizer was prescribed to pregnant women to treat anxiety, trouble sleeping, and morning sickness, despite never having been tested on pregnant women. This caused the “largest anthropogenic medical disaster ever”, during which thousands of women suffered miscarriages and more than 10,000 babies were born with severe deformities.
While the US largely avoided this because of FDA reviewer caution, the tragedy also exposed gaps in the system. The Kefauver-Harris Amendments to the Federal Food, Drug, and Cosmetic Act in 1962 required proof that drugs were both safe and effective. The increased power of the FDA in 1938, and again in 1962, forced healthcare to standardize on the meaning of terms. Drug companies were forced to agree upon indications (what the drug is meant for), conditions (what the drug treats), adverse effects (what other conditions have been associated with the drug), and clinical outcomes. Increased regulatory pressure also required replicable, well-controlled studies for all claims made about a drug. Regulation didn’t just demand safer drugs; it demanded shared meaning.
Observational data
These regulatory changes didn’t just affect approval processes; they fundamentally reshaped how medical observations were generated, structured, and compared. To make clinical evidence comparable, reviewable, and replicable, data standards for clinical trials were codified through organizations like the Clinical Data Interchange Standards Consortium (CDISC). CDISC defines how clinical observations, endpoints, and populations must be represented for regulatory review. Likewise, the FDA turned the shared terminologies cataloged in controlled vocabularies from best practice into a requirement.
Pre-competitive collaboration
One of the enabling factors that has led healthcare to dominate in knowledge graphs is pre-competitive collaboration. A lot of the work of healthcare is grounded in natural sciences like biology and chemistry that are treated as a public good. Companies still compete on products, but most consider a large portion of their research “pre-competitive.” Organizations like the Pistoia Alliance facilitate this collaboration by providing neutral forums to align on shared semantics and infrastructure (see data standards section below).
Public funding
Public funding has been essential to building healthcare’s knowledge infrastructure. Governments and public research institutions have invested heavily in the creation and maintenance of ontologies, controlled vocabularies, and large-scale observational data that no single company could afford to build alone. Agencies such as the National Institutes of Health (NIH) fund many of these assets as public goods, leaving healthcare with a rich, open knowledge base ready to be connected and reasoned over using knowledge graphs.
Data standards
Healthcare also embraced open data standards early, ensuring shared knowledge could be represented and reused across systems and vendors. Standards from the World Wide Web Consortium (W3C) made medical knowledge machine-readable and interoperable, allowing semantic models to be shared independently of any single system or vendor. By anchoring meaning in open standards rather than proprietary schemas, healthcare enabled knowledge graphs to function as shared, long-lived infrastructure rather than isolated implementations. Standards ensured that meaning could survive system upgrades, vendor changes, and decades of technological churn.
Conclusion
None of these factors alone explains healthcare’s maturity. It is their interaction over decades (ontology shaping vocabularies, regulation enforcing evidence, funding sustaining shared infrastructure, and standards enabling reuse) that made knowledge graphs inevitable rather than optional. Long before modern AI, healthcare invested in agreeing on what things mean and how observations should be interpreted. In the final part of this series, we’ll explore why most other industries lack these conditions, and what they can realistically borrow from healthcare’s path.
About the author: Steve Hedden is the Head of Product Management at TopQuadrant, where he leads the strategy for EDG, a platform for knowledge graph and metadata management. His work focuses on bridging enterprise data governance and AI through ontologies, taxonomies, and semantic technologies. Steve writes and speaks regularly about knowledge graphs and the evolving role of semantics in AI systems.
Bibliography
Hager, Thomas. Ten Drugs: How Plants, Powders, and Pills Have Shaped the History of Medicine. Harry N. Abrams, 2019.
Isaacson, Walter. The Code Breaker: Jennifer Doudna, Gene Editing, and the Future of the Human Race. Simon & Schuster, 2021.
Kirsch, Donald R., and Ogi Ogas. The Drug Hunters: The Improbable Quest to Discover New Medicines. Arcade, 2017.

