MEDCIN Graph Database Technology

The MEDCIN clinical relevancy engine is at the core of all Medicomp products. The two major components of MEDCIN are the clinical content and the database technology that runs the engine.

For over 40 years, Medicomp has worked with physicians from leading medical centers and institutions to develop MEDCIN’s clinical content. Today, MEDCIN includes more than 400,000 clinical concepts with hundreds of millions of clinical relevancy links. Processing that volume of information in real time requires a massively scalable, highly efficient database engine that minimizes run time and supports artificial intelligence and machine learning technologies.

Because of the massive number of clinical relevancy links and the hierarchical relationships between concepts, MEDCIN uses a purpose-built graph database to support the engine’s processing speed requirements and the scalability of the knowledge base. To understand the complexities of powering the MEDCIN engine, consider the concept of “diagnosis of left-sided congestive heart failure.”

The MEDCIN concept for left-sided congestive heart failure includes relevancy links to more than 380 MEDCIN concepts, including the symptom “difficulty breathing” and the diagnosis “coronary artery disease.” Difficulty breathing, in turn, has more than 1,200 links to other MEDCIN diagnosis concepts, while coronary artery disease has more than 200 links to MEDCIN concepts for symptoms, histories, physical exams, tests and therapies. Because of the vast number of linkages between data points (i.e., MEDCIN concepts), the MEDCIN engine needs extremely efficient methods to quickly traverse all clinically related “nodes.”

Traditional relational database technologies use tables, and “joins” or indices are used to navigate from one table to another. Relational database technologies were not designed to support the type of complex, hierarchical and connected relationships that a clinical data engine requires. In a scenario where there are millions of links between data points and a need to access multiple concepts for any single data point, relational databases are not an efficient alternative.

Unlike a relational database that uses fixed tables, a graph database contains a collection of “nodes” and “edges.” Each node represents an object, such as a single MEDCIN concept in the MEDCIN engine. An “edge” is the connection or relationship between two objects, such as two MEDCIN concepts. Each edge has a unique identifier and records its starting and ending nodes, along with a set of properties, such as a clinical relevancy “score.”
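To make the node-and-edge model concrete, here is a minimal sketch in Python. It is not Medicomp's implementation (the engine itself is written in C); the class names, the concept identifiers, and the relevancy scores are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class Edge:
    edge_id: int    # unique identifier of this edge
    source: int     # starting node (concept identifier)
    target: int     # ending node (concept identifier)
    relevancy: int  # illustrative relevancy score; higher means more relevant


@dataclass
class ConceptGraph:
    names: Dict[int, str] = field(default_factory=dict)
    outgoing: Dict[int, List[Edge]] = field(default_factory=dict)
    next_edge_id: int = 1

    def add_node(self, node_id: int, name: str) -> None:
        self.names[node_id] = name
        self.outgoing.setdefault(node_id, [])

    def add_edge(self, source: int, target: int, relevancy: int) -> None:
        edge = Edge(self.next_edge_id, source, target, relevancy)
        self.outgoing[source].append(edge)
        self.next_edge_id += 1

    def related(self, node_id: int, min_relevancy: int = 0) -> List[str]:
        """Names of directly linked concepts, most relevant first."""
        edges = [e for e in self.outgoing.get(node_id, [])
                 if e.relevancy >= min_relevancy]
        edges.sort(key=lambda e: e.relevancy, reverse=True)
        return [self.names[e.target] for e in edges]


# Made-up identifiers and scores, not real MEDCIN data.
graph = ConceptGraph()
graph.add_node(100001, "left-sided congestive heart failure")
graph.add_node(100002, "difficulty breathing")
graph.add_node(100003, "coronary artery disease")
graph.add_edge(100001, 100002, relevancy=9)
graph.add_edge(100001, 100003, relevancy=6)
print(graph.related(100001))
```

Following a link here is a dictionary lookup plus a list scan, with no table join involved, which is the property the graph model is chosen for.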

Every MEDCIN concept or, in graph database parlance, “node,” is identified by a unique, permanent, non-contextual integer. MEDCIN’s graph database is a set of optimized binary run-time files. For any given MEDCIN concept, these files store “chunks of bytes” with pointers to the other nodes it links to, so the engine can jump from memory location to memory location to reach all clinically related concepts.
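The idea of binary files holding “chunks of bytes” with pointers can be sketched as an index table that maps each concept identifier to the byte offset of its chunk of links. The layout below is purely an assumption for illustration (the actual MEDCIN file format is not described in this article), and the function names, field widths, and sample data are hypothetical.

```python
import struct

# Assumed layout: [node count][index entries][link chunks]
INDEX_ENTRY = struct.Struct("<III")  # concept id, byte offset of chunk, link count
LINK = struct.Struct("<IH")          # target concept id, relevancy score


def pack_graph(links):
    """Serialize {concept_id: [(target_id, score), ...]} into one byte string."""
    ids = sorted(links)
    header = struct.pack("<I", len(ids))
    chunk_start = len(header) + INDEX_ENTRY.size * len(ids)
    index = bytearray()
    chunks = bytearray()
    for cid in ids:
        index += INDEX_ENTRY.pack(cid, chunk_start + len(chunks), len(links[cid]))
        for target, score in links[cid]:
            chunks += LINK.pack(target, score)
    return header + bytes(index) + bytes(chunks)


def read_links(blob, concept_id):
    """Binary-search the sorted index, then jump straight to the concept's chunk."""
    (count,) = struct.unpack_from("<I", blob, 0)
    lo, hi = 0, count - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        cid, offset, n = INDEX_ENTRY.unpack_from(blob, 4 + mid * INDEX_ENTRY.size)
        if cid == concept_id:
            return [LINK.unpack_from(blob, offset + i * LINK.size) for i in range(n)]
        if cid < concept_id:
            lo = mid + 1
        else:
            hi = mid - 1
    return []


# Made-up identifiers and scores, not real MEDCIN data.
blob = pack_graph({
    100001: [(100002, 90), (100003, 75)],
    100002: [(100001, 60)],
    100003: [],
})
print(read_links(blob, 100001))
```

Because every record has a fixed size, reaching a concept's links takes a short index search followed by a single offset jump, with no parsing of unrelated data.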

The underlying code for MEDCIN’s graph database and engine is written in ANSI Standard C. This low-level implementation is highly portable: it runs on devices, in the cloud, or in a container, and does not require a dedicated machine. The MEDCIN engine is scalable for server farms with load-balancing capabilities and has been commercially implemented on Amazon Web Services and Microsoft’s Azure cloud computing services.

MEDCIN Concepts as Clinical Data

Each clinical concept in MEDCIN is a pre-coordinated clinical proposition that has meaning to a clinician at the point of care.

Every implementation of MEDCIN in an EHR must support the way clinicians think and work; it also must adhere to sound vocabulary and data governance processes. Critical aspects of the MEDCIN concept structures include:

Permanent, Unique, Non-semantic concept identifiers
Each MEDCIN concept is assigned a permanent, unique six-digit numeric identifier that is never eliminated or duplicated, and has no semantic significance.

Hierarchy and Poly-hierarchy
MEDCIN clinical concepts and the MEDCIN engine are designed to support two different types of hierarchy: one for terminological navigation, to move from general to more specific concepts within a “tree” structure; and one to support multiple diagnostic links between MEDCIN concepts and specific diagnoses.

The terminological navigation hierarchy enables a clinical user to quickly move from a general concept such as “sputum” to more specific sputum variants such as “purulent” or “blood-streaked.” The navigational hierarchy includes 10 levels to support evolving concept granularity.

More importantly, the linkage hierarchy in MEDCIN supports the engine’s ability to link a single MEDCIN concept to thousands of clinical diagnoses, independent of the concept’s position in the navigational hierarchy and independent of its links to other diagnoses. Because the presence, or the absence, of a specific concept can be clinically relevant, MEDCIN’s linkage hierarchy is the key to its usability at the point of care.
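The two hierarchies can be pictured as two independent structures over the same concepts: a tree for navigation and a separate many-to-many mapping for diagnostic links. The sketch below assumes this simple representation; the function names are hypothetical, the diagnoses are neutral placeholders rather than clinical content, and enforcing the 10-level limit in code is an assumption.

```python
from collections import defaultdict

MAX_DEPTH = 10  # the text mentions 10 navigational levels; enforcing it is an assumption

parent = {}                         # navigational hierarchy: child -> single parent
diagnosis_links = defaultdict(set)  # linkage hierarchy: concept -> linked diagnoses


def depth(concept):
    """Level of a concept in the navigation tree (a root is level 1)."""
    level = 1
    while concept in parent:
        concept = parent[concept]
        level += 1
    return level


def add_child(parent_concept, child_concept):
    if depth(parent_concept) + 1 > MAX_DEPTH:
        raise ValueError("navigational hierarchy is limited to MAX_DEPTH levels")
    parent[child_concept] = parent_concept


def path_from_root(concept):
    """General-to-specific path, e.g. for a drill-down user interface."""
    path = [concept]
    while path[-1] in parent:
        path.append(parent[path[-1]])
    return path[::-1]


def link_diagnosis(concept, diagnosis):
    """Diagnostic links ignore where the concept sits in the tree."""
    diagnosis_links[concept].add(diagnosis)


add_child("sputum", "purulent sputum")
add_child("sputum", "blood-streaked sputum")
# Placeholder diagnoses, not real relevancy data.
link_diagnosis("blood-streaked sputum", "diagnosis A")
link_diagnosis("blood-streaked sputum", "diagnosis B")
link_diagnosis("sputum", "diagnosis C")

print(path_from_root("blood-streaked sputum"))
print(sorted(diagnosis_links["blood-streaked sputum"]))
```

Keeping the two structures separate means a child concept does not inherit its parent's diagnostic links, which mirrors the independence the text describes.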

In addition to the MEDCIN concept itself, there are several parameters which are used to modify the clinical meaning of a MEDCIN concept when it is presented to a clinician at the point of care.

The most critical of these parameters are the prefix, modifier, and status:

Breast Cancer: Maternal History
Breast Cancer: Personal History
Breast Cancer: Maternal History • Stage 2
Breast Cancer: Personal History • Stage 4
Breast Cancer: Maternal History • Stage 2 • Remission
Breast Cancer: Personal History • Stage 4 • Year 2 Chemotherapy

These parameters provide further specificity to a clinical concept. With a well-designed clinical user interface, these parameters are readily available in the appropriate context at the time of need. One of the keys to MEDCIN’s usability is this powerful combination of a MEDCIN clinical concept with appropriate parameters.
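As a rough sketch, a documented finding can be modeled as a concept plus optional prefix, modifier, and status fields. The class name, the display format, and the assignment of the example phrases to each field (“Maternal History” as prefix, “Stage 2” as modifier, “Remission” as status) are assumptions made for illustration, not MEDCIN’s actual data model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Finding:
    concept: str                    # the MEDCIN concept itself
    prefix: Optional[str] = None    # assumed: e.g. "Maternal History"
    modifier: Optional[str] = None  # assumed: e.g. "Stage 2"
    status: Optional[str] = None    # assumed: e.g. "Remission"

    def render(self) -> str:
        """Display string in the 'Concept: A • B • C' style used above."""
        parts = [p for p in (self.prefix, self.modifier, self.status) if p]
        if not parts:
            return self.concept
        return f"{self.concept}: " + " • ".join(parts)


finding = Finding("Breast Cancer", "Maternal History", "Stage 2", "Remission")
print(finding.render())
```

The concept stays the same across all variants; only the parameters change, so the same concept identifier can carry different clinical meanings in context.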

MEDCIN Concept Properties

Each MEDCIN concept is pre-coordinated to work the way that clinicians process information, and includes properties that support computability, documentation, and presentation. Some of these properties include:

MEDCIN concepts have been included in the UMLS Metathesaurus since 2008, are updated semi-annually, and carry a Category 3 license designation.

More than an Engine

In addition to the clinical relevancy scores and code mappings, MEDCIN concepts support documentation requirements for evaluation and management (E&M) coding, can track compliance with clinical quality measures, and enable risk adjustment calculations for Medicare Advantage hierarchical condition categories (HCC).


Billing & Reimbursement
Hierarchical condition category (HCC)
Clinical quality measures (CQM)
Care coordination
Interoperability (HL7, FHIR v4)

The MEDCIN Engine is at the core of all Medicomp products, including its suite of Quippe solutions:

As the building of the MEDCIN engine shows, how data is integrated and how a data engine is conceived are key to its usefulness. Developing a data engine is not the challenge; creating one with a purpose is. Knowing the potential of current and future data, and building an engine that solves today's issues and can pivot to address tomorrow's problems, is why the underlying foundation is so important. In the healthcare industry, data engines do not have the luxury of time: physicians and patients need clinical data available at the point of care in order to make decisions that improve clinical outcomes.
