September 8, 2015
Healthcare providers need data model flexibility now.
The US healthcare industry’s ongoing business model shift is having a considerable impact on how providers collect, analyze, and store data. Although the fee-for-service model still dominates, payers are refining pay-for-performance programs, which measure and reward the value, not the volume, of healthcare delivered. To get paid for performance, providers are finding they need data collection and analysis capabilities that are different from those that relational databases deliver.
The pay-for-performance model encourages providers to focus less on the most aggressive treatment of each distinct ailment and more on a balanced approach to the patient’s overall health. For example, a payer might seek evidence that a provider has successfully brought blood pressure, blood glucose and lipid levels under control for a given patient population with both high blood pressure and diabetes.
By building in incentives, program advocates hope to reduce the overall cost and to improve the quality of care. More than 14 percent of Medicare beneficiaries—nearly 8 million people—already receive care through a pay-for-performance program, and the Obama administration aims to reach 30 percent by the end of 2016.
Pay for performance triggers changes in the data collection and analysis that surround patient care. Providers need to document, aggregate, and share detailed analyses of the impact of their care on overall patient health. Reliable and persuasive health outcome reporting is essential to the billing process and the success of pay for performance.
To achieve successful outcome reporting, providers such as Avera (a chain of nonprofit hospitals, clinics, and nursing homes in the American Midwest) know they need to run longitudinal queries on the same patient populations over time across a continuum of care. Unfortunately, electronic medical record (EMR) databases, the primary data source, weren’t designed for longitudinal analysis. Most EMR data models are visit oriented to facilitate the old fee-for-service billing.
To address this challenge, Avera uses Apache CouchDB, a NoSQL document store, as a bridge between the EMR and the healthcare provider’s analytics systems [1]. NoSQL document stores such as CouchDB, MongoDB, or Microsoft DocumentDB can offer an expedient bridge between a rigid EMR database and a more permissive, versatile, general-purpose data repository.
Healthcare is just one domain where a sea change is under way in database technology. This article provides a deeper look at the NoSQL document store and how it differs from the relational database.
How document stores and relational databases compare
Unlike a typical EMR relational database, a document store accommodates data in many structures and formats and allows the longitudinal queries necessary for patient outcome reporting purposes. This kind of flexibility pays off when working with all the different data formats providers collect.
Take the case of diabetics who’ve also had high cholesterol levels during three visits over 12 months. With a document store, providers can preserve the full fidelity of the data, regardless of its structure or quality. The CouchDB database that Avera uses identifies a test result as an element connected to its parent, the low-density lipoprotein (LDL) cholesterol test. The tags describing that relationship indicate the data type, such as a fax image, a PDF, or another file type. Any query in CouchDB would indicate the presence of a lab result and its data type, even if the user can’t read the result value itself.
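The longitudinal question above can be sketched in plain Python over in-memory documents. This is only an illustration, not Avera's implementation or CouchDB's query API: the field names (`patient_id`, `conditions`, `visits`, `labs`, `type`), the sample patients, and the 160 mg/dL cutoff are all assumptions made for the example.

```python
from datetime import date, timedelta

HIGH_LDL_MG_DL = 160  # illustrative cutoff only, not clinical guidance

# Hypothetical patient documents; every field name here is an assumption.
patients = [
    {
        "patient_id": "p1",
        "conditions": ["diabetes", "hypertension"],
        "visits": [
            {"date": date(2015, 1, 10), "labs": [{"test": "LDL", "type": "numeric", "value": 172}]},
            {"date": date(2015, 4, 2), "labs": [{"test": "LDL", "type": "numeric", "value": 165}]},
            {"date": date(2015, 8, 20), "labs": [{"test": "LDL", "type": "numeric", "value": 181}]},
        ],
    },
    {
        "patient_id": "p2",
        "conditions": ["diabetes"],
        "visits": [
            {"date": date(2015, 2, 1), "labs": [{"test": "LDL", "type": "numeric", "value": 190}]},
            # Result arrived as a fax image: present and typed, but no readable value.
            {"date": date(2015, 6, 15), "labs": [{"test": "LDL", "type": "fax_image", "file": "scan-0615.tif"}]},
        ],
    },
]


def ldl_results(patient):
    """Yield (visit date, data type, value or None) for every LDL result."""
    for visit in patient.get("visits", []):
        for lab in visit.get("labs", []):
            if lab.get("test") == "LDL":
                yield visit["date"], lab.get("type", "unknown"), lab.get("value")


def has_repeated_high_ldl(patient, as_of, min_visits=3, window_days=365):
    """True for a diabetic patient with high LDL at min_visits visits in the window."""
    if "diabetes" not in patient.get("conditions", []):
        return False
    start = as_of - timedelta(days=window_days)
    high_dates = {
        visit_date
        for visit_date, _type, value in ldl_results(patient)
        if start <= visit_date <= as_of
        and isinstance(value, (int, float))
        and value >= HIGH_LDL_MG_DL
    }
    return len(high_dates) >= min_visits


cohort = [p["patient_id"] for p in patients
          if has_repeated_high_ldl(p, as_of=date(2015, 9, 8))]
```

Note that `ldl_results` still reports the faxed result for the second patient, with its data type and a value of `None`, so the presence of the test is visible even though the number cannot be read.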
Contemporary relational databases can store data in various forms, but they weren’t originally designed for that purpose, which limits their utility for querying. To function properly, queries to relational databases generally require data to be in a specifically keyed, tabular form with predetermined, explicit schema (that is, predefined tables, fields, views, indices, and other elements) and all the table cells filled in.
Document objects, by comparison, may conform to a schema (a data description list matched to the tabular data), but the schema is rarely explicit and can be unique to each document in the store. Document stores typically support familiar schemas but won’t strictly enforce them.
Document stores treat data and metadata equally within documents, and they aren’t fussy about whether each document has a particular tag or not. In CouchDB, for example, running a map function on lastName creates a view of the documents that have the lastName attribute (or tag), which a user can then query. Users can either run multiple queries against the collections that have this tag or embed all the documents within a single document or collection. All the data elements can then describe a constantly changing patient, rather than a single transactional visit.
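The map-and-view idea can be imitated in a few lines of Python. CouchDB map functions are normally written in JavaScript and run inside the database, so the sketch below is a stand-in for the concept rather than CouchDB code; `build_view`, `query_view`, and the sample documents are hypothetical.

```python
def map_last_name(doc):
    """Emit a (key, value) row only for documents that carry a lastName field."""
    if "lastName" in doc:
        yield doc["lastName"], doc["_id"]


def build_view(docs, map_fn):
    """Apply map_fn to every document and return the emitted rows sorted by key."""
    return sorted(row for doc in docs for row in map_fn(doc))


def query_view(view, key):
    """Return the values of all rows whose key matches."""
    return [value for row_key, value in view if row_key == key]


# Documents in the same store do not need to share a structure.
docs = [
    {"_id": "a1", "lastName": "Olson", "firstName": "Kari"},
    {"_id": "a2", "type": "lab_result", "test": "LDL"},  # no lastName: skipped by the map
    {"_id": "a3", "lastName": "Berg"},
]

view = build_view(docs, map_last_name)
olson_ids = query_view(view, "Olson")
```

Documents without the attribute are simply left out of the view; nothing has to be backfilled or altered to make the query work.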
By contrast, adding new tables to relational databases can complicate the data model substantially, resulting in foreign key (that is, table linking) metadata that demands consistency across tables. This metadata in essence occurs between the tables and can include patient name metadata that needs to be queried. Treating the metadata the same as the data, which is what happens in document stores, improves ease of use.
Finally, in document stores, the documents themselves can contain hierarchical collections of subdocuments. This capability suits continuous, small, volatile read and write operations, such as telemetry from a bedside monitor.
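As a rough sketch of what a hierarchical document might look like, the Python below nests telemetry readings inside an encounter, which in turn sits inside a patient document. The layout and every field name (`encounters`, `telemetry`, `heart_rate`, `spo2`) are assumptions for illustration, not a schema from any particular product.

```python
import json

# One patient document with a nested encounter subdocument.
patient_doc = {
    "_id": "patient-001",
    "name": {"first": "Kari", "last": "Olson"},
    "encounters": [
        {"encounter_id": "e1", "unit": "ICU", "telemetry": []},
    ],
}


def append_reading(doc, encounter_id, reading):
    """Append one small telemetry reading to the matching encounter subdocument."""
    for encounter in doc["encounters"]:
        if encounter["encounter_id"] == encounter_id:
            encounter.setdefault("telemetry", []).append(reading)
            return
    raise KeyError(f"unknown encounter: {encounter_id}")


# Readings do not need identical fields; the second one adds spo2.
append_reading(patient_doc, "e1", {"t": "2015-09-08T10:00:00Z", "heart_rate": 72})
append_reading(patient_doc, "e1", {"t": "2015-09-08T10:00:05Z", "heart_rate": 75, "spo2": 97})

# The whole hierarchy serializes as a single JSON document.
serialized = json.dumps(patient_doc)
```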
When to use NoSQL document stores
For most enterprises in most industries, document stores will augment, rather than replace, relational databases. For write-heavy workloads, relational technology continues to be the norm. Relational databases are built for write-intensive transaction processing in which immediate consistency is mandatory, a guarantee known as ACID (atomicity, consistency, isolation, and durability) compliance. Because they enforce immediate consistency, they allow thousands of users to concurrently trigger changes in the data. Mission-critical transactional systems will continue to be relational.
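A small example may make the atomicity and consistency guarantees concrete. The sketch below uses SQLite through Python's standard `sqlite3` module; the `accounts` table and the `transfer` helper are hypothetical, chosen only to show a multi-statement write that either fully applies or fully rolls back.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    " id TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("patient", 100), ("clinic", 0)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move amount from src to dst; both updates apply, or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute("UPDATE accounts SET balance = balance + ? WHERE id = ?", (amount, dst))
            conn.execute("UPDATE accounts SET balance = balance - ? WHERE id = ?", (amount, src))
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected a negative balance; the credit was undone too.
        return False


def balances(conn):
    return dict(conn.execute("SELECT id, balance FROM accounts"))


overdraft_ok = transfer(conn, "patient", "clinic", 150)  # violates the CHECK constraint
after_overdraft = balances(conn)
payment_ok = transfer(conn, "patient", "clinic", 40)
after_payment = balances(conn)
```

The failed transfer leaves both rows exactly as they were, even though the first `UPDATE` had already run inside the transaction.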
Many of the other distinctions between relational databases and document stores involve the heritage and maturity of these different types. Fundamentally, the distinctions boil down to a tradeoff. Relational databases have maturity on their side; a broad range of management and visualization tools are available, as well as comprehensive security and transactional integrity features. Document stores, by contrast, are horizontally scalable; they are generally designed to scale out across clusters and between data centers, and they have developer ease of use and flexibility on their side.
Pros and cons of document stores and relational databases
Document store workloads are read intensive and generally designed to be highly available. NoSQL document stores relax ACID requirements somewhat, and many—such as MongoDB, CouchDB, and DocumentDB—claim tunable consistency. But consistency expectations coming from the relational data world can’t always be met in the document world, nor should they be. The focus is on the richness of the data for analytics purposes and thus read capabilities are more important than write capabilities. For analytics and auditability, the principle of immutability—preserving versions intact for data mining, audit trail, and historical consistency purposes—becomes a higher priority than immediate writability. As storage costs continue to decline, immutability becomes more feasible. In healthcare as in other industries, the absence of a transaction is as important as its presence, and inconsistencies or gaps can be crucial clinical indicators, rather than something that needs normalizing, smoothing, or interpolating.
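The immutability principle can be illustrated with a toy append-only store in Python. This is a conceptual sketch under assumed names (`VersionedStore`, `put`, `get`, `history`), not the revision mechanism of any specific document database: each write adds a new snapshot instead of overwriting the previous one.

```python
import copy


class VersionedStore:
    """Toy append-only store: every put adds a revision, nothing is overwritten."""

    def __init__(self):
        self._versions = {}  # doc_id -> list of snapshots, oldest first

    def put(self, doc_id, doc):
        """Store a snapshot of doc and return its 1-based revision number."""
        history = self._versions.setdefault(doc_id, [])
        history.append(copy.deepcopy(doc))
        return len(history)

    def get(self, doc_id, rev=None):
        """Return the latest revision, or a specific one when rev is given."""
        history = self._versions[doc_id]
        snapshot = history[-1] if rev is None else history[rev - 1]
        return copy.deepcopy(snapshot)

    def history(self, doc_id):
        """Return all revisions, oldest first, for audit or data mining."""
        return copy.deepcopy(self._versions[doc_id])


store = VersionedStore()
rev1 = store.put("patient-001", {"ldl": 172, "status": "uncontrolled"})
rev2 = store.put("patient-001", {"ldl": 128, "status": "controlled"})

latest = store.get("patient-001")
original = store.get("patient-001", rev=1)
```

Because earlier revisions stay intact, an auditor can reconstruct what the record said at any point, which is the property the paragraph above treats as more important than fast in-place writes.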
Document stores also allow developers to be much more efficient. The favored way to access document stores is through RESTful application programming interfaces (APIs), which is one reason for their efficiency. Another element of efficiency is ease of use. Click-to-deploy capability is emerging for Amazon DynamoDB, MongoDB, and DocumentDB, for example.
Healthcare organizations will increasingly explore NoSQL document stores as a means of gaining insight into their growing data lakes. Providers are largely technical, science-driven professionals and will acquire the skills they need to work with NoSQL document stores as long as NoSQL offers competitive advantage and freedom from relational constraints.
However, relational databases for transaction processing will continue to precede document stores in the data pipeline. For healthcare and many other industries, weak ACID compliance is a showstopper for any critical transaction processing purpose. Some document stores claim ACID compliance, but the approach to consistency differs from that of relational databases. Document stores have gained in popularity, and enterprises will need to become more familiar with their capabilities and limitations.
Over the next several years, NoSQL document stores are expected to expand their role of ingesting relational data to feed auxiliary systems such as clinical reporting and advanced analytics, and enterprises will gain familiarity with their advantages and limitations for various purposes.
[1] Structured query language, or SQL, is the dominant query language associated with relational databases. NoSQL stands for not only structured query language. In practice, the term NoSQL is used loosely to refer to non-relational databases designed for distributed environments, rather than the associated query languages. PwC uses the term NoSQL, despite its inadequacies, to refer to non-relational distributed databases because it has become the default term of art. See the section “Database evolution becomes a revolution” in “Enterprises hedge their bets with NoSQL databases” for more information on relational versus non-relational database technology.