What is a linter?
In software development, linters are essential tools that automatically flag programming errors, bugs, and stylistic issues. They help maintain code quality.
What if we applied this same concept to FHIR?
At Flexpa, we've been building what we call "FHIR Linting" - a system of automated fixes for common FHIR validation errors. This approach helps us maintain high-quality data while processing millions of healthcare records from the largest network of payer FHIR endpoints.
FHIR Linting builds on our transform pipeline that standardizes healthcare data, and complements our work on scaling FHIR data processing.
Why linting matters
Healthcare data is messy.
At Flexpa, we see this problem from an unusual vantage point. We source data from hundreds of different FHIR implementations across health plans, each with its own interpretation of the FHIR specification. These implementations differ in many small ways, including a long tail of syntactic and semantic errors.
When retrieving records from this diverse ecosystem, you'll encounter data that doesn't perfectly conform to the FHIR specification. There are a few approaches to handling this:
- Reject the data - This leads to gaps in records and frustrates users
- Ignore validation - This produces technically invalid data that causes downstream problems
- Manual fixes - This doesn't scale with millions of records
- Automated linting - This programmatically fixes common issues without changing the meaning
We've embraced the last option, creating a validation layer that detects and fixes common FHIR structural issues without altering the clinical meaning of the data.
What is "FHIR Linting"?
Our validation process works like this:
- First, we construct a FHIR Bundle that contains all the resources we want to validate
- We submit this Bundle to our internal $validate operation endpoint (built on top of Medplum)
- Medplum returns an OperationOutcome with detailed validation issues
- For each error in the OperationOutcome, we:
  - Extract the exact path to the problematic data element
  - Identify the error type from the error message
  - Apply an appropriate automated fix if possible
  - Skip errors that cannot be automatically fixed
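The per-error loop described above can be sketched roughly as follows. This is an illustrative Python sketch, not our production code: the `classify` function, the message substrings it matches, the lint names, and the `fixers` registry are all hypothetical. It assumes each issue reports its location in the `expression` field of the OperationOutcome, and that only `error` and `fatal` issues are candidates for a fix.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical fixer signature: receives the Bundle and the path reported
# by the validator, and mutates the Bundle in place.
Fixer = Callable[[dict, str], None]


def classify(issue: dict) -> Optional[str]:
    """Map a validation issue to a lint name (message patterns are made up)."""
    text = issue.get("details", {}).get("text") or issue.get("diagnostics") or ""
    text = text.lower()
    if "missing required" in text:
        return "missing-required"
    if "expected array" in text:
        return "array-type"
    return None  # unknown error type: no automated fix


def lint(bundle: dict, outcome: dict, fixers: Dict[str, Fixer]) -> List[dict]:
    """Apply a fixer for each error issue; return the issues left unfixed."""
    skipped = []
    for issue in outcome.get("issue", []):
        if issue.get("severity") not in ("error", "fatal"):
            continue  # assumption: warnings and information are left alone
        paths = issue.get("expression") or []
        fixer = fixers.get(classify(issue) or "")
        if fixer is None or not paths:
            skipped.append(issue)  # cannot be fixed automatically
            continue
        fixer(bundle, paths[0])
    return skipped
```

Returning the skipped issues rather than discarding them keeps unfixable errors visible to whatever runs after the lint pass.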
What's critical about this process is that fixes must not change the meaning of the data - in the same way that a software linter shouldn't change how the code runs.
This systematic approach allows us to target specific FHIR validation issues with precise fixes. Let's look at some of the common issues we address:
Missing Required Properties
FHIR requires certain fields to be present in resources. This is also known as having a minimum cardinality of 1.
When a required field is missing, we add it with a data-absent-reason extension that explicitly marks the data as unknown. Our Validation documentation, released last week, covers this lint in more detail.
Before:
{
"resourceType": "ExplanationOfBenefit",
"id": "EOB123",
"status": "active",
"outcome": "complete"
// Missing required "insurer" property
}
After:
{
"resourceType": "ExplanationOfBenefit",
"id": "EOB123",
"status": "active",
"outcome": "complete",
"insurer": {
"extension": [
{
"url": "http://hl7.org/fhir/StructureDefinition/data-absent-reason",
"valueCode": "unknown"
}
]
}
}
This approach satisfies validation requirements while explicitly communicating that the data was not provided in the original source.
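As a minimal sketch of this fix in Python (the function name `add_data_absent_reason` is hypothetical), the lint amounts to filling the missing field with an object that carries only the extension. The sketch handles top-level, complex-typed elements such as the `insurer` Reference; in FHIR JSON, extensions on primitive elements live in a sibling property prefixed with an underscore, which is not covered here.

```python
DATA_ABSENT_REASON_URL = "http://hl7.org/fhir/StructureDefinition/data-absent-reason"


def add_data_absent_reason(resource: dict, field: str) -> dict:
    """Return `resource` unchanged if `field` is present; otherwise return a
    shallow copy with `field` set to a data-absent-reason extension."""
    if field in resource:
        return resource
    fixed = dict(resource)  # shallow copy; the input is not mutated
    fixed[field] = {
        "extension": [{"url": DATA_ABSENT_REASON_URL, "valueCode": "unknown"}]
    }
    return fixed
```

Calling `add_data_absent_reason(eob, "insurer")` on the "Before" resource above produces the "After" resource.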
Array Type Corrections
FHIR expects certain fields to always be arrays, even when there's only one value. Some systems incorrectly represent these as singular values.
This is more common than you might expect. Our lint automatically "array-ifies" the value.
Before:
{
"resourceType": "DocumentReference",
"id": "doc123",
"author": {
"reference": "Practitioner/123",
"display": "Dr. Jane Smith"
} // Should be an array of references
}
After:
{
"resourceType": "DocumentReference",
"id": "doc123",
"author": [
{
"reference": "Practitioner/123",
"display": "Dr. Jane Smith"
}
] // Fixed as array of references
}
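A minimal Python sketch of the array-ifying step, assuming the caller already knows from the validation error which field should be a list. The function name `arrayify` is hypothetical.

```python
def arrayify(resource: dict, field: str) -> dict:
    """Wrap a singular value in a list; leave lists and absent fields alone."""
    value = resource.get(field)
    if value is None or isinstance(value, list):
        return resource  # nothing to fix
    fixed = dict(resource)  # shallow copy; the input is not mutated
    fixed[field] = [value]
    return fixed
```

The value itself is never modified, only wrapped, which is what keeps the fix from changing the meaning of the data.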
Cleaning Up Nullish Values
Empty strings, null values, and whitespace-only values often create validation errors.
Here, our lint drops the field.
Before:
{
"resourceType": "Condition",
"id": "cond123",
"subject": { "reference": "Patient/123" },
"code": {
"coding": [
{
"system": "http://snomed.info/sct",
"code": "123456",
"display": "" // Empty string is invalid
}
]
}
}
After:
{
"resourceType": "Condition",
"id": "cond123",
"subject": { "reference": "Patient/123" },
"code": {
"coding": [
{
"system": "http://snomed.info/sct",
"code": "123456"
// Empty display property removed
}
]
}
}
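One way to sketch this cleanup in Python is a recursive walk that drops nullish object properties. This is a simplification: it cleans the whole resource rather than only the paths reported by the validator, and the names `is_nullish` and `drop_nullish` are hypothetical. List items are deliberately not removed, because a `null` inside a FHIR JSON array can be a positional placeholder, and empty objects left behind after cleanup are not pruned.

```python
from typing import Any


def is_nullish(value: Any) -> bool:
    """True for None, empty strings, and whitespace-only strings."""
    return value is None or (isinstance(value, str) and value.strip() == "")


def drop_nullish(value: Any) -> Any:
    """Return a copy of `value` with nullish object properties removed."""
    if isinstance(value, dict):
        return {k: drop_nullish(v) for k, v in value.items() if not is_nullish(v)}
    if isinstance(value, list):
        # Recurse into items but keep every position in the list.
        return [drop_nullish(v) for v in value]
    return value
```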
Base64Binary Format Fixing
FHIR requires binary data to be properly base64 encoded, but sometimes systems provide raw text.
When this happens, our lint base64-encodes the raw value automatically.
Before:
{
"resourceType": "DiagnosticReport",
"id": "report123",
"presentedForm": [
{
"contentType": "text/html",
"data": "<div>This is a clinical note</div>" // Not base64 encoded
}
]
}
After:
{
"resourceType": "DiagnosticReport",
"id": "report123",
"presentedForm": [
{
"contentType": "text/html",
"data": "PGRpdj5UaGlzIGlzIGEgY2xpbmljYWwgbm90ZTwvZGl2Pg==" // Properly encoded
}
]
}
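A hedged Python sketch of this fix, using only the standard library. The helper names are hypothetical, and the check is a heuristic: a short plain-text value such as "test" happens to be valid base64, so in practice the fix should be applied only where the validator has already reported the value as invalid. The sketch also assumes the raw text should be encoded as UTF-8.

```python
import base64
import binascii


def is_base64(value: str) -> bool:
    """True if `value` decodes as strict base64 (alphabet and padding)."""
    try:
        base64.b64decode(value, validate=True)
        return True
    except (binascii.Error, ValueError):
        return False


def ensure_base64(value: str) -> str:
    """Encode raw text as base64; leave already-encoded values untouched."""
    if is_base64(value):
        return value
    return base64.b64encode(value.encode("utf-8")).decode("ascii")
```

Applied to the "Before" value above, `ensure_base64` returns the encoded string shown in the "After" example.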
Code Value Whitespace Trimming
Our newest lint addresses invalid code values.
According to the FHIR code datatype specification, "a code is restricted to a string which has at least one character and no leading or trailing whitespace, and where there is no whitespace other than single spaces in the contents".
Validation typically fails here because of extra whitespace around the value, most often at the end.
Before:
{
"resourceType": "Condition",
"code": {
"coding": [
{
"system": "http://snomed.info/sct",
"code": " 123456 " // Extra whitespace
}
]
}
}
After:
{
"resourceType": "Condition",
"code": {
"coding": [
{
"system": "http://snomed.info/sct",
"code": "123456" // Whitespace trimmed
}
]
}
}
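A small Python sketch of the trim, with hypothetical names. The regular expression is an illustrative rendering of the sentence quoted above, not a pattern copied from the specification. Only leading and trailing whitespace is stripped; internal whitespace is left alone, since rewriting it would go beyond what this lint describes.

```python
import re

# At least one character, no leading or trailing whitespace,
# and only single spaces between non-whitespace runs.
CODE_PATTERN = re.compile(r"\S+( \S+)*")


def is_valid_code(value: str) -> bool:
    """Check a string against the quoted rule for the code datatype."""
    return CODE_PATTERN.fullmatch(value) is not None


def trim_code(value: str) -> str:
    """Strip leading and trailing whitespace; internal whitespace is untouched."""
    return value.strip()
```

After trimming, a value can be re-checked with `is_valid_code`; anything that still fails (for example, a code with doubled internal spaces) is a candidate for skipping rather than fixing.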
Benefits
This automated approach to data cleaning provides several advantages:
- Improved data completeness - Customers get more complete records rather than missing data that fails validation
- Consistent data structure - Applications can rely on data conforming to FHIR standards
- Scalable processing - Fixes are applied automatically across millions of records
- Better developer experience - Less time spent handling edge cases means more time building great healthcare applications
FHIR Linting bridges the gap between real-world healthcare data and the ideal FHIR specification. By applying software development best practices to healthcare data processing, we're able to deliver higher quality data to developers while maintaining the clinical integrity of the original information.
This approach is just one of the ways Flexpa is working behind the scenes to make healthcare data more accessible and usable. By handling these technical details automatically, we allow developers to focus on building innovative applications with claims data.
Interested in learning more about how we handle FHIR records? Let us know about your specific use case!