FHIR Linting: Automating Validation Fixes

Applying software development best practices to healthcare data cleanup - how Flexpa automates data quality at scale.

May 5, 2025 • Joshua Kelly

What is a linter?

In software development, linters are essential tools that automatically flag programming errors, bugs, and stylistic issues. They help maintain code quality.

What if we applied this same concept to FHIR?

At Flexpa, we've been building what we call "FHIR Linting" - a system of automated fixes for common FHIR validation errors. This approach helps us maintain high-quality data while processing millions of healthcare records from the largest network of payer FHIR endpoints.

FHIR Linting builds on our transform pipeline that standardizes healthcare data, and complements our work on scaling FHIR data processing.

Why linting matters

Healthcare data is messy.

At Flexpa, we see this problem from an unusual vantage point. We source data from hundreds of different FHIR implementations across health plans, each with its own interpretation of the FHIR specification. These implementations differ in many small ways, and they produce a wide variety of syntactic and semantic errors.

When retrieving records from this diverse ecosystem, you'll encounter data that doesn't perfectly conform to the FHIR specification. There are a few approaches to handling this:

- Reject the data: this leads to gaps in records and frustrates users
- Ignore validation: this produces technically invalid data that causes downstream problems
- Manual fixes: this doesn't scale with millions of records
- Automated linting: this programmatically fixes common issues without changing the meaning

We've embraced the last option, creating a validation layer that detects and fixes common FHIR structural issues without altering the clinical meaning of the data.

What is "FHIR Linting"?

Our validation process works like this:

1. First, we construct a FHIR Bundle that contains all the resources we want to validate
2. We submit this Bundle to our internal $validate operation endpoint (built on top of Medplum)
3. Medplum returns an OperationOutcome with detailed validation issues
4. For each error in the OperationOutcome, we:
   - Extract the exact path to the problematic data element
   - Identify the error type from the error message
   - Apply an appropriate automated fix if possible
   - Skip errors that cannot be automatically fixed

What's critical about this process is that fixes must not change the meaning of the data - in the same way that a software linter shouldn't change how the code runs.
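
The per-error loop can be sketched roughly as follows. This is a minimal Python illustration, not the production implementation: the message fragments, the lint names, and the helpers `classify` and `plan_fixes` are hypothetical. Only the OperationOutcome fields it reads (`severity`, `details.text`, `diagnostics`, `expression`) come from the FHIR specification.

```python
from typing import List, Optional, Tuple

# Hypothetical mapping from error-message fragments to lint names.
# The real validator messages and lint names are assumptions here.
LINT_RULES = [
    ("missing required property", "add-data-absent-reason"),
    ("expected array", "wrap-in-array"),
    ("invalid empty value", "drop-nullish"),
]


def classify(message: str) -> Optional[str]:
    """Return the lint name for an error message, or None if no lint applies."""
    lowered = message.lower()
    for fragment, lint in LINT_RULES:
        if fragment in lowered:
            return lint
    return None


def plan_fixes(outcome: dict) -> Tuple[List[Tuple[str, str]], List[str]]:
    """Split OperationOutcome errors into (path, lint) fixes and skipped paths."""
    fixes: List[Tuple[str, str]] = []
    skipped: List[str] = []
    for issue in outcome.get("issue", []):
        if issue.get("severity") not in ("error", "fatal"):
            continue  # warnings and informational issues are left alone
        # issue.expression holds FHIRPath locations of the offending element.
        path = (issue.get("expression") or ["<unknown>"])[0]
        message = issue.get("details", {}).get("text") or issue.get("diagnostics", "")
        lint = classify(message)
        if lint is None:
            skipped.append(path)  # no automated fix exists for this error
        else:
            fixes.append((path, lint))
    return fixes, skipped


outcome = {
    "resourceType": "OperationOutcome",
    "issue": [
        {"severity": "error", "code": "structure",
         "details": {"text": "Missing required property"},
         "expression": ["ExplanationOfBenefit.insurer"]},
        {"severity": "error", "code": "value",
         "details": {"text": "Invalid date format"},
         "expression": ["ExplanationOfBenefit.created"]},
        {"severity": "warning", "code": "informational",
         "details": {"text": "Missing required property"},
         "expression": ["ExplanationOfBenefit.text"]},
    ],
}

fixes, skipped = plan_fixes(outcome)
```

Separating the planning step from the step that mutates the resource makes it easy to log which errors were skipped and why.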

This systematic approach allows us to target specific FHIR validation issues with precise fixes. Let's look at some of the common issues we address:

Missing Required Properties

FHIR requires certain fields to be present in resources. This is also known as having a minimum cardinality of 1.

When a required field is missing, we add it with a data-absent-reason extension that explicitly marks the data as unknown. We have more details about this particular lint in our Validation documentation released last week.

Before:

{
  "resourceType": "ExplanationOfBenefit",
  "id": "EOB123",
  "status": "active",
  "outcome": "complete"
  // Missing required "insurer" property
}

After:

{
  "resourceType": "ExplanationOfBenefit",
  "id": "EOB123",
  "status": "active",
  "outcome": "complete",
  "insurer": {
    "extension": [
      {
        "url": "http://hl7.org/fhir/StructureDefinition/data-absent-reason",
        "valueCode": "unknown"
      }
    ]
  }
}

This approach satisfies validation requirements while explicitly communicating that the data was not provided in the original source.
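
A minimal Python sketch of this lint, under stated assumptions: the helper name `add_data_absent_reason` is hypothetical, and the sketch only covers a single-valued complex element such as `insurer`. Repeating elements and primitives (whose extensions live under an underscore-prefixed sibling property in FHIR JSON) would need a different shape.

```python
DATA_ABSENT_REASON_URL = "http://hl7.org/fhir/StructureDefinition/data-absent-reason"


def add_data_absent_reason(resource: dict, field: str) -> dict:
    """Return a copy of the resource with a missing field marked as unknown."""
    if field in resource:
        return resource  # the field is present, so there is nothing to fix
    fixed = dict(resource)  # shallow copy keeps the input untouched
    fixed[field] = {
        "extension": [
            {"url": DATA_ABSENT_REASON_URL, "valueCode": "unknown"}
        ]
    }
    return fixed


eob = {
    "resourceType": "ExplanationOfBenefit",
    "id": "EOB123",
    "status": "active",
    "outcome": "complete",
}
fixed_eob = add_data_absent_reason(eob, "insurer")
```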

Array Type Corrections

FHIR expects certain fields to always be arrays, even when there's only one value. Some systems incorrectly represent these as singular values.

This is more common than you might expect. Our lint automatically "array-ifies" the value.

Before:

{
  "resourceType": "DocumentReference",
  "id": "doc123",
  "author": {
    "reference": "Practitioner/123",
    "display": "Dr. Jane Smith"
  } // Should be an array of references
}

After:

{
  "resourceType": "DocumentReference",
  "id": "doc123",
  "author": [
    {
      "reference": "Practitioner/123",
      "display": "Dr. Jane Smith"
    }
  ] // Fixed as array of references
}
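
As a rough Python sketch (the helper name `wrap_in_array` is hypothetical), the fix wraps a singular value in a one-element list. It assumes the validator has already identified the field as one that must be an array; the sketch does not consult the FHIR structure definitions itself.

```python
def wrap_in_array(resource: dict, field: str) -> dict:
    """Return a copy of the resource with a singular field wrapped in a list."""
    value = resource.get(field)
    if value is None or isinstance(value, list):
        return resource  # absent or already an array: nothing to fix
    fixed = dict(resource)  # shallow copy keeps the input untouched
    fixed[field] = [value]
    return fixed


doc = {
    "resourceType": "DocumentReference",
    "id": "doc123",
    "author": {"reference": "Practitioner/123", "display": "Dr. Jane Smith"},
}
fixed_doc = wrap_in_array(doc, "author")
```

Because the value itself is carried over unchanged, the fix is structural only and does not alter what the reference points to.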

Cleaning Up Nullish Values

Empty strings, null values, and whitespace-only values often create validation errors.

Here, our lint drops the field.

Before:

{
  "resourceType": "Condition",
  "id": "cond123",
  "subject": { "reference": "Patient/123" },
  "code": {
    "coding": [
      {
        "system": "http://snomed.info/sct",
        "code": "123456",
        "display": "" // Empty string is invalid
      }
    ]
  }
}

After:

{
  "resourceType": "Condition",
  "id": "cond123",
  "subject": { "reference": "Patient/123" },
  "code": {
    "coding": [
      {
        "system": "http://snomed.info/sct",
        "code": "123456"
        // Empty display property removed
      }
    ]
  }
}
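
One way to sketch this in Python is a recursive walk that drops nullish object properties. The helper names are hypothetical, and the sketch makes two simplifying choices that are assumptions rather than documented behavior: it leaves list items alone (FHIR JSON can use null as a placeholder inside arrays of primitives), and it does not prune objects that become empty after cleaning.

```python
def is_nullish(value) -> bool:
    """True for None, empty strings, and whitespace-only strings."""
    return value is None or (isinstance(value, str) and value.strip() == "")


def drop_nullish(node):
    """Return a cleaned copy of a JSON-like structure without nullish properties."""
    if isinstance(node, dict):
        return {
            key: drop_nullish(value)
            for key, value in node.items()
            if not is_nullish(value)
        }
    if isinstance(node, list):
        # Recurse into items but keep the list length unchanged.
        return [drop_nullish(item) for item in node]
    return node


condition = {
    "resourceType": "Condition",
    "id": "cond123",
    "subject": {"reference": "Patient/123"},
    "code": {
        "coding": [
            {"system": "http://snomed.info/sct", "code": "123456", "display": ""}
        ]
    },
}
cleaned = drop_nullish(condition)
```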

Base64Binary Format Fixing

FHIR requires binary data to be properly base64 encoded, but sometimes systems provide raw text.

When this happens, our lint fixes the encoding automatically.

Before:

{
  "resourceType": "DiagnosticReport",
  "id": "report123",
  "presentedForm": [
    {
      "contentType": "text/html",
      "data": "<div>This is a clinical note</div>" // Not base64 encoded
    }
  ]
}

After:

{
  "resourceType": "DiagnosticReport",
  "id": "report123",
  "presentedForm": [
    {
      "contentType": "text/html",
      "data": "PGRpdj5UaGlzIGlzIGEgY2xpbmljYWwgbm90ZTwvZGl2Pg==" // Properly encoded
    }
  ]
}
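
A hedged Python sketch of the idea, using only the standard library. The helper names are hypothetical, and the detection step is a heuristic: a short plain-text value that happens to consist only of base64 characters would be treated as already encoded, so a real lint would likely rely on the validator's error rather than on this check alone.

```python
import base64
import binascii


def is_base64(value: str) -> bool:
    """True if the string decodes cleanly under strict base64 validation."""
    try:
        base64.b64decode(value, validate=True)
        return True
    except (binascii.Error, ValueError):
        return False


def ensure_base64(value: str) -> str:
    """Encode raw text as base64; leave already-encoded values unchanged."""
    if is_base64(value):
        return value
    return base64.b64encode(value.encode("utf-8")).decode("ascii")


encoded = ensure_base64("<div>This is a clinical note</div>")
# encoded == "PGRpdj5UaGlzIGlzIGEgY2xpbmljYWwgbm90ZTwvZGl2Pg=="
```

Running `ensure_base64` on its own output returns the same string, so applying the lint twice does not double-encode the data.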

Code Value Whitespace Trimming

Our newest lint addresses invalid code values.

According to the FHIR code datatype specification, "a code is restricted to a string which has at least one character and no leading or trailing whitespace, and where there is no whitespace other than single spaces in the contents".

Typically, validation fails here because of extra whitespace at the end of the value. Our lint trims both leading and trailing whitespace.

Before:

{
  "resourceType": "Condition",
  "code": {
    "coding": [
      {
        "system": "http://snomed.info/sct",
        "code": "  123456 " // Extra whitespace
      }
    ]
  }
}

After:

{
  "resourceType": "Condition",
  "code": {
    "coding": [
      {
        "system": "http://snomed.info/sct",
        "code": "123456" // Whitespace trimmed
      }
    ]
  }
}
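
A small Python sketch of this lint. The pattern below is written directly from the sentence quoted above rather than copied from the specification, and the helper names are hypothetical. The sketch only trims the ends: if the trimmed value is still invalid (for example, because of doubled internal spaces), it returns None so the error can be skipped rather than guessed at.

```python
import re
from typing import Optional

# At least one character, no leading or trailing whitespace,
# and only single spaces between non-whitespace runs.
CODE_PATTERN = re.compile(r"\S+( \S+)*")


def is_valid_code(value: str) -> bool:
    """True if the whole string satisfies the quoted code constraint."""
    return CODE_PATTERN.fullmatch(value) is not None


def trim_code(value: str) -> Optional[str]:
    """Trim surrounding whitespace; return None if the result is still invalid."""
    trimmed = value.strip()
    return trimmed if is_valid_code(trimmed) else None


fixed_code = trim_code("  123456 ")
```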

Benefits

This automated approach to data cleaning provides several advantages:

- Improved data completeness: customers get more complete records rather than missing data that fails validation
- Consistent data structure: applications can rely on data conforming to FHIR standards
- Scalable processing: fixes are applied automatically across millions of records
- Better developer experience: less time spent handling edge cases means more time building great healthcare applications

FHIR Linting bridges the gap between real-world healthcare data and the ideal FHIR specification. By applying software development best practices to healthcare data processing, we're able to deliver higher quality data to developers while maintaining the clinical integrity of the original information.

This approach is just one of the ways Flexpa is working behind the scenes to make healthcare data more accessible and usable. By handling these technical details automatically, we allow developers to focus on building innovative applications with claims data.

Interested in learning more about how we handle FHIR records? Let us know about your specific use case!
