Last Updated: 2019-04-22

This codelab demonstrates a data ingestion pattern for loading FHIR STU3 formatted healthcare data into BigQuery using the Cloud Healthcare FHIR APIs. Synthetic but realistic test healthcare data has been generated and made available in a Google Cloud Storage bucket for you.

In this codelab you will learn:

  1. How to import FHIR STU3 resources from Google Cloud Storage into a FHIR store
  2. How to export data from a FHIR store to a BigQuery dataset

What do you need to run this demo?

  1. Access to a GCP Project.
  2. The Owner role on that GCP Project.
  3. FHIR STU3 resources in NDJSON format.

If you don't have a GCP Project, follow these steps to create a new GCP Project.

If you already have a GCP Project, make sure you have the Owner role on it.

FHIR STU3 resources in NDJSON format have been pre-loaded into the GCS bucket at gs://hc-ds/ndjson/. If you need a new dataset, you can always generate one using Synthea™ and then upload it to the same GCS location: gs://hc-ds/ndjson/

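As a quick reference, NDJSON stores one complete JSON resource per line, with no enclosing array. A minimal sketch of the shape the import expects (the resource IDs and fields below are hypothetical, purely for illustration):

```shell
# Write a two-line NDJSON file: one complete FHIR STU3 resource per line,
# no enclosing JSON array, no trailing commas.
cat > /tmp/sample.ndjson <<'EOF'
{"resourceType":"Patient","id":"example-patient","gender":"female"}
{"resourceType":"Observation","id":"example-obs","status":"final"}
EOF

# Each line is an independent resource, so counting lines counts resources.
echo "resources: $(wc -l < /tmp/sample.ndjson)"
```
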
Follow these steps to ingest data from NDJSON files to healthcare dataset in BigQuery using Cloud Healthcare FHIR APIs:

Initialize shell variables for your environment

<!-- CODELAB: Initialize shell variables -->
GCP_PROJECT_ID=<PROJECT_ID>
REGION=<REGION>
HC_DATASET=<DATASET_ID>
FHIR_STORE=<FHIR_STORE_ID>
PUB_SUB_TOPIC=<PUBSUB_TOPIC>
BQ_DATASET_ID=<BQ_DATASET>

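For concreteness, here is a hypothetical set of values (every value below is an example placeholder; substitute your own project, region, and resource names):

```shell
# Hypothetical example values; replace every one with your own.
GCP_PROJECT_ID=my-healthcare-project   # your GCP project ID
REGION=us-central1                     # a region supported by the Cloud Healthcare API
HC_DATASET=healthcare-demo             # Cloud Healthcare dataset ID
FHIR_STORE=fhir-demo-store             # FHIR store ID
PUB_SUB_TOPIC=fhir-demo-topic          # Pub/Sub topic for FHIR store notifications
BQ_DATASET_ID=fhir_demo                # BigQuery dataset IDs use underscores, not hyphens
```
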
Create Healthcare Dataset and FHIR Store

Create a healthcare dataset using the Cloud Healthcare API

gcloud alpha healthcare datasets create $HC_DATASET --location=$REGION

Create a FHIR store in the dataset using the Cloud Healthcare API

gcloud alpha healthcare fhir-stores create $FHIR_STORE \
  --dataset=$HC_DATASET --location=$REGION \
  --pubsub-topic=$PUB_SUB_TOPIC

Import test data from Google Cloud Storage into the FHIR store.

We will use the preloaded files from the GCS bucket. These files contain FHIR STU3 resources in NDJSON format. If the import is not successful, the errors are written back into the GCS bucket at the --error-gcs-uri location.

gcloud alpha healthcare fhir-stores import $FHIR_STORE \
  --dataset=$HC_DATASET --location=$REGION --async \
  --error-gcs-uri=gs://hc-ds/ndjson/error \
  --gcs-uri=gs://hc-ds/ndjson/**.ndjson \
  --content-structure=RESOURCE

Validate

Fetch Patient resources from the FHIR store to check whether the data has been imported correctly.

curl -X GET \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/fhir+json; charset=utf-8" \
  "https://healthcare.googleapis.com/v1alpha2/projects/$GCP_PROJECT_ID/locations/$REGION/datasets/$HC_DATASET/fhirStores/$FHIR_STORE/fhir/Patient"

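The response should be a FHIR Bundle of type searchset containing Patient entries. As a sketch of what to look for (the response body below is a hypothetical, heavily trimmed example, not real API output):

```shell
# A trimmed, hypothetical FHIR search response saved locally for illustration.
cat > /tmp/patient_bundle.json <<'EOF'
{"resourceType":"Bundle","type":"searchset","total":2,
 "entry":[{"resource":{"resourceType":"Patient","id":"p1"}},
          {"resource":{"resourceType":"Patient","id":"p2"}}]}
EOF

# If the import worked, the bundle is a searchset with a non-empty entry list.
grep -q '"type":"searchset"' /tmp/patient_bundle.json && echo "searchset bundle OK"
```
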
Export to BigQuery

Export healthcare data from the FHIR store to a BigQuery dataset.

Create a BigQuery dataset first, then run the following command; GCP_PROJECT_ID and BQ_DATASET_ID are the shell variables you initialized earlier.

gcloud alpha healthcare fhir-stores export $FHIR_STORE \
  --dataset=$HC_DATASET --location=$REGION --async \
  --bq-dataset=bq://$GCP_PROJECT_ID.$BQ_DATASET_ID \
  --schema-type=analytics

Validate that the BigQuery dataset contains all 16 tables

Open the BigQuery UI and confirm that the dataset contains the exported tables, or list them from the command line:

bq ls $GCP_PROJECT_ID:$BQ_DATASET_ID

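With the analytics schema, the export writes one table per FHIR resource type present in the store. One quick way to count them is to save the table listing to a file and count its lines (the table names below are a hypothetical snapshot for illustration, not real bq output):

```shell
# Hypothetical snapshot of table names from the listing, saved for illustration.
cat > /tmp/bq_tables.txt <<'EOF'
Condition
Encounter
Observation
Patient
EOF

# One table per exported FHIR resource type; for the codelab data expect 16.
echo "tables exported: $(wc -l < /tmp/bq_tables.txt)"
```
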
Congratulations, you've successfully completed the codelab and ingested healthcare data into BigQuery using the Cloud Healthcare API.

You imported FHIR STU3 data from Google Cloud Storage into a Cloud Healthcare FHIR store.

You exported data from the FHIR store to BigQuery.

You now know the key steps required to start your Healthcare Data Analytics journey with BigQuery on Google Cloud Platform.