Last Updated: 2019-04-22
This codelab demonstrates a data ingestion pattern for loading FHIR STU3 formatted healthcare data into BigQuery using the Cloud Healthcare FHIR APIs. Realistic, synthetic test healthcare data has been generated and made available in a Google Cloud Storage bucket for you.
In this codelab you will learn:
What do you need to run this demo?
If you don't have a GCP Project, follow these steps to create a new GCP Project.
If you already have a GCP Project, make sure you have the Owner role on it.
FHIR STU3 resources in NDJSON format have been pre-loaded into a GCS bucket at gs://hc-ds/ndjson/. If you need a new dataset, you can always generate one using Synthea™ and upload it to GCS at gs://hc-ds/ndjson/.
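If you do generate your own data, a typical Synthea workflow looks roughly like the following. This is a sketch: the exporter property name and the output directory are assumptions based on Synthea's bulk-data export documentation, so check the Synthea README for your version before running it.

```shell
# Clone Synthea (assumes Java is installed; run_synthea builds on first use).
git clone https://github.com/synthetichealth/synthea
cd synthea

# Enable NDJSON (bulk data) output in src/main/resources/synthea.properties:
#   exporter.fhir.bulk_data = true
# (property name is an assumption -- verify against your Synthea version)

# Generate records for 100 synthetic patients.
./run_synthea -p 100

# Copy the generated NDJSON files to your GCS bucket.
# (output directory may differ depending on which FHIR exporter is enabled)
gsutil cp output/fhir_stu3/*.ndjson gs://hc-ds/ndjson/
```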
Follow these steps to ingest data from NDJSON files to healthcare dataset in BigQuery using Cloud Healthcare FHIR APIs:
# Initialize shell variables
GCP_PROJECT_ID=<PROJECT_ID>
REGION=<REGION>
HC_DATASET=<DATASET_ID>
FHIR_STORE=<FHIR_STORE_ID>
PUB_SUB_TOPIC=<PUBSUB_TOPIC>
BQ_DATASET_ID=<BQ_DATASET>
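The FHIR store you create below references a Pub/Sub topic for change notifications. If the topic does not already exist in your project, create it first (a sketch, assuming the Pub/Sub API is enabled for your project):

```shell
# Create the Pub/Sub topic the FHIR store will publish notifications to.
# Note: the FHIR store flag may expect the fully qualified name,
# i.e. projects/$GCP_PROJECT_ID/topics/<TOPIC>, in $PUB_SUB_TOPIC.
gcloud pubsub topics create $PUB_SUB_TOPIC
```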
gcloud alpha healthcare datasets create $HC_DATASET
gcloud alpha healthcare fhir-stores create $FHIR_STORE \
  --dataset=$HC_DATASET \
  --pubsub-topic=$PUB_SUB_TOPIC
We will use the preloaded files from the GCS bucket. These files contain FHIR STU3 resources in NDJSON format. If the import is not successful, the errors are written back into the GCS bucket.
gcloud alpha healthcare fhir-stores import $FHIR_STORE \
  --dataset=$HC_DATASET --async \
  --error-gcs-uri=gs://hc-ds/ndjson/error \
  --gcs-uri=gs://hc-ds/ndjson/**.ndjson \
  --content-structure=RESOURCE
Fetch patient records from the FHIR store to verify that the data was imported correctly.
curl -X GET \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/fhir+json; charset=utf-8" \
  "https://healthcare.googleapis.com/v1alpha2/projects/$GCP_PROJECT_ID/locations/$REGION/datasets/$HC_DATASET/fhirStores/$FHIR_STORE/fhir/Patient"
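To quickly confirm that the search returned resources, you can pipe the response through jq (a sketch, assuming jq is installed; a FHIR searchset Bundle lists each match under `entry`):

```shell
# Count the Patient resources returned by the search.
# The response is a FHIR Bundle; each matching resource is an element of .entry.
curl -s -X GET \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/fhir+json; charset=utf-8" \
  "https://healthcare.googleapis.com/v1alpha2/projects/$GCP_PROJECT_ID/locations/$REGION/datasets/$HC_DATASET/fhirStores/$FHIR_STORE/fhir/Patient" \
  | jq '.entry | length'
```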
Export healthcare data from FHIR Store to BigQuery Dataset
Create a BigQuery dataset, then replace GCP_PROJECT_ID and BQ_DATASET_ID in the following command.
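The dataset can be created from the command line with the bq tool (a sketch; the dataset must exist before the export runs):

```shell
# Create the target BigQuery dataset if it doesn't exist yet.
bq mk --dataset $GCP_PROJECT_ID:$BQ_DATASET_ID
```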
gcloud alpha healthcare fhir-stores export $FHIR_STORE \
  --dataset=$HC_DATASET --location=$REGION --async \
  --bq-dataset=bq://$GCP_PROJECT_ID.$BQ_DATASET_ID \
  --schema-type=analytics
Open the BigQuery UI and validate that the dataset contains tables.
bq ls $GCP_PROJECT_ID:$BQ_DATASET_ID
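The analytics export writes one table per FHIR resource type. A quick row count on one of them confirms data arrived (a sketch; the `Patient` table name is assumed from that naming convention):

```shell
# Count exported Patient rows (one table per resource type in the analytics schema).
bq query --use_legacy_sql=false \
  "SELECT COUNT(*) AS patient_count FROM \`$GCP_PROJECT_ID.$BQ_DATASET_ID.Patient\`"
```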
Congratulations, you've successfully completed the codelab to ingest healthcare data into BigQuery using the Cloud Healthcare APIs.
You imported FHIR STU3 data from Google Cloud Storage into a Cloud Healthcare FHIR store.
You exported data from the Cloud Healthcare FHIR store to BigQuery.
You now know the key steps required to start your Healthcare Data Analytics journey with BigQuery on Google Cloud Platform.