1. Introduction
In this codelab, you'll create a machine learning (ML) model in BigQuery and get predictions from this model using ABAP SDK for Google Cloud.
You'll leverage the following Google Cloud services:
- BigQuery
- Cloud Shell
What you'll build
You'll create the following:
- A BigQuery Machine Learning (ML) model.
- A service account with the BigQuery Job User and BigQuery Data Viewer roles to call the BigQuery API.
- An ABAP program to call BigQuery API and get predictions from the ML model.
2. Requirements
- A browser, such as Chrome or Firefox.
- A Google Cloud project with billing enabled, or a 90-Day Free Trial account for Google Cloud Platform.
- SAP GUI (Windows or Java) installed on your system. If SAP GUI is already installed on your laptop, connect to SAP using the VM external IP address as the Application Server IP. If you are on a Mac, you can install SAP GUI for Java instead.
3. Before you begin
- In the Google Cloud Console, on the project selector page, select or create a Google Cloud project (for example, abap-sdk-poc).
- Make sure that billing is enabled for your Cloud project. Learn how to check if billing is enabled on a project. Skip this step if you are using the 90-Day Free Trial account.
- You will use Cloud Shell, a command-line environment running in Google Cloud.
- From the Cloud Console, click Activate Cloud Shell on the top right corner:
- Run the following commands in Cloud Shell to authenticate for your account and set the default project to abap-sdk-poc. Zone us-west4-b is used as an example. If needed, change the project and zone in the following commands based on your preference.
gcloud auth login
gcloud config set project abap-sdk-poc
gcloud config set compute/zone us-west4-b
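- Optionally, as a quick check, you can confirm that the project and zone were set as expected:
gcloud config list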
- You must have access to an SAP system with the ABAP SDK for Google Cloud installed.
- You must complete codelab 1 (Install ABAP Platform Trial 1909 on Google Cloud Platform and Install ABAP SDK for Google Cloud) and codelab 2 (Configure ABAP SDK Authentication using tokens for SAP Hosted on Compute Engine VM) before proceeding with this codelab.
- If you have completed codelab 1 and codelab 2, you will already have an ABAP Platform Trial 1909 system provisioned on Google Cloud, along with the required setup for authentication and connectivity.
- If you have not completed codelab 1 and codelab 2, you will not have the required infrastructure and connectivity to perform the steps in this codelab, so complete them before proceeding.
4. Enable BigQuery API V2 in your Google Cloud Project
- From the Cloud Console, click Activate Cloud Shell on the top right corner:
- Run the following commands in Cloud Shell to enable BigQuery API:
gcloud services enable bigquery.googleapis.com
On successful execution, you should see a message displayed as shown below:
You should now have the BigQuery API enabled in your Google Cloud Project.
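Optionally, you can double-check that the API is enabled by listing the enabled services from Cloud Shell (a quick sanity check; the grep filter is just one way to narrow the output):
gcloud services list --enabled | grep bigquery.googleapis.com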
5. Create a Service Account for Secure BigQuery Access
To securely get ML predictions from a BigQuery ML model, you need to create a service account with the BigQuery Job User and BigQuery Data Viewer roles, which allow your program to run queries (as jobs) within the project and read data from tables. These roles grant only the permissions needed to create jobs and read data, minimizing security risk.
Create a service account
To create a service account with the required roles, perform the following steps:
- Run the following command in the Cloud Shell terminal:
gcloud iam service-accounts create abap-sdk-bigquery-jobuser --display-name="Service Account for BigQuery Job user"
- Now add the required roles to the service account created in the previous step:
gcloud projects add-iam-policy-binding abap-sdk-poc --member='serviceAccount:abap-sdk-bigquery-jobuser@abap-sdk-poc.iam.gserviceaccount.com' --role='roles/bigquery.jobUser'
gcloud projects add-iam-policy-binding abap-sdk-poc --member='serviceAccount:abap-sdk-bigquery-jobuser@abap-sdk-poc.iam.gserviceaccount.com' --role='roles/bigquery.dataViewer'
The above commands use abap-sdk-poc as a placeholder for the Google Cloud project ID. Replace abap-sdk-poc with your project ID.
- To verify that the roles have been added, go to the IAM page. The service account you created should be listed along with the roles assigned to it.
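You can also verify the role bindings from Cloud Shell. The following command is one way to do it, using the example project ID from this codelab; replace abap-sdk-poc with your project ID:
gcloud projects get-iam-policy abap-sdk-poc --flatten="bindings[].members" --filter="bindings.members:abap-sdk-bigquery-jobuser@abap-sdk-poc.iam.gserviceaccount.com" --format="table(bindings.role)"
The output should list roles/bigquery.jobUser and roles/bigquery.dataViewer.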
6. Create a BigQuery Machine Learning model
In this codelab, you'll create a k-means model to cluster the London bicycle hires dataset. You can apply the k-means algorithm to group your data into clusters. Unlike supervised machine learning, which is about predictive analytics, unsupervised learning is about descriptive analytics: understanding your data so that you can make data-driven decisions.
Create your dataset
To create a BigQuery dataset to store your ML model, perform the following steps:
- In the Google Cloud console, go to the BigQuery page.
- In the Explorer pane, click your project name.
- Click View actions > Create dataset.
- On the Create dataset page, do the following:
- For Dataset ID, enter bqml_tutorial.
- For Location type, select Multi-region, and then select EU (multiple regions in European Union). The London Bicycle Hires public dataset is stored in the EU multi-region. Your dataset must be in the same location.
- Leave the remaining default settings as they are, and click Create dataset.
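If you prefer the command line, the same dataset can be created from Cloud Shell using the bq tool; this is an equivalent sketch assuming the abap-sdk-poc project used in the earlier steps:
bq --location=EU mk --dataset abap-sdk-poc:bqml_tutorial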
Create a k-means model
Now that your dataset is set up, the next step is to create a k-means model using the data. You can create and train a k-means model using the CREATE MODEL statement with the option model_type='kmeans'.
To run the query and create a k-means model, perform the following steps:
- Go to the BigQuery page.
- In the editor pane, run the following SQL statement:
CREATE OR REPLACE MODEL `bqml_tutorial.london_station_clusters`
  OPTIONS(model_type='kmeans', num_clusters=4) AS
WITH
  hs AS (
    SELECT
      h.start_station_name AS station_name,
      IF(EXTRACT(DAYOFWEEK FROM h.start_date) = 1
          OR EXTRACT(DAYOFWEEK FROM h.start_date) = 7,
        "weekend",
        "weekday") AS isweekday,
      h.duration,
      ST_DISTANCE(ST_GEOGPOINT(s.longitude, s.latitude),
        ST_GEOGPOINT(-0.1, 51.5)) / 1000 AS distance_from_city_center
    FROM
      `bigquery-public-data.london_bicycles.cycle_hire` AS h
    JOIN
      `bigquery-public-data.london_bicycles.cycle_stations` AS s
    ON
      h.start_station_id = s.id
    WHERE
      h.start_date BETWEEN CAST('2015-01-01 00:00:00' AS TIMESTAMP)
      AND CAST('2016-01-01 00:00:00' AS TIMESTAMP) ),
  stationstats AS (
    SELECT
      station_name,
      isweekday,
      AVG(duration) AS duration,
      COUNT(duration) AS num_trips,
      MAX(distance_from_city_center) AS distance_from_city_center
    FROM
      hs
    GROUP BY
      station_name, isweekday )
SELECT
  * EXCEPT(station_name, isweekday)
FROM
  stationstats
- In the navigation panel, in the Resources section, expand your project name, click bqml_tutorial, and then click london_station_clusters.
- Click the Schema tab. The model schema lists the three station attributes that BigQuery ML used to perform clustering. The schema should look like the following:
- Click the Evaluation tab. This tab displays visualizations of the clusters identified by the k-means model. Under Numeric features, bar graphs display up to 10 of the most important numeric feature values for each centroid. You can select which features to visualize from the drop-down menu.
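As an additional check before moving to SAP, you can also evaluate the model from Cloud Shell. This is a minimal sketch; for k-means models, ML.EVALUATE returns clustering metrics such as the Davies-Bouldin index:
bq query --use_legacy_sql=false 'SELECT * FROM ML.EVALUATE(MODEL `bqml_tutorial.london_station_clusters`)'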
7. Get BigQuery ML predictions using ABAP SDK for Google Cloud
Now that you have set up the prerequisites on the Google Cloud side, you are ready to complete the steps in your SAP system to get predictions from the ML model using ABAP SDK for Google Cloud.
Create Client Key Configuration
For authentication and connectivity related configuration, the ABAP SDK for Google Cloud uses the tables /GOOG/CLIENT_KEY and /GOOG/SERVIC_MAP.
To maintain the configuration in the /GOOG/CLIENT_KEY table, perform the following steps:
- In the SAP GUI, enter transaction code SPRO.
- Click SAP Reference IMG.
- Click ABAP SDK for Google Cloud > Basic Settings > Configure Client Key.
- Maintain the following values for the listed fields, and leave all other fields blank:
Field | Value
Google Cloud Key Name | BIGQUERY_ML
Google Cloud Service Account Name | abap-sdk-bigquery-jobuser@abap-sdk-poc.iam.gserviceaccount.com
Google Cloud Scope | https://www.googleapis.com/auth/cloud-platform
Project ID | abap-sdk-poc
Authorization Class | /GOOG/CL_AUTH_GOOGLE
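The service account name entered above must exactly match the account created in section 5. If in doubt, you can confirm it from Cloud Shell before saving the entry:
gcloud iam service-accounts describe abap-sdk-bigquery-jobuser@abap-sdk-poc.iam.gserviceaccount.com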
Build an ABAP Report to get predictions from the BigQuery ML model
To build an ABAP Report, perform the following steps:
- In your SAP GUI, go to transaction code SE38 and create a report program with the name ZDEMO_BIGQUERY_ML_PREDICT.
- In the pop-up that opens up, provide details as shown in the following image:
- In the next pop-up, either select Local Object or provide a package name as appropriate.
- In the ABAP Editor, add the following code:
REPORT zdemo_bigquery_ml_predict.

TYPES:
  BEGIN OF lty_query_result,
    centroid_id        TYPE i,
    station_name       TYPE string,
    isweekday          TYPE string,
    num_trips          TYPE i,
    distance_from_city TYPE string,
  END OF lty_query_result,
  ltt_query_result TYPE STANDARD TABLE OF lty_query_result WITH DEFAULT KEY.

DATA:
  lv_project_id TYPE string,
  ls_input      TYPE /goog/cl_bigquery_v2=>ty_103,
  ls_output     TYPE lty_query_result,
  lt_output     TYPE ltt_query_result.

CONSTANTS:
  lc_newline TYPE c VALUE cl_abap_char_utilities=>newline.

TRY.
    "Initialize BigQuery object, pass the client key name that you have configured in the /GOOG/CLIENT_KEY table
    DATA(lo_bq) = NEW /goog/cl_bigquery_v2( iv_key_name = 'BIGQUERY_ML' ).

    "Populate relevant parameters
    lv_project_id = lo_bq->gv_project_id.

    ls_input-default_dataset-project_id = 'abap-sdk-poc'.
    ls_input-default_dataset-dataset_id = 'bqml_tutorial'.

    "This query gets predictions from the k-means model
    ls_input-query =
        | WITH | && lc_newline &&
        | hs AS ( | && lc_newline &&
        | SELECT | && lc_newline &&
        | h.start_station_name AS station_name, | && lc_newline &&
        | IF | && lc_newline &&
        | (EXTRACT(DAYOFWEEK | && lc_newline &&
        | FROM | && lc_newline &&
        | h.start_date) = 1 | && lc_newline &&
        | OR EXTRACT(DAYOFWEEK | && lc_newline &&
        | FROM | && lc_newline &&
        | h.start_date) = 7, | && lc_newline &&
        | "weekend", | && lc_newline &&
        | "weekday") AS isweekday, | && lc_newline &&
        | h.duration, | && lc_newline &&
        | ST_DISTANCE(ST_GEOGPOINT(s.longitude, | && lc_newline &&
        | s.latitude), | && lc_newline &&
        | ST_GEOGPOINT(-0.1, | && lc_newline &&
        | 51.5))/1000 AS distance_from_city_center | && lc_newline &&
        | FROM | && lc_newline &&
        | `bigquery-public-data.london_bicycles.cycle_hire` AS h | && lc_newline &&
        | JOIN | && lc_newline &&
        | `bigquery-public-data.london_bicycles.cycle_stations` AS s | && lc_newline &&
        | ON | && lc_newline &&
        | h.start_station_id = s.id | && lc_newline &&
        | WHERE | && lc_newline &&
        | h.start_date BETWEEN CAST('2015-01-01 00:00:00' AS TIMESTAMP) | && lc_newline &&
        | AND CAST('2016-01-01 00:00:00' AS TIMESTAMP) ), | && lc_newline &&
        | stationstats AS ( | && lc_newline &&
        | SELECT | && lc_newline &&
        | station_name, | && lc_newline &&
        | isweekday, | && lc_newline &&
        | AVG(duration) AS duration, | && lc_newline &&
        | COUNT(duration) AS num_trips, | && lc_newline &&
        | MAX(distance_from_city_center) AS distance_from_city_center | && lc_newline &&
        | FROM | && lc_newline &&
        | hs | && lc_newline &&
        | GROUP BY | && lc_newline &&
        | station_name, isweekday ) | && lc_newline &&
        | SELECT | && lc_newline &&
        | * EXCEPT(nearest_centroids_distance) | && lc_newline &&
        | FROM | && lc_newline &&
        | ML.PREDICT( MODEL `bqml_tutorial.london_station_clusters`, | && lc_newline &&
        | ( | && lc_newline &&
        | SELECT | && lc_newline &&
        | * | && lc_newline &&
        | FROM | && lc_newline &&
        | stationstats | && lc_newline &&
        | WHERE | && lc_newline &&
        | REGEXP_CONTAINS(station_name, 'Kennington'))) |.

    "Call API method: bigquery.jobs.query
    CALL METHOD lo_bq->query_jobs
      EXPORTING
        iv_p_project_id = lv_project_id
        is_input        = ls_input
      IMPORTING
        es_output       = DATA(ls_response)
        ev_ret_code     = DATA(lv_ret_code)
        ev_err_text     = DATA(lv_err_text)
        es_err_resp     = DATA(ls_err_resp).

    IF lo_bq->is_success( lv_ret_code ).
      "API call successful: loop through the data and display the result
      IF ls_response-job_complete = abap_true.
        LOOP AT ls_response-rows ASSIGNING FIELD-SYMBOL(<ls_row>).
          "Each row is returned as a list of field values; map them by position
          LOOP AT <ls_row>-f ASSIGNING FIELD-SYMBOL(<ls_value>).
            ASSIGN <ls_value>-v->* TO FIELD-SYMBOL(<ls_field_value>).
            CASE sy-tabix.
              WHEN 1.
                ls_output-centroid_id = <ls_field_value>.
              WHEN 2.
                ls_output-station_name = <ls_field_value>.
              WHEN 3.
                ls_output-isweekday = <ls_field_value>.
              WHEN 4.
                ls_output-num_trips = <ls_field_value>.
              WHEN 5.
                ls_output-distance_from_city = <ls_field_value>.
            ENDCASE.
          ENDLOOP.
          APPEND ls_output TO lt_output.
          CLEAR ls_output.
        ENDLOOP.
        IF lt_output IS NOT INITIAL.
          cl_demo_output=>new(
            )->begin_section( 'ML.Predict Query Details'
            )->write_text( ls_input-query
            )->write_text( 'Dataset: bigquery-public-data.london_bicycles'
            )->end_section(
            )->begin_section( 'ML.Predict Query Results'
            )->write_data( lt_output
            )->end_section(
            )->display( ).
        ENDIF.
      ENDIF.
    ELSE.
      "Display error message in case the API call fails
      MESSAGE lv_err_text TYPE 'E'.
    ENDIF.

    "Close HTTP connection
    lo_bq->close( ).

  CATCH /goog/cx_sdk INTO DATA(lo_exception).
    MESSAGE lo_exception->get_text( ) TYPE 'E'.
ENDTRY.
- Save and activate the report.
- Execute the report (F8).
On successful execution, you should see a report output as shown below:
8. Congratulations
Excellent work completing the "Get predictions from a BigQuery Machine Learning (ML) model using ABAP SDK for Google Cloud" codelab!
You've successfully retrieved predictions from a BigQuery Machine Learning model, right from within your SAP system! You've unlocked a new level of integration between ABAP and Google Cloud services. Expand your horizons with other exciting ABAP SDK for Google Cloud codelabs:
- Using the Translation API with ABAP SDK for Google Cloud
- Upload a large object to a Cloud Storage bucket using chunking
- Retrieving Credentials/Secrets from Secret Manager with ABAP SDK for Google Cloud
- Call Vertex AI text-bison from ABAP
9. Clean up
If you do not wish to continue with the additional codelabs related to ABAP SDK for Google Cloud, please proceed with the cleanup.
Delete the project
- Delete the Google Cloud project:
gcloud projects delete abap-sdk-poc
Delete individual resources
- Delete the compute instance:
gcloud compute instances delete abap-trial-docker
- Delete the firewall-rules:
gcloud compute firewall-rules delete sapmachine
- Delete the service account:
gcloud iam service-accounts delete \
abap-sdk-bigquery-jobuser@abap-sdk-poc.iam.gserviceaccount.com
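The steps above do not remove the BigQuery artifacts created in this codelab. If you are keeping the project but want to clean up the dataset and the model inside it, you can run the following from Cloud Shell (assuming the names used in this codelab):
bq rm -r -f --dataset abap-sdk-poc:bqml_tutorial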