Practical observability techniques for Generative AI application in Java

About this codelab

Last updated Feb 10, 2025
Written by Leonid Yankulin

1. Overview

Gen AI applications require observability like any other application. But are there observability techniques specific to Generative AI?

In this lab, you will create a simple Gen AI application, deploy it to Cloud Run, and instrument it with essential monitoring and logging capabilities using Google Cloud observability services and products.

What you will learn

  • Write an application that uses Vertex AI with Cloud Shell Editor
  • Store your application code in GitHub
  • Use gcloud CLI to deploy your application's source code to Cloud Run
  • Add monitoring and logging capabilities to your Gen AI application
  • Use log-based metrics
  • Implement logging and monitoring with the OpenTelemetry SDK
  • Gain insights into responsible AI data handling

2. Prerequisites

If you do not already have a Google account, you will have to create one.

3. Project setup

  1. Sign-in to the Google Cloud Console with your Google account.
  2. Create a new project or choose to reuse an existing project. Write down the project ID of the project that you just created or selected.
  3. Enable billing for the project.
    • Completing this lab should cost less than $5 in billing costs.
    • You can follow the steps at the end of this lab to delete resources to avoid further charges.
    • New users are eligible for the $300 USD Free Trial.
  4. Confirm billing is enabled in My projects in Cloud Billing
    • If your new project says Billing is disabled in the Billing account column:
      1. Click the three dots in the Actions column
      2. Click Change billing
      3. Select the billing account you would like to use
    • If you are attending a live event, the account will likely be named Google Cloud Platform Trial Billing Account

4. Prepare Cloud Shell Editor

  1. Navigate to Cloud Shell Editor. If you are prompted with the following message, requesting authorization for Cloud Shell to call gcloud with your credentials, click Authorize to continue.
    Click to authorize Cloud Shell
  2. Open terminal window
    1. Click the hamburger menu Hamburger menu icon
    2. Click Terminal
    3. Click New Terminal
      Open new terminal in Cloud Shell Editor
  3. In the terminal, configure your project ID:
    gcloud config set project [PROJECT_ID]
    Replace [PROJECT_ID] with the ID of your project. For example, if your project ID is lab-example-project, the command will be:
    gcloud config set project lab-example-project
    
    If you are prompted with the following message, saying that gcloud is requesting your credentials to call the Google Cloud API, click Authorize to continue.
    Click to authorize Cloud Shell
    On successful execution you should see the following message:
    Updated property [core/project].
    
    If you see a WARNING and are asked Do you want to continue (Y/N)?, then you have likely entered the project ID incorrectly. Press N, press Enter, and try to run the gcloud config set project command again after you find the correct project ID.
  4. (Optional) If you are having trouble finding the project ID, run the following command to see the project IDs of all your projects sorted by creation time in descending order:
    gcloud projects list \
         --format='value(projectId,createTime)' \
         --sort-by=~createTime
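If you are unsure whether a string is even a valid project ID, you can check the naming rules locally. Google Cloud project IDs are 6 to 30 characters long, contain only lowercase letters, digits, and hyphens, must start with a letter, and cannot end with a hyphen. The following standalone sketch (not part of the lab application) encodes these rules as a regular expression:

```java
import java.util.regex.Pattern;

public class ProjectIdCheck {
    // Google Cloud project IDs are 6-30 characters: lowercase letters, digits,
    // and hyphens; they must start with a letter and cannot end with a hyphen.
    private static final Pattern PROJECT_ID =
            Pattern.compile("[a-z][a-z0-9-]{4,28}[a-z0-9]");

    public static boolean isValidProjectId(String id) {
        return PROJECT_ID.matcher(id).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValidProjectId("lab-example-project")); // true
        System.out.println(isValidProjectId("Bad_ID"));              // false
    }
}
```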

5. Enable Google APIs

In the terminal, enable Google APIs that are required for this lab:

gcloud services enable \
     run.googleapis.com \
     cloudbuild.googleapis.com \
     aiplatform.googleapis.com \
     logging.googleapis.com \
     monitoring.googleapis.com \
     cloudtrace.googleapis.com

This command takes some time to complete. Eventually, it produces a success message similar to this one:

Operation "operations/acf.p2-73d90d00-47ee-447a-b600" finished successfully.

If you receive an error message starting with ERROR: (gcloud.services.enable) HttpError accessing and containing error details like the ones below, retry the command after a 1-2 minute delay.

"error": {
  "code": 429,
  "message": "Quota exceeded for quota metric 'Mutate requests' and limit 'Mutate requests per minute' of service 'serviceusage.googleapis.com' ...",
  "status": "RESOURCE_EXHAUSTED",
  ...
}
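Such 429 (RESOURCE_EXHAUSTED) errors are transient. When calling Google APIs programmatically rather than through gcloud, the usual approach is to retry with exponential backoff. A generic, illustrative sketch (not part of the lab code) of computing such a delay schedule:

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;

public class Backoff {
    // Returns an exponential backoff schedule: base * 2^attempt, capped at maxDelay.
    public static List<Duration> schedule(Duration base, Duration maxDelay, int attempts) {
        List<Duration> delays = new ArrayList<>();
        Duration d = base;
        for (int i = 0; i < attempts; i++) {
            delays.add(d.compareTo(maxDelay) > 0 ? maxDelay : d);
            d = d.multipliedBy(2);
        }
        return delays;
    }

    public static void main(String[] args) {
        // 1s base with a 10s cap over 5 attempts -> 1s, 2s, 4s, 8s, 10s
        System.out.println(schedule(Duration.ofSeconds(1), Duration.ofSeconds(10), 5));
    }
}
```

Production clients often also add random jitter to each delay to avoid synchronized retries.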

6. Create a Gen AI application

In this step you will write the code of a simple request-based application that uses the Gemini model to show 10 fun facts about an animal of your choice. Do the following to create the application code.

  1. In the terminal, create the codelab-o11y directory:
    mkdir "${HOME}/codelab-o11y"
  2. Change the current directory to codelab-o11y:
    cd "${HOME}/codelab-o11y"
  3. Download the bootstrap code of the Java application using the Spring framework starter:
    curl https://start.spring.io/starter.zip \
        -d dependencies=web \
        -d javaVersion=17 \
        -d type=maven-project \
        -d bootVersion=3.4.1 -o java-starter.zip
  4. Unarchive the bootstrap code into the current folder:
    unzip java-starter.zip
  5. Remove the archive file from the folder:
    rm java-starter.zip
  6. Create a project.toml file to define the Java runtime version to be used when deploying the code to Cloud Run:
    cat > "${HOME}/codelab-o11y/project.toml" << EOF
    [[build.env]]
        name = "GOOGLE_RUNTIME_VERSION"
        value = "17"
    EOF
  7. Add Google Cloud SDK dependencies to the pom.xml file:
    1. Add Google Cloud Core package:
      sed -i 's/<dependencies>/<dependencies>\
      \
              <dependency>\
                  <groupId>com.google.cloud<\/groupId>\
                  <artifactId>google-cloud-core<\/artifactId>\
                  <version>2.49.1<\/version>\
              <\/dependency>\
              /g' "${HOME}/codelab-o11y/pom.xml"
    2. Add Google Cloud Vertex AI package:
      sed -i 's/<dependencies>/<dependencies>\
      \
              <dependency>\
                  <groupId>com.google.cloud<\/groupId>\
                  <artifactId>google-cloud-vertexai<\/artifactId>\
                  <version>1.16.0<\/version>\
              <\/dependency>\
              /g' "${HOME}/codelab-o11y/pom.xml"
  8. Open the DemoApplication.java file in Cloud Shell Editor:
    cloudshell edit "${HOME}/codelab-o11y/src/main/java/com/example/demo/DemoApplication.java"
    The scaffolded source code of the file DemoApplication.java should now appear in the editor window above the terminal. The source code of the file will be similar to the following:
    package com.example.demo;

    import org.springframework.boot.SpringApplication;
    import org.springframework.boot.autoconfigure.SpringBootApplication;

    @SpringBootApplication
    public class DemoApplication {

        public static void main(String[] args) {
            SpringApplication.run(DemoApplication.class, args);
        }
    }
  9. Replace the code in the editor with the version shown below. To replace the code, delete the content of the file and then copy the code below into the editor:
    package com.example.demo;

    import java.io.IOException;
    import java.util.Collections;

    import javax.annotation.PostConstruct;
    import javax.annotation.PreDestroy;

    import org.springframework.boot.SpringApplication;
    import org.springframework.boot.autoconfigure.SpringBootApplication;
    import org.springframework.web.bind.annotation.GetMapping;
    import org.springframework.web.bind.annotation.RequestParam;
    import org.springframework.web.bind.annotation.RestController;

    import com.google.cloud.ServiceOptions;
    import com.google.cloud.vertexai.VertexAI;
    import com.google.cloud.vertexai.api.GenerateContentResponse;
    import com.google.cloud.vertexai.generativeai.GenerativeModel;
    import com.google.cloud.vertexai.generativeai.ResponseHandler;

    @SpringBootApplication
    public class DemoApplication {

        public static void main(String[] args) {
            String port = System.getenv().getOrDefault("PORT", "8080");
            SpringApplication app = new SpringApplication(DemoApplication.class);
            app.setDefaultProperties(Collections.singletonMap("server.port", port));
            app.run(args);
        }
    }

    @RestController
    class HelloController {
        private final String projectId = ServiceOptions.getDefaultProjectId();
        private VertexAI vertexAI;
        private GenerativeModel model;

        @PostConstruct
        public void init() {
            vertexAI = new VertexAI(projectId, "us-central1");
            model = new GenerativeModel("gemini-1.5-flash", vertexAI);
        }

        @PreDestroy
        public void destroy() {
            vertexAI.close();
        }

        @GetMapping("/")
        public String getFacts(@RequestParam(defaultValue = "dog") String animal) throws IOException {
            String prompt = "Give me 10 fun facts about " + animal + ". Return this as html without backticks.";
            GenerateContentResponse response = model.generateContent(prompt);
            return ResponseHandler.getText(response);
        }
    }
    After a few seconds, Cloud Shell Editor will save your code automatically.

Deploy the code of the Gen AI application to Cloud Run

  1. In the terminal window run the command to deploy the source code of the application to Cloud Run.
    gcloud run deploy codelab-o11y-service \
         --source="${HOME}/codelab-o11y/" \
         --region=us-central1 \
         --allow-unauthenticated
    If you see a prompt like the one below, informing you that the command will create a new repository, press Enter to continue:
    Deploying from source requires an Artifact Registry Docker repository to store built containers.
    A repository named [cloud-run-source-deploy] in region [us-central1] will be created.
    
    Do you want to continue (Y/n)?
    
    The deployment process may take a few minutes. After the deployment process completes, you will see output like:
    Service [codelab-o11y-service] revision [codelab-o11y-service-00001-t2q] has been deployed and is serving 100 percent of traffic.
    Service URL: https://codelab-o11y-service-12345678901.us-central1.run.app
    
  2. Copy the displayed Cloud Run service URL into a separate tab or window in your browser. Alternatively, run the following command in the terminal to print the service URL, and click the shown URL while holding the Ctrl key to open it:
    gcloud run services list \
         --format='value(URL)' \
         --filter='SERVICE:"codelab-o11y-service"'
    When the URL is opened, you may get a 500 error or see the message:
    Sorry, this is just a placeholder...
    
    This means that the service has not finished its deployment. Wait a few moments and refresh the page. Eventually you will see text starting with Fun Dog Facts and containing 10 fun facts about dogs.

Try interacting with the application to get fun facts about different animals. To do so, append the animal parameter to the URL, like ?animal=[ANIMAL], where [ANIMAL] is an animal name. For example, append ?animal=cat to get 10 fun facts about cats, or ?animal=sea turtle to get 10 fun facts about sea turtles.
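Note that animal names containing spaces, such as sea turtle, have to be URL-encoded; the browser usually does this for you. If you build the URL programmatically, you can encode the query parameter with the Java standard library. A standalone sketch (the base URL below is a placeholder, not your real service URL):

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class UrlExample {
    // Builds the request URL with a URL-encoded animal parameter.
    public static String buildUrl(String baseUrl, String animal) {
        return baseUrl + "/?animal=" + URLEncoder.encode(animal, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Placeholder base URL; use your own Cloud Run service URL.
        String url = buildUrl("https://codelab-o11y-service-12345678901.us-central1.run.app", "sea turtle");
        System.out.println(url); // spaces become "+" in the query string
    }
}
```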

7. Audit your Vertex AI API calls

Auditing Google API calls answers questions like "Who called a particular API, where, and when?". Auditing is important when you troubleshoot your application, investigate resource consumption, or perform software forensic analysis.

Audit logs allow you to track administrative and system activities, as well as to log calls to "data read" and "data write" API operations. To audit Vertex AI requests to generate content, you have to enable "Data Read" audit logs in the Cloud console.

  1. Click on the button below to open the Audit Logs page in the Cloud console

  2. Ensure that the page has the project that you created for this lab selected. The selected project is shown at the top left corner of the page, to the right of the hamburger menu:
    Google Cloud Console project dropdown
    If necessary, select the correct project from the combobox.
  3. In the Data Access audit logs configuration table, find the Vertex AI API service in the Service column and select it by checking the checkbox to the left of the service name.
    Select Vertex AI API
  4. In the info panel on the right, select the "Data Read" audit type.
    Check Data Read logs
  5. Click Save.

To generate audit logs, open the service URL. Refresh the page while changing the value of the ?animal= parameter to get different results.

Explore audit logs

  1. Click on the button below to open the Logs Explorer page in the Cloud console:

  2. Paste the following filter into the Query pane.
    LOG_ID("cloudaudit.googleapis.com%2Fdata_access") AND
    protoPayload.serviceName="aiplatform.googleapis.com"
    The Query pane is an editor located near the top of the Logs Explorer page:
    Query audit logs
  3. Click Run query.
  4. Select one of the audit log entries and expand the fields to inspect the information captured in the log.
    You can see details about the Vertex AI API call, including the method and the model that was used. You can also see the identity of the invoker and the permissions that authorized the call.

8. Log interactions with Gen AI

Audit logs do not capture API request parameters or response data. However, this information can be important for troubleshooting applications and for workflow analysis. In this step we fill this gap by adding application logging.

The implementation uses Logback with Spring Boot to print application logs to standard output. This method relies on the ability of Cloud Run to capture information printed to standard output and to ingest it into Cloud Logging automatically. For the information to be captured as structured data, the printed logs have to be formatted accordingly. Follow the instructions below to add structured logging capabilities to the application.

  1. Return to the Cloud Shell window (or tab) in your browser.
  2. Create and open a new file LoggingEventGoogleCloudEncoder.java in Cloud Shell Editor:
    cloudshell edit "${HOME}/codelab-o11y/src/main/java/com/example/demo/LoggingEventGoogleCloudEncoder.java"
  3. Copy and paste the following code to implement a Logback encoder that encodes the log as stringified JSON following the Google Cloud structured log format:
    package com.example.demo;

    import static ch.qos.logback.core.CoreConstants.UTF_8_CHARSET;

    import java.time.Instant;
    import ch.qos.logback.core.encoder.EncoderBase;
    import ch.qos.logback.classic.Level;
    import ch.qos.logback.classic.spi.ILoggingEvent;
    import java.util.HashMap;

    import com.google.gson.Gson;

    public class LoggingEventGoogleCloudEncoder extends EncoderBase<ILoggingEvent>  {
        private static final byte[] EMPTY_BYTES = new byte[0];
        private final Gson gson = new Gson();

        @Override
        public byte[] headerBytes() {
            return EMPTY_BYTES;
        }

        @Override
        public byte[] encode(ILoggingEvent e) {
            var timestamp = Instant.ofEpochMilli(e.getTimeStamp());
            var fields = new HashMap<String, Object>() {
                {
                    put("timestamp", timestamp.toString());
                    put("severity", severityFor(e.getLevel()));
                    put("message", e.getMessage());
                }
            };
            var params = e.getKeyValuePairs();
            if (params != null && params.size() > 0) {
                params.forEach(kv -> fields.putIfAbsent(kv.key, kv.value));
            }
            var data = gson.toJson(fields) + "\n";
            return data.getBytes(UTF_8_CHARSET);
        }

        @Override
        public byte[] footerBytes() {
            return EMPTY_BYTES;
        }

        private static String severityFor(Level level) {
            switch (level.toInt()) {
                case Level.TRACE_INT:
                return "DEBUG";
                case Level.DEBUG_INT:
                return "DEBUG";
                case Level.INFO_INT:
                return "INFO";
                case Level.WARN_INT:
                return "WARNING";
                case Level.ERROR_INT:
                return "ERROR";
                default:
                return "DEFAULT";
            }
        }
    }
  4. Create and open a new file logback.xml in Cloud Shell Editor:
    cloudshell edit "${HOME}/codelab-o11y/src/main/resources/logback.xml"
  5. Copy and paste the following XML to configure Logback to use the encoder with the Logback appender that prints logs to standard output:
    <?xml version="1.0" encoding="UTF-8"?>
    <configuration debug="true">
        <appender name="Console" class="ch.qos.logback.core.ConsoleAppender">
            <encoder class="com.example.demo.LoggingEventGoogleCloudEncoder"/>
        </appender>

        <root level="info">
            <appender-ref ref="Console" />
        </root>
    </configuration>
  6. Re-open the DemoApplication.java file in Cloud Shell Editor:
    cloudshell edit "${HOME}/codelab-o11y/src/main/java/com/example/demo/DemoApplication.java"
  7. Replace the code in the editor with the version shown below to log the Gen AI request and response. To replace the code, delete the content of the file and then copy the code below into the editor:
    package com.example.demo;

    import java.io.IOException;
    import java.util.Collections;

    import javax.annotation.PostConstruct;
    import javax.annotation.PreDestroy;

    import org.slf4j.Logger;
    import org.slf4j.LoggerFactory;
    import org.springframework.boot.SpringApplication;
    import org.springframework.boot.autoconfigure.SpringBootApplication;
    import org.springframework.web.bind.annotation.GetMapping;
    import org.springframework.web.bind.annotation.RequestParam;
    import org.springframework.web.bind.annotation.RestController;

    import com.google.cloud.ServiceOptions;
    import com.google.cloud.vertexai.VertexAI;
    import com.google.cloud.vertexai.api.GenerateContentResponse;
    import com.google.cloud.vertexai.generativeai.GenerativeModel;
    import com.google.cloud.vertexai.generativeai.ResponseHandler;

    @SpringBootApplication
    public class DemoApplication {

        public static void main(String[] args) {
            String port = System.getenv().getOrDefault("PORT", "8080");
            SpringApplication app = new SpringApplication(DemoApplication.class);
            app.setDefaultProperties(Collections.singletonMap("server.port", port));
            app.run(args);
        }
    }

    @RestController
    class HelloController {
        private final String projectId = ServiceOptions.getDefaultProjectId();
        private VertexAI vertexAI;
        private GenerativeModel model;
        private final Logger LOGGER = LoggerFactory.getLogger(HelloController.class);

        @PostConstruct
        public void init() {
            vertexAI = new VertexAI(projectId, "us-central1");
            model = new GenerativeModel("gemini-1.5-flash", vertexAI);
        }

        @PreDestroy
        public void destroy() {
            vertexAI.close();
        }

        @GetMapping("/")
        public String getFacts(@RequestParam(defaultValue = "dog") String animal) throws IOException {
            String prompt = "Give me 10 fun facts about " + animal + ". Return this as html without backticks.";
            GenerateContentResponse response = model.generateContent(prompt);
            LOGGER.atInfo()
                    .addKeyValue("animal", animal)
                    .addKeyValue("prompt", prompt)
                    .addKeyValue("response", response)
                    .log("Content is generated");
            return ResponseHandler.getText(response);
        }
    }

After a few seconds, Cloud Shell Editor saves your changes automatically.
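To make the log format concrete: the LoggingEventGoogleCloudEncoder above writes one JSON object per line, mapping Logback levels to Cloud Logging severities. The following simplified, standalone sketch (using plain String.format instead of Gson, and level names as plain strings instead of Logback Level constants; both are simplifications for illustration) shows the shape of the emitted line:

```java
import java.time.Instant;

public class EncoderSketch {
    // Mirrors severityFor() in the encoder: map Logback level names
    // to Cloud Logging severity values.
    static String severityFor(String level) {
        switch (level) {
            case "TRACE":
            case "DEBUG": return "DEBUG";
            case "INFO":  return "INFO";
            case "WARN":  return "WARNING";
            case "ERROR": return "ERROR";
            default:      return "DEFAULT";
        }
    }

    // Simplified stand-in for encode(): assembles the JSON line with
    // String.format instead of Gson, and omits the key-value pairs.
    static String encodeLine(long epochMilli, String level, String message) {
        return String.format("{\"timestamp\":\"%s\",\"severity\":\"%s\",\"message\":\"%s\"}%n",
                Instant.ofEpochMilli(epochMilli), severityFor(level), message);
    }

    public static void main(String[] args) {
        System.out.print(encodeLine(1739145600000L, "INFO", "Content is generated"));
    }
}
```

Cloud Run captures each such line from standard output and Cloud Logging parses the timestamp, severity, and message fields into the corresponding log entry fields.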

Deploy the code of the Gen AI application to Cloud Run

  1. In the terminal window run the command to deploy the source code of the application to Cloud Run.
    gcloud run deploy codelab-o11y-service \
         --source="${HOME}/codelab-o11y/" \
         --region=us-central1 \
         --allow-unauthenticated
    If you see a prompt like the one below, informing you that the command will create a new repository, press Enter to continue:
    Deploying from source requires an Artifact Registry Docker repository to store built containers.
    A repository named [cloud-run-source-deploy] in region [us-central1] will be created.
    
    Do you want to continue (Y/n)?
    
    The deployment process may take a few minutes. After the deployment process completes, you will see output like:
    Service [codelab-o11y-service] revision [codelab-o11y-service-00001-t2q] has been deployed and is serving 100 percent of traffic.
    Service URL: https://codelab-o11y-service-12345678901.us-central1.run.app
    
  2. Copy the displayed Cloud Run service URL into a separate tab or window in your browser. Alternatively, run the following command in the terminal to print the service URL, and click the shown URL while holding the Ctrl key to open it:
    gcloud run services list \
         --format='value(URL)' \
         --filter='SERVICE:"codelab-o11y-service"'
    When the URL is opened, you may get a 500 error or see the message:
    Sorry, this is just a placeholder...
    
    This means that the service has not finished its deployment. Wait a few moments and refresh the page. Eventually you will see text starting with Fun Dog Facts and containing 10 fun facts about dogs.

To generate application logs, open the service URL. Refresh the page while changing the value of the ?animal= parameter to get different results.
To see the application logs, do the following:

  1. Click on the button below to open the Logs explorer page in the Cloud console:

  2. Paste the following filter into the Query pane:
    LOG_ID("run.googleapis.com%2Fstdout") AND
    severity=INFO
  3. Click Run query.

The result of the query shows logs with the prompt and the Vertex AI response, including safety ratings.
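For orientation, a simplified, illustrative example of what such a log entry looks like in the Logs Explorer (the field values here are invented, and the actual response object is much larger):

```json
{
  "severity": "INFO",
  "timestamp": "2025-02-10T00:00:00Z",
  "jsonPayload": {
    "message": "Content is generated",
    "animal": "cat",
    "prompt": "Give me 10 fun facts about cat. Return this as html without backticks.",
    "response": { "candidates": ["..."] }
  }
}
```

The timestamp and severity fields from the encoder output are promoted to log entry fields, while the remaining keys, including the animal label used later in this lab, land under jsonPayload.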

9. Count interactions with Gen AI

Cloud Run writes managed metrics that can be used to monitor deployed services. User-managed monitoring metrics provide more control over the data and the update frequency of the metric. Implementing such a metric requires writing code that collects data and writes it to Cloud Monitoring. See the next (optional) step for a way to implement it using the OpenTelemetry SDK.

This step shows an alternative to implementing a user metric in code: log-based metrics. Log-based metrics let you generate monitoring metrics from the log entries that your application writes to Cloud Logging. We will use the application logs that we implemented in the previous step to define a log-based metric of the type counter. The metric will count the number of successful calls to the Vertex AI API.

  1. Look at the window of the Logs Explorer that we used in the previous step. Under the Query pane, locate the Actions drop-down menu and click it to open it. See the screenshot below to find the menu:
    Query results toolbar with Actions drop-down menu
  2. In the opened menu, select Create metric to open the Create log-based metric panel.
  3. Follow these steps to configure a new counter metric in the Create log-based metric panel:
    1. Set the Metric type: Select Counter.
    2. Set the following fields in the Details section:
      • Log metric name: Set the name to model_interaction_count. Some naming restrictions apply; see the naming restrictions for details.
      • Description: Enter a description for the metric. For example, Number of log entries capturing successful call to model inference.
      • Units: Leave this blank or insert the digit 1.
    3. Leave the values in the Filter selection section unchanged. Note that the Build filter field contains the same filter we used to see the application logs.
    4. (Optional) Add a label that helps to count the number of calls for each animal. NOTE: this label can greatly increase the metric's cardinality and is not recommended for use in production:
      1. Click Add label.
      2. Set the following fields in the Labels section:
        • Label name: Set the name to animal.
        • Description: Enter the description of the label. For example, Animal parameter.
        • Label type: Select STRING.
        • Field name: Type jsonPayload.animal.
        • Regular expression: Leave it empty.
      3. Click Done.
    5. Click Create metric to create the metric.

You can also create a log-based metric on the Log-based metrics page, with the gcloud logging metrics create CLI command, or with the google_logging_metric Terraform resource.

To generate metric data, open the service URL. Refresh the opened page several times to make multiple calls to the model. As before, try using different animals in the parameter.

Use a PromQL query to search for the log-based metric data. To enter a PromQL query, do the following:

  1. Click on the button below to open the Metrics explorer page in the Cloud console:

  2. In the toolbar of the query-builder pane, select the button whose name is either < > MQL or < > PromQL. See the picture below for the button's location.
    Location of MQL button in Metrics explorer
  3. Verify that PromQL is selected in the Language toggle. The language toggle is in the same toolbar that lets you format your query.
  4. Enter your query into the Queries editor:
    sum(rate(logging_googleapis_com:user_model_interaction_count{monitored_resource="cloud_run_revision"}[${__interval}]))
    For more information about using PromQL, see PromQL in Cloud Monitoring.
  5. Click Run Query. You will see a line chart similar to this screenshot:
    Show queried metrics

    Note that when the Auto-run toggle is enabled, the Run Query button isn't shown.

10. (Optional) Use OpenTelemetry for monitoring and tracing

As mentioned in the previous step, it is possible to implement metrics using the OpenTelemetry (OTel) SDK. Using OTel on multi-service architectures is a recommended practice. This step demonstrates adding OTel instrumentation to a Spring Boot application. In this step you will do the following:

  • Instrument the Spring Boot application with automatic tracing capabilities
  • Implement a counter metric to monitor the number of successful model calls
  • Correlate tracing with application logs

The recommended architecture for production-level services is to use an OTel collector to collect and ingest all observability data from multiple services. For simplicity's sake, the code in this step does not use the collector. Instead, it uses OTel exporters that write data directly to Google Cloud.

Set up the Spring Boot application with OTel components and automatic tracing

  1. Return to the Cloud Shell window (or tab) in your browser.
  2. In the terminal, update the application.properties file with additional configuration parameters:
    cat >> "${HOME}/codelab-o11y/src/main/resources/application.properties" << EOF
    otel.logs.exporter=none
    otel.traces.exporter=google_cloud_trace
    otel.metrics.exporter=google_cloud_monitoring
    otel.resource.attributes.service.name=codelab-o11y-service
    otel.traces.sampler=always_on
    EOF
    These parameters configure the export of observability data to Cloud Trace and Cloud Monitoring and enforce sampling of all traces.
  3. Add required OpenTelemetry dependencies to the pom.xml file:
    sed -i 's/<dependencies>/<dependencies>\
    \
            <dependency>\
                <groupId>io.opentelemetry.instrumentation<\/groupId>\
                <artifactId>opentelemetry-spring-boot-starter<\/artifactId>\
            <\/dependency>\
            <dependency>\
                <groupId>com.google.cloud.opentelemetry<\/groupId>\
                <artifactId>exporter-auto<\/artifactId>\
                <version>0.33.0-alpha<\/version>\
            <\/dependency>\
            <dependency>\
                <groupId>com.google.cloud.opentelemetry<\/groupId>\
                <artifactId>exporter-trace<\/artifactId>\
                <version>0.33.0<\/version>\
            <\/dependency>\
            <dependency>\
                <groupId>com.google.cloud.opentelemetry<\/groupId>\
                <artifactId>exporter-metrics<\/artifactId>\
                <version>0.33.0<\/version>\
            <\/dependency>\
    /g' "${HOME}/codelab-o11y/pom.xml"
  4. Add OpenTelemetry BOM to the pom.xml file:
    sed -i 's/<\/properties>/<\/properties>\
        <dependencyManagement>\
            <dependencies>\
                <dependency>\
                    <groupId>io.opentelemetry.instrumentation<\/groupId>\
                    <artifactId>opentelemetry-instrumentation-bom<\/artifactId>\
                    <version>2.12.0<\/version>\
                    <type>pom<\/type>\
                    <scope>import<\/scope>\
                <\/dependency>\
            <\/dependencies>\
        <\/dependencyManagement>\
    /g' "${HOME}/codelab-o11y/pom.xml"
  5. Re-open the DemoApplication.java file in Cloud Shell Editor:
    cloudshell edit "${HOME}/codelab-o11y/src/main/java/com/example/demo/DemoApplication.java"
  6. Replace the current code with the version that increments a counter metric on each model call. To replace the code, delete the content of the file and then copy the code below into the editor:
    package com.example.demo;

    import io.opentelemetry.api.common.AttributeKey;
    import io.opentelemetry.api.common.Attributes;
    import io.opentelemetry.api.OpenTelemetry;
    import io.opentelemetry.api.metrics.LongCounter;

    import java.io.IOException;
    import java.util.Collections;

    import javax.annotation.PostConstruct;
    import javax.annotation.PreDestroy;

    import org.slf4j.Logger;
    import org.slf4j.LoggerFactory;
    import org.springframework.boot.SpringApplication;
    import org.springframework.boot.autoconfigure.SpringBootApplication;
    import org.springframework.web.bind.annotation.GetMapping;
    import org.springframework.web.bind.annotation.RequestParam;
    import org.springframework.web.bind.annotation.RestController;

    import com.google.cloud.ServiceOptions;
    import com.google.cloud.vertexai.VertexAI;
    import com.google.cloud.vertexai.api.GenerateContentResponse;
    import com.google.cloud.vertexai.generativeai.GenerativeModel;
    import com.google.cloud.vertexai.generativeai.ResponseHandler;


    @SpringBootApplication
    public class DemoApplication {

        public static void main(String[] args) {
            String port = System.getenv().getOrDefault("PORT", "8080");
            SpringApplication app = new SpringApplication(DemoApplication.class);
            app.setDefaultProperties(Collections.singletonMap("server.port", port));
            app.run(args);
        }
    }

    @RestController
    class HelloController {
        private final String projectId = ServiceOptions.getDefaultProjectId();
        private VertexAI vertexAI;
        private GenerativeModel model;
        private final Logger LOGGER = LoggerFactory.getLogger(HelloController.class);
        private static final String INSTRUMENTATION_NAME = "genai-o11y/java/workshop/example";
        private static final AttributeKey<String> ANIMAL = AttributeKey.stringKey("animal");
        private final LongCounter counter;

        public HelloController(OpenTelemetry openTelemetry) {
            this.counter = openTelemetry.getMeter(INSTRUMENTATION_NAME)
                    .counterBuilder("model_call_counter")
                    .setDescription("Number of successful model calls")
                    .build();
        }

        @PostConstruct
        public void init() {
            vertexAI = new VertexAI(projectId, "us-central1");
            model = new GenerativeModel("gemini-1.5-flash", vertexAI);
        }

        @PreDestroy
        public void destroy() {
            vertexAI.close();
        }

        @GetMapping("/")
        public String getFacts(@RequestParam(defaultValue = "dog") String animal) throws IOException {
            String prompt = "Give me 10 fun facts about " + animal + ". Return this as html without backticks.";
            GenerateContentResponse response = model.generateContent(prompt);
            LOGGER.atInfo()
                    .addKeyValue("animal", animal)
                    .addKeyValue("prompt", prompt)
                    .addKeyValue("response", ResponseHandler.getText(response))
                    .log("Content is generated");
            counter.add(1, Attributes.of(ANIMAL, animal));
            return ResponseHandler.getText(response);
        }
    }
  7. Re-open the LoggingEventGoogleCloudEncoder.java file in Cloud Shell Editor:
    cloudshell edit "${HOME}/codelab-o11y/src/main/java/com/example/demo/LoggingEventGoogleCloudEncoder.java"
  8. Replace the current code with the version that adds tracing attributes to the written logs. Adding these attributes enables logs to be correlated with the correct trace spans. To replace the code, delete the content of the file and then copy the code below into the editor:
    package com.example.demo;

    import static ch.qos.logback.core.CoreConstants.UTF_8_CHARSET;

    import java.time.Instant;
    import java.util.HashMap;

    import ch.qos.logback.core.encoder.EncoderBase;
    import ch.qos.logback.classic.Level;
    import ch.qos.logback.classic.spi.ILoggingEvent;
    import com.google.cloud.ServiceOptions;
    import io.opentelemetry.api.trace.Span;
    import io.opentelemetry.api.trace.SpanContext;
    import io.opentelemetry.context.Context;

    import com.google.gson.Gson;


    public class LoggingEventGoogleCloudEncoder extends EncoderBase<ILoggingEvent>  {
        private static final byte[] EMPTY_BYTES = new byte[0];
        private final Gson gson;
        private final String projectId;
        private final String tracePrefix;


        public LoggingEventGoogleCloudEncoder() {
            this.gson = new Gson();
            this.projectId = lookUpProjectId();
            this.tracePrefix = "projects/" + (projectId == null ? "" : projectId) + "/traces/";
        }

        private static String lookUpProjectId() {
            return ServiceOptions.getDefaultProjectId();
        }

        @Override
        public byte[] headerBytes() {
            return EMPTY_BYTES;
        }

        @Override
        public byte[] encode(ILoggingEvent e) {
            var timestamp = Instant.ofEpochMilli(e.getTimeStamp());
            var fields = new HashMap<String, Object>() {
                {
                    put("timestamp", timestamp.toString());
                    put("severity", severityFor(e.getLevel()));
                    put("message", e.getMessage());
                    SpanContext context = Span.fromContext(Context.current()).getSpanContext();
                    if (context.isValid()) {
                        put("logging.googleapis.com/trace", tracePrefix + context.getTraceId());
                        put("logging.googleapis.com/spanId", context.getSpanId());
                        put("logging.googleapis.com/trace_sampled", Boolean.toString(context.isSampled()));
                    }
                }
            };
            var params = e.getKeyValuePairs();
            if (params != null && params.size() > 0) {
                params.forEach(kv -> fields.putIfAbsent(kv.key, kv.value));
            }
            var data = gson.toJson(fields) + "\n";
            return data.getBytes(UTF_8_CHARSET);
        }

        @Override
        public byte[] footerBytes() {
            return EMPTY_BYTES;
        }

    private static String severityFor(Level level) {
        switch (level.toInt()) {
            case Level.TRACE_INT:
            case Level.DEBUG_INT:
                return "DEBUG";
            case Level.INFO_INT:
                return "INFO";
            case Level.WARN_INT:
                return "WARNING";
            case Level.ERROR_INT:
                return "ERROR";
            default:
                return "DEFAULT";
        }
    }
    }

After a few seconds, Cloud Shell Editor saves your changes automatically.

Deploy the code of the Gen AI application to Cloud Run

  1. In the terminal window run the command to deploy the source code of the application to Cloud Run.
    gcloud run deploy codelab-o11y-service \
         --source="${HOME}/codelab-o11y/" \
         --region=us-central1 \
         --allow-unauthenticated
    If you see a prompt like the one below, informing you that the command will create a new repository, press Enter to confirm:
    Deploying from source requires an Artifact Registry Docker repository to store built containers.
    A repository named [cloud-run-source-deploy] in region [us-central1] will be created.
    
    Do you want to continue (Y/n)?
    
    The deployment process may take a few minutes. After it completes, you will see output like:
    Service [codelab-o11y-service] revision [codelab-o11y-service-00001-t2q] has been deployed and is serving 100 percent of traffic.
    Service URL: https://codelab-o11y-service-12345678901.us-central1.run.app
    
  2. Copy the displayed Cloud Run service URL to a separate tab or window in your browser. Alternatively, run the following command in the terminal to print the service URL, then open it by clicking the shown URL while holding the Ctrl key:
    gcloud run services list \
         --format='value(URL)' \
         --filter='SERVICE:"codelab-o11y-service"'
    When the URL is opened, you may get a 500 error or see the message:
    Sorry, this is just a placeholder...
    
    This means that the service has not finished deploying. Wait a few moments and refresh the page. Eventually you will see text starting with Fun Dog Facts and containing 10 fun facts about dogs.

To generate telemetry data, open the service URL. Refresh the page while changing the value of the ?animal= query parameter to get different results.
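If you prefer the terminal, the loop below sends a few requests with different animals. This is a sketch: it assumes the service was deployed with the name and region used earlier in this lab, and that gcloud is authenticated against your project.

```shell
# Look up the deployed service URL (assumes the service name and region from the deploy step).
SERVICE_URL=$(gcloud run services describe codelab-o11y-service \
    --region=us-central1 --format='value(status.url)')

# Send one request per animal to generate traces, logs, and metric points.
for animal in dog cat fox owl; do
    curl -s -o /dev/null -w "${animal}: HTTP %{http_code}\n" "${SERVICE_URL}/?animal=${animal}"
done
```

Each request produces a trace, a correlated log entry, and an increment of the model_call_counter metric with a different animal attribute.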

Explore application traces

  1. Click on the button below to open the Trace explorer page in the Cloud console:

  2. Select one of the most recent traces. You should see 5 or 6 spans that look like those in the screenshot below.
    View of the app span in Trace explorer
  3. Find the span that traces the call to the request handler (the getFacts method). It will be the last span with the name /.
  4. In the Trace details pane, select Logs & events. You will see the application logs that correlate with this particular span. The correlation is detected using the trace and span IDs in the trace and in the log. You should see the application log entry that captured the prompt and the response of the Vertex AI API.

Explore the counter metric

  1. Click on the button below to open the Metrics explorer page in the Cloud console:

  2. In the toolbar of the query-builder pane, select the button labeled either < > MQL or < > PromQL. See the picture below for the button's location.
    Location of MQL button in Metrics explorer
  3. Verify that PromQL is selected in the Language toggle. The language toggle is in the same toolbar that lets you format your query.
  4. Enter your query into the Queries editor:
    sum(rate(workload_googleapis_com:model_call_counter{monitored_resource="generic_task"}[${__interval}]))
  5. Click Run Query. When the Auto-run toggle is enabled, the Run Query button isn't shown.
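Because the application code attaches the animal attribute to every counter increment, you can also break the call rate down per animal. The query below is a sketch that assumes the attribute is exported as a metric label named animal:

    sum by (animal)(rate(workload_googleapis_com:model_call_counter{monitored_resource="generic_task"}[${__interval}]))

This renders one time series per animal value, which makes it easy to see which prompts drive the most model calls.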

11. (Optional) Obfuscate sensitive information in logs

In Step 10 we logged information about the application's interaction with the Gemini model. This information included the name of the animal, the actual prompt, and the model's response. While storing this information in the log should be safe in this case, it is not necessarily true for many other scenarios. The prompt may include personal or otherwise sensitive information that a user does not want to be stored. To address this, you can obfuscate the sensitive data before it is written to Cloud Logging. To minimize code modifications, the following solution is recommended.

  1. Create a PubSub topic to store incoming log entries.
  2. Create a log sink that redirects ingested logs to the PubSub topic.
  3. Create a Dataflow pipeline that modifies the logs redirected to the PubSub topic by following these steps:
    1. Read a log entry from the PubSub topic.
    2. Inspect the entry's payload for sensitive information using the DLP inspection API.
    3. Redact the sensitive information in the payload using one of the DLP redaction methods.
    4. Write the obfuscated log entry to Cloud Logging.
  4. Deploy the pipeline.
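Before building the full Dataflow pipeline, it can help to prototype the redaction step locally. The class below is a simplified, regex-based stand-in for the DLP redaction call; the class name and the patterns are illustrative and not part of the codelab's code. A production pipeline would call the Cloud DLP API instead.

```java
import java.util.regex.Pattern;

// Hypothetical helper that masks common sensitive patterns in a log payload.
public class LogRedactor {
    private static final Pattern EMAIL =
            Pattern.compile("[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\\.[A-Za-z]{2,}");
    private static final Pattern US_PHONE =
            Pattern.compile("\\b\\d{3}-\\d{3}-\\d{4}\\b");

    // Replaces matched substrings with fixed placeholders, mirroring
    // what a DLP "replace with info type" transformation would do.
    public static String redact(String payload) {
        String result = EMAIL.matcher(payload).replaceAll("[REDACTED_EMAIL]");
        return US_PHONE.matcher(result).replaceAll("[REDACTED_PHONE]");
    }

    public static void main(String[] args) {
        System.out.println(redact("Reach me at alice@example.com or 555-123-4567."));
        // → Reach me at [REDACTED_EMAIL] or [REDACTED_PHONE].
    }
}
```

The same redact step would run inside the pipeline between reading a log entry from PubSub and writing the obfuscated entry back to Cloud Logging.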

12. (Optional) Clean up

To avoid the risk of incurring charges for the resources and APIs used in this codelab, it is recommended to clean up after you finish the lab. The easiest way to eliminate billing is to delete the project that you created for the codelab.

  1. To delete the project run the delete project command in the terminal:
    PROJECT_ID=$(gcloud config get-value project)
    gcloud projects delete ${PROJECT_ID} --quiet
    Deleting your Cloud project stops billing for all the resources and APIs used within that project. You should see the following message, where PROJECT_ID is your project ID:
    Deleted [https://cloudresourcemanager.googleapis.com/v1/projects/PROJECT_ID].
    
    You can undo this operation for a limited period by running the command below.
        $ gcloud projects undelete PROJECT_ID
    
    See https://cloud.google.com/resource-manager/docs/creating-managing-projects for information on shutting down projects.
    
  2. (Optional) If you receive an error, consult Step 5 to find the project ID you used during the lab and substitute it into the command in the first instruction. For example, if your project ID is lab-example-project, the command will be:
    gcloud projects delete lab-example-project --quiet

13. Congratulations

In this lab, you created a Gen AI application that uses the Gemini model to make predictions, and instrumented it with essential monitoring and logging capabilities. You deployed the application from source code to Cloud Run. Then you used Google Cloud Observability products to track the application's performance, so you can be confident in the application's reliability.

If you are interested in being included in a user experience (UX) research study to improve the products you worked with today, register here.

Here are some options for continuing your learning: