Use Vertex AI Search on PDFs (unstructured data) in Cloud Storage from a Cloud Run service

1. Introduction

Overview

Vertex AI Search and Conversation (formerly known as Generative AI App Builder) lets developers tap into the power of Google's foundation models, search expertise, and conversational AI technologies to create enterprise-grade generative AI applications. This codelab focuses on using Vertex AI Search, where you can build a Google-quality search app on your own data and embed a search bar in your web pages or app.

Cloud Run is a managed compute platform that lets you run containers directly on top of Google's scalable infrastructure. You can deploy code written in any programming language on Cloud Run, as long as it can be put inside a container, by using the source-based deployment option.

In this codelab, you'll create a Cloud Run service using source-based deployment to retrieve search results for unstructured content in PDF files in a Cloud Storage bucket. You can learn more about ingesting unstructured content in the Vertex AI Search documentation.

What you'll learn

  • How to create a Vertex AI Search app for unstructured data (PDFs) ingested from a Cloud Storage bucket
  • How to create an HTTP endpoint using source-based deployment in Cloud Run
  • How to create a service account that follows the principle of least privilege for the Cloud Run service to use when querying the Vertex AI Search app
  • How to invoke the Cloud Run service to query the Vertex AI Search app

2. Setup and Requirements

Prerequisites

Activate Cloud Shell

  1. From the Cloud Console, click Activate Cloud Shell.


If this is your first time starting Cloud Shell, you're presented with an intermediate screen describing what it is. If so, click Continue.


It should only take a few moments to provision and connect to Cloud Shell.


This virtual machine is loaded with all the development tools needed. It offers a persistent 5 GB home directory and runs in Google Cloud, greatly enhancing network performance and authentication. Much, if not all, of your work in this codelab can be done with a browser.

Once connected to Cloud Shell, you should see that you are authenticated and that the project is set to your project ID.

  1. Run the following command in Cloud Shell to confirm that you are authenticated:
gcloud auth list

Command output

 Credentialed Accounts
ACTIVE  ACCOUNT
*       <my_account>@<my_domain.com>

To set the active account, run:
    $ gcloud config set account `ACCOUNT`

  2. Run the following command in Cloud Shell to confirm that the gcloud command knows about your project:
gcloud config list project

Command output

[core]
project = <PROJECT_ID>

If the project is not set correctly, you can set it with this command:

gcloud config set project <PROJECT_ID>

Command output

Updated property [core/project].

3. Enable APIs

Before you can start using Vertex AI Search, there are several APIs you will need to enable.

First, this codelab requires the Vertex AI Search and Conversation, BigQuery, and Cloud Storage APIs. You can enable these APIs from the API Library page in the Google Cloud console.

Second, follow these steps to enable the Vertex AI Search and Conversation API:

  1. In the Google Cloud console, navigate to the Vertex AI Search and Conversation console.
  2. Read and agree to the Terms of Service, then click Continue and activate the API.

4. Create a search app for unstructured data from Cloud Storage

  1. In the Google Cloud console, go to the Search & Conversation page. Click New app.
  2. In the Select app type pane, select Search.
  3. Make sure Enterprise features is enabled to receive answers that are extracted verbatim from your documents.
  4. Make sure the Advanced LLM features option is enabled to receive search summarization.
  5. In the App name field, enter a name for your app. Your app ID appears under the app name.
  6. Select global (Global) as the location for your app, and then click Continue.
  7. In the Data stores pane, click Create new data store.
  8. In the Select a data source pane, select Cloud Storage.
  9. In the Import data from GCS pane, make sure Folder is selected.
  10. In the gs:// field, enter the following value: cloud-samples-data/gen-app-builder/search/stanford-cs-224. This path points to a folder of PDF files in a publicly available Cloud Storage bucket, provided for testing purposes.
  11. Select Unstructured documents, and then click Continue.
  12. In the Configure your data store pane, select global (Global) as the location for your data store.
  13. Enter a name for your data store. You will use this name later in this codelab when deploying your Cloud Run service. Click Create.
  14. In the Data stores pane, select your new data store and click Create.
  15. On your data store's Data page, click the Activity tab to see the status of your data ingestion. Import completed displays in the Status column when the import process is complete.
  16. Click the Documents tab to see the number of documents imported.
  17. In the navigation menu, click Preview to test the search app.
  18. In the search bar, enter final lab due date, and then press Enter to view your results.

5. Create the Cloud Run service

In this section, you will create a Cloud Run service that accepts a query string for your search terms. This service uses the Python client library for the Discovery Engine API. For other supported runtimes, see the Discovery Engine client libraries list in the documentation.
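The service reads its search terms from a searchQuery query-string parameter. As a quick illustration using only the standard library (the service URL below is a hypothetical placeholder), this is the decoding Flask performs when it exposes that parameter as request.args.get("searchQuery"):

```python
from urllib.parse import parse_qs, urlparse

# Hypothetical request URL; the service reads the searchQuery parameter from it.
url = "https://example-service.run.app/?searchQuery=final+lab+due+date"

# parse_qs shows the same decoding Flask applies: '+' becomes a space and
# percent-escapes are unquoted.
params = parse_qs(urlparse(url).query)
search_query = params["searchQuery"][0]
print(search_query)  # → final lab due date
```

This is why the curl call later in the codelab can pass spaces as plus signs in the query string.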

Create the source code for the service

First, create a directory and cd into that directory.

mkdir docs-search-service-python && cd $_

Then, create a requirements.txt file with the following content:

blinker==1.6.3
cachetools==5.3.1
certifi==2023.7.22
charset-normalizer==3.3.0
click==8.1.7
Flask==3.0.0
google-api-core==2.12.0
google-auth==2.23.3
google-cloud-discoveryengine==0.11.2
googleapis-common-protos==1.61.0
grpcio==1.59.0
grpcio-status==1.59.0
idna==3.4
importlib-metadata==6.8.0
itsdangerous==2.1.2
Jinja2==3.1.2
MarkupSafe==2.1.3
numpy==1.26.1
proto-plus==1.22.3
protobuf==4.24.4
pyasn1==0.5.0
pyasn1-modules==0.3.0
requests==2.31.0
rsa==4.9
urllib3==2.0.7
Werkzeug==3.0.1
zipp==3.17.0

Next, create a main.py source file with the following content:

import os

from flask import Flask
from flask import request

from google.api_core.client_options import ClientOptions
from google.cloud import discoveryengine_v1 as discoveryengine

app = Flask(__name__)

project_id = os.environ.get("PROJECT_ID")
location = "global"  # Values: "global", "us", "eu"
data_store_id = os.environ.get("SEARCH_ENGINE_ID")

print(project_id)
print(data_store_id)

@app.route("/")
def search_storage():
    search_query = request.args.get("searchQuery")
    if not search_query:
        # Guard against a missing query parameter instead of searching with None.
        return "Missing searchQuery query parameter", 400

    return search_sample(project_id, location, data_store_id, search_query)

def search_sample(
    project_id: str,
    location: str,
    data_store_id: str,
    search_query: str,
) -> str:
    #  For more information, refer to:
    # https://cloud.google.com/generative-ai-app-builder/docs/locations#specify_a_multi-region_for_your_data_store
    client_options = (
        ClientOptions(api_endpoint=f"{location}-discoveryengine.googleapis.com")
        if location != "global"
        else None
    )

    # Create a client
    client = discoveryengine.SearchServiceClient(client_options=client_options)

    # The full resource name of the search engine serving config
    # e.g. projects/{project_id}/locations/{location}/dataStores/{data_store_id}/servingConfigs/{serving_config_id}
    serving_config = client.serving_config_path(
        project=project_id,
        location=location,
        data_store=data_store_id,
        serving_config="default_config",
    )

    # Optional: Configuration options for search
    # Refer to the `ContentSearchSpec` reference for all supported fields:
    # https://cloud.google.com/python/docs/reference/discoveryengine/latest/google.cloud.discoveryengine_v1.types.SearchRequest.ContentSearchSpec
    content_search_spec = discoveryengine.SearchRequest.ContentSearchSpec(
        # For information about snippets, refer to:
        # https://cloud.google.com/generative-ai-app-builder/docs/snippets
        snippet_spec=discoveryengine.SearchRequest.ContentSearchSpec.SnippetSpec(
            return_snippet=True
        ),
        # For information about search summaries, refer to:
        # https://cloud.google.com/generative-ai-app-builder/docs/get-search-summaries
        summary_spec=discoveryengine.SearchRequest.ContentSearchSpec.SummarySpec(
            summary_result_count=5,
            include_citations=True,
            ignore_adversarial_query=True,
            ignore_non_summary_seeking_query=True,
        ),
    )


    # Refer to the `SearchRequest` reference for all supported fields:
    # https://cloud.google.com/python/docs/reference/discoveryengine/latest/google.cloud.discoveryengine_v1.types.SearchRequest
    # Named `search_request` to avoid shadowing Flask's `request` import.
    search_request = discoveryengine.SearchRequest(
        serving_config=serving_config,
        query=search_query,
        page_size=10,
        content_search_spec=content_search_spec,
        query_expansion_spec=discoveryengine.SearchRequest.QueryExpansionSpec(
            condition=discoveryengine.SearchRequest.QueryExpansionSpec.Condition.AUTO,
        ),
        spell_correction_spec=discoveryengine.SearchRequest.SpellCorrectionSpec(
            mode=discoveryengine.SearchRequest.SpellCorrectionSpec.Mode.AUTO
        ),
    )

    response = client.search(search_request)

    return response.summary.summary_text

if __name__ == "__main__":
    app.run(debug=True, host="0.0.0.0", port=int(os.environ.get("PORT", 8080)))
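
For reference, serving_config_path() in main.py is a pure string helper. A minimal sketch of the resource name it assembles, using placeholder IDs and the format shown in the code comment, looks like this:

```python
def build_serving_config(project_id: str, location: str,
                         data_store_id: str,
                         serving_config: str = "default_config") -> str:
    # Sketch of the resource name format; the real helper is
    # discoveryengine.SearchServiceClient.serving_config_path().
    return (
        f"projects/{project_id}/locations/{location}"
        f"/dataStores/{data_store_id}/servingConfigs/{serving_config}"
    )

# Placeholder IDs for illustration only.
path = build_serving_config("my-project", "global", "my-data-store")
print(path)
# → projects/my-project/locations/global/dataStores/my-data-store/servingConfigs/default_config
```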

Set up environment variables

In this section, you will set a few environment variables to improve the readability of the gcloud commands used in this codelab.

PROJECT_ID=$(gcloud config get-value project)

SERVICE_NAME="search-storage-pdfs-python"
SERVICE_REGION="us-central1"

# update with your data store name
SEARCH_ENGINE_ID=<your-data-store-name>

Create a Service Account

This codelab shows you how to create a service account for the Cloud Run service to use to access the Vertex AI Search API.

SERVICE_ACCOUNT="cloud-run-vertex-ai-search"
SERVICE_ACCOUNT_ADDRESS=$SERVICE_ACCOUNT@$PROJECT_ID.iam.gserviceaccount.com

gcloud iam service-accounts create $SERVICE_ACCOUNT \
  --display-name="Cloud Run Vertex AI Search service account"

gcloud projects add-iam-policy-binding $PROJECT_ID \
  --member serviceAccount:$SERVICE_ACCOUNT_ADDRESS \
  --role='roles/discoveryengine.editor'

Deploy the Cloud Run service

Now you can use a source-based deployment to automatically containerize your Cloud Run service.

gcloud run deploy $SERVICE_NAME \
--region=$SERVICE_REGION \
--source=. \
--service-account $SERVICE_ACCOUNT_ADDRESS \
--update-env-vars SEARCH_ENGINE_ID=$SEARCH_ENGINE_ID,PROJECT_ID=$PROJECT_ID \
--no-allow-unauthenticated

Then save the Cloud Run service URL as an environment variable to use later.

ENDPOINT_URL="$(gcloud run services describe $SERVICE_NAME --region=$SERVICE_REGION --format='value(status.url)')"

6. Call the Cloud Run service

You can now call your Cloud Run service with a query string asking "What is the final lab due date?".

curl -H "Authorization: Bearer $(gcloud auth print-identity-token)" "$ENDPOINT_URL?searchQuery=what+is+the+final+lab+due+date"

Your results should look similar to the example output below:

The final lab is due on Tuesday, March 21 at 4:30 PM [1].
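
The curl command above can also be reproduced with Python's standard library. This sketch only builds the authenticated request; the service URL and token are placeholders, and a real token comes from gcloud auth print-identity-token:

```python
import urllib.request
from urllib.parse import urlencode

# Placeholders: substitute your service's URL and a real identity token.
endpoint_url = "https://search-storage-pdfs-python-xyz-uc.a.run.app"
identity_token = "PASTE_IDENTITY_TOKEN_HERE"

# urlencode escapes the search terms exactly as the curl example does.
query = urlencode({"searchQuery": "what is the final lab due date"})
req = urllib.request.Request(
    f"{endpoint_url}/?{query}",
    headers={"Authorization": f"Bearer {identity_token}"},
)
# urllib.request.urlopen(req) would send the request and return the summary text.
print(req.full_url)
```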

7. Congratulations!

Congratulations on completing the codelab!

We recommend reviewing the documentation on Vertex AI Search and Cloud Run.

What we've covered

  • How to create a Vertex AI Search app for unstructured data (PDFs) ingested from a Cloud Storage bucket
  • How to create an HTTP endpoint using source-based deployment in Cloud Run
  • How to create a service account that follows the principle of least privilege for the Cloud Run service to use when querying the Vertex AI Search app
  • How to invoke the Cloud Run service to query the Vertex AI Search app

8. Clean up

To avoid inadvertent charges (for example, if the Cloud Run service is inadvertently invoked more times than your monthly Cloud Run request allocation in the free tier), you can either delete the Cloud Run service or delete the project you created in Step 2.

To delete the Cloud Run service, go to the Cloud Run Cloud Console at https://console.cloud.google.com/run and delete the search-storage-pdfs-python service (or the $SERVICE_NAME in case you used a different name).

If you choose to delete the entire project, you can go to https://console.cloud.google.com/cloud-resource-manager, select the project you created in Step 2, and choose Delete. If you delete the project, you'll need to change projects in your Cloud SDK. You can view the list of all available projects by running gcloud projects list.