Database as a Tool: Agentic RAG with ADK, MCP Toolbox, and Cloud SQL

1. Introduction

AI agents are only as useful as the data they can access. Most real-world data lives in databases — and connecting agents to databases typically means writing connection management, query logic, and embedding pipelines inside your agent code. Every agent that needs database access repeats this work, and every query change requires redeploying the agent.

This codelab shows a different approach. You declare your database tools in a YAML file — standard SQL queries, vector similarity search, even automatic embedding generation — and MCP Toolbox for Databases handles all database operations as an MCP server. Your agent code stays minimal: load the tools, let Gemini decide which one to call.

What you'll build

A Smart Job Board Assistant for "TechJobs" — an ADK agent powered by Gemini that helps developers browse tech job listings using standard filters (role, tech stack) and discover jobs through natural language descriptions like "I want a remote job working on AI chatbots." The agent reads from and writes to a Cloud SQL PostgreSQL database entirely through MCP Toolbox for Databases, which handles all database access — including automatic embedding generation for vector search. By the end, both the Toolbox and the agent run on Cloud Run.

eb6de681c40990c1.jpeg

What you'll learn

  • How MCP (Model Context Protocol) standardizes tool access for AI agents, and how MCP Toolbox for Databases applies this to database operations
  • Set up MCP Toolbox for Databases as middleware between an ADK agent and Cloud SQL PostgreSQL
  • Define database tools declaratively in tools.yaml — no database code in your agent
  • Build an ADK agent that loads tools from a running Toolbox server using ToolboxToolset
  • Generate vector embeddings using Cloud SQL's built-in embedding() function and enable semantic search with pgvector
  • Use the valueFromParam feature for automatic vector ingestion on write operations
  • Deploy both the Toolbox server and the ADK agent to Cloud Run

Prerequisites

  • A Google Cloud account with a trial billing account
  • Basic familiarity with Python and SQL
  • No prior experience with ADK, MCP Toolbox, or pgvector required

2. Set Up Your Environment

This step prepares your Cloud Shell environment, configures your Google Cloud project, and clones the reference repository.

Open Cloud Shell

Open Cloud Shell in your browser. Cloud Shell provides a pre-configured environment with all the tools you need for this codelab. Click Authorize when prompted.

Then click "View" -> "Terminal" to open the terminal. Your interface should look similar to this:

86307fac5da2f077.png

This will be our main interface: the IDE on top, the terminal on the bottom.

Set up your working directory

Create your working directory. All code you write in this codelab lives here:

mkdir -p ~/build-agent-adk-toolbox-cloudsql
cloudshell workspace ~/build-agent-adk-toolbox-cloudsql && cd ~/build-agent-adk-toolbox-cloudsql

Set up your Google Cloud project

Create the .env file with the location variables:

# For Vertex AI / Gemini API calls
echo "GOOGLE_CLOUD_LOCATION=global" > .env
# For Cloud SQL, Cloud Run, Artifact Registry
echo "REGION=us-central1" >> .env

Download the project setup script into your working directory:

curl -sL https://raw.githubusercontent.com/alphinside/cloud-trial-project-setup/main/setup_verify_trial_project.sh -o setup_verify_trial_project.sh

Run the script. It verifies your trial billing account, creates a new project (or validates an existing one), saves your project ID to a .env file in the current directory, and sets the active project in gcloud.

bash setup_verify_trial_project.sh && source .env

The script will:

  1. Verify you have an active trial billing account
  2. Check for an existing project in .env (if any)
  3. Create a new project or reuse the existing one
  4. Link the trial billing account to your project
  5. Save the project ID to .env
  6. Set the project as the active gcloud project

Verify the project is set correctly by checking the yellow text next to your working directory in the Cloud Shell terminal prompt. It should display your project ID.

dcba35ce1389f313.png

If your Cloud Shell session resets at any point during this codelab, navigate back to your working directory and re-run bash setup_verify_trial_project.sh && source .env to restore your project configuration. Confirm the yellow project ID text reappears in the terminal prompt.

Next, enable the required APIs:

gcloud services enable \
  aiplatform.googleapis.com \
  sqladmin.googleapis.com \
  compute.googleapis.com \
  run.googleapis.com \
  cloudbuild.googleapis.com \
  artifactregistry.googleapis.com
  • Vertex AI API (aiplatform.googleapis.com) — your agent uses Gemini models, and Toolbox uses the embedding API for vector search.
  • Cloud SQL Admin API (sqladmin.googleapis.com) — you provision and manage a PostgreSQL instance.
  • Compute Engine API (compute.googleapis.com) — required for creating Cloud SQL instances.
  • Cloud Run, Cloud Build, Artifact Registry — used in the deployment step later in this codelab.

3. Create the Database Instance

This step sets up Cloud SQL instance creation in the background — it provisions while you continue the tutorial.

Start the instance creation

Add the database password to your .env file and reload it:

echo "DB_PASSWORD=techjobs-pwd-2025" >> .env
source .env

Start the Cloud SQL instance creation. This runs in the background so you can keep working:

gcloud sql instances create jobs-instance \
  --database-version=POSTGRES_17 \
  --tier=db-custom-1-3840 \
  --edition=ENTERPRISE \
  --region=$REGION \
  --root-password=$DB_PASSWORD \
  --enable-google-ml-integration \
  --database-flags cloudsql.enable_google_ml_integration=on \
  --quiet &
  • db-custom-1-3840 is the smallest dedicated-core Cloud SQL tier (1 vCPU, 3.75 GB RAM) in the ENTERPRISE edition; see the Cloud SQL instance settings documentation for details. A dedicated core is required for the Vertex AI ML integration — shared-core tiers (db-f1-micro, db-g1-small) do not support it.
  • --root-password sets the password for the default postgres user.
  • --enable-google-ml-integration enables Cloud SQL's built-in integration with Vertex AI, which lets you call embedding models directly from SQL using the embedding() function.
  • The & runs the command in the background.

While this runs in the background, let's download the MCP Toolbox binary. You can do this in the same terminal.

Download the Toolbox binary

We will use MCP Toolbox throughout this tutorial. Fortunately, it ships as a pre-built binary for Linux. Download it in the background, as it takes a while:

cd ~/build-agent-adk-toolbox-cloudsql
curl -O https://storage.googleapis.com/genai-toolbox/v0.27.0/linux/amd64/toolbox &

Let this process run in the current tab (it is already running in the background, but its output still appears there). Open a new terminal tab in Cloud Shell (click the + icon) so you can work without the noise.

b01e3fbd89f17332.png

Navigate to your working directory again and activate the project using the previous setup script.

cd ~/build-agent-adk-toolbox-cloudsql
bash setup_verify_trial_project.sh && source .env

4. Initialize the Agent Project

This step sets up the Python project, installs dependencies, and scaffolds the ADK agent directory.

Set up the Python project

uv is a fast Python package and project manager written in Rust (see the uv documentation). This codelab uses it for speed and simplicity.

Initialize a Python project and add the required dependencies:

uv init
uv add google-adk==1.25.0 toolbox-adk==0.6.0
  • google-adk — Google's Agent Development Kit, including the Gemini SDK
  • toolbox-adk — ADK integration for MCP Toolbox for Databases.

Create the agent directory structure

ADK expects a specific folder layout: a directory named after your agent containing __init__.py, agent.py, and .env. ADK provides a built-in command to scaffold this structure quickly:

uv run adk create jobs_agent \
    --model gemini-2.5-flash \
    --project ${GOOGLE_CLOUD_PROJECT} \
    --region ${GOOGLE_CLOUD_LOCATION}

Your directory should now look like this:

build-agent-adk-toolbox-cloudsql/
├── jobs_agent/
│   ├── __init__.py
│   ├── agent.py
│   └── .env
├── pyproject.toml
├── .env              (project setup — already exists)
└── .venv/

5. Seed the Jobs Listing Database

This step writes the seed data, waits for the Cloud SQL instance to finish provisioning, and loads the jobs table with 15 job listings and their description embeddings.

Write the seed SQL

We will create a file called seed.sql in the Cloud Shell Editor containing the job listing content. It creates the jobs table with pgvector support and inserts 15 job listings at tech companies.

First, create the seed.sql file using the following command:

cloudshell edit seed.sql

Then copy this script into the file:

-- seed.sql
-- DISCLAIMER: These job listings are entirely fictional and created for tutorial
-- purposes only. Company names are used for illustrative context — the positions,
-- salaries, and descriptions do not reflect real openings.

CREATE EXTENSION IF NOT EXISTS google_ml_integration;
CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE IF NOT EXISTS jobs (
    id SERIAL PRIMARY KEY,
    title VARCHAR NOT NULL,
    company VARCHAR NOT NULL,
    role VARCHAR NOT NULL,
    tech_stack VARCHAR NOT NULL,
    salary_range VARCHAR NOT NULL,
    location VARCHAR NOT NULL,
    openings INTEGER NOT NULL,
    description TEXT NOT NULL,
    description_embedding vector(3072)
);

INSERT INTO jobs (title, company, role, tech_stack, salary_range, location, openings, description) VALUES
('Senior Backend Engineer', 'Stripe', 'Backend', 'Go, PostgreSQL, gRPC, Kubernetes', '$180-250K/year', 'San Francisco, Hybrid', 3,
 'Design and build high-throughput microservices powering payment infrastructure for millions of businesses. Optimize Go services for sub-100ms latency at scale, work with PostgreSQL and Redis for data persistence, and deploy on Kubernetes clusters handling billions of API calls.'),

('Machine Learning Engineer', 'Spotify', 'Data/AI', 'Python, TensorFlow, BigQuery, Vertex AI', '$170-230K/year', 'Stockholm, Remote', 2,
 'Build and deploy ML models for music recommendation and personalization systems serving hundreds of millions of listeners. Design feature pipelines in BigQuery, train models using distributed computing, and serve predictions through real-time APIs processing thousands of requests per second.'),

('Frontend Engineer', 'Vercel', 'Frontend', 'React, TypeScript, Next.js', '$140-190K/year', 'Remote', 4,
 'Build developer-facing dashboard interfaces and deployment tools used by millions of developers worldwide. Create responsive, accessible React components for project management, analytics, and real-time deployment monitoring with a focus on developer experience.'),

('DevOps Engineer', 'Datadog', 'DevOps', 'Terraform, GCP, Docker, Kubernetes, ArgoCD', '$160-220K/year', 'New York, Hybrid', 2,
 'Manage cloud infrastructure powering an observability platform used by thousands of engineering teams. Automate deployment pipelines with ArgoCD, manage multi-cloud Kubernetes clusters, and implement infrastructure-as-code with Terraform across production environments.'),

('Mobile Engineer (Android)', 'Grab', 'Mobile', 'Kotlin, Jetpack Compose, GraphQL', '$120-170K/year', 'Singapore, Hybrid', 3,
 'Develop features for a super-app serving millions of users across Southeast Asia. Build modern Android UIs with Jetpack Compose, integrate GraphQL APIs, and optimize app performance for diverse device capabilities and network conditions.'),

('Data Engineer', 'Airbnb', 'Data', 'Python, Apache Spark, Airflow, BigQuery', '$160-210K/year', 'San Francisco, Hybrid', 2,
 'Build data pipelines that process booking, search, and pricing data for a global travel marketplace. Design ETL workflows with Apache Spark and Airflow, maintain data warehouses in BigQuery, and ensure data quality for analytics and machine learning teams.'),

('Full Stack Engineer', 'Revolut', 'Full Stack', 'TypeScript, Node.js, React, PostgreSQL', '$130-180K/year', 'London, Remote', 5,
 'Build the next generation of financial products making banking accessible to millions of users across 35 countries. Develop real-time trading interfaces with React and WebSockets, build Node.js APIs handling market data streams, and design PostgreSQL schemas for financial transactions.'),

('Site Reliability Engineer', 'Cloudflare', 'SRE', 'Go, Prometheus, Grafana, GCP, Terraform', '$170-230K/year', 'Austin, Hybrid', 2,
 'Ensure 99.99% uptime for a global network handling millions of requests per second. Define SLOs, build monitoring dashboards with Prometheus and Grafana, manage incident response, and automate infrastructure scaling across 300+ data centers worldwide.'),

('Cloud Architect', 'Google Cloud', 'Cloud', 'GCP, Terraform, Kubernetes, Python', '$200-280K/year', 'Seattle, Hybrid', 1,
 'Help enterprises modernize their infrastructure on Google Cloud. Design multi-region architectures, lead migration projects from on-premises to GKE, and build reference implementations using Terraform and Cloud Foundation Toolkit.'),

('Backend Engineer (Payments)', 'Square', 'Backend', 'Java, Spring Boot, PostgreSQL, Kafka', '$160-220K/year', 'San Francisco, Hybrid', 3,
 'Build payment processing systems handling millions of transactions for businesses of all sizes. Design event-driven architectures using Kafka, implement idempotent payment flows with Spring Boot, and ensure PCI-DSS compliance across all services.'),

('AI Engineer', 'Hugging Face', 'Data/AI', 'Python, LangChain, Vertex AI, FastAPI, PostgreSQL', '$150-210K/year', 'Paris, Remote', 2,
 'Build AI-powered tools for the largest open-source ML community. Develop RAG pipelines that index and search model documentation, create conversational agents using LangChain, and deploy AI services with FastAPI on cloud infrastructure.'),

('Platform Engineer', 'Coinbase', 'Platform', 'Rust, Kubernetes, AWS, Terraform', '$180-250K/year', 'Remote', 0,
 'Build the infrastructure platform for a leading cryptocurrency exchange. Develop high-performance matching engines in Rust, manage Kubernetes clusters for microservices, and design CI/CD pipelines that enable rapid feature deployment with zero downtime.'),

('QA Automation Engineer', 'Shopify', 'QA', 'Python, Selenium, Cypress, Jenkins', '$110-160K/year', 'Toronto, Hybrid', 3,
 'Design and maintain automated test suites for a commerce platform powering millions of merchants. Build end-to-end test frameworks with Cypress and Selenium, integrate tests into Jenkins CI pipelines, and establish quality gates that prevent regressions in checkout and payment flows.'),

('Security Engineer', 'CrowdStrike', 'Security', 'Python, SIEM, Kubernetes, Penetration Testing', '$170-240K/year', 'Austin, On-site', 1,
 'Protect enterprise customers from cyber threats on a leading endpoint security platform. Conduct penetration testing, design security monitoring with SIEM tools, implement zero-trust networking in Kubernetes environments, and lead incident response for security events.'),

('Product Engineer', 'GitLab', 'Full Stack', 'Go, React, PostgreSQL, Redis, GCP', '$140-200K/year', 'Remote', 4,
 'Own features end-to-end for an all-in-one DevSecOps platform used by millions of developers. Build Go microservices for CI/CD pipelines, create React frontends for code review and project management, and collaborate with product managers to iterate on user-facing features using data-driven development.');

The seed script installs two PostgreSQL extensions:

  • google_ml_integration — provides the embedding() SQL function, which calls Vertex AI embedding models directly from SQL. This is a database-level extension that makes ML functions available inside jobs_db. The instance-level flag (--enable-google-ml-integration) you set during instance creation allows the Cloud SQL VM to reach Vertex AI — the extension makes the SQL functions available within this specific database.
  • vector (pgvector) — adds the vector data type and distance operators for storing and querying embeddings.

The description_embedding column is vector(3072) — a pgvector column that stores 3072-dimensional vectors. It's NULL for now; you generate and populate embeddings in the next step using the embedding() function.

Finish the database setup

The Cloud SQL instance creation you started in the previous step may still be running. Verify the instance is ready:

gcloud sql instances describe jobs-instance --format="value(state)"

You should see the following output:

RUNNABLE

34f5b48006b4cb3a.png

Next, grant the Cloud SQL instance's service account permission to call Vertex AI. This is required for the built-in embedding() function you'll use in the next step:

SERVICE_ACCOUNT=$(gcloud sql instances describe jobs-instance --format="value(serviceAccountEmailAddress)")

gcloud projects add-iam-policy-binding $GOOGLE_CLOUD_PROJECT \
  --member="serviceAccount:$SERVICE_ACCOUNT" \
  --role="roles/aiplatform.user" \
  --quiet

After that, create a dedicated database for the job listings:

gcloud sql databases create jobs_db --instance=jobs-instance

You should see output confirming the database was created:

Creating Cloud SQL database...done.                                                                         
Created database [jobs_db].
instance: jobs-instance
name: jobs_db
project: workshop-xxxxxxx

Connect and seed the database

Start the Cloud SQL Auth Proxy (cloud-sql-proxy is pre-installed in Cloud Shell). This provides a secure, authenticated connection from Cloud Shell to your Cloud SQL instance:

d72e56478b517b5c.jpeg

cloud-sql-proxy ${GOOGLE_CLOUD_PROJECT}:${REGION}:jobs-instance --port 5432 &

If the proxy starts successfully, you should see output like this in the terminal:

... Authorizing with Application Default Credentials
... [workshop-xxxxxx:your-location:jobs-instance] Listening on 127.0.0.1:5432
... The proxy has started successfully and is ready for new connections!

The current terminal now streams the Cloud SQL Auth Proxy logs continuously. Open a new terminal tab in Cloud Shell (click the + icon) to keep working in a clean session.

b01e3fbd89f17332.png

Navigate to your working directory again and activate the project using the previous setup script.

cd ~/build-agent-adk-toolbox-cloudsql
bash setup_verify_trial_project.sh && source .env

Then run the seed script:

psql "host=127.0.0.1 port=5432 dbname=jobs_db user=postgres password=$DB_PASSWORD" -f seed.sql

You should see terminal output like this:

CREATE EXTENSION
CREATE EXTENSION
CREATE TABLE
INSERT 0 15

Let's verify the data:

psql "host=127.0.0.1 port=5432 dbname=jobs_db user=postgres password=$DB_PASSWORD" \
  -c "SELECT title, company, role, openings FROM jobs ORDER BY role, title;"

You should see 15 job listings across multiple roles:

             title              |    company     |   role    | openings
---------------------------------+----------------+-----------+----------
 Senior Backend Engineer         | Stripe         | Backend   |        3
 Backend Engineer (Payments)     | Square         | Backend   |        3
 Cloud Architect                 | Google Cloud   | Cloud     |        1
 ...
(15 rows)

Generate embeddings for job descriptions

The description_embedding column in the jobs table is currently NULL. Cloud SQL's built-in google_ml_integration extension provides an embedding() function that calls Vertex AI directly from SQL — no Python script or external SDK needed.

Start the embedding generation in the background. This calls Vertex AI to generate a 3072-dimensional vector with the gemini-embedding-001 model for each of the 15 job descriptions:

psql "host=127.0.0.1 port=5432 dbname=jobs_db user=postgres password=$DB_PASSWORD" \
  -c "UPDATE jobs SET description_embedding = embedding('gemini-embedding-001', description)::vector;" &

This is what the command does:

  • embedding('gemini-embedding-001', description) — calls Vertex AI's Gemini embedding model directly from SQL, passing each job's description text. This is the google_ml_integration extension you installed in the seed script.
  • ::vector — casts the returned float array to pgvector's vector type so it can be stored and queried with distance operators.
  • The UPDATE runs across all 15 rows, generating one 3072-dimensional embedding per job description.
  • The & runs the command in the background so you can continue working while Vertex AI processes the embeddings.
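As optional background: the embedding() call and the ::vector cast happen entirely inside the database here, but if you ever write vectors from a client application instead, pgvector accepts a bracketed text literal that can be cast with ::vector. A small illustrative sketch of that formatting (the function name is mine, not a library API):

```python
def to_pgvector_literal(values):
    # pgvector's text input format: '[v1,v2,...]', castable with ::vector
    return "[" + ",".join(repr(v) for v in values) + "]"

embedding = [0.12, -0.5, 0.33]  # a real embedding would have 3072 floats
print(to_pgvector_literal(embedding))

# With a client library such as psycopg you could then bind it, e.g.:
#   cur.execute("UPDATE jobs SET description_embedding = %s::vector WHERE id = %s",
#               (to_pgvector_literal(embedding), job_id))
```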

As with the previous background processes, the current terminal will keep printing this command's output. Open a new terminal tab in Cloud Shell (click the + icon) to continue in a clean session.

b01e3fbd89f17332.png

Navigate to your working directory again and activate the project using the previous setup script.

cd ~/build-agent-adk-toolbox-cloudsql
bash setup_verify_trial_project.sh && source .env

Then continue to the next step.

6. Configure MCP Toolbox for Databases

This step introduces MCP Toolbox for Databases, configures it to connect to your Cloud SQL instance, and defines two standard SQL query tools.

What is MCP and why use Toolbox?

e7b9be2e1c98b4db.png

MCP (Model Context Protocol) is an open protocol that standardizes how AI agents discover and interact with external tools. It defines a client-server model: the agent hosts an MCP client, and tools are exposed by MCP servers. Any MCP-compatible client can use any MCP-compatible server — the agent doesn't need custom integration code for each tool.
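To make the protocol concrete, here is a small self-contained sketch of the message shapes involved in tool discovery. MCP uses JSON-RPC 2.0 messages: a client sends a tools/list request, and the server answers with a tool catalog. The sample tool below is a hand-written stand-in for what a server might expose; no server is contacted:

```python
import json

# A client discovers tools with a JSON-RPC 2.0 "tools/list" request.
request = {"jsonrpc": "2.0", "id": 1, "method": "tools/list"}

# A server answers with a catalog: name, description, and an input
# schema the model uses to fill in arguments.
response = {
    "jsonrpc": "2.0",
    "id": 1,
    "result": {
        "tools": [
            {
                "name": "search-jobs",
                "description": "Search job listings by role and/or tech stack.",
                "inputSchema": {
                    "type": "object",
                    "properties": {
                        "role": {"type": "string"},
                        "tech_stack": {"type": "string"},
                    },
                },
            }
        ]
    },
}

# Any MCP client can consume this catalog without custom integration code.
tool_names = [tool["name"] for tool in response["result"]["tools"]]
print(json.dumps(request))
print(tool_names)
```

Because the catalog is data, not code, swapping in a different server (or adding a tool) requires no change on the client side.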

d5baa77423f0f465.png

MCP Toolbox for Databases is an open-source MCP server built specifically for database access. Without it, you would write Python functions that open database connections, manage connection pools, construct parameterized queries to prevent SQL injection, handle errors, and embed all of that code inside your agent. Every agent that needs database access repeats this work. Changing a query means redeploying the agent.

With Toolbox, you write a YAML file. Each tool maps to a parameterized SQL statement. Toolbox handles connection pooling, parameterized queries, authentication, and observability. Tools are decoupled from the agent — update a query by editing tools.yaml and restarting Toolbox, without touching agent code. The same tools work across ADK, LangGraph, LlamaIndex, or any MCP-compatible framework.

Write the tools configuration

Now create a file called tools.yaml in the Cloud Shell Editor to hold the tool configuration:

cloudshell edit tools.yaml

The file uses multi-document YAML — each block separated by --- is a standalone resource. Every resource has a kind that declares what it is (sources for database connections, tools for agent-callable actions) and a type that specifies the backend (cloud-sql-postgres for the source, postgres-sql for SQL-based tools). A tool references its source by name, which is how Toolbox knows which connection pool to execute against. Environment variables use ${VAR_NAME} syntax and are resolved at startup.

First, copy the following block into the tools.yaml file:

# tools.yaml

# --- Data Source ---
kind: sources
name: jobs-db
type: cloud-sql-postgres
project: ${GOOGLE_CLOUD_PROJECT}
region: ${REGION}
instance: jobs-instance
database: jobs_db
user: postgres
password: ${DB_PASSWORD}

---

This block defines the following resource:

  • Source (jobs-db) — tells Toolbox how to connect to your Cloud SQL PostgreSQL instance. The cloud-sql-postgres type uses the Cloud SQL connector internally, handling authentication and secure connections automatically. The ${GOOGLE_CLOUD_PROJECT}, ${REGION}, and ${DB_PASSWORD} placeholders are resolved from environment variables at startup.
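The placeholder resolution behaves like ordinary shell-style substitution. A minimal Python sketch of the idea (this mimics the behavior for illustration; it is not Toolbox's actual implementation):

```python
import os
from string import Template

# Pretend these were loaded from .env and exported before the server started.
os.environ.update({"GOOGLE_CLOUD_PROJECT": "workshop-demo", "REGION": "us-central1"})

source_yaml = "project: ${GOOGLE_CLOUD_PROJECT}\nregion: ${REGION}\n"

# string.Template understands the same ${VAR} syntax. Substitution fails
# loudly (KeyError) if a referenced variable is missing, which is what you
# want at server startup rather than at query time.
resolved = Template(source_yaml).substitute(os.environ)
print(resolved)
```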

Next, append the following block below the --- separator in tools.yaml:

# --- Tool 1: Search jobs by role and/or tech stack ---
kind: tools
name: search-jobs
type: postgres-sql
source: jobs-db
description: >-
  Search for job listings by role category and/or tech stack.
  Use this tool when the developer wants to browse listings
  by role (e.g., Backend, Frontend, Data) or find jobs
  using a specific technology. Both parameters accept an
  empty string to match all values.
statement: |
  SELECT title, company, role, tech_stack, salary_range, location, openings
  FROM jobs
  WHERE ($1 = '' OR LOWER(role) = LOWER($1))
  AND ($2 = '' OR LOWER(tech_stack) LIKE '%' || LOWER($2) || '%')
  ORDER BY title
  LIMIT 10
parameters:
  - name: role
    type: string
    description: "The role category to filter by (e.g., 'Backend', 'Frontend', 'Data/AI', 'DevOps'). Use empty string for all roles."
  - name: tech_stack
    type: string
    description: "A technology to search for in the tech stack (partial match, e.g., 'Python', 'Kubernetes'). Use empty string for all tech stacks."

---

# --- Tool 2: Get full details for a specific job ---
kind: tools
name: get-job-details
type: postgres-sql
source: jobs-db
description: >-
  Get full details for a specific job listing including its description,
  salary range, location, and number of openings. Use this tool when the
  developer asks about a particular job by title or company.
statement: |
  SELECT title, company, role, tech_stack, salary_range, location, openings, description
  FROM jobs
  WHERE LOWER(title) LIKE '%' || LOWER($1) || '%'
  OR LOWER(company) LIKE '%' || LOWER($1) || '%'
parameters:
  - name: search_term
    type: string
    description: "The job title or company name to look up (partial match supported)."

---

This block defines the following resources:

  • Tools 1 and 2 (search-jobs, get-job-details) — standard SQL query tools. Each maps a tool name (what the agent sees) to a parameterized SQL statement (what the database executes). Parameters use $1, $2 positional placeholders. Toolbox executes these as prepared statements, which prevents SQL injection.
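You can exercise the same statement pattern locally. The sketch below uses Python's built-in sqlite3 module as a stand-in for PostgreSQL (? plays the role of $1) to show both why bound parameters neutralize injection attempts and how the empty-string convention matches all rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE jobs (title TEXT, role TEXT, tech_stack TEXT)")
conn.executemany(
    "INSERT INTO jobs VALUES (?, ?, ?)",
    [
        ("Senior Backend Engineer", "Backend", "Go, PostgreSQL"),
        ("Frontend Engineer", "Frontend", "React, TypeScript"),
    ],
)

def search_jobs(role, tech):
    # Same shape as the search-jobs statement: '' matches everything, and
    # values are bound as parameters, never spliced into the SQL text.
    sql = """SELECT title FROM jobs
             WHERE (? = '' OR LOWER(role) = LOWER(?))
             AND (? = '' OR LOWER(tech_stack) LIKE '%' || LOWER(?) || '%')"""
    return [row[0] for row in conn.execute(sql, (role, role, tech, tech))]

print(search_jobs("", ""))         # both filters empty: all rows
print(search_jobs("backend", ""))  # case-insensitive role match
# An injection attempt arrives as a plain (non-matching) literal string:
print(search_jobs("x' OR '1'='1", ""))
```

Because the driver sends the statement and the values separately, the "injection" string can only ever be compared against column values, never executed as SQL.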

Next, append the following block below the --- separator in tools.yaml:

# --- Embedding Model ---
kind: embeddingModels
name: gemini-embedding
type: gemini
model: gemini-embedding-001
dimension: 3072

---

This block defines the following resource:

  • Embedding model (gemini-embedding) — configures Toolbox to call Gemini's gemini-embedding-001 model for generating 3072-dimensional text embeddings. Toolbox uses Application Default Credentials (ADC) to authenticate — no API key needed in Cloud Shell or Cloud Run. Note that the dimension configured here must match the vector(3072) column you created when seeding the database.

Next, append the following block below the --- separator in tools.yaml:

# --- Tool 3: Semantic search by description ---
kind: tools
name: search-jobs-by-description
type: postgres-sql
source: jobs-db
description: >-
  Find jobs that match a natural language description of what the developer
  is looking for. Use this tool when the developer describes their ideal job
  using interests, work style, career goals, or project type rather than a
  specific role or tech stack. Examples: "I want to work on AI chatbots,"
  "a remote job at a fintech startup," "something involving infrastructure
  and reliability."
statement: |
  SELECT title, company, role, tech_stack, salary_range, location, description
  FROM jobs
  WHERE description_embedding IS NOT NULL
  ORDER BY description_embedding <=> $1
  LIMIT 5
parameters:
  - name: search_query
    type: string
    description: "A natural language description of the kind of job the developer is looking for."
    embeddedBy: gemini-embedding

---

This block defines the following resource:

  • Tool 3 (search-jobs-by-description) — a vector search tool. The search_query parameter has embeddedBy: gemini-embedding, which tells Toolbox to intercept the raw text, send it to the embedding model, and use the resulting vector in the SQL statement. The <=> operator is pgvector's cosine distance — smaller values mean more similar descriptions.
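To make the ordering concrete, here is a small self-contained sketch of cosine distance (the quantity the <=> operator computes) using toy 3-dimensional vectors instead of real 3072-dimensional embeddings:

```python
import math

def cosine_distance(a, b):
    # pgvector's <=> operator: 1 - (a.b) / (|a| * |b|)
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return 1.0 - dot / norm

query = [1.0, 0.0, 0.0]
similar_job = [0.9, 0.1, 0.0]    # points in nearly the same direction
unrelated_job = [0.0, 1.0, 0.0]  # orthogonal, i.e. unrelated meaning

# ORDER BY description_embedding <=> $1 sorts ascending, so the most
# similar row (smallest distance) comes first.
print(cosine_distance(query, similar_job))    # small: close match
print(cosine_distance(query, unrelated_job))  # 1.0: no similarity
```

In the real tool, Toolbox first turns the search_query text into a 3072-dimensional vector via embeddedBy, then the database performs exactly this comparison against every stored description_embedding.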

Finally, append the last tool below the --- separator in tools.yaml:

# --- Tool 4: Add a new job listing with automatic embedding ---
kind: tools
name: add-job
type: postgres-sql
source: jobs-db
description: >-
  Add a new job listing to the platform. Use this tool when a user asks
  to post a job that is not currently listed.
statement: |
  INSERT INTO jobs (title, company, role, tech_stack, salary_range, location, openings, description, description_embedding)
  VALUES ($1, $2, $3, $4, $5, $6, CAST($7 AS INTEGER), $8, $9)
  RETURNING title, company
parameters:
  - name: title
    type: string
    description: "The job title (e.g., 'Senior Backend Engineer')."
  - name: company
    type: string
    description: "The company name (e.g., 'Stripe', 'Spotify')."
  - name: role
    type: string
    description: "The role category (e.g., 'Backend', 'Frontend', 'Data/AI', 'DevOps')."
  - name: tech_stack
    type: string
    description: "Comma-separated list of technologies (e.g., 'Python, FastAPI, GCP')."
  - name: salary_range
    type: string
    description: "The salary range (e.g., '$150-200K/year')."
  - name: location
    type: string
    description: "Work location and arrangement (e.g., 'Remote')."
  - name: openings
    type: string
    description: "The number of open positions."
  - name: description
    type: string
    description: "A short description of the job (2-3 sentences)."
  - name: description_vector
    type: string
    description: "Auto-generated embedding vector for the job description."
    valueFromParam: description
    embeddedBy: gemini-embedding

This block defines the following resource:

  • Tool 4 (add-job) — demonstrates vector ingestion. The description_vector parameter has two special fields:
      • valueFromParam: description — Toolbox copies the value from the description parameter into this one. The LLM never sees this parameter.
      • embeddedBy: gemini-embedding — Toolbox embeds the copied text into a vector before passing it to the SQL statement.

The result: one tool call stores both the raw description text and its vector embedding, without the agent knowing anything about embeddings.
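Conceptually, the server-side handling can be modeled as below. This is a simplified illustration, not Toolbox's actual code: the LLM supplies only the visible parameters, and the server synthesizes the vector parameter by copying and embedding the description text:

```python
def resolve_params(declared, llm_args, embed):
    """Build the final SQL argument list from the model's arguments.

    declared: parameter specs (dicts mirroring tools.yaml fields).
    llm_args: the arguments the LLM actually produced.
    embed:    a text -> vector function (Gemini in the real server).
    """
    final = []
    for spec in declared:
        # valueFromParam: the value is copied from another parameter,
        # so the LLM never sees or supplies this one.
        source = spec.get("valueFromParam", spec["name"])
        value = llm_args[source]
        # embeddedBy: the text is turned into a vector before binding.
        if "embeddedBy" in spec:
            value = embed(value)
        final.append(value)
    return final

# A fake 3-dimensional embedder stands in for gemini-embedding-001.
fake_embed = lambda text: [float(len(text)), 0.0, 1.0]

declared = [
    {"name": "description"},
    {"name": "description_vector", "valueFromParam": "description",
     "embeddedBy": "gemini-embedding"},
]
args = resolve_params(declared, {"description": "Build RAG pipelines"}, fake_embed)
print(args)  # the raw text plus its synthesized vector
```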

Each resource in the file is separated by --- and carries kind, name, and type fields, as described earlier. In summary, we have now configured all of the following:

  • The source database connection
  • Two standard SQL query tools (tools 1 and 2) for filtered browsing
  • The embedding model
  • A vector similarity search tool (tool 3)
  • A vector ingestion tool (tool 4)

Verify the embeddings

Before starting Toolbox, confirm the background embedding generation has completed. Check that all jobs now have embeddings:

psql "host=127.0.0.1 port=5432 dbname=jobs_db user=postgres password=$DB_PASSWORD" \
  -c "SELECT title, (description_embedding IS NOT NULL) AS has_embedding FROM jobs ORDER BY title;"

Every row should show t (true) in the has_embedding column. If some rows still show f, wait for the background embedding job to finish and re-run the query.

           title            | has_embedding 
-----------------------------+---------------
 AI Engineer                 | t
 Backend Engineer (Payments) | t
 Cloud Architect             | t
 Data Engineer               | t
 DevOps Engineer             | t
 Frontend Engineer           | t
 Full Stack Engineer         | t

Start the Toolbox server

In the setup step, we already started downloading the toolbox executable. Ensure the binary exists and finished downloading; if not, the following snippet downloads it again and waits until it completes:

cd ~/build-agent-adk-toolbox-cloudsql
if [ ! -f toolbox ]; then
  curl -O https://storage.googleapis.com/genai-toolbox/v0.27.0/linux/amd64/toolbox
fi
chmod +x toolbox

Export the required environment variables and start Toolbox. The GOOGLE_CLOUD_LOCATION and GOOGLE_GENAI_USE_VERTEXAI variables are required because the configuration includes an embedding model — GOOGLE_GENAI_USE_VERTEXAI tells the Gemini SDK to route through Vertex AI (instead of the consumer Gemini API), and GOOGLE_CLOUD_LOCATION tells it which regional endpoint to use.

export GOOGLE_CLOUD_PROJECT=$GOOGLE_CLOUD_PROJECT
export GOOGLE_CLOUD_LOCATION=$GOOGLE_CLOUD_LOCATION
export GOOGLE_GENAI_USE_VERTEXAI=true
export DB_PASSWORD=$DB_PASSWORD
export REGION=$REGION
./toolbox --tools-file tools.yaml &

You should see output confirming the server is ready, similar to the following:

... INFO "Initialized 0 authServices: " 
... INFO "Initialized 1 embeddingModels: gemini-embedding" 
... INFO "Initialized 4 tools: add-job, search-jobs, get-job-details, search-jobs-by-description" 
...
... INFO "Server ready to serve!"

Like the previous step, this spawns a background process that keeps writing output to the terminal. Open a new terminal tab in Cloud Shell (click the + icon) so you can keep working in a clean session.

b01e3fbd89f17332.png

Navigate to your working directory again and activate the project using the previous setup script.

cd ~/build-agent-adk-toolbox-cloudsql
bash setup_verify_trial_project.sh && source .env

Verify the tools

Query the Toolbox API to list all registered tools:

curl -s http://localhost:5000/api/toolset | python3 -m json.tool

You should see the registered tools with their descriptions and parameters, similar to this:

...
       
"search-jobs-by-description": {
            "description": "Find jobs that match a natural language description of what the developer is looking for. Use this tool when the developer describes their ideal job using interests, work style, career goals, or project type rather than a specific role or tech stack. Examples: \"I want to work on AI chatbots,\" \"a remote job at a fintech startup,\" \"something involving infrastructure and reliability.\"",
            "parameters": [
                {
                    "name": "search_query",
                    "type": "string",
                    "required": true,
                    "description": "A natural language description of the kind of job the developer is looking for.",
                    "authSources": []
                }
            ],
            "authRequired": []
        }
...

Test the search-jobs tool directly:

curl -s -X POST http://localhost:5000/api/tool/search-jobs/invoke \
  -H "Content-Type: application/json" \
  -d '{"role": "Backend", "tech_stack": ""}' | jq '.result | fromjson'

The response should contain the two backend engineering jobs from your seed data.

[
  {
    "title": "Backend Engineer (Payments)",
    "company": "Square",
    "role": "Backend",
    "tech_stack": "Java, Spring Boot, PostgreSQL, Kafka",
    "salary_range": "$160-220K/year",
    "location": "San Francisco, Hybrid",
    "openings": 3
  },
  {
    "title": "Senior Backend Engineer",
    "company": "Stripe",
    "role": "Backend",
    "tech_stack": "Go, PostgreSQL, gRPC, Kubernetes",
    "salary_range": "$180-250K/year",
    "location": "San Francisco, Hybrid",
    "openings": 3
  }
]
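
You can make the same invoke call from Python. Note that the response's result field is itself a JSON-encoded string (which is why the jq pipeline above uses fromjson), so it needs a second decode. A minimal sketch, assuming the endpoint paths shown above:

```python
import json
import urllib.request

def decode_invoke_response(payload: dict):
    """The "result" field is a JSON string, so it needs a second decode."""
    return json.loads(payload["result"])

def invoke_tool(base_url: str, tool: str, params: dict):
    """POST to a Toolbox tool's invoke endpoint and return decoded rows."""
    req = urllib.request.Request(
        f"{base_url}/api/tool/{tool}/invoke",
        data=json.dumps(params).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return decode_invoke_response(json.load(resp))
```

For example, invoke_tool("http://localhost:5000", "search-jobs", {"role": "Backend", "tech_stack": ""}) should return the same two backend jobs as the curl command above.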

7. Build the ADK Agent

This step wires the ADK agent to the running Toolbox server and tests all four tools — standard queries, semantic search, and vector ingestion. The agent code is minimal: all database logic lives in tools.yaml.

Configure the agent's environment

ADK reads GOOGLE_GENAI_USE_VERTEXAI, GOOGLE_CLOUD_PROJECT, and GOOGLE_CLOUD_LOCATION from the shell environment, which you already set in an earlier step. The only agent-specific variable is TOOLBOX_URL — append it to the agent's .env file:

echo -e "\nTOOLBOX_URL=http://127.0.0.1:5000" >> jobs_agent/.env

Update the agent module

Open jobs_agent/agent.py in the Cloud Shell Editor

cloudshell edit jobs_agent/agent.py

and overwrite the content with the following code:

# jobs_agent/agent.py
import os

from google.adk.agents import LlmAgent
from toolbox_adk import ToolboxToolset

TOOLBOX_URL = os.environ.get("TOOLBOX_URL", "http://127.0.0.1:5000")

toolbox = ToolboxToolset(TOOLBOX_URL)

root_agent = LlmAgent(
    name="jobs_agent",
    model="gemini-2.5-flash",
    instruction="""You are a helpful assistant at "TechJobs," a tech job listing platform.

Your job:
- Help developers browse job listings by role or tech stack.
- Provide full details about specific positions, including salary range and number of openings.
- Recommend jobs based on natural language descriptions of what the developer is looking for.
- Add new job listings to the platform when asked.

When a developer asks about a specific job by title or company, use the get-job-details tool.
When a developer asks for a specific role category or tech stack, use the search-jobs tool.
When a developer describes what kind of job they want — by interest area, work style,
career goals, or project type — use the search-jobs-by-description tool for semantic search.
When in doubt between search-jobs and search-jobs-by-description, prefer
search-jobs-by-description — it searches job descriptions and finds more relevant matches.

If a position has no openings (openings is 0), let the developer know
and suggest similar alternatives from the search results.

Be conversational, knowledgeable, and concise.""",
    tools=[toolbox],
)

Notice that there is no database code in here — ToolboxToolset connects to the Toolbox server at startup and loads all available tools. The agent calls tools by name; Toolbox translates those calls into SQL queries against Cloud SQL.

The TOOLBOX_URL environment variable defaults to http://127.0.0.1:5000 for local development. When you deploy to Cloud Run later, you override this with the Toolbox service's Cloud Run URL — no code changes needed.

The instruction covers all four tools: standard search, job details, semantic search, and the add-job ingestion tool. Gemini combines these instructions with the tool descriptions from tools.yaml to decide which tool to call.

Test the agent

Start the ADK dev UI:

cd ~/build-agent-adk-toolbox-cloudsql
uv run adk web

Open the URL shown in the terminal (typically http://localhost:8000) using Cloud Shell's Web Preview feature, or Ctrl+click the URL in the terminal. Select jobs_agent from the agent dropdown in the top-left corner.

Test standard queries

Try these prompts to verify the standard SQL tools:

What backend engineering jobs do you have?
Any jobs using Kubernetes?
Tell me about the Cloud Architect position

93ac33e7f73aa0b9.png 240c53376042a916.png

Test semantic search

Try natural language descriptions that don't map to a specific role or tech stack:

I want a remote job where I can work on AI and machine learning
Find me something in fintech with good work-life balance
I'm interested in infrastructure and reliability engineering

The agent should pick the right tool based on the query type: structured filters go through search-jobs, while natural language descriptions go through search-jobs-by-description.

b0ea629f5c9b4c26.png

Test vector ingestion

Ask the agent to add a new job:

Add a new job: 'Robotics Software Engineer' at Boston Dynamics, role Robotics, tech stack: Python, C++, ROS, Computer Vision, salary $160-230K/year, location Waltham MA, Hybrid, 2 openings. Description: Design and implement autonomous navigation and manipulation algorithms for next-generation robots. Work on perception pipelines using computer vision and lidar, develop motion planning software in C++ and Python, and test systems on real hardware in warehouse and logistics environments.

c601a7a9bc0a705b.png

Now try to search for it:

Find me jobs involving autonomous systems and working with physical hardware

The embedding was generated automatically during the INSERT — no separate step needed.

5a3d8e6f523dc18b.png

You now have a fully working agentic RAG application built with ADK, MCP Toolbox, and Cloud SQL. Congratulations! Let's take it a step further and deploy both services to Cloud Run.

Before proceeding, stop the dev UI by pressing Ctrl+C twice in its terminal.

8. Deploy to Cloud Run

The agent and Toolbox work locally. This step deploys both as Cloud Run services so they're accessible over the internet. The Toolbox service runs as an MCP server on Cloud Run, and the agent service connects to it.

Prepare the Toolbox for deployment

Create a deployment directory for the Toolbox service:

cd ~/build-agent-adk-toolbox-cloudsql
mkdir -p deploy-toolbox
cp toolbox tools.yaml deploy-toolbox/

Create the Dockerfile for the Toolbox. Open deploy-toolbox/Dockerfile in the Cloud Shell Editor:

cloudshell edit deploy-toolbox/Dockerfile

Copy the following content into it:

# deploy-toolbox/Dockerfile
FROM debian:bookworm-slim
RUN apt-get update && apt-get install -y ca-certificates && rm -rf /var/lib/apt/lists/*
WORKDIR /app
COPY toolbox tools.yaml ./
RUN chmod +x toolbox
EXPOSE 8080
CMD ["./toolbox", "--tools-file", "tools.yaml", "--address", "0.0.0.0", "--port", "8080"]

The Toolbox binary and tools.yaml are packaged into a minimal Debian image. Cloud Run routes traffic to port 8080.

Deploy the Toolbox service

cd ~/build-agent-adk-toolbox-cloudsql
gcloud run deploy toolbox-service \
  --source deploy-toolbox/ \
  --region $REGION \
  --set-env-vars "DB_PASSWORD=$DB_PASSWORD,GOOGLE_CLOUD_PROJECT=$GOOGLE_CLOUD_PROJECT,REGION=$REGION,GOOGLE_CLOUD_LOCATION=$GOOGLE_CLOUD_LOCATION,GOOGLE_GENAI_USE_VERTEXAI=true" \
  --allow-unauthenticated \
  --quiet

This command submits the source to Cloud Build, builds a container image, pushes it to Artifact Registry, and deploys it to Cloud Run. This takes a few minutes. While it builds, open a new terminal tab in Cloud Shell (click the + icon) so you can keep working.

b01e3fbd89f17332.png

Navigate to your working directory again and activate the project using the previous setup script.

cd ~/build-agent-adk-toolbox-cloudsql
bash setup_verify_trial_project.sh && source .env

Prepare the agent for deployment

While the Toolbox builds, set up the agent's deployment files.

Create a Dockerfile in the project root. Open Dockerfile in the Cloud Shell Editor:

cloudshell edit Dockerfile

Then copy the following content into it:

# Dockerfile
FROM ghcr.io/astral-sh/uv:python3.12-trixie-slim
WORKDIR /app
COPY pyproject.toml ./
COPY uv.lock ./
RUN uv sync --no-dev
COPY jobs_agent/ jobs_agent/
EXPOSE 8080
CMD ["uv", "run", "adk", "web", "--host", "0.0.0.0", "--port", "8080"]

This Dockerfile uses ghcr.io/astral-sh/uv as the base image, which includes both Python and uv pre-installed — no need to install uv separately via pip.

Create a .dockerignore file to exclude unnecessary files from the container image:

cloudshell edit .dockerignore

Then copy the following content into it:

# .dockerignore
.venv/
__pycache__/
*.pyc
.env
jobs_agent/.env
toolbox
tools.yaml
seed.sql
deploy-toolbox/

Deploy the agent service

Wait for the Toolbox deployment to complete, then retrieve its Cloud Run URL:

TOOLBOX_URL=$(gcloud run services describe toolbox-service \
  --region=$REGION \
  --format='value(status.url)')
echo "Toolbox URL: $TOOLBOX_URL"

You should see output similar to this:

Toolbox URL: https://toolbox-service-xxxxxx-xx.a.run.app

Then verify that the deployed Toolbox is working:

curl -s "$TOOLBOX_URL/api/toolset" | python3 -m json.tool | head -5

If the output looks like this example, the deployment succeeded:

{
    "serverVersion": "0.27.0+binary.linux.amd64.c5524d3",
    "tools": {
        "add-job": {
            "description": "Add a new job listing to the platform. Use this tool when a user asks to post a job that is not currently listed.",

Next, let's deploy the agent, passing the Toolbox URL as an environment variable:

cd ~/build-agent-adk-toolbox-cloudsql
gcloud run deploy jobs-agent \
  --source . \
  --region $REGION \
  --set-env-vars "TOOLBOX_URL=$TOOLBOX_URL,GOOGLE_CLOUD_PROJECT=$GOOGLE_CLOUD_PROJECT,GOOGLE_CLOUD_LOCATION=$GOOGLE_CLOUD_LOCATION,GOOGLE_GENAI_USE_VERTEXAI=true" \
  --allow-unauthenticated \
  --quiet

The agent code reads TOOLBOX_URL from the environment (you set this up previously). Locally it points to http://127.0.0.1:5000; on Cloud Run it points to the Toolbox service URL. No code changes needed.
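
This pattern, a loopback default overridden by a deploy-time environment variable, can be sketched as follows (a hypothetical helper; the agent module above inlines the same lookup):

```python
import os

def resolve_toolbox_url() -> str:
    """Return the Toolbox endpoint: the Cloud Run service URL when set
    via --set-env-vars, otherwise the local-development default."""
    return os.environ.get("TOOLBOX_URL", "http://127.0.0.1:5000")
```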

Test the deployed agent

Retrieve the agent's Cloud Run URL:

AGENT_URL=$(gcloud run services describe jobs-agent \
  --region=$REGION \
  --format='value(status.url)')
echo "Agent URL: $AGENT_URL"

Open the URL in your browser. The ADK dev UI loads — the same interface you've been using locally, now running on Cloud Run.

Select jobs_agent from the dropdown and test:

What backend engineering jobs do you have?
I want a remote job working on AI and machine learning

Both queries work through the deployed services: the agent on Cloud Run calls the Toolbox on Cloud Run, which queries Cloud SQL.

9. Congratulations / Clean Up

You've built and deployed a smart job board assistant that uses MCP Toolbox for Databases to bridge an ADK agent and Cloud SQL PostgreSQL — with both standard SQL queries and semantic vector search.

What you've learned

  • How MCP standardizes tool access for AI agents, and how MCP Toolbox for Databases applies this specifically to database operations — replacing custom database code with declarative YAML configuration
  • How to configure Cloud SQL PostgreSQL as a Toolbox data source using the cloud-sql-postgres source type
  • How to define standard SQL query tools with parameterized statements that prevent SQL injection
  • How to enable vector search using pgvector and gemini-embedding-001, with the embeddedBy parameter for automatic query embedding
  • How valueFromParam enables automatic vector ingestion — the LLM provides a text description, and Toolbox silently copies, embeds, and stores the vector alongside the text
  • How ADK's ToolboxToolset loads tools from a running Toolbox server, keeping agent code minimal and database logic fully decoupled
  • How to deploy both the Toolbox MCP server and the ADK agent to Cloud Run as separate services

Clean up

To avoid incurring charges to your Google Cloud account for the resources created in this codelab, you can either delete the individual resources or delete the entire project.

Option 1: Delete the project

The easiest way to clean up is to delete the project. This removes all resources associated with the project.

gcloud projects delete $GOOGLE_CLOUD_PROJECT

Option 2: Delete individual resources

If you want to keep the project but remove only the resources created in this codelab:

gcloud run services delete jobs-agent --region=$REGION --quiet
gcloud run services delete toolbox-service --region=$REGION --quiet
gcloud sql instances delete jobs-instance --quiet
gcloud artifacts repositories delete cloud-run-source-deploy --location=$REGION --quiet 2>/dev/null