1. Introduction
AI agents are only as useful as the data they can access. Most real-world data lives in databases — and connecting agents to databases typically means writing connection management, query logic, and embedding pipelines inside your agent code. Every agent that needs database access repeats this work, and every query change requires redeploying the agent.
This codelab shows a different approach. You declare your database tools in a YAML file — standard SQL queries, vector similarity search, even automatic embedding generation — and MCP Toolbox for Databases handles all database operations as an MCP server. Your agent code stays minimal: load the tools, let Gemini decide which one to call.
What you'll build
A Smart Job Board Assistant for "TechJobs" — an ADK agent powered by Gemini that helps developers browse tech job listings using standard filters (role, tech stack) and discover jobs through natural language descriptions like "I want a remote job working on AI chatbots." The agent reads from and writes to a Cloud SQL PostgreSQL database entirely through MCP Toolbox for Databases, which handles all database access — including automatic embedding generation for vector search. By the end, both the Toolbox and the agent run on Cloud Run.
What you'll learn
- How MCP (Model Context Protocol) standardizes tool access for AI agents, and how MCP Toolbox for Databases applies this to database operations
- Set up MCP Toolbox for Databases as middleware between an ADK agent and Cloud SQL PostgreSQL
- Define database tools declaratively in tools.yaml — no database code in your agent
- Build an ADK agent that loads tools from a running Toolbox server using ToolboxToolset
- Generate vector embeddings using Cloud SQL's built-in embedding() function and enable semantic search with pgvector
- Use the valueFromParam feature for automatic vector ingestion on write operations
- Deploy both the Toolbox server and the ADK agent to Cloud Run
Prerequisites
- A Google Cloud account with a trial billing account
- Basic familiarity with Python and SQL
- Prior experience with Cloud Database and ADK will be helpful
2. Set Up Your Environment
This step prepares your Cloud Shell environment, configures your Google Cloud project, and clones the reference repository.
Open Cloud Shell
Open Cloud Shell in your browser. Cloud Shell provides a pre-configured environment with all the tools you need for this codelab. Click Authorize when prompted to grant Cloud Shell access to your Google Cloud account.
Then click "View" -> "Terminal" to open the terminal. Your interface should look similar to this:

This will be our main interface: the IDE on top and the terminal on the bottom.
Set up your working directory
Create your working directory. All code you write in this codelab lives here:
mkdir -p ~/build-agent-adk-toolbox-cloudsql
cloudshell workspace ~/build-agent-adk-toolbox-cloudsql && cd ~/build-agent-adk-toolbox-cloudsql
After that, prepare a few directories to hold the seeding scripts and logs:
mkdir -p ~/build-agent-adk-toolbox-cloudsql/scripts
mkdir -p ~/build-agent-adk-toolbox-cloudsql/logs
Set up your Google Cloud project
Create the .env file with the location variables:
# For Vertex AI / Gemini API calls
echo "GOOGLE_CLOUD_LOCATION=global" > .env
# For Cloud SQL, Cloud Run, Artifact Registry
echo "REGION=us-central1" >> .env
To simplify project setup in your terminal, download this project setup script into your working directory:
curl -sL https://raw.githubusercontent.com/alphinside/cloud-trial-project-setup/main/setup_verify_trial_project.sh -o setup_verify_trial_project.sh
Run the script. It verifies your trial billing account, creates a new project (or validates an existing one), saves your project ID to a .env file in the current directory, and sets the active project in gcloud.
bash setup_verify_trial_project.sh && source .env
The script will:
- Verify you have an active trial billing account
- Check for an existing project in .env (if any)
- Create a new project or reuse the existing one
- Link the trial billing account to your project
- Save the project ID to .env
- Set the project as the active gcloud project
Verify the project is set correctly by checking the yellow text next to your working directory in the Cloud Shell terminal prompt. It should display your project ID.

Activate Required APIs
Next, we need to enable several APIs for the products we will be interacting with:
gcloud services enable \
aiplatform.googleapis.com \
sqladmin.googleapis.com \
compute.googleapis.com \
run.googleapis.com \
cloudbuild.googleapis.com \
artifactregistry.googleapis.com
- Vertex AI API (aiplatform.googleapis.com) — your agent uses Gemini models, and Toolbox uses the embedding API for vector search.
- Cloud SQL Admin API (sqladmin.googleapis.com) — you provision and manage a PostgreSQL instance.
- Compute Engine API (compute.googleapis.com) — required for creating Cloud SQL instances.
- Cloud Run, Cloud Build, Artifact Registry — used in the deployment step later in this codelab.
3. Preparing Scripts for Database Initialization
This step starts Cloud SQL instance creation and runs an automated setup script that waits for the instance to be ready, then creates the database, seeds it with job listings, and generates embeddings — all in one operation.
First, let's add the database password to your .env file and reload it:
echo "DB_PASSWORD=techjobs-pwd" >> .env
echo "DB_INSTANCE=jobs-instance" >> .env
echo "DB_NAME=jobs_db" >> .env
source .env
Creating the Bash script for instance and database creation
Then, create the scripts/setup_database.sh script with the following command
mkdir -p ~/build-agent-adk-toolbox-cloudsql/scripts
cloudshell edit scripts/setup_database.sh
Then, copy the following code into the scripts/setup_database.sh file
#!/bin/bash
set -e
source .env

echo "================================================"
echo "Database Setup"
echo "================================================"
echo ""

# Step 1: Create Cloud SQL instance
echo "[1/5] Creating Cloud SQL instance..."
# Check if instance already exists
if gcloud sql instances describe "$DB_INSTANCE" --quiet >/dev/null 2>&1; then
    echo "  Instance already exists"
else
    echo "  Creating instance (takes 5-10 minutes)..."
    gcloud sql instances create "$DB_INSTANCE" \
        --database-version=POSTGRES_17 \
        --tier=db-custom-1-3840 \
        --edition=ENTERPRISE \
        --region="$REGION" \
        --root-password="$DB_PASSWORD" \
        --enable-google-ml-integration \
        --database-flags cloudsql.enable_google_ml_integration=on \
        --quiet
fi
echo "  ✓ Instance ready"
echo ""

# Step 2: Verify instance is ready
echo "[2/5] Verifying instance state..."
STATE=$(gcloud sql instances describe "$DB_INSTANCE" --format='value(state)')
if [ "$STATE" != "RUNNABLE" ]; then
    echo "ERROR: Instance not ready (state: $STATE)"
    exit 1
fi
echo "  ✓ Instance is RUNNABLE"
echo ""

# Step 3: Grant IAM permissions
echo "[3/5] Granting Vertex AI permissions..."
SERVICE_ACCOUNT=$(gcloud sql instances describe "$DB_INSTANCE" \
    --format='value(serviceAccountEmailAddress)')
if [ -z "$SERVICE_ACCOUNT" ]; then
    echo "ERROR: Could not retrieve service account"
    exit 1
fi
gcloud projects add-iam-policy-binding "$GOOGLE_CLOUD_PROJECT" \
    --member="serviceAccount:$SERVICE_ACCOUNT" \
    --role="roles/aiplatform.user" \
    --quiet
echo "  ✓ Permissions granted"
echo ""

# Step 4: Create database
echo "[4/5] Creating database..."
# Check if database already exists
if gcloud sql databases describe "$DB_NAME" \
    --instance="$DB_INSTANCE" --quiet >/dev/null 2>&1; then
    echo "  Database already exists"
else
    gcloud sql databases create "$DB_NAME" \
        --instance="$DB_INSTANCE" \
        --quiet
fi
echo "  ✓ Database '$DB_NAME' ready"
echo ""

# Step 5: Seed database and generate embeddings
echo "[5/5] Seeding database and generating embeddings..."
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
SETUP_SCRIPT="${SCRIPT_DIR}/setup_jobs_db.py"
if [ ! -f "$SETUP_SCRIPT" ]; then
    echo "ERROR: Setup script not found: $SETUP_SCRIPT"
    exit 1
fi
uv run "$SETUP_SCRIPT"

echo ""
echo "================================================"
echo "Setup complete!"
echo "================================================"
echo ""
Creating the Python script for data seeding
After that, create the Python seeding script scripts/setup_jobs_db.py using the command below
cloudshell edit scripts/setup_jobs_db.py
Then, copy the following code into scripts/setup_jobs_db.py file
import os
import sys
import time
from pathlib import Path

from dotenv import load_dotenv
from google.cloud.sql.connector import Connector
import pg8000

# Load environment variables from .env file
env_path = Path(__file__).parent.parent / '.env'
load_dotenv(env_path)

EMBEDDING_MODEL = 'gemini-embedding-001'

# Verify required environment variables
required_vars = ['GOOGLE_CLOUD_PROJECT', 'REGION', 'DB_PASSWORD', 'DB_INSTANCE', 'DB_NAME']
missing_vars = [var for var in required_vars if not os.environ.get(var)]
if missing_vars:
    print(f"ERROR: Missing required environment variables: {', '.join(missing_vars)}", file=sys.stderr)
    print("", file=sys.stderr)
    print(f"Expected .env file location: {env_path}", file=sys.stderr)
    if not env_path.exists():
        print("✗ File not found at that location", file=sys.stderr)
    else:
        print("✓ File exists but is missing the variables above", file=sys.stderr)
    print("", file=sys.stderr)
    print("Make sure your .env file contains:", file=sys.stderr)
    for var in missing_vars:
        print(f"  {var}=<value>", file=sys.stderr)
    sys.exit(1)

# Job listings data (fictional, for tutorial purposes only)
JOBS = [
    ("Senior Backend Engineer", "Stripe", "Backend", "Go, PostgreSQL, gRPC, Kubernetes", "$180-250K/year", "San Francisco, Hybrid", 3,
     "Design and build high-throughput microservices powering payment infrastructure for millions of businesses. Optimize Go services for sub-100ms latency at scale, work with PostgreSQL and Redis for data persistence, and deploy on Kubernetes clusters handling billions of API calls."),
    ("Machine Learning Engineer", "Spotify", "Data/AI", "Python, TensorFlow, BigQuery, Vertex AI", "$170-230K/year", "Stockholm, Remote", 2,
     "Build and deploy ML models for music recommendation and personalization systems serving hundreds of millions of listeners. Design feature pipelines in BigQuery, train models using distributed computing, and serve predictions through real-time APIs processing thousands of requests per second."),
    ("Frontend Engineer", "Vercel", "Frontend", "React, TypeScript, Next.js", "$140-190K/year", "Remote", 4,
     "Build developer-facing dashboard interfaces and deployment tools used by millions of developers worldwide. Create responsive, accessible React components for project management, analytics, and real-time deployment monitoring with a focus on developer experience."),
    ("DevOps Engineer", "Datadog", "DevOps", "Terraform, GCP, Docker, Kubernetes, ArgoCD", "$160-220K/year", "New York, Hybrid", 2,
     "Manage cloud infrastructure powering an observability platform used by thousands of engineering teams. Automate deployment pipelines with ArgoCD, manage multi-cloud Kubernetes clusters, and implement infrastructure-as-code with Terraform across production environments."),
    ("Mobile Engineer (Android)", "Grab", "Mobile", "Kotlin, Jetpack Compose, GraphQL", "$120-170K/year", "Singapore, Hybrid", 3,
     "Develop features for a super-app serving millions of users across Southeast Asia. Build modern Android UIs with Jetpack Compose, integrate GraphQL APIs, and optimize app performance for diverse device capabilities and network conditions."),
    ("Data Engineer", "Airbnb", "Data", "Python, Apache Spark, Airflow, BigQuery", "$160-210K/year", "San Francisco, Hybrid", 2,
     "Build data pipelines that process booking, search, and pricing data for a global travel marketplace. Design ETL workflows with Apache Spark and Airflow, maintain data warehouses in BigQuery, and ensure data quality for analytics and machine learning teams."),
    ("Full Stack Engineer", "Revolut", "Full Stack", "TypeScript, Node.js, React, PostgreSQL", "$130-180K/year", "London, Remote", 5,
     "Build the next generation of financial products making banking accessible to millions of users across 35 countries. Develop real-time trading interfaces with React and WebSockets, build Node.js APIs handling market data streams, and design PostgreSQL schemas for financial transactions."),
    ("Site Reliability Engineer", "Cloudflare", "SRE", "Go, Prometheus, Grafana, GCP, Terraform", "$170-230K/year", "Austin, Hybrid", 2,
     "Ensure 99.99% uptime for a global network handling millions of requests per second. Define SLOs, build monitoring dashboards with Prometheus and Grafana, manage incident response, and automate infrastructure scaling across 300+ data centers worldwide."),
    ("Cloud Architect", "Google Cloud", "Cloud", "GCP, Terraform, Kubernetes, Python", "$200-280K/year", "Seattle, Hybrid", 1,
     "Help enterprises modernize their infrastructure on Google Cloud. Design multi-region architectures, lead migration projects from on-premises to GKE, and build reference implementations using Terraform and Cloud Foundation Toolkit."),
    ("Backend Engineer (Payments)", "Square", "Backend", "Java, Spring Boot, PostgreSQL, Kafka", "$160-220K/year", "San Francisco, Hybrid", 3,
     "Build payment processing systems handling millions of transactions for businesses of all sizes. Design event-driven architectures using Kafka, implement idempotent payment flows with Spring Boot, and ensure PCI-DSS compliance across all services."),
    ("AI Engineer", "Hugging Face", "Data/AI", "Python, LangChain, Vertex AI, FastAPI, PostgreSQL", "$150-210K/year", "Paris, Remote", 2,
     "Build AI-powered tools for the largest open-source ML community. Develop RAG pipelines that index and search model documentation, create conversational agents using LangChain, and deploy AI services with FastAPI on cloud infrastructure."),
    ("Platform Engineer", "Coinbase", "Platform", "Rust, Kubernetes, AWS, Terraform", "$180-250K/year", "Remote", 0,
     "Build the infrastructure platform for a leading cryptocurrency exchange. Develop high-performance matching engines in Rust, manage Kubernetes clusters for microservices, and design CI/CD pipelines that enable rapid feature deployment with zero downtime."),
    ("QA Automation Engineer", "Shopify", "QA", "Python, Selenium, Cypress, Jenkins", "$110-160K/year", "Toronto, Hybrid", 3,
     "Design and maintain automated test suites for a commerce platform powering millions of merchants. Build end-to-end test frameworks with Cypress and Selenium, integrate tests into Jenkins CI pipelines, and establish quality gates that prevent regressions in checkout and payment flows."),
    ("Security Engineer", "CrowdStrike", "Security", "Python, SIEM, Kubernetes, Penetration Testing", "$170-240K/year", "Austin, On-site", 1,
     "Protect enterprise customers from cyber threats on a leading endpoint security platform. Conduct penetration testing, design security monitoring with SIEM tools, implement zero-trust networking in Kubernetes environments, and lead incident response for security events."),
    ("Product Engineer", "GitLab", "Full Stack", "Go, React, PostgreSQL, Redis, GCP", "$140-200K/year", "Remote", 4,
     "Own features end-to-end for an all-in-one DevSecOps platform used by millions of developers. Build Go microservices for CI/CD pipelines, create React frontends for code review and project management, and collaborate with product managers to iterate on user-facing features using data-driven development."),
]


def get_connection():
    """Create a connection to Cloud SQL using the connector."""
    project = os.environ['GOOGLE_CLOUD_PROJECT']
    region = os.environ['REGION']
    password = os.environ['DB_PASSWORD']
    instance = os.environ['DB_INSTANCE']
    database = os.environ['DB_NAME']
    connector = Connector()
    conn = connector.connect(
        f"{project}:{region}:{instance}",
        "pg8000",
        user="postgres",
        password=password,
        db=database,
    )
    return conn, connector


def create_schema(cursor):
    """Create extensions and jobs table."""
    cursor.execute("CREATE EXTENSION IF NOT EXISTS google_ml_integration")
    cursor.execute("CREATE EXTENSION IF NOT EXISTS vector")
    cursor.execute("""
        CREATE TABLE IF NOT EXISTS jobs (
            id SERIAL PRIMARY KEY,
            title VARCHAR NOT NULL,
            company VARCHAR NOT NULL,
            role VARCHAR NOT NULL,
            tech_stack VARCHAR NOT NULL,
            salary_range VARCHAR NOT NULL,
            location VARCHAR NOT NULL,
            openings INTEGER NOT NULL,
            description TEXT NOT NULL,
            description_embedding vector(3072)
        )
    """)


def seed_jobs(cursor, conn):
    """Insert job listings."""
    cursor.execute("SELECT COUNT(*) FROM jobs")
    existing_count = cursor.fetchone()[0]
    if existing_count > 0:
        print(f"  {existing_count} jobs already exist, skipping seed")
        return 0
    cursor.executemany("""
        INSERT INTO jobs (title, company, role, tech_stack, salary_range, location, openings, description)
        VALUES (%s, %s, %s, %s, %s, %s, %s, %s)
    """, JOBS)
    conn.commit()
    return len(JOBS)


def generate_embeddings(cursor, conn):
    """Generate embeddings using Cloud SQL's embedding() function."""
    cursor.execute("SELECT COUNT(*) FROM jobs WHERE description_embedding IS NULL")
    null_count = cursor.fetchone()[0]
    if null_count == 0:
        print("  All jobs already have embeddings")
        return 0
    cursor.execute(f"""
        UPDATE jobs
        SET description_embedding = embedding('{EMBEDDING_MODEL}', description)::vector
        WHERE description_embedding IS NULL
    """)
    rows_updated = cursor.rowcount
    conn.commit()
    return rows_updated


def main():
    conn, connector = get_connection()
    cursor = conn.cursor()
    try:
        create_schema(cursor)
        conn.commit()
        seeded = seed_jobs(cursor, conn)
        if seeded > 0:
            print(f"  ✓ Inserted {seeded} jobs")
            # Waiting for the Vertex AI role grant to propagate
            time.sleep(60)
        embedded = generate_embeddings(cursor, conn)
        if embedded > 0:
            print(f"  ✓ Generated {embedded} embeddings")
    except Exception as e:
        print(f"ERROR: {e}", file=sys.stderr)
        sys.exit(1)
    finally:
        cursor.close()
        conn.close()
        connector.close()


if __name__ == "__main__":
    main()
Now, let's go to the next step
4. Create and Initialize the Database
Our scripts are now ready to run. We need a Python environment to execute the seeding script, so let's set that up first.
Set up the Python project
uv is a fast Python package and project manager written in Rust (see the uv documentation). This codelab uses it for its speed and simplicity in managing the Python project.
Initialize a Python project and add the required dependencies:
uv init
uv add cloud-sql-python-connector --extra pg8000
uv add python-dotenv
Note that we use the cloud-sql-python-connector Python SDK here to establish a secure connection to our database instance, authenticated with Application Default Credentials.
Execute the setup script
Now we can run the setup script in the background; its console output will be written to the logs/database_setup.log file. Run the following command. You can continue to the next section while it runs.
mkdir -p ~/build-agent-adk-toolbox-cloudsql/logs
bash scripts/setup_database.sh > logs/database_setup.log 2>&1 &
Download the Toolbox binary
We will use MCP Toolbox in this tutorial. Fortunately, it ships as a pre-built binary ready to use in a Linux environment. Since the download takes a while, let's run it in the background as well. Run the following command to download the binary; you can inspect its output log at logs/toolbox_dl.log and continue to the next section while it runs.
cd ~/build-agent-adk-toolbox-cloudsql
curl -O https://storage.googleapis.com/mcp-toolbox-for-databases/v1.0.0/linux/amd64/toolbox > logs/toolbox_dl.log 2>&1 &
Understanding the setup script scripts/setup_database.sh
Now let's walk through the setup script we just created. It does the following:
- Runs the gcloud sql instances create command with the following flags:
  - db-custom-1-3840 is the smallest dedicated-core Cloud SQL tier (1 vCPU, 3.75 GB RAM) in the ENTERPRISE edition. You can read more details here. A dedicated core is required for the Vertex AI ML integration — shared-core tiers (db-f1-micro, db-g1-small) do not support it.
  - --root-password sets the password for the default postgres user.
  - --enable-google-ml-integration enables Cloud SQL's built-in integration with Vertex AI, which lets you call embedding models directly from SQL using the embedding() function.
- Verifies that the instance is in RUNNABLE status
- Grants the Cloud SQL instance's service account permission to call Vertex AI using the gcloud projects add-iam-policy-binding command. This is required for the built-in embedding() function that we will use when seeding the database
- Creates the database
- Executes the seeding script setup_jobs_db.py
Understanding the seed script scripts/setup_jobs_db.py
Now, moving to the seeding script. It does the following:
- Initializes a connection to the database instance
- Installs two PostgreSQL extensions:
  - google_ml_integration — provides the embedding() SQL function, which calls Vertex AI embedding models directly from SQL. This is a database-level extension that makes ML functions available inside jobs_db. The instance-level flag (--enable-google-ml-integration) you set during instance creation allows the Cloud SQL VM to reach Vertex AI — the extension makes the SQL functions available within this specific database.
  - vector (pgvector) — adds the vector data type and distance operators for storing and querying embeddings.
- Creates the table. Note that the description_embedding column is vector(3072) — a pgvector column that stores 3072-dimensional vectors.
- Seeds the initial jobs data
- Generates an embedding from each description field and fills description_embedding using the built-in Vertex AI integration via the embedding() function:
  - embedding('gemini-embedding-001', description) — calls Vertex AI's Gemini embedding model directly from SQL, passing each job's description text. This function comes from the google_ml_integration extension installed earlier in this script.
  - ::vector — casts the returned float array to pgvector's vector type so it can be stored and queried with distance operators.
  - The UPDATE runs across all 15 rows, generating one 3072-dimensional embedding per job description.
This prepares the initial data that our agent will access.
5. Configure MCP Toolbox for Databases
This step introduces MCP Toolbox for Databases, configures it to connect to your Cloud SQL instance, and defines two standard SQL query tools.
What is MCP and why use Toolbox?

MCP (Model Context Protocol) is an open protocol that standardizes how AI agents discover and interact with external tools. It defines a client-server model: the agent hosts an MCP client, and tools are exposed by MCP servers. Any MCP-compatible client can use any MCP-compatible server — the agent doesn't need custom integration code for each tool.

MCP Toolbox for Databases is an open-source MCP server built specifically for database access. Without it, you would write Python functions that open database connections, manage connection pools, construct parameterized queries to prevent SQL injection, handle errors, and embed all of that code inside your agent. Every agent that needs database access repeats this work. Changing a query means redeploying the agent.
With Toolbox, you write a YAML file. Each tool maps to a parameterized SQL statement. Toolbox handles connection pooling, parameterized queries, authentication, and observability. Tools are decoupled from the agent — update a query by editing tools.yaml and restarting Toolbox, without touching agent code. The same tools work across ADK, LangGraph, LlamaIndex, or any MCP-compatible framework.
Write the tools configuration
Now, we need to create a file called tools.yaml in the Cloud Shell Editor to set up our tools configuration
cloudshell edit tools.yaml
The file uses multi-document YAML — each block separated by --- is a standalone resource. Every resource has a kind that declares what it is (sources for database connections, tools for agent-callable actions) and a type that specifies the backend (cloud-sql-postgres for the source, postgres-sql for SQL-based tools). A tool references its source by name, which is how Toolbox knows which connection pool to execute against. Environment variables use ${VAR_NAME} syntax and are resolved at startup.
Now, copy the following configuration into the tools.yaml file:
# tools.yaml
# --- Data Source ---
kind: source
name: jobs-db
type: cloud-sql-postgres
project: ${GOOGLE_CLOUD_PROJECT}
region: ${REGION}
instance: ${DB_INSTANCE}
database: ${DB_NAME}
user: postgres
password: ${DB_PASSWORD}
---
This block defines the following resource:
- Source (jobs-db) — tells Toolbox how to connect to your Cloud SQL PostgreSQL instance. The cloud-sql-postgres type uses the Cloud SQL connector internally, handling authentication and secure connections automatically. The ${GOOGLE_CLOUD_PROJECT}, ${REGION}, and ${DB_PASSWORD} placeholders are resolved from environment variables at startup.
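To make the placeholder resolution concrete, here is a minimal Python sketch of how ${VAR_NAME} syntax maps onto environment variables. This is an illustration of the idea only, not Toolbox's actual loader; the resolve_placeholders function and the example project ID are hypothetical.

```python
import os
import re

# Minimal sketch: replace every ${VAR_NAME} with the value of that
# environment variable. Toolbox does something equivalent at startup.
def resolve_placeholders(text):
    return re.sub(r"\$\{(\w+)\}", lambda m: os.environ[m.group(1)], text)

os.environ["GOOGLE_CLOUD_PROJECT"] = "my-demo-project"  # example value

line = "project: ${GOOGLE_CLOUD_PROJECT}"
print(resolve_placeholders(line))  # project: my-demo-project
```

Because resolution happens at startup, changing an environment variable requires restarting the Toolbox server before the new value takes effect.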
Next, append the following block under the --- separator in tools.yaml:
# --- Tool 1: Search jobs by role and/or tech stack ---
kind: tool
name: search-jobs
type: postgres-sql
source: jobs-db
description: >-
  Search for job listings by role category and/or tech stack.
  Use this tool when the developer wants to browse listings
  by role (e.g., Backend, Frontend, Data) or find jobs
  using a specific technology. Both parameters accept an
  empty string to match all values.
statement: |
  SELECT title, company, role, tech_stack, salary_range, location, openings
  FROM jobs
  WHERE ($1 = '' OR LOWER(role) = LOWER($1))
    AND ($2 = '' OR LOWER(tech_stack) LIKE '%' || LOWER($2) || '%')
  ORDER BY title
  LIMIT 10
parameters:
  - name: role
    type: string
    description: "The role category to filter by (e.g., 'Backend', 'Frontend', 'Data/AI', 'DevOps'). Use empty string for all roles."
  - name: tech_stack
    type: string
    description: "A technology to search for in the tech stack (partial match, e.g., 'Python', 'Kubernetes'). Use empty string for all tech stacks."
---
# --- Tool 2: Get full details for a specific job ---
kind: tool
name: get-job-details
type: postgres-sql
source: jobs-db
description: >-
  Get full details for a specific job listing including its description,
  salary range, location, and number of openings. Use this tool when the
  developer asks about a particular job by title or company.
statement: |
  SELECT title, company, role, tech_stack, salary_range, location, openings, description
  FROM jobs
  WHERE LOWER(title) LIKE '%' || LOWER($1) || '%'
     OR LOWER(company) LIKE '%' || LOWER($1) || '%'
parameters:
  - name: search_term
    type: string
    description: "The job title or company name to look up (partial match supported)."
---
These blocks define the following resources:
- Tools 1 and 2 (search-jobs, get-job-details) — standard SQL query tools. Each maps a tool name (what the agent sees) to a parameterized SQL statement (what the database executes). Parameters use $1, $2 positional placeholders. Toolbox executes these as prepared statements, which prevents SQL injection.
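To see why parameterized statements matter, here is a small self-contained sketch using Python's stdlib sqlite3 as a stand-in for PostgreSQL (sqlite3 uses ? placeholders where Postgres uses $1/$2, but the safety principle is the same). The table and values here are illustrative only.

```python
import sqlite3

# In-memory database with one job row, for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE jobs (title TEXT, role TEXT)")
conn.execute("INSERT INTO jobs VALUES ('Senior Backend Engineer', 'Backend')")

# A malicious "role" value: naive string concatenation would turn this
# into WHERE role = '' OR '1'='1' and match every row.
user_input = "' OR '1'='1"

# Parameterized query: the driver sends the value separately from the SQL
# text, so the input is treated as data, never as SQL.
rows = conn.execute("SELECT title FROM jobs WHERE role = ?", (user_input,)).fetchall()
print(rows)  # [] — the injection attempt matches nothing
```

Toolbox applies the same discipline for every tool: the LLM-supplied arguments are bound as prepared-statement parameters, never spliced into the SQL string.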
Let's continue. Append the following block under the --- separator in tools.yaml:
# --- Embedding Model ---
kind: embeddingModel
name: gemini-embedding
type: gemini
model: gemini-embedding-001
project: ${GOOGLE_CLOUD_PROJECT}
location: ${GOOGLE_CLOUD_LOCATION}
dimension: 3072
---
This block defines the following resource:
- Embedding model (gemini-embedding) — configures Toolbox to call Gemini's gemini-embedding-001 model for generating 3072-dimensional text embeddings. Toolbox uses Application Default Credentials (ADC) to authenticate — no API key is needed in Cloud Shell or Cloud Run. Note that the dimension configured here must match the vector(3072) column we created when seeding the database.
Let's continue. Append the following block under the --- separator in tools.yaml:
# --- Tool 3: Semantic search by description ---
kind: tool
name: search-jobs-by-description
type: postgres-sql
source: jobs-db
description: >-
  Find jobs that match a natural language description of what the developer
  is looking for. Use this tool when the developer describes their ideal job
  using interests, work style, career goals, or project type rather than a
  specific role or tech stack. Examples: "I want to work on AI chatbots,"
  "a remote job at a fintech startup," "something involving infrastructure
  and reliability."
statement: |
  SELECT title, company, role, tech_stack, salary_range, location, description
  FROM jobs
  WHERE description_embedding IS NOT NULL
  ORDER BY description_embedding <=> $1
  LIMIT 5
parameters:
  - name: search_query
    type: string
    description: "A natural language description of the kind of job the developer is looking for."
    embeddedBy: gemini-embedding
---
This block defines the following resource:
- Tool 3 (search-jobs-by-description) — a vector search tool. The search_query parameter has embeddedBy: gemini-embedding, which tells Toolbox to intercept the raw text, send it to the embedding model, and use the resulting vector in the SQL statement. The <=> operator is pgvector's cosine distance — smaller values mean more similar descriptions.
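To build intuition for what the <=> operator computes, here is a minimal sketch of cosine distance in plain Python, using tiny 3-dimensional vectors instead of the 3072-dimensional embeddings in the database. The vectors and function name are illustrative, not pgvector's implementation.

```python
import math

# Cosine distance = 1 - cosine similarity; this is what pgvector's <=>
# operator computes between two vectors.
def cosine_distance(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return 1.0 - dot / (norm_a * norm_b)

query = [1.0, 0.0, 1.0]    # stand-in for the embedded search query
job_a = [1.0, 0.1, 0.9]    # points in a similar direction -> small distance
job_b = [-1.0, 0.5, -0.8]  # points the opposite way -> large distance

# ORDER BY description_embedding <=> $1 sorts ascending by this value,
# so the most similar job descriptions come first.
print(cosine_distance(query, job_a) < cosine_distance(query, job_b))  # True
```

Because the ORDER BY sorts ascending on the distance, the LIMIT 5 in the statement returns the five closest matches.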
Finally, append the last tool under the --- separator in tools.yaml:
# --- Tool 4: Add a new job listing with automatic embedding ---
kind: tool
name: add-job
type: postgres-sql
source: jobs-db
description: >-
  Add a new job listing to the platform. Use this tool when a user asks
  to post a job that is not currently listed.
statement: |
  INSERT INTO jobs (title, company, role, tech_stack, salary_range, location, openings, description, description_embedding)
  VALUES ($1, $2, $3, $4, $5, $6, CAST($7 AS INTEGER), $8, $9)
  RETURNING title, company
parameters:
  - name: title
    type: string
    description: "The job title (e.g., 'Senior Backend Engineer')."
  - name: company
    type: string
    description: "The company name (e.g., 'Stripe', 'Spotify')."
  - name: role
    type: string
    description: "The role category (e.g., 'Backend', 'Frontend', 'Data/AI', 'DevOps')."
  - name: tech_stack
    type: string
    description: "Comma-separated list of technologies (e.g., 'Python, FastAPI, GCP')."
  - name: salary_range
    type: string
    description: "The salary range (e.g., '$150-200K/year')."
  - name: location
    type: string
    description: "Work location and arrangement (e.g., 'Remote')."
  - name: openings
    type: string
    description: "The number of open positions."
  - name: description
    type: string
    description: "A short description of the job (2-3 sentences)."
  - name: description_vector
    type: string
    description: "Auto-generated embedding vector for the job description."
    valueFromParam: description
    embeddedBy: gemini-embedding
This block defines the following resource:
- Tool 4 (add-job) — demonstrates vector ingestion. The description_vector parameter has two special fields:
  - valueFromParam: description — Toolbox copies the value from the description parameter into this one. The LLM never sees this parameter.
  - embeddedBy: gemini-embedding — Toolbox embeds the copied text into a vector before passing it to the SQL.
The result: one tool call stores both the raw description text and its vector embedding, without the agent knowing anything about embeddings.
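The data flow above can be sketched in plain Python. This is an illustration of the idea only, not Toolbox's internals: fake_embed and prepare_add_job_params are hypothetical names, and real embeddings are 3072 floats from Gemini rather than the toy vector faked here.

```python
# Hypothetical stand-in for the real Gemini embedding call.
def fake_embed(text):
    # Real embeddings are 3072 floats; fake a tiny deterministic vector.
    return [float(len(text)), float(sum(map(ord, text)) % 100)]

def prepare_add_job_params(llm_args):
    """Expand the LLM-visible arguments into the full SQL parameter set."""
    params = dict(llm_args)
    # valueFromParam: description -> copy the text the LLM already provided
    copied_text = params["description"]
    # embeddedBy: gemini-embedding -> turn the copied text into a vector
    params["description_vector"] = fake_embed(copied_text)
    return params

# The agent only supplies the human-readable fields...
llm_args = {"title": "AI Engineer", "description": "Build RAG pipelines."}
full_params = prepare_add_job_params(llm_args)

# ...but the INSERT receives both the text and its embedding.
print(sorted(full_params))  # ['description', 'description_vector', 'title']
```

The key design point: the embedding step lives entirely on the Toolbox side of the boundary, so the LLM's tool schema stays small and the agent never handles raw vectors.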
The multi-document YAML format separates each resource with ---. Each document has kind, name, and type fields that define what it is. In summary, we have now configured all of the following:
- The source database
- Two tools (tools 1 and 2) that query the database with standard filters
- The embedding model
- A vector search tool (tool 3)
- A vector ingestion tool (tool 4)
6. Running the MCP Toolbox Server
In the previous step, we set up the necessary configuration for MCP Toolbox. Now we are ready to run the server.
Verify the seeded data
Before starting Toolbox, let's confirm the database setup has completed. Create a Python script scripts/verify_seed.py using the following command
cloudshell edit scripts/verify_seed.py
Then, copy the following code into scripts/verify_seed.py file
#!/usr/bin/env python3
"""Verify the database has 15 jobs with embeddings."""
import os
import sys
from pathlib import Path

from dotenv import load_dotenv
from google.cloud.sql.connector import Connector
import pg8000

# Load environment variables
env_path = Path(__file__).parent.parent / '.env'
load_dotenv(env_path)

# Verify required environment variables
required_vars = ['GOOGLE_CLOUD_PROJECT', 'REGION', 'DB_PASSWORD', 'DB_INSTANCE', 'DB_NAME']
missing_vars = [var for var in required_vars if not os.environ.get(var)]
if missing_vars:
    print(f"ERROR: Missing environment variables: {', '.join(missing_vars)}", file=sys.stderr)
    sys.exit(1)


def verify_database():
    """Check that 15 jobs exist with embeddings."""
    connector = Connector()
    try:
        project = os.environ['GOOGLE_CLOUD_PROJECT']
        region = os.environ['REGION']
        password = os.environ['DB_PASSWORD']
        instance = os.environ['DB_INSTANCE']
        database = os.environ['DB_NAME']
        conn = connector.connect(
            f"{project}:{region}:{instance}",
            "pg8000",
            user="postgres",
            password=password,
            db=database,
        )
        cursor = conn.cursor()

        # Count jobs and embeddings
        cursor.execute("SELECT COUNT(*) FROM jobs")
        job_count = cursor.fetchone()[0]
        cursor.execute("SELECT COUNT(*) FROM jobs WHERE description_embedding IS NOT NULL")
        embedding_count = cursor.fetchone()[0]

        print(f"Jobs: {job_count}/15")
        print(f"Embeddings: {embedding_count}/15")

        cursor.close()
        conn.close()

        if job_count == 15 and embedding_count == 15:
            print("\n✓ Database ready!")
            return True
        else:
            print("\n✗ Database not ready")
            return False
    except Exception as e:
        print(f"\nERROR: {e}", file=sys.stderr)
        return False
    finally:
        connector.close()


if __name__ == "__main__":
    success = verify_database()
    sys.exit(0 if success else 1)
This script checks the number of seeded job posts and their embeddings. Run it using the following command:
uv run scripts/verify_seed.py
If you see the following terminal output, the data is ready:
Jobs: 15/15
Embeddings: 15/15

✓ Database ready!
Start the Toolbox server
In the setup step earlier, we downloaded the toolbox executable. Verify that the binary exists; the following command downloads it if it is missing:
cd ~/build-agent-adk-toolbox-cloudsql
if [ ! -f toolbox ]; then
curl -O https://storage.googleapis.com/mcp-toolbox-for-databases/v1.0.0/linux/amd64/toolbox
fi
chmod +x toolbox
The Toolbox process needs the variables from our .env file, so export them into the shell environment first. Then start the Toolbox server and log its console output to the logs/mcp_toolbox.log file:
set -a; source .env; set +a
./toolbox --config tools.yaml --enable-api > logs/mcp_toolbox.log 2>&1 &
You should see output in the logs/mcp_toolbox.log file confirming the server is ready, as shown below:
... INFO "Initialized 1 sources: jobs-db"
... INFO "Initialized 0 authServices: "
... INFO "Using Vertex AI backend for Gemini embedding"
... INFO "Initialized 1 embeddingModels: gemini-embedding"
... INFO "Initialized 4 tools: add-job, search-jobs, get-job-details, search-jobs-by-description"
...
... INFO "Server ready to serve!"
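If you want to check readiness from a script rather than by eye, a small helper can scan the log file for the lines shown above. This is a sketch based on this codelab's log format; the marker strings are assumptions taken from the sample output, not a documented Toolbox contract:

```python
def toolbox_ready(log_text: str) -> bool:
    """Return True once the Toolbox log reports the server is ready."""
    return 'Server ready to serve!' in log_text

def tools_initialized(log_text: str) -> list[str]:
    """Extract tool names from the 'Initialized N tools:' log line."""
    for line in log_text.splitlines():
        if "Initialized" in line and "tools:" in line:
            names = line.split("tools:", 1)[1].strip().strip('"')
            return [n.strip() for n in names.split(",") if n.strip()]
    return []

# In practice you would read logs/mcp_toolbox.log; a sample stands in here.
sample = ('INFO "Initialized 4 tools: add-job, search-jobs, '
          'get-job-details, search-jobs-by-description"\n'
          'INFO "Server ready to serve!"')
print(toolbox_ready(sample), tools_initialized(sample))
```

Polling a file like this is handy in CI or startup scripts where you need to wait for the background Toolbox process before running the agent.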
Verify the tools
Query the Toolbox API to list all registered tools:
curl -s http://localhost:5000/api/toolset | uv run -m json.tool
You should see tools with their descriptions and parameters, as shown below:
...
"search-jobs-by-description": {
"description": "Find jobs that match a natural language description of what the developer is looking for. Use this tool when the developer describes their ideal job using interests, work style, career goals, or project type rather than a specific role or tech stack. Examples: \"I want to work on AI chatbots,\" \"a remote job at a fintech startup,\" \"something involving infrastructure and reliability.\"",
"parameters": [
{
"name": "search_query",
"type": "string",
"required": true,
"description": "A natural language description of the kind of job the developer is looking for.",
"authSources": []
}
],
"authRequired": []
}
...
Test the search-jobs tool directly:
curl -s -X POST http://localhost:5000/api/tool/search-jobs/invoke \
-H "Content-Type: application/json" \
-d '{"role": "Backend", "tech_stack": ""}' | jq '.result | fromjson'
The response should contain the two backend engineering jobs from your seed data.
[
{
"title": "Backend Engineer (Payments)",
"company": "Square",
"role": "Backend",
"tech_stack": "Java, Spring Boot, PostgreSQL, Kafka",
"salary_range": "$160-220K/year",
"location": "San Francisco, Hybrid",
"openings": 3
},
{
"title": "Senior Backend Engineer",
"company": "Stripe",
"role": "Backend",
"tech_stack": "Go, PostgreSQL, gRPC, Kubernetes",
"salary_range": "$180-250K/year",
"location": "San Francisco, Hybrid",
"openings": 3
}
]
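Note the `.result | fromjson` filter in the curl pipeline above: the invoke response carries the query rows as a JSON string inside the "result" field, so the payload must be decoded twice. The sketch below shows the equivalent parsing in Python; the response shape here is an assumption inferred from that jq filter, with made-up sample data:

```python
import json

# Simulate the envelope returned by /api/tool/<name>/invoke:
# the rows are a JSON-encoded *string* inside the "result" field.
raw_response = json.dumps({
    "result": json.dumps([
        {"title": "Senior Backend Engineer", "company": "Stripe", "role": "Backend"}
    ])
})

outer = json.loads(raw_response)      # first decode: the envelope
rows = json.loads(outer["result"])    # second decode: the row array
print(rows[0]["company"])             # prints Stripe
```

The same two-step decode applies to any client consuming these invoke responses, which is why the jq one-liner needs fromjson rather than plain `.result`.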
7. Build the ADK Agent
We will use ADK in Python for this project. Let's add the required dependencies:
uv add google-adk==1.29.0 toolbox-adk==1.0.0
- google-adk — Google's Agent Development Kit, including the Gemini SDK.
- toolbox-adk — the ADK integration for MCP Toolbox for Databases.
Create the agent directory structure
ADK expects a specific folder layout: a directory named after your agent containing __init__.py, agent.py, and .env. ADK includes a built-in command to quickly scaffold this structure:
uv run adk create jobs_agent \
--model gemini-2.5-flash \
--project ${GOOGLE_CLOUD_PROJECT} \
--region ${GOOGLE_CLOUD_LOCATION}
Your directory should now look like this:
build-agent-adk-toolbox-cloudsql/
├── jobs_agent/
│   ├── __init__.py
│   ├── agent.py
│   └── .env
├── logs
├── scripts
└── ...
Next, we will connect the ADK agent to the running Toolbox server and test all four tools: standard queries, semantic search, and vector ingestion. The agent code is minimal; all database logic lives in tools.yaml.
Configure the agent's environment
ADK reads GOOGLE_GENAI_USE_VERTEXAI, GOOGLE_CLOUD_PROJECT, and GOOGLE_CLOUD_LOCATION from the shell environment, which you already set in the earlier step. The only agent-specific variable is TOOLBOX_URL — append it to the agent's .env file:
echo -e "\nTOOLBOX_URL=http://127.0.0.1:5000" >> jobs_agent/.env
Update the agent module
Open jobs_agent/agent.py in the Cloud Shell Editor
cloudshell edit jobs_agent/agent.py
and overwrite the content with the following code:
# jobs_agent/agent.py
import os
from google.adk.agents import LlmAgent
from toolbox_adk import ToolboxToolset
TOOLBOX_URL = os.environ.get("TOOLBOX_URL", "http://127.0.0.1:5000")
toolbox = ToolboxToolset(TOOLBOX_URL)
root_agent = LlmAgent(
name="jobs_agent",
model="gemini-2.5-flash",
instruction="""You are a helpful assistant at "TechJobs," a tech job listing platform.
Your job:
- Help developers browse job listings by role or tech stack.
- Provide full details about specific positions, including salary range and number of openings.
- Recommend jobs based on natural language descriptions of what the developer is looking for.
- Add new job listings to the platform when asked.
When a developer asks about a specific job by title or company, use the get-job-details tool.
When a developer asks for a specific role category or tech stack, use the search-jobs tool.
When a developer describes what kind of job they want — by interest area, work style,
career goals, or project type — use the search-jobs-by-description tool for semantic search.
When in doubt between search-jobs and search-jobs-by-description, prefer
search-jobs-by-description — it searches job descriptions and finds more relevant matches.
If a position has no openings (openings is 0), let the developer know
and suggest similar alternatives from the search results.
Be conversational, knowledgeable, and concise.""",
tools=[toolbox],
)
Notice that there is no database code in here — ToolboxToolset connects to the Toolbox server at startup and loads all available tools. The agent calls tools by name; Toolbox translates those calls into SQL queries against Cloud SQL.
The TOOLBOX_URL environment variable defaults to http://127.0.0.1:5000 for local development. When you deploy to Cloud Run later, you override this with the Toolbox service's Cloud Run URL — no code changes needed.
Test the agent
Start the ADK dev UI:
cd ~/build-agent-adk-toolbox-cloudsql
uv run adk web --allow_origins "regex:https://.*\.cloudshell\.dev"
Open the URL shown in the terminal (typically http://localhost:8000) using Cloud Shell's Web Preview feature, or Ctrl+click the URL in the terminal. Select jobs_agent from the agent dropdown in the top-left corner.
Test standard queries
Try these prompts to verify the standard SQL tools:
What backend engineering jobs do you have?
Any jobs using Kubernetes?
Tell me about the Cloud Architect position

Test semantic search
Try natural language descriptions that don't map to a specific role or tech stack:
I want a remote job where I can work on AI and machine learning
Find me something in fintech with good work-life balance
I'm interested in infrastructure and reliability engineering
The agent picks a tool based on the query type: structured filters go through search-jobs, while natural language descriptions go through search-jobs-by-description.

Test vector ingestion
Ask the agent to add a new job:
Add a new job: 'Robotics Software Engineer' at Boston Dynamics, role Robotics, tech stack: Python, C++, ROS, Computer Vision, salary $160-230K/year, location Waltham MA, Hybrid, 2 openings. Description: Design and implement autonomous navigation and manipulation algorithms for next-generation robots. Work on perception pipelines using computer vision and lidar, develop motion planning software in C++ and Python, and test systems on real hardware in warehouse and logistics environments.

Now try to search for it:
Find me jobs involving autonomous systems and working with physical hardware
The embedding was generated automatically during the INSERT — no separate step needed.

You now have a fully working agentic RAG application built with ADK, MCP Toolbox, and Cloud SQL. Congratulations! Let's take it a step further and deploy both apps to Cloud Run.
Before proceeding, stop the dev UI by pressing Ctrl+C twice in the terminal.
8. Deploy to Cloud Run
The agent and Toolbox work locally. This step deploys both as Cloud Run services so they're accessible over the internet. The Toolbox service runs as an MCP server on Cloud Run, and the agent service connects to it.
Prepare the Toolbox for deployment
Create a deployment directory for the Toolbox service:
cd ~/build-agent-adk-toolbox-cloudsql
mkdir -p deploy-toolbox
cp toolbox tools.yaml deploy-toolbox/
Create the Dockerfile for the Toolbox. Open deploy-toolbox/Dockerfile in the Cloud Shell Editor:
cloudshell edit deploy-toolbox/Dockerfile
And copy the following script to it
# deploy-toolbox/Dockerfile
FROM debian:bookworm-slim
RUN apt-get update && apt-get install -y ca-certificates && rm -rf /var/lib/apt/lists/*
WORKDIR /app
COPY toolbox tools.yaml ./
RUN chmod +x toolbox
EXPOSE 8080
CMD ["./toolbox", "--config", "tools.yaml", "--enable-api", "--address", "0.0.0.0", "--port", "8080"]
The Toolbox binary and tools.yaml are packaged into a minimal Debian image. Cloud Run routes traffic to port 8080.
Deploy the Toolbox service
cd ~/build-agent-adk-toolbox-cloudsql
gcloud run deploy toolbox-service \
--source deploy-toolbox/ \
--region $REGION \
--set-env-vars "DB_PASSWORD=$DB_PASSWORD,DB_INSTANCE=$DB_INSTANCE,DB_NAME=$DB_NAME,GOOGLE_CLOUD_PROJECT=$GOOGLE_CLOUD_PROJECT,REGION=$REGION,GOOGLE_CLOUD_LOCATION=$GOOGLE_CLOUD_LOCATION" \
--allow-unauthenticated \
--quiet > logs/deploy_toolbox.log 2>&1 &
This command submits the source to Cloud Build, builds a container image, pushes it to Artifact Registry, and deploys it to Cloud Run. It will take a few minutes; you can follow the deployment log in the logs/deploy_toolbox.log file.
Prepare the agent for deployment
While the Toolbox builds, set up the agent's deployment files.
Create a Dockerfile in the project root. Open Dockerfile in the Cloud Shell Editor:
cloudshell edit Dockerfile
Then, copy the following content
# Dockerfile
FROM ghcr.io/astral-sh/uv:python3.12-trixie-slim
WORKDIR /app
COPY pyproject.toml ./
COPY uv.lock ./
RUN uv sync --no-dev
COPY jobs_agent/ jobs_agent/
EXPOSE 8080
CMD ["uv", "run", "adk", "web", "--host", "0.0.0.0", "--port", "8080"]
This Dockerfile uses ghcr.io/astral-sh/uv as the base image, which includes both Python and uv pre-installed — no need to install uv separately via pip.
Create a .dockerignore file to exclude unnecessary files from the container image:
cloudshell edit .dockerignore
Then copy the following script into it
# .dockerignore
.venv/
__pycache__/
*.pyc
.env
jobs_agent/.env
toolbox
tools.yaml
seed.sql
deploy-toolbox/
Deploy the agent service
Wait for the Toolbox deployment to complete, checking logs/deploy_toolbox.log to verify the process. Then, retrieve its Cloud Run URL using the following command:
TOOLBOX_URL=$(gcloud run services describe toolbox-service \
--region=$REGION \
--format='value(status.url)')
echo "Toolbox URL: $TOOLBOX_URL"
You should see output similar to this:
Toolbox URL: https://toolbox-service-xxxxxx-xx.a.run.app
Then, let's verify the deployed Toolbox is working:
curl -s "$TOOLBOX_URL/api/toolset" | python3 -m json.tool | head -5
If the output looks like this example, the deployment succeeded:
{
"serverVersion": "1.0.0+binary.linux.amd64.c5524d3",
"tools": {
"add-job": {
"description": "Add a new job listing to the platform. Use this tool when a user asks to post a job that is not currently listed.",
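For a stricter check than eyeballing the first few lines, you can parse the /api/toolset response and confirm all four tools are registered. This is a sketch; the response dictionary below is a trimmed stand-in for what curl returns, and in practice you would feed it the real response body:

```python
import json

expected = {"add-job", "search-jobs", "get-job-details",
            "search-jobs-by-description"}

# Stand-in for the body returned by GET $TOOLBOX_URL/api/toolset.
response_text = json.dumps({
    "serverVersion": "1.0.0",
    "tools": {name: {"description": "..."} for name in expected},
})

registered = set(json.loads(response_text)["tools"])
missing = expected - registered
print("all tools registered" if not missing else f"missing: {missing}")
```

If any tool name is missing, the usual culprit is a typo in tools.yaml or a source that failed to initialize; check the service logs in Cloud Run before deploying the agent.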
Next, let's deploy the agent, passing the Toolbox URL as an environment variable:
cd ~/build-agent-adk-toolbox-cloudsql
gcloud run deploy jobs-agent \
--source . \
--region $REGION \
--set-env-vars "TOOLBOX_URL=$TOOLBOX_URL,GOOGLE_CLOUD_PROJECT=$GOOGLE_CLOUD_PROJECT,GOOGLE_CLOUD_LOCATION=$GOOGLE_CLOUD_LOCATION,GOOGLE_GENAI_USE_VERTEXAI=TRUE" \
--allow-unauthenticated \
--quiet
The agent code reads TOOLBOX_URL from the environment (you set this up previously). Locally it points to http://127.0.0.1:5000; on Cloud Run it points to the Toolbox service URL. No code changes needed.
Test the deployed agent
Retrieve the agent's Cloud Run URL:
AGENT_URL=$(gcloud run services describe jobs-agent \
--region=$REGION \
--format='value(status.url)')
echo "Agent URL: $AGENT_URL"
Open the URL in your browser. The ADK dev UI loads — the same interface you've been using locally, now running on Cloud Run.
Select jobs_agent from the dropdown and test:
What backend engineering jobs do you have?
I want a remote job working on AI and machine learning
Both queries work through the deployed services: the agent on Cloud Run calls the Toolbox on Cloud Run, which queries Cloud SQL.
9. Congratulations / Clean Up
You've built and deployed a smart job board assistant that uses MCP Toolbox for Databases to bridge an ADK agent and Cloud SQL PostgreSQL — with both standard SQL queries and semantic vector search.
What you've learned
- How MCP standardizes tool access for AI agents, and how MCP Toolbox for Databases applies this specifically to database operations — replacing custom database code with declarative YAML configuration
- How to configure Cloud SQL PostgreSQL as a Toolbox data source using the cloud-sql-postgres source type
- How to define standard SQL query tools with parameterized statements that prevent SQL injection
- How to enable vector search using pgvector and gemini-embedding-001, with the embeddedBy parameter for automatic query embedding
- How valueFromParam enables automatic vector ingestion — the LLM provides a text description, and Toolbox silently copies, embeds, and stores the vector alongside the text
- How ADK's ToolboxToolset loads tools from a running Toolbox server, keeping agent code minimal and database logic fully decoupled
- How to deploy both the Toolbox MCP server and the ADK agent to Cloud Run as separate services
Clean up
To avoid incurring charges to your Google Cloud account for the resources created in this codelab, you can either delete the individual resources or delete the entire project.
Option 1: Delete the project (recommended)
The easiest way to clean up is to delete the project. This removes all resources associated with the project.
gcloud projects delete $GOOGLE_CLOUD_PROJECT
Option 2: Delete individual resources
If you want to keep the project but remove only the resources created in this codelab:
gcloud run services delete jobs-agent --region=$REGION --quiet
gcloud run services delete toolbox-service --region=$REGION --quiet
gcloud sql instances delete jobs-instance --quiet
gcloud artifacts repositories delete cloud-run-source-deploy --location=$REGION --quiet 2>/dev/null
