Agents CLI in Agent Platform: From Development to Production

1. Overview

What you'll learn

In this Codelab, you'll build, test, and deploy a production-ready AI agent using Agents CLI and the Agent Development Kit (ADK). You'll go from installing the tools to having a live agent running on Google Cloud's Agent Runtime in under an hour.

What you'll build

A customer support AI agent that can:

  • Answer questions using natural language
  • Call custom tools (weather and time lookups)
  • Be tested locally with instant feedback
  • Be evaluated automatically for quality
  • Run in production on Google Cloud

Two ways to use Agents CLI

Agents CLI supports two workflows:

🤖 With a Coding Agent (Recommended for beginners)

Let AI guide you! Install skills into Gemini CLI, Antigravity, Claude Code, Cursor, or any other supported coding agent — they'll help you build the agent step-by-step.

👤 Manual Mode (For developers who prefer direct control)

Run commands yourself from your terminal. You'll type each command and see exactly what happens.

Throughout this Codelab, we'll show both approaches. Choose whichever fits your style!

What you need

Required:

  • Python 3.11 or higher
  • uv package manager
  • Node.js 18+ (for coding agent skills)
  • Google Cloud project with billing enabled
  • Google Cloud SDK installed

Optional for local-only development:

Prerequisites

This Codelab assumes you're comfortable with:

  • Using a terminal/command line
  • Basic Python concepts
  • Google Cloud Console basics

No prior experience with AI agents or ADK required!

2. Before you begin

Set up Google Cloud

Create or select a project

  1. Go to the Google Cloud Console
  2. Create a new project or select an existing one
  3. Note your Project ID - you'll need it later

Enable required APIs

Run these commands in your terminal (or use Cloud Console):

gcloud services enable aiplatform.googleapis.com \
  run.googleapis.com \
  cloudtrace.googleapis.com \
  cloudbuild.googleapis.com

This enables:

  • Agent Platform - For the Gemini model and Agent Runtime
  • Cloud Run - Alternative deployment option
  • Cloud Trace - Observability and monitoring
  • Cloud Build - Build automation

Authenticate

gcloud auth login
gcloud auth application-default login

Set environment variables

export GOOGLE_CLOUD_PROJECT=YOUR_PROJECT_ID
export GOOGLE_CLOUD_LOCATION=us-central1

Replace YOUR_PROJECT_ID with your actual project ID.
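
Before moving on, it can help to confirm that both variables are actually visible to the processes that read them. The following is a minimal sketch rather than anything Agents CLI provides; the helper name missing_env_vars is hypothetical.

```python
import os

# The two variables exported in the step above.
REQUIRED_VARS = ("GOOGLE_CLOUD_PROJECT", "GOOGLE_CLOUD_LOCATION")


def missing_env_vars(env=None):
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("Environment looks ready.")
```

Run it in the same shell where you exported the variables; a new terminal window will not inherit them.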

3. Install Agents CLI

🤖 With a Coding Agent

If you're using Gemini CLI, Antigravity, Claude Code, Cursor, or any other supported coding agent:

uvx google-agents-cli setup

This installs:

  • The Agents CLI tool globally
  • 7 skills that any supported coding agent on your machine can use to help you build agents (skills are installed once and discovered by every agent that supports them)

Expected output (trimmed):

 █▀█ █▀▀ █▀▀ █▄ █ ▀█▀ █▀   █▀▀ █  █
 █▀█ █▄█ ██▄ █ ▀█  █  ▄█   █▄▄ █▄ █

 Your coding agent just got an upgrade.

 1. Authentication
 ─────────────────
   ✓ Authenticated with Google Cloud

 2. CLI Installation
 ───────────────────
   ▸ uv tool install google-agents-cli
   ✓ Installed google-agents-cli

 3. Skills Installation
 ──────────────────────
   ▸ npx -y skills add https://github.com/google/agents-cli -y --all -g

   ◇  Found 7 skills
   ~/.agents/skills/google-agents-cli-adk-code
   ~/.agents/skills/google-agents-cli-deploy
   ~/.agents/skills/google-agents-cli-eval
   ~/.agents/skills/google-agents-cli-observability
   ~/.agents/skills/google-agents-cli-publish
   ~/.agents/skills/google-agents-cli-scaffold
   ~/.agents/skills/google-agents-cli-workflow

Skills are installed by the skills installer (npx -y skills@latest) into ~/.agents/skills/ and picked up automatically by every supported coding agent on your machine.

👤 Manual Mode

If you prefer to run commands directly:

uv tool install google-agents-cli

Verify installation:

agents-cli --version

Expected output:

agents-cli, version 0.1.2

4. Create your agent project

🤖 With a Coding Agent

Ask your coding agent:

"Create a new ADK agent project called customer-support-agent using the prototype template"

Your agent will run the scaffold command and create the project for you.

👤 Manual Mode

Use quick mode to create a consistent project that matches this Codelab:

agents-cli scaffold create customer-support-agent --prototype --yes

This creates a basic ADK agent project instantly with all the code needed for this Codelab.

Expected output:

Agents CLI v0.1.2
> Verifying GCP credentials...
> ✓ Connected to project: YOUR_PROJECT_ID

✅ Success! Your agent project is ready.

📖 Documentation
   README:    cat customer-support-agent/README.md

💡 Tip
   Add a deployment target later with: agents-cli scaffold enhance

🚀 Get Started
   cd customer-support-agent && agents-cli install && agents-cli playground

Alternative: Interactive mode

If you want to explore other agent types, run without flags:

agents-cli scaffold create customer-support-agent

You'll see options for:

  • adk - Simple ReAct agent (choose this for the Codelab)
  • adk_a2a - Agent-to-agent communication
  • agentic_rag - RAG-based document Q&A

What was created?

customer-support-agent/
├── app/
│   ├── agent.py              # Your agent code (main file)
│   ├── fast_api_app.py       # Development server (replaced when you add a deployment target)
│   └── app_utils/            # Utilities (telemetry, etc.)
├── tests/
│   ├── unit/                 # Unit tests
│   ├── integration/          # Integration tests
│   └── eval/                 # Evalsets and rubric config for `adk eval`
├── Dockerfile                # Container image (removed when you switch to Agent Runtime)
├── GEMINI.md                 # Coding-agent context for this project
├── pyproject.toml            # Project config & dependencies
├── README.md                 # Project documentation
└── .gitignore

5. Explore the agent code

cd customer-support-agent

Examine the agent

Open app/agent.py - this is where your agent is defined:

import datetime
from zoneinfo import ZoneInfo

from google.adk.agents import Agent
from google.adk.apps import App
from google.adk.models import Gemini
from google.genai import types

import os
import google.auth

_, project_id = google.auth.default()
os.environ["GOOGLE_CLOUD_PROJECT"] = project_id
os.environ["GOOGLE_CLOUD_LOCATION"] = "global"
os.environ["GOOGLE_GENAI_USE_VERTEXAI"] = "True"


def get_weather(query: str) -> str:
    """Simulates a web search. Use it to get information on weather.

    Args:
        query: A string containing the location to get weather information for.

    Returns:
        A string with the simulated weather information for the queried location.
    """
    if "sf" in query.lower() or "san francisco" in query.lower():
        return "It's 60 degrees and foggy."
    return "It's 90 degrees and sunny."


def get_current_time(query: str) -> str:
    """Simulates getting the current time for a city.

    Args:
        query: A string containing the city to get the current time for.

    Returns:
        A string with the current time information.
    """
    if "sf" in query.lower() or "san francisco" in query.lower():
        tz_identifier = "America/Los_Angeles"
    else:
        return f"Sorry, I don't have timezone information for query: {query}."

    tz = ZoneInfo(tz_identifier)
    now = datetime.datetime.now(tz)
    return f"The current time for query {query} is {now.strftime('%Y-%m-%d %H:%M:%S %Z%z')}"


root_agent = Agent(
    name="root_agent",
    model=Gemini(
        model="gemini-flash-latest",
        retry_options=types.HttpRetryOptions(attempts=3),
    ),
    instruction="You are a helpful AI assistant designed to provide accurate and useful information.",
    tools=[get_weather, get_current_time],
)

app = App(
    root_agent=root_agent,
    name="app",
)

Key concepts

Tools: Python functions your agent can call

  • get_weather(query) - Returns simulated weather (foggy 60°F for SF, sunny 90°F otherwise)
  • get_current_time(query) - Returns current time (San Francisco only in this stub)
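
Because get_current_time only recognizes San Francisco, one natural extension is a small lookup table of cities. The sketch below is an assumption about how you might do that, not code the scaffold generates: CITY_TIMEZONES, resolve_timezone, and get_current_time_multi are hypothetical names, and the city list is purely illustrative.

```python
import datetime
from typing import Optional
from zoneinfo import ZoneInfo

# Hypothetical lookup table; extend it with the cities you need.
CITY_TIMEZONES = {
    "san francisco": "America/Los_Angeles",
    "sf": "America/Los_Angeles",
    "tokyo": "Asia/Tokyo",
    "paris": "Europe/Paris",
}


def resolve_timezone(query: str) -> Optional[str]:
    """Return the IANA timezone of the first known city found in the query."""
    text = query.lower()
    for city, tz_identifier in CITY_TIMEZONES.items():
        if city in text:  # same substring matching as the scaffolded stub
            return tz_identifier
    return None


def get_current_time_multi(query: str) -> str:
    """Returns the current time for a known city mentioned in the query.

    Args:
        query: A string containing the city to get the current time for.

    Returns:
        A string with the current time, or an apology if the city is unknown.
    """
    tz_identifier = resolve_timezone(query)
    if tz_identifier is None:
        return f"Sorry, I don't have timezone information for query: {query}."
    now = datetime.datetime.now(ZoneInfo(tz_identifier))
    return f"The current time for query {query} is {now.strftime('%Y-%m-%d %H:%M:%S %Z%z')}"
```

To try it, you would list the new function in the agent's tools argument in place of get_current_time. The docstring matters: the model reads it to decide when to call the tool.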

Model: gemini-flash-latest is an alias that auto-tracks the latest stable Gemini Flash release, so this code keeps working as Google ships new versions. To pin a specific model, swap in something like gemini-2.5-flash (stable) or gemini-3-flash-preview (preview).

App / root_agent: ADK projects expose a top-level App wrapping a root_agent. The playground, adk eval, and Agent Runtime all discover the agent through this app object.

Location = "global": Some Gemini preview models are only served from the global endpoint, which is why the scaffold sets it explicitly.

Instruction: System prompt that shapes the agent's behavior.

6. Test locally with the playground

The playground provides an interactive chat interface for testing.

🤖 With a Coding Agent

Ask your agent:

"Start the playground for my agent"

👤 Manual Mode

Install dependencies first

agents-cli install

This runs uv sync under the hood, resolving and installing the agent's dependencies into a local .venv.

Start the playground

agents-cli playground

Expected output:

╭──────────────────────────────────────────────────────────────────────────────╮
│ Starting your agent playground...                                            │
│                                                                              │
│ Will be available at:  http://127.0.0.1:8080/dev-ui/?app=app                 │
╰──────────────────────────────────────────────────────────────────────────────╯
  ▸ uv run adk web . --host 127.0.0.1 --port 8080 --reload_agents
INFO:     Started server process
INFO:     Application startup complete.
INFO:     Uvicorn running on http://127.0.0.1:8080

Try it out

  1. Open http://127.0.0.1:8080/dev-ui/?app=app in your browser
  2. Try these prompts:
    • "What's the weather in San Francisco?"
    • "What's the weather in Tokyo?"
    • "What time is it in San Francisco?"

Watch how the agent:

  • Calls the get_weather tool
  • Calls the get_current_time tool
  • Combines results into natural responses

7. Run from the command line

Test your agent without opening a browser.

🤖 With a Coding Agent

Ask your agent:

"Run the agent with the query 'What's the weather in Paris?'"

👤 Manual Mode

agents-cli run "What's the weather in San Francisco?"

Expected output:

Using project root directory: /path/to/customer-support-agent
Local server started on port 18080 (PID 30008)
  Stop with: agents-cli run --stop-server
[user]: What's the weather in San Francisco?
[root_agent]:
[tool_call: get_weather({"query": "San Francisco"})]
[tool_response: get_weather -> {"result": "It's 60 degrees and foggy."}]The weather in San Francisco is 60 degrees and foggy.

Session: fb30f7f7-147e-4697-8aaa-706d604589fa (resume with --session-id)

Behind the scenes, run boots a background adk api_server (kept warm for ~30 minutes) so subsequent calls are fast. Stop it explicitly with agents-cli run --stop-server. Resume a multi-turn conversation with --session-id <SESSION_ID>, using the ID printed on the Session line.

The run command is perfect for:

  • Quick testing during development
  • Scripting and automation
  • CI/CD pipelines
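
For scripting, one option is to wrap the command with Python's subprocess module. This is a sketch under assumptions: it uses only the --session-id flag mentioned above and the "Session: ..." line from the sample output, whose exact format may differ between versions. build_run_command, extract_session_id, and run_agent are hypothetical helpers, not part of Agents CLI.

```python
import re
import subprocess
from typing import List, Optional

# Matches the "Session: <id> (resume with --session-id)" line in the sample output.
SESSION_LINE = re.compile(r"^Session:\s+([0-9a-fA-F-]+)", re.MULTILINE)


def build_run_command(query: str, session_id: Optional[str] = None) -> List[str]:
    """Build the argv list for `agents-cli run`, options first, query last."""
    command = ["agents-cli", "run"]
    if session_id:
        command += ["--session-id", session_id]
    command.append(query)
    return command


def extract_session_id(output: str) -> Optional[str]:
    """Pull the session ID out of captured output, or None if absent."""
    match = SESSION_LINE.search(output)
    return match.group(1) if match else None


def run_agent(query: str, session_id: Optional[str] = None) -> str:
    """Run the agent and return its stdout (requires agents-cli on PATH)."""
    result = subprocess.run(
        build_run_command(query, session_id),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout
```

A two-turn script would call run_agent once, pass the output to extract_session_id, and feed the result into the second call.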

8. Evaluate your agent

ADK evaluations validate two independent things:

  1. Tool trajectory — did the agent call the right tools with the right arguments? Deterministic, exact-match.
  2. Response quality — is the final answer relevant, helpful, and grounded in the tool outputs? Scored by an LLM acting as a judge.

You need both. Rubric scoring alone could pass a hallucinated answer that happens to read well; trajectory alone can't tell whether the user got a useful reply. The codelab evalset exercises both.

These are driven by two files:

  • tests/eval/evalsets/basic.evalset.json — the conversations to replay, plus the expected tool calls
  • tests/eval/eval_config.json — which metrics to score, their thresholds, and the rubric for the LLM judge

Edit the evalset to include expected tool calls

Replace the contents of tests/eval/evalsets/basic.evalset.json with:

{
  "eval_set_id": "basic_eval",
  "name": "Basic Agent Evaluation",
  "description": "Validates that the agent calls the right tools AND produces a quality response.",
  "eval_cases": [
    {
      "eval_id": "weather_san_francisco",
      "conversation": [
        {
          "user_content": {"parts": [{"text": "What's the weather like in San Francisco?"}], "role": "user"},
          "final_response": {"parts": [{"text": "The weather in San Francisco is 60 degrees and foggy."}], "role": "model"},
          "intermediate_data": {
            "tool_uses": [{"name": "get_weather", "args": {"query": "San Francisco"}}],
            "tool_responses": [],
            "intermediate_responses": []
          }
        }
      ],
      "session_input": {"app_name": "app", "user_id": "eval_user", "state": {}}
    },
    {
      "eval_id": "weather_tokyo",
      "conversation": [
        {
          "user_content": {"parts": [{"text": "What's the weather in Tokyo?"}], "role": "user"},
          "final_response": {"parts": [{"text": "The weather in Tokyo is 90 degrees and sunny."}], "role": "model"},
          "intermediate_data": {
            "tool_uses": [{"name": "get_weather", "args": {"query": "Tokyo"}}],
            "tool_responses": [],
            "intermediate_responses": []
          }
        }
      ],
      "session_input": {"app_name": "app", "user_id": "eval_user", "state": {}}
    },
    {
      "eval_id": "time_san_francisco",
      "conversation": [
        {
          "user_content": {"parts": [{"text": "What time is it in San Francisco?"}], "role": "user"},
          "intermediate_data": {
            "tool_uses": [{"name": "get_current_time", "args": {"query": "San Francisco"}}],
            "tool_responses": [],
            "intermediate_responses": []
          }
        }
      ],
      "session_input": {"app_name": "app", "user_id": "eval_user", "state": {}}
    }
  ]
}

The intermediate_data.tool_uses block is the expected trajectory — what tools the agent should call, with what arguments. The trajectory metric compares this against what actually happened at runtime.
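
If you end up maintaining many cases, generating them from a small helper can keep the JSON consistent. This is a minimal sketch that assumes the field layout of the evalset shown above; make_eval_case is a hypothetical helper, not an ADK API.

```python
import json
from typing import Any, Dict, Optional


def make_eval_case(
    eval_id: str,
    user_text: str,
    tool_name: str,
    tool_args: Dict[str, Any],
    expected_response: Optional[str] = None,
) -> Dict[str, Any]:
    """Build a single-turn eval case with one expected tool call."""
    turn: Dict[str, Any] = {
        "user_content": {"parts": [{"text": user_text}], "role": "user"},
        "intermediate_data": {
            "tool_uses": [{"name": tool_name, "args": tool_args}],
            "tool_responses": [],
            "intermediate_responses": [],
        },
    }
    # The time case above omits final_response, so it stays optional here.
    if expected_response is not None:
        turn["final_response"] = {
            "parts": [{"text": expected_response}],
            "role": "model",
        }
    return {
        "eval_id": eval_id,
        "conversation": [turn],
        "session_input": {"app_name": "app", "user_id": "eval_user", "state": {}},
    }


tokyo_case = make_eval_case(
    "weather_tokyo",
    "What's the weather in Tokyo?",
    "get_weather",
    {"query": "Tokyo"},
    "The weather in Tokyo is 90 degrees and sunny.",
)
print(json.dumps(tokyo_case, indent=2))
```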

Edit the rubric to score both metrics

Replace tests/eval/eval_config.json with:

{
  "criteria": {
    "tool_trajectory_avg_score": 1.0,
    "rubric_based_final_response_quality_v1": {
      "threshold": 0.8,
      "judgeModelOptions": {"judgeModel": "gemini-flash-latest", "numSamples": 1},
      "rubrics": [
        {"rubricId": "relevance",     "rubricContent": {"textProperty": "The response directly addresses the user's query."}},
        {"rubricId": "helpfulness",   "rubricContent": {"textProperty": "The response is helpful and provides useful information."}},
        {"rubricId": "tool_grounded", "rubricContent": {"textProperty": "The response is grounded in the values returned by the tools (e.g. the exact temperature and weather condition) and does not invent details."}}
      ]
    }
  }
}

What each line does:

  • tool_trajectory_avg_score: 1.0 - adds the trajectory metric with a strict threshold of 1.0 (every expected tool call must match exactly). The score is computed deterministically by comparing the agent's actual function_call events against intermediate_data.tool_uses.
  • rubric_based_final_response_quality_v1 - runs an LLM-as-judge (gemini-flash-latest here) that scores the final response against each rubric on a 0–1 scale, then averages the scores. The case passes when the average meets the threshold (0.8). The tool_grounded rubric explicitly asks the judge to penalize answers that contradict the tool output or invent details beyond it, which adds a defense against hallucination on top of the trajectory check.
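
To make the two pass conditions concrete, here is a simplified illustration of the logic just described. It is not ADK's implementation: the function names are hypothetical, and the real evaluators handle details (multi-turn averaging, judge sampling) that this sketch leaves out.

```python
from typing import Dict, List


def trajectory_score(expected: List[dict], actual: List[dict]) -> float:
    """1.0 when tool names and args match exactly and in order, else 0.0."""
    return 1.0 if expected == actual else 0.0


def rubric_average(rubric_scores: Dict[str, float]) -> float:
    """Average the per-rubric scores (each on a 0-1 scale)."""
    return sum(rubric_scores.values()) / len(rubric_scores)


def case_passes(
    traj_score: float,
    rubric_scores: Dict[str, float],
    traj_threshold: float = 1.0,
    rubric_threshold: float = 0.8,
) -> bool:
    """A case passes only if BOTH metrics meet their thresholds."""
    return (
        traj_score >= traj_threshold
        and rubric_average(rubric_scores) >= rubric_threshold
    )


expected_calls = [{"name": "get_weather", "args": {"query": "San Francisco"}}]
wrong_args = [{"name": "get_weather", "args": {"query": "SF"}}]
print(trajectory_score(expected_calls, expected_calls))  # exact match
print(trajectory_score(expected_calls, wrong_args))      # argument mismatch
```

Note how a fluent answer cannot rescue a wrong tool call: with a trajectory score of 0.0, case_passes returns False regardless of the rubric scores.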

🤖 With a Coding Agent

Ask your agent:

"Run the evaluations for my agent"

👤 Manual Mode

agents-cli eval run --all

--all runs every *.evalset.json under tests/eval/evalsets/. To target one, use --evalset tests/eval/evalsets/basic.evalset.json.

Expected output (trimmed):

  ▸ uv run adk eval ./app tests/eval/evalsets/basic.evalset.json --config_file_path tests/eval/eval_config.json
INFO - google_llm.py - Sending out request, model: gemini-flash-latest, backend: GoogleLLMVariant.VERTEX_AI
INFO - local_eval_set_results_manager.py - Writing eval result to file: app/.adk/eval_history/app_basic_eval_<ts>.evalset_result.json
Using evaluation criteria: criteria={'tool_trajectory_avg_score': 1.0, 'rubric_based_final_response_quality_v1': BaseCriterion(threshold=0.8, ...)}
*********************************************************************
Eval Run Summary
basic_eval:
  Tests passed: 3
  Tests failed: 0

Per-case results are written under app/.adk/eval_history/ — each result file lists per-metric scores so you can see exactly which check passed or failed.

How to read a result file

A passing case in app/.adk/eval_history/app_basic_eval_<ts>.evalset_result.json looks like:

{
  "eval_id": "weather_san_francisco",
  "final_eval_status": 1,
  "overall_eval_metric_results": [
    {"metric_name": "tool_trajectory_avg_score",            "score": 1.0, "threshold": 1.0, "eval_status": 1},
    {"metric_name": "rubric_based_final_response_quality_v1","score": 1.0, "threshold": 0.8, "eval_status": 1}
  ]
}

If the agent had hallucinated and called the wrong tool (or no tool), tool_trajectory_avg_score would drop to 0.0 and the case would fail — even if the final text happened to read plausibly. That's the property the rubric alone can't give you.
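
When a run fails, a few lines of Python can summarize which metric was responsible. This sketch assumes a case object shaped like the sample above and that eval_status 1 means "passed", as it does there. How cases are nested inside the full result file is not shown here, so the hypothetical failing_metrics helper takes a single case object.

```python
from typing import List


def failing_metrics(case_result: dict) -> List[str]:
    """Describe each metric whose eval_status is not 1 (passed)."""
    failures = []
    for metric in case_result.get("overall_eval_metric_results", []):
        if metric.get("eval_status") != 1:
            failures.append(
                f"{metric['metric_name']}: "
                f"score {metric.get('score')}, threshold {metric.get('threshold')}"
            )
    return failures


# Illustrative data only: a case where the wrong tool was called.
# The non-passing status value used here (2) is an assumption.
example_case = {
    "eval_id": "weather_san_francisco",
    "final_eval_status": 2,
    "overall_eval_metric_results": [
        {"metric_name": "tool_trajectory_avg_score", "score": 0.0, "threshold": 1.0, "eval_status": 2},
        {"metric_name": "rubric_based_final_response_quality_v1", "score": 1.0, "threshold": 0.8, "eval_status": 1},
    ],
}
for line in failing_metrics(example_case):
    print(line)
```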

Other available metrics

ADK ships several built-in evaluators you can add to criteria:

Metric                                    What it checks
----------------------------------------  ------------------------------------------------------------------
tool_trajectory_avg_score                 Exact-match tool calls against expected trajectory
rubric_based_final_response_quality_v1    LLM judge against your rubric (final response)
rubric_based_tool_use_quality_v1          LLM judge against rubric (tool usage)
final_response_match_v2                   LLM judge: is final response semantically equivalent to expected?
response_match_score                      ROUGE-1 between actual and expected final response
safety_v1                                 LLM judge: safety of the response
hallucinations_v1                         LLM judge: response is grounded in provided context
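
Adding one of these metrics means adding a key under criteria in tests/eval/eval_config.json. The sketch below assumes the metric accepts the same bare-threshold form used for tool_trajectory_avg_score; LLM-judged metrics may need the fuller object form shown for the rubric metric. add_threshold_metric is a hypothetical helper, and the 0.8 threshold is only an example value.

```python
import json


def add_threshold_metric(config: dict, metric_name: str, threshold: float) -> dict:
    """Return a copy of the eval config with one more threshold-only metric."""
    updated = json.loads(json.dumps(config))  # deep copy via JSON round-trip
    updated.setdefault("criteria", {})[metric_name] = threshold
    return updated


base_config = {"criteria": {"tool_trajectory_avg_score": 1.0}}
new_config = add_threshold_metric(base_config, "response_match_score", 0.8)
print(json.dumps(new_config, indent=2))
```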

9. Add Agent Runtime deployment

Agent Runtime is Google Cloud's managed, serverless runtime for ADK agents. It handles scaling, infrastructure, and observability automatically.

🤖 With a Coding Agent

Ask your agent:

"Add Agent Runtime deployment to my project"

👤 Manual Mode

agents-cli scaffold enhance --deployment-target agent_runtime --yes

Expected output:

Agents CLI v0.1.2
Resolved project root to: /home/user/customer-support-agent

Generating templates for comparison...
  - Original template...
  - Enhanced template...

Comparing files...

Will auto-update (unchanged by you):
  ✓ README.md
  ✓ app/app_utils/telemetry.py

Skipping (your code):
  - app/agent.py

Files to add:
  + app/agent_runtime_app.py
  + deployment_metadata.json
  + tests/integration/test_agent_runtime_app.py

Files to remove:
  - Dockerfile
  - app/fast_api_app.py
  - tests/integration/test_server_e2e.py

Dependency changes:
  + Add: google-cloud-aiplatform[evaluation,agent-engines]>=1.130.0
  + Add: protobuf>=6.31.1,<7.0.0

📦 Creating backup before modification...
Backup created: /home/user/.agents-cli/backups/customer-support-agent_20260430_001940

  Updated: 2 files
  Added: 3 files
  Removed: 3 files

✅ Enhance complete!

What changed?

Added:

  • app/agent_runtime_app.py - Agent Runtime wrapper
  • deployment_metadata.json - Deployment tracking
  • Agent Runtime-specific tests

Removed:

  • Dockerfile - Not needed (Agent Runtime is managed)
  • fast_api_app.py - Replaced with Agent Runtime app

Preserved:

  • app/agent.py - Your agent code (untouched!)

10. Deploy to Agent Runtime

Deploy your agent to Google Cloud's managed infrastructure.

Update dependencies

uv lock

🤖 With a Coding Agent

Ask your agent:

"Deploy my agent to Agent Runtime in project YOUR_PROJECT_ID, region us-central1"

👤 Manual Mode

agents-cli deploy --project YOUR_PROJECT_ID --region us-central1

Expected output:

Using project root directory: /home/user/customer-support-agent
  📦 Auto-generated requirements: app/app_utils/.requirements.txt

    ╔═══════════════════════════════════════════════════════════╗
    ║                                                           ║
    ║   🤖 DEPLOYING AGENT TO VERTEX AI AGENT ENGINE 🤖         ║
    ║                                                           ║
    ╚═══════════════════════════════════════════════════════════╝

📋 Deployment Parameters:
  Project: YOUR_PROJECT_ID
  Location: us-central1
  Display Name: customer-support-agent
  Min Instances: 1
  Max Instances: 10
  CPU: 4
  Memory: 8Gi
  Container Concurrency: 9

🌍 Environment Variables:
  AGENT_VERSION: 0.1.0
  GOOGLE_CLOUD_AGENT_ENGINE_ENABLE_TELEMETRY: true
  GOOGLE_CLOUD_REGION: us-central1
  NUM_WORKERS: 1
  OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT: true

INFO:root:Introspecting app.agent_runtime_app.agent_runtime via subprocess

🚀 Creating agent: customer-support-agent (this can take 5-10 minutes)...
INFO:vertexai_genai.agentengines:Using agent framework: google-adk
   Operation: projects/.../locations/us-central1/reasoningEngines/.../operations/...

[... deployment in progress for ~5–10 minutes ...]

INFO:root:Agent Runtime ID written to deployment_metadata.json

✅ Deployment successful!
Agent Runtime ID: projects/.../locations/us-central1/reasoningEngines/XXXXXXXXXXXXXXXXXX
Service Account: service-XXXXXXXXX@gcp-sa-aiplatform-re.iam.gserviceaccount.com

📊 Open Console Playground: https://console.cloud.google.com/vertex-ai/agents/agent-engines/locations/us-central1/agent-engines/XXXXXXXXXXXXXXXXXX/playground?project=YOUR_PROJECT_ID

What just happened?

Agent Runtime:

  • Packaged your code automatically
  • Uploaded to Google Cloud
  • Provisioned managed infrastructure (4 CPU, 8Gi memory)
  • Configured auto-scaling (1-10 instances)
  • Enabled telemetry and observability

Deployment time: ~5–10 minutes (one-time setup; subsequent re-deploys are faster)

11. Test and monitor your deployed agent

Your agent is now running on Agent Runtime! Let's test it and explore the built-in monitoring.

Test your agent

Option 1: Console Playground (Easiest)

The agents-cli deploy output prints a Console Playground link of the form:

https://console.cloud.google.com/vertex-ai/agents/agent-engines/locations/<REGION>/agent-engines/<RUNTIME_ID>/playground?project=<PROJECT_ID>

(Agent Runtime is the productized name; the underlying GCP resource is a Vertex AI Agent Engine, which is why the URL path includes vertex-ai/agents/agent-engines.) Click the link to:

  1. Sign in to the Google Cloud Console
  2. See your deployed agent with an interactive chat UI
  3. Test queries like "What's the weather in San Francisco?"
  4. View tool calls and responses in real-time

The Console Playground provides the easiest way to verify your deployment.

Option 2: agents-cli run --url

The same agents-cli run command you used locally also queries deployed agents:

RUNTIME_ID=$(jq -r .remote_agent_runtime_id deployment_metadata.json)
agents-cli run \
  --url "https://us-central1-aiplatform.googleapis.com/v1/${RUNTIME_ID}" \
  --mode adk \
  "What's the weather in San Francisco?"

--mode adk speaks the ADK SSE protocol (the deployed agent exposes :streamQuery). agents-cli auto-attaches a Google access token from your active gcloud credentials.
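
If you would rather assemble the URL in Python than with jq, the same two steps look roughly like this. The sketch assumes the remote_agent_runtime_id key and the URL pattern shown in the command above; runtime_id_from_metadata and runtime_url are hypothetical helper names.

```python
import json
from pathlib import Path


def runtime_id_from_metadata(metadata_json: str) -> str:
    """Extract the full resource name from deployment_metadata.json text."""
    return json.loads(metadata_json)["remote_agent_runtime_id"]


def runtime_url(runtime_id: str, region: str = "us-central1") -> str:
    """Build the regional endpoint URL to pass to `agents-cli run --url`."""
    return f"https://{region}-aiplatform.googleapis.com/v1/{runtime_id}"


if __name__ == "__main__":
    metadata_file = Path("deployment_metadata.json")
    if metadata_file.exists():
        print(runtime_url(runtime_id_from_metadata(metadata_file.read_text())))
    else:
        print("deployment_metadata.json not found; deploy the agent first.")
```

The region in the hostname should match the region inside the resource name, since the endpoint is regional.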

Option 3: Agent Engine SDK (Python)

For programmatic access from Python, use the Agent Engine module of the Vertex AI SDK (vertexai.agent_engines):

import vertexai
from vertexai import agent_engines

vertexai.init(project="YOUR_PROJECT_ID", location="us-central1")

# remote_agent_runtime_id from deployment_metadata.json (full resource name)
remote_agent = agent_engines.get(
    "projects/.../locations/us-central1/reasoningEngines/..."
)

session = remote_agent.create_session(user_id="user-1")
for event in remote_agent.stream_query(
    user_id="user-1",
    session_id=session["id"],
    message="What's the weather in San Francisco?",
):
    print(event)

Monitor your agent

Agent Runtime includes built-in observability through Google Cloud's monitoring tools:

Cloud Trace - Request tracing

  1. Navigate to: Cloud Console > Trace Explorer
  2. Select your project
  3. Filter spans by attribute service.name = customer-support-agent
  4. Drill into a trace to see model inference, tool calls, and end-to-end latency

Cloud Logging - Application logs

  1. Navigate to: Cloud Console > Logs Explorer
  2. Use query: resource.type="aiplatform.googleapis.com/ReasoningEngine"
  3. Add resource.labels.reasoning_engine_id="<REASONING_ENGINE_ID>" (the ID at the end of the Agent Runtime resource name) to scope to this agent
  4. View agent requests, responses, tool execution, and errors
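
The Logs Explorer query can be derived from the resource name in deployment_metadata.json. This sketch assumes the reasoning_engine_id label is the final segment of that resource name; reasoning_engine_id and logs_filter are hypothetical helpers.

```python
def reasoning_engine_id(resource_name: str) -> str:
    """Return the trailing ID of a .../reasoningEngines/<id> resource name."""
    _, separator, engine_id = resource_name.rpartition("/reasoningEngines/")
    if not separator or not engine_id or "/" in engine_id:
        raise ValueError(f"Not a reasoning engine resource name: {resource_name!r}")
    return engine_id


def logs_filter(resource_name: str) -> str:
    """Build a Logs Explorer query scoped to one deployed agent."""
    return (
        'resource.type="aiplatform.googleapis.com/ReasoningEngine" '
        f'AND resource.labels.reasoning_engine_id="{reasoning_engine_id(resource_name)}"'
    )


print(logs_filter("projects/my-project/locations/us-central1/reasoningEngines/123"))
```

Paste the printed string into the Logs Explorer query box; the project name and ID in the example call are placeholders.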

Cloud Monitoring - Metrics and dashboards

  1. Navigate to: Cloud Console > Metrics Explorer
  2. Filter by resource type aiplatform.googleapis.com/ReasoningEngine
  3. Useful metrics: request_count, request_latencies, instance_count

12. Optional: Publish to Gemini Enterprise

Make your agent available in Gemini Enterprise for your organization.

🤖 With a Coding Agent

Ask your agent:

"Publish my agent to Gemini Enterprise"

👤 Manual Mode

List the Gemini Enterprise apps in your project so you have an app ID to target:

agents-cli publish gemini-enterprise --list

Then register the deployed agent. The command auto-reads the Agent Runtime ID from deployment_metadata.json:

agents-cli publish gemini-enterprise \
  --gemini-enterprise-app-id "projects/PROJECT_NUMBER/locations/global/collections/default_collection/engines/YOUR_APP_ID" \
  --display-name "Customer Support Agent" \
  --description "Answers weather and time questions" \
  --tool-description "Use this tool to ask the customer support agent."

This makes your agent available:

  • In the Gemini Enterprise agent marketplace
  • To users in your organization
  • With centralized management and governance

For details, see the Publishing documentation.

13. Clean up

To avoid charges, clean up resources you created.

Delete the agent

Get your Agent Runtime resource name from deployment_metadata.json:

jq -r .remote_agent_runtime_id deployment_metadata.json

Delete via Agent Engine SDK (Python)

import vertexai
from vertexai import agent_engines

vertexai.init(project="YOUR_PROJECT_ID", location="us-central1")

# Full resource name from deployment_metadata.json
remote_agent = agent_engines.get(
    "projects/.../locations/us-central1/reasoningEngines/..."
)

# force=True also deletes any child sessions
remote_agent.delete(force=True)
print("✅ Deleted successfully")

Or via the REST API (curl):

RUNTIME_ID=$(jq -r .remote_agent_runtime_id deployment_metadata.json)
curl -X DELETE \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  "https://us-central1-aiplatform.googleapis.com/v1beta1/${RUNTIME_ID}?force=true"

14. Congratulations!

🎉 You did it! You've successfully built, tested, and deployed an AI agent from scratch.

What you learned

  • Two ways to use Agents CLI (with coding agent or manually)
  • Creating agent projects with agents-cli scaffold
  • Building agents with custom tools
  • Testing locally with the playground
  • Running automated evaluations
  • Deploying to Agent Runtime (managed infrastructure)
  • Monitoring with Cloud Trace and Logging
  • Production best practices

What you built

A production-ready AI agent with:

  • Natural language understanding (Gemini)
  • Custom tool integration
  • Quality evaluations
  • Auto-scaling deployment
  • Built-in observability

Next steps

Extend your agent:

  • Add more tools (database queries, API calls)
  • Implement RAG with vector search
  • Add multi-turn conversations
  • Enable A2A (agent-to-agent) communication
