Build an Autonomous Supply Chain with Gemini 3 Flash & AlloyDB AI

1. Overview

The era of "Chatbots that Read" is ending. We are entering the era of Agentic Vision.

In this Codelab, we will implement Deterministic AI Engineering—a practice of building AI systems that don't guess. Standard AI models often "hallucinate" (guess) when asked to count items in a complex image. In a supply chain, a guess is dangerous. If an AI guesses you have 12 items when you actually have 15, it triggers costly errors.

We will build an Autonomous Supply Chain Agent utilizing the new Think, Act, Observe loop in Gemini 3 Flash. It doesn't just look; it investigates.

The Deterministic Architecture

We will start with a "blind" and "amnesiac" system. You will manually "awaken" its senses one by one:


  1. The Eyes (Vision Agent): We enable Gemini 3 Flash with Code Execution. Instead of predicting tokens to guess a number, the model writes Python code (OpenCV) to count pixels deterministically.
  2. The Memory (Supplier Agent): We enable AlloyDB AI with ScaNN (Scalable Nearest Neighbors). This allows the agent to recall the exact supplier for a part from millions of options in milliseconds.
  3. The Handshake (A2A Protocol): We enable Agent-to-Agent communication using a standardized agent_card.json, allowing the Vision Agent to autonomously order stock from the Supplier Agent.

What you'll build

  • A Vision Agent that performs "visual math" on camera feeds.
  • A Supplier Agent backed by AlloyDB ScaNN for high-speed vector search.
  • A Control Tower frontend with real-time WebSocket updates to visualize the autonomous loop.

What you'll learn

  • How to set up AlloyDB with vector embeddings and ScaNN indexes.
  • How to enable Agentic Vision with gemini-3-flash-preview using the Gemini API.
  • How to implement vector search using the <=> (cosine distance) operator in AlloyDB.
  • How to connect agents to AlloyDB using the AlloyDB Python Connector.
  • How to use the A2A Protocol for dynamic agent discovery.

Requirements

  • A browser, such as Chrome or Firefox
  • A Google Cloud project with billing enabled.
  • A Gemini API key (free tier available at Google AI Studio) for the Vision Agent.

2. Before you begin

Create a project

  1. In the Google Cloud Console, on the project selector page, select or create a Google Cloud project.
  2. Make sure that billing is enabled for your Cloud project. Learn how to check if billing is enabled on a project.
  3. You'll use Cloud Shell, a command-line environment running in Google Cloud. Click Activate Cloud Shell at the top of the Google Cloud console.


  4. Once connected to Cloud Shell, check that you're authenticated and that the project is set to your project ID using the following command:
gcloud auth list

Setup Database [AlloyDB]

Before anything else, let's provision the database — this takes ~15 minutes, so we'll kick it off first.

  1. Click the button below to open the AlloyDB setup tool in Cloud Shell:

  2. Run the setup:
sh run.sh
  3. Use the Web Preview (eye icon 👁️ → Preview on port 8080) to open the setup UI.
  4. Enter your Project ID, select a Region (e.g., us-central1), and create a Database Password.

⚠️ SAVE THIS PASSWORD — you'll need it when the setup script asks for it.

  5. Click Start Deployment and wait ~15 minutes for the cluster to provision.

Get the Code

While AlloyDB provisions (or once it's done), open the codelab repo in Cloud Shell:

⚠️ IMPORTANT: When you click the button, you'll see a security dialog. Check the box "Trust repo" and click "Confirm".

Alternatively, clone manually:

git clone https://github.com/MohitBhimrajka/visual-commerce-gemini-3-alloydb.git

cd visual-commerce-gemini-3-alloydb

Set Your Project

In this Cloud Shell terminal, confirm your project is set:

gcloud config set project <YOUR_PROJECT_ID>

Enable Public IP on AlloyDB

Once AlloyDB provisioning is complete, enable Public IP so the Python Connector can connect from Cloud Shell:

  1. Go to the AlloyDB Console
  2. Click your cluster → click your primary instance
  3. Click Edit
  4. Scroll to Public IP Connectivity and check Enable Public IP
  5. Click Update Instance

💡 Note: The AlloyDB Python Connector handles authentication and encryption — you don't need to add any authorized external networks.

Grant Vertex AI Permissions

The AlloyDB service account needs Vertex AI access for generating embeddings. Run this in the same Cloud Shell window:

PROJECT_ID=$(gcloud config get-value project)


gcloud projects add-iam-policy-binding $PROJECT_ID \
  --member="serviceAccount:service-$(gcloud projects describe $PROJECT_ID --format="value(projectNumber)")@gcp-sa-alloydb.iam.gserviceaccount.com" \
  --role="roles/aiplatform.user"

Run the Setup Script

Now run the setup script — it will auto-detect your AlloyDB instance:

sh setup.sh

What the script does:

  • Validates gcloud CLI, authentication, project, and Python 3
  • Checks and enables required APIs (AlloyDB, Vertex AI, Compute, Service Networking)
  • Prompts for your Gemini API key
  • Auto-detects your AlloyDB instance and extracts region, cluster, and instance name
  • Asks for your database password
  • Generates the .env configuration file
  • Installs Python dependencies
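The resulting .env is a plain key=value file. A sketch of what it might contain follows; the exact key names are defined by the repo's setup.sh, so treat every key below as illustrative:

```shell
# Illustrative .env; actual key names come from setup.sh
PROJECT_ID=your-project-id
REGION=us-central1
CLUSTER=your-alloydb-cluster
INSTANCE=your-primary-instance
DB_PASSWORD=your-database-password
GEMINI_API_KEY=your-gemini-api-key
```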

3. Database Setup

At the heart of our application lies AlloyDB for PostgreSQL. We'll use its powerful vector capabilities and ScaNN index to enable near real-time semantic search, allowing our agents to find inventory matches across thousands of records in milliseconds.

In this section, you'll provision the schema, seed data, and generate embeddings — all from AlloyDB Studio.

Connect to AlloyDB Studio

  1. Go to your AlloyDB instance in the AlloyDB Console
  2. Click AlloyDB Studio in the left navigation
  3. Authenticate with:
  4. Username: postgres
  5. Database: postgres
  6. Password: The password you set during cluster creation

Enable Extensions

AlloyDB provides built-in extensions for vectors and AI. Run the following SQL in AlloyDB Studio:

CREATE EXTENSION IF NOT EXISTS google_ml_integration CASCADE;
CREATE EXTENSION IF NOT EXISTS vector;
CREATE EXTENSION IF NOT EXISTS alloydb_scann CASCADE;
  • google_ml_integration: Enables the ai.embedding() function for calling Vertex AI directly from SQL.
  • vector: Stores and queries 768-dimensional vector embeddings.
  • alloydb_scann: Enables Google's ScaNN index for ultra-fast vector search.

Create the Inventory Table

DROP TABLE IF EXISTS inventory;

CREATE TABLE inventory (
    id SERIAL PRIMARY KEY,
    part_name TEXT NOT NULL,
    supplier_name TEXT NOT NULL,
    description TEXT,
    stock_level INT DEFAULT 0,
    part_embedding vector(768)
);

The part_embedding column stores 768-dimensional vectors from text-embedding-005. This is what powers the semantic search.

Insert Sample Data

Insert 20 warehouse inventory items:

INSERT INTO inventory (part_name, supplier_name, description, stock_level) VALUES
('Cardboard Shipping Box Large', 'Packaging Solutions Inc', 'Heavy-duty corrugated cardboard shipping container, 24x18x12 inches', 250),
('Warehouse Storage Container', 'Industrial Supply Co', 'Stackable plastic storage bin with snap-lock lid, blue', 180),
('Product Shipping Boxes', 'Acme Packaging', 'Medium corrugated boxes for warehouse storage, 18x14x10 inches', 320),
('Industrial Widget X-9', 'Acme Corp', 'Heavy-duty industrial coupling for pneumatic systems', 50),
('Precision Bolt M4', 'Global Fasteners Inc', 'Stainless steel M4 allen bolt, 20mm length, grade A2-70', 200),
('Hexagonal Nut M6', 'Metro Supply Co', 'Galvanized steel hex nut M6, DIN 934 standard', 150),
('Phillips Head Screw 3x20', 'Acme Corp', 'Zinc-plated Phillips head wood screw, 3mm x 20mm', 500),
('Wooden Dowel 10mm', 'Craft Materials Ltd', 'Hardwood birch dowel rod, 10mm diameter x 300mm length', 80),
('Rubber Gasket Small', 'SealTech Industries', 'Buna-N rubber gasket, 25mm OD x 15mm ID, oil resistant', 120),
('Spring Tension 5kg', 'Mechanical Parts Co', 'Stainless steel compression spring, 5kg load capacity', 60),
('Bearing 6204', 'Bearings Direct', 'Deep groove ball bearing 6204-2RS, 20x47x14mm sealed', 45),
('Warehouse Shelf Boxes', 'Storage Systems Ltd', 'Standardized warehouse inventory boxes, corrugated, bulk pack', 400),
('Inventory Container Units', 'Supply Chain Pros', 'Modular stackable storage units for warehouse racking', 95),
('Aluminum Extrusion Bar', 'MetalWorks International', 'T-slot aluminum extrusion 20x20mm profile, 1 meter length', 110),
('Cable Tie Pack 200mm', 'ElectroParts Depot', 'Nylon cable ties, 200mm x 4.8mm, UV resistant black, pack of 100', 600),
('Hydraulic Hose 1/2 inch', 'FluidPower Systems', 'High-pressure hydraulic hose, 1/2 inch ID, 3000 PSI rated', 35),
('Safety Goggles Clear', 'WorkSafe Equipment Co', 'ANSI Z87.1 rated clear safety goggles, anti-fog coating', 275),
('Packing Tape Industrial', 'Packaging Solutions Inc', 'Heavy-duty polypropylene packing tape, 48mm x 100m, clear', 450),
('Stainless Steel Sheet 1mm', 'MetalWorks International', '304 stainless steel sheet, 1mm thickness, 300x300mm', 70),
('Silicone Sealant Tube', 'SealTech Industries', 'Industrial-grade RTV silicone sealant, 300ml cartridge, grey', 190);

Grant Embedding Permissions

GRANT EXECUTE ON FUNCTION embedding TO postgres;

Generate Vector Embeddings

This uses AlloyDB's built-in ai.embedding() function to call Vertex AI's text-embedding-005 model directly from SQL — no Python code needed:

UPDATE inventory
SET part_embedding = ai.embedding(
    'text-embedding-005',
    part_name || '. ' || description
)::vector
WHERE part_embedding IS NULL;

This generates 768-dimensional vectors that capture the semantic meaning of each part's name and description. The ScaNN index will use these for lightning-fast similarity search. This step takes approximately 3-5 minutes to complete.

Create the ScaNN Index

SET scann.allow_blocked_operations = true;

CREATE INDEX IF NOT EXISTS idx_inventory_scann
ON inventory USING scann (part_embedding cosine)
WITH (num_leaves=5, quantizer='sq8');

Verify Everything Worked

SELECT part_name, supplier_name, stock_level,
       (part_embedding IS NOT NULL) as has_embedding
FROM inventory
ORDER BY id;

You should see 20 rows, all with has_embedding = true.

4. Understanding the Architecture

Before making code changes, let's understand how the system is built. The architecture follows a progressive "awakening" pattern:

The Agent Stack

Vision Agent (agents/vision-agent/)

  • agent.py — Core Gemini 3 Flash logic. Sends images to the model with Code Execution enabled so it writes Python (OpenCV) to count items deterministically.
  • agent_executor.py — Bridges A2A protocol requests to the agent logic.
  • main.py — Uvicorn A2A server that serves /.well-known/agent-card.json and handles requests.

Supplier Agent (agents/supplier-agent/)

  • inventory.py — Connects to AlloyDB via the AlloyDB Python Connector (no Auth Proxy needed). Contains the find_supplier() function that performs ScaNN vector search.
  • agent_executor.py — Bridges A2A protocol to inventory search logic.
  • main.py — Uvicorn A2A server with agent card and health endpoint.

Control Tower (frontend/)

  • app.py — FastAPI + WebSocket server that discovers agents via A2A, orchestrates the vision → search → order pipeline, and streams real-time updates to the browser.

The A2A Flow

  • Control Tower reads /.well-known/agent-card.json from each agent
  • Discovers capabilities (skills, endpoints) — no hardcoded URLs
  • Sends image to Vision Agent → gets item count + description
  • Sends description as embedding query to Supplier Agent → gets part match
  • Places order autonomously
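The flow above can be sketched as three stubbed functions. The real agents do the heavy lifting, but the orchestration shape is the same; the return values here are hardcoded stand-ins, using a part/supplier pair from the seed data:

```python
# Toy sketch of the Control Tower pipeline with stubbed agents.
# The hardcoded return values stand in for real model and database calls.

def vision_agent(image_bytes):
    # Real agent: Gemini 3 Flash writes OpenCV code to count deterministically
    return {"count": 15, "description": "corrugated shipping boxes"}

def supplier_agent(description):
    # Real agent: embeds the text and runs a ScaNN vector search in AlloyDB
    return {"part": "Product Shipping Boxes", "supplier": "Acme Packaging"}

def place_order(match, count):
    return f"Ordered {count} x {match['part']} from {match['supplier']}"

seen = vision_agent(b"...")
match = supplier_agent(seen["description"])
print(place_order(match, seen["count"]))
# Ordered 15 x Product Shipping Boxes from Acme Packaging
```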

The AlloyDB Connection

The Supplier Agent uses the AlloyDB Python Connector instead of the traditional Auth Proxy:

from google.cloud.alloydbconnector import Connector

connector = Connector()
conn = connector.connect(
    inst_uri,       # Full instance URI
    "pg8000",       # Driver
    user="postgres",
    password=DB_PASS,
    ip_type="PUBLIC",  # Cloud Shell uses Public IP
)

This handles IAM authentication, SSL/TLS, and connection routing automatically. When deploying to Cloud Run later, simply change ip_type to "PRIVATE" for VPC access.
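For reference, the inst_uri argument is the fully qualified AlloyDB instance path, which can be assembled from the values in your .env. The component values below are placeholders:

```python
# Sketch: building the full AlloyDB instance URI the connector expects.
# The component values are placeholders; yours come from the .env file.
def alloydb_instance_uri(project, region, cluster, instance):
    return (f"projects/{project}/locations/{region}"
            f"/clusters/{cluster}/instances/{instance}")

uri = alloydb_instance_uri("my-project", "us-central1",
                           "my-cluster", "my-primary")
print(uri)
# projects/my-project/locations/us-central1/clusters/my-cluster/instances/my-primary
```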

5. Step 1: The Memory (Supplier Agent)

The Supplier Agent remembers millions of parts using AlloyDB ScaNN. Right now, it ships with a placeholder query — an amnesiac that returns the first row it finds, regardless of what you search for.

The Audit: The Amnesiac

If you query the Supplier Agent now, it returns a random result. It has no concept of similarity. Let's fix that.

Start the Supplier Agent

The A2A server (main.py) delegates to agent_executor.py, which bridges the protocol to the business logic in inventory.py.

pkill -f uvicorn  # Kill all uvicorn processes

Step 1: Navigate to the agent directory

cd agents/supplier-agent

Step 2: Install dependencies

pip install -r requirements.txt

Step 3: Start the agent server

uvicorn main:app --host 0.0.0.0 --port 8082 > /dev/null 2>&1 &

The > /dev/null 2>&1 & runs the server in the background and suppresses output so it doesn't interrupt your terminal.

Step 4: Verify the agent is running (wait 2-3 seconds after starting)

curl http://localhost:8082/.well-known/agent-card.json

Expected Output: JSON with agent configuration (should return without errors)

The Fix: Implementing the <=> Operator

Open agents/supplier-agent/inventory.py and find the find_supplier() function around line 60-70. You'll see the placeholder:

# TODO: Replace this placeholder query with ScaNN vector search

sql = "SELECT part_name, supplier_name FROM inventory LIMIT 1;"
cursor.execute(sql)

Replace those two lines with:

sql = """
SELECT part_name, supplier_name,
       part_embedding <=> %s::vector as distance
FROM inventory
ORDER BY part_embedding <=> %s::vector
LIMIT 1;
"""
cursor.execute(sql, (embedding_str, embedding_str))

What this does:

  • <=> is the cosine distance operator provided by PostgreSQL's vector extension
  • ORDER BY part_embedding <=> %s::vector finds the nearest match (lowest distance = closest semantic meaning)
  • %s::vector casts your embedding array to PostgreSQL's vector type
  • The ScaNN index automatically accelerates this query!
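To make the metric concrete, here is a toy illustration of what <=> computes and of the vector literal format the %s::vector cast expects. Real embeddings are 768-dimensional; these are 3-dimensional stand-ins:

```python
import math

# What <=> computes: cosine distance = 1 - cosine similarity.
def cosine_distance(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return 1.0 - dot / norm

query = [1.0, 0.0, 0.0]
close = [0.9, 0.1, 0.0]   # similar direction -> small distance
far   = [0.0, 1.0, 0.0]   # unrelated direction -> large distance

assert cosine_distance(query, close) < cosine_distance(query, far)

# The %s::vector cast expects a pgvector-style literal like '[0.1,0.2,...]':
embedding_str = "[" + ",".join(str(v) for v in query) + "]"
print(embedding_str)  # [1.0,0.0,0.0]
```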

Save the file (Ctrl+S or Cmd+S).

The agent will now use semantic search instead of returning random results!

Verification

Test A2A discovery and the inventory:

curl http://localhost:8082/.well-known/agent-card.json


python3 -c "
from inventory import find_supplier
import json
vec = [0.1]*768
r = find_supplier(vec)
if r:
    result = {'part': r[0], 'supplier': r[1]}
    if len(r) > 2:
        result['distance'] = float(r[2]) if r[2] else None
    print(json.dumps(result))
else:
    print('No result found')
"

Expected: agent-card.json returns the agent card. The Python snippet returns a part and supplier from the seeded data.

6. Step 2: The Eyes (Vision Agent)

While the database is accessible, let's awaken the eyes using Gemini 3 Flash. The Vision Agent performs "visual math" via Code Execution.

The Audit: The Hallucination

If you ask a standard multimodal model "How many boxes are in this messy image?", it processes the image as a static snapshot and guesses.

  • Model says: "I see about 12 boxes."
  • Reality: There are 15 boxes.
  • Result: Supply chain failure.

The Fix: Awakening the Think-Act-Observe Loop

We enable Code Execution and ThinkingConfig so the model writes Python (OpenCV) to count deterministically.

  1. Open agents/vision-agent/agent.py .
  2. Find the GenerateContentConfig section (around line 68-78).
  3. Uncomment both the thinking_config=types.ThinkingConfig(...) block and tools=[types.Tool(code_execution=...)].
  4. The client is already configured to use your GEMINI_API_KEY from the environment.

File: agents/vision-agent/agent.py

config = types.GenerateContentConfig(
    temperature=0,
    # CODELAB STEP 1: Uncomment to enable reasoning
    thinking_config=types.ThinkingConfig(
        thinking_level="MINIMAL",  # Valid: "MINIMAL", "LOW", "MEDIUM", "HIGH"
        include_thoughts=False    # Set to True for debugging
    ),
    # CODELAB STEP 2: Uncomment to enable code execution
    tools=[types.Tool(code_execution=types.ToolCodeExecution())]
)

Why thinking_level="MINIMAL"?

For this task (counting items via code execution), "MINIMAL" provides sufficient reasoning to plan the script and verify the count. Using "HIGH" would add 2-3x latency without improving accuracy for deterministic tasks.

Cost-Performance Optimization is a key skill for production AI engineering: match reasoning depth to task complexity.

Start the Vision Agent

🔄 Path Check: If you're still in agents/supplier-agent/, first go back to repo root with cd ../..

Step 1: Navigate to the vision agent directory

cd agents/vision-agent

Step 2: Install dependencies

pip install -r requirements.txt

Step 3: Start the vision agent server

uvicorn main:app --host 0.0.0.0 --port 8081 > /dev/null 2>&1 &

The > /dev/null 2>&1 & runs the server in the background and suppresses output so it doesn't interrupt your terminal.

Verification

Test A2A discovery:

curl http://localhost:8081/.well-known/agent-card.json

Expected: JSON with agent name and skills. You'll test the actual vision counting with the Control Tower UI in Step 8.


7. Step 3: The Handshake (A2A Agent Card)

Our agent sees the problem (Vision) and knows the supplier (Memory). The A2A protocol enables dynamic discovery—the frontend learns how to talk to each agent by reading its card.

A2A vs Traditional REST APIs

Aspect                 | Traditional REST             | A2A Protocol
-----------------------|------------------------------|----------------------------------------------------------------
Endpoint Discovery     | Hardcoded URLs in config     | Dynamic via /.well-known/agent-card.json
Capability Description | API docs (for humans)        | Skills (machine-readable)
Integration            | Manual code per service      | Semantic matching: "I need inventory search" → discovers skill
New Agent Added        | Update all clients' configs  | Zero config—auto-discovered

Real-World Benefit: In a traditional microservice, if you add a third "Logistics Agent," you'd need to update the Control Tower's code with its URL and API contract. With A2A, the Control Tower discovers it automatically and understands its capabilities through natural language skill descriptions.

This is why A2A enables Plug-and-Play Agent Composition—the architectural pattern for autonomous systems.
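A toy registry makes the plug-and-play point concrete: because clients match a needed capability against advertised skill tags, a new agent is just one more card, with no client-side changes. The card contents here are illustrative:

```python
# Toy A2A registry: the Control Tower only holds a list of agent cards.
cards = [
    {"name": "Vision Agent",   "skills": [{"id": "count_items",
                                           "tags": ["vision", "count"]}]},
    {"name": "Supplier Agent", "skills": [{"id": "search_inventory",
                                           "tags": ["inventory", "search"]}]},
]

def discover(cards, wanted_tag):
    """Return (agent name, skill id) for the first skill carrying wanted_tag."""
    for card in cards:
        for skill in card["skills"]:
            if wanted_tag in skill["tags"]:
                return card["name"], skill["id"]
    return None

# Adding a Logistics Agent later is just one more card: no client changes.
cards.append({"name": "Logistics Agent",
              "skills": [{"id": "calculate_shipping",
                          "tags": ["logistics", "shipping"]}]})

print(discover(cards, "shipping"))  # ('Logistics Agent', 'calculate_shipping')
```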

Create the Agent Card

🔄 Path Check: If you're still in agents/vision-agent/, first go back to repo root with cd ../..

The agent card is already included at agents/supplier-agent/agent_card.json. Open it and review:

{
  "name": "Acme Supplier Agent",
  "description": "Autonomous fulfillment for industrial parts via AlloyDB ScaNN.",
  "version": "1.0.0",
  "skills": [{
    "id": "search_inventory",
    "name": "Search Inventory",
    "description": "Searches the warehouse database for semantic matches using AlloyDB ScaNN vector search.",
    "tags": ["inventory", "search", "alloydb"],
    "examples": ["Find stock for Industrial Widget X-9", "Who supplies ball bearings?"]
  }]
}

Feel free to customize the name, description, or examples to match your use case.

Restart the Supplier Agent to load the card:

Step 1: Stop the running agent

pkill -f "uvicorn main:app.*8082"

Step 2: Navigate to the agent directory

cd agents/supplier-agent

Step 3: Start the agent again

uvicorn main:app --host 0.0.0.0 --port 8082 > /dev/null 2>&1 &

The > /dev/null 2>&1 & runs the server in the background and suppresses output so it doesn't interrupt your terminal.

Step 4: Verify the new agent card (wait 2-3 seconds after starting)

curl http://localhost:8082/.well-known/agent-card.json

Expected Output: JSON with your filled-in name, description, and skills.


8. Step 4: The Control Tower

Run the Control Tower frontend with FastAPI + WebSockets. It discovers agents via A2A and orchestrates the full loop with real-time updates.

Start All Services

The easiest way to start all services:

Verify you're in repo root

pwd  # Should end with: visual-commerce-gemini-3-alloydb

Then,

sh run.sh

This single command starts:

  • Vision Agent on port 8081
  • Supplier Agent on port 8082
  • Control Tower on port 8080

Wait ~10 seconds for all services to initialize.

Test the System

Access the Control Tower:

  1. Click Web Preview button (eye icon 👁️) in the Cloud Shell toolbar
  2. Select "Preview on port 8080"
  3. The Control Tower dashboard will open in a new tab

Run the Demo:

  1. Top right: Connection status (green "Live" dot), DEMO/AUTO mode toggle, and audio controls
  2. Center: Main workflow canvas with image upload and analysis visualization
  3. Side panels (appear during analysis): Workflow timeline (left), progress tracking and code viewer (right)

Option 1: Quick Start (Recommended)

  1. On the homepage, you'll see a "Quick start" section with sample images
  2. Click any sample image to auto-start analysis
  3. Watch the autonomous workflow (~30-45 seconds)

Option 2: Upload Your Own

  1. Drag and drop a warehouse/shelf image (PNG, JPG, up to 10MB) or click to browse
  2. Click "Initiate Autonomous Workflow"
  3. Observe the 4-stage pipeline

What Happens:

  1. Agent Discovery: A2A protocol modals show Vision Agent and Supplier Agent cards with their skills and endpoints
  2. Vision Analysis: Gemini 3 Flash generates and executes Python code (OpenCV) to count items. Progress bar shows substeps. Bounding boxes overlay on detected items. Result badge shows "✓ Code-Verified" or "~ Estimated"
  3. Supplier Match: AlloyDB ScaNN vector search animation. Search query displays (e.g., "industrial metal boxes"). Result card shows matched part, supplier, and confidence score
  4. Order Placed: Receipt card with order ID, quantity, and details

Tip: Keep DEMO mode on (top right) to pause at each stage for presentations. In AUTO mode, the workflow runs continuously.


What Just Happened

The Control Tower used A2A Protocol to discover both agents via /.well-known/agent-card.json, orchestrated the vision analysis (Gemini 3 Flash with code execution), performed vector search (AlloyDB ScaNN), and placed an autonomous order—all with real-time WebSocket updates. Each agent exposes its capabilities via the A2A standard, enabling plug-and-play composition without custom SDKs. Learn more: A2A Protocol

Troubleshooting

Path-Related Errors:

  • "No such file or directory" when running commands: You're not in the repo root.
# Check where you are
pwd

# If you're lost, navigate to home and back to repo
cd
cd visual-commerce-gemini-3-alloydb

Service Errors:

  • "Address already in use": Processes from previous runs are still active.
# Kill all services and restart
pkill -f uvicorn
sh run.sh  # Or manually restart individual agents
  • Services not starting: Check if ports are occupied:
# Check which processes are using the ports
lsof -i :8080  # Control Tower
lsof -i :8081  # Vision Agent
lsof -i :8082  # Supplier Agent
  • "Connection refused" to AlloyDB: Verify Public IP is enabled on your AlloyDB instance

9. 🎁 Bonus: Deploy to Cloud Run

Optional — Everything works locally! But if you'd like to share your creation with a public URL:

# From repo root
sh deploy/deploy.sh

What happens:

  1. Reads your .env configuration
  2. Asks for your name (shown in the deployed app)
  3. Deploys all 3 services as a single Cloud Run container
  4. Grants IAM roles for AlloyDB access
  5. Outputs a shareable URL

Visitors who open your URL will see a popup.

10. Cleanup

To avoid incurring charges, destroy all resources with the automated cleanup script:

# From repo root
sh deploy/cleanup.sh

This safely removes:

  • AlloyDB cluster (the primary cost driver)
  • Cloud Run services (if deployed)
  • Associated service accounts

The script will prompt for confirmation before deleting anything.

11. References & Further Reading

All technical claims in this codelab are verified from official Google Cloud and Google AI documentation.

Official Documentation

Gemini 3 Flash:

AlloyDB AI & ScaNN:

MCP Toolbox for Databases (alternative approach):

Pricing Information:

Verified Performance Claims

Feature                  | Claim                | Source
-------------------------|----------------------|------------------------------
ScaNN vs HNSW (filtered) | 10x faster           | Google Cloud Blog (verified)
ScaNN vs HNSW (standard) | 4x faster            | Google Cloud Blog (verified)
ScaNN memory footprint   | 3-4x smaller         | Google Cloud Blog (verified)
ScaNN index build time   | 8x faster            | Google Cloud Blog (verified)
Code execution timeout   | 30 seconds max       | Google Cloud Docs (verified)
Code execution file I/O  | Not supported        | Google Cloud Docs (verified)
Temperature=0 behavior   | Deterministic output | Community verified

Additional Resources

Agent-to-Agent (A2A) Protocol:

  • A2A standardizes agent discovery and communication
  • Agent cards served at /.well-known/agent-card.json
  • Emerging standard for autonomous agent collaboration

ScaNN Research:

  • Based on 12 years of Google Research
  • Powers Google Search, YouTube at billion-scale
  • Released for general availability: October 2024
  • First PostgreSQL vector index suitable for million-to-billion vectors

12. Challenge Mode: Level Up Your Agentic Skills

You've built a working autonomous supply chain. Ready to push further? These challenges apply the patterns you've learned to new problems.

Challenge 1: Image-Based Search (Multimodal Embeddings)

Current Flow: Vision Agent counts items → generates text query → Supplier Agent embeds text → searches AlloyDB

Challenge: Bypass text entirely—send the cropped image directly to the Supplier Agent.

Hints:

  1. The Vision Agent's code execution can crop individual items from the shelf image
  2. Vertex AI's multimodalembedding@001 model can embed images directly
  3. Modify inventory.py to accept image bytes instead of text
  4. Update the A2A skill description to indicate "Accepts: image/jpeg or text"

Why This Matters: Visual search is more accurate for parts with complex appearances (color variations, damage, packaging differences).
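For Hint 1: an OpenCV image is a NumPy array, so cropping a detected item is plain slicing. A minimal sketch with a made-up bounding box:

```python
import numpy as np

# An OpenCV image is just an H x W x 3 array; cropping is slicing.
image = np.zeros((100, 120, 3), dtype=np.uint8)  # stand-in for a shelf photo

# Hypothetical bounding box from the Vision Agent: (x, y, w, h)
x, y, w, h = 30, 20, 40, 50
crop = image[y:y + h, x:x + w]

print(crop.shape)  # (50, 40, 3)
```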

Challenge 2: Observability—Trust Through Transparency

Current State: The system works, but you can't see "under the hood"

Challenge: Inspect AlloyDB's query logs to prove the vector search is executing.

Steps:

  • Query insights are enabled by default on AlloyDB. To verify, run:
gcloud alloydb instances describe INSTANCE_NAME \
  --cluster=CLUSTER_NAME \
  --region=us-central1 \
  --format="value(queryInsightsConfig.queryPlansPerMinute)"
  • Run a supplier search through the UI
  • View the actual SQL executed:
gcloud logging read \
  'resource.type="alloydb.googleapis.com/Instance" AND textPayload:"ORDER BY part_embedding"' \
  --limit 5 \
  --format=json

Expected Output: You'll see the exact ORDER BY part_embedding <=> $1::vector LIMIT 1 query with execution time.

Why This Matters: Observability builds trust. When stakeholders ask "How does this agent make decisions?", you can show them the query plan, not just the output.

Challenge 3: Multi-Agent Composition

Challenge: Add a third agent (Logistics Agent) that calculates shipping costs based on warehouse location and item weight.

Architecture:

  • Vision Agent outputs: item count
  • Supplier Agent outputs: supplier location
  • Logistics Agent (NEW) inputs: destination, weight → outputs: shipping cost + ETA

Hint: The A2A protocol makes this trivial—create a new agent card with a calculate_shipping skill. The Control Tower will discover it automatically.
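A possible agent card for the new agent, mirroring the supplier card shown earlier (all field values here are hypothetical):

```json
{
  "name": "Logistics Agent",
  "description": "Calculates shipping cost and ETA for fulfilled orders.",
  "version": "1.0.0",
  "skills": [{
    "id": "calculate_shipping",
    "name": "Calculate Shipping",
    "description": "Estimates cost and delivery time from warehouse location and item weight.",
    "tags": ["logistics", "shipping"],
    "examples": ["Ship 15 boxes from us-central1 to Denver"]
  }]
}
```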

Pattern You're Learning: This is the core of Agent-Oriented Architecture—complex systems built from small, composable specialists.

13. Conclusion

You have successfully moved from Generative AI to Agentic AI.

What we built:

  • Vision: We replaced "guessing" with Code Execution (Gemini 3 Flash via API key).
  • Memory: We replaced "slow search" with AlloyDB ScaNN (via GCP).
  • Action: We replaced "API integration" with the A2A Protocol.

Hybrid Architecture Benefits:

This codelab demonstrated a hybrid approach:

  • Vision Agent: Uses Gemini API (API key) - simple, free tier available, no GCP billing required
  • Supplier Agent: Uses GCP (Vertex AI + AlloyDB) - enterprise-grade, compliance-ready

This is the architecture of the autonomous economy. The code is yours to keep.

Next Steps