Build a Supply Chain Orchestrator with ADK, AlloyDB, and Vertex AI Memory Bank

1. Overview

In this codelab, you will build a Supply Chain Orchestrator agent. This application allows users to analyze inventory, track logistics, and manage supply chain risks using natural language.

We will leverage Google's Agent Development Kit (ADK) to build a multi-agent architecture that maintains context, remembers user preferences via Vertex AI Memory Bank, and interacts with a massive dataset stored in AlloyDB via the MCP Toolbox.

What you'll build


A Python Flask application consisting of:

Global Orchestrator Agent: The root agent that manages conversation flow and delegation.

Specialist Agents: An "InventorySpecialist" and "LogisticsManager" for domain-specific tasks.

Memory Integration: Short-term session memory and long-term memory using Vertex AI Memory Bank.

Narrative UI: A web interface that visualizes the agent's reasoning process (Trace Context).

What you'll learn

  • How to use the Google ADK to create specialized agents and sub-agents.
  • How to integrate Vertex AI Memory Bank for long-term agent memory.
  • How to use MCP Toolbox to connect agents to AlloyDB data tools.
  • How to implement ADK Callbacks to trace and visualize agent reasoning.
  • How to deploy the solution using Cloud Run or run it locally.

The Architecture

The Tech Stack

  1. AlloyDB for PostgreSQL: Serves as the high-performance operational database holding 50,000+ supply chain records. It powers the vector search and retrieval.
  2. MCP Toolbox for Databases: Acts as the "Orchestration Maestro," exposing AlloyDB data as executable tools that the agents can call.
  3. Agent Development Kit (ADK): The framework used to define the agents, instructions, and tools.
  4. Vertex AI Memory Bank: Provides long-term memory, allowing the agent to recall user preferences and past interactions across sessions.
  5. Vertex AI Session Service: Manages short-term conversation context.

The Flow

  1. User Query: The user asks a question (e.g., "Check stock for Premium Ice Cream").
  2. Memory Check: The Orchestrator checks the Memory Bank for relevant past information (e.g., "User is a regional manager for EMEA").
  3. Delegation: The Orchestrator delegates the task to the InventorySpecialist.
  4. Tool Execution: The Specialist uses tools provided by the MCP Toolbox to query AlloyDB.
  5. Response: The agent processes the data and returns a Markdown-formatted table.
  6. Memory Storage: Significant interactions are saved back to the Memory Bank.
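As a purely illustrative sketch (not the actual ADK routing, which the Gemini model decides from each sub-agent's description), the delegation step behaves roughly like a keyword router:

```python
# Illustrative only: in the real app the Gemini model inside the
# GlobalOrchestrator decides delegation from the sub-agents' descriptions.
def delegate(query: str) -> str:
    """Pick the specialist that should handle a user query."""
    q = query.lower()
    if any(w in q for w in ("stock", "inventory", "product")):
        return "InventorySpecialist"
    if any(w in q for w in ("shipment", "logistics", "route", "risk")):
        return "LogisticsManager"
    return "GlobalOrchestrator"  # answer directly from memory/context

print(delegate("Check stock for Premium Ice Cream"))  # InventorySpecialist
print(delegate("Any delayed shipments in EMEA?"))     # LogisticsManager
```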

Requirements

  • A browser, such as Chrome or Firefox.
  • A Google Cloud project with billing enabled.
  • Basic familiarity with SQL and Python.

2. Before you begin

Create a project

  1. In the Google Cloud Console, on the project selector page, select or create a Google Cloud project.
  2. Make sure that billing is enabled for your Cloud project. Learn how to check if billing is enabled on a project.
  3. You'll use Cloud Shell, a command-line environment running in Google Cloud. Click Activate Cloud Shell at the top of the Google Cloud console.

Activate Cloud Shell button image

  4. Once connected to Cloud Shell, check that you're authenticated by running the following command:
gcloud auth list
  5. Run the following command in Cloud Shell to confirm that the gcloud command knows about your project:
gcloud config list project
  6. If your project is not set, use the following command to set it:
gcloud config set project <YOUR_PROJECT_ID>
  7. Enable the required APIs: follow the link and enable the APIs.

Alternatively, you can use the gcloud command for this. Refer to the documentation for gcloud commands and usage.

Gotchas & Troubleshooting

The "Ghost Project" Syndrome

You ran gcloud config set project, but you're actually looking at a different project in the Console UI. Check the project ID in the top-left dropdown!

The Billing Barricade

You enabled the project, but forgot the billing account. AlloyDB is a high-performance engine; it won't start if the "gas tank" (billing) is empty.

API Propagation Lag

You clicked "Enable APIs," but the command line still says Service Not Enabled. Give it 60 seconds. The cloud needs a moment to wake up its neurons.

Quota Quags

If you're using a brand-new trial account, you might hit a regional quota for AlloyDB instances. If us-central1 fails, try us-east1.

"Hidden" Service Agent

Sometimes the AlloyDB Service Agent isn't automatically granted the aiplatform.user role. If your SQL queries can't talk to Gemini later, this is usually the culprit.

3. Database setup

At the heart of our application lies AlloyDB for PostgreSQL. We leverage its powerful vector capabilities to generate and store embeddings for 50,000+ SCM records, and its integrated columnar engine for fast analytical scans. This enables near real-time vector analysis, allowing our agents to identify inventory anomalies or logistics risks across massive datasets in milliseconds.

In this lab we'll use AlloyDB as the database for the test data. It uses clusters to hold all of the resources, such as databases and logs. Each cluster has a primary instance that provides an access point to the data. Tables will hold the actual data.

Let's create an AlloyDB cluster, instance and table where the test dataset will be loaded.

  1. Click the button, or copy the link below into a browser where you are logged in to the Google Cloud Console.

Alternatively, you can go to the Cloud Shell Terminal in the project where you have redeemed the billing account, clone the GitHub repo, and navigate into the project using the commands below:

git clone https://github.com/AbiramiSukumaran/easy-alloydb-setup

cd easy-alloydb-setup
  2. Once this step is complete, the repo will be cloned to your Cloud Shell environment and you can run the command below (it's important to make sure you are in the project directory):
sh run.sh
  3. Now open the UI by clicking the link printed in the terminal (or the "preview on web" link).
  4. Enter your project ID, cluster name, and instance name to get started.
  5. Go grab a coffee while the logs scroll; you can read about how it's doing this behind the scenes here.

Gotchas & Troubleshooting

The "Patience" Problem

Database clusters are heavy infrastructure. If you refresh the page or kill the Cloud Shell session because it "looks stuck," you might end up with a "ghost" instance that is partially provisioned and impossible to delete without manual intervention.

Region Mismatch

If you enabled your APIs in us-central1 but try to provision the cluster in asia-south1, you might run into quota issues or Service Account permission delays. Stick to one region for the whole lab!

Zombie Clusters

If you previously used the same name for a cluster and didn't delete it, the script might say the cluster name already exists. Cluster names must be unique within a project.

Cloud Shell Timeout

If your coffee break takes 30 minutes, Cloud Shell might go to sleep and disconnect the sh run.sh process. Keep the tab active!

4. Schema Provisioning

Once you have your AlloyDB cluster and instance running, head over to the AlloyDB Studio SQL editor to enable the AI extensions and provision the schema.


You may need to wait for your instance to finish being created. Once it is, sign in to AlloyDB Studio using the credentials you set when creating the cluster. Use the following data for authenticating to PostgreSQL:

  • Username : "postgres"
  • Database : "postgres"
  • Password : "alloydb" (or whatever you set at the time of creation)

Once you have authenticated successfully into AlloyDB Studio, SQL commands are entered in the Editor. You can add multiple Editor windows using the plus to the right of the last window.


You'll enter commands for AlloyDB in editor windows, using the Run, Format, and Clear options as necessary.

Enable Extensions

For building this app, we will use the extensions pgvector and google_ml_integration. The pgvector extension allows you to store and search vector embeddings. The google_ml_integration extension provides functions you use to access Vertex AI prediction endpoints to get predictions in SQL. Enable these extensions by running the following DDLs:

CREATE EXTENSION IF NOT EXISTS google_ml_integration CASCADE;
CREATE EXTENSION IF NOT EXISTS vector;

Create a table

You can create a table using the DDL statement below in the AlloyDB Studio:

DROP TABLE IF EXISTS shipments;
DROP TABLE IF EXISTS products;

-- 1. Product Inventory Table

CREATE TABLE products (
id SERIAL PRIMARY KEY,
name VARCHAR(255) NOT NULL,
category VARCHAR(100),
stock_level INTEGER,
distribution_center VARCHAR(100),
region VARCHAR(50),
embedding vector(768),
last_updated TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

-- 2. Logistics & Shipments
CREATE TABLE shipments (
shipment_id SERIAL PRIMARY KEY,
product_id INTEGER REFERENCES products(id),
status VARCHAR(50), -- 'In Transit', 'Delayed', 'Delivered', 'Pending'
estimated_arrival TIMESTAMP,
route_efficiency_score DECIMAL(3, 2)
);

The embedding column stores the vector embeddings generated from the product's text fields.

Data Ingestion

Run the following SQL to bulk-insert 50,000 records into the products table:

-- We use a CROSS JOIN pattern with realistic naming segments to create meaningful variety
DO $$
DECLARE
brand_names TEXT[] := ARRAY['Artisan', 'Nature', 'Elite', 'Pure', 'Global', 'Eco', 'Velocity', 'Heritage', 'Aura', 'Summit'];
product_types TEXT[] := ARRAY['Ice Cream', 'Body Wash', 'Laundry Detergent', 'Shampoo', 'Mayonnaise', 'Deodorant', 'Tea', 'Soup', 'Face Cream', 'Soap'];
variants TEXT[] := ARRAY['Classic', 'Gold', 'Premium', 'Eco-Friendly', 'Organic', 'Night-Repair', 'Extra-Fresh', 'Zero-Sugar', 'Sensitive', 'Maximum-Strength'];
regions TEXT[] := ARRAY['EMEA', 'APAC', 'LATAM', 'NAMER'];
dcs TEXT[] := ARRAY['London-Hub', 'Mumbai-Central', 'Sao-Paulo-Logistics', 'Singapore-Port', 'Rotterdam-Gate', 'New-York-DC'];
BEGIN
INSERT INTO products (name, category, stock_level, distribution_center, region)
SELECT
b || ' ' || v || ' ' || t as name,
CASE
WHEN t IN ('Ice Cream', 'Mayonnaise', 'Tea', 'Soup') THEN 'Food & Refreshment'
WHEN t IN ('Body Wash', 'Shampoo', 'Deodorant', 'Face Cream', 'Soap') THEN 'Personal Care'
ELSE 'Home Care'
END as category,
floor(random() * 20000 + 100)::int as stock_level,
dcs[floor(random() * 6 + 1)] as distribution_center,
regions[floor(random() * 4 + 1)] as region
FROM
unnest(brand_names) b,
unnest(variants) v,
unnest(product_types) t,
generate_series(1, 50); -- 10 * 10 * 10 * 50 = 50,000 records
END $$;
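To sanity-check the generator's math: the CROSS JOIN multiplies the array lengths, and generate_series repeats each combination 50 times. A quick Python check mirroring the arrays above (illustrative only):

```python
from itertools import product

# Mirrors the arrays in the DO block above.
brands = ['Artisan', 'Nature', 'Elite', 'Pure', 'Global',
          'Eco', 'Velocity', 'Heritage', 'Aura', 'Summit']
variants = ['Classic', 'Gold', 'Premium', 'Eco-Friendly', 'Organic',
            'Night-Repair', 'Extra-Fresh', 'Zero-Sugar', 'Sensitive',
            'Maximum-Strength']
product_types = ['Ice Cream', 'Body Wash', 'Laundry Detergent', 'Shampoo',
                 'Mayonnaise', 'Deodorant', 'Tea', 'Soup', 'Face Cream', 'Soap']

# 10 brands x 10 variants x 10 types = 1,000 distinct names;
# generate_series(1, 50) then repeats each combination 50 times.
names = [f"{b} {v} {t}" for b, v, t in product(brands, variants, product_types)]
total = len(names) * 50
print(len(names), total)  # 1000 50000
```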

Let's insert a few demo-specific records to ensure predictable answers for executive-style questions:

-- These ensure you have predictable answers for specific "Executive" questions
INSERT INTO products (name, category, stock_level, distribution_center, region) VALUES
('Magnum Ultra Gold Limited Edition', 'Food & Refreshment', 45, 'Rotterdam-Gate', 'EMEA'),
('Dove Pro-Health Deep Moisture', 'Personal Care', 12000, 'Mumbai-Central', 'APAC'),
('Hellmanns Real Organic Mayonnaise', 'Food & Refreshment', 8000, 'London-Hub', 'EMEA');

Inserting shipments data

-- Shipments Generation (More shipments than products)
INSERT INTO shipments (product_id, status, estimated_arrival, route_efficiency_score)
SELECT
id,
CASE
WHEN random() > 0.8 THEN 'Delayed'
WHEN random() > 0.4 THEN 'In Transit'
ELSE 'Delivered'
END,
NOW() + (random() * 10 || ' days')::interval,
(random() * 0.5 + 0.5)::decimal(3,2)
FROM products
WHERE random() > 0.3; -- Create shipments for ~70% of products


-- Add duplicate shipments for some products to show complex logistics
INSERT INTO shipments (product_id, status, estimated_arrival, route_efficiency_score)
SELECT id, 'In Transit', NOW() + INTERVAL '12 days', 0.88
FROM products
LIMIT 5000;
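One subtlety in the status CASE above: PostgreSQL evaluates random() independently in each WHEN branch, so the resulting distribution is not a simple 20/40/40 split. A quick Python simulation of the same logic:

```python
import random

# Each WHEN clause draws its own random() value, so:
#   P(Delayed)    = 0.2
#   P(In Transit) = 0.8 * 0.6 = 0.48
#   P(Delivered)  = 0.8 * 0.4 = 0.32
def pick_status(rng: random.Random) -> str:
    if rng.random() > 0.8:
        return 'Delayed'
    if rng.random() > 0.4:
        return 'In Transit'
    return 'Delivered'

rng = random.Random(42)
counts = {'Delayed': 0, 'In Transit': 0, 'Delivered': 0}
for _ in range(100_000):
    counts[pick_status(rng)] += 1
print(counts)  # roughly 20% Delayed, 48% In Transit, 32% Delivered
```

For demo data this skew is harmless, but it's worth knowing when your "Delayed" counts look smaller than expected.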

Grant Permission

Run the following statement to grant EXECUTE on the embedding function:

GRANT EXECUTE ON FUNCTION embedding TO postgres;

Grant Vertex AI User ROLE to the AlloyDB service account

From the Google Cloud IAM console, grant the "Vertex AI User" role to the AlloyDB service account, which looks like this: service-<<PROJECT_NUMBER>>@gcp-sa-alloydb.iam.gserviceaccount.com (where PROJECT_NUMBER is your project number).

Alternatively you can run the below command from the Cloud Shell Terminal:

PROJECT_ID=$(gcloud config get-value project)


gcloud projects add-iam-policy-binding $PROJECT_ID \
  --member="serviceAccount:service-$(gcloud projects describe $PROJECT_ID --format="value(projectNumber)")@gcp-sa-alloydb.iam.gserviceaccount.com" \
  --role="roles/aiplatform.user"

Generate Embeddings

Next, let's generate vector embeddings for specific meaningful text fields:

WITH
 rows_to_update AS (
 SELECT
   id
 FROM
   products
 WHERE
   embedding IS NULL
 LIMIT
   5000 )
UPDATE
 products
SET
 embedding = ai.embedding('text-embedding-005', name || ' ' || category || ' ' || distribution_center || ' ' || region)::vector
FROM
 rows_to_update
WHERE
 products.id = rows_to_update.id
 AND embedding IS null;

The statement above updates at most 5,000 rows per run, so run it repeatedly until no row in the table has a NULL embedding.
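Since each run embeds at most 5,000 rows, the number of passes you need is ceil(total_rows / 5000). A small sketch of that manual loop (the UPDATE itself is simulated here; drain_in_batches is an illustrative name, not part of the lab's code):

```python
# Sketch of the batch-update loop you perform manually in AlloyDB Studio:
# each pass embeds at most `batch_size` rows, so you repeat until nothing
# is left. The row counts are simulated rather than executed against a DB.
def drain_in_batches(total_rows: int, batch_size: int = 5000) -> int:
    passes = 0
    remaining = total_rows
    while remaining > 0:
        updated = min(batch_size, remaining)  # rows the UPDATE would touch
        remaining -= updated
        passes += 1
    return passes

# 50,003 products (50,000 generated + 3 demo rows) need 11 passes:
print(drain_in_batches(50_003))  # 11
```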

Gotchas & Troubleshooting

The "Password Amnesia" Loop

If you used the "One Click" setup and can't remember your password, go to the Instance basic information page in the console and click "Edit" to reset the postgres password.

The "Extension Not Found" Error

If CREATE EXTENSION fails, it's often because the instance is still in a "Maintenance" or "Updating" state from the initial provisioning. Check that the instance creation step is complete and wait a few minutes if needed.

The IAM Propagation Gap

You ran the gcloud IAM command, but the SQL CALL still fails with a permission error. IAM changes can take a little time to propagate through the Google backbone. Take a breath.

Vector Dimension Mismatch

The products table's embedding column is declared as vector(768). If you later switch to a model with a different dimensionality (such as a 1536-dim model), your updates will fail with a dimension mismatch. Stick to text-embedding-005.

Project ID Typo

In the create_model call, if you leave the << >> placeholder brackets in place or mistype your project ID, the model registration will look successful but fail during the first actual query. Double-check your string!

5. Tools & Toolbox Setup

MCP Toolbox for Databases is an open source MCP server for databases. It enables you to develop tools more easily, quickly, and securely by handling complexities such as connection pooling and authentication. Toolbox helps you build Gen AI tools that let your agents access data in your database.

We use the Model Context Protocol (MCP) Toolbox for Databases as the "conductor." It acts as a standardized middleware between our agents and AlloyDB. By defining a tools.yaml configuration, the toolbox automatically exposes complex database operations as clean, executable tools like search_products_by_context or check_inventory_levels. This eliminates the need for manual connection pooling or boilerplate SQL within the agent logic.

Installing the Toolbox server

From your Cloud Shell Terminal, create a folder for saving your new tools yaml file and the toolbox binary:

mkdir scm-agent-toolbox

cd scm-agent-toolbox

From within that new folder, run the following set of commands:

# see releases page for other versions
export VERSION=0.27.0
curl -L -o toolbox https://storage.googleapis.com/genai-toolbox/v$VERSION/linux/amd64/toolbox
chmod +x toolbox

Next, create the tools.yaml file inside that new folder: open the Cloud Shell Editor and copy the contents of this repo file into tools.yaml.

sources:
    supply_chain_db:
        kind: "alloydb-postgres"
        project: "YOUR_PROJECT_ID"
        region: "us-central1"
        cluster: "YOUR_CLUSTER"
        instance: "YOUR_INSTANCE"
        database: "postgres"
        user: "postgres"
        password: "YOUR_PASSWORD"

tools:
  search_products_by_context:
    kind: postgres-sql
    source: supply_chain_db
    description: Find products in the inventory using natural language search and vector embeddings.
    parameters:
      - name: search_text
        type: string
        description: Description of the product or category the user is looking for.
    statement: |
     SELECT name, category, stock_level, distribution_center, region
      FROM products
      ORDER BY embedding <=> ai.embedding('text-embedding-005', $1)::vector
      LIMIT 5;

  check_inventory_levels:
    kind: postgres-sql
    source: supply_chain_db
    description: Get precise stock levels for a specific product name.
    parameters:
      - name: product_name
        type: string
        description: The exact or partial name of the product.
    statement: |
     SELECT name, stock_level, distribution_center, last_updated
      FROM products
      WHERE name ILIKE '%' || $1 || '%'
      ORDER BY stock_level DESC;

  track_shipment_status:
    kind: postgres-sql
    source: supply_chain_db
    description: Retrieve real-time logistics and shipping status for a specific region or product.
    parameters:
      - name: region
        type: string
        description: The geographical region to filter shipments (e.g., EMEA, APAC).
    statement: |
     SELECT p.name, s.status, s.estimated_arrival, s.route_efficiency_score
      FROM shipments s
      JOIN products p ON s.product_id = p.id
      WHERE p.region = $1
      ORDER BY s.estimated_arrival ASC;

  analyze_supply_chain_risk:
    kind: postgres-sql
    source: supply_chain_db
    description: Rerank and filter shipments based on risk profiles and efficiency scores using Google ML reranker.
    parameters:
      - name: risk_context
        type: string
        description: The business context for risk analysis (e.g., 'heatwave impact' or 'port strike').
    statement: |
     WITH initial_ranking AS (
      SELECT s.shipment_id, p.name, s.status, p.distribution_center,
      ROW_NUMBER() OVER () AS ref_number
      FROM shipments s
      JOIN products p ON s.product_id = p.id
      WHERE s.status != 'Delivered'
      LIMIT 10
      ),
      reranked_results AS (
      SELECT index, score FROM
      ai.rank(
      model_id => 'semantic-ranker-default-003',
      search_string => $1,
      documents => (SELECT ARRAY_AGG(name || ' at ' || distribution_center ORDER BY ref_number) FROM initial_ranking)
      )
      )
      SELECT i.name, i.status, i.distribution_center, r.score
      FROM initial_ranking i, reranked_results r
      WHERE i.ref_number = r.index
      ORDER BY r.score DESC;

toolsets:
   supply_chain_toolset:
     - search_products_by_context
     - check_inventory_levels
     - track_shipment_status
     - analyze_supply_chain_risk
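A note on the ORDER BY in search_products_by_context: `<=>` is pgvector's cosine-distance operator (0 for identical direction, 1 for orthogonal vectors), so the query returns the five closest embeddings. In pure Python, the equivalent computation looks like:

```python
import math

# The <=> operator in the statement above is pgvector's cosine distance:
# 1 - cos(theta). Lower values mean more similar embeddings.
def cosine_distance(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return 1.0 - dot / (norm_a * norm_b)

print(cosine_distance([1.0, 0.0], [1.0, 0.0]))  # 0.0 (same direction)
print(cosine_distance([1.0, 0.0], [0.0, 1.0]))  # 1.0 (orthogonal)
```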

Now test the tools.yaml file in the local server:

./toolbox --tools-file "tools.yaml"

Alternatively, you can test it in the UI:

./toolbox --ui

Perfect! Once you're sure this all works, go ahead and deploy it to Cloud Run as follows.

Cloud Run Deployment

  1. Set the PROJECT_ID environment variable:
export PROJECT_ID="my-project-id"
  2. Initialize the gcloud CLI:
gcloud init
gcloud config set project $PROJECT_ID
  3. You must have the following APIs enabled:
gcloud services enable run.googleapis.com \
                       cloudbuild.googleapis.com \
                       artifactregistry.googleapis.com \
                       iam.googleapis.com \
                       secretmanager.googleapis.com
  4. Create a backend service account if you don't already have one:
gcloud iam service-accounts create toolbox-identity
  5. Grant permission to access Secret Manager:
gcloud projects add-iam-policy-binding $PROJECT_ID \
    --member serviceAccount:toolbox-identity@$PROJECT_ID.iam.gserviceaccount.com \
    --role roles/secretmanager.secretAccessor
  6. Grant the additional roles that are specific to our AlloyDB source (roles/alloydb.client and roles/serviceusage.serviceUsageConsumer):
gcloud projects add-iam-policy-binding $PROJECT_ID \
    --member serviceAccount:toolbox-identity@$PROJECT_ID.iam.gserviceaccount.com \
    --role roles/alloydb.client


gcloud projects add-iam-policy-binding $PROJECT_ID \
    --member serviceAccount:toolbox-identity@$PROJECT_ID.iam.gserviceaccount.com \
    --role roles/serviceusage.serviceUsageConsumer
  7. Upload tools.yaml as a secret:
gcloud secrets create tools-scm-agent --data-file=tools.yaml
  8. If you already have the secret and want to update its version, execute the following instead:
gcloud secrets versions add tools-scm-agent --data-file=tools.yaml
  9. Set an environment variable to the container image that you want to use for Cloud Run:
export IMAGE=us-central1-docker.pkg.dev/database-toolbox/toolbox/toolbox:latest
  10. Deploy Toolbox to Cloud Run using the following command:

If you have enabled public access in your AlloyDB instance (not recommended), follow the command below for deployment to Cloud Run:

gcloud run deploy toolbox-scm-agent \
    --image $IMAGE \
    --service-account toolbox-identity \
    --region us-central1 \
    --set-secrets "/app/tools.yaml=tools-scm-agent:latest" \
    --args="--tools-file=/app/tools.yaml","--address=0.0.0.0","--port=8080" \
    --allow-unauthenticated

If you are using a VPC network, use the command below. Replace <<YOUR_NETWORK_NAME>> and <<YOUR_SUBNET_NAME>> with your VPC details before running it (an inline # comment inside a backslash-continued command would break it):

gcloud run deploy toolbox-scm-agent \
    --image $IMAGE \
    --service-account toolbox-identity \
    --region us-central1 \
    --set-secrets "/app/tools.yaml=tools-scm-agent:latest" \
    --args="--tools-file=/app/tools.yaml","--address=0.0.0.0","--port=8080" \
    --network <<YOUR_NETWORK_NAME>> \
    --subnet <<YOUR_SUBNET_NAME>> \
    --allow-unauthenticated

6. Agent Setup

Using the Agent Development Kit (ADK), we've moved away from monolithic prompts toward a specialized, multi-agent architecture:

  • InventorySpecialist: Focused on product stock and warehouse metrics.
  • LogisticsManager: Expert in global shipping routes and risk analysis.
  • GlobalOrchestrator: The "brain" that uses reasoning to delegate tasks and synthesize findings.

Clone this repo into your project and let's walk through it.

To clone this, from your Cloud Shell Terminal (in the root directory or from wherever you want to create this project), run the following command:

git clone https://github.com/AbiramiSukumaran/scm-memory-agent
  1. This clones the project into your environment; you can verify it in the Cloud Shell Editor.


  2. Make sure to update the .env file with the values for your project and instance.

Code Walkthrough

A quick look at the Orchestrator Agent

  1. Go to app.py; you should see the following snippet:
orchestrator = adk.Agent(
    name="GlobalOrchestrator",
    model="gemini-2.5-flash",
    description="Global Supply Chain Orchestrator root agent.",
    instruction="""
    You are the Global Supply Chain Brain. You are responsible for products, inventory and logistics.
    You also have access to the memory tool, remember to include all the information that the tool can provide you with about the user before you respond.
    1. Understand intent and delegate to specialists. As the Global Orchestrator, you have access to the full conversation history with the user.
    When you transfer a query to a specialist agent, sub agent or tool, share the important facts and information from your memory to them so they can operate with the full context. 
    2. Ensure the final response is professional and uses Markdown tables for data.
    3. If a specialist provides a long list, ensure only the top 10 items are shown initially.
    4. Conclude with a brief, high-level executive summary of what the data implies.
    """,
    tools=[adk.tools.preload_memory_tool.PreloadMemoryTool()],
    sub_agents=[inventory_agent, logistics_agent],
    
    #after_agent_callback=auto_save_session_to_memory_callback,
)

This snippet defines the root (orchestrator) agent, which receives the user's request and routes it to the appropriate sub-agent or tool based on the task.

  2. Let's look at the inventory agent:
inventory_agent = adk.Agent(
    name="InventorySpecialist",
    model="gemini-2.5-flash",
    description="Specialist in product stock and warehouse data.",
    instruction="""
    Analyze inventory levels.
    1. Use 'search_products_by_context' or 'check_inventory_levels'.
    2. ALWAYS format results as a clean Markdown table.
    3. If there are many results, display only the TOP 10 most relevant ones.
    4. At the end, state: 'There are additional records available. Would you like to see more?'
    """,
    tools=tools
)

This sub-agent specializes in inventory activities such as contextual product search and checking inventory levels.

  3. Then there is the logistics sub-agent:
logistics_agent = adk.Agent(
    name="LogisticsManager",
    model="gemini-2.5-flash",
    description="Expert in global shipping routes and logistics tracking.",
    instruction="""
    Check shipment statuses.
    1. Use 'track_shipment_status' or 'analyze_supply_chain_risk'.
    2. ALWAYS format results as a clean Markdown table.
    3. Limit initial output to the top 10 shipments.
    4. Ask if the user needs the full manifest if more results exist.
    """,
    tools=tools
)

This particular sub-agent is specialized in logistics activities like tracking shipments and analysing risks in the supply chain.

  4. All three of the agents discussed so far use tools, and those tools are served by the Toolbox server we deployed in the previous section. Refer to the snippet below:
from toolbox_core import ToolboxSyncClient

TOOLBOX_SERVER = os.environ["TOOLBOX_SERVER"]
TOOLBOX_TOOLSET = os.environ["TOOLBOX_TOOLSET"]

# --- ADK TOOLBOX CONFIGURATION ---
toolbox = ToolboxSyncClient(TOOLBOX_SERVER)
tools = toolbox.load_toolset(TOOLBOX_TOOLSET)

This snippet connects to the deployed Toolbox server and loads the toolset that is shared across the agents.

7. Agent Engine

In the initial run, create the Agent Engine:

import vertexai

GOOGLE_CLOUD_PROJECT = os.environ["GOOGLE_CLOUD_PROJECT"]
GOOGLE_CLOUD_LOCATION = os.environ["GOOGLE_CLOUD_LOCATION"]

client = vertexai.Client(
  project=GOOGLE_CLOUD_PROJECT,
  location=GOOGLE_CLOUD_LOCATION
)

agent_engine = client.agent_engines.create()
For subsequent runs, update the Agent Engine with the Memory Bank configuration:
agent_engine = client.agent_engines.update(
    name=APP_NAME,
    config={
        "context_spec": {
            "memory_bank_config": {
                "generation_config": {
                    "model": f"projects/{PROJECT_ID}/locations/{GOOGLE_CLOUD_LOCATION}/publishers/google/models/gemini-2.5-flash"
                }
            }
        }
    })

8. Context, Run & Memory

Context management is split into two distinct layers to ensure the agent feels like a continuous partner rather than a stateless bot:

Short-Term Memory (Sessions): Managed via VertexAiSessionService, this tracks the immediate event history (user messages, tool responses) within a single interaction.

Long-Term Memory (Memory Bank): Powered by Vertex AI Memory Bank via ADK's VertexAiMemoryBankService. This layer extracts "meaningful" information, like a user's preference for specific shipping carriers or recurring warehouse delays, and persists it across sessions.

Initialize session for session memory within the scope of the conversation

This is the part of the snippet that creates the session for the current app for the current user.

from google.adk.sessions import VertexAiSessionService

...

session_service = VertexAiSessionService(
    project=PROJECT_ID,
    location=GOOGLE_CLOUD_LOCATION,
)

...

# Initialize the session *outside* of the route handler to avoid repeated creation
session = None
session_lock = threading.Lock()

async def initialize_session():
    global session
    try:
        session = await session_service.create_session(app_name=APP_NAME, user_id=USER_ID)
        print(f"Session {session.id} created successfully.")  # Add a log
    except Exception as e:
        print(f"Error creating session: {e}")
        session = None  # Ensure session is None in case of error

# Create the session on app startup
asyncio.run(initialize_session())

Initialize Vertex AI Memory Bank for long term memory

This is the part of the snippet that instantiates the Vertex AI Memory Bank Service object for the agent engine.

from google.adk.memory import InMemoryMemoryService
from google.adk.memory import VertexAiMemoryBankService

...

try:
    memory_bank_service = adk.memory.VertexAiMemoryBankService(
        agent_engine_id=AGENT_ENGINE_ID,
        project=PROJECT_ID,
        location=GOOGLE_CLOUD_LOCATION,
    )
    #in_memory_service = InMemoryMemoryService()
    print("Memory Bank Service initialized successfully.")
except Exception as e:
    print(f"Error initializing Memory Bank Service: {e}")
    memory_bank_service = None

runner = adk.Runner(
    agent=orchestrator,
    app_name=APP_NAME,
    session_service=session_service,
    memory_service=memory_bank_service,
)

...

What is configured?

In this part of the snippet we configure the Vertex AI Memory Bank Service for long-term memory. It contextually stores the session for the specific app and user as a memory in the Vertex AI Memory Bank.

What is run as part of the agent execution?

async def run_and_collect():
    final_text = ""
    try:
        async for event in runner.run_async(
            new_message=content,
            user_id=user_id,
            session_id=session_id
        ):
            if hasattr(event, 'author') and event.author:
                if not any(log['agent'] == event.author for log in execution_logs):
                    execution_logs.append({
                        "agent": event.author,
                        "action": "Analyzing data requirements...",
                        "type": "orchestration_event"
                    })
            if hasattr(event, 'text') and event.text:
                final_text = event.text
            elif hasattr(event, 'content') and hasattr(event.content, 'parts'):
                for part in event.content.parts:
                    if hasattr(part, 'text') and part.text:
                        final_text = part.text
    except Exception as e:
        print(f"Error during runner.run_async: {e}")
        raise  # Re-raise the exception to signal failure
    finally:
        gc.collect()
    # Return outside the try/finally: a `return` inside `finally` would
    # silently swallow the re-raised exception.
    return final_text

The user's input is packaged into the new_message object along with the user ID and session ID in scope. The agent then takes over, and its response is collected and returned.

What is stored in the long term memory?

The session detail in the scope of the app and the user is extracted in the session variable.

This session is then added as the memory for the current user for the current app of the Vertex AI Memory Bank object using the "add_session_to_memory" method.

session = asyncio.run(session_service.get_session(
    app_name=APP_NAME, user_id=USER_ID, session_id=session.id))

if memory_bank_service and session:  # Check memory service AND session
    try:
        asyncio.run(memory_bank_service.add_session_to_memory(session))
        print(f"Successfully added session {session.id} to memory.")
    except Exception as e:
        print(f"Error adding session to memory: {e}")
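To make the scoping concrete, here is a toy, self-contained model of a memory store keyed by an (app_name, user_id) pair. This is purely illustrative of the scoping idea and is not the Vertex AI Memory Bank API:

```python
from collections import defaultdict

class ToyMemoryBank:
    """Illustrative stand-in for a memory service: entries are keyed by an
    (app_name, user_id) scope, mirroring how sessions are stored and later
    retrieved for the same app and user."""

    def __init__(self):
        self._store = defaultdict(list)

    def add_session_to_memory(self, app_name: str, user_id: str, events: list) -> None:
        # Append the session's events under this app/user scope.
        self._store[(app_name, user_id)].extend(events)

    def retrieve(self, app_name: str, user_id: str) -> list:
        # Only memories stored under the exact same scope come back.
        return list(self._store[(app_name, user_id)])

bank = ToyMemoryBank()
bank.add_session_to_memory("supply-chain-app", "user-1",
                           ["Prefers air freight for urgent restocks"])
print(bank.retrieve("supply-chain-app", "user-1"))
print(bank.retrieve("supply-chain-app", "user-2"))  # different scope -> []
```

Because retrieval uses the same (app, user) key as storage, memories never leak across users or applications.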

Memory Retrieval

We need to retrieve the stored long-term memories using the app name and the user ID as the scope (since that is the scope under which the memories were stored), so they can be passed as context to the orchestrator and the other agents as applicable.

    results = client.agent_engines.memories.retrieve(
        name=APP_NAME,
        scope={"app_name": APP_NAME, "user_id": USER_ID},
    )
    # retrieve() returns a pager; materialize it once with list()
    # to fetch all pages' memories.
    memories = list(results)
    print(memories)
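Pagers of this kind are typically one-shot iterators, so materialize the result exactly once; calling list() on the same pager a second time yields nothing. A stdlib demonstration of the behavior, using a plain generator as a stand-in for the pager:

```python
def fake_pager():
    # Stand-in for an API pager: a one-shot iterator over memory records.
    yield from ["memory-1", "memory-2"]

results = fake_pager()
memories = list(results)   # materialize exactly once
print(memories)            # ['memory-1', 'memory-2']
print(list(results))       # [] -- the iterator is already exhausted
```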

How is the retrieved memory loaded as part of the context?

We use the following attribute in the Orchestrator agent's definition, which lets the root agent preload context from Memory Bank. This is in addition to the tools the sub-agents access from the Toolbox server.

tools=[adk.tools.preload_memory_tool.PreloadMemoryTool()],
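For context, a root-agent definition that wires this tool in might look like the sketch below. The agent name, model, and sub-agent variables are illustrative, not the codelab's exact code:

```python
from google import adk

root_agent = adk.Agent(
    name="GlobalOrchestrator",   # illustrative name
    model="gemini-2.0-flash",    # illustrative model
    instruction="Route supply chain questions to the right specialist.",
    sub_agents=[inventory_specialist, logistics_manager],  # defined elsewhere
    tools=[adk.tools.preload_memory_tool.PreloadMemoryTool()],
)
```

With PreloadMemoryTool attached, relevant memories for the current user are injected into the model's context automatically, before the agent reasons over the request.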

Callback Context

In an enterprise supply chain, you cannot have a "black box." We use ADK's CallbackContext to create a Narrative Engine. By hooking into the agent's execution, we capture every thought process and tool call, streaming them to a UI sidebar.

  • Trace Event: "GlobalOrchestrator is analyzing data requirements..."
  • Trace Event: "Delegating to InventorySpecialist for stock levels..."
  • Trace Event: "Retrieving historical supplier delay patterns from Memory Bank..."

This audit trail is invaluable for debugging and ensures that human operators can trust the agent's autonomous decisions.

from google.adk.agents.callback_context import CallbackContext

...

# --- ADK CALLBACKS (Narrative Engine) ---
execution_logs = []

async def trace_callback(context: CallbackContext):
    """
    Captures agent and tool invocation flow for the UI narrative.
    """
    agent_name = context.agent_name
    event = {
        "agent": agent_name,
        "action": "Processing request steps...",
        "type": "orchestration_event"
    }
    execution_logs.append(event)
    return None

...
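For the callback to fire, it must be registered on the agent. In ADK this is done through the callback parameters of the Agent constructor; a sketch, with the agent name and model being illustrative:

```python
root_agent = adk.Agent(
    name="GlobalOrchestrator",
    model="gemini-2.0-flash",
    instruction="...",
    before_agent_callback=trace_callback,  # runs before each agent invocation
)
```

Each invocation then appends a trace event to execution_logs, which the UI sidebar renders as the narrative.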

That's it! We have successfully cloned the project and walked through the details of the agents, memory, and context.

You can test it by navigating to the project folder of the cloned repo and executing the following commands:

>> pip install -r requirements.txt

>> python app.py

This should start your agent locally and you should be able to test it.

9. Let's deploy it to Cloud Run

  1. Deploy the app to Cloud Run from the Cloud Shell terminal where the project is cloned, making sure you are inside the project's root folder.

Run this in your Cloud Shell terminal:

gcloud run deploy supply-chain-agent --source . \
  --platform managed \
  --region us-central1 \
  --allow-unauthenticated \
  --set-env-vars GOOGLE_CLOUD_PROJECT=<<YOUR_PROJECT>>,GOOGLE_CLOUD_LOCATION=us-central1,GOOGLE_GENAI_USE_VERTEXAI=TRUE,REASONING_ENGINE_APP_NAME=<<YOUR_APP_ENGINE_URL>>,TOOLBOX_SERVER=<<YOUR_TOOLBOX_SERVER>>,TOOLBOX_TOOLSET=supply_chain_toolset,AGENT_ENGINE_ID=<<YOUR_AGENT_ENGINE_ID>>

Replace the placeholders <<YOUR_PROJECT>>, <<YOUR_APP_ENGINE_URL>>, <<YOUR_TOOLBOX_SERVER>>, and <<YOUR_AGENT_ENGINE_ID>> with your own values.

Once the command finishes, it will print a Service URL. Copy it.

  2. Grant the AlloyDB Client role to the Cloud Run service account. This allows your serverless application to securely tunnel into the database.

Run this in your Cloud Shell terminal:

# 1. Get your Project ID and Project Number
PROJECT_ID=$(gcloud config get-value project)
PROJECT_NUMBER=$(gcloud projects describe $PROJECT_ID --format="value(projectNumber)")

# 2. Grant the AlloyDB Client role
gcloud projects add-iam-policy-binding $PROJECT_ID \
--member="serviceAccount:$PROJECT_NUMBER-compute@developer.gserviceaccount.com" \
--role="roles/alloydb.client"

Now use the service URL (Cloud Run endpoint you copied earlier) and test the app.

Note: If the service fails and the error cites memory as the cause, try increasing the allocated memory limit to 1 GiB.
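Assuming the service name and region from the deploy command above, the memory limit can be raised with:

```shell
gcloud run services update supply-chain-agent \
  --region us-central1 \
  --memory 1Gi
```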

3e4d36ed99b39325.png

d6b337f79a1f1d82.png

5e781a193a4aa903.png

10. Clean up

Once this lab is done, do not forget to delete the AlloyDB cluster and its instance.

Deleting the cluster also cleans up the instance(s) it contains.
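Assuming your cluster lives in us-central1, a command along these lines removes the cluster and its instances in one step (replace the placeholder with your cluster name):

```shell
gcloud alloydb clusters delete <<YOUR_CLUSTER>> \
  --region=us-central1 \
  --force   # also deletes the cluster's instances
```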

11. Congratulations

By combining the speed of AlloyDB, the orchestration efficiency of MCP Toolbox, and the "institutional memory" of Vertex AI Memory Bank, we've built a supply chain system that evolves. It doesn't just answer questions; it remembers that your warehouse in Singapore always struggles with monsoon-related delays and proactively suggests rerouting shipments before you even ask.