Deploy LangChain Agent on Cloud Run

1. Overview

An agent is an autonomous program that interacts with an AI model to accomplish a goal, using the tools and context available to it to make its own decisions along the way.

Agent frameworks such as Agent Development Kit (ADK), LangChain, and smolagents are used to create agents. Applications built with these frameworks can be deployed to Cloud Run and made available to users as serverless applications.

In this codelab, we will build an agent using LangChain and deploy it to Cloud Run.

What you'll build

Ready to move from a prototype prompt to building an agent? We'll create an agent using LangChain that returns information about a historical figure in a structured format. As part of this lab, you will:

  1. Build a simple agent with LangChain that generates information about a historical figure in a structured format
  2. Run the agent locally and ensure it is working as expected
  3. Deploy the agent to Cloud Run and invoke it using the Cloud Run URL

Requirements

  • A browser, such as Chrome or Firefox
  • A Google Cloud project with billing enabled.

2. Before you begin

Create a project

  1. In the Google Cloud Console, on the project selector page, select or create a Google Cloud project.
  2. Make sure that billing is enabled for your Cloud project. Learn how to check if billing is enabled on a project.
  3. Activate Cloud Shell by clicking this link. You can toggle between Cloud Shell Terminal (for running cloud commands) and Editor (for building projects) by clicking on the corresponding button from Cloud Shell.
  4. Once connected to Cloud Shell, check that you're already authenticated and that the project is set to your project ID using the following command:
gcloud auth list
  5. Run the following command in Cloud Shell to confirm that the gcloud command knows about your project:
gcloud config list project
  6. If your project is not set, use the following command to set it:
gcloud config set project <YOUR_PROJECT_ID>
  7. Make sure you have Python 3.13+ installed.

Refer to the documentation for other gcloud commands and usage.
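Building and deploying the agent later in this codelab uses Cloud Build and Cloud Run. If these APIs are not yet enabled on your project, you can enable them up front (gcloud run deploy will otherwise prompt you to enable them at deploy time):

```shell
# Enable the services used later to build and deploy the agent
gcloud services enable run.googleapis.com cloudbuild.googleapis.com
```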

3. Create the LangChain Agent

Project Structure

In your Cloud Shell, create a folder named langchain-app, and add the following files within it:

langchain-app/
├── main.py
├── requirements.txt

Application Code

Add the following dependencies to requirements.txt:

fastapi
uvicorn
langchain
langchain-google-genai
python-dotenv
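Before running the app locally in a later step, install these dependencies. A virtual environment keeps them isolated from the Cloud Shell system Python:

```shell
# Create and activate a virtual environment, then install the dependencies
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
```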

In main.py, we will write the agent code. It uses a Gemini model to retrieve information about a historical figure, then formats the result into the requested structure.

This implementation uses the modern LCEL (LangChain Expression Language) syntax.

import os
import uvicorn
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel
from langchain_google_genai import ChatGoogleGenerativeAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import JsonOutputParser

# Initialize FastAPI
app = FastAPI(title="LangChain App for Historical Figures")

# 1. Setup Gemini Model
# We expect GOOGLE_API_KEY to be set in the environment variables
llm = ChatGoogleGenerativeAI(
    model="gemini-2.5-flash",
    temperature=0.7
)

# 2. Define the Prompt
prompt = ChatPromptTemplate.from_template("You are an expert Historian. For the historical personality {name}, you are able to accurately tell their birth date and birth country. Return the output in JSON format containing name, birthDate, birthCountry. If you are unable to retrieve birthDate or birthCountry, set the unknown values to null. Ensure the response is a valid JSON object only.")
output_parser = JsonOutputParser()

# Chain: Prompt -> Model -> Json Output Parser
chain = prompt | llm | output_parser

# 3. Define Request Model
class QueryRequest(BaseModel):
    name: str

# 4. Define Endpoint
@app.post("/chat")
async def chat(request: QueryRequest):
    try:
        response = await chain.ainvoke({"name": request.name})
        return response
    except Exception as e:
        raise HTTPException(status_code=500, detail=str(e))

@app.get("/")
def health_check():
    return {"status": "ok", "service": "LangChain-Gemini-FastAPI"}

if __name__ == "__main__":
    # Allows running locally with `python main.py`; Cloud Run sets PORT.
    uvicorn.run(app, host="0.0.0.0", port=int(os.environ.get("PORT", 8080)))
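The JsonOutputParser at the end of the chain is what turns the model's text reply into a Python dict, even when the model wraps the JSON in a markdown code fence. A simplified, stdlib-only stand-in (not the actual LangChain implementation) illustrates the idea:

```python
import json
import re

def parse_json_reply(text: str) -> dict:
    """Extract and parse the first JSON object in a model reply,
    tolerating an optional markdown code fence around it."""
    match = re.search(r"\{.*\}", text, re.DOTALL)
    if match is None:
        raise ValueError("no JSON object found in model output")
    return json.loads(match.group(0))

# Example reply as a model might produce it, wrapped in a code fence
reply = '```json\n{"name": "Ada Lovelace", "birthDate": "1815-12-10", "birthCountry": "United Kingdom"}\n```'
print(parse_json_reply(reply)["birthCountry"])  # United Kingdom
```

This is also why the prompt insists on "a valid JSON object only": the less decoration the model adds around the JSON, the less fragile the parsing step is.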

4. Test the Agent

Before deploying, test the agent locally.

  1. Create a Gemini API key in Google AI Studio.
  2. Export the Gemini API key:
export GOOGLE_API_KEY="AIzaSy..."
  3. Run the server:
uvicorn main:app --port 8080 --host 0.0.0.0
  4. Test the agent using the following curl command:
curl -X POST http://localhost:8080/chat \
     -H "Content-Type: application/json" \
     -d '{"name": "Abraham Lincoln"}'

Check that you get a structured JSON output containing name, birthDate, and birthCountry.
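If you prefer an automated check over eyeballing the output, a small stdlib-only script can validate the shape of the response. The JSON string below is a hypothetical sample of what the endpoint returns:

```python
import json

# Hypothetical response body captured from POST /chat
raw = '{"name": "Abraham Lincoln", "birthDate": "1809-02-12", "birthCountry": "United States"}'
data = json.loads(raw)

# The prompt instructs the model to always return exactly these keys,
# using null (None in Python) for anything it cannot determine.
expected_keys = {"name", "birthDate", "birthCountry"}
assert set(data) == expected_keys, f"unexpected keys: {set(data)}"
print("response shape OK:", data["name"])
```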

5. Deploy to Cloud Run

We will use the gcloud run deploy command to deploy the agent application to Cloud Run. The following command builds the container using Cloud Build and deploys it to Cloud Run in one step.

Replace <YOUR_API_KEY> with your actual Gemini API key.

gcloud run deploy gemini-fastapi-service \
  --source . \
  --platform managed \
  --region us-central1 \
  --allow-unauthenticated \
  --set-env-vars GOOGLE_API_KEY=<YOUR_API_KEY>

Once deployment completes, the terminal displays your service URL, ready to use.

6. Test the Deployment

Use the Cloud Run endpoint and run the same curl command against the /chat route to confirm you get the expected result.

curl -X POST <CLOUD_RUN_ENDPOINT>/chat \
     -H "Content-Type: application/json" \
     -d '{"name": "Abraham Lincoln"}'

You should get the same result as the locally running application returned.

7. Clean up

To avoid incurring charges to your Google Cloud account for the resources used in this codelab, follow these steps:

  1. In the Google Cloud console, go to the Manage resources page.
  2. In the project list, select the project that you want to delete, and then click Delete.
  3. In the dialog, type the project ID, and then click Shut down to delete the project.

8. Congratulations

Congratulations! You have successfully created and interacted with your LangChain agent deployed on Cloud Run!