1. Overview
This codelab guides you through using the Prompt Encryption SDK to securely communicate with a model served in a Trusted Execution Environment (TEE) on Google Cloud.
What you'll learn
- Establishing a cryptographically verified and encrypted channel between a client and a remote inference server.
- Verifying the server's identity (software hash, hardware model, launch configuration) using Attested TLS.
- Ensuring data sovereignty by keeping prompts encrypted until they reach the verified enclave.
- Using the Prompt Encryption SDK to interact with vLLM running on Confidential Space.
What you'll need
- A Google Cloud Project with billing enabled.
- Google Cloud SDK (gcloud) installed and authenticated.
- Python 3.10+ environment.
- A Hugging Face Token for downloading Gemma models.
- Familiarity with VPC firewall rules and sufficient External IP address quota in your project.
- Python C headers for building the SDK locally: compiling the _ekm.c C extension fails if they are missing. Install python3-dev to resolve this (e.g., sudo apt-get install python3-dev on Debian/Ubuntu).
2. Setting Up Cloud Resources
Before starting, ensure you have enabled the required APIs and configured your environment.
1. Enable Required APIs:
gcloud services enable compute.googleapis.com \
confidentialcomputing.googleapis.com \
logging.googleapis.com \
artifactregistry.googleapis.com \
cloudbuild.googleapis.com
2. Configure Docker:
gcloud auth configure-docker gcr.io
3. Set Hugging Face Token:
export HF_TOKEN="your_token"
4. Clone the Repository:
git clone https://github.com/google/prompt-encryption-sdk && cd prompt-encryption-sdk
3. Scenario
We will use:
- Client: Your local Python environment or a standard VM.
- Server: A vLLM instance serving an open-source model (e.g., Gemma) inside a Confidential Space (TDX/SEV-SNP).
- SDK: The prompt_encryption_sdk Python library.
4. Step 0: Server Setup
Before the client can verify anything, we need a server running in Confidential Space. A provided bash script handles provisioning.
./codelabs/setup.sh --project-id <PROJECT_ID>
The setup.sh script performs the following:
- Enables required APIs (Compute, Confidential Computing, Logging, Artifact Registry, Cloud Build).
- Builds and pushes the Docker image (wrapping vLLM with Attested TLS middleware).
- Provisions a Service Account with necessary permissions.
- Creates the Confidential VM (A3 instance with H100 GPU and TDX enabled).
- Configures Networking and Load Balancing (Passthrough Network Load Balancer).
- Saves outputs (image hash and load balancer IP) to local files.
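The saved outputs become the client's trust anchors: the image hash is what the client later pins in its attestation policy, and the load balancer IP is where it connects. As a minimal sketch of the pinning idea (the field names and helper below are illustrative assumptions, not the SDK's actual schema or API):

```python
import hmac

# Hypothetical shape of what setup.sh persists locally. The keys here are
# illustrative only; consult the files the script actually writes.
deployment = {
    "image_hash": "0" * 64,      # placeholder measurement for the example
    "lb_ip": "203.0.113.10",     # documentation-range IP, not a real endpoint
}

def check_measurement(pinned: str, reported: str) -> bool:
    # Fail closed: accept only an exact match between the pinned image hash
    # and the measurement reported in the TEE quote. compare_digest avoids
    # leaking how many leading characters matched.
    return hmac.compare_digest(pinned, reported)

assert check_measurement(deployment["image_hash"], "0" * 64)
assert not check_measurement(deployment["image_hash"], "f" * 64)
```

A reported measurement that differs in even one character must be rejected outright; there is no notion of a "close enough" hash.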
5. Step 1: Run the Attested Client
Now that the server is running securely, create a Python environment, install the SDK, and establish an attested connection:
python3 -m venv venv
source venv/bin/activate
pip install -r examples/requirements.txt
pip install -e .
./codelabs/run_client.sh <PROJECT_ID>
The run_client.sh script reads deployment details and executes a Python request using the ConfidentialSDKClient. If attestation fails, an AttestationError is raised and the prompt is never sent.
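The fail-closed guarantee is the key property here. The toy class below models it with plain Python (it is a simplified stand-in, not the real ConfidentialSDKClient implementation; only the AttestationError name comes from the SDK):

```python
class AttestationError(Exception):
    """Raised when the server's TEE quote fails verification."""

class AttestedClient:
    # Simplified stand-in for ConfidentialSDKClient. The constructor argument
    # and post() signature are illustrative assumptions.
    def __init__(self, expected_image_hash: str):
        self.expected_image_hash = expected_image_hash
        self.sent = []  # records what actually left the "client"

    def post(self, prompt: str, quoted_image_hash: str) -> None:
        # Attestation runs before any prompt bytes are transmitted.
        if quoted_image_hash != self.expected_image_hash:
            raise AttestationError("quote does not match pinned image hash")
        self.sent.append(prompt)

client = AttestedClient(expected_image_hash="abc123")
try:
    client.post("secret prompt", quoted_image_hash="evil456")
except AttestationError:
    pass
assert client.sent == []  # the prompt was never sent to the bad server
```

The point of the sketch: a failed attestation is not an error you handle after the fact; it prevents the sensitive payload from ever being transmitted.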
6. Step 2: Cleanup
To avoid charges, clean up resources once finished.
./codelabs/cleanup.sh --project-id <PROJECT_ID>
7. Under the Hood
What happens during http.post?
- TCP/TLS: Standard connection established.
- Handshake Interception: SDK pauses before sending the body.
- AttestConnection RPC: SDK sends a nonce to the server.
- Quote Generation: Server requests a TEE hardware quote.
- Validation: SDK verifies the quote signature and policy.
- Bind: The SDK verifies that the channel's "Exported Keying Material" matches the session bound in the quote.
- Data Transmission: Body is sent only if all checks pass.
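The binding step can be sketched with standard-library crypto. In reality the quote is signed by a TEE hardware key and the Exported Keying Material (EKM) is derived from the TLS session (RFC 5705); in this toy model an HMAC key stands in for the hardware key, and random bytes stand in for the EKM:

```python
import hashlib
import hmac
import os

HW_KEY = os.urandom(32)  # stand-in for the TEE's hardware signing key
ekm = os.urandom(32)     # both endpoints derive this value from the TLS channel

def server_quote(nonce: bytes, server_ekm: bytes) -> bytes:
    # The server folds the client's nonce and its view of the channel's EKM
    # into the quote's report data, then "signs" it.
    report_data = hashlib.sha256(nonce + server_ekm).digest()
    return hmac.new(HW_KEY, report_data, hashlib.sha256).digest()

def client_verify(quote: bytes, nonce: bytes, client_ekm: bytes) -> bool:
    # The client recomputes the binding over ITS view of the channel. A
    # man-in-the-middle terminates TLS itself, so its EKM differs and the
    # binding check fails even if the quote is otherwise genuine.
    expected = hashlib.sha256(nonce + client_ekm).digest()
    return hmac.compare_digest(quote, hmac.new(HW_KEY, expected, hashlib.sha256).digest())

nonce = os.urandom(16)
assert client_verify(server_quote(nonce, ekm), nonce, ekm)               # same channel: pass
assert not client_verify(server_quote(nonce, ekm), nonce, os.urandom(32))  # different channel: fail
```

This is why the nonce and the EKM binding matter: the nonce prevents replaying an old quote, and the EKM ties the quote to this specific TLS channel rather than to whoever happens to hold a valid quote.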
8. Troubleshooting
- Attestation Failed: Verify that the image_hash in the attestation policy exactly matches the deployed container image.
- Connection Refused: Ensure the server is reachable and port 8000 is open.
- Timeout: TEE quote generation can take time; ensure timeouts are sufficient.
9. Congratulations
You've successfully completed the Prompt Encryption SDK codelab! You've learned how to establish a cryptographically verified and encrypted channel between your client and a TEE-based inference server.
What's next?
- Explore advanced AttestationPolicy configurations.
- Integrate the SDK with your existing production applications.
- Learn more about Confidential Space and TEE hardware models.