1. Overview
This codelab guides you through using the Prompt Encryption SDK to securely communicate with a model served in a Trusted Execution Environment (TEE) on Google Cloud.
What you'll learn
- Establishing a cryptographically verified and encrypted channel between a client and a remote inference server.
- Verifying the server's identity (software hash, hardware model, launch configuration) using Attested TLS.
- Ensuring data sovereignty by keeping prompts encrypted until they reach the verified enclave.
- Using the Prompt Encryption SDK to interact with vLLM running on Confidential Space.
What you'll need
- A Google Cloud Project with billing enabled.
- Google Cloud SDK (gcloud) installed and authenticated.
- Python 3.10+ environment.
- A Hugging Face Token for downloading Gemma models.
- Familiarity with VPC firewall rules and sufficient External IP address quota in your project.
- Python C headers for building the SDK locally: compiling the _ekm.c C extension fails if they are missing. Install python3-dev to resolve this (e.g., sudo apt-get install python3-dev on Debian/Ubuntu).
2. Setting Up Cloud Resources
Before starting, ensure you have enabled the required APIs and configured your environment.
1. Enable Required APIs:
gcloud services enable compute.googleapis.com \
confidentialcomputing.googleapis.com \
logging.googleapis.com \
artifactregistry.googleapis.com \
cloudbuild.googleapis.com
2. Configure Docker:
gcloud auth configure-docker gcr.io
3. Set Hugging Face Token:
export HF_TOKEN="your_token"
4. Clone the Repository:
git clone https://github.com/google/prompt-encryption-sdk && cd prompt-encryption-sdk
3. Scenario
We will use:
- Client: Your local Python environment or a standard VM.
- Server: A vLLM instance serving an open-source model (e.g., Gemma) inside a Confidential Space (TDX/SEV-SNP).
- SDK: The prompt_encryption_sdk Python library.
4. Step 0: Server Setup
Before the client can verify anything, we need a server running in Confidential Space. A provided bash script handles provisioning.
./codelabs/setup.sh --project-id <PROJECT_ID>
The setup.sh script performs the following:
- Enables required APIs (Compute, Confidential Computing, Logging, Artifact Registry, Cloud Build).
- Builds and pushes the Docker image (wrapping vLLM with Attested TLS middleware).
- Provisions a Service Account with necessary permissions.
- Creates the Confidential VM (A3 instance with H100 GPU and TDX enabled).
- Configures Networking and Load Balancing (Passthrough Network Load Balancer).
- Saves outputs (image hash and load balancer IP) to local files.
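The saved outputs become the client's trust anchors: the image hash is what the client later pins in its attestation policy, and the load balancer IP is where it connects. As a minimal sketch of the pinning idea (the field names and helper below are illustrative assumptions, not the SDK's actual schema or API):

```python
import hmac

# Hypothetical shape of what setup.sh persists locally. The keys here are
# illustrative only; consult the files the script actually writes.
deployment = {
    "image_hash": "0" * 64,      # placeholder measurement for the example
    "lb_ip": "203.0.113.10",     # documentation-range IP, not a real endpoint
}

def check_measurement(pinned: str, reported: str) -> bool:
    # Fail closed: accept only an exact match between the pinned image hash
    # and the measurement reported in the TEE quote. compare_digest avoids
    # leaking how many leading characters matched.
    return hmac.compare_digest(pinned, reported)

assert check_measurement(deployment["image_hash"], "0" * 64)
assert not check_measurement(deployment["image_hash"], "f" * 64)
```

A reported measurement that differs in even one character must be rejected outright; there is no notion of a "close enough" hash.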
5. Step 1: Run the Attested Client
Now that the server is running securely, create a Python environment, install the SDK, and establish an attested connection:
python3 -m venv venv
source venv/bin/activate
pip install -r examples/requirements.txt
pip install -e .
./codelabs/run_client.sh <PROJECT_ID>
The run_client.sh script reads deployment details and executes a Python request using the ConfidentialSDKClient. If attestation fails, an AttestationError is raised and the prompt is never sent.
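The fail-closed guarantee is the key property here. The toy class below models it with plain Python (it is a simplified stand-in, not the real ConfidentialSDKClient implementation; only the AttestationError name comes from the SDK):

```python
class AttestationError(Exception):
    """Raised when the server's TEE quote fails verification."""

class AttestedClient:
    # Simplified stand-in for ConfidentialSDKClient. The constructor argument
    # and post() signature are illustrative assumptions.
    def __init__(self, expected_image_hash: str):
        self.expected_image_hash = expected_image_hash
        self.sent = []  # records what actually left the "client"

    def post(self, prompt: str, quoted_image_hash: str) -> None:
        # Attestation runs before any prompt bytes are transmitted.
        if quoted_image_hash != self.expected_image_hash:
            raise AttestationError("quote does not match pinned image hash")
        self.sent.append(prompt)

client = AttestedClient(expected_image_hash="abc123")
try:
    client.post("secret prompt", quoted_image_hash="evil456")
except AttestationError:
    pass
assert client.sent == []  # the prompt was never sent to the bad server
```

The point of the sketch: a failed attestation is not an error you handle after the fact; it prevents the sensitive payload from ever being transmitted.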
6. Step 2: Cleanup
To avoid charges, clean up resources once finished.
./codelabs/cleanup.sh --project-id <PROJECT_ID>
7. Under the Hood
What happens during http.post?
- TCP/TLS: Standard connection established.
- Handshake Interception: SDK pauses before sending the body.
- AttestConnection RPC: SDK sends a nonce to the server.
- Quote Generation: Server requests a TEE hardware quote.
- Validation: SDK verifies the quote signature and policy.
- Bind: The SDK verifies that the channel's "Exported Keying Material" matches the session bound in the quote.
- Data Transmission: Body is sent only if all checks pass.
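The binding step can be sketched with standard-library crypto. In reality the quote is signed by a TEE hardware key and the Exported Keying Material (EKM) is derived from the TLS session (RFC 5705); in this toy model an HMAC key stands in for the hardware key, and random bytes stand in for the EKM:

```python
import hashlib
import hmac
import os

HW_KEY = os.urandom(32)  # stand-in for the TEE's hardware signing key
ekm = os.urandom(32)     # both endpoints derive this value from the TLS channel

def server_quote(nonce: bytes, server_ekm: bytes) -> bytes:
    # The server folds the client's nonce and its view of the channel's EKM
    # into the quote's report data, then "signs" it.
    report_data = hashlib.sha256(nonce + server_ekm).digest()
    return hmac.new(HW_KEY, report_data, hashlib.sha256).digest()

def client_verify(quote: bytes, nonce: bytes, client_ekm: bytes) -> bool:
    # The client recomputes the binding over ITS view of the channel. A
    # man-in-the-middle terminates TLS itself, so its EKM differs and the
    # binding check fails even if the quote is otherwise genuine.
    expected = hashlib.sha256(nonce + client_ekm).digest()
    return hmac.compare_digest(quote, hmac.new(HW_KEY, expected, hashlib.sha256).digest())

nonce = os.urandom(16)
assert client_verify(server_quote(nonce, ekm), nonce, ekm)               # same channel: pass
assert not client_verify(server_quote(nonce, ekm), nonce, os.urandom(32))  # different channel: fail
```

This is why the nonce and the EKM binding matter: the nonce prevents replaying an old quote, and the EKM ties the quote to this specific TLS channel rather than to whoever happens to hold a valid quote.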
8. Troubleshooting
- Attestation Failed: Verify that the image_hash in the attestation policy exactly matches the deployed container image.
- Connection Refused: Ensure the server is reachable and port 8000 is open.
- Timeout: TEE quote generation can take time; ensure timeouts are sufficient.
9. Congratulations
You've successfully completed the Prompt Encryption SDK codelab! You've learned how to establish a cryptographically verified and encrypted channel between your client and a TEE-based inference server.
What's next?
- Explore advanced AttestationPolicy configurations.
- Integrate the SDK with your existing production applications.
- Learn more about Confidential Space and TEE hardware models.