1. Introduction
In this codelab, you'll use the gRPC OpenTelemetry plugin to add observability to a gRPC client and server written in Python.
By the end of the tutorial, you will have a simple gRPC HelloWorld application instrumented with the gRPC OpenTelemetry plugin and be able to see the exported observability metrics in Prometheus.
What you'll learn
- How to set up the OpenTelemetry plugin for an existing gRPC Python application
- How to run a local Prometheus instance
- How to export metrics to Prometheus
- How to view metrics on the Prometheus dashboard
2. Before you begin
What you'll need
- git
- curl
- build-essential
- Python 3.9 or higher. For platform-specific Python installation instructions, see Python Setup and Usage. Alternatively, install a non-system Python using tools like uv or pyenv.
- pip version 9.0.1 or higher to install Python packages.
- venv to create Python virtual environments.
Install the prerequisites:
sudo apt-get update -y
sudo apt-get upgrade -y
sudo apt-get install -y git curl build-essential clang
sudo apt-get install -y python3 python3-pip python3-venv
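You can verify that the installed versions meet the requirements:
python3 --version
python3 -m pip --version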
Get the code
To streamline your learning, this codelab offers a pre-built source code scaffold to help you get started. The following steps will guide you through instrumenting an application with the gRPC OpenTelemetry plugin.
The scaffold source code for this codelab is available in the grpc-codelabs GitHub repository. If you prefer not to implement the code yourself, the completed source code is available in the completed directory.
First, clone the grpc-codelabs repository and cd into the grpc-python-opentelemetry directory:
git clone https://github.com/grpc-ecosystem/grpc-codelabs.git
cd grpc-codelabs/codelabs/grpc-python-opentelemetry/
Alternatively, you can download the .zip file containing only the codelab directory and manually unzip it.
Let's first create a new Python virtual environment (venv) to isolate your project's dependencies from the system packages:
python3 -m venv --upgrade-deps .venv
To activate the virtual environment in bash/zsh shell:
source .venv/bin/activate
For Windows and non-standard shells, see the table at https://docs.python.org/3/library/venv.html#how-venvs-work.
Next, install the dependencies in the environment using:
python -m pip install -r requirements.txt
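For reference, requirements.txt pulls in the gRPC observability plugin and its OpenTelemetry dependencies. If you are instrumenting your own project rather than the scaffold, the equivalent packages can be installed directly (unpinned here; the scaffold's requirements.txt is authoritative):
python -m pip install grpcio grpcio-observability opentelemetry-sdk opentelemetry-exporter-prometheus prometheus-client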
3. Register the OpenTelemetry Plugin
We need a gRPC application to which we can add the gRPC OpenTelemetry plugin. In this codelab, we will instrument a simple gRPC HelloWorld client and server.
Your first step is to register the OpenTelemetry plugin, configured with a Prometheus exporter, in the client. Open start_here/observability_greeter_client.py with your favorite editor. First, add the required imports and module-level constants so the top of the file looks like this:
import logging
import time
import grpc
import grpc_observability
import helloworld_pb2
import helloworld_pb2_grpc
from opentelemetry.exporter.prometheus import PrometheusMetricReader
from opentelemetry.sdk.metrics import MeterProvider
from prometheus_client import start_http_server
_SERVER_PORT = "50051"
_PROMETHEUS_PORT = 9465
Then transform run() to look like this:
def run():
    # Start the Prometheus client to expose metrics over HTTP.
    start_http_server(port=_PROMETHEUS_PORT, addr="0.0.0.0")
    meter_provider = MeterProvider(metric_readers=[PrometheusMetricReader()])
    otel_plugin = grpc_observability.OpenTelemetryPlugin(
        meter_provider=meter_provider
    )
    otel_plugin.register_global()
    with grpc.insecure_channel(target=f"localhost:{_SERVER_PORT}") as channel:
        stub = helloworld_pb2_grpc.GreeterStub(channel)
        # Continuously send RPCs every second.
        while True:
            try:
                response = stub.SayHello(helloworld_pb2.HelloRequest(name="You"))
                print(f"Greeter client received: {response.message}")
                time.sleep(1)
            except grpc.RpcError as rpc_error:
                print("Call failed with code: ", rpc_error.code())
    # Not reached in this example because the loop above runs forever,
    # but deregistering the plugin is required for cleanup.
    otel_plugin.deregister_global()
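Note that because the loop above runs forever, the deregister_global() call is never actually reached. In your own applications, a try/finally block guarantees cleanup even if the loop exits or is interrupted. A minimal sketch (not part of the codelab scaffold):
    otel_plugin.register_global()
    try:
        with grpc.insecure_channel(target=f"localhost:{_SERVER_PORT}") as channel:
            stub = helloworld_pb2_grpc.GreeterStub(channel)
            while True:
                response = stub.SayHello(helloworld_pb2.HelloRequest(name="You"))
                print(f"Greeter client received: {response.message}")
                time.sleep(1)
    finally:
        # Runs even if the loop raises or is interrupted with Ctrl-C.
        otel_plugin.deregister_global()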
The next step is to add the OpenTelemetry plugin to the server. Open start_here/observability_greeter_server.py and add the required imports and module-level constants so the top of the file looks like this:
from concurrent import futures
import logging
import time
import grpc
import grpc_observability
import helloworld_pb2
import helloworld_pb2_grpc
from opentelemetry.sdk.metrics import MeterProvider
from opentelemetry.exporter.prometheus import PrometheusMetricReader
from prometheus_client import start_http_server
_SERVER_PORT = "50051"
_PROMETHEUS_PORT = 9464
Then transform serve() to look like this:
def serve():
    # Start the Prometheus client to expose metrics over HTTP.
    start_http_server(port=_PROMETHEUS_PORT, addr="0.0.0.0")
    meter_provider = MeterProvider(metric_readers=[PrometheusMetricReader()])
    otel_plugin = grpc_observability.OpenTelemetryPlugin(
        meter_provider=meter_provider
    )
    otel_plugin.register_global()
    server = grpc.server(
        thread_pool=futures.ThreadPoolExecutor(max_workers=10),
    )
    helloworld_pb2_grpc.add_GreeterServicer_to_server(Greeter(), server)
    server.add_insecure_port("[::]:" + _SERVER_PORT)
    server.start()
    print("Server started, listening on " + _SERVER_PORT)
    server.wait_for_termination()
    # Not reached until the server terminates, but deregistering the
    # plugin is required for cleanup.
    otel_plugin.deregister_global()
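The same caveat applies on the server: wait_for_termination() blocks until the server shuts down, so deregister_global() is not reached in normal operation. A sketch (an assumed pattern, not part of the scaffold) that cleans up on Ctrl-C:
    otel_plugin.register_global()
    try:
        server.start()
        server.wait_for_termination()
    except KeyboardInterrupt:
        # Give in-flight RPCs up to 5 seconds to complete, then stop.
        server.stop(grace=5).wait()
    finally:
        otel_plugin.deregister_global()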
4. Running the example and viewing metrics
To run the server:
cd start_here
python -m observability_greeter_server
With a successful setup, you will see the following server output:
Server started, listening on 50051
While the server is running, open another terminal and run the client:
# In the new terminal, cd to the working directory and activate the virtual environment
cd grpc-codelabs/codelabs/grpc-python-opentelemetry/
source .venv/bin/activate
cd start_here
python -m observability_greeter_client
A successful run will look like this:
Greeter client received: Hello You
Greeter client received: Hello You
Greeter client received: Hello You
Since we have set up the gRPC OpenTelemetry plugin to export metrics using Prometheus, those metrics are available on localhost:9464 for the server and localhost:9465 for the client.
To see the client metrics:
curl localhost:9465/metrics
The result will be of the form:
# HELP python_gc_objects_collected_total Objects collected during gc
# TYPE python_gc_objects_collected_total counter
python_gc_objects_collected_total{generation="0"} 241.0
python_gc_objects_collected_total{generation="1"} 163.0
python_gc_objects_collected_total{generation="2"} 0.0
# HELP python_gc_objects_uncollectable_total Uncollectable objects found during GC
# TYPE python_gc_objects_uncollectable_total counter
python_gc_objects_uncollectable_total{generation="0"} 0.0
python_gc_objects_uncollectable_total{generation="1"} 0.0
python_gc_objects_uncollectable_total{generation="2"} 0.0
# HELP python_gc_collections_total Number of times this generation was collected
# TYPE python_gc_collections_total counter
python_gc_collections_total{generation="0"} 78.0
python_gc_collections_total{generation="1"} 7.0
python_gc_collections_total{generation="2"} 0.0
# HELP python_info Python platform information
# TYPE python_info gauge
python_info{implementation="CPython",major="3",minor="10",patchlevel="9",version="3.10.9"} 1.0
# HELP process_virtual_memory_bytes Virtual memory size in bytes.
# TYPE process_virtual_memory_bytes gauge
process_virtual_memory_bytes 1.868988416e+09
# HELP process_resident_memory_bytes Resident memory size in bytes.
# TYPE process_resident_memory_bytes gauge
process_resident_memory_bytes 4.1680896e+07
# HELP process_start_time_seconds Start time of the process since unix epoch in seconds.
# TYPE process_start_time_seconds gauge
process_start_time_seconds 1.72375679833e+09
# HELP process_cpu_seconds_total Total user and system CPU time spent in seconds.
# TYPE process_cpu_seconds_total counter
process_cpu_seconds_total 0.38
# HELP process_open_fds Number of open file descriptors.
# TYPE process_open_fds gauge
process_open_fds 9.0
# HELP process_max_fds Maximum number of open file descriptors.
# TYPE process_max_fds gauge
process_max_fds 4096.0
# HELP target_info Target metadata
# TYPE target_info gauge
target_info{service_name="unknown_service",telemetry_sdk_language="python",telemetry_sdk_name="opentelemetry",telemetry_sdk_version="1.26.0"} 1.0
# HELP grpc_client_attempt_started_total Number of client call attempts started
# TYPE grpc_client_attempt_started_total counter
grpc_client_attempt_started_total{grpc_method="other",grpc_target="localhost:50051"} 18.0
# HELP grpc_client_attempt_sent_total_compressed_message_size_bytes Compressed message bytes sent per client call attempt
# TYPE grpc_client_attempt_sent_total_compressed_message_size_bytes histogram
grpc_client_attempt_sent_total_compressed_message_size_bytes_bucket{grpc_method="other",grpc_status="OK",grpc_target="localhost:50051",le="0.0"} 0.0
grpc_client_attempt_sent_total_compressed_message_size_bytes_bucket{grpc_method="other",grpc_status="OK",grpc_target="localhost:50051",le="5.0"} 18.0
grpc_client_attempt_sent_total_compressed_message_size_bytes_bucket{grpc_method="other",grpc_status="OK",grpc_target="localhost:50051",le="10.0"} 18.0
grpc_client_attempt_sent_total_compressed_message_size_bytes_bucket{grpc_method="other",grpc_status="OK",grpc_target="localhost:50051",le="25.0"} 18.0
grpc_client_attempt_sent_total_compressed_message_size_bytes_bucket{grpc_method="other",grpc_status="OK",grpc_target="localhost:50051",le="50.0"} 18.0
grpc_client_attempt_sent_total_compressed_message_size_bytes_bucket{grpc_method="other",grpc_status="OK",grpc_target="localhost:50051",le="75.0"} 18.0
grpc_client_attempt_sent_total_compressed_message_size_bytes_bucket{grpc_method="other",grpc_status="OK",grpc_target="localhost:50051",le="100.0"} 18.0
grpc_client_attempt_sent_total_compressed_message_size_bytes_bucket{grpc_method="other",grpc_status="OK",grpc_target="localhost:50051",le="250.0"} 18.0
Similarly, for the server-side metrics:
curl localhost:9464/metrics
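Both endpoints also expose Python runtime and process metrics, as shown above. To narrow the output to just the gRPC metric samples, you can filter with grep:
curl -s localhost:9464/metrics | grep ^grpc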
5. Viewing metrics on Prometheus
Here, we will set up a Prometheus instance that scrapes our example gRPC client and server, which export metrics in the Prometheus format.
Download the latest release of Prometheus for your platform from the Prometheus download page, or use the following command:
curl -sLO https://github.com/prometheus/prometheus/releases/download/v3.7.3/prometheus-3.7.3.linux-amd64.tar.gz
Then extract it and change into the extracted directory:
tar xvfz prometheus-*.tar.gz
cd prometheus-*
Create a Prometheus configuration file with the following:
cat > grpc_otel_python_prometheus.yml <<EOF
scrape_configs:
- job_name: "prometheus"
scrape_interval: 5s
static_configs:
- targets: ["localhost:9090"]
- job_name: "grpc-otel-python"
scrape_interval: 5s
static_configs:
- targets: ["localhost:9464", "localhost:9465"]
EOF
Start Prometheus with the new configuration:
./prometheus --config.file=grpc_otel_python_prometheus.yml
This configures Prometheus to scrape the metrics from the client and server codelab processes every 5 seconds. You can confirm that both endpoints are up on the Targets page at http://localhost:9090/targets.
Go to http://localhost:9090/graph to view the metrics. For example, the query
histogram_quantile(0.5, rate(grpc_client_attempt_duration_seconds_bucket[1m]))
will show a graph of the median attempt latency, using a 1-minute time window for the quantile calculation.
To see the rate of queries, plot the number of RPC attempts started over the last minute:
increase(grpc_client_attempt_started_total[1m])
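Since the duration histogram also carries a grpc_status label, you can break the attempt rate down by status. For example, the following query (illustrative, built from the metrics shown earlier) plots per-second completed attempts grouped by status:
sum by (grpc_status) (rate(grpc_client_attempt_duration_seconds_count[1m]))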
6. (Optional) Exercise for the User
In the Prometheus dashboard, you'll notice that the QPS is low. See if you can identify the code in the example that is limiting the QPS.
For the enthusiastic: the client code limits itself to a single pending RPC at any given moment. It can be modified so that the client sends new RPCs without waiting for previous ones to complete. (The solution for this has not been provided.)
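If you want a starting point without reading ahead, note that gRPC Python stubs expose a .future variant of each unary call that returns immediately with a grpc.Future. A hypothetical sketch, not the official solution:
    # Fire off a batch of RPCs without waiting for each one to finish.
    calls = [
        stub.SayHello.future(helloworld_pb2.HelloRequest(name="You"))
        for _ in range(10)
    ]
    for call in calls:
        print(f"Greeter client received: {call.result().message}")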