Set up the basic OpenTelemetry plugin in gRPC Python

1. Introduction

In this codelab, you will use gRPC to create a client and server that form the foundation of a route-mapping application written in Python.

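By the end of this tutorial, you will have a simple gRPC HelloWorld application instrumented with the gRPC OpenTelemetry plugin, and you will be able to see the exported observability metrics in Prometheus.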

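What you'll learn

How to set up the OpenTelemetry plugin for an existing gRPC Python application
How to run a local Prometheus instance
How to export metrics to Prometheus
How to view metrics from the Prometheus dashboard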

2. Before you begin

What you'll need

Git
curl
build-essential
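Python 3.9 or later. For platform-specific Python installation instructions, see "Python Setup and Usage". Alternatively, install a non-system Python with a tool such as uv or pyenv.
pip 9.0.1 or later, to install Python packages.
venv, to create Python virtual environments.

Install the prerequisites: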

sudo apt-get update -y
sudo apt-get upgrade -y
sudo apt-get install -y git curl build-essential clang
sudo apt install python3
sudo apt install python3-pip python3-venv

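Get the code

To streamline your learning, this codelab provides a pre-built source code scaffold to help you get started. The following steps guide you through instrumenting an application with the gRPC OpenTelemetry plugin.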

grpc-codelabs

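The scaffold source code for this codelab is available in this GitHub directory. If you prefer not to implement the code yourself, the finished source code is available in the completed directory.

First, clone the grpc codelab repository and cd into the grpc-python-opentelemetry folder: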

git clone https://github.com/grpc-ecosystem/grpc-codelabs.git
cd grpc-codelabs/codelabs/grpc-python-opentelemetry/

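Alternatively, you can download a .zip file containing only the codelab directory and unzip it manually.

Next, create a new Python virtual environment (venv) to isolate the project's dependencies from the system packages: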

python3 -m venv --upgrade-deps .venv

To activate the virtual environment in a bash/zsh shell, run:

source .venv/bin/activate

For Windows and non-standard shells, see the table at https://docs.python.org/3/library/venv.html#how-venvs-work.

Then install the dependencies in the environment:

python -m pip install -r requirements.txt
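
As an optional sanity check after installing, the sketch below reports which top-level modules cannot be found in the active environment. The helper name missing_modules is hypothetical, and the module list is taken from the import statements used later in this codelab; it assumes requirements.txt is what provides them.

```python
import importlib.util

# Top-level modules imported by the client and server in this codelab.
REQUIRED_MODULES = ["grpc", "grpc_observability", "opentelemetry", "prometheus_client"]


def missing_modules(names):
    """Return the names from `names` that cannot be located for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are importable.")
```

If anything is reported as missing, check that the virtual environment is activated and rerun the pip command above.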

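3. Register the OpenTelemetry plugin

We need a gRPC application to add the gRPC OpenTelemetry plugin to. In this codelab, we use a simple gRPC HelloWorld client and server and instrument them with the gRPC OpenTelemetry plugin.

First, register the OpenTelemetry plugin, configured with a Prometheus exporter, in the client. Open start_here/observability_greeter_client.py in your preferred editor. Start by adding the related imports and constants, as shown below: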

import logging
import time

import grpc
import grpc_observability
import helloworld_pb2
import helloworld_pb2_grpc
from opentelemetry.exporter.prometheus import PrometheusMetricReader
from opentelemetry.sdk.metrics import MeterProvider
from prometheus_client import start_http_server

_SERVER_PORT = "50051"
_PROMETHEUS_PORT = 9465

Then change run() to look like this:

def run():
    # Start Prometheus client
    start_http_server(port=_PROMETHEUS_PORT, addr="0.0.0.0")
    meter_provider = MeterProvider(metric_readers=[PrometheusMetricReader()])

    otel_plugin = grpc_observability.OpenTelemetryPlugin(
        meter_provider=meter_provider
    )
    otel_plugin.register_global()

    with grpc.insecure_channel(target=f"localhost:{_SERVER_PORT}") as channel:
        stub = helloworld_pb2_grpc.GreeterStub(channel)
        # Continuously send RPCs every second.
        while True:
            try:
                response = stub.SayHello(helloworld_pb2.HelloRequest(name="You"))
                print(f"Greeter client received: {response.message}")
                time.sleep(1)
            except grpc.RpcError as rpc_error:
                print("Call failed with code: ", rpc_error.code())

    # Deregister is not called in this example, but this is required to clean up.
    otel_plugin.deregister_global()
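
Because the loop in run() never exits normally, the final deregister_global() call is not reached. One possible way to guarantee cleanup is to wrap registration in a context manager, sketched below. The names registered_globally and FakePlugin are hypothetical; FakePlugin is only a stand-in that records calls so the sketch runs without gRPC installed. The sketch assumes nothing about the real plugin beyond the register_global() and deregister_global() methods used above.

```python
from contextlib import contextmanager


@contextmanager
def registered_globally(plugin):
    """Register `plugin` globally and always deregister it on exit."""
    plugin.register_global()
    try:
        yield plugin
    finally:
        # Runs on normal exit, on exceptions, and on Ctrl+C (KeyboardInterrupt).
        plugin.deregister_global()


class FakePlugin:
    """Hypothetical stand-in for the real plugin; it only records calls."""

    def __init__(self):
        self.calls = []

    def register_global(self):
        self.calls.append("register")

    def deregister_global(self):
        self.calls.append("deregister")


if __name__ == "__main__":
    plugin = FakePlugin()
    try:
        with registered_globally(plugin):
            raise KeyboardInterrupt  # Simulate stopping the client loop.
    except KeyboardInterrupt:
        pass
    print(plugin.calls)  # ['register', 'deregister']
```

In the client, the channel and the RPC loop would go inside the with block, with the plugin object created as in run().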

The next step is to add the OpenTelemetry plugin to the server. Open start_here/observability_greeter_server.py and add the related imports and constants, as shown below:

from concurrent import futures
import logging
import time

import grpc
import grpc_observability
import helloworld_pb2
import helloworld_pb2_grpc
from opentelemetry.sdk.metrics import MeterProvider
from opentelemetry.exporter.prometheus import PrometheusMetricReader
from prometheus_client import start_http_server

_SERVER_PORT = "50051"
_PROMETHEUS_PORT = 9464

Then change serve() to look like this:

def serve():
    # Start Prometheus client
    start_http_server(port=_PROMETHEUS_PORT, addr="0.0.0.0")

    meter_provider = MeterProvider(metric_readers=[PrometheusMetricReader()])

    otel_plugin = grpc_observability.OpenTelemetryPlugin(
        meter_provider=meter_provider
    )
    otel_plugin.register_global()

    server = grpc.server(
        thread_pool=futures.ThreadPoolExecutor(max_workers=10),
    )
    helloworld_pb2_grpc.add_GreeterServicer_to_server(Greeter(), server)
    server.add_insecure_port("[::]:" + _SERVER_PORT)
    server.start()
    print("Server started, listening on " + _SERVER_PORT)

    server.wait_for_termination()

    # Deregister is not called in this example, but this is required to clean up.
    otel_plugin.deregister_global()

4. Run the example and view metrics

To run the server, run the following commands:

cd start_here
python -m observability_greeter_server

With a successful setup, the server prints the following output:

Server started, listening on 50051

While the server is running, run the client in another terminal:

# Run the below commands to cd to the working directory and activate virtual environment in the new terminal
cd grpc-codelabs/codelabs/grpc-python-opentelemetry/
source .venv/bin/activate

cd start_here
python -m observability_greeter_client

A successful run looks like this:

Greeter client received: Hello You
Greeter client received: Hello You
Greeter client received: Hello You

We have set up the gRPC OpenTelemetry plugin to export metrics using Prometheus. The metrics are available at localhost:9464 for the server and at localhost:9465 for the client.

To see the client metrics, run:

curl localhost:9465/metrics

The output has the following form:

# HELP python_gc_objects_collected_total Objects collected during gc
# TYPE python_gc_objects_collected_total counter
python_gc_objects_collected_total{generation="0"} 241.0
python_gc_objects_collected_total{generation="1"} 163.0
python_gc_objects_collected_total{generation="2"} 0.0
# HELP python_gc_objects_uncollectable_total Uncollectable objects found during GC
# TYPE python_gc_objects_uncollectable_total counter
python_gc_objects_uncollectable_total{generation="0"} 0.0
python_gc_objects_uncollectable_total{generation="1"} 0.0
python_gc_objects_uncollectable_total{generation="2"} 0.0
# HELP python_gc_collections_total Number of times this generation was collected
# TYPE python_gc_collections_total counter
python_gc_collections_total{generation="0"} 78.0
python_gc_collections_total{generation="1"} 7.0
python_gc_collections_total{generation="2"} 0.0
# HELP python_info Python platform information
# TYPE python_info gauge
python_info{implementation="CPython",major="3",minor="10",patchlevel="9",version="3.10.9"} 1.0
# HELP process_virtual_memory_bytes Virtual memory size in bytes.
# TYPE process_virtual_memory_bytes gauge
process_virtual_memory_bytes 1.868988416e+09
# HELP process_resident_memory_bytes Resident memory size in bytes.
# TYPE process_resident_memory_bytes gauge
process_resident_memory_bytes 4.1680896e+07
# HELP process_start_time_seconds Start time of the process since unix epoch in seconds.
# TYPE process_start_time_seconds gauge
process_start_time_seconds 1.72375679833e+09
# HELP process_cpu_seconds_total Total user and system CPU time spent in seconds.
# TYPE process_cpu_seconds_total counter
process_cpu_seconds_total 0.38
# HELP process_open_fds Number of open file descriptors.
# TYPE process_open_fds gauge
process_open_fds 9.0
# HELP process_max_fds Maximum number of open file descriptors.
# TYPE process_max_fds gauge
process_max_fds 4096.0
# HELP target_info Target metadata
# TYPE target_info gauge
target_info{service_name="unknown_service",telemetry_sdk_language="python",telemetry_sdk_name="opentelemetry",telemetry_sdk_version="1.26.0"} 1.0
# HELP grpc_client_attempt_started_total Number of client call attempts started
# TYPE grpc_client_attempt_started_total counter
grpc_client_attempt_started_total{grpc_method="other",grpc_target="localhost:50051"} 18.0
# HELP grpc_client_attempt_sent_total_compressed_message_size_bytes Compressed message bytes sent per client call attempt
# TYPE grpc_client_attempt_sent_total_compressed_message_size_bytes histogram
grpc_client_attempt_sent_total_compressed_message_size_bytes_bucket{grpc_method="other",grpc_status="OK",grpc_target="localhost:50051",le="0.0"} 0.0
grpc_client_attempt_sent_total_compressed_message_size_bytes_bucket{grpc_method="other",grpc_status="OK",grpc_target="localhost:50051",le="5.0"} 18.0
grpc_client_attempt_sent_total_compressed_message_size_bytes_bucket{grpc_method="other",grpc_status="OK",grpc_target="localhost:50051",le="10.0"} 18.0
grpc_client_attempt_sent_total_compressed_message_size_bytes_bucket{grpc_method="other",grpc_status="OK",grpc_target="localhost:50051",le="25.0"} 18.0
grpc_client_attempt_sent_total_compressed_message_size_bytes_bucket{grpc_method="other",grpc_status="OK",grpc_target="localhost:50051",le="50.0"} 18.0
grpc_client_attempt_sent_total_compressed_message_size_bytes_bucket{grpc_method="other",grpc_status="OK",grpc_target="localhost:50051",le="75.0"} 18.0
grpc_client_attempt_sent_total_compressed_message_size_bytes_bucket{grpc_method="other",grpc_status="OK",grpc_target="localhost:50051",le="100.0"} 18.0
grpc_client_attempt_sent_total_compressed_message_size_bytes_bucket{grpc_method="other",grpc_status="OK",grpc_target="localhost:50051",le="250.0"} 18.0

Similarly, for the server-side metrics:

curl localhost:9464/metrics
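
The /metrics output mixes Python runtime and process metrics with the gRPC ones. The following sketch filters the text down to the gRPC samples. The helper name grpc_samples is hypothetical, and the parser is deliberately minimal: it assumes each sample line ends with its value and carries no trailing timestamp, which matches the output shown above.

```python
def grpc_samples(metrics_text, prefix="grpc_"):
    """Return (series, value) pairs for samples whose name starts with `prefix`."""
    samples = []
    for line in metrics_text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        if not line.startswith(prefix):
            continue
        # The value is the last space-separated field on the line.
        series, _, value = line.rpartition(" ")
        samples.append((series, float(value)))
    return samples


# A few lines copied from the client output shown earlier in this section.
SAMPLE = """\
# HELP process_open_fds Number of open file descriptors.
# TYPE process_open_fds gauge
process_open_fds 9.0
# HELP grpc_client_attempt_started_total Number of client call attempts started
# TYPE grpc_client_attempt_started_total counter
grpc_client_attempt_started_total{grpc_method="other",grpc_target="localhost:50051"} 18.0
"""

if __name__ == "__main__":
    for series, value in grpc_samples(SAMPLE):
        print(series, value)
```

To use it on live data, save the curl output to a file and pass the file contents to grpc_samples instead of SAMPLE.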

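5. View metrics on Prometheus

Here, we set up a Prometheus instance that scrapes the gRPC example client and server, both of which export metrics using Prometheus.

Download the latest release of Prometheus for your platform using the given link, or use the following command: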

curl -sLO https://github.com/prometheus/prometheus/releases/download/v3.7.3/prometheus-3.7.3.linux-amd64.tar.gz

Then extract the archive and change into the extracted directory:

tar xvfz prometheus-*.tar.gz
cd prometheus-*

Create a Prometheus configuration file with the following contents:

cat > grpc_otel_python_prometheus.yml <<EOF
scrape_configs:
  - job_name: "prometheus"
    scrape_interval: 5s
    static_configs:
      - targets: ["localhost:9090"]
  - job_name: "grpc-otel-python"
    scrape_interval: 5s
    static_configs:
      - targets: ["localhost:9464", "localhost:9465"]
EOF

Start Prometheus with the new configuration:

./prometheus --config.file=grpc_otel_python_prometheus.yml

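This configures Prometheus to scrape metrics from the client and server codelab processes every 5 seconds.

Go to http://localhost:9090/graph to view the metrics. For example, the query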

histogram_quantile(0.5, rate(grpc_client_attempt_duration_seconds_bucket[1m]))

shows a graph of the median attempt latency, using 1 minute as the time window for the quantile calculation.
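
To give a feel for how a quantile is estimated from histogram buckets, here is a simplified sketch of linear interpolation within a bucket. It is an approximation of the idea behind histogram_quantile, not Prometheus's actual implementation, and it assumes 0 < q <= 1 and at least one finite bucket. The sample data reuses the first message-size buckets from the client output in section 4; the +Inf bucket is an assumed addition, because that output is cut off before it.

```python
import math


def histogram_quantile(q, buckets):
    """Estimate the q-quantile from cumulative (upper_bound, count) buckets.

    `buckets` must include the +Inf bucket; counts are cumulative, as in the
    `le` series exported by the plugin.
    """
    if not 0 < q <= 1:
        raise ValueError("q must be in (0, 1]")
    buckets = sorted(buckets)
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    # Find the first bucket whose cumulative count reaches the rank.
    index = next(i for i, (_, count) in enumerate(buckets) if count >= rank)
    upper, count = buckets[index]
    if math.isinf(upper):
        # Cannot interpolate inside +Inf; fall back to the highest finite bound.
        return buckets[-2][0]
    # The lowest bucket is assumed to start at 0.
    lower, below = (0.0, 0.0) if index == 0 else buckets[index - 1]
    return lower + (upper - lower) * (rank - below) / (count - below)


if __name__ == "__main__":
    # Cumulative counts per upper bound; the +Inf entry is assumed.
    sample = [(0.0, 0), (5.0, 18), (10.0, 18), (math.inf, 18)]
    print(histogram_quantile(0.5, sample))  # 2.5
```

The estimate can only be as precise as the bucket boundaries: all 18 observations fall in the (0, 5] bucket, so the median is reported as the midpoint, 2.5.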

To query the rate instead, use:

increase(grpc_client_attempt_duration_seconds_bucket[1m])
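
As a rough illustration of what increase measures, the sketch below sums the growth of a counter across a window of samples and treats any drop as a counter reset. The function name simple_increase and the sample values are hypothetical, and the sketch ignores the extrapolation to window boundaries that Prometheus applies, so its results will not match Prometheus exactly.

```python
def simple_increase(samples):
    """Sum the growth of a counter over time-ordered (timestamp, value) samples."""
    total = 0.0
    for (_, prev), (_, cur) in zip(samples, samples[1:]):
        # A drop means the counter restarted from zero (for example, after a
        # process restart), so the whole current value counts as growth.
        total += cur - prev if cur >= prev else cur
    return total


if __name__ == "__main__":
    # Hypothetical samples scraped every 5 seconds, with a reset before t=15.
    samples = [(0, 10.0), (5, 15.0), (10, 20.0), (15, 3.0), (20, 8.0)]
    growth = simple_increase(samples)
    elapsed = samples[-1][0] - samples[0][0]
    print(growth, growth / elapsed)  # 18.0 0.9
```

Dividing the growth by the elapsed seconds gives an average per-second rate over the window.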

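6. (Optional) Exercise for the user

In the Prometheus dashboard, you'll notice that the QPS is low. See if you can identify the suspicious code in the example that limits the QPS.

For the enthusiastic, the client code limits itself to a single pending RPC at any given moment. You can modify it so that the client sends more RPCs without waiting for the previous ones to complete. (The solution for this is not provided.)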