1. Introduction
In this codelab, you will take a gRPC client and server written in Python and instrument them with the gRPC OpenTelemetry plugin.
Once you finish this tutorial, you will have a simple gRPC HelloWorld application instrumented with the gRPC OpenTelemetry plugin, and you will be able to view the exported observability metrics in Prometheus.
What you'll learn
- How to configure the OpenTelemetry plugin for an existing gRPC Python application
- Running a local Prometheus instance
- Exporting metrics to Prometheus
- Viewing metrics from the Prometheus dashboard
2. Before you begin
What you'll need
- Git
- curl
- build-essential
- Python 3.9 or later. For platform-specific instructions on installing Python, see "Python Setup and Usage". Alternatively, you can install a non-system Python with a tool such as uv or pyenv.
- pip version 9.0.1 or later, for installing Python packages.
- venv, for creating Python virtual environments.
Setting up the prerequisites:
sudo apt-get update -y
sudo apt-get upgrade -y
sudo apt-get install -y git curl build-essential clang
sudo apt install python3
sudo apt install python3-pip python3-venv
Get the code
To make it easy to get started, this codelab provides a pre-built source code skeleton as a starting point. The steps below walk you through instrumenting the application with the gRPC OpenTelemetry plugin.
grpc-codelabs
The skeleton source code for this codelab is located in this GitHub directory. If you would rather not write the code yourself, the completed source code is available in the "completed" directory.
First, clone the grpc-codelabs repository and cd into the grpc-python-opentelemetry folder:
git clone https://github.com/grpc-ecosystem/grpc-codelabs.git
cd grpc-codelabs/codelabs/grpc-python-opentelemetry/
Alternatively, you can download a .zip file containing only the codelab directory and extract it manually.
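If you go the archive route, one rough sketch (note that this downloads the whole repository as a .zip and assumes the default main branch):
curl -L -o grpc-codelabs.zip https://github.com/grpc-ecosystem/grpc-codelabs/archive/refs/heads/main.zip
unzip grpc-codelabs.zip
cd grpc-codelabs-main/codelabs/grpc-python-opentelemetry/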
Next, create a new Python virtual environment (venv) to isolate the project's dependencies from your system packages:
python3 -m venv --upgrade-deps .venv
To activate the virtual environment in a bash/zsh shell, run:
source .venv/bin/activate
For Windows and non-standard shells, see the table at https://docs.python.org/3/library/venv.html#how-venvs-work.
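For reference, on Windows the activation step typically looks like this (cmd.exe and PowerShell, respectively):
.venv\Scripts\activate.bat
.venv\Scripts\Activate.ps1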
Then install the dependencies into the environment with:
python -m pip install -r requirements.txt
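If you want to sanity-check the environment before moving on, a quick import test (a minimal sketch, assuming requirements.txt pulls in grpcio, grpcio-observability, the OpenTelemetry SDK, and prometheus_client) should exit without errors:
python -c "import grpc, grpc_observability, opentelemetry.sdk.metrics, prometheus_client"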
3. Register the OpenTelemetry plugin
We need a gRPC application to add the gRPC OpenTelemetry plugin to. In this codelab, we will use a simple gRPC HelloWorld client and server and instrument them with the gRPC OpenTelemetry plugin.
First, register an OpenTelemetry plugin configured with a Prometheus exporter in the client. Open start_here/observability_greeter_client.py with your preferred editor and start by adding the relevant imports and constants, as shown below:
import logging
import time
import grpc
import grpc_observability
import helloworld_pb2
import helloworld_pb2_grpc
from opentelemetry.exporter.prometheus import PrometheusMetricReader
from opentelemetry.sdk.metrics import MeterProvider
from prometheus_client import start_http_server
_SERVER_PORT = "50051"
_PROMETHEUS_PORT = 9465
Then convert run() to:
def run():
    # Start Prometheus client
    start_http_server(port=_PROMETHEUS_PORT, addr="0.0.0.0")
    meter_provider = MeterProvider(metric_readers=[PrometheusMetricReader()])
    otel_plugin = grpc_observability.OpenTelemetryPlugin(
        meter_provider=meter_provider
    )
    otel_plugin.register_global()
    with grpc.insecure_channel(target=f"localhost:{_SERVER_PORT}") as channel:
        stub = helloworld_pb2_grpc.GreeterStub(channel)
        # Continuously send RPCs every second.
        while True:
            try:
                response = stub.SayHello(helloworld_pb2.HelloRequest(name="You"))
                print(f"Greeter client received: {response.message}")
                time.sleep(1)
            except grpc.RpcError as rpc_error:
                print("Call failed with code: ", rpc_error.code())
    # Deregister is not called in this example, but this is required to clean up.
    otel_plugin.deregister_global()
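The rest of the client file stays as it is; the skeleton's entry point should already look roughly like this:
if __name__ == "__main__":
    logging.basicConfig()
    run()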
The next step is to add the OpenTelemetry plugin to the server. Open start_here/observability_greeter_server.py and add the relevant imports and constants, as shown below:
from concurrent import futures
import logging
import time
import grpc
import grpc_observability
import helloworld_pb2
import helloworld_pb2_grpc
from opentelemetry.sdk.metrics import MeterProvider
from opentelemetry.exporter.prometheus import PrometheusMetricReader
from prometheus_client import start_http_server
_SERVER_PORT = "50051"
_PROMETHEUS_PORT = 9464
Then convert serve() to:
def serve():
    # Start Prometheus client
    start_http_server(port=_PROMETHEUS_PORT, addr="0.0.0.0")
    meter_provider = MeterProvider(metric_readers=[PrometheusMetricReader()])
    otel_plugin = grpc_observability.OpenTelemetryPlugin(
        meter_provider=meter_provider
    )
    otel_plugin.register_global()
    server = grpc.server(
        thread_pool=futures.ThreadPoolExecutor(max_workers=10),
    )
    helloworld_pb2_grpc.add_GreeterServicer_to_server(Greeter(), server)
    server.add_insecure_port("[::]:" + _SERVER_PORT)
    server.start()
    print("Server started, listening on " + _SERVER_PORT)
    server.wait_for_termination()
    # Deregister is not called in this example, but this is required to clean up.
    otel_plugin.deregister_global()
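serve() uses the Greeter servicer that the skeleton already defines; for context, a minimal implementation looks roughly like this (the skeleton's exact code may differ slightly):
class Greeter(helloworld_pb2_grpc.GreeterServicer):
    """Implements the HelloWorld Greeter service."""

    def SayHello(self, request, context):
        # Build the greeting from the name field of the incoming request.
        return helloworld_pb2.HelloReply(message=f"Hello {request.name}")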
4. Run the example and view the metrics
To run the server, execute the following:
cd start_here
python -m observability_greeter_server
Once it is set up successfully, the server prints the following output:
Server started, listening on 50051
While the server is running, run the client in another terminal:
# In the new terminal, run the commands below to cd to the working directory and activate the virtual environment
cd grpc-codelabs/codelabs/grpc-python-opentelemetry/
source .venv/bin/activate
cd start_here
python -m observability_greeter_client
A successful run looks like this:
Greeter client received: Hello You
Greeter client received: Hello You
Greeter client received: Hello You
We have configured the gRPC OpenTelemetry plugin to export metrics using Prometheus. The metrics are available at localhost:9464 for the server and localhost:9465 for the client.
To view the client metrics:
curl localhost:9465/metrics
The result will be of the following form:
# HELP python_gc_objects_collected_total Objects collected during gc
# TYPE python_gc_objects_collected_total counter
python_gc_objects_collected_total{generation="0"} 241.0
python_gc_objects_collected_total{generation="1"} 163.0
python_gc_objects_collected_total{generation="2"} 0.0
# HELP python_gc_objects_uncollectable_total Uncollectable objects found during GC
# TYPE python_gc_objects_uncollectable_total counter
python_gc_objects_uncollectable_total{generation="0"} 0.0
python_gc_objects_uncollectable_total{generation="1"} 0.0
python_gc_objects_uncollectable_total{generation="2"} 0.0
# HELP python_gc_collections_total Number of times this generation was collected
# TYPE python_gc_collections_total counter
python_gc_collections_total{generation="0"} 78.0
python_gc_collections_total{generation="1"} 7.0
python_gc_collections_total{generation="2"} 0.0
# HELP python_info Python platform information
# TYPE python_info gauge
python_info{implementation="CPython",major="3",minor="10",patchlevel="9",version="3.10.9"} 1.0
# HELP process_virtual_memory_bytes Virtual memory size in bytes.
# TYPE process_virtual_memory_bytes gauge
process_virtual_memory_bytes 1.868988416e+09
# HELP process_resident_memory_bytes Resident memory size in bytes.
# TYPE process_resident_memory_bytes gauge
process_resident_memory_bytes 4.1680896e+07
# HELP process_start_time_seconds Start time of the process since unix epoch in seconds.
# TYPE process_start_time_seconds gauge
process_start_time_seconds 1.72375679833e+09
# HELP process_cpu_seconds_total Total user and system CPU time spent in seconds.
# TYPE process_cpu_seconds_total counter
process_cpu_seconds_total 0.38
# HELP process_open_fds Number of open file descriptors.
# TYPE process_open_fds gauge
process_open_fds 9.0
# HELP process_max_fds Maximum number of open file descriptors.
# TYPE process_max_fds gauge
process_max_fds 4096.0
# HELP target_info Target metadata
# TYPE target_info gauge
target_info{service_name="unknown_service",telemetry_sdk_language="python",telemetry_sdk_name="opentelemetry",telemetry_sdk_version="1.26.0"} 1.0
# HELP grpc_client_attempt_started_total Number of client call attempts started
# TYPE grpc_client_attempt_started_total counter
grpc_client_attempt_started_total{grpc_method="other",grpc_target="localhost:50051"} 18.0
# HELP grpc_client_attempt_sent_total_compressed_message_size_bytes Compressed message bytes sent per client call attempt
# TYPE grpc_client_attempt_sent_total_compressed_message_size_bytes histogram
grpc_client_attempt_sent_total_compressed_message_size_bytes_bucket{grpc_method="other",grpc_status="OK",grpc_target="localhost:50051",le="0.0"} 0.0
grpc_client_attempt_sent_total_compressed_message_size_bytes_bucket{grpc_method="other",grpc_status="OK",grpc_target="localhost:50051",le="5.0"} 18.0
grpc_client_attempt_sent_total_compressed_message_size_bytes_bucket{grpc_method="other",grpc_status="OK",grpc_target="localhost:50051",le="10.0"} 18.0
grpc_client_attempt_sent_total_compressed_message_size_bytes_bucket{grpc_method="other",grpc_status="OK",grpc_target="localhost:50051",le="25.0"} 18.0
grpc_client_attempt_sent_total_compressed_message_size_bytes_bucket{grpc_method="other",grpc_status="OK",grpc_target="localhost:50051",le="50.0"} 18.0
grpc_client_attempt_sent_total_compressed_message_size_bytes_bucket{grpc_method="other",grpc_status="OK",grpc_target="localhost:50051",le="75.0"} 18.0
grpc_client_attempt_sent_total_compressed_message_size_bytes_bucket{grpc_method="other",grpc_status="OK",grpc_target="localhost:50051",le="100.0"} 18.0
grpc_client_attempt_sent_total_compressed_message_size_bytes_bucket{grpc_method="other",grpc_status="OK",grpc_target="localhost:50051",le="250.0"} 18.0
Similarly, the server-side metrics can be viewed with:
curl localhost:9464/metrics
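The output mixes Python process metrics with gRPC metrics; to see only the gRPC ones, you can filter the response, for example:
curl -s localhost:9464/metrics | grep "^grpc_"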
5. View metrics on Prometheus
Here, we will set up a Prometheus instance to scrape the example gRPC client and server that export Prometheus metrics.
Download the latest Prometheus release for your platform from the Prometheus download page, or use the following command:
curl -sLO https://github.com/prometheus/prometheus/releases/download/v3.7.3/prometheus-3.7.3.linux-amd64.tar.gz
Then extract it and change into the directory:
tar xvfz prometheus-*.tar.gz
cd prometheus-*
Create a Prometheus configuration file with the following contents:
cat > grpc_otel_python_prometheus.yml <<EOF
scrape_configs:
  - job_name: "prometheus"
    scrape_interval: 5s
    static_configs:
      - targets: ["localhost:9090"]
  - job_name: "grpc-otel-python"
    scrape_interval: 5s
    static_configs:
      - targets: ["localhost:9464", "localhost:9465"]
EOF
Start Prometheus with the new configuration:
./prometheus --config.file=grpc_otel_python_prometheus.yml
This configures Prometheus to scrape metrics from the codelab's client and server processes every 5 seconds.
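You can confirm that the scrape targets are up on the Prometheus targets page at http://localhost:9090/targets, or via its HTTP API, for example:
curl -s http://localhost:9090/api/v1/targets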
Go to http://localhost:9090/graph to view the metrics. For example, the query
histogram_quantile(0.5, rate(grpc_client_attempt_duration_seconds_bucket[1m]))
shows a graph of the median attempt latency, with the quantile computed over a 1-minute time window.
To query the rate:
increase(grpc_client_attempt_duration_seconds_bucket[1m])
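This shows the per-bucket increase of the attempt-latency histogram over the last minute. For an overall request rate (QPS), you can also query the grpc_client_attempt_started_total counter seen in the raw metrics earlier, for example:
rate(grpc_client_attempt_started_total[1m])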
6. (Optional) Exercise for the user
On the Prometheus dashboard, you will notice that the QPS is quite low. See if you can spot the code in the example that limits the QPS.
For the keen ones: the client code limits itself to a single pending RPC at any point in time. You can modify it so that the client sends more RPCs without waiting for earlier RPCs to complete. (A solution for this exercise is not yet provided.)
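That said, one possible direction, as a rough sketch rather than an official solution: issue each RPC with the stub's .future() form so that a new call does not block on the previous response. Something like the following could replace the RPC loop in run() (helper names here are illustrative):
def _on_done(call):
    # Runs when the RPC completes, successfully or not.
    if call.exception() is None:
        print(f"Greeter client received: {call.result().message}")
    else:
        print("Call failed with code: ", call.code())

while True:
    call = stub.SayHello.future(helloworld_pb2.HelloRequest(name="You"))
    call.add_done_callback(_on_done)
    time.sleep(0.1)  # issue ~10 RPCs per second without waiting for responses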