ตั้งค่าปลั๊กอิน OpenTelemetry พื้นฐานใน gRPC Python

1. บทนำ

ในโค้ดแล็บนี้ คุณจะได้ใช้ gRPC เพื่อสร้างไคลเอ็นต์และเซิร์ฟเวอร์ซึ่งเป็นรากฐานของแอปพลิเคชันการแมปเส้นทางที่เขียนด้วย Python

เมื่อสิ้นสุดบทแนะนำนี้ คุณจะมีแอปพลิเคชัน gRPC HelloWorld แบบง่ายๆ ที่ติดตั้งปลั๊กอิน gRPC OpenTelemetry และจะดูเมตริกการสังเกตการณ์ที่ส่งออกใน Prometheus ได้

สิ่งที่คุณจะได้เรียนรู้

วิธีตั้งค่าปลั๊กอิน OpenTelemetry สำหรับแอปพลิเคชัน gRPC Python ที่มีอยู่
การเรียกใช้อินสแตนซ์ Prometheus ในเครื่อง
ส่งออกเมตริกไปยัง Prometheus
ดูเมตริกจากแดชบอร์ด Prometheus

2. ก่อนเริ่มต้น

สิ่งที่คุณต้องมี

git
curl
build-essential
Python 3.9 ขึ้นไป ดูวิธีการติดตั้ง Python สำหรับแพลตฟอร์มที่เฉพาะเจาะจงได้ที่การตั้งค่าและการใช้งาน Python หรือจะติดตั้ง Python ที่ไม่ใช่ระบบโดยใช้เครื่องมืออย่าง uv หรือ pyenv ก็ได้
pip เวอร์ชัน 9.0.1 ขึ้นไปเพื่อติดตั้งแพ็กเกจ Python
venv เพื่อสร้างสภาพแวดล้อมเสมือนของ Python

ติดตั้งข้อกำหนดเบื้องต้น

sudo apt-get update -y
sudo apt-get upgrade -y
sudo apt-get install -y git curl build-essential clang
sudo apt install python3
sudo apt install python3-pip python3-venv

รับโค้ด

Codelab นี้มีโครงร่างซอร์สโค้ดที่สร้างไว้ล่วงหน้าเพื่อช่วยให้คุณเริ่มต้นใช้งานได้ง่ายขึ้น ขั้นตอนต่อไปนี้จะแนะนําคุณตลอดการติดตั้งใช้งานปลั๊กอิน gRPC OpenTelemetry ในแอปพลิเคชัน

grpc-codelabs

ซอร์สโค้ดโครงร่างสำหรับ Codelab นี้อยู่ในไดเรกทอรีนี้ใน GitHub หากไม่ต้องการติดตั้งใช้งานโค้ดด้วยตนเอง คุณจะดูซอร์สโค้ดที่เสร็จสมบูรณ์ได้ในไดเรกทอรี completed

ก่อนอื่น ให้โคลนที่เก็บโค้ดแล็บ grpc แล้ว cd ไปยังโฟลเดอร์ grpc-python-opentelemetry

git clone https://github.com/grpc-ecosystem/grpc-codelabs.git
cd grpc-codelabs/codelabs/grpc-python-opentelemetry/

หรือคุณจะดาวน์โหลดไฟล์ .zip ที่มีเฉพาะไดเรกทอรี Codelab แล้วแตกไฟล์ด้วยตนเองก็ได้

ก่อนอื่นมาสร้างสภาพแวดล้อมเสมือนของ Python (venv) ใหม่เพื่อแยกการอ้างอิงของโปรเจ็กต์ออกจากแพ็กเกจของระบบกัน

python3 -m venv --upgrade-deps .venv

วิธีเปิดใช้งานสภาพแวดล้อมเสมือนใน Bash/Zsh Shell

source .venv/bin/activate

สำหรับ Windows และเชลล์ที่ไม่ใช่แบบมาตรฐาน โปรดดูตารางที่ https://docs.python.org/3/library/venv.html#how-venvs-work

จากนั้นติดตั้งการอ้างอิงในสภาพแวดล้อมโดยใช้คำสั่งต่อไปนี้

python -m pip install -r requirements.txt

3. ลงทะเบียนปลั๊กอิน OpenTelemetry

เราต้องมีแอปพลิเคชัน gRPC เพื่อเพิ่มปลั๊กอิน gRPC OpenTelemetry ใน Codelab นี้ เราจะใช้ไคลเอ็นต์และเซิร์ฟเวอร์ gRPC HelloWorld แบบง่ายๆ ซึ่งเราจะติดตั้งเครื่องมือด้วยปลั๊กอิน OpenTelemetry ของ gRPC

ขั้นตอนแรกคือการลงทะเบียนปลั๊กอิน OpenTelemetry ที่กำหนดค่าด้วยเครื่องมือส่งออก Prometheus ในไคลเอ็นต์ เปิด start_here/observability_greeter_client.py ด้วยโปรแกรมแก้ไขที่ต้องการ ก่อนอื่น ให้เพิ่มการอ้างอิงและมาโครที่เกี่ยวข้องให้มีลักษณะดังนี้

import logging
import time

import grpc
import grpc_observability
import helloworld_pb2
import helloworld_pb2_grpc
from opentelemetry.exporter.prometheus import PrometheusMetricReader
from opentelemetry.sdk.metrics import MeterProvider
from prometheus_client import start_http_server

_SERVER_PORT = "50051"
_PROMETHEUS_PORT = 9465

จากนั้นเปลี่ยน run() ให้มีลักษณะดังนี้

def run():
    # Start Prometheus client
    start_http_server(port=_PROMETHEUS_PORT, addr="0.0.0.0")
    meter_provider = MeterProvider(metric_readers=[PrometheusMetricReader()])

    otel_plugin = grpc_observability.OpenTelemetryPlugin(
        meter_provider=meter_provider
    )
    otel_plugin.register_global()

    with grpc.insecure_channel(target=f"localhost:{_SERVER_PORT}") as channel:
        stub = helloworld_pb2_grpc.GreeterStub(channel)
        # Continuously send RPCs every second.
        while True:
            try:
                response = stub.SayHello(helloworld_pb2.HelloRequest(name="You"))
                print(f"Greeter client received: {response.message}")
                time.sleep(1)
            except grpc.RpcError as rpc_error:
                print("Call failed with code: ", rpc_error.code())

    # Deregister is not called in this example, but this is required to clean up.
    otel_plugin.deregister_global()

หมายเหตุ: ระบบกำลังตั้งค่า Prometheus Exporter ใน OpenTelemetry Meter Provider (นอกจากนี้ยังมีวิธีอื่นๆ ในการส่งออกเมตริกด้วย Codelab นี้เลือกใช้ตัวส่งออก Prometheus) ระบบจะระบุ MeterProvider นี้ให้กับปลั๊กอิน OpenTelemetry ของ gRPC เป็นการกำหนดค่า เมื่อลงทะเบียนปลั๊กอิน OpenTelemetry ทั่วโลกแล้ว ไคลเอ็นต์และเซิร์ฟเวอร์ gRPC ทั้งหมดจะได้รับการวัดผลด้วย OpenTelemetry

ขั้นตอนถัดไปคือการเพิ่มปลั๊กอิน OpenTelemetry ลงในเซิร์ฟเวอร์ เปิด start_here/observability_greeter_server.py แล้วเพิ่มการอ้างอิงและมาโครที่เกี่ยวข้องให้มีลักษณะดังนี้

from concurrent import futures
import logging
import time

import grpc
import grpc_observability
import helloworld_pb2
import helloworld_pb2_grpc
from opentelemetry.sdk.metrics import MeterProvider
from opentelemetry.exporter.prometheus import PrometheusMetricReader
from prometheus_client import start_http_server

_SERVER_PORT = "50051"
_PROMETHEUS_PORT = 9464

จากนั้นเปลี่ยน run() ให้มีลักษณะดังนี้

def serve():
    # Start Prometheus client
    start_http_server(port=_PROMETHEUS_PORT, addr="0.0.0.0")

    meter_provider = MeterProvider(metric_readers=[PrometheusMetricReader()])

    otel_plugin = grpc_observability.OpenTelemetryPlugin(
        meter_provider=meter_provider
    )
    otel_plugin.register_global()

    server = grpc.server(
        thread_pool=futures.ThreadPoolExecutor(max_workers=10),
    )
    helloworld_pb2_grpc.add_GreeterServicer_to_server(Greeter(), server)
    server.add_insecure_port("[::]:" + _SERVER_PORT)
    server.start()
    print("Server started, listening on " + _SERVER_PORT)

    server.wait_for_termination()

    # Deregister is not called in this example, but this is required to clean up.
    otel_plugin.deregister_global()

4. การเรียกใช้ตัวอย่างและการดูเมตริก

หากต้องการเรียกใช้เซิร์ฟเวอร์ ให้เรียกใช้ -

cd start_here
python -m observability_greeter_server

หากตั้งค่าสำเร็จ คุณจะเห็นเอาต์พุตต่อไปนี้สำหรับเซิร์ฟเวอร์

Server started, listening on 50051

ขณะที่เซิร์ฟเวอร์ทำงาน ให้เรียกใช้ไคลเอ็นต์ในเทอร์มินัลอื่นโดยใช้คำสั่งต่อไปนี้

# Run the below commands to cd to the working directory and activate virtual environment in the new terminal
cd grpc-codelabs/codelabs/grpc-python-opentelemetry/
source .venv/bin/activate

cd start_here
python -m observability_greeter_client

การเรียกใช้ที่สำเร็จจะมีลักษณะดังนี้

Greeter client received: Hello You
Greeter client received: Hello You
Greeter client received: Hello You

เนื่องจากเราได้ตั้งค่าปลั๊กอิน gRPC OpenTelemetry เพื่อส่งออกเมตริกโดยใช้ Prometheus เมตริกเหล่านั้นจะพร้อมใช้งานใน localhost:9464 สำหรับเซิร์ฟเวอร์และ localhost:9465 สำหรับไคลเอ็นต์

วิธีดูเมตริกไคลเอ็นต์

curl localhost:9465/metrics

ผลลัพธ์จะมีรูปแบบดังนี้

# HELP python_gc_objects_collected_total Objects collected during gc
# TYPE python_gc_objects_collected_total counter
python_gc_objects_collected_total{generation="0"} 241.0
python_gc_objects_collected_total{generation="1"} 163.0
python_gc_objects_collected_total{generation="2"} 0.0
# HELP python_gc_objects_uncollectable_total Uncollectable objects found during GC
# TYPE python_gc_objects_uncollectable_total counter
python_gc_objects_uncollectable_total{generation="0"} 0.0
python_gc_objects_uncollectable_total{generation="1"} 0.0
python_gc_objects_uncollectable_total{generation="2"} 0.0
# HELP python_gc_collections_total Number of times this generation was collected
# TYPE python_gc_collections_total counter
python_gc_collections_total{generation="0"} 78.0
python_gc_collections_total{generation="1"} 7.0
python_gc_collections_total{generation="2"} 0.0
# HELP python_info Python platform information
# TYPE python_info gauge
python_info{implementation="CPython",major="3",minor="10",patchlevel="9",version="3.10.9"} 1.0
# HELP process_virtual_memory_bytes Virtual memory size in bytes.
# TYPE process_virtual_memory_bytes gauge
process_virtual_memory_bytes 1.868988416e+09
# HELP process_resident_memory_bytes Resident memory size in bytes.
# TYPE process_resident_memory_bytes gauge
process_resident_memory_bytes 4.1680896e+07
# TYPE process_resident_memory_bytes gauge                                                                                                                                                                                                                                                                21:20:16 [154/966]
process_resident_memory_bytes 4.1680896e+07
# HELP process_start_time_seconds Start time of the process since unix epoch in seconds.
# TYPE process_start_time_seconds gauge
process_start_time_seconds 1.72375679833e+09
# HELP process_cpu_seconds_total Total user and system CPU time spent in seconds.
# TYPE process_cpu_seconds_total counter
process_cpu_seconds_total 0.38
# HELP process_open_fds Number of open file descriptors.
# TYPE process_open_fds gauge
process_open_fds 9.0
# HELP process_max_fds Maximum number of open file descriptors.
# TYPE process_max_fds gauge
process_max_fds 4096.0
# HELP target_info Target metadata
# TYPE target_info gauge
target_info{service_name="unknown_service",telemetry_sdk_language="python",telemetry_sdk_name="opentelemetry",telemetry_sdk_version="1.26.0"} 1.0
# HELP grpc_client_attempt_started_total Number of client call attempts started
# TYPE grpc_client_attempt_started_total counter
grpc_client_attempt_started_total{grpc_method="other",grpc_target="localhost:50051"} 18.0
# HELP grpc_client_attempt_sent_total_compressed_message_size_bytes Compressed message bytes sent per client call attempt
# TYPE grpc_client_attempt_sent_total_compressed_message_size_bytes histogram
grpc_client_attempt_sent_total_compressed_message_size_bytes_bucket{grpc_method="other",grpc_status="OK",grpc_target="localhost:50051",le="0.0"} 0.0
grpc_client_attempt_sent_total_compressed_message_size_bytes_bucket{grpc_method="other",grpc_status="OK",grpc_target="localhost:50051",le="5.0"} 18.0
grpc_client_attempt_sent_total_compressed_message_size_bytes_bucket{grpc_method="other",grpc_status="OK",grpc_target="localhost:50051",le="10.0"} 18.0
grpc_client_attempt_sent_total_compressed_message_size_bytes_bucket{grpc_method="other",grpc_status="OK",grpc_target="localhost:50051",le="25.0"} 18.0
grpc_client_attempt_sent_total_compressed_message_size_bytes_bucket{grpc_method="other",grpc_status="OK",grpc_target="localhost:50051",le="50.0"} 18.0
grpc_client_attempt_sent_total_compressed_message_size_bytes_bucket{grpc_method="other",grpc_status="OK",grpc_target="localhost:50051",le="75.0"} 18.0
grpc_client_attempt_sent_total_compressed_message_size_bytes_bucket{grpc_method="other",grpc_status="OK",grpc_target="localhost:50051",le="100.0"} 18.0
grpc_client_attempt_sent_total_compressed_message_size_bytes_bucket{grpc_method="other",grpc_status="OK",grpc_target="localhost:50051",le="250.0"} 18.0

ในทำนองเดียวกัน สำหรับเมตริกฝั่งเซิร์ฟเวอร์ ให้ทำดังนี้

curl localhost:9464/metrics

5. การดูเมตริกใน Prometheus

ในที่นี้ เราจะตั้งค่าอินสแตนซ์ Prometheus ที่จะคัดลอกไคลเอ็นต์และเซิร์ฟเวอร์ตัวอย่าง gRPC ที่ส่งออกเมตริกโดยใช้ Prometheus

ดาวน์โหลดรุ่นล่าสุดของ Prometheus สำหรับแพลตฟอร์มของคุณโดยใช้ลิงก์ที่ให้ไว้ หรือใช้คำสั่งต่อไปนี้

curl -sLO https://github.com/prometheus/prometheus/releases/download/v3.7.3/prometheus-3.7.3.linux-amd64.tar.gz

จากนั้นแตกไฟล์และเรียกใช้โดยใช้คำสั่งต่อไปนี้

tar xvfz prometheus-*.tar.gz
cd prometheus-*

สร้างไฟล์การกำหนดค่า Prometheus ด้วยข้อมูลต่อไปนี้

cat > grpc_otel_python_prometheus.yml <<EOF
scrape_configs:
  - job_name: "prometheus"
    scrape_interval: 5s
    static_configs:
      - targets: ["localhost:9090"]
  - job_name: "grpc-otel-python"
    scrape_interval: 5s
    static_configs:
      - targets: ["localhost:9464", "localhost:9465"]
EOF

เริ่ม Prometheus ด้วยการกำหนดค่าใหม่ -

./prometheus --config.file=grpc_otel_python_prometheus.yml

ซึ่งจะกำหนดค่าเมตริกจากกระบวนการ Codelab ของไคลเอ็นต์และเซิร์ฟเวอร์ให้มีการขูดทุกๆ 5 วินาที

ไปที่ http://localhost:9090/graph เพื่อดูเมตริก ตัวอย่างเช่น การค้นหา -

histogram_quantile(0.5, rate(grpc_client_attempt_duration_seconds_bucket[1m]))

จะแสดงกราฟที่มีเวลาในการตอบสนองของความพยายามที่ค่ามัธยฐานโดยใช้ 1 นาทีเป็นกรอบเวลาสำหรับการคำนวณควอนไทล์

อัตราการค้นหา -

increase(grpc_client_attempt_duration_seconds_bucket[1m])

6. (ไม่บังคับ) แบบฝึกหัดสำหรับผู้ใช้

คุณจะเห็นว่า QPS ต่ำในแดชบอร์ด Prometheus ดูว่าคุณระบุโค้ดที่น่าสงสัยในตัวอย่างที่จำกัด QPS ได้หรือไม่

สำหรับผู้ที่กระตือรือร้น รหัสไคลเอ็นต์จะจำกัดตัวเองให้มี RPC ที่รอดำเนินการเพียงรายการเดียวในขณะใดก็ตาม ซึ่งสามารถแก้ไขเพื่อให้ไคลเอ็นต์ส่ง RPC เพิ่มเติมได้โดยไม่ต้องรอให้ RPC ก่อนหน้าเสร็จสมบูรณ์ (ยังไม่มีวิธีแก้ปัญหานี้)