Setting up the basic OpenTelemetry plugin in gRPC Python

1. Introduction

In this codelab, you will instrument a gRPC client and server written in Python with the gRPC OpenTelemetry plugin.

By the end of the tutorial, you will have a simple gRPC HelloWorld application instrumented with the gRPC OpenTelemetry plugin, and you will be able to view the exported observability metrics in Prometheus.

What you'll learn

How to set up the OpenTelemetry plugin for an existing gRPC Python application
How to run a local Prometheus instance
How to export metrics to Prometheus
How to view metrics in the Prometheus dashboard

2. Before you begin

What you'll need

git
curl
build-essential
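Python 3.9 or later. For platform-specific installation instructions, see Python Setup and Usage. Alternatively, use a tool such as uv or pyenv to install a non-system Python.
pip version 9.0.1 or later, to install Python packages
venv, to create Python virtual environments

Install the prerequisites: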

sudo apt-get update -y
sudo apt-get upgrade -y
sudo apt-get install -y git curl build-essential clang
sudo apt install python3
sudo apt install python3-pip python3-venv

Get the code

To streamline your learning, this codelab provides a pre-built source code scaffold to help you get started. The following steps walk you through instrumenting an application with the gRPC OpenTelemetry plugin.

grpc-codelabs

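The scaffold source code for this codelab is available in this GitHub directory. If you prefer not to implement the code yourself, the finished source code is available in the completed directory.

First, clone the grpc codelab repository and cd into the grpc-python-opentelemetry folder: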

git clone https://github.com/grpc-ecosystem/grpc-codelabs.git
cd grpc-codelabs/codelabs/grpc-python-opentelemetry/

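Alternatively, you can download a .zip file that contains only the codelab directory and unzip it manually.

Next, create a new Python virtual environment (venv) to isolate the project's dependencies from the system packages: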

python3 -m venv --upgrade-deps .venv

To activate the virtual environment in a bash/zsh shell, run:

source .venv/bin/activate

For Windows and non-standard shells, see the table at https://docs.python.org/3/library/venv.html#how-venvs-work.

Then install the dependencies into the environment:

python -m pip install -r requirements.txt
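
As a quick sanity check before continuing, the following sketch confirms that the interpreter meets the Python 3.9 minimum listed in the prerequisites and that a virtual environment is active. The function names are illustrative and are not part of the codelab sources. The venv check assumes the standard behavior that sys.prefix differs from sys.base_prefix inside a venv.

```python
import sys

MIN_VERSION = (3, 9)  # minimum Python version stated in the prerequisites


def meets_minimum(version_info=None, minimum=MIN_VERSION):
    """Return True if the (major, minor, ...) tuple is at least `minimum`."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= tuple(minimum)


def in_virtualenv(prefix=None, base_prefix=None):
    """Return True when running inside a venv (prefix differs from base prefix)."""
    prefix = sys.prefix if prefix is None else prefix
    base_prefix = sys.base_prefix if base_prefix is None else base_prefix
    return prefix != base_prefix


if __name__ == "__main__":
    print("Python >= 3.9:", meets_minimum())
    print("venv active:", in_virtualenv())
```

Run it with the same python that you used for pip install; if either line prints False, revisit the setup steps above before moving on.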

3. Register the OpenTelemetry plugin

To add the gRPC OpenTelemetry plugin, we need a gRPC application. This codelab uses a simple gRPC HelloWorld client and server, which we will instrument with the gRPC OpenTelemetry plugin.

The first step is to register the OpenTelemetry plugin, configured with a Prometheus exporter, in the client. Open start_here/observability_greeter_client.py in your preferred editor. First, add the required imports and constants as follows:

import logging
import time

import grpc
import grpc_observability
import helloworld_pb2
import helloworld_pb2_grpc
from opentelemetry.exporter.prometheus import PrometheusMetricReader
from opentelemetry.sdk.metrics import MeterProvider
from prometheus_client import start_http_server

_SERVER_PORT = "50051"
_PROMETHEUS_PORT = 9465

Then change run() to look like this:

def run():
    # Start Prometheus client
    start_http_server(port=_PROMETHEUS_PORT, addr="0.0.0.0")
    meter_provider = MeterProvider(metric_readers=[PrometheusMetricReader()])

    otel_plugin = grpc_observability.OpenTelemetryPlugin(
        meter_provider=meter_provider
    )
    otel_plugin.register_global()

    with grpc.insecure_channel(target=f"localhost:{_SERVER_PORT}") as channel:
        stub = helloworld_pb2_grpc.GreeterStub(channel)
        # Continuously send RPCs every second.
        while True:
            try:
                response = stub.SayHello(helloworld_pb2.HelloRequest(name="You"))
                print(f"Greeter client received: {response.message}")
                time.sleep(1)
            except grpc.RpcError as rpc_error:
                print("Call failed with code: ", rpc_error.code())

    # Deregister is not called in this example, but this is required to clean up.
    otel_plugin.deregister_global()

The next step is to add the OpenTelemetry plugin to the server. Open start_here/observability_greeter_server.py and add the required imports and constants as follows:

from concurrent import futures
import logging
import time

import grpc
import grpc_observability
import helloworld_pb2
import helloworld_pb2_grpc
from opentelemetry.sdk.metrics import MeterProvider
from opentelemetry.exporter.prometheus import PrometheusMetricReader
from prometheus_client import start_http_server

_SERVER_PORT = "50051"
_PROMETHEUS_PORT = 9464

Then change serve() to look like this:

def serve():
    # Start Prometheus client
    start_http_server(port=_PROMETHEUS_PORT, addr="0.0.0.0")

    meter_provider = MeterProvider(metric_readers=[PrometheusMetricReader()])

    otel_plugin = grpc_observability.OpenTelemetryPlugin(
        meter_provider=meter_provider
    )
    otel_plugin.register_global()

    server = grpc.server(
        thread_pool=futures.ThreadPoolExecutor(max_workers=10),
    )
    helloworld_pb2_grpc.add_GreeterServicer_to_server(Greeter(), server)
    server.add_insecure_port("[::]:" + _SERVER_PORT)
    server.start()
    print("Server started, listening on " + _SERVER_PORT)

    server.wait_for_termination()

    # Deregister is not called in this example, but this is required to clean up.
    otel_plugin.deregister_global()

4. Run the example and view metrics

To start the server, run:

cd start_here
python -m observability_greeter_server

If the setup succeeded, the server prints the following output:

Server started, listening on 50051

While the server is running, run the client in another terminal:

# Run the below commands to cd to the working directory and activate virtual environment in the new terminal
cd grpc-codelabs/codelabs/grpc-python-opentelemetry/
source .venv/bin/activate

cd start_here
python -m observability_greeter_client

A successful run looks like this:

Greeter client received: Hello You
Greeter client received: Hello You
Greeter client received: Hello You

Because we set up the gRPC OpenTelemetry plugin to export metrics using Prometheus, those metrics are available at localhost:9464 for the server and at localhost:9465 for the client.

To view the client metrics, run:

curl localhost:9465/metrics

The output has the following form:

# HELP python_gc_objects_collected_total Objects collected during gc
# TYPE python_gc_objects_collected_total counter
python_gc_objects_collected_total{generation="0"} 241.0
python_gc_objects_collected_total{generation="1"} 163.0
python_gc_objects_collected_total{generation="2"} 0.0
# HELP python_gc_objects_uncollectable_total Uncollectable objects found during GC
# TYPE python_gc_objects_uncollectable_total counter
python_gc_objects_uncollectable_total{generation="0"} 0.0
python_gc_objects_uncollectable_total{generation="1"} 0.0
python_gc_objects_uncollectable_total{generation="2"} 0.0
# HELP python_gc_collections_total Number of times this generation was collected
# TYPE python_gc_collections_total counter
python_gc_collections_total{generation="0"} 78.0
python_gc_collections_total{generation="1"} 7.0
python_gc_collections_total{generation="2"} 0.0
# HELP python_info Python platform information
# TYPE python_info gauge
python_info{implementation="CPython",major="3",minor="10",patchlevel="9",version="3.10.9"} 1.0
# HELP process_virtual_memory_bytes Virtual memory size in bytes.
# TYPE process_virtual_memory_bytes gauge
process_virtual_memory_bytes 1.868988416e+09
# HELP process_resident_memory_bytes Resident memory size in bytes.
# TYPE process_resident_memory_bytes gauge
process_resident_memory_bytes 4.1680896e+07
# HELP process_start_time_seconds Start time of the process since unix epoch in seconds.
# TYPE process_start_time_seconds gauge
process_start_time_seconds 1.72375679833e+09
# HELP process_cpu_seconds_total Total user and system CPU time spent in seconds.
# TYPE process_cpu_seconds_total counter
process_cpu_seconds_total 0.38
# HELP process_open_fds Number of open file descriptors.
# TYPE process_open_fds gauge
process_open_fds 9.0
# HELP process_max_fds Maximum number of open file descriptors.
# TYPE process_max_fds gauge
process_max_fds 4096.0
# HELP target_info Target metadata
# TYPE target_info gauge
target_info{service_name="unknown_service",telemetry_sdk_language="python",telemetry_sdk_name="opentelemetry",telemetry_sdk_version="1.26.0"} 1.0
# HELP grpc_client_attempt_started_total Number of client call attempts started
# TYPE grpc_client_attempt_started_total counter
grpc_client_attempt_started_total{grpc_method="other",grpc_target="localhost:50051"} 18.0
# HELP grpc_client_attempt_sent_total_compressed_message_size_bytes Compressed message bytes sent per client call attempt
# TYPE grpc_client_attempt_sent_total_compressed_message_size_bytes histogram
grpc_client_attempt_sent_total_compressed_message_size_bytes_bucket{grpc_method="other",grpc_status="OK",grpc_target="localhost:50051",le="0.0"} 0.0
grpc_client_attempt_sent_total_compressed_message_size_bytes_bucket{grpc_method="other",grpc_status="OK",grpc_target="localhost:50051",le="5.0"} 18.0
grpc_client_attempt_sent_total_compressed_message_size_bytes_bucket{grpc_method="other",grpc_status="OK",grpc_target="localhost:50051",le="10.0"} 18.0
grpc_client_attempt_sent_total_compressed_message_size_bytes_bucket{grpc_method="other",grpc_status="OK",grpc_target="localhost:50051",le="25.0"} 18.0
grpc_client_attempt_sent_total_compressed_message_size_bytes_bucket{grpc_method="other",grpc_status="OK",grpc_target="localhost:50051",le="50.0"} 18.0
grpc_client_attempt_sent_total_compressed_message_size_bytes_bucket{grpc_method="other",grpc_status="OK",grpc_target="localhost:50051",le="75.0"} 18.0
grpc_client_attempt_sent_total_compressed_message_size_bytes_bucket{grpc_method="other",grpc_status="OK",grpc_target="localhost:50051",le="100.0"} 18.0
grpc_client_attempt_sent_total_compressed_message_size_bytes_bucket{grpc_method="other",grpc_status="OK",grpc_target="localhost:50051",le="250.0"} 18.0

Similarly, to view the server-side metrics, run:

curl localhost:9464/metrics
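
The endpoint output mixes Python runtime and process metrics with the gRPC ones. It uses the Prometheus text exposition format: "# HELP" and "# TYPE" comment lines followed by samples of the form name{labels} value. The sketch below filters that text down to the gRPC samples. It is a deliberately simplified parser (it assumes label values contain no escaped quotes or closing braces and that samples carry no timestamps), which is enough for the output shown above but is not a full implementation of the format. parse_samples and SAMPLE_OUTPUT are illustrative names.

```python
import re

# Matches `name{label="v",...} value` or `name value`.
_SAMPLE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)'
    r'(?:\{(?P<labels>[^}]*)\})?'
    r'\s+(?P<value>\S+)$'
)
_LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def parse_samples(text, prefix=""):
    """Return (name, labels, value) tuples for samples whose name starts with prefix."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = _SAMPLE_RE.match(line)
        if match is None or not match.group("name").startswith(prefix):
            continue
        labels = dict(_LABEL_RE.findall(match.group("labels") or ""))
        samples.append((match.group("name"), labels, float(match.group("value"))))
    return samples


# A short excerpt in the same shape as the client output above.
SAMPLE_OUTPUT = """\
# HELP process_open_fds Number of open file descriptors.
# TYPE process_open_fds gauge
process_open_fds 9.0
# HELP grpc_client_attempt_started_total Number of client call attempts started
# TYPE grpc_client_attempt_started_total counter
grpc_client_attempt_started_total{grpc_method="other",grpc_target="localhost:50051"} 18.0
"""

grpc_samples = parse_samples(SAMPLE_OUTPUT, prefix="grpc_")
```

To run it against a live process, replace SAMPLE_OUTPUT with the text returned by the curl commands above.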

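5. View metrics in Prometheus

Here, we set up a Prometheus instance that scrapes the example gRPC client and server, both of which export metrics using Prometheus.

Download the latest release of Prometheus for your platform from the provided link, or use the following command: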

curl -sLO https://github.com/prometheus/prometheus/releases/download/v3.7.3/prometheus-3.7.3.linux-amd64.tar.gz

Then extract the archive and change into the extracted directory:

tar xvfz prometheus-*.tar.gz
cd prometheus-*/

Create the Prometheus configuration file as follows:

cat > grpc_otel_python_prometheus.yml <<EOF
scrape_configs:
  - job_name: "prometheus"
    scrape_interval: 5s
    static_configs:
      - targets: ["localhost:9090"]
  - job_name: "grpc-otel-python"
    scrape_interval: 5s
    static_configs:
      - targets: ["localhost:9464", "localhost:9465"]
EOF

Start Prometheus with the new configuration:

./prometheus --config.file=grpc_otel_python_prometheus.yml

This configures Prometheus to scrape metrics from the client and server codelab processes every 5 seconds.
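
If you change the metrics ports in the client or server, the targets in the configuration file have to change with them. As a small sketch, the following helper renders a single scrape job in the same layout as the file above. render_scrape_config is a hypothetical helper written for this illustration, and it emits only one job, not the additional "prometheus" self-scrape job from the codelab file.

```python
def render_scrape_config(job_name, targets, interval="5s"):
    """Return a one-job Prometheus scrape configuration as YAML text."""
    quoted = ", ".join(f'"{target}"' for target in targets)
    return (
        "scrape_configs:\n"
        f'  - job_name: "{job_name}"\n'
        f"    scrape_interval: {interval}\n"
        "    static_configs:\n"
        f"      - targets: [{quoted}]\n"
    )


config_text = render_scrape_config(
    "grpc-otel-python", ["localhost:9464", "localhost:9465"]
)
```

Writing config_text to a file and passing that file to --config.file should behave like the grpc-otel-python job above.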

Go to http://localhost:9090/graph to view the metrics. For example, the query

histogram_quantile(0.5, rate(grpc_client_attempt_duration_seconds_bucket[1m]))

displays a graph of the median attempt latency, using 1 minute as the time window for the quantile calculation.
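
To make the quantile calculation less opaque, here is a simplified sketch of how a quantile can be estimated from cumulative histogram buckets by linear interpolation, which is the general approach Prometheus documents for classic histograms. It is not the Prometheus implementation: it requires 0 < q <= 1, assumes non-negative bucket bounds and at least one finite bucket, and takes plain counts where the real query uses per-bucket rates. The bucket values are made up for illustration.

```python
import math


def histogram_quantile(q, buckets):
    """Estimate the q-quantile from cumulative (upper_bound, count) buckets.

    `buckets` must be sorted by upper bound and end with the +Inf bucket.
    """
    if not 0 < q <= 1:
        raise ValueError("q must be in (0, 1]")
    total = buckets[-1][1]
    if total == 0:
        return math.nan  # no observations in the window
    rank = q * total
    # First bucket whose cumulative count reaches the rank.
    index = next(i for i, (_, count) in enumerate(buckets) if count >= rank)
    upper, count = buckets[index]
    if math.isinf(upper):
        # Quantile falls in the +Inf bucket: report the highest finite bound.
        return buckets[index - 1][0]
    if index == 0:
        lower, below = 0.0, 0.0  # the lowest bucket is assumed to start at 0
    else:
        lower, below = buckets[index - 1]
    # Interpolate linearly inside the selected bucket.
    return lower + (upper - lower) * (rank - below) / (count - below)


# Illustrative latency buckets in seconds: (le, cumulative count).
BUCKETS = [(0.001, 2), (0.002, 10), (0.005, 18), (math.inf, 18)]
median = histogram_quantile(0.5, BUCKETS)
```

With these buckets the rank is 9 of 18 observations, which lands in the (0.001, 0.002] bucket, so the estimate is 0.001 + 0.001 * 7 / 8 = 0.001875 seconds. The result can never be more precise than the bucket boundaries allow.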

Rate of queries:

increase(grpc_client_attempt_duration_seconds_bucket[1m])
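
As a rough sketch of what an increase over a window means for a counter, the code below sums the growth between consecutive samples and treats any drop as a counter reset, such as a process restart. Prometheus additionally extrapolates the result to the edges of the window, which this sketch omits, so the numbers are only approximately comparable. The function names and sample values are illustrative.

```python
def counter_increase(samples):
    """Total increase across time-ordered (timestamp_seconds, value) samples.

    A drop in value is treated as a counter reset, so the value seen after
    the reset counts in full.
    """
    increase = 0.0
    for (_, prev), (_, curr) in zip(samples, samples[1:]):
        increase += (curr - prev) if curr >= prev else curr
    return increase


def per_second_rate(samples):
    """Average per-second rate over the time span covered by the samples."""
    elapsed = samples[-1][0] - samples[0][0]
    if elapsed <= 0:
        raise ValueError("need at least two samples at distinct times")
    return counter_increase(samples) / elapsed


# One RPC per second, sampled over one minute.
steady = [(0, 18.0), (30, 48.0), (60, 78.0)]
# The same window with a process restart before the last sample.
restarted = [(0, 50.0), (30, 80.0), (60, 10.0)]
```

For steady, the increase is 60 and the rate is 1 per second, which matches a client that sends one RPC per second.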

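6. (Optional) Exercise for the user

In the Prometheus dashboard, you will notice that the QPS is low. See whether you can identify the suspicious code in the example that limits the QPS.

For the enthusiastic: the client code limits itself to a single pending RPC at any given moment. This can be modified so that the client sends more RPCs without waiting for the previous ones to complete. (A solution for this is not provided.)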
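
As a generic illustration of the idea, not the codelab's solution, the sketch below keeps several blocking calls pending at once by using a thread pool. fake_rpc is a hypothetical stand-in for a blocking unary call such as stub.SayHello, so the example runs without a gRPC server; send_concurrently and max_in_flight are likewise names invented for this sketch.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_rpc(name, latency=0.05):
    """Hypothetical stand-in for a blocking unary call such as stub.SayHello."""
    time.sleep(latency)
    return f"Hello {name}"


def send_concurrently(call, names, max_in_flight=4):
    """Invoke `call` for each name with up to `max_in_flight` calls pending."""
    with ThreadPoolExecutor(max_workers=max_in_flight) as pool:
        # map() preserves input order even though the calls overlap in time.
        return list(pool.map(call, names))


replies = send_concurrently(fake_rpc, ["a", "b", "c", "d"])
```

With max_in_flight=1 the calls run one after another, as in the original client; larger values let them overlap, which is what raises the achievable QPS.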