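Vertex AI Online Prediction Baseline Testing with HEY

1. Introduction

This tutorial shows you how to create and evaluate Cloud Monitoring online prediction metrics while running baseline tests with the HEY web performance tool from us-central1 and us-west1 against a prediction endpoint deployed in us-central1.

What you'll build

You'll set up a VPC network called aiml-vpc that consists of subnets and instances in us-west1 and us-central1. The instances use HEY to generate traffic that targets an online prediction endpoint and model deployed in us-central1.

The tutorial also incorporates Private Service Connect and private DNS to demonstrate how on-premises and multi-cloud environments can use PSC to access googleapis.

Cloud Monitoring and Network Intelligence are used to validate the traffic that HEY generates toward the online prediction endpoint. Although the steps in this tutorial are deployed in a VPC, you can use the same steps to deploy HEY and obtain a baseline of the Vertex APIs from on-premises or multi-cloud environments.

The use cases are as follows: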

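Access online prediction in us-central1 from a GCE instance in us-west1 using HEY
    Verify that PSC is used to access the Vertex API
    Use HEY to send prediction requests for 10 minutes
    Validate latency using Cloud Monitoring
    Validate inter-region latency using Network Intelligence
Access online prediction in us-central1 from a GCE instance in us-central1 using HEY
    Verify that PSC is used to access the Vertex API
    Use HEY to send prediction requests for 10 minutes
    Validate latency using Cloud Monitoring
    Validate intra-region latency using Network Intelligence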

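What you'll learn

How to create a Private Service Connect endpoint
How to generate load against online prediction using HEY
How to create Vertex AI metrics using Cloud Monitoring
How to use Network Intelligence to validate intra-region and inter-region latency

What you'll need

Google Cloud project

IAM permissions

Compute Network Admin

Service Directory Editor

DNS Administrator

Network Management Viewer

2. Before you begin

Update the project to support the tutorial

This tutorial uses $variables to help implement the gcloud configuration in Cloud Shell.

Inside Cloud Shell, perform the following: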

gcloud config list project
gcloud config set project [YOUR-PROJECT-NAME]
projectid=YOUR-PROJECT-NAME
echo $projectid

3. aiml-vpc setup

Create the aiml-vpc

Inside Cloud Shell, perform the following:

gcloud compute networks create aiml-vpc --project=$projectid --subnet-mode=custom

Inside Cloud Shell, enable the Network Management API for Network Intelligence.

gcloud services enable networkmanagement.googleapis.com

Create the user-managed notebook subnet and the client subnets

Inside Cloud Shell, create the workbench-subnet.

gcloud compute networks subnets create workbench-subnet --project=$projectid --range=172.16.10.0/28 --network=aiml-vpc --region=us-central1 --enable-private-ip-google-access

Inside Cloud Shell, create the us-west1-subnet.

gcloud compute networks subnets create us-west1-subnet --project=$projectid --range=192.168.10.0/28 --network=aiml-vpc --region=us-west1

Inside Cloud Shell, create the us-central1-subnet.

gcloud compute networks subnets create us-central1-subnet --project=$projectid --range=192.168.20.0/28 --network=aiml-vpc --region=us-central1
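
Before moving on, it can help to sanity-check the three ranges offline. The sketch below uses only the Python standard library and the CIDR ranges from the commands above. The helper names are illustrative, and the count of 4 reserved addresses per primary range is stated here as an assumption about Google Cloud subnets rather than something taken from this tutorial.

```python
import ipaddress
from itertools import combinations

# Ranges copied from the gcloud commands above.
SUBNETS = {
    "workbench-subnet": "172.16.10.0/28",
    "us-west1-subnet": "192.168.10.0/28",
    "us-central1-subnet": "192.168.20.0/28",
}

def overlapping_pairs(subnets):
    """Return the pairs of subnet names whose CIDR ranges overlap."""
    nets = {name: ipaddress.ip_network(cidr) for name, cidr in subnets.items()}
    return [(a, b) for a, b in combinations(nets, 2) if nets[a].overlaps(nets[b])]

def usable_addresses(cidr, reserved=4):
    """Addresses left in a range after subtracting the reserved ones (assumed: 4)."""
    return ipaddress.ip_network(cidr).num_addresses - reserved

if __name__ == "__main__":
    print("overlaps:", overlapping_pairs(SUBNETS))
    for name, cidr in SUBNETS.items():
        print(name, usable_addresses(cidr))
```

A /28 holds 16 addresses, so each subnet leaves room for roughly a dozen instances, which is ample for the single client per region used here.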

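Cloud Router and NAT configuration

The tutorial uses Cloud NAT to download software packages because the GCE instances do not have an external IP address. Cloud NAT provides egress NAT only, which means that internet hosts cannot initiate communication with the instances or the user-managed notebook, making them more secure.

Inside Cloud Shell, create the Cloud Router for the us-west1 region.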

gcloud compute routers create cloud-router-us-west1-aiml-nat --network aiml-vpc --region us-west1

Inside Cloud Shell, create the regional Cloud NAT gateway for us-west1.

gcloud compute routers nats create cloud-nat-us-west1 --router=cloud-router-us-west1-aiml-nat --auto-allocate-nat-external-ips --nat-all-subnet-ip-ranges --region us-west1

Inside Cloud Shell, create the Cloud Router for the us-central1 region.

gcloud compute routers create cloud-router-us-central1-aiml-nat --network aiml-vpc --region us-central1

Inside Cloud Shell, create the regional Cloud NAT gateway for us-central1.

gcloud compute routers nats create cloud-nat-us-central1 --router=cloud-router-us-central1-aiml-nat --auto-allocate-nat-external-ips --nat-all-subnet-ip-ranges --region us-central1

4. Create the Private Service Connect endpoint

In the following section, you will create a Private Service Connect (PSC) endpoint that is used to access the Vertex API from the aiml-vpc.

From Cloud Shell

gcloud compute addresses create psc-ip \
    --global \
    --purpose=PRIVATE_SERVICE_CONNECT \
    --addresses=100.100.10.10 \
    --network=aiml-vpc

Store "pscendpointip" for the duration of the lab

pscendpointip=$(gcloud compute addresses list --filter=name:psc-ip --format="value(address)")

echo $pscendpointip

Create the PSC endpoint

From Cloud Shell

gcloud compute forwarding-rules create pscvertex \
    --global \
    --network=aiml-vpc \
    --address=psc-ip \
    --target-google-apis-bundle=all-apis

List the configured Private Service Connect endpoints

From Cloud Shell

gcloud compute forwarding-rules list  \
--filter target="(all-apis OR vpc-sc)" --global

Describe the configured Private Service Connect endpoints

From Cloud Shell

gcloud compute forwarding-rules describe \
    pscvertex --global
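
As a quick offline check, the sketch below confirms that the PSC endpoint address chosen above does not fall inside any of the subnet ranges created in section 3. It assumes, as I understand the PSC requirements for Google APIs endpoints, that the address should sit outside the VPC's subnet ranges. The function name is illustrative.

```python
import ipaddress

PSC_IP = "100.100.10.10"  # value passed to --addresses above
SUBNET_RANGES = ["172.16.10.0/28", "192.168.10.0/28", "192.168.20.0/28"]

def conflicting_ranges(ip, ranges):
    """Return every range in `ranges` that contains `ip`."""
    addr = ipaddress.ip_address(ip)
    return [r for r in ranges if addr in ipaddress.ip_network(r)]

if __name__ == "__main__":
    conflicts = conflicting_ranges(PSC_IP, SUBNET_RANGES)
    print("conflicts:", conflicts or "none")
```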

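5. Create a service account for the GCE instances

To provide a fine level of control over the Vertex API, a user-managed service account is required and is applied to the us-west1 and us-central1 instances. Once generated, the service account permissions can be modified based on business requirements. In this tutorial, the user-managed service account vertex-gce-sa (display name vertex-sa) has the following roles applied: Compute Instance Admin (v1) and Vertex AI User.

You must enable the Service Account API before proceeding.

Inside Cloud Shell, create the service account.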

gcloud iam service-accounts create vertex-gce-sa \
    --description="service account for vertex" \
    --display-name="vertex-sa"

Inside Cloud Shell, update the service account with the Compute Instance Admin (v1) role.

gcloud projects add-iam-policy-binding $projectid --member="serviceAccount:vertex-gce-sa@$projectid.iam.gserviceaccount.com" --role="roles/compute.instanceAdmin.v1"

Inside Cloud Shell, update the service account with the Vertex AI User role.

gcloud projects add-iam-policy-binding $projectid --member="serviceAccount:vertex-gce-sa@$projectid.iam.gserviceaccount.com" --role="roles/aiplatform.user"

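6. Create a user-managed service account (Notebooks)

In the following section, you will create a user-managed service account that is associated with the Vertex Workbench (Notebook) used in the tutorial.

In this tutorial, the service account has the following roles applied: Storage Admin, Vertex AI User, and Artifact Registry Administrator.

Inside Cloud Shell, create the service account.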

gcloud iam service-accounts create user-managed-notebook-sa \
    --display-name="user-managed-notebook-sa"

Inside Cloud Shell, update the service account with the Storage Admin role.

gcloud projects add-iam-policy-binding $projectid --member="serviceAccount:user-managed-notebook-sa@$projectid.iam.gserviceaccount.com" --role="roles/storage.admin"

Inside Cloud Shell, update the service account with the Vertex AI User role.

gcloud projects add-iam-policy-binding $projectid --member="serviceAccount:user-managed-notebook-sa@$projectid.iam.gserviceaccount.com" --role="roles/aiplatform.user"

Inside Cloud Shell, update the service account with the Artifact Registry Administrator role.

gcloud projects add-iam-policy-binding $projectid --member="serviceAccount:user-managed-notebook-sa@$projectid.iam.gserviceaccount.com" --role="roles/artifactregistry.admin"
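
The three bindings above differ only in the role. The following sketch shows how the same command strings could be generated from a list of roles, which may be convenient if you adjust the roles to your own requirements. The function name and the example project ID are hypothetical. The sketch only builds the strings and does not call gcloud.

```python
import shlex

# Roles taken from the three commands above.
NOTEBOOK_SA_ROLES = [
    "roles/storage.admin",
    "roles/aiplatform.user",
    "roles/artifactregistry.admin",
]

def binding_command(project_id, account_id, role):
    """Build one `gcloud projects add-iam-policy-binding` command line."""
    member = f"serviceAccount:{account_id}@{project_id}.iam.gserviceaccount.com"
    return shlex.join([
        "gcloud", "projects", "add-iam-policy-binding", project_id,
        f"--member={member}", f"--role={role}",
    ])

if __name__ == "__main__":
    # "my-project" is a placeholder project ID.
    for role in NOTEBOOK_SA_ROLES:
        print(binding_command("my-project", "user-managed-notebook-sa", role))
```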

Inside Cloud Shell, list the service accounts and note the email address to use when creating the user-managed notebook.

gcloud iam service-accounts list

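7. Create the test instances

In the following section, you will create the test instances used to run the baseline tests from us-west1 and us-central1.

Inside Cloud Shell, create the west-client.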

gcloud compute instances create west-client \
    --zone=us-west1-a \
    --image-family=debian-11 \
    --image-project=debian-cloud \
    --subnet=us-west1-subnet \
    --scopes=https://www.googleapis.com/auth/cloud-platform \
    --no-address \
    --shielded-secure-boot --service-account=vertex-gce-sa@$projectid.iam.gserviceaccount.com \
    --metadata startup-script="#! /bin/bash
      sudo apt-get update
      sudo apt-get install tcpdump dnsutils -y"

Inside Cloud Shell, create the central-client.

gcloud compute instances create central-client \
    --zone=us-central1-a \
    --image-family=debian-11 \
    --image-project=debian-cloud \
    --subnet=us-central1-subnet \
    --scopes=https://www.googleapis.com/auth/cloud-platform \
    --no-address \
    --shielded-secure-boot --service-account=vertex-gce-sa@$projectid.iam.gserviceaccount.com \
    --metadata startup-script="#! /bin/bash
      sudo apt-get update
      sudo apt-get install tcpdump dnsutils -y"

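To allow IAP to connect to your VM instances, create a firewall rule that:

Applies to all VM instances that you want to be accessible by using IAP.
Allows ingress traffic from the IP range 35.235.240.0/20. This range contains all IP addresses that IAP uses for TCP forwarding.

Inside Cloud Shell, create the IAP firewall rule.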

gcloud compute firewall-rules create ssh-iap-vpc \
    --network aiml-vpc \
    --allow tcp:22 \
    --source-ranges=35.235.240.0/20
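
To see what the source range in this rule covers, the short sketch below checks whether a given address belongs to 35.235.240.0/20. It is an offline illustration using the Python standard library only. The sample addresses in the usage lines are arbitrary.

```python
import ipaddress

# Source range used by the ssh-iap-vpc firewall rule above.
IAP_RANGE = ipaddress.ip_network("35.235.240.0/20")

def is_iap_source(ip):
    """True when `ip` falls inside the IAP TCP forwarding range."""
    return ipaddress.ip_address(ip) in IAP_RANGE

if __name__ == "__main__":
    print(IAP_RANGE[0], "to", IAP_RANGE[-1], f"({IAP_RANGE.num_addresses} addresses)")
    print(is_iap_source("35.235.241.7"), is_iap_source("35.236.0.1"))
```

A packet capture on the instance (tcpdump is installed by the startup script) should show SSH sessions arriving from addresses inside this range.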

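8. Create a user-managed notebook

Enable the Notebooks API before continuing.

In the following section, create a user-managed notebook that incorporates the previously created service account, user-managed-notebook-sa.

Inside Cloud Shell, create the workbench-tutorial instance.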

gcloud notebooks instances create workbench-tutorial \
      --vm-image-project=deeplearning-platform-release \
      --vm-image-family=common-cpu-notebooks \
      --machine-type=n1-standard-4 \
      --location=us-central1-a \
      --subnet-region=us-central1 \
      --shielded-secure-boot \
      --subnet=workbench-subnet \
      --no-public-ip    --service-account=user-managed-notebook-sa@$projectid.iam.gserviceaccount.com

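Navigate to Vertex AI → Workbench to view your deployed notebook.

9. Deploy the model and online prediction

In the following section, use the codelab "Vertex AI: Use custom prediction routines with Sklearn to preprocess and post process data for predictions". Start with section 7, since you already created a notebook in the previous step. Once the model is deployed, return to this tutorial to start the next section.

10. Create a custom monitoring dashboard for online prediction

Online Prediction creates a default monitoring dashboard under VERTEX AI → ONLINE PREDICTION → ENDPOINT NAME (diamonds-cpr_endpoint). However, this test requires a defined start and stop time, so a custom dashboard is needed.

In the following section, you will create Cloud Monitoring metrics that measure latency based on regional access to the online prediction endpoint. The goal is to validate the difference in latency when an endpoint in us-central1 is accessed from GCE instances deployed in us-west1 and in us-central1.

This tutorial uses the prediction_latencies metric. Additional metrics are available in aiplatform.

Metric	Description
prediction/online/prediction_latencies	Online prediction latency of the deployed model.

Create a chart for the prediction_latencies metric

From the Cloud console, navigate to MONITORING → Metrics Explorer.

Insert the metric prediction/online/prediction_latencies, select the following options, and then select Apply.

Update Group by based on the following option, and then select Save Chart.

After you select Save, you are prompted to select a dashboard. Select New Dashboard and provide a name.

Vertex Custom Dashboard

In the following section, validate that the Vertex Custom Dashboard displays the correct time.

Navigate to MONITORING → Dashboards, select Vertex Custom Dashboard, and then select the time. Ensure that your time zone is correct.

Make sure to expand the legend to obtain a table view.

Example expanded view:

11. Create private DNS for the PSC endpoint

Create a private DNS zone in the aiml-vpc to resolve all googleapis to the PSC endpoint IP address 100.100.10.10.

Inside Cloud Shell, create a private DNS zone.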

gcloud dns --project=$projectid managed-zones create psc-googleapis --description="Private Zone to resolve googleapis to a PSC endpoint" --dns-name="googleapis.com." --visibility="private" --networks="https://www.googleapis.com/compute/v1/projects/$projectid/global/networks/aiml-vpc"

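Inside Cloud Shell, create the A record that associates *.googleapis.com with the PSC IP. The record name is quoted so that the shell does not expand the asterisk.

gcloud dns --project=$projectid record-sets create "*.googleapis.com." --zone="psc-googleapis" --type="A" --ttl="300" --rrdatas="100.100.10.10"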
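
The wildcard record answers for every name below googleapis.com, including the regional Vertex hostname used later in this tutorial. The sketch below is a simplified illustration of that matching rule, not a DNS resolver: it ignores the case where a more specific record exists in the zone, and the function name is hypothetical.

```python
def covered_by_wildcard(hostname, zone="googleapis.com"):
    """True when `hostname` is a strict subdomain of `zone`.

    Simplified model of a '*.<zone>' record: the zone apex itself is not
    covered, and names outside the zone never match.
    """
    host = hostname.rstrip(".").lower()
    return host != zone and host.endswith("." + zone)

if __name__ == "__main__":
    for name in ("us-central1-aiplatform.googleapis.com",
                 "googleapis.com",
                 "example.com"):
        print(name, covered_by_wildcard(name))
```

On a client instance, the same expectation can be confirmed with dig (installed by the startup script as part of dnsutils), which should return 100.100.10.10 for the Vertex hostname.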

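12. HEY testing variables

HEY lets you customize tests based on network and application requirements. This tutorial uses the options below, followed by a sample execution string:

c == number of workers (1 in this tutorial)

z == duration of the test

m == HTTP method (POST)

D == HTTP request body from a file (instances.json)

n == number of requests to run. Default is 200. Not used in this tutorial, which sets a duration with -z instead.

Sample HEY execution string (execution not required)

user@us-central$ ./hey_linux_amd64 -c 1 -z 1m -m POST -D instances.json  -H "Authorization: Bearer $(gcloud auth print-access-token)" -H "Content-Type: application/json" https://us-central1-aiplatform.googleapis.com/v1/projects/${projectid}/locations/us-central1/endpoints/${ENDPOINT_ID}:predict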
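
To make the role of each option explicit, the following sketch assembles the same argument list in Python. It only builds the list and does not run HEY. The function name, the TOKEN placeholder, and the example URL are illustrative assumptions. As far as I know from the HEY usage text, a duration given with -z takes precedence over -n.

```python
import shlex

def hey_args(url, token, duration="1m", workers=1, body_file="instances.json"):
    """Return the argv list for a HEY run using the options described above."""
    return [
        "./hey_linux_amd64",
        "-c", str(workers),        # number of concurrent workers
        "-z", duration,            # run for this long instead of a fixed request count
        "-m", "POST",              # HTTP method
        "-D", body_file,           # request body read from a file
        "-H", f"Authorization: Bearer {token}",
        "-H", "Content-Type: application/json",
        url,
    ]

if __name__ == "__main__":
    # TOKEN and the URL are placeholders; substitute real values before running.
    args = hey_args("https://example.invalid/v1/endpoint:predict", "TOKEN", duration="10m")
    print(shlex.join(args))
```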

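13. Obtain the prediction endpoint ID

Obtain your online prediction endpoint ID from the Cloud console. It is used in the subsequent steps.

Navigate to VERTEX AI → ONLINE PREDICTION.

14. Download and run HEY (us-west1)

In the following section, you will log in to west-client to download HEY and run it against the online prediction endpoint located in us-central1.

From Cloud Shell, log in to west-client.

gcloud compute ssh west-client --project=$projectid --zone=us-west1-a --tunnel-through-iap

From the OS, download HEY and update its permissions.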

wget https://hey-release.s3.us-east-2.amazonaws.com/hey_linux_amd64
chmod +x hey_linux_amd64

From the OS, create the following variables:

gcloud config list project
gcloud config set project [YOUR-PROJECT-NAME]
projectid=YOUR-PROJECT-NAME
echo $projectid
ENDPOINT_ID="insert-your-endpoint-id-here"

Example:

ENDPOINT_ID="2706243362607857664"
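
The two variables are combined into the prediction URL used by curl and HEY in the following steps. As a minimal sketch, the function below builds that URL and rejects the placeholder value, which is an easy mistake to make when copying the commands. The function name is hypothetical. The project ID and endpoint ID in the usage lines are the ones that appear in the sample output of this tutorial.

```python
PLACEHOLDER = "insert-your-endpoint-id-here"

def predict_url(project_id, endpoint_id, region="us-central1"):
    """Build the regional Vertex AI :predict URL used in this tutorial."""
    if not project_id or not endpoint_id or endpoint_id == PLACEHOLDER:
        raise ValueError("set projectid and ENDPOINT_ID to real values first")
    return (
        f"https://{region}-aiplatform.googleapis.com/v1/projects/{project_id}"
        f"/locations/{region}/endpoints/{endpoint_id}:predict"
    )

if __name__ == "__main__":
    print(predict_url("new-test-project-396322", "2706243362607857664"))
```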

In the following section, you will create an instances.json file using the vi editor or nano and insert the data string used to obtain a prediction from the deployed model.

From the west-client OS, create an instances.json file with the data string below. String values use double quotes, as strict JSON requires.

{"instances": [
  [0.23, "Ideal", "E", "VS2", 61.5, 55.0, 3.95, 3.98, 2.43],
  [0.29, "Premium", "J", "Internally Flawless", 52.5, 49.0, 4.00, 2.13, 3.11]]}

Example:

user@west-client:$ more instances.json 
{"instances": [
  [0.23, "Ideal", "E", "VS2", 61.5, 55.0, 3.95, 3.98, 2.43],
  [0.29, "Premium", "J", "Internally Flawless", 52.5, 49.0, 4.00, 2.13, 3.11]]}

user@west-client:$
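
If you prefer not to type the request body by hand, a JSON serializer can produce it, which rules out quoting mistakes because JSON string values are always emitted with double quotes. The sketch below is one way to do that with the Python standard library. The two rows are the ones from this tutorial, and the function name is illustrative.

```python
import json

# The two diamond records used as the request body in this tutorial.
INSTANCES = {
    "instances": [
        [0.23, "Ideal", "E", "VS2", 61.5, 55.0, 3.95, 3.98, 2.43],
        [0.29, "Premium", "J", "Internally Flawless", 52.5, 49.0, 4.00, 2.13, 3.11],
    ]
}

def render_instances(payload=INSTANCES):
    """Serialize the request body as strict JSON text."""
    return json.dumps(payload)

if __name__ == "__main__":
    # Write the body to the file name that curl and HEY read below.
    with open("instances.json", "w", encoding="utf-8") as handle:
        handle.write(render_instances())
    print(render_instances())
```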

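Pre-test

From the OS, run curl to validate that the model and prediction endpoint are working. Note the PSC endpoint IP in the verbose log and the HTTP/2 200 that indicates success.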

curl -v -X POST -H "Authorization: Bearer $(gcloud auth print-access-token)" -H "Content-Type: application/json" https://us-central1-aiplatform.googleapis.com/v1/projects/${projectid}/locations/us-central1/endpoints/${ENDPOINT_ID}:predict -d @instances.json

Example: note the PSC IP address used to access the prediction endpoint and the successful outcome.

user@west-client:$ curl -v -X POST -H "Authorization: Bearer $(gcloud auth print-access-token)" -H "Content-Type: application/json" https://us-central1-aiplatform.googleapis.com/v1/projects/${projectid}/locations/us-central1/endpoints/${ENDPOINT_ID}:predict -d @instances.json
Note: Unnecessary use of -X or --request, POST is already inferred.
*   Trying 100.100.10.10:443...
* Connected to us-central1-aiplatform.googleapis.com (100.100.10.10) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
*  CAfile: /etc/ssl/certs/ca-certificates.crt
*  CApath: /etc/ssl/certs
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* TLSv1.3 (IN), TLS handshake, Server hello (2):
* TLSv1.3 (IN), TLS handshake, Encrypted Extensions (8):
* TLSv1.3 (IN), TLS handshake, Certificate (11):
* TLSv1.3 (IN), TLS handshake, CERT verify (15):
* TLSv1.3 (IN), TLS handshake, Finished (20):
* TLSv1.3 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.3 (OUT), TLS handshake, Finished (20):
* SSL connection using TLSv1.3 / TLS_AES_256_GCM_SHA384
* ALPN, server accepted to use h2
* Server certificate:
*  subject: CN=upload.video.google.com
*  start date: Jul 31 08:22:19 2023 GMT
*  expire date: Oct 23 08:22:18 2023 GMT
*  subjectAltName: host "us-central1-aiplatform.googleapis.com" matched cert's "*.googleapis.com"
*  issuer: C=US; O=Google Trust Services LLC; CN=GTS CA 1C3
*  SSL certificate verify ok.
* Using HTTP2, server supports multi-use
* Connection state changed (HTTP/2 confirmed)
* Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0
* Using Stream ID: 1 (easy handle 0x55a9f38b42c0)
> POST /v1/projects/new-test-project-396322/locations/us-central1/endpoints/2706243362607857664:predict HTTP/2
> Host: us-central1-aiplatform.googleapis.com
> user-agent: curl/7.74.0
> accept: */*
> authorization: Bearer ya29.c.b0Aaekm1LqrcaOlWFFwuEOWX_tZVXXvJgN_K-u5_hFyEAYXAi3AnBEBwwtHS8dweW_P2QGfdyFfa31nMT_6BaKBI0mC9IsfzfIiUwXc8u2yJt01gTUSJpCmGAFKZKidRMgkPYivVYCnuymzdYbRAWacIe__StkRzI9UeQOGN3jNIeESr80AdH12goaxCFXWaNWxoYRfGVhekEgUcsKs7t1OhOM-937gy4YGkXcXa8sGuHWRqF5bnulYlTqlxqQ2aAxMTrQg2lwUWRGCmGhPrym7rXJq7oim0DkAJSbAarl1qFuz0PPfNXeHGbs13zY2r1giV7u8_w4Umj_Q5M7H9fTkq7EiqnLzqRkOHXismYL368P1jOUBYM__krFQt4M3X9RJa0g01tOw3FnOh27BmUqlFQ1J2h14JZpx215Q3xzRvgfJ5iW5YYSkv67uZRQk4V04naOUXyc0plzWuVOjj4nor3fYvkS_oW0IyxJoBjeXR16Vnvln8c04svWX9dt7eobczFvBOm9nVdh4lVp8qxbp__2WtMvc1QVg6y-2i6lRpbvmyp1oadxVRjxV1e0wiQFSe-qqsinJu3bnnaMbxdU2cu5j26o8o8Xpgo0SF1UM0b1WX84iatbWpdFSphZm1llwmRagMzcFBW0aBk-i35_bXSbzwURgMfY6Qbyb9Rv9y0F-Maf34I0WxiMldv2uc57nej7dVl9OSm_Ohnro-i9zcpq9fxo9soYVB8WjaZOUjauk4znstc2_6y4atcVVsQBkeU674biR567Ri3M74Jfv4MrrF02ObfrJRdB7UJ4MU_9kWW-kYeeJzoci15UqYV0f_yJgReBwQa66Supmebee2Sn2nku6xZkRMu5Mz55mXuva0XWrpIbor7WckSsXwUFbf7rj5ipa4mOOyf2hJe1Rq0x6yeBaariRzXrhfm5bBpFBU73-zd-IekvOji0ZJQSkk0o6gpX_794Jny7j14aQJ8VxezcFpZUztimYhMnRhlO2lqms1h0h48
> content-type: application/json
> content-length: 158
> 
* Connection state changed (MAX_CONCURRENT_STREAMS == 100)!
* We are completely uploaded and fine
< HTTP/2 200 
< x-vertex-ai-internal-prediction-backend: harpoon
< content-type: application/json; charset=UTF-8
< date: Sun, 20 Aug 2023 03:51:54 GMT
< vary: X-Origin
< vary: Referer
< vary: Origin,Accept-Encoding
< server: scaffolding on HTTPServer2
< cache-control: private
< x-xss-protection: 0
< x-frame-options: SAMEORIGIN
< x-content-type-options: nosniff
< accept-ranges: none
< 
{
  "predictions": [
    "$479.0",
    "$586.0"
  ],
  "deployedModelId": "3587550310781943808",
  "model": "projects/884291964428/locations/us-central1/models/6829574694488768512",
  "modelDisplayName": "diamonds-cpr",
  "modelVersionId": "1"
}
* Connection #0 to host us-central1-aiplatform.googleapis.com left intact
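
The response body returns each prediction as a dollar-formatted string. If you want to post-process the result, for example to compare outputs between runs, the strings can be converted to numbers as in the sketch below. The sample text is a shortened copy of the response above, and the function name is hypothetical.

```python
import json

# Shortened copy of the response body shown in the curl output above.
RESPONSE = """
{
  "predictions": ["$479.0", "$586.0"],
  "deployedModelId": "3587550310781943808",
  "modelDisplayName": "diamonds-cpr",
  "modelVersionId": "1"
}
"""

def predicted_prices(response_text):
    """Convert the '$'-prefixed prediction strings into floats."""
    payload = json.loads(response_text)
    return [float(value.lstrip("$")) for value in payload["predictions"]]

if __name__ == "__main__":
    print(predicted_prices(RESPONSE))
```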

Run HEY

From the OS, run HEY to start a 10-minute baseline test.

./hey_linux_amd64 -c 1 -z 10m -m POST -D instances.json  -H "Authorization: Bearer $(gcloud auth print-access-token)" -H "Content-Type: application/json" https://us-central1-aiplatform.googleapis.com/v1/projects/$projectid/locations/us-central1/endpoints/${ENDPOINT_ID}:predict

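15. HEY validation (us-west1)

Now that you have run HEY from a compute instance in us-west1, evaluate the results from the following:

HEY results
Vertex Custom Dashboard
Network Intelligence

HEY results

From the OS, validate the HEY results based on the 10-minute run:

17.5826 requests per second

99% in 0.0686 secs | 68 ms

10,550 responses with a 200 status code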

user@west-client:$ ./hey_linux_amd64 -c 1 -z 10m -m POST -D instances.json  -H "Authorization: Bearer $(gcloud auth print-access-token)" -H "Content-Type: application/json" https://us-central1-aiplatform.googleapis.com/v1/projects/$projectid/locations/us-central1/endpoints/${ENDPOINT_ID}:predict

Summary:
  Total:        600.0243 secs
  Slowest:      0.3039 secs
  Fastest:      0.0527 secs
  Average:      0.0569 secs
  Requests/sec: 17.5826
  

Response time histogram:
  0.053 [1]     |
  0.078 [10514] |■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■
  0.103 [16]    |
  0.128 [4]     |
  0.153 [3]     |
  0.178 [1]     |
  0.203 [0]     |
  0.229 [2]     |
  0.254 [1]     |
  0.279 [5]     |
  0.304 [3]     |


Latency distribution:
  10% in 0.0546 secs
  25% in 0.0551 secs
  50% in 0.0559 secs
  75% in 0.0571 secs
  90% in 0.0596 secs
  95% in 0.0613 secs
  99% in 0.0686 secs

Details (average, fastest, slowest):
  DNS+dialup:   0.0000 secs, 0.0527 secs, 0.3039 secs
  DNS-lookup:   0.0000 secs, 0.0000 secs, 0.0116 secs
  req write:    0.0000 secs, 0.0000 secs, 0.0002 secs
  resp wait:    0.0567 secs, 0.0526 secs, 0.3038 secs
  resp read:    0.0001 secs, 0.0001 secs, 0.0696 secs

Status code distribution:
  [200] 10550 responses
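
When several runs need to be compared, it can be easier to extract the key figures from the report than to read them by eye. The following sketch parses the fields highlighted in this section with regular expressions. It assumes the report layout shown above, the sample text is a trimmed copy of that output, and the function names are illustrative.

```python
import re

# Trimmed copy of the west-client report shown above.
SAMPLE = """\
Summary:
  Total:        600.0243 secs
  Average:      0.0569 secs
  Requests/sec: 17.5826

Latency distribution:
  99% in 0.0686 secs

Status code distribution:
  [200] 10550 responses
"""

def _find(pattern, text):
    """Return the first capture group or fail loudly if the field is missing."""
    match = re.search(pattern, text)
    if match is None:
        raise ValueError(f"pattern not found: {pattern}")
    return match.group(1)

def parse_hey(text):
    """Extract the figures this tutorial compares from a HEY report."""
    return {
        "total_s": float(_find(r"Total:\s+([\d.]+) secs", text)),
        "requests_per_s": float(_find(r"Requests/sec:\s+([\d.]+)", text)),
        "p99_ms": float(_find(r"99% in ([\d.]+) secs", text)) * 1000,
        "ok_responses": int(_find(r"\[200\]\s+(\d+) responses", text)),
    }

if __name__ == "__main__":
    print(parse_hey(SAMPLE))
```

As a consistency check, the request rate multiplied by the total duration should come out close to the number of responses, which it does for this run (about 10,550).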

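Vertex Custom Dashboard

Navigate to MONITORING → Dashboards and select Vertex Custom Dashboard. Enter 10m, or specify your start and stop time. Ensure that your time zone is correct.

The definition of Prediction Latencies indicates a server-side metric that measures the total time to respond to the client's request after obtaining a response from the model.

Total latency duration: the total time that a request spends in the service, which is the model latency plus the overhead latency.

In contrast, HEY reports a client-side metric that takes the following parameters into account:

Client request + total latency (includes model latency) + client response

Network Intelligence

Now take a look at the inter-region network latency reported by Network Intelligence to get an idea of the us-west1 to us-central1 latency reported by Google Cloud Platform.

Navigate to Cloud console Network Intelligence → Performance Dashboard and select the options detailed in the screenshot, which indicates a latency of 32 to 39 ms.

HEY us-west1 baseline summary

Comparing the total latency reported by the test tools yields approximately the same latency as reported by HEY. Inter-region latency contributes the bulk of the latency. The next series of tests shows how the central client performs.

Latency tool	Duration
Network Intelligence: us-west1 to us-central1 latency	~32 to 39 ms
Cloud Monitoring: total prediction latency [99th percentile]	34.58 ms (99p)
Total latency reported by Google (network latency + prediction latency)	~66.58 to 73.58 ms
HEY client-side latency distribution	68 ms (99p)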
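
The "total latency reported by Google" row is the sum of the network latency range and the server-side prediction latency. The short sketch below reproduces that arithmetic with the figures from the table and checks that the HEY result falls inside the resulting range. The function name is illustrative.

```python
def estimated_total_ms(network_range_ms, server_p99_ms):
    """Add the server-side latency to both ends of the network latency range."""
    low, high = network_range_ms
    return (round(low + server_p99_ms, 2), round(high + server_p99_ms, 2))

NETWORK_RANGE_MS = (32, 39)   # Network Intelligence, us-west1 to us-central1
SERVER_P99_MS = 34.58         # Cloud Monitoring prediction latency (99p)
HEY_P99_MS = 68               # HEY client-side latency (99p)

if __name__ == "__main__":
    low, high = estimated_total_ms(NETWORK_RANGE_MS, SERVER_P99_MS)
    print(f"estimated total: {low} to {high} ms")
    print("HEY within range:", low <= HEY_P99_MS <= high)
```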

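16. Download and run HEY (us-central1)

In the following section, you will log in to central-client to download HEY and run it against the online prediction endpoint located in us-central1.

From Cloud Shell, log in to central-client.

gcloud compute ssh central-client --project=$projectid --zone=us-central1-a --tunnel-through-iap

From the OS, download HEY and update its permissions.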

wget https://hey-release.s3.us-east-2.amazonaws.com/hey_linux_amd64
chmod +x hey_linux_amd64

From the OS, create the following variables:

gcloud config list project
gcloud config set project [YOUR-PROJECT-NAME]
projectid=YOUR-PROJECT-NAME
echo $projectid
ENDPOINT_ID="insert-your-endpoint-id-here"

Example:

ENDPOINT_ID="2706243362607857664"

In the following section, you will create an instances.json file using the vi editor or nano and insert the data string used to obtain a prediction from the deployed model.

From the central-client OS, create an instances.json file with the data string below. String values use double quotes, as strict JSON requires.

{"instances": [
  [0.23, "Ideal", "E", "VS2", 61.5, 55.0, 3.95, 3.98, 2.43],
  [0.29, "Premium", "J", "Internally Flawless", 52.5, 49.0, 4.00, 2.13, 3.11]]}

Example:

user@central-client:$ more instances.json 
{"instances": [
  [0.23, "Ideal", "E", "VS2", 61.5, 55.0, 3.95, 3.98, 2.43],
  [0.29, "Premium", "J", "Internally Flawless", 52.5, 49.0, 4.00, 2.13, 3.11]]}

user@central-client:$

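Pre-test

From the OS, run curl to validate that the model and prediction endpoint are working. Note the PSC endpoint IP in the verbose log and the HTTP/2 200 that indicates success.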

curl -v -X POST -H "Authorization: Bearer $(gcloud auth print-access-token)" -H "Content-Type: application/json" https://us-central1-aiplatform.googleapis.com/v1/projects/${projectid}/locations/us-central1/endpoints/${ENDPOINT_ID}:predict -d @instances.json

Example: note the PSC IP address used to access the prediction endpoint and the successful outcome.

user@central-client:~$ curl -v -X POST -H "Authorization: Bearer $(gcloud auth print-access-token)" -H "Content-Type: application/json" https://us-central1-aiplatform.googleapis.com/v1/projects/${projectid}/locations/us-central1/endpoints/${ENDPOINT_ID}:predict -d @instances.json
Note: Unnecessary use of -X or --request, POST is already inferred.
*   Trying 100.100.10.10:443...
* Connected to us-central1-aiplatform.googleapis.com (100.100.10.10) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
*  CAfile: /etc/ssl/certs/ca-certificates.crt
*  CApath: /etc/ssl/certs
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* TLSv1.3 (IN), TLS handshake, Server hello (2):
* TLSv1.3 (IN), TLS handshake, Encrypted Extensions (8):
* TLSv1.3 (IN), TLS handshake, Certificate (11):
* TLSv1.3 (IN), TLS handshake, CERT verify (15):
* TLSv1.3 (IN), TLS handshake, Finished (20):
* TLSv1.3 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.3 (OUT), TLS handshake, Finished (20):
* SSL connection using TLSv1.3 / TLS_AES_256_GCM_SHA384
* ALPN, server accepted to use h2
* Server certificate:
*  subject: CN=upload.video.google.com
*  start date: Jul 31 08:22:19 2023 GMT
*  expire date: Oct 23 08:22:18 2023 GMT
*  subjectAltName: host "us-central1-aiplatform.googleapis.com" matched cert's "*.googleapis.com"
*  issuer: C=US; O=Google Trust Services LLC; CN=GTS CA 1C3
*  SSL certificate verify ok.
* Using HTTP2, server supports multi-use
* Connection state changed (HTTP/2 confirmed)
* Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0
* Using Stream ID: 1 (easy handle 0x559b57adc2c0)
> POST /v1/projects/new-test-project-396322/locations/us-central1/endpoints/2706243362607857664:predict HTTP/2
> Host: us-central1-aiplatform.googleapis.com
> user-agent: curl/7.74.0
> accept: */*
> authorization: Bearer ya29.c.b0Aaekm1KWqq-CIXuL6f1cx9d9jHHquQq9tlSV1oVZ1y3TACi82JFFZRwsagVY7MMovycsU4PLkt9MDMkNngxZE5RzXcS-AoaUaQf1tPT9-_JMTlFI6wCcR7Yr9MeRF5AZblr_k52ZZgEZKeYGcrXoGiqGQcAAwFtHiEVAkUhLuyukteXbMoep1JM9E0zFblJj7Z0yOCMJYBH-6XHcIDYnOKpStMVBR2wcTDbnFrCE08HXbvRnQVcENatTBoI9FzSVL1ORwqUiCcdfnTSjpIXcyD-W82d6ZHjGX_RUhfnH7RPfOJqkuU8pOovwoCjq_jvM_wJUfPuQnBKHp5rxbYxPE349DMBql62po2SWFguuFo-a2eoUnb8-FQeBZqan65zgV0lexR73gZlm071y9grlXv3fmJUo7vlj5W-7_-FJXaWWg8iWc6rmjYeO1Wz2h_8qnmojkX9xSUciI6JfmwdgMWwtvwJb63ppSmdwf8oagrYiQlpMzgRI6rekbRzg-1WOBeOf5nRg5vtxUMSc9iRaoarO5XwFX8vt7rxOUBvbXYVWmo3bsdhzsS9VopMwgMlxgcIJg7bq7_F3iapB-nRjfjfhZWpR83cWIkI2Wb9f89inpsxtYjZbbzdWkZvRB8FYSsY8F8tcpiVoWWyQWZiph9z7O59fF9irWY2gtUnbFcJJ_ZcYztjlMQaR45y42ZflkM3Qn668bzge3Y3hmVI1s6ZSmxxq6m27hoMwVn21R07Y613jwljmaFJ5V8MwkR6yvFhYngrh_JrhRUQtSSMh02Rz25wMfv7g8Fiqymr-12viM4btIFjXZBM3XFqzvso_rw1omI1yYWofmbaBYggpegpJBzSeqVUZe791agjVtiMUkyjXFy__9gI0Qk9ZUarI4p25SvS4I1hX4YyBk6ol32Z5zIsVr1Seff__aklm6M2Mlkumd7nurm46hjOIoOhFpfFxrQ6yivnhYapBOJMYirgbZvigvI3dom1fnmt0-ktmRxp69w7Uzzy
> content-type: application/json
> content-length: 158
> 
* Connection state changed (MAX_CONCURRENT_STREAMS == 100)!
* We are completely uploaded and fine
< HTTP/2 200 
< x-vertex-ai-internal-prediction-backend: harpoon
< date: Sun, 20 Aug 2023 22:25:31 GMT
< content-type: application/json; charset=UTF-8
< vary: X-Origin
< vary: Referer
< vary: Origin,Accept-Encoding
< server: scaffolding on HTTPServer2
< cache-control: private
< x-xss-protection: 0
< x-frame-options: SAMEORIGIN
< x-content-type-options: nosniff
< accept-ranges: none
< 
{
  "predictions": [
    "$479.0",
    "$586.0"
  ],
  "deployedModelId": "3587550310781943808",
  "model": "projects/884291964428/locations/us-central1/models/6829574694488768512",
  "modelDisplayName": "diamonds-cpr",
  "modelVersionId": "1"
}
* Connection #0 to host us-central1-aiplatform.googleapis.com left intact

Run HEY

From the OS, run HEY to start a 10-minute baseline test.

./hey_linux_amd64 -c 1 -z 10m -m POST -D instances.json  -H "Authorization: Bearer $(gcloud auth print-access-token)" -H "Content-Type: application/json" https://us-central1-aiplatform.googleapis.com/v1/projects/$projectid/locations/us-central1/endpoints/${ENDPOINT_ID}:predict

17. HEY validation (us-central1)

Now that you have run HEY from a compute instance in us-central1, evaluate the results from the following:

HEY results
Vertex Custom Dashboard
Network Intelligence

HEY results

From the OS, validate the HEY results based on the 10-minute run:

44.9408 requests per second

99% in 0.0353 secs | 35 ms

26,965 responses with a 200 status code

user@central-client:~$ ./hey_linux_amd64 -c 1 -z 10m -m POST -D instances.json  -H "Authorization: Bearer $(gcloud auth print-access-token)" -H "Content-Type: application/json" https://us-central1-aiplatform.googleapis.com/v1/projects/$projectid/locations/us-central1/endpoints/${ENDPOINT_ID}:predict

Summary:
  Total:        600.0113 secs
  Slowest:      0.3673 secs
  Fastest:      0.0184 secs
  Average:      0.0222 secs
  Requests/sec: 44.9408
  

Response time histogram:
  0.018 [1]     |
  0.053 [26923] |■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■
  0.088 [25]    |
  0.123 [4]     |
  0.158 [0]     |
  0.193 [1]     |
  0.228 [9]     |
  0.263 [1]     |
  0.298 [0]     |
  0.332 [0]     |
  0.367 [1]     |


Latency distribution:
  10% in 0.0199 secs
  25% in 0.0205 secs
  50% in 0.0213 secs
  75% in 0.0226 secs
  90% in 0.0253 secs
  95% in 0.0273 secs
  99% in 0.0353 secs

Details (average, fastest, slowest):
  DNS+dialup:   0.0000 secs, 0.0184 secs, 0.3673 secs
  DNS-lookup:   0.0000 secs, 0.0000 secs, 0.0079 secs
  req write:    0.0000 secs, 0.0000 secs, 0.0007 secs
  resp wait:    0.0220 secs, 0.0182 secs, 0.3672 secs
  resp read:    0.0002 secs, 0.0001 secs, 0.0046 secs

Status code distribution:
  [200] 26965 responses
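
With a single worker (-c 1), HEY sends each request only after the previous response has arrived, so the request rate should be roughly the reciprocal of the average latency. The sketch below checks this relationship against the averages and rates reported for both clients. It is a back-of-the-envelope model that ignores client-side overhead, and the function name is illustrative.

```python
def expected_requests_per_s(avg_latency_s, workers=1):
    """Approximate throughput when each worker issues requests back to back."""
    return workers / avg_latency_s

# (average latency in seconds, reported requests/sec) from the two HEY reports.
RUNS = {
    "west-client": (0.0569, 17.5826),
    "central-client": (0.0222, 44.9408),
}

if __name__ == "__main__":
    for name, (avg_s, reported) in RUNS.items():
        estimate = expected_requests_per_s(avg_s)
        print(f"{name}: estimated {estimate:.2f} req/s, reported {reported} req/s")
```

This explains why the central client completes about 2.5 times as many requests in the same 10 minutes: the lower latency directly raises the rate a single worker can sustain.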

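Vertex Custom Dashboard

Navigate to MONITORING → Dashboards and select Vertex Custom Dashboard. Enter 10m, or specify your start and stop time. Ensure that your time zone is correct.

The prediction latency for the last 10 minutes is 30.533 ms.

The definition of Prediction Latencies indicates a server-side metric that measures the total time to respond to the client's request after obtaining a response from the model.

Total latency duration: the total time that a request spends in the service, which is the model latency plus the overhead latency.

In contrast, HEY reports a client-side metric that takes the following parameters into account:

Client request + total latency (includes model latency) + client response

Network Intelligence

Now take a look at the intra-region network latency reported by Network Intelligence to get an idea of the us-central1 latency reported by Google Cloud Platform.

Navigate to Cloud console Network Intelligence → Performance Dashboard and select the options detailed in the screenshot, which indicates a latency of 0.2 to 0.8 ms.

HEY us-central1 baseline summary

Comparing the total latency reported by the test tools yields lower latency than the west-client because the compute (central-client) and the Vertex endpoint (model and online prediction) are in the same region.

Latency tool	Duration
Network Intelligence: us-central1 intra-region latency	~0.2 to 0.8 ms
Cloud Monitoring: total prediction latency [99th percentile]	30.533 ms (99p)
Total latency reported by Google (network latency + prediction latency)	~30.733 to 31.333 ms
HEY client-side latency	35 ms (99p)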
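
Subtracting the server-side latency from the client-side latency gives a rough view of what the network path and client add in each region. The sketch below does this with the 99th percentile figures from the two summary tables. Percentiles are not strictly additive, so treat the result as an approximation. The function name is illustrative.

```python
def client_minus_server_ms(client_p99_ms, server_p99_ms):
    """Rough estimate of latency added outside the prediction service."""
    return round(client_p99_ms - server_p99_ms, 3)

# 99th percentile figures from the us-west1 and us-central1 summary tables.
WEST_GAP_MS = client_minus_server_ms(68, 34.58)
CENTRAL_GAP_MS = client_minus_server_ms(35, 30.533)

if __name__ == "__main__":
    print(f"us-west1 client adds about {WEST_GAP_MS} ms")
    print(f"us-central1 client adds about {CENTRAL_GAP_MS} ms")
```

The us-west1 gap of roughly 33 ms sits inside the 32 to 39 ms inter-region latency reported by Network Intelligence, while the us-central1 gap is only a few milliseconds, which is consistent with the conclusion that inter-region latency accounts for most of the difference.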

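18. Congratulations

Congratulations, you have successfully deployed and validated HEY to obtain a client-side prediction latency baseline, using Cloud Monitoring and Network Intelligence. Based on your testing, you identified that a prediction endpoint in us-central1 can be served inter-regionally, although with added latency.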

19. Clean up

From Cloud Shell, delete the tutorial components.

gcloud compute instances delete central-client --zone=us-central1-a -q

gcloud compute instances delete west-client --zone=us-west1-a -q

gcloud compute instances delete workbench-tutorial --zone=us-central1-a -q

gcloud compute forwarding-rules delete pscvertex --global --quiet 

gcloud compute addresses delete psc-ip --global --quiet

gcloud compute networks subnets delete workbench-subnet --region=us-central1 --quiet 

gcloud compute networks subnets delete us-west1-subnet --region=us-west1 --quiet

gcloud compute networks subnets delete us-central1-subnet --region=us-central1 --quiet

gcloud compute routers delete cloud-router-us-west1-aiml-nat --region=us-west1 --quiet

gcloud compute routers delete cloud-router-us-central1-aiml-nat --region=us-central1 --quiet

gcloud compute firewall-rules delete  ssh-iap-vpc --quiet

gcloud dns record-sets delete "*.googleapis.com." --zone=psc-googleapis --type=A --quiet

gcloud dns managed-zones delete psc-googleapis --quiet

gcloud compute networks delete aiml-vpc --quiet

gcloud storage rm -r gs://$projectid-cpr-bucket

From the Cloud console, delete the following:

Artifact Registry folder

From Vertex AI Model Registry, undeploy the model.

From Vertex AI Online Prediction, delete the endpoint.
