Vertex AI Online Prediction baseline testing with HEY

About this codelab

125 mins

Last updated: Aug 23, 2023

Written by Deepak Michael

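1. Introduction

This tutorial shows you how to create and assess Cloud Monitoring online prediction metrics while running baseline tests from us-central1 and us-west1 against a prediction endpoint deployed in us-central1, using the HEY web performance tool.

What you'll build

You'll set up a VPC network called aiml-vpc that consists of subnets and instances in us-west1 and us-central1. The instances use HEY to generate traffic that targets an online prediction endpoint and model deployed in us-central1.

Private Service Connect and private DNS are also incorporated in the tutorial to demonstrate how on-premises and multicloud environments can take advantage of PSC to access googleapis.

Cloud Monitoring and Network Intelligence are used in the tutorial to validate the traffic generated from HEY to the online prediction. Although the steps outlined in the tutorial are deployed in a VPC, you can use the same steps to deploy and obtain a baseline of the Vertex APIs from on-premises or multicloud environments.

The use cases are detailed below: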

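Access online prediction in us-central1 from a GCE instance in us-west1 using HEY
    Verify that PSC is being used to access the Vertex API
    Perform curl requests using HEY for 5 minutes
    Validate latency using Cloud Monitoring
    Validate inter-region latency using Network Intelligence
Access online prediction in us-central1 from a GCE instance in us-central1 using HEY
    Verify that PSC is being used to access the Vertex API
    Perform curl requests using HEY for 5 minutes
    Validate latency using Cloud Monitoring
    Validate intra-region latency using Network Intelligence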

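What you'll learn

How to establish a Private Service Connect endpoint
How to generate load to an online prediction using HEY
How to create Vertex AI metrics using Cloud Monitoring
How to use Network Intelligence to validate intra-region and inter-region latency

What you'll need

Google Cloud Project

IAM Permissions

Compute Network Admin

Service Directory Editor

DNS Administrator

Network Management Viewer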

2. Before you begin

Update the project to support the tutorial

This tutorial makes use of $variables to aid the gcloud configuration in Cloud Shell.

Inside Cloud Shell, perform the following:

gcloud config list project
gcloud config set project [YOUR-PROJECT-NAME]
projectid=YOUR-PROJECT-NAME
echo $projectid

3. aiml-vpc Setup

Create the aiml-vpc

Inside Cloud Shell, perform the following:

gcloud compute networks create aiml-vpc --project=$projectid --subnet-mode=custom

Inside Cloud Shell, enable the Network Management API for Network Intelligence.

gcloud services enable networkmanagement.googleapis.com

Create the user-managed notebook subnet

Inside Cloud Shell, create the workbench-subnet.

gcloud compute networks subnets create workbench-subnet --project=$projectid --range=172.16.10.0/28 --network=aiml-vpc --region=us-central1 --enable-private-ip-google-access

Inside Cloud Shell, create the us-west1-subnet.

gcloud compute networks subnets create us-west1-subnet --project=$projectid --range=192.168.10.0/28 --network=aiml-vpc --region=us-west1

Inside Cloud Shell, create the us-central1-subnet.

gcloud compute networks subnets create us-central1-subnet --project=$projectid --range=192.168.20.0/28 --network=aiml-vpc --region=us-central1

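Cloud Router and NAT configuration

Cloud NAT is used in the tutorial to download software packages because the GCE instances do not have an external IP address. Cloud NAT provides egress NAT only, so internet hosts cannot initiate communication with the user-managed notebook, which makes the setup more secure.

Inside Cloud Shell, create the regional Cloud Router for us-west1.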

gcloud compute routers create cloud-router-us-west1-aiml-nat --network aiml-vpc --region us-west1

Inside Cloud Shell, create the regional Cloud NAT gateway for us-west1.

gcloud compute routers nats create cloud-nat-us-west1 --router=cloud-router-us-west1-aiml-nat --auto-allocate-nat-external-ips --nat-all-subnet-ip-ranges --region us-west1

Inside Cloud Shell, create the regional Cloud Router for us-central1.

gcloud compute routers create cloud-router-us-central1-aiml-nat --network aiml-vpc --region us-central1

Inside Cloud Shell, create the regional Cloud NAT gateway for us-central1.

gcloud compute routers nats create cloud-nat-us-central1 --router=cloud-router-us-central1-aiml-nat --auto-allocate-nat-external-ips --nat-all-subnet-ip-ranges --region us-central1

4. Create the Private Service Connect endpoint

In the following section, you will create a Private Service Connect (PSC) endpoint that will be used to access the Vertex API from the aiml-vpc.

From Cloud Shell

gcloud compute addresses create psc-ip \
    --global \
    --purpose=PRIVATE_SERVICE_CONNECT \
    --addresses=100.100.10.10 \
    --network=aiml-vpc

Store 'pscendpointip' for the duration of the lab

pscendpointip=$(gcloud compute addresses list --filter=name:psc-ip --format="value(address)")

echo $pscendpointip

Create the PSC endpoint

From Cloud Shell

gcloud compute forwarding-rules create pscvertex \
    --global \
    --network=aiml-vpc \
    --address=psc-ip \
    --target-google-apis-bundle=all-apis

List the configured Private Service Connect endpoints

From Cloud Shell

gcloud compute forwarding-rules list  \
--filter target="(all-apis OR vpc-sc)" --global

Describe the configured Private Service Connect endpoint

From Cloud Shell

gcloud compute forwarding-rules describe \
    pscvertex --global

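5. Create a service account for the GCE instances

To provide a fine level of control over the Vertex API, a user-managed service account is required that will be applied to the west and central instances. Once it is created, the service account permissions can be modified based on business requirements. In the tutorial, the user-managed service account, vertex-gce-sa (display name vertex-sa), will have the Compute Instance Admin (v1) and Vertex AI User roles applied.

The Service Account API must be enabled before you proceed.

Inside Cloud Shell, create the service account.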

gcloud iam service-accounts create vertex-gce-sa \
    --description="service account for vertex" \
    --display-name="vertex-sa"

Inside Cloud Shell, update the service account with the Compute Instance Admin role.

gcloud projects add-iam-policy-binding $projectid --member="serviceAccount:vertex-gce-sa@$projectid.iam.gserviceaccount.com" --role="roles/compute.instanceAdmin.v1"

Inside Cloud Shell, update the service account with the Vertex AI User role.

gcloud projects add-iam-policy-binding $projectid --member="serviceAccount:vertex-gce-sa@$projectid.iam.gserviceaccount.com" --role="roles/aiplatform.user"

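6. Create a user-managed service account (Notebook)

In the following section, you will create a user-managed service account that will be associated with the Vertex Workbench (Notebook) used in the tutorial.

In the tutorial, the service account will have the Storage Admin, Vertex AI User, and Artifact Registry Administrator roles applied.

Inside Cloud Shell, create the service account.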

gcloud iam service-accounts create user-managed-notebook-sa \
    --display-name="user-managed-notebook-sa"

Inside Cloud Shell, update the service account with the Storage Admin role.

gcloud projects add-iam-policy-binding $projectid --member="serviceAccount:user-managed-notebook-sa@$projectid.iam.gserviceaccount.com" --role="roles/storage.admin"

Inside Cloud Shell, update the service account with the Vertex AI User role.

gcloud projects add-iam-policy-binding $projectid --member="serviceAccount:user-managed-notebook-sa@$projectid.iam.gserviceaccount.com" --role="roles/aiplatform.user"

Inside Cloud Shell, update the service account with the Artifact Registry Administrator role.

gcloud projects add-iam-policy-binding $projectid --member="serviceAccount:user-managed-notebook-sa@$projectid.iam.gserviceaccount.com" --role="roles/artifactregistry.admin"

Inside Cloud Shell, list the service accounts and note the email address that will be used when creating the user-managed notebook.

gcloud iam service-accounts list

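7. Create the test instances

In the following section, you will create test instances to run the baseline tests from us-west1 and us-central1.

Inside Cloud Shell, create the west-client.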

gcloud compute instances create west-client \
    --zone=us-west1-a \
    --image-family=debian-11 \
    --image-project=debian-cloud \
    --subnet=us-west1-subnet \
    --scopes=https://www.googleapis.com/auth/cloud-platform \
    --no-address \
    --shielded-secure-boot --service-account=vertex-gce-sa@$projectid.iam.gserviceaccount.com \
    --metadata startup-script="#! /bin/bash
      sudo apt-get update
      sudo apt-get install tcpdump dnsutils -y"

Inside Cloud Shell, create the central-client.

gcloud compute instances create central-client \
    --zone=us-central1-a \
    --image-family=debian-11 \
    --image-project=debian-cloud \
    --subnet=us-central1-subnet \
    --scopes=https://www.googleapis.com/auth/cloud-platform \
    --no-address \
    --shielded-secure-boot --service-account=vertex-gce-sa@$projectid.iam.gserviceaccount.com \
    --metadata startup-script="#! /bin/bash
      sudo apt-get update
      sudo apt-get install tcpdump dnsutils -y"

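To allow IAP to connect to your VM instances, create a firewall rule that:

Applies to all VM instances that you want to be accessible by using IAP.
Allows ingress traffic from the IP range 35.235.240.0/20. This range contains all IP addresses that IAP uses for TCP forwarding.

Inside Cloud Shell, create the IAP firewall rule.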

gcloud compute firewall-rules create ssh-iap-vpc \
    --network aiml-vpc \
    --allow tcp:22 \
    --source-ranges=35.235.240.0/20
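
As a rough illustration of what the --source-ranges value covers, the sketch below checks whether an address falls inside 35.235.240.0/20, which spans 35.235.240.0 through 35.235.255.255. The helper names ip_to_int and ip_in_cidr are hypothetical and written for bash; they are not part of gcloud or IAP.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of gcloud): test IPv4 membership in a CIDR block.

# Convert a dotted-quad IPv4 address to a single integer.
ip_to_int() {
  local IFS=.
  set -- $1   # split the address on dots into $1..$4
  echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Return success when address $1 is inside CIDR block $2 (for example 10.0.0.0/8).
ip_in_cidr() {
  local ip net bits mask
  ip=$(ip_to_int "$1")
  net=$(ip_to_int "${2%/*}")   # network part before the slash
  bits=${2#*/}                 # prefix length after the slash
  mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
  [ $(( ip & mask )) -eq $(( net & mask )) ]
}

# Check a few sample addresses against the IAP TCP forwarding range.
for ip in 35.235.240.1 35.235.255.255 35.236.0.1; do
  if ip_in_cidr "$ip" 35.235.240.0/20; then
    echo "$ip: inside"
  else
    echo "$ip: outside"
  fi
done
```

The first two sample addresses are reported as inside the range and the third as outside, so only sources within the IAP range match the rule created above.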

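8. Create a user-managed notebook

Enable the Notebooks API (notebooks.googleapis.com) before proceeding.

In the following section, create a user-managed notebook that incorporates the previously created service account, user-managed-notebook-sa.

Inside Cloud Shell, create the workbench-tutorial notebook instance.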

gcloud notebooks instances create workbench-tutorial \
      --vm-image-project=deeplearning-platform-release \
      --vm-image-family=common-cpu-notebooks \
      --machine-type=n1-standard-4 \
      --location=us-central1-a \
      --subnet-region=us-central1 \
      --shielded-secure-boot \
      --subnet=workbench-subnet \
      --no-public-ip    --service-account=user-managed-notebook-sa@$projectid.iam.gserviceaccount.com

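Navigate to Vertex AI → Workbench to view your deployed notebook.

9. Deploy the Model and Online Prediction

In the following section, use the provided codelab, Vertex AI: Use custom prediction routines with Sklearn to preprocess and post process data for predictions. Start with Section 7, since you already created a notebook in the previous step. Once the model is deployed, return to this tutorial and start the next section.

10. Create a custom monitoring dashboard for Online Prediction

Online Prediction creates a default Monitoring dashboard under VERTEX AI → ONLINE PREDICTION → ENDPOINT NAME (diamonds-cpr_endpoint). However, for our testing we need to define a start and stop time, so a custom dashboard is required.

In the following section, you will create Cloud Monitoring metrics to obtain latency measurements based on regional access to the Online Prediction endpoint. These measurements validate the difference in latency when the endpoint in us-central1 is accessed from GCE instances deployed in us-west1 and us-central1.

The tutorial uses the prediction_latencies metric; additional metrics are available under aiplatform.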

Metric	Description
prediction/online/prediction_latencies	Online prediction latency of the deployed model.

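Create a chart for the prediction_latencies metric

From Cloud Console, navigate to MONITORING → Metrics Explorer.

Insert the metric prediction/online/prediction_latencies, select the required options, then select Apply.

Update Group by as needed, then select Save Chart.

Select SAVE; you will be prompted to select a dashboard. Select New Dashboard and provide a name.

Vertex Custom Dashboard

In the following section, validate that the Vertex Custom Dashboard is displaying the correct time.

Navigate to MONITORING → Dashboard, select Vertex Custom Dashboard, and then select the time. Ensure your time zone is correct.

Make sure to expand the legend to obtain a table view.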

11. Create private DNS for the PSC endpoint

Create a private DNS zone in the aiml-vpc to resolve all googleapis to the PSC endpoint IP address 100.100.10.10.

From Cloud Shell, create a private DNS zone.

gcloud dns --project=$projectid managed-zones create psc-googleapis --description="Private Zone to resolve googleapis to a PSC endpoint" --dns-name="googleapis.com." --visibility="private" --networks="https://www.googleapis.com/compute/v1/projects/$projectid/global/networks/aiml-vpc"

From Cloud Shell, create the A record that associates *.googleapis.com with the PSC IP.

gcloud dns --project=$projectid record-sets create "*.googleapis.com." --zone="psc-googleapis" --type="A" --ttl="300" --rrdatas="100.100.10.10"

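12. Hey testing variables

Hey lets end users customize testing based on network and application requirements. For the purpose of the tutorial we'll use the options detailed below, followed by a sample execution string:

c == 1 worker

z == Duration

m == HTTP method POST

D == HTTP request body from file, instances.json

n == Number of requests to run. Default is 200. HEY ignores n when a duration (z) is specified.

Example curl string with HEY (execution not required)

user@us-central$ ./hey_linux_amd64 -c 1 -z 1m -m POST -D instances.json  -H "Authorization: Bearer $(gcloud auth print-access-token)" -H "Content-Type: application/json" https://us-central1-aiplatform.googleapis.com/v1/projects/${projectid}/locations/us-central1/endpoints/${ENDPOINT_ID}:predict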
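
To see how the options above fit together before running anything, the following is a minimal dry-run sketch that only prints the HEY command line for a given project, endpoint, duration, and worker count. The function name build_hey_cmd and the sample project name are hypothetical; the flags and URL layout are taken from the example string above.

```shell
#!/usr/bin/env bash
# Hypothetical dry-run helper: prints the HEY command instead of executing it.
# Arguments: project ID, endpoint ID, duration (default 10m), workers (default 1).
build_hey_cmd() {
  local project="$1" endpoint="$2" duration="${3:-10m}" workers="${4:-1}"
  local url="https://us-central1-aiplatform.googleapis.com/v1/projects/${project}/locations/us-central1/endpoints/${endpoint}:predict"
  # The access-token substitution is left unexpanded so the printed command
  # fetches a fresh token when it is eventually run.
  printf '%s\n' "./hey_linux_amd64 -c ${workers} -z ${duration} -m POST -D instances.json -H \"Authorization: Bearer \$(gcloud auth print-access-token)\" -H \"Content-Type: application/json\" ${url}"
}

# Print a one-minute, single-worker test for an assumed project and endpoint.
build_hey_cmd "my-project" "2706243362607857664" 1m
```

Printing the command first makes it easy to confirm the project ID and endpoint ID are filled in before starting a timed run.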

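13. Obtain the Prediction ID

Obtain your Online Prediction Endpoint ID from Cloud Console; it will be used in the subsequent steps.

Navigate to VERTEX AI → ONLINE PREDICTION

14. Download and execute HEY (us-west1)

In the following section, you will log into the west-client to download and execute HEY against the Online Prediction located in us-central1.

From Cloud Shell, log into the west-client and download HEY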

gcloud compute ssh west-client --project=$projectid --zone=us-west1-a --tunnel-through-iap

From the OS, download HEY and update the permissions.

wget https://hey-release.s3.us-east-2.amazonaws.com/hey_linux_amd64
chmod +x hey_linux_amd64

From the OS, create the following variables:

gcloud config list project
gcloud config set project [YOUR-PROJECT-NAME]
projectid=YOUR-PROJECT-NAME
echo $projectid
ENDPOINT_ID="insert-your-endpoint-id-here"

Example:

ENDPOINT_ID="2706243362607857664"

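In the following section, you will create an instances.json file using the vi editor or nano and insert the data string used to obtain a prediction from the deployed model.

From the west-client OS, create an instances.json file with the data string below: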

{"instances": [
  [0.23, 'Ideal', 'E', 'VS2', 61.5, 55.0, 3.95, 3.98, 2.43],
  [0.29, 'Premium', 'J', 'Internally Flawless', 52.5, 49.0, 4.00, 2.13, 3.11]]}

Example:

user@west-client:$ more instances.json 
{"instances": [
  [0.23, 'Ideal', 'E', 'VS2', 61.5, 55.0, 3.95, 3.98, 2.43],
  [0.29, 'Premium', 'J', 'Internally Flawless', 52.5, 49.0, 4.00, 2.13, 3.11]]}

user@west-client:$

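Pre-test

From the OS, execute a curl to validate that the model and prediction endpoint are working. Note the PSC endpoint IP in the verbose log and the HTTP/2 200 status indicating success.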

curl -v -X POST -H "Authorization: Bearer $(gcloud auth print-access-token)" -H "Content-Type: application/json" https://us-central1-aiplatform.googleapis.com/v1/projects/${projectid}/locations/us-central1/endpoints/${ENDPOINT_ID}:predict -d @instances.json

Example: note the PSC IP address used to access prediction and the successful outcome.

user@west-client:$ curl -v -X POST -H "Authorization: Bearer $(gcloud auth print-access-token)" -H "Content-Type: application/json" https://us-central1-aiplatform.googleapis.com/v1/projects/${projectid}/locations/us-central1/endpoints/${ENDPOINT_ID}:predict -d @instances.json
Note: Unnecessary use of -X or --request, POST is already inferred.
*   Trying 100.100.10.10:443...
* Connected to us-central1-aiplatform.googleapis.com (100.100.10.10) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
*  CAfile: /etc/ssl/certs/ca-certificates.crt
*  CApath: /etc/ssl/certs
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* TLSv1.3 (IN), TLS handshake, Server hello (2):
* TLSv1.3 (IN), TLS handshake, Encrypted Extensions (8):
* TLSv1.3 (IN), TLS handshake, Certificate (11):
* TLSv1.3 (IN), TLS handshake, CERT verify (15):
* TLSv1.3 (IN), TLS handshake, Finished (20):
* TLSv1.3 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.3 (OUT), TLS handshake, Finished (20):
* SSL connection using TLSv1.3 / TLS_AES_256_GCM_SHA384
* ALPN, server accepted to use h2
* Server certificate:
*  subject: CN=upload.video.google.com
*  start date: Jul 31 08:22:19 2023 GMT
*  expire date: Oct 23 08:22:18 2023 GMT
*  subjectAltName: host "us-central1-aiplatform.googleapis.com" matched cert's "*.googleapis.com"
*  issuer: C=US; O=Google Trust Services LLC; CN=GTS CA 1C3
*  SSL certificate verify ok.
* Using HTTP2, server supports multi-use
* Connection state changed (HTTP/2 confirmed)
* Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0
* Using Stream ID: 1 (easy handle 0x55a9f38b42c0)
> POST /v1/projects/new-test-project-396322/locations/us-central1/endpoints/2706243362607857664:predict HTTP/2
> Host: us-central1-aiplatform.googleapis.com
> user-agent: curl/7.74.0
> accept: */*
> authorization: Bearer ya29.c.b0Aaekm1LqrcaOlWFFwuEOWX_tZVXXvJgN_K-u5_hFyEAYXAi3AnBEBwwtHS8dweW_P2QGfdyFfa31nMT_6BaKBI0mC9IsfzfIiUwXc8u2yJt01gTUSJpCmGAFKZKidRMgkPYivVYCnuymzdYbRAWacIe__StkRzI9UeQOGN3jNIeESr80AdH12goaxCFXWaNWxoYRfGVhekEgUcsKs7t1OhOM-937gy4YGkXcXa8sGuHWRqF5bnulYlTqlxqQ2aAxMTrQg2lwUWRGCmGhPrym7rXJq7oim0DkAJSbAarl1qFuz0PPfNXeHGbs13zY2r1giV7u8_w4Umj_Q5M7H9fTkq7EiqnLzqRkOHXismYL368P1jOUBYM__krFQt4M3X9RJa0g01tOw3FnOh27BmUqlFQ1J2h14JZpx215Q3xzRvgfJ5iW5YYSkv67uZRQk4V04naOUXyc0plzWuVOjj4nor3fYvkS_oW0IyxJoBjeXR16Vnvln8c04svWX9dt7eobczFvBOm9nVdh4lVp8qxbp__2WtMvc1QVg6y-2i6lRpbvmyp1oadxVRjxV1e0wiQFSe-qqsinJu3bnnaMbxdU2cu5j26o8o8Xpgo0SF1UM0b1WX84iatbWpdFSphZm1llwmRagMzcFBW0aBk-i35_bXSbzwURgMfY6Qbyb9Rv9y0F-Maf34I0WxiMldv2uc57nej7dVl9OSm_Ohnro-i9zcpq9fxo9soYVB8WjaZOUjauk4znstc2_6y4atcVVsQBkeU674biR567Ri3M74Jfv4MrrF02ObfrJRdB7UJ4MU_9kWW-kYeeJzoci15UqYV0f_yJgReBwQa66Supmebee2Sn2nku6xZkRMu5Mz55mXuva0XWrpIbor7WckSsXwUFbf7rj5ipa4mOOyf2hJe1Rq0x6yeBaariRzXrhfm5bBpFBU73-zd-IekvOji0ZJQSkk0o6gpX_794Jny7j14aQJ8VxezcFpZUztimYhMnRhlO2lqms1h0h48
> content-type: application/json
> content-length: 158
> 
* Connection state changed (MAX_CONCURRENT_STREAMS == 100)!
* We are completely uploaded and fine
< HTTP/2 200 
< x-vertex-ai-internal-prediction-backend: harpoon
< content-type: application/json; charset=UTF-8
< date: Sun, 20 Aug 2023 03:51:54 GMT
< vary: X-Origin
< vary: Referer
< vary: Origin,Accept-Encoding
< server: scaffolding on HTTPServer2
< cache-control: private
< x-xss-protection: 0
< x-frame-options: SAMEORIGIN
< x-content-type-options: nosniff
< accept-ranges: none
< 
{
  "predictions": [
    "$479.0",
    "$586.0"
  ],
  "deployedModelId": "3587550310781943808",
  "model": "projects/884291964428/locations/us-central1/models/6829574694488768512",
  "modelDisplayName": "diamonds-cpr",
  "modelVersionId": "1"
}
* Connection #0 to host us-central1-aiplatform.googleapis.com left intact
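
To confirm from the verbose log that the request went through the PSC endpoint, one option is to pull the connected IP out of the curl -v output and compare it with 100.100.10.10. The sketch below assumes the "Connected to host (ip) port N" line format shown in the log above; extract_connected_ip is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read curl -v output on stdin and print the first IP
# that curl reports in its "* Connected to <host> (<ip>) port <n>" line.
extract_connected_ip() {
  sed -n 's/^\* Connected to [^ ]* (\([0-9.]*\)) port [0-9]*.*/\1/p' | head -n 1
}

# Sample lines copied from the verbose log above.
sample='*   Trying 100.100.10.10:443...
* Connected to us-central1-aiplatform.googleapis.com (100.100.10.10) port 443 (#0)'

ip=$(printf '%s\n' "$sample" | extract_connected_ip)
if [ "$ip" = "100.100.10.10" ]; then
  echo "PSC endpoint in use: $ip"
else
  echo "Unexpected IP: $ip"
fi
```

On a live client, the same function could be fed with the curl command above by appending `2>&1 | extract_connected_ip`, since curl writes its verbose log to stderr.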

Execute HEY

From the OS, execute HEY to run a 10 minute baseline test.

./hey_linux_amd64 -c 1 -z 10m -m POST -D instances.json  -H "Authorization: Bearer $(gcloud auth print-access-token)" -H "Content-Type: application/json" https://us-central1-aiplatform.googleapis.com/v1/projects/$projectid/locations/us-central1/endpoints/${ENDPOINT_ID}:predict

15. Hey Validation (us-west1)

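Now that you have executed Hey from a compute instance in us-west1, evaluate the results from the following:

HEY results
Vertex Custom Dashboard
Network Intelligence

HEY results

From the OS, let's validate the HEY results based on the 10 minute execution.

17.5826 requests per second

99% in 0.0686 secs | 68 ms

10,550 responses with 200 status code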

user@west-client:$ ./hey_linux_amd64 -c 1 -z 10m -m POST -D instances.json  -H "Authorization: Bearer $(gcloud auth print-access-token)" -H "Content-Type: application/json" https://us-central1-aiplatform.googleapis.com/v1/projects/$projectid/locations/us-central1/endpoints/${ENDPOINT_ID}:predict

Summary:
  Total:        600.0243 secs
  Slowest:      0.3039 secs
  Fastest:      0.0527 secs
  Average:      0.0569 secs
  Requests/sec: 17.5826
  

Response time histogram:
  0.053 [1]     |
  0.078 [10514] |■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■
  0.103 [16]    |
  0.128 [4]     |
  0.153 [3]     |
  0.178 [1]     |
  0.203 [0]     |
  0.229 [2]     |
  0.254 [1]     |
  0.279 [5]     |
  0.304 [3]     |


Latency distribution:
  10% in 0.0546 secs
  25% in 0.0551 secs
  50% in 0.0559 secs
  75% in 0.0571 secs
  90% in 0.0596 secs
  95% in 0.0613 secs
  99% in 0.0686 secs

Details (average, fastest, slowest):
  DNS+dialup:   0.0000 secs, 0.0527 secs, 0.3039 secs
  DNS-lookup:   0.0000 secs, 0.0000 secs, 0.0116 secs
  req write:    0.0000 secs, 0.0000 secs, 0.0002 secs
  resp wait:    0.0567 secs, 0.0526 secs, 0.3038 secs
  resp read:    0.0001 secs, 0.0001 secs, 0.0696 secs

Status code distribution:
  [200] 10550 responses
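
The percentile figures quoted above can also be read out of the HEY report programmatically. The following sketch assumes the "NN% in X secs" line format of the latency distribution shown above and converts the value to milliseconds; hey_percentile_ms is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read HEY output on stdin and print the requested
# latency percentile in milliseconds.
# $1: percentile without the percent sign, for example 99.
hey_percentile_ms() {
  awk -v p="$1%" '$1 == p && $2 == "in" { printf "%.1f\n", $3 * 1000 }'
}

# Lines copied from the latency distribution above.
hey_percentile_ms 99 <<'EOF'
  90% in 0.0596 secs
  95% in 0.0613 secs
  99% in 0.0686 secs
EOF
# prints 68.6
```

On the client, HEY's output could be saved to a file first and then passed to the function on stdin.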

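Vertex Custom Dashboard

Navigate to MONITORING → Dashboard and select Vertex Custom Dashboard. Enter 10m or specify your start and stop time. Ensure your time zone is correct.

The definition of Prediction Latencies describes a server-side metric that measures the total time taken to respond to the client's request after a response is obtained from the model.

Total latency duration: the total time that a request spends in the service, which is the model latency plus the overhead latency.

By contrast, HEY reports a client-side metric that takes the following into account:

Client request + Total latency (includes model latency) + Client response

Network Intelligence

Let's now look at the inter-region network latency reported by Network Intelligence to get an idea of the us-west1 to us-central1 latency reported by Google Cloud Platform.

Navigate to Cloud Console Network Intelligence → Performance Dashboard and view the latency between us-west1 and us-central1, which is reported as 32 to 39 ms.

HEY us-west1 baseline summary

Adding up the latencies reported by the Google tools yields approximately the same latency as the one reported by HEY. Inter-region latency contributes a large share of the total. Let's see how the central-client performs in the next series of tests.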

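Latency tool	Duration
Network Intelligence: us-west1 to us-central1 latency	~32 to 39 ms
Cloud Monitoring: total prediction latency [99th percentile]	34.58 ms (99p)
Total latency reported by Google	~66.58 to 73.58 ms
HEY client-side latency distribution	68 ms (99p)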
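
The "Total latency reported by Google" row is the network latency range added to the server-side prediction latency. A minimal sketch of that arithmetic follows, using the figures from the table; latency_sum_range is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: add a server-side latency to a network latency range.
# $1: low end of network latency (ms), $2: high end (ms), $3: server latency (ms).
latency_sum_range() {
  awk -v lo="$1" -v hi="$2" -v srv="$3" \
    'BEGIN { printf "%.2f to %.2f ms\n", lo + srv, hi + srv }'
}

# us-west1 to us-central1: 32 to 39 ms network, 34.58 ms prediction latency (99p).
latency_sum_range 32 39 34.58
# prints 66.58 to 73.58 ms
```

The 68 ms reported by HEY at the 99th percentile falls inside this range, which is the basis for the comparison above. Adding percentiles from different tools is only an approximation.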

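16. Download and execute HEY (us-central1)

In the following section, you will log into the central-client to download and execute HEY against the Online Prediction located in us-central1.

From Cloud Shell, log into the central-client and download HEY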

gcloud compute ssh central-client --project=$projectid --zone=us-central1-a --tunnel-through-iap

From the OS, download HEY and update the permissions.

wget https://hey-release.s3.us-east-2.amazonaws.com/hey_linux_amd64
chmod +x hey_linux_amd64

From the OS, create the following variables:

gcloud config list project
gcloud config set project [YOUR-PROJECT-NAME]
projectid=YOUR-PROJECT-NAME
echo $projectid
ENDPOINT_ID="insert-your-endpoint-id-here"

Example:

ENDPOINT_ID="2706243362607857664"

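In the following section, you will create an instances.json file using the vi editor or nano and insert the data string used to obtain a prediction from the deployed model.

From the central-client OS, create an instances.json file with the data string below: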

{"instances": [
  [0.23, 'Ideal', 'E', 'VS2', 61.5, 55.0, 3.95, 3.98, 2.43],
  [0.29, 'Premium', 'J', 'Internally Flawless', 52.5, 49.0, 4.00, 2.13, 3.11]]}

Example:

user@west-client:$ more instances.json 
{"instances": [
  [0.23, 'Ideal', 'E', 'VS2', 61.5, 55.0, 3.95, 3.98, 2.43],
  [0.29, 'Premium', 'J', 'Internally Flawless', 52.5, 49.0, 4.00, 2.13, 3.11]]}

user@west-client:$

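Pre-test

From the OS, execute a curl to validate that the model and prediction endpoint are working. Note the PSC endpoint IP in the verbose log and the HTTP/2 200 status indicating success.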

curl -v -X POST -H "Authorization: Bearer $(gcloud auth print-access-token)" -H "Content-Type: application/json" https://us-central1-aiplatform.googleapis.com/v1/projects/${projectid}/locations/us-central1/endpoints/${ENDPOINT_ID}:predict -d @instances.json

Example: note the PSC IP address used to access prediction and the successful outcome.

user@central-client:~$ curl -v -X POST -H "Authorization: Bearer $(gcloud auth print-access-token)" -H "Content-Type: application/json" https://us-central1-aiplatform.googleapis.com/v1/projects/${projectid}/locations/us-central1/endpoints/${ENDPOINT_ID}:predict -d @instances.json
Note: Unnecessary use of -X or --request, POST is already inferred.
*   Trying 100.100.10.10:443...
* Connected to us-central1-aiplatform.googleapis.com (100.100.10.10) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
*  CAfile: /etc/ssl/certs/ca-certificates.crt
*  CApath: /etc/ssl/certs
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* TLSv1.3 (IN), TLS handshake, Server hello (2):
* TLSv1.3 (IN), TLS handshake, Encrypted Extensions (8):
* TLSv1.3 (IN), TLS handshake, Certificate (11):
* TLSv1.3 (IN), TLS handshake, CERT verify (15):
* TLSv1.3 (IN), TLS handshake, Finished (20):
* TLSv1.3 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.3 (OUT), TLS handshake, Finished (20):
* SSL connection using TLSv1.3 / TLS_AES_256_GCM_SHA384
* ALPN, server accepted to use h2
* Server certificate:
*  subject: CN=upload.video.google.com
*  start date: Jul 31 08:22:19 2023 GMT
*  expire date: Oct 23 08:22:18 2023 GMT
*  subjectAltName: host "us-central1-aiplatform.googleapis.com" matched cert's "*.googleapis.com"
*  issuer: C=US; O=Google Trust Services LLC; CN=GTS CA 1C3
*  SSL certificate verify ok.
* Using HTTP2, server supports multi-use
* Connection state changed (HTTP/2 confirmed)
* Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0
* Using Stream ID: 1 (easy handle 0x559b57adc2c0)
> POST /v1/projects/new-test-project-396322/locations/us-central1/endpoints/2706243362607857664:predict HTTP/2
> Host: us-central1-aiplatform.googleapis.com
> user-agent: curl/7.74.0
> accept: */*
> authorization: Bearer ya29.c.b0Aaekm1KWqq-CIXuL6f1cx9d9jHHquQq9tlSV1oVZ1y3TACi82JFFZRwsagVY7MMovycsU4PLkt9MDMkNngxZE5RzXcS-AoaUaQf1tPT9-_JMTlFI6wCcR7Yr9MeRF5AZblr_k52ZZgEZKeYGcrXoGiqGQcAAwFtHiEVAkUhLuyukteXbMoep1JM9E0zFblJj7Z0yOCMJYBH-6XHcIDYnOKpStMVBR2wcTDbnFrCE08HXbvRnQVcENatTBoI9FzSVL1ORwqUiCcdfnTSjpIXcyD-W82d6ZHjGX_RUhfnH7RPfOJqkuU8pOovwoCjq_jvM_wJUfPuQnBKHp5rxbYxPE349DMBql62po2SWFguuFo-a2eoUnb8-FQeBZqan65zgV0lexR73gZlm071y9grlXv3fmJUo7vlj5W-7_-FJXaWWg8iWc6rmjYeO1Wz2h_8qnmojkX9xSUciI6JfmwdgMWwtvwJb63ppSmdwf8oagrYiQlpMzgRI6rekbRzg-1WOBeOf5nRg5vtxUMSc9iRaoarO5XwFX8vt7rxOUBvbXYVWmo3bsdhzsS9VopMwgMlxgcIJg7bq7_F3iapB-nRjfjfhZWpR83cWIkI2Wb9f89inpsxtYjZbbzdWkZvRB8FYSsY8F8tcpiVoWWyQWZiph9z7O59fF9irWY2gtUnbFcJJ_ZcYztjlMQaR45y42ZflkM3Qn668bzge3Y3hmVI1s6ZSmxxq6m27hoMwVn21R07Y613jwljmaFJ5V8MwkR6yvFhYngrh_JrhRUQtSSMh02Rz25wMfv7g8Fiqymr-12viM4btIFjXZBM3XFqzvso_rw1omI1yYWofmbaBYggpegpJBzSeqVUZe791agjVtiMUkyjXFy__9gI0Qk9ZUarI4p25SvS4I1hX4YyBk6ol32Z5zIsVr1Seff__aklm6M2Mlkumd7nurm46hjOIoOhFpfFxrQ6yivnhYapBOJMYirgbZvigvI3dom1fnmt0-ktmRxp69w7Uzzy
> content-type: application/json
> content-length: 158
> 
* Connection state changed (MAX_CONCURRENT_STREAMS == 100)!
* We are completely uploaded and fine
< HTTP/2 200 
< x-vertex-ai-internal-prediction-backend: harpoon
< date: Sun, 20 Aug 2023 22:25:31 GMT
< content-type: application/json; charset=UTF-8
< vary: X-Origin
< vary: Referer
< vary: Origin,Accept-Encoding
< server: scaffolding on HTTPServer2
< cache-control: private
< x-xss-protection: 0
< x-frame-options: SAMEORIGIN
< x-content-type-options: nosniff
< accept-ranges: none
< 
{
  "predictions": [
    "$479.0",
    "$586.0"
  ],
  "deployedModelId": "3587550310781943808",
  "model": "projects/884291964428/locations/us-central1/models/6829574694488768512",
  "modelDisplayName": "diamonds-cpr",
  "modelVersionId": "1"
}
* Connection #0 to host us-central1-aiplatform.googleapis.com left intact

Execute HEY

From the OS, execute HEY to run a 10 minute baseline test.

./hey_linux_amd64 -c 1 -z 10m -m POST -D instances.json  -H "Authorization: Bearer $(gcloud auth print-access-token)" -H "Content-Type: application/json" https://us-central1-aiplatform.googleapis.com/v1/projects/$projectid/locations/us-central1/endpoints/${ENDPOINT_ID}:predict

17. Hey Validation (us-central1)

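Now that you have executed Hey from a compute instance in us-central1, evaluate the results from the following:

HEY results
Vertex Custom Dashboard
Network Intelligence

HEY results

From the OS, let's validate the HEY results based on the 10 minute execution.

44.9408 requests per second

99% in 0.0353 secs | 35 ms

26,965 responses with 200 status code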

devops_user_1_deepakmichael_alto@central-client:~$ ./hey_linux_amd64 -c 1 -z 10m -m POST -D instances.json  -H "Authorization: Bearer $(gcloud auth print-access-token)" -H "Content-Type: application/json" https://us-central1-aiplatform.googleapis.com/v1/projects/$projectid/locations/us-central1/endpoints/${ENDPOINT_ID}:predict

Summary:
  Total:        600.0113 secs
  Slowest:      0.3673 secs
  Fastest:      0.0184 secs
  Average:      0.0222 secs
  Requests/sec: 44.9408
  

Response time histogram:
  0.018 [1]     |
  0.053 [26923] |■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■
  0.088 [25]    |
  0.123 [4]     |
  0.158 [0]     |
  0.193 [1]     |
  0.228 [9]     |
  0.263 [1]     |
  0.298 [0]     |
  0.332 [0]     |
  0.367 [1]     |


Latency distribution:
  10% in 0.0199 secs
  25% in 0.0205 secs
  50% in 0.0213 secs
  75% in 0.0226 secs
  90% in 0.0253 secs
  95% in 0.0273 secs
  99% in 0.0353 secs

Details (average, fastest, slowest):
  DNS+dialup:   0.0000 secs, 0.0184 secs, 0.3673 secs
  DNS-lookup:   0.0000 secs, 0.0000 secs, 0.0079 secs
  req write:    0.0000 secs, 0.0000 secs, 0.0007 secs
  resp wait:    0.0220 secs, 0.0182 secs, 0.3672 secs
  resp read:    0.0002 secs, 0.0001 secs, 0.0046 secs

Status code distribution:
  [200] 26965 responses
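
Because both runs use a single worker (-c 1), requests are sent one after another, so throughput should be roughly the inverse of the average latency. The sketch below checks that relationship against the Average values from the two HEY summaries; expected_rps is a hypothetical helper name, and the result is an estimate rather than an exact match.

```shell
#!/usr/bin/env bash
# Hypothetical helper: estimate requests per second from the average latency.
# $1: average latency in seconds, $2: number of workers (default 1).
expected_rps() {
  awk -v avg="$1" -v workers="${2:-1}" 'BEGIN { printf "%.2f\n", workers / avg }'
}

expected_rps 0.0222   # central-client average; prints 45.05
expected_rps 0.0569   # west-client average; prints 17.57
```

These estimates are close to the 44.9408 and 17.5826 requests per second that HEY reported, which suggests the lower throughput from us-west1 follows directly from the higher per-request latency.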

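Vertex Custom Dashboard

Navigate to MONITORING → Dashboard and select Vertex Custom Dashboard. Enter 10m or specify your start and stop time. Ensure your time zone is correct.

Prediction Latencies for the past 10m yields 30.533 ms.

The definition of Prediction Latencies describes a server-side metric that measures the total time taken to respond to the client's request after a response is obtained from the model.

Total latency duration: the total time that a request spends in the service, which is the model latency plus the overhead latency.

By contrast, HEY reports a client-side metric that takes the following into account:

Client request + Total latency (includes model latency) + Client response

Network Intelligence

Let's now look at the intra-region network latency reported by Network Intelligence to get an idea of the us-central1 latency reported by Google Cloud Platform.

Navigate to Cloud Console Network Intelligence → Performance Dashboard and view the latency within us-central1, which is reported as 0.2 to 0.8 ms.

HEY us-central1 baseline summary

Comparing the total latency reported by the test tools shows lower latency than the west-client, because the compute (central-client) and the Vertex endpoint (model and online prediction) are in the same region.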

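Latency tool	Duration
Network Intelligence: us-central1 intra-region latency	~0.2 to 0.8 ms
Cloud Monitoring: total prediction latency [99th percentile]	30.533 ms (99p)
Total latency reported by Google	~30.733 to 31.333 ms
HEY client-side latency	35 ms (99p)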
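
Another rough way to read the two summary tables is to subtract the server-side prediction latency from the HEY client-side latency, which leaves the time spent outside the prediction service (network plus client overhead). The sketch below does this for both clients using the 99th percentile figures from the tables; client_server_gap_ms is a hypothetical helper name, and comparing percentiles from different tools is only approximate.

```shell
#!/usr/bin/env bash
# Hypothetical helper: client-side latency minus server-side latency, in ms.
# $1: HEY client latency (ms), $2: Cloud Monitoring prediction latency (ms).
client_server_gap_ms() {
  awk -v client="$1" -v server="$2" 'BEGIN { printf "%.3f\n", client - server }'
}

client_server_gap_ms 68 34.58    # west-client; prints 33.420
client_server_gap_ms 35 30.533   # central-client; prints 4.467
```

The west-client gap of about 33.4 ms sits within the 32 to 39 ms inter-region range reported by Network Intelligence, while the central-client gap is only a few milliseconds.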

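18. Congratulations

Congratulations, you've successfully deployed and validated HEY to obtain a client-side prediction latency baseline, using a combination of Cloud Monitoring and Network Intelligence. Based on the testing, you identified that a prediction endpoint in us-central1 can be served across regions, although additional latency is observed.

Cosmopup thinks tutorials are awesome!!

19. Clean up

From Cloud Shell, delete the tutorial components.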

gcloud compute instances delete central-client --zone=us-central1-a -q

gcloud compute instances delete west-client --zone=us-west1-a -q

gcloud compute instances delete workbench-tutorial --zone=us-central1-a -q

gcloud compute forwarding-rules delete pscvertex --global --quiet 

gcloud compute addresses delete psc-ip --global --quiet

gcloud compute networks subnets delete workbench-subnet --region=us-central1 --quiet 

gcloud compute networks subnets delete us-west1-subnet --region=us-west1 --quiet

gcloud compute networks subnets delete us-central1-subnet --region=us-central1 --quiet

gcloud compute routers delete cloud-router-us-west1-aiml-nat --region=us-west1 --quiet

gcloud compute routers delete cloud-router-us-central1-aiml-nat --region=us-central1 --quiet

gcloud compute firewall-rules delete  ssh-iap-vpc --quiet

gcloud dns record-sets delete "*.googleapis.com." --zone=psc-googleapis --type=A --quiet

gcloud dns managed-zones delete psc-googleapis --quiet

gcloud compute networks delete aiml-vpc --quiet

gcloud storage rm -r gs://$projectid-cpr-bucket

From Cloud Console, delete the following:

Artifact Registry folder

From Vertex AI Model Registry, undeploy the model.

From Vertex AI Online Prediction, delete the endpoint.

What's next?

Check out some of these tutorials...
