Getting started with vector embeddings with AlloyDB AI

1. Introduction

In this codelab, you will learn how to use AlloyDB AI by combining vector search with Vertex AI embeddings.

Prerequisites

A basic understanding of the Google Cloud Console
Basic skills with the command-line interface and Cloud Shell

What you'll learn

How to deploy an AlloyDB cluster and primary instance
How to connect to AlloyDB from a Google Compute Engine VM
How to create a database and enable AlloyDB AI
How to load data into the database
How to use a Vertex AI embedding model in AlloyDB
How to enrich the result using a Vertex AI generative model
How to improve performance using a vector index

What you'll need

A Google Cloud account and a Google Cloud project
A web browser such as Chrome

2. Setup and requirements

Self-paced environment setup

Sign in to the Google Cloud Console and create a new project or reuse an existing one. If you don't already have a Gmail or Google Workspace account, you must create one.

The project name is the display name for this project's participants. It is a character string that Google APIs do not use, and you can update it at any time.
The project ID must be unique across all Google Cloud projects and is immutable (it cannot be changed after it has been set). The Cloud Console auto-generates a unique string, and usually you don't need to care what it is. In most codelabs you need to reference the project ID (typically identified as PROJECT_ID). If you don't like the generated ID, you can generate another random one, or you can try your own and see whether it is available. The ID cannot be changed after this step and remains for the lifetime of the project.
Note that there is a third value, the project number, which some APIs use. Learn more about all three of these values in the documentation.

Next, you need to enable billing in the Cloud Console to use Cloud resources and APIs. Running through this codelab shouldn't cost much. To shut down resources and avoid billing beyond this tutorial, you can delete the resources you created or delete the project. New Google Cloud users are eligible for the $300 USD free trial program.

Start Cloud Shell

While Google Cloud can be operated remotely from your laptop, in this codelab you will use Google Cloud Shell, a command-line environment running in the cloud.

From the Google Cloud Console, click the Cloud Shell icon on the top right toolbar.

It should take only a few moments to provision and connect to the environment. When it finishes, a Cloud Shell terminal session is available in your browser.

This virtual machine is loaded with all the development tools you need. It offers a persistent 5 GB home directory and runs on Google Cloud, which greatly improves network performance and authentication. All of your work in this codelab can be done within a browser, and you do not need to install anything.

3. Before you begin

Enable API

In Cloud Shell, make sure that your project ID is set:

gcloud config set project [YOUR-PROJECT-ID]

Set the PROJECT_ID environment variable:

PROJECT_ID=$(gcloud config get-value project)

Enable all necessary services:

gcloud services enable alloydb.googleapis.com \
                       compute.googleapis.com \
                       cloudresourcemanager.googleapis.com \
                       servicenetworking.googleapis.com \
                       aiplatform.googleapis.com

Expected output:

student@cloudshell:~ (test-project-001-402417)$ gcloud config set project test-project-001-402417
Updated property [core/project].
student@cloudshell:~ (test-project-001-402417)$ PROJECT_ID=$(gcloud config get-value project)
Your active configuration is: [cloudshell-14650]
student@cloudshell:~ (test-project-001-402417)$ 
student@cloudshell:~ (test-project-001-402417)$ gcloud services enable alloydb.googleapis.com \
                       compute.googleapis.com \
                       cloudresourcemanager.googleapis.com \
                       servicenetworking.googleapis.com \
                       aiplatform.googleapis.com
Operation "operations/acat.p2-4470404856-1f44ebd8-894e-4356-bea7-b84165a57442" finished successfully.

Configure your default region to use the Vertex AI embedding models. See the Vertex AI documentation for the locations where it is available. In this example we use the us-central1 region.

gcloud config set compute/region us-central1

4. Deploy AlloyDB

Before creating an AlloyDB cluster, we need an available private IP range in our VPC for the future AlloyDB instance to use. If we don't have one, we need to create it and assign it for use by internal Google services. Only then can we create the cluster and instance.

Create private IP range

We need to configure private service access in our VPC for AlloyDB. We assume that the project has a "default" VPC network and that it will be used for all actions.

Create the private IP range:

gcloud compute addresses create psa-range \
    --global \
    --purpose=VPC_PEERING \
    --prefix-length=24 \
    --description="VPC private service access" \
    --network=default

Create the private connection using the allocated IP range:

gcloud services vpc-peerings connect \
    --service=servicenetworking.googleapis.com \
    --ranges=psa-range \
    --network=default

Expected console output:

student@cloudshell:~ (test-project-402417)$ gcloud compute addresses create psa-range \
    --global \
    --purpose=VPC_PEERING \
    --prefix-length=24 \
    --description="VPC private service access" \
    --network=default
Created [https://www.googleapis.com/compute/v1/projects/test-project-402417/global/addresses/psa-range].

student@cloudshell:~ (test-project-402417)$ gcloud services vpc-peerings connect \
    --service=servicenetworking.googleapis.com \
    --ranges=psa-range \
    --network=default
Operation "operations/pssn.p24-4470404856-595e209f-19b7-4669-8a71-cbd45de8ba66" finished successfully.

student@cloudshell:~ (test-project-402417)$
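
The --prefix-length=24 flag used above reserves a /24 block for private service access. As a quick sanity check on what that size means, the arithmetic can be sketched in the shell. The helper name prefix_to_addresses is made up for this illustration and is not part of gcloud.

```shell
# Hypothetical helper (not a gcloud command): number of IPv4 addresses
# in a CIDR block with the given prefix length.
prefix_to_addresses() {
  echo $(( 1 << (32 - $1) ))
}

prefix_to_addresses 24   # prints 256
```

A shorter prefix length gives a larger block, which may be worth considering if other services will later share the same private connection.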

Create AlloyDB Cluster

In this section we create an AlloyDB cluster in the us-central1 region.

Define a password for the postgres user. You can define your own password or use a random function to generate one.

export PGPASSWORD=`openssl rand -hex 12`

Expected console output:

student@cloudshell:~ (test-project-402417)$ export PGPASSWORD=`openssl rand -hex 12`

Note the PostgreSQL password for future use:

echo $PGPASSWORD

Expected console output:

student@cloudshell:~ (test-project-402417)$ echo $PGPASSWORD
bbefbfde7601985b0dee5723
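
openssl rand -hex 12 emits 12 random bytes encoded as 24 lowercase hexadecimal characters, which matches the sample above. If you paste the noted password into another session later, a small check like the following sketch can catch a truncated copy. The name looks_like_generated_password is illustrative, and the check only applies to passwords generated with the command above, not to a password you chose yourself.

```shell
# Hypothetical helper: succeeds only for a 24-character lowercase hex string.
looks_like_generated_password() {
  case "$1" in
    "" | *[!0-9a-f]*) return 1 ;;   # empty, or contains a non-hex character
  esac
  [ "${#1}" -eq 24 ]                # 12 bytes -> 24 hex characters
}

# The sample value from the expected output above.
if looks_like_generated_password "bbefbfde7601985b0dee5723"; then
  echo "password format looks right"
fi
```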

Create a Free Trial Cluster

If you haven't used AlloyDB before, you can create a free trial cluster as follows.

Define the region and the AlloyDB cluster name. We are going to use the us-central1 region and alloydb-aip-01 as the cluster name:

export REGION=us-central1
export ADBCLUSTER=alloydb-aip-01

Run the command to create the cluster:

gcloud alloydb clusters create $ADBCLUSTER \
    --password=$PGPASSWORD \
    --network=default \
    --region=$REGION \
    --subscription-type=TRIAL

Expected console output:

export REGION=us-central1
export ADBCLUSTER=alloydb-aip-01
gcloud alloydb clusters create $ADBCLUSTER \
    --password=$PGPASSWORD \
    --network=default \
    --region=$REGION \
    --subscription-type=TRIAL
Operation ID: operation-1697655441138-6080235852277-9e7f04f5-2012fce4
Creating cluster...done.

Create an AlloyDB primary instance for our cluster in the same Cloud Shell session. If you are disconnected, you will need to define the region and cluster name environment variables again.

gcloud alloydb instances create $ADBCLUSTER-pr \
    --instance-type=PRIMARY \
    --cpu-count=8 \
    --region=$REGION \
    --cluster=$ADBCLUSTER

Expected console output:

student@cloudshell:~ (test-project-402417)$ gcloud alloydb instances create $ADBCLUSTER-pr \
    --instance-type=PRIMARY \
    --cpu-count=8 \
    --region=$REGION \
    --availability-type ZONAL \
    --cluster=$ADBCLUSTER
Operation ID: operation-1697659203545-6080315c6e8ee-391805db-25852721
Creating instance...done.
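
Because the instance command depends on REGION and ADBCLUSTER surviving from the earlier step, a reconnect can silently leave them empty. The following is a minimal sketch of a guard you could run before the gcloud commands. The require_vars function is a hypothetical helper written for this illustration, not something provided by Cloud Shell or gcloud.

```shell
# Hypothetical helper: fail if any of the named variables is unset or empty.
require_vars() {
  missing=""
  for name in "$@"; do
    # Indirect lookup of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      missing="$missing $name"
    fi
  done
  if [ -n "$missing" ]; then
    echo "Missing variables:$missing" >&2
    return 1
  fi
}

# Example: define the values used in this codelab, then check them.
REGION=us-central1
ADBCLUSTER=alloydb-aip-01
require_vars REGION ADBCLUSTER && echo "ready to create $ADBCLUSTER-pr in $REGION"
```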

Create AlloyDB Standard Cluster

If this is not your first AlloyDB cluster in the project, create a standard cluster instead.

export REGION=us-central1
export ADBCLUSTER=alloydb-aip-01

Run the command to create the cluster:

gcloud alloydb clusters create $ADBCLUSTER \
    --password=$PGPASSWORD \
    --network=default \
    --region=$REGION

Expected console output:

export REGION=us-central1
export ADBCLUSTER=alloydb-aip-01
gcloud alloydb clusters create $ADBCLUSTER \
    --password=$PGPASSWORD \
    --network=default \
    --region=$REGION 
Operation ID: operation-1697655441138-6080235852277-9e7f04f5-2012fce4
Creating cluster...done.

Create the AlloyDB primary instance for the cluster in the same Cloud Shell session:

gcloud alloydb instances create $ADBCLUSTER-pr \
    --instance-type=PRIMARY \
    --cpu-count=2 \
    --region=$REGION \
    --cluster=$ADBCLUSTER

Expected console output:

student@cloudshell:~ (test-project-402417)$ gcloud alloydb instances create $ADBCLUSTER-pr \
    --instance-type=PRIMARY \
    --cpu-count=2 \
    --region=$REGION \
    --availability-type ZONAL \
    --cluster=$ADBCLUSTER
Operation ID: operation-1697659203545-6080315c6e8ee-391805db-25852721
Creating instance...done.

5. Connect to AlloyDB

AlloyDB is deployed using a private-only connection, so we need a VM with the PostgreSQL client installed to work with the database.

Deploy GCE VM

Create a GCE VM in the same region and VPC as the AlloyDB cluster.

In Cloud Shell execute:

export ZONE=us-central1-a
gcloud compute instances create instance-1 \
    --zone=$ZONE \
    --create-disk=auto-delete=yes,boot=yes,image=projects/debian-cloud/global/images/$(gcloud compute images list --filter="family=debian-12 AND family!=debian-12-arm64" --format="value(name)") \
    --scopes=https://www.googleapis.com/auth/cloud-platform

Expected console output:

student@cloudshell:~ (test-project-402417)$ export ZONE=us-central1-a
gcloud compute instances create instance-1 \
    --zone=$ZONE \
    --create-disk=auto-delete=yes,boot=yes,image=projects/debian-cloud/global/images/$(gcloud compute images list --filter="family=debian-12 AND family!=debian-12-arm64" --format="value(name)") \
    --scopes=https://www.googleapis.com/auth/cloud-platform

Created [https://www.googleapis.com/compute/v1/projects/test-project-402417/zones/us-central1-a/instances/instance-1].
NAME: instance-1
ZONE: us-central1-a
MACHINE_TYPE: n1-standard-1
PREEMPTIBLE: 
INTERNAL_IP: 10.128.0.2
EXTERNAL_IP: 34.71.192.233
STATUS: RUNNING

Install Postgres client

Install the PostgreSQL client software on the deployed VM.

Connect to the VM:

gcloud compute ssh instance-1 --zone=us-central1-a

Expected console output:

student@cloudshell:~ (test-project-402417)$ gcloud compute ssh instance-1 --zone=us-central1-a
Updating project ssh metadata...working..Updated [https://www.googleapis.com/compute/v1/projects/test-project-402417].                                                                                                                                                         
Updating project ssh metadata...done.                                                                                                                                                                                                                                              
Waiting for SSH key to propagate.
Warning: Permanently added 'compute.5110295539541121102' (ECDSA) to the list of known hosts.
Linux instance-1.us-central1-a.c.gleb-test-short-001-418811.internal 6.1.0-18-cloud-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.76-1 (2024-02-01) x86_64

The programs included with the Debian GNU/Linux system are free software;
the exact distribution terms for each program are described in the
individual files in /usr/share/doc/*/copyright.

Debian GNU/Linux comes with ABSOLUTELY NO WARRANTY, to the extent
permitted by applicable law.
student@instance-1:~$

Install the software by running the following commands inside the VM:

sudo apt-get update
sudo apt-get install --yes postgresql-client

Expected console output:

student@instance-1:~$ sudo apt-get update
sudo apt-get install --yes postgresql-client
Get:1 https://packages.cloud.google.com/apt google-compute-engine-bullseye-stable InRelease [5146 B]
Get:2 https://packages.cloud.google.com/apt cloud-sdk-bullseye InRelease [6406 B]   
Hit:3 https://deb.debian.org/debian bullseye InRelease  
Get:4 https://deb.debian.org/debian-security bullseye-security InRelease [48.4 kB]
Get:5 https://packages.cloud.google.com/apt google-compute-engine-bullseye-stable/main amd64 Packages [1930 B]
Get:6 https://deb.debian.org/debian bullseye-updates InRelease [44.1 kB]
Get:7 https://deb.debian.org/debian bullseye-backports InRelease [49.0 kB]
...redacted...
update-alternatives: using /usr/share/postgresql/13/man/man1/psql.1.gz to provide /usr/share/man/man1/psql.1.gz (psql.1.gz) in auto mode
Setting up postgresql-client (13+225) ...
Processing triggers for man-db (2.9.4-2) ...
Processing triggers for libc-bin (2.31-13+deb11u7) ...

Connect to the Instance

Connect to the primary instance from the VM using psql.

Continue in the same Cloud Shell tab with the open SSH session to your instance-1 VM.

Use the noted AlloyDB password (PGPASSWORD) value and the AlloyDB cluster ID to connect to AlloyDB from the GCE VM:

export PGPASSWORD=<Noted password>

export PROJECT_ID=$(gcloud config get-value project)
export REGION=us-central1
export ADBCLUSTER=alloydb-aip-01
export INSTANCE_IP=$(gcloud alloydb instances describe $ADBCLUSTER-pr --cluster=$ADBCLUSTER --region=$REGION --format="value(ipAddress)")
psql "host=$INSTANCE_IP user=postgres sslmode=require"

Expected console output:

student@instance-1:~$ export PGPASSWORD=CQhOi5OygD4ps6ty
student@instance-1:~$ ADBCLUSTER=alloydb-aip-01
student@instance-1:~$ REGION=us-central1
student@instance-1:~$ INSTANCE_IP=$(gcloud alloydb instances describe $ADBCLUSTER-pr --cluster=$ADBCLUSTER --region=$REGION --format="value(ipAddress)")
gleb@instance-1:~$ psql "host=$INSTANCE_IP user=postgres sslmode=require"
psql (15.6 (Debian 15.6-0+deb12u1), server 15.5)
SSL connection (protocol: TLSv1.3, cipher: TLS_AES_256_GCM_SHA384, compression: off)
Type "help" for help.

postgres=>

Close the psql session:

exit
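
The same psql connection string is reused in the following chapters with a database name added. A small wrapper can keep it in one place. This is only a sketch: alloydb_conninfo is a hypothetical helper name, the address in the example is a placeholder, and it assumes psql picks the password up from the PGPASSWORD environment variable as in the steps above.

```shell
# Hypothetical helper: print a libpq connection string for the postgres user.
# $1 = instance IP address, $2 = optional database name.
alloydb_conninfo() {
  conninfo="host=$1 user=postgres sslmode=require"
  if [ -n "${2:-}" ]; then
    conninfo="$conninfo dbname=$2"
  fi
  echo "$conninfo"
}

# Example with a placeholder address; on the VM you would pass "$INSTANCE_IP".
alloydb_conninfo 10.0.0.5 quickstart_db
# Possible usage: psql "$(alloydb_conninfo "$INSTANCE_IP" quickstart_db)"
```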

6. Prepare Database

We need to create a database, enable Vertex AI integration, create database objects, and import the data.

Grant Necessary Permissions to AlloyDB

Add Vertex AI permissions to the AlloyDB service agent.

Open another Cloud Shell tab using the "+" sign at the top.

In the new Cloud Shell tab execute:

PROJECT_ID=$(gcloud config get-value project)
gcloud projects add-iam-policy-binding $PROJECT_ID \
  --member="serviceAccount:service-$(gcloud projects describe $PROJECT_ID --format="value(projectNumber)")@gcp-sa-alloydb.iam.gserviceaccount.com" \
  --role="roles/aiplatform.user"

Expected console output:

student@cloudshell:~ (test-project-001-402417)$ PROJECT_ID=$(gcloud config get-value project)
Your active configuration is: [cloudshell-11039]
student@cloudshell:~ (test-project-001-402417)$ gcloud projects add-iam-policy-binding $PROJECT_ID \
  --member="serviceAccount:service-$(gcloud projects describe $PROJECT_ID --format="value(projectNumber)")@gcp-sa-alloydb.iam.gserviceaccount.com" \
  --role="roles/aiplatform.user"
Updated IAM policy for project [test-project-001-402417].
bindings:
- members:
  - serviceAccount:service-4470404856@gcp-sa-alloydb.iam.gserviceaccount.com
  role: roles/aiplatform.user
- members:
...
etag: BwYIEbe_Z3U=
version: 1
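
The --member value in the command above is built from the project number rather than the project ID, which is why the command calls gcloud projects describe. The naming pattern can be sketched as a small helper. The name alloydb_service_agent is illustrative, and the project number below is the one shown in the sample output.

```shell
# Hypothetical helper: IAM member string for the AlloyDB service agent
# of the project with the given project number.
alloydb_service_agent() {
  echo "serviceAccount:service-$1@gcp-sa-alloydb.iam.gserviceaccount.com"
}

alloydb_service_agent 4470404856
```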

Close the tab by running the "exit" command in the tab:

exit

Create Database

Create the quickstart database.

In the GCE VM session execute:

Create the database:

psql "host=$INSTANCE_IP user=postgres" -c "CREATE DATABASE quickstart_db"

Expected console output:

student@instance-1:~$ psql "host=$INSTANCE_IP user=postgres" -c "CREATE DATABASE quickstart_db"
CREATE DATABASE
student@instance-1:~$

Enable Vertex AI Integration

Enable Vertex AI integration and the pgvector extension in the database.

In the GCE VM execute:

psql "host=$INSTANCE_IP user=postgres dbname=quickstart_db" -c "CREATE EXTENSION IF NOT EXISTS google_ml_integration CASCADE"
psql "host=$INSTANCE_IP user=postgres dbname=quickstart_db" -c "CREATE EXTENSION IF NOT EXISTS vector"

Expected console output:

student@instance-1:~$ psql "host=$INSTANCE_IP user=postgres dbname=quickstart_db" -c "CREATE EXTENSION IF NOT EXISTS google_ml_integration CASCADE"
psql "host=$INSTANCE_IP user=postgres dbname=quickstart_db" -c "CREATE EXTENSION IF NOT EXISTS vector"
CREATE EXTENSION
CREATE EXTENSION
student@instance-1:~$

Import Data

Download the prepared data and import it into the new database.

In the GCE VM execute:

gsutil cat gs://cloud-training/gcc/gcc-tech-004/cymbal_demo_schema.sql |psql "host=$INSTANCE_IP user=postgres dbname=quickstart_db"
gsutil cat gs://cloud-training/gcc/gcc-tech-004/cymbal_products.csv |psql "host=$INSTANCE_IP user=postgres dbname=quickstart_db" -c "\copy cymbal_products from stdin csv header"
gsutil cat gs://cloud-training/gcc/gcc-tech-004/cymbal_inventory.csv |psql "host=$INSTANCE_IP user=postgres dbname=quickstart_db" -c "\copy cymbal_inventory from stdin csv header"
gsutil cat gs://cloud-training/gcc/gcc-tech-004/cymbal_stores.csv |psql "host=$INSTANCE_IP user=postgres dbname=quickstart_db" -c "\copy cymbal_stores from stdin csv header"

Expected console output:

student@instance-1:~$ gsutil cat gs://cloud-training/gcc/gcc-tech-004/cymbal_demo_schema.sql |psql "host=$INSTANCE_IP user=postgres dbname=quickstart_db"
SET
SET
SET
SET
SET
 set_config 
------------
 
(1 row)
SET
SET
SET
SET
SET
SET
CREATE TABLE
ALTER TABLE
CREATE TABLE
ALTER TABLE
CREATE TABLE
ALTER TABLE
CREATE TABLE
ALTER TABLE
CREATE SEQUENCE
ALTER TABLE
ALTER SEQUENCE
ALTER TABLE
ALTER TABLE
ALTER TABLE
student@instance-1:~$ gsutil cat gs://cloud-training/gcc/gcc-tech-004/cymbal_products.csv |psql "host=$INSTANCE_IP user=postgres dbname=quickstart_db" -c "\copy cymbal_products from stdin csv header"
COPY 941
student@instance-1:~$ gsutil cat gs://cloud-training/gcc/gcc-tech-004/cymbal_inventory.csv |psql "host=$INSTANCE_IP user=postgres dbname=quickstart_db" -c "\copy cymbal_inventory from stdin csv header"
COPY 263861
student@instance-1:~$ gsutil cat gs://cloud-training/gcc/gcc-tech-004/cymbal_stores.csv |psql "host=$INSTANCE_IP user=postgres dbname=quickstart_db" -c "\copy cymbal_stores from stdin csv header"
COPY 4654
student@instance-1:~$
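
With "csv header", \copy skips the first line of the file, so each COPY count above should equal the number of data rows in the corresponding CSV. A rough way to cross-check a file is to count its lines after the header. This sketch assumes no field contains an embedded newline (otherwise the line count overstates the row count). The count_csv_rows helper and the sample file are illustrative only.

```shell
# Hypothetical helper: count the lines of a CSV file after its header line.
count_csv_rows() {
  tail -n +2 "$1" | wc -l | tr -d ' '
}

# Tiny made-up file: one header line and two data rows.
sample=$(mktemp)
printf 'id,name\n1,Cherry Tree\n2,Toyon\n' > "$sample"
count_csv_rows "$sample"   # prints 2
rm -f "$sample"
```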

7. Calculate embeddings

After importing the data, we have product data in the cymbal_products table, inventory showing the number of available products in each store in the cymbal_inventory table, and a list of stores in the cymbal_stores table. We need to calculate vector data based on the product descriptions, and we will use the embedding function for that. The function uses the Vertex AI integration to calculate vector data from the product descriptions and add it to the table. You can read more about the technology used in the documentation.

Create embedding column

Connect to the database using psql and create a generated column that holds the vector data, using the embedding function on the cymbal_products table. The embedding function returns vector data from Vertex AI based on the data supplied from the product_description column.

psql "host=$INSTANCE_IP user=postgres dbname=quickstart_db"

In the psql session, after connecting to the database, execute:

ALTER TABLE cymbal_products ADD COLUMN embedding vector(768) GENERATED ALWAYS AS (embedding('text-embedding-005',product_description)) STORED;

This command creates the generated column and populates it with vector data.

Expected console output:

student@instance-1:~$ psql "host=$INSTANCE_IP user=postgres dbname=quickstart_db"
psql (13.11 (Debian 13.11-0+deb11u1), server 14.7)
WARNING: psql major version 13, server major version 14.
         Some psql features might not work.
SSL connection (protocol: TLSv1.3, cipher: TLS_AES_256_GCM_SHA384, bits: 256, compression: off)
Type "help" for help.

quickstart_db=> ALTER TABLE cymbal_products ADD COLUMN embedding vector(768) GENERATED ALWAYS AS (embedding('text-embedding-005',product_description)) STORED;
ALTER TABLE
quickstart_db=>

8. Run Similarity Search

We can now run our search using similarity search based on the vector values calculated for the descriptions and the vector value we get for our request.

The SQL query can be executed from the same psql command-line interface or, as an alternative, from AlloyDB Studio. Multi-row and complex output may look better in AlloyDB Studio.

Connect to AlloyDB Studio

In the following chapters, all SQL commands that require a connection to the database can alternatively be executed in AlloyDB Studio. To run the commands, open the web console interface for your AlloyDB cluster by clicking on the primary instance.

Then click AlloyDB Studio on the left.

Choose the quickstart_db database and the postgres user, and provide the password noted when we created the cluster. Then click the "Authenticate" button.

This opens the AlloyDB Studio interface. To run commands in the database, click the "Editor 1" tab on the right.

This opens an interface where you can run SQL commands.

If you prefer to use command-line psql, follow the alternative route and connect to the database from your VM SSH session as described in the previous chapters.

Run Similarity Search from psql

If your database session was disconnected, connect to the database again using psql or AlloyDB Studio.

Connect to the database:

psql "host=$INSTANCE_IP user=postgres dbname=quickstart_db"

Run a query to get a list of available products most closely related to a customer's request. The request we pass to Vertex AI to get the vector value is "What kind of fruit trees grow well here?"

Here is the query you can run to choose the first 10 items most suitable for our request:

SELECT
        cp.product_name,
        left(cp.product_description,80) as description,
        cp.sale_price,
        cs.zip_code,
        (cp.embedding <=> embedding('text-embedding-005','What kind of fruit trees grow well here?')::vector) as distance
FROM
        cymbal_products cp
JOIN cymbal_inventory ci on
        ci.uniq_id=cp.uniq_id
JOIN cymbal_stores cs on
        cs.store_id=ci.store_id
        AND ci.inventory>0
        AND cs.store_id = 1583
ORDER BY
        distance ASC
LIMIT 10;

And here is the expected output:

quickstart_db=> SELECT
        cp.product_name,
        left(cp.product_description,80) as description,
        cp.sale_price,
        cs.zip_code,
        (cp.embedding <=> embedding('text-embedding-005','What kind of fruit trees grow well here?')::vector) as distance
FROM
        cymbal_products cp
JOIN cymbal_inventory ci on
        ci.uniq_id=cp.uniq_id
JOIN cymbal_stores cs on
        cs.store_id=ci.store_id
        AND ci.inventory>0
        AND cs.store_id = 1583
ORDER BY
        distance ASC
LIMIT 10;
      product_name       |                                   description                                    | sale_price | zip_code |      distance       
-------------------------+----------------------------------------------------------------------------------+------------+----------+---------------------
 Cherry Tree             | This is a beautiful cherry tree that will produce delicious cherries. It is an d |      75.00 |    93230 | 0.43922018972266397
 Meyer Lemon Tree        | Meyer Lemon trees are California's favorite lemon tree! Grow your own lemons by  |         34 |    93230 |  0.4685112926118228
 Toyon                   | This is a beautiful toyon tree that can grow to be over 20 feet tall. It is an e |      10.00 |    93230 |  0.4835677149651668
 California Lilac        | This is a beautiful lilac tree that can grow to be over 10 feet tall. It is an d |       5.00 |    93230 |  0.4947204525907498
 California Peppertree   | This is a beautiful peppertree that can grow to be over 30 feet tall. It is an e |      25.00 |    93230 |  0.5054166905547247
 California Black Walnut | This is a beautiful walnut tree that can grow to be over 80 feet tall. It is a d |     100.00 |    93230 |  0.5084219510932597
 California Sycamore     | This is a beautiful sycamore tree that can grow to be over 100 feet tall. It is  |     300.00 |    93230 |  0.5140519790508755
 Coast Live Oak          | This is a beautiful oak tree that can grow to be over 100 feet tall. It is an ev |     500.00 |    93230 |  0.5143126438081371
 Fremont Cottonwood      | This is a beautiful cottonwood tree that can grow to be over 100 feet tall. It i |     200.00 |    93230 |  0.5174774727252058
 Madrone                 | This is a beautiful madrona tree that can grow to be over 80 feet tall. It is an |      50.00 |    93230 |  0.5227400803389093
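
The distance column comes from the pgvector <=> operator, which returns cosine distance (1 minus cosine similarity), so smaller values mean closer matches. To make the numbers concrete, here is a sketch of the same formula in awk on tiny hand-made vectors. The cosine_distance helper is illustrative only; the real embeddings in this codelab have 768 dimensions and are compared inside the database.

```shell
# Hypothetical helper: cosine distance between two comma-separated vectors.
cosine_distance() {
  awk -v a="$1" -v b="$2" 'BEGIN {
    n = split(a, x, ","); m = split(b, y, ",")
    if (n != m || n == 0) { print "error: vectors differ in length"; exit 1 }
    for (i = 1; i <= n; i++) {
      dot += x[i] * y[i]     # dot product
      na  += x[i] * x[i]     # squared norm of a
      nb  += y[i] * y[i]     # squared norm of b
    }
    if (na == 0 || nb == 0) { print "error: zero vector"; exit 1 }
    printf "%.4f\n", 1 - dot / sqrt(na * nb)
  }'
}

cosine_distance "1,0" "0,1"       # orthogonal vectors: prints 1.0000
cosine_distance "1,2,3" "2,4,6"   # same direction: prints 0.0000
```

Because only the direction of the vectors matters, the second pair has distance 0 even though the vectors differ in length.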

9. Improve Response

You can improve the response to a client application by using the result of the query to prepare a meaningful output, supplying the query results as part of the prompt to a Vertex AI generative foundation language model.

To achieve that, we plan to generate JSON with our results from the vector search, then use that JSON as an addition to a prompt for a text LLM in Vertex AI to create a meaningful output. In the first step we generate the JSON, then we test it in Vertex AI Studio, and in the last step we incorporate it into a SQL statement that can be used in an application.

Generate output in JSON format

Modify the query to generate the output in JSON format and return only one row to pass to Vertex AI.

Here is an example of the query:

WITH trees as (
SELECT
        cp.product_name,
        left(cp.product_description,80) as description,
        cp.sale_price,
        cs.zip_code,
        cp.uniq_id as product_id
FROM
        cymbal_products cp
JOIN cymbal_inventory ci on
        ci.uniq_id=cp.uniq_id
JOIN cymbal_stores cs on
        cs.store_id=ci.store_id
        AND ci.inventory>0
        AND cs.store_id = 1583
ORDER BY
        (cp.embedding <=> embedding('text-embedding-005','What kind of fruit trees grow well here?')::vector) ASC
LIMIT 1)
SELECT json_agg(trees) FROM trees;

And here is the expected JSON in the output:

[{"product_name":"Cherry Tree","description":"This is a beautiful cherry tree that will produce delicious cherries. It is an d","sale_price":75.00,"zip_code":93230,"product_id":"d536e9e823296a2eba198e52dd23e712"}]

Run the prompt in Vertex AI Studio

We can use the generated JSON to supply it as part of the prompt to a generative AI text model in Vertex AI Studio.

Open Vertex AI Studio in the Cloud Console.

Click the "Accept and continue" button.

Write your prompt at the bottom of the interface.

Note: In this example we use all default parameters for the free-form text prompt and the latest Gemini model (at the time of writing) that is suggested by default. The output may differ depending on the model version and parameters. Read more about Vertex AI and generative language models in the documentation.

You may be asked to enable additional APIs, but you can ignore the request. We don't need any additional APIs to finish this lab.

Here is the prompt we are going to use with the JSON output of the earlier query about the trees:

You are a friendly advisor helping to find a product based on the customer's needs.

Based on the client request we have loaded a list of products closely related to search.

The list in JSON format with list of values like {"product_name":"name","description":"some description","sale_price":10,"zip_code": 10234, "product_id": "02056727942aeb714dc9a2313654e1b0"}

Here is the list of products:

{"product_name":"Cherry Tree","description":"This is a beautiful cherry tree that will produce delicious cherries. It is an d","sale_price":75.00,"zip_code":93230,"product_id":"d536e9e823296a2eba198e52dd23e712"}

The customer asked "What tree is growing the best here?"

You should give information about the product, price and some supplemental information.
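
When the prompt is reused for different search results, it helps to keep the fixed instructions separate from the two variable parts: the JSON list and the customer question. A minimal shell sketch of that templating follows. The build_prompt helper is illustrative, and the shortened JSON in the example is a placeholder rather than real query output.

```shell
# Hypothetical helper: assemble the prompt text.
# $1 = JSON list of products, $2 = customer question.
build_prompt() {
  printf '%s\n' \
    "You are a friendly advisor helping to find a product based on the customer's needs." \
    "Based on the client request we have loaded a list of products closely related to search." \
    "Here is the list of products:" \
    "$1" \
    "The customer asked \"$2\"" \
    "You should give information about the product, price and some supplemental information."
}

build_prompt '{"product_name":"Cherry Tree","sale_price":75.00}' \
  'What tree is growing the best here?'
```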

And here is the result when we run the prompt with our JSON values, using the gemini-2.0-flash-001 model.

The answer we got from the model in this example follows. Note that your answer may be different because the model and parameters change over time:

"Okay, I can help you with that. Based on the limited product list we have, the Cherry Tree may be a good option.

Here is what we know:

Product: Cherry Tree

Description: "This is a beautiful cherry tree that will produce delicious cherries. It is an d" (Unfortunately, the description is incomplete.)

Price: $75.00

Zip code: 93230 (This is important for understanding whether the tree grows well in your area.)"

Run the prompt in PSQL

We can use the AlloyDB AI integration with Vertex AI to get the same response from a generative model using SQL directly in the database. But to use the gemini-2.0-flash-001 model we need to register it first.

Verify the google_ml_integration extension. It should be version 1.4.2 or higher.

Connect to the quickstart_db database from psql as shown before (or use AlloyDB Studio) and execute:

SELECT extversion from pg_extension where extname='google_ml_integration';

Check the google_ml_integration.enable_model_support database flag:

show google_ml_integration.enable_model_support;

The expected output from the psql session is "on":

postgres=> show google_ml_integration.enable_model_support;
 google_ml_integration.enable_model_support 
--------------------------------------------
 on
(1 row)

If it shows "off", we need to set the google_ml_integration.enable_model_support database flag to "on". You can do this in the AlloyDB web console interface or by running the following gcloud command.

PROJECT_ID=$(gcloud config get-value project)
REGION=us-central1
ADBCLUSTER=alloydb-aip-01
gcloud beta alloydb instances update $ADBCLUSTER-pr \
  --database-flags google_ml_integration.enable_model_support=on \
  --region=$REGION \
  --cluster=$ADBCLUSTER \
  --project=$PROJECT_ID \
  --update-mode=FORCE_APPLY

The command takes around 3 to 5 minutes to execute in the background. Then you can verify the flag again.

Now we need to register two models. The first one is the text-embedding-005 model, which we already use. It has to be registered because we enabled the model registration capability.

To register the model, run the following code in psql or AlloyDB Studio:

CALL
  google_ml.create_model(
    model_id => 'text-embedding-005',
    model_provider => 'google',
    model_qualified_name => 'text-embedding-005',
    model_type => 'text_embedding',
    model_auth_type => 'alloydb_service_agent_iam',
    model_in_transform_fn => 'google_ml.vertexai_text_embedding_input_transform',
    model_out_transform_fn => 'google_ml.vertexai_text_embedding_output_transform');

The next model we need to register is gemini-2.0-flash-001, which will be used to generate the user-friendly output.

CALL
  google_ml.create_model(
    model_id => 'gemini-2.0-flash-001',
    model_request_url => 'publishers/google/models/gemini-2.0-flash-001:streamGenerateContent',
    model_provider => 'google',
    model_auth_type => 'alloydb_service_agent_iam');

You can always verify the list of registered models by selecting from google_ml.model_info_view.

select model_id,model_type from google_ml.model_info_view;

Here is a sample output:

quickstart_db=> select model_id,model_type from google_ml.model_info_view;
        model_id         |   model_type   
-------------------------+----------------
 textembedding-gecko     | text_embedding
 textembedding-gecko@001 | text_embedding
 text-embedding-005      | text_embedding
 gemini-2.0-flash-001    | generic
(4 rows)

Now we can use the JSON generated in a subquery to supply it as part of the prompt to the generative AI text model using SQL.

In the psql or AlloyDB Studio session to the database, run the query:

WITH trees AS (
SELECT
        cp.product_name,
        cp.product_description AS description,
        cp.sale_price,
        cs.zip_code,
        cp.uniq_id AS product_id
FROM
        cymbal_products cp
JOIN cymbal_inventory ci ON
        ci.uniq_id = cp.uniq_id
JOIN cymbal_stores cs ON
        cs.store_id = ci.store_id
        AND ci.inventory>0
        AND cs.store_id = 1583
ORDER BY
        (cp.embedding <=> embedding('text-embedding-005',
        'What kind of fruit trees grow well here?')::vector) ASC
LIMIT 1),
prompt AS (
SELECT
        'You are a friendly advisor helping to find a product based on the customer''s needs.
Based on the client request we have loaded a list of products closely related to search.
The list in JSON format with list of values like {"product_name":"name","product_description":"some description","sale_price":10}
Here is the list of products:' || json_agg(trees) || 'The customer asked "What kind of fruit trees grow well here?"
You should give information about the product, price and some supplemental information' AS prompt_text
FROM
        trees),
response AS (
SELECT
        json_array_elements(google_ml.predict_row( model_id =>'gemini-2.0-flash-001',
        request_body => json_build_object('contents',
        json_build_object('role',
        'user',
        'parts',
        json_build_object('text',
        prompt_text)))))->'candidates'->0->'content'->'parts'->0->'text' AS resp
FROM
        prompt)
SELECT
        string_agg(resp::text,
        ' ')
FROM
        response;

Here is the expected output. Your output may differ depending on the model version and parameters.

"Okay" ", based on" " the product list, the \"Cherry Tree\" seems like a potential option for you.\n\n" "* **Product:** Cherry Tree\n* **Description:** It's a beautiful" " deciduous tree that grows to about 15 feet tall. You'll get dark green leaves in the summer that turn red in the fall. These trees are known for" " their beauty, shade, and privacy. Plus, you'll get delicious cherries!\n* **Growing Conditions:** Cherry trees prefer a cool, moist climate" " and sandy soil.\n* **USDA Zones:** They are best suited for USDA zones 4-9. (You may want to confirm that zone 4-9 is appropriate for your location.)\n* **Price:** \\$" "75.00\n\n**To make sure this is the *best* fit for you, could you tell me:**\n\n1. **Your Zip Code:** While the product lists zip code 93230, I" " would like to confirm where you are to verify that the USDA zone is a match for your area.\n2. **What kind of soil do you have?** The product description says that cherry trees prefer sandy soil.\n\nOnce I have this information, I can give you a more confident recommendation!\n"

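The response above appears as a run of quoted fragments because the model endpoint is registered with streamGenerateContent: each streamed chunk contributes its own text part, and string_agg joins the JSON-quoted parts with spaces. The Python sketch below is not part of the lab. It assumes the psql output is a sequence of JSON string literals separated by whitespace, and shows one way to stitch them back into readable text. The function name join_streamed_fragments is hypothetical.

```python
import json


def join_streamed_fragments(psql_output: str) -> str:
    """Decode consecutive JSON string literals and concatenate them."""
    decoder = json.JSONDecoder()
    text = psql_output.strip()
    pieces = []
    pos = 0
    while pos < len(text):
        # Skip the whitespace that string_agg placed between fragments.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            break
        # raw_decode returns the decoded value and the index where it ended.
        value, pos = decoder.raw_decode(text, pos)
        pieces.append(value)
    return "".join(pieces)


if __name__ == "__main__":
    sample = '"Okay" ", based on" " the product list.\\n"'
    print(join_streamed_fragments(sample))
```

Joining with an empty string rather than a space keeps words that were split across chunk boundaries intact.
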
10. Create a vector index

Our dataset is quite small, so the response time depends mainly on the interaction with the AI models. With millions of vectors, however, the vector search itself can account for a significant share of the response time and put a high load on the system. We can improve this by building an index on the vectors.

Create a ScaNN index

To build a ScaNN index, we need to enable one more extension. The alloydb_scann extension provides the interface for working with ANN-type vector indexes that use the Google ScaNN algorithm.

CREATE EXTENSION IF NOT EXISTS alloydb_scann;

Expected output

quickstart_db=> CREATE EXTENSION IF NOT EXISTS alloydb_scann;
CREATE EXTENSION
Time: 27.468 ms
quickstart_db=>

Now we can build the index. In this example we leave most parameters at their defaults and set only the number of partitions (num_leaves) and the maximum number of levels (max_num_levels) for the index.

CREATE INDEX cymbal_products_embeddings_scann ON cymbal_products
  USING scann (embedding cosine)
  WITH (num_leaves=31, max_num_levels = 2);

You can read about tuning the index parameters in the documentation.

Expected output

quickstart_db=> CREATE INDEX cymbal_products_embeddings_scann ON cymbal_products
  USING scann (embedding cosine)
  WITH (num_leaves=31, max_num_levels = 2);
CREATE INDEX
quickstart_db=>
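
The lab gives num_leaves=31 without showing how it was chosen. A commonly used starting heuristic for partition-based ANN indexes is roughly the square root of the number of indexed rows. The Python sketch below encodes that heuristic as an assumption only; it is not stated by this lab, and the tuning documentation remains the authority. The helper name suggest_num_leaves is hypothetical.

```python
import math


def suggest_num_leaves(row_count: int) -> int:
    """Return a starting guess for num_leaves: about sqrt(row_count).

    Assumption: the square-root rule of thumb; tune against real data.
    """
    if row_count < 1:
        raise ValueError("row_count must be a positive integer")
    return max(1, round(math.sqrt(row_count)))


if __name__ == "__main__":
    for rows in (961, 1_000_000):
        print(rows, suggest_num_leaves(rows))
```

Treat the result as a first value to try, then measure query latency and result quality before settling on it.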

Compare responses

Now we can run the vector search query in EXPLAIN mode and verify whether the index is used.

EXPLAIN (analyze) 
WITH trees as (
SELECT
        cp.product_name,
        left(cp.product_description,80) as description,
        cp.sale_price,
        cs.zip_code,
        cp.uniq_id as product_id
FROM
        cymbal_products cp
JOIN cymbal_inventory ci on
        ci.uniq_id=cp.uniq_id
JOIN cymbal_stores cs on
        cs.store_id=ci.store_id
        AND ci.inventory>0
        AND cs.store_id = 1583
ORDER BY
        (cp.embedding <=> embedding('text-embedding-005','What kind of fruit trees grow well here?')::vector) ASC
LIMIT 1)
SELECT json_agg(trees) FROM trees;

Expected output

Aggregate (cost=16.59..16.60 rows=1 width=32) (actual time=2.875..2.877 rows=1 loops=1)
-> Subquery Scan on trees (cost=8.42..16.59 rows=1 width=142) (actual time=2.860..2.862 rows=1 loops=1)
-> Limit (cost=8.42..16.58 rows=1 width=158) (actual time=2.855..2.856 rows=1 loops=1)
-> Nested Loop (cost=8.42..6489.19 rows=794 width=158) (actual time=2.854..2.855 rows=1 loops=1)
-> Nested Loop (cost=8.13..6466.99 rows=794 width=938) (actual time=2.742..2.743 rows=1 loops=1)
-> Index Scan using cymbal_products_embeddings_scann on cymbal_products cp (cost=7.71..111.99 rows=876 width=934) (actual time=2.724..2.724 rows=1 loops=1)
Order By: (embedding <=> '[0.008864171,0.03693164,-0.024245683,-0.00355923,0.0055611245,0.015985578,...<redacted>...5685,-0.03914233,-0.018452475,0.00826032,-0.07372604]'::vector)
-> Index Scan using walmart_inventory_pkey on cymbal_inventory ci (cost=0.42..7.26 rows=1 width=37) (actual time=0.015..0.015 rows=1 loops=1)
Index Cond: ((store_id = 1583) AND (uniq_id = (cp.uniq_id)::text))

From the output we can clearly see that the query used "Index Scan using cymbal_products_embeddings_scann on cymbal_products".
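
If the same check has to be repeated, for example in a test script, the plan text can be scanned for index names instead of being read by eye. This is a minimal Python sketch that assumes the plan is available as plain text in the usual PostgreSQL EXPLAIN layout. The function name indexes_used is hypothetical.

```python
import re

# Matches plan nodes such as "Index Scan using <index> on <table>".
_INDEX_NODE = re.compile(r"Index (?:Only )?Scan(?: Backward)? using (\S+) on")


def indexes_used(plan_text: str) -> list:
    """Return the index names mentioned by index-scan nodes, in plan order."""
    return _INDEX_NODE.findall(plan_text)


if __name__ == "__main__":
    plan = (
        "-> Index Scan using cymbal_products_embeddings_scann on cymbal_products cp\n"
        "-> Index Scan using walmart_inventory_pkey on cymbal_inventory ci\n"
    )
    print("cymbal_products_embeddings_scann" in indexes_used(plan))
```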

And if we run the query without EXPLAIN

WITH trees as (
SELECT
        cp.product_name,
        left(cp.product_description,80) as description,
        cp.sale_price,
        cs.zip_code,
        cp.uniq_id as product_id
FROM
        cymbal_products cp
JOIN cymbal_inventory ci on
        ci.uniq_id=cp.uniq_id
JOIN cymbal_stores cs on
        cs.store_id=ci.store_id
        AND ci.inventory>0
        AND cs.store_id = 1583
ORDER BY
        (cp.embedding <=> embedding('text-embedding-005','What kind of fruit trees grow well here?')::vector) ASC
LIMIT 1)
SELECT json_agg(trees) FROM trees;

Expected output

[{"product_name":"Meyer Lemon Tree","description":"Meyer Lemon trees are California's favorite lemon tree! Grow your own lemons by ","sale_price":34,"zip_code":93230,"product_id":"02056727942aeb714dc9a2313654e1b0"}]

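We can see that the result is slightly different: instead of the Cherry Tree, which was the top match in the search without the index, the query returns the second choice, the Meyer Lemon Tree. The index is an approximate nearest neighbor search, so it trades a little accuracy for speed while remaining accurate enough to deliver good results.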
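
Why an approximate index can return the second-best match can be illustrated with a toy partitioned search. The Python sketch below is not ScaNN's actual algorithm and uses made-up 2-D vectors. It only shows the general idea that a partitioned index inspects the partitions whose centroids are closest to the query, so the true nearest item is missed when it sits in a partition that was not searched. All names, including leaves_to_search, are hypothetical and are not AlloyDB settings.

```python
import math


def unit(deg):
    """Return a 2-D unit vector pointing at the given angle in degrees."""
    rad = math.radians(deg)
    return (math.cos(rad), math.sin(rad))


def cosine_distance(a, b):
    """Cosine distance: 1 minus the cosine of the angle between a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


# Toy "embeddings" (hypothetical; real embeddings have hundreds of dimensions).
ITEMS = {
    "cherry": unit(20),
    "other_a": unit(-20),
    "lemon": unit(40),
    "other_b": unit(50),
}
# Two partitions, each a (centroid, member names) pair.
LEAVES = [
    (unit(0), ["cherry", "other_a"]),
    (unit(45), ["lemon", "other_b"]),
]
QUERY = (1.0, 0.5)


def exact_top1(query):
    """Brute force: compare the query with every item."""
    return min(ITEMS, key=lambda name: cosine_distance(query, ITEMS[name]))


def partitioned_top1(query, leaves_to_search=1):
    """Search only the partitions whose centroids are nearest to the query."""
    nearest = sorted(LEAVES, key=lambda leaf: cosine_distance(query, leaf[0]))
    candidates = [n for _, names in nearest[:leaves_to_search] for n in names]
    return min(candidates, key=lambda name: cosine_distance(query, ITEMS[name]))


if __name__ == "__main__":
    print(exact_top1(QUERY))           # cherry
    print(partitioned_top1(QUERY, 1))  # lemon
    print(partitioned_top1(QUERY, 2))  # cherry
```

Searching more partitions recovers the exact answer at the cost of more distance computations, which is the trade-off that index tuning adjusts.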

You can try the other indexes available for vectors, and find more labs and examples with LangChain integration, on the documentation page.

11. Clean up the environment

Destroy the AlloyDB instances and cluster when you are done with the lab.

Delete the AlloyDB cluster and all instances

The cluster is destroyed with the force option, which also deletes all the instances belonging to the cluster.

In Cloud Shell, define the project and environment variables again if you were disconnected and all the previous settings were lost

gcloud config set project <your project id>

export REGION=us-central1
export ADBCLUSTER=alloydb-aip-01
export PROJECT_ID=$(gcloud config get-value project)

Delete the cluster

gcloud alloydb clusters delete $ADBCLUSTER --region=$REGION --force

Expected console output

student@cloudshell:~ (test-project-001-402417)$ gcloud alloydb clusters delete $ADBCLUSTER --region=$REGION --force
All of the cluster data will be lost when the cluster is deleted.

Do you want to continue (Y/n)?  Y

Operation ID: operation-1697820178429-6082890a0b570-4a72f7e4-4c5df36f
Deleting cluster...done.

Delete AlloyDB backups

Delete all AlloyDB backups for the cluster

for i in $(gcloud alloydb backups list --filter="CLUSTER_NAME: projects/$PROJECT_ID/locations/$REGION/clusters/$ADBCLUSTER" --format="value(name)" --sort-by=~createTime) ; do gcloud alloydb backups delete $(basename $i) --region $REGION --quiet; done

Expected console output

student@cloudshell:~ (test-project-001-402417)$ for i in $(gcloud alloydb backups list --filter="CLUSTER_NAME: projects/$PROJECT_ID/locations/$REGION/clusters/$ADBCLUSTER" --format="value(name)" --sort-by=~createTime) ; do gcloud alloydb backups delete $(basename $i) --region $REGION --quiet; done
Operation ID: operation-1697826266108-60829fb7b5258-7f99dc0b-99f3c35f
Deleting backup...done.
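
The loop above passes each backup's full resource name through basename so that gcloud alloydb backups delete receives only the trailing backup ID. The Python sketch below mirrors that single step. The resource name in the example is made up for illustration, and the function name backup_id is hypothetical.

```python
import posixpath


def backup_id(resource_name: str) -> str:
    """Return the last path segment of a slash-separated resource name."""
    return posixpath.basename(resource_name.rstrip("/"))


if __name__ == "__main__":
    # Hypothetical resource name, shaped like the values the loop iterates over.
    name = "projects/my-project/locations/us-central1/backups/example-backup"
    print(backup_id(name))
```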

Now we can destroy the VM.

Delete the GCE VM

In Cloud Shell, execute the following

export GCEVM=instance-1
export ZONE=us-central1-a
gcloud compute instances delete $GCEVM \
    --zone=$ZONE \
    --quiet

Expected console output

student@cloudshell:~ (test-project-001-402417)$ export GCEVM=instance-1
export ZONE=us-central1-a
gcloud compute instances delete $GCEVM \
    --zone=$ZONE \
    --quiet
Deleted

12. Congratulations

Congratulations on completing the codelab.

What we've covered

How to deploy an AlloyDB cluster and primary instance
How to connect to AlloyDB from a Google Compute Engine VM
How to create a database and enable AlloyDB AI
How to load data into the database
How to use a Vertex AI embedding model in AlloyDB
How to enrich the result using a Vertex AI generative model
How to improve performance using a vector index

13. Survey

Output:

How will you use this tutorial?

Only read through it

Read it and complete the exercises