程式碼研究室簡介
1. 簡介
本程式碼研究室提供指南,說明如何搭配 AI 運算子部署 AlloyDB,並將其用於語意搜尋、彙整和結果排名等工作。
必要條件
- 具備 Google Cloud 控制台的基本知識
- 指令列介面和 Cloud Shell 的基本技能
課程內容
- 如何部署 AlloyDB for Postgres
- 如何啟用 AlloyDB AI 運算子
- 如何使用不同的 AlloyDB AI 運算子
- 如何在 RAG 工作流程中使用 AlloyDB AI 運算子
軟硬體需求
- Google Cloud 帳戶和 Google Cloud 專案
- 支援 Google Cloud 控制台和 Cloud Shell 的網路瀏覽器,例如 Chrome
2. 設定和需求
自助式環境設定
- 登入 Google Cloud 控制台,然後建立新專案或重複使用現有專案。如果您還沒有 Gmail 或 Google Workspace 帳戶,請務必建立帳戶。
- 「Project name」是這個專案參與者的顯示名稱。這是 Google API 不會使用的字元字串。您隨時可以更新。
- 專案 ID 在所有 Google Cloud 專案中都是不重複的值,且無法變更 (設定後即無法變更)。Cloud 控制台會自動產生唯一字串,您通常不需要理會這個字串。在大多數程式碼研究室中,您需要參照專案 ID (通常會標示為
PROJECT_ID
)。如果您不喜歡產生的 ID,可以產生另一個隨機 ID。或者,您也可以自行嘗試,看看是否可用。在這個步驟後就無法變更,且會在專案期間維持不變。 - 提醒您,有些 API 會使用第三個值「專案編號」。如要進一步瞭解這三個值,請參閱說明文件。
- 接下來,您需要在 Cloud 控制台中啟用帳單功能,才能使用 Cloud 資源/API。執行這個程式碼研究室不會產生任何費用,如要關閉資源,避免在本教學課程結束後繼續產生費用,您可以刪除建立的資源或專案。Google Cloud 新使用者可享有 $300 美元的免費試用期。
啟動 Cloud Shell
雖然 Google Cloud 可透過筆記型電腦遠端操作,但在本程式碼研究室中,您將使用 Google Cloud Shell,這是在雲端運作的指令列環境。
在 Google Cloud 控制台中,按一下右上方工具列的 Cloud Shell 圖示:
佈建並連線至環境的作業需要一些時間才能完成。完成後,畫面應如下所示:
這個虛擬機器會載入您需要的所有開發工具。提供永久的 5 GB 主目錄,而且在 Google Cloud 中運作,大幅提升網路效能和驗證功能。您可以在瀏覽器中完成本程式碼研究室的所有工作。您不需要安裝任何東西。
3. 事前準備
啟用 API
在 Cloud Shell 中,確認專案 ID 已設定完畢:
gcloud config set project [YOUR-PROJECT-ID]
設定環境變數 PROJECT_ID:
PROJECT_ID=$(gcloud config get-value project)
啟用所有必要服務:
gcloud services enable alloydb.googleapis.com \
compute.googleapis.com \
cloudresourcemanager.googleapis.com \
servicenetworking.googleapis.com \
aiplatform.googleapis.com \
discoveryengine.googleapis.com \
secretmanager.googleapis.com
預期的輸出內容:
student@cloudshell:~ (test-project-001-402417)$ gcloud config set project test-project-001-402417 Updated property [core/project]. student@cloudshell:~ (test-project-001-402417)$ PROJECT_ID=$(gcloud config get-value project) Your active configuration is: [cloudshell-14650] student@cloudshell:~ (test-project-001-402417)$ student@cloudshell:~ (test-project-001-402417)$ gcloud services enable alloydb.googleapis.com \ compute.googleapis.com \ cloudresourcemanager.googleapis.com \ servicenetworking.googleapis.com \ aiplatform.googleapis.com Operation "operations/acat.p2-4470404856-1f44ebd8-894e-4356-bea7-b84165a57442" finished successfully.
4. 部署 AlloyDB
建立 AlloyDB 叢集和主要執行個體。下列程序說明如何使用 Google Cloud SDK 建立 AlloyDB 叢集和執行個體。如果您偏好使用主控台,請按照這裡的說明文件操作。
在建立 AlloyDB 叢集之前,我們需要在虛擬私人雲端中提供可供日後 AlloyDB 執行個體使用的私人 IP 範圍。如果沒有,我們需要建立,並指派給內部 Google 服務使用,之後才能建立叢集和執行個體。
建立私人 IP 範圍
我們需要在 AlloyDB 的 VPC 中設定私人服務存取權設定。這裡假設專案中有「預設」虛擬私有雲網路,且會用於所有動作。
建立私人 IP 範圍:
gcloud compute addresses create psa-range \
--global \
--purpose=VPC_PEERING \
--prefix-length=24 \
--description="VPC private service access" \
--network=default
使用分配的 IP 範圍建立私人連線:
gcloud services vpc-peerings connect \
--service=servicenetworking.googleapis.com \
--ranges=psa-range \
--network=default
預期的控制台輸出內容:
student@cloudshell:~ (test-project-402417)$ gcloud compute addresses create psa-range \ --global \ --purpose=VPC_PEERING \ --prefix-length=24 \ --description="VPC private service access" \ --network=default Created [https://www.googleapis.com/compute/v1/projects/test-project-402417/global/addresses/psa-range]. student@cloudshell:~ (test-project-402417)$ gcloud services vpc-peerings connect \ --service=servicenetworking.googleapis.com \ --ranges=psa-range \ --network=default Operation "operations/pssn.p24-4470404856-595e209f-19b7-4669-8a71-cbd45de8ba66" finished successfully. student@cloudshell:~ (test-project-402417)$
建立 AlloyDB 叢集
在本節中,我們會在 us-central1 區域中建立 AlloyDB 叢集。
定義 postgres 使用者的密碼。您可以自行定義密碼,也可以使用隨機函式產生密碼
export PGPASSWORD=`openssl rand -hex 12`
預期的控制台輸出內容:
student@cloudshell:~ (test-project-402417)$ export PGPASSWORD=`openssl rand -hex 12`
請記下 PostgreSQL 密碼,以供日後使用:
echo $PGPASSWORD
預期的控制台輸出內容:
student@cloudshell:~ (test-project-402417)$ echo $PGPASSWORD bbefbfde7601985b0dee5723
建立免付費試用叢集
如果您之前未曾使用 AlloyDB,可以建立免費試用叢集:
定義區域和 AlloyDB 叢集名稱。我們將使用 us-central1 區域,並將 alloydb-aip-01 做為叢集名稱:
export REGION=us-central1
export ADBCLUSTER=alloydb-aip-01
執行指令建立叢集:
gcloud alloydb clusters create $ADBCLUSTER \
--password=$PGPASSWORD \
--network=default \
--region=$REGION \
--subscription-type=TRIAL
預期的控制台輸出內容:
export REGION=us-central1 export ADBCLUSTER=alloydb-aip-01 gcloud alloydb clusters create $ADBCLUSTER \ --password=$PGPASSWORD \ --network=default \ --region=$REGION \ --subscription-type=TRIAL Operation ID: operation-1697655441138-6080235852277-9e7f04f5-2012fce4 Creating cluster...done.
在同一個 Cloud Shell 工作階段中,為叢集建立 AlloyDB 主要執行個體。如果連線中斷,您必須重新定義地區和叢集名稱環境變數。
gcloud alloydb instances create $ADBCLUSTER-pr \
--instance-type=PRIMARY \
--cpu-count=8 \
--region=$REGION \
--cluster=$ADBCLUSTER
預期的控制台輸出內容:
student@cloudshell:~ (test-project-402417)$ gcloud alloydb instances create $ADBCLUSTER-pr \ --instance-type=PRIMARY \ --cpu-count=8 \ --region=$REGION \ --availability-type ZONAL \ --cluster=$ADBCLUSTER Operation ID: operation-1697659203545-6080315c6e8ee-391805db-25852721 Creating instance...done.
建立 AlloyDB Standard 叢集
如果不是專案中的第一個 AlloyDB 叢集,請繼續建立標準叢集。
定義區域和 AlloyDB 叢集名稱。我們將使用 us-central1 區域,並將 alloydb-aip-01 做為叢集名稱:
export REGION=us-central1
export ADBCLUSTER=alloydb-aip-01
執行指令建立叢集:
gcloud alloydb clusters create $ADBCLUSTER \
--password=$PGPASSWORD \
--network=default \
--region=$REGION
預期的控制台輸出內容:
export REGION=us-central1 export ADBCLUSTER=alloydb-aip-01 gcloud alloydb clusters create $ADBCLUSTER \ --password=$PGPASSWORD \ --network=default \ --region=$REGION Operation ID: operation-1697655441138-6080235852277-9e7f04f5-2012fce4 Creating cluster...done.
在同一個 Cloud Shell 工作階段中,為叢集建立 AlloyDB 主要執行個體。如果連線中斷,您必須重新定義地區和叢集名稱環境變數。
gcloud alloydb instances create $ADBCLUSTER-pr \
--instance-type=PRIMARY \
--cpu-count=2 \
--region=$REGION \
--cluster=$ADBCLUSTER
預期的控制台輸出內容:
student@cloudshell:~ (test-project-402417)$ gcloud alloydb instances create $ADBCLUSTER-pr \ --instance-type=PRIMARY \ --cpu-count=2 \ --region=$REGION \ --availability-type ZONAL \ --cluster=$ADBCLUSTER Operation ID: operation-1697659203545-6080315c6e8ee-391805db-25852721 Creating instance...done.
5. 準備資料庫
我們需要建立資料庫、啟用 Vertex AI 整合功能、建立資料庫物件,然後匯入資料。
授予 AlloyDB 必要權限
將 Vertex AI 權限新增至 AlloyDB 服務代理。
使用頂端的「+」符號開啟另一個 Cloud Shell 分頁。
在新的 Cloud Shell 分頁中執行:
PROJECT_ID=$(gcloud config get-value project)
gcloud projects add-iam-policy-binding $PROJECT_ID \
--member="serviceAccount:service-$(gcloud projects describe $PROJECT_ID --format="value(projectNumber)")@gcp-sa-alloydb.iam.gserviceaccount.com" \
--role="roles/aiplatform.user"
預期的控制台輸出內容:
student@cloudshell:~ (test-project-001-402417)$ PROJECT_ID=$(gcloud config get-value project) Your active configuration is: [cloudshell-11039] student@cloudshell:~ (test-project-001-402417)$ gcloud projects add-iam-policy-binding $PROJECT_ID \ --member="serviceAccount:service-$(gcloud projects describe $PROJECT_ID --format="value(projectNumber)")@gcp-sa-alloydb.iam.gserviceaccount.com" \ --role="roles/aiplatform.user" Updated IAM policy for project [test-project-001-402417]. bindings: - members: - serviceAccount:service-4470404856@gcp-sa-alloydb.iam.gserviceaccount.com role: roles/aiplatform.user - members: ... etag: BwYIEbe_Z3U= version: 1
在分頁中執行「exit」指令,關閉分頁:
exit
連線至 AlloyDB Studio
在後續章節中,所有需要連線至資料庫的 SQL 指令都可以改為在 AlloyDB Studio 中執行。如要執行指令,您必須按一下主要執行個體,開啟 AlloyDB 叢集的網路主控台介面。
然後按一下左側的 AlloyDB Studio:
請選擇 postgres 資料庫、使用者 postgres,並提供我們在建立叢集時記下的密碼。然後按一下「驗證」按鈕。
系統會開啟 AlloyDB Studio 介面。如要在資料庫中執行指令,請按一下右側的「編輯器 1」分頁。
系統會開啟介面,讓您執行 SQL 指令
建立資料庫
建立資料庫快速入門。
在 AlloyDB Studio 編輯器中執行下列指令。
建立資料庫:
CREATE DATABASE quickstart_db
預期輸出內容:
Statement executed successfully
連線至 quickstart_db
使用按鈕切換使用者/資料庫,重新連線至工作室。
從下拉式清單中選取新的 quickstart_db 資料庫,並使用與先前相同的使用者和密碼。
系統會開啟新的連線,讓您可以使用 quickstart_db 資料庫中的物件。
驗證 google_ml 擴充功能
請檢查 google_ml 擴充功能版本,確認版本為 1.4.4 以上,才能使用 AI 查詢引擎。
在 AlloyDB Studio 中,連線至 quickstart_db 執行:
SELECT extversion FROM pg_extension WHERE extname = 'google_ml_integration';
預期輸出內容:
1.4.4
如果版本低於必要版本,請更新擴充功能。
在 AlloyDB Studio 中,連線至 quickstart_db 執行:
CALL google_ml.upgrade_to_preview_version();
預期輸出內容:
Statement executed successfully
執行成功後,請再次驗證版本。
在 AlloyDB Studio 中,連線至 quickstart_db 執行:
SELECT extversion FROM pg_extension WHERE extname = 'google_ml_integration';
預期輸出內容:
1.4.4
6. 範例資料
接下來,我們需要在資料庫中建立物件並載入資料。我們將使用幾列的虛構電影資料集。
將下列陳述式複製到 AlloyDB Studio 編輯器,然後按下「Run」(執行) 按鈕。
-- Drop tables if they exist to prevent errors on re-running the script
DROP TABLE IF EXISTS movie_reviews;
DROP TABLE IF EXISTS movies;
-- Create the 'movies' table
-- This table stores information about each movie.
CREATE TABLE movies (
id BIGINT PRIMARY KEY, -- Unique identifier for the movie
title TEXT NOT NULL, -- Title of the movie
description TEXT, -- A brief description or synopsis of the movie
genres TEXT, -- Comma-separated list of genres (e.g., "Action, Adventure, Sci-Fi")
actors TEXT -- Comma-separated list of main actors
);
-- Create the 'movie_reviews' table
-- This table stores reviews for the movies.
CREATE TABLE movie_reviews (
review_id BIGINT PRIMARY KEY, -- Unique identifier for the review
movie_id BIGINT NOT NULL, -- Foreign key referencing the movie being reviewed
reviewer_name TEXT, -- Name of the person who wrote the review
rating INT CHECK (rating >= 1 AND rating <= 5), -- Rating from 1 to 5 stars
review_text TEXT, -- The content of the review
review_date DATE DEFAULT CURRENT_DATE, -- Date when the review was submitted
FOREIGN KEY (movie_id) REFERENCES movies(id) ON DELETE CASCADE -- Ensures referential integrity; if a movie is deleted, its reviews are also deleted.
);
-- Insert sample data into the 'movies' table (20 rows)
INSERT INTO movies (id, title, description, genres, actors) VALUES
(1, 'Inception', 'A thief who steals information by entering people''s dreams.', 'Sci-Fi, Thriller, Action', 'Leonardo DiCaprio, Joseph Gordon-Levitt, Elliot Page'),
(2, 'The Matrix', 'A computer hacker learns about the true nature of his reality.', 'Sci-Fi, Action', 'Keanu Reeves, Laurence Fishburne, Carrie-Anne Moss'),
(3, 'Interstellar', 'A team of explorers journey through a cosmic passage beyond our world in an attempt to ensure humanity''s survival.', 'Sci-Fi, Drama, Adventure', 'Matthew McConaughey, Anne Hathaway, Jessica Chastain'), -- Updated description
(4, 'The Dark Knight', 'When the menace known as the Joker wreaks havoc and chaos on the people of Gotham, Batman must accept one of the greatest psychological and physical tests of his ability to fight injustice.', 'Action, Crime, Drama', 'Christian Bale, Heath Ledger, Aaron Eckhart'),
(5, 'Pulp Fiction', 'The lives of two mob hitmen, a boxer, a gangster and his wife, and a pair of diner bandits intertwine in four tales of violence and redemption.', 'Crime, Drama', 'John Travolta, Uma Thurman, Samuel L. Jackson'),
(6, 'Forrest Gump', 'The presidencies of Kennedy and Johnson, the Vietnam War, the Watergate scandal and other historical events unfold from the perspective of an Alabama man with an IQ of 75.', 'Drama, Romance', 'Tom Hanks, Robin Wright, Gary Sinise'),
(7, 'The Shawshank Redemption', 'Two imprisoned men bond over a number of years, finding solace and eventual redemption through acts of common decency.', 'Drama', 'Tim Robbins, Morgan Freeman, Bob Gunton'),
(8, 'Gladiator', 'A former Roman General sets out to exact vengeance against the corrupt emperor who murdered his family and sent him into slavery.', 'Action, Adventure, Drama', 'Russell Crowe, Joaquin Phoenix, Connie Nielsen'),
(9, 'Fight Club', 'An insomniac office worker looking for a way to change his life crosses paths with a devil-may-care soap maker and they form an underground fight club that evolves into something much, much more.', 'Drama', 'Brad Pitt, Edward Norton, Meat Loaf'),
(10, 'The Lord of the Rings: The Return of the King', 'Gandalf and Aragorn lead the World of Men against Sauron''s army to draw his gaze from Frodo and Sam as they approach Mount Doom with the One Ring.', 'Action, Adventure, Drama', 'Elijah Wood, Viggo Mortensen, Ian McKellen'),
(11, 'Spirited Away', 'During her family''s move to the suburbs, a sullen 10-year-old girl wanders into a world ruled by gods, witches, and spirits, and where humans are changed into beasts.', 'Animation, Adventure, Family', 'Daveigh Chase, Suzanne Pleshette, Miyu Irino'),
(12, 'Parasite', 'Greed and class discrimination threaten the newly formed symbiotic relationship between the wealthy Park family and the destitute Kim clan.', 'Comedy, Drama, Thriller', 'Song Kang-ho, Lee Sun-kyun, Cho Yeo-jeong'),
(13, 'The Godfather', 'The aging patriarch of an organized crime dynasty transfers control of his clandestine empire to his reluctant son.', 'Crime, Drama', 'Marlon Brando, Al Pacino, James Caan'),
(14, 'Avengers: Endgame', 'After the devastating events of Avengers: Infinity War, the universe is in ruins. With the help of remaining allies, the Avengers assemble once more in order to reverse Thanos'' actions and restore balance to the universe.', 'Action, Adventure, Drama', 'Robert Downey Jr., Chris Evans, Mark Ruffalo'),
(15, 'Joker', 'In Gotham City, mentally troubled comedian Arthur Fleck is disregarded and mistreated by society. He then embarks on a downward spiral of revolution and bloody crime.', 'Crime, Drama, Thriller', 'Joaquin Phoenix, Robert De Niro, Zazie Beetz'),
(16, 'Mad Max: Fury Road', 'In a post-apocalyptic wasteland, a woman rebels against a tyrannical ruler in search for her homeland with the help of a group of female prisoners, a psychotic worshiper, and a drifter named Max.', 'Action, Adventure, Sci-Fi', 'Tom Hardy, Charlize Theron, Nicholas Hoult'),
(17, 'Coco', 'Aspiring musician Miguel, confronted with his family''s ancestral ban on music, enters the Land of the Dead to find his great-great-grandfather, a legendary singer.', 'Animation, Adventure, Family', 'Anthony Gonzalez, Gael García Bernal, Benjamin Bratt'),
(18, 'Whiplash', 'A promising young drummer enrolls at a cut-throat music conservatory where his dreams of greatness are mentored by an instructor who will stop at nothing to realize a student''s potential.', 'Drama, Music', 'Miles Teller, J.K. Simmons, Paul Reiser'),
(19, 'The Grand Budapest Hotel', 'The adventures of Gustave H, a legendary concierge at a famous hotel from the fictional Republic of Zubrowka between the first and second World Wars, and Zero Moustafa, the lobby boy who becomes his most trusted friend.', 'Adventure, Comedy, Drama', 'Ralph Fiennes, F. Murray Abraham, Mathieu Amalric'),
(20, 'Blade Runner 2049', 'Young Blade Runner K''s discovery of a long-buried secret leads him to track down former Blade Runner Rick Deckard, who''s been missing for thirty years.', 'Action, Drama, Mystery', 'Ryan Gosling, Harrison Ford, Ana de Armas');
-- Insert sample data into the 'movie_reviews' table (30 rows)
-- Reviews are linked to movies via movie_id. Includes a mix of positive and negative reviews.
-- Movie title is prepended to the review text.
INSERT INTO movie_reviews (review_id, movie_id, reviewer_name, rating, review_text) VALUES
(1, 1, 'Alice Wonderland', 5, 'Inception: Absolutely mind-bending! A masterpiece of sci-fi.'),
(2, 1, 'Bob The Critic', 2, 'Inception: Too confusing and pretentious. Didn''t enjoy it.'),
(3, 2, 'Charlie Reviewer', 5, 'The Matrix: Revolutionary visuals and a compelling story.'),
(4, 3, 'Diana Prince', 5, 'Interstellar: Visually stunning and emotionally powerful. A must-see.'),
(5, 3, 'Edward Nigma', 4, 'Interstellar: Long, but worth it for the spectacle and ideas.'),
(6, 4, 'Fiona Glenanne', 5, 'The Dark Knight: Heath Ledger''s Joker is iconic. Dark and thrilling.'),
(7, 5, 'George Costanza', 5, 'Pulp Fiction: Quirky, violent, and endlessly quotable.'),
(8, 5, 'Hannah Montana', 1, 'Pulp Fiction: Way too violent and the timeline was confusing. Hated it.'),
(9, 6, 'Ian Malcolm', 4, 'Forrest Gump: A heartwarming story with a great performance by Hanks.'),
(10, 7, 'Jane Doe', 5, 'The Shawshank Redemption: An uplifting story of hope and friendship. Perfect.'),
(11, 7, 'John Smith', 5, 'The Shawshank Redemption: Morgan Freeman is amazing. Truly a classic.'),
(12, 8, 'Kyle Broflovski', 2, 'Gladiator: Generic plot and boring action scenes. Overrated.'),
(13, 9, 'Laura Palmer', 5, 'Fight Club: Provocative and thought-provoking. Norton and Pitt are fantastic.'),
(14, 10, 'Michael Scott', 5, 'The Lord of the Rings: The Return of the King: A fitting and epic conclusion to a legendary trilogy.'),
(15, 11, 'Nancy Drew', 5, 'Spirited Away: Beautiful animation and a magical story for all ages.'),
(16, 12, 'Oscar Martinez', 5, 'Parasite: A brilliant satire with unexpected twists. Loved it!'),
(17, 12, 'Pam Beesly', 4, 'Parasite: Very intense, but incredibly well-directed and acted.'),
(18, 13, 'Quentin Coldwater', 5, 'The Godfather: A cinematic masterpiece. Brando is unforgettable.'),
(19, 14, 'Rachel Green', 3, 'Avengers: Endgame: It was okay, but felt bloated and had too many characters.'),
(20, 14, 'Steve Rogers', 5, 'Avengers: Endgame: The culmination of a decade of storytelling. Perfect ending.'),
(21, 15, 'Tony Stark', 4, 'Joker: A dark and disturbing character study. Phoenix is mesmerizing.'),
(22, 16, 'Uma Thurman', 5, 'Mad Max: Fury Road: Non-stop action and incredible practical effects. What a ride!'),
(23, 17, 'Victor Frankenstein', 5, 'Coco: A heartwarming and visually stunning celebration of family and culture.'),
(24, 18, 'Walter White', 5, 'Whiplash: Intense and gripping. J.K. Simmons is terrifyingly good.'),
(25, 19, 'Xena Warrior', 2, 'The Grand Budapest Hotel: Too quirky for its own good. Style over substance.'),
(26, 20, 'Ygritte Snow', 5, 'Blade Runner 2049: A worthy sequel that expands on the original in meaningful ways. Visually breathtaking.'),
(27, 1, 'Zack Morris', 4, 'Inception: Kept me on the edge of my seat. Very clever.'),
(28, 4, 'Buffy Summers', 5, 'The Dark Knight: The best superhero movie ever made. Ledger is a legend.'),
(29, 8, 'Clark Kent', 3, 'Gladiator: Decent action, but the story felt predictable and dragged a bit.'),
(30, 15, 'Diana Troy', 3, 'Joker: Hard to watch at times, but a powerful performance. Felt it was a bit one-note though.');
如果您有自己的範例資料,且 CSV 檔案與 Cloud 控制台提供的 Cloud SQL 匯入工具相容,可以使用該工具,而非本文介紹的方法。
7. 使用 IF 運算子
我們先試試使用標準 PostgreSQL 方法進行傳統搜尋。
假設我們想搜尋一部關於太空冒險的電影,可以試試下列查詢
SELECT title,description AS movies_about_space
FROM movies
WHERE description like '%space%' OR title like '%space%';
但未傳回任何結果。但我確定至少有一部電影屬於這個類別。我們可以嘗試使用全文搜尋方法。
SELECT title,description
FROM movies
WHERE to_tsvector('english', description) @@ to_tsquery('english', 'space');
甚至可能完全沒有任何結果。因此,我們需要知道一些關鍵字或詞組,才能使用 PostgreSQL 搜尋的「經典」技巧。
我們現在可以嘗試使用 Google_ml.if 函式,透過 AI 技術進行語意篩選。系統會在幕後使用 AI 技術,根據自然語言要求執行語意篩選。
SELECT title,description AS movies_about_space
FROM movies
WHERE
google_ml.if(
prompt => 'Here are descriptions of movies, can you return the ones about space adventures: '||description);
我們會根據要求的語意意義取得「星際效應」電影,即使標題和說明中都沒有「太空」一詞,如您所見,我們也沒有事先建構任何內容,而是完全依賴自動內建函式。
8. 使用 JOIN 搭配 IF 運算子
如果我們想使用 AI 驅動的語意篩選功能彙整兩個資料表,該怎麼做呢?舉例來說,如果使用者評論中提到某部電影,我們可以嘗試根據使用者評論,將評論與電影配對。
在新版 AlloyDB Studio 編輯器分頁中執行:
SELECT title, rating, movie_reviews
FROM movies
JOIN
movie_reviews ON
google_ml.if(
prompt => 'Does the following reviews talk about a movie mentioned? The review: ' || review_text||' and the movie title is: '||title)
AND
title='Interstellar';
我們根據標題中提及的電影名稱,找到兩則符合要求的評論。我們甚至可以進一步簡化要求:
SELECT title, rating, movie_reviews
FROM movies
JOIN
movie_reviews ON
google_ml.if(
prompt => 'Do we have the movie in the review?: ' || review_text||' and the movie title is: '||title)
AND
title='Interstellar';
9. 根據內容評分
電影評論資料表 movie_reviews 包含電影評分,但如果我們想實作自己的評分,可以使用 google_ml.rank 函式。我們可以根據自然語言定義的條件為評論評分,例如取得電影的前 5 則評論,並顯示原始評分供比較。
SELECT rating,review_text AS top_five
FROM movie_reviews
ORDER BY google_ml.rank('Score of 7 to 10 if the review says the movie was really good, 3 to 6 if the review says it''s alright is and 1 to 2 if the review says it was not worth of time. Review: ' || review_text) DESC
LIMIT 5;
如果想在原始評分旁邊顯示新評分,可以將新評分新增至資料欄清單。
SELECT rating,
google_ml.rank('Score of 7 to 10 if the review says the movie was really good, 3 to 6 if the review says it''s alright is and 1 to 2 if the review says it was not worth of time. Review: ' || review_text) AS ml_rank,
review_text AS top_five
FROM movie_reviews
ORDER BY ml_rank DESC
LIMIT 5;
以下是前 5 則評論。
rating | ml_rank | top_five
--------+---------+-----------------------------------------------------------------------
5 | 9 | The Dark Knight: Heath Ledger's Joker is iconic. Dark and thrilling.
5 | 9 | The Matrix: Revolutionary visuals and a compelling story.
5 | 9 | Interstellar: Visually stunning and emotionally powerful. A must-see.
5 | 9 | Inception: Absolutely mind-bending! A masterpiece of sci-fi.
5 | 9 | Pulp Fiction: Quirky, violent, and endlessly quotable.
(5 rows)
5 rows in set (0.13 sec)
我們選擇的評分為 10 顆星中的 9 顆星 (根據熱門評論)。
10. 使用排名改善語意搜尋
我們可以結合語意搜尋和排名,取得更精確的結果。
註冊重新排名模型
使用頂端的「+」符號開啟另一個 Cloud Shell 分頁。
在新的 Cloud Shell 分頁中執行:
export PROJECT_ID=$(gcloud config get-value project)
gcloud iam service-accounts create aip-reranking
gcloud projects add-iam-policy-binding $PROJECT_ID \
--member="serviceAccount:aip-reranking@$PROJECT_ID.iam.gserviceaccount.com" \
--role="roles/discoveryengine.viewer"
gcloud projects add-iam-policy-binding $PROJECT_ID \
--member="serviceAccount:aip-reranking@$PROJECT_ID.iam.gserviceaccount.com" \
--role="roles/aiplatform.user"
gcloud iam service-accounts add-iam-policy-binding aip-reranking@$PROJECT_ID.iam.gserviceaccount.com \
--member="user:$(gcloud auth list --filter=status:ACTIVE --format="value(account)")" --role="roles/iam.serviceAccountTokenCreator"
gcloud iam service-accounts add-iam-policy-binding aip-reranking@$PROJECT_ID.iam.gserviceaccount.com \
--member="user:$(gcloud auth list --filter=status:ACTIVE --format="value(account)")" --role="roles/iam.serviceAccountUser"
產生存取權權杖,並將其新增為名為「rerank-secret」的密鑰:
echo -n $(gcloud auth print-access-token \
--impersonate-service-account="aip-reranking@$PROJECT_ID.iam.gserviceaccount.com" \
--lifetime=3600) | gcloud secrets create rerank-secret \
--replication-policy="automatic" \
--data-file=-
gcloud secrets add-iam-policy-binding 'rerank-secret' \
--member="serviceAccount:service-$(gcloud projects describe $PROJECT_ID --format="value(projectNumber)")@gcp-sa-alloydb.iam.gserviceaccount.com" \
--role="roles/secretmanager.secretAccessor"
在這個範例中,密鑰的有效時間為 1 小時,之後應透過建立新版本來更新密鑰。例如:
echo -n $(gcloud auth print-access-token \
--impersonate-service-account="aip-reranking@$PROJECT_ID.iam.gserviceaccount.com" \
--lifetime=3600) | gcloud secrets versions add rerank-secret \
--data-file=-
接下來,我們需要切換至 AlloyDB Studio,並建立機密值來存取重新排序模型。
在 AlloyDB Studio 主控台中,在變更 $PROJECT_ID 為專案 ID 後執行:
CALL
google_ml.create_sm_secret(
secret_id => 'rerank-secret',
secret_path => 'projects/PROJECT_ID/secrets/rerank-secret/versions/latest');
在變更專案的 PROJECT_ID 後,請在同一個 Studio 工作階段中註冊模型:
CALL
google_ml.create_model(
model_id => 'semantic-ranker-512-002',
model_type => 'reranking',
model_provider => 'custom',
model_request_url =>
'https://discoveryengine.googleapis.com/v1/projects/PROJECT_ID/locations/global/rankingConfigs/default_ranking_config:rank',
model_qualified_name => 'semantic-ranker-512@002',
model_auth_type => 'secret_manager',
model_auth_id => 'rerank-secret',
model_in_transform_fn => 'google_ml.vertexai_reranking_input_transform',
model_out_transform_fn => 'google_ml.vertexai_reranking_output_transform');
使用重新排序模型
我們現在可以在查詢中使用重新排序模型,改善語意搜尋結果,讓結果更具體並選擇最佳選項。
我們來找出動作片,然後使用「電腦和未來」做為條件來為這些電影排名。
WITH
action_movies AS (
SELECT
title,
description,
ROW_NUMBER() OVER (ORDER BY title, description) AS ref_number
FROM
movies
WHERE
google_ml.if(
prompt => 'The following movies are action movies. The movie title: ' || title || ' and the description is: ' || description
)
),
ranked_documents_array AS (
SELECT
ARRAY_AGG(description ORDER BY ref_number) AS docs
FROM
action_movies
),
reranked_results AS (
SELECT
r.index,
r.score
FROM
ranked_documents_array,
ai.rank(
model_id => 'semantic-ranker-512-002',
search_string => 'Computers and future',
documents => ranked_documents_array.docs
) AS r
)
SELECT
am.title,
left(am.description,80) as description,
rr.score
FROM
action_movies am
JOIN
reranked_results rr ON am.ref_number = rr.index
ORDER BY
rr.score DESC;
結果會列出動作片,希望能將關於未來和電腦的電影排在最前面。
title | description | score
-----------------------------------------------+----------------------------------------------------------------------------------+--------
The Matrix | A computer hacker learns about the true nature of his reality. | 0.0145
Mad Max: Fury Road | In a post-apocalyptic wasteland, a woman rebels against a tyrannical ruler in se | 0.002
Avengers: Endgame | After the devastating events of Avengers: Infinity War, the universe is in ruins | 0.0018
Blade Runner 2049 | Young Blade Runner K's discovery of a long-buried secret leads him to track down | 0.0004
The Lord of the Rings: The Return of the King | Gandalf and Aragorn lead the World of Men against Sauron's army to draw his gaze | 1e-05
The Dark Knight | When the menace known as the Joker wreaks havoc and chaos on the people of Gotha | 1e-05
Inception | A thief who steals information by entering people's dreams. | 1e-05
Gladiator | A former Roman General sets out to exact vengeance against the corrupt emperor w | 1e-05
(8 rows)
嘗試使用不同的條件,看看排名對輸出順序的影響。
11. 清理環境
完成實驗室後,請銷毀 AlloyDB 執行個體和叢集
刪除 AlloyDB 叢集和所有執行個體
叢集會使用強制選項進行銷毀,同時也會刪除屬於叢集的所有執行個體。
如果您已連線中斷,且先前的所有設定都已遺失,請在 Cloud Shell 中定義專案和環境變數:
gcloud config set project <your project id>
export REGION=us-central1
export ADBCLUSTER=alloydb-aip-01
export PROJECT_ID=$(gcloud config get-value project)
刪除叢集:
gcloud alloydb clusters delete $ADBCLUSTER --region=$REGION --force
預期的控制台輸出內容:
student@cloudshell:~ (test-project-001-402417)$ gcloud alloydb clusters delete $ADBCLUSTER --region=$REGION --force All of the cluster data will be lost when the cluster is deleted. Do you want to continue (Y/n)? Y Operation ID: operation-1697820178429-6082890a0b570-4a72f7e4-4c5df36f Deleting cluster...done.
刪除 AlloyDB 備份
刪除叢集的所有 AlloyDB 備份:
for i in $(gcloud alloydb backups list --filter="CLUSTER_NAME: projects/$PROJECT_ID/locations/$REGION/clusters/$ADBCLUSTER" --format="value(name)" --sort-by=~createTime) ; do gcloud alloydb backups delete $(basename $i) --region $REGION --quiet; done
預期的控制台輸出內容:
student@cloudshell:~ (test-project-001-402417)$ for i in $(gcloud alloydb backups list --filter="CLUSTER_NAME: projects/$PROJECT_ID/locations/$REGION/clusters/$ADBCLUSTER" --format="value(name)" --sort-by=~createTime) ; do gcloud alloydb backups delete $(basename $i) --region $REGION --quiet; done Operation ID: operation-1697826266108-60829fb7b5258-7f99dc0b-99f3c35f Deleting backup...done.
12. 恭喜
恭喜您完成程式碼研究室!
涵蓋內容
- 如何部署 AlloyDB for Postgres
- 如何啟用 AlloyDB AI 運算子
- 如何使用不同的 AlloyDB AI 運算子
- 如何在 RAG 工作流程中使用 AlloyDB AI 運算子