1. 簡介
在本程式碼研究室中,您將瞭解如何使用 BigQuery 圖表,為虛構零售公司 Cymbal Pets 建立客戶 360 檢視畫面和推薦引擎。您將運用 SQL 的強大功能,直接在 BigQuery 中建立、查詢及分析圖形資料,並結合向量搜尋功能,提供進階產品推薦。
BigQuery Graph 可讓您將資料實體 (例如顧客、產品和訂單) 間的關係建立為圖表,輕鬆解答有關顧客行為和產品偏好的複雜問題。

學習內容
- 為 Cymbal Pets 圖表建立 BigQuery 資料集和結構定義
- 從 Cloud Storage 載入範例資料 (Customers、Products、Orders、Stores)
- 在 BigQuery 中建立連結這些實體的屬性圖
- 使用圖形查詢,以視覺化方式呈現顧客的購買記錄
- 使用向量搜尋建構產品推薦系統
- 使用「一起購買」圖表關係強化建議
軟硬體需求
- 網路瀏覽器,例如 Chrome
- 已啟用計費功能的 Google Cloud 雲端專案
本程式碼研究室適合各種程度的開發人員,包括初學者。
2. 事前準備
建立 Google Cloud 專案
- 在 Google Cloud 控制台中,選取或建立 Google Cloud 專案。
- 確認 Cloud 專案已啟用計費功能。
啟動 Cloud Shell
- 點選 Google Cloud 控制台頂端的「啟用 Cloud Shell」。
- 驗證:
gcloud auth list
- 確認專案:
gcloud config get project
- 視需要設定:
export PROJECT_ID=<YOUR_PROJECT_ID>
gcloud config set project $PROJECT_ID
啟用 API
執行下列指令,啟用必要的 BigQuery API:
gcloud services enable bigquery.googleapis.com
3. 定義結構定義
首先,您需要建立資料集來儲存圖表相關資料表,並定義節點和邊緣的結構定義。
- 在本程式碼研究室中,我們將執行 SQL 指令。您可以在 BigQuery Studio > SQL 編輯器中執行這些指令,也可以在 Cloud Shell 中使用
bq query指令。
我們假設您使用 BigQuery SQL 編輯器,以便更順暢地使用多行建立陳述式。 - 建立
cymbal_pets_demo資料集:
CREATE SCHEMA IF NOT EXISTS cymbal_pets_demo;
- 為
order_items、products、orders、stores、customers和co_related_products_for_angelica建立資料表。這些資料表將做為圖表的來源資料。
CREATE TABLE IF NOT EXISTS cymbal_pets_demo.order_items
(
order_id INT64,
product_id INT64,
order_item_id INT64,
quantity INT64,
price FLOAT64,
PRIMARY KEY (order_id, product_id, order_item_id) NOT ENFORCED
)
CLUSTER BY order_item_id;
CREATE TABLE IF NOT EXISTS cymbal_pets_demo.products
(
product_id INT64,
product_name STRING,
brand STRING,
category STRING,
subcategory INT64,
animal_type INT64,
search_keywords INT64,
price FLOAT64,
description STRING,
inventory_level INT64,
supplier_id INT64,
average_rating FLOAT64,
uri STRING,
embedding ARRAY<FLOAT64>,
PRIMARY KEY (product_id) NOT ENFORCED
)
CLUSTER BY product_id;
CREATE TABLE IF NOT EXISTS cymbal_pets_demo.orders
(
customer_id INT64,
order_id INT64,
shipping_address_city STRING,
store_id INT64,
order_date DATE,
order_type STRING,
payment_method STRING,
PRIMARY KEY (order_id) NOT ENFORCED
)
PARTITION BY order_date
CLUSTER BY order_id;
CREATE TABLE IF NOT EXISTS cymbal_pets_demo.stores
(
store_id INT64,
store_name STRING,
address_state STRING,
address_city STRING,
latitude FLOAT64,
longitude FLOAT64,
opening_hours STRUCT<Monday STRING, Tuesday STRING, Wednesday STRING, Thursday STRING, Friday STRING, Saturday STRING, Sunday STRING>,
manager_id INT64,
PRIMARY KEY (store_id) NOT ENFORCED
)
CLUSTER BY store_id;
CREATE TABLE IF NOT EXISTS cymbal_pets_demo.customers
(
customer_id INT64,
first_name STRING,
last_name STRING,
email STRING,
gender STRING,
address_city STRING,
address_state STRING,
loyalty_member BOOL,
PRIMARY KEY (customer_id) NOT ENFORCED
)
CLUSTER BY customer_id;
CREATE TABLE IF NOT EXISTS cymbal_pets_demo.co_related_products_for_angelica
(
angelica_product_id INT64,
other_product_id INT64,
co_purchase_count INT64
);
您現在已定義圖表資料的結構。
4. 載入資料
現在,請從 Cloud Storage 在資料表中填入範例資料。
在 BigQuery SQL 編輯器中執行下列 LOAD DATA 陳述式:
LOAD DATA INTO `cymbal_pets_demo.customers`
FROM FILES (
format = 'AVRO',
uris = ['gs://sample-data-and-media/cymbal-pets/tables/customers/*.avro']
);
LOAD DATA INTO `cymbal_pets_demo.order_items`
FROM FILES (
format = 'AVRO',
uris = ['gs://sample-data-and-media/cymbal-pets/tables/order_items/*.avro']
);
LOAD DATA INTO `cymbal_pets_demo.orders`
FROM FILES (
format = 'AVRO',
uris = ['gs://sample-data-and-media/cymbal-pets/tables/orders/*.avro']
);
LOAD DATA INTO `cymbal_pets_demo.products`
FROM FILES (
format = 'AVRO',
uris = ['gs://sample-data-and-media/cymbal-pets/tables/products/*.avro']
);
LOAD DATA INTO `cymbal_pets_demo.stores`
FROM FILES (
format = 'AVRO',
uris = ['gs://sample-data-and-media/cymbal-pets/tables/stores/*.avro']
);
系統應會顯示確認訊息,指出資料列已載入各個資料表。
5. 建立屬性圖表
載入資料後,您現在可以定義屬性圖。這會告知 BigQuery 哪些資料表代表節點 (例如「顧客」、「產品」等實體),哪些資料表代表邊緣 (例如「造訪」、「下單」、「擁有」等關係)。

執行下列 DDL 陳述式:
CREATE OR REPLACE PROPERTY GRAPH cymbal_pets_demo.PetsOrderGraph
NODE TABLES (
cymbal_pets_demo.customers KEY(customer_id) LABEL Customer,
cymbal_pets_demo.products KEY(product_id) LABEL Products,
cymbal_pets_demo.stores KEY(store_id) LABEL Stores,
cymbal_pets_demo.orders KEY(order_id) LABEL Orders
)
EDGE TABLES (
cymbal_pets_demo.orders as customer_to_store_edge
KEY (order_id)
SOURCE KEY (customer_id) references customers(customer_id)
DESTINATION KEY (store_id) references stores(store_id)
LABEL Visited
PROPERTIES ALL COLUMNS,
cymbal_pets_demo.order_items
KEY (order_item_id)
SOURCE KEY (order_id) references orders(order_id)
DESTINATION KEY (product_id) references products(product_id)
LABEL Has
PROPERTIES ALL COLUMNS,
cymbal_pets_demo.orders as customer_to_orders_edge
KEY (order_id)
SOURCE KEY (customer_id) references customers(customer_id)
DESTINATION KEY (order_id) references orders(order_id)
LABEL Placed
PROPERTIES ALL COLUMNS,
cymbal_pets_demo.co_related_products_for_angelica
KEY (angelica_product_id)
SOURCE KEY (angelica_product_id) references products(product_id)
DESTINATION KEY (other_product_id) references products(product_id)
LABEL BoughtTogether
PROPERTIES ALL COLUMNS
);
這會建立圖表 PetsOrderGraph,讓我們可以使用 GRAPH_TABLE 運算子執行圖表遍歷。
6. 以視覺化方式呈現所有顧客的購買記錄
在 BigQuery Studio 中開啟「新筆記本」。

在本程式碼實驗室的視覺化和建議部分,我們將使用 BigQuery Studio 中的 Google Colab 筆記本。這樣我們就能輕鬆查看圖表結果。
BigQuery Graph Notebook 會以 IPython Magics 實作。新增 %%bigquery 魔法指令和 TO_JSON 函式後,即可如後續章節所示,將結果視覺化。
假設 Cymbal Pets 想要取得所有顧客的 360 度視覺化資料,以及他們在特定時間範圍內的購買記錄。
在新的儲存格中執行下列指令:
%%bigquery --graph
GRAPH cymbal_pets_demo.PetsOrderGraph
# finds the customer node and then finds all
# the Orders nodes that are connected to that customer through the
# Placed relationship
MATCH (customer:Customer)-[placed:Placed]->(ordr:Orders)-[has:Has]->(product:Products)
# filters the Orders nodes to only include those where the
# order_date is within the last 3 months.
WHERE ordr.order_date >= date('2024-11-27')
# # This line finds all the Products nodes that are connected to the
# # filtered Orders nodes through the Has relationship.
MATCH p=(customer:Customer)-[placed:Placed]->(ordr:Orders)-[has:Has]->(product:Products)
LIMIT 40
RETURN
TO_JSON(p) as paths
畫面應會顯示圖表結果的視覺化呈現方式。

7. 以視覺化方式呈現 Angelica 的購買記錄
假設 Cymbal Pets 想深入瞭解名為 Angelica Russell 的顧客。他們想分析 Angelica 在過去 3 個月內購買的產品,以及顧客造訪的商店。
%%bigquery --graph
GRAPH cymbal_pets_demo.PetsOrderGraph
# finds the customer node with the name "Angelica Russell" and then finds all
# the Orders nodes that are connected to that customer through the
# Placed relationship and all the Products nodes that are connected to the
# filtered Orders nodes through the Has relationship.
MATCH p=(customer:Customer {first_name: 'Angelica', last_name: 'Russell'})-[placed:Placed]->(ordr:Orders)-[has:Has]->(product:Products)
# filters the Orders nodes to only include those where the
# order_date is within the last 3 months.
WHERE ordr.order_date >= date('2024-11-27')
# finds the Stores nodes where Angelica placed order from
MATCH p2=(customer)-[visited:Visited]->(store:Stores)
RETURN
TO_JSON(p) as path, TO_JSON(p2) as path2

8. 使用向量搜尋功能推薦產品
Cymbal Pets 想根據 Angelica 最近購買的產品,向她推薦其他產品。我們可以運用向量搜尋,找出與她過去購買產品的嵌入項目相似的產品。
在新的 Colab 儲存格中執行下列 SQL 指令碼。這個指令碼:
- 識別 Angelica 最近購買的產品。
- 使用
VECTOR_SEARCH從products表格中找出前 4 項類似產品。
注意:這個步驟假設您已執行 AI.GENERATE_EMBEDDINGS,在產品資料表中建立嵌入項目資料欄。
%%bigquery
DECLARE products_bought_by_angelica ARRAY<INT64>;
-- 1. Get IDs of products bought by Angelica
SET products_bought_by_angelica = (
SELECT ARRAY_AGG(product_id) FROM
GRAPH_TABLE(
cymbal_pets_demo.PetsOrderGraph
MATCH (c:Customer {first_name: 'Angelica', last_name: 'Russell'})-[placed:Placed]->(o:Orders)
WHERE o.order_date >= date('2024-11-27')
MATCH (o)-[has_edge:Has]->(p:Products)
RETURN DISTINCT p.product_id as product_id
));
-- 2. Find similar products using vector search
SELECT
query.product_name as AngelicaBought,
base.product_name as RecommendedProducts,
base.category
FROM
VECTOR_SEARCH(
TABLE cymbal_pets_demo.products,
'embedding',
(SELECT * FROM cymbal_pets_demo.products
WHERE product_id IN UNNEST(products_bought_by_angelica)),
'embedding',
top_k => 4)
WHERE query.product_name <> base.product_name;
畫面上應會顯示建議產品清單,這些產品在語意上與 Angelica 購買的產品相似。

9. 使用「一起購買」關係的建議
另一項強大的推薦技術是「協同過濾」,也就是推薦其他使用者經常一起購買的產品。我們已將此建模為圖表中的 BoughtTogether 邊緣。
為了推薦買過的產品,Cymbal Pets 執行了離線圖表分析查詢,找出 Angelica 購買的每項產品,並推薦最適合的產品。
%%bigquery
CREATE OR REPLACE TABLE cymbal_pets_demo.co_related_products_for_angelica AS
SELECT
angelica_product_id,
other_product_id,
co_purchase_count
FROM (
SELECT
angelicaProduct.product_id AS angelica_product_id,
otherProduct.product_id AS other_product_id,
count(otherProduct) AS co_purchase_count,
# ensures that the row numbering is done separately for each angelica_product_id
ROW_NUMBER() OVER (PARTITION BY angelicaProduct.product_id ORDER BY count(otherProduct) DESC) AS rn
FROM
GRAPH_TABLE (cymbal_pets_demo.PetsOrderGraph
MATCH (angelica:Customer {first_name: 'Angelica', last_name: 'Russell'})-[:Placed]->(o:Orders)-[:Has]->(angelicaProduct:Products)
WHERE o.order_date >= date('2024-11-27')
WITH angelica, angelicaProduct
MATCH (otherCustomer:Customer)-[:Placed]->(otherOrder:Orders)-[:Has]->(angelicaProduct) # Find orders where Angelica's products were bought
WHERE otherCustomer <> angelica # Exclude Angelica's own orders
WITH angelicaProduct, otherOrder
MATCH (otherOrder)-[:HAS]->(otherProduct:Products) # Find other products in those orders
WHERE angelicaProduct <> otherProduct # Exclude the original product.
RETURN angelicaProduct, otherProduct, otherOrder
)
GROUP BY
angelicaProduct.product_id, otherProduct.product_id
)
WHERE rn <= 3; # only keep top 3 co-related products

執行這項查詢,透過 BoughtTogether 邊緣推薦與 Angelica 購買項目直接相關的產品:
%%bigquery
SELECT * FROM GRAPH_TABLE(
cymbal_pets_demo.PetsOrderGraph
MATCH (customer:Customer {first_name: 'Angelica', last_name: 'Russell'})-[placed:Placed]->(ordr:Orders)
WHERE ordr.order_date >= date('2024-11-27')
MATCH (ordr)-[has:Has]->(product:Products)
MATCH (product)-[bought_together:BoughtTogether]->(recommended_product:Products)
RETURN
product.product_name AS OriginalProduct,
recommended_product.product_name AS Recommended,
bought_together.co_purchase_count AS Strength
);
這項查詢會從「顧客」>「訂單」>「產品」>「(BoughtTogether)」>「推薦產品」進行遍歷,根據集體購買行為顯示建議。

10. 清理
如要避免系統持續向您的 Google Cloud 帳戶收費,請刪除本程式碼研究室建立的資源。
刪除資料集和所有資料表:
DROP SCHEMA IF EXISTS cymbal_pets_demo CASCADE;
如果您是為這個程式碼研究室建立新專案,也可以刪除該專案:
gcloud projects delete $PROJECT_ID
11. 恭喜
恭喜!您已使用 BigQuery 圖表成功建構 360 度客戶檢視畫面和推薦引擎。
目前所學內容
- 如何在 BigQuery 中建立屬性圖。
- 如何將資料載入圖形節點和邊緣。
- 如何使用
GRAPH_TABLE和MATCH查詢圖形模式。 - 如何結合圖表查詢與向量搜尋,取得混合建議。
後續步驟
- 參閱 BigQuery 圖表說明文件。
- 進一步瞭解 BigQuery 的向量搜尋功能。