Migrating from Cassandra to Bigtable with a dual-write proxy

About this codelab

Last updated: Apr 15, 2025

Written by Louis Cheynel

1. Introduction

Bigtable is a managed, high-performance NoSQL database service designed for large operational and analytical workloads. Migrating an existing database such as Apache Cassandra to Bigtable usually requires careful planning to minimize downtime and the impact on applications.

In this codelab you will learn how to combine proxy and migration tools to move from Cassandra to Bigtable:

Cassandra-Bigtable Proxy: lets Cassandra clients and tools (such as cqlsh or drivers) interact with Bigtable over the Cassandra Query Language (CQL) protocol by translating queries.
Datastax Zero Downtime Migration (ZDM) Proxy: an open source proxy that sits between your application and your database services (the Cassandra origin, and the Bigtable target reached through the Cassandra-Bigtable Proxy). It orchestrates dual writes and manages traffic routing, so you can migrate with minimal application changes and minimal downtime.
Cassandra Data Migrator (CDM): an open source tool for bulk-migrating historical data from the origin Cassandra cluster to the target Bigtable instance.

What you'll learn

How to set up a basic Cassandra cluster on Compute Engine.
How to create a Bigtable instance.
How to deploy and configure the Cassandra-Bigtable Proxy to map a Cassandra schema to Bigtable.
How to deploy and configure the Datastax ZDM Proxy for dual writes.
How to use Cassandra Data Migrator to bulk-migrate existing data.
The overall workflow for a proxy-based migration from Cassandra to Bigtable.

What you'll need

A Google Cloud project with billing enabled. New users are eligible for a free trial.
Basic familiarity with Google Cloud concepts such as projects, Compute Engine, VPC networks, and firewall rules. Basic familiarity with Linux command-line tools.
Access to a machine with the gcloud CLI installed and configured, or use Google Cloud Shell.

This codelab mainly uses virtual machines (VMs) on Compute Engine in the same VPC network and region to keep networking simple. Using internal IP addresses is recommended.

2. Set up your environment

1. Select an existing Google Cloud project or create a new one

Go to the Google Cloud console and select an existing project or create a new one. Note the project ID.

2. Enable the required APIs

Make sure the Compute Engine API and the Bigtable APIs are enabled in your project.

gcloud services enable compute.googleapis.com bigtable.googleapis.com bigtableadmin.googleapis.com --project=<your-project-id>

Replace <your-project-id> with your actual project ID.

3. Choose a region and zone

Choose a region and zone for your resources. This codelab uses us-central1 and us-central1-c as examples. Set them as environment variables for convenience:

export PROJECT_ID="<your-project-id>"
export REGION="us-central1"
export ZONE="us-central1-c"

gcloud config set project $PROJECT_ID
gcloud config set compute/region $REGION
gcloud config set compute/zone $ZONE

4. Configure firewall rules

The VMs need to communicate with each other inside the default VPC network on several ports:

Cassandra/proxy CQL port: 9042
ZDM Proxy health check port: 14001
SSH: 22

Create a firewall rule that allows internal traffic on these ports. The rule uses the network tag cassandra-migration so that it is easy to apply to the relevant VMs.

# Adjust --source-ranges if you use a custom VPC/IP range
gcloud compute firewall-rules create allow-migration-internal \
--network=default \
--action=ALLOW \
--rules=tcp:22,tcp:9042,tcp:14001 \
--source-ranges=10.128.0.0/9 \
--target-tags=cassandra-migration

3. Deploy the Cassandra cluster (origin)

In this codelab you set up a simple single-node Cassandra cluster on Compute Engine. In a real migration you would connect to your existing clusters instead.

1. Create a Compute Engine VM for Cassandra

gcloud compute instances create cassandra-origin \
--machine-type=e2-medium \
--image-family=ubuntu-2004-lts \
--image-project=ubuntu-os-cloud \
--tags=cassandra-migration \
--boot-disk-size=20GB

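2. Install Cassandra

SSH into the cassandra-origin VM (gcloud compute ssh cassandra-origin), then run the following inside the VM: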

# Install Java (Cassandra dependency)
sudo apt-get update
sudo apt-get install -y openjdk-11-jre-headless

# Add Cassandra repository
echo "deb https://debian.cassandra.apache.org 41x main" | sudo tee -a /etc/apt/sources.list.d/cassandra.sources.list
curl https://downloads.apache.org/cassandra/KEYS | sudo apt-key add -

# Install Cassandra
sudo apt-get update
sudo apt-get install -y cassandra

3. Create the keyspace and table

This codelab uses an example employee table in a keyspace named zdmbigtable.

cd ~/apache-cassandra
bin/cqlsh <your-localhost-ip> 9042  # starts the CQL shell

Inside cqlsh:

-- Create keyspace (adjust replication for production)
CREATE KEYSPACE zdmbigtable WITH replication = {'class':'SimpleStrategy', 'replication_factor':1};

-- Use the keyspace
USE zdmbigtable;

-- Create the employee table
CREATE TABLE employee (
    name text PRIMARY KEY,
    age bigint,
    code int,
    credited double,
    balance float,
    is_active boolean,
    birth_date timestamp
);

-- Exit cqlsh
EXIT;

Keep this SSH session open, or note this VM's IP address (hostname -I).

4. Set up Bigtable (target)

Create a Bigtable instance. This codelab uses zdmbigtable as the instance ID.

# Use 1 node for dev/testing; scale as needed
gcloud bigtable instances create zdmbigtable \
--display-name="ZDM Bigtable Target" \
--cluster=bigtable-c1 \
--cluster-zone=$ZONE \
--cluster-num-nodes=1

The table in Bigtable is created later, during the Cassandra-Bigtable Proxy setup.

5. Set up the Cassandra-Bigtable Proxy

1. Create a Compute Engine VM for the Cassandra-Bigtable Proxy

gcloud compute instances create bigtable-proxy-vm \
--machine-type=e2-medium \
--image-family=ubuntu-2004-lts \
--image-project=ubuntu-os-cloud \
--tags=cassandra-migration \
--boot-disk-size=20GB

SSH into the bigtable-proxy-vm VM:

gcloud compute ssh bigtable-proxy-vm

Inside the VM:

# Install Git and Go
sudo apt-get update
sudo apt-get install -y git golang-go

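# Clone the proxy repository
# Replace with the actual repository URL if different
git clone https://github.com/GoogleCloudPlatform/cloud-bigtable-ecosystem.git
# Change into the proxy's directory inside the cloned repository
# (adjust the path to match the repository layout)
cd cassandra-to-bigtable-proxy/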

# Set Go environment variables
export GOPATH=$HOME/go
export PATH=$PATH:/usr/local/go/bin:$GOPATH/bin

2. Configure the proxy

nano config.yaml

Update the following variables. For more advanced configuration, see the example available on GitHub.

cassandraToBigtableConfigs:
  # Global default GCP Project ID
  projectId: <your-project-id>

listeners:
- name: cluster1
  port: 9042
  bigtable:
    # To use multiple instances, pass the instance names as a comma-separated list.
    # Instance names must not contain any special characters except underscore (_).
    instanceIds: zdmbigtable

    # Number of grpc channels to be used for Bigtable session.
    Session:
      grpcChannels: 4

otel:
  # Set enabled to true or false for OTEL metrics/traces/logs.
  enabled: False

  # Name of the collector service to be setup as a sidecar
  serviceName: cassandra-to-bigtable-otel-service

  healthcheck:
    # Enable the health check in this proxy application config only if the
    # "health_check" extension is added to the OTEL collector service configuration.
    #
    # Recommendation:
    # Enable the OTEL health check if you need to verify the collector's availability
    # at the start of the application. For development or testing environments, it can
    # be safely disabled to reduce complexity.

    # Enable/Disable Health Check for OTEL, Default 'False'.
    enabled: False
    # Health check endpoint for the OTEL collector service
    endpoint: localhost:13133
  metrics:
    # Collector service endpoint
    endpoint: localhost:4317

  traces:
    # Collector service endpoint
    endpoint: localhost:4317
    #Sampling ratio should be between 0 and 1. Here 0.05 means 5/100 Sampling ratio.
    samplingRatio: 1

loggerConfig:
  # Specifies the type of output, here it is set to 'file' indicating logs will be written to a file.
  # Value of `outputType` should be `file` for file type or `stdout` for standard output.
  # Default value is `stdout`.
  outputType: stdout
  # Set this only if the outputType is set to `file`.
  # The path and name of the log file where logs will be stored. For example, output.log, Required Key.
  # Default `/var/log/cassandra-to-spanner-proxy/output.log`.
  fileName: output/output.log
  # Set this only if the outputType is set to `file`.
  # The maximum size of the log file in megabytes before it is rotated. For example, 500 for 500 MB.
  maxSize: 10
  # Set this only if the outputType is set to `file`.
  # The maximum number of backup log files to keep. Once this limit is reached, the oldest log file will be deleted.
  maxBackups: 2
  # Set this only if the outputType is set to `file`.
  # The maximum age in days for a log file to be retained. Logs older than this will be deleted. Required Key.
  # Default 3 days
  maxAge: 1

  # Set this only if the outputType is set to `file`.
  # Default value is set to 'False'. Change the value to 'True', if log files are required to be compressed.
  compress: True

Save and close the file (Ctrl+X, then Y, then Enter in nano).

3. Start the Cassandra-Bigtable Proxy

Start the proxy.

# At the root of the cassandra-to-bigtable-proxy directory
go run proxy.go

The proxy starts and listens on port 9042 for incoming CQL connections. Keep this terminal session running. Note this VM's IP address (hostname -I).
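
Before pointing other tools at the proxy, it can help to confirm that port 9042 is reachable from another VM in the same network. The following is a minimal sketch rather than part of the proxy tooling: the helper name port_open is hypothetical, and it assumes that bash and the coreutils timeout command are available.

```shell
# Hypothetical helper: succeeds if a TCP connection to host:port opens within 3 seconds.
# It relies on bash's /dev/tcp pseudo-device, so the check is run through bash explicitly.
# $1 = host, $2 = port
port_open() {
  timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# Example (replace the placeholder with the proxy VM's internal IP):
# port_open <bigtable-proxy-vm-internal-ip> 9042 && echo "proxy reachable"
```

A failure here usually points at the firewall rule, the network tag, or the proxy process not running, which is easier to diagnose at this stage than through the ZDM Proxy later.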

4. Create the table with CQL

Connect cqlsh to the IP address of the Cassandra-Bigtable Proxy VM.

In cqlsh, run the following command:

-- Create the employee table
CREATE TABLE zdmbigtable.employee (
    name text PRIMARY KEY,
    age bigint,
    code int,
    credited double,
    balance float,
    is_active boolean,
    birth_date timestamp
);

In the Google Cloud console, verify that the employee table and the metadata table exist in your Bigtable instance.

6. Set up the ZDM Proxy

Running the ZDM Proxy requires at least two machines: one or more proxy nodes that handle traffic, and a jump host used for deployment and orchestration with Ansible.

1. Create Compute Engine VMs for the ZDM Proxy

You need two VMs: zdm-jumphost and zdm-proxy-node-1.

# Jumphost VM 
gcloud compute instances create zdm-jumphost \
--machine-type=e2-medium \
--image-family=ubuntu-2004-lts \
--image-project=ubuntu-os-cloud \
--tags=cassandra-migration \
--boot-disk-size=20GB

# Proxy Node VM 
gcloud compute instances create zdm-proxy-node-1 \
--machine-type=e2-standard-8 \
--image-family=ubuntu-2004-lts \
--image-project=ubuntu-os-cloud \
--tags=cassandra-migration \
--boot-disk-size=20GB

Note the IP addresses of both VMs.
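
Several later steps need the internal IP addresses of the VMs. As an alternative to running hostname -I on each machine, a small wrapper around gcloud can print them from Cloud Shell. This is a sketch: the function name internal_ip is hypothetical, and it assumes the ZONE variable exported earlier is still set.

```shell
# Hypothetical helper: print the internal IP of a VM in the zone stored in $ZONE.
# $1 = VM name
internal_ip() {
  gcloud compute instances describe "$1" \
    --zone="$ZONE" \
    --format='get(networkInterfaces[0].networkIP)'
}

# Example:
# ZDM_NODE_IP=$(internal_ip zdm-proxy-node-1)
# CASSANDRA_IP=$(internal_ip cassandra-origin)
```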

2. Prepare the jump host

SSH into zdm-jumphost:

gcloud compute ssh zdm-jumphost

Inside the jump host:

# Install Git and Ansible
sudo apt-get update
sudo apt-get install -y git ansible

# Clone the ZDM Proxy automation repository
git clone https://github.com/datastax/zdm-proxy-automation.git

cd zdm-proxy-automation/ansible/

Edit the main configuration file, vars/zdm_proxy_cluster_config.yml:

Set origin_contact_points and target_contact_points to the internal IP addresses of the Cassandra VM and the Cassandra-Bigtable Proxy VM, respectively. Leave authentication commented out, because it is not configured in this codelab.

##############################
#### ORIGIN CONFIGURATION ####
##############################
## Origin credentials (leave commented if no auth)
# origin_username: ...
# origin_password: ...

## Set the following two parameters only if Origin is a self-managed, non-Astra cluster
origin_contact_points: <Your-Cassandra-VM-Internal-IP> # Replace!
origin_port: 9042

##############################
#### TARGET CONFIGURATION ####
##############################
## Target credentials (leave commented if no auth)
# target_username: ...
# target_password: ...

## Set the following two parameters only if Target is a self-managed, non-Astra cluster
target_contact_points: <Your-Bigtable-Proxy-VM-Internal-IP> # Replace!
target_port: 9042

# --- Other ZDM Proxy settings can be configured below ---
# ... (keep defaults for this codelab)

Save and close the file.

3. Deploy the ZDM Proxy with Ansible

Run the Ansible playbook from the ansible directory on the jump host:

ansible-playbook deploy_zdm_proxy.yml -i zdm_ansible_inventory

This command installs the required software (such as Docker) on the proxy node (zdm-proxy-node-1), pulls the ZDM Proxy Docker image, and starts the proxy container with the configuration you provided.
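
The playbook reads the list of proxy hosts from the inventory file passed with -i. If zdm_ansible_inventory has not been filled in yet, a sketch along these lines could generate a one-node version. The [proxies] group name and the ansible_connection and ansible_user host variables are assumptions based on the format shown in the zdm-proxy-automation documentation, so check them against the repository. write_zdm_inventory is a hypothetical helper, and the SSH user must be one that the jump host can log in as on the proxy node.

```shell
# Hypothetical helper: write a one-node inventory for the ZDM Proxy playbooks
# into the current directory.
# $1 = internal IP of the proxy node, $2 = SSH user on that node (assumption).
write_zdm_inventory() {
  cat > zdm_ansible_inventory <<EOF
[proxies]
$1 ansible_connection=ssh ansible_user=$2
EOF
}

# Example (run from zdm-proxy-automation/ansible/):
# write_zdm_inventory <zdm-proxy-node-1-internal-ip> <ssh-user>
```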

4. Verify the health of the ZDM Proxy

Check the readiness endpoint of the ZDM Proxy running on zdm-proxy-node-1 (port 14001) from the jump host:

# Replace <zdm-proxy-node-1-internal-ip> with the actual internal IP.
curl -G http://<zdm-proxy-node-1-internal-ip>:14001/health/readiness

The output should look similar to the following, indicating that both the origin (Cassandra) and the target (Cassandra-Bigtable Proxy) are up:

{
  "OriginStatus": {
    "Addr": "<Your-Cassandra-VM-Internal-IP>:9042",
    "CurrentFailureCount": 0,
    "FailureCountThreshold": 1,
    "Status": "UP"
  },
  "TargetStatus": {
    "Addr": "<Your-Bigtable-Proxy-VM-Internal-IP>:9042",
    "CurrentFailureCount": 0,
    "FailureCountThreshold": 1,
    "Status": "UP"
  },
  "Status": "UP"
}
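
For scripted checks, the readiness response can be reduced to a pass/fail result. The sketch below only counts "Status": "UP" entries in a response shaped like the sample output above and expects three of them (origin, target, and overall). zdm_is_ready is a hypothetical helper, not part of the ZDM tooling.

```shell
# Hypothetical helper: read the readiness JSON on stdin and succeed only if the
# origin, the target, and the overall status all report "UP" (three matches).
zdm_is_ready() {
  up_count=$({ grep -o '"Status"[[:space:]]*:[[:space:]]*"UP"' || true; } | wc -l)
  [ "$up_count" -eq 3 ]
}

# Example:
# curl -sG http://<zdm-proxy-node-1-internal-ip>:14001/health/readiness | zdm_is_ready && echo "ready"
```

Counting matches keeps the check free of extra dependencies. A JSON-aware tool would be more robust if the response format changes.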

7. Configure the application and start dual writes

At this point in a real migration, you would reconfigure your applications to point at the IP address of the ZDM Proxy node on port 9042 instead of connecting directly to Cassandra.

Once the application connects to the ZDM Proxy, reads are served from the origin (Cassandra) by default, and writes are sent to both the origin (Cassandra) and the target (Bigtable, through the Cassandra-Bigtable Proxy). The application keeps working as usual while new data is written to both databases at the same time. You can test the connection with cqlsh pointed at the ZDM Proxy, from the jump host or from another VM in the network:

cqlsh <zdm-proxy-node-1-ip-address> 9042

Try inserting some data:

INSERT INTO zdmbigtable.employee (name, age, is_active) VALUES ('Alice', 30, true);
SELECT * FROM zdmbigtable.employee WHERE name = 'Alice';

This data should be written to both Cassandra and Bigtable. To confirm it in Bigtable, go to the Google Cloud console and open the Bigtable query editor for the instance. Run SELECT * FROM employee; the newly inserted data should be visible.
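
The same check can be made from the command line by running one query against both endpoints and comparing the output. This is a sketch under assumptions: same_cql_result is a hypothetical helper, it relies on the cqlsh -e option to execute a single statement, and it treats any textual difference as a mismatch, even though the two backends may format some values differently.

```shell
# Hypothetical helper: run one CQL statement against two endpoints and compare the output.
# $1 = origin host, $2 = target host, $3 = CQL statement.
# Returns 0 if the outputs match, 1 if they differ, 2 if either query fails.
same_cql_result() {
  origin_out=$(cqlsh "$1" 9042 -e "$3") || return 2
  target_out=$(cqlsh "$2" 9042 -e "$3") || return 2
  [ "$origin_out" = "$target_out" ]
}

# Example:
# same_cql_result <cassandra-vm-ip> <bigtable-proxy-vm-ip> \
#   "SELECT name, age, is_active FROM zdmbigtable.employee WHERE name = 'Alice';" \
#   && echo "dual write confirmed"
```

A mismatch is a prompt to inspect both outputs by hand, not proof that the dual write failed.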

8. Migrate historical data with Cassandra Data Migrator

Now that dual writes are active for new data, use Cassandra Data Migrator (CDM) to copy the existing historical data from Cassandra to Bigtable.

1. Create a Compute Engine VM for CDM

This VM needs enough memory for Spark.

gcloud compute instances create cdm-migrator-vm \
--machine-type=e2-medium \
--image-family=ubuntu-2004-lts \
--image-project=ubuntu-os-cloud \
--tags=cassandra-migration \
--boot-disk-size=40GB

2. Install the prerequisites (Java 11, Spark)

SSH into the cdm-migrator-vm VM:

gcloud compute ssh cdm-migrator-vm

Inside the VM:

# Install Java 11 
sudo apt-get update 
sudo apt-get install -y openjdk-11-jdk
 
# Verify Java installation 
java -version 

# Download and extract Spark (version 3.5.3)
# Check the Apache Spark archives for the correct URL if needed

wget https://archive.apache.org/dist/spark/spark-3.5.3/spark-3.5.3-bin-hadoop3-scala2.13.tgz
tar -xvzf spark-3.5.3-bin-hadoop3-scala2.13.tgz
 
export SPARK_HOME=$PWD/spark-3.5.3-bin-hadoop3-scala2.13 
export PATH=$PATH:$SPARK_HOME/bin

3. Download Cassandra Data Migrator

Download the CDM jar file. Check the Cassandra Data Migrator releases page on GitHub for the correct URL of the version you want.

# Example using version 5.2.2 - replace URL if needed
wget https://github.com/datastax/cassandra-data-migrator/releases/download/v5.2.2/cassandra-data-migrator-5.2.2.jar

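4. Configure CDM

Create a properties file named cdm.properties:

nano cdm.properties

Paste the following configuration, replacing the IP addresses, and disable the TTL/Writetime features, because Bigtable does not support them directly in the same way. Authentication is not configured in this codelab, so the credential settings can stay at their defaults.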

# Origin Cassandra connection
# Replace the host with the Cassandra VM's IP address.
spark.cdm.connect.origin.host <Your-Cassandra-VM-IP-Address>
spark.cdm.connect.origin.port 9042
# Leave the default credentials, or set them if auth is enabled.
spark.cdm.connect.origin.username cassandra
spark.cdm.connect.origin.password cassandra

# Target Bigtable (via Cassandra-Bigtable Proxy) connection
# Replace the host with the Cassandra-Bigtable Proxy VM's IP address.
spark.cdm.connect.target.host <Your-Bigtable-Proxy-VM-IP-Address>
spark.cdm.connect.target.port 9042
# Leave the default credentials, or set them if auth is enabled.
spark.cdm.connect.target.username cassandra
spark.cdm.connect.target.password cassandra

# Disable TTL/Writetime features (Important for Bigtable compatibility via Proxy)
spark.cdm.feature.origin.ttl.automatic false
spark.cdm.feature.origin.writetime.automatic false
spark.cdm.feature.target.ttl.automatic false
spark.cdm.feature.target.writetime.automatic false

Save and close the file.
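
Spark loads the file given to --properties-file in Java properties format, where # starts a comment only at the beginning of a line; a # later on a line becomes part of the value. A quick scan for trailing comments can catch values that would be read incorrectly. The helper name inline_comments is hypothetical, and the pattern is a simple heuristic, not a full properties parser.

```shell
# Hypothetical helper: list property lines that carry a trailing "# ..." comment.
# Lines whose first non-blank character is "#" are real comments and are skipped.
# Prints matching lines with their line numbers; exits non-zero if none are found.
# $1 = path to the properties file
inline_comments() {
  grep -nE '^[[:space:]]*[^#[:space:]].*[[:space:]]#' "$1"
}

# Example:
# inline_comments cdm.properties && echo "Move the comments above to their own lines"
```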

5. Run the migration job

Run the migration with spark-submit. The command tells Spark to run the CDM jar with the properties file and to specify the keyspace and table to migrate. Adjust the memory settings (--driver-memory, --executor-memory) to the size of your VM and the volume of data.

Make sure you are in the directory that contains the CDM jar and the properties file. If you downloaded a different build, replace cassandra-data-migrator-5.2.2.jar accordingly.

./spark-3.5.3-bin-hadoop3-scala2.13/bin/spark-submit \
--properties-file cdm.properties \
--conf spark.cdm.schema.origin.keyspaceTable="zdmbigtable.employee" \
--master "local[*]" \
--driver-memory 4G \
--executor-memory 4G \
--class com.datastax.cdm.job.Migrate \
cassandra-data-migrator-5.2.2.jar &> cdm_migration_$(date +%Y%m%d_%H%M).log

The command writes its output to the cdm_migration_….log file. To follow progress and spot any errors, tail the log file from a second SSH session:

tail -f cdm_migration_*.log

6. Verify the data migration

After the CDM job completes successfully, verify that the historical data exists in Bigtable. Because the Cassandra-Bigtable Proxy supports CQL reads, you can again use cqlsh to query the data, connected either to the ZDM Proxy (which routes reads to the target after cutover, or can be configured to do so) or directly to the Cassandra-Bigtable Proxy. To connect through the ZDM Proxy:

cqlsh <zdm-proxy-node-1-ip-address> 9042

Inside cqlsh:

SELECT COUNT(*) FROM zdmbigtable.employee; -- Check row count matches origin
SELECT * FROM zdmbigtable.employee LIMIT 10; -- Check some sample data

Alternatively, use the cbt tool (if it is installed on the CDM VM or in Cloud Shell) to look up data directly in Bigtable:

# First, install cbt if needed
# gcloud components update
# gcloud components install cbt

# Then lookup a specific row (replace 'some_employee_name' with an actual primary key)
cbt -project $PROJECT_ID -instance zdmbigtable lookup employee some_employee_name
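
For a coarse completeness check, the cbt count command reports the number of rows in a table, which can be compared with the row count on the Cassandra side. The wrapper below is a sketch: bigtable_row_count is a hypothetical name, and it assumes PROJECT_ID is still exported from the environment setup.

```shell
# Hypothetical helper: print the number of rows in a Bigtable table using cbt.
# $1 = instance ID, $2 = table name; the project comes from $PROJECT_ID.
bigtable_row_count() {
  cbt -project "$PROJECT_ID" -instance "$1" count "$2"
}

# Example:
# bigtable_row_count zdmbigtable employee
```

Matching counts do not prove that every column was copied correctly, so sample a few rows with lookup as well.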

9. Cutover (conceptual)

After you have thoroughly verified data consistency between Cassandra and Bigtable, you can proceed with the final cutover.

With the ZDM Proxy, cutover means reconfiguring the proxy to read primarily from the target (Bigtable) instead of the origin (Cassandra). This is typically done through the ZDM Proxy configuration, and it moves your application's read traffic to Bigtable.

Once you are confident that Bigtable is serving all traffic correctly, you can:

Stop dual writes by reconfiguring the ZDM Proxy.
Decommission the original Cassandra cluster.
Remove the ZDM Proxy and have the application connect directly to the Cassandra-Bigtable Proxy, or use the native Bigtable CQL client for Java.

The details of reconfiguring the ZDM Proxy for cutover are beyond the scope of this basic codelab, but they are covered in the Datastax ZDM documentation.
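
As a rough illustration only, switching the primary cluster could look like the sketch below. It assumes that the ZDM automation exposes a primary_cluster variable (ORIGIN or TARGET) in vars/zdm_proxy_core_config.yml and a rolling_update_zdm_proxy.yml playbook. Those names are recalled from the ZDM documentation and should be verified there before use. set_primary_cluster is a hypothetical helper, and sed -i is used in its GNU form.

```shell
# Hypothetical helper: set primary_cluster in a ZDM core config file.
# $1 = path to the config file, $2 = ORIGIN or TARGET.
# Also rewrites the line if it is currently commented out.
set_primary_cluster() {
  case "$2" in
    ORIGIN|TARGET) ;;
    *) echo "expected ORIGIN or TARGET" >&2; return 1 ;;
  esac
  sed -i "s/^#* *primary_cluster:.*/primary_cluster: $2/" "$1"
}

# Example (assumed file and playbook names; verify against the ZDM documentation):
# set_primary_cluster vars/zdm_proxy_core_config.yml TARGET
# ansible-playbook rolling_update_zdm_proxy.yml -i zdm_ansible_inventory
```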

10. Clean up

To avoid incurring charges, delete the resources created during this codelab.

1. Delete the Compute Engine VMs

gcloud compute instances delete cassandra-origin zdm-jumphost zdm-proxy-node-1 bigtable-proxy-vm cdm-migrator-vm --zone=$ZONE --quiet

2. Delete the Bigtable instance

gcloud bigtable instances delete zdmbigtable

3. Delete the firewall rule

gcloud compute firewall-rules delete allow-migration-internal

4. Delete the Cassandra database (if installed locally or persisted)

If you installed Cassandra somewhere other than the Compute Engine VM created here, follow the appropriate steps to remove the data or uninstall Cassandra.

11. Congratulations!

You have completed the process of setting up a proxy-based migration path from Apache Cassandra to Bigtable.

You learned how to:

Deploy Cassandra and Bigtable.
Configure the Cassandra-Bigtable Proxy for CQL compatibility.
Deploy the Datastax ZDM Proxy to manage traffic and dual writes.
Use Cassandra Data Migrator to move historical data.

This approach enables migrations with minimal downtime and without changes to application code, by using the proxy layer.

Next steps

Explore the Bigtable documentation.
See the Datastax ZDM Proxy documentation for advanced configuration and cutover procedures.
Review the Cassandra-Bigtable Proxy repository for more details.
Check the Cassandra Data Migrator repository for advanced usage.
Try other Google Cloud codelabs.