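Master KCC Operations with Google Antigravity

1. Introduction

In this codelab, you will learn about Google Antigravity (referred to as Antigravity for the rest of this document), an agentic development platform that evolves the IDE into the agent-first era.

Unlike standard coding assistants that only autocomplete lines, Antigravity provides a "Mission Control" for managing autonomous agents that can plan, write code, and even browse the web to help you build.

Antigravity is designed as an agent-first platform. It assumes that the AI is not just a tool for writing code but an autonomous actor that can plan, execute, validate, and iterate on complex engineering tasks with minimal human intervention.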

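What you'll learn

  • How to install and configure Antigravity.
  • Key concepts of Antigravity, such as the Agent Manager, Editor, and Browser.
  • How to build a production-grade KCC Ops skill from scratch to manage Google Cloud resources safely and compliantly.

What you'll need

Antigravity is currently available as a preview for personal Gmail accounts. It comes with a free quota for using premium models.

Antigravity must be installed locally on your system. It is available for Mac, Windows, and specific Linux distributions. In addition to your own machine, you need the following:

  • A personal Gmail account.
  • A Google Cloud account and a Google Cloud project.
  • A web browser, such as Chrome, that supports the Google Cloud console and Cloud Shell.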

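2. Setup and requirements

Project setup

Create a Google Cloud project

  1. In the Google Cloud console, on the project selector page, select or create a Google Cloud project.
  2. Make sure that billing is enabled for your Cloud project. Learn how to check if billing is enabled on a project.

This codelab is designed for users and developers of all levels, including beginners.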

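3. Installation

If you have not installed Antigravity yet, start there. The product is currently available as a preview, and you can get started with a personal Gmail account.

Go to the Downloads page and choose the version for your operating system. Launch the installer and install the application on your machine. When the installation finishes, launch Antigravity.

Key steps during setup:

  • Choose setup flow: We recommend starting fresh for this codelab.
  • Review-driven development (recommended): Select this option. It lets the agent make decisions and return to the user for approval, which is essential for infrastructure operations.

Next, configure your editor and sign in to Google. Finally, accept the terms of use.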

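4. Infrastructure setup: GKE and Config Connector

Before building the skill, you need a Google Cloud environment with Config Connector (KCC) installed manually and configured in namespaced mode. This lets you manage GCP resources as Kubernetes objects.

Step 0: Prepare the environment

1. Cluster prerequisites

Create a new GKE cluster with the required features enabled: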

# Set your variables
export PROJECT_ID=$(gcloud config get-value project)
export CLUSTER_NAME="kcc-ops-cluster"
export REGION="us-central1"

# Create the cluster
gcloud container clusters create ${CLUSTER_NAME} \
    --region ${REGION} \
    --release-channel "regular" \
    --workload-pool=${PROJECT_ID}.svc.id.goog \
    --logging=SYSTEM \
    --monitoring=SYSTEM

# Get cluster credentials
gcloud container clusters get-credentials ${CLUSTER_NAME} --region ${REGION}

2. Install the Config Connector operator

The operator keeps your installation up to date.

# Download the latest Config Connector operator
gcloud storage cp gs://configconnector-operator/latest/release-bundle.tar.gz release-bundle.tar.gz

# Extract the bundle
tar zxvf release-bundle.tar.gz

# Install the operator (Standard Cluster)
kubectl apply -f operator-system/configconnector-operator.yaml

3. Configure namespaced mode

Create a ConfigConnector resource to specify the operating mode.

# configconnector.yaml
apiVersion: core.cnrm.cloud.google.com/v1beta1
kind: ConfigConnector
metadata:
  name: configconnector.core.cnrm.cloud.google.com
spec:
  mode: namespaced
  stateIntoSpec: Absent

Apply the manifest:

kubectl apply -f configconnector.yaml

4. Create the identity and namespace

In this lab, we use the default namespace and a dedicated Google service account (GSA).

# Set your variables
export PROJECT_ID=$(gcloud config get-value project)
export NAMESPACE="default"

# Create the Google Service Account
gcloud iam service-accounts create kcc-identity --project ${PROJECT_ID}

# Grant permissions on the project
gcloud projects add-iam-policy-binding ${PROJECT_ID} \
    --member="serviceAccount:kcc-identity@${PROJECT_ID}.iam.gserviceaccount.com" \
    --role="roles/owner"

# Grant Metric Writer permissions
gcloud projects add-iam-policy-binding ${PROJECT_ID} \
    --member="serviceAccount:kcc-identity@${PROJECT_ID}.iam.gserviceaccount.com" \
    --role="roles/monitoring.metricWriter"

# Bind GSA to KSA via Workload Identity
gcloud iam service-accounts add-iam-policy-binding \
    kcc-identity@${PROJECT_ID}.iam.gserviceaccount.com \
    --member="serviceAccount:${PROJECT_ID}.svc.id.goog[cnrm-system/cnrm-controller-manager-${NAMESPACE}]" \
    --role="roles/iam.workloadIdentityUser"

5. Configure the namespace

Create a ConfigConnectorContext so that Config Connector watches the namespace.

# configconnectorcontext.yaml
apiVersion: core.cnrm.cloud.google.com/v1beta1
kind: ConfigConnectorContext
metadata:
  name: configconnectorcontext.core.cnrm.cloud.google.com
  namespace: default
spec:
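  # kubectl does not expand shell variables; replace ${PROJECT_ID} with your project ID before applying
  googleServiceAccount: "kcc-identity@${PROJECT_ID}.iam.gserviceaccount.com"
  stateIntoSpec: Absent

Apply the manifest:

kubectl apply -f configconnectorcontext.yaml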
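
Because kubectl applies the file exactly as written, one way to get the project ID into the manifest is to render the file from the shell with a here-document. The following is a minimal sketch, assuming PROJECT_ID is exported as in the earlier step; the fallback value my-project-id is only a placeholder, not a real project.

```shell
#!/bin/bash
# Sketch: render configconnectorcontext.yaml with the project ID filled in.
# Assumption: PROJECT_ID was exported earlier; "my-project-id" is a placeholder fallback.
PROJECT_ID="${PROJECT_ID:-my-project-id}"

# An unquoted EOF delimiter lets the shell expand ${PROJECT_ID} inside the document.
cat > configconnectorcontext.yaml <<EOF
apiVersion: core.cnrm.cloud.google.com/v1beta1
kind: ConfigConnectorContext
metadata:
  name: configconnectorcontext.core.cnrm.cloud.google.com
  namespace: default
spec:
  googleServiceAccount: "kcc-identity@${PROJECT_ID}.iam.gserviceaccount.com"
  stateIntoSpec: Absent
EOF

# Show the rendered line so the substitution can be checked before applying.
grep "googleServiceAccount" configconnectorcontext.yaml
```

Once the printed line shows the expected service account, the rendered file can be applied with kubectl apply -f configconnectorcontext.yaml.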

6. Verify the installation

Wait for the controller to become ready for the default namespace.

kubectl wait -n cnrm-system \
    --for=condition=Ready pod \
    -l cnrm.cloud.google.com/component=cnrm-controller-manager,cnrm.cloud.google.com/scoped-namespace=default

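5. Agent Manager: Mission Control

Antigravity is a fork of the open-source Visual Studio Code (VS Code) foundation, but it radically changes the user experience to prioritize agent management over text editing. The interface is split into two distinct main windows: the Editor and the Agent Manager.

Agent Manager

When Antigravity starts, the user is typically greeted by the Agent Manager. This interface acts as a Mission Control dashboard. It is designed for high-level orchestration, letting developers spawn, monitor, and interact with multiple agents that run asynchronously across different workspaces or tasks.

In this view, the developer acts as the architect and defines high-level goals. Each request spawns a dedicated agent instance. The interface visualizes these parallel workstreams, showing each agent's status, the artifacts it has produced (plans, results, diffs), and any requests awaiting human approval.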

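6. Antigravity Browser and artifacts

Antigravity creates artifacts as it plans and completes work. These include rich Markdown files, architecture diagrams, images, browser recordings, and code diffs.

Artifacts close the "trust gap"

When an agent claims "I fixed the bug," the developer previously had to read the code to verify it. In Antigravity, the agent produces artifacts as evidence.

Artifacts

These are the main artifacts that Antigravity produces:

  • Task Lists: Structured plans generated before any code is written.
  • Implementation Plan: Architectural changes with technical details.
  • Walkthrough: A summary of the changes and how to test them.
  • Browser Recordings: Video recordings of browser sessions, used for UI verification.

Antigravity Browser

When an agent needs to interact with the web, it invokes a browser subagent. This subagent can click, scroll, type, and read console logs. It uses a specialized model to operate pages that are open in the Antigravity-managed browser.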

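7. Editor experience

The Editor keeps the familiar feel of VS Code. It includes the standard file explorer, syntax highlighting, and the extension ecosystem.

Key editor features:

  • Autocomplete: Smart suggestions that you accept with Tab.
  • Tab to import: Suggestions for adding missing dependencies.
  • Command (Cmd + I): Trigger inline completions with natural language.
  • Agent side panel (Cmd + L): Toggle the agent panel to ask questions or reference files with @.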

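8. Providing feedback

At the core of Antigravity is the ability to gather your feedback easily. Artifacts are how you give feedback to the agent, through Google Docs-style comments.

Whenever you add comments to a plan or task, remember to submit them. This steers the agent in the direction you want.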

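9. Build the KCC Ops skill

Now that you understand the platform, let's build the KCC Ops skill.

Kubernetes Config Connector (KCC) lets you manage GCP resources as Kubernetes objects. However, it needs safety rails to guard against configuration drift, policy violations, and accidental resource recreation.

Step 1: Scaffold the skill

In the workspace root, create the directory structure for your skill: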

mkdir -p .agents/skills/kcc-ops/scripts
mkdir -p .agents/skills/kcc-ops/resources/policies/templates
mkdir -p .agents/skills/kcc-ops/resources/policies/constraints

Step 2: Write SKILL.md (the brain)

SKILL.md defines the agent's metadata and core "Golden Rules". Create .agents/skills/kcc-ops/SKILL.md:

---
name: kcc-ops
description: Assists with Config Connector (KCC) configuration, resource generation, and troubleshooting on Google Cloud.
---

# Config Connector (KCC) Operations Skill

Use this skill to manage Google Cloud resources using Kubernetes-style configuration (Config Connector).

## 🛑 GOLDEN RULE: Separate Generation from Application

**NEVER generate and apply a manifest in a single autonomous step.**

1. **Craft:** Write the generated manifest to a local file.
2. **Analyze:** Present the manifest to the user. Perform Impact Analysis and Dry Runs. Explain the consequences of the change (e.g., "If this topic is deleted, the attached subscription becomes orphaned").
3. **Wait:** Pause execution and explicitly wait for user permission to proceed.
4. **Apply:** Only run `kubectl apply` *after* the user has reviewed the manifest and the impact analysis, and then unequivocally confirmed you should proceed.

## Core Responsibilities

0. **Context Verification**: Verify the execution context (cluster, namespace, GCP project, user account) with the user before performing any operations.
1. **Installation & Health**: Verify KCC is properly installed and healthy on the target cluster.
2. **Resource Inventory**: Query and summarize existing KCC resources within a namespace.
3. **Brownfield Bulk Export (Adoption)**: Export existing GCP project resources into valid KCC YAML manifests.
4. **Manifest Generation**: Generate valid YAML manifests for GCP resources using KCC CRDs.
5. **Impact Analysis**: Identify ancillary services and resources (e.g., Cloud Run, Apps) that depend on a resource being modified.
6. **Change Differentiation**: Generate diff summaries for resource edits to support change control.
7. **Policy Compliance**: Vet KCC manifests against OPA/Gatekeeper policies.
8. **Troubleshooting**: Analyze resource status, and consult the troubleshooting guide to resolve reconciliation issues.

## Guidelines for Operations

### 0. Context Verification

Before performing **any operations or executing commands** (including health checks), you **MUST** verify the current execution context and obtain explicit user confirmation.

1. **Read Context:** Use commands like `kubectl config current-context`, `kubectl config view --minify -o jsonpath='{.contexts[0].context.namespace}'`, `gcloud config get-value project`, and `gcloud config get-value account` to determine the active environment.
2. **Present & Ask:** Show this information to the user clearly (e.g., "I see my context is X, namespace is Y, project is Z, and account is A. Is this correct?").
3. **Wait:** Do not proceed with any other steps or scripts until the user has confirmed or provided corrections.

### 1. Installation & Health Check

Before performing operations, ensure the environment is ready:

- **Automation**: You MUST use `./scripts/check-health.sh` to verify namespaces, controllers, and CRDs. Do not use manual kubectl commands for health checks, as the script enforces standard formatting and context verification.

### 2. Resource Inventory & Discovery

To understand the current state of infrastructure:

- **Automation**: You MUST use `./scripts/inventory.sh` to get a summary table of all KCC resources. Do not use manual kubectl queries, as the script is optimized to securely discover all CRDs with context validation.

### 3. Manifest Structure

- All KCC resources belong to the `cnrm.cloud.google.com` API group.
- Use the `cnrm.cloud.google.com/project-id` annotation for cross-project resource management if not using Namespaced Mode.
- Always include `apiVersion`, `kind`, `metadata`, and `spec`.

### 4. Official Resource Reference (Agent Action)

When generating or troubleshooting manifests, you **must not guess** the API schema. Always consult the [Official Config Connector Reference](https://cloud.google.com/config-connector/docs/reference/overview) for the exact API version, kind, and required fields for the specific resource and cross reference with the official [github repository](https://github.com/GoogleCloudPlatform/k8s-config-connector/tree/master/config/crds/resources).

### 5. Troubleshooting Checklist

When a resource is not reconciling (check `kubectl get <kind> <name> -o yaml`):

- **Ready Condition**: Look for `status.conditions` where `type: Ready` and `status: "False"`.
- **Reason/Message**: Check the `reason` and `message` fields in the status conditions.
- **Consult the Guide**: Immediately check `./resources/troubleshooting-guide.md` for definitions of the error reason (e.g. `DependencyInvalid`, `ManagementConflict`) and follow its resolution steps.
- **Common Issues**:
  - Permissions: The KCC service account lacks IAM roles.
  - Quotas: GCP project quota exceeded.
  - Conflicts: Resource already exists or is managed by another tool.
  - Immutable Fields: Attempting to change a field that requires resource recreation. Look for "Update failed" errors related to immutable fields.
  - Reference Resolution: Check if the resource is waiting for a dependency (e.g., `referenced project not found`).

### 6. Impact Analysis (Ancillary Services)

Before modifying a resource (e.g., GCS Bucket, Pub/Sub Topic), verify what else depends on it:

- **Reference Search (Cluster-wide)**: Search for other KCC resources that reference the item.

  ```bash
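  # Example: Find KCC resources referencing a bucket named 'my-data-bucket'.
  # `kubectl get-all` is a krew plugin, not a built-in command, so list the KCC kinds explicitly.
  kubectl get "$(kubectl api-resources --namespaced -o name | grep 'cnrm\.cloud\.google\.com$' | paste -sd, -)" \
    -n <namespace> -o yaml | grep -C 5 "my-data-bucket"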
  ```

- **IAM-based Analysis**: Check for IAM Service Accounts that have roles on the specific resource. A Cloud Run job or GKE Workload Identity might be using those permissions.
- **Common Ancillary Dependencies**:
  - **Storage Buckets**: Look for Cloud Run/GKE mounts (CSI), Cloud Functions triggers, or Dataflow jobs.
  - **Networks**: Check for Firewall rules, Forwarding rules, and GKE cluster assignments.
  - **IAM Policies**: Changing a policy might break access for external applications not managed by KCC.
- **Resource Graph**: Use `gcloud asset search-all-resources` to find resources that might have implicit links.

### 7. Policy Compliance & Best Practices

Evaluate KCC manifests against security and governance policies. The vetting tool supports three source modes:

- **Built-in Mode (Default)**: Uses the skill's high-fidelity `v1beta1` library (300+ Anthos constraints).
  - `Usage: ./scripts/vet-policy.sh <manifest-path>`
- **Remote Mode**: Clones and vets against an external Git repository.
  - `Usage: ./scripts/vet-policy.sh <manifest-path> <repo-url> [git-ref]`
  - ⚠️ **Note**: External libraries like the legacy GCP Policy Library may be out-of-date and cause schema validation errors with modern `gator`.
- **Local Mode**: Vets against a local directory of policies.
  - `Usage: ./scripts/vet-policy.sh <manifest-path> /path/to/local/policies`

**Interaction Model:**

1. Call `./scripts/vet-policy.sh` with the appropriate arguments.
2. Interpret the `=== KCC Best Practices ===` and `=== OPA/Gatekeeper ===` reports.
3. Supplement automated findings with manual review for specific security features not yet covered by OPA (e.g., `publicAccessPrevention: enforced`, `versioning: {enabled: true}`).

  ```bash
  # Run the skill's helper script (repo URL and branch are optional)
  ./scripts/vet-policy.sh manifest.yaml [policy-repo-url] [policy-ref]
  ```

## Skill Assets

This skill includes additional resources to streamline operations:

- **`scripts/`**: Automation scripts (e.g., `vet-policy.sh`, `bulk-export.sh`).
- **`examples/`**: Reference KCC manifests (e.g., `restricted-bucket.yaml`).
- **`resources/`**: Common templates, documentation snippets, and troubleshooting guides (e.g., `troubleshooting-guide.md`).

### 8. Safety Rails for Applying Manifests

Before applying any KCC manifest update to an existing resource, you MUST:

- **Verify Immutable Fields**: Call `./scripts/verify-immutable.sh <manifest-path>` to detect updates to fields (like `location`, `name`, `project-id`) that trigger destructive resource recreation.
- **Explain Impact**: If destructive changes are detected, you MUST warn the user and explain the downtime/data loss implications before requesting approval.

### 9. Emergency Recovery & Troubleshooting

If a resource is stuck in a "Deletion" or "Error" state:

- **Check for Abandon Flag**: Check if the resource has the `cnrm.cloud.google.com/deletion-policy: abandon` annotation. If it does, you will need to remove the annotation and then force delete the resource.
- **Force Delete**: Call `./scripts/force-delete.sh <kind> <name> [namespace]` to bypass Kubernetes finalizers and remove the resource from the cluster.
- **Orphan Warning**: Inform the user that force-deleting a KCC object may leave an orphaned resource in Google Cloud that requires manual cleanup.

### 10. Change Differentiation

When editing an existing resource, always generate a diff to summarize the change for reviewers or Git history:

- **Local Diff**:

  ```bash
  # Diff a local file against the cluster state
  kubectl diff -f modified-resource.yaml
  ```

- **Commit Summary Template**:

  ```text
  [KCC Change] Update <ResourceName> (<Kind>)
  - Field 'spec.foo' changed from 'X' to 'Y'
  - Impact: Ancillary service <ServiceName> will see updated <Config>
  ```

### 11. Best Practices

- **Namespaced Mode**: Prefer namespaced mode for better isolation.
- **Sensitive Data**: Use `spec.credential.secretRef` or similar for sensitive fields.
- **Resource Naming**: Use consistent naming conventions that match your Kubernetes/GCP standards.
- **Annotations**:
  - `cnrm.cloud.google.com/deletion-policy: abandon`: Keep GCP resource on KCC deletion.
  - `cnrm.cloud.google.com/state-into-spec: absent`: Prevents KCC from syncing GCP state back into the Kubernetes object (useful for avoiding reconciliation loops on fields like node counts).

## Common Resource Examples

### Compute Instance

```yaml
apiVersion: compute.cnrm.cloud.google.com/v1beta1
kind: ComputeInstance
metadata:
  name: instance-sample
  annotations:
    cnrm.cloud.google.com/project-id: "my-project-id"
spec:
  machineType: n1-standard-1
  zone: us-central1-a
  bootDisk:
    initializeParams:
      sourceImage: projects/debian-cloud/global/images/family/debian-11
  networkInterface:
    - networkRef:
        name: default
```

### Storage Bucket

```yaml
apiVersion: storage.cnrm.cloud.google.com/v1beta1
kind: StorageBucket
metadata:
  name: bucket-sample
spec:
  location: US
```

### Pub/Sub Topic & Subscription

```yaml
apiVersion: pubsub.cnrm.cloud.google.com/v1beta1
kind: PubSubTopic
metadata:
  name: order-events-topic
---
apiVersion: pubsub.cnrm.cloud.google.com/v1beta1
kind: PubSubSubscription
metadata:
  name: order-processor-sub
spec:
  topicRef:
    name: order-events-topic
  ackDeadlineSeconds: 30
```

Step 3: Implement the inventory tool

Create .agents/skills/kcc-ops/scripts/inventory.sh to discover KCC resources:

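#!/bin/bash
# List every resource type in the cnrm.cloud.google.com API groups.
# "-o name" prints fully qualified names (resource.group), so a KCC type such as
# services.serviceusage.cnrm.cloud.google.com is not confused with core Services.
KCC_KINDS=$(kubectl api-resources -o name | grep 'cnrm\.cloud\.google\.com$')
if [ -z "$KCC_KINDS" ]; then
    echo "No Config Connector resource types found. Is KCC installed?" >&2
    exit 1
fi
KCC_KINDS_CSV=$(echo "$KCC_KINDS" | paste -sd, -)

printf "%-40s %-30s %-10s %s\n" "KIND" "NAME" "READY" "STATUS/MESSAGE"
kubectl get "$KCC_KINDS_CSV" -A -o custom-columns="KIND:.kind,NAME:.metadata.name,READY:.status.conditions[?(@.type=='Ready')].status,MSG:.status.conditions[?(@.type=='Ready')].message" --ignore-not-found --no-headers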
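
When the inventory grows, it can help to narrow the table to resources whose Ready condition is not True. The following is a minimal sketch that filters the four-column output of inventory.sh. The function name not_ready and the sample rows are illustrative assumptions, not real cluster output.

```shell
#!/bin/bash
# Sketch: keep only inventory rows whose READY column (3rd field) is not "True".
# Assumption: input uses the KIND NAME READY MESSAGE layout printed by inventory.sh.
not_ready() {
    awk '
        NR == 1 && $1 == "KIND" { print; next }  # pass the header row through
        $3 != "True"            { print }        # keep rows that are not ready
    '
}

# Illustrative sample rows (hypothetical, not real cluster output).
not_ready <<'EOF'
KIND NAME READY STATUS/MESSAGE
StorageBucket bucket-sample True The resource is up to date
PubSubTopic order-events-topic False Update call failed
EOF
```

In practice the function would sit at the end of a pipe, for example ./scripts/inventory.sh | not_ready, so that the agent only has to reason about resources that need attention.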

Step 4: Add policy vetting logic

Create .agents/skills/kcc-ops/scripts/vet-policy.sh. This script uses gator to vet manifests against OPA policies:

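#!/bin/bash
# Vet a manifest against the skill's built-in OPA/Gatekeeper policies.
MANIFEST="$1"
if [ -z "$MANIFEST" ] || [ ! -f "$MANIFEST" ]; then
    echo "Usage: $0 <manifest-path>" >&2
    exit 2
fi
# The skill root is the parent of the scripts/ directory that holds this file.
SKILL_ROOT=$(dirname "$(dirname "$0")")
POLICY_SRC="$SKILL_ROOT/resources/policies"

echo "=== OPA/Gatekeeper Policy Vetting ==="
if command -v gator >/dev/null 2>&1; then
    gator test -f "$MANIFEST" -f "$POLICY_SRC/templates" -f "$POLICY_SRC/constraints"
else
    echo "Gator not found. Skipping OPA audit."
fi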
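
The templates and constraints directories created in Step 1 start out empty, so gator has nothing to evaluate until you add policies. The following is a minimal sketch of one policy pair, assuming you want to require uniform bucket-level access on StorageBucket manifests. The names KccBucketUniformAccess and bucket-uniform-access are hypothetical and do not come from any published policy library.

```shell
#!/bin/bash
# Sketch: write one ConstraintTemplate and one Constraint for gator to evaluate.
# Assumption: the kind and object names below are hypothetical examples.
POLICY_DIR=".agents/skills/kcc-ops/resources/policies"
mkdir -p "$POLICY_DIR/templates" "$POLICY_DIR/constraints"

# The template carries the Rego rule; its metadata.name must be the lowercased kind.
cat > "$POLICY_DIR/templates/kccbucketuniformaccess.yaml" <<'EOF'
apiVersion: templates.gatekeeper.sh/v1
kind: ConstraintTemplate
metadata:
  name: kccbucketuniformaccess
spec:
  crd:
    spec:
      names:
        kind: KccBucketUniformAccess
  targets:
    - target: admission.k8s.gatekeeper.sh
      rego: |
        package kccbucketuniformaccess

        violation[{"msg": msg}] {
          input.review.object.kind == "StorageBucket"
          not input.review.object.spec.uniformBucketLevelAccess
          msg := sprintf("StorageBucket %v must set spec.uniformBucketLevelAccess to true", [input.review.object.metadata.name])
        }
EOF

# The constraint activates the template for KCC StorageBucket objects.
cat > "$POLICY_DIR/constraints/bucket-uniform-access.yaml" <<'EOF'
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: KccBucketUniformAccess
metadata:
  name: bucket-uniform-access
spec:
  match:
    kinds:
      - apiGroups: ["storage.cnrm.cloud.google.com"]
        kinds: ["StorageBucket"]
EOF

echo "Wrote policy files under $POLICY_DIR"
```

With these files in place, vetting the bucket-sample manifest from SKILL.md is expected to report a violation, because that manifest sets only spec.location.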

Step 5: Implement the immutable field guard

This is a critical safety rail. Create .agents/skills/kcc-ops/scripts/verify-immutable.sh:

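#!/bin/bash
# Heuristic text check: warn when a commonly immutable field differs between
# the local manifest and the live object in the cluster.
MANIFEST="$1"
if [ -z "$MANIFEST" ] || [ ! -f "$MANIFEST" ]; then
    echo "Usage: $0 <manifest-path>" >&2
    exit 2
fi
KIND=$(grep -m1 "^kind:" "$MANIFEST" | awk '{print $2}')
NAME=$(grep -m1 -E "^[[:space:]]+name:" "$MANIFEST" | awk '{print $2}')

# Print the first value of a field in a YAML file, with quotes stripped.
# The "/" alternative also matches annotation keys such as cnrm.cloud.google.com/project-id.
field_value() {
    grep -m1 -E "(^|[[:space:]/])$1:" "$2" | awk '{print $2}' | tr -d "\"'"
}

# Check for changes in common immutable fields
IMMUTABLE_FIELDS=("location" "project-id" "name" "zone" "region")
TEMP_FILE=$(mktemp)
trap 'rm -f "$TEMP_FILE"' EXIT
# Fetch the live object; if it does not exist yet, there is nothing to compare.
kubectl get "$KIND" "$NAME" -o yaml > "$TEMP_FILE" 2>/dev/null

for field in "${IMMUTABLE_FIELDS[@]}"; do
    NEW=$(field_value "$field" "$MANIFEST")
    OLD=$(field_value "$field" "$TEMP_FILE")
    if [ -n "$NEW" ] && [ -n "$OLD" ] && [ "$NEW" != "$OLD" ]; then
        echo "🚨 WARNING: Immutable field '$field' is changing! Potential resource recreation."
    fi
done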

Step 6: Emergency recovery (force delete)

Create .agents/skills/kcc-ops/scripts/force-delete.sh:

#!/bin/bash
KIND=$1; NAME=$2; NS=${3:-default}
echo "Removing finalizers for $KIND/$NAME in $NS..."
kubectl patch "$KIND" "$NAME" -n "$NS" -p '{"metadata":{"finalizers":null}}' --type=merge
kubectl delete "$KIND" "$NAME" -n "$NS" --wait=false

Step 7: Finalize the assets

Make all scripts executable:

chmod +x .agents/skills/kcc-ops/scripts/*.sh

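10. Test the new skill

Now start a new conversation and test your skill:

  1. Discovery: @kcc-ops Show me all KCC resources in my cluster.
  2. Compliance: Create a file named bucket.yaml that contains a StorageBucket. Ask: @kcc-ops Vet my bucket.yaml manifest.
  3. Safety: Try changing the location of an existing bucket in bucket.yaml. Ask: @kcc-ops Verify my bucket.yaml for immutable changes.

Notice how the agent picks the right script and follows the "Golden Rule" in SKILL.md.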
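
The Golden Rule that the agent follows (craft, analyze, wait, apply) can also be rehearsed by hand. The following is a minimal sketch of that flow as a shell function, assuming the manifest already exists on disk. The function name review_and_apply and the KUBECTL override variable are hypothetical helpers introduced for illustration.

```shell
#!/bin/bash
# Sketch: gate "kubectl apply" behind a diff, a server-side dry run, and an explicit "yes".
# Assumption: KUBECTL can be overridden (for example with "echo") to rehearse the flow.
KUBECTL="${KUBECTL:-kubectl}"

review_and_apply() {
    local manifest="$1"
    if [ ! -f "$manifest" ]; then
        echo "Manifest not found: $manifest" >&2
        return 2
    fi

    # Analyze: show what would change. kubectl diff exits 1 when differences exist,
    # so its exit code is not treated as a failure here.
    "$KUBECTL" diff -f "$manifest" || true

    # Dry run: let the API server validate the manifest without persisting it.
    if ! "$KUBECTL" apply --dry-run=server -f "$manifest"; then
        echo "Dry run failed; not applying." >&2
        return 1
    fi

    # Wait: require an unambiguous confirmation from the user.
    local answer=""
    read -r -p "Apply $manifest? Type 'yes' to proceed: " answer || answer=""
    if [ "$answer" != "yes" ]; then
        echo "Aborted; nothing was applied."
        return 1
    fi

    # Apply: only reached after explicit approval.
    "$KUBECTL" apply -f "$manifest"
}
```

Calling review_and_apply bucket.yaml walks through the same checkpoints the skill asks the agent to respect; any answer other than yes leaves the cluster untouched.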

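11. Securing the agent

Giving an AI agent access to your terminal is powerful, but it needs controls.

Go to Antigravity - Settings - Terminal and explore the Allow List and Deny List:

  • Allow List: Add ls, kubectl get, and your skill scripts here.
  • Deny List: Add sudo, rm -rf, or other destructive commands to ensure that the agent always asks for permission.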

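12. Summary

Congratulations! You have gone from installing Antigravity to building a high-fidelity KCC Operations skill.

You learned:

  • How to extend agents with custom Bash tools.
  • How to encode operational "Golden Rules" in SKILL.md.
  • How to provide safety rails for complex infrastructure management.

Next steps

Extend the resources/policies folder with more OPA constraints, or add a check-health.sh script to automate cluster readiness checks!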
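
As a starting point for that health check, the following is a minimal sketch, assuming Config Connector runs in the cnrm-system namespace as set up earlier in this codelab. The function name check_health, the KUBECTL override variable, and the output layout are illustrative assumptions rather than a prescribed format.

```shell
#!/bin/bash
# Sketch of check-health.sh: report the context, the KCC namespace, controller pods, and CRDs.
# Assumptions: KCC runs in cnrm-system; KUBECTL may be overridden to rehearse the script.
KUBECTL="${KUBECTL:-kubectl}"

check_health() {
    local failures=0

    echo "=== Context ==="
    "$KUBECTL" config current-context || failures=$((failures + 1))

    echo "=== Namespace ==="
    if "$KUBECTL" get namespace cnrm-system >/dev/null 2>&1; then
        echo "cnrm-system: present"
    else
        echo "cnrm-system: MISSING"
        failures=$((failures + 1))
    fi

    echo "=== Controller pods ==="
    "$KUBECTL" get pods -n cnrm-system || failures=$((failures + 1))

    echo "=== CRDs ==="
    local crd_count
    # grep -c prints 0 and exits non-zero when nothing matches, hence the "|| true".
    crd_count=$("$KUBECTL" get crds -o name 2>/dev/null | grep -c 'cnrm\.cloud\.google\.com' || true)
    echo "KCC CRDs installed: $crd_count"
    [ "$crd_count" -gt 0 ] || failures=$((failures + 1))

    # A non-zero return value is the number of failed checks.
    return "$failures"
}
```

Saved as .agents/skills/kcc-ops/scripts/check-health.sh with a final line that calls check_health, this gives the agent a script to run for the health check that SKILL.md refers to.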