1. Introduction
Overview
In this codelab, you'll create a Cloud Run service written in Node.js that provides a visual description of every scene in a video. First, the service uses the Video Intelligence API to detect the timestamps whenever the scene changes. Next, the service uses a third-party binary called ffmpeg to capture a screenshot for each scene-change timestamp. Finally, Vertex AI visual captioning provides a visual description of each screenshot.
This codelab also demonstrates how to use ffmpeg within a Cloud Run service to capture an image from a video at a given timestamp. Because ffmpeg has to be installed separately, this codelab shows you how to create a Dockerfile that installs ffmpeg as part of the Cloud Run service.
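For context, the screenshot step ultimately boils down to an ffmpeg invocation like the sketch below, which you could run by hand; the service itself drives ffmpeg through the fluent-ffmpeg npm package rather than shelling out directly, and the filename and timestamp here are placeholder values.
# capture a single video frame at the 7-second mark and save it as a PNG
ffmpeg -ss 7 -i testvideo.mp4 -frames:v 1 scene-7.png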
The following diagram illustrates how the Cloud Run service works:

What you'll learn
- How to create a container image using a Dockerfile to install a third-party binary
- How to follow the principle of least privilege by creating a service account for the Cloud Run service to call other Google Cloud services
- How to use the Video Intelligence client library from a Cloud Run service
- How to make a call to Google APIs to get the visual description of each scene from Vertex AI
2. Setup and requirements
Prerequisites
- You are signed in to the Cloud console.
- You've previously deployed a Cloud Run service. For example, you can follow the quickstart Deploy a web service to get started.
Activate Cloud Shell
- From the Cloud console, click Activate Cloud Shell.

If this is your first time starting Cloud Shell, you're presented with an intermediate screen describing what it is. If you see the intermediate screen, click Continue.

It should only take a few moments to provision and connect to Cloud Shell.

This virtual machine is loaded with all the development tools you need. It offers a persistent 5 GB home directory and runs in Google Cloud, greatly enhancing network performance and authentication. Much, if not all, of your work in this codelab can be done with a browser.
Once connected to Cloud Shell, you should see that you are authenticated and that the project is set to your project ID.
- Run the following command in Cloud Shell to confirm that you are authenticated:
gcloud auth list
Command output
Credentialed Accounts
ACTIVE ACCOUNT
* <my_account>@<my_domain.com>
To set the active account, run:
$ gcloud config set account `ACCOUNT`
- Run the following command in Cloud Shell to confirm that the gcloud command knows about your project:
gcloud config list project
Command output
[core]
project = <PROJECT_ID>
If it is not, you can set it with this command:
gcloud config set project <PROJECT_ID>
Command output
Updated property [core/project].
3. Enable APIs and set environment variables
Before you can start using this codelab, you need to enable several APIs. You can enable the required APIs by running the following command:
gcloud services enable run.googleapis.com \
storage.googleapis.com \
cloudbuild.googleapis.com \
videointelligence.googleapis.com \
aiplatform.googleapis.com
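Optionally, you can confirm the APIs were enabled by listing the enabled services and filtering for the ones above:
gcloud services list --enabled | grep -E 'run|videointelligence|aiplatform|storage|cloudbuild'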
Next, set the environment variables that will be used throughout this codelab:
REGION=<YOUR-REGION>
PROJECT_ID=<YOUR-PROJECT-ID>
PROJECT_NUMBER=$(gcloud projects describe $PROJECT_ID --format='value(projectNumber)')
SERVICE_NAME=video-describer
export BUCKET_ID=$PROJECT_ID-video-describer
4. Create a Cloud Storage bucket
Create a Cloud Storage bucket where you can upload videos for processing by the Cloud Run service, using the following command:
gsutil mb -l us-central1 gs://$BUCKET_ID/
[Optional] You can use this sample video by downloading it locally:
gsutil cp gs://cloud-samples-data/video/visionapi.mp4 testvideo.mp4
Now, upload your video file to your bucket:
FILENAME=<YOUR-VIDEO-FILENAME>
gsutil cp $FILENAME gs://$BUCKET_ID
5. Create the Node.js application
First, create a directory for the source code and cd into it:
mkdir video-describer && cd $_
Then, create a package.json file with the following content:
{
  "name": "video-describer",
  "version": "1.0.0",
  "private": true,
  "description": "describes the image in every scene for a given video",
  "main": "index.js",
  "author": "Google LLC",
  "license": "Apache-2.0",
  "scripts": {
    "start": "node index.js"
  },
  "dependencies": {
    "@google-cloud/storage": "^7.7.0",
    "@google-cloud/video-intelligence": "^5.0.1",
    "axios": "^1.6.2",
    "express": "^4.18.2",
    "fluent-ffmpeg": "^2.1.2",
    "google-auth-library": "^9.4.1"
  }
}
For readability, the app consists of several source files. First, create an index.js source file with the content below. This file contains the entry point for the service and the main logic of the app:
const { captureImages } = require('./imageCapture.js');
const { detectSceneChanges } = require('./sceneDetector.js');
const transcribeScene = require('./imageDescriber.js');
const { Storage } = require('@google-cloud/storage');
const fs = require('fs').promises;
const path = require('path');
const express = require('express');
const app = express();

const bucketName = process.env.BUCKET_ID;
const port = parseInt(process.env.PORT) || 8080;

app.listen(port, () => {
  console.log(`video describer service ready: listening on port ${port}`);
});

// entry point for the service
app.get('/', async (req, res) => {
  try {
    // download the requested video from Cloud Storage
    let videoFilename = req.query.filename;
    console.log("processing file: " + videoFilename);

    // download the file locally to the Cloud Run instance
    let localFilename = await downloadVideoFile(videoFilename);

    // detect all the scenes in the video & save timestamps to an array
    let timestamps = await detectSceneChanges(localFilename);
    console.log("Detected scene changes at the following timestamps: ", timestamps);

    // create an image of each scene change
    // and save to a local directory called "output"
    await captureImages(localFilename, timestamps);

    // get an access token for the service account to call the Google APIs
    let accessToken = await transcribeScene.getAccessToken();
    console.log("got an access token");

    let imageBaseName = path.parse(localFilename).name;

    // the data structure for storing the scene description and timestamp
    // e.g. an array of json objects {timestamp: 1, description: "..."}, etc.
    let scenes = [];

    // for each timestamp, send the image to Vertex AI
    console.log("getting Vertex AI descriptions for all the timestamps");
    scenes = await Promise.all(
      timestamps.map(async (timestamp) => {
        let filepath = path.join("./output", imageBaseName + "-" + timestamp + ".png");

        // get the base64 encoded image
        const encodedFile = await fs.readFile(filepath, 'base64');

        // send each screenshot to Vertex AI for a description
        let description = await transcribeScene.transcribeScene(accessToken, encodedFile);
        return { timestamp: timestamp, description: description };
      }));

    console.log("finished collecting all the scenes");
    //console.log(scenes);
    return res.json(scenes);
  } catch (error) {
    // return an error
    console.log("received error: ", error);
    return res.status(500).json("an internal error occurred");
  }
});

async function downloadVideoFile(videoFilename) {
  // creates a client
  const storage = new Storage();

  // keep the same name locally
  let localFilename = videoFilename;
  const options = {
    destination: localFilename
  };

  // download the file
  await storage.bucket(bucketName).file(videoFilename).download(options);
  console.log(
    `gs://${bucketName}/${videoFilename} downloaded locally to ${localFilename}.`
  );
  return localFilename;
}
Next, create a sceneDetector.js file with the following content. This file uses the Video Intelligence API to detect when scenes change in the video:
const fs = require('fs');
const util = require('util');
const readFile = util.promisify(fs.readFile);
const ffmpeg = require('fluent-ffmpeg');

const Video = require('@google-cloud/video-intelligence');
const client = new Video.VideoIntelligenceServiceClient();

module.exports = {
  detectSceneChanges: async function (downloadedFile) {
    // reads the local video file and converts it to base64
    const file = await readFile(downloadedFile);
    const inputContent = file.toString('base64');

    // setup the request for shot change detection
    const request = {
      inputContent: inputContent,
      features: ['SHOT_CHANGE_DETECTION'],
    };

    // detects camera shot changes
    const [operation] = await client.annotateVideo(request);
    console.log('Shot (scene) detection in progress...');
    const [operationResult] = await operation.promise();

    // gets shot changes
    const shotChanges = operationResult.annotationResults[0].shotAnnotations;
    console.log("Shot (scene) changes detected: " + shotChanges.length);

    // data structure to be returned
    let sceneChanges = [];

    // for the initial scene
    sceneChanges.push(1);

    // if there's only one scene, keep it at 1 second
    if (shotChanges.length === 1) {
      return sceneChanges;
    }

    // get the length of the video
    const videoLength = await getVideoLength(downloadedFile);

    shotChanges.forEach((shot, shotIndex) => {
      if (shot.endTimeOffset === undefined) {
        shot.endTimeOffset = {};
      }
      if (shot.endTimeOffset.seconds === undefined) {
        shot.endTimeOffset.seconds = 0;
      }
      if (shot.endTimeOffset.nanos === undefined) {
        shot.endTimeOffset.nanos = 0;
      }

      // convert to a number
      let currentTimestampSecond = Number(shot.endTimeOffset.seconds);
      let sceneChangeTime = 0;

      // double-check that no scenes were detected within the last second
      if (currentTimestampSecond + 1 > videoLength) {
        sceneChangeTime = currentTimestampSecond;
      } else {
        // otherwise, for simplicity, just round up to the next second
        sceneChangeTime = currentTimestampSecond + 1;
      }

      sceneChanges.push(sceneChangeTime);
    });

    return sceneChanges;
  }
}

async function getVideoLength(localFile) {
  let getLength = util.promisify(ffmpeg.ffprobe);
  let length = await getLength(localFile);
  console.log("video length: ", length.format.duration);
  return length.format.duration;
}
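For reference, each entry in shotAnnotations is a segment whose startTimeOffset and endTimeOffset may omit any field whose value is zero, which is why detectSceneChanges fills in defaults before reading them. An illustrative (assumed, not verbatim) annotation might look like this:
// illustrative shape only; real values depend on the video. The seconds field
// may not arrive as a plain JavaScript number, hence the Number() conversion above.
const exampleShot = {
  startTimeOffset: { seconds: '7', nanos: 440000000 },
  endTimeOffset: { seconds: '11', nanos: 280000000 }
};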
Now create a file called imageCapture.js with the following content. This file uses the node package fluent-ffmpeg to run ffmpeg commands from the node app:
const ffmpeg = require('fluent-ffmpeg');
const path = require('path');
const util = require('util');

module.exports = {
  captureImages: async function (localFile, scenes) {
    let imageBaseName = path.parse(localFile).name;

    try {
      for (const scene of scenes) {
        console.log("creating screenshot for scene: " + scene);
        await createScreenshot(localFile, imageBaseName, scene);
      }
    } catch (error) {
      console.log("error gathering screenshots: ", error);
    }

    console.log("finished gathering the screenshots");
  }
}

async function createScreenshot(localFile, imageBaseName, scene) {
  return new Promise((resolve, reject) => {
    ffmpeg(localFile)
      .screenshots({
        timestamps: [scene],
        filename: `${imageBaseName}-${scene}.png`,
        folder: 'output',
        size: '320x240'
      }).on("error", () => {
        console.log("Failed to create scene for timestamp: " + scene);
        return reject('Failed to create scene for timestamp: ' + scene);
      })
      .on("end", () => {
        return resolve();
      });
  })
}
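If you'd like to sanity-check this module outside of Cloud Run, a hypothetical local test could look like the following; it assumes ffmpeg is installed on your machine and that testvideo.mp4 sits in the current directory.
// hypothetical local test; the filename and timestamps are placeholder values
const { captureImages } = require('./imageCapture.js');

captureImages('testvideo.mp4', [1, 7, 11])
  .then(() => console.log('screenshots written to ./output'));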
Finally, create a file called `imageDescriber.js` with the following content. This file uses Vertex AI to get a visual description of each scene image:
const axios = require("axios");
const { GoogleAuth } = require('google-auth-library');

const auth = new GoogleAuth({
  scopes: 'https://www.googleapis.com/auth/cloud-platform'
});

module.exports = {
  getAccessToken: async function () {
    return await auth.getAccessToken();
  },

  transcribeScene: async function (token, encodedFile) {
    let projectId = await auth.getProjectId();

    let config = {
      headers: {
        'Authorization': 'Bearer ' + token,
        'Content-Type': 'application/json; charset=utf-8'
      }
    };

    const json = {
      "instances": [
        {
          "image": {
            "bytesBase64Encoded": encodedFile
          }
        }
      ],
      "parameters": {
        "sampleCount": 1,
        "language": "en"
      }
    };

    let response = await axios.post('https://us-central1-aiplatform.googleapis.com/v1/projects/' + projectId + '/locations/us-central1/publishers/google/models/imagetext:predict', json, config);

    return response.data.predictions[0];
  }
}
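If you want to sanity-check access to the imagetext model before deploying, you could call the same endpoint directly with curl. The sketch below makes the same request as transcribeScene; scene.png is a placeholder image, and your gcloud account needs permission to call Vertex AI.
# encode a local screenshot and send it to the imagetext model
BASE64_IMG=$(base64 -w0 scene.png)
curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  "https://us-central1-aiplatform.googleapis.com/v1/projects/$PROJECT_ID/locations/us-central1/publishers/google/models/imagetext:predict" \
  -d '{"instances":[{"image":{"bytesBase64Encoded":"'"$BASE64_IMG"'"}}],"parameters":{"sampleCount":1,"language":"en"}}'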
Create the Dockerfile and .dockerignore files
Because this service uses ffmpeg, you need to create a Dockerfile that installs ffmpeg.
Create a file called Dockerfile with the following content:
# Copyright 2020 Google, LLC.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Use the official lightweight Node.js image.
# https://hub.docker.com/_/node
FROM node:20.10.0-slim

# Create and change to the app directory.
WORKDIR /usr/src/app

# Install ffmpeg so the service can capture screenshots from videos.
RUN apt-get update && apt-get install -y ffmpeg

# Copy application dependency manifests to the container image.
# A wildcard is used to ensure both package.json AND package-lock.json are copied.
# Copying this separately prevents re-running npm install on every code change.
COPY package*.json ./

# Install dependencies.
# If you add a package-lock.json, speed your build by switching to 'npm ci'.
# RUN npm ci --only=production
RUN npm install --production

# Copy local code to the container image.
COPY . .

# Run the web service on container startup.
CMD [ "npm", "start" ]
And create a .dockerignore file to keep certain files out of the container image:
Dockerfile
.dockerignore
node_modules
npm-debug.log
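Optionally, to verify that the image builds (and that ffmpeg installs cleanly) before deploying, you can build it locally in Cloud Shell:
docker build -t video-describer .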
6. Create a service account
Create a service account that your Cloud Run service will use to access Cloud Storage, Vertex AI, and the Video Intelligence API:
SERVICE_ACCOUNT="cloud-run-video-description"
SERVICE_ACCOUNT_ADDRESS=$SERVICE_ACCOUNT@$PROJECT_ID.iam.gserviceaccount.com

gcloud iam service-accounts create $SERVICE_ACCOUNT \
  --display-name="Cloud Run Video Scene Image Describer service account"

# to view & download storage bucket objects
gcloud projects add-iam-policy-binding $PROJECT_ID \
  --member serviceAccount:$SERVICE_ACCOUNT_ADDRESS \
  --role=roles/storage.objectViewer

# to call the Vertex AI imagetext model
gcloud projects add-iam-policy-binding $PROJECT_ID \
  --member serviceAccount:$SERVICE_ACCOUNT_ADDRESS \
  --role=roles/aiplatform.user
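Optionally, you can verify which roles were granted to the new service account:
gcloud projects get-iam-policy $PROJECT_ID \
  --flatten="bindings[].members" \
  --filter="bindings.members:serviceAccount:$SERVICE_ACCOUNT_ADDRESS" \
  --format="value(bindings.role)"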
7. Deploy the Cloud Run service
Now you can use a source-based deployment to automatically containerize your Cloud Run service:
Note: The default request timeout for Cloud Run services is 5 minutes. This codelab sets the timeout explicitly to 5 minutes because the suggested test video is about 2 minutes long. If you use a longer video, you may need to increase the timeout.
gcloud run deploy $SERVICE_NAME \
  --region=$REGION \
  --set-env-vars BUCKET_ID=$BUCKET_ID \
  --no-allow-unauthenticated \
  --service-account $SERVICE_ACCOUNT_ADDRESS \
  --timeout=5m \
  --source=.
Once deployed, save the service URL in an environment variable:
SERVICE_URL=$(gcloud run services describe $SERVICE_NAME --platform managed --region $REGION --format 'value(status.url)')
8. Call the Cloud Run service
Now you can call your service by providing the name of the video you uploaded to Cloud Storage:
curl -X GET -H "Authorization: Bearer $(gcloud auth print-identity-token)" "${SERVICE_URL}?filename=${FILENAME}"
Your results should look similar to the example output below:
[{"timestamp":1,"description":"an aerial view of a city with a bridge in the background"},{"timestamp":7,"description":"a man in a blue shirt sits in front of shelves of donuts"},{"timestamp":11,"description":"a black and white photo of people working in a bakery"},{"timestamp":12,"description":"a black and white photo of a man and woman working in a bakery"}]
9. Congratulations!
Congratulations for completing the codelab!
We recommend reviewing the documentation on the Video Intelligence API, Cloud Run, and Vertex AI visual captioning.
What we've covered
- How to create a container image using a Dockerfile to install a third-party binary
- How to follow the principle of least privilege by creating a service account for the Cloud Run service to call other Google Cloud services
- How to use the Video Intelligence client library from a Cloud Run service
- How to make a call to Google APIs to get the visual description of each scene from Vertex AI
10. Clean up
To avoid inadvertent charges (for example, if this Cloud Run service is accidentally invoked more times than your monthly Cloud Run invocation allocation in the free tier), you can either delete the Cloud Run service or delete the project you created in step 2.
To delete the Cloud Run service, go to the Cloud Run console at https://console.cloud.google.com/run/ and delete the video-describer service (or $SERVICE_NAME, if you used a different name).
If you choose to delete the entire project, you can go to https://console.cloud.google.com/cloud-resource-manager, select the project you created in step 2, and choose Delete. If you delete the project, you'll need to change projects in your Cloud SDK. You can view the list of all available projects by running gcloud projects list.
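If you prefer the command line, equivalent cleanup commands look like this:
# delete just the Cloud Run service...
gcloud run services delete $SERVICE_NAME --region=$REGION

# ...or delete the entire project (this cannot be undone)
gcloud projects delete $PROJECT_ID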