1. Introduction
Overview
In this codelab, you'll create a Cloud Run service written in Node.js that provides a visual description of every scene in a video. First, the service uses the Video Intelligence API to detect the timestamps of each scene change. Next, it uses a third-party binary called ffmpeg to capture a screenshot at each scene-change timestamp. Finally, Vertex AI visual captioning is used to provide a visual description of each screenshot.
This codelab also demonstrates how to use ffmpeg within your Cloud Run service to capture images from a video at a given timestamp. Since ffmpeg needs to be installed separately, this codelab shows you how to create a Dockerfile that installs ffmpeg as part of your Cloud Run service.
The following diagram illustrates how the Cloud Run service works:
What you'll learn
- How to create a container image using a Dockerfile to install a third-party binary
- How to follow the principle of least privilege by creating a service account for the Cloud Run service to use when calling other Google Cloud services
- How to use the Video Intelligence client library from a Cloud Run service
- How to make a call to Google APIs to get the visual description of each scene from Vertex AI
2. Setup and requirements
Prerequisites
- You are signed in to the Cloud console.
- You have previously deployed a Cloud Run service. For example, you can follow Quickstart: Deploy a web service to get started.
Activate Cloud Shell
- In the Cloud console, click Activate Cloud Shell.
If this is your first time starting Cloud Shell, you're presented with an intermediate screen describing what it is. If you see the intermediate screen, click Continue.
It should only take a few moments to provision and connect to Cloud Shell.
This virtual machine is loaded with all the development tools needed. It offers a persistent 5 GB home directory and runs in Google Cloud, greatly enhancing network performance and authentication. Much, if not all, of your work in this codelab can be done with a browser.
Once connected to Cloud Shell, you should see that you are authenticated and that the project is set to your project ID.
- Run the following command in Cloud Shell to confirm that you are authenticated:
gcloud auth list
Command output

          Credentialed Accounts
ACTIVE  ACCOUNT
*       <my_account>@<my_domain.com>

To set the active account, run:
    $ gcloud config set account `ACCOUNT`
- Run the following command in Cloud Shell to confirm that the gcloud command knows about your project:
gcloud config list project
Command output

[core]
project = <PROJECT_ID>
If it is not, you can set it with this command:
gcloud config set project <PROJECT_ID>
Command output
Updated property [core/project].
3. Enable APIs and set environment variables
Before you can start using this codelab, there are several APIs you will need to enable. This codelab requires the following APIs. You can enable them by running the following command:
gcloud services enable run.googleapis.com \
  storage.googleapis.com \
  cloudbuild.googleapis.com \
  videointelligence.googleapis.com \
  aiplatform.googleapis.com
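If you'd like to confirm that the APIs were enabled before continuing, you can list them. This is an optional check, not part of the codelab's required steps:

# optional: the services enabled above should appear in this listing
gcloud services list --enabled \
  --filter="config.name:(run.googleapis.com OR videointelligence.googleapis.com OR aiplatform.googleapis.com)"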
Then you can set environment variables that will be used throughout this codelab.
REGION=<YOUR-REGION>
PROJECT_ID=<YOUR-PROJECT-ID>
PROJECT_NUMBER=$(gcloud projects describe $PROJECT_ID --format='value(projectNumber)')
SERVICE_NAME=video-describer
export BUCKET_ID=$PROJECT_ID-video-describer
4. Create a Cloud Storage bucket
Create a Cloud Storage bucket where you can upload videos for processing by the Cloud Run service, using the following command:
gsutil mb -l us-central1 gs://$BUCKET_ID/
[Optional] You can use this sample video by downloading a copy of it locally:
gsutil cp gs://cloud-samples-data/video/visionapi.mp4 testvideo.mp4
Now, upload your video file to your bucket.
FILENAME=<YOUR-VIDEO-FILENAME>
gsutil cp $FILENAME gs://$BUCKET_ID
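To confirm that the upload succeeded, you can list the contents of the bucket (an optional check):

# optional: the video file should appear in the bucket listing
gsutil ls gs://$BUCKET_ID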
5. Create the Node.js application
First, create a directory for the source code and cd into that directory.
mkdir video-describer && cd $_
Then, create a package.json file with the following content:
{ "name": "video-describer", "version": "1.0.0", "private": true, "description": "describes the image in every scene for a given video", "main": "index.js", "author": "Google LLC", "license": "Apache-2.0", "scripts": { "start": "node index.js" }, "dependencies": { "@google-cloud/storage": "^7.7.0", "@google-cloud/video-intelligence": "^5.0.1", "axios": "^1.6.2", "express": "^4.18.2", "fluent-ffmpeg": "^2.1.2", "google-auth-library": "^9.4.1" } }
To improve readability, this app is split across several source files. First, create an index.js source file with the content below. This file contains the entry point for the service and the main logic for the app.
const { captureImages } = require('./imageCapture.js');
const { detectSceneChanges } = require('./sceneDetector.js');
const transcribeScene = require('./imageDescriber.js');

const { Storage } = require('@google-cloud/storage');
const fs = require('fs').promises;
const path = require('path');

const express = require('express');
const app = express();

const bucketName = process.env.BUCKET_ID;

const port = parseInt(process.env.PORT) || 8080;
app.listen(port, () => {
  console.log(`video describer service ready: listening on port ${port}`);
});

// entry point for the service
app.get('/', async (req, res) => {
  try {
    // download the requested video from Cloud Storage
    let videoFilename = req.query.filename;
    console.log("processing file: " + videoFilename);

    // download the file locally to the Cloud Run instance
    let localFilename = await downloadVideoFile(videoFilename);

    // detect all the scenes in the video & save timestamps to an array
    let timestamps = await detectSceneChanges(localFilename);
    console.log("Detected scene changes at the following timestamps: ", timestamps);

    // create an image of each scene change
    // and save to a local directory called "output"
    await captureImages(localFilename, timestamps);

    // get an access token for the Service Account to call the Google APIs
    let accessToken = await transcribeScene.getAccessToken();
    console.log("got an access token");

    let imageBaseName = path.parse(localFilename).name;

    // the data structure for storing the scene description and timestamp
    // e.g. an array of json objects {timestamp: 1, description: "..."}, etc.
    let scenes = [];

    // for each timestamp, send the image to Vertex AI
    console.log("getting Vertex AI descriptions for all the timestamps");
    scenes = await Promise.all(
      timestamps.map(async (timestamp) => {
        let filepath = path.join("./output", imageBaseName + "-" + timestamp + ".png");

        // get the base64 encoded image
        const encodedFile = await fs.readFile(filepath, 'base64');

        // send each screenshot to Vertex AI for description
        let description = await transcribeScene.transcribeScene(accessToken, encodedFile);

        return { timestamp: timestamp, description: description };
      }));

    console.log("finished collecting all the scenes");
    //console.log(scenes);

    return res.json(scenes);
  } catch (error) {
    // return an error
    console.log("received error: ", error);
    return res.status(500).json("an internal error occurred");
  }
});

async function downloadVideoFile(videoFilename) {
  // Creates a client
  const storage = new Storage();

  // keep same name locally
  let localFilename = videoFilename;

  const options = {
    destination: localFilename
  };

  // Download the file
  await storage.bucket(bucketName).file(videoFilename).download(options);
  console.log(
    `gs://${bucketName}/${videoFilename} downloaded locally to ${localFilename}.`
  );

  return localFilename;
}
Next, create a sceneDetector.js file with the following content. This file uses the Video Intelligence API to detect when scenes change in the video.
const fs = require('fs');
const util = require('util');
const readFile = util.promisify(fs.readFile);
const ffmpeg = require('fluent-ffmpeg');

const Video = require('@google-cloud/video-intelligence');
const client = new Video.VideoIntelligenceServiceClient();

module.exports = {
  detectSceneChanges: async function (downloadedFile) {
    // Reads a local video file and converts it to base64
    const file = await readFile(downloadedFile);
    const inputContent = file.toString('base64');

    // setup request for shot change detection
    const videoContext = {
      speechTranscriptionConfig: {
        languageCode: 'en-US',
        enableAutomaticPunctuation: true,
      },
    };

    const request = {
      inputContent: inputContent,
      features: ['SHOT_CHANGE_DETECTION'],
    };

    // Detects camera shot changes
    const [operation] = await client.annotateVideo(request);
    console.log('Shot (scene) detection in progress...');
    const [operationResult] = await operation.promise();

    // Gets shot changes
    const shotChanges = operationResult.annotationResults[0].shotAnnotations;
    console.log("Shot (scene) changes detected: " + shotChanges.length);

    // data structure to be returned
    let sceneChanges = [];

    // for the initial scene
    sceneChanges.push(1);

    // if only one scene, keep at 1 second
    if (shotChanges.length === 1) {
      return sceneChanges;
    }

    // get length of video
    const videoLength = await getVideoLength(downloadedFile);

    shotChanges.forEach((shot, shotIndex) => {
      if (shot.endTimeOffset === undefined) {
        shot.endTimeOffset = {};
      }
      if (shot.endTimeOffset.seconds === undefined) {
        shot.endTimeOffset.seconds = 0;
      }
      if (shot.endTimeOffset.nanos === undefined) {
        shot.endTimeOffset.nanos = 0;
      }

      // convert to a number
      let currentTimestampSecond = Number(shot.endTimeOffset.seconds);
      let sceneChangeTime = 0;

      // double-check no scenes were detected within the last second
      if (currentTimestampSecond + 1 > videoLength) {
        sceneChangeTime = currentTimestampSecond;
      } else {
        // otherwise, for simplicity, just round up to the next second
        sceneChangeTime = currentTimestampSecond + 1;
      }

      sceneChanges.push(sceneChangeTime);
    });

    return sceneChanges;
  }
}

async function getVideoLength(localFile) {
  let getLength = util.promisify(ffmpeg.ffprobe);
  let length = await getLength(localFile);
  console.log("video length: ", length.format.duration);
  return length.format.duration;
}
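If you want to preview what shot change detection returns before running it through the service, the same API is also exposed through the gcloud CLI. The following optional command (a sketch; it assumes the gcloud ml video command group is available in your gcloud installation) runs shot change detection against the sample video:

# optional: run shot change detection on the sample video from the CLI
gcloud ml video detect-shot-changes gs://cloud-samples-data/video/visionapi.mp4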
Now, create a file called imageCapture.js with the following content. This file uses the node package fluent-ffmpeg to run ffmpeg commands from within the node app.
const ffmpeg = require('fluent-ffmpeg');
const path = require('path');
const util = require('util');

module.exports = {
  captureImages: async function (localFile, scenes) {
    let imageBaseName = path.parse(localFile).name;

    try {
      for (const scene of scenes) {
        console.log("creating screenshot for scene: " + scene);
        await createScreenshot(localFile, imageBaseName, scene);
      }
    } catch (error) {
      console.log("error gathering screenshots: ", error);
    }

    console.log("finished gathering the screenshots");
  }
}

async function createScreenshot(localFile, imageBaseName, scene) {
  return new Promise((resolve, reject) => {
    ffmpeg(localFile)
      .screenshots({
        timestamps: [scene],
        filename: `${imageBaseName}-${scene}.png`,
        folder: 'output',
        size: '320x240'
      })
      .on("error", () => {
        console.log("Failed to create scene for timestamp: " + scene);
        return reject('Failed to create scene for timestamp: ' + scene);
      })
      .on("end", () => {
        return resolve();
      });
  });
}
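fluent-ffmpeg works by shelling out to the ffmpeg binary. As a rough sketch, each createScreenshot call corresponds to an ffmpeg invocation along these lines (the exact arguments fluent-ffmpeg generates may differ, and the 7-second timestamp here is hypothetical):

# capture a single 320x240 frame at the 7-second mark
# (assumes the output directory already exists)
ffmpeg -ss 7 -i testvideo.mp4 -frames:v 1 -s 320x240 output/testvideo-7.png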
Lastly, create a file called imageDescriber.js with the content below. This file uses Vertex AI to get a visual description of each scene image.
const axios = require("axios");
const { GoogleAuth } = require('google-auth-library');
const auth = new GoogleAuth({
  scopes: 'https://www.googleapis.com/auth/cloud-platform'
});

module.exports = {
  getAccessToken: async function () {
    return await auth.getAccessToken();
  },

  transcribeScene: async function (token, encodedFile) {
    let projectId = await auth.getProjectId();

    let config = {
      headers: {
        'Authorization': 'Bearer ' + token,
        'Content-Type': 'application/json; charset=utf-8'
      }
    };

    const json = {
      "instances": [
        {
          "image": {
            "bytesBase64Encoded": encodedFile
          }
        }
      ],
      "parameters": {
        "sampleCount": 1,
        "language": "en"
      }
    };

    let response = await axios.post(
      'https://us-central1-aiplatform.googleapis.com/v1/projects/' + projectId +
      '/locations/us-central1/publishers/google/models/imagetext:predict',
      json,
      config
    );

    return response.data.predictions[0];
  }
}
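If you want to try the imagetext endpoint outside of the service, you can send the same request shape with curl. This is an optional sanity check; scene.png is a placeholder for any local image you have handy, and base64 -w0 assumes the GNU coreutils version available in Cloud Shell:

# optional: send one base64-encoded image to the imagetext model
ACCESS_TOKEN=$(gcloud auth print-access-token)
curl -X POST \
  -H "Authorization: Bearer $ACCESS_TOKEN" \
  -H "Content-Type: application/json; charset=utf-8" \
  "https://us-central1-aiplatform.googleapis.com/v1/projects/$PROJECT_ID/locations/us-central1/publishers/google/models/imagetext:predict" \
  -d '{"instances": [{"image": {"bytesBase64Encoded": "'"$(base64 -w0 scene.png)"'"}}], "parameters": {"sampleCount": 1, "language": "en"}}'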
Create the Dockerfile and .dockerignore files
Since this service uses ffmpeg, you will need to create a Dockerfile that installs ffmpeg.
Create a file called Dockerfile with the following content:
# Copyright 2020 Google, LLC.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#      http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Use the official lightweight Node.js image.
# https://hub.docker.com/_/node
FROM node:20.10.0-slim

# Create and change to the app directory.
WORKDIR /usr/src/app

RUN apt-get update && apt-get install -y ffmpeg

# Copy application dependency manifests to the container image.
# A wildcard is used to ensure both package.json AND package-lock.json are copied.
# Copying this separately prevents re-running npm install on every code change.
COPY package*.json ./

# Install dependencies.
# If you add a package-lock.json, speed your build by switching to 'npm ci'.
# RUN npm ci --only=production
RUN npm install --production

# Copy local code to the container image.
COPY . .

# Run the web service on container startup.
CMD [ "npm", "start" ]
And create a file called .dockerignore to keep certain files out of the container image:
Dockerfile
.dockerignore
node_modules
npm-debug.log
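Before deploying, you can optionally build the image yourself in Cloud Shell and confirm that ffmpeg was installed into it. This is a local smoke test, and the image tag video-describer-test is arbitrary:

# optional: build the image and print the ffmpeg version baked into it
docker build -t video-describer-test .
docker run --rm video-describer-test ffmpeg -version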
6. Create the service account
You'll create a service account for the Cloud Run service to use to access Cloud Storage, Vertex AI, and the Video Intelligence API.
SERVICE_ACCOUNT="cloud-run-video-description"
SERVICE_ACCOUNT_ADDRESS=$SERVICE_ACCOUNT@$PROJECT_ID.iam.gserviceaccount.com

gcloud iam service-accounts create $SERVICE_ACCOUNT \
  --display-name="Cloud Run Video Scene Image Describer service account"

# to view & download storage bucket objects
gcloud projects add-iam-policy-binding $PROJECT_ID \
  --member serviceAccount:$SERVICE_ACCOUNT_ADDRESS \
  --role=roles/storage.objectViewer

# to call the Vertex AI imagetext model
gcloud projects add-iam-policy-binding $PROJECT_ID \
  --member serviceAccount:$SERVICE_ACCOUNT_ADDRESS \
  --role=roles/aiplatform.user
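You can verify that the service account holds exactly these two roles, in keeping with least privilege (an optional check):

# optional: list the roles granted to the new service account
gcloud projects get-iam-policy $PROJECT_ID \
  --flatten="bindings[].members" \
  --filter="bindings.members:serviceAccount:$SERVICE_ACCOUNT_ADDRESS" \
  --format="value(bindings.role)"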
7. Deploy the Cloud Run service
Now you can use a source-based deployment to automatically containerize your Cloud Run service.
Note: The default request timeout for a Cloud Run service is 5 minutes (300 seconds). This codelab sets the timeout explicitly to 5 minutes because the suggested test video is 2 minutes long. If you are using a video with a longer duration, you may need to increase the timeout.
gcloud run deploy $SERVICE_NAME \
  --region=$REGION \
  --set-env-vars BUCKET_ID=$BUCKET_ID \
  --no-allow-unauthenticated \
  --service-account $SERVICE_ACCOUNT_ADDRESS \
  --timeout=5m \
  --source=.
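As mentioned in the note above, if you later need more processing time for longer videos, you can raise the timeout without redeploying the source. For example (a sketch; the 10-minute value is arbitrary):

# optional: increase the request timeout on the deployed service
gcloud run services update $SERVICE_NAME --region=$REGION --timeout=10m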
Once deployed, save the service URL in an environment variable.
SERVICE_URL=$(gcloud run services describe $SERVICE_NAME --platform managed --region $REGION --format 'value(status.url)')
8. Call the Cloud Run service
Now you can call your service by providing the name of the video you uploaded to Cloud Storage.
curl -X GET -H "Authorization: Bearer $(gcloud auth print-identity-token)" ${SERVICE_URL}?filename=${FILENAME}
Your results should look similar to the example output below:
[{"timestamp":1,"description":"an aerial view of a city with a bridge in the background"},{"timestamp":7,"description":"a man in a blue shirt sits in front of shelves of donuts"},{"timestamp":11,"description":"a black and white photo of people working in a bakery"},{"timestamp":12,"description":"a black and white photo of a man and woman working in a bakery"}]
9. Congratulations!
Congratulations for completing the codelab!
We recommend reviewing the documentation on the Video Intelligence API, Cloud Run, and Vertex AI visual captioning.
What we've covered
- How to create a container image using a Dockerfile to install a third-party binary
- How to follow the principle of least privilege by creating a service account for the Cloud Run service to use when calling other Google Cloud services
- How to use the Video Intelligence client library from a Cloud Run service
- How to make a call to Google APIs to get the visual description of each scene from Vertex AI
10. Clean up
To avoid inadvertent charges (for example, if this Cloud Run service is invoked more times than your monthly Cloud Run invocation allocation in the free tier), you can either delete the Cloud Run service or delete the project you created in step 2.
To delete the Cloud Run service, go to the Cloud Run console at https://console.cloud.google.com/run/ and delete the video-describer service (or $SERVICE_NAME if you used a different name).
If you choose to delete the entire project, you can go to https://console.cloud.google.com/cloud-resource-manager, select the project you created in step 2, and choose Delete. If you delete the project, you'll need to change projects in your Cloud SDK. You can view the list of all available projects by running gcloud projects list.