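Create a custom object-detection web app with MediaPipe

1. Before you begin

MediaPipe Solutions lets you apply machine-learning (ML) solutions to your apps. It provides a framework for configuring prebuilt processing pipelines that deliver immediate, engaging, and useful output to users. You can also customize these solutions with Model Maker to update the default models.

Object detection is one of several ML vision tasks that MediaPipe Solutions offers. MediaPipe Tasks is available for Android, Python, and the web.

In this codelab, you add object detection to a web app so that it detects dogs in images and in a live webcam video.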

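What you'll learn

How to add an object-detection task to a web app with MediaPipe Tasks.

What you'll build

A web app that detects the presence of dogs. You can also use MediaPipe Model Maker to customize the model so that it detects a class of objects of your choice.

What you'll need

A CodePen account
A device with a web browser
Basic knowledge of JavaScript, CSS, and HTML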

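2. Get set up

This codelab runs your code in CodePen, a social development environment that lets you write code in the browser and check the results as you build.

To get set up, follow these steps:

In your CodePen account, navigate to this CodePen. You use this code as a starting point to create your own object detector.
In the navigation menu at the bottom of CodePen, click Fork to make a copy of the starter code.

The navigation menu in CodePen, which contains the Fork button

In the JS tab, click the expander arrow and then select Maximize JavaScript editor. You only edit the JS tab in this codelab, so you don't need to see the HTML or CSS tabs.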

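Review the starter app

In the preview pane, notice that there are two images of dogs and an option to start your webcam. The model used in this tutorial was trained on the three dogs shown in the two images.

A preview of the web app from the starter code

In the JS tab, notice that there are several comments throughout the code. For example, you can find the following comment on line 15: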

// Import the required package.

These comments indicate where you need to insert code snippets.

3. Import the MediaPipe tasks-vision package and add the required variables

In the JS tab, import the MediaPipe tasks-vision package:

// Import the required package.
import { ObjectDetector, FilesetResolver } from "https://cdn.skypack.dev/@mediapipe/tasks-vision@latest";

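This code uses the Skypack content delivery network (CDN) to import the package. For more information about how to use Skypack with CodePen, see Skypack + CodePen.

In your own projects, you can use Node.js with npm, or the package manager or CDN of your choice. For more information about the package that you need to install, see JavaScript packages.

Declare variables for the object detector and the running mode: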

// Create required variables.
let objectDetector = null;
let runningMode = "IMAGE";

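The runningMode variable is a string. It is set to "IMAGE" when you detect objects in images and to "VIDEO" when you detect objects in video.

4. Initialize the object detector

To initialize the object detector, add the following code after the relevant comment in the JS tab: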

// Initialize the object detector.
async function initializeObjectDetector() {
  const visionFilesetResolver = await FilesetResolver.forVisionTasks(
    "https://cdn.jsdelivr.net/npm/@mediapipe/tasks-vision@latest/wasm"
  );
  objectDetector = await ObjectDetector.createFromOptions(visionFilesetResolver, {
    baseOptions: {
      modelAssetPath: "https://storage.googleapis.com/mediapipe-assets/dogs.tflite"
    },
    scoreThreshold: 0.3,
    runningMode: runningMode
  });
}
initializeObjectDetector();

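The FilesetResolver.forVisionTasks() method specifies the location of the WebAssembly (Wasm) binaries that the task needs.

The ObjectDetector.createFromOptions() method instantiates the object detector. You must provide a path to the model used for detection. In this case, the dog-detection model is hosted on Cloud Storage.

The scoreThreshold property is set to 0.3. This means that the model returns only detections with a confidence score of 30% or higher. You can adjust this threshold to suit the needs of your app.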
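
To make the effect of scoreThreshold concrete, here is a minimal sketch that applies the same kind of cutoff by hand. The filterByScore() helper and the sample scores are made up for illustration and are not part of MediaPipe; the detector applies its own threshold internally.

```javascript
// Hypothetical helper (not a MediaPipe API): keep only detections whose
// top category score is at least minScore.
function filterByScore(detections, minScore) {
  return detections.filter(
    (d) => d.categories.length > 0 && d.categories[0].score >= minScore
  );
}

// Illustrative data only; the second score is invented to fall below 0.3.
const sampleDetections = [
  { categories: [{ categoryName: "aci", score: 0.73828 }] },
  { categories: [{ categoryName: "tikka", score: 0.25 }] }
];

const keptNames = filterByScore(sampleDetections, 0.3).map(
  (d) => d.categories[0].categoryName
);
console.log(keptNames);
```

Raising the threshold trades missed detections for fewer false positives, so the right value depends on the app.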

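The runningMode property is set when the ObjectDetector object is initialized. You can change this and other options later with the setOptions() method, as needed.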

5. Run predictions on images

To run predictions on images, navigate to the handleClick() function and then add the following code to the function's body:

// Verify object detector is initialized and choose the correct running mode.
  if (!objectDetector) {
    alert("Object Detector still loading. Please try again");
    return;
  }

  if (runningMode === "VIDEO") {
    runningMode = "IMAGE";
    await objectDetector.setOptions({ runningMode: runningMode });
  }

This code checks that the object detector is initialized and makes sure that the running mode is set to "IMAGE".
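
The same mode check appears in both the image path and the webcam path, so it could be factored into one function. The following is only a sketch: ensureRunningMode() is a hypothetical helper name, and the only thing it assumes about the detector is the setOptions() call that the codelab already uses.

```javascript
// Hypothetical helper: call setOptions() only when the mode actually changes,
// then return the mode that is now active.
async function ensureRunningMode(detector, currentMode, wantedMode) {
  if (currentMode !== wantedMode) {
    await detector.setOptions({ runningMode: wantedMode });
  }
  return wantedMode;
}

// Possible use inside handleClick(), assuming the codelab's variables:
//   runningMode = await ensureRunningMode(objectDetector, runningMode, "IMAGE");
```

Skipping the call when the mode is unchanged avoids reconfiguring the detector on every click or frame.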

Detect objects

To detect objects in images, add the following code to the handleClick() function's body:

// Run object detection.
  const detections = objectDetector.detect(event.target);

The following snippet shows an example of the output data from this task:

ObjectDetectionResult:
 Detection #0:
  Box: (x: 355, y: 133, w: 190, h: 206)
  Categories:
   index       : 17
   score       : 0.73828
   class name  : aci
 Detection #1:
  Box: (x: 103, y: 15, w: 138, h: 369)
  Categories:
   index       : 17
   score       : 0.73047
   class name  : tikka
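
In JavaScript, the code in this codelab reads the result through result.detections, detection.boundingBox, and detection.categories. The sketch below rewrites the sample output in that shape, using only the fields that this codelab accesses, and builds the same label text that the display code produces. describeDetection() is a hypothetical helper name.

```javascript
// The sample output above, expressed with the field names used in this codelab.
const sampleResult = {
  detections: [
    {
      boundingBox: { originX: 355, originY: 133, width: 190, height: 206 },
      categories: [{ index: 17, score: 0.73828, categoryName: "aci" }]
    },
    {
      boundingBox: { originX: 103, originY: 15, width: 138, height: 369 },
      categories: [{ index: 17, score: 0.73047, categoryName: "tikka" }]
    }
  ]
};

// Hypothetical helper: build the label text for the top category.
function describeDetection(detection) {
  const top = detection.categories[0];
  return top.categoryName + " - with " + Math.round(top.score * 100) + "% confidence.";
}

const labels = sampleResult.detections.map(describeDetection);
console.log(labels.join("\n"));
```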

Process and display predictions

At the end of the handleClick() function's body, call the displayImageDetections() function:

// Call the displayImageDetections() function.
displayImageDetections(detections, event.target);

In the displayImageDetections() function's body, add the following code to display the object-detection results:

// Display object detection results.
  
  const ratio = resultElement.height / resultElement.naturalHeight;

  for (const detection of result.detections) {
    // Description text
    const p = document.createElement("p");
    p.setAttribute("class", "info");
    p.innerText =
      detection.categories[0].categoryName +
      " - with " +
      Math.round(parseFloat(detection.categories[0].score) * 100) +
      "% confidence.";
    // Positioned at the top-left of the bounding box.
    // Height is that of the text.
    // Width subtracts text padding in CSS so that it fits perfectly.
    p.style =
      "left: " +
      detection.boundingBox.originX * ratio +
      "px;" +
      "top: " +
      detection.boundingBox.originY * ratio +
      "px; " +
      "width: " +
      (detection.boundingBox.width * ratio - 10) +
      "px;";
    const highlighter = document.createElement("div");
    highlighter.setAttribute("class", "highlighter");
    highlighter.style =
      "left: " +
      detection.boundingBox.originX * ratio +
      "px;" +
      "top: " +
      detection.boundingBox.originY * ratio +
      "px;" +
      "width: " +
      detection.boundingBox.width * ratio +
      "px;" +
      "height: " +
      detection.boundingBox.height * ratio +
      "px;";

    resultElement.parentNode.appendChild(highlighter);
    resultElement.parentNode.appendChild(p);
  }

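This function displays bounding boxes over the objects detected in the image. It removes any previous highlighting, and then creates and displays a <p> tag with the label and a <div> highlighter for each detected object.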
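
The ratio in the snippet converts coordinates from the image's natural size to its displayed size. Here is a small sketch of that arithmetic, with hypothetical helper names (scaleBox, toCssText) and an assumed image that is 800 pixels tall but displayed at 400 pixels:

```javascript
// Hypothetical helper: scale a bounding box from natural-image pixels
// to on-screen pixels.
function scaleBox(boundingBox, ratio) {
  return {
    left: boundingBox.originX * ratio,
    top: boundingBox.originY * ratio,
    width: boundingBox.width * ratio,
    height: boundingBox.height * ratio
  };
}

// Hypothetical helper: turn a scaled box into inline CSS text.
function toCssText(box) {
  return (
    "left: " + box.left + "px; " +
    "top: " + box.top + "px; " +
    "width: " + box.width + "px; " +
    "height: " + box.height + "px;"
  );
}

// Assumed sizes: naturalHeight = 800, displayed height = 400.
const displayRatio = 400 / 800;
const scaledBox = scaleBox(
  { originX: 355, originY: 133, width: 190, height: 206 },
  displayRatio
);
console.log(toCssText(scaledBox));
```

Because only the height ratio is used, this approach assumes that the image keeps its aspect ratio when it is displayed.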

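Test the app

When you change your code in CodePen, the preview pane refreshes automatically on save. If autosave is enabled, your app has likely refreshed already, but it's a good idea to refresh it again.

To test the app, follow these steps:

In the preview pane, click each image to view the predictions. A bounding box shows the dog's name and the model's confidence level.
If no bounding box appears, open Chrome DevTools and check the Console panel for errors, or review the previous steps to make sure that you didn't miss anything.

A preview of the web app with bounding boxes over the dogs detected in the images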

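6. Run predictions on a live webcam video

Detect objects

To detect objects in a live webcam video, navigate to the predictWebcam() function and then add the following code to the function's body: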

// Run video object detection.
  // If the detector is still in image mode, switch it to the video running mode.
  if (runningMode === "IMAGE") {
    runningMode = "VIDEO";
    await objectDetector.setOptions({ runningMode: runningMode });
  }
  const nowInMs = performance.now();

  // Detect objects with the detectForVideo() method.
  const result = await objectDetector.detectForVideo(video, nowInMs);

  // Pass the whole result; displayVideoDetections() reads result.detections.
  displayVideoDetections(result);

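Object detection for video uses the same method whether you run inference on streaming data or on a complete video. The detectForVideo() method is similar to the detect() method used for images, but it takes an additional parameter: the timestamp associated with the current frame. Because this function runs detection live, you pass the current time as the timestamp.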
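
MediaPipe's video running mode expects the timestamps passed to detectForVideo() to increase from one call to the next. performance.now() normally satisfies this, but if timestamps come from another source, a small guard can skip frames that would repeat or go backward. This is a sketch only, and createTimestampGate() is a hypothetical helper, not a MediaPipe API.

```javascript
// Hypothetical helper: returns a function that accepts a timestamp only if
// it is strictly greater than the last accepted one.
function createTimestampGate() {
  let lastAccepted = -Infinity;
  return function acceptTimestamp(timestampMs) {
    if (timestampMs <= lastAccepted) {
      return false; // Repeated or out-of-order frame: skip it.
    }
    lastAccepted = timestampMs;
    return true;
  };
}

// Possible use inside predictWebcam(), assuming the codelab's variables:
//   if (acceptTimestamp(nowInMs)) {
//     const result = objectDetector.detectForVideo(video, nowInMs);
//   }
```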

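Process and display predictions

To process and display the detection results, navigate to the displayVideoDetections() function and then add the following code to the function's body: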

// Display video object detection results.
  for (let child of children) {
    liveView.removeChild(child);
  }
  children.splice(0);

  // Iterate through predictions and draw them to the live view.
  for (const detection of result.detections) {
    const p = document.createElement("p");
    p.innerText =
      detection.categories[0].categoryName +
      " - with " +
      Math.round(parseFloat(detection.categories[0].score) * 100) +
      "% confidence.";
    p.style =
      "left: " +
      (video.offsetWidth -
        detection.boundingBox.width -
        detection.boundingBox.originX) +
      "px;" +
      "top: " +
      detection.boundingBox.originY +
      "px; " +
      "width: " +
      (detection.boundingBox.width - 10) +
      "px;";

    const highlighter = document.createElement("div");
    highlighter.setAttribute("class", "highlighter");
    highlighter.style =
      "left: " +
      (video.offsetWidth -
        detection.boundingBox.width -
        detection.boundingBox.originX) +
      "px;" +
      "top: " +
      detection.boundingBox.originY +
      "px;" +
      "width: " +
      (detection.boundingBox.width - 10) +
      "px;" +
      "height: " +
      detection.boundingBox.height +
      "px;";

    liveView.appendChild(highlighter);
    liveView.appendChild(p);

    // Store drawn objects in memory so that they're queued to delete at next call.
    children.push(highlighter);
    children.push(p);
  }

This code removes any previous highlighting, and then creates and displays a <p> label and a <div> highlighter for each detected object.
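
The left position in this snippet is computed as video.offsetWidth - width - originX, which flips each box horizontally. That suggests the starter app shows the webcam feed mirrored, although this is an assumption about the starter CSS. A sketch of the flip with a hypothetical helper named mirroredLeft() and made-up sizes:

```javascript
// Hypothetical helper: the left offset of a box after flipping it
// horizontally inside a container of the given width.
function mirroredLeft(containerWidth, boundingBox) {
  return containerWidth - boundingBox.width - boundingBox.originX;
}

// Made-up numbers: a 640px-wide view and a 200px-wide box at x = 100.
const flippedLeft = mirroredLeft(640, { originX: 100, width: 200 });
console.log(flippedLeft);

// Flipping the flipped box restores the original offset.
const restoredLeft = mirroredLeft(640, { originX: flippedLeft, width: 200 });
console.log(restoredLeft);
```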

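Test the app

To test live object detection, it helps to have a picture of one of the dogs that the model was trained on.

To test the app, follow these steps:

Download one of the dog photos to your phone.
In the preview pane, click Enable webcam.
If your browser shows a dialog that asks for access to the webcam, grant permission.
Hold the dog photo on your phone in front of your webcam. A bounding box shows the dog's name and the model's confidence level.
If no bounding box appears, open Chrome DevTools and check the Console panel for errors, or review the previous steps to make sure that you didn't miss anything.

A bounding box over an image of a dog held up to the live webcam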

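7. Congratulations

Congratulations! You built a web app that detects objects in images and in a live webcam video. To learn more, see the completed version of the app on CodePen.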
