Recognize text and facial features with ML Kit: iOS

1. Introduction

ML Kit is a mobile SDK that brings Google's machine learning expertise to Android and iOS apps in a powerful yet easy-to-use package. Whether you are new to machine learning or experienced with it, you can implement the functionality you need in just a few lines of code. You don't need deep knowledge of neural networks or model optimization to get started.

How does it work?

ML Kit makes it easy to apply ML techniques in your apps by bringing Google's ML technologies, such as Mobile Vision and TensorFlow Lite, together in a single SDK. Whether you need the real-time capabilities of Mobile Vision's on-device models or the flexibility of custom TensorFlow Lite image classification models, ML Kit makes it possible with just a few lines of code.

This codelab walks you through building your own iOS app that automatically detects text and facial features in an image.

What you will build

In this codelab, you will build an iOS app with ML Kit. Your app will:

Use the ML Kit Text Recognition API to detect text in images
Use the ML Kit Face Detection API to identify facial features in images

[Images: a photo of Grace Hopper demonstrating the ML Kit Face Detection API; a photo of a sign on grass demonstrating the Text Recognition API]

What you'll learn

How to use the ML Kit SDK to add advanced machine learning capabilities, such as text recognition and facial feature detection, to any iOS app

What you'll need

A recent version of Xcode (v12.4 or later)
An iOS Simulator or a physical iOS device running iOS 10.0 or later
ML Kit supports only these two 64-bit architectures: x86_64 and arm64
The sample code
Basic knowledge of iOS development in Swift
Basic understanding of machine learning models

This codelab focuses on ML Kit. Concepts and code blocks that are not relevant are glossed over and provided for you to copy and paste.

2. Getting set up

Download the code

Click the following link to download all the code for this codelab:

Unpack the downloaded zip file. This creates a root folder (mlkit-ios-codelab) with all of the resources you need. For this codelab, you only need the resources in the vision subdirectory.

The vision subdirectory in the mlkit-ios-codelab repository contains two directories:

starter - Starting code that you build upon in this codelab.
final - Completed code for the finished sample app.

Add the dependencies for ML Kit with CocoaPods

CocoaPods is used to add the ML Kit dependencies to your app. If CocoaPods is not yet installed on your machine, follow the CocoaPods installation instructions first. Once it is installed, open Podfile in your favorite editor and add ML Kit as dependencies:

Podfile

platform :ios, '10.0'
use_frameworks!

target 'MLKit-codelab' do
  # ML Kit face detection (including contours) and text recognition.
  pod 'GoogleMLKit/FaceDetection'
  pod 'GoogleMLKit/TextRecognition'
end

Install the ML Kit CocoaPods

To make sure that all dependencies are available to your app, use the command line to install the ML Kit CocoaPods.

Command Line

# Make sure you are in the root of your app
pod install
xed .

3. Run the starter app

You are now ready to run the app for the first time. Click Run in Xcode to compile the app and run it on the iOS Simulator.

The app should launch on your simulator. At this point, you should see a basic layout with a picker that lets you choose between two images. In the next section, you add text recognition to your app to identify text in the images.

4. Add on-device text recognition

In this step, we will add functionality to your app to recognize text in images.

Import the MLKit module

Confirm that the following import exists in your ViewController class.

ViewController.swift

import MLKit

Create a TextRecognizer

Add the following lazy property to your ViewController class.

ViewController.swift

private lazy var textRecognizer = TextRecognizer.textRecognizer()

Set up and run on-device text recognition on an image

Add the following to the runTextRecognition method of the ViewController class:

ViewController.swift

func runTextRecognition(with image: UIImage) {
  let visionImage = VisionImage(image: image)
  textRecognizer.process(visionImage) { features, error in
    self.processResult(from: features, error: error)
  }
}

The code above configures the text recognizer and calls processResult(from:error:) with the response.
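
The response handed to processResult is nested: a text result contains blocks, blocks contain lines, and lines contain elements. The following is a minimal sketch of walking that hierarchy. The Sketch* types and the lineStrings(in:) helper are hypothetical stand-ins, not ML Kit types, so that the snippet runs on its own; the real result objects also carry a frame for each level.

```swift
// Hypothetical stand-ins for the recognizer's result types (not ML Kit API).
struct SketchElement { let text: String }
struct SketchLine { let elements: [SketchElement] }
struct SketchBlock { let lines: [SketchLine] }
struct SketchText { let blocks: [SketchBlock] }

/// Walks blocks -> lines -> elements and returns one string per line.
func lineStrings(in text: SketchText) -> [String] {
  var result: [String] = []
  for block in text.blocks {
    for line in block.lines {
      // Elements are treated as words here, so join them with spaces.
      result.append(line.elements.map { $0.text }.joined(separator: " "))
    }
  }
  return result
}

let sample = SketchText(blocks: [
  SketchBlock(lines: [
    SketchLine(elements: [SketchElement(text: "PLEASE"), SketchElement(text: "WALK")]),
    SketchLine(elements: [
      SketchElement(text: "ON"), SketchElement(text: "THE"), SketchElement(text: "GRASS"),
    ]),
  ])
])
print(lineStrings(in: sample))  // ["PLEASE WALK", "ON THE GRASS"]
```

The same three nested loops are what the app uses to draw one frame per block, line, and element.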

Process the text recognition response

Add the following code to processResult in the ViewController class to parse the results and display them in your app.

ViewController.swift

  func processResult(from text: Text?, error: Error?) {
    removeDetectionAnnotations()
    guard error == nil, let text = text else {
      let errorString = error?.localizedDescription ?? Constants.detectionNoResultsMessage
      print("Text recognizer failed with error: \(errorString)")
      return
    }

    let transform = self.transformMatrix()

    // Blocks.
    for block in text.blocks {
      drawFrame(block.frame, in: .purple, transform: transform)

      // Lines.
      for line in block.lines {
        drawFrame(line.frame, in: .orange, transform: transform)

        // Elements.
        for element in line.elements {
          drawFrame(element.frame, in: .green, transform: transform)

          let transformedRect = element.frame.applying(transform)
          let label = UILabel(frame: transformedRect)
          label.text = element.text
          label.adjustsFontSizeToFitWidth = true
          self.annotationOverlayView.addSubview(label)
        }
      }
    }
  }
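
The frames returned by the recognizer are in image coordinates, so the code maps them through transformMatrix() before drawing. That helper comes from the starter code and is not shown here. As a rough sketch of what such a mapping could look like, assuming the image is displayed aspect-fit and centered in its view, the hypothetical function aspectFitTransform(imageSize:viewSize:) below builds the transform from two sizes; the starter app's own implementation may differ.

```swift
import CoreGraphics

/// Builds a transform from image coordinates to the coordinates of a view
/// that shows the image aspect-fit and centered (an assumption of this sketch).
func aspectFitTransform(imageSize: CGSize, viewSize: CGSize) -> CGAffineTransform {
  guard imageSize.width > 0, imageSize.height > 0 else { return .identity }
  // Aspect fit uses the smaller of the two possible scale factors.
  let scale = min(viewSize.width / imageSize.width, viewSize.height / imageSize.height)
  // Center the scaled image inside the view.
  let xOffset = (viewSize.width - imageSize.width * scale) / 2
  let yOffset = (viewSize.height - imageSize.height * scale) / 2
  // Points are scaled first and then shifted by the offset.
  return CGAffineTransform(translationX: xOffset, y: yOffset).scaledBy(x: scale, y: scale)
}

let transform = aspectFitTransform(
  imageSize: CGSize(width: 600, height: 400),
  viewSize: CGSize(width: 300, height: 400))
let frame = CGRect(x: 100, y: 100, width: 200, height: 100)
print(frame.applying(transform))
```

With these example sizes the scale is 0.5 and the vertical offset is 100, so the frame maps to origin (50, 150) with size 100 x 50.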

Run the app on the simulator

Now click Run in Xcode. Once the app loads, make sure that Image 1 is selected in the picker and click the Find Text button.

Your app should now look like the image below, showing the text recognition results and bounding boxes overlaid on the original image.

Photo: Kai Schreiber / Wikimedia Commons / CC BY-SA 2.0

Congratulations, you have just added on-device text recognition to your app using ML Kit! On-device text recognition suits many use cases because it works even when your app has no internet connectivity, and it is fast enough to use on still images as well as live video frames.

5. Add on-device face contour detection

In this step, we will add functionality to your app to detect the contours of faces in images.

Create a FaceDetector

Add the following lazy properties to your ViewController class.

ViewController.swift

private lazy var faceDetectorOption: FaceDetectorOptions = {
  let option = FaceDetectorOptions()
  option.contourMode = .all
  option.performanceMode = .fast
  return option
}()
private lazy var faceDetector = FaceDetector.faceDetector(options: faceDetectorOption)
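
Both properties are lazy, so the options closure runs only the first time the property is read and its result is then reused. The sketch below illustrates that behavior in isolation. SketchOptions and SketchController are hypothetical stand-ins, not ML Kit or app types, and buildCount exists only to make the single initialization visible.

```swift
// Hypothetical stand-in for an options object (not an ML Kit type).
final class SketchOptions {
  var mode = "accurate"
}

final class SketchController {
  // Counts how many times the options closure has run.
  private(set) var buildCount = 0

  // The closure runs once, on first access, and the result is stored.
  lazy var options: SketchOptions = {
    self.buildCount += 1
    let options = SketchOptions()
    options.mode = "fast"
    return options
  }()
}

let controller = SketchController()
let countBeforeAccess = controller.buildCount  // closure has not run yet
_ = controller.options
_ = controller.options
print(countBeforeAccess, controller.buildCount)  // 0 1
```

Deferring the work this way means a detector that is never used is never created.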

Set up and run on-device face contour detection on an image

Add the following to the runFaceContourDetection method of the ViewController class:

ViewController.swift

  func runFaceContourDetection(with image: UIImage) {
    let visionImage = VisionImage(image: image)
    faceDetector.process(visionImage) { features, error in
      self.processResult(from: features, error: error)
    }
  }

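Process the face detector response

Add the following code to processResult in the ViewController class to parse the results and display them in your app.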

ViewController.swift

  func processResult(from faces: [Face]?, error: Error?) {
    removeDetectionAnnotations()
    guard let faces = faces else {
      return
    }

    for feature in faces {
      let transform = self.transformMatrix()
      let transformedRect = feature.frame.applying(transform)
      UIUtilities.addRectangle(
        transformedRect,
        to: self.annotationOverlayView,
        color: UIColor.green
      )
      self.addContours(forFace: feature, transform: transform)
    }
  }

Finally, add the helper method addContours to the ViewController class to draw the contour points.

ViewController.swift

  private func addContours(forFace face: Face, transform: CGAffineTransform) {
    // Face
    if let faceContour = face.contour(ofType: .face) {
      for point in faceContour.points {
        drawPoint(point, in: .blue, transform: transform)
      }
    }

    // Eyebrows
    if let topLeftEyebrowContour = face.contour(ofType: .leftEyebrowTop) {
      for point in topLeftEyebrowContour.points {
        drawPoint(point, in: .orange, transform: transform)
      }
    }
    if let bottomLeftEyebrowContour = face.contour(ofType: .leftEyebrowBottom) {
      for point in bottomLeftEyebrowContour.points {
        drawPoint(point, in: .orange, transform: transform)
      }
    }
    if let topRightEyebrowContour = face.contour(ofType: .rightEyebrowTop) {
      for point in topRightEyebrowContour.points {
        drawPoint(point, in: .orange, transform: transform)
      }
    }
    if let bottomRightEyebrowContour = face.contour(ofType: .rightEyebrowBottom) {
      for point in bottomRightEyebrowContour.points {
        drawPoint(point, in: .orange, transform: transform)
      }
    }

    // Eyes
    if let leftEyeContour = face.contour(ofType: .leftEye) {
      for point in leftEyeContour.points {
        drawPoint(point, in: .cyan, transform: transform)
      }
    }
    if let rightEyeContour = face.contour(ofType: .rightEye) {
      for point in rightEyeContour.points {
        drawPoint(point, in: .cyan, transform: transform)
      }
    }

    // Lips
    if let topUpperLipContour = face.contour(ofType: .upperLipTop) {
      for point in topUpperLipContour.points {
        drawPoint(point, in: .red, transform: transform)
      }
    }
    if let bottomUpperLipContour = face.contour(ofType: .upperLipBottom) {
      for point in bottomUpperLipContour.points {
        drawPoint(point, in: .red, transform: transform)
      }
    }
    if let topLowerLipContour = face.contour(ofType: .lowerLipTop) {
      for point in topLowerLipContour.points {
        drawPoint(point, in: .red, transform: transform)
      }
    }
    if let bottomLowerLipContour = face.contour(ofType: .lowerLipBottom) {
      for point in bottomLowerLipContour.points {
        drawPoint(point, in: .red, transform: transform)
      }
    }

    // Nose
    if let noseBridgeContour = face.contour(ofType: .noseBridge) {
      for point in noseBridgeContour.points {
        drawPoint(point, in: .yellow, transform: transform)
      }
    }
    if let noseBottomContour = face.contour(ofType: .noseBottom) {
      for point in noseBottomContour.points {
        drawPoint(point, in: .yellow, transform: transform)
      }
    }
  }
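
The helper repeats the same loop for every contour type and changes only the color. One possible way to shorten it is to describe the color groups as data and iterate over them. The sketch below shows that idea with hypothetical stand-ins (SketchContour, contourGroups, and color names as strings) so that it runs without ML Kit or UIKit; in the app, the rows would hold the real contour types and colors.

```swift
// Hypothetical stand-in for the contour types used in addContours.
enum SketchContour: String, CaseIterable {
  case face
  case leftEyebrowTop, leftEyebrowBottom, rightEyebrowTop, rightEyebrowBottom
  case leftEye, rightEye
  case upperLipTop, upperLipBottom, lowerLipTop, lowerLipBottom
  case noseBridge, noseBottom
}

// One row per color group, mirroring the comments in addContours.
let contourGroups: [(color: String, contours: [SketchContour])] = [
  ("blue", [.face]),
  ("orange", [.leftEyebrowTop, .leftEyebrowBottom, .rightEyebrowTop, .rightEyebrowBottom]),
  ("cyan", [.leftEye, .rightEye]),
  ("red", [.upperLipTop, .upperLipBottom, .lowerLipTop, .lowerLipBottom]),
  ("yellow", [.noseBridge, .noseBottom]),
]

/// Returns the color assigned to a contour, or nil if no group lists it.
func color(for contour: SketchContour) -> String? {
  return contourGroups.first { $0.contours.contains(contour) }?.color
}

// A single nested loop replaces the thirteen near-identical blocks.
for group in contourGroups {
  for contour in group.contours {
    print("\(contour.rawValue): \(group.color)")
  }
}
```

Adding or recoloring a contour then means editing one row of the table rather than copying another block.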

Run the app on the simulator

Now click Run in Xcode. Once the app loads, make sure that Image 2 is selected in the picker and click the Find Face Contour button. Your app should now look like the image below, showing the contours of Grace Hopper's face as points overlaid on the original image.

Congratulations, you have just added on-device face contour detection to your app using ML Kit! On-device face contour detection suits many use cases because it works even when your app has no internet connectivity, and it is fast enough to use on still images as well as live video frames.

6. Congratulations!

You have used ML Kit to easily add advanced machine learning capabilities to your app.

What we've covered

How to add ML Kit to your iOS app
How to use on-device text recognition in ML Kit to find text in images
How to use on-device face contour detection in ML Kit to identify facial features in images

Next steps

Use ML Kit in your own iOS app.

Learn more

https://g.co/mlkit