
Implement LiteRT on Google Tensor

1. Overview

The Google Tensor SDK compiles LiteRT models for Pixel devices. Compiled models can be deployed to Pixel devices for better ML inference performance. To use the SDK, first convert your model to a LiteRT (tflite) model.

This codelab is based on a public Colab on GitHub, the LiteRT AOT Compilation Tutorial Colab.

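Objectives

Learn how to use the LiteRT AOT (Ahead-of-Time) compiler to compile a selfie segmentation model from a TFLite model into a LiteRT model that is optimized and compiled for the on-device EdgeTPU.

This Colab also walks through the steps to prepare the model for Play for On-device AI (PODAI).

PODAI delivers custom models for on-device AI features more efficiently. It streamlines the launch, targeting, versioning, and download of AI models. Combined with LiteRT EdgeTPU AOT compilation, it lets developers serve compiled ML models to a range of devices without needing to know which EdgeTPU is in the end user's phone.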

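Model used

The model used here was originally published in the MediaPipe image segmentation guide. Details of the model used in this codelab:

SelfieMulticlass: A LiteRT model that takes an image of a person, locates areas such as hair, skin, and clothing, and outputs an image segmentation map for these items.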

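2. Get started

Follow these steps to get access to the Google Tensor SDK and get started.

Sign up for access to the Google Tensor SDK. Before continuing, wait for an email from Google that contains the download link for the compiler plugin.
Download the compiler plugin (litert_plugin_compiler.tar.gz) and place it in a folder of your choice.
Set the GOOGLE_TENSOR_SDK_BETA environment variable to the local system path of the downloaded file.
You can run this command in a bash terminal: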
```
export GOOGLE_TENSOR_SDK_BETA=/path/to/downloaded/compiler
```
Or run it in a Colab notebook:
```
%env GOOGLE_TENSOR_SDK_BETA=/path/to/downloaded/compiler
```
Then run this command to install the package:
```
pip install ai-edge-litert-sdk-google-tensor
```
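
Before installing, it can help to confirm that the variable is visible to Python and points at something that exists. The following is a minimal sketch; the helper name `check_sdk_path` is hypothetical and not part of the SDK.

```python
import os


def check_sdk_path(env=None, var_name="GOOGLE_TENSOR_SDK_BETA"):
  """Returns the configured SDK path, or raises if it is unset or missing."""
  # `env` defaults to the real process environment; a dict can be passed
  # instead, which makes the check easy to try out.
  env = os.environ if env is None else env
  path = env.get(var_name)
  if not path:
    raise RuntimeError(f"{var_name} is not set")
  if not os.path.exists(path):
    raise FileNotFoundError(f"{var_name} points to a missing path: {path}")
  return path
```

Calling `check_sdk_path()` in the notebook right after the `%env` cell surfaces a typo in the path before any compilation is attempted.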

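3. Install the required packages

First, install the required packages, including ai-edge-litert-nightly, which contains the EdgeTPU AOT compiler, and the other libraries used for model conversion.

The LiteRT backend for Google Tensor is installed through the ai-edge-litert-sdk-google-tensor package.

After installing the packages, restart the session and continue with the steps that follow installation. Do not repeat the installation.

If you run the setup on your own system, we recommend using a Python virtual environment (venv) and running these commands inside it.

Uninstall specific packages

Before that, uninstall TensorFlow, which ships with the Colab runtime by default, together with any preinstalled ai-edge-litert package.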

pip uninstall -y tensorflow ai-edge-litert

Install all libraries

Install the LiteRT backend for Google Tensor

pip install ai-edge-litert-sdk-google-tensor

Install the remaining packages

pip install matplotlib huggingface-hub ai-edge-litert-nightly

4. Import all libraries

After installation is complete, proceed with the main execution.

Import the required packages.

import os
import shutil

from ai_edge_litert.aot import aot_compile as aot_lib
from ai_edge_litert.aot.ai_pack import export_lib as ai_pack_export
from ai_edge_litert.aot.vendors.google_tensor import target as gt_target
import huggingface_hub
import matplotlib.pyplot as plt
import numpy as np
from PIL import Image
import requests

Note: If you use the stable package ai-edge-litert instead of ai-edge-litert-nightly, you must set the environment variable GOOGLE_TENSOR_BACKEND_ENABLED=1 before any import statements, on every run of the script. The equivalent Python snippet is:

import os
os.environ['GOOGLE_TENSOR_BACKEND_ENABLED'] = '1'

Or (shell):

export GOOGLE_TENSOR_BACKEND_ENABLED=1
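
Because the note above depends on ordering, a small guard can make an out-of-order import visible. This is a sketch only; `enable_google_tensor_backend` is a hypothetical helper, and it assumes that the presence of `ai_edge_litert` in `sys.modules` is a reasonable sign that the package was already imported in this process.

```python
import os
import sys


def enable_google_tensor_backend():
  """Sets the flag and reports whether it was set before LiteRT was imported."""
  os.environ["GOOGLE_TENSOR_BACKEND_ENABLED"] = "1"
  # If ai_edge_litert is already loaded, the flag was set after the import;
  # in that case restart the session and call this function first.
  return "ai_edge_litert" not in sys.modules
```

If the function returns `False`, restarting the session and running it in the first cell is the safest fix.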

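5. Compile the LiteRT model

This section covers advanced usage, such as compiling a LiteRT (TFLite) model directly.

Compile for EdgeTPU from a TFLite model

This step requires a TFLite model. If you do not have one, convert your model to the TFLite format first.

Get the TFLite model

This use case uses the MediaPipe MultiClass segmentation model.

The TFLite model is available on the MediaPipe image segmentation page.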

work_dir = '.'

model_url = 'https://storage.googleapis.com/mediapipe-models/image_segmenter/selfie_multiclass_256x256/float32/latest/selfie_multiclass_256x256.tflite'
tflite_model_path = os.path.join(work_dir, 'selfie_multiclass_256x256.tflite')

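model_content = requests.get(model_url)
# Fail early on an HTTP error instead of saving an error page as the model.
model_content.raise_for_status()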

with open(tflite_model_path, 'wb') as fout:
  fout.write(model_content.content)
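
A quick sanity check on the downloaded file can catch a failed or truncated download before compilation. TFLite models are FlatBuffers files that carry the file identifier `TFL3` at bytes 4 to 8. The helper names below are hypothetical.

```python
def looks_like_tflite(data: bytes) -> bool:
  """Checks for the 'TFL3' FlatBuffers file identifier at bytes 4..8."""
  return len(data) >= 8 and data[4:8] == b"TFL3"


def file_looks_like_tflite(path: str) -> bool:
  """Reads only the first 8 bytes of the file and checks the identifier."""
  with open(path, "rb") as f:
    return looks_like_tflite(f.read(8))
```

For example, `file_looks_like_tflite(tflite_model_path)` should be true for the model saved above. The check only inspects the header, so it does not prove the model is complete or valid.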

Quickly verify the TFLite model with the LiteRT Python API

The following example displays both the mask image and the blended result.

# Download the test image

test_image = huggingface_hub.hf_hub_download(
    repo_id="litert-community/MediaPipe-Selfie-Segmentation",
    filename="test_img.png",
)
pil_image = Image.open(test_image).convert("RGB").resize((256, 256))

from ai_edge_litert.compiled_model import CompiledModel

SEGMENT_COLORS = [
    (0, 0, 0),
    (255, 0, 0),
    (0, 255, 0),
    (0, 0, 255),
    (255, 255, 0),
    (255, 0, 255),
]
INPUT_SIZE = (256, 256)
NUM_CLASSES = 6

# Load the model and image
model = CompiledModel.from_file(tflite_model_path)
original_image = np.array(Image.open(test_image).convert('RGB'))
img_array = np.array(pil_image).astype(np.float32)

# Normalize the image
normalized = (img_array - 127.5) / 127.5
normalized = np.ascontiguousarray(normalized, dtype=np.float32)

# Run inference
sig_idx = 0
input_buffers = model.create_input_buffers(sig_idx)
output_buffers = model.create_output_buffers(sig_idx)
input_data = normalized.reshape(-1)
input_buffers[0].write(input_data)
model.run_by_index(sig_idx, input_buffers, output_buffers)

# Get output data
height, width = INPUT_SIZE
output_size = height * width * NUM_CLASSES
output_data = output_buffers[0].read(output_size, np.float32)
output_data = output_data.reshape(height, width, NUM_CLASSES)
mask = np.argmax(output_data, axis=2).astype(np.uint8)

# Create colored mask
colored_mask = np.zeros((height, width, 3), dtype=np.uint8)
for label_idx in range(NUM_CLASSES):
  class_mask = mask == label_idx
  color = SEGMENT_COLORS[label_idx]
  colored_mask[class_mask] = color

# Blend with original image
# Resize colored mask to match original image if necessary
if original_image.shape[:2] != colored_mask.shape[:2]:
  colored_mask_pil = Image.fromarray(colored_mask)
  colored_mask_pil = colored_mask_pil.resize(
      (original_image.shape[1], original_image.shape[0])
  )
  colored_mask = np.array(colored_mask_pil)

# Blend images with alpha 0.5
alpha = 0.5
blended_image = (
    original_image * (1 - alpha) + colored_mask * alpha
).astype(np.uint8)

# Display them
fig, axes = plt.subplots(1, 3, figsize=(9, 3))

for idx, (title, image) in enumerate([
    ('Original Image', original_image),
    ('Colored Mask', colored_mask),
    ('Blended Image', blended_image),
]):
  axes[idx].imshow(image)
  axes[idx].set_title(title)
  axes[idx].axis('off')

plt.tight_layout()
plt.show()
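
The pre- and post-processing in the code above can be tried in isolation on synthetic data, without a model. The sketch below uses the same normalization, argmax, and alpha blend; the function names are hypothetical, and the per-class coloring loop is swapped for NumPy palette indexing, which yields the same mask.

```python
import numpy as np

# Same six colors as SEGMENT_COLORS, stored as an array for indexing.
PALETTE = np.array([
    (0, 0, 0), (255, 0, 0), (0, 255, 0),
    (0, 0, 255), (255, 255, 0), (255, 0, 255),
], dtype=np.uint8)


def normalize(pixels):
  """Maps uint8 pixel values in [0, 255] to float32 values in [-1, 1]."""
  return (pixels.astype(np.float32) - 127.5) / 127.5


def colorize(scores):
  """Turns (H, W, C) class scores into an (H, W, 3) color mask."""
  mask = np.argmax(scores, axis=2)
  return PALETTE[mask]  # One palette lookup per pixel.


def blend(image, overlay, alpha=0.5):
  """Alpha-blends two uint8 images of the same shape."""
  return (image * (1 - alpha) + overlay * alpha).astype(np.uint8)
```

Running these helpers on a small random array is a cheap way to check shapes and value ranges before involving the model.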

Convert to a LiteRT model with EdgeTPU AOT compilation

Compile the model using the API from ai_edge_litert.aot.

compiled_models = aot_lib.aot_compile(tflite_model_path, keep_going=True)

# This variable will be used later to create the AI Pack.
all_google_tensor_compiled_models = compiled_models

# Print Compilation Report
print(all_google_tensor_compiled_models.compilation_report())

# Saving compiled models to disk. This saves all the compiled models, and a CPU
# fallback model.
all_google_tensor_compiled_models.export(
    work_dir, model_name='selfie_segmentation'
)

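When compilation finishes, use the export method (called on all_google_tensor_compiled_models in the code above) to write all models to disk.

By default, models are stored in a flat structure in the output directory, and each model name is suffixed with its backend ID.

For example: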

Model file name	Backend	SoC	Additional info
selfie_segmentation_fallback.tflite	CPU/GPU	N/A	N/A
selfie_segmentation_Google_Tensor_G3.tflite	Google	Tensor_G3	Google Tensor G3
selfie_segmentation_Google_Tensor_G4.tflite	Google	Tensor_G4	Google Tensor G4
selfie_segmentation_Google_Tensor_G5.tflite	Google	Tensor_G5	Google Tensor G5
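
The naming pattern in the table can be used to locate the right file for a given SoC during local testing. The sketch below infers the pattern only from the file names listed above, so treat it as an assumption rather than a documented contract; both helper names are hypothetical.

```python
def model_file_for(model_name, soc=None):
  """Builds the exported file name for a SoC label, or the fallback name."""
  if soc is None:
    return f"{model_name}_fallback.tflite"
  return f"{model_name}_Google_{soc}.tflite"


def pick_model(model_name, soc, available):
  """Prefers the SoC-specific file and falls back to the CPU/GPU model."""
  wanted = model_file_for(model_name, soc)
  return wanted if wanted in available else model_file_for(model_name)
```

Here `available` can be any collection of file names, for example the result of `os.listdir(work_dir)`.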

6. Export and validate on CPU

After compilation, verify the TFLite model on the CPU. Use the "fallback model" generated during compilation to do so.

# Run LiteRT with test image
from ai_edge_litert.compiled_model import CompiledModel

# Normalize the image to [-1, 1]
img_array = np.array(pil_image, dtype=np.float32)
normalized = (img_array - 127.5) / 127.5
numpy_array = np.ascontiguousarray(normalized)[None, ...]

cpu_model_path = os.path.join(work_dir, "selfie_segmentation_fallback.tflite")
cm_model = CompiledModel.from_file(cpu_model_path)
sig_idx = 0
input_buffers = cm_model.create_input_buffers(sig_idx)
output_buffers = cm_model.create_output_buffers(sig_idx)
input_buffers[0].write(numpy_array)
cm_model.run_by_index(sig_idx, input_buffers, output_buffers)

# Read the 6-channel output and apply argmax
output_data = output_buffers[0].read(256 * 256 * 6, np.float32)
output_data = output_data.reshape((256, 256, 6))
mask = np.argmax(output_data, axis=2).astype(np.uint8)

# Create a colored mask using the previously defined SEGMENT_COLORS
colored_mask = np.zeros((256, 256, 3), dtype=np.uint8)
for label_idx in range(6):
  class_mask = mask == label_idx
  color = SEGMENT_COLORS[label_idx]
  colored_mask[class_mask] = color

mask_image = Image.fromarray(colored_mask)

# Show output results
fig, axes = plt.subplots(1, 2, figsize=(9, 3))

for idx, (title, image) in enumerate([
    ('Test Image', pil_image),
    ('TFLite Mask Image', mask_image),
]):
  axes[idx].imshow(image)
  axes[idx].set_title(title)
  axes[idx].axis('off')

plt.tight_layout()
plt.show()
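
Visual inspection can be complemented with a number. One option, sketched here under the assumption that you kept a copy of the mask produced by the original model in section 5 (the `mask` variable is overwritten above), is the fraction of pixels on which two masks agree. The function name is hypothetical.

```python
import numpy as np


def mask_agreement(mask_a, mask_b):
  """Returns the fraction of pixels where two label masks agree."""
  if mask_a.shape != mask_b.shape:
    raise ValueError("masks must have the same shape")
  return float(np.mean(mask_a == mask_b))
```

A value close to 1.0 suggests the fallback model reproduces the original segmentation. What counts as close enough depends on your application.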

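7. Export the model for PODAI

Once the model is verified, the next essential step is to prepare it for deployment. This section details how to package the compiled models for upload to Google Play so that they are delivered to user devices through the Play for On-device AI (PODAI) framework.

The AiEdgeLiteRT AOT (Ahead-of-Time) module provides the ai_pack utilities specifically for this purpose. These utilities create an AI Pack, the key data asset. An AI Pack bundles the compiled models with device-targeting configuration so that the right models and assets are delivered to the appropriate user devices. This is especially important for NPU (Neural Processing Unit) compilation, because it ensures that models optimized for a specific SoC (System-on-Chip) reach only devices equipped with that SoC.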

# Configuring the AI Pack
os.makedirs('selfie_multiclass', exist_ok=True)
ai_pack_dir = os.path.join(work_dir, 'ai_pack')
ai_pack_name = 'selfie_segmentation'
litert_model_name = 'segmentation_model'

# Clean up
shutil.rmtree(ai_pack_dir, ignore_errors=True)

# Export
ai_pack_export.export(
    all_google_tensor_compiled_models,
    ai_pack_dir,
    ai_pack_name,
    litert_model_name
)

Inspect the AI Pack sources

def list_files(startpath):
  """Function to print out the tree structure of a directory."""
  for root, dirs, files in os.walk(startpath):
    level = root.replace(startpath, '').count(os.sep)
    indent = ' ' * 4 * (level)
    print('{}{}/'.format(indent, os.path.basename(root)))
    subindent = ' ' * 4 * (level + 1)
    for f in files:
      print('{}{}'.format(subindent, f))
# View the files generated within the AI pack directory
list_files(ai_pack_dir)

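8. Configure advanced options

NPU compilation for a specific device or EdgeTPU

By default, LiteRT AOT compilation compiles for all registered backends. For local development, you may prefer to compile only for a specific device, such as your development phone. Do this by explicitly providing the compilation target.

The following example compiles for Google Tensor G5.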

# Specifying the compilation target
tensor_g5_target = gt_target.Target(gt_target.SocModel.TENSOR_G5)

# Compile from the TFLite model for a specific target
compiled_models = aot_lib.aot_compile(
    tflite_model_path,
    target=[tensor_g5_target],
    keep_going=False,  # We want to error out when there's failure.
)

print(compiled_models.compilation_report())
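
To decide which target matches a development phone, a small lookup from Pixel model name to the SoC labels used in the export table can help. This is an illustrative sketch, not part of the SDK: the table is intentionally partial, the helper name is hypothetical, and mapping a label to a `gt_target.SocModel` member is left to you, because only `TENSOR_G5` appears in this codelab.

```python
# Partial, hand-maintained mapping from Pixel model names to SoC labels.
PIXEL_SOC = {
    "Pixel 8": "Tensor_G3",
    "Pixel 8 Pro": "Tensor_G3",
    "Pixel 8a": "Tensor_G3",
    "Pixel 9": "Tensor_G4",
    "Pixel 9 Pro": "Tensor_G4",
    "Pixel 9 Pro XL": "Tensor_G4",
    "Pixel 9 Pro Fold": "Tensor_G4",
    "Pixel 9a": "Tensor_G4",
    "Pixel 10": "Tensor_G5",
    "Pixel 10 Pro": "Tensor_G5",
    "Pixel 10 Pro XL": "Tensor_G5",
    "Pixel 10 Pro Fold": "Tensor_G5",
}


def soc_for_device(model_name):
  """Returns the SoC label for a Pixel model name, or None if unknown."""
  return PIXEL_SOC.get(model_name.strip())
```

The model name of a connected phone can be read with `adb shell getprop ro.product.model`.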

Compilation flags for Google Tensor

Customize the compilation process through compilation flags. Here, the google_tensor_truncation_type="half" flag is used.

When compiling a TFLite model:

compiled_models = aot_lib.aot_compile(
    tflite_model_path,
    target=[tensor_g5_target],
    keep_going=False,
    google_tensor_truncation_type="half"
)
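
This codelab does not define what the flag does internally. Assuming "half" refers to half-precision (float16) values, the NumPy sketch below gives a rough feel for the rounding error that such a truncation introduces. It illustrates float16 rounding in general, not the compiler's actual behavior, and the function name is hypothetical.

```python
import numpy as np


def half_roundtrip_error(values):
  """Returns the max absolute error after a float32 -> float16 -> float32 trip."""
  values = np.asarray(values, dtype=np.float32)
  roundtrip = values.astype(np.float16).astype(np.float32)
  return float(np.max(np.abs(values - roundtrip)))
```

Values such as 0.5 or 1.0 survive the round trip exactly, while a value like 0.1 picks up a small error. Comparing masks before and after compilation is the more direct way to judge whether reduced precision matters for this model.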

9. Next steps

Congratulations!

Your model is now ready for use with PODAI.

Next, move to Android Studio for the following steps. For details, see the LiteRT image segmentation sample.