On-Device Image Generation on Android with MediaPipe

1. Introduction

What is MediaPipe?

MediaPipe Solutions lets you apply machine-learning (ML) solutions to your apps. It provides a framework for configuring prebuilt processing pipelines that deliver immediate, engaging, and useful output to users. You can even customize many of these solutions with MediaPipe Model Maker to update the default models.

Text-to-Image generation is one of several ML tasks that MediaPipe Solutions has to offer.

In this codelab, you will start with a mostly bare Android app, then progress through multiple steps until you can generate new images directly on your Android device.

What you'll learn

  • How to implement text-to-image generation running locally in an Android app with MediaPipe Tasks.

What you'll need

  • An installed version of Android Studio (this codelab was written and tested with Android Studio Giraffe).
  • An Android device with at least 8GB of RAM.
  • Basic knowledge of Android development and the ability to run a pre-written Python script.

2. Add MediaPipe Tasks to the Android app

Download the Android starter app

This codelab will start with a pre-made sample consisting of the UI that will be used for a basic version of image generation. You can find that starting app in the official MediaPipe Samples repo here. Clone the repo or download the zipfile by clicking Code > Download ZIP.

Import the app to Android Studio

  1. Open Android Studio.
  2. From the Welcome to Android Studio screen, select Open in the top right corner.


  3. Navigate to where you cloned or downloaded the repository and open the codelabs/image_generation_basic/android/start directory.
  4. At this stage the app will not compile because you have not yet included the MediaPipe Tasks dependency.

You will fix the app and get it running by going into the build.gradle file and scrolling down to // Step 1 - Add dependency. From there, include the following line and then hit the Sync Now button that appears in the banner at the top of Android Studio.

// Step 1 - Add dependency
implementation 'com.google.mediapipe:tasks-vision-image-generator:latest.release'

Once syncing has completed, verify that everything imported and installed correctly by clicking the green Run arrow in the top right of Android Studio. The app should open to a screen with two radio buttons and a button labeled INITIALIZE. Clicking that button takes you to a separate UI consisting of a text prompt and other options, alongside a button labeled GENERATE.


Unfortunately that's about the extent of the starter app, so it's time for you to learn how you will finish this app and start generating new images on your device!

3. Setting up the Image Generator

For this example, the majority of the image generation work will happen in the ImageGenerationHelper.kt file. When you open this file, you will notice a variable towards the top of the class called imageGenerator. This is the Task object that will do the heavy lifting in your image generation app.

Just below that object you will see a function called initializeImageGenerator() with the following comment: // Step 2 - initialize the image generator. As you might guess, this is where you will initialize the ImageGenerator object. Replace that function body with the following code to set the image generation model path and initialize the ImageGenerator object:

// Step 2 - initialize the image generator
val options = ImageGeneratorOptions.builder()
    .setImageGeneratorModelDirectory(modelPath)
    .build()

imageGenerator = ImageGenerator.createFromOptions(context, options)

Below that you will see another function named setInput(). It accepts three parameters: a prompt string that describes the image to generate, the number of iterations the task should run while generating the image, and a seed value. The same prompt and seed always reproduce the same image, while changing the seed produces a new variation of the same prompt. This function sets the initial parameters for the image generator when you want to display intermediate steps during generation.

Go ahead and replace the setInput() body (where you will see the comment // Step 3 - accept inputs) with this line:

// Step 3 - accept inputs
imageGenerator.setInputs(prompt, iteration, seed)
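The seed behaves the way a pseudo-random seed usually does: it fixes the starting point of the generation, so identical inputs replay identically. This pure-Kotlin sketch is illustrative only — it does not call MediaPipe, and noiseSample() is a hypothetical stand-in for the generator's internal noise source:

```kotlin
import kotlin.random.Random

// Illustrative stand-in for the generator's internal noise source: the
// same seed always replays the same sequence, which is why the same
// prompt + seed pair reproduces the same image.
fun noiseSample(seed: Int): List<Int> {
    val rng = Random(seed)
    return List(4) { rng.nextInt(100) }
}

fun main() {
    println(noiseSample(42) == noiseSample(42)) // true: identical seed, identical noise
    println(noiseSample(42) == noiseSample(7))  // a different seed gives a new variation
}
```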

The next two steps are where the generation takes place. The generate() function accepts the same inputs as setInput(), but creates an image as a one-shot call that does not return any intermediate images. You can replace the body of this function (which includes the comment // Step 4 - generate without showing iterations) with the following:

// Step 4 - generate without showing iterations
val result = imageGenerator.generate(prompt, iteration, seed)
val bitmap = BitmapExtractor.extract(result?.generatedImage())
return bitmap

It's important to know that this task happens synchronously, so you will need to call the function from a background thread. You will learn more about that a little later in this codelab.
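In the finished app this call is wrapped in a coroutine on a background dispatcher (you will see that in the ViewModel), but the underlying idea is simply to run the blocking call off the main thread. Here is a minimal stdlib sketch of that pattern, where blockingGenerate() is a hypothetical stand-in for imageGenerator.generate():

```kotlin
import java.util.concurrent.Callable
import java.util.concurrent.Executors

// Hypothetical stand-in for the synchronous, long-running generate() call.
fun blockingGenerate(prompt: String): String {
    Thread.sleep(100) // simulate heavy on-device inference
    return "bitmap for: $prompt"
}

fun main() {
    val executor = Executors.newSingleThreadExecutor()
    // Submit the blocking work to a background thread; the calling thread
    // stays responsive and only blocks when the result is actually needed.
    val future = executor.submit(Callable { blockingGenerate("a bicycle at sunset") })
    println(future.get()) // prints "bitmap for: a bicycle at sunset"
    executor.shutdown()
}
```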

The final step you will take in this file is to fill in the execute() function (labeled as Step 5). It accepts a parameter that tells the underlying ImageGenerator execute() call whether to return an intermediate image for the single generation step it performs. Replace the function body with this code:

// Step 5 - generate with iterations
val result = imageGenerator.execute(showResult)

if (result == null || result.generatedImage() == null) {
    return Bitmap.createBitmap(512, 512, Bitmap.Config.ARGB_8888)
        .apply {
            val canvas = Canvas(this)
            val paint = Paint()
            paint.color = Color.WHITE
            canvas.drawPaint(paint)
        }
}

val bitmap =
    BitmapExtractor.extract(result.generatedImage())

return bitmap

And that's it for the helper file. In the next section you will fill out the ViewModel file that handles the logic for this example.

4. Bringing the App Together

The MainViewModel file will handle UI states and other logic related to this example app. Go ahead and open it now.

Towards the top of the file you should see the comment // Step 6 - set model path. This is where you will tell your app where it can find the model files that are necessary for image generation. For this example you will set the value to /data/local/tmp/image_generator/bins/.

// Step 6 - set model path
private val MODEL_PATH = "/data/local/tmp/image_generator/bins/"

From there, scroll down to the generateImage() function. Towards the bottom of this function you will see both Step 7 and Step 8, which generate images without and with intermediate iterations, respectively. Because both of these operations happen synchronously, they're wrapped in a coroutine. Start by replacing // Step 7 - Generate without showing iterations with this block of code, which calls generate() from the ImageGenerationHelper file and then updates the UI state.

// Step 7 - Generate without showing iterations
val result = helper?.generate(prompt, iteration, seed)
_uiState.update {
    it.copy(outputBitmap = result)
}

Step 8 gets a little trickier. Because the execute() function performs only one step at a time, you need to call it once per step in a loop, determine whether the current step should be displayed to the user, and update the UI state when it should. Replace the Step 8 comment with the following:

// Step 8 - Generate with showing iterations
helper?.setInput(prompt, iteration, seed)
for (step in 0 until iteration) {
    isDisplayStep =
        (displayIteration > 0 && ((step + 1) % displayIteration == 0))
    val result = helper?.execute(isDisplayStep)

    if (isDisplayStep) {
        _uiState.update {
            it.copy(
                outputBitmap = result,
                generatingMessage = "Generating... (${step + 1}/$iteration)",
            )
        }
    }
}
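To see exactly which steps the modulo check above selects, the same condition can be factored into a small pure function (a sketch for illustration only; displayedSteps() is not part of the app):

```kotlin
// Returns the 1-based steps for which the loop above requests an
// intermediate image: every displayIteration-th step, or none at all
// when displayIteration is zero.
fun displayedSteps(iterations: Int, displayIteration: Int): List<Int> {
    if (displayIteration <= 0) return emptyList()
    return (1..iterations).filter { it % displayIteration == 0 }
}

fun main() {
    println(displayedSteps(20, 5)) // [5, 10, 15, 20]
    println(displayedSteps(20, 0)) // []
}
```

So with 20 iterations and a display interval of 5, the user sees four intermediate images plus the final result.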

At this point you should be able to install your app, initialize the image generator, and then create a new image based on a text prompt

... except now the app crashes when you try to initialize the image generator. This happens because you still need to copy the model files to your device. For the most up-to-date information on known-to-work third-party models, converting them for this MediaPipe task, and copying them to your device, review this section of the official documentation.
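If you want the app to fail gracefully rather than crash when the files have not been pushed yet, you could add a small guard before calling initializeImageGenerator(). This is a sketch under assumptions: modelFilesPresent() is a hypothetical helper, not part of the MediaPipe API.

```kotlin
import java.io.File

// Hypothetical guard: verify the model directory exists and is non-empty
// before initializing the image generator, so a missing model produces a
// clear error message instead of a crash.
fun modelFilesPresent(modelPath: String): Boolean {
    val dir = File(modelPath)
    return dir.isDirectory && (dir.listFiles()?.isNotEmpty() == true)
}
```

You might call this with your MODEL_PATH value and surface a message in the UI state when it returns false.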

Along with copying files directly to your development device, you can also set up Firebase Storage to download the necessary files to the user's device at runtime.

5. Deploy and test the app

After all of that, you should have a working app that accepts a text prompt and generates new images entirely on-device! Deploy the app to a physical Android device to test it; remember that the device needs at least 8GB of RAM.

  1. Click Run in the Android Studio toolbar to run the app.
  2. Select the type of generation steps (final or with iterations) and then press the INITIALIZE button.
  3. On the next screen, set any properties you want and click on the GENERATE button to see what the tool comes up with.


6. Congratulations!

You did it! In this codelab you have learned how to add on-device text-to-image generation to an Android app.

Next steps

There's more you can do with the image generation task, including:

  • Using a base image to structure generated images through plugins, or training your own additional LoRA weights through Vertex AI.
  • Using Firebase Storage to retrieve model files on the user's device without requiring the ADB tool.

We look forward to seeing all of the cool things you make with this experimental task. Keep an eye out for even more codelabs and content from the MediaPipe team!