ML Kit is a mobile SDK that brings Google's machine learning expertise to Android and iOS apps in a powerful yet easy-to-use package. Whether you're new or experienced in machine learning, you can easily implement the functionality you need in just a few lines of code. There's no need to have deep knowledge of neural networks or model optimization to get started.

How does it work?

ML Kit makes it easy to apply ML techniques in your apps by bringing Google's ML technologies, such as the Google Cloud Vision API, Mobile Vision, and TensorFlow Lite, together in a single SDK. Whether you need the power of cloud-based processing, the real-time capabilities of Mobile Vision's on-device models, or the flexibility of custom TensorFlow Lite models, ML Kit makes it possible with just a few lines of code.

This codelab will walk you through creating your own Android app that can automatically detect text and facial features in an image.

What you will build

In this codelab, you're going to build an Android app with Firebase ML Kit. Your app will:

  • Use the ML Kit Text Recognition API to detect text in images
  • Use the ML Kit Face Contour API to identify facial features in images
  • (Optional) Use the ML Kit Cloud Text Recognition API to expand text recognition capabilities (such as non-Latin alphabets) when the device has internet connectivity
  • Learn how to host a custom pre-trained TensorFlow Lite model using Firebase
  • Use the ML Kit Custom Model API to download the pre-trained TensorFlow Lite model to your app
  • Use the downloaded model to run inference and label images

What you'll learn

What you'll need

This codelab is focused on ML Kit. Concepts and code blocks that are not directly relevant are glossed over and provided for you to simply copy and paste.

Download the Code

Click the following link to download all the code for this codelab:

Download source code

Unpack the downloaded zip file. This will unpack a root folder (mlkit-android) with all of the resources you will need.

The mlkit repository contains two directories:

Download the TensorFlow Lite model

Click the following link to download the pre-trained TensorFlow Lite model we will be using in this codelab:

Download model

Unpack the downloaded zip file. This will unpack a root folder (mobilenet_v1_1.0_224_quant) inside which you will find the TensorFlow Lite custom model we will use in this codelab (mobilenet_v1_1.0_224_quant.tflite).

  1. Go to the Firebase console.
  2. Select Create New Project, and name your project "ML Kit Codelab".

Connect your Android app

  1. From the overview screen of your new project, click Add Firebase to your Android app.
  2. Enter the codelab's package name: com.google.firebase.codelab.mlkit.
  3. Leave the other fields blank and click Register app.

Add google-services.json file to your app

After adding the package name and selecting Continue, your browser automatically downloads a configuration file that contains all the necessary Firebase metadata for your app. Copy the google-services.json file into the app directory in your project. Skip the remaining instructions for adding the Firebase SDK to your app. This has already been done for you in the starter project.

Add the dependencies for ML Kit and the google-services plugin to your app

The google-services plugin uses the google-services.json file to configure your application to use Firebase, and the ML Kit dependencies allow you to integrate the ML Kit SDK in your app. The following lines should already be added to the end of the build.gradle file in the app directory of your project (check to confirm):

build.gradle

dependencies {
  // ...
    implementation 'com.google.firebase:firebase-ml-vision:18.0.1'
    implementation 'com.google.firebase:firebase-ml-vision-image-label-model:17.0.2'
    implementation 'com.google.firebase:firebase-ml-vision-face-model:17.0.2'
    implementation 'com.google.firebase:firebase-ml-model-interpreter:16.2.3'
}
apply plugin: 'com.google.gms.google-services'

Sync your project with Gradle files

To be sure that all dependencies are available to your app, sync your project with Gradle files at this point. Select Sync Project with Gradle Files from the Android Studio toolbar.

Now that you have imported the project into Android Studio, configured the google-services plugin with your JSON file, and added the dependencies for ML Kit, you are ready to run the app for the first time. Start the Android Studio emulator, and click Run in the Android Studio toolbar.

The app should launch on your emulator. At this point, you should see a basic layout with a drop-down field that allows you to select among six images. In the next section, you will add text recognition to your app to identify text in the images.

In this step, we will add functionality to your app to recognize text in images.

Set up and run on-device text recognition on an image

Add the following to the runTextRecognition method of the MainActivity class:

MainActivity.java

    private void runTextRecognition() {
        FirebaseVisionImage image = FirebaseVisionImage.fromBitmap(mSelectedImage);
        FirebaseVisionTextRecognizer recognizer = FirebaseVision.getInstance()
                .getOnDeviceTextRecognizer();
        mTextButton.setEnabled(false);
        recognizer.processImage(image)
                .addOnSuccessListener(
                        new OnSuccessListener<FirebaseVisionText>() {
                            @Override
                            public void onSuccess(FirebaseVisionText texts) {
                                mTextButton.setEnabled(true);
                                processTextRecognitionResult(texts);
                            }
                        })
                .addOnFailureListener(
                        new OnFailureListener() {
                            @Override
                            public void onFailure(@NonNull Exception e) {
                                // Task failed with an exception
                                mTextButton.setEnabled(true);
                                e.printStackTrace();
                            }
                        });
    }

The code above configures the text recognition detector and calls the function processTextRecognitionResult with the response.

Process the text recognition response

Add the following code to processTextRecognitionResult in the MainActivity class to parse the results and display them in your app.

MainActivity.java

    private void processTextRecognitionResult(FirebaseVisionText texts) {
        List<FirebaseVisionText.TextBlock> blocks = texts.getTextBlocks();
        if (blocks.size() == 0) {
            showToast("No text found");
            return;
        }
        mGraphicOverlay.clear();
        for (int i = 0; i < blocks.size(); i++) {
            List<FirebaseVisionText.Line> lines = blocks.get(i).getLines();
            for (int j = 0; j < lines.size(); j++) {
                List<FirebaseVisionText.Element> elements = lines.get(j).getElements();
                for (int k = 0; k < elements.size(); k++) {
                    Graphic textGraphic = new TextGraphic(mGraphicOverlay, elements.get(k));
                    mGraphicOverlay.add(textGraphic);

                }
            }
        }
    }
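The TextGraphic class used above comes from the starter project. As a rough idea of what such an overlay graphic does, the simplified sketch below draws each recognized element's bounding box and text onto the overlay canvas. It assumes the starter project's GraphicOverlay.Graphic base class takes the overlay in its constructor and exposes a draw(Canvas) method to override; the actual implementation in the starter code may additionally scale and translate coordinates into the overlay's space.

TextGraphic.java

import android.graphics.Canvas;
import android.graphics.Color;
import android.graphics.Paint;
import android.graphics.RectF;

import com.google.firebase.ml.vision.text.FirebaseVisionText;

/** Simplified sketch of a TextGraphic-style overlay graphic (illustrative only). */
public class TextGraphic extends GraphicOverlay.Graphic {

    private static final int TEXT_COLOR = Color.WHITE;
    private static final float TEXT_SIZE = 54.0f;
    private static final float STROKE_WIDTH = 4.0f;

    private final Paint rectPaint;
    private final Paint textPaint;
    private final FirebaseVisionText.Element element;

    TextGraphic(GraphicOverlay overlay, FirebaseVisionText.Element element) {
        super(overlay);
        this.element = element;

        rectPaint = new Paint();
        rectPaint.setColor(TEXT_COLOR);
        rectPaint.setStyle(Paint.Style.STROKE);
        rectPaint.setStrokeWidth(STROKE_WIDTH);

        textPaint = new Paint();
        textPaint.setColor(TEXT_COLOR);
        textPaint.setTextSize(TEXT_SIZE);
    }

    @Override
    public void draw(Canvas canvas) {
        // Draw the bounding box around the recognized element, then the text itself.
        RectF rect = new RectF(element.getBoundingBox());
        canvas.drawRect(rect, rectPaint);
        canvas.drawText(element.getText(), rect.left, rect.bottom, textPaint);
    }
}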

Run the app on the emulator

Now click Run in the Android Studio toolbar. Once the app loads, make sure that Test Image 1 (Text) is selected in the drop-down field and click the FIND TEXT button.

Your app should now look like the image below, showing the text recognition results and bounding boxes overlaid on top of the original image.

Photo: Kai Schreiber / Wikimedia Commons / CC BY-SA 2.0

Congratulations, you have just added on-device text recognition to your app using ML Kit for Firebase! On-device text recognition is great for many use cases, as it works even when your app doesn't have internet connectivity and is fast enough to use on still images as well as live video frames. However, it does have some limitations. For example, select Test Image 2 (Text) in the emulator and click FIND TEXT. Notice that on-device text recognition doesn't return meaningful results for text in non-Latin alphabets.

In a later step, we will use the cloud text recognition functionality in ML Kit to fix this issue.

In this step, we will add functionality to your app to recognize the contours of faces in images.

Set up and run on-device face contour detection on an image

Add the following to the runFaceContourDetection method of the MainActivity class:

MainActivity.java

    private void runFaceContourDetection() {
        FirebaseVisionImage image = FirebaseVisionImage.fromBitmap(mSelectedImage);
        FirebaseVisionFaceDetectorOptions options =
                new FirebaseVisionFaceDetectorOptions.Builder()
                        .setPerformanceMode(FirebaseVisionFaceDetectorOptions.FAST)
                        .setContourMode(FirebaseVisionFaceDetectorOptions.ALL_CONTOURS)
                        .build();

        mFaceButton.setEnabled(false);
        FirebaseVisionFaceDetector detector = FirebaseVision.getInstance().getVisionFaceDetector(options);
        detector.detectInImage(image)
                .addOnSuccessListener(
                        new OnSuccessListener<List<FirebaseVisionFace>>() {
                            @Override
                            public void onSuccess(List<FirebaseVisionFace> faces) {
                                mFaceButton.setEnabled(true);
                                processFaceContourDetectionResult(faces);
                            }
                        })
                .addOnFailureListener(
                        new OnFailureListener() {
                            @Override
                            public void onFailure(@NonNull Exception e) {
                                // Task failed with an exception
                                mFaceButton.setEnabled(true);
                                e.printStackTrace();
                            }
                        });

    }

The code above configures the face contour detector and calls the function processFaceContourDetectionResult with the response.

Process the face contour detection response

Add the following code to processFaceContourDetectionResult in the MainActivity class to parse the results and display them in your app.

MainActivity.java

    private void processFaceContourDetectionResult(List<FirebaseVisionFace> faces) {
        // Task completed successfully
        if (faces.size() == 0) {
            showToast("No face found");
            return;
        }
        mGraphicOverlay.clear();
        for (int i = 0; i < faces.size(); ++i) {
            FirebaseVisionFace face = faces.get(i);
            FaceContourGraphic faceGraphic = new FaceContourGraphic(mGraphicOverlay);
            mGraphicOverlay.add(faceGraphic);
            faceGraphic.updateFace(face);
        }
    }
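Similarly, the FaceContourGraphic class is provided by the starter project. The simplified sketch below shows one plausible way such a class could render the detected contours as points. It assumes the same GraphicOverlay.Graphic base class as above, including a postInvalidate() helper to trigger a redraw; the starter project's implementation may differ in detail (for example, it may translate coordinates into the overlay's space).

FaceContourGraphic.java

import android.graphics.Canvas;
import android.graphics.Color;
import android.graphics.Paint;

import com.google.firebase.ml.vision.common.FirebaseVisionPoint;
import com.google.firebase.ml.vision.face.FirebaseVisionFace;
import com.google.firebase.ml.vision.face.FirebaseVisionFaceContour;

/** Simplified sketch of a FaceContourGraphic-style overlay graphic (illustrative only). */
public class FaceContourGraphic extends GraphicOverlay.Graphic {

    private static final float POINT_RADIUS = 4.0f;

    private final Paint pointPaint;
    private volatile FirebaseVisionFace face;

    FaceContourGraphic(GraphicOverlay overlay) {
        super(overlay);
        pointPaint = new Paint();
        pointPaint.setColor(Color.YELLOW);
        pointPaint.setStyle(Paint.Style.FILL);
    }

    void updateFace(FirebaseVisionFace face) {
        this.face = face;
        postInvalidate();  // assumes the Graphic base class asks the overlay to redraw
    }

    @Override
    public void draw(Canvas canvas) {
        if (face == null) {
            return;
        }
        // Draw every point of the detected contours as a small dot.
        FirebaseVisionFaceContour contour =
                face.getContour(FirebaseVisionFaceContour.ALL_POINTS);
        for (FirebaseVisionPoint point : contour.getPoints()) {
            canvas.drawCircle(point.getX(), point.getY(), POINT_RADIUS, pointPaint);
        }
    }
}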

Run the app on the emulator

Now click Run in the Android Studio toolbar. Once the app loads, make sure that Test Image 3 (Face) is selected in the drop-down field and click the FIND FACE CONTOUR button.

Your app should now look like the image below, showing the face contour detection results, with the contours of the face overlaid as points on top of the original image.

Congratulations, you have just added on-device face contour detection to your app using ML Kit for Firebase! On-device face contour detection is great for many use cases as it works even when your app doesn't have internet connectivity and is fast enough to use on still images as well as live video frames.

The pre-trained TensorFlow Lite model we will be using in our app is the MobileNet_v1 model, which has been designed for low-latency, low-power environments and offers a good compromise between model size and accuracy. In this step, we will host this model with Firebase by uploading it to our Firebase project. This enables apps using the ML Kit SDK to automatically download the model to devices, and lets us easily manage model versions in the Firebase console.

Host the custom model with Firebase

  1. Go to the Firebase console.
  2. Select your project.
  3. Select ML Kit under the DEVELOP section in the left hand navigation.
  4. Click on the CUSTOM tab.
  5. Click on Add another model and use "cloud_model_1" as the name. This is the name we will later use to download our custom model in our Android code.
  6. In the TensorFlow Lite model section, click BROWSE and upload the mobilenet_v1_1.0_224_quant.tflite file you downloaded earlier.
  7. Click PUBLISH.

We are now ready to modify our Android code to use this hosted model.

Download the custom model from Firebase

Now that we have hosted a pre-trained custom model by uploading it to our Firebase Project, we will modify our app code to automatically download and use this model.

Add the following fields to the top of the MainActivity class to define our FirebaseModelInterpreter.

MainActivity.java

    /**
     * An instance of the driver class to run model inference with Firebase.
     */
    private FirebaseModelInterpreter mInterpreter;
    /**
     * Data configuration of input & output data of model.
     */
    private FirebaseModelInputOutputOptions mDataOptions;

Then add the following code to the initCustomModel method of the MainActivity class.

MainActivity.java

    private void initCustomModel() {
        mLabelList = loadLabelList(this);

        int[] inputDims = {DIM_BATCH_SIZE, DIM_IMG_SIZE_X, DIM_IMG_SIZE_Y, DIM_PIXEL_SIZE};
        int[] outputDims = {DIM_BATCH_SIZE, mLabelList.size()};
        try {
            mDataOptions =
                    new FirebaseModelInputOutputOptions.Builder()
                            .setInputFormat(0, FirebaseModelDataType.BYTE, inputDims)
                            .setOutputFormat(0, FirebaseModelDataType.BYTE, outputDims)
                            .build();
            FirebaseModelDownloadConditions conditions = new FirebaseModelDownloadConditions
                    .Builder()
                    .requireWifi()
                    .build();
            FirebaseLocalModelSource localSource =
                    new FirebaseLocalModelSource.Builder("asset")
                            .setAssetFilePath(LOCAL_MODEL_ASSET).build();

            FirebaseCloudModelSource cloudSource = new FirebaseCloudModelSource.Builder
                    (HOSTED_MODEL_NAME)
                    .enableModelUpdates(true)
                    .setInitialDownloadConditions(conditions)
                    .setUpdatesDownloadConditions(conditions)  // You could also specify
                    // different conditions
                    // for updates
                    .build();
            FirebaseModelManager manager = FirebaseModelManager.getInstance();
            manager.registerLocalModelSource(localSource);
            manager.registerCloudModelSource(cloudSource);
            FirebaseModelOptions modelOptions =
                    new FirebaseModelOptions.Builder()
                            .setCloudModelName(HOSTED_MODEL_NAME)
                            .setLocalModelName("asset")
                            .build();
            mInterpreter = FirebaseModelInterpreter.getInstance(modelOptions);
        } catch (FirebaseMLException e) {
            showToast("Error while setting up the model");
            e.printStackTrace();
        }
    }

Note how we use FirebaseModelInputOutputOptions in the code to specify the inputs our custom model expects and the outputs it generates. In the case of the MobileNet_v1 model, the input is a 224x224 pixel image and the output is a one-dimensional list of values, one per label. We then set up the conditions under which our custom model should be downloaded to the device and register it with FirebaseModelManager.
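For reference, the constants and the loadLabelList helper referenced by initCustomModel are defined in the starter project. The sketch below shows plausible definitions consistent with the quantized MobileNet_v1 224x224 model and with the "cloud_model_1" name chosen in the Firebase console; the labels file name ("labels.txt") is an assumption, and imports for android.app.Activity, java.io.BufferedReader, java.io.IOException, java.io.InputStreamReader, java.util.ArrayList, and java.util.List are assumed.

MainActivity.java

    // Input tensor dimensions for the quantized MobileNet_v1 model:
    // 1 image per batch, 224x224 pixels, 3 color channels (R, G, B).
    private static final int DIM_BATCH_SIZE = 1;
    private static final int DIM_IMG_SIZE_X = 224;
    private static final int DIM_IMG_SIZE_Y = 224;
    private static final int DIM_PIXEL_SIZE = 3;
    // File name of the model bundled in app/src/main/assets.
    private static final String LOCAL_MODEL_ASSET = "mobilenet_v1_1.0_224_quant.tflite";
    // Name given to the hosted model in the Firebase console.
    private static final String HOSTED_MODEL_NAME = "cloud_model_1";

    // Labels corresponding to the model's output indices, loaded from assets.
    private List<String> mLabelList;

    /** Reads the model's label list from assets; the file name "labels.txt" is an assumption. */
    private List<String> loadLabelList(Activity activity) {
        List<String> labelList = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(activity.getAssets().open("labels.txt")))) {
            String line;
            while ((line = reader.readLine()) != null) {
                labelList.add(line);
            }
        } catch (IOException e) {
            Log.e(TAG, "Failed to read label list.", e);
        }
        return labelList;
    }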

Bundle a local version of the model for offline scenarios

Hosting a model with Firebase allows you to make updates to the model and have those updates automatically downloaded to your users. However, in situations with poor internet connectivity, you may also want to bundle a local version of your model. By both hosting the model on Firebase and bundling it locally, you ensure that the most recent version of the model is used when network connectivity is available, while your app's ML features still work when the Firebase-hosted model isn't available.

We have already added the code to do this in the previous code snippet: we first created a FirebaseLocalModelSource and called registerLocalModelSource to register it with our FirebaseModelManager. All you have to do is add the mobilenet_v1_1.0_224_quant.tflite file you downloaded earlier to the assets folder inside app/src/main.

Then, add the following to the build.gradle file in your project's app module:

build.gradle

android {

    // ...

    aaptOptions {
        noCompress "tflite"
    }
}

The .tflite file will now be included in the app package and available to ML Kit as a raw asset.

In this step, we will define a function that uses the FirebaseModelInterpreter we configured in the previous step to run inference using the downloaded or local custom model.

Add code to use the downloaded/local model in your app

Copy the following code into the runModelInference method in the MainActivity class.

MainActivity.java

    private void runModelInference() {
        if (mInterpreter == null) {
            Log.e(TAG, "Image classifier has not been initialized; Skipped.");
            return;
        }
        // Create input data.
        ByteBuffer imgData = convertBitmapToByteBuffer(mSelectedImage, mSelectedImage.getWidth(),
                mSelectedImage.getHeight());

        try {
            FirebaseModelInputs inputs = new FirebaseModelInputs.Builder().add(imgData).build();
            // Here's where the magic happens!!
            mInterpreter
                    .run(inputs, mDataOptions)
                    .addOnFailureListener(new OnFailureListener() {
                        @Override
                        public void onFailure(@NonNull Exception e) {
                            e.printStackTrace();
                            showToast("Error running model inference");
                        }
                    })
                    .continueWith(
                            new Continuation<FirebaseModelOutputs, List<String>>() {
                                @Override
                                public List<String> then(Task<FirebaseModelOutputs> task) {
                                    byte[][] labelProbArray = task.getResult()
                                            .<byte[][]>getOutput(0);
                                    List<String> topLabels = getTopLabels(labelProbArray);
                                    mGraphicOverlay.clear();
                                    GraphicOverlay.Graphic labelGraphic = new LabelGraphic
                                            (mGraphicOverlay, topLabels);
                                    mGraphicOverlay.add(labelGraphic);
                                    return topLabels;
                                }
                            });
        } catch (FirebaseMLException e) {
            e.printStackTrace();
            showToast("Error running model inference");
        }

    }

ML Kit handles downloading and running the model automatically (or using the local bundled version if the hosted model can't be downloaded), and provides the results with task.getResult(). We then sort and display these results in our app UI.
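The helpers convertBitmapToByteBuffer and getTopLabels used above are also provided by the starter project. As a rough sketch of what they do: the first scales the selected bitmap to 224x224 and packs its R, G, B channels as unsigned bytes (the input format the quantized MobileNet expects), and the second converts the quantized output bytes back into human-readable label strings. RESULTS_TO_SHOW and the exact output formatting are assumptions, and imports for android.graphics.Bitmap, java.nio.ByteBuffer, java.nio.ByteOrder, and the java.util collections are assumed.

MainActivity.java

    // Assumption: how many of the highest-confidence labels to display.
    private static final int RESULTS_TO_SHOW = 3;

    private ByteBuffer convertBitmapToByteBuffer(Bitmap bitmap, int width, int height) {
        // Allocate one quantized byte per channel for a single 224x224 RGB image.
        ByteBuffer imgData = ByteBuffer.allocateDirect(
                DIM_BATCH_SIZE * DIM_IMG_SIZE_X * DIM_IMG_SIZE_Y * DIM_PIXEL_SIZE);
        imgData.order(ByteOrder.nativeOrder());
        // The model needs a fixed 224x224 input, so scale the bitmap to that size.
        Bitmap scaled = Bitmap.createScaledBitmap(bitmap, DIM_IMG_SIZE_X, DIM_IMG_SIZE_Y, true);
        int[] pixels = new int[DIM_IMG_SIZE_X * DIM_IMG_SIZE_Y];
        scaled.getPixels(pixels, 0, scaled.getWidth(), 0, 0, scaled.getWidth(), scaled.getHeight());
        for (int pixel : pixels) {
            // Write the R, G and B channels as unsigned bytes.
            imgData.put((byte) ((pixel >> 16) & 0xFF));
            imgData.put((byte) ((pixel >> 8) & 0xFF));
            imgData.put((byte) (pixel & 0xFF));
        }
        return imgData;
    }

    private synchronized List<String> getTopLabels(byte[][] labelProbArray) {
        // Keep the highest-scoring labels in a small min-heap.
        PriorityQueue<Map.Entry<String, Float>> sortedLabels =
                new PriorityQueue<>(RESULTS_TO_SHOW,
                        new Comparator<Map.Entry<String, Float>>() {
                            @Override
                            public int compare(Map.Entry<String, Float> o1,
                                               Map.Entry<String, Float> o2) {
                                return o1.getValue().compareTo(o2.getValue());
                            }
                        });
        for (int i = 0; i < mLabelList.size(); ++i) {
            // Convert the quantized byte (0-255) into a confidence between 0 and 1.
            sortedLabels.add(new AbstractMap.SimpleEntry<>(
                    mLabelList.get(i), (labelProbArray[0][i] & 0xff) / 255.0f));
            if (sortedLabels.size() > RESULTS_TO_SHOW) {
                sortedLabels.poll();  // drop the current lowest-scoring entry
            }
        }
        List<String> topLabels = new ArrayList<>();
        for (Map.Entry<String, Float> entry : sortedLabels) {
            topLabels.add(entry.getKey() + ": " + entry.getValue());
        }
        return topLabels;
    }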

Run the app on the emulator

Now click Run in the Android Studio toolbar. Once the app loads, make sure that Test Image 4 (Object) is selected in the drop-down field and click the FIND OBJECTS button.

Your app should now look like the image below, showing the model inference results and the detected image labels with their confidence levels.

In a previous step, you added on-device text recognition to the app, which runs quickly, works without an internet connection, and is free. However, it does have some limitations. For example, select Test Image 2 (Text) in the emulator and click FIND TEXT. Notice that on-device text recognition doesn't return meaningful results for text in non-Latin alphabets. In this step, you will add cloud text recognition to your app using ML Kit for Firebase. This will allow you to detect more types of text in images, such as non-Latin alphabets.

Switch your Firebase project to the Blaze plan

Only Blaze-level projects can use the Cloud Vision APIs. Follow these steps to switch your project to the Blaze plan and enable pay-as-you-go billing.

  1. Open your project in the Firebase console.
  2. Click on the MODIFY link in the lower left corner next to the currently selected Spark plan.
  3. Select the Blaze plan and follow the instructions in the Firebase Console to add a billing account.

Enable the Cloud Vision API

You need to enable the Cloud Vision API in order to use cloud text recognition in ML Kit.

  1. Open the Cloud Vision API in the Cloud Console API library.
  2. Ensure that your Firebase project is selected in the menu at the top of the page.
  3. If the API is not already enabled, click Enable.

Set up and run cloud text recognition on an image

Add the following to the runCloudTextRecognition method of the MainActivity class:

MainActivity.java

    private void runCloudTextRecognition() {
        mCloudButton.setEnabled(false);
        FirebaseVisionImage image = FirebaseVisionImage.fromBitmap(mSelectedImage);
        FirebaseVisionDocumentTextRecognizer recognizer = FirebaseVision.getInstance()
                .getCloudDocumentTextRecognizer();
        recognizer.processImage(image)
                .addOnSuccessListener(
                        new OnSuccessListener<FirebaseVisionDocumentText>() {
                            @Override
                            public void onSuccess(FirebaseVisionDocumentText texts) {
                                mCloudButton.setEnabled(true);
                                processCloudTextRecognitionResult(texts);
                            }
                        })
                .addOnFailureListener(
                        new OnFailureListener() {
                            @Override
                            public void onFailure(@NonNull Exception e) {
                                // Task failed with an exception
                                mCloudButton.setEnabled(true);
                                e.printStackTrace();
                            }
                        });
    }

The code above configures the text recognition detector and calls the function processCloudTextRecognitionResult with the response.

Process the text recognition response

Add the following code to processCloudTextRecognitionResult in the MainActivity class to parse the results and display them in your app.

MainActivity.java

    private void processCloudTextRecognitionResult(FirebaseVisionDocumentText text) {
        // Task completed successfully
        if (text == null) {
            showToast("No text found");
            return;
        }
        mGraphicOverlay.clear();
        List<FirebaseVisionDocumentText.Block> blocks = text.getBlocks();
        for (int i = 0; i < blocks.size(); i++) {
            List<FirebaseVisionDocumentText.Paragraph> paragraphs = blocks.get(i).getParagraphs();
            for (int j = 0; j < paragraphs.size(); j++) {
                List<FirebaseVisionDocumentText.Word> words = paragraphs.get(j).getWords();
                for (int l = 0; l < words.size(); l++) {
                    CloudTextGraphic cloudDocumentTextGraphic = new CloudTextGraphic(mGraphicOverlay,
                            words.get(l));
                    mGraphicOverlay.add(cloudDocumentTextGraphic);
                }
            }
        }
    }

Run the app on the emulator

Now click Run in the Android Studio toolbar. Once the app loads, select Test Image 2 (Text) in the drop-down field and click the FIND TEXT (CLOUD) button. Notice that we are now able to successfully recognize the non-Latin characters in the image!

Cloud text recognition in ML Kit is ideally suited if:

You have used ML Kit for Firebase to easily add advanced machine learning capabilities to your app.

What we've covered

Next Steps

Learn More