ML Kit is a mobile SDK that brings Google's machine learning expertise to Android and iOS apps in a powerful yet easy-to-use package. Whether you're new or experienced in machine learning, you can easily implement the functionality you need in just a few lines of code. There's no need to have deep knowledge of neural networks or model optimization to get started. On the other hand, if you are an experienced ML developer, see the Custom Machine Learning Models with ML Kit codelab to learn how ML Kit makes it easy to use your custom TensorFlow Lite models in your mobile apps.

How does it work?

ML Kit makes it easy to apply ML techniques in your apps by bringing Google's ML technologies, such as the Google Cloud Vision API, Mobile Vision, and TensorFlow Lite, together in a single SDK. Whether you need the power of cloud-based processing, the real-time capabilities of Mobile Vision's on-device models, or the flexibility of custom TensorFlow Lite models, ML Kit makes it possible with just a few lines of code.

This codelab will walk you through creating your own Android app that can automatically detect objects in a provided image.

What you will build

In this codelab, you're going to build an Android app with Firebase ML Kit. Your app will:

  • Use the ML Kit Image Labeling API to detect objects in a provided image
  • Use the ML Kit Cloud Image Labeling API to expand object recognition capabilities (detecting more than 10,000 unique objects) when the device has internet connectivity

What you'll learn

  • How to use the on-device ML Kit Image Labeling API to label objects in an image
  • How to use the cloud-based ML Kit Cloud Image Labeling API for broader, more accurate labeling

What you'll need

  • A recent version of Android Studio
  • An Android emulator or physical Android device
  • The sample code (see the next step)

This codelab is focused on ML Kit. Non-relevant concepts and code blocks are glossed over and are provided for you to simply copy and paste.

Download the Code

Click the following link to download all the code for this codelab:

Download source code

Unpack the downloaded zip file. This creates a root folder (mlkit-android) with all of the resources you will need. For this codelab, you will only need the resources in the image-labeling subdirectory.

The image-labeling subdirectory in the mlkit repository contains two directories.

Create a Firebase console project

  1. Go to the Firebase console.
  2. Select Create New Project, and name your project "ML Kit Codelab."

Connect your Android app

  1. From the overview screen of your new project, click Add Firebase to your Android app.
  2. Enter the codelab's package name: com.google.firebase.codelab.image_labeling.

Add google-services.json file to your app

After adding the package name and selecting Continue, your browser automatically downloads a configuration file that contains all the necessary Firebase metadata for your app. Copy the google-services.json file into the app directory in your project.

Add the dependencies for ML Kit and the google-services plugin to your app

The google-services plugin uses the google-services.json file to configure your application to use Firebase, and the ML Kit dependencies allow you to integrate the ML Kit SDK in your app. The following lines should already be added to the end of the build.gradle file in the app directory of your project (check to confirm):

build.gradle

dependencies {
  // ...
  implementation 'com.google.firebase:firebase-ml-vision:17.0.0'
  implementation 'com.google.firebase:firebase-ml-vision-image-label-model:15.0.0'
}
apply plugin: 'com.google.gms.google-services'

Sync your project with Gradle files

To be sure that all dependencies are available to your app, you should sync your project with Gradle files at this point. Select Sync Project with Gradle Files from the Android Studio toolbar.

Now that you have imported the project into Android Studio, configured the google-services plugin with your JSON file, and added the dependencies for ML Kit, you are ready to run the app for the first time. Start the Android Studio emulator, and click Run in the Android Studio toolbar.

The app should launch on your emulator. At this point, you should see a basic layout that has a Camera preview along with a FloatingActionButton to capture the image that's currently being shown in the preview.

In this step, you will add functionality to your app to label objects in images.

Set up and run on-device image labeling on an image

When it comes to the implementation, the APIs provided by ML Kit have three main steps:

Step 1: Create a FirebaseVisionImage object from the captured Bitmap.

Step 2: Create the detector for the API. For image labeling, this is the FirebaseVisionLabelDetector.

Step 3: Run the detector over the FirebaseVisionImage and attach onSuccess and onFailure listeners.

Add the following code snippet to the runImageLabeling method of ImageLabelActivity class:

ImageLabelActivity.kt

    private fun runImageLabeling(bitmap: Bitmap) {
        // Create a FirebaseVisionImage from the captured Bitmap
        val image = FirebaseVisionImage.fromBitmap(bitmap)

        // Get an instance of the on-device FirebaseVisionLabelDetector
        val detector = FirebaseVision.getInstance().visionLabelDetector

        // Run the detector over the image and attach the listeners
        detector.detectInImage(image)
                .addOnSuccessListener {
                    // Task completed successfully: show the detected labels
                    progressBar.visibility = View.GONE
                    itemAdapter.setList(it)
                    sheetBehavior.setState(BottomSheetBehavior.STATE_EXPANDED)
                }
                .addOnFailureListener {
                    // Task failed with an exception
                    progressBar.visibility = View.GONE
                    Toast.makeText(baseContext, "Sorry, something went wrong!", Toast.LENGTH_SHORT).show()
                }
    }

The code above configures the image label detector and then runs it over the image to get a list of FirebaseVisionLabel objects containing the label, entity ID and the confidence score as a response.
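Optionally, you can construct the on-device detector with options instead of using the default instance. The snippet below is a minimal sketch showing how to drop low-confidence results; the 0.7f threshold is an arbitrary example value, not something this codelab requires:

// Optional sketch: build the on-device detector with a minimum confidence
// threshold so that low-confidence labels are filtered out.
// The 0.7f value here is just an example.
val options = FirebaseVisionLabelDetectorOptions.Builder()
        .setConfidenceThreshold(0.7f)
        .build()

val detector = FirebaseVision.getInstance().getVisionLabelDetector(options)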

Understanding the response

The detector's onSuccess listener receives a List of FirebaseVisionLabel objects, each of which contains three things:

Confidence: The overall confidence of the result, ranging from 0.0f to 1.0f.

Label: The name of the detected label (object) in the given image.

Entity ID: The unique entity ID for the detected label.
The full list of IDs is available through the Google Knowledge Graph Search API.
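For example, here is a minimal sketch (not part of the starter code) of how you could log these three fields for each detected label:

// Sketch: log the fields of each FirebaseVisionLabel. The "ImageLabel" log
// tag is a placeholder; use whatever tag your app prefers.
private fun logLabels(labels: List<FirebaseVisionLabel>) {
    for (label in labels) {
        Log.d("ImageLabel", "Label: ${label.label}, " +
                "Confidence: ${label.confidence}, " +
                "Entity ID: ${label.entityId}")
    }
}

You could call this helper from inside the onSuccess listener shown above, passing it the detected list.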

This list of detected labels is then passed on to the adapter, which populates and displays it inside a RecyclerView, along the lines of the sketch below.
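The starter project ships its own adapter implementation, but conceptually it boils down to something like the following hypothetical sketch (LabelAdapter and its layout choice are illustrative stand-ins, not the actual starter code):

// Hypothetical sketch of a label adapter: it stores the detected labels and
// redraws the RecyclerView whenever they change.
class LabelAdapter : RecyclerView.Adapter<LabelAdapter.Holder>() {

    private var labels: List<FirebaseVisionLabel> = emptyList()

    // Called from the onSuccess listener with the freshly detected labels
    fun setList(newLabels: List<FirebaseVisionLabel>) {
        labels = newLabels
        notifyDataSetChanged()
    }

    class Holder(view: View) : RecyclerView.ViewHolder(view)

    override fun getItemCount() = labels.size

    override fun onCreateViewHolder(parent: ViewGroup, viewType: Int) =
            Holder(LayoutInflater.from(parent.context)
                    .inflate(android.R.layout.simple_list_item_1, parent, false))

    override fun onBindViewHolder(holder: Holder, position: Int) {
        // simple_list_item_1's root view is a TextView
        (holder.itemView as TextView).text =
                "${labels[position].label}: ${labels[position].confidence}"
    }
}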

Run the app on the emulator

Now click Run in the Android Studio toolbar. Once the app loads, point your camera at an object, press the FloatingActionButton with the camera icon, and wait for the app to process the image.

Once done, your app should look like the image below, showing the name and confidence of the detected objects along with the image you captured.

Photo: Harshit Dwivedi

Congratulations, you have just added on-device object detection to your app using ML Kit for Firebase! On-device object detection is great for many use cases as it works even when your app doesn't have internet connectivity and is fast enough to use on still images as well as live video frames. However, it does have some limitations. For example, you may notice that the results are not very accurate as the Laptop here is identified as a Musical Instrument and a Television.

You will see similarly inaccurate results when trying to identify other objects with on-device image labeling.

In the next step, we will use the cloud image labeling functionality in ML Kit to fix this issue.

In this step, you will add cloud image labeling to your app using ML Kit for Firebase. The cloud-based model can recognize many more types of objects, covering more than 10,000 unique labels.

Switch your Firebase project to the Blaze plan

Only Blaze-level projects can use the Cloud Vision APIs. Follow these steps to switch your project to the Blaze plan and enable pay-as-you-go billing.

  1. Open your project in the Firebase console.
  2. Click on the MODIFY link in the lower left corner next to the currently selected Spark plan.
  3. Select the Blaze plan and follow the instructions in the Firebase Console to add a billing account.

Enable the Cloud Vision API

You need to enable the Cloud Vision API in order to use cloud image labeling in ML Kit.

  1. Open the Cloud Vision API in the Cloud Console API library.
  2. Ensure that your Firebase project is selected in the menu at the top of the page.
  3. If the API is not already enabled, click Enable.

Set up and run cloud image labeling on an image

Add the following to the runCloudImageLabeling method of the ImageLabelActivity class:

ImageLabelActivity.kt

    private fun runCloudImageLabeling(bitmap: Bitmap) {
        // Create a FirebaseVisionImage from the captured Bitmap
        val image = FirebaseVisionImage.fromBitmap(bitmap)

        // Get an instance of the cloud-based FirebaseVisionCloudLabelDetector
        val detector = FirebaseVision.getInstance().visionCloudLabelDetector

        // Run the detector over the image and attach the listeners
        detector.detectInImage(image)
                .addOnSuccessListener {
                    // Task completed successfully: show the detected labels
                    progressBar.visibility = View.GONE
                    itemAdapter.setList(it)
                    sheetBehavior.setState(BottomSheetBehavior.STATE_EXPANDED)
                }
                .addOnFailureListener {
                    // Task failed with an exception
                    progressBar.visibility = View.GONE
                    Toast.makeText(baseContext, "Sorry, something went wrong!", Toast.LENGTH_SHORT).show()
                }
    }

Next, change the onClick method on line 69 of ImageLabelActivity to the following:

ImageLabelActivity.kt

    override fun onClick(v: View?) {
        progressBar.visibility = View.VISIBLE
        cameraView.captureImage { cameraKitImage ->
            // Get the Bitmap from the captured shot and run cloud labeling on it
            runCloudImageLabeling(cameraKitImage.bitmap)
            runOnUiThread {
                showPreview()
                imagePreview.setImageBitmap(cameraKitImage.bitmap)
            }
        }
    }

The code above configures the cloud image label detector and then runs it over the image to get a list of FirebaseVisionCloudLabel objects containing the label, entity ID, and confidence score as a response.
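By default the cloud detector uses the stable model version; if you want more results or the latest model, you can pass options when creating the detector. The values below are examples, not codelab requirements:

// Optional sketch: configure the cloud label detector with the latest model
// version and a higher maximum number of results (example values).
val options = FirebaseVisionCloudDetectorOptions.Builder()
        .setModelType(FirebaseVisionCloudDetectorOptions.LATEST_MODEL)
        .setMaxResults(15)
        .build()

val detector = FirebaseVision.getInstance().getVisionCloudLabelDetector(options)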

Run the app on the emulator

Now click Run in the Android Studio toolbar. Once the app loads, point your camera at an object, press the FloatingActionButton with the camera icon, and wait for the app to process the image.

Once done, your app should look like the image below, showing the name and confidence of the detected objects along with the image you captured.

Notice that the items displayed in the RecyclerView are now more accurate and closely resemble the item in the captured image.

Cloud image labeling in ML Kit is ideally suited if:

  • You want to recognize a broader range of objects (the cloud model covers more than 10,000 labels)
  • Accuracy matters more to your use case than on-device speed
  • Your app can rely on internet connectivity
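If you want to combine the two approaches, one possible pattern (a sketch, not part of the starter code) is to check connectivity at runtime and fall back to the on-device detector when the device is offline:

// Sketch: use the cloud detector when the device is online, otherwise fall
// back to on-device labeling. runBestAvailableLabeling is a hypothetical
// helper, not something the starter project defines.
private fun runBestAvailableLabeling(bitmap: Bitmap) {
    val cm = getSystemService(Context.CONNECTIVITY_SERVICE) as ConnectivityManager
    val online = cm.activeNetworkInfo?.isConnected == true
    if (online) {
        runCloudImageLabeling(bitmap)
    } else {
        runImageLabeling(bitmap)
    }
}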

Congratulations! You have used ML Kit for Firebase to easily add advanced machine learning capabilities to your app.

What we've covered

  • Adding ML Kit for Firebase to an Android app
  • Using the on-device Image Labeling API in ML Kit to label objects in an image
  • Using the Cloud Image Labeling API in ML Kit for broader, more accurate labeling

Next Steps

Learn More