ML Kit is a mobile SDK that brings Google's machine learning expertise to Android and iOS apps in a powerful yet easy-to-use package. Whether you're new or experienced in machine learning, you can easily implement the functionality you need in just a few lines of code. There's no need to have deep knowledge of neural networks or model optimization to get started.

How does it work?

ML Kit makes it easy to apply ML techniques in your apps by bringing Google's ML technologies, such as Object Detection and Tracking, Google Cloud Vision API, Mobile Vision, and TensorFlow Lite, together in a single SDK. Whether you need the power of cloud-based processing, the real-time capabilities of Mobile Vision's on-device models, or the flexibility of custom TensorFlow Lite models, ML Kit makes it possible with just a few lines of code.

This codelab will walk you through simple steps to add Object Detection and Tracking (ODT) for a given image to your existing Android app. Please note that this codelab takes some shortcuts to highlight ML Kit ODT usage.

What you will build

In this codelab, you're going to build an Android app with ML Kit for Firebase. Your app will use the ML Kit Object Detection API to detect and track objects in a given image.

In the end, you should see something similar to the image on the right.

What you'll learn

What you'll need

This codelab is focused on ML Kit. Non-relevant concepts and code blocks are glossed over and are provided for you to simply copy and paste.

Download the Code

Click the following link to download all the code for this codelab:

Download source code

Unpack the downloaded zip file. This will unpack a root folder (mlkit-android) with all of the resources you will need. For this codelab, you will only need the sources in the object-detection subdirectory.

The object-detection subdirectory in the mlkit-android repository contains two directories: starter (the baseline code that you build on in this codelab) and final (the completed code for the finished sample app).

You can either create a new Firebase project or use an existing Firebase project for this codelab; detailed steps are on the Firebase documentation page. Note that if the project has already been created by someone else on the kiosk, you do not need to recreate it; just follow along to make sure the codelab app is added to your Firebase console project.

  1. Go to the Firebase console.
  2. If the project "MLKit Codelab" is not there, create it by selecting "Add project" and following the on-screen prompts; if the "MLKit Codelab" project already exists, just select it by clicking on it.
  3. On the "Get started by adding Firebase to your app" page, select the Android app type to add.
  4. On the "Add Firebase to your app" screen, register the package name for this codelab:
    com.google.firebase.mlkit.codelab.objectdetection
  5. Give it a nickname you like, and click on "Register app" to register the codelab app

Add google-services.json file to your app

After you add the package name and select Continue, your browser downloads a configuration file that contains all the necessary Firebase metadata for your app. Copy the google-services.json file into the starter/app directory (and also into the final/app directory if you would like to run the final project).

Add the dependencies for ML Kit and the google-services plugin to your app

The google-services plugin uses the google-services.json file to configure your application to use Firebase, and the ML Kit dependencies allow you to integrate the ML Kit SDK into your app. The following lines should already be added to the end of the app/build.gradle file of your project (check to confirm):

build.gradle

dependencies {
  // ...
  implementation 'com.google.firebase:firebase-ml-vision:20.0.0'
  implementation 'com.google.firebase:firebase-ml-vision-object-detection-model:16.0.0'
}
apply plugin: 'com.google.gms.google-services'
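
Note that the google-services plugin applied above also needs to be declared on the buildscript classpath of the project-level build.gradle. The starter project should already include this; purely as a point of reference, a typical entry looks something like the following (the version number here is only illustrative, not necessarily what the starter project uses):

buildscript {
    // ...
    dependencies {
        // ...
        // google-services Gradle plugin; the exact version shown here is illustrative
        classpath 'com.google.gms:google-services:4.3.3'
    }
}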

Sync your project with Gradle files

Open your starter project if you have not done so yet (Android Studio menu File > Open, navigate to starter/app/build.gradle and open it). To be sure that all dependencies are available to your app, you should sync your project with Gradle files at this point. Select Sync Project with Gradle Files () from the Android Studio toolbar.

(If this button is disabled, make sure you imported only starter/app/build.gradle, not the entire repository.)

Now that you have imported the project into Android Studio, configured the google-services plugin with your JSON file, and added the dependencies for ML Kit, you are ready to run the app for the first time. Connect your Android device to your host via USB, or start the Android Studio emulator, and click Run (execute.png) in the Android Studio toolbar.

The app should launch on your Android device. At this point, you should see a basic layout that has an image view along with a FloatingActionButton () that will:

  1. bring up the camera app integrated into your device/emulator
  2. let you take a photo inside the camera app and accept it ()
  3. deliver the captured image back to the starter app (the camera app closes itself)
  4. display the captured image

Try out the "take a photo" button: follow the prompts to grant the permissions, take a photo, accept it, and observe it displayed inside the starter app. Repeat a few times to see how it works:
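
The starter app already implements this capture flow, so you do not need to write it yourself. Purely for orientation, here is a minimal sketch of how an app might dispatch such a capture intent; the REQUEST_IMAGE_CAPTURE constant is an assumption, and the starter's actual implementation may differ (for example, it may request a full-size photo via EXTRA_OUTPUT rather than the default thumbnail):

// Hypothetical sketch only; the starter app ships its own implementation.
// REQUEST_IMAGE_CAPTURE is an assumed request-code constant.
private fun dispatchTakePictureIntent() {
    val takePictureIntent = Intent(MediaStore.ACTION_IMAGE_CAPTURE)
    // only launch the camera if an app can handle the intent
    takePictureIntent.resolveActivity(packageManager)?.also {
        startActivityForResult(takePictureIntent, REQUEST_IMAGE_CAPTURE)
    }
}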

Note:

In this step, we will add the functionality to the starter app to detect objects in images.

Set up and run on-device object detection on an image

There are only 3 simple steps with 3 APIs to set up ML Kit ODT: create a FirebaseVisionImage, create a detector instance, and feed the image to the detector.

You achieve these inside the function runObjectDetection(bitmap: Bitmap) in the file MainActivity.kt.

/**
 * ML Kit Object Detection function
 */
private fun runObjectDetection(bitmap: Bitmap) {
}

Right now the function is empty. Move on to the following steps to implement ML Kit ODT! Along the way, Android Studio will prompt you to add the necessary imports.

Step 1: Create Image Object

The images for this codelab come from the on-device camera via MediaStore.ACTION_IMAGE_CAPTURE; after you accept an image in the camera app, it becomes available to the starter app, which has already decoded it into a Bitmap. ML Kit provides a simple API to create a FirebaseVisionImage from a Bitmap:

// Step 1: create MLKit's VisionImage object
val image = FirebaseVisionImage.fromBitmap(bitmap)

Add the above code to the top of runObjectDetection(bitmap: Bitmap).

Step 2: Create Detector Instance

ML Kit follows the Builder design pattern: you pass the configuration to a builder and then acquire a detector from it. There are 3 options to configure: the detector mode (single image or streaming), whether to detect multiple objects, and whether to classify the detected objects.

This codelab detects and classifies multiple objects in a single image, so let's configure that:

// Step 2: acquire detector object
val options = FirebaseVisionObjectDetectorOptions.Builder()
   .setDetectorMode(FirebaseVisionObjectDetectorOptions.SINGLE_IMAGE_MODE)
   .enableMultipleObjects()
   .enableClassification()
   .build()
val detector = FirebaseVision.getInstance().getOnDeviceObjectDetector(options)

Step 3: Feed Image(s) to the detector

Object detection and classification are asynchronous: you feed an image to the detector via processImage() and receive the detected objects (or an exception) in listener callbacks.

The following code does just that (copy and append it to the existing code inside fun runObjectDetection(bitmap: Bitmap)):

// Step 3: feed given image to detector and setup callback
detector.processImage(image)
   .addOnSuccessListener {
       // Task completed successfully
        debugPrint(it)
   }
   .addOnFailureListener {
       // Task failed with an exception
       Toast.makeText(baseContext, "Oops, something went wrong!", 
                      Toast.LENGTH_SHORT).show()
   }

Upon completion, the detector notifies the success listener with:

  1. the total number of objects detected
  2. a description of each detected object: its boundingBox, classificationCategory, trackingId, entityId, and classificationConfidence (present when the category is not CATEGORY_UNKNOWN)

You probably noticed that the code does printf-style processing of the detected results with debugPrint(). Add it to the MainActivity class:

    private fun debugPrint(visionObjects: List<FirebaseVisionObject>) {
        val LOG_MOD = "MLKit-ODT"
        for ((idx, obj) in visionObjects.withIndex()) {
            val box = obj.boundingBox

            Log.d(LOG_MOD, "Detected object: ${idx} ")
            Log.d(LOG_MOD, "  Category: ${obj.classificationCategory}")
            Log.d(LOG_MOD, "  trackingId: ${obj.trackingId}")
            Log.d(LOG_MOD, "  entityId: ${obj.entityId}")
            Log.d(LOG_MOD, "  boundingBox: (${box.left}, ${box.top}) - (${box.right},${box.bottom})")
            if (obj.classificationCategory != FirebaseVisionObject.CATEGORY_UNKNOWN) {
                val confidence: Int = obj.classificationConfidence!!.times(100).toInt()
                Log.d(LOG_MOD, "  Confidence: ${confidence}%")
            }
        }
    }    

You are ready to accept images for detection! Compile and run the app: you should see the same behavior as before (because we have not fed an image to the detector yet, but the detector is ready).

Final Step: Run the detector on captured images

Now plug the detection function into the captured image flow! In the onActivityResult() function, call runObjectDetection(image) right after you send the same image for display:

// TODO: run through ODT and display result
runObjectDetection(image)
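
If it helps to see the call in context, here is a minimal sketch of what onActivityResult() might look like with that line in place. The REQUEST_IMAGE_CAPTURE constant and the getCapturedImage() helper are assumptions used only for illustration; the starter app's actual names may differ.

override fun onActivityResult(requestCode: Int, resultCode: Int, data: Intent?) {
    super.onActivityResult(requestCode, resultCode, data)
    if (requestCode == REQUEST_IMAGE_CAPTURE && resultCode == RESULT_OK) {
        // decode the captured photo into a Bitmap (helper name is an assumption)
        val image = getCapturedImage()
        // display the captured image, as the starter app already does
        imageView.setImageBitmap(image)
        // TODO: run through ODT and display result
        runObjectDetection(image)
    }
}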

Now compile and run the codelab by clicking Run (execute.png) in the Android Studio toolbar; look at the logcat window () inside the IDE, and you should see something similar to:

D/MLKit-ODT: Detected object: 0 
D/MLKit-ODT:   Category: 1
D/MLKit-ODT:   trackingId: null
D/MLKit-ODT:   entityId: /g/11g0srqwrg
D/MLKit-ODT:   boundingBox: (88, 810) - (1790,3669)
D/MLKit-ODT:   Confidence: 91%
D/MLKit-ODT: Detected object: 1 
D/MLKit-ODT:   Category: 0
D/MLKit-ODT:   trackingId: null
D/MLKit-ODT:   entityId: /m/0bl9f
D/MLKit-ODT:   boundingBox: (14, 518) - (1449,1562)

which means that the detector saw 2 objects: the first with a recognized category at 91% confidence, and the second with category 0 (CATEGORY_UNKNOWN), which is why no confidence line is printed for it.

Technically, that is all you need to get ML Kit Object Detection working: you have it all at this moment! Congratulations! On the UI side, the app still looks the way it did when you started, but you can make use of the detected results, for example by drawing the bounding boxes, to create a better experience. Let's go to the next step: post-processing the detection results!

In the previous steps, you printed the detection results to logcat: simple and fast. In this section, you will make use of the results by drawing them onto the image:

Create the drawing function

Go to where you call debugPrint(it), and replace it with the following code snippet:

// Post-detection processing: draw result
val drawingView = DrawingView(applicationContext, it)
drawingView.draw(Canvas(bitmap))
runOnUiThread { imageView.setImageBitmap(bitmap) }

As you can see, the drawing is done through a Canvas constructed over the captured bitmap, using DrawingView, a class derived from View; drawing into that Canvas writes the results directly onto the bitmap. The implementation follows below; just copy it to the bottom of the file MainActivity.kt (it is relatively long), outside any class:

/**
 * DrawingView class:
 *    onDraw() function implements drawing
 *     - boundingBox
 *     - Category
 *     - Confidence ( if Category is not CATEGORY_UNKNOWN )
 */
class DrawingView(context: Context, var visionObjects: List<FirebaseVisionObject>) : View(context) {

    companion object {
        // mapping table for category to strings: drawing strings
        val categoryNames: Map<Int, String> = mapOf(
            FirebaseVisionObject.CATEGORY_UNKNOWN to "Unknown",
            FirebaseVisionObject.CATEGORY_HOME_GOOD to "Home Goods",
            FirebaseVisionObject.CATEGORY_FASHION_GOOD to "Fashion Goods",
            FirebaseVisionObject.CATEGORY_FOOD to "Food",
            FirebaseVisionObject.CATEGORY_PLACE to "Place",
            FirebaseVisionObject.CATEGORY_PLANT to "Plant"
        )
    }

    val MAX_FONT_SIZE = 96F

    override fun onDraw(canvas: Canvas) {
        super.onDraw(canvas)
        val pen = Paint()
        pen.textAlign = Paint.Align.LEFT

        for (item in visionObjects) {
            // draw bounding box
            pen.color = Color.RED
            pen.strokeWidth = 8F
            pen.style = Paint.Style.STROKE
            val box = item.boundingBox
            canvas.drawRect(box, pen)

            // Draw result category, and confidence
            val tags: MutableList<String> = mutableListOf()
            tags.add("Category: ${categoryNames[item.classificationCategory]}")
            if (item.classificationCategory !=
                FirebaseVisionObject.CATEGORY_UNKNOWN) {
                tags.add("Confidence: ${item.classificationConfidence!!.times(100).toInt()}%")
            }

            var tagSize = Rect(0, 0, 0, 0)
            var maxLen = 0
            var index: Int = -1

            for ((idx, tag) in tags.withIndex()) {
                if (maxLen < tag.length) {
                    maxLen = tag.length
                    index = idx
                }
            }

            // calculate the right font size
            pen.style = Paint.Style.FILL_AND_STROKE
            pen.color = Color.YELLOW
            pen.strokeWidth = 2F

            pen.textSize = MAX_FONT_SIZE
            pen.getTextBounds(tags[index], 0, tags[index].length, tagSize)
            val fontSize: Float = pen.textSize * box.width() / tagSize.width()

            // adjust the font size so texts are inside the bounding box
            if (fontSize < pen.textSize) pen.textSize = fontSize

            var margin = (box.width() - tagSize.width()) / 2.0F
            if (margin < 0F) margin = 0F

            // draw the tags inside the bounding box, one line per tag, starting at its top edge
            for ((idx, txt) in tags.withIndex()) {
                canvas.drawText(
                    txt, box.left + margin,
                    box.top + tagSize.height().times(idx + 1.0F), pen
                )
            }
        }
    }
}

When onDraw() is called, it draws the detection results onto the bitmap that the Canvas was constructed from; that newly annotated image is the one you post to the image view, so you get a visible result!

Run it

Now click Run (execute.png) in the Android Studio toolbar. Once the app loads, press the FloatingActionButton with the camera icon, point your camera at an object, take a photo, and accept it (in the camera app); you will then see the detection result. Press the FloatingActionButton again and repeat a few times to experience the latest ML Kit ODT!


You have used ML Kit for Firebase to add Object Detection capabilities to your app:

That is all you need to get it up and running!

As you proceed, you might like to enhance the model: as you can see, the default model can only recognize 5 categories of objects; it does not know about mobile phones, nor does it recognize the Android 9 Pie statue. The solution is AutoML, which enables you to train your own model.

What we've covered

Next Steps

Learn More