In this codelab, you will learn how to build a simple "teachable machine", a custom image classifier that you will train on the fly in the browser using TensorFlow.js, a powerful and flexible machine learning library for JavaScript. You will first load and run a popular pre-trained model called MobileNet for image classification in the browser. You will then use a technique called "transfer learning", which bootstraps your training with the pre-trained MobileNet model and customizes it for your application.

This codelab will not go over the theory behind the teachable machine application. If you are curious about that, check out this tutorial.

What you'll learn

  - How to load a pretrained MobileNet model and run it on an image in the browser
  - How to make predictions through the webcam in real time
  - How to use MobileNet's intermediate activations to do transfer learning with a KNN classifier, training a custom 3-class classifier on the fly

So let's get started!

To complete this codelab, you will need:

  - A recent version of Chrome or another modern web browser
  - A code editor (or the Cloud Shell editor described below)
  - Basic knowledge of HTML, CSS, JavaScript, and the browser developer tools
  - A webcam

If you are using a computer without a code editor, you can use the free code editor in Google Cloud Shell.

  1. Open https://console.cloud.google.com/
  2. Click the "Activate Cloud Shell" button
  3. Click "Launch code editor"
  4. Create a file index.html, a root HTML file for the main web page.
  5. Create a file index.js, where the JavaScript code to train and run the TensorFlow.js model will live.
  6. In the terminal, start a simple HTTP server to serve the files: python3 -m http.server 8080 (or python -m SimpleHTTPServer 8080 on Python 2).
  7. Click "Web Preview" to open the page served by the development server.

Open index.html in an editor and add this content:

<html>
  <head>
    <!-- Load the latest version of TensorFlow.js -->
    <script src="https://unpkg.com/@tensorflow/tfjs"></script>
  </head>
  <body>
    <div id="console"></div>
    <!-- Load index.js after the content of the page -->
    <script src="index.js"></script>
  </body>
</html>

To load the pretrained MobileNet model, we have to import the MobileNet library bundled on unpkg.com. Add the following line to the <head> section of index.html, after the import of tfjs:

...
<script src="https://unpkg.com/@tensorflow-models/mobilenet"></script>
...

To the beginning of the body of index.html, let's add an image that we will use to test and make sure that MobileNet works:

<body>
...
  <img id="img" crossorigin="anonymous" src="https://i.imgur.com/JlUvsxa.jpg" width="227" height="227"/>
...
</body>

Next, open index.js in a code editor and add the following code:

let net;

async function app() {
  console.log('Loading mobilenet..');

  // Load the model.
  net = await mobilenet.load();
  console.log('Successfully loaded model');

  // Make a prediction through the model on our image.
  const imgEl = document.getElementById('img');
  const result = await net.classify(imgEl);
  console.log(result);
}

app();
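classify() resolves to an array of {className, probability} objects, sorted from most to least likely. The values below are made up for illustration; a small helper like this could format the top prediction for display:

```javascript
// Made-up example of the shape mobilenet's classify() resolves to.
const result = [
  {className: 'golden retriever', probability: 0.93},
  {className: 'Labrador retriever', probability: 0.04},
  {className: 'kuvasz', probability: 0.01},
];

// Format the top prediction for display.
function formatTopPrediction(predictions) {
  const top = predictions[0];
  return `${top.className} (${(top.probability * 100).toFixed(1)}%)`;
}
```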

To run the webpage, open index.html in a web browser. If you are using the Cloud Shell editor, refresh the preview page.

You should see a picture of a dog and, in the JavaScript console in Developer Tools, MobileNet's top predictions! Note that downloading the model may take a little time, so be patient.

Did the image get classified correctly?

It's worth noting that this also works on a mobile phone!

Now, let's make this more interactive and real-time: let's set up the webcam so we can make predictions on its live video feed.

First, set up the webcam video element. Open the index.html file, add the following line inside the <body> section, and delete the <img> tag we used for loading the dog image:

<video autoplay playsinline muted id="webcam" width="224" height="224"></video>

Open the index.js file and add the webcamElement reference to the very top of the file:

const webcamElement = document.getElementById('webcam');

In the same index.js file, add the webcam setup function before the call to the app() function:

async function setupWebcam() {
  return new Promise((resolve, reject) => {
    // Use the modern mediaDevices API (the older vendor-prefixed
    // getUserMedia variants are deprecated).
    if (navigator.mediaDevices && navigator.mediaDevices.getUserMedia) {
      navigator.mediaDevices.getUserMedia({video: true})
          .then(stream => {
            webcamElement.srcObject = stream;
            webcamElement.addEventListener('loadeddata', () => resolve(), false);
          })
          .catch(() => reject());
    } else {
      reject();
    }
  });
}

Now, in the app() function you added before, remove the prediction on the static image and instead create an infinite loop that makes predictions on the webcam element.

async function app() {
  console.log('Loading mobilenet..');

  // Load the model.
  net = await mobilenet.load();
  console.log('Successfully loaded model');
  
  await setupWebcam();
  while (true) {
    const result = await net.classify(webcamElement);

    document.getElementById('console').innerText = `
      prediction: ${result[0].className}\n
      probability: ${result[0].probability}
    `;

    // Give some breathing room by waiting for the next animation frame to
    // fire.
    await tf.nextFrame();
  }
}
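tf.nextFrame() returns a promise that resolves when the next browser animation frame has fired, which is what lets the loop above yield between predictions instead of blocking the page. A plain-JavaScript sketch of the same idea (the setTimeout fallback is an addition so the sketch also runs outside a browser):

```javascript
// Sketch of what tf.nextFrame() does: resolve a promise on the next
// animation frame, falling back to a ~16 ms timeout (one 60 fps frame)
// in environments without requestAnimationFrame.
function nextFrame() {
  return new Promise(resolve => {
    if (typeof requestAnimationFrame === 'function') {
      requestAnimationFrame(() => resolve());
    } else {
      setTimeout(resolve, 16);
    }
  });
}
```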

If you open the console on the webpage, you should now see a MobileNet prediction, with its probability, for every frame collected from the webcam.

These predictions may be nonsensical, because the ImageNet dataset MobileNet was trained on does not look much like the images that typically appear in a webcam feed. That's okay though, because we're going to repurpose MobileNet for our own custom objects in the next section!

Now, let's make this more useful. We will make a custom 3-class object classifier using the webcam on the fly. We're going to make a classification through MobileNet, but this time we will take an internal representation (activation) of the model for a particular webcam image and use that for classification.

We'll use a module called a "K-Nearest Neighbors Classifier", which effectively lets us put webcam images (actually, their MobileNet activations) into different categories (or "classes"), and when the user asks to make a prediction we simply choose the class that has the most similar activation to the one we are making a prediction for.
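To make the nearest-neighbor idea concrete, here is a minimal sketch in plain JavaScript. The tiny 3-number "activations" are made up for illustration (real MobileNet activations are much larger vectors), and k = 1 is used for simplicity:

```javascript
// Squared Euclidean distance between two activation vectors.
function squaredDistance(a, b) {
  return a.reduce((sum, v, i) => sum + (v - b[i]) ** 2, 0);
}

// Stored examples: each is an activation vector labeled with a class index.
const examples = [
  {activation: [0.9, 0.1, 0.0], classId: 0},
  {activation: [0.8, 0.2, 0.1], classId: 0},
  {activation: [0.1, 0.9, 0.2], classId: 1},
];

// Predict by picking the class of the closest stored example (k = 1).
function predict(activation) {
  let best = null;
  for (const ex of examples) {
    const d = squaredDistance(activation, ex.activation);
    if (best === null || d < best.distance) {
      best = {classId: ex.classId, distance: d};
    }
  }
  return best.classId;
}
```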

Add an import of KNN Classifier to the end of the imports in the <head> tag of index.html (you will still need MobileNet, so don't remove that import):

...
<script src="https://unpkg.com/@tensorflow-models/knn-classifier"></script>
...

Add three buttons, one per class, in index.html below the video element. These buttons will be used to add training images to the model.

...
<button id="class-a">Add A</button>
<button id="class-b">Add B</button>
<button id="class-c">Add C</button>
...

At the top of index.js, create the classifier:

const classifier = knnClassifier.create();

Update the app function:

async function app() {
  console.log('Loading mobilenet..');

  // Load the model.
  net = await mobilenet.load();
  console.log('Successfully loaded model');

  await setupWebcam();

  // Reads an image from the webcam and associates it with a specific class
  // index.
  const addExample = classId => {
    // Get the intermediate activation of MobileNet 'conv_preds' and pass that
    // to the KNN classifier.
    const activation = net.infer(webcamElement, 'conv_preds');

    // Pass the intermediate activation to the classifier.
    classifier.addExample(activation, classId);
  };

  // When clicking a button, add an example for that class.
  document.getElementById('class-a').addEventListener('click', () => addExample(0));
  document.getElementById('class-b').addEventListener('click', () => addExample(1));
  document.getElementById('class-c').addEventListener('click', () => addExample(2));

  while (true) {
    if (classifier.getNumClasses() > 0) {
      // Get the activation from mobilenet from the webcam.
      const activation = net.infer(webcamElement, 'conv_preds');
      // Get the most likely class and confidences from the classifier module.
      const result = await classifier.predictClass(activation);

      const classes = ['A', 'B', 'C'];
      document.getElementById('console').innerText = `
        prediction: ${classes[result.classIndex]}\n
        probability: ${result.confidences[result.classIndex]}
      `;
    }

    await tf.nextFrame();
  }
}
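For reference, predictClass() resolves to an object with a classIndex and a confidences map, where each class's confidence is the fraction of the k nearest neighbors that fell into that class. A sketch of how such per-class confidences could be derived (the neighbor labels here are made up for illustration):

```javascript
// Turn the class labels of the k nearest neighbors into a per-class
// confidence map like the one predictClass() reports.
function confidencesFromNeighbors(neighborClassIds, numClasses) {
  const confidences = {};
  for (let c = 0; c < numClasses; c++) confidences[c] = 0;
  for (const id of neighborClassIds) {
    confidences[id] += 1 / neighborClassIds.length;
  }
  return confidences;
}
```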

Now when you load the index.html page, you can use common objects or face/body gestures to capture images for each of the three classes. Each time you click one of the "Add" buttons, one image is added to that class as a training example. While you do this, the model continues to make predictions on webcam images coming in and shows the results in real-time.

In this codelab, you implemented a simple machine learning web application using TensorFlow.js. You loaded and used a pretrained MobileNet model to classify images from the webcam. You then customized the model to classify images into three custom categories.

Be sure to visit js.tensorflow.org for more examples and demos with code to see how you can use TensorFlow.js in your applications.
