Integrate the Vision API with Dialogflow

1. Before you begin

In this codelab, you'll integrate the Vision API with Dialogflow to provide rich and dynamic machine learning-based responses to user-provided image inputs. You'll create a chatbot app that takes an image as input, processes it in the Vision API, and returns an identified landmark to the user. For example, if the user uploads an image of the Taj Mahal, the chatbot will return Taj Mahal as the response.

That's useful because you can analyze the items in an image and act on the information that you gain. For example, you could build a refund-processing system that lets users upload receipts, extracts the date of purchase from each receipt, and processes the refund if the date is appropriate.

Take a look at the following sample dialog:

User: Hi

Chatbot: Hi! You can upload a picture to explore landmarks

User: Upload an image with Taj Mahal in it.

Chatbot: File is being processed, here are the results: Taj Mahal, Taj Mahal Garden, Taj Mahal.

15a4243e453415ca.png

Prerequisites

Before proceeding, you need to complete the following codelabs:

  1. Build an appointment scheduler with Dialogflow
  2. Integrate a Dialogflow chatbot with Actions on Google
  3. Understand entities in Dialogflow
  4. Build a frontend Django client for a Dialogflow app

You also need to understand the basic concepts and constructs of Dialogflow, which you can glean from the videos in the Build a chatbot with Dialogflow pathway.

What you'll learn

  • How to create a Dialogflow agent
  • How to update a Dialogflow agent to upload files
  • How to set up the Vision API connection with Dialogflow fulfillment
  • How to set up and run a Django frontend app for Dialogflow
  • How to deploy the Django frontend app to Google Cloud on App Engine
  • How to test the Dialogflow app from a custom frontend

What you'll build

  • Create a Dialogflow agent
  • Implement a Django frontend to upload a file
  • Implement Dialogflow fulfillment to invoke Vision API against the uploaded image

What you'll need

  • A basic knowledge of Python
  • A basic understanding of Dialogflow
  • A basic understanding of the Vision API

2. Architecture overview

You'll create a new conversational experience with a custom Django frontend and extend it to integrate with the Vision API. You'll build the frontend with the Django framework, run and test it locally, and then deploy it to App Engine. The frontend will look like this:

5b07e09dc4b84646.png

The request flow, illustrated in the image that follows the steps, works like this:

  1. The user will send a request via the frontend.
  2. That'll trigger a call to the Dialogflow detectIntent API to map the user's utterance to the right intent.
  3. Once the explore landmark intent is detected, Dialogflow fulfillment will send a request to the Vision API, receive a response, and send it to the user.

153725eb50e008d4.png
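
Step 2 of the flow can be sketched in Python as follows. This is only an illustration, assuming the public Dialogflow v2 REST endpoint for detectIntent; the helper name build_detect_intent_request and the project and session IDs are hypothetical, and the sample app in this codelab may issue the call differently (for example, through a client library). The sketch builds the URL and JSON body but doesn't send anything.

```python
import json


def build_detect_intent_request(project_id, session_id, text, language_code="en-US"):
    """Return the (url, body) pair for a Dialogflow v2 detectIntent call."""
    url = (
        "https://dialogflow.googleapis.com/v2/"
        f"projects/{project_id}/agent/sessions/{session_id}:detectIntent"
    )
    # The user's utterance travels as a text query input.
    body = {"queryInput": {"text": {"text": text, "languageCode": language_code}}}
    return url, body


# Hypothetical project and session IDs, for illustration only.
url, body = build_detect_intent_request("my-project", "session-123", "file is demo.jpg")
print(url)
print(json.dumps(body, indent=2))
```

A real call would also need an OAuth access token for the service account that you set up later in this codelab.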

Here's what the overall architecture will look like.

a2fcea32222a9cb4.png

3. What's the Vision API?

The Vision API is a pre-trained ML model that derives insights from images. It can get you multiple insights, including image labeling, face and landmark detection, optical character recognition, and tagging of explicit content. To learn more, see Vision AI.
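
To make the landmark feature concrete, here is a minimal sketch of the JSON body that a landmark detection request could carry, assuming the Vision API REST method images:annotate. The function name build_landmark_request and the bucket and file names are hypothetical; the fulfillment code in this codelab uses the Node.js client library rather than raw REST.

```python
VISION_ANNOTATE_URL = "https://vision.googleapis.com/v1/images:annotate"


def build_landmark_request(bucket_name, file_name, max_results=5):
    """Return the JSON body for a landmark detection request on a Cloud Storage file."""
    return {
        "requests": [
            {
                # The image is referenced by its Cloud Storage URI, not uploaded inline.
                "image": {"source": {"imageUri": f"gs://{bucket_name}/{file_name}"}},
                "features": [{"type": "LANDMARK_DETECTION", "maxResults": max_results}],
            }
        ]
    }


# Hypothetical bucket and file names, for illustration only.
print(build_landmark_request("my-media-bucket", "demo.jpg"))
```

Other insights, such as labels or text, would use a different feature type in the same request structure.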

4. Create a Dialogflow agent

  1. Go to the Dialogflow console.
  2. Sign in. (If you're a first-time user, then use your email to sign up.)
  3. Accept the terms and conditions, and you'll be in the console.
  4. Click d9e90c93fc779808.png, scroll to the bottom, and click Create new agent. 3b3f9677e2a26d93.png
  5. Enter "VisionAPI" as the Agent name.
  6. Click Create.

Dialogflow creates the following two default intents as a part of the agent:

  1. Default welcome intent greets your users.
  2. Default fallback intent catches all the questions that your bot does not understand.

At this point, you have a functional bot that greets users, but you need to update it to let users know that they can upload an image to explore landmarks.

Update the default welcome intent to prompt the user to upload an image

  1. Click Default Welcome Intent.
  2. Navigate to Responses > Default > Text or SSML Response and enter "Hi! You can upload a picture to explore landmarks."

f9cd9ba6917a7aa9.png

Create entity

  1. Click Entities.

432fff294b666c93.png

  2. Click Create Entity, name it "filename," and click Save.

602d001d684485de.png

Create new intent

  1. Click Intents > Create Intent.
  2. Enter "Explore uploaded image" as the Intent name.
  3. Click Training phrases > Add Training Phrases and enter "file is demo.jpg" and "file is taj.jpeg" as user expressions with @filename as the entity.

dd54ebda59c6b896.png

  4. Click Responses > Add Response > Default > Text or SSML Response. Enter "Assessing file" and click Add Responses.
  5. Click Fulfillment > Enable fulfillment and turn on Enable webhook call for this intent.

b32b7ac054fcc938.png
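
With the webhook call enabled, Dialogflow sends the matched intent and the extracted @filename value to fulfillment. The following sketch shows how those two values could be read from a webhook request, assuming the Dialogflow v2 webhook JSON format. The sample_request is hand-written and heavily trimmed, and the function name is hypothetical; real requests contain many more fields.

```python
def extract_intent_and_filename(webhook_request):
    """Return (intent display name, filename parameter) from a webhook request dict."""
    query_result = webhook_request.get("queryResult", {})
    intent_name = query_result.get("intent", {}).get("displayName")
    filename = query_result.get("parameters", {}).get("filename")
    return intent_name, filename


# Hand-written, trimmed example of what a request for this intent might contain.
sample_request = {
    "queryResult": {
        "queryText": "file is demo.jpg",
        "parameters": {"filename": "demo.jpg"},
        "intent": {"displayName": "Explore uploaded image"},
    }
}
print(extract_intent_and_filename(sample_request))
```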

5. Set up fulfillment to integrate with Vision API

  1. Click Fulfillment.
  2. Enable Inline Editor.

c8574c6ef899393f.png

  3. Update index.js with the following code, and replace YOUR-BUCKET-NAME with the name of the Cloud Storage bucket that will hold the user-uploaded images (you create it later in this codelab).
'use strict';

const functions = require('firebase-functions');
const {google} = require('googleapis');
const {WebhookClient} = require('dialogflow-fulfillment');
const vision = require('@google-cloud/vision');
// TODO(developer): Replace YOUR-BUCKET-NAME with the name of your Cloud Storage bucket.
const bucketName = 'YOUR-BUCKET-NAME';
const timeZone = 'America/Los_Angeles';
const timeZoneOffset = '-07:00';

exports.dialogflowFirebaseFulfillment = functions.https.onRequest((request, response) => {
  const agent = new WebhookClient({ request, response });
  console.log("Parameters", agent.parameters);

  function applyML(agent){
    const filename = agent.parameters.filename;
    console.log("filename is: ", filename);

    // Call the Vision API to detect landmarks in the uploaded file.
    return callVisionApi(agent, bucketName, filename).then(result => {
      console.log(`result is ${result}`);
      agent.add(`File is being processed, here are the results: ${result}`);
    }).catch((error) => {
      agent.add(`An error occurred in the applyML function: ${error}`);
    });
  }

  let intentMap = new Map();
  intentMap.set('Explore uploaded image', applyML);
  agent.handleRequest(intentMap);
});


// Runs landmark detection on a file in Cloud Storage and returns the
// descriptions of the detected landmarks (an empty array on failure).
async function callVisionApi(agent, bucketName, fileName) {
  // Creates a Vision API client.
  const client = new vision.ImageAnnotatorClient();
  try {
    // Performs landmark detection on the Cloud Storage file.
    const [result] = await client.landmarkDetection(`gs://${bucketName}/${fileName}`);
    const detections = result.landmarkAnnotations;
    const detected = [];
    detections.forEach(landmark => {
      console.log(landmark.description);
      detected.push(landmark.description);
    });
    return detected;
  } catch (error) {
    console.log('Landmark detection failed', error);
    return [];
  }
}
  4. Paste the following into package.json to replace its contents.
{
  "name": "dialogflowFirebaseFulfillment",
  "description": "Dialogflow fulfillment for the Vision API sample",
  "version": "0.0.1",
  "private": true,
  "license": "Apache Version 2.0",
  "author": "Google Inc.",
  "engines": {
    "node": "6"
  },
  "scripts": {
    "lint": "semistandard --fix \"**/*.js\"",
    "start": "firebase deploy --only functions",
    "deploy": "firebase deploy --only functions"
  },
  "dependencies": {
    "firebase-functions": "2.0.2",
    "firebase-admin": "^5.13.1",
    "actions-on-google": "2.2.0", 
    "googleapis": "^27.0.0",
    "dialogflow-fulfillment": "^0.6.1",
    "@google-cloud/bigquery": "^1.3.0",
    "@google-cloud/storage": "^2.0.0",
    "@google-cloud/vision": "^0.25.0"
  }
}
  5. Click Save.
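
The fulfillment above collects the description of every landmark annotation and joins them into the reply. The same logic can be sketched in Python as follows, assuming the JSON shape of a Vision API REST response (responses containing landmarkAnnotations). The function names are hypothetical, and unlike the codelab's function, this sketch drops repeated names, which is a deliberate deviation.

```python
def landmark_names(annotate_response):
    """Collect unique landmark descriptions, preserving first-seen order."""
    names = []
    for response in annotate_response.get("responses", []):
        for annotation in response.get("landmarkAnnotations", []):
            description = annotation.get("description")
            if description and description not in names:
                names.append(description)
    return names


def fulfillment_text(names):
    """Build the chatbot reply from the detected landmark names."""
    if not names:
        return "No landmarks were detected in the uploaded file."
    return "File is being processed, here are the results: " + ", ".join(names)


# Hand-written example response, trimmed to the fields used here.
example = {
    "responses": [
        {
            "landmarkAnnotations": [
                {"description": "Taj Mahal"},
                {"description": "Taj Mahal Garden"},
                {"description": "Taj Mahal"},
            ]
        }
    ]
}
print(fulfillment_text(landmark_names(example)))
```

Whether to deduplicate is a design choice: the raw list shows everything that the API returned, while the deduplicated list reads more naturally in a chat reply.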

6. Download and run the frontend app

  1. Clone this repository to your local machine. Alternatively, you can download the sample as a zip and extract it.
git clone https://github.com/priyankavergadia/visionapi-dialogflow.git
  2. Change to the directory that contains the code.
cd visionapi-dialogflow

7. Set up your local environment

When deployed, your app uses the Cloud SQL Proxy that is built into the App Engine standard environment to communicate with your Cloud SQL instance. However, to test your app locally, you must install and use a local copy of the Cloud SQL Proxy in your development environment. To learn more, see About the Cloud SQL Proxy.

To perform basic admin tasks on your Cloud SQL instance, you can use the Cloud SQL for MySQL client.

Install the Cloud SQL Proxy

Download and install the Cloud SQL Proxy with the following commands. The Cloud SQL Proxy is used to connect to your Cloud SQL instance when running locally.

Download the proxy (the following command fetches the 64-bit macOS build):

curl -o cloud_sql_proxy https://dl.google.com/cloudsql/cloud_sql_proxy.darwin.amd64

Make the proxy executable:

chmod +x cloud_sql_proxy

Create a Cloud SQL instance

  1. Create a Cloud SQL for MySQL Second Generation instance. Enter "polls-instance" or something similar as the name. It can take a few minutes for the instance to be ready. After it's ready, it should be visible in the instance list.
  2. Now use the gcloud command-line tool to run the following command where [YOUR_INSTANCE_NAME] represents the name of your Cloud SQL instance. Make a note of the value shown for connectionName for the next step. It displays in the format [PROJECT_NAME]:[REGION_NAME]:[INSTANCE_NAME].
gcloud sql instances describe [YOUR_INSTANCE_NAME]

Alternatively, you can click on the instance in the console to get the Instance connection name.

c11e94464bf4fcf8.png
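
Because the connection name is reused in several later steps, it can help to check that the value you copied has the expected three-part format. This is a minimal sketch with a hypothetical helper name and example value; it checks only the shape described above, not whether the instance exists.

```python
def parse_connection_name(connection_name):
    """Split PROJECT:REGION:INSTANCE into its parts, or raise ValueError."""
    parts = connection_name.split(":")
    if len(parts) != 3 or not all(parts):
        raise ValueError(
            f"expected PROJECT_NAME:REGION_NAME:INSTANCE_NAME, got {connection_name!r}"
        )
    project, region, instance = parts
    return {"project": project, "region": region, "instance": instance}


# Hypothetical connection name, for illustration only.
print(parse_connection_name("my-project:us-central1:polls-instance"))
```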

Initialize your Cloud SQL instance

Start the Cloud SQL Proxy using the connectionName from the previous section.

./cloud_sql_proxy -instances="[YOUR_INSTANCE_CONNECTION_NAME]"=tcp:3306

Replace [YOUR_INSTANCE_CONNECTION_NAME] with the value that you recorded in the previous section. That establishes a connection from your local computer to your Cloud SQL instance for local testing purposes. Keep the Cloud SQL Proxy running the entire time that you test your app locally.
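
If you're unsure whether the proxy is still running, a quick TCP check against the local port can tell you. The following is a small sketch that uses only the Python standard library; the function name is hypothetical, and a successful connection shows only that something is listening on the port, not that it is the Cloud SQL Proxy.

```python
import socket


def is_port_open(host="127.0.0.1", port=3306, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections and timeouts.
        return False
```

Calling is_port_open() with the defaults checks 127.0.0.1:3306, the address used in the proxy command above.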

Next, create a new Cloud SQL user and database.

  1. Create a new database using the Google Cloud Console for your Cloud SQL instance named polls-instance. For example, you can enter "polls" as the name. a3707ec9bc38d412.png
  2. Create a new user using the Cloud Console for your Cloud SQL instance named polls-instance. f4d098fca49cccff.png

Configure the database settings

  1. Open mysite/settings-changeme.py for editing.
  2. Rename the file to settings.py.
  3. In two places, replace [YOUR-USERNAME] and [YOUR-PASSWORD] with the database username and password that you created in the previous section. That sets up the connection to the database for both App Engine deployment and local testing.
  4. In the line 'HOST': '/cloudsql/[PROJECT_NAME]:[REGION_NAME]:[INSTANCE_NAME]', replace [PROJECT_NAME]:[REGION_NAME]:[INSTANCE_NAME] with the instance connection name that you acquired in the previous section.
  5. Run the following command and copy the connectionName value from the output for the next step.
gcloud sql instances describe [YOUR_INSTANCE_NAME]
  6. Replace [YOUR-CONNECTION-NAME] with the value that you recorded in the previous step.
  7. Replace [YOUR-DATABASE] with the name that you chose in the previous section.
# [START db_setup]
if os.getenv('GAE_APPLICATION', None):
    # Running on production App Engine, so connect to Google Cloud SQL using
    # the unix socket at /cloudsql/<your-cloudsql-connection string>
    DATABASES = {
        'default': {
            'ENGINE': 'django.db.backends.mysql',
            'HOST': '/cloudsql/[PROJECT_NAME]:[REGION_NAME]:[INSTANCE_NAME]',
            'USER': '[YOUR-USERNAME]',
            'PASSWORD': '[YOUR-PASSWORD]',
            'NAME': '[YOUR-DATABASE]',
        }
    }
else:
    # Running locally so connect to either a local MySQL instance or connect to
    # Cloud SQL via the proxy. To start the proxy via command line:
    #     $ cloud_sql_proxy -instances=[INSTANCE_CONNECTION_NAME]=tcp:3306
    # See https://cloud.google.com/sql/docs/mysql-connect-proxy
    DATABASES = {
        'default': {
            'ENGINE': 'django.db.backends.mysql',
            'HOST': '127.0.0.1',
            'PORT': '3306',
            'NAME': '[YOUR-DATABASE]',
            'USER': '[YOUR-USERNAME]',
            'PASSWORD': '[YOUR-PASSWORD]'
        }
    }
# [END db_setup]
  8. Save and close settings.py.
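
The two branches in the settings above differ only in how the host is reached. The following sketch expresses the same choice as a function, which can make the logic easier to follow. The function name and its parameters are hypothetical and aren't part of the sample repository.

```python
import os


def database_settings(connection_name, user, password, name, environ=os.environ):
    """Return a Django DATABASES dict for App Engine or for local testing."""
    common = {
        "ENGINE": "django.db.backends.mysql",
        "USER": user,
        "PASSWORD": password,
        "NAME": name,
    }
    if environ.get("GAE_APPLICATION"):
        # On App Engine: connect through the unix socket under /cloudsql/.
        return {"default": {**common, "HOST": f"/cloudsql/{connection_name}"}}
    # Locally: connect through the Cloud SQL Proxy listening on 127.0.0.1:3306.
    return {"default": {**common, "HOST": "127.0.0.1", "PORT": "3306"}}
```

Passing environ explicitly, for example environ={}, makes it possible to see both configurations without changing real environment variables.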

8. Set up service account

  1. In Dialogflow's console, click 21a21c1104f5fdf3.png. In the General tab, navigate to Google Project > Project ID and click Google Cloud 7b2236f5627c37a0.png to open the Cloud Console. a4cfb880b3c8e789.png
  2. Click Navigation menu ☰ > IAM & Admin > Service accounts, then click 796e7c9e65ae751f.png next to Dialogflow integrations and click Create key.

3d72abc0c184d281.png

  3. A JSON file downloads to your computer. You'll need it in the following setup sections.

9. Set up Dialogflow detectIntent endpoint to be called from the app

  1. In the chat folder, replace key-sample.json with your credentials JSON file and name it key.json.
  2. In views.py in the chat folder, change the GOOGLE_PROJECT_ID = "<YOUR_PROJECT_ID>" to your project ID.
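
A mismatch between the key file and the project ID is an easy mistake to make here. The following sketch reads the project ID from a service account key so that you can compare it with the value in views.py. It assumes the standard service account key format, which includes the type and project_id fields; the function name is hypothetical.

```python
import json


def read_project_id(key_path):
    """Return the project_id stored in a service account key file."""
    with open(key_path, encoding="utf-8") as key_file:
        key = json.load(key_file)
    if key.get("type") != "service_account":
        raise ValueError("key file is not a service account key")
    project_id = key.get("project_id")
    if not project_id:
        raise ValueError("key file has no project_id field")
    return project_id
```

For example, read_project_id("chat/key.json") should return the same value that you set for GOOGLE_PROJECT_ID.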

10. Create Cloud Storage buckets

Create a Cloud Storage bucket for frontend static objects

  1. In the Cloud Console, click Navigation menu ☰ > Storage.

87ff9469db4eb77f.png

  2. Click Create Bucket.
  3. Provide a globally unique name.

a15a6612e92a39d3.png

  4. Choose where to store your data. Pick Region and select the location that best suits your needs.
  5. Choose Standard as your default storage class.

9c56abe632cf61db.png

  6. Choose Set permissions uniformly at bucket-level (Bucket Policy Only), then click Continue to create the bucket.

f175ac794049df04.png

  7. Once the bucket is created, click Navigation menu ☰ > Storage > Browser and locate the bucket that you created.

9500ee19b427158c.png

  8. Click 796e7c9e65ae751f.png next to the corresponding bucket and click Edit bucket permissions.

fd0a310bc3656edd.png

  9. Click Add Members, click into New members, enter "allUsers," then click Select a role > Storage Object Viewer. That provides viewing access to the static frontend files to allUsers. That's not an ideal security setting for the files, but it works for the purpose of this particular codelab.

7519116abd56d5a3.png

Create a Cloud Storage bucket for user-uploaded images

Follow the same instructions to create a separate bucket to upload user images. Set permissions to "allUsers" again, but select Storage Object Creator and Storage Object Viewer as the roles.

11. Configure the Cloud Storage buckets in the frontend app

Configure the Cloud Storage bucket in settings.py

  1. Open mysite/settings.py.
  2. Locate the GCS_BUCKET variable and replace '<YOUR-GCS-BUCKET-NAME>' with the name of your Cloud Storage static bucket.
  3. Locate the GS_MEDIA_BUCKET_NAME variable and replace '<YOUR-GCS-BUCKET-NAME-MEDIA>' with the name of your Cloud Storage bucket for the images.
  4. Locate the GS_STATIC_BUCKET_NAME variable and replace '<YOUR-GCS-BUCKET-NAME-STATIC>' with the name of your Cloud Storage bucket for the static files.
  5. Save the file.
GCS_BUCKET = '<YOUR-GCS-BUCKET-NAME>'
GS_MEDIA_BUCKET_NAME = '<YOUR-GCS-BUCKET-NAME-MEDIA>'
GS_STATIC_BUCKET_NAME = '<YOUR-GCS-BUCKET-NAME-STATIC>'

Configure Cloud Storage bucket in home.html

  • Open the chat folder, then open templates and rename home-changeme.html to home.html.
  • Look for <YOUR-GCS-BUCKET-NAME-MEDIA> and replace it with the name of the bucket where you want user-uploaded files to be saved. That way, user-uploaded files are stored in Cloud Storage rather than in the frontend, just as the static assets are kept in their own Cloud Storage bucket. The Vision API reads the file from that bucket to make the prediction.
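
An uploaded file ends up with two useful addresses: the gs:// URI that fulfillment passes to the Vision API, and a public HTTPS URL that works because the bucket grants Storage Object Viewer to allUsers. The following sketch builds both. The function names and the bucket and object names are hypothetical.

```python
from urllib.parse import quote


def gcs_uri(bucket_name, object_name):
    """Return the gs:// URI used to reference the object from Google Cloud APIs."""
    return f"gs://{bucket_name}/{object_name}"


def public_url(bucket_name, object_name):
    """Return the public HTTPS URL of an object in a publicly readable bucket."""
    # Percent-encode characters such as spaces; slashes in the name are kept.
    return f"https://storage.googleapis.com/{bucket_name}/{quote(object_name)}"


# Hypothetical bucket and object names, for illustration only.
print(gcs_uri("my-media-bucket", "demo.jpg"))
print(public_url("my-media-bucket", "demo.jpg"))
```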

12. Build and run the app locally

To run the Django app on your local computer, you'll need to set up a Python development environment, including Python, pip, and virtualenv. For instructions, refer to Setting Up a Python Development Environment.

  1. Create an isolated Python environment and install dependencies.
virtualenv env
source env/bin/activate
pip install -r requirements.txt
  2. Run the Django migrations to set up your models.
python3 manage.py makemigrations
python3 manage.py makemigrations polls
python3 manage.py migrate
  3. Start a local web server.
python3 manage.py runserver
  4. In your web browser, navigate to http://localhost:8000/. You should see a simple webpage that looks like this:

8f986b8981f80f7b.png

The sample app pages are delivered by the Django web server running on your computer. When you're ready to move forward, press Control+C to stop the local web server.

Use the Django admin console

  1. Create a superuser.
python3 manage.py createsuperuser
  2. Start a local web server.
python3 manage.py runserver
  3. Navigate to http://localhost:8000/admin/ in your web browser. To log in to the admin site, enter the username and password that you created when you ran createsuperuser.

13. Deploy the app to the App Engine standard environment

Gather all the static content into one folder by running the following command, which copies all of the app's static files into the folder specified by STATIC_ROOT in settings.py:

python3 manage.py collectstatic

Upload the app by running the following command from the directory of the app where the app.yaml file is located:

gcloud app deploy

Wait for the message that notifies you that the update has been completed.

14. Test the frontend app

In your web browser, navigate to https://<your_project_id>.appspot.com

This time, your request is served by a web server running in the App Engine standard environment.

The app deploy command deploys the app as described in app.yaml and sets the newly deployed version as the default version, causing it to serve all new traffic.

15. Production

When you're ready to serve your content in production, change the DEBUG variable to False in mysite/settings.py.
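
Instead of editing the file by hand before every deployment, the flag could be derived from the environment. The following is one possible sketch, not the sample app's approach: it reuses the GAE_APPLICATION check from the database settings, and the DJANGO_DEBUG variable name is an assumption introduced here for local overrides.

```python
import os


def debug_enabled(environ=os.environ):
    """Return False on App Engine; otherwise honor DJANGO_DEBUG (on by default)."""
    if environ.get("GAE_APPLICATION"):
        # Deployed on App Engine: never serve debug pages.
        return False
    # DJANGO_DEBUG is a hypothetical variable name used only in this sketch.
    return environ.get("DJANGO_DEBUG", "True").lower() in ("1", "true", "yes")
```

In settings.py, this would be used as DEBUG = debug_enabled().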

16. Test your chatbot

You can test your chatbot in the simulator, or use the web or Google Home integration that you previously built.

  1. User: "hi"
  2. Chatbot: "Hi! You can upload a picture to explore landmarks."
  3. User uploads an image.

Download this image, name it demo.jpg, and use it.

c3aff843c9f132e4.jpeg

  4. Chatbot: "File is being processed, here are the results: Golden Gate Bridge,Golden Gate National Recreation Area,Golden Gate Bridge,Golden Gate Bridge,Golden Gate Bridge."

Overall, it should look like this:

228df9993bfc001d.png

17. Clean up

If you want to complete other Dialogflow codelabs, then skip this section and return to it later.

Delete the Dialogflow agent

  1. Click ca4337eeb5565bcb.png next to your existing agent.

520c1c6bb9f46ea6.png

  2. In the General tab, scroll down and click Delete This Agent.
  3. Type Delete into the window that appears and click Delete.

18. Congratulations

You created a chatbot in Dialogflow and integrated it with the Vision API. You're now a chatbot developer!

Learn more

To learn more, check out the code samples on the Dialogflow GitHub page.