Using the Vision API with C#

1. Overview

The Google Cloud Vision API allows developers to easily integrate vision detection features within applications, including image labeling, face and landmark detection, optical character recognition (OCR), and tagging of explicit content.

In this codelab you will focus on using the Vision API with C#. You will learn how to perform text detection, landmark detection, and face detection!

What you'll learn

How to use the Cloud Shell
How to Enable the Google Cloud Vision API
How to Authenticate API requests
How to install the Vision API client library for C#
How to perform Label detection
How to perform Text detection
How to perform Landmark detection
How to perform Face Detection

What you'll need

A Google Cloud Platform Project
A Browser, such Chrome or Firefox
Familiarity using C#

Survey

How will you use this tutorial?

Read it through only

Read it and complete the exercises

How would you rate your experience with C#?

Novice

Intermediate

Proficient

How would you rate your experience with using Google Cloud Platform services?

Novice

Intermediate

Proficient

2. Setup and Requirements

Self-paced environment setup

Sign-in to the Google Cloud Console and create a new project or reuse an existing one. If you don't already have a Gmail or Google Workspace account, you must create one.

The Project name is the display name for this project's participants. It is a character string not used by Google APIs. You can always update it.
The Project ID is unique across all Google Cloud projects and is immutable (cannot be changed after it has been set). The Cloud Console auto-generates a unique string; usually you don't care what it is. In most codelabs, you'll need to reference your Project ID (typically identified as PROJECT_ID). If you don't like the generated ID, you might generate another random one. Alternatively, you can try your own, and see if it's available. It can't be changed after this step and remains for the duration of the project.
For your information, there is a third value, a Project Number, which some APIs use. Learn more about all three of these values in the documentation.

Next, you'll need to enable billing in the Cloud Console to use Cloud resources/APIs. Running through this codelab won't cost much, if anything at all. To shut down resources to avoid incurring billing beyond this tutorial, you can delete the resources you created or delete the project. New Google Cloud users are eligible for the $300 USD Free Trial program.

Start Cloud Shell

While Google Cloud can be operated remotely from your laptop, in this codelab you will be using Google Cloud Shell, a command line environment running in the Cloud.

Activate Cloud Shell

From the Cloud Console, click Activate Cloud Shell .

If this is your first time starting Cloud Shell, you're presented with an intermediate screen describing what it is. If you were presented with an intermediate screen, click Continue.

It should only take a few moments to provision and connect to Cloud Shell.

This virtual machine is loaded with all the development tools needed. It offers a persistent 5 GB home directory and runs in Google Cloud, greatly enhancing network performance and authentication. Much, if not all, of your work in this codelab can be done with a browser.

Once connected to Cloud Shell, you should see that you are authenticated and that the project is set to your project ID.

Run the following command in Cloud Shell to confirm that you are authenticated:

gcloud auth list

Command output

 Credentialed Accounts
ACTIVE  ACCOUNT
*       <my_account>@<my_domain.com>

To set the active account, run:
    $ gcloud config set account `ACCOUNT`

Run the following command in Cloud Shell to confirm that the gcloud command knows about your project:

gcloud config list project

Command output

[core]
project = <PROJECT_ID>

If it is not, you can set it with this command:

gcloud config set project <PROJECT_ID>

Command output

Updated property [core/project].

3. Enable the Vision API

Before you can begin using the Vision API you must enable the API. Using the Cloud Shell, you can enable the API by using the following command:

gcloud services enable vision.googleapis.com

4. Install the Google Cloud Vision API client library for C#

First, create a simple C# console application that you will use to run Vision API samples:

dotnet new console -n VisionApiDemo

You should see the application created and dependencies resolved:

The template "Console Application" was created successfully.
Processing post-creation actions...
...
Restore succeeded.

Next, navigate to VisionApiDemo folder:

cd VisionApiDemo/

And add Google.Cloud.Vision.V1 NuGet package to the project:

dotnet add package Google.Cloud.Vision.V1

info : Adding PackageReference for package 'Google.Cloud.Vision.V1' into project '/home/atameldev/VisionApiDemo/VisionApiDemo.csproj'.
log  : Restoring packages for /home/atameldev/VisionApiDemo/VisionApiDemo.csproj...
...
info : PackageReference for package 'Google.Cloud.Vision.V1' version '1.2.0' added to file '/home/atameldev/VisionApiDemo/VisionApiDemo.csproj'.

Now, you're ready to use Vision API!

5. Perform Label Detection

One of the Vision API's basic features is to identify objects or entities in an image, known as label annotation. Label detection identifies general objects, locations, activities, animal species, products, and more. The Vision API takes an input image and returns the most likely labels which apply to that image. It returns the top-matching labels along with a confidence score of a match to the image.

In this example, you will perform label detection on an image of a street scene in Shanghai. Open the code editor from the top right side of the Cloud Shell:

Navigate to the Program.cs file inside the VisionApiDemo folder and replace the code with the following:

using Google.Cloud.Vision.V1;
using System;

namespace VisionApiDemo
{
    class Program
    {   
        static void Main(string[] args)
        {
            var client = ImageAnnotatorClient.Create();
            var image = Image.FromUri("gs://cloud-samples-data/vision/using_curl/shanghai.jpeg");
            var labels = client.DetectLabels(image);

            Console.WriteLine("Labels (and confidence score):");
            Console.WriteLine(new String('=', 30));

            foreach (var label in labels)
            {
                Console.WriteLine($"{label.Description} ({(int)(label.Score * 100)}%)");
            }
        }
    }
}

Take a minute or two to study the code and see how the Vision API C# library is used to perform label detection.

Back in Cloud Shell, run the app:

dotnet run

You should see the following output:

Labels (and confidence score):
==============================
Wheel (97%)
Tire (97%)
Photograph (94%)
Bicycle (94%)
Motor vehicle (89%)
Infrastructure (89%)
Vehicle (86%)
Mode of transport (84%)
Bicycle wheel (83%)
Asphalt (81%)

Summary

In this step, you were able to perform label detection on an image of a street scene in China and display the most likely labels associated with that image. Read more about Label Detection.

6. Perform Text Detection

Vision API's Text Detection performs Optical Character Recognition. It detects and extracts text within an image with support for a broad range of languages. It also features automatic language identification.

In this example, you will perform text detection on an image of a system software update screen.

Navigate to the Program.cs file inside the VisionApiDemo folder and replace the code with the following:

using Google.Cloud.Vision.V1;
using System;

namespace VisionApiDemo
{
    class Program
    {   
        static void Main(string[] args)
        {
            var client = ImageAnnotatorClient.Create();
            var image = Image.FromUri("gs://cloud-samples-data/vision/text/screen.jpg");
            var response = client.DetectText(image);
            foreach (var annotation in response)
            {
                if (annotation.Description != null)
                {
                    Console.WriteLine(annotation.Description);
                }
            }
        }
    }
}

Take a minute or two to study the code and see how the Vision API C# library is used to perform text detection.

Back in Cloud Shell, run the app:

dotnet run

You should see the following output:

System Software Update
Back
Preparing to install...
After preparation is complete, the PS4 will automatically restart and the update file will be
installed.
37%
gus class
System
Software
Update
Back
Preparing
to
install
...
After
preparation
is
complete
,
the
PS4
will
automatically
restart
and
the
update
file
will
be
installed
.
37
%
gus
class

Summary

In this step, you were able to perform text detection on an image of an Otter Crossing and print recognized text from the image. Read more about Text Detection.

7. Perform Landmark Detection

Vision API's Landmark Detection detects popular natural and man-made structures within an image.

In this example, you will perform landmark detection on an image of the Eiffel Tower.

Navigate to the Program.cs file inside the VisionApiDemo folder and replace the code with the following:

using Google.Cloud.Vision.V1;
using System;

namespace VisionApiDemo
{
    class Program
    {   
        static void Main(string[] args)
        {
            var client = ImageAnnotatorClient.Create();
            var image = Image.FromUri("gs://cloud-samples-data/vision/eiffel_tower.jpg");
            var response = client.DetectLandmarks(image);
            foreach (var annotation in response)
            {
                if (annotation.Description != null)
                {
                    Console.WriteLine(annotation.Description);
                }
            }
        }
    }
}

Take a minute or two to study the code and see how the Vision API C# library is used to perform landmark detection.

Back in Cloud Shell, run the app:

dotnet run

You should see the following output:

Eiffel Tower

Summary

In this step, you were able to perform landmark detection on image of the Eiffel Tower. Read more about Landmark Detection.

8. Perform Emotional Face Detection

Face Detection detects multiple faces within an image along with the associated key facial attributes such as emotional state or wearing headwear.

In this example, you will detect the likelihood of emotional state from four different emotional likelihoods including: joy, anger, sorrow, and surprise.

Navigate to the Program.cs file inside the VisionApiDemo folder and replace the code with the following:

using Google.Cloud.Vision.V1;
using System;

namespace VisionApiDemo
{
    class Program
    {
        static void Main(string[] args)
        {
            var client = ImageAnnotatorClient.Create();

            var image = Image.FromUri("gs://cloud-samples-data/vision/face/face_no_surprise.jpg");
            var response = client.DetectFaces(image);
            foreach (var annotation in response)
            {
                Console.WriteLine($"Picture: {image}");
                Console.WriteLine($" Surprise: {annotation.SurpriseLikelihood}");
            }
        }
    }
}

Take a minute or two to study the code and see how the Vision API C# library is used to perform emotional face detection.

Run the app;

dotnet run

You should see the following output for our face_no_surprise example:

Picture: { "source": { "imageUri": "gs://cloud-samples-data/vision/face/face_no_surprise.jpg" } }
 Surprise: Likely

Summary

In this step, you were able to perform emotional face detection. Read more about Face Detection.

9. Congratulations!

You learned how to use the Vision API using C# to perform different detection on images!

Clean up

To avoid incurring charges to your Google Cloud Platform account for the resources used in this quickstart:

Go to the Cloud Platform Console.
Select the project you want to shut down, then click ‘Delete' at the top: this schedules the project for deletion.

Learn More

Google Cloud Vision API: https://cloud.google.com/vision/docs/
C#/.NET on Google Cloud Platform: https://cloud.google.com/dotnet/
Google Cloud .NET client: https://googlecloudplatform.github.io/google-cloud-dotnet

License

This work is licensed under a Creative Commons Attribution 2.0 Generic License.