In this codelab, you will focus on using the Google Cloud Natural Language API with Python. You will learn how to perform sentiment analysis, entity analysis, syntax analysis, and content classification.

What you'll learn

What you'll need

Survey

How will you use this tutorial?

Read it through only
Read it and complete the exercises

How would you rate your experience with Python?

Novice
Intermediate
Proficient

How would you rate your experience with using Google Cloud Platform services?

Novice
Intermediate
Proficient

Self-paced environment setup

If you don't already have a Google Account (Gmail or Google Apps), you must create one. Sign in to the Google Cloud Platform console (console.cloud.google.com) and create a new project:


Remember the project ID, a unique name across all Google Cloud projects (the name above has already been taken and will not work for you, sorry!). It will be referred to later in this codelab as PROJECT_ID.

Next, you'll need to enable billing in the Cloud Console in order to use Google Cloud resources.

Running through this codelab shouldn't cost you more than a few dollars, but it could be more if you decide to use more resources or if you leave them running (see the "Clean up" section at the end of this document).

New users of Google Cloud Platform are eligible for a $300 free trial.

Start Cloud Shell

While Google Cloud can be operated remotely from your laptop, in this codelab you will be using Google Cloud Shell, a command-line environment running in the Cloud.

Activate Google Cloud Shell

From the GCP Console click the Cloud Shell icon on the top right toolbar:

Then click "Start Cloud Shell":

It should only take a few moments to provision and connect to the environment:

This virtual machine is loaded with all the development tools you'll need. It offers a persistent 5GB home directory and runs on Google Cloud, greatly enhancing network performance and authentication. Much, if not all, of your work in this lab can be done simply with a browser or your Google Chromebook.

Once connected to Cloud Shell, you should see that you are already authenticated and that the project is already set to your PROJECT_ID.

Run the following command in Cloud Shell to confirm that you are authenticated:

gcloud auth list

Command output

Credentialed accounts:
 - <myaccount>@<mydomain>.com (active)
Run the following command in Cloud Shell to confirm that gcloud knows about your project:

gcloud config list project

Command output

[core]
project = <PROJECT_ID>

If the project is not set, you can set it with this command:

gcloud config set project <PROJECT_ID>

Command output

Updated property [core/project].

Before you can begin using the Natural Language API, you must enable it. In Cloud Shell, run the following command:

gcloud services enable language.googleapis.com

In order to make requests to the Natural Language API, you need to use a Service Account. A Service Account belongs to your project and it is used by the Python client library to make Natural Language API requests. Like any other user account, a service account is represented by an email address. In this section, you will use the Cloud SDK to create a service account and then create credentials you will need to authenticate as the service account.

First, set a PROJECT_ID environment variable:

export PROJECT_ID=$(gcloud config get-value core/project)

Next, create a new service account to access the Natural Language API by using:

gcloud iam service-accounts create my-nl-sa \
  --display-name "my nl service account"

Next, create credentials that your Python code will use to log in as your new service account. Create and save these credentials as a ~/key.json JSON file by using the following command:

gcloud iam service-accounts keys create ~/key.json \
  --iam-account  my-nl-sa@${PROJECT_ID}.iam.gserviceaccount.com

Finally, set the GOOGLE_APPLICATION_CREDENTIALS environment variable, which the Natural Language API Python library (covered in the next step) uses to find your credentials. Set it to the full path of the credentials JSON file you created:

export GOOGLE_APPLICATION_CREDENTIALS="/home/${USER}/key.json"

You can read more about authenticating the Natural Language API.

We're going to use the Google Cloud Python client library, which should already be installed in your Cloud Shell environment. You can read more about Google Cloud Python services here.

Check that the client library is already installed:

pip freeze | grep google-cloud-language

You should see something similar to:

google-cloud-language==1.2.0

Now, you're ready to use the Natural Language API!

In this codelab, we'll use an interactive Python interpreter called IPython. Start a session by running ipython in Cloud Shell. This command runs the Python interpreter in an interactive Read-Eval-Print Loop (REPL) session.

user@project:~$ ipython
Python 3.5.3 (default, Sep 27 2018, 17:25:39)
Type 'copyright', 'credits' or 'license' for more information
IPython 7.6.0 -- An enhanced Interactive Python. Type '?' for help.

In [1]:

In this section, we will perform Sentiment Analysis on a string and find out the Score and Magnitude using the Natural Language API.

The Score of the sentiment ranges between -1.0 (negative) and 1.0 (positive) and corresponds to the overall sentiment from the given information.

The Magnitude of the sentiment ranges from 0.0 to +infinity and indicates the overall strength of sentiment from the given information. The more information that is provided, the higher the magnitude.

Copy the following code into your IPython session:

from google.cloud import language
from google.cloud.language import enums
from google.cloud.language import types


def analyze_text_sentiment(text):
    client = language.LanguageServiceClient()
    document = types.Document(
        content=text,
        type=enums.Document.Type.PLAIN_TEXT)

    response = client.analyze_sentiment(document=document)

    sentiment = response.document_sentiment
    results = [
        ('text', text),
        ('score', sentiment.score),
        ('magnitude', sentiment.magnitude),
    ]
    for k, v in results:
        print('{:10}: {}'.format(k, v))
 

Call the function:

text = 'Guido van Rossum is great!'
analyze_text_sentiment(text)

You should see the following output:

text      : Guido van Rossum is great!
score     : 0.800000011920929
magnitude : 0.800000011920929

Take a moment to test your own sentences.
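
Score and magnitude are easiest to interpret together: a near-zero score with a high magnitude often indicates mixed sentiment rather than a truly neutral text. Here is a minimal sketch of such an interpretation in plain Python; the thresholds are arbitrary examples for illustration and are not part of the API:

```python
def interpret_sentiment(score, magnitude, threshold=0.25):
    # The threshold and the magnitude cutoff below are illustrative
    # choices, not values defined by the Natural Language API.
    if score >= threshold:
        return 'positive'
    elif score <= -threshold:
        return 'negative'
    elif magnitude >= 1.0:
        # Near-zero score but strong emotions pulling in both directions
        return 'mixed'
    else:
        return 'neutral'


print(interpret_sentiment(0.8, 0.8))   # positive
print(interpret_sentiment(-0.6, 0.9))  # negative
print(interpret_sentiment(0.0, 2.5))   # mixed
```

You could call a helper like this on response.document_sentiment to turn the raw numbers into a label.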

Summary

In this step, you performed Sentiment Analysis on a string of text and printed out its score and magnitude. Read more about Sentiment Analysis.

Entity Analysis inspects the given information for entities by searching for proper nouns such as public figures, landmarks, etc., and returns information about those entities.

Copy the following code into your IPython session:

from google.cloud import language
from google.cloud.language import enums
from google.cloud.language import types


def analyze_text_entities(text):
    client = language.LanguageServiceClient()
    document = types.Document(
        content=text,
        type=enums.Document.Type.PLAIN_TEXT)

    response = client.analyze_entities(document=document)

    for entity in response.entities:
        print('=' * 79)
        results = [
            ('name', entity.name),
            ('type', enums.Entity.Type(entity.type).name),
            ('salience', entity.salience),
            ('wikipedia_url', entity.metadata.get('wikipedia_url', '-')),
            ('mid', entity.metadata.get('mid', '-')),
        ]
        for k, v in results:
            print('{:15}: {}'.format(k, v))
 

Call the function:

text = 'Guido van Rossum is great, and so is Python!'
analyze_text_entities(text)

You should see the following output:

===============================================================================
name           : Guido van Rossum
type           : PERSON
salience       : 0.6580443978309631
wikipedia_url  : https://en.wikipedia.org/wiki/Guido_van_Rossum
mid            : /m/01h05c
===============================================================================
name           : Python
type           : ORGANIZATION
salience       : 0.34195560216903687
wikipedia_url  : https://en.wikipedia.org/wiki/Python_(programming_language)
mid            : /m/05z1_

Take a moment to test your own sentences mentioning other entities.
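
The salience field estimates how relevant each entity is to the overall text, from 0.0 to 1.0. If you want to rank entities by prominence, a small sketch over hypothetical (name, salience) pairs (matching the output above) could look like:

```python
# Hypothetical (name, salience) pairs, matching the output above
entities = [('Python', 0.342), ('Guido van Rossum', 0.658)]

# Rank entities by salience, most prominent first
ranked = sorted(entities, key=lambda e: e[1], reverse=True)
for name, salience in ranked:
    print('{:20}: {:.1%}'.format(name, salience))
```

With a real response, you would build the pairs from entity.name and entity.salience instead of hardcoding them.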

Summary

In this step, you performed Entity Analysis on a string of text and printed out its entities. Read more about Entity Analysis.

Syntactic Analysis extracts linguistic information, breaking up the given text into a series of sentences and tokens (generally, word boundaries), providing further analysis on those tokens.

This example will print out the number of sentences, tokens, and provide the part of speech for each token.

Copy the following code into your IPython session:

from google.cloud import language
from google.cloud.language import enums
from google.cloud.language import types


def analyze_text_syntax(text):
    client = language.LanguageServiceClient()
    document = types.Document(
        content=text,
        type=enums.Document.Type.PLAIN_TEXT)

    response = client.analyze_syntax(document=document)

    fmts = '{:10}: {}'
    print(fmts.format('sentences', len(response.sentences)))
    print(fmts.format('tokens', len(response.tokens)))
    for token in response.tokens:
        part_of_speech_tag = enums.PartOfSpeech.Tag(token.part_of_speech.tag)
        print(fmts.format(part_of_speech_tag.name, token.text.content))
 

Call the function:

text = 'Guido van Rossum is great!'
analyze_text_syntax(text)

You should see the following output:

sentences : 1
tokens    : 6
NOUN      : Guido
NOUN      : van
NOUN      : Rossum
VERB      : is
ADJ       : great
PUNCT     : !

Take a moment to test your own sentences with other syntactic structures.
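
Once you have the tokens, you can post-process them with plain Python. For example, this sketch tallies the part-of-speech tags with collections.Counter, using hypothetical (tag, word) pairs that match the output above:

```python
from collections import Counter

# Hypothetical (tag, word) pairs, matching the output above
tokens = [('NOUN', 'Guido'), ('NOUN', 'van'), ('NOUN', 'Rossum'),
          ('VERB', 'is'), ('ADJ', 'great'), ('PUNCT', '!')]

# Tally how often each part-of-speech tag appears
counts = Counter(tag for tag, _ in tokens)
for tag, count in counts.most_common():
    print('{:6}: {}'.format(tag, count))
```

With a real response, you would build the pairs from the tokens returned by analyze_syntax instead of hardcoding them.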

Here is a visual interpretation showing the complete syntactic analysis:

Summary

In this step, you performed Syntax Analysis on a simple string of text and printed out the number of sentences, number of tokens, and linguistic information for each token. Read more about Syntax Analysis.

Content Classification analyzes a document and returns a list of content categories that apply to the text found in the document.

This example will print out the categories that apply to a description of the Python language.

Copy the following code into your IPython session:

from google.cloud import language
from google.cloud.language import enums
from google.cloud.language import types


def classify_text(text):
    client = language.LanguageServiceClient()
    document = types.Document(
        content=text,
        type=enums.Document.Type.PLAIN_TEXT)

    response = client.classify_text(document=document)

    for category in response.categories:
        print('=' * 79)
        print('category  : {}'.format(category.name))
        print('confidence: {:.0%}'.format(category.confidence))
 

Call the function:

text = """
Python is an interpreted, high-level, general-purpose programming language. 
Created by Guido van Rossum and first released in 1991, 
Python's design philosophy emphasizes code readability 
with its notable use of significant whitespace.
"""
classify_text(text)

You should see the following output:

===============================================================================
category  : /Computers & Electronics/Programming
confidence: 99%
===============================================================================
category  : /Science/Computer Science
confidence: 99%

Take a moment to test your own sentences relating to other categories. A complete list of content categories can be found here.
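
Each category comes with a confidence value between 0.0 and 1.0. If you only care about confident matches, you could filter the results; this sketch uses hypothetical (category, confidence) pairs and an arbitrary 0.5 example threshold:

```python
# Hypothetical (category, confidence) pairs from a classification response
categories = [('/Computers & Electronics/Programming', 0.99),
              ('/Science/Computer Science', 0.99),
              ('/Arts & Entertainment', 0.12)]

# Keep only confident matches (0.5 is an arbitrary example threshold)
confident = [(name, conf) for name, conf in categories if conf >= 0.5]
for name, conf in confident:
    print('{}: {:.0%}'.format(name, conf))
```

With a real response, you would build the pairs from category.name and category.confidence instead of hardcoding them.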

Summary

In this step, you performed Content Classification on a string of text and printed out the related categories. Read more about Content Classification.

You learned how to use the Natural Language API with Python to perform different kinds of analysis on text!

Clean up

To avoid incurring charges to your Google Cloud Platform account for the resources used in this codelab:

Learn More

License

This work is licensed under a Creative Commons Attribution 2.0 Generic License.