1. Overview
In this codelab, you will focus on using the Natural Language API with Python. You will learn how to perform sentiment analysis, entity analysis, syntax analysis, and content classification.
What you'll learn
- How to set up your environment
- How to perform sentiment analysis
- How to perform entity analysis
- How to perform syntax analysis
- How to perform content classification
What you'll need
2. Setup and requirements
Self-paced environment setup
- Sign in to the Google Cloud Console and create a new project or reuse an existing one. If you don't already have a Gmail or Google Workspace account, you must create one.
- The Project name is the display name for this project's participants. It is a character string not used by Google APIs. You can always update it.
- The Project ID is unique across all Google Cloud projects and is immutable (cannot be changed after it has been set). The Cloud Console auto-generates a unique string; usually you don't care what it is. In most codelabs, you'll need to reference your Project ID (typically identified as PROJECT_ID). If you don't like the generated ID, you might generate another random one. Alternatively, you can try your own and see if it's available. It can't be changed after this step and remains for the duration of the project.
- For your information, there is a third value, a Project Number, which some APIs use. Learn more about all three of these values in the documentation.
- Next, you'll need to enable billing in the Cloud Console to use Cloud resources/APIs. Running through this codelab won't cost much, if anything at all. To shut down resources to avoid incurring billing beyond this tutorial, you can delete the resources you created or delete the project. New Google Cloud users are eligible for the $300 USD Free Trial program.
Start Cloud Shell
While Google Cloud can be operated remotely from your laptop, in this codelab you will be using Cloud Shell, a command line environment running in the Cloud.
Activate Cloud Shell
- From the Cloud Console, click Activate Cloud Shell.
If this is your first time starting Cloud Shell, you're presented with an intermediate screen describing what it is. If so, click Continue.
It should only take a few moments to provision and connect to Cloud Shell.
This virtual machine is loaded with all the development tools needed. It offers a persistent 5 GB home directory and runs in Google Cloud, greatly enhancing network performance and authentication. Much, if not all, of your work in this codelab can be done with a browser.
Once connected to Cloud Shell, you should see that you are authenticated and that the project is set to your project ID.
- Run the following command in Cloud Shell to confirm that you are authenticated:
gcloud auth list
Command output
     Credentialed Accounts
ACTIVE  ACCOUNT
*       <my_account>@<my_domain.com>

To set the active account, run:
    $ gcloud config set account `ACCOUNT`
- Run the following command in Cloud Shell to confirm that the gcloud command knows about your project:
gcloud config list project
Command output
[core]
project = <PROJECT_ID>
If the project is not set, you can set it with this command:
gcloud config set project <PROJECT_ID>
Command output
Updated property [core/project].
3. Environment setup
Before you can begin using the Natural Language API, run the following command in Cloud Shell to enable the API:
gcloud services enable language.googleapis.com
You should see something like this:
Operation "operations/..." finished successfully.
Now, you can use the Natural Language API!
Navigate to your home directory:
cd ~
Create a Python virtual environment to isolate the dependencies:
virtualenv venv-language
Activate the virtual environment:
source venv-language/bin/activate
Install IPython and the Natural Language API client library:
pip install ipython google-cloud-language
You should see something like this:
...
Installing collected packages: ..., ipython, google-cloud-language
Successfully installed ... google-cloud-language-2.9.0 ...
Now, you're ready to use the Natural Language API client library!
In the next steps, you'll use an interactive Python interpreter called IPython, which you installed in the previous step. Start a session by running ipython in Cloud Shell:
ipython
You should see something like this:
Python 3.9.2 (default, Feb 28 2021, 17:03:44)
Type 'copyright', 'credits' or 'license' for more information
IPython 8.12.0 -- An enhanced Interactive Python. Type '?' for help.

In [1]:
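Optionally, you can run a quick sanity check in this IPython session to confirm that the client library imports and that a client can be created. In Cloud Shell, authentication is already handled for you, so this should just work:

from google.cloud import language

# Creating a client confirms the library is installed and that your
# Cloud Shell credentials are picked up automatically.
client = language.LanguageServiceClient()
print(type(client))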
4. Sentiment analysis
In this section, you will perform sentiment analysis on a string and find out the Score and Magnitude using the Natural Language API.
The Score of the sentiment ranges between -1.0 (negative) and 1.0 (positive) and corresponds to the overall sentiment from the given information.
The Magnitude of the sentiment ranges from 0.0 to +infinity and indicates the overall strength of sentiment from the given information. The more information provided, the higher the magnitude. For example, a longer text that mixes clearly positive and clearly negative statements can end up with a score close to 0.0 but a high magnitude.
Copy the following code into your IPython session:
from google.cloud import language


def analyze_text_sentiment(text: str):
    client = language.LanguageServiceClient()
    document = language.Document(content=text, type_=language.Document.Type.PLAIN_TEXT)

    response = client.analyze_sentiment(document=document)

    sentiment = response.document_sentiment
    results = dict(
        text=text,
        score=f"{sentiment.score:.1%}",
        magnitude=f"{sentiment.magnitude:.1%}",
    )
    for key, value in results.items():
        print(f"{key:10}: {value}")
Call the function:
text = "Guido van Rossum is great!"
analyze_text_sentiment(text)
You should see the following output:
text      : Guido van Rossum is great!
score     : 90.0%
magnitude : 90.0%
Take a moment to test your own sentences.
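The response returned by analyze_sentiment also contains a sentiment for each individual sentence (response.sentences). As an optional extension, here is a small sketch that prints the sentence-level scores; it reuses the same document setup as the function above, and the sample sentence is just an illustration:

from google.cloud import language


def analyze_sentence_sentiment(text: str):
    # Optional extension: print the sentiment score of each sentence,
    # in addition to the document-level sentiment shown above.
    client = language.LanguageServiceClient()
    document = language.Document(content=text, type_=language.Document.Type.PLAIN_TEXT)
    response = client.analyze_sentiment(document=document)
    for sentence in response.sentences:
        print(f"{sentence.sentiment.score:+.1f} | {sentence.text.content}")


analyze_sentence_sentiment("Guido van Rossum is great! The weather is terrible today.")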
Summary
In this step, you performed sentiment analysis on a string of text and printed its score and magnitude. Read more about sentiment analysis.
5. Entity analysis
Entity analysis inspects the given information for entities by searching for proper nouns such as public figures, landmarks, etc., and returns information about those entities.
Copy the following code into your IPython session:
from google.cloud import language


def analyze_text_entities(text: str):
    client = language.LanguageServiceClient()
    document = language.Document(content=text, type_=language.Document.Type.PLAIN_TEXT)

    response = client.analyze_entities(document=document)

    for entity in response.entities:
        print("=" * 80)
        results = dict(
            name=entity.name,
            type=entity.type_.name,
            salience=f"{entity.salience:.1%}",
            wikipedia_url=entity.metadata.get("wikipedia_url", "-"),
            mid=entity.metadata.get("mid", "-"),
        )
        for key, value in results.items():
            print(f"{key:15}: {value}")
Call the function:
text = "Guido van Rossum is great, and so is Python!"
analyze_text_entities(text)
You should see the following output:
================================================================================
name           : Guido van Rossum
type           : PERSON
salience       : 65.8%
wikipedia_url  : https://en.wikipedia.org/wiki/Guido_van_Rossum
mid            : /m/01h05c
================================================================================
name           : Python
type           : ORGANIZATION
salience       : 34.2%
wikipedia_url  : https://en.wikipedia.org/wiki/Python_(programming_language)
mid            : /m/05z1_
Take a moment to test your own sentences mentioning other entities.
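Each entity in the response also lists its individual mentions in the text. As an optional extension, here is a small sketch that counts how many times each detected entity is mentioned; it reuses the same setup as the function above, and the sample sentence is just an illustration:

from google.cloud import language


def count_entity_mentions(text: str):
    # Optional extension: show how many times each detected entity
    # is mentioned in the input text.
    client = language.LanguageServiceClient()
    document = language.Document(content=text, type_=language.Document.Type.PLAIN_TEXT)
    response = client.analyze_entities(document=document)
    for entity in response.entities:
        print(f"{entity.name}: {len(entity.mentions)} mention(s)")


count_entity_mentions("Python is great. Python was created by Guido van Rossum.")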
Summary
In this step, you performed entity analysis on a string of text and printed its entities. Read more about entity analysis.
6. Syntax analysis
Syntax analysis extracts linguistic information, breaking up the given text into a series of sentences and tokens (generally based on word boundaries), providing further analysis on those tokens.
This example prints the number of sentences and tokens, and the part of speech of each token.
Copy the following code into your IPython session:
from google.cloud import language


def analyze_text_syntax(text: str):
    client = language.LanguageServiceClient()
    document = language.Document(content=text, type_=language.Document.Type.PLAIN_TEXT)

    response = client.analyze_syntax(document=document)

    line = "{:10}: {}"
    print(line.format("sentences", len(response.sentences)))
    print(line.format("tokens", len(response.tokens)))
    for token in response.tokens:
        print(line.format(token.part_of_speech.tag.name, token.text.content))
Call the function:
text = "Guido van Rossum is great!"
analyze_text_syntax(text)
You should see the following output:
sentences : 1
tokens    : 6
NOUN      : Guido
NOUN      : van
NOUN      : Rossum
VERB      : is
ADJ       : great
PUNCT     : !
Take a moment to test your own sentences with other syntactic structures.
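Tokens carry more linguistic information than just the part-of-speech tag; for example, each token also exposes a lemma (the base or dictionary form of the word). As an optional extension, here is a small sketch that prints it, reusing the same setup as the function above:

from google.cloud import language


def print_token_lemmas(text: str):
    # Optional extension: print each token together with its lemma
    # (the base or dictionary form of the word).
    client = language.LanguageServiceClient()
    document = language.Document(content=text, type_=language.Document.Type.PLAIN_TEXT)
    response = client.analyze_syntax(document=document)
    for token in response.tokens:
        print(f"{token.text.content:10} -> {token.lemma}")


print_token_lemmas("Guido van Rossum is great!")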
Summary
In this step, you performed syntax analysis on a simple string of text and printed the number of sentences, the number of tokens, and linguistic information for each token. Read more about syntax analysis.
7. Content classification
Content classification analyzes a document and returns a list of content categories that apply to the text found in the document.
This example prints the categories that apply to a description of the Python language.
Copy the following code into your IPython session:
from google.cloud import language


def classify_text(text: str):
    client = language.LanguageServiceClient()
    document = language.Document(content=text, type_=language.Document.Type.PLAIN_TEXT)

    response = client.classify_text(document=document)

    for category in response.categories:
        print("=" * 80)
        print(f"category  : {category.name}")
        print(f"confidence: {category.confidence:.0%}")
Call the function:
text = (
    "Python is an interpreted, high-level, general-purpose programming language. "
    "Created by Guido van Rossum and first released in 1991, "
    "Python's design philosophy emphasizes code readability "
    "with its notable use of significant whitespace."
)
classify_text(text)
You should see the following output:
================================================================================
category  : /Computers & Electronics/Programming
confidence: 99%
================================================================================
category  : /Science/Computer Science
confidence: 99%
Take a moment to test your own sentences relating to other categories. You'll find a complete list in content categories.
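Keep in mind that content classification generally works best on longer passages; very short inputs may not return any categories. If you want to classify your own text, here is a small sketch that reuses the classify_text function defined above (the sample paragraph is just an illustration, and the categories you get back may differ):

# Optional: classify a paragraph of your own. Classification generally
# works best on longer passages; this sample text is just an illustration.
my_text = (
    "The new smartphone features a larger display, a faster processor, "
    "and an improved camera system. Reviewers praised its battery life, "
    "although some criticized the higher price compared to last year's model."
)
classify_text(my_text)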
Summary
In this step, you performed content classification on a string of text and printed the related categories. Read more about content classification.
8. Congratulations!
You learned how to use the Natural Language API with Python!
Clean up
To clean up your development environment, from Cloud Shell:
- If you're still in your IPython session, go back to the shell:
exit
- Stop using the Python virtual environment:
deactivate
- Delete your virtual environment folder:
cd ~ ; rm -rf ./venv-language
To delete your Google Cloud project, from Cloud Shell:
- Retrieve your current project ID:
PROJECT_ID=$(gcloud config get-value core/project)
- Make sure this is the project you want to delete:
echo $PROJECT_ID
- Delete the project:
gcloud projects delete $PROJECT_ID
Learn more
- Test the demo in your browser: https://cloud.google.com/natural-language#natural-language-api-demo
- Natural Language documentation: https://cloud.google.com/natural-language/docs
- Python on Google Cloud: https://cloud.google.com/python
- Cloud Client Libraries for Python: https://github.com/googleapis/google-cloud-python
License
This work is licensed under a Creative Commons Attribution 2.0 Generic License.