Learn to prototype Responsible AI with the PAIR Guidebook and MakerSuite

1. Before you begin

MakerSuite is a set of tools that let you prototype with large language models right from the browser, with no setup required. Using MakerSuite, you can go from trying out prompts to creating an API that your app can access directly, which helps teams quickly deliver great applications based on generative AI. The People + AI Research (PAIR) Guidebook offers guidance on how to design a new product with AI, focusing on human-centered data practices and earning user trust. This guidance also applies to using MakerSuite.

In this codelab, you learn how to leverage these two resources together in order to build responsible AI-based experiences. The focus of the codelab is on responsible prototyping with generative AI, not the end-to-end workflow of these specific resources. To learn about the general workflow for MakerSuite, see this basic tutorial for MakerSuite, and consult the PAIR Guidebook for more comprehensive guidance for designing AI products.

Prerequisites

  • Basic understanding of AI.
  • Some knowledge of product development workflow.

What you'll learn

  • How to use the PAIR Guidebook to probe how well your AI experiences work for different audiences, and how to know which tasks should or shouldn't use AI.
  • How to create generative AI experiences that draw from the richness of users' cultural practices.
  • How to integrate opportunities in the AI development process that earn user trust by focusing on user-facing explainability.
  • How to use a broader toolkit of generative AI materials and human-centered AI resources for further exploration.

What you'll build

This codelab walks you through a hands-on prototyping process for responsible generative AI as you design a creative writing tool. If you are interested, you can even integrate the prompts you design into Wordcraft, an open-source AI-powered text editor released by Google as a research prototype.

What you'll need

  • Browser
  • Google account, in order to access MakerSuite

2. Get set up

MakerSuite

MakerSuite is a set of Google tools that lets you prototype with large language models right from the browser—no setup required. You can quickly try out models and experiment with different prompts. When you've built something you're happy with, you can easily export it as Python code, and then call the same models using the Generative Language API.

To experiment with large language models using MakerSuite, sign up for the wait list.

People + AI Research Guidebook

The People + AI Research (PAIR) Guidebook is a resource that helps developers, designers, product managers, students, and many others use AI responsibly.

The PAIR Guidebook can help you and your team develop a list of key questions related to AI—including generative AI—in your product.

  • When and how should I use AI in my product?
  • How do I help users build trust in my AI system?
  • How do I explain my AI system to users?
  • How can AI experiences be culturally inclusive and equity-oriented?

You use the PAIR Guidebook throughout this codelab in order to develop questions for prototyping and to choose among different design options.

Get the code for Wordcraft (optional)

Wordcraft is an AI-powered text editor developed at Google Research that explores collaborative human and AI story writing. The Wordcraft code is open source, so you can experiment with prompts in this codelab on your own.

  • To get the code for Wordcraft, use the following command:
git clone https://github.com/pair-code/wordcraft

Alternatively, you can download the zip file:

TBD

3. Use generative AI for story writing

A large language model (LLM) is an AI model that is trained on huge amounts of text from books, articles, and websites in order to learn grammar, common phrases, and other information. Based on this data and with some additional finetuning, an LLM like PaLM can complete many artificial intelligence tasks based on simple instructions rather than requiring sophisticated machine learning programming. It can also answer questions, summarize information, translate languages, and perform many other AI tasks.

In this codelab, you use an LLM to prototype an app that helps authors write stories. In addition to having general information about the world, grammar, and so on, Google's PaLM LLM is designed to follow user instructions, or prompts. So, to prototype your tool in MakerSuite, you write a prompt that tells the model what to write in response to a user's input.

Write AI-assisted stories using text prompts in MakerSuite

  1. To create a prompt, click Create New in the left panel, and choose Text prompt. Start with this prompt:
You are a talented fiction author. Write a story about a given topic.
Topic: {{topic}}

Once you enter this prompt, MakerSuite detects that {{topic}} is an input to the prompt and opens a Test your prompt panel so you can see how your prompt works with a variety of inputs.
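The input detection described above can be approximated outside MakerSuite. The following is a minimal sketch, assuming that inputs are written as double-brace placeholders as in the prompt above. The helper names `find_inputs` and `fill_prompt` are hypothetical and aren't part of MakerSuite.

```python
import re

# Matches {{name}} placeholders; names may contain spaces, e.g. {{rewrite style}}.
PLACEHOLDER = re.compile(r"\{\{\s*(.+?)\s*\}\}")

PROMPT = (
    "You are a talented fiction author. Write a story about a given topic.\n"
    "Topic: {{topic}}"
)


def find_inputs(template: str) -> list[str]:
    """Return placeholder names in order of first appearance, without duplicates."""
    names = []
    for name in PLACEHOLDER.findall(template):
        if name not in names:
            names.append(name)
    return names


def fill_prompt(template: str, values: dict[str, str]) -> str:
    """Replace every placeholder with its value; raise KeyError if one is missing."""
    return PLACEHOLDER.sub(lambda match: values[match.group(1)], template)


if __name__ == "__main__":
    print(find_inputs(PROMPT))
    print(fill_prompt(PROMPT, {"topic": "A boy discovers a lost cat in his yard"}))
```

Raising an error on a missing input, rather than silently leaving the placeholder in place, makes it harder to send a half-filled prompt to the model.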

  1. Enter A boy discovers a lost cat in his yard as the topic, click the model panel (to the right of the Run button), and set the temperature to 0.8.
  2. Finally, click Run in order to run your prompt. You should see the prompt output in the table momentarily.

Setting the temperature to 0.8 tells the model that you prefer diversity in its output. This results in more creative stories, but the output differs every time the prompt is run. If you want the same output each time, set the temperature to 0.
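To make the effect of temperature more concrete, here is a toy sketch of temperature-scaled sampling over three made-up next-word scores. It illustrates the general technique only: it is an assumption that the model behind MakerSuite applies temperature in exactly this way, and the words, scores, and function names are hypothetical.

```python
import math
import random


def temperature_probs(scores: dict[str, float], temperature: float) -> dict[str, float]:
    """Turn raw scores into probabilities; temperature 0 puts all mass on the top score."""
    if temperature < 0:
        raise ValueError("temperature must be non-negative")
    if temperature == 0:
        best = max(scores, key=scores.get)
        return {word: 1.0 if word == best else 0.0 for word in scores}
    scaled = {word: score / temperature for word, score in scores.items()}
    # Subtract the largest scaled score before exponentiating for numerical stability.
    top = max(scaled.values())
    exps = {word: math.exp(value - top) for word, value in scaled.items()}
    total = sum(exps.values())
    return {word: value / total for word, value in exps.items()}


def sample(scores: dict[str, float], temperature: float, rng: random.Random) -> str:
    """Draw one word according to the temperature-adjusted probabilities."""
    probs = temperature_probs(scores, temperature)
    return rng.choices(list(probs), weights=list(probs.values()), k=1)[0]


if __name__ == "__main__":
    scores = {"cat": 2.0, "kitten": 1.5, "dog": 0.5}  # hypothetical scores
    rng = random.Random(0)
    for temperature in (0, 0.8):
        print(temperature, [sample(scores, temperature, rng) for _ in range(8)])
```

In this sketch, a temperature of 0 always picks the highest-scoring word, while higher temperatures spread probability across the alternatives.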

One possible output is the following:

Henry sat on the edge of his porch and swung his legs back and forth. A small cat ran in front of him. Henry petted the cat. The cat began to purr. Henry picked up the cat and placed it on his lap. The cat began to snuggle close to him. Henry decided to keep the cat.

  1. Try running the prompt several times and notice the different stories that are created.

Screenshot of the MakerSuite editor. The view shows the prompt on top, with the Test your prompt table of test inputs below.

As you can see, the model writes a structured story that flows logically, but it also makes several assumptions. For instance, the story centers on a boy called Henry. You can change these assumptions by specifying the protagonist's name, or even by specifying whether the story should focus on the cat or the human.

  1. Update your prompt, and then click Run to see how it works with all test inputs.

Identify the tasks best suited for AI assistance, using the PAIR Guidebook

So far, the assumption is that the AI model writes a complete story, given only a brief description. But is this the right design decision for your creative tool? Imagine instead an assistant that helps authors rewrite parts of the story that they choose. You can prototype this interaction in MakerSuite with, for instance, a prompt that makes a story fragment more dramatic.

This provides much more focused assistance by rewriting one paragraph at a time. At a higher level, with a few changes to your prompt, you can prototype a user-augmentation tool rather than a task-automation tool.

The PAIR Guidebook offers a principled way to ask and answer questions like these in the AI development process. While MakerSuite helps you prototype ideas quickly, the PAIR Guidebook allows you to narrow down design choices to the most promising ones for your purposes and the audience you aim to engage. Use the Guidebook to understand whether augmentation or automation is the right approach for partnering with AI to build your app.

Begin with the How should I use AI? guiding question in the Guidebook. As this Guidebook pattern notes, it is better to use AI when it adds unique value. In this case, LLMs are trained on large amounts of data about grammar, common phrases, and other information from the internet. It might therefore be useful to leverage the model's ability to understand the world of the story that an author is writing in your app and to suggest ways to rewrite it. This builds on the personalized recommendation pattern in the Guidebook.

Take this a step further. The PAIR Guidebook offers a chapter on user needs with guidance on whether tasks should be automated or augmented.

When considering augmentation or automation, remember that your prototype is meant to be a helpful app for writers. So, it seems likely that your users enjoy writing, want to take personal ownership of their writing, and have preferences built over a lifetime of writing that might be difficult to communicate. Taken together, this suggests that an augmentation approach might be the more promising option.

Based on the PAIR Guidebook, it might make sense to think of the app you are prototyping as not a tool for writing, but for rewriting. For instance, you can change the prompt to allow for different styles of writing.

  1. Create a new text prompt:
Edit the paragraph below. Make it {{rewrite style}}. Only respond with the updated text. Do not include any explanation.

Paragraph: {{paragraph}}

Here, both {{rewrite style}} and {{paragraph}} are text inputs.

  1. In the testing panel, try a number of rewriting styles such as shorter, more dramatic, more witty, less grammatically awkward, poetic, and so on.
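The testing step above amounts to pairing each paragraph with each rewrite style. Here is a minimal sketch of building that test table in code, assuming simple string replacement of the two inputs. `build_test_prompts` is a hypothetical helper, and the styles are taken from the list above.

```python
from itertools import product

REWRITE_PROMPT = (
    "Edit the paragraph below. Make it {{rewrite style}}. "
    "Only respond with the updated text. Do not include any explanation.\n\n"
    "Paragraph: {{paragraph}}"
)

STYLES = ["shorter", "more dramatic", "more witty", "poetic"]


def build_test_prompts(paragraphs, styles=STYLES):
    """Return one (style, paragraph, prompt) row per paragraph and style pair."""
    rows = []
    for paragraph, style in product(paragraphs, styles):
        prompt = (
            REWRITE_PROMPT
            .replace("{{rewrite style}}", style)
            .replace("{{paragraph}}", paragraph)
        )
        rows.append((style, paragraph, prompt))
    return rows


if __name__ == "__main__":
    for style, _, prompt in build_test_prompts(["Henry petted the cat."]):
        print(f"--- {style} ---\n{prompt}\n")
```

Keeping the styles in one list makes it easy to rerun the same comparison whenever the prompt wording changes.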

Design for stories around the world

So far, you've tested the rewrite-a-paragraph prompt with stories that lack a strong cultural context. When designing Responsible AI experiences, it is often useful to try a diverse set of inputs.

Try a number of test inputs, such as:

  • In a quiet corner of a quaint Parisian café, a solitary patron savored the aroma of freshly brewed coffee, his thoughts drifting to a long-forgotten moment that forever changed the course of his life.
  • Amidst the chaotic energy of a Mumbai local train, a middle-aged woman struck up a conversation with a stranger. How fascinating, she thought, to live in the same city and have lives that were so different.
  • Amid the vibrant chaos of a bustling Shanghai street market, a street food vendor took a moment to observe the ebb and flow of the crowd.

Experiment with other cultural and geographical contexts responsibly, taking care to avoid unfair bias and historical stereotypes. Note that while the LLM is knowledgeable about many parts of the world based on existing data found online, it may not get all the details on a specific geographical place right. As the PAIR Guidebook suggests, it is important in augmentation tasks to offer control to the users. For instance, you can extend the rewrite capabilities of your prototype to allow for greater control of the plot and details of the story.

Generative models also sometimes exhibit default assumptions, due in part to patterns that are more prevalent in their huge training datasets of online information. It's important to know that models can be steered to make other, equally valid assumptions. For instance, with the rewrite-a-paragraph prompt above, you can specify a gender for the stranger on the train by changing the rewrite style to "shorter. Remember the stranger is a woman, as well."

4. Build trust

Without users' trust, even the most innovative AI capabilities might go unused. Trust is a result of users feeling that the AI is capable, reliable, and helpful. Helping users develop trust can encourage them to learn how and when to use specific features, and it can lead to a better user experience overall.

The PAIR Guidebook offers a few ideas to help users determine how much they should trust AI systems:

Build trust early on

With generative AI, it is especially useful to communicate the intent of the features and to help users understand the AI's limitations. For example, because language models are designed primarily to predict what comes next in text, their output may not always be factually accurate. So, it's important to help users understand that this prototype is a creative writing aid and is not intended to be factual. If users want certain details to be factual, they should check them against trusted sources.

Brainstorm a few different ways you might help users understand that this prototype is not intended to be used for writing factual information, but is specifically for writing fiction.

Maintain trust

Similarly, while generative AI models are highly capable, users cannot always verify that a task was completed correctly. This prototype is designed around targeted completion of text and targeted rewriting of fiction, which are capabilities that users can easily verify. In contrast, while generative models can easily be prompted to rewrite large portions of text, users may miss subtle errors that creep in. In general, focusing interactive generative AI features on tasks that users can readily verify helps earn their trust.
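One way to make a rewrite easier to verify is to show users exactly which words changed. The sketch below is an illustration rather than part of this prototype or of Wordcraft: it uses Python's standard `difflib` module, and `word_changes` is a hypothetical helper.

```python
import difflib


def word_changes(original: str, rewritten: str) -> list[tuple[str, str, str]]:
    """List (operation, old_words, new_words) for every span that the rewrite changed."""
    old, new = original.split(), rewritten.split()
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    changes = []
    for op, i1, i2, j1, j2 in matcher.get_opcodes():
        if op != "equal":
            changes.append((op, " ".join(old[i1:i2]), " ".join(new[j1:j2])))
    return changes


if __name__ == "__main__":
    before = "Henry petted the cat."
    after = "Henry petted the dog."
    for op, old_words, new_words in word_changes(before, after):
        print(f"{op}: {old_words!r} -> {new_words!r}")
```

A short list of changes is quick for a user to scan, which is one reason that paragraph-sized rewrites are easier to check than whole-story rewrites.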

A final opportunity to maintain trust is to leverage the steerability of generative models. Unlike previous AI models that are designed for a narrowly specified task, outputs of generative models are much easier for end users to customize (as demonstrated by asking for more dramatic, shorter, or similar rewrites). While such steerability may lead to a better user experience, care should be taken to bound this steerability within the model's capabilities. For instance, in this prototype, instead of asking users for ways to rewrite their text, you might offer a list of rewrite instructions found to work well as suggestions to the end user.
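The idea of offering a vetted list of rewrite instructions can be sketched as a small allowlist check. This is a minimal sketch: the styles listed are examples from earlier in this codelab, not a list that has actually been evaluated, and `choose_style` is a hypothetical helper.

```python
# Example styles only; a real list would come from testing what works well.
SUGGESTED_STYLES = ("shorter", "more dramatic", "more witty", "poetic")


def choose_style(requested: str, allowed=SUGGESTED_STYLES) -> str:
    """Return the normalized style if it is in the curated list; otherwise raise ValueError."""
    normalized = " ".join(requested.lower().split())
    if normalized not in allowed:
        raise ValueError(
            f"Unsupported style {requested!r}; choose one of: {', '.join(allowed)}"
        )
    return normalized


if __name__ == "__main__":
    print(choose_style("  More   Dramatic "))
```

In a user interface, the same list could populate a menu so that users never need to type a free-form instruction.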

Recover from lost trust

Despite your best efforts, there might be cases in which the model yields suboptimal results. In such cases, it's important to allow users to undo any AI actions. Similarly, it's often better to identify features that have variable performance and only trigger them when users explicitly request AI assistance.

  • Brainstorm a few different ways you might create undo features or other ways to recover user trust.
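As one starting point for the brainstorm, an undo feature can be as simple as keeping a history of the text before each AI edit. The following is a minimal sketch under that assumption. The `Draft` class is hypothetical and is not taken from Wordcraft.

```python
class Draft:
    """Holds the current text plus a history so that AI edits can be undone."""

    def __init__(self, text: str):
        self.text = text
        self._history: list[str] = []

    def apply_ai_edit(self, new_text: str) -> None:
        """Save the current text, then replace it with the AI's suggestion."""
        self._history.append(self.text)
        self.text = new_text

    def undo(self) -> bool:
        """Restore the text from before the latest AI edit; return False if none is left."""
        if not self._history:
            return False
        self.text = self._history.pop()
        return True


if __name__ == "__main__":
    draft = Draft("Henry petted the cat.")
    draft.apply_ai_edit("Henry stroked the trembling cat.")
    draft.undo()
    print(draft.text)
```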

You can see solutions to these challenges in the codelab solution.

5. Put it all together

So far, you have experimented with prompts in MakerSuite. When you are happy with these prompts, you can use them directly in your prototype.

  • First, save your prompt, and then click Get code in the top-right corner. If you haven't already, you also need to enable your API key by clicking Enable API key in the Get code dialog that is displayed.

The MakerSuite toolbar. The Get code button is at the top right.

MakerSuite generates code that you can use directly in your application. For example, for use with a web application, choose the JavaScript code. You can copy the code directly from the dialog and paste it in your web app. If you update your prompt in MakerSuite, remember to update it in your code using the prompt variable in the included code.

Dialog box showing code generated by MakerSuite. Users can choose between using cURL, JavaScript or Python libraries, or retrieving the prompt information as JSON.
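Once the prompt lives in your code, it helps to keep it in a single variable and to isolate the model call behind one function. The sketch below shows that structure in Python. It doesn't reproduce the code that MakerSuite exports: the model call is passed in as the `generate` argument, and `fake_generate` is a hypothetical stand-in so that the example runs without an API key.

```python
from typing import Callable

# Keep this in sync with the prompt saved in MakerSuite.
PROMPT = (
    "Edit the paragraph below. Make it {{rewrite style}}. "
    "Only respond with the updated text. Do not include any explanation.\n\n"
    "Paragraph: {{paragraph}}"
)


def rewrite(paragraph: str, style: str, generate: Callable[[str], str]) -> str:
    """Fill the prompt and return the model's response with surrounding whitespace removed."""
    prompt = PROMPT.replace("{{rewrite style}}", style).replace("{{paragraph}}", paragraph)
    return generate(prompt).strip()


def fake_generate(prompt: str) -> str:
    """Hypothetical stand-in for the model call; returns the paragraph in upper case."""
    return prompt.rsplit("Paragraph: ", 1)[1].upper() + "\n"


if __name__ == "__main__":
    print(rewrite("Henry petted the cat.", "more dramatic", fake_generate))
```

To use a real model, you would replace `fake_generate` with a function that wraps the call from the exported code and returns the generated text.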

If you want to integrate this API into a pre-built app for creative writing, you can download the Wordcraft code.

Codelab solution

You can get the code for Wordcraft from GitHub:

git clone https://github.com/pair-code/wordcraft

Alternatively, you can download the repository as a zip file:

6. Congratulations

You completed the Learn to prototype Responsible AI with the PAIR Guidebook and MakerSuite codelab and learned how to prototype Responsible AI experiences (in this case, for a creative writing app) using a few Google tools. We can't wait to see what you build!

Further reading