Generative AI for Video Analytics with Vertex AI

About this codelab

Last updated Oct 30, 2023

Written by Ravi Manjunatha, Ishaan Agrawal

1. Introduction

Learn how to analyze what YouTube influencers say about any company or product using Google's generative AI capabilities.

With the advent of LLMs, tapping insights from a wide range of sources, such as balance sheets, views on social media platforms, and influencers' opinions, has become much easier.

Social media influencers, especially in the tech and finance world, are increasingly seen as key proponents of an organization's or its competitors' products and policies.

What you'll build

In this codelab, you will explore how the PaLM 2 model in Vertex AI and LangChain come together to build a YouTube influencer analytics solution.

Environment

Go to https://colab.research.google.com/#create=true to create a new notebook in the Google Colab sandbox environment.

2. Install packages and authenticate

In the first cell in the new notebook, use these commands to install the required packages.

!pip install google-cloud-aiplatform
!pip install langchain
!pip install chromadb
!pip install pytube
!pip install youtube-transcript-api
!pip install gradio
from google.cloud import aiplatform

After installing these packages, Colab prompts you to restart the runtime. Click RESTART RUNTIME or select Restart runtime from the Runtime menu.

Authenticate your Google Cloud account

Your account should have the Vertex AI User role.

Open the Google Cloud console and search for the IAM and Admin service. In the PERMISSIONS tab under VIEW BY PRINCIPALS, select GRANT ACCESS. Enter or select your principal, add the role "Vertex AI User", and click SAVE.

Now go back to the Colab tab and enter the code snippet below in the second cell of the notebook. This authenticates your Google account with Colab.

from google.colab import auth as google_auth
google_auth.authenticate_user()

When prompted, allow access to continue.

3. Initialize and import

Initialize your project by entering the below snippet in the next cell.

import vertexai
PROJECT_ID = "<projectid>" #enter your project id here
vertexai.init(project=PROJECT_ID)

Import the libraries for the solution

Use these commands to import the required libraries.

from langchain.document_loaders import YoutubeLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.vectorstores import Chroma
from langchain.chains import RetrievalQA
from langchain.llms import VertexAI

4. Initialize the Vertex AI LLM model

Use this code snippet to initialize the Vertex AI LLM model. This initializes "llm" with the Text-Bison model of Vertex AI.

!python3 -m pip install langchain

llm = VertexAI(
    model_name="text-bison@001",
    max_output_tokens=256,
    temperature=0.1,
    top_p=0.8,
    top_k=40,
    verbose=True,
)
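
The `top_k`, `top_p`, and `temperature` arguments control how the model picks each next token. As a rough sketch, assuming the order described in the Vertex AI documentation (keep the `top_k` most likely tokens, trim that list to the smallest prefix whose cumulative probability reaches `top_p`, then sample from what remains using `temperature`), the filtering step might look like the following. The function name `filter_candidates` and the toy probabilities are hypothetical and only illustrate the idea; the real decoding happens inside the model service.

```python
def filter_candidates(token_probs, top_k, top_p):
    """Return the tokens that survive top-k and then top-p filtering.

    token_probs: dict mapping a candidate token to its probability.
    This is an illustrative sketch, not the service's actual implementation.
    """
    # Step 1: keep only the top_k most probable tokens.
    ranked = sorted(token_probs.items(), key=lambda item: item[1], reverse=True)
    ranked = ranked[:top_k]

    # Step 2: keep the smallest prefix whose cumulative probability reaches top_p.
    kept = []
    cumulative = 0.0
    for token, prob in ranked:
        kept.append(token)
        cumulative += prob
        if cumulative >= top_p:
            break
    return kept


# Hypothetical next-token distribution, for illustration only.
toy_probs = {"rises": 0.5, "falls": 0.3, "holds": 0.15, "banana": 0.05}
survivors = filter_candidates(toy_probs, top_k=3, top_p=0.7)
```

A low `temperature` such as 0.1 then makes the final choice among the surviving tokens close to deterministic, which suits question answering over a transcript.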

We will use Vertex AI embeddings to convert the video transcript chunks to embeddings. This part of the code only initializes the embeddings object; the "Store and retrieve" section applies it to the chunks created from the video.

Chunking in generative AI is the process of breaking down large content into smaller, manageable pieces, or "chunks". This is done because generative AI models have limits on how much data they can process at once. By chunking the data, the model can focus on one chunk at a time and generate more accurate and coherent outputs.

Embeddings are a way of representing content as a vector of numbers. This allows computers to capture the meaning of data in a more sophisticated way than traditional methods, such as shot detection or keyframe extraction for videos and bag-of-words for language data.
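
To make the idea of "meaning as a vector" concrete, here is a toy sketch. The three-dimensional vectors are made up by hand purely for illustration; a real embedding model returns vectors with hundreds of dimensions learned from data. Cosine similarity is one common way to compare two such vectors.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # Treat an all-zero vector as similar to nothing.
    return dot / (norm_a * norm_b)


# Hand-made, hypothetical "embeddings" (not produced by any model).
toy_embeddings = {
    "stock price": [0.9, 0.1, 0.0],
    "share value": [0.8, 0.2, 0.1],
    "cooking recipe": [0.0, 0.1, 0.9],
}

related = cosine_similarity(toy_embeddings["stock price"], toy_embeddings["share value"])
unrelated = cosine_similarity(toy_embeddings["stock price"], toy_embeddings["cooking recipe"])
```

In this toy data, `related` comes out much larger than `unrelated`, which is the property a vector store relies on when it searches for chunks similar to a question.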

from langchain.embeddings import VertexAIEmbeddings

# Embedding settings: requests per minute and instances per batch
EMBEDDING_QPM = 100
EMBEDDING_NUM_BATCH = 5
embeddings = VertexAIEmbeddings(
    requests_per_minute=EMBEDDING_QPM,
    num_instances_per_batch=EMBEDDING_NUM_BATCH,
)

5. Load and chunk the video

Load the video that you want to summarize or ask questions about.

loader = YoutubeLoader.from_youtube_url("https://www.youtube.com/watch?v=A8jyW_6hCGU&t=161s", add_video_info=True)
result = loader.load()
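
The URL above carries a `t=161s` timestamp in addition to the video ID, and it can help to see which part of the URL actually identifies the video. The following is a minimal standard-library sketch, not LangChain's own parsing code; the helper name `video_id_from_url` and the set of accepted hosts are assumptions made for illustration.

```python
from urllib.parse import parse_qs, urlparse

# Hosts this sketch recognizes; an assumption, not an exhaustive list.
WATCH_HOSTS = {"youtube.com", "www.youtube.com", "m.youtube.com"}


def video_id_from_url(url):
    """Return the video ID from a YouTube URL, or None if it cannot be found."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host == "youtu.be":
        # Short links keep the ID in the path: https://youtu.be/<id>
        return parsed.path.lstrip("/") or None
    if host in WATCH_HOSTS:
        # Watch links keep the ID in the "v" query parameter.
        return parse_qs(parsed.query).get("v", [None])[0]
    return None


video_id = video_id_from_url("https://www.youtube.com/watch?v=A8jyW_6hCGU&t=161s")
```

Extra query parameters such as the timestamp are ignored, so the same video is identified regardless of where playback starts.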

Split the video

Split the video transcript into multiple chunks using the recursive character text splitter.

text_splitter = RecursiveCharacterTextSplitter(chunk_size=1500, chunk_overlap=0)
docs = text_splitter.split_documents(result)
print(f"# of documents = {len(docs)}")
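
To illustrate what `chunk_size` and `chunk_overlap` mean, here is a deliberately simplified sketch that slices text into fixed-size windows. It is not the algorithm used by `RecursiveCharacterTextSplitter`, which first tries to split on separators such as paragraph breaks, line breaks, and spaces before falling back to individual characters. The helper `chunk_text` is hypothetical.

```python
def chunk_text(text, chunk_size, chunk_overlap=0):
    """Split text into windows of at most chunk_size characters.

    Consecutive windows share chunk_overlap characters.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # The current window already reaches the end of the text.
        start += step
    return chunks


no_overlap = chunk_text("abcdefghij", chunk_size=4)
with_overlap = chunk_text("abcdefghij", chunk_size=4, chunk_overlap=2)
```

With `chunk_overlap=0`, as in this codelab, every character belongs to exactly one chunk. A non-zero overlap repeats some text across neighboring chunks, which can help when an answer straddles a chunk boundary.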

6. Store and retrieve

Store your documents

For this exercise, we are using ChromaDB. You can also use Vertex AI Vector Search. Store your documents and index them in ChromaDB as a vector store. ChromaDB is used to store and retrieve vector embeddings for use with LLMs and to perform semantic search over data.

db = Chroma.from_documents(docs, embeddings)
retriever = db.as_retriever(search_type="similarity", search_kwargs={"k": 2})
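
Conceptually, a similarity retriever with `k` set to 2 embeds the question, compares it with every stored chunk vector, and returns the two closest chunks. The sketch below shows that idea with a brute-force search in plain Python. It is an assumption-laden illustration rather than ChromaDB's implementation, and the helper `top_k_similar` and the two-dimensional vectors are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k_similar(query_vector, indexed_chunks, k=2):
    """Return the texts of the k chunks most similar to the query vector.

    indexed_chunks: list of (text, vector) pairs.
    """
    scored = [
        (cosine_similarity(query_vector, vector), text)
        for text, vector in indexed_chunks
    ]
    # Highest similarity first.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]


# Hypothetical chunks with hand-made vectors, for illustration only.
index = [
    ("chunk about earnings", [0.9, 0.1]),
    ("chunk about weather", [0.1, 0.9]),
    ("chunk about revenue", [0.8, 0.3]),
]
best_chunks = top_k_similar([1.0, 0.0], index, k=2)
```

A larger `k` gives the LLM more context at the cost of a longer prompt.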

Create a retriever chain

Create a retriever chain to answer the question. This is where we connect the Vertex AI Text-Bison LLM with the retriever that fetches the most similar chunks from ChromaDB.

qa = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=retriever,
    return_source_documents=True,
)
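
The "stuff" chain type places all retrieved chunks into a single prompt alongside the question. A rough sketch of that idea follows. The helper name `stuff_documents` and the prompt wording are hypothetical and do not reproduce LangChain's actual template.

```python
def stuff_documents(question, documents, separator="\n\n"):
    """Build one prompt that contains every retrieved chunk plus the question."""
    # "Stuffing" simply concatenates the chunks into one context block.
    context = separator.join(documents)
    return (
        "Use the following context to answer the question.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


example_prompt = stuff_documents(
    "What is said about pricing?",
    ["Chunk one.", "Chunk two."],
)
```

This approach is simple and works well when the retrieved chunks fit comfortably within the model's input limit, which is one reason the retriever here returns only two chunks.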

7. Define your prompt

Define your prompt to ask questions and get answers from the indexed content.

def sm_ask(question, print_results=True):
  # Run the retrieval chain; its output includes the retrieved source chunks
  video_subset = qa({"query": question})
  # Use only the text of the retrieved transcript chunks as context
  context = "\n\n".join(doc.page_content for doc in video_subset["source_documents"])
  prompt = f"""
  Answer the following question in a detailed manner, using information from the text below. If the answer is not in the text, say "I don't know" and do not generate your own response.

  Question:
  {question}
  Text:
  {context}

  Question:
  {question}

  Answer:
  """
  parameters = {
    "temperature": 0.1,
    "max_output_tokens": 256,
    "top_p": 0.8,
    "top_k": 40,
  }
  response = llm.predict(prompt, **parameters)
  if print_results:
    print(response)
  return {"answer": response}

8. Integrate the LLM application

Integrate the LLM application with Gradio to get a visual front end for interacting with it.

import gradio as gr
def get_response(input_text):
  response = sm_ask(input_text)
  # Return only the answer text so the Gradio "text" output shows a plain string
  return response["answer"]

grapp = gr.Interface(fn=get_response, inputs="text", outputs="text")
grapp.launch()

9. Test the solution

Now let's test the solution. Run the cell containing the above code. Either view the UI in the cell result or click the link that is generated. You should see the interface with its input and output components. Enter a question about the video and view the model's response.

With this, we can now load YouTube videos and analyze them using Vertex AI PaLM API models. You can further extend this to integrate with databases or data warehouses.

10. Congratulations!

Congratulations! You have successfully used a Vertex AI text generation LLM programmatically, together with LangChain, to analyze the content of YouTube videos. Check out the Vertex AI LLM product documentation to learn more about available models.
