Introduction to Generative AI

GWENDOLYN STRIPLING: Hello. And welcome to Introduction to Generative AI. My name is Dr. Gwendolyn Stripling. And I am the artificial intelligence technical curriculum developer here at Google Cloud. In this course, you learn to define generative AI, explain how generative AI works, describe generative AI model types, and describe generative AI applications.

Generative AI is a type of artificial intelligence technology that can produce various types of content, including text, imagery, audio, and synthetic data. But what is artificial intelligence? Well, since we are going to explore generative artificial intelligence, let’s provide a bit of context.

So two very common questions asked are what is artificial intelligence and what is the difference between AI and machine learning. One way to think about it is that AI is a discipline, like physics for example. AI is a branch of computer science that deals with the creation of intelligent agents, which are systems that can reason, and learn, and act autonomously. Essentially, AI has to do with the theory and methods to build machines that think and act like humans.

In this discipline, we have machine learning, which is a subfield of AI. It is a program or system that trains a model from input data. That trained model can make useful predictions from new or never before seen data drawn from the same distribution as the one used to train the model. Machine learning gives the computer the ability to learn without explicit programming.

Two of the most common classes of machine learning models are unsupervised and supervised ML models. The key difference between the two is that, with supervised models, we have labels. Labeled data is data that comes with a tag like a name, a type, or a number. Unlabeled data is data that comes with no tag.

This graph is an example of the problem that a supervised model might try to solve. For example, let’s say you are the owner of a restaurant. You have historical data of the bill amount and how much different people
tipped based on order type and whether it was picked up or delivered. In supervised learning, the model learns from past examples to predict future values, in this case tips. So here the model uses the total bill amount to predict the future tip amount based on whether an order was picked up or delivered.

This is an example of the problem that an unsupervised model might try to solve. So here you want to look at tenure and income and then group or cluster employees to see whether someone is on the fast track. Unsupervised problems are all about discovery, about looking at the raw data and seeing if it naturally falls into groups.

Let’s get a little deeper and show this graphically, as understanding these concepts is the foundation for your understanding of generative AI. In supervised learning, testing data values or x are input into the model. The model outputs a prediction and compares that prediction to the training data used to train the model. If the predicted test data values and actual training data values are far apart, that’s called error. And the model tries to reduce this error until the predicted and actual values are closer together. This is a classic optimization problem.

Now that we’ve explored the difference between artificial intelligence and machine learning, and supervised and unsupervised learning, let’s briefly explore where deep learning fits as a subset of machine learning methods. While machine learning is a broad field that encompasses many different techniques, deep learning is a type of machine learning that uses artificial neural networks,
allowing them to process more complex patterns than machine learning. Artificial neural networks are inspired by the human brain. They are made up of many interconnected nodes or neurons that can learn to perform tasks by processing data and making predictions. Deep learning models typically have many layers of neurons, which allows them to learn more complex patterns than traditional machine learning models.

And neural networks can use both labeled and unlabeled data. This is called semi-supervised learning. In semi-supervised learning, a neural network is trained on a small amount of labeled data and a large amount of unlabeled data. The labeled data helps the neural network to learn the basic concepts of the task while the unlabeled data helps the neural network to generalize to new examples.

Now we finally get to where generative AI fits into this AI discipline. Gen AI is a subset of deep learning, which means it uses artificial neural networks and can process both labeled and unlabeled data using supervised, unsupervised, and semi-supervised methods. Large language models are also a subset of deep learning.

Deep learning models, or machine learning models in general, can be divided into two types, generative and discriminative. A discriminative model is a type of model that is used to classify or predict labels for data points. Discriminative models are typically trained on a data set of labeled data points. And they learn the relationship between the features of the data points and the labels. Once a discriminative model is trained, it can be used to predict the label for new data points. A generative model generates new data instances based on a learned probability distribution of existing data. Thus generative models generate new content.

Take this example here. The discriminative model learns the conditional probability distribution, or the probability of y, our output, given x, our input, that this is a dog and classifies it as a dog and not a cat. The generative model learns the joint probability
distribution, or the probability of x and y, and predicts the conditional probability that this is a dog and can then generate a picture of a dog. So to summarize, generative models can generate new data instances while discriminative models discriminate between different kinds of data instances.

The top image shows a traditional machine learning model which attempts to learn the relationship between the data and the label, or what you want to predict. The bottom image shows a generative AI model which attempts to learn patterns in content so that it can generate new content.

A good way to distinguish what is gen AI and what is not is shown in this illustration. It is not gen AI when the output, or y, or label is a number or a class, for example spam or not spam, or a probability. It is gen AI when the output is natural language, like speech or text, an image or audio, for example.

Visualizing this mathematically would look like this. If you haven’t seen this for a while, the y is equal to f of x equation calculates the dependent output of a process given different inputs. The y stands for the model output. The f embodies the function used in the calculation. And the x represents the input or inputs used for the formula. So the model output is a function of all the inputs. If the y is a number, like predicted sales, it is not gen AI. If y is a sentence, like define sales, it is generative, as the question would elicit a text response. The response would be based on all the massive large data the model was already trained on.

To summarize at a high level, the traditional, classical supervised and unsupervised learning process takes training code and labeled data to build a model. Depending on the use case or problem, the model can give you a prediction. It can classify something or cluster something. We use this example to show you how much more robust the gen AI process is. The gen AI process can take training code, labeled data, and unlabeled data of all data types and build a foundation model. The foundation
model can then generate new content. For example, text, code, images, audio, video, et cetera.

We’ve come a long way from traditional programming to neural networks to generative models. In traditional programming, we used to have to hard code the rules for distinguishing a cat: the type, animal; legs, four; ears, two; fur, yes; likes yarn and catnip. In the wave of neural networks, we could give the network pictures of cats and dogs and ask is this a cat and it would predict a cat. In the generative wave, we as users can generate our own content, whether it be text, images, audio, video, et cetera. For example, models like PaLM, or Pathways Language Model, and LaMDA, Language Model for Dialogue Applications, ingest very, very large data from multiple sources across the internet and build foundation language models we can use simply by asking a question, whether typing it into a prompt or verbally talking into the prompt itself. So when you ask it what’s a cat, it can give you everything it has learned about a cat.

Now we come to our formal definition. What is generative AI? Gen AI is a type of artificial intelligence that creates new content based on what it has learned from existing content. The process of learning from existing content is called training and results in the creation of a statistical model. When given a prompt, AI uses the model to predict what an expected response might be, and this generates new content. Essentially, it learns the underlying structure of the data and can then generate new samples that are similar to the data it was trained on.

As previously mentioned, a generative language model can take what it has learned from the examples it’s been shown and create something entirely new based on that information. Large language models are one type of generative AI since they generate novel combinations of text in the form of natural sounding language.

A generative image model takes an image as input and can output text, another image, or video. For example, under the output text, you
can get visual question answering while under output image, an image completion is generated. And under output video, animation is generated. A generative language model takes text as input and can output more text, an image, audio, or decisions. For example, under the output text, question answering is generated. And under output image, a video is generated.

We’ve stated that generative language models learn about patterns and language through training data, then, given some text, they predict what comes next. Thus generative language models are pattern matching systems. They learn about patterns based on the data you provide.

Here is an example. Based on things it’s learned from its training data, it offers predictions of how to complete this sentence, "I’m making a sandwich with peanut butter and jelly." Here is the same example using Bard, which is trained on a massive amount of text data and is able to communicate and generate humanlike text in response to a wide range of prompts and questions. Here is another example. "The meaning of life is..." and Bard gives you a contextual answer and then shows the highest probability response.

The power of generative AI comes from the use of transformers. Transformers produced a 2018 revolution in natural language processing. At a high level, a transformer model consists of an encoder and decoder. The encoder encodes the input sequence and passes it to the decoder, which learns how to decode the representation for a relevant task.

In transformers, hallucinations are words or phrases that are generated by the model that are often nonsensical or grammatically incorrect. Hallucinations can be caused by a number of factors, including that the model is not trained on enough data, or the model is trained on noisy or dirty data, or the model is not given enough context, or the model is not given enough constraints. Hallucinations can be a problem for transformers because they can make the output text difficult to understand. They can also make the model more likely to generate
incorrect or misleading information.

A prompt is a short piece of text that is given to the large language model as input. And it can be used to control the output of the model in a variety of ways. Prompt design is the process of creating a prompt that will generate the desired output from a large language model. As previously mentioned, gen AI depends a lot on the training data that you have fed into it. And it analyzes the patterns and structures of the input data and thus learns. But with access to a browser based prompt, you, the user, can generate your own content.

We’ve shown illustrations of the types of input based upon data. Here are the associated model types.

Text-to-text. Text-to-text models take a natural language input and produce a text output. These models are trained to learn the mapping between a pair of texts, for example, translation from one language to another.

Text-to-image. Text-to-image models are trained on a large set of images, each captioned with a short text description. Diffusion is one method used to achieve this.

Text-to-video and text-to-3D. Text-to-video models aim to generate a video representation from text input. The input text can be anything from a single sentence to a full script. And the output is a video that corresponds to the input text. Similarly, text-to-3D models generate three dimensional objects that correspond to a user’s text description. For example, this can be used in games or other 3D worlds.

Text-to-task. Text-to-task models are trained to perform a defined task or action based on text input. This task can be a wide range of actions such as answering a question, performing a search, making a prediction, or taking some sort of action. For example, a text-to-task model could be trained to navigate a web UI or make changes to a doc through the GUI.

A foundation model is a large AI model pre-trained on a vast quantity of data designed to be adapted or fine tuned to a wide range of downstream tasks, such as sentiment analysis, image captioning, and
object recognition. Foundation models have the potential to revolutionize many industries, including health care, finance, and customer service. They can be used to detect fraud and provide personalized customer support.

Vertex AI offers a Model Garden that includes foundation models. The language foundation models include PaLM API for chat and text. The vision foundation models include Stable Diffusion, which has been shown to be effective at generating high quality images from text descriptions.

Let’s say you have a use case where you need to gather sentiments about how your customers are feeling about your product or service. You can use the classification task sentiment analysis task model for just that purpose. And what if you needed to perform occupancy analytics? There is a task model for your use case.

Shown here are gen AI applications. Let’s look at an example of code generation shown in the second block under code at the top. In this example, I’ve input a code file conversion problem, converting from Python to JSON. I use Bard. And I insert into the prompt box the following. I have a Pandas DataFrame with two columns, one with the file name and one with the hour in which it is generated. I’m trying to convert this into a JSON file in the format shown onscreen. Bard returns the steps I need to do this and the code snippet. And here my output is in a JSON format.

It gets better. I happen to be using Google’s free, browser-based Jupyter Notebook, known as Colab. And I simply export the Python code to Google’s Colab. To summarize, Bard code generation can help you debug your lines of source code, explain your code to you line by line, craft SQL queries for your database, translate code from one language to another, and generate documentation and tutorials for source code.

Generative AI Studio lets you quickly explore and customize gen AI models that you can leverage in your applications on Google Cloud. Generative AI Studio helps developers create and deploy gen AI models by providing a variety of tools
and resources that make it easy to get started. For example, there’s a library of pre-trained models. There is a tool for fine tuning models. There is a tool for deploying models to production. And there is a community forum for developers to share ideas and collaborate.

Generative AI App Builder lets you create gen AI apps without having to write any code. Gen AI App Builder has a drag and drop interface that makes it easy to design and build apps. It has a visual editor that makes it easy to create and edit app content. It has a built-in search engine that allows users to search for information within the app. And it has a conversational AI engine that helps users to interact with the app using natural language. You can create your own digital assistants, custom search engines, knowledge bases, training applications, and much more.

PaLM API lets you test and experiment with Google’s large language models and gen AI tools. To make prototyping quick and more accessible, developers can integrate PaLM API with MakerSuite and use it to access the API using a graphical user interface. The suite includes a number of different tools such as a model training tool, a model deployment tool, and a model monitoring tool. The model training tool helps developers train ML models on their data using different algorithms. The model deployment tool helps developers deploy ML models to production with a number of different deployment options. The model monitoring tool helps developers monitor the performance of their ML models in production using a dashboard and a number of different metrics.

Thank you for watching our course, Introduction to Generative AI.
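
Editorial illustration: the talk describes generative language models as pattern matching systems that learn from training data and then, given some text, predict what comes next. The sketch below shows that idea in its simplest possible form. It is only a toy under stated assumptions: it counts which word follows which (a bigram counter) instead of using a neural network or transformer, and the function names `train_bigram_model` and `predict_next` as well as the three training sentences are hypothetical, invented for this example and not taken from the course.

```python
from collections import Counter, defaultdict


def train_bigram_model(corpus):
    """Count, for each word, which words follow it in the training text."""
    follow_counts = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        # Pair every word with the word that comes right after it.
        for current_word, next_word in zip(words, words[1:]):
            follow_counts[current_word][next_word] += 1
    return follow_counts


def predict_next(model, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    followers = model.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Hypothetical toy training data, made up for this illustration.
TOY_CORPUS = [
    "I am making a sandwich with peanut butter and jelly",
    "she likes peanut butter and jelly",
    "he likes peanut butter and bananas",
]

model = train_bigram_model(TOY_CORPUS)
print(predict_next(model, "peanut"))  # butter
print(predict_next(model, "and"))     # jelly
```

The prediction is simply the highest-count continuation seen in training, which loosely mirrors the "highest probability response" mentioned in the talk. A real large language model conditions on far more context than the single previous word.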