What is Generative AI?
And how is it different from regular AI?
Question: Can you guess which of the following article headlines was written by generative AI?
Knit Perfection: The Enduring Charm and Versatility of Cardigans
BTS: Breaking Records and Boundaries in the Global Music Scene
Smooth Ride Ahead: Everything You Need to Know About Power Steering Fluid
It’s a trick question. They were all written by OpenAI’s ChatGPT, a generative AI tool that creates content from a prompt. All I did was ask it to write a headline about cardigans, BTS, and power steering fluid. And while ChatGPT is text-based AI, generative AI does more than create text…but we’ll get into all that.
In this article, you’ll learn everything you need to know about generative AI — what it is, how it’s being used, its pros and cons, and its economic impact on industries like healthcare, finance, and entertainment. Let’s dive in!
Table of Contents
- What is Generative AI?
- What is Generative AI Used For?
- How Does Generative AI Work?
- Generative AI Models
- The Economic Impact of Generative AI
- The Pros and Cons of Generative AI
- Navigating the Shift to Generative AI
What is Generative AI?
Generative AI is any artificial intelligence that generates net-new content in the form of text, audio, video, images, and more.
To work with generative AI, you need to give it a prompt. Usually, but not always, the prompt comes in text form. The generative AI tool you are interacting with then processes that prompt through its underlying model and generates a response.
So, what does it mean for an AI to “generate net-new content”? This is easiest to understand when we compare common examples of AI that are not generative AI to how generative AI generates responses.
For example, compare a Google Search to Google Gemini, Google’s newly released multimodal generative AI.
Let’s say you are curious about the cubist painting movement (and who isn’t?).
So, you hop on over to Google Search and type in the following:
“Famous examples of cubist paintings”
Google Search will leverage its extremely comprehensive and complex machine learning algorithms to provide you with the best answer it can give from reputable sources on the internet:
Google Search will produce a list of 24 famous cubist paintings curated from the internet, as well as a list of reputable sources that will provide you with more information.
Now, head on over to Google Gemini and give it the exact same prompt:
Google Gemini, with its generative AI model, behaves differently right off the bat. Instead of just pulling from existing sources, Gemini provides me with a short description of cubism that it has “written” itself, along with a short list of five cubist paintings it has designated as “some of the most famous” ones.
Here’s where things with generative AI get interesting: while Google Search results are limited to what content already exists on the internet, Google Gemini can create net-new content (remember: it’s generative AI). So I can now ask Gemini to create a new image that looks like a famous cubist painting:
And just like that, I have four colorful cubist paintings! Thanks, generative AI! (Editor’s note: but what will they get at auction?)
Let’s compare some other common AI uses and what generative AI could do in those instances:
| Example of non-generative AI | Generative AI |
| --- | --- |
| Google Ads reads your emails and sends you related ads | Gen AI writes personalized ad copy based on your other opened emails |
| Netflix’s recommendation engine recommends a new romcom to you | ChatGPT writes you a script for a new romcom |
| Siri lets you dictate a text via voice command | A gen AI digital assistant writes the text for you, reads it to you, and you approve it via voice command |
| Google Search provides you with a list of 50 songs about crying | Google Gemini writes you a song about crying, sheet music included |
What is Generative AI Used For?
While it’s certainly fun to use, generative AI technology can do more than just write songs and create images of cubist paintings. Generative AI’s ability to create new content is actually quite profound because of its incredibly broad application potential. Generative AI can be used by anyone and across all industries. Its ability to generate language, visuals, audio, and synthetic data is what has everyone scrambling to figure out how to incorporate it into their personal and professional lives.
- Language: Generative AI uses large language models (LLMs) to generate language. Large language models are trained on massive amounts of data and focus on language tasks. This powers generative AI tools to understand language and create new content that looks, sounds, and reads as natural language. LLMs can answer questions, write code, and provide translations.
- Visual: By using different models, text-to-visual AI programs like Midjourney and Stability AI’s Stable Diffusion can create AI-generated visuals — images, videos, illustrations, artwork, 3D models, and more.
- Audio: Generative AI models use machine learning algorithms to analyze audio, identify patterns, and create new audio content. It’s commonly used in music composition and voice transcription.
- Synthetic Data: AI needs data to train and acquiring data — especially labeled data — can be expensive. Generative AI models can create synthetic data that other AI models use to train.
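To make the synthetic-data idea concrete, here’s a hand-rolled sketch in Python. A real generative model would first learn the shape of actual data; this toy version just samples labeled points from two made-up clusters, but the payoff is the same: cheap training data that arrives already labeled.

```python
import random

random.seed(0)

# A toy synthetic-data generator: sample labeled 2D points from two
# made-up "clusters", standing in for the kind of labeled data that is
# expensive to collect and annotate by hand.
def make_synthetic_dataset(n_per_class):
    data = []
    for label, (cx, cy) in enumerate([(0.0, 0.0), (5.0, 5.0)]):
        for _ in range(n_per_class):
            x = random.gauss(cx, 1.0)  # jitter around the cluster center
            y = random.gauss(cy, 1.0)
            data.append(((x, y), label))
    random.shuffle(data)
    return data

dataset = make_synthetic_dataset(100)
print(len(dataset))  # 200 labeled points, ready to train another model on
```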
How Does Generative AI Work?
Generative AI is a sophisticated process. I mean, how else would an application be able to take a prompt like “Create a one-minute video script for a social media ad campaign promoting our new product,” and actually create it? Let’s break it down into a few steps.
- Generative AI models are fed existing data and content. Generative AI requires large amounts of existing data (or content) so it can ‘learn’, and then create net-new content.
- Generative AI uses unsupervised and semi-supervised machine learning for training. While unsupervised machine learning is a technique where algorithms analyze unlabeled data, semi-supervised is a hybrid that uses both labeled and unlabeled data for training. These machine learning algorithms are what let generative AI models discover and identify patterns without human intervention (unsupervised) or with limited human intervention (semi-supervised).
- Generative AI uses neural networks to identify patterns in data. Neural networks follow an artificial process where a machine learning model makes decisions by trying to copy the complex way our brains process information. This intricate system is what allows generative AI to process data and figure out its patterns.
- Generative AI models learn from these patterns. After discovering the patterns, generative AI models analyze and learn these patterns for future application.
- Generative AI models create new patterns when given a prompt. When given a text prompt, the AI system will use the patterns it learned to create new, similar patterns.
This is a very high-level overview of how the process works, but depending on the generative AI model, the system may use different techniques for training and generating content.
Generative AI Models
There are several ways that generative AI can start creating new content. Think about when you’re programming an address into your GPS. You can choose the fastest route, the route with no tolls, or a route that avoids highways. Regardless of what you choose, you’ll arrive at your destination. Similarly, generative AI uses different types of foundation models to create new content.
Amazon Web Services describes foundation models as “a form of generative AI… that generate[s] output from one or more inputs (prompts) in the form of human language instructions.” They’re based on neural networks and trained for different purposes.
And just like your car trip where a different route is better for a specific purpose (i.e. a route with no tolls can help you save money), some generative AI systems may work better for a specific purpose — text generation vs. image generation vs. audio generation.
While transformer and diffusion models are definitely all the rage at the moment, six models come up most often in discussions of generative AI.
1. Recurrent Neural Networks
IBM describes recurrent neural networks (RNNs) as “a type of artificial neural network which uses sequential data or time series data.” With its “memory,” the network is able to take the information it learned from its training data and apply that to its current input (request) and subsequently, the desired output. Recurrent neural networks can do this because they use each individual element in a sequence to predict the next element.
Finish the sentence: The car was so expensive! It cost an arm and ____.
If you’re familiar with the idiom, you might’ve answered with an easy “a leg.” Why? There are a few subtle hints that get you to the answer. First, the word “expensive” right before the idiom gave extra context. Second, the sequence of the words “cost an arm and…” is very telling. If the second sentence started as: “An arm and…,” it would be significantly harder to guess the next word without the additional context, even if the sentence was fundamentally communicating the same idea.
The way in which your brain is able to finish the series is the process that recurrent neural networks copy. Recurrent neural networks have their place in both traditional and generative AI where you see them used in language translation, natural language processing (NLP), and speech recognition. This generative AI model is commonly used in music-generating apps, but because of its drawbacks — like its inability to handle long sequences — some are turning to the more evolved transformer models.
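The “memory” idea can be sketched in a few lines of Python (using NumPy, with made-up random weights): a single recurrent cell whose hidden state carries information from earlier elements forward to later ones. A trained RNN would learn these weights from data; this sketch only shows the mechanics of each step feeding the next.

```python
import numpy as np

rng = np.random.default_rng(42)

# A single recurrent cell: the hidden state h is the network's "memory",
# combined with each new element of the sequence as it arrives.
hidden_size, input_size = 8, 4
W_xh = rng.normal(0, 0.5, (hidden_size, input_size))   # input -> hidden
W_hh = rng.normal(0, 0.5, (hidden_size, hidden_size))  # hidden -> hidden (the memory path)
b_h = np.zeros(hidden_size)

def rnn_step(h, x):
    """One time step: the new state depends on the previous state AND the new input."""
    return np.tanh(W_hh @ h + W_xh @ x + b_h)

# Feed a sequence one element at a time; h accumulates context as it goes.
sequence = rng.normal(size=(5, input_size))  # 5 time steps of toy input
h = np.zeros(hidden_size)
for x in sequence:
    h = rnn_step(h, x)

print(h.shape)  # the final state summarizes the whole sequence: (8,)
```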
2. Convolutional Neural Networks
Convolutional neural networks (CNNs) use a multilayered neural network for image classification and object recognition. The process starts with input data, typically an image, and a filter is applied to the image to detect certain features. As the filter slides over the image, it records a high value wherever it detects its feature, creating a feature map. The filters usually start off by detecting basic patterns, with additional layers building up to more complex ones. At the end of the process, using the features and patterns it detected across all previous layers, the model makes its decision and classifies the image.
A popular application of convolutional neural networks is facial recognition. The network filters for particular features it has learned to associate with you, then scans the entire image looking for them. That’s why your smartphone can recognize your face whether you’re looking at your phone head-on or with your face turned at an angle. While CNNs are best known as a classification technique, they can also help generate images like the ones you’d see created with a generative AI tool like OpenAI’s DALL-E.
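The sliding-filter step can be sketched directly: below, a 3×3 filter hand-tuned to respond to vertical edges slides over a tiny synthetic image and records its response at each position, producing a feature map. In a real CNN the filter values are learned from data rather than written by hand.

```python
import numpy as np

# A tiny grayscale "image": dark on the left, bright on the right,
# so there is a vertical edge down the middle.
image = np.zeros((6, 6))
image[:, 3:] = 1.0

# A 3x3 filter hand-tuned to respond to dark-to-bright vertical edges.
kernel = np.array([[-1.0, 0.0, 1.0],
                   [-1.0, 0.0, 1.0],
                   [-1.0, 0.0, 1.0]])

def convolve2d(img, k):
    """Slide the filter over the image, recording its response at each position."""
    kh, kw = k.shape
    out_h, out_w = img.shape[0] - kh + 1, img.shape[1] - kw + 1
    feature_map = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            feature_map[i, j] = np.sum(img[i:i + kh, j:j + kw] * k)
    return feature_map

fmap = convolve2d(image, kernel)
# The map "lights up" only in the columns where the edge sits.
print(fmap)
```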
3. Transformer Models
Transformers are the more evolved descendants of RNNs. Both techniques are used on sequential data, but transformers take it one step further: while RNNs process each element one after the other, transformers process the entire sequence in parallel. This also lets them handle longer sequences than RNNs, making them faster and more efficient. Transformers are ideal for natural language processing and text generation, and they put the “T” in ChatGPT: Generative Pre-trained Transformer. GPT-3 and GPT-4, the models behind the text-to-text tool, are both transformers, which is part of why ChatGPT can respond to prompts so quickly.
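The “process the whole sequence at once” idea comes from attention. Here’s a minimal sketch of scaled dot-product attention, the core operation inside a transformer. In a real model, Q, K, and V come from learned projections and many attention layers are stacked; even this stripped-down version, though, handles every position in a single matrix multiply instead of an RNN-style loop.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))  # stabilized
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    """Scaled dot-product attention: every position attends to every other
    position at once, with no step-by-step recurrence."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)     # similarity of each position to all others
    weights = softmax(scores, axis=-1)  # each row sums to 1: where to attend
    return weights @ V, weights

seq_len, d_model = 5, 8
X = rng.normal(size=(seq_len, d_model))
# In a real transformer Q, K, V are learned projections of X;
# we use X directly to keep the sketch minimal.
out, weights = attention(X, X, X)
print(out.shape, weights.shape)  # (5, 8) (5, 5)
```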
4. Generative Adversarial Networks
Much like the name suggests, generative adversarial networks (GANs) are made up of two neural networks that are pitted against each other. There is a generator that creates synthetic data, and then the discriminator — its opponent — has to figure out whether the data from the generator is real or fake. The goal? Outperform the competition. The generator tries to make its data seem more and more real, while the discriminator needs to get better at differentiating real data from synthetic data. And while these two neural networks are technically working against each other, they’re also pushing each other to improve, like sparring partners. As the discriminator gets better at picking out which data is real or fake, the generator gets better at creating realistic data. GANs are commonly used in generating synthetic visual content, such as images, art, and video.
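The sparring match can be sketched end to end on a deliberately tiny problem: the “real” data is numbers drawn from a bell curve around 4, the generator is a single linear function, and the discriminator is logistic regression, with gradients derived by hand. A real GAN uses deep networks and images rather than one-line players, but the alternating discriminator-turn / generator-turn loop is the same.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# "Real" data: samples centered on 4 -- the distribution to mimic.
def sample_real(n):
    return rng.normal(4.0, 0.5, size=n)

w_g, b_g = 1.0, 0.0   # generator: g(z) = w_g * z + b_g, noise z ~ N(0, 1)
w_d, b_d = 0.1, 0.0   # discriminator: D(x) = sigmoid(w_d * x + b_d)

lr, n = 0.05, 64
for _ in range(2000):
    # --- Discriminator turn: push D(real) toward 1 and D(fake) toward 0. ---
    real = sample_real(n)
    fake = w_g * rng.normal(size=n) + b_g
    d_real = sigmoid(w_d * real + b_d)
    d_fake = sigmoid(w_d * fake + b_d)
    # Hand-derived gradients of the binary cross-entropy loss.
    w_d -= lr * (np.mean((d_real - 1) * real) + np.mean(d_fake * fake))
    b_d -= lr * (np.mean(d_real - 1) + np.mean(d_fake))

    # --- Generator turn: push D(fake) toward 1 instead. ---
    z = rng.normal(size=n)
    fake = w_g * z + b_g
    grad_fake = (sigmoid(w_d * fake + b_d) - 1) * w_d  # chain rule through D
    w_g -= lr * np.mean(grad_fake * z)
    b_g -= lr * np.mean(grad_fake)

# The generator's output mean is b_g; training should have pulled it toward 4.
print(f"generated mean: {b_g:.2f}")
```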
5. Variational Autoencoders
Similar to GANs, variational autoencoders (VAEs) are a generative AI model made up of two neural networks — this time an encoder and a decoder. The encoder compresses and preserves input data into a new encoded space called latent space. The decoder then reverses this process and puts the original data back together while removing ‘noise’, aka any irrelevant information. While the encoder works to find better ways to encode data into this abstract representation, the decoder needs to optimize how it regenerates the input data.
For content creation, VAEs combine and compress data, effectively cleaning it by finding errors and removing noise. VAEs are used in generating text, video, and even synthetic data that are used for training other AI algorithms. VAEs are especially ideal and commonly used for image generation. For example, when the generative AI model is fed images of real people, it learns the patterns and features of the training data and uses that to create images of fake people that look real.
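The encode-compress-decode idea can be sketched with a linear stand-in. This is not a full VAE, which learns a probabilistic latent space with neural networks, but the compression intuition carries over: 2D points that mostly vary along one direction are squeezed down to a single latent number each, then reconstructed, with the off-axis “noise” discarded along the way.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy data: 2D points that mostly vary along one direction, plus slight noise.
t = rng.normal(size=200)
X = np.stack([t, 2 * t], axis=1) + rng.normal(0, 0.1, size=(200, 2))

# "Encoder": project each 2D point down to a 1D latent code along the data's
# main axis of variation (found here with SVD, a linear stand-in for a
# learned encoder network).
X_centered = X - X.mean(axis=0)
_, _, Vt = np.linalg.svd(X_centered, full_matrices=False)
direction = Vt[0]                # the main axis of variation
latent = X_centered @ direction  # one number per point instead of two

# "Decoder": map the latent codes back to 2D, discarding off-axis noise.
X_reconstructed = np.outer(latent, direction) + X.mean(axis=0)

error = np.mean((X - X_reconstructed) ** 2)
print(f"mean reconstruction error: {error:.4f}")  # small: little was lost
```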
6. Diffusion Models
Diffusion models follow a two-step training process of forward diffusion and reverse diffusion. During forward diffusion, random noise is slowly added to the original data. The reverse diffusion process is responsible for removing the noise and putting the data back together. Once the model is trained on this process, it can take random noise and run it through the denoising process to create new data. To better visualize this, imagine a picture where all you see is static. The diffusion process would remove this noise until an image is created from it.
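The forward half of that process is easy to sketch: repeatedly mix a little Gaussian noise into a clean signal until almost nothing of the original survives. Training the reverse, denoising model is the hard part and is omitted here; this only shows how thoroughly forward diffusion buries the signal in static.

```python
import numpy as np

rng = np.random.default_rng(0)

# Clean "data": a simple 1D signal standing in for an image.
x = np.sin(np.linspace(0, 2 * np.pi, 100))

# Forward diffusion: at each step, keep most of the signal and mix in a
# little fresh Gaussian noise. After enough steps, only static remains.
beta = 0.05   # how much noise each step adds
steps = 200
x_t = x.copy()
for _ in range(steps):
    noise = rng.normal(size=x_t.shape)
    x_t = np.sqrt(1 - beta) * x_t + np.sqrt(beta) * noise

# How much of the original survives? Correlation with the clean signal.
signal_left = abs(np.corrcoef(x, x_t)[0, 1])
print(f"correlation with original after {steps} steps: {signal_left:.3f}")
```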
The Economic Impact of Generative AI
Despite having become incredibly popular in the last year or so, the rise of AI isn’t a new conversation. The potential economic impact of generative AI is thought to be as massive as the datasets it’s trained on. Goldman Sachs research speculates that “As tools using advances in natural language processing work their way into businesses and society, they could drive a 7% (or almost $7 trillion) increase in global GDP…over a 10-year period.”
We already know generative AI is working its way into businesses and society. Several industries are already using it in their operations. Examples include:
- Automotive: Car companies use generative AI to create synthetic data that helps them create 3D simulations for training autonomous vehicles. While we immediately think of self-driving cars, smaller-scale capabilities like self-parking and auto-steering are also advancements from generative AI.
- Education: Teachers can use generative AI as a supplemental teaching tool for creating lesson plans, worksheets, or even AI chatbot tutors.
- Entertainment: From a single prompt, generative AI models can create animations, scripts, and games.
- Healthcare: Generative AI models can help create new protein sequences that medical researchers can use for developing vaccines and drug discovery.
- Finance: Banks can leverage generative AI with chatbot assistants. Not only can it help banks quickly detect fraud, but the programmed chatbots can help customers resolve fraudulent transactions.
So does this mean that all the uses of generative AI are positive? Unfortunately, while generative AI can lead to positive advancements in many industries, there are also some significant concerns and challenges.
The Pros and Cons of Generative AI
Just like all things, there are benefits and drawbacks to using generative AI. This post won’t spend too much time dissecting them because, at this moment, generative AI is like a high-speed train. It’s not a question of if it’ll arrive but when.
We now know that generative AI can:
- Create realistic, high-quality content
- Increase efficiency and productivity among workers
- Enhance and personalize customer experience
- Save time and money by automating operational tasks
- Generate valuable insights from data for business growth and development
However, for every advantage, there are just as many disadvantages to consider. Issues that come up with generative AI include but aren’t limited to:
- Hallucination: AI models “hallucinate” when they generate false information or information that doesn’t make sense.
- Content Moderation: AI models need to be able to tell the difference between appropriate and inappropriate content. For this to happen, human workers need to filter through offensive materials and manually label them for data training. This can potentially create negative work conditions for workers for the sake of AI.
- Misleading Information: Beyond AI hallucinations, generative AI can be manipulated to create false information. Common issues include AI-made vocals that sound like popular musicians or “deep-fakes” — a digitally manipulated image or video that steals and swaps the likeness of one person with another without consent.
- Biases and Discrimination: When biases are present in training data, AI models can double down and implement them during application. This might look like an AI-powered loan tool implementing a bias against minorities.
- Legal Issues: The legal ramifications of generative AI are a gray area with possible copyright, privacy, and liability issues.
There are some serious concerns about generative AI, but it’s not going away any time soon. We need developers and engineers to do what they do best: debugging and fine-tuning AI processes and systems, and approaching them from an ethical and legal point of view, until we reach responsible AI.
Navigating the Shift to Generative AI
“Among organizations considering or using AI, 82% believe it will either significantly change or transform their industry,” a 2023 Google Cloud study found. Over the last few decades, developers and AI engineers have made waves in the introduction and development of AI and its related fields: computer engineering, machine learning, data science, and of course, generative AI. These advancements brought about innovative AI applications like ChatGPT, fueled the meteoric rise of NVIDIA, and contributed to the continued growth of companies like Microsoft, Meta, and Google.
The world is shifting towards generative AI, and you might be ready to do the same. Navigating to a career in tech or AI takes hard work, but the process doesn’t have to be overly complicated. In fact, it’s pretty easy with Skillcrush’s Break Into Tech program. When you’re ready to take the first step, you’ll learn all the skills for a successful career in tech and a smooth transition into AI.
Jouviane Alexandre