What is Generative AI? – A Technical Deep Dive

October 9, 2024
We explore the technical aspects of GenAI and the technology's latest developments, as well as its role in humanoid AI and the challenges that automation in code development poses for the IT industry.

By: Krishana Gyanwali

Generative AI (GenAI) is a tremendous achievement in artificial intelligence that enables machines to rapidly generate content (such as text, images, music, synthesized voices, and computer code) based on patterns, structures, and features learned from existing data.

The technology is driving advancements in various fields, such as mass media, automation, and software development, and most of the deep learning models and large language models (LLMs) that power GenAI are based on the transformer architecture.

In this blog post, we will explore the technical aspects of GenAI and the technology’s latest developments. We will also discuss GenAI’s role in the concept of humanoid AI, as well as the challenges that automation in code development poses for the future IT market.

Background of Generative AI

Generative AI has roots reaching back to the 1960s (early chatbots such as ELIZA), but it did not become popular until Generative Adversarial Networks (GANs) were introduced in 2014. GenAI primarily operates on deep learning architectures like GANs, Variational Autoencoders (VAEs), and the previously mentioned transformer models (e.g., BERT, GPT-3, and GPT-4).

These models are usually trained on very large datasets (often terabytes or more) on sophisticated hardware infrastructure. Training can take days to months, depending on the volume of input data, the number of layers and parameters in the neural network, and any other advanced computation involved in developing the model.

These models learn the underlying distribution, patterns, and structure of the data, and generate new output that resembles human creative work.

Core Architectures of GenAI – GANs

One of the major components of GenAI is the GAN. The GAN architecture is built from two opposing neural networks: a “generator” and a “discriminator”. The generator creates data samples while the discriminator evaluates them.

The generator creates fake data to fool the discriminator, much like a human mimic copying somebody else’s voice. Over many training iterations, the generator learns to create more realistic output, and the discriminator becomes more skilled at distinguishing real data from fake.

The following code snippet is a minimal example of a deep learning model for a generator (sketched here in PyTorch, since the original snippet is not reproduced):
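
```python
import torch
import torch.nn as nn

class Generator(nn.Module):
    """Maps a random noise vector to a flattened 28x28 image."""

    def __init__(self, latent_dim=100, img_dim=28 * 28):
        super().__init__()
        self.model = nn.Sequential(
            nn.Linear(latent_dim, 256),
            nn.LeakyReLU(0.2),
            nn.Linear(256, 512),
            nn.LeakyReLU(0.2),
            nn.Linear(512, img_dim),
            nn.Tanh(),  # outputs in [-1, 1], matching images normalized the same way
        )

    def forward(self, z):
        return self.model(z)

# Sample a batch of 16 fake images from random noise
generator = Generator()
noise = torch.randn(16, 100)
fake_images = generator(noise)  # shape: (16, 784)
```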

 

VAEs (Variational Autoencoders)

Another type of model GenAI uses is the variational autoencoder (VAE). A VAE learns the data distribution by encoding input data into a probabilistic latent distribution and decoding samples from it back into data; the same property also makes VAEs useful for anomaly detection. This GenAI model is widely used for generating images and synthesizing data.

The following code snippet shows the encoder and decoder architecture adopted in VAEs (a PyTorch sketch standing in for the original example):
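
```python
import torch
import torch.nn as nn

class VAE(nn.Module):
    def __init__(self, input_dim=784, hidden_dim=256, latent_dim=20):
        super().__init__()
        # Encoder: compresses the input into the parameters of a latent Gaussian
        self.encoder = nn.Sequential(nn.Linear(input_dim, hidden_dim), nn.ReLU())
        self.fc_mu = nn.Linear(hidden_dim, latent_dim)      # mean of latent distribution
        self.fc_logvar = nn.Linear(hidden_dim, latent_dim)  # log-variance of latent distribution
        # Decoder: reconstructs the input from a latent sample
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, input_dim),
            nn.Sigmoid(),
        )

    def reparameterize(self, mu, logvar):
        # Sample z = mu + sigma * epsilon so gradients can flow through the sampling step
        std = torch.exp(0.5 * logvar)
        eps = torch.randn_like(std)
        return mu + std * eps

    def forward(self, x):
        h = self.encoder(x)
        mu, logvar = self.fc_mu(h), self.fc_logvar(h)
        z = self.reparameterize(mu, logvar)
        return self.decoder(z), mu, logvar

vae = VAE()
x = torch.rand(16, 784)        # a batch of flattened 28x28 images
recon, mu, logvar = vae(x)     # reconstruction plus latent parameters
```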

 

Transformers

The transformer architecture brought a revolution to the AI world. Before the transformer, sequence models such as LSTMs (Long Short-Term Memory networks), and statistical models like ARIMA and SARIMA for time series, processed their input step by step, in order. The transformer considers all words simultaneously, capturing each word's context. Models like BERT, GPT-3, and GPT-4 are based on the transformer architecture.

What makes this model so powerful is that it can learn from enormous data sources and fine-tune its output. The original transformer is also based on an encoder-decoder approach with an attention mechanism at its core. The self-attention mechanism compares each word in the input with every other word and assigns each a score, allowing the model to focus on the most important words. A feedforward network then takes the attention-weighted output and transforms it further, refining the representation at each layer.
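
As a rough sketch of the core idea (not the full transformer, and not from the original post), scaled dot-product self-attention can be expressed in a few lines of PyTorch:

```python
import math
import torch

def scaled_dot_product_attention(Q, K, V):
    """Each word's query is compared against every word's key;
    the resulting scores weight how much of each value to mix in."""
    scores = Q @ K.transpose(-2, -1) / math.sqrt(Q.size(-1))  # pairwise similarity scores
    weights = torch.softmax(scores, dim=-1)                   # one weight per word pair
    return weights @ V                                        # attention-weighted mix

# 5 words, each represented by a 64-dimensional vector
x = torch.randn(5, 64)
output = scaled_dot_product_attention(x, x, x)  # self-attention: Q, K, V from the same input
print(output.shape)  # torch.Size([5, 64])
```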

 

Latest Developments in Feeding Models With New Data

Today, a key priority is updating a model with new data as a form of continuous learning. In traditional machine learning work, doing so required completely retraining the model on the new data, which demanded significant human, technical, and financial resources.

New techniques such as AutoML, synthetic data generation, and multimodal AI are the latest innovations in data feeding. RAG (Retrieval-Augmented Generation) is one of the newest advancements: it augments the model’s existing knowledge by retrieving relevant information from external data sources before generating output.
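
A minimal sketch of the RAG pattern follows. The embed function here is a hypothetical placeholder standing in for a real embedding model, and the documents are invented for illustration:

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    """Hypothetical placeholder: in practice, call a real embedding model here."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    return rng.standard_normal(384)

documents = [
    "Our return policy allows refunds within 30 days.",
    "Support is available 24/7 via chat.",
    "Shipping takes 3-5 business days.",
]
doc_vectors = np.stack([embed(d) for d in documents])

def retrieve(query: str, k: int = 1) -> list[str]:
    # Rank documents by cosine similarity to the query embedding
    q = embed(query)
    sims = doc_vectors @ q / (np.linalg.norm(doc_vectors, axis=1) * np.linalg.norm(q))
    return [documents[i] for i in np.argsort(sims)[::-1][:k]]

query = "How long do refunds take?"
context = "\n".join(retrieve(query))
prompt = f"Answer using this context:\n{context}\n\nQuestion: {query}"
# The prompt (now carrying retrieved context) is then sent to the generative model.
```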
 

Data Augmentation and Fine-Tuning

Data augmentation is very useful when a dataset is imbalanced or very limited data is available. It is a process that generates additional training data from existing datasets by applying transformations such as rotation, flipping, and noise addition.
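
As an illustrative sketch (not from the original post), a few torchvision transforms can implement exactly these augmentations:

```python
import torch
from torchvision import transforms

augment = transforms.Compose([
    transforms.RandomRotation(degrees=15),       # rotation
    transforms.RandomHorizontalFlip(p=0.5),      # flipping
    transforms.ToTensor(),                       # PIL image -> tensor in [0, 1]
    transforms.Lambda(lambda t: t + 0.05 * torch.randn_like(t)),  # noise addition
])

# Applying `augment` to the same image repeatedly yields different variants,
# effectively enlarging the training set.
```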

Fine-tuning involves retraining a pre-trained model on new data with a lower learning rate, allowing it to adapt to new information by adjusting the model’s parameters without overfitting. This produces more contextually relevant output and saves training time, hardware, and money, because it builds on pre-existing knowledge.
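
A minimal sketch of the idea, assuming a pre-trained torchvision ResNet as the starting point; the low learning rate and the replaced output head are the essential points:

```python
import torch.nn as nn
import torch.optim as optim
from torchvision import models

model = models.resnet18(weights="DEFAULT")        # start from pre-trained weights
model.fc = nn.Linear(model.fc.in_features, 10)    # new head for a new 10-class task

# Low learning rate: adapt to new data without destroying pre-trained knowledge
optimizer = optim.Adam(model.parameters(), lr=1e-5)
```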

 

Transfer Learning

Transfer learning is another resource-saving technique in which a model trained on one task is applied to another, related task. This approach is effective when data is limited, as the model has already learned general features from its initial training datasets.

The code below loads the BERT model (bert-base-uncased), which is pre-trained on massive amounts of text (sketched here with the Hugging Face transformers library). The knowledge captured by the model can be plugged into another, similar use case, saving the time and resources of training and retraining models from scratch.
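
```python
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")  # pre-trained on massive text corpora

inputs = tokenizer("Transfer learning reuses what the model already knows.",
                   return_tensors="pt")
outputs = model(**inputs)

# Contextual embeddings that can feed a downstream task (classification, search, etc.)
embeddings = outputs.last_hidden_state
```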

 

Humanoid AI: Progress and Challenges

Humanoid AI is a new development that refers to AI systems designed to mimic human behavior and interaction. It is a combination of robotics, LLMs, and neuroscience. The main idea behind it is to achieve human-like capabilities in understanding, reasoning, and interacting with environments. An example of this is the Sophia robot.

The latest generation of NVIDIA’s sophisticated chips has contributed significantly to humanoid development. Interestingly, the average cost of such sophisticated products is typically more than $100,000, but costs will gradually come down as the technology matures.
 

Technological Progress of Humanoid AI

The latest growth in LLMs (sophisticated NLP), improved hardware components, improved computer vision algorithms, and GenAI models has helped humanoid AI develop.

The Sophia robot isn’t the only humanoid robot around. Amazon has already started to use humanoid robots. Boston Dynamics has a well-known humanoid robot called Atlas.

While there has been tremendous growth in humanoid AI, there have also been substantial challenges. Here are a few examples:
 

1. Data Privacy and Security

A humanoid system requires a massive amount of data to function effectively, and if that data changes constantly or is hard for the AI to consume, safety and security concerns can arise. For example, if a robot were to drive a car, the unpredictability of human drivers’ decisions and actions could make it challenging for the robot to operate safely.
 

2. Ethical Considerations

The more sophisticated humanoid AI becomes, the more ethical questions it raises. Consent, decision-making, the potential misuse of available information, and the changes it will bring to human relationships and emotions are all points of legitimate concern.
 

3. Technical Limitations

Even with today’s advanced AI models, challenges remain in understanding emotional expression and the meanings of words based on the context of their use. These challenges may be resolved as AI resources and investment grow.
 

4. Automation in Code Development and Future IT Market Challenges

Automation
GitHub Copilot and OpenAI’s Codex are two of the latest AI-powered tools for automated code generation. They have helped developers and tech leaders complete tasks faster than ever before. Not only do they reduce the number of developers required to achieve a project goal, but they also provide useful feedback on work.

Developers are able to finish projects faster, but there is a risk that less experienced developers will blindly trust AI-generated code and commit it to production environments.

Here’s an example of the kind of interaction an AI-powered code assistant enables when generating Python code (the completion below is an illustrative mock-up, not actual tool output):
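
```python
# Prompt to the assistant:
# "Write a function that checks whether a string is a palindrome."

# Suggested completion:
def is_palindrome(text: str) -> bool:
    """Return True if text reads the same forwards and backwards,
    ignoring case, spaces, and punctuation."""
    cleaned = "".join(ch.lower() for ch in text if ch.isalnum())
    return cleaned == cleaned[::-1]

print(is_palindrome("Never odd or even"))  # True
```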

 
The Future of the IT Market
Generative AI is impacting the IT market as we speak. Here are a few examples to consider:
 
1. Job displacement: As AI tools become more sophisticated, there is growing worry about job availability for technical professionals. Automation and advanced AI are widely believed to have contributed to mass corporate layoffs in the tech industry.
 
2. Skill evolution: Developers will need to allocate more time to learning and sharpening their skills in order to “compete” with advanced machines. For example, GPT-4 is estimated (unofficially) to have around 1.8 trillion parameters, whereas the average human brain has approximately 86 billion neurons. It may be wise for humans to dedicate their time to higher-level problem-solving work, letting AI take care of repetitive and routine coding and technical tasks.
 
3. Dependency and trust: Like humans, machines can also make mistakes. As a result, it’s not possible for developers to fully rely on AI-generated code yet.
 
4. Legal implications: The overall use of AI in coding and automation raises questions about ownership and accountability. Buggy AI-generated code could shut down an entire server. An AI-generated answer might contain bias. These scenarios could lead to legal issues.
 

Conclusion

GenAI is a powerful technology that can be leveraged to solve problems. The evolution of advanced AI models may make human life easier in the future, but it will also present challenges and issues. If we use these tools in the right way, humans will be able to work faster and more easily, and generate more productivity and revenue for the companies they work for. That said, AI development and usage must be handled ethically. It will be very interesting to see how AI takes shape over the next several years.
 
 

Thanks for reading. Want to keep your data analytics knowledge sharp?

Get the latest Key2 content and more delivered right to your inbox!
 

 
 
