CMSC 173

Module 13: Advanced Neural Networks


Advanced Neural Networks

CMSC 173 - Module 13

Noel Jeffrey Pinton
Department of Computer Science
University of the Philippines Cebu

Outline

\tableofcontents

What are Advanced Neural Networks?

Basic Neural Networks

  • Fully connected layers
  • Good for tabular data
  • Limited to simple patterns
  • We learned these already!
\begin{exampleblock}{Advanced Architectures}
  • CNNs: For images and spatial data
  • Transformers: For text and sequences
  • GANs: Generate new data
  • VAEs: Learn compressed representations
  • Diffusion: Create high-quality images
\end{exampleblock}

Why Learn These?

They power the AI you use every day:
  • ChatGPT (Transformer)
  • DALL-E 2 (Diffusion)
  • Face unlock on phones (CNN)
  • Google Translate (Transformer)
  • AI art generators (GAN/Diffusion)
\begin{tipblock}{This Module's Focus} Understanding applications rather than complex math! \end{tipblock}

Real-World Applications Overview

[Figure: ../figures/cnn_applications.png]

Course Philosophy

Learn by seeing what's possible! We'll focus on understanding what these networks can do and how to use them, not deriving complex mathematics.

CNNs: What Are They?

Simple Explanation

CNNs are neural networks designed for images. They work by:
  • Looking at small patches of the image
  • Finding patterns (edges, shapes, textures)
  • Building up to complex objects
  • Making decisions based on what they see
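The "small patches" and "finding patterns" steps can be sketched with a single filter slid across a tiny image. This is a toy illustration in plain Python, not how a production CNN is written: the function name `convolve2d`, the 4x6 image, and the hand-picked vertical-edge kernel are all assumptions made for the example. A real CNN learns its kernel values during training. As in most deep learning libraries, the operation shown is technically cross-correlation (the kernel is not flipped).

```python
# Toy illustration: one hand-written filter slid over a tiny grayscale image.
# In a real CNN the kernel values are learned, not chosen by hand.

def convolve2d(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1); return the response map."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            # Weighted sum over the small patch whose top-left corner is (r, c).
            total = 0
            for i in range(kh):
                for j in range(kw):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        output.append(row)
    return output

# 4x6 image: dark (0) on the left half, bright (1) on the right half.
image = [[0, 0, 0, 1, 1, 1] for _ in range(4)]

# Hypothetical kernel that responds when brightness increases left to right.
vertical_edge = [[-1, 0, 1],
                 [-1, 0, 1],
                 [-1, 0, 1]]

response = convolve2d(image, vertical_edge)
for row in response:
    print(row)  # each row prints [0, 3, 3, 0]: strong response only at the edge
```

The same nine kernel weights are reused at every position, which is why a pattern can be detected wherever it appears in the image.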
\begin{exampleblock}{Why Not Regular NNs?}
  • Images have too many pixels
  • Spatial relationships matter
  • Same pattern appears in different places
  • CNNs are much more efficient
\end{exampleblock}
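To make "too many pixels" and "much more efficient" concrete, the sketch below counts the weights in one fully connected layer versus one convolutional layer. The layer sizes (a 224x224 RGB image, 1000 hidden units, 64 filters of size 3x3) are assumptions chosen for illustration, not figures from this module.

```python
# Rough parameter counts for a single layer; sizes are illustrative assumptions.

def dense_layer_params(height, width, channels, hidden_units):
    """Every pixel value connects to every hidden unit, plus one bias per unit."""
    inputs = height * width * channels
    return inputs * hidden_units + hidden_units

def conv_layer_params(kernel_size, in_channels, num_filters):
    """Each filter has kernel_size x kernel_size x in_channels weights plus one bias."""
    return kernel_size * kernel_size * in_channels * num_filters + num_filters

dense = dense_layer_params(224, 224, 3, 1000)
conv = conv_layer_params(3, 3, 64)

print(f"Fully connected layer: {dense:,} parameters")  # 150,529,000
print(f"Convolutional layer:   {conv:,} parameters")   # 1,792
```

The convolutional count does not depend on image size at all, because the same small filters are shared across every position.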

[Figure: ../figures/cnn_simple_intuition.png]

Key Insight

CNNs learn to recognize patterns automatically - no manual feature engineering!

How CNNs Process Images

[Figure: ../figures/cnn_simple_architecture.png]

Processing Pipeline

Input Image $\rightarrow$ Find Edges $\rightarrow$ Find Shapes $\rightarrow$ Find Objects $\rightarrow$ Decision
\begin{exampleblock}{Analogy} Like how humans see: First we see lines and edges, then shapes, then we recognize "this is a cat!" \end{exampleblock}

CNNs vs Traditional Computer Vision

[Figure: ../figures/cnn_vs_traditional.png]

Traditional Methods

  • Manual feature design
  • Hard to adapt to new tasks
  • Limited accuracy
  • Lots of expert knowledge needed
\begin{exampleblock}{CNNs}
  • Automatic feature learning
  • Easily adapt to new problems
  • State-of-the-art accuracy
  • Mainly need labeled training data (often a lot of it)
\end{exampleblock}

CNN Applications: Medical Imaging

Cancer Detection

Real Application:
  • Detect tumors in X-rays and MRIs
  • Classify skin lesions (benign/malignant)
  • Analyze mammograms for breast cancer
  • Help radiologists work faster
\begin{exampleblock}{Impact}
  • Earlier disease detection
  • Fewer missed diagnoses
  • Reduced radiologist workload
  • Available in rural areas
\end{exampleblock}

Retinal Disease Diagnosis

Example: Google's Diabetic Retinopathy Detection
  • Analyzes eye scans
  • Detects diabetes complications
  • Matches expert doctor accuracy
  • Used in India, Thailand

Success Story

FDA-approved AI systems now assist doctors in real hospitals!

CNN Applications: Self-Driving Cars

Lane Detection

What CNNs Do:
  • Identify road lane markings
  • Track lane boundaries in real-time
  • Work in various lighting conditions
  • Handle curves and intersections

Object Detection

  • Detect pedestrians, cars, cyclists
  • Recognize traffic signs and lights
  • Estimate distance to objects
  • Predict object movement
\begin{exampleblock}{Companies Using This}
  • Tesla: Full Self-Driving (FSD)
  • Waymo: Autonomous taxis
  • Cruise: Robotaxis in SF
  • Mobileye: Driver assistance
\end{exampleblock}

Real Deployment

Over 1 million vehicles use CNN-based vision systems today!

CNN Applications: Face Recognition

Phone Unlock (Face ID)

How It Works:
  • CNN extracts facial features
  • Creates unique "face print"
  • Compares to stored template
  • Works in different lighting
  • Adapts to appearance changes
\begin{exampleblock}{Daily Use Cases}
  • iPhone/Android face unlock
  • Photo organization (Google Photos)
  • Security access control
  • Airport immigration
\end{exampleblock}

Social Media Applications

  • Facebook: Auto-tag friends in photos
  • Snapchat: Face filters and effects
  • Instagram: Beauty filters
  • TikTok: Face tracking for AR

Privacy Note

Face recognition raises important privacy concerns - always consider ethics!

CNN Applications: Security \& Surveillance

Smart Security Cameras

Capabilities:
  • Detect people vs animals
  • Recognize package delivery
  • Identify suspicious behavior
  • Track movement patterns
  • Send targeted alerts
\begin{exampleblock}{Consumer Products}
  • Ring Doorbell cameras
  • Nest security systems
  • Arlo smart cameras
  • Reduce false alarms by 90\%
\end{exampleblock}

Retail Applications

Amazon Go Stores:
  • Track what customers pick up
  • Automatic checkout (no cashiers)
  • Prevent shoplifting
  • Analyze shopping behavior

Industry Impact

Checkout-free stores save 75\% of labor costs while improving customer experience!

CNN Applications: Satellite Imagery Analysis

Environmental Monitoring

Applications:
  • Track deforestation in Amazon
  • Monitor crop health
  • Detect illegal fishing
  • Assess disaster damage
  • Map urban growth
\begin{exampleblock}{Real Projects}
  • Planet Labs: Daily Earth imaging
  • Global Fishing Watch: Ocean monitoring
  • NASA: Climate change tracking
\end{exampleblock}

Humanitarian Uses

  • Count refugees in camps
  • Assess natural disaster impact
  • Map poverty indicators
  • Monitor conflict zones
  • Guide relief efforts

Scale

CNNs can analyze millions of satellite images - impossible for humans alone!

What Are Generative Models?

[Figure: ../figures/generative_vs_discriminative.png]

Discriminative Models

What they do:
  • Classify/label existing data
  • "Is this a cat or dog?"
  • CNNs for image classification
\begin{exampleblock}{Generative Models} What they do:
  • Create new data
  • "Generate a new cat image"
  • GANs, VAEs, Diffusion models
\end{exampleblock}

Generative Model Applications Overview

[Figure: ../figures/generative_applications.png]

Three Main Types We'll Cover

  1. GANs (Generative Adversarial Networks): Two networks compete to create realistic images
  2. VAEs (Variational Autoencoders): Learn compressed representations, generate variations
  3. Diffusion Models: Start with noise, gradually create detailed images

GANs: The Basic Idea

[Figure: ../figures/gan_simple_concept.png]

Simple Explanation

Two neural networks compete:
  • Generator: Creates fake images (like an art forger)
  • Discriminator: Tries to spot fakes (like an art detective)
  • They get better by competing with each other
  • Ideally, the fakes become very hard to tell apart from real images!
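The forger-versus-detective competition can be expressed as two loss values computed from the discriminator's outputs. The sketch below is a minimal illustration, assuming the standard binary cross-entropy discriminator loss and the commonly used non-saturating generator loss. The probability values are made up, and no networks are trained here.

```python
import math

def discriminator_loss(d_real, d_fake):
    """Low when real images score near 1 and fakes score near 0."""
    real_term = -sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = -sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term

def generator_loss(d_fake):
    """Non-saturating form: low when the discriminator scores fakes near 1."""
    return -sum(math.log(p) for p in d_fake) / len(d_fake)

# Hypothetical discriminator outputs: probability that an image is real.
early_real, early_fake = [0.90, 0.80, 0.95], [0.10, 0.20, 0.05]  # fakes are easy to spot
late_real, late_fake = [0.55, 0.50, 0.60], [0.45, 0.50, 0.40]    # fakes fool the detective

print("Early training: D loss", round(discriminator_loss(early_real, early_fake), 3),
      "G loss", round(generator_loss(early_fake), 3))
print("Late training:  D loss", round(discriminator_loss(late_real, late_fake), 3),
      "G loss", round(generator_loss(late_fake), 3))
```

As the generator improves, its loss falls while the discriminator's loss rises toward the value it would get by guessing.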

GAN Application: AI Art Generation

Artbreeder

What it does:
  • Generate unique portraits
  • Mix different faces together
  • Adjust age, gender, ethnicity
  • Create landscapes, album covers
  • Used by 10+ million users
\begin{exampleblock}{How Artists Use It}
  • Book cover illustrations
  • Character design for games
  • Concept art for films
  • Social media content
\end{exampleblock}

ThisPersonDoesNotExist.com

  • Generates random faces
  • 100\% synthetic people
  • Photorealistic quality
  • New face every refresh
  • Built with StyleGAN

Try It Yourself!

Visit the website - every face you see was created by AI, not a photo!

GAN Application: Deepfake Detection

The Problem

Malicious Uses:
  • Fake celebrity videos
  • Misinformation campaigns
  • Identity fraud
  • Non-consensual content
\begin{exampleblock}{The Solution} GANs fight GANs:
  • Train detectors on fake data
  • Identify artifacts and inconsistencies
  • Real-time video verification
  • Protect public figures
\end{exampleblock}

Real Deployments

  • Facebook/Meta: Deepfake detection system
  • Microsoft: Video Authenticator tool
  • Intel: FakeCatcher (96\% accuracy)
  • Adobe: Content Authenticity Initiative

Arms Race

Detection technology must constantly evolve as GANs improve!

GAN Application: Synthetic Medical Data

Why Generate Medical Data?

Privacy \& Scarcity Issues:
  • Real patient data is private (HIPAA)
  • Rare diseases lack training samples
  • Hard to share data between hospitals
  • Need diverse examples for AI training
\begin{exampleblock}{What GANs Generate}
  • Synthetic X-rays
  • Artificial MRI scans
  • Fake patient records
  • Privacy-preserving datasets
\end{exampleblock}

Real Research Applications

  • Mayo Clinic: Generate rare tumor samples
  • Stanford: Synthetic chest X-rays
  • MIT: Privacy-safe medical records
  • Train better AI without compromising privacy

Impact

Enables medical AI research while protecting patient privacy!

GAN Application: Game Character Creation

Modern Game Development

How GANs Help:
  • Generate unique NPC faces
  • Create diverse character variations
  • Design textures and materials
  • Procedural content generation
  • Speed up asset creation
\begin{exampleblock}{Real Game Studios}
  • EA Sports: Generate realistic player faces
  • Ubisoft: NPC diversity in Assassin's Creed
  • Reduce manual art time by 70\%
\end{exampleblock}

Player Customization

  • Infinite character appearance options
  • Realistic face generation
  • Upload photo for custom avatar
  • AI-assisted character design

Industry Adoption

Major game engines (Unity, Unreal) now integrate GAN-based tools!

GAN Application: Fashion Design

AI Fashion Designers

What They Generate:
  • New clothing designs
  • Pattern and texture variations
  • Color scheme combinations
  • Style transfer between eras
  • Personalized recommendations
\begin{exampleblock}{Fashion Companies Using AI}
  • Stitch Fix: Personalized designs
  • Tommy Hilfiger: IBM collaboration
  • Zalando: Generated fashion models
\end{exampleblock}

Virtual Try-On

  • Generate how clothes look on you
  • Try outfits without physically wearing
  • Reduce online shopping returns
  • Personalized styling suggestions

Business Impact

AI-designed collections sell out 30\% faster than traditional designs!

VAEs: What Are They?

[Figure: ../figures/vae_simple_concept.png]

Simple Explanation

VAEs compress data into a small code, then decompress it:
  • Encoder: Compress image into compact representation (like zip file)
  • Latent Space: The compressed "code" capturing key features
  • Decoder: Reconstruct image from the code
  • Can generate new images by sampling random codes!

VAE Application: Anomaly Detection

Manufacturing Quality Control

How It Works:
  • Train VAE on normal products
  • VAE learns what "normal" looks like
  • Defects reconstruct poorly
  • High reconstruction error = defect!
\begin{exampleblock}{Real Applications}
  • Detect scratches on surfaces
  • Find cracks in materials
  • Identify missing components
  • Automated quality inspection
\end{exampleblock}
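The "high reconstruction error = defect" rule comes down to a threshold learned from normal data only. The sketch below shows just that thresholding step. The `reconstruct` function is a deliberately crude stand-in for a trained VAE (it always returns the average normal item), and the measurements and the mean-plus-three-standard-deviations threshold are assumptions made for illustration.

```python
import statistics

def reconstruct(item, template):
    """Stand-in for decoder(encoder(item)): always returns the typical normal item.
    A trained VAE would produce a reconstruction specific to each input."""
    return template

def reconstruction_error(item, rebuilt):
    """Mean squared difference between an item and its reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(item, rebuilt)) / len(item)

# Hypothetical measurements of normal products (no defect labels needed).
normal_items = [
    [1.0, 2.0, 3.0, 4.0],
    [1.1, 1.9, 3.0, 4.1],
    [0.9, 2.1, 2.9, 4.0],
    [1.0, 2.0, 3.1, 3.9],
]

# "Training": learn what normal looks like (here, the per-feature average).
template = [statistics.mean(column) for column in zip(*normal_items)]

# Threshold from errors on normal data; mean + 3 standard deviations is an assumption.
train_errors = [reconstruction_error(x, reconstruct(x, template)) for x in normal_items]
threshold = statistics.mean(train_errors) + 3 * statistics.stdev(train_errors)

def is_defect(item):
    """Flag an item whose reconstruction error exceeds the learned threshold."""
    return reconstruction_error(item, reconstruct(item, template)) > threshold

print(is_defect([1.05, 2.0, 2.95, 4.0]))  # close to normal
print(is_defect([1.0, 2.0, 3.0, 6.0]))    # last measurement far off
```

In practice the threshold is tuned on held-out data to balance missed defects against false alarms.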

Other Anomaly Detection Uses

  • Cybersecurity: Detect network intrusions
  • Finance: Identify fraudulent transactions
  • Healthcare: Flag unusual patient vitals
  • IoT: Detect sensor failures

Advantage

Works without labeled defect examples - learns from normal data only!

VAE Application: Image Compression

Why VAEs for Compression?

Advantages over JPEG:
  • Better quality at low bitrates
  • Learned compression (adapts to content)
  • Can compress to tiny sizes
  • Semantic preservation
\begin{exampleblock}{How It Works}
  • Encoder compresses to latent code
  • Store only the small code
  • Decoder reconstructs when needed
  • Can be much smaller than JPEG at similar perceived quality (gains vary by model and content)
\end{exampleblock}

Real-World Uses

  • Store medical imaging archives
  • Stream video at lower bandwidth
  • Compress satellite imagery
  • Mobile app image caching

Research Example

Google's neural image compression beats JPEG by 50\% in quality metrics!

VAE Application: Drug Molecule Generation

Pharmaceutical Discovery

Traditional Approach:
  • Test millions of molecules
  • Takes 10+ years per drug
  • Costs billions of dollars
  • High failure rate
\begin{exampleblock}{VAE Approach}
  • Learn from existing drugs
  • Generate similar molecules
  • Optimize for target properties
  • Find candidates much faster
\end{exampleblock}

Real Pharmaceutical AI

  • Insilico Medicine: Generated novel molecules
  • Atomwise: AI drug discovery platform
  • BenevolentAI: COVID-19 drug repurposing
  • Reduce discovery time by 75\%

Major Milestone

One of the first AI-designed drug candidates entered human clinical trials in 2020!

Transformers: What Are They?

[Figure: ../figures/transformer_simple_architecture.png]

Simple Explanation

Transformers process sequences by paying attention to relevant parts:
  • Designed for text, but work on images/audio too
  • Use "attention" to focus on important words
  • Process entire sequence at once (fast!)
  • Foundation of modern AI: GPT, BERT, ChatGPT

Transformer Applications Overview

[Figure: ../figures/transformer_applications.png]

Why Transformers Changed Everything

Before 2017: RNNs struggled with long sequences. After 2017: Transformers enabled GPT, BERT, and the current AI revolution!

Transformer Application: ChatGPT

What ChatGPT Can Do

Capabilities:
  • Answer questions
  • Write code and debug
  • Compose essays and emails
  • Explain complex topics
  • Translate languages
  • Creative writing
\begin{exampleblock}{Real Usage Statistics}
  • 100+ million weekly users
  • Fastest-growing consumer app
  • Used in 185+ countries
\end{exampleblock}

How Students Use It

  • Homework help and tutoring
  • Research assistance
  • Programming debugging
  • Study guide creation
  • Language learning
  • Career advice

Built With Transformers

GPT-3, the predecessor of the models behind ChatGPT, is a transformer with 175 billion parameters. OpenAI has not published GPT-4's size.

Transformer Application: Google Translate

Old vs New Approach

Before neural translation (until 2016):
  • Phrase-based translation
  • Limited context understanding
  • Often awkward output
With neural translation (2016 onward, later Transformer-based):
  • Sentence-level context
  • Natural, fluent translations
  • About 60\% fewer errors reported by Google for its first neural system
\begin{exampleblock}{Features Powered by Transformers}
  • 133 languages supported
  • Real-time conversation mode
  • Camera translation (point and translate)
  • Offline translation
  • Context-aware results
\end{exampleblock}

Daily Impact

500+ million people use Google Translate every day!

Transformer Application: GitHub Copilot

AI Pair Programmer

What Copilot Does:
  • Suggests code as you type
  • Writes entire functions
  • Explains existing code
  • Converts comments to code
  • Generates tests
  • Fixes bugs
\begin{exampleblock}{Real Developer Impact}
  • 46\% of code written by AI
  • 55\% faster task completion
  • Used by 1.2 million developers
\end{exampleblock}

How It Works

  • Built on GPT (Codex model)
  • Trained on billions of lines of code
  • Understands context from your files
  • Suggests in real-time
  • Supports 12+ programming languages

For Students

Great learning tool - see how experts solve problems!

Transformer Application: Email Auto-Complete

Gmail Smart Compose

Features:
  • Suggests next words/sentences
  • Learns your writing style
  • Adapts to context
  • Multi-language support
  • Works on mobile too
\begin{exampleblock}{Time Savings}
  • Users collectively save over 1 billion typed characters per week
  • Reduces writing time by 11\%
  • 4+ billion emails use it daily
\end{exampleblock}

Other Email AI Features

  • Smart Reply: Suggest full responses
  • Subject suggestions: Auto-generate subjects
  • Tone adjustment: Make emails more formal
  • Grammar correction: Fix mistakes

All Powered by Transformers

These "small" conveniences use the same tech as ChatGPT!

Transformer Application: Document Summarization

Automatic Summarization

What It Does:
  • Read long documents
  • Extract key points
  • Generate concise summary
  • Preserve important details
  • Save reading time
\begin{exampleblock}{Real Products}
  • Microsoft Word: Auto-summarize
  • Slack: Thread summaries
  • Notion AI: Note summarization
  • Chrome extensions: Web page summaries
\end{exampleblock}

Use Cases

  • Research paper summaries
  • News article digests
  • Legal document review
  • Meeting notes condensation
  • Customer feedback analysis

Productivity Boost

Lawyers using AI summarization save 60\% of document review time!

Vision Transformers: Images Meet Transformers

[Figure: ../figures/vision_transformer_concept.png]

Vision Transformers (ViT)

Applying transformers to images:
  • Break image into patches (like words)
  • Apply transformer attention to patches
  • Often better than CNNs with enough data
  • Transformer-based components appear in systems such as DALL-E and Imagen
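The "break image into patches (like words)" step can be sketched in a few lines. This is a simplified illustration: the function name `image_to_patches` and the tiny 4x4 single-channel image are assumptions, and a real Vision Transformer would also project each flattened patch to an embedding and add position information before applying attention.

```python
def image_to_patches(image, patch_size):
    """Split a 2D image into non-overlapping square patches, each flattened to a list."""
    height, width = len(image), len(image[0])
    if height % patch_size or width % patch_size:
        raise ValueError("image dimensions must be divisible by patch_size")
    patches = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            # Read the patch row by row and flatten it into one "token".
            patch = [image[top + i][left + j]
                     for i in range(patch_size)
                     for j in range(patch_size)]
            patches.append(patch)
    return patches

# Tiny 4x4 image whose pixel values are simply 0..15.
image = [[row * 4 + col for col in range(4)] for row in range(4)]

patches = image_to_patches(image, patch_size=2)
for patch in patches:
    print(patch)
# [0, 1, 4, 5]
# [2, 3, 6, 7]
# [8, 9, 12, 13]
# [10, 11, 14, 15]
```

The resulting list of patches plays the same role for the transformer that a list of word tokens plays in text.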

Diffusion Models: How They Work

[Figure: ../figures/diffusion_simple_concept.png]

Simple Explanation

Create images by gradually removing noise:
  • Start with pure random noise
  • Gradually remove noise step-by-step
  • Guided by text description
  • End with high-quality image
  • Like a sculptor revealing a statue from marble!
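The "start with noise, remove it step by step" loop can be sketched on a four-number "image". This is a toy under strong assumptions: the noise schedule values are common defaults rather than anything specified in this module, the update rule is a deterministic DDIM-style step (one of several possible samplers), and `oracle_noise_prediction` cheats by looking at the clean image. In a real diffusion model that function is a trained neural network that never sees the answer.

```python
import math
import random

def make_alpha_bars(num_steps, beta_start=1e-4, beta_end=0.02):
    """alpha_bars[t] is the share of the original signal left after t noising steps."""
    alpha_bars = [1.0]  # t = 0: clean image, no noise yet
    for step in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * step / (num_steps - 1)
        alpha_bars.append(alpha_bars[-1] * (1.0 - beta))
    return alpha_bars

def add_noise(clean, noise, alpha_bar):
    """Forward process: blend the clean image with Gaussian noise."""
    return [math.sqrt(alpha_bar) * c + math.sqrt(1.0 - alpha_bar) * n
            for c, n in zip(clean, noise)]

def oracle_noise_prediction(noisy, clean, alpha_bar):
    """Stand-in for the trained network: recovers the noise exactly by cheating."""
    return [(x - math.sqrt(alpha_bar) * c) / math.sqrt(1.0 - alpha_bar)
            for x, c in zip(noisy, clean)]

def denoise_step(noisy, predicted_noise, alpha_bar_t, alpha_bar_prev):
    """One deterministic (DDIM-style) step from noise level t to level t-1."""
    clean_estimate = [(x - math.sqrt(1.0 - alpha_bar_t) * e) / math.sqrt(alpha_bar_t)
                      for x, e in zip(noisy, predicted_noise)]
    return [math.sqrt(alpha_bar_prev) * c + math.sqrt(1.0 - alpha_bar_prev) * e
            for c, e in zip(clean_estimate, predicted_noise)]

rng = random.Random(0)
num_steps = 1000
alpha_bars = make_alpha_bars(num_steps)

clean_image = [0.2, 0.8, 0.5, 0.1]  # hypothetical 4-pixel "image"
noise = [rng.gauss(0.0, 1.0) for _ in clean_image]

# Start from an almost pure-noise version of the image.
x = add_noise(clean_image, noise, alpha_bars[num_steps])

# Reverse process: remove a little noise at every step.
for t in range(num_steps, 0, -1):
    predicted = oracle_noise_prediction(x, clean_image, alpha_bars[t])
    x = denoise_step(x, predicted, alpha_bars[t], alpha_bars[t - 1])

print([round(value, 4) for value in x])
```

Because the oracle is exact, the loop recovers the clean image. With a learned network the result is instead a new image that resembles the training data.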

Diffusion vs GANs vs VAEs

[Figure: ../figures/diffusion_vs_others.png]

GANs

Pros: Fast generation\\ Cons: Hard to train, mode collapse

VAEs

Pros: Stable, good latent space\\ Cons: Blurry outputs
\begin{exampleblock}{Diffusion} Pros: Best quality, stable\\ Cons: Slower generation \end{exampleblock}

Diffusion Applications Overview

[Figure: ../figures/diffusion_applications.png]

Why Diffusion Models Won

They power DALL-E 2, Midjourney, Stable Diffusion - the best AI image generators today!

Diffusion Application: DALL-E 2

What DALL-E 2 Can Do

Text-to-Image Generation:
  • Type a description, get an image
  • Photorealistic or artistic styles
  • Combine multiple concepts
  • Edit existing images
  • Outpainting (extend images)
\begin{exampleblock}{Example Prompts}
  • "A cat astronaut on Mars"
  • "Oil painting of a sunset over Manila"
  • "Teddy bear shopping for groceries"
\end{exampleblock}

Real-World Uses

  • Marketing content creation
  • Concept art for entertainment
  • Educational illustrations
  • Social media graphics
  • Product mockups

By OpenAI

Same company behind ChatGPT - 1.5+ million users create images daily!

Diffusion Application: Midjourney

What Makes Midjourney Special

Artistic Focus:
  • Exceptionally beautiful outputs
  • Strong artistic style
  • Great for fantasy/sci-fi art
  • Discord-based interface
  • Community of 16+ million users
\begin{exampleblock}{Popular Use Cases}
  • Book cover designs
  • Album artwork
  • Game concept art
  • NFT art generation
\end{exampleblock}

Industry Impact

  • Artists use it for inspiration
  • Magazine covers created with AI
  • Winning entries in art competitions
  • Commercial illustration work

Controversy

A Midjourney-generated artwork won first place in the digital art category at the 2022 Colorado State Fair - sparking debate about AI creativity!

Diffusion Application: Stable Diffusion

Why Stable Diffusion is Different

Open Source:
  • Free to use and modify
  • Run on your own computer
  • Customize and fine-tune
  • Permissive license (with some use-based restrictions)
  • Active developer community
\begin{exampleblock}{Technical Details}
  • Can run on consumer GPUs
  • Faster than DALL-E 2
  • Extensible with plugins
  • Multiple versions and variants
\end{exampleblock}

Popular Applications Built With It

  • DreamStudio (official interface)
  • Automatic1111 (popular UI)
  • ComfyUI (node-based editor)
  • Mobile apps (Draw Things)
  • Photoshop plugins

Democratizing AI

Anyone with a decent computer can now generate professional-quality images!

Diffusion Application: Adobe Firefly

Professional Image Editing

Firefly Features:
  • Text-to-image generation
  • Generative fill (edit parts of images)
  • Text effects (3D text styles)
  • Generative recolor
  • Integrated in Photoshop
\begin{exampleblock}{Key Advantages}
  • Trained on Adobe Stock (licensed data)
  • Commercially safe to use
  • Professional quality outputs
  • Seamless Creative Cloud integration
\end{exampleblock}

Real Designer Workflows

  • Remove unwanted objects
  • Extend backgrounds
  • Generate variations quickly
  • Create mockups from descriptions
  • Speed up creative process 10x

Industry Standard

Adobe's AI tools are becoming essential for professional designers!

Diffusion Application: Video Generation

Text-to-Video AI

Emerging Applications:
  • Generate short video clips
  • Animate static images
  • Create transitions
  • Style transfer for video
  • AI-assisted editing
\begin{exampleblock}{Current Platforms}
  • Runway Gen-2: Text-to-video
  • Pika Labs: Video generation
  • Stable Video Diffusion: Open source
\end{exampleblock}

Use Cases

  • Social media content
  • Marketing videos
  • Animated presentations
  • Film pre-visualization
  • Game cinematics

Future is Coming

Video generation is improving rapidly - expect major breakthroughs soon!

Text-to-Image Process Explained

[Figure: ../figures/text_to_image_process.png]

How It All Works Together

  1. Text Encoder (Transformer): Understand your description
  2. Diffusion Model: Generate image from noise
  3. Guidance: Steer generation toward text description
  4. Refinement: Iteratively improve quality
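The "Guidance" step is often implemented with classifier-free guidance, which mixes two noise predictions at every denoising step: one made with the text prompt and one made without it. The sketch below shows only that mixing rule. The function name `guided_noise` and the sample numbers are assumptions, and whether a given product uses exactly this rule is not stated in this module.

```python
def guided_noise(noise_without_text, noise_with_text, guidance_scale):
    """Classifier-free guidance: push the prediction away from the unconditional
    estimate and toward the text-conditioned one."""
    return [uncond + guidance_scale * (cond - uncond)
            for uncond, cond in zip(noise_without_text, noise_with_text)]

# Hypothetical noise predictions for a 2-value "image".
without_text = [0.0, 1.0]
with_text = [1.0, 1.0]

print(guided_noise(without_text, with_text, 0.0))  # [0.0, 1.0]: prompt ignored
print(guided_noise(without_text, with_text, 1.0))  # [1.0, 1.0]: plain conditional prediction
print(guided_noise(without_text, with_text, 7.5))  # [7.5, 1.0]: prompt strongly emphasized
```

Higher guidance scales usually follow the prompt more closely, at the cost of less variety between generated images.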

Ethical Considerations

[Figure: ../figures/generative_ethics.png]

Important Questions to Consider

As these technologies become powerful, we must think carefully about their impact!

Key Ethical Issues

Misinformation \& Deepfakes

Concerns:
  • Fake news and propaganda
  • Identity fraud
  • Non-consensual content
  • Erosion of trust in media
Solutions:
  • Detection technology
  • Digital watermarking
  • Media literacy education
  • Legal frameworks

Bias \& Fairness

Problems:
  • Biased training data
  • Perpetuating stereotypes
  • Unfair representation
  • Discrimination in outputs
Mitigation:
  • Diverse training datasets
  • Bias testing and auditing
  • Responsible AI guidelines
  • Inclusive development teams

More Ethical Considerations

Copyright \& Intellectual Property

Questions:
  • Who owns AI-generated content?
  • Is training on copyrighted data fair use?
  • Should artists be compensated?
  • How to attribute AI creations?
Current Debates:
  • Ongoing lawsuits (artists vs AI companies)
  • New legislation being proposed
  • Industry opt-out mechanisms

Job Displacement

Concerns:
  • Will AI replace creative jobs?
  • Impact on artists, writers, designers
  • Economic inequality
  • Need for reskilling
Opportunities:
  • AI as a tool, not replacement
  • New creative possibilities
  • Democratization of creation
  • Focus on uniquely human skills

Your Responsibility

As future AI practitioners, think critically about the impact of your work!

Understanding Attention

[Figure: ../figures/attention_simple_concept.png]

What is Attention?

A mechanism that lets neural networks focus on relevant parts:
  • In text: Focus on important words in a sentence
  • In images: Focus on relevant image regions
  • Learns automatically what to pay attention to
  • Core component of Transformers
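The "focus on relevant parts" idea can be sketched as scaled dot-product attention for a single query word. The two-dimensional word vectors below are made up for illustration. In a real Transformer they are learned, have hundreds of dimensions, and queries, keys, and values come from separate learned projections.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]  # subtract max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]

def attention(query, keys, values):
    """Scaled dot-product attention for a single query vector."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    # Output is the weighted average of the value vectors.
    output = [sum(w * value[i] for w, value in zip(weights, values))
              for i in range(len(values[0]))]
    return weights, output

# Hypothetical 2-D vectors for three context words and the query word "sat".
words = ["the", "cat", "mat"]
keys = [[0.1, 0.1], [1.0, 0.2], [0.8, 0.4]]
values = keys  # reuse the same vectors as values to keep the example short
query_sat = [2.0, 0.5]

weights, output = attention(query_sat, keys, values)
for word, weight in zip(words, weights):
    print(f"{word}: {weight:.2f}")
```

With these made-up vectors, "sat" places most of its weight on "cat" and "mat" and little on "the".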

Attention Example: Language Translation

Problem Without Attention

Translating: "The cat sat on the mat"
Old approach:
  • Process word by word left to right
  • Forget earlier context
  • Struggle with long sentences
  • Poor word alignment
\begin{exampleblock}{With Attention} For each output word, the model:
  • Looks at ALL input words
  • Focuses on relevant ones
  • "sat" pays attention to "cat" and "mat"
  • Handles long-distance dependencies
  • Better translation quality
\end{exampleblock}

Why It's Revolutionary

Attention enabled Transformers to outperform earlier sequence models such as RNNs on many language tasks!

Getting Started: Available Tools

Free/Accessible Tools

Try these today:
  • ChatGPT: Free tier available
  • Bing Image Creator: Free DALL-E access
  • Google Colab: Run Stable Diffusion free
  • Hugging Face: Try many models online
  • Runway: Free trial for video
\begin{exampleblock}{Learning Resources}
  • Fast.ai courses (free)
  • Hugging Face tutorials
  • Papers with Code
  • YouTube: Two Minute Papers
\end{exampleblock}

For Developers

Build your own:
  • PyTorch or TensorFlow
  • Hugging Face Transformers library
  • Stable Diffusion on GitHub
  • Pre-trained models available
  • Fine-tune on your data

Start Small

Use existing models before building from scratch - learn by doing!

Tips for Using AI Image Generators

Writing Good Prompts

Be specific:
  • Describe style (photorealistic, cartoon, oil painting)
  • Specify details (colors, lighting, mood)
  • Mention composition (close-up, wide shot)
  • Add quality keywords (4K, detailed, masterpiece)
\begin{exampleblock}{Example Good Prompt} "A majestic golden retriever sitting in a flower meadow at sunset, photorealistic, warm lighting, shallow depth of field, 4K quality" \end{exampleblock}
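The prompt-writing checklist above can be kept consistent across experiments with a small helper. The function `build_prompt` and its parameter names are hypothetical conveniences, not part of any image generator's API. The helper simply joins the pieces with commas.

```python
def build_prompt(subject, style=None, details=(), composition=None, quality=()):
    """Assemble a comma-separated prompt: subject, style, details, composition, quality."""
    parts = [subject]
    if style:
        parts.append(style)
    parts.extend(details)
    if composition:
        parts.append(composition)
    parts.extend(quality)
    return ", ".join(parts)

prompt = build_prompt(
    subject="A majestic golden retriever sitting in a flower meadow at sunset",
    style="photorealistic",
    details=["warm lighting", "shallow depth of field"],
    quality=["4K quality"],
)
print(prompt)
```

Changing one argument at a time (for example only `style`) makes it easier to see which part of the prompt caused a change in the output.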

Iteration is Key

  • Generate multiple variations
  • Refine your prompt
  • Use negative prompts (what to avoid)
  • Adjust parameters (steps, guidance)
  • Learn from community prompts

Pro Tip

Check out prompt libraries (Lexica.art, PromptHero) to learn from others!

Common Challenges \& Solutions

Challenge: Poor Results

If outputs look bad:
  • Improve your prompt specificity
  • Try different seed values
  • Adjust generation parameters
  • Use a different model/variant
  • Increase generation steps

Challenge: Wrong Anatomy/Details

Known limitations:
  • Hands and fingers often wrong
  • Text in images unclear
  • Physics may be incorrect
  • Use inpainting to fix specific parts

Challenge: Slow Generation

Speed up:
  • Use lower resolution first
  • Reduce number of steps
  • Try faster samplers
  • Use GPU acceleration
  • Consider paid services for speed

Challenge: Reproducibility

Get consistent results:
  • Save your seed numbers
  • Keep prompt exactly the same
  • Note all parameters used
  • Use img2img for variations
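Saving the seed matters because the seed fixes the random starting noise. The sketch below illustrates the idea with Python's standard `random` module. The function `starting_noise` is a hypothetical stand-in for the noise an image generator draws internally, and real tools expose the seed through their own settings.

```python
import random

def starting_noise(seed, num_values=4):
    """Draw the initial random noise for a generation run from a fixed seed."""
    rng = random.Random(seed)  # private generator, so other code cannot disturb it
    return [rng.gauss(0.0, 1.0) for _ in range(num_values)]

first_run = starting_noise(seed=42)
second_run = starting_noise(seed=42)
other_seed = starting_noise(seed=43)

print(first_run == second_run)  # True: same seed, same starting noise
print(first_run == other_seed)  # False: different seed, different starting noise
```

Identical noise only leads to an identical image if the prompt, model version, and every other parameter are also unchanged.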

Key Takeaways

What We Learned

Five major architectures changing the world:
  1. CNNs: Revolutionized computer vision (medical imaging, self-driving cars, face recognition)
  2. GANs: Generate realistic images (AI art, deepfakes, synthetic data)
  3. VAEs: Compress and generate (anomaly detection, drug discovery)
  4. Transformers: Dominate NLP (ChatGPT, translation, code generation)
  5. Diffusion: Best image generation (DALL-E 2, Midjourney, Stable Diffusion)

Main Message

These aren't just research projects - they're tools you can use TODAY in real applications!

Applications Summary

CNN Applications

  • Medical tumor detection
  • Self-driving lane detection
  • Phone face unlock
  • Security cameras
  • Satellite imagery analysis

GAN Applications

  • Artbreeder AI art
  • Deepfake detection
  • Synthetic medical data
  • Game character creation
  • Fashion design

Transformer Applications

  • ChatGPT conversations
  • Google Translate
  • GitHub Copilot
  • Email auto-complete
  • Document summarization

Diffusion Applications

  • DALL-E 2 image generation
  • Midjourney art creation
  • Stable Diffusion (open source)
  • Adobe Firefly editing
  • Video generation (emerging)

The Future is Here

Trends to Watch

Next 1-2 years:
  • Multimodal AI: Text, image, audio, video together
  • Better video generation: Movie-quality AI videos
  • 3D generation: Create 3D models from text
  • Real-time generation: Instant results
  • Personalization: AI that learns your style
\begin{exampleblock}{Career Opportunities} Skills in demand:
  • AI/ML engineering
  • Prompt engineering
  • AI safety and ethics
  • Creative AI applications
  • AI product management
\end{exampleblock}

Get Involved

The best way to learn is to experiment - start building today!

How to Continue Learning

Hands-On Practice

  • Try Stable Diffusion on Colab
  • Build projects with Hugging Face
  • Fine-tune models on your data
  • Participate in Kaggle competitions
  • Contribute to open source projects

Online Courses

  • Fast.ai: Practical Deep Learning
  • Stanford CS230: Deep Learning
  • Coursera: Deep Learning Specialization
  • Hugging Face NLP Course (free)

Stay Updated

  • Follow Papers with Code
  • Read AI newsletters (The Batch, etc.)
  • Watch Two Minute Papers (YouTube)
  • Join AI Discord communities
  • Attend local meetups
\begin{exampleblock}{Next Steps in This Course} Workshop: Hands-on coding with ResNet, GPT-2, Stable Diffusion - let's use these models! \end{exampleblock}

Questions?

{\Large Thank you for your attention!}

Contact Information

Instructor: Noel Jeffrey Pinton\\ Course: CMSC 173 - Machine Learning\\ Institution: University of the Philippines - Cebu\\ Department: Computer Science

Remember

Advanced neural networks are tools that empower creativity and solve real problems. Use them responsibly and ethically!

End of Module 13

Advanced Neural Networks
