Introduction to Digital Image Processing
CMSC 178IP - Module 01
Noel Jeffrey Pinton
Department of Computer Science
University of the Philippines Cebu
What is Digital Image Processing?
Digital Image Processing (DIP) is the use of computer algorithms to analyze, enhance, and transform digital images.
Input
- Digital image(s)
- 2D array of pixel values
Output
- Enhanced image
- Extracted information
- Measurements or features
Real-World Applications
Where is DIP Used?
- Medical Imaging: CT scans, MRI, X-rays
- Satellite & Remote Sensing: Weather, agriculture, mapping
- Autonomous Vehicles: Object detection, lane recognition
- Biometrics: Face recognition, fingerprint analysis
- Social Media: Filters, enhancement, compression
Learning Objectives
By the end of this module, you will:
- Explain how digital images are represented as 2D arrays
- Describe the processes of sampling and quantization
- Distinguish between binary, grayscale, and color images
- Understand RGB and HSV color models
- Calculate memory requirements for different image types
Image as a 2D Function
Continuous Image: A function $f(x, y)$ where $x, y$ are spatial coordinates and $f$ is the intensity (brightness).
Digital Image: A discrete function $f[m, n]$ where indices $m, n$ are integers.
$$f: \mathbb{Z}^2 \rightarrow \{0, 1, 2, ..., L-1\}$$
Continuous
- $x, y \in \mathbb{R}$
- Infinite precision
- Real-world scene
Discrete
- $m \in \{0, ..., M-1\}$
- $n \in \{0, ..., N-1\}$
- Finite pixels
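The discrete representation above can be sketched directly as a NumPy array. A minimal illustration (the array name and pixel values here are arbitrary, chosen only to show the idea):

```python
import numpy as np

# A tiny 4x4 8-bit "image": M = N = 4, with L = 2**8 = 256 gray levels
f = np.array([
    [  0,  64, 128, 192],
    [ 64, 128, 192, 255],
    [128, 192, 255, 192],
    [192, 255, 192, 128],
], dtype=np.uint8)

print(f.shape)   # (M rows, N columns) -> (4, 4)
print(f[0, 3])   # intensity at row m=0, column n=3 -> 192
```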
Knowledge Check
Think About It
What does f(x,y) represent in image processing?
f(x,y) represents the intensity (brightness) value at the spatial coordinates (x,y). In a continuous image, x and y are real numbers; in a digital image, they are discrete indices.
Click the blurred area to reveal the answer
From Continuous to Discrete
Converting a continuous image to digital form requires two processes: Sampling and Quantization.

Pixel: Picture element — the smallest addressable unit in a digital image.
Image Coordinate System
Standard Convention
- Origin: Top-left corner at (0, 0)
- x-axis: Increases to the right (columns)
- y-axis: Increases downward (rows)

Matrix notation: $I[i, j]$ where $i$ = row, $j$ = column
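Because matrix indexing is row-first, the (x, y) pixel coordinates are swapped when indexing an array. A small sketch (the array `I` is illustrative):

```python
import numpy as np

I = np.zeros((3, 5), dtype=np.uint8)  # 3 rows (height), 5 columns (width)
I[1, 4] = 255                         # row i=1, column j=4

# In (x, y) pixel-coordinate terms this is x=4 (rightward), y=1 (downward),
# so array access is I[y, x], not I[x, y]:
x, y = 4, 1
assert I[y, x] == 255
```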
Spatial Sampling
Sampling: The process of converting continuous spatial coordinates into discrete pixel locations.
Higher Sampling Rate
- More pixels
- More detail preserved
- Larger file size
Lower Sampling Rate
- Fewer pixels
- Loss of detail
- Aliasing artifacts
$$\text{Nyquist Theorem: } f_s \geq 2 \cdot f_{max}$$
The sampling rate must be at least twice the highest frequency in the image to avoid aliasing.
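Aliasing can be demonstrated numerically. A minimal 1-D sketch (using a sinusoid rather than an image, for simplicity): a 3 Hz signal sampled at only 4 samples per second, below its Nyquist rate of 6, produces exactly the same samples as a 1 Hz signal.

```python
import numpy as np

fs = 4.0                 # sampling rate: 4 samples per unit time
t = np.arange(16) / fs   # sample instants

high  = np.sin(2 * np.pi * 3 * t)    # 3 Hz signal; fs/2 = 2 Hz is exceeded
alias = np.sin(-2 * np.pi * 1 * t)   # its 1 Hz alias (3 - fs = -1)

# Sampled below the Nyquist rate, the two are indistinguishable:
assert np.allclose(high, alias)
```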
Knowledge Check
Think About It
What happens if we sample below the Nyquist rate?
Aliasing occurs — high-frequency details appear as false low-frequency patterns (jagged edges, moiré patterns). Information is permanently lost and cannot be recovered.
Effect of Sampling Rate
Lower sampling rates result in fewer pixels and loss of fine detail.

Observations
- 512×512: Full detail (262,144 pixels)
- 128×128: Some detail loss (16,384 pixels)
- 32×32: Heavy pixelation (1,024 pixels)
Spatial Resolution
Spatial Resolution: The number of pixels per unit length (e.g., pixels per inch, PPI).

Common Resolutions
- VGA: 640 × 480 (307,200 pixels)
- Full HD: 1920 × 1080 (2.07 megapixels)
- 4K: 3840 × 2160 (8.29 megapixels)
Intensity Quantization
Quantization: The process of mapping continuous intensity values to discrete levels.
$$L = 2^k \text{ levels, where } k = \text{bits per pixel}$$
Common Bit Depths
- 1-bit: 2 levels (binary)
- 4-bit: 16 levels
- 8-bit: 256 levels
- 16-bit: 65,536 levels
Effects
- Fewer levels → visible banding
- More levels → smoother gradients
- Trade-off: quality vs. storage
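Requantization to fewer levels can be sketched with integer arithmetic (a minimal illustration; the helper name `quantize` is ours, not a library function):

```python
import numpy as np

def quantize(img, k):
    """Requantize an 8-bit image to k bits (2**k levels)."""
    step = 256 // (2 ** k)
    return (img // step) * step  # snap each value to the bottom of its bin

gray = np.arange(256, dtype=np.uint8)  # a full 0-255 gradient
q2 = quantize(gray, 2)                 # 2-bit: only 4 distinct levels remain
print(sorted(set(q2.tolist())))        # -> [0, 64, 128, 192]
```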
Effect of Quantization
Fewer bits per pixel results in visible banding (posterization effect).

Observations
- 8-bit: 256 levels — smooth gradients
- 4-bit: 16 levels — slight banding
- 2-bit: 4 levels — obvious posterization
- 1-bit: 2 levels — binary only
Image Types Overview
Three Common Image Types

- Binary: 1-bit, 2 values
- Grayscale: 8-bit, 256 levels
- Color: 24-bit RGB
- 3 channels × 8 bits each
Binary Images
Binary Image: Only two values — 0 (black) or 1 (white).
Properties
- 1 bit per pixel
- Values: 0 or 1 (commonly stored as {0, 255} in 8-bit arrays)
- Simplest image type
Applications
- Document scanning
- Masks and regions
- Morphological operations
Creating Binary Images
import numpy as np
# Thresholding: pixels brighter than 128 become white (255), the rest black (0)
binary = (gray > 128).astype(np.uint8) * 255
Grayscale Images
Grayscale Image: 256 intensity levels from black (0) to white (255).
Properties
- 8 bits per pixel
- 1 channel
- Values: 0-255
Memory
$$\text{Size} = W \times H \text{ bytes}$$
Example: 640×480 = 307.2 KB
RGB to Grayscale (Luminosity)
$$Y = 0.299R + 0.587G + 0.114B$$
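The luminosity formula can be applied to a whole image with one weighted sum over the channel axis. A minimal sketch (the helper name `rgb_to_gray` is ours):

```python
import numpy as np

def rgb_to_gray(rgb):
    """Luminosity method: Y = 0.299 R + 0.587 G + 0.114 B."""
    weights = np.array([0.299, 0.587, 0.114])
    return (rgb @ weights).astype(np.uint8)  # weighted sum over channels

# Pure green reads as much brighter than pure blue, matching the weights:
green = np.array([[[0, 255, 0]]], dtype=np.uint8)
blue  = np.array([[[0, 0, 255]]], dtype=np.uint8)
print(rgb_to_gray(green)[0, 0], rgb_to_gray(blue)[0, 0])  # -> 149 29
```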
Knowledge Check
Think About It
Why are the coefficients in the luminosity formula different?
Human eyes are most sensitive to green light, less to red, and least to blue. The coefficients (0.299, 0.587, 0.114) weight each channel according to perceived brightness.
Color Images (RGB)
RGB Color Image: Three channels — Red, Green, Blue — each with 8 bits.
Properties
- 24 bits per pixel
- 3 channels
- 16.7 million colors
Memory
$$\text{Size} = W \times H \times 3 \text{ bytes}$$
Example: 640×480 = 921.6 KB
Accessing Channels
# Assumes RGB channel order (note: OpenCV's imread returns BGR)
R = image[:, :, 0]
G = image[:, :, 1]
B = image[:, :, 2]
RGB Color Model
RGB: Additive color mixing — combining light.

Color Mixing
- R + G = Yellow
- G + B = Cyan
- R + B = Magenta
- R + G + B = White
Pure Colors
- Red: (255, 0, 0)
- Green: (0, 255, 0)
- Blue: (0, 0, 255)
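Additive mixing is just per-channel addition of light. A minimal sketch (widening to a larger integer type first so the sum cannot overflow before clipping):

```python
import numpy as np

red   = np.array([255, 0, 0], dtype=np.uint16)
green = np.array([0, 255, 0], dtype=np.uint16)

# Light from both sources adds; clip back into the 8-bit range:
yellow = np.clip(red + green, 0, 255).astype(np.uint8)
print(yellow.tolist())  # -> [255, 255, 0], i.e. yellow
```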
HSV Color Model
HSV: Hue, Saturation, Value — more intuitive for humans.

Components
- H (Hue): Color type (0-360°)
- S (Saturation): Color purity (0-100%)
- V (Value): Brightness (0-100%)
Advantages
- Intuitive color selection
- Better for thresholding
- Separates luminance from color
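The separation of brightness from color can be seen with Python's standard-library colorsys module (values are normalized to [0, 1]; hue is returned as a fraction of a full turn, so multiply by 360 for degrees):

```python
import colorsys

# A dark red and a bright red: different Value, identical Hue
h1, s1, v1 = colorsys.rgb_to_hsv(0.5, 0.0, 0.0)  # dark red
h2, s2, v2 = colorsys.rgb_to_hsv(1.0, 0.0, 0.0)  # bright red

assert h1 == h2 == 0.0   # same hue: both are "red" (0 degrees)
assert v1 < v2           # only the brightness (Value) differs
```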
Knowledge Check
Think About It
When would you use HSV instead of RGB?
HSV is better for color-based segmentation (e.g., detecting red objects) because you can threshold on Hue regardless of lighting (Value). In RGB, shadows and highlights change all three channels.
Color Space Conversion
Converting between color spaces is essential for many image processing tasks.

OpenCV Conversion
import cv2
# BGR to HSV
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
# BGR to Grayscale
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
Bit Depth & Storage
Higher bit depth = better quality but larger file size.

Trade-off: Always balance quality requirements against storage and processing constraints.
Memory Calculation
Uncompressed Image Size:
$$\text{Bytes} = \text{Width} \times \text{Height} \times \frac{\text{Bits per Pixel}}{8}$$

Examples
- 1920×1080 Grayscale: 2.07 MB
- 1920×1080 RGB: 6.22 MB
- 4K RGB: 24.88 MB
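The formula above translates directly into a one-line helper (the function name `image_bytes` is ours; sizes use decimal MB, 1 MB = 10^6 bytes, as in the examples):

```python
def image_bytes(width, height, bits_per_pixel):
    """Uncompressed size in bytes: width * height * bits_per_pixel / 8."""
    return width * height * bits_per_pixel // 8

print(image_bytes(1920, 1080, 8))   # Full HD grayscale -> 2073600 (~2.07 MB)
print(image_bytes(1920, 1080, 24))  # Full HD RGB       -> 6220800 (~6.22 MB)
print(image_bytes(3840, 2160, 24))  # 4K RGB            -> 24883200 (~24.88 MB)
```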
Knowledge Check
Think About It
Calculate the size of a 4K (3840×2160) RGBA image with 8 bits per channel.
Size = 3840 × 2160 × 4 bytes = 33,177,600 bytes ≈ 33.18 MB (uncompressed)
Quality vs Storage Trade-off

When to Use High Bit Depth
- Medical imaging
- Scientific photography
- Professional editing
When to Use Low Bit Depth
- Web images
- Mobile apps
- Real-time video
Summary
Key Takeaways
- Digital images are 2D arrays of discrete pixel values
- Sampling determines spatial resolution (number of pixels)
- Quantization determines intensity levels (bit depth)
- Image types: Binary (1-bit), Grayscale (8-bit), Color (24-bit)
- Color models: RGB (additive) vs HSV (intuitive)
- Memory: Width × Height × Channels × (Bits per channel ÷ 8)
Next: Module 2 — Storage and Compression
End of Module 01
Introduction to Digital Image Processing
Questions?