Noel Jeffrey Pinton
Department of Computer Science
University of the Philippines Cebu
By the end of this module, you will be able to:
Translation, rotation, and scaling
Translation, rotation, and scaling are the fundamental building blocks of geometric transformations.
$$\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}$$
Rotation by angle θ around the origin
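As a minimal sketch of the rotation formula above, the snippet below builds the 2×2 matrix with NumPy and applies it to one point. The helper name `rotation_matrix` is an illustrative choice, not part of any library. Note that in image coordinates, where y points down, the same matrix appears to rotate in the opposite direction on screen.

```python
import numpy as np

def rotation_matrix(theta):
    """Return the 2x2 matrix that rotates a point by theta radians about the origin."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s],
                     [s,  c]])

# Rotate the point (1, 0) by 90 degrees about the origin.
p = np.array([1.0, 0.0])
p_rot = rotation_matrix(np.pi / 2) @ p  # approximately (0, 1)
```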
Unified representation for all transformations
Homogeneous Coordinates: Add an extra dimension to represent 2D points as 3D vectors.
$$(x, y) \rightarrow (x, y, 1)$$
Enables all transformations (including translation) as matrix multiplication.
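To make this concrete, here is a small sketch showing translation written as a 3×3 matrix acting on a homogeneous point. The function name `translation` and the sample values are assumptions chosen for illustration.

```python
import numpy as np

def translation(tx, ty):
    """Return the 3x3 homogeneous matrix that shifts points by (tx, ty)."""
    return np.array([[1.0, 0.0, tx],
                     [0.0, 1.0, ty],
                     [0.0, 0.0, 1.0]])

# The 2D point (2, 3) becomes the homogeneous vector (2, 3, 1).
p = np.array([2.0, 3.0, 1.0])
p_moved = translation(5.0, -1.0) @ p  # (7, 2, 1)
```

A 2×2 matrix alone cannot do this, because it always maps the origin to itself; the extra coordinate is what carries the offset.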
Why do we use homogeneous coordinates in computer graphics?
Homogeneous coordinates allow all geometric transformations, including translation, to be expressed as matrix multiplication. This enables combining multiple transformations into a single matrix.
Preserving parallel lines
$$\begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix} = \begin{bmatrix} a & b & t_x \\ c & d & t_y \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}$$
Combines rotation, scaling, shearing, and translation
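The sketch below applies an arbitrary affine matrix to two parallel line segments and checks that their images are still parallel. The matrix entries and the helper `apply_affine` are made-up illustrative values, not taken from the slides.

```python
import numpy as np

# An arbitrary affine matrix: linear part (a, b, c, d) plus translation (t_x, t_y).
A = np.array([[1.5, 0.5, 10.0],
              [0.2, 0.8, -4.0],
              [0.0, 0.0,  1.0]])

def apply_affine(A, pts):
    """Apply a 3x3 affine matrix to an (N, 2) array of points."""
    pts = np.asarray(pts, dtype=float)
    homog = np.hstack([pts, np.ones((len(pts), 1))])  # append the constant 1
    return (homog @ A.T)[:, :2]

# Two parallel segments, both with direction (2, 1).
seg1 = apply_affine(A, [[0, 0], [2, 1]])
seg2 = apply_affine(A, [[0, 3], [2, 4]])

# Direction vectors after the transform; a zero 2D cross product means parallel.
d1 = seg1[1] - seg1[0]
d2 = seg2[1] - seg2[0]
cross = d1[0] * d2[1] - d1[1] * d2[0]
```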
From rigid to projective: each level adds more degrees of freedom
$$T_{combined} = T_3 \cdot T_2 \cdot T_1$$
Order matters! Transformations are applied right-to-left: $T_1$ is applied first, then $T_2$, then $T_3$.
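A quick way to see that order matters is to compose a scale and a translation both ways and apply each result to the same point. The helpers `scale` and `translation` below are illustrative names, and the numbers are arbitrary.

```python
import numpy as np

def translation(tx, ty):
    return np.array([[1.0, 0.0, tx],
                     [0.0, 1.0, ty],
                     [0.0, 0.0, 1.0]])

def scale(sx, sy):
    return np.array([[sx, 0.0, 0.0],
                     [0.0, sy, 0.0],
                     [0.0, 0.0, 1.0]])

p = np.array([1.0, 1.0, 1.0])
S = scale(2.0, 2.0)
T = translation(1.0, 0.0)

# Rightmost matrix acts first.
scale_then_translate = (T @ S) @ p  # (1,1) -> (2,2) -> (3,2)
translate_then_scale = (S @ T) @ p  # (1,1) -> (2,1) -> (4,2)
```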
If you want to rotate an image around its center, what sequence of transformations is needed?
1) Translate to move center to origin, 2) Rotate around origin, 3) Translate back to original position. The matrices are multiplied: T_back × R × T_to_origin
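The three steps can be sketched as a single composed matrix. The function name `rotate_about` and the 100×100 image center used here are assumptions for illustration only.

```python
import numpy as np

def translation(tx, ty):
    return np.array([[1.0, 0.0, tx],
                     [0.0, 1.0, ty],
                     [0.0, 0.0, 1.0]])

def rotation(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c,  -s,  0.0],
                     [s,   c,  0.0],
                     [0.0, 0.0, 1.0]])

def rotate_about(cx, cy, theta):
    """T_back @ R @ T_to_origin: rotate by theta radians about the point (cx, cy)."""
    return translation(cx, cy) @ rotation(theta) @ translation(-cx, -cy)

M = rotate_about(50.0, 50.0, np.pi / 2)
center = M @ np.array([50.0, 50.0, 1.0])  # the center stays fixed
other = M @ np.array([60.0, 50.0, 1.0])   # offset (10, 0) becomes (0, 10)
```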
Homography and perspective
Homography (8 DOF):
$$\begin{bmatrix} x' \\ y' \\ w \end{bmatrix} = \begin{bmatrix} h_{11} & h_{12} & h_{13} \\ h_{21} & h_{22} & h_{23} \\ h_{31} & h_{32} & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}$$
Final coordinates: $(x'/w, y'/w)$
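The division by $w$ is the step that separates a homography from an affine map. The sketch below applies a hand-picked matrix `H` (an arbitrary example with a small nonzero $h_{31}$) to two points; `apply_homography` is an illustrative helper, not a library function.

```python
import numpy as np

def apply_homography(H, pts):
    """Map (N, 2) points through a 3x3 homography, including the divide by w."""
    pts = np.asarray(pts, dtype=float)
    homog = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = homog @ H.T                    # rows are (x', y', w)
    return mapped[:, :2] / mapped[:, 2:3]   # final coordinates (x'/w, y'/w)

# Identity plus a small perspective term h31, so w grows with x.
H = np.array([[1.0,   0.0, 0.0],
              [0.0,   1.0, 0.0],
              [0.001, 0.0, 1.0]])

out = apply_homography(H, [[0, 0], [100, 50]])
# For (100, 50): w = 0.001 * 100 + 1 = 1.1, so the point shrinks toward the origin.
```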
Correcting perspective distortion in document scanning and photography
Resampling pixel values
When would you prefer nearest-neighbor over bicubic interpolation?
Nearest-neighbor is preferred for: 1) Real-time applications where speed is critical, 2) Pixel art where you want to preserve sharp edges, 3) Binary/label images where interpolated values are meaningless.
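The point about label images can be illustrated with a tiny sketch. It uses bilinear interpolation as a simpler stand-in for bicubic (the same issue applies to both), and the helpers `sample_nearest` and `sample_bilinear` are hypothetical names written for this example.

```python
import numpy as np

def sample_nearest(img, x, y):
    """Return the value of the pixel closest to the (possibly fractional) location (x, y)."""
    h, w = img.shape[:2]
    xi = min(max(int(round(x)), 0), w - 1)
    yi = min(max(int(round(y)), 0), h - 1)
    return img[yi, xi]

def sample_bilinear(img, x, y):
    """Blend the four pixels surrounding (x, y), weighted by distance."""
    h, w = img.shape[:2]
    x = min(max(x, 0.0), w - 1.0)
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(np.floor(x)), int(np.floor(y))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    fx, fy = x - x0, y - y0
    top = (1 - fx) * img[y0, x0] + fx * img[y0, x1]
    bottom = (1 - fx) * img[y1, x0] + fx * img[y1, x1]
    return (1 - fy) * top + fy * bottom

# A label image with two classes, 0 and 3.
labels = np.array([[0.0, 3.0],
                   [0.0, 3.0]])

near = sample_nearest(labels, 0.4, 0.0)   # stays a valid label
bil = sample_bilinear(labels, 0.4, 0.0)   # blends 0 and 3 into a non-label value
```

Nearest-neighbor returns one of the original labels, while the interpolated value lies between classes and corresponds to no label at all.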
Non-linear transformations for artistic effects and correction
Polar Transform: Convert from Cartesian to polar coordinates.
$$r = \sqrt{x^2 + y^2}, \quad \theta = \operatorname{atan2}(y, x)$$
Useful for analyzing circular patterns and rotational symmetry.
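A minimal sketch of the conversion, assuming the quadrant-aware `np.arctan2` is used for the angle (it agrees with $\arctan(y/x)$ when $x > 0$ and also handles $x \le 0$). The function name `cartesian_to_polar` is an illustrative choice.

```python
import numpy as np

def cartesian_to_polar(x, y):
    """Return (r, theta) with theta in radians, in the range (-pi, pi]."""
    r = np.hypot(x, y)
    theta = np.arctan2(y, x)
    return r, theta

# A point in the second quadrant, where a plain arctan(y/x) would give -pi/4.
r, theta = cartesian_to_polar(-1.0, 1.0)  # r = sqrt(2), theta = 3*pi/4
```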
Aligning images from different sources
Corresponding points are used to estimate the transformation
Medical imaging, satellite imagery, panorama stitching
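One simple way to estimate a transformation from corresponding points is a least-squares fit of an affine matrix. The sketch below assumes an affine model and noise-free synthetic correspondences; `estimate_affine` and `true_A` are hypothetical names, and real registration pipelines usually add outlier rejection on top of this.

```python
import numpy as np

def estimate_affine(src, dst):
    """Least-squares 3x3 affine matrix mapping src points to dst points (both (N, 2), N >= 3)."""
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    X = np.hstack([src, np.ones((len(src), 1))])   # (N, 3) homogeneous source points
    P, *_ = np.linalg.lstsq(X, dst, rcond=None)    # solves X @ P = dst, P is (3, 2)
    A = np.eye(3)
    A[:2, :] = P.T
    return A

# Synthetic ground truth used only to generate matching point pairs.
true_A = np.array([[0.9, -0.2,  5.0],
                   [0.2,  0.9, -3.0],
                   [0.0,  0.0,  1.0]])
src = np.array([[0, 0], [10, 0], [0, 10], [10, 10]], dtype=float)
dst = (np.hstack([src, np.ones((4, 1))]) @ true_A.T)[:, :2]

est = estimate_affine(src, dst)  # recovers true_A up to floating-point error
```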
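import cv2
import numpy as np

# Load the image; cv2.imread returns None if the file cannot be read
img = cv2.imread('image.jpg')
if img is None:
    raise FileNotFoundError('image.jpg could not be read')
rows, cols = img.shape[:2]

# Rotation matrix (center, angle in degrees, scale)
M = cv2.getRotationMatrix2D((cols/2, rows/2), 45, 1.0)
rotated = cv2.warpAffine(img, M, (cols, rows))  # output size is (width, height)

# Perspective transform from four point correspondences (source -> destination)
pts1 = np.float32([[0,0], [300,0], [0,300], [300,300]])
pts2 = np.float32([[0,0], [300,0], [50,300], [250,300]])
H = cv2.getPerspectiveTransform(pts1, pts2)
warped = cv2.warpPerspective(img, H, (cols, rows))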
Key Takeaways
Thank you for your attention!
Next: Module 07 - Feature Extraction
Questions?