**E27: Computer Vision**
**ENGR 27/CPSC 72**
**Spring 2025**
**[Matt Zucker](http://mzucker.github.io/swarthmore)**
| Lecture: | Tue/Thu 11:20 AM-12:35 PM, Singer 346 |
|-----------------|--------------------------------------------|
| Office/Lab Hours: | Fri 10:30-noon and by appointment |
| Wizard session: | Sun 7-9 PM, Singer 221 |
| Discussion forum: | [EdStem](https://edstem.org/us/courses/74585/discussion/) |
# Overview
This class is about applying mathematical theory to endow computers with the ability to understand and interact with the real world through physical imaging sensors. Although there will be plenty of programming, coding is not the main focus. **If you are looking for a class about programming particular APIs (e.g. OpenCV, TensorFlow), you may be disappointed!**
The course is divided into three broad areas of investigation:
* *Appearance based methods (~4.5 weeks)* including filtering, morphological operators, convolutions, frequency domain methods, edge and feature detection, correlation, and template tracking.
* *3D geometry based methods (~4.5 weeks)* including multiple view geometry, stereo and structured light, and structure from motion.
* *Probabilistic and learning based methods (~5 weeks)* including classification, object recognition, and clustering.
# Resources
* Textbook: [Richard Szeliski, *Computer Vision: Algorithms and Applications, 2nd Edition*, Springer, 2010-22.](http://szeliski.org/Book/)
* [OpenCV Reference manual](https://docs.opencv.org/master/)
* [Numpy Reference manual](https://numpy.org/doc/stable/)
* [Linear algebra refresher](../linalg-reintroduction.pdf)
* [Python/numpy tutorial from Stanford's CS 231n](https://cs231n.github.io/python-numpy-tutorial/)
# Requirements
**Prerequisites:** Either
[ENGR 019](https://catalog.swarthmore.edu/preview_course.php?catoid=30&coid=97706)
or
[ENGR 021](https://catalog.swarthmore.edu/preview_course.php?catoid=30&coid=100069)
or permission of the
instructor. MATH 027
or
MATH 028
is recommended.
**Skills:** In practice, I expect you to be conversant in elementary
programming concepts in Python, including basic loops, functions, and
array processing. You should also be comfortable with the process of converting
a set of mathematical equations into a working Python program.
I also expect you to be comfortable with
[linear algebra concepts](../linalg-reintroduction.pdf) such as
solving linear systems, matrix inverses, rank and
eigenvalues/vectors. We will also be using related geometric concepts
such as the dot product and vector norms as well as rotations and
translations.
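
As a rough illustration of the level of fluency described above (not an official self-test), the following sketch touches each of these concepts using NumPy. The specific matrix, vectors, and rotation angle are made up for the example.

```python
import numpy as np

# Solve the linear system A x = b.
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([3.0, 5.0])
x = np.linalg.solve(A, b)

# Matrix inverse and rank.
A_inv = np.linalg.inv(A)
rank = np.linalg.matrix_rank(A)

# Eigenvalues/vectors; eigh applies here because A is symmetric.
eigvals, eigvecs = np.linalg.eigh(A)

# Dot product and Euclidean vector norm.
u = np.array([3.0, 4.0])
v = np.array([1.0, 0.0])
dot = u @ v
norm = np.linalg.norm(u)

# Rotate a 2D point by theta about the origin, then translate by t.
theta = np.pi / 2
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
t = np.array([1.0, 2.0])
p = np.array([1.0, 0.0])
p_new = R @ p + t
```

If each line here reads as a direct translation of an equation you already know (for example, `R @ p + t` for a rotation followed by a translation), you are likely in good shape on the linear algebra side.
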
**Time:** I expect students to spend approximately 8 hours per
week on this class (4 classes × 8 hours per class +
[8 hours for paid student work](https://www.swarthmore.edu/student-employment/employment-faqs)
= 40 hours). Although this figure will vary from individual to
individual and week to week, you should plan to commit several hours
outside of class to homework, reading, and projects each week.
**Discussion forum**: We will use [this EdStem forum](https://edstem.org/us/courses/74585/discussion/)
throughout the semester to communicate course announcements and answer
questions. Please use Ed (instead of just emailing me) for all course-related communications --
this allows students to see common problems and to engage in
discussions about course material.
**Wizards**: The course Wizards are Quentin Adolphe and Hojune Kim. The
class will have a weekly Wizard session on Sundays from 7-9 PM in Singer 221.
# Assignments
Homework consisting of math, short answer questions, and small
programming exercises will be assigned roughly weekly. Typically,
homework will be assigned on Tuesday, and be due online before the
start of class the following Tuesday.
We will also have regular quizzes to be completed each week. Each
quiz will generally be based on the homework due the previous
week, and potentially course material prior to that as well.
There will be several regularly scheduled projects as well as an
open-ended final project. Ideally, projects will be completed by pairs
of students, but I will allow a larger group if we have an odd number
of students.
Grading will follow approximately the divisions shown below:
* Homeworks: 35%
* Weekly quizzes: 20%
* Regular projects: 35%
* Final project: 10%
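
As a minimal sketch of how these approximate weights combine, assume each category score has already been converted to a fraction between 0 and 1. The conversion from individual marks to such a fraction is not specified here, and the function name and example scores below are hypothetical.

```python
# Approximate category weights from the list above.
WEIGHTS = {
    "homework": 0.35,
    "quizzes": 0.20,
    "projects": 0.35,
    "final_project": 0.10,
}


def weighted_score(scores):
    """Combine per-category scores (fractions in [0, 1]) into one weighted total."""
    if set(scores) != set(WEIGHTS):
        raise ValueError("scores must have exactly the categories in WEIGHTS")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical student: strong final project, weaker quizzes.
example = {
    "homework": 0.90,
    "quizzes": 0.80,
    "projects": 0.85,
    "final_project": 1.00,
}
total = weighted_score(example)
print(f"Weighted total: {total:.4f}")
```

Because homework and regular projects together carry 70% of the weight, consistent effort on those two categories matters far more than any single quiz.
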
Homework is graded on a ✓+, ✓, ✓- scale. A grade of ✓+ indicates no
notable errors, ✓ indicates a sound understanding of course material,
and ✓- indicates that you should spend more time reviewing the
material on the assignment, attend office hours, and/or attend Wizard
sessions to shore up your understanding.
Each student may miss one homework or one quiz with no penalty; if a
student completes all homeworks or quizzes, I will drop the lowest
grade in that category.
Although we will not have a final exam, I may administer one final
quiz during the exam period.
## Collaboration and attribution
* Feel free to collaborate with your classmates on homework; however,
you must submit your own work. Duplicating others’ assignments
verbatim (especially code!) is prohibited.
* If you do discuss homework with your classmates, I expect you to
disclose any such collaboration clearly in your submitted work. Err
on the side of caution – it’s the best way to avoid awkward
conversations about suspicious similarities between assignments.
* Cite any external sources used, including the textbook, internet,
discussions with other professors, etc.
* Do not consult ChatGPT or any other generative AI systems to
complete any graded work for this course.
* Collaboration or communication with other students about quizzes is
prohibited, as is consulting outside resources beyond the textbook,
your class notes, and the course website.
* Aside from raising technical and procedural questions on the course
discussion forum, do not collaborate on projects with others
besides your partner.
* Do not post homework, quiz, or project solutions online. Questions
or answers that discuss solutions too closely will be deleted.
Aside from the course-specific policies above, you are expected to
understand and abide by the college's
[policy on academic misconduct](https://www.swarthmore.edu/student-handbook/academic-policies#academic_misconduct).
## Late policy, extensions, etc.
Homework will generally be assigned on Tuesday, and due at the start
of class the following Tuesday. Homework assignments may be turned in
up to a week late for half credit. Students get one free late homework
turn-in without penalty.
Quizzes will be graded under the same guidelines (half-credit for up
to one week late, first late quiz not penalized).
Projects may be turned in up to a week late for half credit. There
are no penalty-free late project submissions.
I will be liberal in granting extensions for homeworks and projects
*when requested in advance*; however, extensions for quizzes will only
be granted for exceptional circumstances.
Extensions for assignments are much less likely to be granted
retroactively. The more notice and communication I have from you
about your academic needs, the more help I can provide!
# Accommodations
If you believe you need accommodations for a disability or a chronic
medical condition, please contact Student Disability Services (Parrish
113W) via [email](mailto:studentdisabilityservices@swarthmore.edu) to arrange
an appointment to discuss your needs. As appropriate, the Office will
issue students with documented disabilities or medical conditions a
formal Accommodations Letter. Since accommodations require early
planning and are not retroactive, please contact Student Disability
Services as soon as possible. For details about the accommodations
process, visit the Student Disability Services website. You are also
welcome to contact me privately to discuss your academic
needs. However, all disability-related accommodations must be
arranged, in advance, through Student Disability Services.
***Even outside the context of accommodations for disabilities, if there is something I can do to facilitate your learning, please do not hesitate to contact me.***
# Software
You will need an up-to-date installation of Python 3 along with OpenCV 4. It's available through multiple distribution channels. See [this document](install_opencv.html) for detailed installation instructions.
OpenCV is also installed on the CS cluster machines.
# Schedule
!!! Warning
Please note that topics and dates are subject to change. Check this page
frequently for updates, assignments, readings, etc.
Jan 21, 2025: Introduction
Topics:
* Introduction
* Syllabus review
* Image formation
Reading/resources:
* Chapter 1
* Sections 2.1, 2.3
* [Linear algebra basics](../linalg-reintroduction.pdf)
* [Installing OpenCV](install_opencv.html)
Jan 23, 2025: Background subtraction
Topics:
* Image representations
* Thresholding & color segmentation
Reading/resources:
* Sections 3.1, 3.2, 3.3
Jan 28, 2025: Morphological operators and filtering
Topics:
* Morphological operators
* Adaptive thresholding
Reading/resources:
* Section 3.2
Jan 30, 2025: Filtering & image derivatives
Topics:
* Convolution
* Cross-correlation
* Image gradients
Reading/resources:
* Sections 3.2, 7.2
Assignments:
* [Project 1](project1.html)
(Jan 30, 2025): P1: Background subtraction
Feb 4, 2025: Edge detection
Topics:
* Canny edge detector
Reading/resources:
* Section 7.2
Feb 6, 2025: Frequency domain
Topics:
* Fourier transform
Reading/resources:
* Sections 3.4, 3.5
Feb 11, 2025: Frequency domain, cont'd.
Topics:
* Fourier transform
Feb 13, 2025: Feature detection & description
Topics:
* Feature detection
* Feature matching
* Random sample consensus (RANSAC)
Reading/resources:
* Section 7.1
* [FAST: Rosten et al. 2006](../papers/rosten2006fast.pdf)
* [BRIEF: Calonder et al. 2010](../papers/calonder2010brief.pdf)
* [ORB: Rublee et al. 2011](../papers/rublee2011orb.pdf)
* [Whiteboard from today](feature_detectors_and_descriptors.pdf)
Feb 18, 2025: Template matching & template tracking
Topics:
* Squared distances
* Normalized cross-correlation
* Review: ordinary least squares
* Kanade-Lucas-Tomasi tracker
Reading/resources:
* Section 9.1
(Feb 18, 2025): P2: Pyramid blending & hybrid images
Feb 20, 2025: Projective algebra fundamentals
Topics:
* Points in 2D
* Lines in 2D
* Homogeneous coordinates
Reading/resources:
* Sections 2.1, 3.6.1
* [Application: unprojecting text](https://mzucker.github.io/2016/10/11/unprojecting-text-with-ellipses.html)
Feb 25, 2025: Homographies and homogeneous least squares
Topics:
* Homographies
* Homogeneous least squares
Feb 27, 2025: 3D geometric fundamentals
Topics:
* Homography recap
Reading/resources:
* Section 11.1
Mar 4, 2025: Geometric image formation
Topics:
* Image formation redux
* Intrinsic & extrinsic parameters
Mar 6, 2025: Camera calibration
Topics:
* Triangulation
* Camera calibration
* Algebraic vs geometric error
Reading/resources:
* Section 11.1
(Mar 6, 2025): P3: Image stitching with homographies
(Mar 11, 2025): Spring break
(Mar 13, 2025): Spring break
Mar 18, 2025: 3D imaging
Topics:
* Stereo vision
* Structured light
Reading/resources:
* Chapter 12
* [Technical description of Kinect calibration](http://wiki.ros.org/kinect_calibration/technical)
* [Lens distortion](https://www.edmundoptics.com/knowledge-center/application-notes/imaging/distortion/)
Mar 20, 2025: Epipolar geometry
Topics:
* Essential matrix
* Fundamental matrix
Reading/resources:
* Section 12.1
Mar 25, 2025: Structure from Motion
Topics:
* Structure from Motion
* Affine cameras
* Singular Value Decomposition (SVD)
Reading/resources:
* Chapter 11
Mar 27, 2025: Affine SFM
Topics:
* Affine SFM
Apr 1, 2025: Face recognition / ML ethics
Topics:
* Face recognition
* Ethics in machine learning
Apr 3, 2025: Machine Learning basics
Topics:
* Binary classification
* Linear classifiers/single-layer perceptron
Reading/resources:
* Section 5.1
Apr 8, 2025: Neural networks
Topics:
* Multi-layer perceptron
Reading/resources:
* Section 5.3
* MNIST dataset
* [original website](http://yann.lecun.com/exdb/mnist/)
* [on Wikipedia](https://en.wikipedia.org/wiki/MNIST_database)
* [Neural network MNIST demo](https://github.swarthmore.edu/e27-spring2025/nnet_demo)
Apr 10, 2025: Neural networks cont'd
Topics:
* Backpropagation of error
Apr 15, 2025: Misc topics in neural nets / ML
Topics:
* Dataset practicalities
* ML software packages: Keras
* Nearest neighbor classification
Reading/resources:
* Section 5.2
Apr 17, 2025: Unsupervised learning & PCA
Topics:
* Unsupervised learning
* Principal component analysis
Apr 22, 2025: Clustering & *k*-means
Topics:
* Clustering & *k*-means
* Visual bag of words
Reading/resources:
* [Varma & Zisserman paper](../papers/varma05textons.pdf)
Apr 24, 2025: Deep learning
Topics:
* Modern activation functions
* Convolutional networks & weight tying
* Data augmentation
* Dropout
Reading/resources:
* Sections 5.3, 5.4
Apr 29, 2025: Application: ResNet & GANs
Topics:
* ResNet
* ResNet application - fruit fly classification
* GANs
Reading/resources:
* Section 5.5
* Datasets: [CIFAR](https://www.cs.toronto.edu/~kriz/cifar.html), [ILSVRC2012](http://image-net.org/challenges/LSVRC/2012/analysis/#bbox_div), [ImageNet](http://www.image-net.org/challenges/LSVRC/index)
* [VGG in TensorFlow](http://www.cs.toronto.edu/~frossard/post/vgg16/)
* [*Deep Residual Learning for Image Recognition*, He et al. 2015](../papers/he2015resnet.pdf)
* GANs: [Goodfellow et al. 2014](https://papers.nips.cc/paper/2014/file/5ca3e9b122f61f8f06494c97b1afccf3-Paper.pdf), [2016](https://arxiv.org/pdf/1606.03498.pdf), [Isola et al. 2017](https://arxiv.org/pdf/1611.07004.pdf)
* [pix2pix online examples](https://affinelayer.com/pixsrv/)
May 1, 2025: Transfer learning & manifold learning
Topics:
* Transfer learning
* FaceNet
* Feature visualization
Reading/resources:
* [Transfer learning in Keras](https://keras.io/guides/transfer_learning/)
* [Schroff et al. 2015 - FaceNet](https://arxiv.org/abs/1503.03832)
* [Deep dream video](https://vimeo.com/132700334)
* [Olah et al. 2017 - Feature Visualization](https://distill.pub/2017/feature-visualization/)
(May 2, 2025): Classes end