**E27: Computer Vision**
**ENGR 27/CPSC 72**
**Spring 2024**
**[Matt Zucker](http://mzucker.github.io/swarthmore)**
| Lecture: | Tue/Thu 11:20 AM-12:35 PM, Singer 346 |
|-----------------|--------------------------------------------|
| Office/Lab Hours: | Tue 3:30-4:30, Fri 10:30-12:00, and by appointment, Singer 235 |
| Wizard session: | Sundays 7-9 PM, Singer 221 |
| Discussion forum: | [EdStem](https://edstem.org/us/courses/54877/discussion/) |
# Overview
This class is about applying mathematical theory to endow computers with the ability to understand and interact with the real world through physical imaging sensors. Although there will be plenty of programming, coding is not the main focus. **If you are looking for a class about programming particular APIs (e.g. OpenCV, TensorFlow), you may be disappointed!**
The course is divided into three broad areas of investigation:
* *Appearance based methods (~4.5 weeks)* including filtering, morphological operators, convolutions, frequency domain methods, edge and feature detection, correlation, and template tracking.
* *3D geometry based methods (~4.5 weeks)* including multiple view geometry, stereo and structured light, and structure from motion.
* *Probabilistic and learning based methods (~5 weeks)* including classification, object recognition, and clustering.
# Resources
* Textbook: [Richard Szeliski, *Computer Vision: Algorithms and Applications, 2nd Edition*, Springer 2010-22.](http://szeliski.org/Book/)
* [OpenCV reference manual](https://docs.opencv.org/master/)
* [NumPy reference manual](https://numpy.org/doc/stable/)
* [Linear algebra refresher](../linalg-reintroduction.pdf)
* [Python/NumPy tutorial from Stanford's CS 231n](https://cs231n.github.io/python-numpy-tutorial/)
# Requirements
**Prerequisites:** Either
[ENGR 019](https://catalog.swarthmore.edu/preview_course.php?catoid=29&coid=85281)
or
[ENGR 021](https://catalog.swarthmore.edu/preview_course.php?catoid=29&coid=93039)
or permission of the
instructor. [MATH 027](https://catalog.swarthmore.edu/preview_course.php?catoid=29&coid=85648)
or
[MATH 028](https://catalog.swarthmore.edu/preview_course.php?catoid=29&coid=85649)
is recommended.
**Skills:** In practice, I expect you to be conversant in elementary
programming concepts in Python, including basic loops, functions, and
array processing. You should also be comfortable with the process of converting
a set of mathematical equations into a working Python program.
I also expect you to be comfortable with
[linear algebra concepts](../linalg-reintroduction.pdf) such as
solving linear systems, matrix inverses, rank and
eigenvalues/vectors. We will also be using related geometric concepts
such as the dot product and vector norms as well as rotations and
translations.
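As a rough illustration of the level of "equations to code" fluency described above, here is a minimal sketch that turns the planar rigid transform x' = R(θ)x + t into Python. It is not taken from the course materials; the function names `rotate_translate` and `norm` are hypothetical and chosen only for this example.

```python
import math

def rotate_translate(point, theta, translation):
    """Apply x' = R(theta) x + t to a 2D point (illustrative helper)."""
    x, y = point
    tx, ty = translation
    c, s = math.cos(theta), math.sin(theta)
    # R(theta) = [[c, -s],
    #             [s,  c]]  (counterclockwise rotation by theta radians)
    return (c * x - s * y + tx, s * x + c * y + ty)

def norm(v):
    """Euclidean norm of a vector given as a sequence of numbers."""
    return math.sqrt(sum(vi * vi for vi in v))

# Rotate (1, 0) by 90 degrees, then translate by (2, 3).
p = rotate_translate((1.0, 0.0), math.pi / 2, (2.0, 3.0))

# A pure rotation (zero translation) should preserve vector length.
q = rotate_translate((3.0, 4.0), 0.7, (0.0, 0.0))
```

Up to floating-point rounding, `p` is (2, 4) and `norm(q)` equals `norm((3.0, 4.0))`, which is 5.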
**Time:** I expect students to spend approximately 8 hours per
week on this class. This figure comes from a 40-hour week:
4 classes × 8 hours per class +
[8 hours for paid student work](https://www.swarthmore.edu/student-employment/employment-faqs)
= 40 hours. Although the actual time will vary from individual to
individual and week to week, you should plan to commit several hours
outside of class to homework, reading, and projects each week.
**Discussion forum**: We will use [this EdStem forum](https://edstem.org/us/courses/54877/discussion/)
throughout the semester to communicate course announcements and answer
questions. Please use Ed (instead of just emailing me) for all course-related communications --
this allows students to see common problems and to engage in
discussions about course material.
**Wizards**: The course Wizards are Quentin Adolphe and Hojune Kim. The
class will have a weekly Wizard session on Sundays from 7-9 PM in Singer 221.
# Assignments
Homework consisting of math, short answer questions, and small
programming exercises will be assigned roughly weekly. Typically,
homework will be assigned on Tuesday, and be due online before the
start of class the following Tuesday.
We will also have regular quizzes to be completed each week. Each
quiz will generally be based on the homework due the previous
week, and potentially course material prior to that as well.
There will be several regularly scheduled projects as well as an
open-ended final project. Ideally, projects will be completed by pairs
of students, but I will allow a larger group if we have an odd number
of students.
Grading will follow approximately the divisions shown below:
* Homeworks: 35%
* Weekly quizzes: 20%
* Regular projects: 35%
* Final project: 10%
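To make the arithmetic behind these divisions concrete, here is a small sketch of a weighted average using the percentages listed above. It assumes each category has already been reduced to a single score between 0 and 1; how ✓+/✓/✓- marks map to numbers is not specified here, and since the divisions are approximate, actual grading may differ. The names `WEIGHTS` and `course_score` are hypothetical.

```python
# Approximate category weights from the list above (they sum to 1.0).
WEIGHTS = {
    "homework": 0.35,
    "quizzes": 0.20,
    "projects": 0.35,
    "final_project": 0.10,
}

def course_score(category_scores):
    """Weighted average of per-category scores, each assumed to lie in [0, 1]."""
    missing = set(WEIGHTS) - set(category_scores)
    if missing:
        raise ValueError(f"missing categories: {sorted(missing)}")
    return sum(WEIGHTS[name] * category_scores[name] for name in WEIGHTS)

# Made-up example scores, purely for illustration.
example = course_score(
    {"homework": 0.90, "quizzes": 0.80, "projects": 0.85, "final_project": 1.00}
)
```

With these made-up inputs the sum is 0.35·0.90 + 0.20·0.80 + 0.35·0.85 + 0.10·1.00 = 0.8725.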
Homework is graded on a ✓+, ✓, ✓- scale. A grade of ✓+ indicates no
notable errors, ✓ indicates a sound understanding of course material,
and ✓- indicates that you should spend more time reviewing the
material on the assignment, attend office hours, and/or attend Wizard
sessions to shore up your understanding.
Each student may miss one homework or one quiz with no penalty; if a
student completes all homeworks or quizzes, I will drop the lowest
grade in that category.
Although we will not have a final exam, I may administer one final
quiz during the exam period.
## Collaboration and attribution
* Feel free to collaborate with your classmates on homework; however,
you must submit your own work. Duplicating others’ assignments
verbatim (especially code!) is prohibited.
* If you do discuss homework with your classmates, I expect you to
disclose any such collaboration clearly in your submitted work. Err
on the side of caution – it’s the best way to avoid awkward
conversations about suspicious similarities between assignments.
* Cite any external sources used, including the textbook, internet,
discussions with other professors, etc.
* Do not consult ChatGPT or any other generative AI systems to
complete any graded work for this course.
* Collaboration or communication with other students about quizzes is
prohibited, as is consulting outside resources beyond the textbook,
your class notes, and the course website.
* Aside from raising technical and procedural questions on the course
discussion forum, do not collaborate on projects with others
besides your partner.
* Do not post homework, quiz, or project solutions online. Questions
or answers that discuss solutions too closely will be deleted.
Aside from the course-specific policies above, you are expected to
understand and abide by the college's
[policy on academic misconduct](https://www.swarthmore.edu/student-handbook/academic-policies#academic_misconduct).
## Late policy, extensions, etc.
Homework will generally be assigned on Tuesday, and due at the start
of class the following Tuesday. Homework assignments may be turned in
up to a week late for half credit. Students get one free late homework
turn-in without penalty.
Quizzes will be graded under the same guidelines (half-credit for up
to one week late, first late quiz not penalized).
Projects may be turned in up to a week late for half credit. There
are no penalty-free late project submissions.
I will be liberal in granting extensions for homeworks and projects
*when requested in advance*; however, extensions for quizzes will only
be granted for exceptional circumstances.
Extensions for assignments are much less likely to be granted
retroactively. The more notice and communication I have from you
about your academic needs, the more help I can provide!
# Accommodations
If you believe you need accommodations for a disability or a chronic
medical condition, please contact Student Disability Services (Parrish
113W) via [email](mailto:studentdisabilityservices@swarthmore.edu) to arrange
an appointment to discuss your needs. As appropriate, the Office will
issue students with documented disabilities or medical conditions a
formal Accommodations Letter. Since accommodations require early
planning and are not retroactive, please contact Student Disability
Services as soon as possible. For details about the accommodations
process, visit the Student Disability Services website. You are also
welcome to contact me privately to discuss your academic
needs. However, all disability-related accommodations must be
arranged, in advance, through Student Disability Services.
***Even outside the context of accommodations for disabilities, if there is something I can do to facilitate your learning, please do not hesitate to contact me.***
# Software
You will need an up-to-date installation of Python 3 along with OpenCV 4. OpenCV is available through multiple distribution channels; see [this document](install_opencv.html) for detailed installation instructions.
OpenCV is also installed on the CS cluster machines. Make sure to run
`python3` from the command line when you want to use it.
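One quick way to confirm that a setup works is to try importing the packages and printing their versions. The sketch below is not part of the official installation instructions; it assumes OpenCV's Python bindings are importable as `cv2` and NumPy as `numpy`, and the helper name `installed_versions` is hypothetical.

```python
import importlib

def installed_versions(module_names=("cv2", "numpy")):
    """Map each module name to its version string, or None if it cannot be imported."""
    versions = {}
    for name in module_names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            versions[name] = None
        else:
            # Fall back to "unknown" for modules without a __version__ attribute.
            versions[name] = getattr(module, "__version__", "unknown")
    return versions

if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version if version else 'not installed'}")
```

Save this to a file and run it with `python3`; if `cv2` is reported as not installed, revisit the installation instructions linked above.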
# Schedule
!!! Warning
Please note that topics and dates are subject to change. Check this page
frequently for updates, assignments, readings, etc.
Jan 23, 2024: Introduction
Topics:
* Introduction
* Syllabus review
* Image formation
Reading/resources:
* Chapter 1
* Sections 2.1, 2.3
* [Linear algebra basics](../linalg-reintroduction.pdf)
* [Installing OpenCV](install_opencv.html)
Jan 25, 2024: Background subtraction
Topics:
* Image representations
* Thresholding & color segmentation
Reading/resources:
* Sections 3.1, 3.2, 3.3
* [Demo code from today](https://github.com/swatbotics/e27_s24_demo_code_jan25/tree/main)
Assignments:
* [Homework 1](homework1.html)
Jan 30, 2024: Morphological operators and filtering
Topics:
* Morphological operators
* Adaptive thresholding
Reading/resources:
* Section 3.2
Assignments:
* [Homework 2](homework2.html)
Feb 1, 2024: Filtering & image derivatives
Topics:
* Convolution
* Cross-correlation
* Image gradients
Reading/resources:
* Sections 3.2, 7.2
* Filtering Python notebooks
* PDF: [Part 1](filtering_part_1.pdf), [Part 2](filtering_part_2.pdf)
* [Source](https://github.swarthmore.edu/e27-spring2024/filtering_notebooks)
Assignments:
* [Project 1](project1.html)
(Feb 1, 2024): P1: Background subtraction
Feb 6, 2024: Edge detection
Topics:
* Canny edge detector
Reading/resources:
* Section 7.2
* [Finite differencing and noise plot and code from today](https://github.swarthmore.edu/gist/mzucker1/795fc1e3eb4997b481a5ba72b48f0c10)
Assignments:
* [Homework 3](homework3.html)
Feb 8, 2024: Frequency domain
Topics:
* Fourier transform
Reading/resources:
* Sections 3.4, 3.5
* [Fourier Python Demo](https://github.swarthmore.edu/e27-spring2024/fourier_demo)
Feb 13, 2024: Frequency domain, cont'd.
Topics:
* Fourier transform
Assignments:
* [Homework 4](homework4.html)
Feb 15, 2024: Feature detection & description
Topics:
* Feature detection
* Feature matching
* Random sample consensus (RANSAC)
Reading/resources:
* Section 7.1
* [FAST: Rosten et al. 2006](../papers/rosten2006fast.pdf)
* [BRIEF: Calonder et al. 2010](../papers/calonder2010brief.pdf)
* [ORB: Rublee et al. 2011](../papers/rublee2011orb.pdf)
* [Whiteboard from today](feature_detectors_and_descriptors.pdf)
Feb 20, 2024: Template matching & template tracking
Topics:
* Squared distances
* Normalized cross-correlation
* Review: ordinary least squares
* Kanade-Lucas-Tomasi tracker
Reading/resources:
* Section 9.1
Assignments:
* [Homework 5](homework5.html)
* [Project 2](project2.html)
(Feb 20, 2024): P2: Pyramid blending & hybrid images
Feb 22, 2024: Projective algebra fundamentals
Topics:
* Points in 2D
* Lines in 2D
* Homogeneous coordinates
Reading/resources:
* Sections 2.1, 3.6.1
* [Application: unprojecting text](https://mzucker.github.io/2016/10/11/unprojecting-text-with-ellipses.html)
Feb 27, 2024: Homographies and homogeneous least squares
Topics:
* Homographies
* Homogeneous least squares
Reading/resources:
* [Digital whiteboard from today](e27-2024-02-27.pdf)
Feb 29, 2024: 3D geometric fundamentals
Topics:
* Homography recap
Reading/resources:
* Section 11.1
* [Homography in-class exercise](homography_exercise.html)
Mar 5, 2024: Geometric image formation
Topics:
* Image formation redux
* Intrinsic & extrinsic parameters
Reading/resources:
* [Digital whiteboard from today](geometric_image_formation.pdf)
Mar 7, 2024: Camera calibration
Topics:
* Triangulation
* Camera calibration
* Algebraic vs geometric error
Reading/resources:
* Section 11.1
* [Fundamental matrix explorer code](https://github.swarthmore.edu/e27-spring2024/fundamental)
Assignments:
* [Homework 6](homework6.html)
* [Project 3](project3.html)
(Mar 7, 2024): P3: Image stitching with homographies
(Mar 12, 2024): Spring break
(Mar 14, 2024): Spring break
Mar 19, 2024: 3D imaging
Topics:
* Stereo vision
* Structured light
Reading/resources:
* Chapter 12
* [Technical description of Kinect calibration](http://wiki.ros.org/kinect_calibration/technical)
* [Lens distortion](https://www.edmundoptics.com/knowledge-center/application-notes/imaging/distortion/)
Assignments:
* [Homework 7](homework7.html)
Mar 21, 2024: Epipolar geometry
Topics:
* Essential matrix
* Fundamental matrix
Reading/resources:
* Section 12.1
Mar 26, 2024: Structure from Motion
Topics:
* Structure from Motion
* Affine cameras
* Singular Value Decomposition (SVD)
Reading/resources:
* Chapter 11
Assignments:
* [Homework 8](homework8.html)
Mar 28, 2024: Affine SFM
Topics:
* Affine SFM
(Mar 28, 2024): P4: Augmented reality gaming
Apr 2, 2024: Face recognition / ML ethics
Topics:
* Face recognition
* Ethics in machine learning
Apr 4, 2024: Machine Learning basics
Topics:
* Binary classification
* Linear classifiers/single-layer perceptron
Reading/resources:
* Section 5.1
Apr 9, 2024: Neural networks
Topics:
* Multi-layer perceptron
Reading/resources:
* Section 5.3
* MNIST dataset
* [original website](http://yann.lecun.com/exdb/mnist/)
* [on Wikipedia](https://en.wikipedia.org/wiki/MNIST_database)
* [Neural network MNIST demo](https://github.swarthmore.edu/e27-spring2024/nnet_demo)
Apr 11, 2024: Neural networks cont'd
Topics:
* Backpropagation of error
Reading/resources:
* [Neural networks handout](neural-networks.pdf)
Assignments:
* [Homework 9](homework9.html)
Apr 16, 2024: Misc topics in neural nets / ML
Topics:
* Dataset practicalities
* ML software packages: Keras
* Nearest neighbor classification
Reading/resources:
* Section 5.2
* [Keras demos](https://github.swarthmore.edu/e27-spring2024/keras_demos)
Apr 18, 2024: Unsupervised learning & PCA
Topics:
* Unsupervised learning
* Principal component analysis
Resources:
* [MNIST + PCA + kNN Python demo](https://github.com/swatbotics/mnist_pca_knn/blob/main/mnist_pca_knn.py)
Assignments:
* [Homework 10](homework10.html)
Apr 23, 2024: Clustering & *k*-means
Topics:
* Clustering & *k*-means
* Visual bag of words
Reading/resources:
* [Varma & Zisserman paper](../papers/varma05textons.pdf)
* [Python *k*-means and visual bag of words demos](https://github.swarthmore.edu/e27-spring2024/kmeans_vbow)
Apr 25, 2024: Deep learning
Topics:
* Modern activation functions
* Convolutional networks & weight tying
* Data augmentation
* Dropout
Reading/resources:
* Sections 5.3, 5.4
* [Keras convolutional net MNIST demo](https://github.swarthmore.edu/e27-spring2024/keras_convnet)
Assignments:
* [Homework 11](homework11.html)
(Apr 25, 2024): P5: Image classification
Apr 30, 2024: Application: ResNet & GANs
Topics:
* ResNet
* ResNet application - fruit fly classification
* GANs
Reading/resources:
* Section 5.5
* Datasets: [CIFAR](https://www.cs.toronto.edu/~kriz/cifar.html), [ILSVRC2012](http://image-net.org/challenges/LSVRC/2012/analysis/#bbox_div), [ImageNet](http://www.image-net.org/challenges/LSVRC/index)
* [VGG in TensorFlow](http://www.cs.toronto.edu/~frossard/post/vgg16/)
* [*Deep Residual Learning for Image Recognition*, He et al. 2015](../papers/he2015resnet.pdf)
* GANs: [Goodfellow et al. 2014](https://papers.nips.cc/paper/2014/file/5ca3e9b122f61f8f06494c97b1afccf3-Paper.pdf), [2016](https://arxiv.org/pdf/1606.03498.pdf), [Isola et al. 2017](https://arxiv.org/pdf/1611.07004.pdf)
* [pix2pix online examples](https://affinelayer.com/pixsrv/)
Assignments:
* [Project 4](project4.html)
May 2, 2024: Transfer learning & manifold learning
Topics:
* Transfer learning
* FaceNet
* Feature visualization
Resources:
* [Transfer learning in Keras](https://keras.io/guides/transfer_learning/)
* [Schroff et al. 2015 - FaceNet](https://arxiv.org/abs/1503.03832)
* [Deep dream video](https://vimeo.com/132700334)
* [Olah et al. 2017 - Feature Visualization](https://distill.pub/2017/feature-visualization/)