**E27: Computer Vision**
**ENGR 27/CPSC 72**
**Spring 2024**
**[Matt Zucker](http://mzucker.github.io/swarthmore)**

| Lecture: | Tue/Thu 11:20 AM-12:35 PM, Singer 346 |
|-----------------|--------------------------------------------|
| Office/Lab Hours: | Tue 3:30-4:30, Fri 10:30-12:00, and by appointment, Singer 235 |
| Wizard session: | Sundays 7-9 PM, Singer 221 |
| Discussion forum: | [EdStem](https://edstem.org/us/courses/54877/discussion/) |

# Overview

This class is about applying mathematical theory to endow computers with the ability to understand and interact with the real world through physical imaging sensors. Although there will be plenty of programming, coding is not the main focus. **If you are looking for a class about programming particular APIs (e.g. OpenCV, TensorFlow), you may be disappointed!**

The course is divided into three broad areas of investigation:

* *Appearance-based methods (~4.5 weeks)* including filtering, morphological operators, convolutions, frequency domain methods, edge and feature detection, correlation, and template tracking.
* *3D geometry-based methods (~4.5 weeks)* including multiple view geometry, stereo and structured light, and structure from motion.
* *Probabilistic and learning-based methods (~5 weeks)* such as classification, object recognition, and clustering.

# Resources

* Textbook: [Richard Szeliski, *Computer Vision: Algorithms and Applications, 2nd Edition*, Springer 2010-22.](http://szeliski.org/Book/)
* [OpenCV reference manual](https://docs.opencv.org/master/)
* [NumPy reference manual](https://numpy.org/doc/stable/)
* [Linear algebra refresher](../linalg-reintroduction.pdf)
* [Python/NumPy tutorial from Stanford's CS 231n](https://cs231n.github.io/python-numpy-tutorial/)

# Requirements

**Prerequisites:** Either [ENGR 019](https://catalog.swarthmore.edu/preview_course.php?catoid=29&coid=85281) or [ENGR 021](https://catalog.swarthmore.edu/preview_course.php?catoid=29&coid=93039) or permission of the instructor.
[MATH 027](https://catalog.swarthmore.edu/preview_course.php?catoid=29&coid=85648) or [MATH 028](https://catalog.swarthmore.edu/preview_course.php?catoid=29&coid=85649) is recommended.

**Skills:** In practice, I expect you to be conversant in elementary programming concepts in Python, including basic loops, functions, and array processing. You should also be comfortable with the process of converting a set of mathematical equations into a working Python program. I also expect you to be comfortable with [linear algebra concepts](../linalg-reintroduction.pdf) such as solving linear systems, matrix inverses, rank, and eigenvalues/eigenvectors. We will also be using related geometric concepts such as the dot product and vector norms, as well as rotations and translations.

**Time:** I expect students to spend approximately 8 hours per week on this class (4 classes × 8 hours per class + [8 hours for paid student work](https://www.swarthmore.edu/student-employment/employment-faqs) = 40 hours). Although this figure will vary from individual to individual and week to week, you should plan to commit several hours outside of class to homework, reading, and projects each week.

**Discussion forum:** We will use [this EdStem forum](https://edstem.org/us/courses/54877/discussion/) throughout the semester to communicate course announcements and answer questions. Please use Ed (instead of just emailing me) for all course-related communications -- this allows students to see common problems and to engage in discussions about course material.

**Wizards:** The course Wizards are Quentin Adolphe and Hojune Kim. The class will have a weekly Wizard session on Sundays from 7-9 PM in Singer 221.

# Assignments

Homework consisting of math, short-answer questions, and small programming exercises will be assigned roughly weekly. Typically, homework will be assigned on Tuesday and be due online before the start of class the following Tuesday.
We will also have regular quizzes to be completed each week. Each quiz will generally be based on the homework due the previous week, and potentially course material prior to that as well.

There will be several regularly scheduled projects as well as an open-ended final project. Ideally, projects will be completed by pairs of students, but I will allow a larger group if we have an odd number of students.

Grading will follow approximately the divisions shown below:

* Homeworks: 35%
* Weekly quizzes: 20%
* Regular projects: 35%
* Final project: 10%

Homework is graded on a ✓+, ✓, ✓- scale. A grade of ✓+ indicates no notable errors, ✓ indicates a sound understanding of course material, and ✓- indicates that you should spend more time reviewing the material on the assignment, attend office hours, and/or attend Wizard sessions to shore up your understanding.

Each student may miss one homework or one quiz with no penalty; if a student completes all homeworks or quizzes, I will drop the lowest grade in that category.

Although we will not have a final exam, I may administer one final quiz during the exam period.

## Collaboration and attribution

* Feel free to collaborate with your classmates on homework; however, you must submit your own work. Duplicating others’ assignments verbatim (especially code!) is prohibited.
* If you do discuss homework with your classmates, I expect you to disclose any such collaboration clearly in your submitted work. Err on the side of caution – it’s the best way to avoid awkward conversations about suspicious similarities between assignments.
* Cite any external sources used, including the textbook, internet, discussions with other professors, etc.
* Do not consult ChatGPT or any other generative AI systems to complete any graded work for this course.
* Collaboration or communication with other students about quizzes is prohibited, as is consulting outside resources beyond the textbook, your class notes, and the course website.
* Aside from raising technical and procedural questions on the course discussion forum, do not collaborate on projects with others besides your partner.
* Do not post homework, quiz, or project solutions online. Questions or answers that discuss solutions too closely will be deleted.

Aside from the course-specific policies above, you are expected to understand and abide by the college's [policy on academic misconduct](https://www.swarthmore.edu/student-handbook/academic-policies#academic_misconduct).

## Late policy, extensions, etc.

Homework will generally be assigned on Tuesday, and due at the start of class the following Tuesday. Homework assignments may be turned in up to a week late for half credit. Students get one free late homework turn-in without penalty.

Quizzes will be graded under the same guidelines (half-credit for up to one week late, first late quiz not penalized).

Projects may be turned in up to a week late for half credit. There are no penalty-free late project submissions.

I will be liberal in granting extensions for homeworks and projects *when requested in advance*; however, extensions for quizzes will only be granted for exceptional circumstances. Extensions for assignments are much less likely to be granted retroactively. The more notice and communication I have from you about your academic needs, the more help I can provide!

# Accommodations

If you believe you need accommodations for a disability or a chronic medical condition, please contact Student Disability Services (Parrish 113W) via [email](mailto:studentdisabilityservices@swarthmore.edu) to arrange an appointment to discuss your needs. As appropriate, the Office will issue students with documented disabilities or medical conditions a formal Accommodations Letter. Since accommodations require early planning and are not retroactive, please contact Student Disability Services as soon as possible. For details about the accommodations process, visit the Student Disability Services website.
You are also welcome to contact me privately to discuss your academic needs. However, all disability-related accommodations must be arranged, in advance, through Student Disability Services.

***Even outside the context of accommodations for disabilities, if there is something I can do to facilitate your learning, please do not hesitate to contact me.***

# Software

You will need an up-to-date installation of Python 3 along with OpenCV 4. OpenCV is available through multiple distribution channels. See [this document](install_opencv.html) for detailed installation instructions.

OpenCV is also installed on the CS cluster machines. Make sure to run `python3` from the command line when you want to use it.

# Schedule

!!! Warning
    Please note that topics and dates are subject to change. Check this page frequently for updates, assignments, readings, etc.

Jan 23, 2024: Introduction

Topics:

* Introduction
* Syllabus review
* Image formation

Reading/resources:

* Chapter 1
* Sections 2.1, 2.3
* [Linear algebra basics](../linalg-reintroduction.pdf)
* [Installing OpenCV](install_opencv.html)

Jan 25, 2024: Background subtraction

Topics:

* Image representations
* Thresholding & color segmentation

Reading/resources:

* Sections 3.1, 3.2, 3.3
* [Demo code from today](https://github.com/swatbotics/e27_s24_demo_code_jan25/tree/main)

Assignments:

* [Homework 1](homework1.html)

Jan 30, 2024: Morphological operators and filtering

Topics:

* Morphological operators
* Adaptive thresholding

Reading/resources:

* Section 3.2

Assignments:

* [Homework 2](homework2.html)

Feb 1, 2024: Filtering & image derivatives

Topics:

* Convolution
* Cross-correlation
* Image gradients

Reading/resources:

* Sections 3.2, 7.2
* Filtering Python notebooks
  * PDF: [Part 1](filtering_part_1.pdf), [Part 2](filtering_part_2.pdf)
  * [Source](https://github.swarthmore.edu/e27-spring2024/filtering_notebooks)

Assignments:

* [Project 1](project1.html)

(Feb 1, 2024): P1: Background subtraction

Feb 6, 2024: Edge detection

Topics:

* Canny edge detector

Reading/resources:

* Section 7.2
* [Finite differencing and noise plot and code from today](https://github.swarthmore.edu/gist/mzucker1/795fc1e3eb4997b481a5ba72b48f0c10)

Assignments:

* [Homework 3](homework3.html)

Feb 8, 2024: Frequency domain

Topics:

* Fourier transform

Reading/resources:

* Sections 3.4, 3.5
* [Fourier Python Demo](https://github.swarthmore.edu/e27-spring2024/fourier_demo)

Feb 13, 2024: Frequency domain, cont'd.

Topics:

* Fourier transform

Assignments:

* [Homework 4](homework4.html)

Feb 15, 2024: Feature detection & description

Topics:

* Feature detection
* Feature matching
* Random sample consensus (RANSAC)

Reading/resources:

* Section 7.1
* [FAST: Rosten et al. 2006](../papers/rosten2006fast.pdf)
* [BRIEF: Calonder et al. 2010](../papers/calonder2010brief.pdf)
* [ORB: Rublee et al. 2011](../papers/rublee2011orb.pdf)
* [Whiteboard from today](feature_detectors_and_descriptors.pdf)

Feb 20, 2024: Template matching & template tracking

Topics:

* Squared distances
* Normalized cross-correlation
* Review: ordinary least squares
* Kanade-Lucas-Tomasi tracker

Reading/resources:

* Section 9.1

Assignments:

* [Homework 5](homework5.html)
* [Project 2](project2.html)

(Feb 20, 2024): P2: Pyramid blending & hybrid images

Feb 22, 2024: Projective algebra fundamentals

Topics:

* Points in 2D
* Lines in 2D
* Homogeneous coordinates

Reading/resources:

* Sections 2.1, 3.6.1
* [Application: unprojecting text](https://mzucker.github.io/2016/10/11/unprojecting-text-with-ellipses.html)

Feb 27, 2024: Homographies and homogeneous least squares

Topics:

* Homographies
* Homogeneous least squares

Reading/resources:

* [Digital whiteboard from today](e27-2024-02-27.pdf)

Feb 29, 2024: 3D geometric fundamentals

Topics:

* Homography recap

Reading/resources:

* Section 11.1
* [Homography in-class exercise](homography_exercise.html)

Mar 5, 2024: Geometric image formation

Topics:

* Image formation redux
* Intrinsic & extrinsic parameters

Reading/resources:

* [Digital whiteboard from today](geometric_image_formation.pdf)

Mar 7, 2024: Camera calibration

Topics:

* Triangulation
* Camera calibration
* Algebraic vs geometric error

Reading/resources:

* Section 11.1
* [Fundamental matrix explorer code](https://github.swarthmore.edu/e27-spring2024/fundamental)

Assignments:

* [Homework 6](homework6.html)
* [Project 3](project3.html)

(Mar 7, 2024): P3: Image stitching with homographies

(Mar 12, 2024): Spring break

(Mar 14, 2024): Spring break

Mar 19, 2024: 3D imaging

Topics:

* Stereo vision
* Structured light

Reading/resources:

* Chapter 12
* [Technical description of Kinect calibration](http://wiki.ros.org/kinect_calibration/technical)
* [Lens distortion](https://www.edmundoptics.com/knowledge-center/application-notes/imaging/distortion/)

Assignments:

* [Homework 7](homework7.html)

Mar 21, 2024: Epipolar geometry

Topics:

* Essential matrix
* Fundamental matrix

Reading/resources:

* Section 12.1

Mar 26, 2024: Structure from Motion

Topics:

* Structure from Motion
* Affine cameras
* Singular Value Decomposition (SVD)

Reading/resources:

* Chapter 11

Assignments:

* [Homework 8](homework8.html)

Mar 28, 2024: Affine SFM

Topics:

* Affine SFM

(Mar 28, 2024): P4: Augmented reality gaming

Apr 2, 2024: Face recognition / ML ethics

Topics:

* Face recognition
* Ethics in machine learning

Apr 4, 2024: Machine Learning basics

Topics:

* Binary classification
* Linear classifiers/single-layer perceptron

Reading/resources:

* Section 5.1

Apr 9, 2024: Neural networks

Topics:

* Multi-layer perceptron

Reading/resources:

* Section 5.3
* MNIST dataset
  * [original website](http://yann.lecun.com/exdb/mnist/)
  * [on Wikipedia](https://en.wikipedia.org/wiki/MNIST_database)
* [Neural network MNIST demo](https://github.swarthmore.edu/e27-spring2024/nnet_demo)

Apr 11, 2024: Neural networks cont'd

Topics:

* Backpropagation of error

Reading/resources:

* [Neural networks handout](neural-networks.pdf)

Assignments:

* [Homework 9](homework9.html)

Apr 16, 2024: Misc topics in neural nets / ML

Topics:

* Dataset practicalities
* ML software packages: Keras
* Nearest neighbor classification

Reading/resources:

* Section 5.2
* [Keras demos](https://github.swarthmore.edu/e27-spring2024/keras_demos)

Apr 18, 2024: Unsupervised learning & PCA

Topics:

* Unsupervised learning
* Principal component analysis

Resources:

* [MNIST + PCA + kNN Python demo](https://github.com/swatbotics/mnist_pca_knn/blob/main/mnist_pca_knn.py)

Assignments:

* [Homework 10](homework10.html)

Apr 23, 2024: Clustering & *k*-means

Topics:

* Clustering & *k*-means
* Visual bag of words

Reading/resources:

* [Varma & Zisserman paper](../papers/varma05textons.pdf)
* [Python *k*-means and visual bag of words demos](https://github.swarthmore.edu/e27-spring2024/kmeans_vbow)

Apr 25, 2024: Deep learning

Topics:

* Modern activation functions
* Convolutional networks & weight tying
* Data augmentation
* Dropout

Reading/resources:

* Sections 5.3, 5.4
* [Keras convolutional net MNIST demo](https://github.swarthmore.edu/e27-spring2024/keras_convnet)

Assignments:

* [Homework 11](homework11.html)

(Apr 25, 2024): P5: Image classification

Apr 30, 2024: Application: ResNet & GANs

Topics:

* ResNet
* ResNet application - fruit fly classification
* GANs

Reading/resources:

* Section 5.5
* Datasets: [CIFAR](https://www.cs.toronto.edu/~kriz/cifar.html), [ILSVRC2012](http://image-net.org/challenges/LSVRC/2012/analysis/#bbox_div), [ImageNet](http://www.image-net.org/challenges/LSVRC/index)
* [VGG in TensorFlow](http://www.cs.toronto.edu/~frossard/post/vgg16/)
* [*Deep Residual Learning for Image Recognition*, He et al. 2015](../papers/he2015resnet.pdf)
* GANs: [Goodfellow et al. 2014](https://papers.nips.cc/paper/2014/file/5ca3e9b122f61f8f06494c97b1afccf3-Paper.pdf), [2016](https://arxiv.org/pdf/1606.03498.pdf), [Isola et al. 2017](https://arxiv.org/pdf/1611.07004.pdf)
* [pix2pix online examples](https://affinelayer.com/pixsrv/)

Assignments:

* [Project 4](project4.html)

May 2, 2024: Transfer learning & manifold learning

Topics:

* Transfer learning
* FaceNet
* Feature visualization

Resources:

* [Transfer learning in Keras](https://keras.io/guides/transfer_learning/)
* [Schroff et al. 2015 - FaceNet](https://arxiv.org/abs/1503.03832)
* [Deep dream video](https://vimeo.com/132700334)
* [Olah et al. 2017 - Feature Visualization](https://distill.pub/2017/feature-visualization/)