**E27: Computer Vision**
**ENGR 027/CPSC 072**
**Spring 2020**
**[Matt Zucker](../index.html)**

| Lecture:      | Tue/Thu 1:15-2:30PM, Singer 346               |
|---------------|-----------------------------------------------|
| Office Hours: | Wed 2:30-4:00PM, Fri 10:30AM-noon, Singer 235 |

This class is about applying mathematical theory to endow computers with the ability to understand and interact with the real world through physical imaging sensors. Although there will be plenty of programming, coding is not the main focus. **If you are looking for a class about programming particular APIs (e.g. OpenCV, TensorFlow), you may be disappointed!**

The course is divided into three broad areas of investigation:

* *Appearance based methods* including filtering, morphological operators, convolutions, frequency domain methods, edge and feature detection, correlation, and template tracking.
* *Probabilistic and learning based* methods such as classification, object recognition, and clustering.
* *3D geometry based methods* including multiple view geometry, structure from motion, visual odometry, stereo and structured light, and shape from shading.

# Requirements

**Prerequisites:** Either ENGR 19 or CPSC 35. MATH 27 or 28 is strongly recommended.

**Skills:** In practice, I expect you to understand elementary programming concepts, including basic loops, functions, and array processing. I also expect you to be comfortable with [linear algebra concepts](../linalg-reintroduction.pdf) such as solving linear systems, matrix inverses, rank, and eigenvalues/vectors. We will also be using related geometric concepts such as the dot product and vector norms as well as rotations and translations.

**Time:** I expect students to spend approximately 8 hours per week on this class (4 classes × 8 hours per class + [8 hours for paid student work](https://www.swarthmore.edu/student-employment/employment-faqs) = 40 hours).
Although this figure will vary from individual to individual and week to week, you should plan to commit several hours outside of class to homework, reading, and projects each week.

# Resources

**Textbook**: Richard Szeliski, *Computer Vision: Algorithms and Applications,* Springer 2010-11. [Available free online from the author.](http://szeliski.org/Book/)

**OpenCV Documentation**: We will be using OpenCV version 3, and it can be hard to find Python documentation for this. Here is the [OpenCV 3.0 beta reference manual](https://docs.opencv.org/3.0-beta/modules/refman.html), with Python functions included.

**Piazza**: We will use [this Piazza group](http://piazza.com/swarthmore/spring2020/engr027cpsc072) throughout the semester to communicate course announcements and answer questions. Please use Piazza (instead of just emailing me) for all course-related communications -- this allows students to see common problems and to engage in discussions about course material.

**Wizards**: The class will have a weekly Wizard session to discuss homeworks (primarily) and projects. Details TBA on Piazza.

# Assignments

Homework consisting of math, short answer questions, and small programming exercises will be assigned weekly. There will be several larger projects/labs which are both more open-ended and more programming intensive. These projects and labs are self-scheduled, which means I expect you and your lab partner to find time to complete them on your own. I am happy to give advice about homework and projects during office hours, and I can also meet with students or pairs outside of office hours by appointment.

The course has a midterm exam and a final exam (cumulative, but biased towards the second half of the course).

Grading will follow approximately the divisions shown below:

* Homework: 30%
* Projects/labs: 35%
* Midterm exam: 15%
* Final exam: 15%
* Participation: 5%

Project feedback will be delivered in person.
I will be soliciting your availability to meet around the deadline of the first project.

## Collaboration and attribution

* Feel free to collaborate with your classmates on homework; however, you must submit your own work. Duplicating others’ assignments verbatim (especially code!) is prohibited.
* If you do discuss homework with your classmates, I expect you to disclose any such collaboration clearly in your submitted work. Err on the side of caution – it’s the best way to avoid awkward conversations about suspicious similarities between assignments.
* Cite any external sources used, including the textbook, internet, discussions with other professors, etc.
* Aside from raising technical and procedural questions on the course Piazza, do not collaborate on projects with others outside your group.
* Do not post homework or project solutions on Piazza. Questions or answers that discuss solutions too closely will be deleted.

Aside from the course-specific policies above, you are expected to understand and abide by the college's [policy on academic misconduct](https://www.swarthmore.edu/student-handbook/academic-policies#academic_misconduct).

## Late policy

Homework will generally be assigned on Thursday, and due at the start of class the following Thursday. Homework assignments may be turned in up to a week late for half credit. Students get one free late homework turn-in without penalty.

Late projects which have not been excused in advance may be strongly penalized. I will try to accommodate you in extraordinary circumstances, *especially if you contact me ahead of time*.

# Accommodations

If you believe you need accommodations for a disability or a chronic medical condition, please email Student Disability Services at studentdisabilityservices@swarthmore.edu to arrange an appointment to discuss your needs. As appropriate, the office will issue students with documented disabilities or medical conditions a formal Accommodations Letter.
Since accommodations require early planning and are not retroactive, please contact Student Disability Services as soon as possible. For details about the accommodations process, [visit the Student Disability Services website](http://www.swarthmore.edu/academic-advising-support/welcome-to-student-disability-service). You are also welcome to contact me privately to discuss your academic needs. However, all disability-related accommodations must be arranged, in advance, through Student Disability Services.

# Schedule

The topics below are subject to change. Please check this page regularly for updates.

January 21, 2020: Intro; fundamentals

Topics:

* Introduction
* Linear algebra review
* Image formation
* Image representations
* Homogeneous coordinates

Reading/resources:

* Chapter 1
* Sections 2.1, 2.3
* [Linear algebra basics](../linalg-reintroduction.pdf)
* [Installing OpenCV](install_opencv.html)
* [Rolling shutter video](https://www.youtube.com/watch?v=dNVtMmLlnoE)

Assignments:

* [Homework 1](homework1.pdf)
* [Tutorial code](tutorial.zip)

January 28, 2020: Points and lines

Topics:

* Lines in 2D
* Review: ordinary least squares
* Homographies

Reading/resources:

* Sections 3.1, 3.6.1
* [Application: unprojecting text](https://mzucker.github.io/2016/10/11/unprojecting-text-with-ellipses.html)

Assignments:

* [Homework 2](homework2.pdf)
* [Starter code](hw2_starter.zip)

February 4, 2020: Background subtraction, filtering

Topics:

* Homogeneous least squares
* Thresholding & color segmentation
* Project 1 briefing
* Morphological operators
* Convolution & cross-correlation

Reading/resources:

* Sections 3.2, 3.3, 3.5.1, 3.5.2

Assignments:

* [Homework 3](homework3.pdf)
* Project 1 [assignment](project1.pdf), [starter code](project1_starter.zip)

February 11, 2020: Edge detection

Topics:

* Gradients and edge detection
* Review of weeks 1-3

Reading/resources:

* Sections 4.2, 4.3
* Muddy cards slides: [HTML](muddy-cards-feb11.md.html),
[PDF](muddy-cards-feb11.pdf)
* [Example code](muddy_feb11.zip)

Assignments:

* [Homework 4](homework4.pdf)
* [Starter code](hw4_starter.zip)

February 18, 2020: Filtering, cont'd.; frequency domain

Topics:

* Edge detection
* Fourier transform

Resources:

* [In-class Python demos](fourier.zip)

Assignments:

* [Homework 5](homework5.pdf)
* [Starter code](hw5_starter.zip)

February 20, 2020: ML basics ([Mathieson](https://smathieson.sites.haverford.edu/))

Topics:

* Binary classification
* Linear classifiers/Single-layer perceptron
* Nearest neighbor
* Multi-layer perceptron

Resources:

* [Slides](Mathieson_introML_slides.pdf)
* [Handout](Mathieson_introML_handout.pdf)

February 25, 2020: Neural networks

Topics:

* Intro to neural networks
* Backpropagation of error

Reading/resources:

* [MNIST data set](http://yann.lecun.com/exdb/mnist/)
* [Neural networks handout](neural-networks.pdf)
* [MNIST python demo](nnet_demo.zip)

Assignments:

* [Project 2](project2.pdf)
* [Homework 6](homework6.pdf)
* [xor_nnet.py](xor_nnet.py)

March 3, 2020: Exam review, neural networks cont'd

Topics:

* Data preprocessing
* ML software packages

Reading/resources:

* [`keras_xor.py` - like HW6 but with Keras](keras_xor.py)
* [`mnist_like_its_1998.py` - solving MNIST digit classification](mnist_like_its_1998.py)

March 5, 2020: Midterm exam (in-class)

(March 10, 2020): Spring break

March 17, 2020: Dimensionality reduction (skipped due to COVID-19)

Topics:

* PCA
* Eigenfaces
* Clustering & $k$-means

Reading/resources:

* Section 14.2, 14.4

March 24, 2020: Deep learning

Topics:

* ReLU
* Softmax activation
* General loss functions
* Convolutional networks
* Resnet

Reading/resources:

* [Keras examples from class](https://github.com/mzucker/e27_keras_examples)
* [He et al 2015: Deep residual learning](../papers/he2015resnet.pdf)
* [Keras implementation for CIFAR](https://github.com/keras-team/keras/blob/master/examples/cifar10_resnet.py)
* [Pre-trained resnets in
Keras](https://github.com/keras-team/keras/blob/master/docs/templates/applications.md#resnet)

Assignments:

* [Homework 7](homework7.pdf)
* [hw7.zip](hw7.zip)

March 31, 2020: Deep learning, cont'd

Topics:

* Transfer learning
* Feature visualization
* Generative Adversarial Networks
* Manifold learning

Reading/resources:

* [Feature visualization](https://distill.pub/2017/feature-visualization/)
* [Image-to-image translation demo (e.g. edges2cats)](https://affinelayer.com/pixsrv/)
* Papers (see citations in links above, too):
  * [Goodfellow et al 2014: GANs](https://arxiv.org/abs/1406.2661)
  * [Isola et al 2017: Conditional Adversarial Nets](https://phillipi.github.io/pix2pix/)
  * [Schroff et al 2015: FaceNet](https://arxiv.org/abs/1503.03832)

Assignments:

* [Final project](finalproject.html)

April 7, 2020: 3D geometric fundamentals

Topics:

* Image formation redux
* Camera calibration
* Intrinsic & extrinsic parameters
* Algebraic vs. geometric error

Reading/resources:

* Chapter 6

April 14, 2020: Stereo & multiple view geometry

Topics:

* Stereo
* Epipolar geometry

Reading/resources:

* Chapter 7 up to but not including 7.2.1
* Sections 11-11.3

Assignments:

* [Homework 8](homework8.pdf)
* [`stereo_hw.zip`](stereo_hw.zip)
* [`box3d.zip`](box3d.zip)

April 21, 2020: Keypoints; structured light

Topics:

* Epipolar geometry, cont'd.
* Keypoints
* Feature detection
* Feature matching
* Structured light

Reading/resources:

* [FAST: Rosten et al. 2006](../papers/rosten2006fast.pdf)
* [BRIEF: Calonder et al. 2010](../papers/calonder2010brief.pdf)
* [ORB: Rublee et al. 2011](../papers/rublee2011orb.pdf)
* [ROS - Kinect technical specs](http://wiki.ros.org/kinect_calibration/technical)

April 28, 2020: Structure from motion

Topics:

* Singular value decomposition
* Affine SFM

Reading/resources:

* Chapter 7