**Project 4: Image Stitching with Homographies**

**Due 5/3**

**[ENGR 27 Spring 2021](index.html#schedule1_2021-4-6)**

You will find a personalized **`project4`** repository on the [Swarthmore github](https://github.swarthmore.edu/e27-spring2021). Commit and push your code and writeup by 11:59:59 PM on Monday, May 3.

# Overview

In this lab, you will obtain two or more separate images of a planar (or near-planar) scene and stitch them together by finding the perspective transformation(s) that map corresponding points to each other. Here are example point correspondences from a pair of images of the whiteboard from my pre-pandemic ENGR 052 class, and the resulting mosaic after stitching the images together:

![](images/left_points.png width="80%")

![](images/right_points.png width="80%")

![](images/e52.jpg width="90%")

(Note that this is actually a challenging pair of images because the amount of overlapping content is very small -- it's much better to use images that have more like 25-50% overlap!)

# Getting started

In your **`project4`** repository, you will find a small assortment of starter code as well as a couple of pairs of images of planar objects suitable for stitching.

Before you begin coding, read over and run the **`transrot.py`** and **`t_homog.py`** files. These demonstrate some important aspects of manipulating images and using transformations of the plane in OpenCV. You can run **`t_homog.py`** without any command-line arguments:

~~~ none
python t_homog.py
~~~

For **`transrot.py`**, you will need to specify rotations and translations from the command line, for example:

~~~ none
python transrot.py data/phoenix.jpg 30 20 10 data/swat_logo.png 140 360 -10
~~~

It's a good idea to make sure you fully understand the code in both of these example programs as you begin to work on your project.

Finally, you will use the **`pick_points.py`** script to mark points in image files. You can run it on an image file as follows:

~~~ none
python pick_points.py data/weird1.jpg
~~~

It displays a help screen when it starts. You can re-display the help screen by hitting the `?` key at any time. Note that you only need to use the point-picking program -- there is no need for you to fully understand the code it contains.

# Image stitching

Your job is to update the **`stitcher.py`** file in your github repository to emit an **`output.jpg`** image that combines the two input images you specify on the command line. Here are the steps you will want to follow:

1. Identify a set of four or more (hint: more is better) corresponding points in two images and save them to text files with one `x, y` point per line. You can use the **`pick_points.py`** program from the starter code to do this. Point data for the **`weird1.jpg`** and **`weird2.jpg`** images is included with your starter code, so you might want to start with that dataset!

    Each identically numbered point in each image should correspond to the same physical point in the world. For example, for the **`weird1.jpg`** and **`weird2.jpg`** images and tagged points included with the starter code, point #1 in both images corresponds to the top-left of the book.

    The text filename is obtained by replacing the image file extension (e.g. `'.png'` or `'.jpg'`) with `'.txt'`. So the text file corresponding to **`data/weird1.jpg`** is **`data/weird1.txt`**.

2. Find the homography $\mathbf{H}$ that best transforms the set of points from image $A$ into the set from image $B$.
    To do this, use the `cv2.findHomography` function (documentation [here](https://docs.opencv.org/master/d9/d0c/group__calib3d.html#ga4abc2ece9fab9398f2e560d53c8c9780), example code [here](https://docs.opencv.org/master/d1/de0/tutorial_py_feature_homography.html)). Note that `cv2.findHomography` expects you to provide points in arrays of shape `(n, 1, 2)` with data type `numpy.float32`. I suggest you omit the optional arguments and pass in only the first two. Finally, note that the function returns a pair containing the desired homography as well as a mask. You can safely ignore the mask, provided you are picking the point correspondences by hand (it can be useful for machine-created point correspondences). A sketch showing how this and the remaining steps fit together appears after this list.

3. Now you will potentially need to enlarge and translate the viewport to fit the images together in one composition. Let's assume you are warping image $A$ into the frame of image $B$, as shown below:

    ![](images/homography-scaling.png width="75%")

    You will need to create an array of eight points: the four unmodified corner points of image $B$, and the four corner points of image $A$ mapped through your homography $\mathbf{H}$ by the `cv2.perspectiveTransform` function. Use the [`cv2.boundingRect`](https://docs.opencv.org/master/d3/dc0/group__imgproc__shape.html#ga103fcbda2f540f3ef1c042d6a9b35ac7) function to obtain the top-left point $(x_0, y_0)$ and the viewport size $(w_C, h_C)$.

4. Define the homography $\mathbf{M} = \mathbf{T} \mathbf{H}$, where $\mathbf{T}$ is the translation

    $$ \mathbf{T} = \left[\begin{array}{ccc} 1 & 0 & -x_0 \\ 0 & 1 & -y_0 \\ 0 & 0 & 1 \end{array}\right] $$

    Make sure to use the matrix product and not the element-wise product to compose these two transformations! Use [`numpy.matmul`](https://numpy.org/doc/stable/reference/generated/numpy.matmul.html) or the `@` operator to do matrix multiplication instead of the element-wise multiplication provided by the `*` operator.

5. Use the [`cv2.warpPerspective`](https://docs.opencv.org/master/da/d54/group__imgproc__transform.html#gaf73673a7e8e18ec6963e3774e6a94b87) function to prepare two images of size $(w_C, h_C)$ to be superimposed. You will warp image $A$ with matrix $\mathbf{M}$ and warp image $B$ with matrix $\mathbf{T}$ to obtain two separate warped images.

6. Combine the warped images using simple averaging so the quality of the image stitching is readily apparent. Assuming `warpA` and `warpB` hold the respective outputs of `cv2.warpPerspective`, you can obtain an output image with the code

    ~~~ Python
    finalImage = warpA//2 + warpB//2
    ~~~

    This will result in reduced brightness where the images don't overlap. If you want, you can optionally use masks to find the overlapping areas and display the average there, and simply copy from one image or the other in non-overlapping areas (this is similar to what my code does to generate the example images in this document). But please make sure you do a 50-50 average where the images do overlap, so I can see the quality of your image alignment!

7. Use [`cv2.imwrite`](https://docs.opencv.org/master/d4/da8/group__imgcodecs.html#gabbc7ef1aa2edfaa87772f1202d67e0ce) to save your merged image to a file named **`output.jpg`**.

If you run into any problems coding this up, make sure you have a clear understanding of what the example code is doing! There are lots of easy pitfalls to avoid if you read the code and comments.
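Here is a minimal end-to-end sketch of the steps above. It assumes the two image filenames come straight from the command line and that the point files contain one comma-separated `x, y` pair per line, as described in step 1 -- your own `stitcher.py` is free to be organized quite differently:

~~~ Python
import sys

import cv2
import numpy as np

def load_points(image_filename):
    # Swap the image extension for .txt (e.g. data/weird1.jpg ->
    # data/weird1.txt) and load that file's points, reshaped to the
    # (n, 1, 2) float32 array that cv2.findHomography expects.
    txt_filename = image_filename.rsplit('.', 1)[0] + '.txt'
    points = np.genfromtxt(txt_filename, delimiter=',', dtype=np.float32)
    return points.reshape(-1, 1, 2)

def corners(image):
    # The four corner points of an image, shaped for perspectiveTransform.
    h, w = image.shape[:2]
    return np.array([[[0, 0]], [[w, 0]], [[w, h]], [[0, h]]],
                    dtype=np.float32)

def main():
    # Image A will be warped into the frame of image B.
    filename_a, filename_b = sys.argv[1], sys.argv[2]
    image_a = cv2.imread(filename_a)
    image_b = cv2.imread(filename_b)

    # Step 2: fit the homography mapping points in A to points in B,
    # ignoring the mask since these points were picked by hand.
    H, _ = cv2.findHomography(load_points(filename_a),
                              load_points(filename_b))

    # Step 3: bound the corners of B together with the corners of A
    # mapped through H to get the output viewport.
    all_corners = np.vstack((corners(image_b),
                             cv2.perspectiveTransform(corners(image_a), H)))
    x0, y0, wc, hc = cv2.boundingRect(all_corners)

    # Step 4: compose H with the translation taking (x0, y0) to the
    # origin -- note @ (matrix product), not * (element-wise product).
    T = np.array([[1, 0, -x0],
                  [0, 1, -y0],
                  [0, 0,  1]], dtype=np.float64)
    M = T @ H

    # Step 5: warp both images into the (wc, hc) viewport.
    warpA = cv2.warpPerspective(image_a, M, (wc, hc))
    warpB = cv2.warpPerspective(image_b, T, (wc, hc))

    # Steps 6-7: 50-50 average, then write out the result.
    finalImage = warpA//2 + warpB//2
    cv2.imwrite('output.jpg', finalImage)

if __name__ == '__main__':
    main()
~~~

With this structure, running `python stitcher.py data/weird1.jpg data/weird2.jpg` should drop an **`output.jpg`** in the current directory.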
## Troubleshooting/tips

* Since four points exactly determine a homography, try starting with a minimal set of four points and only adding/refining points after you have a rough alignment. You should be able to get a coarse alignment by running on the existing **`weird1.jpg`** and **`weird2.jpg`** data.

* Only pick points that are easy to locate in multiple images. Corners and intersections of lines are good. Points inside uniformly-colored areas are bad.

* If your images are aligning OK in some places but not others, try adding more points and/or refining your existing points. Here is my output for the data that comes with the starter code:

  ![](images/weird_mediocre.jpg width="50%")

  The ghosting or "echoes" around the alien face and the ladder indicate the alignment is not very good and could benefit from adding and improving point correspondences.

* If your images are aligning horribly or you are getting bizarre warps in your output, make sure your points are numbered consistently across both input images. Here's the glitchy output that results if I shuffle the order of the points for **`weird1.jpg`**:

  ![](images/weird_shuffle.jpg width="50%")

  Remember, a homography can turn the unit square into an arbitrary quadrilateral, including the "unit bowtie", which accounts for the odd warp above.

**************************
*                        *
*   *----*       *--*    *
*   |    |  -->   \/     *
*   |    |   H    /\     *
*   *----*       *--*    *
*                        *
**************************

## Generate your own dataset(s)

Once you get your code working on my data, snap your own photos and generate point correspondences for them by tagging points. Your final submission should contain at least two datasets of at least two images each that result in high-quality output (i.e. very little obvious ghosting). It's OK if one of the datasets is from the images distributed with the starter code, but you should make sure to submit at least one dataset generated entirely by your group.

I suggest starting with some familiar planar objects. Whiteboards, chalkboards, posters, walls, and carpets with interesting patterns are all reasonable candidates. Homographies will *not* work well to stitch together images of non-planar objects on top of planar objects. For example, if you have two images of a chessboard from two different angles, you could align the board itself in the two images, but the chess pieces will clearly appear distorted!

The other scenario where homographies can stitch images is image pairs related by pure camera rotation. Pure rotation is also well-approximated when photographing multiple views of a faraway vista from a single vantage point -- for example, two overlapping views taken out the window or over the balcony of a tall building.

# Going further

When you have successfully mastered image stitching on multiple datasets, including your own, go one step further. Here are some possible ideas to explore:

* Make the program work with more than two input images, and provide a dataset with three or more images to demonstrate this.

* Find OpenCV code on the web and adapt it into your program to automatically find point correspondences between two images. I suggest you start with the tutorials [here](https://docs.opencv.org/master/db/d27/tutorial_py_table_of_contents_feature2d.html). (A rough sketch of this idea appears after this list.)

* Present and discuss a proof that any two images of a planar object can be related by a homography. You can probably find a derivation online, or you can try to prove it yourself. (I'm happy to help sketch out this proof in office hours once we discuss intrinsic and extrinsic parameters this week...)

* Or email me if you want to discuss your own idea.
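If you pursue the automatic-correspondence idea, here is a minimal sketch of one way to get started, in the spirit of the tutorials linked above (the helper name and the `max_matches` cutoff here are arbitrary choices, and ORB is just one of several feature detectors OpenCV provides):

~~~ Python
import cv2
import numpy as np

def auto_correspondences(image_a, image_b, max_matches=50):
    # Detect ORB keypoints in both images and compute their descriptors.
    orb = cv2.ORB_create()
    keypoints_a, descriptors_a = orb.detectAndCompute(image_a, None)
    keypoints_b, descriptors_b = orb.detectAndCompute(image_b, None)

    # Brute-force Hamming matching is the usual pairing for ORB's binary
    # descriptors; crossCheck keeps only mutual best matches. Sort by
    # descriptor distance and keep the strongest few.
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = sorted(matcher.match(descriptors_a, descriptors_b),
                     key=lambda m: m.distance)[:max_matches]

    # Convert the matches into the same (n, 1, 2) float32 arrays the
    # hand-picked points use, so the rest of the stitcher is unchanged.
    points_a = np.float32([keypoints_a[m.queryIdx].pt for m in matches])
    points_b = np.float32([keypoints_b[m.trainIdx].pt for m in matches])
    return points_a.reshape(-1, 1, 2), points_b.reshape(-1, 1, 2)
~~~

Machine-generated matches inevitably include some outliers, so this is exactly the situation where you would stop omitting the optional arguments to `cv2.findHomography`: passing `cv2.RANSAC` along with a reprojection threshold (e.g. `cv2.findHomography(pointsA, pointsB, cv2.RANSAC, 5.0)`) lets the fit reject bad matches, and the returned mask tells you which correspondences survived.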
Whatever you do for this, please make sure that the **`stitcher.py`** program you submit works as originally intended -- rather than modifying its functionality substantially, please just check in a new Python script side-by-side with it. Don't forget to document your code so I know how to run it, too!

# What to turn in

Along with the source code for your **`stitcher.py`** and any other code you write, please also submit two input datasets (images and points) that result in high-quality output. Additionally, please submit a 2-4 page PDF report which addresses the following points/questions:

* Who did what in your project?

* Describe your process for developing the program and how you alternated between editing code and generating data.

* What difficulties did you encounter along the way, and how did you overcome them?

* Include images of your program outputs for at least two datasets in your PDF.

* Describe the work you did for the "going further" aspect of the project, including telling me how to use any additional programs you wrote. Include program output images in your writeup if appropriate.

Since we are not dealing with large ML datasets, it's no problem to include your input images in the github repo -- no need for a separate Google Drive link.