CS 180 Fall 2024

Overview

In this project I explored using an image pyramid for image alignment, with the goal of reconstructing a color (RGB) image from separated channel images. I first use skio to pre-process the given image, which contains the 3 channels stacked vertically. I split the input image into 3 parts of equal height and then crop 8% off each side of every part. I then min-max normalize each of the 3 channels to account for cases where the exposure differs across them. I started by aligning the channels directly, using a for loop over candidate offsets (dx, dy), each ranging from -16 to 16 in my case. I used the Normalized Cross-Correlation (NCC) value to evaluate how good each offset is. Compared with Euclidean distance, I found that NCC performs better when the exposure difference between the channels is strong. This worked for the small jpeg images. For the bigger tif inputs, however, I soon realized that I had to use an image pyramid. To solve the issue, I call the align function recursively. The recursion bottoms out once the image has been scaled down to no more than 256 pixels (the base case). The offset found at this coarsest level accounts for the largest movement, and as the recursion returns to the previous, larger layers, the alignment adjustments become more and more fine-grained (detailed).
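The steps above can be sketched roughly as follows. This is a minimal NumPy-only illustration under stated assumptions, not the code used for the project: all function names are hypothetical, the 2x2 block-average downscale and the +/-2 refinement window at the finer levels are choices made for the sketch, and np.roll wraps pixels around the border, which is a simplification.

```python
import numpy as np

def split_and_crop(plate, crop=0.08):
    """Split a stacked plate into 3 equal-height channels, then trim `crop` off every side."""
    h, w = plate.shape[0] // 3, plate.shape[1]
    dy, dx = int(h * crop), int(w * crop)
    return [plate[i * h:(i + 1) * h][dy:h - dy, dx:w - dx] for i in range(3)]

def minmax(ch):
    """Rescale one channel to [0, 1] so exposure differences matter less."""
    ch = ch.astype(np.float64)
    span = ch.max() - ch.min()
    return (ch - ch.min()) / span if span > 0 else np.zeros_like(ch)

def ncc(a, b):
    """Normalized cross-correlation of two equally sized images."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float((a * b).sum() / denom) if denom > 0 else 0.0

def search(moving, ref, center=(0, 0), radius=16):
    """Exhaustive search for the (dy, dx) shift of `moving` that maximizes NCC against `ref`."""
    best, best_score = center, -np.inf
    for dy in range(center[0] - radius, center[0] + radius + 1):
        for dx in range(center[1] - radius, center[1] + radius + 1):
            # Assumption: np.roll wraps around the border instead of padding.
            score = ncc(np.roll(moving, (dy, dx), axis=(0, 1)), ref)
            if score > best_score:
                best, best_score = (dy, dx), score
    return best

def downscale(img):
    """Halve the resolution by averaging 2x2 blocks (assumed stand-in for a rescale call)."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    return img[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def pyramid_align(moving, ref, base=256, radius=16, refine=2):
    """Coarse-to-fine alignment: full search at the base case, small corrections above it."""
    if max(moving.shape) <= base:
        return search(moving, ref, (0, 0), radius)
    dy, dx = pyramid_align(downscale(moving), downscale(ref), base, radius, refine)
    # One pixel at the coarser level corresponds to two pixels at this level.
    return search(moving, ref, (2 * dy, 2 * dx), refine)
```

With blue as the reference channel, a call such as pyramid_align(green, blue) would return the (dy, dx) shift to apply to the green channel with np.roll before stacking the channels into the final image.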
I ran into some issues with parameter tuning. All the files except emir.jpg came out rather neat. I took a closer look at emir and realized that the green-blue alignment is perfect, yet the red-blue alignment is off by a large margin (see end of document). When I tried to fix emir, other images started to misalign, which made me realize that emir may have its own unique issue. After inspecting the input data, I believe the error stems from the extreme difference in exposure among the 3 channels. Since all the other images came out fine, I consider emir an edge case of its own and conclude that the algorithm does work in general for large tif images.