**Colorizing the Prokudin-Gorskii Photo Collection**

Student name: Abhishek Pavani

(#) Project Overview

In this project, we are required to develop a method to colorize the Prokudin-Gorskii Photo Collection, a series of photographs of the Russian Empire taken by Sergei Mikhailovich Prokudin-Gorskii in the early 20th century, each captured as separate single-channel exposures. Each input is a single tall image containing all three color channels stacked vertically: blue at the top, green in the middle, and red at the bottom. The goal of this project is to use image processing techniques to align and stack the channels correctly to produce a color image. We also use image pyramids to reduce the computation time, and post-processing steps such as auto white balance, color contrast, and auto crop to enhance the quality of the aligned images.

(#) Approach

(##) Image Alignment : Single Scale Implementation

The dataset included one lower-resolution image (the cathedral). So first, I implemented two metrics for computing image similarity: sum of squared differences (SSD) and normalized cross-correlation (NCC). The formulae for NCC and SSD are given below, where $f$ is the channel being shifted by $(u, v)$, $g$ is the reference channel, and $\overline{f}$ and $\overline{g}$ are their mean intensities:

$$\mathrm{NCC}(u, v) = \frac{\sum_{(x',y')} \left[ f(x'+u, y'+v) - \overline{f} \right] \left[ g(x', y') - \overline{g} \right]}{\sqrt{\sum_{(x',y')} \left[ f(x'+u, y'+v) - \overline{f} \right]^2} \, \sqrt{\sum_{(x',y')} \left[ g(x', y') - \overline{g} \right]^2}}$$

$$\mathrm{SSD}(u, v) = \sum_{(x',y')} \left[ f(x'+u, y'+v) - g(x', y') \right]^2$$

Since the resolution of the channels was small for the cathedral image, it was fast enough to use a pair of nested for loops to shift the R and G channels and compute the similarity score against the B channel. With the same single-scale implementation on the larger .tiff images, however, the code takes substantially longer (5 minutes for a window size of (15,15) pixels).
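The exhaustive single-scale search described above can be sketched as follows. This is a minimal illustration, not the code used in the project: the function names, the interpretation of the window as shifts of `±radius` pixels, and the use of `np.roll` (a circular shift) are all assumptions made for the example.

```python
import numpy as np

def ssd(f, g):
    """Sum of squared differences between two equally sized images (lower is better)."""
    f = np.asarray(f, dtype=np.float64)
    g = np.asarray(g, dtype=np.float64)
    return float(np.sum((f - g) ** 2))

def ncc(f, g):
    """Mean-subtracted normalized cross-correlation (higher is better)."""
    f0 = np.asarray(f, dtype=np.float64) - np.mean(f)
    g0 = np.asarray(g, dtype=np.float64) - np.mean(g)
    denom = np.sqrt(np.sum(f0 ** 2)) * np.sqrt(np.sum(g0 ** 2))
    return float(np.sum(f0 * g0) / denom) if denom > 0 else 0.0

def align_single_scale(channel, reference, radius=15, metric="ssd"):
    """Try every shift (dy, dx) with |dy|, |dx| <= radius and return the best one.

    Assumption: shifts are applied with np.roll, so pixels wrap around the border.
    """
    best_shift, best_score = (0, 0), None
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            # Negate SSD so that "larger is better" holds for both metrics.
            score = -ssd(shifted, reference) if metric == "ssd" else ncc(shifted, reference)
            if best_score is None or score > best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift
```

In this sketch, `align_single_scale(r, b)` and `align_single_scale(g, b)` would return the (row, column) offsets to apply to the R and G channels before stacking them with B. The cost grows with the square of `radius` times the number of pixels, which is why this approach becomes slow on the full-resolution scans.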
A (15,15) px window also does not guarantee a correct output, because the true shift is sometimes larger than the search window. Using a larger window is one option, but the computation time would then run into hours. Hence there is a need for a faster implementation.

Another issue when using the images directly is that the border pixels of the three channels can throw off the similarity score. So I cropped each channel by 20% before passing it to the alignment algorithm. This ensures that similarity is computed only over the image content and not over the borders.

The table below shows the aligned channels for the cathedral image. The alignment on the left used SSD as the similarity metric and the alignment on the right used NCC.

| SSD loss | NCC loss |
|----------|----------|
| Cathedral.jpg, Offset[R]: (12,3), Offset[G]: (5,2) | Cathedral.jpg, Offset[R]: (12,3), Offset[G]: (5,2) |

(##) Image Alignment : Multi-Scale Pyramid Implementation

As motivated in the single-scale section, a multi-scale pyramid implementation can reduce the computation time to a few seconds. We construct an image pyramid that represents the image at multiple scales (each scaled by a factor of 2), and the processing is done sequentially, starting from the coarsest scale (smallest image) and going down the pyramid until we reach the finest scale.

To implement the image pyramid, we downscale the original .tiff image until we reach the smallest desired image size (256px < smallest image dimension < 512px). We then align the color channels at this resolution using a (15,15)px window and use the computed shift as the starting point at the next, upscaled level. We also reduce the search space from (15,15)px to (13,13)px to (11,11)px as we move towards the finer-scale images, refining our estimate. For images that are already smaller than 512px, we don't use image pyramids.

Initially I computed the similarity and alignment in image (RGB) space, but later I computed the image gradients and used those to compute similarity and alignment. More details on this are in the bells and whistles section. I used three different matching scores: SSD, NCC and ZNCC. The results of all three are shared below. The figure below shows the construction of a multi-scale pyramid.

Figure: Multi-Scale Pyramid
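
The coarse-to-fine procedure described in this section can be sketched as follows. This is an illustrative sketch under several assumptions that are not stated in the report: downscaling is done by averaging 2x2 blocks, SSD is the matching score, shifts are circular (`np.roll`), the search window is read as `±radius` pixels, and all function and parameter names are hypothetical.

```python
import numpy as np

def downscale2(img):
    """Halve the resolution by averaging 2x2 blocks (an odd last row/column is dropped)."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    return img[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def build_pyramid(img, max_min_dim=512):
    """Return [finest, ..., coarsest], halving until the smaller side is below max_min_dim."""
    levels = [np.asarray(img, dtype=np.float64)]
    while min(levels[-1].shape) >= max_min_dim:
        levels.append(downscale2(levels[-1]))
    return levels

def search(channel, reference, center, radius):
    """Exhaustive SSD search over shifts within `radius` pixels of `center`."""
    best_shift, best_score = center, np.inf
    for dy in range(center[0] - radius, center[0] + radius + 1):
        for dx in range(center[1] - radius, center[1] + radius + 1):
            diff = np.roll(channel, (dy, dx), axis=(0, 1)) - reference
            score = np.sum(diff ** 2)
            if score < best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift

def align_pyramid(channel, reference, start_radius=15, radius_step=2,
                  min_radius=1, max_min_dim=512):
    """Estimate the shift at the coarsest level, then double and refine it at each finer level."""
    ch_levels = build_pyramid(channel, max_min_dim)
    ref_levels = build_pyramid(reference, max_min_dim)
    shift, radius = (0, 0), start_radius
    coarsest = len(ch_levels) - 1
    for level in range(coarsest, -1, -1):
        if level < coarsest:
            # One level finer: pixel offsets double, and the search window shrinks.
            shift = (2 * shift[0], 2 * shift[1])
            radius = max(min_radius, radius - radius_step)
        shift = search(ch_levels[level], ref_levels[level], shift, radius)
    return shift
```

With the defaults, an image whose smaller side is already below 512px yields a one-level pyramid, so the function falls back to a single-scale search, and the radius shrinks as 15, 13, 11 across levels as in the text. Because each level only searches a small window around the doubled estimate, the total work is dominated by a few small searches rather than one very large one at full resolution.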