Assignment 1

Colorizing the Prokudin-Gorskii Photo Collection.

Problem

This assignment aims to use image processing techniques to create a color image from digitized Prokudin-Gorskii glass plate images, by extracting the three color channel images, placing them on top of each other, and aligning them. An example of the misaligned color channels is show in the left figure where the three channels are directly stacked over each over. The ideal image with aligned color channels is shown in the right figure.

Misaligned Color Channels
Aligned Color Channels

Evaluation metrics for image matches

In order to accurately assess the alignment of the color channels, it is essential to explore and select the most suitable evaluation metrics to measure and quantify the structural similarity between the color channels.

Sum of squared differences (SSD) distance

The Sum of Squared Differences (SSD) distance is the most straightforward metric for measuring the alignment between two images. It calculates the differences between each pixel of the two images and then squares them, providing a more accurate measure of the alignment than taking the absolute value of the differences. This metric is especially useful when comparing images in higher resolutions, as it captures the relationships between pixels in a more comprehensive way.

Normalized cross-correlation (NCC)

Comparing to SSD, NNC is known to be more accurate and robust. Also, NNC can be used to align images with varying levels of brightness and contrast, whereas SSD is limited to images with uniform brightness and contrast. However, SSD is considered to be faster in terms of computation time.

Structural Similarity Index (SSIM)

SSIM is another metric used to measure and quantify the structural similarity between images. SSIM measures the similarity between two images in terms of their luminance, contrast, and structure, and is more robust to changes in brightness and contrast.

However, while SSIM is a good measure of the perceptual quality of an image, it may not be the best choice for image alignment, as it is more computationally intensive and may not provide as clear a measure of similarity as NCC.

Using SSD, alignment computed in 15 seconds
Using NCC, alignment computed in 28 seconds
Using SSIM, alignment computed in 43 seconds

Exhaustive search over pixel space

To begin with, an exhaustive search over the entire pixel space is performed to find the optimal displacement between the two images. The search is performed by shifting the image in each direction over a window of [-15, 15] pixels. The displacement with the lowest metric score is selected as the optimal displacement. The example below illustrates the result obtained by searching over pixel space on the small cathedral image.

Raw image without alignment
Aligned image

Fast search over pyramids

Despite the excellent performance on the small cathedral image, the exhaustive search algorithm is computationally expensive for large images (.tif images). To improve the performance, a fast search over pyramids is implemented. The algorithm constructs a Gaussian pyramid where each level is downsampled by a factor of 2 from the previous level. Then exhaustive search is performed from the smallest level to the largest level. The optimal displacement from each level is applied to the next level to reduce the search space. The algorithm stops when the optimal displacement is found at the largest level. The example below illustrates the result obtained efficiently by searching over pyramids on the large village image.

Raw image without alignment
Aligned image

Issues and improvements

Border disturbs channel matching

Due to different shifts in the three color channels, the borders of the images are not consistent. It is impossible to find a perfect match along the borders. Therefore, the algorithm is negatively affected by the borders. To improve the performance, the matching algorithm is only applied to the inner part of the images by cropping out 10% on each side.

Matching without cropping the edges
Matching with the edges cropped

Efficiency tradeoff

Although the fast search over pyramids algorithm is much faster than the exhaustive search algorithm, it is still computationally expensive to find the perfect match for large images. It is observed that using pyramid searching with a smaller window size doesn't degrade the performance visibly, but significantly reduces the computation time. This is because at small pyramid levels, the small window size is sufficient to find the optimal displacement. Therefore, a smaller window size is used to improve the performance. Similarity, skipping pixels during the searching process also reduces the computation time while maintaining the performance.

Inconsistent color brightness

In the cases like the photo of the Emir, the color brightness are inconsistent among the color channels. The algorithm thus fails to find the optimal alignment. To tackle this problem, instead of using the raw intensity values, the algorithm uses the gradient map to search for the optimal alignment. To take advantage of the intensity map as well, the matching score is a weight sum of the metric values on the intensity map and the gradient map. The example below illustrates the improvement from this modification.

Searching on the intensity map
Searching on the gradient map

Results on provided examples (Pixel search)

Red shift:(12, 3) Green shift:(5, 2)

Results on provided examples (Pyramid search)

Red shift:(106, 41) Green shift:(49, 24)
Red shift:(124, 13) Green shift:(60, 17)
Red shift:(90, 23) Green shift:(41, 17)
Red shift:(119, 13) Green shift:(56, 9)
Red shift:(176, 37) Green shift:(78, 29)
Red shift:(111, 9) Green shift:(54, 12)
Red shift:(85, 29) Green shift:(42, 2)
Red shift:(117, 28) Green shift:(57, 22)
Red shift:(137, 21) Green shift:(64, 10)

Results on more examples

Red shift:(52, 21) Green shift:(23, 15)
Red shift:(-5, 27) Green shift:(-15, 13)
Red shift:(105, -21) Green shift:(26, -9)
Red shift:(49, 43) Green shift:(17, 29)
Red shift:(23, -1) Green shift:(-10, 1)
Red shift:(102, -12) Green shift:(25, -5)

Bells & Whistles

Additional enhancements from Bells & Whistles features are implemented as listed below:

  • Implementation was done in Pytorch
  • Automatic cropping is implemented to remove the noisy borders via Sobel filters
  • Automatic contrasting is implemented to improve the perceived image quality
  • Similarity on gradient mapping is used as an additional matching metric
  • More transformation options are added to the searching strategy
The following are some results before and after applying the Bells & Whistles features on the provided examples:

Before Bells & Whistles
After Bells & Whistles
Before Bells & Whistles
After Bells & Whistles
Before Bells & Whistles
After Bells & Whistles
Before Bells & Whistles
After Bells & Whistles