16-726 | Qin Han | qinh@andrew.cmu.edu
Using PyTorch, I implemented an algorithm that aligns the three color channels of the digitized Prokudin-Gorskii glass plate images so that they form a single RGB color image. The algorithm (1) computes the norm of each channel's gradient to extract edges; (2) builds a pyramid of edge features for each channel; (3) estimates the shift between channels coarse-to-fine over the pyramids; (4) stacks the shifted channels into an RGB color image; and (5) applies automatic border cropping, automatic white balancing, and automatic contrasting to the raw result to improve perceived image quality.
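For orientation, here is a minimal sketch of how these steps could be chained together. The helper names (edge_features, pyramid_align, auto_crop, white_balance, sigmoid_contrast) are placeholders for the steps sketched in the sections below, not the actual function names, and blue is assumed to be the reference channel.

```python
import torch

def reconstruct(plate: torch.Tensor) -> torch.Tensor:
    # plate: the grayscale glass plate with the B, G, R exposures
    # stacked vertically; split it into thirds.
    h = plate.shape[0] // 3
    b, g, r = plate[:h], plate[h:2 * h], plate[2 * h:3 * h]
    # Align green and red to blue on edge features (steps 1-3).
    dyg, dxg = pyramid_align(edge_features(g), edge_features(b))
    dyr, dxr = pyramid_align(edge_features(r), edge_features(b))
    g = torch.roll(g, shifts=(dyg, dxg), dims=(0, 1))
    r = torch.roll(r, shifts=(dyr, dxr), dims=(0, 1))
    # Stack into an RGB image (step 4) and post-process (step 5).
    rgb = torch.stack([r, g, b])
    return sigmoid_contrast(white_balance(auto_crop(rgb)))
```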
Below is an example of the full pipeline, where the images show (1) alignment using raw pixel values, (2) alignment using edge features, (3) automatic cropping, (4) automatic white balancing, and (5) automatic contrasting, respectively.
To align two channels, the algorithm searches over a fixed-size window of shifts and picks the best one according to an L2 or NCC score: the L2 score is the L2 norm of the difference between the two channels, and the NCC score is their normalized cross-correlation. I also crop the borders before scoring, which improves matching stability. The matching can operate on any per-channel feature; in my implementation, I tried both raw pixel values and edge features.
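Below is a minimal sketch of the exhaustive search with the NCC score; the window size and border margin are illustrative values, not the report's actual settings.

```python
import torch

def ncc(a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
    # Normalized cross-correlation between two flattened channels.
    a = a.flatten() - a.mean()
    b = b.flatten() - b.mean()
    return torch.dot(a, b) / (a.norm() * b.norm())

def best_shift(channel, reference, window=15, margin=20):
    # Exhaustive search over a (2*window+1)^2 grid of shifts,
    # scoring each candidate with NCC on border-cropped interiors.
    best, best_score = (0, 0), -float("inf")
    for dy in range(-window, window + 1):
        for dx in range(-window, window + 1):
            shifted = torch.roll(channel, shifts=(dy, dx), dims=(0, 1))
            score = ncc(shifted[margin:-margin, margin:-margin],
                        reference[margin:-margin, margin:-margin])
            if score > best_score:
                best_score, best = score, (dy, dx)
    return best
```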
For high-resolution images, searching over a very large shift window is inefficient. Instead, I construct a pyramid of image features and estimate the shift coarse-to-fine: the pyramid is built by repeatedly downsampling the features, and the algorithm starts at the coarsest level, then at each finer level upsamples the shift from the previous level and refines it within a small window. The total search cost then grows logarithmically with image size instead of quadratically.
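A sketch of the coarse-to-fine recursion, reusing best_shift from above; the average-pooling downsampling, the minimum pyramid size, and the one-pixel refinement window are assumptions, not the report's stated settings.

```python
import torch
import torch.nn.functional as F

def pyramid_align(channel, reference, min_size=64, window=15):
    # Base case: the image is small enough for a full window search.
    if min(channel.shape) <= min_size:
        return best_shift(channel, reference, window)
    # Recurse on a half-resolution pyramid level.
    down = lambda x: F.avg_pool2d(x[None, None], 2)[0, 0]
    dy, dx = pyramid_align(down(channel), down(reference), min_size, window)
    # Upsample the coarse estimate (x2) and refine it locally.
    coarse = torch.roll(channel, shifts=(2 * dy, 2 * dx), dims=(0, 1))
    rdy, rdx = best_shift(coarse, reference, window=1)
    return 2 * dy + rdy, 2 * dx + rdx
```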
Below are the results of the algorithm, where the images show (1) alignment using raw pixel values, (2) alignment using edge features, (3) automatic cropping, (4) automatic white balancing, and (5) automatic contrasting, respectively.
cathedral: shift green [5, 2], red [12, 3]
emir: shift green [49, 23], red [107, 40]
harvesters: shift green [60, 17], red [123, 13]
icon: shift green [41, 16], red [90, 23]
lady: shift green [56, 9], red [119, 13]
self_portrait: shift green [78, 29], red [175, 37]
three_generations: shift green [53, 13], red [113, 11]
train: shift green [43, 8], red [86, 33]
turkmen: shift green [56, 22], red [117, 29]
village: shift green [64, 11], red [137, 22]
All of the algorithms are implemented using PyTorch.
First, the algorithm convolves the image with gradient kernels to obtain the image gradient. Second, it computes the norm of the gradient and applies a threshold to it, keeping the resulting edge map as the new matching feature. Here is an example of the extracted edge features of the cathedral image.
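A minimal sketch of the edge extraction, assuming Sobel kernels and a threshold relative to the maximum gradient norm; the report only specifies "gradient kernels" and "a threshold".

```python
import torch
import torch.nn.functional as F

def edge_features(img: torch.Tensor, threshold: float = 0.1) -> torch.Tensor:
    # img: (H, W) grayscale channel. Sobel kernels and the relative
    # threshold value are assumptions for illustration.
    kx = torch.tensor([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]])
    ky = kx.t()
    x = img[None, None]                          # (1, 1, H, W)
    gx = F.conv2d(x, kx[None, None], padding=1)  # horizontal gradient
    gy = F.conv2d(x, ky[None, None], padding=1)  # vertical gradient
    mag = torch.sqrt(gx ** 2 + gy ** 2)[0, 0]    # gradient norm
    return (mag > threshold * mag.max()).float() # binary edge map
```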
The algorithm crops the borders based on the observation that pixel values near the plate borders are either very dark or very bright. It therefore iteratively computes the mean intensity of the outermost rows and columns and removes them while that mean is too small or too large.
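A sketch of that border-peeling loop; the darkness/brightness thresholds are assumed values, as the report does not state them.

```python
import torch

def auto_crop(img: torch.Tensor, lo: float = 0.05, hi: float = 0.95) -> torch.Tensor:
    # img: (3, H, W) with values in [0, 1]. Repeatedly peel off an
    # outermost row or column whose mean intensity is nearly black
    # or nearly white; lo/hi are illustrative thresholds.
    def bad(strip: torch.Tensor) -> bool:
        m = strip.mean().item()
        return m < lo or m > hi

    top, bottom = 0, img.shape[1]
    left, right = 0, img.shape[2]
    changed = True
    while changed and bottom - top > 2 and right - left > 2:
        changed = False
        if bad(img[:, top, left:right]):
            top, changed = top + 1, True
        if bad(img[:, bottom - 1, left:right]):
            bottom, changed = bottom - 1, True
        if bad(img[:, top:bottom, left]):
            left, changed = left + 1, True
        if bad(img[:, top:bottom, right - 1]):
            right, changed = right - 1, True
    return img[:, top:bottom, left:right]
```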
The algorithm first computes the average of each channel and takes it as an estimate of that channel's illumination (the gray-world assumption). It then computes a per-channel scale factor that makes each channel's average equal to the average over the three channels. Finally, it clips the scaled values back to the valid intensity range.
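A sketch of this gray-world balancing step, assuming intensities in [0, 1]:

```python
import torch

def white_balance(img: torch.Tensor) -> torch.Tensor:
    # img: (3, H, W). Scale each channel so its mean matches the
    # mean over all three channels, then clip to [0, 1].
    means = img.mean(dim=(1, 2), keepdim=True)  # per-channel means
    scale = means.mean() / means
    return (img * scale).clamp(0.0, 1.0)
```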
The algorithm uses non-linear sigmoid-based contrast stretching to improve the perceived image quality. The sigmoid function is defined as \(f(x) = \frac{1}{1 + e^{-\alpha(x - \mu)}}\), where \(\mu\) is the midpoint parameter of the sigmoid curve, controlling where the midpoint of the sigmoid function falls in the range of intensity values, and \(\alpha\) is a hyperparameter controlling the slope of the sigmoid function.
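A sketch of the contrast stretch applying the formula above; \(\alpha = 8\) and \(\mu\) set to the mean intensity are illustrative choices, not the report's stated parameters.

```python
import torch

def sigmoid_contrast(img: torch.Tensor, alpha: float = 8.0,
                     mu: float = None) -> torch.Tensor:
    # f(x) = 1 / (1 + exp(-alpha * (x - mu))): alpha controls the
    # slope, mu the midpoint; defaulting mu to the mean intensity
    # is one reasonable choice of midpoint.
    if mu is None:
        mu = img.mean().item()
    return torch.sigmoid(alpha * (img - mu))
```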
Below are extra examples, where the images show (1) alignment using raw pixel values, (2) alignment using edge features, (3) automatic cropping, (4) automatic white balancing, and (5) automatic contrasting, respectively.