The key idea of this project is to align the three color channels by shifting them relative to one another. If we directly overlay the unaligned channels, we cannot recover the original image accurately.
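As a minimal sketch of what "aligning through shifting" means, the snippet below shifts the green and blue channels onto the red one and stacks the result. The function name `compose_rgb` is hypothetical, the (row, column) order of each shift is an assumption, and `np.roll` wraps pixels around the border, which a real pipeline might crop instead.

```python
import numpy as np

def compose_rgb(r, g, b, g_shift=(0, 0), b_shift=(0, 0)):
    """Shift g and b by (dy, dx) onto r, then stack into an H x W x 3 image.

    Assumption: shifts are (row, column) offsets; np.roll wraps at the borders.
    """
    g_aligned = np.roll(g, g_shift, axis=(0, 1))
    b_aligned = np.roll(b, b_shift, axis=(0, 1))
    return np.dstack([r, g_aligned, b_aligned])
```

With the correct shifts, the three planes of the returned array line up pixel for pixel; with zero shifts, the misalignment shows up as colored fringes.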

The following outlines the key algorithm we implemented:

  1. Similarity metric for evaluating shifts: Because each color channel can have very different pixel intensities, comparing raw grayscale values is unreliable, so it is important to compare other features. In our implementation of the align function, we add an optional is_edge variable that switches the comparison to edge features. We compare the edge-based results against the traditional value-based approach, which scores raw pixel values directly (for example, with Euclidean distance). The results show that the edge detector significantly outperforms the value-based approach (we use NCC as the similarity function), as seen in the "emir" picture.

Below: with edge detection

Image

Below: without edge detection

Image
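The comparison above can be sketched as follows. This is an illustration under stated assumptions, not the project's actual code: the names `align` and `is_edge` come from the text, while `edge_magnitude`, `ncc`, and `max_shift` are hypothetical, and a plain gradient magnitude stands in for whichever edge detector was actually used.

```python
import numpy as np

def edge_magnitude(img):
    """Gradient magnitude from finite differences (assumed stand-in edge detector)."""
    gy, gx = np.gradient(img.astype(np.float64))
    return np.hypot(gx, gy)

def ncc(a, b):
    """Normalized cross-correlation of two equally sized arrays."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float((a * b).sum() / denom) if denom > 0 else 0.0

def align(channel, reference, max_shift=4, is_edge=False):
    """Exhaustively search (dy, dx) shifts of `channel` that best match `reference`."""
    if is_edge:
        # Compare edge maps instead of raw intensities.
        channel, reference = edge_magnitude(channel), edge_magnitude(reference)
    best_score, best_shift = -np.inf, (0, 0)
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            shifted = np.roll(channel, (dy, dx), axis=(0, 1))
            score = ncc(shifted, reference)
            if score > best_score:
                best_score, best_shift = score, (dy, dx)
    return best_shift
```

Calling `align(g, r, is_edge=True)` scores each candidate shift on edge maps, while `is_edge=False` scores raw pixel values, so the two modes can be compared on the same image.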

  2. IOU similarity: We also use IOU (Intersection over Union) similarity, which measures the overlap between the edge-detected images. This overlap measure captures the image structure efficiently and gives better alignment results.

  3. Optimization using pyramid alignment: As suggested in the problem description, we use a pyramid alignment approach. This involves two steps:

    1. Generating the downsampling pyramid (using Gaussian blur with a manually implemented kernel for convolution).
    2. Narrowing down the results at the coarser image level before moving to higher resolutions.

    At each level, we find the best shift values along the x and y axes. We then carry these values to the next (finer) layer, rescaled to its resolution, and search only within ±1 of them. This approach, similar to binary search, lets us find the optimal alignment efficiently.

    1. Ratio-based Offset Limit: To limit the search range, we assume the offset between any two color channels will be less than 0.05%. This holds for all of our available images, and the parameter is adjustable.
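The IOU score and the coarse-to-fine search described above can be sketched together as below. All function names and parameters here are hypothetical. The quantile-thresholded gradient used as the edge detector, the 5-tap kernel, and the default `max_ratio=0.05` (a fraction of the coarsest image size) are illustrative assumptions rather than the project's actual settings.

```python
import numpy as np

def edge_mask(img, quantile=0.8):
    """Binary edge map: gradient magnitude above a quantile threshold (assumed detector)."""
    gy, gx = np.gradient(img.astype(np.float64))
    mag = np.hypot(gx, gy)
    return mag > np.quantile(mag, quantile)

def iou(mask_a, mask_b):
    """Intersection over Union of two boolean masks."""
    union = np.logical_or(mask_a, mask_b).sum()
    return float(np.logical_and(mask_a, mask_b).sum() / union) if union else 0.0

def downsample(img):
    """Blur with a small hand-built Gaussian kernel, then keep every other pixel."""
    k = np.exp(-np.arange(-2, 3) ** 2 / 2.0)
    k /= k.sum()
    padded = np.pad(img.astype(np.float64), 2, mode="edge")
    rows = np.apply_along_axis(np.convolve, 1, padded, k, mode="valid")
    cols = np.apply_along_axis(np.convolve, 0, rows, k, mode="valid")
    return cols[::2, ::2]

def best_shift(channel, reference, center, radius):
    """Try every (dy, dx) within `radius` of `center`; return the highest-IoU shift."""
    ch_mask, ref_mask = edge_mask(channel), edge_mask(reference)
    candidates = [(center[0] + dy, center[1] + dx)
                  for dy in range(-radius, radius + 1)
                  for dx in range(-radius, radius + 1)]
    return max(candidates,
               key=lambda s: iou(np.roll(ch_mask, s, axis=(0, 1)), ref_mask))

def pyramid_align(channel, reference, levels=3, max_ratio=0.05):
    """Coarse-to-fine search for the (dy, dx) shift of `channel` onto `reference`."""
    ch_pyr, ref_pyr = [channel], [reference]
    for _ in range(levels - 1):
        ch_pyr.append(downsample(ch_pyr[-1]))
        ref_pyr.append(downsample(ref_pyr[-1]))
    # Coarsest level: the search radius is capped at a fraction of the image size.
    coarse = ref_pyr[-1]
    radius = max(1, int(np.ceil(max_ratio * max(coarse.shape))))
    shift = best_shift(ch_pyr[-1], coarse, (0, 0), radius)
    # Finer levels: double the estimate, then refine within +/-1 pixel.
    for ch, ref in zip(ch_pyr[-2::-1], ref_pyr[-2::-1]):
        shift = best_shift(ch, ref, (2 * shift[0], 2 * shift[1]), 1)
    return shift
```

Because each finer level only checks nine candidates around the doubled estimate, the cost is dominated by the small search at the coarsest level rather than by an exhaustive search at full resolution.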

Image

g shift to r: (-58, -16)

b shift to r: (-108, -40)

Train

Image

g shift to r: (-44, -28)

b shift to r: (-88, -34)

church.tif

Image

g shift to r: (-34, 8)

b shift to r: (-60, 4)

harvesters.tif

Image

g shift to r: (-64, 0)

b shift to r: (-120, -16)

icon.tif

Image

g shift to r: (-48, -8)

b shift to r: (-90, -24)

lady.tif

Image

g shift to r: (-64, -4)

b shift to r: (-112, -16)

melons.tif

Image

g shift to r: (-96, -4)

b shift to r: (-180, -18)

monastery.jpg

Image

g shift to r: (-6, 0)

b shift to r: (-2, -2)

onion_church.tif

Image

g shift to r: (-56, -12)

b shift to r: (-108, -40)

sculpture.tif

Image

g shift to r: (-108, 16)

b shift to r: (-140, 24)

self_portrait.tif

Image

g shift to r: (-98, -8)

b shift to r: (-176, -40)

three_generations.tif

Image

g shift to r: (-60, 0)

b shift to r: (-116, -12)

tobolsk.jpg

Image

g shift to r: (-4, -2)

b shift to r: (-8, -2)

cathedral.jpg

Image

g shift to r: (-8, 0)

b shift to r: (-12, -2)

Other examples:

original one:

Image

converted:

Image

g shift to r: (-6, -2)

b shift to r: (-12, -6)