Read, understand, and complete and run the following notebook. You must return the completed notebook, including your answers and illustrations (you may need to add cells to write your code or comments).
To execute a notebook, you will need to install jupyter. If you are using anaconda (strongly advised). If you cannot/don't want to use notebooks, you can return both your python code and a report in pdf. Note that all the TPs will use python2. If you are using python3, you might need to do a few changes to the provided code.
Return your work by e-mail using a single file (ipynb or zip) with the format 'introvis17_tp1_yourname.ipynb'
The first part is about doing basic image processing in Python, the second breaks down the different steps in implementing the Canny edge detector.
Here are a set of useful libraries for image processing. You will need to refer to the online documentation of the different libraries to complete the TP.
import numpy as np
# this is the key library for manipulating arrays. Use the online ressources! http://www.numpy.org/
import matplotlib.pyplot as plt
# used to read images, display and plot http://matplotlib.org/api/pyplot_api.html .
# You can also check this simple intro to using ipython notebook with images
# https://matplotlib.org/users/image_tutorial.html .
%matplotlib inline
# to display directly in the notebook
import scipy.ndimage as ndimage
# one of several python libraries for image procession
plt.rcParams['image.cmap'] = 'gray'
# by default, the grayscale images are displayed with the jet colormap: use grayscale instead
Here is a simple example how to read and display an image (you need to first download this image)
name='lena.jpg'
I=plt.imread(name)
plt.imshow(I)
I
? Its type? The type of the values it contains? What's its range?¶# complete this part.
# Tip: look for numpy attributes (e.g. https://docs.scipy.org/doc/numpy/reference/arrays.ndarray.html)
# and functions to do what you want (e.g. np.min)
# /!\: I will resort to PYTHON3
from IPython.display import display, Markdown
# data type range of numpy dtypes
dtype_range = {np.bool_.__name__: (False, True),
np.bool8.__name__: (False, True),
np.uint8.__name__: (0, 255),
np.uint16.__name__: (0, 65535),
np.int8.__name__: (-128, 127),
np.int16.__name__: (-32768, 32767),
np.int64.__name__: (-2**63, 2**63 - 1),
np.uint64.__name__: (0, 2**64 - 1),
np.int32.__name__: (-2**31, 2**31 - 1),
np.uint32.__name__: (0, 2**32 - 1),
np.float32.__name__: (-1, 1),
np.float64.__name__: (-1, 1)}
display(Markdown("- **Size of `I`:** `{}` \n"\
"- Its *type*: `{}` \n"\
"- The **type of its values**: `{}` \n"\
" - whose **range** is: `{}` \n"\
"- Its **shape**: `{}`"\
.format(I.size, type(I).__name__, I.dtype, dtype_range[str(I.dtype)], I.shape)))
def load_image_v0(name,crop_window=-1):
I=plt.imread(name)
if crop_window!=-1:
I=I[crop_window[0]:crop_window[1],crop_window[2]:crop_window[3]] # slices the image according to crop_window
I=I.astype('float')/255
return I
I=I.astype('float')
doing? Use the function to extract and save the 100x50 pixels of the bottom left corner of 'lena'.¶The line I=I.astype('float')
casts the values of I
to the type float
, so that the division performed afterwards is a floating point one, and the resulting float image is restricted to the range (-1, 1)
.
dh, dw = 100, 50 # size of the bottom-left cropped image
height, width, _ = I.shape
crop_window = [height-dh, height, 0, dw]
I_cropped = load_image_v0(name, crop_window)
plt.imshow(I_cropped)
def load_image(name,crop_window=-1):
I = load_image_v0(name, crop_window)
# if the input is already grayscale,
# i.e. if it's a 2D array, just return it
if I.ndim == 2:
return I
else:
# it's a color image (3D array: there's an additional dimension for the RGB coordinates)
# The following dot product is a weighted sum over the RGB coordinates (last axis) of I,
# whose weights are the values of [0.2989, 0.5870, 0.1140], which is what we want
# (cf https://docs.scipy.org/doc/numpy/reference/generated/numpy.dot.html)
return np.dot(I, np.array([0.2989, 0.5870, 0.1140]))
I_cropped_grayscaled = load_image(name, crop_window)
plt.imshow(I_cropped_grayscaled)
Use your function to load a grayscale version of lena and save it.
I = load_image(name)
plt.imshow(I)
Ib = ndimage.gaussian_filter(I, sigma=3, order=1)
display(Markdown("> The size of the output **is** {} the same as the input"\
.format("**not**"*(Ib.size != I.size))))
plt.imshow(Ib)
mode
paramater, whose default value is reflect
. That is, when applying the filter to the borders, the values of the last row/column are reflected across the image's boundary.mode
parameter has to be set to constant
(constant padding), the constant value (determined by the cval
paramater) being 0.0
by default.Ib = ndimage.gaussian_filter(I, sigma=3, mode='constant')
plt.imshow(Ib)
The black outline is due to the border conditions being set to zero padding: on the borders, the gaussian window pixels outside the boundary are black (constant value 0
), resulting in a blacker weighted average.
k = np.array([[-1./9, -1./9, -1./9],
[-1./9, 2.-1/9., -1./9],
[-1./9, -1./9, -1./9]])
Is=ndimage.convolve(I, k)
plt.imshow(Is)
As seen in class (page 56), this linear filter sharpens the image (by accentuating the difference with the local average):
dirac_two = np.zeros((3, 3))
dirac_two[1,1] = 2
k = dirac_two - 1./9 * np.ones((3, 3))
$$k = \underbrace{\begin{pmatrix} 0 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 0 \\ \end{pmatrix}\;}_{\text{accentuation}} \;\; - \; \underbrace{\begin{pmatrix} \frac 1 9 & \frac 1 9 & \frac 1 9 \\ \frac 1 9 & \frac 1 9 & \frac 1 9 \\ \frac 1 9 & \frac 1 9 & \frac 1 9 \end{pmatrix}\;}_{\text{local average}}$$
compute_gradient
that returns three images containing an image derivative along the horizontal direction, the vertical direction, and the gradient norm. It should use forward differences, and take an optional argument sigma
that gives the scale of the derivative. What is the size of the output compared to the size of the input?¶def compute_gradient(I, sigma=3):
# derivative along horizontal (resp. vertical) direction:
# we use the fact that d/dx(G * f) = d/dx(G) * f (resp. d/dx(G * f) = d/dx(G) * f)
# where G is a gaussian kernel
I_horiz = ndimage.gaussian_filter1d(I, sigma, order=1, axis=1)
I_vert = ndimage.gaussian_filter1d(I, sigma, order=1, axis=0)
# ||grad f|| = sqrt((df/dx)^2 + (df/dy)^2)
I_grad = np.sqrt(I_vert**2 + I_horiz**2)
return I_horiz, I_vert, I_grad
for img, label in zip(compute_gradient(I), ["Horizontal derivative", "Vertical derivative", "Gradient norm"]):
display(Markdown("#### {}".format(label)))
plt.imshow(img)
plt.show()
display(Markdown("> The size of the output **is** {} the same as the input\n \n_______"\
.format("**not**"*(img.size != I.size))))
Tip: if you want to display several images in a single cell, use
plt.show()
after each call toplt.imshow(...)
sigma_range = range(1, 5)
for sigma in sigma_range:
display(Markdown("#### `sigma = {}`".format(sigma)))
fig = plt.figure(figsize = (15., 15.))
# for sigma=sigma, retrieve the images returned by compute_gradient
# and add them to a subplot
for (index, img), label in zip(enumerate(compute_gradient(I, sigma=sigma)), \
["Horizontal derivative", "Vertical derivative", "Gradient norm"]):
fig.add_subplot(1, 3, index+1)\
.set_title(label)
plt.imshow(img)
plt.axis('off')
plt.show()
display(Markdown("\n \n_______"))