from IPython.display import HTML
HTML('<iframe width="560" height="315" src="https://www.youtube.com/embed/UBX2QQHlQ_I?rel=0&controls=0&showinfo=0" frameborder="0" allowfullscreen></iframe>')
You can think of an image as a multi-dimensional list with rows and columns, with each cell containing a 3- or 4-element list. The first three elements correspond to red, green, and blue, and the optional fourth element is an alpha value controlling opacity (whether you can see through the image at this cell).
Each of these cells is called a pixel.
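To make that structure concrete, here is a minimal sketch of a 2x2 image written as plain nested lists. The names `tiny_img`, `RED`, `GREEN`, `BLUE`, and `WHITE` are made up for this illustration, and the alpha value assumes the common 0 to 255 scale, where 255 is fully opaque.

```python
# A tiny 2x2 image as nested lists: rows -> columns -> [red, green, blue]
RED = [255, 0, 0]
GREEN = [0, 255, 0]
BLUE = [0, 0, 255]
WHITE = [255, 255, 255]

tiny_img = [
    [RED, GREEN],   # row 0
    [BLUE, WHITE],  # row 1
]

n_rows = len(tiny_img)       # number of rows
n_cols = len(tiny_img[0])    # number of columns in the first row
pixel = tiny_img[1][0]       # the pixel at row 1, column 0

print("Rows:", n_rows, "Columns:", n_cols)
print("Pixel at 1,0:", pixel)

# The same pixel with an optional fourth (alpha) element appended
rgba_pixel = pixel + [128]
print("With alpha:", rgba_pixel)
```

Indexing goes row first, then column, then channel, which is the same order used in the rest of this notebook.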
Recall from previous exercises that you can create 2D lists in Python and have them displayed as images using:
my2dList = make_2d_list_v2(10,15)
my2dList[8][3:10] = [255] * 7
my2dList[7][2:11] = [255] * 9
my2dList[6][2:12] = [255] * 10
my2dList[5][2:4] = [255] * 2
my2dList[5][11:13] = [255] * 2
my2dList[5][7] = 255
my2dList[4][7] = 255
my2dList[3][4:9] = [255] * 5
my2dList[2][4:9] = [255] * 5
my2dList[1][4:9] = [255] * 5
my2dList[0][7] = 255
my_pretty_print(my2dList)
plt.imshow(my2dList)
plt.show()
Which results in the following:
Using existing Python libraries, we can read image files in this format and display them.
%matplotlib inline
import pandas as pd
import matplotlib.pyplot as plt
umd_logo_data = pd.read_csv("logo.csv", header=None, index_col=None)
print("Logo Dimensions:", umd_logo_data.shape)
plt.imshow(umd_logo_data, cmap="binary")
plt.show()
A 3-color image that uses red, green, and blue channels can be thought of as a 3-dimensional list: the first dimension is the row, the second is the column, and the third holds the red, green, and blue intensities for that pixel.
We will use the numpy package to build a quick multi-color image that demonstrates this.
import numpy as np
# Create a black image of 10 rows, 100 columns,
# and a cell for red, green, and blue
# NOTE: We have to set the data type to np.uint8,
# which is an unsigned 8-bit integer (0 to 255),
# because most image files use 8 bits each for red,
# green, and blue
test_img = np.zeros((10, 100, 3), dtype=np.uint8)
plt.imshow(test_img)
plt.show()
NumPy provides a lot of convenience when working with multi-dimensional arrays. Specifically, we can slice over multiple dimensions quickly using the ":" operator rather than having to deal with lists of lists directly.
# In the zero-th row, slice over all columns,
# and set the red channel to 255
test_img[0,:,0] = 255
plt.imshow(test_img)
plt.show()
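To see what that slice saves us, here is a minimal sketch that sets the red channel of row 0 twice: once with an explicit loop over a list of lists, and once with a single NumPy slice. The names `list_img` and `array_img` are made up for this comparison.

```python
import numpy as np

rows, cols = 10, 100

# Nested-list version: visit every column of row 0 by hand
list_img = [[[0, 0, 0] for _ in range(cols)] for _ in range(rows)]
for col in range(cols):
    list_img[0][col][0] = 255

# NumPy version: one slice assignment covers all columns of row 0
array_img = np.zeros((rows, cols, 3), dtype=np.uint8)
array_img[0, :, 0] = 255

# Both approaches should produce the same pixel values
same = np.array_equal(np.array(list_img, dtype=np.uint8), array_img)
print("Same result:", same)
```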
# Make bands of red, green, and blue
test_img[0:3,:,0] = 255
test_img[3:6,:,1] = 255
test_img[6:9,:,2] = 255
plt.imshow(test_img)
plt.show()
You can check the pixel values at a given location by printing the value at a given row and column.
print("Pixel at 0,0:", test_img[0,0])
print("Pixel at 3,0:", test_img[3,0])
print("Pixel at 6,0:", test_img[6,0])
print("Pixel at 9,0:", test_img[9,0])
You'll notice that the [0,0,0] pixel value represents black, which equates to the red, green, and blue channels being turned off or having zero intensity.
To make white, you turn all three channels to their max intensity, which here is 255.
test_img[9,:] = [255, 255, 255]
plt.imshow(test_img)
plt.show()
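Colors between black and white come from mixing the channels at different intensities. As a small sketch (the `palette` array is made up for this illustration), the following builds a one-row image with a few common mixes:

```python
import numpy as np

# One row, four columns, three channels
palette = np.zeros((1, 4, 3), dtype=np.uint8)
palette[0, 0] = [255, 255, 0]    # red + green = yellow
palette[0, 1] = [255, 0, 255]    # red + blue = magenta
palette[0, 2] = [0, 255, 255]    # green + blue = cyan
palette[0, 3] = [128, 128, 128]  # equal mid-range values = grey

print("Pixel at 0,0:", palette[0, 0])
print("Pixel at 0,3:", palette[0, 3])
```

Passing `palette` to `plt.imshow()` should show the four colors side by side.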
We can use the scipy package and its misc module to read images directly from image files and display them without having to make them from scratch as we did above.
from scipy import misc
# Read the UMD logo file
logo_img = misc.imread("umd.jpg")
plt.imshow(logo_img)
plt.show()
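Note that `misc.imread()` was deprecated in SciPy 1.0 and removed in SciPy 1.2, so the cell above fails on newer installations. Here is a minimal sketch of one possible alternative, assuming the Pillow package is installed (Pillow is not used elsewhere in this notebook). It builds a small test image called `demo` (a made-up name) instead of relying on `umd.jpg`, saves it, and reads it back into a NumPy array.

```python
import os
import tempfile

import numpy as np
from PIL import Image

# Build a small test image so the sketch does not depend on umd.jpg
demo = np.zeros((4, 6, 3), dtype=np.uint8)
demo[:, :3, 0] = 255   # left half red
demo[:, 3:, 2] = 255   # right half blue

# Save as PNG (lossless), then read it back into a NumPy array
path = os.path.join(tempfile.mkdtemp(), "demo.png")
Image.fromarray(demo).save(path)
loaded = np.asarray(Image.open(path).convert("RGB"))

print("Image Dimension:", loaded.shape)
print("Pixel at 0,0:", loaded[0, 0])
```

The resulting array has the same rows, columns, and channels layout as the one returned by `misc.imread()`, so the indexing shown in this notebook should carry over.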
You can print the image dimensions and a few individual pixels, and display a cropped region, to see that we are working with the same sort of data:
print("Image Dimension:", logo_img.shape)
print("Pixel at 180, 180:", logo_img[180,180]) # Red+green == yellow
print("Pixel at 180, 210:", logo_img[180,210]) # Mostly red
plt.imshow(logo_img[155:215,155:215])
plt.show()
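One thing to keep in mind when cropping with slices: a basic NumPy slice is a view onto the original array, not a separate copy. The sketch below uses a made-up 4x4 array named `img` to show the difference between a view and an explicit copy.

```python
import numpy as np

img = np.zeros((4, 4, 3), dtype=np.uint8)

# A basic slice is a view: writing to it also changes img
crop_view = img[1:3, 1:3]
crop_view[:, :] = [255, 0, 0]

# An explicit copy is independent of img
crop_copy = img[1:3, 1:3].copy()
crop_copy[:, :] = [0, 0, 255]

print("Original at 1,1:", img[1, 1])   # changed through the view
print("Copy at 0,0:", crop_copy[0, 0])
```

This is why it is worth copying an image before modifying it if you still need the original.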
We've now seen how to read an image from a file, access pixels in that image, and display it.
Use these pieces to write a class called MyImage that takes a filename as the argument for its initializer and uses misc.imread() to read that file into an instance variable.
Your MyImage class should have the following methods:
get_image_data(), which returns the image data you stored
show(), which displays the image on the screen
class MyImage:
    def __init__(self, filename):
        # Read the file into an instance variable
        self.img_data = misc.imread(filename)

    def get_image_data(self):
        return self.img_data

    def show(self):
        plt.imshow(self.img_data)
        plt.show()
# Testing code for your MyImage class
my_img = MyImage("watchmen.jpg")
my_img.show()
Now that we've read in an image, we can modify the image as we see fit.
A simple modification is to convert the image to greyscale. We've already seen something like this when we used 2D lists as images.
To convert a three-color image to greyscale, we calculate the average of the red, green, and blue values in each pixel and set all three channels in that pixel to this average (remember how all 0s was black and all 255s was white?).
# Copy the image
logo_copy = np.copy(logo_img)
# Go through each pixel and set each channel to the average across
# all channels
for i in range(logo_copy.shape[0]):
    for j in range(logo_copy.shape[1]):
        pixel = logo_copy[i, j, :]  # Get this pixel
        average = np.mean(pixel)  # Calculate its average
        # set the pixel to a numpy array containing this average
        # and using the data type 8-bit unsigned int
        logo_copy[i, j] = np.array([average] * 3, dtype=np.uint8)
# Display the copy
plt.imshow(logo_copy)
plt.show()
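The nested loop above visits every pixel in Python, which can be slow for large images. As a sketch of an alternative, the same averaging can be done with NumPy operations over the whole array at once. The helper name `to_greyscale` and the `sample` array are made up for this illustration, and the sketch assumes a three-channel uint8 image.

```python
import numpy as np

def to_greyscale(img):
    """Return a new (rows, cols, 3) uint8 image where every channel
    holds the per-pixel average of the original channels."""
    # Average over the channel axis; astype truncates the fraction
    grey = img.mean(axis=2).astype(np.uint8)
    # Repeat the single grey channel three times along a new last axis
    return np.stack([grey, grey, grey], axis=2)

sample = np.array([[[255, 0, 0], [10, 20, 30]],
                   [[0, 0, 0], [255, 255, 255]]], dtype=np.uint8)
grey_sample = to_greyscale(sample)

print("Pixel at 0,0:", grey_sample[0, 0])
print("Pixel at 0,1:", grey_sample[0, 1])
```

Unlike the loop, this version returns a new array and leaves its input untouched, so no explicit `np.copy()` is needed first.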
We've now seen how to read an image from a file and apply a filter function that converts it to greyscale.
Use these pieces to write a class called BWImage that inherits from your MyImage class above.
Your BWImage class should have the following methods:
make_greyscale(), which converts your image to greyscale
save(filename), which saves your image to a file using misc.imsave()
# Implement the BWImage class here
class BWImage(MyImage):
    def make_greyscale(self):
        # Go through each pixel and set each channel to the average across
        # all channels
        for i in range(self.img_data.shape[0]):
            for j in range(self.img_data.shape[1]):
                pixel = self.img_data[i, j, :]  # Get this pixel
                average = np.mean(pixel)  # Calculate its average
                # set the pixel to a numpy array containing this average
                # and using the data type 8-bit unsigned int
                self.img_data[i, j] = np.array([average] * 3, dtype=np.uint8)

    def save(self, filename):
        misc.imsave(filename, self.img_data)
# Testing code for your BWImage class
my_bwimg = BWImage("watchmen.jpg")
my_bwimg.show()
my_bwimg.make_greyscale()
my_bwimg.show()
my_bwimg.save("watchmen-bw.jpg")
help(misc.imsave)