INST326 In-Class Exercises

20170718, Images in Python

This exercise will explore image processing in Python.


Image Format

Recall the YouTube video I had you watch on fun with sheets:

In [1]:
from IPython.display import HTML

HTML('<iframe width="560" height="315" src="https://www.youtube.com/embed/UBX2QQHlQ_I?rel=0&amp;controls=0&amp;showinfo=0" frameborder="0" allowfullscreen></iframe>')
Out[1]:

You can think of an image as a multi-dimensional list with rows and columns, with each cell containing a 3- or 4-element list. The first three elements correspond to red, green, and blue, and the optional fourth element is an alpha value controlling opacity (whether you can see through the image at this cell).

Each of these cells is called a pixel.
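As a minimal sketch of this idea, the snippet below builds a tiny 2-row, 3-column image as a plain nested list. The names `tiny_img`, `RED`, `GREEN`, `BLUE`, and `BLACK` are illustrative and not part of the exercise.

```python
# A tiny 2-row, 3-column image stored as a list of rows,
# where each cell (pixel) is a 3-element list: [red, green, blue].
RED = [255, 0, 0]
GREEN = [0, 255, 0]
BLUE = [0, 0, 255]
BLACK = [0, 0, 0]

tiny_img = [
    [RED, GREEN, BLUE],    # row 0
    [BLACK, BLACK, RED],   # row 1
]

n_rows = len(tiny_img)
n_cols = len(tiny_img[0])
pixel = tiny_img[0][1]     # the pixel at row 0, column 1

print("Rows:", n_rows, "Columns:", n_cols)
print("Pixel at 0,1:", pixel)
print("Green channel of that pixel:", pixel[1])

# A 4-element pixel adds an alpha (opacity) value after red, green, and blue
semi_transparent_red = [255, 0, 0, 128]
```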

2D Lists as Images

Recall from previous exercises that you can create 2D lists in Python and have them displayed as images using:

my2dList = make_2d_list_v2(10,15)

my2dList[8][3:10] = [255] * 7
my2dList[7][2:11] = [255] * 9
my2dList[6][2:12] = [255] * 10
my2dList[5][2:4] = [255] * 2
my2dList[5][11:13] = [255] * 2
my2dList[5][7] = 255
my2dList[4][7] = 255
my2dList[3][4:9] = [255] * 5
my2dList[2][4:9] = [255] * 5
my2dList[1][4:9] = [255] * 5
my2dList[0][7] = 255

my_pretty_print(my2dList)

plt.imshow(my2dList)
plt.show()

Which results in the following:

Using existing Python libraries, we can read image files in this format and display them.

In [2]:
%matplotlib inline
import pandas as pd
import matplotlib.pyplot as plt
In [5]:
umd_logo_data = pd.read_csv("logo.csv", header=None, index_col=None)

print("Logo Dimensions:", umd_logo_data.shape)

plt.imshow(umd_logo_data, cmap="binary")
plt.show()
Logo Dimensions: (370, 370)

Images as 5D Lists

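A 3-color image that uses red, green, and blue channels can be thought of as 5-dimensional data: every pixel is located by its row and column and described by its red, green, and blue intensities. In practice, it is stored as a 3-dimensional array of shape (rows, columns, 3), where the last axis holds the three channel intensities.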

We will use the numpy package to build a quick multi-color image that demonstrates this.

In [6]:
import numpy as np

# Create a black image of 10 rows, 100 columns,
#  and a cell for red, green, and blue
#  NOTE: We have to set the data type to np.uint8,
#     an unsigned 8-bit integer (0 to 255),
#     because most image files use 8 bits each for red,
#     green, and blue
test_img = np.zeros((10, 100, 3), dtype=np.uint8)

plt.imshow(test_img)
plt.show()
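To see how NumPy lays this image out, you can inspect the array's attributes. This is a small sketch that rebuilds the same black image and prints its number of axes, shape, and data type.

```python
import numpy as np

# Same black image as above: 10 rows, 100 columns, 3 channels
test_img = np.zeros((10, 100, 3), dtype=np.uint8)

print("Number of axes:", test_img.ndim)
print("Shape (rows, columns, channels):", test_img.shape)
print("Data type:", test_img.dtype)

# np.iinfo reports the range of an integer type; for uint8 the max is 255
max_intensity = np.iinfo(test_img.dtype).max
print("Largest value a channel can hold:", max_intensity)
```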

NumPy provides a lot of convenience when working with multi-dimensional arrays. Specifically, we can slice over multiple dimensions quickly using the ":" operator rather than having to deal with lists of lists directly.

In [7]:
# In the zero-th row, slice over all columns, 
#  and set the red channel to 255
test_img[0,:,0] = 255

plt.imshow(test_img)
plt.show()
In [8]:
# Make bands of red, green, and blue
test_img[0:3,:,0] = 255
test_img[3:6,:,1] = 255
test_img[6:9,:,2] = 255

plt.imshow(test_img)
plt.show()
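For comparison, here is a rough sketch of what painting the red band looks like with plain lists of lists versus a NumPy slice. The variable names (`list_img`, `np_img`) are made up for this illustration.

```python
import numpy as np

rows, cols = 10, 100

# Nested-list version: build a black image, then loop over
#  rows 0-2 and every column to turn the red channel on
list_img = [[[0, 0, 0] for _ in range(cols)] for _ in range(rows)]
for r in range(0, 3):
    for c in range(cols):
        list_img[r][c][0] = 255

# NumPy version: one sliced assignment does the same work
np_img = np.zeros((rows, cols, 3), dtype=np.uint8)
np_img[0:3, :, 0] = 255

# Convert the nested list to an array to compare the two results
same = np.array_equal(np.array(list_img, dtype=np.uint8), np_img)
print("Same result:", same)
```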

You can check the pixel values at a given location by printing the value at a given row and column.

In [9]:
print("Pixel at 0,0:", test_img[0,0])
print("Pixel at 3,0:", test_img[3,0])
print("Pixel at 6,0:", test_img[6,0])
print("Pixel at 9,0:", test_img[9,0])
Pixel at 0,0: [255   0   0]
Pixel at 3,0: [  0 255   0]
Pixel at 6,0: [  0   0 255]
Pixel at 9,0: [0 0 0]

You'll notice that the [0,0,0] pixel value represents black, which equates to the red, green, and blue channels being turned off or having zero intensity.

To make white, you turn all three channels to their max intensity, which here is 255.

In [10]:
# Set the last row (row 9) to white across all columns
test_img[9,:] = [255, 255, 255]

plt.imshow(test_img)
plt.show()
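Other colors come from mixing the channels in the same way. The sketch below fills a one-row image with a few mixes; the name `palette` is just for illustration, and you could display it with plt.imshow() as in the cells above.

```python
import numpy as np

# One row, four columns, three channels
palette = np.zeros((1, 4, 3), dtype=np.uint8)

palette[0, 0] = [255, 255, 0]     # red + green = yellow
palette[0, 1] = [255, 0, 255]     # red + blue = magenta
palette[0, 2] = [0, 255, 255]     # green + blue = cyan
palette[0, 3] = [128, 128, 128]   # equal channels = a shade of grey

for col in range(palette.shape[1]):
    print("Pixel at 0,%d:" % col, palette[0, col])
```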

Reading Image Files

We can use the scipy package and its misc module to read images directly from image files and display them without having to make them from scratch as we did above.

In [11]:
from scipy import misc

# Read the UMD logo file
logo_img = misc.imread("umd.jpg")

plt.imshow(logo_img)
plt.show()
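Note that misc.imread() was deprecated in SciPy 1.0 and removed in SciPy 1.2, so the cell above may not run on a newer installation. One possible substitute is matplotlib's own plt.imread(). The sketch below assumes a reasonably recent matplotlib and writes a small, hypothetical test file (demo_bands.png) first so that it does not depend on umd.jpg.

```python
import os
import tempfile

import numpy as np
import matplotlib.pyplot as plt

# Build a small banded image and save it to a temporary PNG file
bands = np.zeros((9, 20, 3), dtype=np.uint8)
bands[0:3, :, 0] = 255
bands[3:6, :, 1] = 255
bands[6:9, :, 2] = 255

path = os.path.join(tempfile.mkdtemp(), "demo_bands.png")
plt.imsave(path, bands)

# Read it back; PNG files come back as floats between 0 and 1,
#  and may include a fourth (alpha) channel
loaded = plt.imread(path)
print("Shape:", loaded.shape)
print("Data type:", loaded.dtype)

# Convert to the 0-255 unsigned 8-bit form used in this exercise
if loaded.dtype != np.uint8:
    loaded = (loaded * 255).round().astype(np.uint8)

print("Pixel at 0,0 (RGB only):", loaded[0, 0, :3])
```

plt.imsave() can likewise stand in for misc.imsave(), which was removed at the same time. SciPy's deprecation notice pointed to the separate imageio package as the replacement, which is another option if you have it installed.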

You can print the image dimensions and a couple of pixels, and display a cropped region, to see we are working with the same sort of data:

In [12]:
print("Image Dimension:", logo_img.shape)
print("Pixel at 180, 180:", logo_img[180,180]) # Red+green == yellow
print("Pixel at 180, 210:", logo_img[180,210]) # Mostly red

plt.imshow(logo_img[155:215,155:215])
plt.show()
Image Dimension: (370, 370, 3)
Pixel at 180, 180: [254 212  38]
Pixel at 180, 210: [225  59  63]
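One thing to keep in mind when cropping: a NumPy slice is a view onto the original array, not a separate copy. The sketch below uses a blank stand-in array with the same dimensions as the logo (the names `img`, `crop_view`, and `crop_copy` are illustrative) to show the difference.

```python
import numpy as np

# Stand-in for the 370x370 logo: a black image of the same shape
img = np.zeros((370, 370, 3), dtype=np.uint8)

# Slicing rows 155-214 and columns 155-214 gives a 60x60 crop
crop_view = img[155:215, 155:215]
print("Crop shape:", crop_view.shape)

# Changing a copy leaves the original image alone
crop_copy = np.copy(img[155:215, 155:215])
crop_copy[0, 0] = [255, 255, 255]
print("Original after editing the copy:", img[155, 155])

# Changing the view changes the original image too
crop_view[0, 0] = [255, 0, 0]
print("Original after editing the view:", img[155, 155])
```

This is why it is safer to call np.copy() before modifying an image you want to keep.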

Exercise 1: Image Class

We've now seen all the methods for reading an image from a file, accessing pixels in that file, and displaying that image.

Use these pieces to write a class called MyImage that takes a filename as the argument for its initializer and uses misc.imread() to read that file into an instance variable.

Your MyImage class should have the following methods:

  • get_image_data() that returns the image data you stored
  • show() that will display the image on the screen
In [13]:
class MyImage:

    def __init__(self, filename):
        
        self.img_data = misc.imread(filename)
        
    def get_image_data(self):
        return self.img_data
    
    def show(self):
        plt.imshow(self.img_data)
        plt.show()
In [14]:
# Testing code for your MyImage class
my_img = MyImage("watchmen.jpg")
my_img.show()

Modifying Images

Now that we've read in an image, we can modify the image as we see fit.

A simple modification is to convert the image to greyscale. We've already seen something like this when we used 2D lists as images.

To convert a three-color image to greyscale, we calculate the average of the three channel values in each pixel and set every channel in that pixel to that average (remember how all 0s was black, and all 255s was white?).

In [15]:
# Copy the image
logo_copy = np.copy(logo_img)

# Go through each pixel and set each channel to the average across
#  all channels
for i in range(logo_copy.shape[0]):
    for j in range(logo_copy.shape[1]):
        pixel = logo_copy[i, j,:] # Get this pixel
        average = np.mean(pixel) # Calculate its average
        
        # set the pixel to a numpy array containing this average
        #  and using the data type 8-bit unsigned int
        logo_copy[i, j] = np.array([average] * 3, dtype=np.uint8)
        
# Display the copy
plt.imshow(logo_copy)
plt.show()
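The nested loop is easy to follow but slow on large images. As a sketch of an alternative, NumPy can take the mean along the channel axis for every pixel at once. The example below uses a small random stand-in image (`img`) rather than the logo and checks that both approaches agree.

```python
import numpy as np

# Small random stand-in image: 4 rows, 5 columns, 3 channels
rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(4, 5, 3), dtype=np.uint8)

# Loop version, as in the cell above
loop_grey = np.copy(img)
for i in range(loop_grey.shape[0]):
    for j in range(loop_grey.shape[1]):
        average = np.mean(loop_grey[i, j, :])
        loop_grey[i, j] = np.array([average] * 3, dtype=np.uint8)

# Vectorized version: average over the channel axis (axis 2),
#  convert back to unsigned 8-bit, then repeat it for all 3 channels
averages = img.mean(axis=2).astype(np.uint8)     # shape (4, 5)
fast_grey = np.stack([averages] * 3, axis=2)     # shape (4, 5, 3)

print("Same result:", np.array_equal(loop_grey, fast_grey))
```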

Exercise 2: Create a Class For Modifiable Images

We've now seen all the methods for reading an image from a file and applying a filter function that converts that image to greyscale.

Use these pieces to write a class called BWImage that inherits from your MyImage class above.

Your BWImage class should have the following methods:

  • make_greyscale() that will convert your image to greyscale
  • save(filename) that will save your image to a file using misc.imsave()
In [17]:
# Implement the BWImage class here
class BWImage(MyImage):
    
    def make_greyscale(self):
        # Go through each pixel and set each channel to the average across
        #  all channels
        for i in range(self.img_data.shape[0]):
            for j in range(self.img_data.shape[1]):
                pixel = self.img_data[i, j,:] # Get this pixel
                average = np.mean(pixel) # Calculate its average

                # set the pixel to a numpy array containing this average
                #  and using the data type 8-bit unsigned int
                self.img_data[i, j] = np.array([average] * 3, dtype=np.uint8)
    
    def save(self, filename):
        misc.imsave(filename, self.img_data)
In [18]:
# Testing code for your BWImage class
my_bwimg = BWImage("watchmen.jpg")
my_bwimg.show()
my_bwimg.make_greyscale()
my_bwimg.show()
my_bwimg.save("watchmen-bw.jpg")
In [16]:
help(misc.imsave)
Help on function imsave in module scipy.misc.pilutil:

imsave(name, arr, format=None)
    Save an array as an image.
    
    Parameters
    ----------
    name : str or file object
        Output file name or file object.
    arr : ndarray, MxN or MxNx3 or MxNx4
        Array containing image values.  If the shape is ``MxN``, the array
        represents a grey-level image.  Shape ``MxNx3`` stores the red, green
        and blue bands along the last dimension.  An alpha layer may be
        included, specified as the last colour band of an ``MxNx4`` array.
    format : str
        Image format. If omitted, the format to use is determined from the
        file name extension. If a file object was used instead of a file name,
        this parameter should always be used.
    
    Examples
    --------
    Construct an array of gradient intensity values and save to file:
    
    >>> from scipy.misc import imsave
    >>> x = np.zeros((255, 255))
    >>> x = np.zeros((255, 255), dtype=np.uint8)
    >>> x[:] = np.arange(255)
    >>> imsave('gradient.png', x)
    
    Construct an array with three colour bands (R, G, B) and store to file:
    
    >>> rgb = np.zeros((255, 255, 3), dtype=np.uint8)
    >>> rgb[..., 0] = np.arange(255)
    >>> rgb[..., 1] = 55
    >>> rgb[..., 2] = 1 - np.arange(255)
    >>> imsave('rgb_gradient.png', rgb)
