INST326 In-Class Exercises, Part 2

20170608, Loops, Strings, and Files

These exercises will help you become more familiar with reading files and manipulating strings with loops. Remember that strings have many useful methods, which are documented in the Python docs. Opening and reading a file generally looks like the following:

inFile = open("<filename>", "r")
for line in inFile:
    # do something
inFile.close()

TAKE NOTE: Once you have read through a file, reading from the same file object again returns no data. This is because a Python file object is an iterator that keeps track of its current position in the file, and after a full read that position is at the end of the file. You can address this either by closing and re-opening the file before each read, or by calling inFile.seek(0) to move back to the start of the file. More information on file IO in Python is available here: https://docs.python.org/3/tutorial/inputoutput.html#reading-and-writing-files
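
The behavior described in the note can be seen in a small, self-contained sketch. The temporary file and its two lines are made up for illustration; with the course files you would open them by name instead.

```python
import os
import tempfile

# Write a small throwaway file so the example is self-contained.
# (The file contents are made up for illustration.)
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("first line\nsecond line\n")
    path = tmp.name

inFile = open(path, "r")
first_pass = [line for line in inFile]   # reads through to the end of the file
second_pass = [line for line in inFile]  # already at the end, so nothing is left
inFile.seek(0)                           # move back to the start of the file
third_pass = [line for line in inFile]   # reads everything again
inFile.close()
os.remove(path)                          # clean up the throwaway file

print(len(first_pass), len(second_pass), len(third_pass))  # prints: 2 0 2
```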

Exercise 1. Read and Print Lines From a File

Write code to open a file, read each line of the file (you can use the for loop iterator on the opened file object), and print each line to the screen.

As a test file, open and read the reversedTestFile.txt file accompanying this notebook. Then try the stringOps.txt and tweetBodies.txt files as well.

In [ ]:

Exercise 2. Print the Number of Characters in Each Line

For each line in the test file you opened, use the len() function to calculate and print the number of characters in the line. Then, using nested for loops, also count the number of characters in the line that are not a space (" ").
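
One possible shape for the nested loops is sketched below. Try your own version first. The list `lines` is made-up sample data standing in for the lines of a file, and stripping the trailing newline before counting is an assumption about what should count as a character.

```python
# Stand-in for lines read from a file (made-up sample data).
lines = ["hello world\n", "a b c\n"]

counts = []
for line in lines:                # outer loop: one pass per line
    text = line.rstrip("\n")      # assumption: do not count the trailing newline
    non_space = 0
    for ch in text:               # inner loop: visit each character in the line
        if ch != " ":
            non_space += 1
    counts.append((len(text), non_space))
    print(len(text), non_space)   # prints "11 10", then "5 3"
```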

In [ ]:
 

Exercise 3. Calculate the Average Characters Per Line

Using the code you wrote in Exercise 2 and some basic math, calculate the average number of characters per line and the average number of words per line (assume the number of words in a line is 1 + the number of spaces in that line).
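
As a rough sketch of the arithmetic, keep running totals while looping and divide by the number of lines at the end. The sample `lines` list is made up, and `str.count` is used here as a shortcut for the space-counting loop from Exercise 2.

```python
# Stand-in for lines read from a file (made-up sample data).
lines = ["hello world\n", "a b c\n"]

total_chars = 0
total_words = 0
for line in lines:
    text = line.rstrip("\n")             # assumption: ignore the trailing newline
    total_chars += len(text)
    total_words += 1 + text.count(" ")   # words = 1 + number of spaces

avg_chars = total_chars / len(lines)     # (11 + 5) / 2
avg_words = total_words / len(lines)     # (2 + 3) / 2
print(avg_chars, avg_words)              # prints: 8.0 2.5
```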

In [ ]:
 

Exercise 4. Print All Lines Starting with a Given Pattern

Write and call a function that checks whether an input string begins with a given pattern of characters. Name this function startsWith(). It should take two arguments, an input string and a pattern, and return a boolean value indicating whether the input string starts with the pattern. Use this function to print only the lines from a file that start with the given pattern.

As a test case, open stringOps.txt and only print lines that start with "str.".

As an example, startsWith("str.lower() - Lowercase a string.", "str.") should return True.
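
A minimal sketch of one way to write startsWith() with a loop and indexing follows. It is only one possible approach, and treating an empty pattern as a match is an assumption borrowed from the built-in str.startswith().

```python
def startsWith(text, pattern):
    """Return True if text begins with pattern, False otherwise."""
    if len(pattern) > len(text):
        return False                 # a longer pattern can never match
    for i in range(len(pattern)):    # compare character by character
        if text[i] != pattern[i]:
            return False
    return True


print(startsWith("str.lower() - Lowercase a string.", "str."))  # prints: True
print(startsWith("len() - Length of a string.", "str."))        # prints: False
```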

In [ ]:
 

Exercise 5. Reverse a String

Using array indexing into a string, for loops, and string concatenation, write a function called reversed() that takes a string as an argument and returns the reversed version of that string.

As a test case, open reversedTestFile.txt and print the reversed lines to the screen.

As an example, reversed("Hello") should return "olleH".
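
One possible sketch walks the indexes from the last character down to the first and builds the result by concatenation. Note that defining a function named reversed() hides Python's built-in of the same name for the rest of the notebook.

```python
def reversed(text):
    """Return text with its characters in reverse order."""
    result = ""
    # Walk the indexes from len(text) - 1 down to 0.
    for i in range(len(text) - 1, -1, -1):
        result = result + text[i]
    return result


print(reversed("Hello"))  # prints: olleH
```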

In [ ]:
 

Exercise 6. Find a Character in a Search String

Write a function called find() that takes two arguments, a search string and a pattern character, and returns the index of the first occurrence of that character in the search string. If the character appears more than once, return only the index of the first occurrence. If it does not appear at all, return -1. Use this function to determine whether a line in a file contains a hashtag (marked by #).

As a test case, open tweetBodies.txt and print only those lines with hashtags.

As an example, find("Pattern", "t") should return 2.
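
A minimal sketch of find(), along with how it might be used to filter lines, is shown below. The `sample_lines` list is made up and stands in for the lines of tweetBodies.txt.

```python
def find(search, ch):
    """Return the index of the first occurrence of ch in search, or -1."""
    for i in range(len(search)):
        if search[i] == ch:
            return i          # stop at the first match
    return -1                 # no match anywhere in the string


print(find("Pattern", "t"))   # prints: 2

# Made-up sample lines standing in for a file.
sample_lines = ["Learning loops today #python\n", "No tags here\n"]
tagged = [line for line in sample_lines if find(line, "#") != -1]
```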

In [ ]:
 

Exercise 7. Partition a String by a Separator

Write a function, called partition(), that takes two arguments, a search string and a separator, and separates the input string by that separator. Your function should return three items:

  1. The substring before the separator,
  2. The separator itself, and
  3. The substring after the separator.

If the separator is not found, return the whole string followed by two empty strings.

As an example, partition("Pattern", "t") should return ("Pa", "t", "tern").
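
The sketch below is one way to write partition() using slicing. It allows separators longer than one character, and raising ValueError for an empty separator is an assumption that mirrors the built-in str.partition().

```python
def partition(text, sep):
    """Split text around the first occurrence of sep.

    Returns (before, sep, after), or (text, "", "") if sep is not found.
    """
    if sep == "":
        # Assumption: reject an empty separator, as str.partition() does.
        raise ValueError("empty separator")
    for i in range(len(text) - len(sep) + 1):
        if text[i:i + len(sep)] == sep:
            return (text[:i], sep, text[i + len(sep):])
    return (text, "", "")


print(partition("Pattern", "t"))  # prints: ('Pa', 't', 'tern')
```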

In [ ]:
 

Exercise 8. Find Hashtags

Use the functions you've developed above to find and print all hashtags present in each line of the tweetBodies.txt file.

That is, for each line in the file, print every hashtag that appears on that line and nothing else.
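
A rough sketch of one approach follows. It assumes a hashtag runs from "#" up to the next whitespace character, which the exercise does not define. To stay self-contained it uses the built-in str.partition() as a stand-in for the partition() function from Exercise 7, and the helper name hashtags() and the sample line are made up for illustration.

```python
def hashtags(line):
    """Return the hashtags in line.

    Assumption: a hashtag is '#' plus every character up to the next whitespace.
    """
    tags = []
    rest = line
    while True:
        # Built-in stand-in for the partition() function from Exercise 7.
        before, sep, after = rest.partition("#")
        if sep == "":
            break                      # no more '#' characters in the line
        body = ""
        for ch in after:               # collect characters until whitespace
            if ch.isspace():
                break
            body += ch
        if body:                       # skip a lone '#' with nothing after it
            tags.append("#" + body)
        rest = after[len(body):]       # keep searching after this hashtag
    return tags


# Made-up sample line standing in for a line of tweetBodies.txt.
for tag in hashtags("Loving #python and #inst326 today\n"):
    print(tag)                         # prints "#python", then "#inst326"
```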

In [ ]:
 